Extended Lojban Grammar (a draft): Difference between revisions


Revision as of 11:21, 9 June 2014

Independent reference grammar of Lojban.

Lojban (pronounced as [ˈloʒban]) is a new cool language.

It is a constructed language based on so-called predicate logic, which makes it a kind of bridge between different languages and cultures.

Lojban, the Odyssey to our Universe.

What is unique about this language? All naturally evolved languages have inherent drawbacks, such as complicated grammar rules, biases, and restrictions that discourage other ways of thinking.

Lojban is designed to free us from these restrictions and see the world brighter.

Here are its advantages:

  • Lojban lets you say things more concisely, without unnecessary and distracting details. For example, you don't always have to decide which tense (past, present or future) to use with a verb when it is already clear from context. When you need details, you add them; but unlike other languages, Lojban doesn't force you to do so.
  • Lojban has unprecedented tools for expressing human emotions.
  • Lojban is made to be consistent and easy to extend. There are no exceptions in the grammar, and there are convenient ways of extending the existing vocabulary.


Lojban is an endless journey

Carefully worked out by its creators, this language accurately and precisely reveals the delicate aspects of logic and of our mentality.

  • Lojban is a clean, simple and, with all these advantages, powerful language. Why not start speaking it?
  • Lojban is for artists who would love to express the tiny details of human emotions.
  • Lojban is for lovers of wisdom (philosophers, in the original sense).
  • Lojban is for scientists who like all concepts to be organized into a concise system.
  • Lojban is the best tool for implementing machine translation. Still, it is a speakable language.
  • Lojban breaks down the barrier of misunderstanding between people of different backgrounds.
  • Lastly, Lojban is also fun!

Lojban is easy to learn: there are only 1341 root words. That is not much to learn before you start speaking!

Alphabet and phonology

Lojban is designed so that any properly spoken Lojban utterance can be uniquely transcribed in writing, and any properly written Lojban can be spoken so as to be uniquely reproduced by another person. As a consequence, the standard Lojban orthography must assign to each distinct sound, or phoneme, a unique letter or symbol. Each letter or symbol has only one sound or, more accurately, a limited range of sounds that are permitted pronunciations for that phoneme. Some symbols indicate stress (speech emphasis) and pause, which are also essential to Lojban word recognition. In addition, everything that is represented in other languages by punctuation (when written) or by tone of voice (when spoken) is represented in Lojban by words. These two properties together are known technically as “audio-visual isomorphism”.

Lojban uses a variant of the Latin (Roman) alphabet, consisting of the following letters and symbols:

, . ` a b c d e f g i j k l m n o p r s t u v x y z omitting the letters “h”, “q”, and “w”.

The alphabetic order given above is that of the ASCII coded character set, widely used in computers. By making Lojban alphabetical order the same as ASCII, computerized sorting and searching of Lojban text is facilitated.
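The practical consequence of matching ASCII order can be sketched in a few lines of Python. This is only an illustration: the constant `LOJBAN_ALPHABET` and the helper `lojban_sort` are hypothetical names, and the inventory used here (the symbols listed above plus the apostrophe from the pronunciation table) is an assumption of the sketch.

```python
# Symbols and letters of the Lojban alphabet as used in this chapter.
# Assumption: the apostrophe from the pronunciation table is included
# alongside the symbols listed above.
LOJBAN_ALPHABET = "',.`abcdefgijklmnoprstuvxyz"


def lojban_sort(words):
    """Sort Lojban words alphabetically.

    Because Lojban alphabetical order equals ASCII order, the default
    code-point comparison of Python strings is already the right collation.
    """
    return sorted(words)


# The alphabet itself is already in ASCII (code-point) order.
assert list(LOJBAN_ALPHABET) == sorted(LOJBAN_ALPHABET)

print(lojban_sort(["zarci", "klama", ".ui", "bridi", "cmavo"]))
```

No custom sort key is needed, which is exactly the simplification the design aims for.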

Capital letters are used only to represent non-standard stress, which can appear only in the representation of Lojbanized names. Thus the English name “Josephine”, as normally pronounced, is Lojbanized as DJOsefin., pronounced ['dʒosɛfinʔ]. Technically, it is sufficient to capitalize the vowel letter, in this case O, but it is easier on the reader to capitalize the whole syllable.

Another method of dealing with stress is to put the letter ` immediately before the stressed phoneme.

Without the capitalization, the ordinary rules of Lojban stress would cause the se syllable to be stressed. Lojbanized names are meant to represent the pronunciation of names from other languages with as little distortion as may be; as such, they are exempt from many of the regular rules of Lojban phonology, as will appear in the rest of this chapter.

Basic Phonetics

Lojban pronunciations are defined using the International Phonetic Alphabet, or IPA, a standard method of transcribing pronunciations. By convention, IPA transcriptions are always within square brackets: for example, the word “cat” is pronounced (in General American pronunciation) [kæt]. Section 2.6 contains a brief explanation of the IPA characters used in this chapter, with their nearest analogues in English, and will be especially useful to those not familiar with the technical terms used in describing speech sounds.

The standard pronunciations and permitted variants of the Lojban letters are listed in the table below. The descriptions have deliberately been made a bit ambiguous to cover variations in pronunciation by speakers of different native languages and dialects. In all cases except r the first IPA symbol shown represents the preferred pronunciation; for r, all of the variations (and any other rhotic sound) are equally acceptable.

<tab class=wikitable>
Letter	IPA	X-SAMPA	Description
'	[h]	[h]	an unvoiced glottal spirant
,	-	-	the syllable separator
.	[ʔ]	[?]	a glottal stop or a pause
a	[a], [ɑ]	[a], [A]	an open vowel
b	[b]	[b]	a voiced bilabial stop
c	[ʃ], [ʂ]	[S], [s`]	an unvoiced coronal sibilant
d	[d]	[d]	a voiced dental/alveolar stop
e	[ɛ], [e]	[E], [e]	a front mid vowel
f	[f], [ɸ]	[f], [p\]	an unvoiced labial fricative
g	[ɡ]	[g]	a voiced velar stop
i	[i]	[i]	a front close vowel
j	[ʒ], [ʐ]	[Z], [z`]	a voiced coronal sibilant
k	[k]	[k]	an unvoiced velar stop
l	[l], [l̩]	[l], [l=]	a voiced lateral approximant (may be syllabic)
m	[m], [m̩]	[m], [m=]	a voiced bilabial nasal (may be syllabic)
n	[n], [n̩], [ŋ], [ŋ̍]	[n], [n=], [N], [N=]	a voiced dental or velar nasal (may be syllabic)
o	[o], [ɔ]	[o], [O]	a back mid vowel
p	[p]	[p]	an unvoiced bilabial stop
r	[r], [ɹ], [ɾ], [ʀ], [r̩], [ɹ̩], [ɾ̩], [ʀ̩]	[r], [r\], [4], [R\], [r=], [r\=], [4=], [R\=]	a rhotic sound
s	[s]	[s]	an unvoiced alveolar sibilant
t	[t]	[t]	an unvoiced dental/alveolar stop
u	[u]	[u]	a back close vowel
v	[v], [β]	[v], [B]	a voiced labial fricative
x	[x]	[x]	an unvoiced velar fricative
y	[ə]	[@]	a central mid vowel
z	[z]	[z]	a voiced alveolar sibilant
</tab>

The Lojban sounds must be clearly pronounced so that they are not mistaken for each other. Voicing and placement of the tongue are the key factors in correct pronunciation, but other subtle differences will develop between consonants in a Lojban-speaking community. At this point these are the only mandatory rules on the range of sounds.

Note in particular that Lojban vowels can be pronounced with either rounded or unrounded lips; typically o and u are rounded and the others are not, as in English, but this is not a requirement; some people round y as well. Lojban consonants can be aspirated or unaspirated. Palatalizing of consonants, as found in Russian and other languages, is not generally acceptable in pronunciation, though a following i may cause it.

The sounds represented by the letters c, g, j, s, and x require special attention for speakers of English, either because they are ambiguous in the orthography of English (c, g, s), or because they are strikingly different in Lojban (c, j, x). The English “c” represents three different sounds, [k] in “cat” and [s] in “cent”, as well as the [ʃ] of “ocean”. Similarly, English “g” can represent [ɡ] as in “go”, [dʒ] as in “gentle”, and [ʒ] as in the second “g” in “garage” (in some pronunciations). English “s” can be either [s] as in “cats”, [z] as in “cards”, [ʃ] as in “tension”, or [ʒ] as in “measure”. The sound of Lojban x doesn't appear in most English dialects at all.

There are two common English sounds that are found in Lojban but are not Lojban consonants: the “ch” of “church” and the “j” of “judge”. In Lojban, these are considered two consonant sounds spoken together without an intervening vowel sound, and so are represented in Lojban by the two separate consonants: tc (IPA [tʃ]) and dj (IPA [dʒ]). In general, whether a complex sound is considered one sound or two depends on the language: Russian views “ts” as a single sound, whereas English, French, and Lojban consider it to be a consonant cluster.

The Special Lojban Characters

The apostrophe, period, and comma need special attention. They are all used as indicators of a division between syllables, but each has a different pronunciation, and each is used for different reasons:

The apostrophe represents a phoneme similar to a short, breathy English “h”, (IPA [h]). The letter “h” is not used to represent this sound for two reasons: primarily in order to simplify explanations of the morphology, but also because the sound is very common, and the apostrophe is a visually lightweight representation of it. The apostrophe sound is a consonant in nature, but is not treated as either a consonant or a vowel for purposes of Lojban morphology (word-formation), which is explained in Chapter 5. In addition, the apostrophe visually parallels the comma and the period, which are also used (in different ways) to separate syllables.

The apostrophe is included in Lojban only to enable a smooth transition between vowels, while joining the vowels within a single word. In fact, one way to think of the apostrophe is as representing an unvoiced vowel glide.

As a permitted variant, any unvoiced fricative other than those already used in Lojban may be used to render the apostrophe: IPA [θ] is one possibility. The convenience of the listener should be regarded as paramount in deciding to use a substitute for [h].

The period represents a mandatory pause, with no specified length; a glottal stop (IPA [ʔ]) is considered a pause of shortest length. A pause (or glottal stop) may appear between any two words, and in certain cases – explained in detail in Section 5.8 – must occur. In particular, a word beginning with a vowel is always preceded by a pause, and a word ending in a consonant is always followed by a pause.

Technically, the period is an optional reminder to the reader of a mandatory pause that is dictated by the rules of the language; because these rules are unambiguous, a missing period can be inferred from otherwise correct text. Periods are included only as an aid to the reader.
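Since the two pause rules named above (before a word-initial vowel, after a word-final consonant) are mechanical, the optional periods can be restored by a program. The following is a minimal sketch that covers only those two cases, not the full set of rules in Section 5.8; `mark_pauses` is a hypothetical helper name.

```python
VOWELS = set("aeiouy")
CONSONANTS = set("bcdfgjklmnprstvxz")


def mark_pauses(word):
    """Write the periods implied by the two pause rules described above.

    Only two cases are handled: a pause before a word-initial vowel and
    a pause after a word-final consonant.
    """
    core = word.strip(".")
    if not core:
        return word
    prefix = "." if core[0].lower() in VOWELS else ""
    suffix = "." if core[-1].lower() in CONSONANTS else ""
    return prefix + core + suffix


for written in ["armstrong", "DJOsefin", "klama", "i"]:
    print(written, "->", mark_pauses(written))
```

Words that begin with a consonant and end with a vowel, such as klama, come back unchanged, since neither rule applies to them.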

A period also may be found apparently embedded in a word. When this occurs, such a written string is not one word but two, written together to indicate that the writer intends a unitary meaning for the compound. It is not really necessary to use a space between words if a period appears.

The comma is used to indicate a syllable break within a word, generally one that is not obvious to the reader. Such a comma is written to separate syllables, but indicates that there must be no pause between them, in contrast to the period. Between two vowels, a comma indicates that some type of glide may be necessary to avoid a pause that would split the two syllables into separate words. It is always legal to use the apostrophe (IPA [h]) sound in pronouncing a comma. However, a comma cannot be pronounced as a pause or glottal stop between the two letters separated by the comma, because that pronunciation would split the word into two words.

Otherwise, a comma is usually only used to clarify the presence of syllabic l, m, n, or r (discussed later). Commas are never required: no two Lojban words differ solely because of the presence or placement of a comma.

Here is a somewhat artificial example of the difference in pronunciation between periods, commas and apostrophes. In the English song about Old MacDonald's Farm, the vowel string which is written as “ee-i-ee-i-o” in English could be Lojbanized with periods as:

Example 2.1:
.i.ai.i.ai.o
[ʔi ʔaj ʔi ʔaj ʔo]
Ee! Eye! Ee! Eye! Oh!

However, this would sound clipped, staccato, and unmusical compared to the English. Furthermore, although Example 2.1 is a string of meaningful Lojban words, as a sentence it makes very little sense. (Note the use of periods embedded within the written word.)

If commas were used instead of periods, we could represent the English string as a Lojbanized name, ending in a consonant:

Example 2.2:
.i,ai,i,ai,on.
[ʔi jaj ji jaj jonʔ]

The commas represent new syllable breaks, but prohibit the use of pauses or glottal stop. The pronunciation shown is just one possibility, but closely parallels the intended English pronunciation.

However, the use of commas in this way is risky to unambiguous interpretation, since the glides might be heard by some listeners as diphthongs, producing something like

Example 2.3:
.i,iai,ii,iai,ion.

which is technically a different Lojban name. Since the intent with Lojbanized names is to allow them to be pronounced more like their native counterparts, the comma is allowed to represent vowel glides or some non-Lojbanic sound. Such an exception affects only spelling accuracy and the ability of a reader to replicate the desired pronunciation exactly; it will not affect the recognition of word boundaries.

Still, it is better if Lojbanized names are always distinct. Therefore, the apostrophe is preferred in regular Lojbanized names that are not attempting to simulate a non-Lojban pronunciation perfectly. (Perfection, in any event, is not really achievable, because some sounds simply lack reasonable Lojbanic counterparts.)

If apostrophes were used instead of commas in Example 2.2, it would appear as:

Example 2.4:
.i'ai'i'ai'on.
[ʔi hai hi hai honʔ]

which preserves the rhythm and length, if not the exact sounds, of the original English.

Diphthongs and Syllabic Consonants

There exist 16 diphthongs in the Lojban language. A diphthong is a vowel sound that consists of two elements, a short vowel sound and a glide, either a labial (IPA [w]) or palatal (IPA [j]) glide, that either precedes (an on-glide) or follows (an off-glide) the main vowel. Diphthongs always constitute a single syllable.

For Lojban purposes, a vowel sound is a relatively long speech-sound that forms the nucleus of a syllable. Consonant sounds are relatively brief and normally require an accompanying vowel sound in order to be audible. Consonants may occur at the beginning or end of a syllable, around the vowel, and there may be several consonants in a cluster in either position. Each separate vowel sound constitutes a distinct syllable; consonant sounds do not affect the determination of syllables.

The six Lojban vowels are a, e, i, o, u, and y. The first five vowels appear freely in all kinds of Lojban words. The vowel y has a limited distribution: it appears only in Lojbanized names, in the Lojban names of the letters of the alphabet, as a glue vowel in compound words, and standing alone as a space-filler word (like English “uh” or “er”).

The Lojban diphthongs are shown in the table below. (Variant pronunciations have been omitted, but are much as one would expect based on the variant pronunciations of the separate vowel letters: ai may be pronounced [ɑj], for example.)

<tab class=wikitable head=top>
Letters	IPA	Description
ai	[aj]	an open vowel with palatal off-glide
ei	[ɛj]	a front mid vowel with palatal off-glide
oi	[oj]	a back mid vowel with palatal off-glide
au	[aw]	an open vowel with labial off-glide
ia	[ja]	an open vowel with palatal on-glide
ie	[jɛ]	a front mid vowel with palatal on-glide
ii	[ji]	a front close vowel with palatal on-glide
io	[jo]	a back mid vowel with palatal on-glide
iu	[ju]	a back close vowel with palatal on-glide
ua	[wa]	an open vowel with labial on-glide
ue	[wɛ]	a front mid vowel with labial on-glide
ui	[wi]	a front close vowel with labial on-glide
uo	[wo]	a back mid vowel with labial on-glide
uu	[wu]	a back close vowel with labial on-glide
iy	[jə]	a central mid vowel with palatal on-glide
uy	[wə]	a central mid vowel with labial on-glide
</tab>

(Approximate English equivalents of most of these diphthongs exist: See Section for examples.)

The first four diphthongs above (ai, ei, oi, and au, the ones with off-glides) are freely used in most types of Lojban words; the ten following ones are used only as stand-alone words and in Lojbanized names and borrowings; and the last two (iy and uy) are used only in Lojbanized names.

The syllabic consonants of Lojban, [l̩], [m̩], [n̩], and [r̩], are variants of the non-syllabic [l], [m], [n], and [r] respectively. They normally have only a limited distribution, appearing in Lojban names and borrowings, although in principle any l, m, n, or r may be pronounced syllabically. If a syllabic consonant appears next to a l, m, n, or r that is not syllabic, it may not be clear which is which:

brlgan.
[br̩lgan]

is a hypothetical Lojbanized name with more than one valid pronunciation; however it is pronounced, it remains the same word.

Syllabic consonants are treated as consonants rather than vowels from the standpoint of Lojban morphology. Thus Lojbanized names, which are generally required to end in a consonant, are allowed to end with a syllabic consonant. An example is rl., which is an approximation of the English name “Earl”, and has two syllabic consonants.

Syllables with syllabic consonants and no vowel are never stressed or counted when determining which syllables to stress (see Section 2.5).

Vowel Pairs

Lojban vowels also occur in pairs, where each vowel sound is in a separate syllable. These two vowel sounds are connected (and separated) by an apostrophe. Lojban vowel pairs should be pronounced continuously with the [h] sound between (and not by a glottal stop or pause, which would split the two vowels into separate words).

All vowel combinations are permitted in two-syllable pairs with the apostrophe separating them; this includes those which constitute diphthongs when the apostrophe is not included.

The Lojban vowel pairs are: a'a, a'e, a'i, a'o, a'u, a'y, e'a, e'e, e'i, e'o, e'u, e'y, i'a, i'e, i'i, i'o, i'u, i'y, o'a, o'e, o'i, o'o, o'u, o'y, u'a, u'e, u'i, u'o, u'u, u'y, y'a, y'e, y'i, y'o, y'u, y'y.
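The list above is simply every ordered combination of the six vowels with an apostrophe between them, which can be checked with a short Python sketch (the names `VOWELS` and `VOWEL_PAIRS` are illustrative only):

```python
from itertools import product

VOWELS = "aeiouy"

# Every ordered combination of two vowels, joined by the apostrophe.
VOWEL_PAIRS = [f"{first}'{second}" for first, second in product(VOWELS, repeat=2)]

print(len(VOWEL_PAIRS))
print(", ".join(VOWEL_PAIRS))

# Pairs involving y, which the text restricts to Lojbanized names
# (plus the single cmavo .y'y.).
pairs_with_y = [pair for pair in VOWEL_PAIRS if "y" in pair]
print(len(pairs_with_y))
```

Six vowels in each position give 6 × 6 = 36 pairs, of which 11 involve y.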

Vowel pairs involving y appear only in Lojbanized names. They could appear in cmavo (structure words), but only .y'y. is so used – it is the Lojban name of the apostrophe letter (See Section ).

When more than two vowels occur together in Lojban, the normal pronunciation pairs vowels from the left into syllables, as in the Lojbanized name:

meiin.
mei,in.

This example contains the diphthong ei followed by the vowel i. To indicate a different grouping, the comma must always be used, leading to:

me,iin.

which contains the vowel e followed by the diphthong ii. In rough English representation, the first grouping (mei,in.) is “May Een”, whereas the second (me,iin.) is “Meh Yeen”.
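The default left-to-right grouping can be sketched as a greedy scan over a run of vowels. This is one possible reading of "pairs vowels from the left", not a definitive algorithm; `group_vowels` is a hypothetical helper, and the diphthong set is taken from the table earlier in this chapter.

```python
# The 16 diphthongs listed in the diphthong table of this chapter.
DIPHTHONGS = {
    "ai", "ei", "oi", "au",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
    "iy", "uy",
}


def group_vowels(vowels):
    """Greedily group a run of vowels into syllable nuclei, left to right."""
    groups = []
    i = 0
    while i < len(vowels):
        pair = vowels[i:i + 2]
        if pair in DIPHTHONGS:
            groups.append(pair)      # two letters form one diphthong
            i += 2
        else:
            groups.append(vowels[i])  # a single vowel on its own
            i += 1
    return groups


# The vowel run of the name meiin. groups as ei + i by default.
print(group_vowels("eii"))
```

Under this reading, the alternative grouping e + ii can only be obtained by writing the comma explicitly, as the text requires.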

Consonant Clusters

A consonant sound is a relatively brief speech-sound that precedes or follows a vowel sound in a syllable; its presence either preceding or following does not add to the count of syllables, nor is a consonant required in either position for any syllable. Lojban has seventeen consonants: for the purposes of this section, the apostrophe is not counted as a consonant.

An important distinction dividing Lojban consonants is that of voicing. The following table shows the unvoiced consonants and the corresponding voiced ones:

<tab class=wikitable>
Unvoiced	Voiced
p	b
t	d
k	g
f	v
c	j
s	z
x	-
</tab>

The consonant x has no voiced counterpart in Lojban. The remaining consonants, l, m, n, and r, are typically pronounced with voice, but can be pronounced unvoiced.

Consonant sounds occur in languages as single consonants, or as doubled, or as clustered combinations. Single consonant sounds are isolated by word boundaries or by intervening vowel sounds from other consonant sounds. Doubled consonant sounds are either lengthened like [s] in English “hiss”, or repeated like [k] in English “backcourt”. Consonant clusters consist of two or more single or doubled consonant sounds in a group, each of which is different from its immediate neighbor. In Lojban, doubled consonants are excluded altogether, and clusters are limited to two or three members, except in Lojbanized names.

Consonants can occur in three positions in words: initial (at the beginning), medial (in the middle), and final (at the end). In many languages, the sound of a consonant varies depending upon its position in the word. In Lojban, as much as possible, the sound of a consonant is unrelated to its position. In particular, the common American English trait of changing a “t” between vowels into a “d” or even an alveolar tap (IPA [ɾ]) is unacceptable in Lojban.

Lojban imposes no restrictions on the appearance of single consonants in any valid consonant position; however, no consonant (including syllabic consonants) occurs final in a word except in Lojbanized names.

Pairs of consonants can also appear freely, with the following restrictions:

  • It is forbidden for both consonants to be the same, as this would violate the rule against double consonants.
  • It is forbidden for one consonant to be voiced and the other unvoiced. The consonants l, m, n, and r are exempt from this restriction. As a result, bf is forbidden, and so is sd, but both fl and vl, and both ls and lz, are permitted.
  • It is forbidden for both consonants to be drawn from the set c, j, s, z.
  • The specific pairs cx, kx, xc, xk, and mz are forbidden.

These rules apply to all kinds of words, even Lojbanized names. If a name would normally contain a forbidden consonant pair, a y can be inserted to break up the pair:

djeimyz.
[dʒɛj məzʔ]
James

The regular English pronunciation of “James”, which is [dʒɛjmz], would Lojbanize as djeimz., which contains a forbidden consonant pair.
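The four restrictions on consonant pairs translate directly into a small predicate. The sketch below assumes its input is exactly two Lojban consonant letters; `is_permissible_pair` and the constant names are hypothetical.

```python
UNVOICED = set("ptkfcsx")
VOICED = set("bdgvjz")        # l, m, n, r are exempt from the voicing rule
SIBILANTS = set("cjsz")
FORBIDDEN_PAIRS = {"cx", "kx", "xc", "xk", "mz"}


def is_permissible_pair(pair):
    """Apply the four consonant-pair restrictions listed above."""
    first, second = pair
    if first == second:
        return False          # no doubled consonants
    mixed_voicing = (
        (first in VOICED and second in UNVOICED)
        or (first in UNVOICED and second in VOICED)
    )
    if mixed_voicing:
        return False          # voiced next to unvoiced
    if first in SIBILANTS and second in SIBILANTS:
        return False          # both drawn from c, j, s, z
    return pair not in FORBIDDEN_PAIRS


for pair in ["bf", "sd", "fl", "vl", "ls", "lz", "mz"]:
    print(pair, is_permissible_pair(pair))
```

The pair mz is rejected by the last rule, which is why “James” needs the inserted y in djeimyz.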

Initial Consonant Pairs

The set of consonant pairs that may appear at the beginning of a word (excluding Lojbanized names) is far more restricted than the fairly large group of permissible consonant pairs described in Section 2.3. Even so, it is more than English allows, although hopefully not more than English-speakers (and others) can learn to pronounce.

There are just 48 such permissible initial consonant pairs, as follows:

bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts vl vr xl xr zb zd zg zm zv

Lest this list seem almost random, a pairing of voiced and unvoiced equivalent consonants will show significant patterns which may help in learning:

pl pr fl fr
bl br vl vr

cp cf ct ck cm cn cl cr
jb jv jd jg jm
sp sf st sk sm sn sl sr
zb zv zd zg zm

tc tr ts kl kr
dj dr dz gl gr

ml mr
xl xr

Note that if both consonants of an initial pair are voiced, the unvoiced equivalent is also permissible, and the voiced pair can be pronounced simply by voicing the unvoiced pair. (The converse is not true: cn is a permissible initial pair, but jn is not.)
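The voicing pattern in the list of initial pairs can be verified mechanically. In this sketch, "devoicing" a pair means replacing b, d, g, v, j, z with p, t, k, f, c, s and leaving every other letter alone; that mapping and the constant names are assumptions made for illustration.

```python
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

# Letter-by-letter mapping between voiced consonants and their unvoiced
# counterparts; l, m, n, r and x are left unchanged.
DEVOICE = str.maketrans("bdgvjz", "ptkfcs")
VOICE = str.maketrans("ptkfcs", "bdgvjz")

# Every permissible initial pair stays permissible when devoiced.
assert all(pair.translate(DEVOICE) in INITIAL_PAIRS for pair in INITIAL_PAIRS)

# The converse fails: these unvoiced pairs have no permissible voiced form.
unvoiced_only = sorted(
    pair for pair in INITIAL_PAIRS if pair.translate(VOICE) not in INITIAL_PAIRS
)
print(unvoiced_only)
```

The printed list contains cn, matching the remark that cn is permissible while jn is not.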

Consonant triples can occur medially in Lojban words. They are subject to the following rules:

  • The first two consonants must constitute a permissible consonant pair;
  • The last two consonants must constitute a permissible initial consonant pair;
  • The triples ndj, ndz, ntc, and nts are forbidden.
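The three rules for medial consonant triples build on the pair rules and the list of initial pairs. The following self-contained sketch applies them to ordinary (non-name) words; the function and constant names are hypothetical.

```python
UNVOICED = set("ptkfcsx")
VOICED = set("bdgvjz")
SIBILANTS = set("cjsz")
FORBIDDEN_PAIRS = {"cx", "kx", "xc", "xk", "mz"}
FORBIDDEN_TRIPLES = {"ndj", "ndz", "ntc", "nts"}
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())


def is_permissible_pair(pair):
    """Pair restrictions: no doubling, no mixed voicing, no two sibilants."""
    first, second = pair
    if first == second:
        return False
    if (first in VOICED and second in UNVOICED) or (
        first in UNVOICED and second in VOICED
    ):
        return False
    if first in SIBILANTS and second in SIBILANTS:
        return False
    return pair not in FORBIDDEN_PAIRS


def is_permissible_triple(triple):
    """First two letters: permissible pair; last two: permissible initial pair."""
    if triple in FORBIDDEN_TRIPLES:
        return False
    return is_permissible_pair(triple[:2]) and triple[1:] in INITIAL_PAIRS


for triple in ["tpr", "pck", "ndj", "stn"]:
    print(triple, is_permissible_triple(triple))
```

Note that ndj would pass the first two rules (nd is a permissible pair and dj is an initial pair), so the explicit list of forbidden triples is doing real work.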

Lojbanized names can begin or end with any permissible consonant pair, not just the 48 initial consonant pairs listed above, and can have consonant triples in any location, as long as the pairs making up those triples are permissible. In addition, names can contain consonant clusters with more than three consonants, again requiring that each pair within the cluster is valid.

Buffering Of Consonant Clusters

Many languages do not have consonant clusters at all, and even those languages that do have them often allow only a subset of the full Lojban set. As a result, the Lojban design allows the use of a buffer sound between consonant combinations which a speaker finds unpronounceable. This sound may be any non-Lojbanic vowel which is clearly separable by the listener from the Lojban vowels. Some possibilities are IPA [ɪ], [ɨ], [ʊ], or even [ʏ], but there probably is no universally acceptable buffer sound. When using a consonant buffer, the sound should be made as short as possible. Two examples showing such buffering (we will use [ɪ] in this chapter) are:

vrusi
[ˈvru si] or [vɪ ˈru si]
.AMsterdam.
[ʔam ster damʔ] or [ˈʔa mɪ sɪ tɛ rɪ da mɪʔ]

When a buffer vowel is used, it splits each buffered consonant into its own syllable. However, the buffering syllables are never stressed, and are not counted in determining stress. They are, in effect, not really syllables to a Lojban listener, and thus their impact is ignored.

Here are more examples of unbuffered and buffered pronunciations:

klama
[ˈkla ma]
[kɪ ˈla ma]
xapcke
[ˈxap ʃkɛ]
[ˈxa pɪ ʃkɛ]
[ˈxa pɪ ʃɪ kɛ]

In the xapcke example, we see that buffering vowels can be used in just some, rather than all, of the possible places: the second pronunciation buffers the pc consonant pair but not the ck. The third pronunciation buffers both.
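Full buffering, as in the last pronunciation of xapcke, can be sketched as inserting the buffer after every consonant that is not directly followed by a vowel. The sketch works on Lojban spelling rather than IPA, uses [ɪ] as the buffer as this chapter does, and ignores syllabic consonants; `fully_buffer` is a hypothetical helper, and since the buffer is never written in real text, the output is only a pronunciation aid.

```python
VOWELS = set("aeiouy")
CONSONANTS = set("bcdfgjklmnprstvxz")
BUFFER = "ɪ"  # the buffer vowel used for illustration in this chapter


def fully_buffer(word):
    """Insert a buffer after each consonant not followed by a vowel."""
    out = []
    for index, letter in enumerate(word):
        out.append(letter)
        following = word[index + 1] if index + 1 < len(word) else ""
        if letter in CONSONANTS and following not in VOWELS:
            out.append(BUFFER)
    return "".join(out)


for word in ["vrusi", "klama", "xapcke", ".amsterdam."]:
    print(word, "->", fully_buffer(word))
```

Partial buffering, which the text also allows, would insert the buffer at only some of these positions.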

ponyni'u
[po nə 'ni hu]

This example cannot contain any buffering vowel. It is important not to confuse the vowel y, which is pronounced [ə], with the buffer, which has a variety of possible pronunciations and is never written. Consider the contrast between

bongynanba
[boŋ gə ˈnan ba]

an unlikely Lojban compound word meaning “bone bread” (note the use of [ŋ] as a representative of n before g) and

bongnanba
[boŋ ˈgnan ba]

a possible borrowing from another language (Lojban borrowings can only take a limited form). If bongnanba were pronounced with buffering, as

[boŋ gɪ ˈnan ba]

it would be very similar to bongynanba. Only a clear distinction between y and any buffering vowel would keep the two words distinct.

Since buffering is done for the benefit of the speaker in order to aid pronounceability, there is no guarantee that the listener will not mistake a buffer vowel for one of the six regular Lojban vowels. The buffer vowel should be as laxly pronounced as possible, as central as possible, and as short as possible. Furthermore, it is worthwhile for speakers who use buffers to pronounce their regular vowels a bit longer than usual, to avoid confusion with buffer vowels. The speakers of many languages will have trouble correctly hearing any of the suggested buffer vowels otherwise. By this guideline, the buffered form of bongnanba would be pronounced

[boːŋ gɪ ˈnaːn baː]

with lengthened vowels.

Syllabication And Stress

A Lojban word has one syllable for each of its vowels, diphthongs, and syllabic consonants (referred to simply as “vowels” for the purposes of this section). Syllabication rules determine which of the consonants separating two vowels belong to the preceding vowel and which to the following vowel. These rules are conventional only; the phonetic facts of the matter about how utterances are syllabified in any language are always very complex.

A single consonant always belongs to the following vowel. A consonant pair is normally divided between the two vowels; however, if the pair constitute a valid initial consonant pair, they are normally both assigned to the following vowel. A consonant triple is divided between the first and second consonants. Apostrophes and commas, of course, also represent syllable breaks. Syllabic consonants usually appear alone in their syllables.

It is permissible to vary from these rules in Lojbanized names. For example, there are no definitive rules for the syllabication of names with consonant clusters longer than three consonants. The comma is used to indicate variant syllabication or to explicitly mark normal syllabication.

Here are some examples of Lojban syllabication:

pujenaicajeba
pu,je,nai,ca,je,ba

This word has no consonant pairs and is therefore syllabified before each medial consonant.

ninmu
nin,mu

This word is split at a consonant pair.

fitpri
fit,pri

This word is split at a consonant triple, between the first two consonants of the triple.

sairgoi
sair,goi
sai,r,goi

This word contains the consonant pair rg; the r may be pronounced syllabically or not.

klezba
klez,ba
kle,zba

This word contains the permissible initial pair zb, and so may be syllabicated either between z and b or before zb.
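The conventional rules above can be sketched as a small syllabifier. This is a simplified illustration, not a complete implementation: it recognizes only the off-glide diphthongs ai, ei, oi, and au, treats syllabic consonants as ordinary consonants, does not handle apostrophes, and always sends a valid initial pair to the following vowel. `syllabify` is a hypothetical helper name.

```python
import re

INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

# A syllable nucleus: an off-glide diphthong if possible, else one vowel.
NUCLEUS = re.compile(r"ai|ei|oi|au|[aeiouy]")


def syllabify(word):
    """Split a word into comma-separated syllables by the rules above."""
    nuclei = list(NUCLEUS.finditer(word))
    syllables = []
    start = 0
    for current, following in zip(nuclei, nuclei[1:]):
        cluster = word[current.end():following.start()]
        if len(cluster) <= 1 or cluster in INITIAL_PAIRS:
            keep = 0  # the whole cluster starts the next syllable
        else:
            keep = 1  # the first consonant closes the current syllable
        cut = current.end() + keep
        syllables.append(word[start:cut])
        start = cut
    syllables.append(word[start:])
    return ",".join(syllables)


for word in ["pujenaicajeba", "ninmu", "fitpri", "klezba"]:
    print(word, "->", syllabify(word))
```

For klezba the sketch produces only kle,zba; the alternative klez,ba shown above is equally acceptable under the rules.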

Stress is a relatively louder pronunciation of one syllable in a word or group of words. Since every syllable has a vowel sound (or diphthong or syllabic consonant) as its nucleus, and the stress is on the vowel sound itself, the terms “stressed syllable” and “stressed vowel” are largely interchangeable concepts.

Most Lojban words are stressed on the next-to-the-last, or penultimate, syllable. In counting syllables, however, syllables whose vowel is y or which contain a syllabic consonant (l, m, n, or r) are never counted. (The Lojban term for penultimate stress is da'amoi terbasna.) Similarly, syllables created solely by adding a buffer vowel, such as [ɪ], are not counted.

There are actually three levels of stress – primary, secondary, and weak. Weak stress is the lowest level, so it really means no stress at all. Weak stress is required for syllables containing y, a syllabic consonant, or a buffer vowel.

Primary stress is required on the penultimate syllable of Lojban content words (called brivla). Lojbanized names may be stressed on any syllable, but if a syllable other than the penultimate is stressed, the syllable (or at least its vowel) must be capitalized in writing. Lojban structural words (called cmavo) may be stressed on any syllable or none at all. However, primary stress may not be used in a syllable just preceding a brivla, unless a pause divides them; otherwise, the two words may run together.
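Finding the primary stress of a brivla amounts to skipping the uncounted syllables and taking the penultimate of the rest. The sketch below assumes the word is already split into lowercase, comma-separated syllables, and it treats any syllable without one of a, e, i, o, u as uncounted (which covers both y syllables and syllabic consonants); `mark_stress` is a hypothetical helper.

```python
def mark_stress(syllabified):
    """Capitalize the syllable carrying primary stress.

    Input: lowercase syllables separated by commas, e.g. "di,ky,jvo".
    Syllables containing none of a, e, i, o, u are not counted.
    """
    syllables = syllabified.split(",")
    counted = [
        index
        for index, syllable in enumerate(syllables)
        if any(vowel in syllable for vowel in "aeiou")
    ]
    if len(counted) >= 2:
        target = counted[-2]  # penultimate among the counted syllables
        syllables[target] = syllables[target].upper()
    return ",".join(syllables)


for word in ["kla,ma", "di,ky,jvo", "bi,sy,dja", "sai,r,goi"]:
    print(word, "->", mark_stress(word))
```

Words with fewer than two counted syllables are returned unchanged, since there is no penultimate syllable to stress.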

Secondary stress is the optional and non-distinctive emphasis used for other syllables besides those required to have either weak or primary stress. There are few rules governing secondary stress, which typically will follow a speaker's native language habits or preferences. Secondary stress can be used for contrast, or for emphasis of a point. Secondary stress can be emphasized at any level up to primary stress, although the speaker must not allow a false primary stress in brivla, since errors in word resolution could result.

The following are Lojban words with stress explicitly shown:

dikyjvo
DI,ky,jvo

(In a fully-buffered dialect, the pronunciation would be: ['di kə ʒɪ vo].) Note that the syllable ky is not counted in determining stress. The vowel y is never stressed in a normal Lojban context.

.armstrong.
.ARM,strong.

This is a Lojbanized version of the name “Armstrong”. The final g must be explicitly pronounced. With full buffering, the name would be pronounced:

[ˈʔa rɪ mɪ sɪ tɪ ro nɪ gɪʔ]

However, there is no need to insert a buffer in every possible place just because it is inserted in one place: partial buffering is also acceptable. In every case, however, the stress remains in the same place: on the first syllable.

The English pronunciation of “Armstrong”, as spelled in English, is not correct by Lojban standards; the letters “ng” in English represent a velar nasal (IPA [ŋ]) which is a single consonant. In Lojban, ng represents two separate consonants that must both be pronounced; you may not use [ŋ] to pronounce Lojban ng, although [ŋg] is acceptable. English speakers are likely to have to pronounce the ending with a buffer, as one of the following:
[ˈʔarm stron gɪʔ]

or

[ˈʔarm stroŋ gɪʔ]

or even

[ˈʔarm stro nɪgʔ]

The normal English pronunciation of the name “Armstrong” could be Lojbanized as:

.ARMstron.

since Lojban n is allowed to be pronounced as the velar nasal [ŋ].

Here is another example showing the use of y:

bisydja
BI,sy,dja
BI,syd,ja

This word is a compound word, or lujvo, built from the two affixes bis and dja. When they are joined, an impermissible consonant pair results: sd. In accordance with the algorithm for making lujvo, explained in Section 5.10, a y is inserted to separate the impermissible consonant pair; the y is not counted as a syllable for purposes of stress determination.
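
As a rough illustration of the stress rule, the following Python sketch finds the stressed syllable of a brivla written with explicit syllable breaks. The function name and the input format are assumptions made for this example; it only skips syllables containing y, and it does not handle syllabic consonants or Lojbanized names.

```python
import re

def stressed_syllable(word):
    """Return the index of the syllable that carries primary stress.

    `word` is a brivla written with explicit syllable breaks, using the
    comma (and the apostrophe, which always stands between two vowels)
    as separators, for example "di,ky,jvo".
    """
    syllables = re.split(r"[,']", word.lower())
    # Syllables whose vowel is y are skipped when counting back from the end.
    countable = [i for i, syl in enumerate(syllables) if "y" not in syl]
    if len(countable) < 2:
        raise ValueError("need at least two countable syllables: " + word)
    return countable[-2]
```

For "di,ky,jvo" and "bi,sy,dja" the function selects the first syllable, which matches the stress shown in the examples above.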

da'udja
da'UD,ja
da'U,dja

These two syllabications sound the same to a Lojban listener – the association of unbuffered consonants in syllables is of no import in recognizing the word.

e'u bridi
e'u BRI,di
E'u BRI,di
e'U.BRI,di

In the example above, e'u is a cmavo and bridi is a brivla. Either of the first two pronunciations is permitted: no primary stress on either syllable of e'u, or primary stress on the first syllable. The third pronunciation, which places primary stress on the second syllable of the cmavo, requires that the two words be separated by a pause, since the following word is a brivla. Consider the following two cases:

le re nobli prenu
le re NObli PREnu
le re no bliprenu
le re no bliPREnu

If the cmavo no in the second phrase were to be stressed, the phrase would sound exactly like the given pronunciation of the first, which is unacceptable in Lojban: a single pronunciation cannot represent both.

IPA For English Speakers

There are many dialects of English, thus making it difficult to define the standardized symbols of the IPA in terms useful to every reader. All the symbols used in this chapter are repeated here, in more or less alphabetical order, with examples drawn from General American. In addition, some attention is given to the Received Pronunciation of (British) English. These two dialects are referred to as GA and RP respectively. Speakers of other dialects should consult a book on phonetics or their local television sets.

[ˈ]
  • An IPA indicator of primary stress; the syllable which follows [ˈ] receives primary stress.
[ʔ]
  • An allowed variant of Lojban . (the period). This sound is not usually considered part of English. It is the catch in your throat that sometimes occurs prior to the beginning of a word (and sometimes a syllable) which starts with a vowel. In some dialects, like Cockney and some kinds of American English, it is used between vowels instead of “t”: “bottle” [boʔl̩]. The English interjection “uh-oh!” almost always has it between the syllables.
[ː]
  • A symbol indicating that the previous vowel is to be spoken for a longer time than usual. Lojban vowels can be pronounced long in order to make a greater contrast with buffer vowels.
[a]
  • The preferred pronunciation of Lojban a. This sound doesn't occur in GA, but sounds somewhat like the “ar” of “park”, as spoken in RP or New England American. It is pronounced further forward in the mouth than [ɑ].
[ɑ]
  • An allowed variant of Lojban a. The “a” of GA “father”. The sound [a] is preferred because GA speakers often relax an unstressed [ɑ] into a schwa [ə], as in the usual pronunciations of “about” and “sofa”. Because schwa is a distinct vowel in Lojban, English speakers must either learn to avoid this shift or to use [a] instead: the Lojban word for “sofa” is sfofa, pronounced [sfofa] or [sfofɑ] but never [sfofə], which would be the non-word sfofy.
[æ]
  • Not a Lojban sound. The “a” of English “cat”.
[b]
  • The preferred pronunciation of Lojban b. As in English “boy”, “sober”, or “job”.
[β]
  • An allowed variant of Lojban v. Not an English sound; the Spanish "b" or "v" between vowels. This sound should not be used for Lojban b.
[d]
  • The preferred pronunciation of Lojban d. As in English “dog”, “soda”, or “mad”.
[ɛ]
  • The preferred pronunciation of Lojban e. The “e” of English “met”.
[e]
  • An allowed variant of Lojban e. This sound is not found in English, but is the Spanish "e", or the tense "e" of Italian. The vowel of English “say” is similar except for the off-glide: you can learn to make this sound by holding your tongue steady while saying the first part of the English vowel.
[ə]
  • The preferred pronunciation of Lojban y. As in the “a” of English “sofa” or “about”. Schwa is generally unstressed in Lojban, as it is in English. It is a totally relaxed sound made with the tongue in the middle of the mouth.
[f]
  • The preferred pronunciation of Lojban f. As in “fee”, “loafer”, or “chef”.
[ɸ]
  • An allowed variant of Lojban f. Not an English sound; the Japanese “f” sound.
[g]
  • The preferred pronunciation of Lojban g. As in English “go”, “eagle”, or “dog”.
[h]
  • The preferred pronunciation of the Lojban apostrophe sound. As in English “aha” or the second "h" in “oh, hello”.
[i]
  • The preferred pronunciation of Lojban i. Essentially like the English vowel of “pizza” or “machine”, although the English vowel is sometimes pronounced with an off-glide, which should not be present in Lojban.
[ɪ]
  • A possible Lojban buffer vowel. The “i” of English “bit”.
[ɨ]
  • A possible Lojban buffer vowel. The “u” of “just” in some varieties of GA, those which make the word sound more or less like “jist”. Also Russian "y" as in "byt'" (to be); like a schwa [ə], but higher in the mouth.
[j]
  • Used in Lojban diphthongs beginning or ending with i. Like the “y” in English “yard” or “say”.
[k]
  • The preferred pronunciation of Lojban k. As in English “kill”, “token”, or “flak”.
[l]
  • The preferred pronunciation of Lojban l. As in English “low”, “nylon”, or “excel”.
[l̩]
  • The syllabic version of Lojban l, as in English “bottle” or “middle”.
[m]
  • The preferred pronunciation of Lojban m. As in English “me”, “humor”, or “ham”.
[m̩]
  • The syllabic version of Lojban m. As in English “catch 'em” or “bottom”.
[n]
  • The preferred pronunciation of Lojban n. As in English “no”, “honor”, or “son”.
[n̩]
  • The syllabic version of Lojban n. As in English “button”.
[ŋ]
  • An allowed variant of Lojban n, especially in Lojbanized names and before g or k. As in English “sing” or “singer” (but not “finger” or “danger”).
[ŋ̍]
  • An allowed variant of Lojban syllabic n, especially in Lojbanized names.
[o]
  • The preferred pronunciation of Lojban o. As in the French "haute (cuisine)" or Spanish "como". There is no exact English equivalent of this sound. The nearest GA equivalent is the “o” of “dough” or “joke”, but it is essential that the off-glide (a [w]-like sound) at the end of the vowel is not pronounced when speaking Lojban. The RP sound in these words is [əw] in IPA terms, and has no [o] in it at all; unless you can speak with a Scots, Irish, or American accent, you may have trouble with this sound.
[ɔ]
  • An allowed variant of Lojban o, especially before r. This sound is a shortened form of the “aw” in GA “dawn” (for those people who don't pronounce “dawn” and “Don” alike; if you do, you may have trouble with this sound). In RP, but not GA, it is the “o” of “hot”.
[p]
  • The preferred pronunciation of Lojban p. As in English “pay”, “super”, or “up”.
[r]
  • One version of Lojban r. Not an English sound. The Spanish "rr" and the Scots “r”, a tongue-tip trill.
[ɹ]
  • One version of Lojban r. As in GA “right”, “baron”, or “car”. Not found in RP.
[ɾ]
  • One version of Lojban r. In GA, appears as a variant of “t” or “d” in the words “metal” and “medal” respectively. A tongue-tip flap.
[ʀ]
  • One version of Lojban r. Not an English sound. The French or German "r" in "reine" or "rot" respectively. A uvular trill.
[r̩], [ɹ̩], [ɾ̩], [ʀ̩]
  • Syllabic versions of the above. [ɹ̩] appears in the GA (but not RP) pronunciation of “bird”.
[s]
  • The preferred pronunciation of Lojban s. As in English “so”, “basin”, or “yes”.
[ʃ]
  • The preferred pronunciation of Lojban c. The “sh” of English “ship”, “ashen”, or “dish”.
[ʂ]
  • An allowed variant of Lojban s. Not an English sound. The Hindi retroflex "s" with dot below, or Klingon "S".
[t]
  • The preferred pronunciation of Lojban t. As in English “tea”, “later”, or “not”. It is important to avoid the GA habit of pronouncing the “t” between vowels as [d] or [ɾ].
[θ]
  • Not normally a Lojban sound, but a possible variant of Lojban '. The “th” of English “thin” (but not “then”).
[v]
  • The preferred pronunciation of Lojban v. As in English “voice”, “savor”, or “live”.
[w]
  • Used in Lojban diphthongs beginning or ending with u. Like the “w” in English “wet” [wɛt] or “cow” [kɑw].
[x]
  • The preferred pronunciation of Lojban x. Not normally an English sound, but used in some pronunciations of “loch” and “Bach”; “gh” in Scots “might” and “night”. The German "Ach-Laut". To pronounce [x], force air through your throat without vibrating your vocal cords; there should be lots of scrape.
[ʏ]
  • A possible Lojban buffer vowel. Not an English sound: the "ü" of German "hübsch".
[z]
  • The preferred pronunciation of Lojban z. As in English “zoo”, “hazard”, or “fizz”.
[ʒ]
  • The preferred pronunciation of Lojban j. The “si” of English “vision”, or the consonant at the end of GA “garage”.
[ʐ]
  • An allowed variant of Lojban z. Not an English sound. The voiced version of [ʂ].
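
The preferred pronunciations listed above can be collected into a lookup table. The Python sketch below is only an illustration: the names are invented for this example, the entries for r and u are assumptions (the list gives several versions of r without preferring one, and has no entry for u), and the period is mapped to [ʔ], which the list describes as an allowed variant. It works letter by letter, so it does not mark stress or turn i and u into glides inside diphthongs.

```python
# Letter-to-IPA table built from the "preferred pronunciation" entries.
# Assumptions: "r" and "u" are not settled by the list; "." uses the
# allowed variant [ʔ]; the comma has no sound and is dropped.
PREFERRED_IPA = {
    "a": "a", "b": "b", "c": "ʃ", "d": "d", "e": "ɛ", "f": "f",
    "g": "g", "i": "i", "j": "ʒ", "k": "k", "l": "l", "m": "m",
    "n": "n", "o": "o", "p": "p", "r": "r", "s": "s", "t": "t",
    "u": "u", "v": "v", "x": "x", "y": "ə", "z": "z",
    "'": "h", ".": "ʔ",
}

def broad_ipa(text):
    """Transcribe Lojban text letter by letter; unknown characters pass through."""
    return "".join(PREFERRED_IPA.get(ch, ch) for ch in text.lower() if ch != ",")
```

For example, `broad_ipa("lojban")` gives `loʒban`, and `broad_ipa("sfofa")` gives `sfofa`, never `sfofə`.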

English Analogues For Lojban Diphthongs

Here is a list of English words that contain diphthongs that are similar to the Lojban diphthongs. This list does not constitute an official pronunciation guide; it is intended as a help to English-speakers.

<tab class=wikitable header=true>
Lojban	English
ai	“pie”
ei	“pay”
oi	“boy”
au	“cow”
ia	“yard”
ie	“yes”
ii	“ye”
io	“yodel” (in GA only)
iu	“unicorn” or “few”
ua	“suave”
ue	“wet”
ui	“we”
uo	“woe” (in GA only)
uu	“woo”
iy	“million” (the “io” part, that is)
uy	“was” (when unstressed)
</tab>

Oddball Orthographies

The following notes describe ways in which Lojban has been written or could be written that differ from the standard orthography explained in the rest of this chapter. Nobody needs to read this section except people with an interest in the obscure. Technicalities are used without explanation or further apology.

There exists an alternative orthography for Lojban, which is designed to be as compatible as possible (but no more so) with the orthography used in pre-Lojban versions of Loglan. The consonants undergo no change, except that x is replaced by h. The individual vowels likewise remain unchanged. However, the vowel pairs and diphthongs are changed as follows:

  • ai, ei, oi, au become ai, ei, oi, ao.
  • ia through iu and ua through uu remain unchanged.
  • a'i, e'i, o'i and a'o become a,i, e,i, o,i and a,o.
  • i'a through i'u and u'a through u'u are changed to ia through iu and ua through uu in lujvo and cmavo other than attitudinals, but become i,a through i,u and u,a through u,u in names, fu'ivla, and attitudinal cmavo.
  • All other vowel pairs simply drop the apostrophe.

The result of these rules is to eliminate the apostrophe altogether, replacing it with comma where necessary, and otherwise with nothing. In addition, names and the cmavo i are capitalized, and irregular stress is marked with an apostrophe (now no longer used for a sound) following the stressed syllable.
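
The respelling rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function name and the `keep_comma` flag are invented here, and the caller is expected to set `keep_comma=True` for names, fu'ivla, and attitudinal cmavo. Capitalization of names and of the cmavo i, and the marking of irregular stress, are not handled.

```python
import re

# Vowel pairs that keep a separator (written as a comma) in every word class.
ALWAYS_COMMA = {("a", "i"), ("e", "i"), ("o", "i"), ("a", "o")}

# One pass over the standard spelling: the diphthong au, the consonant x,
# and any apostrophe standing between two vowels.
_PATTERN = re.compile(r"au|x|(?<=[aeiouy])'(?=[aeiouy])")

def to_alternative(word, keep_comma=False):
    """Respell one standard-orthography word in the alternative orthography."""
    def replace(match):
        token = match.group(0)
        if token == "x":
            return "h"
        if token == "au":
            return "ao"
        # The token is an apostrophe; look at the vowels around it.
        before = match.string[match.start() - 1]
        after = match.string[match.end()]
        if (before, after) in ALWAYS_COMMA:
            return ","
        if before in "iu" and after in "aeiou":
            return "," if keep_comma else ""
        return ""  # all other vowel pairs simply drop the apostrophe
    return _PATTERN.sub(replace, word)
```

The substitution is done in a single pass over the original spelling so that an au produced by dropping the apostrophe of a'u is not then rewritten as ao.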

Three points must be emphasized about this alternative orthography:

  • It is not standard, and has not been used.
  • It does not represent any changes to the standard Lojban phonology; it is simply a representation of the same phonology using a different written form.
  • It was designed to aid in a planned rapprochement between the Logical Language Group and The Loglan Institute, a group headed by James Cooke Brown. The rapprochement never took place.

There also exists a Cyrillic orthography for Lojban which was designed when the introductory Lojban brochure was translated into Russian. It uses the “а”, “б”, “в”, “г”, “д”, “е”, “ж”, “з”, “и”, “к”, “л”, “м”, “н”, “о”, “п”, “р”, “с”, “т”, “у”, “ф”, “х”, and “ш” in the obvious ways. The Latin letter “y” is mapped onto the hard sign “ъ”, as in Bulgarian. The apostrophe, comma, and period are unchanged. Diphthongs are written as vowel pairs, as in the Roman representation.
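
A transliteration table for this mapping might look like the Python sketch below. The pairing of individual letters is an assumption based on reading “the obvious ways” against the Lojban sound values (c with “ш”, j with “ж”, x with “х”); the function name is invented for this example. The input is lowercased first, so capital letters used to mark irregular stress in names are lost.

```python
# Latin letters and their Cyrillic counterparts, position by position.
# The pairing is an assumption; y maps to the hard sign as stated above.
LATIN = "abvgdejziklmnoprstufxcy"
CYRILLIC = "абвгдежзиклмнопрстуфхшъ"
TO_CYRILLIC = str.maketrans(LATIN, CYRILLIC)

def to_cyrillic(text):
    """Transliterate Lojban text; apostrophe, comma, and period are unchanged."""
    return text.lower().translate(TO_CYRILLIC)
```

With this table, `to_cyrillic("lojban")` yields `ложбан`.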

Finally, an orthography using the Tengwar of Fëanor, a fictional orthography invented by J. R. R. Tolkien and described in the Appendixes to The Lord Of The Rings, has been devised for Lojban. The following mapping, which closely resembles that used for Westron, will be meaningful only to those who have read those appendixes. In brief, the tincotéma and parmatéma are used in the conventional ways; the calmatéma represents palatal consonants, and the quessetéma represents velar consonants.

tinco   t     calma   -
ando    d     anga    -
thule   -     harma   c
anto    -     anca    j
numen   n     noldo   -
ore     r     anna    i
parma   p     quesse  k
umbar   b     ungwe   g
formen  f     hwesta  x
ampa    v     unque   -
malta   m     nwalme  -
vala    u     vilya   -

The letters "vala" and "anna" are used for u and i only when those letters are used to represent glides. Of the additional letters, r, l, s, and z are written with "rómen", "lambe", "silme", and "áre"/"esse" respectively; the inverted forms are used as free variants.

Lojban, like Quenya, is a vowel-last language, so tehtar are read as following the tengwar on which they are placed. The conventional tehtar are used for the five regular vowels, and the dot below for y. The Lojban apostrophe is represented by "halla". There is no equivalent of the Lojban comma or period.

Basic sentence elements

The picture for chapter 2

The concept of the bridi

This chapter gives diagrammed examples of basic Lojban sentence structures. The most general pattern is covered first, followed by successive variations on the basic components of the Lojban sentence. There are many more capabilities not covered in this chapter, but covered in detail in later chapters, so this chapter is a “quick tour” of the material later covered more slowly throughout the book. It also introduces most of the Lojban words used to discuss Lojban grammar.

Let us consider John and Sam and three statements about them:

Example 3.1:

John is the father of Sam.

Example 3.2:

John hits Sam.

Example 3.3:

John is taller than Sam.

These examples all describe relationships between John and Sam. However, in English, we use the noun “father” to describe a static relationship in Example 3.1, the verb “hits” to describe an active relationship in Example 3.2, and the adjective “taller” to describe an attributive relationship in Example 3.3. In Lojban we make no such grammatical distinctions; these three sentences, when expressed in Lojban, are structurally identical. The same part of speech is used to represent the relationship. In formal logic this whole structure is called a “predication”; in Lojban it is called a bridi, and the central part of speech is the selbri. Logicians refer to the things thus related as “arguments”, while Lojbanists call them sumti. These Lojban terms will be used for the rest of the book.

            bridi (predicate)
 ______________|__________________
 |                               |
John     is the father of       Sam
|____|    |______________|      |___|
  |              |               |
sumti         selbri          sumti (argument)

In a relationship, there are a definite number of things being related. In English, for example, “give” has three places: the donor, the recipient and the gift. For example:

Example 3.4:

John gives Sam the book.

and

Example 3.5:

Sam gives John the book.

mean two different things because the relative positions of “John” and “Sam” have been switched. Further,

Example 3.6:

The book gives John Sam.

seems strange to us merely because the places are being filled by unorthodox arguments. The relationship expressed by “give” has not changed.

In Lojban, each selbri has a specified number and type of arguments, known collectively as its “place structure”. The simplest kind of selbri consists of a single root word, called a gismu, and the definition in a dictionary gives the place structure explicitly. The primary task of constructing a Lojban sentence, after choosing the relationship itself, is deciding what you will use to fill in the sumti places.

This book uses the Lojban terms bridi, sumti, and selbri, because it is best to come to understand them independently of the English associations of the corresponding words, which are only roughly similar in meaning anyhow.

The Lojban examples in this chapter (but not in the rest of the book) use a single underline (---) under each sumti, and a double underline (===) under each selbri, to help you to tell them apart.

Pronunciation

Detailed pronunciation and spelling rules are given in Chapter 4, but what follows will keep the reader from going too far astray while digesting this chapter.

Lojban has six recognized vowels: a, e, i, o, u and y. The first five are pronounced roughly like the “a” in “father”, the “e” in “let”, the “i” in “machine”, the “o” in “dome” and the “u” in “flute”. y is pronounced as the sound called “schwa”, that is, like the unstressed “a” in “about” or “around”.

Twelve consonants in Lojban are pronounced more or less as their counterparts are in English: b, d, f, k, l, m, n, p, r, t, v and z. The letter c, on the other hand, is pronounced as the “sh” in “hush”, while j is its voiced counterpart, the sound of the “s” in “pleasure”. g is always pronounced as it is in “gift”, never as in “giant”. s is as in “sell”, never as in “rose”. The sound of x is not found in English in normal words. It is found as "ch" in Scottish "loch", as "j" in Spanish "junta", and as "ch" in German "Bach"; it also appears in the English interjection “yecchh!”. It gets easier to say as you practice it. The letter r can be trilled, but doesn't have to be.

The Lojban diphthongs ai, ei, oi, and au are pronounced much as in the English words “sigh”, “say”, “boy”, and “how”. Other Lojban diphthongs begin with an i pronounced like English “y” (for example, io is pronounced “yo”) or else with a u pronounced like English “w” (for example, ua is pronounced “wa”).

Lojban also has three “semi-letters”: the period, the comma and the apostrophe. The period represents a glottal stop or a pause; it is a required stoppage of the flow of air in the speech stream. The apostrophe sounds just like the English letter “h”. Unlike a regular consonant, it is not found at the beginning or end of a word, nor is it found adjacent to a consonant; it is only found between two vowels. The comma has no sound associated with it, and is used to separate syllables that might ordinarily run together. It is not used in this chapter.

Stress falls on the next to the last syllable of all words, unless that vowel is y, which is never stressed; in such words the third-to-last syllable is stressed. If a word only has one syllable, then that syllable is not stressed.

All Lojban words are pronounced as they are spelled: there are no silent letters.

Words that can act as sumti

Here is a short table of single words used as sumti. This table provides examples only, not the entire set of such words, which may be found in Section .

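mi = I/me, we/us
do = you
ti = this, these
ta = that, those
tu = that far away, those far away
zo'e = unspecified value (used when a sumti is unimportant or obvious)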

Lojban sumti are not specific as to number (singular or plural), nor gender (masculine/feminine/neutral). Such distinctions can be optionally added by methods that are beyond the scope of this chapter.

The cmavo ti, ta, and tu refer to whatever the speaker is pointing at, and should not be used to refer to things that cannot in principle be pointed at.

Names may also be used as sumti, provided they are preceded with the word la:

la meris.
the one/ones named Mary
la djan.
the one/ones named John

Other Lojban spelling versions are possible for names from other languages, and there are restrictions on which letters may appear in Lojban names: See Section for more information.

Some words used to indicate selbri relations

Here is a short table of some words used as Lojban selbri in this chapter:

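vecnu = x1 (seller) sells x2 (goods) to x3 (buyer) for x4 (price)
tavla = x1 (talker) talks to x2 (audience) about x3 (topic) in language x4
sutra = x1 (agent) is fast at doing x2 (action)
blari'o = x1 (object/light source) is blue-green
melbi = x1 (object/idea) is beautiful to x2 (observer) by standard x3
cutci = x1 is a shoe/boot for x2 (foot) made of x3 (material)
bajra = x1 runs on x2 (surface) using x3 (limbs) in manner x4 (gait)
klama = x1 goes/comes to x2 (destination) from x3 (origin point) via x4 (route) using x5 (means of transportation)
pluka = x1 pleases/is pleasing to x2 (experiencer) under conditions x3
gerku = x1 is a dog of breed x2
kurji = x1 takes care of x2
kanro = x1 is healthy by standard x2
stali = x1 stays/remains with x2
zarci = x1 is a market/store/shop selling x2 (products) operated by x3 (storekeeper)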

Each selbri (relation) has a specific rule that defines the role of each sumti in the bridi, based on its position. In the table above, that order was expressed by labeling the sumti positions as x1, x2, x3, x4, and x5.

Like the table in Section 3.3, this table is far from complete: in fact, no complete table can exist, because Lojban allows new words to be created (in specified ways) whenever a speaker or writer finds the existing supply of words inadequate. This notion is a basic difference between Lojban (and some other languages such as German and Chinese) and English; in English, most people are very leery of using words that “aren't in the dictionary”. Lojbanists are encouraged to invent new words; doing so is a major way of participating in the development of the language. Chapter 5 explains how to make new words, and a later chapter explains how to give them appropriate meanings.

Some simple Lojban bridi

Let's look at a simple Lojban bridi. The place structure of the gismu tavla is

Example 3.7:

x1 talks to x2 about x3 in language x4

where the “x”-es with following numbers represent the various arguments that could be inserted at the given positions in the English sentence. For example:

Example 3.8:

John talks to Sam about engineering in Lojban.

has “John” in the x1 place, “Sam” in the x2 place, “engineering” in the x3 place, and “Lojban” in the x4 place, and could be paraphrased:

Example 3.9:

Talking is going on, with speaker John and listener Sam and subject matter engineering and language Lojban.

The Lojban bridi corresponding to Example 3.7 will have the form

Example 3.10:

x1 cu tavla x2 x3 x4

The word cu serves as a separator between any preceding sumti and the selbri. It can often be omitted, as in the following examples.

Example 3.11:

mi tavla do zo'e zo'e
I talk to you about something in some language.

Example 3.12:

do tavla mi ta zo'e
You talk to me about that thing in a language.

Example 3.13:

mi tavla zo'e tu ti
I talk to someone about that thing yonder in this language.

(Example 3.13 is a bit unusual, as there is no easy way to point to a language; one might point to a copy of this book, and hope the meaning gets across!)

When there are one or more occurrences of the cmavo zo'e at the end of a bridi, they may be omitted, a process called “ellipsis”. Example 3.11 and Example 3.12 may be expressed thus:

Example 3.14:

mi tavla do
I talk to you (about something in some language).

Example 3.15:

do tavla mi ta
You talk to me about that thing (in some language).

Note that Example 3.13 is not subject to ellipsis by this direct method, as the zo'e in it is not at the end of the bridi.
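
The ellipsis rule can be sketched in a few lines of Python. The function name is an assumption made for this example, and the bridi is represented simply as a list of words.

```python
def elide_trailing_zohe(words):
    """Drop every zo'e that stands at the end of a bridi."""
    kept = list(words)
    while kept and kept[-1] == "zo'e":
        kept.pop()
    return kept
```

Applied to the words of Example 3.11 this leaves `mi tavla do`; the words of Example 3.13 come back unchanged, because the zo'e there is not at the end.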

Variant bridi structure

Consider the sentence

Example 3.16:

mi cu vecnu ti ta zo'e
seller-x1 sells goods-sold-x2 buyer-x3 price-x4
I sell this to that for some price.
I sell this-thing/these-things to that-buyer/those-buyers.
(the price is obvious or unimportant)

Example 3.16 has one sumti (the x1) before the selbri. It is also possible to put more than one sumti before the selbri, without changing the order of sumti:

Example 3.17:

mi ti cu vecnu ta
seller-x1 goods-sold-x2 sells buyer-x3
I this sell to that.
(translates as stilted or poetic English)
I this thing do sell to that buyer.

Example 3.18:

mi ti ta cu vecnu
seller-x1 goods-sold-x2 buyer-x3 sells
I this to that sell.
(translates as stilted or poetic English)
I this thing to that buyer do sell.

Example 3.16 through Example 3.18 mean the same thing. Usually, placing more than one sumti before the selbri is done for style or for emphasis on the sumti that are out-of-place from their normal position. (Native speakers of languages other than English may prefer such orders.)

If there are no sumti before the selbri, then it is understood that the x1 sumti value is equivalent to zo'e; i.e. unimportant or obvious, and therefore not given. Any sumti after the selbri start counting from x2.
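
A minimal Python sketch of this place-filling rule follows. The function name and the representation (two lists of sumti, those before and those after the selbri) are assumptions made for illustration; conversions and other constructs are ignored.

```python
def fill_places(before, after):
    """Map place numbers to sumti for one simple bridi.

    `before` and `after` are the sumti that precede and follow the selbri.
    """
    # With nothing before the selbri, x1 is taken to be zo'e.
    sumti = (list(before) or ["zo'e"]) + list(after)
    return {n: s for n, s in enumerate(sumti, start=1)}
```

Calling it with `(["mi"], ["ti", "ta"])`, `(["mi", "ti"], ["ta"])`, or `(["mi", "ti", "ta"], [])` produces the same mapping, which is the sense in which moving sumti in front of the selbri leaves the meaning unchanged.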

Example 3.19:

ta cu melbi
object/idea-x1 is-beautiful
to someone by some standard
That/Those is/are beautiful.
That is beautiful.
Those are beautiful.

when the x1 is omitted, becomes:

Example 3.20:

melbi
unspecified-x1 is-beautiful
to someone by some standard
Beautiful!
It's beautiful!

Omitting the x1 adds emphasis to the selbri relation, which has become first in the sentence. This kind of sentence is termed an observative, because it is often used when someone first observes or takes note of the relationship, and wishes to quickly communicate it to someone else. Commonly understood English observatives include “Smoke!” upon seeing smoke or smelling the odor, or “Car!” to a person crossing the street who might be in danger. Any Lojban selbri can be used as an observative if no sumti appear before the selbri.

The word cu does not occur in an observative; cu is a separator, and there must be a sumti before the selbri that needs to be kept separate for cu to be used. With no sumti preceding the selbri, cu is not permitted. Short words like cu which serve grammatical functions are called cmavo in Lojban.

Varying the order of sumti

For one reason or another you may want to change the order, placing one particular sumti at the front of the bridi. The cmavo se, when placed before the last word of the selbri, will switch the meanings of the first and second sumti places. So

Example 3.21:

mi tavla do ti
I talk to you about this.

has the same meaning as

Example 3.22:

do se tavla mi ti
You are talked to by me about this.

The cmavo te, when used in the same location, switches the meanings of the first and the third sumti places.

Example 3.23:

mi tavla do ti
I talk to you about this.

has the same meaning as

Example 3.24:

ti te tavla do mi
This is talked about to you by me.

Note that only the first and third sumti have switched places; the second sumti has remained in the second place.

The cmavo ve and xe switch the first and fourth sumti places, and the first and fifth sumti places, respectively. These changes in the order of places are known as “conversions”, and the se, te, ve, and xe cmavo are said to convert the selbri.

More than one of these operators may be used on a given selbri at one time, and in such a case they are evaluated from left to right. However, in practice they are used one at a time, as there are better tools for complex manipulation of the sumti places; these are covered in a later chapter.

The effect of conversion is similar to what in English is called the “passive voice”. In Lojban, the converted selbri has a new place structure that is renumbered to reflect the place reversal, which has consequences when such a conversion is used in combination with other constructs such as le selbri [ku] (see Section 3.10).
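
The place swap performed by a single conversion can be sketched in Python as below. The names are invented for this example, only one conversion cmavo is handled at a time, and missing places are padded with zo'e as an assumption so that the swap is always possible.

```python
# Which place each conversion cmavo exchanges with the first place.
CONVERTED_PLACE = {"se": 2, "te": 3, "ve": 4, "xe": 5}

def basic_order(converter, sumti):
    """Undo one conversion: return the sumti in the unconverted place order."""
    n = CONVERTED_PLACE[converter]
    places = list(sumti)
    # Pad with zo'e so that the place being swapped exists.
    places += ["zo'e"] * (n - len(places))
    places[0], places[n - 1] = places[n - 1], places[0]
    return places
```

For the sumti of Example 3.22 (`do`, `mi`, `ti` with se) and of Example 3.24 (`ti`, `do`, `mi` with te) the result is `mi`, `do`, `ti` in both cases, the order of the unconverted bridi.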

The basic structure of longer utterances

People don't always say just one sentence. Lojban has a specific structure for talk or writing that is longer than one sentence. The entirety of a given speech event or written text is called an utterance. The sentences (usually, but not always, bridi) in an utterance are separated by the cmavo ni'o and i. These correspond to a brief pause (or nothing at all) in spoken English, and the various punctuation marks like period, question mark, and exclamation mark in written English. These separators prevent the sumti at the beginning of the next sentence from being mistaken for a trailing sumti of the previous sentence.

The cmavo ni'o separates paragraphs (covering different topics of discussion). In a long text or utterance, the topical structure of the text may be indicated by multiple occurrences of ni'o, with perhaps ni'oni'oni'o used to indicate a chapter, ni'oni'o to indicate a section, and a single ni'o to indicate a subtopic corresponding to a single English paragraph.

The cmavo i separates sentences. It is sometimes compounded with words that modify the exact meaning (the semantics) of the sentence in the context of the utterance. (The cmavo xu, discussed in Section 3.15, is one such word – it turns the sentence from a statement to a question about truth.) When more than one person is talking, a new speaker will usually omit the i even though she/he may be continuing on the same topic.

It is still O.K. for a new speaker to say the i before continuing; indeed, it is encouraged for maximum clarity (since it is possible that the second speaker might merely be adding words onto the end of the first speaker's sentence). A good translation for i is the “and” used in run-on sentences when people are talking informally: “I did this, and then I did that, and ..., and ...”.
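
To make the role of the two separators concrete, here is a small Python sketch that splits a string of Lojban words into paragraphs and sentences. It is a simplification under stated assumptions: the function name is invented, words are separated by spaces, any word made only of repeated ni'o starts a new paragraph (the chapter and section levels are not distinguished), and i compounded with other words is not recognized.

```python
def split_utterance(text):
    """Split an utterance into paragraphs, each a list of sentences."""
    paragraphs = [[[]]]
    for word in text.split():
        if word.replace("ni'o", "") == "":
            # One or more ni'o run together: start a new paragraph.
            paragraphs.append([[]])
        elif word in ("i", ".i"):
            # i starts a new sentence within the current paragraph.
            paragraphs[-1].append([])
        else:
            paragraphs[-1][-1].append(word)
    # Join the words of each sentence and drop empty sentences and paragraphs.
    return [[" ".join(s) for s in p if s] for p in paragraphs if any(p)]
```

Because the separators are explicit words, the sumti that opens one sentence can never be attached to the end of the previous one.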

tanru

When two gismu are adjacent, the first one modifies the second, and the selbri takes its place structure from the rightmost word. Such combinations of gismu are called tanru. For example,

Example 3.25:

sutra tavla

has the place structure

Example 3.26:

x1 is a fast type-of talker to x2 about x3 in language x4
x1 talks fast to x2 about x3 in language x4

When three or more gismu are in a row, the first modifies the second, and that combined meaning modifies the third, and that combined meaning modifies the fourth, and so on. For example

Example 3.27:

sutra tavla cutci

has the place structure

Example 3.28:

s1 is a fast-talker type of shoe worn by s2 of material s3

That is, it is a shoe that is worn by a fast talker rather than a shoe that is fast and is also worn by a talker.
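
The left-to-right grouping described above can be sketched in Python with nested tuples. The function names are assumptions made for this example, and only plain sequences of gismu are covered.

```python
from functools import reduce

def group_tanru(words):
    """Group a run of gismu left to right: ((w1 w2) w3) and so on."""
    return reduce(lambda modifier, modified: (modifier, modified), words)

def place_structure_source(words):
    """The final component supplies the place structure of the whole tanru."""
    return words[-1]
```

For `["sutra", "tavla", "cutci"]` the grouping is `(("sutra", "tavla"), "cutci")`, and the place structure comes from `cutci`.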

Note especially the use of “type-of” as a mechanism for connecting the English translations of the two or more gismu; this convention helps the learner understand each tanru in its context. Creative interpretations are also possible, however:

Example 3.29:

bajra cutci
runner shoe

most probably refers to shoes suitable for runners, but might be interpreted in some imaginative instances as “shoes that run (by themselves?)”. In general, however, the meaning of a tanru is determined by the literal meaning of its components, and not by any connotations or figurative meanings. Thus

Example 3.30:

sutra tavla
fast-talker

would not necessarily imply any trickery or deception, unlike the English idiom, and a

Example 3.31:

jikca toldi
social butterfly

must always be an insect with large brightly-colored wings, of the order Lepidoptera.

The place structure of a tanru is always that of the final component of the tanru. Thus, the following has the place structure of klama:

Example 3.32:

mi cu sutra klama la meris.
I quickly-go to Mary.

With the conversion se klama as the final component of the tanru, the place structure of the entire selbri is that of se klama: the x1 place is the destination, and the x2 place is the one who goes:

Example 3.33:

mi cu sutra se klama la meris.
I quickly am-gone-to by Mary.

The following example shows that there is more to conversion than merely switching places, though:

Example 3.34:

la tam. cu melbi tavla la meris.
Tom beautifully-talks to Mary.
Tom is a beautiful-talker to Mary.

has the place structure of tavla, but note the two distinct interpretations. Now, using conversion, we can modify the place structure order:

Example 3.35:

la meris. cu melbi se tavla la tam.
Mary is beautifully-talked-to by Tom.
Mary is a beautiful-audience for Tom.

and we see that the modification has been changed so as to focus on Mary's role in the bridi relationship, leading to a different set of possible interpretations.

Note that there is no place structure change if the modifying term is converted, and so less drastic variation in possible meanings:

Example 3.36:

la tam. cu tavla melbi la meris.
Tom is talkerly-beautiful to Mary.

Example 3.37:

la tam. cu se tavla melbi la meris.
Tom is audiencely-beautiful to Mary.

and we see that the manner in which Tom is seen as beautiful by Mary changes, but Tom is still the one perceived as beautiful, and Mary, the observer of beauty.

Description sumti

Often we wish to talk about things other than the speaker, the listener and things we can point to. Let's say I want to talk about a talker other than mi. What I want to talk about would naturally fit into the first place of tavla. Lojban, it turns out, has an operator that pulls this first place out of a selbri and converts it to a sumti called a “description sumti”. The description sumti le tavla ku means “the talker”, and may be used wherever any sumti may be used.

For example,

Example 3.38:

mi tavla do le tavla [ku]

means the same as

Example 3.39:

I talk to you about the talker.

where “the talker” is presumably someone other than me, though not necessarily.

Similarly

le sutra tavla ku is
“the fast talker”, and
le sutra te tavla ku is
“the fast subject of talk” or
“the subject of fast talk”.

Which of these related meanings is understood will depend on the context in which the expression is used. The most plausible interpretation within the context will generally be assumed by a listener to be the intended one.

In many cases the word ku may be omitted. In particular, it is never necessary in a description at the end of a sentence, so:

Example 3.40:

mi tavla do le tavla
I talk-to you about-the talker

means exactly the same thing as Example 3.38.

There is a problem when we want to say “The fast one is talking.” The “obvious” translation le sutra tavla turns out to mean “the fast talker”, and has no selbri at all. To solve this problem we can use the word cu, which so far has always been optional, in front of the selbri.

The word cu has no meaning, and exists only to mark the beginning of the selbri within the bridi, separating it from a previous sumti. It comes before any other part of the selbri, including other cmavo like se or te. Thus:

Example 3.41:

le sutra tavla
The fast talker
Example 3.42:
le sutra cu tavla
The fast one is talking.
Example 3.43:
le sutra se tavla
The fast talked-to one
Example 3.44:
le sutra cu se tavla
The fast one is talked to.
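The effect of cu in Examples 3.41 to 3.44 can be mimicked with a toy splitter. This is a rough sketch under strong assumptions: the function name is hypothetical, the sentence must begin with le, and whatever follows the description is returned unanalyzed (selbri plus any later sumti). It is not a Lojban parser.

```python
def split_leading_description(sentence):
    """Split a sentence starting with 'le' into (description, remainder).

    The description runs from 'le' up to the first 'ku' or 'cu'. If
    neither occurs, every remaining word is absorbed into the description
    and nothing is left over to serve as the selbri.
    """
    words = sentence.split()
    if not words or words[0] != "le":
        raise ValueError("this sketch only handles sentences starting with 'le'")
    for i, word in enumerate(words):
        if word in ("ku", "cu"):
            # Drop a 'cu' that follows 'ku'; it only marks the selbri.
            remainder = [w for w in words[i + 1:] if w != "cu"]
            return words[1:i], remainder
    return words[1:], []


for text in ("le sutra tavla", "le sutra cu tavla", "le sutra cu se tavla"):
    print(text, "->", split_leading_description(text))
```

Without cu the remainder is empty, which is the sense in which le sutra tavla “has no selbri at all”.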

Consider the following more complex example, with two description sumti.

Example 3.45:
mi cu tavla le vecnu ku le blari'o ku
I talk-to the seller about the blue-green-thing.

The sumti le vecnu contains the selbri vecnu, which has the “seller” in the x1 place, and uses it in this sentence to describe a particular “seller” that the speaker has in mind (one that he or she probably expects the listener will also know about). Similarly, the speaker has a particular blue-green thing in mind, which is described using le to mark blari'o, a selbri whose first sumti is something blue-green.

It is safe to omit both occurrences of ku in Example 3.45, and it is also safe to omit the cu.

Examples of brivla

The simplest form of selbri is an individual word. A word which may by itself express a selbri relation is called a brivla. The three types of brivla are gismu (root words), lujvo (compounds), and fu'ivla (borrowings from other languages). All have identical grammatical uses. So far, most of our selbri have been gismu or tanru built from gismu.

gismu:

Example 3.46:

mi cu klama ti zo'e zo'e ta
Go-er goes destination origin route means.
I go here (to this) using that means (from somewhere via some route).

lujvo:

Example 3.47:
ta cu blari'o
That is-blue-green.

fu'ivla:

Example 3.48:
ti cu djarspageti
This is-spaghetti.

Some cmavo may also serve as selbri, acting as variables that stand for another selbri. The most commonly used of these is go'i, which represents the main bridi of the previous Lojban sentence, with any new sumti or other sentence features being expressed replacing the previously expressed ones. Thus, in this context:

Example 3.49:
ta cu go'i
That too/same-as-last selbri.
That (is spaghetti), too.

The sumti di'u and la'e di'u

In English, I might say “The dog is beautiful”, and you might reply “This pleases me.” How do you know what “this” refers to? Lojban uses different expressions to convey the possible meanings of the English:

Example 3.50:

le gerku ku cu melbi
The dog is beautiful.

The following three sentences all might translate as “This pleases me.”

Example 3.51:
ti cu pluka mi
This (the dog) pleases me.
Example 3.52:
di'u cu pluka mi
This (the last sentence) pleases me (perhaps because it is grammatical or sounds nice).
Example 3.53:
la'e di'u cu pluka mi
This (the meaning of the last sentence; i.e. that the dog is beautiful) pleases me.

Example 3.53 uses one sumti to point to or refer to another by inference. It is common to write la'edi'u as a single word; it is used more often than di'u by itself.

Possession

“Possession” refers to the concept of specifying an object by saying who it belongs to (or with). A full explanation of Lojban possession is given in a later chapter. A simple means of expressing possession, however, is to place a sumti representing the possessor of an object within the description sumti that refers to the object: specifically, between the le and the selbri of the description:

Example 3.54:

le mi gerku cu sutra
The of-me dog is fast.
My dog is fast.

In Lojban, possession doesn't necessarily mean ownership: one may “possess” a chair simply by sitting on it, even though it actually belongs to someone else. English uses possession casually in the same way, but also uses it to refer to actual ownership or even more intimate relationships: “my arm” doesn't mean “some arm I own” but rather “the arm that is part of my body”. Lojban has methods of specifying all these different kinds of possession precisely and easily.

Vocatives and commands

You may call someone's attention to the fact that you are addressing them by using doi followed by their name. The sentence

Example 3.55:

doi djan.

means “Oh, John, I'm talking to you”. It also has the effect of setting the value of do; do now refers to “John” until it is changed in some way in the conversation. Note that Example 3.55 is not a bridi, but it is a legitimate Lojban sentence nevertheless; it is known as a “vocative phrase”.

Other cmavo can be used instead of doi in a vocative phrase, with a different significance. For example, the cmavo coi means “hello” and co'o means “good-bye”. Either word may stand alone, they may follow one another, or either may be followed by a pause and a name. (Vocative phrases with doi do not need a pause before the name.)

Example 3.56:

coi. djan.
Hello, John.
Example 3.57:

co'o. djan.
Good-bye, John.

Commands are expressed in Lojban by a simple variation of the main bridi structure. If you say

Example 3.58:

do tavla
You are-talking.

you are simply making a statement of fact. In order to issue a command in Lojban, substitute the word ko for do. The bridi

Example 3.59:
ko tavla

instructs the listener to do whatever is necessary to make Example 3.58 true; it means “Talk!” Other examples:

Example 3.60:
ko sutra
Be fast!

The ko need not be in the x1 place, but rather can occur anywhere a sumti is allowed, leading to possible Lojban commands that are very unlike English commands:

Example 3.61:
mi tavla ko
Be talked to by me.
Let me talk to you.

The cmavo ko can fill any appropriate sumti place, and can be used as often as is appropriate for the selbri:

Example 3.62:
ko kurji ko

and

Example 3.63:
ko ko kurji

both mean “You take care of you” and “Be taken care of by you”, or to put it colloquially, “Take care of yourself”.

Questions

There are many kinds of questions in Lojban: full explanations appear in Section and in various other chapters throughout the book. In this chapter, we will introduce three kinds: sumti questions, selbri questions, and yes/no questions.

The cmavo ma is used to create a sumti question: it indicates that the speaker wishes to know the sumti which should be placed at the location of the ma to make the bridi true. It can be translated as “Who?” or “What?” in most cases, but also serves for “When?”, “Where?”, and “Why?” when used in sumti places that express time, location, or cause. For example:

Example 3.64:

ma tavla do mi
Who? talks to-you about-me.
Who is talking to you about me?

The listener can reply by simply stating a sumti:

Example 3.65:
la djan.
John (is talking to you about me).

Like ko, ma can occur in any position where a sumti is allowed, not just in the first position:

Example 3.66:
do cu tavla ma
You talk to what/whom?

A ma can also appear in multiple sumti positions in one sentence, in effect asking several questions at once.

Example 3.67:
ma cu tavla ma
What/Who talks to what/whom?

The two separate ma positions ask two separate questions, and can therefore be answered with different values in each sumti place.

The cmavo mo is the selbri analogue of ma. It asks the respondent to provide a selbri that would be a true relation if inserted in place of the mo:

Example 3.68:

do cu mo
You are-what/do-what?

A mo may be used anywhere a brivla or other selbri might. Keep this in mind for later examples. Unfortunately, by itself, mo is a very non-specific question. The response to the question in Example 3.68 could be:

Example 3.69:
mi cu melbi
I am beautiful.

or:

Example 3.70:
mi cu tavla
I talk.

Clearly, mo requires some cooperation between the speaker and the respondent to ensure that the right question is being answered. If context doesn't make the question specific enough, the speaker must ask the question more specifically using a more complex construction such as a tanru (See Section 3.9).

It is perfectly permissible for the respondent to fill in other unspecified places in responding to a mo question. Thus, the respondent in Example 3.70 could have also specified an audience, a topic, and/or a language in the response.

Finally, we must consider questions that can be answered “Yes” or “No”, such as

Example 3.71:

Are you talking to me?

Like all yes-or-no questions in English, Example 3.71 may be reformulated as

Example 3.72:

Is it true that you are talking to me?

In Lojban we have a word that asks precisely that question in precisely the same way. The cmavo xu, when placed in front of a bridi, asks whether that bridi is true as stated. So

Example 3.73:

xu do tavla mi
Is-it-true-that you are-talking to-me?

is the Lojban translation of Example 3.71. The answer “Yes” may be given by simply restating the bridi without the xu question word. Lojban has a shorthand for doing this with the word go'i, mentioned in Section 3.11. Instead of a negative answer, the bridi may be restated in such a way as to make it true. If this can be done by substituting sumti, it may be done with go'i as well. For example:

Example 3.74:

xu do kanro
Are you healthy?

can be answered with

Example 3.75:
mi kanro
I am healthy.

or

Example 3.76:
go'i
I am healthy.

(Note that do to the questioner is mi to the respondent.)

or

Example 3.77:
le tavla cu kanro
The talker is healthy.

or

Example 3.78:
le tavla cu go'i
The talker is healthy.

A general negative answer may be given by na go'i. na may be placed before any selbri (but after the cu). It is equivalent to stating “It is not true that ...” before the bridi. It does not imply that anything else is true or untrue, only that that specific bridi is not true. More details on negative statements are available in a later chapter.
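The behaviour of go'i in these answers can be sketched as a small substitution. This is a simplified model, not a definition: a bridi is represented as a (selbri, sumti list) pair, the function names are hypothetical, and the mi/do exchange assumes that the reply comes from the other participant in the conversation.

```python
def swap_speaker(sumti):
    """Exchange mi and do: the questioner's do is the respondent's mi."""
    return {"mi": "do", "do": "mi"}.get(sumti, sumti)


def expand_goi(previous, new_sumti=()):
    """Expand a go'i reply against the previous bridi.

    previous  -- (selbri, [sumti, ...]) as uttered by the other speaker
    new_sumti -- sumti stated along with go'i; they replace the old ones
                 in order, starting from x1
    """
    selbri, old_sumti = previous
    inherited = [swap_speaker(s) for s in old_sumti]
    merged = list(new_sumti) + inherited[len(new_sumti):]
    return selbri, merged


question = ("kanro", ["do"])                 # xu do kanro
print(expand_goi(question))                  # go'i
print(expand_goi(question, ["le tavla"]))    # le tavla cu go'i
```

The first call reproduces Example 3.76 (the inherited do becomes mi), and the second reproduces Example 3.78, where the new sumti replaces the old one.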

Indicators

Different cultures express emotions and attitudes with a variety of intonations and gestures that are not usually included in written language. Some of these are available in some languages as interjections (i.e. Aha!, Oh no!, Ouch!, Aahh!, etc.), but they vary greatly from culture to culture.

Lojban has a group of cmavo known as “attitudinal indicators” which specifically covers this type of commentary on spoken statements. They are both written and spoken, but require no specific intonation or gestures. Grammatically they are very simple: one or more attitudinals at the beginning of a bridi apply to the entire bridi; anywhere else in the bridi they apply to the word immediately to the left. For example:

Example 3.79:

.ie mi cu klama
Agreement! I go.
Yep! I'll go.
Example 3.80:
.ei mi cu klama
Obligation! I go.
I should go.
Example 3.81:
mi cu klama le melbi .ui ku
I go to-the beautiful-thing and I am happy because it is the beautiful thing I'm going to
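The scope rule just illustrated (leading attitudinals cover the whole bridi, any other attitudinal attaches to the word on its left) can be written as a short sketch. The function name is hypothetical, and the word list contains only the three attitudinals used in the examples above.

```python
# Only the three attitudinals used in the examples above are listed.
ATTITUDINALS = {".ie", ".ei", ".ui"}


def attitudinal_scopes(sentence):
    """Return (attitudinal, scope) pairs for a whitespace-separated bridi."""
    words = sentence.split()
    scopes = []
    at_start = True  # still inside the leading run of attitudinals
    for i, word in enumerate(words):
        if word in ATTITUDINALS:
            scope = "whole bridi" if at_start else words[i - 1]
            scopes.append((word, scope))
        else:
            at_start = False
    return scopes


print(attitudinal_scopes(".ie mi cu klama"))
print(attitudinal_scopes("mi cu klama le melbi .ui ku"))
```

In the second sentence .ui is paired with melbi, matching the gloss of Example 3.81, where the happiness is about the beautiful thing rather than about the going.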

Not all indicators indicate attitudes. Discursives, another group of cmavo with the same grammatical rules as attitudinal indicators, allow free expression of certain kinds of commentary about the main utterances. Using discursives allows a clear separation of these so-called “metalinguistic” features from the underlying statements and logical structure. By comparison, the English words “but” and “also”, which discursively indicate contrast or an added weight of example, are logically equivalent to “and”, which does not have a discursive content. The average English-speaker does not think about, and may not even realize, the paradoxical idea that “but” basically means “and”.

Example 3.82:

mi cu klama .i do cu stali
I go. You stay.
Example 3.83:
mi cu klama .i ji'a do cu stali
I go. In addition, you stay. (added weight)
Example 3.84:

mi cu klama .i ku'i do cu stali
I go. However, you stay. (contrast)

Another group of indicators are called “evidentials”. Evidentials show the speaker's relationship to the statement, specifically how the speaker came to make the statement. These include za'a (I directly observe the relationship), pe'i (I believe that the relationship holds), ru'a (I postulate the relationship), and others. Many American Indian languages use words of this kind.

Example 3.85:

pe'i do cu melbi
I opine! You are beautiful.
Example 3.86:
za'a do cu melbi
I directly observe! You are beautiful.

Tenses

In English, every verb is tagged for the grammatical category called tense: past, present, or future. The sentence

Example 3.87:

John went to the store

necessarily happens at some time in the past, whereas

Example 3.88:

John is going to the store

is necessarily happening right now.

The Lojban sentence

Example 3.89:

la djan. cu klama le zarci
John goes/went/will-go to-the store

serves as a translation of either Example 3.87 or Example 3.88, and of many other possible English sentences as well. It is not marked for tense, and can refer to an event in the past, the present or the future. This rule does not mean that Lojban has no way of representing the time of an event. A close translation of Example 3.87 would be:

Example 3.90:
la djan. pu klama le zarci
John [past] goes to-the store

where the tag pu forces the sentence to refer to a time in the past. Similarly,

Example 3.91:

la djan. ca klama le zarci
John [present] goes to-the store

necessarily refers to the present, because of the tag ca. Tags used in this way always appear at the very beginning of the selbri, just after the cu, and they may make a cu unnecessary, since tags cannot be absorbed into tanru. Such tags serve as an equivalent to English tenses and adverbs. In Lojban, tense information is completely optional. If unspecified, the appropriate tense is picked up from context. Lojban also extends the notion of “tense” to refer not only to time but to space. The following example uses the tag vu to specify that the event it describes happens far away from the speaker:

Example 3.92:

do vu vecnu zo'e
You yonder sell something-unspecified.

In addition, tense tags (either for time or space) can be prefixed to the selbri of a description, producing a tensed sumti:

Example 3.93:
le pu bajra ku cu tavla
The earlier/former/past runner talked/talks.

(Since Lojban tense is optional, we don't know when he or she talks.)

Tensed sumti with space tags correspond roughly to the English use of “this” or “that” as adjectives, as in the following example, which uses the tag vi meaning “nearby”:

Example 3.94:

le vi bajra ku cu tavla
The nearby runner talks.
This runner talks.

Do not confuse the use of vi in Example 3.94 with the cmavo ti, which also means “this”, but in the sense of “this thing”.

Furthermore, a tense tag can appear both on the selbri and within a description, as in the following example (where ba is the tag for future time):

Example 3.95:

le vi tavla ku cu ba klama
The here talker [future] goes.
The talker who is here will go.
This talker will go.
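The two positions a tense tag can occupy, on the selbri or inside a description, can be told apart with a simple scan. This is a sketch with a hypothetical function name; it knows only the five tags introduced in this section and assumes that every description standing before the selbri is closed by ku or cu.

```python
# Tags introduced in this section, with rough English glosses.
TENSE_TAGS = {
    "pu": "past", "ca": "present", "ba": "future",  # time
    "vi": "nearby", "vu": "far away",               # space
}


def find_tense_tags(sentence):
    """List (where, gloss) for each tense tag, in order of appearance.

    'where' is "description" for a tag between le and the end of its
    description (ku or cu), and "selbri" otherwise.
    """
    found = []
    in_description = False
    for word in sentence.split():
        if word == "le":
            in_description = True
        elif word in ("ku", "cu"):
            in_description = False
        elif word in TENSE_TAGS:
            where = "description" if in_description else "selbri"
            found.append((where, TENSE_TAGS[word]))
    return found


print(find_tense_tags("la djan. cu klama le zarci"))
print(find_tense_tags("le vi tavla ku cu ba klama"))
```

An untagged sentence yields an empty list, reflecting that tense is optional and is then left to context.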

Lojban grammatical terms

Here is a review of the Lojban grammatical terms used in this chapter, plus some others used throughout this book. Only terms that are themselves Lojban words are included: there are of course many expressions, like “indicator”, that are not explained here. See the Index for further help with these.

bridi:
  • predication; the basic unit of Lojban expression; the main kind of Lojban sentence; a claim that some objects stand in some relationship, or that some single object has some property.
sumti:
  • argument; words identifying something which stands in a specified relationship to something else, or which has a specified property. Covered in detail in a later chapter.
selbri:
  • logical predicate; the core of a bridi; the word or words specifying the relationship between the objects referred to by the sumti. Covered in detail in a later chapter.
cmavo:
  • one of the Lojban parts of speech; a short word; a structural word; a word used for its grammatical function.
brivla:
  • one of the Lojban parts of speech; a content word; a predicate word; can function as a selbri; is a gismu, a lujvo, or a fu'ivla. See Chapter 5.

gismu:
  • a root word; a kind of brivla; has associated rafsi. See Chapter 5.
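lujvo:
  • a compound word; a kind of brivla; assembled from the rafsi of its component gismu. See Chapter 5.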
fu'ivla:
  • a borrowed word; a kind of brivla; may or may not appear in a dictionary; copied in a modified form from some non-Lojban language; usually refers to some aspect of culture or the natural world; does not have associated rafsi. See Chapter 5.

rafsi:
  • a word fragment; one or more is associated with each gismu; can be assembled according to rules in order to make lujvo; not a valid word by itself. See Chapter 5.
tanru:
  • a group of two or more brivla, possibly with associated cmavo, that form a selbri; always divisible into two parts, with the first part modifying the meaning of the second part (which is taken to be basic). Covered in detail in a later chapter.
selma'o:
  • a group of cmavo that have the same grammatical use (can appear interchangeably in sentences, as far as the grammar is concerned) but differ in meaning or other usage. Covered in detail in a later chapter.


Morphology

Introductory

Morphology is the part of grammar that deals with the form of words. Lojban's morphology is fairly simple compared to that of many languages, because Lojban words don't change form depending on how they are used. English has only a small number of such changes compared to languages like Russian, but it does have changes like “boys” as the plural of “boy”, or “walked” as the past-tense form of “walk”. To make plurals or past tenses in Lojban, you add separate words to the sentence that express the number of boys, or the time when the walking was going on.

However, Lojban does have what is called “derivational morphology”: the capability of building new words from old words. In addition, the form of words tells us something about their grammatical uses, and sometimes about the means by which they entered the language. Lojban has very orderly rules for the formation of words of various types, both the words that already exist and new words yet to be created by speakers and writers.

A stream of Lojban sounds can be uniquely broken up into its component words according to specific rules. These so-called “morphology rules” are summarized in this chapter. (However, a detailed algorithm for breaking sounds into words has not yet been fully debugged, and so is not presented in this book.) First, here are some conventions used to talk about groups of Lojban letters, including vowels and consonants.

  • V represents any single Lojban vowel except y; that is, it represents a, e, i, o, or u.
  • VV represents either a diphthong, one of the following:
    • 'ai'
    • 'ei'
    • 'oi'
    • 'au'
    or a two-syllable vowel pair with an apostrophe separating the vowels, one of the following:
    • 'a'a'
    • 'a'e'
    • 'a'i'
    • 'a'o'
    • 'a'u'
    • 'e'a'
    • 'e'e'
    • 'e'i'
    • 'e'o'
    • 'e'u'
    • 'i'a'
    • 'i'e'
    • 'i'i'
    • 'i'o'
    • 'i'u'
    • 'o'a'
    • 'o'e'
    • 'o'i'
    • 'o'o'
    • 'o'u'
    • 'u'a'
    • 'u'e'
    • 'u'i'
    • 'u'o'
    • 'u'u'
  • C represents a single Lojban consonant, not including the apostrophe; that is, one of b, c, d, f, g, j, k, l, m, n, p, r, s, t, v, x, or z.

Syllabic l, m, n, and r always count as consonants for the purposes of this chapter.

  • CC represents two adjacent consonants of type C which constitute one of the 48 permissible initial consonant pairs:
bl br
cf ck cl cm cn cp cr ct
dj dr dz
fl fr
gl gr
jb jd jg jm jv
kl kr
ml mr
pl pr
sf sk sl sm sn sp sr st
tc tr ts
vl vr 
xl xr
zb zd zg zm zv
  • C/C represents two adjacent consonants which constitute one of the permissible consonant pairs (not necessarily a permissible initial consonant pair). The permissible consonant pairs are explained in Section 2.3. In brief, any consonant pair is permissible unless it: contains two identical letters, contains both a voiced (excluding r, l, m, n) and an unvoiced consonant, or is one of certain specified forbidden pairs.
  • C/CC represents a consonant triple. The first two consonants must constitute a permissible consonant pair; the last two consonants must constitute a permissible initial consonant pair.
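The conventions above translate directly into a few lookup sets. The following Python sketch is one way to encode them. Two things in it are assumptions rather than statements of this section: the “certain specified forbidden pairs” are filled in from the standard reference grammar's phonology rules (no pair drawn entirely from c, j, s, z; and none of cx, kx, xc, xk, mz), and the function names are hypothetical. The triple check implements only the two conditions stated above.

```python
VOWELS = set("aeiou")                      # V: y is deliberately excluded
CONSONANTS = set("bcdfgjklmnprstvxz")      # C: the apostrophe is not included

INITIAL_PAIRS = set("""
bl br
cf ck cl cm cn cp cr ct
dj dr dz
fl fr
gl gr
jb jd jg jm jv
kl kr
ml mr
pl pr
sf sk sl sm sn sp sr st
tc tr ts
vl vr
xl xr
zb zd zg zm zv
""".split())

VOICED = set("bdgvjz")       # r, l, m, n are excluded from the voicing test
UNVOICED = set("ptkfcsx")
SIBILANTS = set("cjsz")                            # assumption, see lead-in
FORBIDDEN_PAIRS = {"cx", "kx", "xc", "xk", "mz"}   # assumption, see lead-in


def is_initial_pair(pair):
    """CC: one of the 48 permissible initial consonant pairs."""
    return pair in INITIAL_PAIRS


def is_permissible_pair(pair):
    """C/C: a permissible (not necessarily initial) consonant pair."""
    if len(pair) != 2 or not set(pair) <= CONSONANTS:
        return False
    a, b = pair
    if a == b:
        return False
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False
    if a in SIBILANTS and b in SIBILANTS:
        return False
    return pair not in FORBIDDEN_PAIRS


def is_consonant_triple(triple):
    """C/CC: permissible pair followed by a permissible initial pair."""
    return (len(triple) == 3
            and is_permissible_pair(triple[:2])
            and is_initial_pair(triple[1:]))


print(len(INITIAL_PAIRS))
```

The table can be sanity-checked against the text: it holds 48 entries, and every initial pair also passes the more general permissibility test.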

Lojban has three basic word classes – parts of speech – in contrast to the eight that are traditional in English. These three classes are called cmavo, brivla, and cmene. Each of these classes has uniquely identifying properties – an arrangement of letters that allows the word to be uniquely and unambiguously recognized as a separate word in a string of Lojban, upon either reading or hearing, and as belonging to a specific word-class.

They are also functionally different: cmavo are the structure words, corresponding to English words like “and”, “if”, “the” and “to”; brivla are the content words, corresponding to English words like “come”, “red”, “doctor”, and “freely”; cmene are proper names, corresponding to English “James”, “Afghanistan”, and “Pope John Paul II”.

cmavo

The first group of Lojban words discussed in this chapter are the cmavo. They are the structure words that hold the Lojban language together. They often have no semantic meaning in themselves, though they may affect the semantics of brivla to which they are attached. The cmavo include the equivalent of English articles, conjunctions, prepositions, numbers, and punctuation marks. There are over a hundred subcategories of cmavo, known as selma'o, each having a specifically defined grammatical usage. The various selma'o are discussed throughout the later chapters and summarized in one of them.

Standard cmavo occur in four forms defined by their word structure. Here are some examples of the various forms:

V-form      .a      .e      .i      .o      .u
CV-form     ba      ce      di      fo      gu
VV-form     .au     .ei     .ia     .o'u    .u'e
CVV-form    ki'a    pei     mi'o    coi     cu'u

In addition, there is the cmavo .y. (remember that y is not a V), which must have pauses before and after it.

A simple cmavo thus has the property of having only one or two vowels, or of having a single consonant followed by one or two vowels. Words consisting of three or more vowels in a row, or a single consonant followed by three or more vowels, are also of cmavo form, but are reserved for experimental use: a few examples are ku'a'e, sau'e, and bai'ai. All CVV cmavo beginning with the letter x are also reserved for experimental use. In general, though, the form of a cmavo tells you little or nothing about its grammatical use.

“Experimental use” means that the language designers will not assign any standard meaning or usage to these words, and words and usages coined by Lojban speakers will not appear in official dictionaries for the indefinite future. Experimental-use words provide an escape hatch for adding grammatical mechanisms (as opposed to semantic concepts) the need for which was not foreseen.

The cmavo of VV-form include not only the diphthongs and vowel pairs listed in the introductory section above, but also the following ten additional diphthongs:

  • .ia
  • .ie
  • .ii
  • .io
  • .iu
  • .ua
  • .ue
  • .ui
  • .uo
  • .uu

In addition, cmavo can have the form Cy, a consonant followed by the letter y. These cmavo represent letters of the Lojban alphabet, and are discussed in detail in Chapter ELG-ERROR in Template:Lch.
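The forms described in this section can be recognized with regular expressions. The sketch below is illustrative only: the function name `cmavo_form` and its return labels are hypothetical, and it classifies by shape alone, without checking whether a given cmavo has actually been assigned a meaning.

```python
import re

V = "[aeiou]"
C = "[bcdfgjklmnprstvxz]"
# Diphthongs ai, ei, oi, au, or two vowels separated by an apostrophe.
VV = f"(?:ai|ei|oi|au|{V}'{V})"
# The ten additional diphthongs allowed only in VV-form cmavo.
EXTRA_VV = "(?:i[aeiou]|u[aeiou])"


def cmavo_form(word):
    """Classify a simple cmavo by shape; return None if it has no cmavo shape."""
    w = word.strip(".")  # leading/trailing dots only mark pauses
    if w == "y":
        return "y"
    if re.fullmatch(f"{C}y", w):
        return "Cy"
    if re.fullmatch(V, w):
        return "V"
    if re.fullmatch(f"{C}{V}", w):
        return "CV"
    if re.fullmatch(f"{VV}|{EXTRA_VV}", w):
        return "VV"
    if re.fullmatch(f"{C}{VV}", w):
        # CVV cmavo beginning with x are reserved for experimental use.
        return "experimental" if w[0] == "x" else "CVV"
    # Three or more vowels, with or without a leading consonant.
    if re.fullmatch(f"{C}?{V}(?:'?{V}){{2,}}", w):
        return "experimental"
    return None


for word in (".a", "ba", ".ia", "ki'a", "coi", "cy", "ku'a'e", "klama"):
    print(word, cmavo_form(word))
```

A word such as klama falls through every pattern and yields None, which is consistent with the point that cmavo never contain a consonant pair.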

Compound cmavo are sequences of cmavo attached together to form a single written word. A compound cmavo is always identical in meaning and in grammatical use to the separated sequence of simple cmavo from which it is composed. These words are written in compound form merely to save visual space, and to ease the reader's burden in identifying when the component cmavo are acting together.

Compound cmavo, while not visually short like their components, can be readily identified by two characteristics:

  • They have no consonant pairs or clusters, and
  • They end in a vowel.

For example:

.iseci'i
.i se ci'i
punaijecanai
pu nai je ca nai
ki'e.u'e
ki'e .u'e

The cmavo u'e begins with a vowel, and like all words beginning with a vowel, requires a pause (represented by .) before it. This pause cannot be omitted simply because the cmavo is incorporated into a compound cmavo. On the other hand,

ki'e'u'e

is a single cmavo reserved for experimental purposes: it has four vowels.


cy.ibu.abu
cy. .ibu .abu

Again the pauses are required (See Section 5.8); the pause after cy. merges with the pause before .ibu.
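Both the two identifying characteristics of compound cmavo and the splitting shown in the examples can be sketched with regular expressions. The names below are hypothetical. Pause dots are dropped from the output, a final y is accepted as a vowel so that letter words such as cy qualify (an assumption of this sketch), and no attempt is made to regroup pairs such as .i bu into the written unit .ibu.

```python
import re

# One simple cmavo: an optional consonant, then y or a run of vowels
# (apostrophes are allowed between vowels).
SIMPLE_CMAVO = re.compile(r"[bcdfgjklmnprstvxz]?(?:y|[aeiou](?:'?[aeiou])*)")
CONSONANT_PAIR = re.compile(r"[bcdfgjklmnprstvxz]{2}")


def looks_like_compound_cmavo(word):
    """Check the two characteristics: no consonant pair, final vowel."""
    letters = word.replace(".", "")
    return (bool(letters)
            and letters[-1] in "aeiouy"   # y accepted: assumption of this sketch
            and CONSONANT_PAIR.search(letters) is None)


def split_compound(word):
    """Split a compound cmavo into simple cmavo (pause dots are dropped)."""
    parts = []
    for chunk in word.split("."):
        pieces = SIMPLE_CMAVO.findall(chunk)
        if "".join(pieces) != chunk:
            raise ValueError(f"not a cmavo compound: {word!r}")
        parts.extend(pieces)
    return parts


for word in (".iseci'i", "punaijecanai", "ki'e.u'e"):
    print(word, "->", " ".join(split_compound(word)))
```

Because a vowel-initial cmavo must be preceded by a pause, a new simple cmavo inside a chunk can only begin at a consonant, which is what makes this greedy left-to-right split sufficient for the examples above.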

There is no particular stress required in cmavo or their compounds. Some conventions do exist that are not mandatory. For two-syllable cmavo, for example, stress is typically placed on the first vowel; an example is


.e'o ko ko kurji
.E'o ko ko KURji

This convention results in a consistent rhythm to the language, since brivla are required to have penultimate stress; some find this esthetically pleasing.

If the final syllable of one word is stressed, and the first syllable of the next word is stressed, you must insert a pause or glottal stop between the two stressed syllables. Thus

le re nanmu

can be optionally pronounced


le RE. NANmu

since there are no rules forcing stress on either of the first two words; the stress on re, though, demands that a pause separate re from the following syllable nan to ensure that the stress on nan is properly heard as a stressed syllable. The alternative pronunciation


LE re NANmu

is also valid; this would apply secondary stress (used for purposes of emphasis, contrast or sentence rhythm) to le, comparable in rhythmical effect to the English phrase “THE two men”. In the first pronunciation above (le RE. NANmu), the secondary stress on re would be similar to that in the English phrase “the TWO men”.

Both cmavo may also be left unstressed, thus:


le re NANmu

This would probably be the most common usage.

brivla

Predicate words, called brivla, are at the core of Lojban. They carry most of the semantic information in the language. They serve as the equivalent of English nouns, verbs, adjectives, and adverbs, all in a single part of speech.

Every brivla belongs to one of three major subtypes. These subtypes are defined by the form, or morphology, of the word – all words of a particular structure can be assigned by sight or sound to a particular type (cmavo, brivla, or cmene) and subtype. Knowing the type and subtype then gives you, the reader or listener, significant clues to the meaning and the origin of the word, even if you have never heard the word before.

The same principle allows you, when speaking or writing, to invent new brivla for new concepts “on the fly”; yet it offers people that you are trying to communicate with a good chance to figure out your meaning. In this way, Lojban has a flexible vocabulary which can be expanded indefinitely.

All brivla have the following properties:

  • always end in a vowel;
  • always contain a consonant pair in the first five letters, where y and apostrophe are not counted as letters for this purpose (See Section 5.5.);
  • always are stressed on the next-to-the-last (penultimate) syllable; this implies that they have two or more syllables.

The presence of a consonant pair distinguishes brivla from cmavo and their compounds. The final vowel distinguishes brivla from cmene, which always end in a consonant. Thus da'amei must be a compound cmavo because it lacks a consonant pair; lojban. must be a name because it lacks a final vowel.

Thus, bisycla has the consonant pair sc in the first five non-y letters even though the sc actually appears in the form of syc. Similarly, the word ro'inre'o contains nr in the first five letters because the apostrophes are not counted for this purpose.
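The distinguishing properties above are enough for a rough three-way classifier. This is a sketch with hypothetical function names; it looks only at letters, so the stress requirement on brivla is not checked.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_early_consonant_pair(word):
    """True if two consonants are adjacent within the first five letters.

    y, the apostrophe and pause dots are not counted as letters here.
    """
    letters = [ch for ch in word if ch not in "y'."]
    first_five = letters[:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(first_five, first_five[1:]))


def word_class(word):
    """Classify a word as 'cmene', 'brivla' or 'cmavo' by its letters."""
    letters = word.strip(".")
    if letters and letters[-1] in CONSONANTS:
        return "cmene"    # names always end in a consonant
    if has_early_consonant_pair(letters):
        return "brivla"   # final vowel plus an early consonant pair
    return "cmavo"        # simple or compound cmavo


for word in ("da'amei", "lojban.", "bisycla", "ro'inre'o"):
    print(word, word_class(word))
```

The order of the tests matters: the final-consonant check comes first, since a name like lojban. also contains a consonant pair.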

The three subtypes of brivla are:

  • gismu, the Lojban primitive roots from which all other brivla are built;
  • lujvo, the compounds of two or more gismu; and
  • fu'ivla (literally “copy-word”), the specialized words that are not Lojban primitives or natural compounds, and are therefore borrowed from other languages.

gismu

The gismu, or Lojban root words, are those brivla representing concepts most basic to the language. The gismu were chosen for various reasons: some represent concepts that are very familiar and basic; some represent concepts that are frequently used in other languages; some were added because they would be helpful in constructing more complex words; some because they represent fundamental Lojban concepts (like cmavo and gismu themselves).

The gismu do not represent any sort of systematic partitioning of semantic space. Some gismu may be superfluous, or appear for historical reasons: the gismu list was collected over almost 35 years and was only weeded out once. Instead, the intention is that the gismu blanket semantic space: they make it possible to talk about the entire range of human concerns.

There are about 1350 gismu. In learning Lojban, you need only to learn most of these gismu and their combining forms (known as rafsi) as well as perhaps 200 major cmavo, and you will be able to communicate effectively in the language. This may sound like a lot, but it is a small number compared to the vocabulary needed for similar communications in other languages.

All gismu have very strong form restrictions. Using the conventions defined in the introductory section above, all gismu are of the forms CVC/CV or CCVCV. They must meet the rules for all brivla given in Section 5.2; furthermore, they:

  • always have five letters;
  • always start with a consonant and end with a single vowel;
  • always contain exactly one consonant pair, which is a permissible initial pair (CC) if it's at the beginning of the gismu, but otherwise only has to be a permissible pair (C/C);
  • are always stressed on the first syllable (since that is penultimate).

The five letter length distinguishes gismu from lujvo and fu'ivla. In addition, no gismu contains '.

With the exception of five special brivla variables, broda, brode, brodi, brodo, and brodu, no two gismu differ only in the final vowel. Furthermore, the set of gismu was specifically designed to reduce the likelihood that two similar sounding gismu could be confused. For example, because gismu is in the set of gismu, kismu, xismu, gicmu, gizmu, and gisnu cannot be.
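The form restrictions on gismu can be checked mechanically. The sketch below uses hypothetical function names and tests shape only; it says nothing about whether a word is actually in the gismu list. As an assumption not spelled out in this chapter, the forbidden pairs are taken from the standard reference grammar's phonology rules (no pair drawn entirely from c, j, s, z; and none of cx, kx, xc, xk, mz).

```python
import re

INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr "
    "sf sk sl sm sn sp sr st tc tr ts vl vr xl xr zb zd zg zm zv".split()
)
VOICED, UNVOICED = set("bdgvjz"), set("ptkfcsx")

C, V = "[bcdfgjklmnprstvxz]", "[aeiou]"
CCVCV = re.compile(f"{C}{C}{V}{C}{V}")
CVCCV = re.compile(f"{C}{V}{C}{C}{V}")


def is_permissible_pair(pair):
    """C/C test; the forbidden-pair details are an assumption (see lead-in)."""
    a, b = pair
    if a == b:
        return False
    if {a, b} & VOICED and {a, b} & UNVOICED:
        return False
    if a in "cjsz" and b in "cjsz":
        return False
    return pair not in {"cx", "kx", "xc", "xk", "mz"}


def gismu_shape(word):
    """Return 'CCVCV' or 'CVC/CV' if word has a valid gismu form, else None."""
    if CCVCV.fullmatch(word) and word[:2] in INITIAL_PAIRS:
        return "CCVCV"
    if CVCCV.fullmatch(word) and is_permissible_pair(word[2:4]):
        return "CVC/CV"
    return None


def clash_on_final_vowel(a, b):
    """True if two distinct five-letter words differ only in the last letter."""
    return a != b and a[:4] == b[:4]


print(gismu_shape("gismu"), gismu_shape("klama"), gismu_shape("tavla"))
print(clash_on_final_vowel("broda", "brode"))
```

Since the patterns admit exactly five letters and no apostrophe, the length and apostrophe restrictions are enforced by the shape test itself. The final-vowel test flags the broda series, which is the stated exception to the rule.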

Almost all Lojban gismu are constructed from pieces of words drawn from other languages, specifically Chinese, English, Hindi, Spanish, Russian, and Arabic, the six most widely spoken natural languages. For a given concept, words in the six languages that represent that concept were written in Lojban phonetics. Then a gismu was selected to maximize the recognizability of the Lojban word for speakers of the six languages by weighting the inclusion of the sounds drawn from each language by the number of speakers of that language. See Section 5.13 for a full explanation of the algorithm.

Here are a few examples of gismu, with rough English equivalents (not definitions):

creka
shirt
lijda
religion
blanu
blue
mamta
mother
cukta
book
patfu
father
nanmu
man
ninmu
woman

A small number of gismu were formed differently; see Section 5.14 for a list.

lujvo

When specifying a concept that is not found among the gismu (or, more specifically, when the relevant gismu seems too general in meaning), a Lojbanist generally attempts to express the concept as a tanru. Lojban tanru are an elaboration of the concept of “metaphor” used in English. In Lojban, any brivla can be used to modify another brivla. The first of the pair modifies the second. This modification is usually restrictive – the modifying brivla reduces the broader sense of the modified brivla to form a more narrow, concrete, or specific concept. Modifying brivla may thus be seen as acting like English adverbs or adjectives. For example,

skami pilno

is the tanru which expresses the concept of “computer user”.

The simplest Lojban tanru are pairings of two concepts or ideas. Such tanru take two simpler ideas that can be represented by gismu and combine them into a single more complex idea. Two-part tanru may then be recombined in pairs with other tanru, or with individual gismu, to form more complex or more specific ideas, and so on.

The meaning of a tanru is usually at least partly ambiguous: skami pilno could refer to a computer that is a user, or to a user of computers. There are a variety of ways in which the modifier component can be related to the modified component. It is also possible to use cmavo within tanru to provide variations (or to prevent ambiguities) of meaning.

Making tanru is essentially a poetic or creative act, not a science. While the syntax expressing the grouping relationships within tanru is unambiguous, tanru are still semantically ambiguous, since the rules defining the relationships between the gismu are flexible. The process of devising a new tanru is dealt with in detail elsewhere in this grammar.

To express a simple tanru, simply say the component gismu together. Thus the binary metaphor “big boat” becomes the tanru


barda bloti

representing roughly the same concept as the English word “ship”.

The binary metaphor “father mother” can refer to a paternal grandmother (“a father-ly type of mother”), while “mother father” can refer to a maternal grandfather (“a mother-ly type of father”). In Lojban, these become the tanru


patfu mamta

and


mamta patfu

respectively.

The possibility of semantic ambiguity can easily be seen in the last case. To interpret mamta patfu, the listener must determine what type of motherliness pertains to the father being referred to. In an appropriate context, mamta patfu could mean not “grandfather” but simply “father with some motherly attributes”, depending on the culture. If absolute clarity is required, there are ways to expand upon and explain the exact interrelationship between the components; but such detail is usually not needed.

When a concept expressed in a tanru proves useful, or is frequently expressed, it is desirable to choose one of the possible meanings of the tanru and assign it to a new brivla. For skami pilno, we would probably choose “user of computers”, and form the new word

sampli

Such a brivla, built from the rafsi which represent its component words, is called a lujvo. Another example, corresponding to the tanru barda bloti, would be:

bralo'i
big-boat
ship

The lujvo representing a given tanru is built from units representing the component gismu. These units are called rafsi in Lojban. Each rafsi represents only one gismu. The rafsi are attached together in the order of the words in the tanru, occasionally inserting so-called “hyphen” letters to ensure that the pieces stick together as a single word and cannot accidentally be broken apart into cmavo, gismu, or other word forms. As a result, each lujvo can be readily and accurately recognized, allowing a listener to pick out the word from a string of spoken Lojban, and if necessary, unambiguously decompose the word to a unique source tanru, thus providing a strong clue to its meaning.

The lujvo that can be built from the tanru mamta patfu is

mampa'u

which refers specifically to the concept “maternal grandfather”. The two gismu that constitute the tanru are represented in mampa'u by the rafsi mam- and -pa'u, respectively; these two rafsi are then concatenated together to form mampa'u.

Like gismu, lujvo have only one meaning. When a lujvo is formally entered into a dictionary of the language, a specific definition will be assigned based on one particular interrelationship between the terms. (How this has been done is covered elsewhere in this grammar.) Unlike gismu, lujvo may have more than one form. This is because there is no difference in meaning between the various rafsi for a gismu when they are used to build a lujvo. A long rafsi may be used, especially in noisy environments, in place of a short rafsi; the result is considered the same lujvo, even though the word is spelled and pronounced differently. Thus the word brivla, built from the tanru bridi valsi, is the same lujvo as brivalsi, bridyvla, and bridyvalsi, each of which uses a different combination of rafsi.

When assembling rafsi together into lujvo, the rules for valid brivla must be followed: a consonant cluster must occur in the first five letters (excluding y and '), and the lujvo must end in a vowel.
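The two requirements just stated can be sketched as a quick check. This is a partial test under stated assumptions: the function names are hypothetical, the letter sets are assumed from the standard Lojban alphabet, and passing the check does not prove that a word is a valid lujvo (permissible consonant pairs and word break-up are not examined).

```python
# Assumption: the Lojban vowels and consonants; y and ' are handled separately.
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")

def has_early_cluster(word):
    """True if two consonants are adjacent within the first five letters,
    where y and ' are not counted as letters."""
    letters = [ch for ch in word if ch not in "y'"][:5]
    return any(
        a in CONSONANTS and b in CONSONANTS
        for a, b in zip(letters, letters[1:])
    )

def passes_basic_brivla_checks(word):
    """Ends in a vowel and has a consonant cluster in the first five letters."""
    return word[-1] in VOWELS and has_early_cluster(word)

for w in ("sampli", "soirsai", "soisai", "mudysiclu"):
    print(w, passes_basic_brivla_checks(w))
```

The form soisai fails because it contains no consonant cluster at all, which is exactly the situation the r-hyphen in soirsai repairs.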

A y (which is ignored in determining stress or consonant clusters) is inserted in the middle of the consonant cluster to glue the word together when the resulting cluster is either not permissible or the word is likely to break up. There are specific rules describing these conditions, detailed in Section 5.5.

An r (in some cases, an n) is inserted when a CVV-form rafsi attaches to the beginning of a lujvo in such a way that there is no consonant cluster. For example, in the lujvo


soirsai
sonci sanmi
soldier meal
field rations

the rafsi soi- and -sai are joined, with the additional r making up the rs consonant pair needed to make the word a brivla. Without the r, the word would break up into soi sai, two cmavo. The pair of cmavo have no relation to their rafsi lookalikes; they will either be ungrammatical (as in this case), or will express a different meaning from what was intended.

Learning rafsi and the rules for assembling them into lujvo is clearly necessary for making full use of the potential Lojban vocabulary.

Most important, it is possible to invent new lujvo while you speak or write in order to represent a new or unfamiliar concept, one for which you do not know any existing Lojban word. As long as you follow the rules for building these compounds, there is a good chance that you will be understood without explanation.

rafsi

Every gismu has from two to five rafsi, each of a different form, but each such rafsi represents only one gismu. It is valid to use any of the rafsi forms in building lujvo – whichever the reader or listener will most easily understand, or whichever is most pleasing – subject to the rules of lujvo making. There is a scoring algorithm which is intended to determine which of the possible and legal lujvo forms will be the standard dictionary form (see Section 5.11).

Each gismu always has at least two rafsi forms; one is the gismu itself (used only at the end of a lujvo), and one is the gismu without its final vowel (used only at the beginning or middle of a lujvo). These forms are represented as -CVC/CV or -CCVCV (called “the 5-letter rafsi”), and -CVC/C- or -CCVC- (called “the 4-letter rafsi”) respectively. The dashes in these rafsi form representations show where other rafsi may be attached to form a valid lujvo. When lujvo are formed only from 4-letter and 5-letter rafsi, known collectively as “long rafsi”, they are called “unreduced lujvo”.

Some examples of unreduced lujvo forms are:

mamtypatfu
mamta patfu
“mother father”
or “maternal grandfather”
lerfyliste
lerfu liste
“letter list” or a “list of letters”
(letters of the alphabet)
nancyprali
nanca prali
“year profit”
or “annual profit”
prunyplipe
pruni plipe
“elastic (springy) leap”
or “spring” (the verb)


vancysanmi
vanci sanmi
“evening meal”
or “supper”
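The unreduced forms listed above follow a mechanical pattern: every gismu except the last contributes its 4-letter rafsi followed by a y-hyphen, and the last gismu appears in full. A minimal sketch, assuming the input is a tanru made only of five-letter gismu (the function names are hypothetical):

```python
def long_rafsi(gismu):
    """Return (4-letter rafsi, 5-letter rafsi) for a five-letter gismu."""
    return gismu[:-1], gismu

def unreduced_lujvo(tanru):
    """Build the unreduced lujvo for a tanru given as space-separated gismu.

    Every non-final gismu contributes its 4-letter rafsi plus a y-hyphen;
    the final gismu is used unchanged as its own 5-letter rafsi.
    """
    words = tanru.split()
    parts = [long_rafsi(w)[0] + "y" for w in words[:-1]]
    parts.append(long_rafsi(words[-1])[1])
    return "".join(parts)

print(unreduced_lujvo("mamta patfu"))   # mamtypatfu
print(unreduced_lujvo("vanci sanmi"))   # vancysanmi
```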

In addition to these two forms, each gismu may have up to three additional short rafsi, three letters long. All short rafsi have one of the forms CVC, CCV, or CVV. The total number of rafsi forms that are assigned to a gismu depends on how useful the gismu is, or is presumed to be, in making lujvo, when compared to other gismu that could be assigned the rafsi.

For example, zmadu (“more than”) has the two short rafsi zma and mau (in addition to its unreduced rafsi zmad and zmadu), because a vast number of lujvo have been created based on zmadu, corresponding in general to English comparative adjectives ending in “-er” such as “whiter” (Lojban labmau). On the other hand, bakri (“chalk”) has no short rafsi and few lujvo.

Each gismu has at most one CVC-form, one CCV-form, and one CVV-form rafsi. In fact, only a tiny handful of gismu have both a CCV-form and a CVV-form rafsi assigned, and still fewer have all three forms of short rafsi. However, gismu with both a CVC-form and another short rafsi are fairly common, partly because more possible CVC-form rafsi exist. Yet CVC-form rafsi, even though they are fairly easy to remember, cannot be used at the end of a lujvo (because lujvo must end in vowels), which justifies the assignment of an additional short rafsi to many gismu.

The intention was to use the available “rafsi space” – the set of all possible short rafsi forms – in the most efficient way possible; the goal is to make the most-used lujvo as short as possible (thus maximizing the use of short rafsi), while keeping the rafsi very recognizable to anyone who knows the source gismu. For this reason, the letters in a rafsi have always been chosen from among the five letters of the corresponding gismu. As a result, there is a limited set of short rafsi available for assignment to each gismu. At most seven possible short rafsi are available for consideration (of which at most three can be used, as explained above).

Here are the only short rafsi forms that can possibly exist for gismu of the form CVC/CV, like sakli. The digits in the second column represent the gismu letters used to form the rafsi.

<tab class=wikitable header=true>
CVC 123 -sak-
CVC 124 -sal-
CVV 12'5 -sa'i-
CVV 125 -sai-
CCV 345 -kli-
CCV 132 -ska-
</tab>

(The only actual short rafsi for sakli is -sal-.)

For gismu of the form CCVCV, like blaci, the only short rafsi forms that can exist are:

<tab class=wikitable header=true>
CVC 134 -bac-
CVC 234 -lac-
CVV 13'5 -ba'i-
CVV 135 -bai-
CVV 23'5 -la'i-
CVV 235 -lai-
CCV 123 -bla-
</tab>

(In fact, blaci has none of these short rafsi; they are all assigned to other gismu. Lojban speakers are not free to reassign any of the rafsi; the tables shown here are to help understand how the rafsi were chosen in the first place.)

There are a few restrictions: a CVV-form rafsi without an apostrophe cannot exist unless the vowels make up one of the four diphthongs ai, ei, oi, or au; and a CCV-form rafsi is possible only if the two consonants form a permissible initial consonant pair (see Section ). Thus mamta, which has the same form as sakli, can only have mam, mat, and ma'a as possible rafsi: in fact, only mam is assigned to it.
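The two tables and the restrictions above can be combined into a short enumeration of candidate short rafsi. This is a sketch, not the procedure actually used to assign rafsi: the function names are hypothetical, and the set of permissible initial consonant pairs is an assumption taken from the standard Lojban phonology, since it is not listed in this section.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
DIPHTHONGS = {"ai", "ei", "oi", "au"}

# Assumption: the permissible initial consonant pairs (not given in this section).
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)

# Letter positions (1-based) taken from the two tables above.
PATTERNS = {
    "CVCCV": ["123", "124", "12'5", "125", "345", "132"],
    "CCVCV": ["134", "234", "13'5", "135", "23'5", "235", "123"],
}

def shape(gismu):
    return "".join("V" if ch in VOWELS else "C" for ch in gismu)

def candidate_short_rafsi(gismu):
    """List the short rafsi that could in principle be assigned to a gismu."""
    result = []
    for pattern in PATTERNS[shape(gismu)]:
        rafsi = "".join(ch if ch == "'" else gismu[int(ch) - 1] for ch in pattern)
        if rafsi[0] in CONSONANTS and rafsi[1] in CONSONANTS:
            # CCV form: needs a permissible initial pair.
            if rafsi[:2] not in INITIAL_PAIRS:
                continue
        elif "'" not in rafsi and rafsi[2] in VOWELS:
            # CVV form without apostrophe: the vowels must form a diphthong.
            if rafsi[1:] not in DIPHTHONGS:
                continue
        result.append(rafsi)
    return result

print(candidate_short_rafsi("mamta"))  # ['mam', 'mat', "ma'a"]
```

The function only lists what is possible for a single gismu; which candidates are actually assigned depends on the full gismu list and cannot be derived this way.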

Some cmavo also have associated rafsi, usually CVC-form. For example, the ten common numerical digits, which are all CV form cmavo, each have a CVC-form rafsi formed by adding a consonant to the cmavo. Most cmavo that have rafsi are ones used in composing tanru.

The term for a lujvo made up solely of short rafsi is “fully reduced lujvo”. Here are some examples of fully reduced lujvo:

cumfri
cumki lifri
“possible experience”


klezba
klesi zbasu
“category make”


kixta'a
krixa tavla
“cry-out talk”


sniju'o
sinxa djuno
“sign know”

In addition, the unreduced forms mamtypatfu and lerfyliste shown earlier may be fully reduced to:

mampa'u
mamta patfu
“mother father”
or “maternal grandfather”


lerste
lerfu liste
“letter list” or a “list of letters”

As noted above, CVC-form rafsi cannot appear as the final rafsi in a lujvo, because all lujvo must end with one or two vowels. As a brivla, a lujvo must also contain a consonant cluster within the first five letters – this ensures that they cannot be mistaken for compound cmavo. Of course, all lujvo have at least six letters since they have two or more rafsi, each at least three letters long; hence they cannot be confused with gismu.

When attaching two rafsi together, it may be necessary to insert a hyphen letter. In Lojban, the term “hyphen” always refers to a letter, either the vowel y or one of the consonants r and n. (The letter l can also be a hyphen, but is not used as one in lujvo.)

The y-hyphen is used after a CVC-form rafsi when joining it with the following rafsi could result in an impermissible consonant pair, or when the resulting lujvo could fall apart into two or more words (either cmavo or gismu).

Thus, the tanru pante tavla (“protest talk”) cannot produce the lujvo patta'a, because tt is not a permissible consonant pair; the lujvo must be patyta'a. Similarly, the tanru mudri siclu (“wooden whistle”) cannot form the lujvo mudsiclu; instead, mudysiclu must be used. (Remember that y is not counted in determining whether the first five letters of a brivla contain a consonant cluster: this is why.)

The y-hyphen is also used to attach a 4-letter rafsi, formed by dropping the final vowel of a gismu, to the following rafsi. (This procedure was shown, but not explained, in the unreduced lujvo examples above, such as mamtypatfu and vancysanmi.)

The lujvo forms zunlyjamfu, zunlyjma, zuljamfu, and zuljma are all legitimate and equivalent forms made from the tanru zunle jamfu (“left foot”). Of these, zuljma is the preferred one since it is the shortest; it thus is likely to be the form listed in a Lojban dictionary.
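The equivalent forms of a lujvo can be enumerated by trying every combination of the available rafsi. The sketch below is deliberately limited: the function names are hypothetical, the rafsi for each gismu are supplied by the caller, and only the y-hyphen after a 4-letter rafsi is inserted. It assumes the chosen short rafsi can be joined without any other hyphen, which holds for the examples in this section but not in general.

```python
from itertools import product

def join_rafsi(rafsi_sequence):
    """Concatenate rafsi, adding a y-hyphen after each non-final 4-letter rafsi."""
    out = []
    last_index = len(rafsi_sequence) - 1
    for i, rafsi in enumerate(rafsi_sequence):
        out.append(rafsi)
        # A 4-letter rafsi is a gismu minus its final vowel; CVV rafsi with
        # an apostrophe (such as pa'u) also have four characters but are short.
        if i != last_index and len(rafsi) == 4 and "'" not in rafsi:
            out.append("y")
    return "".join(out)

def lujvo_forms(choices):
    """choices: one list of usable rafsi per tanru component, in order."""
    return [join_rafsi(combo) for combo in product(*choices)]

forms = lujvo_forms([["zunl", "zul"], ["jamfu", "jma"]])
print(forms)                 # ['zunlyjamfu', 'zunlyjma', 'zuljamfu', 'zuljma']
print(min(forms, key=len))   # zuljma
```

Picking the shortest form is only a stand-in for the dictionary-form scoring algorithm, which is not reproduced here.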

The r-hyphen and its close relative, the n-hyphen, are used in lujvo only after CVV-form rafsi. A hyphen is always required in a two-part lujvo of the form CVV-CVV, since otherwise there would be no consonant cluster.

An r-hyphen or n-hyphen is also required after the CVV-form rafsi of any lujvo of the form CVV-CVC/CV or CVV-CCVCV, since it would otherwise fall apart into a CVV-form cmavo and a gismu. In any lujvo with more than two parts, a CVV-form rafsi in the initial position must always be followed by a hyphen. If the hyphen were omitted, the supposed lujvo could be broken into smaller words: the CVV-form rafsi would be interpreted as a cmavo, and the remainder of the word as a valid lujvo that is one rafsi shorter.

An n-hyphen is only used in place of an r-hyphen when the following rafsi begins with r. For example, the tanru rokci renro (“rock throw”) cannot be expressed as ro'ire'o (which breaks up into two cmavo), nor can it be ro'irre'o (which has an impermissible double consonant); the n-hyphen is required, and the correct form of the hyphenated lujvo is ro'inre'o. The same lujvo could also be expressed without hyphenation as rokre'o.
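For the simplest case, a two-part lujvo of the form CVV-CVV, the choice between the two hyphens can be sketched in a few lines. The function name is hypothetical, and the sketch covers only this one form, where a hyphen is always required.

```python
def join_cvv_pair(first, second):
    """Join two CVV-form rafsi into a two-part lujvo.

    An r-hyphen is used by default; an n-hyphen replaces it when the
    second rafsi begins with r, to avoid a double consonant.
    """
    hyphen = "n" if second.startswith("r") else "r"
    return first + hyphen + second

print(join_cvv_pair("soi", "sai"))    # soirsai
print(join_cvv_pair("ro'i", "re'o"))  # ro'inre'o
```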

There is also a different way of building lujvo, or rather phrases which are grammatically and semantically equivalent to lujvo. You can make a phrase containing any desired words, joining each pair of them with the special cmavo zei. Thus,

bridi zei valsi

is the exact equivalent of brivla (but not necessarily the same as the underlying tanru bridi valsi, which could have other meanings). Using zei is the only way to get a cmavo lacking a rafsi, a cmene, or a fu'ivla into a lujvo:


xy. zei kantu
X ray


kulnr,farsi zei lolgai
Farsi floor-cover
Persian rug


na'e zei .a zei na'e zei by. livgyterbilma
non-A, non-B liver-disease
non-A, non-B hepatitis


.cerman. zei jamkarce
Sherman war-car
Sherman tank

The third of these examples is particularly noteworthy because the phrase that would be produced by removing each zei from it doesn't end with a brivla, and in fact is not even grammatical. As written, the example is a tanru with two components, but by adding a zei between by. and livgyterbilma to produce


na'e zei .a zei na'e zei by. zei livgyterbilma
non-A-non-B-hepatitis

the whole phrase would become a single lujvo. This longer lujvo may be preferable, because its place structure can be built from that of bilma, whereas the place structure of a lujvo without a brivla must be constructed ad hoc.

Note that rafsi may not be used in zei phrases, because they are not words. CVV rafsi look like words (specifically cmavo) but there can be no confusion between the two uses of the same letters, because cmavo appear only as separate words or in compound cmavo (which are really just a notation for writing separate but closely related words as if they were one); rafsi appear only as parts of lujvo.

fu'ivla

The use of tanru or lujvo is not always appropriate for very concrete or specific terms (e.g. “brie” or “cobra”), or for jargon words specialized to a narrow field (e.g. “quark”, “integral”, or “iambic pentameter”). These words are in effect names for concepts, and the names were invented by speakers of another language. The vast majority of words referring to plants, animals, foods, and scientific terminology cannot be easily expressed as tanru. They thus must be borrowed (actually “copied”) into Lojban from the original language.

There are four stages of borrowing in Lojban, as words become more and more modified (but shorter and easier to use). Stage 1 is the use of a foreign name quoted with the cmavo la'o (explained in full in Section ):

me la'o ly. spaghetti .ly.

is a predicate with the place structure “x1 is a quantity of spaghetti”.

Stage 2 involves changing the foreign name to a Lojbanized name, as explained in Section 5.7:

me la spagetis.

One of these expedients is often quite sufficient when you need a word quickly in conversation. (This can make it easier to get by when you do not yet have full command of the Lojban vocabulary, provided you are talking to someone who will recognize the borrowing.)

Where a little more universality is desired, the word to be borrowed must be Lojbanized into one of several permitted forms. A rafsi is then usually attached to the beginning of the Lojbanized form, using a hyphen to ensure that the resulting word doesn't fall apart.

The rafsi categorizes or limits the meaning of the fu'ivla; otherwise a word having several different jargon meanings in other languages would require the word-inventor to choose which meaning should be assigned to the fu'ivla, since fu'ivla (like other brivla) are not permitted to have more than one definition. Such a Stage 3 borrowing is the most common kind of fu'ivla.

Finally, Stage 4 fu'ivla do not have any rafsi classifier, and are used where a fu'ivla has become so common or so important that it must be made as short as possible. (See Section 5.15 for a proposal concerning Stage 4 fu'ivla.)

The form of a fu'ivla reliably distinguishes it from both the gismu and the cmavo. Like cultural gismu, fu'ivla are generally based on a word from a single non-Lojban language. The word is “borrowed” (actually “copied”, hence the Lojban tanru fukpi valsi) from the other language and Lojbanized – the phonemes are converted to their closest Lojban equivalent and modifications are made as necessary to make the word a legitimate Lojban fu'ivla-form word. All fu'ivla:

  • must contain a consonant cluster in the first five letters of the word; if this consonant cluster is at the beginning, it must either be a permissible initial consonant pair, or a longer cluster such that each pair of adjacent consonants in the cluster is a permissible initial consonant pair: spraile is acceptable, but not ktraile or trkaile;
  • must end in one or more vowels;
  • must not be gismu or lujvo, or any combination of cmavo, gismu, and lujvo; furthermore, a fu'ivla with a CV cmavo joined to the front of it must not have the form of a lujvo (the so-called “slinku'i test”, not discussed further in this book);
  • cannot contain y, although they may contain syllabic pronunciations of Lojban consonants;
  • like other brivla, are stressed on the penultimate syllable.

Note that consonant triples or larger clusters that are not at the beginning of a fu'ivla can be quite flexible, as long as all consonant pairs are permissible. There is no need to restrict fu'ivla clusters to permissible initial pairs except at the beginning.

This is a fairly liberal definition and allows quite a lot of possibilities within “fu'ivla space”. Stage 3 fu'ivla can be made easily on the fly, as lujvo can, because the procedure for forming them always guarantees a word that cannot violate any of the rules. Stage 4 fu'ivla require running tests that are not simple to characterize or perform, and should be made only after deliberation and by someone knowledgeable about all the considerations that apply.

Here is a simple and reliable procedure for making a non-Lojban word into a valid Stage 3 fu'ivla:

  • Eliminate all double consonants and silent letters.
  • Convert all sounds to their closest Lojban equivalents. Lojban y, however, may not be used in any fu'ivla.
  • If the last letter is not a vowel, modify the ending so that the word ends in a vowel, either by removing a final consonant or by adding a suggestively chosen final vowel.
  • If the first letter is not a consonant, modify the beginning so that the word begins with a consonant, either by removing an initial vowel or adding a suggestively chosen initial consonant.
  • Prefix the result of the preceding steps with a 4-letter rafsi that categorizes the fu'ivla into a “topic area”. It is only safe to use a 4-letter rafsi; short rafsi sometimes produce invalid fu'ivla. Hyphenate the rafsi to the rest of the fu'ivla with an r-hyphen; if that would produce a double r, use an n-hyphen instead; if the rafsi ends in r and the rest of the fu'ivla begins with n (or vice versa), or if the rafsi ends in r and the rest of the fu'ivla begins with tc, ts, dj, or dz (where an n-hyphen would produce a phonotactically impermissible cluster), use an l-hyphen. (This is the only use of the l-hyphen in Lojban.) Alternatively, if a CVC-form short rafsi is available, it can be used instead of the long rafsi.

  • Remember that the stress necessarily appears on the penultimate (next-to-the-last) syllable.

In this section, the hyphen is set off with commas in the examples, but these commas are not required in writing, and the hyphen need not be pronounced as a separate syllable.
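The hyphen choice in the prefixing step can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the word passed in is assumed to be already Lojbanized (it begins with a consonant and ends in a vowel), and no check is made that the result is a valid fu'ivla.

```python
def stage3_hyphen(rafsi, word):
    """Choose the hyphen letter (r, n, or l) between a rafsi and a Lojbanized word."""
    last, first = rafsi[-1], word[0]
    # r next to n on either side of the hyphen rules out both r and n.
    if (last, first) in (("r", "n"), ("n", "r")):
        return "l"
    # After a rafsi ending in r, an n-hyphen before tc, ts, dj, dz is impermissible.
    if last == "r" and word[:2] in ("tc", "ts", "dj", "dz"):
        return "l"
    # An r-hyphen next to another r would give a double r.
    if last == "r" or first == "r":
        return "n"
    return "r"

def build_stage3(rafsi, word, commas=False):
    """Prefix the rafsi and hyphen; commas=True sets the hyphen off with commas."""
    hyphen = stage3_hyphen(rafsi, word)
    sep = "," if commas else ""
    return rafsi + sep + hyphen + sep + word

print(build_stage3("cidj", "spageti", commas=True))  # cidj,r,spageti
print(build_stage3("ler", "djamo"))                  # lerldjamo
```

The commas option mirrors the way the examples in this section are written; it does not change the word itself.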

Here are a few examples:


spaghetti
from English or Italian
spageti
Lojbanize
cidj,r,spageti
prefix long rafsi
dja,r,spageti
prefix short rafsi

where cidj- is the 4-letter rafsi for cidja, the Lojban gismu for “food”, thus categorizing cidjrspageti as a kind of food. The form with the short rafsi happens to work, but such good fortune cannot be relied on: in any event, it means the same thing.



Acer
the scientific name of maple trees
acer
Lojbanize
xaceru
add initial consonant and final vowel
tric,r,xaceru
prefix rafsi
ric,r,xaceru
prefix short rafsi

where tric- and ric- are rafsi for tricu, the gismu for “tree”. Note that by the same principles, “maple sugar” could get the fu'ivla saktrxaceru, or could be represented by the tanru tricrxaceru sakta. Technically, ricrxaceru and tricrxaceru are distinct fu'ivla, but they would surely be given the same meanings if both happened to be in use.


brie
from French
bri
Lojbanize
cirl,r,bri
prefix rafsi

where cirl- represents cirla (“cheese”).


cobra
kobra
Lojbanize
sinc,r,kobra
prefix rafsi

where sinc- represents since (“snake”).


quark
kuark
Lojbanize
kuarka
add final vowel
sask,r,kuarka
prefix rafsi

where sask- represents saske (“science”). Note the extra vowel a added to the end of the word, and the diphthong ua, which never appears in gismu or lujvo, but may appear in fu'ivla.


자모
from Korean
djamo
Lojbanize
lerf,r,djamo
prefix rafsi
ler,l,djamo
prefix short rafsi

where lerf- and ler- are rafsi for lerfu (“letter”). Note the l-hyphen in lerldjamo, since lerndjamo contains the forbidden cluster ndj.

The use of the prefix helps distinguish among the many possible meanings of the borrowed word, depending on the field. As it happens, spageti and kuarka are valid Stage 4 fu'ivla, but xaceru looks like a compound cmavo, and kobra like a gismu.

For another example, “integral” has a specific meaning to a mathematician. But the Lojban fu'ivla integrale, which is a valid Stage 4 fu'ivla, does not convey that mathematical sense to a non-mathematical listener, even one with an English-speaking background; its source – the English word “integral” – has various other specialized meanings in other fields.

Left uncontrolled, integrale almost certainly would eventually come to mean the same collection of loosely related concepts that English associates with “integral”, with only the context to indicate (possibly) that the mathematical term is meant.

The prefix method would render the mathematical concept as cmacrntegrale, if the i of integrale is removed, or something like cmacrnintegrale, if a new consonant is added to the beginning; cmac- is the rafsi for cmaci (“mathematics”). The architectural sense of “integral” might be conveyed with dinjrnintegrale or tarmrnintegrale, where dinju and tarmi mean “building” and “form” respectively.

Here are some fu'ivla representing cultures and related things, shown with more than one rafsi prefix:


bang,r,blgaria
Bulgarian
in language



kuln,r,blgaria
Bulgarian
in culture



gugd,r,blgaria
Bulgaria
the country


bang,r,kore,a
Korean
the language



kuln,r,kore,a
Korean
the culture


Note the commas in bang,r,kore,a and kuln,r,kore,a, used because ea is not a valid diphthong in Lojban. Arguably, some form of the native name “Chosen” should have been used instead of the internationally known “Korea”; this is a recurring problem in all borrowings. In general, it is better to use the native name unless using it will severely impede understanding: “Navajo” is far more widely known than “Dine'e”.

cmene

Lojbanized names, called cmene, are very much like their counterparts in other languages. They are labels applied to things (or people) to stand for them in descriptions or in direct address. They may convey meaning in themselves, but do not necessarily do so.

Because names are often highly personal and individual, Lojban attempts to allow native language names to be used with a minimum of modification. The requirement that the Lojban speech stream be unambiguously analyzable, however, means that most names must be modified somewhat when they are Lojbanized. Here are a few examples of English names and possible Lojban equivalents:


djim.
Jim


djein.
Jane


.arnold.
Arnold


pit.
Pete


katrinas.
Katrina


kat,r,in.
Catherine

(Note that syllabic r is skipped in determining the stressed syllable, so kat,r,in. is stressed on the ka.)


katis.
Cathy


keit.
Kate

Names may have almost any form, but always end in a consonant, and are followed by a pause. They are penultimately stressed, unless unusual stress is marked with capitalization. A name may have multiple parts, each ending with a consonant and pause, or the parts may be combined into a single word with no pause. For example,


djan. braun.

and


djanbraun.

are both valid Lojbanizations of “John Brown”.

The final arbiter of the correct form of a name is the person doing the naming, although most cultures grant people the right to determine how they want their own name to be spelled and pronounced. The English name “Mary” can thus be Lojbanized as meris., maris., meiris., merix., or even marys.. The last alternative is not pronounced much like its English equivalent, but may be desirable to someone who values spelling over pronunciation. The final consonant need not be an s; there must, however, be some Lojban consonant at the end.

Names are not permitted to have the sequences la, lai, or doi embedded in them, unless the sequence is immediately preceded by a consonant. These minor restrictions are due to the fact that all Lojban cmene embedded in a speech stream will be preceded by one of these words or by a pause. With one of these words embedded, the cmene might break up into valid Lojban words followed by a shorter cmene. However, break-up cannot happen after a consonant, because that would imply that the word before the la, or whatever, ended in a consonant without pause, which is impossible.

For example, the invalid name laplas. would look like the Lojban words la plas., and ilanas. would be misunderstood as .i la nas.. However, NEderlants. cannot be misheard as NEder lants., because NEder with no following pause is not a possible Lojban word.

There are close alternatives to these forbidden sequences that can be used in Lojbanizing names, such as ly, lei, and dai or do'i, that do not cause these problems.
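The restriction on la, lai, and doi can be sketched as a simple scan. This is an illustrative check only: the function names are hypothetical, the consonant set is assumed from the standard Lojban alphabet, and the check covers just two of the requirements on cmene (a final consonant and the absence of unprotected forbidden sequences).

```python
# Assumption: the Lojban consonants; y and ' do not count as consonants here.
CONSONANTS = set("bcdfgjklmnprstvxz")
FORBIDDEN = ("la", "lai", "doi")

def bare(name):
    """Lowercase the name and drop the pause marks and syllable commas."""
    return name.lower().replace(".", "").replace(",", "")

def forbidden_sequences(name):
    """Return (sequence, position) for each la/lai/doi not preceded by a consonant."""
    text = bare(name)
    hits = []
    for seq in FORBIDDEN:
        start = text.find(seq)
        while start != -1:
            if start == 0 or text[start - 1] not in CONSONANTS:
                hits.append((seq, start))
            start = text.find(seq, start + 1)
    return hits

def is_acceptable_cmene(name):
    """Ends in a consonant and contains no unprotected forbidden sequence."""
    text = bare(name)
    return bool(text) and text[-1] in CONSONANTS and not forbidden_sequences(name)

for n in ("laplas.", "ilanas.", "NEderlants.", "djan."):
    print(n, is_acceptable_cmene(n))
```

Here laplas. and ilanas. are rejected, while NEderlants. passes because its la is immediately preceded by the consonant r.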

Lojban cmene are identifiable as word forms by the following characteristics:

  • They must end in one or more consonants. There are no rules about how many consonants may appear in a cluster in cmene, provided that each consonant pair (whether standing by itself, or as part of a larger cluster) is a permissible pair.
  • They may contain the letter y as a normal, non-hyphenating vowel. They are the only kind of Lojban word that may contain the two diphthongs iy and uy.
  • They are always followed in speech by a pause after the final consonant, written as a period (.).
  • They may be stressed on any syllable; if this syllable is not the penultimate one, it must be capitalized when writing. Neither names nor words that begin sentences are capitalized in Lojban, so this is the only use of capital letters.

Names meeting these criteria may be invented, Lojbanized from names in other languages, or formed by appending a consonant onto a cmavo, a gismu, a fu'ivla or a lujvo. Some cmene built from Lojban words are:


pav.
the One

from the cmavo pa, with rafsi pav, meaning “one”


sol.
the Sun

from the gismu solri, meaning “solar”, or actually “pertaining to the Sun”


ralj.
Chief
as a title

from the gismu ralju, meaning “principal”.


nol.
Lord/Lady

from the gismu nobli, with rafsi nol, meaning “noble”.

To Lojbanize a name from the various natural languages, apply the following rules:

  • Eliminate double consonants and silent letters.
  • Add a final s or n (or some other consonant that sounds good) if the name ends in a vowel.
  • Convert all sounds to their closest Lojban equivalents.
  • If possible and acceptable, shift the stress to the penultimate (next-to-the-last) syllable. Use commas and capitalization in written Lojban when it is necessary to preserve non-standard syllabication or stress. Do not capitalize names otherwise.
  • If the name contains an impermissible consonant pair, insert a vowel between the consonants: y is recommended.
  • No cmene may have the syllables la, lai, or doi in them, unless immediately preceded by a consonant. If these combinations are present, they must be converted to something else. Possible substitutions include ly, ly'i, and dai or do'i, respectively.

There are some additional rules for Lojbanizing the scientific names (technically known as “Linnaean binomials” after their inventor) which are internationally applied to each species of animal or plant. Where precision is essential, these names need not be Lojbanized, but can be directly inserted into Lojban text using the cmavo la'o, explained in Section . Using this cmavo makes the already lengthy Latinized names at least four syllables longer, however, and leaves the pronunciation in doubt. The following suggestions, though incomplete, will assist in converting Linnaean binomials to valid Lojban names. They can also help to create fu'ivla based on Linnaean binomials or other words of the international scientific vocabulary. The term “back vowel” in the following list refers to any of the letters a, o, or u; the term “front vowel” correspondingly refers to any of the letters e, i, or y.

  • Change double consonants other than cc to single consonants.
  • Change cc before a front vowel to kc, but otherwise to k.
  • Change c before a back vowel and final c to k.
  • Change ng before a consonant (other than h) and final ng to n.
  • Change x to z initially, but otherwise to ks.
  • Change pn to n initially.

  • Change final ie and ii to i.
  • Make the following idiosyncratic substitutions:

<tab class=wikitable header=true>
aa    a
ae    e
ch    k
ee    i
eigh  ei
ew    u
igh   ai
oo    u
ou    u
ow    au
ph    f
q     k
sc    sk
w     u
y     i
</tab>

However, the diphthong substitutions should not be done if the two vowels are in two different syllables.

  • Change “h” between two vowels to ', but otherwise remove it completely. If preservation of the “h” seems essential, change it to x instead.
  • Place ' between any remaining vowel pairs that do not form Lojban diphthongs.

Some further examples of Lojbanized names are:

<tab class=wikitable header=true>
English “Mary”          meris. or meiris.
English “Smith”         smit.
English “Jones”         djonz.
English “John”          djan. or jan. (American) or djon. or jon. (British)
English “Alice”         .alis.
English “Elise”         .eLIS.
English “Johnson”       djansn.
English “William”       .uiliam. or .uil,iam.
English “Brown”         braun.
English “Charles”       tcarlz.
French “Charles”        carl.
French “De Gaulle”      dyGOL.
German “Heinrich”       xainrix.
Spanish “Joaquin”       xuaKIN.
Russian “Svetlana”      sfietlanys.
Russian “Khrushchev”    xrucTCOF.
Hindi “Krishna”         kricnas.
Polish “Lech Walesa”    lex. va,uensas.
Spanish “Don Quixote”   don. kicotes. or modern Spanish: don. kixotes. or Mexican dialect: don. ki'otes.
Chinese “Mao Zedong”    maudzydyn.
Japanese “Fujiko”       fudjikos. or fujikos.
</tab>

Rules for inserting pauses

Summarized in one place, here are the rules for inserting pauses between Lojban words:

  • Any two words may have a pause between them; it is always illegal to pause in the middle of a word, because that breaks up the word into two words.
  • Every word ending in a consonant must be followed by a pause. Necessarily, all such words are cmene.
  • Every word beginning with a vowel must be preceded by a pause. Such words are either cmavo, fu'ivla, or cmene; all gismu and lujvo begin with consonants.
  • Every cmene must be preceded by a pause, unless the immediately preceding word is one of the cmavo la, lai, la'i, or doi (which is why those strings are forbidden in cmene). However, the situation triggering this rule rarely occurs.
  • If the last syllable of a word bears the stress, and a brivla follows, the two must be separated by a pause, to prevent confusion with the primary stress of the brivla. In this case, the first word must be either a cmavo or a cmene with unusual stress (which already ends with a pause, of course).
  • A cmavo of the form “Cy” must be followed by a pause unless another “Cy”-form cmavo follows.
  • When non-Lojban text is embedded in Lojban, it must be preceded and followed by pauses. (How to embed non-Lojban text is explained in Section .)
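Three of these rules can be decided from spelling alone, and the sketch below covers only those: a final consonant, an initial vowel, and the “Cy” rule. It is an illustration under stated assumptions, not a full pause checker: the function names are hypothetical, y is counted as a vowel letter, and the stress-based rule, the cmene rule, and embedded foreign text are not handled.

```python
# Vowel letters for this check; y is included so that Cy-form cmavo
# count as vowel-final and words such as ".y." count as vowel-initial.
VOWEL_LETTERS = set("aeiouy")


def is_cy_cmavo(word: str) -> bool:
    """True for two-letter words of the form consonant + y (for example "by")."""
    return len(word) == 2 and word[0] not in VOWEL_LETTERS and word[1] == "y"


def pause_required(first: str, second: str) -> bool:
    """Apply the three spelling-based pause rules to a pair of adjacent words."""
    first, second = first.strip("."), second.strip(".")
    if first[-1] not in VOWEL_LETTERS:   # consonant-final word (a cmene)
        return True
    if second[0] in VOWEL_LETTERS:       # vowel-initial word
        return True
    if is_cy_cmavo(first) and not is_cy_cmavo(second):
        return True
    return False


print(pause_required("djan.", "klama"))
print(pause_required("mi", "klama"))
```

A result of False only means that none of these three rules applies; one of the remaining rules may still demand a pause.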

Considerations for making lujvo

Given a tanru which expresses an idea to be used frequently, it can be turned into a lujvo by following the lujvo-making algorithm which is given in Section 5.10.

In building a lujvo, the first step is to replace each gismu with a rafsi that uniquely represents that gismu. These rafsi are then attached together by fixed rules that allow the resulting compound to be recognized as a single word and to be analyzed in only one way.

There are three other complications; only one is serious.

The first is that there is usually more than one rafsi that can be used for each gismu. The one to be used is simply whichever one sounds or looks best to the speaker or writer. There are usually many valid combinations of possible rafsi. They all are equally valid, and all of them mean exactly the same thing. (The scoring algorithm given in Section 5.11 is used to choose the standard form of the lujvo – the version which would be entered into a dictionary.)

The second complication is the serious one. Remember that a tanru is ambiguous – it has several possible meanings. A lujvo, or at least one that would be put into the dictionary, has just a single meaning. Like a gismu, a lujvo is a predicate which encompasses one area of the semantic universe, with one set of places. Hopefully the meaning chosen is the most useful of the possible semantic spaces. A possible source of linguistic drift in Lojban is that as Lojbanic society evolves, the concept that seems the most useful one may change.

You must also be aware of the possibility of some prior meaning of a new lujvo, especially if you are writing for posterity. If a lujvo is invented which involves the same tanru as one that is in the dictionary, and is assigned a different meaning (or even just a different place structure), linguistic drift results. This isn't necessarily bad. Every natural language does it. But in communication, when you use a meaning different from the dictionary definition, someone else may use the dictionary and therefore misunderstand you. You can use the cmavo za'e (explained in Section ) before a newly coined lujvo to indicate that it may have a non-dictionary meaning.

The essential nature of human communication is that if the listener understands, then all is well. Let this be the ultimate guideline for choosing meanings and place structures for invented lujvo.

The third complication is also simple, but tends to scare new Lojbanists with its implications. It is based on Zipf's Law, which says that the length of words is inversely proportional to their usage. The shortest words are those which are used more; the longest ones are used less. Conversely, commonly used concepts will tend to be abbreviated. In English, we have abbreviations and acronyms and jargon, all of which represent complex ideas that are used often by small groups of people, who shorten them to convey more information more rapidly.

Therefore, given a complicated tanru with grouping markers, abstraction markers, and other cmavo in it to make it syntactically unambiguous, the psychological basis of Zipf's Law may compel the lujvo-maker to drop some of the cmavo to make a shorter (technically incorrect) tanru, and then use that tanru to make the lujvo.

This doesn't lead to ambiguity, as it might seem to. A given lujvo still has exactly one meaning and place structure. It is just that more than one tanru is competing for the same lujvo. But more than one meaning for the tanru was already competing for the “right” to define the meaning of the lujvo. Someone has to use judgment in deciding which one meaning is to be chosen over the others.

If the lujvo made by a shorter form of tanru is in use, or is likely to be useful for another meaning, the decider then retains one or more of the cmavo, preferably ones that set this meaning apart from the shorter form meaning that is used or anticipated. As a rule, therefore, the shorter lujvo will be used for a more general concept, possibly even instead of a more frequent word. If both words are needed, the simpler one should be shorter. It is easier to add a cmavo to clarify the meaning of the more complex term than it is to find a good alternate tanru for the simpler term.

And of course, we have to consider the listener. On hearing an unknown word, the listener will decompose it and may get a tanru that makes no sense or the wrong sense for the context. If the listener realizes that the grouping operators may have been dropped out, he or she may try alternate groupings, or try inserting an abstraction operator if that seems plausible. (The grouping of tanru and abstraction are each explained in a separate chapter.) Plausibility is the key to learning new ideas and to evaluating unfamiliar lujvo.

The lujvo-making algorithm

The following is the current algorithm for generating Lojban lujvo given a known tanru and a complete list of gismu and their assigned rafsi. The algorithm was designed by Bob LeChevalier and Dr. James Cooke Brown for computer program implementation. It was modified in 1989 with the assistance of Nora LeChevalier, who detected a flaw in the original “tosmabru test”.

Given a tanru that is to be made into a lujvo:

  • Choose a 3-letter or 4-letter rafsi for each of the gismu and cmavo in the tanru except the last.
  • Choose a 3-letter (CVV-form or CCV-form) or 5-letter rafsi for the final gismu in the tanru.
  • Join the resulting string of rafsi, initially without hyphens.
  • Add hyphen letters where necessary. It is illegal to add a hyphen at a place that is not required by this algorithm. Right-to-left tests are recommended, for reasons discussed below.
  • If there are more than two words in the tanru, put an r-hyphen (or an n-hyphen) after the first rafsi if it is CVV-form. If there are exactly two words, then put an r-hyphen (or an n-hyphen) between the two rafsi if the first rafsi is CVV-form, unless the second rafsi is CCV-form (for example, saicli requires no hyphen). Use an r-hyphen unless the letter after the hyphen is r, in which case use an n-hyphen. Never use an n-hyphen unless it is required.
  • Put a y-hyphen between the consonants of any impermissible consonant pair. This will always appear between rafsi.
  • Put a y-hyphen after any 4-letter rafsi form.

  • Test all forms with one or more initial CVC-form rafsi (with the pattern “CVC ... CVC + X”) for “tosmabru failure”. X must either be a CVCCV long rafsi that happens to have a permissible initial pair as the consonant cluster, or be something which has caused a y-hyphen to be installed between the previous CVC and itself by one of the above rules.

The test is as follows:

  • Examine all the C/C consonant pairs up to the first y-hyphen, or up to the end of the word in case there are no y-hyphens. These consonant pairs are called “joints”.

  • If all of those joints are permissible initials, then the trial word will break up into a cmavo and a shorter brivla. If not, the word will not break up, and no further hyphens are needed.
  • Install a y-hyphen at the first such joint.

Note that the “tosmabru test” implies that the algorithm will be more efficient if rafsi junctures are tested for required hyphens from right to left, instead of from left to right; when the test is required, it cannot be completed until hyphenation to the right has been determined.
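The hyphenation steps above can be sketched in code. This is one reading of the algorithm, not a reference implementation: the function names are hypothetical, and the list of permissible initial pairs and the impermissible-pair rules are not given in this section; they are filled in here from the grammar's phonology rules and should be checked against that chapter. The input is a list of rafsi that has already been chosen.

```python
VOWELS = set("aeiou")
# Assumed table: the permissible initial consonant pairs.
INITIALS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd jg jm jv kl kr "
    "ml mr pl pr sf sk sl sm sn sp sr st tc tr ts vl vr xl xr zb zd zg zm zv".split()
)
VOICED, UNVOICED = set("bdgvjz"), set("ptkfcsx")


def shape(rafsi):
    """Consonant/vowel pattern of a rafsi, ignoring apostrophes."""
    return "".join("V" if ch in VOWELS else "C" for ch in rafsi if ch != "'")


def impermissible(a, b):
    """Assumed rules for consonant pairs that may not stand together."""
    if a == b:
        return True
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return True
    if a in "cjsz" and b in "cjsz":
        return True
    return a + b in {"cx", "kx", "xc", "xk", "mz"}


def hyphenate(rafsi):
    n = len(rafsi)
    hy = [""] * (n - 1)  # hy[i] is the hyphen between rafsi[i] and rafsi[i + 1]
    # r-hyphen (n-hyphen before r) after an initial CVV-form rafsi.
    if n >= 2 and shape(rafsi[0]) == "CVV" and (n > 2 or shape(rafsi[1]) != "CCV"):
        hy[0] = "n" if rafsi[1][0] == "r" else "r"
    # y-hyphen after 4-letter rafsi and inside impermissible consonant pairs.
    for i in range(n - 1):
        left, right = rafsi[i], rafsi[i + 1]
        if hy[i]:
            continue
        if shape(left) in ("CVCC", "CCVC"):
            hy[i] = "y"
        elif (left[-1] not in VOWELS and right[0] not in VOWELS
              and impermissible(left[-1], right[0])):
            hy[i] = "y"
    # tosmabru test: count the leading CVC rafsi joined without hyphens.
    m = 0
    while m < n and shape(rafsi[m]) == "CVC" and (m == 0 or hy[m - 1] == ""):
        m += 1
    if 0 < m < n:
        if hy[m - 1] == "y":  # X caused a y-hyphen after the last CVC
            joints = [rafsi[j][-1] + rafsi[j + 1][0] for j in range(m - 1)]
        elif (m == n - 1 and shape(rafsi[m]) == "CVCCV"
              and rafsi[m][2:4] in INITIALS):  # X is a CVCCV long rafsi
            joints = [rafsi[j][-1] + rafsi[j + 1][0] for j in range(m)]
        else:
            joints = []
        if joints and all(joint in INITIALS for joint in joints):
            hy[0] = "y"  # install a y-hyphen at the first joint
    return "".join(r + h for r, h in zip(rafsi, hy + [""]))


print(hyphenate(["tos", "mabru"]))
print(hyphenate(["nak", "kem", "cin", "ctu"]))
```

Because all hyphens between rafsi are settled before the tosmabru test runs, the order of evaluation matches the right-to-left recommendation in spirit: the test never sees an undecided juncture.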

The lujvo scoring algorithm

This algorithm was devised by Bob and Nora LeChevalier in 1989. It is not the only possible algorithm, but it usually gives a choice that people find preferable. The algorithm may be changed in the future. The lowest-scoring variant will usually be the dictionary form of the lujvo. (In previous versions, it was the highest-scoring variant.)

  • Count the total number of letters, including hyphens and apostrophes; call it L.
  • Count the number of apostrophes; call it A.
  • Count the number of y-, r-, and n-hyphens; call it H.
  • For each rafsi, find the value in the following table. Sum this value over all rafsi; call it R:

<tab class=wikitable header=true>
CVC/CV (final) (-sarji)          1
CVC/C (-sarj-)                   2
CCVCV (final) (-zbasu)           3
CCVC (-zbas-)                    4
CVC (-nun-)                      5
CVV with an apostrophe (-ta'u-)  6
CCV (-zba-)                      7
CVV with no apostrophe (-sai-)   8
</tab>

  • Count the number of vowels, not including y; call it V.

The score is then:

(1000 * L) - (500 * A) + (100 * H) - (10 * R) - V

In case of ties, there is no preference. This should be rare. Note that the algorithm essentially encodes a hierarchy of priorities: short words are preferred (counting apostrophes as half a letter), then words with fewer hyphens, words with more pleasing rafsi (this judgment is subjective), and finally words with more vowels are chosen. Each decision principle is applied in turn if the ones before it have failed to choose; it is possible that a lower-ranked principle might dominate a higher-ranked one if it is ten times better than the alternative.
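The scoring formula can be written out as a short function. The sketch below assumes that the lujvo is supplied as a list of its parts, with each hyphen letter as a separate one-letter item; that input convention and the function names are choices made for this illustration.

```python
VOWELS = set("aeiou")       # y is deliberately not counted as a vowel
HYPHENS = {"y", "r", "n"}   # one-letter parts are treated as hyphens

# Values from the rafsi table, keyed by consonant/vowel shape.
RAFSI_VALUES = {"CVCCV": 1, "CVCC": 2, "CCVCV": 3, "CCVC": 4, "CVC": 5, "CCV": 7}


def shape(rafsi):
    """Consonant/vowel pattern of a rafsi, ignoring apostrophes."""
    return "".join("V" if ch in VOWELS else "C" for ch in rafsi if ch != "'")


def rafsi_value(rafsi):
    s = shape(rafsi)
    if s == "CVV":
        return 6 if "'" in rafsi else 8
    return RAFSI_VALUES[s]


def lujvo_score(parts):
    """Score a lujvo given as parts, e.g. ["sai", "r", "zba", "ta'u"]."""
    word = "".join(parts)
    L = len(word)                                    # letters, hyphens, apostrophes
    A = word.count("'")                              # apostrophes
    H = sum(1 for p in parts if p in HYPHENS)        # y-, r-, n-hyphens
    R = sum(rafsi_value(p) for p in parts if p not in HYPHENS)
    V = sum(1 for ch in word if ch in VOWELS)        # vowels other than y
    return 1000 * L - 500 * A + 100 * H - 10 * R - V


print(lujvo_score(["zba", "sai"]))   # 5847
print(lujvo_score(["ger", "zda"]))   # 5878
```

Lower scores are better, so choosing a dictionary form amounts to taking the minimum of this function over all valid candidate forms.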

Here are some lujvo with their scores (not necessarily the lowest scoring forms for these lujvo, nor even necessarily sensible lujvo):

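zbasai

zba + sai: (1000 * 6) - (500 * 0) + (100 * 0) - (10 * 15) - 3 = 5847


nunynau

nun + y + nau: (1000 * 7) - (500 * 0) + (100 * 1) - (10 * 13) - 3 = 6967


sairzbata'u

sai + r + zba + ta'u: (1000 * 11) - (500 * 1) + (100 * 1) - (10 * 21) - 5 = 10385


zbazbasysarji

zba + zbas + y + sarji: (1000 * 13) - (500 * 0) + (100 * 1) - (10 * 12) - 4 = 12976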

lujvo-making examples

This section contains examples of making and scoring lujvo. First, we will start with the tanru gerku zdani (“dog house”) and construct a lujvo meaning “doghouse”, that is, a house where a dog lives. We will use a brute-force application of the algorithm in Section 5.11, using every possible rafsi.

The rafsi for gerku are:

  • -ger-,
  • -ge'u-,
  • -gerk-,
  • -gerku

The rafsi for zdani are:

  • -zda-,
  • -zdan-,
  • -zdani.

Step 1 of the algorithm directs us to use -ger-, -ge'u- and -gerk- as possible rafsi for gerku; Step 2 directs us to use -zda- and -zdani as possible rafsi for zdani. The six possible forms of the lujvo are then:

  • ger-zda
  • ger-zdani
  • ge'u-zda
  • ge'u-zdani
  • gerk-zda
  • gerk-zdani

We must then insert appropriate hyphens in each case. The first two forms need no hyphenation: ge cannot fall off the front, because the following word would begin with rz, which is not a permissible initial consonant pair. So the lujvo forms are gerzda and gerzdani.

The third form, ge'u-zda, needs no hyphen, because even though the first rafsi is CVV, the second one is CCV, so there is a consonant cluster in the first five letters. So ge'uzda is this form of the lujvo.

The fourth form, ge'u-zdani, however, requires an r-hyphen; otherwise, the ge'u- part would fall off as a cmavo. So this form of the lujvo is ge'urzdani.

The last two forms require y-hyphens, as all 4-letter rafsi do, and so are gerkyzda and gerkyzdani respectively.

The scoring algorithm is heavily weighted in favor of short lujvo, so we might expect that gerzda would win. Its L score is 6, its A score is 0, its H score is 0, its R score is 12, and its V score is 2, for a final score of (1000 * 6) - (500 * 0) + (100 * 0) - (10 * 12) - 2 = 5878. The other forms have scores of 7917, 6367, 9506, 8008, and 10047 respectively. Consequently, this lujvo would probably appear in the dictionary in the form gerzda.
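The brute-force procedure for this example can be sketched as follows. The rafsi lists, the hyphenated forms, and the scores are all taken from the text of this example; only the variable names are invented. The sketch enumerates every pairing of rafsi and then picks the form with the lowest quoted score.

```python
from itertools import product

# Rafsi permitted by Step 1 (non-final position) and Step 2 (final position).
gerku_rafsi = ["ger", "ge'u", "gerk"]
zdani_rafsi = ["zda", "zdani"]

# Every combination of one rafsi per gismu, before hyphenation.
candidates = [f"{first}-{last}" for first, last in product(gerku_rafsi, zdani_rafsi)]

# Hyphenated forms with the scores quoted in this example.
quoted_scores = {
    "gerzda": 5878,
    "gerzdani": 7917,
    "ge'uzda": 6367,
    "ge'urzdani": 9506,
    "gerkyzda": 8008,
    "gerkyzdani": 10047,
}

# The lowest-scoring variant is the likely dictionary form.
best = min(quoted_scores, key=quoted_scores.get)

print(candidates)
print(best)  # gerzda
```

The number of candidates is the product of the number of usable rafsi for each component, which is why long tanru quickly produce dozens of forms.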

For the next example, we will use the tanru bloti klesi (“boat class”), presumably referring to the category (rowboat, motorboat, cruise liner) into which a boat falls. We will omit the long rafsi from the process, since lujvo containing long rafsi are almost never preferred by the scoring algorithm when there are short rafsi available.

The rafsi for bloti are -lot-, -blo-, and -lo'i-; for klesi they are -kle- and -lei-. Both these gismu are among the handful which have both CVV-form and CCV-form rafsi, so there is an unusual number of possibilities available for a two-part tanru:

  • lotkle
  • blokle
  • lo'ikle
  • lotlei
  • blolei
  • lo'irlei

Only lo'irlei requires hyphenation (to avoid confusion with the cmavo sequence lo'i lei). All six forms are valid versions of the lujvo, as are the six further forms using long rafsi; however, the scoring algorithm produces the following results:

  • lotkle 5878
  • blokle 5858
  • lo'ikle 6367
  • lotlei 5867
  • blolei 5847
  • lo'irlei 7456

So the form blolei is preferred, but only by a tiny margin over blokle; lotlei and lotkle are only slightly worse; lo'ikle suffers because of its apostrophe, and lo'irlei because of having both apostrophe and hyphen.

Our third example will result in forming both a lujvo and a name from the tanru logji bangu girzu, or “logical-language group” in English. (“The Logical Language Group” is the name of the publisher of this book and the organization for the promotion of Lojban.)

The available rafsi are -loj- and -logj-; -ban-, -bau-, and -bang-; and -gri- and -girzu, and (for name purposes only) -gir- and -girz-. The resulting 12 lujvo possibilities are:

  • loj-ban-gri
  • loj-bau-gri
  • loj-bang-gri
  • logj-ban-gri
  • logj-bau-gri
  • logj-bang-gri
  • loj-ban-girzu
  • loj-bau-girzu
  • loj-bang-girzu
  • logj-ban-girzu
  • logj-bau-girzu
  • logj-bang-girzu

and the 12 name possibilities are:

  • loj-ban-gir
  • loj-bau-gir
  • loj-bang-gir
  • logj-ban-gir
  • logj-bau-gir
  • logj-bang-gir
  • loj-ban-girz
  • loj-bau-girz
  • loj-bang-girz
  • logj-ban-girz
  • logj-bau-girz
  • logj-bang-girz

After hyphenation, we have:

  • lojbangri
  • lojbaugri
  • lojbangygri
  • logjybangri
  • logjybaugri
  • logjybangygri
  • lojbangirzu
  • lojbaugirzu
  • lojbangygirzu
  • logjybangirzu
  • logjybaugirzu
  • logjybangygirzu
  • lojbangir
  • lojbaugir
  • lojbangygir
  • logjybangir
  • logjybaugir
  • logjybangygir
  • lojbangirz
  • lojbaugirz
  • lojbangygirz
  • logjybangirz
  • logjybaugirz
  • logjybangygirz


The only fully reduced lujvo forms are lojbangri and lojbaugri, of which the latter has a slightly lower score: 8827 versus 8796, respectively. However, for the name of the organization, we chose to make sure the name of the language was embedded in it, and to use the clearer long-form rafsi for girzu, producing lojbangirz.

Finally, here is a four-part lujvo with a cmavo in it, based on the tanru nakni ke cinse ctuca or “male (sexual teacher)”. The ke cmavo ensures the interpretation “teacher of sexuality who is male”, rather than “teacher of male sexuality”. Here are the possible forms of the lujvo, both before and after hyphenation:

  • nak-kem-cin-ctu
  • nakykemcinctu
  • nak-kem-cin-ctuca
  • nakykemcinctuca
  • nak-kem-cins-ctu
  • nakykemcinsyctu
  • nak-kem-cins-ctuca
  • nakykemcinsyctuca
  • nakn-kem-cin-ctu
  • naknykemcinctu
  • nakn-kem-cin-ctuca
  • naknykemcinctuca
  • nakn-kem-cins-ctu
  • naknykemcinsyctu
  • nakn-kem-cins-ctuca
  • naknykemcinsyctuca

Of these forms, nakykemcinctu is the shortest and is preferred by the scoring algorithm. On the whole, however, it might be better to just make a lujvo for cinse ctuca (which would be cinctu) since the sex of the teacher is rarely important. If there was a reason to specify “male”, then the simpler tanru nakni cinctu (“male sexual-teacher”) would be appropriate. This tanru is actually shorter than the four-part lujvo, since the ke required for grouping need not be expressed.

The gismu creation algorithm

The gismu were created through the following process:

  • At least one word was found in each of the six source languages (Chinese, English, Hindi, Spanish, Russian, Arabic) corresponding to the proposed gismu. This word was rendered into Lojban phonetics rather liberally: consonant clusters consisting of a stop and the corresponding fricative were simplified to just the fricative (tc became c, dj became j) and non-Lojban vowels were mapped onto Lojban ones. Furthermore, morphological endings were dropped. The same mapping rules were applied to all six languages for the sake of consistency.

  • All possible gismu forms were matched against the six source-language forms. The matches were scored as follows:
      • If three or more letters were the same in the proposed gismu and the source-language word, and appeared in the same order, the score was equal to the number of letters that were the same. Intervening letters, if any, did not matter.
      • If exactly two letters were the same in the proposed gismu and the source-language word, and either the two letters were consecutive in both words, or were separated by a single letter in both words, the score was 2. Letters in reversed order got no score.
      • Otherwise, the score was 0.
  • The scores were divided by the length of the source-language word in its Lojbanized form, and then multiplied by a weighting value specific to each language, reflecting the proportional number of first-language and second-language speakers of the language. (Second-language speakers were reckoned at half their actual numbers.) The weights were chosen to sum to 1.00. The sum of the weighted scores was the total score for the proposed gismu form.
  • Any gismu forms that conflicted with existing gismu were removed. Obviously, being identical with an existing gismu constitutes a conflict. In addition, a proposed gismu that was identical to an existing gismu except for the final vowel was considered a conflict, since two such gismu would have identical 4-letter rafsi.

More subtly: If the proposed gismu was identical to an existing gismu except for a single consonant, and the consonant was “too similar” based on the following table, then the proposed gismu was rejected.

<tab class=wikitable header=true>
proposed gismu  existing gismu
b               p, v
c               j, s
d               t
f               p, v
g               k, x
j               c, z
k               g, x
l               r
m               n
n               m
p               b, f
r               l
s               c, z
t               d
v               b, f
x               g, k
z               j, s
</tab>

See Section 5.3 for an example.

  • The gismu form with the highest score usually became the actual gismu. Sometimes a lower-scoring form was used to provide a better rafsi. A few gismu were changed in error as a result of transcription blunders (for example, the gismu gismu should have been gicmu, but it's too late to fix it now).

The language weights used to make most of the gismu were as follows:

<tab class=wikitable header=true>
Chinese  0.36
English  0.21
Hindi    0.16
Spanish  0.11
Russian  0.09
Arabic   0.07
</tab>

reflecting 1985 number-of-speakers data. A few gismu were made much later using updated weights:

<tab class=wikitable header=true>
Chinese  0.347
Hindi    0.196
English  0.160
Spanish  0.123
Russian  0.089
Arabic   0.085
</tab>

(English and Hindi switched places due to demographic changes.)
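The matching and weighting steps can be sketched as follows. This is one interpretation of the scoring rules: "letters the same and in the same order" is read as the length of the longest common subsequence, and the weights are the 1985 values. The function names and all word forms in the example are made up for illustration and are not real etymologies.

```python
# 1985 language weights, as listed in the first table.
WEIGHTS = {"Chinese": 0.36, "English": 0.21, "Hindi": 0.16,
           "Spanish": 0.11, "Russian": 0.09, "Arabic": 0.07}


def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, other in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ch == other else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def match_score(gismu, source):
    """Score one candidate gismu form against one Lojbanized source word."""
    common = lcs_length(gismu, source)
    if common >= 3:
        return common
    # Two shared letters count only if adjacent in both words (gap 1)
    # or separated by exactly one letter in both words (gap 2).
    for gap in (1, 2):
        pairs_g = {(gismu[i], gismu[i + gap]) for i in range(len(gismu) - gap)}
        pairs_s = {(source[i], source[i + gap]) for i in range(len(source) - gap)}
        if pairs_g & pairs_s:
            return 2
    return 0


def total_score(gismu, sources):
    """Weighted sum over languages; sources maps language name to source word."""
    return sum(WEIGHTS[lang] * match_score(gismu, word) / len(word)
               for lang, word in sources.items())


# Hypothetical candidate and source forms.
print(match_score("kanta", "kan"))
print(total_score("kanta", {"Chinese": "kan", "Arabic": "kxn"}))
```

Dividing by the length of the source word means that a short source word matched in full contributes its entire language weight, as in the first hypothetical form above.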

Note that the stressed vowel of the gismu was considered sufficiently distinctive that two or more gismu may differ only in this vowel; as an extreme example, bradi, bredi, bridi, and brodi (but fortunately not brudi) are all existing gismu.
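The conflict rules, including the exemption for the stressed vowel, can be sketched as a pairwise test. The similarity table is the one given above; the function name is hypothetical, and brida, pridi, and tridi are made-up forms used only to exercise the rules.

```python
VOWELS = set("aeiou")

# "Too similar" consonants: proposed letter -> letters it may not replace.
TOO_SIMILAR = {
    "b": "pv", "c": "js", "d": "t", "f": "pv", "g": "kx", "j": "cz",
    "k": "gx", "l": "r", "m": "n", "n": "m", "p": "bf", "r": "l",
    "s": "cz", "t": "d", "v": "bf", "x": "gk", "z": "js",
}


def conflicts(proposed, existing):
    """True if a proposed gismu form conflicts with one existing gismu."""
    if proposed == existing:
        return True
    if len(proposed) != len(existing):
        return False
    diffs = [i for i, (a, b) in enumerate(zip(proposed, existing)) if a != b]
    if len(diffs) != 1:
        return False
    i = diffs[0]
    a, b = proposed[i], existing[i]
    # Differing only in the final vowel gives identical 4-letter rafsi.
    if i == len(proposed) - 1 and a in VOWELS and b in VOWELS:
        return True
    # Differing in a single consonant conflicts only for similar sounds;
    # a difference in the stressed vowel is never a conflict.
    return b in TOO_SIMILAR.get(a, "")


print(conflicts("bradi", "bredi"))  # differ only in the stressed vowel
print(conflicts("pridi", "bridi"))  # p is too similar to b
```

A full check would run this test against every existing gismu and reject the candidate on the first conflict.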

Cultural and other non-algorithmic gismu

The following gismu were not made by the gismu creation algorithm. They are, in effect, coined words similar to fu'ivla. They are exceptions to the otherwise mandatory gismu creation algorithm where there was sufficient justification for such exceptions. Except for the small metric prefixes and the assignable predicates beginning with brod-, they all end in the letter o, which is otherwise a rare letter in Lojban gismu.

The following gismu represent concepts that are sufficiently unique to Lojban that they were either coined from combining forms of other gismu, or else made up out of whole cloth. These gismu are thus conceptually similar to lujvo even though they are only five letters long; however, unlike lujvo, they have rafsi assigned to them for use in building more complex lujvo. Assigning gismu to these concepts helps to keep the resulting lujvo reasonably short.

broda
1st assignable predicate
brode
2nd assignable predicate
brodi
3rd assignable predicate
brodo
4th assignable predicate
brodu
5th assignable predicate
cmavo
structure word (from cmalu valsi)
lojbo
Lojbanic (from logji bangu)
lujvo
compound word (from pluja valsi)
mekso
Mathematical EXpression

It is important to understand that even though cmavo, lojbo, and lujvo were made up from parts of other gismu, they are now full-fledged gismu used in exactly the same way as all other gismu, both in grammar and in word formation.

The following three groups of gismu represent concepts drawn from the international language of science and mathematics. They are used for concepts that are represented in most languages by a root which is recognized internationally.

Small metric prefixes (values less than 1):

<tab class=wikitable header=true>
decti  .1      deci
centi  .01     centi
milti  .001    milli
mikri  10^-6   micro
nanvi  10^-9   nano
picti  10^-12  pico
femti  10^-15  femto
xatsi  10^-18  atto
zepti  10^-21  zepto
gocti  10^-24  yocto
</tab>

Large metric prefixes (values greater than 1):

<tab class=wikitable header=true>
dekto  10     deka
xecto  100    hecto
kilto  1000   kilo
megdo  10^6   mega
gigdo  10^9   giga
terto  10^12  tera
petso  10^15  peta
xexso  10^18  exa
zetro  10^21  zetta
gotro  10^24  yotta
</tab>

Other scientific or mathematical terms:

delno
candela
kelvo
kelvin
molro
mole
radno
radian
sinso
sine
stero
steradian
tanjo
tangent
xampo
ampere

The gismu sinso and tanjo were only made non-algorithmically because they were identical (having been borrowed from a common source) in all the dictionaries that had translations. The other terms in this group are units in the international metric system; some metric units, however, were made by the ordinary process (usually because they are different in Chinese).

Finally, there are the cultural gismu, which are also borrowed, but by modifying a word from one particular language, instead of using the multi-lingual gismu creation algorithm. Cultural gismu are used for words that have local importance to a particular culture; other cultures or languages may have no word for the concept at all, or may borrow the word from its home culture, just as Lojban does. In such a case, the gismu algorithm, which uses weighted averages, doesn't accurately represent the frequency of usage of the individual concept. Cultural gismu are not even required to be based on the six major languages.

The six Lojban source languages:

jungo
Chinese (from "Zhong 1 guo 2")
glico
English
xindo
Hindi
spano
Spanish
rusko
Russian
xrabo
Arabic

Seven other widely spoken languages that were on the list of candidates for gismu-making, but weren't used:

bengo
Bengali
porto
Portuguese
baxso
Bahasa Melayu/Bahasa Indonesia
ponjo
Japanese (from “Nippon”)
dotco
German (from "Deutsch")
fraso
French (from "Français")
xurdo
Urdu

(Urdu and Hindi began as the same language with different writing systems, but have now become somewhat different, principally in borrowed vocabulary. Urdu-speakers were counted along with Hindi-speakers when weights were assigned for gismu-making purposes.)

Countries with a large number of speakers of any of the above languages (where the meaning of “large” is dependent on the specific language):

<tab class=wikitable header=true>
English:
merko  American
brito  British
skoto  Scottish
sralo  Australian
kadno  Canadian
</tab>

<tab class=wikitable header=true>
Spanish:
gento  Argentinian
mexno  Mexican
</tab>

<tab class=wikitable header=true>
Russian:
softo  Soviet/USSR
vukro  Ukrainian
</tab>

<tab class=wikitable header=true>
Arabic:
filso  Palestinian
jerxo  Algerian
jordo  Jordanian
libjo  Libyan
lubno  Lebanese
misro  Egyptian (from "Mizraim")
morko  Moroccan
rakso  Iraqi
sadjo  Saudi
sirxo  Syrian
</tab>

<tab class=wikitable header=true>
Bahasa Melayu/Bahasa Indonesia:
bindo  Indonesian
meljo  Malaysian
</tab>

<tab class=wikitable header=true>
Portuguese:
brazo  Brazilian
</tab>

<tab class=wikitable header=true>
Urdu:
kisto  Pakistani
</tab>

The continents (and oceanic regions) of the Earth:

bemro
North American (from berti merko)
dzipo
Antarctican (from cadzu cipni)
ketco
South American (from "Quechua")
friko
African
polno
Polynesian/Oceanic
ropno
European
xazdo
Asiatic

A few smaller but historically important cultures:

latmo
Latin/Roman
srito
Sanskrit
xebro
Hebrew/Israeli/Jewish
xelso
Greek (from "Hellas")

Major world religions:

budjo
Buddhist
dadjo
Taoist
muslo
Islamic/Moslem
xriso
Christian

A few terms that cover multiple groups of the above:

jegvo
Jehovist (Judeo-Christian-Moslem)
semto
Semitic
slovo
Slavic
xispo
Hispanic (New World Spanish)

rafsi fu'ivla: a proposal

The list of cultures represented by gismu, given in Section 5.14, is unavoidably controversial. Much time has been spent debating whether this or that culture “deserves a gismu” or “must languish in fu'ivla space”. To help defuse this argument, a last-minute proposal was made when this book was already substantially complete. I have added it here with experimental status: it is not yet a standard part of Lojban, since all its implications have not been tested in open debate, and it affects a part of the language (lujvo-making) that has long been stable, but is known to be fragile in the face of small changes. (Many attempts were made to add general mechanisms for making lujvo that contained fu'ivla, but all failed on obvious or obscure counterexamples; finally the general zei mechanism was devised instead.)

The first part of the proposal is uncontroversial and involves no change to the language mechanisms. All valid Type 4 fu'ivla of the form CCVVCV would be reserved for cultural brivla analogous to those described in Section 5.14. For example,


tci'ile
Chilean

is of the appropriate form, and passes all tests required of a Stage 4 fu'ivla. No two fu'ivla of this form would be allowed to coexist if they differed only in the final vowel; this rule was applied to gismu, but does not apply to other fu'ivla or to lujvo.

The second, and fully experimental, part of the proposal is to allow rafsi to be formed from these cultural fu'ivla by removing the final vowel and treating the result as a 4-letter rafsi (although it would contain five letters, not four). These rafsi could then be used on a par with all other rafsi in forming lujvo. The tanru

tci'ile ke canre tutra
Chilean type-of
sand territory
Chilean desert

could be represented by the lujvo

tci'ilykemcantutra

which is an illegal word in standard Lojban, but a valid lujvo under this proposal. There would be no short rafsi or 5-letter rafsi assigned to any fu'ivla, so no fu'ivla could appear as the last element of a lujvo.
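The second part of the proposal can be illustrated with a small sketch. It assumes that the rest of the lujvo has already been reduced to rafsi and hyphenated by the usual algorithm; the function names are hypothetical, and only the steps specific to the proposal are shown.

```python
VOWELS = set("aeiou")


def fuivla_rafsi(fuivla):
    """Form the proposed rafsi by dropping the final vowel of a cultural fu'ivla."""
    if fuivla[-1] not in VOWELS:
        raise ValueError("a cultural fu'ivla is expected to end in a vowel")
    return fuivla[:-1]


def lujvo_with_fuivla(fuivla, remaining_rafsi):
    """Prefix the fu'ivla rafsi to an already-hyphenated run of rafsi.

    The new rafsi is treated like a 4-letter rafsi, so it is always
    followed by a y-hyphen.
    """
    return fuivla_rafsi(fuivla) + "y" + "".join(remaining_rafsi)


# tci'ile ke canre tutra, with rafsi kem, can and tutra for the last three words.
print(lujvo_with_fuivla("tci'ile", ["kem", "can", "tutra"]))  # tci'ilykemcantutra
```

Since the fu'ivla rafsi can never be final under the proposal, the function only ever places it at the front or, by extension, in a non-final position.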

The cultural fu'ivla introduced under this proposal are called rafsi fu'ivla, since they are distinguished from other Type 4 fu'ivla by the property of having rafsi. If this proposal is workable and introduces no problems into Lojban morphology, it might become standard for all Type 4 fu'ivla, including those made for plants, animals, foodstuffs, and other things.


Subjunctives. Imaginative and factitive te sumti types.

Introduction. fau and da'i

In this lesson we'll talk about subjunctive worlds. Let me explain what it means.

The first basic word in this section is:

da'i (UI3). The clause containing this particle describes an imaginary, not real event. Expresses subjunctive mood (in linguistics terms)

The opposite word for it is:

da'inai. The clause containing this particle describes an actual, real, not an imaginary event. Expresses indicative mood (in linguistics terms)

Constructs with da'i are usually translated into English with so-called auxiliary verbs such as can/could, will/would, may/might, should and must. The corresponding English clauses are said to be in the subjunctive mood.

The second basic word that we'll deal with in this lesson is fanbu.

fanbu = x1 is a situation / time / place / "internal world" / event / circumstances / conditions (by default this world/this time/this place/this reality) in which x2 actually takes place
lo nu mi ca ciska cu se fanbu
The event of me writing now is actually taking place.

This verb covers any situation, imaginary or real. Sometimes such a situation happens at some time in some place. You might notice that this powerful word fanbu in some ways resembles Einstein's concept of the unity of time and space. In fact, the Quechua language has had a similar concept, pacha (roughly translated as “world”), for at least hundreds of years. However, our fanbu can cover even imaginary situations.

fanbu covers such verbs as fasnu, vanbi (hence its similarity to those two), tcini, cabna, selzvati, munje but it has a more generalized meaning.

The verb fanbu describes events or situations taking place in a described world.

fanbu is very useful when joining two events within one without raising any causal relation like in the following sentence: "By banging his gavel and standing up, the judge declared the trial adjourned".

We'll use one short and very useful preposition here:

fau = in the event/situation/world of ...
fau = fi'o fanbu

Often it's more convenient to use this preposition fau instead of the full verb fanbu.

We might want to combine those two words with each other. That's how we get several scenarios.

Let's discuss them all.

da'i and fau in main and in embedded clauses

da'i broda fau da describes events taking place not in da, i.e. not in our fanbu. In other words we create an imaginary world and talk about it.
broda fau da'i da talks about probabilities of events in da - the fanbu we previously created in our speech and now describe. We don't create any new worlds here.

We can omit da'i in such sentences, making them shorter but vaguer.

Omitting da'i doesn't add factuality to the clauses where it is absent.


You should add da'inai only to explicitly state factuality.

da'i broda. Imagination

Let's compare the following sentences:

a) da'i mi pavyseljirna
I could be a unicorn.
b) da'inai mi pavyseljirna
I am a unicorn.
c) mi pavyseljirna
I am a unicorn.

In the sentence a) the event is imagined.

The word da'inai from sentence b) explicitly states that the event is not imagined by anyone and therefore takes place in this world.

The meaning of sentence c) would be clear from its context.

da'i broda fau lo nu broda

da'i mi gleki fau lo nu mi ponse lo megdo rupnu
I am happy in-an-imaginary-world-in-which I have one million dollars.
I would/could be happy if I had one million dollars.
I would/could be happy in a world where I had one million dollars.
I imagine myself being happy and having one million dollars.

Here the event inside fau is imagined together with mi gleki. And here is the opposite example:

da'inai mi gleki fau lo nu mi ponse lo megdo rupnu
Having one million dollars I am happy.

broda fau da'i da. Probabilities

The following constructs can be used.

  1. broda fau da'i da = x1 is possible; x1 may/can possibly happen.
  2. broda fau da'i ro da= x1 is certain; x1 would necessarily happen.
  3. broda fau da'i so'e da = x1 is probable; x1 will probably/is likely to happen.
  4. broda fau da'i so'o da = x1 is remotely probable; x1 could/might happen.
  5. broda fau da'i so'u da = x1 is not likely, probably not.
  6. broda fau da'i no da = x1 is not possible.

As you can see, the difference between these is the number of fanbu we take into account.

Suppose you come home and hear someone scratching. You can say one of the following sentences (we'll omit da'i for brevity):

fau da ti mlatu.
This might be/possibly is a cat. It is possible that this is a cat.
(You keep several animals at home. So it might be your cat scratching but you are not sure).
fau ro da ti mlatu.
This must be/certainly is the cat.
(You have a cat and such noise can be produced by only one object, that cat).
fau so'e da ti mlatu.
This should be/probably is the cat.
(If you have a dog then it can also produce such sounds but your dog usually doesn't do that so the cat is more likely).
fau so'u da ti mlatu.
It is not probable that this is the cat.
fau no da ti mlatu
This can't be the cat. This must not be the cat. It is impossible that this is the cat.

Of course we can rephrase any of those sentences as e.g.

da fanbu lo nu ti mlatu
The state of this being-a-cat is possible.
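
The scale of quantifiers above can be sketched as counting over a finite set of worlds. This is only an illustration: the function name `modality`, the representation of a world as a plain string, and especially the numeric cut-offs for so'e, so'o and so'u are assumptions made for the example. The text orders these quantifiers but does not assign them numbers.

```python
from fractions import Fraction


def modality(worlds, holds):
    """Return (quantifier, gloss) for the share of worlds in which `holds` is true.

    The 1/2 and 1/5 thresholds are hypothetical; only the ordering
    no < so'u < so'o < so'e < ro comes from the text.
    """
    total = len(worlds)
    if total == 0:
        raise ValueError("need at least one world")
    count = sum(1 for w in worlds if holds(w))
    if count == 0:
        return ("no", "not possible")
    if count == total:
        return ("ro", "certain")
    share = Fraction(count, total)
    if share > Fraction(1, 2):
        return ("so'e", "probable")
    if share > Fraction(1, 5):
        return ("so'o", "remotely probable")
    return ("so'u", "not likely")


def is_possible(worlds, holds):
    """Plain `fau da`: true if the event holds in at least one world."""
    return any(holds(w) for w in worlds)


# Each world names the animal that is scratching in it.
def is_cat(world):
    return world == "cat"


mostly_cat = ["cat", "cat", "cat", "dog"]
rarely_cat = ["cat", "dog", "dog", "dog", "dog", "dog"]
```

With these sample worlds, `modality(mostly_cat, is_cat)` picks so'e and `modality(rarely_cat, is_cat)` picks so'u, while `is_possible` is true for both, since plain da only asks for at least one world.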

Double negation

Sometimes it's necessary to use double negation. Let's show how this works using examples from English and Chinese.

The structure 非...不可 (fēi … bùkě) is one of the most commonly used in Mandarin Chinese. It means "must"/"absolutely must"/"need to." 非 means "not"/"no" and 不可 means "not possible". It's literally translated as "not not possible."

mi na'e cfifa'i ra fau no da
我非批评她不可。
wǒ fēi pīpíng tā bùkě
I can't not criticize her.
I absolutely must criticize her.
mi'a na'e tadni fau no da
我们非学习不可。
wǒmen fēi xuéxí bùkě
We must study.
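
The double negation in these examples relies on a familiar equivalence: "in no world does the event fail to happen" says the same as "in every world the event happens". The following minimal sketch checks this over every assignment of truth values to a small set of worlds. The helper names are hypothetical, and na'e is treated here as plain negation, which is a simplification made for the example.

```python
from itertools import product


def in_every_world(worlds, holds):
    """fau ro da: the event holds in all worlds."""
    return all(holds(w) for w in worlds)


def in_no_world(worlds, holds):
    """fau no da: the event holds in none of the worlds."""
    return not any(holds(w) for w in worlds)


def duality_holds(n=3):
    """Check 'no world with not-P' == 'every world with P' for n worlds."""
    worlds = list(range(n))
    for values in product([False, True], repeat=n):
        # values[w] says whether the event happens in world w
        def happens(w):
            return values[w]

        def fails(w):
            return not values[w]

        if in_no_world(worlds, fails) != in_every_world(worlds, happens):
            return False
    return True
```

Calling `duality_holds()` exhausts all eight assignments for three worlds and finds no counterexample.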

broda fau da'i PA nu brode

Here is a story. A tourist lady glances smilingly at another tourist lady, already seated, as she passes down the aisle. The seated tourist lady says
- Oh! Have I taken your seat?
- No. And if you had, it wouldn't have mattered!

The last sentence in Lojban would be: na go'i i fau da'i lo nu do go'i na vajni. Here we have a speaker publicly displaying a feature of a privately imagined world. The amazing thing is that we all trust the speaker of this graceful sentence to "know" the causal laws by which she runs events in her own imaginary world, and so we trust her report of this causal linkage! We trust her, in short, to know herself so well that she can speak "truly" of an imaginary situation in which she has probably never found herself before!

Here are other examples.

fau da'i ronu mi megdo rupnu ponse vau mi ricfu
If I have a million dollars, I'm necessarily rich.
fau da'i su'o nu lo trene cu spofu vau mi jai lerci
If the train breaks down, I could be late.
If the train breaks down (or: had broken down), I could be late (= it could happen that I am late).
In some possible world in which the train breaks down, I am late.
fau da'i su'o nu do mi jibni vau mi do darxi
Were you (ever) to come near me, it's possible that I'd hit you.

broda PU da'i PA nu brode

As said earlier, fanbu is a more general verb than cabna. Therefore, instead of fau we can use the tenses described in earlier chapters, such as pu, ca, ba or (in verb form) purci, cabna, balvi, and combine them with da'i.

mi gleki ca da'i su'o nu mi ponse lo megdo rupnu
I might be happy when I have one million dollars.
fau da'i ro nu do mi ba jibni vau mi do darxi
If you come near me, I will hit you.
ba nu fau da'i da la toriz cu jinga fo lo ba co'e
Come what may, the Tories will win the next election.
fau da'i su'o nu do mi ba jibni vau mi do darxi
If you ever come near me, it's possible that I'll hit you.
fau da'i da la toriz ba jinga fo lo ba co'e
The Tories could win the next election.
Some more examples to show the power of fau da'i:
fau da'i da la toriz cu jinga fo lo ba co'e
The Tories could have won the next election.
fau da'i ro da ro nanmu cu prenu
Men are necessarily people.
In every possible world, every man is a person.
fau da'i ronu lo trene cu spofu vau mi jai lerci
If the train breaks down (or: had broken down), I would be late.
In every possible world in which the train breaks down, I am late.
fau da'i ronu do mi jibni vau mi do darxi
Were you (ever) to come near me, I would hit you.
Whenever you come near me, I would hit you.
fau da'i ro da la toriz cu jinga fo lo ba co'e
It's impossible that the Tories could have failed to win the next election.
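
The contrast between fau da'i ro nu and fau da'i su'o nu in the examples above can be sketched as two ways of quantifying over the worlds in which the condition holds. This is an informal illustration only: the dictionary representation of a world and the names `would`, `could`, `train_broken` and `late` are assumptions introduced for the example.

```python
def restricted(worlds, condition):
    """Keep only the worlds in which the condition event takes place."""
    return [w for w in worlds if condition(w)]


def would(worlds, condition, consequence):
    """fau da'i ro nu <condition> vau <consequence>.

    True if the consequence holds in every world where the condition holds.
    If the condition holds in no world at all, the result is vacuously True.
    """
    return all(consequence(w) for w in restricted(worlds, condition))


def could(worlds, condition, consequence):
    """fau da'i su'o nu <condition> vau <consequence>.

    True if the consequence holds in at least one world where the condition holds.
    """
    return any(consequence(w) for w in restricted(worlds, condition))


# Three hypothetical worlds for the train example.
worlds = [
    {"train_broken": True, "late": True},
    {"train_broken": True, "late": False},
    {"train_broken": False, "late": False},
]


def train_broken(w):
    return w["train_broken"]


def late(w):
    return w["late"]
```

For these sample worlds `could(worlds, train_broken, late)` is true but `would(worlds, train_broken, late)` is false, which matches "I could be late" without committing to "I would be late".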

da'i broda fau da'i lo nu brode. Imagination and probabilities

We can also describe an alternative imagined world using da'i in the main clause and talk about possible events in it using fau da'i. Thus we get full subjunctive claims.

da'i mi gleki fau da'i su'o nu mi ponse lo megdo rupnu.
I might (possibly) be happy if I had one million dollars.
da'i mi gleki fau da'i so'e nu mi ponse lo megdo rupnu.
I should (probably) be happy if I had one million dollars.
da'i mi gleki fau da'i ro nu mi ponse lo megdo rupnu.
I would (certainly) be happy if I had one million dollars.

We don't have to combine da'i with fau da'i all the time. We do that when we want maximum clarity, giving the subjunctive world this optional second dimension.

Likelihood of possibilities

When we want to specify the pro- or counterfactuality of our subjunctive worlds, we use clauses such as "but it'll never happen", "which is impossible" or "and that's quite possible".

Let's show how we can express them.

da'i mi gleki fau lonu mi ponse lo megdo rupnu i da'inai go'i fau da'i noda
I would be happy if I had one million dollars, which (the event of me having one million dollars) is not possible.

As usual go'i copies the previous verb phrase but doesn't copy da'i or any other particle of class UI or sei clause. Then da'inai is used in this copied phrase. This da'inai refers to the current non-imagined world. And a new fau da'i clause is added thus stating that the imaginary event of me having one million dollars is not possible inside this world.

da'i mi gleki fau lonu mi ponse lo megdo rupnu i da'inai go'i fau da'i su'o me'i so'e da
I would be happy if I had one million dollars, which (the event of me having one million dollars) is possible but not likely.
da'i mi gleki fau lonu mi ponse lo megdo rupnu i da'inai go'i fau da'i so'e me'i ro da
I would be happy if I had one million dollars, which (the event of me having one million dollars) is likely but not certain.

And this is how we can create contradictions. Let's use our example with morsi.

mi na'e pacna lo nu do morsi
I don't hope you die (and you haven't yet!)
mi na'e pacna lo nu do morsi i me ri da'inai
I don't hope you die (and yet, you do!)

Advanced management of fau and da'i

Some parts of a verb phrase can refer to imaginary objects, others to non-imagined objects. In such cases we mark different parts of the phrase with da'i or da'inai where needed.

  1. da'inai lo panzi be ra cu bilma
    His kids are ill (it is known he has kids and it is known they are ill).
  2. lo da'inai panzi be ra cu bilma fau da'i da
    Maybe his kids are ill (i.e., it is known that he has kids but it is not known whether they are ill).
  3. di bilma fau ro nu di da'inai panzi be ra
    His kids'll be ill OR If he has kids, they are ill (i.e., it is unknown whether he has kids, but if he does, they are certainly ill).
  4. di bilma fau su'o nu di da'inai panzi be ra
    Maybe his kids are ill (i.e., it is unknown if he has kids but if he does, they may be ill).
  5. di bilma fau ro nu di da'i panzi be ra
    His kids would be (would have been) ill (i.e., if he had kids they would be ill, but he doesn't).
  6. di bilma fau su'o nu di da'i panzi be ra
    His kids might've been ill (if he had kids, but he doesn't, so we'll never know).
  7. lo panzi be ra cu bilma fau ro da
    His kids are (must be) ill (i.e., as implied by some other fact such as his staying home from work).
  8. lo panzi be ra cu bilma fau da
    His kids may be ill (i.e., as implied by some other fact such as his staying home from work).

Global subjunctivity

Subjunctivity can be implied in many other places. When you say

lo'e cinfo cu fengu
The typical lion is angry.

you don't state that it is angry now. Actually you are not talking about any given lion but about some typical, i.e. imaginary one. Similarly,

mi vedli lo ka pu bajra
I remember myself running.

talks about past events that are retained only in your memory.

In some languages the future tense is equivalent to the subjunctive mood. This makes sense, as future events can usually only be predicted, i.e. imagined.

The words metfo and pevna (and the interjection pe'a) describe events and objects as having imagined properties.

Words with da'i copied from the clause into arguments

Here is another example showing that da'i is implied in many abstraction places of verbs.

mi catlu lo nu do morsi
I watch you die (and you really do die, else how could I watch it?)

The second place of the verb catlu copies the value of da'i/da'inai from the main clause.

So when da'inai is stated or implied by context in the main clause (mi catlu), it is also present inside lo nu do morsi. Explicitly adding da'i to the inner clause, for example, has no effect: it can't override the value taken from the main clause.

Words with static da'i in arguments

Other verbs can have some places with da'i implied (rather than copied from the external verb phrase). Some of them can be completely defined using fanbu and da'i. Here are some of them with glosses.

  • EXPECT
kanpe = x1 expects/looks for the occurrence of x2 (da'i-event), expected likelihood x3 (0-1, default li so'a, i.e. near 1); x1 subjectively evaluates the likelihood of x2 (event) to be x3.
x1 x2 x3 kanpe = ga'a x1 x2 da'i se fanbu me x3 da
mi kanpe lo nu do ba jinga vau li so'e
I expect with a high probability that you will win.
You'll probably win.
kanpe describes possible events in our fanbu from the viewpoint of its creator or user.
mi kanpe lo nu mi cortu fau ro nu lo rokci cu farlu lo tuple be mi
I know for a fact that if a rock lands on my foot, it will hurt.
  • POSSIBLE
cumki = x1 (da'i-event/state/property) is possible under conditions x2; x1 may/might occur; x1 is a maybe.
x1 x2 cumki = x1 da'i se fanbu su'o x2
x1 cumki = fa zi'o fe x1 fi li su'o kanpe
  • PROBABLE
lakne = x1 (da'i-event/state/property) is probable/likely under conditions x2.
x1 x2 lakne = x1 da'i se fanbu so'e x2
x1 lakne = fa zi'o fe x1 fi li so'e kanpe
  • WOULD BE
vudbi = x1 (da'i-event/state/property) must occur under conditions x2; x1 can't not occur; x1 is a must; it's impossible that x1 wouldn't occur under conditions x2; x1 would necessarily occur under x2; it is not the case that it is possible that it is not the case that x1 happens under conditions x2
x1 vudbi = x1 da'i se fanbu ro x2
Technically vudbi = naku naku cumki. naku...naku creates a negation scope only between the two naku.
  • DESIRE
djica = x1 wants x2 (da'i-event)
mi djica lo ka vitke fi la .paris.
I would rather visit Paris. I want to visit Paris.
Indeed, what we desire is always in our imaginary world.
  • HOPE
pacna = x1 hopes for x2 (da'i-event) with likelihood x3 (by default li so'a, i.e. close to 1)
pacna has the same place structure as kanpe, but in addition to a vague, maybe even impartial expectation, it has the meaning of "hope". In fact pacna is something like kanpe je djica.
  • INTEND
te mukti = x1 is motivated to bring about result/goal/objective x2 (da'i-abstraction) by x3 (motive, abstraction).
mi te mukti lo ka vitke fi la .paris.
I will visit Paris. I intend to/I'm gonna visit Paris.
mi te mukti vitke fi la .paris.
I'm visiting Paris intentionally.
  • CAPABLE
kakne = x1 can/is able to do x2 (ka, da'i-abstraction).
mi pu kakne lo ka gunka
I could work. I was able to work.
  • SHOULD
te javni = x1 should/ought to do x2 (da'i-abstraction) under rule x3.
mi te javni lo ka gunka
I should work.
  • Don't have to, Needn't, Don't need to, Lack (absence) of obligation
na te javni
  • NEED, NECESSITY
nitcu
  • HAVE TO, OBLIGATION
bilga = x1 must/is obliged to do x2 (da'i-abstraction) under conditions x3.
mi bilga lo ka gunka
I must work. I have to work.
  • ALLOW
curmi = x1 allows/permits x2 (da'i-abstraction)
  • FORBID
tolcru = x1 forbids/prohibits x2 (da'i-abstraction)
  • ADVISE
stidi
  • BE SURE
birti = x1 is certain/sure/positive/convinced that x2 (da'i-abstraction) is true
  • DOUBT
senpi = x1 doubts that x2 (da'i-abstraction) is true.
senpi = nalbirti
  • IMAGINARY
xanri = x1 (da'i-abstraction) is imagined by x2
mi se xanri lo nu mi pavyseljirna.
I imagine myself being a unicorn.
I could be a unicorn.

Words with static da'inai in arguments

Other verbs can have da'inai implied in one of their arguments.

  • SURPRISE
spaji = x1 (da'inai-abstraction) surprises/startles/is unexpected [and generally sudden] to x2.
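
The three behaviours described in the last sections (copying da'i/da'inai from the main clause, static da'i, and static da'inai) can be summarised in a small lookup sketch. The function name `abstraction_mood` and the two tables are hypothetical, and the tables list only a few of the verbs named in the text. The sketch does not model explicit markers placed inside the abstraction, since the text only says that an inner da'i has no effect for copying verbs such as catlu.

```python
# Verbs whose abstraction place copies the mood of the main clause.
COPIES_MAIN_CLAUSE = {"catlu"}

# Verbs whose abstraction place has a fixed, implied mood.
STATIC_MOOD = {
    "kanpe": "da'i",
    "djica": "da'i",
    "pacna": "da'i",
    "xanri": "da'i",
    "spaji": "da'inai",
}


def abstraction_mood(verb, main_clause_mood):
    """Return the mood ("da'i" or "da'inai") implied for the verb's abstraction place.

    main_clause_mood is the mood stated or implied by context in the main clause.
    """
    if main_clause_mood not in ("da'i", "da'inai"):
        raise ValueError("main_clause_mood must be \"da'i\" or \"da'inai\"")
    if verb in STATIC_MOOD:
        # Static verbs ignore the mood of the main clause.
        return STATIC_MOOD[verb]
    if verb in COPIES_MAIN_CLAUSE:
        return main_clause_mood
    raise KeyError(f"no mood rule recorded for {verb!r}")
```

For example, `abstraction_mood("catlu", "da'inai")` gives da'inai (I really watch, so you really die), while `abstraction_mood("kanpe", "da'inai")` still gives da'i, because what is expected remains an imagined event even when the expecting is real.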

Chinese style yes/no questions

yes/no questions with ji

There is another method of asking 'yes/no' questions. If the verb relation consists of only one verb word, you can repeat that verb word twice, linking the two copies with ji:

pei do nelci lo tcati - je'u
Do you like tea? - Yes.
do nelci ji nelci lo tcati - je

When using this method:

  • yes is je
  • no is na je nai

This method is similar to the Chinese method of asking yes/no questions:

好不好 ?
hǎo bù hǎo?
Are you all right? (literally - "good not good"?)

As you can see, "good" is repeated twice, separated by the word "bù". Similarly, in Lojban we use ji, although the answers are not formed as in Chinese.