Overview of Lojban

This document is largely supplanted by the Lojban Reference Grammar, Chapters 2, 3, and 4.

First written January 1989. Minor updates, August 1990.
Copyright 1989, 1990 The Logical Language Group, Inc.
HTML Version Copyright 2000 The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031 USA

This overview builds upon the Lojban brochure and is intended to give you a good feel for the design and scope of the language. It serves as an introduction to learning the language; most of the special terminology used in other publications is defined here. This overview is neither complete nor detailed; much is glossed over. To actually learn the language you must study the textbook and/or reference/teaching materials.

The material following is divided up into the major facets of language description. These are:

     Orthography - the way the language is written
     Phonology   - the way the language sounds
     Morphology  - the structure of words
     Semantics   - the meanings of words, sentences, and expressions
     Grammar     - the ways in which words may be put together

For many special terms, we will give a definition, and then the Lojban word for the concept. The Lojban words are then used, avoiding confusion due to the various meanings of the English jargon words. The Lojban words are also the ones used in other publications about the language.

1.0 Orthography

Lojban uses a Roman alphabet, consisting of the letters and symbols:

       ' , . a b c d e f g i j k l m n o p r s t u v x y z

omitting the letters 'h', 'q', and 'w'. The three special characters are NOT punctuation. The apostrophe represents a specific sound, similar to the English /h/. The period is an optional reminder to the reader, representing a mandatory pause dictated by the rules of the language. Such pauses can be of any duration, and are part of the morphology, or word formation rules, and not the grammar. The comma is used to indicate a syllable break within a word, generally one that is not obvious to the reader. The alphabet order given above is that of the ASCII symbol set, most widely used in computers for sorting and searching. The Lojban word lerfu is used for symbols of an alphabet or character set and their grammar, including their usage in mathematical expressions, acronyms, and spelling.
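As a small illustration of the ASCII ordering mentioned above, the following Python sketch checks spelling against the alphabet and sorts a few words. The names LOJBAN_ALPHABET and is_lojban_spelling are our own, chosen for illustration; capital letters are accepted because they are used to mark stress in names.

```python
# The alphabet as listed above; it is already in ASCII order.
LOJBAN_ALPHABET = "',.abcdefgijklmnoprstuvxyz"


def is_lojban_spelling(word: str) -> bool:
    """True if every character, ignoring case, belongs to the Lojban alphabet."""
    return all(ch in LOJBAN_ALPHABET for ch in word.lower())


# Python compares strings by code point, which matches ASCII for these symbols,
# so an ordinary sort yields the alphabetical order described in the text.
words = ["zdani", ".arnold.", "klama", "le'avla", "cmavo"]
print(sorted(words))
print(is_lojban_spelling("DJOsefin"), is_lojban_spelling("what"))
```

Because the period sorts before every letter, words written with a leading period are grouped ahead of all others in such a listing.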

Lojban does not require capitalization of any word type, including proper names, and such capitalization is discouraged. Capital letters are used to indicate non-standard stress in pronunciation of Lojbanized names. Thus the English name 'Josephine', as normally pronounced, is Lojbanized as DJOsefin, pronounced /JO,seh,feen/. Without the capitalization, Lojban stress rules would force the /seh/ syllable to be stressed.

Lojban's alphabet and pronunciation rules cause what is called audio-visual isomorphism. If a stream of valid Lojban speech is uttered, there is a unique symbol to represent each sound, and a single correct way to separate the sounds into words. Similarly, a given string of Lojban text may be read off sound by sound using pronunciation and stress rules forming a unique uttered expression. Spelling in Lojban is thus trivial to learn.

2.0 Phonology

Each Lojban sound is uniquely assigned to a single letter, or combination of letters. Each letter is defined to have a particular pronunciation, such that there is no overlap between letter sounds.

Most of the consonants are pronounced exactly as they are most commonly pronounced in English. The following gives English and Lojban examples for these:

p  /p/  'powder'  purmo /PUR,mo/       b  /b/ 'bottle' botpi  /BOT,pee/
f  /f/  'fall'    farlu /FAHR,lu/      v  /v/ 'voice'  voksa  /VOK,sah/
t  /t/  'time'    temci /TEHM,shee/    d  /d/ 'dance'  dansu  /DAHN,su/
s  /s/  'soldier' sonci /SON,shee/     z  /z/ 'zinc'   zinki  /ZEEN,kee/
k  /k/  'book'    cukta /SHUK,tah/     g  /g/ 'goose'  gunse  /GUN,seh/

Incidentally, for these examples, the Lojban example is a close equivalent of the English example used, showing that some words in Lojban are very similar to their English counterparts. In the pronunciation guides, note the conventions of capitalizing stressed syllables and of separating syllables with commas. These could optionally be used in the Lojban words themselves, but are not necessary.

In the above examples, the left column consonants are spoken without voicing them with the larynx; they are called unvoiced consonants. The consonant to the right of each unvoiced consonant is its voiced equivalent.

When a consonant is made by touching the tongue so as to block air passage, it is called a stop (p, b, t, d, k, g). If the blockage is incomplete, and air rubs between the tongue and the roof of the mouth, it is called a fricative (f, v, s, z). k is an unvoiced stop in the back of the mouth. Its unvoiced fricative equivalent is x, which is rarely found in English. (The Scottish 'loch', as in 'Loch Ness monster', is an example.)

     x  /kh/ 'loch'  lalxu  /LAHL,khu/
                     xriso  /KHREE,so/

Two other fricatives are c and j. c is the unvoiced /sh/ sound that is usually represented by two letters in English. j is its voiced equivalent, which rarely occurs alone in English (but see below).

c /sh/ 'shirt'   creka /SHREH,kah/  j /zh/ 'measure' lojban /LOZH,bahn/
       'English' glico /GLEE,sho/          'azure'

These two fricatives occur frequently in English combined with a stop. Lojban phonology recognizes this, and the /ch/ sound is written tc, while the /j/ sound is written dj.

tc /tsh/ 'much' mutce /MU,cheh/   dj /dzh/ 'jaw' xedja /KHEH,jah/
   =/ch/                             =/j/

The other four Lojban consonants are also pronounced as in English. However, there are two English pronunciations to consider. The normal Lojban pronunciation is shown in the first column. In names, borrowings, and a few other situations, these consonants can occur with no vowel in the same syllable. In this case they are called vocalic consonants, and are pronounced as in the second column.

l /l/ 'late' lerci /LEHR,shee/   l /l/ 'bottle'
                                       'Carl'   kar,l  /KAHR,l/
m /m/ 'move' muvdu /MUV,du/      m /m/ 'bottom' 
                                       'Miriam' miri,m /MEE,ree,m/
n /n/ 'nose' nazbi /NAHZ,bee/    n /n/ 'button'
                                       'Ellen'  el,n   /EHL,n/
r /r/ 'rock' rokci /ROK,shee/    r /r/ 'letter'
                                       'Burt'   brt    /brt/

Consonants may be found in pairs, or even in triples, in many Lojban words; even longer clusters of consonants, often including at least one vocalic consonant, may be found in Lojbanized names or borrowings. Some of these clusters may appear strange to the English speaker (for example mlatu, /MLAH,tu/), but all permitted clusters were chosen so as to be quite pronounceable by most speakers and understandable to most listeners. If you run across a cluster that you simply cannot pronounce due to its unfamiliarity, it is permissible to insert a very short non-Lojban vowel sound between the consonants. The English /i/, as in 'bit', is recommended for English speakers.

The basic Lojban vowels are best described as being similar to the vowels of Spanish and Italian. These languages use pure vowels, whereas English commonly uses vowels that are complexes of two or more pure vowels, called diphthongs (2-sounds) or triphthongs (3-sounds). English speakers must work at keeping the sounds pure; a crisp, clipped speech tends to help, along with keeping the lips and tongue tensed (for example by smiling tightly) while speaking.

There are five common vowels (a, e, i, o, u), and one special purpose vowel (y). English words that are close in pronunciation are given, but few English speakers pronounce these words with the purity and tension needed in Lojban pronunciation.

      a    /ah/  'top', 'father'     patfu   /PAHT,fu/
      e    /eh/  'bet', 'lens'       lenjo   /LEHN,zho/
      i    /ee/  'green', 'machine'  minji   /MEEN,zhee/
      o    /o/   'joke', 'note'      notci   /NO,chee/
      u    /u/   'boot', 'shoe'      cutci   /SHU,chee/
      y    /uh/  'sofa', 'above'     lobypli /LOB,uh,plee/

The sound represented by y, called 'schwa', is a totally relaxed sound, contrasting with all the other tensed vowels. In this way, the Lojban vowels are maximally separated among possible vowel sounds. The English speaker must be especially careful to ensure that a final unstressed a in a Lojban word is kept tensed, and not relaxed as in the English 'sofa' (compare the equivalent Lojban sfofa /SFO,fah/).

Lojban has diphthongs as well, but these are always represented by the two vowels that combine to form them:

      ai /ai/ 'high'  bai    /bai/        ia  /yah/  'yard'
      au /au/ 'cow'   vau    /vau/        ie  /yeh/  'yell'
      ei /ei/ 'bay'   pei    /pei/        ii  /yee/  'hear ye'
      oi /oi/ 'boy'   coi    /shoi/       io  /yo/   'Yolanda'
                                          iu  /yu/   'beauty'
                                          ua  /wah/  'wander'
                                          ue  /weh/  'well'
                                          ui  /wee/  'wheel'
                                          uo  /wo/   'woe'
                                          uu  /wu/   'woo'

The diphthongs in the right column are found in Lojban only in stand-alone words, and in Lojbanized names. Those in the left column may be found anywhere.

Any other time these vowels occur together in a single word, they must be kept separate in order to unambiguously distinguish the separate vowels from the diphthongs. The principle has been extended to all Lojban vowels for consistency, and all non-diphthong vowel pairs in a word are separated in print and in sound by ' representing a short, breathy /h/ sound. (Say 'Oh hello' quickly and without a pause between the words to get an English equivalent, in this case of Lojban o'e.)

When the vowels occur together, one at the end of a word and the other at the beginning of the next word, the ' is not used to separate them (it would attach them into a single word). Instead, a pause is mandatory between the two vowels. The pause may be extremely short (called a glottal stop) as in the English 'he eats', or may be longer. The pause is mandatory and thus may be inferred without writing it, but it is usually signalled to a reader with a period (.) before the word starting with a vowel.

A pause is also required after any Lojban name, which always ends in a consonant. (A . is written after the name to mark this, thus distinguishing names from other words without capitalization.) Every vowel-initial Lojban word is thus preceded by a pause, and such words may be habitually spelled with a . at the beginning. There are a small number of other places where pauses are required to separate words; . may be used to mark the separation in these cases as well.
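The habit of writing a . before vowel-initial words is mechanical enough to sketch in Python. The function name mark_pauses is our own, and the sketch covers only the vowel-initial case; it assumes names already carry their final period and ignores the 'small number of other places' where pauses are required.

```python
VOWELS = set("aeiouy")


def mark_pauses(text: str) -> str:
    """Prefix every vowel-initial word with '.' to show the mandatory pause."""
    marked = []
    for word in text.split():
        # A word that already starts with '.' begins with a non-vowel
        # character here, so it is left unchanged.
        if word[0].lower() in VOWELS:
            word = "." + word
        marked.append(word)
    return " ".join(marked)


print(mark_pauses("la arnold. cu klama"))
```

Since the pause can always be inferred, removing the periods again loses no information; the marks are purely a reading aid.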

Lojban words of more than one syllable are stressed on the next-to-last, or penultimate, syllable. Syllables for which the vowel is y are not counted in determining penultimate stress, nor are syllables counted in which the letters l, m, n, or r occur in their vocalic forms with no other vowel in the same syllable. In Lojbanized names, a speaker may retain a semblance of native pronunciation of the name by stressing a non-penultimate syllable. In this case, capitalization is used to mark the abnormal stress, as in 'Josephine' in the example above.
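The stress rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the word is supplied already divided into syllables with commas (the sketch does not find syllable boundaries itself), the function names are our own, and a word with fewer than two countable syllables is simply reported as having no rule-assigned stress.

```python
FULL_VOWELS = set("aeiou")  # y is deliberately left out: y-syllables are not counted


def split_syllables(word):
    """Split a comma-syllabified word, dropping any pause periods."""
    return word.strip(".").split(",")


def stressed_index(word):
    """Index of the stressed syllable, or None if the rule assigns none."""
    syllables = split_syllables(word)
    # Capitalization in a name overrides the penultimate rule.
    for i, syllable in enumerate(syllables):
        if any(ch.isupper() for ch in syllable):
            return i
    # Skip syllables whose only vowel is y or that hold a vocalic consonant.
    countable = [i for i, syllable in enumerate(syllables)
                 if any(ch in FULL_VOWELS for ch in syllable)]
    return countable[-2] if len(countable) >= 2 else None


def show_stress(word):
    """Return the word with its stressed syllable capitalized."""
    syllables = split_syllables(word)
    i = stressed_index(word)
    if i is not None:
        syllables[i] = syllables[i].upper()
    return ",".join(syllables)


print(show_stress("lob,y,pli"))   # the y syllable is skipped
print(show_stress("djo,se,fin"))  # default stress for an unmarked name
print(show_stress("DJO,se,fin"))  # capitalization keeps the native stress
```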

Stress and pause are not mandatory in Lojban except for word separation per the above rules. There is no mandatory intonation, as for example the rising tone that always accompanies an English question. Lojban equivalents of English intonations are expressed as spoken (and written) words, and may be adequately communicated even in a monotone voice. Such intonation, and pauses for phrasing, are then totally at the speaker's discretion for ease in speaking or being understood, and carry no meaning.

3.0 Morphology

The forms of Lojban words are extremely regular. This, coupled with the phonology rules, allows a speech stream to be uniquely broken down into its component words.

Lojban uses three kinds of words:

                    names     cmene
        'predicate' words     brivla
        'structure' words     cmavo

3.1 cmene

Names, or cmene, are very much like their counterparts in other languages. They are labels applied to things (or people) to stand for them in descriptions or in direct address. They may convey meaning in themselves, but do not necessarily do so. Because names are often highly personal and individual, Lojban attempts to allow native language names to be used with a minimum of modification. The requirements for regularization to give speech stream recognition, however, do require that most names be Lojbanized to some extent. Examples of Lojbanized cmene include:

djim.      Jim        djein.    Jane
.arnold.   Arnold     pit.      Pete
katrinas.  Katrina    KAtr,in.  Catherine
katis.     Cathy      keit.     Kate

cmene may have almost any form, but always end in a consonant, and are followed by a pause. cmene are penultimately stressed unless unusual stress is marked with capitalization. A cmene may have multiple parts, each ending with a consonant and pause, or the parts may be combined into a single word with no pause. Thus djan. djonz. /jahn.jonz./ and djandjonz. /JAHNjonz./ are valid Lojbanizations of 'John Jones', while .iunaited. steits. and either .iuNAIted,steits. or .iunaited,STEITS. are valid Lojbanizations for 'United States', depending upon how you wish to stress the name. In the last example, writing the cmene as a single word requires capitalization of the stressed syllables /NAI/ or /STEITS/, neither of which is penultimate in the single-word form of the cmene.

The final arbiter of the correct form of the cmene is the person doing the naming, although most cultures grant people the right to determine how they want their own name to be spelled and pronounced. The English 'Mary' can thus be Lojbanized as meris., maris., meiris., or even marys. The last is not pronounced much like its English equivalent, but may be desirable to someone who values spelling consistency over pronunciation consistency. The final consonant need not be an s; there must, however, be a Lojban consonant of some variety.

cmene are not permitted to have the words la, lai, or doi embedded in them. These minor restrictions are due to the fact that all Lojban cmene embedded in a speech stream will be preceded by one of these words or by a pause. With one of these words embedded, the cmene might break up into valid Lojban words followed by a shorter, incorrect, cmene. There are close alternatives to these that can be used in Lojbanization, such as ly, lei, and do'i, that do not cause these problems.
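The two requirements on cmene stated in this section (a final consonant, and no embedded la, lai, or doi) can be checked with a short Python sketch. The function name cmene_problems is our own, and the test applies the restriction exactly as worded here, as a plain substring search; it does not attempt any finer rule.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
FORBIDDEN = ("la", "lai", "doi")


def cmene_problems(name):
    """Return a list of reasons why a candidate cmene is not acceptable."""
    # Ignore pause periods, syllable commas, and stress capitalization.
    core = name.strip(".").replace(",", "").lower()
    problems = []
    if not core or core[-1] not in CONSONANTS:
        problems.append("must end in a consonant")
    for word in FORBIDDEN:
        if word in core:
            problems.append("contains '" + word + "'")
    return problems


for candidate in ["djan.", "katrinas.", "meri", "alan."]:
    print(candidate, cmene_problems(candidate))
```

Here 'alan.' is a hypothetical Lojbanization used only to trigger the la restriction; respelling it with ly in place of la, as suggested above, would clear the problem.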

3.2 brivla

'Predicate' words, or brivla, are the core of Lojban. The concept of 'predicate', or bridi, will be discussed in the grammar section below. brivla carry most of the semantic information in the language. They serve as the equivalent of English nouns, verbs, adjectives, and adverbs, all in a single part of speech.

brivla may be recognized by several properties:

  • they have more than one syllable
  • they are penultimately stressed
  • they have a consonant cluster (at least two adjacent consonants) within or between the first and second syllables, ignoring the letter y
  • they end in a vowel

The consonant cluster rule has the qualification that the letter y is totally ignored, even if it splits a consonant cluster. Thus lobypei /LOB,uh,pei/ is a brivla even though the y separates the bp cluster.
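The recognition properties above suggest a rough way to sort a written word into one of the three word types. The Python sketch below is an approximation, not a full morphology: it treats 'within or between the first and second syllables' as 'a consonant pair within the first five letters, not counting y or the apostrophe', which is our assumption, and it does not check stress or syllable count. The function names are our own.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_early_cluster(word):
    """True if two adjacent consonants occur in the first five letters,
    ignoring y, apostrophes, commas, and periods."""
    letters = [ch for ch in word.lower() if ch not in "y',."]
    head = letters[:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(head, head[1:]))


def classify(word):
    """Guess the word type: 'cmene', 'brivla', or 'cmavo'."""
    core = word.strip(".").replace(",", "")
    if not core:
        raise ValueError("empty word")
    if core[-1].lower() in CONSONANTS:
        return "cmene"   # only names end in a consonant
    if has_early_cluster(core):
        return "brivla"
    return "cmavo"       # a single cmavo or a compound of them


for w in ["klama", "lobypei", "le'avla", "djan.", "mi", "pareci"]:
    print(w, classify(w))
```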

brivla are divided into subcategories:


     gismu   - the 'primitive' roots of Lojban; e.g. klama
     lujvo   - compounds of gismu, with meanings defined from their components; e.g. lobypli
     le'avla - 'borrowings' from other languages that have been Lojbanized, somewhat as cmene are Lojbanized, to fit within the brivla requirements; e.g. djarspageti ('spaghetti')

brivla are defined so as to have only one meaning, which is expressed through a unique place structure. This concept will be discussed further in the sections on semantics and grammar.

3.2.1 gismu

The gismu are the basic roots for the Lojban language. These roots were selected based on various criteria:

  • occurrence or word frequency in other languages
  • usefulness in building complex concepts
  • and a few, like the words 'gismu', 'cmavo', and 'lujvo', are included as uniquely Lojbanic concepts that are basic to this language.

Each gismu is exactly five letters long, and has one of two consonant-vowel patterns: CVCCV or CCVCV (e.g. 'rafsi' and 'bridi'). The gismu are built so as to minimize listening errors in a noisy environment. A gismu has at least two combining forms, known as rafsi. One is the gismu itself; one is the gismu with the final vowel deleted. Certain gismu have additional, shorter rafsi assigned. Up to three of these shorter rafsi may be assigned to a gismu, depending on frequency of usage of the gismu in building complex concepts and on availability of these shorter rafsi. Short rafsi use only certain combinations of letters from the gismu, and are of the forms CCV, CVC, and CVV or CV'V. The use of these rafsi is discussed below under lujvo.
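The two gismu letter patterns, and the two combining forms that every gismu has, can be illustrated with a short Python sketch. It checks shape only: it does not know which five-letter words are actually gismu, nor which consonant pairs are permitted. The function names are our own.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiou")


def letter_shape(word):
    """Map each letter to C or V; any other character is kept as-is."""
    return "".join("C" if ch in CONSONANTS else "V" if ch in VOWELS else ch
                   for ch in word.lower())


def gismu_pattern(word):
    """Return 'CVCCV' or 'CCVCV' if the word has a gismu shape, else None."""
    shape = letter_shape(word)
    return shape if shape in ("CVCCV", "CCVCV") else None


def long_rafsi(gismu):
    """The two combining forms every gismu has: itself, and itself
    with the final vowel deleted."""
    if gismu_pattern(gismu) is None:
        raise ValueError(gismu + " does not have a gismu shape")
    return [gismu, gismu[:-1]]


print(gismu_pattern("rafsi"), gismu_pattern("bridi"), gismu_pattern("brivla"))
print(long_rafsi("bridi"))
```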

3.2.3 lujvo

When specifying a concept that is not found among the gismu, a Lojbanist generally attempts to express the concept as a tanru. tanru is an elaboration of the concept of 'metaphor' used in English. In Lojban, any brivla can be used to modify another brivla. The first of the pair modifies the second. This modification is in some way restrictive - the modifier brivla reduces the broader sense of the modified brivla to form a more narrow, concrete, or specific concept. Modifier brivla may thus be seen as acting like English adverbs or adjectives. 'skami pilno' is the tanru which expresses the concept of 'computer user'.

The meaning of a tanru is somewhat ambiguous. 'skami pilno' could refer to a computer that is a user, or to a user of computers. There are a variety of ways that the modifier component can be related to the modified component. cmavo are used within tanru to prevent grammatical ambiguities, such as the various possible groupings of 'pretty little girls school'.

When a concept expressed in a tanru proves useful, or is frequently expressed, it is desirable to choose one of the possible meanings of the tanru and assign it to a new brivla. In the example, we would probably choose 'user of computers', and form 'sampli'. Such a brivla, built from the rafsi for the component gismu and cmavo, is called a lujvo.

Like gismu, lujvo have only one meaning. Unlike gismu, lujvo may have more than one form. This is because there is no difference in meaning between the various rafsi for a gismu when they are used to build a lujvo. A long rafsi may be used, especially in noisy environments, in place of a short rafsi; the result is considered the same lujvo, even though the word is spelled and pronounced differently. Thus 'brivla', itself a lujvo built from the tanru 'bridi valsi', is the same lujvo as 'brivalsi', 'bridyvla', and 'bridyvalsi', each using a different combination of rafsi.

When assembling rafsi together into lujvo, the rules for valid brivla must be followed: a consonant cluster must result in the first two syllables, and the lujvo must end in a vowel. A y (which is ignored in determining stress or consonant clusters) is inserted in the middle of the consonant cluster to glue the word together when the resulting cluster is either difficult to say or is likely to break up. There are specific rules describing these conditions, and the inserted y is called a hyphen, or jonle'u.
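To show how several spellings of one lujvo arise, here is a Python sketch that joins each rafsi of a first component to each rafsi of a final component. It is a simplification under one stated assumption: a y hyphen is inserted only after a four-letter rafsi ending in a consonant (a gismu with its final vowel deleted). The other hyphenation conditions are not modelled, and the function name lujvo_forms is our own.

```python
from itertools import product

VOWELS = set("aeiou")


def lujvo_forms(head_rafsi, final_rafsi):
    """Join every head rafsi to every final rafsi, gluing with y where a
    four-letter consonant-final rafsi would otherwise run into the next one."""
    forms = []
    for head, tail in product(head_rafsi, final_rafsi):
        glue = "y" if len(head) == 4 and head[-1] not in VOWELS else ""
        forms.append(head + glue + tail)
    return forms


# rafsi of bridi that may begin a lujvo, and rafsi of valsi that may end one
print(lujvo_forms(["bri", "brid"], ["vla", "valsi"]))
```

The four results are the four spellings of 'brivla' listed earlier, all of which count as the same lujvo.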

An r or n consonant may also be inserted as a hyphen when a CVV- or CV'V-form rafsi attaches to the beginning of a lujvo such that there is no consonant cluster. For example, in roinrai (/ROIN,rai/ or /ROI,n,rai/, which are almost indistinguishable), the rafsi 'roi' and 'rai' are joined, with the n hyphen causing the nr consonant cluster needed to make the word a brivla. Without the n, the word would break down into roi rai, which are two cmavo. The cmavo pair have no relation to their rafsi lookalikes; they will either be ungrammatical, or will express a different meaning than that intended.

Learning rafsi and the rules for assembling them into lujvo is thus seen to be basic to having a full use of the Lojban potential vocabulary.

3.2.4 le'avla

The use of tanru or lujvo is not always appropriate for very concrete or specific terms (e.g. 'brie', or 'cobra'), or for jargon words specialized to a narrow field (e.g. 'quark', 'integral', or 'iambic pentameter'). These words are in effect 'names' for concepts, and the names were invented by speakers of another language. The vast majority of names for plants, animals, foods, and scientific terminology cannot be easily expressed as tanru. They thus must be 'borrowed' (actually 'taken') into Lojban from the original language, forming words called le'avla. The word must be Lojbanized into one of several permitted le'avla forms. A rafsi is then attached to the beginning of the Lojbanized form, usually using a vocalic consonant as 'glue' to ensure that the resulting word doesn't fall apart. The rafsi categorizes or limits the meaning of the le'avla; otherwise a word having several different jargon meanings in other languages (such as 'integral'), would require a choice made as to which meaning should be assigned to the le'avla. le'avla, like other brivla, are not permitted to have more than one definition.

3.3 cmavo

cmavo are the structure words that hold the Lojban language together. They often have no semantic meaning in themselves, though they may affect the semantics of brivla to which they are attached. cmavo include the equivalent of English articles, conjunctions, prepositions, numbers, and punctuation marks. There are several dozen subcategories of cmavo, each having a specifically defined grammatical usage.

cmavo are recognized most easily by not being either cmene or brivla. Thus, they:

  • may be a single syllable.
  • never contain a consonant cluster of any type, whether or not y is counted.
  • end in a vowel.
  • Multi-syllable cmavo need not be penultimately stressed, though they often are.

All cmavo display one of the following letter patterns, where C stands for a consonant, and V stands for a vowel:

    V    VV    V'V    CV    CVV    CV'V

The letter pattern generally does not indicate anything about the grammar.
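The six letter patterns can be tested mechanically. The Python sketch below reduces a word to its C/V shape and compares it with the list above; counting y as a vowel for this purpose is our assumption, and the function name cmavo_shape is our own.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiouy")  # assumption: y counts as V for shape purposes
CMAVO_SHAPES = {"V", "VV", "V'V", "CV", "CVV", "CV'V"}


def cmavo_shape(word):
    """Return the letter pattern of a single cmavo, or None if the word
    does not fit any of the six patterns."""
    core = word.strip(".").lower()
    shape = "".join("C" if ch in CONSONANTS else "V" if ch in VOWELS else ch
                    for ch in core)
    return shape if shape in CMAVO_SHAPES else None


for w in ["le", "doi", ".ui", "o'e", "do'i", "klama", "pareci"]:
    print(w, cmavo_shape(w))
```

A compound such as pareci fits none of the six patterns as a whole, because each of its components must be matched separately.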

Compound cmavo (single words consisting of strings of cmavo) exist, and are used when the component cmavo act together on the rest of the sentence. For example, a set of digits comprising a longer number is written as a single word (e.g. pareci = pa + re + ci = '123').
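The digit example can be worked through in Python. Only pa, re, and ci appear in this overview; the remaining digit cmavo in the table below are supplied from the standard Lojban digit set and should be read as an addition to the text. The function name parse_number is our own, and the sketch handles plain strings of CV digits only.

```python
# pa, re, ci are from the example above; the rest are the other Lojban digits.
DIGITS = {"no": 0, "pa": 1, "re": 2, "ci": 3, "vo": 4,
          "mu": 5, "xa": 6, "ze": 7, "bi": 8, "so": 9}


def parse_number(compound):
    """Split a compound of two-letter digit cmavo and return its value."""
    if not compound or len(compound) % 2:
        raise ValueError("not a string of CV digit cmavo: " + repr(compound))
    value = 0
    for i in range(0, len(compound), 2):
        digit = compound[i:i + 2]
        if digit not in DIGITS:
            raise ValueError("unknown digit cmavo: " + repr(digit))
        value = value * 10 + DIGITS[digit]
    return value


print(parse_number("pareci"))
```

Because every digit cmavo has the CV pattern, the compound can be cut every two letters with no risk of a wrong division.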

A small number of cmavo used in tanru have been assigned rafsi so that they may aid in converting those tanru into lujvo.

4.0 Semantics

Lojban is designed to be unambiguous in orthography, phonology, morphology, and grammar. Lojban semantics, however, must support the same breadth of human thought as natural languages. Every human being has different 'meanings' attached to the words they use, based on their unique personal experiences with the concepts involved.

Lojban attempts to minimize the ambiguity, partly by systematizing as much as possible about semantics, but mostly by removing the clutter and confusion caused by other forms of ambiguity.

Thus, unlike words in most other languages, a brivla has a single meaning, a portion of the semantic space of all possible meanings; this meaning may be narrow and specific, or may subsume a broad and continuous range of submeanings. gismu tend to have more broad and abstract meanings, while lujvo tend to have very specific definitions; the compounding of gismu into lujvo allows expression of any desired degree of specificity. le'avla have a single specific meaning as well, and part of Lojbanization of a 'borrowing' consists of distinguishing between possible multiple meanings of a le'avla.

The semantic definitions of brivla are closely tied to the 'predicate' nature of brivla, a topic which will be discussed in detail in the grammar section below. In short, a brivla defines the relationship between a group of separate concepts, called sumti.

A brivla definition uses a specific set of 'places' for sumti to be inserted, expressed in a certain order (called a place structure) to allow a speaker to clearly indicate which place is which. By convention, we number these places as: x1, x2, x3, x4, x5, etc. Numbering is always from the left; y and z may be used with subscripts, or multiple subscripts may be used, when comparing two or more place structures. When space is at a premium, ellipsis marks can be used to indicate places, since the reader can always count them in order.

The unique definition of a brivla is thus an enumeration of the component places in order, joined with a description of the relationship between them. A typical definition might be expressed in any of three forms:

   klama   come/go to...from...via...using...
           x1 comes/goes to x2 from x3 via x4 using x5
           x1 describes a party that acts with result;
           x2 describes a destination where x1 is located
              after the action;
           x3 describes an origin where x1 is located
              before the action;
           x4 describes a route, or points along a route
              travelled by x1 between x2 and x3;
           x5 describes the means of transport by which
              the result is obtained.

The English verbs 'come' and 'go' describe the action taken by x1, depending on the relationship between x2, x3, and the speaker. The position of the speaker is not part of the Lojban meaning.

These definitions are of increasing accuracy and length; the last is so complex as to only be practical in a formal Lojban-English dictionary.

From this example, it should also be clear that brivla are neither nouns, verbs, adjectives, nor adverbs; yet they incorporate elements of each. These different aspects are brought out in the way the brivla is used in the grammar, but these different grammatical environments do not change the meaning of the brivla.

brivla are an open-ended set of words; new lujvo and le'avla may be created 'on the fly' as needed. The meaning of a word that is invented is at the discretion of the inventor, subject to conventions necessary to communication. Eventually, invented brivla will be collected, analyzed, and approved, and a formal dictionary will be produced defining the brivla in detail. Simpler definitions are generally clear enough for most usage. These definitions are specified in the case of gismu. Place structures of lujvo can generally be inferred from the way the word was derived, which is built into the lujvo itself. le'avla are generally concrete terms, and are only as ambiguous as the concept is in the source language. The conventions and rules for determining place structures and content of the places will suffice for most Lojban communications, and listeners have methods of querying the nature of an unknown place when there is uncertainty. The resulting predictability minimizes the need for a formal dictionary in using Lojban.

The heart of Lojban semantics is embedded in tanru. A speaker may use tanru to be arbitrarily general or specific, to refer to a relatively large or small portion of semantic space. tanru are usually quite easy to interpret; in addition to the various grammatical cmavo to indicate relationships, tanru are always considered as a series of pairs of terms, a 'binary metaphor' relationship. In such a relationship the first term always modifies the second term. The terms may be brivla, certain cmavo like numbers, or they may be shorter tanru.

The connotative semantics of Lojban sentences is relatively undefined, as is the semantics of longer expressions or texts. There is nothing clearly corresponding in Lojban to 'mood' or 'tone', no 'formal' or 'informal' styles, etc.

Because of an orientation in the language towards logic, attention is given to the nature of the assertion in a statement and its truth or falsity. Certain constructs in the language are described as making assertions, and having truth values. Other constructs may modify those truth values, and still other constructs are interpreted independently from the truth of the statement.

5.0 Grammar

Lojban's grammar is defined by a set of rules that have been tested to be unambiguous using computers. Grammatical unambiguity means that in a grammatical expression, each word has exactly one grammatical interpretation, and that the words relate grammatically to each other in exactly one way. (By comparison, in the English 'Time flies like an arrow.', each of the first three words has at least two meanings. Each possible combination results in a different grammar for the sentence.)

The machine grammar is the set of computer-tested rules that describes, and is the standard for, 'correct Lojban'. If a Lojban speaker follows those rules exactly, the expression will be grammatically unambiguous. If the rules are not followed, ambiguity may exist. Ambiguity does not make communication impossible, of course. Every speaker on Earth speaks an ambiguous language. Lojbanists strive for accuracy in Lojban grammatical usage, and thereby for grammatically unambiguous communication.

It is important to note that new Lojbanists will not be able to speak 'unambiguously' when first learning Lojban. In fact, you may never speak unambiguously in 'natural' Lojban conversation, even though you achieve fluency in the language. No English speaker always speaks textbook English in natural conversation; Lojban speakers will also make grammatical errors when talking quickly. Lojbanists will, however, be able to speak or write unambiguously with care, which is difficult if not impossible with a natural language.

The machine grammar includes rules which describe how each word is interpreted. A classification scheme categorizes each word based on what rules it is used in and how it interacts with other words in the grammar. The classification divides Lojban words into about 100 of these categories of grammar units, called lexemes. Whereas the three word types: cmene, brivla, and cmavo, are generally considered to correspond to the 'parts of speech' of English, these 100 lexemes correspond to the more subtle variations in English grammar, such as the varieties of ways to pluralize a noun or to express the past tense of a verb. In this sense, English has thousands of 'parts of speech'.

Lojban lexemes are named after one word within the category, often the one most frequently used. The lexeme names are capitalized in English discussion of Lojban: BRIVLA, CMENE, PU, and UI, are examples of lexemes. More elaborate expressions are needed to refer to lexeme names within Lojban text, since capitalization is not applicable, and one must distinguish between the lexeme name (e.g. UI) and the word (ui) that typifies the lexeme.

In Lojban grammar rules, lexemes are assembled into short phrases representing a possible piece of a Lojban expression. These phrases are then assembled into longer phrases, and so forth, until all possible pieces have been incorporated in rules that describe all possible expressions in the language. Lojban's rules include grammar for 'incomplete' sentences, for multiple sentences flowing together in a narrative, for quotation, and for mathematical expressions.

The grammar is very simple, but infinitely powerful; often, a more complex phrase can be placed inside a simple structure, which in turn can be used in another iteration of the complex phrase structure. Such a rule set is called recursive. An example of English recursive expression can be found in the nursery rhyme 'The House That Jack Built', but recursion is rare in English, though common in Lojban.

We will now discuss the basic concepts of Lojban grammar, starting with bridi. To make the discussion clearer, the following sample sentences, based on the brivla 'klama', will be used. Refer to the definitions of klama in 4.0 as necessary:

   (1) le prenu cu klama le zdani le briju le zarci le karce
       The person comes/goes to the house(nest) from the
       office via the market using the car.
   (2) mi sutra klama le blanu zdani be la djan. le briju
       I quickly come to the blue house(nest) of John from
       the office.

More completely, the latter translates as:

       I quickly (at doing...) come to the blue nest of John
       from the office (of...at...) via...using...

5.1 bridi

The bridi is the basic building block of a Lojban sentence. bridi are not words, but concepts. A bridi expresses a relationship between several 'arguments', called sumti. Those with a background in algebra may recognize the word 'argument' in connection with 'functions', and a bridi can be considered a logical 'function' (called a predicate) with several 'arguments'. A brivla (bridi valsi = bridi-word) is a single word which expresses the relationship of a bridi.

The definition of the brivla 'klama' in 4.0 above shows this relationship. There are five places labelled x1 through x5. The brivla 'klama' itself describes how the five places are related, but does not include those places. In example (1), those places are filled in with five specific sumti values, which are labelled a1 through a5 to distinguish them from the places (x1 through x5) that they fill:

     a1 = le prenu (the person)
     a2 = le zdani (the nest)
     a3 = le briju (the office)
     a4 = le zarci (the market)
     a5 = le karce (the car)
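
The pairing of places with sumti values can be sketched in Python. This is a minimal illustration, not part of any Lojban software; the names KLAMA_PLACES, EXAMPLE_1_SUMTI, and fill_places are hypothetical, and the sketch assumes every place is filled, as in example (1).

```python
# Place labels for klama as used in the text (x1 .. x5).
KLAMA_PLACES = ["x1", "x2", "x3", "x4", "x5"]

# The sumti values a1 .. a5 of example (1), in the order they are spoken.
EXAMPLE_1_SUMTI = ["le prenu", "le zdani", "le briju", "le zarci", "le karce"]


def fill_places(places, sumti):
    """Pair each place with the sumti value that fills it, in order."""
    if len(sumti) != len(places):
        raise ValueError("this sketch expects exactly one sumti per place")
    return dict(zip(places, sumti))


bridi = fill_places(KLAMA_PLACES, EXAMPLE_1_SUMTI)
for place, value in bridi.items():
    print(place, "=", value)
```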

The brivla and its sumti, used in a sentence, have become a bridi, more specifically a ju'abri, or 'sentence-bridi'. (The brivla for the English concept 'sentence' is jufra.) For logicians, the comparable English concept is called a 'predication'. In each bridi, a brivla or tanru specifies the relationship of the sumti. This specification of the relationship, without the sumti expressed, is called a selbri. Whether or not any sumti are attached, a selbri is found within every bridi.

We express a bridi relationship in Lojban by filling in the sumti places, expressing the sumti such that their position in the place structure is clear, and expressing the selbri that ties the sumti together.

It is not necessary to fill in all of the sumti to make the sentence meaningful. In English we can say 'I go', without saying where we are going. To say 'mi klama' (I go...) specifies only one sumti; the other four are left unspecified.

In Lojban, we know those four places exist; they are part of the definition of klama. In English, there is no implication that anything is missing, and the sentence 'I go.' is considered complete. The bridi 'mi klama' inherently is an incomplete sentence. The omission of defined places in a bridi is called ellipsis; corresponding ellipsis in the natural languages is a major source of semantic ambiguity (the ambiguity embedded in the variable meanings of words when taken in context). Most Lojban expressions involve some amount of ellipsis. The listener, however, knowing that the omissions occurred, has means of quickly questioning any specific one of them (or all of them) and resolving the ambiguity. Semantic ambiguity is thus not eliminated in Lojban, but is made more recognizable and more available for resolution.

It is even permissible to use a selbri alone, with no sumti filled in, as a very elliptical sentence. The sentence 'fagri' is very similar to the English observational exclamation 'Fire!', without the emotional content. The bridi merely states that something the speaker has in mind is 'a fire in fuel...'. A Lojban speaker might quickly say 'fagri' in the same situations where an English speaker would yell 'Fire!' in warning (the warning can be added in to the Lojban bridi as an emotional indicator, but it is not part of the bridi).

When the bridi is filled with whatever places the speaker intends, whether 0 of them, as in 'fagri', or all of them as in example (1), the result is a bridi.
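
Because the number of places is fixed by the definition of the brivla, a listener can always work out which places were left out. A minimal Python sketch of that bookkeeping follows; the function name elided_places is hypothetical, and the sketch assumes the sumti are given in standard order with none skipped.

```python
def elided_places(place_count, sumti):
    """Return the labels of the places left unfilled by the given sumti."""
    if len(sumti) > place_count:
        raise ValueError("more sumti than defined places")
    return ["x%d" % n for n in range(len(sumti) + 1, place_count + 1)]


# klama has five places; 'mi klama' fills only x1.
print(elided_places(5, ["mi"]))
# Example (2) fills x1 through x3, leaving two places unspecified.
print(elided_places(5, ["mi", "le blanu zdani be la djan.", "le briju"]))
# A selbri used alone fills none of its places.
print(elided_places(5, []))
```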

You may have noticed that in example (1), each of the sumti filling the five places of klama contains a brivla. Each of these brivla is a selbri as well; i.e. it implies a relationship between certain (usually unspecified) sumti places. A selbri may be labelled with le (among other things) and placed in a sumti. When le is used, that which the speaker has in mind for the x1 place of each sumti selbri is used to fill the sumti place in the sentence bridi. In example (1), there are no places specified for any of the selbri embedded in the sumti; they are all elliptically omitted, except for the x1 place that is superfluous. In example (2), one of the sumti selbri has had its places specified, while two places of klama have been elliptically omitted:

   a1 = mi (I)
   a2 = le blanu zdani be la djan. (the blue-nest of the one named John)
        a2,1 blanu zdani be a2,2
        a2,1 = that which fills a2; the thing which is a blue nest
        a2,2 = la djan. (the one named John)
   a3 = le briju  (the office (of...at...))
   a4, a5 are elliptically omitted.

Two of the places of the selbri 'briju' have also been elliptically omitted, and this is expressed in the more exact translation of example (2).

Note that in the two tanru in example (2), 'sutra klama' and 'blanu zdani', each brivla in the tanru may be a self-contained selbri unit as well, having specific sumti attached to it (with be). It turns out that the place structure of the final component of a tanru (klama and zdani, respectively) becomes the place structure of the tanru as a whole, and hence the place structure of the higher level bridi structure (the place structure of klama thus becomes the place structure of the sentence, while the place structure of zdani becomes the place structure of the a2 sumti).
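
The rule that a tanru takes its place structure from its final component can be sketched as follows. The dictionary PLACE_STRUCTURES and the function tanru_places are hypothetical names, and the English glosses are loose paraphrases taken from the translations of examples (1) and (2); they are assumptions, not official definitions, and may omit places.

```python
# Assumed, partial place glosses paraphrased from the examples in the text.
PLACE_STRUCTURES = {
    "klama": ["goer", "destination", "origin", "route", "means"],
    "sutra": ["quick thing", "action done quickly"],
    "zdani": ["nest", "occupant"],
    "blanu": ["blue thing"],
}


def tanru_places(components):
    """Return the place structure of a tanru: that of its final component."""
    if not components:
        raise ValueError("a tanru needs at least one component")
    return PLACE_STRUCTURES[components[-1]]


print(tanru_places(["sutra", "klama"]))  # same places as klama
print(tanru_places(["blanu", "zdani"]))  # same places as zdani
```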

There are many permissible ways to express a Lojban bridi. In a sentence, the sumti can be expressed before or after the selbri, or some may be found on both sides. It is of course essential that the listener be able to determine which sumti places are being filled in with which sumti values. This throws us back to the 'single meaning' mentioned above for brivla, which are the simplest form of selbri. A brivla must have a single defined place structure, with specific sumti places to be related.

If this were not so, example (1) might be interpreted as 'The person is the means, the office the route, the market is the time of day, the nest is the cause, by which someone elliptically unspecified comes to somewhere (also elliptical)'. Not only is this nonsense, but it is confusing nonsense. With fixed place structures, a Lojbanist will interpret example (1) correctly. A Lojbanist can also, incidentally, express the nonsense just quoted. It will still be nonsense, but a listener will not be confused by the syntax; each place will be clearly labelled, and the nonsense can be discussed until resolved (if one wants).

Thus, for a given brivla, or indeed for any selbri, we have a specific place structure defined as part of the meaning. Complex selbri, described below, simply have more elaborate place structures determined by simple rules from the components.

The place structure of a bridi is defined with ordered (and implicitly numbered) places. The sumti are typically expressed in this order. When one is skipped, or the sumti are presented in a non-standard order, there are various cmavo to indicate which sumti is which.

The English speaker will normally see a Lojban bridi written with the value of the 1st (x1) sumti place, followed by the selbri, followed by the rest of the sumti values in order. This is called the 'canonical' sentence form, and resembles the English Subject-Verb-Object (SVO) sentence form. It is shown schematically as:

  [sumti a1]x1  [selbri]  [sumti a2]x2 ...
  [sumti an]xn
      or abbreviated as:
  x1 selbri x2 x3 x4 x5

This order is the one used for the bridi sentences in examples (1) and (2).

By the way, the abbreviated form is nearly identical to the definition given above for 'klama', which reveals something valuable about bridi place structures: the sumti places and their order are often intuitively obvious merely by expressing a natural English sentence definition of the brivla.

An equally valid form that requires no extra cmavo is called SOV (Subject-Object-Verb) order:

   [sumti a1]x1  [sumti a2]x2 ...
   [sumti an]xn  [selbri]
      or abbreviated as:
   x1 x2 x3 x4 x5 selbri

Example (1), rewritten in this form becomes:

   le prenu le zdani le briju le zarci le karce cu klama

Since most European languages use SVO order rather than SOV order, a Lojbanist communicating internationally is likely to see SVO Lojban sentences often. On the other hand, SOV order is found in more non-European languages, and Lojbanists from such cultures will be more likely to use SOV-ordered Lojban.

There are a variety of cmavo operations which modify these orders, or which modify one or more pieces of the bridi. These can make things extremely complicated, yet simple rules allow the listener to take the complications apart, piece by piece, to get the complete and unique structure of the bridi. We cannot describe all of these rules here, but a couple of key ones are given.

The cmavo 'cu' is placed after the last sumti before a selbri in a sentence-bridi. 'cu' is not used if there are no sumti before the selbri, and is otherwise always permitted though not always required. Example (1) shows a 'cu' that is required; example (2) omits an optional 'cu'. Skill in Lojban includes knowing when 'cu' is required, when it is not required but useful, and when it is permitted, but is wasted or even a distraction.
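
The two sentence orders, together with the placement of 'cu', can be sketched in Python. The function names canonical_order and sov_order are hypothetical. As a simplification, the sketch inserts 'cu' whenever any sumti precedes the selbri, which the rule above always permits, rather than deciding when it could be dropped.

```python
def canonical_order(selbri, sumti):
    """Render x1 selbri x2 x3 ... (the SVO-like canonical form)."""
    head, rest = sumti[:1], sumti[1:]
    # 'cu' is used only when at least one sumti precedes the selbri.
    words = head + (["cu"] if head else []) + [selbri] + rest
    return " ".join(words)


def sov_order(selbri, sumti):
    """Render x1 x2 x3 ... selbri (the SOV form)."""
    words = list(sumti) + (["cu"] if sumti else []) + [selbri]
    return " ".join(words)


sumti = ["le prenu", "le zdani", "le briju", "le zarci", "le karce"]
print(canonical_order("klama", sumti))
print(sov_order("klama", sumti))
print(canonical_order("fagri", []))  # a selbri alone takes no 'cu'
```

Both renderings carry the same place assignments; only the position of the selbri changes.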

What happens when the place structure of a bridi does not exactly match the meaning that the speaker is trying to convey? Lojban's design assumes that absolute meanings for words do not serve all needs for human communication. Lojban thus provides a way to adapt a place structure by adding places to the basic structure. The phrases that do so look exactly like sumti, but they have a cmavo marker on the front (called a case tag, or a modal operator) which indicates how the added place relates to the others. The resulting phrase resembles an English prepositional phrase or adverbial phrase, each of which modifies a simple English sentence in the same way. Thus I can say:

   ca le cabdei ku mi cusku bau la lojban.
     ca le cabdei ku = an added sumti;
        case tag 'ca' indicates that the added place
        specifies 'at the time of...', or 'during...';
        thus 'during the nowday', or 'today';
     x1 = mi (I)
     selbri = cusku (x1 expresses x2 to x3 in form/media x4)
     x2, x3, and x4 are elliptically omitted
     bau la lojban = an added sumti; case tag 'bau'
        indicates that the added place specifies 'in
        language...'; thus 'in language which is called Lojban'.

The sentence thus roughly translates as 'Today, I express in Lojban.'
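
The breakdown above can be represented as a small data structure: the ordinary places of the selbri plus a list of added, tagged places. The following Python sketch is only an illustration; the names TAG_GLOSSES, sentence, and describe are hypothetical, and the glosses are copied from the analysis above.

```python
# Glosses for the two case tags used in the example sentence.
TAG_GLOSSES = {"ca": "at the time of / during", "bau": "in language"}

sentence = {
    "selbri": "cusku",
    # x1 is filled; x2, x3 and x4 are elliptically omitted (None).
    "places": ["mi", None, None, None],
    # Added places: (case tag, sumti) pairs outside the basic place structure.
    "added": [("ca", "le cabdei"), ("bau", "la lojban.")],
}


def describe(bridi):
    """List the basic places, then the added places with their tag glosses."""
    lines = []
    for n, value in enumerate(bridi["places"], start=1):
        shown = value if value is not None else "(elliptically omitted)"
        lines.append("x%d = %s" % (n, shown))
    for tag, sumti in bridi["added"]:
        lines.append("added place [%s: %s] = %s" % (tag, TAG_GLOSSES[tag], sumti))
    return lines


for line in describe(sentence):
    print(line)
```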

Among additional places that can be specified are comparison and causality. There are also a small number of sumti that can be said to be a part of all bridi place structures: location and time (Lojban tense is formed from either of these, or both combined), the identity of the observer, the conditions under which the bridi is true, etc. While these are generally omitted, they are recognized as possible ellipses, and can thus be either specified by a speaker, or questioned by a listener. In Lojban, wherever possible, semantic components that apply to everything but are not always needed for communication, are left optional. They need not be specified, but they are available when necessary for clear communication.

5.2 selbri

As described above, the simplest form of selbri is a brivla. The place structure of the brivla is used as the place structure of the bridi. Various modifications can be made to the brivla and its place structure using cmavo. These include abstraction to states, events, activities, properties, amounts, etc. For example, jetnu, expressing that x1 is true, becomes the basis for ka jetnu, the property of truth attributed to x1.

Place structures of a selbri can undergo 'conversion', i.e. a change to the order of the sumti places. Since the listener's attention is usually focussed on the first and/or the last sumti expressed in the bridi, this has a significant semantic effect, somewhat like the 'passive voice' of English (e.g. 'The man was bitten by the dog.' vs. 'The dog bit the man.'). Time and location, and combinations of the two, can also be incorporated in the selbri.

As shown in example (2) above, tanru can also be selbri. These tanru can be composed of brivla, brivla modified by the techniques referred to above, or simpler tanru. tanru themselves can also be modified by the above techniques.

All of the modifications to selbri that are possible are optional semantic components, including tense. With tense unspecified, it is possible that examples (1) and (2) might be intended as past, present, or future tense; the context determines how the sentence should be interpreted.

5.3 sumti

sumti can be compared to the 'subject' and 'object' of English grammar, but as the discussion of bridi above has perhaps indicated, this is only an analogy. The value of the first (x1) sumti place resembles the English 'subject'; the other sumti are like 'objects'.

There are differences, though. sumti are neither singular nor plural; number is one of those semantic components mentioned above that is not always relevant to communication, so number is optional. Thus, example (2) could have been translated as 'We quickly go/come/went/came (etc.) to the blue houses of those called John.' If this is plausible given the context, but is not intended, the speaker must add some of the optional semantic information like tense and number to ensure that the listener can understand the intended meaning. There are several ways to specify number when this is important to the speaker; the numerically unambiguous equivalent of the English plural 'people' would be le su'ore prenu ('the at-least-two persons').

There are a large variety of constructs usable as sumti. Only the most important will be mentioned here. These include:


pro-sumti:

cmavo which serve as short representations for longer sumti expressions; (e.g. ko'a, ti); imperatives are also marked with a pro-sumti (ko);

anaphora/cataphora:

back references and forward references to other sentences and their components; (e.g. ri, di'u);

quotations:

grammatical Lojban text, or text in other languages, suitably marked to separate the quote from the rest of the bridi; (e.g. zo djan, lu mi klama li'u, zoi by. I come .by.);

indirect reference:

reference to something by using its label; among other things, this allows one to talk about another sentence ('That isn't true.'), or the state referred to by a sentence ('That didn't happen'), unambiguously; (e.g. la'e di'u na fasnu = 'The referent of the last sentence does not occur.', or 'That didn't happen.');

named references:

reference to something named by using the name; (e.g. la djan, lai ford);

descriptions:

reference to something by describing it; (e.g. le prenu, le pu crino, le nu klama).

Pro-sumti, anaphora/cataphora, and indirect references are all equivalent to various uses of pronouns of English, and no further discussion is provided. Quotations and named references are straightforward, and quite similar to their English counterparts. Lojban, however, allows (and generally requires) distinction between Lojban and foreign quotation, and between grammatical and ungrammatical Lojban quotation.

Descriptions appear similar to an English noun phrase (le prenu = 'The person'). For most purposes, this analogy holds. However the components of a description are a 'descriptor' or gadri, and a selbri. If the selbri has no expressed sumti, as in 'le prenu', this effectively turns the selbri into a 'noun'; the value of the sumti place is something that would be put into the x1 place (the 'subject') of the selbri. Thus le klama is 'the go-er to... from... via... using...', and le blanu is 'the blue thing'. With conversion, as described above, a speaker can access other places in the bridi structure as the 'subject' or x1 place: le se klama is 'the place gone to by... from... via... using...'. Descriptions are not limited to selbri without attached sumti; as in example (2), they can include bridi with places filled in.
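
Conversion can be pictured as swapping the x1 place with another place, so that a description built on the converted selbri refers to what was originally a later place. The Python sketch below is a hedged illustration: the names CONVERSION_CMAVO, convert, and KLAMA are hypothetical, and the English glosses are paraphrased from the translations above. Only se appears in this overview; te, ve, and xe are the remaining conversion cmavo as documented in the reference grammar.

```python
# Each conversion cmavo exchanges x1 with the place of the given number.
CONVERSION_CMAVO = {"se": 2, "te": 3, "ve": 4, "xe": 5}

# Assumed glosses for the five places of klama, paraphrased from the text.
KLAMA = ["goer", "destination", "origin", "route", "means"]


def convert(places, cmavo):
    """Return a new place list with x1 and the selected place exchanged."""
    n = CONVERSION_CMAVO[cmavo]
    if n > len(places):
        raise ValueError("selbri has no x%d place to convert" % n)
    converted = list(places)
    converted[0], converted[n - 1] = converted[n - 1], converted[0]
    return converted


# 'le klama' describes the x1 of klama; 'le se klama' the x1 of 'se klama'.
print("le klama    ->", KLAMA[0])
print("le se klama ->", convert(KLAMA, "se")[0])
```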

Abstract bridi such as events and properties can also be turned into sumti. These are among the more common descriptions, and a common source of error among new Lojbanists. As stated above, le klama is 'the go-er/come-r to... from... via... using...'. le nu klama is the 'event of ...going/coming to...from...via...using...'. The abstraction treats the bridi as a whole rather than from the aspect of the x1 place. Descriptions can also incorporate sentences based on abstracts; this is needed to elaborate le nu klama; le nu mi klama ti is 'the event of I come here (from... via... using...)', or simply 'my coming here'.

In addition to number, Lojban allows for mass concepts to be treated as a unit. This is equivalent to English mass concepts as used in sentences like 'Water is wet.', and 'People are funny.' Mass description also allows a speaker to distinguish, in sentences like 'Two men carried the log across the field', whether they did it together, or whether they did it separately (as in 'One carried it across, and the other carried it back.').

Sets can be described in sumti, as well as logically and non-logically connected lists of sumti. Thus, Lojban provides for: 'Choose the coffee, the tea, or the milk', or 'Choose exactly one from the set of {coffee, tea, milk}'. Note that English connectives are not truly logical. The latter is the common interpretation of 'Coffee, tea, or milk?' and is relatively unambiguous. The former, if translated literally into Lojban would be a different statement, because of the ambiguous meaning of English 'or'.
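
The difference between the two readings can be made concrete with truth values. The Python sketch below is an illustration only; the function names inclusive_or and exactly_one are hypothetical, and each argument records whether one of the three drinks was chosen.

```python
def inclusive_or(*chosen):
    """Logical 'or': satisfied when at least one option is chosen."""
    return any(chosen)


def exactly_one(*chosen):
    """Set-style choice: satisfied only when a single option is chosen."""
    return sum(1 for c in chosen if c) == 1


# Order of arguments: coffee, tea, milk.
coffee_and_tea = (True, True, False)
print(inclusive_or(*coffee_and_tea))  # True: taking two drinks still satisfies 'or'
print(exactly_one(*coffee_and_tea))   # False: the 'choose exactly one' reading fails
```

The two functions agree when a single drink is taken and differ as soon as more than one is, which is why the two statements are not interchangeable.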

Finally, sumti can be qualified using time, location, modal operators, or various other means of identification. Incidental notes can be thrown in, and pro-sumti can have values assigned to them. Lojban also has constructs that are similar to the English possessive.

5.4 Free Modifiers

Free modifiers are grammatical constructs that can be inserted in a bridi as a sentence, without changing the meaning, or the truth value, of the bridi. Free modifiers include the following types of structures:


parenthesis

parenthetical notes, which can be of any length, as long as they are grammatical.

vocatives

these are used for direct address; they include several expressions used for 'protocol', allowing for smooth, organized communications in disruptive environments, as well as some expressions that are part of 'courtesy' in most languages.

discursives

these are comments made at a metalinguistic level about the sentence, and about its relationship to other sentences. In English, certain adverbs and conjunctions serve this function (e.g. 'however', 'but', 'in other words').

discursive bridi

these are halfway between discursives and parenthesis, and allow the speaker to make metalinguistic statements about a sentence without modifying the sentence. Thus, the discursive bridi equivalent of 'This sentence is false.' does not result in a paradox, since it would be expressed as a discursive bridi inside of another sentence, the one actually being described.

attitudinal indicators

these are expressions of emotion and attitude about the sentence, being expressed discursively. They are similar to the English exclamations like 'Oh!' and 'Ahhhh!', but there is a much broader range of possibilities covering a range comparable to that expressed by English intonation, as well as indicators of intensity. Also included in this category are indicators of the relationship between the speaker and the expression. Found in Native American tongues, these may indicate that the speaker directly observed what is being reported, heard about it from another, deduced it, etc.

5.5 Questions

Lojban's method of asking questions is quite different from that of English. In Lojban, most questions are asked by placing a question word in place of something to be filled in. The question word mo can be used in the grammatical place of any bridi, including those within sumti. It asks for a bridi (usually a selbri) to be supplied which correctly fills in the space. It is thus similar to English 'what?' The Lojban brochure is titled la lojban. mo, meaning 'The thing called Lojban is what?', or, of course, 'What is Lojban?' The question word ma is used in place of a sumti in the same manner. Thus a listener can ask for ellipsis to be filled in, or can pose new questions that are similar to the classic English questions ('who?', 'when?', 'where?', 'how?', and 'why?').

Lojban also provides for questions about indicators and their intensities, sumti-place case tags, tenses and modalities, and logical connectives. Yes/no questions can either be asked as a question of emotional attitude, such as belief, certitude, supposition, decision, approval, or intention, or as a question of truth and falsity. In the first case, the answer is an emotional indicator. In the second case, the answer is an assertion or denial of the bridi expressing the state being questioned.

5.6 Logic and Lojban

Lojban supports all of the standard truth-functions of predicate logic. These can be used to connect any of several different levels of construct: sumti, bridi, selbri, sentences, etc.; the methods used unambiguously indicate what is being joined. As an example of English ambiguity in the scope of logical connectives, the incomplete sentence 'I went to the window and ...' can be completed in a variety of different ways (e.g. '...closed it', '...the door', '...Mary went to the desk'); in these, the 'and' is joining a variety of different constructs. You must hear and analyze the whole sentence to interpret the 'and', and you still may not be certain of correct understanding.

Another way Lojban supports logical connectives is by separating them from non-logical connectives. These latter include the 'and' of mixing (as in the red-and-blue beach ball which is neither red nor blue, but is both-at-once), expressions of causality (Lojban supports expression of four different kinds of causality: physical causation, motivation, justification, and logical implication), and the various conjunctive discursives (such as 'but', and 'however'), which in English imply 'and' without stating it.

5.7 MEX

Lojban has incorporated a detailed grammar for mathematical expressions (abbreviated MEX in English, mekso in Lojban). This grammar parallels the predicate grammar of the non-mathematical language. Numbers, of course, may be clearly expressed, including exponential and scientific notation. Digits are provided for up to base 16, and letters (lerfu) may be used for additional digits if desired. Distinction is made between mathematical operations and mathematical relations. The set of operations is not limited to 'standard arithmetic'. Operations therefore assume a left-grouping precedence which can be overridden with parentheses, or with optional inclusion of precedence labels that override this grouping when you evaluate the expression.
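
Left-grouping evaluation, with parentheses as the override, can be sketched in Python. This is a minimal illustration using ordinary arithmetic symbols rather than Lojban words; the names OPS and evaluate_left_grouped are hypothetical, a nested list stands in for a parenthesized sub-expression, and precedence labels are not modelled.

```python
import operator

# A few familiar operators; the set is illustrative, not exhaustive.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_left_grouped(tokens):
    """Evaluate [operand, op, operand, op, ...] strictly left to right.

    A nested list is treated as a parenthesized sub-expression and is
    evaluated first, which is how the default grouping is overridden.
    """
    def value(operand):
        if isinstance(operand, list):
            return evaluate_left_grouped(operand)
        return operand

    if not tokens or len(tokens) % 2 == 0:
        raise ValueError("expected: operand (operator operand)*")
    result = value(tokens[0])
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, value(tokens[i + 1]))
    return result


print(evaluate_left_grouped([1, "+", 2, "*", 3]))    # grouped as (1 + 2) * 3
print(evaluate_left_grouped([1, "+", [2, "*", 3]]))  # parentheses force 1 + (2 * 3)
```

With no operator precedence table, the first call yields 9 and the second 7.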

Included in Lojban are means to express non-mathematical concepts and quantities as numbers, and mathematical relationships as ordinary bridi. In Lojban, it is easy to talk about a 'brace' of oxen or a 'herd' of cattle, as well as to discuss the '5 fingers of your hand', or the 'integral of -2x^3 + x^2 - 3x + 5 dx evaluated over the interval of -5 to +5' bottles of water.
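
For the curious, the quantity in the last example can be worked out exactly. The Python sketch below takes the polynomial to be -2x^3 + x^2 - 3x + 5 and integrates it term by term over the interval from -5 to +5; the names COEFFS and definite_integral are hypothetical, and the calculation is ordinary calculus rather than anything specific to Lojban.

```python
from fractions import Fraction

# Coefficients of 5 - 3x + x^2 - 2x^3, lowest power first.
COEFFS = [5, -3, 1, -2]


def definite_integral(coeffs, lower, upper):
    """Integrate a polynomial exactly using the power rule."""
    def antiderivative(x):
        # The term c * x^n integrates to c / (n + 1) * x^(n + 1).
        return sum(Fraction(c, n + 1) * Fraction(x) ** (n + 1)
                   for n, c in enumerate(coeffs))
    return antiderivative(upper) - antiderivative(lower)


value = definite_integral(COEFFS, -5, 5)
print(value)         # 400/3
print(float(value))  # roughly 133.3 bottles of water
```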

5.8 selsku

The set of possible Lojbanic expressions is called selsku. Lojban has a grammar for multiple sentences tied together as narrative text, or as a conversation; an indefinite string of Lojban paragraphs of arbitrary length is supported by the unambiguous Lojban grammar. Using the rules of this grammar, multiple speakers can use, define, and redefine pro-sumti. Paragraphs, chapters, and even books can be separately distinguished (each can be numbered or titled distinctly). One can express logical and non-logical connectives over multi-sentence scope (This is the essence of a set of instructions - a sequence of closely-related sentences.) Complex sets of suppositions can be expressed, as well as long chains of reasoning based on logical deduction. In short, the possibilities of Lojban grammatical expression are endless.

6.0 Where Do I Go From Here?

This has served only as an introduction to the Lojban language, identifying key terms that you will frequently run into in other places. If you wish to learn the language, there are several ways to proceed.

To learn the 'whole' language, you really need to take a class. Learning a language to fluency takes time and practice, and interaction with others; if you do not ever try to communicate in the language, you cannot learn how to be understood. The Lojban education system is being designed around such classes, possibly led by a couple of students rather than by a formal 'teacher'; if you are unable to work with others, you can still develop skill by mail, sending letters and even tapes to other students. The Logical Language Group, Inc. (la lojbangirz.) will work to assist you in establishing contact with students of similar interests and level of understanding.

Prior to at least the end of 1990, there will be neither a textbook nor a dictionary. In fact a true dictionary will probably not be started until there are a good number of Lojban speakers.

What do you do to learn the language in the short term?

The orthography, phonology, and morphology are covered in a synopsis that goes into considerably more detail than this overview. This synopsis will eventually be incorporated in a Lojban reference manual, along with similar synopses of the grammar and semantics. The material is a little more dense than a textbook, but the synopsis is complete. An appendix to this synopsis describes the rules for making lujvo; the rules are written algorithmically, and are extremely detailed.

The grammar is most simply and completely defined in machine grammar form, but this is written for computers, and not for people. For those familiar with computer concepts, the grammar is written in 'YACC' format, which is a kind of 'BNF'. If these jargon words mean nothing to you, don't worry.

Drafts of 6 textbook lessons exist, totalling some 280 pages, and covering much of the grammar needed for conversation and simple expression. Due to low volume reproduction, the draft lessons are expensive. They do contain a lot of information, however, and a commitment from la lojbangirz. to support you in using them to learn the language. The final textbook will be completely rewritten from these draft lessons, so obtaining them is not redundant.

Another approach is to read the texts and commentary in the Lojban periodical Ju'i Lobypli (JL), published quarterly. JL6 and later issues have significant amounts of teaching material, and copies of back issues may be ordered. Earlier issues are primarily of historical interest, although anyone with the books describing earlier versions of the language (published by The Loglan Institute, Inc.), will find JL5 useful.

Most essential to learning the language, or even closely following the work of others, is knowledge of the set of gismu. A list of these, written with abbreviated definitions, is available in English keyword and Lojban order, along with a brief pronunciation guide.

Flash cards for the gismu are available. There are also vocabulary teaching programs for MS-DOS computers and for the Apple Macintosh. Versions are in progress for other machines, but none are yet scheduled for release.

The cmavo also must be learned. A complete list and description have been released, with abbreviated definitions. Earlier, incomplete lists of cmavo, with more explanatory definitions, are available, but have been somewhat obsoleted by a small number of changes to the word assignments. Detailed lists of cmavo for specific areas, such as negation, and attitudinal and discursive cmavo used in free modifiers, are given in detailed papers on those subjects that are available.

If you are not ready to learn Lojban due to a lack of time, a short newsletter is published quarterly. This newsletter, lojbo karni (Lojbanic-periodical, abbreviated LK), will include a list of contents of the current JL issue, and a list of currently available materials, so that you can become actively involved at any time.

For those of you interested in spreading information about Lojban, we are happy to send a reasonable number of copies of the 'What is Lojban?' brochure, or, better yet, you may copy it or reprint it freely. We will send this overview to anyone who contacts us. While this Overview is copyright by The Logical Language Group, Inc., it may be freely copied in its entirety, with copyright notice and this paragraph unchanged.

Thank you for your interest.

e'osai ko sarji la lojban - Please! Support Lojban!