Modern Lojban Theory (draft)

This document is maintained by spheniscine "amphora"; please submit corrections or comments. However, if I do not respond within a few days, please feel free to make the changes yourself if this document no longer reflects the most current thought.


This document is meant to describe the theory of the modern dialect of Lojban. It is meant to parallel and supplement the Complete Lojban Language while better reflecting modern common usage of phrases and sentence constructions, as well as the various changes to the grammar that have occurred since that book was published.

This document is meant to be a reference, mostly for intermediate-to-advanced learners. A beginner might be able to learn the language from this document; however, the Lojban Wave Lessons or the Crash Course might be more suitable for picking up conversational skill effectively.

Orthography and Phonology

Lojban has 26 phonemes: 6 vowels, 18 consonants, and 2 semivowels. Each phoneme is represented by a single letter (in some Lojban documentation, you will find the word "letteral" used for letters), with two exceptions: i and u become semivowels when they begin a syllable that contains a second vowel letter. This is, however, completely predictable from the spelling of the word (more on that in the phonotactics section); thus, Lojban spelling is completely regular.

Thus, Lojban has 24 letters: 6 vowels (with 2 that can become semivowels) and 18 consonants.

The vowel letters are: a e i o u y

The consonant letters are: ' b c d f g j k l m n p r s t v x z (the first letter is the apostrophe)

Thus, the Lojban alphabet is the English alphabet, except that it lacks h, q, and w, considers y a vowel, and considers the apostrophe a letter.

Additionally, the . (period) and the , (comma) are sometimes seen as "auxiliary characters". The period marks a mandatory pause or glottal stop. The comma is sometimes used within words to separate syllables where their separation may be unclear, e.g. crenzu,ue ("to practice"). These, however, are optional and, again, completely predictable from the spelling/phonotactics of the word.
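The letter inventory above is small enough to write down and check mechanically. The following is a minimal sketch, not part of any official tool; the names LOJBAN_LETTERS and non_lojban_letters are made up for illustration, and treating spaces as allowed is an assumption made so that whole phrases can be checked.

```python
import string

VOWELS = set("aeiouy")
CONSONANTS = set("'bcdfgjklmnprstvxz")  # the apostrophe counts as a letter
LOJBAN_LETTERS = VOWELS | CONSONANTS
AUXILIARY = set(".,")  # period and comma, the "auxiliary characters"

# The same set derived from the English alphabet: drop h, q, w; add the apostrophe.
derived = (set(string.ascii_lowercase) - set("hqw")) | {"'"}
assert derived == LOJBAN_LETTERS
assert len(LOJBAN_LETTERS) == 24


def non_lojban_letters(text):
    """Return the characters of text that are not Lojban letters.

    Capitals are lowercased first, since they only mark stress.
    Spaces and the auxiliary characters are allowed (an assumption of this sketch).
    """
    allowed = LOJBAN_LETTERS | AUXILIARY | {" "}
    return sorted({ch for ch in text.lower() if ch not in allowed})


print(non_lojban_letters("coi rodo"))
print(non_lojban_letters("hello world"))
```

The first call should report nothing, while the second should flag h and w.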

Some phonemes have allophones (different realizations of what is considered the same "sound" in the language), allowed either as a natural variation due to influence by surrounding phonemes, or to accommodate Lojban learners who speak different languages.

Below is a list of phonemes and their realizations in the IPA (International Phonetic Alphabet). IPA symbols for alternate allophones are in (parentheses); preferred ones have no parentheses.

Letter Lojban name IPA representation(s) Notes
' .y'y. h (θ) Like h as in ahoy. Only ever occurs in the middle of words, and only between two vowels. Some people pronounce it as th as in thigh.
a .a bu / .a'y. a (ɑ) Like a as in father. Avoid "reducing" it like a as in about, as that might cause confusion with y.
b by. b Like b as in bed
c cy. ʃ (ʂ) Like sh as in ship. The digraph tc maps to [tʃ] and sounds like ch as in church
d dy. d Like d as in dog
e .e bu / .e'y. ɛ Like e as in get. Try to pronounce it in a pure, medium-high tone (closer to rare and not to rate), to avoid possible confusion with the ei diphthong.
f fy. f (ɸ) Like f as in fox
g gy. ɡ Like "hard" g as in get (never "soft" g as in giraffe)
i .i bu / .i'y. i [vowel form] Like ee as in see (avoid i as in hit)
j [semi-vowel form] [if beginning a syllable that contains another vowel] Like y as in yes. Hence iu sounds like the English word "you"
j jy. ʒ (ʐ) Like the "zh" sound of si as in vision, or the end of montage. The digraph dj maps to [dʒ] and sounds like j as in judge
k ky. k Like k as in kick
l ly. l Like l as in land
m my. m Like m as in mom
n ny. n Like n as in nun
(ŋ) Allowable natural variation before the letters k, g, and x. Like n as in skunk or finger.
o .o bu / .o'y. o (ɔ) Like o as in boat. Try to keep a "pure" sound, rather than completing the diphthong; however, Lojban has no ou diphthong, so there is less risk of misunderstanding.
p py. p Like p as in pen
r ry. r (ɹ, ɾ, ʀ) Like r as in three. A "trilled r" is preferred, but isn't necessary.
s sy. s Like s as in sack
t ty. t Like t as in tack
u .u bu / .u'y. u [vowel form] Like oo as in too.
w [semi-vowel form] [if beginning a syllable that contains another vowel] Like w as in weed. Hence uu sounds like English "woo"
v vy. v Like v as in vow
x xy. x Like ch as in Scottish loch. English speakers may find this sound hard to pronounce. Try saying ksss while keeping your tongue down.
y .y bu (never .y'y.) ə Like a as in about
z zy. z Like z as in zoo

Sometimes, i and u are written with breve diacritics when they act as semivowels, e.g. .ĭu , .ŭu , and crenzuŭe , but this is optional.

There are four diphthongs (glides from one vowel to another, like English sigh and how) in Lojban:

Diphthong IPA representation(s) Notes
ai ai, (aj) Like English eye
au au, (aw) Like English how
ei ɛi, (ei, ɛj, ej) Like English hey
oi oi, (oj, ɔi, ɔj) Like English boy

Capitalized letters are used only to show stress in transliterated names, as in .DJOsefin. Some people find this aesthetically unpleasing and prefer to put a grave accent on the stressed vowel instead, e.g. .djòsefin.

Stress rules, semivowels, and mandatory pauses, or basics of morphology

In Lojban, morphology (the shape of words) is an intrinsic part of grammar. This is because one of the design goals of Lojban is to eliminate word boundary ambiguity; there is no Lojban equivalent to "propagate / prop a gate" or "nitrate / night rate". This is accomplished using three tools: stress rules, mandatory pauses, and carefully controlling what consonant/vowel shapes a word is allowed to have.

The last of these can be quite complicated, though it's regular enough to be programmed into a computer. The rules will be described in this document, but for now, you probably want to know: What are the stress rules? And since the periods and commas are optional, how do I know where they should be?

First we'll talk about the rules for mandatory pauses and stress, as well as the three main categories of Lojban words:

  • All words that begin with vowel sounds must have a mandatory pause or glottal stop in front of them, e.g. .i and .ernace . Because this rule is so simple, the periods marking these pauses are often left out in writing. The exception is .i, where the period is usually kept so that the word stands out, since it divides sentences. Yes: in Lojban, all punctuation consists of spoken words.
  • Words that begin with semivowel sounds, like iu and uitki, need a mandatory pause or glottal stop if the previous word ended in a diphthong that ends with the same letter. For example, bai .iu requires a pause, while bi iu and ba iu do not. If you have trouble applying this rule, you can simply treat these words as starting with a vowel, as in the previous rule, on the principle of "better safe than sorry".
  • All words that end in consonants are cmevla (from cmene valsi, "name words", designed for use with transliterated names). They must have pauses both before and after them, e.g. .djon. , .alis. , and .lojban. . It's considered courteous to write these periods, since the reader otherwise needs to look at the end of the word to recognize a cmevla. Cmevla may be stressed on any syllable; thus, a person who prefers a particular stress may capitalize the stressed syllable, e.g. .DJOsefin., or mark it with a grave accent, e.g. .djòsefin. (Historical note: This class was once just called cmene, but cmene refers to all names, so cmevla is the preferred term now.)
  • That leaves words that end in a vowel. Those that either have a consonant cluster in them or have two consonants separated by y (with the exception of letter-word clusters like xykycydy.) are brivla (from bridi valsi, "predicate words"). These words have very specific meanings, and tend to contain the semantic bulk of Lojban text. Some brivla are .ernace ("hedgehog"), vecnu ("to sell"), pofygau ("to break something"), and... well, brivla itself. They are always stressed on the second-to-last syllable that does not contain y, thus: .erNAce, VEcnu, POFygau, BRIvla .
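The rules in this list are regular enough to sketch in code. The following is a rough illustration only, under several assumptions that are not in the text: all function names are invented here; the letter-word exception is approximated by "a word made only of consonant + y syllables"; words that are not cmevla or brivla are lumped together as "other"; and semivowel i/u inside a word is not resolved, so mark_stress should only be trusted for words like the examples above. Stress is shown with the grave accent on the stressed vowel, following the convention mentioned for names.

```python
import re

VOWELS = "aeiouy"
DIPHTHONGS = {"ai", "au", "ei", "oi"}
# Diphthongs are listed first so that each one counts as a single syllable nucleus.
NUCLEUS = re.compile(r"ai|au|ei|oi|[aeiouy]")
GRAVE = {"a": "à", "e": "è", "i": "ì", "o": "ò", "u": "ù"}


def normalize(word):
    """Drop the optional periods and commas and lowercase the word."""
    w = word.strip(".").replace(",", "").lower()
    if not w:
        raise ValueError("empty word")
    return w


def classify(word):
    """Return 'cmevla', 'brivla', or 'other' for a single word."""
    w = normalize(word)
    if w[-1] not in VOWELS:
        return "cmevla"  # ends in a consonant
    # Assumption: crude stand-in for the letter-word exception (xykycydy).
    if re.fullmatch(r"(?:[^aeiouy']y)+", w):
        return "other"
    # A consonant cluster, or two consonants separated by y.
    if re.search(r"[^aeiouy']y?[^aeiouy']", w):
        return "brivla"
    return "other"


def mark_stress(brivla):
    """Put a grave accent on the stressed vowel of a brivla."""
    w = normalize(brivla)
    # Syllables whose nucleus is y are skipped when counting.
    nuclei = [m for m in NUCLEUS.finditer(w) if m.group() != "y"]
    if len(nuclei) < 2:
        raise ValueError("expected at least two syllables without y")
    i = nuclei[-2].start()
    return w[:i] + GRAVE[w[i]] + w[i + 1:]


def pause_before(prev, word):
    """Decide whether a pause is mandatory between prev and word."""
    p, w = normalize(prev), normalize(word)
    if "cmevla" in (classify(p), classify(w)):
        return True  # cmevla need a pause on both sides
    if w[0] not in VOWELS:
        return False
    starts_with_semivowel = w[0] in "iu" and len(w) > 1 and w[1] in VOWELS
    if not starts_with_semivowel:
        return True  # plain vowel-initial word
    # Semivowel-initial: pause only after a diphthong ending in the same letter.
    return p[-2:] in DIPHTHONGS and p[-1] == w[0]


for word in [".djon.", "vecnu", "xykycydy.", ".i"]:
    print(word, classify(word))
for word in [".ernace", "vecnu", "pofygau", "brivla"]:
    print(mark_stress(word))
print(pause_before("bai", "iu"), pause_before("bi", "iu"))
```

With the example words from the list, the accented vowel should land in the same syllable that the capitalized forms mark.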

todo: cmavo, cmavo clusters, and letter words


Phonotactics are the rules that say what combinations of sounds are allowed. For example, /ps/ and /zd/ are combinations that English words can end with (e.g. hops, housed), but not start with. These rules apply to all Lojban words, even those in cmevla, the most flexible morphological class.

The consonants of Lojban can be divided into two groups: sonorants and obstruents. Obstruents can be further divided into voiced and unvoiced obstruents. The group a consonant belongs to matters when describing the phonotactics of Lojban.

  • The sonorants are: l m n r. The difference between sonorants and obstruents is that you can "sing"/hold a note with a sonorant just like you could a vowel, but you can't do so with an obstruent.
  • The voiced obstruents are: b d g v j z
  • The unvoiced obstruents are: p t k f c s x

Did we forget the apostrophe? Well, it is not considered to be in any of these three categories. In fact, the apostrophe often does not behave like the other Lojban consonants. The apostrophe stands alone; it only ever occurs in the middle of words, and only between two vowels.

Anyway, these three groups define what consonant clusters (combinations of consonants) are allowed.

  • The first rule is that no consonant can ever be doubled.
  • The second rule is that a voiced obstruent can never be next to an unvoiced obstruent. For example, bf is forbidden, and so is sd, but since sonorants do not have this restriction, fl, vl, ls, and lz are all permitted.
  • The third rule is that no two sibilants can ever be next to each other. The sibilants are: c, j, s, z
  • The fourth rule is that these specific pairs are explicitly forbidden: cx, kx, xc, xk, mz
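The four rules above can be checked pair by pair. Here is a minimal sketch; the function name pair_allowed is hypothetical, and the check covers only these four general rules. It says nothing about which pairs may begin a word, which this document has not yet described.

```python
SONORANTS = set("lmnr")
VOICED = set("bdgvjz")
UNVOICED = set("ptkfcsx")
SIBILANTS = set("cjsz")
FORBIDDEN_PAIRS = {"cx", "kx", "xc", "xk", "mz"}


def pair_allowed(pair):
    """Return True if two adjacent consonants pass the four cluster rules."""
    if len(pair) != 2:
        raise ValueError("expected exactly two consonants")
    a, b = pair
    for ch in pair:
        # The apostrophe is in none of the three groups, so it is rejected too.
        if ch not in SONORANTS | VOICED | UNVOICED:
            raise ValueError(f"not a sonorant or obstruent: {ch!r}")
    if a == b:
        return False  # rule 1: no doubled consonant
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False  # rule 2: voiced and unvoiced obstruents do not mix
    if a in SIBILANTS and b in SIBILANTS:
        return False  # rule 3: no two sibilants together
    if pair in FORBIDDEN_PAIRS:
        return False  # rule 4: explicitly forbidden pairs
    return True


for pair in ["bf", "sd", "fl", "vl", "ls", "lz", "mz", "tc", "dj"]:
    print(pair, pair_allowed(pair))
```

Note that tc and dj pass all four rules, which is consistent with their use as digraphs in the pronunciation table.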

todo: stuff about initial consonant pairs, triplets, syllabic sonorants, and semivowel resolution