Morphology How-to

Revision as of 12:52, 3 September 2015 by Gleki (talk | contribs)

la me zo'e's condensed CLL 2.0 morphology chapter counterfeit.

Phonemes

At the most basic level, an utterance is made of phonemes. Here are the main classes of phonemes (there are subclasses as seen later):

  • consonants {zunsna}:
    • bdgjvz (voiced), cfkpstx (unvoiced), lmnr (syllabic)
  • glides {karmlisna}: i u
  • h {me'o .y'y}: '
  • word break (glottal stop) {depybu'i}: .
  • vowels {karsna}: a e i o u
  • diphthongs: au ai ei oi
  • y {me'o .ybu}: y

The comma {me'o slaka bu} isn't a phoneme, but is used to separate syllables for clarity. Removing it has no effect.

i and u are vowels, unless a vowel or diphthong follows, in which case they are glides. Glide-diphthong pairs win over glide-vowel pairs, which win over diphthongs.

At this level, strings of consonants follow these rules:

  • consonants can be next to consonants, word breaks, vowels,
 diphthongs, and y
  • no consonant can be followed by itself
  • voiced consonants (bdgjvz) can't be next to unvoiced ones (cfkpstx); the syllabic consonants lmnr are exempt
  • sibilants (cjsz) can't be next to each other
  • x can't be next to c or k
  • the substrings mz, nts, ntc, ndz, ndj are not allowed
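
The consonant rules above can be sketched as a small checker. This is only an illustration of the list, not code from camxes or any other parser; the names pair_ok and cluster_ok are made up here, and the input is assumed to be a string containing consonants only.

```python
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
FORBIDDEN_SUBSTRINGS = ("mz", "nts", "ntc", "ndz", "ndj")


def pair_ok(a: str, b: str) -> bool:
    """Check one pair of adjacent consonants against the rules above."""
    if a == b:
        return False  # no consonant can be followed by itself
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False  # voiced next to unvoiced (lmnr are in neither set)
    if a in SIBILANTS and b in SIBILANTS:
        return False  # two sibilants in a row
    if "x" in (a, b) and ({a, b} & set("ck")):
        return False  # x next to c or k
    return True


def cluster_ok(cluster: str) -> bool:
    """Check a run of consonants (hypothetical helper, consonants only)."""
    if any(bad in cluster for bad in FORBIDDEN_SUBSTRINGS):
        return False
    return all(pair_ok(a, b) for a, b in zip(cluster, cluster[1:]))


for sample in ("st", "dj", "nt", "sd", "cs", "kx", "ll", "nts"):
    print(sample, cluster_ok(sample))
```

Note that this covers only the general rules; the stricter rules for syllable-initial clusters are a separate check.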

Glides must follow a word break, vowel, diphthong, or y, and be followed by a vowel, diphthong, or y. i as a glide can't follow a diphthong ending in i, and u as a glide can't follow the diphthong au.

h can't be next to a consonant, glide, or glottal stop.

Vowels, diphthongs, and y can be next to consonants, glides, h, and word breaks.

Syllables

These are the shapes syllables {slaka} can have:

  • Vowel syllable
    • a word break, a glide, or up to three consonants
    • then a vowel or a diphthong
    • then optionally a consonant
    • e.g. .a, spa, pan, blaif, stra
  • h-syllable
    • the letter '
    • then a vowel or diphthong
    • then optionally a consonant
    • e.g. 'u, 'ei, 'am
  • y-syllable
    • a word break, a glide, or up to three consonants
    • then the letter y
    • e.g. by, .y, gry, zbly
  • hy-syllable
    • the string "'y"
  • consonantal syllable {zunsnaslaka}
    • a consonant
    • then a syllabic consonant
    • e.g. fl, sm, rn

When a syllable starts with more than one consonant, the rules for these clusters {zunsnagri} are more restrictive than the general ones above. These are the permissible initial doubles, stolen with love from CLL:


    pl pr                       fl fr

    bl br                       vl vr


    cp cf      ct ck cm cn      cl cr

    jb jv      jd jg jm

    sp sf      st sk sm sn      sl sr

    zb zv      zd zg zm


    tc tr      ts               kl kr

    dj dr      dz               gl gr


    ml mr                       xl xr

And the permissible initial triples:


    cfr cfl sfr sfl   jvr jvl zvr zvl

    cpr cpl spr spl   jbr jbl zbr zbl

    ckr ckl skr skl   jgr jgl zgr zgl

    ctr     str       jdr     zdr

    cmr cml smr sml   jmr jml zmr zml
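
As a rough sketch, the two tables above can be turned into a lookup. The name valid_onset is hypothetical, and the function only answers whether a string of consonants may begin a syllable according to these tables.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")

# Permissible initial pairs, copied from the table above.
INITIAL_PAIRS = set("""
pl pr fl fr bl br vl vr
cp cf ct ck cm cn cl cr
jb jv jd jg jm
sp sf st sk sm sn sl sr
zb zv zd zg zm
tc tr ts kl kr
dj dr dz gl gr
ml mr xl xr
""".split())

# Permissible initial triples, copied from the table above.
INITIAL_TRIPLES = set("""
cfr cfl sfr sfl jvr jvl zvr zvl
cpr cpl spr spl jbr jbl zbr zbl
ckr ckl skr skl jgr jgl zgr zgl
ctr str jdr zdr
cmr cml smr sml jmr jml zmr zml
""".split())


def valid_onset(cluster: str) -> bool:
    """True if `cluster` (1 to 3 consonants) may start a syllable."""
    if not all(ch in CONSONANTS for ch in cluster):
        return False
    if len(cluster) == 1:
        return True
    if len(cluster) == 2:
        return cluster in INITIAL_PAIRS
    if len(cluster) == 3:
        return cluster in INITIAL_TRIPLES
    return False  # empty string or more than three consonants


for onset in ("p", "pl", "tl", "str", "stl"):
    print(onset, valid_onset(onset))
```

In these tables, every listed triple consists of two overlapping permissible pairs (e.g. str = st + tr), but not every such combination is listed, so the triples need their own table.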

When segmenting text into syllables, when a consonant could possibly either start a syllable or end one, it's always taken to start one. In other words, onsets are greedy, codas are lazy.

Words

Words can be cmavo, cmevla, or brivla. cmavo and brivla are made of syllables, while cmevla are free strings of phonemes.

cmavo are composed of:

  • one vowel- or y-syllable, with at most one initial consonant and no
 final consonant
  • optionally followed by any number of h- or hy-syllables without any
 final consonants

Examples: .a, ba, bai, ba'i, ba'ai, by, by'i, ia, iai, iy, ua'ai'y

There are two exceptions: "ybu", also spelled "y.bu", is a single cmavo despite the medial consonant and word break, and "y" surrounded by word breaks and not followed by "bu" is a word break itself, not a cmavo.
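
The cmavo shape described above can be approximated with a regular expression. This is a minimal sketch, not how camxes does it: the name looks_like_cmavo is made up, an initial word break is treated as optional, and the two exceptions just mentioned ("ybu", and a bare "y" acting as a word break) are deliberately not handled.

```python
import re

C = "[bcdfgjklmnprstvxz]"
NUCLEUS = "(?:au|ai|ei|oi|[aeiou]|y)"  # diphthong, vowel, or y

CMAVO = re.compile(
    r"\.?"                           # optional word break
    rf"(?:{C}|[iu](?=[aeiouy]))?"    # at most one consonant, or a glide
    rf"{NUCLEUS}"                    # first syllable, no final consonant
    rf"(?:'{NUCLEUS})*"              # any number of h- or hy-syllables
)


def looks_like_cmavo(text: str) -> bool:
    """Shape test only; ignores the "ybu" and bare "y" exceptions."""
    return CMAVO.fullmatch(text) is not None


examples = [".a", "ba", "bai", "ba'i", "ba'ai", "by", "by'i",
            "ia", "iai", "iy", "ua'ai'y"]
print(all(looks_like_cmavo(e) for e in examples))
print([w for w in ("bla", "ban", "ba'") if looks_like_cmavo(w)])
```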

cmavo can be stressed on any syllable.

cmevla are arbitrary strings of phonemes, following phoneme but not syllable restrictions, starting with a word break, containing no word breaks, and ending with a consonant followed by a word break. They can be stressed on any vowel, diphthong, or syllabic consonant.
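
The outer frame of a cmevla (word break, no inner word breaks, final consonant, word break) is easy to test on its own. The sketch below checks only that frame; it does not check the phoneme restrictions, and the name has_cmevla_frame is invented for this example.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_cmevla_frame(text: str) -> bool:
    """True if text is '.' + body + '.', with no inner '.', ending in a consonant."""
    return (
        len(text) >= 3
        and text[0] == "."
        and text[-1] == "."
        and "." not in text[1:-1]
        and text[-2] in CONSONANTS
    )


for candidate in (".djan.", "djan.", ".djan", ".dja.n.", ".djana."):
    print(candidate, has_cmevla_frame(candidate))
```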

A brivla is composed of any number of initial rafsi followed by a final rafsi. It must begin with a vowel syllable, end with a vowel- or h-syllable, and have at least two syllables. It may not be a slinkuhi, and may not start with a sequence of cmavo that yields a valid word when removed. Stress (marked here with a grave accent) is on the second-last vowel- or h-syllable.
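
The stress rule can be illustrated on a word that has already been split into syllables (as the commas do in the examples on this page). This is a sketch under that assumption; stressed_index is a made-up name, and a syllable is counted as a vowel- or h-syllable if it contains one of a e i o u and no y.

```python
VOWELS = set("aeiou")


def stressed_index(syllables):
    """Index of the second-last vowel- or h-syllable, or None if there are fewer than two."""
    candidates = [
        i for i, syl in enumerate(syllables)
        if VOWELS & set(syl) and "y" not in syl  # skips y-, hy- and consonantal syllables
    ]
    if len(candidates) < 2:
        return None
    return candidates[-2]


word = ["cnar", "jy", "fra", "ga", "ri"]
print(word[stressed_index(word)])
```

For the word above this picks "ga", skipping the y-syllable "jy" when counting back from the end.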

A final rafsi is:

  • a zihevla:
    • a vowel syllable
    • followed by any number of vowel, h-, or consonantal syllables
    • followed by a vowel- or h-syllable with no final consonant
    • is not a gismu or sequence of more than one rafsi
    • e.g. cpi,kù,ku àl,ga fì,pr,koi glàu,ka sprà,'e
  • or a gismu:
    • a CV vowel syllable followed by a CCV one
    • or a CVC one then a CV one
    • or a CCV one then a CV one
    • e.g. pà,stu vèd,li tsà,ni
  • or a short final rafsi:
    • a CVV or CCV vowel syllable, e.g. xau, cpa
    • or a CV vowel syllable followed by a 'V h-syllable,
   e.g. fà'i
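
To illustrate the three gismu shapes, here is a sketch that splits a five-letter candidate into its two syllables, using the "onsets are greedy" rule from the Syllables section. The name gismu_syllables is hypothetical. Only the shape and the initial-pair table are checked; the general consonant rules for the medial pair of a CVC,CV word are not.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
INITIAL_PAIRS = set("""
pl pr fl fr bl br vl vr cp cf ct ck cm cn cl cr jb jv jd jg jm
sp sf st sk sm sn sl sr zb zv zd zg zm tc tr ts kl kr
dj dr dz gl gr ml mr xl xr
""".split())


def gismu_syllables(word):
    """Return [syllable, syllable] for a gismu-shaped word, else None."""
    shape = "".join(
        "V" if ch in VOWELS else "C" if ch in CONSONANTS else "?"
        for ch in word
    )
    if shape == "CCVCV":
        # CCV,CV: the first two consonants must be a valid initial pair
        return [word[:3], word[3:]] if word[:2] in INITIAL_PAIRS else None
    if shape == "CVCCV":
        if word[2:4] in INITIAL_PAIRS:
            return [word[:2], word[2:]]  # CV,CCV: the greedy onset takes both
        return [word[:3], word[3:]]      # CVC,CV
    return None


for w in ("pastu", "vedli", "tsani"):
    print(w, gismu_syllables(w))
```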

An initial rafsi is any one of these:

  • a gismu followed by the syllable "'y"
   e.g. fasnu'y
  • a gismu with its final vowel replaced with y
   e.g. fasny
  • a zihevla followed by the syllable "'y"
   e.g. sorpeka'y
  • a CV vowel syllable followed by a Cy y-syllable
   e.g. fa,ky
  • a short y-less rafsi, unless the following rafsi is a zihevla rafsi:
    • a vowel syllable of the form CVV, CVVr, CVC, or CCV
    • or a CV syllable followed by a 'V or 'Vr syllable
   e.g. gau  gaur  gas  jbu  li,'a  li,'ar
  • a short y-less rafsi followed by a short final rafsi followed by "'y"
   e.g. cau,cni,'y  ri,'ar,ju,'o,'y  mul,fau,'y  jbo,jbe,'y
  • a zihevla that ends in a vowel syllable with its final vowel replaced
 with y, unless the result breaks up into a string of any other rafsi
   e.g. ka,'or,ty  a,sny

If a CVVr or CV'Vr rafsi is followed by a rafsi beginning with "r", and only then, the final "r" of the first rafsi is replaced with an "n". If a rafsi ending in "y" is followed by a rafsi beginning with a vowel, and only then, an "'" is prepended to the second rafsi. In other situations where sticking two rafsi together violates phoneme or syllable rules, the left rafsi needs to be replaced with one ending with "y".
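
The first two gluing rules can be sketched as a function that joins two rafsi. The name join_rafsi is made up, and the inputs in the demo are chosen only for their shape, not as real vocabulary. The third rule (swapping in a y-final rafsi when the join would break phoneme or syllable rules) is not attempted here.

```python
import re

VOWELS = set("aeiou")
# CVVr or CV'Vr shape
R_FINAL_RAFSI = re.compile(r"[bcdfgjklmnprstvxz][aeiou]'?[aeiou]r")


def join_rafsi(left: str, right: str) -> str:
    """Join two rafsi, applying the r -> n and y + vowel rules above."""
    if right.startswith("r") and R_FINAL_RAFSI.fullmatch(left):
        left = left[:-1] + "n"   # avoid "rr": final r becomes n
    elif left.endswith("y") and right[:1] in VOWELS:
        right = "'" + right      # y followed by a vowel gets an ' in between
    return left + right


print(join_rafsi("gaur", "tcini"))   # shape demo: no change needed
print(join_rafsi("gaur", "rinka"))   # shape demo: r -> n
print(join_rafsi("fasny", "alga"))   # shape demo: ' inserted
```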

A brivla consisting of just a zihevla is called a zihevla, one consisting of just a gismu is a gismu, and all others are called lujvo.

A slinkuhi {valslinku'i} is a string that can't itself be broken up into a string of rafsi, but that consists of a consonant followed by a brivla which, up to its first y-syllable (or in its entirety, if it has no y-syllable), is composed of non-zihevla rafsi. In the examples, underscores mark the prepended consonant.

 e.g. _p_rà,'i  _s_pòr,te  _z_bla,zdà,vro  _c_nar,jy,fra,gà,ri

Other non-words also behave like slinkuhi, in that prepending a cmavo makes them a word, but these arise from rules other than the one named slinkuhi.

 e.g. cpa  cpau  cpra  cprau  (brivla must have 2+ syllables)
      cl,pàr,nu  (brivla must start with a vowel syllable)

A tosmabru {valrtosmabru} is a sequence of cmavo followed by a brivla. tosmabru can be coerced into being brivla by adding a consonant at the end of the last syllable of the first cmavo.


 e.g. gau,tcì,ni -> gau tcini; cmavo + gismu
      gaur,tcì,ni -> gaurtcini; a single lujvo
      .a,'u,nain,mo -> .a'u nainmo; cmavo + zihevla
      .a,'ur,nain,mo -> .a'urnainmo; a single zihevla
      boi,kèi,foi -> boi kèi foi; three cmavo
      boir,kèi,foi -> boirkeifoi; a single lujvo

Word breaks, glottal stops

Any word break may be pronounced as a glottal stop, and some must be. Glottal stops are required before and after all cmevla, as well as before all words starting with a vowel or "y". They are also required after certain cmavo:

  • When pronouncing two words together would break a phonotactic rule,
 they need to be separated with a glottal stop.
   e.g. "au" "uàn,mo" -> {.au .uanmo}
  • Each pair of cmavo of the form CV Cy followed by either a brivla or a
 cmavo of the form CVV or CV'V needs a glottal stop between the last
 and second-last word.
   e.g. "ca" "vy" "càr,vi" -> {ca vy. carvi} /Sa.vy?.'Sar.vi/
        (/Sa.vy.'Sar.vi/ would be {cavycarvi}, a lujvo)
  • Every stressed cmavo followed by a brivla starting with a consonant
 cluster needs a glottal stop after the cmavo.
   e.g. "bà" "sna,jù,'i" -> {bà. snaju'i} /'ba?.sna.'Zu.hi/
        (/'ba.sna.'Zu.hi/ would be {basna jù'i}, a gismu and a cmavo)

Parser peculiarities

jbofihe, popular before camxes came along, has different rules than camxes.

  • Vowel syllables
    • They may start with any number of consonants, and the rule for
   initial triples doesn't exist. The only restriction is that all
   pairs in the initial cluster need to be valid initial pairs.
     e.g. {stsmla'u} is a word
    • They may end with up to two consonants, not just one.
     e.g. {bongnanba} is a word
    • Syllables beginning with glides are their own type, and if not
   preceded by a glottal stop, they continue the word like an
   h-syllable.
     e.g. {.aierne} is one word, not two,
          {.ia} always starts with a glottal stop
    • Syllables beginning with vowels don't require a word boundary
   before them.
     e.g. {sincrboa} is a word, {.joan.} is a word

(Or, more accurately, jbofihe has no notion of syllables in the sense that camxes does, but even under jbofihe practically no one would use words that violate these modified syllable rules.)

  • cmevla

Dotside doesn't apply: the beginning of a cmevla can also be delimited by one of the cmavo {la}, {lai}, {la'i}, or {doi}. If one of these cmavo precedes a cmevla, no initial glottal stop is required. Consequently, cmevla can't contain any of these cmavo. For example, {la .larfin.} parses as three words: "la", "la", "rfin".

  • brivla

zihevla as final rafsi, rafsi beginning with vowels, and rafsi ending in "'y" do not exist.

 e.g. {bardykentauru}, {.algyro'i}, {sorpeka'ykla} aren't words

rafsi with CVCy shape are illegal if the corresponding CVC rafsi is legal in the situation.

 e.g. {jbobanyjvo} isn't a word, only {jbobanjvo} is

rafsi with CVVr or CV'Vr shape are only recognized as rafsi if using the corresponding CVV or CV'V rafsi would result in tosmabru.

 e.g. {lerpi'oci'arci'e} is a zihevla,
      {lerpi'oci'aci'e} is a lujvo,
      {ci'arci'e} is a lujvo

All brivla must have a consonant cluster within the first five letters after ' and y are removed. {ko'oinde} is not a word.
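
The five-letter rule can be sketched as follows. This is an illustration of the sentence above, not code taken from jbofihe; has_early_cluster is a made-up name, and commas and dots are simply ignored along with ' and y.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_early_cluster(word: str) -> bool:
    """True if two consonants are adjacent within the first five letters,
    counted after removing ' and y (and any , or . punctuation)."""
    letters = [ch for ch in word if ch.isalpha() and ch != "y"]
    head = letters[:5]
    return any(
        a in CONSONANTS and b in CONSONANTS
        for a, b in zip(head, head[1:])
    )


for w in ("ko'oinde", "tcini", "sorpeka"):
    print(w, has_early_cluster(w))
```

For {ko'oinde} the first five counted letters are k o o i n, which contain no cluster, so the test fails as described.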