Complex Languages and Writing Systems: Difference between revisions



The lerfu shifts (BY1) consist of these cmavo: '''ga'e''', '''ge'o''', '''je'o''', '''jo'o''', '''lo'a''', '''na'a''', '''ru'o''', '''se'e''', '''to'a'''.
'''Standard of Complexity'''


==  Usage observations ==
Languages


For past usage, I am searching through the corpus I have collected of 900 kilobytes of pure-Lojban text. The corpus includes all texts published at lojban.org/files/texts, many large texts from the Wiki, IRC logs, and texts from the CVS server such as Alice.
* significantly ambiguous grammars.
** demonstrable context sensitivity.


===  ga'e, to'a (case shifts) ===
*** I know of no natural spoken language with a [[jbocre: context free grammar|context free grammar]].
* conjugation


''ga'e'' is only used correctly in [http://www.lojban.org/files/texts/a algebra], to mark variables named with capital letters. The author assumed that the shift would apply across multiple lerfu strings.
** numerous irregularities in conjugation
* [[jbocre: symbolism|symbolism]]


''to'a'' is never used correctly.
* inflection


''lapoi pelxu ku'o trajynobli'' contains the sentence ".itu'e ga'e ca cpedu fi do to'a". ''ga'e'' and ''to'a'' here both act as pro-sumti, which was probably not intended. Here, ''ga'e'' and ''to'a'' were probably intended to "capitalize" (emphasize) the words between them, but as lerfu modifiers they cannot modify the emphasis of words.
Writing systems


===  ge'o, je'o, jo'o, lo'a, ru'o (alphabet shifts) ===
* many transcriptions for the same verbal message
* many/multiple notations for the same audio sound in semi-phonetic alphabets.


None of these are used anywhere in the corpus, except that the utterance "zo ru'o" appeared on IRC in response to a line of Russian text.
* non-phonetic notation systems (kanji, etc).


'''zai''' has not been assigned to this section, but it has a similar function to the above cmavo. It is also not used anywhere in the corpus.
----


===  na'a (cancel shifts) ===
* Languages
** English


This word is not used in the corpus, though ''nau'' was used in the algebra text where ''na'a'' was probably intended.
** Ancient Greek (Nicolas?!)
** Finnish (???)


===  se'e (character code) ===
** Basque
** Various Native American languages ''Oh, tell me more!'' --mi'e .aulun. who nevertheless might ''participate'' ;-)


This word is not used in the corpus.
*** Any particular ones?
* Writing Systems


==  Proposed definitions ==
** Assyrian cuneiform
** Other mixed ideograph-phonetic systems, such as the one used for the first great poetry collection in Japanese.


;ga'e: Converts future letterals to uppercase. The change applies until it is shifted back with ''to'a'' or cancelled with ''na'a''.
*** How does this example differ from modern Japanese, which has 1 ideographic and 2 phonetic writing systems, which can be, and ''are'', mixed within the same text?
**** they're visually rather different, whereas i think the heian (heinian? whatever) era writing system was just one big jumbled mess. (maybe i'm responding to something i wrote ages ago. oh well.)


;na'a: Cancels all shifts (font, case, etc.) currently applied to letterals. Any shifts that occur earlier in the text do not affect letterals from this point on.
**** man'yougana, as used in the Man'youshu, are ideographic characters used for their phonetic value -- the prototypes from which the current syllabaries derived by simplification. The complexity lies in the fact that AIUI in the Man'youshu, they were not simplified but used alongside ideographic characters used for their meaning. As a bad comparison, it would be a bit like having "4tunes" in English and having to figure out whether those symbols refer to more than three melodies ("4" used for its meaning), or to fates/incidents of luck/wealth ("4" used for its sound). --[[jbocre: pne|pne]]
**** Japanese is quite a good example of "complexity" (as far as I understand the term correctly). There are several different "systems" in parallel, and one has to choose the right one. Yet this is also a feature of modern Japanese, where you have, e.g., one kanji (hanzi) character and must decide from context how to pronounce it, choosing from sometimes up to, say, three or four different possibilities: the genuine Japanese pronunciation, or one of several historical "Chinese" pronunciations (e.g. Chinese "ren": hito, jin, nin). It's a bit like the various forms of Latin or French loans in English. In this context, Chinese also has its "complexity" (not to mention homophones!) ''--aulun.''


;se'e: Converts the next sequence of digits to a character code in ASCII, Unicode, or some other agreed-upon character set. The code includes all digits until the next non-digit, the end of the letteral sequence, or ''na'a''.
Please do tell me what's "complexity" regarding Finnish? ''--aulun.''


;to'a: Converts future letterals to lowercase.  The change applies until it is shifted back with ''ga'e'' or cancelled with ''na'a''.
''What's complex about Finnish & Basque? Just because they are far from English?''


;ge'o: Converts future letterals to the Greek alphabet.  The change applies until it is shifted by ''je'o'', ''jo'o'', ''lo'a'', or ''ru'o'', or cancelled with ''na'a''.
''Why don't you start by explaining what your standard of complexity is?''


;je'o: Converts future letterals to the Hebrew alphabet.  The change applies until it is shifted by ''ge'o'', ''jo'o'', ''lo'a'', or ''ru'o'', or cancelled with ''na'a''.
''Anything more than Esperanto is too complex for me ;-)''


;jo'o: Converts future letterals to the Arabic alphabet. The change applies until it is shifted by ''je'o'', ''ge'o'', ''lo'a'', or ''ru'o'', or cancelled with ''na'a''.
''.i ma te zmadu (More in what way?)''


;lo'a: Converts future letterals to the Lojban (Roman) alphabet.  The change applies until it is shifted by ''je'o'', ''jo'o'', ''ge'o'', or ''ru'o'', or cancelled with ''na'a''.
I would assume ''in complexity''.


;ru'o: Converts future letterals to the Russian (Cyrillic) alphabet.  The change applies until it is shifted by ''je'o'', ''jo'o'', ''lo'a'', or ''ge'o'', or cancelled with ''na'a''.
CIRCULAR LOGIC! A language is complex if it is complex?


==  Proposed keywords ==
* A.K.A. the reflexive principle
* The question was ''ma te zmadu'' and obviously the answer is ''le ka pluja''.


ga'e: uppercase shift
* the question should've been ''ma se pluja'' or ''pluja fi ma'' (or perhaps ''pluja mama'').
 
na'a: cancel shifts
 
se'e: character code
 
to'a: lowercase shift
 
ge'o: Greek shift
 
je'o: Hebrew shift
 
jo'o: Arabic shift
 
lo'a: Lojban shift, Roman shift
 
ru'o: Russian shift, Cyrillic shift
 
==  Changes ==
 
===  Clarification of scope ===
 
The scope of a letteral shift needs to be defined. I will elaborate on Arnt's specification in [[BPFK Section: lerfu Forming cmavo]], also following the "Microsoft Word model" specified at [[jbocre: Interpretive conventions for lerfu formatting cmavo|Interpretive conventions for lerfu formatting cmavo]].
 
A letteral shift lasts until another shift of the same type replaces it, or it is cancelled by ''na'a''.
 
(The sole usage of ''ga'e'' assumed that it would not end at the end of a lerfu string.)
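
The scope rule above can be sketched as a small state machine. This is only an illustration, not part of the proposal: the function name <code>apply_shifts</code>, the token-list input, and the assumed defaults (lowercase, Roman alphabet when no shift is active) are hypothetical choices made for the example.

```python
# Hypothetical sketch of the proposed scope rule for lerfu shifts.
# Assumption: with no shift active, letterals are lowercase Roman.
CASE_SHIFTS = {"ga'e": "upper", "to'a": "lower"}
ALPHABET_SHIFTS = {
    "ge'o": "Greek",
    "je'o": "Hebrew",
    "jo'o": "Arabic",
    "lo'a": "Roman",
    "ru'o": "Cyrillic",
}
CANCEL = "na'a"
DEFAULT_CASE = "lower"
DEFAULT_ALPHABET = "Roman"


def apply_shifts(words):
    """Return (lerfu_word, case, alphabet) for every non-shift word."""
    case, alphabet = DEFAULT_CASE, DEFAULT_ALPHABET
    result = []
    for word in words:
        if word in CASE_SHIFTS:
            # A case shift replaces only the previous case shift.
            case = CASE_SHIFTS[word]
        elif word in ALPHABET_SHIFTS:
            # An alphabet shift replaces only the previous alphabet shift.
            alphabet = ALPHABET_SHIFTS[word]
        elif word == CANCEL:
            # na'a cancels every shift currently in force.
            case, alphabet = DEFAULT_CASE, DEFAULT_ALPHABET
        else:
            result.append((word, case, alphabet))
    return result


print(apply_shifts(["ga'e", "abu", "by", "to'a", "cy"]))
```

Keeping case and alphabet as two independent pieces of state is what makes "another shift of the same type replaces it" work: ''ru'o'' after ''ge'o'' changes the alphabet but leaves an earlier ''ga'e'' in force.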
 
It has not yet been specified where a ''se'e'' construct should end; I propose that it can be terminated with ''na'a'', because ''na'a'' terminates other sorts of shifts.
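
As a rough sketch of how a ''se'e'' construct might be read under this proposal: the code below assumes decimal digits and Unicode code points, neither of which the proposal fixes, and the function name <code>decode_see</code> is hypothetical. Whether a terminating ''na'a'' should be consumed as part of the construct is also an assumption made here.

```python
# Hypothetical sketch: read the digits after se'e as a character code.
# Assumptions: decimal digits only, Unicode code points, and a
# terminating na'a is counted as part of the construct.
DIGITS = {
    "no": 0, "pa": 1, "re": 2, "ci": 3, "vo": 4,
    "mu": 5, "xa": 6, "ze": 7, "bi": 8, "so": 9,
}


def decode_see(words):
    """Return (character, number_of_words_consumed) for a se'e construct."""
    if not words or words[0] != "se'e":
        raise ValueError("expected the sequence to start with se'e")
    code = 0
    n_digits = 0
    consumed = 1  # the se'e itself
    for word in words[1:]:
        if word in DIGITS:
            code = code * 10 + DIGITS[word]
            n_digits += 1
            consumed += 1
            continue
        if word == "na'a":
            consumed += 1  # explicit terminator
        break  # any non-digit ends the code
    if n_digits == 0:
        raise ValueError("se'e must be followed by at least one digit")
    return chr(code), consumed


print(decode_see(["se'e", "xa", "mu", "na'a", "by"]))
```

The three stopping conditions in the proposed definition (next non-digit, end of the letteral sequence, ''na'a'') map onto the <code>break</code>, the end of the loop, and the explicit terminator branch respectively.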
 
One possible interpretive convention for these cmavo (apparently intended by the founders) is that a parenthetical shift or font-and-face change that is not followed by lerfu would be taken as applying to whole words, somewhat like a mark-up language. For example: "to'i ga'e toi mi to'i to'a toi klama" would be "MI klama".
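
The mark-up reading of the example above can be sketched as follows. This is a minimal illustration only: the function name <code>render</code> is hypothetical, it handles just the case shifts inside a three-word ''to'i ... toi'' parenthetical, and it assumes whitespace-separated words.

```python
# Hypothetical sketch of the "mark-up" convention: a to'i ... toi
# parenthetical containing only a case shift changes how the
# following whole words are written.
CASE_SHIFTS = {"ga'e": True, "to'a": False}  # True means uppercase


def render(text):
    """Render Lojban text, applying parenthetical case shifts to words."""
    words = text.split()
    upper = False
    output = []
    i = 0
    while i < len(words):
        is_shift_parenthetical = (
            words[i] == "to'i"
            and i + 2 < len(words)
            and words[i + 1] in CASE_SHIFTS
            and words[i + 2] == "toi"
        )
        if is_shift_parenthetical:
            upper = CASE_SHIFTS[words[i + 1]]
            i += 3  # skip to'i, the shift, and toi
            continue
        output.append(words[i].upper() if upper else words[i])
        i += 1
    return " ".join(output)


print(render("to'i ga'e toi mi to'i to'a toi klama"))
```

The parentheticals themselves produce no output; they only switch the rendering state, which is what makes the convention resemble a mark-up language.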
 
===  Omission of unused cmavo ===
 
Given that Lojban does not seem to be intended for holding multilingual spelling bees, and that a dictionary containing many unused cmavo with bizarre functions could confuse learners of the language, the BPFK does not recommend including the alphabet shifts (ge'o, je'o, jo'o, lo'a, ru'o) in learning materials, even those intended for advanced learners. The cmavo should not be reassigned to have other meanings, however.
 
==  Impact ==
 
The clarifications made to the scope of lerfu shifts give a consistent model of how shifts should be applied, and do not invalidate any known usage.
 
I believe that my scope clarifications are consistent with those in [[BPFK Section: lerfu Forming cmavo]], even though that page says otherwise.
 
Given the lack of usage of alphabet shifts, omitting the unused alphabet shifts from learning materials should not have any significant impact on the language.

Revision as of 16:45, 4 November 2013

Standard of Complexity

Languages

  • significantly ambiguous grammars.
    • demonstrable context sensitivity.
    • numerous irregularities in conjugation
  • symbolism
  • inflection

Writing systems

  • many transcriptions for the same verbal message
  • many/multiple notations for the same audio sound in semi-phonetic alphabets.
  • non-phonetic notation systems (kanji, etc).

  • Languages
    • English
    • Ancient Greek (Nicolas?!)
    • Finnish (???)
    • Basque
    • Various Native American languages Oh, tell me more! --mi'e .aulun. who nevertheless might participate ;-)
      • Any particular ones?
  • Writing Systems
    • Assyrian cuneiform
    • Other mixed ideograph-phonetic systems, such as the one used for the first great poetry collection in Japanese.
      • How does this example differ from modern Japanese, which has 1 ideographic and 2 phonetic writing systems, which can be, and are, mixed within the same text?
        • they're visually rather different, whereas i think the heian (heinian? whatever) era writing system was just one big jumbled mess. (maybe i'm responding to something i wrote ages ago. oh well.)
        • man'yougana, as used in the Man'youshu, are ideographic characters used for their phonetic value -- the prototypes from which the current syllabaries derived by simplification. The complexity lies in the fact that AIUI in the Man'youshu, they were not simplified but used alongside ideographic characters used for their meaning. As a bad comparison, it would be a bit like having "4tunes" in English and having to figure out whether those symbols refer to more than three melodies ("4" used for its meaning), or to fates/incidents of luck/wealth ("4" used for its sound). --pne
        • Japanese is quite a good example of "complexity" (as far as I understand the term correctly). There are several different "systems" in parallel, and one has to choose the right one. Yet this is also a feature of modern Japanese, where you have, e.g., one kanji (hanzi) character and must decide from context how to pronounce it, choosing from sometimes up to, say, three or four different possibilities: the genuine Japanese pronunciation, or one of several historical "Chinese" pronunciations (e.g. Chinese "ren": hito, jin, nin). It's a bit like the various forms of Latin or French loans in English. In this context, Chinese also has its "complexity" (not to mention homophones!) --aulun.

Please do tell me what's "complexity" regarding Finnish? --aulun.

What's complex about Finnish & Basque? Just because they are far from English?

Why don't you start by explaining what your standard of complexity is?

Anything more than Esperanto is too complex for me ;-)

.i ma te zmadu (More in what way?)

I would assume in complexity.

CIRCULAR LOGIC! A language is complex if it is complex?

  • A.K.A. the reflexive principle
  • The question was ma te zmadu and obviously the answer is le ka pluja.
  • the question should've been ma se pluja or pluja fi ma (or perhaps pluja mama).