Complex Languages and Writing Systems: Difference between revisions

From Lojban
 
 
(9 intermediate revisions by 3 users not shown)
 
==Standard of Complexity==
The lerfu shifts (BY1) consist of these cmavo: '''ga'e''', '''ge'o''', '''je'o''', '''jo'o''', '''lo'a''', '''na'a''', '''ru'o''', '''se'e''', '''to'a'''.
===Languages===
 
* significantly ambiguous grammars.
== Usage observations ==
** demonstrable context sensitivity.
 
*** I know of no natural spoken language with a [[context free grammars|context free grammar]].
To survey past usage, I am searching a corpus I have collected, comprising 900 kilobytes of pure-Lojban text. The corpus includes all texts published at lojban.org/files/texts, many large texts from the Wiki, IRC logs, and texts from the CVS server such as Alice.
* conjugation
 
** numerous irregularities in conjugation
=== ga'e, to'a (case shifts) ===
* [[symbolism]]
 
* inflection
''ga'e'' is only used correctly in [http://www.lojban.org/files/texts/a algebra], to mark variables named with capital letters. The author assumed that the shift would apply across multiple lerfu strings.
===Writing systems===
 
* many transcriptions for the same verbal message
''to'a'' is never used correctly.
* many/multiple notations for the same audio sound in semi-phonetic alphabets.
 
* non-phonetic notation systems (kanji, etc).
''lapoi pelxu ku'o trajynobli'' contains the sentence ".itu'e ga'e ca cpedu fi do to'a". ''ga'e'' and ''to'a'' here both act as pro-sumti, which was probably not intended. Here, ''ga'e'' and ''to'a'' were probably intended to "capitalize" (emphasize) the words between them, but as lerfu modifiers they cannot modify the emphasis of words.
===Examples===
 
* Languages
=== ge'o, je'o, jo'o, lo'a, ru'o (alphabet shifts) ===
** English
 
** Ancient Greek (Nicolas?!)
None of these are used anywhere in the corpus, except that the utterance "zo ru'o" appeared on IRC in response to a line of Russian text.
** Finnish (???)
 
** Basque
'''zai''' has not been assigned to this section, but it has a similar function to the above cmavo. It is also not used anywhere in the corpus.
** Various Native American languages
 
***aulun:
=== na'a (cancel shifts) ===
****Oh, tell me more! Who nevertheless might <u>participate</u>.
 
*** Any particular ones?
This word is not used in the corpus, though ''nau'' was used in the algebra text where ''na'a'' was probably intended.
* Writing Systems
 
** Assyrian cuneiform
===  se'e (character code) ===
** Other mixed ideograph-phonetic systems, such as the one used for the first great poetry collection in Japanese.
 
*** How does this example differ from modern Japanese, which has 1 ideographic and 2 phonetic writing systems, which can be and <u>are</u> mixed all in the same text?
This word is not used in the corpus.
**** they're visually rather different, whereas i think the heian (heinian? whatever) era writing system was just one big jumbled mess. (maybe i'm responding to something i wrote ages ago. oh well.)
 
****[[pne]]:
==  Proposed definitions ==
*****man'yougana, as used in the Man'youshu, are ideographic characters used for their phonetic value - the prototypes from which the current syllabaries derived by simplification. The complexity lies in the fact that AIUI in the Man'youshu, they were not simplified but used alongside ideographic characters used for their meaning. As a bad comparison, it would be a bit like having "4tunes" in English and having to figure out whether those symbols refer to more than three melodies ("4" used for its meaning), or to fates/incidents of luck/wealth ("4" used for its sound).
 
*****aulun:
;ga'e: Converts future letterals to uppercase. The change applies until it is shifted back with ''to'a'' or cancelled with ''na'a''.
******Japanese is quite a good example for "complexity" (as far as I'm understanding the term correctly). There are several different "systems" in parallel, and one has to choose the right one. Yet, this is also a feature of modern Japanese where you e.g. have one kanji (hanzi) character and you must decide how to pronounce it, choosing from sometimes up to, say, three or four different possibilities from context: genuine Japanese pronunciation, or several historical "Chinese" pronunciations (e.g. Chinese "ren": hito, jin, nin). It's a bit like various forms of Latin or French loans in English. In this context, Chinese also has its "complexity" (not at all speaking of homophones!)
 
****aulun:
;na'a: Cancels all shifts (font, case, etc.) currently applied to letterals. Any shifts that occur earlier in the text do not affect letterals from this point on.
*****Please do tell me what's "complexity" regarding Finnish?
 
****What's complex about Finnish & Basque? Just because they are far from English?
;se'e: Convert the next sequence of digits to a character code in ASCII, Unicode, or some other agreed-upon character set. The code includes all digits until the next non-digit, the end of the letteral sequence, or ''na'a''.
****Why don't you start by explaining what your standard of complexity is?
 
****Anything more than Esperanto is too complex for me ;-)
;to'a: Converts future letterals to lowercase. The change applies until it is shifted back with ''ga'e'' or cancelled with ''na'a''.
*****.i ma te zmadu (More in what way?)
 
******I would assume <u>in complexity</u>.
;ge'o: Converts future letterals to the Greek alphabet. The change applies until it is shifted by ''je'o'', ''jo'o'', ''lo'a'', or ''ru'o'', or cancelled with ''na'a''.
******CIRCULAR LOGIC! A language is complex if it is complex?
 
****** A.K.A. the reflexive principle
;je'o: Converts future letterals to the Hebrew alphabet.  The change applies until it is shifted by ''ge'o'', ''jo'o'', ''lo'a'', or ''ru'o'', or cancelled with ''na'a''.
****** The question was '''ma te zmadu''' and obviously the answer is '''le ka pluja'''.
 
******* The question should've been '''ma se pluja''' or '''pluja fi ma''' (or perhaps '''pluja mama''').
;jo'o: Converts future letterals to the Arabic alphabet. The change applies until it is shifted by ''je'o'', ''ge'o'', ''lo'a'', or ''ru'o'', or cancelled with ''na'a''.
 
;lo'a: Converts future letterals to the Lojban (Roman) alphabet.  The change applies until it is shifted by ''je'o'', ''jo'o'', ''ge'o'', or ''ru'o'', or cancelled with ''na'a''.
 
;ru'o: Converts future letterals to the Russian (Cyrillic) alphabet. The change applies until it is shifted by ''je'o'', ''jo'o'', ''lo'a'', or ''ge'o'', or cancelled with ''na'a''.
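The definitions above can be read as a small state machine. The following is only a rough sketch of that reading, not an official parser: the function name <code>apply_shifts</code>, the token-list input, and the choice of lowercase Roman as the state that ''na'a'' returns to are all assumptions made for illustration.

```python
# Sketch of the proposed shift model: one "case" slot and one "alphabet"
# slot, each replaced by a shift of the same type and reset by na'a.
# ASSUMPTION: the unshifted state is lowercase Roman (Lojban) letters.

CASE_SHIFTS = {"ga'e": "upper", "to'a": "lower"}
ALPHABET_SHIFTS = {
    "ge'o": "Greek",
    "je'o": "Hebrew",
    "jo'o": "Arabic",
    "lo'a": "Roman",
    "ru'o": "Cyrillic",
}
CANCEL = "na'a"
DEFAULT_CASE = "lower"        # assumption, see above
DEFAULT_ALPHABET = "Roman"    # assumption, see above


def apply_shifts(tokens):
    """Return (lerfu, alphabet, case) for every non-shift token.

    `tokens` is a list of cmavo already split into words, e.g.
    ["ga'e", "abu", "by"]. Anything that is not a shift or na'a is
    treated as a letteral and tagged with the current state.
    """
    case, alphabet = DEFAULT_CASE, DEFAULT_ALPHABET
    tagged = []
    for tok in tokens:
        if tok in CASE_SHIFTS:
            case = CASE_SHIFTS[tok]            # replaces the previous case shift
        elif tok in ALPHABET_SHIFTS:
            alphabet = ALPHABET_SHIFTS[tok]    # replaces the previous alphabet shift
        elif tok == CANCEL:
            case, alphabet = DEFAULT_CASE, DEFAULT_ALPHABET
        else:
            tagged.append((tok, alphabet, case))
    return tagged


if __name__ == "__main__":
    for item in apply_shifts(["ga'e", "abu", "by", "to'a", "cy", "ge'o", "dy", "na'a", "fy"]):
        print(item)
```

Because the state is kept across the whole token list, a ''ga'e'' stays in force across several lerfu strings until ''to'a'' or ''na'a'' appears, which matches what the algebra text assumed.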
 
==  Proposed keywords ==
 
ga'e: uppercase shift
 
na'a: cancel shifts
 
se'e: character code
 
to'a: lowercase shift
 
ge'o: Greek shift
 
je'o: Hebrew shift
 
jo'o: Arabic shift
 
lo'a: Lojban shift, Roman shift
 
ru'o: Russian shift, Cyrillic shift
 
==  Changes ==
 
===  Clarification of scope ===
 
The scope of a letteral shift needs to be defined. I will elaborate on Arnt's specification in [[BPFK Section: lerfu Forming cmavo]], also following the "Microsoft Word model" specified at [[jbocre: Interpretive conventions for lerfu formatting cmavo|Interpretive conventions for lerfu formatting cmavo]].
 
A letteral shift lasts until another shift of the same type replaces it, or it is cancelled by ''na'a''.
 
(The sole usage of ''ga'e'' assumed that it would not end at the end of a lerfu string.)
 
It has not yet been specified where a ''se'e'' construct should end; I propose that it can be terminated with ''na'a'', because ''na'a'' terminates other sorts of shifts.
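As an illustration of the proposed termination rule, here is a minimal sketch of reading a ''se'e'' construct. It assumes the digits are decimal and the agreed-upon character set is Unicode; the function name <code>read_char_code</code> and the token-list input are hypothetical, and the sketch simply consumes a terminating ''na'a'' without modelling its effect on other shifts.

```python
# Sketch: read "se'e" followed by digit cmavo and turn it into a character.
# ASSUMPTIONS: decimal digits, Unicode code points, tokens already split.

DIGITS = {
    "no": 0, "pa": 1, "re": 2, "ci": 3, "vo": 4,
    "mu": 5, "xa": 6, "ze": 7, "bi": 8, "so": 9,
}


def read_char_code(tokens, start=0):
    """Parse tokens[start:] as a se'e construct.

    Returns (character, index_of_next_unread_token). The code runs until
    the next non-digit, the end of the sequence, or na'a; a terminating
    na'a is consumed.
    """
    if start >= len(tokens) or tokens[start] != "se'e":
        raise ValueError("expected se'e at position %d" % start)
    i = start + 1
    value = 0
    digits_seen = 0
    while i < len(tokens) and tokens[i] in DIGITS:
        value = value * 10 + DIGITS[tokens[i]]
        digits_seen += 1
        i += 1
    if digits_seen == 0:
        raise ValueError("se'e must be followed by at least one digit")
    if i < len(tokens) and tokens[i] == "na'a":
        i += 1  # explicit terminator, as proposed
    return chr(value), i


if __name__ == "__main__":
    # xa mu = 65, which is "A" in both ASCII and Unicode
    print(read_char_code(["se'e", "xa", "mu", "na'a", "by"]))
```

Terminating with ''na'a'' matters when the character code is immediately followed by more digits that are not meant to be part of the code.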
 
One possible interpretive convention for these cmavo (apparently intended by the founders) is that a parenthetical shift or font-and-face change that is not followed by lerfu would be taken as applying to whole words, somewhat like a mark-up language. For example: "to'i ga'e toi mi to'i to'a toi klama" would be "MI klama".
 
===  Omission of unused cmavo ===
 
Given that Lojban does not seem to be intended for holding multilingual spelling bees, and that a dictionary containing many unused cmavo with bizarre functions could confuse learners of the language, the BPFK recommends against including the alphabet shifts (ge'o, je'o, jo'o, lo'a, ru'o) in learning materials, even those intended for advanced learners. The cmavo should not be reassigned to have other meanings, however.
 
==  Impact ==
 
The clarifications made to the scope of lerfu shifts give a consistent model of how shifts should be applied, and do not invalidate any known usage.
 
I believe that my scope clarifications are consistent with those in [[BPFK Section: lerfu Forming cmavo]], even though that page says otherwise.
 
Given the lack of usage of alphabet shifts, omitting the unused alphabet shifts from learning materials should not have any significant impact on the language.

Latest revision as of 12:26, 12 June 2015

Standard of Complexity

Languages

  • significantly ambiguous grammars.
    • demonstrable context sensitivity.
  • conjugation
    • numerous irregularities in conjugation
  • symbolism
  • inflection

Writing systems

  • many transcriptions for the same verbal message
  • many/multiple notations for the same audio sound in semi-phonetic alphabets.
  • non-phonetic notation systems (kanji, etc).

Examples

  • Languages
    • English
    • Ancient Greek (Nicolas?!)
    • Finnish (???)
    • Basque
    • Various Native American languages
      • aulun:
        • Oh, tell me more! Who nevertheless might participate.
      • Any particular ones?
  • Writing Systems
    • Assyrian cuneiform
    • Other mixed ideograph-phonetic systems, such as the one used for the first great poetry collection in Japanese.
      • How does this example differ from modern Japanese, which has 1 ideographic and 2 phonetic writing systems, which can be and are mixed all in the same text?
        • they're visually rather different, whereas i think the heian (heinian? whatever) era writing system was just one big jumbled mess. (maybe i'm responding to something i wrote ages ago. oh well.)
        • pne:
          • man'yougana, as used in the Man'youshu, are ideographic characters used for their phonetic value - the prototypes from which the current syllabaries derived by simplification. The complexity lies in the fact that AIUI in the Man'youshu, they were not simplified but used alongside ideographic characters used for their meaning. As a bad comparison, it would be a bit like having "4tunes" in English and having to figure out whether those symbols refer to more than three melodies ("4" used for its meaning), or to fates/incidents of luck/wealth ("4" used for its sound).
          • aulun:
            • Japanese is quite a good example for "complexity" (as far as I'm understanding the term correctly). There are several different "systems" in parallel, and one has to choose the right one. Yet, this is also a feature of modern Japanese where you e.g. have one kanji (hanzi) character and you must decide how to pronounce it, choosing from sometimes up to, say, three or four different possibilities from context: genuine Japanese pronunciation, or several historical "Chinese" pronunciations (e.g. Chinese "ren": hito, jin, nin). It's a bit like various forms of Latin or French loans in English. In this context, Chinese also has its "complexity" (not at all speaking of homophones!)
        • aulun:
          • Please do tell me what's "complexity" regarding Finnish?
        • What's complex about Finnish & Basque? Just because they are far from English?
        • Why don't you start by explaining what your standard of complexity is?
        • Anything more than Esperanto is too complex for me ;-)
          • .i ma te zmadu (More in what way?)
            • I would assume in complexity.
            • CIRCULAR LOGIC! A language is complex if it is complex?
            • A.K.A. the reflexive principle
            • The question was ma te zmadu and obviously the answer is le ka pluja.
              • The question should've been ma se pluja or pluja fi ma (or perhaps pluja mama).