ISO-generated fu'ivla for scripts

From Lojban
Revision as of 07:25, 30 November 2015 by Krtisfranks (talk | contribs) (Including Examples)

krtisfránks proposes the adoption of a convention for easily naming scripts in Lojban. Moreover, it is suggested that the convention follow the paradigm established for other ISO-generated fu'ivla.

Source Standard

The proposed source material would be ISO 15924, probably the Alpha-4 codes.

Complications

There are minor complications.

#1: Length

First, the Alpha-4 codes are long, especially compared to the codes for countries, languages, and currencies, each of which is three letters long; however, this is the standard, and there is no reasonable alternative (at least in ISO). Thus, we pretty much just need to accept them. We could go with the numerical codes instead, but those would be more difficult to remember and to produce on the fly, and they would require the creation of a new (non-conflicting) system. (By the way, we should probably develop such a system in any case, but I will hereinafter assume an Alpha-n code.)

#2: Capitalization

Second, exactly the first letter of each script code is capitalized, unlike the other codes, which are uniform in casing or do not seem to care. krtisfránks is not yet sure if this is a requirement of the standard. If it is, then we can ignore this issue because any output from the borrowing algorithm will just be assumed to follow this rule (wherein the first letter is capital and all others are lowercase). If not, then we must address this convention. In any case, if really desired, we could create an additional syllable that indicates the capitalization of exactly the next letter (the others defaulting to lowercase). This could be useful for sticklers who want to be careful with all codes or for use with codes wherein capitalization is important.

#3: Introducer Selection

Third, and this is the important one, we have to choose an appropriate gismu as the introducer. Three options immediately come to mind: "ciska", "lerfu", and "cilfu".

  • "ciska" is not entirely appropriate since it cares about the writer, the medium, and the ink in addition to what is actually written. Even then, it is really about the writing itself rather than the system of symbols that is being employed. It does have a cmarafsi which might be nice but it is not overly useful.
  • "lerfu" is pretty nice. It would be better if we could use "selyle'u", but at least the relationship is there. The downsides are that it ends with "-u", making it potentially confusing since the letters in the code will usually be turned into syllables of the form '-Cu-'. For example, "lerfuzumutuxe" (Zmth) might be confused for "Fzmt... oh, there is an 'h'...". Also, if we ever want to name individual symbols using an introducer (such as borrowing from Unicode or random names), the two conventions in Lojban would at least be confusing- and they could easily conflict or lead to garden-pathing. There might already be zi'evla which cause such problems. The upside is that there are cmarafsi, so the word length can be reduced and we can more closely follow the model of "bangu"-introduced borrowings than "gugde"-introduced ones. (But we would have to adapt since "lerfu" is not as versatile as "bangu". This is a small technical matter. The cmarafsi of "lerfu" are much nicer than of "ciska".)
  • "cilfu" does not have the problem with semantics that the "ciska" and "lerfu" have; it definitely means "script" and can only mean that (no need to worry about borrowing names for individual symbols). It suffers from the terminal "-u" issue though. It also has no cmarafsi. And it is redundant. Why create a fu'ivla when a link-sumti construct would do? On the other hand, this redundancy could be viewed as making this word the prime candidate for the job. (I do think that the codes should be borrowed; so, if the job must be done, maybe this is the best way to do it.)

krtisfránks personally likes "lerfu" or "cilfu" for this role. Perhaps if their final vowels are edited ("lerfa"? "cilfe"?), they would be better. Then it is a matter of choosing either redundancy and word length or potential conflicts with future borrowings.

An alternative would be to let the introducer be "slerfa" or something similar, which would stand in for "selyle'u" and correct the terminal "-u"; this might even lend itself to shorter words in some situations ("sler-").

In any case, after the introducer is selected, assuming that the first two issues raised are resolved nicely, the translation algorithm is the same as for the other ISO-generated fu'ivla and is ready to go.

Mapping

Each vowel V (except "Y"/"y") will be mapped to V if it is the first letter in the code and to 'V if it is any subsequent letter in the code.

Each consonant C (except for "H"/"h", "Q"/"q", and "W"/"w") will map to Cu. This holds regardless of the letter's pronunciation in English, French, Latin, or the language that motivated the ISO code designation.

"Y"/"y" will map to je.

"H"/"h" will map to xe.

"Q"/"q" will map to ke.

"W"/"w" will map to ve.

Examples

The aforementioned issues can be resolved by various assumptions; what follows is a series of examples representing different such resolutions. It is not meant to be complete, only demonstrative.

NOTE: THIS IS NOT AN ENDORSEMENT OF ANY OF THESE ASSUMPTIONS!

Type A

Assume that: 1) Alpha-4 codes are used; 2) Casing does not matter; 3) "lerfu" is used as our introducer (this is the biggest caveat to this example).

Then:

  • The introducer is "ler-" unless the code begins with a vowel (other than "Y"), an "F", or an "R".
  • In the case of an initial vowel (other than "Y"), the introducer is "lerf-".
  • If the code begins with an "F", then the introducer is "le'ur-".
  • If the code begins with an "R", then the introducer is "le'un-".


The name for the script described/named as Mathematical Notation, which is assigned code 'Zmth', would be "lerzumutuxe". More illustratively:

Normal examples:

  • Aaaa -> lerfa'a'a'a.
  • Abaa -> lerfabu'a'a.
  • Baaa -> lerbu'a'a'a.
  • Bbbb -> lerbubububu.

"F"-initial examples:

  • Faaa -> le'urfu'a'a'a.
  • Fbaa -> le'urfubu'a'a.
  • Fbbb -> le'urfubububu.

"R"-initial examples:

  • Raaa -> le'unru'a'a'a.
  • Rbaa -> le'unrubu'a'a.
  • Rbbb -> le'unrubububu.
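
The Type A introducer rules can be sketched in Python as below. As with Type A itself, this is purely demonstrative and not an endorsement: the function names type_a_introducer and type_a_name are hypothetical, and the already-mapped letters of the code (for example "zumutuxe" for Zmth) are passed in as a plain string rather than computed here.

```python
def type_a_introducer(code: str) -> str:
    """Pick the Type A introducer from the first letter of the code."""
    if not code:
        raise ValueError("empty code")
    first = code[0].lower()  # casing does not matter under Type A
    if first in "aeiou":     # initial vowel other than "Y"
        return "lerf"
    if first == "f":
        return "le'ur"
    if first == "r":
        return "le'un"
    return "ler"


def type_a_name(code: str, mapped_letters: str) -> str:
    """Prefix the mapped letters of a code with its Type A introducer."""
    return type_a_introducer(code) + mapped_letters


print(type_a_name("Zmth", "zumutuxe"))   # lerzumutuxe
print(type_a_name("Aaaa", "a'a'a'a"))    # lerfa'a'a'a
print(type_a_name("Faaa", "fu'a'a'a"))   # le'urfu'a'a'a
print(type_a_name("Raaa", "ru'a'a'a"))   # le'unru'a'a'a
```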



NOTE: THIS IS NOT AN ENDORSEMENT OF ANY OF THESE ASSUMPTIONS!

  • In particular, I am not advocating for the adoption of "lerfu" as the introducer.