Posted by PierreAbbat on Wed 30 of Jan., 2008 04:32 GMT posts: 324

On Tuesday 29 January 2008 15:42, arj wrote:
> Re: BPFK Section: PEG Morphology Algorithm
>
> Author: arj
>
> How is the lack of pauses following BY treated in the morphology? How should it be treated?
>
> CLL is terribly ambiguous on this point.

Pe'i the lack of a pause following BY should not be an error, but if it results in a brivla it should be parsed as a brivla. If the BY is followed by any number of CV cmavo and then a pause, this cannot be parsed as a brivla, but if BY is followed by CCV or CVV or CV'V, it might. So I suggest that if BY is followed by {bu}, the pause be moved after {bu} to keep the letteral without a pause in it.

If the "Y" is stressed, however, the BY should be treated as a letteral, regardless of whether a pause occurs before the next CCV or CVV. "y" is never stressed in brivla, and someone spelling a word may stress all the letter names so that they can be heard clearly.

Pierre


BPFK Section: PEG Morphology Algorithm

Posted by Anonymous on Wed 30 of Jan., 2008 11:45 GMT

> Author: arj
>
> How is the lack of pauses following BY treated in the morphology?

It accepts Cy without a pause as long as it is not both preceded directly by CV and followed directly by a brivla or by a CVV cmavo, i.e. as long as it cannot be mistaken for a part of a CVCy-lujvo.

{.y'y} never needs a final pause since it can never be the start of a lujvo.

mu'o mi'e xorxes


Posted by Anonymous on Wed 30 of Jan., 2008 11:59 GMT

On 1/29/08, arj wrote:
> Currently, camxes accepts commas anywhere. However, the CLL says that "The comma is used to indicate a syllable break within a word ...". I take this to mean that commas may only occur within words.

It simply ignores all commas.

It also accepts things like {v,a,,,,l,s,i} even though neither {v} nor {s}, nor {} are possible syllables.

It could be tweaked to accept commas only at syllable breaks, but I'm not sure it's worth legislating on that. camxes also accepts some symbols like "?" or "!" as spaces, even though I don't think CLL mentions them as alternatives for {.}.
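As a rough illustration of the behaviour described in this post (not camxes' actual code), a pre-processing pass could look like the Python sketch below. The function name `normalize_input` is hypothetical, and only the two symbols mentioned here ("?" and "!") are handled; any other symbols camxes may accept are not covered.

```python
def normalize_input(text: str) -> str:
    """Illustrative only: drop every comma and treat '?' and '!' as spaces."""
    # Commas are ignored wherever they occur, even outside syllable breaks.
    text = text.replace(",", "")
    # Assumption: the listed symbols behave exactly like a space.
    for mark in "?!":
        text = text.replace(mark, " ")
    return text


print(normalize_input("v,a,,,,l,s,i"))
```

Under this reading, {v,a,,,,l,s,i} reduces to the same string as {valsi} before any morphology is applied, which is why the odd "syllables" never cause an error.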

mu'o mi'e xorxes


Posted by arj on Wed 30 of Jan., 2008 20:57 GMT posts: 953

On Wed, Jan 30, 2008 at 08:43:52AM -0300, Jorge Llambías wrote:
> > Author: arj
> >
> > How is the lack of pauses following BY treated in the morphology?
>
> It accepts Cy without a pause as long as it is not both preceded directly by CV and followed directly by a brivla or by a CVV cmavo, i.e. as long as it cannot be mistaken for a part of a CVCy-lujvo.
>
> {.y'y} never needs a final pause since it can never be the start of a lujvo.

Shortly after I wrote the original question, Stephen Pollei on IRC pointed out that chapter 4 of the CLL says that "A cmavo of the form ``Cy'' must be followed by a pause unless another ``Cy''-form cmavo follows."

So the CLL is actually unambiguous, and the machine morphology is incorrect.
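To make the CLL sentence concrete, here is a minimal Python sketch of that rule taken literally. It assumes the text has already been split into words, with "." standing for a pause; the function name `cy_pause_violations`, the token representation, and the choice to treat the end of the text as a pause are all assumptions made for the example, not anything from camxes or the official parser.

```python
import re

# A Cy-form cmavo: one Lojban consonant followed by "y".
CY = re.compile(r"[bcdfgjklmnprstvxz]y")


def cy_pause_violations(tokens):
    """Return indexes of Cy cmavo not followed by a pause or another Cy cmavo."""
    bad = []
    for i, tok in enumerate(tokens):
        if not CY.fullmatch(tok):
            continue
        # Assumption: running off the end of the text counts as a pause.
        nxt = tokens[i + 1] if i + 1 < len(tokens) else "."
        if nxt != "." and not CY.fullmatch(nxt):
            bad.append(i)
    return bad


print(cy_pause_violations(["le", "ky", "moi"]))
```

Here {le ky moi} without a pause is flagged, while {le ky . moi} and a run such as {by cy dy . bu} are not.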

-- Arnt Richard Johansen http://arj.nvg.org/ Du klickar bara på en ikon så SER DU DITT LOKALA NÄTVERK. --Z mag@zine lovpriser Win95 i nr. 7/95


Posted by JohnCowan on Wed 30 of Jan., 2008 21:07 GMT posts: 149

Arnt Richard Johansen scripsit:

> Shortly after I wrote the original question, Stephen Pollei on IRC pointed out that chapter 4 of the CLL says that "A cmavo of the form ``Cy'' must be followed by a pause unless another ``Cy''-form cmavo follows."
>
> So the CLL is actually unambiguous, and the machine morphology is incorrect.

That statement was not meant by me to be the whole truth. At that time there was no effective mechanization of the morphology, and I put it in as a rule of thumb.

-- John Cowan cowan@ccil.org http://www.ccil.org/~cowan
But the next day there came no dawn, and the Grey Company passed on into the darkness of the Storm of Mordor and were lost to mortal sight; but the Dead followed them. --"The Passing of the Grey Company"


Posted by JohnCowan on Fri 01 of Feb., 2008 15:22 GMT posts: 149

Pierre Abbat scripsit:

> Pe'i the lack of a pause following BY should not be an error, but if it results in a brivla it should be parsed as a brivla. If the BY is followed by any number of CV cmavo and then a pause, this cannot be parsed as a brivla, but if BY is followed by CCV or CVV or CV'V, it might. So I suggest that if BY is followed by {bu}, the pause be moved after {bu} to keep the letteral without a pause in it.
>
> If the "Y" is stressed, however, the BY should be treated as a letteral, regardless of whether a pause occurs before the next CCV or CVV. "y" is never stressed in brivla, and someone spelling a word may stress all the letter names so that they can be heard clearly.

This proposal sounds plausible to me.

-- John Cowan cowan@ccil.org http://www.ccil.org/~cowan Thor Heyerdahl recounts his attempt to prove Rudyard Kipling's theory that the mongoose first came to India on a raft from Polynesia. --blurb for Rikki-Kon-Tiki-Tavi


Posted by arj on Mon 04 of Feb., 2008 02:11 GMT posts: 953

On Fri, Feb 01, 2008 at 10:19:35AM -0500, John Cowan wrote:
> Pierre Abbat scripsit:
>
> > Pe'i the lack of a pause following BY should not be an error, but if it results in a brivla it should be parsed as a brivla. If the BY is followed by any number of CV cmavo and then a pause, this cannot be parsed as a brivla, but if BY is followed by CCV or CVV or CV'V, it might. So I suggest that if BY is followed by {bu}, the pause be moved after {bu} to keep the letteral without a pause in it.
> >
> > If the "Y" is stressed, however, the BY should be treated as a letteral, regardless of whether a pause occurs before the next CCV or CVV. "y" is never stressed in brivla, and someone spelling a word may stress all the letter names so that they can be heard clearly.
>
> This proposal sounds plausible to me.

Maybe, but having just joined the Dot Side, I am wary of morphological rules that need a lot of context to reliably disambiguate.

How sure can we be that it is possible for humans to learn Pierre's system? I am receptive to persuasive arguments, as I have been in the past. :-)

(On a side note, is it really legal to stress y in any circumstance?)

-- Arnt Richard Johansen http://arj.nvg.org/ Jeg er nok verdens sydligste sengevæter. Forutsatt at ingen på basen på Sydpolen driver med slikt, da. --Erling Kagge: Alene til Sydpolen


Posted by JohnCowan on Mon 04 of Feb., 2008 02:18 GMT posts: 149

Arnt Richard Johansen scripsit:

> How sure can we be that it is possible for humans to learn Pierre's system? I am receptive to persuasive arguments, as I have been in the past. :-)

There's what's legal, and then there's what's advisable. The name mgrvgrvlnmsrpr is legal but not advisable.

> (On a side note, is it really legal to stress y in any circumstance?)

Sure, notably in names: dybYtolsrfrz.

-- John Cowan cowan@ccil.org http://www.ccil.org/~cowan
As you read this, I don't want you to feel sorry for me, because, I believe everyone will die someday. --From a Nigerian-type scam spam


Posted by PierreAbbat on Mon 04 of Feb., 2008 04:35 GMT posts: 324

On Sunday 03 February 2008 11:17, Arnt Richard Johansen wrote:
> Maybe, but having just joined the Dot Side, I am wary of morphological rules that need a lot of context to reliably disambiguate.
>
> How sure can we be that it is possible for humans to learn Pierre's system? I am receptive to persuasive arguments, as I have been in the past. :-)

To learn to *speak* it, or to learn to *hear* it? Take for instance /lEkymoi/. I'm not recommending that anyone say that, but that if someone does, it not be received as an error. What it is parsed as depends on another question: whether "y" is allowed in lujvo where it is not required. If "y" is allowed, /lEkymoi/ is {lekymoi} which is the same as {lekmoi}; if not, it is {le ky moi}. If one wants to say {le ky moi}, one should say /lekYmoi/ or /lekY.moi/ or /leky.moi/, but not /lekymoi/.

Pierre


Posted by Anonymous on Tue 05 of Feb., 2008 10:38 GMT

On 1/30/08, Pierre Abbat wrote:
> Pe'i the lack of a pause following BY should not be an error, but if it results in a brivla it should be parsed as a brivla. If the BY is followed by any number of CV cmavo and then a pause, this cannot be parsed as a brivla, but if BY is followed by CCV or CVV or CV'V, it might. So I suggest that if BY is followed by {bu}, the pause be moved after {bu} to keep the letteral without a pause in it.

That part is already covered by camxes. In fact, if Cy is followed by {bu} or any other CV cmavo there is no need to pause at all on account of that Cy.

> If the "Y" is stressed, however, the BY should be treated as a letteral, regardless of whether a pause occurs before the next CCV or CVV. "y" is never stressed in brivla, and someone spelling a word may stress all the letter names so that they can be heard clearly.

This is currently not covered by camxes, which only recognizes A, E, I, O, U (or space after a brivla) as stress markers. So for example both {LOMYmoi} and {lomYmoi} are parsed as lujvo. L, M and Y are never distinguished from l, m and y.
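A small Python sketch may make the consequence clearer. It models the case handling described here as a folding step in which only the capitals A, E, I, O, U survive as stress markers; the name `fold_case` is hypothetical and this is an illustration of the described behaviour, not a transcription of the camxes grammar.

```python
STRESS_VOWELS = set("AEIOU")  # the only capitals treated as stress markers


def fold_case(text: str) -> str:
    """Lower-case everything except capital A, E, I, O, U."""
    return "".join(ch if ch in STRESS_VOWELS else ch.lower() for ch in text)


for word in ("LOMYmoi", "lomYmoi"):
    print(word, "->", fold_case(word))
```

In both inputs the capital Y disappears, so a stressed "Y" cannot currently mark the BY as a letteral the way Pierre proposes.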

mu'o mi'e xorxes


Posted by Jorge Llambías on Sun 07 of Dec., 2008 14:26 GMT

On Sun, Dec 7, 2008 at 9:59 AM, arj wrote:
> I'm running Robin's test corpus through both camxes and the official parser to check for discrepancies.

Yes, there are quite a few.

> This is one of the first things I found:
>
> Official: PASS camxes: FAIL
> .i,iai,ii,iai,ion.
>
> The official parser thinks this is a cmene; camxes thinks this is a nonLojbanWord.

I seem to remember CLL doesn't like triphthongs but I can't find a quote now..

camxes is OK with triphthongs (eight of them: iai, iau, iei, ioi, uai, uau, uei, uoi), but doesn't like a syllable that ends in a semi-vowel to be immediately followed by another that starts with one, so the problems for camxes are "iai,ii" and "iai,ion".

.i,ia,ii,ia,ion should be fine.
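The check described in this post can be sketched in Python as follows. The sketch assumes that commas mark every syllable boundary in the input, and the helper names (`ends_in_semivowel`, `starts_with_semivowel`, `semivowel_clashes`) are invented for the example; it only looks at adjacent syllables and does not validate anything else about the word.

```python
FALLING = {"ai", "au", "ei", "oi"}  # diphthong nuclei that end in a semi-vowel
VOWELS = set("aeiouy")


def ends_in_semivowel(syllable: str) -> bool:
    return syllable[-2:] in FALLING


def starts_with_semivowel(syllable: str) -> bool:
    # An i/u onset: "i" or "u" immediately followed by another vowel.
    return len(syllable) >= 2 and syllable[0] in "iu" and syllable[1] in VOWELS


def semivowel_clashes(word: str):
    """Return the adjacent syllable pairs where a semi-vowel meets a semi-vowel."""
    syllables = [s for s in word.strip(".").split(",") if s]
    return [
        (a, b)
        for a, b in zip(syllables, syllables[1:])
        if ends_in_semivowel(a) and starts_with_semivowel(b)
    ]


print(semivowel_clashes(".i,iai,ii,iai,ion."))
print(semivowel_clashes(".i,ia,ii,ia,ion"))
```

The first call reports the two pairs named above ("iai,ii" and "iai,ion"); the second reports none.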

> If there is a good reason why this shouldn't be allowed (I don't see any), then we should add it to the CLL errata.

We don't yet have a clear consensus about what the rules for long vowel clusters should be. (Nor for long consonant clusters, for that matter.)

mu'o mi'e xorxes


Posted by Jorge Llambías on Mon 08 of Dec., 2008 15:53 GMT

The camxes rules for vowels can be summarized as follows:

(1) There are exactly ten possible vocalic syllable nuclei: a, e, i, o, u, ai, au, ei, oi, y

(2) Every syllable MUST have an onset. (Including ".", " ' ", "i" and "u" as special non-C onsets.)

(3) A syllable that ends with a diphthong nucleus cannot be directly followed by one with an i/u-onset.

Therefore, there are only 40 possible syllables without a C, namely:

.a, .e, .i, .o, .u, .ai, .au, .ei, .oi, .y
'a, 'e, 'i, 'o, 'u, 'ai, 'au, 'ei, 'oi, 'y
ia, ie, ii, io, iu, iai, iau, iei, ioi, iy
ua, ue, ui, uo, uu, uai, uau, uei, uoi, uy

With these rules, it is possible to have indefinitely long strings of vowels, but the "middle syllables" will always have to be one of ia, ie, ii, io, iu, (iy), ua, ue, ui, uo, uu, (uy). Falling diphthongs can only occur at the end of a string of vowels, and "single" vowels can only occur at the beginning of a string (in fact preceded by . or ').
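The count of 40 follows directly from rules (1) and (2): four non-C onsets times ten nuclei. A short Python sketch, with variable names chosen only for this example, reproduces the table and the list of possible "middle syllables":

```python
ONSETS = [".", "'", "i", "u"]  # the special non-C onsets from rule (2)
NUCLEI = ["a", "e", "i", "o", "u", "ai", "au", "ei", "oi", "y"]  # rule (1)

# Every C-less syllable is one onset followed by one nucleus.
syllables = [onset + nucleus for onset in ONSETS for nucleus in NUCLEI]

# Middle syllables need an i/u onset (so they can follow another vowel) and a
# single-vowel nucleus (so rule (3) lets another i/u-onset syllable follow).
middle = [s for s in syllables if s[0] in "iu" and len(s) == 2]

print(len(syllables))
print(", ".join(middle))
```

The second line of output lists the same twelve syllables given above, including the two with "y".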

mu'o mi'e xorxes


Posted by arj on Mon 08 of Dec., 2008 16:35 GMT posts: 953

On Mon, Dec 08, 2008 at 12:51:01PM -0300, Jorge Llambías wrote:
> The camxes rules for vowels can be summarized as follows:
>
> (1) There are exactly ten possible vocalic syllable nuclei: a, e, i, o, u, ai, au, ei, oi, y
>
> (2) Every syllable MUST have an onset. (Including ".", " ' ", "i" and "u" as special non-C onsets.)
>
> (3) A syllable that ends with a diphthong nucleus cannot be directly followed by one with an i/u-onset.

What happens if we remove 2) and 3)? Especially 2) is as I understand it rare among natural languages.

-- Arnt Richard Johansen http://arj.nvg.org/ "My speech recognition software may have trouble with ordinary words, but not with ketoprofen." --Magnus Itland


Posted by JohnCowan on Mon 08 of Dec., 2008 17:00 GMT posts: 149

Arnt Richard Johansen scripsit:

> > (2) Every syllable MUST have an onset. (Including ".", " ' ", "i" and "u" as special non-C onsets.)
> >
> > (3) A syllable that ends with a diphthong nucleus cannot be directly followed by one with an i/u-onset.
>
> What happens if we remove 2) and 3)? Especially 2) is as I understand it rare among natural languages.

Lifting those constraints means a severe threat to stability and audio-visual isomorphism. We really don't want words like aoaoaoa, and things like ai,iu are too easily mistaken for ai,u or a,iu. We introduced ' into Lojban (Loglan didn't have it) precisely to reduce the risk of such problems.

-- John Cowan cowan@ccil.org http://www.ccil.org/~cowan C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux, de rapport nyait pas. --Jacques Lacan, "L'Etourdit"


Posted by Jorge Llambías on Mon 08 of Dec., 2008 19:06 GMT

On Mon, Dec 8, 2008 at 1:59 PM, John Cowan wrote:
> Arnt Richard Johansen scripsit:
>
>> > (2) Every syllable MUST have an onset. (Including ".", " ' ", "i" and "u" as special non-C onsets.)
>> >
>> > (3) A syllable that ends with a diphthong nucleus cannot be directly followed by one with an i/u-onset.
>>
>> What happens if we remove 2) and 3)? Especially 2) is as I understand it rare among natural languages.
>
> Lifting those constraints means a severe threat to stability and audio-visual isomorphism. We really don't want words like aoaoaoa,

That particular case is not something I mind much. In fact I used "sincrboa" and "tricrbaobao" in the Little Prince translation, which was done before camxes, and which I guess I'll have to change if I want it to comply. CLL also has "gugdrkorea", although in another place it says that there can't be 5-letter fu'ivla without an apostrophe, implicitly forbidding something like "sprea".

The main reason I went with (2) is to disallow aaa, eee, ooo, and also to simplify the rules for i/u, with their double function as vowel and semi-vowel.

> and things like ai,iu are too easily mistaken for ai,u or a,iu.

Yes. Disallowing "ai,ia" and "au,ua" can be seen as a simple extension of the "no double consonant" rule, just like an,na or at,ta are disallowed, since i/u can be seen as consonants there, and disallowing "ai,ua" and "au,ia" is similar to an additional forbidden consonant pair.

mu'o mi'e xorxes


Posted by JohnCowan on Mon 08 of Dec., 2008 20:00 GMT posts: 149

Jorge Llambías scripsit:

> That particular case is not something I mind much. In fact I used "sincrboa" and "tricrbaobao" in the Little Prince translation, which was done before camxes, and which I guess I'll have to change if I want it to comply.

Vowel hiatus is often not stable: it tends to turn into diphthongs. Loglan, in fact, uses "ao" for the diphthong written "au" in Lojban; Loglan "aa", "ae", "au", "ea", "ee", "eo", "eu", "oa", "oe", "oo", "ou" are all instances of hiatus. Adding "'" not only eliminated hiatus, but added lots more vowel pairs (meaning more cmavo and rafsi) to play with. In addition, eliminating things like "tiu" in favor of "ti'u" reduced the chance that the "t" would be palatalized into "tcu".

> CLL also has "gugdrkorea",

Something I have regretted *deeply* since then. "gugdrkore'a" would have been appropriate, or something based on a native name.

-- John Cowan cowan@ccil.org http://ccil.org/~cowan Heckler: "Go on, Al, tell 'em all you know. It won't take long." Al Smith: "I'll tell 'em all we *both* know. It won't take any longer."


Posted by arj on Mon 08 of Dec., 2008 20:36 GMT posts: 953

On Mon, Dec 08, 2008 at 11:59:58AM -0500, John Cowan wrote:
> Arnt Richard Johansen scripsit:
>
> > > (2) Every syllable MUST have an onset. (Including ".", " ' ", "i" and "u" as special non-C onsets.)
> > >
> > > (3) A syllable that ends with a diphthong nucleus cannot be directly followed by one with an i/u-onset.
> >
> > What happens if we remove 2) and 3)? Especially 2) is as I understand it rare among natural languages.
>
> Lifting those constraints means a severe threat to stability and audio-visual isomorphism.

Hang on a minute. This sounds as if those constraints existed in Lojban to begin with, which AFAIK, they did not. We may need to _introduce_ new constraints to maintain stability and AVI, but it isn't immediately obvious what those should be.

I assert that any new morphological rules that we come up with should be so simple that they can be translated into a human-readable form.

> We really don't want words like aoaoaoa, and things like ai,iu are too easily mistaken for ai,u or a,iu. We introduced ' into Lojban (Loglan didn't have it) precisely to reduce the risk of such problems.

The rules also must not be an insurmountable burden to the speaker. Competent Lojbanists could not get the no la/lai/doi in cmene right. How can they internalise the rules that rule out *.i,iai,ii,iai,ion?

-- Arnt Richard Johansen http://arj.nvg.org/ Many familiar with Descartes' work are likely to remember him from philosophy courses as that French guy who was wrong a lot. --Daniel Harbour


Posted by Jorge Llambías on Mon 08 of Dec., 2008 20:36 GMT

On Mon, Dec 8, 2008 at 4:59 PM, John Cowan wrote:
> Vowel hiatus is often not stable: it tends to turn into diphthongs.

How is that measured? Everything in phonology is unstable in the long run, but how do we determine whether some feature is especially unstable?

> Loglan, in fact, uses "ao" for the diphthong written "au" in Lojban;

Yes, I never really understood that. I can understand "ao" eventually turning into "au" but not why anyone would choose "ao" as the starting orthography for that diphthong.

> Loglan "aa", "ae", "au", "ea", "ee", "eo", "eu", "oa", "oe", "oo", "ou" are all instances of hiatus.

Yes. All of those except "aa" and "ou" occur in Spanish words, so they are not particularly problematic for me. (Even "aa" occurs in some names of Arabic origin. And "ou" can occur between words.)

> Adding "'" not only eliminated hiatus, but added lots more vowel pairs (meaning more cmavo and rafsi) to play with.

That can be seen as a bad thing actually. Lojban has an excess, not a lack of cmavo. :-)

> In addition, eliminating things like "tiu" in favor of "ti'u" reduced the chance that the "t" would be palatalized into "tcu".

But that would have meant 170 more monosyllabic cmavo to play with! :-) camxes currently does allow Ci and Cu as syllable onsets though.

mu'o mi'e xorxes


Posted by Jorge Llambías on Mon 08 of Dec., 2008 20:45 GMT

On Mon, Dec 8, 2008 at 5:35 PM, Arnt Richard Johansen wrote:
> How can they internalise the rules that rule out *.i,iai,ii,iai,ion?

It's basically the same rule that rules out *.an,nas.

mu'o mi'e xorxes


Posted by JohnCowan on Mon 08 of Dec., 2008 22:29 GMT posts: 149

Arnt Richard Johansen scripsit:

> Hang on a minute. This sounds as if those constraints existed in Lojban to begin with, which AFAIK, they did not. We may need to _introduce_ new constraints to maintain stability and AVI, but it isn't immediately obvious what those should be.

There has never been an official formalized morphology algorithm, so the question of what pre-existed really doesn't arise.

> I assert that any new morphological rules that we come up with should be so simple that they can be translated into a human-readable form.

I agree.

> The rules also must not be an insurmountable burden to the speaker. Competent Lojbanists could not get the no la/lai/doi in cmene right. How can they internalise the rules that rule out *.i,iai,ii,iai,ion?

As Jorge says, it's just a matter of avoiding double consonant sounds, if we read the i in Vi and iV (and ditto for u) as pseudo-consonants.

-- John Cowan cowan@ccil.org http://www.ccil.org/~cowan
We pledge allegiance to the penguin and to the intellectual property regime for which he stands, one world under Linux, with free music and open source software for all. --Julian Dibbell on Brazil, edited

Posted by PierreAbbat on Tue 09 of Dec., 2008 06:27 GMT posts: 324

On Monday 08 December 2008 10:51:01 Jorge Llambías wrote:
> The camxes rules for vowels can be summarized as follows:
>
> (1) There are exactly ten possible vocalic syllable nuclei: a, e, i, o, u, ai, au, ei, oi, y
>
> (2) Every syllable MUST have an onset. (Including ".", " ' ", "i" and "u" as special non-C onsets.)

I consider " ' " to be ambisyllabic: it can't be assigned to either syllable, but belongs to both or separates them. This exists in natlangs too: in "narrow" the r can't be assigned to either syllable because neither æ nor ær is a valid word ending (except in geekish, where I pronounce "char" as kær, but I've heard ker for that).

> (3) A syllable that ends with a diphthong nucleus cannot be directly followed by one with an i/u-onset.
>
> Therefore, there are only 40 possible syllables without a C, namely:
>
> .a, .e, .i, .o, .u, .ai, .au, .ei, .oi, .y
> 'a, 'e, 'i, 'o, 'u, 'ai, 'au, 'ei, 'oi, 'y
> ia, ie, ii, io, iu, iai, iau, iei, ioi, iy
> ua, ue, ui, uo, uu, uai, uau, uei, uoi, uy

As far as the brivla morphology is concerned, there are no triphthongs; thus "mliau" is two syllables, by default "mlia,u". I consider all sequences of "aeiou" (I'm not sure about sequences with "y") to be valid, with commas implicit after every two or between pairs that aren't valid diphthongs.

Pierre


Posted by Jorge Llambías on Tue 09 of Dec., 2008 11:46 GMT

On Tue, Dec 9, 2008 at 2:07 AM, Pierre Abbat wrote:
> As far as the brivla morphology is concerned, there are no triphthongs; thus "mliau" is two syllables, by default "mlia,u".

camxes wouldn't accept it as a valid brivla, because it takes it as one syllable. And CLL implicitly doesn't allow it either, when it says (Ch. 4 Sect. 4):

<< The five letter length distinguishes gismu from lujvo and fu'ivla. (It is possible to have fu'ivla like ``spa'i that are five letters long, but they must have ``'; no gismu contains ``'.) >>

("spa'i" is a typo for "spra'i", "spa'i" fails the slinku'i test and besides doesn't have five letters as counted in this chapter.)

> I consider all sequences of "aeiou" (I'm not sure about sequences with "y") to be valid, with commas implicit after every two or between pairs that aren't valid diphthongs.

But is it possible to hear the difference between ai,a and ai,ia? It seems it would have to be just one of length, which is usually not phonemic in Lojban. And although I might be able to tell "a" and "aa" apart, I don't think I could tell "aaa" apart from "aaaa".

mu'o mi'e xorxes


Posted by Jorge Llambías on Tue 09 of Dec., 2008 11:50 GMT

On Tue, Dec 9, 2008 at 8:45 AM, Jorge Llambías wrote:
> On Tue, Dec 9, 2008 at 2:07 AM, Pierre Abbat wrote:
>> As far as the brivla morphology is concerned, there are no triphthongs; thus "mliau" is two syllables, by default "mlia,u".
>
> camxes wouldn't accept it as a valid brivla, because it takes it as one syllable.

Correction: it takes "liau" as one syllable, not the whole "mliau". So "am,liau" for example would be accepted.

mu'o mi'e xorxes


Posted by PierreAbbat on Tue 09 of Dec., 2008 14:15 GMT posts: 324

On Tuesday 09 December 2008 06:45:54 Jorge Llambías wrote:
> On Tue, Dec 9, 2008 at 2:07 AM, Pierre Abbat wrote:
> > As far as the brivla morphology is concerned, there are no triphthongs; thus "mliau" is two syllables, by default "mlia,u".
>
> camxes wouldn't accept it as a valid brivla, because it takes it as one syllable. And CLL implicitly doesn't allow it either, when it says (Ch. 4 Sect. 4):
>
> << The five letter length distinguishes gismu from lujvo and fu'ivla. (It is possible to have fu'ivla like ``spa'i that are five letters long, but they must have ``'; no gismu contains ``'.) >>
>
> ("spa'i" is a typo for "spra'i", "spa'i" fails the slinku'i test and besides doesn't have five letters as counted in this chapter.)

I think that's an error in the Book. "sprae" is a valid fu'ivla because it's two syllables and not a slinku'i. So is "spae", but "spa'e" is a slinku'i. And there are five-letter fu'ivla that begin with a vowel and have no apostrophe, such as "aizdo". (Of course, "aizdo kumte" must be distinguished from "ai zdokumte".) What distinguishes gismu is that they have CVCCV or CCVCV form.
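The shape test in the last sentence is easy to state as code. The Python sketch below checks only the CVCCV / CCVCV letter pattern; it deliberately ignores which consonant pairs are permissible, so it is a necessary condition for a gismu rather than a full test. The name `has_gismu_shape` is made up for this example.

```python
import re

C = "[bcdfgjklmnprstvxz]"  # the Lojban consonants
V = "[aeiou]"              # the vowels that can occur in a gismu
GISMU_SHAPE = re.compile(f"(?:{C}{V}{C}{C}{V}|{C}{C}{V}{C}{V})")


def has_gismu_shape(word: str) -> bool:
    """True if the word is CVCCV or CCVCV; consonant-pair rules are not checked."""
    return GISMU_SHAPE.fullmatch(word) is not None


for word in ("kumte", "broda", "aizdo", "sprae", "spa'i"):
    print(word, has_gismu_shape(word))
```

Only "kumte" and "broda" match; the five-letter candidates "aizdo" and "sprae" are excluded by shape alone, without any appeal to the apostrophe.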

> > I consider all sequences of "aeiou" (I'm not sure about sequences with "y") to be valid, with commas implicit after every two or between pairs that aren't valid diphthongs.
>
> But is it possible to hear the difference between ai,a and ai,ia? It seems it would have to be just one of length, which is usually not phonemic in Lojban. And although I might be able to tell "a" and "aa" apart, I don't think I could tell "aaa" apart from "aaaa".

"aaaa" is four syllables, not a long vowel, so you could distinguish them by changing pitch at each syllable boundary. In Spanish, how do you pronounce "ay, a", "haya", and "halla"?

Pierre


Posted by PierreAbbat on Tue 09 of Dec., 2008 14:16 GMT posts: 324

On Tuesday 09 December 2008 06:50:25 Jorge Llambías wrote:
> On Tue, Dec 9, 2008 at 8:45 AM, Jorge Llambías wrote:
> > On Tue, Dec 9, 2008 at 2:07 AM, Pierre Abbat wrote:
> >> As far as the brivla morphology is concerned, there are no triphthongs; thus "mliau" is two syllables, by default "mlia,u".
> >
> > camxes wouldn't accept it as a valid brivla, because it takes it as one syllable.
>
> Correction: it takes "liau" as one syllable, not the whole "mliau". So "am,liau" for example would be accepted.

I take "am,liau" as two words "a mliau". "almiau" is one word.

Pierre


Posted by Jorge Llambías on Tue 09 of Dec., 2008 14:46 GMT

On Tue, Dec 9, 2008 at 11:14 AM, Pierre Abbat wrote:
> "aaaa" is four syllables, not a long vowel, so you could distinguish them by changing pitch at each syllable boundary.

Perhaps you could, but that would require introducing a new feature in Lojban, since pitch is never otherwise phonemic.

> In Spanish, how do you pronounce "ay, a", "haya", and "halla"?

*I* pronounce "haya" and "halla" identically (something like Lojban "aja" or "aca"; I don't have any voicing distinction there in Spanish, and it's hard to tell which of the two my y=ll is exactly, something in between), and I pronounce "ay, a" as Lojban "aia". Others will pronounce all three the same, and some people in Spain will pronounce "halla" as something close to Lojban "alia".

But none of that helps to distinguish Lojban "ai,ia" from "ai,a". (I can think of some ways of distinguishing them, but none that involve Lojbanic phonemic features).

I remember a discussion I had about the distinction between "bra,ian" and "brai,an". I don't doubt some people can make and hear a difference (for me it's very hard), but in Lojban we are not required to make or hear any difference between those two valid forms of writing the same name.

mu'o mi'e xorxes


Posted by JohnCowan on Tue 09 of Dec., 2008 15:16 GMT posts: 149

Pierre Abbat scripsit:

> I think that's an error in the Book. "sprae" is a valid fu'ivla because it's two syllables and not a slinku'i. So is "spae", but "spa'e" is a slinku'i.

Neither "sprae" nor "spae" is a valid word at all, because "ae" is not a valid vowel sequence.

> And there are five-letter fu'ivla that begin with a vowel and have no apostrophe, such as "aizdo".

That's an interesting class of fu'ivla; if it's ever been discussed before, I don't remember it.

> What distinguishes gismu is that they have CVCCV or CCVCV form.

In fact yes.

-- John Cowan cowan@ccil.org http://www.ccil.org/~cowan
Do I contradict myself? Very well then, I contradict myself. I am large, I contain multitudes. --Walt Whitman, Leaves of Grass


Posted by Jorge Llambías on Tue 09 of Dec., 2008 20:05 GMT

On Mon, Dec 8, 2008 at 7:28 PM, John Cowan wrote:
> Arnt Richard Johansen scripsit:
>
>> I assert that any new morphological rules that we come up with should be so simple that they can be translated into a human-readable form.
>
> I agree.

Here is a simple way to state the camxes vowel cluster rule:

V = a, e, i, o, u, y (full vowels)
S = i, u (semi-vowels)

(1) Two full vowels cannot be adjacent.
(2) Two semivowels cannot be adjacent.

That means two of i/u can be adjacent only if one is playing full vowel and the other is playing semi-vowel. Then the only vowel clusters allowed are those of the form: SVSV...S. At least one V, and then any number of alternating S's and V's.
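Read literally, the two rules amount to a regular expression over the vowel letters, as in the Python sketch below. This is one reading of the statement in this post, not the camxes grammar: it does not separately encode the list of permitted diphthong nuclei from the earlier summary, so it may accept some clusters that camxes itself would reject. The name `is_allowed_cluster` is hypothetical.

```python
import re

# Optional leading semi-vowel, one full vowel, then any number of
# semi-vowel + full-vowel pairs, then an optional trailing semi-vowel.
# "i" and "u" may fill either role; the regex engine tries both.
CLUSTER = re.compile(r"[iu]?[aeiouy](?:[iu][aeiouy])*[iu]?")


def is_allowed_cluster(vowels: str) -> bool:
    """True if the vowel string can be split into alternating S and V roles."""
    return CLUSTER.fullmatch(vowels) is not None


for cluster in ("iai", "ioioui", "aoaoaoa", "aiia", "aiua"):
    print(cluster, is_allowed_cluster(cluster))
```

The first two clusters are accepted and the last three rejected: "aoaoaoa" has adjacent full vowels, while "aiia" and "aiua" would need two adjacent semi-vowels.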

mu'o mi'e xorxes


Posted by JohnCowan on Tue 09 of Dec., 2008 20:25 GMT posts: 149

Jorge Llambías scripsit:
> On Mon, Dec 8, 2008 at 7:28 PM, John Cowan wrote:
> > Arnt Richard Johansen scripsit:
> >
> >> I assert that any new morphological rules that we come up with should
> >> be so simple that they can be translated into a human-readable form.
> >
> > I agree.
>
> Here is a simple way to state the camxes vowel cluster rule:
>
> V = a, e, i, o, u, y (full vowels)
> S = i, u (semi-vowels)
>
> (1) Two full vowels cannot be adjacent.
> (2) Two semivowels cannot be adjacent.

+1

-- 
John Cowan  http://www.ccil.org/~cowan  cowan@ccil.org
"After all, would you consider a man without honor wealthy, even if his Dinar laid end to end would reach from here to the Temple of Toplat?" "No, I wouldn't", the beggar replied. "Why is that?" the Master asked. "A Dinar doesn't go very far these days, Master. Besides, the Temple of Toplat is across the street." --Kehlog Albran, The Profit

BPFK Section: PEG Morphology Algorithm

Posted by arj on Tue 09 of Dec., 2008 20:25 GMT posts: 953

On Tue, Dec 09, 2008 at 05:04:12PM -0300, Jorge Llambías wrote:
> On Mon, Dec 8, 2008 at 7:28 PM, John Cowan wrote:
> > Arnt Richard Johansen scripsit:
> >
> >> I assert that any new morphological rules that we come up with should
> >> be so simple that they can be translated into a human-readable form.
> >
> > I agree.
>
> Here is a simple way to state the camxes vowel cluster rule:
>
> V = a, e, i, o, u, y (full vowels)
> S = i, u (semi-vowels)
>
> (1) Two full vowels cannot be adjacent.
> (2) Two semivowels cannot be adjacent.
>
> That means two of i/u can be adjacent only if one is playing full
> vowel and the other is playing semi-vowel. Then the only vowel
> clusters allowed are those of the form: SVSV...S. At least one
> V, and then any number of alternating S's and V's

Good.

This has the interesting consequence that vowel sequences that would not otherwise be legal are permissible when one of the vowels is playing semi-vowel. For example, the name of Ioioui, a character from one of Jon Bing's novels, can be Lojbanized as {.ioiouin.}, while {.ioioun.} cannot.

(Yes, the no ou rule has been bugging me for quite some time. If Anglophones can keep e and ei apart, they should be able to handle o and ou.)

-- Arnt Richard Johansen http://arj.nvg.org/ Vacuum cleaners suck. Kings rule. Ice is cool.

BPFK Section: PEG Morphology Algorithm

Posted by Jorge Llambías on Tue 09 of Dec., 2008 21:07 GMT

On Tue, Dec 9, 2008 at 5:24 PM, Arnt Richard Johansen wrote:
> On Tue, Dec 09, 2008 at 05:04:12PM -0300, Jorge Llambías wrote:
>>
>> Here is a simple way to state the camxes vowel cluster rule:
>>
>> V = a, e, i, o, u, y (full vowels)
>> S = i, u (semi-vowels)
>>
>> (1) Two full vowels cannot be adjacent.
>> (2) Two semivowels cannot be adjacent.
>>
>> That means two of i/u can be adjacent only if one is playing full
>> vowel and the other is playing semi-vowel. Then the only vowel
>> clusters allowed are those of the form: SVSV...S. At least one
>> V, and then any number of alternating S's and V's
>
> Good.

I should have added that the S at the very end can only be an "i" after "a", "e", "o", or an "u" after "a". Not just any S after any V.
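With this extra restriction the whole rule fits in one regular expression. The following Python sketch is an illustration only, not the camxes grammar; the names `CLUSTER` and `is_valid_cluster` are hypothetical, and the pattern assumes the cluster has the shape (S) V (S V)... (S), with the closing semi-vowel limited as described above.

```python
import re

# (S) V (S V)* (final S), where a closing semi-vowel must be "i" after
# a/e/o or "u" after a.
CLUSTER = re.compile(
    r"[iu]?"                      # optional leading semi-vowel
    r"[aeiouy]"                   # first full vowel
    r"(?:[iu][aeiouy])*"          # alternating semi-vowel + full vowel
    r"(?:(?<=[aeo])i|(?<=a)u)?"   # optional closing semi-vowel
)

def is_valid_cluster(cluster: str) -> bool:
    """Return True if the vowel cluster satisfies the rule with the final-S limit."""
    return CLUSTER.fullmatch(cluster) is not None
```

Under this sketch "ioioui" is accepted (the u plays semi-vowel before a full i), while "ioiou", "ou" and "eu" are rejected because a closing u is only allowed after a.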

> This has the interesting consequence that vowel sequences that would not
> otherwise be legal are permissible when one of the vowels is playing
> semi-vowel. For example, the name of Ioioui, a character from one of
> Jon Bing's novels, can be Lojbanized as {.ioiouin.}, while {.ioioun.} cannot.
>
> (Yes, the no ou rule has been bugging me for quite some time. If
> Anglophones can keep e and ei apart, they should be able to handle o and ou.)

The missing "eu" and "ou" diphthongs are indeed an ugly gap in the Lojban system.

mu'o mi'e xorxes

BPFK Section: PEG Morphology Algorithm

Posted by JohnCowan on Tue 09 of Dec., 2008 21:32 GMT posts: 149

Arnt Richard Johansen scripsit:

> (Yes, the no ou rule has been bugging me for quite some time. If
> Anglophones can keep e and ei apart, they should be able to handle o
> and ou.)

Blame it on the anglophones, and specifically the caught-cot-merging American ones. We can keep e and ei apart because e is /E/ (as in DRESS) whereas ei is /eI/ (as in FACE). If we had specified that o was /O/ (as in THOUGHT), we could have made ou /oU ~ @U/ (as in GOAT), and there would be sufficient difference to be safe.

But perhaps half of all Americans have /A/ rather than /O/ in THOUGHT words, which would have made Lojban a and o too similar, so o remains /O ~ oU ~ @U/ and ou remains banned.

As for eu, there is nothing like it in any accent of English (the nearest thing is the /æU/ in the Southern Hemisphere, which is the local representation of /aU ~ AU/), so it was never in the running.

-- How they ever reached any conclusion at all is starkly unknowable to the human mind. http://www.ccil.org/~cowan --"Backstage Lensman", Randall Garrett

BPFK Section: PEG Morphology Algorithm

Posted by PierreAbbat on Wed 10 of Dec., 2008 03:38 GMT posts: 324

On Tuesday 09 December 2008 10:15:16 John Cowan wrote:
> Pierre Abbat scripsit:
> > I think that's an error in the Book. "sprae" is a valid fu'ivla because
> > it's two syllables and not a slinku'i. So is "spae", but "spa'e" is a
> > slinku'i.
>
> Neither "sprae" nor "spae" is a valid word at all, because "ae" is not a
> valid vowel sequence.

It's not a valid diphthong, but the example "bang,r,kore,a" in the Book shows that vowel sequences that aren't diphthongs can occur in fu'ivla. Commas make no difference to the identity of a word, so "sprae" is the same as "spra,e".

As vowels in a string of vowels are paired from the left, as long as they form valid diphthongs, "i,iai,i,iai,ion" is pronounced the same as "i,ia,i,i,ia,i,ion". There is a list of valid diphthongs in the Book, but no list of valid triphthongs.
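The left-to-right pairing described here can be sketched in Python. This is a rough illustration under stated assumptions: the diphthong set is my reading of the list in the Book (including "iy" and "uy"), the function names are made up, and each comma-separated segment is assumed to consist of a vowel run optionally followed by consonants.

```python
import re

# Assumed list of valid diphthongs, as read from the Book.
DIPHTHONGS = {
    "ai", "ei", "oi", "au",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
    "iy", "uy",
}

def pair_from_left(run: str) -> list:
    """Greedily pair vowels from the left while they form a valid diphthong."""
    syllables = []
    i = 0
    while i < len(run):
        if run[i:i + 2] in DIPHTHONGS:
            syllables.append(run[i:i + 2])
            i += 2
        else:
            syllables.append(run[i])
            i += 1
    return syllables

def respell(word: str) -> str:
    """Rewrite a comma-separated word with every implied syllable break shown."""
    out = []
    for segment in word.split(","):
        match = re.match(r"[aeiouy]*", segment)
        run, rest = match.group(), segment[match.end():]
        parts = pair_from_left(run) or [""]
        parts[-1] += rest  # trailing consonants stay with the last syllable
        out.extend(parts)
    return ",".join(out)
```

Applied to the example, `respell("i,iai,i,iai,ion")` gives `"i,ia,i,i,ia,i,ion"`, the second spelling quoted above.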

> > And there are five-letter fu'ivla that begin with a vowel and have no
> > apostrophe, such as "aizdo".
>
> That's an interesting class of fu'ivla; if it's ever been discussed before,
> I don't remember it.

"iglu" has been discussed, and we concluded that such words are valid whether the cluster is a valid initial or only a valid medial. Other fu'ivla with only two consonants are "a'orne", "iedra", "io'imbe", "jboia", "orka", "uiski", "uitki", "ulmu", and "urci".

Pierre

BPFK Section: PEG Morphology Algorithm

Posted by JohnCowan on Wed 10 of Dec., 2008 03:58 GMT posts: 149

Pierre Abbat scripsit:

> It's not a valid diphthong, but the example "bang,r,kore,a" in the Book
> shows that vowel sequences that aren't diphthongs can occur in fu'ivla.

No, it shows that John Cowan should never have put in that damned example.

> Commas make no difference to the identity of a word, so "sprae" is
> the same as "spra,e".

Agreed. And both are invalid.

> As vowels in a string of vowels are paired from the left, as long as
> they form valid diphthongs, "i,iai,i,iai,ion" is pronounced the same as
> "i,ia,i,i,ia,i,ion".

Both of them suck.

> There is a list of valid diphthongs in the Book, but
> no list of valid triphthongs.

Should have been.

> "iglu" has been discussed, and we concluded that such words are valid
> whether the cluster is a valid initial or only a valid medial. Other
> fu'ivla with only two consonants are "a'orne", "iedra", "io'imbe",
> "jboia", "orka", "uiski", "uitki", "ulmu", and "urci".

I'm good with all of these.

-- But you, Wormtongue, you have done what you could for your true master. Some reward you have earned at least. Yet Saruman is apt to overlook his bargains. I should advise you to go quickly and remind him, lest he forget your faithful service. --Gandalf John Cowan

BPFK Section: PEG Morphology Algorithm

Posted by PierreAbbat on Wed 10 of Dec., 2008 14:43 GMT posts: 324

On Tuesday 09 December 2008 16:32:15 John Cowan wrote:
> Blame it on the anglophones, and specifically the caught-cot-merging
> American ones. We can keep e and ei apart because e is /E/ (as in DRESS)
> whereas ei is /eI/ (as in FACE). If we had specified that o was /O/
> (as in THOUGHT), we could have made ou /oU ~ @U/ (as in GOAT), and there
> would be sufficient difference to be safe.
>
> But perhaps half of all Americans have /A/ rather than /O/ in THOUGHT
> words, which would have made Lojban a and o too similar, so o remains /O ~
> oU ~ @U/ and ou remains banned.

We shouldn't base Lojban phonology on what some dialect of English has. I have no trouble pronouncing "cot", "caught", "coat", "cotte", and "côte" distinctly.

I think it's a bad idea to make "re" and "rei" both number words. I have a similar problem with Spanish; I sometimes mishear "doce" and "trece" as "dos" and "tres" (no one I know is a Castilian) or confuse "sesenta" and "setenta".

If we allow "ou" as a diphthong, will it be treated like "ei" in that "Cou" is a possible rafsi?

> As for eu, there is nothing like it in any accent of English (the
> nearest thing is the /æU/ in the Southern Hemisphere, which is the
> local representation of /aU ~ AU/), so it was never in the running.

It occurs in Spanish (e.g. neutro). When I coined a word for "Basque country", I took the Basque phrase, but changed the initial "eu" to "au" because "e,u" didn't sound as close to the original.

Pierre

BPFK Section: PEG Morphology Algorithm

Posted by JohnCowan on Wed 10 of Dec., 2008 15:41 GMT posts: 149

Pierre Abbat scripsit:

> We shouldn't base Lojban phonology on what some dialect of English has.

I wasn't justifying the past, just explaining it. Loglan/Lojban has had four and only four falling diphthongs for almost half a century. My sense is that that isn't about to change.

> I think it's a bad idea to make "re" and "rei" both number words.

The hex digits had to be squeezed in in order to make the mnemonic pattern work given the existing cmavo assignments. I agree that this was unfortunate.

> If we allow "ou" as a diphthong, will it be treated like "ei" in that
> "Cou" is a possible rafsi?

I suppose it would.

-- 
John Cowan  http://www.ccil.org/~cowan
We do, doodley do, doodley do, doodley do,
What we must, muddily must, muddily must, muddily must;
Muddily do, muddily do, muddily do, muddily do,
Until we bust, bodily bust, bodily bust, bodily bust. --Bokonon

Earlier

Posted by arj on Tue 29 of Jan., 2008 20:36 GMT posts: 953

Currently, camxes accepts commas anywhere. However, the CLL says that "The comma is used to indicate a syllable break within a word ...". I take this to mean that commas may only occur within words.

-arj

Re: BPFK Section: PEG Morphology Algorithm

Posted by arj on Tue 29 of Jan., 2008 20:42 GMT posts: 953

How is the lack of pauses following BY treated in the morphology? How should it be treated?

CLL is terribly ambiguous on this point.

> Note that the lerfu words ending in "y" were written (in Example 2.1 and Example 2.2) with pauses after them. It is not strictly necessary to pause after such lerfu words, but failure to do so can in some cases lead to ambiguities: ...

> A safe guideline is to pause after any cmavo ending in "y" unless the next word is also a cmavo ending in "y". The safest and easiest guideline is to pause after all of them.
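The "safe guideline" quoted above is mechanical enough to sketch in Python. This is only an illustration: `add_pauses` is a made-up name, words are plain strings, a pause is written as a trailing ".", and the sketch does not check that the words really are cmavo.

```python
def add_pauses(words: list) -> list:
    """Mark a pause after each word ending in "y" unless the next word also ends in "y"."""
    out = []
    for i, word in enumerate(words):
        next_ends_in_y = i + 1 < len(words) and words[i + 1].endswith("y")
        if word.endswith("y") and not next_ends_in_y:
            out.append(word + ".")  # pause needed before a non-"y" word or at the end
        else:
            out.append(word)
    return out
```

For instance, `add_pauses(["by", "cy", "dy", "broda"])` marks a pause only after "dy", whereas the "safest and easiest" guideline would mark one after all three lerfu words.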

-arj

Re: BPFK Section: PEG Morphology Algorithm

Posted by arj on Sun 07 of Dec., 2008 12:59 GMT posts: 953

I'm running Robin's test corpus through both camxes and the official parser to check for discrepancies. This is one of the first things I found:

Input: .i,iai,ii,iai,ion.
Official: PASS
camxes: FAIL

The official parser thinks this is a cmene; camxes thinks this is a nonLojbanWord.

If there is a good reason why this shouldn't be allowed (I don't see any), then we should add it to the CLL errata.

-arj

Posted by rlpowell on Thu 16 of Dec., 2004 20:35 GMT posts: 14214

This grammar classifies words by their morphological class (cmene, gismu, lujvo, fuhivla, cmavo, and non-lojban-word). It does not sort them into grammatical classes (CMENE, BRIVLA, A, BAI, BAhE, ..., ZOhU).

Why not? Mine certainly does.

-Robin

Talk:BPFK Section: PEG Morphology Algorithm

Earlier

cmene !gismu !lujvo should be satisfied.

slinkuhi is satisfied, because {ci'i} is a medial-rafsi but {le}
