Difference between revisions of "on unglorkative anaphora"

m (Conversion script moved page On unglorkative anaphora to on unglorkative anaphora: Converting page titles to lowercase)
 
(4 intermediate revisions by one other user not shown)
* Nicholas, N. 1996a. [http://citeseer.nj.nec.com/27803.html Lojban as a Machine Translation Interlanguage in the Pacific]. Fourth Pacific Rim International Conference on Artificial Intelligence: Workshop on 'Future Issues for Multilingual Text Processing', Cairns, Australia, 27 August 1996. 31-39.
+
=== On unglorkative anaphora ===
* pycyn pointed out that work has been performed on conversion from Logician's English to predicate logic. See the [http://groups.yahoo.com/group/lojban/message/12624 provided references].
 
  
* And suggested [http://www.google.com/search?q=discourse+representation+theory Discourse Representation Theory] as being relevant.
+
The goal of these proposals is to maximize the number of possible antecedents that can be targeted without glorking.
* Bjorn Gohla pointed out [http://www.darmstadt.gmd.de/publish/komet/kpml-1-doc/kpml.html KPML], a natural language text generation system.
 
  
** [[jbocre: Jay Kominek|Jay]] thinks that translating Lojban into other languages is almost purely a [http://www.dynamicmultimedia.com.au/siggen/ NL text generation] problem. ([[jbocre: Jay Kominek|Jay]] also feels that translating natural languages into Lojban is an uninteresting problem, FWIW.)
+
==== Conservative version ====
***An uninteresting problem?? Well, let's get some skeleton code running then if it's so easy! Because it's a very useful project!
 
  
****Come on, xod, we're supposed to be thinking logically, aren't we? ''le'i ro cinri na cmima le'i frili'' The kind of reasoning you're demonstrating is the kind of thing Lojban ought to be wonderful at putting a stop to. :) --[[jbocre: Jay Kominek|Jay]]
+
This version works with only preexisting cmavo and does not change cmavo definitions.
***** zo'o .i vu'e ma'i lo'i skami nabmi le sizytolcinri ca'a smuni lo nalpluja
 
  
*** What quality of translation is uninteresting? A Babelfish-quality machine translator would be useful; is that what you consider dull? A Stefan George-quality translator would be incredible (and far beyond the state of the art!).
+
* nei targets a bridi containing the anaphor.
**** Any quality is uninteresting. I don't see taking massive amounts of natlang text and moving it into Lojban as useful. What I do see as useful is writing new things (patents, manuals, etc) in Lojban, and then being able to get quality translations of the Lojban, in ''n'' other languages. (Nobody will learn Lojban to read things which are already available in a natural language, but they might learn it so that they can write things that can be translated into ''n'' languages.) --[[jbocre: Jay Kominek|Jay]]
+
* 'Bridi' is ambiguous between 'syntactic bridi' and 'logical bridi': since there is complete consensus about identifying syntactic bridi, and there probably never will be consensus about identifying logical bridi, the relevant sense of 'bridi' for these anaphors is 'syntactic bridi'.
  
***** Nobody needs to learn Lojban if converting into and out of Lojban is so easy. Sorry, but if Lojban is used as an interlingua, it will be less like a Lingua Franca, spoken by many people to each other, and more like a hidden inter-translation code that few ever care to see. As far as I am concerned, Natlang --> Lojban is hard, Lojban --> Natlang is easy. So, you can see my surprise at the allegation that the former is easy too! If it's all so easy, let's just do it. --xod
+
* The logical meaning of ''nei'' must be "x1 is expressed by the fa (x1) syntactic sumti of the current syntactic bridi with x2 expressed by the fe (x2) syntactic sumti, etc.". That is, ''nei'' cannot be a repetition of the selbri. This is because it would result in an infinitely recursive meaning. Nei makes little or no sense except with (XS) {lo}.
****** You're living in an alternate reality, because nobody has said that natlang -> lojban is easy, and asserting it more often won't make it true. --jay
+
* xi subscripting identifies the bridi that is targeted:
  
******* Ignoring my direct addressing of this point doesn't help matters. Read the sentence I wrote above in Lojban. --xod
+
||outermost|nei xi ro||one bridi down|nei xi da'a||two bridi down|nei xi da'a re||innermost|nei (xi pa)||one bridi up|nei xi re||two bridi up|nei xi ci||
******** I did read it. That is a perfectly valid assumption to make, right until you're corrected. When you persist in holding a view which is valid in reference frame A, after it has been pointed out that you're not in reference frame A, well, see the above "alternate realities" comment. --jay
 
  
********* The reference frame is skami nabmi. We are discussing an issue of software complexity. Where is the disconnect, and why do you think it's been explained to me even once? --xod
+
* Similar subscripting could be applied to vo'V (with unsubscripted vo'V being vo'V xi ro or vo'V xi pa or glorked, depending on how that debate is resolved), and this would be terser than using lo SE nei. But it would also clash with the use of subscripting to indicate x6+.
  
* [http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html] - a 1996 overview of the state of the art in natural language processing
+
* ri and go'i target completed sumti and (syntactic) bridi respectively, ordered by their start (outer starting before inner).
 +
* xi subscripting identifies the target:
  
----
+
||first in sentence|xi ro||second in sentence|xi da'a||third in sentence|xi da'a re||last|(xi pa)||penult|xi re||antepenult|xi ci||
  
I'm pretty knowledgeable about artificial intelligence, though I've never worried much about natural language understanding specifically. In my opinion, the issue of whether translation to or from Lojban is "easier" is secondary. Before you can answer it, you have to ask: What quality of translation?
+
* dei targets utterances, the target identified by xi subscript:
  
* There are large chunks of the translation process which Lojban makes easier, and the only thing about Lojban which would make the process more difficult is the fact that you can't rely on natural language vagueness and iffiness to carry the day. (And really, you shouldn't let it ever do so, but people want results...) --jay
+
||current|dei (xi no)||last|xi pa||penult|xi re||
  
For machine translation quality similar to the current state of the art--poor--I expect Lojban would be much easier to translate from and somewhat easier to translate to. Lojban provides unambiguous parses and often-unambiguous word meanings, which are basic abilities that today's machine translators have trouble with.
+
The other anaphora are abbreviations of the basic nei/ri/go'i/dei (-- abbreviations based on mahoste glosses):
  
* "Trouble with"? They're incapable of it. For some languages, it's provably impossible (without true understanding of the context; see Swiss German). --jay
+
||no'a|nei xi re||ra|ri xi za'u||ru|ri xi so'i||go'a|go'i xi so'u||go'e|go'i xi re||go'u|go'i xi so'i||go'o|go'i xi ni'u su'o||da'e|dei xi ni'u so'i||da'u|dei xi so'i||de'e|dei xi ni'u so'u||de'u|dei xi so'u||di'e|dei xi ni'u pa||di'u|dei xi pa||
** "Incapable" is too strong a term. Machine translators can use statistical models to make guesses at word senses. It's not on the same planet as throwing darts. ''mi'e [[jbocre: jezrax|jezrax]]''
 
  
*** I was referring to parsing the grammar, actually. Swiss German is provably context sensitive, so you'll need to understand it before you can even hope to parse it. As for word sense, well, if you've got an algorithmic process for even guessing at the meaning of words in natural language, I suggest you publish. :) Otherwise, you'll have to define for the system every word you want it to be able to translate. (Whereas in Lojban, that is limited to the so-far-little-used fu'ivla) (The best statistical model I've seen which would be applied to determining word sense from nothing is latent semantic analysis, and that requires a very very big corpus, and acts in odd, unpredictable ways: hot and cold are "closer" to it than are cold and cool.)
+
==== Radical version ====
**** ''zo'o'' No need to publish; there are already enough book chapters on it. Search Amazon for "machine translation" or "natural language processing". It's a decades-old research field; people know what the problems are; none of them are solved, but there's progress on every front. One specific suggestion: "Foundations of Statistical Natural Language Processing" [http://www.amazon.com/exec/obidos/ASIN/0262133601/qid=1009761097/sr=1-2/ref=sr_1_75_2/103-0198688-3778269
 
  
****] The ambiguous-parse problem occurs in, probably, all natural languages. I gather that the most popular way to deal with it is by brute force: produce all possible parses (usually a lot more than you expect), and then rank them. As far as I'm concerned, this is an easy problem from among the problems of natural language understanding--but only relatively speaking!
+
This version defines new cmavo to increase the number of targetables and the number of ways of targeting them.
  
For machine translation quality similar to a quick-and-dirty human translation--moderate quality but still much better than the current state of the art--I doubt Lojban offers much advantage. The problems are so much more difficult that merely getting the syntax and individual words correct doesn't go that far toward solving them.
+
New cmavo:
  
* What problems (that don't already exist in dealing with natural language)? jbofi'e already performs "quick-and-dirty" translation, and the results would be decent with smoothing and some knowledge of the destination language applied (getting subject/verb count agreement and such things to match). --jay
+
||LAhE-x1|the x1 of||LAhE-x2|the x2 of||LAhE-x3|the x3 of||LAhE-x4|the x4 of||LAhE-x5|the x5 of||
** ''[[jbocre: jbofi'e|jbofi'e]]'''s translation quality is way, way worse than a quick-and-dirty human translation.
 
  
*** You'll need to define "quick and dirty", then, as I interpret it to mean dictionary lookup of each individual word, limited attempts to deal with conjugation, and simplistic reordering to match the order of subject, verb and object in the destination language. jbofi'e definitely beats that out.
+
For forms, either use experimental cmavo based on FA, or else recycle vo'V as these LAhE. xi-subscripts on these LAhE indicate x6+.
**** A "quick-and-dirty human" translation is, say, one done in real time by a simultaneous translator.
 
  
Obviously, the higher-quality the machine translation, the more uses it has. I doubt that a poor-quality translation would be adequate for the proposed patent application, though I could be wrong. Also, the patent application could rely on formalizing the source text according to special-purpose rules, which would make the job easier.
+
||LI1|the PAth complete bridi|pa = last; ro = first in sentence; ni'u pa = next||LI2|the PAth complete matrix bridi|pa = last; ni'u pa = next||LI3|the PAth complete sumti|pa = last; ro = first in sentence; ni'u pa = next||LI4|the PAth sumti in prenex of uncompleted bridi|pa = last; ro = first in sentence||(LAhE-x1) LI5|the PAth uncompleted bridi|pa = innermost; ro = outermost||(LAhE-x1) LI6|the PAth uncompleted sumti|pa = innermost; ro = outermost||LI7|the PAth utterance|pa = last; no = current; ni'u pa = next||LI8|the PAth BAhE1-marked item|pa = last; ni'u pa = next||LE1 PA|the PAth complete bridi in|pa = first; ro = last (or vice versa?)||LE2 PA|the PAth complete sumti in|pa = first; ro = last (or vice versa?)||BAhE1|mark next as candidate antecedent for LI8||
  
* A poor translation that takes 2 seconds could be worth something compared to a good translation which would take a week or so and cost you a bit of money. (As far as limiting the domain, see the METEO system used by Canada to do translation of weather reports, works flawlessly.) --jay
+
Optional abbreviations:
** Of course; it all depends on the use.
 
  
''mi'e [[jbocre: jezrax|jezrax]]''
+
||MOI1|PA MOI1 = ME LI1 PA||MOI2|PA MOI2 = ME LI2 PA||
 +
 
 +
Questionable abbreviations:
 +
 
 +
||PA1 da'a PA2 MOI1/2 = ME LE2 PA2 LI1/2 PA1||LI1 PA1 da'a PA2 = LE1 PA2 LI1 PA1||LI2 PA1 da'a PA2 = LE2 PA2 LI2 PA1||
 +
 
 +
Notes:
 +
 
 +
* A variable (including anaphors bound to a variable) is a candidate antecedent only if the anaphor is within the scope of the quantifier binding the variable. Variables are invisible to anaphors outside the scope of the variable's binder.
 +
* I would envisage LI4 and LI8 as being particularly useful. The others would be useful to a limited extent, but would easily get too complex for real-time mental processing.
 +
 
 +
* LE1/2 are not logical gadri; they just share the same syntax as gadri.
 +
 
 +
Equivalences with conservative scheme:
 +
 
 +
||lo (SE) nei xi PA|(LAhE) LI5 PA||ri xi PA|LI3 PA||go'i xi PA|ME LI1 PA ~ PA MOI1||dei xi PA|LI7 PA||
 +
 
 +
----------------------
 +
 
 +
[[User:xorxes|xorxes]]:
 +
 
 +
==== Re: Conservative version ====
 +
 
 +
*I understand that nei can't repeat the bridi that it targets without recursion, but couldn't it repeat the selbri of the bridi that it targets? I understand this means it can't be used with lo as an unglorkative sumti anaphor, but isn't it more useful and closer to its apparent intended meaning as a selbri anaphor? I'm thinking of things like {mi ba klama ca le nu do no'a} "I will go when you do" (from CLL). Formally, your proposed definition is a good way to cover all sumti in currently uncompleted bridi, but in practice the calculations involved seem to make it unusable, so I think I would prefer it to have some more useful function. (Eventually we could define an experimental form for the other function. That still leaves sumti in completed bridi out of reach by this method.) Also, you have that nei is by default no'a (both being neixipa). Shouldn't neixiro be the default for one of them?
 +
*I suppose rixino targets itself, neixino targets the bridi of which itself is the selbri (if there is one), go'ixino = neixiro?
 +
 
 +
[[User:And Rosta|And Rosta]]:
 +
 
 +
* no'a is nei xi re -- I've corrected the error in the table above.
 +
* Regarding the meaning of nei:
 +
 
 +
** I agree that selbri anaphora would be useful, but it doesn't quite seem natural to me, because it seems to me that "I will eat an apple when you do" is the same sort of phenomenon as "I will go when you do". So instead we need some sort of bridi anaphora rule that gets round the recursion problem.
 +
** It's not clear to me whether {lo nei} and {lo go'i} (XS lo) are supposed to be unglorkative pointers to the sumti in x1 of the target bridi. My view is that logo'i shouldn't be, and nor should lonei if nei is genuinely some sort of selbri/bridi anaphor. But that would leave the Conservative scheme without an extensible scheme for expressing "the x-n1 of the n2th-innermost bridi". That wouldn't really bother me though, if the Radical scheme were adopted.
 +
 
 +
** Regarding what is and isn't usable, we have to start with '''some''' unglorkative scheme that works in formal terms. Then we can set about making it usable. To some extent, the limits of usability might not be apparent until usage has tested it out.
 +
* rixino, neixino, go'ixino: probably they'd mean what you suggest, but they're not very useful.
 +
 
 +
[[User:xorxes|xorxes]]:
 +
 
 +
*I think the canonical nei is neixiro, not neixipa, i.e. more in accordance with the example in CLL than with what it says in the text. (I also suspect neixiro is more useful than neixipa, I'm not sure.)
 +
**[[User:And Rosta|And Rosta]]: I was inferring nei(xipa) from the mahoste: nei = 'current', no'a = 'next outer'. neixipa wd be useful for reflexives.
 +
 
 +
***[[User:xorxes|xorxes]]: Yes, if {vo'a} point to matrix arguments.
 +
*For incomplete bridi anaphora, I propose that the anaphor repeats the target bridi minus the explicit replacements but with the whole term that contains the anaphor replaced by zo'e. So for example {mi ba citka lo plise ca le nu la djan cusku le sedu'u do ba'o nei[[xiro|xiro]]} would give "I will eat an apple when John says that you have done it", i.e. "when John says that you have eaten an apple".
 +
 
 +
**[[User:And Rosta|And Rosta]]: I agree.
 +
*Your definition of the go'i series is more useful than the canonical one, which only allows them to stand for full sentence bridi, not for subordinate bridi. I have often needed to repeat a recently used selbri that was not the main selbri of a previous sentence. I end up using {co'e}, which is not really an anaphor. I would prefer not to have separate forms for completed and incomplete bridi though.
 +
 
 +
**[[User:And Rosta|And Rosta]]: I agree. In which case, perhaps go'i could be restricted to matrix bridi and nei could be generalized to all bridi? Or to all nonmatrix bridi?
 +
***[[User:xorxes|xorxes]]:The more useful ones I think are {go'i}, {go'a} and {go'u}, so I would want them to be the most general. I don't want to have to decide whether a previously used bridi is matrix or not, complete or not, before anaphorizing it.
 +
 
 +
*What counts as an utterance for the dei-series? I would like that {di'u} could point to part of what has been said previously, even part of a sentence. For example, I want to be able to say {la djan jinvi le du'u broda i mi dy tugni la'e di'u}, where {di'u} points to the partial bridi "broda".
 +
**[[User:And Rosta|And Rosta]]: I was thinking of dei as pointing to illocutions rather than to bridi. The Radical, but not the Conservative, Scheme allows you to target "a bridi within the previous matrix bridi" (LE1 su'o LI1 pa).
 +
 
 +
***[[User:xorxes|xorxes]]: dei is odd in that it doesn't repeat its antecedent, like other anaphora, but quotes it instead. Where would the dei series be used, besides the x2 of cusku?
 +
 
 +
==== Re: Radical version ====
 +
 
 +
I can't imagine any fully unglorkative scheme that will work in practice. I find even the seemingly harmless {ri[[xi pa|xi pa]]} hard to use. Anything that requires counting of previous sumti or bridi seems very unnatural. This scheme certainly seems powerful enough to cover everything, but I don't know. Any persuasive examples of how to use it?
 +
 
 +
[[User:And Rosta|And Rosta]]:
 +
 
 +
I'm not suggesting that glorkative 'anaphora' would not be used. Unglorkative anaphora could be rather taxing, so would be used only when worth the effort. (I don't think ko'a-assignment is any easier, btw, but you're not claiming it is.)
 +
 
 +
'''more anon'''
 +
 
 +
[[continued:]] They're worth the effort when the exactitude of glorklessness is sufficiently important. My hunch is that certain syntactic positions are more salient than others: sumti of bridi are more salient than sumti embedded within sumti; x1 is more salient than x5; sumti earlier in the bridi are more salient than sumti later; more recent targets are more salient than more distant ones; sumti of matrix bridi are more salient than sumti in embedded bridi... and so forth. So 'xi ro', 'xi pa' and 'xi re' might be workable, but the further from ro and pa one gets, the less accessible the antecedent would be. Prenexed and BAhE1-marked elements would be foregrounded as candidate antecedents, so should be particularly accessible. At any rate, unless the language is capable of unglorkative anaphora, then it cannot live up to certain of its goals. Yes, precision can be mentally taxing but the possibility of precision is what makes the language worthwhile in the first place.
 +
 
 +
[[User:xorxes|xorxes]]: Right, I find ko'a assignment also mostly unusable. I'm not convinced that the cases where it would be worth the effort to use unglorkatives are worth having the full scheme as part of the language. In order to have the system available for those (rare?) cases when it would be worth using you have to first invest effort in learning it, so it is not just the effort involved in using it that has to be factored in.
 +
 
 +
[[User:And Rosta|And Rosta]]: But the difficulty of the scheme is in using it, or learning to be an accomplished user of it, rather than learning the words & rules.
 +
 
 +
'''more anon'''
 +
 
 +
[[User:xorxes|xorxes]]: Right. I'm just not sure that the effort invested in learning to be an accomplished user of such a system is justified by the occasions in which it might be needed. You can always achieve precision by being very explicit: "the third sumti of the second last complete bridi" and such. Having cmavo to shorten that in a systematic way makes sense only if there is a certain frequency of use. The proposed LAhEs for example would be short forms of {lo sumti be fi li PA bei ...}, at least for some reading of the definition of "sumti".
 +
 
 +
[[User:And Rosta|And Rosta]]: Consider SE-conversion: this is well-defined, can be simple but quickly gets unfeasibly complicated when there are multiple SE. This doesn't detract from the utility of SE or the explicit rules of its definition. And while I have learnt to be a user of simplex SE, I have not bothered becoming accomplished as a user of multiple SE. If the Radical scheme were intrinsically complicated then I'd not advocate it, but it is actually a scheme that graduates from the simple to the impossibly complex. That is, the apparatus that generates the simple usable cases also happens to generate the complex unusable cases too. So I don't see the scheme as flawed; it's just not a panacea. If pressed to simplify the scheme, I would argue for just 3 types: (i) backcounting to sumti in prenex of uncompleted bridi; (ii) backcounting to BAhE-marked sumti; (iii) backcounting to BAhE-marked bridi. But I don't see why we must be so reductive. One doesn't have to achieve mastery of the anaphora scheme; one just needs to be able to use it for the simple cases at which the mind doesn't boggle, as with SE conversion.
 +
 
 +
Regarding your point about being precise by being longwinded -- yes, this is always true, but it greatly increases the cost of being precise. Since the possibility of being precise is the chief attraction of a logical language, it would therefore diminish the attractiveness of Lojban.
 +
 
 +
[[User:xorxes|xorxes]]: I suppose there is no harm in proposing experimental forms and seeing if they catch on. At least they will be useful in the definition of the conservative system. For the LAhE series I propose:
 +
 
 +
|| voi'a|LAhE|lo sumti be fi li pa bei|the x1 of|| voi'e|LAhE|lo sumti be fi li re bei|the x2 of|| voi'i|LAhE|lo sumti be fi li ci bei|the x3 of|| voi'o|LAhE|lo sumti be fi li vo bei|the x4 of|| voi'u|LAhE|lo sumti be fi li mu bei|the x5 of||
 +
 
 +
[[User:And Rosta|And Rosta]]: So are you inclined to feel that no unglorkative scheme is likely to prove usable? Your intuitions are pretty reliable.
 +
 
 +
I observe that you use lerfu anaphora a lot. Are you in favour of CLL's glorky lerfu anaphora rules, or of Jordan's unglorky ones?
 +
 
 +
[[User:xorxes|xorxes]]: CLL's glorky rules, definitely. As I said, even {ri} gives me trouble in usage except in the trivialest cases, as the calculation of which sumti is the last complete one does not come naturally at all. As soon as there is some subordination around I don't know how to use {ri} without pausing to think.
 +
 
 +
[[User:And Rosta|And Rosta]]: Okay. I think I should pare down the radical proposal, then. (And revise the conservative one to take account of your remarks on nei.) I'll have a rethink when time permits.
 +
 
 +
---------------
 +
 
 +
OK. Rethink yields the following:
 +
 
 +
==== Minimalist scheme ====
 +
 
 +
The radical scheme allows any antecedent to be targeted in afterthought. Such a scheme is necessarily complicated. It is too complicated to be usable except in simple cases. But a rationale for adopting the scheme would be that only through trying to use it would we discover the limits of its practical usability. As an alternative, the following scheme aims for greater simplicity, preserving only what is likely to be consistently useful.
 +
 
 +
* ''dei, di'u, di'e'' -- as in CLL.
 +
* BAhE1 + sumti -- mark as candidate antecedent for LI1
 +
 
 +
* BAhE1 + selbri -- mark as candidate antecedent for MOI1
 +
* LI1 PA -- PAth BAhE1-marked sumti
 +
 
 +
* LI1 BY -- glorky sumti anaphora
 +
* PA MOI1 -- PAth BAhE1-marked bridi
 +
 
 +
* BY MOI1 -- glorky bridi anaphora
 +
* LI2 -- the PAth sumti in prenex of uncompleted bridi (pa = last; ro = first in sentence)
 +
 
 +
* ri-series, go'i-series, nei-series and vo'a-series cmavo can be either recycled (which is especially desirable for monosyllabics, ''ri, ra, ru, nei'') or else defined in whatever way is most consistent with CLL and usage.
 +
 
 +
------------
 +
 
 +
[[User:xorxes|xorxes]]:
 +
 
 +
*The BAhE1 method requires forethought. What would be the advantages over the {ko'a goi}/{broda cei} method which also requires forethought? It doesn't use up variables, but it requires remembering an ordered list of candidate antecedents.
 +
**It doesn't require remembering which ko'V/fo'V form was assigned. goiko'a is good for sumti that are going to be referred back to repeatedly, but onerous otherwise. --[[User:And Rosta|And Rosta]]
 +
 
 +
***Why is remembering a ko'V/fo'V form more onerous than remembering a number? Also, in a long text, each ko'V/fo'V can be recycled individually, but the LI1 list can only keep getting longer, and if you want to start afresh you lose everything. Also, each time you make a new assignment you have to remember which was the last assignment that you made, and if you make just one mistake, you can ruin the whole list from there, whereas with ko'V/fo'V one mistake doesn't affect the rest of the assignments.
 +
***I would be hesitant to claim that one system is indubitably more onerous than the other, but they are certainly very differently onerous. The LI1 list is a stack: the most recent BAhE1-markee is first on the list. For approximately the last 3 BAhE1-markees the system should work fairly well.
 +
 
 +
***OK, I was forgetting that you count from the one added last. It still has awkward effects, for example you may refer to one thing with LI1re several times, and after a new use of BAhE1 you have to switch to LI1ci for what used to be LI1re.
 +
***Yes, though if you're referring to something that often then it might be better to use goi.
 +
 
 +
*LI1 BY is just like current BY, isn't it? BY MOI1 sounds interesting though. It could be quite useful.
 +
** LI1 BY: yes, except I'm not certain that plain lerfu are or should be or will be solely anaphoric, so LI1 BY is a potential disambiguator.
 +
 
 +
*LI2: What happens when there is more than one uncompleted bridi? Also, do the candidate sumti have to appear explicitly in the prenex?
 +
**When there is more than one uncompleted bridi, the prenex of the outer will precede the prenex of the inner, so the sequence of prenexes should be unambiguous. And yes, the candidate sumti should appear explicitly in the prenex.
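
The stack behaviour of the LI1 list discussed above can be illustrated with a small sketch. This is not part of the proposal: the class name `MarkedAntecedents` and its methods are invented here for illustration, and the marked items are represented as plain strings rather than parsed sumti.

```python
class MarkedAntecedents:
    """Hypothetical model of the BAhE1 / LI1 list as a stack.

    Assumption: every BAhE1-marked item is pushed in the order it is
    uttered, and LI1 PA counts back from the most recent one (pa = last).
    """

    def __init__(self):
        self._marked = []

    def mark(self, item):
        # Corresponds to marking an item with BAhE1.
        self._marked.append(item)

    def lookup(self, n):
        # Corresponds to LI1 PA with PA = n (1 = most recently marked).
        if not 1 <= n <= len(self._marked):
            raise IndexError(f"no marked antecedent at position {n}")
        return self._marked[-n]


antecedents = MarkedAntecedents()
antecedents.mark("lo mlatu")
antecedents.mark("lo gerku")
before = antecedents.lookup(2)   # LI1 re
antecedents.mark("lo cipni")
after = antecedents.lookup(3)    # the same item now needs LI1 ci
```

The last three lines reproduce the renumbering effect raised in the discussion: an item reached with LI1 re has to be reached with LI1 ci once a further item has been marked.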

Latest revision as of 08:27, 30 June 2014

On unglorkative anaphora

The goal of these proposals is to maximize the number of possible antecedents that can be targeted without glorking.

Conservative version

This version works with only preexisting cmavo and does not change cmavo definitions.

  • nei targets a bridi containing the anaphor.
  • 'Bridi' is ambiguous between 'syntactic bridi' and 'logical bridi': since there is complete consensus about identifying syntactic bridi, and there probably never will be consensus about identifying logical bridi, the relevant sense of 'bridi' for these anaphors is 'syntactic bridi'.
  • The logical meaning of nei must be "x1 is expressed by the fa (x1) syntactic sumti of the current syntactic bridi with x2 expressed by the fe (x2) syntactic sumti, etc.". That is, nei cannot be a repetition of the selbri. This is because it would result in an infinitely recursive meaning. Nei makes little or no sense except with (XS) {lo}.
  • xi subscripting identifies the bridi that is targeted:

||outermost|nei xi ro||one bridi down|nei xi da'a||two bridi down|nei xi da'a re||innermost|nei (xi pa)||one bridi up|nei xi re||two bridi up|nei xi ci||

  • Similar subscripting could be applied to vo'V (with unsubscripted vo'V being vo'V xi ro or vo'V xi pa or glorked, depending on how that debate is resolved), and this would be terser than using lo SE nei. But it would also clash with the use of subscripting to indicate x6+.
  • ri and go'i target completed sumti and (syntactic) bridi respectively, ordered by their start (outer starting before inner).
  • xi subscripting identifies the target:

||first in sentence|xi ro||second in sentence|xi da'a||third in sentence|xi da'a re||last|(xi pa)||penult|xi re||antepenult|xi ci||

  • dei targets utterances, the target identified by xi subscript:

||current|dei (xi no)||last|xi pa||penult|xi re||

The other anaphora are abbreviations of the basic nei/ri/go'i/dei (-- abbreviations based on mahoste glosses):

||no'a|nei xi re||ra|ri xi za'u||ru|ri xi so'i||go'a|go'i xi so'u||go'e|go'i xi re||go'u|go'i xi so'i||go'o|go'i xi ni'u su'o||da'e|dei xi ni'u so'i||da'u|dei xi so'i||de'e|dei xi ni'u so'u||de'u|dei xi so'u||di'e|dei xi ni'u pa||di'u|dei xi pa||
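
The counting rules in the xi tables above can be sketched in code. This is only an illustration under stated assumptions, not part of the proposal: the names `count_back` and `resolve` are invented here, the candidate list is assumed to hold the completed sumti (or bridi) of the current sentence ordered by their start, and only single-digit PA, ro, and da'a with an optional digit are handled.

```python
# Lojban digit cmavo and their values.
DIGITS = {"no": 0, "pa": 1, "re": 2, "ci": 3, "vo": 4,
          "mu": 5, "xa": 6, "ze": 7, "bi": 8, "so": 9}


def count_back(subscript, total):
    """Turn a xi subscript into 'how many candidates to count back'.

    subscript: the words after xi, e.g. "re", "ro", "da'a", "da'a re";
               an empty string stands for the unsubscripted anaphor.
    total:     the number of candidates available.
    """
    words = subscript.split()
    if not words:
        return 1                      # bare ri / go'i behaves like xi pa
    if words == ["ro"]:
        return total                  # "all": counts back to the first
    if words[0] == "da'a":
        # da'a = "all but one"; da'a PA = "all but PA"
        excluded = DIGITS[words[1]] if len(words) > 1 else 1
        return total - excluded
    if len(words) != 1:
        raise ValueError(f"unsupported subscript: {subscript!r}")
    return DIGITS[words[0]]


def resolve(candidates, subscript=""):
    """Pick the candidate that the subscripted anaphor targets."""
    n = count_back(subscript, len(candidates))
    if not 1 <= n <= len(candidates):
        raise IndexError(f"subscript {subscript!r} is out of range")
    return candidates[-n]


sumti = ["la djan", "lo plise", "le zarci", "mi"]
last = resolve(sumti)            # ri (xi pa)
penult = resolve(sumti, "re")    # ri xi re
first = resolve(sumti, "ro")     # ri xi ro
second = resolve(sumti, "da'a")  # ri xi da'a
```

The same counting would apply to nei if the list instead held the uncompleted bridi containing the anaphor, outermost first, so that xi pa picks the innermost and xi ro the outermost.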

Radical version

This version defines new cmavo to increase the number of targetables and the number of ways of targeting them.

New cmavo:

||LAhE-x1|the x1 of||LAhE-x2|the x2 of||LAhE-x3|the x3 of||LAhE-x4|the x4 of||LAhE-x5|the x5 of||

For forms, either use experimental cmavo based on FA, or else recycle vo'V as these LAhE. xi-subscripts on these LAhE indicate x6+.

||LI1|the PAth complete bridi|pa = last; ro = first in sentence; ni'u pa = next||LI2|the PAth complete matrix bridi|pa = last; ni'u pa = next||LI3|the PAth complete sumti|pa = last; ro = first in sentence; ni'u pa = next||LI4|the PAth sumti in prenex of uncompleted bridi|pa = last; ro = first in sentence||(LAhE-x1) LI5|the PAth uncompleted bridi|pa = innermost; ro = outermost||(LAhE-x1) LI6|the PAth uncompleted sumti|pa = innermost; ro = outermost||LI7|the PAth utterance|pa = last; no = current; ni'u pa = next||LI8|the PAth BAhE1-marked item|pa = last; ni'u pa = next||LE1 PA|the PAth complete bridi in|pa = first; ro = last (or vice versa?)||LE2 PA|the PAth complete sumti in|pa = first; ro = last (or vice versa?)||BAhE1|mark next as candidate antecedent for LI8||

Optional abbreviations:

||MOI1|PA MOI1 = ME LI1 PA||MOI2|PA MOI2 = ME LI2 PA||

Questionable abbreviations:

||PA1 da'a PA2 MOI1/2 = ME LE2 PA2 LI1/2 PA1||LI1 PA1 da'a PA2 = LE1 PA2 LI1 PA1||LI2 PA1 da'a PA2 = LE2 PA2 LI2 PA1||

Notes:

  • A variable (including anaphors bound to a variable) is a candidate antecedent only if the anaphor is within the scope of the quantifier binding the variable. Variables are invisible to anaphors outside the scope of the variable's binder.
  • I would envisage LI4 and LI8 as being particularly useful. The others would be useful to a limited extent, but would easily get too complex for real-time mental processing.
  • LE1/2 are not logical gadri; they just share the same syntax as gadri.

Equivalences with conservative scheme:

||lo (SE) nei xi PA|(LAhE) LI5 PA||ri xi PA|LI3 PA||go'i xi PA|ME LI1 PA ~ PA MOI1||dei xi PA|LI7 PA||


xorxes:

Re: Conservative version

  • I understand that nei can't repeat the bridi that it targets without recursion, but couldn't it repeat the selbri of the bridi that it targets? I understand this means it can't be used with lo as an unglorkative sumti anaphor, but isn't it more useful and closer to its apparent intended meaning as a selbri anaphor? I'm thinking of things like {mi ba klama ca le nu do no'a} "I will go when you do" (from CLL). Formally, your proposed definition is a good way to cover all sumti in currently uncompleted bridi, but in practice the calculations involved seem to make it unusable, so I think I would prefer it to have some more useful function. (Eventually we could define an experimental form for the other function. That still leaves sumti in completed bridi out of reach by this method.) Also, you have that nei is by default no'a (both being neixipa). Shouldn't neixiro be the default for one of them?
  • I suppose rixino targets itself, neixino targets the bridi of which itself is the selbri (if there is one), go'ixino = neixiro?

And Rosta:

  • no'a is nei xi re -- I've corrected the error in the table above.
  • Regarding the meaning of nei:
    • I agree that selbri anaphora would be useful, but it doesn't quite seem natural to me, because it seems to me that "I will eat an apple when you do" is the same sort of phenomenon as "I will go when you do". So instead we need some sort of bridi anaphora rule that gets round the recursion problem.
    • It's not clear to me whether {lo nei} and {lo go'i} (XS lo) are supposed to be unglorkative pointers to the sumti in x1 of the target bridi. My view is that logo'i shouldn't be, and nor should lonei if nei is genuinely some sort of selbri/bridi anaphor. But that would leave the Conservative scheme without an extensible scheme for expressing "the x-n1 of the n2th-innermost bridi". That wouldn't really bother me though, if the Radical scheme were adopted.
    • Regarding what is and isn't usable, we have to start with some unglorkative scheme that works in formal terms. Then we can set about making it usable. To some extent, the limits of usability might not be apparent until usage has tested it out.
  • rixino, neixino, go'ixino: probably they'd mean what you suggest, but they're not very useful.

xorxes:

  • I think the canonical nei is neixiro, not neixipa, i.e. more in accordance with the example in CLL than with what it says in the text. (I also suspect neixiro is more useful than neixipa, I'm not sure.)
    • And Rosta: I was inferring nei(xipa) from the mahoste: nei = 'current', no'a = 'next outer'. neixipa would be useful for reflexives.
      • xorxes: Yes, if {vo'a} points to matrix arguments.
  • For incomplete bridi anaphora, I propose that the anaphor repeats the target bridi minus the explicit replacements but with the whole term that contains the anaphor replaced by zo'e. So for example {mi ba citka lo plise ca le nu la djan cusku le sedu'u do ba'o neixiro} would give "I will eat an apple when John says that you have done it", i.e. "when John says that you have eaten an apple".
  • Your definition of the go'i series is more useful than the canonical one, which only allows them to stand for full sentence bridi, not for subordinate bridi. I have often needed to repeat a recently used selbri that was not the main selbri of a previous sentence. I end up using {co'e}, which is not really an anaphor. I would prefer not to have separate forms for completed and incomplete bridi though.
    • And Rosta: I agree. In which case, perhaps go'i could be restricted to matrix bridi and nei could be generalized to all bridi? Or to all nonmatrix bridi?
      • xorxes: The more useful ones I think are {go'i}, {go'a} and {go'u}, so I would want them to be the most general. I don't want to have to decide whether a previously used bridi is matrix or not, complete or not, before anaphorizing it.
  • What counts as an utterance for the dei-series? I would like {di'u} to be able to point to part of what has been said previously, even part of a sentence. For example, I want to be able to say {la djan jinvi le du'u broda i mi dy tugni la'e di'u}, where {di'u} points to the partial bridi "broda".
    • And Rosta: I was thinking of dei as pointing to illocutions rather than to bridi. The Radical, but not the Conservative, Scheme allows you to target "a bridi within the previous matrix bridi" (LE1 su'o LI1 pa).
      • xorxes: dei is odd in that it doesn't repeat its antecedent, like other anaphora, but quotes it instead. Where would the dei series be used, besides the x2 of cusku?
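
xorxes's rule for incomplete bridi anaphora above (repeat the target bridi, keep the explicit replacements, and put zo'e in place of the whole term containing the anaphor) can be sketched roughly as follows. The dictionary representation, the slot labels and the function name `expand_anaphor` are all assumptions made for this illustration, as is the choice to treat the tense as just another replaceable slot.

```python
def expand_anaphor(target, containing_slot, replacements):
    """Rebuild the bridi that an anaphor such as neixiro would stand for.

    target          -- {"selbri": str, "terms": {slot: text}} (hypothetical model)
    containing_slot -- slot of the term in which the anaphor itself occurs
    replacements    -- terms stated explicitly alongside the anaphor
    """
    if containing_slot not in target["terms"]:
        raise KeyError(containing_slot)
    terms = dict(target["terms"])      # copy, so the target is left untouched
    terms[containing_slot] = "zo'e"    # cut the recursion at the containing term
    terms.update(replacements)         # explicit replacements override the copy
    return {"selbri": target["selbri"], "terms": terms}


# {mi ba citka lo plise ca le nu la djan cusku le sedu'u do ba'o neixiro}
target = {
    "selbri": "citka",
    "terms": {
        "x1": "mi",
        "tense": "ba",
        "x2": "lo plise",
        "ca": "le nu la djan cusku le sedu'u do ba'o neixiro",
    },
}
result = expand_anaphor(target, "ca", {"x1": "do", "tense": "ba'o"})
print(result["selbri"], result["terms"])
```

Under these assumptions the result keeps citka and lo plise, takes do and ba'o from the replacements, and has zo'e in the ca slot, which matches the gloss "you have eaten an apple".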

Re: Radical version

I can't imagine any fully unglorkative scheme that will work in practice. I find even the seemingly harmless {rixi pa} hard to use. Anything that requires counting of previous sumti or bridi seems very unnatural. This scheme certainly seems powerful enough to cover everything, but I don't know. Any persuasive examples of how to use it?

And Rosta:

I'm not suggesting that glorkative 'anaphora' would not be used. Unglorkative anaphora could be rather taxing, so would be used only when worth the effort. (I don't think ko'a-assignment is any easier, btw, but you're not claiming it is.)

more anon

continued: They're worth the effort when the exactitude of glorklessness is sufficiently important. My hunch is that certain syntactic positions are more salient than others: sumti of bridi are more salient than sumti embedded within sumti; x1 is more salient than x5; sumti earlier in the bridi are more salient than sumti later; more recent targets are more salient than more distant ones; sumti of matrix bridi are more salient than sumti in embedded bridi... and so forth. So 'xi ro', 'xi pa' and 'xi re' might be workable, but the further from ro and pa one gets, the less accessible the antecedent would be. Prenexed and BAhE1-marked elements would be foregrounded as candidate antecedents, so should be particularly accessible. At any rate, unless the language is capable of unglorkative anaphora, then it cannot live up to certain of its goals. Yes, precision can be mentally taxing but the possibility of precision is what makes the language worthwhile in the first place.

xorxes: Right, I find ko'a assignment also mostly unusable. I'm not convinced that the cases where it would be worth the effort to use unglorkatives are worth having the full scheme as part of the language. In order to have the system available for those (rare?) cases when it would be worth using you have to first invest effort in learning it, so it is not just the effort involved in using it that has to be factored in.

And Rosta: But the difficulty of the scheme is in using it, or learning to be an accomplished user of it, rather than learning the words & rules.

more anon

xorxes: Right. I'm just not sure that the effort invested in learning to be an accomplished user of such a system is justified by the occasions in which it might be needed. You can always achieve precision by being very explicit: "the third sumti of the second last complete bridi" and such. Having cmavo to shorten that in a systematic way makes sense only if there is a certain frequency of use. The proposed LAhEs for example would be short forms of {lo sumti be fi li PA bei ...}, at least for some reading of the definition of "sumti".

And Rosta: Consider SE-conversion: this is well-defined, can be simple but quickly gets unfeasibly complicated when there are multiple SE. This doesn't detract from the utility of SE or the explicit rules of its definition. And while I have learnt to be a user of simplex SE, I have not bothered becoming accomplished as a user of multiple SE. If the Radical scheme were intrinsically complicated then I'd not advocate it, but it is actually a scheme that graduates from the simple to the impossibly complex. That is, the apparatus that generates the simple usable cases also happens to generate the complex unusable cases too. So I don't see the scheme as flawed; it's just not a panacea. If pressed to simplify the scheme, I would argue for just 3 types: (i) backcounting to sumti in the prenex of uncompleted bridi; (ii) backcounting to BAhE-marked sumti; (iii) backcounting to BAhE-marked bridi. But I don't see why we must be so reductive. One doesn't have to achieve mastery of the anaphora scheme; one just needs to be able to use it for the simple cases at which the mind doesn't boggle, as with SE conversion.

Regarding your point about being precise by being longwinded -- yes, this is always true, but it greatly increases the cost of being precise. Since the possibility of being precise is the chief attraction of a logical language, it would therefore diminish the attractiveness of Lojban.
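
The SE-conversion comparison above can be made concrete with a small sketch: each of se, te, ve, xe exchanges x1 with x2, x3, x4, x5 respectively, and stacking them composes the swaps. The function name `convert` is hypothetical, and applying the rightmost SE first is an assumption of this sketch rather than something settled in the discussion here.

```python
# Each SE cmavo exchanges x1 with the numbered place.
SE_PLACE = {"se": 2, "te": 3, "ve": 4, "xe": 5}


def convert(se_words, arity=5):
    """Return the original place numbers in their converted order.

    se_words is the sequence of SE cmavo as written before the selbri;
    they are applied rightmost first (an assumption of this sketch).
    """
    places = list(range(1, arity + 1))
    for word in reversed(se_words):
        n = SE_PLACE[word]
        places[0], places[n - 1] = places[n - 1], places[0]
    return places


print(convert(["se"]))              # [2, 1, 3, 4, 5]
print(convert(["se", "te", "se"]))  # [1, 3, 2, 4, 5]
```

A single SE is easy to read off, while even three stacked SE already need this kind of bookkeeping, which is the sense in which the same apparatus generates both the usable and the unusable cases.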

xorxes: I suppose there is no harm in proposing experimental forms and seeing if they catch on. At least they will be useful in the definition of the conservative system. For the LAhE series I propose:

||voi'a|LAhE|lo sumti be fi li pa bei|the x1 of||
||voi'e|LAhE|lo sumti be fi li re bei|the x2 of||
||voi'i|LAhE|lo sumti be fi li ci bei|the x3 of||
||voi'o|LAhE|lo sumti be fi li vo bei|the x4 of||
||voi'u|LAhE|lo sumti be fi li mu bei|the x5 of||

And Rosta: So are you inclined to feel that no unglorkative scheme is likely to prove usable? Your intuitions are pretty reliable.

I observe that you use lerfu anaphora a lot. Are you in favour of CLL's glorky lerfu anaphora rules, or of Jordan's unglorky ones?

xorxes: CLL's glorky rules, definitely. As I said, even {ri} gives me trouble in usage except in the trivialest cases, as the calculation of which sumti is the last complete one does not come naturally at all. As soon as there is some subordination around I don't know how to use {ri} without pausing to think.

And Rosta: Okay. I think I should pare down the radical proposal, then. (And revise the conservative one to take account of your remarks on nei.) I'll have a rethink when time permits.


OK. Rethink yields the following:

Minimalist scheme

The radical scheme allows any antecedent to be targeted in afterthought. Such a scheme is necessarily complicated. It is too complicated to be usable except in simple cases. But a rationale for adopting the scheme would be that only through trying to use it would we discover the limits of its practical usability. As an alternative, the following scheme aims for greater simplicity, preserving only what is likely to be consistently useful.

  • dei, di'u, di'e -- as in CLL.
  • BAhE1 + sumti -- mark as candidate antecedent for LI1
  • BAhE1 + selbri -- mark as candidate antecedent for MOI1
  • LI1 PA -- PAth BAhE1-marked sumti
  • LI1 BY -- glorky sumti anaphora
  • PA MOI1 -- PAth BAhE1-marked bridi
  • BY MOI1 -- glorky bridi anaphora
  • LI2 -- the PAth sumti in prenex of uncompleted bridi (pa = last; ro = first in sentence)
  • ri-series, go'i-series, nei-series and vo'a-series cmavo can be either recycled (which is especially desirable for monosyllabics, ri, ra, ru, nei) or else defined in whatever way is most consistent with CLL and usage.

xorxes:

  • The BAhE1 method requires forethought. What would be the advantages over the {ko'a goi}/{broda cei} method which also requires forethought? It doesn't use up variables, but it requires remembering an ordered list of candidate antecedents.
    • It doesn't require remembering which ko'V/fo'V form was assigned. goiko'a is good for sumti that are going to be referred back to repeatedly, but onerous otherwise. --And Rosta
      • Why is remembering a ko'V/fo'V form more onerous than remembering a number? Also, in a long text, each ko'V/fo'V can be recycled individually, but the LI1 list can only keep getting longer, and if you want to start afresh you lose everything. Also, each time you make a new assignment you have to remember which was the last assignment that you made, and if you make just one mistake, you can ruin the whole list from there, whereas with ko'V/fo'V one mistake doesn't affect the rest of the assignments.
      • I would be hesitant to claim that one system is indubitably more onerous than the other, but they are certainly very differently onerous. The LI1 list is a stack: the most recent BAhE1-markee is first on the list. For approximately the last 3 BAhE1-markees the system should work fairly well.
      • OK, I was forgetting that you count from the one added last. It still has awkward effects, for example you may refer to one thing with LI1re several times, and after a new use of BAhE1 you have to switch to LI1ci for what used to be LI1re.
      • Yes, though if you're referring to something that often then it might be better to use goi.
  • LI1 BY is just like current BY, isn't it? BY MOI1 sounds interesting though. It could be quite useful.
    • LI1 BY: yes, except I'm not certain that plain lerfu are or should be or will be solely anaphoric, so LI1 BY is a potential disambiguator.
  • LI2: What happens when there is more than one uncompleted bridi? Also, do the candidate sumti have to appear explicitly in the prenex?
    • When there is more than one uncompleted bridi, the prenex of the outer will precede the prenex of the inner, so the sequence of prenexes should be unambiguous. And yes, the candidate sumti should appear explicitly in the prenex.
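
The stack behaviour of the LI1 list discussed above (the most recent BAhE1-markee is first, and earlier markees shift down by one with each new mark) can be sketched as follows. The class name `MarkedAntecedents` and its methods are hypothetical, introduced only for this illustration.

```python
class MarkedAntecedents:
    """Stack of BAhE1-marked items; LI1 pa is the most recently marked."""

    def __init__(self):
        self._items = []

    def mark(self, item):
        """Record an item as a candidate antecedent (the BAhE1 step)."""
        self._items.append(item)

    def resolve(self, pa):
        """Return the PAth marked item, counting back from the latest."""
        if not 1 <= pa <= len(self._items):
            raise IndexError("no marked antecedent at that index")
        return self._items[-pa]


marked = MarkedAntecedents()
marked.mark("la djan")
marked.mark("lo plise")
print(marked.resolve(2))   # la djan
marked.mark("le zarci")
print(marked.resolve(2))   # lo plise
print(marked.resolve(3))   # la djan
```

The last two lines show the effect xorxes points out: after a new mark, what used to be LI1 re has to be reached with LI1 ci.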