proposed jbovlaste guidelines

This is my (arj's) proposal for new Jbovlaste guidelines. Feel free to edit directly, or send me comments or suggestions.

Introductory remarks

The aim of these guidelines is to enable editors with different styles to collaborate on creating a dictionary database that is as consistent and high-quality as possible.

Unless otherwise specified, "English" refers to any of the metalanguages that Jbovlaste uses. (In the interface, these are called "natural languages", even though some constructed languages, such as Klingon and Loglan, are included.)

Lemma selection

When preparing dictionaries for natural languages, the usual problem is deciding which of the millions of words are important enough to need a definition. Lojban's vocabulary is much smaller than that, but we have limited manpower with which to write definitions. Some words are more important than others, and should be prioritized.

That being said, most of us work on Jbovlaste because it is fun. If we didn't spend some time once in a while with fun distractions, we would get demotivated, and the important things would never get done.

Most important words

  1. Lujvo that are referred to in the gismu list
  2. Lujvo that have seen the most usage over the longest period of time
  3. Lujvo that have seen any usage at all
  4. Cmevla that have seen the most usage over the longest period of time (and are not nonce)

Least important words

  • Completely novel words that cover some previously uncovered area

Words that shouldn't be in the dictionary at all (or shouldn't have their definitions edited)

  • Cmavo or cmavo compounds
    • These shouldn't be touched until the BPFK is complete.
  • Unofficial gismu
    • Unlike experimental cmavo, the language description makes no provision for using words of gismu form that are not part of the closed set of gismu. "Experimental" gismu are not valid Lojban.
  • Poorly thought out lujvo
    • Exception: lujvo that have seen a lot of usage

The definition

All definitions end with a period.

Definitions of brivla must include a place structure. In fact, almost all definitions of a brivla consist of nothing but a place structure definition.

The best way to determine the place structure of a brivla is to base it on usage. Unfortunately, there is almost never enough usage data to gain any insight into the places beyond x1.

If available, existing Noralujv entries may be used as a template. These entries usually reflect best practices, but be wary of clerical errors.

Place structures

Fu'ivla place structures should have plain numbered arguments: x_1 ... x_n.

Lujvo place structures should have arguments that point to their component gismu. The symbol for each argument should be the shortest possible abbreviation of the gismu such that no ambiguity occurs. For instance, if the lujvo is based on bloti and girzu, it is okay to just use b and g. If the lujvo is based on badna and bancu, however, bad and ban must be used.

The arguments must be in ascending order and without skipping any argument. If it is necessary to reorder the arguments in the definition to make it flow more smoothly, all places must be explicitly numbered with x_1 - x_n.

As a consequence, if a component gismu begins with x, at least two letters must be used in the argument abbreviation, so that it can't be misunderstood as referring to a numbered place.
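The abbreviation rule above can be sketched in Python. This is only an illustration, not part of Jbovlaste: the function name `place_abbreviations` is hypothetical, and the sketch assumes that ambiguity only needs to be avoided among the (distinct) component gismu of a single lujvo.

```python
def place_abbreviations(gismu):
    """Return the shortest unambiguous prefix for each component gismu.

    Hypothetical helper for illustration; assumes the gismu are distinct
    and that only the components of one lujvo can clash with each other.
    """
    abbreviations = {}
    for word in gismu:
        others = [g for g in gismu if g != word]
        # A gismu starting with "x" needs at least two letters, so that it
        # cannot be mistaken for a numbered place such as x_1.
        start = 2 if word.startswith("x") else 1
        for length in range(start, len(word) + 1):
            prefix = word[:length]
            # Accept the first prefix that no other component shares.
            if not any(other.startswith(prefix) for other in others):
                abbreviations[word] = prefix
                break
    return abbreviations
```

With this sketch, bloti and girzu yield b and g, while badna and bancu yield bad and ban, matching the examples above.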

Parentheses

Parentheses usually have the same meaning as in ordinary language.

You may specify which kind of abstraction is expected in a particular place by adding the abstractor (of selma'o NU) after the term in question.

Square brackets

Some gismu definitions use square brackets. Since their meaning is not completely understood, they should be avoided in new definitions.

Alternations

A solidus ('/') between different words or phrases can be used to indicate a distinction made in English that is not made in Lojban. This might best be illustrated by an example.

rodbo'e: $brode_1$ is a foo/bar/quux of $broda_1$.

This means that foo, bar, and quux are all rodbo'e, along with other items on the foo-bar-quux continuum.

However, a solidus is never to be used in the gloss keywords, because this will lead to a database error.

Gismu

The wording of a gismu definition in English is baselined and must not be modified. If you really think it's necessary, bring it up with the BPFK.

All gismu changes must be described on the Approved gismu Alterations page.

Typographical changes for clarification, hyperlinking Lojban words (with curly braces) and so on are fine. Adding new entries to the gloss word and keyword fields is also fine. In both cases, post to the Approved gismu Alterations page. Any other gismu changes should probably be brought to the BPFK, or at least mentioned on the jbovlaste mailing list.

Notes

The notes field is logically organised in several parts. Not every part needs to be present in every entry, but the parts that are present should always appear in the same order.

The notes field should not contain any information that is or could be automatically generated. In particular, the notes field for a lujvo should not explain which gismu it is composed of (but it may point to those gismu in the cross-references).

Notes on place structure

If the place structure of a lujvo departs significantly from jvojva, this should be explained. Typically this means that a place has been omitted because it contains little useful information.

Incorrect

  • Omitted: x5 = klama2 (destination) = bartu1 (something external).
  • Omit $x_3$ = se {klama} (destination) = {ckana} (bed)

Correct

  • Omit $x_4=s_2=m_1$.

Cross-references

Cross-references are introduced by "Cf." (an abbreviation for "confer", Latin for "compare"), followed by one or more words enclosed in curly brackets and separated by a comma and a space. The list ends with a period.

Example

  • Cf. {broda}, {brode}, se {brodi}.
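As a minimal sketch, the cross-reference format could be produced mechanically as follows. The function name `format_cross_references` is hypothetical and not part of Jbovlaste; the sketch assumes that any particle preceding a word (such as se) stays outside the curly brackets, as in the example above.

```python
def format_cross_references(entries):
    """Build a "Cf." line from a list of words.

    Hypothetical helper for illustration. Each entry is a Lojban word,
    optionally preceded by particles such as "se" (e.g. "se brodi").
    """
    if not entries:
        raise ValueError("at least one cross-reference is required")
    parts = []
    for entry in entries:
        # The last token is the word to bracket; anything before it
        # is kept outside the curly brackets.
        *particles, word = entry.split()
        parts.append(" ".join(particles + ["{" + word + "}"]))
    return "Cf. " + ", ".join(parts) + "."
```

For example, `format_cross_references(["broda", "brode", "se brodi"])` gives the line shown in the example above.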

Examples

Examples should only occur in example fields, not anywhere else. Examples should illustrate the meaning and/or usage of the word in question. Genuine examples (i.e. live usage) are preferred over invented examples.

Gloss keywords and place keywords

The purpose of keywords is to serve as a natural-language index to the Lojban dictionary. The Jbovlaste infrastructure does not work well as a basis for an English-to-Lojban dictionary, and this is in fact undesirable. To the extent that Lojban and English carve up reality in different ways in their vocabulary, the selection of English keywords should reflect this.

Citation form

Keywords should be given in their citation form. They should not be inflected, or have particles attached to them.

Incorrect

  • to look
  • an apple
  • intercepts

Correct

  • look
  • apple
  • intercept
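Some of these problems can be caught mechanically. The following is a rough heuristic sketch, not an existing Jbovlaste check: the name `keyword_problems` and the list of leading particles are assumptions made for illustration. It flags a leading infinitive marker or article, and also the solidus that the Alternations section forbids in gloss keywords. It cannot detect inflected forms such as "intercepts", which still need a human eye.

```python
# Assumed, non-exhaustive list of particles that should not start a keyword.
LEADING_PARTICLES = ("to ", "a ", "an ", "the ")

def keyword_problems(keyword):
    """Return a list of problems found in a keyword (empty if none).

    Hypothetical heuristic for illustration; it only inspects the
    surface form and does not recognise inflected words.
    """
    problems = []
    lowered = keyword.lower()
    for particle in LEADING_PARTICLES:
        if lowered.startswith(particle):
            problems.append("starts with particle '" + particle.strip() + "'")
    if "/" in keyword:
        problems.append("contains a solidus")
    return problems
```

Under these assumptions, "to look" and "an apple" are flagged, while "look" and "apple" pass.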

Key "words" may consist of multiple words, and a multi-word expression may in some cases be the best solution. However, such a keyword must always be a distinct lexical item, and it should not be too long.

Incorrect

  • red balloon
  • pointed towards
  • millimeter standard

Correct

  • chewing gum
  • take care of
  • volitional entity

In exceptional cases, part of a keyword can be fronted with a comma, to ease searching.

Example

  • fever, have a

Multiple senses

When to use multiple senses

The sense field of a keyword should be used if an English speaker (with no knowledge of Lojban) perceives a word to have multiple distinct meanings.

Multiple senses of a word should not be used to indicate a semantic distinction that occurs only in Lojban. A keyword must never be split into multiple senses simply to ensure that two or more Lojban synonyms (or near-synonyms) are included in the English part of the dictionary.

Incorrect

  • throw up (vomit) / throw up (due to alcohol)

Correct

  • Uranus (planet) / Uranus (god)
  • stable (sturdy) / stable (for horses)
  • wing (of animal) / wing (of building)

How to write a good sub-sense description

The sub-sense of a keyword (the part in the parentheses) should preferably be an exact equivalent (synonym), with which the keyword could be replaced in running text without change in meaning. A short phrase may be used, if necessary.

If a usable synonym cannot be found, the second-best solution is to use an adjective, prepositional phrase or adverbial that is meant to be read as part of the keyword.

Grammatical terms (noun, verb, adjective etc.) must never be used as a sub-sense description. The only exception to this is the term "attitudinal", since linking to attitudinals from the English index would otherwise be too difficult.

Terms that specify which field a jargon term belongs to (mathematics, botany, etc.) are likewise to be avoided.

Incorrect

  • e-mail (noun) / e-mail (verb)
  • point (needle) / point (mathematics)

Correct

  • e-mail (message) / e-mail (send e-mail)
  • point (of a needle) / point (mathematical)
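The two prohibitions above lend themselves to a simple check. The sketch below is illustrative only: the names `split_keyword` and `sub_sense_problems` are hypothetical, the term lists are assumed and deliberately incomplete, and the "keyword (sub-sense)" notation is taken from the examples above rather than from how Jbovlaste stores the sense field.

```python
import re

# Assumed, incomplete lists for illustration. "attitudinal" is
# deliberately absent, since it is the one permitted grammatical term.
GRAMMATICAL_TERMS = {"noun", "verb", "adjective", "adverb", "preposition"}
FIELD_TERMS = {"mathematics", "botany"}

def split_keyword(entry):
    """Split "keyword (sub-sense)" into its two parts.

    Returns (keyword, None) when no sub-sense is given.
    """
    match = re.fullmatch(r"(.+?)\s*\((.+)\)", entry.strip())
    if match is None:
        return entry.strip(), None
    return match.group(1), match.group(2)

def sub_sense_problems(entry):
    """Return a list of problems with the sub-sense (empty if none)."""
    _, sense = split_keyword(entry)
    if sense is None:
        return []
    lowered = sense.lower()
    if lowered in GRAMMATICAL_TERMS:
        return ["grammatical term used as sub-sense"]
    if lowered in FIELD_TERMS:
        return ["field name used as sub-sense"]
    return []
```

With these lists, "e-mail (noun)" and "point (mathematics)" are flagged, while "e-mail (message)" and "point (mathematical)" pass.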