This page summarizes information about the Lojban Dictionary: its history, current form, and future design.

Dictionary Software and Data

Dictionary Discussion

Several pages on the wiki discuss the Lojban Dictionary. Those are collected here:

Dictionary Design

This section is an exploration of the issue started in this discussion about the dictionary backend.

What use cases will use Lojban dictionary data?

What end uses will this data be put to?

  • A print version of the Lojban dictionary
  • jbovlaste, the online dictionary, including editing and voting on dictionary entries.
  • Flash card definitions, which are more concise.
  • Interactive tools, like glossers and parsers
    • Ideally, jbofihe could use this data instead of having its own version.
    • Similarly, the list of rafsi for jvocuhadju could be extracted from this data.

What sorts of things *should* a Lojbanic dictionary store, ideally?

  • Are any of these use cases unsuitable for a dictionary? Are there use cases we haven't thought of that are suitable?
  • Can we separate a definition from its grammatical context? Are we treating Lojban gismu too much like verbs or nouns in the way we handle them now?
  • Should we use some manner of spatial visualization? (e.g., cpacuvisualization)
  • Versioning: it would be really nice to have BPFK proposals reference "checkpoint 123 in the dictionary software". One possibility: have an "official word" checkbox, a button that says "checkpoint all the official words, give the checkpoint a number", and the ability to diff successive checkpoints. Alternatively, have the checkpoint button capture everything, and let the diffs be filtered to show only the official words, or not. That would be of more general use: there could also be a "this word is worth printing" button, and then any given official print run of the dictionary would be all those words, from checkpoint #123 or whatever.
    • Related: A definition that has the "officially approved" and "official word" checkboxes set can't be edited directly. You can fork it, and then there's a button that outputs a diff between the fork and the official version. An admin could then move the "officially approved" checkbox around.
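
The checkpoint idea above can be sketched in a few lines. This is only an illustration of the second variant (checkpoint everything, filter the diff to official words afterwards), not a description of how jbovlaste works. The class name `Dictionary`, its methods, and the placeholder definition strings are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Dictionary:
    """Hypothetical in-memory model; not the jbovlaste schema."""
    # word -> (definition, is_official); the live, editable state
    entries: dict = field(default_factory=dict)
    # frozen snapshots; checkpoint number N is stored at index N - 1
    checkpoints: list = field(default_factory=list)

    def set_entry(self, word, definition, official=False):
        self.entries[word] = (definition, official)

    def checkpoint(self):
        """Snapshot the current state and return its checkpoint number."""
        self.checkpoints.append(dict(self.entries))
        return len(self.checkpoints)

    def diff(self, old, new, official_only=False):
        """Return {word: (old_definition, new_definition)} for changed words."""
        before = self.checkpoints[old - 1]
        after = self.checkpoints[new - 1]
        changes = {}
        for word in sorted(set(before) | set(after)):
            old_def, old_official = before.get(word, (None, False))
            new_def, new_official = after.get(word, (None, False))
            # Optionally skip words that are official in neither checkpoint.
            if official_only and not (old_official or new_official):
                continue
            if old_def != new_def:
                changes[word] = (old_def, new_def)
        return changes


d = Dictionary()
d.set_entry("klama", "placeholder text, version 1", official=True)
d.set_entry("zabna", "placeholder text, version 1")
first = d.checkpoint()
d.set_entry("klama", "placeholder text, version 2", official=True)
d.set_entry("zabna", "placeholder text, version 2")
second = d.checkpoint()
official_changes = d.diff(first, second, official_only=True)
```

Here `official_changes` contains only the entry for klama, while an unfiltered `d.diff(first, second)` would report both words. A "worth printing" flag could be handled the same way as the official flag, by filtering a chosen checkpoint.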

What storage format is going to work for all of these use cases?

  • Each use of the dictionary data needs to view it in a different way. Can we design a format that can be shared between all users of the data?
  • Since a dictionary entry is often unstructured text, what would a definition look like that supports all of our use cases without duplicating it in several forms? (e.g., a brief definition for a study card, a full definition for a print dictionary, and an archived discussion for the online dictionary)
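
One possible answer to the question above, offered as a rough sketch rather than a proposal: store each entry once as a small structured record and derive each view from it. The field names (`gloss`, `definition`, `notes`) and the three rendering functions are hypothetical, and the sample strings are placeholders, not real dictionary text.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical record holding one entry in a single place."""
    word: str
    gloss: str                # brief form, e.g. for a study card
    definition: str           # full form, e.g. for the print dictionary
    notes: list = field(default_factory=list)  # archived discussion


def flash_card(entry: Entry) -> str:
    """Concise view: the word and its brief gloss only."""
    return f"{entry.word}: {entry.gloss}"


def print_entry(entry: Entry) -> str:
    """Print view: the word followed by the full definition."""
    return f"{entry.word}\n    {entry.definition}"


def online_entry(entry: Entry) -> str:
    """Online view: the full definition plus any archived discussion."""
    lines = [f"{entry.word}: {entry.definition}"]
    lines += [f"  note: {note}" for note in entry.notes]
    return "\n".join(lines)


sample = Entry(
    word="klama",
    gloss="placeholder gloss",
    definition="placeholder full definition",
    notes=["placeholder discussion comment"],
)
card = flash_card(sample)
```

Each use case reads the same record, so a correction to `definition` reaches the print and online views without being copied by hand.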

What prior art is there?

  • Dictionaries are not a new problem. How do other people deal with this?
  • Is Lojban fundamentally different because of its formal grammar? Is our thinking on this problem influenced by working with languages that don't have a formal grammar? Is a dictionary a compromise for which we have a better solution?

Design Proposal

gismu, lujvo, and selma'o have different storage requirements. The proposal below assumes they will be stored in a database, and describes the storage schema.