Word frequency lists: Difference between revisions
Revision as of 14:47, 23 March 2014
Main lists
- Full list (all words including cmavo clusters): File:MyFreq-COMB_without_dots.txt
- Word frequency lists: gismu
How to generate lists yourself
- See discussion for details
- The Lojbanic corpus is available as a .tar.gz archive.
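
As a rough illustration of what generating such a list from the corpus might involve, here is a minimal Python sketch. It is not the method used for the lists on this page (see the discussion page for that). The tokenization is an assumption: dots are treated as separators, everything is lowercased, and words are split on whitespace. The function names are made up for this example.

```python
from collections import Counter


def word_frequencies(text):
    """Count how often each whitespace-separated word occurs.

    Assumption: "." only marks pauses, so it is replaced by a space,
    and case is ignored. A real pipeline may tokenize differently.
    """
    words = text.lower().replace(".", " ").split()
    return Counter(words)


def format_frequency_list(counts):
    """Return 'count word' lines, most frequent first, ties in alphabetical order."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{count} {word}" for word, count in ranked]


# A few made-up sentences standing in for the extracted corpus text.
sample = "mi klama le zarci .i mi prami do .i do klama"

for line in format_frequency_list(word_frequencies(sample)):
    print(line)
```

To run this over the real corpus, the text of the unpacked archive would be read from disk and passed to word_frequencies in place of the sample string.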
Older stuff
- Older word frequencies can be found at http://www.lojban.org/files/roadmap.html#draft-dictionary_working
- line-templates-by-frequency.txt is a sorted list of "sentence templates" excerpted from IRC. It shows which sequences of selma'o/word types are most common.
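
To make the idea of a "sentence template" concrete, the sketch below replaces each word in a line with its selma'o or word type and counts the resulting sequences. It only illustrates the concept and is not how the linked file was produced. The WORD_CLASS table is a tiny hand-written stand-in for a real word list, the "?" placeholder for unknown words is an assumption, and the function names are invented for this example.

```python
from collections import Counter

# Hypothetical, hand-written lookup table for illustration only.
# A real run would need the selma'o of every cmavo and a way to
# recognize gismu, lujvo, cmevla and so on.
WORD_CLASS = {
    "mi": "KOhA",
    "do": "KOhA",
    "le": "LE",
    "cu": "CU",
    "klama": "gismu",
    "prami": "gismu",
    "zarci": "gismu",
}


def line_template(line):
    """Map each word of a line to its class; unknown words become '?'."""
    return " ".join(WORD_CLASS.get(word, "?") for word in line.split())


def template_frequencies(lines):
    """Return (template, count) pairs, most common template first."""
    templates = (line_template(line) for line in lines if line.strip())
    return Counter(templates).most_common()


# Made-up lines standing in for IRC log entries.
irc_lines = ["mi prami do", "do klama le zarci", "do prami mi"]

for template, count in template_frequencies(irc_lines):
    print(count, template)
```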
Robin Lee Powell's lists
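A frequency-ordered list of gismu and cmavo (http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/big_list), based on Lojban IRC, Alice, and a few other large texts. There is also a large selection of intermediary files (http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/), including pure frequency lists.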