Zipf's Law
Latest revision as of 07:02, 9 July 2014
Zipf's Law (see for example http://www.kornai.com/MatLing/statling.html) was formulated in the 1940s by Harvard linguistics professor George Kingsley Zipf (1902-1950) as an empirical generalisation; it states that the n-th most frequent word in a language shows up with frequency 1/n.
So the most frequent two words account for 150% of the language?
- ... ignoring boundary cases, obviously.
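The boundary case being joked about can be made precise: the raw weights 1/n do not sum to 1, so for a proper probability distribution over N word types they must be divided by the harmonic number H_N. A minimal sketch (the function name and vocabulary size are illustrative, not from the original page):

```python
def zipf_distribution(n_types):
    """Return normalized Zipfian probabilities for ranks 1..n_types."""
    weights = [1.0 / rank for rank in range(1, n_types + 1)]
    total = sum(weights)  # the harmonic number H_N
    return [w / total for w in weights]

# With a hypothetical vocabulary of 10,000 word types:
probs = zipf_distribution(10000)

# The probabilities now sum to 1, and the two most frequent words
# account for a plausible share of the text, not "150%".
print(sum(probs))
print(probs[0] + probs[1])  # roughly 0.15 for N = 10000
```

With the normalization in place, the share of the top two words is (1 + 1/2) / H_N, which shrinks slowly as the vocabulary grows.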
Zipf made the further assumption that the shorter a word is, the more common it is; this ties in with the more general empirical observation that 'smaller' events are commoner than 'larger' events. (See http://www.parc.xerox.com/istl/groups/iea/papers/ranking/ranking.html for other laws expressing this.) This observation is also referred to loosely as 'Zipf's Law', but is not what people outside linguistics understand by it.
However, this is only a generalization, and every language has common polysyllabic terms, because they are useful. It doesn't mean a long term is somehow "doomed". (And as Talen says, 'If you want orlin, you know where to find it.')
- How common do you mean? Just "everyday", or "extremely high frequency, top 250 words" sort of thing? If the former, then yes, but that is not really the key issue. If the latter, I will bother to check for English. --And Rosta