
Zipf's law holds for phrases, not words

Ryland Williams, Jake
Lessard, Paul R.
Desu, Suma
Clark, Eric M.
Bagrow, James P.
Danforth, Christopher M.
Sheridan Dodds, Peter
DOI
10.1038/srep12209
Abstract
With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a rich frontier of rigorous text analysis via a rank ordering of mixed length phrases.
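The abstract names random text partitioning and a rank ordering of mixed length phrases without giving details, so the following is only an illustrative sketch and not the authors' method. It assumes each gap between adjacent words is cut independently with probability q; the names random_partition, rank_frequency, and q are hypothetical and do not come from this record.

```python
import random
from collections import Counter


def random_partition(words, q=0.5, rng=None):
    """Split a word list into phrases by cutting each word gap with probability q.

    Assumption (not taken from the record): cuts are independent and q is
    the same for every gap. q=1.0 yields single words, q=0.0 one long phrase.
    """
    if rng is None:
        rng = random.Random()
    if not words:
        return []
    phrases = []
    current = [words[0]]
    for word in words[1:]:
        if rng.random() < q:
            # Cut here: close the current phrase and start a new one.
            phrases.append(" ".join(current))
            current = [word]
        else:
            # No cut: the word extends the current phrase.
            current.append(word)
    phrases.append(" ".join(current))
    return phrases


def rank_frequency(phrases):
    """Return (rank, phrase, count) tuples, most frequent phrase first."""
    counts = Counter(phrases)
    return [
        (rank, phrase, count)
        for rank, (phrase, count) in enumerate(counts.most_common(), start=1)
    ]


if __name__ == "__main__":
    text = "the cat sat on the mat and the cat slept on the mat"
    parts = random_partition(text.split(), q=0.5, rng=random.Random(0))
    for rank, phrase, count in rank_frequency(parts):
        print(rank, count, phrase)
```

A single random draw like this gives a noisy phrase list. Averaging counts over many draws would be one way to smooth it, though whether that matches the paper's procedure cannot be confirmed from this record.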
Date
2015-08-11
Citation
Williams JR, Lessard PR, Desu S, Clark EM, Bagrow JP, Danforth CM, Dodds PS. Zipf’s law holds for phrases, not words. Scientific reports. 2015 Aug 11;5:12209.