Hugh Pickens writes “Christopher Shea writes in the WSJ that physicists studying Google’s massive collection of scanned books claim to have identified universal laws governing the birth, life course and death of words, marking an advance in a new field dubbed ‘Culturomics’: the application of data-crunching to subjects typically considered part of the humanities. Published in Science, their paper gives the best-yet estimate of the true number of words in English: a million, far more than any dictionary has recorded (the 2002 Webster’s Third New International Dictionary has 348,000), with more than half of the language considered ‘dark matter’ that has evaded standard dictionaries (PDF). The paper tracked word usage through time (each year, for instance, 1% of the world’s English-speaking population switches from ‘sneaked’ to ‘snuck’) and found that English continues to grow at a rate of 8,500 new words a year. However, the growth rate is slowing, partly because the language is already so rich that the ‘marginal utility’ of new words is declining. Another discovery is that the death rate for words is rising, largely as a matter of homogenization: regional words disappear, and spell-checking programs and vigilant copy editors choke off the chaotic variety of words much more quickly, in effect speeding up the natural selection of words. The authors also identified a universal ‘tipping point’ in the life cycle of new words: roughly 30 to 50 years after their birth, words either enter the long-term lexicon or tumble off a cliff into disuse and go ‘23 skidoo’ as children either accept or reject their parents’ coinages.”