Peter Mark Roget, the nineteenth-century physician and polymath who invented the thesaurus, was the grandson of a Geneva clockmaker. His father, a Protestant minister, had immigrated to London in 1775. It seems appropriate that the scion of Calvinists and technicians would be the man to organize the English language into a thousand concepts, grouped into six classes that are in turn subdivided into divisions and sections. Had you needed a synonym for "love," for instance, you would have found it lodged between concept No. 896, "Congratulation," and No. 898, "Hate," both filed under the sixth class, "Affections"--not, say, "Matter," where the division "Sensation" resides, though Byron might have put it there; nor, say, "Intellect," where Dante might have put it.
The first reverse-engineered dictionary was published in 1852 in London, when Roget was 73, after a lifetime of compulsive list-making. According to his most recent biographer, Joshua Kendall--The Man Who Made Lists: Love, Death, Madness, and the Creation of Roget's Thesaurus (Berkley; paper, $16)--his first list seems to have been "Dates of Deaths," which began with his father's death date when Peter was 4. But Roget's Thesaurus was also the fruit of an age whose mania was classification, and the class/division/section scheme of the book was the direct descendant of the phylum/class/order system first put in place in 1735 by Linnaeus to organize the plant and animal kingdoms. In so many ways, English was a forest full of flora and fauna; Roget was out to mold it into a botanical garden and zoo.
We still live in that world, with technology-driven semantic fields birthing whole species of new vocabulary annually. In lieu of a definitive answer to the question "How many words are there in English?" the Oxford English Dictionary Online has a chart listing tranches of vocabulary from most to least common, with the percentage of the corpus each tranche represents and a few example words. The hundred most common words (from, because, go, me, our...) account for 50 percent of the corpus; the thousand most common words (girl, win, decide...) account for 75 percent of the corpus; but at 99 percent of the corpus we may have a vocabulary of 1 million words, which include the likes of endobenthic and pomological. That 1 percent of extremely rare, specialized words is what takes us from the average American high-schooler's vocabulary of 60,000 words to the endlessly receding horizon of ever more exotic, but extremely precise, terminology. English is a monster with a very long tail, and that's why attempts to tame it--from Roget to Strunk and White--are vulnerable to poetic backlash. Even in the popular imagination our paragon remains cornucopian Shakespeare, who wrote before standardized spelling.
About 400 million people speak this monster English as a first language. Few people shed tears over dying languages displaced by English and other national languages. According to the Hans Rausing Endangered Languages Project, there are 6,500 languages in the world today, and half of them are bound for extinction within fifty to 100 years. Because these languages lack the disarming smile of the African slender-snouted crocodile, the pathos of the Galapagos penguin or the splendor of the golden langur, it's difficult to mount a telegenic campaign to preserve and promulgate them.
Daniel L. Everett, author of Don't Sleep, There Are Snakes: Life and Language in the Amazonian Jungle (Pantheon; $26.95) and the subject of a controversial New Yorker profile last year, went to the Amazon jungle in 1977 to translate the Gospels into Piraha. He is now the world's leading expert on this language, which is spoken by fewer than 400 Indian natives. English uses about forty phonemes; Piraha about eleven. We average about 60,000 words and counting in our workaday lives (see above); the Pirahas reject nonindigenous vocabulary. As Everett puts it in his book, "To talk about things that have no place in their own culture, such as other gods, Western ideas of germs, and so on, would require the Pirahas to adopt a change in life and thought. So they avoid such talk." And thus it has been with this people since their first encounters with Brazilians in the seventeenth century.
In this extremely conservative culture, Pirahas don't give credence to any experience that wasn't directly witnessed by themselves or their interlocutors. Hence, they have no creation myths or histories. What if your religious conversations went something like this:
"Hey Dan, what does Jesus look like? Is he dark like us or light like you?"
"Well, I have never actually seen him. He lived a long time ago. But I do have his words."
"Well, Dan, how do you have his words if you have never heard him or seen him?"
The Pirahas represent the Occam's razoresque antithesis to our linguistic fecundity: even recursion, the act of embedding multiple clauses in sentences, is an unnecessary complication where simple sentences will do.
Although Piraha has a small vocabulary and little grammar, it serves its speakers very well, and Everett is at pains to convince us that "maybe we don't need much grammar after all in an esoteric culture." Born to a culture as exoteric as ours, I find it hard to imagine a language that relies more on interpersonal context and tone than on description and syntax. In fact, much Piraha communication is done through prosody: whistle speech, hum speech, musical speech and yell speech. Everett is convinced that these profound differences in our languages reflect valuable differences in experience. The Pirahas have a difficult time learning math, which seems to be a direct result of having no words for numbers; however, Piraha encodes perceptions that are indigenous to life in the Amazon. Everett's work has helped to revive the theories of Benjamin Whorf and Edward Sapir, which were thought to have been laid to rest by Chomsky's rationalist theory of universal grammar. Language may well decide what we are able to think.
I imagine that, given the economy of Piraha, synonyms are rare. But even in English, and after centuries of thesauruses, the question lingers: do synonyms exist? Ever since the first effort to contain them, Abbe Gabriel Girard's 1718 La justesse de la langue francoise (Synonymes francois in later editions), people have asked whether two words can ever mean the very same thing. For Roget, the lists in his thesaurus were an aide-memoire for le mot juste: the right word, the exact shade of connotation. Our versions of whistle speech, hum speech and so forth might also be encompassed by a list of near-synonyms: love, ardor, predilection, inclination are all used slightly differently. Is every word in English a piece of the puzzle of our existence? Do synonyms replicate at the smallest levels the function of languages in relation to their totality? Would a master thesaurus contain the history of human perception?
Am I getting universalist now? In fact, the Linnaean impulse underlying Roget's Thesaurus and its prototypes was born of a desire to establish a universal language on a par with Latin. Roget leaned heavily on both Abbe Girard and Bishop John Wilkins's An Essay Towards a Real Character and a Philosophical Language (1668), which went so far as to propose a symbolic language that would usurp the many colorful but divisive local ones. Perhaps someday we will communicate in logical symbols instead of this prolific, verbose and multifarious monster English, which seems to be the map so big it covers the territory and then some. On the other hand, it's the "and then some" that is our growing end. If we don't embrace large swaths of English--those schoolmarmish twenty-five-cent words, medicinal latinates, frothy slang--aren't we denying that mischievous faculty of ours where curiosity about the world--the Pirahas', Roget's, the golden langurs'--overlaps with our instinct for play?
There's a poignant irony lingering in the space between Roget's and Everett's stories. Roget was afflicted with melancholy and saw generations of his family succumb to depression and suicide; he found solace in Christianity even in the face of rising secularism. The evangelical Everett came out as an atheist after several decades of fieldwork. He lost his own tribe (and marriage) in the bargain but found in the nonchalantly skeptical Pirahas the happiest, most contented people he had ever known. Maybe the search for natural medicines in the Amazon should be redirected from its flora to its languages.