the atoms of language

29 february 2016

The Atoms of Language, now 15 years old, has become a classic of popular-science writing. The field is linguistics, which has produced several such classics, including Steven Pinker's The Language Instinct, but Mark C. Baker's writing is more patient than Pinker's, less showy, less hectoring. It's also decidedly more difficult, but difficult does not mean abstruse or involuted. It's just a difficult topic to talk about: the essential deep similarity of all human languages, despite their crazy superficial differences.

It's an obvious phenomenon, though. You meet someone who speaks incomprehensibly, and a third party can interpret – an everyday occurrence for many people. Baker's vivid opening example is the second world war's Navajo Code Talkers. Japanese intelligence agents had no idea what the American Code Talkers were saying in Navajo, but the Code Talkers themselves could move between English and Navajo with great facility.

(Although the analogy is not direct, of course. Baker simplifies it to make his point clearer. The Code Talkers didn't talk in plain Navajo – for one thing, they would have had to borrow a lot of vocabulary from English and geography to do so, and that vocabulary would have been vulnerable to analysis. Instead, English messages were encoded, translated, transmitted, translated back, and then decoded. No step was very complicated, but the synthesis of all of them was baffling. Baker, however, presents a situation where bilingual English-Navajo speakers simply spoke in Navajo and were unintelligible to Japanese listeners. Close enough for analogy's sake, because such a system would indeed work as Baker describes it. But since his primary interest is in syntax, and the coded messages did not use Navajo syntax, he's making an imaginative leap. The codes did make use of Navajo phonology that was opaque to outsiders, but Baker's not very interested in phonology.)

In any case, the point is clear: languages must be different enough to utterly confuse non-speakers, but similar enough to be mutually translatable, even on the fly and under stress, on any topic imaginable. What gives?

The bulk of Baker's book is a careful examination of what gives in terms of the possible syntaxes of the world's languages. Here he is as clear as possible, though he depends a little too much on extended analogies to chemistry and the history of chemistry (hence the "atoms" of the title). Baker's "atoms" are parameters: the settings that determine the essential syntax of a language. These are suites of features that move in highly predictable step with one another, so that languages that put the verb at the end of a sentence tend to use postpositions. English this prepositions with not do does.

The information swells in this long middle section and becomes impossible to remember, but there's no test at the end. Like languages themselves, the content is of less interest to a linguist than the principles followed. Baker establishes very coherently that there are a few different paradigms for major syntactic patterns. A few are very common (verb after subject and direct object, verb after subject but before direct object), a few are rare (verb at the start of a sentence) and some are virtually absent (direct object at start of a sentence, which is why Yoda sounds impossible).

All children can learn any of these languages if they start early enough, but there's not unlimited choice about what syntactic form that language will take. This is why Baker chooses syntax. The sound of words seems constrained largely by the limits of our vocal apparatus, and their meaning constrained by nothing but imagination. But no language, as it develops and changes, starts putting direct objects at the start of sentences.

In fact, given the inventiveness and capacity of the human mind, it seems odd that all languages have subjects and direct objects, and that almost all tend to put the subject before the direct object. The cat ate the rat; misery loves company. Why should we inevitably explain the world in terms of a first thing acting on a second thing? It's a brilliant question, to me as awesome as any speculation about the nature of the Universe.

Baker, like all linguists primarily influenced by Noam Chomsky, is himself locked into a paradigm that examines sentences and asks whether they are possible in a given language. It's a valuable abstraction from the messiness of actual speech. It does lead to some weird sentences, though. Among his very first examples is this:

English-speaking … adults don't say sentences like *Who does Pat think that will marry Chris? whereas they do say sentences like Whom does Pat think that Chris will marry? (44)

But precisely nobody outside of the library on Downton Abbey says Whom does Pat think that Chris will marry? Could Baker have come up with a less natural example of an English sentence? Now, to be fair, the impossibility of that sentence is all in the "whom." People do say things like "Who do you think Chris'll marry?" – which is syntactically the same. Still, at times Baker is more interested in whether a sentence will run through an abstract parser than whether any known speaker would speak it.

The prologue and the long central section of The Atoms of Language are really just preambles to its final chapter, a meditative essay on how we got to be our linguistic selves that seems to (non-specialist) me not to have dated at all in fifteen years. Baker wonders whether the "parametric" nature of syntax in a natural language – its reliance on choosing one of a few basic patterns from many theoretically possible permutations – is a cultural choice, or evolved via natural selection. Both seem unlikely. Prescriptive forces can't get people to say "whom" instead of "who," let alone opt for verb-final syntax. Meanwhile, what on earth could be the selective advantage of multiple locked-in paradigms for syntax which, despite their inherent simplicity, still lead to a bewildering variety of mutually unintelligible languages?

Baker invokes a sense of "mystery" – not supernatural, merely inexplicable in our current state of knowledge, or perhaps inexplicable in any possible future state of self-knowledge. The selective advantage of language itself is obvious and enormous. The selective value of Babel is not. Particularly intractable is the universal openness of human infants toward learning one, or even several, native languages, combined with the near-universal dauntingness of learning an additional language in adulthood.

Of course, biological traits don't necessarily have to have selective advantage. Baker seems slightly wedded to the notion that any genetic hard-wiring must have arisen because it provided advantage, so he casts around looking for possible advantages. (Possibly Babel is a way for humans to reinforce the differences between in-groups and out-groups, for instance.) Yet for a trait like lactose tolerance to spread, all that really matters is that it not be selected against. Traits need not serve any grand adaptive purpose. If human language is profoundly effective (because we can kick the asses of dumbly symbolic species), it can survive being less than perfectly efficient (because we are scattered by Babel).

Baker notes that the history of world languages is analogous to the evolution and filiation of biological species, but notes as well that there are "important disanalogies" between the two phenomena. The essential parametric identity of known languages is therefore a ray of hope, I think. When biological species go extinct, they are not coming back, and biodiversity is irretrievably lost. But human languages have not evolved with the kind of freedom that species have. Baker traces processes whereby subject-object-verb languages can become subject-verb-object languages in a generation or two – it happened to English in historic times. This suggests that the range of possible languages, never wide to begin with, is diversely represented in the capacity of all human children. Language death is not like extinction: languages can evolve, die out, and re-evolve in their full possible complexity.

Language death is regrettable for all sorts of other reasons, of course. It represents the death of cultures, their accumulated lore and verbal artistry, and possibly some of their knowledge of the natural world. But such death does not seal off potential futures in quite the same way as biological extinction does.

Baker, Mark C. The Atoms of Language. New York: Basic [Perseus], 2001. P 107 .B35