In 1998, speakers of English have enormous power in the world. English is not the language spoken by the most native speakers--that is Mandarin Chinese, spoken by about a billion people. English comes second, and has "only" about half a billion speakers, but a key dynamic is that many of them are non-native speakers. A central component of the education and empowerment of millions of people around the world is learning English as a second language. English is the language of technology--of international finance, of the Internet, of aviation, of science. It is also the language of the overwhelming wave of American culture: of blue jeans, Coca-Cola, and Michael Jordan. This wave is passing over an earlier wave of British imperial culture that has made English a national language in Ireland, the West Indies, West Africa, Kenya, South Africa, the Indian subcontinent, Australia, and New Zealand--not to mention, of course, in North America itself, which is the prerequisite for the current American domination of world popular culture.

It may seem as if the native speaker of English is at the center of the linguistic world as we approach the year 2000. Yet a look at the family relationships of the English language can be a Copernican experience. Go for instance to the Ethnologue Language Families Page, and see where we are. There are between 75 and 100 major language families in the world. English is a twig on a branch of one of nine major subfamilies of one of those families, Indo-European. In terms of the geographical range of the Indo-European languages, English is literally marginal, originating in the far west of the range, on the very edge of the European subcontinent.

A language family can be conveniently defined as a group of languages that contain cognate words. The English words father, mother and daughter are, in Classical Greek, pater, meter and thugater. These English words do not come from Greek, because we have evidence of them in Old English texts long before anyone in England spoke or read Greek. Hence their close similarity must be due to their being cognate, that is, descended from the same words in a common ancestor. Modern English and Modern Greek are very distant cousins. Both descend from a language spoken in Europe a very long time ago.

We call that language Indo-European, but we know nothing directly about it. No Indo-European texts survive. No spoken Indo-European language today is any more "primitive," or closer to the original I-E, than any other one. Nor is there anything special about I-E in particular. Probably about two billion people speak Indo-European languages today, but in its origins, it was just another dialect spoken by rather unprepossessing, illiterate tribes that lived somewhere in Eastern Europe--linguists differ on just where--thousands of years ago.

Language families are separate when they can't be shown to contain cognate words. The perception of cognate words is a very inexact science. Some people see larger-order connections between language families than others can.

What was Indo-European culture like, and how do we know that the language existed?

What was Indo-European like?

That's a good question :-) We know next to nothing about Indo-European culture, and everything that we know must be inferred indirectly, including the structure and vocabulary of the language. (Since the geographical origin of Indo-European is unclear, we cannot certainly link the language to any archeological data.) We know a few things about the culture from the words that survive as cognates in modern languages. They had cattle, sheep, and dogs; they had horses and wheeled vehicles. They used copper, bronze, gold and silver; they knew of beech trees, eels, salmon, and honey, and they had boats. They worshiped a sky god who was conceived of as a patriarchal Father.

How do we know that?

There are cognates in widely scattered I-E languages for those words. Sanskrit, an ancient Indo-Aryan language from the east of the I-E range, has a word uksan that means "ox," matching our own word. There's also a Sanskrit word avi- that means "sheep," and the Latin word for "sheep" is ovis. Our word for a female sheep is ewe. Out of such cognates, we reconstruct a vocabulary and a culture.

Let's use some examples of historical reconstruction in order to show how difficult and how speculative a task it is. We'll start with numbers, giving the Old English, Latin, Greek and Sanskrit in turn:

  1. An, unus, heis, eka
  2. Twa, duo, duo, dvau
  3. Þrie, tres, treis, trayas
  4. Feower, quattuor, tettares, catvaras
  5. Fif, quinque, pente, panca
  6. Siex, sex, hex, sat
  7. Seofon, septem, hepta, sapta
  8. Eahta, octo, okto, astau
  9. Nigon, novem, ennea, nava
  10. Tien, decem, deka, dasa

Here we are dealing with unambiguous concepts and very clear and robust data--the words for the basic integers are in use every day, and we would not expect them to change much at all. Even so, there are big differences. What makes the differences manageable is their systematic nature, their cumulative power, and their gradation. If you only had feower and catvaras, you'd be hard-pressed to see any connection between English and Sanskrit. But feower and fif vary systematically with Latin quattuor and quinque, so we establish a cognate at one end and then worry about the less systematic connections between Latin and the more eastern languages later. Similarly, Greek hex looks very unlike Sanskrit sat, but the connection of hex to hepta and sat to sapta helps us formulate a rule for Greek: I-E initial s becomes h in Greek under some conditions. (We could extend this by noting that the Greek for snake is herpeton and the Latin is serpens.) Can you make other rules out of the limited chart above?
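
One mechanical way to start hunting for such rules is to count how the first letters of the number words line up between two of the languages. What follows is only a rough Python sketch: the names NUMBERS, LANGS and initial_correspondences are made up for this illustration, and comparing first letters of transliterated spellings is a crude stand-in for what linguists really compare, which is sounds.

```python
from collections import Counter

# Forms copied from the chart above, lowercased.
# Column order: Old English, Latin, Greek, Sanskrit.
LANGS = ("Old English", "Latin", "Greek", "Sanskrit")
NUMBERS = {
    1: ("an", "unus", "heis", "eka"),
    2: ("twa", "duo", "duo", "dvau"),
    3: ("þrie", "tres", "treis", "trayas"),
    4: ("feower", "quattuor", "tettares", "catvaras"),
    5: ("fif", "quinque", "pente", "panca"),
    6: ("siex", "sex", "hex", "sat"),
    7: ("seofon", "septem", "hepta", "sapta"),
    8: ("eahta", "octo", "okto", "astau"),
    9: ("nigon", "novem", "ennea", "nava"),
    10: ("tien", "decem", "deka", "dasa"),
}


def initial_correspondences(lang_a, lang_b):
    """Count pairs of first letters for the same number in two languages.

    A pair that turns up more than once hints at a systematic
    correspondence worth a closer look; a pair seen once proves nothing.
    """
    i, j = LANGS.index(lang_a), LANGS.index(lang_b)
    pairs = Counter()
    for forms in NUMBERS.values():
        pairs[(forms[i][0], forms[j][0])] += 1
    return pairs


if __name__ == "__main__":
    for a, b in (("Greek", "Sanskrit"), ("Old English", "Latin")):
        repeated = {p: n for p, n in initial_correspondences(a, b).items() if n > 1}
        print(a, "vs", b, repeated)
```

Run on Greek against Sanskrit, the count turns up the h/s pairing from hex/sat and hepta/sapta; run on Old English against Latin, it turns up f/q (feower/quattuor, fif/quinque) and t/d (twa/duo, tien/decem). Ten words are far too few to prove a rule, but the repeats are where you would begin.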


Naturally, almost no other cases of reconstruction follow such neat rules. Here, for instance, are some color terms. You'd expect color terms to be as stable as numbers, but such is not the case. I give here words for some basic colors in English, German, French, Italian, and Greek:

Here we quickly run into desperate straits. For one thing, even concepts as basic as white and black do not seem to have cognates across the I-E languages. The word "red" does (the middle of the Greek erythros is the key cognate element). But "blue" has a strange history, because it is a word borrowed into Romance languages from Germanic and then borrowed back into English, whereupon all the Romance languages except French dropped it and used instead something like azzurro (cf. Spanish azul).

Our little excursion into historical reconstruction has been a disaster . . . except that linguists who have more data at hand have many ways to elaborate and refine this kind of analysis. We might look for cognates, for instance, in words that are less closely related semantically. White is a common Germanic word, but obviously doesn't match the Italic or Greek words for "white." It does match a Sanskrit root *cvid- which means "to be bright"--our cognate from the east here, slightly altered in conceptual value. There's an OE word swart "black" that survives in the ModE "swarthy," linking OE to German schwarz. ("Black" is an obscure word, and in Old and Middle English there's a word blac that means "white"--so you can see how hopeless much reconstruction is.) Like translation, reconstruction is not a matter of mechanical decoding, but is an inexact and artistic science.

Our last example will be days of the week, to show something else that happens to languages. New cultural forces can erase even very conservative words and concepts. Here are the days of the week in English, German, French, Italian, Spanish, and Portuguese:

  1. Sunday, Sonntag, dimanche, domenica, domingo, domingo
  2. Monday, Montag, lundi, lunedì, lunes, segunda-feira
  3. Tuesday, Dienstag, mardi, martedì, martes, terça-feira
  4. Wednesday, Mittwoch, mercredi, mercoledì, miércoles, quarta-feira
  5. Thursday, Donnerstag, jeudi, giovedì, jueves, quinta-feira
  6. Friday, Freitag, vendredi, venerdì, viernes, sexta-feira
  7. Saturday, Samstag, samedi, sabato, sábado, sábado

The days are associated with gods--but unpredictably so. The I-E sky god has his day on Tuesday in Germanic but on Thursday in Romance. The moon is always Monday, but the other gods are mixed around a bit, and Saturday seems to commemorate a Roman god in English but the Hebrew concept of "sabbath" in Romance. Christian concepts have therefore driven the old gods out of the Romance weekends but not out of the weekdays. English has a wholly pagan week, but the middle day in German has become a neutral descriptor. And in Portuguese, uniquely, neutral or Christian terms have driven all of the gods out of the week altogether.

New cultural forces--as, here, Christianity--rearrange old vocabularies. It is posited that we could take two cognate languages and tell how long ago they diverged by counting the percentage of basic words still cognate. Such an approach is called glottochronology, by analogy to the use of the molecular clock in biological evolution. Still, though language is evolutionary and biological, the presence of truly cultural elements means that glottochronology is a terribly inexact science. Just based on the above list, for instance, you would think that French, Italian and Spanish diverged very recently, and that Portuguese must have diverged well before, to have retained only 28.6% of its cognates against 100% agreement in the other three languages. Yet of course, Portuguese and Spanish have diverged relatively recently. Cultural forces act more quickly and unpredictably than any biological force.
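
The arithmetic behind those percentages, and behind the glottochronological "clock," can be sketched in a few lines of Python. Everything here is illustrative: the cognate labels in ROMANCE_WEEK are hand-assigned from the list of days above, the function names are invented for this sketch, and the dating formula t = ln(c) / (2 ln r), with a retention rate r of about 0.86 per thousand years, is the form usually quoted for the method rather than anything taken from these notes. Treat the constant as an assumption.

```python
import math

# Hand-assigned cognate classes for Sunday through Saturday.
# Day names that share a label are counted as cognate (an assumption
# made for this sketch, based on the list of days above).
GODS = ["lord", "moon", "mars", "mercury", "jove", "venus", "sabbath"]
ROMANCE_WEEK = {
    "French": GODS,
    "Italian": GODS,
    "Spanish": GODS,
    "Portuguese": ["lord", "second", "third", "fourth", "fifth", "sixth", "sabbath"],
}


def shared_cognate_fraction(lang_a, lang_b, table=ROMANCE_WEEK):
    """Fraction of list positions where the two languages share a cognate."""
    a, b = table[lang_a], table[lang_b]
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def divergence_millennia(fraction, retention=0.86):
    """Estimated time since the split, in thousands of years.

    Uses t = ln(c) / (2 ln r). The default retention rate is an
    assumed textbook value, not a measured one.
    """
    return math.log(fraction) / (2 * math.log(retention))


if __name__ == "__main__":
    c = shared_cognate_fraction("Portuguese", "Spanish")
    print(f"shared cognates: {100 * c:.1f}%")
    print(f"implied split: {divergence_millennia(c):.1f} thousand years ago")
```

Under those assumptions, the two shared day names out of seen yield the 28.6% figure, and the formula turns it into a split of a little over four thousand years, which is absurd for two languages that both descend from Latin. Seven culturally loaded words are exactly the kind of sample that makes the clock run wild.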