I came across an opinion story at The Chronicle today about how we tend to focus on war more than peace.  I read the Language Log often and they cringe at the pop media “words arguments”:  language X has more words for Y.  Notably, the Eskimo snow argument crops up often.  Part of the problem is illustrated in this comic:  we have quite a few ways of describing snow in English, but that is often ignored.  Other problems relate to the morphology of those languages.

The Chronicle story brings up the Eskimo snow example while saying that we have words for many different wars, but only one word for peace.  For starters, I’m assuming words like tranquility and harmony don’t really count – we’re talking about the first sense in this WordNet listing (the state prevailing during the absence of war).  The thing that strikes me is that we’re describing the absence of something (war).  How many different absences could there be?  How could you describe the absence of something?

One could perhaps give examples of different peacetime policies or cultures, but they rarely relate to the absence of war, and we already have names for such periods (e.g., Truman Era, Great Depression).  Other languages have similar names, such as Qing Dynasty or Edo Period.  Roughly speaking, they are an alternate description of time, but policy/culture isn’t mutually exclusive with war.

At a higher level, I wonder what it’d take for me to believe a words-based argument.  The Language Log highlighted an interesting post about untranslatable words, pointing out the difference between “not translatable” and “doesn’t have a one-word translation”.  Though I have to say the Russian toska seems close to the English word heartache.  Some of the others are more interesting.

Thinking about words without a single-word equivalent in other languages, the words-argument seems to roughly depend on the idea that shorter realizations reflect frequency (which I’d loosely agree with).  But the argument only considers the difference between one word and multiple words.  Maybe it would be better to consider the time it takes to say a given word or phrase.  The problem with counting words directly is that some languages have rich morphology or compounding systems, so what counts as a single word varies from language to language.

At the same time, I also wonder about word frequency.  How do you decide which versions of “snow” count?  Do you count slang?  How do you handle dialect differences in languages?  Is soda vs pop one word or two in the counting scheme?  (The argument for one might say that the terms are mutually exclusive.)  When you’re looking at dialect problems, how do you account for culture size and spread?  (I’d expect to find more regional variations in English than in Inuit.)

The only idea I have for such a study is that you’d have to come up with all single-word realizations related to a concept in both languages, then translate each to a phrase in the other language.  Then for each pair of semantically identical phrases/words you would need to compare either the natural speaking length or something like the relative frequency of each.  If you’re comparing relative frequency, you need to normalize out the cultural/regional effects somehow (e.g., how often do you need to talk about snow if you’re sampling English from Florida vs Inuit in Alaska?).  You could probably attempt this by sampling texts from the same region, but in that case the languages may have already borrowed some words from each other.  If you don’t have regionally similar data, you might need to compare something like P(realization_1 | wanted to talk about snow).  In other words, normalize out the differences in wanting to talk about the (loose) semantic concept.
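
To make that normalization concrete, here is a rough sketch of what I have in mind.  The counts are made up for illustration (they are not real corpus data), and the function name is just something I picked.  The idea is to divide each realization's count by the total number of times the concept came up at all, so a region that rarely talks about snow can still be compared with one that talks about it constantly.

```python
from collections import Counter


def conditional_realization_probs(counts):
    """Estimate P(realization | concept) from raw counts of each realization.

    `counts` maps each word or phrase used for the concept to how often it
    appeared.  Dividing by the total removes the effect of how often the
    concept is discussed in the first place.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {form: n / total for form, n in counts.items()}


# Hypothetical counts: the second sample mentions snow 100x more often,
# but the speakers pick among the same words in the same proportions.
english_florida = Counter({"snow": 8, "sleet": 1, "slush": 1})
english_alaska = Counter({"snow": 800, "sleet": 100, "slush": 100})

florida_probs = conditional_realization_probs(english_florida)
alaska_probs = conditional_realization_probs(english_alaska)

for form in sorted(florida_probs):
    print(form, florida_probs[form], alaska_probs[form])
```

The raw frequencies differ by a factor of 100 here, but the conditional distributions come out the same.  Of course this skips the hard part, which is deciding which words and phrases count as realizations of the concept to begin with.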

I think the intended basis of the argument is to say that communication related to a concept is more or less efficient in certain languages.  To go a step further, I imagine people expect that languages develop as an optimization process.  Given that, I’m curious just what it would take for someone to convince me that one language is more optimal for certain communication than another language.


