I was sipping a coffee in a café in Xishuangbanna, southwest China last January when I saw the following news item doing the rounds on the internet:
This was a paper by Caleb Everett and two of my colleagues/arch-nemeses, Damián Blasi and Seán Roberts. Their claim in the paper is that climate can affect languages, in particular that tonal languages are easier to speak in humid climates. Tonal languages such as Chinese use pitch to make distinctions between words, for example xióngmāo means ‘panda’ (熊猫) while xiōngmáo means ‘chest hair’ (胸毛). Tonal languages tend to be found in warm and humid places around the equator such as Asia, Africa and Mexico. I had heard other people joke about this peculiar geographical pattern before, so I found it both amusing and surprising that someone had proposed the following half-way plausible explanation.
When people breathe dry air, the larynx can become desiccated, leaving them with less precise control over phonation. Tonal languages such as Mandarin Chinese require precise control of pitch when speaking. The authors therefore predicted that tonal languages would be less likely to occur in dry climates, simply because speakers would tend to avoid precise phonation distinctions that require more effort.
This was news to me - I hadn’t noticed it being harder on my throat speaking Mandarin in Oxford than in China. However, this is apparently a well-substantiated effect in laryngology, and in some very dry places you do begin to notice that it is difficult to speak. I notice this when I’m in some dusty mountainous villages in Yunnan, sometimes gingerly testing whether I can still produce tonal tongue twisters accurately (ma1ma ma4 ma3 ma ‘is mother scolding the horse?’).
In line with this, tonal languages are generally found in humid places, although the correlation actually isn’t significant in a logistic regression (p=0.16), a fact that is bizarrely omitted from their paper. They instead support the claim with a random independent samples test, showing that random samples of tonal languages have a higher humidity in most cases than random samples of non-tonal languages. In addition, the number of tones that languages have does correlate positively with humidity (significantly this time) within several areas of the world such as Africa, Eurasia, the Pacific and, among indigenous languages, South America. Despite this statistical evidence and the physiologically supported mechanism, most linguists were dismissive of this finding - ‘correlation does not imply causation’.
This week, one year after the original paper, a new open-access journal, the Journal of Language Evolution (edited by Dan Dediu and Bart de Boer), published its first issue, which is devoted to a debate between Everett et al. and some invited contributors, followed by a further response by the authors.
I have a paper in this issue that criticises their paper and offers a counter-argument, namely that the distribution of tone is in fact no different from what you would expect if tone were randomly distributed. This is because languages are not evenly distributed around the world, with many more languages being found in humid places such as New Guinea or west Africa than in dry places such as Europe (r=0.31, P<0.001). I illustrated this with the map below, where each star is proportional in size to the number of languages within a 100 km radius of that point. In Europe there tend to be just one or two languages within a 100 km radius of any location (e.g. between Nijmegen and Amsterdam, only Dutch is spoken). In more humid places as many as 134 languages can be spoken over a similar distance, perhaps because of the relative isolation of communities in areas with rainforests, as illustrated by a photo of the point with the highest language density (the Trans-New-Guinea language Kandawo).
This raises a question: supposing that tone were in fact randomly distributed, would tone still be more likely to occur in humid places, simply because there are more languages there? I did the following simulation to test this, using the same data that the authors used (the World Phonotactics Database and humidity data supplied by Seán Roberts):
- Pick a small number of languages at random, say between six and ten.
- Then assign the nearest 100 languages tone, and assign every other language non-tone.
In this invented dataset, it turns out that tone correlates with humidity with between 40% and 80% probability depending on the parameters chosen.
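The two steps above can be sketched in Python. This is only an illustration on invented data, not the code or data used in the paper: the toy world, the humidity gradient, the function names, the neighbourhood size (30 rather than 100, because the toy world is small) and the rough significance cut-off are all assumptions made for the example.

```python
import math
import random


def make_toy_world(n_languages=600, seed=0):
    """Invented data: (x, y, humidity) per language, denser where it is humid."""
    rng = random.Random(seed)
    languages = []
    while len(languages) < n_languages:
        x, y = rng.random(), rng.random()
        humidity = x  # toy gradient: dry at x=0, humid at x=1
        # Keep a point more often when it is humid, so humid areas are crowded.
        if rng.random() < 0.1 + 0.9 * humidity:
            languages.append((x, y, humidity))
    return languages


def assign_random_tone(languages, n_seeds, n_neighbours, rng):
    """Pick seed languages at random; mark each seed's nearest neighbours tonal."""
    tonal = set()
    for s in rng.sample(range(len(languages)), n_seeds):
        sx, sy, _ = languages[s]
        by_distance = sorted(
            range(len(languages)),
            key=lambda i: math.hypot(languages[i][0] - sx, languages[i][1] - sy),
        )
        tonal.update(by_distance[:n_neighbours])  # includes the seed itself
    return tonal


def pearson_r(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def share_of_significant_runs(languages, runs=200, n_seeds=8, n_neighbours=30, seed=1):
    """Share of random-tone runs with a positive, nominally significant r."""
    rng = random.Random(seed)
    n = len(languages)
    # Rough large-sample two-sided 5% cut-off for r (an assumption of this sketch).
    r_crit = 1.96 / math.sqrt(1.96 ** 2 + n - 2)
    humidity = [h for _, _, h in languages]
    hits = 0
    for _ in range(runs):
        tonal = assign_random_tone(languages, n_seeds, n_neighbours, rng)
        tone = [1 if i in tonal else 0 for i in range(n)]
        if pearson_r(tone, humidity) > r_crit:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    print(share_of_significant_runs(make_toy_world()))
```

If languages were independent data points, a positive and nominally significant correlation should turn up in only about 2.5% of runs. Any share well above that in a sketch like this comes from the clustering of tone among neighbours, not from humidity itself.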
The next thing I did was repeat the test with a control for language family. This is because related languages are not independent data-points; if you included 100 dialects of English in your data, people could accuse you of deliberately inflating the correlation between tone and humidity because English does not have tone and it is spoken in a relatively dry place. In order to avoid counting related languages multiple times, the authors sample one language per known language family. Here is another map I made of the ranges of known language families of the world:
One language is sampled per family. After doing this, you can test the correlation between tone and humidity either with a regression analysis or with the test they actually used, which was simply to compare the humidity of the tonal languages with the non-tonal languages in random samples. In my simulated datasets, it turns out that their result is expected between 33% and 53% of the time. The probability that their result is due to chance, even after sampling one language per language family, is quite high.
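The sampling step can be sketched as follows. This is a hedged illustration rather than the authors' actual procedure: the record fields (`family`, `tonal`, `humidity`), the function names and the toy data are all invented, and the comparison used here (is the mean humidity of the tonal languages higher in this sample?) is a simplified stand-in for their test.

```python
import random
from collections import defaultdict


def sample_one_per_family(languages, rng):
    """Draw one language at random from each family."""
    by_family = defaultdict(list)
    for lang in languages:
        by_family[lang["family"]].append(lang)
    return [rng.choice(members) for members in by_family.values()]


def share_of_samples_tonal_more_humid(languages, n_samples=1000, seed=0):
    """Share of one-per-family samples where tonal languages are more humid on average."""
    rng = random.Random(seed)
    hits = used = 0
    for _ in range(n_samples):
        sample = sample_one_per_family(languages, rng)
        tonal = [l["humidity"] for l in sample if l["tonal"]]
        other = [l["humidity"] for l in sample if not l["tonal"]]
        if not tonal or not other:
            continue  # cannot compare if one group is empty in this sample
        used += 1
        if sum(tonal) / len(tonal) > sum(other) / len(other):
            hits += 1
    return hits / used if used else float("nan")


# Invented toy data: three families, humidity on an arbitrary 0-1 scale.
toy = [
    {"name": "A1", "family": "A", "tonal": True, "humidity": 0.9},
    {"name": "A2", "family": "A", "tonal": True, "humidity": 0.8},
    {"name": "B1", "family": "B", "tonal": False, "humidity": 0.3},
    {"name": "C1", "family": "C", "tonal": False, "humidity": 0.4},
    {"name": "C2", "family": "C", "tonal": False, "humidity": 0.2},
]

if __name__ == "__main__":
    print(share_of_samples_tonal_more_humid(toy, n_samples=100))
```

Running the same function on many simulated random-tone datasets, and counting how often it gives a share as high as the real data does, is the kind of comparison described above.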
I therefore think that the global correlation is meaningless, because it can be derived simply from two facts: that there are more languages in humid areas, and that tone tends to spread to neighbouring languages. More than most features of languages, tone is very contagious: many unrelated but neighbouring languages in Asia have tone, Vietnamese being a relatively recent example of a language known to have acquired tone from neighbouring Chinese (from which it also has a lot of loan-words).
I demonstrated this last point with an analysis of two large tonal language families, Sino-Tibetan and Niger-Congo. Using a crude form of phylogeography and ancestral state reconstruction (maximum likelihood reconstruction of number of tones and location using the ‘ape’ package in R and phylogenies from Glottolog), I showed that as languages migrated in the past, the way that they moved predicts the way that their number of tones changed over time. My map of Sino-Tibetan below shows this, with the descent of three languages plotted: the algorithm reconstructs the origin of Sino-Tibetan in western China, radiating out to northern India in one direction (e.g. the language Raute) and China in the other (e.g. Mandarin and Cantonese). The paths show the way that they spread out in one simulation, while the numbers show the number of tones that they had at each point (as reconstructed in one run of the algorithm).
It turns out that the main predictor of how languages change in their number of tones is what unrelated languages they are near. Sino-Tibetan languages gain more tones as they move towards Hmong-Mien languages (Pearson’s r=0.33, P<0.001), which have the most tones of languages in Asia, suggesting that southern Chinese languages in the Min and Yue branches have been influenced by highly tonal language families that were present in the region. Conversely, Sino-Tibetan languages lose tones when they are near unrelated languages with few or no tones, such as Indo-European languages (Pearson’s r=0.47, P < 0.001). Humidity turns out not to be a significant predictor of number of tones in Sino-Tibetan (Pearson’s r=0.06, P=0.48). (The test I used was to take every change in number of tones that occurred in the family and compare the number of tones it transitioned to with the location that this happened, a made-up test which has its own problems such as transition events not being independent of each other, but I did this because I don't know of an available phylogenetic correlation test for a continuous trait such as humidity).
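The transition test described in the parenthesis can be sketched roughly as below. This illustrates the idea and is not the analysis from the paper: the edge format, the coordinates and the function names are invented, and the sketch correlates the new number of tones with distance to the nearest contact language, so under this convention a negative r means more tones closer to the contact languages.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def tone_transitions(edges, contact_points):
    """Keep tree edges where the tone count changed.

    edges: (parent_tones, child_tones, child_lat, child_lon) per branch.
    Returns (new tone count, km to nearest contact language) per transition.
    """
    rows = []
    for parent_tones, child_tones, lat, lon in edges:
        if child_tones == parent_tones:
            continue  # no change in number of tones on this branch
        nearest = min(haversine_km(lat, lon, clat, clon) for clat, clon in contact_points)
        rows.append((child_tones, nearest))
    return rows


def pearson_r(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


# Invented example: one contact language at (0, 0) and four reconstructed branches.
contact_points = [(0.0, 0.0)]
edges = [(3, 3, 0.0, 10.0), (3, 5, 0.0, 1.0), (3, 4, 0.0, 5.0), (4, 2, 0.0, 20.0)]

if __name__ == "__main__":
    rows = tone_transitions(edges, contact_points)
    print(pearson_r([t for t, _ in rows], [d for _, d in rows]))
```

As noted above, transitions on the same tree are not independent of each other, so any p-value attached to a correlation like this should be treated with caution.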
This suggests that Indo-European speakers have been simplifying the number of tones in Sino-Tibetan languages. This reminds me of the attitude of learners of Chinese towards tones - I initially taught myself to speak Chinese tonelessly, not believing that pitch could matter for meaning. Other learners of Mandarin find particular tones difficult to pronounce, such as the falling tone. Even quite fluent, intelligent and highly motivated speakers of Mandarin can be extremely bad at producing tones, as Mark Zuckerberg's recent speeches show (as also noted by Mark Liberman).
Second-language learners can often find tone difficult, and when the proportion of second-language learners in a population is high enough, this can cause simplification, such as Indo-European speakers simplifying Sino-Tibetan languages near India; this effect is replicated in Africa, where Niger-Congo languages lose tone as they spread near non-tonal Afro-Asiatic languages (Pearson’s r=0.23, P < 0.001).
Winter and Wedel’s paper in the issue made the interesting point that number of tones seems to be negatively correlated with number of second language learners that a language has, in support of my analysis.
The most impressive finding of Everett et al.’s paper is that tone correlates with humidity in three continents in a logistic regression. How likely is this to be due to chance? It turns out that it’s between 5 and 20% depending on the parameters of the simulation (see the paper for details). This is not high, but is higher than one might expect. Scientific papers are conventionally supposed to show that there is a less than 5% chance of their findings if the null hypothesis (that there is no relation between tone and humidity) were true.
Everett et al. then had the opportunity to respond to our papers, here. They wrote a paragraph responding to my paper, which I reproduce here:
‘Collins quite convincingly shows that the kind of correlation we observed might derive from historical spread and contact in many individual cases. First, we should stress that we have never claimed that such factors are not also at play, and we suspect they may be at play in interrelated ways to desiccation. But Collins’s analysis also demonstrates that it is quite unlikely that humidity and desiccation will associate in different macroregions, as we have observed. His logistic regressions suggest that the pattern holds independently in Eurasia, Africa, and North America. In Everett et al. (2015) we observed that it also held in South America, and in fact the South American distribution is consistent given that languages with complex tone there occur in Amazonia. If the question is whether the pattern is simply consistent in these four macroregions, the answer is yes. A coincidental consistency in four macroregions seems not just ‘unlikely’ but ‘extremely unlikely’, and a contact-only approach does not explain it. Also, it should be stressed that these are the only four macroregions for which our account makes any predictions, given that they have inhabitable desiccated regions. Furthermore, Collins’s commentary does not address one of the key points in our original study—the observed association holds across geographically distant isolates. Finally, Collins mentions the movement of Niger-Congo languages through Africa, losing tone as they come into contact with dry regions, but also nontonal languages. One open question is why the languages in these regions were dry to begin with. His data are fascinating, we think, and may be consistent, at least in some cases, with the proposed ‘borrowing’ mechanism we discussed in our target piece. This too requires further exploration.’
My response to their response would be:
- They say that my simulations explain how the correlation can arise ‘in individual cases’, but seem to ignore my main claim - that the global correlation between tone and humidity is actually expected, even after using their test controlling for language family. I think that the main finding of their paper is therefore a spurious result.
- In a logistic regression the presence of complex tone does not correlate with humidity in South America; they reply that the number of tones correlates with humidity (in their Supplementary Materials). This is quite a different prediction, and would require a different set of simulations. One interesting problem is that a correlation between number of tones and humidity might arise because transition events are not independent: a language is more likely to end up having nine tones if it is in a branch of languages which have eight tones, than if it was in a branch of languages with just three tones. A simulation would need to be done where the way that languages transition in numbers of tones would be taken into account.
- If you stick with the results I was analysing, then a correlation within three regions is ‘unlikely’ (5-20%) but not ‘extremely unlikely’, and in fact does not reach conventional significance (p>0.05).
- As Harald Hammarström pointed out, in these correlations within regions, there is no control for language relatedness. Although my particular randomisation test does not predict a correlation within three regions to be very likely, another type of randomisation test that simulates the evolution of tone within known families may end up getting a higher probability of this finding.
- They say that I did not address their point that the correlation holds across geographically distant isolates - I should do this, although it is unclear to me why the correlation holding in isolates matters in particular, as opposed to the correlation holding in one language sampled per family.
- They say it is an open question why in Africa non-tonal families are found in the drier regions. To me this is not an ‘open question’, as I showed that there is about a 47% chance of this correlation holding within at least two regions.
They also say that my account in terms of language contact may be compatible with their account in terms of humidity, pointing out that a situation of second-language learning may exacerbate the problems encountered speaking a tonal language in a dry environment.
This is an open question, and would need proper naturalistic experiments to show whether people do in fact find tone more difficult in a dry environment - for example, whether a learner of Mandarin in Oxford would find tone more difficult than a learner in, say, California.
Bart de Boer’s paper in the issue cast doubt on whether the effect of dry air on the larynx really is that strong, while other papers, such as Carlos Gussenhoven's, argued that tonal languages are not necessarily different from non-tonal languages in the degree of precision in phonation that they need.
Harald Hammarström’s paper was the harshest piece of academic writing that I have ever seen. His piece ends with the sentence, ‘As I argue, it is the (empirically false) trade-off hypothesis that follows from the theoretical background and the ingenuity of the other strategies, along with a lack of concern for multiple testing, reflects poorly on the authors, reviewers, and editors who saw it through.’ He criticised the statistical tests they employed, rightly pointing out that they fail to mention the non-significance of a regression analysis of tone and humidity, and also that the authors did not control for area and language relatedness simultaneously, either controlling for area or for family but not both. These are his strongest points, but the beginning of his paper has a (to me) bizarre philosophical ramble:
‘The authors boldly claim that ‘humans adapt to their ambient conditions at every observed level’ (EBRPP: 1) and follow the theoretical perspective with a number of examples of ecological adaptation from human and animal studies (EBRPP: 2–4). But somewhere here the logical jump from ‘some’ to ‘all’ was lost on the authors. If ecological adaptation can be found on some level it does not follow that sound systems of human languages belong to an adaptive level, or that all other levels are (or should be a priori expected to be) adaptive. To be fair, formulations later are weakened to ‘nearly every observed stratum’ (EBRPP: 1) and ‘nearly every inspected level’ (EBRPP: 2), but even that seems too strong. Should the water divider for a priori categorization not be whether the climatic differences can be argued to have a discernible impact on the human cultural behaviour in question? That is, the a priori question need not be determined by a general rule that stipulates everything to (not?) adapt to climate, but by reasoning about the potential discernible impact given the theoretical specification of climatic differences and nature of some cultural behaviour. Had the authors approached ecological adaptivity in human language with this proviso in mind, they might have been more successful in actually finding it, such as with whistled languages (Meyer 2005: 29–86) or signed hunting registers (Divale and Zipin 1977).’
Hammarström then jumps into the statistical analysis, apparently ignoring the point that the authors are making a specific and physiologically supported claim, not reasoning from an a priori statement that all or nearly all aspects of language should be ecologically adaptive.
I think that the main challenge is in showing that speakers really are sensitive to humidity when they use tone, and this needs to be substantiated with experiments on Chinese speakers, or learners of Chinese, in different environments. The evidence that dry air affects precision of phonation in the larynx is enough for me to justify doing experiments of this type, regardless of whether it can also explain the distribution of tonal languages around the world (which I think can be explained by other means).
I disagree with Harald Hammarström’s opinion that Everett et al.’s paper reflects poorly on the authors. And contrary to the jocular title I chose for this blogpost, I have come round to thinking that this type of experiment is worth doing, and exciting if it gets positive results. I’m still hedging my bets.