01 November 2011
Of questionable character
It is almost election time here in New Zealand. This year we have been spared much of the usual hoo-ha because the recent (and spectacularly successful) Rugby World Cup has dominated the news, as did the Canterbury earthquakes before that. One of the tasks that we will have on Election Day is to find out what New Zealanders think of our voting system. We moved away from a winner-takes-all 'first past the post' system 15 years ago, but there is a move to bring it back. The problem with 'first past the post' is that you can become the government despite winning less than half the overall vote, and once you are in you can act as if you won 100%. Other systems encourage representation in proportion to the number of votes you won. What has this to do with ecology? Well, it turns out that there are similar issues when it comes to analysing large molecular data sets. Such data sets can tell us about the evolutionary history of a group of species: who is related to whom and when their last common ancestor lived. The problem is that most of the methods for finding these patterns tend towards the first past the post ideal. The evolutionary tree is built from the strongest signal and other signals are ignored.
Rob Cruickshank has explored the issue of character conflict within molecular data sets in a recent issue of Zootaxa. Ideally, your data set would contain one signal, that of the evolutionary history of a group. Unfortunately, there are a number of factors that can introduce other signals, such as convergence, parallel evolution, human error, high rates of change and so on. So within your data set there are usually competing signals, much as within society there are competing political parties. Most analyses simply find the signal with the most votes and proclaim it the winner. However, Rob points out that there are several ways to find weaker signals for further analysis; after all, one of these might be the true answer to how the species are related. For example, the fantastically named spectral analysis looks at each signal in the data and shows how much support it has (how many characters agree with the signal) and how much conflict it has (how many characters disagree with the signal). Sometimes the signal with the greatest support has a lot of conflict whereas the next largest signal has none. Given that we might expect the correct phylogenetic signal to have little conflict, this might encourage us to look further than the loudest signal. This would be like being given two votes: one for the person/party that you wanted to support and one for the person/party that you especially didn't want. If the leading candidate is also the one with the largest conflict, then maybe they are not as good for consensus politics as the next candidate with a much lower conflict score. Certainly, in the world of species relationships determined by molecular characters, this might be something worth considering.
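For the curious, the support and conflict idea can be sketched in a few lines of Python. This is only a toy illustration and not the spectral analysis method itself, nor anything taken from Rob's paper: the five taxa, the made-up two-state characters and the function names are all assumptions for the example. Each character splits the taxa into two groups; a split's support is counted here as the number of characters showing it, and its conflict as the number of characters showing a split that cannot sit on the same tree.

```python
from collections import Counter

def split_of(character, taxa):
    """Turn a two-state character into a split: the set of taxa whose
    state differs from that of the first taxon."""
    ref = character[0]
    return frozenset(t for t, state in zip(taxa, character) if state != ref)

def compatible(a, b, all_taxa):
    """Two splits can share a tree if at least one of the four
    intersections of their sides is empty."""
    a_rest, b_rest = all_taxa - a, all_taxa - b
    return (not (a & b) or not (a & b_rest)
            or not (a_rest & b) or not (a_rest & b_rest))

def support_and_conflict(characters, taxa):
    """Return {split: (support, conflict)} for every informative split."""
    all_taxa = frozenset(taxa)
    votes = Counter()
    for character in characters:
        split = split_of(character, taxa)
        # Keep only splits with at least two taxa on each side.
        if 2 <= len(split) <= len(taxa) - 2:
            votes[split] += 1
    return {
        split: (n, sum(m for other, m in votes.items()
                       if not compatible(split, other, all_taxa)))
        for split, n in votes.items()
    }

# Hypothetical data: five taxa and eleven invented characters.
taxa = list("ABCDE")
characters = (["00111"] * 4 + ["11000"]   # five votes for AB | CDE
              + ["00011"] * 4             # four votes for ABC | DE
              + ["01011"] * 2)            # two votes for AC | BDE

result = support_and_conflict(characters, taxa)
for split, (support, conflict) in sorted(result.items(), key=lambda kv: -kv[1][0]):
    print(sorted(split), "support:", support, "conflict:", conflict)
```

In this invented data set the loudest signal (AB | CDE) wins five votes but has two against it, while the runner-up (ABC | DE) has four votes and no conflict at all, which is exactly the situation described above.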