17 January 2012

Along came a SPIDER

As Rowan Atkinson said, "There is a certain level of uncertainty about, of which we can be quite .... quite sure". The problem with living in the information age is trying to deal with all of the information as well as the uncertainty. We need good filters to remove the noise and retain the signal. This is a real problem in biology where, with the advent of technology to collect DNA, there is often far more data than we can sensibly deal with. Another problem is that, although there are often useful tools for making sense of the data, they are usually scattered around the internet and can be difficult to find. A group of Lincoln molecular ecologists, including Rob Cruickshank and Stephane Boyer, have put time and effort into creating a tool to help identify species and to understand speciation. The package is called SPIDER, and the details of what it can do are found in a paper in Molecular Ecology Resources.

The SPIDER package (SPecies IDentity and Evolution in R) uses the statistics package R (which is free to use) to provide tools that help researchers handle barcoding data. Genetic barcodes are regions of DNA (often the CO1 gene) that are unique to each species. Taking DNA from an unknown specimen, examining its barcode gene region and comparing it with a DNA library allows that specimen to be positively identified. The SPIDER package provides summary measures of genetic distances between samples, assessments of variation, and tests of how accurate each match is. SPIDER also implements a sliding window analysis that explores signal conflict within the gene region (not all parts of the data say the same thing) and allows the uncertainty to be shown in the evolutionary tree obtained from the data.
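To make the ideas above a little more concrete, here is a minimal sketch in Python of what a genetic distance, a library match and a sliding window boil down to. It is an illustration only, not SPIDER's own code or interface (SPIDER is an R package): the function names, the two "species" and their short sequences are all made up for the example, and the distance used is the simple proportion of differing sites between aligned sequences.

```python
def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_match(query, library):
    """Return (name, distance) for the library sequence closest to the query."""
    scored = ((name, p_distance(query, seq)) for name, seq in library.items())
    return min(scored, key=lambda item: item[1])


def sliding_window(a, b, width, step=1):
    """Distance between two sequences in successive windows along the gene."""
    return [p_distance(a[i:i + width], b[i:i + width])
            for i in range(0, len(a) - width + 1, step)]


# Hypothetical reference library and unknown specimen (invented sequences).
library = {
    "species_A": "ACGTACGTACGT",
    "species_B": "ACGTTCGTACCT",
}
query = "ACGTACGTACGA"

best = nearest_match(query, library)
windows = sliding_window(library["species_A"], library["species_B"],
                         width=4, step=4)

print(best)     # closest library entry and its distance to the query
print(windows)  # one distance per window along the region
```

Here the unknown specimen sits closest to species_A, and the window-by-window distances show that the two reference sequences are identical over the first stretch of the region but differ further along. That is the kind of uneven signal a sliding window analysis is meant to reveal.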

The package is in constant development and there are tutorials and a manual available from the SPIDER webpage. If you are in the business of DNA barcodes then this will be a useful tool to add to your work. Of that we can be quite sure.
