Search this keyword

Sometimes the mess taxonomy creates drives me nuts

Playing with some sequence data I found numerous Plasmodium sequences from the following paper:

Werner, E. B. ., Taylor, W. R., & Holder, A. A. (1998). A Plasmodium chabaudi protein contains a repetitive region with a predicted spectrin-like structure1Note: Nucleotide sequence data reported in this paper are available in the EMBL, GenBank™ and DDJB databases under the accession number U43145.1. Molecular and Biochemical Parasitology, 94(2), 185–196. doi:10.1016/S0166-6851(98)00067-X

These sequences (e.g., U43145) give the host as Thamnomys rutilans. You'd think it would be fairly easy to learn more about this animal, given that it hosts a relative of the cause of malaria in humans, and indeed there are a number of biomedical papers that come up in Google, e.g.:

Landau, I., & Chabaud, A. (1994). Advances in Parasitology (Vol. 33, pp. 49–90). Elsevier BV. doi:10.1016/S0065-308X(08)60411-X
Killick-Kendrick, R. (1968). Malaria parasites of Thamnomys rutilans (Rodentia, Muridae) in Nigeria. Bull World Health Organ. 1968; 38(5): 822–824. PMC2554675

Google also tells me that Thamnomys rutilans is an African rodent (e.g., 6.1.6. Rodent malaria, but NCBI has no sequences for "Thamnomys rutilans", and GBIF has no data on its distribution. If I search Mammal Species of the World I get (literally) "nothing found ...".

So, this is an African rodent, host to Plasmodium, and we know nothing about it? A bit of Googling, a trip to Wikipedia and Google Books reveals that Thamnomys rutilans is a synonym of Grammomys rutilans, but it is now called Grammomys poensis because the original name (Mus rutilans Peters 1876) is a junior synonym homonym of Mus rutilans Olfers, 1818 (simples). You can see the original description of Mus rutilans Peters 1876 in BioStor http://biostor.org/reference/105261 (this took some tracking down, but that's another story):



The original description of Mus rutilans Olfers, 1818 is given by The description of a new species of South American hocicudo, or long-nose mouse, genus Oxymycterus (Sigmodontinae, Muroidea), with a critical review of the generic content as:

Olfers, I. 1818. Bemerkungen zu Illiger's Ueberblick der Saugethiere nach ihrer Betheilung über die Welttheile rüchsichtlich der Südamerikanischen Arten (Species). In Eschwege, W. L., ed., Journal von Brasilien, Weimar, 15(2): 192-237.

This reference doesn't seem to be online.

The upshot of all this information about the host of Plasmodium chabaudi is hidden behind taxonomic name changes, and databases that one might expect to help simply don't. If names are the glue that link biodiversity data together then we need to get a lot better at making basic information about name changes accessible, otherwise we are creating islands of disconnected data.

Dimly lit taxa - guest post by Bob Mesibov

The following is a first for iPhylo, a guest post by Bob Mesibov. Bob

Rod Page introduced 'dark taxa' here on iPhylo in April 2011. He wrote:

The bulk of newly added taxa in GenBank are what we might term "dark taxa", that is, taxa that aren't identified to a known species. This doesn't necessarily mean that they are species new to science, we may already have encountered these species before, they may be sitting in museum collections, and have descriptions already published. We simply don't know. As the output from DNA barcoding grows, the number of dark taxa will only increase, and macroscopic biology starts to look a lot like microbiology.

Rod suggested that 'quite a lot' of biology can be done without taxonomic names. For the dark taxa in GenBank, that might well mean doing biology without organisms – a surprising thought if you're a whole-organism biologist.

Non-taxonomists may be surprised to learn that a lot of taxonomy is also done, in fact, without taxonomic names. Not only is there a 'dark taxa' gap between putative species identified genetically and Linnaean species described by specialists, there's a 'dimly lit taxa' gap between the diversity taxonomists have already discovered, and the diversity they've named.

Dimly lit taxa range from genera and species given code names by a specialist or a group of collaborators, and listed by those codes in publications and databases, to potential type specimens once seen and long remembered by a specialist who plans to work them up in future, time and workload permitting.

In that phrase 'time and workload permitting' is a large part of the explanation for dimly lit taxa. Over the past month I created 71 species of this kind myself. Each has been code-named, diagnostically imaged, databased and placed in code-labelled bottles on museum shelves. The relevant museums have been given digital copies of the images and data.

The 71 are 'species-in-waiting'. They aren't formally named and described, but specialists like myself can refer to the images and data for identifying new specimens, building morphological and biogeographical hypotheses, and widening awareness of diversity in the group to which the 71 belong.

'Time and workload permitting'. Many of the 71 are poor-quality or fragmented museum specimens from which important morphological data, let alone sequences, cannot be obtained. Fresh specimens are needed, and fieldwork is neither quick nor easy. In my special corner of zoology, as in most such corners in zoology and botany, the widespread and abundant species are all, or nearly all, named. The unnamed rump consists of a huge diversity of geographically restricted and uncommon species. There are more than 71 in that group of mine; those are just the rare species I know about, so far.

'Time and workload permitting'. A non-taxonomist might ask, 'Why don't you just name and describe the 71 briefly, so that the names are at least available, and the gap between what's known and what's named is narrowed?' The answer is simple: inadequate descriptions are the bane of taxonomy. There are hundreds of species in my special group that were named and inadequately described long ago, and which wind up on checklists of names as 'nomen dubium' and 'incertae sedis'. Clearing up the mysteries means locating the types (which hopefully still exist) and studying them. That slow and tedious study would better have been done by the first describer.

Cybertaxonomic tools can help bring dimly lit taxa into full light, but not much. The rate-limiting steps in lighting up taxa are in the minds and lives of human taxonomists coping with the huge and bewilderingly complex diversity of life. It's not the tools used after the observing and thinking is done, it's the observing and thinking.

In their article 'Ramping up biodiversity discovery via online quantum contributions' (http://dx.doi.org/10.1016/j.tree.2011.10.010), Maddison et al. argue that the pace of naming and description can be increased if information about what I've called dimly lit taxa is publicly posted, piece by piece, 'publish as you go', on the Internet. In my case, I would upload images and data for my 71 'species-in-waiting' to suitable sites and make them freely available.

Excited by these discoveries, amateurs and professionals would rush to search for fresh specimens. Specialists would drop whatever else they were doing, borrow the existing specimens of the 71 from their repositories and do careful inventories of the morphological features I haven't documented. Aroused from their humdrum phylogenetic analyses of other organisms, molecular phylogeny labs would apply for extra funding to work on my 71 dimly lit taxa. In no time at all, a proud team of amateurs and specialists would be publishing the results of their collaboration, with 71 names and descriptions.

Shortly afterwards, flocks of pigs would slowly circle the 71 type localities, flapping their wings in unison.

Memo to Maddison et al. and other would-be reformers: the rate of taxonomic discovery and documentation is very largely constrained by the supply of taxonomists. You want more names, find more namers.

Citations, Social Media & Science

Quick note that Morgan Jackson (@BioInFocus) has written nice blog post Citations, Social Media & Science inspired by the fact that the following paper:

Kwong, S., Srivathsan, A., & Meier, R. (2012). An update on DNA barcoding: low species coverage and numerous unidentified sequences. Cladistics, no–no. doi:10.1111/j.1096-0031.2012.00408.x

cites my "Dark taxa" in the body of the text but not in the list of literature cited. This prompted some discussion of DOIs and blog posts on Twitter:



Read Morgan's post for more on this topic. While I personally would prefer to see my blog posts properly cited in papers like doi:10.1111/j.1096-0031.2012.00408.x, I suspect the authors did what they could given current conventions (blogs lack DOIs, are treated differently from papers, and many publishers cite URLs in the text, not the list of references cited). If we can provide DOIs (ideally from CrossRef so we become part of the regular citation network), suitable archiving, and — most importantly — content that people consider worthy of citation then perhaps this practice will change.