Search this keyword

Figshare and F1000 integrate data into publication: could TreeBASE do the same?

Spiralsticker reasonably smallQuick thoughts on the recent announcement by figshare and F1000 about the new journals being launched on the F1000 Research site. The articles being published have data sets embedded as figshare widgets in the body of the text, instead of being, say, a static table. For example, the article:

Oliver, G. (2012). Considerations for clinical read alignment and mutational profiling using next-generation sequencing. F1000 Research. doi:10.3410/f1000research.1-2.v1
has a widget that looks like this:

Widget
You can interact with this widget to view the data. Because the data are in figshare those data are independently citable, e.g. the dataset "Simulated Illumina BRCA1 reads in FASTQ format" has a DOI http://dx.doi.org/10.6084/m9.figshare.92338.

Now, wouldn't it be cool if TreeBASE did something similar? Imagine if uploading trees to TreeBASE were easy, and that you didn't have to have published yet, you just wanted to store the trees and make them citable. Imagine if TreeBASE had a nice tree viewer (no, not a Java applet, a nice viewer that uses SVG, for exmaple). Imagine if you could embed that tree viewer as a widget when you published your results. It's a win all round. People have an incentive to upload trees (nice viewer, place to store them, and others can cite the trees because they'd have DOIs). TreeBASE builds its database a lot more quickly (make it dead easy to upload tree), and then as more publishers adopt this style of publishing TreeBASE is well placed to provide nice visualisations of phylogenies pre-packaged, interactive, and citable. And let's not stop there, how about a nice alignment viewer? Perhaps this is the something currently rather moribund PLoS Currents Tree of Life could think about supporting?

Building a BHL Africa: BHL in a box

Was going to post this as a comment on the BHL blog but they use Blogger's native comment system, which is horrible, and it refused to accept my comment (yes, yes, I'm sure it did that on grounds of taste). I read the recent post Building a BHL Africa and couldn't believe my eyes when I read the following:

the "BHL in a Box" concept was highly desired. This would entail creating interactive CDs of BHL content for distribution in areas where internet access is unreliable or unavailable.
CDs! Really? Surely this is crazy!?. You want to use an obsolete technology that require additional obsolete technology to ship BHL around Africa? Why not ship relevant parts of BHL on iPads? Lots more storage space than CDs, built-in interactivity (obviously need to write an app, but could use HTML + Javascript as a starting point), long battery life, portable, comes with 3G support if needed. I'll be the first to admit that my knowledge of Africa is about zero, but given that mobile devices are common, mobile networks are fairly well developed, and tablets are making inroads (see iPad has become a big factor in African business) surely "BHL mobile" is the way to go to provide "BHL in a box", not CDs.

Why not develop an app that stores BHL content on a device like an iPad, then distribute those? Support updating the content over the network so the user isn't stuck with content they no longer need. In effect, something like Amazon's Kindle app or iBooks would do the trick. You'd need to compress BHL content to keep the size down (the images BHL currently displays on its web site could be made a lot smaller) but this is doable. Indeed, the BHL Africa could be an ideal motivation to move BHL to platforms such as phones and tablets, where at the moment users have to struggle with a website that makes no concessions to those devices.

Postscript
Of course, it doesn't have to be the iPad as such. Imagine if BHL published books and articles on Amazon, then used Kindle to deliver content physically (i.e., ship Kindles), and anyone else could access it directly from Amazon using their Kindle (or Kindle app on iPad).

Sometimes the mess taxonomy creates drives me nuts

Playing with some sequence data I found numerous Plasmodium sequences from the following paper:

Werner, E. B. ., Taylor, W. R., & Holder, A. A. (1998). A Plasmodium chabaudi protein contains a repetitive region with a predicted spectrin-like structure1Note: Nucleotide sequence data reported in this paper are available in the EMBL, GenBank™ and DDJB databases under the accession number U43145.1. Molecular and Biochemical Parasitology, 94(2), 185–196. doi:10.1016/S0166-6851(98)00067-X

These sequences (e.g., U43145) give the host as Thamnomys rutilans. You'd think it would be fairly easy to learn more about this animal, given that it hosts a relative of the cause of malaria in humans, and indeed there are a number of biomedical papers that come up in Google, e.g.:

Landau, I., & Chabaud, A. (1994). Advances in Parasitology (Vol. 33, pp. 49–90). Elsevier BV. doi:10.1016/S0065-308X(08)60411-X
Killick-Kendrick, R. (1968). Malaria parasites of Thamnomys rutilans (Rodentia, Muridae) in Nigeria. Bull World Health Organ. 1968; 38(5): 822–824. PMC2554675

Google also tells me that Thamnomys rutilans is an African rodent (e.g., 6.1.6. Rodent malaria, but NCBI has no sequences for "Thamnomys rutilans", and GBIF has no data on its distribution. If I search Mammal Species of the World I get (literally) "nothing found ...".

So, this is an African rodent, host to Plasmodium, and we know nothing about it? A bit of Googling, a trip to Wikipedia and Google Books reveals that Thamnomys rutilans is a synonym of Grammomys rutilans, but it is now called Grammomys poensis because the original name (Mus rutilans Peters 1876) is a junior synonym homonym of Mus rutilans Olfers, 1818 (simples). You can see the original description of Mus rutilans Peters 1876 in BioStor http://biostor.org/reference/105261 (this took some tracking down, but that's another story):



The original description of Mus rutilans Olfers, 1818 is given by The description of a new species of South American hocicudo, or long-nose mouse, genus Oxymycterus (Sigmodontinae, Muroidea), with a critical review of the generic content as:

Olfers, I. 1818. Bemerkungen zu Illiger's Ueberblick der Saugethiere nach ihrer Betheilung über die Welttheile rüchsichtlich der Südamerikanischen Arten (Species). In Eschwege, W. L., ed., Journal von Brasilien, Weimar, 15(2): 192-237.

This reference doesn't seem to be online.

The upshot of all this information about the host of Plasmodium chabaudi is hidden behind taxonomic name changes, and databases that one might expect to help simply don't. If names are the glue that link biodiversity data together then we need to get a lot better at making basic information about name changes accessible, otherwise we are creating islands of disconnected data.