Accounting Careers

EOL iPad web app using jQueryMobile

As part of a course on "phyloinformatics" that I'm about to teach, I've been making some visualisations of classifications. Here's one I've put together using jQuery Mobile and the Encyclopedia of Life API. It's pretty limited, but it is a simple way to explore EOL using three different classifications. You can view this live at http://iphylo.org/~rpage/phyloinformatics/eoliphone/ (looks best on an iPad or iPhone). Once I've tidied it up I'll put the code online. In the meantime, here's a quick demo:

Yet another reason why we need specimen identifiers, now!

This message appeared on the TAXACOM mailing list:

It is getting more and more necessary for taxonomists to demonstrate
that they are useful and used. This does not only apply to the
individual scientists, but also to institutions with taxonomic
collections, such as museums and herbaria.

In an attempt to live up to that increasing demand for documentation,
the leadership of the Natural History Museum of Denmark has issued an
order to its curatorial staff - The staff members are requested to
document which publications from 2011, written entirely by external
scientists, that in one way or another are based on material in the
collections of the Museum.

Given that most specimens lack resolvable digital identifiers (a theme I've harped on about before, most recently in the context of DNA barcoding), answering this kind of query ends up being a case of searching publications for text strings that contain the acronym of the collection. The sender of the message, Ib Friis, is alarmed at this prospect:

In publications, material from our herbarium at "C" is normally referred
to in text strings of one of the following forms: "(C)", "(C, ", ", C,"
or " C)". But a search in for example Google Scholar or other search
engines result in overflow of thousands and thousands of hits, even
when these text strings are combined with other relevant words such as
"botany", "plants", etc.
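To see why such searches overflow, here is a minimal sketch that turns the four text forms quoted above into a regular expression. The pattern name `HERBARIUM_C` and the sample strings are my own illustrations, not real search results; they only show that nothing in the pattern is specific to a herbarium.

```python
import re

# The four forms quoted in the message: "(C)", "(C, ", ", C," and " C)".
HERBARIUM_C = re.compile(r"\(C\)|\(C, |, C,| C\)")

# Invented example strings (assumptions for illustration only).
snippets = [
    "Copyright (C) 2011",          # copyright notice
    "vitamins A, B, C, and D",     # an ordinary list
    "stored cold (at 4 C)",        # a temperature
]

for text in snippets:
    hit = HERBARIUM_C.search(text)
    print(f"{text!r}: {'matches' if hit else 'no match'}")
```

Every one of these strings matches, which is the problem: the acronym alone carries no information that would separate a specimen citation from any other use of the letter.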

In an earlier paper "Biodiversity informatics: the challenge of linking data and the role of shared identifiers" (http://dx.doi.org/10.1093/bib/bbn022) (free preprint available here: hdl:10101/npre.2008.1760.1) I argued that having resolvable identifiers for specimens could enable measures of "citation" to be computed for specimens (and data derived from those specimens). Just as we have citation counts for articles and impact factors for journals, we could have equivalent measures for specimens and collections. These measures may keep administrators happy, but for scientists I think the real benefits will be the ability to trace the provenance of data, and to follow the fate of data they themselves have collected or published.

For things such as publications it is trivial to track their usage. For example, to find the number of times the article "Biodiversity informatics: the challenge of linking data and the role of shared identifiers" has been cited, I simply enter the DOI into Google Scholar, e.g. http://scholar.google.co.uk/scholar?q=10.1093/bib/bbn022. Imagine being able to do the same for specimens.
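As a minimal sketch of the DOI lookup described above, the snippet below builds the Google Scholar search URL for a DOI. The helper name `scholar_url_for_doi` is my own, and the URL pattern is simply the one quoted in the text; the code only constructs the link and does not fetch or parse any results.

```python
from urllib.parse import urlencode

# Base URL taken from the example in the text.
SCHOLAR_SEARCH = "http://scholar.google.co.uk/scholar"


def scholar_url_for_doi(doi: str) -> str:
    """Return a Google Scholar search URL for a DOI (hypothetical helper)."""
    # safe="/" leaves the slash in the DOI unescaped, as in the URL above.
    return SCHOLAR_SEARCH + "?" + urlencode({"q": doi.strip()}, safe="/")


print(scholar_url_for_doi("10.1093/bib/bbn022"))
```

A resolvable specimen identifier would make exactly the same one-line query possible for a specimen.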

For this to happen, museum specimens need digital identifiers. If museums are serious about quantifying the impact of their collections, they should make assigning digital identifiers a priority.

Mendeley as CiteBank: some ideas

Here are some quick notes on how BHL could use Mendeley as a "CiteBank".

As a repository of bibliographic data

If the goal is to assemble a "bibliography of life" then there are various ways this could be done.

Taxon-specific bibliographies

Create groups that are taxon-specific (or find existing groups in Mendeley). For example, I've created groups for amphibians and reptiles based on the Amphibian Species of the World and the TIGR/JCVI Reptile Database, respectively. Taxon-specific groups are probably going to be attractive to users, but the quality of bibliographic metadata can be variable. However, a bibliography for a specific taxonomic group that is populated with links to BHL content would be very useful.

Journal-specific bibliographies

This is where I've spent most of my efforts. I've created around 300 groups for various journals (see list below, or go directly to http://dl.dropbox.com/u/639486/groups.html). In some cases I've managed to populate these with the complete set of articles published in that journal, typically harvested from the journal's own web site. Typically the metadata from journal sites is high quality, although one has to be wary of Orwellian metadata.

I use these groups in two ways. The first is as a source of metadata for extracting articles from BHL using BioStor. If you have article-level metadata, finding articles in BHL becomes easier, and can be automated so that thousands can be added in a few minutes.

The second is for the taxon-literature mapping project, where one strategy is to use approximate string matching to find equivalent citations in Mendeley and the ION database. Ultimately I'd like to link to the Mendeley citations, as they tend to be higher quality than those in the original ION database.
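To give a flavour of the approximate matching idea, here is a minimal sketch using Python's standard `difflib`. It is not the code used in the mapping project: the function names, the normalisation step and the 0.8 threshold are all assumptions chosen for illustration, and the sample strings are just the title of my own paper with deliberate typos.

```python
from difflib import SequenceMatcher


def normalise(citation: str) -> str:
    """Lower-case and replace punctuation with spaces, then collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in citation.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1 for two normalised citation strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(query: str, candidates: list[str], threshold: float = 0.8):
    """Return (candidate, score) for the closest candidate, or None if below threshold."""
    if not candidates:
        return None
    scored = [(c, similarity(query, c)) for c in candidates]
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


# Illustrative strings only: a clean title, an unrelated title, and a typo-ridden query.
candidates = [
    "Biodiversity informatics: the challenge of linking data and the role of shared identifiers",
    "Mendeley as CiteBank: some ideas",
]
query = "Biodiversity informatics - the challange of linking data and the role of shared identifers"

print(best_match(query, candidates))
```

In practice the threshold would need tuning against real citations, and comparing every pair of strings becomes slow for large databases, so some form of pre-filtering (for example by year or journal) would be needed.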

BHL could create Mendeley groups for journals it has scanned, and populate those.

As an article-level index to BHL

Perhaps the most direct way BHL could use Mendeley is as follows:

Create a BHL account.
For each BHL title create a Mendeley group (the name would be the BHL TitleID).
For each item in that title create a folder in the corresponding group (the folder name would be the ItemID).
Within each folder list the articles, book chapters or other component parts. If these aren't available yet, encourage people to add them. Some of these could be pre-populated with content from BioStor.
Harvest the contents of these groups to provide an article-level index to BHL (the lack of which is, for me, the single biggest impediment to using BHL). Previously I've suggested a way to easily add article data to BHL; Mendeley title/item groups and folders might be a way to facilitate this process.
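The title/item/article layout in the steps above can be sketched as a small data structure. This is only a model of the idea, not Mendeley's API: the class names, the `harvest` function, and the identifiers "T1", "I1" and "I2" are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    start_page: int
    end_page: int


@dataclass
class ItemFolder:
    item_id: str  # BHL ItemID, used as the folder name
    articles: list[Article] = field(default_factory=list)


@dataclass
class TitleGroup:
    title_id: str  # BHL TitleID, used as the group name
    folders: dict[str, ItemFolder] = field(default_factory=dict)

    def add_article(self, item_id: str, article: Article) -> None:
        """File an article under the folder for its item, creating the folder if needed."""
        folder = self.folders.setdefault(item_id, ItemFolder(item_id))
        folder.articles.append(article)


def harvest(groups: list[TitleGroup]) -> list[dict]:
    """Flatten groups and folders into article-level index rows, ordered by start page."""
    rows = []
    for group in groups:
        for folder in group.folders.values():
            for art in sorted(folder.articles, key=lambda a: a.start_page):
                rows.append({
                    "TitleID": group.title_id,
                    "ItemID": folder.item_id,
                    "title": art.title,
                    "pages": f"{art.start_page}-{art.end_page}",
                })
    return rows


# Placeholder identifiers and titles, for illustration only.
group = TitleGroup("T1")
group.add_article("I1", Article("Second example article", 11, 20))
group.add_article("I1", Article("First example article", 1, 10))
group.add_article("I2", Article("Another example article", 5, 9))

rows = harvest([group])
for row in rows:
    print(row)
```

Because the group and folder names carry the TitleID and ItemID, every harvested row can be tied back to a specific scanned item in BHL without any further matching.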

PDF storage

Although Mendeley offers PDF storage, this is one feature I'd be less inclined to use. Mendeley's rules for sharing PDFs and making them publicly available are too restrictive (they often don't know whether a PDF can, in fact, be shared). Plus you want tools to visualise, index, and archive PDFs: in effect, a big file store with added features. I have some ideas on how this can be implemented (and have a rough working version to support http://iphylo.org/~rpage/itaxon). Alternatively, one could use Internet Archive services.

Summary

As I've often argued, given the success of tools like Mendeley it seems pointless for anyone to try and build yet another online bibliographic database. The trick is to figure out how to leverage what Mendeley provides to support what the taxonomic (and broader biodiversity) community needs.
