
BHL needs to engage with publishers (and EOL needs to link to primary literature)

Browsing EOL I stumbled upon the recently described fish Protoanguilla palau, shown below in an image by rairaiken2011:
Palauan Primitive Cave Eel

Two things struck me. The first is that the EOL page for this fish gives absolutely no clue as to where you would find out more about it (apart from an unclickable link to the Wikipedia page http://en.wikipedia.org/wiki/Protoanguilla - seriously, a link that isn't clickable?), despite the fact that this fish has recently been described in an Open Access publication ("A 'living fossil' eel (Anguilliformes: Protanguillidae, fam. nov.) from an undersea cave in Palau", http://dx.doi.org/10.1098/rspb.2011.1289).

Now that I've got my customary grumble about EOL out of the way, let's look at the article itself. On the first page of the PDF it states:
This article cites 29 articles, 7 of which can be accessed free
http://rspb.royalsocietypublishing.org/content/early/2011/09/16/rspb.2011.1289.full.html#ref-list-1

So 22 of the articles or books cited in this paper are, apparently, not freely available. However, looking at the list of literature cited it becomes obvious that rather more of these citations are available online than we might think. For example, there are articles that are in the Biodiversity Heritage Library (BHL), e.g.


Then there are articles that are available in other digitising projects

  • Hay O. P. 1903 On a collection of Upper Cretaceous fishes from Mount Lebanon, Syria, with descriptions of four new genera and nineteen new species. Bull. Am. Mus. Nat. Hist. N. Y. 19, 395–452. http://hdl.handle.net/2246/1500
  • Nelson G. J. 1966 Gill arches of fishes of the order Anguilliformes. Pac. Sci. 20, 391–408. http://hdl.handle.net/10125/7805

Furthermore, there are articles that aren't necessarily free, but which have been digitised and have DOIs that have been missed by the publisher, such as the Regan paper above, and


So, the Proceedings of the Royal Society has underestimated just how many citations the reader can view online. The problem, of course, is how does a publisher discover these additional citations? Some have been missed because of sloppy bibliographic data. The missing DOIs are probably because the Regan citation lacks a volume number, and the Trewavas paper uses a different volume number to that used by Wiley (who digitised Proc. Zool. Soc. Lond.). But the content in BHL and other digital archives will be missed because finding these is not part of a publisher's normal workflow. Typically citations are matched by using services ultimately provided by CrossRef, and the bulk of BHL content is not in CrossRef.

So it seems there's an opportunity here for someone to provide a service for publishers that adds value to their content in at least three ways:
  1. Add missing DOIs due to problematic citations for older literature
  2. Add links to BHL content
  3. Add links to content in additional digitisation projects, such as journal archives in DSpace repositories

For readers this would enhance their experience (more of the literature becomes accessible to them), and for BHL and the repositories it will drive more readers to those repositories (how many people reading the paper on Protoanguilla palau have even heard of BHL?). I've said most of this before, but I really think there's an opportunity here to provide services to the publishing industry, and we don't seem to be grasping it yet.

Wikipedia History Flow tool now in GitHub

Inspired by a comment on my post Visualising edit history of a Wikipedia page, the code I use to make history flow diagrams like the one below is now in GitHub at https://github.com/rdmpage/wikihistoryflow.

Historyflow

There is also a live version at http://iphylo.org/~rpage/wikihistoryflow. If you enter the name of a Wikipedia page the tool will display the edit history with columns representing page versions and individual contributors (people and bots) distinguished by different colours.

This tool will fall over for pages with a lengthy history of edits, and requires a web browser that can support SVG, but it's a fun visualisation, and may inspire someone to do this properly.

Apache mod_rewrite and question marks "?"

Quick note to self in case I (inevitably) forget later. If you are using Apache mod_rewrite to make nice, clean URLs, and are also supporting JSONP, you may run into the situation where you have code that wants to append "?callback=xxx" to your URL (e.g., a cross-domain AJAX call in jQuery).

Imagine you have a nice clean URL /user/123, which actually corresponds to user.php?id=123. If you append ?callback=xxx to the URL then chances are the code will break: the rewritten URL already carries its own query string (?id=123), and by default mod_rewrite replaces the incoming query string with that one rather than merging the two, so the callback parameter never reaches your script. What you actually want to send to your web server is user.php?id=123&callback=xxx (note the & before "callback").

After much grief trying to figure out how to coerce Apache mod_rewrite into handling this situation I found the answer, of course, on Stack Overflow. If you use the [QSA] flag, Apache will append the additional callback parameter onto the end of the rewritten URL, so JSONP will now work. Once again, Stack Overflow turned a show-stopper into a learning experience.
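For future reference, here is a minimal sketch of what such a rule might look like. It assumes the rule lives in an .htaccess file in the same directory as user.php and that user IDs are purely numeric; the pattern and file name are illustrative only, not copied from a real site.

```apache
RewriteEngine On

# Hypothetical rule: map the clean URL /user/123 to user.php?id=123.
# QSA ("query string append") merges any incoming query string,
# such as callback=xxx, into the rewritten one instead of dropping it.
# L stops processing further rewrite rules once this one matches.
RewriteRule ^user/([0-9]+)$ user.php?id=$1 [QSA,L]
```

With a rule along these lines, a request for /user/123?callback=xxx should arrive at the script as user.php?id=123&callback=xxx. Without the QSA flag the same request would arrive as user.php?id=123.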