
Mendeley API: we'll bring the awesome if you bring the documentation

Mendeley's API has been publicly launched at http://dev.mendeley.com/, accompanied by various announcements such as:
Mendeley's Research API is now open to the public. Developers, go forth and bring the awesome :) http://dev.mendeley.com/ (@subcide)

Finally saw the awesome Easter Egg that @subcide hid on the new dev.mendeley.com Developer Portal! Whoaaa! (@mendeley_com)

All good fun to be sure, but it's a pity more effort has been spent on Easter eggs than on documenting and testing the API. If you visit the API development site there's precious little in the way of documentation, and few examples. As well as making a developer's life harder, the lack of examples meant some bugs went uncaught, such as the failure of the API calls to return details such as volume, issue, and page numbers for articles, and the inability to retrieve a document using a DOI (the '/' that a DOI contains breaks the API). These are fairly obvious things. If resources are limited, perhaps the Mendeley API team should open up the development web site to others to help create documentation and examples. A wiki would be one way to do this.
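The DOI problem is a classic URL-escaping issue: a DOI such as 10.1145/1323688.1323690 contains a '/', so dropping it raw into a REST path makes the server see an extra path segment. A minimal sketch of the client-side workaround (the function name is mine, not part of the Mendeley API):

```python
from urllib.parse import quote

def encode_doi_for_path(doi):
    """Percent-encode a DOI so it can sit in a single URL path segment.

    quote() with safe='' encodes every reserved character, so the '/'
    inside the DOI becomes %2F instead of a path separator.
    """
    return quote(doi, safe='')

print(encode_doi_for_path("10.1145/1323688.1323690"))
# → 10.1145%2F1323688.1323690
```

Of course, this only helps if the server decodes %2F back into the identifier rather than rejecting it, which is exactly the sort of thing worked examples in the documentation would have flushed out.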

Mendeley is a great idea, but on occasion the hype gets ahead of reality. The product has a lot of potential, but also has some significant problems. Using the search API you pretty quickly encounter its number one problem: duplicates. I get the sense that Mendeley is about three things:

  1. Managing personal bibliographies and generating citations (desktop client)

  2. Networking ("the Last.fm of research") (web site)

  3. Bibliographic data

Number 3 is, I suspect, the hardest problem to tackle, and it is where the ultimate value lies (think citation networks, audience data, iTunes-like business model for selling articles, etc.). I'd like Mendeley a lot more if I was confident that they had a good handle on the complexities of bibliographic data (and didn't drop pagination from API calls). Good places to start are "Are your citations clean?" (doi:10.1145/1323688.1323690) and "Learning metadata from the evidence in an on-line citation matching scheme" (doi:10.1145/1141753.1141817), both currently duplicated in Mendeley (try searching for Are your citations clean and Learning metadata from the evidence in an on-line citation matching scheme).
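Even a crude first pass at the duplicate problem is straightforward: normalise titles (case, punctuation, whitespace) and look for collisions. This is a sketch of the idea, not Mendeley's actual matching logic, and real citation matching (as the two papers above discuss) needs far more than title strings:

```python
import re

def normalize_title(title):
    """Crude normalisation: lowercase, strip punctuation, collapse spaces."""
    title = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return re.sub(r"\s+", " ", title).strip()

def find_duplicates(records):
    """Group records whose normalised titles collide."""
    seen = {}
    for rec in records:
        seen.setdefault(normalize_title(rec["title"]), []).append(rec)
    return [group for group in seen.values() if len(group) > 1]

records = [
    {"title": "Are your citations clean?"},
    {"title": "Are Your Citations Clean"},
    {"title": "Learning metadata from the evidence in an "
              "on-line citation matching scheme"},
]
print(find_duplicates(records))  # the two "citations clean" records collide
```

Anything this simple will miss duplicates with typos or abbreviated titles, which is why the literature on citation matching leans on fuzzy matching and evidence from authors, venues, and pagination.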

Extracting semantic goodness from Zootaxa articles


I've just come back from a holiday in New Zealand, during which time I spent a morning chatting with Zhi-Qiang Zhang (@Zootaxa, editor of Zootaxa) and Stephen Thorpe (stho002, a major contributor to Wikispecies).

Fresh from playing with PLoS XML to explore ways of redisplaying articles (described in my commentary on the PLoS iPad app), I was extolling the virtues of the XML mark-up that underlies PLoS (and other Open Access journals, such as the BMC series). These publishers provide Open Access XML versions of their papers that are quite richly marked up: internal citations, links to figures, the bibliography, etc. are all clearly identified, although they don't have the semantic mark-up of TaxPub, used in some recent Zookeys papers.
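To give a flavour of what that mark-up buys you: in the NLM-style XML that PLoS and BMC use, internal citations (`xref` elements) point at bibliography entries (`ref` elements) by id, so linking a citation to its reference is a simple lookup. The fragment below is illustrative, not taken from a real article:

```python
import xml.etree.ElementTree as ET

# Illustrative fragment in the NLM-style DTD used by PLoS and BMC.
xml = """<article>
  <body>
    <p>As shown previously <xref ref-type="bibr"
       rid="pone.0000001-Smith1">[1]</xref>.</p>
  </body>
  <back>
    <ref-list>
      <ref id="pone.0000001-Smith1">
        <citation><article-title>An example paper</article-title></citation>
      </ref>
    </ref-list>
  </back>
</article>"""

root = ET.fromstring(xml)
# Internal citations carry a rid attribute matching a ref's id.
cites = [x.get("rid") for x in root.iter("xref") if x.get("ref-type") == "bibr"]
refs = {r.get("id"): r.findtext(".//article-title") for r in root.iter("ref")}
for rid in cites:
    print(rid, "->", refs.get(rid))
```

With a PDF-only workflow like Zootaxa's, none of this structure exists, so it all has to be reconstructed by text mining.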

Talking to Zhi-Qiang Zhang is always a useful reality check. Zootaxa describes itself as the
World's foremost journal in taxonomy; publisher of 15,421 new taxa in 141,518 pages by 7,385 authors worldwide since 2001

This is taxonomic publishing on a grand scale, averaging more than an article a day. Since 2004 Zootaxa has published 12.60% of the new taxa recorded in Zoological Record, an order of magnitude more than its nearest rival. The journal is being tightly run, and doesn't have cash to spare (it has nothing like the funding PLoS has, for example). Any change to the basic workflow (author submits Word file, this is imported into Adobe Framemaker, which creates the PDF files displayed on the Zootaxa web site) requires compelling justification. Furthermore, any change would have to scale. The level of work required to embellish articles using custom mark-up, such as TaxPub, just isn't feasible.

Zhi-Qiang waxed enthusiastically about Google Books' interface, where basic information such as keywords, geographic location, and references are extracted automatically. Google Books was one inspiration for the article display I use in BioStor, so I wondered how hard it would be to take some of the work I've been doing on BioStor and on adding mark-up to PLoS XML and apply it to Zootaxa PDFs. After some fussing with regular expressions, the bioGUID OpenURL resolver and uBio's FindIT taxonomic name tool, I've some scripts that automate extracting basic information from a Zootaxa PDF, such as the abstract, localities, taxonomic names, GenBank sequences, and the bibliography. You can see some examples at http://iphylo.org/~rpage/zootaxa/. It's all a bit crude, and isn't the same as being able to mark-up the actual text (which could be done, but with rather more effort), but there's potential here to create nice interfaces to Zootaxa papers, as well as extract the data needed to do some interesting queries.
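Much of this extraction boils down to running regular expressions over the text pulled out of the PDF. The patterns below are rough approximations of the sort of thing such scripts use (real GenBank accession and DOI grammars have more cases), not the actual expressions in my scripts:

```python
import re

# Approximate patterns: GenBank accessions like AY123456,
# and DOIs of the form 10.prefix/suffix.
GENBANK = re.compile(r"\b[A-Z]{1,2}\d{5,6}\b")
DOI = re.compile(r"\b10\.\d{4,9}/\S+\b")

text = ("Sequences AY123456 and EU098765 were deposited in GenBank; "
        "see doi:10.1145/1323688.1323690 for background.")

print(GENBANK.findall(text))  # → ['AY123456', 'EU098765']
print(DOI.findall(text))      # → ['10.1145/1323688.1323690']
```

Taxonomic names and localities are harder: names need a service like uBio's FindIT rather than a regex, and localities mean hunting for latitude/longitude strings in a bewildering variety of formats.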



Flipboard and BHL

Flipboard is a new application for the iPad that is pitching itself as a personalised social magazine. Its launch created a lot of buzz, so much so that many users were unable to add their Facebook and Twitter accounts to it, much to their chagrin. I was one of these annoyed users, but now that I've been able to login I've been having a play and it's a lot of fun.



Nice typography and a clever layout are part of the attraction, and there has been some discussion about whether the Biodiversity Heritage Library (BHL) could be integrated.

@chrisfreeland @rdmpage #bhlib integration on @flipboard? The fb interface is very polished, familiar and comfortable



Personally I'm sceptical. For me the key to Flipboard is not so much the nice interface, but the fact that the content is timely and relevant: timely because it's taken from live streams, and relevant because it comes from sources you select, including those from your social network. BHL doesn't have any of these characteristics. It's a huge digital archive with very little structure, and what structure it does have is largely bibliographic. For this content to work in a Flipboard-like environment I think BHL would need to develop "streams" based on, say, taxa, geography, or readership, and these would have to be personalised. In a sense, Flipboard is displaying streams of content assembled by a combination of editors and your social network, and BHL has neither.