Showing posts with label article 2.0. Show all posts

Elsevier articles have interactive phylogenies

Say what you will about Elsevier, they are certainly exploring ways to re-imagine the scientific article. In a comment on an earlier post Fabian Schreiber pointed out that Elsevier have released an app to display phylogenies in articles they publish. The app is based on jsPhyloSVG and is described here. You can see live examples in these articles:

Matos-Maraví, P. F., Peña, C., Willmott, K. R., Freitas, A. V. L., & Wahlberg, N. (2013). Systematics and evolutionary history of butterflies in the “Taygetis clade” (Nymphalidae: Satyrinae: Euptychiina): Towards a better understanding of Neotropical biogeography. Molecular Phylogenetics and Evolution, 66(1), 54–68. doi:10.1016/j.ympev.2012.09.005
Poćwierz-Kotus, A., Burzyński, A., & Wenne, R. (2010). Identification of a Tc1-like transposon integration site in the genome of the flounder (Platichthys flesus): A novel use of an inverse PCR method. Marine Genomics, 3(1), 45–50. doi:10.1016/j.margen.2010.03.001

Viewing scientific articles on the iPad: cloning the Nature.com iPhone app using jQuery Mobile

Over the last few months I've been exploring different ways to view scientific articles on the iPad, summarised here. I've also made a few prototypes, either from scratch (such as my response to the PLoS iPad app) or using Sencha Touch (see Touching citations on the iPad).

Today, it's time for something a little different. The Sencha Touch framework I used earlier is huge and wasn't easy to get my head around. I was resigning myself to trying to get to grips with it when jQuery Mobile came along. Still in alpha, jQuery Mobile is very simple and elegant, and writing an app is basically a case of writing HTML (with a little Javascript here and there if needed). It has a few rough edges, but it's possible to create something usable very quickly. And, it's actually fun.

So, to learn a bit more about how to use it, I decided to see if I could write a "clone" of Nature.com's iPhone app (which I reviewed earlier). Nature's app is in many ways the most interesting iOS app for articles because it doesn't treat the article as a monolithic PDF, but rather it uses the ePub format. As a result, you can view figures, tables, and references separately.

The clone

You can see the clone here.



I've tried to mimic the basic functionality of the Nature.com app in terms of transitions between pages, display of figures, references, etc. In making this clone I've focussed on just the article display.

A web app is going to lack the speed and functionality of a native app, but is probably a lot faster to develop. It also works on a wider range of platforms. jQuery Mobile is committed to supporting a wide range of platforms, so this clone should work on platforms other than the iPad.

The Nature.com app has a lot of additional functionality apart from just displaying articles, such as listing the latest articles from Nature.com journals, managing a user's bookmarks, and enabling the user to buy subscriptions. Some of this functionality would be pretty easy to add to this clone, for example by consuming RSS feeds to get article lists. With a little effort one could have a simple, Web-based app to browse Nature content across a range of mobile devices.
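As a rough illustration of the RSS idea, here is a minimal Python sketch that pulls titles and links out of a feed. It assumes a plain RSS 2.0 layout (a `<channel>` containing `<item>` elements); the sample feed, its URLs, and the function name `list_articles` are made up for the example, and a real publisher feed may well use a different flavour of RSS.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; a real one would be fetched over HTTP.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example journal</title>
    <item>
      <title>First example article</title>
      <link>http://example.org/articles/1</link>
    </item>
    <item>
      <title>Second example article</title>
      <link>http://example.org/articles/2</link>
    </item>
  </channel>
</rss>"""


def list_articles(feed_xml):
    """Return (title, link) pairs for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    articles = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        articles.append((title, link))
    return articles


for title, link in list_articles(SAMPLE_FEED):
    print(title, link)
```

Each pair could then become one row in a list view, with the link used to fetch the article itself.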

Technical stuff

Nature's app uses the ePub format, but Nature's web site doesn't provide an option to download articles in ePub format. However, if you use an HTTP debugging proxy (such as Charles Proxy) when using Nature's app you can see the URLs needed to fetch the ePub file.

I grabbed a couple of ePub files for articles in Nature Communications and unzipped them (.epub files are zip files). My clone is a single HTML file that uses some Ajax calls to populate the different views. One Ajax call takes the index.html that has the article text and replaces the internal and external links with calls to Javascript functions. An article's references, figure captions, and tables are stored in separate XML files, so I have some simple PHP scripts that read the XML and extract the relevant bits. Internal links (such as to figures and references) are handled by jQuery Mobile. External links are displayed within an iframe.
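To make the two steps concrete (reading a file out of the ePub, then rewriting its links), here is a small Python sketch. It is a stand-in for illustration only, not the PHP and Javascript the clone uses. The handler names `showInternal` and `showExternal`, the tiny sample archive, and the assumption that links appear as double-quoted `href` attributes are all mine.

```python
import io
import re
import zipfile


def read_epub_member(epub_bytes, member):
    """Read one file out of an ePub, which is an ordinary zip archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.read(member).decode("utf-8")


def rewrite_links(html):
    """Point internal (#fragment) and external (http) links at JS handlers.

    showInternal and showExternal are hypothetical function names.
    """
    def replace(match):
        target = match.group(1)
        if target.startswith("#"):
            return "href=\"javascript:showInternal('%s')\"" % target[1:]
        if target.startswith("http"):
            return "href=\"javascript:showExternal('%s')\"" % target
        return match.group(0)  # leave anything else untouched

    return re.sub(r'href="([^"]*)"', replace, html)


def make_sample_epub():
    """Build a tiny in-memory archive so the sketch runs without a real ePub."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "index.html",
            '<a href="#f1">Fig. 1</a> <a href="http://example.org/">site</a>',
        )
    return buffer.getvalue()


html = read_epub_member(make_sample_epub(), "index.html")
print(rewrite_links(html))
```

A regular expression is a blunt tool for HTML; it is enough for a prototype, but a proper parser would be safer on messier markup.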

There are some intellectual property issues to address. Nature isn't an Open Access journal, but some articles in Nature Communications are (under the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License), so I've used two of these as examples. When it displays an article, Nature's app uses Droid fonts for the article heading. These fonts are supplied as an SVG file contained within the ePub file. Droid fonts are available under an Apache License as TrueType fonts as part of the Android SDK. I couldn't find SVG versions of the fonts in the Android SDK, so I use the TrueType fonts (see Jeffrey Zeldman's Web type news: iPhone and iPad now support TrueType font embedding. This is huge.). Oh, and I "borrowed" some of the CSS from the style.css file that comes with each ePub file.

Viewing scientific articles on the iPad: browsing articles

In previous articles I've looked at how various apps display scientific articles. The apps I looked at were:

So, where next? As Ian Mulvany noted in a comment on an earlier post, I haven't attempted to summarise the best user interface metaphors for navigation. Rather than try and do that in the abstract, I'd like to create some prototypes to play with various ideas. The Sencha Touch framework looks a good place to start. It's web-based, so things can be prototyped rapidly (I'm not going to learn Objective C anytime soon). There's a moderately steep learning curve, unless you've written a lot of Javascript (I've done some, but not a lot), but it seems to offer a lot of functionality. Another advantage of developing a web app is that it keeps the focus on making the content accessible across devices, and using the web as the means to display and interact with content.

Then there is also the issue (in addition to displaying an individual article) of how to browse and find articles to view. Here are some possibilities.

Publisher's stream
Apps such as the Nature app and the PLoS Reader provide you with a stream of articles from a single publisher. This is obviously a bit limiting for the reader, but might have some advantages if the publisher has specifically enhanced their content for devices such as the iPad.

Personal library
Apps such as Mendeley and Papers provide articles from your personal library. These are papers you care about, and ones you may make active use of.

Social
Social readers such as Flipboard show the power of bringing together in one place content derived from social streams, such as Twitter and Facebook, as well as curated sources and publisher streams. Mendeley and other social bookmarking services (e.g., CiteULike, Connotea) could be used to provide similar social streams of papers for an article viewer. Here the goal is probably to find out what papers people you know find interesting.

Spatial
In an earlier post I used a map to explore papers in my BioStor archive. This would be an obvious thing to add to an iPad app, especially as the iPad knows where you are. Hence, you could imagine browsing papers about areas that are near you, or perhaps by authors near you. This would be useful if, say, you wanted to know about ecological or health studies of the area you live in. If the geographic search was for people rather than papers, you could easily discover what kind of research is published by universities or other research bodies that are near your current location.

Of course, Earth is not the only thing we can explore spatially. Google Maps can display other bodies in the solar system (e.g., Mars), as well as the night sky. Imagine being interested in astronomy and being able to browse papers about specific planetary or stellar objects. Likewise, genomes can be browsed using Google Maps-inspired browsers (e.g., JBrowse), so we could have an app where you could easily retrieve articles about a particular gene or other region of a genome.
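For the "papers near you" case, the core of such a search is just a distance filter. Here is a minimal Python sketch using the haversine great-circle formula; the paper records, their coordinates, and the function names are invented for the example, and a real app would query a spatial index rather than loop over a list.

```python
import math

# Hypothetical records: each paper tagged with one locality.
PAPERS = [
    {"title": "Paper about Glasgow", "lat": 55.86, "lon": -4.25},
    {"title": "Paper about Auckland", "lat": -36.85, "lon": 174.76},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def papers_near(papers, lat, lon, max_km):
    """Return titles of papers within max_km of (lat, lon), nearest first."""
    nearby = []
    for paper in papers:
        distance = haversine_km(lat, lon, paper["lat"], paper["lon"])
        if distance <= max_km:
            nearby.append((distance, paper["title"]))
    return [title for distance, title in sorted(nearby)]


# A reader standing in Edinburgh, looking within 100 km.
print(papers_near(PAPERS, 55.95, -3.19, 100))
```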

Categories
Another way to browse content is by topic. Classifying knowledge into categories is somewhat fraught, but there are some obvious ways this could be useful. A biologist might want to navigate content by taxonomic group, particularly if they want to browse through the thousands of articles published in a journal such as Zootaxa (hence my experiments on browsing EOL). Of course, a tree is not the only way to navigate hierarchical content. Treemaps are another example, and I've played with various versions in the past (see here and here).


I have a love-hate relationship with treemaps, but some of the most interesting work I've seen on treemaps has been motivated by displaying information on small screens, e.g. "Using treemaps to visualize threaded discussion forums on PDAs" (doi:10.1145/1056808.1056915).
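For anyone who hasn't met treemaps before, the simplest layout (usually called slice-and-dice) is easy to sketch: divide a rectangle among the children in proportion to their size, then recurse, alternating the direction of the cut. The Python below is a toy version under my own assumptions: nodes are `(name, size, children)` tuples, leaf sizes are positive, and the taxon counts are invented.

```python
# Hypothetical classification with made-up article counts on the leaves.
EXAMPLE_TREE = ("Animalia", 0, [
    ("Insecta", 60, []),
    ("Chordata", 0, [
        ("Aves", 30, []),
        ("Mammalia", 10, []),
    ]),
])


def total_size(node):
    """Size of a leaf, or the summed size of all leaves below a node."""
    name, size, children = node
    if not children:
        return size
    return sum(total_size(child) for child in children)


def layout(node, x, y, width, height, vertical=True, out=None):
    """Slice-and-dice layout: returns (name, x, y, width, height) per leaf."""
    if out is None:
        out = []
    name, size, children = node
    if not children:
        out.append((name, x, y, width, height))
        return out
    total = total_size(node)
    offset = 0.0  # fraction of the rectangle already handed out
    for child in children:
        fraction = total_size(child) / total
        if vertical:   # cut along the x axis
            layout(child, x + offset * width, y,
                   fraction * width, height, False, out)
        else:          # cut along the y axis
            layout(child, x, y + offset * height,
                   width, fraction * height, True, out)
        offset += fraction
    return out


for rect in layout(EXAMPLE_TREE, 0.0, 0.0, 100.0, 100.0):
    print(rect)
```

Slice-and-dice tends to produce long thin rectangles, which is one reason other layouts (such as squarified treemaps) are often preferred on small screens.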

Summary
These notes list some of the more obvious ways to browse a collection of articles. It would be fun to explore these (and other approaches) in parallel with thinking about how to display the actual articles. These two issues are related, in the sense that the more metadata we can extract from the articles (such as keywords, taxonomic names and other named entities, geographic localities, etc.) the richer the possibilities for finding our way through those articles.

Extracting semantic goodness from Zootaxa articles


I've just come back from a holiday in New Zealand, during which time I spent a morning chatting with Zhi-Qiang Zhang (@Zootaxa, editor of Zootaxa) and Stephen Thorpe (stho002, a major contributor to Wikispecies).

Fresh from playing with PLoS XML to explore ways of redisplaying articles (described in my commentary on the PLoS iPad app), I was extolling the virtues of the XML mark-up that underlies PLoS (and other Open Access journals, such as the BMC series). These publishers provide Open Access XML versions of their papers that are quite richly marked up: internal citations, links to figures, the bibliography, etc. are all clearly identified, although they don't have the semantic mark-up of TaxPub, used in some recent Zookeys papers.

Talking to Zhi-Qiang Zhang is always a useful reality check. Zootaxa describes itself as the
"World's foremost journal in taxonomy; publisher of 15,421 new taxa in 141,518 pages by 7,385 authors worldwide since 2001"

This is taxonomic publishing on a grand scale, averaging more than an article a day. Since 2004 Zootaxa has published 12.60% of the new taxa recorded in Zoological Record, an order of magnitude more than its nearest rival. The journal is tightly run, and doesn't have cash to spare (it has nothing like the funding PLoS has, for example). Any change to the basic work flow (author submits Word file, this is imported into Adobe FrameMaker, which creates the PDF files displayed on the Zootaxa web site) requires compelling justification. Furthermore, any change would have to scale. The level of work required to embellish articles using custom mark-up, such as TaxPub, just isn't feasible.

Zhi-Qiang waxed enthusiastic about Google Books' interface, where basic information such as keywords, geographic location, and references are extracted automatically. Google Books was one inspiration for the article display I use in BioStor, so I wondered how hard it would be to take some of the work I've been doing on BioStor and on adding mark-up to PLoS XML and apply it to Zootaxa PDFs. After some fussing with regular expressions, the bioGUID OpenURL resolver and uBio's FindIT taxonomic name tool, I have some scripts that automate extracting basic information from a Zootaxa PDF, such as the abstract, localities, taxonomic names, GenBank sequences, and the bibliography. You can see some examples at http://iphylo.org/~rpage/zootaxa/. It's all a bit crude, and isn't the same as being able to mark up the actual text (which could be done, but with rather more effort), but there's potential here to create nice interfaces to Zootaxa papers, as well as extract the data needed to do some interesting queries.
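To give a flavour of the regular-expression side of this, here is a minimal Python sketch that pulls candidate GenBank accession numbers out of text extracted from a PDF. It is not the script described above. The pattern assumes only the two classic nucleotide formats (one letter plus five digits, or two letters plus six digits), and the sample sentence and its accession numbers are purely illustrative.

```python
import re

# Assumption: classic nucleotide accessions only, e.g. U12345 or EU003287.
ACCESSION = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\b")


def find_accessions(text):
    """Return candidate accession numbers in order of first appearance."""
    seen = []
    for match in ACCESSION.findall(text):
        if match not in seen:  # drop repeats but keep the original order
            seen.append(match)
    return seen


# Illustrative text only, standing in for text extracted from a PDF.
SAMPLE_TEXT = (
    "Sequences were deposited in GenBank (EU003287, EU003288) and "
    "compared with U12345. See also EU003287."
)

print(find_accessions(SAMPLE_TEXT))
```

A pattern this loose will also match things that merely look like accession numbers (specimen codes, for instance), so the matches are best treated as candidates to be checked against GenBank.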



ZooKeys publishes articles of the future

The open access taxonomic journal ZooKeys has published a special issue with four papers, each available in HTML, PDF, and XML, the latter being extensively marked up. Penev et al. ("Semantic tagging of and semantic enhancements to systematics papers: ZooKeys working examples", doi:10.3897/zookeys.50.538) describes the process involved in creating these XML files. Two papers (doi:10.3897/zookeys.50.506 and doi:10.3897/zookeys.50.505) were created using authoring tools available in Scratchpads, as outlined by Blagoderov et al. ("Streamlining taxonomic publication: a working example with Scratchpads and ZooKeys", doi:10.3897/zookeys.50.539). When you view the HTML for these articles you can toggle on or off the highlighting of citations, taxonomic names, and geographic co-ordinates. Mouse over a taxonomic name, for example, and a popup appears with links to GBIF, NCBI, EOL, BHL, Wikipedia, etc.


I think these papers represent one view of the future of scientific publishing ("article 2.0"), and I'm flattered that Penev et al. cite my Elsevier challenge work (doi:10.1016/j.websem.2010.03.004, preprint at hdl:10101/npre.2009.3173.1) as one of the sources of inspiration (along with the landmark Shotton et al. "Adventures in Semantic Publishing: Exemplar Semantic Enhancements of a Research Article" doi:10.1371/journal.pcbi.1000361, which I've discussed previously). It is also good to see the TaxPub XML schema used by a publisher, and Scratchpads being a part of the process of publishing taxonomic information.

Deep linking

My initial impression is that there is huge potential here, although I think there is still lots to do. I'm not totally convinced that popups are the way to go (although I've dabbled with them as well), and we need to move beyond simply linking to other sites to a deeper form of integration. For example, a ZooKeys article might link to BHL via a taxonomic name, but how about deeper linking? For example, the paper by Brake and von Tschirnhaus (doi:10.3897/zookeys.50.505) contains the following citations:

Biró L (1899) Commensalismus bei Fliegen. Természetrajzi füzetek 22: 198–204.

Kertész K (1899) Verzeichnis einiger, von L. Biró in Neu-Guinea und am Malayischen Archipel gesammelten Dipteren. Természetrajzi füzetek 22: 173–19

Neither reference has any links in the HTML, so the user is under the impression that they aren't available online, but both references have been scanned by BHL. You can see full text for these articles in BioStor (references 52005 and 52004, respectively -- note that the pagination for Biró 1899 is given incorrectly in the paper). This is one area where BHL has a lot to offer publishers, and it would be great to see BHL provide the services publishers need to add these links to their articles.

This integration should go both ways. It's odd that the paper by Brake and von Tschirnhaus contains the LSID used by ZooBank for this paper (urn:lsid:zoobank.org:pub:DABB03F4-A128-43BB-990C-02F25D656B00, see the <self-uri> tag in the XML), but ZooBank doesn't know about the DOI for the paper, hence the ZooBank page for this article has no link to the article itself. It's time to join this stuff together.
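Joining the two identifiers is not hard once they are in the same place. As a small Python sketch: split the LSID into its parts (the general pattern is `urn:lsid:authority:namespace:object`, optionally followed by a revision) and keep it alongside the DOI. The function name and the shape of the record are my own invention; the LSID and DOI are the ones quoted above.

```python
def parse_lsid(lsid):
    """Split an LSID into authority, namespace, object, and revision."""
    parts = lsid.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn"
            or parts[1].lower() != "lsid"):
        raise ValueError("not an LSID: %r" % lsid)
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }


# One record holding both identifiers for the same paper.
record = {
    "doi": "10.3897/zookeys.50.505",
    "lsid": "urn:lsid:zoobank.org:pub:DABB03F4-A128-43BB-990C-02F25D656B00",
}
record["lsid_parts"] = parse_lsid(record["lsid"])
print(record["lsid_parts"]["authority"], record["doi"])
```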

What's next?

What I'd really like to see is article XML repurposed as, say, RDF, and used to populate a database so that we can query it. In this way we can start to atomise the article into useful parts, and recombine them in new and interesting ways. Might be something to play with over the summer.
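As a first approximation of what "repurposing as RDF" might look like, here is a minimal Python sketch that turns a fragment of article XML into subject-predicate-object triples. The sample XML only loosely imitates NLM-style tags and uses made-up DOIs, and Dublin Core `title` plus BIBO `cites` are just one possible choice of vocabulary. Treat it as a sketch rather than a mapping of any real schema.

```python
import xml.etree.ElementTree as ET

DCTERMS_TITLE = "http://purl.org/dc/terms/title"
BIBO_CITES = "http://purl.org/ontology/bibo/cites"

# Hypothetical, heavily simplified article XML with invented DOIs.
SAMPLE_ARTICLE = """<article>
  <front><article-meta>
    <article-id pub-id-type="doi">10.9999/example.1</article-id>
    <title-group><article-title>An example article</article-title></title-group>
  </article-meta></front>
  <back><ref-list>
    <ref id="B1"><citation>
      <pub-id pub-id-type="doi">10.9999/example.2</pub-id>
    </citation></ref>
  </ref-list></back>
</article>"""


def doi_uri(doi):
    """Turn a bare DOI into an HTTP URI to use as an RDF node."""
    return "http://dx.doi.org/" + doi.strip()


def article_to_triples(xml_text):
    """Return (subject, predicate, object) triples for title and citations."""
    root = ET.fromstring(xml_text)
    doi = root.findtext(".//article-id[@pub-id-type='doi']")
    if doi is None:
        raise ValueError("article has no DOI to use as a subject")
    subject = doi_uri(doi)
    triples = []
    title = root.findtext(".//article-title")
    if title:
        triples.append((subject, DCTERMS_TITLE, title.strip()))
    # Only cited works that carry a DOI become links in this sketch.
    for pub_id in root.findall(".//ref//pub-id[@pub-id-type='doi']"):
        triples.append((subject, BIBO_CITES, doi_uri(pub_id.text)))
    return triples


for triple in article_to_triples(SAMPLE_ARTICLE):
    print(triple)
```

Once articles are reduced to triples like these, "which papers cite this one?" becomes a simple query over the pooled data.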

On a practical level, I'm somewhat bemused by the variety of XML formats being used by open access publishers. PLoS use version 2.0 of the NLM Journal Archiving and Interchange Tag Suite, and I wrote an XSLT style sheet to transform PLoS articles for viewing on an iPad. TaxPub is based on version 3.0 of the NLM DTD, which breaks quite a bit of my code relating to citations, so I'll have to tweak this to get it to display ZooKeys articles correctly. Handling TaxPub itself will also require some additional work. Then there are the BMC journals, which have their own flavour of XML (based on something called the "KETON DTD"). It's all a bit messy. But I guess it'd be no fun if it was too easy...


PLoS doesn't "get" the iPad (or the web)

PLoS recently announced a dedicated iPad app that covers all the PLoS Journals, and which is available from the App Store. Given the statement that "PLoS is committed to continue pushing the boundaries of scientific communication" I was expecting something special. Instead, what we get (as shown in the video below) is a PDF viewer with a nice page turning effect (code here). Maybe it's Steve Jobs's fault for showing iBooks when he first demoed the iPad, but the desire to imitate 3D page turning effects leaves me cold (for a nice discussion of how this can lead to horribly mixed metaphors see iA's Designing for iPad: Reality Check).




But I think this app shows that PLoS really don't grok the iPad. Maybe it's early days, but I find it really disappointing that page-turning PDFs are the first thing they come up with. It's not about recreating the paper experience on a device! There's huge scope for interactivity, which the PLoS app simply ignores: you can't select text, and none of the references are clickable. It also ignores the web (without which, ironically, PLoS couldn't exist).

Instead of just moaning about this, I've spent a couple of days fussing with a simple demo of what could be done. I've taken a PLoS paper ("Discovery of the Largest Orbweaving Spider Species: The Evolution of Gigantism in Nephila", doi:10.1371/journal.pone.0007516), grabbed the XML, applied an XSLT style sheet to generate some HTML, and added a little Javascript functionality. References are displayed as clickable links inline. If you click on one a window pops up displaying the citation, and it then tries to find it for you online (for the technically minded, it's using OpenURL and bioGUID). If it succeeds it displays a blue arrow; click that and you're off to the publisher's web site to view the article.
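For the curious, the OpenURL part amounts to packing the citation into a query string and handing it to a resolver. Below is a minimal Python sketch using the OpenURL 1.0 key/value keys for journal articles. The resolver address is a placeholder (not the bioGUID endpoint), and the citation field names and example values are my own.

```python
from urllib.parse import urlencode

# Placeholder address; substitute the resolver you actually use.
RESOLVER = "http://resolver.example.org/openurl"

# Map from my own citation field names to OpenURL journal-format keys.
FIELD_MAP = {
    "author_last": "rft.aulast",
    "title": "rft.atitle",
    "journal": "rft.jtitle",
    "volume": "rft.volume",
    "start_page": "rft.spage",
    "year": "rft.date",
}


def citation_to_openurl(citation, resolver=RESOLVER):
    """Build an OpenURL 1.0 (KEV) query for a journal article citation."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    for field, key in FIELD_MAP.items():
        value = citation.get(field)
        if value:  # skip fields the citation parser could not fill in
            params[key] = str(value)
    return resolver + "?" + urlencode(params)


# A made-up citation, as it might come out of the article XML.
example = {"journal": "Example Journal", "volume": 22,
           "start_page": 198, "year": 1899}
print(citation_to_openurl(example))
```

If the resolver finds a match it can answer with an identifier or a link to the article, and that answer is what would decide whether to show the arrow.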

Figures are also links: click on one and you get a Lightbox view of the image.
You can view this article live, in a regular browser or on an iPad. Here's a video of the demonstration page:


This is all very crude and rushed. There's a lot more that could be done. For references we could flag which articles are self citations, we could talk to bookmarking services via their APIs to see which citations the reader already has, etc. We could also make data, sequences, and taxonomic names clickable, providing the reader with more information and avenues for exploration. Then there's the whole issue of figures. For graphs we should have the underlying data so that we can easily make new visualisations, phylogenies should be interactive (at least make the taxon names clickable), and there's no need to amalgamate figures into aggregates like Fig. 2 below. Each element (A-E) should be separately addressable so when the text refers to Fig. 2D we can show the user just that element.

[Figure: Fig. 2 of doi:10.1371/journal.pone.0007516, a composite of panels A-E]
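Making panels addressable starts with spotting references such as "Fig. 2D" in the text. Here is a small Python sketch of that step. The pattern assumes references look like "Fig. 2", "Fig. 2D" or "Figure 3", and the `fig2D`-style element ids are a naming scheme I've made up for illustration.

```python
import re

# Assumption: the panel letter, if any, directly follows the figure number.
FIGURE_REF = re.compile(r"\bFig(?:ure|\.)?\s*(\d+)([A-Z])?\b")


def find_figure_refs(text):
    """Return (figure number, panel letter or None) for each reference."""
    refs = []
    for number, panel in FIGURE_REF.findall(text):
        refs.append((int(number), panel or None))
    return refs


def element_id(number, panel=None):
    """Hypothetical id for the element to show, e.g. fig2D or fig2."""
    return "fig%d%s" % (number, panel or "")


sentence = "As shown in Fig. 2D and Figure 3, but see Fig. 2."
for number, panel in find_figure_refs(sentence):
    print(element_id(number, panel))
```

With ids like these on each panel, a link from the text could display just panel D rather than the whole composite.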

The PLoS app and reactions to Elsevier's "Article 2.0" (e.g., Elsevier's 'Article of the Future' resembles websites of the past and The “Article of the Future” — Just Lipstick Again?) suggest publishers are floundering in their efforts to get to grips with the web, and new platforms for interacting with the web.

So, PLoS, I challenge you to show us that you actually "get" the iPad and what it could mean for science publishing. Because at the moment, I've seen nothing that suggests you grasp the opportunity it represents. Better yet, why not revisit Elsevier's Article 2.0 project and have a challenge specifically about re-imagining the scientific article? And please, no more page-turning effects.