
e-Biosphere 09 Challenge slides

I've put the slides for my e-Biosphere 09 challenge entry on SlideShare.

Not much information on the other entries yet, except for the eBiosphere Citizen Science Challenge, by Joel Sachs and colleagues, which will demonstrate a "global human sensor net". Their plan is to aggregate observations posted on Flickr, Twitter, Spotter, and email. It might be fun to make use of some of this for my own entry (by default it already does, because we are both using EOL's Flickr pool).

Visualizing the Evolutionary Tree of Life

Over on the EOL blog is a summary of a meeting, Visualizing the Evolutionary Tree of Life. This sounds like it was a fun meeting, but part of me is suffering from déjà vu. Our community has tossed this subject around for a while now. I recall Tamara Munzner wowing us with the H3 hyperbolic browser at a meeting at UC Davis in 2000 (part of the original NSF TOL workshops, archived here) -- the image of Joel Cracraft excitedly running up to the display screen to find various birds is forever etched into my brain. TreeJuxtaposer has been around for a while, to not much effect.

We seem to continue to fail to make much progress on this topic, despite meetings such as the EOL one, great reviews of the topic, fancy 3D visualisation, and Mike Sanderson's way kewl wall of monitors.

Maybe progress was made at the EOL meeting, but I don't get the sense that we're any further forward. It would be interesting to work out why we've struggled to satisfactorily solve this problem.

e-Biosphere Challenge: visualising biodiversity digitisation in real time

e-Biosphere '09 kicks off next week, and features the challenge:
Prepare and present a real-time demonstration during the days of the Conference of the capabilities in your community of practice to discover, disseminate, integrate, and explore new biodiversity-related data by:
  • Capturing data in private and public databases;
  • Conducting quality assurance on the data by automated validation and/or peer review;
  • Indexing, linking and/or automatically submitting the new data records to other relevant databases;
  • Integrating the data with other databases and data streams;
  • Making these data available to relevant audiences;
  • Make the data and links to the data widely accessible; and
  • Offering interfaces for users to query or explore the data.


Originally I planned to enter the wiki project I've been working on for a while, but time was running out and the deadline was too ambitious. Hence, I switched to thinking about RSS feeds. The idea was to first create a set of RSS feeds for sources that lack them, which I've been doing over at http://bioguid.info/rss, then integrate these feeds in a useful way. For example, the feeds would include images from Flickr (such as EOL's pool), geotagged sequences from GenBank, the latest papers from Zootaxa, and new names from uBio (I'd hoped to include ION as well, but they've been spectacularly hacked).

After playing with triple stores and SPARQL (incompatible vocabularies and multiple identifiers rather bugger this approach), and visualisations based on Google Maps (building on my swine flu timemap), it dawned on me that what I really needed was an eye-catching way of displaying geotagged, timestamped information, just like David Troy's wonderful twittervision and flickrvision.com. In particular, David took the Poly9 Globe and added Twitter and Flickr feeds (see twittervision 3D and flickrvision 3D). So, I hacked David's code and created this, which you can view at http://bioguid.info/ebio09/www/3d/:



It's a lot easier to simply look at it rather than describe what it does, but here's a quick sketch of what's under the hood.

Firstly, I take RSS feeds, either the raw geoFeed from Flickr, or from http://bioguid.info/rss. The bioGUID feeds include the latest papers in Zootaxa (most new animal species are described in this journal), a modified version of uBio's new names feed, and a feed of the latest geotagged sequences in GenBank (I'd hoped to use only DNA barcodes, but it turns out rather few barcode sequences are geotagged, and few have the "BARCODE" keyword). The Flickr feeds are simple to handle because they include locality information (latitude, longitude, and Yahoo Where-on-Earth Identifiers (WOEIDs)). Similarly, the GenBank feed I created has latitude and longitude (although extracting these isn't always as straightforward as it should be). Other feeds require more processing. The uBio feed already has taxonomic names, but no geotagging, so I use services from Yahoo! GeoPlanet™ to find localities from article titles. For the Zootaxa feed that I created I use uBio's SOAP service to extract taxonomic names, and Yahoo! GeoPlanet™ to extract localities.
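To give a flavour of the first step, here is a rough Python sketch of pulling coordinates out of a feed. It is not the code behind the demo. It assumes items are geotagged either with a GeoRSS `georss:point` element or with W3C `geo:lat`/`geo:long` elements, and the sample feed, the function names (`item_coordinates`, `geotagged_items`) and the dictionary keys are all made up for illustration. Items that come back without coordinates are the ones that would need a further geocoding pass, which is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for two common ways of geotagging feed items
# (assumed here; a real feed may use only one of them, or neither).
NS = {
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "georss": "http://www.georss.org/georss",
}

# Invented sample feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
     xmlns:georss="http://www.georss.org/georss">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Photo with W3C geo tags</title>
      <pubDate>Mon, 25 May 2009 10:00:00 GMT</pubDate>
      <geo:lat>51.5</geo:lat>
      <geo:long>-0.12</geo:long>
    </item>
    <item>
      <title>Sequence with a GeoRSS point</title>
      <pubDate>Tue, 26 May 2009 12:30:00 GMT</pubDate>
      <georss:point>-33.9 18.4</georss:point>
    </item>
    <item>
      <title>Paper with no coordinates</title>
      <pubDate>Wed, 27 May 2009 09:15:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def item_coordinates(item):
    """Return (lat, lon) for an <item>, or None if it is not geotagged."""
    # GeoRSS Simple stores the point as "lat lon" separated by whitespace.
    point = item.findtext("georss:point", namespaces=NS)
    if point:
        parts = point.split()
        if len(parts) == 2:
            return float(parts[0]), float(parts[1])
    # Fall back to separate W3C geo:lat / geo:long elements.
    lat = item.findtext("geo:lat", namespaces=NS)
    lon = item.findtext("geo:long", namespaces=NS)
    if lat is not None and lon is not None:
        return float(lat), float(lon)
    return None


def geotagged_items(feed_xml):
    """Parse an RSS string into a list of dicts with title, date and coords."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title"),
            "published": item.findtext("pubDate"),
            "coords": item_coordinates(item),
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for entry in geotagged_items(SAMPLE_FEED):
        status = entry["coords"] if entry["coords"] else "needs geocoding"
        print(entry["title"], "->", status)
```

Returning `None` rather than guessing a location keeps the split between "already geotagged" feeds (Flickr, GenBank) and "needs more processing" feeds (uBio, Zootaxa) explicit.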


I've tried to create a useful display popup. For Zootaxa papers you get a thumbnail of the paper, and where possible an icon of the taxonomic group the paper talks about (the presence of this icon depends on the success of uBio's taxonomic name finding service, the Catalogue of Life having the same name, and my having a suitable icon). The example above shows a paper about copepods. Other papers have an icon for the journal (again, a function of my being able to determine the journal ISSN and having a suitable icon). Flickr images simply display a thumbnail of the image.

What does it all mean? Well, I could say all sorts of things about integration and mash-ups but, dammit, it's pretty. I think it's a fun way to see just what is happening in digital biodiversity. I've deliberately limited the demo to items that came online in the month of May, and I'll be adding items during the conference (June 1-3rd in London). For example, if any more papers appear in Zootaxa, or in the uBio feeds I'll add those. If anybody uploads geotagged photos to EOL's Flickr group, I'll grab those as well. It's still a bit crude, but it shows some of the potential of bringing things together, coupled with a nice visualisation. I welcome any feedback.