
Wikispecies RSS feed

Following on from my previous post about Wikispecies (which generated some discussion on TAXACOM), I've played some more with Wikispecies.

As a first step I've added a Wikispecies RSS feed to my list of RSS feeds. This feed takes the original Wikispecies RSS feed for new pages (generated by the page Special:NewPages) and tries to extract some details before reformatting it as an Atom feed. Specifically, I extract GUIDs such as IPNI and Index Fungorum identifiers, bibliographic references (which I will later parse to try to extract identifiers such as DOIs), and latitude and longitude if the Wikispecies page has type locality information. Having the latter means that the RSS feed can be displayed as a map (Google Maps can take an RSS feed with geotagged items and display it on a map for you).
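To give a flavour of the extraction step, here is a minimal Python sketch. It is not the code behind the feed. It assumes that identifiers appear in the page text as full LSIDs and that coordinates are written in decimal degrees (e.g. "12.5°S 130.8°E"); both patterns, and the function names, are hypothetical. Geotagging uses the GeoRSS-Simple `point` element, which holds latitude and longitude separated by a space.

```python
import re
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("georss", GEORSS_NS)

# Assumption: identifiers are written out as full LSIDs in the page text.
LSID_RE = re.compile(r"urn:lsid:(?:ipni\.org|indexfungorum\.org):names:[\w-]+", re.I)
# Assumption: coordinates are decimal degrees, e.g. "12.5°S 130.8°E".
COORD_RE = re.compile(r"(\d+(?:\.\d+)?)°\s*([NS])[,\s]+(\d+(?:\.\d+)?)°\s*([EW])")


def extract_lsids(text):
    """Return every IPNI or Index Fungorum LSID found in the text."""
    return [m.group(0) for m in LSID_RE.finditer(text)]


def extract_point(text):
    """Return (latitude, longitude) as signed floats, or None if absent."""
    m = COORD_RE.search(text)
    if m is None:
        return None
    lat = float(m.group(1)) * (1 if m.group(2) == "N" else -1)
    lon = float(m.group(3)) * (1 if m.group(4) == "E" else -1)
    return lat, lon


def atom_entry(title, link, text):
    """Build an Atom <entry>, geotagged when the text has coordinates."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", href=link)
    # Carry each identifier along as an Atom category term.
    for lsid in extract_lsids(text):
        ET.SubElement(entry, f"{{{ATOM_NS}}}category", term=lsid)
    point = extract_point(text)
    if point is not None:
        ET.SubElement(entry, f"{{{GEORSS_NS}}}point").text = f"{point[0]} {point[1]}"
    return entry
```

Calling `ET.tostring(atom_entry(title, link, page_text))` then yields an entry that a GeoRSS-aware map client could plot. Real Wikispecies pages are far less regular than these patterns assume, so the actual feed needs more forgiving parsing.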

The map below is live, so it will show any geotagged items in the current Wikispecies feed.




Wikispecies is not a database


This post was prompted by Stephen Thorpe's post on TAXACOM about Wikispecies in which he wrote (in a thread discussing Roger Hyam's recent blog post) that
[i]f it [Wikispecies] isn't a true database, then it is BETTER than a database. It can do anything a database can do, and more, if you know how it works properly.
I beg to differ. Wikispecies runs on a database (the MediaWiki software uses a database to store the wiki), and MediaWiki can be thought of as a database of semi-structured text, but it lacks a lot of the functionality database users would expect. For example, in Wikispecies there's no way to perform basic queries such as how many descendants a given taxon has, what names a particular author has published, or which geographic region most new names are being described from. Much of this information is in Wikispecies; it just isn't in a form that we can readily query.
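To make concrete what a "basic query" looks like when the data are structured, here is a minimal sketch using SQLite from Python. The `taxon(id, name, parent_id)` table and the toy rows are hypothetical and have nothing to do with how Wikispecies stores its pages; the point is only that, once parent-child links are explicit, counting descendants is a single recursive query.

```python
import sqlite3


def demo_connection():
    """Create an in-memory database holding a toy classification."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO taxon (id, name, parent_id) VALUES (?, ?, ?)",
        [
            (1, "Felidae", None),
            (2, "Panthera", 1),
            (3, "Felis", 1),
            (4, "Panthera leo", 2),
            (5, "Panthera tigris", 2),
            (6, "Felis catus", 3),
        ],
    )
    return conn


def count_descendants(conn, name):
    """Count all taxa below the named taxon, at any depth."""
    row = conn.execute(
        """
        WITH RECURSIVE subtree(id, depth) AS (
            SELECT id, 0 FROM taxon WHERE name = ?
            UNION ALL
            SELECT t.id, s.depth + 1
            FROM taxon AS t JOIN subtree AS s ON t.parent_id = s.id
        )
        SELECT COUNT(*) FROM subtree WHERE depth > 0
        """,
        (name,),
    ).fetchone()
    return row[0]
```

With the toy data, `count_descendants(demo_connection(), "Felidae")` walks the whole subtree below Felidae. Doing the same against wiki text means first parsing every page to recover the parent-child links.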

These limitations are mostly due to the underlying software (MediaWiki), which fortunately can be extended to address these issues using Semantic MediaWiki. I've explored these ideas earlier. With some restructuring, Wikispecies could become a database, but it would require some serious work.

But this raises the real issue with Wikispecies, namely what is it for? Wikipedia is much more informative for many taxa, and the two wikis are very poorly linked (surely we'd want Wikipedia pages linked to the corresponding Wikispecies pages?). Given that Wikipedia is the basis for some core efforts in linked data (e.g., DBpedia), it seems a no-brainer that we would want our information stored in Wikipedia, rather than Wikispecies.

It seems to me that the split between Wikipedia and Wikispecies parallels that between "taxonomic concepts" and "taxonomic names". Wikipedia provides the former, in that it provides one (consensus) view of what a taxon is. Wikispecies would be ideally placed to be a nomenclatural database (and a great place to put all the synonyms that we've accumulated over time, but which would swamp Wikipedia). But Wikispecies also seems to want to provide a classification, which strikes me as unnecessary (and raises the issue of how this relates to the classification in Wikipedia).

I don't wish to denigrate the efforts of Wikispecies contributors (they are doing some neat things, such as harvesting new names from ZooKeys), and by clever use of templates they avoid some of the serious problems with classification in Wikipedia. But Wikispecies is not a taxonomic database, at least not yet.

Index Fungorum

I've added Index Fungorum to the list of RSS feeds that I generate at bioguid.info/rss. The feed uses the Index Fungorum web services to get the names added the previous day, and tries to extract any bibliographic identifiers from the metadata associated with each record (we get the metadata by resolving the LSID for the name). As with IPNI, bibliographic information in an Index Fungorum record lists the page the name was published on, which makes locating identifiers such as DOIs a bit of a struggle. Still, it's nice to have another feed of taxonomic names.