I've added Index Fungorum to the list of RSS feeds that I generate at bioguid.info/rss. The feed uses the Index Fungorum web services to get the names added the previous day, and tries to extract any bibliographic identifiers from the metadata associated with each record (we get the metadata by resolving the LSID for the name). As with IPNI, bibliographic information in an Index Fungorum record lists the page the name was published on, which makes locating identifiers such as DOIs a bit of a struggle. Still, it's nice to have another feed of taxonomic names.
Index Fungorum
I've added Index Fungorum to the list of RSS feeds that I generate at bioguid.info/rss. The feed uses the Index Fungorum web services to get the names added the previous day, and tries to extract any bibliographic identifiers from the metadata associated with each record (we get the metadata by resolving the LSID for the name). As with IPNI, bibliographic information in an Index Fungorum record lists the page the name was published on, which makes locating identifiers such as DOIs a bit of a struggle. Still, it's nice to have another feed of taxonomic names.
Labels:
Index Fungorum
,
RSS
Zotero: creating bibliographies in the cloud
Lately I've become more and more interested in moving data off my machine(s) and into the cloud. I'm keen to do this partly to avoid having data in one place (e.g., a machine at work) when I need it someplace else (e.g., at home), and there are great tools for doing this (such as the wonderful Dropbox).
As a developer, the cloud appeals, not so much because of the compute power that some are salivating over, but because it may free me from having to create my own software. For example, some time ago I have created an OpenURL resolver to help me find articles online. I harvest a bunch of sources, such as CrossRef, PubMed, some OPAI respositories, etc., but there's always times where I find a reference online that I'd like to add, and that reference doesn't have an identifier such as a DOI.
Typically I add these manually, or by importing a file. I could write some interface code to add (and edit) a bibliographic reference (and, indeed I did some time back), but wouldn't it be great if somebody else had done this for me?
Using a resource like Zotero saves me the hassle of having to write my own bibliographic editor, plus I benefit from using a tool that's a lot more polished than one I could make. Because of this, and my experience with the Google Spreadsheets API, I'm ultimately aiming to never have to write a user interface again. If I write services, and rely on third parties to make tools that can either generate services I can use, or consume my services, then my life becomes a lot simpler.
OK, perhaps I exaggerate. I like making interfaces, such as my eBio09 entry, or the experiment with SpaceTree. However, I can imagine a situation where I don't have to write a data entry interface ever again.
Labels:
bibliographies
,
cloud
,
data entry
,
interface
,
services
,
WebDAV
,
Zotero
How to publish a journal RSS feed
This morning I posted
this tweet:
In the spirit of being constructive, here are some dos and don'ts.
Do
1. Validate the RSS feed
Fun your feed through Feed Validator to make sure it's valid. Your feed is an XML document, if it won't validate then aggregators may struggle with it. Testing it in your favourite web browser isn't enough, but if a browser fails to display it this may be a clue that something's wrong. For example, Safari won't display the ZooKeys RSS feed, and at the time of writing this feed is not valid.
2. Make sure your feed is autodiscoverable

When I visit your web page my browser should tell me that there is a feed available (typically with a RSS icon in the location bar). If there's no such icon, then I have to look at your page to find the feed (if it exists). The Nuytsia page is an example of a non-discoverable feed. To make your feed autodiscoverable is easy, just add a link tag inside the head tag on the page. For example, something like this:
3. Use standard identifiers as the links
If your journal has DOIs, use those in the links, not the URL of the article web page. The later is likely to change (the DOI won't, unless you are being naughty), and given a DOI I can harvest the metadata via CrossRef.
4. Each item link in the feed links to ONE article
This was the reason for my grumpy tweet. The journal Nuytsia has a RSS feed (great!), but the links are not to individual articles. Instead, they are database queries that may generate one or more results. For example, this link RYE, B.L., (2009). Reinstatement of the Western Australian genus Oxymyrrhine (Myrtaceae : Chamelauci... actually lists two papers, both authored by B. L. Rye. This breaks the underlying model where the feed lists individual articles.
5. Include lots of metadata in your feed
If you don't use DOIs, then include metadata about your article in your feed. That way, I don't need to scrape your web pages, all I need is already in the feed.
6. Make it possible to harvest metadata about your articles
If you don't use DOIs are your article identifier, or use the DOIs are the item links in your RSS feed, then make it easy for me to get the bibliographic details from either the RSS feed, or from the web page. If you use RSS 1.0, then ideally you are using PRISM and I can get the metadata from that. If not, you can embed the metadata in the HTML page describing the article using Dublin Core and meta and link tags. For example, if you resolve this doi:10.1076/snfe.38.2.115.15923 and view the HTML source you will see this:
Not pretty, but it enables me to get the details I want.
7. Support conditional HTTP GET
If you don't want feed readers and aggregators to hammer your service, support HTTP conditional GET (see here for details) so that feed readers only grab your feed if it has changed. Not many journal publishers do this, if they get overloaded by people grabbing RSS feeds they've only themselves to blame.
Don'ts
1. Sign up/log in
Don't ever ask me to sign up or log in to get the RSS feed (Cambridge University Press, I'm looking at you). If you think your content is so good/precious that I should sign up for it, you are sadly mistaken. Nature doesn't ask me to login, nor should you.
2. Break DOIs
Another major cause of grumpiness is the frequency with which DOIs break, especially for recently published articles (i.e., precisely those that will be encountered in RSS feeds). There is quite simply no excuse for this. If your workflow results in DOIs being put on web pages before they are registered with CrossRef, then you (or CrossRef) are incompetent.
this tweet:
Harvesting Nuytsia RSS http://science.dec.wa.gov.a... Non-trivial as links are not to individual articles #failMy grumpiness (on this occasion, seems lots of things seem to make me grumpy lately) is that often journal RSS feeds leave a lot to be desired. As RSS feeds are a major source of biodiversity information (for a great example of their use see uBio's RSS, described in doi:10.1093/bioinformatics/btm109) it would be helpful if publishers did a few basic things. Some of these suggestions are in Lisa Roger's RSS and Scholarly Journal Tables of Contents: the ticTOCs Project, and Good Practice Guidelines for Publishers, but some aren't.
In the spirit of being constructive, here are some dos and don'ts.
Do
1. Validate the RSS feed
Fun your feed through Feed Validator to make sure it's valid. Your feed is an XML document, if it won't validate then aggregators may struggle with it. Testing it in your favourite web browser isn't enough, but if a browser fails to display it this may be a clue that something's wrong. For example, Safari won't display the ZooKeys RSS feed, and at the time of writing this feed is not valid.
2. Make sure your feed is autodiscoverable
When I visit your web page my browser should tell me that there is a feed available (typically with a RSS icon in the location bar). If there's no such icon, then I have to look at your page to find the feed (if it exists). The Nuytsia page is an example of a non-discoverable feed. To make your feed autodiscoverable is easy, just add a link tag inside the head tag on the page. For example, something like this:
<link rel="alternate" type="application/rss+xml"
title="RSS Feed for Nuytsia"
href="http://science.dec.wa.gov.au/nuytsia/nuytsia.rss.xml" />
3. Use standard identifiers as the links
If your journal has DOIs, use those in the links, not the URL of the article web page. The later is likely to change (the DOI won't, unless you are being naughty), and given a DOI I can harvest the metadata via CrossRef.
4. Each item link in the feed links to ONE article
This was the reason for my grumpy tweet. The journal Nuytsia has a RSS feed (great!), but the links are not to individual articles. Instead, they are database queries that may generate one or more results. For example, this link RYE, B.L., (2009). Reinstatement of the Western Australian genus Oxymyrrhine (Myrtaceae : Chamelauci... actually lists two papers, both authored by B. L. Rye. This breaks the underlying model where the feed lists individual articles.
5. Include lots of metadata in your feed
If you don't use DOIs, then include metadata about your article in your feed. That way, I don't need to scrape your web pages, all I need is already in the feed.
6. Make it possible to harvest metadata about your articles
If you don't use DOIs are your article identifier, or use the DOIs are the item links in your RSS feed, then make it easy for me to get the bibliographic details from either the RSS feed, or from the web page. If you use RSS 1.0, then ideally you are using PRISM and I can get the metadata from that. If not, you can embed the metadata in the HTML page describing the article using Dublin Core and meta and link tags. For example, if you resolve this doi:10.1076/snfe.38.2.115.15923 and view the HTML source you will see this:
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />
<meta http-equiv="Content-Language" content="en-gb" />
<link rel="shortcut icon" href="/mpp/favicon.ico" />
<meta name="verify-v1" content="xKhof/of+uTbjR1pAOMT0/eOFPxG8QxB8VTJ07qNY8w=" />
<meta name="DC.publisher" content="Taylor & Francis" />
<meta name="DC.identifier" content="info:doi/10.1076/snfe.38.2.115.15923" />
<meta name="description" content="In this study we determined the effects of topography on the distribution of ground-dwelling ants in a primary terra-firme forest near Manaus, in cent..." />
<meta name="authors" content="Heraldo L. Vasconcelos ,Antônio C. C. Macedo,José M. S. Vilhena" />
<meta name="DC.creator" content="Heraldo L. Vasconcelos" />
<meta name="DC.creator" content="Antônio C. C. Macedo" />
<meta name="DC.creator" content="José M. S. Vilhena" />
Not pretty, but it enables me to get the details I want.
7. Support conditional HTTP GET
If you don't want feed readers and aggregators to hammer your service, support HTTP conditional GET (see here for details) so that feed readers only grab your feed if it has changed. Not many journal publishers do this, if they get overloaded by people grabbing RSS feeds they've only themselves to blame.
Don'ts
1. Sign up/log in
Don't ever ask me to sign up or log in to get the RSS feed (Cambridge University Press, I'm looking at you). If you think your content is so good/precious that I should sign up for it, you are sadly mistaken. Nature doesn't ask me to login, nor should you.
2. Break DOIs
Another major cause of grumpiness is the frequency with which DOIs break, especially for recently published articles (i.e., precisely those that will be encountered in RSS feeds). There is quite simply no excuse for this. If your workflow results in DOIs being put on web pages before they are registered with CrossRef, then you (or CrossRef) are incompetent.
Labels:
aggregation
,
DOI
,
Dublin Core
,
PRISM
,
publication
,
RSS
Subscribe to:
Posts
(
Atom
)