
Wikipedia History Flow tool now in GitHub

Inspired by a comment on my post "Visualising edit history of a Wikipedia page", I've put the code I use to make history flow diagrams like the one below on GitHub at https://github.com/rdmpage/wikihistoryflow.

[Image: history flow diagram]

There is also a live version at http://iphylo.org/~rpage/wikihistoryflow. If you enter the name of a Wikipedia page the tool will display the edit history with columns representing page versions and individual contributors (people and bots) distinguished by different colours.

This tool will fall over for pages with a lengthy edit history, and it requires a web browser that supports SVG, but it's a fun visualisation and may inspire someone to do this properly.

Apache mod_rewrite and question marks "?"

Quick note to self in case I (inevitably) forget later. If you are using Apache mod_rewrite to make nice, clean URLs, and are also supporting JSONP, you may run into the situation where you have code that wants to append "?callback=xxx" to your URL (e.g., a cross-domain AJAX call in jQuery).

Imagine you have a nice clean URL /user/123, which actually corresponds to user.php?id=123. If you append ?callback=xxx to the URL then chances are the code will break, because the callback does not survive the rewrite in a usable form: you may end up with something like user.php?id=123?callback=xxx, or with the callback missing altogether, since by default a rewrite target that contains its own query string replaces the query string of the incoming request. What you actually want to send to your web server is user.php?id=123&callback=xxx (note the & before "callback").

After much grief trying to figure out how to coerce Apache mod_rewrite into handling this situation I found the answer, of course, on Stack Overflow. If you use the [QSA] ("query string append") flag on the rewrite rule, Apache will append the additional callback parameter onto the end of the rewritten URL, so JSONP will now work. Once again, Stack Overflow turned a show-stopper into a learning experience.
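To make the effect of the flag concrete, here is a toy Python model of a single rewrite rule. It is only a sketch of the query string handling as I understand it from the Apache documentation, not Apache's actual implementation. The function name rewrite and the hard-coded /user/ prefix are made up for illustration, and the rule shown in the comment is an assumed example of what the real configuration might look like.

```python
from urllib.parse import urlsplit

# Assumed example of the kind of rule being modelled:
#   RewriteRule ^user/([0-9]+)$ user.php?id=$1 [QSA,L]


def rewrite(request_url, qsa):
    """Toy model of rewriting /user/<id> to user.php?id=<id>.

    qsa=True mimics the [QSA] flag: the incoming query string is
    appended to the rewritten one with "&". qsa=False mimics the
    default for a target that has its own query string, where the
    incoming query string is discarded.
    """
    parts = urlsplit(request_url)
    prefix = "/user/"
    if not parts.path.startswith(prefix):
        return request_url  # rule does not apply, leave URL alone
    user_id = parts.path[len(prefix):]
    target = "user.php?id=" + user_id
    if qsa and parts.query:
        target += "&" + parts.query
    return target


if __name__ == "__main__":
    print(rewrite("/user/123?callback=xxx", qsa=False))
    print(rewrite("/user/123?callback=xxx", qsa=True))
```

With qsa=True the callback parameter ends up after an "&", which is the form the PHP script needs in order to see both id and callback.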

Adding article-level metadata to BHL

Recently I've been thinking about the best ways to make article-level metadata from BioStor more widely available. For example, for someone visiting the BHL site there is no easy way to find articles, which are the basic unit for much of the scientific literature. How hard would it be to add articles to BHL? In the past I've wanted an all-singing, all-dancing article-level interface to BHL content (sort of BioStor on steroids), but that's a way off, and ideally would have a broader scope than BHL. So instead I've been thinking of ways to add articles to BHL without requiring a lot of re-engineering of BHL itself.

Looking at other digital archive projects like Gallica and Google Books, it strikes me that if the BHL interface to a scanned item had a "Contents" drop-down menu then users would be able to go to individual articles very easily. Below is a screenshot of how Gallica does this (see http://gallica.bnf.fr/ark:/12148/bpt6k61331684/f57).

[Image: screenshot of the Gallica contents menu]

There's also a screenshot of something similar in Google Books (see http://books.google.co.uk/books?id=PkvoRnAM6WUC).

[Image: screenshot of the Google Books contents menu]

The idea would be that if BioStor had found articles within a scanned item, they would be listed in the contents menu (title, author, starting page), and if the user clicked on the article title then the BHL viewer would jump to that page. If there were no known articles, but the scanned item had a table of contents flagged (e.g., http://www.biodiversitylibrary.org/item/25703) then the menu could function as a button that takes you to that page. If there are no articles or contents, then the menu could be grayed out, or simply not displayed. This way the interface would work for books, monographs, and journal volumes.
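The three cases above could be sketched roughly as follows. This is only an illustration of the decision logic, not anything BHL or BioStor actually implements. The function contents_menu, the field names (articles, start_page, toc_page) and the sample records are all hypothetical.

```python
def contents_menu(item):
    """Decide what the Contents menu shows for one scanned item.

    `item` is a hypothetical record: an optional list of articles
    found by BioStor, and an optional page flagged as the table of
    contents.
    """
    articles = item.get("articles") or []
    if articles:
        # Known articles: list them in page order (title, author, start page)
        ordered = sorted(articles, key=lambda a: a["start_page"])
        return {"state": "menu", "entries": ordered}
    toc_page = item.get("toc_page")
    if toc_page is not None:
        # No articles, but a flagged contents page: act as a button
        return {"state": "button", "page": toc_page}
    # Nothing to show: gray the menu out (or hide it)
    return {"state": "disabled"}


# Made-up dump of articles indexed by scanned item identifier
dump = {
    "item-1": {"articles": [
        {"title": "Example article B", "author": "B. Author", "start_page": 57},
        {"title": "Example article A", "author": "A. Author", "start_page": 3},
    ]},
    "item-2": {"toc_page": 9},
    "item-3": {},
}

if __name__ == "__main__":
    for item_id, item in dump.items():
        print(item_id, contents_menu(item)["state"])
```

Keying the dump by scanned item means the viewer needs only one lookup per item to decide which of the three states to show.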

Now, admittedly this is not the most elegant interface, and it treats articles as fragments of books rather than individual units, but it would be a start. It would also require minimal effort both on the part of BHL (who would need to add the contents button) and myself (it would be easy to create a dump of the article titles indexed by scanned item).