
BioStor in the cloud

Quick note on an experimental version of BioStor that is (mostly) hosted in the cloud. BioStor currently runs on a Mac Mini and uses MySQL as the database. For a number of reasons (it's running on a Mac Mini and my knowledge of optimising MySQL is limited) BioStor is struggling a bit. It's also gathered a lot of cruft as I've worked on ways to map article citations to the rather messy metadata in BHL.

So, I've started to play with a version that runs in the cloud using my favourite database, CouchDB. The data is hosted by Cloudant, which now provides full text search powered by Lucene. Essentially, I simply take article-level metadata from BioStor in BibJSON format and push that to Cloudant. I then wrote a simple wrapper around querying CouchDB, coupled that with the DocumentCloud Viewer to display articles and citeproc-js to format the citations (not exactly fun, but someone is bound to ask for them), and we have a simple, searchable database of literature.
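As a rough illustration of the querying side, here is a minimal Python sketch that builds the URL for a Cloudant full text search request. The path pattern (`_design/<design doc>/_search/<index>`) is Cloudant's search endpoint, but the account, database, design document and index names below are hypothetical placeholders, not necessarily the ones BioStor uses, and the sketch only builds the URL rather than sending the request.

```python
from urllib.parse import quote, urlencode


def cloudant_search_url(account, database, design_doc, index, query, limit=10):
    """Build the URL for a Cloudant full text (Lucene) search request.

    All names passed in are placeholders chosen by the caller; nothing here
    contacts the network.
    """
    # Percent-encode each path segment so odd characters cannot break the URL.
    path = "/".join(
        quote(segment, safe="")
        for segment in (database, "_design", design_doc, "_search", index)
    )
    # "q" holds the Lucene query string, "limit" caps the number of hits.
    params = urlencode({"q": query, "limit": limit})
    return f"https://{account}.cloudant.com/{path}?{params}"


# Hypothetical example: search a "title" field for "frogs".
url = cloudant_search_url(
    "example", "biostor", "search", "articles", "title:frogs", limit=5
)
print(url)
```

Fetching that URL with any HTTP client should return JSON listing the matching documents, which a thin wrapper can then pass on to the viewer and the citation formatter.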

If you want to try the cloud-based version go to http://biostor-cloud.pagodabox.com/ (code on Github).


I've been wanting to do this for a while, partly because this is how I will implement my entry in EOL's computational data challenge, but also because CrossRef's Metadata search shows the power of finding references simply by using full text search (I've shamelessly borrowed some of the interface styling from Karl Ward's code). David Shorthouse demonstrates what you can do using CrossRef's tool in his post Conference Tweets in the Age of Information Overconsumption. Given how much time I spend trying to parse taxonomic citations and match them to articles in CrossRef's database, or BioStor, I'm looking forward to making this easier.

There are two major limitations of this cloud version of BioStor (apart from the fact that it has only a subset of the articles in BioStor). The first is that the page images are still being served from my Mac Mini, so they can be a bit slow to load. I've put the metadata and the search engine in the cloud, but not the images (we're talking a terabyte or two of bitmaps).

The other limitation is that there's no API. I hope to address this shortly, perhaps mimicking the CrossRef API so if one has code that talks to CrossRef it could just as easily talk to BioStor.

Species wait 21 years to be described - show me the data

Benoît Fontaine et al. recently published a study concluding that the average lag time between a species being discovered and subsequently described is 21 years.

Fontaine, B., Perrard, A., & Bouchet, P. (2012). 21 years of shelf life between discovery and description of new species. Current Biology, 22(22), R943–R944. doi:10.1016/j.cub.2012.10.029

The paper concludes:

With a biodiversity crisis that predicts massive extinctions and a shelf life that will continue to reach several decades, taxonomists will increasingly be describing from museum collections species that are already extinct in the wild, just as astronomers observe stars that vanished thousands of years ago.

This is a conclusion that merits more investigation, especially as the title of the paper suggests there is an appalling lack of efficiency (or resources) in the way we describe biodiversity. So, with interest I looked at the Supplemental Information for the data.

I was hoping to see the list of the 600 species chosen at random, the publication containing their original description, and the date of their first collection. Instead, all we have is a description of the methods for data collection and analysis. Where is the data? Without the data I have no way of exploring the conclusions or asking additional questions. For example, what is the distribution of specimen collection dates in each species? One could imagine situations where a number of specimens are recently collected, prompting recognition and description of a new species, and as part of that process rummaging through the collections turns up older, unrecognised members of that species. Indeed, if it takes a certain number of specimens to describe a species (people tend to frown upon descriptions based on single specimens) perhaps what we are seeing is the outcome of a sampling process where specimens of new species are rare, they take a while to accumulate in collections, and the distribution of collection dates will have a long tail.

These are the sort of questions we could ask if we had the data, but the authors don't provide it. The worrying thing is that we are seeing a number of high-visibility papers that potentially have major implications for how we view the field of taxonomy but which don't publish their data. Another recent example is:

Joppa, L. N., Roberts, D. L., & Pimm, S. L. (2011). The population ecology and social behaviour of taxonomists. Trends in Ecology & Evolution, 26(11), 551–553. doi:10.1016/j.tree.2011.07.010

Biodiversity is a big data science; it's time we insisted on that data being made available.

Classification of Accounts - Hints for Journalizing - Advantages of Journal

Personal Accounts

Accounts recording transactions relating to individuals, firms or companies are known as personal accounts. Personal accounts may be further classified as:

(1) Natural person's personal accounts: The accounts recording transactions relating to individual human beings e.g., Anand's A/c, Remesh's A/c, Pankaj's A/c are classified as natural person's personal accounts.

(2) Artificial persons' personal accounts: The accounts recording transactions relating to limited companies, banks, firms, institutions, clubs, etc., e.g. Delhi Cloth Mill, Hans Raj College, Gymkhana Club, are classified as artificial persons' personal accounts.

(3) Representative personal accounts: The accounts recording transactions relating to expenses and incomes are normally classified as nominal accounts. In certain cases, however, the matching concept of accounting means that on a particular date an amount is payable to, or recoverable from, individuals.

Such an amount (a) relates to a particular head of expenditure or income and (b) represents the persons to whom it is payable or from whom it is recoverable. Such accounts are classified as representative personal accounts, e.g. "Wages Outstanding Account", "Pre-paid Insurance Account", etc.

Real Accounts

The accounts recording transactions relating to tangible things (which can be touched, purchased and sold) such as goods, cash, buildings, machinery, etc., are classified as tangible real accounts.

The accounts recording transactions relating to intangible things (which do not have physical shape) such as goodwill, patents, copyrights, trade marks, etc., are classified as intangible real accounts.

Nominal Accounts

The accounts recording transactions relating to losses, gains, expenses and incomes, e.g. rent, salaries, wages, commission, interest, bad debts, etc., are classified as nominal accounts. As already discussed, wherever a nominal account represents the amount payable to or receivable from certain persons, it is known as a representative personal account.

Rules of Debit and Credit (classification based)

1. Personal Accounts: Debit the receiver, Credit the giver (supplier)

2. Real Accounts: Debit what comes in, Credit what goes out

3. Nominal Accounts: Debit expenses and losses, Credit incomes and gains.
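The three rules above can be written down as a small lookup table. The following Python sketch is only an illustration: the event labels (`receiver`, `comes_in`, `expense` and so on) are hypothetical names invented here, not standard accounting terminology.

```python
from enum import Enum


class AccountType(Enum):
    PERSONAL = "personal"
    REAL = "real"
    NOMINAL = "nominal"


# For each class of account: which events are debited and which are credited.
# The event labels are hypothetical names made up for this sketch.
RULES = {
    AccountType.PERSONAL: {"debit": {"receiver"}, "credit": {"giver"}},
    AccountType.REAL: {"debit": {"comes_in"}, "credit": {"goes_out"}},
    AccountType.NOMINAL: {"debit": {"expense", "loss"}, "credit": {"income", "gain"}},
}


def side(account_type, event):
    """Return "Dr" or "Cr" for an event affecting an account of the given class."""
    rule = RULES[account_type]
    if event in rule["debit"]:
        return "Dr"
    if event in rule["credit"]:
        return "Cr"
    raise ValueError(f"{event!r} does not apply to {account_type.value} accounts")


# Machinery bought for cash: machinery comes in, cash goes out (both real accounts).
print("Machinery account:", side(AccountType.REAL, "comes_in"))
print("Cash account:", side(AccountType.REAL, "goes_out"))
```

Asking for an event that does not belong to a class (for example an "expense" on a personal account) raises an error, which mirrors the point that each rule applies only to its own class of account.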

Hints for Journalizing

The following discussion will help in diagnosing the transaction with a view to find out which accounts are relevant for passing the journal entry.

1. Treatment of cash/credit transaction.

Read carefully the following transactions:

(i) Purchased goods for Rs. 1,200 cash.
(ii) Purchased goods for Rs. 1,200.
(iii) Purchased goods for Rs. 1,200 from Arun.
(iv) Purchased goods for Rs. 1,200 from Arun on cash.

Transactions (i) and (iv) are clear, as it has been specifically stated that the purchases were made for cash. Thus the entry is:

Purchases account Dr. 1,200
    To Cash account 1,200

Transactions (ii) and (iii) do not specify whether the purchases are for cash or on credit. However, transaction (ii) does not mention the name of any supplier, which implies that the purchases are for cash. Transaction (iii), by contrast, names the supplier but is silent regarding cash, which implies that the purchases are on credit. Thus the entry for transaction (iii) is:

Purchases account Dr. 1,200
    To Arun 1,200
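The reasoning for the four purchase transactions can be sketched as a short decision function. This is a minimal Python illustration under stated assumptions: the function name, its parameters and the tuple it returns are hypothetical, chosen only to mirror the text.

```python
def purchase_entry(amount, supplier=None, cash_stated=False):
    """Return (debit_account, credit_account, amount) for a purchase of goods.

    - Cash explicitly stated, with or without a supplier: credit Cash account.
    - No supplier named: treated as a cash purchase.
    - Supplier named and nothing said about cash: treated as a credit purchase,
      so the supplier's personal account is credited.
    """
    if cash_stated or supplier is None:
        credit = "Cash account"
    else:
        credit = supplier
    return ("Purchases account", credit, amount)


# The four transactions from the text.
transactions = {
    "(i)": purchase_entry(1200, cash_stated=True),
    "(ii)": purchase_entry(1200),
    "(iii)": purchase_entry(1200, supplier="Arun"),
    "(iv)": purchase_entry(1200, supplier="Arun", cash_stated=True),
}
for label, (debit, credit, amount) in transactions.items():
    print(f"{label} {debit} Dr. {amount:,} To {credit} {amount:,}")
```

Only transaction (iii) ends up crediting the supplier; the other three credit the Cash account.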

2. Treatment of payment on personal/expenses account.

When payment is made to a person against the amount due to him as per his ledger account, the personal account of the creditor should be debited. However, if the payment made to a person represents business expenditure, then the particular expenditure (nominal) account should be debited.

3. Treatment of receipt on personal/income account.

When an amount is received from a person against the amount recoverable from him as per his ledger account, the personal account of the debtor should be credited. However, if the amount received represents business income, then the particular income (nominal) account should be credited.
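Hints 2 and 3 follow the same pattern and can be sketched together. In this minimal Python illustration the function names and parameters are hypothetical, and it assumes that the other side of each entry is the Cash account, which the text does not state explicitly (a payment or receipt could equally pass through a bank account).

```python
def payment_entry(person, amount, expense_head=None):
    """Return (debit_account, credit_account, amount) for a payment.

    If the payment represents business expenditure, debit that nominal
    account; otherwise debit the creditor's personal account.
    Assumption: the payment is made in cash.
    """
    debit = expense_head if expense_head else person
    return (debit, "Cash account", amount)


def receipt_entry(person, amount, income_head=None):
    """Return (debit_account, credit_account, amount) for a receipt.

    If the receipt represents business income, credit that nominal account;
    otherwise credit the debtor's personal account.
    Assumption: the amount is received in cash.
    """
    credit = income_head if income_head else person
    return ("Cash account", credit, amount)


# Illustrative figures only.
print(payment_entry("Arun", 1200))                               # settles Arun's ledger balance
print(payment_entry("Landlord", 500, expense_head="Rent account"))
print(receipt_entry("Pankaj", 800))                              # collects Pankaj's ledger balance
print(receipt_entry("Pankaj", 100, income_head="Commission account"))
```

In both functions the person's name only reaches the entry when no expense or income head is given, which is the distinction the two hints draw.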