xBio:D Roadmap

The Database

The core of the database is the information found on specimen labels: the place of collection, the time of collection, who did the collecting, how the specimen was collected, and the identification of the specimen. All of this is linked together by the unique identifier assigned to each specimen (the collecting unit ID). That ID in turn links to information on where the specimen is deposited and to any images (or other media) of the specimen. The database also stores information on the published literature.
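As a rough illustration of how these pieces hang together, the sketch below models the core linkage as Python data structures. The class and field names are illustrative assumptions, not the actual database schema.

    # Illustrative data model only; names are assumptions, not the real schema.
    from dataclasses import dataclass, field
    from typing import List, Optional


    @dataclass
    class CollectingEvent:
        locality: str      # place of collection
        date: str          # time of collection
        collector: str     # who did the collecting
        method: str        # how the specimen was collected


    @dataclass
    class Specimen:
        collecting_unit_id: str                 # the unique ID that ties everything together
        event: CollectingEvent
        identification: Optional[str] = None    # current determination
        depository: Optional[str] = None        # where the specimen is deposited
        media: List[str] = field(default_factory=list)       # pointers to images or other media
        literature: List[str] = field(default_factory=list)  # linked publications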


IPT

The data in the database are publicly available through our own portals. In addition, the data are meant to be regularly harvested and cached by data aggregators, including the Global Biodiversity Information Facility (GBIF), iDigBio, and the SCAN network. These aggregators connect to resources made available with the Integrated Publishing Toolkit (IPT), a Java program produced by GBIF. At regular weekly intervals the database produces a Darwin Core (DwC) file containing the information we are sharing; each resource we make available has its own DwC file. We have, or intend to have, a couple dozen such resources.
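The export step can be pictured as writing a flat file of Darwin Core terms, one per resource. The Python sketch below is a minimal illustration: the field list is abbreviated, and the file name and sample record are made up for the example.

    # Abbreviated Darwin Core field list; the file name and sample record are
    # illustrative only.
    import csv

    DWC_FIELDS = [
        "occurrenceID", "catalogNumber", "scientificName",
        "eventDate", "recordedBy", "country", "locality",
    ]

    def write_dwc_file(path, occurrences):
        """Write one resource's records to a Darwin Core CSV for the IPT to serve."""
        with open(path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=DWC_FIELDS, extrasaction="ignore")
            writer.writeheader()
            for row in occurrences:
                writer.writerow(row)

    # Example with a single hand-written record:
    write_dwc_file("example_resource.csv", [{
        "occurrenceID": "urn:example:12345",
        "scientificName": "Pelecinus polyturator",
        "eventDate": "2018-06-01",
        "recordedBy": "A. Collector",
    }])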


Specimage

This app is intended to be pronounced "spess-ee-maj", a mashup of the words "specimen" and "image." Fundamentally, it is simply an image management system. It differs from similar commercially available programs in that the specimen seen in each image is linked to its collecting unit ID. That ID then provides access to all of the information in the core database associated with that specimen. Specimage also has an upload function for adding new images. During upload a thumbnail and a web-friendly JPG version of the original image are produced, and the user specifies the license under which the image may be distributed. The core database contains only pointers to the locations of the actual images.
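The derivative-generation step during upload might look something like the Pillow-based sketch below; the target sizes and JPEG quality are assumptions, not Specimage's actual settings.

    # Sizes and JPEG quality here are assumptions, not Specimage's actual settings.
    from PIL import Image

    def make_derivatives(original_path, thumb_path, web_path,
                         thumb_size=(128, 128), web_size=(1024, 1024)):
        """Produce a thumbnail and a web-friendly JPEG from the original image."""
        with Image.open(original_path) as img:
            img = img.convert("RGB")     # ensure a JPEG-compatible mode
            web = img.copy()
            web.thumbnail(web_size)      # resizes in place, preserving aspect ratio
            web.save(web_path, "JPEG", quality=85)
            thumb = img.copy()
            thumb.thumbnail(thumb_size)
            thumb.save(thumb_path, "JPEG", quality=85)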


HOL

HOL - Hymenoptera On Line - is intended as a generic portal to the data we have. Text entered into the search box is interpreted in as many ways as possible: as a specimen ID, as the name of an organism, as a place name, as a person's name, etc. The results of these interpretations are presented as a series of tabs, and within each tab are expandable sections for predefined categories of information. Wildcards (% and _) are accepted in the text input. Most of the information is live, i.e., directly extracted from the database and therefore as current as possible. Some summary information, however, is collated weekly and so may be slightly out of date.
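A rough sketch of this fan-out idea is shown below, with hypothetical table and column names and a DB-API connection that accepts ?-style placeholders (e.g., sqlite3); the % and _ wildcards pass straight through to SQL LIKE.

    # Hypothetical fan-out of a single search term; table and column names are
    # assumptions, and `db` is any DB-API connection accepting ?-style placeholders.
    def search_all(term, db):
        """Return results for each interpretation of the term, one entry per HOL tab."""
        like = term if ("%" in term or "_" in term) else term + "%"
        return {
            "specimens": db.execute(
                "SELECT * FROM specimens WHERE collecting_unit_id LIKE ?", (like,)).fetchall(),
            "taxa": db.execute(
                "SELECT * FROM taxon_names WHERE name LIKE ?", (like,)).fetchall(),
            "places": db.execute(
                "SELECT * FROM localities WHERE name LIKE ?", (like,)).fetchall(),
            "people": db.execute(
                "SELECT * FROM agents WHERE last_name LIKE ?", (like,)).fetchall(),
        }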

OJ_Break API

OJ_Break is the name of the API used to interact with many (ultimately all!) of the web-based data portals. Its output consists of JSON objects that are then parsed and formatted for display by JavaScript code in the web page.
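A generic client call, shown here in Python rather than the in-page JavaScript, might look like the sketch below; the base URL, method name, and parameter are placeholders, not documented OJ_Break endpoints.

    # Placeholder client; the base URL, method name, and parameter names are
    # assumptions, not documented OJ_Break endpoints.
    import json
    import urllib.parse
    import urllib.request

    def call_api(base_url, method, **params):
        """Call an API method and return the parsed JSON object."""
        url = f"{base_url}/{method}?{urllib.parse.urlencode(params)}"
        with urllib.request.urlopen(url) as resp:
            return json.loads(resp.read().decode("utf-8"))

    # e.g. data = call_api("https://example.org/oj_break", "getSpecimenInfo", id="12345")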

bioguid.osu.edu

Biodiversity Information Standards (TDWG) has set up and maintains a vocabulary for the basic kinds of information that we share. In some of the data portals (e.g., HNS) we have the option of delivering the information in RDF. The domain bioguid.osu.edu is intended as a resolution mechanism for the (hopefully) globally unique identifiers that we support. In its original formulation TDWG recommended the use of Life Science Identifiers (LSIDs) as the format for these identifiers. The community has now generally abandoned that format and instead opted for stable URLs. The resolver software should be able to handle both formats. Unfortunately, bioguid.osu.edu presently seems to be offline.
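The dual-format idea can be sketched as follows. The LSID branch follows the standard urn:lsid:authority:namespace:object pattern; the target URL patterns are made-up placeholders, not the resolver's actual behavior.

    # The LSID branch follows the urn:lsid:<authority>:<namespace>:<object> pattern;
    # the target URL patterns are made-up placeholders.
    def resolve(identifier):
        """Map an incoming identifier to the URL of the record it denotes."""
        if identifier.lower().startswith("urn:lsid:"):
            parts = identifier.split(":")
            namespace, obj = parts[3], parts[4]
            return f"https://example.org/{namespace}/{obj}"
        # Otherwise treat the input as a stable URL (or a path fragment of one).
        return identifier if identifier.startswith("http") else f"https://example.org/{identifier}"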

HNS

HNS is the Hymenoptera Name Server. Its function is to provide basic information associated with a taxonomic name. The code for this portal is actually compiled and stored within the database itself; it does not make use of the OJ_Break API.
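In practice that means the portal calls a stored procedure rather than application code. A hedged sketch, with a hypothetical procedure name and a DB-API driver that supports callproc:

    # Hypothetical lookup via a stored procedure; the procedure name is made up,
    # and callproc requires a driver that supports it (e.g., pymysql, psycopg2).
    def lookup_name(conn, name):
        """Fetch basic information for a taxonomic name from a stored procedure."""
        cur = conn.cursor()
        cur.callproc("get_name_info", [name])
        return cur.fetchall()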