1.3. First step towards databasing collections (USE)

I am interested in how the catalogue could help institutions move towards richer digital access. Some years ago, many collections at least offered online lists (not structured data) of their type materials. I haven’t seen such lists in recent years. Some of those collections will instead have published Darwin Core datasets, but I think some collections may now be less discoverable than they were when the Internet was less crowded.

GBIF (along with other data aggregators) offers the option of publishing four increasingly rich classes of dataset:

  • Resource metadata: Information describing a resource (e.g. a dataset), whether or not it is digitised in a machine-readable form - this can be a useful discovery tool. Collection records based on the TDWG Collection Descriptions could be the preferred data model for this kind of advertising role.
  • Checklist dataset: These can take many forms, but allow a list of species (or other taxa) to be shared, along with metadata describing the list. Additional Darwin Core or other terms may also be included for each species. Institutions that have lists of types or of species within their collections could easily use this model to showcase their holdings - the species or specimens in these lists could be handled as very-low-information Darwin Core records and become discoverable to researchers.
  • Occurrence dataset: Most specimen data today is shared in this form, which supports rich information on each specimen held.
  • Sampling-event dataset: This is rarely used by collections today, but would be a valuable extension wherever specimens or other materials were collected in a standardised way.
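
As a rough illustration of the checklist option above, the sketch below turns a simple list of type holdings into very-low-information Darwin Core style rows. The column names (taxonID, scientificName, taxonRank, typeStatus) are Darwin Core terms, but everything else is an assumption made for illustration: the holdings, the "example-coll" identifier prefix and the helper function names are hypothetical, and a plain CSV like this is not a complete, validated Darwin Core Archive.

```python
import csv
import io

# Hypothetical holdings, e.g. transcribed from an old online list of types.
HOLDINGS = [
    {"scientificName": "Aus bus Smith, 1900", "typeStatus": "holotype"},
    {"scientificName": "Aus cus Jones, 1950", "typeStatus": "paratype"},
]

# Darwin Core terms used as column headers in this sketch.
DWC_FIELDS = ["taxonID", "scientificName", "taxonRank", "typeStatus"]


def to_checklist_rows(holdings, id_prefix="example-coll"):
    """Build one minimal row per listed name (id_prefix is an assumption)."""
    rows = []
    for n, item in enumerate(holdings, start=1):
        rows.append({
            "taxonID": f"{id_prefix}:{n}",  # local identifier, unique in the list
            "scientificName": item["scientificName"],
            "taxonRank": item.get("taxonRank", "species"),
            "typeStatus": item.get("typeStatus", ""),
        })
    return rows


def write_checklist(rows, handle):
    """Write the rows as CSV with a header line of Darwin Core terms."""
    writer = csv.DictWriter(handle, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


rows = to_checklist_rows(HOLDINGS)
buffer = io.StringIO()
write_checklist(rows, buffer)
print(buffer.getvalue())
```

Even a table this sparse gives each name a stable identifier and a machine-readable form, which is the step up from an unstructured web page that the checklist class is meant to offer.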

Each of these four dataset types could be associated with a TDWG Collection Descriptions record. It would be wonderful if collections could be helped to use these levels as stepping-stones from being digitally undiscoverable to being fully digitised. Visibility and value increase at each step.