Document: 10 recommendations from DiSSCo

Many thanks to the DiSSCo team for putting this together. I fully agree with most of these recommendations, but here are some comments on recommendation 1 (scope of the catalogue).

I’d like us to think of nested sets of requirements and how we support each of these sets. We may end up with several hierarchically arranged catalogues, each with a purpose of its own, where the more general catalogues cannot properly support the needs of the more specialised ones. What I still don’t know (and what I hope we can clarify) is which of these can be treated as “the same catalogue” and which will need to be kept separate:

  1. A catalogue of all scientific collections. This is why GRSciColl was created. GBIF hosts this now and expects it to remain open for collections that are not related to biology or earth sciences. Whatever else we do, we need a mechanism to identify and reference any collection.
  2. A catalogue of all preserved biological collections. This is a major driver for GBIF and many of the other stakeholders in this consultation. We have excellent use cases for this and we need to make sure we meet these needs. This needs to support and normalise much more detailed information than the catalogue of all scientific collections. The biological (species-oriented) focus is important for many of its uses. My main question is whether this can be built to include both of the next two classes of collection without weakening its effectiveness for these uses.
  3. A catalogue of all living biological collections. Many of the uses of such a catalogue, and much of the content, will overlap with the catalogue of preserved biological collections. I’d like to know if there are any downsides we need to address before merging the two as a single catalogue.
  4. A catalogue of all geoscience collections. These share a number of features with preserved biological collections, and many institutions handle these resources as part of the same institutional collection. We need to think about how to meet these needs, but should we do so as a single catalogue or as separate catalogues (which could still be populated through a common pipeline based e.g. on a modular TDWG Collection Descriptions (CD) document format)? Most importantly, what are the use cases for a search or other data access that returns a mixture of preserved biological and geoscience collections? And would those use cases not already be met by the catalogue of all scientific collections?

I think we should build all of these catalogues using the same infrastructure and tools and with modularised information in TDWG CD format. The issue is whether we need to brand these as separate catalogues with separate interfaces for different purposes. I am inclined to think we need to keep at least three separate: 1) the GRSciColl catalogue of all scientific collections, 2) a GBIF catalogue of all biodiversity collections, and 3) a geoscience collection catalogue (in partnership with geosamples.org?). All three would still be built on a single information resource. In rough terms, GRSciColl would be the access point for all records, focussed on search through the generic sections of CD, and the other two would each be drawn from the collections that provide modular CD data for the given type of collection.
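
To make the "one information resource, several catalogues" idea a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the `CollectionRecord` class, its field names, and the module names (`preserved_biological`, `living_biological`, `geoscience`) are hypothetical and are not taken from the TDWG CD specification or from any existing GRSciColl or GBIF API. It only shows how a generic catalogue and two specialised catalogues could be views over the same set of records.

```python
from dataclasses import dataclass, field

# Hypothetical module names, NOT taken from the TDWG CD specification.
BIODIVERSITY_MODULES = {"preserved_biological", "living_biological"}
GEOSCIENCE_MODULES = {"geoscience"}


@dataclass
class CollectionRecord:
    # Generic section: the minimum needed to identify and reference a collection.
    identifier: str
    name: str
    institution: str
    # Optional type-specific modules, keyed by (hypothetical) module name.
    modules: dict = field(default_factory=dict)


def all_collections(records):
    """Generic catalogue view: every record, whatever modules it carries."""
    return list(records)


def with_any_module(records, module_names):
    """Specialised view: records providing at least one of the given modules."""
    return [r for r in records if module_names.intersection(r.modules)]


# A single shared store of records (made-up examples).
records = [
    CollectionRecord("a", "Herbarium", "Museum X",
                     {"preserved_biological": {"taxa": ["Plantae"]}}),
    CollectionRecord("b", "Living plants", "Garden Y",
                     {"living_biological": {"taxa": ["Plantae"]}}),
    CollectionRecord("c", "Natural history", "Museum Z",
                     {"preserved_biological": {"taxa": ["Aves"]},
                      "geoscience": {"materials": ["minerals"]}}),
    CollectionRecord("d", "Photographic plates", "Observatory W"),
]

generic_view = all_collections(records)
biodiversity_view = with_any_module(records, BIODIVERSITY_MODULES)
geoscience_view = with_any_module(records, GEOSCIENCE_MODULES)
```

In this sketch a mixed institutional collection (record `c`) appears in both specialised views without being duplicated, and a collection with no biological or geoscience module (record `d`) is still findable through the generic view. Whether living and preserved collections belong in one biodiversity view, as assumed here, is exactly the open question raised above.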