1.7. Foundation for new and enriched services (USE)

This is topic 1.7. in the Uses section of the Advancing the Catalogue of the World’s Natural History Collections consultation. Use this topic to discuss the questions listed below.

Background
A comprehensive directory could serve as a foundation for new tools that enhance taxonomic efforts and cooperation between all collection holders. One example might be the development of distributed loans systems or on-demand digitisation, as planned for the DiSSCo European Loans and Visits System (ELViS). A catalogue could also serve as a showcase for institutions to highlight their holdings and unique features, as in the visual concept shared by GBIF for collection pages. GBIF tracking and reporting on the use of biodiversity data in research publications could feed into new services that provide standard metrics and help collections to measure and report their impact.
Other materials

The following contributed materials are particularly relevant to this topic:

Questions

  • What other services could be developed on the foundations of a collection catalogue?
  • Would these attract investment to fund the development and support the maintenance of the catalogue?

A real-world example: there’s an organization in the UK that funds the preservation and sharing of human-created paper products (digitizing books, etc., as ways to share and preserve objects in danger of being lost to time and the elements). How does this organization find potential materials?

In much the same way, I would venture there are geological survey companies that would benefit from our data (and would potentially share profits in return for the benefit of “free” data). I believe this is already happening (pers. comm.).

Rui Figueroa gave a marvelous talk about the vine and wine industry and the benefits of discovering wild relatives. I’m guessing there are more such opportunities. It seems much like a dating service: it’s awfully difficult to find a match if you’re not in the database.

If I have resources (objects) and you have funds, we can’t bring those together unless we can make the objects known to the potential stakeholder / supporter. It’s like a chicken-and-egg problem (or opportunity?).

I also suspect (but am trying to keep it short) that there are ecotourism opportunities in sharing our metadata about collections … AND most definitely in the movie industry too.

A natural history collection is of course in itself part of cultural heritage, in that it contributes (as a human practice) to science, society and education. So besides points in time for biodiversity, *omic variation, species distributions, etc., collections also provide points in time for how scientists (in different locations, etc.) reflected on these things. Maybe that opens an interesting field of association discovery along a catalogue (with a very limited commercialization perspective :slight_smile:).

There definitely should be direct cooperation with INSDC on how to address this with respect to sequences. The sequence submission tools as well as best practices already support citation of underlying material, but a connection to this catalogue would be crucial given the thousands of sequences that are still submitted every day without proper references to collections (and specimens/samples). Cleaning existing data is a nightmare, so I would rather start by helping researchers to submit their sequence data in a better way.

From @maperalta in this Spanish thread

Services: Make biodiversity information sources readily available to environmental managers, policy makers, and social and health workers. Regarding funding, it could come from advertising included in future initiatives (e.g. from providers of supplies such as collection security items or software).

From @ErikaSalazar in this Spanish thread

A collections catalogue can serve as a platform for collections to provide links that direct their users to their websites, online data publication pages, digital catalogues, and scientific publications from the collection.

I think that it’s important to look forward to multiple downstream uses or synergies with the catalogue, but for the purposes of sustainability this may actually be a distraction, based on the experiences of a couple of other data infrastructure initiatives. In particular, there could be major downsides to entangling the most basic functions of a collections catalogue with broader aims for an innovative new platform for biodiversity knowledge integration. The experience of the TAIR database may be informative here, based on an analysis by Sabina Leonelli (2013). “One early strategy adopted by curators was to create several different search engines within TAIR, each of which would provide a different perspective on Arabidopsis biology… Not all of these tools have been found to be equally valuable and accessible by plant researchers, and TAIR curators have reduced their ambitions over time, focusing increasingly on updating sequence and functional data on Arabidopsis rather than including new data types and tools for comparison across plant species (which might be viewed as one reason for their loss of funding)” (Leonelli 2013). That last comment on TAIR’s loss of NSF funding notwithstanding, the database represents one of the greatest success stories in biology for sustainable funding and research impact.

In contrast, Leonelli describes how the Cancer Biomedical Informatics Grid (caBIG) sought to address the widest range of possible user goals by “pushing the databases collected under its purview to adopt common formats and follow basic structural rules enabling basic interoperability across different databases,” with the consequence that the resource “is not supposed to operate on a shared, unified understanding of what it could be used for” (Leonelli 2013, 457). Unfortunately, interoperability and a wide range of potential use cases did not drive strong engagement and adoption for caBIG. The conclusion, I think, is not to abandon the bigger goals for data integration that will leverage the catalogue, but to focus narrowly on the queries that the catalogue can support best in the short to medium term and that correspond to a sufficiently important audience (e.g. large, high-impact, well-resourced).

Leonelli, Sabina. 2013. “Global Data for Local Science: Assessing the Scale of Data Infrastructures in Biological and Biomedical Research.” BioSocieties 8 (4). Nature Publishing Group: 449–65. doi:10.1057/biosoc.2013.23.