Data models and standards for improved usability

A key consideration is that the primary mechanisms for collecting useful biodiversity data have changed over time. I see three overlapping eras:

  1. SPECIMENS - Prior to the middle of the 20th century, the vast majority of the information we have on biodiversity came from the work of collectors. A small workforce delivered data gathered primarily from the most accessible locations (with no sampling methodology) but with very broad taxonomic scope. This model could never scale to support planetary-scale modelling, but it gives us our earliest useful data.
  2. HUMAN OBSERVATIONS - From the 20th century onwards, the vast bulk of our data has come from field observations, either by professional scientists (ecologists, etc.) or through volunteer efforts (bird atlases, bird banding/ringing, citizen science, etc.). The taxonomic scope is often narrower than with specimen collection, but, for taxa that amateur naturalists can record, data volumes can be very large (though often still with insufficient attention to sampling methodology).
  3. MACHINE OBSERVATIONS - We are near the beginning of a third era, in which the simplest, most cost-effective, and most scalable way to collect biodiversity data will be through machine solutions: eDNA, AI processing of webcam, UAV, and satellite imagery or of acoustic recordings, etc. Such methods are much more amenable to broad-scale sampling designs and can (at least with eDNA) cover most organism groups.

The coverage and quality profiles of these three categories (and of the associated recording eras) are fundamentally different. Successful integration will require us to find ways to cross-calibrate these diverse signals.
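One reason integration is tractable at all is that the three eras map cleanly onto the Darwin Core `basisOfRecord` vocabulary (`PreservedSpecimen`, `HumanObservation`, `MachineObservation`), so records from very different pipelines can at least share one schema. As a minimal sketch (the source record formats here are hypothetical; only the Darwin Core term names are standard):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Occurrence:
    """Minimal occurrence record using Darwin Core term names."""
    scientificName: str
    eventDate: str
    decimalLatitude: Optional[float]
    decimalLongitude: Optional[float]
    basisOfRecord: str  # PreservedSpecimen | HumanObservation | MachineObservation

def from_specimen_label(label: dict) -> Occurrence:
    # Historical specimen labels often lack coordinates, so they stay optional.
    # The input keys ("taxon", "collected", ...) are illustrative, not a standard.
    return Occurrence(
        scientificName=label["taxon"],
        eventDate=label["collected"],
        decimalLatitude=label.get("lat"),
        decimalLongitude=label.get("lon"),
        basisOfRecord="PreservedSpecimen",
    )

def from_sensor_detection(detection: dict) -> Occurrence:
    # e.g. output of a hypothetical acoustic or camera-trap classifier,
    # which typically carries precise coordinates and a timestamp.
    return Occurrence(
        scientificName=detection["predicted_taxon"],
        eventDate=detection["timestamp"],
        decimalLatitude=detection["lat"],
        decimalLongitude=detection["lon"],
        basisOfRecord="MachineObservation",
    )

records = [
    from_specimen_label({"taxon": "Parus major", "collected": "1903-06-14"}),
    from_sensor_detection({"predicted_taxon": "Parus major",
                           "timestamp": "2024-05-01T04:32:00Z",
                           "lat": 51.5, "lon": -0.1}),
]
```

A shared schema only solves the structural side; the `basisOfRecord` field is what lets downstream analyses treat the era of origin as an explicit covariate when cross-calibrating signals.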