Summary: 7. Persistent identifier (PID) scheme(s)

Summary 1 (June 15th to July 7th)
PIDs are text strings backed by commitments to keep them persistent, resolvable, and discoverable. Keeping those commitments over long periods (e.g., 100+ years) has a cost, but that cost is small compared with the value of the research and economic opportunities lost when artefacts are not properly identified. Operating an appropriate PID scheme based on the Handle System is estimated to cost around €1m/$1.2m annually for the 30B PIDs needed for Digital Extended Specimens in natural history domains. The cost can be shared globally among institutions and/or research infrastructures, but individual researchers should not have to pay to use PIDs.
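To put the estimate in perspective, the figures quoted above can be turned into a per-PID cost. The sketch below is only arithmetic on those two numbers; the equal-split sharing model and the member count (HYPOTHETICAL_MEMBERS) are assumptions made for illustration and do not come from the consultation.

```python
# Figures quoted in the summary.
ANNUAL_COST_EUR = 1_000_000      # about EUR 1m per year
PID_COUNT = 30_000_000_000       # 30B PIDs


def cost_per_pid(annual_cost: float, pid_count: int) -> float:
    """Annual operating cost attributed to a single PID."""
    return annual_cost / pid_count


def equal_share(annual_cost: float, n_members: int) -> float:
    """Annual cost per member under an equal split (an assumed sharing model)."""
    return annual_cost / n_members


per_pid = cost_per_pid(ANNUAL_COST_EUR, PID_COUNT)
per_million = per_pid * 1_000_000

# Illustrative value only; the summary does not say how many members would share the cost.
HYPOTHETICAL_MEMBERS = 500
per_member = equal_share(ANNUAL_COST_EUR, HYPOTHETICAL_MEMBERS)

print(f"per PID per year: EUR {per_pid:.7f}")
print(f"per million PIDs per year: EUR {per_million:.2f}")
print(f"per member per year ({HYPOTHETICAL_MEMBERS} members): EUR {per_member:.2f}")
```

On these numbers a single PID costs a small fraction of a euro cent per year, which is the sense in which the scheme is cheap relative to the value it protects.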

Handle System mechanisms are proposed as the underlying infrastructure (technical and organisational) that makes the PIDs we need persistent and resolvable. DiSSCo’s proposal to adopt DOIs as the PID for Digital Specimens (DSs) is based on a substantial evaluative comparison of 22 Handle System variants.
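As an illustration of what "resolvable" means in practice, the sketch below looks up a handle through the public Handle.Net proxy and extracts the URL it points to. It assumes the proxy's JSON endpoint at https://hdl.handle.net/api/handles/ and a response containing a "values" list of typed entries; the handle "20.5000/example-1" and the contents of SAMPLE_RECORD are made up for the example and are not real identifiers.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed JSON endpoint of the public Handle.Net proxy.
PROXY = "https://hdl.handle.net/api/handles/"


def proxy_url(handle: str) -> str:
    """Build the lookup URL for a handle (the prefix/suffix slash is kept)."""
    return PROXY + urllib.parse.quote(handle, safe="/")


def first_url(record: dict) -> Optional[str]:
    """Return the first value of type URL in a parsed handle record, if any."""
    for value in record.get("values", []):
        if value.get("type") == "URL":
            return value.get("data", {}).get("value")
    return None


def resolve(handle: str, timeout: float = 10.0) -> Optional[str]:
    """Look up a handle over the network and return its target URL, if any."""
    with urllib.request.urlopen(proxy_url(handle), timeout=timeout) as resp:
        return first_url(json.load(resp))


# Hand-written record in the assumed response shape, so the parsing
# logic can be exercised without network access.
SAMPLE_RECORD = {
    "responseCode": 1,
    "handle": "20.5000/example-1",
    "values": [
        {"index": 100, "type": "HS_ADMIN", "data": {"format": "admin", "value": {}}},
        {"index": 1, "type": "URL",
         "data": {"format": "string", "value": "https://example.org/specimen/1"}},
    ],
}

print(proxy_url("20.5000/example-1"))
print(first_url(SAMPLE_RECORD))
```

The identifier itself never changes; only the URL stored in the handle record is updated when the object moves, which is what makes the PID persistent.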

DiSSCo has become a member of the DOI Foundation and is developing the governance, operations, financing, and service portfolio models, potentially for a new Registration Agency (RA) operating on behalf of the global community. There is also potential to partner with DataCite. IGSN (International Generic Sample Number, a popular PID currently applied mostly to physical earth samples), which has been looking for a sustainable business model that scales with growing demand, shared its solution: a recently announced partnership with DataCite.

In clarifying the concepts of sample, specimen, material sample, and subsample, it became clear that a material sample is the result of a sampling event, while a (catalogued) specimen is the result of a curation process applied to a material sample. Nevertheless, some categories of curated objects, i.e., specimens, are not material samples, such as a sound recording or a drawing. Specimens can yield subsamples that can themselves be curated (and thus become specimens with identifiers).
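These distinctions can be written down as a small data model. The sketch below is one possible reading, not an agreed schema: the class names, the field names, the "kind" labels, and the PID strings are all invented for illustration, and treating the taking of a subsample as a sampling event in its own right is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplingEvent:
    description: str


@dataclass(frozen=True)
class MaterialSample:
    """Result of a sampling event; uncurated, so it carries no PID here."""
    event: SamplingEvent
    parent: Optional["Specimen"] = None  # set when taken from an existing specimen


@dataclass(frozen=True)
class Specimen:
    """Result of a curation process; always has an identifier."""
    pid: str
    kind: str
    material: Optional[MaterialSample] = None  # None for e.g. a sound recording


def curate(sample: MaterialSample, pid: str, kind: str = "physical") -> Specimen:
    """Curation turns a material sample into a catalogued specimen."""
    return Specimen(pid=pid, kind=kind, material=sample)


def subsample(specimen: Specimen, description: str) -> MaterialSample:
    """Taking a subsample yields a new material sample linked to its parent."""
    return MaterialSample(event=SamplingEvent(description), parent=specimen)


# Hypothetical PIDs, for illustration only.
field_sample = MaterialSample(SamplingEvent("field collection"))
specimen = curate(field_sample, "example-pid/1")
tissue = subsample(specimen, "tissue extraction")
tissue_specimen = curate(tissue, "example-pid/2", kind="tissue")
recording = Specimen(pid="example-pid/3", kind="sound recording")

print(tissue_specimen.material.parent.pid)
print(recording.material)
```

In this reading a subsample only receives an identifier once it has been curated, and a specimen such as a sound recording simply has no material sample behind it.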

Standardizing metadata for heterogeneous sample collections remains a challenge. Beyond samples/specimens, people and organizations are involved in the collection, curation, and management of samples. They need PIDs too, such as ORCID and Wikidata identifiers for people, and ROR for organizations.
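When person identifiers are recorded in sample metadata, they can at least be checked for well-formedness before being stored. As a minimal sketch, an ORCID iD ends in a check character computed with the ISO 7064 MOD 11-2 algorithm, so a mistyped iD can often be caught locally. The function names below are illustrative, and the iDs used are the sample values that appear in ORCID's own documentation.

```python
def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits, and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15].upper()
    if not all(ch in "0123456789" for ch in base):
        return False
    return orcid_check_char(base) == check


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A passing check only shows that the string is well formed; confirming that the iD belongs to the intended person still requires a lookup against the ORCID registry.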