Making FAIR data for specimens accessible

JuttaBuschbom · February 28, 2021, 11:22am

@Markus_B Thanks a lot for your detailed answer. It is helping me understand why I find the ongoing legal debate about “DSI” in parts so detached from the reality of its wider context. I am realizing just now that when I discuss ABS policy solutions for DSI, I am already intuitively taking into account and talking about the whole set of possible DS/ES information.

DNA-sequence information gains value mainly by being associated with phenotypic or additional extended data (see e.g. the information sections proposed by @hardistyar in the subtopic thread on structure and responsibilities, “Structure and responsibilities of a #digextspecimen”).

DNA-sequence datasets in isolation are “only” of interest to genome scientists; otherwise they aren’t really that interesting, e.g. to society.

Once information about the geographic coordinates of the sample (-> physical specimen) is added to a DNA sequence, population genomicists start to get excited. Statistical approaches for reliably resolving population structure and diversity at increasingly smaller scales are currently a rapidly advancing scientific field. As a result, these datasets now intersect with legal, social, economic and conservation spheres in the form of forensics, certification and monitoring. Simple but pivotal: by adding geographic information, DNA-sequence information starts to collide e.g. with the 200 billion annual profit sector of environmental crime. Now you have all kinds of interests at play.

These societal interests non-linearly expand and intensify with the association of phenotypic, ecological and environmental information - the core information of specimens in collections and of biodiversity records (ABCD: units - not sure about the appropriate term).

In most applied fields, biodiversity is recognized as a topic of fundamental importance, yet it is a nuisance topic - not unlike the situation in human medical R&D. With a background of 10+ years in forest genetics, my experience is that the majority of stakeholders there are not interested in biodiversity per se. They want to know whether the timber they are buying is the real deal and not a cheaper substitute; how to breed “Spessart” or “Slavonian” high quality oaks; and which species and provenance mix will be able to hold up under and adapt to climate change.

All of this requires reliable geographic origin, phenotypic, ecological and environmental information. This is exactly the information the DS/ES infrastructure of the biodiversity sciences intends to provide through digitalisation (Thanks for pointing out the crucial role of digitalisation). Suddenly, dusty old natural history collections aren’t a romantic, slightly backwards enterprise that is a financial sink. They, via their data, become a goldmine.

That extended data are a game changer is mentioned in this article by Powell 2021 (The broken promise that undermines human genome research), part of a series on the 20th anniversary of the human genome:

Bahlo and others say that data federation efforts become even more important as the field pivots to digging deeper into phenotype data, which have grown in scope and complexity. “That data comes in all sorts of forms — environmental exposures, smoking status, medical imaging data,” says Bahlo.

Therefore, I completely agree with your point of view that all biodiversity information is, should be or will be subject to questions regarding access and benefit sharing, and hence discussions about ABS within the Convention on Biological Diversity. My guess is that the narrow focus on DSI can be explained by historical processes.

The information by the Secretariat of the CBD is that there will be a decision at the next Conference of the Parties (COP 15), tentatively this year. Likely, it will not be possible so late in the policy process to change the object from “DSI” to “biodiversity information - DS/ES information” in general.

However, given the time constraints imposed by global change, it seems crucial to consider already now, in parallel, the consequences of implementations of the DS/ES concept when discussing ABS policy options for DSI. A well-shaped set of options for DSI could then be more easily expanded to include all of biodiversity information in the future.

Conversely, it is necessary to consider already now the implications of the DS/ES concept and its implementation(s) for access and benefit sharing, as well as for privacy concerns, i.e. the protection of sensitive data and proprietary information.

While the ethical foundations start out well in this topic, the discussion about applied consequences is moving quickly into the subject of topic 5 Analyzing/mining specimen data for novel applications.
