8. Meeting legal/regulatory, ethical and sensitive data obligations

Moderators: Alex Hardisty, Jutta Buschbom and Breda Zimkus


The desire to have open access to specimen data and to conduct open science contrasts with issues associated with legal requirements such as intellectual property rights; regulatory constraints of specific national and international legislation; the need to prevent the exposure of sensitive information to unauthorised persons; the need of businesses to maintain data- and infrastructure-based business foundations; and the social goals of fairness and equity, which require mechanisms to prevent individuals and groups taking undue advantage, especially of large and information-rich datasets. Against the background of a general policy of being ‘as open as possible, as closed as legally and/or ethically necessary’, the goal of this topic is to identify and explore the mechanisms of the future to meet legal obligations, including national and international regulations, as well as ethical and sensitive data concerns associated with biodiversity collections and their use.

The paragraphs below develop the themes of this topic in greater depth to allow several questions to be considered.

Digital solutions for meeting obligations

We want to explore when extended data infrastructure(s) are accepted and embraced by the communities to which they can provide services. Which criteria and functions must be offered to support legal/regulatory and ethical/moral obligations and sensitive data concerns? Are there existing solutions that can provide blueprints or that can be adapted? Digital solutions must provide services and advantages welcomed by the [communities of collection professionals, bio/geodiversity scientists and bio/geodiversity informaticians](link to topic 9), who develop, manage and maintain the data infrastructure(s). At the same time, bio/geodiversity infrastructures radiate out into their surrounding societies and offer intersections with a wide range of affected or interested [stakeholders with potential applications](link to topic 11). Thus, extended infrastructures need to appeal to and provide the functionality required in interactions with specific partners. These partners are found, for example, with indigenous peoples and local communities (IPLCs), women and youth; with engaged citizens and environmental activists; in university, government and companies’ research and development departments; in businesses (e.g., providing environmental assessments or producing a wide range of commodities); as well as organizations maintaining certification systems. Furthermore, infrastructure partners are responsible for local, subnational to national administration and contribute to (sub-)national planning and reporting processes; are professionals in customs offices, in law enforcement and the legal system responsible for case decisions; and are involved in developing forward-looking policy-decisions. 
Considering these and other applications, one aspect of this topic is related more directly to technical functionality providing versatile and elegant effectiveness, and efficiency through user-friendly interfaces, powerful tools and integrated workflows, made possible by FAIR Digital Objects, [persistent identifiers](link to topic 7) and relational links governed by [transactional mechanisms and provenance](link to topic 10). A second aspect considers and develops the theoretical and procedural concepts implementing a layer of legal and regulatory obligations; considerations for sensitive data, information privacy, and business intelligence; as well as ethical and social frameworks based on the Sustainable Development Goals for the use of traditional knowledge in fairness, equity and justice.


The collections community is one of the mediators of both the technical and the socio-ethical aspects of implementing and executing Access and Benefit-Sharing (ABS) regulations. In this context, a growing number of legal, regulatory, and ethical issues are confronting biodiversity collections. The Nagoya Protocol on Access and Benefit Sharing has notable implications, and although its aim is to create greater legal certainty and increase transparency for both providers and users of genetic resources, the agreement often poses challenges for those collecting, managing, and using collections. Several issues remain unresolved, including the inclusion of digital sequence information (DSI) under the CBD and/or the Nagoya Protocol; the conservation and sustainable use of marine biological diversity of areas beyond national jurisdiction (BBNJ), for which the development of an international instrument under the United Nations Convention on the Law of the Sea is being considered; and the interaction with traditional knowledge and data rights of indigenous people, local communities, women and youth (CARE principles for the governance of indigenous data).

Chains of custody

Chains of custody must already be documented for ABS. The requirements for their technical implementation and for operational compliance expand when specimens are used for specific purposes, such as certification or forensic casework. In a versatile conservation work environment, chains of custody span the path of a specimen and its associated metadata from the gathering event in the field, through transport, accessioning, preparation and digitization, to use (e.g., lab work, imaging, statistical analyses) and end products (e.g., reports, publications), including loan/gift events in the context of biodiversity collections. Chain-of-custody functionality and the information it provides must be available when required for official reporting (compliance) in conservation contexts, national planning, court evidence, and commercial and customs decisions. These represent some of the main use cases for which transactional mechanisms and provenance ([Topic 10](link to topic 10)) are needed and which require consensus on global implementation mechanisms.
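As a starting point for discussion, the path described above could be sketched as an ordered list of custody events per specimen. This is only a minimal illustration, not a proposed standard: the stage names are taken from the paragraph above, while the class names, the fields and the rule that events must be recorded chronologically are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Stages named in the text, in the order given there (the ordering check is an assumption).
STAGES = ["gathering", "transport", "accessioning", "preparation",
          "digitization", "use", "end_product"]


@dataclass(frozen=True)
class CustodyEvent:
    stage: str       # one of STAGES, or "loan" / "gift"
    custodian: str   # person or institution holding the specimen at this point
    on: date         # date of the event
    note: str = ""


@dataclass
class CustodyChain:
    specimen_id: str  # hypothetical identifier, e.g. a persistent identifier
    events: List[CustodyEvent] = field(default_factory=list)

    def add(self, event: CustodyEvent) -> None:
        # Refuse events dated before the last recorded one.
        if self.events and event.on < self.events[-1].on:
            raise ValueError("custody events must be recorded in chronological order")
        self.events.append(event)

    def gaps(self) -> List[str]:
        """Stages up to the latest recorded stage for which no event exists."""
        recorded = {e.stage for e in self.events if e.stage in STAGES}
        if not recorded:
            return list(STAGES)
        last = max(STAGES.index(s) for s in recorded)
        return [s for s in STAGES[: last + 1] if s not in recorded]


chain = CustodyChain("hypothetical-specimen-001")
chain.add(CustodyEvent("gathering", "Collector A", date(2021, 3, 1)))
chain.add(CustodyEvent("accessioning", "Museum B", date(2021, 3, 20)))
print(chain.gaps())  # ['transport']
```

A report for compliance or court evidence would need far more than this (signatures, documents, provenance links), but even such a simple structure makes undocumented steps visible.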

Questions to promote discussion

  1. Which models and frameworks already exist? Have they been implemented and how? Are they in use? What are the experiences with them?
  2. Who decides the specifics of what should be implemented?
    • Will this need to be an international, legal and multi-stakeholder top-down model or will it be a per-specimen/per-information and per-provider bottom-up model?
    • What happens if different potential rights-holders exist for one specimen or piece of information? E.g., an indigenous people and local community (IPLC), a researcher who produced derived results, a collections institution, a company holding a patent, etc.
    • Applicability of subnational and national regulations and international treaties might depend on use and outcomes: e.g., if information is “exported”, i.e., used in a different country/administrative unit; if it is used for non-commercial or commercial purposes; if it results in a commercial product years later; …
  3. Who should be responsible for setting rights and maintaining them, e.g., for biological diversity of areas beyond national jurisdiction (BBNJ)? Who has the legal and/or ethical responsibility? How can this be recorded in Digital Extended Specimens?
  4. What is the power and potential of supporting these obligations and considerations in a Digital Extended Specimen infrastructure? How can they inspire, even “demand” the use of DES infrastructures and spin-off applications?
  5. Are there stakeholders that have not yet been identified that might play an important role in aspects of implementation or compliance?

Information resources

A wide range of background information resources are relevant, including those of a general nature and those related to the biodiversity and natural sciences domain more specifically.

On intellectual property rights

On open science

On collections-based experiences and points of view

  • BCoN (2019) Extending U.S. Biodiversity Collections to Promote Research and Education. A report by the Biodiversity Collections Network 2019 URL: https://www.aibs.org/home/assets/BCoN_March2019_FINAL.pdf.
  • Blasiak, R., R. Wynberg, K. Grorud-Colvert, S. Thambisetty, N.M. Bandarra, A.V.M. Canário, J. da Silva, C.M. Duarte, M. Jaspars, A. Rogers, K. Sink, and C.C.C. Wabnitz. (2020) The ocean genome and future prospects for conservation and equity. Nature Sustainability 3: 588–596. doi:10.1038/s41893-020-0522-9.
  • Colella, J.P., R.B. Stephens, M.L. Campbell, B.A. Kohli, D.J. Parsons, and B.S. Mclean. (2020) The Open-Specimen Movement. BioScience biaa146: 1–10. doi:10.1093/biosci/biaa146.
  • Fukushima, C., R. West, T. Pape, L. Penev, L. Schulman, and P. Cardoso (2020) Wildlife collection for scientific purposes. Conservation Biology. https://doi.org/10.1111/cobi.13572.
  • National Academies of Sciences, Engineering, and Medicine (2020) Biological Collections: Ensuring Critical Research and Education for the 21st Century. Washington, DC: The National Academies Press. doi: 10.17226/25592. - Page 28 specifically.
  • Thiers, B., J. Bates, A.C. Bentley, L.S. Ford, D. Jennings, A.K. Monfils, J.M. Zaspel, J.P. Collins, M.H. Hazbón, and J. L. Pandey (2021) Viewpoint: Implementing a community vision for the future of biodiversity collections. BioScience biab036, 1–3. https://doi.org/10.1093/biosci/biab036
  • Zimkus, B.M., L.S. Ford, and P.M. Morris (2021) The need for permit management within biodiversity collection management systems to digitally track permits and other legal compliance documentation and increase transparency about origins and uses. Collection Forum. Accepted.

On experiences made in/by human genomics and medicine

On Access and Benefit-Sharing

On rights-based governance of data, access and use

On environmental ethics

On sensitive data

  • Chapman AD (2020) Current Best Practices for Generalizing Sensitive Species Occurrence Data. Copenhagen: GBIF Secretariat. https://doi.org/10.15468/doc-5jp4-5g10 .
  • Figueira R, Beja P, Villaverde C, Vega M, Cezón K, Messina T, Archambeau A, Johaadien R, Endresen D & Escobar D (2020) Guidance for private companies to become data publishers through GBIF: Template document to support the internal authorization process to become a GBIF publisher. Copenhagen: GBIF Secretariat. https://doi.org/10.35035/doc-b8hq-me03.
  • GBIF Secretariat & IAIA (2020) Best Practices for Publishing Biodiversity Data from Environmental Impact Assessments. Copenhagen: GBIF Secretariat. https://doi.org/10.35035/doc-5xdm-8762.

On chains of custody

Good morning and welcome to discussions focusing on the Digital Extended Specimen concept as a future implemented digital infrastructure embedded in and in exchange with society.

Starting out, open access to and the fair and equitable sharing of benefits derived from biodiversity are fundamental for societies attaining sustainability, as well as for concrete, operational conservation applications. At the same time, the topic of access and benefit sharing (ABS) has gained a certain notoriety for having given rise to a thicket of laws and regulations, difficult to understand and navigate, not only for organismal biologists and collections.

How can the integration of extended physical and digital infrastructures provide orientation, actionable confidence, as well as easy and effective application for providers and users?

When discussing licensing of biodiversity, could we also discuss license stacking when licenses are applied to data?
The existence of various licenses with different degrees of openness seriously hampers the reuse of data.
A lot of data is made available under licenses that are geared more towards reports and other creative outputs, and not all open data licenses are compatible with each other, meaning that the data cannot be integrated.

It would be great if this consultation cycle could lead to a decision tree that can be used to pick an applicable license for biodiversity data.


Hi Andra,
welcome to the topic and thank you for your interest.
It seems that you are touching on two topics:

  1. for users, advice on which license to choose for biodiversity data and knowledge, so that it can and will be reused, and thus can contribute to eg. conservation, benefit sharing, deepening of our knowledge, and more.
  2. for infrastructure developers, which information and functions will make a user’s interaction with licenses “easier”.

Here, I will explore my view on 1) licensing biodiversity data and knowledge. In a subsequent post, I will compile a list of technical fields and functions as proposal for an answer to 2), infrastructure development and implementation.

License choices for biodiversity data and knowledge

My experience with licenses is that it’s really hard to deal with them, both as a user of a licensed resource and as a provider who has to choose a license herself for a resource that she is providing, eg. a product, service, … And I am only speaking about decisions to be made in the universe of open/free/public/CC/copyleft/open-source/public-domain-equivalent/permissive licenses (see corresponding pages in https://en.wikipedia.org).

Not having heard the term “license stacking” before, I followed your link with its very helpful information and animation (librarian’s cool glasses!). Afterwards, I found GBIF’s description of their licenses’ development and their argumentation a good foundation.

Basically, GBIF’s strategy supports “Public money, public code”, ie. public resources. They allow three licenses: CC0, CC-BY and CC-BY-NC.

These are my personal, partly abstract, 5 cents:
For simplicity, whenever possible choose CC0. You don’t need a lawyer for that, and you promote sharing and reuse for, hopefully, the good of society and future generations.

When your, your business’, organization’s or institution’s income depends on credit returned by uses of this resource, and there is no other way of “quantifying” the importance and impact of your published resource, then you might use CC-BY. Looking into a future in which the DES infrastructure is fully up and running, CC-BY no longer seems necessary at that point. All uses of all resources will be linked to the provider or publisher, no matter across how many steps and corners. All it needs is a dashboard for each provider/publisher that visualizes the work of some (friendly) bots, which are collecting links in the background. [sounds somehow scary? - mmh]

Thinking about much of the data involved in the context of the DES, GBIF and biodiversity (a DNA-sequence, a barcode, a single genome sequence, one physical collection specimen, an observation, a single annotation, … ), this is data which is of interest in the context of “Big Data” analyses. That is, in themselves these single data “points” are not that informative; their value lies in combining them with other data into smaller to larger datasets, on which more or less extensive analyses are run. I am not certain that in these cases CC-BY-NC licenses will stop a “big business” or start-up that acts unfairly and, in the longer term, unsustainably from getting rich by exploiting accessible resources on the back of everybody else. We (the public, providers) have no insight into what lies on companies’ servers, what might have been used in R&D, or what data might power commercial platforms providing analytical and information services.

On the other hand, societies, the SDGs, conservation, etc. depend to a large part on businesses to provide sustainable and “fair” products and services for the good of all. Restrictive licenses might create obstacles to achieving exactly the goals for which they were chosen.

Data, information and expertise as business intelligence can form the foundation of business enterprises. Hence, businesses can have good reasons to restrict access to (parts of) their data and information, and/or restrict the use of their information capital, by setting in place more or less restrictive licenses and patents. In this way, they keep them behind a kind of “communication” wall. If somebody would like to use their resources, they need to contact the business and inquire about conditions for access and use. Here, a restrictive license can form the foundation for income, and also cooperation and collaboration.

Apart from placing your data and information resources in the public domain (eg. CC0), all licenses require effort and long-term responsibility from the licensor. Licenses need upkeep, at a minimum in the form of keeping contact information up to date, ensuring that somebody can be reached within a reasonable time frame for inquiries, and communicating a testament decision: what should happen if the licensor isn’t around anymore, will somebody inherit the ownership of the license, and if so, who? Licensors might also want to monitor the use of their resources and potentially enforce the license (else, why add restrictions in the first place). Thus, when choosing a restrictive license, it might be a good idea to make it time-limited and/or have it default to the public domain (CC0).
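To make these personal rules of thumb a bit more concrete, and as a very first step towards the decision tree asked for above, they could be written down roughly as follows. This is only a sketch of my own reasoning, not legal advice: the function and parameter names are made up, and mapping “needs restriction” to CC-BY-NC is an assumption based on the three licenses GBIF allows.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LicenseChoice:
    license: str
    expires: Optional[date] = None   # when a restriction is meant to lapse
    fallback: Optional[str] = None   # what applies after expiry


def suggest_license(needs_credit: bool,
                    needs_restriction: bool,
                    restriction_until: Optional[date] = None) -> LicenseChoice:
    """Hypothetical helper mirroring the rules of thumb in this post."""
    if needs_restriction:
        # Restrictive license: suggest a time limit and default to the public domain.
        return LicenseChoice("CC-BY-NC", expires=restriction_until, fallback="CC0")
    if needs_credit:
        # Income depends on credit and impact cannot be quantified otherwise.
        return LicenseChoice("CC-BY")
    # Whenever possible: keep it simple.
    return LicenseChoice("CC0")
```

For example, `suggest_license(needs_credit=True, needs_restriction=False)` suggests CC-BY, while a provider with no such dependency ends up with CC0. A real decision tree would need many more branches (ABS obligations, sensitive data, third-party rights).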

This is the second part of my answer, considering which fields and functions might make users’ interactions with licenses et al. easier. On the one hand, they can provide orientation to users. At the same time, their extent should not scare providers and users away from these issues.

Fields and functions associated with licenses, legal agreements and/or business contracts

  • [Owner of resource]
  • License
    • Type
    • Version
    • information/description
    • URL
    • License holder
  • Contact for interaction
    • Agent (might be owner, license holder, designated contact point, etc.): researcher, institution, organization, business, agency, etc.
    • Contact information, eg. email, phone, etc.
  • Expiration date
    • Set legally by regulation
    • Set by license holder (needs to be sooner than any legally set date)
  • Legacy (testament for license and license rights)
    • Who will inherit the license and the rights arising from it?
    • Information if this testament is legally binding (and why)
  • Acknowledgements/Attributions
    • If the resource includes previous licenses (eg. because it is a composite dataset of several to many data points): yes - no
    • Links to those licenses
    • (automatically compiled) list of links, resolved to license holders, who require attribution in human and machine-readable form
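To illustrate how the fields above might hang together, here is a minimal sketch of a license record. All class and field names are hypothetical, and the rule that any license type containing “BY” requires attribution is an assumption made for the example. The sketch checks that an expiration date set by the license holder is sooner than a legally set one, and compiles the list of license holders who require attribution for a composite resource.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Contact:
    agent: str        # owner, license holder, designated contact point, ...
    email: str = ""


@dataclass
class LicenseRecord:
    license_type: str                      # e.g. "CC0", "CC-BY", "CC-BY-NC"
    version: str
    url: str
    holder: str
    contact: Contact
    legal_expiry: Optional[date] = None    # set legally by regulation
    holder_expiry: Optional[date] = None   # set by license holder
    heir: Optional[str] = None             # legacy: who inherits the license
    included: List["LicenseRecord"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A holder-set date needs to be sooner than any legally set date.
        if (self.legal_expiry and self.holder_expiry
                and self.holder_expiry >= self.legal_expiry):
            raise ValueError("holder-set expiry must be sooner than the legal expiry")

    def attribution_holders(self) -> List[str]:
        """Holders requiring attribution, for this record and all included ones."""
        holders = [self.holder] if "BY" in self.license_type.upper() else []
        for record in self.included:
            for holder in record.attribution_holders():
                if holder not in holders:
                    holders.append(holder)
        return holders
```

For a composite dataset under CC-BY-NC that includes one CC-BY and one CC0 resource, `attribution_holders()` would return the holder of the composite and the holder of the CC-BY resource, but not the CC0 provider.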

Legal and/or business statements (eg. permits, contracts) concerning acquisition and/or further uses of the resource. These fields need to be repeating, since several interactions might occur for one resource. Also, interactions might evolve over time, reflecting a series of, eg., inquiries, outcomes, contact points, etc.

  • Legal and/or business agreements (eg. permits, contracts)
    • Legal and/or business agreements
      • Are required? Yes – no (eg. associated with acquisition)
      • Exist? Yes - no
    • Permit holder, signatories of agreement
    • Link to legal agreement or business contract, including (scanned) document
    • Contact points and information on both sides, eg. collector and permit agency; resource provider and business partner, etc.
    • Associated communication, eg. inquiries and associated outcomes
    • Terms of contract
    • (Potential) Expiration date
  • Information about confidential legal/business agreements (eg. Prior Informed Consent (PIC) and Mutually Agreed Terms (MAT) agreements associated with Access and Benefit Sharing might be confidential; see the Wikipedia article on the Nagoya Protocol)
    • Confidential information exists? Yes – no
    • Contact for inquiries into confidential information
    • Restricted access, accessible only for resource owner/provider (legal contact) and permit provider/business partner:
      • Module under 7.
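The restricted-access idea in the list above could look roughly like this: everybody can see that an agreement exists, whether it is confidential and whom to contact, but only the parties to the agreement see the document and its terms. Again, this is just a sketch under my own assumptions; the class, its fields and the simple name-based check are hypothetical and say nothing about how authentication would really work.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Agreement:
    kind: str                  # e.g. "permit", "contract", "PIC", "MAT"
    parties: Tuple[str, ...]   # permit holder, signatories of the agreement
    contact: str               # contact for inquiries
    document_url: str          # link to the (scanned) document
    terms: str
    confidential: bool = False
    expires: Optional[date] = None

    def view(self, requester: str) -> dict:
        """Return what the requester may see of this agreement."""
        summary = {
            "kind": self.kind,
            "confidential": self.confidential,
            "contact": self.contact,
        }
        # Confidential details are visible to the parties only.
        if self.confidential and requester not in self.parties:
            return summary
        return {**summary,
                "document_url": self.document_url,
                "terms": self.terms,
                "expires": self.expires}
```

Since the fields need to be repeating, a resource would simply carry a list of such `Agreement` records, one per permit, contract or inquiry outcome.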

@Andrawaag and everybody: I am curious what you think about these ideas and proposals. Do you have experience with these topics and matters?