8. Meeting legal/regulatory, ethical and sensitive data obligations

gdadade · August 12, 2021, 12:51pm

I agree that transactions should be visible to the community, but the question is when this information can be made public. At the time of the request? Many researchers would not like that and would rather wait until the results are published, but this can take many years. This is something we need to discuss with the research community. The basis would be to track all transactions internally (both incoming and outgoing), which is already a challenge in some cases. Within SYNTHESYS+ we ran a survey to find the critical bottlenecks in these tracking/logging workflows, in particular for molecular collections. You can find the summary of the outcomes at: GGBN Library » SYNTHESYS+ NA3.2 survey summary.pdf

abentley · August 12, 2021, 2:20pm

This is one of those cases where a digital key to hidden data could be provided to those who need to access it (ABS compliance people in various countries, etc.), after which the data can be made public once there is no longer any concern about it being transparent. The digital key system could be used in this way in lots of other circumstances and should accommodate hiding data and then making it transparent at a later date if necessary.
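As a rough illustration of the idea above, here is a minimal sketch of a record that stays hidden until an embargo date passes, unless the caller presents a key. All names (`TransactionRecord`, `embargo_until`, `key_holders`) are hypothetical and not part of any existing DES system; real key management would of course need proper authentication rather than plain strings.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set


@dataclass
class TransactionRecord:
    """A transaction whose details stay hidden until an embargo ends (hypothetical model)."""
    record_id: str
    details: str
    embargo_until: Optional[date] = None  # None means the record is already public
    key_holders: Set[str] = field(default_factory=set)  # illustrative key identifiers

    def is_public(self, today: date) -> bool:
        # Public if there is no embargo or the embargo date has been reached.
        return self.embargo_until is None or today >= self.embargo_until

    def view(self, today: date, key: Optional[str] = None) -> Optional[str]:
        # Return the details if the record is public or the caller holds a valid key.
        if self.is_public(today) or (key is not None and key in self.key_holders):
            return self.details
        return None


record = TransactionRecord(
    record_id="T-001",
    details="tissue sample sent to partner lab",
    embargo_until=date(2024, 1, 1),
    key_holders={"abs-office-key"},
)
```

Before the embargo date, `record.view(date(2023, 6, 1))` returns `None`, while the same call with `key="abs-office-key"` returns the details. From the embargo date onward, everyone sees them.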

gdadade · August 13, 2021, 7:17am

Thanks Andy, that makes sense. The user rights management would have to work across multiple infrastructures and entry points, but that should be doable.

JuttaBuschbom · August 13, 2021, 2:09pm

@dshorthouse Interesting and important perspective.

The question is if marketability hasn’t already been shown. In general, by “the internet”, or any large shopping, or social media platform, as well as by Wikipedia, GBIF, UN Biodiversity Lab, etc. Collecting and linking data (cp. also printed encyclopedias) seems to provide additional value that can be and is widely marketed.

The airport/flight analogy I still find fruitful and fun. Here, I will argue against your assessment. Personally, I am happy to see pilots walk around any airplane I sit in (-> waiting) and bump their feet against tires, check if the tail xyz is still attached etc. Considering Boeing’s general problems after a couple of planes of one of their series went down, I don’t seem to be the only one who thinks that maintenance and security are prerequisites for me to get on a plane. Also, for plain economic reasons, I don’t think airlines would rather do without security checks.

There are good reasons to prefer a high-speed train, spending the time on enjoyable or productive me-time instead of repeatedly waiting in line at security checkpoints over a series of flights. But once you have decided to fly, there are so many worst-case scenarios that companies and passengers would insist on checkpoints if they didn’t already exist. They might complain about the waiting and standing in line, not about the security checks.

JuttaBuschbom · August 13, 2021, 3:00pm

Another option is to publish only summary statistics across datasets, which definitely cannot be re-identified by the public or by researchers. This seems to be one of the ways the UK Biobank (https://www.ukbiobank.ac.uk/) protects privacy when allowing access to their population-genomic datasets.
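To make the summary-statistics option concrete, here is a minimal sketch that publishes only per-group counts and drops groups below a minimum size. The function name and the threshold of 5 are assumptions for illustration only; this is not a description of how the UK Biobank or any other provider actually processes its data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summary_counts(
    records: Iterable[Tuple[str, str]], min_group_size: int = 5
) -> Dict[str, int]:
    """Count records per group and suppress groups smaller than min_group_size.

    Each record is a (group, individual_id) pair; individual ids never leave
    this function, only the aggregated counts do.
    """
    counts = Counter(group for group, _ in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Hypothetical input: six individuals from population A, two from population B.
records = [("A", f"ind-{i}") for i in range(6)] + [("B", "ind-x"), ("B", "ind-y")]
published = summary_counts(records)
```

With the default threshold, `published` contains only population A, because population B is too small to report without risking re-identification.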

JuttaBuschbom · August 13, 2021, 3:16pm

I am not sure that I understand the specific component of the situation that you are asking about.

Assume that you have “old” data that was publicly funded but generated without a data management plan, or that there is a data management plan and the situation has changed. As I understand it, there can be legal and ethical reasons to nevertheless not provide access to such data.

Due to legal reasons, e.g. access and use restrictions imposed by third parties
Due to ethical reasons, e.g. a population or species becomes overexploited and endangered
In both cases you have, I would think, legitimate arguments to restrict access and withhold data.

dshorthouse · August 13, 2021, 5:05pm

This is true for downstream use once links are created. By “marketability”, I am specifically referring to those who will be engaged in making links. It’s THAT audience that must be incentivized, engaged & committed to the task first and foremost. We can dream up all kinds of end-uses, but unless we tackle and understand the motivations for building & sustaining links in the first place…the creators of the links…it’s not particularly helpful to talk about rules or restrictions UNLESS those techno-governance structures result directly in motivation to contribute. There are lessons to be learned through examination of Wikidata’s open, participatory model.

apodemus · August 13, 2021, 5:58pm

@JuttaBuschbom et al., thank you for further developing this discussion. Just as food for thought, let me throw in an additional, more ‘diplomatic’ layer to the analysis, still using the air travel analogy.

Conventionally, even in the pre-CBD time, what seems to have worked for stakeholders (both users and providers) as a way of lowering the hurdle and lubricating otherwise dry ‘zero-trust’ international research transactions was gradually building a track record of sustainable research partnerships and trust with their counterparts in specific countries or communities over time, to gain even a slight advantage in the ecosystem. That advantage may have manifested itself in priority processing of repeat access requests and ease of securing permits, memoranda of understanding, and prior consent from government agencies, to initiate and maintain smooth access to and exchange of study materials and data, as well as talent and funding (but not bribes!). Such traditional trust-based international research partnerships may have been built initially at the personal level of individual investigators or at the inter-organizational level, i.e., on private trust as opposed to public trust. Cumulatively, however, they may have grown to the intergovernmental level, potentially leading (by the airport analogy) to stronger passports and visas, a faster check-in line, and clearance privileges at airport security checkpoints, based on non-binding policies and informal agreements. In the framework we are envisioning, should we make sufficient room for developing such diplomatic deviations from standard guidelines, irrespective of how flexibly or rigidly the proposed DES framework is implemented through individually crafted and traceable boarding passes (use agreements)? Or should we as a whole community stay away from such ‘friendly’ honor-based deviations under the zero-trust model in favor of consistency?

Breda_Zimkus · August 14, 2021, 12:41pm

@apodemus- You may be correct that providers may feel more comfortable approving the work of long-standing researchers because they have either repeatedly fulfilled the obligations outlined in benefit-sharing agreements, or they have made significant contributions even without a formal agreement. The Nagoya Protocol allows countries to decide on how they will regulate access to genetic resources, and our goal is to have a DES framework that provides transparency and accountability so they do not have to depend on these personal relationships. As you know (but others may not), access to genetic resources may require Prior Informed Consent of the country providing the resources (the country of origin or a state that has acquired the resources according to this article and acts as country of origin), and national access laws may require that Mutually Agreed Terms be established between provider and user, including terms for sharing benefits arising from the utilization of genetic resources. Non-monetary benefits are especially relevant to those conducting scientific research for noncommercial purposes, which may include collaboration with in-country partners (e.g., fieldwork, authorship), instruction of students, community education, and provision of relevant research results. Research that benefits the provider country (and society as a whole) should be approved regardless of whether the researcher is (back to our flight analogy) taking a one-way flight or is a frequent flyer. Short-term projects can still lead to important discoveries and generate significant benefits for both the users and providers, and these successful one-way flights will only continue to strengthen the trust in the system. In addition, this work does not prevent those who want to be frequent flyers with long-term collaborations, but like everyone else, they will have to have a boarding pass issued and go through the security checkpoint before getting on their flight.

JuttaBuschbom · August 17, 2021, 10:42am

@dshorthouse Thanks a lot for your reply, it is food for thought for me. So far, my perspective has been that the DES infrastructure will grow

In the way that collections, GBIF, Wikipedia, etc. grow - rather randomly or semi-organized based on the individual interests of its providers and users.
Providers and users will use and expand the DES, because the infrastructure is useful to their own work, comparable to physical collections. The more user-friendly the system and the more value it provides, the more it will be used and expanded.
Providing a connection to discussions under Topic 7 “Workforce & capacity building and inclusivity”, specifically the posts following this question posed by @austinmast: at some point, which might be early on, DES data curators might fill gaps and add and curate links focusing on specific topics, projects, goals, etc. They might provide maintenance, integration and expansion services in an organized, goal-focused way. Who might coordinate such work is a question that @Debbie asked in Topic 11 “Partnerships”.

I completely agree with your statement:

My point of view is that the goal for the DES is or should be to have “techno-governance” structures so well implemented that they are user-friendly and fulfill the needs of their communities, thereby intrinsically providing value and motivation to providers and users.

Wikidata has come up several times by now (see e.g. Topic 6 “Robust access points …” and the discussions there). Having had no previous exposure and only slowly starting to understand its core, its functionality and the experiences with it, I am starting to see its relevance and importance for the design of governance layers for the DES. One difference might be that the DES is “dealing” not only with widely and publicly available data, but also with a lot of data that are “sensitive” for a whole range of reasons, as well as with the intersection of such data with an often non-trivial legal (and political) context. This might, however, be a simplified view of the situation of Wikipedia/Wikidata, which is a reason to check their legal, ethical and sensitive data guidelines.

JuttaBuschbom · August 18, 2021, 9:01am

https://rightsstatements.org/en/

An approach for organizations that I came across today. Developed in the cultural heritage world, it might be applicable, at least in part or as blueprint, to the realm of natural history, too.

@Rich87 Maybe one of these rights statements fits your use case. For example, the statement "No Copyright - Other Known Legal Restrictions".

PFUHLIR · August 20, 2021, 9:28pm

Hello everyone, sorry to be late to the party. I was invited to participate, but didn’t get around to it until now. I would like to turn the conversation around a bit to organizational models and their legal implications with regard to biodiversity data.

I was a co-chair of the Legal Interoperability Principles and Implementation Guidelines at the Research Data Alliance a few years ago. We recommended the CC0 waiver for maximum interoperability, although we recognized that a lot of people/institutions would want to get attribution, so we left the door open for a CC BY 4.0 license, even though the requirement to attribute was not always legally enforceable and was more of a normative request. We also recognized the attribution stacking issue. I don’t wish to re-plow the above discussion of licenses, their language and form, or their various shortcomings, however; I would rather focus on the institutional and implementation context. For simplicity, one can divide this into 3 categories: the bottom-up individual researcher license, the top-down governmental or inter-governmental approach, and the institutional consortium that is in between the two. I will stop here and comment on each one separately over the next day or two…

PFUHLIR · August 21, 2021, 2:54pm

Flexibility is an important principle that serves many ends, but at a decrease or loss of interoperability, which may or may not be important. Minimum standards can help keep the flexible approaches from diverging too widely. Agreement by research funding institutions, both nationally and internationally, can establish those minimum standards. The research community has developed minimum standards in some areas, mostly S&T, but not in others.

JuttaBuschbom · August 23, 2021, 5:54am

@PFUHLIR Welcome to the conversation.
The categories, that is, bottom-up approaches (licenses, use agreements and data protection specifications) and the top-down approaches (“permits”, local to multilateral government regulations, privacy protection; but I don’t recall any concrete regulations for the protection of specific sensitive biodiversity data) were covered by us in the previous discussions. Specifications set in place by IPLCs, cp. CARE-principles and LocalContexts-labels, are part of the top-down category from my point of view.

Institutional consortia, in my understanding, don’t form a third category in between. They act in a dual role: as primary data producers (publicly or privately funded; bottom-up) as well as mediators and/or aggregators of legal and ethical requirements (cp. RightsStatements approach, additional/updated data protection specifications).

JuttaBuschbom · August 23, 2021, 6:48am

I don’t see the costs of flexibility in terms of interoperability as so extensive, since flexibility for me in this context is mostly a matter of combinatorics and of well-defined, standardized vocabularies. My impression is that independent developments are currently quite actively underway, which can provide the building blocks for rights-based vocabularies. Legal regulations will provide their own set of terms. There is the hope that despite cultural diversity, distinct jurisdictions and the wide range of use-contexts, those terms will over time tend towards harmonization to improve interoperability, re-use and thus benefit-sharing.

A minimum standard is an interesting perspective, thank you for having brought this point to the discussion. I am not sure in which direction to think. One possibility is to follow the Rights Statements approach and simply state if any specifications apply (likely always yes), if they are known and which category they belong to.

The alternative that I see is that a minimum standard still consists of two dozen or more “fields”. There really doesn’t seem to be a lot that you can skip. Here, I am more worried about user-friendliness and see one solution to this in a well-designed system of defaults, vocabularies and user-defined pre-settings.

Legal, ethical and data protection specifications form a huge and diverse field of information. Developing a standard, vocabularies and a user-friendly system, as well as implementing them, is not going to be trivial. However, there are infrastructures and model-based software tools that do a good job of guiding the user through complex landscapes of entries, settings and choices.
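To illustrate how defaults, controlled vocabularies and user-defined pre-settings could keep a record with many fields manageable, here is a minimal sketch. The field names, vocabulary terms and the `HERBARIUM_PRESET` are all invented for this example and do not come from RightsStatements.org or any existing standard.

```python
from dataclasses import dataclass, replace

# Hypothetical controlled vocabularies; a real standard would define these terms.
RESTRICTION_STATUS = {"none", "applies", "unknown"}
RESTRICTION_CATEGORY = {"legal", "ethical", "data-protection", "unspecified"}


@dataclass(frozen=True)
class RightsRecord:
    """A small rights record in which every field has a conservative default."""
    restriction_status: str = "unknown"
    category: str = "unspecified"
    license: str = "unspecified"
    attribution_required: bool = False

    def __post_init__(self) -> None:
        # Reject values that are not part of the controlled vocabularies.
        if self.restriction_status not in RESTRICTION_STATUS:
            raise ValueError(f"unknown restriction status: {self.restriction_status}")
        if self.category not in RESTRICTION_CATEGORY:
            raise ValueError(f"unknown restriction category: {self.category}")


# A user-defined pre-setting: an institution fills in its usual values once...
HERBARIUM_PRESET = RightsRecord(restriction_status="none", license="CC0")

# ...and overrides only the fields that differ for a particular specimen.
sensitive_specimen = replace(
    HERBARIUM_PRESET, restriction_status="applies", category="ethical"
)
```

The user only touches the fields that deviate from the preset, while the vocabulary check keeps the entered values interoperable.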

PFUHLIR · August 23, 2021, 8:19pm

Thanks, Jutta, I agree generally with these points. In thinking further about this, I suggest borrowing the construction of the EC’s research data principle on openness, “As open as possible; as closed as necessary”, to form a corollary: “As flexible as possible; as standardized as necessary”.
