Assessing FAIRness of Biodiversity Data through Badges and Download Buttons

What better way to start a new year than with a badge and a download button to show off the FAIRness of your favorite biodiversity datasets?

An example of a FAIR badge for UCSB-IZC rendered by https://linker.bio/badge/10.15468/w6hvhv is shown below.

How does this work? For more details, see the text below or https://linker.bio/#use-case-4-assessing-fairness-of-biodiversity-data

Wishing you an inspired 2024!

-jorrit

Use Case 4: Assessing FAIRness of Biodiversity Data

As a way to promote the mobility and usability of digital data, the FAIR principles [1] have gained traction in the science community. In order for data to be FAIR, they have to be “Findable”, “Accessible”, “Interoperable”, and “Reusable.” But what exactly does it mean to be FAIR? Who determines whether data is FAIR?

Thousands of Darwin Core Archives [2] (DwC-A) containing valuable biodiversity data are published by Natural History Collections (e.g., the Field Museum, the Museum of Southwestern Biology), Community Science Initiatives (e.g., iNaturalist, eBird), and Taxonomic Authorities (e.g., Integrated Taxonomic Information System (ITIS), World Register of Marine Species (WoRMS)). To increase their reach, many of these archives are registered with the Global Biodiversity Information Facility (https://gbif.org), Integrated Digitized Biocollections (iDigBio) or Ocean Biodiversity Information System (OBIS).

Since 2018/2019[3], Preston processes have been tracking registered datasets in GBIF, iDigBio, and OBIS. Now, many years later, a wealth of data is available on which archives were registered with networks including, but not limited to, GBIF, iDigBio and OBIS. By sampling monthly, a detailed temporal record is kept of the origin and content of these archives. So, if an archive has left a trace in these registry records, the organization that published the archive can say that their data is FAIR. They are FAIR because the Preston tracking process was able to Find the archive in a registry, Access its associated content, show its Interoperability through its adoption of a recognized standard, DwC-A, and Reuse the archive by keeping versioned copies as proof of registration.

To make it easier to see whether an archive is FAIR according to the methods described above, you can get your FAIR assessment badge using:

https://linker.bio/badge/[your archive DOI/UUID/URL]

For instance, the University of California Santa Barbara’s Invertebrate Zoology Collection (UCSB-IZC) has registered the location of their archive (i.e., https://ecdysis.org/content/dwca/UCSB-IZC_DwC-A.zip) with iDigBio and GBIF. iDigBio assigned the UCSB-IZC the recordset UUID urn:uuid:65007e62-740c-4302-ba20-260fe68da291, and GBIF assigned both a DOI (i.e., ‘10.15468/w6hvhv’) and a UUID (i.e., urn:uuid:d6097f75-f99e-4c2a-b8a5-b0fc213ecbd0).

Now, the FAIRness of the UCSB-IZC archive can be visualized by visiting one of the following locations in a web browser:

  1. https://linker.bio/badge/https://ecdysis.org/content/dwca/UCSB-IZC_DwC-A.zip (by archive location)

  2. https://linker.bio/badge/urn:uuid:65007e62-740c-4302-ba20-260fe68da291 (by iDigBio RecordSet UUID)

  3. https://linker.bio/badge/10.15468/w6hvhv (by GBIF DOI)

  4. https://linker.bio/badge/urn:uuid:d6097f75-f99e-4c2a-b8a5-b0fc213ecbd0 (by GBIF Dataset UUID)
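The four locations above all follow the same pattern: the badge prefix followed by the identifier as-is. A minimal sketch of that pattern in Python, assuming (as the examples suggest) that no extra encoding of the identifier is needed; the function name `badge_url` is made up for illustration and is not part of linker.bio:

```python
# Prefix taken from the examples above.
BADGE_PREFIX = "https://linker.bio/badge/"


def badge_url(identifier: str) -> str:
    """Build a badge URL from a DOI, a UUID URN, or an archive URL.

    Assumption: the identifier is appended verbatim, as in the examples.
    """
    return BADGE_PREFIX + identifier.strip()


if __name__ == "__main__":
    # The four UCSB-IZC identifiers mentioned in the text.
    identifiers = [
        "https://ecdysis.org/content/dwca/UCSB-IZC_DwC-A.zip",
        "urn:uuid:65007e62-740c-4302-ba20-260fe68da291",
        "10.15468/w6hvhv",
        "urn:uuid:d6097f75-f99e-4c2a-b8a5-b0fc213ecbd0",
    ]
    for identifier in identifiers:
        print(badge_url(identifier))
```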

If an archive reference (by location, UUID, DOI) is associated with a tracked DwC-A, a download badge is generated for a recently tracked versioned copy of the FAIR archive. If an archive reference cannot be resolved in the corpus of tracked biodiversity archives, a 404 unknown archive badge is generated. With this, an independent FAIR assessment badge service is available: the service is independent of the publisher (UCSB-IZC) and the registries (iDigBio, GBIF). These badges may be used by institutions to show off their commitment to FAIRness, or by registries to show that they contribute to the findability of existing data archives.


Note that the tracked corpus itself can be cloned, copied, and verified. This means that others can implement FAIR assessment services (or any other kind of service using the biodiversity data archives) on the verifiably exact same tracked corpus as the one that https://linker.bio uses.
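To illustrate what verifying a copy could look like, here is a minimal sketch in Python. It assumes that content in the tracked corpus is identified by the SHA-256 digest of its bytes, written in the hash://sha256/... form that appears in reference [3]; the function names are hypothetical and this is not Preston's own code.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hash://sha256/... identifier for the given bytes.

    Assumption: the identifier is the lowercase hex SHA-256 digest of the
    content, prefixed with "hash://sha256/".
    """
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def is_same_content(data: bytes, expected_id: str) -> bool:
    """Check whether a local copy matches an expected content identifier."""
    return content_id(data) == expected_id


if __name__ == "__main__":
    # Example with made-up content; a real check would read an archive file.
    copy = b"example archive bytes"
    print(content_id(copy))
    print(is_same_content(copy, content_id(copy)))
```

With a check like this, two parties holding copies of the same archive version can confirm they have identical bytes without trusting each other or the original location.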

If you’d like to learn more about how this service works, please read through the history of the feature or contact the author of this document.

Please note that this FAIR assessment feature was heavily influenced by the WorldFAIR project report by Trekels et al. 2023 [4].


  1. Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1).

  2. “Darwin Core is a standard […] intended to facilitate the sharing of information about biological diversity […]” - https://dwc.tdwg.org/ accessed at 2024-01-03

  3. Poelen, J. H. (2023). A biodiversity dataset graph: GBIF, iDigBio, BioCASe hash://sha256/450deb8ed9092ac9b2f0f31d3dcf4e2b9be003c460df63dd6463d252bff37b55 hash://md5/898a9c02bedccaea5434ee4c6d64b7a2 (0.0.4) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7651831

  4. Trekels, Maarten, Debora Pignatari Drucker, José Augusto Salim, Jeff Ollerton, Jorrit Poelen, Filipi Miranda Soares, Max Rünzel, Muo Kasina, Quentin Groom, and Mariano Devoto. 2023. “WorldFAIR Project (D10.1) Agriculture-related pollinator data standards use cases report.” Zenodo.


Oh interesting! How can I add it to our collection's public profile? Ecdysis Portal University of California Santa Barbara Invertebrate Zoology Collection


@seltmann thanks for asking how to add a badge to your collection page.

in markdown, you should be able to make a clickable badge with something like:

[![](https://linker.bio/badge/10.15468/w6hvhv)](https://linker.bio/10.15468/w6hvhv)

in html, the markup would look something like:

<a href="https://linker.bio/10.15468/w6hvhv" target="_blank">
 <img src="https://linker.bio/badge/10.15468/w6hvhv"/>
</a>

with the badge showing up like:

Note that you can use any of the identifiers associated with your collection in the badge URL: the GBIF dataset UUID, the GBIF dataset DOI (10.15468/w6hvhv, as in the example above), the iDigBio RecordSet UUID, or the DwC-A endpoint URL.
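If you want to generate this markup for several collections, a small script can fill in the identifier. The sketch below is in Python and only reproduces the markdown and html shown above; the function names are made up for illustration, and it assumes that the link target https://linker.bio/[identifier] works for UUIDs and URLs the same way it does for the DOI in the example.

```python
from html import escape

LINKER = "https://linker.bio/"


def badge_markdown(identifier: str) -> str:
    """Clickable badge in markdown, same shape as the example above."""
    return f"[![]({LINKER}badge/{identifier})]({LINKER}{identifier})"


def badge_html(identifier: str) -> str:
    """Clickable badge in html, same shape as the example above."""
    # Escape attribute values so identifiers containing & or quotes stay valid.
    link = escape(LINKER + identifier, quote=True)
    image = escape(LINKER + "badge/" + identifier, quote=True)
    return f'<a href="{link}" target="_blank">\n <img src="{image}"/>\n</a>'


if __name__ == "__main__":
    doi = "10.15468/w6hvhv"  # GBIF DOI of UCSB-IZC, from the example above
    print(badge_markdown(doi))
    print(badge_html(doi))
```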

Curious to hear how you’ll end up embedding this badge in your page.

I added the badge to our metadata here: Ecdysis Portal University of California Santa Barbara Invertebrate Zoology Collection. I exported the metadata to our GBIF publisher profile: University of California Santa Barbara Invertebrate Zoology Collection

Badges do not appear in the GBIF publishers profile even if they are included in the metadata (I included via html). I wish they did!


@seltmann I was able to see the badge on your collection page at https://ecdysis.org/collections/misc/collprofiles.php?collid=38 (see below). Yay! Fun to see these badges being re-used.

Note that these badges can be automatically generated from the metadata already present for your collection, just like the GBIF citations and Bionomia collectors & determiners badges. Also, I wonder why the badges get stripped from the GBIF collection publishers profile. Maybe there’s another way to add them? Perhaps @markus knows . . .


Thanks Jorrit
Love badges
Not our favourite ones I fear. I’ve only tried 4 of the datasets at Dipterists Forum | NBN Atlas and the best I can achieve for the FAIR badge using the DOI number listed on their GBIF links is a green “unknown”. I did think this might be due to license type (for some we use NC so that contributors to schemes don’t think they’re being exploited by commerce) but it’s returning the same “unknown” regardless of license.
Any ideas about how we might get our GBIF datasets checked for FAIRness would be appreciated, or perhaps a different link structure.
Example https://linker.bio/badge/10.15468/mwjnku for the 4th item in that list

Hi @Darwyn - Good to hear that you like badges . . . and thanks for sharing the example related to the GBIF registered dataset with associated identifiers:

and citation

Dipterists Forum (2023). Dipterists Forum - Recording Scheme - Stilt & Stalk Flies. Occurrence dataset https://doi.org/10.15468/mwjnku

According to linker.bio records, the DOI/UUID/URLs are known, but no content has been associated with them.

And, as I visited https://registry.nbnatlas.org/archives/dr940/dr940.zip - I noticed a “403 Forbidden” page. Any idea why the content associated with the Stilt & Stalk Flies dataset is not accessible?

In other words, the Access to the data could not be independently verified. However, identifiers associated with the dataset are known. So, as far as FAIR goes, linker.bio didn’t get past the F (findable).

Thanks for being patient as I am trying to understand what is going on here.

@Darwyn inspired by your comment and example, a third badge type was introduced. For more details, see “introduce no-access badge to indicate known DwC-A registration but non accessible content”, Issue #273 in bio-guoda/preston on GitHub.

So now when content associated with known identifiers (doi, uuid, url) could not be accessed the following badge is generated:

[Screenshot from 2024-01-10 13-17-15: the no-access badge]

So, the no-access badge is now shown for the dataset with DOI 10.15468/mwjnku, because the dataset was known to GBIF, but the content location (or URL) did not produce any content for some reason.
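The three badge outcomes discussed in this thread can be summarized as a small decision rule. The sketch below is in Python and is only an illustration of that rule: the names `Badge` and `badge_for` and the two boolean inputs are hypothetical, and it does not reflect how linker.bio or Preston implement it.

```python
from enum import Enum


class Badge(Enum):
    FAIR = "fair"            # identifier known and content was retrieved
    NO_ACCESS = "no-access"  # identifier known, content could not be accessed
    UNKNOWN = "unknown"      # identifier not found in the tracked corpus


def badge_for(identifier_known: bool, content_retrieved: bool) -> Badge:
    """Pick a badge type from what the tracking process observed."""
    if not identifier_known:
        return Badge.UNKNOWN
    if not content_retrieved:
        return Badge.NO_ACCESS
    return Badge.FAIR


if __name__ == "__main__":
    # Known DOI whose archive location returned no content.
    print(badge_for(identifier_known=True, content_retrieved=False).value)
```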

Thanks again for sharing your findings.

Thanks Jorrit
Neither of those were my intention. In fact the DwC dataset was converted by my UK GBIF node (NBN Atlas) from spreadsheet datasets and other online systems here. Not an area of my expertise.
I can see how that might be a more informative return under some circumstances but hopefully the NBN Atlas team will respond shortly with undertakings to help fix the faults however they arose. There are other UK Open Data datasets similarly affected.


@Darwyn thank you for providing some more context. I am curious to see how this will develop and how/if I can contribute.
