GBIF attempts to improve identifier stability by monitoring changes of occurrenceIDs - GBIF Data Blog

Since 2022, GBIF has been monitoring changes of occurrenceIDs in datasets to improve the stability of GBIF identifiers. We pause data ingestion when we detect that more than half of the occurrence records in the latest version of a dataset have occurrenceIDs that differ from those in the previous version (the one live on GBIF.org). This identifier validation process automatically creates an issue on GitHub, and the GBIF helpdesk contacts the publisher to verify the changes of occurrenceIDs.
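
The "more than half" rule can be illustrated with a minimal sketch. It assumes occurrenceIDs are compared as plain strings and that a record counts as changed when its occurrenceID does not appear in the previous version. The function names and the `threshold` parameter are hypothetical; this is not GBIF's actual pipeline code.

```python
from typing import Iterable


def fraction_changed(previous_ids: Iterable[str], latest_ids: Iterable[str]) -> float:
    """Share of records in the latest version whose occurrenceID
    is absent from the previous (live) version."""
    previous = set(previous_ids)
    latest = list(latest_ids)
    if not latest:
        # Nothing to ingest, so nothing has changed.
        return 0.0
    changed = sum(1 for occurrence_id in latest if occurrence_id not in previous)
    return changed / len(latest)


def should_pause_ingestion(
    previous_ids: Iterable[str],
    latest_ids: Iterable[str],
    threshold: float = 0.5,  # assumed from "more than half" in the post
) -> bool:
    """Return True when strictly more than `threshold` of the latest
    records carry occurrenceIDs not seen in the previous version."""
    return fraction_changed(previous_ids, latest_ids) > threshold


if __name__ == "__main__":
    live = ["occ-1", "occ-2", "occ-3", "occ-4"]
    # Three of four IDs are new, so ingestion would be paused.
    print(should_pause_ingestion(live, ["occ-1", "new-2", "new-3", "new-4"]))
    # Exactly half changed is not "more than half", so ingestion continues.
    print(should_pause_ingestion(live, ["occ-1", "occ-2", "new-3", "new-4"]))
```

In this sketch the comparison is strict, so a dataset where exactly half of the occurrenceIDs changed is not paused. Whether the real check treats that boundary the same way is not stated in the post.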


This is a companion discussion topic for the original entry at https://data-blog.gbif.org/post/improve-identifier-stability

Thank you very much @mgrosjean @Kumiko, really informative and useful. A small question: are data blog posts available for translation in Crowdin? If they are not, could we translate the first part of this blog post and publish it on our web page (with full credit to Kumiko, of course)? We will try to share this information with our network in 2024 and make it more widely known.

Great work everyone on the new routine and the description of what actions are taken under what conditions. And thanks for the shout-out to Bionomia and its users, who stand to benefit from the increased stability in occurrenceIDs you're able to foster. This is a ton of work with a lot of back-and-forth communication with data publishers. Let's hope the volume of issues continues to show signs of attenuation as data publishers embrace the importance of stable occurrenceIDs for those who use their data, for repeatability in science, and for their own tracking purposes.

Thank you for your interest in sharing this! The data blog is not in Crowdin. Please feel free to use this blog post. You can translate and publish the materials on your website. Thank you for mentioning the credits.

Excellent!

There seems to be a formatting issue in the section "Three options to deal with identifier issues". I believe the following is intended to be a table?

Number Option Who can do this What happens after 1 Resume the data ingestion by allowing changes of occurrenceIDs GBIF helpdesk GBIF identifiers under old occurrenceIDs will be deprecated and new GBIF identifiers will be given for new occurrenceIDs. 2 Change back …

Oh, I see I can also read it at "GBIF attempts to improve identifier stability by monitoring changes of occurrenceIDs - GBIF Data Blog", which has the correct formatting.