Update on the Global Registry of Scientific Collections - Technical Support Hour for Nodes

Join us for the next technical support hour for GBIF nodes on October 1 at 4:00 pm CEST for an update on the Global Registry of Scientific Collections (GRSciColl).

GRSciColl is a registry of physical scientific collections hosted by GBIF and maintained by the community. The data product team will briefly present GRSciColl, with an emphasis on its latest news and developments, and give advice on how best to help your communities share information about their collections.

We will be happy to answer any questions related to or not related to the topic. Please feel free to post questions in advance in this thread or write to helpdesk@gbif.org.

The video recording is available here: Technical support hour for GBIF Nodes on Vimeo

Here is a transcript of the Q&A:

How are object classification vocabularies (like “fishes”) chosen in GRSciColl, since some are not taxonomic groups in GBIF?

There are several ways of searching collections on GRSciColl.

One is by scientific names included in collection descriptors. Scientific names are indexed against the GBIF backbone taxonomy. For example, you can search for mammal collections: Data - GRSciColl

For the object classifications, we use a practical vocabulary based on the DiSSCo synthesis+ report, rather than strictly following taxonomic groups. This approach matches how collections are grouped and searched in practice, allowing for categories like “fishes” or “algae” that may not be monophyletic or taxonomic. See the vocabulary used here: GBIF Registry

In the new GRSciColl web interface, the button to change language is no longer visible, and the menu in the Spanish version isn’t visible either.

Thank you for raising the issue. We logged it and are working on it: Bugs in the new style. · Issue #239 · gbif/hp-grscicoll · GitHub.
If you see more issues, please report them here: GitHub · Where software is built. Thank you!

Would it be possible to clarify what the external sources for GRSciColl entries are, and how GBIF datasets and occurrences relate to GRSciColl?

GRSciColl and GBIF data interact in several ways.

  1. Occurrences published on GBIF with a collection or institution code or identifier are linked to GRSciColl (when the codes and identifiers match GRSciColl entries). This makes it possible to create metrics and dashboards. See this example: Data - GRSciColl.

  2. GRSciColl editors can choose to connect:

Here is an example where a collection entry is connected to this dataset. Note that if you click on the suggest button, the interface doesn’t allow you to update some of the fields. The information for those fields comes from the dataset metadata. When the dataset metadata is updated, the collection information is updated as well. The dataset metadata is the master source of the collection entry.

The idea with having external sources like Index Herbariorum and GBIF datasets is that you would only need to update the information once and it is propagated to GRSciColl.

You can find how to add or remove a source here.
You can read more on how GRSciColl is connected to other systems here: Connected systems - GRSciColl
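
To see which occurrences carry a given pair of codes (point 1 above), one option is to query the GBIF occurrence search with the `institutionCode` and `collectionCode` filters. The sketch below only builds the request URL; the codes in the usage note are made-up placeholders, and filtering by code strings is not the same thing as the GRSciColl link itself, which also relies on identifiers.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
OCCURRENCE_SEARCH_URL = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(institution_code, collection_code, limit=20):
    """Build an occurrence search URL filtered by institution and collection code."""
    params = {
        "institutionCode": institution_code,
        "collectionCode": collection_code,
        "limit": limit,
    }
    return f"{OCCURRENCE_SEARCH_URL}?{urlencode(params)}"
```

For example, `occurrence_search_url("ABC", "Herps")` produces a URL you can open in a browser or fetch with any HTTP client ("ABC" and "Herps" are hypothetical codes).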

What is Index Herbariorum?

Index Herbariorum (IH) is a worldwide index of herbaria and associated staff where plant and fungal specimens are permanently housed. It is hosted by the New York Botanical Garden and is the source of most herbarium collection records in GRSciColl.

How easy is it to update data on Index Herbariorum?

IH has an update suggestion system like GRSciColl. If you go to the editing page of an entry connected to IH, you will find the link to the corresponding IH entry at the top. On the collection page in IH, you can click on the edit button. You will then be able to suggest updates directly in a form. IH manually reviews these update suggestions before applying them.
There are usually a few weeks between an update being applied on IH and the update being available in GRSciColl.

You have created descriptors for algae here, but the same collection also contains fungi and lichens. Would it make sense to create descriptors for everything?

Yes, it would. So far, we have been creating descriptors in small increments, for example by making all algae collections or all marine collections searchable. This is easier to do and more manageable to review.

But it would also be great to do everything at once. We have been working with the Colombian node to import some of the National Registry of Collection data into GRSciColl as collection descriptors: Data - GRSciColl

For some countries with many collections that could potentially be updated, we didn’t go through the suggestion system, to avoid sending them many notifications. We will contact them and see what would work best for them.

Do collection descriptors go to Index Herbariorum?

Index Herbariorum has something called a “collection summary” which, in GRSciColl, is transformed into a collection descriptor table.
Nothing goes from GRSciColl back to Index Herbariorum.

For example, most of the collections in this search result: Data - GRSciColl have a collection summary from Index Herbariorum.

Is there a way for a database to be connected directly to GRSciColl?

Yes, if you would like to have automated updates pushed to GRSciColl, there are two ways to do so:

  1. If you have a public API that provides access to your data, we can set up your system as one of the possible sources of information for GRSciColl (like we do for GBIF datasets or IH). If that’s the case, please contact us and we will work together to set it up.
  2. If you don’t have such an API, then you can use the GRSciColl API to push updates. If you decide to do so, you need to be careful not to overwrite updates that might have occurred in the meantime, and to manage possible conflicts with other external sources. We can help you set up a script if you need help.
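
One way to avoid overwriting other people's edits when pushing updates (option 2) is a three-way comparison: keep a copy of the record as it was at your last sync, fetch the record as it is now, and only push fields that nobody else has changed since. The sketch below illustrates that idea on plain dictionaries. It does not call the GRSciColl API, the function name `plan_update` is hypothetical, and the field names in the example are placeholders.

```python
def plan_update(last_synced, current, local):
    """Split local changes into safe updates and conflicts.

    last_synced: the record as it was in the registry at your last sync
    current:     the record as it is in the registry now
    local:       the record as it is in your own database
    """
    updates, conflicts = {}, {}
    for field, new_value in local.items():
        registry_value = current.get(field)
        if new_value == registry_value:
            continue  # already up to date, nothing to push
        if registry_value != last_synced.get(field):
            # The field changed in the registry since the last sync:
            # do not overwrite it, flag it for manual review instead.
            conflicts[field] = (registry_value, new_value)
        else:
            updates[field] = new_value
    return updates, conflicts


# Placeholder records: "name" changed only locally, "email" changed on both sides.
last_synced = {"name": "Herbarium A", "email": "old@example.org"}
current = {"name": "Herbarium A", "email": "curator@example.org"}
local = {"name": "Herbarium A (renamed)", "email": "info@example.org"}

updates, conflicts = plan_update(last_synced, current, local)
```

Here only `updates` would be pushed, while the fields in `conflicts` would need a human decision.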

Note that you can also access data from GRSciColl programmatically by using the GRSciColl API. This is what the iDigBio collection portal does: https://portal.idigbio.org/portal/collections.
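
As a minimal sketch of reading from GRSciColl programmatically, the snippet below searches collections through the public GBIF API using only the Python standard library. It assumes the collection search endpoint at `https://api.gbif.org/v1/grscicoll/collection` with a free-text `q` parameter, and a paged response with a `results` list whose items carry `code` and `name`; please check the API documentation before relying on these details. The helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed GRSciColl collection search endpoint (see the GBIF API documentation).
GRSCICOLL_COLLECTION_URL = "https://api.gbif.org/v1/grscicoll/collection"


def build_search_url(query, limit=5, offset=0):
    """Build a collection search URL for a free-text query."""
    params = {"q": query, "limit": limit, "offset": offset}
    return f"{GRSCICOLL_COLLECTION_URL}?{urlencode(params)}"


def summarize(page):
    """Extract (code, name) pairs from one page of search results."""
    return [(c.get("code"), c.get("name")) for c in page.get("results", [])]


def search_collections(query, limit=5):
    """Fetch one page of matching collections (requires network access)."""
    with urlopen(build_search_url(query, limit), timeout=30) as response:
        return summarize(json.load(response))


# Example call, commented out because it needs network access:
# for code, name in search_collections("algae"):
#     print(code, name)
```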

Should we advise institutions to edit their data in Index Herbariorum or directly in GRSciColl?

Ideally, institutions should go through Index Herbariorum. You can always add more information to GRSciColl as collection descriptors.

What are GRSciColl reviewers supposed to do if they don’t have enough information to accept or discard a suggested descriptor?

If we sent the descriptor (based on the script mentioned in the presentation) and you cannot confirm the data, please discard the suggestion.

If someone working with the collection sent the descriptor, they likely have more knowledge about the collection. If you aren’t sure whether to apply the change, please contact the person who made the suggestion.

How should we address data mobilization for international resources? We have a publisher with a big database containing data spanning the whole world (including our country), but they aren’t based in our country. How should we proceed?

There is no formal process for this. A first step would be to register the dataset here: GitHub · Where software is built, and we will try to notify the relevant people. Ideally, the node of the country where the organization is headquartered would be the one working on mobilizing the data.
