Statistics and graphics for Nodes

The document that was used in the video with all the links for the examples:

  1. Country page presentation and reminder: Ecuador

  2. https://www.gbif.org/occurrence/charts?publishing_country=EC&advanced=1&occurrence_status=present Another way of viewing data from a country by way of the occurrence search – highlight custom filters.

    1. Not an official feature – but a neat trick to make custom charts:

      i. Send a list of the datasets whose occurrences appear in a search:
      a. Replace “search” in the URL with “datasets”: Search
      b. Make a custom chart with the parameter “d=datasetKey”: Search

      ii. Make and send interactive custom charts:
      The parameters are actually hidden in the URLs, which means that you can send a link to someone, but the parameters won’t be visible on their web browser once they click on the link. The charts have three possible parameters:
      - d: the first dimension (datasetKey, country, month, speciesKey, etc.)
      - d2: the second dimension
      - t: the type of chart (TABLE|COLUMN|PIE|LINE)

       You can combine these together to create different charts, for example:
           - `https://www.gbif.org/occurrence/charts?country=EC&taxon_key=6`
           - `https://www.gbif.org/occurrence/charts?country=EC&taxon_key=6&d=speciesKey&d2=month&t=COLUMN`
           - `https://www.gbif.org/occurrence/charts?country=EC&taxon_key=6&d=basis_of_record&t=PIE`
      
  3. https://analytics-files.gbif.org/: Access data and work with it yourself → country → Ecuador → etc.

  4. Dataset search interface: https://www.gbif.org/dataset/search?q= (projects, datasets, download csv)

  5. Literature search interface: https://www.gbif.org/resource/search?contentType=literature (multiple datasets: Resources)

  6. Download activity report from publisher page: Marshall University Herbarium

  7. Download usage statistics by country: https://www.gbif.org/developer/occurrence

    Occurrence API facets for breakdown of numbers by taxon or area, etc.
    https://api.gbif.org/v1/occurrence/search?country=EC&taxonKey=1&facet=year&limit=0
    https://api.gbif.org/v1/occurrence/search?country=EC&taxonKey=1&facet=gadmLevel1Gid&limit=0&facetLimit=100

  8. Plausible · gbif.org (Google Analytics-like statistics)
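
The chart links (item 2) and the facet queries (item 7) above can also be put together in a script. This is only a minimal sketch: the function names are made up for illustration, and `facet_counts` assumes that the API response contains a `facets` list whose entries have a `field` name and a list of `counts`.

```python
from urllib.parse import urlencode

CHARTS_URL = "https://www.gbif.org/occurrence/charts"
API_SEARCH_URL = "https://api.gbif.org/v1/occurrence/search"


def chart_url(filters, d=None, d2=None, t=None):
    """Build a chart link from search filters plus the d, d2 and t parameters."""
    params = dict(filters)
    for name, value in (("d", d), ("d2", d2), ("t", t)):
        if value is not None:
            params[name] = value
    return f"{CHARTS_URL}?{urlencode(params)}"


def facet_url(filters, facet, facet_limit=None):
    """Build an occurrence API query that returns only facet counts (limit=0)."""
    params = dict(filters, facet=facet, limit=0)
    if facet_limit is not None:
        params["facetLimit"] = facet_limit
    return f"{API_SEARCH_URL}?{urlencode(params)}"


def facet_counts(response, field):
    """Extract {name: count} for one facet from a decoded search response.

    Assumed layout: {"facets": [{"field": ..., "counts": [{"name": ..., "count": ...}]}]}
    Field names are compared ignoring case and underscores.
    """
    wanted = field.replace("_", "").lower()
    for facet in response.get("facets", []):
        if facet.get("field", "").replace("_", "").lower() == wanted:
            return {c["name"]: c["count"] for c in facet.get("counts", [])}
    return {}


# Rebuild two of the example links from the list above.
print(chart_url({"country": "EC", "taxon_key": 6}, d="speciesKey", d2="month", t="COLUMN"))
print(facet_url({"country": "EC", "taxonKey": 1}, "year"))
```

Fetching the second URL and passing the decoded JSON to `facet_counts(response, "year")` should give the number of occurrences per year, provided the response has the layout assumed above.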

The questions and answers during the session

You mentioned aggregation of metrics by project, but there aren’t a lot of project pages on GBIF. How do you get a project registered so data can be searched by project identifier?

Project pages like this one are for projects that receive GBIF-mediated funds (for example, projects in the context of BID or BIFA). They allow us to keep track of the projects and aggregate metrics easily. We don’t make a project page for every project.
With that in mind, you can still use project identifiers. You can add a project identifier to a dataset in the projectID section of the dataset metadata. This will allow you to search occurrences, datasets and citations based on that project identifier, regardless of whether you have a project page. For example, the project identifier Boyaca_BIO isn’t associated with any project page. Yet you can find all the datasets associated with that identifier here, all the occurrences here and all the citations here.
In other words, you don’t need to have a project page in order to use a project identifier.
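
The same kind of search can be sketched against the API. The snippet below assumes that both the occurrence and the dataset search endpoints accept a `projectId` filter; please check the API documentation before relying on it. The function name is made up for illustration, and citations are left out because the matching literature parameter isn’t covered here.

```python
from urllib.parse import urlencode

API = "https://api.gbif.org/v1"


def project_search_urls(project_id):
    """Return API queries for the occurrences and datasets tagged with a project identifier.

    Assumption: both search endpoints accept a `projectId` filter.
    limit=0 asks for the total count only, without any records.
    """
    query = urlencode({"projectId": project_id, "limit": 0})
    return {
        "occurrences": f"{API}/occurrence/search?{query}",
        "datasets": f"{API}/dataset/search?{query}",
    }


for name, url in project_search_urls("Boyaca_BIO").items():
    print(name, url)
```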

Is it possible to download a list of species for a dataset?

Yes, you can click on the dataset occurrences, then click on the download tab and select the species list format. This works with any occurrence query.
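
If you prefer the download API over the web interface, the request could look roughly like the sketch below. It only builds the JSON body and doesn’t send anything. The user name, email address and dataset key are placeholders, and the body would have to be sent as an authenticated POST to the occurrence download endpoint described on the developer pages.

```python
import json


def species_list_request(creator, email, dataset_key):
    """Build the JSON body for a species list download of one dataset's occurrences."""
    return {
        "creator": creator,
        "notificationAddresses": [email],
        "sendNotification": True,
        "format": "SPECIES_LIST",
        # Any occurrence predicate can go here; this one filters on a single dataset.
        "predicate": {"type": "equals", "key": "DATASET_KEY", "value": dataset_key},
    }


# Placeholder values, to be replaced with your own account and dataset key.
body = species_list_request(
    "your_gbif_username",
    "you@example.org",
    "00000000-0000-0000-0000-000000000000",
)
print(json.dumps(body, indent=2))
```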

My question relates to the GBIF occurrence search using verbatim scientific names. If I understand correctly, the verbatim scientific name field contains the value as provided in the scientific name field by the publisher. I can see that there is a Darwin Core term named verbatimIdentification which seems very similar. What are the differences?

The verbatimScientificName field isn’t a Darwin Core term. We created it for the practical purpose of making the original name provided to GBIF searchable. This is especially useful when there is no match in the GBIF backbone taxonomy: the occurrences can still be found using the scientific name provided.
The verbatimIdentification field is used to share the original identification associated with the record (before any normalization or update of any name), so it might differ from the scientific name. We don’t match the content of the verbatimIdentification field to the GBIF backbone taxonomy.

Is there a way to capture relationships between publishers? For example, several publishers belonging to the same parent organisation? Should we just have the acronym of the parent organisation in the publisher title?

We don’t have any guidelines for this specific type of question. For practical purposes, keeping the acronym of the parent institution would be the easiest. We also have machineTags that could be used in some cases, but they might not suit your needs. For now, we can discuss it on an ad hoc basis. If you have specific cases, please send us an email at helpdesk@gbif.org.

If someone wants to apply to create a GBIF hosted portal, is it as simple as filling out the application and having the Node support? Is there anything else to keep in mind?

Yes. The main technical requirement is that the data has to be available and searchable on GBIF (for example all the occurrences in a geographic area or network). If the data isn’t searchable, a portal might be more difficult (or sometimes impossible depending on the request). Otherwise, the application process is as easy as it looks.

Will the extensions be downloadable in the download interface?

Yes, some occurrence extensions will be downloadable, either just in the download API or in both the API and the web interface. Note that this concerns only these extensions:

What about a trait and Plinian Core extension for a Taxon core?

It isn’t planned. We suggest logging a GitHub issue to let us know what would be of interest to the community.

How do publishers share trait data?

It depends on whether we are talking about traits and characteristics of a particular taxon or about measurements on specimens. See this thread for more discussion: https://discourse.gbif.org/t/publishing-trait-data-for-plants-is-it-possible-species-level-traits-from-literature-traits-measured-in-the-lab/

You showed the country pages with associated metrics. Is there an equivalent for a hosting organization?

There is no country page equivalent for non-country Nodes. In the case where the Node is also a hosting organization, you can go to the organization page and find metrics there. For example:

  • here are metrics for all the occurrences hosted by the same organization
  • here are all the datasets hosted by the same organization

If your Node isn’t also a hosting organization, it is a bit more challenging. We don’t have “endorsing node” statistics. You have to get the keys of all the publishers endorsed by your node and generate the statistics yourself.
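
One possible way to generate these statistics yourself is sketched below. It assumes the registry API lists the organizations endorsed by a node under `/node/{key}/organization` with the usual paging fields (`results`, `endOfRecords`), and that occurrence counts per publisher can be read from a `publishingOrg` search with `limit=0`. The function names are made up for illustration and the node key is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.gbif.org/v1"


def get_json(url):
    """Fetch a URL and decode the JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def endorsed_publisher_keys(node_key, fetch=get_json, page_size=100):
    """Page through the organizations endorsed by a node and collect their keys."""
    keys, offset = [], 0
    while True:
        query = urlencode({"limit": page_size, "offset": offset})
        page = fetch(f"{API}/node/{node_key}/organization?{query}")
        results = page.get("results", [])
        keys.extend(org["key"] for org in results)
        # Stop on the last page, or on an empty page to avoid looping forever.
        if page.get("endOfRecords", True) or not results:
            return keys
        offset += page_size


def occurrence_counts(publisher_keys, fetch=get_json):
    """Return {publisher key: occurrence count} using limit=0 searches."""
    counts = {}
    for key in publisher_keys:
        query = urlencode({"publishingOrg": key, "limit": 0})
        counts[key] = fetch(f"{API}/occurrence/search?{query}")["count"]
    return counts


# Example usage (needs network access; replace the placeholder with your node key):
# keys = endorsed_publisher_keys("<your-node-key>")
# print(occurrence_counts(keys))
```

The `fetch` argument is there only so that the paging logic can be tried out without calling the API. This makes one request per publisher, so it can take a while for nodes that endorse many publishers.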
