Asia Office Hours

Today @vijay.barve, @Lily, @melissa0520 and I started the Asia Office Hour as a casual space for BIFA teams to have LIVE Q&A sessions on (mainly) data publishing.

For our first session, I prepared a flowchart trying to answer the qualification part of the most frequently asked question: How to publish data with GBIF?

Yes, that question needs a four-day workshop to cover properly, but in the office hours we will fill gaps and explain things in other words, and we hope some live interaction will improve knowledge retention.

I shared this slide with participants today. Hope it’s self-explanatory! Please comment if you find anything incorrect or unclear, so I can revise it.

8 Likes

Thanks again to those who joined today! Special thanks to Choki from Bhutan for sharing the biodiversity information portal for Bhutan and its role in supporting domestic citizen science activities. We also learned how the underlying data have been published to GBIF.

The same software stack was also used for the India Biodiversity Portal. They look really nice indeed!

4 Likes

In the recent edition of Asia Office Hours, we discussed:

  • Geocoding historical data where GPS coordinates are not available, and the need to assign coordinate uncertainty values carefully.
  • How Google Maps can be used to get locations and uncertainty.
  • What exactly occurrence status is and how “absent” can be used.
  • Whether a publisher can use multiple IPTs to publish different datasets.
4 Likes

Recommended IPTs for Asia region if there are no IPTs hosted in your node or institution:

3 Likes

At the Asia Office Hour last week, we discussed:

  • The best way to georeference several locations at once

Geocoding API 1
Geocoding API 2

  • The best practices for using DwC terms
    • Remember to add information in “acceptedNameUsageID” about which taxon and which species taxon ID you refer to (e.g. from Species 2000 or Catalogue of Life).
    • Recommended resolver for accepted species names: Global Names Resolver.
    • What to do if specimens have no exact date information: provide a date/year range rather than leaving the field blank, e.g. 2007-03/09, 2007-05-20/25, 1900/1909 (some time during the interval between the beginning of the year 1900 and the end of the year 1909).
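To make the range notation above concrete, here is a minimal Python sketch of how such intervals could be read as an earliest and a latest possible date. The function name `expand_interval` is hypothetical, and this is not GBIF's own parser. It assumes the start is written as YYYY, YYYY-MM or YYYY-MM-DD, and that an abbreviated end (one that does not begin with a four-digit year) inherits the missing leading parts from the start.

```python
import calendar
from datetime import date


def expand_interval(event_date):
    """Return (earliest, latest) dates covered by a value such as '2007-03/09'.

    Hypothetical helper for illustration only, not part of any GBIF tool.
    """
    start_text, _, end_text = event_date.strip().partition("/")
    start = [int(part) for part in start_text.split("-")]

    if end_text:
        end_parts = end_text.split("-")
        end = [int(part) for part in end_parts]
        # An end that does not begin with a 4-digit year is abbreviated:
        # it borrows the leading components (year, month) from the start.
        if len(end_parts[0]) != 4:
            end = start[: len(start) - len(end)] + end
    else:
        # No slash: a single date (or a single year/month) is its own range.
        end = list(start)

    # Missing month/day at the start means "from the beginning of".
    earliest = date(
        start[0],
        start[1] if len(start) > 1 else 1,
        start[2] if len(start) > 2 else 1,
    )

    # Missing month/day at the end means "until the end of".
    end_year = end[0]
    end_month = end[1] if len(end) > 1 else 12
    if len(end) > 2:
        end_day = end[2]
    else:
        end_day = calendar.monthrange(end_year, end_month)[1]
    latest = date(end_year, end_month, end_day)

    return earliest, latest


for value in ("2007-03/09", "2007-05-20/25", "1900/1909"):
    print(value, "->", expand_interval(value))
```

Reading 1900/1909 this way gives 1 January 1900 to 31 December 1909, which matches the explanation in the last bullet.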

You are welcome to join us every Thursday, or keep following the posts here!

2 Likes

How to convert “day, month, year” columns into ISO 8601 yyyy-mm-dd format in Excel?
convertDateToISO
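If you prefer a script over Excel, the same conversion can be sketched in Python. This is only an illustration under a few assumptions: the function name `to_iso_date` is hypothetical, the three values come from separate day, month and year columns, and empty cells mean the part is unknown, in which case a shorter ISO 8601 value (yyyy-mm or yyyy) is returned.

```python
from datetime import date


def _to_int(value):
    """Return an int for a cell value, or None when the cell is empty."""
    if value is None or str(value).strip() == "":
        return None
    # Spreadsheet exports sometimes give numbers as '5.0'.
    return int(float(str(value).strip()))


def to_iso_date(year, month=None, day=None):
    """Combine year, month and day cells into yyyy-mm-dd (or yyyy-mm, yyyy).

    Hypothetical helper for illustration; raises ValueError for
    impossible dates such as 30 February.
    """
    y, m, d = _to_int(year), _to_int(month), _to_int(day)
    if y is None:
        return ""
    if m is None:
        return f"{y:04d}"
    if d is None:
        date(y, m, 1)  # validates the month
        return f"{y:04d}-{m:02d}"
    return date(y, m, d).isoformat()


print(to_iso_date(2007, 3, 5))
print(to_iso_date("2007", "3", ""))
print(to_iso_date(1900))
```

Validating through `datetime.date` means typing errors in the source columns surface as errors instead of silently producing dates that do not exist.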

2 Likes

Thanks to Dr Lu, Dr Pujary and Riya for joining us today! We enjoyed having the opportunity to elaborate on the structure of the Darwin Core Archive, the connection of GBIF data with citizen science activities, the value of having a GBIF data publishing badge and how to meet the data publishing requirement for the BIFA midterm report. It was also good to explain why GBIF only allows institutions to be data publishers. Hope you find it useful, too!

2 Likes

Since there are more and more DNA barcoding data in biodiversity research, here are some suggestions for DNA-related data publishing:

1 Like

Thanks to the participants today. It’s great to learn about the efforts of the Botanical Survey of India, which has catalogued its herbarium resources. NGCPR is another CSR contribution from India that awaits further engagement with GBIF data publishing and domestic conservation experts. Hope we will hear more from them.

For those who are busy preparing the first dataset for the midterm: if you have no clue why certain fields are not showing up, we have a hint for you. On your dataset page at GBIF.org, look for the “DOWNLOAD” tab near the title, and choose “GBIF annotated archive”:

And then once you’re logged in, choose “DARWIN CORE ARCHIVE” and wait for the notification when it’s ready.

What you download is what GBIF.org sees as the result of your data publication. By examining the text files, you may be able to spot what went wrong, for example a wrong mapping, where the header in the CSV file doesn’t match the values in the same column.

But don’t forget: depending on the scale of the potential issue, sometimes it’s easier to see the comparison by examining individual occurrence records. There you have the “Interpreted” vs “Original” comparison to understand how your dataset has been, eh, interpreted.
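For checking the text files in the downloaded archive, a small script can list each header next to a few of its values, which makes a shifted or wrong mapping easier to spot. The sketch below is an illustration only: the function name `preview_columns` is hypothetical, and it assumes the archive is a zip file containing a tab-delimited file called `occurrence.txt`. Please check the actual file names in your own download.

```python
import csv
import io
import zipfile


def preview_columns(archive_path, member="occurrence.txt", rows=3):
    """Map each column header to its first few values.

    Hypothetical helper for illustration. `member` is an assumed file
    name; adjust it to match the contents of your archive.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            header = next(reader)
            # Take at most `rows` data rows after the header.
            samples = [row for _, row in zip(range(rows), reader)]
    return {
        name: [row[i] for row in samples if i < len(row)]
        for i, name in enumerate(header)
    }


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for column, values in preview_columns(sys.argv[1]).items():
            print(f"{column}: {values}")
```

If, say, a latitude column shows place names, the mapping in the IPT is the first thing to check.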

2 Likes

I am interested in this discussion!
Are there any summary points for this discussion that I can read or refer to?

3 Likes

At the Asia office hour last week, we introduced the basic concepts of data mobilization and clarified the differences between “GBIF data publisher” and “IPT individual account”. Here is the summary:

  • GBIF data publisher status is only available for organizations, not for individuals.
  • You need to apply for an IPT account to manage and publish your dataset. The IPT account makes you a user of that IPT.
  • Whenever you publish a new dataset to GBIF, you need to choose a GBIF data publisher from the organization list in the IPT metadata section.
  • To become a GBIF data publisher, please go to https://www.gbif.org/become-a-publisher to register.

You are welcome to join the Asia office hour later!

2 Likes


During a conversation with a GBIF data publisher recently, I noticed that he said “…we regularly upload our data to GBIF…”. That reminded me of a common misconception among publishers newly engaged with GBIF data publishing.

The fact is, in the GBIF data publishing model, we don’t upload data to GBIF. Instead, we make data public and register it with GBIF; in other words, we publish data through GBIF.

GBIF implements a distributed model for data hosting. In this model, data publishers are required to keep their data online and thus assume part of the responsibility for keeping the infrastructure up. This also allows organizations to have full control over their data. Data publishers who have limited capacity to keep data online can have their data hosted on technical installations (e.g. IPTs) operated by other capable organizations.

When using other services to publish data, one may indeed be required to upload a copy to a designated server or website. However, in the GBIF model, data publishers only upload their data to the technical installations, or IPTs, and each installation is fully managed by its host, not by GBIF.

This slight difference in wording highlights the ownership and responsibility of data publishers, especially since “upload” is often interpreted as “hand over” or “give”, a mindset that may have contributed to the fact that we have many orphan datasets in GBIF today.

Of course, people can always use “upload” loosely, as long as we take care of our published data by keeping contact information up to date and responding to quality inquiries.

6 Likes

As it never appears in this thread, I am posting the time of the Asia Office Hours so that whoever drops into this thread knows where and when to join.

We run the session for an hour every Thursday at:
16:00 Tokyo/Seoul (GMT+9)
15:00 Manila/Taipei/Singapore (GMT+8)
14:00 Hanoi/Phnom Penh/Jakarta (GMT+7)
12:45 Kathmandu (GMT+5:45)
12:30 Mumbai (GMT+5:30)
12:00 Islamabad (GMT+5:00)

Zoom link: Launch Meeting - Zoom

All are welcome to come and chat, even just to say hello!

3 Likes

How do you convert Degree-Minute-Second coordinates to the decimal format required by Darwin Core? Many ask this question, and depending on the volume of your dataset, there are solutions using different tools. In most cases, Excel should handle the task okay, as long as the format of all values in a column is consistent and you don’t get lost in repeating the steps.

Essentially, one uses the degree (°), minute ('), second (") and space ( ) characters as delimiters to separate values and notations into their own columns, then uses the values to calculate the decimal value following this formula:

Decimal = Degree + Minute ÷ 60 + Second ÷ 3600
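For larger datasets, the same formula can be applied with a short script instead of Excel. Below is a minimal Python sketch, with some assumptions of mine: the function name `dms_to_decimal` is hypothetical, each value ends with a hemisphere letter (N, S, E or W), and south and west are returned as negative numbers, following the usual convention for decimal degrees.

```python
def dms_to_decimal(text):
    """Convert a value such as 27°28'30" N to decimal degrees.

    Hypothetical helper for illustration. Minutes and seconds may be
    omitted; S and W hemispheres give negative results.
    """
    cleaned = text.strip()
    if not cleaned or cleaned[-1].upper() not in "NSEW":
        raise ValueError(f"missing hemisphere letter in {text!r}")
    hemisphere = cleaned[-1].upper()

    # Use the degree, minute and second symbols as delimiters.
    body = cleaned[:-1]
    for symbol in "°'\"′″":
        body = body.replace(symbol, " ")
    parts = [float(part) for part in body.split()]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"cannot read degrees/minutes/seconds from {text!r}")
    parts += [0.0] * (3 - len(parts))
    degrees, minutes, seconds = parts

    # Decimal = Degree + Minute / 60 + Second / 3600
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in "SW" else value


print(round(dms_to_decimal("27°28'30\" N"), 6))
print(round(dms_to_decimal("33°52' S"), 6))
```

The first example works out to about 27.475 (27 + 28 ÷ 60 + 30 ÷ 3600). As with Excel, this only works when all values in a column follow the same format.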

There is a nice step-by-step blog post online that describes the how-to.

Have a look! And hope this is helpful.

4 Likes