Asia Office Hours

cjk · March 17, 2022, 11:42am

Today @vijay.barve, @Lily, @melissa0520 and I started the Asia Office Hour as a casual space for BIFA teams to have LIVE Q&A sessions on (mainly) data publishing.

For our first session, I prepared a flowchart trying to answer the qualification part of the most frequently asked question: How to publish data with GBIF?

Yes, that question needs a four-day workshop to cover properly, but in the office hours we will fill the gaps, explain things in other words, and hope that some live interaction improves knowledge retention.

I shared this slide with participants today. I hope it’s self-explanatory! Please comment if you find anything incorrect or unclear, so I can revise it.

cjk · March 24, 2022, 9:13am

Thanks again to those who joined today! Special thanks to Choki from Bhutan for sharing the biodiversity information portal for Bhutan and its role in supporting domestic citizen science activities. We also learned how the underlying data have been published to GBIF.

The same software stack was also used for the India Biodiversity Portal. They look really nice indeed!

vijay.barve · April 1, 2022, 3:58am

In the recent edition of Asia Office Hours, we discussed:

Geocoding historical data where GPS coordinates are not available, and the need to assign coordinate uncertainty values carefully.
How Google Maps can be used to get locations and uncertainty.
What exactly occurrence status is, and how “absent” can be used.
Whether a publisher can use multiple IPTs to publish different datasets.

melissa0520 · April 3, 2022, 11:18am

Recommended IPTs for the Asia region if no IPT is hosted by your node or institution:

Asia regional IPT: https://cloud.gbif.org/asia/ (contact: asia_support@gbif.org)
TaiBIF IPT (Taiwan node): https://ipt.taibif.tw/ (contact: taibif.brcas@gmail.com)

melissa0520 · April 12, 2022, 9:13am

In the Asia Office Hour last week, we discussed:

The best way to georeference several locations at once

Geocoding API 1
Geocoding API 2

The best practices for using DwC terms
- remember to use “acceptedNameUsageID” to state which taxon you refer to, with the species taxon ID from the source you follow (e.g. Species2000, Catalogue of Life)
- recommended resolver for accepted species names: Global Names Resolver
- what to do if a specimen has no exact date: provide a date or year range rather than leaving the field blank, e.g. 2007-03/09, 2007-05-20/25, 1900/1909 (some time during the interval between the beginning of the year 1900 and the end of the year 1909)
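
As a minimal sketch of the date-range advice above, the Python snippet below builds interval strings of the kind shown in the examples. The function name `event_date_range` is hypothetical (it is not part of any GBIF tool), and dropping the shared leading parts from the end of the range simply mirrors the examples in this post.

```python
def event_date_range(start, end):
    """Build an interval string such as '2007-03/09' from two partial dates.

    start and end are tuples: (year,), (year, month) or (year, month, day).
    Both must have the same precision. Leading parts shared by both dates
    are dropped from the end part, as in the examples in the post.
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same precision")
    if start > end:
        raise ValueError("start must not be later than end")

    widths = (4, 2, 2)  # zero-padded widths for year, month, day

    def fmt(parts, part_widths):
        return "-".join(f"{p:0{w}d}" for p, w in zip(parts, part_widths))

    start_text = fmt(start, widths)
    if start == end:
        # A single date needs no interval notation.
        return start_text

    # Count how many leading components (year, then month) are identical.
    shared = 0
    while start[shared] == end[shared]:
        shared += 1

    end_text = fmt(end[shared:], widths[shared:])
    return f"{start_text}/{end_text}"


print(event_date_range((2007, 3), (2007, 9)))
print(event_date_range((2007, 5, 20), (2007, 5, 25)))
print(event_date_range((1900,), (1909,)))
```

The resulting strings could then be pasted into the date column of the spreadsheet before mapping it in the IPT.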

You are welcome to join us every Thursday, or keep following the posts here!

cjk · April 14, 2022, 9:25am

How to convert “day, month, year” columns into ISO 8601 yyyy-mm-dd format in Excel?
convertDateToISO
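
For those who prefer a script to Excel, here is a rough Python sketch of the same conversion. It is not the method from the attachment above; the function name `to_iso_date` is made up for illustration, and falling back to `yyyy-mm` or `yyyy` when the day or month is missing is my own assumption about how partial dates should be handled.

```python
from datetime import date


def to_iso_date(year, month=None, day=None):
    """Combine separate year, month and day values into an ISO 8601 string.

    Values may be ints or numeric strings, as they often are when read from
    a spreadsheet. A missing day or month gives a shorter 'yyyy-mm' or 'yyyy'.
    """
    year = int(year)
    if month is None:
        return f"{year:04d}"
    month = int(month)
    if day is None:
        if not 1 <= month <= 12:
            raise ValueError(f"month out of range: {month}")
        return f"{year:04d}-{month:02d}"
    # date() validates the combination, e.g. it rejects 30 February.
    return date(year, month, int(day)).isoformat()


print(to_iso_date(2022, 4, 14))
print(to_iso_date("2007", "3"))
print(to_iso_date(1900))
```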

cjk · May 5, 2022, 8:17am

Thanks to Dr Lu, Dr Pujary and Riya for joining us today! We enjoyed having the opportunity to elaborate on the structure of the Darwin Core Archive, the connection between GBIF data and citizen science activities, the value of having a GBIF data publishing badge, and how to meet the data publishing requirement for the BIFA midterm report. It was also good to explain why GBIF only allows institutions to be data publishers. Hope you find it useful, too!

melissa0520 · May 5, 2022, 8:25am

Since there are more and more DNA barcoding data in biodiversity research, here are some suggestions for DNA-related data publishing:

Use this term if the sequence is already published to NCBI: http://rs.tdwg.org/dwc/terms/associatedSequences
eDNA is handled differently; you can use an extension to provide the data:
https://rs.gbif.org/extensions.html
DNA derived data extension: https://rs.gbif.org/extension/gbif/1.0/dna_derived_data_2022-02-23.xml
DNA data publishing guideline: https://docs.gbif.org/publishing-dna-derived-data/1.0/en/publishing-dna-derived-data-through-biodiversity-data-platforms.en.pdf

cjk · May 19, 2022, 12:03pm

Thanks to the participants today. It’s great to learn about the efforts of the Botanical Survey of India, which has had its herbarium resources catalogued. NGCPR is another CSR contribution from India that awaits further engagement with GBIF data publishing and domestic conservation experts. We hope to hear more from them.

For those who are busy preparing the first dataset for the midterm: if you can’t work out why certain fields are not showing up, we have a hint for you. On your dataset page at GBIF.org, look for the “DOWNLOAD” tab near the title, and choose “GBIF annotated archive”:

And then once you’re logged in, choose “DARWIN CORE ARCHIVE” and wait for the notification when it’s ready.

What you will download is what GBIF.org sees as the result of your data publication. By examining the text files, you may be able to spot what went wrong. A typical example is a wrong mapping, which shows up when the header in the CSV file doesn’t match the values in the same column.
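
If opening the files by hand is tedious, a small script can pair each header with the first value in its column, which makes a wrong mapping easier to spot. This is only a sketch: the function name `preview_archive` is hypothetical, and it assumes the text files in the zip are tab-delimited UTF-8 with a header row. Plain-text files that are not tables will show up in the result too and can be ignored.

```python
import csv
import io
import zipfile


def preview_archive(archive):
    """Pair each column header with the first value found under it.

    `archive` is a path to (or a file-like object holding) the downloaded zip.
    Returns {file name: [(header, first value), ...]} for every .txt file.
    Assumes tab-delimited UTF-8 text with a header row.
    """
    preview = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".txt"):
                continue  # skip XML metadata and anything else
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
                header = next(reader, [])
                first = next(reader, [])
                # Pad a short first row so every header gets a value.
                first += [""] * (len(header) - len(first))
                preview[name] = list(zip(header, first))
    return preview
```

For example, with a placeholder path, `for column, value in preview_archive("my-download.zip")["occurrence.txt"]: print(column, "=", value)` would list the columns of one file, assuming the archive contains a file with that name.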

But don’t forget: depending on the scale of the potential issue, it is sometimes still easier to see the comparison by examining individual occurrence records. There you have the “Interpreted” vs. “Original” views to understand how your dataset has been, eh, interpreted.

daphnehoh · June 15, 2022, 7:21am

I am interested in this discussion!
Are there any summary points for this discussion that I can read or refer to?

melissa0520 · June 16, 2022, 4:41am

In the Asia office hour last week, we introduced the basic concepts of data mobilization and clarified the differences between a “GBIF data publisher” and an “IPT individual account”. Here is the summary:

GBIF data publisher status is only available to organizations, not to individuals.
You need to apply for an IPT account to manage and publish your datasets. An IPT account makes you a user on that IPT.
Whenever you publish a new dataset to GBIF, you need to choose a GBIF data publisher from the organization list in the IPT metadata section.
To become a GBIF data publisher, please register at https://www.gbif.org/become-a-publisher.

You are welcome to join the next Asia office hour!

system · July 16, 2022, 2:41pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.

cjk · July 21, 2022, 9:42am

During a recent conversation with a GBIF data publisher, I noticed that he said “…we regularly upload our data to GBIF…”. That reminded me of a common misconception among publishers newly engaged with GBIF data publishing.

The fact is, in the GBIF data publishing model, we don’t upload data to GBIF. Instead, we make data public and register it with GBIF; in other words, we publish data through GBIF.

GBIF implements a distributed model for data hosting. In this model, data publishers are required to keep their data online and thus assume part of the responsibility for keeping the infrastructure up. This also allows organizations to have full control over their data. Data publishers who have limited capacity to keep data online can have their data hosted on technical installations (e.g. IPTs) operated by other capable organizations.

When using other services to publish data, one may indeed be required to upload a copy to a designated server or website. However, in the GBIF model, data publishers only upload their data to technical installations, i.e. IPTs, and each installation is fully managed by its host, not by GBIF.

This slight difference in wording highlights the ownership and responsibility of data publishers, especially since “upload” is often interpreted as “hand over” or “give”. That mindset may have contributed to the fact that we have many orphan datasets in GBIF today.

Of course, people can always use “upload” loosely, as long as we take care of the data we have published by keeping contact information up to date and responding to quality inquiries.

cjk · October 10, 2022, 2:30pm

As it has never appeared in this thread, I am posting the time of the Asia Office Hours so whoever drops by this thread knows where and when to join.

We run the session for an hour every Thursday at:
16:00 Tokyo/Seoul (GMT+9)
15:00 Manila/Taipei/Singapore (GMT+8)
14:00 Hanoi/Phnom Penh/Jakarta (GMT+7)
12:45 Kathmandu (GMT+5:45)
12:30 Mumbai (GMT+5:30)
12:00 Islamabad (GMT+5:00)

Zoom link: Launch Meeting - Zoom

All are welcome to come and chat, even just to say hello!

cjk · October 15, 2022, 3:22am

How to convert Degree-Minute-Second coordinates to the decimal format required by Darwin Core? Many ask this question, and depending on the volume of your dataset, there are solutions using different tools. In most cases, Excel should handle the task fine, as long as all values in a column share a consistent format and you don’t get lost in repeating the steps.

Essentially, one uses the degree (°), minute ('), second (") and space ( ) characters as delimiters to split the values and notations into their own columns, then uses the values to calculate the decimal value with this formula:

Decimal = Degree + Minute ÷ 60 + Second ÷ 3600

There is a nice step-by-step blog post online that describes the how-to.

Have a look! And hope this is helpful.
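
For larger datasets, the same formula can be sketched in Python. This is only an illustration: the function name `dms_to_decimal` is made up, the accepted input pattern is an assumption about how the values are written, and making South and West negative goes beyond the formula above, although it is the usual convention for decimal coordinates.

```python
import re

# Degrees are required; minutes, seconds and the hemisphere letter are optional.
_DMS = re.compile(
    r"""^\s*(?P<deg>\d+(?:\.\d+)?)\s*°\s*
        (?:(?P<min>\d+(?:\.\d+)?)\s*['′]\s*)?
        (?:(?P<sec>\d+(?:\.\d+)?)\s*["″]\s*)?
        (?P<hem>[NSEWnsew])?\s*$""",
    re.VERBOSE,
)


def dms_to_decimal(text):
    """Convert a string such as 27°28'30"N to decimal degrees."""
    match = _DMS.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees = float(match["deg"])
    minutes = float(match["min"] or 0)
    seconds = float(match["sec"] or 0)
    # Decimal = Degree + Minute / 60 + Second / 3600
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in decimal notation.
    if (match["hem"] or "").upper() in ("S", "W"):
        value = -value
    return round(value, 6)


print(dms_to_decimal("27°28'30\"N"))
print(dms_to_decimal("89° 38' 15\" E"))
print(dms_to_decimal("33°52'S"))
```

Values that do not match the expected pattern raise an error instead of being converted silently, so inconsistent rows can be fixed by hand.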
