I am trying to identify citizen science datasets in GBIF and use their occurrences. Apart from taking the list of datasets in Chandler et al. 2016 and selecting those, I do not know of a consistent way to download citizen science occurrences from GBIF.
I have seen some posts that filter on HUMAN_OBSERVATION, but those records do not necessarily come from a citizen science program, right?
@pserra, GBIF harvests datasets that are built with Darwin Core (DwC) categories, and there is no DwC category for type-of-publisher, like “citizen-science project”. However, processed GBIF datasets also have a non-DwC field called “publisher”. In the 21st-century Amphibia dataset I wrote about, this field has 166 entries with publisher names like “iNaturalist.org” (635159 records), “Observation.org” (199000) and “Froglife” (2769). You can select citizen-science projects from these names with either a look-up table of projects or a bit of googling. How you do the selecting will depend on your data-managing software. Email me if you’d like command-line suggestions.
Unfortunately, that’s not a tight solution, because some publishers of citizen-science records are institutions (museums etc.) that run nature observation programs. To be thorough, you would need to check every “publisher” entry in a HUMAN_OBSERVATION set of occurrence records!
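For anyone who prefers a script, here is a minimal Python sketch of the look-up-table approach. It assumes a tab-delimited occurrence table with columns named “publisher” and “basisOfRecord”; check the column names in your own download. The sample rows and the “Example Museum” publisher are made up for illustration, and the look-up set is a hypothetical starting point that you would need to build and verify yourself.

```python
import csv
import io
from collections import Counter

# Hypothetical look-up table: publisher names you have checked by hand
# and decided to treat as citizen-science projects.
CITIZEN_SCIENCE_PUBLISHERS = {"iNaturalist.org", "Observation.org", "Froglife"}


def count_publishers(rows):
    """Tally records per publisher, to get the list of names to review."""
    return Counter(row["publisher"] for row in rows)


def select_citizen_science(rows, publishers=CITIZEN_SCIENCE_PUBLISHERS):
    """Keep HUMAN_OBSERVATION records whose publisher is in the look-up table."""
    return [
        row
        for row in rows
        if row["basisOfRecord"] == "HUMAN_OBSERVATION"
        and row["publisher"] in publishers
    ]


# Made-up sample standing in for a real download. For a real file, use e.g.
# open("occurrence.txt", newline="", encoding="utf-8") (hypothetical file name).
SAMPLE = (
    "gbifID\tbasisOfRecord\tpublisher\n"
    "1\tHUMAN_OBSERVATION\tiNaturalist.org\n"
    "2\tHUMAN_OBSERVATION\tExample Museum\n"
    "3\tPRESERVED_SPECIMEN\tExample Museum\n"
    "4\tHUMAN_OBSERVATION\tFroglife\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
publisher_counts = count_publishers(rows)
selected = select_citizen_science(rows)

for name, n in publisher_counts.most_common():
    print(name, n)
print([row["gbifID"] for row in selected])
```

The tally from count_publishers is the part that needs human review: a publisher like the made-up museum here may or may not run an observation program, and the script cannot decide that for you.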
However, I do not know whether I am doing something wrong here, because when I download the dataset I get >85k datasets. When I look at some of the datasets, I see forest inventory datasets in there, which are not citizen science projects. Also, when I try to select a species, such as a tree for which we clearly have some records that are not citizen science, it does not seem to subset the dataset.
I just want to make sure I am not doing something wrong here. I understand this is based on machine tagging.