Finding gridded datasets - GBIF Data Blog

Gridded datasets are a known problem at GBIF. Many datasets consist of equally spaced points in a regular pattern. These datasets are usually systematic national surveys or data taken from some atlas (so-called “rasterized collection designs”).
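As a rough illustration of what "equally spaced points in a regular pattern" means in practice, here is a minimal sketch that checks whether the gaps between a dataset's unique latitudes and longitudes are dominated by a single spacing. It is not necessarily the method used in the blog post; the function names, the rounding precision and the 0.8 threshold are all assumptions chosen for the example.

```python
from collections import Counter


def dominant_spacing(values, ndigits=4):
    """Return (spacing, share) for the most common gap between sorted unique values."""
    unique = sorted({round(v, ndigits) for v in values})
    if len(unique) < 2:
        return None, 0.0
    # Gaps between neighbouring unique coordinates along one axis.
    gaps = [round(b - a, ndigits) for a, b in zip(unique, unique[1:])]
    spacing, count = Counter(gaps).most_common(1)[0]
    return spacing, count / len(gaps)


def looks_gridded(points, min_share=0.8):
    """Heuristic: True if one spacing dominates on both axes.

    `points` is an iterable of (latitude, longitude) pairs; `min_share`
    is an arbitrary threshold picked for this sketch.
    """
    points = list(points)
    _, lat_share = dominant_spacing([lat for lat, _ in points])
    _, lon_share = dominant_spacing([lon for _, lon in points])
    return lat_share >= min_share and lon_share >= min_share
```

A dataset with points every 0.1 degrees would pass this check, while irregularly scattered points would not. Real datasets are messier (grids in projected coordinates, missing cells), so a check like this is only a first filter.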


This is a companion discussion topic for the original entry at https://data-blog.gbif.org/post/finding-gridded-datasets/

Thanks for the post! Did you get a chance to check if any coordinate uncertainty or precision was specified for the occurrences in the gridded dataset? If yes, do you know how many of the gridded datasets use coordinateUncertaintyInMeters or coordinatePrecision? (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters)

No, I have not had a chance to look at coordinateUncertaintyInMeters.

footprintWKT would probably be more meaningful to look at for gridded datasets, though I actually did not know it existed before writing this blog post. https://terms.tdwg.org/wiki/dwc:footprintWKT

Unfortunately, I don’t think footprintWKT is used very often. Around 20M records have a non-NULL value in the footprintwkt column, whereas I estimated that at least 75M occurrence records are gridded. Also, a non-NULL footprintWKT does not necessarily mean the record is gridded to more than a trivial degree.

You can see that some gridded datasets do have a square footprintWKT.
Like this point from the EBCC Atlas of European Breeding Birds
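For anyone who wants to look for such footprints in a download, here is a minimal sketch that tests whether a footprintWKT value is a single-ring, axis-aligned rectangle in coordinate units. The helper names are hypothetical, the parser only handles plain 2D `POLYGON((...))` strings, and a real workflow would more likely use a proper WKT library.

```python
import re


def parse_simple_polygon(wkt):
    """Parse a single-ring 2D POLYGON WKT into a list of (x, y); None if unsupported."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", wkt, flags=re.IGNORECASE)
    if match is None:
        return None
    ring = []
    try:
        for pair in match.group(1).split(","):
            x, y = pair.split()
            ring.append((float(x), float(y)))
    except ValueError:
        # Not exactly two numeric ordinates per vertex (e.g. Z values).
        return None
    return ring


def is_axis_aligned_rectangle(wkt):
    """True if the WKT is a closed ring of four corners with axis-parallel edges."""
    ring = parse_simple_polygon(wkt)
    if not ring or len(ring) != 5 or ring[0] != ring[-1]:
        return False
    if len(set(ring[:4])) != 4:
        return False
    # Every edge must change exactly one of x or y.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (x1 == x2) == (y1 == y2):
            return False
    return True
```

Note that a rectangle in degrees is not a square on the ground, and grids defined in a projected system (UTM, for example) will generally not be axis-aligned once converted to latitude and longitude, so this check would miss them.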

Hello there,

Would it be helpful to fill in the dataset's georeferenceProtocol field with the following information: “Coordinates represent the centroid of an MGRS UTM 10 km grid”?

Or is it better to use georeferenceRemarks for this purpose?

Hi @Salza
I think the georeferenceProtocol field would fit (you could add something about how the uncertainty is estimated as well). The georeferenceRemarks field can be used for additional information and context.
Some publishers use both: Occurrence Detail 3436435467
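To make the suggestion concrete, here is a minimal sketch of how such a record might be filled in. One common convention, assumed here and not confirmed anywhere in this thread, is to set coordinateUncertaintyInMeters to the distance from the cell centroid to a corner (half the diagonal). The coordinate values and the remark text are made-up placeholders, and `grid_cell_uncertainty_m` is a hypothetical helper.

```python
import math


def grid_cell_uncertainty_m(cell_size_m):
    """Half-diagonal of a square grid cell, rounded up to whole metres."""
    return math.ceil(cell_size_m * math.sqrt(2) / 2)


# Darwin Core terms for one occurrence; all values below are illustrative only.
record = {
    "decimalLatitude": 50.05,   # placeholder centroid latitude
    "decimalLongitude": 4.07,   # placeholder centroid longitude
    "coordinateUncertaintyInMeters": grid_cell_uncertainty_m(10_000),
    "georeferenceProtocol": (
        "Coordinates represent the centroid of an MGRS UTM 10 km grid; "
        "uncertainty is the centroid-to-corner distance"
    ),
    "georeferenceRemarks": "Original observation reported at grid-cell level only",
}
```

For a 10 km cell this gives an uncertainty of 7072 m. Stating the rule in georeferenceProtocol lets data users reconstruct the grid cell even when footprintWKT is not supplied.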