I’m following this guide to try to pull some data from the API. The issue I’m having is that when I try to offset by more than 10,000 records, I receive an HTTP 404 error. I found a sentence in the guide stating:
“You can page through API results if you want more records, but you can only go 100,000 records deep.”
I wanted to check whether anyone else has run into this, and if so:
Is the limit actually 10,000 (not 100,000)?
Are there any (smart) ways of getting around this that I may not be aware of?
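For context, my paging loop looks roughly like the sketch below. The `fetch` callable stands in for the actual HTTP request, and the 10,000 depth cap is my assumption based on the 404s I’m seeing, not a documented value:

```python
# Minimal paging sketch. `fetch(offset=..., limit=...)` stands in for whatever
# HTTP call you make against the search endpoint. The max_depth cap is an
# ASSUMPTION based on the 404 errors past offset 10,000.
def page_records(fetch, limit=100, max_depth=10_000):
    """Yield records page by page, stopping before the assumed depth cap."""
    offset = 0
    while offset + limit <= max_depth:
        page = fetch(offset=offset, limit=limit)
        results = page.get("results", [])
        if not results:
            return
        yield from results
        if page.get("endOfRecords"):
            return
        offset += limit

# Fake fetch simulating an API holding 250 records, for illustration only.
def fake_fetch(offset, limit):
    total = 250
    results = list(range(offset, min(offset + limit, total)))
    return {"results": results, "endOfRecords": offset + limit >= total}

records = list(page_records(fake_fetch))
```

With a real endpoint you would swap `fake_fetch` for a `requests.get` call and inspect the response status to detect where the cap actually sits.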
The original use case (although I’m enjoying this so much that I’m sure I’ll think of new ideas!) was creating a list of (grouped) animal names, via the following steps:
Identify a specific ‘type’ of animal
Easy for some things, like Birds (class = “Aves”)
Harder for some others (like Reptiles), but manageable with some Wikipedia knowledge transfer
Extract all species information for this ‘type’ of animal
Extract all vernacular names, then dedupe/clean, etc.
Once I’m at this step it’s just data cleaning, so I figured I’m home and dry
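The dedupe/clean step above can be sketched as a small pass over the search results. The record shape here (a `vernacularNames` list of dicts with a `vernacularName` key) mirrors what I believe the species search API returns, but treat it as an assumption:

```python
def collect_vernacular_names(records):
    """Dedupe and normalise vernacular names from species records.
    The record shape ('vernacularNames' -> list of dicts with a
    'vernacularName' key) is an ASSUMPTION, not a confirmed schema."""
    seen = set()
    names = []
    for rec in records:
        for vn in rec.get("vernacularNames", []):
            name = vn.get("vernacularName", "").strip().lower()
            if name and name not in seen:
                seen.add(name)
                names.append(name)
    return names

# Toy records, for illustration only.
records = [
    {"vernacularNames": [{"vernacularName": "Common Raven"},
                         {"vernacularName": "common raven "}]},
    {"vernacularNames": [{"vernacularName": "Northern Raven"}]},
]
names = collect_vernacular_names(records)
```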
From what you’ve provided, I think ChecklistBank is suitable… that said, I’d appreciate thoughts on whether you agree, as I’m not at all familiar with these datasets (and am finding them slightly overwhelming).
The vernacular names available on GBIF (and via the search API) come from the different checklists published on GBIF. You might not get as many vernacular names if you follow my previous advice (downloading the GBIF backbone via ChecklistBank), because the backbone only carries a subset of them.
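If you do go the download route, a checklist export is typically a Darwin Core Archive containing a vernacular-name extension file. A minimal parsing sketch follows; the filename, tab delimiter, and column names (`taxonID`, `vernacularName`) are assumptions, so check the archive’s meta.xml for the real layout:

```python
import csv
import io

def read_vernacular_tsv(fileobj):
    """Group vernacular names by taxon from a DwC-A extension file.
    Column names ('taxonID', 'vernacularName') are an ASSUMPTION;
    the archive's meta.xml describes the actual columns."""
    reader = csv.DictReader(fileobj, delimiter="\t")
    by_taxon = {}
    for row in reader:
        by_taxon.setdefault(row["taxonID"], []).append(row["vernacularName"])
    return by_taxon

# Toy TSV standing in for the real extension file, for illustration only.
sample = io.StringIO(
    "taxonID\tvernacularName\tlanguage\n"
    "212\tBirds\ten\n"
    "212\tAves\tla\n"
)
by_taxon = read_vernacular_tsv(sample)
```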
If you want to get a custom export (with all the vernacular names) from GBIF, you can write to helpdesk@gbif.org.