Occurrence download API: check if download requested recently?

rgbif maintainer here … In thinking about how to make rgbif users’ lives easier, I am exploring this issue, https://github.com/ropensci/rgbif/issues/308, where I’m working on a way for rgbif to check whether a request to the GBIF occurrence download API has been made recently and, if so, to skip kicking off a new download under certain conditions. For example, if a user requests occurrences for 1981 for Helianthus annuus in Mexico, we check whether they did that recently (recently = user specified, e.g., within the last day or week or so). If they made the exact same request recently, then DO NOT send a new request, but instead download the data from GBIF for the matching request.
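To make the idea concrete, here is a minimal sketch of the client-side check. It is in Python rather than R purely for illustration, and it is not how rgbif implements anything. The class and function names, the shape of the request dictionary, and the download key value are all hypothetical, and the log lives in memory only, whereas a real client would persist it between sessions.

```python
import hashlib
import json
import time


def request_fingerprint(request_body):
    """Hash a request so that two bodies with the same content but a
    different key order produce the same fingerprint."""
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RecentDownloads:
    """Hypothetical in-memory log of download requests already sent."""

    def __init__(self):
        # fingerprint -> (download_key, time the request was sent)
        self._seen = {}

    def record(self, request_body, download_key, now=None):
        """Remember that this request was sent and which download it produced."""
        sent_at = time.time() if now is None else now
        self._seen[request_fingerprint(request_body)] = (download_key, sent_at)

    def find_recent(self, request_body, max_age_seconds, now=None):
        """Return the key of a matching download that is recent enough,
        or None if a new request should be sent."""
        current = time.time() if now is None else now
        hit = self._seen.get(request_fingerprint(request_body))
        if hit is None:
            return None
        download_key, sent_at = hit
        if current - sent_at <= max_age_seconds:
            return download_key
        return None


# Assumed request shape, not the real GBIF predicate format.
request = {"year": 1981, "taxon": "Helianthus annuus", "country": "MX"}
log = RecentDownloads()
log.record(request, "hypothetical-download-key")

# "Recently" is user specified; here it means within the last day.
reusable_key = log.find_recent(request, max_age_seconds=24 * 60 * 60)
```

If `find_recent` returns a key, the client fetches that existing download instead of sending a new request. Leaving `max_age_seconds` to the user means someone who wants the latest occurrences can set it to zero or skip the check entirely.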

However, I got to thinking maybe GBIF could have an endpoint, e.g.,

/occurrence/download/request/exists

or similar, where it’s a POST request just like the normal download request, but this new route only checks whether the same request has been made recently and, if so, reports back when, along with other details.


I can definitely just do this on the rgbif side, but thought I’d check if GBIF has any interest in this.

(In addition, on the R side, we can keep track of bulk data already downloaded from the occurrence download API, and use locally cached data if present rather than downloading again from GBIF’s servers/S3, but that’s a separate issue.)


Hi Scott,

This idea would work for more static species groups, but for very ‘dynamic’ groups users might genuinely want the latest occurrences rather than the last download.
That is how it looks from the GBIF end of things.
From the rgbif user perspective it might indeed be what is needed in cases where users run their scripts multiple times while testing or debugging. Giving users this option in rgbif would certainly be helpful.

Best,
Jan K. Legind

Agreed, it seems to make sense not to implement this at the GBIF API level, but in clients (e.g., rgbif) as needed.