My first PyGbif script

Hello everyone :smiley:

I’ve been a naturalist for several decades, I’m also a geomatician, and I’m new to Python scripting.
I feel that PyGbif would have great potential for my practice.
To get started, I’d like to write a first “simple” script that retrieves my occurrences (from several platforms, iNaturalist in particular, but not only) and downloads them locally (CSV, GeoJSON, …).

Am I in the right place to ask for help with this request?


Some resources that might be helpful are:

The first and third of the above might be really useful if you are just beginning to learn Python.

Hi @Sylvain_M, in addition to the resources @quercitron suggested, you could check out this Data Use Club recorded webinar: Data use club - practical session 3 - recording and resources

This blog post also contains examples of downloading occurrences in Python: Downloading occurrences from a long list of species in R and Python - GBIF Data Blog (although not with PyGBIF)


The pygbif documentation offers examples of code that you can execute. If you run them as stand-alone scripts, you’ll likely want to add print statements to display some of the results. For instance, the example below is offered in the species module documentation. You can enhance that example by using some variables to hold intermediate results, and by adding calls to the print built-in function to display output.

from pygbif import species
species.name_backbone(name='Helianthus annuus', kingdom='plants')
species.name_backbone(name='Helianthus', rank='genus', kingdom='plants')
species.name_backbone(name='Poa', rank='genus', family='Poaceae')

# Verbose - gives back alternatives
species.name_backbone(name='Helianthus annuus', kingdom='plants', verbose=True)

# Strictness
species.name_backbone(name='Poa', kingdom='plants', verbose=True, strict=False)
species.name_backbone(name='Helianthus annuus', kingdom='plants', verbose=True, strict=True)

# Non-existent name
species.name_backbone(name='Aso')

# Multiple equal matches
species.name_backbone(name='Oenante')
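
As a rough sketch of what "holding intermediate results and printing them" could look like, the helper below summarizes one name_backbone response in a single line. The names describe_match and demo are hypothetical, not part of pygbif, and the field names (matchType, usageKey, scientificName, rank, confidence) assume the flat response shape of the GBIF species match service; if your pygbif version returns a different structure, adjust the keys accordingly.

```python
def describe_match(match):
    """Return a one-line summary of a name_backbone() response dict.

    Hypothetical helper; field names are assumed from the GBIF species
    match response and are read defensively.
    """
    # .get() returns None instead of raising KeyError when a field is absent,
    # which happens for names that do not resolve to a single taxon.
    match_type = match.get("matchType")
    if match_type is None or match_type == "NONE" or "usageKey" not in match:
        return f"No single match (matchType={match_type})"
    return (
        f"{match.get('scientificName')} | rank={match.get('rank')} | "
        f"usageKey={match['usageKey']} | matchType={match_type} | "
        f"confidence={match.get('confidence')}"
    )


def demo():
    """Query the live GBIF API; needs pygbif installed and a network connection."""
    # Imported here so that describe_match() itself has no dependencies.
    from pygbif import species

    for name in ["Helianthus annuus", "Aso", "Oenante"]:
        result = species.name_backbone(name=name)  # keep the intermediate result
        print(f"{name!r}: {describe_match(result)}")
```

Calling demo() from a script or an interactive session should print one line per name, which makes it easier to see how a clean match differs from a non-existent or ambiguous name.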

The following script borrows code from the above and other examples, adding variables to hold intermediate results and print statements to display them:

# See the following references
# https://pygbif.readthedocs.io/en/latest/modules/species.html
# https://pygbif.readthedocs.io/en/latest/modules/occurrence.html

from pygbif import species
from pygbif import occurrences as occ

species_name = 'Quercus velutina' # name of species
spec_info = species.name_backbone(name = species_name) # get information for that species
spec_key = spec_info['speciesKey'] # get species key from information for that species

spec_occs = occ.search(taxonKey = spec_key, limit = 12) # get 12 occurrences of species
results = spec_occs['results'] # get the results as a list


# loop through list, displaying only results that have latitude and longitude
for result in results:
    if 'decimalLatitude' in result and 'decimalLongitude' in result:
        print(f"Latitude, Longitude: {result['decimalLatitude']}, {result['decimalLongitude']}")

# list the keys of the first result, if the search returned any
print("\n\nKeys found in first result:")
if results:
    for key in results[0]:
        print(key)

The final for loop lists the keys present in the first occurrence record, to show how many fields are potentially available in the occurrence results. Not every record carries every key, so you can experiment with some of them to find out what information is available.

Output from the above script:

Latitude, Longitude: 40.755153, -73.46598
Latitude, Longitude: 40.962715, -72.7979
Latitude, Longitude: 39.049889, -77.321062
Latitude, Longitude: 39.322321, -74.596241
Latitude, Longitude: 39.84528, -90.560022
Latitude, Longitude: 40.82695, -73.5326
Latitude, Longitude: 37.606563, -77.542687
Latitude, Longitude: 40.959447, -73.990726
Latitude, Longitude: 42.42818, -72.500149
Latitude, Longitude: 40.68413, -73.513642
Latitude, Longitude: 41.73222, -72.391798
Latitude, Longitude: 41.732109, -72.392163


Keys found in first result:
key
datasetKey
publishingOrgKey
installationKey
hostingOrganizationKey
publishingCountry
protocol
lastCrawled
lastParsed
crawlId
extensions
basisOfRecord
occurrenceStatus
taxonKey
kingdomKey
phylumKey
classKey
orderKey
familyKey
genusKey
speciesKey
acceptedTaxonKey
scientificName
acceptedScientificName
kingdom
phylum
order
family
genus
species
genericName
specificEpithet
taxonRank
taxonomicStatus
iucnRedListCategory
dateIdentified
decimalLatitude
decimalLongitude
coordinateUncertaintyInMeters
continent
stateProvince
gadm
year
month
day
eventDate
issues
modified
lastInterpreted
references
license
identifiers
media
facts
relations
isInCluster
datasetName
recordedBy
identifiedBy
geodeticDatum
class
countryCode
recordedByIDs
identifiedByIDs
country
rightsHolder
identifier
http://unknown.org/nick
verbatimEventDate
gbifID
verbatimLocality
collectionCode
occurrenceID
taxonID
catalogNumber
institutionCode
eventTime
http://unknown.org/captive
identificationID
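
Since the original goal was to save occurrences locally as CSV or GeoJSON, here is a minimal sketch of how the results list from the script above could be written to both formats using only the standard library. The function names (has_coordinates, write_csv, to_geojson, write_geojson) are hypothetical helpers, not pygbif functions, and the sketch assumes each record is a plain dict like the ones returned in spec_occs['results'].

```python
import csv
import json


def has_coordinates(record):
    """True if the record carries both a latitude and a longitude."""
    return (record.get("decimalLatitude") is not None
            and record.get("decimalLongitude") is not None)


def write_csv(results, path, fields):
    """Write the chosen fields of each record to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        for record in results:
            # Missing keys become empty cells rather than raising KeyError.
            writer.writerow({field: record.get(field, "") for field in fields})


def to_geojson(results, properties):
    """Build a GeoJSON FeatureCollection from records that have coordinates."""
    features = []
    for record in results:
        if not has_coordinates(record):
            continue  # a point feature needs both coordinates
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON order is [longitude, latitude]
                "coordinates": [record["decimalLongitude"],
                                record["decimalLatitude"]],
            },
            "properties": {name: record.get(name) for name in properties},
        })
    return {"type": "FeatureCollection", "features": features}


def write_geojson(results, path, properties):
    """Write the FeatureCollection to a .geojson file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(to_geojson(results, properties), handle, indent=2)
```

With the results variable from the script above, something like write_csv(results, "occurrences.csv", ["gbifID", "species", "eventDate", "decimalLatitude", "decimalLongitude"]) and write_geojson(results, "occurrences.geojson", ["gbifID", "species", "eventDate"]) should produce files that open in a spreadsheet or in QGIS. The file names and field choices here are only examples; any of the keys listed in the output can be used.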

Thank you all so much!
Read more here: Retrieving iNaturalist Observations for a region / observers with PyGbif
