I hope you’re doing well. I recently stumbled upon the GBIF dataset, which includes a vast collection of data on the kingdom Animalia. To be honest, it’s a bit daunting for me because I’m completely new to this field. I’ve been working with a significant dataset in JSON format, but delving into GBIF feels like stepping into the unknown.

In my JSON dataset, I have entries like this one:

{
  "animal_id": 613,
  "animal_name": "Cockatiel",
  "slug": "cockatiel",
  "data": {
    "scientific_classification": {
      "Kingdom": "Animalia",
      "Phylum": "Chordata",
      "Class": "Aves",
      "Order": "Psittaciformes",
      "Family": "Cacatuidae",
      "Genus": "Nymphicus",
      "Scientific Name": "Nymphicus hollandicus"
    },
    "locations": [],
    "facts": {
      "Prey": "Terrestrial insects",
      "Fun Fact": "They have crests that rise or fall depending on their emotions",
      "Estimated Population Size": "Undetermined, but conservation status is least concern",
      "Biggest Threat": "Birds of prey",
      "Most Distinctive Feature": "Its cheek patches",
      "Other Name(s)": "Weiro bird, quarrion",
      "Wingspan": "11.8 to 13.7 inches",
      "Incubation Period": "17 to 23 days",
      "Litter Size": "Five",
      "Habitat": "Scrub, bush, wetlands",
      "Predators": "Raptors",
      "Diet": "Herbivore",
      "Type": "Bird",
      "Common Name": "Cockatiel",
      "Number Of Species": "1",
      "Location": "Australia",
      "Average Clutch Size": "5",
      "Nesting Location": "Tree cavities",
      "Age of Molting": "Three to five weeks"
    },
    "physical_characteristics": {
      "Color": "Grey, Yellow, White, Orange",
      "Skin Type": "Feathers",
      "Top Speed": "43 mph",
      "Lifespan": "As long as 35 years, average between 15 and 25 or longer in captivity",
      "Weight": "3.17 ounces",
      "Length": "9.84 to 13.8 inches"
    }
  }
}

My goal is to download the complete GBIF dataset for animals only once so that I can sort it out without having to call APIs every time I need species information. I’m looking for some guidance on how to go about this. Is there a straightforward way to download the entire dataset?

I’d appreciate any help or advice you can offer. Thanks a bunch for your time and support!

You can download all Animalia occurrences from the GBIF website. Here is a selection for all animals in the occurrence search interface: Search. Click on the “Download” tab and select the download format. This will generate a download with a citable DOI.
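If you would rather script the same request than click through the website, the occurrence download API accepts a JSON body describing the filter. The sketch below only builds that body and does not send anything. It assumes that taxon key 1 is Animalia in the GBIF backbone and that SIMPLE_CSV is the format you want; the function name, username and email address are placeholders made up for illustration. As far as I know the body is POSTed to https://api.gbif.org/v1/occurrence/download/request using your GBIF account credentials, but please check the API documentation before relying on this.

```python
import json

# Assumption: 1 is the GBIF backbone taxon key for the kingdom Animalia.
ANIMALIA_TAXON_KEY = "1"


def build_download_request(username, email, fmt="SIMPLE_CSV"):
    """Build the JSON body for an occurrence download limited to Animalia.

    `username` and `email` are your own GBIF account details.
    This function only builds a dictionary; it performs no network call.
    """
    return {
        "creator": username,
        "notificationAddresses": [email],
        "sendNotification": True,
        "format": fmt,
        # A single "equals" predicate on the taxon key.
        "predicate": {
            "type": "equals",
            "key": "TAXON_KEY",
            "value": ANIMALIA_TAXON_KEY,
        },
    }


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    body = build_download_request("your_gbif_username", "you@example.org")
    print(json.dumps(body, indent=2))
```

Printing the body first lets you compare it with the filter shown in the search interface before you submit a request of this size.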

If you only want to download the species information (available on the species pages), please refer to this other thread: API Offset Limit
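Once you have a species file on disk, a local lookup avoids calling the API for every name. Here is a minimal sketch, assuming the file is tab-delimited with a header row and has columns named canonicalName and kingdom. Those column names are assumptions on my part, so adjust them to the file you actually download. The function name and the sample rows (including the taxonID values A and B) are made up for illustration.

```python
import csv
import io


def index_animals(tsv_file, name_column="canonicalName", kingdom_column="kingdom"):
    """Return {name: row} for every row whose kingdom is Animalia.

    `tsv_file` is any open text file (or file-like object) containing
    tab-delimited data with a header row. Column names are assumptions.
    """
    # QUOTE_NONE treats quote characters as ordinary text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    index = {}
    for row in reader:
        name = row.get(name_column)
        if row.get(kingdom_column) == "Animalia" and name:
            # Keep the first row seen for each name.
            index.setdefault(name, row)
    return index


if __name__ == "__main__":
    # Made-up sample rows standing in for a real downloaded file.
    sample = io.StringIO(
        "taxonID\tcanonicalName\tkingdom\n"
        "A\tNymphicus hollandicus\tAnimalia\n"
        "B\tQuercus robur\tPlantae\n"
    )
    animals = index_animals(sample)
    print(sorted(animals))
```

The resulting dictionary can be queried with the "Scientific Name" value from your own JSON entries, for example animals.get("Nymphicus hollandicus").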


I’m going to check and find out which dataset is best for me; it’s over 1 TB and there are a few!

