Uploading a large dataset to Zenodo with R

jwaller · February 18, 2022, 3:54pm

It is sometimes useful to upload a large dataset to Zenodo, for example when you want to register a derived dataset.

I have found that large datasets tend not to be handled well by the Zenodo UI. In these cases, uploading via the Zenodo API tends to work better.

The R script below uploads a file to a Zenodo deposit that has been started but not yet published.

This script requires a personal access token, which you can create in your Zenodo account settings: https://zenodo.org/account/settings/applications/
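
If you would rather not paste the token into the script itself, it can be read from an environment variable. This is only a minimal sketch: `ZENODO_TOKEN` is a hypothetical variable name (not something Zenodo defines) and `get_zenodo_token()` is an illustrative helper, not part of any package.

```r
# Sketch: read the Zenodo token from an environment variable.
# ZENODO_TOKEN is a hypothetical name; set it yourself, e.g. in ~/.Renviron.
get_zenodo_token <- function(var = "ZENODO_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  token
}

# Usage (once the variable is set):
# token <- get_zenodo_token()
```

The helper stops with an error when the variable is missing, so an empty token is never sent to the API.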

# zenodo large file upload

library(httr)
library(dplyr) # provides the %>% pipe
library(purrr) # provides pluck()

# get your token here
# https://zenodo.org/account/settings/applications/
token <- "long_ugly_number_you_get_from_zenodo"
deposit_id <- 6137047 # fill in UI form to get this number
file <- "file_you_want_to_upload.zip" # path to your local file

# look up the deposit and extract the id of its file bucket
bucket <- GET(
  paste0("https://www.zenodo.org/api/deposit/depositions/", deposit_id),
  add_headers(Authorization = paste("Bearer", token))
) %>%
  content(as = "text") %>%
  jsonlite::fromJSON() %>%
  pluck("links") %>%
  pluck("bucket") %>%
  basename() # keep only the id at the end of the bucket URL

# upload the file into the bucket under its own file name
PUT(
  url = paste0("https://www.zenodo.org/api/files/", bucket, "/", basename(file), "?access_token=", token),
  body = upload_file(file)
) %>%
  content(as = "text")
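
With a large file it can be reassuring to confirm that the upload arrived intact. The sketch below compares the local MD5 with the checksum in the upload response. It assumes the parsed response contains a `checksum` field of the form `md5:<hex>`; `checksum_matches()` is a hypothetical helper, and the response here is mocked so the example runs offline.

```r
# Sketch: check an uploaded file against the checksum Zenodo reports.
# Assumption: the parsed upload response has a "checksum" field like "md5:<hex>".
checksum_matches <- function(local_path, response) {
  local_md5 <- unname(tools::md5sum(local_path))
  remote_md5 <- sub("^md5:", "", response$checksum)
  identical(local_md5, remote_md5)
}

# Offline demonstration with a temporary file and a mocked response
tmp <- tempfile(fileext = ".txt")
writeLines("example", tmp)
mock_response <- list(checksum = paste0("md5:", unname(tools::md5sum(tmp))))
checksum_matches(tmp, mock_response)
```

In the real script, the response would come from parsing the text returned by the PUT call with `jsonlite::fromJSON()` instead of the mocked list.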

This script was inspired by the following gist:

https://gist.github.com/maxogden/b758cf0fe6d353846ef9ce7d03fdca0c

upload.sh

# in the Zenodo UI, create a deposition and get its id

curl -H "Accept: application/json" -H "Authorization: Bearer $TOKEN" "https://www.zenodo.org/api/deposit/depositions/$DEPOSITION"
# get the bucket id from the response above

curl --progress-bar -o /dev/null --upload-file "./$FILE" "https://www.zenodo.org/api/files/$BUCKET/$FILE?access_token=$TOKEN"

@jhpoelen

