Where can we publish images to be linked in GBIF/OBIS datasets

ymgan · December 11, 2024, 11:03am

Hello, may I ask where we can publish a large number (>1k) of images that can be linked to Occurrence records which will be published to GBIF/OBIS? This question is asked a lot in the marine/OBIS community. I have also personally encountered datasets where the number of specimen photos far exceeds Zenodo’s limit (see Zenodo FAQ - What are the size limitations of Zenodo?).

Curious if anyone has the answer to this question please? Thanks a lot!

EstebanMH-SiB · December 11, 2024, 9:56pm

Dear @ymgan, we usually tell our publishers that they can use the Internet Archive. It is free and you can upload as many images/sounds/videos as you want; there is also no limit on the size of the media.
You can create a collection when the images are grouped (e.g. Colección de Microorganismos de la Pontificia Universidad Javeriana - CMPUJ) or upload them under an individual profile (e.g. INCIVA).

I know that you can also use Wikimedia, which is free, but I do not have any experience with it or an example from our publishers. Maybe someone else has experience with it and can share information.

ymgan · December 12, 2024, 6:02am

Thank you so so much Esteban!! This is so helpful! I believe this is what we are looking for, I will try it out!!

sformel · December 16, 2024, 1:18pm

@EstebanMH-SiB @ymgan This seems like a good solution, thanks for identifying it! It’s not clear if they mint persistent identifiers for the individual files, or the collections, of the things they archive. Do either of you know if that’s possible?

vechocho · December 16, 2024, 8:27pm

The persistent URI is for the file. For example, the dataset "Registro de varamientos de megafauna marina en Ecuador continental" has this archive.org item: "Varamientos Ecuador 2023 : Ministerio del Ambiente, Agua y Transición Ecológica : Free Download, Borrow, and Streaming : Internet Archive", with 58 files, and you can add more files to an item later.
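
For anyone preparing the media links by script, here is a minimal sketch of building one URL per file in an archive.org item. It assumes the commonly seen `https://archive.org/download/<item>/<file>` pattern; the item identifier and file names below are hypothetical and not taken from the dataset mentioned above, so please check the URLs against your own item before publishing.

```python
from urllib.parse import quote


def archive_file_url(item_id: str, filename: str) -> str:
    """Build a per-file URL for a file inside an archive.org item.

    Assumes the https://archive.org/download/<item>/<file> pattern.
    """
    # quote() percent-encodes spaces and other unsafe characters.
    return f"https://archive.org/download/{quote(item_id)}/{quote(filename)}"


# Hypothetical item identifier and file names, for illustration only.
item_id = "example-strandings-2023"
files = ["IMG 0001.jpg", "IMG_0002.jpg"]

urls = [archive_file_url(item_id, name) for name in files]
for url in urls:
    print(url)
```

Each resulting URL could then go into the identifier column of the multimedia extension for the matching occurrence.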

sformel · December 17, 2024, 3:58pm

Got it. Thanks @vechocho !

MatDillen · December 20, 2024, 9:59am

The limitations for Zenodo in size and file number are only per record. You can make a community and add all images to it as separate records, like this example with ca. 270k images we did as a pilot for the ICEDIG project years ago: Search Belgium Herbarium of Meise Botanic Garden

You can find more info on the process in this report: Digitisation infrastructure design for Zenodo. Deliverable D6.3
The tech specs and the Python scripts used for these pilots are out of date by now. I believe lycophron (GitHub - plazi/lycophron: Batch uploader to Zenodo) is a more up-to-date tool.

dshorthouse · December 21, 2024, 5:24pm

A cautionary note here: the Internet Archive (IA) is battling serious litigation. As a result, the Biodiversity Heritage Library, whose page scans have been in IA for decades, is exploring alternatives like AWS.

EstebanMH-SiB · December 31, 2024, 8:44pm

Thanks for the useful insights @dshorthouse and @MatDillen. We will explore Zenodo a little bit so we can recommend it to our publishers, and keep an eye on IA, hoping they continue working without much trouble. They have been a great resource so far!

ymgan · February 12, 2025, 11:26am

Thank you very much for this Mat! The Herbarium specimen records are nice. From what I understood, the examples are 1 image = 1 occurrence. I am wondering if you have any recommendations/examples on how to do this for images that can have multiple occurrences please?

Example: Submersible Gathered Evidence of a Vulnerable Marine Ecosystem at the Melchior Islands, Western Antarctic Peninsula (Subarea 48.1) - multimedia. There can be sea stars, sponges and other organisms within the same picture.

Thanks a lot!

MatDillen · February 13, 2025, 10:05am

I’m not sure what else you need? That Zenodo record you linked to already does the job of hosting (multiple) multi-specimen images. You can enrich the record with a more standardized data file, like a Darwin Core archive that lists all identified observations from the images and video as occurrences. You can then use the Audiovisual Core extension to link the occurrences to the images they occur in. You could even link the occurrence to a region of interest within the image if you know where each of the organisms can be spotted: Audiovisual Core Term List - Audiovisual Core
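
To illustrate the many-occurrences-to-one-image linking described above, here is a rough sketch that writes one extension row per occurrence, all pointing at the same image. The image URL, the occurrenceIDs and the column names are hypothetical. In a real Darwin Core archive the columns would be mapped to terms (for instance Audiovisual Core's accessURI) in meta.xml, which is not shown here.

```python
import csv
import io

# Hypothetical data: one image in which several organisms were identified.
image_url = "https://example.org/media/dive01_frame0042.jpg"
occurrences = [
    {"occurrenceID": "dive01:0042:a", "scientificName": "Asteroidea"},
    {"occurrenceID": "dive01:0042:b", "scientificName": "Porifera"},
]


def multimedia_rows(occurrences, image_url, media_format="image/jpeg"):
    """Return one extension row per occurrence, all sharing the same image."""
    return [
        {
            "occurrenceID": occ["occurrenceID"],
            "accessURI": image_url,
            "format": media_format,
        }
        for occ in occurrences
    ]


def to_tsv(rows):
    """Serialise the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = multimedia_rows(occurrences, image_url)
print(to_tsv(rows))
```

If the position of each organism in the image is known, extra columns for a region of interest could be added to each row in the same way.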

What you probably can’t do is embed the data in the record as subjects, making them easily accessible through the Zenodo API, like we did in this example.

ymgan · February 13, 2025, 10:58am

Thank you Mat!

I showed the Zenodo record as an example of the type of images that I have. It works for that dataset because there are <100 files, within Zenodo’s size limitations (per record).

I have another new dataset that I have not published yet, with >700 images of the same type as in the Zenodo record, which far exceeds Zenodo’s size limitations (per record), and I am curious how to do it like the example you shared.

I think you answered my question: we can’t.

MatDillen · February 13, 2025, 11:19am

Why can’t you put each image in a separate record (i.e. >700 similar records), or bundle them in smaller batches? That would require splitting up the data, or making a separate record for the overarching data and linking that one to each image record (like this one), but other than that I don’t see why it wouldn’t work?
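
As a rough sketch of the "smaller batches" option, the snippet below splits a list of image files into groups that would each become one record. The batch size of 100 and the file names are assumptions for illustration; the actual per-record limits should be checked in the Zenodo FAQ, and the upload step itself is not shown.

```python
def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical file names standing in for a dataset with >700 images.
image_files = [f"img_{i:04d}.jpg" for i in range(1, 731)]

# Assumed batch size; adjust it to the real per-record limit.
groups = batches(image_files, 100)

for number, group in enumerate(groups, start=1):
    print(f"record {number}: {len(group)} files, {group[0]} to {group[-1]}")
```

Setting the batch size to 1 gives the one-image-per-record layout instead.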

At some point, you might exceed Zenodo’s Fair Usage policy. But that mainly depends on your total data volume (how many such datasets you have and how many gigabytes their media files amount to). Individual cases like this should be fine.
