Best practices to publish soundscapes in GBIF

A couple of years ago, we asked the community about publishing soundscapes through GBIF, so we are wondering whether there are any best practices for publishing this data at this moment.

I have come across several approaches: an event core (Acoustic detections of birds using the SILIC in Yushan National Park, Taiwan), an occurrence core with only identified species (Sanctuary Soundscape Monitoring Project (SanctSound) Daily Aggregated Species Detections), and an occurrence core that uses “Sonus naturalis” to publish general audio without a specific species (Xeno-canto - Soundscapes from around the world).

We want to publish a biological collection of environmental sounds and would like to hear about any experiences or recommendations for representing this data better. I was thinking of using an event core for all soundscapes, with occurrences wherever an identification is possible.

But we are not 100% sure, because there are several ways to do it and many open questions, such as: Can we adapt Camtrap DP to soundscapes? Should we think about a new data model for soundscapes? Or can we adapt what is already available and maybe write a quick guide on how to publish this data in GBIF?

I would like to hear your valuable opinions @abbybenson @mgrosjean @peterdesmet @dhobern @ylebras !

Thank you very much for your help and have a wonderful 2024!

Hi @EstebanMH-SiB, I am not aware of any best practice guide for publishing soundscapes in GBIF. I believe that so far, each data provider has structured their datasets as they saw fit.

I can see that @jeromeko is a contact for the dataset Acoustic detections of birds using the SILIC in Yushan National Park, Taiwan. Perhaps he has some insights on how best to work with this type of data?

We’ve had casual conversations about this in the context of the data model and with at least one potential partner network that could bring a large influx of data.

Should note, too, the likely potential audience among the authors of this recent paper:

Looby, A., Erbe, C., Bravo, S. et al. Global inventory of species categorized by known underwater sonifery. Sci Data 10, 892 (2023).

Interesting. I think it is especially worth looking for a pragmatic data field to indicate whether a soundscape includes multiple species and, if so, to map those identified species. Here is a descriptive comparison of randomly selected GBIF occurrence records from the data providers mentioned:

Additional remarks:

In the latter mapping method, it would be more logical to add a data field for the (calculated/estimated/derived) number of individuals per species heard in the soundscape, if they are audibly distinguishable. Defining an automatic method that makes those distinctions consistently appears to be challenging, particularly if there is no visual documentation of the observation.

Even automatically distinguishing sounds from different species in one recording still leaves room for improvement in some cases:
e.g. ZZG01_20210123_073300.wav - Google Drive includes more than one species (in my opinion), while its event lists only one species:
Acoustic detections of birds using the SILIC in Yushan National Park, Taiwan
– cf. Steere's Liocichla (Liocichla steerii) :: xeno-canto

Fundamentally, correct species identification is of course essential.

I apologize that I have not responded to this sooner; it came in at a busy time. For the SanctSound project that you linked to, the most important thing to us was to include a coordinateUncertaintyInMeters that accounted for the fact that the recorder can be quite far away from the identified animal. We aggregated the detections to once per day per species because, according to the scientists who provided the data, that would be safest for preventing overestimation. We also only had events when there were occurrences.
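As an illustration of the once-per-day-per-species aggregation described above, here is a minimal Python sketch. It is not the SanctSound pipeline: the function name `aggregate_daily`, the input field `timestamp`, and the sample values are all assumptions made up for this example. Only `scientificName` and `eventDate` are Darwin Core terms.

```python
from datetime import datetime


def aggregate_daily(detections):
    """Collapse raw detections to one record per (species, day).

    `detections` is a list of dicts with a `scientificName` and an
    ISO 8601 `timestamp` (hypothetical input format).
    """
    seen = set()
    daily = []
    for det in detections:
        # Keep only the calendar day of the detection.
        day = datetime.fromisoformat(det["timestamp"]).date().isoformat()
        key = (det["scientificName"], day)
        if key not in seen:
            seen.add(key)
            daily.append({"scientificName": det["scientificName"], "eventDate": day})
    return daily


# Made-up sample detections, for illustration only.
raw = [
    {"scientificName": "Megaptera novaeangliae", "timestamp": "2021-01-23T07:33:00"},
    {"scientificName": "Megaptera novaeangliae", "timestamp": "2021-01-23T19:05:00"},
    {"scientificName": "Megaptera novaeangliae", "timestamp": "2021-01-24T02:10:00"},
]
daily_records = aggregate_daily(raw)
```

With this sample, the two detections on the same day collapse into one record, so three raw detections become two daily records.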

I don’t know whether Camtrap DP would be a good fit for soundscapes or not. Hopefully Peter will chime in. I think it has some things in it that make it pretty specific to camera traps, but I think the general idea of a stationary deployment with associated detections would be similar.

I think the topic of soundscapes has been raised a few times in the Machine Observations Interest Group working meetings but I don’t think it’s ever been directly addressed. Perhaps it would make a good topic for discussion at the next TDWG, although I won’t be there.

I am also pinging @sformel who might have some ideas or at least interest to engage on the topic.

I’ll forward this thread to Peter

Hi @EstebanMH-SiB,

Right now, you should likely use Darwin Core. I see two potential models:

  • Event core + Occurrence extension + Audiovisual Core extension: Each soundscape is an Event (in the core), with a location, a coordinateUncertaintyInMeters (as @abbybenson points out) and a duration in eventDate. Each identified species is an Occurrence (in the extension), at the same location (so no need to repeat that), but likely a more precise duration or timestamp in eventDate. The (link to the) media files can be published in the Audiovisual Core media extension, associated with the event. The drawback of this model is that GBIF won’t show the media files for your occurrences (since they are linked to the event, not directly the occurrence).
  • Occurrence core + Audiovisual Core extension: Each identified species is an Occurrence (in the core), at a location and with an eventDate (many of which are repeated). You can group occurrences into deployments by having a shared eventID. To indicate that the occurrence is based on an associated soundscape, you can publish the media file in the Audiovisual Core media extension (which will contain many repeated URLs). This model is a more flattened approach of the above (hence all the repetition), but has the advantage that GBIF will show the media files associated with an occurrence. We use that model to convert Camtrap DP to Darwin Core (function, example).
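To make the difference between the two models concrete, here is a rough Python sketch that holds the first model as three small tables and flattens it into the second. This is only an illustration under stated assumptions, not the Camtrap DP conversion function mentioned above: the helper `flatten`, the identifiers, coordinates, timestamps and the example.org URL are all made up, and the field names are a small subset of Darwin Core and Audiovisual Core terms.

```python
# Model 1: Event core + Occurrence extension + Audiovisual Core extension.
# All values below are illustrative, not taken from a real dataset.
events = [{
    "eventID": "rec-001",
    "eventDate": "2021-01-23T07:33:00/2021-01-23T07:34:00",  # recording duration
    "decimalLatitude": 23.47,
    "decimalLongitude": 120.95,
    "coordinateUncertaintyInMeters": 100,
}]
occurrences = [
    {"eventID": "rec-001", "occurrenceID": "rec-001:1",
     "scientificName": "Liocichla steerii", "eventDate": "2021-01-23T07:33:12"},
    {"eventID": "rec-001", "occurrenceID": "rec-001:2",
     "scientificName": "Yuhina brunneiceps", "eventDate": "2021-01-23T07:33:40"},
]
media = [{"eventID": "rec-001", "accessURI": "https://example.org/rec-001.wav"}]


def flatten(events, occurrences, media):
    """Model 2: Occurrence core + Audiovisual Core extension (hypothetical helper)."""
    events_by_id = {e["eventID"]: e for e in events}
    uris_by_event = {}
    for m in media:
        uris_by_event.setdefault(m["eventID"], []).append(m["accessURI"])

    occ_core, occ_media = [], []
    for occ in occurrences:
        # Copy the event's location onto each occurrence; the occurrence's own
        # values (e.g. its more precise eventDate) override the event's.
        occ_core.append({**events_by_id[occ["eventID"]], **occ})
        # Repeat the soundscape URL for every occurrence it contains.
        for uri in uris_by_event.get(occ["eventID"], []):
            occ_media.append({"occurrenceID": occ["occurrenceID"], "accessURI": uri})
    return occ_core, occ_media


occ_core, occ_media = flatten(events, occurrences, media)
```

The flattened output repeats the location on both occurrence rows and lists the same media URL twice, which is the repetition the second model trades for having the media attached directly to each occurrence.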

Longer term, I think Camtrap DP is a good fit for soundscapes. As @abbybenson points out:

This will require testing with acoustic data use cases, so we can see how to extend certain vocabularies and maybe rename terms. The good news is that we just submitted a Horizon Europe proposal, where I’m responsible for doing just that (extending Camtrap DP to acoustic and insect camera data). Let’s hope it gets funded. The bad news is that you can’t use it right now. :slight_smile:

Hope this helps,