Our data provider is using image annotation software to label organisms in benthic video frames captured by a remotely operated vehicle (ROV). The software records annotations as vernacular names, which she has to map to scientificName, and the identification is often not at species level. For example, one often has to examine the underside of a starfish to know which species it is, but only the top side was captured in the images. I was asked:
How can I express the level of confidence in my identifications based on the evidence available?
Evidence available includes:
- Expert knowledge: For example, the annotation software detects a pink starfish. She knows, based on her expertise, that only one species of pink starfish is known to inhabit the region that they sampled.
- Physical samples: Divers are collecting specimens from the same area (for a separate dataset and analysis). This proximity increases her confidence that the organism in the image is the same species as those physically sampled.
Challenge:
There does not appear to be a standard term or field in Darwin Core (DwC) or Audiovisual Core to capture either the evidence used or the confidence level in image-based identifications.
Camtrap-DP has a field, classificationProbability, which serves a similar purpose. identificationVerificationStatus is not what I want because the collected specimens may not be the same individuals as the ones in the video.
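To make the gap concrete, here is a rough sketch in Python of how I picture the two records side by side. Everything in it is a placeholder for illustration: the IDs, the class-level name, the 0.8 value and the remark wording are made up, and putting the evidence into the free-text identificationRemarks term is only my assumption about the closest existing option, not an established practice.

```python
# Hypothetical Darwin Core occurrence for one annotated ROV video frame.
# All values are illustrative placeholders, not real data.
dwc_occurrence = {
    "occurrenceID": "rov-frame-0001-annot-01",
    "basisOfRecord": "MachineObservation",
    "vernacularName": "pink starfish",
    "scientificName": "Asteroidea",  # only identified to class level here
    "taxonRank": "class",
    # Free text is the only place I can see to describe the evidence.
    "identificationRemarks": (
        "Top side only visible in image; expert knowledge of regional fauna; "
        "specimens collected by divers in the same area"
    ),
}

# Hypothetical Camtrap-DP style observation with a numeric confidence.
camtrap_observation = {
    "observationID": "obs-0001",
    "scientificName": "Asteroidea",
    "classificationMethod": "human",
    "classificationProbability": 0.8,  # placeholder value
}


def has_numeric_confidence(record):
    """Return True if the record carries a numeric classificationProbability."""
    value = record.get("classificationProbability")
    return isinstance(value, (int, float))


print(has_numeric_confidence(dwc_occurrence))
print(has_numeric_confidence(camtrap_observation))
```

The Camtrap-DP style record can be filtered on a number, while in the Darwin Core record the same information is only available as prose that a user would have to read.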
Questions:
- How do others handle similar cases?
- Is this something others (e.g. GBIF users) would find useful?
I raised this question during the GBIF node support hour, and Peter Desmet kindly offered some insights. Marie also kindly summarised them in the thread linked above. (Thank you Peter and Marie!)
I am not sure if I am the only one with this challenge. Any insights or suggestions would be greatly appreciated. Thank you!