Not that different, I would say. The goal of subscribing to alerts for annotations is to be notified that there are annotations on the occurrence records you are interested in. The emails only contain links to the occurrence records that have been annotated and a link to a query that will find them all. You can safely discard the emails, as you can query for annotations at any time and you can also get them through the API.
I was making the suggested distinction when I said that I did not want fully automated ingestion of annotations into our CMS and that it was enough for our curation team to be notified of annotations. This is largely dictated by the form that the annotations available to us take at the moment, and it also depends on the type of annotation.
For example, if we had a fully structured annotation with a suggested correction for a geo-reference, it would be very nice to be able to accept or reject the assertion within the CMS and have the CMS update the geo-reference or, better still, create a new geo-reference. Attribution for geo-references is trivial, as we can use georeferencedBy for that, which our CMS has (as does @Rich87’s, as we use the same CMS).
If the annotation is a suggested identification, I will happily store it as a determination record, but one of our (other) botanists will have to look at the specimen to verify the identification before it can become the current determination (the one we deliver in the Occurrence Core), and the current determination will be attributed to the botanist who confirmed the identification. Once all our specimens are imaged, I would love people to be able to do online identifications, but I do not see any way of automating this: specimens will have to be pulled out of cupboards, annotated and re-incorporated, and the work with the specimens will eclipse any amount of work that has to be done in the database.
With Bionomia attributions I get into trouble with our data model, as our CMS stores the Agent IDs at the Agent level, while the attributions are at the Collection Object or Identification level. Also, while I try to automate as much as possible, all matches are verified by our curation team, and there is a very small minority of attributions that they think are incorrect or are not sure about, so I am not sure who to blame for the actions we take in our database. However, as long as the annotations will always be in Bionomia and will always be connected to the specimen, I do not think that is an issue. Also, while I cannot (and do not really want to) accommodate the Bionomia annotations in our collections database, I think it would be great if our records in AVH (ALA) could link to them.
There are also annotations that do not have to get back to the data curators. For example, in our online Flora, we use annotations when making the maps. These maps are based on occurrence data from the ALA. Dots on the maps have different icons depending on the value of establishmentMeans. The value for establishmentMeans comes with the occurrence record, or from assertions by our Flora editors. The latter are annotations. I do not think these have to go back to the curators of the source data sets at all, but aggregators like ALA could use them to improve the filter on establishmentMeans, which is important to many users, but for which the occurrence data is very incomplete.
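To make the map logic concrete, here is a minimal sketch of how an icon might be chosen. It is not our actual Flora code: the icon names, the `ICONS` mapping, the `assertions` lookup keyed by occurrenceID, and the rule that an editor's assertion takes precedence over the value on the record are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical mapping from establishmentMeans values to map icons.
ICONS = {
    "native": "circle",
    "introduced": "triangle",
}
DEFAULT_ICON = "square"  # used when establishmentMeans is missing or unrecognised


def effective_establishment_means(record: dict, assertions: dict) -> Optional[str]:
    """Return the editor's assertion if there is one, else the record's own value.

    Assumption: `assertions` maps occurrenceID -> asserted establishmentMeans,
    and an assertion overrides whatever came with the occurrence record.
    """
    asserted = assertions.get(record["occurrenceID"])
    if asserted:
        return asserted
    return record.get("establishmentMeans") or None


def icon_for(record: dict, assertions: dict) -> str:
    """Pick the map icon for one occurrence record."""
    value = effective_establishment_means(record, assertions)
    return ICONS.get(value, DEFAULT_ICON)
```

With something like this, the source record is never changed; the assertion only affects how the dot is drawn.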
We also use assertions that the occurrenceStatus is ‘doubtful’ to prevent dodgy-looking occurrences from displaying on the maps. These assertions mostly indicate that a specimen is probably misidentified, so the curators of the specimen data need to be made aware that these assertions have been made, so they can verify the identification. But even if nothing is done with the annotations at the source, they will still be on the AVH record alerting the user that there might be a problem with the record, so they can decide whether or not to include it in their analyses. Since earlier this week, the ALA Biocache has data quality profiles in which one of the options is to ‘Exclude records with unresolved user annotations’.
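As a rough illustration of the ‘doubtful’ case, the sketch below separates the records that go on the map from those that have been flagged, so the flagged ones can be passed on to the curators. The function name and the `doubtful_ids` set of occurrenceIDs are hypothetical, and this does not use or imitate the ALA data quality profiles.

```python
def split_by_doubt(records, doubtful_ids):
    """Split records into (shown on map, flagged as doubtful).

    Assumption: `doubtful_ids` is the set of occurrenceIDs for which an
    editor has asserted that occurrenceStatus is 'doubtful'.
    """
    shown, flagged = [], []
    for record in records:
        if record["occurrenceID"] in doubtful_ids:
            flagged.append(record)  # keep for curators to verify the identification
        else:
            shown.append(record)    # safe to display on the map
    return shown, flagged
```

The flagged list is what the curators of the specimen data would need to see; the map itself only uses the first list.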
I never said anything about severing the communication between source data and annotations. I do not see how you can. There is obviously a lot that can and needs to be improved both in CMS and annotations to enhance a positive feedback loop. However, a positive feedback loop does not mean just storing annotations that are made elsewhere in the source CMS, which is what I was talking about (and thought @JoeMiller was asking about). Also, while this feedback loop is important, I think annotations add great value to data regardless, so having an annotation store is always useful.
I have been rather heavily invested in one particular regional infrastructure, so I might be biased, but I think the more we increase the role of CMS and the more responsibilities we pile onto collection managers or collections database managers, the more we will be leaving smaller and under-resourced collections behind. So, the more of the responsibilities that CMS and collection managers do not really have to carry can be pushed to shared infrastructures (incl. hosted CMS), the better.
Sorry, this has gone on way too long. Apologies to whoever has to summarise (or even read) this.