10. Transactional mechanisms and provenance

Me too. You cannot legally own data, or so lawyers tell me. You can only ever be a custodian, guardian, steward, or controller of it. By what right, then, could you or your institution be regarded as the sole authority for or about that data? Even being a data controller doesn’t generally confer rights as far as I know, but mainly obligations under, for example, data protection laws. Authority is a convention born of necessity. It has grown up historically because expertise is concentrated in specific collection-holding institutions whose missions are to curate, research and educate. Nevertheless (again, as far as I know), it’s always been possible for other recognised experts to attach additional information to objects they are not custodians of.

I agree we must challenge the conventions, think outside the box, and design and deliver infrastructure that enables new, transformed and combined physical/digital working practices. That infrastructure must be effective on the timescales collection-holding institutions are used to working on, i.e., fit for purpose for the next 100 years.

@nelson assumes that Digital extended Specimens (DS) should be immutable digital objects. Choosing immutable versus mutable DS digital objects must be an explicit design choice, not an assumption, because that choice has all kinds of consequences. It determines what you store and how you store it. It determines how you design your principal digital objects (i.e., the DS) and the related objects (such as different kinds of transaction object) that go along with them. It determines how you process objects, what it means to be a ‘machine-actionable’ object, and how you write the software that does that processing.

Do we want to keep every immutable object as it is transformed from one object version to the next? Do we want to keep every delta and rebuild the object, when needed, to the state required, e.g., as it was at the time it was cited? Do we want a single mutable object that is always current, with access to the record of deltas and transactions that led to it or to any prior version of it? Each of these has different design implications for how the schemes discussed under “7. Persistent identifier (PID) schemes” must operate.
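To make the second option concrete, here is a minimal sketch of keeping only deltas and rebuilding the object to the state it was in at a given time. Everything in it is an assumption for illustration: the `Delta` class, the `state_at` function, the field names and the made-up taxon names are hypothetical and not part of any DS specification.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Delta:
    """One recorded change to a specimen object (hypothetical structure)."""
    timestamp: str            # ISO 8601 UTC string, so string comparison orders correctly
    changes: dict[str, Any]   # field name -> new value


def state_at(deltas: list[Delta], when: str) -> dict[str, Any]:
    """Rebuild the object as it stood at `when` by replaying deltas in time order."""
    state: dict[str, Any] = {}
    for delta in sorted(deltas, key=lambda d: d.timestamp):
        if delta.timestamp > when:
            break  # later deltas did not exist yet at the requested time
        state.update(delta.changes)
    return state


# Illustrative data only.
deltas = [
    Delta("2021-01-10T00:00:00Z", {"scientificName": "Aus bus", "collector": "A. Smith"}),
    Delta("2021-06-02T00:00:00Z", {"scientificName": "Aus cus"}),
    Delta("2022-03-15T00:00:00Z", {"locality": "Site 4"}),
]

cited = state_at(deltas, "2021-12-31T23:59:59Z")    # state when it was cited
current = state_at(deltas, "9999-12-31T23:59:59Z")  # state after all deltas
```

The other two options trade this replay cost for storage or for simplicity: storing every version makes `state_at` a lookup, while a single mutable object keeps `current` at hand and only consults the deltas for history. In each case the PID has to resolve to something different: a version, a rebuild request, or the current object.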

We must be careful not to put the cart before the horse by jumping immediately to specific technical solutions without first settling the proper model by which the ‘DES layer’ illustrated in “Background and context for phase 2” will function. Especially for infrastructure operating on very long timescales (as I mentioned above), the technologies can and will change.

I agree with @DESchindel that transactions are events in the life of an object (its diary, as @DESchindel puts it) and that (as in the art world) the DS’s provenance is the history of those events. Logs and ledgers are the appropriate chronological way to record activities performed by agents on entities (the PROV model) and to attribute them visibly. But the logs/ledgers are separate from, and sit alongside, the detailed records of the events themselves (loans, visits, annotations, interpretations, amendments, enrichments, etc.).
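The separation between the ledger and the event records can be sketched as follows. This is only an illustration under my own assumptions: the class names, field names and identifiers are hypothetical, and the only thing borrowed from PROV is its agent/activity/entity vocabulary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LedgerEntry:
    """One attributable line in the log: who did what to which object, and when."""
    timestamp: str   # ISO 8601 UTC string
    agent: str       # PROV agent
    activity: str    # PROV activity
    entity: str      # PROV entity, here a DS identifier
    record_id: str   # pointer to the detailed event record, stored separately


@dataclass
class Provenance:
    ledger: list[LedgerEntry] = field(default_factory=list)
    records: dict[str, dict] = field(default_factory=dict)  # detailed event records

    def record(self, timestamp: str, agent: str, activity: str,
               entity: str, details: dict) -> LedgerEntry:
        """Store the detailed record, then append a ledger line pointing at it."""
        record_id = f"rec-{len(self.records) + 1}"
        self.records[record_id] = details
        entry = LedgerEntry(timestamp, agent, activity, entity, record_id)
        self.ledger.append(entry)
        return entry

    def history(self, entity: str) -> list[LedgerEntry]:
        """The 'diary' of one object: its ledger lines in chronological order."""
        return sorted((e for e in self.ledger if e.entity == entity),
                      key=lambda e: e.timestamp)


# Illustrative data only.
prov = Provenance()
prov.record("2022-02-01T09:00:00Z", "curator-1", "loan", "DS-001",
            {"borrower": "Museum X", "due": "2022-08-01"})
prov.record("2021-11-20T14:30:00Z", "expert-7", "annotation", "DS-001",
            {"note": "Re-identified from photographs"})
prov.record("2022-03-01T10:00:00Z", "curator-1", "visit", "DS-002",
            {"visitor": "Researcher Y"})

diary = [(e.agent, e.activity) for e in prov.history("DS-001")]
```

Note that the ledger line carries only the attribution and a pointer. The loan agreement or annotation text lives in `records`, so it can have its own structure, access rules and lifetime.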

The fundamental model we must define now is a FAIR Digital Object / [cloud]event model for the long term that is both Web-compatible for the medium term and technology-neutral. It definitely won’t be a solely Web-based model in the first place.
