Webinar 2: Organisms vs. specimens (Wouter Addink)

The following question(s) were asked in the Collection Management Systems Webinar and will be answered here.

Wouter Addink (at 1hr:05 in the webinar video): How would the following use case appear in the model? A branch is collected in the field; one part is preserved as a specimen, and another part is used for a sequence. How would these relate to the object of interest (the organism) when the CMS does not store a record for it?

Response:
The most common situation in collection management systems is that the Organism is not instantiated as a concept separate from the material representing it in the collection. Similarly, it is common that the distinct parts of the Organism are not tracked separately, but rather are enumerated in a list. Let’s look at your scenario in two ways, which differ in how much explicit information the data publisher is able to provide.

As a first example, the data publisher is able to represent all of the Entities involved in the scenario. These would be a plant (a dwc:Organism, an Entity), a branch (a MaterialEntity), a part of a branch used up for sequencing (a MaterialEntity), and, optionally, the remainder of the branch after the part was taken for sequencing (a MaterialEntity). All of these would need identifiers in order to track them as distinct objects, to record their properties over time (EntityAssertions), and to relate them to each other (EntityRelationships).
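To make the first example concrete, here is a minimal sketch in Python of how those Entities and their relationships might be laid out. The class names mirror the concepts above, but the field names, the identifiers (`org-1`, `mat-1`, ...), the relationship labels, and the helper `trace_to_organism` are all assumptions made up for illustration; they are not part of any published schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str    # hypothetical identifier, minted by the publisher
    entity_type: str  # "dwc:Organism" or "MaterialEntity"
    label: str


@dataclass(frozen=True)
class EntityRelationship:
    subject_id: str
    predicate: str    # illustrative wording, not a controlled vocabulary
    object_id: str


entities = [
    Entity("org-1", "dwc:Organism", "plant"),
    Entity("mat-1", "MaterialEntity", "branch"),
    Entity("mat-2", "MaterialEntity", "part of branch used up for sequencing"),
    Entity("mat-3", "MaterialEntity", "remainder of branch"),
]

relationships = [
    EntityRelationship("mat-1", "part of", "org-1"),
    EntityRelationship("mat-2", "sampled from", "mat-1"),
    EntityRelationship("mat-3", "remainder of", "mat-1"),
]


def trace_to_organism(entity_id, entities, relationships):
    """Follow relationships upward until a dwc:Organism is reached."""
    by_id = {e.entity_id: e for e in entities}
    parent = {r.subject_id: r.object_id for r in relationships}
    path = [entity_id]
    while by_id[path[-1]].entity_type != "dwc:Organism":
        path.append(parent[path[-1]])
    return path


print(trace_to_organism("mat-2", entities, relationships))
```

Because every object has its own identifier, the sequenced part can be traced to the branch and from there to the plant, which is exactly the link that goes missing in the second example.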

As a second example, the data publisher has a single Darwin Core Occurrence record for a “PreservedSpecimen” that lists a branch and a sequence as dwc:preparations. There is an occurrenceID as an identifier for the specimen in the Darwin Core archive, but it was fabricated, and is unique only within the dataset, to minimally satisfy the archive’s requirement for a unique identifier for each record. That occurrenceID isn’t actually used in the real world; the catalog number and the collector number are. The catalog number is used in the role of an identifier referring to the Organism, to the part of it that is in the collection, and to the part of it that was used to generate a sequence. An aggregator would have to instantiate a dwc:Organism Entity from which the preparations were made, as well as MaterialEntities for the parts in the specimen record, and give identifiers to all of them to use within the aggregated data store. The aggregator would not be able to tell that the sequence was generated from a sample of the branch that is mentioned, because that information is not in the original data.
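The aggregator's side of the second example could be sketched roughly as follows. This is only an illustration under stated assumptions: the function `instantiate_entities`, the `agg:` identifier prefix, the "derived from" label, and the sample record values are all hypothetical, and the preparations are assumed to be separated by a vertical bar, as Darwin Core recommends for lists of values.

```python
from itertools import count


def instantiate_entities(occurrence, prefix="agg"):
    """Mint an Organism and one MaterialEntity per listed preparation.

    Hypothetical aggregator logic: identifiers are minted locally because
    the source record offers nothing but a shared catalog number.
    """
    counter = count(1)

    def mint():
        return f"{prefix}:{next(counter)}"

    catalog_number = occurrence["catalogNumber"]
    organism = {
        "id": mint(),
        "type": "dwc:Organism",
        "catalogNumber": catalog_number,
    }

    materials = []
    relationships = []
    for prep in occurrence["preparations"].split("|"):
        prep = prep.strip()
        if not prep:
            continue
        material = {
            "id": mint(),
            "type": "MaterialEntity",
            "label": prep,
            "catalogNumber": catalog_number,
        }
        materials.append(material)
        # Each part can only be tied back to the Organism; the record says
        # nothing about how the parts relate to one another.
        relationships.append((material["id"], "derived from", organism["id"]))

    return organism, materials, relationships


# Made-up sample record for illustration only.
record = {
    "occurrenceID": "row-17",
    "basisOfRecord": "PreservedSpecimen",
    "catalogNumber": "ABC 12345",
    "preparations": "branch | sequence",
}

organism, materials, relationships = instantiate_entities(record)
for rel in relationships:
    print(rel)
```

Note that no relationship between the branch and the sequence is produced. The flat record never stated one, so the aggregator has nothing to build it from.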
