This is topic 2.4. in the Information section of the Advancing the Catalogue of the World’s Natural History Collections consultation. Use this topic to discuss the questions listed below.
The TDWG Collection Descriptions (CD) Interest Group is currently developing the CD standard for collection descriptions (evolving from the earlier TDWG Natural Collections Description (NCD) standard). Existing networks and institutional schemes use a variety of different formats or variants of metadata standards for their collection records, as a result of which interoperability between these resources (and hence data aggregation) is limited. To overcome this barrier, clarity is needed around factors such as preferred standards and vocabularies, mandatory fields and compatibility between information in different formats.
The following contributed materials are particularly relevant to this topic:
- What descriptive information should be considered mandatory or desirable for each Collection?
- Does the TDWG CD work supply everything needed?
- Otherwise, what enhancements are necessary?
- How much of this information needs to be normalised for machine processing (rather than just for human readers)?
I think TDWG CD is a great start to an extension for collections description. The only field I would like to see added is a field for database system used which would be useful for both collections and research communities.
I also think the more fields you can make optional the better as info will vary by discipline - as long as it doesn’t compromise the utility of the system. I also think it is important for as many fields as possible to have controlled vocabularies - learning from the issues with traditional DwC fields that we see now - 28,000 unique values in the sex field for instance!!
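To make the controlled-vocabulary point concrete, here is a minimal sketch of how free-text values could be mapped onto a small set of agreed terms at ingestion time. The vocabulary, the synonym table and the function name `normalise_term` are all hypothetical illustrations, not terms taken from TDWG CD or Darwin Core.

```python
# Hypothetical controlled vocabulary for a "sex" field (illustrative terms only).
SEX_VOCABULARY = {"female", "male", "hermaphrodite", "undetermined"}

# Hypothetical synonym table mapping common free-text variants to vocabulary terms.
SYNONYMS = {
    "f": "female",
    "fem": "female",
    "m": "male",
    "unknown": "undetermined",
    "?": "undetermined",
}


def normalise_term(raw, vocabulary, synonyms):
    """Return the controlled term for a free-text value, or None if unmapped."""
    # Ignore case, surrounding whitespace and a trailing full stop.
    key = raw.strip().lower().rstrip(".")
    if key in vocabulary:
        return key
    return synonyms.get(key)


values = ["Female", " F ", "male", "M.", "juvenile?"]
normalised = [normalise_term(v, SEX_VOCABULARY, SYNONYMS) for v in values]
print(normalised)
```

Values that come back as `None` would be the ones to flag for review rather than silently accept, which is how a free-text field ends up with thousands of variants.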
Thanks @abentley, I’ve added a few questions on the database system field to your post in the task group thread
Not so much optional as rather allowing for different profiles per discipline. A meteorite collection needs to be described by other dimensions than an insect collection.
What descriptive information should be considered mandatory or desirable for each Collection?
- a human-readable, informative collection name
- holding institution including contact information (weblink, email, phone, address)
- information on how to cite the collection (as this appears to be an important motivation for establishing the catalogue)
- source of the information in the record
- last update
- a short, human-readable description of the collection
- collection metrics (e.g., number of specimens)
- identifiers for the collection (e.g., acronyms, codes)
- links to representations of the collection and/or its specimens
- links to related collections
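As a rough sketch, the list above could be expressed as a record type with a check for missing mandatory fields. The field names are hypothetical and are not TDWG CD terms, and the split between mandatory and desirable fields is only an assumption made for illustration, since the list itself does not draw that line.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionRecord:
    # Treated as mandatory in this sketch (an assumption, not a proposal).
    name: str
    institution: str
    contact: str
    citation: str
    source: str
    last_updated: str  # ISO 8601 date, e.g. "2020-04-17"
    # Treated as desirable in this sketch.
    description: Optional[str] = None
    specimen_count: Optional[int] = None
    identifiers: List[str] = field(default_factory=list)
    links: List[str] = field(default_factory=list)
    related_collections: List[str] = field(default_factory=list)


MANDATORY = ("name", "institution", "contact", "citation", "source", "last_updated")


def missing_mandatory(record):
    """Return the names of mandatory fields that are empty."""
    return [name for name in MANDATORY if not getattr(record, name)]


# Entirely made-up example record with the contact information left blank.
record = CollectionRecord(
    name="Example Ichthyology Collection",
    institution="Example Natural History Museum",
    contact="",
    citation="Example Natural History Museum, Ichthyology Collection",
    source="curator survey",
    last_updated="2020-04-17",
    specimen_count=120000,
)
print(missing_mandatory(record))
```

A discipline-specific profile could reuse the same check with a different `MANDATORY` tuple.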
How much of this information needs to be normalised for machine processing (rather than just for human readers)?
All of the information should be made available both for human consumption and machine processing according to linked data principles. Depending on the use cases and their prioritization, the latter might be tackled in a second step, after gaining community acceptance.
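For illustration, a machine-readable version of a collection record might look like the JSON-LD produced below. This is only a sketch: it borrows the schema.org `Collection` type and a few of its properties as a stand-in vocabulary, which is my assumption and not something TDWG CD prescribes, and the URI and record values are made up.

```python
import json


def to_jsonld(record):
    """Map a plain dict describing a collection onto a JSON-LD document."""
    return {
        "@context": "https://schema.org",
        "@type": "Collection",
        "@id": record["uri"],  # a resolvable identifier for the collection
        "name": record["name"],
        "description": record["description"],
        "dateModified": record["last_updated"],
        "collectionSize": record["specimen_count"],
    }


# Hypothetical record; example.org is a placeholder domain.
record = {
    "uri": "https://example.org/collections/ichthyology",
    "name": "Example Ichthyology Collection",
    "description": "Fluid-preserved fishes, mainly freshwater species.",
    "last_updated": "2020-04-17",
    "specimen_count": 120000,
}

doc = to_jsonld(record)
print(json.dumps(doc, indent=2))
```

The same record can still be rendered as a normal web page for human readers; the JSON-LD is just a second view of the same information.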
One process-level point I want to highlight is that a “consensus” metadata standard developed without some level of concurrent adoption or testing in institutional data workflows and projects is likely to need major revisions. The development of the Ecological Metadata Language (EML) in conjunction with staff at the U.S. Long-Term Ecological Research sites is a good example of the challenges of separating development and testing. Reflecting on the lengthy, conjoined revision-and-implementation process that followed the announcement of EML 1.0, Millerand and Bowker (2009) conclude that “we slice the ontological pie the wrong way if we see software over here and organizational arrangements over there.” Changes to information standards are often inseparable from changes in organizational roles. For example, the local managers responsible for implementing new standards must become (and be recognized as) developers in their own right in order to articulate how local circumstances map onto abstract expectations. This suggests there is high value in starting to test metadata standards in the core activities of key stakeholders before attempting to finalize agreement on a general consensus standard for describing collections.
Millerand, Florence, and Geoffrey C. Bowker. 2009. “Metadata Standards: Trajectories and Enactment in the Life of an Ontology.” In Standards and Their Stories, edited by Martha Lampland and Susan Leigh Star, 149–65. Ithaca, NY: Cornell University Press.