Frictionless Data and Darwin Core - GBIF Data Blog

dodobot · January 15, 2020, 9:36am

Frictionless data is about removing the friction in working with data through the creation of tools, standards, and best practices for publishing data using the Data Package standard, a containerization format for any kind of data. It offers specifications and software around data publication, transport and consumption. Similarly to Darwin Core, data resources are presented in CSV files while the data model is described in a JSON structure.

This is a companion discussion topic for the original entry at https://data-blog.gbif.org/post/frictionless-data-and-darwin-core/

dshorthouse · January 15, 2020, 3:38pm

Thanks for continuing to push on Frictionless Data. This appears to be the next logical step for DwC-A. Can you provide some suggestions on how a consumer of these data would best navigate many:many relationships? There are at least three ways to represent such relationships in flat CSV files, but no guidance on best practices. Would the onus be on a validator to accept some ways of representing relationships but not others? And what about that eml.xml? It looks as if Frictionless Data has a rather loose, simple expectation for metadata in its datapackage.json (“just make new elements”), and there are no provisions for something like a community-endorsed or domain-specific metadata extension. Is this a problem?

andre · January 16, 2020, 8:38am

Thanks for your comments.
Using a truly relational model (such as Frictionless Data) opens opportunities for both data consumers and producers. For example, the latter could publish locations, people, collections, habitats, hosts… in addition to taxa, observations/specimens and samples. Thanks to primary and foreign keys, consumers will be able to understand and dig into all these additional entities, as they were published by the producer and validated against the standard.
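
To make the primary and foreign keys mentioned above more concrete, here is a minimal sketch of a Data Package descriptor in which a join table links occurrences to people, which is one possible way to express a many:many relationship. The resource names, file paths and field names are hypothetical and chosen only for illustration, and the small checker is not part of any Frictionless library; it handles single-field foreign keys only.

```python
import json

# Hypothetical descriptor: resource names, paths and fields are illustrative only.
descriptor = {
    "name": "example-dwc-package",
    "resources": [
        {
            "name": "occurrence",
            "path": "occurrence.csv",
            "schema": {
                "fields": [
                    {"name": "occurrenceID", "type": "string"},
                    {"name": "scientificName", "type": "string"},
                ],
                "primaryKey": "occurrenceID",
            },
        },
        {
            "name": "person",
            "path": "person.csv",
            "schema": {
                "fields": [
                    {"name": "personID", "type": "string"},
                    {"name": "fullName", "type": "string"},
                ],
                "primaryKey": "personID",
            },
        },
        {
            # Join table: one row per (occurrence, person) pair.
            "name": "occurrence_person",
            "path": "occurrence_person.csv",
            "schema": {
                "fields": [
                    {"name": "occurrenceID", "type": "string"},
                    {"name": "personID", "type": "string"},
                ],
                "primaryKey": ["occurrenceID", "personID"],
                "foreignKeys": [
                    {"fields": "occurrenceID",
                     "reference": {"resource": "occurrence", "fields": "occurrenceID"}},
                    {"fields": "personID",
                     "reference": {"resource": "person", "fields": "personID"}},
                ],
            },
        },
    ],
}


def dangling_references(descriptor, tables):
    """Return (resource, field, value) for foreign-key values with no match.

    Illustrative helper: supports single-field foreign keys only.
    """
    schemas = {r["name"]: r["schema"] for r in descriptor["resources"]}
    problems = []
    for name, schema in schemas.items():
        for fk in schema.get("foreignKeys", []):
            field = fk["fields"]
            ref = fk["reference"]
            known = {row[ref["fields"]] for row in tables[ref["resource"]]}
            for row in tables[name]:
                if row[field] not in known:
                    problems.append((name, field, row[field]))
    return problems


# In-memory rows standing in for the CSV files.
tables = {
    "occurrence": [
        {"occurrenceID": "occ1", "scientificName": "Raphus cucullatus"},
        {"occurrenceID": "occ2", "scientificName": "Pezophaps solitaria"},
    ],
    "person": [
        {"personID": "p1", "fullName": "A. Collector"},
        {"personID": "p2", "fullName": "B. Identifier"},
    ],
    "occurrence_person": [
        {"occurrenceID": "occ1", "personID": "p1"},
        {"occurrenceID": "occ1", "personID": "p2"},
        {"occurrenceID": "occ2", "personID": "p1"},
        {"occurrenceID": "occ3", "personID": "p2"},  # occ3 is not published
    ],
}

descriptor_json = json.dumps(descriptor, indent=2)  # what datapackage.json would hold
problems = dangling_references(descriptor, tables)
print(problems)  # [('occurrence_person', 'occurrenceID', 'occ3')]
```

A consumer can follow the `foreignKeys` entries to join the tables, and a validator can reject rows such as the last one, which points at an occurrence that was never published.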

On the metadata side, I do agree that Frictionless Data is quite loose and only requires you to describe the data model/structure. But it also suggests adding a readme file, and nothing prevents you from adding an eml.xml (or other agreed metadata) to your data package. My tool keeps the eml.xml (and all other files) in the data package. It also converts eml.xml into a human-readable readme.md file. So the output of FrictionlessDarwinCore is both a Darwin Core archive and a Data Package. But I do agree, well-defined metadata are crucial and should be both machine- and human-readable.
Hoping this answers your questions,
