Frictionless Data and Darwin Core - GBIF Data Blog

Frictionless data is about removing the friction in working with data through the creation of tools, standards, and best practices for publishing data using the Data Package standard, a containerization format for any kind of data. It offers specifications and software around data publication, transport and consumption. Similarly to Darwin Core, data resources are presented in CSV files while the data model is described in a JSON structure.


This is a companion discussion topic for the original entry at https://data-blog.gbif.org/post/frictionless-data-and-darwin-core/

Thanks for continuing to push on Frictionless Data. This appears to be the next logical step for DwC-A. Can you provide some suggestions on how a consumer of these data would best navigate many-to-many relationships? There are at least three ways to represent such relationships in flat CSV files, but no guidance on best practices. Would the onus be on a validator to accept some ways of representing relationships but not others? And what about that eml.xml? It looks as if Frictionless Data has a rather loose, simple expectation for metadata in its datapackage.json (“just make new elements”), and there are no provisions for something like a community-endorsed or domain-specific metadata extension. Is this a problem?

Thanks for your comments,
Using a truly relational model (such as Frictionless Data) opens opportunities for both data consumers and producers. For example, the latter could publish locations, people, collections, habitats, hosts… in addition to taxa, observations/specimens and samples. Thanks to primary and foreign keys, consumers will be able to understand and dig into all these additional entities, as they were published by the producer and validated by the standard.
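
To make the primary/foreign key idea concrete, here is a minimal sketch of what such a descriptor could look like, with a small check that every foreign key value points at an existing row. The resource names, column names and sample rows are made up for illustration, and the check is plain Python rather than any particular Frictionless tool; only the `primaryKey` and `foreignKeys` properties come from the Table Schema specification.

```python
import csv
import io
import json

# Hypothetical sample data: two CSV resources, inlined so the sketch is self-contained.
TAXON_CSV = "id,scientificName\nt1,Parus major\nt2,Turdus merula\n"
OCCURRENCE_CSV = (
    "id,taxon_id,eventDate\n"
    "o1,t1,2020-05-01\n"
    "o2,t2,2020-05-02\n"
    "o3,t9,2020-05-03\n"  # t9 does not exist in the taxon table
)

# A datapackage.json-style descriptor (names and paths are assumptions).
descriptor = {
    "name": "example-package",
    "resources": [
        {
            "name": "taxon",
            "path": "taxon.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "string"},
                    {"name": "scientificName", "type": "string"},
                ],
                "primaryKey": "id",
            },
        },
        {
            "name": "occurrence",
            "path": "occurrence.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "string"},
                    {"name": "taxon_id", "type": "string"},
                    {"name": "eventDate", "type": "date"},
                ],
                "primaryKey": "id",
                "foreignKeys": [
                    {
                        "fields": "taxon_id",
                        "reference": {"resource": "taxon", "fields": "id"},
                    }
                ],
            },
        },
    ],
}


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def dangling_keys(descriptor, tables):
    """Return (resource, row key, missing value) for each unresolved foreign key."""
    problems = []
    for resource in descriptor["resources"]:
        schema = resource["schema"]
        for fk in schema.get("foreignKeys", []):
            ref = fk["reference"]
            known = {row[ref["fields"]] for row in tables[ref["resource"]]}
            for row in tables[resource["name"]]:
                if row[fk["fields"]] not in known:
                    problems.append(
                        (resource["name"], row[schema["primaryKey"]], row[fk["fields"]])
                    )
    return problems


tables = {"taxon": read_rows(TAXON_CSV), "occurrence": read_rows(OCCURRENCE_CSV)}
problems = dangling_keys(descriptor, tables)

print(json.dumps(descriptor, indent=2))
print(problems)
```

The sketch only handles single-column keys; the specification also allows a key to span several fields, which would need a little more code.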

On the metadata side, I do agree that Frictionless Data is quite loose and only requires you to describe the data model/structure. But it also suggests adding a readme file, and nothing prevents you from adding an eml.xml (or other agreed metadata) to your data package. My tool keeps the eml.xml (and all other files) in the data package. It also converts eml.xml into a human-readable readme.md file. So the output of FrictionlessDarwinCore is both a Darwin Core archive and a Data Package. But I do agree that well-defined metadata is crucial and should be both machine- and human-readable.
Hoping this answers your questions,