I am pleased to share a series of data quality scripts for Open Refine developed by the Coordinating Team of SiB Colombia (GBIF Colombia), along with an Open Refine Advanced Guide for biodiversity data (available in Spanish; the English version is under construction). These scripts automate validation processes and aid data cleaning using tools already available in the GBIF community. We hope these quality routines help all GBIF Nodes speed up data validation with publishers (ideally before publication) and so improve the quality of the data available through GBIF.
Key links:
GitHub repository
All the scripts with metadata in both English and Spanish
Open Refine Basic Guide
The current version is available only in Spanish and targets Open Refine versions below 3.1. The next version will be optimized for Open Refine 3.1 or above and will be released in both English and Spanish in late 2019.
The scripts and guides are under continual development. All comments and issues are welcome at our GitHub repository, so that we can keep improving these tools for the whole GBIF community.
We are sharing scripts for OpenRefine, and guides that explain how to use them, to facilitate the validation and cleaning of biodiversity data, mainly in the geographic and taxonomic domains.
Thanks for sharing those scripts, Leonardo. They are a really useful tool.
Just for your information, the Canadensys date parsing tool is now under https, if you want to modify that in your script.
If you are interested in a French translation, I can help with that.
Thanks a lot for your feedback! We will update the API calls to Canadensys in the scripts to use “https”, thanks for the heads-up.
We would be glad to have your support with a French translation. You could share it through issues on our GitHub repository or, if you prefer, send us the translation files at sib@humboldt.org.co.
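For anyone who has already saved a copy of a script and wants to make the “https” change mentioned above themselves, here is a minimal Python sketch. It is only an illustration, not how the SiB Colombia scripts are maintained. It assumes the script is an Open Refine operation history in JSON; the host name `data.canadensys.net`, the endpoint path and the sample operation are assumptions made up for the example.

```python
import json


def upgrade_to_https(operations, host):
    """Return a copy of the operations with http://<host> rewritten to https://<host>.

    `operations` is the parsed JSON of an operation history (a list of dicts).
    Only URLs that point at `host` are changed; the input is left untouched.
    """
    text = json.dumps(operations)
    text = text.replace("http://" + host, "https://" + host)
    return json.loads(text)


# Hypothetical operation: the endpoint path and column names are assumptions.
sample = [
    {
        "op": "core/column-addition-by-fetching-urls",
        "baseColumnName": "eventDate",
        "newColumnName": "parsedDate",
        "urlExpression": 'grel:"http://data.canadensys.net/tools/dates.json?data=" + escape(value, "url")',
    }
]

updated = upgrade_to_https(sample, "data.canadensys.net")
print(updated[0]["urlExpression"])
```

The rewrite is restricted to one host on purpose, so calls to other services that may still be plain http are not changed by accident.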