Events in biodiversity data usually have a date. Suppose the event was on 3 May 2021. How would that date be represented numerically in your records or (if you’re at a museum or herbarium) in your CMS? Here are some possibilities:
5/3/21
5-3-21
05/03/21
05-03-21
5/3/2021
5-3-2021
05/03/2021
05-03-2021
3/5/21
3-5-21
03/05/21
03-05-21
3/5/2021
3-5-2021
03/05/2021
03-05-2021
2021/5/3
2021-5-3
2021/05/03
03052021
20210305
v.3.21
v.3.2021
3.v.21
3.v.2021
OK, now which of the above is the correct format for sharing biodiversity data?
Answer: None of the above. The international standard way of representing that date is 2021-05-03.
This YYYY-MM-DD format was adopted as a date standard (ISO 8601) more than 30 years ago. You can use whatever date format you prefer in your own records and in your own institution, but if you want to share your dates with the rest of the world, please use YYYY-MM-DD.
Advantages
The biggest advantage of YYYY-MM-DD is that it avoids ambiguity. Is “5/3/21” meant to be 3 May 1921? 3 May 2021? 5 March 1921? 5 March 2021?
The next biggest advantage comes when your computer sorts YYYY-MM-DD dates. By default they sort chronologically, oldest first.
Parochialism
USA-based readers of this post might like to point out that everyone in the USA writes their dates with month first, then day, then year (MDY), so that “5/3/2021” is perfectly understandable as May 3 2021.
In response, I’d like to point out (politely) that the USA is almost completely isolated in using MDY. Other countries prefer DMY or YMD and would see “5/3/2021” as 5 March 2021.
Spreadsheets
The nasty behaviour of spreadsheets with dates (and things that might be dates) is so well known that I won’t give Horrible Examples here. But there are two simple ways to enter “2021-05-03” in a spreadsheet cell without it becoming something else:
- Format the column as “Text” instead of “Date” or “General”
- Type an apostrophe before the YYYY-MM-DD. This forces the spreadsheet to see the entry as text. The apostrophe won’t show in the cell (but you can see it in the formula bar), and if you copy-and-paste the cell into a text file, the apostrophe disappears.
Even better, avoid unwanted changes in date format and other spreadsheet disasters by not using a spreadsheet.
Missing elements
Suppose you only know the year of the event, or only the year and month. The ISO 8601 (2004) way to represent year-only dates is YYYY, and YYYY-MM for year-and-month-only dates.
Please do not share dates like “2021-00-00” or "2021-?? ", and avoid sharing “1950-01-01” for “sometime in 1950”.
If you only know the month, or only the month and day of an event, in Darwin Core you can use the month, day and verbatimEventDate fields to enter what you know. Leave eventDate blank, because “??-07-23” or “–07-23” will not be correctly parsed as ISO 8601.
Date ranges
ISO 8601 also has a simple way to represent intervals like “3 May 2021 to 16 May 2021”. It’s “2021-05-03/2021-05-16”, or alternatively “2021-05-03/16”. Note that there is no space on either side of the forward slash “/”.
An interval such as “sometime between 1869 and 1873” is most simply represented as “1869/1873”, not as “1869-01-01/1873-12-31”.
More on uncertainties
ISO 8601 was extended in 2019 to introduce standard ways to deal with uncertainty in dates. These have not yet been widely adopted. For an overview of the extensions and some skeptical comments, see this blog post.
An XKCD webcomic by Randall Munroe.
Robert Mesibov (“datafixer”); robert.mesibov@gmail.com