@rdmpage Thank you for taking the time to respond. I very much appreciate your willingness to discuss this complex topic of data provenance and citation, and their use in digital infrastructures and services.
For your purposes
- attribution: potentially thousands of folks, if not more, have helped to provide the data, infrastructure, and software that you use to provide a useful service. By providing an accurate and precise citation, you make it possible to resolve and credit those folks (and robots) for their contributions.
- debugging: being able to trace the (many?) transformations that the provided data products went through when analyzing suspicious results.
- error analysis: seen as biodiversity measurement devices, collection management systems and the services that index them produce data that is not perfect. That data is subject to systematic errors (e.g., bias) and measurement errors (e.g., classification errors, machine transcription errors). Understanding the provenance of this data helps work towards error estimation appropriate for the origin of the knowledge.
If you’d like more reasons, I would imagine that articles on the benefits of Open Science [0] and/or the FAIR principles ([1], note that FAIR does not necessarily mean open) would offer more details on the benefits of citing your sources. How can you access and re-use data products without understanding where they came from?
> what information should this tool provide that would be enough for you cite it with confidence?
Ideally, your tool would provide signed citations [2], so that each citation and its associated content (and origin!) are persistent and verifiable.
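To make "verifiable" a bit more concrete, here is a minimal sketch of a content-based identifier: the citation carries a cryptographic hash of the exact bytes that were used, so anyone holding a copy can check that it matches. The function names `content_id` and `verify` and the sample data are made up for illustration, and the `hash://sha256/...` notation is only an assumed way of writing the identifier down; see [2] for the actual proposal.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes themselves (hypothetical helper).

    The 'hash://sha256/' prefix is an illustrative notation, not a
    claim about what any particular tool emits.
    """
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def verify(data: bytes, cited_id: str) -> bool:
    """Check that a copy of the data matches the identifier in a citation."""
    return content_id(data) == cited_id


# Made-up example data standing in for a data product.
original = b"scientificName,decimalLatitude\nPuma concolor,37.8\n"
cited = content_id(original)

# An unmodified copy verifies; a copy with even one changed byte does not.
assert verify(original, cited)
assert not verify(original.replace(b"37.8", b"37.9"), cited)
```

Because the identifier depends only on the content, it stays valid no matter where the data is hosted or moved, which is what makes this kind of citation persistent as well as checkable.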
Disclaimer
I am co-author of [2].
References
[0] Soranno, P. A., Cheruvelil, K. S., Elliott, K. C., & Montgomery, G. M. (2015). It’s Good to Share: Why Environmental Scientists’ Ethics Are Out of Date. BioScience, 65(1), 69–73.
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018.
[2] Elliott, M. J., Poelen, J. H., & Fortes, J. (2022, in review). Signed Citations: Making Persistent and Verifiable Citations of Digital Scientific Content. https://doi.org/10.31222/osf.io/wycjn