From the white paper:
Assessing whether primary biodiversity data meet the quality standards needed for further use in indicators is critical. Even if we were to satisfy the need for better quality data, questions would remain about how homogeneous and repeatable the treatment of the same data can be in different contexts. The same data, from multiple sources, are being used by distinct organizations or collaborations to build EBVs and indicators. Stakeholders developing a given EBV or indicator treat the data independently, apply their own filters and quality checks, and perform their own taxonomic harmonization, which may be more or less similar to the processes used by other stakeholders. If biodiversity data platforms could prepare and share species occurrence data in advance for EBV and indicator creation, as EBV-usable datasets, or, for example, make the data-processing workflows available, better consistency and transparency might be achieved. GBIF is exploring ways of assisting this process, for example through pre-filtered versions of GBIF-mediated data exported regularly to public cloud environments.
A second opportunity of equal importance is to improve the communication pipeline between data provider and data user. Data, and communications about these data, tend to flow in one direction, from local data collection and mobilization to scientists and policymakers, with little to no communication in the opposite direction. GBIF and other biodiversity data platforms have made commendable efforts to track data downloads and, through the use of Digital Object Identifiers (DOIs), to report citations in published works back to data publishers once those works are made public. Improved communication builds trust across the data provider network by informing organizations and individuals at the local level about how their data are used. These communications could occur in many ways, including notifications that alert data publishers when their data have been used in the creation of EBVs, biodiversity indicators and other high-level policy documents, using tools similar to the GBIF citation widget. Another effective communication strategy could be to present, as part of capacity-building activities and other public events, specific examples that demonstrate how high-quality data and associated metadata are being used to influence science and policy. These possibilities will remain only possibilities, however, without greater transparency.
A third opportunity to work towards greater transparency and traceability across the entire information supply chain is to document all steps taken to create indicators. In this complex process, it is not uncommon for the processes and analyses used to generate these synthesized data and policy products to remain undocumented or hidden from public view. It is similarly difficult to know exactly which data were used in these processes, and how. The CBD Secretariat and UNEP-WCMC are currently working on standardizing the metadata requirements for the proposed headline indicators (see the example for the Species Habitat Index, UNEP-WCMC 2021). These requirements must include clear reporting of the datasets (DOIs) used and the data providers consulted to improve traceability even further.