Google Dataset Search and Download "Datasets"

GBIF uses schema.org markup to get its datasets indexed by Google Dataset Search. However, Dataset Search is also indexing all the downloads, presumably because they have a DOI. As every download is a derivative of other datasets, Dataset Search is filling up with rather meaningless “datasets” that are not really datasets at all.
I had a quick look at the Google documentation and couldn’t see a way to stop a dataset from being indexed, although you would expect them to have thought of situations like this.
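
For anyone unfamiliar with the mechanism, here is a rough sketch of the kind of schema.org `Dataset` markup that a crawler can pick up from a landing page. It is only an illustration: the helper names, the DOI, the URL and the field selection are all made up for the example, and it is not GBIF's or DataCite's actual output.

```python
import json


def dataset_jsonld(name, description, doi, url):
    """Build a minimal schema.org Dataset description as a JSON-LD dict.

    Hypothetical helper for illustration; real landing pages usually
    carry many more properties (license, creator, distribution, ...).
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": f"https://doi.org/{doi}",
        "url": url,
    }


def as_script_tag(record):
    """Wrap the JSON-LD in the script tag that would sit in the page HTML."""
    body = json.dumps(record, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values, not a real record.
example = dataset_jsonld(
    name="Example occurrence download",
    description="A filtered subset of records drawn from several source datasets.",
    doi="10.0000/example",
    url="https://example.org/download/0000000-000000000000000",
)

print(as_script_tag(example))
```

A download page described like this looks the same to a crawler as a page for a source dataset, which is presumably why the two end up mixed together in the results.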

Hi Quentin,

We have only provided schema.org metadata for datasets proper, not for downloads. However, it seems that because downloads have DOIs too, DataCite is providing the metadata for these. I agree that they are “diluting” the picture, but I’m not sure we can do much about it.

That being said, I’m not sure how much Dataset Search is being used for finding datasets…