Google on Wednesday announced the beta release of a new search engine named Dataset Search, aimed at scientists and journalists looking for specific datasets.
Google Dataset Search
Google’s Dataset Search works similarly to Google Scholar, the search giant’s popular search engine for academic studies and reports. The new tool lets users find datasets stored across thousands of repositories on the Web, making these datasets universally accessible and useful.
In an official blog post, Natasha Noy, a research scientist at Google AI, said: “Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page.”
To create Dataset Search, the tech giant has developed guidelines that help dataset providers describe their data in a way that lets Google (and other search engines) better understand the content of their pages.
“These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset,” Noy said.
“We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem,” added Noy.
Google recommends that institutions adopt the open schema.org markup standard to make sure their datasets are discoverable through Google’s tool. The standard lets publishers attach machine-readable metadata to a dataset page, such as the date of publication, how the data was collected, and the terms of use.
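To give a rough idea of what such markup can look like, here is a minimal Python sketch that builds a schema.org Dataset description as JSON-LD, one of the formats commonly used for this kind of metadata. The function name build_dataset_jsonld and every value in the example (dataset name, organization, dates) are invented for illustration, and the choice of properties is an assumption rather than Google's full list of requirements.

```python
import json


def build_dataset_jsonld(name, description, creator, date_published,
                         license_url, measurement_technique):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration; the property selection mirrors
    the metadata mentioned in the article, not an official checklist.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # Terms for using the data
        "license": license_url,
        # How the data was collected
        "measurementTechnique": measurement_technique,
    }
    return json.dumps(record, indent=2)


# All values below are made up for the example.
example = build_dataset_jsonld(
    name="Example Rainfall Measurements",
    description="Daily rainfall totals from a fictional weather station.",
    creator="Example Weather Institute",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    measurement_technique="Automated rain gauge readings",
)

# A publisher would typically embed the result in the dataset's web page.
print('<script type="application/ld+json">')
print(example)
print("</script>")
```

The printed script element is the kind of snippet a publisher could place in the HTML of a dataset's landing page so that crawlers can read the metadata alongside the human-readable content.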
Dataset Search works with multiple languages but not all, and the company plans to extend support to more languages in the future.
The tool already indexes datasets from organizations such as NOAA, NASA, Harvard Dataverse, and ProPublica, and more data providers are expected to add support.
So what do you think about this new development from Google? Share your views in the comments and don’t forget to subscribe.