Dr Piotr Krajewski
Employment
- Senior Librarian at Scientific and Technological Information Section
Scientific posters
Piotr Krajewski, Aleksander Mroziński
A comprehensive study of the metadata standards used to implement a multidisciplinary, open data repository
16th RDA Plenary Meeting
November 9-12, 2020, Costa Rica (online)
Summary: Well-described metadata plays an important role in enabling the efficient storage, sorting, retrieval, sharing, and linking of scientific data. Different kinds of metadata standards have been developed to describe different kinds of data: text, images, video, etc. Several general standards have also been developed to meet the needs of the various scientific disciplines. Unfortunately, common elements of data description have a different status in each distinct metadata standard, and these differences hinder interoperability. This poster describes the effects of three important factors—universality, consistency, and interoperability of metadata—on the process of implementing a metadata schema for a multidisciplinary, open research data repository called the Bridge of Data.
The Bridge of Data is a recently launched open research data repository developed as part of the ‘Bridge of Data. Multidisciplinary Open Knowledge Transfer System - stage II: Open Research Data’, a project co-financed by the European Regional Development Fund under the operational programme Digital Poland 2014-2020. The Bridge of Data Repository is one module of the Bridge of Knowledge platform (http://mostwiedzy.pl).
To reconcile existing standards, three important goals were taken into account when creating the metadata schema.
The first goal may be called ‘general scheme for different fields of science’. The repository is a collaborative project of three universities from the Pomeranian region representing many scientific disciplines. Because scientists have been unable to define metadata standards and requirements for the vast majority of research data, the crucial task was to render metadata usable for different fields of science. To this end, four hundred different sample sets of research data from various disciplines (including engineering and technology, medical and health sciences, social sciences, and humanities) were obtained and analyzed. These datasets were used to determine which components of the metadata schema would be mandatory for all data.
The second goal may be called ‘metadata granularity and compliance with metadata standards’. The first step toward this goal was to compare commonly used metadata standards such as Dublin Core, INSPIRE, and DDI, which made it possible to identify the metadata attributes common to all of them.
The third goal may be called ‘compliance with indexing services’ requirements’. It was essential to prepare a metadata schema that would enable the Bridge of Data repository to be indexed in multidisciplinary and international databases. Verifying the mandatory requirements of indexing services such as Web of Science, DataCite, re3data, OpenAIRE, and Google Dataset Search was one of the first steps in creating the schema. It was also necessary to associate metadata fields with other information from the Bridge of Knowledge platform, such as information about publications, projects, and the profiles of researchers.
The result of this process was the creation of a metadata schema that consists of eight mandatory fields: description, author(s), year of publication, dataset language, field(s) of science, DOI, funding, and keywords. Two other fields, creation date and ethical papers, are optional.
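As an illustration only, the mandatory and optional fields listed above could be checked against a dataset record roughly as follows. This is a minimal sketch, not the repository's actual implementation: the snake_case key names, the helper functions, and the sample record (including the placeholder DOI) are all assumptions introduced here.

```python
from typing import Dict, List

# Field lists taken from the summary above; the snake_case key names
# are hypothetical and not the repository's real identifiers.
MANDATORY_FIELDS = (
    "description",
    "authors",
    "publication_year",
    "language",
    "fields_of_science",
    "doi",
    "funding",
    "keywords",
)
OPTIONAL_FIELDS = ("creation_date", "ethical_papers")


def missing_mandatory(record: Dict[str, object]) -> List[str]:
    """Return the mandatory fields that are absent or empty in the record."""
    return [name for name in MANDATORY_FIELDS if not record.get(name)]


def unknown_fields(record: Dict[str, object]) -> List[str]:
    """Return record keys that belong to neither field list, sorted by name."""
    allowed = set(MANDATORY_FIELDS) | set(OPTIONAL_FIELDS)
    return sorted(set(record) - allowed)


# Hypothetical sample record; every value is a placeholder.
sample_record = {
    "description": "Example measurement series",
    "authors": ["Example Author"],
    "publication_year": 2020,
    "language": "en",
    "fields_of_science": ["engineering and technology"],
    "doi": "10.0000/placeholder",
    "funding": "Example grant",
    "keywords": ["metadata", "open data"],
    "creation_date": "2020-01-01",
}

print(missing_mandatory(sample_record))
print(unknown_fields(sample_record))
```

Keeping the two field lists as plain tuples makes the mandatory/optional split explicit, so a record lacking, say, a DOI or funding information can be rejected before deposit, while the optional fields may simply be omitted.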