Metadata are information about the context, content, quality, provenance, and/or accessibility of data. This information is critical for ensuring the longevity and reproducibility of research data.
Metadata can exist in a variety of different formats. Here are a few of the most common:
Discipline | Metadata standard |
---|---|
Biodiversity | The Darwin Core (DwC) is a standard designed to facilitate the exchange of information about the geographic occurrence of species and the existence of specimens in collections. |
Geospatial | Geospatial metadata commonly document geographic digital data such as Geographic Information System (GIS) files, geospatial databases, and earth imagery. They can also be used to document geospatial resources, including data catalogs, mapping applications, data models, and related websites. |
Social Science | The Data Documentation Initiative (DDI) is an effort to create an international standard for describing data from the social, behavioral, and economic sciences. Expressed in XML, the DDI metadata specification now supports the entire research data life cycle. |
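
To make the idea of a discipline-specific standard more concrete, here is a minimal sketch in Python of one occurrence record written as CSV with Darwin Core term names as the column headers. The six terms used (`occurrenceID`, `basisOfRecord`, `scientificName`, `eventDate`, `decimalLatitude`, `decimalLongitude`) are Darwin Core terms. The record values and the helper name `to_csv` are hypothetical, and the choice of which terms to include is an assumption that should be checked against the guidelines of the repository you plan to use.

```python
import csv
import io

# Column names are Darwin Core terms. The values below are made-up
# illustration data, not a real observation.
DWC_FIELDS = [
    "occurrenceID",
    "basisOfRecord",
    "scientificName",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
]

record = {
    "occurrenceID": "example-0001",        # hypothetical identifier
    "basisOfRecord": "HumanObservation",
    "scientificName": "Quercus alba",
    "eventDate": "2020-05-17",             # ISO 8601 date
    "decimalLatitude": "38.03",
    "decimalLongitude": "-78.48",
}


def to_csv(records, fields=DWC_FIELDS):
    """Serialize records to CSV text with Darwin Core terms as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv([record])
print(csv_text)
```

Because the column names come from a shared standard rather than from one project's private conventions, another researcher or an aggregator can interpret the file without a separate data dictionary.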
If you are unsure which metadata standards are in use in your discipline, the Research Data Alliance Metadata Standards Working Group maintains a directory of metadata standards by discipline, including related tools and use cases.
In addition, if you intend to deposit your data in a data repository, that repository may have guidelines on which metadata standard(s) to use when describing deposited data.
A controlled vocabulary is a collection of preferred terms used to describe and retrieve content consistently. Only predefined, authorized terms may be used, in contrast to free-form tags or keywords, which are uncontrolled and therefore often ambiguous and inconsistent. Taxonomies, thesauri, and ontologies are all types of controlled vocabulary.
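The difference between free keywords and a controlled vocabulary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary `PREFERRED_TERMS`, its variant spellings, and the function names `build_lookup` and `normalize` are all hypothetical and are not taken from any published vocabulary.

```python
# A tiny, hypothetical controlled vocabulary: each preferred term maps
# to the variant spellings that should be normalized to it.
PREFERRED_TERMS = {
    "Neoplasms": {"cancer", "tumor", "tumour", "neoplasm"},
    "Myocardial Infarction": {"heart attack", "mi"},
}


def build_lookup(vocabulary):
    """Map every lowercased preferred term and variant to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def normalize(keywords, vocabulary=PREFERRED_TERMS):
    """Split free keywords into preferred terms and unrecognized keywords."""
    lookup = build_lookup(vocabulary)
    accepted, rejected = [], []
    for keyword in keywords:
        preferred = lookup.get(keyword.strip().lower())
        if preferred is None:
            rejected.append(keyword)       # not an authorized term
        elif preferred not in accepted:
            accepted.append(preferred)     # collapse synonyms to one term
    return accepted, rejected


accepted, rejected = normalize(["Tumour", "cancer", "heart attack", "big data"])
print(accepted)  # ['Neoplasms', 'Myocardial Infarction']
print(rejected)  # ['big data']
```

Two different spellings collapse to a single preferred term, and a keyword outside the vocabulary is flagged instead of being silently accepted, which is what makes later retrieval consistent.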
In some fields, vocabularies are well established, while in others they are still emerging. You may want to check professional societies and journals for vocabularies that have been developed in your disciplinary area. The list below is a starting point. JISC Digital Media has also developed a more comprehensive Directory of Metadata Vocabularies that may provide guidance.
Disciplinary area | Example vocabulary |
---|---|
Life Science | BioPortal provides biomedical vocabularies from the NIH National Centers for Biomedical Computing. |
Geospatial | The Geographic Names Information System (GNIS), developed by the USGS in cooperation with the U.S. Board on Geographic Names, contains information about physical and cultural geographic features in the United States and associated areas, both current and historical (not including roads and highways). The database holds the federally recognized name of each feature and defines the location of the feature by state, county, USGS topographic map, and geographic coordinates. |
Medical | Medical Subject Headings (MeSH) is a controlled vocabulary for indexing journal articles and books in the life sciences. It is created and updated by the U.S. National Library of Medicine. |
Agriculture | The Agricultural Thesaurus provides online vocabulary tools for agricultural terms in English and Spanish. It is cooperatively produced by the National Agricultural Library, USDA, and the Inter-American Institute for Cooperation on Agriculture. |
Biodiversity | Biocomplexity Thesaurus displays terminologies and term relationships in the fields of biology, ecology, environmental sciences, and sustainability. |
Humanities | Music Ontology provides main concepts and properties for describing music (artists, albums, tracks, arrangements). |
If you would like assistance in finding, adapting, and using an appropriate vocabulary, one of the following specialists may be able to help.