Transcript for:
Understanding Metadata in Research Data Management

[Music]

Metadata and documentation play an important role in RDM, enabling data to be found and reused. Metadata is often defined as data that describes other data. If you have seen our knowledge clip about documentation, you might remember that the key difference between them is that metadata records essential information about data in a highly structured way, using a set of defined information fields or elements. Metadata is highly structured because it is meant to be readable and exchangeable by computers, something often referred to as machine readability.

Metadata is needed for many things. It facilitates the process of searching and finding data. Metadata can help us to assess whether the data we find is useful for us or not, without having to download it first. It also lets us know how the data can be accessed and how it can be reused. Because of this, metadata is essential to make your data FAIR.

Let's now have a look at some metadata concepts to understand why it is so important. First of all, there are different types of metadata. A first type is called descriptive metadata. This type includes common elements or fields that help us to discover the data, for instance the title of the data set, the author, keywords describing the subject, and so on. When we talk about technical metadata, we mean information about technical aspects of the data or files, for instance how to access the data, the file type used, or the size of the file. Administrative metadata contains elements or fields that deal with intellectual property rights, such as the license or access rights or restrictions. Finally, there is also structural metadata. This type of metadata indicates how the data set relates to other online resources.

So how is metadata created, and where can we find it? Metadata can be associated with many different research objects and appear in many different ways. Sometimes metadata is generated automatically. Some instruments such as
microscopes, telescopes, or digital cameras create metadata when data is collected, but this is not always the case. Other times metadata needs to be created manually, for instance by taking notes in a laboratory notebook or by filling out a form or data listing.

The second question is: how is metadata stored? Metadata can be stored embedded within the files, or it can be stored as separate files. Another way to provide metadata comes when you upload your data to a data repository or archive. Let's have a look at some details and examples.

Most day-to-day digital files include a range of metadata fields. These allow you, for example, to search and sort files according to date created, file type, author, size, etc. Discipline-specific file formats often have additional embedded metadata fields. For example, microscopy images normally include the objective settings within the file. Besides research instrumentation, metadata can also be generated by processing or analysis software. For example, statistical packages such as SPSS embed rich metadata within the file, like formats or additional variable information. It is important to find out whether the file formats you use have metadata fields embedded and whether these are needed to use the data. If you plan to convert a file with embedded metadata to a different file format, you should check whether these metadata will also be present in the new format.

In some domains, another place where metadata can be found is in the header of the files. Typically this is a section at the top of the document, preceding the data, containing a summary of the data or information about the instrumentation settings, the variables, etc. Often this metadata header follows agreed conventions or standards, and the information it contains can be read by applications, processing software, or algorithms. In other cases, a metadata header can be manually created by a researcher, for example to provide contextual details about an interview in the transcription file.

When
metadata is generated by research instrumentation or software, it might also be stored in a separate file. For example, sensors and measurement devices often provide configuration or calibration files, and software used to process geographical data might store geospatial metadata, such as the coordinate system, in separate files. But these separate files can also be manually generated by the researcher, for example as a readme file or a spreadsheet. Recording metadata in this way can also be done in a structured way, and templates are often available to help you.

Using readme files can be a useful way to collect metadata during the course of the project. However, this approach has some downsides. For example, there is a risk that the link between metadata and the data they represent is lost, for instance when files are moved. Keeping some kind of metadata is certainly better than collecting no metadata at all, but as a general rule, custom-made approaches make it difficult for metadata to be machine readable, and your data become less findable and reusable.

And this takes us to our last point: providing metadata on a data repository or archive. Depositing your data in a repository might be required by your institution or research funder policies, or by the journal in which you want to publish your results. Even if not required, it is a good research practice and will increase the FAIRness of your data, because data repositories provide functionalities to make your data more FAIR, including services to create and manage metadata. To upload your data to a repository, you will be required to fill in a user-friendly form to describe your data. All the fields in this form are in fact metadata fields, pre-configured to meet a specific metadata standard, allowing the result to become machine readable.

Then what are metadata standards? When the information fields captured within a specific metadata set become widely used and accepted, the set often evolves into a metadata standard. To put it simply, a metadata
standard or metadata schema defines the set of elements that can or must be used to describe a resource. The standard also tells you how these elements should be named, and which values are allowed or what the required format is for each of the elements.

Some metadata standards are designed to be used across different scientific domains. Examples of such generic standards are the Dublin Core standard and DataCite. But there are also discipline-specific standards, which typically contain additional elements to satisfy the needs of a particular scientific domain. For example, the Ecological Metadata Language is used in ecology research and has additional elements such as taxonomic coverage, to indicate which species are included in the data set. Another example of a specialized metadata standard is the Data Documentation Initiative, or DDI. This standard contains elements such as questionnaire specification for research that involves surveys. The use of metadata standards facilitates data exchange between different systems or applications. In other words, it makes research metadata interoperable, which is one of the FAIR data principles.

To recap: during the research process, metadata can be created in different ways and appear in multiple forms. An important use of metadata is to make your data findable and let others know how they can access and reuse it. Data repositories provide you with the functionalities to create and manage machine-readable metadata and therefore make your data more FAIR. That is why it is a good idea to familiarize yourself with the kind of metadata that repositories require. During the course of the project, make sure to document this information, so that when the time comes to provide the metadata, you are not relying only on your fading memory.

For more information about metadata and data repositories, have a look at our website.
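
As an illustrative aside to the description of metadata standards above (a set of named elements, some required, some with allowed values or a required format), the idea can be sketched in a few lines of Python. This is only a minimal sketch: the schema, its element names, its rules, and the `validate` function are hypothetical and loosely inspired by generic standards such as Dublin Core. They are not the definition of any real standard or repository form.

```python
import re

# Hypothetical mini-schema: element names and rules are illustrative
# assumptions, not taken from Dublin Core, DataCite, or any real standard.
SCHEMA = {
    "title":   {"required": True},
    "creator": {"required": True},
    "date":    {"required": True, "pattern": r"\d{4}-\d{2}-\d{2}"},
    "license": {"required": False, "allowed": {"CC0-1.0", "CC-BY-4.0"}},
    "subject": {"required": False},
}


def validate(record):
    """Return a list of problems found in a record; an empty list means valid."""
    problems = []
    for name, rule in SCHEMA.items():
        if name not in record:
            # Only required elements are reported when absent.
            if rule["required"]:
                problems.append(f"missing required element: {name}")
            continue
        value = record[name]
        pattern = rule.get("pattern")
        if pattern and not re.fullmatch(pattern, value):
            problems.append(f"bad format for {name}: {value!r}")
        allowed = rule.get("allowed")
        if allowed and value not in allowed:
            problems.append(f"value not allowed for {name}: {value!r}")
    # Elements the schema does not define cannot be interpreted by a machine.
    for name in record:
        if name not in SCHEMA:
            problems.append(f"unknown element: {name}")
    return problems


good = {"title": "Soil samples 2021", "creator": "A. Researcher",
        "date": "2021-06-30", "license": "CC-BY-4.0"}
bad = {"title": "Soil samples 2021", "date": "30/06/2021", "licence": "free"}

print(validate(good))  # no problems expected
print(validate(bad))   # missing creator, badly formatted date, unknown element
```

The second record resembles what a custom-made approach can produce: a free-form date and a misspelled element name are easy for a person to understand but are rejected by a machine that expects the schema.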