Why Are Research Data Managed and Reused?
Research data management means creating, preserving and organising data and related metadata so that the data remain accessible and reliable, and so that data protection and security are maintained over the whole data life cycle.
Reuse means using data that were originally collected by someone else for some other purpose. Reusers of data often use different research methods than the original data creators. Data reuse can also be called secondary analysis.
Conducting research according to good scientific practice is fundamental to the reliability of research. Research data are an essential part of the research project and process.
According to the guidelines published by the Finnish Advisory Board on Research Integrity (TENK), responsible conduct of research entails that data acquisition, as well as research and evaluation methods used, are in line with scientific research criteria and are ethically sound. Data created from research should also be stored according to the standards established for scientific knowledge.
The Research Council of Finland requires that grant applicants describe their data collection methods and data management plans in their research plan. The data management plan describes how the research project collects and uses data, and how the data are stored and made available to other researchers after the research has been completed.
Other major research funders that require grant applicants to include a data management plan in funding applications include Koneen Säätiö, the Finnish Work Environment Fund and the Finnish Foundation for Alcohol Studies. The Finnish Funding Agency for Innovation (Tekes) requires that data created from research it funds be managed in a way that allows efficient use of the data in the future. The Horizon 2020 guidelines by the European Commission make Open Research Data the default and emphasise producing FAIR data, that is, data that are Findable, Accessible, Interoperable and Re-usable.
Archiving research data constitutes academic credit for researchers. The Research Council of Finland, the Finnish Advisory Board on Research Integrity, and Universities Finland (UNIFI) have jointly drafted a template for a researcher's curriculum vitae. In the template, the production and distribution of research data are included as a credit in the section on the scientific and societal impact of research. Some international science publishers have policies under which access to data is a mandatory condition of publication.
Citations to archived data also constitute significant potential credit. Research publications whose underlying data are archived receive more citations than publications based on data that are not shared. This is because researchers who use archived data for new research often cite not only the research data itself but also the publications of the original data creators.
It is crucial for the verification of research findings and the progress of research in general that the data on which publications are based are made accessible to the scientific community. Open access to data and data sharing promote innovation in research and collaboration between researchers. They also increase the visibility of research organisations and researchers, and enable efficient use of data created in different fields of science. OECD guidelines state that open access to publicly funded research data is an important condition for enhancing international research collaboration.
Archived data enable comparison over time and the study of a great variety of research problems. Data sharing can also encourage the improvement of research methods. Developing methods, software and technology, for their part, offer new opportunities for research. Archived data also form an important resource for teaching and learning.
Reusing data is economical and saves resources. If suitable data are readily available, there is less need to spend time and money on collecting new material. Data from large surveys often include material that has not been analysed in the original research. Data reuse helps to avoid duplication of data collection. It can also minimise the burden of data collection on hard-to-reach or vulnerable groups.
Valuable research data are of no use to the scientific community and future research if the original data creators are the only people who have any information on the data. If they move to other organisations or other tasks, or retire, that information will be lost.
Research data usually have a longer timespan than the research project creating them. The project ends when the funding ends, but data created from it can be used long afterwards.
The image portrays the different stages of the research data life cycle. After the research has been completed, the research team deposits the data in an archive for sharing purposes. The data are processed and documented both during research and in the archiving stage. Duplicated effort and information loss will be avoided if the different stages of the data life cycle are planned well before data collection.
For instance, it is easiest and most cost-effective to produce the necessary metadata during research, at the relevant stages of the life cycle. Tracking down and completing all the metadata needed for reuse years after data collection may not even be possible. Even when it is possible, the costs may be many times higher than those of producing the metadata in the early stages of the cycle.
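As a rough illustration of producing metadata during research rather than afterwards, the following Python sketch writes a small metadata file next to each data file at the time of collection and refuses to save an incomplete record. The field names in REQUIRED_FIELDS, the function names and the ".meta.json" naming convention are assumptions made for this example only; they are not taken from any archive's or funder's requirements, and a real project would follow the metadata standard of its chosen archive.

```python
import json
from pathlib import Path

# Hypothetical minimal field set, for illustration only.
REQUIRED_FIELDS = ("title", "creator", "collection_date", "method", "variables")


def missing_fields(metadata: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write the metadata as JSON next to the data file.

    Raises ValueError if any required field is missing, so that gaps are
    noticed while the people who know the answers are still on the project.
    """
    missing = missing_fields(metadata)
    if missing:
        raise ValueError(f"metadata incomplete, missing: {missing}")
    # e.g. survey.csv -> survey.csv.meta.json (naming is an assumption)
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar
```

Calling write_sidecar(Path("survey.csv"), record) with a complete record would create survey.csv.meta.json in the same folder. The point of the check is simply that a missing description is flagged at collection time, when it is still cheap to fill in.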