Archiving and anonymisation in international research projects – EduMAP project leading the way

This article was published in Finnish in the latest issue of the FSD Bulletin (1/2019).

The EduMAP project studies the role of adult education for vulnerable young adults in a variety of European countries. From the extensive interview data collected in the project, the Finnish Social Science Data Archive (FSD) holds interviews with educational personnel and policy-makers in ten languages. To ensure the anonymity of interviewees, the researchers familiarised themselves with anonymisation techniques with the help of FSD.

The EduMAP project, which was carried out in the European Union and in Turkey, received its funding from the EU’s Horizon 2020 Research and Innovation Programme, which had requirements concerning the archiving of data. As with all larger research programmes nowadays, openness of research data and publications was a guiding principle for EduMAP as well.

“Basically, everything that is done with the funding is open, which is great. Granted, our project was not the most ideal in terms of sharing the data, because the interviews were by nature quite sensitive,” researcher Jaakko Hyytiä says.

In the end, a sample of the interviews conducted with educational personnel and policy-makers was selected to be archived at FSD. The interviews discuss the contents of study programmes provided by the organisations, pedagogical approaches, life management, and active participatory citizenship. The interviewees were guaranteed complete anonymity, and the researchers had to educate themselves on how to implement anonymisation.

Researchers Jaakko Hyytiä and Paula Kuusipalo think that archiving research data makes it easier for researchers to plan their academic careers, as the data that they collect remain usable after the research project ends.

The third group interviewed for the research project consisted of vulnerable young adults aged 16–30, including immigrants, for instance. These interviews were not archived because ensuring anonymity would have been arduous and the researchers wanted to protect the vulnerable young participants.

Archiving increases reliability and benefits the researcher

According to researcher Paula Kuusipalo, the research funder is to be thanked for preparations in the project that facilitated archiving and open access.

“The project is included in the Open Research Data Pilot (ORDP), which required drafting a data management plan as well as archiving the data for reuse. So we have had to describe archiving procedures at many stages, in which the data management planning tool DMPTuuli and FSD’s Data Management Guidelines were of great help.”

“The whole archiving process requires a surprising amount of work,” Kuusipalo says. “We probably would have been able to work smarter if we had known all the details related to archiving when we started out.”

Hyytiä and Kuusipalo say that the archiving requirement has prompted the research group to think about the advantages of open access to data.

“It’s quite obvious that archiving adds to the reliability of research. Put bluntly, one could question the results of the research if the data were not open and nobody could verify the conclusions that were reached.”

Hyytiä emphasises that archiving data benefits the researchers in the project as well.

“In an ideal situation, the data, which the researcher spent a lot of effort and time collecting, remain usable after the project ends. This makes it easier, for instance, to plan one’s own academic career. In general, these kinds of projects suffer from the tragedy of spending massive amounts of resources on collecting and bringing the data together. And if the data are not archived, reuse is impossible after the funding and the researcher’s work end. Luckily, in the EduMAP project at least part of the data was archived.”

Support and guidance for anonymisation

The General Data Protection Regulation of the European Union, which became applicable in May 2018, brought stricter rules for processing personal data, which increases the importance of anonymising research data. To assist researchers working on the project around Europe in archiving the EduMAP data, FSD organised a webinar on anonymisation of qualitative data.

“We noticed that people talk in very different ways. Some discuss things on such a general level that only their name and location have to be anonymised. Others have a way of talking where they throw names around and tell everything about their colleagues and their colleagues’ colleagues.”

“It’s challenging!” Hyytiä sighs. But not all anonymisation measures were that difficult in his experience.

“The easiest thing is to remove direct identifiers and obvious indirect identifiers, such as where someone lives or works and so on. The VALMA preparatory vocational training programme was one of the subjects discussed in the Finnish interviews, and since the programme is provided pretty much everywhere in Finland, it was easy to anonymise.”

“But we also had situations where an individual could possibly be identified because the target population was so small. And when you consider the people working with this particular programme, in this case only five people, it starts to become problematic,” Hyytiä says.

FSD’s guidelines instructed the researchers in these types of situations. When the group is small, researchers need to think of anonymisation techniques that would enlarge the possible group of individuals. In this dataset, for example, the names of institutions were anonymised so that the interviewees could be the personnel of any vocational education institution in Finland.
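The idea of enlarging the possible group can be illustrated with a minimal Python sketch. It is not the procedure the EduMAP researchers or FSD used; the institution, town and person names, the bracketed category labels and the function name `generalise` are all hypothetical and chosen only for illustration. Each identifying string in a transcript is replaced with a generic category label, so that the speaker could belong to any institution of that type.

```python
import re

# Hypothetical mapping from identifying strings to generic category labels.
# In practice the list would be built by reading the transcripts carefully.
REPLACEMENTS = {
    "Northtown Vocational College": "[vocational education institution]",
    "Northtown": "[town]",
    "Anna Example": "[colleague]",
}


def generalise(text, replacements):
    """Replace each identifying string in text with its generic label."""
    if not replacements:
        return text
    # Try longer strings first so that "Northtown Vocational College"
    # is not partly replaced by the shorter entry "Northtown".
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: replacements[match.group(0)], text)


sample = "I teach at Northtown Vocational College in Northtown with Anna Example."
print(generalise(sample, REPLACEMENTS))
```

A replacement list like this only catches the strings someone has already spotted, so it supports a careful manual reading of each transcript and does not replace it.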

When assessing the anonymity of research participants, one should also consider what information concerning them is available in other sources, especially on the Internet. Organisations’ websites, employment and education history on LinkedIn and public social media profiles may reveal detailed information about people. Possible connections to this information in the data must be erased in anonymisation.

Learning by doing

The researchers found that processing the interview transcriptions for archiving got easier as the work progressed.

“In short, I’d say that many researchers had a moment of clarity where they noticed that anonymisation comes quite naturally. Trial and error teaches one to do things in a more organised way,” Hyytiä says.

As a researcher begins processing a large interview transcription, different types of support tools are necessary for an efficient process.

“Anonymisation tips and checklists are great because anonymisation coincides with other work-intensive stages in the research, namely transcribing interviews and drafting publication manuscripts,” Paula Kuusipalo says.

Researchers working on the EduMAP project all over Europe received help for anonymisation from FSD’s Data Management Guidelines, which are also available in English.

Text and photo: Eija Savolainen

EduMAP research and innovation project

Adult Education as a Means to Active Participatory Citizenship

  • Conducted by: Tampere University, Faculty of Education and Culture
  • Duration: 1 February 2016 – 31 January 2019
  • Funding: EU Commission’s Horizon 2020 Research and Innovation Programme / project 693388
  • More information:

A representative sample of the interviews collected in the EduMAP project has been archived at FSD for reuse. The archived interviews were collected in the respective languages of ten European countries: Finland, the United Kingdom, Turkey, Spain, Romania, Latvia, Hungary, Greece, Germany, and Estonia.