Arja Kuula and Jani Hautamäki

Software to Facilitate Textual Data Management

FSD began archiving qualitative data in 2003. Over the years, our IT staff designed bits of software to aid qualitative data processing. The ongoing FSD Upgrade project has made software development much more systematic. One result of the project is a new software tool that facilitates textual data management.

From study descriptions to unit descriptions

In the first stage of archiving qualitative data, the data archive concentrated on producing high-quality metadata at the study level, describing the time of data collection, the data content and the collection methods. Files were converted to RTF or TXT format and a file index was produced. Next, to meet the demands of researchers, the archive started to improve unit-level information. This is typically background information concerning a single interview, piece of writing or focus group session: the collection time and location of a particular interview, background information on a particular research participant, specific content information about the unit (e.g. if the data consist of stories about recovering from a divorce, the year of the divorce), and so on.
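For illustration only, unit-level information of this kind can be thought of as a simple structured record per data unit. The sketch below is written in Python; the field names and values are hypothetical and do not reflect FSD's actual metadata structure.

    from dataclasses import dataclass

    @dataclass
    class UnitDescription:
        # Illustrative fields only; not FSD's actual metadata schema.
        file_name: str        # data file the description refers to
        collected: str        # time of data collection, e.g. "2004-03"
        location: str         # place where the interview took place
        participant_age: int  # background information on the participant
        present: str          # who were present during the interview

    example = UnitDescription(
        file_name="interview01.txt",
        collected="2004-03",
        location="Tampere",
        participant_age=12,
        present="researcher, child, parent",
    )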

Information at the unit or individual level is particularly important for secondary users of data, who often use archived data for different purposes, and in different ways, than the original researcher(s). For instance, the original study may have concentrated on the impact of mobile phone use on the everyday lives of young people and children. A secondary user may be studying how the age of an interviewee affects interaction and thus be interested in only a small part of the material, namely the interviews where, in addition to the researcher, both a child and a parent are present.

Tools for managing research data

The archive has now developed tools that considerably improve the management of textual data. A single textual dataset may contain dozens or even hundreds of separate files. To facilitate browsing the data and choosing the units to analyse, the archive created an automatic data processing procedure which transforms textual data into an HTML version. The process analyses unstructured information and transforms it into a structured form.
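The details of the archive's processing procedure are not described here, so the following is only a minimal sketch of the general idea, written in Python and assuming that each text file begins with simple "Key: value" header lines. The directory names and helper functions are hypothetical.

    import html
    from pathlib import Path

    def parse_unit(path: Path) -> dict:
        """Read the leading 'Key: value' lines of one data file into a dictionary."""
        record = {"file": path.name}
        for line in path.read_text(encoding="utf-8").splitlines():
            if ":" not in line:
                break  # header block ends where the transcript itself begins
            key, value = line.split(":", 1)
            record[key.strip().lower()] = value.strip()
        return record

    def to_html(record: dict) -> str:
        """Render one structured record as a small HTML table."""
        rows = "".join(
            f"<tr><th>{html.escape(k)}</th><td>{html.escape(v)}</td></tr>"
            for k, v in record.items()
        )
        return f"<table>{rows}</table>"

    # Process every text file in a (hypothetical) data directory.
    Path("html").mkdir(exist_ok=True)
    units = [parse_unit(p) for p in sorted(Path("data").glob("*.txt"))]
    for unit in units:
        out = Path("html", unit["file"]).with_suffix(".html")
        out.write_text(to_html(unit), encoding="utf-8")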

An HTML index page helps visualise the relevant parts of each dataset. On the index page (see FSD2506, FSD2727), one can sort the units by clicking on a field title and thus, for instance, easily see the number of participants with a particular background characteristic. The chosen unit can also be read in HTML format.
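Continuing the sketch above, an index page could be built from the structured records along these lines. How the archive's actual pages implement sorting on click (for example with a client-side script) is not specified here, so the hypothetical function below simply produces a table pre-sorted by one chosen field.

    import html

    def index_page(units: list[dict], sort_by: str = "file") -> str:
        """Build an HTML index table of all units, pre-sorted by one field."""
        fields = sorted({key for unit in units for key in unit})
        header = "".join(f"<th>{html.escape(f)}</th>" for f in fields)
        body = ""
        for unit in sorted(units, key=lambda u: u.get(sort_by, "")):
            cells = "".join(
                f"<td>{html.escape(unit.get(f, ''))}</td>" for f in fields
            )
            body += f"<tr>{cells}</tr>"
        return f"<table>\n<tr>{header}</tr>\n{body}\n</table>"

In use, the resulting string would simply be written to an index.html file alongside the per-unit HTML files generated earlier.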

The same software can also be used to help researchers manage their data during the research project. If researchers commit themselves to archiving their data at FSD after the research has been completed, the archive will process their textual data into HTML files already during the research stage. This is a definite help to researchers in studies where large amounts of data are collected. Managing the numerous files and indexes is challenging and often very time-consuming, and the service provided by the archive helps to save valuable time.

More information: