The Research Data Connectome will facilitate and accelerate data reuse in the social sciences and humanities. To find out more about current data research practices in these communities, we commissioned the SWITCH Innovation Lab “Repositories & Data Quality”. We asked Nicolai Hauf and Martin Jaekel, both of ZHAW, about their findings and what they could mean for the next steps of building the Research Data Connectome.
SWITCH: What was the aim of this SWITCH Innovation Lab?
Nicolai Hauf: We wanted to analyse the reuse of existing research data in the disciplines of Social Sciences and Humanities (SSH). Our main aim was to identify the relevant data sources for researchers in Switzerland. For this reason, we developed a survey investigating the location of existing valuable data, the purpose for reusing this material and the selection criteria researchers use when deciding on data for their own use.
SWITCH: Which data providers are the most frequently named sources?
Nicolai Hauf: Many participants named FORSbase, the digital research information and data access portal for social science studies in Switzerland. The Federal Statistical Office (FSO) is another very important source for data on many different research topics. Other important mentions are the European Social Survey (ESS) and the GESIS data archive. The wide variety of small data providers that were mentioned is also interesting.
SWITCH: Why is reusability of research data in SSH relevant for future research?
Nicolai Hauf: The availability of reusable data can support future research in many ways. Firstly, existing data can be included in new research, saving the resources that would otherwise be spent collecting the same data a second time. Insights into data sets can also inspire new research questions and the adaptation of research methods. Last but certainly not least, reusability of data enables the replication of studies and results.
SWITCH: What are key criteria for selecting data for reuse?
Nicolai Hauf: The main aspect is trust in the data source, but also in the data itself and the authors. Data providers must secure the researchers' trust in their services and in the quality of the data they make available. Furthermore, a detailed description of the data, provided with additional materials such as documentation and methodologies, is a key criterion for understanding and assessing the data. The availability of both raw data and clean data is very important as well.
SWITCH: In your findings, what are the biggest challenges SSH researchers face when choosing a suitable data source for reuse?
Nicolai Hauf: Data must be easily findable and available in the first place. Sometimes the desired data is not obtainable at all, or not to the extent, detail or quality required to enable reuse for other purposes. Access must not entail heavy bureaucratic burdens, and the actual reuse should be allowed by permissive licenses. Qualitative research is often highly context-sensitive. Legal requirements oblige researchers to anonymize data prior to publication. Crucial contextual information is then lost and reuse is no longer feasible.
SWITCH: Martin, based on this research but also drawing on your own perspective as a researcher, how do you envision the Research Data Connectome solving some of these issues?
Martin Jaekel: The Research Data Connectome can promote the extension of good practices from one discipline to another. This will lead to better quality data in general. In this process, SSH disciplines will also consider joint interests, data descriptions, infrastructures, services etc., which will provide a basis for the integration of datasets from different sources. Hence, the Research Data Connectome provides a useful framework to consider future linked data, based on high-quality data and on support from researchers of different disciplines.
SWITCH: Based on your results, which actions do you propose as next steps to build client-oriented services in the Research Data Connectome ecosystem?
Nicolai Hauf: It is of utmost importance to connect to your target audience and their research practice. Hence, as a next step, these relevant data sources should be evaluated for the Connectome based on specific use cases.
Scientists across disciplines generate increasing amounts of valuable data as part of their daily research activities. Being able to reuse or even combine such scientific data opens the door to many exciting possibilities. Until now, research data has been collected in domain or institutional silos and could not be easily connected.
The Research Data Connectome connects and organises (open) scientific (meta)data sustainably across disciplines to make it widely accessible, interoperable and valuable. Building a Connectome prototype is a joint effort by DaSCH, FORS, EPFL Blue Brain, eXascale Infolab, SATW, SAGW and SWITCH.
To pursue a user-centred deployment of the ecosystem, we're setting up focus groups dedicated to the humanities and social sciences. The aim is to identify the needs of different researchers and describe additional use cases from specific disciplines to be implemented in our Minimum Viable Ecosystem before the end of the year.