CESSDA - The Consortium of European Social Science Data Archives

The Consortium of European Social Science Data Archives (CESSDA ERIC) provides large-scale, integrated and sustainable data services to the social sciences. It brings together social science data archives across Europe, with the aim of promoting the results of social science research and supporting national and international research and cooperation.

Having evolved from a network of European Service Providers into a legal entity and large-scale infrastructure under the auspices of the European Strategy Forum on Research Infrastructures (ESFRI), CESSDA became an ERIC (European Research Infrastructure Consortium) in 2017.

CESSDA is owned and financed by the individual member states’ ministries of research or a delegated institution. Its main office is located in Bergen (Norway).

Aims

CESSDA’s vision is to be a key player in the social sciences domain, providing a trusted platform for researchers with tools and services to curate, publish and re-use research data. CESSDA has endorsed the principles of the European Open Science Cloud Declaration and has committed to actively support the implementation of FAIR data.

The CESSDA Strategy builds on four pillars: technology, training, project management and trust. Technology ensures a stable and up-to-date backbone for our products and services, for single sign-on and for other tools that should make data deposit and data use easier. Training focuses on ‘train the trainers’ and new training modes (webinars, MOOCs) for outreach to new users, explaining data privacy issues, etc. Project Management serves to jointly plan activities and budgets, to coordinate efforts in writing and submitting project proposals, and to monitor, support, conduct and document the administrative and financial aspects of ongoing projects within CESSDA, so as to ensure the timely delivery of project outputs. Trust refers to the position of the CESSDA Service Providers, e.g. as ‘trusted repositories’, to ensure quality of data, safe and secure access, etc. For each of these pillars there are CESSDA Working Groups, and within these groups there are CESSDA projects and regular meetings for alignment.

For further information, visit the CESSDA website.

The role of GESIS:

CESSDA Training

CESSDA Training is an essential part of the permanent tasks of GESIS, responsible for supporting continuous learning and training of Service Provider staff and the social science user community in data curation, data management and the discovery and handling of data. In 2017, CESSDA Training held five workshops on data management and data sharing for researchers in the social sciences. CESSDA Training actively contributed to the "CESSDA Training Working Group", including the newly-launched CESSDA Expert Tour Guide on Data Management, as well as webinars and workshops on the topic of data discovery.

Furthermore, CESSDA Training led the development and implementation of the CESSDA Knowledge Platform as part of the CESSDA SaW project (see below).

CESSDA PID

CESSDA PID is responsible for developing a common approach to the use of persistent identifiers (PIDs) within CESSDA ERIC and for supporting the CESSDA Service Providers in assigning DOI names to their data holdings. GESIS leads the CESSDA PID Task Force, which consists of GESIS, DANS and SND. In 2017, the CESSDA PID Task Force developed the CESSDA ERIC Persistent Identifier Policy, which was approved by the General Assembly on 22 November 2017. The Policy framework covers the general principles for the use of persistent identifiers across CESSDA Service Providers and contains Best Practice Guidelines. The CESSDA PID Task Force held a webinar on 29 June 2017 to introduce the Policy framework to the Service Providers and collect feedback.

GESIS supports CESSDA Service Providers in registering DOI names via the free registration service da|ra. In 2017, the following CESSDA ERIC members used the service: GESIS (9171 DOIs), CSDA (434 DOIs), CESSDA ERIC (42 DOIs), AUSSDA (2 DOIs) and FORS (2 DOIs). ADP and ISSDA used the da|ra advisory service, including the test environment. The CESSDA ERIC partners in Croatia and Serbia are using the da|ra service as well.

Furthermore, as the head of the CESSDA PID Task Force, GESIS was one of the CESSDA representatives in the Group of European Data Experts in RDA (GEDE-RDA) and was involved in the RDA Europe Adoption project “Persistent Identifier Types for the Social Sciences” (PITTS), which was led by DANS. Representatives of GESIS took part in a dedicated workshop in Den Haag on 29-30 May 2017.

Harmonization

Creating and maintaining digital tools to facilitate social science variable harmonization and its documentation is a permanent task GESIS provides for CESSDA ERIC. It includes planning, designing, developing and maintaining all tools, software support and web activities connected to documenting and publishing variable recoding and harmonization.

The CharmStats (Coding and Harmonization of Statistics) team developed two versions of its free and open source software. Smaller or simple harmonization work can be done in QuickCharmStats, while team-based work with multiple users and projects to manage can be handled with CharmStats Pro. In 2016 the team published a peer-reviewed article on standardizing the documentation and citation of harmonization work, informed by the CharmStats workflow system.

To further the goal of supporting digital variable harmonization tools, the CharmStats team was part of the successful SERISS bid (Synergies for Europe's Research Infrastructures in the Social Sciences). With EU funding, the team now oversees the construction of an online library where harmonization work (based on the CharmStats data model and our documentation standards) can be submitted, reviewed, accepted, assigned a permanent identifier - in cooperation with CESSDA PID - and published on an easily accessible website. In 2018 the CharmStats team will develop video tutorials and other training materials in consultation with CESSDA Training.

Technical notes: CharmStats products are Java-based desktop applications that use a MySQL database to store persistent data and support DDI (import and export) as well as SPSS, Stata, SAS and Mplus (automated code generation for researcher use). New versions of QuickCharmStats and CharmStats Pro are scheduled for release in 2018. The Online Library of Harmonizations (scheduled to go live in 2019) is a web application that uses DSpace; its metadata handling is guided by the CharmStats data model. The harmonization library’s digital infrastructure will connect with da|ra to assign DOIs to accepted harmonization documents.
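To make the idea of documented recoding with automated code generation more concrete, the following is a minimal sketch in Python. It is not CharmStats code and does not reflect the CharmStats data model: the mapping EDU_MAP, the variable names educ and educ_h, and both helper functions are hypothetical and chosen only for illustration. The sketch applies a source-to-target code mapping to a list of values and emits an equivalent Stata recode command.

```python
from collections import defaultdict

# Hypothetical mapping from source codes to harmonized target codes
# (illustrative only; not taken from CharmStats or any real study).
EDU_MAP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def harmonize(values, mapping, missing=None):
    """Recode each source value; unmapped values become `missing`."""
    return [mapping.get(v, missing) for v in values]


def stata_recode(source_var, target_var, mapping):
    """Build a Stata `recode` command that reproduces the mapping."""
    # Group source codes by the target code they are mapped to.
    groups = defaultdict(list)
    for src, tgt in sorted(mapping.items()):
        groups[tgt].append(src)
    rules = " ".join(
        "({} = {})".format(" ".join(str(s) for s in srcs), tgt)
        for tgt, srcs in sorted(groups.items())
    )
    return f"recode {source_var} {rules}, generate({target_var})"


if __name__ == "__main__":
    print(harmonize([1, 3, 5, 9], EDU_MAP))
    print(stata_recode("educ", "educ_h", EDU_MAP))
```

Keeping the mapping as data, separate from the code that applies it, is what allows the same harmonization to be documented once and rendered for several statistical packages.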

Euro Question Bank 2018

The vision of the Euro Question Bank (EQB) project is that researchers in the social sciences can search survey questions from different datasets, in different languages and from all CESSDA archives, through a single portal provided by CESSDA. In 2018, the EQB project will continue the work of previous projects and install an improved prototype application. The prototype will be based on the CESSDA Metadata Management data model, which in turn is based on DDI-Lifecycle, to promote consistent use of metadata elements among CESSDA Service Providers. Partners in EQB 2018 are DANS, DDA, FSD, FORS, NSD, SND, TARKI, UKDA, and GESIS (lead).

CESSDA Metadata Management

CMM2 is a continuation of the CESSDA Metadata Management Project 1 (CMM1), which started in November 2015 and ended in April 2017. The aim of CMM is to develop a common CESSDA metadata schema, to facilitate the exchange of data between archives and to support other CESSDA projects. CMM1 provided a basic solution that allows prioritized services to be built. The Portfolio consists of a Core Metadata Model and Controlled Vocabularies. CMM Phase 2 will develop an advanced Portfolio Version 2 that will meet the majority of needs and enable the implementation of innovative functionality. Phase 2 will make it possible to align optimally with other CESSDA initiatives such as the Product and Service Catalogue, OSMH and the Euro Question Bank, as well as with the DDI CVG and the DDI Moving Forward project. The CMM2 project has eight partners: FSD (lead), SOHDA, DDA, GESIS, NSD, ADP, SND and UKDS.

GESIS leads (and led within CMM1) the task on the Metadata Core Model. Core Metadata Model no. 1 is a basic solution that allows prioritized services to be built. It reflects the needs of both Service Providers and researchers. The Metadata Model supports the description of data from the social sciences (especially survey-type quantitative data), but also from other disciplines, e.g. the humanities and health sciences. For the CMM Portfolio we are using the DDI Lifecycle standard, which we chose for several reasons. Most of the CESSDA archives already use it, and it serves the objective of interoperability. The use of DDI profiles and the consideration of other standards and schemas (such as Dublin Core, DataCite, da|ra, ISO) will also help with interoperability.

The CMM2 Metadata Core Model is an extended and adapted version of the CMM1 Metadata Core Model. For CMM2 we took into account further relevant metadata models, such as PREMIS, METS, DDILimDAS and DataCite. The extended Core Metadata Model will meet further user needs (identified in Phase 1): needs identified by CMM but not included in Portfolio Version 1, needs that emerged after Version 1 was completed, and needs of other CESSDA projects related to CMM. These are especially PaSC and EQB, but also the CV Manager, Vocabulary Services and the DataverseEU project. The CMM2 Core Model will support semantic web standards. We also place major emphasis on including metadata elements concerning long-term preservation.

CESSDA Pathfinder Project – European Remote Access Network

The goal of this initiative is to enhance and extend the infrastructure for secure remote access to research data. Although the technical system poses challenges, such as finding agreed and trusted solutions, the focus of this project is social infrastructure. This ranges from negotiating agreements for data access and use across national and institutional boundaries to training secure data professionals in European data protection law and in disclosure risk assessment and mitigation. This infrastructure for remote access realizes FAIR principles by removing barriers to access for data that cannot be made “open” because of disclosure risk. The representatives of this pathfinder project are from GESIS, UKDA, DANS, FORS and ADP. The project provides content for a sub-module of a broader CESSDA bid (InfraEOSC-04) to the European Open Science Cloud.

CESSDA SaW (H2020 funds 2015-2017)

GESIS staff were active in different tasks, both as task leaders and as partners. Among the outputs are the CESSDA Knowledge Platform (a repository for sharing CESSDA publications and electronic resources), several webinars and online tutorials on data curation and long-term preservation, and reports such as the country report on development potentials of data archive services in Europe (D3.2), the report on DSA certification for CESSDA (D4.4) and the report on the establishment of an international curriculum for professional development in digital data services for the social sciences (D5.3). For further information on the CESSDA SaW project and its output, please visit the project’s website.

Controlled Vocabulary Manager

The CESSDA CV Manager is a tool that will allow the creation, versioning and maintenance of controlled vocabularies (CVs), their translation into all member languages, and access to all CVs. The tool makes vocabulary work quicker, less labour-intensive and less error-prone. The CV Manager is developed by UKDS (lead), FSD, SND and GESIS, which leads the technical development. A first development prototype is available in the CESSDA cloud via http://cv-dev.cessda.eu/cvmanager

Dataverse EU

The DataverseEU project will offer a data repository service for data archives with limited technical resources. DataverseEU will make use of the Dataverse software, developed as open source software by Harvard. This service can be used in various ways. GESIS, along with DANS, ADP and SND, will adopt and extend Dataverse. GESIS will add a generic module to enhance the DOI registration process with DataCite and da|ra. Further adaptations, such as multilinguality of the user interface, harvesting of the CESSDA portal and integration with the CESSDA CV Manager, will also be made.

VOICE - Vocabulary Services Multilingual Content Management

VOICE aims to manage and coordinate the cooperative development and use of multilingual terminology, such as thesauri, topic classifications and controlled vocabularies, as needed in CESSDA products and services. UKDS has the lead in this project, with GESIS and FSD as lead partners. In Phase 1, the project partners will focus their work on the ELSST Thesaurus, which has so far been managed and hosted by UKDS. GESIS has contributed to the thesaurus in the past and will continue to do so in this project. GESIS contributes to VOICE 2018 by improving and extending the thesaurus and by developing a best practice guide for translators.