Data description (documentation)

Detailed description and registration of research data are prerequisites for their discoverability as well as for their interpretation and evaluation. GESIS follows the FAIR criteria (Findability, Accessibility, Interoperability, and Reusability).

A key part of archiving social science studies is describing them in detail. GESIS refers to the package of data and data-describing documents, such as the questionnaire or a method description, as a study. The study description is created using a structured and standardized (metadata) schema that covers content-related, methodological, and formal elements.

Mandatory elements of the description are: 

  • a study number,
  • the title of the study,
  • the names of the data depositors or primary researchers,
  • the access category, and
  • a Digital Object Identifier (DOI), which is assigned by GESIS.
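
As a minimal sketch, a check for these mandatory elements could look like the following. The field names (`study_number`, `title`, `depositors`, `access_category`, `doi`) and the sample values are hypothetical and are not taken from the GESIS schema or from DDI.

```python
# Hypothetical field names; the actual GESIS/DDI schema may name them differently.
MANDATORY_FIELDS = ("study_number", "title", "depositors", "access_category", "doi")


def missing_mandatory_fields(description: dict) -> list[str]:
    """Return the mandatory fields that are absent or empty in a study description."""
    return [name for name in MANDATORY_FIELDS if not description.get(name)]


# Placeholder values for illustration only.
example = {
    "study_number": "ZA0000",
    "title": "Example Study",
    "depositors": ["Jane Doe"],
    "access_category": "A",
}

print(missing_mandatory_fields(example))  # ['doi']
```

In this sketch the DOI is reported as missing because it has not been assigned yet; according to the list above, that assignment is done by GESIS rather than by the depositor.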

The study description is divided into different areas: 

  • The bibliographic information captures the mandatory fields mentioned above, as well as other survey, version, and citation information. 
  • The content description includes the summary of the main survey contents and a detailed description of the survey instrument at question or variable level. Further information on the content can be provided. 
  • Methodological information includes the study area, population, selection, survey procedure and period, and data collection. 
  • If publications are available that are directly related to the study, they are also listed. 

Data are provided in various technical formats (usually SPSS and Stata), together with the associated documents, for download or by order.

All metadata used to publish data are processed according to the specifications of the Data Documentation Initiative (DDI). Both the description of data for our catalog and the documentation of data at variable level thus comply with widely used international standards.

DDI is an international standard for describing data from the social, behavioral, economic, and health sciences. It is a free standard that can be used to document and manage various stages in the lifecycle of research data, such as conceptualization, collection, processing, distribution, discovery, and archiving. 

  • Archiving BASIS: SowiDataNet / datorium offers a DDI-compliant online input form.
  • Archiving PLUS / PREMIUM: The data can be compiled using the DDI-compliant DBKForm tool.

Every dataset published via GESIS automatically receives a persistent identifier in the form of a Digital Object Identifier (DOI). The DOI name uniquely identifies the data and makes them easier to cite. As part of a URL, it forms a link to the corresponding study description at GESIS. The DOI is assigned via the registration agency da|ra.

da|ra is the registration service for social and economic data in Germany. It is operated by GESIS in cooperation with DataCite and the ZBW - Leibniz Information Center for Economics. The DOI is part of the data description.
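
To illustrate how a DOI name becomes part of a URL, here is a minimal sketch that prefixes the DOI name with the public resolver address `https://doi.org/`. The function name `doi_to_url` and the DOI name in the example are made up for illustration and do not refer to a real GESIS study.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_to_url(doi_name: str) -> str:
    """Turn a DOI name into a resolvable URL."""
    # Keep "/" as is; percent-encode characters that are not URL-safe.
    return DOI_RESOLVER + quote(doi_name, safe="/")


# Made-up DOI name, for illustration only.
print(doi_to_url("10.1234/example.5678"))  # https://doi.org/10.1234/example.5678
```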

Part of the registration of research data is versioning, i.e. keeping track of changes to the data.

The research data archived at GESIS change over time. Error corrections, additions, or other processes modify the data. With each change, a new version of the data is created.

A new persistent identifier (DOI name) is assigned for each new version. Together, DOI and version identifier enable unambiguous and error-free referencing or citation of the data. Among other things, this leads to a significantly improved discoverability of these data. 

Changes are documented on three levels, Major.Minor.Revision (e.g. 2.1.0):

 1st position – Major

  • Adding one or more new samples (usually countries) to an integrated or cumulative data set
  • Adding one or more new waves to a cumulative data set
  • Adding (deleting) one or more variables to (from) a data set
  • Adding (deleting) one or more cases to (from) a data set
  • Change in quality due to preparation for a higher data state class (usually class 1)

 2nd position – Minor

  • Change to a variable, i.e. meaning-relevant corrections or additions in the data set (labels, recoding, data formats ...)

 3rd position – Revision

  • Non-significant corrections (e.g. fixing spelling mistakes)
  • Simple revision of labels without relevance to meaning

Example:

Starting from an existing data set with version 1.2.3, a spelling error is corrected (→ 1.2.4), a variable is recoded (→ 1.3.0), and a variable is added (→ 2.0.0). If all three changes are included in the new version, the version number 2.0.0 is assigned. If only the first two changes are made, version 1.3.0 is generated.
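
The rule behind this example can be sketched in a few lines: the most significant change in a release decides which position is incremented, and all positions to its right are reset to zero. This is an illustrative sketch, not GESIS tooling; the names `Change` and `next_version` are hypothetical.

```python
from enum import IntEnum


class Change(IntEnum):
    """Change levels, ordered from least to most significant."""
    REVISION = 1  # e.g. correcting a spelling mistake
    MINOR = 2     # e.g. recoding a variable
    MAJOR = 3     # e.g. adding a variable


def next_version(current: str, changes: list[Change]) -> str:
    """Return the version number after applying all changes in one release."""
    major, minor, revision = (int(part) for part in current.split("."))
    if not changes:
        return current  # nothing changed, so no new version
    level = max(changes)  # the most significant change decides
    if level == Change.MAJOR:
        return f"{major + 1}.0.0"
    if level == Change.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{revision + 1}"


print(next_version("1.2.3", [Change.REVISION]))                              # 1.2.4
print(next_version("1.2.3", [Change.REVISION, Change.MINOR]))                # 1.3.0
print(next_version("1.2.3", [Change.REVISION, Change.MINOR, Change.MAJOR]))  # 2.0.0
```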