Data ingest plays an important role in long-term preservation. The more precisely the submission of data is planned and carried out, the more simply future preservation measures can be planned and realized. For this reason, as much information on the study as possible is collected and documented in detail. After the study has been submitted, its content is verified and enhanced with further information.
After data are submitted to the archive, a validation is carried out to check the following aspects:
Each study receives a study number and is recorded in the GESIS data catalogue (DBK).
The data are stored in the archive system with further information (archive agreement, correspondence between archive and depositor, etc.).
Two additional standard variables are added (study number, version number and date of version). The following steps depend on the original material, so processing can comprise, for example:
Any alterations made to the data are documented and saved together with the data set.
Data stored in the archive undergo revisions and changes even after their publication. For example, errors discovered later are corrected, or the data are augmented by additional variables or interviews. Assigning version numbers ensures that the data sets used for publications are identifiable together with their study number, allowing for unique referencing and citation.
A persistent identifier (DOI name) assigned to each version also makes the data easier to locate. DOI names link the user directly to the study description in the DBK.
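As a minimal sketch of how a DOI name becomes a link, the snippet below builds a resolver URL through the public doi.org proxy. The function name `doi_url` and the DOI name used in the usage line are hypothetical placeholders, not identifiers taken from the DBK.

```python
from urllib.parse import quote

# Public DOI proxy; a DOI name appended to this prefix resolves to the
# landing page registered for it (here: the study description).
DOI_RESOLVER = "https://doi.org/"


def doi_url(doi_name):
    """Return the resolver URL for a DOI name (hypothetical helper)."""
    # Percent-encode characters that are not URL-safe; keep the slash
    # that separates the DOI prefix from the suffix.
    return DOI_RESOLVER + quote(doi_name.strip(), safe="/")


# Placeholder DOI name, for illustration only.
print(doi_url("10.1234/example.1.0.0"))
```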
Changes are documented on three levels: Major.Minor.Revision (e.g. 2.1.0):
1. Position – Major:
Addition of one or more new waves in a cumulative data set
Addition/deletion of one or more variables in a data set
2. Position – Minor
3. Position – Revision
Changes that do not affect the meaning of a variable (e.g. correction of spelling mistakes)
A spelling mistake in a data set with version 1.2.3 is corrected (→ 1.2.4), a variable is recoded (→ 1.3.0), and a variable is added (→ 2.0.0). If all the changes are made at once, version number 2.0.0 is assigned. If only the first two changes are carried out, version number 1.3.0 is assigned.
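The numbering rule in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the archive's actual tooling: the names `Change` and `next_version` are hypothetical, the classification of recoding as a minor change is taken from the example above, and resetting the lower positions to zero is inferred from the sequence 1.2.4 → 1.3.0 → 2.0.0.

```python
from enum import IntEnum


class Change(IntEnum):
    """Change levels, ordered from least to most significant."""
    REVISION = 1  # e.g. correction of spelling mistakes
    MINOR = 2     # e.g. recoding a variable (per the example above)
    MAJOR = 3     # e.g. adding or deleting a variable or a wave


def next_version(current, changes):
    """Return the version number after releasing `changes` together.

    Only the most significant change decides which position is
    incremented; all positions to its right are reset to zero.
    """
    major, minor, revision = (int(part) for part in current.split("."))
    if not changes:
        return current  # nothing changed, version stays the same
    level = max(changes)
    if level == Change.MAJOR:
        return f"{major + 1}.0.0"
    if level == Change.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{revision + 1}"


# All three changes released at once.
print(next_version("1.2.3", [Change.REVISION, Change.MINOR, Change.MAJOR]))
# Only the spelling correction and the recoding.
print(next_version("1.2.3", [Change.REVISION, Change.MINOR]))
```

Using an ordered enumeration keeps the rule to a single comparison: several changes released together are treated as one change at the highest level present.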