Georeferencing of survey data
Leaders: Dr. Pascal Siegers, Dr. Katharina Kinder-Kurlanda, Wolfgang Zenk-Möltgen
Scientific unit: Data Archive for the Social Sciences (DAS), Monitoring Society and Social Change
The aim of this project is to develop an infrastructure that enables scientists to enrich survey data with small-scale area information. Enrichment is achieved by geocoding the survey data and merging them with spatial thematic data. The project thus unlocks the analytical potential of spatial data for social science research. The geocoding service being developed will be available for scientific research projects for the long term.
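The merging step can be illustrated with a minimal sketch. It assumes that geocoding has already assigned each respondent an area identifier; the field names (`area_id`, `unemployment_rate`), the function `enrich`, and all values are hypothetical, invented purely for illustration, and are not taken from the project.

```python
from typing import Any, Dict, List


def enrich(survey_rows: List[Dict[str, Any]],
           area_attributes: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Attach area-level attributes to each survey record via its area identifier."""
    enriched = []
    for row in survey_rows:
        # Records whose area is unknown keep only their survey variables.
        attrs = area_attributes.get(row["area_id"], {})
        enriched.append({**row, **attrs})
    return enriched


# Made-up example data: two respondents, one small-scale area with thematic data.
survey = [
    {"respondent_id": 1, "area_id": "cell_001", "trust": 4},
    {"respondent_id": 2, "area_id": "cell_999", "trust": 2},
]
areas = {"cell_001": {"unemployment_rate": 5.2}}

enriched_survey = enrich(survey, areas)
```

In this sketch, respondents whose area has no thematic data are kept rather than dropped, so the enriched file still contains every survey record.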
Implementing a geocoding service for survey data requires a conceptual and practical basis for georeferencing. This involves three problem areas. First, technical expertise in georeferencing survey data has to be developed. Second, to facilitate reuse, georeferenced survey data have to be documented in accordance with the established metadata standard DDI. Third, data protection issues regarding the provision of georeferenced survey data for secondary analyses have to be resolved by developing test procedures that assess the risk of de-anonymization. In this way, an appropriate mode of access can be defined for each particular case. Two practical realizations of georeferencing serve as prototypes for the implementation of the geocoding service. By developing georeferenced, DDI-documented data sets that comply with data protection legislation and can serve as models for other projects, the project will create the urgently needed basis for generating and disseminating georeferenced survey data.
Furthermore, georeferencing is useful for creating consistent time series for longitudinal analyses. Analyzing long time series often entails harmonization problems caused by local government reforms. For example, because of local government reorganizations, the administrative districts of 2000 are not comparable with those of 2012. With geocoded survey data, consistent time series can be created by assigning participants from “older” survey waves to the boundaries of the current administrative units.
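Assigning respondents from older waves to current units can be sketched as a point-in-polygon lookup. The example below is a simplified illustration, not the project's implementation: it assumes planar coordinates, simple polygons without holes, and uses a standard ray-casting test. The district names, coordinates, and function names are hypothetical; a production workflow would typically rely on a GIS library and official boundary files.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count how often a ray to the right crosses the boundary."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_to_current_units(respondents: Dict[str, Point],
                            current_units: Dict[str, Polygon]) -> Dict[str, Optional[str]]:
    """Map each geocoded respondent to the current unit that contains its location."""
    return {
        rid: next((uid for uid, poly in current_units.items()
                   if point_in_polygon(pt, poly)), None)
        for rid, pt in respondents.items()
    }


# Hypothetical current district boundaries (two adjacent squares).
units_current = {
    "district_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "district_B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
# Hypothetical geocoded respondents from an older survey wave.
old_wave = {"r1": (1.0, 1.0), "r2": (3.0, 1.0), "r3": (5.0, 5.0)}

assignment = assign_to_current_units(old_wave, units_current)
```

Because the lookup uses only the respondent's coordinates and the current boundaries, the district codes recorded at the time of the older wave are not needed for harmonization.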