Missed an episode? No problem! In our archive you can watch and listen to all episodes of our experts series. Please find links to the recordings on YouTube and slides for downloading in the talk descriptions.

Season 4: Augmenting survey data by linking and harmonisation 

This season of the “Meet the Experts” series presents a range of GESIS services provided by the department “Survey Data Curation”. We showcase services that augment surveys by linking them with other data types or by harmonising across survey waves and countries. Topics include linking with geospatial data (a geocoder package and an application using GLES), linking with expert-coded data, an application of the cumulated ALLBUS, and the Eurobarometer and its extensions.

Slides (2.59 MB) | This talk on YouTube | MTE Playlist

The talk will be given in English.

Traditionally, quantitative social scientists have mostly used surveys to study social phenomena. Even though very useful, surveys mostly capture self-reported, retrospective individual perceptions in cross-sections or at larger intervals. Social science theories, in contrast, are seldom limited to individual, slow-moving factors. Individual behaviour is embedded into social, temporal, and spatial contexts. With the digital revolution, vast amounts of context data have become available from official statistics, social insurance schemes, media outlets, and a plethora of other sources. These data cover significant organizational, temporal, or spatial dimensions of the human action space. Scholars have started exploiting this newly available data space to augment surveys with context information.

Respondent-based linking to survey data, however, poses conceptual and practical challenges. We present an overview of four such challenges that emerge when working with different types of data, along with potential solutions. First, we discuss the challenge of obtaining consent for linking survey data with internet behaviour. We suggest strategies to reduce sample bias from self-selection. Second, we present the challenge of choosing appropriate levels of aggregation for spatial data. Parameterized geocoders provide a solution by facilitating tailor-made linking with survey data. Third, we discuss the challenge of identifying respondents’ treatment status in real-world experiments when data privacy prohibits full disclosure. Pseudo-randomization techniques can mitigate this obstacle. Fourth, we present the challenge of aligning time frames of expert-coded data to survey data. Careful documentation and data management mitigate this issue.

Speakers:

Dr. Pascal Siegers

Dr. Sebastian Ziaja 

Slides (1.49 MB) | This talk on YouTube | MTE Playlist

The talk will be given in English.

This talk presents a linking approach that combines expert assessments of electoral integrity with survey data to study how electoral integrity affects the way in which election results translate into citizen attitudes towards the political system. It introduces a causal mechanism that links political losing to political trust via evaluations of electoral fairness: citizens who voted for the losing camp are more likely to view the electoral process as unfair than citizens who voted for the winning camp, resulting in political distrust. It further suggests that the effects of political losing depend on the level of electoral integrity. In conditions where elections are conducted in a free and fair manner, even those who voted for the losing camp have little reason to suspect foul play. Whenever there are actual indications of electoral malpractice, however, political losers have much more reason to doubt the integrity of the electoral process than those who are content with the election outcome. The analysis makes use of a unique dataset that harmonizes, ex post, survey data from three cross-national survey projects (Asian Barometer Survey, European Social Survey, Latinobarómetro) and links these survey data with expert assessments of electoral integrity provided by the Varieties-of-Democracy (V-Dem) project to cover 45 democracies in Europe, East Asia, and Latin America. The talk details how respondents (in the survey data) were matched to elections (in the V-Dem data) and discusses challenges arising from survey fieldwork periods not being aligned with election cycles.
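As an illustration of the matching problem described above, the following minimal Python sketch assigns each respondent the most recent election held in their country on or before the interview date. This is one plausible matching rule, not necessarily the one used in the talk. The country codes, election dates, respondent records, and the function names `most_recent_election` and `link_respondents` are hypothetical and are not taken from V-Dem or any of the survey projects.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical election dates per country, sorted in ascending order
# (illustrative values only, not V-Dem data).
ELECTIONS = {
    "A": [date(2010, 5, 1), date(2014, 5, 1), date(2018, 5, 1)],
    "B": [date(2012, 11, 1), date(2016, 11, 1)],
}

# Hypothetical respondents with their interview dates.
RESPONDENTS = [
    {"id": 1, "country": "A", "interview_date": date(2015, 3, 1)},
    {"id": 2, "country": "B", "interview_date": date(2012, 6, 1)},
]


def most_recent_election(country, interview_date, elections=ELECTIONS):
    """Return the latest election held on or before the interview date.

    Returns None if the country has no election on or before that date.
    Assumes the list of election dates for each country is sorted.
    """
    dates = elections.get(country, [])
    idx = bisect_right(dates, interview_date)
    return dates[idx - 1] if idx > 0 else None


def link_respondents(respondents, elections=ELECTIONS):
    """Attach the matched election date and the time gap to each respondent."""
    linked = []
    for r in respondents:
        election = most_recent_election(r["country"], r["interview_date"], elections)
        gap = (r["interview_date"] - election).days if election else None
        linked.append({**r, "election_date": election, "days_since_election": gap})
    return linked


linked = link_respondents(RESPONDENTS)
```

Keeping the gap in days alongside the match makes the misalignment between fieldwork periods and election cycles visible, so that analysts can inspect or filter respondents interviewed long after the matched election.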

Speaker:

Dr. Marlene Mauk

Slides (1.68 MB) | This talk on YouTube | MTE Playlist

The talk will be given in English.

Analysis of the global rise in political resentments and support for radical right parties reveals strong spatial patterns within the affected countries. In this light, some scholars go as far as talking about “the revenge of the places that don't matter” (Rodríguez-Pose 2018), pointing out that relatively deprived places serve as breeding grounds of discontent.

Dubbed the “rural-urban divide”, “spatial polarization”, “geography of discontent” or “left behind places”, this phenomenon has, in recent years, increasingly attracted scholarly attention. However, the analyses of micro-level mechanisms driving such place-based effects require linking survey data with small-scale contextual information.

This talk introduces geocoded survey data from the German Longitudinal Election Study (GLES) and provides an overview of the data preparation process, data set structure, and data access. With a focus on the 2021 German Federal Election, the presentation highlights research opportunities available when linking these georeferenced survey data with administrative data and presents empirical results on how contextual factors affect individuals' attitudes and contribute to radical right voting.

Speaker:

Anne-Kathrin Stroppe

Slides (3.11 MB) | This talk on YouTube | MTE Playlist

The talk will be given in German.

The German General Social Survey (ALLBUS) is a central component of the social science infrastructure in Germany. Since 1980, attitudes, behaviors and sociodemographic characteristics of the German population have been surveyed every two years in repeated cross-sectional studies. The replication of different question sets and a constant sociodemographic module provide a rich data source for studying German society and its social change. Every wave of the ALLBUS contains one or two core questionnaire modules. In general, main topics are intended to be replicated in a 10-year cycle. The talk offers an introduction to the analytical potential of ALLBUS and presents the available data offerings. A special focus will be on the data of the ALLBUS cumulation, whose long time series allows for analyses of social change over a period of up to 40 years. Changes in the ALLBUS methodology and sampling design are also highlighted. Concrete application examples introduce the cumulated data set and show how changes in the ALLBUS methodology can be taken into account in longitudinal analyses.

Speaker:

Dr. Sonja Schulz

Slides (1.82 MB) | This talk on YouTube | MTE Playlist

The talk will be given in English.

The Eurobarometer, conducted for the European Commission and the European Parliament, is one of the longest-running international survey collections. It spans five decades and more than 200 waves of Standard and Special Eurobarometer surveys. Additionally, there are more than 500 so-called Flash Eurobarometers alongside further collections such as the Central and Eastern Eurobarometer, the Candidate Countries Eurobarometer, and the recent COVID-19 surveys conducted at the behest of the European Parliament.

This edition of “Meet the Experts” will give an overview of these data, their history and thematic breadth, and their key advantages. The presentation will also refer to research publications that exemplify the unique benefits of this survey series for a variety of topics and methodological approaches. It will also mention some of the more unusual parts of the Eurobarometer stock, such as surveys of youths, of companies, or of countries outside the European context. Finally, the talk will showcase a selection of user-generated cumulations of these surveys deposited at GESIS. In sum, the talk will give an overview of the potential of the Eurobarometer for a diverse set of topics and disciplines.

Speaker:

Dr. Boris Heizmann

Slides (9.11 MB) | This talk on YouTube | MTE Playlist

The talk will be given in English.

Geospatial data has become increasingly widespread in the social sciences. Applications not only extend to the analysis of classical geographical entities (e.g., policy diffusion across spatially proximate countries) but also to analyses of micro-level data, including respondent information from georeferenced surveys or user trace data from social media. Georeferencing of survey and digital behavioral data opens up new possibilities for spatial linking with contextual variables, such as emission levels from industries or traffic, land use indicators, or socio-economic variables. At the same time, spatial linking also creates new challenges, especially regarding data protection issues.

This talk presents current geocoding-related projects at GESIS. It introduces the bkggeocoder, a geocoding tool based on data from Germany's Federal Agency for Cartography and Geodesy (BKG). The bkggeocoder provides access to the geocoded address database from the BKG via an R interface. Licensed users of the BKG services, such as GESIS, can directly use the BKG’s API to receive geocoded address information in a tidy data table, including important information about data quality. Alternatively, an offline interface provides more fine-grained methods of navigating data protection challenges in cases where an API may not meet a specific project’s requirements for data storage.

Speaker:

Dr. Stefan Jünger

Slides (1.06 MB) | This talk on YouTube | MTE Playlist

The talk will be given in German.

All surveys of individuals measure basic socio-demographic characteristics of the respondents. Recommendations for the collection of these characteristics in German surveys have been available since the late 1970s in the form of the "Demographic Standards". However, not all surveys can use the standard items recommended there. The data from different surveys can therefore not easily be compared or linked with each other (see also https://doi.org/10.5281/zenodo.6810973). Official classifications already exist for some socio-demographic characteristics (e.g. the German classification of occupations KldB). However, this is not the case for most characteristics. KonsortSWD has set itself the goal of closing this gap. To this end, specifications for standard variables were developed for characteristics for which no generally accepted target variable or classification exists to date, or existing standards were adapted for application to survey data. These were subjected to extensive empirical validation and discussion with experts and potential users. This talk will present the developed socio-demographic standard variables as well as the validation results. 

Speakers:

Dr. Silke Schneider, Lennart Palm

Slides (1.61 MB) | This talk on YouTube | MTE Playlist

The talk will be given in English.

With researchers increasingly recognizing the potential benefits of combining different survey data sources, GESIS has begun extending its offers of harmonized data sets in recent years (sometimes in the form of syntax files that perform the harmonization at the users' end). The intention is to relieve researchers of the often very laborious and error-prone process of large-scale ex-post data harmonization and merging by making available ready-to-use, harmonized data files in areas where we expect the biggest demand or see relevant research opportunities.

This talk will lay out the range of such offers, especially in the area of comparative surveys, and briefly present some examples of harmonized data sets and projects available at the GESIS data archive for re-use. We will then discuss in more depth the potential issues connected to creating, and consequently also to using, large-scale harmonized data files to allow a better understanding of the utility of such files, but also of the methodological limitations they come with. We will illustrate these issues with examples that combine multiple survey sources of different degrees of heterogeneity.

Thus, the talk can be understood as a primer on the competent use of such files, alerting potential users both to opportunities and to practical pitfalls to look out for when dealing with these data files.
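To give a sense of what a harmonization step performed at the users' end might look like, the following minimal Python sketch maps an item measured on different response scales onto a common 0 to 1 target scale. The source names `survey_x` and `survey_y`, their scale definitions, and the function `harmonize` are hypothetical and do not describe any actual GESIS syntax file or survey. Linear rescaling is only one simple approach, and it does not by itself make the resulting values comparable across sources.

```python
# Hypothetical source scales: (minimum, maximum, reversed?).
# "reversed" means that low codes stand for high values of the concept.
SCALES = {
    "survey_x": (1, 4, True),    # 1 = highest ... 4 = lowest
    "survey_y": (0, 10, False),  # 0 = lowest ... 10 = highest
}


def harmonize(source, value, scales=SCALES):
    """Linearly rescale a response to the range 0..1, or return None if invalid.

    Values outside the documented range (for example missing-value codes
    such as 99) are treated as missing.
    """
    low, high, is_reversed = scales[source]
    if value is None or not low <= value <= high:
        return None
    scaled = (value - low) / (high - low)
    return 1 - scaled if is_reversed else scaled


# Hypothetical pooled records from both sources.
records = [("survey_x", 1), ("survey_x", 4), ("survey_y", 5), ("survey_y", 99)]
harmonized = [harmonize(source, value) for source, value in records]
```

Keeping the scale definitions in one documented table, rather than scattered through the recoding logic, makes it easier to trace how each source variable was transformed.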

Speakers:

Markus Quandt, Ivet Solanes Ros