72 years of parliamentary discourse: Discover the dataset of official German Bundestag records now


Categories: GESIS-News

We are excited to introduce the Pollux Political Corpora (PoliCorp) platform, an advanced resource that offers researchers structured and searchable access to processed political corpora. Part of the Pollux project, this platform enables in-depth analysis of parliamentary discourse through rich textual datasets.

Currently, the platform hosts data from the GermaParl corpus, a comprehensive linguistic dataset comprising the official protocols of plenary debates published by the German Bundestag. The corpus spans 72 years of parliamentary discourse, from September 7, 1949, to September 7, 2021, and contains 958,100 speech contributions.

PoliCorp gives political scientists and researchers from other disciplines access to structured data through a web search interface. The advanced search lets researchers combine multiple fields and apply the logical operators AND, OR, and NOT to combine or exclude search criteria, which makes it easier to filter large volumes of parliamentary debate data and uncover patterns within them. Selected datasets can be downloaded freely in JSON format for further analysis with computational tools.
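
For readers who want to continue the same kind of AND/OR/NOT filtering offline, the following is a minimal Python sketch of how a downloaded JSON file might be queried. It is not PoliCorp's own implementation, and the export schema is not documented here: the field names (`speaker`, `party`, `date`, `text`) and the sample records are assumptions invented purely for illustration.

```python
import json

# Hypothetical sample standing in for a PoliCorp JSON download.
# The field names ("speaker", "party", "date", "text") are assumptions,
# and the records are invented placeholders, not real corpus content.
SAMPLE_JSON = """
[
  {"speaker": "Speaker A", "party": "Party X", "date": "1990-10-04",
   "text": "Debate on the budget and taxation."},
  {"speaker": "Speaker B", "party": "Party Y", "date": "2005-03-17",
   "text": "Climate policy and the budget."},
  {"speaker": "Speaker C", "party": "Party X", "date": "2019-06-27",
   "text": "Climate policy in Europe."}
]
"""


def contains(field, term):
    """Build a predicate: True if `term` occurs in record[field] (case-insensitive)."""
    def predicate(record):
        return term.lower() in str(record.get(field, "")).lower()
    return predicate


def all_of(*preds):
    """Logical AND over predicates."""
    return lambda record: all(p(record) for p in preds)


def any_of(*preds):
    """Logical OR over predicates."""
    return lambda record: any(p(record) for p in preds)


def negate(pred):
    """Logical NOT of a predicate."""
    return lambda record: not pred(record)


def filter_speeches(records, pred):
    """Return the records for which the combined predicate holds."""
    return [r for r in records if pred(r)]


speeches = json.loads(SAMPLE_JSON)

# Query: text contains "climate" AND NOT party contains "Party Y"
query = all_of(contains("text", "climate"), negate(contains("party", "Party Y")))
matches = filter_speeches(speeches, query)
print([m["speaker"] for m in matches])  # prints ['Speaker C']
```

With a real download, `json.loads(SAMPLE_JSON)` would be replaced by reading the exported file, and the field names would need to be adjusted to whatever the export actually contains.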

A demo version of the platform is available for trial at https://demo-pollux.gesis.org/.