New data publications available


Categories: GESIS-News

The dataset for this study was retrieved from the bioRxiv preprint server and the Crossref metadata API in July 2024. It covers two time periods: the pre-pandemic period (2016–2018) and the COVID-19 pandemic period (2020–2022). It contains detailed metadata about preprints and their published versions, including titles, authors, abstracts, institutions, submission and publication dates, licenses, and subject categories. The metadata was processed to facilitate analysis, for example by standardizing date formats, normalizing author names, and selecting the first and last version of each preprint.
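
The processing steps named above can be illustrated with a minimal Python sketch. It is not the authors' actual pipeline: the record fields (`doi`, `version`), the list of input date formats, and all function names are assumptions made for illustration only.

```python
from datetime import datetime
import unicodedata

# Assumed input formats; the formats actually present in the raw metadata
# are not documented in this announcement.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")


def standardize_date(raw):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_author(name):
    """Strip diacritics, collapse whitespace, and lowercase an author name."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.split()).lower()


def first_and_last_versions(records):
    """Map each DOI to a (first, last) pair of its version records."""
    by_doi = {}
    for rec in records:
        by_doi.setdefault(rec["doi"], []).append(rec)
    result = {}
    for doi, versions in by_doi.items():
        # Version numbers are assumed to be integers stored as strings.
        ordered = sorted(versions, key=lambda r: int(r["version"]))
        result[doi] = (ordered[0], ordered[-1])
    return result
```

Normalizing names this aggressively (lowercasing, dropping diacritics) helps matching across sources but discards information, so a real workflow would likely keep the original spelling alongside the normalized form.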

The OpenSSCI dataset comprises reference metadata and citation links derived from 63,070 full-text academic documents archived in SSOAR (Social Science Open Access Repository). The data has been curated within the OUTCITE project for downstream ingestion into OpenCitations. The main goal of OUTCITE is to research, develop, and deploy an open-source toolchain for linking literature references (including non-source items) to their sources. Demo system: <https://demo-outcite.gesis.org/>