GESIS Leibniz Institute for the Social Sciences

InFoLiS I and II

Integration of Research Data and Literature in the Social Sciences

Team: Katarina Boland
Leader: Dr. Benjamin Zapilko
Scientific unit: Knowledge Technologies for the Social Sciences (WTS)


The goals of the first project phase were the development of methods for automatically linking publications and research data in the social sciences, the integration of the detected links into the search systems of the project partners, and the automatic classification of research data for better retrieval.

InFoLiS II extends this focus to additional scientific and scholarly domains and to languages beyond German. Concretely, the InfoLink toolkit developed in the project will be applied to a variety of publications and datasets from the social sciences, economics and related disciplines, in both English and German. To massively expand the corpus of texts and datasets, cooperation has been agreed with national and international research institutions, repository administrators and publishers.

The aim is to build a flexible, sustainable long-term infrastructure to house the algorithms developed in the precursor project for finding links between publications and datasets and for embedding those links into existing systems.

Based on the paradigms of Linked Open Data (LOD) and RESTful web services, all steps of the InfoLink toolkit will be implemented as self-contained components, and workflows will be put in place to automatically update our knowledge base with new data on a regular basis.
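To illustrate the Linked Open Data idea, the following minimal Python sketch serializes publication-to-dataset links as N-Triples lines. It is not the InfoLink implementation: the example.org URIs and the function name link_to_ntriple are hypothetical, and the choice of the Dublin Core predicate dcterms:references is an assumption made only for this illustration.

```python
# Minimal sketch: publication-to-dataset links expressed as N-Triples.
# Assumption: dcterms:references is used as the linking predicate; the
# project's actual vocabulary is not stated in this text.
DCTERMS_REFERENCES = "http://purl.org/dc/terms/references"


def link_to_ntriple(publication_uri: str, dataset_uri: str,
                    predicate: str = DCTERMS_REFERENCES) -> str:
    """Serialize one link as a single N-Triples line.

    No IRI escaping or validation is performed; the inputs are assumed
    to be well-formed absolute IRIs.
    """
    return f"<{publication_uri}> <{predicate}> <{dataset_uri}> ."


# Hypothetical identifiers, not taken from any real knowledge base.
links = [
    ("http://example.org/publication/1", "http://example.org/dataset/A"),
    ("http://example.org/publication/2", "http://example.org/dataset/A"),
]

ntriples = "\n".join(link_to_ntriple(pub, ds) for pub, ds in links)
print(ntriples)
```

Because each link is a self-contained statement, a component exposed as a RESTful service could return such lines for a given publication, and other systems could merge them without knowing how the link was detected.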

Widespread use of the links detected by InfoLink will be encouraged. To work towards wider acceptance, software will be developed to integrate those links into existing research platforms such as publication databases, research data repositories, research infrastructure databases and discovery platforms.

To demonstrate how data integrated in this way improves information retrieval, a research prototype will be built that uses the full power of the graph spanned between publications, research data and authors.
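As a rough illustration of how such a graph can support retrieval, the sketch below finds publications related to a given one because they are linked to the same dataset. This is an assumed, simplified stand-in for the planned prototype: the identifiers and the function name related_publications are hypothetical, and authors are left out for brevity.

```python
from collections import defaultdict

# Hypothetical (publication, dataset) links of the kind a link-detection
# step might produce; the identifiers are invented for this example.
links = [
    ("pub1", "dataA"),
    ("pub2", "dataA"),
    ("pub2", "dataB"),
    ("pub3", "dataB"),
    ("pub4", "dataC"),
]


def related_publications(links, publication):
    """Return publications that share at least one dataset with `publication`."""
    by_dataset = defaultdict(set)      # dataset -> publications linked to it
    by_publication = defaultdict(set)  # publication -> datasets it is linked to
    for pub, dataset in links:
        by_dataset[dataset].add(pub)
        by_publication[pub].add(dataset)

    related = set()
    for dataset in by_publication[publication]:
        related |= by_dataset[dataset]
    related.discard(publication)       # a publication is not related to itself
    return sorted(related)


print(related_publications(links, "pub2"))
```

A search system could use this kind of two-hop traversal to suggest further reading for a publication, or to list all publications that use a dataset a user is viewing.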

The trend towards describing, archiving and publishing research data is gaining traction. However, this is a very heterogeneous process.

On the one hand, the granularity (what is the smallest element of research data in need of description?) and the possible intermediate aggregation steps vary widely. On the other hand, the nature of the relation between publication and dataset varies widely as well.

To formalize this, a research data ontology will be developed. The resulting improvements of the search process and the reusability of the links in general will also be a focus of the project.

Further information can be found on the InFoLiS webpage.


01.08.2011 - 31.07.2016

Sponsored by


  • Institut für Informatik und Wirtschaftsinformatik, Universität Mannheim
  • Universitätsbibliothek Mannheim



  • Boland, K.; Mathiak, B. (2013): Connecting Literature and Research Data. In: IASSIST 2013 - Data Innovation: Increasing Accessibility, Visibility, and Sustainability, Cologne, Germany, May 29-31, 2013.
  • Boland, K.; Ritze, D.; Eckert, K.; Mathiak, B. (2012): Identifying references to datasets in publications. In: Zaphiris, P.; Buchanan, G.; Rasmussen, E.; Loizides, F. (Eds.): Proceedings of the Second International Conference on Theory and Practice of Digital Libraries (TPDL 2012), pp. 150-161, 2012.
  • Ritze, D.; Boland, K. (2013): Integration of Research Data and Research Data Links into Library Catalogues. In: Proceedings of the International Conference on Dublin Core and Metadata Applications (DC 2013), 2013.