Keynote by Dr. Iana Atanassova, CRIT, Université de Bourgogne Franche-Comté, France
Title of the talk: Beyond Metadata: the New Challenges in Mining Scientific Papers
Abstract: Scientific articles make use of complex argumentative structures whose exploitation from a computational point of view is an important challenge. The exploration of scientific corpora involves methods and techniques from Natural Language Processing in order to develop applications in the fields of Information Retrieval, Automatic Synthesis, citation analysis, and ontology population. Among the problems that remain to be addressed in this domain is the development of fine-grained analysis of the text content of articles to identify specific semantic categories, such as the expression of uncertainty and controversy, that are an integral part of the scientific process. The well-known IMRaD structure (Introduction, Methods, Results, and Discussion) is a standard template that governs the structure of articles in experimental sciences and provides clearly identifiable text units. We study the internal structure of articles from several different perspectives and report on the processing of a large sample extracted from the PLOS corpus. On the one hand, we analyze citation contexts with respect to their positions, the verbs used, and similarities across the different sections; on the other hand, we quantify text re-use in abstracts as well as other phenomena such as the expression of uncertainty. The production of standard datasets dedicated to such tasks is now necessary and would provide a favorable environment for the development of new approaches, e.g. using neural networks that require large amounts of labeled data.
Accepted Long Presentations
Accepted Short Presentations
Accepted Demo Presentation
You are invited to submit to the 8th international workshop on Bibliometric-enhanced Information Retrieval (BIR 2019), to be held as part of the 41st European Conference on Information Retrieval (ECIR 2019, https://www.ecir2019.org/).
The Bibliometric-enhanced Information Retrieval (BIR) workshop series at ECIR tackles issues related to academic search, at the crossroads between Information Retrieval and Bibliometrics. BIR is a hot topic investigated by both academia (e.g., ArnetMiner, CiteSeerX, DocEar) and industry (e.g., Google Scholar, Microsoft Academic Search, Semantic Scholar). A one-day workshop will be held at ECIR 2019 in Cologne, Germany.
Past BIR proceedings are available online as open access at https://dblp.org/search?q=BIR.ECIR
Searching for scientific information is a long-lived information need. In the early 1960s, Salton (1963) was already striving to enhance information retrieval by including clues inferred from bibliographic citations. The development of citation indexes pioneered by Garfield (1955) proved decisive for such a research endeavour at the crossroads between the nascent fields of Information Retrieval (IR) and Bibliometrics [Bibliometrics refers to the statistical analysis of the academic literature (Pritchard, 1969) and plays a key role in scientometrics: the quantitative analysis of science and innovation (Leydesdorff & Milojevic, 2015)]. The pioneers who established these fields in Information Science, such as Salton and Garfield, were followed by scientists who specialised in one of them (White & McCain, 1998), leading to the two loosely connected fields we know today.
The purpose of the BIR workshop series, founded in 2014, is to tighten the link between IR and Bibliometrics. We strive to bring together the ‘retrievalists’ and ‘citationists’ (White & McCain, 1998) active in both academia and industry who are developing search engines and recommender systems such as ArnetMiner, CiteSeerX, DocEar, Google Scholar, Microsoft Academic Search, and Semantic Scholar, to name a few.
These bibliometric-enhanced IR systems must deal with the multifaceted nature of scientific information by searching for or recommending academic papers, patents, venues (i.e., conferences or journals), authors, experts (e.g., peer reviewers), references (to be cited to support an argument), and datasets. The underlying models harness relevance signals from keywords provided by authors, topics extracted from the full texts, coauthorship networks, citation networks, and various classification schemes of science.
Bibliometric-enhanced IR is a hot topic whose recent developments made the news; see for instance the Initiative for Open Citations (Shotton, 2018) and the Google Dataset Search (Castelvecchi, 2018) launched on September 4, 2018. We believe that BIR@ECIR is a much-needed scientific event for the ‘retrievalists’ and ‘citationists’ to meet and join forces in pushing the knowledge boundaries of IR applied to literature search and recommendation.
We welcome submissions regarding all three aspects of the search/recommendation process:
We especially invite descriptions of running projects and ongoing work as well as contributions from industry. Papers that investigate multiple themes directly are especially welcome.
All submissions must be written in English following the Springer LNCS author guidelines (6 to 12 pages) and should be submitted as PDF files to EasyChair. All submissions will be reviewed by at least two independent reviewers. Please note that at least one author per paper needs to register for the workshop and attend it to present the work. In case of a no-show, the paper (even if accepted) will be removed from the proceedings AND from the program.
Workshop proceedings will be deposited online in the CEUR workshop proceedings publication service (ISSN 1613-0073), so that they will be permanently available and citable (persistent digital identifiers and long-term preservation). A special issue of the Scientometrics journal (http://link.springer.com/journal/11192) will include extended versions of the best papers presented at the workshop.