Staff

The many faces of GESIS

Vita

Sharmila Upadhyaya has been a doctoral student at GESIS – Leibniz Institute for the Social Sciences since November 2022. In October 2022, she completed her Erasmus Mundus Language and Communication Technology (LCT) Master’s degree at the University of Lorraine, France, and Saarland University, Germany, with a specialization in Computational Linguistics. Before that, she worked as an NLP engineer at Ekbana Solutions in Nepal for two years.

Research

Sharmila's research interests include NLP, speech processing, and knowledge graphs.

Publications

Journal article

Tong, Lin, Xu Tong, Lei Lei, Ziling Zeng, Sihong Liu, Lei Zhang, Cheng Wang, Sharmila Upadhyaya, Hongjun Yang, and Huamin Zhang. 2024. "Chinese text recognition and knowledge graph of Shen Nong Ben Cao Jing based on BERT pretrained language models." Guidelines and Standards in Chinese Medicine 2 (1): 13-20. doi: https://doi.org/10.1097/gscm.0000000000000017.

Chapter in an edited book

Otto, Wolfgang, Lu Gan, Sharmila Upadhyaya, Saurav Karmakar, and Stefan Dietze. 2026 (Forthcoming). "GSAP-ERE: Fine-Grained Scholarly Entity and Relation Extraction Focused on Machine Learning." In Proceedings of the 40th AAAI Conference on Artificial Intelligence (AAAI-26), Proceedings of the Conference on Artificial Intelligence (AAAI). Washington DC: AAAI Press.

Upadhyaya, Sharmila, Wolfgang Otto, Stefan Dietze, and Frank Krüger. 2025. "Joint Named Entity and Relation Extraction from Software Mentions in Scholarly Articles." In 2nd Conference on Research Data Infrastructure (CoRDI), Aachen, Germany, 26-28 August 2025, edited by York Sure-Vetter and Paul Groth. Zenodo. doi: https://doi.org/10.5281/zenodo.16736340.

Ahmad, Raia Abu, Rana Abdulla, Tilahun A. Taffa, Sören Auer, Hamed Babaei Giglou, Ekaterina Borisova, Zongxiong Chen, Stefan Dietze, Jennifer D’Souza, Mayra Elwes, Genet-Asefa Gesese, Shufan Jiang, Ekaterina Kutafina, Philipp Mayr-Schlegel, Georg Rehm, Sameer Sadruddin, Sonja Schimmler, Daniel Schneider, Kanishka Silva, Sharmila Upadhyaya, and Ricardo Usbeck. 2025. "NFDI4DS: Shared Tasks for Scholarly Document Processing." In Informatik 2025, Lecture Notes in Informatics (LNI) - Proceedings P-366, 1195. Bonn: Gesellschaft für Informatik e.V. doi: https://doi.org/10.18420/inf2025_103.

Silva, Kanishka, Marcel R. Ackermann, Heike Fliegl, Genet Asefa Gesese, Fidan Limani, Philipp Mayr, Peter Mutschke, Allard Oelen, Muhammad Asif Suryani, Sharmila Upadhyaya, Benjamin Zapilko, Harald Sack, and Stefan Dietze. 2025. "Research Knowledge Graphs in NFDI4DataScience: Key Activities, Achievements, and Future Directions." In Informatik 2025: The Wide Open: Offenheit von Source bis Science, 16. – 19. September 2025 Potsdam, edited by Ulrike Lucke, Stefan Stieglitz, Falk Uebernickel, Anna-Lena Lamprech, and Maike Klein, Lecture Notes in Informatics (LNI) - Proceedings 366, 1183-1193. Bonn: Gesellschaft für Informatik e.V. https://nextcloud.gi.de/s/YcW26W9ApSLD6on.

Upadhyaya, Sharmila, Wolfgang Otto, Frank Krüger, and Stefan Dietze. 2025. "SOMD2025: A Challenging Shared Tasks for Software Related Information Extraction." In Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025), edited by Tirthankar Ghosal, Philipp Mayr, Amanpreet Singh, Aakanksha Naik, Georg Rehm, Dayne Freitag, Dan Li, Sonja Schimmler, and Anita de Waard, 137-145. Association for Computational Linguistics. doi: https://doi.org/10.18653/v1/2025.sdp-1.13. https://aclanthology.org/2025.sdp-1.13/.

Otto, Wolfgang, Sharmila Upadhyaya, Lu Gan, and Kanishka Silva. 2025. "Track Machine Learning in Your Research Domain." In 2nd Conference on Research Data Infrastructure (CoRDI). doi: https://doi.org/10.5281/zenodo.16736334.

Otto, Wolfgang, Sharmila Upadhyaya, and Stefan Dietze. 2024. "Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models." In Natural Scientific Language Processing and Research Knowledge Graphs. NSLP 2024, edited by Georg Rehm, Stefan Dietze, Sonja Schimmler, and Frank Krüger, Lecture Notes in Computer Science 14770, 289-306. Cham: Springer Nature. doi: https://doi.org/10.1007/978-3-031-65794-8_21. https://link.springer.com/content/pdf/10.1007/978-3-031-65794-8.pdf.

Karmakar, Saurav, Matthäus Zloch, Fidan Limani, Benjamin Zapilko, Sharmila Upadhyaya, Jennifer D'Souza, Leyla Jael Castro, Georg Rehm, Marcel R. Ackermann, Harald Sack, Zeyd Boukhers, Sonja Schimmler, Peter Mutschke, and Stefan Dietze. 2023. "Research Knowledge Graphs in NFDI4DS." In INFORMATIK 2023 - Designing Futures: Zukünfte gestalten, edited by Maike Klein, Daniel Krupka, Cornelia Winter, and Volker Wohlgemuth, Lecture Notes in Informatics (LNI) - Proceedings P-337, 909-918. Gesellschaft für Informatik e.V. doi: https://doi.org/10.18420/inf2023_102.

Working and discussion paper

Zapilko, Benjamin, Daniel Mietchen, Sharmila Upadhyaya, and Fidan Limani. 2025. KGI4NFDI - Best practices and guidelines for metadata mapping, linking, and integration. Zenodo. doi: https://doi.org/10.5281/zenodo.15780728.

Tong, Xu, Nina Smirnova, Sharmila Upadhyaya, Ran Yu, Chao Sun, Jack Culbert, Wolfgang Otto, and Philipp Mayr. 2024. Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study. arXiv. doi: https://doi.org/10.48550/arXiv.2408.13501.

Data/Software

Upadhyaya, Sharmila. 2024. gesisDataSeachKG Resources. doi: https://doi.org/10.5281/zenodo.11070842.

Presentation at a conference

Otto, Wolfgang, Lu Gan, Sharmila Upadhyaya, Saurav Karmakar, and Stefan Dietze. 2026. "GSAP-ERE: Fine-Grained Scholarly Entity and Relation Extraction Focused on Machine Learning." The 40th Annual AAAI Conference on Artificial Intelligence (AAAI2026), Singapore, 2026-01-20. https://arxiv.org/pdf/2511.09411.

Upadhyaya, Sharmila, Wolfgang Otto, Stefan Dietze, and Frank Krüger. 2025. "Joint Named Entity and Relation Extraction from Software Mentions in Scholarly Articles." 2nd Conference on Research Data Infrastructure (CoRDI), 2025-04-08. doi: https://doi.org/10.5281/zenodo.16736340.

Otto, Wolfgang, Sharmila Upadhyaya, and Stefan Dietze. 2024. "Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models." Natural Scientific Language Processing and Research Knowledge Graphs (NSLP 2024), Hersonissos, Crete, Greece, 2024-05-27.