Value-Added Services for Information Retrieval II (IRM II)

Developing Scalable Web Services to Support Search

Project staff: Thomas Lüke, Dr. Philipp Schaer
Project lead: Prof. Dr. York Sure, Dr. Philipp Mayr, Peter Mutschke
Research unit: Knowledge Technologies for the Social Sciences (WTS)

Project Description

Information on the predecessor project IRM can be found on the corresponding project page.

The main goal of the first project phase (IRM) was to implement three value-added service prototypes: a Search-Term-Recommender and two new re-ranking methods, Bradfordizing and Author Centrality. These three prototypes were integrated into an interactive retrieval environment to allow quantitative and qualitative evaluation of the gain in retrieval quality achieved by using these services.
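To illustrate the idea behind one of these re-ranking methods: Bradfordizing is commonly described as reordering a result set so that documents from the journals contributing the most hits (the core journals) are ranked first. The following is a minimal sketch under that assumption, not the project's implementation. The function name `bradfordize`, the `journal` and `id` fields, and the sample data are hypothetical, and the sketch sorts directly by journal frequency instead of forming explicit Bradford zones.

```python
from collections import Counter


def bradfordize(results):
    """Re-rank documents so that those from the most frequent journals come first.

    `results` is a list of dicts with a hypothetical "journal" field.
    """
    # Count how many hits each journal contributes to this result set.
    counts = Counter(doc["journal"] for doc in results)
    # Sort by descending journal frequency; ties between journals are broken
    # by journal name. sorted() is stable, so documents from the same journal
    # keep their original relative order.
    return sorted(results, key=lambda doc: (-counts[doc["journal"]], doc["journal"]))


# Hypothetical result set: journal A has 3 hits, B has 2, C has 1.
sample = [
    {"id": 1, "journal": "A"},
    {"id": 2, "journal": "B"},
    {"id": 3, "journal": "A"},
    {"id": 4, "journal": "C"},
    {"id": 5, "journal": "A"},
    {"id": 6, "journal": "B"},
]

ranked = bradfordize(sample)
print([doc["id"] for doc in ranked])  # prints [1, 3, 5, 2, 6, 4]
```

Documents from the journal with the most hits move to the top, while the original order is preserved within each journal.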

In the follow-up project IRM II, the goal is to further develop the prototypes from the first project phase, which were evaluated positively. We aim to build a toolbox of reusable, combinable and scalable software components that can be integrated into a real-world retrieval environment. These components will form a service-oriented architecture (SOA) and will offer their services to distributed systems (see fig. 1) in an efficient and stable way, even under high load and with multiple users. The main scenario is to include our services in the search facilities of subject gateways, portals (e.g. Sowiport, MedPilot or EconBiz) and repositories (like SSOAR). The modular architecture allows easy integration into GESIS's own disciplinary portals as well as into those of external partners. The first step is to integrate the reimplemented services into Sowiport as a reference infrastructure; a later step is to roll out implementations in other scientific portals.

We hope to achieve a more flexible and powerful software solution and to enable other institutions to reuse our software and services. To support these goals, we will make all code developed in this project publicly available under an open-source license.

IRM II architecture

Fig. 1: Value-added services for Information Retrieval seen as a network of loosely coupled public and private Web Services in a service-oriented architecture.

The IRSA Framework

All services were integrated into the open-source framework IRSA, which is released on SourceForge.

 

IRSA is accessible through a web frontend that handles user registration and status mails and provides rudimentary management and access methods for the domain-specific TS. To use the framework, operators of a digital library (DL) have to take six steps: (1) register at the project's website, (2) create a new repository within the system, (3) supply either the OAI-PMH interface URL of the DL or the metadata itself as XML files in oai_dc format, (4) schedule the repository for processing, (5) wait until processing of the TS is finished (typically within hours), and (6) use the generated RESTful web service in their own project. This web service can be included in the DL with a few lines of code, depending on the programming language and frameworks used. An API key is used to ensure privacy for each user. Figure 2 illustrates the internal workflow; most steps happen without user interaction.

IRSA workflow

Fig. 2: Workflow of the IRSA system.
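Steps (3) and (6) of the workflow described above can be sketched as follows. This is only an illustration: the host names, the `/api/recommend` path, and the `q` and `api_key` parameter names are assumptions and are not taken from the IRSA documentation. Only the OAI-PMH parameters `verb=ListRecords` and `metadataPrefix=oai_dc` come from the OAI-PMH protocol itself. The sketch only builds request URLs and does not contact any server.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url):
    """Build an OAI-PMH ListRecords request for Dublin Core (oai_dc) metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def recommender_url(service_url, api_key, query):
    """Build a request URL for a generated web service.

    The parameter names "q" and "api_key" are hypothetical placeholders.
    """
    params = {"q": query, "api_key": api_key}
    return service_url + "?" + urlencode(params)


# Hypothetical endpoints, used for illustration only.
harvest_url = oai_list_records_url("https://dl.example.org/oai")
service_request = recommender_url(
    "https://irsa.example.org/api/recommend", "MY-KEY", "social inequality"
)

print(harvest_url)
print(service_request)
```

In a real integration, the resulting URL would be fetched with an HTTP client of the DL operator's choice, and the response format would depend on the actual service.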

Prototype

To start the interactive IRM prototype, follow the link below or click the image.

 

Start the IRM Prototype.

Project Duration

1 May 2011 to 30 June 2013

Funded by

Partners

Project Partners

Publications

Recent Publications