Multi-armed bandits for adjudicating documents in pooling-based evaluation of information retrieval systems

Citation data:

Information Processing & Management, ISSN: 0306-4573, Vol: 53, Issue: 5, Page: 1005-1025

Publication Year:
2017
Usage 289
Abstract Views 282
Link-outs 7
Captures 12
Exports-Saves 12
Social Media 30
Shares, Likes & Comments 17
Tweets 13
Citations 1
Citation Indexes 1
DOI:
10.1016/j.ipm.2017.04.005
Author(s):
David E. Losada, Javier Parapar, Alvaro Barreiro
Publisher(s):
Elsevier BV
Tags:
Computer Science, Engineering, Decision Sciences, Social Sciences
Article description:
Evaluating Information Retrieval systems is crucial to making progress in search technologies. Evaluation is often based on assembling reference collections consisting of documents, queries and relevance judgments made by humans. In large-scale environments, exhaustively judging relevance becomes infeasible. Instead, only a pool of documents is judged for relevance. By selectively choosing documents from the pool we can reduce the number of judgments required to identify a given number of relevant documents. We argue that this iterative selection process can be naturally modeled as a reinforcement learning problem and propose innovative and formal adjudication methods based on multi-armed bandits. Casting document judging as a multi-armed bandit problem is not only theoretically appealing, but also leads to highly effective adjudication methods. Under this bandit allocation framework, we consider stationary and non-stationary models and propose seven new document adjudication methods (five stationary methods and two non-stationary variants). Our paper also reports a series of experiments performed to thoroughly compare our new methods against current adjudication methods. This comparative study includes existing methods designed for pooling-based evaluation and existing methods designed for metasearch. Our experiments show that our theoretically grounded adjudication methods can substantially reduce the assessment effort.
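
To make the bandit framing in the description concrete, the following is a minimal sketch, not the authors' implementation. It assumes each pooled run is one arm, that playing an arm means judging that run's highest-ranked document not yet judged, and that the reward is the binary relevance judgment. The policy shown is Beta-Bernoulli Thompson sampling, chosen here only as a familiar bandit policy; the description does not say which policies the seven proposed methods use. The names `adjudicate`, `runs`, `qrels`, `budget` and `discount` are hypothetical, and `qrels` stands in for the human assessor. Setting `discount` below 1 down-weights older judgments, which is one simple way to approximate a non-stationary model.

```python
import random


def adjudicate(runs, qrels, budget, discount=1.0, seed=0):
    """Pick documents to judge with Thompson sampling (illustrative sketch).

    runs:     dict mapping run name -> ranked list of document ids (the arms)
    qrels:    dict mapping document id -> bool, simulating the human assessor
    budget:   maximum number of judgments to spend
    discount: 1.0 keeps all evidence (stationary); values below 1.0 fade
              old evidence (an assumed, simple non-stationary variant)
    Returns a dict of judged document id -> relevance, in judging order.
    """
    rng = random.Random(seed)
    successes = {name: 0.0 for name in runs}  # relevant docs found per arm
    failures = {name: 0.0 for name in runs}   # non-relevant docs per arm
    position = {name: 0 for name in runs}     # next rank to look at per arm
    judged = {}

    while len(judged) < budget:
        # Advance each arm past documents already judged via another run.
        active = []
        for name, ranking in runs.items():
            while position[name] < len(ranking) and ranking[position[name]] in judged:
                position[name] += 1
            if position[name] < len(ranking):
                active.append(name)
        if not active:
            break  # the pool is exhausted

        # Sample a plausible relevance rate per arm from Beta(1 + s, 1 + f)
        # and play the arm with the highest sample.
        chosen = max(
            active,
            key=lambda n: rng.betavariate(1.0 + successes[n], 1.0 + failures[n]),
        )
        doc = runs[chosen][position[chosen]]
        relevant = bool(qrels.get(doc, False))  # unjudged docs count as non-relevant
        judged[doc] = relevant

        # Fade the chosen arm's old evidence, then record the new judgment.
        successes[chosen] = discount * successes[chosen] + (1.0 if relevant else 0.0)
        failures[chosen] = discount * failures[chosen] + (0.0 if relevant else 1.0)

    return judged


if __name__ == "__main__":
    # Toy pool: run "A" ranks relevant documents first, run "B" does not.
    toy_runs = {"A": ["d1", "d2", "d3", "d7"], "B": ["d4", "d5", "d1", "d6"]}
    toy_qrels = {"d1": True, "d2": True, "d3": True}
    result = adjudicate(toy_runs, toy_qrels, budget=4)
    print(result)
```

In this sketch only the chosen arm is updated after each judgment, even when the judged document also appears in other runs; crediting every run that retrieved the document is an equally plausible design and may behave differently.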

This article has 0 Wikipedia mentions.