Master's Thesis

Balancing precision and recall with selective search

Balancing precision and recall is a long-standing problem in the field of Information Retrieval. We tackled this problem with Selective Search, which divides a large-scale document collection into small shards and passes the user query to only a few of those shards. In contrast, Exhaustive Search passes the user query through the whole collection. Selective Search has shown better efficiency and precision than Exhaustive Search because it returns fewer false-positive documents, but it suffers from much worse recall because it misses relevant documents. The optimization work on Selective Search can be categorized into two tasks: (i) shard ranking and (ii) cutoff estimation. On the shard ranking task, we developed three new ranking approaches that improved both precision and recall over previous work. On the cutoff estimation task, we tested three estimators that predict the number of shards that ought to be searched, and we explored the best parameter settings for precision-oriented and recall-oriented evaluation metrics.
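
The contrast between Selective Search and Exhaustive Search can be illustrated with a minimal Python sketch. It is not the thesis's method: the term-count scoring function, the sum-of-scores shard ranker, the fixed cutoff, and the three-shard toy collection are all assumptions made for illustration, standing in for the ranking approaches and cutoff estimators studied in the thesis.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Toy relevance score (assumption): count of query-term occurrences."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def rank_shards(query_terms, shards):
    """Hypothetical shard ranker: order shards by the summed toy score
    of their documents, breaking ties by shard id."""
    shard_scores = {
        shard_id: sum(score(query_terms, doc) for doc in docs.values())
        for shard_id, docs in shards.items()
    }
    return sorted(shard_scores, key=lambda s: (-shard_scores[s], s))


def selective_search(query_terms, shards, cutoff, top_n=10):
    """Search only the `cutoff` highest-ranked shards and merge the hits."""
    selected = rank_shards(query_terms, shards)[:cutoff]
    hits = []
    for shard_id in selected:
        for doc_id, doc in shards[shard_id].items():
            s = score(query_terms, doc)
            if s > 0:
                hits.append((doc_id, s))
    # Highest score first; ties broken by document id for determinism.
    hits.sort(key=lambda h: (-h[1], h[0]))
    return hits[:top_n]


def exhaustive_search(query_terms, shards, top_n=10):
    """Exhaustive Search is the special case where every shard is searched."""
    return selective_search(query_terms, shards, cutoff=len(shards), top_n=top_n)


# Toy collection (assumption): three shards of tokenized documents.
shards = {
    "s0": {"d1": ["shard", "ranking", "search"], "d2": ["search", "engine"]},
    "s1": {"d3": ["cooking", "recipes"], "d4": ["search", "recipes"]},
    "s2": {"d5": ["football", "scores"]},
}
query = ["search", "ranking"]

selective = selective_search(query, shards, cutoff=1)
exhaustive = exhaustive_search(query, shards)
print(selective)
print(exhaustive)
```

With a cutoff of one shard, the sketch scores only the documents in `s0`, while the exhaustive run also returns `d4` from `s1`. That missed document is the recall cost the abstract describes, and it is why the choice of shard ranker and cutoff matters.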
