Master's Thesis

Automatic question answering system for factoid and non-factoid open-domain questions

We present an end-to-end system for open-domain question answering that consists of three components. (1) The query formulation module transforms the verbose, often ungrammatical and noisy question into a Boolean query of a few keywords. The generated query is then submitted to a commercial search engine to retrieve matching documents from the Web. (2) The candidate answer generation module extracts potential answers from the retrieved documents. (3) The answer selection module identifies the best answer based on various criteria. The originality of the proposed system lies in an efficient combination of machine learning methods and in refining the quality of the datasets. A thorough empirical evaluation on multiple datasets demonstrates that the approach is highly competitive and exceeds state-of-the-art results.
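The data flow between the three components can be illustrated with a minimal sketch. This is not the thesis implementation: the learned models are replaced here by simple stand-ins (stopword filtering for query formulation, sentence splitting for candidate generation, keyword overlap for answer selection), and the search engine is passed in as a caller-supplied function. All names (`formulate_query`, `generate_candidates`, `select_answer`, `answer`, `STOPWORDS`) are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a tiny hand-picked stopword list stands in for a learned keyword selector.
STOPWORDS = {"a", "an", "and", "the", "of", "is", "are", "was", "what", "who",
             "where", "when", "which", "how", "so", "hey", "in", "on", "to"}


def formulate_query(question, max_keywords=5):
    """(1) Reduce a noisy question to a few keywords (order-preserving, no duplicates)."""
    keywords = []
    for token in re.findall(r"[a-z0-9]+", question.lower()):
        if token not in STOPWORDS and token not in keywords:
            keywords.append(token)
    return keywords[:max_keywords]


def generate_candidates(documents):
    """(2) Split retrieved documents into sentences, each treated as a candidate answer."""
    candidates = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            sentence = sentence.strip()
            if sentence:
                candidates.append(sentence)
    return candidates


def select_answer(keywords, candidates):
    """(3) Pick the candidate sharing the most keywords with the query."""
    if not candidates:
        return None
    wanted = set(keywords)

    def overlap(candidate):
        return len(wanted & set(re.findall(r"[a-z0-9]+", candidate.lower())))

    return max(candidates, key=overlap)


def answer(question, search):
    """Run the pipeline; `search` is any function mapping a Boolean query string to documents."""
    keywords = formulate_query(question)
    boolean_query = " AND ".join(keywords)
    documents = search(boolean_query)
    return select_answer(keywords, generate_candidates(documents))
```

In this sketch `search` would wrap whatever retrieval backend is available; for a quick trial it can simply return a fixed list of strings.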

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.