LIA Seminar: Thursday 19/01, 11:30 to 12:00
Speaker: Eric SanJuan
Title: Automatic evaluation of In(formative)|(teresting)ness
Document relevance is widely used to evaluate the effectiveness of Information Retrieval (IR) systems. Content informativeness, by contrast, has mainly been used in interactive information retrieval and in automatic summarization evaluation. The two types of measure differ in that the first considers document relevance, while the second uses occurrences of relevant words or n-grams as evidence of effectiveness. For various reference evaluation collections and tasks, we compare the system rankings obtained with each measure, including a comparison with the official task system ranking. We first show that informativeness based on n-grams is correlated with informativeness based on various types of key-phrases (DBpedia entries, multi-terms).
We then observe that informativeness and document relevance are correlated on the TREC5-8, Web2000-1, Robust and Terabyte2004-6 tracks as far as system ranking is concerned. Our general conclusion is that, whenever document readability is assumed, IR system evaluation based on informativeness is robust and can be extended to interestingness.
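
As an illustration of what comparing two system rankings can look like, here is a minimal sketch in Python. The abstract does not say which correlation statistic is used; Kendall's tau (tau-a, ignoring ties) is assumed here only because it is a common choice for comparing rankings. The function name `kendall_tau`, the system names and all scores are hypothetical and are not results from the talk.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} dicts.

    Only systems present in both dicts are compared. Tied pairs count
    as neither concordant nor discordant.
    """
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two shared systems")

    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Sign of the product tells whether both measures order the pair alike.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four systems under each type of measure.
relevance = {"sysA": 0.31, "sysB": 0.27, "sysC": 0.22, "sysD": 0.18}
informativeness = {"sysA": 0.64, "sysB": 0.66, "sysC": 0.51, "sysD": 0.40}

tau = kendall_tau(relevance, informativeness)
print(f"Kendall tau between the two rankings: {tau:.3f}")
```

A value close to 1 means the two measures rank the systems almost identically; a value close to 0 means the rankings are unrelated.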