LIA Seminar: Thursday 19/01, 11:30 to 12:00

Speaker: Eric SanJuan
Title: Automatic evaluation of In(formative)|(teresting)ness

Abstract:
Document relevance is widely used to evaluate the effectiveness of Information Retrieval (IR) systems. Content informativeness, on the other hand, has mainly been used in interactive information retrieval and in automatic summarization evaluation. The two types of measure differ in that the first considers document relevance, while the second uses occurrences of relevant words or n-grams as a clue to effectiveness. For various reference evaluation collections and tasks, we compare the system rankings obtained, including a comparison with the official task system ranking. We first show that informativeness based on n-grams is correlated with informativeness based on various types of key-phrases (DBpedia entries, multi-terms).
We then observe that informativeness and document relevance are correlated on the TREC5-8, Web2000-1, Robust and Terabyte2004-6 tracks as far as system ranking is concerned. Our general conclusion is that whenever document readability is assumed, IR system evaluation based on informativeness is robust and can be extended to interestingness.
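
To illustrate what comparing two system rankings can look like in practice, here is a minimal sketch that computes a rank correlation between a relevance-based score and an informativeness-based score for the same systems. The abstract does not say which correlation statistic is used; Kendall's tau (tau-a, without tie correction) is an assumption made here for illustration. The system names and scores are hypothetical, not results from the talk.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} dicts.

    Assumption: this statistic is chosen for illustration only; the
    abstract does not specify how rankings are compared.
    """
    # Only systems scored by both measures can be compared.
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two shared systems")

    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # A pair is concordant when both measures order it the same way.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Ties (product == 0) count as neither in tau-a.

    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-system scores, invented for this example.
relevance = {"sysA": 0.31, "sysB": 0.27, "sysC": 0.22, "sysD": 0.15}
informativeness = {"sysA": 0.58, "sysB": 0.61, "sysC": 0.40, "sysD": 0.35}

tau = kendall_tau(relevance, informativeness)
print(f"Kendall tau between the two rankings: {tau:.3f}")
```

A value close to 1 means the two measures rank systems in nearly the same order; a value near 0 means the rankings are largely unrelated.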
