Abstract of publication number 17964

This paper presents the participation of the IRIT laboratory (University of Toulouse) in the Real-Time Summarization track of TREC 2016. This track consists in filtering the tweet stream in real time and identifying tweets that are both relevant and novel, to be pushed to the user as they arrive. Our team proposes three different approaches: (1) The first approach consists of a filtering model that combines several summarization constraints. (2) The second approach, for scenario A, is composed of three filters adjusted sequentially, in which we use a word-similarity-based function to evaluate the relevance of an incoming tweet. The generation of a batch of up to 100 ranked tweets is formulated as an optimization problem. (3) The third approach consists of a step-by-step stream selection method that focuses on speed and takes into account tweet similarity as well as several features, including content, entities and user-related aspects. In this paper we describe the three proposed approaches and discuss the official results obtained for each of them.
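
To make the idea of sequential relevance and novelty filtering concrete, the following is a minimal illustrative sketch, not the system described in the paper. It assumes Jaccard word overlap as a stand-in for the word-similarity function, and the names `tokens`, `jaccard`, `filter_stream` and the thresholds `rel_min` and `nov_max` are hypothetical choices made for this example only.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of two word sets (0.0 if either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_stream(stream, query, rel_min=0.2, nov_max=0.6):
    """Yield tweets that pass a relevance filter, then a novelty filter.

    rel_min and nov_max are assumed thresholds, not values from the paper.
    """
    q = tokens(query)
    pushed = []  # word sets of tweets already pushed to the user
    for tweet in stream:
        t = tokens(tweet)
        # Filter 1 (relevance): drop tweets too dissimilar to the query.
        if jaccard(t, q) < rel_min:
            continue
        # Filter 2 (novelty): drop tweets too similar to one already pushed.
        if any(jaccard(t, p) > nov_max for p in pushed):
            continue
        pushed.append(t)
        yield tweet


stream = [
    "Earthquake hits Tokyo today",
    "earthquake hits tokyo today!",
    "I love pizza",
    "Tokyo earthquake damage report",
]
selected = list(filter_stream(stream, "tokyo earthquake"))
print(selected)
```

In this toy run the second tweet is dropped as a near-duplicate of the first and the third as irrelevant to the query; a real system would replace the overlap measure and thresholds with the tuned components described in the paper.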