Merging Rank Lists from Multiple Sources in Video Classification

Wei-Hao Lin

Fall 2003

 

This page is the progress report for the lab part of the course, Advanced Information Retrieval Seminar and Lab.


Proposal

I propose to investigate the problem of news video retrieval from multiple sources. When a news video collection consists of programs from multiple channels (e.g., CNN, ABC, NBC), a video retrieval system has to decide whether to treat everything as one combined collection or to build channel-specific retrieval engines. If the channel-specific approach is chosen, the retrieval system must further solve the problem of combining rank lists from different channels.

The problem of retrieval from multiple sources is not unique to news video retrieval and has been explored in other domains. In Cross-Lingual Information Retrieval (CLIR), the search collection consists of documents in multiple languages, and a CLIR system is confronted with the problem of combining rank lists from each monolingual collection. However, these combining strategies are usually designed ad hoc. Research in Distributed Information Retrieval (DIR) focuses directly on the problem of searching across multiple resources, and many techniques for combining rank lists have been developed. However, it is still not clear how text-based combining strategies can be generalized to the multimedia domain.

In the lab project, I will evaluate the performance of treating multiple resources as one large collection, and measure how well channel-specific engines plus rank list combining strategies perform in comparison. Combining strategies borrowed from CLIR and DIR, as well as other techniques such as score distribution fitting, will be implemented and tested against the TREC 2003 video track training data, which consists of news video from three channels and is already labeled at the shot level.
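
To make the score-based combining idea concrete, here is a minimal sketch of merging per-channel rank lists by raw score and by min-max normalized score. It is an illustration under stated assumptions, not the implementation used in this project: the function names, the `(shot_id, score)` list layout, and the choice of min-max normalization are all hypothetical, and each shot is assumed to belong to exactly one channel.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: one rank list per channel, as (shot_id, score) pairs.
RankList = List[Tuple[str, float]]


def min_max_normalize(rank_list: RankList) -> RankList:
    """Rescale the scores of a single rank list to the range [0, 1]."""
    if not rank_list:
        return []
    scores = [score for _, score in rank_list]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal; map them all to 1.0 to avoid dividing by zero.
        return [(shot, 1.0) for shot, _ in rank_list]
    return [(shot, (score - lo) / (hi - lo)) for shot, score in rank_list]


def merge_by_score(lists: Dict[str, RankList], normalize: bool = True) -> RankList:
    """Merge per-channel rank lists into one list sorted by descending score.

    With normalize=False the raw scores are compared directly, which only
    makes sense if every channel engine scores on a comparable scale.
    """
    merged: RankList = []
    for rank_list in lists.values():
        merged.extend(min_max_normalize(rank_list) if normalize else rank_list)
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


# Made-up scores: the CNN engine uses a much larger scale than the ABC engine.
example = {
    "CNN": [("cnn_1", 10.0), ("cnn_2", 5.0), ("cnn_3", 0.0)],
    "ABC": [("abc_1", 0.9), ("abc_2", 0.7), ("abc_3", 0.1)],
}
raw_order = [shot for shot, _ in merge_by_score(example, normalize=False)]
norm_order = [shot for shot, _ in merge_by_score(example, normalize=True)]
```

With these made-up scores, direct merging places both top CNN shots ahead of every ABC shot simply because the CNN scores are larger, while normalized merging lets the second ABC shot overtake the second CNN shot. That scale mismatch is the reason normalization is worth comparing against direct score merging.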

Timeline

Week Plan
Aug. 31 ~ Sep. 6 Proposal due (Sep. 4).  Start literature review.
Sep. 7 ~ Sep. 13 Continue literature review. Work on lab webpage.
Sep. 14 ~ Sep. 20 Lab webpage due (Sep. 16). Define experiment protocols and set up experiment environment.
Sep. 21 ~ Sep. 27 Initial project presentation (Sep. 23). Implement direct score and normalized score combining functions.
Sep. 28 ~ Oct. 4 Complete first run and analyze preliminary results.
Oct. 5 ~ Oct. 11 Out of town (attending Web Intelligence Conference)
Oct. 12 ~ Oct. 18 Out of town (attending Web Intelligence Conference)
Oct. 19 ~ Oct. 25 Midterm project presentation (Oct. 23). 
Oct. 26 ~ Nov. 1 Implement rank-based methods, including round-robin, Borda count, etc.
Nov. 2 ~ Nov. 8 Implement methods using sampled data repository to estimate global scores.
Nov. 9 ~ Nov. 15 Conduct experiments and compare different methods.  Analyze all results.
Nov. 16 ~ Nov. 22 Out of town (attending Video TREC workshop)
Nov. 23 ~ Nov. 29 Wrap up experiments and write final report.
Nov. 30 ~ Dec. 6 Final project presentation (Dec. 2)
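
As a rough illustration of the rank-based methods named in the timeline, the sketch below shows round-robin interleaving and a simplified Borda count. Both functions and their names are hypothetical and are not taken from the project code. The Borda variant here gives a shot `c - position` points in each list that ranks it (where `c` is the total number of distinct shots) and zero points in lists that omit it, which is a simplification of the usual treatment of unranked candidates.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List


def round_robin(lists: List[List[str]]) -> List[str]:
    """Interleave rank lists: take rank 1 from each list, then rank 2, and so on."""
    merged: List[str] = []
    seen = set()
    for tier in zip_longest(*lists):  # shorter lists are padded with None
        for shot in tier:
            if shot is not None and shot not in seen:
                seen.add(shot)
                merged.append(shot)
    return merged


def borda_count(lists: List[List[str]]) -> List[str]:
    """Simplified Borda count over possibly overlapping rank lists.

    Assumption: a shot missing from a list earns no points from that list.
    Ties are broken alphabetically by shot id to keep the output deterministic.
    """
    candidates = {shot for rank_list in lists for shot in rank_list}
    c = len(candidates)
    points: Dict[str, int] = defaultdict(int)
    for rank_list in lists:
        for position, shot in enumerate(rank_list):
            points[shot] += c - position
    return sorted(points, key=lambda shot: (-points[shot], shot))


# Made-up shot ids for illustration only.
interleaved = round_robin([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]])
voted = borda_count([["x", "y", "z"], ["x", "z", "w"]])
```

Round-robin ignores scores entirely, so it suits disjoint per-channel lists whose scores are not comparable. Borda count is only informative when the same shots appear in more than one list, for example when several engines rank the same collection; on disjoint lists it reduces to sorting by rank position.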

Slides