Seminar by Nikhil Rasiwasia
Cross-modal Retrieval: Retrieval across different content modalities
Nikhil Rasiwasia
Yahoo Labs, Bangalore
Date: Friday, August 23rd, 2013
Time: 2:30 PM
Venue: CS101
Abstract:
Multimedia data such as images, web pages, videos, and music are now available in abundance. This increasing availability demands novel representations that tackle the unique challenges posed by multimedia content. The primary challenge is the heterogeneous nature of the content, that is, data with multiple information modalities: web pages that contain both images and text, videos that contain both images and audio, songs with associated lyrics, etc. In almost all these situations, different representations are adopted for different modalities, making it nearly impossible to operate across them using traditional retrieval approaches. In this talk, the problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, e.g., using an image to search for texts. Two hypotheses are investigated. The first is that low-level cross-modal correlations should be accounted for. The second is that the joint space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are derived from these hypotheses: correlation matching (CM), an unsupervised method that models cross-modal correlations; semantic matching (SM), a supervised technique that relies on semantic representation; and semantic correlation matching (SCM), which combines both. It is concluded that both hypotheses hold, in a complementary form, although the evidence in favor of the abstraction hypothesis is stronger than that for the correlation hypothesis.
About the speaker:
Nikhil Rasiwasia received the B.Tech degree in electrical engineering from the Indian Institute of Technology Kanpur (India) in 2005. He received the MS and PhD degrees from the University of California, San Diego in 2007 and 2011, respectively, where he was a graduate student researcher at the Statistical Visual Computing Laboratory in the ECE department. Currently, he is working as a scientist at Yahoo Labs, Bangalore, India. In 2008, he was recognized as an 'Emerging Leader in Multimedia' by IBM T. J. Watson Research. He also received the best student paper award at the ACM Multimedia conference in 2010. His research interests are in the areas of computer vision and machine learning, in particular applying machine learning solutions to computer vision problems.