Visual Concept Detection Using Web-Based Training Sources

Damian Borth

ICSI

Tuesday, April 8
12:30 p.m., Conference Room 5A

Semantic concept detection (or tagging) is considered a key building block of modern video search and an important stepping stone towards bridging the “semantic gap”, a key challenge for image and video retrieval systems.

One substantial difficulty in building such detectors lies in the nature of the underlying machine learning techniques, i.e., the demand for a huge labeled training set serving as ground truth for each concept to be learned. Current practice is to label this data manually, which is time-consuming, scales poorly, and cannot keep up with dynamic concept vocabularies that capture users' information needs (such as “Ukraine Riots” or “FIFA World Cup 2014”). In short, semantic concept detection suffers from a training data acquisition bottleneck.

To address this problem, other data sources can be used for classifier training, namely online video. Platforms like YouTube provide textually enriched videos at large scale that can serve as a source for autonomous concept detector training. Unfortunately, such training sources are only weakly labeled and contain a high fraction of non-relevant content, which has to be taken into account during training. Furthermore, owing to the nature of web video platforms, social signals such as viewer demographics can be incorporated as additional contextual cues to improve overall system performance, to discover trending topics for dynamic concept vocabularies, or to predict sentiment from visual content.
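As a rough illustration of the weak-labeling issue, the following minimal sketch (not the speaker's actual system) trains a concept detector from videos whose tags serve as noisy labels: an initial classifier is fit on all weakly labeled samples, low-confidence positives are discarded as likely non-relevant content, and the model is retrained on the filtered set. The feature representation, the logistic regression model, and the confidence-based filtering strategy are illustrative assumptions.

    # Minimal sketch: concept detector training from weakly labeled web video.
    # Assumptions: frame/shot features are precomputed vectors; a video tag for
    # the concept yields weak positive labels; non-relevant positives are removed
    # by confidence-based filtering (one common strategy, chosen for illustration).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def train_concept_detector(features, weak_labels, keep_fraction=0.7):
        X, y = np.asarray(features), np.asarray(weak_labels)
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X, y)                              # initial model on all weakly labeled data
        pos = np.flatnonzero(y == 1)               # samples from videos tagged with the concept
        scores = clf.predict_proba(X[pos])[:, 1]
        threshold = np.quantile(scores, 1.0 - keep_fraction)
        keep = np.ones(len(y), dtype=bool)
        keep[pos[scores < threshold]] = False      # drop likely non-relevant positives
        clf.fit(X[keep], y[keep])                  # retrain on the filtered training set
        return clf

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 16))
        y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # synthetic noisy weak labels
        detector = train_concept_detector(X, y)
        print("P(concept) for first sample:", detector.predict_proba(X[:1])[0, 1])

In practice, the filtering step could equally be driven by text relevance, temporal segmentation, or the social signals mentioned above; the sketch only conveys that non-relevant content must be handled explicitly when learning from web video.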

Bio:

Damian Borth received his PhD from the University of Kaiserslautern, Germany, where he was a member of the Multimedia Analysis and Data Mining Group at the German Research Center for Artificial Intelligence (DFKI). He works on visual learning from platforms such as YouTube and Flickr for application areas like automatic tagging and targeted advertising. During his PhD, Damian also spent time visiting Columbia University, collaborating with Shih-Fu Chang on visual sentiment analysis.