This research is in the area of computer vision: making computers that can understand what is happening in photographs and video. Most learning-based approaches, however, require annotated data, which can be expensive to acquire. This project seeks to develop automated tools that allow temporal visual content, such as a human gesturing, using sign language, or interacting with objects or other humans, to be learnt from standard TV broadcast signals using the high-level annotation provided by subtitles or scripts. This requires the development of models of the visual appearance and dynamics of actions, and learning methods that can train such models using the weak supervision provided by the text. As such, there are two main domains to this work: Sign Language Recognition and the more general understanding of actions and behaviour in broadcast footage. More details on some of these elements are given below.
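To make the weak-supervision idea concrete, the following is a minimal, illustrative sketch (not the project's actual method or features, and all names are hypothetical): each subtitle that mentions a target word only tells us the corresponding sign or action occurs somewhere within that subtitle's time window, so each window is treated as a bag of candidate clips, and training alternates between fitting a classifier and re-selecting the best-scoring clip in each bag, in the spirit of multiple-instance learning.

```python
# Hypothetical sketch of learning from subtitle-level weak supervision.
# A subtitle mentioning the target word gives a "bag" of candidate clips;
# training alternates between classifier fitting and per-bag clip selection.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_weakly_supervised(positive_bags, negative_clips, n_iters=5):
    """positive_bags: list of arrays (n_clips_i, n_features), one bag per
    subtitle containing the target word; negative_clips: array of clip
    features from subtitles that never mention the word."""
    # Initialise by treating every clip in every positive bag as positive.
    pos = np.vstack(positive_bags)
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_iters):
        X = np.vstack([pos, negative_clips])
        y = np.concatenate([np.ones(len(pos)), np.zeros(len(negative_clips))])
        clf.fit(X, y)
        # Re-select the single highest-scoring clip from each positive bag.
        pos = np.vstack([bag[np.argmax(clf.predict_proba(bag)[:, 1])][None, :]
                         for bag in positive_bags])
    return clf
```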
Sign Language
  Upper Body Pose Estimation and Tracking
  Learning Sign Language by Watching TV
  Sign Recognition using Sequential Pattern Trees
  Additional pose data and CNN body pose software

Action Recognition
  Recognising actions in 2D
  Recognising actions in 3D

Tracking and Character Identification
  2D detection and tracking
  Tracking 3D objects from 2D footage
  Tracking Hands in 3D
  Identifying Characters in Footage