Honours projects 2008
Projects supervised by Irena Koprinska
Automatically Extracting Subtitles from Videos
Co-supervised with Rafael Calvo from EIE and Daniel Lloyd-Jones from Visionbytes
Producing searchable video typically requires textual data to be generated and indexed in a database for retrieval. In Australia many programs use closed captions, which are sent as a separate stream alongside the video and can easily be extracted. However, in Asia and elsewhere most TV stations use "open captions", commonly referred to as subtitles. This text is embedded within the video stream itself and hence cannot be easily extracted. This project will investigate ways to identify and extract these captions and convert them back into text.
A possible approach is to identify where the captions are using color and location information (captions tend to appear in the same area of the screen and in particular colors), extract the caption area into an image, remove the background from the image, compare successive images to detect where one caption ends and the next begins, and use OCR to convert each caption image into characters.
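The step of comparing successive caption images could be sketched as follows. This is only an illustration under simplifying assumptions, not the project's method: each caption region is assumed to be already cropped and given as rows of grey levels (0-255), caption text is assumed to be brighter than the background, and the names `binarize`, `difference`, `caption_segments` and both thresholds are hypothetical.

```python
from typing import List, Sequence, Tuple

# A cropped caption region: rows of grey levels in the range 0-255 (assumption).
Region = Sequence[Sequence[int]]


def binarize(region: Region, threshold: int = 200) -> List[List[int]]:
    """Mark bright pixels (assumed to be caption text) as 1, the rest as 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in region]


def difference(a: List[List[int]], b: List[List[int]]) -> float:
    """Fraction of pixels that differ between two equally sized binary masks."""
    total = sum(len(row) for row in a)
    changed = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return changed / total if total else 0.0


def caption_segments(regions: Sequence[Region],
                     max_diff: float = 0.1) -> List[Tuple[int, int]]:
    """Group consecutive frames whose caption masks are nearly identical.

    Returns (first_frame, last_frame) index pairs. Each pair covers one
    caption, so OCR would only need to run once per pair.
    """
    segments: List[Tuple[int, int]] = []
    start = 0
    prev = None
    for i, region in enumerate(regions):
        mask = binarize(region)
        # A large change from the previous frame is treated as a new caption.
        if prev is not None and difference(prev, mask) > max_diff:
            segments.append((start, i - 1))
            start = i
        prev = mask
    if prev is not None:
        segments.append((start, len(regions) - 1))
    return segments
```

For example, five frames in which the caption changes after the second frame would give the segments (0, 1) and (2, 4), so OCR would be applied to two images rather than five. Real footage would need a more robust background removal step than a fixed brightness threshold.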
Sample video content will be provided by Visionbytes, together with an OCR engine if needed. Visionbytes are leaders in searching video content. Their clients include TV channels such as ABC, SBS and SkyTV, portals such as NineMSN and Yahoo!7 and government organisations such as the NSW Parliament.
Story Segmentation, Topic Detection and Tracking of News Videos
Co-supervised with Rafael Calvo from EIE and Daniel Lloyd-Jones from Visionbytes
News videos are an important information source. They contain valuable information for documentary-making, business analysis and tracking topics over time, and can also be used as a reference tool for studying political and historical events. However, efficient searching and browsing of video content is still an open research problem. This project aims to investigate methods for 1) automatic segmentation of a news program into individual stories and 2) topic detection and tracking. Both tasks will be based on two information sources: the visual stream and the text (closed caption) stream. Closed captions are human-generated transcriptions of what is spoken and are available for all news programs in Australia.
A possible approach for the first task is to use supervised machine learning to “learn” the story boundaries from frames that have previously been labelled as story boundary or not. This can be used in conjunction with other cues, e.g. new stories typically start with an anchor person. The second task involves detecting whether a story is on a new or an existing topic (e.g. using clustering), and may also involve providing a suitable visualization of the story’s evolution (e.g. number of stories in the topic and related stories). Sample video content will be provided by Visionbytes.
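As an illustration of the clustering idea for the second task, the sketch below assigns each story to an existing topic or starts a new one, using single-pass clustering over bag-of-words vectors built from the closed caption text. It is one possible starting point rather than a prescribed method: the cosine similarity measure, the threshold of 0.3 and the names `bag_of_words`, `cosine` and `assign_topics` are all assumptions made for this example.

```python
import math
import re
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Count lower-cased alphabetic tokens in a story's caption text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def assign_topics(stories: List[str], threshold: float = 0.3) -> List[int]:
    """Single-pass clustering: return a topic id for each story, in order.

    A story joins the most similar existing topic if the similarity reaches
    the threshold; otherwise it starts a new topic.
    """
    centroids: List[Counter] = []  # summed word counts per topic
    labels: List[int] = []
    for text in stories:
        vec = bag_of_words(text)
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)  # existing topic: fold the story in
            labels.append(best)
        else:
            centroids.append(Counter(vec))  # new topic
            labels.append(len(centroids) - 1)
    return labels
```

Counting the occurrences of each topic id in the returned list gives the number of stories per topic, which is one of the quantities a visualization could display. A practical system would probably also need stop-word removal and term weighting, which are omitted here for brevity.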
Machine Learning for Painting with a Wii
The Wii remote is used as a pointing and motion controller in the recently developed and very popular Wii consoles for playing games. We have developed a prototype of a computer program which uses the Wii remote for painting. The goal of this project is to explore the use of machine learning algorithms (e.g. supervised learning from examples) to improve the pointing and motion capabilities of the Wii remote for painting on a computer screen, and to evaluate the usability of the resulting system.
More projects will be added.