Computer Science Department
School of Computer Science, Carnegie Mellon University


A Robust Subspace Approach to Extracting
Layers from Image Sequences

Qifa Ke

August 2003

Ph.D. Thesis

Keywords: Layer extraction, layered representation, subspace, clustering, robust, video segmentation, video analysis, ego-motion

A layer is a 2D sub-image inside which pixels share the common apparent motion of some 3D scene plane. Representing videos with such layers has many important applications, such as video compression, 3D scene and motion analysis, object detection and tracking, and vehicle navigation. Extracting layers from videos involves solving three subproblems: 1) segmenting the image into sub-regions (layers); 2) estimating the 2D motion of each layer; and 3) determining the number of layers. These three subproblems are highly intertwined, making the layer extraction problem very challenging. Existing approaches to layer extraction are limited by 1) the need for a good initial segmentation, 2) strong assumptions about the scene, 3) an inability to fully and simultaneously utilize the spatial and temporal constraints in video, and 4) unstable clustering in high-dimensional space. This thesis presents a subspace approach to layer extraction that does not have these limitations. We first show that the homographies induced by the planar patches in the scene form a linear subspace whose dimension is as low as two or three in many applications. We then formulate the layer extraction problem as clustering in this low-dimensional subspace. Each layer in the input images forms a well-defined cluster in the subspace, and a simple mean-shift-based clustering algorithm can reliably identify the clusters, and thus the layers. A proof is presented to show that the subspace approach is guaranteed to significantly increase layer discriminability, due to its ability to simultaneously utilize spatial and temporal constraints in the video. We present the detailed robust algorithm for layer extraction using the subspace, as well as experimental results on a variety of real image sequences.
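The central idea of the abstract, projecting per-patch motion measurements into a low-dimensional subspace and then clustering them with mean shift, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the data are synthetic, the projection is a plain (non-robust) SVD, the mean shift uses a flat kernel, and every name and parameter here (`project_to_subspace`, `mean_shift`, `bandwidth`, the 16-dimensional ambient space) is an assumption made for illustration only.

```python
import numpy as np

def project_to_subspace(W, d):
    """Project the rows of W (one measurement vector per patch) onto the
    top-d principal directions found by a plain SVD (illustrative only)."""
    Wc = W - W.mean(axis=0)                      # center the measurements
    _, _, Vt = np.linalg.svd(Wc, full_matrices=False)
    return Wc @ Vt[:d].T                         # (n_patches, d) coordinates

def mean_shift(X, bandwidth, n_iter=50, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of the data
    points within `bandwidth` of it, then merge nearby modes into clusters."""
    modes = X.copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(modes[:, None, :] - X[None, :, :], axis=2)
        within = (dist <= bandwidth).astype(float)
        counts = within.sum(axis=1, keepdims=True)
        means = (within @ X) / np.maximum(counts, 1.0)
        new_modes = np.where(counts > 0, means, modes)  # keep mode if window is empty
        shift = np.linalg.norm(new_modes - modes, axis=1).max()
        modes = new_modes
        if shift < tol:
            break
    # Merge modes that lie within bandwidth / 2 of an existing center.
    labels = np.empty(len(X), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# Synthetic stand-in for per-patch motion measurements (hypothetical data):
# three "layers", each a tight cluster in a 2D subspace of a 16D space.
rng = np.random.default_rng(0)
n_layers, per_layer, ambient, d = 3, 40, 16, 2
basis = np.linalg.qr(rng.standard_normal((ambient, d)))[0].T   # orthonormal rows
layer_coords = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
truth = np.repeat(np.arange(n_layers), per_layer)
latent = layer_coords[truth] + 0.05 * rng.standard_normal((len(truth), d))
W = latent @ basis + 0.01 * rng.standard_normal((len(truth), ambient))

X = project_to_subspace(W, d)
labels, centers = mean_shift(X, bandwidth=1.0)
print("clusters found:", len(centers))
```

In this toy setting the number of clusters is not supplied in advance; it falls out of the mean shift step, which mirrors the abstract's point that the number of layers is one of the unknowns. The bandwidth still has to be chosen, and a robust subspace estimate would be needed for real, outlier-contaminated measurements.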

171 pages
