SRI's VSAM Project: Extra Sets of Eyes
Kurt Konolige Bob Bolles David Beymer Chris Eveland
Eveland, C., K. Konolige, and R. C. Bolles. Background modeling for segmentation of video-rate stereo sequences.
Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA (June 1998).
Abstract. Stereo sequences promise to be a powerful method for segmenting images for applications such as tracking human figures. We present a method of statistical background modeling for stereo sequences that improves the reliability and sensitivity of segmentation in the presence of object clutter. The dynamic version of the method, called gated background adaptation, can reliably learn background statistics in the presence of corrupting foreground motion. The method has been used with a simple head discriminator to detect and track people using a stereo head mounted on a pan/tilt platform. It runs at video rates using standard PC hardware.
PDF version [180KB]
Beymer, D., and K. Konolige. Real-Time Tracking of Multiple People Using Continuous Detection.
Submitted to ICCV 99.
Abstract. Recent investigations have shown the advantages of keeping multiple hypotheses during visual tracking. In this paper we explore an alternative method that keeps just a single hypothesis per tracked object for computational efficiency, but displays robust performance and recovery from error by employing continuous detection during tracking. The method is implemented in the domain of people-tracking, using a novel combination of stereo information for continuous detection and intensity image correlation for tracking. Real-time stereo provides extended information for 3D detection and tracking, even in the presence of crowded scenes, obscuring objects, and large scale changes. We are able to reliably detect and track people in natural environments, on an implemented system that runs at more than 10 Hz on standard PC hardware.
PDF version [329KB]
PS version [2.6MB]
SRI's VSAM effort is focused on multimodal person tracking using stereo and image intensity data. The project uses SRI's Small Vision System to provide real-time stereo disparities. People are initially detected by finding person-shaped blobs in the stereo disparity maps. Once detected, they are tracked using a combination of stereo and intensity templates.
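To give a rough feel for the detection step, here is a minimal sketch of blob finding in a disparity map. It is not the Small Vision System code or the detector used in the project. It assumes a person appears as a connected region of similar disparity that is taller than it is wide. The function name `find_person_blobs` and the parameters `d_min`, `d_max`, `min_area`, and `min_aspect` are hypothetical choices made for illustration.

```python
from collections import deque


def find_person_blobs(disparity, d_min, d_max, min_area=6, min_aspect=1.5):
    """Return bounding boxes (top, left, bottom, right) of tall blobs.

    disparity is a list of rows of numbers. Pixels with
    d_min <= value <= d_max count as lying in the depth band of interest.
    The area and aspect-ratio thresholds are illustrative assumptions.
    """
    rows = len(disparity)
    cols = len(disparity[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []

    def in_band(y, x):
        return d_min <= disparity[y][x] <= d_max

    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not in_band(r, c):
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and in_band(ny, nx)):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            height = bottom - top + 1
            width = right - left + 1
            # Keep only blobs that are large enough and taller than wide.
            if area >= min_area and height / width >= min_aspect:
                boxes.append((top, left, bottom, right))
    return boxes


# Toy 8x8 disparity map: one tall blob (kept) and one wide blob (rejected).
demo = [[0] * 8 for _ in range(8)]
for y in range(1, 5):
    for x in range(1, 3):
        demo[y][x] = 10
for y in range(6, 8):
    for x in range(4, 7):
        demo[y][x] = 10

print(find_person_blobs(demo, 8, 12))
```

In this toy map the tall 4x2 region passes the shape test and the wide 2x3 region does not. A real detector would also need to handle disparity noise, missing stereo matches, and the change in apparent person size with distance.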
Single person running
This is a PowerPoint presentation from an April 1999 VSAM talk, converted to HTML. It is a good introduction for those interested in the technical details of our tracking approach. More tracking videos are available as hyperlinks from the slides.
This three-minute video explains our VSAM tracker, showing results from the stereo system and the person detection module as well as the final tracking results, including 3D "map" view results. An accompanying voice track explains the system.
MPEG video [320x240, 33MB]
Quicktime video [320x240, 46MB]