Aravind Sundaresan

Research

This page contains a brief description of some of the projects that I have worked on. Representative publications, if any, are listed under each project. For a complete list of publications, please refer to the publications page.

Sentient: Real-time Multi-Sensor Stereo-based Tracking

People: Chris Connolly, Aravind Sundaresan, Bob Bolles

[Figure: People tracked in a scene]

Sentient is a scalable, real-time, multi-sensor stereo-based system for tracking people in crowded scenes. The system comprises distributed sensor nodes that process stereo images and detect persons in each image (referred to as tracks). A short history of tracks is maintained in order to conservatively assign a temporal ID to tracks belonging to the same person; a group of tracks with the same temporal ID is called a "tracklet". At each time instant, these tracks are transmitted over the network to a central node, which combines information from all available sensor nodes to construct a global view of the scene. Statistics such as the color and height histograms of each track, as well as geometrical and temporal proximity, are used to perform the merging at both the sensor and central nodes. The tracks, along with relevant and possibly unique statistics and image snapshots, are stored in a MySQL database that can be queried to perform tasks such as analysis of people traffic and identification of unusual activity. The stereo-based system has several advantages over a monocular system, including robustness to background and illumination changes, as well as direct access to the 3D locations of the tracked objects. We describe a number of innovations that allow us to track people in a crowd despite complex motion, mutual occlusion, etc. The figure illustrates two persons who have been tracked using two sensors and "merged": their tracks are overlaid on the image.
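The merging step can be sketched as a similarity test over track statistics. The histogram sizes, thresholds, and track fields below are illustrative assumptions, not the system's actual parameters:

```python
import numpy as np

def bhattacharyya(h1, h2):
    """Similarity between two normalized histograms (1.0 = identical)."""
    return float(np.sum(np.sqrt(h1 * h2)))

def should_merge(track_a, track_b, hist_thresh=0.8, dist_thresh=1.0):
    """Decide whether two tracks belong to the same person, using
    appearance-histogram similarity and 3D spatial proximity."""
    sim = bhattacharyya(track_a["hist"], track_b["hist"])
    dist = np.linalg.norm(np.asarray(track_a["pos"]) - np.asarray(track_b["pos"]))
    return sim > hist_thresh and dist < dist_thresh

# Toy tracks: a and b look alike and are near each other; c does not.
a = {"hist": np.array([0.5, 0.3, 0.2]), "pos": [1.0, 2.0, 0.0]}
b = {"hist": np.array([0.45, 0.35, 0.2]), "pos": [1.2, 2.1, 0.0]}
c = {"hist": np.array([0.1, 0.1, 0.8]), "pos": [5.0, 5.0, 0.0]}
```

In the actual system this test runs at both the sensor and central nodes, with temporal proximity as an additional cue.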


Leaving Flatland: 3D scene mapping

People: Radu Rusu, Aravind Sundaresan, Benoit Morisset, Motilal Agrawal, Kris Hauser, Jean-Claude Latombe, Michael Beetz

[Figure: 3D model of a scene]

"Leaving Flatland" is an exploratory project that attempts to close the loop between autonomous perception and action on challenging terrain. The proposed system includes comprehensive localization, mapping, path planning, and visualization techniques that let a mobile robot operate autonomously in complex 3D indoor and outdoor environments. We integrate robust visual odometry with real-time 3D mapping from stereo data to obtain consistent global models annotated with semantic labels. These models are used by a multi-region motion planner that adapts existing 2D planning techniques to operate on 3D terrain. All system components are evaluated on a variety of real-world data sets, and their computational performance is shown to be suitable for high-speed autonomous navigation.
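One core step, registering each stereo point cloud into a consistent global frame using the visual-odometry pose, can be sketched as follows (the pose and points are toy values, not data from the actual system):

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def register_cloud(points_cam, pose_world_from_cam):
    """Transform an Nx3 stereo point cloud from the camera frame
    into the global (world) frame given the visual-odometry pose."""
    pts_h = np.hstack([points_cam, np.ones((len(points_cam), 1))])
    return (pose_world_from_cam @ pts_h.T).T[:, :3]

# Toy pose: the robot has moved 1 m forward along x since the map origin.
pose = make_pose(np.eye(3), np.array([1.0, 0.0, 0.0]))
cloud = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]])
world = register_cloud(cloud, pose)
```

Accumulating clouds registered this way, frame after frame, is what yields a consistent global model to which semantic labels can then be attached.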

[1] Radu Bogdan Rusu, Aravind Sundaresan, Benoit Morisset, Kris Hauser, Motilal Agrawal, Jean-Claude Latombe, and Michael Beetz. Leaving Flatland: Efficient Real-Time three-dimensional perception and motion planning. Journal of Field Robotics: Special Issue on Three-Dimensional Mapping, 26(10), September 2009. [ .pdf ]


Real-time path detection for LAGR

People: Kurt Konolige, Motilal Agrawal, Morten Rufus Blas, Aravind Sundaresan, Bob Bolles

[Figure: Path segmentation]

This project describes an approach for learning the brightness, color, and texture of scene objects, with the goal of identifying a path in real time for the LAGR (Learning Applied to Ground Robots) project. At the lowest level, high-dimensional vectors in the form of textons represent local image measurements. An efficient two-step k-means implementation then clusters image regions into separate groups: the textons are first clustered into a number of distinctive texton primitives for a given scene; histograms of these texton primitives are then constructed over image neighborhoods and re-clustered into image regions with similar histograms. The Earth Mover's Distance is used to merge clusters in order to minimize over-segmentation, and integral images allow arbitrarily large histograms to be constructed in constant time. Recent advances in accelerating k-means permit a real-time implementation. Results are shown for a robotic application of outdoor path following, where a context-aware approach allows automatic learning of the visual cues for a path using 3D spatial information.
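The two-step clustering can be sketched as follows; the minimal k-means and the toy 8-D "filter responses" below are stand-ins for the real texton pipeline (which additionally uses EMD-based cluster merging and integral images):

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Minimal k-means with deterministic initialization; returns (centers, labels)."""
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Step 1: cluster per-pixel filter responses ("textons") into primitives.
rng = np.random.default_rng(1)
responses = np.vstack([rng.normal(0.0, 0.1, (50, 8)),   # texture A
                       rng.normal(1.0, 0.1, (50, 8))])  # texture B
primitives, texton_id = kmeans(responses, 2)

# Step 2: build texton-primitive histograms over neighborhoods and
# re-cluster them to group image regions with similar texture.
region_a = np.bincount(texton_id[:50], minlength=2) / 50.0
region_b = np.bincount(texton_id[50:], minlength=2) / 50.0
```

The second clustering pass operates on these neighborhood histograms rather than on raw pixels, which is what lets regions with similar texture statistics merge into path versus non-path areas.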

[1] Kurt Konolige, Motilal Agrawal, Morten Rufus Blas, Robert C. Bolles, Brian Gerkey, Joan Sola, and Aravind Sundaresan. Mapping, Navigation, and Learning for Off-Road Traversal. Journal of Field Robotics: Special Issue on LAGR Program, 26(1), December 2008. [ .pdf ]


Towards markerless motion capture

People: Aravind Sundaresan, James Sherman, Rama Chellappa

Motion capture methods traditionally use active or passive markers, but there are applications where it is desirable to do away with markers for a variety of reasons, not the least of which is their invasive nature. In particular, biomechanical and clinical applications, where marker-based motion capture is the state-of-the-art technique, would benefit greatly from such a system. We have published work on image-based 3-D tracking as well as on pose estimation and human body model estimation from voxels.

Since much of my work has revolved around multi-camera capture, I have considerable experience with different multi-camera capture systems. With James Sherman, I designed and built Hydra, a portable and scalable multi-camera capture system for human motion analysis.

Model-driven human body model estimation in Laplacian Eigenspace

[Figure: Segmentation in eigenspace]

This project has two components. The first is model-driven segmentation in Laplacian eigenspace. The input is a voxel representation, and the neighbourhood relationships of the voxels are used to compute the Laplacian of the adjacency graph. The nodes are then mapped to a 6-D Laplacian eigenspace using the eigenvectors corresponding to the smallest non-zero eigenvalues of the Laplacian matrix. We show that this transformation maps body segments whose lengths are greater than their thicknesses to 1-D curves in eigenspace. We can then fit splines to these 1-D curves and segment them at the joints. The two images on the left correspond to the 6-D eigenspace, where one segment has been extracted by fitting a spline. This work was awarded the best student paper in the Computer Vision track at the biennial International Conference on Pattern Recognition, 2006.
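The mapping itself can be sketched on a toy graph: a chain of adjacent voxel nodes (an idealized elongated limb) maps to a monotone 1-D curve along the first non-trivial eigenvector. The 5-node chain below is illustrative, not real voxel data:

```python
import numpy as np

def laplacian_embedding(adj, dim):
    """Map graph nodes into the eigenspace spanned by the eigenvectors
    of the smallest non-zero eigenvalues of the graph Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    vals, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    # Skip the trivial constant eigenvector (eigenvalue 0 for a connected graph).
    return vecs[:, 1:1 + dim]

# A 5-node chain: adjacent voxels along an idealized limb.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
emb = laplacian_embedding(adj, 1)  # nodes fall on a monotone 1-D curve
```

With the full 6-D embedding, each limb of a voxel body traces its own smooth curve, which is what makes spline fitting and segmentation at the joints tractable.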

[Figure: Model acquisition]

The second part of the project acquires a set of key frames in which the voxels have been segmented and registered using a prior model and a probabilistic registration method. This set of key frames is used to estimate the human body model in two steps: first, estimate a skeleton-based human body model and joint locations using human body statistics and the computed skeleton; then, fit a super-quadric model to the segmented voxels. The images (from left to right) show the unsegmented voxels, the segmented voxels, the computed skeleton curve, and the estimated super-quadric skeleton model. Five frames were used to estimate the model.
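The super-quadric body-part model can be illustrated via its inside-outside function; fitting it to segmented voxels amounts to a least-squares problem over these parameters. The axes and exponents below are illustrative values:

```python
import numpy as np

def superquadric_inout(p, axes, e1=1.0, e2=1.0):
    """Inside-outside function of a superquadric with semi-axes `axes`:
    < 1 inside, == 1 on the surface, > 1 outside."""
    x, y, z = np.abs(p) / np.asarray(axes)
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)

# With e1 = e2 = 1 the superquadric reduces to an ellipsoid.
on_surface = superquadric_inout(np.array([1.0, 0.0, 0.0]), (1.0, 2.0, 3.0))
inside = superquadric_inout(np.array([0.0, 0.0, 1.5]), (1.0, 2.0, 3.0))
```

Varying the exponents e1 and e2 morphs the shape between ellipsoidal and box-like forms, which is what makes superquadrics a compact yet flexible model for limbs and torso.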

Articulated 3-D Tracking using shape and motion cues

[Figure: 3D tracking using shape and motion]

We perform 3D tracking of articulated subjects using both motion and shape cues in images obtained from multiple cameras. The motion cue is the computed pixel displacement (optical flow); the shape cues are the silhouette and the "motion residue". These cues are complementary: when fused in the tracking algorithm, the shape features prevent drift, while motion prediction from optical flow helps avoid non-optimal local minima. The two images on the right show the super-quadric model superimposed on the image for two views.

[1] Aravind Sundaresan. Towards Markerless Motion Capture: Model estimation, Initialization and Tracking. PhD thesis, University of Maryland, College Park, MD 20740, 2007. [ .pdf ]
[2] Aravind Sundaresan and Rama Chellappa. Model driven segmentation and registration of articulating humans in Laplacian Eigenspace. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(10):1771-1785, October 2008. [ .pdf ]
[3] Aravind Sundaresan and Rama Chellappa. Multi-camera tracking of articulated human motion using shape and motion cues. IEEE Transactions on Image Processing, 18(9):2114-2126, September 2009. [ .pdf ]


Real-time marker-based motion capture to control robots

People: Allen Yang, Aravind Sundaresan, James Davis, Hector Gonzalez-Banos

[Figure: Real-time marker-based capture]

This project was carried out with Victor Ng-Thow-Hing at Honda Research Institute. The objective was to retarget motion from a subject wearing markers to robots such as Asimo. Images were obtained from eight cameras attached to two servers; each server can be controlled over the network to capture images, calibrate, and compute 2D and 3D marker locations. The markers are located in real time and their positions in space computed. The pose is estimated from the markers, and the motion is retargeted to the robot. The motion retargeting was done by Allen and Hector.
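Recovering a marker's 3D position from its 2D observations in calibrated views is a triangulation problem. Linear (DLT) triangulation, shown below with toy camera matrices, is one standard formulation, not necessarily the system's exact method:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one marker from two calibrated views.
    P1, P2: 3x4 projection matrices; x1, x2: 2-D image observations."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                 # null vector of A (homogeneous 3-D point)
    return X[:3] / X[3]

def project(P, X):
    """Project a 3-D point with a 3x4 camera matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Toy cameras: identity intrinsics, second camera shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
X_hat = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With eight cameras, the same linear system simply gains two rows per additional view, and the least-squares solution averages out observation noise.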

Human Identification at a Distance (HID)

People: Amit Kale, Aravind Sundaresan, Naresh Cuntoor, Rama Chellappa, Amit Roy-Chowdhury, Volker Krüger, A. N. Rajagopalan

[Figure: HMM for gait recognition]

The objective is to develop methods for representing and recognizing humans in video sequences. We use 2-D binary silhouettes in an HMM framework for modelling gait and human shape. This simple approach enables us to analyse gait and gain an understanding of the problem; we aim ultimately to build 3-D models of human motion. Silhouettes extracted from the sequence are used to build an exemplar-based human shape and gait model using the Baum-Welch algorithm. The model can then identify an unknown subject by maximising the posterior probability of the model given the sequence. The UMD data is available at the HID UMD Database page.
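Identification reduces to scoring the observed silhouette sequence under each subject's trained HMM and picking the maximum. The tiny two-state models below are illustrative stand-ins for the exemplar-based models trained with Baum-Welch:

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM), with pi the initial
    state probabilities, A the transition matrix, B[state, symbol] the
    emission probabilities."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha /= c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        loglik += np.log(c)
        alpha /= c
    return loglik

# Two hypothetical subjects' models sharing the same state dynamics.
pi = np.array([0.5, 0.5])
A = np.array([[0.8, 0.2], [0.2, 0.8]])
B1 = np.array([[0.9, 0.1], [0.1, 0.9]])   # subject 1: distinctive emissions
B2 = np.array([[0.5, 0.5], [0.5, 0.5]])   # subject 2: uninformative emissions
obs = [0, 0, 0, 1, 1, 1]                  # quantized silhouette exemplars
scores = [forward_loglik(obs, pi, A, B) for B in (B1, B2)]
identity = int(np.argmax(scores))         # best-scoring model wins
```

In the actual system the "symbols" are silhouette exemplars rather than two abstract tokens, but the decision rule, maximum likelihood (equivalently, maximum posterior under equal priors) over per-subject models, is the same.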

[1] Amit A. Kale, Aravind Sundaresan, A. N. Rajagopalan, Naresh P. Cuntoor, Amit K. Roy Chowdhury, Volker Krüger, and Rama Chellappa. Identification of humans using gait. IEEE Transactions on Image Processing, 13(9):1163-1173, September 2004. [ .pdf ]
