Computer Vision Research - 1992-2002
Former affiliations
Computer vision
Since you are reading this page,
you are probably not blind, and vision is by far your most important sense.
It is something you can take for granted, yet it is a tremendously
complicated process to which a major portion of your brain is devoted.
Computer vision, a subfield of artificial intelligence, has the goal
of replicating the capabilities of human vision on computers.
Starting from a video sequence or a few photographs (a set of
two-dimensional, static representations, digitized as arrays
of integer values, each integer representing the brightness of a
location in the image), we try to analyse the images to
extract information about the
scene such as three-dimensional geometry (shape, spatial position and
orientation), motion, or recognition of the objects, places, or people
as instances of known categories.
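As a minimal illustration of what "an array of integer values" means, here is a hand-made grayscale image in Python. The 8-bit range (0 for black, 255 for white), the pixel values, and the helper names are all assumptions chosen for the example.

```python
# A tiny 4x5 grayscale "image": one integer brightness value per pixel.
# The 0..255 range is an assumption (typical 8-bit digitization).
image = [
    [10,  12,  11, 10,  9],
    [11, 200, 210, 12, 10],
    [10, 205, 220, 11,  9],
    [ 9,  11,  12, 10, 10],
]


def mean_brightness(img):
    """Average brightness over all pixels of the image."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0]))


def bright_pixels(img, threshold=128):
    """Return the (row, column) positions brighter than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
        if value > threshold
    ]


# The bright 2x2 patch in the middle is the only "structure" in this image.
patch = bright_pixels(image)
```

Everything a vision algorithm knows about the scene has to be inferred from numbers like these.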
I have been working on a variety of projects in computer vision.
All of them have in common the theme of extracting
three-dimensional geometry from a set of images. In
computer-generated images or video games, you see 2D images
generated from 3D models. The problems I am addressing are in some
sense the inverse. They are considerably more difficult because you
try to go from a lower to a higher dimension.
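A small sketch of why the inverse direction is hard, assuming an ideal pinhole camera at the origin looking along the Z axis (the function name and the sample points are made up for the example): projection divides by depth, so every point on the same viewing ray lands on the same image location.

```python
def project(point, focal_length=1.0):
    """Ideal pinhole projection of a 3D point (X, Y, Z) onto the image plane.

    Assumes the camera sits at the origin and looks along +Z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same viewing ray (hypothetical values).
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same direction, twice as far away

# Both produce the same 2D image point: depth is lost in the projection.
same_image_point = project(near) == project(far)
```

Going from 3D to 2D is a single formula; going back from one image point, the depth is simply not there, which is why several views are needed.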
A few major projects
-
Projective geometry for multiple view analysis.
This is a rather mathematical undertaking for which I am best known.
The general goal is to extend the conditions under which
3D information can be obtained from images.
In general it is not possible to obtain 3D information
from a single image, and therefore several views are taken. The
difference between the position of corresponding features in the
various images allows us to recover the 3D information, but in order to do
so, one needs to have a description of the geometry of the set of
images (their relative position and
orientation, as well as optical characteristics). Previous methods
relied on surveyed landmarks or artificial calibration objects,
which is often impractical.
My contribution (which was the basis for my PhD thesis at INRIA)
has been to propose a new framework based on projective geometry for
such a description.
This has helped start an active subfield of research.
From a theoretical point of view, it has
considerably
deepened our understanding of the geometry of multiple images.
From a practical point of view, it
has led to a host of new solutions to the
motion and calibration problems.
-
A stereo-based integrated approach to automatic vehicle guidance.
This was an applied project, conducted at UC Berkeley,
aimed at developing vision as a sensor
technology for passenger vehicle control.
The idea is to mount a pair of video
cameras on the vehicle, and to process the images in real time in
order to extract enough road and traffic information that the vehicle
can drive automatically. The two goals are
to follow the road and to maintain the correct distance from
a leading vehicle.
The novel feature of this
project, compared to most previous approaches, is the extensive use
of binocular stereopsis. First, it provides information for obstacle
detection, grouping, and range estimation which is directly used for
longitudinal control. Second, the obstacle-ground separation
enables robust localization of partially occluded lane boundaries as
well as the dynamic update of camera rig parameters to deal with
vibrations and vertical road curvature.
I helped define the system and the algorithms in an initial
feasibility study. A second phase of the project
eventually resulted in a working
system which demonstrated automatic driving at highway speeds.
-
Continuous Terrain Modeling from Image Sequences
with Applications to Change Detection.
In this project, conducted at SRI, we build 3D mesh models
of a terrain at different times from sets of aerial images. The question we seek to
address is, given two such models, is there a significant change in
terrain shape or reflectance
(due perhaps to bomb damage, movement of large machinery,
deforestation, and so on)? This concrete question has given rise to
two general conceptual developments.
First, changes in the
3D mesh model can arise not only from significant changes
in terrain, but also from internal variability in the 3D
reconstruction algorithm. Therefore, we need a way to quantify this
source of variability, that is, to assess the performance of the
reconstruction algorithm without using ground truth. We have
developed a new statistical framework called "Self-consistency"
for this purpose.
Second, while geometric reconstruction from multiple images is now well
understood, radiometric reconstruction from multiple images is not.
We developed a new framework and method to estimate non-uniform reflectances of the
surfaces and illumination parameters from multiple images.
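Several of the projects above recover depth from the difference in position of corresponding features between views. The sketch below covers only the simplest textbook case, a calibrated and rectified pair of parallel cameras, where depth is focal length times baseline divided by disparity. It is not the projective framework or the vehicle system described above, and the function name, the parameter names, and the numbers are assumptions chosen for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_length_px, baseline_m):
    """Depth (in metres) of a feature seen in a rectified stereo pair.

    x_left, x_right: horizontal pixel position of the same feature in the
        left and right images (rectified, parallel cameras assumed).
    focal_length_px: focal length expressed in pixels.
    baseline_m: distance between the two camera centres, in metres.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("expected a positive disparity for a visible point")
    return focal_length_px * baseline_m / disparity


# Hypothetical numbers: 800-pixel focal length, cameras 0.5 m apart,
# and a feature that shifts by 20 pixels between the two images.
range_m = depth_from_disparity(420.0, 400.0, focal_length_px=800.0, baseline_m=0.5)
```

Halving the disparity doubles the estimated depth, so distant objects are measured less precisely than near ones.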
To be taken with a grain of salt...
Back to Tuan's official home page.