SRI International. 333 Ravenswood Avenue. Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

Mental Imagery, Language, and Gesture: Video Access to Human Communication

Francis Quek, Vision Interfaces and Systems Laboratory, Computer Science & Engineering Department, Wright State University

Date:  2003-07-23 at 16:00

Location:  EJ228


Much video data consists of recordings of humans engaged in communication. One may loosely classify the venues of such communication as meetings with varying degrees of formality, held for such purposes as planning, conflict resolution, negotiation, collaboration, confrontation, information exchange, and gossip. We argue that understanding human multimodal communicative behavior, and how witting or unwitting visual displays relate to such communication, is key to any approach to the analysis of such data. We need to address two questions: how do we bridge video and audio processing with the realities of human multimodal communication, and how may information from the different modes be fused? One path from multimodal behavior to language is bridged by the underlying mental imagery. This visuospatial imagery, for a speaker, relates not to the elements of syntax, but to the units of thought that drive the expression (vocal utterance and visible display). The basic idea is that mental imagery is integral to language production, and that non-verbal behavior (gesture, gaze, facial expression) informs us of this imagery. Hence, we have a handle for extracting information on human discourse from video. The question becomes which computable features of behavior are informative about imagery and the organization of the discourse. We present the Catchment Feature Model (CFM), which addresses our two key questions. We motivate the CFM from psycholinguistic research and present the model. In contrast to 'whole gesture' recognition, the CFM applies a feature decomposition approach that facilitates cross-modal fusion at the level of discourse planning and conceptualization. We shall discuss the CFM-based experimental framework and cite concrete examples of Catchment Features (CF).

   Bio for Francis Quek

Francis Quek is currently an Associate Professor in the Department of Computer Science and Engineering at Wright State University. He is director of the Vision Interfaces and Systems Laboratory (VISLab), which he established for computer vision, medical imaging, vision-based interaction, and human-computer interaction research. He performs research in multimodal verbal/non-verbal interaction, vision-based interaction, facial modeling, multimedia databases, medical imaging, collaboration technology, computer vision, human-computer interaction, and computer graphics.

   Note for Visitors to SRI

Please arrive at least 10 minutes early, as you will need to sign in by following the instructions by the lobby phone at Building E (or call Wilma Lenz at 650 859 4904, or Vicenta Lopez at 650 859 5750). SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. There are two entrances to SRI International on Ravenswood Ave. Please check the signage for the Building E entrance.

SRI International
©2017 SRI International 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation.