SRI International. 333 Ravenswood Avenue. Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

Auditory and Visual Scene Analysis for Multiparty Human-Robot Interaction

Radu Horaud, PERCEPTION group, INRIA Grenoble, France

Notice:  Hosted by Bob Bolles

Date:  Friday, July 8th 2016 at 1:30pm

Location:  EK255 (SRI E building)


Robots of the future are envisioned to collaborate and communicate with people, rather than merely execute repetitive physical tasks such as object manipulation. In particular, robots should be able to solve problems together with humans, or to assist them in various ways. For example, if a robot (or, more generally, an autonomous agent) is engaged in a conversation with two or more persons, important tasks must be solved before automatic speech recognition, natural language processing, and turn-taking: correctly assigning temporal segments of speech to speakers, and associating spoken words with objects in the shared physical space. This calls for a novel multi-user, multimodal approach that goes beyond traditional single-user spoken-dialogue systems. In this talk we will advocate the use of both audio and visual data and will present the methodology under development in the PERCEPTION team at INRIA Grenoble (France). We will briefly enumerate the difficulties associated with audio-visual fusion. We will describe how to extract auditory and visual features over time, how to align these two modalities, and how to associate speech signals with visually tracked persons within a Bayesian fusion model. This framework calls for advanced multi-channel audio signal processing and multi-person detection and tracking techniques, which are under development. Finally, we will mention a number of challenges that remain to be solved in order to allow unconstrained human-robot and human-computer interaction.

   Bio for Radu Horaud

Radu Patrice Horaud holds a position as director of research at INRIA Grenoble Rhône-Alpes, France. He is the founder and director of the PERCEPTION team. Radu, formerly Horodniceanu, was born in Bucharest, Romania, arrived in France in 1972 at the age of 18, and became a citizen of France in 1975. Radu’s research interests cover computational vision, audio signal processing, audio-visual scene analysis, machine learning, and robotics. He is the author of over 200 scientific publications. Radu pioneered work in computer vision using range data (or depth images) and developed a number of principles and methods at the crossroads of computer vision and robotics. In 2006, he started to develop audio-visual fusion and recognition techniques in conjunction with human-robot interaction. He is an area editor for Computer Vision and Image Understanding (Elsevier), a member of the advisory board for the International Journal of Robotics Research (Sage), and an associate editor for the International Journal of Computer Vision (Kluwer-Springer). In 2001 he was program co-chair of the IEEE Eighth International Conference on Computer Vision (ICCV’01), and in 2015 he was program co-chair of the 17th ACM International Conference on Multimodal Interaction (ICMI’15). Radu Horaud was the scientific coordinator of the European Marie Curie network VISIONTRAIN (2005-2009) and of the STREP projects POP (2006-2008) and HUMAVIPS (2010-2013), and the principal investigator of a collaborative project between INRIA and Samsung’s Advanced Institute of Technology (SAIT) on computer vision algorithms for 3D television (2010-2013). In 2013 he was awarded an ERC Advanced Grant for his five-year project VHIA (2014-2019). In 2015 he received a three-year grant (jointly with Florence Forbes) from the Xerox University Affairs Committee.

   Note for Visitors to SRI

Photography or broadcast of the event is prohibited unless specifically authorized by SRI. Reporters must coordinate with SRI at least 24 hours before attending.
Please arrive at least 10 minutes early, as you will need to sign in by following the instructions posted by the lobby phone at Building E (or by calling Wilma Lenz at 650 859 4904, or Eunice Tseng at 650 859 2799). SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. SRI International has two entrances on Ravenswood Avenue; please check the signage for the Building E entrance.
