There is a broad class of crisis situations that can be abstractly characterized in a similar fashion: The goal of a team of people is to gain access to specific structures or objectives through a terrain that presents obstacles and is subject to continuously changing conditions. Examples include fire fighting operations, earthquake and flood response, rescues, tank battles, urban warfare, and hostage situations. Various route-planning and travel-planning tasks can be viewed in this way as well. The members of the team interact in part through language and in part through a model of the terrain, either a map or some richer representation. They use the terrain model to identify structures, terrain features, routes of access, and obstacles along these routes, and they must update the model as conditions change.
The terrain model is essentially a dynamic database of geographical information. It is potentially very rich in information, containing many levels of detail and best viewed with a specific focus or perspective. Thus, for complex tasks it is essential to have a computer-based presentation of the terrain model.
The most convenient means for interacting with such a computerized terrain model would be natural language and gesture. (By ``gesture'' we mean the use of pointing or the drawing of simple figures on a display.) This raises the problem of reference, broadly construed as the recognition of the mapping from the way meanings are expressed to the entities in the model that they indicate. This includes the problem of resolving referential expressions in context, on the basis of their form and content. But another aspect of the problem of reference arises from the fact that the conceptualizations of space that underlie natural language and gesture, on the one hand, and current terrain models, on the other, are radically different. There is a significant gap that must be bridged. The essentially topological conceptualization of space that underlies natural language must be mapped into the more geometric representations of the terrain model.
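As a purely illustrative sketch of the gap described above, the following toy example maps a qualitative expression plus a pointing gesture onto a geometric terrain model. Everything in it is a hypothetical assumption made for illustration and not part of the project's software: the `Entity` record, the tiny `MODEL`, the `near` predicate with its arbitrary 100-unit threshold, and the `resolve` function. It approximates the qualitative relation ``near'' by a distance cutoff and resolves a phrase like ``this bridge'' to the entity of the named kind closest to the pointed location.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    """A hypothetical terrain-model entry: a named feature at a map position."""
    name: str
    kind: str   # category named in the utterance, e.g. "bridge"
    x: float    # map coordinates in arbitrary units
    y: float


def near(a: Entity, b: Entity, threshold: float = 100.0) -> bool:
    """Approximate the qualitative relation 'near' with a distance cutoff.

    The threshold is an arbitrary assumption; a real system would need a
    context-dependent notion of nearness.
    """
    return hypot(a.x - b.x, a.y - b.y) <= threshold


def resolve(kind: str, point: Tuple[float, float],
            model: List[Entity]) -> Optional[Entity]:
    """Resolve 'this <kind>' plus a pointing gesture to one model entity.

    Returns the entity of the requested kind closest to the pointed
    location, or None if the model holds no entity of that kind.
    """
    candidates = [e for e in model if e.kind == kind]
    if not candidates:
        return None
    return min(candidates, key=lambda e: hypot(e.x - point[0], e.y - point[1]))


# A toy terrain model with made-up features and coordinates.
MODEL = [
    Entity("north_bridge", "bridge", 120.0, 400.0),
    Entity("south_bridge", "bridge", 130.0, 50.0),
    Entity("station_1", "building", 100.0, 380.0),
]

# "this bridge" spoken while pointing at map position (110, 390)
target = resolve("bridge", (110.0, 390.0), MODEL)
```

Even this small case shows why the mapping is not trivial: the gesture alone is ambiguous between the bridge and the nearby building, and it is the linguistic content (the kind ``bridge'') that selects among the geometric candidates.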
We propose to investigate the properties of interactions with the terrain model and between team members as they would occur in such a crisis situation. Our focus will be on elucidating the mapping from natural language and gesture to the terrain model. Specifically, we will investigate
Our first task has been to design ``Wizard of Oz'' experiments to elicit the coordinated use of language and gesture in interacting with a terrain model. Because of the software that is already available at SRI, our first experiment, to be carried out in the summer of 1997, will involve travel planning with a computer-based map and other information about a city to be toured. We have run pilot sessions on this set-up, and turned up a wide variety of styles of interacting with such a system, from users who try to keep their inputs simple for the computer with tightly coordinated speech and gesture, to users who ramble on and on with little use of gesture. One of the problems we face in this task is how to encourage the use of gesture without biasing users to a particular small set of gestures.
The second scenario we have been exploring would involve expert or trainee fire-fighters directing resources to objectives while using a terrain model rich in topographic and other information. The design of this experiment is in a much more preliminary stage.
In addition to the experimental work, we are looking at fundamental aspects of how spatial information is represented in language. Specifically, we are developing an axiomatic theory of scales, or scalar notions, which underlie our conceptualizations of space and other phenomena. This work will be reported on at the AAAI Workshop on Language and Space, in Providence, RI, July 27 and 28.
The problems of multimodal reference and how discourse structure influences it, and of the nature of the mapping between the linguistic and conceptual representations underlying language and gesture and the more geometric representations of terrain models, are of significant scientific interest and practical utility, and they are the focus of this project.