Improved Spoken-Language Understanding
Principal investigators:
John Dowding
(dowding@ai.sri.com)
Ananth Sankar
(sankar@speech.sri.com)
"Combining Linguistic and Statistical Technology for Improved Spoken
Language Understanding" is the current project in a series of DARPA-funded research projects at
SRI International with the goal of
creating technology for understanding spontaneous spoken natural
language. This technology combines speech recognition (determining
what sequence of words has been spoken) with natural-language
understanding (determining what a given sequence of words means).
Specific goals are to improve the accuracy, robustness, generality,
and speed of spoken-language understanding systems; to reduce the
effort required to port to new applications; and to apply
spoken-language understanding technology to real problems of military
and commercial interest. Work under the project encompasses
This project is a joint effort of SRI's Artificial Intelligence Center and Speech Technology and Research
Laboratory.
Recent Accomplishments
- Greatly expanded the scope of CommandTalk, a spoken-language
interface to distributed battlefield simulations that allows operators
to control synthetic forces using the same language that commanders
use to direct live forces. CommandTalk has recently been extended
from its original Marine Corps application to control Navy, Air Force,
and Army synthetic forces.
- Developed a new compiler to extract recognition grammars from GEMINI
natural-language grammars, supporting more general natural-language
grammars and running 10 times faster than an earlier compiler.
- Developed SOLVIT, a foreign-language instruction application in
areas of interest to Special Operations Forces (SOF), and demonstrated
SOLVIT at the SOF Language Conference in October 1996.
- Developed an initial version of Multi-media Archival and Retrieval
Voice Entry Link (MARVEL), for archiving and retrieving information
stored in the form of speech data, such as news broadcasts, using
spoken queries. Innovative techniques include a retrieval algorithm
using a statistical language model for each information segment to
rank information segments in order of relevance to the query, and an
algorithm for automatically determining a set of key words that
discriminate well between news stories.
- Designed and implemented a new state-clustering algorithm that
uses a distance metric based on both acoustic similarity and allophone
class entropy, which produced a statistically significant improvement
in recognition accuracy in initial tests.
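The MARVEL retrieval idea described above (one statistical language model per information segment, with segments ranked by relevance to the query) can be illustrated with a minimal sketch. This is not the MARVEL implementation: it assumes unigram models, a naive whitespace tokenizer, and interpolation with a background model, and the names rank_segments and lam are hypothetical. Real input would be recognizer transcripts of the spoken query and the archived speech rather than clean text.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def rank_segments(query, segments, lam=0.5):
    """Return segment indices ordered from most to least relevant.

    Each segment gets its own unigram language model; relevance is the
    log-likelihood of the query words under that model. lam is an assumed
    interpolation weight between the segment model and a background model
    built from all segments.
    """
    seg_counts = [Counter(tokenize(s)) for s in segments]
    background = Counter()
    for counts in seg_counts:
        background.update(counts)
    bg_total = sum(background.values())
    vocab_size = len(background)

    scores = []
    for idx, counts in enumerate(seg_counts):
        seg_total = sum(counts.values())
        log_p = 0.0
        for word in tokenize(query):
            p_seg = counts[word] / seg_total if seg_total else 0.0
            # Add-one smoothing keeps the background probability nonzero
            # for query words that appear in no segment.
            p_bg = (background[word] + 1) / (bg_total + vocab_size + 1)
            log_p += math.log(lam * p_seg + (1 - lam) * p_bg)
        scores.append((log_p, idx))
    # Highest log-likelihood first; ties broken by original order.
    return [idx for _, idx in sorted(scores, key=lambda t: (-t[0], t[1]))]
```

For example, given a few short news-story transcripts, rank_segments("budget bill senate", stories) would place a story containing those words ahead of stories that contain none of them. Interpolating with a background model is one common way to avoid zero probabilities; the source does not say which smoothing method MARVEL used.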
FY 1998 Plans
- Use CommandTalk to control all synthetic forces in DARPA's STOW97
Advanced Concept Technology Demonstration.
- Create tools to permit application developers to easily build
spoken-language interfaces in new domains. A major focus will be on
the development of a high-level tool to allow developers who are not
expert computational linguists to define grammars for new
applications.
- Extend CommandTalk to accept unanticipated variations in language,
and to engage in extended voice dialogs with users.
- Develop automatic techniques to segment and classify speech data
into different acoustic classes, such as clean or degraded speech
or speech from different speakers, in order to apply processing
methods specific to those segment types for purposes of speech
recognition and information retrieval.
- Develop robust recognition algorithms to improve recognition on
acoustically degraded speech, and develop robust techniques to detect
unknown words.