Integration of Speech and Natural Language

Finding an effective way of using natural-language understanding technology to improve speech recognition has been a long-standing goal of spoken-language understanding research, but achieving positive results has proved difficult. Under its Improved Spoken-Language Understanding project, SRI International has demonstrated a significant reduction in speech recognition error by using a natural-language processing system to rescore recognition hypotheses.

The difficulty of this task is due, at least in part, to a lack of robustness when the natural-language system is unable to analyze an utterance as a single coherent phrase or sentence. SRI's innovative approach to this problem has three parts: finding the best analysis of a recognition hypothesis as a sequence of semantically meaningful fragments; estimating the probability that an utterance consists of a sequence of fragments of the linguistic types found; and combining that probability with estimates of the probability that each fragment type is realized as the corresponding word sequence in the hypothesis. The result is an overall linguistic probability for the hypothesis, which is used to modify the score that the baseline speech recognizer assigned to it.
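The scoring scheme described above can be illustrated with a minimal sketch. It is not SRI's implementation. The names `Fragment`, `Hypothesis`, `linguistic_logprob`, and `rescore` are hypothetical, the two probability models are passed in as placeholder functions, and the weighted sum of log scores is only one assumed way of letting the linguistic probability "modify" the recognizer score, since the text does not specify how the two are combined.

```python
import math
from dataclasses import dataclass


@dataclass
class Fragment:
    ftype: str    # linguistic type of the fragment (labels here are illustrative)
    words: tuple  # the words of the hypothesis that this fragment covers


@dataclass
class Hypothesis:
    recognizer_logprob: float  # log score from the baseline recognizer
    fragments: list            # best fragment analysis of the word string


def linguistic_logprob(fragments, type_seq_logprob, words_given_type_logprob):
    """log P(type sequence) + sum over fragments of log P(words | type)."""
    types = tuple(f.ftype for f in fragments)
    total = type_seq_logprob(types)
    for f in fragments:
        total += words_given_type_logprob(f.words, f.ftype)
    return total


def rescore(hypotheses, type_seq_logprob, words_given_type_logprob, weight=1.0):
    """Return the hypothesis with the best combined score.

    Assumption: the combination is recognizer log score plus a weighted
    linguistic log probability; `weight` is a hypothetical tuning parameter.
    """
    def combined(h):
        ling = linguistic_logprob(
            h.fragments, type_seq_logprob, words_given_type_logprob
        )
        return h.recognizer_logprob + weight * ling

    return max(hypotheses, key=combined)
```

In this sketch, a hypothesis that the recognizer scores slightly lower can still be selected if it analyzes as a more probable sequence of fragment types, for example as one complete sentence instead of two unrelated fragments. Setting `weight` to 0 recovers the baseline recognizer's ranking.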

This method was tested in the December 1994 DARPA benchmark evaluations, where it produced a relative reduction in word recognition error of about 15% (from 2.5% to 2.1%). These results represent the only significant improvement we are aware of obtained by using a linguistically based natural-language knowledge source in conjunction with a current state-of-the-art recognizer, in a blind test on spontaneous, natural speech. For more information, see:

"Combining Linguistic and Statistical Knowledge Sources in Natural-Language Processing for ATIS" (gzip-compressed PostScript, 37685 bytes).
"Using Natural Language Knowledge Sources in Speech Recognition," Robert C. Moore, Proceedings of the NATO Advanced Study Institute (ASI), 1998 (PostScript, PDF).