Full-motion video has inherent advantages over still imagery for characterizing events and movement. Military and intelligence analysts currently view live video imagery from airborne and ground-based video platforms, but few tools are available for further exploitation of these streams of information. Consumers of video need object- and event-level indexing of video and its accompanying metadata so that they can interact with and exploit video streams efficiently.


SRI has developed a Modular Video Imagery Exploitation Work Station (MVIEWS), a demonstration system for annotating, indexing, extracting, and disseminating information from video streams for surveillance and intelligence applications. With MVIEWS, a single operator can view and annotate live video data. Aided by a set of intelligent software agents and automated tools, the operator can rapidly create and forward multimedia messages and reports. The MVIEWS demonstration system includes:

MVIEWS Functions

A key feature of MVIEWS is the multimodal user interface, which accepts voice and pen input. Talking, pointing, and drawing are very natural ways for humans to convey information. These modes of communication are especially valuable for describing the contents of video. Pointing and drawing are ideal for specifying locations, paths, and spatial relationships. In combination with speech, descriptions of complex concepts can be formulated very efficiently. For example, an operator can circle an object in the video and identify it ("report T-72 tank"), and then draw a path ("headed in this direction"). The identity of the detected object, its location, and its probable path can be immediately dispatched in a message. A still image chip or short MPEG video clip of the object can also be extracted from the video. The MVIEWS system automatically records the operator's verbal comments and drawings on the video image, and associates them with specific frames in the video sequence. The verbal and drawing annotations are archived and can be accessed and replayed in a later phase of analysis.
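The frame-level association described above can be illustrated with a minimal sketch. This is not MVIEWS code: the names `Annotation`, `AnnotationArchive`, `add`, and `between` are hypothetical, and the sketch simply assumes that each annotation carries the recognized speech, an optional drawn outline, and an optional drawn path, all keyed to a frame number so that they can be retrieved for replay later.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image pixel coordinates


@dataclass
class Annotation:
    """One operator annotation tied to a video frame (hypothetical layout)."""
    frame: int                 # frame number the annotation refers to
    transcript: str            # recognized speech, e.g. "report T-72 tank"
    outline: List[Point] = field(default_factory=list)  # circled region
    path: List[Point] = field(default_factory=list)     # drawn direction


class AnnotationArchive:
    """Keeps annotations sorted by frame so a frame range can be replayed."""

    def __init__(self) -> None:
        self._frames: List[int] = []        # sorted frame numbers
        self._items: List[Annotation] = []  # annotations, same order

    def add(self, ann: Annotation) -> None:
        # Insert after any existing annotations on the same frame,
        # which preserves arrival order within a frame.
        i = bisect_right(self._frames, ann.frame)
        self._frames.insert(i, ann.frame)
        self._items.insert(i, ann)

    def between(self, first_frame: int, last_frame: int) -> List[Annotation]:
        # Return all annotations with first_frame <= frame <= last_frame.
        lo = bisect_left(self._frames, first_frame)
        hi = bisect_right(self._frames, last_frame)
        return self._items[lo:hi]


if __name__ == "__main__":
    archive = AnnotationArchive()
    archive.add(Annotation(300, "headed in this direction",
                           path=[(210.0, 140.0), (320.0, 180.0)]))
    archive.add(Annotation(120, "report T-72 tank",
                           outline=[(200.0, 130.0), (230.0, 130.0),
                                    (230.0, 155.0), (200.0, 155.0)]))
    for ann in archive.between(100, 200):
        print(ann.frame, ann.transcript)
```

Keeping the archive sorted by frame number is one simple design choice: a replay tool can then fetch every annotation for the segment being viewed with two binary searches instead of scanning the whole archive.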

[Figure with labeled panels: Live Video Stream Presented to Operator for Annotation and Exploitation; Recognized Speech for Report Generation and Command Interpretation; Map of Surveillance Area; Timeline Enables Video Replay and Browsing Using Multimedia Indexing]


The MVIEWS system is based on a number of SRI-developed technologies:

MVIEWS is not finished! Other technologies, such as human-human collaboration and database retrieval by content-based indexing, are being incorporated into the system.


MVIEWS is a video exploitation testbed that can be adapted for a wide variety of surveillance, monitoring, and intelligence applications such as:


