
Unified Messaging

The Unified Messaging application extends the Automated Office presented in section 4 with an emphasis on ubiquitous access and dynamic presentation of the information and services supported by the agent community. The agents used in this application are depicted in Figure 4.

A real dialog taken from the current system can provide insight into how systems are built using OAA. In this scenario, the user, with only a telephone as an interface, is planning a trip to Boston where he will soon give a presentation. Capitalized sentences are phrases spoken by the user into the telephone and processed by a phone agent. Responses, unless otherwise indicated, are spoken by the system using text-to-speech generation.

1.1  Welcome to SRI International.  Please enter your user ID and password.
     <Enters touchtone ID and password>
     Good to see you again Adam Cheyer.  I am listening to you.

Every user interface agent, including the telephone agent, should know the identity of its user. This information is used to resolve anaphoric references such as ``me'' and ``I'', and it allows multiple user interfaces operated by the same user to work together.
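
As a rough illustration of how a known user identity can ground first-person references, the Python sketch below substitutes the logged-in user's name for ``me'' and ``I''. The function name and the token-list representation are assumptions made for this example; the actual OAA reference resolution is not shown in this section.

```python
# Hypothetical helper: not part of OAA, shown only to illustrate the idea.
FIRST_PERSON = {"me", "i"}

def resolve_first_person(tokens, user):
    """Replace first-person pronouns with the identity of the current user."""
    return [user if token.lower() in FIRST_PERSON else token for token in tokens]

request = "GET IT TO ME WITH NOTE ONE".split()
resolved = resolve_first_person(request, "Adam Cheyer")
```

Because the identity is attached to the interface agent rather than to the request, the same substitution would apply whether the request arrives by telephone or from a desktop interface.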

1.2  WHAT IS TODAY'S SCHEDULE?
     Here is today's schedule for Adam Cheyer:
       At 10am for 1 hour, meeting with Dave.
       At 3pm for 1 hour, presentation about software agents.
     End of schedule.

If the user is operating both a graphical user interface and a telephone, as described in conjunction with the Automated Office application, this spoken request causes a calendar window containing the current schedule to be displayed. In this case, with no graphical display available, the GEN_NL agent is tasked with producing a spoken response that can be played over the phone. GEN_NL shares the same dynamic vocabulary and phrasal rules as the natural language parser DCG_NL, and contains strategies for producing responses to queries using either simple or list-based multimedia utterances.
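
The choice between a calendar window and a spoken list can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical `present_schedule` function and a simple set of active modalities; it is not the GEN_NL implementation, whose vocabulary and phrasal rules are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Event:
    start: str
    duration: str
    title: str

def present_schedule(user, events, modalities):
    """Choose a presentation for a schedule based on the user's active interfaces."""
    if "gui" in modalities:
        # A graphical display is available: pass structured data to a calendar window.
        return {"type": "calendar_window", "user": user, "events": events}
    # Telephone only: build a list-based utterance for text-to-speech.
    lines = [f"Here is today's schedule for {user}:"]
    lines += [f"At {e.start} for {e.duration}, {e.title}." for e in events]
    lines.append("End of schedule.")
    return {"type": "speech", "text": " ".join(lines)}

events = [
    Event("10am", "1 hour", "meeting with Dave"),
    Event("3pm", "1 hour", "presentation about software agents"),
]
spoken = present_schedule("Adam Cheyer", events, {"phone"})
windowed = present_schedule("Adam Cheyer", events, {"phone", "gui"})
```

The same query result feeds both branches; only the final rendering step depends on which interfaces the user currently has available.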

1.3  FIND FRIDAY'S WEATHER IN BOSTON.
     The weather in Boston for Friday is as follows:
       Sunny in the morning. Partly cloudy in the afternoon with a 20
       percent chance of thunderstorms late. Highs in the mid 70s.

In addition to data accessible from legacy applications, content may be retrieved by web-reading agents which provide OAA wrappers around useful websites.

1.4  FIND ALL NEW MAIL MESSAGES.
     There are 2 messages available.
     Message 1, from Mark Tierny, entitled ``OAA meeting.''

1.5  NEXT MESSAGE
     Message 2, from Jennifer Schwefler, entitled ``Presentation Summary.''

1.6  PLAY IT.
     This message is a multipart MIME-encoded message.  There are two parts.  
     Part 1.  (Voicemail message, not text-to-speech):
     Thanks for taking part as a speaker in our conference.  
     The schedule will be posted soon on our homepage.

1.7  NEXT PART
     Part 2. (read using text-to-speech):
     The presentation home page is http://www....

1.8  PRINT MESSAGE
     Command executed.

Mail messages are no longer just simple text documents; they often consist of multiple subparts containing audio files, pictures, webpages, attachments and so forth. When a user asks to play a complex email message over the telephone, many different agents may be involved in the translation process, and a different set of agents would be involved had the request been ``print it.'' The challenge is to develop a system in which agents cooperate in an extensible, flexible manner, so that agent interactions need not be explicitly coded for every possible input/output combination.

In an OAA implementation, each agent concentrates only on what it can do and on what it knows, and delegates all other work to the agent community. For instance, a printer agent that declares the solvable print(Object,Parameters) can be defined by the following pseudocode, which says, in effect, ``If someone can get me a document, in either POSTSCRIPT or text form, I can print it.''

print(Object, Parameters) {

   ' If Object is a reference to ``it'', find an appropriate document
   if (Object = "ref(it)") 
      oaa_Solve(resolve_reference(the, document, Parameters, Object),[]);

   ' Given a reference to some document, ask for the document in POSTSCRIPT
   if (Object = "id(Pointer)")
      oaa_Solve(resolve_id_as(id(Pointer), postscript, [], Object),[]);

   ' If Object is of type text or POSTSCRIPT, we can print it.
   if ((Object is of type Text) or (Object is of type Postscript))
      do_print(Object);
}

In our example, since an email message is the salient document, the mail agent will receive a request to produce the message as POSTSCRIPT. Although the mail agent may know how to save a text message as POSTSCRIPT, it will not know what to do with a webpage or voicemail message. For these parts of the message, it will simply send oaa_Solve requests to see if another agent knows how to accomplish the task.
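
This delegation pattern can be sketched in Python as follows. The `Facilitator` class, the `as_postscript` solvable name, the two agent functions, the `PS<...>` placeholder output, and the example URL are all assumptions made for illustration; they stand in for the OAA facilitator and oaa_Solve and do not reproduce the actual OAA library.

```python
class Facilitator:
    """Toy stand-in for an OAA facilitator: maps solvable names to agent handlers."""

    def __init__(self):
        self.solvables = {}

    def register(self, name, handler):
        self.solvables.setdefault(name, []).append(handler)

    def solve(self, name, doc):
        # Ask each agent that declared the solvable until one returns a result.
        for handler in self.solvables.get(name, []):
            result = handler(self, doc)
            if result is not None:
                return result
        return None

def mail_agent(fac, doc):
    """Converts messages; handles text parts itself and delegates the rest."""
    if doc["type"] != "message":
        return None
    pages = []
    for part in doc["parts"]:
        if part["type"] == "text":
            pages.append(f"PS<{part['body']}>")
        else:
            # The mail agent does not know this part type: ask the community.
            converted = fac.solve("as_postscript", part)
            if converted is None:
                converted = f"PS<[unprintable {part['type']} part]>"
            pages.append(converted)
    return "\n".join(pages)

def web_agent(fac, doc):
    """Converts webpages only (rendering is faked here)."""
    if doc["type"] != "webpage":
        return None
    return f"PS<rendered {doc['url']}>"

fac = Facilitator()
fac.register("as_postscript", mail_agent)
fac.register("as_postscript", web_agent)

message = {"type": "message", "parts": [
    {"type": "voicemail", "audio": b""},
    {"type": "text", "body": "The presentation home page is attached."},
    {"type": "webpage", "url": "http://example.org/schedule"},  # hypothetical URL
]}
postscript = fac.solve("as_postscript", message)
```

Adding a speech-to-text agent that registers the same solvable would make the voicemail part printable without any change to the mail or printer agents.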

Until now, the user has been using only a telephone as his user interface. Now, he moves to his desktop, starts a web browser, and accesses the URL referenced by the mail message.

1.9  RECORD MESSAGE
     Recording voice message.  Start speaking now.

1.10 THIS IS THE UPDATED WEB PAGE CONTAINING THE PRESENTATION SCHEDULE.
     Message one recorded.

1.11 IF THIS WEB PAGE CHANGES, GET IT TO ME WITH NOTE ONE.
     Trigger added as requested.

In this example, a local agent which interfaces with the web browser can return the current page as a solution to the request ``oaa_Solve(resolve_reference(this, web_page, [], Ref),[])'', sent by the NL agent. A trigger is installed on a web agent to monitor changes to the page, and when the page is updated, the notify agent can find the user and transmit the webpage and voicemail message using the most appropriate media transfer mechanism.
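
A trigger of this kind can be approximated by comparing content digests between polls. The Python sketch below is an assumption-laden illustration: the `PageTrigger` class, the injected `fetch` function, the callback signature, and the example URL are hypothetical, and the section does not say how the web agent actually detects changes.

```python
import hashlib

class PageTrigger:
    """Fires a callback when the content fetched for a URL changes between polls."""

    def __init__(self, url, fetch, on_change):
        self.url = url
        self.fetch = fetch          # function: url -> page content (str)
        self.on_change = on_change  # function: (url, content) -> None
        self.last_digest = self._digest(fetch(url))

    @staticmethod
    def _digest(content):
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def poll(self):
        content = self.fetch(self.url)
        digest = self._digest(content)
        if digest == self.last_digest:
            return False
        self.last_digest = digest
        self.on_change(self.url, content)
        return True

# Simulated web and notify agent, used in place of real network access.
pages = {"http://example.org/schedule": "draft schedule"}  # hypothetical URL
deliveries = []

def notify_user(url, content):
    # Stand-in for the notify agent: bundle the page with the recorded note.
    deliveries.append({"url": url, "page": content, "note": "note one"})

trigger = PageTrigger("http://example.org/schedule", pages.__getitem__, notify_user)
unchanged = trigger.poll()
pages["http://example.org/schedule"] = "final schedule"
changed = trigger.poll()
```

Keeping the notification step behind a callback mirrors the division of labor in the text: the web agent only detects the change, and a separate agent decides how to reach the user.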

This example, based on the Unified Messaging application, is intended to show how OAA concepts can be used to produce a simple yet extensible solution to a multiagent problem that would be difficult to implement using a more rigid framework. The application supports adaptable presentation for queries across dynamically changing, complex information; shared context and reference resolution among applications; and flexible translation of multimedia data. In the next section, we present an application that highlights the use of parallel competition and cooperation among agents during multimodal fusion.



Adam Cheyer
Mon Oct 19 17:14:26 PDT 1998