Open Agent Architecture (OAA)

Developer's Guide




The OAA Home Page is at: http://www.ai.sri.com/~oaa

This document is at http://www.ai.sri.com/~oaa/distribution/doc/proguid2.html

Open Agent Architecture and OAA are trademarks of SRI International.







1. Introduction

This Developer's Guide describes the design and implementation of the Open Agent Architecture (OAA), and provides a description of how to create agents within this framework.

This document assumes familiarity with the goals and concepts which have been defined for the OAA project, as described in related documents such as the Specification Report and Definition Report. Some familiarity with the Prolog language is also assumed.



2. OAA Design Objectives

The term "agent-based programming" has come to mean many things in today's technological jargon. Within the OAA framework, agents are seen as independent processes which communicate and cooperate with each other across a distributed network. We think of agents as more than just distributed objects because of their high-level communication language and their abilities to actively contribute to a computation as opposed to being only passive participants.

Although the goals and concepts for the Open Agent Architecture have been developed in detail elsewhere (OAA Specification Report, OAA Definition Report), it will be helpful to briefly restate some of the design objectives of the architecture here.

2.1 Fine-grained Cooperation

The OAA focuses on the idea of a community of agents working together to solve tasks for the user. Although it is possible in principle to create a single agent whose role is to autonomously accomplish all envisioned tasks, the benefits of the agent-based approach are best realized when computation is spread over many specialized "expert" agents. In the OAA framework, most requests from users are handled by the combined effort of multiple agents (for example, participating agents may include those that understand natural language input, plan actions, access databases, display results, etc.). This imposes a requirement that communication among agents be efficiently implemented, so as not to incur unacceptable overhead in executing a task.

2.2 Distributedness

The community of agents should be diverse, allowing agents to run on whatever platforms they choose, and to be written in any number of programming languages. The OAA provides a set of standard conventions which allows agents to work together under these conditions. In addition, distributed computation opens the door to parallel computation, where multiple agents may work either cooperatively or competitively on various aspects of a task.

2.3 Adaptability

As new members join the community, the overall effect of their interactions should change. If some agent joins the collaborative process as a latecomer, the interactions among agents must be flexible enough to allow the new agent to participate in computations. A "plug & play" architecture allows systems built with preexisting agents to easily take advantage of new functionality added later in the form of a new agent or agents.

2.4 Communication

Since human users are expected to be participants in the collaborative agent experience, the Interagent Communication Language (ICL), however defined, must be powerful enough to represent natural language (human) input. If the ICL can represent full natural language expressions, procedural (programmatic) interactions will also be possible.

2.5 Active and Reactive Computation

Agents are more than just passive data sources that perform actions or return information only when requested. Agents should be able to monitor actions in the world around them and decide when to take action, perhaps alerting a user or group of other agents about some pertinent situation. In addition, agents should be able to watch the interactions of other agents, and perhaps make suggestions about how to do something better.



3. Overview of the Open Agent Architecture

Figure 1 presents the basic structure of the Open Agent Architecture, using several agents from the Office Assistant prototype as examples of typical agents. The configuration shown in Figure 1 is composed of a multimodal user interface agent that analyzes input and coordinates the presentation of multimedia output to the user; a collection of domain agents for various office-related tasks; and a specialized server agent - the facilitator agent - that is responsible for coordinating agent communication and control, and for providing a global data store to its client agents. Note that a system configuration is not limited to a single facilitator; connectivity between multiple facilitators is discussed later in this document (see Multiple Facilitator Configurations).

Figure 1: OAA structure

3.1 Client Agents

Each agent in the OAA is either a facilitator agent or a client agent. Client agents are so called because each acts (in some respects) as a client of some facilitator, which provides communication and other essential services for the client. When invoked, a client agent makes a connection to a facilitator, which is known as its "parent facilitator". Upon connection, an agent informs its parent facilitator of the services it is capable of providing. When the agent is needed, the facilitator sends it a request using the Interagent Communication Language (ICL). The agent parses this request, processes it, and returns answers or status reports to the facilitator. In processing a request, the agent can make use of a variety of capabilities provided by the OAA. For example, it can request services of other agents, set triggers and post data to the facilitator.

3.2 Facilitator Agents

Each facilitator is an agent that is responsible for managing three types of tasks for its set of client agents:

3.3 Multiple Facilitator Configurations

The agent configuration displayed above in Figure 1 contains a typical arrangement with one facilitator agent and a number of client agents for which that facilitator is responsible. The facilitator knows the capabilities of each of its agents and, given a task to execute, decides how the agents should interact to produce the desired results. However, other network configurations are possible within the OAA, including more complex configurations where multiple facilitator agents can interact.

As noted in the previous section, the facilitator is actually just another OAA agent, using the same agent library and communication standards as a domain agent. In the case of a multiple-facilitator configuration, we can think of each facilitator agent as a "super" agent, capable of solving all goals solvable by its client agents; the facilitator accomplishes this by delegating each incoming goal to one or more of its client agents.

A number of multiple-facilitator configurations are conceivable within the OAA. One issue to consider when designing a network structure of OAA agents is to ensure that circular requests will not occur, with one agent asking another to do a task, and the second trying to accomplish the task by asking the first to do it.

We have experimented with one possible multiple-facilitator configuration, using a hierarchy of facilitator agents (Figure 2). When a goal (G) is posted to a local facilitator (BB1), and the facilitator agent at BB1 determines that none of its child agents has the requisite knowledge to achieve the goal, it propagates the goal to a more senior facilitator agent (BB4) in the hierarchy. ("Child" agents here refers to non-facilitator agents, which are not shown in the figure.) This more senior facilitator agent maintains a knowledge base of the goals that its lower level facilitators can solve. When a senior facilitator agent receives such a request, it in turn propagates the request down to its child agents (which themselves are facilitator agents), which either have immediate child agents which can evaluate the goal, or can themselves pass on the goal to another subsidiary facilitator agent. In the case illustrated in Figure 2, BB4 determines that none of its subsidiary facilitators can handle the goal, and thus sends the goal to its superior facilitator agent (BB5). BB5 passes the goal to BB6, which in turn passes it to BB9. When such a "referred goal" is passed through the hierarchy of facilitators, it is accompanied by information about the address of the originating facilitator (indicated by the BB1 subscript on G).

This "continuation" information enables a return communication (with answers or failure) to be sent directly to the originating facilitator, without having to navigate the facilitator hierarchy again. Also, the identity of the responding knowledge source BB9 can be sent back to the originator, so that future queries of the same type from BB1 may be addressed directly to BB9 without passing through the hierarchy of facilitators.

Figure 2: Multiple facilitators arranged in a hierarchy

3.4 The Interagent Communication Language

The OAA's Interagent Communication Language (ICL) is the interface language shared by all agents, no matter what machine they are running on or what computer language they are programmed in. The ICL has been designed as an extension of the Prolog programming language, in order to take advantage of the power of unification and backtracking during interactions among agents.

Every agent participating in an OAA-based system defines and publishes a set of capabilities specifications, expressed in the ICL, describing the services that it provides. These establish a high-level interface to the agent, which is used by a facilitator in communicating with the agent, and, most important, in delegating service requests (or parts of requests) to the agent. Partly due to our use of Prolog as the basis of the ICL, we refer to these capabilities specifications as solvables.

For example, in creating an agent for a mail system, solvables might be defined for sending a message to a person, testing whether a message about a particular subject has arrived in the mail queue, or displaying a particular message onscreen. For a database wrapper agent, one might define a distinct solvable corresponding to each of the relations present in the database.

3.5 Execution Management

Because of the heterogeneity of implementation languages, platforms, and origins that is likely to be present among the agents of a system, there can be a fair amount of complexity involved in starting up a system, and in keeping it running. To alleviate some of the effort involved, the OAA includes an execution manager, Start-It.

Once a collection of interoperable agents has been assembled to work on a set of tasks, Start-It provides the means of invoking each of the agents on the correct platform, according to the system protocols of that platform, and ensuring that the agent makes the required connection to an OAA facilitator. Of equal importance, Start-It monitors the status of each agent to see that it continues to function correctly. In the event that Start-It detects a failure of one of the agents, it is able to take steps to recover from the failure and automatically restart the agent.

Startup specifications for each agent and instructions on how to deal with failures are contained in configuration files which can be automatically generated by a component of the Agent Development Tools. The use of Start-It is described in more detail below (see Section 10.1, Start-It).

3.6 Notes on Terminology

One goal of the OAA is to facilitate the creation of agents using a wide variety of implementation languages. This Developer's Guide is intended for use by developers who are working with any of the languages currently supported. However, this creates a considerable challenge in selecting the appropriate terminology to use in referring to programming concepts.

The OAA has roots in Prolog programming, and some of the concepts of Prolog (in particular, unification and backtracking) are employed by the OAA. Consequently, the developer should have some understanding of these concepts, and these terms are used wherever appropriate in this Guide.

It is helpful to distinguish between general design concepts and implementation issues. In discussing design concepts, this manual employs general terminology that is clearly not related to any programming language. For example, we say that agents "provide services" that can be "requested" by other agents.

In discussing implementation issues, a careful mixture of Prolog-specific terminology and language-neutral terminology has been selected. In cases where Prolog terminology helps to characterize some OAA feature that is inherently Prolog-like, we use that terminology. For example, we say that an agent can "solve a goal" (or "solve a query") to emphasize first, that the goal must unify with one of the agent's capabilities specifications, and second, that the goal can have multiple solutions.

Otherwise, we use terminology that is widely familiar and can be considered to be programming language-neutral. For example, instead of the Prolog term "predicate", we use "procedure" or "declaration", depending on context.



4. Agent Infrastructure

4.1 The Agent Library

The agent library provides functionality that is common to all agents in the OAA, including facilitator agents; in a sense, the capabilities provided by the agent library determine what it means to be an OAA agent. The library has been ported to a number of different programming languages, and the inclusion of the library is part of the implementation of every agent. The agent developer needs to call procedures provided by the library, and needs to provide certain declarations and define certain callback procedures that are expected by the library. These procedures and declarations are described throughout this Developer's Guide. This section, on Agent Infrastructure, describes some basic features of the agent library, and some uses of the library that are common to all agents.

4.2 Transport Protocol

TCP/IP (Transmission Control Protocol / Internet Protocol) has been chosen as the transport protocol on which the OAA is based. TCP/IP is standardized on many operating systems (UNIX, Macintosh, DOS, Microsoft Windows, and so forth), so its use facilitates the task of interoperating agents on a wide variety of platforms.

This layer of protocol is provided transparently by the agent library; thus the agent developer need not be concerned with any of the details of TCP/IP.

4.3 The Event Loop

The activities of every agent are structured around an event loop, which is initiated when the agent is invoked (normally by the agent's call to the agent library procedure go/3). The event loop repeatedly checks the agent's event queue to see whether any messages (events) have arrived from the agent's parent facilitator. When an event arrives, it is handled in one of three ways:

There are two elements of an agent program that allow it to control what happens when there is no event present in the agent's event queue. The timeout declaration allows the agent to specify how long to wait for another event. Whenever a nonzero timeout value has been exceeded, the agent library calls the user-defined procedure idle.
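The interplay between the event queue, the timeout declaration and idle can be illustrated with a small simulation. The sketch below is written in C and does not use the agent library: the Event and LoopStats types, the run_event_loop function and the notion of a "tick" are all hypothetical, introduced only for illustration. It assumes that idle is called once each time the timeout elapses with no event arriving, which is only one possible reading of the description above.

```c
#include <assert.h>

/* Hypothetical event record: not an agent library type. */
typedef struct {
    int arrival_tick;   /* simulated time at which the event reaches the queue */
    int id;
} Event;

typedef struct {
    int handled;        /* number of events dispatched */
    int idle_calls;     /* number of times the idle callback would have run */
} LoopStats;

/*
 * Simulate an event loop for total_ticks ticks. events[] must be sorted by
 * arrival_tick. At most one event is dispatched per tick. A timeout of 0
 * means "never call idle".
 */
LoopStats run_event_loop(const Event *events, int n_events,
                         int timeout, int total_ticks)
{
    LoopStats stats = {0, 0};
    int next = 0;     /* index of the next undelivered event */
    int waited = 0;   /* ticks spent waiting since the last event or idle call */
    int tick;

    assert(timeout >= 0);
    for (tick = 0; tick < total_ticks; tick++) {
        if (next < n_events && events[next].arrival_tick <= tick) {
            stats.handled++;      /* stands in for dispatching the event */
            next++;
            waited = 0;
        } else if (timeout > 0 && ++waited > timeout) {
            stats.idle_calls++;   /* stands in for calling the agent's idle */
            waited = 0;
        }
    }
    return stats;
}
```

In a real agent the two counters would be replaced by calls into the agent's own event handlers and its idle procedure; the simulation only shows when each would fire.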

4.4 The Setup File

The agent library contains code that loads a setup file setup.pl, which is needed by each agent, and which is required to be present in either the agent's current directory or (under UNIX) in the home directory of the user who is running the agent. This file contains a small number of setup parameters that are changed when the agent system is moved to a new machine or file system.

4.4.1 The Rootdata Parameter

The most important parameter in the setup file is

rootdata(PortNumber, HostName).
which specifies the fixed machine address at which the root facilitator agent is to be installed. When an agent (either a new facilitator agent in the case of a hierarchical configuration or a domain agent) connects to the system, this information is used to locate the root facilitator agent and request connection information. PortNumber is expressed as an integer, and HostName as a Prolog atom, as in the following example:
   rootdata(3333, 'trestle.ai.sri.com').

The PortNumber and HostName values can also be specified as command line arguments or as environment variables, in which case they override whatever values may appear in the setup file. The order of precedence is:

  1. command line arguments
  2. environment variables
  3. setup.pl
That is, command line arguments will override environment variables, which will override values in setup.pl.

To express these values as command line arguments, include the following flags on the agent's command line:

-oaa_host <HostName> -oaa_port <PortNumber>

To express them as environment variables, use the variables OAA_HOST and OAA_PORT. This is typically done, under UNIX, using the following commands:

setenv OAA_HOST <HostName>
setenv OAA_PORT <PortNumber>

If these values are expressed using command line arguments (or environment variables), it is required that both values be defined that way. If only one of the two is defined, it will be ignored.
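The precedence rules above can be sketched in C as follows. This is an illustration only, not the agent library's own code: the RootData type and the flag_value and resolve_rootdata functions are hypothetical names. The sketch assumes that an incomplete pair from one source (for example, -oaa_host without -oaa_port) is ignored as a whole and that the next source in the order of precedence is then consulted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a facilitator address; not an OAA library type. */
typedef struct {
    const char *host;
    int port;
} RootData;

/* Return the argument that follows `flag` in argv, or NULL if absent. */
static const char *flag_value(int argc, char **argv, const char *flag)
{
    int i;

    assert(flag != NULL);
    for (i = 1; i + 1 < argc; i++)
        if (strcmp(argv[i], flag) == 0)
            return argv[i + 1];
    return NULL;
}

/*
 * Resolve the root facilitator address: command line first, then the
 * environment, then the values read from setup.pl. A source is used only
 * if it supplies BOTH the host and the port.
 */
RootData resolve_rootdata(int argc, char **argv,
                          const char *env_host, const char *env_port,
                          RootData from_setup_file)
{
    RootData result = from_setup_file;
    const char *host = flag_value(argc, argv, "-oaa_host");
    const char *port = flag_value(argc, argv, "-oaa_port");

    if (host == NULL || port == NULL) {   /* incomplete pair: ignore it */
        host = env_host;
        port = env_port;
    }
    if (host != NULL && port != NULL) {
        result.host = host;
        result.port = atoi(port);
    }
    return result;
}
```

A caller would typically pass getenv("OAA_HOST") and getenv("OAA_PORT") as the two environment arguments, together with whatever values were read from the rootdata entry in setup.pl.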

One other variable which can be set on an agent's command line is the internal name for the agent, usually hardcoded within an agent. This value can be set using the arguments -oaa_name NAME on the command line.

4.5 Basic Program Elements

There are a small number of program elements that each agent should include to define its basic behavior relative to the OAA; these provide information needed by the agent library in performing various tasks for the agent.

5. Providing Services

Most agents provide one or more services that may be requested by the user or by other agents. The OAA provides the framework by which these services are requested, and the agent library handles the necessary communications. Each agent provides an interface that makes it possible for the agent to receive and respond to requests for its services. As is shown in this section, the agent library makes it very simple, when implementing an agent, to set up this interface. There are only two things that must be supplied in order to create an agent's services interface: (1) a solvable declaration, and (2) a do_event procedure that contains an implementation for each service listed in the solvable declaration.

5.1 The 'solvable' Declaration

The solvable declaration has a single argument that lists the agent's services as goals that the agent knows how to solve:
solvable(GoalList).
For example, the solvable declaration for an extremely simple mail agent might look like this:
   solvable(
      [last_message(_MessageNum),
       get_message(_MessageNum, _Msg),
       send(mail, _ToPerson)
      ]).

In writing a solvable declaration, it may be helpful to understand that each goal specification will be used according to the semantics associated with Prolog variables and constants. That is, when the facilitator receives a request to solve some goal, it uses unification to test that request against the goals contained in the solvable declarations of its connected agents.
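To make the role of unification concrete, the following C sketch tests a request against a single solvable. It is a deliberate simplification and not the facilitator's implementation: the Term type and the is_variable and matches_solvable functions are hypothetical, only flat terms (no nested structures) are handled, and repeated variables are not checked for consistency.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_ARGS 4

/* A flat term such as send(mail, _ToPerson); nested terms are not modelled. */
typedef struct {
    const char *functor;
    int arity;
    const char *args[MAX_ARGS];
} Term;

/* Prolog convention: a variable starts with an uppercase letter or '_'. */
static int is_variable(const char *s)
{
    return s[0] == '_' || isupper((unsigned char)s[0]);
}

/*
 * Return 1 if the request could unify with the solvable, 0 otherwise.
 * Functor and arity must agree; in each argument position either side may
 * be a variable, otherwise the two constants must be identical.
 */
int matches_solvable(const Term *request, const Term *solvable)
{
    int i;

    assert(request->arity <= MAX_ARGS && solvable->arity <= MAX_ARGS);
    if (strcmp(request->functor, solvable->functor) != 0 ||
        request->arity != solvable->arity)
        return 0;
    for (i = 0; i < request->arity; i++) {
        const char *a = request->args[i];
        const char *b = solvable->args[i];
        if (is_variable(a) || is_variable(b))
            continue;               /* a variable matches anything */
        if (strcmp(a, b) != 0)
            return 0;               /* two different constants */
    }
    return 1;
}
```

With the mail agent's solvables, a request send(mail, 'Adam') matches send(mail, _ToPerson), whereas send(fax, 'Adam') does not, because the constants mail and fax differ.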

5.2 The 'do_event' Procedure

When an agent receives a request for one of its services, the agent's do_event procedure is called. To handle these calls, each agent must define code for implementing each of the goals listed in its solvable declaration.

'do_event' is called with the incoming goal to solve, and an argument CallingKS which specifies the agent that requested the service be performed. CallingKS may be used in conjunction with the address(KS) parameter of the solve function to send additional information back to the requesting agent.

For example, for the solvables declared above, in a Prolog implementation, the following do_event rules would need to be defined:

   do_event(CallingKs, last_message(MessageNum)) :-
       <code implementing the last_message solvable ...>

   do_event(CallingKs, get_message(MessageNum, Msg)) :-
       <code implementing the get_message solvable ...>

   do_event(CallingKs, send(mail, ToPerson)) :-
       <code implementing the send solvable ...>
In agent libraries for languages such as C, Delphi or Visual Basic, do_event is a function in which an if or switch statement handles each of the solvable request events. The above example, written in C, might be:
#include <stdio.h>    /* sprintf */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, strcmp */

/* list_len() is not defined here; it is assumed to come from the agent library. */
int do_event(char *CallingKs, char *func, char *args, char **answers)
{
   int arity = 0;
   int success = 1;
 
   /* Default: predicate succeeds without returning variables */
   *answers = malloc(strlen(func)+strlen(args)+10);
   if (*args) {
      sprintf(*answers, "[%s(%s)]", func, args);
      arity = list_len(args);
   }
   else
      sprintf(*answers, "[%s]", func);
 
   /*--- last_message(Message) ---*/
   if ((strcmp(func, "last_message") == 0) && (arity == 1)) {
      /* Code implementing the last_message solvable */
   } else
 
   /*--- get_message(MsgNum, Msg) ---*/
   if ((strcmp(func, "get_message") == 0) && (arity == 2)) {
      /* Code implementing the get_message solvable */
   } else
 
   /*--- send(mail,ToPerson) ---*/
   if ((strcmp(func, "send") == 0) && (arity == 2)) {
      /* Code implementing the send solvable */
   } else

      return 0;   /* Unknown predicate not handled */

   return 1;	  /* Solvable processed */
}
   
In the C, Visual Basic and Delphi libraries, the 'do_event' function returns success, failure or multiple solutions to the incoming request in the variable answers. For an event request send(mail,'Adam'), a successful action would return "[send(mail,'Adam')]" in answers, and failure would be indicated by returning the empty solution list "[]". Multiple solutions can be returned as well: the query manager(adam,X) could return two solutions by storing "[manager(adam,jerry),manager(adam,doug)]" in answers.
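The solution-list format described above can be produced incrementally. The helper below is a sketch, not part of the agent library: add_solution is a hypothetical name, and it assumes that answers holds either NULL or a malloc'd string, and that individual solutions are already formatted as ICL terms.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Append one solution to the solution list in *answers.
 * *answers must be NULL or a malloc'd string such as "[]" or "[a,b]".
 * Returns 1 on success, 0 if memory could not be allocated (in which
 * case *answers is left unchanged).
 */
int add_solution(char **answers, const char *solution)
{
    size_t old_len;
    char *buf;

    assert(answers != NULL && solution != NULL);
    old_len = (*answers != NULL) ? strlen(*answers) : 0;

    /* Room for the old text, the new solution, "[" or ",", "]" and '\0'. */
    buf = realloc(*answers, old_len + strlen(solution) + 3);
    if (buf == NULL)
        return 0;

    if (old_len <= 2)
        strcpy(buf, "[");           /* NULL or "[]": start a fresh list */
    else
        buf[old_len - 1] = ',';     /* turn the closing ']' into a separator */

    strcat(buf, solution);
    strcat(buf, "]");
    *answers = buf;
    return 1;
}
```

Calling add_solution twice, with manager(adam,jerry) and then manager(adam,doug), yields the two-solution list shown above. If no solution is ever added, the caller would still need to store "[]" in answers to signal failure.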

There is also the possibility of delaying having to provide a solution to a 'do_event' request until some future time. For instance, if a robot has been asked to achieve a certain position, this is not a goal that can be achieved instantaneously. Using

some_delay_id = delay_solution(answers)
the agent can exit from do_event without providing an immediate solution, and when at some later time the goal has been attained (or failed), the agent can send the response to the goal using
return_delayed_solutions(some_delay_id, solutions)
The delayed solution functionality has not yet been added to the Prolog library.



6. Requesting Services: the 'solve' Procedure

While performing a task, an agent can often make use of information and services provided by other agent knowledge sources. The mechanism for making requests of other agents is encapsulated in a single procedure, called solve(). The solve() procedure provides different methods of obtaining information from another agent in response to a query, as specified by a list of parameters.

6.1 Prolog-Style Queries

The standard way (in Prolog) to request a solution to some query is to call the procedure

solve(Goal, ParameterList).

Note: Other programming languages such as C or Delphi have their own syntactic variants for all of the library routines described in this developer's guide. For instance, in C, the solve procedure is written as solve(char *Goal, char *ParameterList, char **answers).

By default (that is, with an empty ParameterList, [], or when calling the solve/1 procedure, which assumes an empty parameter list), this procedure behaves just as if the procedure Goal were being executed locally by the Prolog system: it can fail or succeed, and multiple solutions can be obtained through backtracking. This is a very convenient and natural method of writing code when creating an agent's functionality.

When a goal is sent to the Facilitator through a solve request, the Facilitator looks for agents who can provide a response to the goal by matching the goal to the solvable lists of all connected agents. If multiple agents have indicated that they can return solutions to the goal in question, the default behavior used by the Facilitator is to send the request to all pertinent agents and to wait until all agents have responded, collecting the solutions from each agent and routing the set of all solutions back to the requesting agent. If you wish to override this default behavior, perhaps having individual agents send their specific solutions back to the requesting agent separately, use the asynchronous parameter.

However, this default use of the solve procedure, while providing all of the power and expressiveness of Prolog, also has its shortcomings; for example, execution of the agent calling solve is suspended until the goal has been solved, either succeeding or failing. This goal may take a certain time to resolve, as it will be posted to the facilitator, and then routed to another agent on another machine for processing. And during this time, the agent posting the query is dormant, waiting for the solution to return. There are, however, a number of alternative methods of posting queries, using the ParameterList, as explained below.

6.2 Prolog-style Queries with Caching

In some particular cases, an agent may decide to use solve/2 with the cache parameter; that is,

solve(Goal, [cache]).
This behaves just like solve(Goal, []), except that once the goal is computed, the solution is stored locally in the agent's database. The next time that the same goal is requested, the answer is found immediately, without accessing a remote agent. This optimization must be used with care, as the agent is responsible for maintaining the coherency of its own cache: if a solution is subject to changes over time, it is safer to omit the cache parameter.

The cache option should correctly handle subsumption. Imagine that a user first issues the request solve(hotel('fairmont', Info),[cache]), which returns one solution, storing this in the cache. If at a later time, the user asks the query solve(hotel(Any,Info),[cache]), it would be an error to return only the solution stored in the cache; rather, the system must recognize that the cache does not have information stored which subsumes this query, so a new query must be posted over the network. Subsumption is currently handled correctly by the Prolog agent library, but is handled incorrectly by the agent libraries for other programming languages.
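The subsumption test itself is small. The C sketch below shows one way to decide whether a cached goal is at least as general as a new query; it is an illustration under simplifying assumptions, not the code of any OAA agent library. The Goal type and the is_var and cache_subsumes functions are hypothetical, and only flat terms are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_GOAL_ARGS 4

/* A flat goal such as hotel('fairmont', Info). */
typedef struct {
    const char *functor;
    int arity;
    const char *args[MAX_GOAL_ARGS];
} Goal;

/* Prolog convention: a variable starts with an uppercase letter or '_'. */
static int is_var(const char *s)
{
    return s[0] == '_' || isupper((unsigned char)s[0]);
}

/*
 * Return 1 if `cached` subsumes `query`, i.e. if binding only the
 * variables of `cached` can make it identical to `query`.
 */
int cache_subsumes(const Goal *cached, const Goal *query)
{
    int i, j;

    assert(cached->arity <= MAX_GOAL_ARGS && query->arity <= MAX_GOAL_ARGS);
    if (strcmp(cached->functor, query->functor) != 0 ||
        cached->arity != query->arity)
        return 0;
    for (i = 0; i < cached->arity; i++) {
        const char *c = cached->args[i];
        const char *q = query->args[i];
        if (!is_var(c)) {
            /* A cached constant covers only that same constant. */
            if (is_var(q) || strcmp(c, q) != 0)
                return 0;
            continue;
        }
        if (strcmp(c, "_") == 0)
            continue;   /* anonymous variable: every occurrence is distinct */
        /* A repeated cached variable must map to the same query argument. */
        for (j = 0; j < i; j++)
            if (strcmp(cached->args[j], c) == 0 &&
                strcmp(query->args[j], q) != 0)
                return 0;
    }
    return 1;
}
```

For the hotel example, a cache entry for hotel('fairmont', Info) does not subsume the query hotel(Any, Info), so the query has to go over the network; the reverse direction does hold.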

An agent can choose to clear its own cache by using the primitive clear_cache.

6.3 Limiting the Number of Solutions

As mentioned above, the default use of solve/2, solve(Goal, []), causes the responding agent(s) to generate all possible solutions to Goal. In some cases, depending on how the called procedure is written, finding multiple solutions may have undesirable side-effects. Thus, if the calling program knows that it needs only a single solution, or a limited number of solutions, it can prevent the responding program from finding all possible solutions, by using the solution_limit(N) parameter:

solve(Goal, [solution_limit(N)]),
where N can be any positive integer. This parameter tells the responding program to find at most N solutions.

6.4 Using the Requesting Agent as a Solver

When a request is received by the Facilitator from some agent, the Facilitator's normal behavior is to solicit solutions from all agents that claim to handle that sort of request, except for the requesting agent itself. If it is desired that the requesting agent be included as a solver of a request, that may be indicated using the reflexive parameter:

solve(Goal, [reflexive]).
Note, then, that for a requesting agent to be used as a solver of its own request, there are two conditions that must hold: first, the request must match one of the requesting agent's solvables; and second, the parameters of the request must include reflexive.

6.5 Requesting Solutions from a Specific Agent

In general, when requesting the solution to a goal, an agent posts the request to the facilitator without knowing which remote agent (or agents) will solve the goal -- the facilitator is responsible for transparently locating an appropriate agent (or agents). However, if the address of a specific agent is known (for instance, in the case of a hierarchical configuration, perhaps we want to request that a specific agent residing on another facilitator be the one to attempt a solution), the address can be indicated in the parameter list as

solve(Goal, [address(AgentAddress)]).
Addresses are specified as the agent's name, followed by the facilitator path in the hierarchy. For example, database.navy.military would correspond to an agent named 'database', connected to a facilitator named 'navy', which in turn is connected to a facilitator named 'military', which is in turn connected to the root facilitator agent.
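The dotted address notation is easy to take apart. The following C sketch splits an address into the agent name and the facilitator path; the AgentAddress type, the parse_address function and the size limits are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_PATH_LEN 8     /* arbitrary limit on facilitator path length */
#define MAX_NAME_LEN 32    /* arbitrary limit on a single name, incl. '\0' */

typedef struct {
    char agent[MAX_NAME_LEN];                 /* e.g. "database" */
    char path[MAX_PATH_LEN][MAX_NAME_LEN];    /* e.g. "navy", "military" */
    int path_len;
} AgentAddress;

/*
 * Split "agent.fac1.fac2..." into its components.
 * Returns 1 on success, 0 if a component is empty or a limit is exceeded.
 */
int parse_address(const char *text, AgentAddress *out)
{
    int n = 0;   /* number of components stored so far */

    assert(text != NULL && out != NULL);
    out->path_len = 0;
    for (;;) {
        size_t len = strcspn(text, ".");   /* length up to next '.' or end */
        char *dest;

        if (len == 0 || len >= MAX_NAME_LEN || n > MAX_PATH_LEN)
            return 0;
        dest = (n == 0) ? out->agent : out->path[n - 1];
        memcpy(dest, text, len);
        dest[len] = '\0';
        n++;
        if (text[len] == '\0')
            break;
        text += len + 1;                   /* skip past the '.' */
    }
    out->path_len = n - 1;
    return 1;
}
```

For database.navy.military this gives the agent name database and the two-element path navy, military, listed from the agent's own facilitator outward.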

Agents can also be referred to by their unique internal ID, created by an agent's Facilitator. This internal ID for an agent can be obtained through primitives such as can_solve(Goal, AgentList) (which returns a list of IDs for agents whose solvable list matches the desired Goal), or sent by the Facilitator to an agent as a parameter in do_event().

6.6 Broadcasting Messages

Sometimes, an agent just wants to send out a message to all applicable agents without expecting a response. A normal solve call expects that an agent will respond to a query, either in agreement (success) or in disagreement (failure). To avoid waiting for a response to a query, and to avoid the production of a response message, use the broadcast argument in solve/2's parameter list.

6.7 Test-locatable Queries

Using the test parameter, you can specify a test to be executed on a local facilitator, and if the test succeeds locally, only then will the goal be solved on that facilitator. This is useful, for example, when specifying a goal such as "send a message to the interface agent, but only on a facilitator where the user's name is 'phil'" :

solve(ui_inform('message'), [test(ksname('phil')), broadcast]).

Note: the test parameter is useful only when multiple facilitators are used.

6.8 Controlling Hierarchical Search

When agents are arranged in a hierarchy of multiple facilitators, search will continue climbing higher until a solution can be found to the query. To limit the search to a certain number of levels, use the level_limit parameter to solve/2. For example,

solve(Goal, [level_limit(0)])
will only attempt to solve the goal locally, using the immediate facilitator.

6.9 Controlling Search Time

Except when the broadcast or asynchronous parameters are specified, the solve/2 procedure blocks until a solution to a query has been found (either in a positive way - success - or a negative way - no solution can be obtained). For this reason, it is sometimes desirable to set a limit on the amount of time a remote agent can use to satisfy a query. The time_limit parameter is used to limit the amount of solve-time to a given number of seconds, with solve/2 failing if no solution can be found in this amount of time.

The time_limit parameter encloses a positive real number, as in the following example:

   solve(Goal, [time_limit(10.5)]),
which indicates that the responding agent is not to be given any more than 10.5 seconds to respond. If it happens that the responding agent does not respond in the allowed time, the facilitator causes the call to solve/2 to fail.

However, it is still possible that the responding agent will return solutions after the time limit has expired. If this happens, the solutions are still returned to the requesting agent; that is, a solved event is sent to the requesting agent. This makes it possible for the requesting agent to make use of the belated response, if it takes the required steps to make use of that event. The way to do this is described in Section 6.12.

6.10 And-Parallel Queries

The agent library also provides means of executing goals in parallel. Some goals can be written more efficiently by using the and_parallel parameter. For example,

    send_appointment(Person1, Person2) :-
        solve(get_email(Person1, Email)),
        solve(get_appointment(Person2, App)),
        send_mail(Person1, App).
can be written:
    send_appointment(Person1, Person2) :-
        solve([get_email(Person1,Email),
               get_appointment(Person2, App)], 
              [and_parallel]), 
        send_mail(Person1, App).

In the latter case, assuming that the two solvables are controlled by different agents, they will both be executed simultaneously. When both have terminated, execution continues normally. This is typically faster than the first version, which waits until the first query has completed before launching the second. Variable bindings can be shared among the parallel processes, as long as no ordering among the processes is required (e.g. process 1 must execute before process 2).

6.11 Or-Parallel Queries

In a similar vein, the or_parallel parameter runs several goals in parallel, continuing as soon as any one of the goals succeeds. When backtracking, all solutions will be returned for each of the concurrent processes.

    print_info(Person) :-
        solve([info_in_db1(Person, Info),
               info_in_db2(Person, Info)],
              [or_parallel]),
        display(Info), 
        fail. % backtrack over all solutions

This goal will display all information found about Person in both databases #1 and #2. However, the solutions will be printed asynchronously, as they arrive from the two databases. Notice that the variable Info does not unify across the parallel goals as it would with the and_parallel parameter.

6.12 Asynchronous Queries

In order to request information from remote agents in a non-blocking fashion, use solve/2 with the asynchronous parameter.

solve(Goal, [asynchronous])
In this way, an agent's computation can continue while the query is being resolved. This is more efficient for the local agent, but it is more difficult to program.

The results of an asynchronous query will generate an event of the form

solved(WhoSolved, Query, Params, SolutionList).
The WhoSolved parameter will contain the internal ID of the agent who solved the query, or a list of IDs if multiple agents contributed to the SolutionList (see solve's default behavior for a description of how multiple agents may answer a query).

One way of handling the results of an asynchronous query is to write a handler for the solved/4 message in do_event. Another way is to trap the event by a local event trigger, which supplies the action to be executed when the query has terminated.

The following example posts the query to be solved, and then sets up a trigger to display the results of the goal asynchronously when they are returned. While the goal is being executed remotely, the agent can perform other actions.

   interface :-
      get_query_from_user(PostableQuery),
      solve(PostableQuery, [asynchronous]),
      add_local_trigger(self, event, when, on_receive,
                        solved(_WhoSolved, PostableQuery, _Params, Solutions),
                        true,
                        display(Solutions)).




7. Triggers

Each agent can install triggers either locally for itself, or remotely, on either the facilitator or on another agent. There are currently four types of triggers:

Event triggers: any incoming or outgoing event (message) may be monitored. For instance, a simple event trigger may say something like:

"Whenever a solution to a query is returned by the facilitator, send the result to the presentation manager to be displayed to the user."

Data triggers: data triggers monitor the state of the global information written to the facilitator. An example data trigger might be:

"Whenever the state of the reactor becomes unstable, send an alert message to the shutdown system."

A data trigger is always installed on the facilitator, as the facilitator monitors all global data for processes. Individual database agents are expected to provide their own trigger mechanisms, patterned after the data trigger functionality provided by the facilitator agent.

Test triggers: Test triggers are monitored after the processing of each incoming event, and also whenever a timeout occurs in the event polling. The test condition may specify any goal executable by the local agent meta-interpreter. Test triggers are useful for examining internal events that do not come through the facilitator. For example, a mail agent might watch for new incoming mail, or an airline database agent may monitor which flights will arrive later than scheduled.

Alarm triggers: Alarm triggers (also known as Time triggers) monitor time conditions. Alarm triggers can keep track of a single fixed point in time (e.g. "On December 23rd at 3pm"), or can handle recurrent triggers (e.g. "Every three minutes from now until noon").

To install a local trigger, use the command

add_local_trigger(self, Kind, Type, OpMask, Template, Condition, Action),
and to install a remote trigger, use one of the commands
add_trigger(KS, Kind, Type, OpMask, Template, Condition, Action).
or
add_trigger(Kind, Type, OpMask, Template, Condition, Action)

add_local_trigger/7 causes a test or event trigger to be installed locally, on the agent that calls it. add_trigger/7 causes a trigger to be installed on the specified agent (KS). add_trigger/6 causes a trigger to be installed on the facilitator, in the case of data and event triggers. In the case of test triggers, add_trigger/6 causes the trigger to be installed on all agents that can solve the given Condition, as indicated by their solvable declarations. For alarm triggers, add_trigger/6 will install the specified trigger on an alarm agent currently connected to the facilitator.

As mentioned above, data triggers can only be installed on the facilitator; thus it is incorrect to attempt to install a data trigger using either add_local_trigger/7 or add_trigger/7. Similarly, alarm triggers are only installed on alarm agents, although add_trigger/7 may be used to specify the particular alarm agent on which the trigger is to be installed.

The first argument of add_local_trigger/7 (self) and that of add_trigger/7 (KS) refer to agents (knowledge sources), but have different meanings. In the case of add_local_trigger, this argument indicates what agent requested the trigger. In the case of add_trigger/7, the first argument indicates the agent on which the trigger is to be installed.

The Kind parameter specifies one of the trigger kinds described above, either 'event', 'data', 'test' or 'alarm'.

The Type parameter specifies the duration of the trigger. 'if' and 'when' triggers execute only once. (The difference between the two is only from an English perspective: 'if' indicates that an event might happen, whereas 'when' states that an event will happen). However, a 'whenever' trigger remains active after firing, continuing its watch for matching events. Note: the Type parameter is not used by 'alarm' triggers since alarm triggers have their own notion of recurrence built-in.

The OpMask parameter is used only with event and data triggers, and specifies under what circumstances the trigger is to be considered for execution. For an event trigger, OpMask may be either 'on_send', 'on_receive', a list containing both of these, or a variable. The use of a variable, which has the same effect as a list containing both 'on_send' and 'on_receive', simply means that the trigger should be considered in both situations. For a data trigger, OpMask may be either 'on_write', 'on_retract', 'on_replace', 'on_write_replace', a list containing any combination of these, or a variable. 'on_write', 'on_retract', 'on_replace', and 'on_write_replace' correspond to the use of the agent library procedures write_bb, retract_bb, replace_bb, and write_replace_bb.

The Template parameter is also used only with event and data triggers, and controls, by unification, what events or data items cause the trigger to fire. In the case of an event trigger, Template should have the same form as an event that is expected to be sent or received by the agent on which the trigger is installed. In the case of a data trigger, Template should have the form data(Item, Value).

With test or alarm triggers, the values of OpMask and of Template may be anything, as they are unused.

The Condition parameter contains a test which must succeed in order for the trigger to fire. In the case of event trigger or data triggers, Condition contains an arbitrary test expression which will be evaluated in addition to successful matches of the OpMask and Template parameters. An example might be

add_trigger(data, whenever, [on_replace],
            position(car1, X, Y),
            ( solve(position(target, X2, Y2)),
              solve(distance(X, Y, X2, Y2, D)),
              D < 100 ),
            perform(some_action)).

In this example, as the position of car1 changes, the condition of the trigger specifies that if its distance to a target becomes less than 100, the trigger should fire.

As we have just seen, the Condition parameter provides an arbitrary additional test expression for event and data triggers. However, for alarm and test triggers, since the OpMask and Template parameters are not used, the Condition parameter is the sole determiner of whether the trigger fires.

The Action parameter is the body of the trigger; that is, what is to be executed when the trigger fires. This is in the form of an ICL expression which will be posted to the facilitator for execution once the conditions of the trigger have been met.
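To tie the parameters together, here is a minimal sketch that installs one local event trigger and one data trigger, following the two informal examples given at the start of this section. The present/1 and alert/2 actions and the reactor_state item are hypothetical names. The definitions of add_local_trigger/7 and add_trigger/6 shown here are stand-ins that only record each request, so that the fragment can be loaded outside an agent; in a real agent they are supplied by the agent library and the stand-ins must be omitted.

```prolog
:- dynamic installed/2.

% Stand-ins for the agent library procedures, for illustration only:
% they merely record each request so that the fragment can be loaded
% and inspected outside a running agent community.
add_local_trigger(Agent, Kind, Type, OpMask, Template, Condition, Action) :-
    assertz(installed(local(Agent),
                      trigger(Kind, Type, OpMask, Template, Condition, Action))).

add_trigger(Kind, Type, OpMask, Template, Condition, Action) :-
    assertz(installed(facilitator,
                      trigger(Kind, Type, OpMask, Template, Condition, Action))).

install_demo_triggers :-
    % Event trigger, local: each time a solved/4 event is received,
    % pass its solutions to a (hypothetical) presentation solvable.
    add_local_trigger(self, event, whenever, on_receive,
                      solved(_Who, _Query, _Params, Solutions),
                      true,
                      present(Solutions)),
    % Data trigger, on the facilitator: fire when reactor_state is
    % written or replaced with the value 'unstable'.
    add_trigger(data, whenever, [on_write, on_replace],
                data(reactor_state, unstable),
                true,
                alert(shutdown_system, reactor_unstable)).
```

In the data trigger, the Template does the filtering by unification, so the Condition is simply true; a Condition would be needed only for a test that unification cannot express, such as a numeric comparison.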

Triggers can be installed programmatically, as described above, using the add_trigger and add_local_trigger agent library routines. Another excellent way of adding triggers (especially test and alarm triggers) is through natural language. Here are some examples of natural language trigger expressions and their programmatic equivalents:





8. Data Management

Each facilitator provides a global data repository that can be used cooperatively by its client agents. In combination with the use of triggers, this allows for a group of agents to organize their efforts around a "blackboard" style of communication.

The write_bb, read_bb, retract_bb, write_replace_bb and replace_bb procedures are used to read and write this global data.

A call to any of these, except for read_bb, causes the facilitator to check its active data triggers, and may result in the firing of one or more of them.

Writing Data

Any client agent may post data on its parent facilitator by calling

write_bb(Item, Data).
This causes the facilitator to record the data element <Item, Data>, in precisely the form that they are passed to it. Both Item and Data are unconstrained, and there may be multiple values of Data recorded for a given value of Item.

As an example, the DCG-NL agent (Definite Clause Grammar Natural Language agent), which provides natural language processing services for a variety of other agents, expects those other agents to post the vocabulary that they are prepared to respond to, with an indication of each word's part of speech, and of the logical form that should result from the use of that word. In the Office Assistant system, a number of agents make use of these services. For instance, the database agent uses the following call to post the noun `boss', and to indicate that the "meaning" of boss is the concept `manager':

write_bb(noun, [manager, [atom(boss)]]).

Reading Data

Any client agent may read data from its parent facilitator by calling

read_bb(Item, Data).
This call will return, via backtracking, all stored data items that unify with both Item and Data.

Replacing Data

Any client agent may replace data on its parent facilitator by calling
replace_bb(Item, OldValue, NewValue).

This causes the facilitator to first remove all data elements that unify with both Item and OldValue, and then to record the data element <Item, NewValue>.

A call to replace_bb causes the facilitator to check its active data triggers, and may result in the firing of one or more of them.
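The following sketch illustrates the documented behavior of write_bb, read_bb, and replace_bb. The three procedures are defined here as local stand-ins over a dynamic predicate (bb/2, a hypothetical name), purely so that the example can be run outside an agent; they are not the facilitator's implementation and they do not check data triggers. In a real agent, the agent library versions are used instead.

```prolog
:- dynamic bb/2.

% Local stand-ins that imitate the documented behaviour of the
% facilitator's data procedures; no triggers are checked here.
write_bb(Item, Data) :-
    assertz(bb(Item, Data)).

read_bb(Item, Data) :-
    bb(Item, Data).

replace_bb(Item, OldValue, NewValue) :-
    retractall(bb(Item, OldValue)),
    assertz(bb(Item, NewValue)).

% Record two readings, replace them with a single value, and
% collect everything stored under the item via backtracking.
demo(Readings) :-
    write_bb(temperature, 20),
    write_bb(temperature, 22),
    replace_bb(temperature, _Any, 25),
    findall(T, read_bb(temperature, T), Readings).
```

Calling demo(Readings) yields Readings = [25]: because OldValue is left unbound, replace_bb removes both earlier readings before recording the new one.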



9. Creating Agents

9.1 Basic Steps

The process of creating a new agent consists of the following steps:

The value of Class should be either 'root', 'node', or 'leaf'. The value 'root' is used only by the root facilitator agent, 'node' is used by other facilitator agents (if a hierarchical facilitator structure is in use), and 'leaf' is used by all other agents. KSName (Knowledge Source Name) may be any atom that uniquely characterizes the agent at the level of the local facilitator, and FacilitatorName (in some places this is called BBName) should be the name of the facilitator agent that this agent wants to connect to. The root facilitator agent's identifier is always 'root'.

After connecting to the appropriate facilitator and performing other essential initializations, the go/3 procedure starts an event loop for the agent.

Please see the Agent Development Tools User Manual for a description of how the process of creating new OAA agents can be partially automated.

9.2 Wrapper Agents

Often, a domain agent will serve as an OAA "wrapper" to some existing application, such as to a database or calendar program. A variety of means may be used to enclose an existing application within an agent "wrapper":

9.3 ICL Parsing Procedures

Agents that are not written in Prolog will need to parse incoming ICL requests, which use a Prolog syntax. The Agent Library for each of these languages includes a set of ICL parsing procedures to facilitate breaking apart incoming requests into appropriate components.

9.4 Programming Recommendations

Included here are several recommendations for programming agents within the OAA.

  1. When requesting information from the facilitator, don't access a specific agent by name (using the address parameter) - just specify the information you require. This will allow the facilitator the flexibility to find the best means of obtaining a solution as new agents become available.
  2. Separate the user interface from application functionality. Doing so allows the application to be ported rapidly to another platform by simply rewriting only the user interface agent.
  3. Make sure to separate domain-independent information from domain-specific information. Create agents that will be reusable by a large number of applications.




10. Invoking and Monitoring an Agent System

Most sections of this Developer's Guide are concerned either with the design and structure of individual agents, or with the interactions that take place between agents in a running system. Here we consider, given a collection of agents that have been selected and/or constructed to work together as a system, what needs to happen in order to set that system in motion, and to ensure that it continues to operate.

We describe, below, how to make use of the OAA's execution manager, Start-It, which is designed to minimize the effort involved in invoking and monitoring a system. However, there are times, especially during debugging, when it is important to be able to start or stop an agent (or a system of agents) "manually"; that is, from the command line of the agent's execution platform.

In starting an individual agent, there are only two general requirements to remember. First, the agent must be able to obtain the correct host and port of its facilitator; this is normally obtained from the setup file. Second, the agent must request a connection to its facilitator; this is normally accomplished by calling go/3.

The agent library provides a callback to each agent, app_init, which occurs as a result of calling go/3. When implementing an agent, any agent-specific initializations should be handled by the app_init procedure. This helps to keep the startup procedure for the agent as simple as possible.

In starting up an entire system, the only general requirement to remember is that each facilitator must be started up before its client agents. More precisely, the facilitator must be listening to its assigned port before any of its clients attempt to make a connection.

10.1 Start-It

Start-It provides a graphical user interface through which a collection of agents may be controlled. (Start-It may also be used to control non-agent processes.) This interface, in turn, is controlled by the contents of a user-editable configuration file. The configuration file is used to define each agent's name, execution host, and startup command line. In addition, it may be used to define agent-specific menus that are displayed by Start-It and to select execution parameters for the agent. Figure 3 shows a simple example of Start-It's interface.

Figure 3: Start-It's user interface

10.1.1 Invoking Start-It

Start-It is invoked from the Unix command line using
startit [options] <configuration file>
For example, the instance of Start-It shown in Figure 3 was invoked using
    startit oaa.config
As Start-It is itself an agent, it attempts to connect to a facilitator when it is invoked. Thus, before invoking Start-It, you must ensure that a facilitator is running at the host and port that are currently indicated in the relevant setup file.

Invoking Start-It does not cause any other agents to start up immediately. Rather, its interface provides controls that allow other agents to be invoked.

-no_expect

Start-It takes an optional command line argument -no_expect. If this option is given on startup, Start-It will not try to make use of the `expect' program, which requires that TCL be installed on your system. Expect solves some minor problems with Start-It:

-autostart

If the -autostart option is given, all agents controlled by the Start-It window will automatically be run as soon as Start-It is ready to do so, as if the user had pressed the big blue START button. This feature could be useful if you want to create a login for a particular demo where all agents (including Start-It) would begin executing as soon as X-Windows initializes.

-project <ProjectName>

Selects the initial project to be used on startup. For a description of projects, see the following section.

10.1.2 Start-It's Interface

The variety of controls provided by the interface depends on what is specified in the configuration file that is in use. Figure 3 shows an interface resulting from a very simple configuration file. Configuration file contents are described in the next section.

As shown in the figure, Start-It always provides the Display control and the large button labeled Start. The Display control shows which machine will be used to display the windows in which the various agents are invoked (this is not necessarily the same machine on which the agents are run). Clicking in the small box causes a choice list to appear, which can be used to either choose or type in a different display. The default value is "Local", which means the machine that Start-It is running on.

The Start button causes Start-It to invoke all agents currently being displayed in its main window.

Start-It is capable of maintaining multiple projects, or collections of agents, in a single configuration file. When a project is chosen from the Projects menu, all agents belonging to the project become visible in Start-It's main window, and all agents not belonging to the project are no longer displayed. To display the list of all agents defined in the configuration file, the user may select Global from the Projects menu.

Beneath the Display and Start controls in the main window, there are groups of controls for individual agents. For each agent specified in the configuration file, Start-It provides a status box, a basic menu, and a host selector.

When Start-It has been asked to invoke an agent, the agent's status box is colored either yellow (the agent is initializing but is not yet ready to interoperate), green (the agent is connected to a facilitator and ready to interoperate), or red (the agent has died or is not functioning normally).

An agent's basic menu is accessed by clicking on the large button labeled with the agent's name (such as the button labeled "Database" in Figure 3). When this is done, a menu is displayed containing the four commands Hide, Show Options, Start, and Kill. These commands are used as follows:

The Host control is used to specify the machine on which the agent is to execute. Clicking in the small box causes a choice list to appear, which can be used to either choose or type in a different host. The default value is "Local", which means the machine that Start-It is running on.

10.1.3 Configuration Files

A configuration file for Start-It is an ASCII file that contains a set of simple specifications for each agent. For example, here are the specifications for the Database and Calendar agents that appear in Figure 3:

#------------------------------------------------------
# Employee Database Agent
#------------------------------------------------------
appname Database
oaaname employee_db.root
appdir /home/zuma1/OAA/demo.bin
appline ./d2
end
  
#------------------------------------------------------
# Calendar Agent
#------------------------------------------------------
appname Calendar
oaaname calendar.root
appdir /home/zuma1/OAA/demo.bin
preline setenv ENV_VAR something
appline ./c
onready         inform_ui(calendar, ready)
ondisconnect    [inform_ui(calendar, disconnect), reinit(all)]
end

The specifications shown here for each of the agents, `appname', `appdir', and `appline', are required to be present for each process to be started. If the process is an OAA agent, `oaaname' should be defined as well.

It is also possible, in the configuration file, to specify agent-specific menus or other interface elements for use in selecting operational parameters for various agents. The interface elements are used to allow the user to specify values for different variables which can be referenced in the appline. Command line variables should be written in the form ${varname}.

The following example (for the phone agent shown in Figure 3) allows the user to choose a tty-port (A or B), which is added to the command line of the phone agent.

#------------------------------------------------------
# Phone Agent
#------------------------------------------------------
appname Phone Agent
oaaname phone
appdir /home/zuma1/OAA/demo.bin
appline ./phone ${tty-port}
    menu   tty-port	TTY Port?
       option	ttya	TTY-A
       option	ttyb	TTY-B
    end
end
There are five types of interface components: menu, toggle, text, text-menu, and toggle-text-menu. Descriptions of each are given in the following sections.

Menu

The menu interface control allows a user to select from a number of fixed choices. Its configuration file specification format is as follows:
   menu   varname	Text To Be Displayed
      option	return_value1	Option Text 1
      line
      option	return_value2	Option Text 2
      default	return_default	Default Text
   end
The `line' option is used to create separators within the menu, and is optional. If the `default' field is not specified, the first option listed will be the default value.

Toggle

The toggle interface control allows a user to specify whether some boolean condition is true or false. Its configuration file specification format is as follows:
   toggle   varname	Text To Be Displayed
	true-value	Value To Be Returned If TRUE
	true-label	Label To Be Displayed when TRUE
	false-value	Value To Be Returned If FALSE
	false-label	Label To Be Displayed when FALSE
	set
   end
All parameters for the toggle control are optional. If the `set' parameter is given, the toggle defaults to the value `TRUE'; otherwise it defaults to `FALSE'.

Text

The text interface control allows a user to enter a line of text. Its configuration file specification format is as follows:
   text   varname	Text To Be Displayed
      value	Default Start Value
      size	WidthInChars
   end
All parameters for the text control are optional. The `value' field provides the initial value contained by the text field, and `size' indicates how wide the text field is in characters.

Text-Menu & Toggle-Text-Menu

The `text-menu' and `toggle-text-menu' are composites of the three interface controls listed above, and take all field values provided by these controls.
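As an illustration of how the toggle and text controls might be combined with an appline, here is a sketch of a specification for a hypothetical agent. The agent name, the executable ./recognizer, its command-line flags, and the variable names verbose and model are all assumptions made for the example; only the directive names and their layout follow the formats described above.

```text
#------------------------------------------------------
# Recognizer Agent (hypothetical)
#------------------------------------------------------
appname Recognizer
oaaname recognizer
appdir /home/zuma1/OAA/demo.bin
appline ./recognizer ${verbose} -model ${model}
    toggle   verbose	Verbose output?
        true-value	-v
        true-label	On
        false-value	-q
        false-label	Off
        set
    end
    text   model	Model name
        value	default
        size	20
    end
end
```

With these settings the toggle starts out TRUE because `set' is present, so, if left untouched, the command line would be expected to expand to ./recognizer -v -model default.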

10.1.4 Variable Expansion in Start-It

This section gives a detailed explanation of how Start-It handles configuration file variables.

Variables are expanded ONLY IF they are surrounded with curly braces in this fashion: ${variable}. If they occur without braces (e.g. $variable) they are treated like any other string, and simply passed verbatim ($ sign and all) to the shell started up by the rsh. They will, of course, get expanded "on the other end" of the rsh, so you should be sure some such shell variable will exist there (either because you know it's defined by the shell itself, or in your .cshrc, or in the app's "preline" directive).

Special Variable Names

Currently, there are two variable names built into the system.

One is ${globalDisplay}, which refers to the contents of the display widget in the top left corner of the startit GUI.

The second is ${Host} (notice capitalization), which refers to the contents of the "Host:" widget (located next to each Application Name button on the startit GUI). You may use the form ${application:Host} to specify an application other than the current one (see "Specifying The Application Name" below).

Specifying the application name

The following syntax for variable expansion is also allowed:

 ${application_name:selector}

Whenever a ":" exists within your variable name, this method of resolving the variable is used exclusively.

application_name is one of the applications in your config file, as specified on an "appname" directive. It may contain spaces, or any other character, except a colon. selector is the name of one of the selector widgets (e.g. toggle, menu, text, toggle-text-menu) as specified by the second argument of one of those directives.

        EXAMPLE:

        This makes the host entry of "My Agent" start out to be
        the same host as whatever the "Push To Talk" is set to:

        appname My Agent
        host    ${Push To Talk:Host}

Not specifying the application name

Whenever there is no ":" in the variable name, the following resolutions are tried in order. As soon as one method is able to resolve the variable name, the value it yields is used:
  1. Use the value specified by the selector of the given name in the current application
  2. Search the global selectors (currently, this is only globalDisplay)
  3. Search all applications, in order of appearance
  4. Resolve as an environment variable (local to the startit GUI process), using the getenv() system call
  5. If all else fails, use the literal value inside the ${} (this is almost certainly not what the user wants, but a warning message is printed out when this happens)
Note: by convention, since environment variables tend to be all upper case, startit variables (and the selector names they refer to) should be in lower case. The pre-defined startit variables are in mixed case. This is only convention, however. Startit does not actually look at the case to decide which resolution method to use.
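The resolution order for names without a ":" can be sketched as a single predicate that tries each method in turn and commits to the first that succeeds. This is only a model of the behavior described above, not Start-It's implementation. The selector/3 and global_selector/2 facts are a hypothetical snapshot of the GUI state, the per-application Host widget is modeled as an ordinary selector for simplicity, and getenv/2 is assumed to be available as in SWI-Prolog.

```prolog
% Hypothetical snapshot of the Start-It GUI state.
% selector(Application, SelectorName, CurrentValue), listed in
% configuration-file order.
selector('Test 1',      'Host',     alpha).
selector('Test 3',      'Host',     gamma).
selector('Phone Agent', 'tty-port', ttya).

% global_selector(Name, CurrentValue)
global_selector(globalDisplay, 'Local').

% resolve(+CurrentApp, +Name, -Value)
% Try each resolution method in the documented order and commit to
% the first one that succeeds.
resolve(CurrentApp, Name, Value) :-
    (   selector(CurrentApp, Name, V)   % 1. selector in the current application
    ->  Value = V
    ;   global_selector(Name, V)        % 2. global selectors
    ->  Value = V
    ;   selector(_AnyApp, Name, V)      % 3. all applications, in order
    ->  Value = V
    ;   getenv(Name, V)                 % 4. environment of the Start-It process
    ->  Value = V
    ;   format(user_error,              % 5. literal fallback, with a warning
               'warning: cannot resolve ~w; using it literally~n', [Name]),
        Value = Name
    ).
```

With the facts above, resolve('Test 3', 'Host', V) gives V = gamma through the first method, while resolve('Test 4', 'Host', V) gives V = alpha through the third, because 'Test 4' has no selector of its own and 'Test 1' is the first application listed.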

More Examples

Here are some examples which look very similar, but do quite different things. All of these assume you are using tcsh, which automatically sets the environment variable $HOST to be the current host:
        # This will echo the host "Test 1" is running on, since
        # "$HOST" doesn't get expanded by startit at all
        appdir  /
        appname Test 1
        appline echo $HOST
        end

        # This will echo the host "startit" is running on, since
        # startit will expand "${HOST}" as its own environment var
        appdir  /
        appname Test 2
        appline echo ${HOST}
        end

        # This will echo the contents of "Test 3"'s "Host:" selector
        appdir  /
        appname Test 3
        appline echo ${Host}
        end

        # This will echo the contents of "Test 1"'s "Host:" selector
        appdir  /
        appname Test 4
        appline echo ${Test 1:Host}
        end