Adaptive Process Management: An AI Perspective

Pauline M. Berry and Karen L. Myers

Artificial Intelligence Center,
SRI International,
333 Ravenswood Ave.
Menlo Park, CA 94025

1st Sept. 1998

1.0 Introduction

Many domains of interest to the workflow community are characterized by ever-changing requirements and dynamic environments. Workflow systems must increase in sophistication to provide the reactivity and flexibility necessary to support their operational requirements for process management. Within the AI community, work on reactive control has led to the exploration of techniques for intelligent process management to meet the requirements of adaptivity for dynamic and unpredictable environments. Although motivated by somewhat different concerns and grounded in different perspectives, there is much overlap between the objectives and requirements of these two communities.

In an effort to encourage the exchange of ideas, this paper explores how techniques from intelligent reactive control might be leveraged to provide adaptivity within workflow technologies, while also acknowledging their current limitations for certain workflow requirements. Two systems under development by the authors are described that illustrate how reactive control and other AI techniques can be transitioned to provide a basis for adaptive workflow technology.

2.0 Intelligent Process Management: Two Perspectives

We begin with brief summaries of the models for process management that underlie the workflow and reactive control communities, as a way to highlight the commonalities and differences of the two fields.

2.1 Workflow Management

Workflow is a fast-evolving area that has been influenced by several fields, including CSCW and Distributed Artificial Intelligence. Workflow can be viewed as the instantiation of activities or steps required to fulfill all or part of a process. A Workflow Management (WFM) System provides the systems and services required to automate the workflow, including the definition and enactment of the activities to facilitate the process.

WFM systems can be characterized by the following functional components:

In current WFM technology, process modeling and representation is typically performed at build-time, while the other functions are context-dependent and performed at run-time. Figure 1 illustrates our view of a generic WFM system, adapted from the Workflow Management Coalition's (WfMC) Reference Model [[11]]. One of the aims of this paper is to show how the run-time/build-time distinction could be removed.


Figure 1: Overview of Existing Workflow Management System Technology

2.2 Overview of Existing Workflow Management System Technology

Much work has been done in the workflow community on process modeling, process validation and enactment. Current technologies provide relatively simple form-based procedures [[5]], which suffice for traditional workflow applications. However, as WFM systems move into more complex domains (e.g., production control, telecommunication service provision, military applications), new technologies are required that address domain uncertainty and volatility, reliability, flexibility and reactivity.

There are three areas where runtime adaptivity is needed. The first area is the definition and/or creation of workflow processes. This capability should be at least partially automated, moving the task from build-time to run-time. Doing so would enable processes to reflect changes in the environment, business practices and goals in a timely fashion. The second area is the enactment and repair of workflow processes. As the activities of a process are instantiated, changes in the environment or previous activities may invalidate the current workflow processes. Techniques are required to continuously repair or improve the execution of a workflow process. The third area is the reactive allocation of activities to agents with uncertain and varied availability and capability. The complex set of activities to be scheduled by the WFM system is constantly changing as new processes are instantiated and the environment changes. The agents available to perform the activities may also change due to breakages, illness, maintenance schedules, and other perturbations. Agents may even be tasked by external, third-party agents. Thus, the scheduling algorithms must be able to address these uncertainties to maintain valid and effective agent tasking.

2.3 Intelligent Reactive Control

Process management within the AI community draws extensively on the use of knowledge-based software controllers as embedded systems. These systems are typically organized around an interpreter that runs a tight control loop of sensing to detect key changes in the operating environment or sets of assigned tasks, deliberation to determine how to respond to sensed changes, and acting to execute relevant responses. Control systems vary considerably in the complexity of the approaches that they adopt to deliberation and acting. Predefined procedure libraries may be established that describe sequences of actions and tests that can be performed to achieve some goal, or that serve as appropriate responses to designated events (for example, [[9], [8], [16]]). These libraries can be augmented by plan generation tools that can synthesize processes at runtime based on composition of procedures. Alternatively, model-based reasoning techniques can be used to deduce control rules from explicit descriptions of the domain [[28]].
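The sense, deliberate, act cycle described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the `Procedure` structure, the `control_loop` function, the "first match" selection policy, and the temperature example are all assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Procedure:
    """A library entry (hypothetical structure): a response to a sensed condition."""
    name: str
    trigger: Callable[[State], bool]   # condition that makes the procedure relevant
    action: Callable[[State], None]    # response to execute; mutates the state


def control_loop(state: State, library: List[Procedure],
                 sense: Callable[[State], None], max_cycles: int) -> List[str]:
    """Run sense -> deliberate -> act for a bounded number of cycles."""
    trace: List[str] = []
    for _ in range(max_cycles):
        sense(state)                                          # sensing
        relevant = [p for p in library if p.trigger(state)]   # deliberation
        if not relevant:
            trace.append("idle")
            continue
        chosen = relevant[0]          # simplest possible policy: first match wins
        chosen.action(state)          # acting
        trace.append(chosen.name)
    return trace


def demo_sense(state: State) -> None:
    state["temperature"] += 5.0       # simulated environment heats up each cycle


def cool(state: State) -> None:
    state["temperature"] -= 12.0


demo_library = [Procedure("cool-down", lambda s: s["temperature"] > 30.0, cool)]
demo_trace = control_loop({"temperature": 20.0}, demo_library, demo_sense, max_cycles=5)
print(demo_trace)
```

A real controller would replace the first-match policy with a richer deliberation step, which is where the systems discussed here differ most.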

Such systems fluidly integrate event- and goal-driven activity. While processes are being executed to accomplish current tasks, the operating environment is constantly monitored for changes that require the activation of additional processes, the adaptation of current processes, or the termination of processes in favour of new processes that better suit the current operating environment. Their tight control loops enable rapid reactivity to changes in the operating environment.

The reactive control community has been motivated primarily by domains that involve control of computational processes and physical devices (e.g., robots, satellites, computer networks, agent communities). Initial systems focused on full automation, although more recent work has sought to develop interactive and mixed-initiative methods that involve humans in the control process.

3.0 Techniques for Adaptivity

The reactive control community employs a variety of mechanisms to support adaptivity within process management. Here, we discuss several, pointing out their benefits and limitations relative to requirements of the Workflow community.

3.1 Flexible Representations

At their most powerful, reactive control systems employ highly expressive formalisms for representing events and activities (e.g., [[8], [25]]). Typically, such formalisms employ powerful task representations and rich control constructs (iteration, sequencing, concurrency, monitoring, testing and suspension/resumption, constraints). They generally support the decomposition of processes into individual modules that provide small, coherent units of functionality. Each such unit generally consists of a description of the purpose of that unit (i.e., what it can be used for, either to respond to a tasking request or to some event), and conditions of applicability describing constraints on the usage of the procedure.

Hierarchical representations enable the encoding of complex activities at multiple levels of abstraction. High-level activities can be initiated without concern as to the low-level details for activity implementation. Rather, low-level decisions are made on an as-needed basis when the time comes to execute those actions.
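As a rough illustration of these ideas, the sketch below models a procedure as a purpose, a condition of applicability, and a list of steps, and refines a high-level task lazily so that low-level choices are made only when each step is reached. The `Procedure` fields, the `refine` generator, and the report-delivery example are hypothetical and far simpler than the formalisms cited above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List

State = Dict[str, bool]


@dataclass
class Procedure:
    purpose: str                                    # task this unit can be used for
    applicable: Callable[[State], bool]             # conditions of applicability
    steps: List[str] = field(default_factory=list)  # subtasks or primitive actions


def refine(task: str, library: List[Procedure], state: State) -> Iterator[str]:
    """Lazily expand a task into primitive actions, one level at a time."""
    candidates = [p for p in library if p.purpose == task]
    if not candidates:
        yield task                # no procedure: treat the task as primitive
        return
    for proc in candidates:
        if proc.applicable(state):          # checked only when the task is reached
            for step in proc.steps:
                yield from refine(step, library, state)
            return
    raise LookupError(f"no applicable procedure for task {task!r}")


library = [
    Procedure("deliver-report", lambda s: True, ["write-report", "send-report"]),
    Procedure("send-report", lambda s: s["network_up"], ["email-report"]),
    Procedure("send-report", lambda s: not s["network_up"],
              ["print-report", "courier-report"]),
]

state = {"network_up": True}
actions = refine("deliver-report", library, state)
first = next(actions)             # only the first step has been refined so far
state["network_up"] = False       # the environment changes during execution
rest = list(actions)              # later steps are refined against the new state
print(first, rest)
```

Because the generator defers expansion, the choice between the two `send-report` procedures is made after the network failure, not when the high-level task was initiated.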

Because the reactive control paradigm has been less motivated by distributed applications, requirements from the workflow community such as transactional operations, synchronization primitives, and distributed control protocols have received relatively little attention.

3.2 Process Synthesis

Reactive controllers can select from among defined processes to respond to new tasks and events based on current situational and problem-solving information. For certain tasks, however, it is necessary to consider the long-range ramifications of action choices to ensure, for example, that sufficient resources will be available to complete a particular task. For such situations, plan generation techniques can be used to synthesize new processes from previously defined process templates. While most plan generation work has ignored issues of plan use, efforts have been made recently to combine reactive control with sophisticated plan generation techniques, as a way of enabling the dynamic synthesis of plans at run time in response to changing situations and goals [[27], [13], [22]].

3.3 Monitoring

Monitors play an integral part in reactive control systems. Responses to triggered monitors can range from the invocation of prespecified processes, through adaptations to current activities, to the abandonment of current activities. Today, monitors are mostly created by hand. However, recent work has sought to extract monitors from automatically generated plans through analysis of their derivation structures [[21], [18]].

General representations for specifying monitors are available. However, as more complex applications are considered, recognition of the need for rich theories of monitoring has grown. For example, monitoring capabilities to date have generally been limited to detection of atomic events. New techniques are beginning to emerge that support monitoring of composite events, which consist of collections of atomic events related by specified temporal or mathematical constraints.
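To make the notion of a composite event concrete, the following sketch fires when every one of a set of atomic events has been observed and their most recent occurrences fall within a given time window. The `CompositeMonitor` class, the window-based constraint, and the event names are assumptions chosen for illustration; richer temporal or mathematical constraints would need a more general representation.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


class CompositeMonitor:
    """Detects a composite event built from atomic events (hypothetical sketch)."""

    def __init__(self, required: Iterable[str], window: float) -> None:
        self.required: FrozenSet[str] = frozenset(required)
        self.window = window                      # temporal constraint, in time units
        self.last_seen: Dict[str, float] = {}     # latest occurrence of each event

    def observe(self, name: str, time: float) -> bool:
        """Record an atomic event; return True if the composite event holds."""
        if name not in self.required:
            return False
        self.last_seen[name] = time
        if len(self.last_seen) < len(self.required):
            return False                          # some atomic event never occurred
        times = self.last_seen.values()
        return max(times) - min(times) <= self.window


monitor = CompositeMonitor({"link-down", "cpu-high", "disk-full"}, window=5.0)
stream: List[Tuple[str, float]] = [
    ("link-down", 0.0), ("cpu-high", 2.0), ("link-down", 9.0),
    ("disk-full", 11.0), ("cpu-high", 12.0),
]
fired_times = [t for name, t in stream if monitor.observe(name, t)]
print(fired_times)
```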

3.4 Recovery and Process Repair

Reactive control systems operate as embedded systems in dynamic environments, performing activities that can change the world in irrevocable ways. For this reason, recovery techniques such as the use of checkpoint schemes, whereby coherent states are saved periodically to enable rollback in case of unrecoverable failures, are not viable. Instead, methods for forward recovery are required that support transition from a failed state to some known, safe state.

Within the reactive control community, most recovery mechanisms are currently implemented in an ad hoc manner. For the most part, it is the human modeler's responsibility to ensure either that process execution will avoid problematic states or that procedures for transitioning from such states exist. Tools for ensuring key properties (e.g., safety conditions, liveness) lack sophistication and are not commonly used.

One area in which more principled recovery mechanisms have been explored is in the repair of automatically generated plans. The general approach involves dependency structure analysis [[23], [12]], in which plan derivation structures are analyzed to identify problems relative to the current state and execution results. Two main sources of problems are precondition failure and action failure. Precondition failure arises when associated preconditions for an action are not satisfied at the time the action is to be executed. Action failure results when an executed action does not achieve its intended effects. Repair methods range from case-based [[10]] to generative [[23], [12], [21]], with emphasis on correctness-preserving and minimal-perturbation methods.
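The distinction between the two failure sources can be sketched as follows, assuming a simple propositional state (a set of facts) and actions described by precondition and effect sets. The `Action` type, the `execute_step` function, and the truck-loading example are hypothetical; this shows failure classification only, not the dependency-structure repair methods cited above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Set


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: FrozenSet[str]   # facts that must hold before execution
    effects: FrozenSet[str]         # facts the action is intended to establish


def execute_step(action: Action, state: Set[str],
                 perform: Callable[[Action, Set[str]], None]) -> str:
    """Execute one action and classify the outcome."""
    if action.preconditions - state:
        return "precondition-failure"   # detected before the action is attempted
    perform(action, state)
    if not action.effects <= state:
        return "action-failure"         # executed, but intended effects are missing
    return "ok"


def reliable(action: Action, state: Set[str]) -> None:
    state.update(action.effects)        # the action works as modeled


def faulty(action: Action, state: Set[str]) -> None:
    pass                                # the action silently achieves nothing


load = Action("load-truck", frozenset({"truck-at-depot"}), frozenset({"cargo-loaded"}))
results = [
    execute_step(load, set(), reliable),
    execute_step(load, {"truck-at-depot"}, faulty),
    execute_step(load, {"truck-at-depot"}, reliable),
]
print(results)
```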

Work on adaptivity of plans has mostly ignored issues involved in switching plans. Cost is one issue: adaptation strategies need to incorporate realistic models of the expense in redirecting activities. Control is a second issue. Most systems that support runtime plan repair require synchronous operations, in which execution is halted while an alternative plan is generated. This mode contrasts with asynchronous replanning: when problems arise during execution of a plan, an executor can invoke a repair module to fix problems in the current plan while continuing to execute portions of the original plan that are unaffected. Preliminary efforts have been made to support asynchronous repair [[27]], but more general and robust schemes are required.

In recent years the scheduling community has made significant advances in the construction and maintenance of robust schedules. Techniques include constructive methods that use predictive information [[2], [19]] to build schedules that are resistant to unexpected events, anytime algorithms designed to maintain a legal schedule at all times [[7], [29]], and intelligent repair techniques [[20]].
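The anytime property can be illustrated with a deliberately simple sketch: a generator that first yields a legal schedule and then yields successively better legal schedules, so a caller can stop at any point and still hold a valid result. This is a plain greedy local search assumed for illustration, not the algorithms of [[7]] or [[29]]; the task names, agent names, and the makespan objective are likewise assumptions.

```python
from typing import Dict, Iterator, List

Schedule = Dict[str, str]   # task -> agent


def makespan(schedule: Schedule, durations: Dict[str, int]) -> int:
    """Largest total workload assigned to any single agent."""
    load: Dict[str, int] = {}
    for task, agent in schedule.items():
        load[agent] = load.get(agent, 0) + durations[task]
    return max(load.values(), default=0)


def anytime_schedules(durations: Dict[str, int],
                      capable: Dict[str, List[str]]) -> Iterator[Schedule]:
    """Yield a legal schedule immediately, then strictly better legal ones."""
    current = {task: capable[task][0] for task in durations}   # legal by construction
    yield dict(current)
    improved = True
    while improved:                       # stops when no single move helps
        improved = False
        for task in durations:
            for agent in capable[task]:   # only capable agents: legality is kept
                candidate = dict(current)
                candidate[task] = agent
                if makespan(candidate, durations) < makespan(current, durations):
                    current = candidate
                    improved = True
                    yield dict(current)


durations = {"a": 4, "b": 3, "c": 2}
capable = {"a": ["x", "y"], "b": ["x", "y"], "c": ["x"]}
history = list(anytime_schedules(durations, capable))
makespans = [makespan(s, durations) for s in history]
print(makespans)
```

Every schedule in `history` assigns each task to a capable agent, so interrupting the search early costs quality but never validity.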

4.0 Adaptive Systems: From AI to Workflow

We are involved with two ongoing projects focused on adaptive process management. The first, the Continuous Planning and Execution Framework (CPEF) [[18]], is developing a framework that supports the generation and execution of complex plans to attain assigned goals, while remaining responsive and adaptive to environmental changes. The second, Intelligent Workflow for Collection Management (IWCM), has a more conventional workflow flavour. It is our intent to leverage technologies being developed in CPEF to construct the adaptive workflow engine for IWCM.

4.1 CPEF

CPEF is a multiagent framework for performing and managing complex tasks in dynamic and uncertain environments. It provides taskability (i.e., the ability to formulate and execute plans to achieve assigned high-level tasks) and reactivity (i.e., the ability to adapt behavior based on changes in the operating environment). Tasks often involve long-term commitments that require look-ahead analysis; for this reason, generative planning technology is employed to compose new plans from libraries of operator templates.

In contrast to many integrated planning and execution systems, CPEF embraces the philosophy that plans are dynamic, open-ended artifacts that must evolve in response to an ever-changing environment. In particular, plans are updated in response to new information and requirements in a timely fashion to ensure that they remain viable and relevant, and replaced by alternatives when they are not. Users are an integral part of the overall process, providing input that influences the types of plans that are generated, the number of options to consider, failure assessments, plan repair strategies, and overall control of system behavior.

CPEF leverages several sophisticated AI technologies as components. SIPE-2 [[24]] provides hierarchical task network (HTN) generative planning and minimal-perturbation plan repair capabilities derived from dependency-structure analysis. The Advisable Planner (AP) [[17]] supports user provision of advice to guide the process of plan generation. The Procedural Reasoning System (PRS) [[9], [15]], a hierarchical reactive control system, is used as both an executor for plans, and a high-level controller for the overall system. Additionally, CPEF builds on aspects of the Multiagent Planning Architecture [[26]], primarily to support distributed communication and plan storage services.

CPEF supports both direct models of execution, for which process actions are performed by the system itself, and indirect models of execution, for which the system supervises execution of plans by a collection of distributed execution entities. The indirect model of execution is essential for domains where direct software control of plan entities is impossible, including many classes of WFM problems.

CPEF employs a procedure library that includes both plans and operators (encoded in the Act representation language [[25]]). Elements of the library span multiple abstraction levels and are usable for both plan generation and execution, thus supporting smooth transitions between the two capabilities. In particular, plan generation can proceed to arbitrary levels of refinement, with the executor applying additional procedures at runtime to refine tasks to executable activities. Planning and execution operate asynchronously, in a loosely coupled fashion, with agents communicating domain knowledge, plans, requests, and situation information as required to fulfill their respective responsibilities.

The creation and deployment of monitors (i.e., event-response rules) is a critical part of CPEF. Users can define a wide range of monitors; additionally, certain kinds of monitors are generated automatically based on the content of generated plans, as a way of detecting situation changes that may invalidate a plan. One research focus for CPEF is to develop more flexible and powerful models of failure detection and recovery. For example, CPEF supports the specification, monitoring, and repair of the following generalized types of failures.

Unattributable Failures occur when no individual action has failed or assumption been violated, yet some assessment (human or automated) has deemed the current plan inadequate. Unattributable failures arise because planning operators don't model the real world with sufficient fidelity.

Aggregate Failures are defined by the unsuccessful execution of a set of semantically linked activities. Aggregation is important for failure identification when processes include redundant actions as a way of improving their robustness.

To date, the focus on repair in CPEF has been on minimal-perturbation dependency structure methods that have been extended somewhat to accommodate our theory of generalized failures. Ideally, a process management system should provide a spectrum of plan repair mechanisms ranging from the correct but costly minimal-perturbation, dependency-structure based methods to transformational approaches that employ domain-specific transformation rules (in the spirit of [[1]]) that trade correctness for efficiency. We intend to augment these methods with heuristic local repairs in the near future.
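One way to read the aggregate failure definition is sketched below: a group of redundant, semantically linked activities fails only once the required number of successes can no longer be reached, so a single failed action does not by itself signal a plan failure. The `aggregate_failed` function, the `required_successes` threshold, and the activity names are assumptions made for illustration and are not CPEF's actual representation.

```python
from typing import Dict, Sequence


def aggregate_failed(outcomes: Dict[str, bool], group: Sequence[str],
                     required_successes: int = 1) -> bool:
    """True once the group can no longer reach the required number of successes.

    outcomes maps each finished activity to True (succeeded) or False (failed);
    activities missing from outcomes are still pending.
    """
    successes = sum(1 for activity in group if outcomes.get(activity) is True)
    pending = sum(1 for activity in group if activity not in outcomes)
    return successes + pending < required_successes


# Three redundant activities, of which a single success is enough.
group = ["recon-flight-1", "recon-flight-2", "recon-flight-3"]

one_failed = aggregate_failed({"recon-flight-1": False}, group)
all_failed = aggregate_failed({name: False for name in group}, group)
one_succeeded = aggregate_failed(
    {"recon-flight-1": False, "recon-flight-2": True, "recon-flight-3": False}, group)
print(one_failed, all_failed, one_succeeded)
```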

Although it is a domain-independent technology, CPEF is being developed within the context of supporting a Joint Forces Air Component Commander (JFACC) in the execution of realistic air campaigns [[14]]. CPEF has been successfully applied to generate, execute, and repair complex plans for gaining and maintaining air superiority while remaining responsive to changes in guidance and tasking within a simulated operating environment.

4.2 IWCM

In the IWCM project, we are developing a WFM system (jointly with CIRL, University of Oregon) to support the management of assets and resources for advanced Intelligence, Surveillance and Reconnaissance (ISR) capabilities. On a daily basis, intelligence planners are faced with the task of coordinating multiple ISR assets to maximize available information about the battlefield in order to increase the effectiveness of the deployed forces. Effective integration of automated information discovery, acquisition, exploitation and dissemination with multi-asset synchronization within ISR requires some form of intelligent process management. IWCM aims to provide a highly adaptive WFM system that will enable more effective and efficient management of available assets and information. The workflow manager must address traditional workflow uncertainties, a volatile operating environment, frequently changing goals and operating practices, and the unexpected addition and subtraction of processing agents during runtime.

Our contribution to the project is focused on reactive control and scheduling. It will leverage many of the reactive control capabilities from CPEF, augmenting them with advanced resource allocation, capacity analysis, and scheduling capabilities.

The expressive activity representations of CPEF will be used to capture the activity, capability, and information product knowledge required to reason about workflow processes, while the monitoring, reactive execution, and dynamic repair capabilities will be employed to support adaptivity of active processes. The ability to efficiently combine declarative and procedural knowledge will allow a reactive controller to exploit knowledge about the domain. At later stages of the project, hierarchical planning techniques will be employed to provide automatic process generation.

Advanced resource allocation/scheduling techniques (based on Adaptive Constraint Satisfaction [[4]]) combined with capacity analysis [[2]] will be used to task agents. The tasks in the activity list will exist at different levels of abstraction [[3]]. Some might be to achieve strategic objectives, while others might be to perform a specific set of collection tasks within a set time horizon. The activity manager will apply the most appropriate algorithms, given the agents involved and abstraction level. There will always be a legal schedule of activities ready for distribution. However, the system will constantly adapt the schedule to reflect activities and changes in the world. Incoming information that affects the current schedule will also span multiple abstraction levels, and possibly different temporal intervals. Triggered by monitors, the activity manager will select the strategy most appropriate to the new situation and evolve the current schedule appropriately.

5.0 Conclusions

Workflow management and reactive control both seek to provide intelligent management of processes, although they approach the problem from different perspectives. Given the overlap in requirements and objectives, however, researchers from the two fields have much to learn from each other (as is evident in the work of others [[6]]). In this paper we have concentrated on what reactive process control can bring to the creation of adaptive workflow management systems. In particular, we have discussed techniques for runtime process generation, reactive agent tasking, execution monitoring and repair. Two ongoing projects were described that affirm our commitment to transitioning reactive control technology to the workflow arena.

6.0 References

[1] J. L. Ambite and C. A. Knoblock. Planning by rewriting: Efficiently generating high-quality plans. In Proceedings of the Fourteenth National Conference on Artificial Intelligence. AAAI Press, 1997.

[2] P. M. Berry. The PCP: A predictive model for satisfying conflicting objectives in scheduling problems. Artificial Intelligence in Engineering, 7:227-242, 1992.

[3] P. M. Berry, B. Y. Choueiry, and L. Friha. A distributed approach to dynamic resource management based on temporal abstractions. Journal of Intelligent Systems Engineering, 7:227-242, 1992.

[4] J. E. Borrett, E. P. K. Tsang, and N. R. Walsh. Adaptive constraint satisfaction: The quickest first principle. In Proceedings of the 12th European Conference on Artificial Intelligence, 1996.

[5] A. Cichocki, A. S. Helal, M. Rusinkiewicz, and D. Woelk. Workflow and Process Automation: Concepts and Technology. Kluwer, 1998.

[6] B. Drabble, T. Lydiard, and A. Tate. Workflow support in the air campaign planning process. In Proceedings of the AIPS Workshop on Interactive and Collaborative Planning, Pittsburgh, PA, 1998.

[7] M. Drummond, K. Swanston, J. Bresina, and R. Levinson. Reaction-first search. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pages 1408-1414, 1993.

[8] R. J. Firby. Task networks for controlling continuous processes. In Proceedings of the Second International Conference on AI Planning Systems, Menlo Park, CA, 1994. AAAI Press.

[9] M. P. Georgeff and F. F. Ingrand. Decision-making in an embedded reasoning system. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, Detroit, MI, 1989.

[10] K. Hammond. Explaining and repairing plans that fail. Artificial Intelligence, 45:173-228, 1990.

[11] D. Hollingsworth. The workflow reference model. Technical Report TC00-1003, Workflow Management Coalition, 1995.

[12] S. Kambhampati and J. Hendler. A validation-structure-based theory of plan modification and reuse. Artificial Intelligence, 55(2):192-258, 1992.

[13] J. Laird. Integrating planning and execution in Soar. In Proceedings of the AAAI Spring Symposium on Planning in Uncertain, Unpredictable, or Changing Environments, Stanford, CA, 1990.

[14] T. J. Lee. The air campaign planning knowledge base. Technical report, Advanced Automation Technology Center, Menlo Park, CA, 1998.

[15] K. L. Myers. User's Guide for the Procedural Reasoning System. Artificial Intelligence Center, SRI International, Menlo Park, CA, 1993.

[16] K. L. Myers. A procedural knowledge approach to task-level control. In Proceedings of the Third International Conference on AI Planning Systems. AAAI Press, 1996.

[17] K. L. Myers. Strategic advice for hierarchical planners. In L. C. Aiello, J. Doyle, and S. C. Shapiro, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Fifth International Conference (KR '96). Morgan Kaufmann Publishers, 1996.

[18] K. L. Myers. Towards a framework for continuous planning and execution. In Proceedings of the AAAI Fall Symposium on Distributed Continual Planning, Menlo Park, CA, 1998. AAAI Press.

[19] N. Sadeh. Micro-opportunistic scheduling. In M. Zweben and M. Fox, editors, Intelligent Scheduling, chapter 4. Morgan Kaufmann Publishers, 1994.

[20] K. Sycara and K. Miyashita. Incremental schedule modification. In Proceedings of the AAAI Spring Symposium on Computational Considerations in Supporting Incremental Modification and Reuse, 1992.

[21] M. M. Veloso, M. E. Pollack, and M. T. Cox. Rationale-based monitoring for planning in dynamic environments. In Proceedings of the Fourth International Conference on AI Planning Systems, 1998.

[22] R. M. Washington. Abstraction Planning in Real Time. PhD thesis, Stanford University, 1994.

[23] D. E. Wilkins. Recovering from execution errors in SIPE. Computational Intelligence, 1(1):33-45, 1985.

[24] D. E. Wilkins. Practical Planning: Extending the Classical AI Planning Paradigm. Morgan Kaufmann, 1988.

[25] D. E. Wilkins and K. L. Myers. A common knowledge representation for plan generation and reactive execution. Journal of Logic and Computation, 5(6):731-761, December 1995.

[26] D. E. Wilkins and K. L. Myers. A multiagent planning architecture. In Proceedings of the Fourth International Conference on AI Planning Systems, 1998.

[27] D. E. Wilkins, K. L. Myers, J. D. Lowrance, and L. P. Wesley. Planning and reacting in uncertain and dynamic environments. Journal of Experimental and Theoretical AI, 7(1):197-227, 1995.

[28] B. C. Williams and P. P. Nayak. A model-based approach to reactive self-configuring systems. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996.

[29] M. Zweben, B. Daun, E. Davis, and M. Deale. Scheduling and rescheduling with iterative repair. In Zweben and Fox, editors, Intelligent Scheduling, chapter 8. Morgan Kaufmann Publishers, 1994.