Presentation Syntax for OWL-S

Author:
Drew McDermott, Yale University

Abstract

This document presents a grammar and semantics for the OWL-S "presentation syntax." The semantics are provided by presenting an algorithm for translating the presentation syntax into RDF.

  Goals

The original syntax for processes in OWL-S is somewhat unwieldy, to put it mildly. The main design constraint was that it be RDF; but the job it is doing, describing processes in all their intricate detail, is not really suitable for RDF, which is basically a metadata formalism. Therefore, we are moving toward a more readable syntax, which can nonetheless be translated into the RDF notation when required. Our syntax includes N3 [<>] as a subset, then adds what we hope is a natural algebraic syntax for expressing variable declarations and control structures.

Here is an example of a process definition, drawn from the Congo domain [[ ]], which involves a book-selling service. All XML files must begin with namespace declarations, and ours is no exception:

<<Define congo-namespaces [congo, line 3]
with-namespaces
       (uri"http://www.daml.org/services/owl-s/1.2/CongoProcess.owl",
        rdf: uri"http://www.w3.org/1999/02/22-rdf-syntax-ns",
        rdfs: uri"http://www.w3.org/2000/01/rdf-schema",
        xsd: uri"http://www.w3.org/2001/XMLSchema",
        owl: uri"http://www.w3.org/2002/07/owl",
        expr: uri"http://www.daml.org/services/owl-s/1.2/generic/Expression.owl",
        swrl: uri"http://www.w3.org/2003/11/swrl",
//      swrlx: "http://www.daml.org/services/owl-s/1.2/generic/swrlx.owl",
        service: "http://www.daml.org/services/owl-s/1.2/Service.owl",
        process: "http://www.daml.org/services/owl-s/1.2/Process.owl",
        time: "http://www.isi.edu/~pan/damltime/time-entry.owl",  
        objList: "http://www.daml.org/services/owl-s/1.2/generic/ObjectList.owl",
        congo: "http://www.daml.org/services/owl-s/1.2/CongoService.owl", 
        profileHierarchy: "http://www.daml.org/services/owl-s/1.2/ProfileHierarchy.owl") 
>>

It is not necessary to include any ontological definitions in an Owl-S file, but it's often quite convenient. We use a superset of the N3 notation:

<<Define congo-ontology [congo, line 22]
  {

   ontology
    {
      ontology_version
          "1.88 2006/01/30 07:52:44" .

      imports time, service, process, profileHierarchy,
              uri"http://www.daml.org/rules/proposal/swrl.owl" .
      /**

      A B2C bookbuying example of OWL-S (Web Ontology Language
      for Services; see http://www.daml.org/services/owl-s) usage,

      */    



     shippedTo - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range :AcctID .

     shippedBook - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range profileHierarchy:Book .

     deliveryType - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range DeliveryType .
     
     packagingType - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range PackagingType .

     Shipment - owl:Class;
              rdfs:subClassOf [ a owl:Restriction;
                                owl:onProperty :shippedTo;
                                owl:cardinality 1 ],
                              [ a owl:Restriction;
                                owl:onProperty :shippedBook;
                                owl:cardinality 1 ],
                              [ a owl:Restriction;
                                owl:onProperty :deliveryType;
                                owl:cardinality 1 ];
                              [ a owl:Restriction;
                                owl:onProperty :packagingType;
                                owl:cardinality 1 ] .

     cardNumber - owl:DatatypeProperty;
            rdfs:domain CreditCard;
            rdfs:range xsd:decimal .

     cardType - owl:ObjectProperty;
            rdfs:domain CreditCard;
            rdfs:range CreditCardType .

     cardExpiration - owl:DatatypeProperty;
            rdfs:domain CreditCard;
            rdfs:range xsd:gYearMonth . 

     validity - owl:ObjectProperty;
            rdfs:domain CreditCard;
            rdfs:range ValidityType .

     CreditCard rdfs:subClassOf [ a rdfs:Restriction
                                  owl:onProperty cardNumber;
                                  owl:cardinality 1 ],
                                [ a rdfs:Restriction
                                  owl:onProperty cardType;
                                  owl:cardinality 1 ],
                                [ a rdfs:Restriction
                                  owl:onProperty cardExpiration;
                                  owl:cardinality 1 ],
                                [ a rdfs:Restriction
                                  owl:onProperty validity;
                                  owl:cardinality 1 ] .

     FailureNotification enum { NotifyBookNotFound NotifyBookOutOfStock } .

     OrderShippedAcknowledgment - owl:Class .

     ExpressCongoBuyOutputType - owl:Class;
            owl:unionOf ( OrderShippedAcknowledgment 
                          FailureNotification ) .

     FullCongoBuyOutputType - owl:Class;
            owl:unionOf ( OrderShippedAcknowledgment 
                          FailureNotification ) .

     LocateBookOutputType - owl:Class;
       /** If successful, LocateBook returns an ISBN;
           otherwise, it returns a FailureNotification. */
        owl:unionOf ( profileHierarchy:ISBN FailureNotification ) .

     InStockBook - owl:Class;
        /** The set of books that are found in stock of Congo. */
         rdfs:subClassOf profileHierarchy:Book .

     OutOfStockBook - owl:Class;
       /** The set of books that are NOT found in stock of Congo. */
         rdfs:subClassOf profileHierarchy:Book;
         owl:disjointWith InStockBook .

     hasBook - owl:ObjectProperty;
      /** Inverse of hasISBN property, i.e. relate an ISBN number
          to a book. */
         owl:inverseOf profileHierarchy:hasISBN .

     SignInData - owl:Class . 

     acctName - owl:DatatypeProperty;
       rdfs:domain SignInData;
       rdfs:range xsd:string .

     password - owl:DatatypeProperty;
       rdfs:domain SignInData;
      rdfs:range xsd:string .

     UserProfileInfo - owl:Class .

     AcctInfo - owl:Class;
          owl:unionOf ( SignInData UserProfileInfo ) .

     AcctID - owl:Class;

     hasAcctID - owl:FunctionalProperty;
            /** An account exists if there is an account ID associated
               with it. */
          rdfs:domain AcctInfo;
          rdfs:range AcctID .

     NonExistingAcct - owl:Class;
          /** If there is no account ID for an account then that
             account simply does not exist */
      owl:intersectionOf ( AcctInfo 
                           [ a owl:Restriction;
                             owl:onProperty hasAcctID;
                             owl:maxCardinality 0 ] ) .



     }
>>
The principal enhancement to the notation is to allow constants to be declared using a hyphen between the name of the constant and its "type." Obviously, in Owl objects can belong to more than one class, but as we will see, type checking is quite useful [[ why exactly? ]]; if there is no reason to single a particular class out as "the" type of an object, one can of course leave it undeclared or declare it to be of type owl:Thing. An undeclared constant must be referred to using the usual N3 notation, with a prefixed colon. The name :foo is conventionally an abbreviation for <#foo>; in Owl-S this is not a convention. The angle-bracket notation (<#foo>) is not supported, so the prefix colon is the only way to refer to an undeclared global constant. This is consistent with the general approach to namespaces described in Section "????"

Now we come to an actual process definition

<<Define ExpressCongoBuy [congo, line 283]
    define atomic process ExpressCongoBuy
      /**
         This is an express "one shot" service for buying a book
         with Congo.  It takes as input a Book ISBN number and
         the customer's sign-in information, and has the effect of
         ordering the book if the book is in stock.  In this case,
         the output of the service is a message saying the book 
         is ordered.  If the book is not in stock, the output says 
         that the book is out of stock.
      */
          (inputs: (ExpressCongoBuyISBN - profileHierarchy:ISBN,
                    ExpressCongoBuySignInInfo - SignInData,
                    ExpressCongoBuyCreditCardNumber - xsd:decimal,
                    ExpressCongoBuyCreditCardType - CreditCardType,
                    ExpressCongoBuyCreditCardExpirationDate - xsd:gYearMonth),

           exists: (ExpressCongoBuyAcctID - CustomerAcctID,
                    /** This is a local variable that represents the associated 
                        account ID for the given sign-in info. The existence of 
                        this value indicates that the sign-in info is valid an
                        account exists. */
           `        ExpressCongoBuyCreditCard - CreditCard
                    /** This is a local variable that represents the associated 
                      credit card for the given card number. */
                    ),
           precondition: (hasAcctID(ExpressCongoBuySignInInfo, ExpressCongoBuyAcctID)
                          & credit-card-number(ExpressCongoBuyCreditCard,
                                               ExpressCongoBuyCreditCardNumber)
                          & validity(ExpressCongoBuyCreditCard, Valid)),
                /** Typically this condition should also say that
                   given credit card type and expiration date is also
                   correct for the credit card. Those details are left
                   out for this example. */

           outputs: (ExpressCongoBuyOutput - ExpressCongoBuyOutputType),

           result: (forall (ExpressCongoBuyBook - profileHierarchy:Book)
                      /** book identified by the given ISBN input */
                      (hasISBN(ExpressCongoBuyBook, ISBN)
                       & inStockBook(ExpressCongoBuyBook)
                       |->
                        /** If the book is in stock, then the result is that the
                            order was shipped and an appropriate acknowledgment
                            is output. */
                          N3$${ ExpressCongoBuyOutput 
                                    a orderShippedAcknowledgment }
                           & exists(ExpressCongoBuyShipment - Shipment)
                                    /** shipment that gets generated on
                                        a successful purchase */
                              (shippedTo(ExpressCongoBuyShipment, AcctID)
                               & shippedBook(ExpressCongoBuyShipment,
                                             ExpressCongoBuyBook)))
                      &
                      (hasISBN(ExpressCongoBuyBook, ISBN)
                       & N3$${ ExpressCongoBuyBookISBN 
                                a [ owl:Restriction;
                                    owl:onProperty hasBook;
                                    owl:allValuesFrom outOfStockBook ] }
                       /** If the book is out of stock, then the result is simply
                          that an appropriate acknowledgment is output, indicating 
                          that the book is out of stock. */
                       |-> output(ExpressCongoBuyOutput
                                     <= notifyOutOfStock))))
>>
An atomic process is one that is defined entirely in terms of the values it returns and the effects that it has, without any mention of its subevents. (At the time scale of a typical web-service interaction, any such subevents can be neglected.) Modeling a service as atomic in this way is closely analogous to the way web services are modeled in WSDL [<>], and in fact that's why many web services can be described in Owl-S and grounded in WSDL.

The inputs and precondition fields of a definition describe what goes into a process, and the outputs and results fields describe what comes out. (The exists field is used to bind variables [[ and will be described later ]].) What the process definition says is that if the inputs satisfy the type descriptions and the preconditions are true, then the process will produce outputs of the prescribed types and the result will "occur." The prototypical result field is a proposition that will become true (e.g., shippedTo(AcctID)), but these effects can be gated by conditions (e.g., inStockBook(book)), and some result fields specify output values instead of propositions (e.g., output(Status <= orderShippedAcknowledgement)).

In addition to atomic processes, we also have composite processes, which have bodies describing what other processes must carried out in order for the process to be consummated.

In what follows I will specify the grammar for the ontology and process definitions (sect. "????"); discuss the syntactic analysis provided by this grammar (sect. "????"); and then sketch the XMLified RDF that would be produced by "internalizing" the parsetrees, which is essentially the same as the original XML version in [] (sect. "????").

This document consists of pieces of code intermixed with text, in a "literate programming" style [Knu84]. Some of the code comes from our CongoBuy example, but in addition we present the entire grammar in this format. [[ See [McD04b], appendix 1, for a brief explanation of this incarnation of literate programming. ]]


  Syntax

The OWL-S grammar is expressed in a somewhat nontraditional form, and to explain it will require a digression on the topic of syntax. [[ The syntax notation is for the Lexiparse parser [McD04b]. ]] Although human languages are more complex, the formal languages used for programming, logic, and OWL-S process definition are fundamentally quite simple. Every expression can be thought of as a tree structure whose leaves are the lexemes of the language. (We leave aside for the moment how to think about the structure of those lexemes.) It is convenient to assume that every nonleaf tree structure has an "operator" lexeme that sits at the top. The most obvious examples are arithmetic expressions such as "a * b + c," which can be diagramed as a the tree shown in [[ ]]. We will draw our tree structures using variations of the parenthesis notation apparently first invented by Noam Chomsky [<>]. This one can be drawn (+ (* a b) c). The operator of the tree is "+"; its left subtree is itself a tree, with operator "*".

We will be using two sorts of parenthesized tree expression: the one just exemplified, which is of course the notation used by the programming language Lisp [<>]; and a variant in which the symbol ":^+" comes between the open paren and the operator. The reason is that we will be using Chomsky-tree-bracket notation for two purposes: to express the rules of the grammar, and to describe the trees of the language defined by the grammar. For the first purpose we will use traditional Lisp notation, and for the second we would write our example tree as (:^+ + (:^+ * a b) c). (The "Tinkertoy" symbol is meant to resemble a treelet with two subnodes.)

The Lexiparse formalism is different from most in that it is cast in terms of precedences and parsetrees rather than phrase-structure rules[]. Explicit phrase-structure rules are unnecessary because they fall into a few stereotyped classes that are captured more succinctly using other information. Every nonterminal can appear in an infix position or prefix position, depending on whether it has an operand to its left. For each position, its behavior is defined in terms of at most four features:

  1. Numargs: How many arguments it expects to its right (an interval [min, max]).
  2. Left precedence: How tightly it binds an operand to its left (unnecessary in preffix position)
  3. Right precedence: How tightly it binds an operand to its right (unnecessary if max numargs = 0).
  4. Bracket: Whether it is an open or close bracket, and which brackets it matches.
Suffix operators are just a special case of infix operators whose numargs = [0,0]. Suffix operators that take more than one argument to their left are impossible. The notation for specifying these things, and a couple of other items of less import, will become clear as we go.

These features suffice to define a deterministic grammar that can be parsed rapidly, transforming any string of lexemes into a parsetree without further ado. From that point on, we express our grammar in terms of transformations that clean up and filter parsetrees, producing error messages for the "defective" ones that don't pass the filters. The grammar rules are written in a Lisp-style notation, and the parsetrees they refer to are written using "Tinkertoy" notation.

The higher the precedence of a token, the "tighter" it binds to the expressions adjacent to it, which means that high-precedence tokens tend to head small(er) subtrees of the large(r) parsetrees headed by low-precedence tokens. The tokens near the root of the tree are those with the lower precedences. (This might be one context where computer scientists should draw their trees with the leaves up!) All grouping decisions can be made purely locally. Suppose the parser has the sequence A op B in hand, where A and B are trees and op is an operator. The question is whether to build the tree (:^+ op A B), or to keep looking for material to the right of B that could be used to augment the second operand of op. If there is an operator to the right of B, and its left precedence is higher than op's right precedence, the parser augments B; otherwise, it doesn't. Example: Having seen x + y, if the next lexeme is *, then the parser will augment y, so that it may well wind up with a tree that looks like (:^+ + x (:^+ * y ...)). If the next lexeme is absent (we're at the end of the stream), or it's a close bracket (whose left precedence is 0, and so much lower than any normal operator), or it's "-" (whose left precedence is the same as +'s right precedence, which, let's say, is 180), the parser will build (:^+ + x y). Of course, that the parser must then decide whether to incorporate that into a larger tree tree (such as (:^+ - (:^+ + x y) ...)), depending on exactly the same considerations as before, but now the sequence A op B has A = (:^+ x y), op = -, and B = whatever's to the right of -.

This is all well and good, but there are many expressions that don't consist of an application of an operator to one or two arguments. In Owl-S, how do we handle something as complex as define, which expects a bunch of keywords to the right, followed by a complex structure of inputs, results, and such? Each case must be handled in its own way, but the idea is to define keywords such as process or atomic as operators themselves, so that the parsetree headed by define contains subtrees headed by these "sub-operators." The syntax of define then consists of its basic precedence information plus operations for tidying and checking a tree headed by an occurrence of it. Checking a tree means looking for illegal patterns; tidying occurs first, in order to clean up the structure produced by the basic precedence grammar. These are embodied in rule sets called the tidiers and checkers for define.

What this means is that to read a Lexiparse grammar you look for the patterns that are expected to occur under an operator. These are not found on the right-hand sides of grammar rules, but inside the tidiers and checkers for each operator. The language of checkers in particular contains many features that go far beyond context-free grammars.

The most intuitive example is the classic if-then-else, which we'll present here even though it belongs much later, in the section on control constructs in composite processes. Intuitively, <if> is an operator with two or three arguments: a test, something to do if the test is true, and, optionally, something to do if the test is false. We capture that by making <else> into an infix operator with a higher precedence than <then>, so that the three-argument case has a subtree of the form (:^+ then (:^+ else iftrue iffalse)).

<<Define then-else-syntax

   (def-syntactic then :reserved true
                       :prefix (:precedence 87 :numargs 1))

   (def-syntactic else :reserved true
                       :infix (:precedence 88)) >>
The field ":reserved true" means that this token is derived from occurrences of the symbol itself, and not from some character sequence (as <left-paren> is derived from '('). The lexical level of the language is deal with in appendix?  "????"

Now we define if thus:

<<Define if-syntax

   (def-syntactic if :reserved true
                     :prefix (:numargs 2 :precedence 85)
      :tidier (( (:^+ if ?test (:^+ then (:^+ else ?iftrue ?iffalse)))
                !~>>- (:^+ if ,test ,iftrue ,iffalse))
               ( (:^+ if ?test (:^+ then ?iftrue))
                !~>>- (:^+ if ?test ?iftrue))
               (:else
                (defect "'if' has no 'then' part")))) >>
The first clause establishes that the precedence of the token if is 85, which places it above then in any parsetree. The :tidier section is a list of pairs of patterns:
      ( ( (:^+ ...) !~>>- (:^+ ...) )
        ( (:^+ ...) !~>>- (:^+ ...) )
        (:else ...)
       )
(but we normally use fewer spaces and more newlines + indentation). On the left side of a "!~>>-" rule, question marks indicate chunks of a parsetree to extract and name. On the right side, commas indicate items to plug into a parsetree being built. We use standard notations for pattern matching and expression construction. The former are derived from logic programming and the latter from Lisp "quasi-quote" or "backquote." These are not regular expressions, mainly because we are not matching strings. The precedence grammar has given us crisply organized structures to match against, not saggy strings. A match variable of the form ?var matches a single item in a parsetree. These are also called qvars. A qvaroid is an expression where the question mark is followed by a parenthesized list, beginning with a symbol. What it matches is governed by that symbol. The qvaroid ?(:^+ op –subs–) matches a parsetree with operator op and subtrees subs. (The "subtrees" are not always parsetrees themselves; they can be occurrences of symbols, strings, and other lexical primitives.) The pattern ?(:^+ if ?test (:^+ then (:^+ else ?iftrue ?iffalse))) matches any parsetree of the form (:^+ if __ (:^+ then (:^+ else __ __))) and binds the variables test, iftrue, and iffalse to the pieces filling in the blanks. (If the left-hand side is a parsetree, as it usually is, the question mark in front of it may be omitted; but in other contexts it will be mandatory.)

On the right side of a "!~>>-" rule, commas indicate where the pieces of a newly constructed parsetree are to come from. So (:^+ if ,test ,iftrue ,iffalse) creates a new parsetree with a "flatter" structure of three subtrees, respectively the values of the variables test, iftrue, and iffalse.

After either a question mark or a comma, an atsign indicates that a series of elements is to be matched or substituted. ?@v matches a series of parsetrees and stores them (as a list) in v, and ,@exp substitutes the value of the list exp (as a series) into the parsetree being built. Only one ?@ may occur in a single list. That doesn't mean in the entire pattern, but just within one list in the pattern. The ,@ construct has no analogous limitations We will see examples of the atsign constructs below.

If neither transformation rule in the tidier applies, then the :else clause kicks in. The expression (defect ...) is obviously not a transformation rule. Its pattern consists of strings and expressions. The values of the expressions are concatenated with the strings to make a description of a defect, or "flaw," in the syntax tree. These defects are accumulated, and in the end produce error messages about an unsuccessful parse.

The <if> construct is simple enough that it needs no separate checker. If it had needed one, it would have been defined with a separate :checkers clause in def-syntactic. We will see examples of this shortly.

Here's another key example, the definition of left-parenthesis:

<<Define left-paren

   ;; Eliminate commas from parenthesized expressions
   (def-syntactic left-paren :lex "("
                             :prefix (:right-precedence 0
                                      :bracket (:open right-paren))
                             :infix (:left-precedence 200 :right-precedence 0
                                     :bracket (:open right-paren))
      :tidier
         (( (:^+ left-paren [*_] (:^+ comma ?@subs))
           !~>>- (:^+ group [*_] ,@subs))
          ( (:^+ left-paren [*_] ?e)
           !~>>- e)
          ( (:^+ left-paren ?f [_*_] (:^+ comma ?@args))
           !~>>- (:^+ fun-app ,f [_*_] ,@args))
          ( (:^+ left-paren ?f [_*_] ?arg)
           !~>>- (:^+ fun-app ,f [_*_] ,arg)))) >>
Left parentheses can occur as infix or prefix operators. In either case a left paren is (big surprise) a :bracket closed by a right paren; its right precedence of 0 causes all other operators to bind tighter, until a right bracket is seen, with left precedence of 0. The resulting parsetrees are then tidied using four rules. The first two contain the symbol [*_], indicating that they apply only to prefix occurrences of <left-paren>; the second two substitute [_*_], which indicates that they match with infix parsetrees only. What the rules do depends on whether the main operator inside the parentheses is a <comma> or not. For reasons [[ explained below/that need not concern us here ]], <comma> can dominate a parsetree with an indefinite number of subtrees. Hence we use "?@" as a prefix on subs and args in our match patterns, and ",@" as prefix on the same variables in our substitution patterns. In those cases, the <comma>s are removed.

A prefix paren not followed by a comma is redundant (having served its purpose of overriding normal precedence relationships) and it is removed (by the second tidier clause). In all other cases the <left-paren> token is replaced; prefix <left-paren> becomes the token <group> and infix <left-paren> becomes <fun-app> (short for "function application"). In what follows, tidiers will often look for occurrences of <group>, and it's useful to bear in mind where they came from.

One thing that may not be clear is that inside a pair of brackets there must always be a single expression. Typically some grouping connective, such as <comma> in the case of ordinary parentheses, is used to bundle together multiple expressions. As we will see later, if there is no explicit connective, Lexiparse will insert one.

The discussion in the rest of this section roughly follows the order shown in the

Tables "Right Precedence" and "Left Precedence", from high-level, low-precedence tokens to low-level, high-precedence tokens. [[ It does? ]] Before continuing with the discussion, it might be useful to stop and scan these tables, which show the overall precedence structure of OWL-S. An infix operator (such as <left-paren>in some contexts) has both a left and right precedence. The former determines how well it competes with tokens to its left, the latter, how well with those to its right.

Table 2: OWL-S operators in order of increasing ...
right precedence
Op Left Right
** 1000 -1000
left precedence
Op Left Right
** 1000 -1000

Later, in Table Precedence Array, we show the "cross product" of these two tables, indicating more clearly which tokens tend to occur below which other tokens.

2.1  Upper-Level Syntax: Process Name Declarations

We start by declaring that we're constructing a grammar, called owl-s:

<<Define grammar-def

(def-grammar owl-s
    :top-tokens (ontology define with_namespaces)
    :lex-parent standard-arith-syntax
    :syn-parent standard-arith-syntax
    :sym-case #-allegro :up #+allegro (allegro-case-choice)) >>
The :top-tokens declaration indicates that a legal process definition in OWL-S must be headed by one of the tokens ontology, define, or with_namespaces. (This last one is actually the most likely; it's explained in Section "????"; ontology is discussed in section "????".)

The rest of the grammar consists of definitions of the tokens of the language, starting with those that appear higher (closer to the root) in parsetrees, which are (more or less) the same as the :top-tokens. We start with <define>.

A process-model definition consists of a sequence of process definitions, each atomic, simple, or composite. All three stem from the <define> operator.

<<Define process-definition

   (def-syntactic define :reserved true
                         :prefix (:precedence 5 :numargs 1)
      :tidier
      (( (:^+ define (:^+ composite 
                          ?(:^+ process ?name ?@iopr-spec)
                          ?@body-spec))
        !~>>- (:^+ define composite ,name (:^+ iopr ,@iopr-spec) ,@body-spec))

       ( (:^+ define (:^+ atomic (:^+ process ?name ?@iopr-spec)))
        !~>>- (:^+ define atomic ,name (:^+ iopr ,@iopr-spec) nil))

       ( (:^+ define (:^+ simple (:^+ process ?name ?@iopr-spec)))
        !~>>- (:^+ define simple ,name (:^+ iopr ,@iopr-spec) nil))

       (:else (defect "Unintelligible 'define' expression")))
      <<Insert: process-definition-checkers>>>>

The tidier for <define>, like <if>'s, flatten the structure produced by the operators <atomic>, <composite>, etc. (which are defined later). Now we add in a :checkers field, which specifies various validity tests for the parsetrees headed by <define>:

<<Define process-definition-checkers

      :checkers
       (
        ((:~ ?(:^+ define ?_ ?(:+ ?name is-name) ?@_))
         (defect "Process has illegal name: " name))

        (:up *
          ((:~ ?(:^+ ?(:|| with-namespaces left-brace) ?@_))
           (defect "'define' can appear below only 'with-namespaces'"
                   " or 'left-brace'")))
       )) >>

Note that checkers is a list of (in this case, two) items. Each specifies a test to be run. All the tests are run, and all the defects they find are accumulated. Important: Tidiers are run as soon as each sub-parsetree is first constructed. Checkers are run only after all grouping and tidying are finished for the entire parsetree; in addition, for a given sub-parsetree, checking is canceled if the tidying phase found defects. Now back to define. The second checker starts with the symbol :up; the first is of the form
    (pattern –defects-or-checkers–)
If the pattern matches the current parsetree, then the defects-or-checkers specify what happens. Expressions of the form (defect ...) create defects, and anything else is interpreted as another checker to be run. (If a checker has a pattern followed by nothing, a standard defect of the form "Doesn't match this pattern" is created. To indicate that no further checkers need be run, one just writes :okay.)

Some new features of the pattern language appear at this point. They are summarized in table "????". The meaning of the first checker may be glossed as, "If the parsetree does not have at least two children, the second of which satisfies the test is-name, then report the defect [Process has illegal name: n], where n is the offending second child."

Matches if p matches and the item in question satisfies the tests.
Table : Match codes used by Lexiparse
?_ Matches anything
?(:~ p) Matches only if p does not
?(:& p1 ...pK) Matches only if each pI does
?(:|| p1 ...pK [:& p0]) Matches only if one of the pI and,
if present, p0 match
?(:+ p –tests–)

The notation (:up * –checkers–) indicates that all the ancestors of the current parsetree are to be checked using the given checkers. (If "1" had appeared instead of the asterisk, then only its parent would have been checked.) In this case the checker limits define to appearing either at the top level or below a with_namespaces, possibly with some enclosing curly braces.

The syntactic rules above introduce the acronym IOPR, which stands for "Inputs, Outputs, Preconditions, and Results" — all the types of entities that define the external view of the process. For atomic and simple processes, the external view is all there is; for composites, the body-spec provides a view of the inner works of the process.

We analyze <atomic>, <simple>, and <composite> as simple prefix operators, followed by a process-spec.

<<Define process-classes

   ;; __::= atomic <process-spec>
   (def-syntactic atomic
                  :reserved true
                  :prefix (:precedence 15 :numargs 1))

   ;; __::= simple <process-spec> 
   (def-syntactic simple :reserved true
                  :prefix (:precedence 15 :numargs 1))

   ;; __::=  composite <process-spec>... { ...
   (def-syntactic composite :reserved true
                  :prefix (:precedence 15 :numargs 2)
      :tidier
        (( (:^+ composite ?proc (:^+ left-brace ?body))
          !~>>- (:^+ composite ,proc ,body))
         ( (:^+ composite ?proc ?b)
          (defect "Ill-formed body for composite process " proc
                  :% " (must be surrounded by braces): " b)))) >>
The syntax of curly braces will be dealt with below; suffice it to say that the way brackets work in a typical Lexiparse grammar is that a balanced expression is analyzed as a parsetree whose operator is the open bracket and whose sole child is the expression inside the brackets.

  2 Namespaces

Before presenting the syntax of processes, I digress to talk about namespaces, on the grounds that processes and ontologies almost always occur inside namespace declarations. In the RDF/XML representation, namespaces are treated using the standard XML namespace notations, that is, as attributes whose names start with xmlns. In the presentation syntax, we must supply a special notation:

<<Define namespace-spec

   (def-syntactic with_namespaces :reserved true
                  :prefix (:precedence 15 :numargs 2)
      :tidier
         (( (:^+ with_namespaces ?spec ?e)
           !~>>- (:^+ with_namespaces
                        ,e ,@(apply-named-tidier namespaces-flatten spec)))))

   <<Insert: namespaces-flatten>>>>
A named tidier is a packaged tidier that can be invoked from multiple contexts. We'll get to its definition shortly. Its purpose is to "flatten" the namespace list that occurs right after <with_namespaces>, i.e., to move it up a level and put it after the body of the <with_namespaces> expression in the tidied parsetree. This format is slightly easier for the internalizers of section "????" to handle. Before presenting it, we discuss the syntax of colons, which are obviously relevant to namespaces.

When <colon> occurs as an infix operator, it's part of a qualified name, like owl:Class. Infix occurrences are replaced with the more mnemonic <name-wrt-space>. It also occurs as a prefix operator. These occurrences are not renamed, but deleted, as explained in section "????". Either way, there should be no <colon>s in a parsetree after it is tidied, which explains the checker in the following syntax definition:

<<Define colon

   ;; Note that colon is not a reserved word, but comes from
     ;;  occurrences of ":"
   (def-syntactic colon :lex ":" :infix (:numargs 1 :precedence 210)

                         ;; -- namespace construct, with high precedence
                         :prefix (:numargs 1 :precedence 120)
                         ;; -- Occurs only after inputs, outputs,
                         ;; preconditions, effects
                         ;; Low precedence, but higher than comma
      :tidier
         (( (:^+ colon ?namespace [_*_] ?name)
           !~>>- (:^+ name-wrt-space ,namespace ,name)))

      :checkers
         (( (:^+ colon ?@_)
           (defect "Stray colon"))))

;;; Because infix <colon> is replaced by the "artificial" token
;;; <name-wrt-space>, we have to use 'def-checkers' to define its
;;; checkers --

   (def-checkers name-wrt-space
      ( ?(:~ ?(:^+ name-wrt-space ?(:+ ?namespace is-Symbol) ?_))
       (defect "Namespace must be a non-nil symbol: " namespace))
     
      ( ?(:~ ?(:^+ name-wrt-space ?_ ?(:+ ?name is-Symbol)))
       (defect "Name must be a symbol: " name))) >>
The checker for (:^+ name-wrt-space space name) verifies that space and name are symbols. The usual expectation is that space is bound by an outer with_namespaces expression, but bindings are not checked until expressions are internalized. And, of course, in the with_namespaces expression itself, the meaning of the name and space are completely different.

We can now turn to the named tidier used to tidy with_namespaces expressions:

<<Define namespaces-flatten

   (def-named-tidier namespaces-flatten
      ( (:^+  uri ?s)
       !~>>- ((:^+ namespace+name ,false  (:^+ uri ,s))))
      ( (:^+ ?(:|| equals name-wrt-space)
             ?(:+ ?sym is-Symbol)
             (:^+ uri ?s))
       !~>>- ((:^+ namespace+name ,sym (:^+ uri ,s))))
      ( (:^+ group ?@specs)
       !~>>- (all-tidy specs 
                (use-named-tidier namespaces-flatten)))
      (:else
        (defect "Illegal namespace spec " spec))) >>
It has the same syntax as a regular tidier, although those not intimately familiar with Lisp conventions may think there is a missing open parenthesis. It differs from a regular tidier in that it can be used in several places, so you may want to go back to the definition of with_namespaces to remind yourself that in that context it is applied to the list of namespace prefixes that occurs as its first argument. The key purpose of namespaces-flatten is revealed by its third clause, with matches against a parsetree of the form (:^+ group ...), which, you'll recall, is what a parenthesized list gets tidied into. If this is the form taken by the expression following with_namespaces, then namespaces_flatten is applied to the subtrees of the group recursively. The construct (all-tidy l tidier) means the result obtained by tidying each element of l using tidier, then appending all the results together. (If a tidier produces a single parsetree, it is treated as a singleton list containing that parsetree.)

The basis cases come are the first two clauses of namespace-flatten: either a single URI, representing the default namespace, or expressions of the form prefix:URI. (We also allow prefix=URI, which explains the ?(:|| ...) expression.)

In XML, URIs are represented as strings, and are recognized as URIs by their context. We take a slightly different tack, and supply a syntactic operator uri"..." that indicates a string is to be taken as a URI.

<<Define uri-spec

   (def-syntactic uri :reserved true
                  :prefix (:precedence 210 :numargs 1)
      :checkers
         (( (:~ ?(:^+ uri ?(:+ ?s is-String)))
           (defect "uri followed by non-string " s)))) >>

n 2 Process Syntax

We now return to the syntax of processes, and in particular process itself:

<<Define process-spec

   
   ;; __::= process <name>(-IOPRs-)
   (def-syntactic process :reserved true
                  :prefix (:precedence 25 :numargs 1)  
      :tidier
         ( ( (:^+ process (:^+ fun-app ?name [_*_] ?@subs))
           ~(>>- :^+ process ,name ,@subs))
          <<Insert: anon-process>>
          (:else (defect "'process' must be followed by <name>"
                         " and IOPRs in parens")))

      :checkers

         (( (:^+ process ?(:+ ?_ (not is-Symbol)) ?@_)
           (defect "Stray occurrence of 'process'"))

          ( (:^+ process ?_ ?@ioprs)
           (forall-check ioprs
              ((:^+ ?(:~ ?(:|| inputs outputs locals participants
                               precondition result))
                    ?_)
               (defect "Process has illegal IOPR spec: " this-ptree))))

          ( (:^+ process ?_ ?@ioprs)
           (occ-range ioprs 0 1 ?(:^+ inputs ?@_))
           (occ-range ioprs 0 1 ?(:^+ outputs ?@_))
           (occ-range ioprs 0 1 ?(:^+ locals ?@_))
           (occ-range ioprs 0 1 ?(:^+ precondition ?@_))))
) >>
The only remark to make about the tidier for process, besides the fact that it instantiates the common flatten-or-produce-defect pattern, is that it refers to the token <fun-app>, which is what an infix <left-paren> turns into.

The notation (forall-check l –checkers–) indicates that every element of the list l is to be checked using the checkers. The resulting defects are accumulated. The use of this-ptree here is not specific to forall-check; it can be used inside any checker to refer to the parsetree being checked.

The last novel element used for the checkers of <define> is the construct (occ-range list lo hi pattern). This counts the number of elements of list that match pattern. If it falls outside the range [lo,hi], a defect is issued, of the form [Too few (or too many) subexpressions match pattern].

There are two kinds of IOPRs, the IOs (parameters) and the PRs (preconditions and results). One key feature of the former is that they are followed by a parameter-declaration list with a distinctive syntax.

<<Define param-decls

   
   ;; <IOPR> ::= inputs : (...)
   (def-syntactic inputs :reserved true
                         :prefix (:numargs 1 :precedence 120
                                  :local-grammar ((1 typed-var-grammar)))
      :tidier
         (use-named-tidier typed-vars-tidier inputs)

      :checkers
         (use-named-checkers typed-vars-checker inputs)

   ;; <IOPR> ::= outputs : (...)
   (def-syntactic outputs :reserved true
                          :prefix (:numargs 1 :precedence 120
                                   :local-grammar ((1 typed-var-grammar)))
      :tidier
         (use-named-tidier typed-vars-tidier outputs)

      :checkers
         (use-named-checkers typed-vars-checker outputs))

   ;; <IOPR> ::= locals : (...)
   (def-syntactic locals :reserved true
                        :prefix (:numargs 1 :precedence 120
                                 :local-grammar ((1 typed-var-grammar)))
      :tidier
         (use-named-tidier typed-vars-tidier locals)

      :checkers
         (use-named-checkers typed-vars-checker locals))

   ;; <IOPR> ::= participants : (...)
   (def-syntactic participants :reserved true
                        :prefix (:numargs 1 :precedence 120
                                 :local-grammar ((1 typed-var-grammar)))
      :tidier
         (use-named-tidier typed-vars-tidier participants)

      :checkers
         (use-named-checkers typed-vars-checker participants)) >>
The tidiers for each of these constructs are all essentially the same, so we make use of a single named tidier for all of them, whose name is typed-vars-tidier:
<<Define typed-vars-tidier

(def-named-tidier typed-vars-tidier (sort) :grammar owl-s
   ( (:^+ ?,sort (:^+ colon (:^+ group ?@decls)))
    !~>>- (:^+ ,sort ,@decls))
   ( (:^+ ?,sort (:^+ colon ?decl))
    !~>>- (:^+ ,sort ,decl))
   (:else (defect "'" sort "' doesn't occur in the form "
                  sort ": (...)"))) >>
This named tidier differs from the previous one because it takes an argument sort, in addition to the implicit argument, the parsetree being tidied. The different between apply-named-tidier and use-named-tidier is that in the former the parsetree being tidied is explicit, while for use-named-tidier it is always the current one. Both take optional arguments at the end, which must match up with the arguments that come after the name of the tidier in def-named-tidier.

[[ ]]
<<Define typed-vars-checker

(def-named-checkers typed-vars-checker (sort)
   ( (:up 1 (:process ?name ?_))
    (defect sort " doesn't occur as process argument"))) >>

The checkers for the various constructs are all the same, too, but they're so short we don't bother defining a named checker set for them. Each is of the form

     (:up 1 ((:^+ iopr ?@_)))
that the checker ((:^+ iopr ?@_)) is to run on the parsetree one level up from this on. This is a checker consisting of a list (pattern), i.e., a pattern followed by nothing. If the parsetree one level up doesn't match it, a defect of the form "Doesn't match (:^+ iopr ?@_)" is produced and attached to this one.

The principal new feature for the IOPR declarations is their use of the local-grammar facility, which allows us to override the usual precedences and other properties of operators when they occur inside (one of) the right-hand argument(s) of an operator. (The qualification "right-hand" is due to Lexiparse's left-to-right bias; by the time it sees an infix operator, its left-hand argument has already been parsed. But the possibilities are wide open for the arguments to the right of an operator, whether infix or prefix.) The local grammar typed-var-grammar is for typed variable declarations such as (x,y - String n - Integer), in which the hyphens must have precedence lower than that of the commas.

<<Define typed-var-grammar

(def-grammar typed-var-grammar
    :lex-parent owl-s
    :syn-parent owl-s
    :replace ((minus hyphen))
    :sym-case #-allegro :up #+allegro (allegro-case-choice)

 :definitions
 (
   (def-syntactic \| :lex "" :infix (:precedence 16 :grouping :true))

   (def-syntactic left-paren :lex "("
                             :prefix (:left-precedence 200 :right-precedence 0
                                      :bracket (:open right-paren)))

   ;; Hyphens can be generated only by replacement of <minus>
   ;; Precedence puts hyphen above comma in parsetree --
   (def-syntactic hyphen :lex "-" :infix (:precedence 17 :numargs :1)
       :tidier
          ;; Eliminate commas --
          (( (:^+ hyphen (:^+ comma ?@vars) ?type)
            !~>>- (:^+ hyphen ,@vars ,type)))
       :checkers
          (hyphen-checks))
          
   (def-named-checkers hyphen-checks
       ( ?(:~ ?(:^+ hyphen ?@_ ?(:+ ?_ is-name)))
        (defect "Type is not name of Owl class: " type))

       ( ?(:^+ hyphen ?@vars ?_)
        (forall-check vars
          ( ?(:+ ?v (not is-Symbol))
           (defect "Variable is not symbol: " v)))))) >>

The token with name "|" is the invisible "contiguity" operator that Lexiparse inserts whenever it requires an operator and none appears in the lexeme stream. In the declaration (x,y - String n - Integer), such an operator will be inserted between String and n, and will become the top operator for the declaration parsetree. As mentioned in the beginning of the Syntax section, inside every pair of brackets must come a single expression. This need to assimilate everything under one operator drives the system to insert <\) ._ whenever it comes to the end of an expression and there is more stuff to the right that isn't the bracket it is looking for. however, the precedence and other properties of this operator are entirely up to the grammar creator. in this case, by making the precedence of ~~(toke \> low we ensure that it becomes the main operator for type-variable lists with multiple type declarations. By declaring it to be ":grouping :true," we ensure that any parsetree of the form

   (:^+ | (:^+ | a b) c)
will be converted to
   (:^+ | a b c)

The clause :replace ((minus hyphen)) means to replace the token <minus> generated by the lexer with <hyphen> when typed-var-grammar is in control. That means that even when control returns to the normal owl-s grammar, the token derived from "-" will not start looking like a minus again.

[[ ]]

The other two things that can appear in an IOPR list are preconditions and results:

<<Define preconds

   ;; <IOPR> ::= precondition : ...
   (def-syntactic precondition :reserved true
                               :prefix (:numargs 1 :precedence 120)
      :tidier
         (( (:^+ precondition (:^+ colon ?exp))
           !~>>- (:^+ precondition ,exp))
          (:else (defect "'precondition' not followed by ': <expression>'")))

      :checkers
         ((:up 1 (?(:^+ iopr ?@_))))) >>
The formula in the precondition field of a process definition has no special syntax, but is just part of the general-purpose logical-expression system that generates all those XML literals at the leaves of OWL-S. This system is discussed in section "????".

2.4  The Syntax of results

The result field(s) of a process definitions is more complex than the inputs, outputs, preconditions, and other fields.

<<Define results

   ;; <IOPR> ::= result : ...
   (def-syntactic result :reserved true
                          :prefix (:numargs 1 :precedence 120)
      :tidier
         (( (:^+ result (:^+ colon ?exp))
           !~>>- (:^+ result ,exp))
          (:else (defect "'result' not followed by ': <expression>'")))

      :checkers
         ((:up 1 (?(:^+ iopr ?@_)))
          <<Insert: result-tree-checker>>)) >>
Results can include several constructs that are not meaningful, or mean something different, in other contexts. Even though a result may look like a predicate-calculus formula, it is not really an entity with a truth value, but a recipe for changing the truth values of various propositions. For this reason, I prefer using the verb "impose" to describe a result. A result R does not become "true"; it is imposed following an action or event. Here is a rough definition of what that means:

  1. To impose an atomic formula p is to make it true
  2. To impose a negated atomic formula not p is to make p false. For many implementations, this translates into deleting p from a representation of the current situation, so that the use of a closed-world assumption [Rei78] will allow an implementation to conclude that p is false.
  3. To impose p |-> q is to impose q if p was true before the action or event.
  4. To impose forall (x) p[x] is to impose p[a] for all objects a. In practice, most implementations only handle the special case forall (x) p[x] |-> q[x], which means "For every set of values v for vars such that p[v] is true before the action or event, impose q[a]." OWL-S is typical in singling this case out.
  5. To impose output(p1 <= v1, ..., pk <= vk), where each pi is an output of the current process, make each vi the value of pi. (This notation corresponds to the withOutput construct of OWL-S.)

Various syntactic constraints are implied by this definition of "impose." There are others as well. For example, we don't allow disjunctive effects in OWL-S. [[ To check that a result is syntactically legal, we must write a straightforward Lisp procedure (called result-defects) to walk through a parsetree headed by <result>.]]

First, we define the special operators (<when> and <output>), which occur only in results. [[ move lexicals out ]]

<<Define syn-when

   (def-syntactic when :lex "|->"
                       :infix (:left-precedence 110 :right-precedence 111)) >>
<<Define bind-param-syn

   (def-syntactic bind-param :lex "<="
                             :infix (:precedence 110)
      :checkers
         (( (:^+ bind-param ?(:~ ?(:+ ?p is-Symbol)) ?_)
           (defect "Non-symbol to left of '<=': " p))
          <<Insert: bind-param-context>>)) >>
<<Define output-syntax

   (def-syntactic output :reserved true
                  :prefix (:precedence 200 :numargs 1)
      :tidier
         (( (:^+ output (:^+ group ?@bindings))
           !~>>- (:^+ output ,@bindings)))

      :checkers
         (( (:^+ output ?@bindings)
           (forall-check bindings 
              ( (:~ ?(:^+ bind-param ?@_))
               (defect "Outputs must all be of form 'param <= exp', not "
                       this-ptree)))))) >>

The checker for results calls the recursive checker result-check:

<<Define result-tree-checker

:. (def-syntactic result ... 
      ...
      :checkers
         (...  .:
          ( ?(:^+ result ?exp)
           (apply-named-checker result-check exp)) >>
<<Define result-syntax-checker

(def-named-checkers result-check
   (select 
      ( (:^+ forall ?vars ?r)
       (apply-named-checkers decls-check vars)
       (apply-named-checkers forall-body-must-be-whens r))
      ( (:^+ exists ?vars ?r)
       (defect "Existential quantifiers are illegal in results"))
      ( (:^+ when ?_ ?effect)
       (apply-named-checkers result-check effect))
      ( (:^+ and ?@conjuncts)
       (forall-check conjuncts
          result-check))
      ( (:^+ group ?_)  ;; stuff originaly separated by commas
       (defect "Can't have <comma> as a connective"))
      ( (:^+ not ?elt)
       (apply-named-checkers must-be-fun-app elt))
      ( (:^+ output ?@_)
      (:else
       must-be-fun-app))))

<<Insert: result-piece-checkers>>>>
Note that result-check is defined using def-named-checkers; the plural indicates that in general the name tags a set of checkers. In this case the set is a singleton, a checker of the form (select –clauses–), which operates by running the checkers in the first clause whose pattern matches the current parsetree. Note that to apply a named checker set to the current parsetree, all you have to do is give its name. To apply one to something else you use apply-named-checkers. The checker sets in question are now defined. To understand the first one, it's necessary to recall that declarations are parsed with a local grammar that assigns a low precedence to the contiguity operator <|>. If variables of more than one type are being declared, then this will become the main operator of the declaration. All other subexpressions must be of the form "var, var, var" or "var, var, var - type."
<<Define result-piece-checkers

(def-named-checkers decls-check 
   (select
      ( (:^+ \| ?@decls)
       (forall-check decls
          decl-check))
      (:else
        decl-check)))

(def-named-checkers decl-check
   (select 
      ( (:^+ hyphen ?@vars ?type)
       hyphen-checks)
      ( (:^+ comma ?@vars)
       (forall-check vars
          ( ?(:+ ?v (not is-Symbol))
           (defect "Variable is not symbol: " v))))
      ( ?(:+ ?v (not is-Symbol))
       (defect "Variable is not symbol: " v))))

(def-named-checkers forall-body-must-be-whens
   (select
      ( (:^+ when ?_ ?e)
       (result-check e))
      ( (:^+ and ?@conjuncts)
       (forall-check conjuncts
          forall-body-must-be-whens))
      (:else
       (defect "Body of 'forall' must be a '|->' or a conjunction"
               " of '|->' expressions")))) >>
<<Define must-be-atomic

(def-named-checkers must-be-fun-app
   ( (:~ (:^+ fun-app ?_ ?@_))
    (defect "Context requires atomic formula"))) >>
One thing these checkers do not check is whether each occurrence of a variable in a result is bound by a dominating forall. That check is deferred until the internalization phase, which is managed by a subgrammar called owl-s-as-rdf, discussed in section "????".

2.5  The Bodies of Composite Processes

Unlike atomic and simple processes, composite processes have bodies, the stuff in left braces after the IOPR specs.

<<Define syn-braces

   (def-syntactic left-brace :lex "{"
                             :prefix (:precedence 0
                                      :bracket (:open right-brace))
      :tidier
      (( (:^+ left-brace ?sub)
        !~>>- sub)))

   (def-syntactic right-brace :lex "}"
                              :suffix (:left-precedence 0
                                       :bracket (:close left-brace))) >>
(See the syntactic definition for composite.) Some are defined using reserved words, others with special character sequences.
<<Define control-constructs

   (def-syntactic anyord :lex "||;"
                         :infix (:precedence 60 :grouping :true)
      :tidier elim-left-braces

      :checkers (control-construct-checks))


   (def-syntactic split-join :lex "||>"
                             :infix (:precedence 60 :grouping true)
       :tidier elim-left-braces

       :checkers (control-construct-checks))

   (def-syntactic split :lex "||<"
                        :infix (:precedence 60 :grouping true)
       :tidier elim-left-braces

       :checkers (control-construct-checks))

   (def-syntactic choice :lex ";?"
                         :infix (:precedence 60 :grouping true)
       :tidier elim-left-braces

       :checkers (control-construct-checks))

   (def-syntactic semicolon :lex ";"
                            :infix (:precedence 65 :grouping true)
       :tidier elim-left-braces

       :checkers (( (:^+ semicolon ?@elements)
                   (forall-check elements
                      check-down-to-control-constructs))))

   ;;; All control constructs must pass these two checks ...
   (def-named-checkers control-construct-checks
      (:up 1 control-context-check)

      ( (:^+ ?_ ?@elements)
       (forall-check elements
          check-down-to-control-constructs)))

<<Insert: control-constants>>

   ;;; A control construct can appear only in certain places
   (def-named-checkers control-context-check 
      ( ?(:|| ?(:^+ if ?_ ?,start-ptree ?_)
              ?(:^+ if ?_ ?_ ?,start-ptree)
              ?(:^+ define composite ?_ ?_ ?,start-ptree)
              ?(:^+ ?(:member +owl-s-control-operators+) ?@_))
       (defect "Control construct in illegal context")))

   ;;; And it can contain only certain things
   (def-named-checkers check-down-to-control-constructs 
      ( ?(:~ ?(:^+ ?(:|| tag
                         with-namespaces
                         ?(:member +owl-s-control-operators+)
                         ?(:member +owl-s-taggables+))
                   ?@_))
       (defect "Illegal as element of composed process: " this-ptree))
      ( (:^+ with-namespaces ?e ?@_)
       (apply-named-checkers check-down-to-control-constructs e))) >>
In a checker, start-ptree refers to the parsetree where the checking began, which, in an up, is not necessarily the same as this-ptree. So the expression (:^+ if ?_ ?,start-ptree ?_) matches an "if" whose "iftrue" part is the tree we started at, one level below the current one.

The qvaroid ?(:member l) matches any datum in the list l. The list is typically a named constant. Here the two constant lists are

<<Define control-constants

(defconstant +owl-s-control-operators+
    '(anyord choice split-join split semicolon))

(defconstant +owl-s-taggables+
    '(perform produce as-process)) >>
The first list, +owl-s-control-operators+, is the set of composite-process classes that have components comprising bags of steps. The second, +owl-s-taggables+, is the set of "leaf" classes. The name derives from the fact that in the presentation syntax these are the constructs whose instances can be named using tags.
<<Define composite-process-leaves


   (def-syntactic perform :reserved true
                          :prefix (:numargs 1 :precedence 90)
      :tidier
         (( (:^+ perform (:^+ fun-app ?name [_*_] ?@pbs))
           !~>>- (:^+ perform ,name ,@pbs))
          (:else (defect "Unintelligible")))

      :checkers
         (( (:^+ perform ?(:+ ?name (not is-Symbol)) ?@_)
           (defect "Perform of process with illegal name " name))

          ( (:^+ perform ?name ?@bdgs)
           (forall-check bdgs
              (:? ?(:~ ?(:^+ bind-param ?_ ?_))
                 (defect "Illegal data input " b " to perform of "
                         name))))))

   ;; 'produce (p1 <= v1 , ...)' indicates bindings to outputs of this
   ;; process
   (def-syntactic produce :reserved true
                          :prefix (:numargs 1 :precedence 90)
      :tidier
         (( (:^+ produce (:^+ group ?@bindings))
           !~>>- (:^+ produce ,@bindings))
          ( (:^+ produce ?binding)
           !~>>- (:^+ produce ,binding))
          (:else (defect "'produce' must be followed by bindings in parens")))

      :checkers
         (( (:^+ produce ?@bindings)
           (forall-check bindings
              (:? ?(:~ ?(:^+ bind-param ?@_))
                 (defect "'produce' args must all be of form"
                         " 'param <= exp', not " b)))))) >>
To refer to the outputs of a performed process, you must tag it and then refer to the outputs as "slots" of the tag.

It happens that the syntax of the parenthesized items after perform and produce is identical to that of the items after output, although they all serve different purposes. In each case, the items must be a sequence of parameter bindings, p <= v. (The characters "<=" are converted to the token <bind-param> .) For output and produce, they describe how the results of the current process are set; the former is for atomic and simple processes, and occurs in a result construct, while the latter occurs in the body of a composite process.)

<<Define bind-param-context

:. (def-syntactic bind-param :lex "<="
                             ...
      :checkers
         (...      .:
          (:up 1
             (:? ?(:^+ ?(:|| output perform produce) ?@_))) :.)).: >>

Any leaf component of a control construct can be tagged. We indicate a tag using a double colon: The syntax is simple:

<<Define tag-syn

   (def-syntactic tag :lex "::" :infix (:precedence 70)
      :checkers
         (( (:^+ tag ?(:+ ?name (not is-Symbol)) ?_)
           (defect "Illegal tag: " name))

          ( (:^+ tag ?_ ?(:& ?x (:^+ (:& ?op
                                         ?(:~ ?(:member +owl-s-taggables+)))
                                     ?@_)))
           (defect "Illegal as tagged element: " x
                   "[" op "]"))

          (:up 1 control-context-check))) >>

The final control construct is as process(...){...}, which allows us to reuse the process syntax for pieces of a process, so that we can declare local effects, preconditions, and outputs:

<<Define as-syntax

(def-syntactic as :reserved true
                  :prefix (:precedence 75 :numargs 2)
   :tidier
      (( (:^+ as ?(:^+ process :local ?@ioprs)
                 ?body)
        !~>>- (:^+ as (:^+ process :local ,@ioprs)
                      ,body))
       (:else (defect "Ill-formed 'as process' construct")))) >>
The process spec following the token has the same syntax as a composite-process definition, except that the process has no name. So we have to broaden the syntax of process to include the anonymous case:
<<Define anon-process

 :.(def-syntactic process :reserved true
                  :prefix (:precedence 25 :numargs 1)  
      :tidier .:
          (:? ?(:^+ process (:^+ group ?@ioprs))
             !~(:^+ process :local ,@ioprs))  >>
After this transformation, all processes have names; the anonymous ones have the name :local.

2 Ontologies

In case you've forgotten, we allow pieces of ontologies to be included into Owl-S, written in the N3 notation. Hence we must provide a local grammar for it.

<<Define n3-grammar-def

(def-grammar n3-grammar :parent owl-s
   :definitions
   (
<<Insert: n3-lexical>>

   <<Insert: ontology-def>>

    <<Insert: dot-syntax>>
    ))
>>
The declaration of grammar owl-s as the parent for n3-grammar is a bit risky. We do it so that we can inherit the lexical and syntactic definitions of <colon>. But we have to make sure to avoid inheriting many of the infix operators of the algebraic language used by Owl-S, which are foreign to N3 (and RDF). (See appendix "????".) We do so by declaring them illegal at the lexical level.
<<Define illegal-n3-ops

   (def-lexical #\~ (illegal))
   (def-lexical #\< (illegal))
   (def-lexical #\> (illegal))
   (def-lexical #\& (illegal))
   (def-lexical #\| (illegal))
   (def-lexical #\= (illegal)) >>

At the top of the precedence heap, so to speak, we find ontologies introduced by the operator ontology:

<<Define ontology-def

   (def-syntactic ontology :reserved true
                  :prefix (:precedence 5 :local-grammar n3-grammar)
      :checkers
         (( (:^+ ontology ?@subs)
           (forall-check subs
              ( ?(:& ?(:~ ?(:|| ?(:^+ imports ?_)
                                ?(:^+ ontology_version ?_)
                                ?(:^+ triple ?@_)))
                     ?x)
               (defect "Illegal element of ontology: " x))))))
 
   (def-syntactic imports :reserved true
                  :prefix (:precedence 6)
      :tidier
         (( (:^+ imports (:^+ comma ?@subs))
           !~>>- (:^+ imports ,@subs)))

      :checkers
         (( (:^+ imports ?@subs)
           (forall-check subs
              (select
                 ( ?(:+ ?_ is-name)
                  :okay)
                 ( ?(:^+ uri ?_)
                  :okay)
                 ( ?sub
                  (defect "Illegal import (not a name or URI): " sub)))))))

   (def-syntactic ontology_version :reserved true
                  :prefix (:precedence 6)
      :checkers
         (( ?(:^+ ontology-version ?(:+ ?_ (not is-String)))
           (defect "Ontology version must be string")))) >>
The versioning information is there for purely documentary purposes. The imports statement is more substantial, but at the syntactic phase the grammar can check only its most superficial properties.

The important elements of the ontology are, of course, the triples and type declarations, separated by dots:

<<Define dot-syntax

    (def-syntactic left-brace :lex "{"
                   :prefix (:left-precedence 10 :right-precedence 0))

    (def-syntactic imports :reserved true
                   :prefix (:precedence 15))

    (def-syntactic dot :lex "."
                   :infix (:precedence 5
                           :numargs (0 1)
                           :grouping true))
       <<Insert: dot-checkers>>

    (def-syntactic semicolon :lex ";"
                   :infix (:precedence 10 :grouping true)
       <<Insert: semicolon-checkers>>
    )

    (def-syntactic hyphen :lex "-"
                   :infix (:precedence 18 :grouping false))

    (def-syntactic \| :lex ""
                   :infix (:precedence 20 :grouping true)
       <<Insert: contig-tidier>>)

    <<Insert: contig-checkers>>

    (def-syntactic comma :lex ","
                   :infix (:precedence 30 :grouping true))
    
   <<Insert: n3-named-checkers>>
     >>
These precedences put <dot> highest in an N3 parse, not surprisingly. It may seem natural to think of "." as a three-argument postfix operator, but Lexiparse doesn't allow postfix operators. Besides, it's really not the dot that makes the terms stick together; as shown by the way semicolons work, it's simple contiguity that means "property attribution" in N3. So we recruit the contiguity operator <|>, discussed in section 2. Its precedence is set below that of <comma>, so that "x p a,b,c" gets parsed as
 
   (:^+ | x p (:^+ comma a b c))
Then <semicolon> is given lower precedence than <|>, so that "x p1 a1; p2 a2; p3 a3" gets parsed as
   (:^+ semicolon (:^+ | x p1 a1)
                  (:^+ | p2 a2)
                  (:^+ | p3 a3))
(Or it would if the contiguity operator didn't get tidied away.) Here syntax doesn't follow semantics exactly. But it's easier to let the semantics be straightened out during internalization.

It's best to replace occurrences the contiguity operator with more revealingly named tokens:

<<Define contig-tidier

       :tidier
          (( (:^+ \| ?subj ?prop ?obj)
            !~>>- (:^+ triple ?subj ?prop ?obj))
           ( (:^+ \| ?prop ?obj)
            !~>>- (:^+ prop-obj-attrib ?prop ?obj))
           (:else
            (defect "Only two or three contiguous items allowed in N3"))) >>
<<Define contig-checkers

    (def-checkers triple
       ( ?(:^+ triple ?(:+ ?subj (not is-name)) ?_ ?_)
        (defect "Illegal subject in triple: " subj))
       ( ?(:^+ triple ?_ ?(:+ ?property (not is-name)) ?_)
        (defect "Illegal property in triple: " property)))

    (def-checkers prop-obj-attrib
       ( ?(:^+ prop-obj-attrib ?(:+ ?property (not is-name)) ?_)
        (defect "Illegal property in \"double\": " property))) >>
The checkers for triples and "doubles" verify that subjects and properties are names. They can't check anything about objects, because datatype properties allow fairly unpredictable things in the object slot.

Another thing we check at this stage is that dots and semicolons really are triples.

<<Define dot-checkers

:.  (def-syntactic dot .:
       :checkers
          ( ( (:up 1 ?(:~ ?(:^+ left-brace ?_)))
             (defect "Dot in illegal context"))

            ( ?(:^+ dot ?@subs)
             (forall-check subs
                (select
                   ( ?(:|| ?(:^+ \| ?@_)
                           ?(:^+ hyphen ?_ ?_))
                    (use-named-checkers legal-triple))
                   ( ?(:^+ semicolon ?triple
                                     ?@semisubs)
                    (apply-named-checkers legal-triple triple)
                    (forall-check semisubs
                       (use-named-checkers legal-double))
                   (:else
                    (defects "Illegal at top level of n3: "
                             this-ptree))))))) >>
<<Define semicolon-checkers

:.  (def-syntactic semicolon .:
       :checkers
           ( ( (:up 1 ?(:~ ?(:^+ dot ?@_)))
              (defect "Semicolon in illegal context"))))  >>
The actual checking is done by these named checkers:
<<Define n3-named-checkers

   (def-named-checkers legal-triple
        ( ?(:~ ?(:^+ ?_ ?_ ?_))
         (defect "Triple without 3 elements!"))

        ( ?(:^+ ?(:+ ?subj (not is-name)) ?_ ?_)
         (defect "Triple with subject that isn't a name: " subj))

        ( ?(:^+ ?_ ?(:+ ?property (not is-name)) ?_)
         (defect "Triple with property that isn't a name: " property)))

   ;; We don't attempt to check for a legal object, because
   ;; object properties are too varied

   ;; A "double" is the stuff after a semicolon
   (def-named-checkers legal-double
        ( ?(:~ ?(:^+ ?_ ?_))
         (defect "Illegal property-object pair after semicolon"))

        ( ?(:^+ ?(:+ ?property (not is-name)) ?_)
         (defect "Illegal property in propert-object pair: " property))) >>


  2 Formulas

The only parts of OWL-S that are left to deal with are the logical formulas in conditions, effects, and various output constructs. Here we incorporate a simple logical language capturing the normal encodings of Lisp as infix syntax.

Most of this grammar is inherited from the simple-arith grammar that is part of the Lexiparse package All this grammar does is provide the usual precedence hierarchy for multiplication (*), addition (+), and their ilk. All the precedences for the operators lie in the range 100 to 200.

All we introduce here are the usual quantifiers:

<<Define quantifiers

   (def-syntactic forall :reserved true
                         :prefix (:numargs 2 :precedence 90
                                  :local-grammar ((1 typed-var-grammar))))

   (def-syntactic exists :reserved true
                         :prefix (:numargs 2 :precedence 90
                                  :local-grammar ((1 typed-var-grammar)))) >>

Finally, in table Cross Precedence, we show which operators a given operator "dominates." The row for operator i shows how it behaves when competing with other operators to its right. The first column shows how it behaves when matched against itself. The second column shows all the operators whose left precedence is greater than operator i. If operator j appears in the list, it will tend to appear in trees headed by operator i. (A bullet to the left of an operator's name in the row labels indicates that it is an infix operator, to minimize confusion about operators with infix and prefix versions.) Of course, this table gives only a crude indication of how each operator interacts with the entities below it in the parsetree; the details are spelled out in the syntactic spec for the operators in question.

Table 2-3:Operator precedence comparisons
Operators on right
Operator
on left
Associa
tivity
Dominated tokens

2 Comments

Comments in Owl-S are similar to Java comments, with a couple of differences. Multiline comments are bounded by /*and */. Single-line comments start with //. In Java, as in C/C++, multiline comments cannot be nested. Owl-S pays attention to nesting. In Java, a doubled asterisk in the opening of a multiline comments means that it should be used by Javadoc. In Owl-S, it means that the comment should be transformed into an rdfs:comment declaration. [[ This is harder than it sounds ]]

Lexiparse handles comments mainly at the lexical level. Comments "float above" other syntactic material, and get swept up into whatever parsetrees they are closest to. They are invisible to tidiers and checkers. (They should be visible to parsetree pretty-printers and other debugging tools.) However, internalizers are free to do what they want with them, including turning them into rdfs:comments.

Here are the definitions of the comment characters

<<Define comments

(with-grammar owl-s

   (def-lexical #\/
      (dispatch
         (#\* (dispatch
                 (#\* (comment
                         :style (:multiline "*/" 0 :important ("/*" "*/"))))
                 (:else
                    (comment
                       :style (:multiline "*/" 0 ("/*" "*/"))))))
         (#\/ (dispatch
                 (#\/ (comment :style (:one-line 2)))
                 (:else (comment :style (:one-line 1)))))
         (:else (token divide))))) >>
Hopefully these declarations are not too mystifying.

Appendix : Standard Arithmetic Syntax

This is the grammar that most of the Owl-S grammar is based on.

<<Define arithgram [arithgram, line 2]
(in-package :lexiparse)
;; $Id: owl-s-gram-htm.html,v 1.4 2006/09/01 20:57:39 martin Exp $

(depends-on %module/ lexiparse)

(eval-when (:compile-toplevel :load-toplevel :execute :slurp-toplevel)
   (export '()))

(def-grammar standard-arith-syntax)

(with-grammar standard-arith-syntax

(def-lexical-tokens  ((#\( left-paren)
		      (#\, comma)
		      (#\) right-paren)
		      (#\~ not)
		      (#\- minus)
		      (#\+ plus)
		      (#\* times)
		      (#\/ divide)
		      (#\< less)
		      (#\& and)
		      (#\| or)))

(def-lexical #\= (dispatch
                    (#\< (token leq))
                    (else (token equals))))

(def-lexical #\> (dispatch
                    (#\= (token geq))
                    (else (token greater))))

(def-lexical (#\0 - #\9) #'lex-number
    :name numeric)

(def-lexical (#\a - #\z #\A - #\Z) #'one-char-sym
    :name alphabetic)

(def-lexical #\" (lex-string false !()))
)

;;; Except for parens' internal precedences, the entire standard-
;;; arith system has precedences >= 100.  This leaves plenty of room
;;; for "superstructure."


(with-grammar standard-arith-syntax

(def-syntactic comma :lex "," :infix (:precedence 100 :grouping true))

(def-syntactic right-paren :lex ")"
               :suffix (:left-precedence 0 :bracket (:close left-paren)))

;; The left-prec of left-paren (as infix) is 200, which is higher than
;; the right-prec 190 for times and divides.  It's rare to see
;; a op b(...) get parsed as (a op b)(...), but it can be arranged by
;; giving 'op' a right-precedence >= 200.  -- 

(def-syntactic left-paren
        :lex "("
	:prefix (:precedence 0 :bracket (:open right-paren))
	:infix (:left-precedence 200 :right-precedence 0
				 :bracket (:open right-paren)))

(def-syntactic or :lex "|" :infix (:precedence 120 :grouping true))

(def-syntactic and :lex "&" :infix (:precedence 130 :grouping true))

(def-syntactic not :lex "~" :prefix (:precedence 140))


(def-syntactic greater :lex ">" :infix (:precedence 160))

(def-syntactic less :lex "<" :infix (:precedence 160)) 

(def-syntactic geq :lex ">=" :infix (:precedence 160)) 

(def-syntactic leq :lex "=<" :infix (:precedence 160)) 

(def-syntactic equals :lex "=" :infix (:precedence 160))

(def-syntactic plus :lex "+"
		    :prefix (:precedence 180 :numargs 1)
                    :infix (:precedence 180 :grouping true))

(def-syntactic minus :lex "-"
		     :prefix (:precedence 180 :numargs 1)
                     :infix (:precedence 180))

(def-syntactic times :lex "*" :infix (:precedence 190 :grouping true))

(def-syntactic divide :lex "/" :infix (:precedence 190 :grouping true))

)
>>

Appendix -4: OWL-S Syntax: Top-Level

For the complete, "(un)tangled," listing for this file, see owl-s-syn.lisp

<<Define File owl-s-syn-new.lisp
;-*- Mode: Common-lisp; Package: lexiparse; Readtable: ytools; -*-
(in-package :lexiparse)

(depends-on %module/ ytools lexiparse)

(depends-on %lexiparse/ arithgram)

#+allegro
(defun allegro-case-choice ()
   (cond ((eq excl:*current-case-mode* ':case-sensitive-lower)
          ':preserve)
         (t ':up))) 

<<Insert: grammar-def>>

<<Insert: typed-vars-tidier>>

<<Insert: result-syntax-checker>>

<<Insert: must-be-atomic>>

<<Insert: ??control-checks??>>

(with-grammar owl-s

;;; LEXICAL

   (def-lexical-tokens ((#\{ left-brace)
                        (#\} right-brace)
                        (#\. dot)))

   ;; ;? = choice, ; = sequence
   (def-lexical #\;
         (dispatch (#\? (token choice))
                   (:else (token semicolon))))

   (def-lexical #\| 
   ;; ||; = any-order, ||> = split+join, ||< = split.
   ;; '|->' becomes the token <when> --
                    (dispatch     
                       (#\| (dispatch 
                               (#\; (token anyord))
                               (#\> (token split-join))
                               (#\< (token split))))
                       (#\- (dispatch (#\> (token when))))
                       (:else (token or))))

   ;; '->' is ordinary if-then --
   (def-lexical #\- (dispatch
                       (#\> (token imply))
                       (:else (token minus))))

   ;; '=>' becomes the token <when>, although '|->' is now the preferred form
   (def-lexical #\= (dispatch
                       (#\> (token when))
                       (:else (token equals))))

   ;; '<=' becomes the token <bind-param>
   (def-lexical #\< (dispatch
                       (#\= (token bind-param))
                       (:else (token less))))

   ;; '::' is for step tags --
   (def-lexical #\: (dispatch
                        (#\: (token tag))
                        (:else (token colon))))

   (def-char-set owl-s-lexical-chars* (#\_))

   (def-lexical (#\_ #\a - #\z #\A - #\Z)
       (lex-sym (owl-s-lexical-chars))
       :name alphabetic)


;;; SYNTACTIC

   <<Insert: namespace-spec>>

   <<Insert: uri-spec>>

   <<Insert: process-definition>>

   <<Insert: process-classes>>

   <<Insert: process-spec>>

   <<Insert: param-decls>>

   <<Insert: preconds>>

   <<Insert: results>>

   <<Insert: output-syntax>>

<<Insert: colon>>

   <<Insert: quantifiers>>

   <<Insert: syn-when>>

   <<Insert: syn-braces>>

   <<Insert: bind-param-syn>>

   <<Insert: control-constructs>>

   <<Insert: if-syntax>>

   <<Insert: then-else-syntax>>

<<Insert: composite-process-leaves>>

<<Insert: as-syntax>>

   <<Insert: tag-syn>>

   <<Insert: left-paren>>

   (def-syntactic \| :lex "" :infix (:precedence 3 :grouping true))

   (def-syntactic dot :lex "." :infix (:precedence 205)
      :checkers
      ((:? ?(:^+ dot ?step ?output)
          (nconc (cond ((is-Symbol step) !())
                       (t (list (defect "Illegal name to left of dot in "
                                        step "." output))))
                 (cond ((is-Symbol output) !())
                       (t (list (defect "Illegal name to right of dot in "
                                        step "." output))))))))
)


;;; LOCAL GRAMMAR for type declarations

<<Insert: typed-var-grammar>>

(defun is-name (x)
   (or (is-Symbol x)
       (matchq ?(:^+ name-wrt-space ?_ ?_)
               x))) >>

Appendix : N3 Syntax: Top-Level

This is the outline of the N3 syntax discussed in section "2", plus a few key pieces not covered there.

<<Define File n3-syn-new.lisp
;-*- Mode: Common-lisp; Package: lexiparse; Readtable: ytools; -*-
(in-package :lexiparse)

(depends-on %module/ ytools lexiparse)

<<Insert: n3-grammar-def>>

>>
<<Define n3-lexical


   <<Insert: illegal-n3-ops>>

   (def-lexical #\. (token dot))

     >>

Bibliography

Coa05
The OWL-S Coalition. OWL-S Release 1.1beta. Recent draft, 2004.
Knu84
Donald E. Knuth. Literate programming. The Computer Journal , 27(2):97-111, 1984.
McD04a
Drew McDermott. The YTools Manual, version 1.4
McD04b
Drew McDermott. Lexiparse: A Lexicon-based Parser for Lisp Applications. Draft, 2004.
MH04
Deborah L. McGuinness and Frank van Harmelen (eds.) OWL Web Ontology Language Overview. W3C Recommendation 2004
Rei78
Raymond Reiter. On closed world data bases. In Gallaire and Minker (eds). Logic and Databases, Plenum Press.