The original syntax for processes in OWL-S is somewhat unwieldy, to put it mildly. The main design constraint was that it be RDF; but the job it is doing, describing processes in all their intricate detail, is not really suitable for RDF, which is basically a metadata formalism. Therefore, we are moving toward a more readable syntax, which can nonetheless be translated into the RDF notation when required. Our syntax includes N3 [<>] as a subset, then adds what we hope is a natural algebraic syntax for expressing variable declarations and control structures.
Here is an example of a process definition, drawn from the Congo domain [[ ]], which involves a book-selling service. All XML files must begin with namespace declarations, and ours is no exception:
<<Define congo-namespaces [congo, line 3]
with-namespaces
(uri"http://www.daml.org/services/owl-s/1.2/CongoProcess.owl",
rdf: uri"http://www.w3.org/1999/02/22-rdf-syntax-ns",
rdfs: uri"http://www.w3.org/2000/01/rdf-schema",
xsd: uri"http://www.w3.org/2001/XMLSchema",
owl: uri"http://www.w3.org/2002/07/owl",
expr: uri"http://www.daml.org/services/owl-s/1.2/generic/Expression.owl",
swrl: uri"http://www.w3.org/2003/11/swrl",
// swrlx: "http://www.daml.org/services/owl-s/1.2/generic/swrlx.owl",
service: "http://www.daml.org/services/owl-s/1.2/Service.owl",
process: "http://www.daml.org/services/owl-s/1.2/Process.owl",
time: "http://www.isi.edu/~pan/damltime/time-entry.owl",
objList: "http://www.daml.org/services/owl-s/1.2/generic/ObjectList.owl",
congo: "http://www.daml.org/services/owl-s/1.2/CongoService.owl",
profileHierarchy: "http://www.daml.org/services/owl-s/1.2/ProfileHierarchy.owl")
>>
It is not necessary to include any ontological definitions in an Owl-S file, but it's often quite convenient. We use a superset of the N3 notation:
<<Define congo-ontology [congo, line 22]
{
ontology
{
ontology_version
"1.88 2006/01/30 07:52:44" .
imports time, service, process, profileHierarchy,
uri"http://www.daml.org/rules/proposal/swrl.owl" .
/**
A B2C bookbuying example of OWL-S (Web Ontology Language
for Services; see http://www.daml.org/services/owl-s) usage,
*/
shippedTo - owl:ObjectProperty;
rdfs:domain Shipment;
rdfs:range :AcctID .
shippedBook - owl:ObjectProperty;
rdfs:domain Shipment;
rdfs:range profileHierarchy:Book .
deliveryType - owl:ObjectProperty;
rdfs:domain Shipment;
rdfs:range DeliveryType .
packagingType - owl:ObjectProperty;
rdfs:domain Shipment;
rdfs:range PackagingType .
Shipment - owl:Class;
rdfs:subClassOf [ a owl:Restriction;
owl:onProperty :shippedTo;
owl:cardinality 1 ],
[ a owl:Restriction;
owl:onProperty :shippedBook;
owl:cardinality 1 ],
[ a owl:Restriction;
owl:onProperty :deliveryType;
owl:cardinality 1 ];
[ a owl:Restriction;
owl:onProperty :packagingType;
owl:cardinality 1 ] .
cardNumber - owl:DatatypeProperty;
rdfs:domain CreditCard;
rdfs:range xsd:decimal .
cardType - owl:ObjectProperty;
rdfs:domain CreditCard;
rdfs:range CreditCardType .
cardExpiration - owl:DatatypeProperty;
rdfs:domain CreditCard;
rdfs:range xsd:gYearMonth .
validity - owl:ObjectProperty;
rdfs:domain CreditCard;
rdfs:range ValidityType .
CreditCard rdfs:subClassOf [ a rdfs:Restriction
owl:onProperty cardNumber;
owl:cardinality 1 ],
[ a rdfs:Restriction
owl:onProperty cardType;
owl:cardinality 1 ],
[ a rdfs:Restriction
owl:onProperty cardExpiration;
owl:cardinality 1 ],
[ a rdfs:Restriction
owl:onProperty validity;
owl:cardinality 1 ] .
FailureNotification enum { NotifyBookNotFound NotifyBookOutOfStock } .
OrderShippedAcknowledgment - owl:Class .
ExpressCongoBuyOutputType - owl:Class;
owl:unionOf ( OrderShippedAcknowledgment
FailureNotification ) .
FullCongoBuyOutputType - owl:Class;
owl:unionOf ( OrderShippedAcknowledgment
FailureNotification ) .
LocateBookOutputType - owl:Class;
/** If successful, LocateBook returns an ISBN;
otherwise, it returns a FailureNotification. */
owl:unionOf ( profileHierarchy:ISBN FailureNotification ) .
InStockBook - owl:Class;
/** The set of books that are found in stock of Congo. */
rdfs:subClassOf profileHierarchy:Book .
OutOfStockBook - owl:Class;
/** The set of books that are NOT found in stock of Congo. */
rdfs:subClassOf profileHierarchy:Book;
owl:disjointWith InStockBook .
hasBook - owl:ObjectProperty;
/** Inverse of hasISBN property, i.e. relate an ISBN number
to a book. */
owl:inverseOf profileHierarchy:hasISBN .
SignInData - owl:Class .
acctName - owl:DatatypeProperty;
rdfs:domain SignInData;
rdfs:range xsd:string .
password - owl:DatatypeProperty;
rdfs:domain SignInData;
rdfs:range xsd:string .
UserProfileInfo - owl:Class .
AcctInfo - owl:Class;
owl:unionOf ( SignInData UserProfileInfo ) .
AcctID - owl:Class;
hasAcctID - owl:FunctionalProperty;
/** An account exists if there is an account ID associated
with it. */
rdfs:domain AcctInfo;
rdfs:range AcctID .
NonExistingAcct - owl:Class;
/** If there is no account ID for an account then that
account simply does not exist */
owl:intersectionOf ( AcctInfo
[ a owl:Restriction;
owl:onProperty hasAcctID;
owl:maxCardinality 0 ] ) .
}
>>
The principal enhancement to the notation is to allow constants to be
declared using a hyphen between the name of the constant and its
"type." Obviously, in Owl objects can belong to more than one class,
but as we will see, type checking is quite useful [[ why exactly? ]];
if there is no reason to single a particular class out as "the" type
of an object, one can of course leave it undeclared or declare it to
be of type owl:Thing. An undeclared constant must be
referred to using the usual N3 notation, with a prefixed colon. The
name :foo is conventionally an abbreviation for <#foo>;
in Owl-S this is not a convention. The angle-bracket notation
(<#foo>) is not supported, so the prefix colon is the only way
to refer to an undeclared global constant. This is consistent with
the general approach to namespaces described in
Section "????"
Now we come to an actual process definition
<<Define ExpressCongoBuy [congo, line 283]
define atomic process ExpressCongoBuy
/**
This is an express "one shot" service for buying a book
with Congo. It takes as input a Book ISBN number and
the customer's sign-in information, and has the effect of
ordering the book if the book is in stock. In this case,
the output of the service is a message saying the book
is ordered. If the book is not in stock, the output says
that the book is out of stock.
*/
(inputs: (ExpressCongoBuyISBN - profileHierarchy:ISBN,
ExpressCongoBuySignInInfo - SignInData,
ExpressCongoBuyCreditCardNumber - xsd:decimal,
ExpressCongoBuyCreditCardType - CreditCardType,
ExpressCongoBuyCreditCardExpirationDate - xsd:gYearMonth),
exists: (ExpressCongoBuyAcctID - CustomerAcctID,
/** This is a local variable that represents the associated
account ID for the given sign-in info. The existence of
this value indicates that the sign-in info is valid an
account exists. */
` ExpressCongoBuyCreditCard - CreditCard
/** This is a local variable that represents the associated
credit card for the given card number. */
),
precondition: (hasAcctID(ExpressCongoBuySignInInfo, ExpressCongoBuyAcctID)
& credit-card-number(ExpressCongoBuyCreditCard,
ExpressCongoBuyCreditCardNumber)
& validity(ExpressCongoBuyCreditCard, Valid)),
/** Typically this condition should also say that
given credit card type and expiration date is also
correct for the credit card. Those details are left
out for this example. */
outputs: (ExpressCongoBuyOutput - ExpressCongoBuyOutputType),
result: (forall (ExpressCongoBuyBook - profileHierarchy:Book)
/** book identified by the given ISBN input */
(hasISBN(ExpressCongoBuyBook, ISBN)
& inStockBook(ExpressCongoBuyBook)
|->
/** If the book is in stock, then the result is that the
order was shipped and an appropriate acknowledgment
is output. */
N3$${ ExpressCongoBuyOutput
a orderShippedAcknowledgment }
& exists(ExpressCongoBuyShipment - Shipment)
/** shipment that gets generated on
a successful purchase */
(shippedTo(ExpressCongoBuyShipment, AcctID)
& shippedBook(ExpressCongoBuyShipment,
ExpressCongoBuyBook)))
&
(hasISBN(ExpressCongoBuyBook, ISBN)
& N3$${ ExpressCongoBuyBookISBN
a [ owl:Restriction;
owl:onProperty hasBook;
owl:allValuesFrom outOfStockBook ] }
/** If the book is out of stock, then the result is simply
that an appropriate acknowledgment is output, indicating
that the book is out of stock. */
|-> output(ExpressCongoBuyOutput
<= notifyOutOfStock))))
>>
An atomic process is one that is defined entirely in terms of the
values it returns and the effects that it has, without any mention of
its subevents. (At the time scale of a typical web-service
interaction, any such subevents can be neglected.) Modeling a service
as atomic in this way is closely analogous to the way web services are
modeled in WSDL [<>], and in fact that's why many web services can be
described in Owl-S and grounded in WSDL.
The inputs and precondition fields of a definition describe what goes into a process, and the outputs and results fields describe what comes out. (The exists field is used to bind variables [[ and will be described later ]].) What the process definition says is that if the inputs satisfy the type descriptions and the preconditions are true, then the process will produce outputs of the prescribed types and the result will "occur." The prototypical result field is a proposition that will become true (e.g., shippedTo(AcctID)), but these effects can be gated by conditions (e.g., inStockBook(book)), and some result fields specify output values instead of propositions (e.g., output(Status <= orderShippedAcknowledgement)).
In addition to atomic processes, we also have composite processes, which have bodies describing what other processes must carried out in order for the process to be consummated.
In what follows I will specify the grammar for the ontology
and process
definitions (sect. "????");
discuss the syntactic analysis provided by this grammar
(sect. "????");
and then sketch the XMLified RDF that
would be produced by "internalizing" the parsetrees, which is
essentially the same as the original XML version in [
This document consists of pieces of code intermixed with text, in a "literate programming" style [Knu84]. Some of the code comes from our CongoBuy example, but in addition we present the entire grammar in this format. [[ See [McD04b], appendix 1, for a brief explanation of this incarnation of literate programming. ]]
The OWL-S grammar is expressed in a somewhat nontraditional form, and to explain it will require a digression on the topic of syntax. [[ The syntax notation is for the Lexiparse parser [McD04b]. ]] Although human languages are more complex, the formal languages used for programming, logic, and OWL-S process definition are fundamentally quite simple. Every expression can be thought of as a tree structure whose leaves are the lexemes of the language. (We leave aside for the moment how to think about the structure of those lexemes.) It is convenient to assume that every nonleaf tree structure has an "operator" lexeme that sits at the top. The most obvious examples are arithmetic expressions such as "a * b + c," which can be diagramed as a the tree shown in [[ ]]. We will draw our tree structures using variations of the parenthesis notation apparently first invented by Noam Chomsky [<>]. This one can be drawn (+ (* a b) c). The operator of the tree is "+"; its left subtree is itself a tree, with operator "*".
We will be using two sorts of parenthesized tree expression: the one just exemplified, which is of course the notation used by the programming language Lisp [<>]; and a variant in which the symbol ":^+" comes between the open paren and the operator. The reason is that we will be using Chomsky-tree-bracket notation for two purposes: to express the rules of the grammar, and to describe the trees of the language defined by the grammar. For the first purpose we will use traditional Lisp notation, and for the second we would write our example tree as (:^+ + (:^+ * a b) c). (The "Tinkertoy" symbol is meant to resemble a treelet with two subnodes.)
The Lexiparse formalism is different from most in that it is cast
in terms of precedences and parsetrees rather than
phrase-structure rules[
Suffix
operators are just a special case of infix operators whose numargs =
[0,0]. Suffix operators that take more than one argument to
their left are impossible.
The notation for specifying these things, and a couple of other
items of less import, will become clear as we go.
These features suffice to define a deterministic grammar that can be parsed rapidly, transforming any string of lexemes into a parsetree without further ado. From that point on, we express our grammar in terms of transformations that clean up and filter parsetrees, producing error messages for the "defective" ones that don't pass the filters. The grammar rules are written in a Lisp-style notation, and the parsetrees they refer to are written using "Tinkertoy" notation.
The higher the precedence of a token, the
"tighter" it binds to the expressions adjacent to it, which means
that high-precedence tokens tend to head small(er) subtrees of
the large(r) parsetrees headed by low-precedence tokens. The tokens
near the root of the tree are those with the lower precedences.
(This might be one context where computer scientists should draw
their trees with the leaves up!)
All grouping decisions can be made purely locally. Suppose the parser
has the sequence A op B in hand, where A and B
are trees and op is an operator. The question is whether to
build the tree (:^+ op A B), or to keep looking for
material to the right of B that could be used to augment the
second operand of op. If there is an operator to the right of
B, and its left precedence is higher than op's right
precedence, the parser augments B; otherwise, it doesn't.
Example: Having seen x + y, if the next lexeme is
*, then the parser will augment y, so that it may
well wind
up with a tree that looks like (:^+ + x (:^+ * y ...)). If
the next lexeme is absent (we're at the end of the stream),
or it's a close bracket (whose left precedence is
0, and so much lower than any normal operator),
or it's
"-" (whose left precedence is the same as +'s right
precedence, which, let's say, is 180), the parser will build
(:^+ + x y). Of course,
that the parser must then decide whether to incorporate that
into a larger tree
tree (such as (:^+ - (:^+ + x y) ...)), depending on exactly the same
considerations as before, but now the sequence A op B has
A = (:^+ x y), op = -, and B =
whatever's to the right of -.
This is all well and good, but there are many expressions that don't consist of an application of an operator to one or two arguments. In Owl-S, how do we handle something as complex as define, which expects a bunch of keywords to the right, followed by a complex structure of inputs, results, and such? Each case must be handled in its own way, but the idea is to define keywords such as process or atomic as operators themselves, so that the parsetree headed by define contains subtrees headed by these "sub-operators." The syntax of define then consists of its basic precedence information plus operations for tidying and checking a tree headed by an occurrence of it. Checking a tree means looking for illegal patterns; tidying occurs first, in order to clean up the structure produced by the basic precedence grammar. These are embodied in rule sets called the tidiers and checkers for define.
What this means is that to read a Lexiparse grammar you look for the patterns that are expected to occur under an operator. These are not found on the right-hand sides of grammar rules, but inside the tidiers and checkers for each operator. The language of checkers in particular contains many features that go far beyond context-free grammars.
The most intuitive example is the classic
if-then-else, which we'll present here even though it
belongs much later, in the section on control constructs in composite
processes. Intuitively, <if> is an operator with two or
three arguments: a test, something to do if the test is true, and,
optionally, something to do if the test is false. We capture that by
making <else> into an infix operator with a higher
precedence than <then>, so that the three-argument case has
a subtree of the form (:^+ then (:^+ else iftrue
iffalse)).
<<Define then-else-syntax
(def-syntactic then :reserved true
:prefix (:precedence 87 :numargs 1))
(def-syntactic else :reserved true
:infix (:precedence 88)) >>
The field ":reserved true" means that this token is
derived from occurrences of the symbol itself,
and not from some character sequence (as <left-paren> is derived from
'(').
The lexical level of the language is deal with in appendix? "????"
Now we define if thus:
<<Define if-syntax
(def-syntactic if :reserved true
:prefix (:numargs 2 :precedence 85)
:tidier (( (:^+ if ?test (:^+ then (:^+ else ?iftrue ?iffalse)))
!~>>- (:^+ if ,test ,iftrue ,iffalse))
( (:^+ if ?test (:^+ then ?iftrue))
!~>>- (:^+ if ?test ?iftrue))
(:else
(defect "'if' has no 'then' part")))) >>
The first clause establishes that the precedence of the token
if is 85, which places it above then in any
parsetree. The :tidier section is a list of pairs of
patterns:
( ( (:^+ ...) !~>>- (:^+ ...) )
( (:^+ ...) !~>>- (:^+ ...) )
(:else ...)
)
(but we normally use fewer spaces and more newlines +
indentation).
On the left side of a "!~>>-" rule, question marks
indicate chunks of a parsetree to extract and name. On the right
side, commas indicate items to plug into a parsetree being built.
We use standard notations for pattern matching and expression
construction. The former are derived from logic programming and the
latter from Lisp "quasi-quote" or "backquote."
These are not regular expressions, mainly because we are not
matching strings. The precedence grammar has given us crisply organized
structures to
match against, not saggy strings. A match variable of the
form ?var matches a single item in a parsetree.
These are also called qvars.
A qvaroid is an expression where the question mark is
followed by a parenthesized list, beginning with a symbol. What it
matches is governed by that symbol. The qvaroid ?(:^+ op
–subs–) matches a parsetree with operator
op and subtrees subs. (The "subtrees" are not always
parsetrees themselves; they can be occurrences of symbols, strings,
and other lexical primitives.)
The pattern ?(:^+ if
?test (:^+ then (:^+ else ?iftrue ?iffalse))) matches any
parsetree of the form (:^+ if __ (:^+ then (:^+ else __
__))) and binds the variables test,
iftrue, and iffalse to the pieces filling in
the blanks. (If the left-hand side is a parsetree, as it usually is,
the question mark in front of it may be omitted; but in other contexts
it will be mandatory.)
On the right side of a "!~>>-" rule,
commas indicate where the pieces of a newly constructed parsetree are to
come from. So (:^+ if ,test ,iftrue ,iffalse) creates a
new parsetree with a "flatter" structure of three subtrees,
respectively the values of the variables test,
iftrue, and iffalse.
After either a
question mark or a comma, an atsign indicates that a series of
elements is to be matched or substituted. ?@v
matches a series of parsetrees and stores them (as a list) in
v, and ,@exp substitutes the value of the
list exp (as a series) into the parsetree being built. Only
one ?@ may occur in a single list. That doesn't mean in
the entire pattern, but just within one list in the pattern.
The ,@ construct has no analogous limitations
We will see examples of the atsign constructs below.
If neither transformation rule in the tidier applies, then the
:else clause kicks in. The expression
(defect ...) is obviously not a transformation rule. Its pattern
consists of strings and expressions. The values of the expressions
are concatenated with the strings to make a description of a defect,
or "flaw," in the syntax tree. These defects are accumulated, and in
the end produce error messages about an unsuccessful parse.
The <if> construct is simple enough that it needs no
separate checker. If it had needed one, it would have been defined
with a separate :checkers clause in
def-syntactic. We will see examples of this shortly.
Here's another key example, the definition of left-parenthesis:
<<Define left-paren
;; Eliminate commas from parenthesized expressions
(def-syntactic left-paren :lex "("
:prefix (:right-precedence 0
:bracket (:open right-paren))
:infix (:left-precedence 200 :right-precedence 0
:bracket (:open right-paren))
:tidier
(( (:^+ left-paren [*_] (:^+ comma ?@subs))
!~>>- (:^+ group [*_] ,@subs))
( (:^+ left-paren [*_] ?e)
!~>>- e)
( (:^+ left-paren ?f [_*_] (:^+ comma ?@args))
!~>>- (:^+ fun-app ,f [_*_] ,@args))
( (:^+ left-paren ?f [_*_] ?arg)
!~>>- (:^+ fun-app ,f [_*_] ,arg)))) >>
Left parentheses can occur as infix or prefix operators. In either
case a left paren is (big surprise) a :bracket closed by a
right paren; its right precedence of 0 causes all other operators to
bind tighter, until a right bracket is seen, with left precedence of
0. The resulting parsetrees are then tidied using four rules. The first
two contain the symbol [*_], indicating that they apply only
to prefix occurrences of <left-paren>; the second two substitute
[_*_], which indicates that they match with infix
parsetrees only. What the rules do depends on whether the main
operator inside the parentheses is a <comma> or not. For
reasons [[ explained below/that need not concern us here ]],
<comma> can dominate a parsetree with an indefinite number of
subtrees. Hence we use "?@" as a prefix on subs and
args in our match patterns, and ",@" as
prefix on the same variables in our substitution patterns. In those cases, the
<comma>s are removed.
A prefix paren not followed by a
comma is redundant (having served its purpose of
overriding normal precedence relationships) and it is removed (by the
second tidier clause).
In all other cases the <left-paren>
token is replaced; prefix <left-paren> becomes the token
<group> and infix <left-paren> becomes <fun-app>
(short for "function application"). In what follows, tidiers will
often look for occurrences of <group>, and it's useful to bear
in mind where they came from.
One thing that may not be clear is that inside a
pair of brackets
there must always be a single expression. Typically some
grouping connective, such as <comma> in the case of ordinary
parentheses, is used to bundle together multiple expressions. As we
will see later, if there is no explicit connective, Lexiparse will
insert one.
The discussion in the rest of this section roughly follows the order shown in the
Tables "Right Precedence"
and "Left Precedence", from high-level,
low-precedence tokens
to low-level, high-precedence tokens. [[ It does? ]]
Before continuing with the discussion, it might be useful to stop and
scan these tables, which
show the overall precedence structure of OWL-S. An infix operator
(such as
<left-paren>in some contexts) has both a left and
right
precedence. The former determines how well it competes with tokens to
its left, the latter, how well with those to its right.
|
|
Later, in Table Precedence Array, we show the "cross product" of these two tables, indicating more clearly which tokens tend to occur below which other tokens.
We start by declaring that we're constructing a grammar, called
owl-s:
<<Define grammar-def
(def-grammar owl-s
:top-tokens (ontology define with_namespaces)
:lex-parent standard-arith-syntax
:syn-parent standard-arith-syntax
:sym-case #-allegro :up #+allegro (allegro-case-choice)) >>
The :top-tokens declaration indicates that
a legal process definition in OWL-S must be headed by one of the
tokens ontology,
define, or
with_namespaces. (This last one is actually the most
likely; it's explained in Section "????";
ontology is discussed in
section "????".)
The rest of the grammar consists of definitions of the tokens of the
language, starting with those that
appear higher (closer to the root) in parsetrees, which are (more or
less) the same as the :top-tokens. We start with
<define>.
A process-model definition consists of a sequence of process
definitions, each atomic, simple, or composite.
All three stem from the <define> operator.
<<Define process-definition
(def-syntactic define :reserved true
:prefix (:precedence 5 :numargs 1)
:tidier
(( (:^+ define (:^+ composite
?(:^+ process ?name ?@iopr-spec)
?@body-spec))
!~>>- (:^+ define composite ,name (:^+ iopr ,@iopr-spec) ,@body-spec))
( (:^+ define (:^+ atomic (:^+ process ?name ?@iopr-spec)))
!~>>- (:^+ define atomic ,name (:^+ iopr ,@iopr-spec) nil))
( (:^+ define (:^+ simple (:^+ process ?name ?@iopr-spec)))
!~>>- (:^+ define simple ,name (:^+ iopr ,@iopr-spec) nil))
(:else (defect "Unintelligible 'define' expression")))
<<Insert: process-definition-checkers>>>>
The tidier for <define>, like <if>'s, flatten
the structure produced by the operators <atomic>,
<composite>, etc. (which are defined later). Now we add in
a :checkers field, which
specifies various validity tests for the parsetrees headed by
<define>:
<<Define process-definition-checkers
:checkers
(
((:~ ?(:^+ define ?_ ?(:+ ?name is-name) ?@_))
(defect "Process has illegal name: " name))
(:up *
((:~ ?(:^+ ?(:|| with-namespaces left-brace) ?@_))
(defect "'define' can appear below only 'with-namespaces'"
" or 'left-brace'")))
)) >>
Note that checkers is a list of (in this case, two) items. Each
specifies a test to be
run. All the tests are run, and all the defects they find are
accumulated. Important: Tidiers are run as soon as each sub-parsetree is
first constructed. Checkers are run only after all grouping and
tidying are finished for the entire parsetree; in addition, for a given
sub-parsetree, checking is canceled if the
tidying phase found defects. Now back to define. The second
checker starts with
the symbol :up; the first is of the form
(pattern –defects-or-checkers–)
If the pattern matches the current parsetree, then the
defects-or-checkers specify what happens. Expressions of the
form (defect ...) create defects, and anything else is
interpreted as another checker to be run.
(If a checker has a pattern followed by nothing, a
standard defect of the form "Doesn't match this pattern" is created.
To indicate that no further checkers need be run, one just writes
:okay.)
Some new features of the pattern language appear at this point.
They are summarized in table "????".
The meaning of the first checker may be glossed as, "If the parsetree
does not have at least two children, the second of which
satisfies the test is-name, then report the defect
[Process has illegal name: n], where n is
the offending second child."
?_ |
Matches anything |
?(:~ p) |
Matches only if p does not |
?(:& p1 ...pK) |
Matches only if each pI does |
?(:|| p1 ...pK [:& p0]) |
Matches only if one of the pI and, |
| if present, p0 match | |
| ?(:+ p –tests–) | Matches if p matches and the item in question satisfies the tests.
The notation (:up * –checkers–) indicates
that all the
ancestors of
the current parsetree are to be checked using the given checkers.
(If "1" had appeared instead of the asterisk,
then only its parent would have been checked.)
In this case the checker limits define to
appearing either at the top level or below a with_namespaces,
possibly with some enclosing curly braces.
The syntactic rules above introduce the acronym IOPR, which stands for "Inputs, Outputs, Preconditions, and Results" — all the types of entities that define the external view of the process. For atomic and simple processes, the external view is all there is; for composites, the body-spec provides a view of the inner works of the process.
We analyze <atomic>, <simple>, and
<composite> as
simple prefix operators, followed by a process-spec.
<<Define process-classes
;; __::= atomic <process-spec>
(def-syntactic atomic
:reserved true
:prefix (:precedence 15 :numargs 1))
;; __::= simple <process-spec>
(def-syntactic simple :reserved true
:prefix (:precedence 15 :numargs 1))
;; __::= composite <process-spec>... { ...
(def-syntactic composite :reserved true
:prefix (:precedence 15 :numargs 2)
:tidier
(( (:^+ composite ?proc (:^+ left-brace ?body))
!~>>- (:^+ composite ,proc ,body))
( (:^+ composite ?proc ?b)
(defect "Ill-formed body for composite process " proc
:% " (must be surrounded by braces): " b)))) >>
The syntax of curly braces will be dealt with below; suffice it to say
that the way brackets work in a typical Lexiparse grammar is that a
balanced expression is analyzed as a parsetree whose operator is the
open bracket and whose sole child is the expression inside the
brackets.
Before presenting the syntax of processes, I digress to talk
about namespaces, on the grounds that processes and ontologies almost always
occur inside namespace declarations. In the RDF/XML representation,
namespaces are
treated using the standard XML namespace notations, that is, as
attributes whose names start with xmlns. In the
presentation syntax, we must supply a special notation:
<<Define namespace-spec
(def-syntactic with_namespaces :reserved true
:prefix (:precedence 15 :numargs 2)
:tidier
(( (:^+ with_namespaces ?spec ?e)
!~>>- (:^+ with_namespaces
,e ,@(apply-named-tidier namespaces-flatten spec)))))
<<Insert: namespaces-flatten>>>>
A named tidier is a packaged tidier that can be invoked from
multiple contexts. We'll get to its definition shortly. Its purpose
is to "flatten" the namespace list that occurs right after
<with_namespaces>, i.e., to move it up a level and put it
after the body of the <with_namespaces>
expression in the tidied parsetree. This format is slightly easier
for the internalizers of section "????" to handle. Before presenting it, we discuss
the syntax of colons, which are obviously relevant to namespaces.
When <colon> occurs as an infix operator, it's part of a
qualified name, like owl:Class.
Infix occurrences are replaced with the more mnemonic
<name-wrt-space>.
It also occurs as a prefix operator.
These occurrences are not renamed, but deleted,
as explained in section "????".
Either way, there should be no <colon>s in a parsetree
after it is tidied, which explains the checker in the following
syntax definition:
<<Define colon
;; Note that colon is not a reserved word, but comes from
;; occurrences of ":"
(def-syntactic colon :lex ":" :infix (:numargs 1 :precedence 210)
;; -- namespace construct, with high precedence
:prefix (:numargs 1 :precedence 120)
;; -- Occurs only after inputs, outputs,
;; preconditions, effects
;; Low precedence, but higher than comma
:tidier
(( (:^+ colon ?namespace [_*_] ?name)
!~>>- (:^+ name-wrt-space ,namespace ,name)))
:checkers
(( (:^+ colon ?@_)
(defect "Stray colon"))))
;;; Because infix <colon> is replaced by the "artificial" token
;;; <name-wrt-space>, we have to use 'def-checkers' to define its
;;; checkers --
(def-checkers name-wrt-space
( ?(:~ ?(:^+ name-wrt-space ?(:+ ?namespace is-Symbol) ?_))
(defect "Namespace must be a non-nil symbol: " namespace))
( ?(:~ ?(:^+ name-wrt-space ?_ ?(:+ ?name is-Symbol)))
(defect "Name must be a symbol: " name))) >>
The checker for (:^+ name-wrt-space space name)
verifies that space and
name are symbols. The usual expectation is that space is
bound by an outer with_namespaces expression, but bindings are
not checked until expressions are internalized. And, of course, in
the with_namespaces expression itself, the meaning of the
name and space are completely different.
We can now turn to the named tidier used to tidy
with_namespaces expressions:
<<Define namespaces-flatten
(def-named-tidier namespaces-flatten
( (:^+ uri ?s)
!~>>- ((:^+ namespace+name ,false (:^+ uri ,s))))
( (:^+ ?(:|| equals name-wrt-space)
?(:+ ?sym is-Symbol)
(:^+ uri ?s))
!~>>- ((:^+ namespace+name ,sym (:^+ uri ,s))))
( (:^+ group ?@specs)
!~>>- (all-tidy specs
(use-named-tidier namespaces-flatten)))
(:else
(defect "Illegal namespace spec " spec))) >>
It has the same syntax as a regular tidier,
although those not intimately familiar with Lisp conventions may think
there is a missing open parenthesis. It differs from a regular tidier
in that it can be used in several places, so you may want to go back
to the definition of
with_namespaces to remind yourself that in that
context it is applied to the list of namespace prefixes that occurs
as its first argument. The key purpose of
namespaces-flatten is revealed by its third clause, with
matches against a parsetree of the form (:^+ group ...),
which, you'll recall, is what a parenthesized list gets tidied into.
If this is the form taken by the expression following
with_namespaces, then namespaces_flatten is
applied to the subtrees of the group recursively. The construct
(all-tidy l tidier) means the
result obtained by tidying each element of l using
tidier, then appending all the results together.
(If a
tidier produces a single parsetree, it is treated as a singleton list
containing that parsetree.)
The basis cases
come are the first two clauses of namespace-flatten:
either a single URI, representing the default
namespace, or expressions of the form
prefix:URI. (We also allow
prefix=URI, which explains the
?(:|| ...) expression.)
In XML, URIs are represented as strings, and are recognized as
URIs by their context. We take a slightly different tack, and supply
a syntactic operator uri"..." that indicates a string is to
be taken as a URI.
<<Define uri-spec
(def-syntactic uri :reserved true
:prefix (:precedence 210 :numargs 1)
:checkers
(( (:~ ?(:^+ uri ?(:+ ?s is-String)))
(defect "uri followed by non-string " s)))) >>
We now return to the syntax of processes, and in particular
process itself:
<<Define process-spec
;; __::= process <name>(-IOPRs-)
(def-syntactic process :reserved true
:prefix (:precedence 25 :numargs 1)
:tidier
( ( (:^+ process (:^+ fun-app ?name [_*_] ?@subs))
~(>>- :^+ process ,name ,@subs))
<<Insert: anon-process>>
(:else (defect "'process' must be followed by <name>"
" and IOPRs in parens")))
:checkers
(( (:^+ process ?(:+ ?_ (not is-Symbol)) ?@_)
(defect "Stray occurrence of 'process'"))
( (:^+ process ?_ ?@ioprs)
(forall-check ioprs
((:^+ ?(:~ ?(:|| inputs outputs locals participants
precondition result))
?_)
(defect "Process has illegal IOPR spec: " this-ptree))))
( (:^+ process ?_ ?@ioprs)
(occ-range ioprs 0 1 ?(:^+ inputs ?@_))
(occ-range ioprs 0 1 ?(:^+ outputs ?@_))
(occ-range ioprs 0 1 ?(:^+ locals ?@_))
(occ-range ioprs 0 1 ?(:^+ precondition ?@_))))
) >>
The only remark to make about the tidier for process,
besides the fact that it instantiates the common
flatten-or-produce-defect pattern, is
that it refers to the token <fun-app>, which is what an
infix <left-paren> turns into.
The notation (forall-check l
–checkers–) indicates that every element of
the list
l is to be checked using the checkers.
The resulting defects
are accumulated. The use of this-ptree here is not
specific to forall-check; it can be used inside any
checker to refer to the parsetree being checked.
The last novel element used for the checkers of <define>
is the construct (occ-range list lo hi
pattern).
This counts the number of elements of list that match
pattern. If it falls outside the range [lo,hi],
a defect is issued, of the form [Too few (or subexpressions match pattern].
too
many)
There are two kinds of IOPRs, the IOs (parameters) and the PRs (preconditions and results). One key feature of the former is that they are followed by a parameter-declaration list with a distinctive syntax.
<<Define param-decls
;; <IOPR> ::= inputs : (...)
(def-syntactic inputs :reserved true
:prefix (:numargs 1 :precedence 120
:local-grammar ((1 typed-var-grammar)))
:tidier
(use-named-tidier typed-vars-tidier inputs)
:checkers
(use-named-checkers typed-vars-checker inputs)
;; <IOPR> ::= outputs : (...)
(def-syntactic outputs :reserved true
:prefix (:numargs 1 :precedence 120
:local-grammar ((1 typed-var-grammar)))
:tidier
(use-named-tidier typed-vars-tidier outputs)
:checkers
(use-named-checkers typed-vars-checker outputs))
;; <IOPR> ::= locals : (...)
(def-syntactic locals :reserved true
:prefix (:numargs 1 :precedence 120
:local-grammar ((1 typed-var-grammar)))
:tidier
(use-named-tidier typed-vars-tidier locals)
:checkers
(use-named-checkers typed-vars-checker locals))
;; <IOPR> ::= participants : (...)
(def-syntactic participants :reserved true
:prefix (:numargs 1 :precedence 120
:local-grammar ((1 typed-var-grammar)))
:tidier
(use-named-tidier typed-vars-tidier participants)
:checkers
(use-named-checkers typed-vars-checker participants)) >>
The tidiers for each of these constructs are all essentially the same,
so we make use of a single named tidier for all of them,
whose name is typed-vars-tidier:
<<Define typed-vars-tidier
(def-named-tidier typed-vars-tidier (sort) :grammar owl-s
( (:^+ ?,sort (:^+ colon (:^+ group ?@decls)))
!~>>- (:^+ ,sort ,@decls))
( (:^+ ?,sort (:^+ colon ?decl))
!~>>- (:^+ ,sort ,decl))
(:else (defect "'" sort "' doesn't occur in the form "
sort ": (...)"))) >>
This named tidier differs from the previous one because it takes an
argument sort, in addition to the implicit argument, the
parsetree being tidied. The different between
apply-named-tidier and use-named-tidier is
that in the former the parsetree being tidied is explicit, while for
use-named-tidier it is always the current one. Both take
optional arguments at the end, which must match up with the arguments
that come after the name of the tidier in def-named-tidier.
[[ ]]
<<Define typed-vars-checker
(def-named-checkers typed-vars-checker (sort)
( (:up 1 (:process ?name ?_))
(defect sort " doesn't occur as process argument"))) >>
The checkers for the various constructs are all the same, too, but they're so short we don't bother defining a named checker set for them. Each is of the form
(:up 1 ((:^+ iopr ?@_)))
that the checker ((:^+ iopr ?@_)) is to run on the
parsetree one level up from this on. This is a checker consisting of
a list (pattern), i.e., a pattern followed by
nothing. If the parsetree one level up doesn't match it, a defect of
the form "Doesn't match (:^+ iopr ?@_)" is produced and
attached to this one.
The principal new feature for the IOPR declarations is their use of
the local-grammar facility, which allows us to override
the usual precedences and other properties of operators when they
occur inside (one of) the right-hand argument(s) of an operator.
(The qualification "right-hand" is due to Lexiparse's left-to-right
bias; by the time it sees an infix operator, its left-hand argument
has already been parsed. But the possibilities are wide open for the
arguments to the right of an operator, whether infix or prefix.)
The local grammar typed-var-grammar is for typed variable
declarations such as (x,y - String n - Integer), in which the
hyphens must have precedence lower than that of the commas.
<<Define typed-var-grammar
(def-grammar typed-var-grammar
:lex-parent owl-s
:syn-parent owl-s
:replace ((minus hyphen))
:sym-case #-allegro :up #+allegro (allegro-case-choice)
:definitions
(
(def-syntactic \| :lex "" :infix (:precedence 16 :grouping :true))
(def-syntactic left-paren :lex "("
:prefix (:left-precedence 200 :right-precedence 0
:bracket (:open right-paren)))
;; Hyphens can be generated only by replacement of <minus>
;; Precedence puts hyphen above comma in parsetree --
(def-syntactic hyphen :lex "-" :infix (:precedence 17 :numargs :1)
:tidier
;; Eliminate commas --
(( (:^+ hyphen (:^+ comma ?@vars) ?type)
!~>>- (:^+ hyphen ,@vars ,type)))
:checkers
(hyphen-checks))
(def-named-checkers hyphen-checks
( ?(:~ ?(:^+ hyphen ?@_ ?(:+ ?_ is-name)))
(defect "Type is not name of Owl class: " type))
( ?(:^+ hyphen ?@vars ?_)
(forall-check vars
( ?(:+ ?v (not is-Symbol))
(defect "Variable is not symbol: " v)))))) >>
The token with name "|" is the
invisible "contiguity"
operator that Lexiparse inserts whenever it requires an operator and
none appears in the lexeme stream. In the declaration
(x,y - String n - Integer), such an operator will be inserted
between String and n, and will become the
top operator for the declaration parsetree. As mentioned in the beginning of the Syntax section, inside every
pair of brackets must come a single expression. This need to
assimilate everything under one operator drives the system to insert
<\) ._ whenever it comes to the end of an expression and
there is more stuff to the right that isn't the bracket it is looking
for. however, the precedence and other properties of this operator
are entirely up to the grammar creator. in this case, by making the
precedence of ~~(toke \> low we ensure that it becomes the main
operator for type-variable lists with multiple type declarations. By
declaring it to be
":grouping :true," we ensure that any parsetree of
the form
(:^+ | (:^+ | a b) c)will be converted to
(:^+ | a b c)
The clause :replace ((minus hyphen)) means to replace the token
<minus> generated by the lexer with <hyphen> when
typed-var-grammar is in control. That means that even
when control returns to the normal owl-s grammar, the
token derived from "-" will not start looking like a
minus again.
The other two things that can appear in an IOPR list are preconditions and results:
<<Define preconds
;; <IOPR> ::= precondition : ...
(def-syntactic precondition :reserved true
:prefix (:numargs 1 :precedence 120)
:tidier
(( (:^+ precondition (:^+ colon ?exp))
!~>>- (:^+ precondition ,exp))
(:else (defect "'precondition' not followed by ': <expression>'")))
:checkers
((:up 1 (?(:^+ iopr ?@_))))) >>
The formula in the precondition field of a
process definition has no special syntax, but is just part of the
general-purpose logical-expression system that generates all those XML
literals at the leaves of OWL-S. This system is discussed in
section "????".
The result field(s) of a process definitions
is more complex than the inputs, outputs,
preconditions, and other
fields.
<<Define results
;; <IOPR> ::= result : ...
(def-syntactic result :reserved true
:prefix (:numargs 1 :precedence 120)
:tidier
(( (:^+ result (:^+ colon ?exp))
!~>>- (:^+ result ,exp))
(:else (defect "'result' not followed by ': <expression>'")))
:checkers
((:up 1 (?(:^+ iopr ?@_)))
<<Insert: result-tree-checker>>)) >>
Results can include several constructs that are
not meaningful, or mean something different, in other contexts. Even
though a result may look like a predicate-calculus formula, it is not
really an entity with a truth value, but a recipe for changing
the truth values of various propositions. For this reason, I prefer
using the verb "impose" to describe a result. A result R
does not become
"true"; it is imposed following an action or event.
Here is a rough definition of what
that means:
not p
is to make p false.
For many implementations, this translates into deleting p from a
representation of the current situation, so that the use of a closed-world
assumption [Rei78]
will allow an implementation to conclude that p is
false.
|-> q
is to impose q if p was true
before the action or event.forall (x) p[x]
is to impose p[a] for
all objects a. In practice, most implementations only handle the
special case forall (x) p[x] |-> q[x],
which means
"For every set of values v for vars such that p[v] is
true before the action or event, impose q[a]." OWL-S is
typical in singling this case out. output(p1 <= v1, ..., pk <=
vk), where each pi is an output of the current
process, make each vi the value of
pi. (This notation corresponds
to the withOutput construct of OWL-S.)
Various syntactic constraints are implied by this definition of
"impose." There are others as well. For example, we don't allow
disjunctive effects in OWL-S.
[[ To check that a result is syntactically legal, we must write a
straightforward Lisp
procedure (called result-defects)
to walk through a parsetree headed by <result>.]]
First, we define the special operators (<when> and <output>), which occur only in results. [[ move lexicals out ]]
<<Define syn-when
(def-syntactic when :lex "|->"
:infix (:left-precedence 110 :right-precedence 111)) >>
<<Define bind-param-syn
(def-syntactic bind-param :lex "<="
:infix (:precedence 110)
:checkers
(( (:^+ bind-param ?(:~ ?(:+ ?p is-Symbol)) ?_)
(defect "Non-symbol to left of '<=': " p))
<<Insert: bind-param-context>>)) >>
<<Define output-syntax
(def-syntactic output :reserved true
:prefix (:precedence 200 :numargs 1)
:tidier
(( (:^+ output (:^+ group ?@bindings))
!~>>- (:^+ output ,@bindings)))
:checkers
(( (:^+ output ?@bindings)
(forall-check bindings
( (:~ ?(:^+ bind-param ?@_))
(defect "Outputs must all be of form 'param <= exp', not "
this-ptree)))))) >>
The checker for results calls the recursive checker
result-check:
<<Define result-tree-checker
:. (def-syntactic result ...
...
:checkers
(... .:
( ?(:^+ result ?exp)
(apply-named-checker result-check exp)) >>
<<Define result-syntax-checker
(def-named-checkers result-check
(select
( (:^+ forall ?vars ?r)
(apply-named-checkers decls-check vars)
(apply-named-checkers forall-body-must-be-whens r))
( (:^+ exists ?vars ?r)
(defect "Existential quantifiers are illegal in results"))
( (:^+ when ?_ ?effect)
(apply-named-checkers result-check effect))
( (:^+ and ?@conjuncts)
(forall-check conjuncts
result-check))
( (:^+ group ?_) ;; stuff originaly separated by commas
(defect "Can't have <comma> as a connective"))
( (:^+ not ?elt)
(apply-named-checkers must-be-fun-app elt))
( (:^+ output ?@_)
(:else
must-be-fun-app))))
<<Insert: result-piece-checkers>>>>
Note that result-check is defined using
def-named-checkers; the plural indicates that in general
the name tags a set of checkers. In this case the set is a
singleton, a checker of the form (select
–clauses–), which operates by running the
checkers in the first clause whose pattern matches the current
parsetree. Note that to apply a named checker set to the current
parsetree, all you have to do is give its name. To apply one to
something else you use apply-named-checkers. The checker
sets in question are now defined. To understand the first one, it's
necessary to recall that declarations are parsed with a local grammar
that assigns a low precedence to the contiguity operator
<|>. If variables of more than one type are being declared, then
this will become the main operator of the declaration. All other
subexpressions must be of the form "var, var, var" or
"var, var, var - type."
<<Define result-piece-checkers
(def-named-checkers decls-check
(select
( (:^+ \| ?@decls)
(forall-check decls
decl-check))
(:else
decl-check)))
(def-named-checkers decl-check
(select
( (:^+ hyphen ?@vars ?type)
hyphen-checks)
( (:^+ comma ?@vars)
(forall-check vars
( ?(:+ ?v (not is-Symbol))
(defect "Variable is not symbol: " v))))
( ?(:+ ?v (not is-Symbol))
(defect "Variable is not symbol: " v))))
(def-named-checkers forall-body-must-be-whens
(select
( (:^+ when ?_ ?e)
(result-check e))
( (:^+ and ?@conjuncts)
(forall-check conjuncts
forall-body-must-be-whens))
(:else
(defect "Body of 'forall' must be a '|->' or a conjunction"
" of '|->' expressions")))) >>
<<Define must-be-atomic
(def-named-checkers must-be-fun-app
( (:~ (:^+ fun-app ?_ ?@_))
(defect "Context requires atomic formula"))) >>
One thing these checkers do not check is whether each occurrence of a
variable in a result is bound by a dominating forall.
That check is deferred until the internalization phase, which is managed
by a subgrammar called owl-s-as-rdf, discussed in
section "????".
Unlike atomic and simple processes, composite processes have bodies, the stuff in left braces after the IOPR specs.
<<Define syn-braces
(def-syntactic left-brace :lex "{"
:prefix (:precedence 0
:bracket (:open right-brace))
:tidier
(( (:^+ left-brace ?sub)
!~>>- sub)))
(def-syntactic right-brace :lex "}"
:suffix (:left-precedence 0
:bracket (:close left-brace))) >>
(See the
syntactic definition for
composite.)
Some are defined using reserved words, others with special character
sequences.
<<Define control-constructs
(def-syntactic anyord :lex "||;"
:infix (:precedence 60 :grouping :true)
:tidier elim-left-braces
:checkers (control-construct-checks))
(def-syntactic split-join :lex "||>"
:infix (:precedence 60 :grouping true)
:tidier elim-left-braces
:checkers (control-construct-checks))
(def-syntactic split :lex "||<"
:infix (:precedence 60 :grouping true)
:tidier elim-left-braces
:checkers (control-construct-checks))
(def-syntactic choice :lex ";?"
:infix (:precedence 60 :grouping true)
:tidier elim-left-braces
:checkers (control-construct-checks))
(def-syntactic semicolon :lex ";"
:infix (:precedence 65 :grouping true)
:tidier elim-left-braces
:checkers (( (:^+ semicolon ?@elements)
(forall-check elements
check-down-to-control-constructs))))
;;; All control constructs must pass these two checks ...
(def-named-checkers control-construct-checks
(:up 1 control-context-check)
( (:^+ ?_ ?@elements)
(forall-check elements
check-down-to-control-constructs)))
<<Insert: control-constants>>
;;; A control construct can appear only in certain places
(def-named-checkers control-context-check
( ?(:|| ?(:^+ if ?_ ?,start-ptree ?_)
?(:^+ if ?_ ?_ ?,start-ptree)
?(:^+ define composite ?_ ?_ ?,start-ptree)
?(:^+ ?(:member +owl-s-control-operators+) ?@_))
(defect "Control construct in illegal context")))
;;; And it can contain only certain things
(def-named-checkers check-down-to-control-constructs
( ?(:~ ?(:^+ ?(:|| tag
with-namespaces
?(:member +owl-s-control-operators+)
?(:member +owl-s-taggables+))
?@_))
(defect "Illegal as element of composed process: " this-ptree))
( (:^+ with-namespaces ?e ?@_)
(apply-named-checkers check-down-to-control-constructs e))) >>
In a checker, start-ptree refers to the parsetree where
the checking began, which, in an up, is not necessarily
the same as this-ptree. So the expression
(:^+ if ?_ ?,start-ptree ?_) matches an "if" whose
"iftrue" part is the tree we started at, one level below the current
one.
The qvaroid ?(:member l) matches any datum
in the list l. The list is typically a named constant. Here
the two constant lists are
<<Define control-constants
(defconstant +owl-s-control-operators+
'(anyord choice split-join split semicolon))
(defconstant +owl-s-taggables+
'(perform produce as-process)) >>
The first list, +owl-s-control-operators+, is the set of
composite-process classes that have components comprising
bags of steps. The second, +owl-s-taggables+, is the set
of "leaf" classes. The name derives from the fact that in the
presentation syntax these are the constructs whose instances can
be named using
tags.
<<Define composite-process-leaves
(def-syntactic perform :reserved true
:prefix (:numargs 1 :precedence 90)
:tidier
(( (:^+ perform (:^+ fun-app ?name [_*_] ?@pbs))
!~>>- (:^+ perform ,name ,@pbs))
(:else (defect "Unintelligible")))
:checkers
(( (:^+ perform ?(:+ ?name (not is-Symbol)) ?@_)
(defect "Perform of process with illegal name " name))
( (:^+ perform ?name ?@bdgs)
(forall-check bdgs
(:? ?(:~ ?(:^+ bind-param ?_ ?_))
(defect "Illegal data input " b " to perform of "
name))))))
;; 'produce (p1 <= v1 , ...)' indicates bindings to outputs of this
;; process
(def-syntactic produce :reserved true
:prefix (:numargs 1 :precedence 90)
:tidier
(( (:^+ produce (:^+ group ?@bindings))
!~>>- (:^+ produce ,@bindings))
( (:^+ produce ?binding)
!~>>- (:^+ produce ,binding))
(:else (defect "'produce' must be followed by bindings in parens")))
:checkers
(( (:^+ produce ?@bindings)
(forall-check bindings
(:? ?(:~ ?(:^+ bind-param ?@_))
(defect "'produce' args must all be of form"
" 'param <= exp', not " b)))))) >>
To refer to the outputs of a
performed process, you must tag it and then refer to the
outputs as "slots" of the tag.
It happens that the syntax of the parenthesized items after
perform and produce is
identical to that of the items after output, although they all
serve different purposes. In each case, the items must be
a sequence of parameter bindings, p <= v.
(The characters "<=" are converted to the token
<bind-param> .) For output and
produce, they
describe how
the results of the current process are set; the former is for atomic
and simple processes, and occurs in a result construct, while the
latter occurs in the body of a composite process.)
<<Define bind-param-context
:. (def-syntactic bind-param :lex "<="
...
:checkers
(... .:
(:up 1
(:? ?(:^+ ?(:|| output perform produce) ?@_))) :.)).: >>
Any leaf component of a control construct can be tagged. We indicate a tag using a double colon: The syntax is simple:
<<Define tag-syn
(def-syntactic tag :lex "::" :infix (:precedence 70)
:checkers
(( (:^+ tag ?(:+ ?name (not is-Symbol)) ?_)
(defect "Illegal tag: " name))
( (:^+ tag ?_ ?(:& ?x (:^+ (:& ?op
?(:~ ?(:member +owl-s-taggables+)))
?@_)))
(defect "Illegal as tagged element: " x
"[" op "]"))
(:up 1 control-context-check))) >>
The final control construct is as process(...){...},
which allows us to reuse the process syntax for
pieces of a process, so that we can declare local effects,
preconditions, and outputs:
<<Define as-syntax
(def-syntactic as :reserved true
:prefix (:precedence 75 :numargs 2)
:tidier
(( (:^+ as ?(:^+ process :local ?@ioprs)
?body)
!~>>- (:^+ as (:^+ process :local ,@ioprs)
,body))
(:else (defect "Ill-formed 'as process' construct")))) >>
The process spec following the token has the same syntax as a
composite-process definition, except that the process has no name. So
we have to broaden the syntax of process to include the
anonymous case:
<<Define anon-process
:.(def-syntactic process :reserved true
:prefix (:precedence 25 :numargs 1)
:tidier .:
(:? ?(:^+ process (:^+ group ?@ioprs))
!~(:^+ process :local ,@ioprs)) >>
After this transformation, all processes have names; the anonymous
ones have the name :local.
In case you've forgotten, we allow pieces of ontologies to be included into Owl-S, written in the N3 notation. Hence we must provide a local grammar for it.
<<Define n3-grammar-def
(def-grammar n3-grammar :parent owl-s
:definitions
(
<<Insert: n3-lexical>>
<<Insert: ontology-def>>
<<Insert: dot-syntax>>
))
>>
The declaration of grammar owl-s as the parent for
n3-grammar is a bit risky. We do it so that we can
inherit the lexical and syntactic definitions of <colon>. But we have to make sure to avoid inheriting many of
the infix operators of the algebraic language used by Owl-S, which
are foreign to N3 (and RDF). (See appendix "????".)
We do so by declaring them illegal
at the lexical level.
<<Define illegal-n3-ops (def-lexical #\~ (illegal)) (def-lexical #\< (illegal)) (def-lexical #\> (illegal)) (def-lexical #\& (illegal)) (def-lexical #\| (illegal)) (def-lexical #\= (illegal)) >>
At the top of the precedence heap, so to speak, we find
ontologies introduced by the operator ontology:
<<Define ontology-def
(def-syntactic ontology :reserved true
:prefix (:precedence 5 :local-grammar n3-grammar)
:checkers
(( (:^+ ontology ?@subs)
(forall-check subs
( ?(:& ?(:~ ?(:|| ?(:^+ imports ?_)
?(:^+ ontology_version ?_)
?(:^+ triple ?@_)))
?x)
(defect "Illegal element of ontology: " x))))))
(def-syntactic imports :reserved true
:prefix (:precedence 6)
:tidier
(( (:^+ imports (:^+ comma ?@subs))
!~>>- (:^+ imports ,@subs)))
:checkers
(( (:^+ imports ?@subs)
(forall-check subs
(select
( ?(:+ ?_ is-name)
:okay)
( ?(:^+ uri ?_)
:okay)
( ?sub
(defect "Illegal import (not a name or URI): " sub)))))))
(def-syntactic ontology_version :reserved true
:prefix (:precedence 6)
:checkers
(( ?(:^+ ontology-version ?(:+ ?_ (not is-String)))
(defect "Ontology version must be string")))) >>
The versioning information is there for purely documentary purposes.
The imports statement is more substantial, but at the
syntactic phase the grammar can check only its most superficial properties.
The important elements of the ontology are, of course, the triples and type declarations, separated by dots:
<<Define dot-syntax
(def-syntactic left-brace :lex "{"
:prefix (:left-precedence 10 :right-precedence 0))
(def-syntactic imports :reserved true
:prefix (:precedence 15))
(def-syntactic dot :lex "."
:infix (:precedence 5
:numargs (0 1)
:grouping true))
<<Insert: dot-checkers>>
(def-syntactic semicolon :lex ";"
:infix (:precedence 10 :grouping true)
<<Insert: semicolon-checkers>>
)
(def-syntactic hyphen :lex "-"
:infix (:precedence 18 :grouping false))
(def-syntactic \| :lex ""
:infix (:precedence 20 :grouping true)
<<Insert: contig-tidier>>)
<<Insert: contig-checkers>>
(def-syntactic comma :lex ","
:infix (:precedence 30 :grouping true))
<<Insert: n3-named-checkers>>
>>
These precedences put <dot> highest in an N3 parse,
not surprisingly. It may seem natural to think of "." as a
three-argument postfix operator, but Lexiparse doesn't allow postfix
operators. Besides, it's really not the dot that makes the terms
stick together; as shown by the way semicolons work, it's simple
contiguity that means "property attribution" in N3.
So we recruit
the contiguity operator <|>, discussed in
section 2.
Its precedence is set below that of <comma>, so
that "x p
a,b,c" gets parsed as
(:^+ | x p (:^+ comma a b c))Then
<semicolon> is given lower
precedence than <|>, so that "x p1 a1; p2
a2; p3 a3" gets parsed as
(:^+ semicolon (:^+ | x p1 a1)
(:^+ | p2 a2)
(:^+ | p3 a3))
(Or it would if the contiguity operator didn't get tidied away.)
Here syntax doesn't follow semantics exactly. But it's easier to let
the semantics be straightened out during internalization.
It's best to replace occurrences the contiguity operator with more revealingly named tokens:
<<Define contig-tidier
:tidier
(( (:^+ \| ?subj ?prop ?obj)
!~>>- (:^+ triple ?subj ?prop ?obj))
( (:^+ \| ?prop ?obj)
!~>>- (:^+ prop-obj-attrib ?prop ?obj))
(:else
(defect "Only two or three contiguous items allowed in N3"))) >>
<<Define contig-checkers
(def-checkers triple
( ?(:^+ triple ?(:+ ?subj (not is-name)) ?_ ?_)
(defect "Illegal subject in triple: " subj))
( ?(:^+ triple ?_ ?(:+ ?property (not is-name)) ?_)
(defect "Illegal property in triple: " property)))
(def-checkers prop-obj-attrib
( ?(:^+ prop-obj-attrib ?(:+ ?property (not is-name)) ?_)
(defect "Illegal property in \"double\": " property))) >>
The checkers for triples and "doubles" verify that subjects and
properties are names. They can't check anything about objects,
because datatype properties allow fairly unpredictable things in the
object slot.
Another thing we check at this stage is that dots and semicolons really are triples.
<<Define dot-checkers
:. (def-syntactic dot .:
:checkers
( ( (:up 1 ?(:~ ?(:^+ left-brace ?_)))
(defect "Dot in illegal context"))
( ?(:^+ dot ?@subs)
(forall-check subs
(select
( ?(:|| ?(:^+ \| ?@_)
?(:^+ hyphen ?_ ?_))
(use-named-checkers legal-triple))
( ?(:^+ semicolon ?triple
?@semisubs)
(apply-named-checkers legal-triple triple)
(forall-check semisubs
(use-named-checkers legal-double))
(:else
(defects "Illegal at top level of n3: "
this-ptree))))))) >>
<<Define semicolon-checkers
:. (def-syntactic semicolon .:
:checkers
( ( (:up 1 ?(:~ ?(:^+ dot ?@_)))
(defect "Semicolon in illegal context")))) >>
The actual checking is done by these named checkers:
<<Define n3-named-checkers
(def-named-checkers legal-triple
( ?(:~ ?(:^+ ?_ ?_ ?_))
(defect "Triple without 3 elements!"))
( ?(:^+ ?(:+ ?subj (not is-name)) ?_ ?_)
(defect "Triple with subject that isn't a name: " subj))
( ?(:^+ ?_ ?(:+ ?property (not is-name)) ?_)
(defect "Triple with property that isn't a name: " property)))
;; We don't attempt to check for a legal object, because
;; object properties are too varied
;; A "double" is the stuff after a semicolon
(def-named-checkers legal-double
( ?(:~ ?(:^+ ?_ ?_))
(defect "Illegal property-object pair after semicolon"))
( ?(:^+ ?(:+ ?property (not is-name)) ?_)
(defect "Illegal property in propert-object pair: " property))) >>
The only parts of OWL-S that are left to deal with are the logical formulas in conditions, effects, and various output constructs. Here we incorporate a simple logical language capturing the normal encodings of Lisp as infix syntax.
Most of this grammar is inherited from the simple-arith grammar
that is part of the
Lexiparse package
All this grammar does is provide the usual precedence hierarchy for
multiplication (*), addition (+), and their ilk. All
the precedences for the operators lie in the range 100 to 200.
All we introduce here are the usual quantifiers:
<<Define quantifiers
(def-syntactic forall :reserved true
:prefix (:numargs 2 :precedence 90
:local-grammar ((1 typed-var-grammar))))
(def-syntactic exists :reserved true
:prefix (:numargs 2 :precedence 90
:local-grammar ((1 typed-var-grammar)))) >>
Finally, in table Cross Precedence, we show which operators a given operator "dominates." The row for operator i shows how it behaves when competing with other operators to its right. The first column shows how it behaves when matched against itself. The second column shows all the operators whose left precedence is greater than operator i. If operator j appears in the list, it will tend to appear in trees headed by operator i. (A bullet to the left of an operator's name in the row labels indicates that it is an infix operator, to minimize confusion about operators with infix and prefix versions.) Of course, this table gives only a crude indication of how each operator interacts with the entities below it in the parsetree; the details are spelled out in the syntactic spec for the operators in question.
| Operators on right | ||
|---|---|---|
| Operator on left |
Associa tivity |
Dominated tokens |
Comments in Owl-S are similar to Java comments, with a couple of
differences. Multiline comments are bounded by /*and
*/. Single-line comments start with //.
In Java, as in C/C++, multiline comments cannot be nested. Owl-S pays
attention to nesting. In Java, a doubled asterisk in the opening of a
multiline comments means that it should be used by Javadoc. In Owl-S,
it means that the comment should be transformed into an
rdfs:comment declaration. [[ This is harder than it
sounds ]]
Lexiparse handles comments mainly at the lexical level. Comments
"float above" other syntactic material, and get swept up into whatever
parsetrees they are closest to. They are invisible to tidiers and
checkers. (They should be visible to parsetree pretty-printers and
other debugging tools.)
However, internalizers are free to do what they want with them,
including turning them into rdfs:comments.
Here are the definitions of the comment characters
<<Define comments
(with-grammar owl-s
(def-lexical #\/
(dispatch
(#\* (dispatch
(#\* (comment
:style (:multiline "*/" 0 :important ("/*" "*/"))))
(:else
(comment
:style (:multiline "*/" 0 ("/*" "*/"))))))
(#\/ (dispatch
(#\/ (comment :style (:one-line 2)))
(:else (comment :style (:one-line 1)))))
(:else (token divide))))) >>
Hopefully these declarations are not too mystifying.
This is the grammar that most of the Owl-S grammar is based on.
<<Define arithgram [arithgram, line 2]
(in-package :lexiparse)
;; $Id: owl-s-gram-htm.html,v 1.4 2006/09/01 20:57:39 martin Exp $
(depends-on %module/ lexiparse)
(eval-when (:compile-toplevel :load-toplevel :execute :slurp-toplevel)
(export '()))
(def-grammar standard-arith-syntax)
(with-grammar standard-arith-syntax
(def-lexical-tokens ((#\( left-paren)
(#\, comma)
(#\) right-paren)
(#\~ not)
(#\- minus)
(#\+ plus)
(#\* times)
(#\/ divide)
(#\< less)
(#\& and)
(#\| or)))
(def-lexical #\= (dispatch
(#\< (token leq))
(else (token equals))))
(def-lexical #\> (dispatch
(#\= (token geq))
(else (token greater))))
(def-lexical (#\0 - #\9) #'lex-number
:name numeric)
(def-lexical (#\a - #\z #\A - #\Z) #'one-char-sym
:name alphabetic)
(def-lexical #\" (lex-string false !()))
)
;;; Except for parens' internal precedences, the entire standard-
;;; arith system has precedences >= 100. This leaves plenty of room
;;; for "superstructure."
(with-grammar standard-arith-syntax
(def-syntactic comma :lex "," :infix (:precedence 100 :grouping true))
(def-syntactic right-paren :lex ")"
:suffix (:left-precedence 0 :bracket (:close left-paren)))
;; The left-prec of left-paren (as infix) is 200, which is higher than
;; the right-prec 190 for times and divides. It's rare to see
;; a op b(...) get parsed as (a op b)(...), but it can be arranged by
;; giving 'op' a right-precedence >= 200. --
(def-syntactic left-paren
:lex "("
:prefix (:precedence 0 :bracket (:open right-paren))
:infix (:left-precedence 200 :right-precedence 0
:bracket (:open right-paren)))
(def-syntactic or :lex "|" :infix (:precedence 120 :grouping true))
(def-syntactic and :lex "&" :infix (:precedence 130 :grouping true))
(def-syntactic not :lex "~" :prefix (:precedence 140))
(def-syntactic greater :lex ">" :infix (:precedence 160))
(def-syntactic less :lex "<" :infix (:precedence 160))
(def-syntactic geq :lex ">=" :infix (:precedence 160))
(def-syntactic leq :lex "=<" :infix (:precedence 160))
(def-syntactic equals :lex "=" :infix (:precedence 160))
(def-syntactic plus :lex "+"
:prefix (:precedence 180 :numargs 1)
:infix (:precedence 180 :grouping true))
(def-syntactic minus :lex "-"
:prefix (:precedence 180 :numargs 1)
:infix (:precedence 180))
(def-syntactic times :lex "*" :infix (:precedence 190 :grouping true))
(def-syntactic divide :lex "/" :infix (:precedence 190 :grouping true))
)
>>
For the complete, "(un)tangled," listing for this file, see owl-s-syn.lisp
<<Define File owl-s-syn-new.lisp
;-*- Mode: Common-lisp; Package: lexiparse; Readtable: ytools; -*-
(in-package :lexiparse)
(depends-on %module/ ytools lexiparse)
(depends-on %lexiparse/ arithgram)
#+allegro
(defun allegro-case-choice ()
(cond ((eq excl:*current-case-mode* ':case-sensitive-lower)
':preserve)
(t ':up)))
<<Insert: grammar-def>>
<<Insert: typed-vars-tidier>>
<<Insert: result-syntax-checker>>
<<Insert: must-be-atomic>>
<<Insert: ??control-checks??>>
(with-grammar owl-s
;;; LEXICAL
(def-lexical-tokens ((#\{ left-brace)
(#\} right-brace)
(#\. dot)))
;; ;? = choice, ; = sequence
(def-lexical #\;
(dispatch (#\? (token choice))
(:else (token semicolon))))
(def-lexical #\|
;; ||; = any-order, ||> = split+join, ||< = split.
;; '|->' becomes the token <when> --
(dispatch
(#\| (dispatch
(#\; (token anyord))
(#\> (token split-join))
(#\< (token split))))
(#\- (dispatch (#\> (token when))))
(:else (token or))))
;; '->' is ordinary if-then --
(def-lexical #\- (dispatch
(#\> (token imply))
(:else (token minus))))
;; '=>' becomes the token <when>, although '|->' is now the preferred form
(def-lexical #\= (dispatch
(#\> (token when))
(:else (token equals))))
;; '<=' becomes the token <bind-param>
(def-lexical #\< (dispatch
(#\= (token bind-param))
(:else (token less))))
;; '::' is for step tags --
(def-lexical #\: (dispatch
(#\: (token tag))
(:else (token colon))))
(def-char-set owl-s-lexical-chars* (#\_))
(def-lexical (#\_ #\a - #\z #\A - #\Z)
(lex-sym (owl-s-lexical-chars))
:name alphabetic)
;;; SYNTACTIC
<<Insert: namespace-spec>>
<<Insert: uri-spec>>
<<Insert: process-definition>>
<<Insert: process-classes>>
<<Insert: process-spec>>
<<Insert: param-decls>>
<<Insert: preconds>>
<<Insert: results>>
<<Insert: output-syntax>>
<<Insert: colon>>
<<Insert: quantifiers>>
<<Insert: syn-when>>
<<Insert: syn-braces>>
<<Insert: bind-param-syn>>
<<Insert: control-constructs>>
<<Insert: if-syntax>>
<<Insert: then-else-syntax>>
<<Insert: composite-process-leaves>>
<<Insert: as-syntax>>
<<Insert: tag-syn>>
<<Insert: left-paren>>
(def-syntactic \| :lex "" :infix (:precedence 3 :grouping true))
(def-syntactic dot :lex "." :infix (:precedence 205)
:checkers
((:? ?(:^+ dot ?step ?output)
(nconc (cond ((is-Symbol step) !())
(t (list (defect "Illegal name to left of dot in "
step "." output))))
(cond ((is-Symbol output) !())
(t (list (defect "Illegal name to right of dot in "
step "." output))))))))
)
;;; LOCAL GRAMMAR for type declarations
<<Insert: typed-var-grammar>>
(defun is-name (x)
(or (is-Symbol x)
(matchq ?(:^+ name-wrt-space ?_ ?_)
x))) >>
This is the outline of the N3 syntax discussed in section "2", plus a few key pieces not covered there.
<<Define File n3-syn-new.lisp ;-*- Mode: Common-lisp; Package: lexiparse; Readtable: ytools; -*- (in-package :lexiparse) (depends-on %module/ ytools lexiparse) <<Insert: n3-grammar-def>> >>
<<Define n3-lexical
<<Insert: illegal-n3-ops>>
(def-lexical #\. (token dot))
>>