Pattern Schema Design
Version 2002/07
Ian Harrison, SRI International
1. Introduction
This document describes the current status of the pattern
markup language (PatternML) that we are proposing to use to describe both
uninstantiated and instantiated patterns for use in the EELD program. The
PatternML schema is also designed to be included within other XML schemas,
HypothesisML and ControlML, which are described in separate documents.
When designing PatternML we took into account several other
related XML markup languages that have been developed. We rejected XOL (SRI’s
XML ontology markup language) as being too low level for our purposes: it is
effectively a frame representation that has been XML-ified. GraphML (www.graphdrawing.org/graphml/)
provided inspiration, but we wanted to be able to represent extra elements in
graphs beyond what it offered. Also, GraphML is, as its website states, subject
to change. We therefore decided to create our own schema from scratch.
2. Design of Pattern Schema
The design goal for PatternML was that it had to support a
variety of uses on the EELD program:
1) As an interchange language between different pattern editors.
2) As a schema to support pattern input to different pattern matchers.
3) As a schema for representing the results of pattern matching
(as an included schema in HypothesisML).
Design goal 1) meant that layout position information and
human-understandable labels needed to be incorporated in the schema. Design
goals 2) and 3) meant using the same schema for the input and output of pattern
matching components. We achieved design goals 2) and 3) by incorporating the
PatternML schema in the HypothesisML schema, which is used to
represent the pattern match results. HypothesisML adds a layer of belief in the
pattern match, with the actual matches being represented using PatternML
elements.
The pattern schema was developed using Tibco’s XML Turbo
tool. Several iterations between members of the SRI EELD team and Alphatech
resulted in this current version.
3. Pattern Schema
The patternLibrary element (see Figure 1) is the root element for
describing a library of patterns. patternLibrary can contain 0 or 1 header
elements, where meta-data about the patternLibrary can be recorded (e.g.
author, creation date). The patternLibrary element must contain 1 or
more pattern elements (i.e. the patternLibrary can contain multiple
patterns).
The header element is
designed to hold meta-data (e.g. creation-date, author). This can be specified
using free-text, or using other XML elements. We foresee users eventually placing
Dublin Core and other meta-data standard elements within this
element.

Figure 1: patternLibrary element
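As an illustration of the structure just described, the following is a minimal sketch of a patternLibrary document. The header text, pattern ids and labels are hypothetical and chosen only for this example; the patterns are left empty, which the schema in Appendix A permits because all child elements of pattern are optional.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<patternLibrary>
  <!-- header is optional (0 or 1); free text or other XML elements -->
  <header>Hypothetical library; author and creation date would go here.</header>
  <!-- at least one pattern is required; id is the only required attribute -->
  <pattern id="example_pattern1" label="first example pattern"/>
  <pattern id="example_pattern2" label="second example pattern"/>
</patternLibrary>
```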
The pattern element is the
main element and can be used to represent both an uninstantiated and an instantiated
pattern. That is, a pattern editor could produce a pattern that has as its root
a pattern element. When this pattern is matched, the detailed pattern
match information (e.g. which data element matched a particular node) can also
have a pattern element as its root (the pattern here would be
just those nodes/edges that matched).
The pattern element has 3 attributes: id (required), uri and label. The
current use of id is to give a unique id to a pattern within the
document. For now this is likely to be a unique id from the application that
developed the pattern. In the future it should be a UUID. uri is the URI pointer to
the pattern if it exists elsewhere. uri could be used in
documents that don't necessarily include the pattern but just reference it.
label is just a pretty name for the pattern, which could include version
information.

Figure 2: pattern element
The pattern element can contain
0 or 1 header element. This is designed to allow for pattern meta-data
to be included (e.g. author, creation date).
The pattern element can also contain 0 or 1 ontology element,
where the ontology used for the pattern can be given (either the actual
ontology via a class hierarchy or a reference). The pattern element can
contain 0 or 1 body element, where the pattern nodes and edges are
actually given. A pattern element may also contain 0 or 1 properties
element. A properties element is currently a catch-all placeholder for
extra information about the pattern.
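The overall shape of a pattern element can be sketched as follows. This is only an outline under assumed values: the ids, labels and the example.org URI are hypothetical, and each of the four child elements could be omitted. The children appear in the order the schema in Appendix A requires (header, ontology, body, properties).

```xml
<pattern id="example_pattern1"
         uri="http://example.org/patterns/example_pattern1"
         label="example pattern v1">
  <!-- optional meta-data about the pattern -->
  <header>author and creation date (free text or elements)</header>
  <!-- optional ontology; here only named, with no classes listed -->
  <ontology id="example_ontology" label="example ontology v1"/>
  <!-- optional body; if present it needs at least one node -->
  <body>
    <node id="n1" label="a node"/>
  </body>
  <!-- optional catch-all for extra information -->
  <properties/>
</pattern>
```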
The ontology element has 3
attributes: id (required), uri and label. id doesn't currently serve any purpose, but
eventually it could be a UUID. uri is the URI pointer to the ontology if it
exists elsewhere. label is just a pretty name for the ontology, which will
probably be the ontology name plus version number. The ontology element
can contain 0 or more class elements or just content. The idea is that the ontology
can actually be included in the pattern element so that the document is
complete.
A class element can have 2
attributes: id (required) and label. id is the unique id of a class and will be
application specific for now. Eventually it will be a UUID. label is a pretty
string name of the class. A class element can contain 0 or more subClassOf
elements that define the superclasses of this class. A subClassOf
element has one attribute, classid (required), which is the reference to the id
of the superclass. This will be an id in the document -- application specific
for now but eventually will be a UUID. A subClassOf element must be
empty (i.e. cannot contain any elements or text).
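A small class hierarchy embedded in an ontology element might look like the sketch below. The class names Murder and MurderForHire are borrowed from the examples in Appendix B; the Event class, the labels and the hierarchy itself are assumptions made for illustration, not part of any actual EELD ontology.

```xml
<ontology id="example_ontology" label="example ontology v1">
  <!-- a root class with no superclass -->
  <class id="Event" label="event"/>
  <!-- classid refers to the id of another class in the document -->
  <class id="Murder" label="murder">
    <subClassOf classid="Event"/>
  </class>
  <class id="MurderForHire" label="contract murder">
    <subClassOf classid="Murder"/>
  </class>
</ontology>
```

A class with several superclasses would simply list one subClassOf element per superclass.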
A body element contains the body of the graph. The
body element must contain 1 or more node elements and can contain 0 or
more edge elements and 0 or more pattern elements. That is, the minimum
body for a pattern is 1 node.
A node element can have 2 attributes: id (required)
and label. The id is a unique identifier within a document. It will likely be
used to record an application specific node id for now. Eventually it should be
a UUID. label is just a pretty name for the node, which is what visualizers
will use when displaying the node. A node element can contain 6 different types
of elements in the following sequence: instanceOf, value, position,
dimension, properties, and origin.
An instanceOf element holds class information about
the node. There can be 0 or more instanceOf elements (a node can be an
instance of multiple classes). An instanceOf element has one attribute,
classid (required), which is the reference to the id of the class that the
element is an instance of. This will be an id in the document -- application
specific for now but eventually will be a UUID. An instanceOf element
can contain 0 or 1 origin element. An origin element here would
be used to refer to the class that the node is an instance of, where the class
is defined in another pattern.
An origin element has 2 attributes, patternId and
referenceId (both required). An origin element must be empty (i.e. cannot
contain any elements or text). An origin element is used to reference an object
that is defined elsewhere. The patternId is a unique identifier of the pattern
where the object is defined (unique to a document/application for now, should be
a UUID eventually). The referenceId is a unique identifier of the object in the
pattern where the object is defined (unique to a document/application for now,
should be a UUID eventually).
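The two places where origin can appear on a node are sketched below. The pattern id other_pattern1 and the referenced ids are hypothetical; the sketch assumes that another pattern with that id defines both the class Person and a node murder.x.

```xml
<!-- origin inside instanceOf: the class is defined in another pattern -->
<node id="contractor.x" label="contractor">
  <instanceOf classid="Person">
    <origin patternId="other_pattern1" referenceId="Person"/>
  </instanceOf>
</node>

<!-- origin as the last child of node: the node itself is defined elsewhere -->
<node id="murder.x" label="murder">
  <instanceOf classid="Murder"/>
  <origin patternId="other_pattern1" referenceId="murder.x"/>
</node>
```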
The value element is where value information for the
node can be recorded. There can be 0 or 1 value elements. A value
element is used to hold the match between the object and a data object.
Currently this element can hold anything - text or elements, allowing different
ways of recording the data element that matched.
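The difference between an uninstantiated and an instantiated node can be sketched with the value element. The node id and the class name Location are hypothetical; the value Moscow is taken from the first example in Appendix B. Only a text value is shown here, since how element content inside value should be validated is not settled by this document.

```xml
<!-- uninstantiated: the node only constrains what may match -->
<node id="location.x" label="location">
  <instanceOf classid="Location"/>
</node>

<!-- instantiated: the same node after matching, with the matched data recorded -->
<node id="location.x" label="location">
  <instanceOf classid="Location"/>
  <value>Moscow</value>
</node>
```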
Both position and dimension elements are visualizer specific
information. There can be 0 or 1 of each of these elements. A position element
has 3 optional attributes -- x, y and z -- which give the screen position of the
center of the object. A position element must be empty (i.e. cannot contain any
elements or text). x, y and z are application specific, depending on
the coordinate system used by each application, but are of use to visualizers
that use the same coordinate system. A dimension element has 3 optional
attributes -- width, height and depth -- which give the screen dimensions of the
object. A dimension element must be empty (i.e. cannot contain any elements or
text). These values are also application specific, depending on the coordinate
system used by each application, but are of use to visualizers that use the same
coordinate system.
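For instance, a pattern editor working in two dimensions might record layout information as in the following sketch. The coordinate values are arbitrary and assume a hypothetical editor whose units are screen pixels; z and depth are simply omitted because all of these attributes are optional.

```xml
<node id="location.x" label="location">
  <!-- center of the node on screen -->
  <position x="120.0" y="80.0"/>
  <!-- size of the node on screen -->
  <dimension width="60.0" height="24.0"/>
</node>
```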
A properties element is a catch-all for extra
information about the node. There can be 0 or 1 properties element. The properties
element can contain anything -- text or other elements. One possible content
for the properties element of a node is the cardinality element, which
can be used here to describe the cardinality of the node in a pattern (e.g. the
fact that there are 1 or more nodes, or only 1 node, or 2 to 4 nodes, etc.).
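The "2 to 4 nodes" case mentioned above could be written as in this sketch. The node id and label are hypothetical; the class name Paying is borrowed from Appendix B. The min and max attributes are plain strings in the schema, which is what allows a value such as "unbounded" (used in Appendix B) alongside numbers.

```xml
<node id="payment.x" label="payment">
  <instanceOf classid="Paying"/>
  <properties>
    <!-- between 2 and 4 nodes of this kind may occur in a match -->
    <cardinality min="2" max="4"/>
  </properties>
</node>
```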
An edge element can have 6 attributes: id (required),
label, relname, from (required), to (required) and directed. The edge id is a
unique identifier -- probably application specific for now but should
eventually be a UUID. label is a pretty string. relname is the name (string) of
the link that this edge represents. from is a reference to the id of the node
where the edge comes from. to is a reference to the id of the node where the
edge goes to. If the edge is undirected, it doesn't matter which node is put as
the value of the from attribute, and which is put as the value for the to
attribute. The final attribute is directed -- this is a boolean. Edges can't be
bi-directional, only directed or undirected. An edge element can contain
4 different types of elements in the following sequence (as defined in the
schema in Appendix A): position, dimension, properties, and origin. All are
optional. position and dimension are visualizer specific
information. properties is a catch-all placeholder for extra information
about the edge. origin (0 or 1 allowed) allows an edge defined in another
pattern to be referenced in this pattern.
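A directed and an undirected edge between the same two nodes can be sketched as follows. All ids, labels and relation names here are hypothetical. Note that from and to must refer to node ids, and that the schema requires all node elements in a body to precede the edge elements.

```xml
<body>
  <node id="a" label="person A"/>
  <node id="b" label="person B"/>
  <!-- directed: the edge runs from node a to node b -->
  <edge id="a.pays.b" label="pays" relname="pays"
        from="a" to="b" directed="true"/>
  <!-- undirected: swapping from and to would make no difference -->
  <edge id="a.knows.b" label="knows" relname="knows"
        from="a" to="b" directed="false"/>
</body>
```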
Appendix A: PatternML Schema
<?xml version = "1.0"
encoding = "UTF-8"?>
<xsd:schema xmlns:xsd =
"http://www.w3.org/2001/XMLSchema">
<xsd:element
name = "patternLibrary">
<xsd:annotation>
<xsd:documentation>patternLibrary
is the root element for describing a library of patterns.
</xsd:documentation>
<xsd:documentation>patternLibrary
can contain 0 or 1 header elements where meta-data about the patternLibrary can
be recorded (e.g. author, creation date). The patternLibrary element must
contain 1 or more pattern elements, which make up the patternLibrary.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element
ref = "header" minOccurs = "0"/>
<xsd:element
ref = "pattern" maxOccurs = "unbounded"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "ontology">
<xsd:annotation>
<xsd:documentation>ontology
has 3 attributes: id (required), uri and label. id doesn't currently serve any purpose, but
eventually it could be a uuid. uri is the URI pointer to the ontology if it
exists elsewhere. Label is just a pretty name for the ontology,
which will probably be the ontology name plus version number.
</xsd:documentation>
<xsd:documentation>ontology
can contain 0 or more class elements OR just content. The idea is that
the ontology can actually be included in the pattern so that the document is
complete. Alternatively the ontology element can hold information about the
ontology. This is where a URI for the ontology could be placed.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType
mixed = "true">
<xsd:choice>
<xsd:element
ref = "class" minOccurs = "0" maxOccurs =
"unbounded"/>
</xsd:choice>
<xsd:attribute
name = "id" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "label" type = "xsd:string"/>
<xsd:attribute
name = "uri" type = "xsd:anyURI"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "pattern">
<xsd:annotation>
<xsd:documentation>pattern
is the main element and can be used to represent an uninstantiated pattern and
an instantiated pattern. That is, a pattern editor could produce a pattern that
has as its root a pattern element. When this pattern is matched the detailed
pattern match information (e.g. which data element matched a particular node)
can also have a pattern element as its root (the pattern here would be just
those edges/nodes that matched). </xsd:documentation>
<xsd:documentation>pattern
has 3 attributes: id (required), uri and label. We see the current use of id will be to
give a unique id to a pattern for the document. For now this is likely to be a unique id
from the application that developed the pattern. In the future it
should be a uuid. URI is the URI pointer to the pattern if it exists. URI could
be used in the context of documents which don't necessarily include the
pattern, just reference it. Label is just a pretty name for the pattern, which
could include version information.
</xsd:documentation>
<xsd:documentation>pattern
can contain 0 or 1 header elements. This is designed to allow for pattern
meta-data to be included (e.g. author, creation date). The pattern can also
contain 0 or 1 ontology elements, where the ontology used for the pattern
can be given (either the actual ontology or a reference). The pattern can
contain 0 or 1 body elements where the pattern nodes and edges are actually
given. A pattern may also contain 0 or 1 properties elements. The properties
element is currently a catch-all placeholder for extra information about the
pattern.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element
ref = "header" minOccurs = "0"/>
<xsd:element
ref = "ontology" minOccurs = "0"/>
<xsd:element
ref = "body" minOccurs = "0"/>
<xsd:element
ref = "properties" minOccurs = "0"/>
</xsd:sequence>
<xsd:attribute
name = "id" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "uri" type = "xsd:anyURI"/>
<xsd:attribute
name = "label" type = "xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "header">
<xsd:annotation>
<xsd:documentation>header
is designed to hold meta-data (e.g. creation-date, author). This can be
specified using free-text, or using other xml elements. We can foresee users
placing Dublin Core and other meta-data standard elements eventually within
this element.
</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element
name = "body">
<xsd:annotation>
<xsd:documentation>body
contains the body of the graph. The body element must contain 1 or more node
elements and can contain 0 or more edge elements and 0 or more pattern
elements. That is, the minimum body for a graph is 1 node.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element ref =
"node" maxOccurs = "unbounded"/>
<xsd:element
ref = "edge" minOccurs = "0" maxOccurs =
"unbounded"/>
<xsd:element
ref = "pattern" minOccurs = "0" maxOccurs =
"unbounded"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "node">
<xsd:annotation>
<xsd:documentation>A
node element can have 2 attributes: id (required) and label. The id is a unique
identifier within a document and is required. It will likely be used to record
an application specific node id for now. Eventually it should be a uuid. Label
is just a pretty name for the node, which is what visualizers will use when
displaying the node.
</xsd:documentation>
<xsd:documentation>A
node element can contain 6 different types of elements in the following
sequence: instanceOf, value, position, dimension, properties, and origin.
instanceOf is class information about the node (a node can be an instanceOf
multiple classes). The value element is where the pattern match information is
recorded. Position and dimension are visualizer specific information. Properties
are a catch-all placeholder for extra information about the node.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element
ref = "instanceOf" minOccurs = "0" maxOccurs =
"unbounded"/>
<xsd:element
ref = "value" minOccurs = "0"/>
<xsd:element
ref = "position" minOccurs = "0"/>
<xsd:element
ref = "dimension" minOccurs = "0"/>
<xsd:element
ref = "properties" minOccurs = "0"/>
<xsd:element
ref = "origin" minOccurs = "0"/>
</xsd:sequence>
<xsd:attribute
name = "id" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "label" type = "xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "properties">
<xsd:annotation>
<xsd:documentation>Catch-all
for extra information. Can contain anything -- text or other elements.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType
mixed = "true">
<xsd:choice>
<xsd:element
ref = "cardinality" minOccurs = "0"/>
</xsd:choice>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "cardinality">
<xsd:annotation>
<xsd:documentation>Cardinality
describes the minimum and maximum number of items that can be the value for an
element. In patternML this is used mainly to define graphs where the cardinality
is attached as a property of the node, allowing the user to specify that a
min/max number of nodes of this type can occur. This capability allows n-m
relations to be modelled.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:attribute
name = "min" type = "xsd:string"/>
<xsd:attribute
name = "max" type = "xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "edge">
<xsd:annotation>
<xsd:documentation>An
edge can have 6 attributes: id (required), label, relname, from (required), to
(required) and directed. The edge id is a unique identifier -- probably
application specific for now but will eventually be a uuid. Label is a pretty
string. Relname is the name (string) of the link that this edge represents.
From is a reference to the id of the node where the edge comes from; to is a
reference to the id of the node where the edge goes to. If the edge is
undirected, it doesn't matter which is put where. The final attribute is
directed -- this is a boolean. Edges can't be bi-directional, only directed or
undirected.
</xsd:documentation>
<xsd:documentation>An
edge element can contain 4 different types of elements in the following
sequence: position, dimension, properties, and origin. All are optional. Origin
(0 or 1 allowed) allows a node/edge defined in another pattern to be referenced
in this pattern. Position and dimension are visualizer specific information.
Properties are a catch-all placeholder for extra information about the edge.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element
ref = "position" minOccurs = "0"/>
<xsd:element
ref = "dimension" minOccurs = "0"/>
<xsd:element
ref = "properties" minOccurs = "0"/>
<xsd:element
ref = "origin" minOccurs = "0"/>
</xsd:sequence>
<xsd:attribute
name = "id" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "label" type = "xsd:string"/>
<xsd:attribute
name = "relname" type = "xsd:string"/>
<xsd:attribute
name = "from" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "to" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "directed" type = "xsd:boolean"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "class">
<xsd:annotation>
<xsd:documentation>A
class element can have 2 attributes: id (required) and label. id is the unique
id of a class and will be application specific for now. Eventually it'll be a
uuid. Label is a pretty string name of the class.
</xsd:documentation>
<xsd:documentation>A
class element can contain 0 or more subClassOf elements that define the
superclasses of this class.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element ref =
"subClassOf" minOccurs = "0" maxOccurs =
"unbounded"/>
</xsd:sequence>
<xsd:attribute
name = "id" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "label" type = "xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "subClassOf">
<xsd:annotation>
<xsd:documentation>A
subClassOf element has one attribute, classid (required), that is the reference
to the id of the superclass. This will be an id in the document -- application
specific for now but eventually will be a uuid. A subClassOf element must be
empty (i.e. cannot contain any elements or text).
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:attribute
name = "classid" use = "required" type =
"xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "instanceOf">
<xsd:annotation>
<xsd:documentation>An
instanceOf element has one attribute, classid (required), that is the reference
to the id of the class that the element is an instance of. This will be an id
in the document -- application specific for now but eventually will be a uuid.
</xsd:documentation>
<xsd:documentation>An
instanceOf element can contain 0 or 1 origin elements. An Origin element here
would be used to refer to the class that the node is an instance of, where the
class is defined in another pattern.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element
ref = "origin" minOccurs = "0"/>
</xsd:sequence>
<xsd:attribute
name = "classid" use = "required" type =
"xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "position">
<xsd:annotation>
<xsd:documentation>A
position element has 3 optional attributes - x, y, z, which are the screen
position of the center of the object. A position element must be empty (i.e.
cannot contain any elements or text). x, y, and z are going to be application
specific depending on the coordinate system used by each application, but are
of use to visualizers that use the same coordinate system.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:attribute
name = "x" type = "xsd:double"/>
<xsd:attribute
name = "y" type = "xsd:double"/>
<xsd:attribute
name = "z" type = "xsd:double"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "dimension">
<xsd:annotation>
<xsd:documentation>A
dimension element has 3 optional attributes - width, height and depth. A
dimension element must be empty (i.e. cannot contain any elements or text).
width, height and depth are the screen dimensions of the object. These are
going to be application specific depending on the coordinate system used by
each application, but are of use to visualizers that use the same coordinate
system.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:attribute
name = "width" type = "xsd:double"/>
<xsd:attribute
name = "height" type = "xsd:double"/>
<xsd:attribute
name = "depth" type = "xsd:double"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "origin">
<xsd:annotation>
<xsd:documentation>An
origin element has 2 attributes, patternId and referenceId (both required). An
origin element must be empty (i.e. cannot contain any elements or text). An
origin element is used to reference an object that is defined elsewhere. The
patternId is a unique identifier of the pattern where the object is defined
(unique to a document/application for now -- uuid eventually). The referenceId
is a unique identifier of the object in the pattern where the object is defined
(unique to a document/application for now -- uuid eventually).
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:attribute
name = "patternId" use = "required" type =
"xsd:string"/>
<xsd:attribute
name = "referenceId" use = "required" type =
"xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element
name = "value">
<xsd:annotation>
<xsd:documentation>A
value element is used to hold the match between the object and a data object.
Currently this element can hold anything - text or elements, allowing different
ways of recording the data element that matched.
</xsd:documentation>
</xsd:annotation>
<xsd:complexType
mixed = "true">
<xsd:sequence>
<xsd:any
minOccurs = "0"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
Appendix B: Example PatternML Files
Example PatternML files.
1) A pattern that queries for all contract killings in Moscow.
It shows the use of cardinality and an instantiated variable (location).
<?xml version = "1.0" encoding = "UTF-8"?>
<pattern
    xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation = "http://www.ai.sri.com/~law/schemas/2002/07/pattern"
    label="moscow contract murder query"
    id="moscow_contract_murder_query1">
  <body>
    <node id="contract_murder.x" label="murder">
      <instanceOf classid="MurderForHire"/>
      <properties><cardinality min="1" max="unbounded"/></properties>
    </node>
    <node id="eventOccursAt.contract_murder.x" label="eventOccursAt">
      <instanceOf classid="Relation"/>
    </node>
    <node id="location.contract_murder.x" label="location">
      <value>Moscow</value>
    </node>
    <edge from="contract_murder.x"
          to="eventOccursAt.contract_murder.x"
          id="contract_murder.x.eventOccursAt"
          label="location"/>
    <edge from="eventOccursAt.contract_murder.x"
          to="location.contract_murder.x"
          id="eventOccursAt.contract_murder.x.location"
          label="location"/>
  </body>
</pattern>
2) A general pattern for a contract killing (simplified),
which contains no sub-patterns. This would be used to match against e.g. murder
evidence to see whether the murder was a contract killing, or instead was of a
different type (e.g. first degree murder, second degree murder, etc.).
<?xml
version = "1.0" encoding = "UTF-8"?>
<pattern
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation =
"http://www.ai.sri.com/~law/schemas/2002/07/pattern"
label="contract murder" id="contract_murder1">
<ontology id="eeld_ontology">
<class id="contract_murder"
label="contract murder">
<subClassOf
classid="murder"/>
</class>
</ontology>
<body>
<node id="contract_murder.x"
label="contract murder">
<instanceOf
classid="MurderForHire"/>
</node>
<node id="murder.motive.x"
label="murder motive">
<instanceOf
classid="Relation"/>
</node>
<node id="motive.x"
label="motive">
<instanceOf
classid="motive"/>
</node>
<node id="subevent.a"
label="subevent">
<instanceOf
classid="Relation"/>
</node>
<node id="subevent.b"
label="subevent">
<instanceOf classid="Relation"/>
</node>
<node id="contractor.x"
label="contractor">
<instanceOf
classid="Person"/>
</node>
<node
id="nameString.contractor.x" label="nameString">
<instanceOf
classid="Relation"/>
</node>
<node id="name.contractor.x"
label="contractor name">
<instanceOf
classid="nameString"/>
</node>
<node
id="directingAgent.contract_murder.x"
label="directingAgent">
<instanceOf
classid="Relation"/>
</node>
<node id="murder.x"
label="murder">
<instanceOf
classid="Murder"/>
</node>
<node id="payment.x"
label="payment of first installment">
<instanceOf
classid="Paying"/>
</node>
<node id="payment.source.x"
label="source">
<instanceOf
classid="Relation"/>
</node>
<node id="payment.recipient.x"
label="recipient">
<instanceOf classid="Relation"/>
</node>
<node id="temporal.sequence.x"
label="before">
<instanceOf
classid="Relation"/>
</node>
<node id="dateOfEvent.murder.x"
label="dateOfEvent">
<instanceOf
classid="Relation"/>
</node>
<node id="date.murder.x"
label="date">
<instanceOf
classid="Date"/>
</node>
<node
id="eventOccursAt.murder.x" label="eventOccursAt">
<instanceOf
classid="Relation"/>
</node>
<node id="location.murder.x"
label="location">
<instanceOf classid="S