Inference in Queries
As an experiment, we consider what kinds of inference is required to answer one of the queries for searching for documents. We phrase this query in terms of the ontologies and theories we have developed for bibliographical references, names, addresses, topics, and dates. We then use SNARK to find answers to this query, based on theories we have developed for these areas.
Before we give the example, it may be useful to explain some basic principles of our ontology. We distinguish between strings, names, and entities. For example, "James Hendler" is a string, (personq "James Hendler") is a name, and james-hendler is a person.
While james-hendler refers to a unique individual (ultimately a URI), the name (personq "James Hendler") is shared by several people. While the name (personq "Washington") is uniquely a name of a person, the same string "Washington" can stand for a person, a city, or a state. These distinctions seem pedantic, but when we try to ignore them we get into trouble.
The relation person-val relates a name of a person with the person
itself. Thus, if
(person-val ?personq ?person)holds,
?personq is a name for ?person. By convention,
?person, ?person0, ?person1, etc., are variables that
range over people; on the other hand, ?personq, ?personq0,
etc., range over names of people.
A paper is of sort paper; a reference to a paper is of sort
paperq. Thus we think of a reference as a kind of name for a paper.
Other concepts are best explained in the context of the example. We consider the first query from the homework assignment:
Find a reference to the most recent paper on SHOE with James Hendler as a co-author.This query may be phrased as follows:
(get ?paperq such-that (and (pub-val ?paperq ?paper) (author ?paper ?person) (person-val (personq "James Hendler") ?person) (about ?paper (topic "SHOE")) (= (pub-to-year ?paper) (year-fn ?natural))) prefer starts-after-starting-of on (year-fn ?natural) time-limit 10)
In other words, we want to find a reference ?paperq
such that
?paperq is a reference to ?paper.
?person is an author of ?paper--there may be other
authors.
(personq "James Hendler") is a name for ?person.
?paper is about the topic SHOE; if the topic ontology contained
subtopics of SHOE, papers on those subtopics would be acceptable.
?paper was published in year (year-fn ?natural); e.g.,
while 2000 is a natural number, (year-fn 2000) is the year
2000.
Furthermore, if there are several papers found that satisfy above criteria,
we prefer the later one. The relation starts-after-starting-of is a
relation on time intervals such that
(starts-after-starting-of ?time-interval1 ?time-interval2)holds if
?time-interval1 starts after ?time-interval2; thus
(starts-after-starting-of (year-fn 2000) (year-fn 1997))
We execute this by first finding a single paper reference that meets all our criteria. We then look for another one, with a later publication date. We do not ask for the latest of all of Hendler's publications, since we can never be sure if we have seen all of them; instead, we give the search a time limit of 10 seconds, and return the latest paper we have found in that time.
Let us follow a few steps on the inference process by which an answer was found.
We begin with a logical sentence obtained from the query:
(Row 193
(or (not (pub-val ?paperq ?paper))
(not (author ?paper ?person))
(not (person-val (personq "James Hendler") ?person))
(not (about ?paper (topic "SHOE")))
(not (= (pub-to-year ?paper) (year-fn ?integer&nonnegative))))
Answer (ans ?paperq (year-fn ?integer&nonnegative)))
Note that, because SNARK is a refutation procedure, queries are negated, and
inference proceeds until a contradiction is obtained. Also note that the
query, and its logical descendents, is accompanied by an Answer
expression indicating what answer we expect to obtain from the proof.
Because our preference is based on the year, we include the year of
publication as part of our answer. The expression integer&nonnegative
is SNARK's internal notation for the sort of natural numbers.
Formulas and their associated answers are called "rows."
We assume the existence of a name service that gives the URI of people
with name "James Hendler"; here that service is simulated by
assertions. Using that assertion on the third disjunct, we obtain the next
row
(Row 194
(or (not (pub-val ?paperq ?paper))
(not (author ?paper james-hendler))
(not (about ?paper (topic "SHOE")))
(not (= (pub-to-year ?paper) (year-fn ?integer&nonnegative))))
(resolve 193 name-of-james-hendler)
Answer (ans ?paperq (year-fn ?integer&nonnegative)))
Here we have decided to look for publications of the individual with URI
james-hendler.
We assume we have access to DamL-annotated publication lists; for the time being, we simulate the contents of the annotation in SNARK notation:
(assert
'(pub-val
(inproceedingsq
author (coq
(personq "Sean Luke")
(personq "Lee Spector")
(personq "David Rager")
(personq "James Hendler"))
title (titleq "Ontology-based Web Agents")
booktitle (titleq "Proceedings of the First International Conference
on Autonomous Agents (Agents97)")
year (year-fn 1997)
publisher (publisherq "Association for Computing Machinery")
address (cityq "New York" "NY" "US")
topic (topic "SHOE"))
shoe-acm-paper)
:name 'shoe-acm-paper-reference)
This is a description of a 1997 paper on SHOE, of which Hendler is a
co-author. Here
(coq ?personq1 ?personq2 ...)is a publication list containing names
?personq1, ?personq2,
....
The reference indicates that the paper appears in a conference proceedings, gives the title of the paper, the title of the proceedings, the date of publication, the publisher, the address of the publisher, and the topic. The notation for references in this theory is based on that of Bibtex.
Names, indicated by the keyword :name, have no logical or functional
significance; they are used only to make the trace easier for us to follow.
This assertion applies to the first disjunct of Row 194; we obtain
(Row 202
(or (not (author shoe-acm-paper james-hendler))
(not (about shoe-acm-paper (topic "SHOE")))
(not (= (pub-to-year shoe-acm-paper)
(year-fn ?integer&nonnegative))))
(resolve 194 shoe-acm-paper-reference)
Answer (ans
(inproceedingsq author
(coq (personq "Sean Luke")
(personq "Lee Spector")
(personq "David Rager")
(personq "James Hendler"))
title (titleq "Ontology-based Web Agents")
booktitle
(titleq
"Proceedings of the First International Conference on
Autonomous Agents (Agents97)")
year (year-fn 1997)
publisher
(publisherq "Association for Computing Machinery")
address
(cityq "New York" "NY" "US")
topic (topic "SHOE"))
(year-fn ?integer&nonnegative)))
Note that the answer already contains a reference satisfying the conditions of our query, but we have not yet checked that all these conditions are satisfied. We also have not yet found the date of this publication, although it is implicit in the reference.
Let us skip ahead to the end of this particular refutation:
(Row 321
false
(resolve 320 name-of-lee-spector)
Answer
(ans
(inproceedingsq
author
(coq (personq "Sean Luke")
(personq "Lee Spector")
(personq "David Rager")
(personq "James Hendler"))
title (titleq "Ontology-based Web Agents") booktitle
(titleq
"Proceedings of the First International Conference on
Autonomous Agents (Agents97)")
year (year-fn 1997)
publisher
(publisherq "Association for Computing Machinery")
address
(cityq "New York" "NY" "US")
topic (topic "SHOE"))
(year-fn 1997)))
Here we have verified all the conditions and have determined that the date of
the publication we have found is 1997.
Since we want the latest possible publication, we begin a new query, to find a publication later than 1997:
(Row 324
(or (not (pub-val ?paperq ?paper))
(not (author ?paper ?person))
(not (person-val (personq "James Hendler") ?person))
(not (about ?paper (topic "SHOE")))
(not (= (pub-to-year ?paper) (year-fn ?integer&nonnegative)))
(not (starts-after-starting-of
(year-fn ?integer&nonnegative)
(year-fn 1997))))
Answer (ans ?paperq (year-fn ?integer&nonnegative)))
This query is the same as the one we started with, except it contains an additional disjunct
(starts-after-starting-of (year-fn ?integer&nonnegative) (year-fn 1997))In other words, we insist that the publication we find is more recent than 1997.
The refutation proceeds similarly to what we have seen already, except this time we find the following reference:
(assert
'(pub-val
(inproceedingsq
author (coq
(personq "Jeff Heflin")
(personq "James Hendler"))
title (titleq "Searching the Web with SHOE")
booktitle (titleq "Artificial Intelligence for Web Search.
Papers from the AAAI Workshop.")
year (year-fn 2000)
publisher (publisherq "AAAI Press")
number (pubnumber "WS-00-01")
address (cityq "Menlo Park" "CA" "US")
topic (topic "SHOE"))
shoe-aaai-paper)
:name 'shoe-aaai-paper-reference)
SNARK uses a temporal reasoning procedure, based on the Allen temporal calculus, to check temporal constraints, whether they arise directly from the query or are a consequence of our preferences.
The AAAI paper satisfies all the constraints and is our next best selection:
(Row 421
false
(resolve 420 name-of-jeff-heflin)
Answer
(ans
(inproceedingsq
author
(coq
(personq "Jeff Heflin")
(personq "James Hendler"))
title
(titleq "Searching the Web with SHOE")
booktitle
(titleq "Artificial Intelligence for Web Search.
Papers from the AAAI Workshop.")
year (year-fn 2000)
publisher (publisherq "AAAI Press")
number
(pubnumber "WS-00-01")
address (cityq "Menlo Park" "CA" "US")
topic (topic "SHOE"))
(year-fn 2000)))
SNARK will continue to look for Hendler SHOE papers more recent than 2000, but we have not given assertions for any. When the time limit is exceeded, it will return the reference for the IEEE paper.
It is also possible to obtain lists of papers, ordered by our preference, so that more recent are returned first.