COMPOSITION

1. Introduction
---------------

As described in our RKF proposal, a key feature of Shaken is to allow
SMEs to construct representations by connecting pre-fabricated
representational components, rather than by writing low-level axioms.
By component, we mean a coherent set of axioms which describes some
abstract phenomenon (e.g. the concept of "invade"), and which is
packaged into a single representational unit. By composition, we mean
the connection of such components together, and the computation of
additional implications of the composite set of axioms.

Components are intended to encode fairly abstract phenomena, such as
knowledge about the concepts "invade", "break", "container", and
"control system". Other examples are given in our RKF proposal.

An important claim of our work is that components can be presented to
SMEs as graphs, and that the SME can then perform composition through
graph manipulation operations. As a result, details of the underlying
logic will be hidden from SMEs. This raises two implementation
challenges: first, expressing components as graphs; and second,
translating the user's graph manipulation operations back into logic,
so that as the user manipulates graphs, the system records the logical
equivalent of those operations. We give an overview of the design of
these two tasks in Sections 3 and 4 below.

2. Components and Composition
-----------------------------

We will represent components in the frame language KM. In the simplest
case, the axioms for a component are gathered into a single frame data
structure.
For example, consider a (much simplified) component describing the
concept of "Invade" (see the KM User Manual for the first-order logic
translation):

  (Invade has (superclasses (Attack)))

  (every Invade has
    (invader ((a TangibleThing)))
    (invadee ((a TangibleThing)))
    (firstevent ((the Penetrate subevents of Self)))
    (subevents (
      (a Penetrate with
        (agent ((the invader of Self)))
        (patient ((the invadee of Self)))
        (nextevent ((the Enter subevents of Self))))
      (a Enter with
        (agent ((the invader of Self)))
        (patient ((the invadee of Self)))
        (nextevent ((the TakeControl subevents of Self))))
      (a TakeControl with
        (agent ((the invader of Self)))
        (patient ((the invadee of Self)))))))

This component states, among other things, that the invader of an
invade is a tangible thing, that the invade has three subevents
(penetrate, enter, take control), that the agent in that penetrate is
that invader, etc.

From this and other components, a representation of how a virus invades
a cell can be built, partly by the user specializing and relating roles
in the components, and partly by the computer automatically inferring
implications of those statements. For example, when a virus invades a
cell, the user needs to specify that:

  A1. the invader is the virus
  A2. the invadee is the cell
  A3. the penetrate is performed by means of endocytosis
  A4. the agent in the endocytosis is the invadee (i.e. the cell)
  A5. there is also a delivery taking place
  A6. there are certain correspondences between the invade and the
      delivery, e.g.
      A6.1 the invader (i.e. the virus) is the same as the agent in
           the delivery
      A6.2 the thing delivered is the DNA of that virus.

These statements can be made in logic, providing a specification of how
a virus invades a cell, and from this an inference engine can answer
questions specifically about this activity.

However, the SME will not make such statements directly in logic;
rather, he/she will make them through the graphical CMap interface, as
illustrated in the story-board and explained later in this document.
This is possible because these axioms (specifying the composition) are
generally all of a simple form; the complex axioms about a virus
invading a cell have already been pre-encoded in the components the SME
uses. This is a key scientific claim of our work, namely that by
pre-encoding components, a set of simple types of connections between
them will be adequate for KB construction by an SME.

In the first version of the system, the user will be able to make four
kinds of statements when relating components:

  1. SPECIALIZE: specialize the most specific class of an object in
                 the composition (e.g. A1, A2 above)
  2. UNIFY:      state that two objects in the composition are
                 coreferential (A4, A6.1, A6.2)
  3. CONNECT:    state that a given relationship holds between two
                 objects in the composition (A3)
  4. ADD:        introduce another component to the composition (A5)

These actions are illustrated in the story-board, and each corresponds
to a different type of axiom. For a specific instance of the new concept
being created, the consequents of these axioms have the form:

  1. SPECIALIZE: (isa instance class)
  2. UNIFY:      (= instance1 instance2)
  3. CONNECT:    (relation instance1 instance2)
  4. ADD:        (exists (?x) (isa ?x class))

The antecedents of these axioms contain expressions describing the
instance(s) the SME wants to refer to, i.e. the instance1, instance2
above. So each axiom states (informally):

  Forall instances ?i of the concept being specified
  (e.g. VirusInvadesCell)...
    ...for the instance(s) in some particular relationship to ?i...
      ...the consequent holds for those instance(s).

For instance, an example of a SPECIALIZE axiom is:

  (forall (?i)
    (=> (isa ?i VirusInvadesCell)       ; "In every VirusInvadesCell...
        (exists (?j)
          (and (invader ?i ?j)          ;   the invader...
               (isa ?j Virus)))))       ;   ...is a virus."

or in KM's notation:

  (every VirusInvadesCell has
    (invader ((a Virus))))

3. Visualizing a Component as a Graph
-------------------------------------

A very common pattern of axioms within a component is a
"forall...exists..." pattern, stating that for each instance I of that
component, there will exist a number of additional instances I1,...,In
which are in particular relationships with I and with each other. This
provides a basis for presenting a component to the SME, namely as an
"instance graph" containing

  - a node I denoting an instance of the component (e.g. of Invade).
    We call I the root node of the graph
  - nodes I1,...,In denoting the additional instances implied by the
    axioms
  - arcs denoting the relationships between those instances.

For example, the graph for Invade would look like this:

  [ sketch here ]

Such a graph can be created in two ways:

  - manually
  - automatically in KM, by creating an instance I of the component,
    and then recursively exploring the values on each of its slots.
    This will cause KM to generate Skolem instances I1,...,In denoting
    the existentially quantified variables, and store those values on
    the queried slots, i.e. assert ground facts in the KB of the form
    slot(Ij,Ik), each corresponding to an arc of the graph.

Finally, a graph layout algorithm could add spatial information to the
graph. Thus what the SME will see is essentially an *instance* of the
component.

4. Mapping from Graph Operations back to Logic
----------------------------------------------

A basic action of the SME will be to click on a node (or nodes) in the
graph, and then perform some operation on it (them). The knowledge-base
needs to perform and record the logical equivalent of this operation.
To do this, there are two steps:

  a. synthesize a logic description of the node the user clicked on
  b. assert some implication about instances matching that description,
     that implication corresponding to the operation the user selects.
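
The graph on which these two steps operate is the instance graph of
Section 3. As a rough illustration of the automatic route described
there (create an instance, then recursively explore its slots), the
following Python sketch mimics the idea outside of KM. Everything in it
is an assumption made for illustration: the COMPONENTS dictionary is a
hypothetical, much simplified stand-in for KM frames that lists only
the "a <Class>" slot values, the function name and the instance naming
scheme are invented, coreference expressions such as
(the invader of Self) are not handled, and the depth bound is just one
possible way to keep the graph finite.

```python
from itertools import count

# Hypothetical stand-in for KM frames: for each class, the slots whose
# values are existentially implied ("a <Class>"). Not KM's real storage.
COMPONENTS = {
    "Invade": {
        "invader": ["TangibleThing"],
        "invadee": ["TangibleThing"],
        "subevents": ["Penetrate", "Enter", "TakeControl"],
    },
}


def build_instance_graph(root_class, components, max_depth=3):
    """Create a root instance and recursively explore its slots.

    Returns (root, nodes, arcs), where nodes maps each generated
    instance name to its class and arcs is a list of ground facts
    (slot, source_instance, target_instance).
    """
    counter = count(1)
    nodes = {}
    arcs = []

    def new_instance(cls):
        # Generate a fresh Skolem-style name for an implied instance.
        name = f"_{cls}{next(counter)}"
        nodes[name] = cls
        return name

    def explore(instance, depth):
        # The depth bound guards against unbounded expansion.
        if depth >= max_depth:
            return
        slots = components.get(nodes[instance], {})
        for slot, value_classes in slots.items():
            for cls in value_classes:
                value = new_instance(cls)
                arcs.append((slot, instance, value))
                explore(value, depth + 1)

    root = new_instance(root_class)
    explore(root, 0)
    return root, nodes, arcs


root, nodes, arcs = build_instance_graph("Invade", COMPONENTS)
for slot, source, target in arcs:
    print(f"{slot}({source}, {target})")
```

Each printed fact corresponds to one arc of the graph; a layout
algorithm would then position the nodes for display.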

The logic description of a node N is a first-order formula with a free
variable ?X that is true of N only. Typically this formula will be a
path (role chain) of relationships from the root node I to N. For
example,

  - the invader in the Invade has the logical description

      (invader I ?X)

  - the patient of the Enter subevent has the logical description

      (exists (?E)
        (and (subevents I ?E)
             (isa ?E Enter)
             (patient ?E ?X)))

The assertion the user makes about the ?X (or an ?X and ?Y) will be, as
described earlier, one of:

  SPECIALIZE: (isa ?X class)     (where the user chooses class)
  UNIFY:      (= ?X ?Y)
  CONNECT:    (relation ?X ?Y)   (where the user chooses relation)

The full axiom combines these two with an implication. For example, if
the user clicks on the Invade's invader, and asks to specialize it to
Virus, then the axiom corresponding to this operation is

  (forall (?X)
    (=> (invader I ?X)
        (isa ?X Virus)))

where I is the root node in the graph, namely the instance of the
composition the user is building (e.g. VirusInvadesCell). Finally, this
axiom is generalized to *all* instances of the component by replacing I
with a variable:

  (forall (?I)
    (=> (isa ?I VirusInvadesCell)       ; In every VirusInvadesCell...
        (forall (?X)
          (=> (invader ?I ?X)           ;   the invader...
              (isa ?X Virus)))))        ;   is a Virus.

or in KM's notation

  (every VirusInvadesCell has           ; In every VirusInvadesCell...
    (invader ((a Virus))))              ;   the invader is a Virus.

In this way, as the SME operates on the graph, a trace of the logical
equivalent of those operations will be accumulated by Shaken. When the
SME has completed his/her operations to his/her satisfaction, that set
of assertions will be stored as a new concept in the KB.
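
The two steps of this section (describe the clicked node, then wrap the
chosen assertion in an implication and generalize over the root) can be
sketched in Python as follows. This is only an illustration under
stated assumptions, not Shaken's implementation: the function names and
the encoding of a path as a list of (slot, class) pairs are
hypothetical, intermediate nodes on the path are assumed to be
existentially quantified, and the clicked node's variable is assumed to
be universally quantified in the resulting implication.

```python
def description_formula(path, root, target="?X"):
    """Step a: build the logic description of a node.

    path is a non-empty list of (slot, cls) steps leading from the
    root to the node; cls may be None when the slot alone is enough.
    The result has the free variable `target`.
    """
    literals, bound = [], []
    current = root
    for index, (slot, cls) in enumerate(path):
        is_last = index == len(path) - 1
        node = target if is_last else f"?E{index + 1}"
        if not is_last:
            bound.append(node)  # intermediate node on the role chain
        literals.append(f"({slot} {current} {node})")
        if cls is not None:
            literals.append(f"(isa {node} {cls})")
        current = node
    body = literals[0] if len(literals) == 1 else f"(and {' '.join(literals)})"
    if bound:
        body = f"(exists ({' '.join(bound)}) {body})"
    return body


def specialize_axiom(concept, path, new_class):
    """Step b for SPECIALIZE, generalized to all instances of concept."""
    desc = description_formula(path, "?I", "?X")
    return (f"(forall (?I) (=> (isa ?I {concept}) "
            f"(forall (?X) (=> {desc} (isa ?X {new_class})))))")


def two_node_axiom(concept, path_x, path_y, consequent):
    """Step b for UNIFY or CONNECT; consequent mentions ?X and ?Y."""
    desc_x = description_formula(path_x, "?I", "?X")
    desc_y = description_formula(path_y, "?I", "?Y")
    return (f"(forall (?I) (=> (isa ?I {concept}) "
            f"(forall (?X ?Y) (=> (and {desc_x} {desc_y}) {consequent}))))")


# The user clicks on the invader and specializes it to Virus.
print(specialize_axiom("VirusInvadesCell", [("invader", None)], "Virus"))

# Description of the patient of the Enter subevent, relative to root I.
print(description_formula([("subevents", "Enter"), ("patient", None)], "I"))
```

A UNIFY of two nodes would pass "(= ?X ?Y)" as the consequent of
two_node_axiom, and a CONNECT would pass "(relation ?X ?Y)" with the
relation chosen by the user.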

In the storyboard, the assertions corresponding to the SME's actions are
(in KM notation and combined together):

  (VirusInvadesCell has (superclasses (Invade)))

  (every VirusInvadesCell has
    (invader ((a Virus)))
    (invadee ((a Cell)))
    (subevents (
      (a Penetrate with
        (byMeansOf ((a Endocytosis with
                      (agent ((the invadee of Self)))
                      (patient ((the invader of Self))))))))))

Note that these axioms only specify facts about VirusInvadesCell which
are *in addition* to the more general axioms in Invade, Penetrate, etc.
The question-answering system will then work with these and the more
general axioms to answer questions.

5. Additional Issues
--------------------

There are several additional issues related to composition which are
under discussion, and which we mention now:

1. Using inheritance and inference for computing the graph: It may be
   that the graph shown to the SME should include inherited
   information. For example, Invade will inherit properties like
   "duration", "location", etc. from more general concepts, and it may
   be desirable to offer these to the SME to operate on. Similarly,
   other concepts (e.g. Penetrate) will inherit information. If
   inference is not used to compute the graph the SME sees, he/she may
   miss some important information he/she expects to see. However, if
   inference is used, then some means of controlling it (so as not to
   produce a large or infinite graph) will be needed.

2. Handling implications of the SME's operations: Sometimes, an SME's
   operation will have "ripple"/"knock-on" effects. For example,
   specializing one node may imply that other nodes can also be
   specialized. These effects should be computed and shown to the SME.

3. Additional axiom forms: A mechanism will be needed for handling
   axioms which should be displayed to the SME, but which do not fall
   into the "forall...exists..." pattern (for example, "In all cells,
   all the lysosomes are in the cytoplasm.").

4. Knowledge-Base Management: A mechanism will be needed to check
   compositions which the SME has been working on in and out of the KB
   library, to maintain this library, and to help the SME locate the
   components he/she needs.

5. Errors by the SME: A mechanism is needed to test, detect errors in,
   and help correct the SME's operations. This is described later in
   this document.

6. Browsing Components: Straightforward tools for searching and
   selecting concepts in the taxonomy will be needed.

-- end --