ATTAIN

  • Purpose
  • Description
  • Download
  • Quick start
  • Details
  • Sentence types
  • ICL Solvables
  • Dynamic vocabulary
  • ALTA (ATTAIN Lex Tool Agent)
  • Information

    Purpose

    ATTAIN is a package of natural language OAA agents that parse English sentences and translate them into ICL messages that other agents can use. The package thus plays a role similar to that of the DCG-NL agent, but it is based on SRI's state-of-the-art Gemini parser technology rather than on the Definite Clause Grammar formalism. ATTAIN produces a more expressive ICL representation than DCG-NL does, and it brings several other advantages as well. One practical advantage is that the inflected forms of regular nouns and verbs (e.g. singular/plural, past tense) come for free with Gemini, whereas they have to be entered manually in the DCG-NL agent.

    Description

    This package consists of four OAA2 agents, provided as binaries for Sun/Solaris,
    as well as documentation for the grammar:

    attain_nl_agent
    nl_icl_agent
    attain_lex_tool_agent (ALTA)
    vocabulary_agent
    grammar_html/

    Download

    Binaries are currently provided only for Sun/Solaris.

    Quick start

    1. Download and unpack the archive in a convenient directory.
    2. Start an OAA2 facilitator.
    3. Optionally, start the debug agent.
    4. Start the agents "attain_nl_agent" and "nl_icl_agent" by executing them on the command line, with no arguments. At this point, English sentences can be translated into ICL messages.
    5. Optionally, test the interpretation of English sentences by using the debug agent.
     

    Details

    DCG-NL processes English sentences in two internal stages. In this
    package the two stages have been separated into two distinct agents,
    for increased modularity. The first stage, handled by attain_nl_agent,
    takes an English sentence and translates it into a "logical form" (LF),
    which represents the meaning of the sentence. The second stage,
    handled by nl_icl_agent, simplifies the LF and translates it into the
    ICL language of OAA2.

    Sentence Types

    attain_nl_agent handles all the sentence types handled by DCG-NL, plus a few more.

    Most of the grammar of DCG-NL has been taken over into
    attain_nl_agent. In some instances where the intent of DCG-NL was to
    parse a sentence type but inadvertently didn't, attain_nl_agent does
    successfully parse that type. Some examples are:

    Sentence                                         DCG-NL
    Who will arrive on Tuesday?                      not parsed
    Karen will arrive.                               not parsed
    What letter will Karen send?                     not parsed
    Which manager did Karen send a letter to?        not parsed
    which manager from San Francisco arrived?        not parsed
    which new manager from San Francisco arrived?    not parsed
    The manager who arrived is Karen.                not parsed
    The manager that has arrived is Karen.           not parsed
    The manager that is important arrived.           not parsed
    Which manager is important?                      not parsed
     

    One completely new construction was added: the so-called "double
    object" construction, where both the direct and indirect objects are
    noun phrases. DCG-NL was not intended to handle this construction, and
    gives it an incorrect interpretation. It does correctly interpret the
    construction where the indirect object is a prepositional phrase. So,
    for example,

    Sentence                         DCG-NL
    karen mailed bear a letter       =??karen mailed bear to a letter
    karen mailed a letter to bear    OK

    attain_nl_agent parses both of the above sentences and gives them the
    same interpretation. Question forms of the double object construction
    are parsed as well. For comparison, here is DCG-NL's behavior on a
    couple of question examples:

    Sentence                                DCG-NL
    Did Karen send the manager a letter?    =Did karen send the manager to a letter?
    What did Karen send the manager?        not parsed

    -Logical form-
    The logical forms produced by attain_nl_agent are significantly
    different from those produced by DCG-NL, because DCG-NL uses
    Prolog in its grammar while Gemini does not. Most applications will
    not be concerned with the logical forms (they were not available in
    the DCG-NL agent), so this section may be skipped if desired.

    There are two main differences between DCG-NL LFs and
    attain_nl_agent LFs. The first is that attain_nl_agent wraps a
    predicate and its arguments in an outer wrapper. For example,

    Did Karen arrive on Tuesday?
    DCG-NL:
    arrives([on(some(tuesday([]))),subject(karen)])

    attain_nl_agent:
    vpred(arrives,[subject(karen),ppred(on,quant(some,tuesday,[]))])

    The same is true of noun phrases:

    the letter from karen:
    DCG-NL:
    the(email([from(karen)]))

    attain_nl_agent:
    quant(the,email,[ppred(from,karen)])
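
    The wrapped structure of the attain_nl_agent LFs above can be sketched
    with a small term builder. This is only an illustration of the shape of
    the terms, not code from the package: the Python helpers render, vpred,
    ppred and quant are hypothetical names chosen to mirror the functors in
    the examples, and terms are modeled as (functor, arguments) tuples.

```python
# Toy term model (an assumption for illustration, not part of ATTAIN):
#   atom          -> Python str
#   Prolog list   -> Python list
#   compound term -> (functor, [arguments])

def render(term):
    """Render the toy term model in Prolog term syntax."""
    if isinstance(term, str):
        return term
    if isinstance(term, list):
        return "[" + ",".join(render(t) for t in term) + "]"
    functor, args = term
    return functor + "(" + ",".join(render(a) for a in args) + ")"


def vpred(verb, args):
    """Outer wrapper around a verb and its argument list."""
    return ("vpred", [verb, args])


def ppred(prep, obj):
    """Wrapper around a preposition and its object."""
    return ("ppred", [prep, obj])


def quant(quantifier, noun, modifiers):
    """Wrapper around a quantified noun and its modifier list."""
    return ("quant", [quantifier, noun, modifiers])


# "Did Karen arrive on Tuesday?"
verb_lf = vpred("arrives", [("subject", ["karen"]),
                            ppred("on", quant("some", "tuesday", []))])

# "the letter from karen"
noun_lf = quant("the", "email", [ppred("from", "karen")])

print(render(verb_lf))
print(render(noun_lf))
```

    Rendering the two terms reproduces the attain_nl_agent LFs shown in the
    examples above.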

    The second difference between DCG-NL and attain_nl_agent LFs has to do
    with the order of the arguments of a verb. attain_nl_agent has
    roughly the reverse order of DCG-NL, putting the subject first, as can
    be seen in the example above. This will be discussed in more detail
    under the interpretation section.

    -Interpretation-
    An attempt was made to make attain_nl_agent compatible with
    DCG-NL. The one change has to do with the order of the arguments. DCG-NL
    is not consistent in the order of arguments, varying between
    declaratives and questions. attain_nl_agent consistently puts the
    arguments in this order:

    (subject) direct_object [(addressee) (direction) (indirect object) (prepositional phrases)]
     

    Here are some examples (without the oaa_Add/Solve wrapper):
    karen arrived on tuesday.

    DCG-NL:
    arrives,[[on(some(tuesday([]))),subject(karen)]]

    attain_nl_agent:
    arrives(karen,[on(tuesday)])
     

    did karen arrive on tuesday?

    DCG-NL:
    arrives(karen,[on(tuesday)])

    attain_nl_agent:
    arrives(karen,[on(tuesday)])
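
    The fixed argument order can be pictured as a stable sort on argument
    roles. The following is a minimal sketch under that assumption. The
    role labels and the function order_arguments are hypothetical and do
    not come from nl_icl_agent, and the sketch only orders the arguments.
    It does not build the final ICL term.

```python
# Role names are assumptions made for this sketch; they follow the order
# given in the text: subject, direct object, addressee, direction,
# indirect object, prepositional phrases.
CANONICAL_ORDER = [
    "subject",
    "direct_object",
    "addressee",
    "direction",
    "indirect_object",
    "prepositional_phrase",
]


def order_arguments(args):
    """Sort (role, value) pairs into the canonical order.

    sorted() is stable, so several prepositional phrases keep their
    original relative order. An unknown role raises KeyError.
    """
    rank = {role: i for i, role in enumerate(CANONICAL_ORDER)}
    return sorted(args, key=lambda pair: rank[pair[0]])


# Arguments listed in the DCG-NL declarative order (subject last).
dcg_style = [("prepositional_phrase", "on(tuesday)"), ("subject", "karen")]
print(order_arguments(dcg_style))
```

    With this ordering, a declarative and the corresponding question yield
    their arguments in the same sequence, which is the consistency the
    text describes.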

    An additional interpretation difference is that if pronouns can't be
    resolved, they are translated as 'third_sg' (he/she/it and their
    variants), or 'me' (I/me/my).
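
    The fallback for unresolved pronouns amounts to a small lookup. The
    sketch below assumes a particular set of "variants" for he/she/it
    (him, his, her, hers, its); the text does not list them, so that set
    and the function name are illustrative only.

```python
# The third-person variant list is an assumption; the source only says
# "he/she/it and their variants".
THIRD_SG = {"he", "him", "his", "she", "her", "hers", "it", "its"}
FIRST_SG = {"i", "me", "my"}


def unresolved_pronoun_symbol(word):
    """Return the placeholder used for a pronoun that could not be resolved.

    Returns None for words this sketch does not treat as pronouns.
    """
    w = word.lower()
    if w in THIRD_SG:
        return "third_sg"
    if w in FIRST_SG:
        return "me"
    return None


for pronoun in ["she", "It", "I", "my"]:
    print(pronoun, "->", unresolved_pronoun_symbol(pronoun))
```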
     

    ICL Solvables

    - attain_nl_agent has five solvables:
     nl_to_lf(String, Params, LF)
     nl_get_morph_forms(Entry, Forms)
     nl_look_up_word(Word, Results)
     nl_add_vocab(Vocab)
     nl_remove_vocab(Vocab)

    + nl_to_lf(String, Params, LF)
    Converts an English sentence to a Gemini LF.

    Optional parameters:

     strip_sorts(true|false): Indicates whether the Gemini
    sorts are to be removed from the LF. Sorts are part of Gemini's
    semantic representation, but are not used in the grammar here. The
    default value is "true", so the ATTAIN agents call nl_to_lf/3 without
    passing strip_sorts.

     loose_agr(true|false): Indicates whether subject-verb agreement is
    relaxed (true) or enforced (false) during parsing. The difference is
    illustrated by examples (1-2). With either value of loose_agr, (1) will
    parse. However, (2) will parse only if loose_agr(true) is passed
    as a parameter.

     1. Karen arrives.
     2. Karen arrive.

    The default value for loose_agr is false, i.e. subject-verb agreement
    is enforced. It is strongly suggested that this default be used unless
    it is absolutely necessary to have loose_agr be true, since
    performance can be degraded with loose_agr(true).
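
    The effect of loose_agr on examples (1-2) can be mimicked with a toy
    check. This is not how Gemini enforces agreement. It is a deliberately
    crude sketch that handles only present-tense verbs and treats any verb
    ending in -s as third-person singular. The function names are
    hypothetical.

```python
def agrees(subject_is_plural, verb):
    """Crude present-tense check: a verb ending in -s counts as singular."""
    verb_is_singular = verb.endswith("s")
    return verb_is_singular != subject_is_plural


def parses(subject_is_plural, verb, loose_agr=False):
    """Return True if the toy 'parser' accepts the subject/verb pair."""
    if loose_agr:
        return True  # agreement is not checked at all
    return agrees(subject_is_plural, verb)


# "Karen" is a singular subject in both examples.
print(parses(False, "arrives"))                 # 1. Karen arrives.
print(parses(False, "arrive"))                  # 2. Karen arrive.
print(parses(False, "arrive", loose_agr=True))  # 2. with loose_agr(true)
```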

    + nl_get_morph_forms(Entry, Forms)
    Purpose: Returns the morphological variants of the word in Entry. The
    entry must be in the Gemini "le" format. See
    "dynamic_vocabulary.txt" for details of the "le" format. The word need
    not exist. For example, nl_get_morph_forms(le(blorp, n), Forms) will
    instantiate Forms to the list [blorp, blorps] corresponding to the
    singular and plural of the (invented) noun "blorp".

    This solvable is used by ALTA.
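
    For regular nouns, the kind of result described above can be
    approximated with ordinary English pluralization rules. The sketch
    below is an assumption about what "regular" covers and is not
    Gemini's morphology component. The function regular_noun_forms is a
    hypothetical name, and verbs are not handled.

```python
def regular_noun_forms(stem):
    """Return [singular, plural] for a regular English noun stem."""
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        plural = stem + "es"           # box -> boxes
    elif len(stem) > 1 and stem.endswith("y") and stem[-2] not in "aeiou":
        plural = stem[:-1] + "ies"     # city -> cities
    else:
        plural = stem + "s"            # blorp -> blorps
    return [stem, plural]


print(regular_noun_forms("blorp"))
```

    As with the solvable, the word does not need to exist: the invented
    noun "blorp" yields its singular and plural forms.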

    + nl_look_up_word(Word, Results)

    Purpose: Returns the words and phrases that Word either is or is part
    of, as a list of [Word, Category, Translation] entries. For example,
    with the ATTAIN grammar, nl_look_up_word(mail, Results) instantiates
    Results to:

    [[mail,n,email],[mail,v,mail],[mail,v,mail],
     [[mail,message],n,email],[[mail,messages],n,email]]

    In other words, "mail" can be a noun with the translation "email"; a
    verb with the translation "mail"; and it can be part of the multi-word
    nouns "mail message" and "mail messages", which both have the
    translation "email".

    This solvable is used by ALTA.
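
    The lookup behavior described above (match a word either as a whole
    entry or as one word of a multi-word entry) can be sketched over a
    tiny in-memory lexicon. The lexicon contents and the function
    look_up_word are illustrative assumptions. The real agent consults
    the Gemini lexicon, and its result list may contain entries this
    sketch does not reproduce.

```python
# Hypothetical mini-lexicon: (entry, category, translation).
# A multi-word entry is a tuple of words.
LEXICON = [
    ("mail", "n", "email"),
    ("mail", "v", "mail"),
    (("mail", "message"), "n", "email"),
    (("mail", "messages"), "n", "email"),
    ("letter", "n", "email"),
]


def look_up_word(word, lexicon=LEXICON):
    """Return [entry, category, translation] for each entry containing word."""
    results = []
    for entry, category, translation in lexicon:
        words = (entry,) if isinstance(entry, str) else entry
        if word in words:
            shown = entry if isinstance(entry, str) else list(entry)
            results.append([shown, category, translation])
    return results


print(look_up_word("mail"))
print(look_up_word("message"))
```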

    + nl_add_vocab(Vocab)
    + nl_remove_vocab(Vocab)
    Add and remove dynamic vocabulary items. See "dynamic_vocabulary.txt"
    for details.

    - nl_icl_agent has two solvables:
     lf_to_icl(LF, ICL)
     nl_to_icl(String, Params, ICL)

    + lf_to_icl(LF, ICL)
    Converts an LF into the appropriate ICL message. The ICL messages are
    based on the DCG-NL implementation, with a few changes, described
    under Interpretation.

    + nl_to_icl(String, Params, ICL)

    nl_to_icl/3 is the same solvable that DCG-NL provides: it takes an
    English sentence and translates it into the appropriate ICL message.
    It simply calls the solvables nl_to_lf/3 and lf_to_icl/2. Any
    parameters passed to nl_to_icl/3 are passed on to nl_to_lf/3.

    Optional parameter:
     loose_agr(true|false): same as for nl_to_lf/3
     

    Since nl_icl_agent has the same nl_to_icl/3 solvable as DCG-NL, it is
    possible to use it and attain_nl_agent in place of DCG-NL.
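
    The relationship between the three solvables is plain function
    composition with the parameters forwarded to the first stage. The
    sketch below shows only that control flow. The two stage functions
    are stand-ins that return placeholder strings, not real LFs or ICL
    messages, and every Python name here is hypothetical.

```python
def make_nl_to_icl(nl_to_lf, lf_to_icl):
    """Build an nl_to_icl function from the two stage functions."""
    def nl_to_icl(sentence, params=()):
        lf = nl_to_lf(sentence, params)  # params are forwarded unchanged
        return lf_to_icl(lf)
    return nl_to_icl


# Stand-in stages that record how they were called.
calls = []


def fake_nl_to_lf(sentence, params):
    calls.append(("nl_to_lf", sentence, tuple(params)))
    return "LF<" + sentence + ">"


def fake_lf_to_icl(lf):
    calls.append(("lf_to_icl", lf))
    return "ICL<" + lf + ">"


nl_to_icl = make_nl_to_icl(fake_nl_to_lf, fake_lf_to_icl)
result = nl_to_icl("Karen arrive.", [("loose_agr", True)])
print(result)
print(calls)
```

    Because a caller sees only nl_to_icl, the pair of agents can stand in
    for a single agent offering the same solvable, which is the
    substitution for DCG-NL described above.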



    Information

    Author: Chris Culy

    Implementation language: Prolog

    Availability

    Free download of binary, under restrictions of the OAA Community License.


    Chris Culy <culy@ai.sri.com>

    2001