Dynamic vocabulary in the ATTAIN Gemini agent

  • Dynamic vocabulary
  • Adding vocabulary
  • How the attain_nl_agent adds vocabulary
  • Removing dynamic vocabulary
  • The vocabulary agent
  • ALTA (ATTAIN Lex Tool Agent)

    Dynamic vocabulary

    Version 1.1 of the ATTAIN Gemini agent added the ability for agents to
    specify their own additional vocabulary for attain_nl_agent to
    use. When agents join the community, they may specify words that are
    appropriate for the tasks they can carry out. The attain_nl_agent
    then incorporates those new words into its vocabulary. When the agents
    leave the community, any vocabulary unique to them also leaves. (Note
    that if multiple agents add the same word, the word remains in the
    attain_nl_agent vocabulary until the last agent with that word
    leaves.)
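
    The note above describes what is in effect a per-word reference
    count. The following is a minimal sketch of that bookkeeping in plain
    Prolog, not the attain_nl_agent implementation; the predicate names
    owner/2, agent_joins/2, agent_leaves/1 and in_vocabulary/1 are
    hypothetical.

    ```prolog
    :- use_module(library(lists)).
    :- dynamic owner/2.          % owner(Word, Agent): Agent contributed Word

    % An agent joins the community and contributes a list of words.
    agent_joins(Agent, Words) :-
        forall(member(W, Words), assertz(owner(W, Agent))).

    % An agent leaves; only its own contributions are withdrawn.
    agent_leaves(Agent) :-
        retractall(owner(_, Agent)).

    % A word stays in the vocabulary while at least one agent still owns it.
    in_vocabulary(Word) :-
        once(owner(Word, _)).
    ```

    After agent_joins(a1, [boss, need]), agent_joins(a2, [boss]) and
    agent_leaves(a1), the goal in_vocabulary(boss) still succeeds because
    a2 contributes it, while in_vocabulary(need) fails.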

    Adding vocabulary

    Agents that add vocabulary do so using oaa_AddData for the facilitator
    with either vocabulary/1 (preferred) or vocabulary/2 (deprecated). For
    example,

        oaa_AddData(vocabulary(le(need,v)), [])
        oaa_AddData(vocabulary(n, [manager, boss]),[])

    vocabulary/2 corresponds to the DCGNL format for vocabulary. The first
    argument is the category (that is, the part of speech) of the word. The
    second argument is a two-member list: the first member is the
    translation ("meaning") of the word, and the second member is the word
    itself. In the example above, the word being defined is "boss", a noun
    that means 'manager'.

    NOTE

    vocabulary/2 and the DCGNL format are deprecated since they do not
    allow the full flexibility and complexity of vocabulary/1.

    vocabulary/1 corresponds to the Gemini format for vocabulary. The sole
    argument is an le/n term, where n ranges from 2 to 6. In its simplest
    form, le/2 takes two arguments: the first is the word and the second
    is the category. In this case, the word itself is used as the
    translation. If a different meaning is desired, it can be specified
    as an additional argument of the form logical_form:translation. We
    could recast the DCGNL format example above as follows:

          oaa_AddData(vocabulary(le(boss, n, logical_form:manager)), [])

    All forms of le/n require the word and the category. All other
    arguments are optional. The maximal form of le/n is le/6:

        le(Word, Category, features:Feature_List, morph_forms:Morph_list, predicate:Pred, logical_form:LF)

    Arguments that are not specified assume a default value. For example,
    if logical_form is not specified, it defaults to the word itself.
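
    One way to picture the defaulting is a predicate that expands any
    le/2 through le/6 term into the full le/6 shape. The sketch below is
    an illustration in plain Prolog, not the Gemini code. The names
    normalize_le/2 and option_or_default/4 are hypothetical, and the
    defaults chosen for features, morph_forms and predicate ([], [] and
    none) are assumptions; only the logical_form default comes from the
    text above.

    ```prolog
    :- use_module(library(lists)).

    % normalize_le(+Le, -Full): expand an le/2..le/6 term to the le/6 shape.
    % Optional arguments are recognised by their Key:Value label, not by position.
    normalize_le(Le, le(Word, Cat, features:F, morph_forms:M,
                        predicate:P, logical_form:LF)) :-
        Le =.. [le, Word, Cat | Options],
        length(Options, N), N =< 4,
        option_or_default(features,     Options, [],   F),
        option_or_default(morph_forms,  Options, [],   M),
        option_or_default(predicate,    Options, none, P),   % assumed default
        option_or_default(logical_form, Options, Word, LF).  % default: the word

    % Take Key:Value from the optional arguments, else fall back to Default.
    option_or_default(Key, Options, _, Value) :-
        memberchk(Key:Value, Options), !.
    option_or_default(_, _, Default, Default).
    ```

    For instance, normalize_le(le(need, v), Full) binds Full to
    le(need, v, features:[], morph_forms:[], predicate:none,
    logical_form:need).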

    * le arguments
    - Word
    Single-word items are simply given, as above. For a "multi-word" item, the individual words are put in a comma-separated list inside square brackets:

          oaa_AddData(vocabulary(le([chief,executive], n, logical_form:boss)),[])

    Words starting with an uppercase letter should be put in single quotes, since Prolog otherwise reads them as variables.
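
    The quoting rule follows from Prolog syntax: an unquoted token that
    begins with an uppercase letter is read as a variable, not an atom.
    The two queries below use only standard built-ins and show the
    difference for a hypothetical entry for the name Lee.

    ```prolog
    % Unquoted: Lee is an unbound variable, so the entry names no word.
    ?- Entry = le(Lee, pn),   arg(1, Entry, W), var(W).
    % Quoted: 'Lee' is an atom, as a vocabulary word should be.
    ?- Entry = le('Lee', pn), arg(1, Entry, W), atom(W).
    ```

    Both queries succeed, which shows that only the quoted form carries
    the word itself.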
     

    - Category
    In the ATTAIN grammar, the possible categories for words include:

       n         common noun (e.g. business, employee)
       pn        name (e.g. Lee, Pat)
       intr_v    intransitive verb (e.g. go, walk)
       v         transitive verb (e.g. see, send)
       adj       adjective (e.g. important, next)
       adv       adverb (e.g. quickly, soon)
       p         preposition (e.g. to, from)
     

    While other categories for words exist in the ATTAIN grammar, the above categories are the ones most likely to be used in dynamic vocabulary.
     

    - features:[list]
    The features:[list] argument to le/n is unlikely to be used, since
    there are no "features" in the ATTAIN grammar that are likely to be
    specified for dynamic vocabulary.
     

    - morph_forms:[list]
    It is also possible to specify irregular forms of the word, using a
    morph_forms:[list] argument to le.

    In particular, for nouns, we can specify singular and/or plural. For
    example (omitting oaa_AddData, vocabulary):

       le(mouse, n, morph_forms:[sg:mouse, pl:mice])
       le(goose, n, morph_forms:[pl:geese])
       le(pants, n, morph_forms:[sg:'*', pl:pants]) %there is no singular for 'pants' -- it's always plural
       le(milk, n, morph_forms:[pl:'*']) %perhaps there is no plural for 'milk' -- it's always singular

    For verbs, we can specify any of the following forms:
        v_base: the root or "infinitive" (e.g. go). This is almost never necessary.
        s: the third person singular present tense (e.g. goes). This is almost never necessary.
        ed: the past tense of the verb (e.g. went)
        en: the past participle of the verb (e.g. gone)
        ing: the present participle of the verb (e.g. going). This is almost never necessary.

    For example, again omitting oaa_AddData, vocabulary:

        le(go, intr_v, morph_forms:[ed:went, en:gone])

    The order of the irregular forms doesn't matter. Forms that are not
    specified are assumed to be regular. Thus, the above example will correctly produce "go", "goes", "went", "gone", and "going".
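
    The interplay between irregular and regular forms can be sketched as
    follows. This is an illustration only: verb_forms/2 and regular/3 are
    hypothetical names, the regular rules are deliberately naive
    assumptions rather than Gemini's morphology, and the sketch handles
    single-word atoms only.

    ```prolog
    :- use_module(library(lists)).

    % verb_forms(+Le, -Forms): list all five verb forms, taking irregular
    % ones from morph_forms and generating the rest by a regular rule.
    verb_forms(Le, Forms) :-
        Le =.. [le, Base, _Cat | Options],
        (   memberchk(morph_forms:Given, Options) -> true ; Given = [] ),
        findall(Key:Form,
                ( member(Key, [v_base, s, ed, en, ing]),
                  (   memberchk(Key:Form, Given) -> true
                  ;   regular(Key, Base, Form)
                  )
                ),
                Forms).

    % Naive regular rules (assumed for illustration).
    regular(v_base, Base, Base).
    regular(s, Base, Form) :-
        (   sub_atom(Base, _, 1, 0, Last), memberchk(Last, [o, s, x, z])
        ->  atom_concat(Base, es, Form)
        ;   atom_concat(Base, s, Form)
        ).
    regular(ed, Base, Form) :-
        (   sub_atom(Base, _, 1, 0, e)
        ->  atom_concat(Base, d, Form)
        ;   atom_concat(Base, ed, Form)
        ).
    regular(en, Base, Form) :- regular(ed, Base, Form).
    regular(ing, Base, Form) :- atom_concat(Base, ing, Form).
    ```

    With these rules, verb_forms(le(go, intr_v, morph_forms:[ed:went,
    en:gone]), F) binds F to [v_base:go, s:goes, ed:went, en:gone,
    ing:going]. The order of the entries in morph_forms does not affect
    the result.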

    - predicate
    The predicate feature is not used in the ATTAIN grammar.
     

    - logical_form
    The logical form is usually the "meaning" or translation of the word.
     

    How the attain_nl_agent adds vocabulary


    When an agent adds vocabulary to the facilitator using
    oaa_AddData(vocabulary(...), []), the facilitator notifies the
    attain_nl_agent. The attain_nl_agent then uses its own solvable
    nl_add_vocab/1 to process the vocabulary item and add it to the
    parser's internal representation. While it is possible to call the
    nl_add_vocab/1 solvable directly, this is strongly discouraged, since
    the OAA architecture helps manage the dynamic vocabulary when it is
    added via oaa_AddData. One possible exception would be for vocabulary
    that should be available to the community independently of the
    presence of the agent bringing it, but this violates the spirit of
    OAA.
     
     

    Removing dynamic vocabulary


    In order for the removal of dynamic vocabulary to work correctly, you
    must use OAA2 release 12 or higher. Otherwise, the dynamic vocabulary
    is not removed. This is a known issue with OAA2 release 11 and lower.

    An agent need not do anything special to remove its dynamic
    vocabulary. When it leaves the community, its vocabulary is
    automatically removed from the attain_nl_agent (similar to the way in
    which it is added). An agent may also manually remove its dynamic
    vocabulary by using oaa_RemoveData(vocabulary(...)), where the
    vocabulary(...) specification is identical to the one added. As with
    adding vocabulary, it is possible to call the attain_nl_agent's
    solvable nl_remove_vocab/1 manually, but this is strongly discouraged.
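
    Because the term passed for removal must be identical to the one that
    was added, it can help to keep the entry in one place and use it for
    both calls. A small sketch, assuming the OAA agent library is loaded:
    my_entry/1, add_my_vocab/0 and remove_my_vocab/0 are hypothetical
    names, and the call shapes are copied from the text above, so check
    the arity of oaa_RemoveData that your OAA release expects.

    ```prolog
    % Single definition of the entry, shared by the add and remove calls.
    my_entry(vocabulary(le(goose, n, morph_forms:[pl:geese]))).

    % Add the entry through the facilitator.
    add_my_vocab :-
        my_entry(Entry),
        oaa_AddData(Entry, []).

    % Withdraw exactly the same term before the agent leaves.
    remove_my_vocab :-
        my_entry(Entry),
        oaa_RemoveData(Entry).
    ```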
     

    The vocabulary_agent


    The vocabulary_agent provides a means to read dynamic vocabulary from
    a text file. All the entries must be in the le/n format. Comments are
    preceded by %.

    The solvable is: nl_add_vocab_from_file(Filename), where Filename is
    the path to an accessible file containing the entries. The solvable
    can be invoked multiple times, but when the vocabulary_agent leaves
    the community, all of the items added via nl_add_vocab_from_file/1
    will be removed.
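
    As an illustration, a vocabulary file might look like the following.
    The entries are taken from the examples earlier in this document. The
    file name vocab.txt is hypothetical, and terminating each entry with a
    period is an assumption based on ordinary Prolog term syntax, since
    the text above does not say how entries are delimited.

    ```prolog
    % vocab.txt: dynamic vocabulary for a hypothetical agent
    le(need, v).
    le([chief, executive], n, logical_form:boss).
    le(mouse, n, morph_forms:[sg:mouse, pl:mice]).   % irregular plural
    le(go, intr_v, morph_forms:[ed:went, en:gone]).  % irregular past forms
    ```

    The vocabulary_agent would then be asked to solve
    nl_add_vocab_from_file('vocab.txt'); the file name is quoted because
    it contains a period.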
     

    ALTA


    The attain_lex_tool_agent (ALTA) is a tool to help in the construction
    of lexical entries for dynamic vocabulary. See the ALTA notes for
    details.