A minimalist architecture to generate coherent texts
Eva Banik, Lexis Nexis
Notice: Hosted by Nikhil Dinesh
Date: 2011-02-22 at 11:00
Location: EJ228 (SRI E Building)
The "Holy Grail" of natural language generation (NLG) is to build systems that produce extensive, syntactically complex, coherent texts that are similar in quality to human writing and appropriate in a given context. In practice, the current NLG systems that come closest to meeting these requirements have numerous shortcomings: they tend to have a complex system architecture, they are highly domain-specific, and they are inflexible, producing a limited range of solutions.
In this talk I will explore two hypotheses about underlying design principles in NLG:
1) by modelling interacting constraints from different linguistic levels, we can generate a wide range of syntactically complex, coherent, multisentential texts in a principled way;
2) we can simplify the architecture of NLG systems and increase their efficiency by using a multi-level integrated grammar, which includes interacting linguistic constraints within the elementary structures associated with lexical items.
I aim to show that the local complexity of lexical items in an integrated grammar allows for a minimalist system architecture, opens up new avenues for optimizing the generation process and provides discourse-level features to select contextually appropriate solutions. I will describe and evaluate a proof-of-concept implementation based on the above design principles, which is capable of producing coherent, multisentential, paragraph-length texts. The generator uses a program originally developed as a surface realizer (GenI, http://projects.haskell.org/GenI/) to perform the main tasks expected of an NLG system: text planning, sentence planning, surface realization and pronominalization.
Until recently, Eva was a member of the Natural Language Generation group at the Open University in the UK, headed by Donia Scott and Richard Power. She received her PhD from the Department of Computing in 2010 and is currently a computational linguist at Lexis Nexis in London. Prior to joining the NLG group at the Open University, Eva received an MA in linguistics from the University of Pennsylvania, where she was a member of the XTAG group at the Institute for Research in Cognitive Science and worked on the semantics of Tree Adjoining Grammars.
Please arrive at least 10 minutes early in order to sign in and be escorted to the conference room. SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the visitors lot in front of Building E, and should follow the instructions by the lobby phone to be escorted to the meeting room. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page.