AIC Seminar Series
Learning from the World
Mihai Surdeanu, Stanford University
Notice: Hosted by Ramesh Nallapati
Date: Wednesday, June 20th 2012 at 4:00pm
Location: EK255 (SRI E building)
Natural language processing (NLP) applications have benefited
immensely from the advent of big data and machine learning. For
example, IBM's Watson learned to successfully compete in Jeopardy! by
using a question answering model trained on millions of Wikipedia
pages and other documents. However, this abundance of textual data
does not always come free: much of it is of low quality (e.g., the text
is often ungrammatical) or does not illustrate exactly the problem of
interest. In this talk I show that such data is still valuable and can
be used to train end-to-end NLP applications. I will focus on two
specific NLP applications: question answering trained from Yahoo!
Answers question-answer pairs, and information extraction trained from
Wikipedia infoboxes. I will show that: (a) low-quality text can be
made useful by converting it to semantic representations, and (b)
training data that incompletely models the problem of interest can be
successfully incorporated through noise-aware machine learning models.
Dr. Mihai Surdeanu is a Senior Research Associate in the Computer
Science Department at Stanford University and lead researcher and CTO
of Lex Machina, a company that focuses on information extraction and
risk analysis in the legal domain. Mihai Surdeanu earned a PhD degree
in Computer Science from Southern Methodist University, Dallas, TX, in
2001. Before joining Stanford in 2008, he worked as a research
scientist at Language Computer Corp. (later VP of Engineering),
the Technical University of Catalonia, and Yahoo! Research Barcelona.
His research interests include end-to-end NLP applications such as
question answering and information extraction.
Please arrive at least 10 minutes early, as you will need to sign in by
following the instructions by the lobby phone at Building E (or call Wilma
Lenz at 650 859 4904, or Vicenta Lopez at 650 859 5750). SRI is
located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the
parking lots off Fourth Street. Detailed directions to SRI, as well as maps,
are available from the Visiting AIC web page.
There are two entrances to SRI International located on Ravenswood Ave.
Please check the Building E entrance signage.