SRI International. 333 Ravenswood Avenue. Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

Learning from the World

Mihai Surdeanu, Stanford University

Notice:  Hosted by Ramesh Nallapati

Date:  2012-06-20 at 16:00

Location:  EK255 (SRI E building)

   Abstract

Natural language processing (NLP) applications have benefited immensely from the advent of “big data” and machine learning. For example, IBM’s Watson learned to successfully compete in Jeopardy! by using a question answering model trained on millions of Wikipedia pages and other documents. However, this abundance of textual data does not always come free: much of it is of low quality (e.g., the text is often ungrammatical) or does not illustrate exactly the problem of interest. In this talk I will show that such data is still valuable and can be used to train end-to-end NLP applications. I will focus on two specific NLP applications: question answering trained from Yahoo! Answers question-answer pairs, and information extraction trained from Wikipedia infoboxes. I will show that: (a) low-quality text can be made useful by converting it to semantic representations, and (b) training data that incompletely models the problem of interest can be successfully incorporated through noise-aware machine learning models.

   Bio for Mihai Surdeanu

Dr. Mihai Surdeanu is a Senior Research Associate in the Computer Science Department at Stanford University and lead researcher and CTO of Lex Machina, a company that focuses on information extraction and risk analysis in the legal domain. He earned a PhD in Computer Science from Southern Methodist University, Dallas, TX, in 2001. Before joining Stanford in 2008, he worked as a research scientist at Language Computer Corp. (later VP of Engineering), Technical University of Catalonia, and Yahoo! Research Barcelona. His research interests include end-to-end NLP applications such as question answering and information extraction.

   Note for Visitors to SRI

Please arrive at least 10 minutes early, as you will need to sign in by following the instructions posted by the lobby phone at Building E. SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. There are two entrances to SRI International located on Ravenswood Ave. Please check the Building E entrance signage.

©2014 SRI International