Learning Large-scale Paraphrases for Natural Language Understanding and Generation

Wei Xu, Ohio State University

Date:  Tuesday, May 15th 2018 at 4:00pm

Location: EK255 (SRI E building)

Human language is notoriously complex due to the multitude of ways people can express the same meaning (i.e., paraphrases). I will present our work on robust machine learning methods for large-scale paraphrasing, including 1) automatic paraphrase acquisition that exploits multi-instance learning and deep neural networks for semantics; and 2) the use of paraphrases in various natural language generation tasks with machine translation techniques.

In particular, I will highlight the importance of Twitter as the only known data source that can constantly provide up-to-date paraphrases in very large quantities. We designed a series of approaches to extract paraphrases from Twitter. These approaches have much broader coverage than any previous study and can enable natural language processing systems to handle idiomatic expressions, newly coined words, name variations, colloquialisms, and simplifications. I will also show how similar multi-instance learning models can learn large knowledge bases and resolve time expressions via distant supervision.

Wei Xu is an assistant professor in the Department of Computer Science and Engineering at the Ohio State University. Her research focuses on natural language processing, particularly semantics and social media data. She received her PhD in Computer Science from New York University, where she was a MacCracken fellow. She recently received the NSF CRII Award and the CrowdFlower AI for Everyone Award. Previously, she was a postdoctoral researcher at the University of Pennsylvania. She is organizing the International Workshop on Noisy User-generated Text and serving as a workshop co-chair for ACL 2017, an area chair for EMNLP 2016, EMNLP 2018, and COLING 2018, and the publicity chair for NAACL 2016 and 2018. Her research group is currently supported by research funds from DARPA.

Please arrive at least 10 minutes early, as you will need to sign in by following the instructions posted by the lobby phone at Building E (or call Wilma Lenz at 650 859 4904, or Eunice Tseng at 650 859 2799). SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. SRI International has two entrances on Ravenswood Avenue; please check the signage for the Building E entrance.

