SRI International. 333 Ravenswood Avenue. Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

Michihiro Yasunaga, Stanford University

Notice:  Hosted by Muthu Chandrasekaran

Date:  Tuesday, October 15th 2019 at 4:00pm

Location:  EK255 (SRI E building)


We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario, with a crowd worker acting as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing the user that a question is unanswerable. When a user question is answerable by SQL, the expert describes the SQL and the execution results to the user, thereby maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot-value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at

   Bio for Michihiro Yasunaga

Michihiro Yasunaga is a first-year PhD student in Computer Science at Stanford University, where he works on natural language processing (NLP) and machine learning. He has published multiple papers at AAAI and ACL venues. He is interested in designing reliable algorithms to analyze and reason about natural language. In particular, he has contributed to the fields of automated text summarization and natural language interfaces to relational databases. Previously, he worked and published with Prof. Dragomir Radev and Prof. John Lafferty at Yale during his undergraduate studies. He also co-organized the CL-SciSumm Shared Task Series at SIGIR in 2018 and 2019.

   Note for Visitors to SRI

Photography or broadcast of the event is prohibited unless specifically authorized by SRI. Reporters must coordinate with SRI 24 hours before attending.
Please arrive at least 10 minutes early, as you will need to sign in by following the instructions by the lobby phone at Building E (or call Wilma Lenz at 650 859 4904, or Eunice Tseng at 650 859 2799). SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. There are two entrances to SRI International on Ravenswood Ave. Please check the signage for the Building E entrance.
