Result-targeted Social Media Normalization
Tyler Baldwin, IBM Almaden
Notice: Hosted by Ray Perrault. Webex: audio at 1-888-355-1249 #749045; a web meeting is also available.
Date: 2013-08-19 at 11:00
Location: EJ228 (SRI E building)
The rise of social networks has been mirrored by a corresponding rise in social media analytics, as businesses and organizations take an interest in mining the data to better understand their customers. Natural Language Processing (NLP) tools are often integral to these data mining efforts, as they enable us to obtain a deep understanding of human communication. Unfortunately, the informal writing style employed by authors of social media data is problematic for most NLP tools, which are generally trained on clean, formal text such as newswire data. One possible solution to this problem is normalization, in which the informal text is converted into a more standard formal form.
Although social media normalization is universally motivated by pointing to its role in helping downstream NLP applications, most normalization work gives little to no insight into the effect of the normalization process on the downstream application of interest. In this talk I will give a brief overview of my recent work on social media normalization, with a focus on ensuring that improvements in normalization equate to improvements for the downstream application. The analyses in this work suggest that performing normalization via its traditional narrow definition does not sufficiently improve downstream application performance, and that a system that takes an expanded view of the problem is able to outperform even the gold standard performance of traditional normalization.
Please arrive at least 10 minutes early to sign in and be escorted to the conference room. SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the visitors' lot in front of Building E and should follow the instructions by the lobby phone to be escorted to the meeting room. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page.