AIC Seminar Series
Exploring Cost-Effective Approaches to Human Evaluation of Search Engine Relevance
Notice: hosted by Jeffrey Davitz
Date: Thursday, June 21, 2007, at 10:00
Location: EJ228 (SRI E building)
Traditional Cranfield approaches to document relevance evaluation
involve judges making judgments on individual documents. The search
engine setting complicates matters by returning a *set* of result
summaries that compete with advertising and other links, such as
spelling suggestions. Evaluation of the relevance of search results in
such a setting needs to account for set-level effects: ensuring that
the returned set does not contain duplicates or near-duplicates, that
it covers most of the common senses of the search query, and that it
ranks the results accurately for the majority of users.
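
To make the set-level idea concrete, here is a minimal sketch in
Python (an illustration only, not the methodology from the talk): the
string-similarity duplicate test, the 0.1 per-pair duplicate penalty,
and the sense-coverage factor are all assumptions made for the example.

    from difflib import SequenceMatcher

    def near_duplicate(a, b, threshold=0.9):
        # Treat two result snippets as near-duplicates when their string
        # similarity is high (an assumed proxy; production systems would
        # use shingling or content hashing instead).
        return SequenceMatcher(None, a, b).ratio() >= threshold

    def item_level_score(judgments):
        # Item-level evaluation: average the per-document judgments.
        return sum(judgments) / len(judgments)

    def set_level_score(snippets, judgments, senses_covered, senses_total):
        # Set-level evaluation: start from the item-level average, then
        # penalize near-duplicate pairs and scale by sense coverage.
        base = item_level_score(judgments)
        dupes = sum(near_duplicate(snippets[i], snippets[j])
                    for i in range(len(snippets))
                    for j in range(i + 1, len(snippets)))
        coverage = senses_covered / senses_total
        return max(0.0, base - 0.1 * dupes) * coverage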
The talk presents a framework of test types and explores the
pros and cons of each type. We compare cost-effective set-level judgments to
item-level judgments and identify the types of queries for which the
item-level methodology misses important aspects.
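
As an illustrative case (again an assumption, not an example taken
from the talk), an ambiguous query can look perfect item by item while
the returned set fails at the set level:

    # Hypothetical results for the ambiguous query "jaguar": each is
    # individually relevant, but the set repeats itself and covers only
    # the animal sense (not the car or the OS).
    snippets = ["jaguar the big cat", "jaguar habitat facts",
                "jaguar the big cat"]
    judgments = [1.0, 1.0, 1.0]
    item_level_score(judgments)                        # 1.0: looks perfect
    set_level_score(snippets, judgments,
                    senses_covered=1, senses_total=3)  # 0.3: penalized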
This is work done at Yahoo and presented at ECIR 2005.
Joint work with Chi Chao Chang and Yun-fang Juan.
My research interests lie at the boundary of application and
theory in Information Extraction, Question Answering, Parse-Based
Feature Classification, Bootstrap Learning, Sampling in Databases,
Active Learning, and Bayesian Model Averaging.
My most recent set of papers is on sampling in databases, based on
an application fielded at Yahoo for over two years that supports
thirteen internal analytics data marts. Prior to that, in the Web
Search group, I did work on statistical evaluation and competitive
analysis of search results, which was important in Yahoo's decision
to acquire Inktomi; it also led to an ECIR paper on a search
evaluation framework.
I received my PhD from UC Irvine for work on Bayesian Model
Averaging. After that, I did research and consulting at IBM Almaden
and Stanford's CLL lab before leaving academia for TiVo. At TiVo, I
led the team that wrote the Suggestions Engine, a system for
recommending TV shows that runs partly in distributed form on over
three million Linux boxes (TiVos). Following TiVo, I was a principal
scientist doing clickstream cluster analysis and text clustering at
Vividence.