AIC Seminar Series
CleanTAX: An Infrastructure for Reasoning about Biological Taxonomies
David Thau, University of California, Davis
Notice: hosted by Richard Waldinger
Date: Thursday August 16, 2007 at 16:00
Location: EJ228 (SRI E building)
Data are often classified taxonomically. When differences in nomenclature
occur, multiple taxonomies relating to the same underlying data may
arise. Integrating data that have been classified using different taxonomies
often requires inter-taxonomy relational information. Given a set of taxonomic
constraints, these relations may lead to unintended consequences,
or may create inconsistencies in the data. We propose a logic-based framework
for analyzing taxonomies and the articulations between them. Specifically,
a taxonomy T is viewed as a set of first-order formulas constraining the possible interpretations of names and concepts in T. The
formalization of taxonomies T via our FOL language allows us to clarify
(a) what it means for T to be consistent, (b) to be inconsistent, (c) whether a new relationship between two taxa is implied by T, and (d)
whether two taxonomies T1, T2 from different authorities, together with
a taxonomy mapping (articulation) from a third authority, are mutually consistent.
We illustrate our logic-based formalization and an accompanying architecture
supporting automated reasoning using examples involving the
classification of a genus of plants. We describe the user requirements for
the task of data curation in this context, and demonstrate the utility of
our architecture by discovering inconsistencies and unstated implications
in the data set.
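
As a rough illustration of the consistency and implication questions above,
the following is a minimal sketch and not the CleanTAX implementation. It
assumes a much smaller fragment than full first-order logic: only subset and
disjointness constraints between concepts, with every concept required to be
non-empty. The taxon names and the helper functions are hypothetical.

```python
from itertools import product

def allowed_types(concepts, subsets, disjoint):
    """Return every membership pattern (one bool per concept) that the
    subset and disjointness constraints permit for a single individual."""
    types = []
    for bits in product([False, True], repeat=len(concepts)):
        t = dict(zip(concepts, bits))
        ok_subsets = all(t[b] or not t[a] for a, b in subsets)
        ok_disjoint = all(not (t[a] and t[b]) for a, b in disjoint)
        if ok_subsets and ok_disjoint:
            types.append(t)
    return types

def is_consistent(concepts, subsets, disjoint):
    """Consistent here means every concept can be non-empty at once."""
    types = allowed_types(concepts, subsets, disjoint)
    return all(any(t[c] for t in types) for c in concepts)

def implies_subset(concepts, subsets, disjoint, a, b):
    """True if no permitted individual lies in a but outside b.
    Only meaningful when the constraints are consistent."""
    types = allowed_types(concepts, subsets, disjoint)
    return not any(t[a] and not t[b] for t in types)

# Hypothetical taxa: authority T1 has a genus with two sibling species,
# authority T2 has one species. Pairs (x, y) read "x is a subset of y".
concepts = ["T1.genus", "T1.a", "T1.b", "T2.c"]
t1_subsets = [("T1.a", "T1.genus"), ("T1.b", "T1.genus")]
t1_disjoint = [("T1.a", "T1.b")]  # sibling taxa do not overlap

# Articulation from a third authority: T1.a and T2.c are the same taxon.
articulation = [("T1.a", "T2.c"), ("T2.c", "T1.a")]
# A second articulation that also places T1.b inside T2.c.
bad_articulation = articulation + [("T1.b", "T2.c")]

print(is_consistent(concepts, t1_subsets + articulation, t1_disjoint))
print(implies_subset(concepts, t1_subsets + articulation, t1_disjoint,
                     "T2.c", "T1.genus"))
print(is_consistent(concepts, t1_subsets + bad_articulation, t1_disjoint))
```

With the first articulation the constraints are satisfiable, and the unstated
relationship that T2.c falls within T1.genus follows from them. The second
articulation forces T1.b to be empty, which this sketch reports as an
inconsistency. Enumerating membership patterns is exponential in the number of
concepts, so it is suitable only for toy inputs.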
Dave Thau is a PhD student in the Database Lab at the
University of California at Davis. He works primarily with Bertram
Ludäscher, focusing on scientific data management. Prior to starting
at UC Davis, Dave consulted on a variety of ecology and biodiversity
informatics projects, including the SEEK (scientific environment
for ecological knowledge) project, DiGIR (Distributed Generic
Information Retrieval), AntWeb, and the All Species project. Dave
holds master's degrees in Computer Science and Psychology from the
University of Michigan at Ann Arbor.