SRI International. 333 Ravenswood Avenue. Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

CleanTAX: An Infrastructure for Reasoning about Biological Taxonomies

David Thau, University of California, Davis

Notice:  hosted by Richard Waldinger

Date:  2007-08-16 at 16:00

Location:  EJ228 (SRI E building)

   Abstract

Data are often classified taxonomically. When differences in nomenclature occur, multiple taxonomies relating to the same underlying data may arise. Integrating data that have been classified using different taxonomies often requires inter-taxonomy relational information. Given a set of taxonomic constraints, these relations may lead to unintended consequences, or may create inconsistencies in the data. We propose a logic-based framework for analyzing taxonomies and the articulations between them. Specifically, a taxonomy T is viewed as a set of first-order formulas constraining the possible interpretations of names and concepts in T. The formalization of taxonomies T via our FOL language allows us to clarify (a) what it means for T to be consistent, (b) what it means for T to be inconsistent, (c) whether a new relationship between two taxa is implied by T, and (d) whether two taxonomies T1, T2 from different authorities, together with a taxonomy mapping (articulation) from a third authority, are mutually consistent.

We illustrate our logic-based formalization and an accompanying architecture supporting automated reasoning using examples involving the classification of a genus of plants. We describe the user requirements for the task of data curation in this context, and demonstrate the utility of our architecture by discovering inconsistencies and unstated implications in the data set.
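The idea of treating taxonomies and articulations as logical constraints can be illustrated with a small sketch. This is not the CleanTAX implementation: it replaces first-order reasoning with brute-force enumeration of interpretations over a tiny universe, and it assumes every taxon is non-empty. The taxon names (A1, B1, C1, A2, B2), the relation names, and the helper functions are all hypothetical and chosen only for illustration.

```python
from itertools import combinations, product

# Relations between two taxa, read as relations between their member sets.
RELATIONS = {
    "subset":   lambda a, b: a <= b,
    "equal":    lambda a, b: a == b,
    "disjoint": lambda a, b: not (a & b),
    "overlap":  lambda a, b: bool(a & b),
}

def models(taxa, constraints, universe_size=3):
    """Yield each assignment of non-empty sets to taxa that satisfies all constraints."""
    subsets = [frozenset(c)
               for r in range(1, universe_size + 1)
               for c in combinations(range(universe_size), r)]
    for choice in product(subsets, repeat=len(taxa)):
        m = dict(zip(taxa, choice))
        if all(RELATIONS[rel](m[x], m[y]) for x, rel, y in constraints):
            yield m

def is_consistent(taxa, constraints):
    """True if at least one interpretation (within the bounded universe) exists."""
    return next(models(taxa, constraints), None) is not None

def is_implied(taxa, constraints, candidate):
    """True if the candidate relation holds in every interpretation found.

    Vacuously true when the constraints have no interpretation at all.
    """
    x, rel, y = candidate
    return all(RELATIONS[rel](m[x], m[y]) for m in models(taxa, constraints))

TAXA = ["A1", "B1", "C1", "A2", "B2"]

# Taxonomy T1: B1 and C1 are disjoint children of A1.
T1 = [("B1", "subset", "A1"), ("C1", "subset", "A1"), ("B1", "disjoint", "C1")]
# Taxonomy T2: B2 is a child of A2.
T2 = [("B2", "subset", "A2")]

# Two candidate articulations between T1 and T2.
GOOD_ARTICULATION = [("B1", "equal", "B2")]
BAD_ARTICULATION = [("B1", "equal", "B2"), ("C1", "equal", "B2")]

print(is_consistent(TAXA, T1 + T2 + GOOD_ARTICULATION))  # True
print(is_consistent(TAXA, T1 + T2 + BAD_ARTICULATION))   # False
# An unstated implication: B2 must be disjoint from C1.
print(is_implied(TAXA, T1 + T2 + GOOD_ARTICULATION, ("B2", "disjoint", "C1")))  # True
# Not implied: nothing forces the two parent taxa to coincide.
print(is_implied(TAXA, T1 + T2 + GOOD_ARTICULATION, ("A1", "equal", "A2")))     # False
```

The second articulation is inconsistent because B1 and C1 are declared disjoint, yet both are mapped to the same non-empty taxon B2. Because the search is bounded by `universe_size`, a result of "inconsistent" here only means that no interpretation exists within that bound; a general-purpose reasoner would not have this limitation.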

   Bio for David Thau

Dave Thau is a PhD student in the Database Lab at the University of California at Davis. He works primarily with Bertram Ludäscher, focusing on scientific data management. Prior to starting at UC Davis, Dave consulted on a variety of ecology and biodiversity informatics projects, including the SEEK (Science Environment for Ecological Knowledge) project, DiGIR (Distributed Generic Information Retrieval), AntWeb, and the All Species project. Dave holds Master's degrees in Computer Science and Psychology from the University of Michigan at Ann Arbor.

   On-line Resources