SRI International.  333 Ravenswood Avenue.  Menlo Park, CA 94025-3493. SRI International is a nonprofit corporation.

AIC Seminar Series

Learning to Extract Proteins and their Interactions from Biomedical Text

Raymond J. Mooney, University of Texas at Austin

Notice:  hosted by Sugato Basu

Date:  2006-07-28 at 14:30

Location:  EJ228

   Abstract

Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data on human genes from the 11 million abstracts in Medline. We have developed and evaluated a variety of learned information-extraction systems for identifying human proteins and their interactions in Medline abstracts. We will present our current best results on identifying names of human proteins using Conditional Random Fields and Relational Markov Networks. We will also present our current best results on identifying interactions between proteins using a Support Vector Machine with an underlying string kernel. Finally, we will summarize results from a recent large-scale application of our techniques, in which we mined 753,459 Medline abstracts to extract a database of 6,580 interactions between 3,737 human proteins. By merging this extracted data with existing databases, we have constructed what is, to our knowledge, the largest database of known human protein interactions, containing 31,609 interactions among 7,748 proteins.

Joint work with Razvan Bunescu, Edward Marcotte, Ruifang Ge, Rohit Kate, Yuk-Wah Wong, and Arun Ramani.

   Bio for Raymond J. Mooney

Raymond J. Mooney is a Professor in the Department of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in 1988 from the University of Illinois at Urbana-Champaign. He is an author of over 100 published research papers, primarily in the area of machine learning. He is program co-chair for the 2006 National Conference on Artificial Intelligence, a recent general chair of the 2005 Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, a former co-chair of the 1990 International Conference on Machine Learning, a former editor of the Machine Learning journal, and a Fellow of the American Association for Artificial Intelligence. His recent research has focused on learning for natural-language processing, text mining, statistical relational learning, semi-supervised learning, bioinformatics, and autonomic computing.

   On-line Resources