Machine Learning of Large Datasets and Applications to Biological Systems
Jorge Moraleda
Notice: Hosted by Jeffrey Davitz
Date: Friday November 18, 2005 at 10:30
Location: EJ228
In this talk I will describe my work on algorithms, data structures, and user interfaces for learning Bayesian Networks from large datasets. This work was motivated by a need to analyze biological systems. Learning from data is a hard problem. In particular, learning Bayesian Network structure from data is NP-hard, so heuristic search is necessary to find good models. There are two approaches to improving heuristic search: 1) increasing the speed of model evaluation, so that more models can be searched in a given time, and 2) using better heuristics to generate higher-quality models early in the search. I will present the AD+Tree and Queue Learning. The AD+Tree is a data structure that caches counts from the dataset efficiently, enabling fast evaluation of larger models. Queue Learning is an algorithm for learning Bayesian Network structure that, when applied to large datasets, can produce better models early in the search than existing techniques. I will conclude with an example of the application of these techniques to gene expression data analysis.
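To illustrate why cached counts speed up model evaluation, here is a minimal sketch in Python. It is not the AD+Tree and not Queue Learning: it swaps in a plain memoized count table and a maximum-likelihood family score, which is only the general idea that such structures make efficient. The dataset, the column meanings, and the function names `counts` and `family_log_likelihood` are all hypothetical.

```python
import math
from collections import Counter
from functools import lru_cache

# Hypothetical toy dataset: each row is (rain, sprinkler, wet_grass).
DATA = [
    (0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 0, 1),
    (0, 0, 0), (0, 1, 1), (1, 1, 1), (0, 0, 0),
]

@lru_cache(maxsize=None)
def counts(columns):
    """Count joint value combinations for a tuple of column indices.

    The result is memoized, so each distinct column set scans DATA only once.
    """
    return Counter(tuple(row[c] for c in columns) for row in DATA)

def family_log_likelihood(child, parents):
    """Maximum-likelihood log-likelihood of `child` given its `parents`.

    Computed purely from cached counts: sum over n(parents, child) *
    log(n(parents, child) / n(parents)).
    """
    parents = tuple(sorted(parents))
    joint = counts(parents + (child,))   # child value is the last key entry
    marginal = counts(parents)
    return sum(n * math.log(n / marginal[key[:-1]]) for key, n in joint.items())
```

A structure search would call a score like `family_log_likelihood(2, (0, 1))` many times for overlapping parent sets, and every repeated column set is answered from the cache instead of rescanning the data. A real search would also penalize model complexity (for example with a BIC-style term), which this sketch omits.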
Please arrive at least 10 minutes early in order to sign in and be escorted to the conference room. SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the visitors' lot in front of Building E, and should follow the instructions by the lobby phone to be escorted to the meeting room. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page.