AIC Seminar Series
Superlink-online: A Large-Scale Distributed System for Genetic Linkage Analysis
Notice: hosted by Michael Wolverton
Date: Thursday October 04, 2007 at 16:00
Location: EJ228 (SRI E building) (Directions)
Linkage analysis is a tool used by geneticists for mapping
disease-susceptibility genes in the study of genetic diseases. However,
such analysis is often beyond the capabilities of a single computer.
We present a distributed system called Superlink-online for computing
multipoint LOD scores of large inbred pedigrees.
Superlink-online achieves high performance via parallelization of the
algorithms in Superlink, a state-of-the-art serial program for linkage
analysis tasks, and through utilization of thousands of resources
residing in multiple opportunistic computing environments, aka Grids.
Notably, the system is available online, which allows geneticists to
perform computationally intensive analyses with no need for either
installation of software or maintenance of a complicated distributed
environment.
In this talk I will describe the scheduling system architecture that
drives Superlink-online. The main challenges have been to efficiently
split large tasks for distributed execution in a highly dynamic,
non-dedicated running environment, and to provide nearly interactive
response time for shorter tasks while simultaneously serving massively
parallel ones. The system utilizes resources in all the available
grids, unifying thousands of CPUs over campus grids in the Technion and
the University of Wisconsin in Madison, EGEE grids in Europe, and
the Community Computing Grid Superlink@Technion (via BOINC).
The system is being extensively used by medical centers worldwide.
Since January 2006, over 12,000 interactive genetic analysis tasks
have been performed, utilizing over 240 years of CPU time.
This work has been done as part of Mark Silberstein's PhD at the Technion
under the joint supervision of Prof. Assaf Schuster and Dan Geiger. It has
been published in the American Journal of Human Genetics and presented at
the High Performance Distributed Computing conference in 2006.
Mark Silberstein is a PhD student at the CS department in the
Technion under the joint supervision of Prof. Assaf Schuster and Dan
Geiger. His main research focus has been efficient serial and parallel
algorithms for inference in Bayesian networks (in the context of
genetic linkage analysis), and their execution in large-scale
opportunistic computing environments, aka Grids. He is currently
visiting UC Davis, working with Prof. John Owens on the
parallelization of Bayesian inference on GPUs.