Seminar by Prof. Srinivas Aluru
Parallel EST Clustering
Prof. Srinivas Aluru
Department of Electrical and Computer Engineering, and
L H Baker Center for Bioinformatics and Biological Statistics
Iowa State University
Ames, IA, USA.
Date: Thursday, January 02, 2003
Time: 03:45 PM
Venue: CS-101
Abstract
Expressed Sequence Tags, abbreviated ESTs, are DNA molecules experimentally derived from expressed portions of genes. Clustering ESTs is important for gene identification and for detecting important genetic variations. In this talk, we will present the design and development of PaCE, a software system for clustering large-scale EST data on parallel computers. The novel features of our approach include 1) the design of memory-efficient algorithms, 2) algorithmic techniques to reduce run-time without affecting the quality of clustering, and 3) the use of parallel processing to facilitate clustering of large data sets. Using a combination of these techniques, PaCE allows the clustering of EST data sets that are an order of magnitude larger than previously feasible. For example, we clustered 327,632 rat ESTs in 47 minutes on a 64-processor IBM xSeries Pentium cluster.
About the Speaker
Srinivas Aluru is an Associate Professor in the Department of Electrical and Computer Engineering and the Laurence H. Baker Center for Bioinformatics and Biological Statistics at Iowa State University. Earlier, he held faculty positions at Syracuse University (1994-1996) and New Mexico State University (1996-1999). He received his B.Tech. degree in Computer Science from IIT Madras in 1989, and his M.S. and Ph.D. degrees in Computer Science from Iowa State University in 1991 and 1994, respectively. His research interests include sequential and parallel algorithms, computational biology, and scientific computing. He is a recipient of an NSF CAREER Award, an IBM Faculty Award, and a Young Engineering Faculty Research Award from Iowa State University.