BIOMEDICAL COMPUTATION @ STANFORD 2003 SYMPOSIUM PROCEEDINGS

BCATS 2003 SYMPOSIUM PROCEEDINGS
Copyright 2003 Biomedical Computation at Stanford (BCATS)
Printed in the United States of America

Editor: Jessica Shapiro
Associate Editor: Serge Saxonov

“Hands” artwork courtesy of Biomedical Information Technology at Stanford (BITS)

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of U.S. copyright law for private use of patrons.

Web Site: http://bcats.stanford.edu/

BIOMEDICAL COMPUTATION AT STANFORD 2003

Symposium Co-Chairs
Alberto Figueroa, Samuel Ieong, Brian Naughton, Serge Saxonov, Jessica Shapiro, Jing Shi, Rong Xu

Administrative Help
Tiffany Jung, Carol Maxwell, Rosalind Ravasio, Kelli Schreckengost, Fiona Sincock

Symposium Volunteers
Angela Chau, Jonathan Dugan, Peter Ebert, Nikesh Kotecha, Zhi Li, Monica R. McLemore, Kevin Pan, Claudia Pérez, Zachary Pincus, Diane Schroeder, Lucy Southworth, Jesse Tenenbaum, Andres Tellez, Leo Wong

Symposium Sponsorship
Biomedical Information Technology at Stanford (BITS); The National Institutes of Health - BISTI; Bio-X

Tier 1 Sponsors
Incyte Corporation; Silicon Graphics, Inc.; Sun Microsystems

Tier 2 Sponsors
Alloy Ventures; Genentech; Roche Bioscience

Tier 3 Sponsors
Affymetrix; Apple Computer; Bay Area Bioinformatics; BioScience Forum; Celera Diagnostics; Hopkins and Carley; IEEE Computer Society Computational Systems Bioinformatics Conference (CSBCON2003)

TABLE OF CONTENTS

I. Symposium Information
   a. Acknowledgements
   b. About BITS
   c. Symposium Schedule and Map

II. Keynote Speakers
   a. Sean Eddy, Ph.D.
   b. Peter Hunter, Ph.D.

III. Abstract List

IV. Scientific Talks Session I

V. Scientific Talks Session II

VI. Poster Session
   a. Poster Titles by Category
   b. Poster Abstracts

VII. Symposium Participant List

VIII. Symposium Sponsors

SYMPOSIUM INFORMATION

ACKNOWLEDGEMENTS

Numerous individuals and organizations have contributed to the 2003 symposium in Biomedical Computation at Stanford.

The organizing committee would like to thank the Biomedical Information Technology at Stanford (BITS) faculty who fostered the establishment of a forum through which Stanford researchers from across the university could share and discuss common interests and outline future directions in biomedical computation. We are particularly grateful to Russ Altman for his continuing guidance and his support of the BCATS conference.

We would like to thank Dr. Sean Eddy and Dr. Peter Hunter for setting the tone for a stimulating program and providing a roadmap through the interactions of biomedicine and computation in the new millennium.

We would like to acknowledge the generous financial support of the Biomedical Information Science and Technology Initiative from the National Institutes of Health and the additional support of the Bio-X initiative at Stanford.

We are also grateful to the following corporate sponsors for their financial support and for promoting biomedical computation: Affymetrix, Alloy Ventures, Apple Computer, Bay Area Bioinformatics, BioScience Forum, Celera Diagnostics, Genentech, Hopkins and Carley, IEEE Computer Society Computational Systems Bioinformatics Conference (CSBCON2003), Incyte Corporation, Roche Bioscience, Silicon Graphics, and Sun Microsystems. In addition, we would like to thank the people at those organizations whose efforts made the sponsorships possible. We would especially like to thank Dr. Douglas Brutlag and Dr. Charles A. Taylor for their assistance in contacting sponsors.

We laud the BCATS 2000 committee for starting BCATS successfully. We are indebted to the BCATS 2002 committee, especially Mike Liang and Serkan Apaydin, for their guidance and assistance.

Finally, the organizing committee wishes to thank the many volunteers and department administrators, especially Tiffany Jung, for their tireless assistance with every aspect of this year’s symposium.

Thank you for participating in BCATS 2003, and we hope you enjoy your day.

The BCATS 2003 Committee

BIOMEDICAL INFORMATION TECHNOLOGY STANFORD

ABOUT BITS

The Biomedical Information Technology at Stanford (BITS) faculty group is the key supporter of BCATS. BITS is an inter-connected, cross-disciplinary group of researchers who develop, share, and utilize computer graphics, scientific computing, medical imaging, and modeling applications in biology, bioengineering, and medicine. Our mission is to establish a world-class biomedical computing and visualization center at Stanford that will support joint initiatives between the Schools of Engineering, Medicine and Humanities and Sciences. Participating labs promote the efficient development of new courses, programs, computational models, and tools that can be used in classrooms, clinical practice, and the biomedical research community. Our goal is to become an international resource for partners in the biotechnology, biomedical device, computing, medical imaging, and software industries. BITS faculty support teaching and training in the biomedical computing sciences and the creation of interdisciplinary biocomputational courses at the undergraduate, graduate, and post-graduate levels, both on-campus and at remote sites. More information can be found at: http://neurosurgery.stanford.edu/bits/index.php

SYMPOSIUM SCHEDULE AND MAP

Saturday, October 25, 2003

8:00am - 8:45am On-Site Registration, Badge Pickup, and Breakfast (TCSEQ) Poster Setup

8:45am - 9:00am Opening Comments (TCSEQ Lecture Hall 200)

9:00am - 9:45am Keynote Address - Dr. Peter Hunter (TCSEQ Lecture Hall 200)

9:45am - 10:00am Break

10:00am - 11:30am Scientific Talks Session I (TCSEQ Lecture Hall 200)

11:30am - 12:30pm Lunch (Stone Pine Plaza)

12:30pm - 1:30pm Poster Session I - Odd numbered posters (Packard Lobby)

1:30pm - 2:15pm Keynote Address – Dr. Sean Eddy (TCSEQ Lecture Hall 200)

2:15pm - 2:30pm Break

2:30pm - 4:00pm Scientific Talks Session II (TCSEQ Lecture Hall 200)

4:00pm - 5:00pm Poster Session II - Even numbered posters (Packard Lobby)

5:00pm - 5:15pm Closing Presentation and Awards (Packard Lobby)

[Map of the symposium venue. Labels: Registration & Check In; Entrance; Packard; Main Auditorium; Lunch (Stone Pine Plaza)]

KEYNOTE SPEAKERS

BCATS 2003 Symposium Proceedings Keynote Speaker

Peter J. Hunter, Ph.D. Institute Director Bioengineering Institute University of Auckland Auckland, New Zealand

MULTI-SCALE MODELING FOR THE IUPS PHYSIOME PROJECT

The IUPS Physiome Project is an attempt to build a comprehensive framework for computational multi-scale modeling of human biochemistry, biophysics and anatomy [1,2]. The goal of this project, sponsored by the International Union of Physiological Sciences (IUPS) and the IEEE Engineering in Medicine and Biology Society (EMBS), is to use computational modeling to analyse integrative physiological function in terms of underlying biological structure and processes. Web-accessible databases of model-related data at the organ system, organ, tissue, and cellular levels are being established to support the project. These databases currently include quantitative descriptions of anatomy, mathematical characterisations of physiological processes, and associated bibliographic information (see www.physiome.org.nz). The challenge for the Physiome Project is to link the revolution in the medical imaging of structure and function with the advances in genomics and proteomics using computational modeling tailored to the anatomy, physiology, and genetics of the individual. In order to achieve this we require the development of comprehensive databases covering a wide range of spatial and temporal scales, linked by models so that the parameters of larger scale models are supported by quantitative experiments and models at finer scales. Ontologies and XML-based data exchange formats are also being developed to support this multiscale modeling framework (see www.cellml.org). The talk will discuss a number of aspects of the IUPS Physiome Project and in particular the progress being made in modeling the heart, lungs and musculo-skeletal system.

[1] Hunter, P.J. and Borg, T.K. Integration from proteins to organs: The Physiome Project. Nature Reviews Molecular Cell Biology. Vol 4, pp 237-243, 2003.
[2] Hunter, P.J., Kohl, P. and Noble, D. Integrative models of the heart: achievements and limitations. Phil. Trans R. Soc. Lond. A 359:1049-1054, 2001.

Sean Eddy, Ph.D. Alvin Goldfarb Distinguished Professor of Computational Biology (Genetics) Associate Professor of Computer Science and Engineering and Biomedical Engineering Washington University St. Louis, Missouri

THE MODERN RNA WORLD: NOT ALL GENES ENCODE PROTEINS

Rather than encoding proteins, some genes produce RNAs that function directly as RNAs. Genome sequence analysis, functional genomics, and new computational algorithms have enabled a number of new experiments that have begun to show that RNA genes and RNA-based regulatory circuits are much more prevalent than we suspected. It is becoming apparent that functional noncoding RNAs are produced from a large class of genes that has been almost invisible to both computational and experimental molecular genetics, just because the properties of RNA genes are unlike what we expect from "normal" protein-coding genes.

Dr. Sean Eddy (B.S., Caltech, 1986; Ph.D., University of Colorado at Boulder, 1991) is a Howard Hughes Medical Institute assistant investigator and the Alvin Goldfarb Distinguished Professor of Computational Biology in the Department of Genetics, Washington University School of Medicine, St. Louis. His research interests include the evolutionary history of catalytic and structural RNAs, development of algorithms for RNA structure analysis, and other aspects of computational genome sequence analysis. He is the author of the HMMER software for biological sequence analysis, a coauthor of the Pfam database of protein domains, and a coauthor of the book Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (Cambridge University Press, 1998).

ABSTRACT LIST

SCIENTIFIC TALKS SESSION I

Subject-Specific Finite Element Modeling of Three-Dimensional Pulsatile Flow in the Human Abdominal Aorta: Comparison of Resting and Simulated Exercise Conditions
Beverly T. Tang, Christopher P. Cheng, Mary T. Draney, Philip S. Tsao and Charles A. Taylor

Probabilistic Consistency-Based Multiple Sequence Alignment
Chuong B. Do, Michael Brudno and Serafim Batzoglou

Fast, Intensity-Based 2D-3D Registration of Clinical Spine Data Using Light Fields
Daniel B. Russakoff, Torsten Rohlfing, John R. Adler, Jr., Ramin Shahidi and Calvin R. Maurer, Jr.

The Cancer Module Map: Combinatorial Organization of Cancer Revealed By The Unification of Genomic Data
Eran Segal, Nir Friedman, Aviv Regev and Daphne Koller

A Literature Based Expression Data Information Extraction Tool
Nipun Mehra, Soumya Raychaudhuri and Russ Altman

Computational Assessment of a Single-Molecule Detection Approach for Sequencing-By-Hybridization
Alexandros Pertsinidis and Steve Chu

BCATS 2003 Symposium Proceedings Abstract List

SCIENTIFIC TALKS SESSION II

3D Modeling of Complex Muscle Geometry
Silvia S. Blemker and Scott L. Delp

Genome-Wide Codon Bias Is Set By Mutational Processes
Swaine Chen, William Lee, Alison K. Hottes and Harley H. McAdams

Predicting the Activity of Transcription Factor Binding Motifs
Yueyi Liu, X. Shirley Liu, Josh M. Stuart, Stuart K. Kim and Serafim Batzoglou

A 3D Statistical Classification Scheme for Computer Aided Detection of Colonic Polyps in CTC
Ping Li, Christopher F. Beaulieu, David S. Paik, R.B. Jeffrey, Jr. and Sandy Napel

Beyond the Human Genome: Automated Functional Annotation of Variation Data
Sean Mooney and Russ Altman

The Virtual Spine: Haptic Epidural Simulator
Samuel Glassenberg and Raymond Glassenberg

POSTER SESSIONS

P01 Unfolding of Proteins: Thermal and Mechanical Unfolding
Joe S. Hur and Eric Darve

P02 Functional and Computational Analysis of Suites of Coregulated Genes in Ciona savignyi
David Scott Johnson and Arend Sidow

P03 Differential Gene Expression Among Naïve, Memory, and Effector Human T Lymphocytes
Michael Asmar, Peter Lee and Susan Holmes

P04 Understanding Beta Hairpin Folding: The Tryptophan Zippers
Christopher Davis Snow and Vijay Pande

P05 Optimizing the Detection of Evolutionarily Constrained Regions in Proteins
Jonathan Binkley, Eric A. Stone and Arend Sidow

P06 Computational Bioinorganic Chemistry: Interaction of Non-Heme Iron Enzymes with Dioxygen
Andrea Decker and Edward I. Solomon

P07 Contextual UMLS® Indexing to Improve the Precision of Concept-Based Representation in XML-Structured Clinical Radiology Reports
Yang Huang, Henry J. Lowe and William R. Hersh

P08 Waves and Outflow Boundary Conditions for One-Dimensional Finite Element Modeling of Blood Flow and Pressure in Arteries
Irene Vignon and Charles A. Taylor

P09 Coupled MD-Finite Element Modeling of Transdermal Drug Delivery
Jee Eun Rim and Peter M. Pinsky

P10 Whole Genome Haplotype Alignment of Ciona savignyi
Kerrin Small and Arend Sidow

P11 Brain Wave Recognition of Emotions in EEG
Elliot Berkman, Dik Kin Wong, Marcos Perreau Guimaraes, E. Timothy Uy, James Gross and Patrick Suppes

P12 Monte Carlo Simulation of Folding Processes for 2D Linkages
Leo Guibas, Rachel Kolodny, Michael Levitt and Ileana Streinu

P13 Visualizing Biological Networks through Selective Reduction and Force Direction
Adam Wright

P14 Protein-Protein Interactions in Poliovirus
Andres Bayani Tellez, Scott Crowder, Serkan Apaydin, Doug Brutlag and Karla Kirkegaard

P15 Multi-Hand Haptic Interaction to Simulate Complex Surgical Procedures
Christopher Sewell, Kenneth Salisbury, Sabine Girod, Tom Krummel and Jean-Claude Latombe

P16 Comparing Housekeeping and Tissue-Specific Gene Promoters: An Analysis of Spurious Transcription Factor Binding Sites
Diane Irene Schroeder and Rick Myers

P17 Three-Dimensional Measurements and Analysis of the Isolated Malleus-Incus Complex
Jae Hoon Sim, Sunil Puria and Charles R. Steele

P18 Image-Based Analytic Surface Representation and Mesh Generation
Erik Jan Bekkers and Charles A. Taylor

P19 An Automatic Lung Segmentation Scheme for Computer-Aided Detection
Shaohua Sun, Geoffrey D. Rubin and Sandy Napel

P20 Simulation by Reaction Path Annealing: Protein Misfolding and Aggregation in a 7 residue peptide from the Yeast Prion Protein Sup35
Jan Lipfert, Joel Franklin, Fang Wu and Sebastian Doniach

P21 Analysis of Nucleotide Diversity in Exonic Splicing Enhancers (ESEs) of Human Membrane Transporter Genes
Bernie Daigle, Jr., Maya K. Leabman, Kathleen M. Giacomini and Russ B. Altman

P22 Predicting HIV Drug Resistance Using Supervised Machine Learning Methods
Jaideep Srinivas Ravela, Asa Ben-Hur and Robert W. Shafer

P23 Robust Neural Decoding of Reaching Movements for Prosthetic Systems
Caleb Kemere, Maneesh Sahani and Teresa Meng

P24 Nanorobots As Cellular Assistants in Inflammatory Responses
Aranzazu Casal, Arancha Casal, Tad Hogg and Adriano Cavalcanti

P25 Predicting Functional Sites on Protein Structures with SeqFEATURE
Mike Hsin-Ping Liang, Russ B. Altman and Doug L. Brutlag

P26 Comparing Conservation and Covariance in Protein Alignments
Anthony Austin Fodor and Richard W. Aldrich

P27 Three-Dimensional Mapping of the Water Structure Around Hydrophobic Solutes
Tanya M. Raschke and Michael Levitt

P28 Genome-Wide Discovery of Transcriptional Modules from DNA Sequence and Gene Expression
Eran Segal, Roman Yelensky and Daphne Koller

P29 Walking Along Chromosomes: Genomic Mapping of Cytogenetic Aberrations from Tumor Databases
Michael Baudis

P30 A Performance-Based Multi-Classifier Approach to Atlas-Based Segmentation
Torsten Rohlfing, Daniel B. Russakoff and Calvin R. Maurer, Jr.

P31 DNA Stacking Interactions of Fluorinated Aromatic C-Deoxyribonucleosides
Jacob S. Lai and Eric T. Kool

P32 Large Scale Study of Protein Domain Distribution in the Context of Alternative Splicing
Shuo Liu and Russ B. Altman

P33 Real-Time Lens Distortion Correction Using Texture Mapping
Michael Bax

P34 Conservation of Known Regulatory Elements and Their Identification using Comparative Genomics
Yueyi Irene Liu, X. Shirley Liu, Liping Wei, Russ B. Altman and Serafim Batzoglou

P35 Image-Based Modeling of Blood Flow in Pulmonary Arteries Using a One-Dimensional Finite Element Method Coupled to a Morphometry-Based Model of the Distal Vessels
Ryan L. Spilker, David Parker, Jeffrey A. Feinstein and Charles A. Taylor

P36 Calcium Quantification in the Aortoiliac Arteries: Interscan Variability of Agatston Scoring vs. Automated Mass Quantification in Noncontrast and Contrast-Enhanced Scans
Bhargav Raman, Raghav Raman, Mercedes Carnethon, Stephen P. Fortmann, Sandy Napel and Geoffrey D. Rubin

P37 Morphological Differences in Thoracic Aortic Aneurysms when Compared to Normal Aortas: A Comparative Study in 49 Patients
Raghav Raman, Zhao Shaohung, Bhargav Raman, Sandy Napel and Geoffrey D. Rubin

P38 2D/3D Registration for Image Guidance in Interventional Radiology
Joyoni Dey, Markus Kukuk and Sandy Napel

P39 Independent Component Analysis (ICA) for Removing Ballistocardiogram and Ocular Artifacts from EEG Data Acquired Inside an MRI Scanner
Gaurav Srivastava, Vinod Menon, Sonia Crottaz-Herbette and Gary H. Glover

P40 Inferring Motifs That Mediate Protein Interactions
Haidong Wang, Eran Segal, Asa Ben-Hur, Douglas Brutlag and Daphne Koller

P41 Community Based Approach To Microarray Research
Brian Howard Null and QuangQiu Wang

P42 Probabilistic Discovery of Overlapping Cellular Processes and Their Regulation
Alexis Battle, Eran Segal and Daphne Koller

SCIENTIFIC TALKS SESSION I

BCATS 2003 Symposium Proceedings Scientific Talks I

SUBJECT-SPECIFIC FINITE ELEMENT MODELING OF THREE-DIMENSIONAL PULSATILE FLOW IN THE HUMAN ABDOMINAL AORTA: COMPARISON OF RESTING AND SIMULATED EXERCISE CONDITIONS

Beverly T. Tang, Christopher P. Cheng, Mary T. Draney, Philip S. Tsao and Charles A. Taylor

Purpose Elevated blood flow associated with exercise has been hypothesized to result in hemodynamic conditions that inhibit atherosclerosis, such as unidirectional laminar flow and increased wall shear stress. While it is known that constant high shear environments (10 dynes/cm²) are able to induce atheroprotective gene expression on endothelial cells in culture, the effects of complex in vivo physiologic flow and shear stress patterns have not yet been determined. In this work, computational methods have been used to quantify hemodynamic conditions in subject-specific models of the human abdominal aorta during resting and exercise conditions.

Materials and Methods Magnetic resonance angiography scans of three healthy subjects (age 20-30) were obtained in a 1.5T GE Signa magnet. Three-dimensional, subject-specific solid models were created from these images using custom software and discretized using a commercially available mesh generation program. Cine phase contrast magnetic resonance images were used to specify inlet and outlet boundary conditions. To simulate light exercise, the cardiac cycle was shortened to represent a 50% increase in resting heart rate, and the total volumetric flow under resting conditions was increased in a manner consistent with in vivo data measured in the abdominal aorta of 11 young, healthy subjects pedaling on an MR-compatible exercise cycle. Flow solutions for each mesh were obtained using a finite element method to solve the incompressible Navier-Stokes equations.

Results Flow solutions computed using resting conditions demonstrated low, recirculating flow in the infrarenal portion of the abdominal aorta during the diastolic portion of the cardiac cycle, and time-averaged wall shear stress revealed areas of low wall shear stress (under 4 dynes/cm²) along the posterior wall in both the infrarenal section and the section opposite to the celiac and superior mesenteric arteries. Under simulated light exercise conditions, higher velocity, more unidirectional flow was observed throughout the cardiac cycle, and regions of low time-averaged wall shear stress present under resting conditions were eliminated with light exercise.

Conclusion The results of the simulations show that regions of low time-averaged wall shear stress and complex, recirculating flow correspond to areas where localization of atherosclerosis has been found in previous studies, and an increase in flow achieved by light exercise can help to eliminate adverse conditions. Furthermore, the differences in results between subjects demonstrate the need to perform subject-specific simulations in order to gain a better understanding of existing hemodynamic conditions. These computational models can now be used to guide gene discovery efforts.
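As an illustration of the post-processing step described in the Results, the following minimal Python sketch averages wall shear stress samples over one cardiac cycle and flags regions that fall below the 4 dynes/cm² level quoted above. It is only a sketch under stated assumptions: the function names, the region names, the uniform time sampling, the use of the mean magnitude, and all numeric values are hypothetical and are not taken from the authors' software or data.

```python
def time_averaged_wss(samples):
    """Mean wall shear stress magnitude over one cardiac cycle.

    Assumes the samples are taken at uniform time steps (an assumption
    made for this sketch, not stated in the abstract).
    """
    return sum(abs(s) for s in samples) / len(samples)


def low_shear_regions(wss_by_region, threshold=4.0):
    """Return the regions whose time-averaged WSS is below threshold (dynes/cm^2)."""
    return sorted(
        name
        for name, samples in wss_by_region.items()
        if time_averaged_wss(samples) < threshold
    )


# Made-up sample values in dynes/cm^2, for illustration only.
rest = {
    "infrarenal_posterior": [6.0, 2.0, -1.0, 1.0],
    "suprarenal": [12.0, 8.0, 5.0, 7.0],
}
flagged = low_shear_regions(rest)
```

With these made-up numbers only the hypothetical "infrarenal_posterior" region falls under the threshold; a real analysis would work on the mesh nodes of each subject-specific model.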

PROBABILISTIC CONSISTENCY-BASED MULTIPLE SEQUENCE ALIGNMENT

Chuong B. Do, Michael Brudno and Serafim Batzoglou

Purpose Multiple sequence alignment is a difficult computational problem for which heuristic techniques sacrifice optimality for reasonable running times. In progressive alignment, an approach in which groups of sequences are aligned according to an evolutionary tree, errors in early stages propagate to the final alignment. To prevent these errors, posterior-probability-based strategies for alignment take into account the distribution of all possible suboptimal alignments when scoring particular residue pairings. Consistency-based schemes use alignments of letters from two sequences to a common position in an outgroup sequence as evidence that the letters are related. Here, we combine posterior and consistency-based techniques to introduce probabilistic consistency, a novel method for robust progressive alignment.

Materials and Methods We developed pcons-OL2, a prototype protein multiple aligner based on probabilistic consistency. To align K sequences, we first compute posterior probabilities for all possible residue pairings in each of the K(K-1)/2 pairwise comparisons. Next, we apply the probabilistic consistency transformation, producing a new set of all-pairs posterior probabilities which incorporate outgroup information. Finally, we compute a guide tree and perform posterior-based progressive alignment.

Results To test our methods, we performed two-fold cross validation with unsupervised EM training over the BAliBASE benchmark database [3], a collection of 141 protein structural alignments. Over the entire database, pcons-OL2 demonstrates high accuracy, averaging 1%, 5% and 8% more correctly aligned columns than T-Coffee [2], CLUSTAL W [4], and DIALIGN [1], respectively; among the four methods, pcons-OL2 finds the unique best alignment 35% of the time, compared to 19%, 13%, and 9% for the others. The accuracy gain is most dramatic, however, in the “twilight zone” of protein alignment (sequences with <25% identity), where the method averages 11%, 4%, and 18% more correctly aligned columns.

Conclusion On BAliBASE, our methods show significant improvement in accuracy over existing tools. Unlike other aligners, pcons-OL2 does not rely on any hand-specified parameters, as its probabilistic generative model allows fully unsupervised EM training. The probabilistic consistency methodology can also be applied to DNA alignment, motif finding, and RNA structure prediction. For aligning distantly related proteins, empirical results demonstrate that probabilistic consistency is the method of choice.

References

1. Morgenstern, B., Frech, K., Dress, A., and Werner, T. 1998. DIALIGN: Finding local similarities by multiple sequence alignment. Bioinformatics, 14: 290-294.

2. Notredame, C., Higgins, D.G., and Heringa, J. 2000. T-Coffee: A novel method for multiple sequence alignments. Journal of Molecular Biology, 302: 205-217.

3. Thompson, J.D., Plewniak, F., and Poch, O. 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research, 27(13): 2682-2690.

4. Thompson, J.D., Higgins, D.G., and Gibson, T.J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22(22): 4673-4680.
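The consistency step in the Materials and Methods can be sketched in a few lines of Python. The abstract does not state the formula, so this sketch assumes one common formulation: the new posterior matrix for a pair (x, y) is the average, over every sequence z in the set, of the matrix product P_xz · P_zy, with P_xx taken to be the identity. All function names and the toy input are hypothetical; this is not the pcons-OL2 implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def get_posterior(post, lengths, x, y):
    """Posterior matrix for (x, y); a sequence paired with itself gives the identity."""
    if x == y:
        return identity(lengths[x])
    if (x, y) in post:
        return post[(x, y)]
    return transpose(post[(y, x)])


def consistency_transform(post, lengths):
    """Average P_xz * P_zy over all z (assumed formulation, see lead-in)."""
    names = sorted(lengths)
    new_post = {}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            x, y = names[a], names[b]
            acc = [[0.0] * lengths[y] for _ in range(lengths[x])]
            for z in names:
                prod = matmul(get_posterior(post, lengths, x, z),
                              get_posterior(post, lengths, z, y))
                for i in range(lengths[x]):
                    for j in range(lengths[y]):
                        acc[i][j] += prod[i][j]
            new_post[(x, y)] = [[v / len(names) for v in row] for row in acc]
    return new_post


# Toy input: three sequences of length 1 with made-up pairwise posteriors.
lengths = {"x": 1, "y": 1, "z": 1}
post = {("x", "y"): [[0.5]], ("x", "z"): [[1.0]], ("y", "z"): [[1.0]]}
new_post = consistency_transform(post, lengths)
```

In this toy case the weak x-y pairing (0.5) is raised to 2/3 because both residues align confidently to the same position in z, which is the intuition behind using outgroup information.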

FAST, INTENSITY-BASED 2D-3D REGISTRATION OF CLINICAL SPINE DATA USING LIGHT FIELDS

Daniel B. Russakoff, Torsten Rohlfing, John R. Adler, Jr., Ramin Shahidi and Calvin R. Maurer, Jr.

Purpose In order to use preoperatively acquired 3D images for intraoperative navigation, the images must be registered to the coordinate system of the operating room. The image-to-physical registration is commonly performed using stereotactic frames or fiducial markers. Alternatively, the preoperative 3D image can be registered to an intraoperative 2D image. Registration of a CT image to one or more X-ray projection images is a particularly interesting example of 2D-3D registration that has a number of possible applications, including radiosurgery, cranial neurosurgery, spinal surgery, and orthopedic surgery. Recently, there has been a great deal of interest in intensity-based methods. One of the drawbacks of such methods is the need to create digitally reconstructed radiographs (DRRs) at each step of the optimization. DRRs are typically generated by ray casting, an operation requiring O(n³) time, where we assume that n is roughly the size (in voxels) of one side of the DRR as well as one side of the CT volume. We address this issue by extending light field rendering techniques from the computer graphics community to generate DRRs instead of conventional rendered images. Using light fields allows most of the computation to be performed in a preprocessing step; after this precomputation, very accurate DRRs can be generated in O(n²) time. Another important issue for 2D-3D registration algorithms is validation. Previously reported 2D-3D registration algorithms were validated using synthetic data or phantoms but not clinical data. We present an intensity-based 2D-3D registration system that generates DRRs using light fields; we validate its performance using clinical data with a known gold standard transformation.

Materials and Methods We used archived data from 6 patients treated for lesions in cervical and thoracic vertebrae with the CyberKnife radiosurgery system (Accuray, Sunnyvale, CA). The CyberKnife solves the registration problem by acquiring an orthogonal pair of X-ray projection images and localizing metal fiducial markers implanted in the vertebra. We validated our registration system using these images but without using the markers during registration; the marker-based transformation was used to compute target registration error (TRE) for the intensity-based registrations.

Results The mean TRE for the 6 patients was 1.35 mm using conventional DRRs and 1.31 mm using light field DRRs. While the accuracy was nearly identical, the execution time using light field DRRs (158 s) was much less than that using conventional DRRs (16,108 s).
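The target registration error used for validation can be illustrated with a short Python sketch. It assumes, for illustration only, that each transformation is a rigid transform given as a 3x3 rotation matrix plus a translation vector, and that TRE is reported as the mean distance between target points mapped by the gold standard transform and by the estimated one. The function names and example values are hypothetical and do not come from the authors' system.

```python
import math


def apply_rigid(rotation, translation, point):
    """Apply a rigid transform (3x3 rotation as rows, 3-vector translation) to a 3D point."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def target_registration_error(gold, estimate, targets):
    """Mean distance (same units as the points) between gold-standard and estimated positions."""
    errors = [
        math.dist(apply_rigid(*gold, p), apply_rigid(*estimate, p))
        for p in targets
    ]
    return sum(errors) / len(errors)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical transforms: the estimate is off by 1 mm along the x axis.
gold = (IDENTITY, (0.0, 0.0, 0.0))
estimate = (IDENTITY, (1.0, 0.0, 0.0))
targets = [(10.0, 0.0, 0.0), (0.0, 20.0, 5.0)]
tre_mm = target_registration_error(gold, estimate, targets)
```

Because the hypothetical estimate differs from the gold standard by a pure 1 mm translation, every target point is displaced by exactly 1 mm and the mean TRE is 1 mm.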

THE CANCER MODULE MAP: COMBINATORIAL ORGANIZATION OF CANCER REVEALED BY THE UNIFICATION OF GENOMIC DATA

Eran Segal, Nir Friedman, Aviv Regev and Daphne Koller

Purpose DNA microarrays are widely used to study the gene expression changes characteristic of tumors. However, these studies are system-specific, and do not address the commonalities and variation between the molecular systems underlying different tumors. It is not clear which mechanisms are general and which are specific across tumors. Although such understanding has important therapeutic implications, few studies have taken a global perspective on this issue. Here, we use the wide availability of microarrays to perform the first comprehensive analysis of a compendium of 1975 microarrays spanning 12 different tumor types collected from 24 publicly available studies, aimed at identifying the shared and unique molecular modules underlying human malignancies.

Materials and Methods We propose to analyze the activity map of cancer by considering the activation or repression of predefined coherent sets of genes in cancer arrays. Such coherent gene sets can be obtained, for example, from known pathways or from genes that are coherently co-expressed. Several previous studies have shown the utility of this type of analysis. However, these analyses were done in the context of a single gene set or a specific tumor type. To obtain a global view of cancer, we analyzed each array from the compendium mentioned above against 2849 (overlapping) gene sets, derived from: clusters of genes coherently expressed in each study (1300 sets); genes specifically expressed in certain tissue types (101 sets); genes known to participate in the same biological process or pathway (1448 sets). The analysis identifies all the gene sets that significantly change in expression in each array: for a particular gene set and a specific array, we test whether the set of genes contains more induced (or repressed) genes than would be expected under the null hypothesis that induced (or repressed) genes are randomly distributed. Overall, we found 299,233 array-gene set pairs where the gene set significantly changed in the array, suggesting that the selected gene sets are highly informative of the cancer compendium. As the gene sets represent biologically meaningful entities, this analysis provides a characterization of arrays at the level of biological processes rather than individual genes. In addition, as the original gene sets are only rough approximations to the core transcriptionally regulated modules, our method also refines these sets, resulting in the core response genes of each of the original gene sets.

Results We show that tumors can be characterized as a combination of activated and deactivated modules. While activation of some modules is highly specific to particular tumors, activation of others is shared across a diverse set of clinical conditions, suggesting common mechanisms that are key for the progression of several tumors. For example, we found a repressed signal transduction module specific to acute lymphoblastic leukemias, and a bone osteoblastic module coherently induced in breast cancer and reduced in hepatocellular carcinoma, acute lymphoblastic leukemias, and lung cancer, suggesting the key role played by these primary tumors in initiating metastasis to bone. Our analysis suggests novel research directions for diagnostic, prognostic and therapeutic studies.
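The per-array gene set test described in the Materials and Methods can be sketched as a one-sided enrichment test. The abstract states only the null hypothesis (induced genes are randomly distributed); this sketch assumes the hypergeometric distribution as the concrete test, which is a standard choice but is not confirmed by the source. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes, n_induced, set_size, induced_in_set):
    """P(X >= induced_in_set) when set_size genes are drawn at random from
    n_genes genes, n_induced of which are induced (hypergeometric tail).
    """
    upper = min(set_size, n_induced)
    total = comb(n_genes, set_size)
    tail = sum(
        comb(n_induced, k) * comb(n_genes - n_induced, set_size - k)
        for k in range(induced_in_set, upper + 1)
    )
    return tail / total


# Hypothetical array: 1000 genes, 50 induced; a gene set of 20 contains 8 induced genes.
p = enrichment_p_value(n_genes=1000, n_induced=50, set_size=20, induced_in_set=8)
```

The same function covers repressed genes by passing the count of repressed genes instead. Because thousands of gene sets are tested against each array, a real analysis would also need a multiple-testing correction, which is not shown here.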

A LITERATURE BASED EXPRESSION DATA INFORMATION EXTRACTION TOOL

Nipun Mehra, Soumya Raychaudhuri and Russ Altman Purpose Different genes in different organisms have often been studied either independently or in small groups of a few genes. The work presented here seeks to use this available information to detect functional coherency of gene clusters. It first tests the coherency based on literature content and then extracts key-words that best describe the features that bring the given cluster together. For example, genes engaged in apoptosis would have the keyword “apoptosis” assigned to them.. Materials and Methods We use literature reference index provided for yeast (SGD) and Fly (FlyBase) for a mapping from PMIDs (NCBI-PubMed assigned document identifiers) to gene symbols. We obtained all these abstracts from PubMed. For every abstract its 19 nearest neighbors are calculated. The functional coherency is tested based on the number of neighbors that refer back to some gene in the given cluster. This number is counted and its distance from a random set of references (Poisson Distribution) is obtained. The average score over all genes gives the coherency score (called NDPG score) [1]. For key-word extraction, we use the reference index to build “pure” positive and negative sets. To build positive sets, we take the top ten scoring documents as most accurately reflecting the function of the group. We also take their nearest neighbors to form the positive data set. The negative data set is formed by generating a random subset from all documents which neither refer into the group nor are the neighbors of a document. A chi-square significance test is used to obtain the most significant words that distinguish between these sets. Results The results of the coherency score reflect a high degree of sensitivity-specificity. Key-word extraction are more interesting. To test our methods, we used genes assigned to Biological process terms of The Gene Ontology Consortium (GO). 
For specific terms like “jump response”, the system achieves high-quality key words. However, for more generic GO terms like “behavior”, the key words are noticeably poorer. Conclusion The system is expected to serve as a complete annotation process for gene expression data. When combined with hierarchical or other clustering methods, the tool can be used as a one-stop shop for expression data annotation. References

1. Raychaudhuri S, Schütze H, Altman RB. “Inclusion of textual documents in the analysis of multidimensional data sets: application to gene expression data”. Machine Learning 52:119-145, 2003.
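The key-word step described in this abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each document has already been reduced to a set of words, and it uses the standard 2x2 chi-square statistic without continuity correction. The function names and toy documents are illustrative only and are not taken from the authors' tool.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table.

    a = positive docs containing the word, b = positive docs without it,
    c = negative docs containing the word, d = negative docs without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_keywords(positive_docs, negative_docs):
    """Rank words enriched in the positive set by their chi-square score."""
    vocab = set().union(*positive_docs, *negative_docs)
    scores = {}
    for word in vocab:
        a = sum(word in doc for doc in positive_docs)
        c = sum(word in doc for doc in negative_docs)
        b = len(positive_docs) - a
        d = len(negative_docs) - c
        # Keep only words that are over-represented in the positive set.
        if a * d > b * c:
            scores[word] = chi_square_2x2(a, b, c, d)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy documents (sets of words), not real abstracts.
positive = [{"apoptosis", "caspase", "cell"},
            {"apoptosis", "death", "cell"},
            {"apoptosis", "caspase"}]
negative = [{"ribosome", "cell"},
            {"transport", "membrane"},
            {"cell", "cycle"}]

ranking = rank_keywords(positive, negative)
print(ranking)
```

In this toy case "apoptosis" ranks first, while "cell" is dropped because it is equally common in both sets. A real run would also need a significance cutoff, which the abstract does not specify.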


BCATS 2003 Symposium Proceedings Scientific Talks I

COMPUTATIONAL ASSESSMENT OF A SINGLE-MOLECULE-DETECTION APPROACH FOR SEQUENCING-BY-HYBRIDIZATION

Alexandros Pertsinidis and Steve Chu Purpose DNA Sequencing-by-Hybridization (SBH) was proposed in the late 1980s as an alternative to gel-based sequencing. In its original form, the method involves testing hybridization of the target sequence with a library of oligonucleotides (typically all 4^l sequences of length l). Unfortunately, the length of a typical target DNA sequence that can be unambiguously reconstructed from the l-mer subsequence spectrum scales only as 2^l; for example, using all 4^8 = 65536 8-mers, one can reconstruct a random sequence of only about 200 bp. Several methods have been proposed for enhancing the resolving power of SBH (e.g. multiplexing, incorporating positional information, using universal bases, etc.); however, a competitive experimental method for SBH has yet to be demonstrated. We propose an SBH scheme that utilizes single-molecule detection: genomic DNA can be stretched on a substrate and probed with fluorescent dye-labeled oligonucleotides. Single-molecule detection is used to determine the position of the probes along the target DNA, and fluorescence resonance energy transfer (FRET) between donor and acceptor dyes provides nearest-neighbor information. With the above data, one attempts to reconstruct the target DNA sequence. Here we present a computational assessment of the practicality of this method. Materials and Methods We performed simulations of Positional SBH+FRET, using random, bacterial and human DNA sequences. Reconstructing the target DNA is formulated as a graph-theoretical problem: one constructs a Hamiltonian graph and searches for a unique Hamiltonian path compatible with the positional constraints. We introduce an algorithm that iteratively refines the original Positional-Hamiltonian Graph and greatly reduces the complexity of the problem. Next, we decompose the refined graph into sub-graphs and utilize the nearest-neighbor information to resolve the remaining ambiguities.
Results For reasonable positional uncertainty (e.g. ~100 bp) we show that 7-mer or 8-mer probes are typically adequate to reconstruct DNA pieces several tens of kb long (much longer than the 500 bp typically read by electrophoresis). Furthermore, the overhead associated with checking FRET between selected probes is small and the total experimental cost remains of order 4^l. Conclusion We demonstrate the potential of our approach for whole-genome de novo sequencing. Several other DNA sequence analysis tasks (e.g. expression profiling, re-sequencing and genotyping/mutation detection) can be easily undertaken using slight variations of the proposed method.
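The ambiguity of classical SBH that motivates this work can be shown with a short sketch. It is not the authors' Positional-Hamiltonian algorithm: it uses no positional or FRET information and simply enumerates every sequence consistent with an l-mer spectrum by depth-first search. The function names are hypothetical, and the example sequence is a standard textbook-style case chosen because it has two valid reconstructions.

```python
from collections import Counter

NUM_8MERS = 4 ** 8  # size of the complete 8-mer probe library


def spectrum(seq, l):
    """Multiset of all l-mers occurring in seq."""
    return Counter(seq[i:i + l] for i in range(len(seq) - l + 1))


def reconstructions(spec, l, start):
    """All sequences starting with `start` that use every l-mer exactly once.

    Assumes l >= 2 and that `start` is an l-mer present in the spectrum.
    """
    total = sum(spec.values())
    results = set()
    remaining = Counter(spec)
    remaining[start] -= 1

    def extend(seq, used):
        if used == total:
            results.add(seq)
            return
        suffix = seq[-(l - 1):]
        for base in "ACGT":
            kmer = suffix + base
            if remaining[kmer] > 0:
                remaining[kmer] -= 1
                extend(seq + base, used + 1)
                remaining[kmer] += 1  # backtrack

    extend(start, 1)
    return sorted(results)


target = "ATGGCGTGCA"
candidates = reconstructions(spectrum(target, 3), 3, "ATG")
print(NUM_8MERS)   # 65536
print(candidates)  # two sequences share the same 3-mer spectrum
```

Both candidates contain exactly the same 3-mers, so hybridization data alone cannot tell them apart. Positional constraints of the kind described in the abstract are one way to remove such ties.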


SCIENTIFIC TALKS SESSION II


BCATS 2003 Symposium Proceedings Scientific Talks II

3D MODELING OF COMPLEX MUSCLE GEOMETRY Silvia S. Blemker and Scott L. Delp

Purpose Computational models of the musculoskeletal system are widely used to study musculoskeletal form and function. However, almost all these models simplify muscles from three-dimensional (3D) deformable bodies to one-dimensional lines of action (Fig. 1). This simplification severely limits the models’ ability to represent muscles with complex geometry. The goal of this work is to develop a new framework for using 3D finite-element (FE) models of muscle to characterize muscle geometry. Materials and Methods Our new framework consists of four main components. First, 3D models of each muscle and its underlying bones are generated based on segmentation of magnetic resonance (MR) images of the anatomy in one limb position. Second, representations of the muscle’s fiber geometry are created by mapping a template fiber architecture to the 3D model (Figs. 2 & 3). Third, motions of the joints are prescribed as boundary conditions, and the resulting deformations of the muscles are calculated via simulations performed in the implicit FE solver NIKE3D [3]. In the solution, a transversely isotropic, incompressible, hyperelastic constitutive model [4] is used to describe the stress-strain relationship in the muscle tissue, and a penalty formulation is used to resolve muscle-bone contact. Fourth, the simulation results are analyzed in a graphics-based musculoskeletal modeling environment, and the lengths and moment arms of all fibers in a muscle are calculated. To demonstrate the utility of the framework, we created models of the psoas major (a muscle with fibers that follow a complex path as they wrap around multiple bones) and the gluteus maximus (a muscle with a complex arrangement of fibers that also wrap around multiple bones). Results The 3D models were able to predict individual fiber moment arms; the moment arms of the psoas muscle fibers (Fig. 4) varied by ~1 cm and the moment arms of the gluteus maximus fibers (Fig. 5) varied by ~5 cm, demonstrating that fibers within a muscle can have highly variable moment arms. The average moment arms (across all fibers) for each muscle compared well with moment arms calculated from existing 1D models [1,2]. Conclusion Our new formulation for representing muscle geometry can characterize muscles with complex fiber geometry, a challenge that has not been met in previous models of the musculoskeletal system. Thus, this new framework will enhance the utility of computer models of the musculoskeletal system. References

1. Arnold et al. Comp Aid Surg, 5, 108-19, 2000.

2. Delp et al. IEEE Trans on BME, 37, 757-767, 1990.

3. Puso et al. LLNL Technical Report, 2002.

4. Weiss et al. Comput Methods Appl Mech Engrg, 135, 107-128, 1996.

Web Page http://www.stanford.edu/~ssalinas/BCATS03


GENOME-WIDE CODON BIAS IS SET BY MUTATIONAL PROCESSES Swaine L. Chen, William Lee, Alison K. Hottes and Harley H. McAdams

Purpose Codon bias was discovered soon after the sequencing of the first genes. Understanding how and why codon bias varies is important for understanding molecular evolution, population genetics, genome structure, and gene expression. The space of possible codon biases is extremely complex, and its underlying structure, if any, is unknown. Codon bias is maintained by both selection and mutation, though their relative importance is still unclear. With many genomes now sequenced, we can examine the internal structure of codon bias and the roles of selection and mutation in maintaining overall patterns of codon bias. Materials and Methods Published genome sequences are taken from GenBank. We use a singular value decomposition to model codon bias and intergenic nearest-neighbor nucleotide biases for genes from 100 sequenced prokaryotic organisms. We introduce a method to quantitatively separate variation in codon bias between and within genomes. Least squares techniques are used to correlate changes in codon bias with intergenic mutational parameters. Results Differences in codon bias between genomes vary mostly in two dimensions; therefore, two parameters are necessary and largely sufficient for describing genome-wide codon bias. These two parameters correlate with GC content and a linear combination of nearest-neighbor nucleotide biases. These two parameters may be estimated from intergenic sequence data, allowing prediction of an organism’s genome-wide codon bias from its intergenic sequences. Using a model derived from prokaryotic genomes, codon bias in several eukaryotes is also predicted accurately from corresponding intergenic sequences. Conclusion Codon bias can vary in 41 dimensions (parameters). Despite this massive potential complexity, the genome-wide codon bias of organisms examined effectively varies in only 2 parameters. GC content is known to be the most important parameter. 
We show that the only other important parameter is a linear combination of nearest-neighbor nucleotide biases. Because GC content and nearest-neighbor nucleotide biases may be calculated from intergenic sequence, codon bias may be predicted from intergenic sequences. Surprisingly, only a single model is needed to predict genome-wide codon bias from intergenic sequences in all three domains of life. Thus, genome-wide codon bias is almost entirely dependent on mutational forces in organisms as diverse as bacteria and plants. Selection, on the other hand, plays a larger role in maintaining variation in codon bias between individual genes.
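The figure of 41 dimensions quoted in the conclusion can be reproduced from the standard genetic code. The sketch below is one way to arrive at it, assuming the count is 61 sense codons minus 20 amino-acid families, since codon frequencies within each family must sum to one. It also computes within-family codon frequencies for a single coding sequence. It does not reproduce the authors' singular value decomposition, and the helper names and example sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degrees_of_freedom(table=CODON_TABLE):
    """Sense codons minus amino-acid families (one constraint per family)."""
    sense = [aa for aa in table.values() if aa != "*"]
    return len(sense) - len(set(sense))


def synonymous_frequencies(cds, table=CODON_TABLE):
    """Frequency of each codon within its amino-acid family for one CDS."""
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    family_totals = defaultdict(int)
    for codon, n in counts.items():
        aa = table.get(codon)
        if aa is not None and aa != "*":
            family_totals[aa] += n
    return {codon: n / family_totals[table[codon]]
            for codon, n in counts.items()
            if table.get(codon) not in (None, "*")}


dof = degrees_of_freedom()
freqs = synonymous_frequencies("GCTGCTGCCAAA")  # hypothetical 4-codon sequence
print(dof)    # 41
print(freqs)
```

Stacking such frequency vectors for many genomes into a matrix gives the kind of input on which a decomposition like the one described in the abstract could be run.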


PREDICTING THE ACTIVITY OF TRANSCRIPTION FACTOR BINDING MOTIFS

Yueyi Liu, X. Shirley Liu, Josh M. Stuart, Stuart K. Kim and Serafim Batzoglou Purpose Transcription factors bind to their cis-regulatory motifs and activate or repress the transcription of specific genes in a manner that depends on the current state of the cell. One of the challenges for characterizing the regulatory circuitry of the cell is to identify the cellular contexts under which an interaction between a transcription factor and a particular binding motif occurs and when this interaction leads to specific transcription of its target genes. Materials and Methods We present ArrayScan, a method that predicts the experimental conditions under which cis-regulatory motifs direct transcription. Given a putative motif and a large set of microarray expression data, ArrayScan uses linear regression to search for experiments in which the motif scores are significantly correlated with the expression levels measured by the microarray. Results Using two well-studied motifs, RAP1 from S. cerevisiae and HSE from C. elegans, we show that ArrayScan can successfully identify microarray experiments under which RAP1 and HSE are actively directing the transcription of their downstream genes. These motifs have significantly higher regression correlation than randomly generated motifs. We also applied ArrayScan to a novel CIN5/YAP6 motif for which little is known. Our results suggest CIN5 and YAP6 bind to their motifs when the cell experiences challenges to chromosomal integrity such as irradiation or oxidative stress. These predictions are consistent with what is known about these motifs and provide concrete hypotheses that can be tested in the laboratory.
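The regression idea behind ArrayScan can be sketched as follows. This is a simplified illustration, not the ArrayScan implementation: it computes, for each experiment, the Pearson correlation between per-gene motif scores and expression values (for a single predictor this is the signed square root of the regression R^2) and ranks experiments by its magnitude. The significance testing against random motifs mentioned in the abstract is omitted, and all names and numbers are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_experiments(motif_scores, expression):
    """Rank experiments by |r| between motif score and expression.

    `expression` maps an experiment name to per-gene expression values,
    listed in the same gene order as `motif_scores`.
    """
    results = [(name, pearson_r(motif_scores, values))
               for name, values in expression.items()]
    return sorted(results, key=lambda item: -abs(item[1]))


# Hypothetical data for four genes and two experiments.
motif_scores = [0.0, 1.0, 2.0, 3.0]
expression = {
    "heat_shock": [0.1, 2.1, 4.1, 6.1],   # tracks the motif score
    "control": [1.0, -1.0, -1.0, 1.0],    # unrelated to the motif score
}
ranked = rank_experiments(motif_scores, expression)
print(ranked)
```

Experiments at the top of the ranking are the candidate conditions under which the motif appears to be active.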


A 3D STATISTICAL CLASSIFICATION SCHEME FOR COMPUTER AIDED DETECTION OF COLONIC POLYPS IN CTC

Ping Li, Christopher F. Beaulieu, David S. Paik, R.B. Jeffrey, Jr. and Sandy Napel Purpose Detecting colonic polyps in 3D CTC images is a well-known difficult problem. A previously proposed algorithm, which relies solely on counting the colon Surface Normal Overlapping (SNO), can detect polyps at high sensitivity but at the cost of a large number of false positives (FP), i.e., low specificity. We have developed a novel algorithm that exploits the rich 3D features of the suspicious sub-volumes detected by SNO. A statistical classification scheme is adopted to distinguish polyps from non-polyps. This algorithm has been validated on a data set of 67 CTC cases. Materials and Methods As an initial step, the SNO algorithm was run on each CTC data set. A common threshold was used that ensured that 95% of the polyps >= 5 mm were included. This step generated 6926 suspicious locations (sub-volumes), of which 118 were true polyps. For each sub-volume, the surface pixels whose normals contribute to the SNO scores form a 3D manifold. For an ideal polyp, the SNO manifold is close to a hemisphere, while for a haustral fold, the SNO manifold is ideally a half cylinder with a very small height. A 3D connected-component analysis is performed on each SNO manifold. For each component, 7 features are extracted: the surface area (A), the average length and standard deviation of the surface normals (L and Lstd), the major and minor axes of the projected ellipse (S1 and S2), the percentage of the ellipse covered by the SNO manifold (R), and the closest distance to the largest component (D). Using Principal Component Analysis, an ellipse is obtained by fitting the projection of a component on its base plane, whose normal vector is the average of the surface normals. Each component is assigned a weight (W), which is heuristically determined by Ai/A1, Li/L1, as well as Di, where the subscript indicates the ith largest component.
Finally, a 9-dimensional feature vector, composed of combinations of the extracted component features, is input to a simple linear regression classifier. Results 108 out of 118 true polyps (>= 5 mm) were correctly classified, and 2000 out of 6808 non-polyps (on average 29.9 FP/case) were classified as polyps. This corresponds to 91.5% sensitivity with 70.6% specificity. Including the sensitivity loss due to initial thresholding, the overall sensitivity is 86.9%. To achieve the same sensitivity, the SNO algorithm alone produces 60.6 FP/case. Conclusion The initial results show that the 3D statistical classification algorithm can detect true polyps at high sensitivity with a much smaller number of false positives than the previous algorithm. It is expected that combining our feature extraction scheme with more advanced classification methods will further improve the results.
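The reported performance figures follow directly from the counts given in the abstract. The short calculation below re-derives them as a consistency check. The variable names are illustrative, and the overall sensitivity assumes that the 95% threshold retention and the classifier sensitivity simply multiply.

```python
# Counts taken from the abstract.
candidates = 6926            # suspicious sub-volumes after SNO thresholding
true_polyps = 118            # candidates that were true polyps (>= 5 mm)
detected_polyps = 108        # true polyps kept by the classifier
false_positives = 2000       # non-polyps classified as polyps
cases = 67                   # CTC cases in the data set
threshold_sensitivity = 0.95 # fraction of polyps retained by the SNO threshold

non_polyps = candidates - true_polyps
sensitivity = detected_polyps / true_polyps
specificity = (non_polyps - false_positives) / non_polyps
fp_per_case = false_positives / cases
overall_sensitivity = threshold_sensitivity * sensitivity

print(f"non-polyps:          {non_polyps}")
print(f"sensitivity:         {100 * sensitivity:.1f}%")
print(f"specificity:         {100 * specificity:.1f}%")
print(f"false positives/case: {fp_per_case:.1f}")
print(f"overall sensitivity: {100 * overall_sensitivity:.1f}%")
```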


BEYOND THE HUMAN GENOME: AUTOMATED FUNCTIONAL ANNOTATION OF VARIATION DATA

Sean Mooney and Russ Altman Purpose With the sequencing of multiple copies of the human genome, the challenge of deciphering this landmark genetic sequence becomes ever more important. I am developing computational methods for automatically annotating human genome data with information that is useful to researchers for understanding how variation (genetic differences between individuals) affects phenotypes on a molecular level. Materials and Methods This project has progressed through two stages. First, I have collected and annotated many publicly available coding disease-associated mutations with both protein structural information and a comparative sequence analysis. I have developed a useful public interface to this dataset at MutDB (http://mutdb.org/). Second, I am developing tools for the automated structural analysis of human variation data. Results Being able to quantify the structural effects of mutation data allows researchers to quickly identify the molecular effects of a mutation. We have annotated more than 5,000 phenotypically associated mutations with structural information, in over 1,000 human genes. We have also developed several tools that enable researchers to study the mutation data better. In addition to the annotations, we have developed an interface that allows for visualization of several datasets of interest. Conclusion The sheer volume of available genomic variation data makes computational methods attractive for analyzing this data. This ongoing research continues to develop new computational tools for aiding researchers in these efforts. Web Page http://mutdb.org/


THE VIRTUAL SPINE: HAPTIC EPIDURAL SIMULATOR Samuel Glassenberg and Raymond Glassenberg (Northwestern U.)

Purpose Commonly used during obstetric anesthesia, epidural placement is an especially high-risk and difficult procedure. A needle is inserted deep into the tissue of the lower spine, administering anesthetic into a small area surrounding the spinal cord. Risk of dural puncture (entry into the spinal nerve) is high. The lumbar anatomy itself is a maze of thick ligaments and intricate bones. Once the needle enters the skin, the anesthesiologist receives little visual feedback: the physician can only observe the distance into the anatomy that the needle has penetrated. Almost all feedback is tactile, beginning with determining entry location until the epidural space is entered. Anesthesiologists depend on the responsive force to determine the needle location within the anatomy and to identify when to stop insertion or reattempt if insertion has failed. Thus the procedure is not only difficult to perform, it is impossible to simulate without providing realistic haptic (tactile) feedback. Materials and Methods A virtual-reality epidural placement simulator was developed, utilizing a commercial force-feedback pen-interface device: the Phantom Desktop by Sensable Technologies. The simulator was written in C++. The GHOST SDK library was used for haptic simulation and OpenGL for real-time rendering. 3D meshes of bone, spinal cord, skin, ligaments, and muscle were created and textured to represent patient anatomy in various positions. The tactile properties of lumbar anatomy were simulated based on physician input and results of previous research. The “virtual needle” provides realistic visual response to interaction with tissues. Results The user interacts with The Virtual Spine by holding a pointing device shaped like a syringe. As the virtual needle penetrates the lumbar tissue, the user feels realistic responsive forces and visuals. Multiple instruments, anatomy sets, visual effects, and needle responses are incorporated. 
The simulation is complete from antiseptic application to the various success/failure scenarios. Surfaces exhibit realistic force-textures, viscosity, and surface friction. The simulator is a comprehensive, customizable training device with varying levels of difficulty and the ability to log user performance in detail. Conclusion This simulation allows the user to experience all the potential difficulties associated with placing an epidural, without exposing a patient to the complication of dural puncture. Web Page http://www.virtualspine.com/


POSTER SESSIONS


BCATS 2003 Symposium Proceedings

POSTER TITLES BY CATEGORY

Biomechanical

P01 Unfolding of Proteins: Thermal and Mechanical Unfolding Joe S. Hur and Eric Darve

P08 Waves and Outflow Boundary Conditions for One-Dimensional Finite Element Modeling of Blood Flow and Pressure in Arteries Irene Vignon and Charles A. Taylor

P09 Coupled MD-Finite Element Modeling of Transdermal Drug Delivery Jee Eun Rim and Peter M. Pinsky

P20 Simulation by Reaction Path Annealing: Protein Misfolding and Aggregation in a 7 residue peptide from the Yeast Prion Protein Sup35 Jan Lipfert, Joel Franklin, Fang Wu and Sebastian Doniach

P35 Image-Based Modeling of Blood Flow in Pulmonary Arteries Using a One-Dimensional Finite Element Method Coupled to a Morphometry-Based Model of the Distal Vessels Ryan L. Spilker, David Parker, Jeffrey A. Feinstein and Charles A. Taylor

Biomedical

P11 Brain Wave Recognition of Emotions in EEG Elliot Berkman, Dik Kin Wong, Marcos Perreau Guimaraes, E. Timothy Uy, James Gross and Patrick Suppes

P18 Image-Based Analytic Surface Representation and Mesh Generation Erik Jan Bekkers and Charles A. Taylor

P19 An Automatic Lung Segmentation Scheme for Computer-Aided Detection Shaohua Sun, Geoffrey D. Rubin and Sandy Napel

P24 Nanorobots As Cellular Assistants in Inflammatory Responses Aranzazu Casal, Arancha Casal, Tad Hogg and Adriano Cavalcanti

P33 Real-Time Lens Distortion Correction Using Texture Mapping Michael Bax

P36 Calcium Quantification in the Aortoiliac Arteries: Interscan Variability of Agatston Scoring vs. Automated Mass Quantification in Noncontrast and Contrast-Enhanced Scans Bhargav Raman, Raghav Raman, Mercedes Carnethon, Stephen P. Fortmann, Sandy Napel and Geoffrey D. Rubin

P37 Morphological Differences in Thoracic Aortic Aneurysms when Compared to Normal Aortas: A Comparative Study in 49 Patients Raghav Raman, Zhao Shaohung, Bhargav Raman, Sandy Napel and Geoffrey D. Rubin


P38 2D/3D Registration for Image Guidance in Interventional Radiology Joyoni Dey, Markus Kukuk and Sandy Napel

P39 Independent Component Analysis (ICA) for Removing Ballistocardiogram and Ocular Artifacts from EEG Data Acquired Inside an MRI Scanner Gaurav Srivastava, Vinod Menon, Sonia Crottaz-Herbette and Gary H. Glover

Computational Chemistry

P06 Computational Bioinorganic Chemistry: Interaction of Non-Heme Iron Enzymes with Dioxygen Andrea Decker and Edward I. Solomon

P27 Three-Dimensional Mapping of the Water Structure Around Hydrophobic Solutes Tanya M. Raschke and Michael Levitt

P31 DNA Stacking Interactions of Fluorinated Aromatic C-Deoxyribonucleosides Jacob S. Lai and Eric T. Kool

Informatics - Clinical

P07 Contextual UMLS® Indexing to Improve the Precision of Concept-Based Representation in XML-Structured Clinical Radiology Reports Yang Huang, Henry J. Lowe and William R. Hersh

P19 An Automatic Lung Segmentation Scheme for Computer-Aided Detection Shaohua Sun, Geoffrey D. Rubin and Sandy Napel

P30 A Performance-Based Multi-Classifier Approach to Atlas-Based Segmentation Torsten Rohlfing, Daniel B. Russakoff and Calvin R. Maurer, Jr.

Informatics - Genetics

P02 Functional and Computational Analysis of Suites of Coregulated Genes in Ciona savignyi David Scott Johnson and Arend Sidow

P10 Whole Genome Haplotype Alignment of Ciona savignyi Kerrin Small and Arend Sidow

P16 Comparing Housekeeping and Tissue-Specific Gene Promoters: An Analysis of Spurious Transcription Factor Binding Sites Diane Irene Schroeder and Rick Myers

P21 Analysis of Nucleotide Diversity in Exonic Splicing Enhancers (ESEs) of Human Membrane Transporter Genes Bernie Daigle, Jr., Maya K. Leabman, Kathleen M. Giacomini and Russ B. Altman

P28 Genome-Wide Discovery of Transcriptional Modules from DNA Sequence and Gene Expression Eran Segal, Roman Yelensky and Daphne Koller

P34 Conservation of Known Regulatory Elements and Their Identification using Comparative Genomics Yueyi Irene Liu, X. Shirley Liu, Liping Wei, Russ B. Altman and Serafim Batzoglou

P42 Probabilistic Discovery of Overlapping Cellular Processes and Their Regulation Alexis Battle, Eran Segal and Daphne Koller


Informatics - Biology

P03 Differential Gene Expression Among Naïve, Memory, and Effector Human T Lymphocytes Michael Asmar, Peter Lee and Susan Holmes

P05 Optimizing the Detection of Evolutionarily Constrained Regions in Proteins Jonathan Binkley, Eric A. Stone and Arend Sidow

P13 Visualizing Biological Networks through Selective Reduction and Force Direction Adam Wright

P22 Predicting HIV Drug Resistance Using Supervised Machine Learning Methods Jaideep Srinivas Ravela, Asa Ben-Hur and Robert W. Shafer

P26 Comparing Conservation and Covariance in Protein Alignments Anthony Austin Fodor and Richard W. Aldrich

P29 Walking Along Chromosomes: Genomic Mapping of Cytogenetic Aberrations from Tumor Databases Michael Baudis

P32 Large Scale Study of Protein Domain Distribution in the Context of Alternative Splicing Shuo Liu and Russ B. Altman

P40 Inferring Motifs That Mediate Protein Interactions Haidong Wang, Eran Segal, Asa Ben-Hur, Douglas Brutlag and Daphne Koller

P41 Community Based Approach To Microarray Research Brian Howard Null and QuangQiu Wang

Robotics

P15 Multi-Hand Haptic Interaction to Simulate Complex Surgical Procedures Christopher Sewell, Kenneth Salisbury, Sabine Girod, Tom Krummel and Jean-Claude Latombe

P17 Three-Dimensional Measurements and Analysis of the Isolated Malleus-Incus Complex Jae Hoon Sim, Sunil Puria and Charles R. Steele

P23 Robust Neural Decoding of Reaching Movements for Prosthetic Systems Caleb Kemere, Maneesh Sahani and Teresa Meng

P24 Nanorobots As Cellular Assistants in Inflammatory Responses Aranzazu Casal, Arancha Casal, Tad Hogg and Adriano Cavalcanti

P38 2D/3D Registration for Image Guidance in Interventional Radiology Joyoni Dey, Markus Kukuk and Sandy Napel

Structural

P01 Unfolding of Proteins: Thermal and Mechanical Unfolding Joe S. Hur and Eric Darve

P04 Understanding Beta Hairpin Folding: The Tryptophan Zippers Christopher Davis Snow and Vijay Pande


P12 Monte Carlo Simulation of Folding Processes for 2D Linkages Leo Guibas, Rachel Kolodny, Michael Levitt and Ileana Streinu

P14 Protein-Protein Interactions in Poliovirus Andres Bayani Tellez, Scott Crowder, Serkan Apaydin, Doug Brutlag and Karla Kirkegaard

P20 Simulation by Reaction Path Annealing: Protein Misfolding and Aggregation in a 7 residue peptide from the Yeast Prion Protein Sup35 Jan Lipfert, Joel Franklin, Fang Wu and Sebastian Doniach

P25 Predicting Functional Sites on Protein Structures with SeqFEATURE Mike Hsin-Ping Liang, Russ B. Altman and Doug L. Brutlag

P27 Three-Dimensional Mapping of the Water Structure Around Hydrophobic Solutes Tanya M. Raschke and Michael Levitt

P31 DNA Stacking Interactions of Fluorinated Aromatic C-Deoxyribonucleosides Jacob S. Lai and Eric T. Kool


Poster P01

UNFOLDING OF PROTEINS: THERMAL AND MECHANICAL UNFOLDING Joe S. Hur and Eric Darve

Recent theoretical and experimental findings have shown that the topology or conformation of the native structure of small proteins plays a critical role in determining their biological function. Our goal is to understand the mechanisms of protein folding-unfolding in the presence of an external force field - e.g. mechanical and thermal - and further to investigate the differences in the pathways of force-induced unfolding and thermal denaturation. We have developed a Hamiltonian model which builds on the important interactions of the native-state topology. We make a self-consistent Gaussian approximation within the model such that the contact probabilities are determined self-consistently, in a spirit similar to a local mean-field approximation. We present the results for the globular protein 2ci2 and the giant muscle protein titin (1TIT) and compare our findings to experiment and molecular dynamics simulation. Our model, especially in light of the good agreement obtained with experiment and simulation, demonstrates the basic physical elements necessary to capture the mechanism of protein unfolding in an external force field.


Poster P02

FUNCTIONAL AND COMPUTATIONAL ANALYSIS OF SUITES OF COREGULATED GENES IN CIONA SAVIGNYI

David Scott Johnson and Arend Sidow We are using computational and experimental approaches to analyze regulation of suites of coregulated genes in the solitary ascidian, Ciona savignyi. The assumption is that noncoding sequences adjacent to coregulated genes share elements that drive their common expression. Comparative analysis of these adjacent sequences should reveal the essential elements necessary for expression. One current effort focuses on muscle terminal differentiation genes. We use a reporter assay approach (Corbo et al., 1997) to test sequences upstream of the transcriptional start site for activity. Construct design is guided by analysis of M-LAGAN (Brudno et al., 2003) alignments between C. intestinalis and C. savignyi. This technique has been used to generate functional minimal promoter constructs for a variety of genes strongly expressed in the tadpole muscle. Another effort in our lab explores regulation of muscle actins in the C. savignyi embryo. Comparative sequence analyses, coupled with functional analyses, may provide insight into the regulation of both the actin family and muscle genes generally. Meanwhile, we are developing other computational methods to discover functionally important noncoding sequences. We are currently expanding this type of analysis to other tissues (and other closely coregulated gene families) in the ascidian embryo. References

1. Brudno M, Do CB, Cooper GM, Kim MF, Davydov E, Green ED, Sidow A, Batzoglou S. (2003). LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Research 13(4): 721-31.

2. Corbo, JC, M Levine, & RW Zeller. (1997). Characterization of a notochord-specific enhancer from the Brachyury promoter region of the ascidian, Ciona intestinalis. Development 124: 589-602.


Poster P03

DIFFERENTIAL GENE EXPRESSION AMONG NAÏVE, MEMORY, AND EFFECTOR HUMAN T LYMPHOCYTES

Michael Asmar, Peter Lee and Susan Holmes Introduction Human T lymphocytes, or T cells, exist in three different stages: naïve, memory, and effector. An accepted method to characterize T cells is by their cell surface receptors, which recognize other molecules and are labeled “CD” followed by a number. Purpose To investigate how T cells progress through these stages, based on patterns of differential gene expression. Materials and Methods We used microarray analysis to quantify the differential gene expression in naïve, memory, and effector T cells of ten adults. Of 12,918 genes studied, we identified 261 genes that were differentially expressed among the three cell types at the 1% significance level. Statistical analysis was done using R and the Bioconductor package. Results Most genes (80%) were either expressed highly in naïve cells and weakly in effector cells or vice versa. All of these genes were moderately expressed in memory cells. The 261 genes were divided into six groups by the order of their mean expression level within each of the three cell types: Group A: naïve > memory > effector (107 genes); Group B: effector > memory > naïve (101 genes); Group C: memory > effector > naïve (38 genes); Group D: memory > naïve > effector (9 genes); Group E: naïve > effector > memory (5 genes); Group F: effector > naïve > memory (1 gene). Most genes belonged to Groups A and B, with strong expression in naïve T cells and weak expression in effector T cells or vice versa. Box plots of gene expression levels in Groups C, D, E, and F showed that the expression level in memory T cells was very close to that of either the naïve or effector T cells, making it likely that the deviant ordering of these groups was due to statistical error. The three T cell types were paired and each pairing was checked separately. A total of 269 genes were differentially expressed at the 3% significance level between naïve and effector T cells.
At the same significance level, only 84 genes were differentially expressed between naïve and memory cells, and only 32 between memory and effector cells. Conclusion Many genes were differentially expressed among naïve, effector, and memory T cells. The expression level of these genes was intermediate in memory cells and at opposite extremes in naïve and effector cells. Thus, we hypothesize that memory cells act as a transition stage between naïve and effector cells. More than twice as many genes were differentially expressed between naïve and memory cells as between memory and effector cells. Thus, we hypothesize that memory cells are closer in function to effector cells than to naïve cells.
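The six-group scheme in the Results amounts to sorting the three mean expression levels of each gene. The sketch below shows that assignment and checks the reported group sizes against the totals quoted in the abstract. The function name and the example gene are hypothetical, "naive" is spelled in ASCII inside the code, and the original analysis was done in R with Bioconductor rather than Python.

```python
# Orderings (highest to lowest mean expression) and their group labels.
GROUPS = {
    ("naive", "memory", "effector"): "A",
    ("effector", "memory", "naive"): "B",
    ("memory", "effector", "naive"): "C",
    ("memory", "naive", "effector"): "D",
    ("naive", "effector", "memory"): "E",
    ("effector", "naive", "memory"): "F",
}

# Group sizes reported in the abstract.
GROUP_SIZES = {"A": 107, "B": 101, "C": 38, "D": 9, "E": 5, "F": 1}


def assign_group(mean_expression):
    """Map a gene's mean expression per cell type to its ordering group."""
    order = tuple(sorted(mean_expression, key=mean_expression.get, reverse=True))
    return GROUPS[order]


# Hypothetical gene with intermediate expression in memory cells.
example_gene = {"naive": 9.1, "memory": 7.0, "effector": 3.2}
group = assign_group(example_gene)

total_genes = sum(GROUP_SIZES.values())
monotonic_share = (GROUP_SIZES["A"] + GROUP_SIZES["B"]) / total_genes

print(group)                      # A
print(total_genes)                # 261
print(f"{100 * monotonic_share:.0f}%")  # share of genes in Groups A and B
```

Groups A and B are the two orderings in which memory cells sit between the other two types, which is the pattern behind the transition-stage hypothesis.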


Poster P04

UNDERSTANDING β-HAIRPIN FOLDING: THE TRYPTOPHAN ZIPPERS Christopher D. Snow and Vijay S. Pande

Purpose Protein and peptide folding events on time-scales of 1-10 microseconds are accessible to both the fastest time-resolved experiments, such as laser T-jump spectroscopy, and to advanced simulation techniques, such as distributed computing. Here, we have studied the first three tryptophan zippers (TZ1-TZ3), a series of unusually stable 12-residue hairpins. Materials and Methods We employed the TINKER 3.8 molecular modeling package with the optimized potentials for liquid simulations united-atom (OPLSua) and all-atom (OPLSaa) parameter sets. We modeled solvation with the generalized Born / surface area (GB/SA) implicit solvent model. Constant temperature stochastic dynamics modeled the viscous drag of water (frictional coefficient = 91 ps^-1). Models for TZ1 and TZ2 were taken from PDB coordinates 1LE0 and 1LE1. To obtain initial unfolded conformations, fully extended conformers were generated using TINKER. Prior to distributed simulation, each model was equilibrated with 5-100 ps of molecular dynamics. Results Through distributed computing we obtained an aggregate simulation time of 22 milliseconds. The simulations included 150, 212, and 48 room temperature folding events for TZ1, TZ2, and TZ3 respectively. Conclusions The OPLSaa potential set predicted TZ1 and TZ2 properties well: the estimated folding rates agree with experimentally determined folding rates and native conformations were the global potential energy minimum. The simulations also predicted reasonable unfolding activation enthalpies. However, for TZ1-TZ3, OPLSua models had a non-native free energy minimum, and the folding rate for OPLSaa TZ3 was sensitive to the initial conformation. Finally, we characterized the transition state; all three tryptophan zippers fold via similar, native-like, transition state conformations.

Poster P05

OPTIMIZING THE DETECTION OF EVOLUTIONARILY CONSTRAINED REGIONS IN PROTEINS

Jonathan Binkley, Eric A. Stone and Arend Sidow

Evolution-Structure-Function (ESF) analysis is used to infer and rank the functional importance of regions in a protein by quantifying the evolutionary constraint on those regions (Simon, A. L., Stone, E. A., and Sidow, A., PNAS 99, 2912-2917; 2002). Beginning with a multiple sequence alignment and a phylogenetic tree, ESF calculates maximum likelihood evolutionary rates in scanning windows across the alignment, then defines and ranks constrained regions based on local minima in a plot of rate versus position. We show that ESF rates are very similar for independent alignments of orthologs, and optimize the power of ESF analysis by varying parameters to maximize the similarity. We demonstrate that the constrained regions identified by ESF correspond to functionally important regions identified by independent genetic methods, and optimize the resolution of ESF analysis by varying parameters to maximize the correspondence. We provide evidence for regional (as opposed to strict positional) evolutionary constraint, which is identified by scanning-window analyses. ESF analysis has been used to identify novel functional regions in proteins, to distinguish differences and similarities among paralogs, and to guide structure-function studies.
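The scanning-window step can be sketched as follows. This is a simplification, not the ESF implementation: ESF estimates a maximum likelihood rate per window from the alignment and tree, whereas the sketch assumes per-column rates are already available and simply averages them. The rate values and function names are hypothetical.

```python
def window_means(rates, width):
    """Mean rate in each window of `width` consecutive alignment columns."""
    return [sum(rates[i:i + width]) / width
            for i in range(len(rates) - width + 1)]

def constrained_regions(rates, width):
    """Return (window_start, mean_rate) for strict local minima of the
    windowed rate profile, ranked from most to least constrained."""
    prof = window_means(rates, width)
    minima = [(i, prof[i]) for i in range(1, len(prof) - 1)
              if prof[i] < prof[i - 1] and prof[i] < prof[i + 1]]
    return sorted(minima, key=lambda m: m[1])

# Hypothetical per-column rates in arbitrary units, not data from the study.
rates = [1.0, 0.9, 0.2, 0.1, 0.2, 0.9, 1.0, 0.8, 0.5, 0.4, 0.6, 1.0]
for start, rate in constrained_regions(rates, width=3):
    print(f"window starting at column {start}: mean rate {rate:.2f}")
```

The window width plays the role of the tunable parameter mentioned in the abstract: wider windows smooth the profile and merge neighbouring minima, narrower ones resolve smaller regions at the cost of noisier rates.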

Poster P06

COMPUTATIONAL BIOINORGANIC CHEMISTRY: INTERACTION OF NON-HEME IRON ENZYMES WITH DIOXYGEN

Andrea Decker and Edward I. Solomon

Purpose
Metalloenzymes are pervasive in nature and involved in many key metabolic steps. Mutations causing malfunction of these enzymes can lead to severe diseases. The geometric and electronic structure of the active site and the reaction intermediates need to be defined to understand the basis of the enzyme activity and the steps in the catalytic mechanism on a molecular level. Mononuclear non-heme iron enzymes catalyze a wide variety of essential biological reactions requiring the binding and activation of dioxygen;[1] for example, 2,3-dihydroxybiphenyl 1,2-dioxygenase (DHBD) utilizes dioxygen in the biodegradation of aromatic compounds. We want to define the geometric and electronic structure of the enzyme active site and understand its dioxygen reactivity.[2]

Materials and Methods
Experimentally calibrated quantum chemical calculations, using Density Functional Theory (DFT), can provide insight into enzyme active site geometries, electronic structure and bonding, thermodynamic parameters, and enzyme reactivity. Models of the enzyme active site are constructed, regarding the enzyme as an inorganic transition metal complex with the protein environment as a “unique ligand”.

Results
The density functional calculations, correlated to spectroscopic and crystallographic data, showed that both the resting and substrate-bound forms of DHBD have a five-coordinate, square pyramidal, ferrous active site. The resting site does not react with dioxygen, because the Fe-O bond formation does not provide enough stabilization energy to overcome the high redox potential of the one-electron reduction of O2 and the unfavorable entropy term. On the other hand, the binding of substrate allows a two-electron process, and the reaction with dioxygen is very fast.

Conclusion
The electronic structure calculations combined with experimental data provide insight into the geometric and electronic structure as well as the reactivity of DHBD, and a deeper understanding of the catalytic mechanism of this class of non-heme iron enzymes.

References

1. Solomon, E.I. et al. Geometric and Electronic Structure/Function Correlations in Non-Heme Iron Enzymes. Chem. Rev. 2000, 100, 235-349.

2. Davis, M. et al. Spectroscopic and Electronic Structure Studies of 2,3-Dihydroxybiphenyl 1,2-Dioxygenase: O2 Reactivity of the Non-Heme Ferrous Site in Extradiol Dioxygenases. J. Am. Chem. Soc. 2003, 125, 10810-10821.

Poster P07

CONTEXTUAL UMLS® INDEXING TO IMPROVE THE PRECISION OF CONCEPT-BASED REPRESENTATION IN XML-STRUCTURED CLINICAL REPORTS

Yang Huang, Henry J. Lowe and William R. Hersh

Purpose
Despite the advantages of structured data entry, much of the patient record is still stored as unstructured or semi-structured narrative text. The issue of representing clinical document content remains problematic. Our prior work using an automated UMLS® document indexing system has been encouraging but has been limited by the generally low indexing precision of such systems. In an effort to improve precision, we have developed a context-sensitive document indexing model to calculate the optimal subset of UMLS® source vocabularies used to index each document section. This pilot study was performed to evaluate the utility of this indexing approach on a set of clinical radiology reports.

Materials and Methods
A set of 50 clinical radiology reports of six modalities was manually indexed using UMLS® concept descriptors, which served as the gold standard. These reports were then automatically indexed by the Saphire[1] indexing engine. Using the data generated by this process, we developed a system that simulated indexing, at the document section level, of this same document set using many permutations of a subset of the UMLS® constituent vocabularies. The sections of each radiology report were marked up using XML. The concept descriptors from the above indexing process were assigned to the corresponding sections. The precision and recall scores generated by simulated indexing for each permutation of two or three UMLS® constituent vocabularies were calculated and compared to the results of Saphire indexing using all UMLS® vocabularies.

Results
While there was considerable variation in precision and recall values across the different subtypes of radiology reports, the overall effect of this indexing strategy using the best combination of two or three UMLS® constituent vocabularies was an improvement in precision without significant impact on recall.[2] The Procedure sections had the largest improvement in precision with no negative impact on recall. The precision of the Finding section was the hardest to improve by this approach alone. MeSH®2000, SNOMED® International 98 and RCD®99 were the three vocabularies that appeared most often in the optimal vocabulary sets.

References

1. Hersh WR, Hickam D. Information retrieval in medicine: the SAPHIRE experience. Medinfo. 1995;8 Pt 2:1433-7

2. Huang Y, Lowe HJ and Hersh WR. A Pilot Study of Contextual UMLS® Indexing to Improve the Precision of Concept-Based Representation in XML-Structured Clinical Radiology Reports. J Am Med Inform Assoc. 2003 Aug 4 [Epub ahead of print]
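The simulated-indexing idea, scoring every two- or three-vocabulary subset against the gold standard for one document section, can be sketched as below. The vocabulary names, concept identifiers, and the selection rule (highest precision, ties broken by recall) are assumptions made for illustration; the abstract does not state the exact criterion the study used.

```python
from itertools import combinations

def precision_recall(proposed, gold):
    """Precision and recall of a proposed concept set against the gold set."""
    true_pos = len(proposed & gold)
    precision = true_pos / len(proposed) if proposed else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall

def best_vocabulary_subset(concepts_by_vocab, gold, sizes=(2, 3)):
    """Score every subset of the given sizes; keep the one with the highest
    precision, breaking ties by recall (an assumed criterion)."""
    best = None
    for k in sizes:
        for subset in combinations(sorted(concepts_by_vocab), k):
            proposed = set().union(*(concepts_by_vocab[v] for v in subset))
            p, r = precision_recall(proposed, gold)
            if best is None or (p, r) > best[1:]:
                best = (subset, p, r)
    return best

# Hypothetical concepts that an indexer assigned to one section, per vocabulary.
concepts_by_vocab = {
    "VocabA": {"c1", "c2", "c9"},
    "VocabB": {"c2", "c3"},
    "VocabC": {"c7", "c8"},
    "VocabD": {"c1", "c3"},
}
gold = {"c1", "c2", "c3"}  # hypothetical manual index for the same section
print(best_vocabulary_subset(concepts_by_vocab, gold))
```

Running the search separately for each section type is what makes the indexing contextual: a subset that works well for one section need not be the best for another.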

Poster P08

WAVES AND OUTFLOW BOUNDARY CONDITIONS FOR ONE-DIMENSIONAL FINITE ELEMENT MODELING OF BLOOD FLOW AND PRESSURE IN ARTERIES

Irene Vignon and Charles A. Taylor

Purpose
Flow and pressure waves, generated by the contraction of the heart, propagate along the deformable vessels and are reflected by tapering, branching, and other discontinuities. The size and complexity of the cardiovascular system necessitate a “multiscale” approach, with “upstream” regions of interest (large arteries) coupled to reduced-order models of “downstream” vessels. Previous efforts to couple upstream and downstream domains have included specifying resistance and impedance outflow boundary conditions for the nonlinear one-dimensional wave propagation equations. We have developed a new approach, the “coupled multidomain method”, to solve the one-dimensional nonlinear equations of blood flow in elastic vessels. It uses a space-time finite element method with Galerkin least-squares (GLS) stabilization for the upstream domain and a boundary term that incorporates the information from the downstream domain.

Materials and Methods
In the downstream domain, we solve simplified 0D/1D equations to derive relationships between pressure and flow, accommodating periodic and transient phenomena with a consistent formulation for different boundary condition types. We also present a new boundary condition that accommodates transient phenomena, based on a Green’s function solution of the linear, damped wave equation in the downstream domain.

Results
The “coupled multidomain method” was implemented for different boundary conditions, verified, and validated. Solutions obtained with different choices of downstream models were compared with solutions for complete (combined upstream and downstream) models and assessed by comparing the simulated pressure and flow waves with a known physiologic response. Results for transient and periodic cases, as well as patient-specific studies, will be presented.

Conclusion
We demonstrated the importance of selecting appropriate boundary conditions when modeling blood flow velocity and pressure wave phenomena. In particular, the coupling with a downstream 1D wave model for a single tube was found to be useful in examining transient behavior, and to be a very promising boundary condition in terms of wave transmission. However, the generalization to vascular networks requires further analysis. Furthermore, the best boundary condition for cardiovascular applications is not necessarily the one that exhibits no wave reflection, since wave reflections naturally arising from downstream beds should propagate back upstream into the numerical domain. Impedance-based boundary conditions are most appropriate for incorporating these natural sites of wave reflection in the analytic domain. Eventually, wave phenomena could be better captured by coupling 3D to 1D models of complex vascular networks downstream.
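To make the distinction between resistance and impedance outflow conditions concrete, the sketch below computes a periodic outlet pressure from one cycle of flow using P_k = Z(ω_k) Q_k for each Fourier harmonic. A three-element Windkessel impedance stands in for the downstream model; that choice, the parameter values, and the flow waveform are assumptions for illustration and are not the downstream models derived in this work. Setting the compliance to zero recovers a pure resistance condition, P = R Q.

```python
import cmath
import math

def windkessel_impedance(omega, r_prox, r_dist, compliance):
    """Three-element Windkessel: Z = Rp + Rd / (1 + i*omega*Rd*C)."""
    return r_prox + r_dist / (1 + 1j * omega * r_dist * compliance)

def outlet_pressure(flow, period, r_prox, r_dist, compliance):
    """Periodic outlet pressure from one cycle of equally spaced flow samples."""
    n = len(flow)
    pressure = [0.0] * n
    for k in range(n):
        # Fourier coefficient of the flow at harmonic k (naive DFT).
        q_k = sum(flow[j] * cmath.exp(-2j * math.pi * k * j / n)
                  for j in range(n)) / n
        # Use signed harmonics so conjugate pairs combine to a real signal.
        harmonic = k if k <= n // 2 else k - n
        omega = 2 * math.pi * harmonic / period
        p_k = windkessel_impedance(omega, r_prox, r_dist, compliance) * q_k
        for j in range(n):
            pressure[j] += (p_k * cmath.exp(2j * math.pi * k * j / n)).real
    return pressure

# Hypothetical half-sine flow pulse and parameters in arbitrary consistent units.
period, n = 1.0, 16
flow = [max(0.0, math.sin(2 * math.pi * j / n)) for j in range(n)]
pressure = outlet_pressure(flow, period, r_prox=0.1, r_dist=1.0, compliance=1.0)
print([round(p, 3) for p in pressure])
```

With nonzero compliance the pressure lags and smooths the flow pulse, which is the frequency-dependent behaviour a single resistance cannot represent.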

Poster P09

COUPLED MD-FINITE ELEMENT MODELING OF TRANSDERMAL DRUG DELIVERY

Jee E. Rim and Peter M. Pinsky

Purpose
Compared to the more traditional methods of oral delivery or injection, transdermal drug delivery has the advantages of eliminating first-pass drug metabolism, providing a controlled drug release rate, and improving patient compliance. However, the major difficulty in the wide application of transdermal drug delivery is the low percutaneous permeability of most compounds, which necessitates the addition of permeation enhancers to the system. The multicomponent formulation makes the development and optimization of effective drug delivery systems difficult and time consuming. The purpose of this work is to combine molecular dynamics (MD) calculations of epidermal diffusivity data with finite element (FE) modeling of the diffusion process, resulting in a complete model of the drug delivery system.

Materials and Methods
We study the diffusion of two components from a dermal patch applied on the skin through the outer skin layer. The skin is modeled as fluid-phase lipid multi-bilayers. The diffusivities of model compounds in a lipid bilayer are calculated using the MD program NAMD (available from the University of Illinois, Urbana-Champaign). The resulting diffusivities are then used in the FE solution of the diffusion process, modeled by multicomponent Fick’s equations in a two-layer heterogeneous domain consisting of the dermal patch and the epidermis.

Results
We present the diffusivity data from the MD calculations. The concentration profiles of the model compounds throughout the diffusion domain from the FE simulation are presented as well. The flux of the model drug compound through the dermal patch and the epidermis is calculated as a function of time.

Conclusion
By combining MD and FE calculations, we can effectively model a nonlinear multicomponent diffusion process through a heterogeneous domain, an approach that can be applied to transdermal drug delivery systems.
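The way per-layer diffusivities feed a continuum diffusion solve can be illustrated with a deliberately reduced model. The sketch swaps the finite element method for an explicit finite-volume update, treats a single component with constant diffusivity in each layer, and ignores partitioning at the patch/skin interface; all of these are simplifications, and every numerical value is hypothetical rather than taken from the MD results.

```python
def diffuse_two_layer(conc, diffusivity, dx, dt, steps):
    """Explicit update of dc/dt = d/dx(D dc/dx) on a 1-D grid of cells with
    zero-flux outer boundaries. `conc` and `diffusivity` are per-cell lists."""
    n = len(conc)
    c = list(conc)
    if dt > dx * dx / (2 * max(diffusivity)):
        raise ValueError("time step too large for explicit stability")
    # Harmonic-mean diffusivity on each face between neighbouring cells.
    d_face = [2 * diffusivity[i] * diffusivity[i + 1]
              / (diffusivity[i] + diffusivity[i + 1]) for i in range(n - 1)]
    for _ in range(steps):
        # Fickian flux across each interior face, positive to the right.
        flux = [-d_face[i] * (c[i + 1] - c[i]) / dx for i in range(n - 1)]
        for i in range(n):
            inflow = flux[i - 1] if i > 0 else 0.0
            outflow = flux[i] if i < n - 1 else 0.0
            c[i] += dt * (inflow - outflow) / dx
    return c

# Hypothetical set-up: 10 patch cells loaded with drug, 10 skin cells empty,
# with the skin ten times less diffusive than the patch.
conc0 = [1.0] * 10 + [0.0] * 10
diff = [1.0] * 10 + [0.1] * 10
profile = diffuse_two_layer(conc0, diff, dx=1.0, dt=0.25, steps=200)
print([round(v, 3) for v in profile])
```

Because the update is written in flux form, the total amount of drug in the closed domain is conserved, which is a convenient sanity check before moving to a multicomponent or finite element formulation.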

Poster P10

WHOLE GENOME HAPLOTYPE ALIGNMENT OF CIONA SAVIGNYI

Kerrin Small and Arend Sidow

The genome of the urochordate sea squirt Ciona savignyi has been sequenced by a whole genome shotgun strategy. Unlike most other sequenced organisms, the sequenced individual was a wild diploid, not an inbred laboratory strain. The high degree of polymorphism between the individual’s haplotypes resulted in a fractured assembly, in which allelic sequences from the same genomic regions have assembled into separate contigs. This project seeks to construct a whole genome haplotype alignment of the C. savignyi assembly. The motivation for such an alignment is twofold: to improve the assembly and to delineate the haplotypes. Long-range contiguity in the assembly will be improved by piecing together supercontigs based upon the tiled path in alignments. Delineating the haplotypes will allow for characterization of the polymorphism present in C. savignyi, providing information on point substitutions, insertions, deletions, and inversions across the entire genome. A more robust assembly with separated haplotypes will also greatly aid the gene annotation efforts currently underway. Ultimately, a consensus C. savignyi sequence can be built from the whole genome haplotype alignment.

Poster P11

BRAIN WAVE RECOGNITION OF EMOTIONS IN EEG

Elliot Berkman, Dik Kin Wong, Marcos Perreau Guimaraes, E. Timothy Uy, James Gross and Patrick Suppes

Purpose
Psychologists have identified brain hemispheric asymmetry during the experience of two types of emotions. Approach emotions are associated with adaptive stimuli such as food or attractive members of the opposite sex, and are characterized by left hemisphere activation in the alpha band (8-13 Hz). Withdrawal emotions are elicited by aversive stimuli and are marked by right hemisphere activation, also in the alpha band. Can these classes of emotion be identified in the brain using single-trial classification? Two hypotheses were tested: (1) across trials, asymmetric prefrontal activations should distinguish between approach- and withdrawal-related stimuli; (2) for each trial, prototype averaging and neural network algorithms will accurately classify stimuli among three valences: approach, neutral, and withdrawal.

Materials and Methods
Electroencephalographic (EEG) brain-wave data were collected from 24 male subjects while they viewed emotional images and neutral control images. Data were then classified (1) by aggregate log-transformed alpha band power and (2) on a single-trial basis by optimized filtering and neural network training.

Results
As predicted, alpha-band activation was found to be greater in the right but not left frontal lobe for withdrawal images and greater in the left but not right temporal lobe for approach images. Also as predicted, individual trial data were classified into the correct categories significantly better than chance using an optimal filter and least-squares classification algorithm. With a guessing probability of 1/3, 462 of 1,170 individual trials (39%) were classified correctly with p < 10^-7. Furthermore, this classification rate was improved to 46% by the use of a single-layer neural network.

Conclusion
Approach-withdrawal emotions are not only differentiable using gross averaging techniques; they can also be classified on a trial-by-trial basis using various sorting algorithms. These results represent a first step toward a sharp classification of emotions using EEG data. Future project designs and ideas for further improvement on the classification are discussed.
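Prototype averaging with a least-squares decision rule can be sketched in a few lines: average the training trials of each class into a prototype, then assign a new trial to the class whose prototype is nearest in squared distance. The two-number feature vectors below (imagined as log alpha power over the left and right hemispheres) and all values are hypothetical, and the optimized filtering step described in the abstract is not reproduced.

```python
def prototypes(trials, labels):
    """Average the feature vectors of the training trials in each class."""
    sums, counts = {}, {}
    for x, y in zip(trials, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def classify(x, protos):
    """Pick the class whose prototype has the smallest squared distance to x."""
    return min(protos,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, protos[y])))

# Hypothetical training trials: (left log alpha power, right log alpha power).
train = [[1.0, 0.2], [1.2, 0.0], [0.5, 0.5], [0.6, 0.4], [0.1, 1.0], [0.0, 1.2]]
labels = ["approach", "approach", "neutral", "neutral", "withdrawal", "withdrawal"]
protos = prototypes(train, labels)
print(classify([0.9, 0.1], protos), classify([0.2, 0.9], protos))
```

With three classes, a classifier that guesses uniformly is right one time in three, which is the baseline against which the reported single-trial rates are compared.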

Poster P12

MONTE CARLO SIMULATION OF FOLDING PROCESSES FOR 2D LINKAGES MODELING PROTEINS WITH OFF-GRID HP-CHAINS

Leo Guibas, Rachel Kolodny, Michael Levitt and Ileana Streinu

Purpose
We explore the process of protein folding using simple exact models of protein structure (two-dimensional HP chains). The motion itself is modeled using pseudo-triangulation (PT) mechanisms. We hope that a more sophisticated mathematical model will enable folding of larger protein models, offering new insights into the folding process.

Materials and Methods
We model proteins using two-dimensional HP chains; the energy model is the one suggested by Dill [1]. We use an exact, simple off-grid model of protein chains: it is two-dimensional and has two types of residues, yet the atoms are not constrained to a grid. We model motion using a Monte Carlo simulation. Every step is the motion implied by a random PT one-degree-of-freedom (1-DOF) mechanism. Pseudo-triangulations are minimally rigid graphs; when a hull edge is removed, the graph becomes a 1-DOF mechanism. Every chain can be pseudo-triangulated by adding enough edges. There are many ways to add these edges, and selecting one at random is an interesting open problem. The motion implied by this mechanism can be calculated by solving O(n) quadratic equations (using Newton-Raphson). Developing algorithms to solve this set of equations in a robust manner is open for further investigation. The system can be validated by considering the energy as a function of time, considering the radius of gyration as a function of time, and looking for secondary structure formation in the chains.

Results
We implemented a PT linkage package (based on the PT workbench by L. Kettner). This allows us to run “folding” simulations. The implementation raised many interesting geometrical problems regarding pseudo-triangulations. Some were solved: for example, finding the rigid components in a PT mechanism allows a faster implementation. We can find all rigid components in a PT mechanism in linear time (instead of the O(n^4) previously known for finding rigid components in a general graph). Others are still open and are described in our poster.

Conclusion
Exploring the folding process of chains using pseudo-triangulations offers a means of investigating protein folding as well as the mathematical objects of pseudo-triangulations.

References

1. Dill KA, Bromberg S, Yue K, Fiebig KM, Yee DP, Thomas PD, Chan HS, “Principles of protein folding--a perspective from simple exact models”. Protein Science, 1995 Apr;4(4):561-602.
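Two of the validation quantities mentioned above, the HP energy and the radius of gyration, are simple to compute for a given conformation. In the sketch below each pair of H residues that are not chain neighbours contributes -1 when they are in contact, following the usual HP convention; because the chain is off-grid, "contact" is defined by a distance cutoff, and that cutoff value is our assumption rather than a parameter given in the abstract.

```python
import math

def hp_energy(coords, sequence, contact_dist=1.5):
    """-1 for every pair of H residues that are not chain neighbours and lie
    within `contact_dist` of each other (cutoff is an assumed parameter)."""
    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if (sequence[i] == "H" and sequence[j] == "H"
                    and math.dist(coords[i], coords[j]) <= contact_dist):
                energy -= 1
    return energy

def radius_of_gyration(coords):
    """Root-mean-square distance of the residues from their centroid."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    return math.sqrt(sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in coords) / n)

# Hypothetical four-residue chain in an extended and a compact conformation.
sequence = "HPPH"
extended = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
compact = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for name, conf in (("extended", extended), ("compact", compact)):
    print(name, hp_energy(conf, sequence), round(radius_of_gyration(conf), 3))
```

Tracking both numbers along a Monte Carlo run gives the energy-versus-time and compaction-versus-time curves the abstract proposes for validation.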

Poster P13

VISUALIZING BIOLOGICAL NETWORKS THROUGH SELECTIVE REDUCTION AND FORCE DIRECTION

Adam Wright

Purpose
Science continues to make rapid progress in sequencing genomes. These sequences have the potential to provide biologists with a rich source of information about living things. Sequencing, however, is only the first step. A genetic sequence contains a series of “open reading frames” (ORFs): series of nucleotides that potentially code for a single protein. Together, these proteins determine the biological characteristics (phenotypes) of a life form. Going from a sequence to a meaningful understanding of how genes affect phenotypes is difficult and complex, and biologists often rely on the web of interactions between ORFs to gain insight. However, genomes often contain thousands of ORFs, and each ORF may interact with several hundred others, resulting in a huge and nuanced network of interactions. Understanding this much data in raw form is infeasible, so biologists need ways to simplify and visualize it.

Materials and Methods
In this research, I develop a set of tools based on graph theory that allows biologists to simplify and visualize biological networks. The toolkit is made up of five modules:

• NodeStats: NodeStats computes statistics about the connectedness of biological networks, which is useful for identifying highly connected ORFs that form “hubs” in genetic networks.

• Clusters: The clusters tool uses a probabilistic model to extract a “big picture” of a genomic network.

• Prune: Prune isolates an ORF of interest, along with closely connected ORFs, allowing the user to “drill down” and focus their efforts by eliminating unrelated ORFs.

• Dot Interface: Dot is a best-in-class tool for laying out networks, originally developed to map phone networks. I created an interface to view biological networks using Dot.

• GraphLayout: This tool allows users to interactively manipulate, arrange and view biological networks by clicking and dragging. It’s a particularly effective way to see how networks of proteins interact.
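The kind of computation behind the NodeStats and Prune modules can be sketched with an adjacency-set graph: node degree identifies hubs, and a breadth-first search extracts the neighbourhood of an ORF of interest. This is an illustration of the graph-theoretic ideas only, not the author's code; the node names, the `radius` parameter, and the function names are hypothetical.

```python
from collections import deque

def hubs(adj, top=3):
    """Nodes ranked by number of interaction partners, highest first."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:top]

def prune(adj, centre, radius):
    """Sub-network of the nodes within `radius` edges of `centre`."""
    dist = {centre: 0}
    queue = deque([centre])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested radius
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    keep = set(dist)
    return {n: adj[n] & keep for n in keep}

# Hypothetical undirected interaction network, stored symmetrically.
adj = {
    "A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"},
    "D": {"A", "E"}, "E": {"D", "F"}, "F": {"E"},
}
print(hubs(adj, top=1), prune(adj, "E", radius=1))
```

Pruning before layout keeps the drawn network small enough to read, which matters when an ORF of interest sits in a network of thousands of nodes.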

Results
To test my work, I asked a colleague to attempt to predict the function of a protein (YNL102W) from the Saccharomyces cerevisiae (baker’s yeast) genome using this toolkit. Using the GraphLayout, Prune and Clusters modules, she correctly determined that the protein was involved in the initiation of DNA replication.

Conclusion
The software developed in this research, and graph theory more broadly, has the potential to assist biologists as they attempt to understand the illuminating yet unwieldy data sets that the genomic revolution has afforded them.

Web Page
Paper: http://www.stanford.edu/~adamatw/graphs/bionets.pdf
Software: http://www.stanford.edu/~adamatw/graphs/

Poster P14

PROTEIN-PROTEIN INTERACTIONS IN POLIOVIRUS

Andres Bayani Tellez, Scott Crowder, Serkan Apaydin, Doug Brutlag and Karla Kirkegaard

Poliovirus transcription occurs via an RNA-dependent RNA polymerase (called 3D) and a peptide primer (called 3B). In the presence of a poly-A template strand and UTP, 3B is uridylylated by the polymerase 3D. The uridylylated peptide then primes transcription of the polyadenylated RNA, producing 3B-linked poly-U. The structure of 3B and the mechanism of the peptide-polymerase interaction are unknown; this work uses computational methods to determine putative folds for 3B and then docks these folds onto 3D. Specific predictions about residue-residue contacts in the ligand-enzyme complex will be tested with a gel shift binding assay. The gel shift will involve peptides designed to produce a loss of binding and complementary point mutations on the polymerase that will restore binding. Each peptide will be scored for how the dissociation constant of binding changes with particular mutations, as determined by the shift point in the gel. Using a combination of computation supported and verified by traditional bench techniques, we hope to move toward a model of primer-polymerase interactions in positive-stranded RNA viruses.

Poster P15

MULTI-HAND HAPTIC INTERACTION TO SIMULATE COMPLEX SURGICAL PROCEDURES

Christopher Sewell, Kenneth Salisbury, Sabine Girod, Tom Krummel and Jean-Claude Latombe

Purpose
The goal of this project is to develop a realistic computer simulation of a surgical environment that can be used to better train medical students before they work on live humans. The focus of our research, in contrast to the sole emphasis of most previous work on physical modeling, is to understand and model the logical flow of specific events that can occur during an operation, so that the student learns the optimal sequence of actions to perform given various scenarios. A primary objective has been to develop scenario authoring tools to allow for the virtualization of the curriculum of specific surgical procedures.

Materials and Methods
A prototype of such a discrete event engine has been implemented. A surgeon can design an input script for a procedure by defining a set of states, each with associated actions, and transitions among states, each triggered by detection of a specific event. The program defines a library of action and detection functions, which are used in the script, along with parameters for specifics such as length, force, or location. The framework for the software development is the Simulation & Active Interface platform for graphics and haptics developed at Stanford, including a proxy algorithm, collision detection methods, and stereo 3-D graphic rendering. In addition, we have developed support for multi-finger haptic gripper interaction.

Results
A demo has been created using simple texture-mapped geometric primitives that interact with the discrete event engine, driven by user input from a one-finger desktop Phantom haptic device. In the demo, the surgeon cuts through the skin and muscle layers of a thorax, ties an artery, excises a tumor, and sutures the incision back together. He/she must select the proper tool for each step. If a cut is made in the wrong place or with excessive force, bleeding starts, and the location must be sutured within a given time frame or the patient dies. The amplitude of a simulated heart monitor signal and the associated volume of the beeping help indicate the state of the patient.

Conclusion
Significant progress has been made towards achieving the aims of our proposed research. We are continuing work on improved graphic and haptic models of tissues and organs, enhanced coordination among multiple haptic devices and multiple users, and more realistic and complex surgical procedures.
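A script of states, actions, and event-triggered transitions maps naturally onto a small state machine. The sketch below is a toy version of that idea, loosely following the steps of the demo; the state names, event names, and the `run` function are hypothetical and do not reflect the actual script format or the library of action and detection functions.

```python
# Each state names an action and the events that move the scenario onward.
SCRIPT = {
    "incise": {"action": "cut skin and muscle",
               "on": {"cut_ok": "ligate", "cut_wrong": "bleeding"}},
    "bleeding": {"action": "suture the bleed",
                 "on": {"sutured": "incise", "timeout": "patient_lost"}},
    "ligate": {"action": "tie the artery", "on": {"artery_tied": "excise"}},
    "excise": {"action": "excise the tumor", "on": {"tumor_removed": "close"}},
    "close": {"action": "suture the incision", "on": {"incision_closed": "done"}},
    "done": {"action": None, "on": {}},
    "patient_lost": {"action": None, "on": {}},
}

def run(script, start, events):
    """Feed detected events to the script and return the states visited."""
    state = start
    visited = [state]
    for event in events:
        nxt = script[state]["on"].get(event)
        if nxt is None:
            continue  # this event triggers no transition in the current state
        state = nxt
        visited.append(state)
    return visited

detected = ["cut_wrong", "sutured", "cut_ok", "artery_tied",
            "tumor_removed", "incision_closed"]
print(run(SCRIPT, "incise", detected))
```

Keeping the procedure in data rather than code is what lets a surgeon author new scenarios without touching the graphics or haptics layers.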

Poster P16

COMPARING HOUSEKEEPING AND TISSUE-SPECIFIC GENE PROMOTERS: AN ANALYSIS OF SPURIOUS TRANSCRIPTION FACTOR BINDING SITES

Diane I. Schroeder and Rick Myers

Purpose
We tested the hypothesis that promoters of ubiquitously expressed genes, which have a continually open chromatin structure, would have a paucity of spurious transcription factor binding sites. Evolution would select against such binding because of steric hindrance with transcription factors that are actively regulating the gene's expression. Tissue-specific promoters, however, would be under less pressure to be rid of spurious sites because the corresponding transcription factors may not even be present in the same tissues at the same time.

Materials and Methods
Human tissue-specific gene promoters and ubiquitously expressed (housekeeping) gene promoters were obtained from the Affymetrix Gene Expression Atlas data set. A set of tissue-specific transcription factor binding site matrices from Transfac was run across all of these promoters to look for statistically significant differences in occurrence. The values were also compared to binding site occurrences in shuffled promoters.

Results
As predicted, tissue-specific gene promoters had about the same number of transcription factor binding sites as would occur by chance. Surprisingly, however, housekeeping gene promoters also had about as many binding sites as would occur by chance. Finally, as predicted, tissue-specific promoters had more binding sites than housekeeping promoters, but further analysis showed that most of the difference was explained by GC content alone.

Conclusion
The results were surprising and demonstrate that the process of transcriptional regulation is probably more sophisticated than the simple model outlined above. For example, it is possible that the regulatory machinery for housekeeping genes is either impervious to any potential steric hindrance from spurious transcription factor binding, or that there are mechanisms preventing such transcription factors from binding.
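The comparison of observed binding sites with those in shuffled promoters can be sketched as follows: slide a position weight matrix along the sequence, count windows that score above a threshold, and repeat on shuffled copies of the same sequence. The toy matrix, threshold, sequence, and shuffle count are hypothetical and stand in for the Transfac matrices and promoter sets used in the study.

```python
import random

def count_hits(seq, pwm, threshold):
    """Number of windows whose summed per-position score reaches the threshold."""
    width = len(pwm)
    hits = 0
    for i in range(len(seq) - width + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(width))
        if score >= threshold:
            hits += 1
    return hits

def shuffled_expectation(seq, pwm, threshold, n_shuffles=200, seed=0):
    """Mean hit count over shuffled copies of the sequence."""
    rng = random.Random(seed)
    letters = list(seq)
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps base composition, destroys order
        total += count_hits("".join(letters), pwm, threshold)
    return total / n_shuffles

def toy_column(consensus):
    """Score +1 for the consensus base and -1 for any other base."""
    return {base: (1 if base == consensus else -1) for base in "ACGT"}

pwm = [toy_column(b) for b in "TATA"]  # hypothetical 4-column matrix
promoter = "GCGTATAAGCTATAGC"          # hypothetical sequence
observed = count_hits(promoter, pwm, threshold=4)
print(observed, shuffled_expectation(promoter, pwm, threshold=4))
```

Because a plain shuffle preserves base composition, the shuffled baseline already accounts for GC content, which is relevant to the finding that GC content explains most of the difference between the two promoter classes.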

Poster P17

THREE-DIMENSIONAL MEASUREMENTS AND ANALYSIS OF THE ISOLATED MALLEUS-INCUS COMPLEX

Jae Hoon Sim, Sunil Puria and Charles R. Steele

Purpose
It is now known that the malleus and incus motion changes with frequency in all three dimensions. To describe the complicated motion of the malleus and incus, the slippage at the joint should be considered. However, in most mathematical models of the human middle ear, slippage at the joint is not allowed. Such models cannot explain the 3D motion above about 2 kHz.

Materials and Methods
We make detailed 3D measurements of the isolated malleus-incus complex (MIC), where the eardrum and the stapes footplate have been removed. Removing the eardrum decouples the complicated motions of the eardrum from the MIC, while removing the stapes footplate and inner ear gives greater access to the malleus and incus. Without an eardrum, the isolated MIC is directly driven by a tiny magnet attached to the umbo and a coil around the tympanic annulus. Velocities of several points on the isolated MIC, which is attached to two stacked goniometers, are measured from several different angles. The 3D velocity components of each point, and the translation and rotation of the malleus and the incus, are calculated. Measurements are made while driving the umbo in the forward direction and the incus in the reverse direction. Measurements are made with the output freely moving, or while it is immobilized. The measurements are used to estimate parameters of an anatomically based structural model of the isolated MIC. A mathematical model of the MIC, which allows slippage at the incudo-malleolar joint, is introduced. Ligaments are modeled as pre-stretched linear springs and dashpots. Typical ligament positions and centers of mass of the malleus and incus are specified. The input force and model parameters are obtained by least-squares error methods.

Results
The 3D motion of the MIC is observed with slippage at the incudo-malleolar joint. The model is consistent with the measurements for frequencies below about 2 kHz. For modeling higher frequencies, a more accurate description of ligament positions and centers of mass of the malleus and incus is required. These measurements can be obtained using microCT imaging methods.

Work supported in part by a grant from the NIDCD of NIH (DC03085).
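One way to recover a 3D velocity from measurements taken at several angles is a least-squares fit. The sketch assumes that each measurement is the component of the point's velocity along a known unit viewing direction, so that m_i = d_i · v; this is an assumption for illustration, since the abstract does not describe the instrument or the reconstruction procedure. The directions and velocity values are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system a x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("viewing directions do not span 3-D space")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det(m) / d)
    return solution

def velocity_from_projections(directions, measured):
    """Least-squares v minimising sum_i (d_i . v - m_i)^2 (normal equations)."""
    ata = [[sum(d[r] * d[c] for d in directions) for c in range(3)]
           for r in range(3)]
    atb = [sum(d[r] * m for d, m in zip(directions, measured)) for r in range(3)]
    return solve3(ata, atb)

# Hypothetical check: project a known velocity onto four viewing directions.
true_v = (1.0, -2.0, 0.5)
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.6, 0.8, 0.0)]
measured = [sum(d[i] * true_v[i] for i in range(3)) for d in directions]
estimate = velocity_from_projections(directions, measured)
print([round(v, 6) for v in estimate])
```

At least three non-coplanar viewing directions are needed; additional angles over-determine the system and reduce the effect of measurement noise.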

Poster P18

IMAGE-BASED ANALYTIC SURFACE REPRESENTATION AND MESH GENERATION

Erik J. Bekkers and Charles A. Taylor

Purpose
A major limitation of computational modeling in clinical medicine is the prohibitive time and user effort required to construct a geometric model from medical image data. Our work has been motivated by the need to convert large, three-dimensional (3D) vascular imaging volume data sets into finite element models. An important step in this process is the identification and representation of the boundary (or surface) of the domain of interest. We have developed a robust and topologically independent method for the representation of globally smooth surfaces; such a representation is critical for multiresolution adaptive mesh generation (particularly for fluid flow problems) and contact problems (such as stent/vessel wall interaction).

Materials and Methods
Input data can come from any volumetric imaging modality, including MR, CT, or confocal laser microscopy. The volumetric data is sent through an entirely 3D image processing pipeline, which includes edge-preserving smoothing and level set segmentation. The resulting volumetric data contains an implicit representation of the surface, which is converted to an explicit representation using the Marching Cubes algorithm. This triangulated representation, though extremely accurate, is very dense and lacks multiresolution ability, thus limiting its usefulness as input to a volume mesh generator. In contrast, non-uniform rational B-spline (NURB) surfaces have multiresolution ability and can easily be used as input for mesh generators and solid modeling programs. Thus, we convert the triangulated representation to a NURB surface by performing a mesh simplification to generate a base mesh, and then use this base mesh as the control mesh for a subdivision surface that is least-squares fit to the underlying dense mesh. More specifically, base mesh generation is performed by a novel application of the Fast Marching Method for accurate Voronoi diagram construction. The polygonal dual to the Voronoi diagram is the triangulated base mesh. Triangles are combined pairwise to form squares. The resulting triangle/square surface then undergoes one Catmull-Clark subdivision to form the entirely quadrilateral base mesh. The vertices of this base mesh are then taken as control points for a NURB surface representation, which is an approximation to the limit Catmull-Clark surface of the base mesh. Since the base mesh vertices uniquely determine the surface, they can be considered independent variables in a surface optimization routine: data points (a subset of the dense mesh vertices) are efficiently projected onto the NURB surface, and the base mesh vertices are repositioned so as to minimize the least-squares distance between the data points and the NURB surface.

Results
The method has been demonstrated on a variety of data sets (including vascular models), and has proven to be useful for computational modeling.

Conclusion
We have developed an accurate, automatic method for 3D model construction directly from medical imaging data. The multiresolution ability of our method allows for adaptive re-meshing, and the model’s topological independence allows for analysis of many of the smooth organic shapes encountered in medical imaging. Additionally, the fact that the resulting representation is a NURB surface allows it to be used in standard solid modeling programs.
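Why a single Catmull-Clark step turns a mixed triangle/square mesh into an all-quadrilateral one can be seen from the connectivity alone: every n-sided face is split into n quads that join a corner, its two adjacent edge points, and the face point. The sketch below performs only this connectivity split, placing face points at centroids and edge points at midpoints. The position-smoothing rules of full Catmull-Clark subdivision are omitted, and the small test mesh and function name are hypothetical.

```python
def split_faces_into_quads(vertices, faces):
    """One refinement step with Catmull-Clark connectivity: each n-sided face
    becomes n quads. Vertex smoothing is intentionally left out."""
    verts = [tuple(v) for v in vertices]
    edge_point = {}

    def midpoint_index(a, b):
        # Shared edges get a single new vertex, whichever face asks first.
        key = (min(a, b), max(a, b))
        if key not in edge_point:
            verts.append(tuple((x + y) / 2 for x, y in zip(verts[a], verts[b])))
            edge_point[key] = len(verts) - 1
        return edge_point[key]

    quads = []
    for face in faces:
        n = len(face)
        verts.append(tuple(sum(verts[i][k] for i in face) / n for k in range(3)))
        face_point = len(verts) - 1
        for k in range(n):
            prev_v, v, next_v = face[k - 1], face[k], face[(k + 1) % n]
            quads.append((v, midpoint_index(v, next_v),
                          face_point, midpoint_index(prev_v, v)))
    return verts, quads

# Hypothetical mesh: one square and one triangle sharing the edge (1, 2).
vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (2, 0.5, 0)]
faces = [(0, 1, 2, 3), (1, 4, 2)]
new_vertices, quads = split_faces_into_quads(vertices, faces)
print(len(new_vertices), len(quads))
```

The all-quad result is what allows the refined mesh to serve as a regular control net for a tensor-product spline surface.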

BCATS 2003 Symposium Proceedings


Poster P19

AN AUTOMATIC LUNG SEGMENTATION SCHEME FOR COMPUTER-AIDED DETECTION

Shaohua Sun, Geoffrey D. Rubin and Sandy Napel

Purpose
Automatic lung segmentation is a critical first step for computer-aided detection (CAD) of lung nodules in 3D CT lung images. We developed an automatic lung segmentation scheme for lung nodule CAD that extracts the right and left lungs without removing juxtapleural (touching the chest wall) nodules.

Materials and Methods
Our method consists of 3 main steps: initial segmentation, right and left lung extraction and separation, and edge smoothing. In the first step, all pixels with intensities above a threshold and all air-intensity pixels outside the patient are deleted, thereby leaving only the lungs and other air-filled regions inside the patient. In the second step, the lungs are longitudinally divided into central (CP), superior (SP) and inferior (IP) portions. The CP contains the trachea and bronchi, which we remove by deleting small regions with air density. Then we track and remove the trachea from the most superior slice of the CP through the SP. In the IP, we delete additional small regions of air density, representing air in other organs (stomach, bowel, etc.). In the third step, which is the key to retaining juxtapleural nodules, we perform connected region analysis on each slice and choose the two largest regions, considering them the left and right lungs. We separate each lung contour into a left and right segment, and test whether line segments drawn at a range of angles from points on the contour meet other points in the same segment. If they do, the line is included in the contour, thereby enclosing any juxtapleural nodules.

Results
Figure 1 shows 4 lung slices extracted by the proposed scheme, showing retained juxtapleural nodules (white boxes). In preliminary experiments, we applied the method to a database consisting of 6 patients containing 96 known pulmonary nodules. The studies ranged in size from 304 to 597 slices, and the average time required to segment each study was 21 minutes. To evaluate performance, we compared FROC curves, which measure sensitivity for nodule detection as a function of the number of false positive detections, and the length of time required by CAD, for the cases with our segmentation, with region filling segmentation, and with bounding box segmentation. Overall, the time required by CAD with our segmentation is 89% of the time used by region filling segmentation, and 59% of the time used by bounding box segmentation, while generating similar FROC curves.

Conclusion
These preliminary results suggest that our segmentation scheme improves CAD efficiency without sacrificing accuracy. Further evaluation using a larger lung CT database is required for statistical significance.

Web Page
Figures: see http://snapg4.stanford.edu/~snapel/bcats03/ssun.html
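The per-slice part of the method (threshold the slice, run connected region analysis, keep the two largest regions as the lungs) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name, the list-of-lists image format, the 4-connectivity choice and the threshold value supplied by the caller are all assumptions.

```python
from collections import deque


def two_largest_air_regions(slice_px, threshold):
    """Return the two largest 4-connected regions of low-intensity pixels.

    slice_px: 2D list of pixel intensities for one CT slice.
    threshold: pixels above this value are treated as deleted (hypothetical
    cutoff chosen by the caller).
    Each region is a list of (row, col) tuples; the regions are sorted by
    size, largest first.
    """
    rows, cols = len(slice_px), len(slice_px[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or slice_px[r][c] > threshold:
                continue
            # Breadth-first flood fill collects one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and slice_px[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    regions.sort(key=len, reverse=True)
    return regions[:2]
```

Smaller regions that are discarded here would correspond to the air pockets in other organs that the abstract removes; the contour-closing test that retains juxtapleural nodules is not shown.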


Poster P20

SIMULATION BY REACTION PATH ANNEALING: PROTEIN MISFOLDING AND AGGREGATION IN A 7 RESIDUE PEPTIDE FROM THE YEAST PRION PROTEIN SUP35

Jan Lipfert, Joel Franklin, Fang Wu and Sebastian Doniach

Prions are the cause of several neurodegenerative diseases, most notably BSE in cows and CJD (Creutzfeldt-Jakob Disease) in humans. The disease-causing agent is believed to be the prion protein (PrP), which can undergo a transition from its normal form to a disease-associated misfolded amyloid conformation. This conformational conversion involves a transition from alpha-helix to beta-sheet of large parts of the protein. Recently, microcrystals of a 7 residue segment of the yeast prion-like protein, Sup35, have been obtained by Eisenberg and collaborators [1]. It was shown that the peptide aggregates into amyloid assemblies. By fitting to the x-ray powder pattern of these microcrystals, we have been able to generate candidate atomic models of the amyloid-like unit cell. We have studied the transition of the alpha-helix conformation of this peptide to the beta-sheet conformation, both as an isolated peptide in solution and in the presence of the model amyloid-like unit cell as a substrate, using the method of reaction path annealing [2, 3]. In our current parallel implementation of reaction path annealing, we discretize the trajectory into 64 time slices and, after generating initial trajectories, we perform stochastic all-atom "pseudo-dynamical" molecular dynamics annealing simulations, in parallel for all time slices, to sample trajectory space and generate trajectories with action close to the minimum.

Our results indicate that, while for the single peptide in solution the conversion from alpha-helix to beta-sheet is unfavorable (the forward reaction has higher Onsager-Machlup action than the back reaction), the opposite is true in the presence of the amyloid-like protein substrate, i.e. the beta-sheet conformation is more stable than a single alpha helix in the presence of the beta-sheet substrate. Having generated trajectories in all-atom detail, we are in a position to investigate the molecular cause of this stabilizing effect of the beta-sheet substrate. First results indicate that while the final state is stabilized by hydrogen bonds, electrostatic and van der Waals interactions play the key role in the course of the alpha -> beta conversion.

References

1. M. Balbirnie, R. Grothe, and D.S. Eisenberg. PNAS 98 (5): 2379-2380 (February 27, 2001).

2. P. Eastman, N. Grønbech-Jensen and S. Doniach. J. Chem. Phys. 114:3823-3841 (February 2001).

3. L. Onsager and S. Machlup. Phys. Rev. 91:1505-1512 (September 1953).
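To make the comparison of forward and back reactions by Onsager-Machlup action concrete, the sketch below evaluates a discretized action for a trajectory split into time slices. It is a toy under stated assumptions, not the all-atom implementation described in the abstract: it assumes one-dimensional overdamped Langevin dynamics, omits the divergence correction term, and the function name and parameters are hypothetical.

```python
def onsager_machlup_action(path, force, dt, gamma, diffusion):
    """Discretized Onsager-Machlup action of a 1D trajectory.

    Assumes overdamped Langevin dynamics
        dx = (force(x) / gamma) dt + sqrt(2 * diffusion) dW,
    so each time slice contributes
        (x_next - x_now - force(x_now) / gamma * dt)**2 / (4 * diffusion * dt).
    path: list of positions, one per time slice.
    A lower action means the trajectory is more probable.
    """
    action = 0.0
    for x_now, x_next in zip(path, path[1:]):
        drift = force(x_now) / gamma * dt
        residual = x_next - x_now - drift
        action += residual ** 2 / (4.0 * diffusion * dt)
    return action
```

With a constant force of 1, `gamma = 1`, `diffusion = 1` and `dt = 0.25`, the path `[0.0, 0.25, 0.5, 0.75, 1.0]` follows the drift exactly and has zero action, while the same path traversed in reverse accumulates an action of 1.0, which is the sense in which one direction of a reaction can be less favorable than the other.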


Poster P21

ANALYSIS OF NUCLEOTIDE DIVERSITY IN EXONIC SPLICING ENHANCERS (ESES) OF HUMAN MEMBRANE TRANSPORTER GENES

Bernie J. Daigle, Maya K. Leabman, Kathleen M. Giacomini and Russ B. Altman

Exonic splicing enhancers (ESEs) are specific, short oligonucleotide sequences that enhance pre-mRNA splicing when present in exons [1]. Using a dataset containing the DNA sequence of 24 membrane transporter genes in 247 ethnically diverse individuals, we are interested in analyzing the nucleotide diversity (π) of regions containing ESEs. Part of this process involves assigning a score to each exon’s splice sites based on how well the splicing donor and acceptor sites match genome-wide consensus sequences. The overall goals of the project are first to evaluate the significance of differences (if any) between nucleotide diversity in ESEs versus other exonic regions, and second to assess whether a correlation exists between splice site score and ESE nucleotide diversity. Splice site scores were calculated by aligning all non-redundant, non-overlapping human reference sequence transcripts over the positions constituting the canonical splice site donor and acceptor sequences. Two position-specific log-odds score matrices were created from these alignments and used to score individual splicing donor and acceptor sites. ESE sequences were identified based on a putative list generated by Fairbrother et al. 2002 [1]; nucleotide diversity was calculated using the Arlequin population genetics software [2]. Splice donor and acceptor sites for all 24 membrane transporter genes were scored and ranked, putative ESEs were identified, and nucleotide diversity in ESEs was calculated for a subset of the genes.

References

1. Fairbrother WG, Yeh R, Sharp PA, Burge CB. Predictive Identification of Exonic Splicing Enhancers in Human Genes. Science, 297, 1007-1013, 2002.

2. Excoffier L, http://lgb.unige.ch/arlequin/.
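The splice site scoring described above (a position-specific log-odds matrix built from aligned sites, then used to score individual sites) can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the pseudocount, the background frequencies supplied by the caller, and the function names are assumptions.

```python
import math
from collections import Counter


def build_log_odds_matrix(aligned_sites, background, pseudocount=1.0):
    """Build a position-specific log-odds matrix from aligned sites.

    aligned_sites: equal-length strings, one per known splice site.
    background: dict base -> background frequency (assumed by the caller).
    Returns one dict per alignment column mapping
    base -> log2(observed frequency / background frequency).
    A pseudocount keeps unseen bases from producing log(0).
    """
    length = len(aligned_sites[0])
    total = len(aligned_sites) + pseudocount * len(background)
    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in aligned_sites)
        matrix.append({
            base: math.log2(((counts[base] + pseudocount) / total) / freq)
            for base, freq in background.items()
        })
    return matrix


def score_site(matrix, site):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(matrix, site))
```

A site that matches the most common base at every position receives a positive score, and a site built from bases never seen in the alignment receives a negative one; separate matrices would be built for donor and acceptor sites.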


Poster P22

PREDICTING HIV DRUG RESISTANCE USING SUPERVISED MACHINE LEARNING METHODS

Jaideep Ravela, Asa Ben-Hur and Robert W. Shafer

Purpose
More than 100 mutations at ~50 sites in HIV reverse transcriptase (RT) and protease, the molecular targets of therapy, confer HIV drug resistance. These mutations emerge in complex patterns and often have synergistic and antagonistic interactions. We applied 3 supervised learning methods to RT and protease genetic sequence data to predict reductions in HIV drug susceptibility.

Materials and Methods
Susceptibility results were available for 9 RT inhibitors in ~300 sequenced RT isolates and for 5 protease inhibitors in ~500 sequenced protease isolates. Susceptibility results indicated the fold-increase, relative to a wild-type virus isolate, in the drug concentration required to inhibit an isolate by 50%. Because of the high dimensionality of the protease and RT sequence data, 4 alternate genotypic feature sets were explored: (a) 6 protease and 6 RT mutations strongly associated with drug resistance based on in vitro experiments, (b) 19 protease and 16 RT mutations accepted by most experts to be associated with drug resistance, (c) 45 protease and 42 RT positions recently found to be statistically associated with treatment failure, (d) all 99 protease and all 240 RT positions. Decision trees (DT, C4.5), neural networks (NN), and support vector machines (SVM) were trained to classify virus isolates as susceptible, low-level resistant, or high-level resistant to each of the 14 drugs.

Results
A minimum of 19 protease and 16 RT mutations were required to obtain optimal predictions for each of the 14 drugs using each of the three learning methods. The median prediction accuracy into the 3 categories (susceptible, low-level resistant, and high-level resistant) was 69.5% (range: 63%-84%) for DTs and 73% (range: 64%-88%) for NNs. Discriminating between susceptible and low or high-level resistance was more accurate than discriminating between low-level and high-level resistance.

Conclusion
This study demonstrates the applicability of three learning methods for using HIV genotypes to predict drug susceptibility phenotype. The study identifies a lower limit for the number of mutations that must be used for phenotypic prediction. Analyses are underway to compare the importance of individual mutations for classification by each of the methods.
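The data preparation implied by the methods (turning each isolate's mutation list into a fixed-length feature vector, and turning a fold-change measurement into one of three class labels) might look like the sketch below. The abstract does not state the fold-change cutoffs, so `low_cutoff` and `high_cutoff` are hypothetical parameters, and the function names and the mutation names used in the usage note are illustrative only.

```python
def encode_isolate(isolate_mutations, feature_mutations):
    """Encode one isolate as a binary vector over a chosen feature set.

    isolate_mutations: mutations found in the sequenced isolate.
    feature_mutations: ordered list of mutations in the feature set,
    for example one of the four sets (a) to (d) in the abstract.
    """
    present = set(isolate_mutations)
    return [1 if mutation in present else 0 for mutation in feature_mutations]


def resistance_category(fold_change, low_cutoff, high_cutoff):
    """Map a fold-change in susceptibility to one of three class labels.

    low_cutoff and high_cutoff are hypothetical, drug-specific thresholds;
    the source does not give the values that were used.
    """
    if fold_change < low_cutoff:
        return "susceptible"
    if fold_change < high_cutoff:
        return "low-level resistant"
    return "high-level resistant"
```

For instance, `encode_isolate(["K103N", "M184V"], ["M41L", "K103N", "M184V", "T215Y"])` gives `[0, 1, 1, 0]`; vectors of this kind, paired with the category labels, are what a decision tree, neural network or support vector machine would be trained on.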


Poster P23

ROBUST NEURAL DECODING OF REACHING MOVEMENTS FOR PROSTHETIC SYSTEMS

Caleb Kemere, Maneesh Sahani and Teresa Meng

Purpose
A new algorithm for decoding intended movements from neural signals is presented. A Hidden Markov Model (HMM) of arm movements is used to assist in trajectory reconstruction.

Materials and Methods
Experimentally captured reaching movements are combined with a neural signal model to generate a synthetic data set for testing the robust decoding algorithm.

Results
In most situations, using the HMM prior of arm movements reduces the error of the reconstructed trajectory by half compared to typical methods.

Conclusion
The new paradigm of model-based neural decoding should provide significant benefits in human neural prosthetic systems.

References

1. C. Kemere, M. Sahani and T. H. Meng. “Robust neural decoding of reaching movements for prosthetic systems,” in Proc. IEEE Engineering in Medicine and Biology Society 25th Annual Conference (EMBS '03), Cancun, Mexico, Nov. 2003.
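The abstract does not describe the structure of its HMM, so the following is only a generic sketch of how an HMM prior can constrain decoding from neural signals. It assumes discrete hidden movement states, Poisson-distributed spike counts per time bin, and Viterbi decoding; the state names, firing rates and function names are all hypothetical.

```python
import math


def poisson_log_pmf(count, rate):
    """Log probability of observing `count` spikes at a given mean rate."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def viterbi(spike_counts, states, log_prior, log_trans, rates):
    """Most likely hidden state sequence given per-bin spike counts.

    log_prior[s]: log probability of starting in state s.
    log_trans[q][s]: log probability of moving from state q to state s.
    rates[s]: assumed mean spike count per bin while in state s.
    """
    best = {s: log_prior[s] + poisson_log_pmf(spike_counts[0], rates[s])
            for s in states}
    back = []
    for count in spike_counts[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in states:
            # Best predecessor of s under the transition model.
            p = max(states, key=lambda q: prev_best[q] + log_trans[q][s])
            pointers[s] = p
            best[s] = (prev_best[p] + log_trans[p][s]
                       + poisson_log_pmf(count, rates[s]))
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

The transition model plays the role of the movement prior: with "sticky" transitions, a single noisy bin is less likely to flip the decoded state than it would be if each bin were decoded independently.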


Poster P24

NANOROBOTS AS CELLULAR ASSISTANTS IN INFLAMMATORY RESPONSES

Arancha Casal, Tad Hogg (HP Labs) and Adriano Cavalcanti (Darmstadt Univ. of Technology)

The ongoing development of molecular-scale electronics, sensors and motors could eventually lead to nano-scale robots ("nanorobots") with dimensions comparable to bacteria. While such robots cannot yet be fabricated, theoretical studies identify their plausible range of capabilities, including operating in fluid microenvironments of the body for medical applications. We investigate the possibility of using nanorobots to assist inflammatory cells leaving blood vessels to repair injured tissues. The recruitment of inflammatory cells or white blood cells (which include neutrophils, lymphocytes, monocytes and mast cells) to the affected area is the first response of tissues to injury. Because of their small size (on the scale of cellular surface receptors), nanorobots could attach themselves to the surface of recruited white cells, squeeze their way out through the walls of blood vessels and arrive at the injury site, where they can assist in the tissue repair process. Passage of cells across the blood endothelium, a process known as transmigration, is a complex mechanism involving engagement of cell surface receptors to adhesion molecules, active force exertion and dilation of the vessel walls and physical deformation of the migrating cells. By attaching themselves to migrating inflammatory cells, the robots can in effect 'hitch a ride' across the blood vessels, bypassing the need for a complex transmigration mechanism of their own. A recently developed nanorobot simulator allows investigation of system-level control algorithms for such nanomedicine applications. The simulator uses a typical set of design parameters for robots operating in a simplified fluid environment motivated by medically relevant microenvironments. The robots, whose motion is dominated by viscosity (with Reynolds numbers around 0.001), have behaviors quite different from common experience with larger, faster flows.
The dominance of viscosity simplifies the numerical evaluation of the fluid dynamics by ignoring inertial effects. This simplification allows focusing on overall behaviors of groups of robots, while balancing a reasonable approximation to important physical phenomena of the environment with limited computational cost of the simulator. For example, the simulator can follow the behavior of tens of robots with sizes of hundreds of nanometers over periods up to a second or so with reasonable computational effort. In this poster presentation, we describe how this simulator can examine the response of a group of robots to chemical signals associated with the early immunological response to injury. In this task, the robots must detect the signals, use gradient information to guide their motions to the source, and recognize the correct target cells to bind to in order to achieve blood-vessel transmigration. The recruited robots must also communicate with each other to monitor their numbers and ensure a desired density of response. Such behaviors are an initial step toward a control program allowing the nanorobots to reach an injury site in appropriate numbers and assist inflammatory cells in the tissue repair process. The simulator allows us to quantitatively evaluate the robot response in terms of reaction time, necessary power consumption and robustness. In addition, the simulator provides graphical visualization of object motions in the task environment, which is useful both to illustrate robot behaviors and identify difficulties with specific robot control algorithms. Due to the use of chemical sensors and 3D motions through the viscous fluid, this robot task is more complex than most foraging tasks studied with large scale robots, i.e., the physical properties of the fluid microenvironment of nanorobots provide new control challenges.
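A quick calculation shows why viscosity dominates at this scale. The Reynolds number is the ratio of inertial to viscous forces, Re = (density × speed × length) / viscosity. The values in the usage note below are assumptions chosen for illustration (a water-like fluid, a robot about one micrometer across moving at about one millimeter per second); they are not the simulator's actual design parameters.

```python
def reynolds_number(density, speed, length, viscosity):
    """Ratio of inertial to viscous forces for an object moving in a fluid.

    density: fluid density in kg/m^3
    speed: object speed relative to the fluid in m/s
    length: characteristic object size in m
    viscosity: dynamic viscosity of the fluid in Pa*s
    """
    return density * speed * length / viscosity
```

Calling `reynolds_number(density=1000.0, speed=1e-3, length=1e-6, viscosity=1e-3)` gives a value of about 0.001, consistent with the figure quoted in the abstract. At such low Reynolds numbers the inertial terms are roughly a thousand times smaller than the viscous ones, which is why the simulator can neglect them.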


Poster P25

PREDICTING FUNCTIONAL SITES ON PROTEIN STRUCTURES WITH SEQFEATURE

Mike Hsin-Ping Liang, Russ Altman and Doug Brutlag

Predicting protein function is an interesting and challenging problem. In the past, computational prediction of protein function involved mostly analysis at the sequence level. With the growing availability of protein structures, incorporating structural information can provide better predictive power and insight into the function of the proteins. As structural genomics initiatives begin to work, more and more protein structures with unknown function will be available, and methods for predicting protein function on structures will become increasingly valuable. We have developed a method, SeqFEATURE, to integrate 1D sequence motifs with structural information to automatically construct 3D structural models for protein function prediction on protein structures. These structural models provide a description of statistically conserved physicochemical properties in the 3D environment around the sequence motif. We use these structural models for identifying specific functional locations on protein structures. Our method has been applied to predicting calcium binding sites with success and is currently being tested on serine protease catalytic sites.


Poster P26

COMPARING CONSERVATION AND COVARIANCE IN PROTEIN ALIGNMENTS

Anthony A. Fodor and Richard W. Aldrich

Purpose
A number of recent papers have suggested that algorithms that find correlated mutations in multiple sequence alignments can be used to find important residues in proteins (1, 2). We evaluate three different methods for detecting these covarying mutations in protein multiple sequence alignments.

Materials and Methods
Our first covariance algorithm computes a score for a pair of alignment columns (i,j) based on a chi-square-like difference between the expected and observed numbers of pairs of residues. Our second algorithm (3) is based on the ratio of the expected and observed pairs of residues. Our third algorithm (2, 4) compares the properties of a sub-alignment of column j to the properties of the full alignment for column j, where the members of the sub-alignment are determined by the residues present in column i.

Results
While all three algorithms gave low covariance scores to highly conserved columns, we found a surprising lack of agreement in identifying highly covarying pairs of residues. Despite their differences, all three algorithms were, to varying degrees of power, able to find pairs of residues that were physically close to each other and had low solvent accessible surface areas in a test set of 132 crystal structures. All three algorithms, however, were outperformed in finding physically close residues by an algorithm that simply chooses the most conserved pairs of residues.

Conclusions
Covarying residues are distinct from conserved residues. Nonetheless, both conserved and covarying residues are clustered in the protein core. We interpret this to mean that covariance is a kind of conservation. Conservation is lack of change in a column i that can be seen by just observing that column; covariance is the lack of change in a column i that can only be observed by considering that column in relationship to another column j. The combination of conservation and covariance information is therefore likely to yield more information than considering either algorithm alone (5). The overall low power of covariation algorithms, however, means that approaches which concentrate only on covariance and ignore conservation, for example recent attempts to map “conserved evolutionary pathways” in proteins (2, 4), are likely to fail.

References

1. Bickel et al., Proc Natl Acad Sci USA 99, 14765-71 2002.

2. Suel et al., Nat Struct Biol 10, 59-69 2003.

3. Wollenberg and Atchley, Proc Natl Acad Sci USA 97, 3288-91 2000.

4. Olmea et al., J Mol Biol 293, 1221-39 1999.
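The first covariance measure, a chi-square-like comparison of observed and expected residue pairs in two alignment columns, can be sketched as below. The exact statistic used by the authors is not given, so this uses the standard chi-square form with expected counts derived from the column marginals; the function name and input format are assumptions.

```python
from collections import Counter


def chi_square_covariance(column_i, column_j):
    """Chi-square-like covariance score for two alignment columns.

    column_i, column_j: equal-length sequences of residues, one entry per
    aligned sequence. The expected count of each residue pair assumes the
    two columns vary independently.
    """
    n = len(column_i)
    count_i, count_j = Counter(column_i), Counter(column_j)
    observed = Counter(zip(column_i, column_j))
    score = 0.0
    for a in count_i:
        for b in count_j:
            expected = count_i[a] * count_j[b] / n
            score += (observed[(a, b)] - expected) ** 2 / expected
    return score
```

Two fully conserved columns score 0 because the only observed pair occurs exactly as often as expected, which matches the low covariance scores reported for highly conserved columns. Columns that change together, such as `"AAGG"` paired with `"LLVV"`, score 4.0, while columns that vary independently, such as `"AAGG"` and `"LVLV"`, score 0.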


Poster P27

3-DIMENSIONAL MAPPING OF THE WATER STRUCTURE AROUND HYDROPHOBIC SOLUTES

Tanya Raschke and Michael Levitt

Purpose
The hydrophobic effect is one of the most important forces in stabilizing biological molecules. The nature of this important interaction lies in the unique hydrogen-bonding properties of water. Experimental results have shown that there is a decrease in entropy upon exposure of hydrophobic solutes to water. Is this decrease in entropy due to the formation of a highly ordered, hydrogen bond network in the hydration shells (the clathrate model) or is it simply due to excluded volume of the solute?

Materials and Methods
To investigate these questions, we computed detailed maps of the position and orientation of the water molecules that surround simple, hydrophobic solutes. Water structure maps were generated from molecular dynamics simulations computed using different potentials, ensembles, solutes and water models.

Results
Our results show that waters in the first hydration shell occur at high density, but are not strongly oriented. A notable exception is benzene, where water molecules show a strong interaction with the top and bottom of the ring.

Conclusion
3D maps of the density of water surrounding hydrophobic solutes show two clear hydration shells, separated by a low-density layer. Water molecules in the first hydration layer surrounding a hydrophobic solute occur at high density, but are not strongly oriented. Water makes very favorable interactions with the top and bottom of the benzene ring, indicating that aromaticity alters the hydrophobic effect.


Poster P28

GENOME-WIDE DISCOVERY OF TRANSCRIPTIONAL MODULES FROM DNA SEQUENCE AND GENE EXPRESSION

Eran Segal, Roman Yelensky and Daphne Koller

Purpose
Many cellular processes are regulated at the transcriptional level, by one or more transcription factors that bind to short DNA sequence motifs in the upstream regions of the process genes. These co-regulated genes then exhibit similar patterns of expression. Given the upstream regions of all genes, and measurements of their expression under various conditions, we could hope to ‘reverse engineer’ the underlying regulatory mechanisms and identify transcriptional modules: sets of genes that are co-regulated under these conditions through a common motif or combination of motifs. We present a genome-wide approach for discovering this modular organization using only the genome sequence and gene expression data as input.

Materials and Methods
We propose a novel probabilistic model that describes the mechanism by which patterns (motifs) in DNA promoter sequences give rise to observed expression profiles, and an algorithm to learn this model automatically from data. Our model integrates both the gene expression measurements and the DNA sequence data into a unified model. The model assumes that genes are partitioned into modules, which determine the gene’s expression profile. Each module is characterized by a motif profile, which specifies the relevance of different sequence motifs to the module. A gene’s module assignment is a function of the sequence motifs in its promoter region. By integrating the sequence and expression into a unified model, we allow for bi-directional “information flow” when learning the models: genes with similar expression profiles are more likely to be in the same module, forcing us to find motifs in co-expressed genes; similarly, genes with common motifs affect the module assignment, forcing an organization consistent with regulatory mechanisms. Our learning algorithm is genome-wide: we search for motif combinations that “explain” the observed expression profiles, and dynamically add new motifs as necessary. We note that all components of the model, including the motifs themselves, are learned de novo. As motifs are usually short, there are many genes where a motif is randomly present but does not play a role. Thus, our model does not assume that all motif appearances are active and attempts to discover those motif appearances that play a regulatory role in some particular set of experiments; a motif that is active in some settings may be completely irrelevant in others. Our model identifies motif targets, that is, genes where the motif plays an active role in affecting regulation in a particular expression data set.

Results
We show that our approach is better at recovering known motifs and at generating biologically coherent modules in comparison to standard approaches that first cluster the expression profiles and then search for motifs in the upstream regions of the genes in each cluster. Using only sequence and expression data as input, our method recovered 25 of the 36 known motifs in yeast, 10 known motif combinations in human, and suggested novel modules whose expression is coherent in expression datasets not used as input to our method, thereby providing support for these novel hypotheses. Our genome-wide approach also provided global insights about gene regulation: that there are more “general” (binding to many genes) and more “specific” (binding to few genes) motifs than expected by chance; and that some motifs bind preferentially to certain regions within the upstream region of genes. Finally, we also combine our results with genome-wide location data in order to relate the actual transcription factors to the module they regulate, resulting in a detailed regulatory network of yeast. We show that 21 of the 64 inferred regulatory relationships in this network have support in the literature. The remaining 43 predictions remain as novel hypotheses.


Poster P29

WALKING ALONG CHROMOSOMES: GENOMIC MAPPING OF CYTOGENETIC ABERRATIONS FROM TUMOR DATABASES

Michael Baudis

The analysis of karyotypes by Giemsa banding or molecular cytogenetic techniques (CGH, FISH) is a standard procedure in the clinical workup of many neoplastic diseases. Besides their diagnostic value, nonrandom chromosomal abnormalities may point towards genomic regions with increased probability for the occurrence of oncogenes involved in the generation of the respective disease. Through DNA array technologies, the continuous accumulation of genetic expression profiles from neoplastic tissues has become feasible. However, external reference systems, e.g. the information about non-randomly aberrant chromosomal regions derived from cytogenetic analysis, have to be applied to define the neoplastic significance of differentially expressed genes. One obstacle here is the specific syntax of the cytogenetic nomenclature. As part of the PROGENETIX database project for Comparative Genomic Hybridization (CGH) experiments, the author developed a parsing engine for the transformation of CGH annotations to a positional matrix. Recently, FELIX, an algorithm for parsing karyotype results from chromosomal banding analysis, has been implemented. In contrast to the single unambiguous value for each genomic segment reported by CGH, banding analysis depicts the structural composition of single chromosomes from tumor metaphase spreads. Also, karyotypes may contain chromosomal segments of unknown definition (e.g. marker and double minute chromosomes). For the FELIX parser, those factors were accounted for by (1) compound band values from separate analysis of each annotation, and (2) a user-definable parsing stringency (e.g., only completely resolvable cases). As examples, the analysis of CGH karyotypes of follicular lymphomas from the Mitelman database led to an imbalance pattern comparable to the data from PROGENETIX, with the additional benefit of highlighting translocation breakpoints. However, in the comparison of data from a solid tumor entity (neuroblastoma), a superior detection sensitivity for imbalances was observed in the CGH dataset. The development of the cytogenetic parsing engines has enabled the mining of the two largest, non-redundant cytogenetic databases. By integrating the more than 8000 cases from PROGENETIX with the more than 45000 cases contained in the Mitelman database, the development of specific genomic profiles for most tumor entities should be feasible. The preformatted PROGENETIX data as well as online versions of the conversion engines are available through the website.

Web Site
http://www.progenetix.net


Poster P30

A PERFORMANCE-BASED MULTI-CLASSIFIER APPROACH TO ATLAS-BASED SEGMENTATION

T. Rohlfing, D. B. Russakoff and C. R. Maurer, Jr.

Purpose
Combinations of classifiers are consistently more accurate than single classifiers [1]. For atlas-based segmentation, multiple independent classifiers arise naturally from the use of multiple atlases derived from different individuals [2]. We describe in this work methods to estimate the performance parameters of multiple, not necessarily atlas-based, classifiers. The performance estimates can be used in the classifier combination to assign higher confidence to more reliable individual classifiers.

Materials and Methods
Warfield et al. [3] recently described a method for estimating the binary segmentation performance of multiple experts in the absence of a ground truth. We present two extensions of this method to multi-label segmentations. The first method independently applies the Warfield algorithm to each label in the segmentation. It estimates the common binary performance parameters sensitivity p and specificity q for each classifier and each label. The second method is based on a multi-label performance parameter model (row-normalized confusion matrix of a Bayesian classifier) that takes into account cross-label misclassifications. We evaluate both methods relative to each other and to simple classifier combination by Sum Rule decision fusion [1]. For each of 20 individuals we compute 19 atlas-based segmentations, using each of the remaining 19 individuals as the atlas. The 19 segmentations are combined into a final segmentation using 1) simple Sum Rule fusion, 2) the binary performance parameters estimated by the Warfield algorithm, and 3) the multi-label performance parameters estimated using our novel algorithm. The accuracies of all segmentations are computed by comparing them to a manual gold standard segmentation.

Results
Both methods computed accurate estimates of the true performance parameters (Pearson’s correlation coefficient 0.94 for Warfield algorithm, 0.87 for multi-label algorithm). Using these parameters, the accuracy of the combined segmentation improved substantially. The mean recognition rates were: 93% for Warfield’s algorithm, 95% for multi-label estimation, 91% for Sum Rule fusion with no performance estimates.

Conclusion
Performance-based combinations of multiple segmentations substantially improve segmentation accuracy. The methods are universally applicable to general classification problems beyond atlas-based segmentation, for example for handwriting recognition.

References

1. J Kittler, et al. On combining classifiers. IEEE Trans Pattern Anal Machine Intell, 20(3):226–239, 1998.

2. T Rohlfing, et al. Segmentation of three-dimensional images using non-rigid registration: Methods and validation with application to confocal microscopy images of bee brains. In Medical Imaging: Image Processing, vol. 5032 of Proceedings of the SPIE, pp. 363-374, 2003.

3. SK Warfield, et al. Validation of image segmentation and expert quality with an expectation-maximization algorithm. In Proceedings of MICCAI, pp. 298–306, Springer-Verlag, Berlin, 2002.
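The difference between unweighted fusion and performance-based fusion can be shown for a single voxel and a single label. The sketch below is an illustration under stated assumptions, not the authors' algorithm: it takes the sensitivities p and specificities q as already known rather than estimating them, assumes the classifiers err independently, and treats Sum Rule fusion of hard 0/1 decisions as a simple majority vote. The function names are hypothetical.

```python
def sum_rule_vote(decisions):
    """Unweighted fusion of hard binary decisions (1 = label present).

    With 0/1 decisions, summing the classifier outputs and picking the
    larger total reduces to a majority vote.
    """
    return 1 if sum(decisions) * 2 > len(decisions) else 0


def performance_weighted_posterior(decisions, sensitivity, specificity,
                                   prior=0.5):
    """Posterior probability that the label is truly present.

    decisions[k]: 0/1 decision of classifier k.
    sensitivity[k] (p): P(classifier k says 1 | label present).
    specificity[k] (q): P(classifier k says 0 | label absent).
    Applies Bayes' rule assuming the classifiers err independently.
    """
    like_present, like_absent = prior, 1.0 - prior
    for d, p, q in zip(decisions, sensitivity, specificity):
        like_present *= p if d == 1 else 1.0 - p
        like_absent *= (1.0 - q) if d == 1 else q
    return like_present / (like_present + like_absent)
```

With decisions `[1, 0, 0]`, the majority vote returns 0. If the first classifier has p = q = 0.99 and the other two have p = q = 0.6, the weighted posterior is about 0.98, so the single reliable classifier overrules the two unreliable ones. This is the sense in which performance estimates assign higher confidence to more reliable classifiers.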


Poster P31

DNA STACKING INTERACTIONS OF FLUORINATED AROMATIC C-DEOXYRIBONUCLEOSIDES

Jacob S. Lai and Eric T. Kool

Purpose
To probe the importance of electrostatic interactions found in duplex DNA base stacking in water, using a series of nonnatural deoxyribonucleosides containing fluorine-substituted aromatic hydrocarbons as the DNA base.

Materials and Methods
Mono-, di-, tri-, tetra- and penta-fluorinated benzene deoxyribonucleoside analogues are synthesized using the nucleophilic reaction of lithiated aromatic derivatives and the known 3’,5’-O-((1,1,3,3-tetraisopropyl)disiloxanediyl)-2’-deoxy-D-ribono-1,4-lactone, giving primarily the beta-anomer. These C-nucleoside derivatives are then incorporated into 7-mer and 9-mer DNA oligonucleotides via their phosphoramidite derivatives. These oligomers are designed to have a self-complementary core sequence of natural DNA and a 5’-dangling end composed of the unnatural nucleotides. Thermal denaturation studies were used to evaluate melting temperatures and thermodynamics of these modified DNAs.

Results
The results showed better stacking of all fluorinated nucleobases compared to a non-fluorinated phenyl case, possibly due to greater surface area and higher hydrophobicity. Surprisingly, dipolar and quadrupolar electrostatic components were found to contribute little to stacking energies. In addition, an unexpected ortho substitution effect was also observed.

Conclusion
These insights add to the understanding of electrostatic interactions in DNA base stacking and may lead to future developments in design of molecules that can stabilize DNA structures.

Web Page
http://www.stanford.edu/~jakelai/research/


Poster P32

LARGE SCALE STUDY OF PROTEIN DOMAIN DISTRIBUTION IN THE CONTEXT OF ALTERNATIVE SPLICING

Shuo Liu and Russ B. Altman

Alternative splicing plays an important role in processes such as development, differentiation, and cancer. With the recent increase in the estimated fraction of human genes that undergo alternative splicing from 5% to 35-59%, it becomes critical to develop a better understanding of its functional consequences and regulatory mechanisms. We conducted a large-scale study of the distribution of protein domains in a curated data set of several thousand genes and identified protein domains disproportionately distributed among alternatively spliced genes. We also identified a number of protein domains that tend to be spliced out. Both the proteins with disproportionately distributed domains and those with spliced-out domains are predominantly involved in the processes of cell communication, signaling, development and apoptosis. These proteins function mostly as enzymes, signal transducers, and receptors. Somewhat surprisingly, 28% of all occurrences of spliced-out domains result not from straightforward exclusion of the exons coding for the domains but from inclusion or exclusion of other exons, which shifts the reading frame while the exons coding for the domains are retained in the final transcripts.


Poster P33

REAL-TIME LENS DISTORTION CORRECTION USING TEXTURE MAPPING

Michael Bax

Purpose
Optical lens systems suffer from non-linear radial distortion. Applications such as computer vision and medical imaging require distortion compensation for the accurate location, registration and measurement of image features. While in many applications distortion correction may be applied offline, a real-time capability is desirable for systems that interact with the environment or with a user in real time.

Materials and Methods
The construction of a triangle mesh combined with distortion compensation of the mesh nodes results in a pair of static node co-ordinate sets, which a texture-mapping graphics accelerator can use along with the dynamic distorted image to render high-quality distortion-corrected images at video frame rates.

Results
Mesh generation, an error analysis, and performance results are presented.

Conclusion
The polar-based method proposed in this work is shown to be both more accurate than a conventional grid-based approach and faster than the traditional method of using the CPU to transform each pixel individually.
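The pair of static node co-ordinate sets described above can be illustrated with a minimal sketch. It is not the author's implementation: the function name polar_mesh, the ring-and-spoke layout, and the single-coefficient radial model r_d = r_u(1 + k1 r_u^2) are assumptions made for illustration only. The first set holds the node positions in the corrected image and the second holds the positions at which the distorted image would be sampled as a texture.

```python
import math

def polar_mesh(n_rings, n_spokes, k1, r_max=1.0):
    """Build a polar triangle mesh with two static coordinate sets.

    vertices   : node positions in the corrected (undistorted) image
    tex_coords : positions of the same nodes in the distorted image
    Assumed distortion model (hypothetical): r_d = r_u * (1 + k1 * r_u**2).
    """
    vertices = [(0.0, 0.0)]
    tex_coords = [(0.0, 0.0)]
    for i in range(1, n_rings + 1):
        r_u = r_max * i / n_rings
        r_d = r_u * (1.0 + k1 * r_u * r_u)
        for j in range(n_spokes):
            theta = 2.0 * math.pi * j / n_spokes
            c, s = math.cos(theta), math.sin(theta)
            vertices.append((r_u * c, r_u * s))
            tex_coords.append((r_d * c, r_d * s))

    triangles = []
    # Innermost ring: a fan of triangles around the centre node (index 0).
    for j in range(n_spokes):
        triangles.append((0, 1 + j, 1 + (j + 1) % n_spokes))
    # Remaining rings: two triangles per quadrilateral cell.
    for i in range(1, n_rings):
        inner = 1 + (i - 1) * n_spokes
        outer = 1 + i * n_spokes
        for j in range(n_spokes):
            j2 = (j + 1) % n_spokes
            triangles.append((inner + j, outer + j, outer + j2))
            triangles.append((inner + j, outer + j2, inner + j2))
    return vertices, tex_coords, triangles
```

Both coordinate sets are computed once; per frame, only the distorted image changes, so a graphics accelerator could redraw the same triangles with a new texture.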


Poster P34

CONSERVATION OF KNOWN REGULATORY ELEMENTS AND THEIR IDENTIFICATION USING COMPARATIVE GENOMICS

Yueyi Liu, X. Shirley Liu, Liping Wei, Russ B. Altman and Serafim Batzoglou

Purpose
Regulatory elements are crucial to our understanding of gene regulation. Comparative genomics (cross-species sequence comparison) is a promising approach to the challenging problem of regulatory element identification, since functional non-coding sequences may be conserved across species due to evolutionary constraints. In this work, we present a systematic study of the conservation of known regulatory elements, as well as CompareProspector, a program that integrates comparative genomics into Gibbs sampling for regulatory element identification.

Materials and Methods
We first systematically analyzed known human and S. cerevisiae regulatory elements documented in databases such as TRANSFAC and SCPD. Based on the results of these analyses, we developed a regulatory element-finding algorithm called CompareProspector, which extends Gibbs sampling by biasing the search toward promoter regions conserved across species.

Results
In our systematic analyses, we found that known human regulatory elements are more conserved between human and mouse than background sequences. Though known S. cerevisiae regulatory elements do not appear to be more conserved by comparison of S. cerevisiae to S. pombe, they are more conserved when compared to multiple other yeast genomes (S. paradoxus, S. mikatae, and S. bayanus) using multiple sequence alignment. Using human–mouse comparison, CompareProspector correctly identified the known elements for transcription factors Mef2, Myf, Srf, and Sp1 from a set of human muscle-specific genes. It also discovered the NFAT element from genes upregulated by CD28 stimulation in T cells, which suggests the direct involvement of NFAT in mediating the CD28 stimulatory signal. Using C. elegans–C. briggsae comparison, CompareProspector found the PHA-4 element from pharyngeally expressed genes and the UNC-86 element from genes known to be regulated by UNC-86.

CompareProspector outperformed many other computational regulatory element-finding programs tested, demonstrating the power of comparative genomics-based biased sampling in regulatory element identification.
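The idea of biasing a Gibbs sampler toward conserved regions can be sketched as follows. This is an illustrative assumption, not CompareProspector's actual scoring: the function names, the boost parameter, and the rule that scales each candidate site's likelihood ratio by the fraction of conserved positions it covers are all hypothetical.

```python
import random

def site_weights(seq, pwm, background, conserved, width, boost=5.0):
    """Sampling weight for every candidate motif start in one sequence.

    pwm        : list of dicts, pwm[k][base] = probability at motif column k
    background : dict, background[base] = background probability
    conserved  : list of 0/1 flags, one per sequence position (assumed input)
    boost      : hypothetical factor favouring fully conserved sites
    """
    weights = []
    for start in range(len(seq) - width + 1):
        ratio = 1.0
        for k in range(width):
            base = seq[start + k]
            ratio *= pwm[k][base] / background[base]
        frac = sum(conserved[start:start + width]) / width
        # Unconserved sites keep weight `ratio`; conserved ones get up to boost * ratio.
        weights.append(ratio * (1.0 + (boost - 1.0) * frac))
    return weights

def sample_site(weights, rng=random):
    """Draw one start position with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights)[0]
```

In a full sampler this draw would be repeated for each sequence in turn, with the motif matrix re-estimated from the currently selected sites after each draw.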


Poster P35

IMAGE-BASED MODELING OF BLOOD FLOW IN PULMONARY ARTERIES USING A ONE-DIMENSIONAL FINITE ELEMENT METHOD COUPLED TO A MORPHOMETRY-BASED MODEL OF THE DISTAL VESSELS

Ryan L. Spilker, David Parker, Jeffrey A. Feinstein and Charles A. Taylor

Purpose
The ability to simulate the hemodynamic outcomes of potential corrections of a particular patient’s cardiovascular disease will aid physicians in the treatment planning process. We have developed a method to simulate blood flow in a subject-specific model of the entire pulmonary arterial tree and applied this to the case of unilateral stenosis, a narrowing of a blood vessel often associated with congenital cardiovascular disease.

Materials and Methods
A stabilized, space-time finite element method [1] was employed to solve the nonlinear one-dimensional equations for blood flow in pulmonary arterial trees constructed from magnetic resonance angiography (MRA) data. Periodic flow, measured with cine phase contrast magnetic resonance imaging, was imposed at the model’s inlet. At each outlet of the image-based model, where the pulmonary arteries could no longer be resolved from MRA data, an impedance representing a linearized solution of blood flow equations in the distal vessels was imposed. Each tree of downstream vessels was created using morphometric data from the human pulmonary arteries, comprising the lengths, diameters, and connectivity of 15 orders of vessels [2].

Results
Hemodynamic simulations and corresponding experiments have been performed for the porcine pulmonary arteries in the presence and absence of a unilateral stenosis. The predicted and measured division of flow between the left and right lungs matched, and the predicted pressures were within a reasonable range. A simulation was also performed retrospectively on a model of a patient with repaired tetralogy of Fallot. The simulation of a treatment option for this patient demonstrated the potential ineffectiveness of the treatment plan.

Conclusion
Preliminary results show that modeling pulmonary hemodynamics with the methods described above holds promise for cardiovascular treatment planning. The novel use of morphometry-based models of the distal vessels coupled to image-based models of the proximal vessels provides a useful starting point to explore this alternative to fractal-based models of distal vascular trees.

References

1. Wan J, Steele BN, Spicer SA, Strohband S, Feijoo GR, Hughes TJR, and Taylor CA. A One-Dimensional Finite Element Method for Simulation-Based Medical Planning for Cardiovascular Disease. Computer Methods in Biomechanics & Biomedical Engineering, 5:195-206.

2. Huang W, Yen RT, McLaurine M, and Bledsoe G. Morphometry of the human pulmonary vasculature. J Appl Physiol, 81:2123-2133.


Poster P36

CALCIUM QUANTIFICATION IN THE AORTOILIAC ARTERIES: INTERSCAN VARIABILITY OF AGATSTON SCORING VS. AUTOMATED MASS QUANTIFICATION IN NONCONTRAST AND CONTRAST-ENHANCED SCANS

Bhargav Raman, Raghav Raman, Mercedes Carnethon, Stephen P. Fortmann, Sandy Napel and Geoffrey D. Rubin

Purpose
Multi-Detector Computed Tomography is increasingly used to evaluate atherosclerotic disease by quantifying the amount of dystrophic mural calcium (Ca) visualized in the walls of the arteries using an index called the Agatston score (AS). However, the AS can exhibit high variability, is only semi-quantitative, is time-consuming to calculate and is only accurate in non-contrast scans. We have developed an automated method of quantifying the actual mass of vascular Ca. In this study we compare the interscan variability of AS to that of our automated algorithm.

Materials and Methods
Twenty-two patients (11 normal) were scanned from mid-neck to the level of the knees on a 16-row MDCT scanner using noncontrast and contrast-enhanced protocols. To assess interscan variability, the contrast scan was repeated in 10 patients and the noncontrast scan was repeated in 12 patients. The AS and the mass (M) of mural Ca fragments were then automatically quantified by our algorithm, which uses a conversion factor derived from standards of known Ca density included in the scan field. Absolute interscan variability (v) was measured for AS and M, per scan and per fragment. Percentage interscan variabilities were compared using two-tailed paired t-tests.

Results
Patients had an average AS of 2064.4 ± 2453.7, with an average M of 355.4 ± 425.9 mg of Ca. In noncontrast scans, M had a v of 6.9 ± 6.7 mg (5.3% ± 4.8%) per scan; AS had a significantly higher v of 140.7 ± 190.9 units (12.2% ± 8.2%, p<0.01) per scan. Per-fragment v in noncontrast scans was 2.3 ± 1.2 mg (14.91% ± 4.9%) for M; AS had a significantly higher v of 22.2 ± 18.1 units (23.6% ± 13.6%, p<0.05). In contrast scans, M had a v of 9.3 ± 8.4 mg (8.5% ± 6.4%) per scan, while AS had a significantly higher v of 261.5 ± 292.4 units (26.5% ± 20.4%, p<0.05) per scan. Per-fragment v in contrast scans was 2.9 ± 1.5 mg (16.2% ± 5.3%) for M; AS had a significantly higher v of 43.7 ± 29.8 units (32.5% ± 16.4%, p<0.05). There was no significant difference in v of M between contrast and noncontrast scans per scan (p=0.743) or per fragment (p=0.329), while v of AS was significantly higher in contrast vs. noncontrast scans both per scan (p<0.05) and per fragment (p<0.05).

Conclusion
Automated Ca mass quantification with contrast correction has a lower interscan variability than traditional Agatston scoring in both noncontrast and contrast scans. Additionally, the variability is not affected significantly by contrast enhancement level, potentially allowing more accurate and rapid Ca quantification in the systemic arteries.

Web Page
http://3dlab.stanford.edu/BCATS/CalciumVariation/default.asp
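The two quantities compared above can be sketched in a few lines. The Agatston weighting shown is the standard published scheme (lesion area multiplied by a factor of 1 to 4 set by peak attenuation, with a 130 HU threshold). The mass formula and the definition of percentage variability relative to the mean of the two scans are assumptions for illustration, since the abstract does not give the authors' exact formulas, and all function names are hypothetical.

```python
def agatston_weight(peak_hu):
    """Standard Agatston density factor for a lesion's peak attenuation (HU)."""
    if peak_hu < 130:
        return 0
    if peak_hu < 200:
        return 1
    if peak_hu < 300:
        return 2
    if peak_hu < 400:
        return 3
    return 4

def agatston_score(lesions):
    """Sum of area (mm^2) times density factor over (area, peak_hu) pairs."""
    return sum(area * agatston_weight(peak) for area, peak in lesions)

def calcium_mass_mg(volume_mm3, mean_hu, mg_per_hu_mm3):
    """Assumed mass model: calibration factor x lesion volume x mean attenuation.

    mg_per_hu_mm3 stands in for the conversion factor derived from the
    calibration standards in the scan field (hypothetical parameter name).
    """
    return mg_per_hu_mm3 * volume_mm3 * mean_hu

def interscan_variability(first, second):
    """Return (absolute, percent) variability between two repeated measurements.

    Percent is taken relative to the mean of the pair (an assumption).
    """
    absolute = abs(first - second)
    mean = (first + second) / 2.0
    percent = 100.0 * absolute / mean if mean else 0.0
    return absolute, percent
```

Because the Agatston factor jumps at fixed attenuation thresholds, a small change in peak HU between scans can change a lesion's contribution by a whole step, which is one commonly cited reason for its higher variability than a continuous mass measure.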


Poster P37

MORPHOLOGICAL DIFFERENCES IN THORACIC AORTIC ANEURYSMS WHEN COMPARED TO NORMAL AORTAS – A COMPARATIVE STUDY IN 49 PATIENTS

Raghav Raman, Zhao Shaohung, Bhargav Raman, Sandy Napel and Geoffrey D. Rubin

Purpose
Thoracic aortic aneurysms (TAAs) can distort normal aortic anatomy by causing elongation, expansion and torsion, potentially causing difficulty during stent graft treatment planning and deployment. We aimed to characterize these morphological changes by comparing the shape and size profiles of the thoracic aorta in normal patients to patients with TAAs using Multi-Detector Row CT.

Materials and Methods
We obtained preoperative CT Angiograms (CTAs) of 32 patients with TAAs and 17 controls scanned for other indications. Using a previously developed algorithm, an initial path was calculated through the lumen of the aorta to indicate its course. The path was then iteratively centralized to the center of mass of the aorta along its length. Segments of the aorta were defined as the portions between the major branches of the aorta in this region. Using this central path, the diameter, length and curvature (tortuosity) of the segments were quantified and compared with the corresponding segments in normal patients.

Results
On average, the thoracic aorta was significantly larger in patients with aneurysms (34.7 vs 26.1 mm, p<0.01) with the largest difference being distal to the branch point of the left subclavian artery (LSCA) (34.7 vs 25.3 mm, p<0.01). Patients with TAAs had thoracic aortas that were on average significantly longer (347.0 mm vs 288.9 mm, p<0.01), but the only segment that contributed significantly to this disparity of length was the aorta distal to the LSCA (245.2 vs 192.1 mm, p<0.01). There was no difference in overall average curvature (15.5 vs 14.8 m⁻¹, p=0.17). Only the segment bounded by the LSCA proximally and the proximal neck of the aneurysm distally had a significantly higher curvature in patients with aneurysms (23.1 vs 13.7 m⁻¹, p<0.01).

Conclusion
The morphological differences in aortas with thoracic aortic aneurysms are found in relatively specific areas. While the most well known effect of aneurysmal change is an increase in diameter, there is also a significant increase in length distal to the LSCA and a significant increase in the curvature of the proximal neck of the aneurysm. Further studies to delineate the time course of these changes in morphology may be warranted to aid preoperative stent graft planning and deployment and improve follow-up for thoracic aortic aneurysms.

Web Page
http://3dlab.stanford.edu/BCATS/TAA/default.asp
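Length and curvature along a central path can be estimated from its sample points, as in the minimal sketch below. The abstract does not state how curvature was computed; the three-point (Menger) estimate used here is a swapped-in, generic technique, and the function names are hypothetical. Points are assumed to be in millimetres, so curvature is scaled by 1000 to report it in m⁻¹.

```python
import math

def _dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def path_length(points):
    """Total length of a polyline through the given 3D points."""
    return sum(_dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def menger_curvature(p, q, r):
    """Curvature (1 / circumradius) of the circle through three 3D points."""
    a, b, c = _dist(p, q), _dist(q, r), _dist(p, r)
    if a * b * c == 0.0:
        return 0.0
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    twice_area = math.sqrt(sum(x * x for x in cross))
    # Menger curvature = 4 * area / (a * b * c)
    return 2.0 * twice_area / (a * b * c)

def mean_curvature_per_m(points_mm):
    """Mean three-point curvature along a path given in mm, returned in 1/m."""
    values = [menger_curvature(points_mm[i - 1], points_mm[i], points_mm[i + 1])
              for i in range(1, len(points_mm) - 1)]
    return 1000.0 * sum(values) / len(values) if values else 0.0
```

For scale, a path following a circular arc of radius 50 mm has a curvature of 20 m⁻¹, which is in the range of the segment values reported above.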


Poster P38

2D/3D REGISTRATION FOR IMAGE GUIDANCE IN INTERVENTIONAL RADIOLOGY

Joyoni Dey, Markus Kukuk and Sandy Napel

Purpose
Interventional radiology commonly uses 2D fluoroscopic images (based on X-ray absorption) to guide minimally invasive procedures. Depth information is not retained, which sometimes makes guidance difficult. We aim to incorporate information from preoperatively acquired 3D tomographic images (e.g., CT and MR), which requires spatial registration of 2D images with a 3D volume.

Materials and Methods
For this initial evaluation, we simulated 2D fluoroscopic images corresponding to a 3D CT dataset by positioning a virtual camera and computing raysums (reprojection) along divergent rays. We then attempted to match the resulting Digitally Reconstructed Radiographs (DRRs) to projections through the 3D dataset. This 3D-to-2D registration can be formulated as the problem of finding the 6 parameters of the camera that produced the 2D image. The idea is to find the optimum value, over the 6D parameter space, of an image metric by comparing the original DRR to new projections through the 3D volume. We tested the performance of two metrics: 1) maximizing mutual information and 2) minimizing image dissimilarity. In case 1) we performed gradient ascent to maximize the mutual information [1]. For faster convergence, we used a two-pass approach, using smoothed images first and then original-resolution images. Learning rates were changed dynamically. In case 2) we measured the dissimilarity between the two images by means of the L1-norm and found the minimum by using a best-neighbor optimizer [2]. To speed up the calculation of the L1-norm we implemented a stochastic approach called SSDA (Sequential Similarity Detection Algorithm). To increase the capture range and robustness we implemented a multi-resolution framework.

Results
Metric 1; object rotation only: We produced a 2D DRR reference image and subsequently rotated the object by (10, 10, 20) deg to produce the starting image for the optimization process. The mutual information gradient ascent converged smoothly to the reference image within 0.5 deg about all axes in 40 iterations. Metric 2; camera rotation and translation: We produced a 2D DRR reference image and subsequently reoriented the camera by (37 deg, 43 deg, 44 mm, 41 mm, 51 mm, 26 deg). The algorithm converged within 0.75 deg about all axes and 1.3 mm along all axes in 50 iterations. See the web page for figures illustrating the results of both approaches.

Conclusion
Both approaches were found to be promising and the results justify further research. Our next step is to test the algorithms on actual fluoroscopic datasets.

Web Page for Figures and References
http://snapg4.stanford.edu/~snapel/bcats03/dk.htm
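The two image metrics named above can be sketched for flattened grey-level images. This is a generic illustration under stated assumptions, not the authors' code: mutual information is estimated from a simple joint histogram with a hypothetical bin count, and the SSDA-style function shows only the early-termination idea (stop accumulating the L1 error once it exceeds the best value found so far) with a caller-supplied pixel order.

```python
import math
from collections import Counter

def mutual_information(img_a, img_b, bins=8, max_value=256):
    """Histogram estimate of mutual information (in nats) between two images.

    img_a, img_b : equal-length flat sequences of intensities in [0, max_value)
    """
    if len(img_a) != len(img_b) or len(img_a) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(img_a)
    qa = [min(bins - 1, int(v * bins / max_value)) for v in img_a]
    qb = [min(bins - 1, int(v * bins / max_value)) for v in img_b]
    joint = Counter(zip(qa, qb))
    count_a = Counter(qa)
    count_b = Counter(qb)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (count_a[a] * count_b[b]))
    return mi

def l1_with_early_stop(img_a, img_b, best_so_far, order):
    """Accumulate |a - b| over pixels in `order`; give up once best_so_far is exceeded.

    Returns the full L1 distance, or None if this candidate cannot beat the best.
    """
    total = 0.0
    for idx in order:
        total += abs(img_a[idx] - img_b[idx])
        if total > best_so_far:
            return None
    return total
```

In an optimizer loop, `order` would typically be a random permutation of pixel indices so that poor candidates are rejected after only a few pixels.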


Poster P39

INDEPENDENT COMPONENT ANALYSIS (ICA) FOR REMOVING BALLISTOCARDIOGRAM AND OCULAR ARTIFACTS FROM EEG DATA ACQUIRED INSIDE AN MRI SCANNER

Gaurav Srivastava, Vinod Menon, Sonia Crottaz-Herbett and Gary H. Glover

Purpose
Combining information from Functional Magnetic Resonance Imaging (fMRI) and Electroencephalography (EEG) holds great promise for examining the spatial and temporal dynamics of sensory and cognitive processes underlying brain function. We present our work on removing various kinds of artifacts from simultaneously acquired interleaved EEG/fMRI data. We have concentrated on removing the Ballistocardiogram artifact, which is caused by involuntary head and body movements due to cardiac pulsation, and the Ocular artifacts, which are caused by eye-blink and slow eye movement activity. These motion-related artifacts result from small but regular movements of the subject’s body inside the magnetic environment of the MRI scanner, which induce an EMF (Electromotive Force) at the scalp electrodes recording the neural activity in the brain.

Materials and Methods
We use Independent Component Analysis (ICA) for isolating and removing the artifacts from the EEG channel data. ICA is a statistical blind source separation technique. It decomposes a set of simultaneously recorded multi-channel data into components that are statistically independent of each other. The great usefulness of ICA lies in the fact that, being a blind source separation technique, it does not make any assumptions either about the nature of the input data or about the mixing process through which various independent neurological and other artifactual sources have combined to give the multi-channel EEG recordings on the scalp electrodes.

Results
ICA was able to successfully identify and isolate components corresponding to Ballistocardiogram and Ocular artifacts, and clean EEG data was obtained after removing these artifactual components. The spectral power in harmonics of the fundamental EKG frequency (~1.1 Hz) was substantially reduced, demonstrating that the Ballistocardiogram artifact has been effectively eliminated from the data.

Conclusion
The main advantage of ICA is that it gives a clear visualization of, and distinction between, components corresponding to genuine neural activity in the brain and artifactual components arising from other independent physiological processes like cardiac pulsation and eye movement.

References

1. Comon, P, “Independent Component Analysis, a new concept?”, Signal Processing, vol. 36, 1994, pp. 287-314

2. Bonmassar, G. et al., “Motion and Ballistocardiogram Artifact Removal for Interleaved Recording of EEG and EPs during MRI”, NeuroImage 16, 1127-1141 (2002)
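The removal step described in this abstract (project the EEG onto independent components, discard the artifactual ones, and project back) can be sketched as follows. The sketch assumes the unmixing matrix W and its inverse, the mixing matrix A, have already been estimated by some ICA routine; estimating them is not shown, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def remove_components(eeg, unmixing, mixing, artifact_idx):
    """Return EEG with the listed independent components removed.

    eeg          : channels x samples matrix
    unmixing     : W, maps channel data to component activations (assumed given)
    mixing       : A = W^-1, maps components back to channels (assumed given)
    artifact_idx : indices of components judged to be artifacts
    """
    sources = matmul(unmixing, eeg)
    for i in artifact_idx:
        sources[i] = [0.0] * len(sources[i])
    return matmul(mixing, sources)
```

Which components count as artifacts is a separate decision; the abstract's check on spectral power at harmonics of the EKG frequency is one way such a choice could be validated.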


Poster P40

INFERRING MOTIFS THAT MEDIATE PROTEIN INTERACTIONS

Haidong Wang, Eran Segal, Asa Ben-Hur, Douglas Brutlag and Daphne Koller

Purpose
Many cellular functions are carried out through interactions between proteins. With the advancement of proteomic technology, large-scale protein interaction data are being generated [1]; these data, though noisy, provide an opportunity to obtain a genome-wide view of protein interactions. However, there are relatively few measurements of interactions at the residue level. To probe protein interaction on a finer scale, we propose to study interactions computationally at the level of sequence motifs. Motifs, i.e., evolutionarily conserved sequence elements, are known to be responsible for functional sites of proteins, such as catalytic or binding sites.

Materials and Methods
We construct a probabilistic model that explains an interaction between a pair of proteins as the binding between the motifs that are active in the interaction. The probability that binding occurs depends on the affinity between the motifs. Based on the observed protein interactions and the motif compositions of the proteins, our model learns the active motifs in all the proteins and the binding affinity between motifs. We use the EM algorithm for training the model, and a branch-and-bound algorithm to deal with the exponential number of possible assignments to motif activities.

Results
We applied our method to a dataset of 13,340 protein interactions among 3,485 yeast proteins, which have 39,454 motif hits from the eMOTIF database. We predicted 10,970 motifs to be active. We also inferred motif interactions from the co-crystallized chains in PDB and compared these with predictions from our method. We found significant enrichment of interacting motifs inferred from PDB among the active motifs predicted by our method. To further validate the biological assumptions in our model, we varied our models along two axes: allowing motifs to be inactive vs. assuming that all motifs are active; and using motifs vs. longer Prosite domains. We trained each of these four models, and for each computed the average number of shared GO functions and expression correlation for protein pairs predicted to interact. The results show that the protein pairs predicted by our model – using motifs, and allowing for inactivity – achieve higher coherence for both functional annotation and expression profile.

Conclusion
Our method is a step towards understanding the mechanics of protein interactions using a sequence-based analysis, without doing a full protein-structure prediction or docking analysis.

References

1. Breitkreutz, BJ., Stark, C., Tyers M. (2003) “The GRID: The General Repository for Interaction Datasets” Genome Biology 4(3) R23
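One simple way to turn motif-level binding affinities into a protein-level interaction probability is a noisy-OR: the proteins interact unless every pair of active motifs fails to bind. The abstract does not specify the functional form of the authors' model, so the noisy-OR below, the function name, and the dictionary layout are assumptions made purely for illustration.

```python
def interaction_probability(active_a, active_b, affinity):
    """Noisy-OR probability that two proteins interact.

    active_a, active_b : sets of motif names active in each protein
    affinity           : dict mapping frozenset({m, n}) to a binding
                         probability in [0, 1]; missing pairs count as 0
    """
    p_no_binding = 1.0
    for m in active_a:
        for n in active_b:
            p_no_binding *= 1.0 - affinity.get(frozenset((m, n)), 0.0)
    return 1.0 - p_no_binding
```

Under this form, marking a motif inactive simply removes its pairs from the product, which is how allowing inactivity can lower the predicted probability for proteins that merely share common motifs.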


Poster P41

COMMUNITY BASED APPROACH TO MICROARRAY RESEARCH

Brian Null and QuangQiu Wang

Purpose
Optimized, biased clustering of a large microarray data set to detect and investigate detailed biology that is otherwise lost within a large data set, and facilitation of distributed analysis via a Web-Based Cluster Analysis Platform.

Project Summary
Having completed a high-resolution transcriptional description of the life cycle of Drosophila using DNA microarrays, we found that a staggering quantity of biologically interesting patterns can be extracted by a simple specialized method of clustering these data, which we refer to as the ‘sliding window’ view of development. Essentially, if one clusters temporal subsets of time course data separately, high-resolution patterns of gene use begin to emerge. Not only does this method produce a more logically valid analysis of microarray data, but it reveals tremendous detail that is present within, but obscured by, the typical broad analyses published for these large data sets. We have developed a web-based tool that allows specialists in various aspects of biology and medicine to browse this large reference data set optimized for the area of specialization of the user. We feel that a tool that can organize and serve genomic data in an intuitive, interactive manner, and also allow databasing of user annotations, could potentially yield much greater utility than the existing paradigm of one-off, static publication of data in subscription-based journals.

Materials and Methods
In this project, we provide an interface which allows scientists to discuss and annotate data from microarray experiments in a Slashdot-like community. Microarray datasets are presented in a web interface, where users can annotate and discuss the experiment data using a web browser. User annotation and discussion are organized and easily searchable. To aid further data analysis, we also use Java Web Start to deliver software tools, such as Cluster, TreeView, and a custom pattern-finding application. Thus the tool allows for feedback annotation, pattern searching, hypothesis generation, pooling of intellectual resources, and organization of research efforts within the scientific community.

Results/Conclusion
In the case of a large time course microarray data set, we can expect that the data will tend to be most logically valid if we study regions of development by comparing their set of experiments, or time points, only to those closest to them, rather than to the whole data set of development. Therefore, we believe that the most informative method of describing the transcriptional development of the fly is as a series of localized clustergrams, each a few time points wide, spanning the data set in overlapping steps. Thus far we have constructed two such overlapping sliding window views of the Drosophila life cycle, using clustering windows both 5 time points and 7 time points wide. We have rapidly discovered that all time points have sudden, dramatic changes and dynamic, localized patterns of gene expression represented in the data. Upon examining these data, one can begin to fully grasp the sheer volume of browseable biological phenomena that appear to be represented within such a large and complex data set.

References

1. M. Arbeitman, `E. Furlong, `F. Imam, `E. Johnson, `B. Null, `B. Baker, M. Krasnow, M. Scott, R. Davis, K. White; “Gene Expression During the Life Cycle of Drosophila melanogaster”; Science, 27 Sep 2002, pp. 2270-2275 (`Co-first authors)
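The ‘sliding window’ view described in this abstract amounts to clustering overlapping runs of adjacent time points separately. A minimal sketch of generating those windows is given below; the function names and the step of one time point between windows are assumptions, since the abstract states only the window widths (5 and 7) and that the windows overlap.

```python
def sliding_windows(time_points, width, step=1):
    """Return overlapping runs of `width` consecutive time points."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    return [time_points[i:i + width]
            for i in range(0, len(time_points) - width + 1, step)]

def window_submatrices(expression, width, step=1):
    """Slice a genes x time-points matrix into one submatrix per window.

    Each submatrix could then be passed to any clustering tool on its own.
    """
    n_points = len(expression[0]) if expression else 0
    return [[row[start:start + width] for row in expression]
            for start in range(0, n_points - width + 1, step)]
```

Each submatrix would then be clustered independently, giving the series of localized clustergrams described above.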


Poster P42

PROBABILISTIC DISCOVERY OF OVERLAPPING CELLULAR PROCESSES AND THEIR REGULATION

Alexis Battle, Eran Segal and Daphne Koller

Purpose
Many of the functions carried out by a living cell are regulated at the transcriptional level, to ensure that genes are expressed when they are needed. To understand biological processes, it is thus necessary to understand this transcriptional network. We propose a novel probabilistic model of gene processes and regulation. The key features of our approach are that genes can participate in multiple processes, and that we include a regulatory program for each process.

Materials and Methods
We described our model in terms of Probabilistic Relational Models, representing the domain in terms of the relevant biological entities: genes, arrays, expression levels, regulators, and biological processes. Each gene may belong to any of the processes. Each regulator has an expression measurement in each array, and each process has an activity level in each array. Regulator expression levels predict the activity level of each process, and the activity of each process and a gene’s memberships predict its final expression level. We model the probabilistic relationships between these attributes. This approach allowed us to learn, using methods based on structural EM, which genes belong to each process, which regulators regulate each process, and how active each process is in each array.

Results
We applied our learning algorithm to a large database of yeast expression data, composed by combining four data sets, with expression measurements for 5747 genes in 394 different experiments. According to GO annotations, our model assigned genes with similar functions to the same process with greater significance than a model with no regulatory programs or a standard clustering model. Also, our process activity levels corresponded well to annotations of each array indicating which biological functions would be affected.

Conclusion
We show that by modeling both the assignment of genes to overlapping cellular processes and the regulatory programs of these processes, we obtain better models than standard approaches or models with no regulatory information. We are able to predict gene process membership, meaningful process activity levels, and regulatory relationships, all from microarray data.

References

1. E. Segal, A. Battle, and D. Koller. Decomposing Gene Expression into Cellular Components. Proc. PSB, 2003.

2. E. Segal, M. Shapira, A. Regev, D. Pe’er, D. Botstein, D. Koller, and N. Friedman. Module Networks: Identifying Regulatory Modules and their Condition-Specific Regulators from Gene Expression Data. Nature Genetics, 2003.

3. N. Friedman. The Bayesian Structural EM Algorithm. Proc. UAI, 1998.

SYMPOSIUM PARTICIPANT LIST


Nima A., Electrical Engineering
Ali El-Sayed Abbas, Management Science and Engineering
Laurent Abi-Rached, Structural Biology
Miwa Akiyama, DMS
Murat Aksoy, Electrical Engineering
Alfred, Electrical Engineering
Ruben Almaraz, UC Davis
Mohammed AlQuraishi, Wolfram Research, Inc.
Russ Biagio Altman, Genetics
Josh Alwood, Aeronautics and Astronautics
Christopher Amos, Chemistry
Anew Chongmin An, Stanford Biotechnology Quarterly
M. Serkan Apaydin, Electrical Engineering
Jack C. Armstrong, Jack C. Armstrong & Associates
Mike Asmar, Mathematical and Computational Science
Orna Avsian-Kretchmer, Obstetrics and Gynecology
Brian Babcock, Computer Science
Christopher Baer, Human Biology
Yu Bai, Biophysics
Kai Bao, Genetics
Alexis Battle, Computer Science
Laura Baudis, Physics
Michael Baudis, Pathology
Michael R Bax, Electrical Engineering
Henry van den Bedem, Stanford Synchrotron Radiation Lab
Erik Jan Bekkers, Mechanical Engineering

Asa Ben-Hur, Biochemistry
Sylvia Bereknyei, Anthropological Sciences
Aviv Bergman, Biological Sciences
Benjamin Bernard Berk, School of Medicine
Elliot Berkman, Psychology
Thor Besier, Mechanical Engineering
Gail A. Binkley, Genetics
Jonathan Binkley, Pathology
Silvia Salinas Blemker, Mechanical Engineering
Nikolas Blevins, Surgery
Roy Bohenzky, Roche Bioscience
Jim Bowlby, Ingenuity Systems
Onn Brandman, Molecular Pharmacology


Relly Brandman, Molecular Pharmacology
Doug Brutlag, Biochemistry
David George Bryder, Pathology
Tim Burcham, diaDexus, Inc
Kaizad Cama, Biomedical Informatics
Carlos Cardenas, Radiation Oncology, Physics Division
John D Carpenter, Biochemistry
Dick Carter, Hewlett Packard
Aranzazu Casal, Mechanical Engineering
John Cavallaro, Management Science and Engineering
Elizabeth Danhwa Chao, Biochemistry
Angi Chau, Electrical Engineering
Gal Chechik, Computer Science
Steve Chen, Sun Microsystems
Swaine Lin Chen, Developmental Biology
Wendy Cheng, Mechanical Engineering
Ashley Chi, Biochemistry
Lillian T. Chong, Chemistry
Hediye Nese Cinar, UC Santa Cruz
Hulusi Cinar, UC Santa Cruz
Howard J. Cohen, Cohen Software Consulting, Inc.
Gregory Michael Cooper, Genetics
Darwin Cruz, Biomedical Computation
Bernie Daigle, Genetics
Eugene Davydov, Computer Science
Anea Decker, Chemistry
Robert DeConde, Biomedical Informatics
Scott Delp, Mechanical Engineering - Biomechanical Engineering
Janos Demeter, Biochemistry
Joyoni Dey, Radiology
Jeanne Geczi Digel, Molecular Pharmacology
Vesselin Ivaylov Dimitrov, Computer Science
Epic Ding, Tularik, Inc.
Yi Ding, Scientific Computing and Computational Mathematics
Nhan Van Do, Biomedical Informatics
Tom Ba Do, Computer Science
Stan Dong, Genetics
Beate Dorow, Center for the Study of Language and Information
Chris Duffield, Pediatrics
Helmy Atef Eltoukhy, Electrical Engineering


Mark Engelhardt Biochemistry [email protected]
Lawrence M Fagan Biomedical Informatics [email protected]
Ali Faghfuri Electrical Engineering [email protected]
Zhenbin Fan Urology [email protected]
Zihong Fan Electrical Engineering [email protected]
Timothy D. Fenn Molecular and Cellular Physiology [email protected]
John Fernandes Biological Sciences [email protected]
Nathan Elliot Floyd [email protected]
Anthony Fodor Molecular and Cellular Physiology [email protected]
Robert French [email protected]
Hao Fu Electrical Engineering [email protected]
Irene Gabashvili Hewlett Packard [email protected]
Raymond Gan Foothill College [email protected]
Tia Gao Electrical Engineering [email protected]
Amit Garg Biomedical Informatics [email protected]
Andrew Gentles Mathematics [email protected]
Sam Z. Glassenberg Computer Science [email protected]
Vimal Goel Multi-Organ Transport Program [email protected]
Mike Goelzer [email protected]
David Goldberg Palo Alto Research Center [email protected]
Carlos Alberto Gomez-Uribe Electrical Engineering [email protected]
Paul Thomas Gurney Electrical Engineering [email protected]
David Gutierrez Electrical Engineering [email protected]
Zoltan Gyongyi Computer Science [email protected]
Thomas D. Haggerty Mechanical Engineering [email protected]
Thorsten Hamann Dept. of Plant Biology Carnegie Institution [email protected]
Dan Harris Molecular Research Institute
Eddy Hartanto Computer Science [email protected]
Lisa He [email protected]
Michael He Statistics [email protected]
Noah Charles Helman Applied Physics [email protected]
Kirk Hirano [email protected]
Edmond Ho Computer Science [email protected]
Tad Hogg Hewlett Packard [email protected]
Daniel Wellesley Holbert School of Engineering [email protected]
Susan P Holmes Statistics [email protected]
Wan-Jen Hong Mechanical Engineering [email protected]
Shawn Hoon Genetics [email protected]
Alison Kay Hottes Electrical Engineering [email protected]
Jennifer Hsu NASA Ames Research Center [email protected]
Jianqiang Huang Genetics [email protected]
Joseph Huang Obstetrics and Gynecology [email protected]


Yang Huang Stanford Medical Informatics [email protected]
Joe S. Hur Mechanical Engineering [email protected]
Samuel Sze Ming Ieong Computer Science [email protected]
Joji Inamasu Neurosurgery [email protected]
Takanari Inoue Molecular Pharmacology [email protected]
Jared Jacobs Computer Science [email protected]
Orion Jankowski Chemistry [email protected]
Guha Jayachandran Computer Science [email protected]
Kristian Jessen Petroleum Engineering [email protected]
Huan Jiang Computer Science [email protected]
Dave Johnson Genetics [email protected]
Sonal Josan Electrical Engineering [email protected]
Ioulia Kachirskaia Biological Sciences [email protected]
Kimberly Ann Kafadar Biology [email protected]
Joseph Kahn Management Science and Engineering [email protected]
Penjit Kanjanarat Stanford Linear Accelerator Center [email protected]
Amit Kaushal Biomedical Informatics [email protected]
Sam Kavusi Electrical Engineering [email protected]
Brett T. Kawakami Civil and Environmental Engineering [email protected]
Nicholas William Kelley Biophysics [email protected]
Caleb Tilo Kemere Electrical Engineering [email protected]
Viktoria Kheifets Molecular Pharmacology [email protected]
Natalia Khuri San Francisco State University [email protected]
Sami Khuri San Jose State University [email protected]
Man Lyang Kim Molecular Pharmacology [email protected]
Peter Kim Mathematics [email protected]
Jeff Klingner Computer Science [email protected]
Pete Klosterman UCSF [email protected]
Rachel Kolodny Computer Science [email protected]
Seungbum Koo Mechanical Engineering [email protected]
Nikesh Kotecha Biomedical Informatics [email protected]
Charles Kou [email protected]
Marc E. Kowalski Stanford Linear Accelerator Center [email protected]
John R Koza Mechanical Engineering [email protected]
Lee G. Kozar Molecular and Genetic Medicine [email protected]
Markus Kukuk Radiology [email protected]
Chethana Kulkarni Biological Sciences [email protected]
Anand Kumble Pointilliste, Inc. [email protected]
Jacob S Lai Chemistry [email protected]
Andrew A. de Laix Wolfram Research, Inc. [email protected]
Shirley Lam UC Santa Cruz [email protected]


Rasmus Larsen Statistics [email protected]
Hong-Lak Lee Applied Physics [email protected]
Kaman Lee CMGM [email protected]
Karen Lee Biological Sciences [email protected]
Kyung-Hee Lee Mechanical Engineering [email protected]
Sung-Joo E. Lee Chemistry [email protected]
Doron Levy Mathematics [email protected]
Jia Li Graduate School of Business [email protected]
Ping Li Statistics [email protected]
Mike Hsin-Ping Liang Biomedical Informatics [email protected]
Jung-Chi Liao UC Berkeley [email protected]
Jason Lih Genetics [email protected]
Barbara Lin diaDexus, Inc [email protected]
Christopher C. Lin School of Engineering [email protected]
Richard Lin Genetics [email protected]
Steven Lin Genetics [email protected]
Bruce Ling Tularik, Inc. [email protected]
Jen Liou Molecular Pharmacology [email protected]
Jan Lipfert Physics [email protected]
Jane Liu Tularik, Inc. [email protected]
May Liu Mechanical Engineering [email protected]
Mon-Chi Marcus Liu Aeronautics and Astronautics [email protected]
Shuo Liu Biomedical Informatics [email protected]
Tommy Fuliswa Liu Mechanical Engineering [email protected]
Yueyi Liu Biomedical Informatics [email protected]
Scott Lohr AGY Therapeutics, Inc. [email protected]
Fiona Loke Electrical Engineering [email protected]
Itay Lotan Computer Science [email protected]
Brent Louie Incyte Corporation [email protected]
David Lowsky Computer Science [email protected]
Man Yu Lui Computer Science [email protected]
Mark Luo Computer Science [email protected]
Juntao Ma Microbiology and Immunology [email protected]
Mohammed H. Mahbouba Biomedical Informatics [email protected]
Mars Mallari Pathology [email protected]
Daniel Margolis Radiology [email protected]
Athina Markopoulou Electrical Engineering [email protected]
Peter Markstein Hewlett Packard Labs [email protected]
Vicky Markstein in Silico Labs [email protected]
Shahriyar Matloub Electrical Engineering [email protected]
Neville Zarir Mehenti Chemistry [email protected]
Nipun Mehra Computer Science [email protected]


Milton J. Merchant Microbiology and Immunology [email protected]
Sia Meshkat Protein Mechanics, Inc. [email protected]
Nesanet Mitiku Genetics [email protected]
Sean Mooney Genetics [email protected]
Martin Moorhead ParAllele BioScience [email protected]
Sergio Moreno Physics [email protected]
Dan Morris Computer Science [email protected]
Kiran Mukhyala Protein Mechanics, Inc. [email protected]
Jaji Murage Molecular Pharmacology [email protected]
Joseph Murray AGY Therapeutics, Inc. [email protected]
Sandy Napel Radiology [email protected]
Brian Thomas Naughton [email protected]
Nihar R Nayak Obstetrics and Gynecology [email protected]
Pandurang Nayak Computer Science [email protected]
Brian Null Stanford Genome Tech Ctr/BIO-X [email protected]
William Ogle Biological Sciences [email protected]
Elana Okrent University of Washington [email protected]
John Gordon Olyarchuk [email protected]
Steve Osborn [email protected]
Art B Owen Statistics [email protected]
Michael Thomas Padilla Electrical Engineering [email protected]
David Seungwon Paik Radiology [email protected]
Krishna Palaniappan Gene Logic, Inc. [email protected]
Adam Palermo Molecular Pharmacology [email protected]
Sam Pan Tularik, Inc. [email protected]
Vijay S. Pande Chemistry [email protected]
Zac Panepucci Molecular and Cellular Physiology [email protected]
Samuel M. Pearlman Computer Science [email protected]
Boris Peker Biophysics [email protected]
Alexandros Pertsinidis Physics [email protected]
Nick F. Peterson San Francisco State University [email protected]
Nicolas Peyret Applied Biosystems [email protected]
Jan Benjamin Pietzsch Management Science and Engineering [email protected]
Zach Pincus Biomedical Informatics [email protected]
Carla D. Pinon Computer Science [email protected]
Ting P Pun [email protected]
Elizabeth Anne Purdom Statistics [email protected]
Sunil Puria Mechanical Engineering [email protected]
Claudia Pérez-Maldonado Universidad Autónoma Metropolitana [email protected]
Sriram Ramami Uma Patchava BizDNA [email protected]


Bhargav Raman Radiology [email protected]
Raghav Raman Radiology [email protected]
Prashanth Ranganathan Biomedical Informatics [email protected]
Tanya Raschke Structural Biology [email protected]
Jaideep Srinivas Ravela Computer Science [email protected]
Soo-Yon Rhee Infectious Diseases [email protected]
Young Min Rhee Chemistry [email protected]
Nadeem Riaz Mechanical Engineering [email protected]
Kenneth Lloyd Rider Psychiatry and Behavioral Sciences [email protected]
Jee Eun Rim Materials Science and Engineering [email protected]
Daniel Riordan Genetics [email protected]
Nick Paul Robertson Structural Engineering [email protected]
Graham Rodwell Nephrology and Developmental Biology [email protected]
Torsten Rohlfing Neurosurgery [email protected]
Daniel B. Russakoff Computer Science [email protected]
Mitul Saha Mechanical Engineering [email protected]
Khaled Nabil Salama Electrical Engineering [email protected]
Diane Irene Schroeder Biomedical Informatics [email protected]
Armin Schwartzman Statistics [email protected]
Eran Segal Computer Science [email protected]
Anand Sethuraman Genetics [email protected]
Christopher Sewell Computer Science [email protected]
Ross D Shachter Management Science and Engineering [email protected]
Gavin James Sherlock Genetics [email protected]
Cindy Shi Radiology [email protected]
Gireesh Shrimali Electrical Engineering [email protected]
Mark Siegal Biological Sciences [email protected]
Paul Sigala Biochemistry [email protected]
Jae Hoon Sim Mechanical Engineering [email protected]
Natalie Renee Simmons Biological Sciences [email protected]
Nina Singhal Computer Science [email protected]
Marina Sirota Biomedical Computation [email protected]
Jon Slenk [email protected]
Kerrin Small Genetics [email protected]
Armand Q. Smith Beckman Center [email protected]
Christopher Davis Snow Biophysics [email protected]
Monika Sobczyk Mountain View Pharmaceuticals [email protected]
Eric Joseph Sorin Chemistry [email protected]
Lucinda Southworth Biomedical Informatics [email protected]
Ryan L. Spilker Mechanical Engineering [email protected]
Sreedevi [email protected]
Kunju Joshi Sridhar Hematology [email protected]


Balaji Srinivasan Electrical Engineering [email protected]
Gaurav Srivastava Electrical Engineering [email protected]
Josh Stuart Biomedical Informatics [email protected]
Jayanthi Subramani Tularik, Inc. [email protected]
JingLucy Sun Structural Biology [email protected]
Shaohua Sun Electrical Engineering [email protected]
Padmavathi Sundaram Electrical Engineering [email protected]
Patrick Suppes Computer Science [email protected]
Karl G Sylvester Surgery [email protected]
Sara Tanenbaum Ingenuity Systems [email protected]
Beverly Tang Mechanical Engineering [email protected]
Hui Tang UC Berkeley [email protected]
Holly Tao Abgenix [email protected]
Andres Bayani Tellez Biomedical Informatics [email protected]
Jessie Dale Tenenbaum Biomedical Informatics [email protected]
Steven Teo Biological Sciences [email protected]
Mary Frances Nunez Teruel Molecular Pharmacology [email protected]
Mayank Thanawala diaDexus, Inc [email protected]
Daryl Thomas UC Santa Cruz [email protected]
Matt Acker Thompson Graduate School of Business [email protected]
Agnes L Tin Undergraduate Advising Center [email protected]
Solina Tith Undeclared [email protected]
Dikran Tokmakjian Santa Clara University [email protected]
David C. Tong Neurology [email protected]
Gergely Toth Protein Mechanics, Inc. [email protected]
Garrick Toubassi Protein Mechanics, Inc. [email protected]
Jim Turner [email protected]
Seiji Ueda Biomechanical Engineering Division [email protected]
Rebecca Lynne Unruh Residential Education [email protected]
Francisco Useche UC Santa Cruz [email protected]
Elwyn Timothy Uy Applied Physics [email protected]
Steve Vassallo [email protected]
Irene Vignon-Clementel Mechanical Engineering [email protected]
Neel Vora Aeronautics and Astronautics [email protected]
Ahmad Waleh TaoSecure
Kay Walter Foothill College/NASA Ames [email protected]
Chia-wei Wang Chemistry [email protected]
Dongmei Wang Urology [email protected]
Haidong Wang Computer Science [email protected]
Haili Wang Obstetrics and Gynecology [email protected]
Huiquan Wang Genome Center, UCD [email protected]
QuangQiu Wang Stanford Genome Tech Ctr/BIO-X [email protected]


Stream Su-Ching Wang Developmental Biology [email protected]
Zheng Wang Applied Physics [email protected]
Marion Webster Applied Biosystems [email protected]
Edwin Wee Electrical Engineering [email protected]
Jonathan Wickham [email protected]
Dominic Widdows Computer Science [email protected]
Cyrus A Wilson Biochemistry [email protected]
Michael Wittbrodt Mechanical Engineering [email protected]
Ben Wong Computer Science [email protected]
Dik Kin Wong Electrical Engineering [email protected]
Eric Wong
Adam Wright Statistics [email protected]
Bo Wu Pediatrics [email protected]
Xie Lillian Xu Sagres Discovery [email protected]
Lily Yang Statistics [email protected]
Wen Jen Yang Biological Sciences [email protected]
Iwei Yeh Biomedical Informatics [email protected]
Jung-Hua Yeh Microbiology and Immunology [email protected]
Ru-Fang Yeh UCSF [email protected]
Chih-Han Yu Computer Science [email protected]
Kwong Hiu Yung Statistics [email protected]
Sheng Zhao Psychology [email protected]
Yubo Zhou Electrical Engineering [email protected]
Yan Zhu Graduate School of Business [email protected]
Feng Zhuge Electrical Engineering [email protected]


SPONSOR PARTICIPANTS

Affymetrix
Alloy Ventures: Doug Kelley, J. Leighton Read, Craig Taylor
Apple Computers: Silvia Herrero-Duvaras
Celera Diagnostics: Harrison Leong, Bob Kohlenberger
Genentech: Brian Desany, Ken Jung, Jingtao Sun
Hopkins and Carley: Gail Hashimoto
Incyte Corporation: Ning Lan, Mirjana Marjanovic, Betsy Mooney, Iqbal Panesar, Xinhao Wang
Roche Bioscience: Rachel Li, Lada Markovtsova, Arjun Vadapalli
Silicon Graphics: Katrina Feliciano-Stoddard, Darrow Wehara
Sun Microsystems: Loralyn Mears, Ismet Nesicolaci, Robin Sun, Stefan Unger


SYMPOSIUM SPONSORS

Tier 1 Sponsors
• BioX
• Incyte Corporation
• NIH BISTI
• Silicon Graphics, Inc.
• Sun Microsystems

Tier 2 Sponsors
• Alloy Ventures
• Genentech
• Roche Bioscience

Tier 3 Sponsors
• Affymetrix
• Apple Computers
• Bay Area Bioinformatics
• BioScience Forum
• Celera Diagnostics
• Hopkins and Carley
• IEEE Computer Society Computational Systems Bioinformatics Conference (CSBCON2003)
