Ran Libeskind-Hadas, Department of Computer Science

Thanks to Eliot Bush (Biology) and Zach Dodds (Computer Science)

Bioinformatics Education at Harvey Mudd College

Our name is Mudd…

• Undergraduate only; 700 students

• Sciences, mathematics, and engineering

The HMC Curriculum

• Major
• Core (includes one semester of CS and one of Biology)
• Humanities
• Electives

Experiments in the Core

The “regular” path (200 students per year)
• Semester 1: Introduction to CS
• Semester 2: Introduction to Biology

An integrated full-year course (20 students in 2009-2010)
• Semesters 1 and 2: Integrated Introduction to CS and Biology

A one-semester integrated course (40 students in 2010-2011)
• Semester 1: Computation and Biology
• Semester 2: Introduction to Biology … or a second Biology course
• Satisfies CS core requirement but not the Biology requirement

Computation and Biology Core Course

Objectives:

– Cover the content of the “regular” CS intro course
– Demonstrate the relationship between computing and biology
– Use computation to teach biology fundamentals and use biology to motivate computing fundamentals
– Provide students with computational tools to perform their own “dry lab” experiments

Course Structure

[Diagram: weekly schedule with sessions on Tuesday, Thursday, and Friday, taught by the biologist and the “CSist” and including a lab, followed by a weekend assignment.]

Biology, CS, and a subset of student HW

Wks 1-3
• Biology: DNA, RNA, central dogma, genes: Case study of lactose intolerance
• CS: Introduction to Python: Data, functions, and basic constructs
• Subset of student HW: Gene finding, gene expression, lactase expression

Wks 4-5
• Biology: Population genetics, molecular evolution
• CS: Designing a larger program, randomness, simulation
• Subset of student HW: Mitochondrial Eve, diploid populations with selection, molecular evolution simulations

Wks 6-7
• Biology: Sequence alignment
• CS: Recursion
• Subset of student HW: Implement alignment and extend to deal with substitutions

Wks 8-9
• Biology: Phylogenetics
• CS: Recursion on trees and phylogenetic tree algorithms
• Subset of student HW: Implementing a phylogenetic tree algorithm and making inferences from the results

Biology, CS, and a subset of student HW (continued)

Wks 10-11
• Biology: Folding: RNA to Proteins
• CS: RNA folding algorithm, efficiency, and memoization
• Subset of student HW: Implement RNA folding and visualize results

Wks 11-12
• Biology: Systems biology and modeling: Chemotaxis
• CS: Computation and modeling
• Subset of student HW: Chemotaxis simulations and evaluation of models

Wks 13-14
• Biology: Topics
• CS: Limitations of computation
• Subset of student HW: Capstone Projects

Using computation to teach biology fundamentals

Population genetic model

Explore effects of drift and selection,

Hardy-Weinberg equilibrium
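A minimal sketch of the kind of “dry lab” this slide describes, assuming a Wright-Fisher style model of one gene with two alleles in a diploid population. The function names, the fitness parameters, and the model choice are illustrative assumptions, not the course's actual assignment code.

```python
import random


def next_generation(freq_a, pop_size, fitness_a=1.0, fitness_b=1.0, rng=random):
    """Return the frequency of allele A after one generation."""
    # Selection: weight each allele by its (hypothetical) relative fitness.
    weighted_a = freq_a * fitness_a
    weighted_b = (1.0 - freq_a) * fitness_b
    p_selected = weighted_a / (weighted_a + weighted_b)
    # Drift: draw 2N gene copies at random for a diploid population of size N.
    copies = 2 * pop_size
    count_a = sum(1 for _ in range(copies) if rng.random() < p_selected)
    return count_a / copies


def simulate(freq_a, pop_size, generations, fitness_a=1.0, fitness_b=1.0, seed=None):
    """Return the allele A frequency at every generation, starting value included."""
    rng = random.Random(seed)
    history = [freq_a]
    for _ in range(generations):
        freq_a = next_generation(freq_a, pop_size, fitness_a, fitness_b, rng)
        history.append(freq_a)
    return history


def hardy_weinberg(freq_a):
    """Expected genotype frequencies (AA, Aa, aa) under Hardy-Weinberg equilibrium."""
    freq_b = 1.0 - freq_a
    return freq_a ** 2, 2 * freq_a * freq_b, freq_b ** 2
```

Running `simulate(0.5, 20, 100)` several times with equal fitness values shows drift alone pushing small populations toward fixation or loss of an allele; raising `fitness_a` above `fitness_b` adds selection on top of that.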

Using biology to motivate computation: RNA Folding

Recursion and memoization
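One way to illustrate recursion with memoization on RNA folding is a base-pair maximization recurrence in the style of the Nussinov algorithm. This is a simplified sketch: it counts non-crossing pairs only, ignores energies and minimum loop lengths, and the names `PAIRS` and `max_base_pairs` are hypothetical. The slides do not say which folding model the course uses.

```python
from functools import lru_cache

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(rna):
    """Maximum number of non-crossing base pairs in the string rna."""

    @lru_cache(maxsize=None)  # memoization: each (i, j) is solved once
    def fold(i, j):
        # Best number of pairs using only bases i..j (inclusive).
        if j - i < 1:
            return 0  # zero or one base cannot form a pair
        # Case 1: base i stays unpaired.
        best = fold(i + 1, j)
        # Case 2: base i pairs with some later base k, splitting the problem.
        for k in range(i + 1, j + 1):
            if (rna[i], rna[k]) in PAIRS:
                best = max(best, 1 + fold(i + 1, k - 1) + fold(k + 1, j))
        return best

    return fold(0, len(rna) - 1)
```

Without the `lru_cache` line the same recursion recomputes overlapping subproblems and takes exponential time; with it there are only O(n²) distinct `(i, j)` calls, which is the efficiency point the memoization topic makes.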

Above and Beyond…

Final project example: What makes cholera pathogenic?

Pathogenic vs. non-pathogenic strains

Compare all genes in one strain with all in other to find orthologs (use fast global alignment)

Programmatically Blast unique proteins to see what they are

Read about these unique genes and explain what they do
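The ortholog-search step of this project can be sketched as follows, assuming a plain Needleman-Wunsch global alignment score and a simple score threshold. The scoring values, the `min_fraction` cutoff, and both function names are assumptions made for illustration; the slides mention a "fast" global alignment, whereas this version is the basic quadratic-time dynamic program.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def genes_without_ortholog(strain_a, strain_b, min_fraction=0.8):
    """Names of genes in strain_a with no high-scoring match in strain_b.

    strain_a and strain_b map gene names to sequences. With match=1 a
    perfect self-alignment scores len(seq), so the threshold is a
    (hypothetical) fraction of that best possible score.
    """
    unique = []
    for name, seq in strain_a.items():
        threshold = min_fraction * len(seq)
        scores = (global_alignment_score(seq, other) for other in strain_b.values())
        if not any(score >= threshold for score in scores):
            unique.append(name)
    return unique
```

Genes returned by `genes_without_ortholog(pathogenic, non_pathogenic)` would be the candidates to look up in the next step.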

Microarray data…

Some genes encode transcription factors that promote or inhibit the expression of other genes

[Figure: microarray heat map of gene expression across conditions. Purple is highly expressed, green is not expressed.]

Courtesy of Prof. Russell Schwartz

Intuition Behind Network Inference

[Figure: binary expression matrix with rows gene 1, gene 2, gene 3, gene 4 and columns for conditions, shown next to several candidate regulatory networks over the genes, with edges marked + or -.]

Correlated expression implies common regulation

That intuition still leaves a lot of ambiguity

Courtesy of Prof. Russell Schwartz

Assuming a Binary Input Matrix

We will assume that genes only have two possible states: 0 (off) or 1 (on)

We will also assume that we want to find directionality but not strength of regulatory interactions

Binary expression matrix (one row per gene, one column per condition):

gene 1: 1 0 1 0 1 1 1 0
gene 2: 0 1 0 1 1 1 1 0
gene 3: 0 0 1 0 0 0 0 1
gene 4: 0 0 0 0 0 1 0 1

We will exclude the possibility of regulatory cycles:

[Figure: two example networks on genes 1-4, one marked OK and one marked NOT OK.]

Courtesy of Prof. Russell Schwartz
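The "no regulatory cycles" assumption can be checked mechanically. Below is a minimal sketch using Kahn's topological-sort algorithm; the function name `is_acyclic` and the edge lists in the usage note are hypothetical, since the networks drawn on the slide are not recoverable from the transcript.

```python
def is_acyclic(num_genes, edges):
    """Return True if the directed network on genes 1..num_genes has no cycle.

    edges is a list of (regulator, target) pairs.
    """
    indegree = {g: 0 for g in range(1, num_genes + 1)}
    targets = {g: [] for g in range(1, num_genes + 1)}
    for regulator, target in edges:
        targets[regulator].append(target)
        indegree[target] += 1
    # Repeatedly remove genes that no remaining gene regulates.
    ready = [g for g, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        gene = ready.pop()
        removed += 1
        for target in targets[gene]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    # Genes on a cycle never reach indegree 0, so they are never removed.
    return removed == num_genes
```

For example, `is_acyclic(4, [(1, 2), (1, 3), (2, 4)])` is `True`, while `is_acyclic(4, [(1, 2), (2, 3), (3, 1)])` is `False` because genes 1, 2, and 3 regulate each other in a loop.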

The Project

• Take binary microarray data as input
• Find the acyclic regulatory network with the highest likelihood
• Display the network somehow
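The slides do not define the likelihood model, so the following is only one plausible reading: treat each gene's on/off state as depending on the states of its regulators, estimate those conditional probabilities from the data, and sum the log-probabilities of the observed states. The function name `log_likelihood`, the data layout, and the toy matrix in the usage note are assumptions.

```python
import math
from collections import Counter


def log_likelihood(matrix, parents):
    """Log-likelihood of binary expression data under a candidate network.

    matrix maps each gene to its list of 0/1 states, one per condition.
    parents maps a gene to a tuple of its regulators (missing means none).
    Conditional probabilities are maximum-likelihood estimates from the data.
    """
    num_conditions = len(next(iter(matrix.values())))
    total = 0.0
    for gene, states in matrix.items():
        regulators = parents.get(gene, ())
        joint = Counter()    # (regulator states, gene state) -> count
        context = Counter()  # regulator states -> count
        for c in range(num_conditions):
            key = tuple(matrix[r][c] for r in regulators)
            joint[(key, states[c])] += 1
            context[key] += 1
        for (key, _state), count in joint.items():
            total += count * math.log(count / context[key])
    return total
```

With `matrix = {1: [1, 0, 1, 0], 2: [1, 0, 1, 0], 3: [0, 0, 1, 1]}`, the network `{2: (1,)}` scores higher than the empty network because gene 1 fully predicts gene 2. Note that under this model adding an edge never lowers the score, so a search for the best network would also need a cap on the number of regulators per gene or a complexity penalty.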

Student Response

Likert scale (1 low, 7 high) survey:

“This course stimulated my interest in the subject matter”
• College mean: 5.53/7.0 (std. dev 0.80)
• Computation and Biology: 6.51/7.0

“I learned a great deal in this course”
• College mean: 5.76/7.0 (std. dev 0.72)
• Computation and Biology: 6.49/7.0

“Time spent outside of class (per week)”
• College mean: 4.98 hours (std. dev 2.42)
• Computation and Biology: 6.28 hours

What did students choose to do the following term?

Students have one elective in the spring term

• Took introductory biology: 0/40
• Took an elective other than CS or biology: 0/40
• Took an “upper division” biology course: 18/40
• Took the second CS course: 22/40 (outperformed their peers)

• Students learned the foundational content of “Intro CS” and “Intro Biology”

• Students’ programs provide rich “dry lab” experiments and simulations that reinforce understanding of biology

• Students develop general problem-solving and programming skills (e.g. DP) and have confidence to solve “new” problems on their own

Next steps…

• Increasing student demand for more courses and even a major in computational biology

• “Mathematical Biology Major” redesigned in Spring 2011 to “Mathematical and Computational Biology (MCB)” major
– Good news: 9 MCB majors in sophomore year (6 Biology majors and 2 Biochemistry majors)
– Bad news: Few faculty in a position to contribute

Beyond the core (intro CS, intro Biology, 3 semesters math, 2 chemistry, 1 physics, …)

Introductory Sequence

• Discrete Math
• Biology laboratory
• Introduction to Mathematical and Computational Biology

Biology Foundations

• Three of: Comparative physiology, ecology and environmental biology, evolutionary biology, molecular biology
• One biology seminar
• One biology laboratory

Mathematical and Computation Courses

• Intermediate Mathematical Biology
• Computational Biology
• One upper-division math course
• One upper-division CS course
• Three more math and CS courses

Electives, Thesis, Colloquium

• One related elective
• Colloquium
• Senior thesis

Future Plans…

• Refine and improve introductory course

• Write a book for the introductory course

• Collaborate with “sister” institutions to expand computational biology curriculum
– New faculty
– New courses

Questions, Comments, Heckles