moodle2.brandeis.edu · Web viewPlease read carefully this document. Pay special attention to the...


Brandeis University

Division of Graduate Professional Studies

Rabb School of Continuing Studies

Course Syllabus

I. Course Information

1. Introduction to probability and statistics

2. RBIF-0103-G1

3. 01/21/2015 - 04/25/2015

4. Distance Learning Course Week: Wednesday through Tuesday

5. Instructor, contact info: Michael B. Partensky, PhD. Please contact me via email: [email protected], [email protected]

To avoid delays, please send your mail to both addresses if you want to contact me before 01/21/15. Later, please use the Brandeis address.

6. Virtual office hours: Sunday, 11 am – 1 pm (EST) [occasional changes are possible]

7. Document Overview

This syllabus contains all relevant information about the course: its objectives and outcomes, the grading criteria, the texts and other materials of instruction, and the weekly topics, outcomes, descriptions of assignments, and due dates. Consider this your roadmap for the course. Please read through the syllabus carefully and feel free to share any questions that you may have. Please print a copy of this syllabus for reference.

8. Course Description

Purpose and content. The course builds a foundation for the “probabilistic thinking” method, with applications to real-life problems including bioinformatics, bio- and medical statistics, computational biology and biophysics, and data analysis. The topics cover random numbers, discrete and continuous random variables, elements of Combinatorics, conditional probability, Bayes' formula, Markov chains, the Binomial, Poisson and normal distributions, entropy and information, the Monte-Carlo method, the central limit theorem, confidence intervals and hypothesis testing, correlations, nonlinear regression and maximum likelihood. We will also learn some basics of the Mathematica programming language and will use it for computational probabilistic experiments.

Prerequisites. Solid knowledge of basic algebra, geometry and trigonometry would be very helpful for your success. If you are not fluent in basic math, please reserve more time for your weekly studies. Some familiarity with introductory calculus (functions, derivatives, integrals) is preferable, but not required. The lectures will provide you with the necessary background in calculus as needed.

Catching up with Math. In the first week of the class an introductory math quiz will be offered, aimed at helping you refresh your math background and allocate adequate time and effort for your weekly studies. The test will cover the areas of basic and more advanced Math directly related to the class. The test is not graded on its content, but it is required (the grade is 100 if you took it or 0 otherwise). Based on the outcome, you will be advised to refresh some of the materials if necessary. Mathematica (see section 9.3), an excellent educational and research software used intensively throughout the course, will also help in refreshing your Math skills. It is strongly advisable to start practicing Mathematica without delay.

Please read this document carefully. Pay special attention to the marked sections. They contain information that you will need from the very beginning, some even before the class starts.

9. Instruction Materials

9.1. Semi-Required Texts (mostly for the individual studies)

1. M.S. Spiegel, J.J. Schiller and R.A. Srinivasan, Schaum's Outline of Probability and Statistics, Schaum’s Outline Series, McGraw-Hill, 3-d (2009), ISBN: 9780071544252

2. E. Don, Schaum's Outline of Mathematica, Schaum’s Outline Series, McGraw-Hill, 2-d (2009), ISBN: 9780071608282

3. C.M. Grinstead and J.L. Snell, Introduction to Probability, Am. Math. Soc., 2-d (1997), ISBN: 9780821894149 (this book can also be downloaded from the web for free. Please send a thank-you note to the authors.)

9.2 Recommended Text(s)

4. Bennett, D.J. 1998. Randomness. Harvard University Press, Cambridge, (1999), ISBN: 978-0674107465. Enjoyable supplementary reading. A lot of insights, paradoxes, peculiarities.

5. S. Wolfram, Mathematica (9-th edition): the reference source. (It is included in e-format in the standard Mathematica distribution.)

6. W.J. Ewens and G.R. Grant, Statistical methods in bioinformatics (an introduction), Springer, 2-d, (2005) ISBN-13: 978-0387400822 (will be used only occasionally, but could be also handy in your future study of bioinformatics.)

7. R. Durrett, Probability: Theory and Examples (Cambridge Series in Statistical and Probabilistic Mathematics), CUP (2010) ISBN-13: 978-0521765398

8. N.N. Taleb, Fooled by randomness, Random House, 2-d, (2008) ISBN-13: 978-1400067930 [Contains a lot of insights and cute examples]

9. W.W Hines et al., Probability and Statistics in Engineering, Wiley, 4-th (2009) ISBN: 978-0471240877

10. R. Durbin, S.R. Eddy, A. Krogh, G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press; Reprint edition (1999), ISBN: 978-0521629713 (the comment from #6 is also applicable here)

9.3 Required Software

Mathematica 10. We will be using Mathematica for the experiments with randomness. In addition Mathematica will help you to refresh some of the math required for the course. You will be able to purchase a student version of Mathematica 10 (which is fully functional) at a significant discount. To get an additional 15% discount please enter the promotion code PD1637 at checkout from the Wolfram Web Store at store.wolfram.com (If asked, please enter my name. This feature is provided to the members of the Wolfram Faculty Program). Mathematica is an extremely powerful and elegant tool, and I am sure that some of you will find it very useful in your future work.

9.4 On-line Course Content

This course will be conducted completely online using Brandeis’ LATTE site, available at http://latte.brandeis.edu. The site contains the course syllabus, assignments, our discussion forums, links/resources to course-related professional organizations and sites, and weekly checklists, objectives, outcomes, topic notes, self-tests, and discussion questions. Access information is emailed to enrolled participants before the start of the course.

It is very important to get Mathematica ASAP, prior to the first class session. You can even complete a few introductory assignments (including watching two videos) before the class starts. This can greatly boost your progress in the class.

10. Overall Course Outcomes

The course is designed to teach the probabilistic way of thinking. It provides a thorough background in the basics of probability theory and statistics, the major pillars of bioinformatics and biostatistics. We will utilize the multi-disciplinary approach by using the examples and examining the ideas from various fields, from statistical physics and computer modeling of proteins to the probabilistic aspects of evolution and biological data analysis. The class will strongly benefit from using Mathematica, the most advanced “computer aided thinking tool” which helps in understanding the major concepts of P&S, developing algorithms and running random experiments.

Course Outcome Assignment / Assessment

1. Apply the elements of set theory to the analysis of complex events and biological sequences

Lect. 2, 3; HW 2, 3

2. Use Combinatorics for the analysis of various random selection problems, derivation of major probability distributions and grasping some major combinatorial problems of sequence analysis.

Lect. 3,4; HWs 3,4

Lect. 10; HW 10

In addition, various Combinatorial concepts are quite evenly distributed over the course, as one of the foundations of Probabilistic Thinking

3. Apply Binomial, Poisson, geometric, hyper-geometric, negative binomial, Normal, exponential and other probability distributions to the analysis of probabilities, sampling errors, sequence similarity.

Lect. 4, 5, 10, 12; HW 4-6, 10-12

4. Recognize and analyze phenomena described by conditional probability. Use the Bayes formula to analyze prior probabilities given the outcomes

Lect. 6,7; HW 6,7

5. Apply non-linear regression (NLR) to data modeling; develop Mathematica-based applications of NLR for solving some real-life problems

Lect. 11; HW 11.

6. Apply the concept of Maximum likelihood to the experimental data analysis.

Lect. 11; HW 11


7. Analyze some archetypical paradoxes of probability (‘Monty Hall’, ‘prisoner’s dilemma’, second daughter) for the guidance in solving complex real-life statistical problems.

Lect. 2, 7

Multiple Q&A forum discussions

8. Apply the measures of central tendency and dispersion (mean, variance, etc.) for the statistical estimates

Lect. 10; HW 10

9. Analyze and simulate with Mathematica various Markov models as a foundation of the major algorithms of sequence analysis (HMM, BLAST, etc.)

Lect. 7,8; HW 8

10. Use the central limit theorem for the analysis of sampling errors and confidence interval

Lect. 12; HW 12; Test preparation problems.

11. Apply the hypothesis testing technique to the analysis of statistical data

Lect. 12; HW 12; Test preparation problems.

12. Use relation between entropy and probability, and Boltzmann statistics as fundamental concepts behind the protein dynamics and energetics.

Elucidate relation between entropy, disorder, and information.

Lect. 13; Q&A forum discussions.

13. Formulate basic principles underlying the Monte Carlo and Molecular dynamics modeling of molecular biological systems.

Lect. 5, 13 (+ Videos of MC simulations)

14. Analyze and describe some statistical problems of genetics (Hardy-Weinberg law, probabilities of genetically inherited diseases, applications of Bayesian statistics)

Lect. 7; HW 7

15. Actively participate in the team work: problem solving in groups

Weeks 2 - 13

16. Use Mathematica as the programming, visualization and presentation environment

Weeks 1-5: intense introduction to Mathematica; practical applications of Mathematica are evenly distributed between the classes

Upon completion of the course students will be able to

Use general principles of P&S in preparation for future work in bioinformatics

- Use the operational definition of probability to estimate the empiric probabilities for random events and biological sequences

- Apply the elements of set theory to the analysis of complex events


- Use Combinatorics for the analysis of various random selection problems, derivation of major probability distributions and grasping some major combinatorial problems of sequence analysis.

- Apply Binomial, Poisson, Normal, geometric, hyper-geometric and negative binomial distributions to the analysis of probabilities, sampling errors, sequence similarity

- Recognize and analyze phenomena described by conditional probability

- Use Bayes’ formula to analyze prior probabilities given the outcomes

- Apply non-linear regression (NLR) to data modeling; develop Mathematica-based NLR applications for some practical examples

- Apply the concept of Maximum likelihood to the experimental data analysis

- Analyze some archetypical paradoxes of probability (prisoner’s dilemma, Buffon's needle, etc.) for guidance in the analysis of complex real-life statistical problems

- Apply the measures of central tendency and dispersion (mean, variance, etc.) for the statistical estimates

- Analyze and simulate with Mathematica various Markov and random walk models for better understanding of the major algorithms of sequence analysis (HMM, BLAST, etc.)

- Use the central limit theorem for the analysis of sampling errors and confidence interval

- Apply the hypothesis testing technique to the analysis of statistical data

- Use the ROC curves approach to the test design

Apply probabilistic methods and concepts to the analysis of biological systems on different levels:

- Use relation between entropy and probability, and Boltzmann statistics as fundamental concepts behind the protein dynamics and energetics

- Formulate basic principles underlying the Monte Carlo and Molecular dynamics modeling of molecular biological systems.

- Analyze the probabilistic basis of Mendelian genetics, distribution of alleles, Hardy-Weinberg (HW) theorem;

Participate in a team research work involving numerical statistical analysis and modeling, and communicate its results to colleagues; make presentations on various statistical topics

- Team work in the class

- Use Mathematica as the programming, visualization and presentation environment

11. General Grading Criteria

The course grade will be based on homework (50%), tests (20%), student’s activity in class (30%).  In addition, students can earn extra credits for various extra activities.  This can be done, for instance, by completing the optional assignments offered in most of the lectures, making short presentations (papers + computer experiments), etc. 
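The weighting above can be illustrated with a short sketch. It is written in Python purely for illustration (the course itself uses Mathematica), the function and variable names are hypothetical, the component scores are assumed to be on a 0 to 100 scale, and extra credit is not modeled because the syllabus does not say how it is added.

```python
# Weights taken from the grading criteria above; extra credit is not modeled.
WEIGHTS = {"homework": 0.50, "tests": 0.20, "participation": 0.30}


def course_grade(scores):
    """Weighted course grade from component scores on a 0-100 scale (assumed)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, chosen only to show the arithmetic.
example = course_grade({"homework": 90, "tests": 80, "participation": 100})
print(round(example, 2))
```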

12. Assignments and Tests: Description, Structure and Grading

13.1 Participation/Attendance All students are expected to participate regularly. The activities (forum discussions, group activities, reading and Home Work assignments) should be spread evenly over the week.

13.2 Communication, correspondence. All the emails related to this class will be sent to your Brandeis email account. However, almost everyone has and uses a primary personal account. For this reason it is extremely important to set up forwarding from your Brandeis account to the primary account.

It is your responsibility to make sure that all the messages from the instructor and from the school are received on time. At the beginning of the class I will ask you to send me confirmations to make sure that everyone is tuned in.

13.3 Home assignments (content, early submission options, and grading).

General. Every week, a homework assignment will be offered. It typically includes a required part and an extra-credit part. The deadline for the submission is Tuesday, 11:30 pm. Late assignments are not accepted (graded F). In such cases, a make-up can be offered. However, it is highly recommended to submit on time because the class is quite intense and working on additional assignments can jeopardize your progress.

All submissions should be made via LATTE.

Submission options. Usually, you will be offered a choice between two options:

a) Submitting once (single file submission) for final grading. The only deadline in this case is Tuesday 11:30 pm.

b) Submitting more than once (multiple file submission). As explained further, this option is also named the “Early submission” (ES), and involves two deadlines [for the first submission (see weekly assignments), and for the final submission, Tuesday, 11:30 pm].

The “Early Submission” (ES) elaborated

ES implies “multiple file submission”, where the originally submitted assignment can be improved and resubmitted.

Those who choose this option must submit early, usually by 14:00 on the Sunday preceding the class (unless stated otherwise for a particular week). If the original submission is not perfect, it will be returned to you with the initial grade (we designate it G(1)), with the score assigned to each of the problems, and with some questions and hints helping you to find and fix the errors. Then you are given an opportunity to resubmit and improve your grade.

The initial submission must be complete: you should provide solutions to all the required problems. The first grade G(1) is the starting point, and all the further grades depend on it. At the end, after the resubmission(s), your grade cannot be less than G(1), but you also (except for some rare occasions) cannot get 100% (assuming G(1) was less than 100%). Each submission numbered n (n= 1, 2, 3…) is initially graded based purely on its quality. We name this the “unbiased” grade G(n). The “real” grade for each submission is defined as

$$G_{\text{real}}(n) = \frac{1}{2}\bigl(G(n) + G(1)\bigr) \qquad (1)$$

For instance, if the first percentile grade is G(1) = 60 and the second grade (first resubmission) is G(2) = 100, then the final grade is 80%. This approach should motivate you (in addition to submitting early) to receive the starting grade as high as possible.

The individual problems are graded on the scale 0 to 1. In each submission the total percentile grade G(n) is obtained as the total of the scores for the individual problems divided by the total number of the problems, times 100.

The real grades for the individual problems are calculated for each submission in the spirit of rule (1). For example, if the (unbiased) grade for a particular problem changes as {0.6, 0.7, 1.0} in the course of three consecutive submissions, then the “real” grade for this problem is {0.6, 0.65, 0.8}, and the final grade is 0.8. Usually, there is only one resubmission, n = 2. In some cases (especially if the work was submitted earlier during the week, say before Sunday), the second and, occasionally, even the third resubmission (n = 3, 4) will be allowed. Please use the same file for all the (re)submissions of a current week!
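Rule (1) and the percentile grade can be checked with a small sketch. It is in Python for illustration only (the course software is Mathematica), and the function names are hypothetical; the arithmetic follows the description above.

```python
def real_grade(unbiased, n):
    """Rule (1): average of the n-th unbiased grade and the first grade.

    `unbiased` lists the unbiased grades G(1), G(2), ... in submission order.
    """
    if not 1 <= n <= len(unbiased):
        raise ValueError("n must index an existing submission")
    return 0.5 * (unbiased[n - 1] + unbiased[0])


def percentile_grade(problem_scores):
    """Total grade G(n): mean of the per-problem scores (each 0 to 1) times 100."""
    return 100.0 * sum(problem_scores) / len(problem_scores)


# Example from the text: G(1) = 60 and G(2) = 100 give a final grade of 80.
print(real_grade([60, 100], 2))  # 80.0

# Per-problem example from the text: unbiased scores 0.6, 0.7, 1.0.
per_problem = [round(real_grade([0.6, 0.7, 1.0], n), 2) for n in (1, 2, 3)]
print(per_problem)
```

Because the first grade enters every later average, a resubmission can never lower the real grade below G(1) and can never reach 100 unless G(1) was already 100.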


For each resubmission, please create a separate subsection after each solution being fixed, named “Solution n” with n = 2 (for the first resubmission) or n = 3 (for the second). We will learn how to format Mathematica files (including the sectioning) during the first two weeks. All my comments and your solutions (the current one and all the previous ones) must stay in the file unchanged. Do not delete your previous solutions and my comments! The name of the file should contain your name and the submission number (n). It is usually derived from the original name of the file (posted by the instructor) by adding “_YourName_n”. For example, suppose the assignment of the 5-th week is named HW5.nb. Then HW5_SamClemens_2.nb is the second (2) submission of this assignment by Samuel Clemens. Similarly, HW7_JaneWang_1.nb is Jane’s first submission of HW7.
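The naming convention can be expressed as a tiny helper. This is only a sketch in Python (not part of the course materials), and the function name is hypothetical; it appends “_YourName_n” to the original file name and keeps the extension.

```python
from pathlib import Path


def submission_name(original, student, n):
    """Build e.g. 'HW5_SamClemens_2.nb' from 'HW5.nb', 'SamClemens', 2."""
    if n < 1:
        raise ValueError("submission numbers start at 1")
    path = Path(original)
    # stem is the name without its extension; suffix is the extension itself
    return f"{path.stem}_{student}_{n}{path.suffix}"


print(submission_name("HW5.nb", "SamClemens", 2))  # HW5_SamClemens_2.nb
```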

The discussions that follow the early submission lead to a better and deeper understanding of the course material and improve your overall performance.

Naturally, submissions made after the ES deadline (but before the final deadline), are graded only once. There is neither a penalty for not using the ES option, nor a reward for submitting early (except for the opportunity to resubmit and fix the errors).

13.4 Self-tests. Some assignments will be accompanied by the self-tests containing the problems similar to those from the Home Assignment. These are offered solely for your practice and benefit, and do not have to be submitted. All the self-test problems can be discussed on the open forum

13.5 Class presentations. Concepts reviewed in the class or related to those could be enriched through the (optional) students’ presentations (typically, short papers including examples with Mathematica). This activity is entirely voluntary. It is graded as a participation assignment (see the grading policy). I will suggest a few topics, but quite often students contribute their own ideas and topics, and share P&S-related experiences from their work (I remember remarkable presentations about Bayesian networks and on stock market analysis) or even their hobbies (the Mathematica model of “Texas Hold'em” was one such example). Please indicate your interest in making a presentation as early as possible. The “presentations” will be added to the reading materials, and everyone will be encouraged to read and discuss them on the forums. The starting threads for the discussions can be created by the presenters.

13.6 Groups and related activities. The class will be divided into several groups, usually three students per group. Certain assignments will be offered for group work, and the answers will be graded as “class activity” or as HW, depending on the type of assignment. Short (30 min) tests will also be offered during some weeks, either for the whole class or separately for the group work.

An important component of your class activity is the participation in the weekly discussion forums. Your posts (responses, questions etc.) will be evaluated based on the substantiality of their content. Instead of elaborating our understanding of “substantiality”, it is easier to give examples of non-substantial posts:

“Hi, John. It’s a wonderful idea! I was thinking along the same line! Cheers. Mike”. “Ann, I liked your solution but could not understand the last part. Could you please explain it”.

Both are valid and useful responses. The first one is a kind encouragement, while the second contains a question and invites further discussion. They do not contribute to the grade, but both are valuable and important. All kinds of responses are welcome, even if they are not graded as substantive. Besides, there is no sharp boundary between the substantive and other responses. For example, a simple inquiry can be considered substantive if it triggers a valuable discussion, and it should be graded accordingly.

Quite often the HW assignments will be offered on a group basis. In such cases the detailed instructions will be provided.

Note: In general, the early submission policy is not applied to the group assignments. Instead, the group members are allowed to discuss the solutions on the group forum while jointly composing the submitted document. The instructor can participate in these discussions and provide hints and advice if necessary.

13.7 Mid-term and Final tests will be offered on the 8th and 13-th week respectively. They include 5-6 problems each. The specific instructions will be provided.

13.8 Online Participation. There are four major types of forum activity:

(1) Responses to the original questions posted by the instructor (Q&A forum(s)). This includes questions related to the HW assignment and Lecture materials. Answering a certain number of these questions will be required. Each student gains complete access to such forums only after having responded to the first question posted by the instructor. Usually, there will be up to three required questions. The specific instructions will be provided weekly.

(2) Participation in the discussion at Q&A forum(s). After answering the required questions, you will be able to access these forums and participate in discussion. This is also a valuable component of your participation.

(3) Participation in the open discussion forum(s) (ODF), where a student can ask and respond to any class-related questions (except for the HW assignment).

Comment: The exception is the Home Work problems: they must be solved individually. The only HW-related questions allowed for discussion are the questions posted by the instructor (see A.1 ). Otherwise, the discussion of HW assignments is prohibited. However, you can ask me HW –related questions (mostly related to the understanding of the problems rather than their solutions) at private forums.

(4) Group discussion forums These forums will be created for various group activities.

Detailed participation assignments will be posted weekly. Here is an example (we presume that the class week starts on Wednesday):

“(a) By Friday Night (22:00 EST) post two original (required) responses on Q&A forum(s). (b) Not later than 12:00 (EST) on Monday post two (at least) replies to the posts of other participants (and/or submit your own substantive questions or comments) on Q&A forum, and at least one post on ODF. The posts must be submitted on at least three different days of the online course week. For example, you can post your answers to Q&A on Thursday and Friday, reply to Q&A on Saturday, and participate in ODF during the week".


Online participation is very important. It contributes 30% to the total grade, and it is a very effective learning tool. You will soon realize that the aforementioned requirements are not “abusive”, and discussing your questions with others is rewarding and enjoyable. Most of you will easily surpass the required level of participation.

13.9 Participation Evaluation 

First, we introduce two types of students’ responses:

Type 1 (T1): responses to the original questions posted by the instructor at Q&A forum;

Type 2 (T2): participation in Q&A (after answering the required questions) and ODF discussion.

Points may be earned for original responses and substantive replies based on the following criteria:

Type 1

90-100 pts (Very Thoughtful)
- Discussion is substantive and relates to key principles.
- The answers are complete and well explained.
- The Math and coding part (if present) is correct.
- Provides examples demonstrating application of principles.
- Is submitted according to the deadlines in the course schedule.
- Language is clear, concise, and easy to understand. Uses terminology appropriately and is logically organized.

79-89 pts (Thoughtful)
- Makes reference to key principles, but is not well developed or integrated in the response.
- The answers are not complete and not well explained.
- The Math/coding part (if present) is on the right track, with some errors.
- Offers some examples, but they are not sufficiently illustrative and not well integrated in the response.
- Submitted according to the deadlines in the course schedule.
- Is adequately written, but may use some terms incorrectly; may need to be read two or more times to be understood.

68-78 pts (Somewhat Thoughtful)
- Contains no reference to key principles; if key principles are present, there is no evidence the learner understood principles, or key principles are not integrated into the response.
- The Math/Coding part (if present) contains errors.
- Does not offer examples, or the examples are too trivial.
- Response is not submitted by the due date.
- Poorly written; terms are used incorrectly; cannot comprehend learner’s ideas after repeated readings.

Type 2

90-100 pts (Very Thoughtful)
- Is substantially related to and reinforces the unit overview, text, and/or supplementary readings.
- Responds to the ideas and concerns of other learners.
- Math/coding (if present) is correct and clearly explained.
- Is characterized by three to four of the following criteria: thought-provoking, supportive, challenging, reflective.
- Is submitted according to deadlines in the course schedule.
- Language is clear, concise, and easy to understand; uses terminology appropriately and is well organized.

79-89 pts (Thoughtful)
- Contains references to unit overview, text, and/or supplemental readings, but references are not well integrated in the response.
- Response is peripherally related to the ideas and concerns of other learners.
- Math/coding (if present) contains some minor errors or is not explained clearly.
- Is characterized by one or two of the following criteria: thought-provoking, supportive, challenging, reflective.
- Submitted according to deadlines in the course schedule.
- Adequately written, but may use some terms incorrectly; may need to be read two or more times to be understood.

68-78 pts (Somewhat Thoughtful)
- Contains no reference to key principles; if key principles are present, there is no evidence learner understood principles, or key principles are not integrated into the response.
- Math/coding (if present) contains errors.
- Response is unrelated to the ideas and concerns of other learners.
- Response is not thought-provoking, supportive, challenging, or reflective.
- Response is not submitted by the due date.
- Is poorly written; terms are used incorrectly; instructor cannot comprehend learner’s ideas after repeated readings.

The total participation grade is calculated as a weighted average. The responses belonging to T1 directly test your understanding of the lecture material and of the HW assignment. For this reason they are sometimes assigned a higher weight.

For example, the grade X1 for the type 1 responses can be assigned the weight p = 0.6. Then the grade X2 for the contributions of the second type has the weight q = 0.4. The grade X1 itself is the average of percentile grades for all the required responses to the instructor’s questions at Q&A forum. The grade X2 for T2 is the average of the grades for corresponding contributions.

In all cases, if the number of responses exceeds the required number of responses, Nreq, the best Nreq responses will be chosen. For example, if the required number for T1 is Nreq = 3 and the actual number of responses is 5, then only the three best responses will be counted towards the grade (we call their grades x1, x2 and x3) and the total grade for T1 becomes X1 = (x1 + x2 + x3)/3. The same holds for T2 responses. The final grade is calculated as

X_total = p × X1 + q × X2

Consider the example: X1 = 70, X2 = 90, p = 0.6, and q = 0.4. Then X_total = 0.6 × 70 + 0.4 × 90 = 78. The final grade is shifted towards X1, demonstrating the role of the weights.

Note: This “weighting” approach is not strict. For example, if your contribution belonging to the T2 category is original, thought-provoking and demonstrates your deep understanding of the subject, its contribution to the grade will be enhanced.
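The weighted-average rule described above can be sketched in a few lines of Python. This is only an illustration, not the instructor's actual grading tool: the function names (best_average, participation_grade) are hypothetical, and the handling of a learner who submits fewer than the required number of responses is an assumption, since the syllabus does not say how missing responses are scored.

```python
def best_average(grades, n_required):
    """Average of the n_required highest grades.

    Assumption (not stated in the syllabus): if fewer than n_required
    grades exist, the average is taken over the grades that do exist,
    and an empty list scores 0.
    """
    if not grades:
        return 0.0
    best = sorted(grades, reverse=True)[:n_required]
    return sum(best) / len(best)


def participation_grade(t1_grades, t2_grades, n_req_t1, n_req_t2, p=0.6, q=0.4):
    """Weighted participation grade: p * (T1 average) + q * (T2 average)."""
    grade_t1 = best_average(t1_grades, n_req_t1)  # X1 in the text
    grade_t2 = best_average(t2_grades, n_req_t2)  # X2 in the text
    return p * grade_t1 + q * grade_t2


# Five T1 responses with three required: only 90, 80 and 70 are counted.
print(best_average([50, 80, 60, 90, 70], 3))

# The example from the text: X1 = 70, X2 = 90, p = 0.6, q = 0.4.
print(participation_grade([70, 70, 70], [90], n_req_t1=3, n_req_t2=1))
```

The weights p and q are left as parameters because, as noted above, the weighting is not strict and may differ from week to week.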

II. Weekly Information

1. Course schedule and class topics

Some minor changes in the topics, and in their distribution, are possible.

Each entry below gives the week, its starting date, the topic, and comments (for the week of the class).

Week 1, starting 01/22/14
Topic: 1st Introduction to Mathematica. Mathematical Background for P&S.
Comments: You are encouraged to watch the suggested videos even before the class starts.

Week 2, starting 01/29/14
Topic: 2nd Introduction to Mathematica. First random simulation with Mathematica. First introduction to Probability (P). Early history of P: on the shoulders of the giants. Laws of chance: are they possible? Conundrums and paradoxes of probability.
Comments: Please volunteer and select topics for the 1st presentations (for weeks 4 or 5).*

Week 3, starting 02/05/14
Topic: Random experiments, sample space and random events. Introduction to the set theory. Axioms of Probability. Frequency definition of Probability. Random variables. Probability function and Cumulative Distribution Function. Counting probabilities: Multiplication Rule. 3rd Introduction to Mathematica.

Week 4, starting 02/12/14
Topic: Counting probabilities (continued). Elements of Combinatorics. Permutations, combinations, binomial coefficients. Bernoulli trials and related Probability Distributions: Binomial, Geometric, Negative Binomial. Some applications in Sequence Analysis. 4th Introduction to Mathematica: random numbers, chance experiments with Mathematica, matrices in Mathematica.

Week 5, starting 02/19/14
Topic: Multinomial, Hypergeometric and Poisson distributions. Applications, problem solving. Chance experiments with Mathematica: Monte Carlo integration.
Comments: Please select a topic for the presentation, weeks 9-11. You can use my suggestions or pick your own. The presentation is graded as class participation. If you decide not to present, do not worry. This is a completely voluntary activity.

Week 6, starting 02/26/14
Topic: Conditional probability. Independence, global independence. Total Probability Rule. Simulations with Mathematica.

Week 7, starting 03/05/14
Topic: Bayes' formula and related “paradoxes”. Two-stage experiments. Hardy-Weinberg theorem. Markov chain: recursive treatment. Practice for the test.

Week 8, starting 03/12/14
Topic: Mid-term test (it may be distributed between weeks 8 and 9). Markov chain: matrix-based treatment. Applications to Bioinformatics: CpG islands. Random walks.

Week 9, starting 03/19/14
Topic: Integrals with Mathematica. Continuous random variables. Distribution function, CDF. Important distributions and densities: Uniform, Exponential, Gamma, Normal, Chi-Square. Relations between Binomial, Poisson and Normal distributions. Practice.

Week 10, starting 03/26/14
Topic: Practice with the continuous distributions. Mean, variance and other estimators (moments) for discrete and continuous random variables. Sums of random variables. Some applications of Probability Distributions in bioinformatics. Joint distributions, marginal distribution.

Week 11, starting 04/02/14
Topic: Introduction to data modeling: (1) Maximum Likelihood; (2) Linear and Non-linear regression. Real life examples with Mathematica.

Week 12, starting 04/09/14
Topic: Distributions of sums of random variables. Laws of large numbers. Central limit theorem. Confidence interval. Hypothesis testing (1). Random samples and sampling distributions.

Week 13, starting 04/16/14
Topic: Probability and Entropy. Boltzmann distribution. Monte Carlo and molecular dynamics simulation of biomolecules. Final test. Review.

*The link to the suggested topics can be found in the “Lecture Materials” page (Latte), but you are also encouraged to suggest your own topics.

2. Weekly assignments

Every week we offer a HW assignment, typically including 5-9 problems (one of them is usually an extra-credit problem). The assignments are in Mathematica notebook format, and Mathematica is used both for solving the problems and for formatting the submitted document. Some assignments include random experiments with Mathematica. The early submission policy is described in section 10 (1). The latest submission time is Tuesday, 11:30 pm. The assignments will also include “participation tasks”, especially forum activities.

3. Weekly outcomes

1 At the end of week 1, students will:

Refresh the main mathematical concepts/tools used in the class, including some elements of algebra, sums, products, integrals.

Write first Mathematica-based programs using functions, tables, random number generator, plots.

2 At the end of week 2, students will be able to:

Describe the major sources of Probability Theory.
Describe some archetypical paradoxes of Probability.
Apply some basic analytical and visualization tools of Mathematica.
Run simple random simulations with Mathematica.

3 At the end of week 3, students will be able to:

Describe the sample spaces of various random experiments.
Analyze simple and complex events in terms of the set theory.
Apply the set theory to the classification of amino acids.
Use the frequency-based definition of probability to analyze the probabilities of different random events.
Describe random phenomena in terms of Random Variables, Probability Function and Cumulative Distribution Function.
Apply the Multiplication Rule to counting the outcomes of sequential experiments.

4 At the end of week 4, students will be able to:

Use Combinatorics for the analysis of various random selection problems, derivation of major probability distributions and grasping some major combinatorial problems of sequence analysis.
Run simple statistical simulations in Mathematica.
Recognize the Bernoulli trials process and the related discrete probability distributions (DPDs): Binomial, Geometric, and Negative Binomial.
Describe some sequence analysis problems in terms of discrete probability distributions.
Perform basic operations on matrices using Mathematica.

5 At the end of week 5, students will be able to:

Perform numeric computations using the Monte Carlo approach.
Formulate basic principles underlying the Monte Carlo approach to computer modeling.
Apply Poisson, Hypergeometric and Multinomial DPDs to the analysis of random events.
Recognize differences between sampling with and without replacement.

6 At the end of week 6, students will be able to:

Apply various statistical distributions to the analysis of random events.
Investigate properties of the statistical distributions using Mathematica-based algorithms.
Analyze the properties of related events in terms of conditional probability.
Investigate the pairwise and global independence of the events.

7 At the end of week 7, students will be able to:

Apply Bayes’ formula to the analysis of posterior probabilities.
Using Bayes’ approach, analyze the reliability of tests based on the two types of errors and prevalence.
Apply the concepts of Specificity and Sensitivity to the analysis of medical tests.
Analyze multi-step random experiments.
Derive the Hardy-Weinberg theorem.
Describe the general properties of Markov chains.
Apply the recursive approach to the analysis of Markov chains.

8 At the end of week 8, students will be able to:

Apply matrices to the analysis of Markov chains (MC).
Apply MC to bioinformatics (detecting the CpG islands).
Simulate random walks with Mathematica.

9 At the end of week 9, students will be able to:

Describe the continuous probability distributions in terms of pdf and cdf.
Apply Mathematica to the analytical and numerical computations of the continuous probabilities.

Recognize and use some important continuous distributions: Uniform, Exponential, Gamma, and Normal.

Recognize and apply joint and marginal distributions

10 At the end of week 10, students will be able to:

Compute the measures of central tendency (expectation, variance, etc.) and apply them to the analysis of statistical properties.
Analyze the statistical properties of the sums of random variables using Mathematica-based simulations.
Apply the law of large numbers to the analysis of the asymptotic behavior of the sums.

11 At the end of week 11, students will be able to:

Apply the Maximum Likelihood approach to the modeling of statistical data.
Use Linear and Non-linear regression for data modeling and analysis of correlations.
Use Mathematica-based statistical algorithms (NonlinearFit, Anova, LinearRegression) for the data analysis.
Develop Mathematica-based NLR tools for some practical applications.

12 At the end of week 12, students will be able to:

Use the central limit theorem, and explain the special role played in statistics by the normal distribution.
Explain the concepts of “confidence interval”, “statistical significance” and “p-values”, and their role in statistics.
Apply these concepts to hypothesis testing.

13 At the end of week 13, students will be able to:

Describe the relation between probability and entropy (Boltzmann’s formula).
Describe the Boltzmann distribution and explain its use for the derivation of equilibrium statistical properties (examples of gases).
Explain the physical principles behind Monte Carlo simulation of biological systems.

4. Weekly Reading assignments. The lecture materials are mostly self-contained. Additional reading assignments will be posted.

III. Course Policies and Procedures

Late Policies

The Homework assignments must be submitted prior to the class, no later than 11:30 pm on Tuesday. Those who do not submit their assignments on time will have to take a make-up test. The class is quite intense, and it is in your best interest to complete your assignments on time.

Grading Standards

Work expectations

Students are responsible for exploring each week's materials and submitting required work by the due dates. On average, a student can expect to spend approximately 9-12 hours per week reading and completing assignments (more specific recommendations will be made individually during week 2). This presumes that a student’s educational background satisfies the prerequisites. Otherwise, more effort will be required. The assignments will be posted at the beginning of each week (Wednesday morning).

Grades are not given but are earned. Students are graded on demonstration of knowledge or competence, rather than on effort alone. Each student is expected to maintain high standards of honesty and ethical behavior.

How points and percentages equate to grades

%        Letter grade     %              Letter grade
98-100   A+               70-74          C+
94-97    A                65-69          C
90-93    A-               60-64          C-
85-89    B+               50-59          D
80-84    B                0-49           F
75-79    B-               Extra credit   Adds up to 10% of the base grade

Attention: Sage converts both A+ and A into 4.0. I will still use A+ as a token of my appreciation for a job done far above the required level. In a practical sense, the extra “+” can be used to improve your other grades if needed.
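As an illustration, the percentage-to-letter conversion in the table above can be written as a simple lookup. This is a sketch under stated assumptions, not the official conversion: the names GRADE_BANDS and letter_grade are hypothetical, and the treatment of fractional percentages (a value such as 97.5 is placed in the band whose lower bound it has reached) is an assumption, because the table lists whole numbers only. Extra credit is not modeled.

```python
# (lower bound in %, letter grade), highest band first, taken from the table.
GRADE_BANDS = [
    (98, "A+"), (94, "A"), (90, "A-"),
    (85, "B+"), (80, "B"), (75, "B-"),
    (70, "C+"), (65, "C"), (60, "C-"),
    (50, "D"), (0, "F"),
]


def letter_grade(percent):
    """Return the letter grade for a percentage between 0 and 100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for lower_bound, letter in GRADE_BANDS:
        # Assumption: a fractional value belongs to the highest band
        # whose lower bound it reaches.
        if percent >= lower_bound:
            return letter


print(letter_grade(78))  # falls in the 75-79 band
```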

Feedback

Feedback will be provided on assignments and exams within 2-3 days of receipt. Responses to the forum posts will be provided at least 4 times per week.

Confidentiality

We can draw on the wealth of examples from our organizations in class discussions and in our written work. However, it is imperative that we not share information that is confidential, privileged, or proprietary in nature. We must be mindful of any contracts we have agreed to with our companies. In addition, we should respect our fellow classmates and work under the assumption that what is discussed here (as it pertains to the workings of particular organizations) stays within the confines of the classroom.

For your awareness, members of the University's technical staff have access to all course sites to aid in course setup and technical troubleshooting. Program Chairs and a small number of Graduate Professional Studies (GPS) staff have access to all GPS courses for oversight purposes. Students enrolled in GPS courses can expect that individuals other than their fellow classmates and the course instructor(s) may visit their course for various purposes. Their intentions are to aid in technical troubleshooting and to ensure that quality course delivery standards are met. Strict confidentiality of student information is maintained.

Class Schedule

Week 1   01/22 - 01/28     Week 7    03/05 - 03/11
Week 2   01/29 - 02/04     Week 8    03/12 - 03/18
Week 3   02/05 - 02/11     Week 9    03/19 - 03/25
Week 4   02/12 - 02/18     Week 10   03/26 - 04/01
Week 5   02/19 - 02/25     Week 11   04/02 - 04/08
Week 6   02/26 - 03/04     Week 12   04/09 - 04/15
                           Week 13   04/16 - 04/22

IV. University and Division of Continuing Studies Standards

Please review the policies and procedures of Continuing Studies, found at http://www.brandeis.edu/gps/students/studentresources/policiesprocedures/index.html. Among them, we would like to highlight the following.

Learning Disabilities

If you are a student with a documented disability on record at Brandeis University and wish to have a reasonable accommodation made for you in this course, please contact me immediately.

Academic Honesty and Student Integrity

Academic honesty and student integrity are of fundamental importance at Brandeis University and we want students to understand this clearly at the start of the term. As stated in the Brandeis Rights and Responsibilities handbook, “Every member of the University Community is expected to maintain the highest standards of academic honesty. A student shall not receive credit for work that is not the product of the student’s own effort. A student's name on any written exercise constitutes a statement that the work is the result of the student's own thought and study, stated in the student's own words, and produced without the assistance of others, except in quotes, footnotes or references with appropriate acknowledgement of the source.” In particular, students must be aware that material (including ideas, phrases, sentences, etc.) taken from the Internet and other sources MUST be appropriately cited if quoted, and footnoted in any written work turned in for this, or any, Brandeis class. Also, students will not be allowed to collaborate on work except by the specific permission of the instructor. Failure to cite resources properly may result in a referral being made to the Office of Student Development and Judicial Education. The outcome of this action may involve academic and disciplinary sanctions, which could include (but are not limited to) such penalties as receiving no credit for the assignment in question, receiving no credit for the related course, or suspension or dismissal from the University.

Further information regarding academic integrity may be found in the following publications: "In Pursuit of Excellence - A Guide to Academic Integrity for the Brandeis Community", "(Students') Rights and Responsibilities Handbook" and "Continuing Studies Student Handbook". You should read these publications, all of which can be accessed from the Continuing Studies Web site. A student who is in doubt about standards of academic honesty (regarding plagiarism, multiple submissions of written work, unacknowledged or unauthorized collaborative effort, false citation or false data) should consult either the course instructor or other staff of the Rabb School for Continuing Studies.

University Caveat

The above schedule, content, and procedures in this course are subject to change in the event of extenuating circumstances.