Review of Coursera Data Analysis Course

Jim Thompson
JamesThompsonC@gmail.com


To make sense of my comments…

• Who’s the reviewer?
• What is a MOOC?
• Overview of the course

(Through this reviewer’s eyes)


The Reviewer (Who am I?)

Not a professional data analyst:
• Chemist by training
• Develop and commercialize new materials and applications by profession

Not a data analysis layman:
• Data analysis as a hobby, on and off for 25 years
• Downloaded R in January 2009, used it ever since

“Data Analysts Captivated by R’s Power”, The New York Times, January 2009
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?pagewanted=all


How I taught myself R

Whatever caught my fancy at the moment:
• No mentor, no colleague
• Books (> 10 on R), Internet articles, R vignettes
• Learning by doing, mainly with work data, for fun not for work

Because it was a hobby, I lacked discipline in:
• Clean code
• Reporting
• Reproducible research
• Appropriate use of statistical techniques



I tried Open University
• Excellent teachers
• One-hour-long lectures
• Some class homework provided; no grading
• Complete at your own pace

Intro to Programming, Stanford


I don’t have one-hour chunks of time, nor the discipline.


“The Year of the MOOC”, The New York Times [1]

• A massive open online course (MOOC) is … aimed at large-scale interactive participation and open access via the web. [2]
• www.Udacity.com
• www.edX.org
• www.Coursera.org

[1] http://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html?pagewanted=all&_r=0
[2] http://en.wikipedia.org/wiki/Massive_open_online_course


Data Analysis by Jeffrey Leek

An applied statistics course focusing on data analysis, not mathematical details. How to:
• Organize and perform analysis
• Interpret results
• Diagnose potential problems
• Write up data analyses

Statistical methods:


Requires a working knowledge of R


How does this work?
• Time-bound (i.e., 6 weeks)
• Plan on 3-10 hrs/week
• Watch three to five videos a week, each 10-15 min long
• Weekly quizzes
• Submit two papers/reports
• Slides, video, R code available for download
• A certificate


Structure the analysis: tips on finding, organizing, and cleaning the data and the code.

Week 1, Week 2

Personal comments:

Very useful.

Biggest Benefit I


Exploratory & Inferential: clustering for exploratory analysis

Week 3, Week 4


Inferential & Predictive Analysis: learned new techniques, best practices

Week 5, Week 6


Advanced Techniques: good stuff, but I was running out of gas

Week 5, Week 5


Submit Two Reports

1. Inference analysis of mortgage data:
“This analysis considers whether any other variables have an important association with interest rate after taking into account the applicant's FICO score.”

2. Predictive modeling using sensors on cell phones:
“Given the output of a Samsung phone, can we predict whether the owner is sitting, laying, standing, walking, walking up stairs, or walking down stairs?”


Biggest Benefit II
• Submitting mine
• Analyzing others
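
To make the first report's question concrete, here is a minimal sketch of what "after taking into account the applicant's FICO score" can mean in practice. It is written in Python rather than the course's R, it uses synthetic data (the assignment data is not reproduced here), and the variable `loan_months` and every coefficient are made-up assumptions for illustration only. The idea: remove the linear effect of FICO score from both the interest rate and the candidate variable, then regress the leftovers on each other. That slope equals the coefficient the candidate variable would get in a multiple regression of rate on both variables.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def adjusted_slope(outcome, adjust_for, other):
    """Slope of `other` on `outcome` after adjusting both for `adjust_for`."""
    r_outcome = residuals(adjust_for, outcome)
    r_other = residuals(adjust_for, other)
    return fit_line(r_other, r_outcome)[1]


# Synthetic data (assumption): rate falls with FICO and rises with loan length.
rng = random.Random(0)
fico = [rng.uniform(640, 820) for _ in range(500)]
loan_months = [rng.choice([36, 60]) for _ in range(500)]
rate = [30.0 - 0.025 * f + 0.05 * m + rng.gauss(0, 0.5)
        for f, m in zip(fico, loan_months)]

slope = adjusted_slope(rate, fico, loan_months)
print(f"Adjusted slope for loan length: {slope:.4f} (simulated true value: 0.05)")
```

A real report would also need the uncertainty of that estimate and a discussion of confounders, as the rubric asks.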


Data analysis rubric

• Main text
Does the analysis have an introduction, methods, analysis, and conclusions?
Are figures labeled and referred to by number in the text?
Is the analysis written in grammatically correct English?
Are the names of variables reported in plain language, rather than in coded names?
Does the analysis report the number of samples?
Does the analysis report any missing data or other unusual features?
Does the analysis include a discussion of potential confounders?
Are the statistical models appropriately applied?
Are estimates reported with appropriate units and measures of uncertainty?
Are estimators/predictions appropriately interpreted?
Does the analysis make concrete conclusions?
Does the analysis specify potential problems with the conclusions?


• Figure
Is the figure caption descriptive enough to stand alone?
Does the figure focus on a key issue in the processing/modeling of the data?
Are axes labeled and are the labels large enough to read?

• References
Does the analysis include references for the statistical methods used?

• R script
Can the analysis be reproduced with the code provided?


Final comments

On MOOCs
• Thumbs up!

On Data Analysis by Jeffrey Leek
• Thumbs up!
• Target audience: I might be the sweet spot
• Excellent reference (links attached)
• On submitting reports:
  • Learned most by writing the reports and grading others

NOTE: Intro to R course scheduled for September 2013


Data Analysis by Jeffrey Leek

The Class
• https://www.coursera.org/course/dataanalysis
• https://github.com/jtleek/dataanalysis

The Prof
• http://www.biostat.jhsph.edu/~jleek/
• http://simplystatistics.org/


MOOC
