Review of Coursera Data Analysis Course

Jim Thompson

Page 1: Review of  Coursera  Data Analysis Course

Review of Coursera Data Analysis Course

Jim Thompson

Page 2: Review of  Coursera  Data Analysis Course

To make sense of my comments…

• Who's the reviewer
• What is a MOOC
• Overview of the course (through this reviewer's eyes)

Page 3: Review of  Coursera  Data Analysis Course

The Reviewer (Who am I?)

Not a professional data analyst:
• Chemist by training
• Develop and commercialize new materials and applications by profession

Not a data analysis layman:
• Data analysis as a hobby, on and off for 25 years
• Downloaded R in January 2009, used it ever since

“Data Analysts Captivated by R’s Power”, The New York Times, January 2009
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?pagewanted=all

Page 4: Review of  Coursera  Data Analysis Course

How I taught myself R

Whatever took my fancy at the moment:
• No mentor or colleague
• Books (> 10 on R), Internet articles, R vignettes
• Learning by doing, mainly on work data, for fun, not for work

Because it was a hobby, I lacked discipline in:
• Clean code
• Reporting
• Reproducible research
• Appropriate use of statistical techniques

Page 7: Review of  Coursera  Data Analysis Course

I tried Open University
• Excellent teachers
• One-hour-long lectures
• Some class homework provided; no grading
• Complete at your own pace

Intro to Programming, Stanford

I don't have one-hour chunks of time, nor the discipline.

Page 8: Review of  Coursera  Data Analysis Course

“The Year of the MOOC”, The New York Times [1]

• A massive open online course (MOOC) is … aimed at large-scale interactive participation and open access via the web. [2]
• www.Udacity.com
• www.edX.org
• www.Coursera.org

[1] http://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html?pagewanted=all&_r=0
[2] http://en.wikipedia.org/wiki/Massive_open_online_course

Page 10: Review of  Coursera  Data Analysis Course

Data Analysis by Jeffrey Leek

An applied statistics course focusing on data analysis, not mathematical details. How to:
• Organize and perform analysis
• Interpret results
• Diagnose potential problems
• Write up data analyses

Statistical methods:

Requires a working knowledge of R

Page 11: Review of  Coursera  Data Analysis Course

How does this work?
• Time-bound (i.e., 6 weeks)
• Plan on 3-10 hrs/week
• Watch three to five videos a week, each 10-15 min long
• Weekly quizzes
• Submit two papers/reports
• Slides, video, and R code available for download
• A certificate

Page 13: Review of  Coursera  Data Analysis Course

Structure the analysis: tips on finding, organizing, and cleaning the data and the code.

Week 1, Week 2

Personal comments: Very useful.

Biggest Benefit I

Page 14: Review of  Coursera  Data Analysis Course

Exploratory & Inferential: clustering for exploratory analysis

Week 3, Week 4

Page 15: Review of  Coursera  Data Analysis Course

Inferential & Predictive Analysis: learned new techniques, best practices

Week 5, Week 6

Page 16: Review of  Coursera  Data Analysis Course

Advanced Techniques: good stuff, but I was running out of gas

Week 5 Week 5

Page 18: Review of  Coursera  Data Analysis Course

Submit Two Reports

1. Inference analysis of mortgage data:
“This analysis considers whether any other variables have an important association with interest rate after taking into account the applicant's FICO score”

2. Predictive modeling using sensors on cell phones:
“Given the output of a Samsung phone, can we predict whether the owner is sitting, laying, standing, walking, walking up stairs, or walking down stairs?”

Biggest Benefit II
• Submitting mine
• Analyzing others'
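
The first report's question, whether another variable still matters once FICO score is accounted for, can be illustrated by comparing a FICO-only fit with a fit that adds a second predictor. What follows is only a minimal sketch on made-up data. The loan length variable, the `made_up_rate` formula, and every number are hypothetical and are not the course's mortgage data. The course itself used R; plain Python is used here only to keep the sketch dependency-free.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def rss(rows, y, coef):
    """Residual sum of squares of the fitted model."""
    total = 0.0
    for r, yi in zip(rows, y):
        pred = coef[0] + sum(c * v for c, v in zip(coef[1:], r))
        total += (yi - pred) ** 2
    return total


def made_up_rate(fico, months):
    """Hypothetical interest rate (%), invented so the fit can be checked."""
    return 45.0 - 0.05 * fico + 0.08 * months


# Made-up applicants: (FICO score, loan length in months).
applicants = [(f, m) for f in (650, 700, 750, 800) for m in (36, 60)]
rates = [made_up_rate(f, m) for f, m in applicants]

# Model 1: interest rate ~ FICO. Model 2: interest rate ~ FICO + loan length.
fico_only = [(f,) for f, _ in applicants]
coef_fico = ols(fico_only, rates)
coef_both = ols(applicants, rates)

rss_fico = rss(fico_only, rates, coef_fico)
rss_both = rss(applicants, rates, coef_both)

print("FICO only:          coefficients", coef_fico, "RSS", rss_fico)
print("FICO + loan length: coefficients", coef_both, "RSS", rss_both)
```

If the residual sum of squares drops noticeably when the second variable is added, that variable carries information about interest rate beyond FICO score. A real report would also give measures of uncertainty for the estimates, as the rubric asks.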

Page 19: Review of  Coursera  Data Analysis Course

Data analysis rubric

• Main text
Does the analysis have an introduction, methods, analysis, and conclusions?
Are figures labeled and referred to by number in the text?
Is the analysis written in grammatically correct English?
Are the names of variables reported in plain language, rather than in coded names?
Does the analysis report the number of samples?
Does the analysis report any missing data or other unusual features?
Does the analysis include a discussion of potential confounders?
Are the statistical models appropriately applied?
Are estimates reported with appropriate units and measures of uncertainty?
Are estimators/predictions appropriately interpreted?
Does the analysis make concrete conclusions?
Does the analysis specify potential problems with the conclusions?

Page 20: Review of  Coursera  Data Analysis Course

Data analysis rubric

• Figure
Is the figure caption descriptive enough to stand alone?
Does the figure focus on a key issue in the processing/modeling of the data?
Are axes labeled and are the labels large enough to read?

• References
Does the analysis include references for the statistical methods used?

• R script
Can the analysis be reproduced with the code provided?

Page 21: Review of  Coursera  Data Analysis Course

Final comments

On MOOCs
• Thumbs up!

On Data Analysis by Jeffrey Leek
• Thumbs up!
• Target audience: I might be the sweet spot
• Excellent reference (links attached)
• On submitting reports: I learned the most by writing the reports and grading others'

NOTE: Intro to R course scheduled for September 2013

Page 22: Review of  Coursera  Data Analysis Course

Data Analysis by Jeffrey Leek

The Class
• https://www.coursera.org/course/dataanalysis
• https://github.com/jtleek/dataanalysis

The Prof
• http://www.biostat.jhsph.edu/~jleek/
• http://simplystatistics.org/

Page 23: Review of  Coursera  Data Analysis Course

MOOC
