Statistics

13
Nemours Biomedical Research Statistics March 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility

description

Statistics. March 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility. Overview. Class goals Master basic statistical concepts Learn analytic techniques & when to apply them Learn how to interpret analysis results - PowerPoint PPT Presentation

Transcript of Statistics

Page 1: Statistics

Nemours Biomedical Research

Statistics

March 2009Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D.

Nemours Bioinformatics Core Facility

Page 2: Statistics

Nemours Biomedical Research

Overview

• Class goals– Master basic statistical concepts– Learn analytic techniques & when to apply

them– Learn how to interpret analysis results– Develop familiarity with R and related tools– Gain understanding that will transfer to a

broad range of other statistics tool

Page 3: Statistics

Nemours Biomedical Research

Overview

• Class structure– 8 sessions– 1.5 hours per session– Several homework assignments

• Class website– http://www.medsci.udel.edu/open/StatClass/March2009

Page 4: Statistics

Nemours Biomedical Research

R

• Installing– Download from Class website

• Pick right (Mac versus Windows) version

– Run installer program• Go with all the defaults

• Running R– Live Demonstration

Page 5: Statistics

Nemours Biomedical Research

R Concepts

• Command line similar to– Windows Command Shell– Mac Terminal– Unix/Linux Shell

• Uses ‘>’ as prompt• Accepts

– Constants (e.g., 2 3 156 -99.0, etc.)– Variables (e.g., Height, Weight, SubjID,…)– Operations (e.g., ‘+’ ‘-’ ‘*’ ‘/’ ‘^’ ‘>’ ‘<‘ ‘==‘ ‘<-’)– Functions (e.g., sum(c(1,2,3)), mean(c(1,2,3))…)

Page 6: Statistics

Nemours Biomedical Research

R Concepts

• Variable types– Scalar (a = 128)– Vector (a = c(3, 2, 9, 5))– Matrix (dim(a) = c(2,2))– Data Frame - collection of vectors.

• Similar to spread sheet• ‘rows’ are indexed• ‘cols’ named and indexed• E.g., df$a is the column of data frame df named ‘a’ and

df$a[5] is the 5th element of ‘a’.

Page 7: Statistics

Nemours Biomedical Research

Statistics

• Science of data collection, summarization, analysis and interpretation.

• Descriptive versus Inferential Statistics:

– Descriptive Statistic: Data description (summarization) such as center, variability and shape for quantitative variable (e.g. age) and number (frequency) and percentage for categorical variable (e.g. gender, race etc).

– Inferential Statistic : Drawing conclusion beyond the sample studied, allowing for prediction.

Page 8: Statistics

Nemours Biomedical Research

Statistical Description of Data

• Statistics describes a numeric set of data by its

• Center (mean, median, mode etc)

• Variability (standard deviation, range etc)

• Shape (skewness, kurtosis etc)

• Statistics describes a categorical set of data by

• Frequency, percentage or proportion of each

category

Page 9: Statistics

Nemours Biomedical Research

Statistical Inference

•Statistical inference is the process by which we acquire information about populations from samples.•Two types of estimates for making inferences:

–Point estimation.–Interval estimate.

sample population

Page 10: Statistics

Nemours Biomedical Research

Population and Sample

• Population: The entire collection of individuals or

measurements about which information is desired.

• Sample: A subset of the population selected for study.

– Primary objective is to create a subset of population

whose center, spread and shape are as close as that of

population.

Page 11: Statistics

Nemours Biomedical Research

Parameter v.s. Statistic

• Parameter: – Any statistical characteristic of a population.

– Population mean, population median, population standard deviation are examples of parameters.

– Parameter describes the distribution of a population

– Parameters are fixed and usually unknown

Page 12: Statistics

Nemours Biomedical Research

Parameter v.s. Statistic

• Statistic:

– Any statistical characteristic of a sample.

– Sample mean, sample median, sample standard deviation are some examples of statistics.

– Statistic describes the distribution of population

– Value of a statistic is known and is varies for different samples

– Are used for making inference on parameter

Page 13: Statistics

Nemours Biomedical Research

Parameter v.s. Statistic

• Statistical Issue– Estimate a population parameter using a

sample statistic.– E.g., the sample mean is an estimate of the

population mean.