Statistics
description
Transcript of Statistics
Nemours Biomedical Research
Statistics
March 2009Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D.
Nemours Bioinformatics Core Facility
Nemours Biomedical Research
Overview
• Class goals– Master basic statistical concepts– Learn analytic techniques & when to apply
them– Learn how to interpret analysis results– Develop familiarity with R and related tools– Gain understanding that will transfer to a
broad range of other statistics tool
Nemours Biomedical Research
Overview
• Class structure– 8 sessions– 1.5 hours per session– Several homework assignments
• Class website– http://www.medsci.udel.edu/open/StatClass/March2009
Nemours Biomedical Research
R
• Installing– Download from Class website
• Pick right (Mac versus Windows) version
– Run installer program• Go with all the defaults
• Running R– Live Demonstration
Nemours Biomedical Research
R Concepts
• Command line similar to– Windows Command Shell– Mac Terminal– Unix/Linux Shell
• Uses ‘>’ as prompt• Accepts
– Constants (e.g., 2 3 156 -99.0, etc.)– Variables (e.g., Height, Weight, SubjID,…)– Operations (e.g., ‘+’ ‘-’ ‘*’ ‘/’ ‘^’ ‘>’ ‘<‘ ‘==‘ ‘<-’)– Functions (e.g., sum(c(1,2,3)), mean(c(1,2,3))…)
Nemours Biomedical Research
R Concepts
• Variable types– Scalar (a = 128)– Vector (a = c(3, 2, 9, 5))– Matrix (dim(a) = c(2,2))– Data Frame - collection of vectors.
• Similar to spread sheet• ‘rows’ are indexed• ‘cols’ named and indexed• E.g., df$a is the column of data frame df named ‘a’ and
df$a[5] is the 5th element of ‘a’.
Nemours Biomedical Research
Statistics
• Science of data collection, summarization, analysis and interpretation.
• Descriptive versus Inferential Statistics:
– Descriptive Statistic: Data description (summarization) such as center, variability and shape for quantitative variable (e.g. age) and number (frequency) and percentage for categorical variable (e.g. gender, race etc).
– Inferential Statistic : Drawing conclusion beyond the sample studied, allowing for prediction.
Nemours Biomedical Research
Statistical Description of Data
• Statistics describes a numeric set of data by its
• Center (mean, median, mode etc)
• Variability (standard deviation, range etc)
• Shape (skewness, kurtosis etc)
• Statistics describes a categorical set of data by
• Frequency, percentage or proportion of each
category
Nemours Biomedical Research
Statistical Inference
•Statistical inference is the process by which we acquire information about populations from samples.•Two types of estimates for making inferences:
–Point estimation.–Interval estimate.
sample population
Nemours Biomedical Research
Population and Sample
• Population: The entire collection of individuals or
measurements about which information is desired.
• Sample: A subset of the population selected for study.
– Primary objective is to create a subset of population
whose center, spread and shape are as close as that of
population.
Nemours Biomedical Research
Parameter v.s. Statistic
• Parameter: – Any statistical characteristic of a population.
– Population mean, population median, population standard deviation are examples of parameters.
– Parameter describes the distribution of a population
– Parameters are fixed and usually unknown
Nemours Biomedical Research
Parameter v.s. Statistic
• Statistic:
– Any statistical characteristic of a sample.
– Sample mean, sample median, sample standard deviation are some examples of statistics.
– Statistic describes the distribution of population
– Value of a statistic is known and is varies for different samples
– Are used for making inference on parameter
Nemours Biomedical Research
Parameter v.s. Statistic
• Statistical Issue– Estimate a population parameter using a
sample statistic.– E.g., the sample mean is an estimate of the
population mean.