Lecture 1 Stat Applications, Types of Data And Statistical Inference.


Transcript of Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Page 1: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Lecture 1 Stat Applications, Types of Data And Statistical Inference

Page 2: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Statistics and Data
• What is Statistics?
  – Intuitively, statistics are numbers or summaries of numbers.
    • Such as the average, sum, maximum or minimum.
  – In a broader sense, Statistics is the art and science of collecting, analyzing, presenting and interpreting data.
    • Statisticians collect and analyze data, and then draw conclusions about the truth behind the data.
    • Data are the facts and figures we collect, analyze, and summarize for presentation and interpretation.
• All the data collected in a particular study are called the data set.

Page 3: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Basic Concepts
• Data: facts and figures collected, analyzed and summarized for presentation and interpretation
• Data Set: all data collected in a particular study
• Elements: the individual entities of a data set
• Variable: a characteristic of interest for the elements
• Observation: the set of measurements collected for a particular element
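The terms above can be made concrete with a tiny data set. This is only an illustrative sketch: the student names, the two variables and their values are invented for the example and do not come from the lecture.

```python
# Hypothetical data set: three students (the elements) measured on two variables.
data_set = {
    "Ann": {"class": "soph", "height_cm": 165.0},
    "Bob": {"class": "sr", "height_cm": 180.5},
    "Cho": {"class": "fresh", "height_cm": 172.0},
}

elements = list(data_set)            # the individual entities in the data set
variables = list(data_set["Ann"])    # the characteristics of interest
observation_bob = data_set["Bob"]    # all measurements collected for one element

print("Elements:", elements)
print("Variables:", variables)
print("Observation for Bob:", observation_bob)
```

Each row (one student) is an observation, and each key inside a row is a variable.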

Page 4: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Types of Data
• There are different classifications.
• Each classification serves some specific purpose.
• Correspondingly, each type of data requires some specific techniques of analysis.

Page 5: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Types of Data
• Qualitative: labels or names used to identify an attribute of each element
  • Nominal: order does NOT matter (gender, race, marital status)
  • Ordinal: order DOES matter (level of satisfaction, class [fresh, soph, jr, sr])
• Quantitative: numeric values that indicate how much or how many
  • Interval: differences between values have meaning, but ratios do not (temperature in C/F, IQ score)
  • Ratio: ratios of quantities have meaning (height, weight, age)
  • The difference between interval and ratio data is whether ZERO matters: ratio data have a true zero, while the zero of interval data is arbitrary.
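A short numeric check shows why zero matters. The sketch below uses made-up temperatures and heights. It compares the ratio of two temperatures in Celsius and in Fahrenheit (interval data), then does the same for two heights in centimeters and inches (ratio data). The helper names `c_to_f` and `cm_to_inches` are introduced only for this illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def cm_to_inches(cm):
    """Convert centimeters to inches."""
    return cm / 2.54


# Interval data: the ratio changes when the unit changes, so it has no meaning.
warm_c, cool_c = 20.0, 10.0
ratio_celsius = warm_c / cool_c                         # 20 / 10
ratio_fahrenheit = c_to_f(warm_c) / c_to_f(cool_c)      # 68 / 50

# Ratio data: the ratio is the same in every unit, because zero means "none".
tall_cm, short_cm = 180.0, 90.0
ratio_cm = tall_cm / short_cm
ratio_inches = cm_to_inches(tall_cm) / cm_to_inches(short_cm)

print("Temperature ratio in C:", ratio_celsius)
print("Temperature ratio in F:", round(ratio_fahrenheit, 2))
print("Height ratio in cm:", ratio_cm)
print("Height ratio in inches:", round(ratio_inches, 2))
```

"Twice as hot" depends on the scale chosen, while "twice as tall" does not.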

Page 6: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Types of Data (cont.)
• Cross-Sectional Data: data collected at the same or approximately the same point in time
• Time Series Data: data collected over several time periods

Page 7: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Sources of Data
• Existing Sources: data that already exist
• Surveys: for example, teaching evaluations
• Experiments:
  • *the key thing in an experiment, as opposed to an observational study, is that you manipulate and control what the groups receive, such as assigning a different treatment (drug) to each one*
• Observational Studies:
  • *the key thing in an observational study, as opposed to an experiment, is that you simply observe what happens and do not give a specific treatment to anything*

Page 8: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

How to get data?
• Sampling
• For surveys, experiments and observational studies.

Page 9: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Types of Sampling
• Simple Random Sampling (Finite Population): a sample of size n from a finite population of size N is selected such that each possible sample of size n has the same probability of being selected.

• Simple Random Sampling (Infinite Population): a sample selected such that each element selected comes from the population and each element is selected independently.

• Sampling With Replacement: Elements are put back in the population after being selected, so an element can be selected more than once for a single sample.

• Sampling Without Replacement: Elements are not replaced after being selected and are therefore only chosen once to be in a sample.
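The two replacement schemes can be sketched with Python's standard `random` module. The population of 20 numbered units, the sample size of 5 and the seed are assumptions made for the illustration. `sample` draws without replacement so that every possible sample of size n is equally likely, while `choices` draws with replacement.

```python
import random

rng = random.Random(1)             # seed chosen arbitrarily so the run is repeatable
population = list(range(1, 21))    # hypothetical finite population, N = 20
n = 5                              # sample size

# Without replacement: an element can appear at most once in the sample.
without_replacement = rng.sample(population, n)

# With replacement: an element goes back after each draw and may repeat.
with_replacement = rng.choices(population, k=n)

print("Without replacement:", without_replacement)
print("With replacement:   ", with_replacement)
```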

Page 10: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Types of Sampling (cont.)

• Convenience Sample: participants are not selected at random; instead, those who are most convenient to reach are chosen. An example is polling people as they come into the grocery store on a Saturday from 1-2pm, rather than taking all the people who came on Saturday and choosing a sample of the same size in which all people are equally likely to be chosen.

• Stratified Random Sample: The population is divided into different strata (or groups) and people are randomly chosen from each stratum. An example would be to divide Purdue undergraduates into freshmen, sophomores, juniors and seniors, and to select randomly from each group.
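Stratified sampling can be sketched as a simple random sample drawn separately inside each stratum. The function name `stratified_sample`, the student IDs and the choice of two students per class are hypothetical. Real studies often pick stratum sample sizes in proportion to stratum size, which this sketch does not do.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of size per_stratum from every stratum."""
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }


rng = random.Random(2)  # arbitrary seed for a repeatable run

# Hypothetical strata: eight invented student IDs per class year.
strata = {
    "Freshman": [f"F{i}" for i in range(1, 9)],
    "Sophomore": [f"So{i}" for i in range(1, 9)],
    "Junior": [f"J{i}" for i in range(1, 9)],
    "Senior": [f"Sr{i}" for i in range(1, 9)],
}

sample = stratified_sample(strata, per_stratum=2, rng=rng)
for class_year, students in sample.items():
    print(class_year, students)
```

Every class year is guaranteed to appear in the sample, which a single simple random sample of eight students would not guarantee.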

Page 11: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Types of Sampling (cont. II)

• Cluster Sample: The population is divided into different groups (clusters), and whole groups are randomly selected to be included in the sample. Every element in a selected group is part of the sample.

• Kth Element Random Sample (systematic sample): number the units in the population from 1 to N, decide on the sample size n that you want or need, and set k = N/n, the interval size. Randomly select an integer between 1 and k as the starting unit, then take every kth unit after it.
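Both procedures on this slide can be sketched in a few lines. The function names, the dorm clusters and the sizes (N = 100, n = 10, two dorms out of four) are made up for the illustration. When N is not a multiple of n, the sketch rounds k down, which is one common convention and an assumption here.

```python
import random


def systematic_sample(population_size, sample_size, rng):
    """Return unit numbers (from 1..N) chosen by taking every kth unit."""
    k = population_size // sample_size   # interval size, k = N/n rounded down
    start = rng.randint(1, k)            # random integer from 1 to k inclusive
    return [start + i * k for i in range(sample_size)]


def cluster_sample(clusters, clusters_to_pick, rng):
    """Randomly pick whole clusters and keep every element inside them."""
    chosen = rng.sample(sorted(clusters), clusters_to_pick)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(4)  # arbitrary seed for a repeatable run

# Kth element sample: N = 100 numbered units, n = 10, so k = 10.
units = systematic_sample(population_size=100, sample_size=10, rng=rng)
print("Systematic sample:", units)

# Cluster sample: hypothetical dorms with three residents each.
dorms = {
    "Dorm A": ["A1", "A2", "A3"],
    "Dorm B": ["B1", "B2", "B3"],
    "Dorm C": ["C1", "C2", "C3"],
    "Dorm D": ["D1", "D2", "D3"],
}
residents = cluster_sample(dorms, clusters_to_pick=2, rng=rng)
print("Cluster sample:", residents)
```

In the kth element sample only the starting point is random. In the cluster sample the randomness is in which groups are picked, not in which people are picked within a group.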

Page 12: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Statistical Inference

• Population: the set of all elements of interest in a particular study

• Sample: a subset of the population

• Sample Survey: the process of conducting a survey to collect data for a sample

• Census: the process of conducting a survey to collect data for the entire population

Page 13: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

What is Statistical Inference?

Using data from a sample to estimate the characteristics of a population
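As a minimal sketch of this idea, the code below builds a made-up population of 10,000 heights, draws a simple random sample of 100, and uses the sample mean as an estimate of the population mean. The normal distribution, its parameters, the sizes and the seed are all assumptions chosen for the illustration.

```python
import random
import statistics

rng = random.Random(3)  # arbitrary seed for a repeatable run

# Hypothetical population: 10,000 heights in cm (mean 170, sd 10 by construction).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# In practice only the sample would be observed, not the whole population.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)  # the usually unknown characteristic
sample_mean = statistics.mean(sample)          # the estimate computed from the sample

print("Population mean:", round(population_mean, 2))
print("Sample mean:    ", round(sample_mean, 2))
print("Estimation error:", round(sample_mean - population_mean, 2))
```

The sample mean will generally not equal the population mean exactly, but with a random sample it tends to be close.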

Page 14: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Why sample rather than population?

• Hard to sample EVERYONE

• Too expensive to sample everyone

• Too much time/effort to sample everyone

Page 15: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

Bias
• How a question is asked can have an effect on how a respondent responds.
• Bias is bad.

Page 16: Lecture 1 Stat Applications, Types of Data And Statistical Inference.

• What are the sources of bias?