Statistical Design & Models Validation

Introduction

KNOWLEDGE INQUIRY

How we know what we know, and how we know we know.

Bouma, Gary D. & G.B.J. Atkinson (1995). A Handbook of Social Science Research, p. 3.

PURPOSE OF RESEARCH

Description
Explanation
Prediction
Control
Development
Exploration

Questions addressed: What, When, Where, How; and Why.

Confirmatory Factor Analysis & Path Analysis

RESEARCH PROCESS & DESIGN

[Diagram: Interest, Idea, and Theory as starting points of the research process]

Conceptualization: Specify the meaning of the concepts and variables to be studied.
Operationalization: How will we actually measure the variables under study?
Choice of Research Method: Experimental Research, Survey Research, Field Research, Content Analysis, Existing Data Research, Comparative Research, Evaluation Research, Mixed Design.
Population & Sampling: Whom do we want to be able to draw conclusions about? Who will be observed for the purpose?
Observation: Collecting data for analysis and interpretation.
Data Processing: Transforming the data collected into a form appropriate to manipulation and analysis.
Analysis: Analyzing data and drawing conclusions.
Application: Reporting results and assessing their implications.


Research Design comprises four component designs: Measurement Design, Sampling Design, Data Collecting Design, and Analysis Design.


RESEARCH DESIGN: TIME DIMENSIONS

Cross-sectional Study (one point in time)
  2547: age groups 21-30, 31-40, 41-50

Trend Study (same framework & instruments)
  2547: age groups 21-30, 31-40, 41-50
  2557: age groups 21-30, 31-40, 41-50

Cohort Study (same framework & instruments)
  2547: age groups 21-30, 31-40, 41-50
  2557: age groups 21-30, 31-40, 41-50, 51-60

Panel Study (same individuals)
  2547: age groups 21-30, 31-40, 41-50
  2557: age groups 21-30, 31-40, 41-50, 51-60

Validity and Reliability of Research Finding

Validity = Accuracy = Low Bias
Reliability = Precision = Low Variance

[Figure: probability density against value. Accuracy is marked between the reference value (parameter) and the statistic; precision is marked as the spread of the distribution.]
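The link between validity and bias, and between reliability and variance, can be illustrated numerically. The following is a minimal sketch with made-up numbers: the function name `bias_and_variance`, the two lists of estimates, and the reference value of 50 are all hypothetical and are not taken from the slides.

```python
from statistics import mean, pvariance

def bias_and_variance(estimates, reference_value):
    """Return (bias, variance) of repeated estimates of one parameter."""
    bias = mean(estimates) - reference_value   # accuracy: small bias = high validity
    variance = pvariance(estimates)            # precision: small variance = high reliability
    return bias, variance

# Hypothetical statistics from repeated studies; the assumed true parameter is 50.
valid_and_reliable = [49, 50, 51, 50, 50]
reliable_not_valid = [59, 60, 61, 60, 60]

bias_a, var_a = bias_and_variance(valid_and_reliable, 50)
bias_b, var_b = bias_and_variance(reliable_not_valid, 50)
print(bias_a, var_a)
print(bias_b, var_b)
```

Both sets of estimates have the same spread, so they are equally precise, but only the first is centered on the reference value.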

Low Validity and Low Reliability

Low Validity = Low Accuracy = High Bias
Low Reliability = Low Precision = High Variance

[Figure: probability density against value, showing low accuracy (statistic far from the reference value, the parameter) and low precision (wide spread).]

Low Validity and High Reliability

Low Validity = Low Accuracy = High Bias
High Reliability = High Precision = Low Variance

[Figure: probability density against value, showing low accuracy (statistic far from the reference value, the parameter) and high precision (narrow spread).]

Validity and Reliability of Measurement

Quality of Measurement
[Figure: three panels]
• A test with low validity because of low reliability
• A highly valid test
• A reliable test with low validity

Null Hypothesis Testing

Goal: To determine whether the independent variable has a statistically significant (real) effect on the dependent variable, that is, an effect that is UNLIKELY to be due to chance variation or sampling error.

The null hypothesis

• Researchers make the initial assumption that the independent variable manipulation will have NO EFFECT on the dependent variable (will be null).

• Under the “null hypothesis”, any observed difference between groups is assumed to be due to chance (random error) unless proven otherwise!

• Inferential statistics are the tools used to resolve this question.

Inferential Statistics

• Tools for testing how likely it is that the results of a study are due to error or chance variation.

• It is always possible that differences between groups, or relationships observed at the end of the study, are due to sampling error rather than to the independent variable.

• Sampling error: the extent to which the groups were different at the start of the study.
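The idea that an observed group difference might be due to chance alone can be made concrete with a permutation test. This is a minimal sketch, not the procedure the slides prescribe: the function name `permutation_p_value` and the two small groups of scores are hypothetical, and exact enumeration is only practical for very small samples.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in group means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    extreme = total = 0
    # Under the null hypothesis the group labels are arbitrary, so enumerate
    # every way of relabelling the pooled scores into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical scores for illustration only.
treatment = [12, 14, 15, 16]
control = [8, 9, 10, 11]
p_value = permutation_p_value(treatment, control)
print(round(p_value, 4))
```

The p-value is the share of relabellings that produce a difference at least as large as the observed one; a small value means chance alone rarely produces such a difference.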

Inferential Statistics

• Statistical significance:
  • Type I error (α)
  • Type II error (β)
  • Power of test (1-β)
  • Confidence interval (1-α)

• Practical significance:
  • Effect size (η², ω²)
  • Sample size determination
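Type I error, power, effect size, and sample size are tied together, and a short calculation shows how. The sketch below rests on assumptions not in the slides: a two-group comparison of means, a normal approximation, and Cohen's d as the effect size. The function names `n_per_group` and `approximate_power` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group comparison."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)   # critical value for Type I error
    z_power = STD_NORMAL.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

def approximate_power(effect_size, n, alpha=0.05):
    """Approximate power (1 - Type II error) with n cases per group."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * sqrt(n / 2) - z_alpha)

# Assumed medium effect (d = 0.5), alpha = .05, target power = .80.
n = n_per_group(0.5)
achieved_power = approximate_power(0.5, n)
print(n, round(achieved_power, 3))
```

A t-based calculation would give a slightly larger n; the normal approximation is used here only to keep the example short.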

Statistical Model & Analysis

DESCRIPTIVE STATISTICS

Mean

Standard deviation

Variance, Covariance

Frequency & Percentage & ratio

Percentile, quartile

Median & mode

Range, etc.

TESTING THE ASSUMPTIONS OF STATISTICS

Kurtosis
Skewness
Normal Distribution
Multivariate Normality
Multicollinearity
Linearity
Outliers
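Skewness and kurtosis are the first two checks on this list. The sketch below computes them from central moments; the function names and the two small data sets are hypothetical. It uses the population-moment definitions, so statistical packages that apply small-sample corrections will report slightly different values.

```python
from statistics import mean

def central_moment(data, k):
    """k-th central moment: the mean of (x - mean)^k."""
    m = mean(data)
    return mean((x - m) ** k for x in data)

def skewness(data):
    """Third standardized moment; 0 for a symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical data for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```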

Descriptive Statistics: How Important?

Measures of central tendency: mean, mode, median
Measures of dispersion: variance, standard deviation, mean deviation, range

Means of the variables: Mean(Y), Mean(X1), Mean(X2), Mean(X3)
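The measures of central tendency and dispersion named above can be computed with Python's standard `statistics` module. This is a small sketch on made-up scores; the variable names are hypothetical, and the population forms (`pvariance`, `pstdev`) are an arbitrary choice for the example.

```python
from statistics import mean, median, mode, pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

central = {
    "mean": mean(scores),
    "median": median(scores),
    "mode": mode(scores),
}

avg = mean(scores)
dispersion = {
    "variance": pvariance(scores),                           # population variance
    "sd": pstdev(scores),                                    # population standard deviation
    "mean_deviation": mean(abs(x - avg) for x in scores),    # mean absolute deviation
    "range": max(scores) - min(scores),
}
print(central)
print(dispersion)
```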

Univariate: Variable, Variation & Variance

Variances of the variables: Var(X1), Var(X2), Var(X3), Var(Y)

Descriptive statistics: mean vector and variance-covariance matrix

Bivariate: Variables, Variance & Covariance

Variances: Var(X1), Var(X2), Var(X3), Var(Y)
Covariances: Cov(X1,X2), Cov(X1,X3), Cov(X2,X3), Cov(X1,Y), Cov(X2,Y), Cov(X3,Y)
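The mean vector and the variance-covariance matrix summarize a set of variables together: variances sit on the diagonal and covariances off the diagonal. The sketch below builds both from scratch; the function names and the tiny data set are hypothetical, and the sample (n - 1) denominator is an assumption.

```python
from statistics import mean

def mean_vector(columns):
    """Mean of each variable, keyed by variable name."""
    return {name: mean(values) for name, values in columns.items()}

def covariance_matrix(columns):
    """Sample variance-covariance matrix (n - 1 denominator)."""
    names = list(columns)
    means = mean_vector(columns)
    n = len(columns[names[0]])

    def cov(a, b):
        return sum((x - means[a]) * (y - means[b])
                   for x, y in zip(columns[a], columns[b])) / (n - 1)

    # Diagonal cells are variances, off-diagonal cells are covariances.
    return {a: {b: cov(a, b) for b in names} for a in names}

# Hypothetical data for illustration only.
data = {"X1": [1, 2, 3, 4], "X2": [2, 1, 4, 3], "Y": [2, 4, 6, 8]}
S = covariance_matrix(data)
for name, row in S.items():
    print(name, row)
```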

Statistical Design & Conceptual Models

Symbols used and their meanings:
• Observed variable (nominal scale), entered as dummy variables:
    two categories: 1, 0
    three categories (d1 d2): 1 0, 0 1, 0 0
    four categories (d1 d2 d3): 1 0 0, 0 1 0, 0 0 1, 0 0 0
• Observed variable (interval scale)
• Latent variable
• Causal relationship
• Relationship

Analysis Using Dependent Techniques

Statistical Design

One-way ANOVA (independent-samples t-test)
  Between-subjects design: X1 (two groups, coded 1/0) -> Y
  Direct effects; question: are the groups different?

One-way ANOVA with repeated measures (dependent-samples t-test)
  Within-subjects design: Ypre -> Ypost
  Direct effects; question: is there a difference (change, gain, development)?

Statistical Design

Bivariate Correlation Analysis (r_xy)
  Two variables X and Y: r_xy (standardized scores), Cov(x,y) (raw scores)
  Three variables X, Y, Z: r_xy, r_xz, r_yz (standardized scores); Cov(x,y), Cov(x,z), Cov(y,z) (raw scores)
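The correlation coefficient is the covariance of raw scores rescaled by the two standard deviations, which is the same as the covariance of standardized scores. The sketch below checks this on made-up data; the function names and the values of `x` and `y` are hypothetical.

```python
from math import sqrt
from statistics import mean

def covariance(x, y):
    """Sample covariance of raw scores (n - 1 denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson r: covariance divided by the product of standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

def standardize(x):
    """Convert raw scores to z-scores."""
    mx, sx = mean(x), sqrt(covariance(x, x))
    return [(a - mx) / sx for a in x]

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r_xy = correlation(x, y)
r_from_z = covariance(standardize(x), standardize(y))
print(round(r_xy, 3), round(r_from_z, 3))
```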

Partial & Part Correlation Analysis (Spurious or Indirect Causality)

[Diagram: X1, X2, X3 and Y; direct effects in question]
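Partial and part correlations show whether an association survives once a third variable is controlled. The sketch below uses the standard first-order formulas; the function names and the correlation values are hypothetical and chosen so that the X-Y association is entirely spurious.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of X and Y with Z removed from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def part_correlation(r_xy, r_xz, r_yz):
    """Part (semipartial) correlation: Z is removed from X only."""
    return (r_xy - r_xz * r_yz) / sqrt(1 - r_xz ** 2)

# Hypothetical case: X and Y correlate only because both depend on Z.
spurious = partial_correlation(r_xy=0.48, r_xz=0.8, r_yz=0.6)

# Hypothetical case: part of the X-Y association remains after controlling Z.
remaining = partial_correlation(r_xy=0.5, r_xz=0.3, r_yz=0.4)
print(round(spurious, 3), round(remaining, 3))
```

A partial correlation near zero, as in the first case, is the pattern expected when the original correlation is spurious.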

One-way ANOVA (F-test)
  X1 (three groups, dummy-coded 1 0, 0 1, 0 0) -> Y
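The F statistic in a one-way ANOVA compares variation between group means with variation within groups. The following is a minimal sketch on made-up scores for three groups; the function name `one_way_anova_f` and the data are hypothetical, and no p-value is computed.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = mean square between groups / mean square within groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat = one_way_anova_f(groups)
print(f_stat)
```

With exactly two groups, this F equals the square of the independent-samples t statistic, which is why the two analyses are listed together.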

One-way ANOVA with repeated measures
  Within-subjects design: Y measured repeatedly (YT1, YT2)

Both designs, between-subjects and within-subjects, test direct effects.

Two-way ANOVA (additive model): no interaction effects
  Between-subjects design
  X1 (two groups, coded 1/0) and X2 (three groups, dummy-coded 1 0, 0 1, 0 0) -> Y
  Direct effects: main effect of X1, main effect of X2

Two-way ANOVA (non-additive model): interaction effects
  Between-subjects design
  X1 (two groups, coded 1/0) and X2 (three groups, dummy-coded 1 0, 0 1, 0 0) -> Y
  Direct effects: main effect of X1, main effect of X2, interaction effect

Multi-way ANOVA (the interactive structure)
  Between-subjects design
  X1, X2, X3 (dummy-coded: two factors with two groups, one with three groups) -> Y
  Direct effects: three main effects and the interaction effects

One-way Analysis of Covariance (ANCOVA), additive model
  Between-subjects design
  X1 (three groups, dummy-coded 1 0, 0 1, 0 0) and Z (covariate) -> Y

Two-way ANCOVA (interactive structure)
  Between-subjects design
  X1, X2 (dummy-coded: one with two groups, one with three groups) and Z (covariate) -> Y
  Direct effects: main effects and interaction effects

Simple Regression Analysis (SRA)
  X -> Y, with regression coefficient y.x and correlation r_xy

Multiple Regression Analysis (MRA) (convergent causal structure)
  X1, X2, X3 -> Y, with regression coefficients y.x1, y.x2, y.x3 (direct effects)
  No correlation (r = 0) among X1, X2, X3
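In simple regression the slope is the covariance of X and Y divided by the variance of X, and the intercept follows from the two means. The sketch below computes both by least squares; the function name `simple_regression` and the data are hypothetical.

```python
from statistics import mean

def simple_regression(x, y):
    """Least-squares intercept and slope for Y regressed on X."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-variation of X and Y
    sxx = sum((a - mx) ** 2 for a in x)                   # variation of X
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data lying exactly on the line Y = 1 + 2X.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
intercept, slope = simple_regression(x, y)
print(intercept, slope)
```

When the predictors in a multiple regression are uncorrelated with one another (r = 0), each multiple-regression coefficient equals the corresponding simple-regression slope.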

Multivariate Multiple Regression Analysis (MMR) (convergent causal structure, two or several times)
  X1, X2, X3 -> Y1, Y2 (direct effects)
  No correlation (r = 0)

Two-groups Discriminant Analysis (discriminant structure) / Binary Logistic Regression Analysis
  X1, X2, X3 -> (Y), a two-group variable coded 1/0, with weights W (direct effects)
  No correlation (r = 0)

Multiple Discriminant Analysis (discriminant structure with more than two population groups)
  X1, X2, X3 -> (Y), a three-group variable dummy-coded 1 0, 0 1, 0 0, with weights W (direct effects)
  No correlation (r = 0)

Multivariate Analysis of Variance, MANOVA (interactive structure, two or several times)
  X1, X2, X3 (dummy-coded: two factors with two groups, one with three groups) -> Y1, Y2
  Effects: three main effects and the interaction effects

Multivariate Analysis of Covariance, MANCOVA (interactive structure, two or several times)
  X1, X2 (dummy-coded: one with two groups, one with three groups) and Z (covariate) -> Y1, Y2
  Effects: main effects and interaction effects

Analysis Using Interdependent Techniques

Statistical Model

Canonical Correlation Analysis (CCA)
  Set of independent variables: X1, X2, X3, X4
  Set of dependent variables: Y1, Y2
  Canonical Function-1: canonical variates U1 (independent) and V1 (dependent), canonical correlation RC1
  Canonical Function-2: canonical variates U2 (independent) and V2 (dependent), canonical correlation RC2
  Links between variables and variates: canonical weight; canonical loading (simple correlation) and squared canonical loading

[Diagram: level of abstraction, from high (conceptualization) to low (operationalization)]
  Hierarchy: Concept & Construct -> Variables -> Indicators -> Items
  Definitions: conceptual definition, theoretical definition, real definition, operational definition (how is it measured?)
  Other labels: generalized idea; communication; real world; hypothesis testing; time, space, context; Test-1, Test-2, ..., Test-n

Principal Component Analysis (PCA)
  Measured variables (observed) / indicators / items: X1 to X9
  Components (factor structure / dimensions / unmeasured variables): 1, 2, 3
  Links: the component loading, or the structure/pattern coefficient
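A component loading is the correlation between a measured variable and a component. For the first principal component it can be obtained from the largest eigenvalue and its eigenvector of the correlation matrix. The sketch below uses power iteration on a made-up 3 x 3 correlation matrix; the function name `first_component` and the matrix `R` are hypothetical, and real analyses would use a linear-algebra library and extract several components.

```python
from math import sqrt

def first_component(corr, iterations=200):
    """Largest eigenvalue and loadings of a correlation matrix (power iteration).

    Assumes the all-ones starting vector is not orthogonal to the first
    eigenvector, which holds for matrices of positive correlations.
    """
    n = len(corr)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    # Rayleigh quotient v'Rv gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n))
    # Loading = eigenvector element scaled by the square root of the eigenvalue.
    loadings = [sqrt(eigenvalue) * a for a in v]
    return eigenvalue, loadings

# Hypothetical correlation matrix: three items that all correlate .60.
R = [[1.0, 0.6, 0.6],
     [0.6, 1.0, 0.6],
     [0.6, 0.6, 1.0]]
eigenvalue, loadings = first_component(R)
print(round(eigenvalue, 3), [round(a, 3) for a in loadings])
```

The eigenvalue divided by the number of variables gives the proportion of total variance the component accounts for.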

Statistical Model

Exploratory Factor Analysis (EFA) with Orthogonal Rotation
  Measured variables (observed) / indicators / items: X1 to X9, each with an error or uniqueness term
  Factors (factor structure / components / dimensions / unmeasured variables): 1, 2, 3, kept uncorrelated
  Links: the factor loading, or the structure/pattern coefficient

Statistical Model

Exploratory Factor Analysis (EFA) with Oblique Rotation
  Measured variables (observed) / indicators / items: X1 to X9, each with an error or uniqueness term
  Factors (factor structure / components / dimensions / unmeasured variables): 1, 2, 3
  Factor correlations, indexed 2,1; 3,1; 3,2
  Links: the factor loading, or the structure/pattern coefficient

Statistical Model

Measurement Model: Construct X with 3 subdimensions (3 factors)
  Indicators: X1 to X9
  Factors: 1, 2, 3, with factor correlations indexed 2,1; 3,1; 3,2
  Loadings, indexed (indicator, factor): 1,1; 2,1; 3,1; 4,2; 5,2; 6,2; 7,3; 8,3; 9,3

Statistical Design

Confirmatory Factor Analysis (CFA)
  Measured variables (observed) / indicators / items: X1 to X9, each with an error or uniqueness term
  Latent constructs (unmeasured variables): 1, 2, 3
  Factor loadings (structure/pattern coefficients), indexed (indicator, factor): 1,1; 2,1; 3,1; 4,2; 5,2; 6,2; 7,3; 8,3; 9,3
  Factor correlations, indexed 2,1; 3,1; 3,2: some factors are correlated, some factors are not correlated
  Some errors are correlated

Statistical Design: First-order Factor Analysis

First-order Confirmatory Factor Analytic Model
  Indicators x1 to x33, with error terms 1 to 33
  F-1: x1 to x8
  F-2: x9 to x17
  F-3: x18 to x27
  F-4: x28 to x33
  Factor correlations, indexed 2,1; 3,1; 3,2; 4,1; 4,2; 4,3

Statistical Design: Second-order Factor Analysis

Second-order Confirmatory Factor Analytic Model (First-order CFA and Second-order CFA)
  Indicators x1 to x33, with error terms 1 to 33
  First-order factors: F-1 (x1 to x8), F-2 (x9 to x17), F-3 (x18 to x27), F-4 (x28 to x33)
  Second-order factors: F-A, F-B

Statistical Design: Multitrait-Multimethod Matrix

First-order CFA and Multitrait-Multimethod Matrix (MTMM)
  Indicators x1 to x33
  Latent variables: LV-1 (x1 to x8), LV-2 (x9 to x17), LV-3 (x18 to x27), LV-4 (x28 to x33)
  Method factors: M-1, M-2

Analysis Using Dependent & Interdependent Techniques

Sakesan Tongkhambanchong, Ph.D. (Applied Behavioral Science Research)

Statistical Design

Causal Modeling I: Path Analysis with Observed Variables (assumption: measurement error = 0)
  [Diagrams: X1, X2, X3 -> Y; and X1 to X5 -> Y]
  Total Effect = Direct Effect + Indirect Effects
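The decomposition Total Effect = Direct + Indirect Effects can be worked through for the simplest case: one cause X, one intermediate variable M, and one outcome Y. The sketch below is illustrative only; the function name `effects` and the standardized path coefficients are hypothetical, and in practice the coefficients would be estimated from data.

```python
def effects(a, b, c_direct):
    """Decompose the effect of X on Y in the path model X -> M -> Y plus X -> Y.

    a: path X -> M, b: path M -> Y, c_direct: direct path X -> Y.
    """
    indirect = a * b  # an indirect effect is the product of the paths along the route
    return {"direct": c_direct, "indirect": indirect, "total": c_direct + indirect}

# Hypothetical standardized path coefficients.
result = effects(a=0.5, b=0.4, c_direct=0.3)
print(result)
```

With several intermediate routes, the indirect effect is the sum of the products along each route.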

Statistical Design

Causal Modeling II: Path Analysis with Latent Variables, Linear Structural Equation Modeling (SEM) (assumption: measurement error > 0)
  Path Analysis + Confirmatory Factor Analysis
  [Diagram: latent variables measured by indicators X1 to X6 and Y1 to Y6, with indexed loadings and structural paths]
  Total Effect = Direct Effect + Indirect Effects