Page 1: Introduction to Generalized Additive Models (wieling/NWAV/presentations/NWAV-Part-I-Harald.pdf)

Introduction to Generalized Additive Models

R. Harald Baayen

Seminar für Sprachwissenschaft, Universität Tübingen & Department of Linguistics, University of Alberta

October 17, 2013 / NWAV Pittsburgh

1 | R. H. Baayen GAMs Seminar für Sprachwissenschaft Universität Tübingen & Department of Linguistics University of Alberta

Page 2

linear regression (Galton)

Page 3

but . . . how linear were his data?

Page 4

wiggly lines: regression splines

- restricted cubic splines
- thin plate regression splines
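A minimal sketch of fitting such wiggly lines with mgcv, on simulated data. The names `time` and `pupil` and the simulated curve are assumptions made for illustration. `bs = "cr"` is mgcv's cubic regression spline basis, used here as a stand-in for a restricted cubic spline; `bs = "tp"` requests a thin plate regression spline.

```r
library(mgcv)

# Simulated (hypothetical) curve: a smooth bump plus noise
set.seed(123)
time  <- seq(0, 2500, length.out = 150)
pupil <- 300 * exp(-((time - 1200) / 500)^2) + rnorm(150, sd = 30)

m.lin <- lm(pupil ~ time)                         # straight line
m.cr  <- gam(pupil ~ s(time, bs = "cr", k = 10))  # cubic regression spline
m.tp  <- gam(pupil ~ s(time, bs = "tp", k = 10))  # thin plate regression spline

# Compare the straight line with the two wiggly fits
AIC(m.lin, m.cr, m.tp)
```

Calling `plot(m.cr)` or `plot(m.tp)` afterwards shows the estimated curve with its confidence band.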

Page 5

restricted cubic splines (pupil dilation curve)

[Figure: two panels plotting Pupil Dilation (0.001 mm) against Time (ms), 0 to 2500 ms]

Page 6

thin plate regression splines

[Figure: four panels over x from 0 to 100, showing f1(x), f2(x), f3(x), and the weighted sum f1(x) + 2 * f2(x) + 3/5 * f3(x)]
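The weighted-sum idea can be checked on a fitted model. The sketch below, on simulated data with hypothetical names `x` and `y`, extracts the basis functions of a thin plate regression spline from mgcv and rebuilds the fitted curve as basis functions times estimated weights.

```r
library(mgcv)

# Simulated (hypothetical) data
set.seed(1)
x <- seq(0, 100, length.out = 200)
y <- sin(x / 15) + rnorm(200, sd = 0.3)

m <- gam(y ~ s(x, bs = "tp", k = 6))

# Each column of X is the intercept or one basis function evaluated at x
X <- predict(m, type = "lpmatrix")

# Fitted curve = weighted sum of the columns, with the coefficients as weights
fit_manual <- as.numeric(X %*% coef(m))
same <- all.equal(fit_manual, as.numeric(fitted(m)))
same
```

`matplot(x, X[, -1], type = "l")` would draw the individual basis functions.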

Page 7

stress on triconstituent compounds in English

left-branching, stress left: háy fever treatment
left-branching, stress right: science fíction book
right-branching, stress left: business crédit card
right-branching, stress right: family Christmas dínner

Page 8

[Figure: three pitch tracks, Pitch (Hz) from 100 to 300 against Time (s) from 0 to 2.996, for the utterance "she read about a gene therapy technology last night"]

Page 9

> library(mgcv)
> pitch.gam = bam(PitchSemiTone ~
      Sex +
      BranchingCondition +
      s(NormalizedTime, by=BranchingCondition) +    # one smooth over time per branching condition
      s(NormalizedTime, Speaker, bs="fs", m=1) +    # by-speaker factor smooths
      s(NormalizedTime, Compound, bs="fs", m=1) +   # by-compound factor smooths
      s(Compound, Sex, bs="re"),                    # random effect for Compound by Sex
      data=pitch,
      rho=0.825, AR.start=pitch$NewTimeSeries)      # AR(1) errors, restarted at each new time series

Page 10

> round(summary(pitch.gam)$p.table, 2)

                      Estimate Std. Error t value Pr(>|t|)
(Intercept)              85.34       1.60   53.22        0
Sexm                     -9.92       1.61   -6.15        0
BranchingConditionLN2     5.54       1.40    3.97        0
BranchingConditionR       4.11       1.18    3.48        0

Page 11

> round(summary(pitch.gam)$s.table, 2)[,1:2]

                                             edf Ref.df
s(NormalizedTime):BranchingConditionLN1     6.27   6.59
s(NormalizedTime):BranchingConditionLN2     7.14   7.42
s(NormalizedTime):BranchingConditionR       8.40   8.58
s(NormalizedTime,Speaker)                  96.04 106.00
s(NormalizedTime,Compound)                304.18 349.00
s(Compound,Sex)                            50.33  76.00

> round(summary(pitch.gam)$s.table, 2)[,3:4]

                                               F p-value
s(NormalizedTime):BranchingConditionLN1     2.70    0.01
s(NormalizedTime):BranchingConditionLN2     5.40    0.00
s(NormalizedTime):BranchingConditionR      15.34    0.00
s(NormalizedTime,Speaker)                1298.17    0.00
s(NormalizedTime,Compound)                 44.11    0.00
s(Compound,Sex)                            18.82    0.00

Page 12

[Figure: five panels of partial effect (in semitones) against normalized time (0 to 100), and a quantile plot of the s(Compound,Sex,50.33) effects against Gaussian quantiles]

Page 13

[Figure: by-word random effects for males (-1.0 to 0.5) plotted against by-word random effects for females (-1.0 to 1.5), with points labelled by compound:]

adult jogging suit

baby lemon tea

business credit card

celebrity golf tournament

city hall restoration

coffee table designer

company internet page

conference time sheet

cotton candy maker

cream cheese recipe

day care center

diamond ring exhibition

family christmas dinner

family planning clinic

field hockey player

gene therapy technology

hay fever treatment

kidney stone removal

lung cancer surgery

maple syrup production

money market fund

passenger test flight

piano sheet music

pilot leather jacket

pizza home delivery

prisoner community service

restaurant tourist guide

science fiction book

security guard service

sign language class

silicon chip manufacturer

silver jubilee gift

student season ticket

student string orchestra

team locker room

tennis grass court

tennis group practice

visitor name tag

weather station data

woman fruit cocktail

Page 14

words that send males’ pitch up

[Figure: by-word random effects for males (0.2 to 0.6) plotted against by-word random effects for females (-1.0 to 1.5), with points labelled by compound:]

baby lemon tea

coffee table designer

company internet page

cotton candy maker

family planning clinic

maple syrup production

money market fund

passenger test flight

piano sheet music

pilot leather jacket

tennis group practice

weather station data

Page 15

words that send females’ pitch up

[Figure: by-word random effects for males (-1.2 to -0.2) plotted against by-word random effects for females (-1.0 to 1.5), with points labelled by compound:]

adult jogging suit

business credit card

cream cheese recipe

day care center

lung cancer surgery

restaurant tourist guide

science fiction book

student season ticket

team locker room

visitor name tag

woman fruit cocktail

Page 16

- the standard linear model: multiplicative interaction
- Y ∼ X1 + X2 + X1 · X2

[Figure: the linear predictor plotted over x1 and x2]

Page 17

- the standard linear model: multiplicative interaction
- Y ∼ X1 + X2 + X1 · X2

[Figure: the linear predictor plotted over x1 and x2]

Page 18

- the standard linear model: multiplicative interaction
- Y ∼ X1 + X2 + X1 · X2

Page 19

- the standard linear model: multiplicative interaction
- Y ∼ X1 + X2 + X1 · X2

[Figure: contour plot of the linear predictor over x1 and x2 (both from -0.4 to 0.4), with contour lines from -0.2 to 0.2 in steps of 0.05]
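A small worked example of the multiplicative interaction, on simulated data. The coefficient values and the names `x1`, `x2`, `y` are assumptions chosen for illustration. It rebuilds the linear predictor by hand from the fitted coefficients, which makes explicit that the slope of x1 changes linearly with x2.

```r
# Simulated (hypothetical) data with a multiplicative interaction
set.seed(1)
n  <- 200
x1 <- runif(n, -0.4, 0.4)
x2 <- runif(n, -0.4, 0.4)
y  <- 0.1 * x1 + 0.2 * x2 + 1.0 * x1 * x2 + rnorm(n, sd = 0.05)

m.lm <- lm(y ~ x1 * x2)   # expands to x1 + x2 + x1:x2
b <- coef(m.lm)

# Linear predictor on a grid: b0 + b1*x1 + b2*x2 + b3*x1*x2
grid <- expand.grid(x1 = seq(-0.4, 0.4, by = 0.1),
                    x2 = seq(-0.4, 0.4, by = 0.1))
grid$lp <- b["(Intercept)"] + b["x1"] * grid$x1 + b["x2"] * grid$x2 +
  b["x1:x2"] * grid$x1 * grid$x2

# The hand-built surface agrees with predict()
check <- all.equal(unname(grid$lp), unname(predict(m.lm, newdata = grid)))
check

# For a fixed x2, the slope of x1 is b1 + b3 * x2
slope_x1 <- function(x2) unname(b["x1"] + b["x1:x2"] * x2)
```

Whatever the coefficients, a slice through this surface at fixed x2 is a straight line, so the model cannot represent a wiggly surface.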

Page 20

wiggly surfaces

- thin plate regression splines: isometric predictors
- tensor products: non-isometric predictors
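A hedged sketch of why the distinction matters, on simulated data. The data frame `d`, the simulated surface, and the rescaled predictor `Xs` are assumptions for illustration. In mgcv, `s(X, Y)` fits an isotropic thin plate regression spline and `te(X, Y)` a tensor product smooth. Expressing X in a different unit can change the thin plate fit, whereas the tensor product fit is expected to stay essentially the same.

```r
library(mgcv)

# Simulated (hypothetical) predictors on very different scales
set.seed(2)
n <- 300
d <- data.frame(X = runif(n, 600, 900), Y = runif(n, 4, 12))
d$Z <- sin((d$X - 600) / 50) * cos(d$Y / 2) + rnorm(n, sd = 0.2)

m.tp <- gam(Z ~ s(X, Y),  data = d)   # thin plate regression spline
m.te <- gam(Z ~ te(X, Y), data = d)   # tensor product smooth

# Same predictor, different unit
d$Xs <- d$X / 100
m.tp2 <- gam(Z ~ s(Xs, Y),  data = d)
m.te2 <- gam(Z ~ te(Xs, Y), data = d)

# How much do the fitted values move when X is rescaled?
range(fitted(m.tp) - fitted(m.tp2))
range(fitted(m.te) - fitted(m.te2))
```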

Page 21

thin plate regression splines

Page 22

tensor product smooths

Page 23

beware of TPRS!

[Figure: four contour plots over X (600 to 900) and Y (4 to 12)]

Page 24

interactions with factors

- multiple surfaces, one for each factor level
- for binary factors: difference surface

Page 25

two surfaces, and their difference surface

[Figure: three contour plots over X and Y, with panels A, B, and B-A]

Page 26

two surfaces, and their difference surface

> # two surfaces
> sinus.gam = bam(Z ~ Condition +
      te(X, Y, by = Condition), data = sinus)
>
> # a difference surface
> sinus$ConditionNum = ifelse(sinus$Condition=="A", 0, 1)
> sinus.diff.gam = bam(Z ~ te(X, Y) +
      te(X, Y, by = ConditionNum), data = sinus)
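The `sinus` data are not included here, so the following is only a sketch on simulated data. The data frame `sim` and its true surfaces are assumptions, and `gam` is used instead of `bam` because the example is small. Condition B differs from condition A by an amount that grows with Y, and the smooth with the numeric `by` variable should recover that difference.

```r
library(mgcv)

# Simulated (hypothetical) data: B = A + Y
set.seed(42)
n <- 400  # observations per condition
sim <- data.frame(X = runif(2 * n), Y = runif(2 * n),
                  Condition = factor(rep(c("A", "B"), each = n)))
sim$Z <- sin(2 * pi * sim$X) +
  ifelse(sim$Condition == "B", sim$Y, 0) + rnorm(2 * n, sd = 0.2)

# Numeric 0/1 indicator: the second smooth only applies to condition B
sim$ConditionNum <- ifelse(sim$Condition == "A", 0, 1)
sim.diff.gam <- gam(Z ~ te(X, Y) + te(X, Y, by = ConditionNum), data = sim)

# Evaluate the difference surface (B minus A) on a grid
grid <- expand.grid(X = seq(0.05, 0.95, length.out = 20),
                    Y = seq(0.05, 0.95, length.out = 20),
                    ConditionNum = 1)
tm <- predict(sim.diff.gam, newdata = grid, type = "terms")
grid$diff <- tm[, "te(X,Y):ConditionNum"]

# The estimated difference should track the true difference, Y
cor(grid$diff, grid$Y)
```

The row for `te(X,Y):ConditionNum` in `summary(sim.diff.gam)` then addresses whether the two surfaces differ at all.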

Page 27

decompositional models

> sinus2.gam = gam(Zsin ~ ti(X) + ti(Y) + Condition +
      ti(X, Y, by=Condition), data=sinus2)

Page 28

decompositional models

[Figure: partial effects of X (600 to 900) and of Y (4 to 12) on Zsin, and two pairs of contour plots over X and Y for conditions A and B]

Page 29

higher-dimensional interactions

> m.gam = gam(Response ~ te(X, Y, Z),
      data = dfr)
> m.gam = gam(Response ~ te(X, Y) + te(X, Z),
      data = m)
> m.gam = gam(Response ~ ti(X) + ti(Y) + ti(Z) +
      ti(X, Y) + ti(X, Z),
      data = m)
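A runnable sketch of two of these specifications on simulated data. The data frame `dfr` below is a simulated stand-in, and its true structure (main effects of X and Y plus an X by Z interaction) is an assumption for illustration. The full three-way tensor product is compared with the decomposition into `ti()` main effects and two-way interactions.

```r
library(mgcv)

# Simulated (hypothetical) data with main effects and one two-way interaction
set.seed(7)
n <- 500
dfr <- data.frame(X = runif(n), Y = runif(n), Z = runif(n))
dfr$Response <- sin(2 * pi * dfr$X) + dfr$Y^2 + dfr$X * dfr$Z +
  rnorm(n, sd = 0.3)

# Full three-dimensional tensor product surface
m.full <- gam(Response ~ te(X, Y, Z), data = dfr, method = "ML")

# Decomposition: main effects plus selected two-way interactions
m.ti <- gam(Response ~ ti(X) + ti(Y) + ti(Z) +
              ti(X, Y) + ti(X, Z),
            data = dfr, method = "ML")

aic <- AIC(m.full, m.ti)
aic
summary(m.ti)$s.table   # one row per ti() term
```

With the decomposition, each main effect and interaction gets its own row in the summary, which the single `te(X, Y, Z)` term does not provide.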
