Statistics IV
July 22, 2020
来嶋 秀治 (Shuji Kijima)
Dept. Informatics,
Graduate School of ISEE
Today's topics
• linear regression (線形回帰)
• simple regression (単回帰)
• multiple regression (重回帰)
• autoregression (自己回帰)
• model selection: AIC (モデル選択)
確率統計特論 (Probability & Statistics)
Lesson 11
Final exam PLAN (期末試験案)
Date/time: August 5 or 12 (8/5 or 12), 13:00–
Place (場所): Moodle.
Submit an electronic file; handwritten answers photographed are acceptable
(電子ファイルを提出, 手書きを写真にとって提出可).
Topics (範囲): Probability and Statistics.
Check the course page (講義ページを参照のこと):
http://tcs.inf.kyushu-u.ac.jp/~kijima/
Books, notes, Google, etc. may be used (持ち込み可).
Communication (e-mail, SNS, BBS) is prohibited.
What do you prefer? Please click.
1. Final exam on August 5.
2. Final exam on August 12; day off on August 5.
3. Final exam on August 12; an advanced topic on August 5.
Statistical Inference (統計的推論)
Estimation (推定)
Statistical test (統計検定)
Regression (回帰)
Correlation (相関)
Time series analysis (時系列解析)
Classification/Clustering (分類)
Applications
Machine learning (機械学習),
Pattern recognition (パターン認識),
Data mining (データマイニング), etc.
Statistics / Data science
Statistical Inference (統計的推論)
Estimation (推定) ← July 1, 8 and next week
Statistical test (統計検定) ← last week
Regression (回帰) ← today
Statistics / Data science
Linear regression
Ex. Advertisement
Question
How does y increase as x increases?
year           |   1    2    3    4    5    6    7    8
x: ad. cost    |   8   11   13   10   15   19   17   20
y: sale amount | 115  124  138  120  151  186  169  193
Ex. Advertisement
Question
How does y increase as x increases?
year           |   1    2    3    4    5    6    7    8
x: ad. cost    |   8   11   13   10   15   19   17   20
y: sale amount | 115  124  138  120  151  186  169  193
[Scatter plot of y (sale amount) against x (ad. cost)]
Least Square Estimator
Question
How does y increase as x increases?
Linear regression (線形回帰)
Suppose y_i = α + βx_i + e_i where e_i ∼ N(0, σ²).
Estimate α and β minimizing
  min_{α,β} Σ_{i=1}^n (y_i − (α + βx_i))²
year           |   1    2    3    4    5    6    7    8
x: ad. cost    |   8   11   13   10   15   19   17   20
y: sale amount | 115  124  138  120  151  186  169  193
Least Square Estimator
Linear regression (線形回帰)
Suppose y_i = α + βx_i + e_i where e_i ∼ N(0, σ²).
Estimate α and β minimizing g(α, β) := Σ_{i=1}^n (y_i − (α + βx_i))².
  ∂g/∂α = Σ_{i=1}^n (−2)(y_i − (α + βx_i))
  ∂g/∂β = Σ_{i=1}^n (−2x_i)(y_i − (α + βx_i))
  α̂ = ?   β̂ = ?
Least Square Estimator
Linear regression (線形回帰)
Suppose y_i = α + βx_i + e_i where e_i ∼ N(0, σ²).
Estimate α and β minimizing g(α, β) := Σ_{i=1}^n (y_i − (α + βx_i))².
  ∂g/∂α = Σ_{i=1}^n (−2)(y_i − (α + βx_i))
  ∂g/∂β = Σ_{i=1}^n (−2x_i)(y_i − (α + βx_i))
Setting ∂g/∂α = 0 and ∂g/∂β = 0, and writing
x̄ := (1/n)Σx_i, ȳ := (1/n)Σy_i, \overline{x²} := (1/n)Σx_i², \overline{xy} := (1/n)Σx_i y_i,
we obtain
  α + β·x̄ = ȳ
  α·x̄ + β·\overline{x²} = \overline{xy}
hence
  β̂ = (\overline{xy} − x̄·ȳ) / (\overline{x²} − x̄²)
  α̂ = ȳ − β̂·x̄
Ex. Advertisement
year           |   1    2    3    4    5    6    7    8
x: ad. cost    |   8   11   13   10   15   19   17   20
y: sale amount | 115  124  138  120  151  186  169  193
  x̄ := (1/n) Σ_{i=1}^n x_i = 113/8 = 14.125
  ȳ := (1/n) Σ_{i=1}^n y_i = 1196/8 = 149.5
  \overline{x²} := (1/n) Σ_{i=1}^n x_i² = 1729/8 = 216.125
  \overline{xy} := (1/n) Σ_{i=1}^n x_i y_i = 17810/8 = 2226.25
  β̂ = (\overline{xy} − x̄·ȳ) / (\overline{x²} − x̄²) = (2226.25 − 14.125 × 149.5) / (216.125 − 14.125²) ≈ 6.9
  α̂ = ȳ − β̂·x̄ = 149.5 − 6.9 × 14.125 ≈ 52.1
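The least-squares estimates above can be reproduced with a short script (a minimal sketch using NumPy, not part of the lecture material):

```python
import numpy as np

# Data from the advertisement example
x = np.array([8, 11, 13, 10, 15, 19, 17, 20], dtype=float)           # ad. cost
y = np.array([115, 124, 138, 120, 151, 186, 169, 193], dtype=float)  # sale amount

xbar, ybar = x.mean(), y.mean()     # 14.125, 149.5
x2bar = (x**2).mean()               # 216.125
xybar = (x * y).mean()              # 2226.25

beta_hat = (xybar - xbar * ybar) / (x2bar - xbar**2)  # ≈ 6.9
alpha_hat = ybar - beta_hat * xbar                    # ≈ 52.1
print(beta_hat, alpha_hat)
```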
Q: Are α̂, β̂ unbiased estimators?
  E[β̂] = E[(\overline{xy} − x̄·ȳ) / (\overline{x²} − x̄²)] = ?
  E[α̂] = E[ȳ − β̂·x̄] = ?
Q: Are α̂, β̂ unbiased estimators?
  E[β̂] = E[(\overline{xy} − x̄·ȳ) / (\overline{x²} − x̄²)] = β (?)
  E[α̂] = E[ȳ − β̂·x̄] = α (?)
Q: Are α̂, β̂ unbiased estimators?
  E[β̂] = E[(\overline{xy} − x̄·ȳ) / (\overline{x²} − x̄²)] = (E[\overline{xy}] − x̄·E[ȳ]) / (\overline{x²} − x̄²) = β (?)
  E[α̂] = E[ȳ − β̂·x̄] = E[ȳ] − x̄·E[β̂] = α (?)
Q: Are α̂, β̂ unbiased estimators?
Treating the x_i as fixed constants, E[y_i] = α + βx_i, so
  E[ȳ] = (1/n) Σ_{i=1}^n E[y_i] = (1/n) Σ_{i=1}^n (α + βx_i) = α + β·x̄
  E[\overline{xy}] = (1/n) Σ_{i=1}^n E[x_i y_i] = (1/n) Σ_{i=1}^n x_i(α + βx_i) = α·x̄ + β·\overline{x²}
Hence
  E[β̂] = (E[\overline{xy}] − x̄·E[ȳ]) / (\overline{x²} − x̄²) = (α·x̄ + β·\overline{x²} − x̄(α + β·x̄)) / (\overline{x²} − x̄²) = β
  E[α̂] = E[ȳ] − x̄·E[β̂] = α + β·x̄ − x̄·β = α
Both estimators are unbiased.
Variance (Gauss–Markov theorem)
  Var(α̂) := E[(α̂ − α)²] = (1/n)·(1 + x̄²/s_x²)·σ²
  Var(β̂) := E[(β̂ − β)²] = σ² / (n·s_x²)
where s_x² := (1/n) Σ_{i=1}^n (x_i² − x̄²) = \overline{x²} − x̄².
Thus
  α̂ ∼ N(α, a(x)·σ²)
  β̂ ∼ N(β, b(x)·σ²)
for the constants a(x), b(x) given by the variances above.
Thm.
Let σ̂² := (1/(n−2)) Σ_{i=1}^n (y_i − (α̂ + β̂x_i))².
Then E[σ̂²] = σ² and (n−2)·σ̂²/σ² ∼ χ²_{n−2} hold.
(We omit the proof; it is not easy.)
Remark
Var(α̂) and Var(β̂) decrease as s_x² increases:
observing x over a wide range yields a good estimator.
Cf. 中田, 内藤「確率・統計」.
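Both the unbiasedness of β̂ and the variance formula Var(β̂) = σ²/(n·s_x²) can be checked by simulation (a sketch, not part of the lecture; the "true" parameter values below are illustrative, borrowed from the advertisement example):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, sigma = 52.1, 6.9, 4.6      # illustrative "true" values
x = np.array([8, 11, 13, 10, 15, 19, 17, 20], dtype=float)
n = len(x)
sx2 = (x**2).mean() - x.mean()**2        # s_x^2 = 16.609375

betas = []
for _ in range(20000):
    y = alpha + beta * x + rng.normal(0, sigma, n)    # y_i = alpha + beta*x_i + e_i
    b = ((x * y).mean() - x.mean() * y.mean()) / sx2  # LSE of beta
    betas.append(b)
betas = np.array(betas)

print(betas.mean())                       # ≈ beta (unbiasedness)
print(betas.var(), sigma**2 / (n * sx2))  # empirical vs. theoretical Var(beta-hat)
```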
Hypothesis testing
Least Square Estimator
Question
Does the data support the claim β = 0?
Linear regression (線形回帰)
Suppose y_i = α + βx_i + e_i where e_i ∼ N(0, σ²).
Estimate α and β minimizing
  min_{α,β} Σ_{i=1}^n (y_i − (α + βx_i))²
year           |   1    2    3    4    5    6    7    8
x: ad. cost    |   8   11   13   10   15   19   17   20
y: sale amount | 115  124  138  120  151  186  169  193
Hypothesis testing for β
The central limit theorem suggests that
  (β̂ − β) / √Var(β̂) ∼ N(0, 1).
Its Studentization (σ² replaced by the estimate σ̂²) is
  T_n := (β̂ − β) / √(σ̂² / (n·s_x²)) ≃ (β̂ − β) / √Var(β̂).
Thm.
  T_n ∼ t_{n−2}
Ex. Advertisement21
𝑡6−2∗ = 2.477 null hypothesis 𝛽 = 0 is rejected.
year 1 2 3 4 5 6 7 8
𝑥: ad. cost 8 11 13 10 15 19 17 20
𝑦: sale amount 115 124 138 120 151 186 169 193
መ𝛽 =𝑥𝑦 − 𝑥 ⋅ 𝑦
𝑥2 − 𝑥2
= 6.9
ො𝛼 = 𝑦 − መ𝛽𝑥 = 52.1
𝜎2 =1
𝑛 − 2
𝑖=1
𝑛
𝑦𝑖 − ො𝛼 + መ𝛽𝑥𝑖2
= 21.418
𝑇𝑛 =መ𝛽 − 𝛽
𝜎2
𝑛 ⋅ 𝑠𝑥2
=6.9 − 0
4.6282
132.875
= 17.19
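The test statistic above can be reproduced end to end (a sketch using NumPy, not part of the lecture material):

```python
import numpy as np

x = np.array([8, 11, 13, 10, 15, 19, 17, 20], dtype=float)
y = np.array([115, 124, 138, 120, 151, 186, 169, 193], dtype=float)
n = len(x)

sx2 = (x**2).mean() - x.mean()**2                      # s_x^2
beta_hat = ((x * y).mean() - x.mean() * y.mean()) / sx2
alpha_hat = y.mean() - beta_hat * x.mean()

resid = y - (alpha_hat + beta_hat * x)                 # residuals
sigma2_hat = (resid**2).sum() / (n - 2)                # ≈ 21.4
T = (beta_hat - 0) / np.sqrt(sigma2_hat / (n * sx2))   # ≈ 17.2
print(sigma2_hat, T)
```

T far exceeds the two-sided 5% point of t_6 (about 2.447), so β = 0 is rejected, matching the slide.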
Ex.
Least Square Estimator
i                           |    1     2     3   …    37    38
x: applied dose (投与量)     | 2.32  2.39  2.61  …  7.78  8.28
y: observed value (観測数値) | 2.88  3.21  3.01  …  5.88  6.67
Question
How does y increase as x increases?
Linear regression (線形回帰)
Suppose y_i = α + βx_i + e_i where e_i ∼ N(0, σ²).
Estimate α and β minimizing
  min_{α,β} Σ_{i=1}^n (y_i − (α + βx_i))²
ex
  α̂ = 2.07, β̂ = 0.49, σ̂² = 0.472, s_x² = ⋯
Q: β = 0? That is, does the administration have no effect? (投与効果はない?)
If |T_n| > t*_{36}, then the null hypothesis β = 0 is rejected.
  T_n := (β̂ − β) / √(σ̂² / (n·s_x²)) = (0.49 − 0) / √(0.472 / (38 × ???)) = ⋯
Multiple Linear Regression
Least Square Estimator
Multiple linear regression (多重線形回帰)
Suppose y_i = x_i⊤β + e_i where e_i ∼ N(0, σ²).
Estimate β minimizing
  min_β Σ_{i=1}^n (y_i − x_i⊤β)² = min_β (y − Xβ)⊤(y − Xβ)
where y = (y₁, …, y_n)⊤ and X is the matrix with rows x₁⊤, …, x_n⊤.
Proposition
The optimum solution is β̂ = (X⊤X)⁻¹ X⊤y.
Furthermore, β̂ is an unbiased estimator.
Multiple linear regression
Proposition
The optimum solution is β̂ = (X⊤X)⁻¹ X⊤y.
By Rem 10-1 (see Appendix),
  ∇(y − Xβ)⊤(y − Xβ) = −2X⊤(y − Xβ).
Setting the gradient to zero,
  X⊤y − X⊤Xβ = 0, i.e. X⊤Xβ = X⊤y (the normal equations).
Rem 10-1. (see Appendix)
  ∇(Ax)⊤(Ax) = 2A⊤Ax
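As a quick numerical illustration of the Proposition, the normal equations can be solved directly (a sketch with NumPy and synthetic data; the names and parameter values are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 3
X = rng.normal(size=(n, d))                 # rows are x_i^T
beta_true = np.array([2.0, -1.0, 0.5])      # illustrative true coefficients
y = X @ beta_true + rng.normal(0, 0.1, n)   # y_i = x_i^T beta + e_i

# Solve the normal equations  X^T X beta = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # close to beta_true
```

In practice `np.linalg.lstsq(X, y, rcond=None)` is usually preferred over forming X⊤X explicitly, for numerical stability.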
Appendix: Proof of Rem 10-1
Rem 10-1.
  ∇(Ax)⊤(Ax) = 2A⊤Ax
Write A with rows a₁⊤, …, a_d⊤, so that
  Ax = (a₁⊤x, …, a_d⊤x)⊤, where a_i⊤x = a_{i1}x₁ + ⋯ + a_{id}x_d.
By linearity of ∇,
  ∇(Ax)⊤(Ax) = ∇ Σ_{i=1}^d (a_{i1}x₁ + ⋯ + a_{id}x_d)² = Σ_{i=1}^d ∇(a_{i1}x₁ + ⋯ + a_{id}x_d)².
Since
  ∂(a_{i1}x₁ + ⋯ + a_{id}x_d)² / ∂x_j = 2(a_{i1}x₁ + ⋯ + a_{id}x_d)·a_{ij},
we have
  ∇(a_{i1}x₁ + ⋯ + a_{id}x_d)² = 2(a_{i1}x₁ + ⋯ + a_{id}x_d)·(a_{i1}, …, a_{id})⊤ = 2(a_i⊤x)·a_i.
Then, by the Claim on the next slide (Σ_{i=1}^d c_i·y_i = (y₁ … y_d)·c),
  Σ_{i=1}^d ∇(a_i⊤x)² = 2 Σ_{i=1}^d (a_i⊤x)·a_i = 2 (a₁ … a_d) (a₁⊤x, …, a_d⊤x)⊤ = 2A⊤Ax.
Appendix: Proof of Rem 10-1 (contd.)
Claim
  Σ_{i=1}^d c_i·y_i = (y₁ … y_d)·c
for column vectors y₁, …, y_d and c = (c₁, …, c_d)⊤.
Proof. Writing y_{ij} for the j-th entry of y_i, the j-th entry of the right-hand side is
  y_{1j}·c₁ + ⋯ + y_{dj}·c_d.
Collecting the terms of each c_i,
  (y₁ … y_d)·c = c₁·y₁ + ⋯ + c_d·y_d = Σ_{i=1}^d c_i·y_i = l.h.s. ∎
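Rem 10-1 can also be sanity-checked numerically, comparing the formula 2A⊤Ax with a finite-difference gradient (a quick check with NumPy, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
A = rng.normal(size=(d, d))
x = rng.normal(size=d)

def f(v):
    """f(v) = (Av)^T (Av)"""
    return (A @ v) @ (A @ v)

grad_formula = 2 * A.T @ A @ x    # Rem 10-1

# central finite differences, one coordinate at a time
eps = 1e-6
grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(d)])

print(np.max(np.abs(grad_formula - grad_fd)))  # small (f is quadratic)
```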
Nonlinear Regression
Least Square Estimator
AIC (Akaike Information Criterion: 赤池情報量規準)
  AIC = −2 Σ_{i=1}^n log f(X_i; θ̂_n^ML) + 2·dim(θ)
Nonlinear regression (非線形回帰)
Suppose y_i = h(x_i) + e_i where e_i ∼ N(0, σ²) and
  h(x) = γ₀ + γ₁x + γ₂x² + γ₃x³ + ⋯.
Estimate h (i.e. the coefficients γ) minimizing
  min_γ Σ_{i=1}^n (y_i − h(x_i))²
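Choosing the polynomial degree of h by AIC can be sketched as follows (an illustrative script, not from the lecture; it plugs the Gaussian maximum likelihood into the AIC formula above, counting the γ coefficients plus σ² as parameters):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = np.linspace(-2, 2, n)
y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(0, 0.3, n)  # true h has degree 2

def aic_poly(x, y, deg):
    """AIC = -2 * (max log-likelihood) + 2 * dim(theta) for a degree-`deg` fit."""
    coefs = np.polyfit(x, y, deg)
    rss = ((y - np.polyval(coefs, x)) ** 2).sum()
    sigma2_ml = rss / len(y)                  # ML estimate of sigma^2
    # Gaussian log-likelihood evaluated at the ML estimates
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2_ml) + 1)
    k = (deg + 1) + 1                         # gamma_0..gamma_deg plus sigma^2
    return -2 * loglik + 2 * k

aics = {deg: aic_poly(x, y, deg) for deg in range(6)}
best = min(aics, key=aics.get)
print(best, aics)   # degrees 0 and 1 fit poorly; the penalty discourages overfitting
```

Higher degrees always reduce the residual sum of squares, so the 2·dim(θ) penalty is what keeps AIC from always picking the most complex model.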