Artificial intelligence
Decision trees
PRISM - Nicolas Sutton-Charani
18/01/2021
1. Introduction
2. Use of decision trees
   2.1 Prediction
   2.2 Interpretability : Descriptive data analysis
3. Learning of decision trees
   3.1 Purity criteria
   3.2 Stopping criteria
   3.3 Learning algorithm
4. Pruning of decision trees
   4.1 Cost-complexity trade-off
5. Extension : random forest
Introduction
What is a decision tree ?

Figure – A tree whose internal nodes test attributes (J1, J2, J3, J4) and whose leaves carry label predictions.
What is a decision tree ? → supervised learning

Figure – The same tree annotated with the attribute values carried by each branch : an instance follows the branches matching its values until it reaches a leaf, which gives its label prediction.
A little history
⚠ machine learning (or data mining) decision trees ≠ decision theory decision trees
Types of decision trees
Type of class label :
- numerical → regression tree
- nominal → classification tree

Type of algorithm (→ structure) :
- CART : statistics, binary trees
- C4.5 : computer science, small trees
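As a quick illustration, the two tree types can be tried with scikit-learn (an assumption of mine, not part of the course material; its trees are CART-like binary trees): the type of the class label decides which estimator applies.

```python
# Sketch, assuming scikit-learn is available; its trees are CART-like
# (binary splits, Gini impurity for classification).
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0.0], [1.0], [2.0], [3.0]]  # one numerical attribute

# nominal class label -> classification tree
clf = DecisionTreeClassifier(criterion="gini").fit(X, ["no", "no", "yes", "yes"])

# numerical class label -> regression tree
reg = DecisionTreeRegressor().fit(X, [0.0, 0.1, 0.9, 1.0])

print(clf.predict([[2.5]]))  # a class label
print(reg.predict([[2.5]]))  # a real value (mean of the reached leaf)
```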
Use of decision trees
Prediction
Classification trees
Will the badminton match take place ?
Classification trees
What fruit is it ?
Classification trees
Will he/she come to my party ?
Classification trees
Will they wait ?
Classification trees
Who will win the US presidential election ?
Regression trees
What grade will a student get (given their homework average grade) ?
Interpretability : Descriptive data analysis
Data analysis tool

Trees are highly interpretable : they partition the attribute space.
→ a tree can be summarised by its leaves, which define a mixture of distributions
→ a wonderful collaboration tool with domain experts

⚠ INSTABILITY ← overfitting
Learning of decision trees
Formalism

Learning dataset (supervised learning) : (x_1, y_1), ..., (x_N, y_N), i.e. a table whose i-th row is (x_i^1, ..., x_i^J, y_i); the samples are assumed to be i.i.d.

- Attributes : X = (X^1, ..., X^J) ∈ 𝒳 = 𝒳^1 × ... × 𝒳^J
- The spaces 𝒳^j can be categorical or numerical
- Class label : Y ∈ Ω = {ω_1, ..., ω_K} (Y ∈ ℝ^K for regression)

Tree : a partition P_H = {t_1, ..., t_H} of 𝒳, with π_h = P(t_h) ≈ |t_h| / N, where |t_h| = #{i : x_i ∈ t_h}
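A small sketch of the leaf weights π_h estimated by counting (the leaf names t1–t3 and the assignment of samples to leaves are invented for illustration):

```python
from collections import Counter

# Hypothetical situation: N = 6 samples, each already routed to a leaf t_h.
leaf_of_sample = ["t1", "t1", "t2", "t3", "t3", "t3"]
N = len(leaf_of_sample)

sizes = Counter(leaf_of_sample)          # |t_h| = #{i : x_i in t_h}
pi = {h: sizes[h] / N for h in sizes}    # pi_h ~ |t_h| / N

print(pi)  # the pi_h sum to 1 over the partition
```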
Recursive partitioning
Learning principle
- Start with the whole dataset in the initial node
- Choose the best splits (on attributes) in order to get pure leaves

Classification trees
purity = homogeneity in terms of class labels
- CART → Gini impurity : i(t_h) = Σ_{k=1}^{K} p_k (1 − p_k)
- ID3, C4.5 → Shannon entropy : i(t_h) = − Σ_{k=1}^{K} p_k log₂(p_k)

with p_k = P(Y = ω_k | t_h)

Regression trees
purity = low variance of the class labels
→ i(t_h) = Var(Y | t_h) = (1 / |t_h|) Σ_{x_i ∈ t_h} (y_i − E(Y | t_h))², with E(Y | t_h) = (1 / |t_h|) Σ_{x_i ∈ t_h} y_i
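The three impurity measures, transcribed directly from the formulas above (a minimal sketch; the helper names are mine):

```python
import math
from collections import Counter

def proportions(labels):
    """Class proportions p_k = P(Y = w_k | t_h), estimated by counting."""
    n = len(labels)
    return [c / n for c in Counter(labels).values()]

def gini(labels):
    """CART: i(t_h) = sum_k p_k (1 - p_k)."""
    return sum(p * (1 - p) for p in proportions(labels))

def entropy(labels):
    """ID3 / C4.5: i(t_h) = -sum_k p_k log2(p_k)."""
    return -sum(p * math.log2(p) for p in proportions(labels))

def variance(values):
    """Regression: i(t_h) = Var(Y | t_h)."""
    m = sum(values) / len(values)
    return sum((y - m) ** 2 for y in values) / len(values)

print(gini(["a", "b"]), entropy(["a", "b"]), variance([0.0, 2.0]))
```

Both classification measures are zero on a pure leaf and maximal when the classes are balanced.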
Impurity measures
Purity criteria
Figure – The leaf t_h to split : a candidate attribute with its values yields two children t_L and t_R with their own predictions.

Impurity measure + tree structure → criterion

- CART, ID3 : purity gain → Δi = i(t_h) − π_L i(t_L) − π_R i(t_R)
- C4.5 : information gain ratio → IGR = Δi / H(π_L, π_R)

Regression trees
- CART : variance minimisation → Δi = i(t_h) − π_L i(t_L) − π_R i(t_R)
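The two split criteria, sketched with Gini as the impurity in both cases for brevity (C4.5 actually combines an entropy-based Δi with the split entropy H(π_L, π_R)); the function names are mine:

```python
import math
from collections import Counter

def gini(labels):
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def purity_gain(parent, left, right):
    """CART / ID3: delta_i = i(t_h) - pi_L i(t_L) - pi_R i(t_R)."""
    n = len(parent)
    return (gini(parent)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))

def gain_ratio(parent, left, right):
    """C4.5: IGR = delta_i / H(pi_L, pi_R)."""
    n = len(parent)
    h = -sum(p * math.log2(p)
             for p in (len(left) / n, len(right) / n) if p > 0)
    return purity_gain(parent, left, right) / h

parent = ["a", "a", "b", "b"]
print(purity_gain(parent, ["a", "a"], ["b", "b"]))  # a perfect split
```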
Stopping criteria
Stopping criteria (pre-pruning)
For all leaves t_h, h = 1, ..., H, and their potential children :
- leaf purity : ∃ k ∈ {1, ..., K} : p_k = 1
- leaf and child sizes : |t_h| ≤ minLeafSize
- leaf and child weights : π_h = |t_h| / |t_0| ≤ minLeafProba
- number of leaves : H ≥ maxNumberLeaves
- tree depth : depth(P_H) ≥ maxDepth
- purity gain : Δi ≤ minPurityGain
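The criteria above, gathered into one hypothetical helper of mine (the Δi ≤ minPurityGain test is deferred to split time, since it needs a candidate split):

```python
def should_stop(leaf_labels, n_total, depth, n_leaves,
                min_leaf_size=5, min_leaf_proba=0.01,
                max_leaves=50, max_depth=10):
    """Pre-pruning: True if the leaf t_h must not be split further."""
    if len(set(leaf_labels)) == 1:                    # purity: some p_k = 1
        return True
    if len(leaf_labels) <= min_leaf_size:             # |t_h| <= minLeafSize
        return True
    if len(leaf_labels) / n_total <= min_leaf_proba:  # pi_h <= minLeafProba
        return True
    if n_leaves >= max_leaves:                        # H >= maxNumberLeaves
        return True
    if depth >= max_depth:                            # depth >= maxDepth
        return True
    return False

print(should_stop(["a", "a"], n_total=100, depth=0, n_leaves=1))  # pure leaf
```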
Learning algorithm
Result : learnt tree
Start with all the learning data in an initial node (single leaf);
while the stopping criteria are not verified for all leaves do
    for each splittable leaf do
        compute the purity gain obtained from each possible split;
    end
    SPLIT : apply the split achieving the maximum purity gain;
end
prune the obtained tree;

→ recursive partitioning
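The loop above can be sketched as a recursive CART-style grower (a minimal version of my own: Gini impurity, binary splits on numerical attributes, majority-vote leaves, depth-limited instead of pruned):

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def best_split(X, y):
    """Exhaustive search for the (attribute j, threshold) maximising the gain."""
    best, base, n = None, gini(y), len(y)
    for j in range(len(X[0])):
        for thr in sorted(set(row[j] for row in X)):
            left = [i for i in range(n) if X[i][j] <= thr]
            right = [i for i in range(n) if X[i][j] > thr]
            if not left or not right:
                continue
            gain = (base
                    - len(left) / n * gini([y[i] for i in left])
                    - len(right) / n * gini([y[i] for i in right]))
            if best is None or gain > best[0]:
                best = (gain, j, thr, left, right)
    return best

def grow(X, y, max_depth=3):
    """Recursive partitioning; a leaf predicts the majority label."""
    split = best_split(X, y)
    if max_depth == 0 or len(set(y)) == 1 or split is None or split[0] <= 0:
        return Counter(y).most_common(1)[0][0]       # stop: make a leaf
    _, j, thr, left_idx, right_idx = split
    return (j, thr,
            grow([X[i] for i in left_idx], [y[i] for i in left_idx], max_depth - 1),
            grow([X[i] for i in right_idx], [y[i] for i in right_idx], max_depth - 1))

def predict(node, x):
    """Follow the branches matching x down to a leaf."""
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if x[j] <= thr else right
    return node

tree = grow([[0.0], [1.0], [2.0], [3.0]], ["n", "n", "y", "y"])
print(tree)
```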
ID3 - Training Examples – [9+,5-]
ID3 - Selecting Next Attribute
ID3 - Best Attribute - Outlook
ID3 - S_sunny (the subset of examples with Outlook = sunny)
ID3 - Results
Pruning of decision trees
Overfitting
Remark : decision trees do not need variable selection or dimension reduction (in terms of accuracy).
Cost-complexity trade-off
Cost-Complexity Pruning
The idea
- a trade-off between predictive efficiency and complexity
- find a subtree that fulfils this trade-off

Metrics
- Err ← misclassification rate (classification) or MSE (regression)
- criterion : R_α = Err + α H, where H is the number of leaves

Steps
- find a useful sequence of nested subtrees
- choose the right subtree
Sequence of subtrees creation

Result : a sequence of nested subtrees of T_0 : T_0 ⊃ T_1 ⊃ T_2 ⊃ ... ⊃ T_S = P_1 (the initial node alone)
Learn the biggest tree T_0 := P_{H_max}, obtained for α_0 = 0 (s = 0);
while T_s ≠ P_1 do
    T_{s+1} = argmin_{t ∈ subtrees(T_s)} [R_{α_s}(t) − R_{α_s}(T_s)];
    α_{s+1} = R_{α_s}(T_{s+1}) − R_{α_s}(T_s);
end

We get two sets in bijection : {T_0, ..., T_S} and {α_0, ..., α_S} (with T_S = P_1).

Selection : T_{s*} = argmin_{T_s ∈ {T_0, ..., T_S}} Err(T_s), with Err estimated on a pruning set or by cross-validation.
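A toy sketch of the criterion and of the final selection step (the subtree sizes and error values below are invented for illustration):

```python
def r_alpha(err, n_leaves, alpha):
    """Penalised risk R_alpha(T) = Err(T) + alpha * H(T)."""
    return err + alpha * n_leaves

# Hypothetical nested sequence T_0 > T_1 > ... > T_S, as pairs
# (number of leaves H, held-out error Err):
subtrees = [(8, 0.18), (5, 0.15), (3, 0.14), (2, 0.17), (1, 0.30)]

# The selection step keeps the subtree minimising the held-out error:
best = min(subtrees, key=lambda t: t[1])
print(best)  # the 3-leaf subtree
```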
Figure – Sequence of nested subtrees

Here, α_2 < α_1 ⟹ T − T_1 ⊂ T − T_2
Extension : random forest
Random forest
Motivation
- tree instability
- bias-variance trade-off

Averaging reduces variance :

Var(X̄) = Var(X) / N (for independent predictions)

→ average models to reduce the model variance

One problem :
- there is only one training set
- where do multiple models come from ?
Bagging : Bootstrap Aggregation
- Tin Kam Ho (1995) → Leo Breiman (2001)
- Take repeated bootstrap samples from the training set
- Bootstrap sampling : given a training set D containing N examples, draw N examples at random with replacement from D
- Bagging :
  - create B bootstrap samples D_1, ..., D_B
  - train a distinct classifier on each D_b
  - classify a new instance by majority vote / averaging / aggregating the predictions
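The recipe above, sketched end to end with a hypothetical one-feature decision stump as base learner (a real random forest would grow a full tree on each D_b):

```python
import random
from collections import Counter

def stump_fit(X, y):
    """Hypothetical base learner : a one-level stump on attribute 0."""
    thr = sum(x[0] for x in X) / len(X)          # split at the mean value
    left = [y[i] for i in range(len(X)) if X[i][0] <= thr] or y
    right = [y[i] for i in range(len(X)) if X[i][0] > thr] or y
    return (thr,
            Counter(left).most_common(1)[0][0],
            Counter(right).most_common(1)[0][0])

def stump_predict(model, x):
    thr, below, above = model
    return below if x[0] <= thr else above

def bagging_fit(X, y, B=25, seed=0):
    """Train one classifier per bootstrap sample D_1, ..., D_B."""
    rng, n, models = random.Random(seed), len(X), []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]   # draw n with replacement
        models.append(stump_fit([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Classify a new instance by majority vote."""
    return Counter(stump_predict(m, x) for m in models).most_common(1)[0][0]

X = [[0.0], [1.0], [8.0], [9.0]]
y = ["n", "n", "y", "y"]
forest = bagging_fit(X, y)
print(bagging_predict(forest, [0.0]), bagging_predict(forest, [9.0]))
```

Even when a few bootstrap samples are degenerate (all one class), the majority vote over the B models remains stable, which is the point of the aggregation.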
References
* L. Breiman, J. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth, 1984.
* J. R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81–106, 1986.
* L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
* G. Biau, L. Devroye, and G. Lugosi. Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research, 9:2015–2033, 2008.