MARS - Multivariate Adaptive Regression Splines - Visionday
MARS - Multivariate Adaptive Regression Splines
Giulia Prando - MSc Mathematical Modelling and Computation
Department of Informatics and Mathematical Modelling
Introduction
What? A method for multivariate regression
When? In 1990 by Jerome H. Friedman
Why? To overcome the disadvantages of already existing methods:
◮ Global parametric modelling
◮ Non-parametric modelling
◮ Adaptive methods
  ⊲ Regression trees (CART)
Main Characteristics
MARS introduces the following modifications to the CART algorithm:
◮ a new type of basis function: the Heaviside step functions $H[\pm(x - t)]$ are replaced by the two-sided truncated power splines $[\pm(x - t)]_+^q$, where $t$ is the knot location
  ⊲ usually $q = 1$, which makes the approximating function continuous
  ⊲ a solution with a continuous derivative is then derived from this piecewise-linear solution
◮ the parent basis function is not removed after it has been split
  ⊲ the parent and both its daughters are eligible for further splitting
  ⊲ the corresponding regions overlap
◮ the product associated with each basis function is restricted to factors involving distinct predictor variables
  ⊲ products of splines in which an individual variable appears with power greater than $q$ are not allowed.
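The two-sided truncated power splines described above can be sketched in a few lines of NumPy. This is only an illustration: the function name `truncated_power_pair` and the sample values are assumptions made for the example and do not come from the slides.

```python
import numpy as np


def truncated_power_pair(x, t, q=1):
    """Return the reflected pair [+(x - t)]_+^q and [-(x - t)]_+^q.

    Hypothetical helper for illustration: x holds sample values,
    t is the knot location, q is the spline order (q = 1 gives
    the usual piecewise-linear "hinge" functions).
    """
    x = np.asarray(x, dtype=float)
    right = np.maximum(x - t, 0.0) ** q  # non-zero only where x > t
    left = np.maximum(t - x, 0.0) ** q   # non-zero only where x < t
    return right, left


x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
right, left = truncated_power_pair(x, t=1.0)
print(right)
print(left)
```

Unlike the Heaviside step used by CART, both functions with q = 1 are continuous at the knot, which is what makes the resulting approximation continuous.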
Two stages of MARS
Forward Stage
1. Start with only the constant function $h_0(x) = 1$.
2. At each stage, consider all products of a basis function $h_l$ in the model set $\mathcal{M}$ with one of the pairs in the candidate set $\mathcal{C}$.
3. Add to the model $\mathcal{M}$ the new pair of basis functions
$$h_l(x) \cdot (x_j - t)_+^q, \quad h_l(x) \cdot (t - x_j)_+^q, \quad h_l \in \mathcal{M},$$
whose term
$$a_{M+1}\, h_l(x) \cdot (x_j - t)_+^q + a_{M+2}\, h_l(x) \cdot (t - x_j)_+^q$$
gives the largest decrease in the training error.
  ⊲ The coefficients $a_{M+1}$ and $a_{M+2}$ are estimated by least squares.
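A single iteration of the forward stage could look roughly like the sketch below, for q = 1. It is a brute-force illustration, not the author's implementation: the names `hinge`, `rss`, `forward_step` and `used_vars` are hypothetical, every observed value is tried as a knot, and all coefficients are refitted by least squares for each candidate.

```python
import numpy as np


def hinge(x, t, sign):
    """Truncated linear spline [sign * (x - t)]_+ (q = 1)."""
    return np.maximum(sign * (x - t), 0.0)


def rss(B, y):
    """Residual sum of squares of the least-squares fit of y on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)


def forward_step(X, y, basis, used_vars):
    """Add the pair h_l * (x_j - t)_+, h_l * (t - x_j)_+ that lowers the RSS most.

    basis     : list of already selected basis columns, each of shape (N,)
    used_vars : list of sets; used_vars[l] holds the variables appearing in basis[l]
    """
    best = None
    for l, h in enumerate(basis):
        for j in range(X.shape[1]):
            if j in used_vars[l]:  # products must involve distinct variables
                continue
            for t in np.unique(X[:, j]):  # candidate knots: observed values
                pair = [h * hinge(X[:, j], t, +1), h * hinge(X[:, j], t, -1)]
                err = rss(np.column_stack(basis + pair), y)
                if best is None or err < best[0]:
                    best = (err, l, j, t, pair)
    err, l, j, t, pair = best
    basis.extend(pair)  # the parent basis[l] stays in the model
    used_vars.extend([used_vars[l] | {j}, used_vars[l] | {j}])
    return err, l, j, t


# Toy data: the target is a single hinge in the first predictor.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 2))
knot = np.sort(X[:, 0])[30]  # an interior observed value
y = np.maximum(X[:, 0] - knot, 0.0)

basis = [np.ones(len(y))]  # h_0(x) = 1
used_vars = [set()]
err0 = rss(np.column_stack(basis), y)
err1, l, j, t = forward_step(X, y, basis, used_vars)
print(err0, err1, l, j, t)
```

Calling `forward_step` repeatedly until a maximum number of basis functions is reached gives the full forward stage. The parent function is kept after each split, so later steps can multiply any existing basis function by a new pair.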
Backward Stage
Why pruning? To reduce the dimension of the model and thus avoid overfitting.
How to prune? At each step, the term whose removal causes the smallest increase in RSS is removed.
When to stop pruning? When the model reaches the optimal size $\lambda^*$, selected using Generalized Cross Validation.
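The pruning rule can be sketched as follows. The slides do not state which GCV formula was used, so the effective number of parameters here, `n_terms + penalty * (n_terms - 1) / 2` with `penalty = 3`, is an assumption, as are the function names `gcv` and `backward_prune`.

```python
import numpy as np


def rss(B, y):
    """Residual sum of squares of the least-squares fit of y on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)


def gcv(rss_value, n_terms, n_samples, penalty=3.0):
    """Generalized cross-validation score (assumed form, see the note above)."""
    effective = n_terms + penalty * (n_terms - 1) / 2.0
    if effective >= n_samples:
        return float("inf")
    return (rss_value / n_samples) / (1.0 - effective / n_samples) ** 2


def backward_prune(B, y, penalty=3.0):
    """Drop one term at a time and return the subset of columns with the best GCV.

    Column 0 of B is assumed to be the constant term and is never removed.
    """
    n = len(y)
    active = list(range(B.shape[1]))
    best_cols = list(active)
    best_score = gcv(rss(B, y), len(active), n, penalty)
    while len(active) > 1:
        # Remove the term whose deletion causes the smallest increase in RSS.
        candidates = [
            (rss(B[:, [c for c in active if c != k]], y), k) for k in active[1:]
        ]
        new_rss, drop = min(candidates)
        active.remove(drop)
        score = gcv(new_rss, len(active), n, penalty)
        if score < best_score:
            best_score, best_cols = score, list(active)
    return best_cols, best_score


# Toy design matrix: constant, one informative hinge, six pure-noise columns.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
signal = np.maximum(x - 0.1, 0.0)
B = np.column_stack([np.ones(n), signal, rng.normal(size=(n, 6))])
y = 2.0 + 3.0 * signal + 0.1 * rng.normal(size=n)

kept, score = backward_prune(B, y)
print(kept, score)
```

The loop runs all the way down to the constant-only model and keeps the best-scoring subset seen along the way, which corresponds to choosing the optimal model size by GCV rather than stopping at the first increase.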
Example - Regression of a bivariate function
Linear Splines - Univariate basis functions
Figure panels: (a) 1 basis function; (b) 3 basis functions; (c) 5 basis functions; (d) 7 basis functions; (e) final model without pruning (15 basis functions); (f) final model with pruning (4 basis functions); (g) final model with pruning, other view.
Linear Splines - Bivariate basis functions
Figure panels: (h) 1 basis function; (i) 3 basis functions; (j) 5 basis functions; (k) 7 basis functions; (l) final model without pruning (23 basis functions); (m) final model with pruning (7 basis functions); (n) final model with pruning, other view.
Cubic Splines - Univariate basis functions
Figure panels: (o) final model without pruning (15 basis functions); (p) final model with pruning (4 basis functions); (q) final model with pruning, other view.
Cubic Splines - Bivariate basis functions
Figure panels: (r) final model without pruning (23 basis functions); (s) final model with pruning (7 basis functions); (t) final model with pruning, other view.
Comparison with CART on a blob image
On data that can be considered "categorical", CART performs better than MARS, as this example shows:
True data
Figure panels: (u), (v).
Performance of MARS
Figure panels: (w) univariate basis functions, maximum of 21 basis functions, pruning performed; (x) bivariate basis functions, maximum of 41 basis functions, no pruning.
Performance of CART
Figure panel: (y).