Polynomial Curve Fitting - NG


  • 8/11/2019 Polynomial Curve Fitting - NG


    Navneet Goyal
    Department of Computer Science, BITS-Pilani, Pilani Campus, India

    Polynomial Curve Fitting

    BITS F464 Machine Learning


    Polynomial Curve Fitting

    Seems a very trivial concept!
    All of us know it well!
    Why are we discussing it in a Machine Learning course?
    A simple regression problem!
    It motivates a number of key concepts of ML!
    Let's discover them.


    Polynomial Curve Fitting

    Observe a real-valued input variable x
    Use x to predict the value of a target variable t
    Synthetic data generated from sin(2πx)
    Random noise in the target values

    [Figure: target variable t plotted against input variable x]

    Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006


    Polynomial Curve Fitting

    [Figure: target variable t plotted against input variable x]

    N observations of x: $\mathbf{x} = (x_1, \ldots, x_N)^T$, with targets $\mathbf{t} = (t_1, \ldots, t_N)^T$
    Goal is to exploit the training set to predict the value of t from x
    Inherently a difficult problem

    Data generation:
    N = 10
    Inputs spaced uniformly in the range [0, 1]
    Targets generated from sin(2πx) by adding small Gaussian noise
    Noise is typical, due to unobserved variables
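
    The data generation described above can be sketched as follows. This is a minimal illustration, not the slides' own code: the function name make_data, the noise standard deviation of 0.3, and the random seed are all assumptions chosen for the example.

```python
import numpy as np

def make_data(n_points=10, noise_std=0.3, seed=0):
    """Return inputs spaced uniformly in [0, 1] and noisy targets.

    noise_std and seed are illustrative assumptions, not values from the slides.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)           # uniformly spaced inputs
    noise = rng.normal(0.0, noise_std, n_points)  # small Gaussian noise
    t = np.sin(2.0 * np.pi * x) + noise           # noisy targets
    return x, t

x_train, t_train = make_data()
print(x_train.shape, t_train.shape)
```

    Setting noise_std=0.0 recovers the underlying curve sin(2πx) exactly, which is handy for checking a fit against the noise-free function.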


    Polynomial Curve Fitting

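    [Figure: target variable t plotted against input variable x]

    $y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = \sum_{j=0}^{M} w_j x^j$

    where M is the order of the polynomial
    Is a higher value of M better? We'll see shortly!
    Coefficients $w_0, \ldots, w_M$ are denoted by the vector $\mathbf{w}$
    Nonlinear function of x, linear function of the coefficients $\mathbf{w}$
    Such functions are called linear models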
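
    To make the "linear in the coefficients" point concrete, here is a small sketch that evaluates the polynomial as a matrix product. The helper names design_matrix and predict, and the coefficient values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def design_matrix(x, order):
    """Matrix whose columns are x**0, x**1, ..., x**order (one row per input)."""
    return np.vander(np.asarray(x, dtype=float), order + 1, increasing=True)

def predict(x, w):
    """Evaluate y(x, w) = sum_j w[j] * x**j as a matrix-vector product."""
    w = np.asarray(w, dtype=float)
    return design_matrix(x, len(w) - 1) @ w

x = np.array([0.0, 0.5, 1.0])
w = np.array([1.0, -2.0, 3.0])  # illustrative coefficients for M = 2
print(predict(x, w))
```

    Because the prediction is a matrix product with w, scaling or adding coefficient vectors scales or adds the predictions, even though each prediction is a nonlinear function of x.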


    Sum-of-Squares Error Function
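
    In the cited reference, the sum-of-squares error is $E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2$. The sketch below computes it and finds the minimizing coefficients with a standard least-squares solver. The function names, the noise level of 0.3, and the seed are assumptions made for the example.

```python
import numpy as np

def sum_of_squares_error(w, x, t):
    """E(w) = 0.5 * sum_n (y(x_n, w) - t_n)**2 for a polynomial with coefficients w."""
    y = np.vander(x, len(w), increasing=True) @ w
    return 0.5 * np.sum((y - t) ** 2)

def fit_polynomial(x, t, order):
    """Least-squares coefficients w* for a polynomial of the given order."""
    phi = np.vander(x, order + 1, increasing=True)
    w_star, *_ = np.linalg.lstsq(phi, t, rcond=None)
    return w_star

# Illustrative data: 10 noisy samples of sin(2*pi*x); noise level is an assumption.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

w_star = fit_polynomial(x, t, order=3)
print("E(w*) =", sum_of_squares_error(w_star, x, t))
```

    Since E is quadratic in w, the minimizer is unique whenever the columns of the design matrix are linearly independent, and any perturbation of w* can only increase the error.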


    Polynomial Curve Fitting

    Choice of M?
    Called model selection or model comparison


    0th Order Polynomial

    Poor representation of sin(2πx)


    1st Order Polynomial

    Poor representation of sin(2πx)


    3rd Order Polynomial

    Best fit to sin(2πx)


    9th Order Polynomial

    Over-fit: poor representation of sin(2πx)
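
    The four fits shown on these slides (orders 0, 1, 3 and 9) can be reproduced in spirit with the sketch below. The noise level and seed are assumptions, so the numbers will not match the original figures, but the pattern should: with 10 training points, the 9th order polynomial has enough freedom to pass through every point, driving the training error to essentially zero.

```python
import numpy as np

# Illustrative training set: 10 noisy samples of sin(2*pi*x) (noise level assumed).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

def training_error(order):
    """Sum-of-squares training error of the least-squares polynomial fit."""
    phi = np.vander(x, order + 1, increasing=True)
    w, *_ = np.linalg.lstsq(phi, t, rcond=None)
    return 0.5 * np.sum((phi @ w - t) ** 2)

errors = {order: training_error(order) for order in (0, 1, 3, 9)}
for order, err in errors.items():
    print(f"M = {order}: training error = {err:.3g}")
```

    A near-zero training error for M = 9 says nothing about how well the curve matches sin(2πx) between the training points, which is the over-fitting issue these slides illustrate.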


    Polynomial Curve Fitting

    Good generalization is the objective
    How does generalization performance depend on M?
    Consider a separate test data set of 100 points
    Calculate $E(\mathbf{w}^*)$ for both training data and test data
    Choose the M which minimizes $E(\mathbf{w}^*)$ on the test data

    Root Mean Square (RMS) Error
    $E_{RMS} = \sqrt{2E(\mathbf{w}^*)/N}$
    Sometimes more convenient to use, as division by N allows us to compare data sets of different sizes on an equal footing
    Square root ensures $E_{RMS}$ is measured on the same scale (and in the same units) as the target variable t
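
    The comparison of training and test error across M can be sketched as below. The helper names, the noise level, the seed, and the choice of evenly spaced test inputs are assumptions for illustration. Note that $\sqrt{2E(\mathbf{w})/N}$ is simply the square root of the mean squared residual, which is how the code computes it.

```python
import numpy as np

def make_data(n, rng, noise_std=0.3):
    """n noisy samples of sin(2*pi*x) on [0, 1]; noise_std is an assumption."""
    x = np.linspace(0.0, 1.0, n)
    return x, np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise_std, size=n)

def rms_error(w, x, t):
    """E_RMS = sqrt(2 * E(w) / N), i.e. the root of the mean squared residual."""
    y = np.vander(x, len(w), increasing=True) @ w
    return np.sqrt(np.mean((y - t) ** 2))

rng = np.random.default_rng(0)
x_train, t_train = make_data(10, rng)
x_test, t_test = make_data(100, rng)

results = {}
for order in range(10):
    phi = np.vander(x_train, order + 1, increasing=True)
    w_star, *_ = np.linalg.lstsq(phi, t_train, rcond=None)
    results[order] = (rms_error(w_star, x_train, t_train),
                      rms_error(w_star, x_test, t_test))
    print(f"M = {order}: train RMS = {results[order][0]:.3f}, "
          f"test RMS = {results[order][1]:.3f}")

# Model selection: pick the order with the smallest test RMS error.
best_order = min(results, key=lambda m: results[m][1])
print("Order with lowest test RMS:", best_order)
```

    Training RMS can only decrease as M grows, because each higher-order model contains the lower-order ones; the test RMS is what reveals where added flexibility stops helping.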


    Flexibility & Model Complexity

    M=0, very rigid!! Only 1 parameter to play with!


    Flexibility & Model Complexity

    M=1, not so rigid!! 2 parameters to play with!


    Over-fitting

    For small M (0, 1, 2): too inflexible to handle the oscillations of sin(2πx)
    For M (3-8): flexible enough to handle the oscillations of sin(2πx)
    For M = 9: too flexible!
    Training error (TE) = 0, generalization error (GE) = high

    Why is it happening?


    Polynomial Coefficients
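
    One way to see why the high-order fit misbehaves is to inspect the fitted coefficients themselves. The sketch below prints them for several orders. The data (noise level 0.3, seed 0) are assumptions, so the values will differ from any table on the original slide, but the trend of rapidly growing magnitudes at M = 9 is the point of interest.

```python
import numpy as np

# Illustrative training set: 10 noisy samples of sin(2*pi*x) (noise level assumed).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

coefficients = {}
for order in (0, 1, 3, 9):
    phi = np.vander(x, order + 1, increasing=True)
    w_star, *_ = np.linalg.lstsq(phi, t, rcond=None)
    coefficients[order] = w_star
    print(f"M = {order}: largest |w_j| = {np.max(np.abs(w_star)):.3g}")
    print("   w* =", np.array2string(w_star, precision=2))
```

    For M = 9 the coefficients are tuned to hit every noisy training point exactly, which typically requires large positive and negative values that nearly cancel.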


    Data Set Size (M = 9)

    - The larger the data set, the more complex the model we can afford to fit to the data
    - Rough heuristic: the number of data points should be no less than 5-10 times the number of adaptive parameters in the model
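
    The effect of data set size on an M = 9 fit can be explored with the sketch below, which fits the same 9th order model to a small and a large sample and measures how far each fitted curve is from the noise-free sin(2πx). The sample sizes 15 and 100, the noise level, the seed, and the helper names are all assumptions for illustration.

```python
import numpy as np

ORDER = 9  # M = 9, so the model has ORDER + 1 = 10 adaptive parameters

def heuristic_min_points(order, multiple=5):
    """Rule-of-thumb minimum data set size: multiple * number of parameters."""
    return multiple * (order + 1)

def rms_to_true_curve(n_points, rng, noise_std=0.3):
    """Fit an ORDER-th order polynomial to n_points noisy samples and return
    its RMS deviation from the noise-free sin(2*pi*x) on a dense grid."""
    x = np.linspace(0.0, 1.0, n_points)
    t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise_std, size=n_points)
    phi = np.vander(x, ORDER + 1, increasing=True)
    w, *_ = np.linalg.lstsq(phi, t, rcond=None)
    grid = np.linspace(0.0, 1.0, 500)
    y = np.vander(grid, ORDER + 1, increasing=True) @ w
    return np.sqrt(np.mean((y - np.sin(2.0 * np.pi * grid)) ** 2))

rng = np.random.default_rng(0)
deviations = {n: rms_to_true_curve(n, rng) for n in (15, 100)}
for n, dev in deviations.items():
    print(f"N = {n}: RMS deviation from sin(2*pi*x) = {dev:.3f}")
print("Heuristic range for M = 9:",
      heuristic_min_points(ORDER, 5), "to", heuristic_min_points(ORDER, 10), "points")
```

    With 10 parameters, the 5-10 times heuristic suggests roughly 50 to 100 data points, which is why the larger sample is in the range where a 9th order model becomes reasonable.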


    Over-fitting Problem

    Should we limit the number of parameters according to the available training set?
    Complexity of the model should depend only on the complexity of the problem!
    Least squares estimation (LSE) represents a specific case of maximum likelihood
    Over-fitting is a general property of maximum likelihood
    Over-fitting problem can be avoided using the Bayesian approach!


    Over-fitting Problem

    In the Bayesian approach, the effective number of parameters adapts automatically to the size of the data set
    In the Bayesian approach, models can have more parameters than the number of data points


    Regularization

    Penalize large coefficient values
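
    In the cited reference, the penalty is added to the error function as $\tilde{E}(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2$. The sketch below minimizes this for an M = 9 polynomial by solving an equivalent augmented least-squares problem. The function names, the λ values tried, the noise level, and the seed are assumptions; the sketch also penalizes $w_0$ along with the other coefficients, which is a simplification.

```python
import numpy as np

def fit_regularized(x, t, order, lam):
    """Minimize 0.5*||Phi w - t||^2 + 0.5*lam*||w||^2.

    Solved as ordinary least squares on an augmented system, which is
    numerically gentler than forming Phi^T Phi. Penalizing w_0 as well
    is a simplifying assumption of this sketch.
    """
    phi = np.vander(x, order + 1, increasing=True)
    phi_aug = np.vstack([phi, np.sqrt(lam) * np.eye(order + 1)])
    t_aug = np.concatenate([t, np.zeros(order + 1)])
    w, *_ = np.linalg.lstsq(phi_aug, t_aug, rcond=None)
    return w

def rms_error(w, x, t):
    """Root of the mean squared residual of the polynomial with coefficients w."""
    y = np.vander(x, len(w), increasing=True) @ w
    return np.sqrt(np.mean((y - t) ** 2))

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2.0 * np.pi * x_train) + rng.normal(0.0, 0.3, size=10)
x_test = np.linspace(0.0, 1.0, 100)
t_test = np.sin(2.0 * np.pi * x_test) + rng.normal(0.0, 0.3, size=100)

lambdas = (0.0, 1e-6, 1e-3, 1.0)  # illustrative regularization strengths
train_rms = {}
for lam in lambdas:
    w = fit_regularized(x_train, t_train, order=9, lam=lam)
    train_rms[lam] = rms_error(w, x_train, t_train)
    print(f"lambda = {lam:g}: train RMS = {train_rms[lam]:.3f}, "
          f"test RMS = {rms_error(w, x_test, t_test):.3f}")
```

    Increasing λ trades a worse fit to the training points for smaller coefficients, so the model order can stay at M = 9 while the effective flexibility is reduced.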


    Regularization:


    Regularization:


    Regularization: vs.


    Polynomial Coefficients
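
    The effect of the penalty on the coefficients of an M = 9 fit can be inspected with the sketch below. It uses ln λ = -18 and ln λ = 0, the settings used in the cited reference's example, alongside an unregularized fit. The data, seed, and function name are assumptions, so the printed values are illustrative only.

```python
import numpy as np

def fit_regularized(x, t, order, lam):
    """Least squares with penalty 0.5*lam*||w||^2, via an augmented system.

    Penalizing every coefficient, including w_0, is an assumption of this sketch.
    """
    phi = np.vander(x, order + 1, increasing=True)
    phi_aug = np.vstack([phi, np.sqrt(lam) * np.eye(order + 1)])
    t_aug = np.concatenate([t, np.zeros(order + 1)])
    w, *_ = np.linalg.lstsq(phi_aug, t_aug, rcond=None)
    return w

# Illustrative training set: 10 noisy samples of sin(2*pi*x) (noise level assumed).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

settings = (("no penalty", 0.0),
            ("ln lambda = -18", np.exp(-18.0)),
            ("ln lambda = 0", 1.0))
norms = {}
for label, lam in settings:
    w = fit_regularized(x, t, order=9, lam=lam)
    norms[label] = np.linalg.norm(w)
    print(f"{label}: ||w|| = {norms[label]:.3g}")
    print("   w =", np.array2string(w, precision=2))
```

    The norm of the coefficient vector cannot grow as λ increases, so the printout should show the coefficients shrinking from the unregularized fit down to the heavily penalized one.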


    Take-aways from Polynomial Curve Fitting

    Concept of over-fitting
    Model complexity and flexibility
    Will keep revisiting it from time to time