Prediction

Confidence Intervals, Cross-validation, and Predictor Selection

Skill Set

• Why is the confidence interval for an individual point larger than for the regression line?

• Describe the steps in forward (backward, stepwise, blockwise, all possible regressions) predictor selection.

• What is cross-validation? Why is it important?

• What are the main problems, as far as R-square and prediction are concerned, with forward (backward, stepwise, blockwise, all possible regressions) predictor selection?

Prediction v. Explanation

• Prediction is important for practice
  – WWII pilot training
    • Ability tests, e.g., eye-hand coordination
    • Built an airplane that flew
    • Fear of heights
    • Favorite flavor ice cream
  – Age and driving accidents

• Explanation is crucial for theory. Highly correlated variables may not help predict, but may help explain. Example: team outcomes as a function of team resources and team backup.

Confidence Intervals

[Figure: Data from Partial Correlation Example (GRE axis, 400 to 700). Panels: Confidence Interval for Regression Line (Mean); Confidence Interval for Individual Predicted Values.]

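CI for the line, i.e., the mean score:

$CI = Y' \pm t_{(\alpha/2,\,df)}\, S_{\mu'}$, where $S_{\mu'} = \sqrt{S^2_{y.x}\left[\frac{1}{N} + \frac{(X - \bar{X})^2}{\sum x^2}\right]}$

$S^2_{y.x}$ = MSR (variance of residuals). N = sample size. $\sum x^2$ is the sum of squared deviations of X from its mean. The df are for MSR.

CI for a single person’s score:

$CI = Y' \pm t_{(\alpha/2,\,df)}\, S_{Y'}$, where $S_{Y'} = \sqrt{S^2_{y.x}\left[1 + \frac{1}{N} + \frac{(X - \bar{X})^2}{\sum x^2}\right]}$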

Note shape.

Computing Confidence Intervals

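Suppose: $N = 20$; $S^2_{y.x} = 5.983$; $\bar{X} = 3$; $\sum x^2 = 40$; $Y' = 5.05 + .75X$.

Find CI for line (mean) at X=1.

$S_{\mu'} = \sqrt{5.983\left[\frac{1}{20} + \frac{(1-3)^2}{40}\right]} = .947$

df = N-k-1 = 20-1-1 = 18. $Y' = 5.05 + .75X = 5.05 + .75(1) = 5.80$

$CI = Y' \pm t_{(\alpha/2,\,df)}\, S_{\mu'} = 5.8 \pm (2.101)(.947)$

CI = 3.81 to 7.79

For an individual at X=1, what is the CI?

$S_{Y'} = \sqrt{5.983\left[1 + \frac{1}{20} + \frac{(1-3)^2}{40}\right]} = 2.623$

$CI = Y' \pm t_{(\alpha/2,\,df)}\, S_{Y'} = 5.8 \pm (2.101)(2.623)$

CI = .29 to 11.31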
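The arithmetic in this worked example can be checked with a short script. This is a minimal sketch, not part of the original slides: the function name `ci_half_widths` is made up for illustration, the prediction equation is read as Y' = 5.05 + .75X, and the critical value 2.101 is taken from the example rather than computed. With these inputs the interval for the mean comes out near 3.81 to 7.79 and the interval for an individual near 0.29 to 11.31.

```python
import math

def ci_half_widths(ms_res, n, x, x_mean, sum_sq_x, t_crit):
    """Return (half-width for the mean, half-width for an individual) at x."""
    # Shared term: 1/N plus the squared distance of x from the mean of X
    distance_term = 1.0 / n + (x - x_mean) ** 2 / sum_sq_x
    se_mean = math.sqrt(ms_res * distance_term)           # SE of the line (mean)
    se_indiv = math.sqrt(ms_res * (1.0 + distance_term))  # SE of one person's score
    return t_crit * se_mean, t_crit * se_indiv

# Values from the worked example; 2.101 is the tabled t for alpha/2 = .025, df = 18.
y_hat = 5.05 + 0.75 * 1
hw_mean, hw_indiv = ci_half_widths(5.983, 20, 1, 3, 40, 2.101)
ci_mean = (y_hat - hw_mean, y_hat + hw_mean)
ci_indiv = (y_hat - hw_indiv, y_hat + hw_indiv)
print("mean:       %.2f to %.2f" % ci_mean)
print("individual: %.2f to %.2f" % ci_indiv)
```

The only difference between the two standard errors is the extra 1 inside the bracket, which adds the full residual variance for a single person. The squared distance of X from its mean appears in both, which is why the bands widen away from the mean.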

Review

Why is the confidence interval for the individual wider than a similar interval for the regression line?

Why are the confidence intervals around the regression line curved instead of being straight lines?

Shrinkage

R² is biased (sample value is too large) because of capitalizing on chance to minimize SSe in sample.

If the population value of R² is zero, the expected value in the sample is R² = k/(N-1), where k is the number of predictors and N is the number of people in the sample. If you have many predictors, you can make R² as large as you want. What is the expected value of R-square if N = 101 and k = 10? Ethical issue here.

Common adjustment or shrinkage formula:

$R^2_{adj} = 1 - (1 - R^2)\frac{N-1}{N-k-1}$

This is reported by SAS (PROC REG) under ‘Adj R-Sq.’ Adjusts for both k and N and size of initial R².

Shrinkage Examples

Suppose R² is .6405 with k = 4 predictors and a sample size of 30. Then

$R^2_{adj} = 1 - (1 - .6405)\frac{30-1}{30-4-1} = .583$

R² = .6405
N      Adj R²
15     .497
30     .583
100    .625

R² = .30
N      Adj R²
15     .020
30     .188
100    .271

Note: small N means lots of shrinkage, but a smaller initial R² also shrinks more.
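The shrinkage tables can be reproduced in a few lines. This is a small sketch assuming the adjustment is 1 - (1 - R²)(N - 1)/(N - k - 1) with k = 4 throughout, which matches the tabled values. The function name `adjusted_r2` is chosen here for illustration.

```python
def adjusted_r2(r2, n, k):
    """Shrunken R-square for sample size n and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Rebuild both tables: two starting values of R-square, three sample sizes, k = 4.
for r2 in (0.6405, 0.30):
    for n in (15, 30, 100):
        print(f"R2 = {r2:.4f}  N = {n:3d}  Adj R2 = {adjusted_r2(r2, n, 4):.3f}")
```

Because the multiplier (N - 1)/(N - k - 1) is applied to 1 - R², a small starting R² leaves more to be inflated, which is why .30 drops to .020 at N = 15 while .6405 only drops to .497.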

Cross-Validation

• Compute a and b(s) (can have one or more IVs) on initial sample.

• Find new sample, do not estimate a and b, but use a and b to find Y’.

• Compute correlation between Y and Y’ in new sample; square. Ta da! Cross-validation R².

• Cross-validation R2 does not capitalize on chance and estimates operational R2.
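The steps above can be sketched with simulated data. Everything in this example is hypothetical: the data generator (four predictors, only the first related to Y), the sample size of 30, the seed, and the helper names `solve`, `fit`, `predict`, `r_squared`, and `make_sample`. It uses only the standard library, so the regression weights come from a small Gaussian-elimination solver rather than a statistics package.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(rows, y):
    """Return [a, b1, ..., bk] from the normal equations (X'X) w = X'y."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(weights, rows):
    """Apply fixed weights to predictor rows to get Y'."""
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], r)) for r in rows]

def r_squared(y, y_hat):
    """Squared Pearson correlation between observed Y and predicted Y'."""
    my, mh = sum(y) / len(y), sum(y_hat) / len(y_hat)
    syh = sum((a - my) * (b - mh) for a, b in zip(y, y_hat))
    syy = sum((a - my) ** 2 for a in y)
    shh = sum((b - mh) ** 2 for b in y_hat)
    return syh ** 2 / (syy * shh)

def make_sample(n, k, rng):
    """Hypothetical data: only the first predictor truly relates to Y."""
    rows = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    y = [0.5 * r[0] + rng.gauss(0, 1) for r in rows]
    return rows, y

rng = random.Random(2024)
x_init, y_init = make_sample(30, 4, rng)  # initial sample: estimate a and b
x_new, y_new = make_sample(30, 4, rng)    # new sample: reuse a and b unchanged
weights = fit(x_init, y_init)
r2_initial = r_squared(y_init, predict(weights, x_init))
r2_cross = r_squared(y_new, predict(weights, x_new))
print(f"initial R2 = {r2_initial:.3f}, cross-validation R2 = {r2_cross:.3f}")
```

The cross-validation value is usually the lower of the two, though that is not guaranteed for any single pair of samples. Note that the example needs more than one predictor to be informative: with a single predictor, Y' is just a linear rescaling of X, so the correlation in the new sample would not depend on the weights at all.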

Cross-validation (2)

• Double cross-validation

• Data splitting

• Expert judgment weights (don’t try this at home)

• Math Estimates

Fixed: $\hat{R}^2_{cv} = 1 - \frac{N-1}{N}\cdot\frac{N+k+1}{N-k-1}\,(1 - R^2)$

Random: $\hat{R}^2_{cv} = 1 - \frac{N-1}{N-k-1}\cdot\frac{N-2}{N-k-2}\cdot\frac{N+1}{N}\,(1 - R^2)$

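Fixed-model example with R² = .6405, N = 30, k = 4:

$\hat{R}^2_{cv} = 1 - \frac{30-1}{30}\cdot\frac{30+4+1}{30-4-1}\,(1 - .6405) = .513$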
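The math estimates can be sketched as two small functions. Both forms are assumptions here. The fixed-model function uses 1 - [(N-1)/N][(N+k+1)/(N-k-1)](1 - R²), which reproduces the .513 in the example for R² = .6405, N = 30, k = 4. The random-model function uses a Stein-type formula that is not checked against any number in these slides. The function names are hypothetical.

```python
def cv_r2_fixed(r2, n, k):
    """Estimated cross-validation R-square, fixed-predictor model (assumed form)."""
    return 1.0 - ((n - 1) / n) * ((n + k + 1) / (n - k - 1)) * (1.0 - r2)

def cv_r2_random(r2, n, k):
    """Estimated cross-validation R-square, random-predictor model (assumed form)."""
    shrink = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1.0 - shrink * (1.0 - r2)

print(f"fixed:  {cv_r2_fixed(0.6405, 30, 4):.3f}")
print(f"random: {cv_r2_random(0.6405, 30, 4):.3f}")
```

Under these assumed forms the random-model estimate is somewhat lower than the fixed-model one for the same R², N, and k, and both fall below the adjusted R² of .583 for the same inputs.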

Review

• What is shrinkage in the context of multiple regression? What are the things that affect the expected amount of shrinkage?

• What is cross-validation? Why is it important?

Predictor Selection

• Widely misunderstood and widely misused.

• Algorithms labeled forward, backward, stepwise, etc.

• NEVER use for work involving theory or explanation (hint: this clearly means your thesis and dissertation).

• NEVER use for estimating importance of variables.

• Use SOLELY for economy (toss predictors).

All Possible Regressions

          GPA (Y)   GREQ      GREV      MAT      AR
GPA (Y)   1
GREQ      .611      1
GREV      .581      .468      1
MAT       .604      .267      .426      1
AR        .621      .508      .405      .525     1
Mean      3.313     565.333   575.333   67.00    3.567
S.D.      .600      48.618    83.03     9.248    .838

Data from Pedhazur example.

GPA is grade point average. GREQ is Graduate Record Exam, Quantitative. GREV is GRE Verbal. MAT is Miller Analogies Test. AR is Arithmetic Reasoning test.

All Possible Regressions (2)

k   R²     Variables in Model
1   .385   AR
1   .384   GREQ
1   .365   MAT
1   .338   GREV
2   .583   GREQ MAT
2   .515   GREV AR
2   .503   GREQ AR
2   .493   GREV MAT
2   .492   MAT AR
2   .485   GREQ GREV
3   .617   GREQ GREV MAT
3   .610   GREQ MAT AR
3   .572   GREV MAT AR
3   .572   GREQ GREV AR
4   .640   GREQ GREV MAT AR

Note how easy it is to choose the model with the highest R2 for any given number of predictors. In predictor selection, you also need to worry about cost. You get both V and Q GRE in one test. Also consider what change in R2 means. Accuracy in prediction of dropout.
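An all-possible-regressions table can be computed directly from the correlation matrix, since R² for any subset equals r' R⁻¹ r, where r holds the predictor-criterion correlations and R the predictor intercorrelations. The sketch below is one way to do that with the standard library only. The helper names `solve` and `r2_subset` are made up for illustration, and because the inputs are correlations rounded to three decimals, results can differ from a table built on raw data in the third decimal. Single-predictor values are simply squared correlations, so GREQ alone comes out near .373 from r = .611.

```python
from itertools import combinations

NAMES = ["GREQ", "GREV", "MAT", "AR"]
R_XY = [0.611, 0.581, 0.604, 0.621]   # correlations of each predictor with GPA
R_XX = [                               # predictor intercorrelations
    [1.000, 0.468, 0.267, 0.508],
    [0.468, 1.000, 0.426, 0.405],
    [0.267, 0.426, 1.000, 0.525],
    [0.508, 0.405, 0.525, 1.000],
]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r2_subset(idx):
    """R-square for the predictors whose positions are listed in idx."""
    sub = [[R_XX[i][j] for j in idx] for i in idx]
    rxy = [R_XY[i] for i in idx]
    betas = solve(sub, rxy)  # standardized regression weights
    return sum(b * r for b, r in zip(betas, rxy))

# Every non-empty subset of the four predictors, best first within each size k.
results = []
for k in range(1, len(NAMES) + 1):
    for idx in combinations(range(len(NAMES)), k):
        results.append((k, r2_subset(idx), [NAMES[i] for i in idx]))
results.sort(key=lambda t: (t[0], -t[1]))
for k, r2, names in results:
    print(k, f"{r2:.3f}", " ".join(names))
```

With four predictors there are only 15 models, but the count doubles with each added predictor, so this brute-force approach is practical only for small predictor sets.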

Predictor Selection Algorithms

• Forward – build up from start with p value. End when no variables meet PIN. May include duds.

• Backward – Start with all variables and pull out with POUT. May lose gems.

• Stepwise – Start forward, check backward at each step. Not guaranteed to give best R².

• Blockwise – not used much. Forward by blocks, then any method (e.g., stepwise) within block to choose best predictors.
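Forward selection can be sketched on the GPA correlation matrix from the all-possible-regressions example. Several things here are assumptions rather than part of the slides: the entry rule is an F-to-enter threshold of 4.2 (roughly the .05 critical value of F with 1 and about 27 df) standing in for a PIN p value, N = 30 is taken from the shrinkage example, which appears to use the same model, and the helper names are hypothetical.

```python
NAMES = ["GREQ", "GREV", "MAT", "AR"]
R_XY = [0.611, 0.581, 0.604, 0.621]   # correlations of each predictor with GPA
R_XX = [                               # predictor intercorrelations
    [1.000, 0.468, 0.267, 0.508],
    [0.468, 1.000, 0.426, 0.405],
    [0.267, 0.426, 1.000, 0.525],
    [0.508, 0.405, 0.525, 1.000],
]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r2_subset(idx):
    """R-square for the predictors whose positions are listed in idx."""
    sub = [[R_XX[i][j] for j in idx] for i in idx]
    rxy = [R_XY[i] for i in idx]
    return sum(b * r for b, r in zip(solve(sub, rxy), rxy))

def forward_select(n, f_to_enter):
    """Add the best remaining predictor until its F for the R-square change is too small."""
    selected, r2_current = [], 0.0
    remaining = list(range(len(NAMES)))
    while remaining:
        # Candidate that gives the largest R-square when added to the current set
        best = max(remaining, key=lambda j: r2_subset(selected + [j]))
        r2_new = r2_subset(selected + [best])
        df_resid = n - (len(selected) + 1) - 1
        f_change = (r2_new - r2_current) / ((1.0 - r2_new) / df_resid)
        if f_change < f_to_enter:
            break  # no remaining variable meets the entry criterion
        selected.append(best)
        remaining.remove(best)
        r2_current = r2_new
    return [NAMES[j] for j in selected], r2_current

chosen, r2_final = forward_select(n=30, f_to_enter=4.2)
print(chosen, f"R2 = {r2_final:.3f}")
```

Under these assumptions the procedure enters AR, then GREV, and stops with an R² near .515. The best two-predictor model in the all-possible-regressions table is GREQ plus MAT at .583, so the sequential search misses it: once AR is entered first, the algorithm never revisits that choice.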

Things to Consider in PS

• Algorithms consider statistical significance, but you have to consider practical significance and cost, i.e., algorithms don’t work well.

• Surviving variables are often there by chance. Do the analysis again and you would choose a different set. OK for prediction.

• The value of correlated variables is quite different when considered in path analysis and SEM.

Hierarchical Regression

• Alternative to predictor selection algorithms

• Theory based (a priori) tests of increments to R-square

Example of Hierarchical Reg

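Does personality increase prediction of med school success beyond that afforded by cognitive ability? Collect data on 250 med students for first two years.

Model 1: $MedGPA = a + b_1 UgGPA + b_2 MAT$; R² = .10, p < .05

Model 2: $MedGPA = a + b_1 UgGPA + b_2 MAT + b_3 Consc + b_4 NA$; R² = .13, p < .05

Model test:

$F = \frac{(R^2_2 - R^2_1)/(df_2 - df_1)}{(1 - R^2_2)/(N - df_2 - 1)} = \frac{(.13 - .10)/(4 - 2)}{(1 - .13)/(250 - 4 - 1)}$

where $df_1$ and $df_2$ are the numbers of predictors in Model 1 and Model 2.

F(2,245)=4.22, p < .05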
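The model test is easy to verify numerically. This is a minimal sketch of the F test for an increment in R-square, using the example's values (R² of .10 with 2 predictors, .13 with 4 predictors, N = 250). The function name `f_increment` is chosen here for illustration. The p value is not computed, since that would need an F distribution from outside the standard library.

```python
def f_increment(r2_small, r2_large, k_small, k_large, n):
    """F test for the gain in R-square when predictors are added to a model."""
    df_num = k_large - k_small   # number of added predictors
    df_den = n - k_large - 1     # residual df of the larger model
    f = ((r2_large - r2_small) / df_num) / ((1.0 - r2_large) / df_den)
    return f, df_num, df_den

f, df1, df2 = f_increment(0.10, 0.13, 2, 4, 250)
print(f"F({df1},{df2}) = {f:.2f}")
```

The order of entry is set by theory before looking at the data (cognitive ability first, personality second), which is what separates this test from the predictor selection algorithms.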

Review

• Describe the steps in forward (backward, stepwise, blockwise, all possible regressions) predictor selection.

• What are the main problems, as far as R-square and prediction are concerned, with forward (backward, stepwise, blockwise, all possible regressions) predictor selection?

• Why avoid predictor selection algorithms when doing substantive research (when you want to explain variance in the DV)?
