Quantifying the Volatility of Starting Pitchers
Bill Petti
SABR Analytics Conference
March 2014
Phoenix, AZ
(Sort of)
2
Let’s Make This a Little Interactive: Presentation Cliché Game
3
See How Many You Hear/See Today
“Needed to start somewhere”
“Not sure what to make of the results”
“Take the results with a grain of salt”
“Results directional, but not definitive”
“More questions than answers”
“Lots of work to be done”
4
Motivating Questions
Are there differences in how volatile starting pitchers are over the course of a season?
Are certain types of pitchers more volatile than others?
5
Why Study Volatility in Baseball?
• It’s my unicorn; what’s a unicorn?
– “Fabled creature? You know, the horse with the horn? Impossible to capture?”*
• My first published baseball research focused on David Wright and whether he was volatile**
• Basically, I haven’t been able to let it go
*Gone in 60 Seconds
**http://www.beyondtheboxscore.com/2011/1/4/1908646/player-volatility-the-case-of-david-wright
6
Better Reason
• We know less about Volatility than other subjects, e.g. aging
• There is some evidence that Volatility in run scoring and run prevention matters for teams
– How teams distribute their runs can impact their expected win percentage
– Sal Baxamusa* showed that the increase in win probability becomes more marginal as teams score more than 5 runs
*The Hardball Times, 2007, http://www.hardballtimes.com/consistency-is-key/
7
Why Study Volatility in Baseball?
• Some evidence that Volatility at the team level helps teams beat their Pythagorean Expectation*
– Run Scoring (RS) and Runs Allowed (RA) Volatility were both negatively correlated to total wins
– However, RS and RA Volatility were positively correlated to wins above expectation
*FanGraphs, 2012, http://www.fangraphs.com/blogs/does-consistent-play-help-a-team-win/
8
Volatility is not the same as Streakiness
Streakiness is about how extreme positive and negative performances lump together over the course of a season
Volatility is about the overall distribution of a player’s daily performance relative to their average (i.e. central tendency)
[Figure: two charts, each marked with “Average wOBA”; axis ticks run from 5 to 150 and bar labels are percentages]
9
Volatility and Hitters
• Developed a metric for quantifying the volatility of hitters (VOL) and examined what types of hitters may be more prone to inconsistency*
*The Hardball Times, 2014, http://www.hardballtimes.com/what-kind-of-hitters-are-volatile/
VOL = STD(daily_wOBA) / Yearly_wOBA^.52, where:
VOL = Seasonal Volatility
STD(daily_wOBA) = the standard deviation of a player’s daily batting performance, measured by wOBA
Yearly_wOBA^.52 = a player’s seasonal wOBA, raised to the .52 power
Only games where a player had three or more plate appearances are included
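The hitter formula above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the function name `hitter_vol`, the `(plate_appearances, wOBA)` tuple layout, the use of the sample standard deviation, and the game values are all assumptions made for the example.

```python
import statistics

def hitter_vol(games, yearly_woba, exponent=0.52, min_pa=3):
    """VOL = STD(daily_wOBA) / Yearly_wOBA ** .52.

    games: list of (plate_appearances, daily_wOBA) tuples (hypothetical layout).
    Only games with at least min_pa plate appearances are counted.
    """
    daily = [woba for pa, woba in games if pa >= min_pa]
    if len(daily) < 2:
        raise ValueError("need at least two qualifying games")
    # Assumption: sample standard deviation; the slides do not say which form was used.
    return statistics.stdev(daily) / yearly_woba ** exponent

# Made-up game log: the 2-PA game is dropped by the three-PA filter.
games = [(4, 0.000), (5, 0.450), (2, 0.900), (4, 0.300), (3, 0.600)]
vol = hitter_vol(games, yearly_woba=0.340)
```

Raising seasonal wOBA to a power below 1 shrinks the denominator's influence, which is how the slides describe keeping VOL from being biased in favor of inferior players.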
10
Volatility and Hitters (cont.)
• This method ensured that VOL was not biased in favor of inferior players and not simply a function of players with high PA/G
• VOL has a year-to-year correlation of .4 (n=435)
– Some evidence it’s a repeatable skill, but one that fluctuates much like BABIP or batting average
• High VOL hitters tended to be high strikeout, fly ball slugging hitters, while low VOL hitters tended to be ground ball, high contact, high on-base hitters
• Some evidence that hitters might be “structurally volatile”, but not all performance explained by this
– Phrase borrowed from Matt Swartz and his work on LHHs and clutch performance
11
Volatility and Hitters (cont.)
12
What About Pitchers?
• There have been some attempts to quantify consistency in pitchers
– David Gassko using Quality Starts as a proxy for consistency*
  • Controlling for ERA, pitchers don’t retain their consistency, year-to-year
  • But an inconsistent pitcher is preferable to a consistent pitcher of similar talent
– Eric Seidman using the Flake statistic at Baseball Prospectus**
– I briefly looked at a pitcher version of volatility, consistent with Flake***
• Pitching creates unique challenges to this type of metric
*The Hardball Times, 2006, http://www.hardballtimes.com/what-kind-of-hitters-are-volatile/
**Baseball Prospectus, 2009, http://www.baseballprospectus.com/article.php?articleid=8579
***Beyond the Box Score, 2011, http://www.beyondtheboxscore.com/2011/9/8/2404007/pitcher-volatility-part-i
13
What About Pitchers?
• First, hitters generate a larger number of observations for study over the course of a season
– Roughly 5x as many observations as starting pitchers
• Second, this makes outliers much more of a problem for pitchers
• Third, managers create the biggest problem, since starters don’t control when they will exit a game
– Tends to accentuate the outlier issue
14
Decisions, decisions, decisions…
• Could continue to use a standard deviation-based metric
– But, distribution of game performance not quite normal, and outliers can play havoc with individual scores
• Another option is interquartile range (IQR), adjusted for median (i.e. quartile coefficient of dispersion or IQR CoD)
– Similar to hitter VOL, which uses coefficient of variation
– Still not perfect, but IQR CoD is a robust measure that handles outliers better
[Figure: distribution of Game RA9, 2009-2013; x-axis from 0 to 30, y-axis from 0% to 20%]
15
Decided to use IQR CoD
• E.g. Buzz Capra, 1974
• 27 starts in 1974 with an ERA- of 59
• 2.80 RA9 in those starts, but a few key outliers
– 22.5 and 405.0, both outings lasted less than 2 IP
• Using the IQR CoD method does mitigate the impact of outliers
[Figure: VOL relative to League Average; y-axis from 0% to 300%]
VOL using Standard Deviation: 271%
VOL using Coefficient of Dispersion (IQR CoD): 143%
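To see why one very short, very bad outing distorts a standard deviation-based score far more than a quartile-based one, here is a small sketch. The game RA9 values are invented (only the 405.0 outlier echoes the Capra example), the function names are hypothetical, and the "inclusive" quartile method is an assumption, since the slides do not say how quartiles were computed.

```python
import statistics

def sd_vol(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def iqr_vol(values):
    """Half the interquartile range divided by the median."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return ((q3 - q1) / 2) / statistics.median(values)

# Hypothetical game-level RA9 values for a handful of ordinary starts.
typical = [0.0, 1.5, 3.0, 3.0, 4.5, 4.5, 6.0, 7.5]
# The same starts plus a single disastrous short outing.
with_blowup = typical + [405.0]

sd_ratio = sd_vol(with_blowup) / sd_vol(typical)     # grows several-fold
iqr_ratio = iqr_vol(with_blowup) / iqr_vol(typical)  # moves only slightly
```

On this toy data the standard deviation-based score jumps more than fourfold while the quartile-based score rises by roughly a tenth, which is the same qualitative pattern as the 271% versus 143% comparison.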
16
Volatility for Pitchers
• Data from 2009-2013, only pitchers that started >= 20 games used
• There was no limit placed on the number of innings for a start
– Tough decision, but had to start somewhere
RA9VOL = (IQR_daily_RA9/2) / Median_daily_RA9, where:
IQR_daily_RA9 = Interquartile Range of pitcher’s daily RA9
Median_daily_RA9 = Median of a pitcher’s daily RA9
FIPVOL = (IQR_daily_FIP/2) / Median_daily_FIP, where:
IQR_daily_FIP = Interquartile Range of pitcher’s daily FIP
Median_daily_FIP = Median of a pitcher’s daily FIP
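A rough sketch of how RA9VOL could be computed from a game log follows. The `(runs, outs)` input layout, the function names, the quartile method, and the handling of zero-out starts and zero medians are all assumptions for illustration; the slides specify only the formula and the 20-start cutoff. Note that this formula divides half the IQR by the median, whereas the textbook quartile coefficient of dispersion is (Q3-Q1)/(Q3+Q1).

```python
import statistics

def game_ra9(runs, outs):
    """Runs allowed per nine innings for one start (27 outs = 9 innings)."""
    return 27 * runs / outs

def quartile_vol(daily_values):
    """(IQR / 2) / median, the form given for RA9VOL and FIPVOL."""
    median = statistics.median(daily_values)
    if median == 0:
        return None  # assumption: undefined when the median start is scoreless
    q1, _, q3 = statistics.quantiles(daily_values, n=4, method="inclusive")
    return ((q3 - q1) / 2) / median

def ra9_vol(starts, min_starts=20):
    """starts: list of (runs, outs_recorded) tuples, one per game started."""
    if len(starts) < min_starts:
        return None  # below the >= 20 starts cutoff
    # Assumption: a start with zero outs has no defined RA9, so it is skipped.
    daily = [game_ra9(runs, outs) for runs, outs in starts if outs > 0]
    return quartile_vol(daily)
```

The same `quartile_vol` helper would apply to FIPVOL if it were fed game-level FIP values instead of game-level RA9.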
17
Comparing RA9 and FIP VOL
• At a population-level, the volatility of RA9 has a much larger spread than the volatility of FIP
• RA9VOL: Mean .65, Standard Deviation .23
• FIPVOL: Mean .36, Standard Deviation .09
[Figure: distribution of VOL across pitchers; x-axis RA9VOL from 0.0 to 1.2, y-axis % of Pitchers from 0% to 45%, with two “Average” markers]
18
Contrasting RA9 and FIP VOL
• E.g. Dillon Gee, 2013, 32 GS
• RA9 3.89 / FIP 4.00
                              RA9     FIP
Quartile 1 (bottom 25%)       1.3     2.7
Quartile 3 (top 25%)          6.0     4.7
(Q3-Q1)/2                     2.36    1.02
Median                        2.5     3.7
VOL [((Q3-Q1)/2)/Median]      .93     .27
VOL- [VOL/lgVOL]              142     65
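The arithmetic in the Dillon Gee table can be retraced from the rounded quartiles and medians shown. The helper names below are hypothetical, and because the inputs are the rounded figures from the slide, the results land close to, but not exactly on, the published values (for example about .94 rather than .93 for RA9). The league value passed to `vol_minus` is the 2009-2013 mean RA9VOL of .65, used here only as a stand-in, since the 2013 league figure is not given.

```python
def half_iqr_over_median(q1, q3, median):
    """VOL = ((Q3 - Q1) / 2) / Median."""
    return ((q3 - q1) / 2) / median

def vol_minus(vol, league_vol):
    """VOL- = 100 * VOL / lgVOL, so 100 means league average."""
    return 100 * vol / league_vol

# Rounded inputs copied from the table.
ra9_vol = half_iqr_over_median(q1=1.3, q3=6.0, median=2.5)  # about 0.94
fip_vol = half_iqr_over_median(q1=2.7, q3=4.7, median=3.7)  # about 0.27

# Assumption: .65 (the 2009-2013 mean RA9VOL) stands in for the 2013 league value.
ra9_vol_minus = vol_minus(0.93, league_vol=0.65)  # about 143
```

The small gaps between these results and the table (2.36, .93, 142) are consistent with the table having been computed from unrounded game data and a slightly different league baseline.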
19
Contrasting RA9 and FIP VOL: Ryu vs. Strasburg 2013
• Both pitchers posted a 3.00 ERA, 30 GS, and ~6 IP/GS
• Ryu: 3.24 FIP; Strasburg: 3.21 FIP
• While very similar in their seasonal outcomes, Ryu was the more consistent starter, both in terms of runs allowed and FIP
Hyun-Jin Ryu          RA9     FIP
VOL                   .54     .31
VOL-                  80      79
Stephen Strasburg     RA9     FIP
VOL                   .80     .43
VOL-                  118     107
20
Hypotheses on Causes of RA9VOL
• Pitchers with high K%s will have lower VOL
• Pitchers with high LOB% will have lower VOL
• Pitchers with high BB% will have higher VOL
• Pitchers with high HR/FB rates will have higher VOL
• Pitchers with high BABIP will have higher VOL
• Pitchers with low GB/FB rates will have higher VOL
21
Testing the Hypotheses
• Four of the six hypothesized variables were statistically significant, but the magnitude of the relationship was small
• None of the four significant relationships were in the hypothesized direction
          Correlation with RA9VOL    Statistically Significant?    Correct Hypothesized Direction?
K%        .200**                     Yes                           No
LOB%      .233**                     Yes                           No
BB%       -.008                      No                            No
HR/FB     -.232**                    Yes                           No
BABIP     -.124**                    Yes                           No
GB/FB     .013                       No                            Yes
22
What to Make of This?
• While the relationships were significant, the directionality is hard to explain
• When taken together, it’s hard to decipher
– High K, high LOB = higher VOL; but
– High HR/FB, high BABIP = lower VOL
• Often, pitchers with high K rates are also more home run prone (throw more fastballs, attack the zone, fly ball pitchers, etc.)
• Pitchers with lower BABIPs tend to strand more runners, not fewer
23
Is VOL a Talent?
• A quick read on whether something is a talent or skill is to see if it is repeatable, year over year
• Previous research suggested that, whatever measure you use, VOL or consistency was not repeatable in consecutive years
– And that appears to be the case with my measure as well
                                Year-to-Year Correlation
Standard Deviation-based VOL    .04
RA9 VOL                         -.01
FIP VOL                         .06
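A year-to-year repeatability check of this kind can be sketched as follows: pair each pitcher's VOL in one season with his VOL in the next, then take the Pearson correlation across all pairs. The dictionary layout, function names, and sample values are assumptions for illustration, not the data behind the table.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def year_to_year_pairs(vol_by_pitcher_year):
    """Map {(pitcher, year): VOL} to a list of (VOL in year, VOL in year + 1)."""
    pairs = []
    for (pitcher, year), vol in vol_by_pitcher_year.items():
        next_vol = vol_by_pitcher_year.get((pitcher, year + 1))
        if next_vol is not None:
            pairs.append((vol, next_vol))
    return pairs

# Made-up qualifying seasons; pitcher D has no consecutive pair and drops out.
sample = {
    ("A", 2012): 0.50, ("A", 2013): 0.70,
    ("B", 2012): 0.80, ("B", 2013): 0.60,
    ("C", 2012): 0.60, ("C", 2013): 0.90,
    ("D", 2011): 0.40,
}
pairs = year_to_year_pairs(sample)
r = pearson_r([a for a, _ in pairs], [b for _, b in pairs])
```

A value of `r` near zero, as in the table, would indicate that a pitcher's VOL in one season says little about his VOL in the next.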
24
Is VOL a Talent?
• It’s possible that VOL is simply a descriptive statistic that captures the variance from pitcher to pitcher in how their performances happened to be distributed over 30-ish starts
• It’s also possible that VOL is a statistic that needs more time to stabilize, much like BABIP
– Need to look at multiple seasons averaged to get a better sense of a pitcher’s consistency
• Finally, because of some of the inherent problems with trying to measure VOL, it may be best used to compare pitchers with similar outcome metrics
– E.g. compare pitchers with similar ERAs, K%, etc.
– Provides another data point to consider
• Or, Occam’s Razor: the metric isn’t that great
25
Summing Up
• There appear to be measurable differences in how pitchers distribute their runs allowed and FIP over the course of a season
– And those differences are normally distributed over the course of a season
– FIP appears to be generally less volatile than RA9 across the league
• However, VOL itself seems quite inconsistent, year to year, at the individual level
– It does appear to stabilize a bit when looking at multiple seasons (akin to clutch ability)
– Also, it’s not clear VOL is structurally determined, somewhat like clutch hitting
26
So Where Do We Stand?
• I’m not in love with this metric, currently, and recommend that anyone using it take it with a big, fat grain of salt
• It could be that parks impact pitcher VOL more than hitters
– Need to split the data by home and away starts (hat tip Vince Gennaro)
• Quality of opponent could also play a role (hat tip Sean Forman)
• There is also the possibility that inconsistency in mechanics throughout the year impacts VOL more than other metrics
– Adjustments to mechanics, or just inability to repeat mechanics, or injury could be what drives VOL (hat tip Jeff Zimmerman)
• It could also be that there is no way around the IP/GS issue
– Two pitchers that give up 4 runs over 8 innings could arrive there differently; one gives up 4 runs over the last 2 innings, the other over the first 3. High odds the latter doesn’t make it to 8 innings
• Bottom line: more work to be done
28
Appendix
29
RA9VOL leaders: 2013
                      GS    RA9VOL    FIPVOL    RA9VOL-    FIPVOL-
Jason Hammel          23    0.25      0.17      37%        43%
Wei-Yin Chen          23    0.29      0.32      42%        81%
Jon Niese             24    0.33      0.35      49%        89%
Roberto Hernandez     24    0.36      0.22      53%        56%
Miguel Gonzalez       28    0.36      0.32      54%        79%
John Danks            22    0.36      0.45      54%        113%
Andrew Cashner        26    0.36      0.41      54%        102%
Kevin Correia         31    0.37      0.43      54%        109%
Jose Quintana         33    0.38      0.46      57%        114%
Jacob Turner          20    0.41      0.53      60%        134%
Joe Blanton           20    0.41      0.41      61%        104%
Jason Marquis         20    0.42      0.30      62%        74%
Felix Doubront        27    0.42      0.34      62%        86%
Edwin Jackson         31    0.44      0.40      65%        101%
Zach McAllister       24    0.44      0.39      65%        98%
Rick Porcello         29    0.45      0.37      66%        94%
Tommy Milone          26    0.46      0.33      68%        82%
R.A. Dickey           34    0.47      0.29      69%        72%
Brandon McCarthy      22    0.47      0.24      69%        59%
John Lackey           29    0.47      0.38      69%        95%
Yu Darvish            32    0.47      0.44      69%        110%
Scott Diamond         24    0.47      0.39      69%        98%
A.J. Griffin          32    0.47      0.33      69%        83%
Chris Sale            30    0.47      0.44      69%        111%
Andy Pettitte         30    0.47      0.19      70%        47%
Jeff Samardzija       33    0.47      0.28      70%        71%
C.J. Wilson           33    0.48      0.27      71%        68%
Wade Miley            33    0.48      0.41      72%        103%
James Shields         34    0.49      0.30      73%        77%
Cliff Lee             31    0.50      0.29      74%        72%
30
RA9VOL trailers: 2013
                      GS    RA9VOL    FIPVOL    RA9VOL-    FIPVOL-
Gio Gonzalez          32    0.75      0.32      112%       80%
Homer Bailey          32    0.76      0.45      113%       112%
Wade Davis            24    0.77      0.33      114%       84%
Justin Verlander      34    0.77      0.46      115%       115%
Barry Zito            25    0.79      0.46      117%       117%
Scott Kazmir          29    0.79      0.44      117%       110%
Stephen Strasburg     30    0.80      0.43      118%       107%
Chris Archer          23    0.80      0.52      118%       130%
Clayton Kershaw       33    0.82      0.39      121%       98%
Dan Haren             30    0.83      0.46      124%       115%
Zack Greinke          28    0.83      0.45      124%       113%
Hiroki Kuroda         32    0.84      0.44      125%       110%
Jered Weaver          24    0.85      0.36      126%       91%
Jason Vargas          24    0.85      0.36      127%       90%
Joe Saunders          32    0.86      0.36      128%       91%
Matt Cain             30    0.87      0.37      129%       93%
Esmil Rogers          20    0.87      0.41      129%       104%
Jeff Locke            30    0.89      0.28      132%       70%
Jose Fernandez        28    0.90      0.43      134%       109%
Felix Hernandez       31    0.91      0.43      135%       108%
Matt Harvey           26    0.92      0.44      136%       109%
Patrick Corbin        32    0.94      0.23      139%       57%
Dillon Gee            32    0.96      0.26      142%       65%
Lance Lynn            33    1.00      0.34      149%       85%
Mike Leake            31    1.04      0.39      154%       97%
Justin Masterson      29    1.04      0.38      154%       95%
Chris Capuano         20    1.10      0.64      164%       161%
Lucas Harrell         22    1.45      0.41      215%       104%
Matt Moore            27    1.59      0.31      236%       78%
Francisco Liriano     26    1.62      0.37      240%       94%
31
RA9VOL leaders: 2009-2013
                      GS     RA9VOL    FIPVOL    RA9VOL-    FIPVOL-
Kevin Correia         144    75.1      62.6      52%        43%
Ryan Dempster         128    66.9      42.6      52%        33%
John Lannan           91     48.2      26.5      53%        29%
Chris Volstad         109    59.7      37.5      55%        34%
Bud Norris            87     47.7      39.0      55%        45%
A.J. Burnett          159    87.8      52.7      55%        33%
Ricky Nolasco         121    67.4      45.4      56%        37%
Roberto Hernandez     113    63.0      36.2      56%        32%
Jeremy Guthrie        130    73.0      39.3      56%        30%
Edwin Jackson         95     53.5      43.6      56%        46%
Anibal Sanchez        93     52.5      36.3      56%        39%
Derek Lowe            101    57.2      31.6      57%        31%
James Shields         166    94.7      61.0      57%        37%
David Price           146    85.3      54.9      58%        38%
R.A. Dickey           125    73.7      42.6      59%        34%
Luke Hochevar         88     52.2      40.2      59%        46%
Chad Billingsley      120    72.2      37.4      60%        31%
Joe Blanton           79     48.1      33.5      61%        42%
Mark Buehrle          161    99.4      56.3      62%        35%
Mat Latos             127    78.6      60.6      62%        48%
Jeremy Hellickson     91     56.4      31.5      62%        35%
CC Sabathia           161    99.9      59.2      62%        37%
Randy Wolf            101    62.7      33.7      62%        33%
Ricky Romero          125    78.4      39.5      63%        32%
Scott Feldman         74     46.6      36.0      63%        49%
Jason Hammel          130    82.4      61.6      63%        47%
Bruce Chen            82     52.3      37.2      64%        45%
Josh Johnson          92     58.7      45.6      64%        50%
Mike Pelfrey          126    80.5      48.6      64%        39%
Ervin Santana         151    96.6      69.3      64%        46%
32
RA9VOL trailers: 2009-2013
                      GS     RA9VOL    FIPVOL    RA9VOL-    FIPVOL-
Brandon Morrow        77     143.1     46.3      186%       60%
Johan Santana         75     127.8     38.4      170%       51%
Francisco Liriano     105    132.9     53.4      127%       51%
Felix Hernandez       165    163.4     67.3      99%        41%
Chris Carpenter       97     95.1      28.7      98%        30%
Jair Jurrjens         77     73.8      27.9      96%        36%
Jason Vargas          120    113.5     52.9      95%        44%
Nick Blackburn        85     80.4      37.8      95%        44%
Clayton Kershaw       161    149.1     61.2      93%        38%
Randy Wells           82     75.0      28.4      91%        35%
Homer Bailey          107    97.7      54.7      91%        51%
Chris Capuano         84     76.6      46.4      91%        55%
Justin Masterson      125    112.4     52.5      90%        42%
Wandy Rodriguez       95     85.4      31.1      90%        33%
Jered Weaver          154    136.1     65.0      88%        42%
Zack Greinke          122    106.0     49.7      87%        41%
Carlos Zambrano       92     79.4      40.8      86%        44%
Gavin Floyd           120    103.4     50.0      86%        42%
Bartolo Colon         80     68.6      41.0      86%        51%
Josh Beckett          83     69.9      35.7      84%        43%
Tim Hudson            116    97.3      45.7      84%        39%
Mike Leake            109    91.2      50.0      84%        46%
Jon Niese             110    92.0      51.5      84%        47%
Jaime Garcia          80     66.8      40.1      84%        50%
Scott Baker           83     68.4      49.6      82%        60%
Tim Lincecum          163    134.0     60.5      82%        37%
Kyle Kendrick         86     70.7      34.5      82%        40%
Jhoulys Chacin        83     67.1      27.5      81%        33%
Max Scherzer          158    124.4     58.2      79%        37%
Johnny Cueto          118    92.4      41.4      78%        35%