Multi Criteria Selection of All-Star Pitching Staff


Proceedings of the 2010 Industrial Engineering Research Conference
A. Johnson and J. Miller, eds.

Multi-criteria Selection of All-Star Pitching Staff for Fantasy Baseball

Austin Lambert, Mark McGinley, Chaitanya Chandan, David Claudio, Lourdes Medina
The Pennsylvania State University
Department of Industrial Engineering
State College, Pennsylvania 16801 USA

Abstract

This paper presents the problem of a fantasy baseball team managed by three decision makers with different methods of conducting performance reviews. A multiple-criteria, multiple-decision-maker optimization approach was used to select the best pitching staff for the team. The criteria considered were based on the 2008 MLB pitching statistics, including the ERA, OPP OBP, WHIP, FLD PCT, and K/BB ratio. The problem was formulated as a linear/integer programming model with a weighted objective function. The results include the list of pitchers selected and an additional analysis conducted to evaluate the sensitivity of the results to different weight scenarios.

Keywords

Linear programming, Integer programming, Fantasy baseball, Multi-criteria selection, Multiple-decision maker

1. Introduction

A frequently noted axiom in baseball is that a team is only as good as that day’s pitcher. Historically, a well-rounded pitching staff has been needed in order to succeed. As a result, front offices in baseball make it their first priority to lock up a quality pitching staff. In this paper, data from the 2008 Major League Baseball (MLB) season are used to create the optimal pitching staff for a fantasy baseball team. The problem presented is that of a fantasy baseball team managed by three decision makers, each with a different method of conducting performance reviews. The intent is to maximize the decision makers’ preferences when selecting the best twelve-man staff, while limiting the field of potential candidates by imposing restrictions on permissible variables. These variables were determined by looking at the types of decisions that go into making an effective pitching staff for any team in the MLB.

After research on the different variables and statistics that go into a pitching staff, it was determined that a set of detailed constraints would allow a linear/integer programming (LP) model to be applied to this particular situation. Those variables include metrics such as the Earned Run Average (ERA), Opposition’s On-Base Percentage (OPP OBP), Walks plus Hits per Inning Pitched (WHIP), Fielding Percentage (FLD PCT), and Strikeout-to-Walk ratio (K/BB ratio). These tend to be the most important statistics considered by the managers and coaches of a baseball team. The problem was formulated as an LP model with a weighted objective function. With the necessary funds, the solution to this LP problem will allow a fantasy team to build the most effective and efficient pitching staff available, and thus have the greatest chance of success during a specific season.

2. Literature Review

LP has been used in numerous fields ranging from manufacturing to grocery shopping. The sports industry is no exception, although LP has not been used there as extensively as in other industries. Within sports, LP has frequently been used for scheduling seasonal games. For example, Hamiez and Hao [1] used algebra and LP while attempting to solve the Sports League Scheduling Problem (SLSP). Although the specific details of the procedure are not vital to this research, their work shows how various methods of LP can lead to very different results. Michael Trick [2] also wrote an interesting article on his method of scheduling MLB games and ACC basketball games. He used various optimization methods (including LP) in order to schedule games without conflict [2]. Trick’s work helped this research by clarifying how to collect specific data for a problem and the steps necessary to complete it.

Others have proposed the use of optimization techniques for other sports-related problems. For example, Zappe et al. [3] constructed a model to determine the consistency in performance of baseball players. They used different weights across the criteria when ranking players in different positions. Although there could be some controversy about the assigned weights, Zappe et al. state that the LP solution and the assigned weights “make sense” and thus pose no problem for the given solution [3]. Adler et al. [4] wrote an article on the use of LP to find out exactly how a baseball team can make the playoffs. Although there is currently an accepted method for predicting playoff spots in baseball, Adler et al. point out that this method is flawed because it does not take into account additional games that must be played [4]. Similarly, a group of industrial engineers at Berkeley has set up a website (Baseball Playoff Races) that uses a sophisticated LP model to predict and estimate playoff spots several games earlier than the method currently used [5].

Although baseball involves the constant training of all the players on the team, there is another side of the game that many fans do not see: the managerial aspect. Every day of the year, managers and supervisors of MLB teams make important decisions regarding the offense, defense, pitching, and various other aspects of the game. Each of these decisions can make or break a team and can directly affect the team’s win-loss ratio. Lewis et al. [6] wrote an article on this exact aspect and on how LP can be used to evaluate a team’s managerial staff and the decisions they make. By evaluating offensive, defensive, and post-season statistics, Lewis et al. were able to use the fundamentals of LP to rank MLB’s managerial staffs in order of effectiveness over a period of time [6]. This is one of the goals of having a fantasy baseball team: sports fans get to experience the managerial decisions that can affect the team they build.

3. Problem Description

The goal of using LP in fantasy baseball is to create a pitching roster that is optimized based on several ranked pitching statistics. In this particular model, twelve pitchers were to be selected: five starters, five relievers, and two closers.

In baseball, a pitcher’s skill is determined by calculating multiple statistics that range from strikeouts to wild pitches. These statistics were used as the basis of the LP model. Although there are over forty official pitching statistics that could be used, only five were selected to maintain simplicity and usability of the program. The following are the pitching statistics selected:

ERA* (Earned Run Average): total number of earned runs multiplied by 9, then divided by innings pitched
WHIP* (Walks and Hits per Inning Pitched): the average number of walks and hits allowed by the pitcher per inning
OPP OBP* (Opposition’s On-Base Percentage): times reached base divided by at bats plus walks plus hit by pitch plus sacrifice flies
FLD PCT (Fielding Percentage): total plays divided by the number of total chances
K/BB (Strikeout-to-Walk Ratio): number of strikeouts divided by number of bases on balls

Note that attributes marked with a “*” are statistics for which a lower value is preferred. The data used for the problem were obtained from references [7] and [8]. Table 1 contains a sample of pitchers from the overall pool along with their respective statistics. All of these data were taken from the 2008 Major League Baseball season. Since a smaller value is desired for ERA, OPP OBP, and WHIP, it was necessary to linearize and normalize the data so that the statistics could easily be compared with one another. The first step was to scale the data by dividing the smallest ERA, WHIP, and OPP OBP in each pitching group by each individual player’s respective statistic. Similarly, each player’s FLD PCT and K/BB were divided by the largest FLD PCT and K/BB in each pitching group.
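The scaling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `scale_group` and the dictionary layout are hypothetical, and the "group" is just three starters from Table 1 rather than the full pool, so the group minima and maxima differ from those used in the paper.

```python
# Statistics for which a lower value is better (marked "*" in the list above).
LOWER_IS_BETTER = {"ERA", "WHIP", "OPP OBP"}

# Three starters from Table 1, standing in for a full pitching group.
group = {
    "Tim Lincecum": {"ERA": 2.62, "WHIP": 1.15, "OPP OBP": 0.297, "FLD PCT": 1.0, "K/BB": 3.15},
    "Cliff Lee": {"ERA": 2.54, "WHIP": 1.11, "OPP OBP": 0.285, "FLD PCT": 0.95, "K/BB": 5.00},
    "Johan Santana": {"ERA": 3.13, "WHIP": 1.21, "OPP OBP": 0.296, "FLD PCT": 0.951, "K/BB": 3.17},
}


def scale_group(group):
    """Scale every statistic so that the best value in the group maps to 1."""
    stats = list(next(iter(group.values())))
    scaled = {name: {} for name in group}
    for stat in stats:
        values = [row[stat] for row in group.values()]
        if stat in LOWER_IS_BETTER:
            best = min(values)  # smallest ERA, WHIP, or OPP OBP in the group
            for name, row in group.items():
                scaled[name][stat] = best / row[stat]
        else:
            best = max(values)  # largest FLD PCT or K/BB in the group
            for name, row in group.items():
                scaled[name][stat] = row[stat] / best
    return scaled


scaled = scale_group(group)
for name, row in scaled.items():
    print(name, {stat: round(value, 3) for stat, value in row.items()})
```

After scaling, every statistic is "higher is better" and bounded above by 1, which is what lets the five statistics be combined in a single weighted sum.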

Table 1: Sample of pitchers from the overall pool along with their respective statistics

PITCHERS             Pitcher Type  ERA    WHIP   OPP OBP  FLD PCT  K/BB
Tim Lincecum         Starter       2.62   1.15   .297     1.0      3.15
Cliff Lee            Starter       2.54   1.11   .285     .95      5.00
Johan Santana        Starter       3.13   1.21   .296     .951     3.17
Rich Harden          Starter       3.39   1.237  .283     .968     3.10
Bill                 Starter       3.53   1.20   .343     .951     2.11
Ben Sheets           Starter       3.73   1.20   .284     .954     3.36
Jon Lester           Starter       3.41   1.23   .301     .956     3.52
Joe Saunders         Starter       4.22   1.37   .349     1.0      1.78
Ricky Nolasco        Starter       5.06   1.25   .301     1.0      4.43
Paul Maholm          Starter       4.44   1.44   .346     .954     1.98
Joey Devine          Reliever      .59    .8     .225     1.0      3.27
Scott Downs          Reliever      1.78   1.15   .298     .96      2.11
Billy Wagner         Reliever      2.3    .89    .228     1.0      5.2
Jon Lieber           Reliever      4.05   1.39   .317     1.0      1.16
Brian Shouse         Reliever      2.81   1.17   .288     .950     5.79
Matt Thornton        Reliever      2.33   1.0    .258     1.0      4.05
Geoff Geary          Reliever      4.5    1.5    .290     0        3.94
Brad Ziegler         Reliever      1.06   1.16   .311     1.0      3.32
Grant Balfour        Reliever      1.54   .89    .233     1.0      3.7
Chris Perez          Reliever      3.46   1.34   .324     1.0      1.91
Francisco Rodriguez  Closer        2.24   1.29   0.31     0.83     2.26
Joe Nathan           Closer        1.33   0.90   0.24     0.90     4.11
Jonathan Papelbon    Closer        2.34   0.95   0.25     0.77     9.63
Brad Lidge           Closer        1.95   1.23   0.30     1.00     2.63
Mariano Rivera       Closer        1.40   0.67   0.19     1.00     12.83
Jose Valverde        Closer        3.38   1.18   0.29     0.90     3.61
Joakim Soria         Closer        1.60   0.86   0.25     1.00     3.47
Bobby Jenks          Closer        2.63   1.10   0.29     1.00     2.24
Brian Wilson         Closer        4.62   1.44   0.34     1.00     2.39
BJ Ryan              Closer        2.95   1.28   0.32     1.00     2.07

Next, each of the five statistics was weighted based on both pitching group and fantasy baseball owner’s preference as shown in Tables 2 through 4. Finally, each scaled value was then multiplied by the appropriate weight. The new scaled and weighted values allowed for the maximization of the objective function.

Table 2: Starters’ Weights

Statistic  Group Ranking  Weight
ERA        1              .333
WHIP       2              .267
OBP        3              .200
Fld Pct    5              .067
K/BB       4              .133

Table 3: Closers’ Weights

Statistic  Group Ranking  Weight
ERA        4              .133
WHIP       1              .333
OBP        2              .267
Fld Pct    5              .067
K/BB       3              .200

Table 4: Relievers’ Weights

Statistic  Group Ranking  Weight
ERA        1              .333
WHIP       4              .133
OBP        2              .267
Fld Pct    5              .067
K/BB       3              .200
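The paper does not state how the weights in Tables 2 through 4 were obtained from the rankings. As an observation rather than a claim about the authors' method, the tabulated values coincide with rank-sum weights, where a statistic ranked r out of n receives (n - r + 1) divided by the sum 1 + 2 + ... + n. The sketch below, with the hypothetical helper `rank_sum_weights`, reproduces the starters' column under that assumption.

```python
def rank_sum_weights(ranking):
    """Rank-sum weights: rank 1 gets the largest share, rank n the smallest."""
    n = len(ranking)
    total = n * (n + 1) // 2  # 1 + 2 + ... + n
    return {stat: (n - rank + 1) / total for stat, rank in ranking.items()}


# Group ranking for starters, taken from Table 2.
starter_ranking = {"ERA": 1, "WHIP": 2, "OBP": 3, "Fld Pct": 5, "K/BB": 4}

starter_weights = {
    stat: round(weight, 3)
    for stat, weight in rank_sum_weights(starter_ranking).items()
}
print(starter_weights)
```

Rounded to three decimals, these values agree with the Weight column of Table 2, and the unrounded weights sum to 1. The same function applied to the rankings in Tables 3 and 4 gives the closers' and relievers' columns.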

4. Problem Formulation

4.1 Objective Function

The objective function is described by

\[
\text{Maximize } Z = \sum_i \sum_j \sum_k w_{ik} \, x_{ij} \tag{1}
\]

where $w_{ik}$ is the weighted value for pitcher i on criterion k, and $x_{ij}$ is a binary variable that refers to pitcher i for position j. The range of index i depends on the number of pitchers being considered, while j identifies the three different positions: the Reliever (j = 1), the Starter (j = 2), and the Closer (j = 3). The index k represents the different criteria, previously defined as the five pitching statistics being considered: ERA (k = 1), OPP OBP (k = 2), K/BB (k = 3), FLD PCT (k = 4), and WHIP (k = 5).

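Also, note that:

\[
x_{ij} =
\begin{cases}
1 & \text{if pitcher } i \text{ is selected for position } j \\
0 & \text{otherwise}
\end{cases}
\tag{2}
\]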

4.2 Constraints

In addition to letting the fantasy owners set their own weights (Tables 2 through 4), this particular LP allows for further customization within the constraint section of the code. A set of seven constraints was defined for each pitching group. Each constraint represents a skill being evaluated that is deemed necessary for the selection of a pitcher into the roster. The skills are defined as follows:

Pitching Skill: ERA + OPP OBP
Balanced Player Skill 1: FLD PCT + WHIP
Balanced Player Skill 2: WHIP + K/BB
Ball Handling Skill: K/BB + FLD PCT
Pitching Effectiveness Skill: WHIP + ERA
Pitching Consistency Skill: ERA + K/BB
Defensive Skill: FLD PCT + OPP OBP

Although these seven skills are simply the program’s defaults, any user can make adjustments by either adding or removing skill constraints within the program’s code (see Section 6). Each constraint was designed in such a way that the program searches for the maximum sum of the two statistics that make up each skill. The skill cutoff values (at the right side of each constraint) were estimated based on the minimum skill expected for each pitcher (e.g. the 1.35 in constraint 3 is from the calculation of 5*0.27 and the 0.54 in constraint 11 is from 2*0.27). If any pitcher’s skill is below the weighted cutoff value that pitcher will neither be selected nor analyzed further in the study.
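The cutoff arithmetic mentioned above (5*0.27 = 1.35 and 2*0.27 = 0.54) can be checked for all seven skills. The sketch below takes the right-hand sides of constraints (3) to (9) and (11) to (17) and divides them by the group size; the dictionary names are hypothetical, and the rounding only guards against floating-point noise.

```python
# Right-hand sides of constraints (3)-(9): groups of 5 (starters, relievers).
GROUP_OF_FIVE_CUTOFFS = {
    "Pitching Skill": 1.35,
    "Balanced 1": 0.75,
    "Ball Handling": 0.65,
    "Balanced 2": 1.05,
    "Effectiveness": 1.15,
    "Consistency": 0.75,
    "Defensive Skill": 1.25,
}

# Right-hand sides of constraints (11)-(17): group of 2 (closers).
GROUP_OF_TWO_CUTOFFS = {
    "Pitching Skill": 0.54,
    "Balanced 1": 0.30,
    "Ball Handling": 0.26,
    "Balanced 2": 0.42,
    "Effectiveness": 0.46,
    "Consistency": 0.30,
    "Defensive Skill": 0.50,
}

# Minimum skill expected per pitcher, recovered from each group cutoff.
per_pitcher = {}
for skill, cutoff in GROUP_OF_FIVE_CUTOFFS.items():
    from_five = round(cutoff / 5, 4)
    from_two = round(GROUP_OF_TWO_CUTOFFS[skill] / 2, 4)
    assert from_five == from_two, skill  # both groups imply the same minimum
    per_pitcher[skill] = from_five

print(per_pitcher)
```

Every skill yields the same per-pitcher minimum in both groups (0.27 for Pitching Skill, for example), so the closer cutoffs are the starter and reliever cutoffs rescaled from five pitchers to two.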

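For Starters and Relievers (j = 1, 2):

\begin{align}
\sum_i w_{i1} x_{ij} + \sum_i w_{i2} x_{ij} &\ge 1.35, \quad j = 1, 2 && \text{(Pitching Skill)} \tag{3} \\
\sum_i w_{i4} x_{ij} + \sum_i w_{i5} x_{ij} &\ge 0.75, \quad j = 1, 2 && \text{(Balanced 1)} \tag{4} \\
\sum_i w_{i3} x_{ij} + \sum_i w_{i4} x_{ij} &\ge 0.65, \quad j = 1, 2 && \text{(Ball Handling)} \tag{5} \\
\sum_i w_{i5} x_{ij} + \sum_i w_{i3} x_{ij} &\ge 1.05, \quad j = 1, 2 && \text{(Balanced 2)} \tag{6} \\
\sum_i w_{i5} x_{ij} + \sum_i w_{i1} x_{ij} &\ge 1.15, \quad j = 1, 2 && \text{(Effectiveness)} \tag{7} \\
\sum_i w_{i1} x_{ij} + \sum_i w_{i3} x_{ij} &\ge 0.75, \quad j = 1, 2 && \text{(Consistency)} \tag{8} \\
\sum_i w_{i4} x_{ij} + \sum_i w_{i2} x_{ij} &\ge 1.25, \quad j = 1, 2 && \text{(Defensive Skill)} \tag{9} \\
\sum_i x_{ij} &= 5, \quad j = 1, 2 && \text{(5 starters and 5 relievers)} \tag{10}
\end{align}

For Closers (j = 3):

\begin{align}
\sum_i w_{i1} x_{i3} + \sum_i w_{i2} x_{i3} &\ge 0.54 && \text{(Pitching Skill)} \tag{11} \\
\sum_i w_{i4} x_{i3} + \sum_i w_{i5} x_{i3} &\ge 0.30 && \text{(Balanced 1)} \tag{12} \\
\sum_i w_{i3} x_{i3} + \sum_i w_{i4} x_{i3} &\ge 0.26 && \text{(Ball Handling)} \tag{13} \\
\sum_i w_{i5} x_{i3} + \sum_i w_{i3} x_{i3} &\ge 0.42 && \text{(Balanced 2)} \tag{14} \\
\sum_i w_{i5} x_{i3} + \sum_i w_{i1} x_{i3} &\ge 0.46 && \text{(Effectiveness)} \tag{15} \\
\sum_i w_{i1} x_{i3} + \sum_i w_{i3} x_{i3} &\ge 0.30 && \text{(Consistency)} \tag{16} \\
\sum_i w_{i4} x_{i3} + \sum_i w_{i2} x_{i3} &\ge 0.50 && \text{(Defensive Skill)} \tag{17} \\
\sum_i x_{i3} &= 2 && \text{(Need 2 closers)} \tag{18}
\end{align}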

\[
w_{ik},\ x_{ij} \ge 0 \quad \forall\, i, j, k \qquad \text{(Non-negativity)} \tag{19}
\]
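To make the formulation concrete, the sketch below solves the starters' part of the model (constraints (3) to (10) for the starter group) by brute-force enumeration in plain Python. It is not the authors' GAMS program: it uses only the ten starters sampled in Table 1, the weights from Table 2, and hypothetical names such as `weighted_values` and `best_staff`. Because the sample is only part of the pool, the staff it returns need not match the one reported in the paper.

```python
from itertools import combinations

STATS = ("ERA", "WHIP", "OBP", "FLD", "KBB")
LOWER_IS_BETTER = {"ERA", "WHIP", "OBP"}
WEIGHTS = {"ERA": 0.333, "WHIP": 0.267, "OBP": 0.200, "FLD": 0.067, "KBB": 0.133}  # Table 2

# Starters sampled in Table 1: (ERA, WHIP, OPP OBP, FLD PCT, K/BB).
STARTERS = {
    "Tim Lincecum": (2.62, 1.15, 0.297, 1.0, 3.15),
    "Cliff Lee": (2.54, 1.11, 0.285, 0.95, 5.00),
    "Johan Santana": (3.13, 1.21, 0.296, 0.951, 3.17),
    "Rich Harden": (3.39, 1.237, 0.283, 0.968, 3.10),
    "Bill": (3.53, 1.20, 0.343, 0.951, 2.11),  # name as it appears in Table 1
    "Ben Sheets": (3.73, 1.20, 0.284, 0.954, 3.36),
    "Jon Lester": (3.41, 1.23, 0.301, 0.956, 3.52),
    "Joe Saunders": (4.22, 1.37, 0.349, 1.0, 1.78),
    "Ricky Nolasco": (5.06, 1.25, 0.301, 1.0, 4.43),
    "Paul Maholm": (4.44, 1.44, 0.346, 0.954, 1.98),
}

# Skill constraints (3)-(9): the two statistics summed and the group cutoff.
SKILLS = {
    "Pitching Skill": (("ERA", "OBP"), 1.35),
    "Balanced 1": (("FLD", "WHIP"), 0.75),
    "Ball Handling": (("KBB", "FLD"), 0.65),
    "Balanced 2": (("WHIP", "KBB"), 1.05),
    "Effectiveness": (("WHIP", "ERA"), 1.15),
    "Consistency": (("ERA", "KBB"), 0.75),
    "Defensive Skill": (("FLD", "OBP"), 1.25),
}


def weighted_values(pool):
    """Return w[name][stat]: the scaled statistic multiplied by its weight."""
    w = {name: {} for name in pool}
    for k, stat in enumerate(STATS):
        column = [row[k] for row in pool.values()]
        best = min(column) if stat in LOWER_IS_BETTER else max(column)
        for name, row in pool.items():
            scaled = best / row[k] if stat in LOWER_IS_BETTER else row[k] / best
            w[name][stat] = WEIGHTS[stat] * scaled
    return w


def feasible(staff, w):
    """Check every skill constraint for a candidate staff."""
    return all(
        sum(w[p][a] + w[p][b] for p in staff) >= cutoff
        for (a, b), cutoff in SKILLS.values()
    )


def best_staff(pool, size=5):
    """Enumerate all staffs of the given size and keep the best feasible one."""
    w = weighted_values(pool)
    candidates = (s for s in combinations(pool, size) if feasible(s, w))
    return max(candidates, key=lambda s: sum(sum(w[p].values()) for p in s), default=None)


staff = best_staff(STARTERS)
print(staff)
```

Enumeration is practical here because choosing 5 of 10 pitchers gives only 252 candidate staffs; for a full pool, an integer programming solver such as the one used through GAMS would be the natural choice.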

5. Summary of Results

After collecting the data and solving the linear program using GAMS, a pitching staff of twelve members was selected. This staff consisted of five starters, five relievers, and two closers. The specific players chosen are shown in Table 5.

Table 5: Players Selected

Starters  Relievers  Closers
Lincecum  Devine     Nathan
Lee       Wagner     Rivera
Nolasco   Shouse
Harden    Ziegler
Sheets    Balfour

6. Analysis

The analysis performed consisted of changing the weights of the different pitching criteria so that emphasis was placed on different criteria (such as ERA). Note in particular the case of Matt Thornton (reliever), who missed being selected by a value of .625. By analyzing Thornton’s ERA and OPP OBP values (see Table 6), it was possible to calculate the minimum amount by which Thornton would have to improve in order to be selected over Shouse. Tables 7 through 10 demonstrate different weight scenarios and their impact on the players selected.

Table 6: Improvements needed for Matt Thornton to be selected

Player         Original ERA  Improved ERA for Selection  Original OBP  Improved OBP for Selection
Matt Thornton  2.325         2.297                       .258          .2449
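The kind of calculation behind Table 6 can be sketched by inverting the scaling from Section 3. For a lower-is-better statistic, the weighted value is weight * best / value, so the value needed to raise the weighted score by a given gain is weight * best / (current score + gain). The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the best reliever ERA is taken from the Table 1 sample rather than the full pool, and the gain is a made-up number, not the shortfall computed in the paper.

```python
def weighted_score(weight, best, value):
    """Weighted, scaled score of a lower-is-better statistic."""
    return weight * best / value


def required_value(weight, best, current, gain):
    """Statistic value that raises the weighted score by `gain`."""
    target = weighted_score(weight, best, current) + gain
    return weight * best / target


ERA_WEIGHT = 0.333   # relievers' ERA weight, Table 4
BEST_ERA = 0.59      # lowest reliever ERA in the Table 1 sample (assumed group best)
CURRENT_ERA = 2.325  # Thornton's original ERA, Table 6
GAIN = 0.001         # hypothetical increase in weighted score, for illustration only

needed = required_value(ERA_WEIGHT, BEST_ERA, CURRENT_ERA, GAIN)
print(round(needed, 3))
```

The same inversion applies to OPP OBP and WHIP; for FLD PCT and K/BB, where higher is better, the weighted value is weight * value / best and the required value is (current score + gain) * best / weight.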

Table 7: Changing the weights to have a greater emphasis on ERA

Statistic            Weight
ERA                  .5
WHIP                 .2
OBP                  .15
Fielding Percentage  .05
K/BB                 .1

Table 8: New Players Selected

Starters  Relievers  Closers
Lincecum  Downs      Nathan
Lee       Wagner     Lidge
Santana   Thornton
Harden    Ziegler
Sheets    Balfour

Table 9: Changing the weights for an all-around balanced pitcher

Statistic            Weight
ERA                  .4
WHIP                 .2
OBP                  .2
Fielding Percentage  .05
K/BB                 .15

Table 10: New Players Selected

Starters  Relievers  Closers
Lincecum  Shouse     Nathan
Lee       Wagner     Lidge
Harden    Thornton
Sheets    Ziegler
Nolasco   Balfour

7. Conclusions

This paper shows the formulation of a fantasy baseball selection problem as an LP and its solution through the use of GAMS. An interesting result was that several players who were expected to be good enough to make the pitching staff were in fact left out. This includes Santana (starter), who initially missed the cut by just a few points. However, the analysis demonstrated that simply changing the importance (in the form of weights) of the statistics (ERA, OBP, etc.) would make the players selected quite different. This could be very useful to real baseball organizations, as they could slightly alter parameters and constraints in the LP in order to develop the pitching staff of their choice. This formulation could also be used to help MLB managers decide which minor league pitchers to bring up into the major leagues. Finally, by using reverse engineering, a baseball agent could inform players of the improvements they would need to achieve in order to stay competitive or be selected by a particular team, as illustrated through an example in the analysis. However, before the model can be used as a solution for MLB managers, several additional cost constraints will have to be added depicting how much the team is willing to spend on each player. Any additional research will focus on making this transition.

References

1. Hamiez, J.P., and Hao, J.K., 2004, "A Linear-Time Algorithm to Solve the Sports League Scheduling Problem," Discrete Applied Mathematics, 143(1-3), 252-265.
2. Trick, M., 2005, "Adventures in Sports Scheduling," Carnegie Mellon’s Graduate School of Industrial Administration. http://www.cs.cmu.edu/~ACO/dimacs/trick.html
3. Zappe, C., Webster, W., and Horowitz, I., 1993, "Using Linear/Integer Programming to Determine Post-Facto Consistency in Performance Evaluations," Interfaces, 23(6), 107-113.
4. Adler, I., Erera, A.L., Hochbaum, D.S., and Olinick, E.V., 2002, "Baseball, Optimization, and the World Wide Web," Interfaces, 32(2), 12-22.
5. University of California, Berkeley, "Baseball Playoff Races," RIOT Baseball Project. http://riot.ieor.berkeley.edu/~baseball/
6. Lewis, H.F., Lock, K.A., and Sexton, T.R., 2009, "Organizational Capability, Efficiency, and Effectiveness in Major League Baseball: 1901-2002," European Journal of Operational Research, 197(2), 731-740.
7. Baseball statistics and history, 2008. Retrieved from http://www.baseball-reference.com.
8. Fantasy baseball, 2008. Retrieved from http://games.espn.go.com/frontpage/baseball.