FREE AGENCY AND CONTRACT OPTIONS: HOW MAJOR LEAGUE BASEBALL TEAMS VALUE PLAYERS
May 11, 2007
Michael Dinerstein
Stanford University, Department of Economics
Advisor: Prof. Bob Hall
Abstract
When evaluating and signing players, Major League Baseball teams face incomplete information regarding a player’s true value. This paper explores how teams deal with such uncertainty and whether their approaches toward high-risk free agent signings and low-risk contract option decisions differ. I use seemingly unrelated regression to estimate the relationships between past performance and future performance and between past performance and free agent salaries. I find that in determining free agent salary offers, teams undervalue past performance relative to its power in predicting future performance. For the low-risk option decisions, I use the Wilcoxon rank-sum test and a logit regression to determine that teams are much less cautious and follow no discernible pattern in exercising options. Teams thus use very different approaches in making free agent and contract option decisions.
Keywords: sports economics, salary determination, free agent, labor, management
I am grateful to my advisor, Professor Bob Hall, for his guidance and thoughtful comments. Working with him has been an invaluable learning experience. I would also like to thank Professor Luigi Pistaferri for his econometrics advice. Any mistakes are my own.
Michael Dinerstein, May 11, 2007, Page 1
1. Introduction
In 2004 Adrian Beltre, a third baseman for the Los Angeles Dodgers, had a fantastic year
and received considerable support as the possible Most Valuable Player of the National League.
Beltre’s performance proved somewhat unexpected, however, because in the years prior to 2004
his statistics were relatively average compared to the rest of the league. Critics pointed out that
Beltre’s contract situation likely explained his unexpected performance. Beltre’s contract was
set to expire after the 2004 season, at which point he could become a free agent and negotiate a
new contract with any baseball team. Perhaps Beltre then exerted a particularly strong effort
during 2004, his “contract year,” that he had not shown during previous seasons. Or Beltre could
have always performed to his full capacity and may have just reached a level of skill in 2004 that
allowed him to develop into a very productive player. The Seattle Mariners seemed to assume
the latter explanation was true (or at least that the former explanation would not predict a future
devolution into prior habits) by offering Beltre a 5-year contract for $64 million, an amount that
would hardly be justified by his performance prior to 2004. Beltre’s subsequent performance
proved disappointing relative to the expectations his new contract created.
The case of Adrian Beltre highlights the difficulty in discerning players’ true marginal
values with incomplete information and the consequences to teams and players of contract
decisions. The goal of this paper is to understand how Major League Baseball (MLB) general
managers and their teams navigate this process of deciding whether to keep and sign players.
This paper will examine both the typically high-risk decisions of signing free agents to new
contracts and the typically low-risk decisions of exercising team options that MLB front offices
must make. Is there a statistically significant pattern of soon-to-be free agents performing above
expectations in the final year of their contracts? If so, are teams fooled by this unexpected
performance or do they adjust their salary offers? While many papers have addressed the former
question, none has asked whether teams fall into the trap of overemphasizing the most recent
performance, or even whether they create the incentive in the first place for players to try harder
during their “contract year.” A related question of team rationality in the contract process is
whether teams exercise their team options rationally. Options, which will be explained later in
the introduction, represent a less significant commitment to a player because options are rarely
guaranteed and are for just one year of service. I will address the degree to which teams exercise
options rationally and then compare the valuation of player options with the more significant
commitments in lucrative free agent deals.
1.1 Uniqueness of Major League Baseball
Baseball, due to the rules of the game and the nature of player contracts, affords
researchers advantages that other sports cannot provide. While baseball is naturally a team
game, individual performance, particularly by hitters, is reasonably independent from the actions
of others. When a player hits the ball, his fate is nearly always independent of how his
teammates act and depends on his opponents primarily through their defense. But given the high
quality of defense at the major league level, most plays are either routine or quite difficult, and I
assume that the variation in defenders is relatively limited. Whether the batter hits the ball is
often quite dependent on the quality of the opposing pitcher, but because teams play 162 games
and have fairly balanced schedules, I make the reasonable assumption that all hitters face
approximately the same quality of pitchers over the course of the season.
For most at-bats, a batter attempts to reach base without making an out, regardless of
whether the previous hitter made an out or reached base. Certain situations can arise, however,
that require the batter to aim for a specific outcome that would not necessarily be the goal for
each at-bat. For instance, if a teammate has reached third base before two outs have been
recorded in an inning, most hitters will attempt to hit a sacrifice fly, which sacrifices the current
hitter to help the base runner score. While a hit is nearly always superior to a sacrifice fly in any
situation, the risk involved in seeking a hit can be large enough that hitters aim for sacrifice flies.
But even in such specific situations when the batter’s goal changes, baseball statistics usually
reward, or at least do not penalize, the batter for helping the team. Furthermore, the statistics are
general enough that they can accommodate hitters with distinct strengths. The power hitter may
frequently hit home runs but also strikes out more than another hitter who rarely hits more than a
double but reaches base often. Statistics like slugging percentage reward both types of hitters by
giving more points for home runs than singles but also accounting for the number of times a
batter reaches base. Baseball statistics thus usually measure actions that are highly correlated
with individual and team goals and are flexible enough to apply to the full spectrum of hitter
types.
Whether pitchers’ performance lends itself so easily to research is debatable. While most
prior studies have focused on hitters, Krautmann, Gustafson, and Hadley (2003) built a model
that predicted pitchers’ salaries based on past performance. Their findings indicate that all
pitchers cannot be treated as members of the same population but rather starting pitchers, middle
and long relief pitchers, and closing pitchers require separate analyses. Furthermore, no single
performance measure emerges as explaining variation in pitcher contracts. Measuring pitching
performance is thus more challenging.
The rules for player contracts also overcome some of the difficulties of analyzing athlete
contracts in major team sports. Unlike the National Football League (NFL), contracts are
guaranteed so that if a team releases the player, the team still has an obligation to pay the player
the full contract amount. Contract length, as well as salaries in later years of long-term contracts,
should be close to a baseball player’s projected value, whereas a football team might offer a six-
year contract without expecting the player to be on the roster for the last few years. Player and
team options can be exceptions to the rule of guaranteed contracts, but, as this paper will show,
their relatively simple nature still allows for analysis.
The other unique aspect of Major League Baseball contracts is that teams do not have to
remain under a salary cap. In the National Basketball Association (NBA), teams can spend only
a limited sum on players each year, whereas no such restriction exists in MLB. The salary cap
has several effects. First, players may not receive their marginal value to a team because the
team has an upper limit it can offer. Therefore, a model that uses past performance to predict
salary may run into difficulties caused by truncation. Second, to ensure that teams have some
flexibility despite the salary cap, the NBA includes several salary exceptions that allow a team to
sign a player to a certain amount regardless of salary cap implications. Such exceptions can
force player salaries into slots that make it difficult to fit a continuous distribution to salaries and
again create a system where players may not receive their marginal value. Third, to circumvent
the salary cap basketball teams often offer contracts that are back-loaded or include large signing
bonuses that can be distributed over the length of the contracts in years where the team may be
well under the salary cap. The large variation in yearly salaries can prove difficult to model.
The relative consistency of baseball contracts makes analysis more straightforward.
Baseball’s system of free agency also more closely resembles a free market. Unlike the
NBA, MLB does not have rules that allow a player’s current team to offer a contract amount that
no other team can match. Unlike the NFL, MLB does not feature franchise tags that allow teams
to designate future free agents as “franchise players” and restrict them from entering the free
agent market. The salary determination process is therefore easier to model because the bidding
is more competitive and players have less uncertainty about when they will become free agents.
1.2 Large Commitments: Free Agent Decisions
Because the contract determination process can be complex and involve many factors that
are hard to separate, this paper will focus on the specific decision of how teams value most
recent performance when signing free agents. An understanding of this aspect will offer insight
into how teams generally deal with large financial commitments.
The team’s goal is some combination of winning a championship and maximizing its
profits. A player’s on-field production is an important component of both of these goals, and all
else being equal an increase in on-field production increases the probability of winning and the
team’s profits. When offering a player a contract, a team must predict players’ future on-field
production, but this task can be quite difficult, as highlighted by the Adrian Beltre example.
Changes in player performance could be a result of two factors. First, a player’s possible
performances form a distribution from which each year is one draw. Even if the distribution
remains unchanged, player performance can vary between years. Second, a player’s distribution
may change. For instance, if a player becomes more accustomed to major league pitching, his
distribution of outcomes may change to favor higher performance. Distinguishing between these
factors is the team’s challenge.
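The two sources of variation described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the slugging means (0.450, 0.500), the spread (0.030), and the function name `simulate_seasons` are hypothetical values chosen for illustration and do not come from the paper's data.

```python
import random

def simulate_seasons(true_means, sd, seed=0):
    """Draw one observed slugging figure per season around that season's true mean.

    true_means and sd are hypothetical illustration values, not estimates.
    """
    rng = random.Random(seed)
    return [rng.gauss(mu, sd) for mu in true_means]

# Factor 1: the distribution is fixed, yet observed seasons still differ.
fixed = simulate_seasons([0.450] * 4, sd=0.030)

# Factor 2: the distribution itself shifts upward in the final season.
shifted = simulate_seasons([0.450, 0.450, 0.450, 0.500], sd=0.030)

for year, (a, b) in enumerate(zip(fixed, shifted), start=1):
    print(f"season {year}: fixed={a:.3f} shifted={b:.3f}")
```

Because both calls reuse the same seed, the first three seasons coincide and only the last one differs. A team that sees only the realized numbers cannot tell from a single strong season which of the two cases it is observing.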
Adding to this variation in on-field performance is a player’s ability to control his effort.
Following Krautmann (1990), a worker’s marginal production equation is
$$MP_j = f(E_j, X, \varepsilon_j) \qquad (1)$$
where $MP_j$ is the jth worker’s marginal product, $E_j$ is the jth worker’s effort, $X$ is a vector of
other inputs, and $\varepsilon_j$ is a random variable. I define effort broadly to include any measures that
improve performance, whether they are expected of the worker or not. For instance, taking
steroids is illegal and many teams may frown upon such action, but if players can increase their
production by taking steroids, then steroid use is a measure of effort. Effort can be decomposed, as
Krautmann (1990) outlines, as
$$E_j = h(CT_j, Z_j) \qquad (2)$$
where $CT_j$ is the time remaining on the jth worker’s contract and $Z_j$ is a vector of other factors
affecting his effort. The question is why might
$$\frac{\partial MP_j}{\partial CT_j}, \qquad (3)$$
the marginal effect of time remaining on a contract on marginal product, not equal zero.
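One way to see when the derivative in (3) can be nonzero is to substitute the effort equation (2) into the production equation (1) and apply the chain rule. The step below is a sketch that assumes the other inputs $X$ and the random term $\varepsilon_j$ do not themselves depend on $CT_j$; that assumption is not stated in the text.

```latex
\begin{align*}
MP_j &= f\bigl(h(CT_j, Z_j),\, X,\, \varepsilon_j\bigr) \\
\frac{\partial MP_j}{\partial CT_j}
     &= \frac{\partial f}{\partial E_j}\cdot\frac{\partial h}{\partial CT_j}
\end{align*}
```

Under that assumption, the effect is nonzero only if both factors are nonzero: effort must matter for production, and the time remaining on the contract must matter for effort.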
If baseball players anticipate they will receive higher salaries for certain actions, they will
engage in such behavior. Players at the beginning of long-term contracts know that they have
locked in specific salaries for the following years and may expect that their current performance
will have no impact on their compensation. Players in the final year of their contracts, or the
“contract year,” can expect to negotiate new contracts in a short time frame and may believe that
increased performance now will translate to higher salaries in a year. For this reason, players
may exert more effort in the final year of their contract because they expect it will have
significant benefits.
Whether this expectation is rational is unclear. Teams presumably evaluate players on
their performance over several years. If this pattern is established, players have little incentive to
adjust their effort for only the end of their current contracts. Instead, production should be more
consistent. But teams may have outside pressures that preclude them from taking such an
approach toward evaluation. For instance, if a baseball team’s fans or sports writers are tired of
losing and see that a player who just finished a productive season is available, they may call for
their team to sign the player. Such pressure could cause the team to alter how it values players.
Teams have turned to incentive mechanisms to protect themselves from the risk
associated with signing a player whose future production is highly uncertain. For instance, many
contracts include bonuses if a player participates in a certain number of games or makes an All-
Star team. The size of these bonuses, however, is quite insignificant compared to the base
salaries, and as Harder (1989) notes, large incentive mechanisms are highly correlated with high
base salaries. These incentive mechanisms thus may not provide the team with much insurance.
Therefore, teams continue to face the problem of how to predict future performance. These
decisions can have large impacts because free agent contracts are often multi-year commitments
to players for large sums of money.
1.3 Small Commitments: Options
Options can be team options, player options, or mutual options. When a player and team
initially negotiate a contract, they can add an option to the end of the contract for a pre-specified
amount. Then, when the player completes the non-option years of the contract, the holder of the
option chooses whether to exercise it. If the holder exercises the option, the player’s contract is
extended by one year for the amount that was specified when the contract was originally signed.
For a mutual option, both the team and the player must exercise the option for the previous
contract to extend. Typically the team or player can exercise the option at any point during the
contract, although only the teams seem to exercise options early.
Team options usually include buyouts that the player receives if the team chooses not to
exercise the option, though these buyouts are significantly smaller than the option amount.
Sometimes the contract will delineate conditions under which a team option will “vest,” or
automatically become guaranteed.
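The role of the buyout can be made concrete with a short sketch. The decision rule here is an assumed benchmark, not a rule taken from the text: because the buyout is paid only if the option is declined, the additional money a team commits by exercising is the option amount minus the buyout, and a team comparing one-year value to cost would exercise when the player's expected value covers that difference. The function names and dollar figures are hypothetical.

```python
def net_cost_of_exercising(option_salary, buyout=0.0):
    """Extra money committed by exercising rather than declining.

    Declining costs the buyout; exercising costs the option salary.
    """
    return option_salary - buyout

def should_exercise(expected_value, option_salary, buyout=0.0):
    """Assumed benchmark rule: exercise when expected one-year value
    is at least the net cost of exercising."""
    return expected_value >= net_cost_of_exercising(option_salary, buyout)

# Hypothetical figures in millions of dollars.
print(should_exercise(expected_value=7.5, option_salary=8.0, buyout=1.0))
print(should_exercise(expected_value=6.5, option_salary=8.0, buyout=1.0))
```

With an 8.0 option and a 1.0 buyout, the net cost is 7.0, so the first call returns True and the second returns False.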
Some form of options has actually predated free agency, as during the reserve era the
team always had an option to renew the previous one-year contract if the two parties could not
negotiate a new deal. But once the free agency era began, options became much less frequent
and have only recently regained their popularity. Interestingly, options are so entrenched in the
business of baseball that the players’ union had an option to extend the 1997-2000 Collective
Bargaining Agreement for one year, which it exercised.
When exercising options, teams are thus making smaller commitments to players because
options are only one-year extensions of contracts. Furthermore, options only apply to players
previously under contract with the team, so the team likely is aware of the player’s work ethic
and interactions with teammates. Because option choices are relatively simple binary decisions
to model – the holder either exercises the option or does not – they lend themselves to salary
models and offer an interesting point of comparison to modeling free agent contracts. This paper
will consider only team options.
2. Literature Review
There has not been a study of player or team options in baseball. I hypothesize that this
dearth of literature is partly a result of the challenge of finding comprehensive options data.
While downloadable databases of salaries and lists of free agents are readily accessible, there is
no organized public source of option data, and until the growth of Internet blogs, researchers
lacked a way to compile option data. Furthermore, options seem to have become much more
popular in recent years, and therefore their significance has only arisen recently. My attempt to
collect option data and analyze it thus appears to open the door into understanding how players
and teams use options and whether these low-risk commitments differ from decisions regarding
free agent contracts.
As for the question of whether players allow their effort to fluctuate over the course of a
contract, baseball researchers have offered several models but have yet to settle on a conclusive
answer. Harder (1989) collected data from four years: 1976, 1977, 1987, and 1988. Harder used
two unconventional statistics, “runs created” and “total average,” as his measures of
performance. He claimed that these statistics could best measure a player’s contribution to a
team’s success, and their correlations with conventional statistics were very high. Harder then
identified the log of salary as the dependent variable and ran a linear regression with variables
for players’ career production, experience, contract status, team performance, ethnicity, and All-
Star status the previous season. The paper concluded that being in a contract year had no effect
on total average or runs created in 1976 or 1977 while in the later years (1987 and 1988) the effect was negative on runs
created but not statistically significant on total average. Harder concluded that players did not
anticipate that statistical gains in their contract years would lead to higher salaries. The study’s
years were so unique in baseball history that these results likely do not apply to the current
environment. During 1976 and 1977, the first two years of data, the league was transitioning
into free agency. Players likely could not predict how their production would affect their future
salaries. During the later years, 1987 and 1988, owners were found to be colluding. Again,
players may have expected to be low-balled in contract negotiations regardless of performance
and thus saw no incentive to try harder during their contract year. Because today’s league more
closely models a free market, players’ incentives have likely changed.
Without using the standard regression, Krautmann (1990) attempted to answer the
related question of whether shirking (exerting less than maximal effort) occurs in the first year of
a long-term contract. Krautmann’s data ran from 1976 to 1983 and included free agents who
signed contracts with lengths of at least five years. He constructed forecast intervals for players’
slugging averages based on past data and then defined super-par and sub-par performance as the
top and bottom 5% of the interval, respectively. Krautmann found that only 4.5% of the players
had super-par performances and only 1.8% had sub-par performances. He concluded that
shirking did not occur.
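The forecast-interval idea can be sketched as follows. The text does not describe how Krautmann built his intervals, so everything in this block is an assumption made for illustration: a normal approximation around the mean of a player's past slugging averages, a standard prediction-interval spread of s * sqrt(1 + 1/n), and 5% in each tail. The function names and sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def forecast_interval(past_slg, tail=0.05):
    """Normal-approximation interval for the next season's slugging average.

    Assumed construction: mean of past seasons, plus or minus z times
    s * sqrt(1 + 1/n), leaving `tail` probability on each side.
    """
    n = len(past_slg)
    center = mean(past_slg)
    spread = stdev(past_slg) * sqrt(1 + 1 / n)
    z = NormalDist().inv_cdf(1 - tail)
    return center - z * spread, center + z * spread

def classify(past_slg, observed_slg, tail=0.05):
    """Label a season relative to the forecast interval."""
    low, high = forecast_interval(past_slg, tail)
    if observed_slg > high:
        return "super-par"
    if observed_slg < low:
        return "sub-par"
    return "within interval"

past = [0.420, 0.440, 0.430, 0.450]  # hypothetical prior seasons
print(forecast_interval(past))
print(classify(past, 0.500), classify(past, 0.380), classify(past, 0.440))
```

If effort did not respond to contract status, roughly 5% of players would be expected to land in each tail by chance, which is the benchmark against which the 4.5% and 1.8% figures are read. With only a few past seasons per player, a t-based quantile would be more appropriate than the normal quantile used here.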
Scroggins (1993) challenged Krautmann’s result by claiming that slugging percentages
would not pick up shirking but that using total bases, a measure that could account for time spent
injured, proved that shirking occurred. Scroggins used a linear regression with Krautmann’s data
and found that the coefficient on whether a player had just signed a long-term contract was
significantly negative. Krautmann (1993) defended his choice of slugging percentage and then
applied his method of analysis to the total bases statistic. He found that the results were nearly
identical to those of his initial paper.
This paper will add to the existing literature in several ways. First, the discussion of team
options fills in a gap that research has not yet covered. Their emergence in recent years and their
uniqueness as non-guaranteed contract years allow options to offer insight into how players react
to a more uncertain contract situation and how teams value players. Second, the analysis of
effort during contract years will make use of more recent data than previous studies. Particularly
in a labor market that has fundamentally changed over the last 20 years, recent data is essential in
explaining current patterns of behavior. Finally, previous studies have focused on whether
players react to an incentive to alter behavior in the contract year. This paper will delve into the
team’s side of this issue by asking whether teams create the incentive to alter behavior and how
they value players in the context of possible shirking.
3. Data
As discussed above, pitching statistics are not general enough to cover all types of
pitchers. I choose to drop pitchers from my analysis for two reasons. First, splitting up the
pitchers into starters, long and middle relievers, and closers and analyzing them separately
causes my conclusions to suffer from small sample sizes. Second, because the different pitching
roles can require different mentalities and skill sets, a player’s past performance in one role may
not accurately predict his future performance if he changes roles. Hitters, on the other hand, can
be aggregated, and while they may switch positions, this change should not affect their hitting
performance.
My sample consists of player-years from 2001 to 2004. Players who signed free agent
contracts beginning in 2005 are also included.1 While the full population of hitters from these
years numbers 2111, my sample includes only 1330 of the player-years, or 63.0%. This partial
coverage is a result of incomplete contract information. My data consists of three main subsets:
player on-field performance, off-field player characteristics, and contracts. On-field
performance is publicly available for all players and consistent across sources because baseball
designates an official scorer.2 Off-field player characteristics are also widely available and do
not depend on the source.3 Contract information is much harder to find and the level of detail
can depend on the source.4
1 Because I evaluate free agents based on variables observed prior to their new contract, these observations use data from only 2001-2004. These players are included to increase the sample size of free agents.
2 For on-field performance, I used version 5.4 of Sean Lahman’s baseball database, “The Baseball Archive.” The database is accessible at http://www.baseball1.com/statistics/. I checked a random sample of these statistics with ESPN.com (http://espn.go.com), and found no inaccuracies.
3 I again used version 5.4 of “The Baseball Archive” as well as Baseball-Reference.com (http://www.baseball-reference.com/).
4 For the salary amounts, I used version 5.4 of “The Baseball Archive.” For lists of free agents, I consulted Associated Press articles. For contract lengths and option data, I used two online blogs and checked their data against Associated Press articles. The first blog, MLB Contracts (http://www.bluemanc.demon.co.uk/baseball/mlbcontracts.htm), is no longer accessible directly, so I used “The
There is no consensus on how to measure yearly on-field batter performance. The most
popular statistics in newspapers and casual fan conversations are home runs, runs batted in, and
batting average. Home runs give a sense of a player’s power, but they shed little light on his
ability to reach base through other hits. Furthermore, they are highly dependent on the number
of games a batter plays. Runs batted in (RBIs) may be a strong measure of a hitter’s contribution
to a team’s production. The drawback, however, is that a hitter’s opportunities for RBIs are very
dependent on his teammates. An RBI is much easier to obtain if teammates have reached base.
Thus, hitters who bat after good hitters have more opportunities for RBIs. Batting average (BA)
gives the percentage of hits a player earns for his total at-bats:
$$BA = \frac{\text{Total Hits}}{\text{Total At-Bats}}. \qquad (4)$$
At-bats are the total number of times a player bats, though walks, sacrifices, and hit-by-pitches
are not included. Batting average offers a strong measure of a player’s ability to reach base but it
fails to distinguish between different types of hits. For instance, a home run is more valuable
than a single, but batting average counts each as one hit. The most common statistics thus are
not sufficient for analysis that attempts to find a player’s full independent value.
Other “hybrid” statistics more closely measure a player’s value to a team. Most previous
studies have used slugging percentage (SLG), where
$$SLG = \frac{\text{Total Bases}}{\text{Total At-Bats}}. \qquad (5)$$
The total bases statistic, another performance measure, refers to the number of bases a player has
earned through hits. For instance, a single equals one base, a double equals two, but a non-hit
WayBack Machine” (http://www.archive.org/web/web.php) to access the site. The other blog, Cot’s Baseball Contracts (http://mlbcontracts.blogspot.com/), proved to be an excellent source.
like a walk does not count even though the player advances a base. Slugging percentage thus
conveys a combination of how often a player gets hits and how valuable the hits are.
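The difference between batting average and slugging percentage can be seen in a short sketch. The function names and the two sample stat lines are hypothetical; the base values (one through four for a single through a home run) follow the definition of total bases given above.

```python
def total_bases(singles, doubles, triples, home_runs):
    """Bases earned through hits: 1 per single up to 4 per home run."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs

def batting_average(hits, at_bats):
    """Total hits divided by total at-bats."""
    return hits / at_bats

def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by total at-bats."""
    return total_bases(singles, doubles, triples, home_runs) / at_bats

# Two hypothetical hitters, each with 150 hits in 500 at-bats.
contact = dict(singles=150, doubles=0, triples=0, home_runs=0)
power = dict(singles=90, doubles=30, triples=0, home_runs=30)

for name, line in (("contact", contact), ("power", power)):
    hits = sum(line.values())
    print(name,
          round(batting_average(hits, 500), 3),
          round(slugging_percentage(at_bats=500, **line), 3))
```

Both hitters bat .300, but the first slugs .300 and the second slugs .540, which is the distinction batting average cannot capture.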
On-base percentage (OBP) improves upon batting average by accounting for non-hits,
such as walks, that lead to the batter reaching base without causing an out. OBP is the ratio of
the number of times a batter reaches base without making an out to his total plate appearances. OBP has gained
popularity recently, as recounted in Michael Lewis’s Moneyball (2003), which described the
approach taken by Oakland Athletics General Manager Billy Beane in evaluating players. The
problem with OBP is the same as that of BA – OBP fails to differentiate between different types
of hits. In response to this deficiency, baseball researchers have turned to OPS, the sum of SLG
and OBP. OPS combines the advantages of SLG and OBP. Bill James and Thomas Boswell
also have argued for their own complicated statistics. James (1988) constructed “runs created,”
which accounts for not only hits and walks but also considers a player’s ability to steal a base.
Boswell (1988) created “total average,” a measure that also included base-running statistics.
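A brief sketch of OBP and OPS follows. The text describes OBP as times on base over plate appearances; the code uses the standard scoring formula, (H + BB + HBP) / (AB + BB + HBP + SF), whose denominator omits sacrifice bunts. That choice of denominator, the function names, and the sample stat line are assumptions of this example rather than details from the paper.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Standard OBP: times on base over (AB + BB + HBP + SF)."""
    reached = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return reached / opportunities

def ops(obp, slg):
    """On-base plus slugging: the simple sum of the two rates."""
    return obp + slg

# Hypothetical season: 150 hits, 60 walks, 5 HBP, 500 at-bats, 5 sacrifice flies.
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sacrifice_flies=5)
print(round(obp, 3), round(ops(obp, 0.540), 3))
```

Because hits enter both the numerator of OBP and the total bases behind SLG, the sum counts them twice, which is one reason OPS is harder to interpret than either component alone.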
The real test of which performance measure to employ is whether the teams actually use
it when predicting a player’s value. Even if one statistic is the best predictor of wins or team
profit, its usefulness in this analysis is limited because I am testing how teams weigh the most
recent performance versus less recent performance in determining salary offers. Therefore, I
take the statistic, or combination of statistics, that the team uses as given. But since teams do not
make such information public in most cases, I assume that teams have learned from experience
and now use a performance measure that best accounts for a player’s contribution. The problem
returns to determining the performance measure that best leads to wins and profits. The
combination of statistics should account for both total production during a season and average
production during the games a batter plays. Thus, the statistics should distinguish between two
players who produce equal cumulative totals but play in different numbers of games. For the
reasons already discussed, home runs, RBIs, BA, and OBP do not account for players’ full value
in ways that better statistics do. Runs created and total average are too obscure to assume that all
teams use them to evaluate players.
Instead, I use a combination of slugging percentage and total bases. While OPS includes
SLG, the results hardly change when I substitute OPS for SLG, and SLG is easier to interpret
than OPS, which double-counts hits (hits appear in both the SLG and OBP components). Also,
most studies have used SLG. Therefore, I prefer SLG to OPS. In order to predict future
slugging percentage ($SLG_t$) and a player’s value to a team, I will use lagged slugging
percentages from the three previous seasons ($SLG_{t-1}$, $SLG_{t-2}$, $SLG_{t-3}$).5 In order to distinguish
between players who are efficient but often hurt and players who are efficient and play many
games, I will include a three-year average of total bases (AVGBASES). This combination is
simple but takes into account a player’s rate of production and total contribution.6
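As an illustration of how these regressors could be assembled for one player, the sketch below builds the lagged slugging percentages and the three-year average of total bases. It assumes that AVGBASES averages the same three seasons that supply the lags, which the text does not state explicitly; the data layout, the function name `build_features`, and the sample numbers are hypothetical.

```python
def build_features(seasons, target_year):
    """Build one regression row for a player.

    seasons maps year -> {"slg": float, "bases": int}. Returns None when
    the target season or any of the three prior seasons is missing.
    """
    needed = [target_year - k for k in (0, 1, 2, 3)]
    if any(year not in seasons for year in needed):
        return None
    lags = [seasons[target_year - k] for k in (1, 2, 3)]
    return {
        "SLG_t": seasons[target_year]["slg"],
        "SLG_t-1": lags[0]["slg"],
        "SLG_t-2": lags[1]["slg"],
        "SLG_t-3": lags[2]["slg"],
        # Assumed definition: mean total bases over the three prior seasons.
        "AVGBASES": sum(s["bases"] for s in lags) / 3,
    }

player = {
    2001: {"slg": 0.450, "bases": 250},
    2002: {"slg": 0.470, "bases": 270},
    2003: {"slg": 0.430, "bases": 200},
    2004: {"slg": 0.500, "bases": 290},
}
print(build_features(player, 2004))
print(build_features(player, 2003))  # None: the 2000 season is missing
```

Keeping the rate statistic and the counting statistic as separate columns is what lets the model tell an efficient but often injured hitter from one who is efficient over a full season.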
In addition to a batter’s hitting skill, his value to a team includes fielding abilities.
Fielding can be difficult to measure because a high number of errors may not necessarily indicate
a poor fielder but rather a fielder with wider range who can reach many balls. Therefore, his
errors might have been hits if other fielders were playing. Furthermore, general managers rarely
mention players’ fielding when signing free agents, so I am inclined to believe that only the best
fielders receive more money for their fielding abilities. Thus, I include a dummy variable
5 I choose to predict slugging percentage rather than total bases because total bases are less consistent across years as they depend largely on injuries that occur randomly. Therefore, lagged slugging percentages are better predictors of future slugging percentage, so the player model will predict slugging percentage but still make use of total bases.
6 In the second model used in evaluating players with options, I use lagged values of total bases ($BASES_{t-1}$, $BASES_{t-2}$, $BASES_{t-3}$) as well as slugging percentages because the dependent variable is now a salary amount, not a performance statistic.
(GOLDGLOVE) for whether the position player (non-pitcher) has received a Gold Glove
Award, which designates each year the best fielder at a certain position in each league.
Off-field player characteristics are player-specific characteristics that do not depend on
performance. The most obvious variable choice is AGE. I expect that as players age, they
become accustomed to the grueling 162-game seasons, they learn what lifestyle will allow them
to perform at a high level, and they make other changes driven by experience to maximize their
performance. Batters may also adapt to Major League pitching over time, but conversely
pitchers may adapt to the batters’ tendencies, which makes the total effect ambiguous. In this
sense, deviations in performance between years could be a result of a player’s underlying ability
changing rather than random draws each year from the same distribution. Thus, unless the effect
of pitchers adapting to batters is particularly strong, I expect older players to have higher
statistics and higher value to teams. But at a certain point older players face disadvantages that
can affect their value. Injuries can accumulate, hitters’ eyes may become less sharp, and other
physical ailments can limit a player’s performance. To model this quadratic relationship, I
include an AGE2 variable. Unfortunately data on how long players have been in the minor
leagues, which could affect their major league performance, is nearly impossible to collect.
Players also contribute value to teams beyond their on-field performance. Players with
magnetic personalities or characteristics that inspire the community have marketing potential that
can increase team revenues. Because a player’s marketability is nearly impossible to measure
precisely, I include a dummy variable (ALLSTAR) for whether a player was an All-Star in any
of the three previous seasons. All-Stars are often the best-known players with the greatest
marketing potential, so All-Star status serves as a proxy for a player’s marketability.
I also include dummy variables for the observation’s year (2001, 2002, 2003, 2004).
Even though the period studied is short, the labor market or general performance may change
between the years. In 2002, the owners and players association agreed to a new Collective
Bargaining Agreement. This agreement included for the first time revenue sharing among teams,
a luxury tax on teams with high payrolls, and testing for steroids. Because these changes could
affect player performance and team decisions, yearly dummy variables are necessary.
Furthermore, rumors that the composition of the baseballs has changed over the years demand
that I differentiate between years.
To relate these variables to labor market decisions, I require contract data. Because such
data is reported in a haphazard manner, I can rely on only the most significant and basic parts of
the contract. For instance, the reporting of performance incentive clauses varies between sources
as well as between players. Details of contracts of high-profile players are more often reported
because the public’s interest in these players is high. More marginal players are less noteworthy
and may even be playing with minor league contracts that escape the media’s attention.
Because most players’ base salaries appear publicly, I will focus my analysis on base
salaries (SALARY). A player’s salary serves as a proxy for the player’s value to a team and is a
scarce resource for the team. For most players, their base salary dominates potential bonuses.
This may not be true, however, for some players, especially those who are injury-prone. The
difficulty in collecting data on incentives and bonuses unfortunately precludes a more
comprehensive analysis. I express all salaries in millions of dollars to avoid large
numbers. I also collect the length of contracts (LENGTH) but do not explicitly place it in my
models. Contract length can represent the size of a team’s commitment and is thus relevant. But
since it is endogenous in the teams’ decision-making, I use it only to sort observations and
determine which observations appear in each model. Similarly, NEWCONTRACT is a dummy
variable for whether the player has signed a new free agent contract for the current season, but it
does not appear in the equations. Instead, it filters old contracts from entering the free agent
salary equations.
Finally, to test whether a player outperforms expectations prior to becoming a free agent,
I include a dummy variable (CONTRACTYEAR) for whether the player will be a free agent
after the current season.7
The descriptive statistics, reported in Table 2, offer some basic trends. Slugging
percentage falls in the sample from year-to-year whereas total bases trends upward. The average
age is about 31 years old and the average salary is $3.357 million. Almost 25% of observations
signed new free agent contracts and 25% are in the final year of a contract before becoming free
agents.
Because this sample covers only 62.9% of player-years from 2001 to 2004, the degree to
which the sample represents the population is at issue. Since on-field performance and off-field
player characteristics data are available for all players, I can test whether the players included in
the sample are characteristically different from those not included. A test of whether means are
equal shows that the players in the sample are older, more experienced, have higher base salaries
(when reported), are less often switch hitters, and have more at-bats, higher slugging
percentages, and higher on-base percentages in the prior year. All of these differences, except
the proportion of switch hitters, appear significant in all four years as well as in the aggregated
sample. While the following analysis suffers from a censoring problem, the bias in the sample
favors older and more experienced players, who are exactly those players more likely to be
eligible for free agency and to have options in their contracts. This paper attempts to answer
questions that revolve around free agency, so the sample’s bias toward older and more
experienced players is mitigated. This paper also focuses on the difficulty that teams have in
evaluating players and making contract decisions. These decisions are more important for
players with higher salaries because the team’s commitment is larger. The censoring problem is
thus less significant than it first appears.
7 Some players will become free agents unexpectedly if they are released or an option is not exercised. Only those players who are guaranteed to become free agents, unless an extension is signed before the season ends, have a 1 for CONTRACTYEAR.
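The test of equal means behind Table 3 can be illustrated with a short Python sketch. It assumes a two-sample t-statistic with unequal variances (Welch), which the text does not specify, and it treats the parenthesized values in Table 3 as standard deviations; the function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t-statistic that does not assume equal variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Age row of Table 3: players in the sample vs. players not in the sample.
t_age = welch_t(30.76, 4.14, 1330, 28.32, 4.31, 784)
print(round(t_age, 2))
```

Under these assumptions the statistic lands close to, but not exactly on, the 12.741 reported in Table 3; the small gap is consistent with rounding of the published means or with a slightly different variance formula.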
4. Model
To examine how teams make large commitments, I estimate a player regression equation and a team
regression equation. On the player side, my dependent variable is the player’s on-field performance,
measured in terms of slugging percentage, in the current year. To predict such performance, I
use off-field player characteristics and past on-field performance. I also include the dummy
variable CONTRACTYEAR. On the team side, my dependent variable is the first-year salary
given to a newly-signed free agent. I use the same off-field player characteristics and past on-
field performance variables in addition to dummy variables for ALLSTAR and GOLDGLOVE.
These variables appear only in the team equation because they are relevant for a player’s value to
a team but they fail to predict future slugging percentage beyond the inclusion of the lagged
SLG. Similarly, CONTRACTYEAR is only relevant to the player, who may alter his
performance when in the contract year, whereas every free agent the team signs just finished a
contract year.
The player equation makes use of all observations in the sample. The team equation,
because it applies to team decisions on free agents, only uses observations for which the player
has just signed a new free agent contract. The dataset is thus unbalanced between the equations,
as players register slugging percentages each year but sign free agent contracts less frequently.
Because the equations model similar processes, I expect that the regression error terms
could be correlated. Instead of running two separate OLS equations, I use seemingly unrelated
regression to determine the coefficients jointly. I want to test whether coefficients on the same
independent variables are equal across the two equations, so I need both dependent variables to
lie on the same scale. With an OLS regression, I estimate that each marginal slugging
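percentage point (.001) equates to a marginal value of $29,500.8 After rescaling SLGt into dollar
amounts, the equations appear as follows:
\begin{align}
\mathit{SLG}^{*}_{t} &= \beta_0 + \beta_1 \mathit{SLG}_{t-1} + \beta_2 \mathit{SLG}_{t-2} + \beta_3 \mathit{SLG}_{t-3} + \beta_4 \mathit{AVGBASES} + \beta_5\,\mathit{2001} + \beta_6\,\mathit{2002} \notag \\
&\quad + \beta_7\,\mathit{2003} + \beta_8\,\mathit{2004} + \beta_9 \mathit{AGE} + \beta_{10} \mathit{AGE}^{2} + \beta_{11} \mathit{CONTRACTYEAR} + \varepsilon_t \tag{6} \\
\mathit{SALARY} &= \gamma_0 + \gamma_1 \mathit{SLG}_{t-1} + \gamma_2 \mathit{SLG}_{t-2} + \gamma_3 \mathit{SLG}_{t-3} + \gamma_4 \mathit{AVGBASES} + \gamma_5\,\mathit{2001} + \gamma_6\,\mathit{2002} \notag \\
&\quad + \gamma_7\,\mathit{2003} + \gamma_8\,\mathit{2004} + \gamma_9 \mathit{AGE} + \gamma_{10} \mathit{AGE}^{2} + \gamma_{11} \mathit{ALLSTAR} + \gamma_{12} \mathit{GOLDGLOVE} + u_t \tag{7}
\end{align}
where SLG*t is the rescaled slugging percentage in the current year.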
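The rescaling between slugging percentage and dollars can be sketched in a few lines of Python. The sketch assumes only what the text states: one point (0.001) of slugging percentage is worth $29,500 and salaries are measured in millions of dollars, so one full unit of slugging percentage corresponds to 29.5. The function names are hypothetical; the coefficients are taken from Table 4.

```python
DOLLARS_PER_POINT = 29_500                       # value of one slugging point (0.001)
MILLIONS_PER_SLG_UNIT = DOLLARS_PER_POINT * 1000 / 1e6   # 29.5 ($ millions per 1.000 of SLG)


def to_dollar_scale(slg_value):
    """Convert a quantity on the slugging scale to $ millions."""
    return slg_value * MILLIONS_PER_SLG_UNIT


def to_slg_scale(dollar_value):
    """Convert a quantity in $ millions back to the slugging scale."""
    return dollar_value / MILLIONS_PER_SLG_UNIT


# Player-equation coefficients from Table 4, reported on the dollar scale.
lag1_slg = to_slg_scale(13.88125)            # coefficient on SLGt-1
contract_year_slg = to_slg_scale(0.2795232)  # coefficient on CONTRACTYEAR

print(round(lag1_slg, 6), round(contract_year_slg, 6))
```

Dividing the dollar-scale coefficients by 29.5 recovers the slugging-scale column of Table 4, which is why the two columns carry identical test statistics.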
Because STATA’s seemingly unrelated regression command “sureg” cannot handle
unbalanced datasets where the number of observations differs between equations, I follow the
method outlined in McDowell (2004). I scale the equations so that the error terms have equal
variance and then combine the data into one panel with a variable indicating which equation the
observation enters. I use the command “xtgee” to produce the results equal to those of
seemingly unrelated regression for unbalanced data.
8 By convention, one point of a baseball percentage refers to 0.001. I derive the estimate for the marginal value of a point of slugging percentage from an OLS regression of salary on three lags of slugging percentage and a constant. The estimate of 29.5 ($29,500) is the sum of the coefficients on the lags of slugging percentage.
For the question of how teams make smaller commitments, I predict how much players
with upcoming options can expect to receive as free agents. I use this prediction as a proxy for
the player’s value to the team with the team option. To predict this value, I run the following
regression to determine coefficients on key variables:
\begin{align}
\mathit{SALARY} &= \delta_0 + \delta_1 \mathit{SLG}_{t-1} + \delta_2 \mathit{SLG}_{t-2} + \delta_3 \mathit{SLG}_{t-3} + \delta_4 \mathit{BASES}_{t-1} + \delta_5 \mathit{BASES}_{t-2} + \delta_6 \mathit{BASES}_{t-3} \notag \\
&\quad + \delta_7\,\mathit{2001} + \delta_8\,\mathit{2002} + \delta_9\,\mathit{2003} + \delta_{10}\,\mathit{2004} + \delta_{11} \mathit{AGE} + \delta_{12} \mathit{AGE}^{2} + \delta_{13} \mathit{ALLSTAR} + v_t \tag{8}
\end{align}
Because options only extend a player’s contract for one year, in determining the
coefficients I only include new free agents who sign one-year contracts. I then use the
coefficient estimates and the data from players with options to predict their salaries, which I label
PREDICTSAL. Once I have a player’s predicted salary, I subtract from it the cost to the team of
exercising the option (the option amount minus any pre-negotiated buyout if the option is
declined). This amount, if positive, predicts that the player’s value to the team is higher than the
cost of exercising the option. A negative outcome means the team’s costs are higher than its
predicted returns from keeping the player.
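The comparison described above reduces to simple arithmetic, sketched here in Python. The function names and the example figures are hypothetical and are not drawn from the sample; all amounts are in millions of dollars.

```python
def option_cost(option_amount, buyout=0.0):
    """Net cost of exercising: the option salary minus the buyout the team
    would owe anyway if it declined the option."""
    return option_amount - buyout


def option_surplus(predicted_salary, option_amount, buyout=0.0):
    """PREDICTSAL - OPTIONCOST: positive when the predicted value of the
    player exceeds the net cost of exercising the option."""
    return predicted_salary - option_cost(option_amount, buyout)


# Hypothetical player: predicted salary 4.0, option amount 6.0, buyout 0.5.
surplus = option_surplus(4.0, 6.0, 0.5)
print(surplus)  # negative: the model would advise declining
```

Subtracting the buyout matters because it is a sunk cost of declining; ignoring it would overstate how expensive the option is to exercise.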
To test whether a higher value of PREDICTSAL – OPTIONCOST predicts that a team is
more likely to exercise the option, I use two methods. First, I use the nonparametric Wilcoxon
Sum Rank Test. I order the players by PREDICTSAL – OPTIONCOST from lowest to highest
and then sum the ranks of the contract options that were exercised. Because the numbers of
options exercised and declined both exceed 10 in the sample, I use a normal approximation to
find a p-value for the test of the null hypothesis that higher values of PREDICTSAL –
OPTIONCOST have no relationship with whether a team exercises an option.
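The ranking step can be illustrated with a minimal Python sketch. It assumes that tied values share the average of the ranks they span, a common convention that the text does not discuss; the function name `rank_sum` and the example data are hypothetical.

```python
def rank_sum(values, exercised):
    """Sum of the ranks (1 = lowest value) of the observations flagged True.
    Tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        midrank = (pos + end) / 2 + 1  # average of 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = midrank
        pos = end + 1
    return sum(r for r, flag in zip(ranks, exercised) if flag)


# Hypothetical values of PREDICTSAL - OPTIONCOST ($ millions) and decisions.
diffs = [-3.0, -1.5, -1.5, 0.2, -0.7]
flags = [True, False, True, True, False]
w = rank_sum(diffs, flags)
print(w)
```

If exercised options tended to have higher values of PREDICTSAL - OPTIONCOST, this sum would be large relative to its expectation under the null hypothesis.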
Second, I use a logit regression to find the relationship between the PREDICTSAL –
OPTIONCOST variable and a variable for whether the team exercises the option. My logit
regression equation appears as follows:
\begin{equation}
\mathit{EXERCISE} = \theta_0 + \theta_1 (\mathit{PREDICTSAL} - \mathit{OPTIONCOST}) \tag{9}
\end{equation}
where EXERCISE is a binary variable that takes 1 if an option is exercised and 0 if the option is
declined. I expect the Wilcoxon Sum Rank Test and the logit regression to yield similar results.
5. Results and Discussion
The results from the seemingly unrelated regressions appear in Table 4 at the end of the
paper. I will analyze the results from the player equation first and then the team equation results.
Finally, I will compare the coefficients across equations.
As I would expect, lagged observations of slugging percentage strongly predict future
slugging percentage, and the predictive power is greatest for the most recent observations. A
0.010 increase in SLGt-1 predicts, ceteris paribus, a 0.0047 increase in SLGt. Similarly, 0.010
increases in SLGt-2 and SLGt-3 predict 0.0012 and 0.0005 increases in SLGt, respectively. With
p-values below 0.01, the first two lagged slugging percentages are clearly strong predictors of
future performance. The ratio of the coefficients reveals that the slugging percentage of the
previous year accounts for 72.8% of the combined weight on the three lagged slugging percentages.
Taken on its own, before considering the team’s equation, this high ratio could explain why players
might expect a jump in statistics in the contract year to lead to much larger free agent contracts.
Players indeed seem to outperform predicted performance levels when they are due to
become free agents after the season. The coefficient on the dummy variable CONTRACTYEAR
is 0.2795 (on the dollar scale), so being in the contract year predicts an increase in slugging
percentage of 0.0095. Since the p-value for the coefficient is 0.115, the relationship between
the contract year and slugging percentage is not significant at conventional levels, but
it offers suggestive evidence that anecdotes of players trying harder in the contract year have an
empirical basis. This result agrees with several studies, including Scroggins (1993), Maxcy,
Fort, and Krautmann (2002), and Marburger (2003), which find evidence of players altering
behavior in anticipation of free agency. The behavior change found in this paper favors higher
performance, while Scroggins (1993) concluded that a player’s production falls during the same
period. The result also differs from those of Harder (1989) and Krautmann (1993), which found
no changes in behavior.
A player’s total production, measured in total bases, over the previous three seasons also
has a positive correlation with SLGt. An increase of 10 total bases on average over the three
previous seasons (equal to 10 additional singles, 5 additional doubles or other combinations of
hits summing to 10 total bases) predicts a 0.0016 increase in SLGt. Because the lagged slugging
percentages should already account for a batter’s efficiency, this result seems to indicate that
players who have more at-bats on average will have higher slugging percentages in future
seasons. These players may be more durable and less affected by lingering injuries that could
depress slugging percentages.
While these predicted effects seem like very small increases in SLGt, even changes of a fraction
of a percentage point can affect a player’s value appreciably. For instance, in 2004, the
mean slugging average for players with at least 50 at-bats was 0.396. This corresponded to the
46th percentile of the distribution. An increase of 0.010 to 0.406 would have raised the player to
the 51st percentile. The effects of lagged slugging averages and being in a contract year therefore
have a discernible economic impact.
The key off-field player characteristic, AGE, exhibits unexpected effects on SLGt. I
hypothesized that players would improve as they age but at a decreasing rate. The coefficient on
AGE, however, is negative and large enough that its impact cannot be ignored. Through the linear
term alone, an increase in age by one year corresponds to a decrease in SLGt of 0.0208. The p-value
for the coefficient’s difference from 0 is 0.016. The positive coefficient for AGE2, 0.0075 (0.0003 after scaling back
to slugging percentages), is also the opposite sign of my expectation, with a p-value of 0.056.
The effect of the linear AGE variable is stronger because the break-even point at which a change
in age does not predict any change in slugging percentage occurs between 40 and 41 years old,
which lie at the very top end of the age distribution of players. The improvement in skill due to
experience and becoming accustomed to the Major Leagues may then be overrated or possibly
counteracted by other effects associated with aging, such as the accumulation of injuries. Those
players who last into their late 30s and early 40s may be the healthiest players. Because the
injury-prone players usually retire earlier, the oldest players are not representative of the rest of
the population, and their return to experience may be higher or their vulnerability to injury may
be lower.
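The break-even age follows from the quadratic specification: the predicted change from one more year of age is the AGE coefficient plus twice the AGE2 coefficient times age, which is zero at age = -b_AGE / (2 b_AGE2). The Python sketch below uses the player-equation coefficients from Table 4; the variable names are hypothetical, and the factor 29.5 is the dollar-to-slugging conversion described in the text.

```python
DOLLAR_TO_SLG = 29.5     # $ millions per 1.000 of slugging percentage
B_AGE = -0.6128213       # Table 4, player equation, $ scale
B_AGE2 = 0.007528        # Table 4, player equation, $ scale


def marginal_effect_slg(age):
    """Approximate change in predicted SLGt from one more year at a given
    age, using the derivative of the quadratic in age."""
    return (B_AGE + 2 * B_AGE2 * age) / DOLLAR_TO_SLG


# Age at which an extra year predicts no change in slugging percentage.
break_even_age = -B_AGE / (2 * B_AGE2)
effect_at_31 = marginal_effect_slg(31)   # 31 is roughly the sample mean age

print(round(break_even_age, 1), round(effect_at_31, 4))
```

Under this specification the effect at the sample mean age is still negative but much smaller in magnitude than the linear coefficient alone suggests, because the positive quadratic term offsets most of it.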
The last variables in the player equation are the year dummy variables. The coefficients
on 2001 and 2002 are negative while those on 2003 and 2004 are positive. The only small p-
value corresponds to the 2004 coefficient. The yearly effects appear to be large, as a player in
2003 can expect to have a slugging percentage 0.0142 higher than a player in 2002. As
discussed previously, there could be several reasons that the yearly averages fluctuate, including
the composition of the balls and the prevalence of steroid use. These results fail to distinguish
among many possible reasons, but they do indicate that hitting averages are susceptible to the
different yearly playing environments.
The signs of the coefficients in the team equation are similar, which signifies that general
trends in the variables are well-established enough to predict both future performance and
salaries. A 0.010 increase in slugging percentage in the most recent year predicts a salary
increase of $98,202. Equal increases in SLGt-2 and SLGt-3 generate predicted salary increases
of $38,868 and $18,915, respectively, holding all other variables constant. Only the coefficient
on SLGt-3 has a p-value above 0.01, and its p-value of 0.245 is low enough that its effect on free
agent salaries could carry some weight. These predicted salary increases only apply to new free
agent contracts, not to players who are under long-term contracts. The figures also represent the
amount the player signed for and give no conclusive prediction on what the player was offered
by other teams except that other offers likely fell below the predicted salary. The ratio of the
coefficients on the lagged slugging averages shows that the slugging percentage from the most
recent year explains 63.0% of the salary variation due to past slugging averages. This ratio is
lower than the equivalent ratio in the player equation, a relationship I will explore below.
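The two ratios can be reproduced directly from the Table 4 coefficients. This Python sketch assumes the ratio is the most recent lag's coefficient divided by the sum of the three lag coefficients, which matches the reported figures; the function name is hypothetical.

```python
def most_recent_share(lag_coefficients):
    """Share of the summed lag coefficients carried by the most recent lag."""
    return lag_coefficients[0] / sum(lag_coefficients)


# Coefficients on SLGt-1, SLGt-2, SLGt-3 from Table 4 (dollar scale).
player_lags = [13.88125, 3.587371, 1.597243]
team_lags = [9.82019, 3.886767, 1.891461]

player_share = most_recent_share(player_lags)
team_share = most_recent_share(team_lags)

print(round(player_share, 3), round(team_share, 3))
```

Because the share is a ratio of coefficients, it is the same whether the player equation is expressed on the dollar scale or the slugging scale.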
A player’s durability, as measured by his average total bases in the previous three years
while holding slugging averages constant, has a large impact on his free agent salary. An
increase of 10 total bases on average over the three previous seasons yields a predicted salary
increase of $128,598. These predicted changes in salary can have a large impact on a team’s
financial structure, although the changes are smaller in scale than those predicted in the player
equation. An increase in free agent salary of $500,000 moves the mean salary from the 72nd to
the 76th percentile.
Accounting for a player’s value to a team beyond his batting statistics is more difficult,
though the coefficients on ALLSTAR and GOLDGLOVE may capture some of this value.
Having been an All-Star at least one of the three previous seasons increases a player’s predicted
salary by $1.393 million. Because the lagged slugging averages and total bases average should
cover much of a player’s hitting value, this large salary increase due to All-Star status may
indicate that a player’s marketability is very important in determining free agent offers. The
effect of defensive ability, measured by number of gold gloves, is smaller. An additional gold
glove corresponds to a salary increase of $64,161, and the coefficient’s p-value of 0.343
indicates that this relationship is somewhat shaky. Either a player’s fielding skills may not factor
strongly in salary offer decisions or the number of gold gloves is a poor measure of a player’s
defense.
A player’s age affects salaries in a similar manner to its effect on future slugging
percentage. The AGE coefficient is negative while the AGE2 coefficient is positive, albeit with
higher p-values than the corresponding variables in the player equation. The age at which an
additional year has no effect on free agent salary is between 44 and 45 years, older than almost every
major leaguer. Even though the effect of age on a player’s future performance runs counter to my
original hypothesis, teams seem to have figured out the correct relationship when constructing
their free agent salary offers.
Finally, the year coefficients are all negative relative to the free agents who signed their
contracts for the 2005 season. Interestingly, free agent salaries on average were higher in 2001
than in the later years (until 2005). This result contradicts the general thought that free agent
salaries escalate from year to year. These differences caused by years could derive from the
variability in free agent markets across years. Some years many teams may have more money to
spend due to generally favorable economic conditions or a higher general interest in baseball.
This higher demand for players would cause salaries to rise.
To test whether the coefficients on the same independent variables are equal across the
equations, I employ F-tests.9 The coefficient on SLGt-1 is larger in the player equation, and the
test of equality yields a p-value of only 0.0198. This test is critical in answering the question of
how teams make large commitments and how their behavior affects players. If teams were
basing salary decisions primarily on future slugging percentage, then they were undervaluing the
most recent slugging average. Teams obviously consider other factors in determining a player’s
value, but the teams still appear to be particularly cautious in using recent performance to predict
future performance. This undervaluing or cautiousness discredits the prediction that players are
rationally trying harder in their contract year with the expectation that their efforts will lead to
significantly higher rewards. Teams do not create the incentive to try harder during the contract
year. Instead, if players were to act more rationally, they would hold their effort relatively more
constant prior to signing new free agent deals.
9 To test the coefficients across the equations, I use the player equation coefficients in dollar terms.
The coefficients on the other lags of slugging average, however, are very similar across
the equations, as are the coefficients on 2001, AGE, and AGE2. These effects on slugging
percentage and salary appear to be relatively equivalent. AVGBASES though is a stronger
predictor of future slugging percentage than future salary, as the p-value for a test of equal
coefficients is 0.0008. This result runs counter to intuition, as I would expect teams to be quite
concerned with a player’s durability whereas durability’s effect on slugging percentage is less
clear. The problem may be that AVGBASES is very highly correlated with slugging percentage
because the statistics are so similar. A different measure of durability may prove more
enlightening. Finally, the coefficients on 2003 and 2004 are conclusively higher in the player
equation, while a test of equal coefficients on 2002 offers a p-value of 0.2805. Year effects on
player performance in 2003 and 2004, and to a lesser extent in 2002, were stronger than effects
on the free agent market.
While the results from how teams determine their large commitments are relatively
explainable, the statistical results from small commitments are puzzling. The Wilcoxon Sum
Rank Test results appear in Table 6, while Table 7 delineates the results from the logit regression
model.
Of the 69 team options in the sample, exactly two-thirds were declined. The predicted
salaries of the players with upcoming options, which serve as proxies for the player’s value to
their teams, are lower than the team’s option cost in 63 of the observations. Option amounts,
when negotiated at the beginning of a long-term contract, thus appear to exceed players’ value in
almost all cases. The surprise then is that only two-thirds of team options were declined. The
sum of ranks of the exercised options totals 851, which yields a two-sided p-value of 0.558. The
exercised options are fairly evenly distributed throughout the values of PREDICTSAL –
OPTIONCOST, with five in the first quartile, six in the second, five in the third, and seven in the
last. In fact, the player with the most negative value of PREDICTSAL – OPTIONCOST, Raul
Mondesi in 2003, saw his team exercise his option.
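The reported p-value can be checked with the usual normal approximation to the rank-sum statistic. The Python sketch below assumes 23 exercised and 46 declined options (two-thirds of 69 declined), no correction for ties, and no continuity correction; the function name is hypothetical.

```python
import math


def rank_sum_normal_test(w, n_flagged, n_other):
    """Two-sided normal approximation for a rank sum w over n_flagged
    observations drawn from n_flagged + n_other ranked observations."""
    n = n_flagged + n_other
    mean = n_flagged * (n + 1) / 2                 # expected rank sum under the null
    variance = n_flagged * n_other * (n + 1) / 12  # null variance, no ties
    z = (w - mean) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


z, p = rank_sum_normal_test(851, 23, 46)
print(round(z, 3), round(p, 3))
```

Under the null hypothesis the expected rank sum is 805, so the observed 851 sits less than one standard deviation above it, which is consistent with the reported two-sided p-value of 0.558.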
The logit results are very similar, as they show no discernible relationship between
PREDICTSAL – OPTIONCOST and whether the option was exercised. The coefficient on the
independent variable is only 0.034, meaning that an increase in PREDICTSAL – OPTIONCOST
does little to predict whether the option will be exercised. The p-value is 0.609, so the
relationship is very weak.
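To see how little a coefficient of 0.034 implies, the following Python sketch converts it to an odds ratio and to a change in probability. It assumes that PREDICTSAL - OPTIONCOST is measured in millions of dollars, as salaries are elsewhere in the paper, and it uses the sample exercise rate of one-third as an illustrative baseline rather than the fitted intercept, which the text does not report. The names are hypothetical.

```python
import math

THETA_1 = 0.034   # logit coefficient reported in the text


def shifted_probability(baseline_prob, delta):
    """Probability of exercise after PREDICTSAL - OPTIONCOST rises by delta,
    starting from a given baseline probability, under a logit model."""
    odds = baseline_prob / (1 - baseline_prob) * math.exp(THETA_1 * delta)
    return odds / (1 + odds)


odds_ratio = math.exp(THETA_1)               # multiplicative change in odds per unit
p_after = shifted_probability(1 / 3, 1.0)    # baseline: one-third of options exercised

print(round(odds_ratio, 4), round(p_after, 4))
```

Under these assumptions, a one-unit improvement in a player's predicted surplus moves the probability of exercise by less than one percentage point.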
I expected teams to follow at least some pattern when making options decisions, but these
results offer no straightforward answer. The model for predicting salary could be erroneous,
though these option results are so devoid of a pattern that small changes to the model likely will
not change the conclusions. Teams may feel particularly comfortable with their own players
because their work habits are established or the fans have made a connection with the player, and
thus the teams may be more willing to exercise options even when a model using observable
variables advises otherwise. The size of the commitment may also affect a team’s decision.
Exercising an option is a short-term commitment with limited ramifications. Some teams could
be less discriminating when making options decisions because the effects are smaller. An
alternative hypothesis, which requires more data to be tested, is that teams exercise options
to gain the goodwill of a player in the hope that the team and player can subsequently agree to a
team-friendly extension.
6. Conclusion
This paper has found that MLB clubs take different approaches in making decisions
regarding large commitments versus small commitments. When making often long-term
decisions about free agents, teams undervalue most recent on-field performance relative to its
ability to predict future on-field performance. Teams may base salary decisions on many factors
in addition to on-field performance, which could explain this possible undervaluation. Front
offices may also be particularly cautious when evaluating the value of free agents because their
actual value is so uncertain. The result that players perform better than expected during their
contract year complicates free agent decisions, and teams may react to such variation by taking a
circumspect approach. I would expect that as players (or their agents) observe this cautiousness
or undervaluation, they will adjust their behavior and no longer perform at a higher level in their
contract year but rather exert consistent effort over several years. This development, in turn,
could lead to more certainty when evaluating players, and teams may start to place more of an
emphasis on recent on-field performance when making free agent decisions. The cyclical nature
of this relationship reveals how interrelated player and team behavior are and justifies the use of
the seemingly unrelated regression in modeling these effects.
When making small commitments, baseball teams seem to be much less cautious and
exercise team options too often. The lack of any pattern between a player’s predicted profit to
the team and whether the option is exercised is startling. A team’s and its fans’ familiarity with a
player may explain this result, but more likely the team follows a strategy for small commitments
that is difficult to model. Because teams are less worried about the impact of these decisions,
they may use more arbitrary decision-making.
Many avenues exist for further research. The growth of the Internet and rising interest in
baseball research mean that data will become more readily available. Very few studies have
incorporated data from across many years. Such an effort could reveal how stable the
relationships found in this paper are across different eras.
More research into on-field performance measures might produce statistics that are more
representative of a player’s value to a team, particularly in terms of fielding and speed.
Qualitative research through interviews or other methods could reveal which performance
measures teams actually use to determine players’ values. Similar methods could determine
whether teams negotiate options in original contracts with the intention of exercising them.
Unfortunately, teams value secrecy because they operate in a competitive environment, so such
data likely will only emerge years after decisions are made.
Further research into how the length of free agent contracts affects salary amounts and
the evaluation of a player’s recent on-field performance could fine-tune the conclusions found in
this paper. The reverse causality problems of including contract length as an independent
variable precluded its direct use in this analysis, but future research may be able to relate contract
length to teams’ free agent and option decisions.
Table 1: Variable Definitions
Variable      Definition
SLGt          Batter’s slugging percentage in year t
SLGt-1        Batter’s slugging percentage in year (t-1)
SLGt-2        Batter’s slugging percentage in year (t-2)
SLGt-3        Batter’s slugging percentage in year (t-3)
AVGBASES      Average of total bases in 3 previous years
BASESt-1      Total bases in year (t-1)
BASESt-2      Total bases in year (t-2)
BASESt-3      Total bases in year (t-3)
2001          Dummy variable =1 if t=2001
2002          Dummy variable =1 if t=2002
2003          Dummy variable =1 if t=2003
2004          Dummy variable =1 if t=2004
AGE           Batter’s age
AGE2          Batter’s age squared
SALARY        Salary for first year of new free agent contract (in millions of dollars)
LENGTH        Length of new free agent contract
NEWCONTRACT   Dummy variable =1 if player signed new free agent contract starting in year t
CONTRACTYEAR  Dummy variable =1 if year t is the player’s last year before becoming a free agent
ALLSTAR       Dummy variable =1 if the player has made an All-Star team in the previous 3 years
GOLDGLOVE     Number of Gold Glove Awards prior to year t
Table 2: Descriptive Statistics
              ALL               2001              2002              2003              2004
Variables     Mean (S.E.)       Mean (S.E.)       Mean (S.E.)       Mean (S.E.)       Mean (S.E.)
SLGt          0.419 (0.096)     0.427 (0.104)     0.410 (0.097)     0.419 (0.093)     0.423 (0.095)
SLGt-1        0.432 (0.093)     0.453 (0.097)     0.434 (0.094)     0.422 (0.085)     0.426 (0.095)
SLGt-2        0.433 (0.099)     0.443 (0.100)     0.443 (0.099)     0.438 (0.109)     0.414 (0.090)
SLGt-3        0.433 (0.108)     0.435 (0.107)     0.430 (0.113)     0.443 (0.107)     0.427 (0.114)
AVGBASES      168.25 (88.60)    185.55 (83.70)    172.73 (88.87)    162.67 (91.80)    155.75 (89.50)
BASESt-1      173.92 (94.71)    192.78 (93.07)    175.06 (97.13)    166.51 (92.56)    166.75 (94.75)
BASESt-2      174.06 (99.40)    193.47 (94.26)    181.02 (98.02)    170.06 (103.03)   157.56 (100.07)
BASESt-3      147.61 (102.04)   176.90 (104.99)   177.71 (99.17)    179.66 (103.59)   163.71 (105.93)
2001          0.191 (0.393)     1 (0)             0 (0)             0 (0)             0 (0)
2002          0.244 (0.430)     0 (0)             1 (0)             0 (0)             0 (0)
2003          0.259 (0.438)     0 (0)             0 (0)             1 (0)             0 (0)
2004          0.254 (0.435)     0 (0)             0 (0)             0 (0)             1 (0)
AGE           30.98 (4.18)      31 (3.97)         31.04 (4.07)      30.76 (4.27)      30.36 (4.10)
AGE2          977.09 (265.26)   976.32 (246.02)   980.05 (259.61)   964.04 (267.78)   938.40 (257.24)
SALARY        3.357 (3.768)     3.716 (3.476)     3.377 (3.561)     3.507 (4.015)     2.963 (4.017)
LENGTH        2.350 (1.810)     2.770 (1.850)     2.485 (1.861)     2.323 (1.809)     2.045 (1.773)
NEWCONTRACT   0.248 (0.432)     0.126 (0.333)     0.198 (0.399)     0.220 (0.415)     0.261 (0.440)
CONTRACTYEAR  0.250 (0.433)     0.174 (0.380)     0.256 (0.437)     0.302 (0.460)     0.269 (0.444)
ALLSTAR       0.242 (0.429)     0.297 (0.458)     0.241 (0.428)     0.242 (0.429)     0.216 (0.412)
GOLDGLOVE     0.433 (1.425)     0.483 (1.450)     0.401 (1.350)     0.451 (1.468)     0.375 (1.363)
Table 3: Differences in Descriptive Statistics
Mean (S.E.)
Variable In Sample Not In Sample T-Statistic P-Value
No. obs. 1330 784 n/a n/a
Salary 3.357 (3.768) 1.171 (2.169) 16.914 0.000
Age 30.76 (4.14) 28.32 (4.31) 12.741 0.000
Born in USA 0.711 (0.453) 0.717 (0.451) -0.268 0.606
Switch Hitter 0.137 (0.344) 0.165 (0.358) -1.683 0.954
Experience 6.478 (4.242) 3.130 (4.195) 18.247 0.000
At-Bats 380.16 (177.00) 213.92 (150.56) 22.938 0.000
OBP 0.333 (0.050) 0.310 (0.047) 10.503 0.000
SLG 0.420 (0.097) 0.383 (0.085) 9.007 0.000
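The t-statistics in Table 3 can be approximately reproduced from the reported means alone. The sketch below is a minimal illustration, not the paper's own computation. It assumes that the values in parentheses are sample standard deviations and that an unequal-variance (Welch) test was used, neither of which the table states. The function names welch_t and upper_tail_p are hypothetical. The reported P-values are consistent with an upper-tail probability P(T > t), which the sketch approximates with the standard normal distribution because both groups are large.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t-statistic computed from summary statistics.

    Assumption: sd1 and sd2 are sample standard deviations (the table
    labels them "S.E.").
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def upper_tail_p(t):
    """P(Z > t) under a standard normal approximation."""
    return 0.5 * math.erfc(t / math.sqrt(2))


# Table 3 rows: in-sample (1330 obs.) versus not-in-sample (784 obs.).
salary_t = welch_t(3.357, 3.768, 1330, 1.171, 2.169, 784)
age_t = welch_t(30.76, 4.14, 1330, 28.32, 4.31, 784)

# Upper-tail probabilities for two of the reported t-statistics.
born_in_usa_p = upper_tail_p(-0.268)
switch_hitter_p = upper_tail_p(-1.683)

print(f"Salary t = {salary_t:.2f}, Age t = {age_t:.2f}")
print(f"Born in USA p = {born_in_usa_p:.3f}, Switch Hitter p = {switch_hitter_p:.3f}")
```

Small differences from the reported statistics are expected because the means and dispersions in the table are rounded.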
Table 4: Seemingly Unrelated Regression Results
SUREG: SLG*t (1330 obs.)
Variable Coefficient ($ scale) Coefficient (SLG scale) St. Error ($ scale) Z P>abs(Z)
SLGt-1 13.88125 0.470551 1.027305 13.51 0.000
SLGt-2 3.587371 0.121606 0.9962874 3.60 0.000
SLGt-3 1.597243 0.054144 0.8391799 1.90 0.057
AVGBASES 0.0048503 0.000164 0.0012833 3.78 0.000
2001 -0.0519348 -0.00176 0.374481 -0.14 0.890
2002 -0.1014643 -0.00344 0.3739389 -0.27 0.786
2003 0.3159927 0.010712 0.3765473 0.84 0.401
2004 0.6406431 0.021717 0.3765615 1.70 0.089
AGE -0.6128213 -0.02077 0.2549867 -2.40 0.016
AGE2 0.007528 0.000255 0.0039441 1.91 0.056
CONTRACTYEAR 0.2795232 0.009475 0.1774078 1.58 0.115
CONSTANT 14.62528 0.495772 4.109621 3.56 0.000
SUREG: SALARY (266 obs.)
Variable Coefficient St. Error Z P>abs(Z)
SLGt-1 9.82019 1.441412 6.81 0.000
SLGt-2 3.886767 1.408454 2.76 0.006
SLGt-3 1.891461 1.628608 1.16 0.245
AVGBASES 0.0128598 0.0020629 6.23 0.000
2001 -0.2674216 0.3766442 -0.71 0.478
2002 -0.6364843 0.3437637 -1.85 0.064
2003 -1.252135 0.3254681 -3.85 0.000
2004 -1.236207 0.307357 -4.02 0.000
AGE -0.5610292 0.3810154 -1.47 0.141
AGE2 0.0063262 0.0054899 1.15 0.249
ALLSTAR 1.393422 0.3146635 4.43 0.000
GOLDGLOVE 0.0641614 0.0676646 0.95 0.343
CONSTANT 5.956073 6.645668 0.90 0.370
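The two coefficient columns in the first panel of Table 4 appear to differ only by a constant rescaling, and each Z value is the $-scale coefficient divided by its standard error. The following sketch checks both relationships on a few rows copied from the table. It is an illustrative consistency check under that reading of the table; the dictionary name slg_equation_rows and the helper names are hypothetical, and the scaling factor is inferred from the reported numbers rather than taken from the text.

```python
# Rows from the first panel of Table 4:
# (coefficient on $ scale, coefficient on SLG scale, std. error on $ scale, reported Z)
slg_equation_rows = {
    "SLGt-1": (13.88125, 0.470551, 1.027305, 13.51),
    "SLGt-2": (3.587371, 0.121606, 0.9962874, 3.60),
    "SLGt-3": (1.597243, 0.054144, 0.8391799, 1.90),
    "CONSTANT": (14.62528, 0.495772, 4.109621, 3.56),
}


def scale_factor(dollar_coef, slg_coef):
    """Ratio that converts a $-scale coefficient to the SLG scale."""
    return slg_coef / dollar_coef


def z_statistic(coef, std_error):
    """Z value for the null hypothesis that the coefficient is zero."""
    return coef / std_error


factors = {}
z_values = {}
for name, (dollar_coef, slg_coef, std_error, reported_z) in slg_equation_rows.items():
    factors[name] = scale_factor(dollar_coef, slg_coef)
    z_values[name] = z_statistic(dollar_coef, std_error)
    print(f"{name}: factor = {factors[name]:.5f}, "
          f"Z = {z_values[name]:.2f} (reported {reported_z:.2f})")
```

The implied factor is about 0.0339 in every row checked, so conclusions about the sign and significance of a coefficient do not depend on which scale is read.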
Table 5: SALARY (164 obs.), R-squared = 0.4587
Variable Coefficient St. Error t P>abs(t)
SLGt-1 2.431705 1.387001 1.75 0.082
SLGt-2 1.216455 1.054979 1.15 0.251
SLGt-3 3.062118 1.346025 2.27 0.024
BASESt-1 0.0036005 0.0018145 1.98 0.049
BASESt-2 0.0030688 0.00157 1.95 0.052
BASESt-3 0.0001124 0.00148 0.08 0.940
2001 -0.1504599 0.3196281 -0.47 0.639
2002 -0.5734228 0.2899159 -1.98 0.050
2003 -0.6522764 0.2621081 -2.49 0.014
2004 -0.6734454 0.2505469 -2.69 0.008
AGE 0.4423488 0.2842597 1.56 0.122
AGE2 -0.00673 0.0040092 -1.66 0.098
ALLSTAR 0.9218431 0.2683967 3.43 0.001
CONSTANT -9.197668 5.067256 -1.82 0.072
Table 6: Wilcoxon Sum Ranks (69 obs.)
Sum Exercised No. Exercised No. Not Exerc. t P>abs(t)
851 23 46 0.58554 0.558
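The statistic reported in Table 6 matches the usual large-sample normal approximation for the Wilcoxon rank-sum test. The sketch below recomputes it from the rank sum and the two group sizes. It assumes no correction for tied ranks and a two-sided P-value; the function names rank_sum_z and two_sided_p are hypothetical and are not taken from the paper.

```python
import math


def rank_sum_z(rank_sum, n1, n2):
    """Normal approximation to the Wilcoxon rank-sum statistic.

    rank_sum is the sum of ranks in the first group (size n1);
    n2 is the size of the second group. No tie correction is applied.
    """
    n = n1 + n2
    expected = n1 * (n + 1) / 2
    variance = n1 * n2 * (n + 1) / 12
    return (rank_sum - expected) / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Table 6: rank sum of exercised options, 23 exercised, 46 not exercised.
z = rank_sum_z(851, 23, 46)
p = two_sided_p(z)
print(f"z = {z:.5f}, p = {p:.3f}")
```

Under the null hypothesis the expected rank sum for the 23 exercised options is 23 x 70 / 2 = 805, so the observed sum of 851 lies well within one standard deviation of it.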
Table 7: Logit Regression on EXERCISE (69 obs.)
Variable Coefficient St. Error Z P>abs(Z)
PREDICTSAL – OPTIONCOST 0.0471769 0.0923196 0.51 0.609
CONSTANT -0.5739271 0.3421482 -1.68 0.093
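To see what the Table 7 estimates imply, the fitted logit can be converted into a predicted probability that an option is exercised. The sketch below is illustrative only: the function name exercise_probability is hypothetical, and the surplus argument is PREDICTSAL minus OPTIONCOST in whatever units the paper uses for salary (assumed here, not confirmed, to be millions of dollars).

```python
import math

# Point estimates copied from Table 7.
SURPLUS_COEF = 0.0471769   # coefficient on PREDICTSAL - OPTIONCOST
CONSTANT = -0.5739271


def exercise_probability(surplus):
    """Predicted probability of exercising the option from the fitted logit.

    surplus: predicted salary minus option cost (units assumed, see lead-in).
    """
    index = CONSTANT + SURPLUS_COEF * surplus
    return 1.0 / (1.0 + math.exp(-index))


p_zero = exercise_probability(0.0)   # option priced exactly at predicted salary
p_five = exercise_probability(5.0)   # predicted salary exceeds option cost by 5
print(f"P(exercise | surplus = 0) = {p_zero:.3f}")
print(f"P(exercise | surplus = 5) = {p_five:.3f}")
```

With a surplus of zero the fitted probability is roughly 0.36, close to the 23 of 69 options that were actually exercised, and a surplus of five units raises it only to about 0.42. This is in line with the statistically insignificant slope coefficient.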
Works Cited
Boswell, Thomas. 1998. “Total Average Talks Again.” Inside Sports, March: pp. 30-43.
Harder, Joseph. 1989. “Play for Pay: Salary Determination and the Effects of Over- and Under-Reward on Individual Performance in Professional Sports.” Ph.D. dissertation, Graduate School of Business, Stanford University.
James, Bill. 1988. Baseball Abstract. New York: Ballantine Books.
Krautmann, Anthony. 1990. “Shirking or Stochastic Productivity in Major League Baseball?” Southern Economic Journal, April 56(4): pp. 961-968.
Krautmann, Anthony. 1993. “Shirking or Stochastic Productivity in Major League Baseball: Reply.” Southern Economic Journal, July 60(1): pp. 241-243.
Krautmann, Anthony, Elizabeth Gustafson, and Lawrence Hadley. 2003. “A Note on the Structural Stability of Salary Equations: Major League Baseball Players.” Journal of Sports Economics, February 4(1): pp. 56-63.
Lewis, Michael. 2003. Moneyball: The Art of Winning an Unfair Game. New York: W.W. Norton & Co.
Marburger, Daniel. 2003. “Does the Assignment of Property Rights Encourage or Discourage Shirking? Evidence from Major League Baseball.” Journal of Sports Economics, February 4(1): pp. 19-34.
Maxcy, Joel, Rodney Fort, and Anthony Krautmann. 2002. “The Effectiveness of Incentive Mechanisms in Major League Baseball.” Journal of Sports Economics, August 3(3): pp. 246-255.
McDowell, Allen. 2004. “From the Help Desk: Seemingly Unrelated Regression with Unbalanced Equations.” The Stata Journal, 4(4): pp. 442-448.
Scroggins, John. 1993. “Shirking or Stochastic Productivity in Major League Baseball: Comment.” Southern Economic Journal, July 60(1): pp. 239-240.