FREE AGENCY AND CONTRACT OPTIONS: HOW MAJOR LEAGUE BASEBALL TEAMS VALUE PLAYERS

May 11, 2007

Michael Dinerstein

[email protected]

Stanford University, Department of Economics

Advisor: Prof. Bob Hall

Abstract

When evaluating and signing players, Major League Baseball teams face incomplete information regarding a player’s true value. This paper explores how teams deal with such uncertainty and whether their approaches toward high-risk free agent signings and low-risk contract option decisions differ. I use seemingly unrelated regression to estimate the relationships between past performance and future performance and between past performance and free agent salaries. I find that in determining free agent salary offers, teams undervalue past performance relative to its power in predicting future performance. For the low-risk option decisions, I use the Wilcoxon Rank Sum Test and a logit regression to determine that teams are much less cautious and follow no discernible pattern in exercising options. Teams thus use very different approaches in making free agent and contract option decisions.

Keywords: sports economics, salary determination, free agent, labor, management

I am grateful to my advisor, Professor Bob Hall, for his guidance and thoughtful comments. Working with him has been an invaluable learning experience. I would also like to thank Professor Luigi Pistaferri for his econometrics advice. Any mistakes are my own.

Michael Dinerstein, May 11, 2007, Page 1

1. Introduction

In 2004 Adrian Beltre, a third baseman for the Los Angeles Dodgers, had a fantastic year

and received considerable support as the possible Most Valuable Player of the National League.

Beltre’s performance proved somewhat unexpected, however, because in the years prior to 2004

his statistics were relatively average compared to the rest of the league. Critics pointed out that

Beltre’s contract situation likely explained his unexpected performance. Beltre’s contract was

set to expire after the 2004 season, at which point he could become a free agent and negotiate a

new contract with any baseball team. Perhaps Beltre then exerted a particularly strong effort

during 2004, his “contract year,” that he had not shown during previous seasons. Or Beltre could

have always performed to his full capacity and may have just reached a level of skill in 2004 that

allowed him to develop into a very productive player. The Seattle Mariners seemed to assume

the latter explanation was true (or at least that the former explanation would not predict a future

reversion to prior habits) by offering Beltre a 5-year contract for $64 million, an amount that

would hardly be justified by his performance prior to 2004. Beltre’s subsequent performance

proved disappointing relative to the expectations his new contract created.

The case of Adrian Beltre highlights the difficulty in discerning players’ true marginal

values with incomplete information and the consequences to teams and players of contract

decisions. The goal of this paper is to understand how Major League Baseball (MLB) general

managers and their teams navigate this process of deciding whether to keep and sign players.

This paper will examine both the typically high-risk decisions of signing free agents to new

contracts and the typically low-risk decisions of exercising team options that MLB front offices

must make. Is there a statistically significant pattern of soon-to-be free agents performing above

expectations in the final year of their contracts? If so, are teams fooled by this unexpected

performance or do they adjust their salary offers? While many papers have addressed the former

question, none has asked whether teams fall into the trap of overemphasizing the most recent

performance, or even whether they create the incentive in the first place for players to try harder

during their “contract year.” A related question of team rationality in the contract process is

whether teams exercise their team options rationally. Options, which will be explained later in

the introduction, represent a less significant commitment to a player because options are rarely

guaranteed and are for just one year of service. I will address the degree to which teams exercise

options rationally and then compare the valuation of player options with the more significant

commitments in lucrative free agent deals.

1.1 Uniqueness of Major League Baseball

Baseball, due to the rules of the game and the nature of player contracts, affords

researchers advantages that other sports cannot provide. While baseball is naturally a team

game, individual performance, particularly by hitters, is reasonably independent from the actions

of others. When a player hits the ball, his fate is nearly always independent of how his

teammates act and depends on his opponents primarily through their defense. But given the high

quality of defense at the major league level, most plays are either routine or quite difficult, and I

assume that the variation in defenders is relatively limited. Whether the batter hits the ball is

often quite dependent on the quality of the opposing pitcher, but because teams play 162 games

and have fairly balanced schedules, I make the reasonable assumption that all hitters face

approximately the same quality of pitchers over the course of the season.

For most at-bats, a batter attempts to reach base without making an out, regardless of

whether the previous hitter made an out or reached base. Certain situations can arise, however,

that require the batter to aim for a specific outcome that would not necessarily be the goal for

each at-bat. For instance, if a teammate has reached third base before two outs have been

recorded in an inning, most hitters will attempt to hit a sacrifice fly, which sacrifices the current

hitter to help the base runner score. While a hit is nearly always superior to a sacrifice fly in any

situation, the risk involved in seeking a hit can be large enough that hitters aim for sacrifice flies.

But even in such specific situations when the batter’s goal changes, baseball statistics usually

reward, or at least do not penalize, the batter for helping the team. Furthermore, the statistics are

general enough that they can accommodate hitters with distinct strengths. The power hitter may

frequently hit home runs but also strikes out more than another hitter who rarely hits more than a

double but reaches base often. Statistics like slugging percentage reward both types of hitters by

giving more credit for home runs than for singles while still accounting for how often a

batter gets a hit. Baseball statistics thus usually measure actions that are highly correlated

with individual and team goals and are flexible enough to apply to the full spectrum of hitter

types.

Whether pitchers’ performance lends itself so easily to research is debatable. While most

prior studies have focused on hitters, Krautmann, Gustafson, and Hadley (2003) built a model

that predicted pitchers’ salaries based on past performance. Their findings indicate that pitchers

cannot all be treated as members of the same population; rather, starting pitchers, middle

and long relief pitchers, and closing pitchers require separate analyses. Furthermore, no single

performance measure emerges as explaining variation in pitcher contracts. Measuring pitching

performance is thus more challenging.

The rules for player contracts also overcome some of the difficulties of analyzing athlete

contracts in major team sports. Unlike those in the National Football League (NFL), MLB contracts are

guaranteed so that if a team releases the player, the team still has an obligation to pay the player

the full contract amount. Contract length, as well as salaries in later years of long-term contracts,

should be close to a baseball player’s projected value, whereas a football team might offer a six-

year contract without expecting the player to be on the roster for the last few years. Player and

team options can be exceptions to the rule of guaranteed contracts, but, as this paper will show,

their relatively simple nature still allows for analysis.

The other unique aspect of Major League Baseball contracts is that teams do not have to

remain under a salary cap. In the National Basketball Association (NBA), teams can spend only

a limited sum on players each year, whereas no such restriction exists in MLB. The salary cap

has several effects. First, players may not receive their marginal value to a team because the

team has an upper limit it can offer. Therefore, a model that uses past performance to predict

salary may run into difficulties caused by truncation. Second, to ensure that teams have some

flexibility despite the salary cap, the NBA includes several salary exceptions that allow a team to

sign a player to a certain amount regardless of salary cap implications. Such exceptions can

force player salaries into slots that make it difficult to fit a continuous distribution to salaries and

again create a system where players may not receive their marginal value. Third, to circumvent

the salary cap basketball teams often offer contracts that are back-loaded or include large signing

bonuses that can be distributed over the length of the contracts in years where the team may be

well under the salary cap. The large variation in yearly salaries can prove difficult to model.

The relative consistency of baseball contracts makes analysis more straightforward.

Baseball’s system of free agency also more closely resembles a free market. Unlike the

NBA, MLB does not have rules that allow a player’s current team to offer a contract amount that

no other team can match. Unlike the NFL, MLB does not feature franchise tags that allow teams

to designate future free agents as “franchise players” and restrict them from entering the free

agent market. The salary determination process is therefore easier to model because the bidding

is more competitive and players have less uncertainty about when they will become free agents.

1.2 Large Commitments: Free Agent Decisions

Because the contract determination process can be complex and involve many factors that

are hard to separate, this paper will focus on the specific decision of how teams value most

recent performance when signing free agents. An understanding of this aspect will offer insight

into how teams generally deal with large financial commitments.

The team’s goal is some combination of winning a championship and maximizing its

profits. A player’s on-field production is an important component of both of these goals, and all

else being equal an increase in on-field production increases the probability of winning and the

team’s profits. When offering a player a contract, a team must predict the player’s future on-field

production, but this task can be quite difficult, as highlighted by the Adrian Beltre example.

Changes in player performance could be a result of two factors. First, a player’s possible

performances form a distribution from which each year is one draw. Even if the distribution

remains unchanged, player performance can vary between years. Second, a player’s distribution

may change. For instance, if a player becomes more accustomed to major league pitching, his

distribution of outcomes may change to favor higher performance. Distinguishing between these

factors is the team’s challenge.

Adding to this variation in on-field performance is a player’s ability to control his effort.

Following Krautmann (1990), a worker’s marginal production equation is

$$MP_j = f(E_j, X, \varepsilon_j) \qquad (1)$$

where $MP_j$ is the jth worker’s marginal product, $E_j$ is the jth worker’s effort, $X$ is a vector of

other inputs, and $\varepsilon_j$ is a random variable. I define effort broadly to include any measures that

improve performance, whether they are expected of the worker or not. For instance, taking

steroids is illegal and many teams may frown upon such action, but if players can increase their

production by taking steroids, then steroid use is a measure of effort. Effort can be decomposed, as

Krautmann (1990) outlines, as

$$E_j = h(CT_j, Z_j) \qquad (2)$$

where $CT_j$ is the time remaining on the jth worker’s contract and $Z_j$ is a vector of other factors

affecting his effort. The question is why might

$$\frac{\partial MP_j}{\partial CT_j}, \qquad (3)$$

the marginal effect of time remaining on a contract on marginal product, not equal zero.
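
One way to see why (3) can be nonzero is to substitute (2) into (1) and differentiate. The step below is only a sketch under added assumptions that the text does not state: that f and h are differentiable, and that the other inputs and the random term do not vary with the time remaining on the contract.

```latex
\[
\frac{\partial MP_j}{\partial CT_j}
  = \frac{\partial f}{\partial E_j}\,\frac{\partial h}{\partial CT_j}
\]
```

Under those assumptions, if additional effort raises output ($\partial f/\partial E_j > 0$) and effort rises as the contract nears its end ($\partial h/\partial CT_j < 0$), the product is negative: marginal product is higher when less time remains on the contract, which is the contract-year effect.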

If baseball players anticipate they will receive higher salaries for certain actions, they will

engage in such behavior. Players at the beginning of long-term contracts know that they have

locked in specific salaries for the following years and may expect that their current performance

will have no impact on their compensation. Players in the final year of their contracts, or the

“contract year,” can expect to negotiate new contracts in a short time frame and may believe that

increased performance now will translate to higher salaries in a year. For this reason, players

may exert more effort in the final year of their contract because they expect it will have

significant benefits.

Whether this expectation is rational is unclear. Teams presumably evaluate players on

their performance over several years. If this pattern is established, players have little incentive to

adjust their effort for only the end of their current contracts. Instead, production should be more

consistent. But teams may have outside pressures that preclude them from taking such an

approach toward evaluation. For instance, if a baseball team’s fans or sports writers are tired of

losing and see that a player who just finished a productive season is available, they may call for

their team to sign the player. Such pressure could cause the team to alter how it values players.

Teams have turned to incentive mechanisms to protect themselves from the risk

associated with signing a player whose future production is highly uncertain. For instance, many

contracts include bonuses if a player participates in a certain number of games or makes an All-

Star team. The size of these bonuses, however, is quite insignificant compared to the base

salaries, and as Harder (1989) notes, large incentive mechanisms are highly correlated with high

base salaries. These incentive mechanisms thus may not provide the team with much insurance.

Therefore, teams continue to face the problem of how to predict future performance. These

decisions can have large impacts because free agent contracts are often multi-year commitments

to players for large sums of money.

1.3 Small Commitments: Options

Options can be team options, player options, or mutual options. When a player and team

initially negotiate a contract, they can add an option to the end of the contract for a pre-specified

amount. Then, when the player completes the non-option years of the contract, the holder of the

option chooses whether to exercise it. If the holder exercises the option, the player’s contract is

extended by one year for the amount that was specified when the contract was originally signed.

For a mutual option, both the team and the player must exercise the option for the previous

contract to extend. Typically the team or player can exercise the option at any point during the

contract, although only the teams seem to exercise options early.

Team options usually include buyouts that the player receives if the team chooses not to

exercise the option, though these buyouts are significantly smaller than the option amount.

Sometimes the contract will delineate conditions under which a team option will “vest,” or

automatically become guaranteed.
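
The option mechanics described above can be sketched in a few lines of Python. This is an illustration, not a statement of actual contract language: the names `Option` and `resolve` are hypothetical, vesting is reduced to a single boolean flag, and the buyout is assumed to be owed only when a team declines a team option, since the text describes buyouts only in that case.

```python
from dataclasses import dataclass


@dataclass
class Option:
    kind: str            # "team", "player", or "mutual"
    amount: float        # option-year salary, fixed when the contract was signed
    buyout: float = 0.0  # paid to the player if the team declines a team option


def resolve(option, team_exercises=False, player_exercises=False, vested=False):
    """Return (extended, payment) for the option year.

    `extended` is True when the contract runs one more year at `amount`.
    """
    if option.kind == "team":
        # A vested team option becomes guaranteed regardless of the team's choice.
        extended = team_exercises or vested
    elif option.kind == "player":
        extended = player_exercises
    elif option.kind == "mutual":
        # Both sides must exercise a mutual option.
        extended = team_exercises and player_exercises
    else:
        raise ValueError(f"unknown option kind: {option.kind!r}")

    if extended:
        return True, option.amount
    # Assumption: only a declined team option triggers the buyout.
    owed = option.buyout if option.kind == "team" else 0.0
    return False, owed


# Illustrative figures in millions of dollars (made up for the example).
team_option = Option(kind="team", amount=9.0, buyout=1.0)
print(resolve(team_option, team_exercises=True))
print(resolve(team_option))
```

Because the buyout is much smaller than the option amount, declining a team option costs the team relatively little, which is why the text treats these decisions as low-risk.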

Some form of option actually predates free agency, as during the reserve era the

team always had an option to renew the previous one-year contract if the two parties could not

negotiate a new deal. But once the free agency era began, options became much less frequent

and have only recently regained their popularity. Interestingly, options are so entrenched in the

business of baseball that the players’ union had an option to extend the 1997-2000 Collective

Bargaining Agreement for one year, which it exercised.

When exercising options, teams are thus making smaller commitments to players because

options are only one-year extensions of contracts. Furthermore, options only apply to players

previously under contract with the team, so the team likely is aware of the player’s work ethic

and interactions with teammates. Because option choices are relatively simple binary decisions

to model – the holder either exercises the option or does not – they lend themselves to salary

models and offer an interesting point of comparison to modeling free agent contracts. This paper

will consider only team options.

2. Literature Review

There has not been a study of player or team options in baseball. I hypothesize that this

dearth of literature is partly a result of the challenge of finding comprehensive options data.

While downloadable databases of salaries and lists of free agents are readily accessible, there is

no organized public source of option data, and until the growth of Internet blogs, researchers

lacked a way to compile option data. Furthermore, options seem to have become much more

popular in recent years, and therefore their significance has only arisen recently. My attempt to

collect option data and analyze it thus appears to open the door to understanding how players

and teams use options and whether these low-risk commitments differ from decisions regarding

free agent contracts.

As for the question of whether players allow their effort to fluctuate over the course of a

contract, baseball researchers have offered several models but have yet to settle on a conclusive

answer. Harder (1989) collected data from four years: 1976, 1977, 1987, and 1988. Harder used

two unconventional statistics, “runs created” and “total average,” as his measures of

performance. He claimed that these statistics could best measure a player’s contribution to a

team’s success, and their correlations with conventional statistics were very high. Harder then

identified the log of salary as the dependent variable and ran a linear regression with variables

for players’ career production, experience, contract status, team performance, ethnicity, and All-

Star status the previous season. The paper concluded that being in a contract year had no effect

on total average or runs created in 1976 or 1977 while in 1989 the effect was negative on runs

created but not statistically significant on total average. Harder concluded that players did not

anticipate that statistical gains in their contract years would lead to higher salaries. The study’s

years were so unusual in baseball history that these results likely do not apply to the current

environment. During 1976 and 1977, the first two years of data, the league was transitioning

into free agency. Players likely could not predict how their production would affect their future

salaries. During the later years, 1987 and 1988, owners were found to be colluding. Again,

players may have expected to be low-balled in contract negotiations regardless of performance

and thus saw no incentive to try harder during their contract year. Because today’s league more

closely models a free market, players’ incentives have likely changed.

Without using the standard regression, Krautmann (1990) attempted to answer the

related question of whether shirking (exerting less than maximal effort) occurs in the first year of

a long-term contract. Krautmann’s data ran from 1976 to 1983 and included free agents who

signed contracts with lengths of at least five years. He constructed forecast intervals for players’

slugging averages based on past data and then defined super-par and sub-par performance as the

top and bottom 5% of the interval, respectively. Krautmann found that only 4.5% of the players

had super-par performances and only 1.8% had sub-par performances. He concluded that

shirking did not occur.

Scroggins (1993) challenged Krautmann’s result by claiming that slugging percentages

would not pick up shirking but that using total bases, a measure that could account for time spent

injured, proved that shirking occurred. Scroggins used a linear regression with Krautmann’s data

and found that the coefficient on whether a player had just signed a long-term contract was

significantly negative. Krautmann (1993) defended his choice of slugging percentage and then

applied his method of analysis to the total bases statistic. He found that the results were nearly

identical to those of his initial paper.

This paper will add to the existing literature in several ways. First, the discussion of team

options fills in a gap that research has not yet covered. Their emergence in recent years and their

uniqueness as non-guaranteed contract years allow options to offer insight into how players react

to a more uncertain contract situation and how teams value players. Second, the analysis of

effort during contract years will make use of more recent data than previous studies. Particularly

in a labor market that has fundamentally changed over the last 20 years, recent data is essential in

explaining current patterns of behavior. Finally, previous studies have focused on whether

players react to an incentive to alter behavior in the contract year. This paper will delve into the

team’s side of this issue by asking whether teams create the incentive to alter behavior and how

they value players in the context of possible shirking.

3. Data

As discussed above, pitching statistics are not general enough to cover all types of

pitchers. I choose to drop pitchers from my analysis for two reasons. First, splitting up the

pitchers into starters, long and middle relievers, and closers and analyzing them separately

causes my conclusions to suffer from small sample sizes. Second, because the different pitching

roles can require different mentalities and skill sets, a player’s past performance in one role may

not accurately predict his future performance if he changes roles. Hitters, on the other hand, can

be aggregated, and while they may switch positions, this change should not affect their hitting

performance.

My sample consists of player-years from 2001 to 2004. Players who signed free agent

contracts beginning in 2005 are also included.1 While the full population of hitters from these

years numbers 2111, my sample includes only 1330 of the player-years, or about 63%. This partial

coverage is a result of incomplete contract information. My data consists of three main subsets:

player on-field performance, off-field player characteristics, and contracts. On-field

performance is publicly available for all players and consistent across sources because baseball

designates an official scorer.2 Off-field player characteristics are also widely available and do

not depend on the source.3 Contract information is much harder to find and the level of detail

can depend on the source.4

1 Because I evaluate free agents based on variables observed prior to their new contract, these observations use data from only 2001-2004. These players are included to increase the sample size of free agents.

2 For on-field performance, I used version 5.4 of Sean Lahman’s baseball database, “The Baseball Archive.” The database is accessible at http://www.baseball1.com/statistics/. I checked a random sample of these statistics with ESPN.com (http://espn.go.com), and found no inaccuracies.

3 I again used version 5.4 of “The Baseball Archive” as well as Baseball-Reference.com (http://www.baseball-reference.com/).

4 For the salary amounts, I used version 5.4 of “The Baseball Archive.” For lists of free agents, I consulted Associated Press articles. For contract lengths and option data, I used two online blogs and checked their data against Associated Press articles. The first blog, MLB Contracts (http://www.bluemanc.demon.co.uk/baseball/mlbcontracts.htm), is no longer accessible directly, so I used “The WayBack Machine” (http://www.archive.org/web/web.php) to access the site. The other blog, Cot’s Baseball Contracts (http://mlbcontracts.blogspot.com/), proved to be an excellent source.

There is no consensus on how to measure yearly on-field batter performance. The most

popular statistics in newspapers and casual fan conversations are home runs, runs batted in, and

batting average. Home runs give a sense of a player’s power, but they shed little light on his

ability to reach base through other hits. Furthermore, they are highly dependent on the number

of games a batter plays. Runs batted in (RBIs) may be a strong measure of a hitter’s contribution

to a team’s production. The drawback, however, is that a hitter’s opportunities for RBIs are very

dependent on his teammates. An RBI is much easier to obtain if teammates have reached base.

Thus, hitters who bat after good hitters have more opportunities for RBIs. Batting average (BA)

gives the share of a player’s at-bats that result in hits:

$$BA = \frac{\text{Total Hits}}{\text{Total At-Bats}}. \qquad (4)$$

At-bats are the total number of times a player bats, though walks, sacrifices, and hit-by-pitches

are not included. Batting average offers a strong measure of a player’s ability to reach base but it

fails to distinguish between different types of hits. For instance, a home run is more valuable

than a single, but batting average counts each as one hit. The most common statistics thus are

not sufficient for analysis that attempts to find a player’s full independent value.

Other “hybrid” statistics more closely measure a player’s value to a team. Most previous

studies have used slugging percentage (SLG), where

$$SLG = \frac{\text{Total Bases}}{\text{Total At-Bats}}. \qquad (5)$$

The total bases statistic, another performance measure, refers to the number of bases a player has

earned through hits. For instance, a single equals one base, a double equals two, but a non-hit

like a walk does not count even though the player advances a base. Slugging percentage thus

conveys a combination of how often a player gets hits and how valuable the hits are.

On-base percentage (OBP) improves upon batting average by accounting for non-hits,

such as walks, that lead to the batter reaching base without causing an out. OBP is the ratio of

times a batter reaches base without making an out to total plate appearances. OBP has gained

popularity recently, as recounted in Michael Lewis’s Moneyball (2003), which described the

approach taken by Oakland Athletics General Manager Billy Beane in evaluating players. The

problem with OBP is the same as that of BA – OBP fails to differentiate between different types

of hits. In response to this deficiency, baseball researchers have turned to OPS, the sum of SLG

and OBP. OPS combines the advantages of SLG and OBP. Bill James and Thomas Boswell

also have argued for their own complicated statistics. James (1988) constructed “runs created,”

which accounts not only for hits and walks but also for a player’s ability to steal a base.

Boswell (1988) created “total average,” a measure that also included base-running statistics.
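
The rate statistics discussed above can be computed from ordinary counting statistics. The Python sketch below is illustrative only: the class and function names are hypothetical, the two batting lines are invented, and plate appearances are approximated as at-bats plus walks, hit-by-pitches, and sacrifice flies, which is an assumption rather than a definition taken from the text.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One season of counting statistics for a hitter (at_bats must be positive)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int = 0
    hit_by_pitch: int = 0
    sacrifice_flies: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def total_bases(self) -> int:
        # A single is one base, a double two, and so on; walks earn no bases.
        return self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.home_runs


def batting_average(s: BattingLine) -> float:
    return s.hits / s.at_bats


def slugging(s: BattingLine) -> float:
    return s.total_bases / s.at_bats


def on_base(s: BattingLine) -> float:
    # Assumption: plate appearances approximated as AB + BB + HBP + SF.
    reached = s.hits + s.walks + s.hit_by_pitch
    plate_appearances = s.at_bats + s.walks + s.hit_by_pitch + s.sacrifice_flies
    return reached / plate_appearances


def ops(s: BattingLine) -> float:
    return on_base(s) + slugging(s)


# Two invented hitters: one hits for power, the other for contact.
power = BattingLine(at_bats=500, singles=60, doubles=20, triples=0, home_runs=40, walks=50)
contact = BattingLine(at_bats=500, singles=130, doubles=20, triples=0, home_runs=0, walks=50)

for name, line in (("power", power), ("contact", contact)):
    print(name, round(batting_average(line), 3), round(slugging(line), 3),
          round(on_base(line), 3), round(ops(line), 3))
```

With these invented lines, batting average and OBP rank the contact hitter higher, while SLG and OPS rank the power hitter higher, which illustrates why the choice of measure matters.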

The real test of which performance measure to employ is whether the teams actually use

it when predicting a player’s value. Even if one statistic is the best predictor of wins or team

profit, its usefulness in this analysis is limited because I am testing how teams weigh the most

recent performance versus less recent performance in determining salary offers. Therefore, I

take the statistic, or combination of statistics, that the team uses as given. But since teams do not

make such information public in most cases, I assume that teams have learned from experience

and now use a performance measure that best accounts for a player’s contribution. The problem

returns to determining the performance measure that best leads to wins and profits. The

combination of statistics should account for both total production during a season and average

production during the games a batter plays. Thus, the statistics should distinguish between two

players who produce equal cumulative totals but play in different numbers of games. For the

reasons already discussed, home runs, RBIs, BA, and OBP do not account for players’ full value

in ways that better statistics do. Runs created and total average are too obscure to assume that all

teams use them to evaluate players.

Instead, I use a combination of slugging percentage and total bases. While OPS includes

SLG, the results hardly change when I substitute OPS for SLG, and SLG is easier to interpret

than OPS, which double-counts hits (hits appear in both the SLG and OBP components). Also,

most studies have used SLG. Therefore, I prefer SLG to OPS. In order to predict future

slugging percentage ($SLG_t$) and a player’s value to a team, I will use lagged slugging

percentages from the three previous seasons ($SLG_{t-1}$, $SLG_{t-2}$, $SLG_{t-3}$).5 In order to distinguish

between players who are efficient but often hurt and players who are efficient and play many

games, I will include a three-year average of total bases (AVGBASES). This combination is

simple but takes into account a player’s rate of production and total contribution.6
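
A minimal sketch of how these regressors might be assembled for one player follows. The function name, the input layout, and the choice to return `None` when any of the three prior seasons is missing are assumptions made for illustration; the text does not say how incomplete histories are handled.

```python
from statistics import mean


def lagged_features(seasons, year):
    """Build the three SLG lags and AVGBASES used to predict performance in `year`.

    `seasons` maps a season year to a (total_bases, at_bats) tuple.
    Returns None if any of the three prior seasons is missing or has no at-bats
    (an assumption; the handling of incomplete histories is not specified).
    """
    slg_lags = []
    bases = []
    for k in (1, 2, 3):
        record = seasons.get(year - k)
        if record is None or record[1] == 0:
            return None
        total_bases, at_bats = record
        slg_lags.append(total_bases / at_bats)
        bases.append(total_bases)
    return {
        "SLG_t-1": slg_lags[0],
        "SLG_t-2": slg_lags[1],
        "SLG_t-3": slg_lags[2],
        "AVGBASES": mean(bases),  # three-year average of total bases
    }


# Invented example: the 2002 season is shortened, as if by injury.
history = {2001: (200, 500), 2002: (150, 300), 2003: (260, 500)}
features = lagged_features(history, 2004)
print(features)
```

In the invented history, the shortened 2002 season has the same slugging percentage as a full season would at that rate, but it pulls AVGBASES down, so the pair of measures separates a productive but often-injured hitter from one who is productive over a full season.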

In addition to a batter’s hitting skill, his value to a team includes fielding abilities.

Fielding can be difficult to measure because a high number of errors may not necessarily indicate

a poor fielder but rather a fielder with wider range who can reach many balls. Therefore, his

errors might have been hits if other fielders were playing. Furthermore, general managers rarely

mention players’ fielding when signing free agents, so I am inclined to believe that only the best

fielders receive more money for their fielding abilities. Thus, I include a variable

(GOLDGLOVE) for the number of Gold Glove Awards the position player (non-pitcher) has

received; the award designates each year the best fielder at a certain position in each league.

5 I choose to predict slugging percentage rather than total bases because total bases are less consistent across years as they depend largely on injuries that occur randomly. Therefore, lagged slugging percentages are better predictors of future slugging percentage, so the player model will predict slugging percentage but still make use of total bases.

6 In the second model used in evaluating players with options, I use lagged values of total bases (BASESt-1, BASESt-2, BASESt-3) as well as slugging percentages because the dependent variable is now a salary amount, not a performance statistic.

Off-field player characteristics are player-specific characteristics that do not depend on

performance. The most obvious variable choice is AGE. I expect that as players age, they

become accustomed to the grueling 162-game seasons, they learn what lifestyle will allow them

to perform at a high level, and they make other changes driven by experience to maximize their

performance. Batters may also adapt to Major League pitching over time, but conversely

pitchers may adapt to the batters’ tendencies, which makes the total effect ambiguous. In this

sense, deviations in performance between years could be a result of a player’s underlying ability

changing rather than random draws each year from the same distribution. Thus, unless the effect

of pitchers adapting to batters is particularly strong, I expect older players to have higher

statistics and higher value to teams. But at a certain point older players face disadvantages that

can affect their value. Injuries can accumulate, hitters’ eyes may become less sharp, and other

physical ailments can limit a player’s performance. To model this quadratic relationship, I

include an AGE2 variable. Unfortunately data on how long players have been in the minor

leagues, which could affect their major league performance, is nearly impossible to collect.

Players also contribute value to teams beyond their on-field performance. Players with

magnetic personalities or characteristics that inspire the community have marketing potential that

can increase team revenues. Because a player’s marketability is nearly impossible to measure

precisely, I include a dummy variable (ALLSTAR) for whether a player was an All-Star in any

of the three previous seasons. All-Stars are often the most well-known players and have the best

marketing potential and so will serve as a proxy for a player’s marketability.

I also include dummy variables for the observation’s year (2001, 2002, 2003, 2004).

Even though the period studied is short, the labor market or general performance may change

between the years. In 2002, the owners and players association agreed to a new Collective

Bargaining Agreement. This agreement included for the first time revenue sharing among teams,

a luxury tax on teams with high payrolls, and testing for steroids. Because these changes could

affect player performance and team decisions, yearly dummy variables are necessary.

Furthermore, rumors that the composition of the baseballs has changed over the years demand

that I differentiate between years.

To relate these variables to labor market decisions, I require contract data. Because such

data is reported in a haphazard manner, I can rely on only the most significant and basic parts of

the contract. For instance, the reporting of performance incentive clauses varies between sources

as well as between players. Details of contracts of high-profile players are more often reported

because the public’s interest in these players is high. More marginal players are less noteworthy

and may even be playing with minor league contracts that escape the media’s attention.

Because most players’ base salaries appear publicly, I will focus my analysis on base

salaries (SALARY). A player’s salary serves as a proxy for the player’s value to a team and is a

scarce resource for the team. For most players, their base salary dominates potential bonuses.

This may not be true, however, for some players, especially those who are injury-prone. The

difficulty in collecting data on incentives and bonuses unfortunately precludes a more

comprehensive analysis. I convert all salaries into number of millions of dollars to avoid large

numbers. I also collect the length of contracts (LENGTH) but do not explicitly place it in my

models. Contract length can represent the size of a team’s commitment and is thus relevant. But

since it is endogenous in the teams’ decision-making, I use it only to sort observations and

determine which observations appear in each model. Similarly, NEWCONTRACT is a dummy

variable for whether the player has signed a new free agent contract for the current season, but it

does not appear in the equations. Instead, it filters old contracts from entering the free agent

salary equations.

Finally, to test whether a player outperforms expectations prior to becoming a free agent,

I include a dummy variable (CONTRACTYEAR) for whether the player will be a free agent

after the current season.⁷

The descriptive statistics, reported in Table 2, offer some basic trends. Slugging

percentage falls in the sample from year-to-year whereas total bases trends upward. The average

age is about 31 years old and the average salary is $3.357 million. Almost 25% of observations

signed new free agent contracts and 25% are in the final year of a contract before becoming free

agents.

Because this sample only covers 62.9% of player-years from 2001 to 2004, the degree to

which the sample represents the population is at issue. Since on-field performance and off-field

player characteristics data are available for all players, I can test whether the players included in

the sample are characteristically different from those not included. A test of whether means are

equal shows that the players in the sample are older, more experienced, have higher base salaries

(when reported), are less often switch hitters, and have more at-bats, higher slugging

percentages, and higher on-base percentages in the prior year. All of these differences, except

the proportion of switch hitters, appear significant in all four years as well as in the aggregated

sample. While the following analysis suffers from a censoring problem, the bias in the sample

favors older and more experienced players, who are exactly those players more likely to be

7 Some players will become free agents unexpectedly if they are released or an option is not exercised. Only those players who are guaranteed to become free agents, unless an extension is signed before the season ends, have a 1 for CONTRACTYEAR.

eligible for free agency and to have options in their contracts. This paper attempts to answer

questions that revolve around free agency, so the sample’s bias toward older and more

experienced players is mitigated. This paper also focuses on the difficulty that teams have in

evaluating players and making contract decisions. These decisions are more important for

players with higher salaries because the team’s commitment is larger. The censoring problem is

thus less significant than it first appears.

4. Model

For the question asking how teams make large commitments, I have player and team regression

equations. On the player side, my dependent variable is the player’s on-field performance,

measured in terms of slugging percentage, in the current year. To predict such performance, I

use off-field player characteristics and past on-field performance. I also include the dummy

variable CONTRACTYEAR. On the team side, my dependent variable is the first-year salary

given to a newly-signed free agent. I use the same off-field player characteristics and past

on-field performance variables in addition to the ALLSTAR dummy variable and GOLDGLOVE.

These variables appear only in the team equation because they are relevant for a player’s value to

a team but they fail to predict future slugging percentage beyond the inclusion of the lagged

SLG. Similarly, CONTRACTYEAR is only relevant to the player, who may alter his

performance when in the contract year, whereas every free agent the team signs just finished a

contract year.

The player equation makes use of all observations in the sample. The team equation,

because it applies to team decisions on free agents, only uses observations for which the player

has just signed a new free agent contract. The dataset is thus unbalanced between the equations,

as players register slugging percentages each year but sign free agent contracts less frequently.

Because the equations model similar processes, I expect that the regression error terms

could be correlated. Instead of running two separate OLS equations, I use seemingly unrelated

regression to determine the coefficients jointly. I want to test whether coefficients on the same

independent variables are equal across the two equations, so I need both dependent variables to

lie on the same scale. With an OLS regression, I estimate that each marginal slugging

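percentage point (.001) equates to a marginal value of $29,500.⁸ After rescaling SLGt into dollar

amounts, the equations appear as follows:

\[
\begin{aligned}
\text{SLG}^{*}_{t} ={}& \beta_0 + \beta_1\,\text{SLG}_{t-1} + \beta_2\,\text{SLG}_{t-2} + \beta_3\,\text{SLG}_{t-3} + \beta_4\,\text{AVGBASES} + \beta_5\,\text{2001} + \beta_6\,\text{2002} \\
& + \beta_7\,\text{2003} + \beta_8\,\text{2004} + \beta_9\,\text{AGE} + \beta_{10}\,\text{AGE}^2 + \beta_{11}\,\text{CONTRACTYEAR} + \varepsilon
\end{aligned}
\tag{6}
\]

\[
\begin{aligned}
\text{SALARY} ={}& \gamma_0 + \gamma_1\,\text{SLG}_{t-1} + \gamma_2\,\text{SLG}_{t-2} + \gamma_3\,\text{SLG}_{t-3} + \gamma_4\,\text{AVGBASES} + \gamma_5\,\text{2001} + \gamma_6\,\text{2002} \\
& + \gamma_7\,\text{2003} + \gamma_8\,\text{2004} + \gamma_9\,\text{AGE} + \gamma_{10}\,\text{AGE}^2 + \gamma_{11}\,\text{ALLSTAR} + \gamma_{12}\,\text{GOLDGLOVE} + u
\end{aligned}
\tag{7}
\]

where SLG*t is the rescaled slugging percentage in the current year.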
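
The rescaling step can be illustrated with a short sketch. It assumes only the figure stated in the text, $29,500 per point (0.001) of slugging percentage, and salaries measured in millions of dollars; the function and constant names are hypothetical.

```python
DOLLARS_PER_POINT = 29_500   # from the text: one point of SLG is worth $29,500
POINT = 0.001                # by convention, one point of a baseball percentage

def slg_to_millions(slg):
    """Rescale a slugging percentage into millions of dollars."""
    return slg / POINT * DOLLARS_PER_POINT / 1_000_000

def millions_to_slg(millions):
    """Convert a dollar-scale quantity (in millions) back to slugging percentage units."""
    return millions * 1_000_000 / DOLLARS_PER_POINT * POINT

# A .400 slugging percentage expressed on the dollar scale (in millions of dollars)
rescaled = slg_to_millions(0.400)
```

Under this scaling a full unit of slugging percentage corresponds to $29.5 million, so any coefficient estimated on the dollar scale can be divided by 29.5 to be read in slugging percentage units.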

Because STATA’s seemingly unrelated regression command “sureg” cannot handle

unbalanced datasets where the number of observations differs between equations, I follow the

method outlined in McDowell (2004). I scale the equations so that the error terms have equal

variance and then combine the data into one panel with a variable indicating which equation the

observation enters. I use the command “xtgee” to produce the results equal to those of

seemingly unrelated regression for unbalanced data.
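
The data preparation described above can be sketched roughly as follows. This illustrates only the scale-and-stack step, not the estimation itself, and it is an assumption that dividing each equation by its residual standard deviation matches the exact scaling in McDowell (2004). The function name, row layout, and numbers are hypothetical; in practice the residual standard deviations would come from first-stage regressions on each equation.

```python
def stack_equations(player_rows, team_rows, sd_player, sd_team):
    """Scale each equation by its residual standard deviation and stack into one panel.

    Each row is (y, [x1, x2, ...]) with the constant included among the x's.
    Dividing y and every regressor by the equation's residual standard deviation
    leaves the coefficients unchanged but gives the rescaled errors unit variance.
    """
    panel = []
    for label, rows, sd in (("player", player_rows, sd_player),
                            ("team", team_rows, sd_team)):
        for y, xs in rows:
            panel.append({"equation": label,          # indicator for which equation the row enters
                          "y": y / sd,
                          "x": [x / sd for x in xs]})
    return panel

# Hypothetical inputs: three player-season rows but only one new free agent contract,
# so the two equations contribute different numbers of observations.
player_rows = [(12.0, [1.0, 11.5]), (10.0, [1.0, 10.8]), (13.0, [1.0, 12.9])]
team_rows = [(4.0, [1.0, 12.9])]
panel = stack_equations(player_rows, team_rows, sd_player=2.0, sd_team=4.0)
```

The equation indicator is what allows a panel estimator to treat the two rows belonging to the same player-season as correlated while still accepting player-seasons that appear in only one equation.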

For the question of how teams make smaller commitments, I predict how much players

with upcoming options can expect to receive as free agents. I use this prediction as a proxy for

8 By convention, one point of a baseball percentage refers to 0.001. I derive the estimate for the marginal value of a point of slugging percentage from an OLS regression of salary on three lags of slugging percentage and a constant. The estimate of 29.5 ($29,500) is the sum of the coefficients on the lags of slugging percentage.

the player’s value to the team with the team option. To predict this value, I run the following

regression to determine coefficients on key variables:

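\[
\begin{aligned}
\text{SALARY} ={}& \delta_0 + \delta_1\,\text{SLG}_{t-1} + \delta_2\,\text{SLG}_{t-2} + \delta_3\,\text{SLG}_{t-3} + \delta_4\,\text{BASES}_{t-1} + \delta_5\,\text{BASES}_{t-2} + \delta_6\,\text{BASES}_{t-3} \\
& + \delta_7\,\text{2001} + \delta_8\,\text{2002} + \delta_9\,\text{2003} + \delta_{10}\,\text{2004} + \delta_{11}\,\text{AGE} + \delta_{12}\,\text{AGE}^2 + \delta_{13}\,\text{ALLSTAR} + \varepsilon
\end{aligned}
\tag{8}
\]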

Because options only extend a player’s contract for one year, in determining the

coefficients I only include new free agents who sign one-year contracts. I then use the

coefficient estimates and the data from players with options to predict their salaries, which I label

PREDICTSAL. Once I have a player’s predicted salary, I subtract from it the cost to the team of

exercising the option (the option amount minus any pre-negotiated buyout if the option is

declined). This amount, if positive, predicts that the player’s value to the team is higher than the

cost of exercising the option. A negative outcome means the team’s costs are higher than its

predicted returns from keeping the player.
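
The comparison above reduces to simple arithmetic, sketched here under stated assumptions: all amounts are in millions of dollars, and the function names and example figures are hypothetical rather than drawn from the sample.

```python
def option_cost(option_amount, buyout=0.0):
    """Cost to the team of exercising: the option amount minus the buyout owed if it is declined."""
    return option_amount - buyout

def net_value(predicted_salary, option_amount, buyout=0.0):
    """PREDICTSAL - OPTIONCOST: positive values favor exercising the option."""
    return predicted_salary - option_cost(option_amount, buyout)

# Hypothetical player: predicted one-year free agent salary of $4.0M,
# a $6.0M team option, and a $0.5M buyout if the option is declined.
value = net_value(predicted_salary=4.0, option_amount=6.0, buyout=0.5)
```

The buyout is subtracted because the team pays it only when declining, so it lowers the incremental cost of keeping the player.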

To test whether a higher value of PREDICTSAL – OPTIONCOST predicts that a team is

more likely to exercise the option, I use two methods. First, I use the nonparametric Wilcoxon

Sum Rank Test. I order the players by PREDICTSAL – OPTIONCOST from lowest to highest

and then sum the ranks of the contract options that were exercised. Because the numbers of

options exercised and declined both exceed 10 in the sample, I use a normal approximation to

find a p-value for the test of the null hypothesis that higher values of PREDICTSAL –

OPTIONCOST have no relationship with whether a team exercises an option.
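
The rank-sum procedure can be sketched as below. This is a minimal illustration that assumes a normal approximation without tie or continuity corrections, which may differ from the software routine actually used; the function names and the six example observations are hypothetical.

```python
import math

def midranks(values):
    """Rank values from lowest (1) to highest, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg_rank
        start = end + 1
    return ranks

def normal_approx(w, n1, n2):
    """z statistic and two-sided p-value for the rank sum w of a group of size n1."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return z, math.erfc(abs(z) / math.sqrt(2))

def rank_sum_test(net_values, exercised):
    """Sum the ranks of the exercised options and test against the null of no relationship."""
    ranks = midranks(net_values)
    w = sum(r for r, e in zip(ranks, exercised) if e)
    n1 = sum(1 for e in exercised if e)
    z, p = normal_approx(w, n1, len(exercised) - n1)
    return w, z, p

# Hypothetical PREDICTSAL - OPTIONCOST values (millions of dollars) and exercise decisions
net_values = [-6.2, -4.0, -3.1, -2.5, -1.0, 0.4]
exercised = [True, False, False, True, False, True]
w, z, p = rank_sum_test(net_values, exercised)
```

The six-observation example only shows the mechanics; the normal approximation is appropriate when both groups are reasonably large, as the text notes for the actual sample.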

Second, I use a logit regression to find the relationship between the PREDICTSAL –

OPTIONCOST variable and a variable for whether the team exercises the option. My logit

regression equation appears as follows:

\[
\text{EXERCISE} = \theta_0 + \theta_1\,(\text{PREDICTSAL} - \text{OPTIONCOST})
\tag{9}
\]

where EXERCISE is a binary variable that takes 1 if an option is exercised and 0 if the option is

declined. I expect the Wilcoxon Sum Rank Test and the logit regression to yield similar results.
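
For reference, the standard logit specification implies the following exercise probability. Writing x for PREDICTSAL – OPTIONCOST, and using θ0 and θ1 as labels for the intercept and slope (the symbols are chosen here for illustration), a sketch of the implied model is:

```latex
\[
\Pr(\text{EXERCISE} = 1 \mid x) = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x)}},
\qquad
\ln\frac{\Pr(\text{EXERCISE} = 1 \mid x)}{1 - \Pr(\text{EXERCISE} = 1 \mid x)} = \theta_0 + \theta_1 x .
\]
```

A positive and significant slope would mean that the log-odds of exercising rise with the option's predicted net value, which is the same direction of association the rank-sum test looks for.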

5. Results and Discussion

The results from the seemingly unrelated regressions appear in Table 4 at the end of the

paper. I will analyze the results from the player equation first and then the team equation results.

Finally, I will compare the coefficients across equations.

As I would expect, lagged observations of slugging percentage strongly predict future

slugging percentage, and the predictive power increases for the most recent observations. A

0.010 increase in SLGt-1 predicts, ceteris paribus, a 0.0047 increase in SLGt. Similarly, 0.010

increases in SLGt-2 and SLGt-3 predict 0.0012 and 0.0005 increases in SLGt, respectively. With

p-values below 0.01, the first two lagged slugging percentages are clearly strong predictors of

future performance. The ratio of the coefficients reveals that the slugging percentage of the

previous year accounts for 72.8% of the variation explained by lagged slugging percentages.

Even before considering the team’s equation, this high ratio could explain why players might

expect a jump in statistics in the contract year to lead to much larger free agent contracts.
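
As a rough check, the share can be recomputed from the rounded coefficients quoted above (0.47, 0.12, and 0.05 per unit of lagged slugging percentage). The small gap from the reported 72.8% presumably reflects rounding, which is an assumption since the unrounded estimates are not quoted in the text.

```latex
\[
\frac{0.47}{0.47 + 0.12 + 0.05} = \frac{0.47}{0.64} \approx 0.734
\]
```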

Players indeed seem to outperform predicted performance levels when they are due to

become free agents after the season. The coefficient on the dummy variable CONTRACTYEAR

is 0.2795 (on the dollar scale), so being in the contract year predicts an increase in slugging

percentage of 0.0095. Since the p-value for the coefficient is 0.115, I cannot claim that the

relationship between the contract year and slugging percentage is statistically significant, but

it provides suggestive evidence that anecdotes of players trying harder in the contract year have an

empirical basis. This result agrees with several studies, including Scroggins (1993), Maxcy,

Fort, and Krautmann (2002), and Marburger (2003), which find evidence of players altering

behavior in anticipation of free agency. The behavior change found in this paper favors higher

performance, while Scroggins (1993) concluded that a player’s production falls during the same

period. The result also differs from those of Harder (1989) and Krautmann (1993), which found

no changes in behavior.

A player’s total production, measured in total bases, over the previous three seasons also

has a positive correlation with SLGt. An increase of 10 total bases on average over the three

previous seasons (equal to 10 additional singles, 5 additional doubles or other combinations of

hits summing to 10 total bases) predicts a 0.0016 increase in SLGt. Because the lagged slugging

percentages should already account for a batter’s efficiency, this result seems to indicate that

players who have more at-bats on average will have higher slugging percentages in future

seasons. These players may be more durable and less affected by lingering injuries that could

depress slugging percentages.

While these predicted effects seem like very small increases in SLGt, even fractions of

percentage point changes can affect a player’s value appreciably. For instance, in 2004, the

slugging average mean for players with at least 50 at-bats was 0.396. This corresponded to the

46th percentile of the distribution. An increase of 0.010 to 0.406 would have raised the player to

the 51st percentile. The effects of lagged slugging averages and being in a contract year therefore

have a discernible economic impact.

The key off-field player characteristic, AGE, exhibits unexpected effects on SLGt. I

hypothesized that players would improve as they age but at a decreasing rate. The coefficient on

AGE, however, is negative and large enough that its impact cannot be ignored. An increase in

age by one year corresponds to a decrease in SLGt of 0.0208. The p-value for the coefficient’s

difference from 0 is 0.016. The positive coefficient for AGE2, 0.0075 (0.0003 after scaling back

to slugging percentages), also has the opposite sign from my expectation, with a p-value of 0.056.

The effect of the linear AGE variable is stronger because the break-even point at which a change

in age does not predict any change in slugging percentage occurs between 40 and 41 years old,

which lies at the very top end of the age distribution of players. The improvement in skill due to

experience and becoming accustomed to the Major Leagues may then be overrated or possibly

counteracted by other effects associated with aging, such as the accumulation of injuries. Those

players who last into their late 30s and early 40s may be the healthiest players. Because the

injury-prone players usually retire earlier, the oldest players are not representative of the rest of

the population, and their return to experience may be higher or their vulnerability to injury may

be lower.
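
The break-even age follows from setting the derivative of the quadratic age profile to zero. The sketch below uses the figures quoted in the text and assumes that the 0.0208 decrease is the linear AGE coefficient in slugging percentage units, that 0.0075 is the AGE2 coefficient on the dollar scale, and that the scaling is $29.5 million per unit of slugging percentage; the function name is hypothetical.

```python
def turning_point_age(coef_age, coef_age_sq):
    """Age at which the marginal effect coef_age + 2 * coef_age_sq * AGE equals zero."""
    return -coef_age / (2 * coef_age_sq)

MILLIONS_PER_SLG = 29.5                  # $29,500 per 0.001 of SLG, expressed in millions per unit
coef_age = -0.0208 * MILLIONS_PER_SLG    # linear AGE effect moved onto the dollar scale (assumed)
coef_age_sq = 0.0075                     # AGE2 coefficient as reported on the dollar scale
break_even = turning_point_age(coef_age, coef_age_sq)
```

With these inputs the turning point falls between 40 and 41 years, in line with the range stated in the text.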

The last variables in the player equation are the year dummy variables. The coefficients

on 2001 and 2002 are negative while those on 2003 and 2004 are positive. The only small p-

value corresponds to the 2004 coefficient. The yearly effects appear to be large, as a player in

2003 can expect to have a slugging percentage 0.0142 higher than a player in 2002. As

discussed previously, there could be several reasons that the yearly averages fluctuate, including

the composition of the balls and the prevalence of steroid use. These results fail to distinguish

among many possible reasons, but they do indicate that hitting averages are susceptible to the

different yearly playing environments.

The signs of the coefficients in the team equation are similar, which signifies that general

trends in the variables are well-established enough to predict both future performance and

salaries. A 0.010 increase in slugging percentage in the most recent year predicts a salary

increase of $138,813. Equal increases in SLGt-2 and SLGt-3 generate predicted salary increases

of $38,868 and $18,915, respectively, holding all other variables constant. Only the coefficient

on SLGt-3 has a p-value above 0.01, and its p-value of 0.245 is low enough that its effect on free

agent salaries could carry some weight. These predicted salary increases only apply to new free

agent contracts, not to players who are under long-term contracts. The figures also represent the

amount the player signed for and give no conclusive prediction on what the player was offered

by other teams except that other offers likely fell below the predicted salary. The ratio of the

coefficients on the lagged slugging averages shows that the slugging percentage from the most

recent year explains 63.0% of the salary variation due to past slugging averages. This ratio is

lower than the equivalent ratio in the player equation, a relationship I will explore below.

A player’s durability, as measured by his average total bases in the previous three years

while holding slugging averages constant, has a large impact on his free agent salary. An

increase of 10 total bases on average over the three previous seasons yields a predicted salary

increase of $128,598. These predicted changes in salary can have a large impact on a team’s

financial structure, although the changes are smaller in scale than those predicted in the player

equation. An increase in free agent salary of $500,000 moves the mean salary from the 72nd to

the 76th percentile.

Accounting for a player’s value to a team beyond his batting statistics is more difficult,

though the coefficients on ALLSTAR and GOLDGLOVE may capture some of this value.

Having been an All-Star in at least one of the three previous seasons increases a player’s predicted

salary by $1.393 million. Because the lagged slugging averages and total bases average should

cover much of a player’s hitting value, this large salary increase due to All-Star status may

indicate that a player’s marketability is very important in determining free agent offers. The

effect of defensive ability, measured by number of gold gloves, is smaller. An additional gold

glove corresponds to a salary increase of $64,161, and the coefficient’s p-value of 0.343

indicates that this relationship is somewhat shaky. Either a player’s fielding skills may not factor

strongly in salary offer decisions or the number of gold gloves is a poor measure of a player’s

defense.

A player’s age affects salaries in a similar manner to its effect on future slugging

percentage. The AGE coefficient is negative while the AGE2 coefficient is positive, albeit with

higher p-values than the corresponding variables in the player equation. The age at which a year

increase has no effect on free agent salary is between 44 and 45 years, older than almost every

major leaguer. Even though the effect of age on a player’s future performance runs counter to my

original hypothesis, teams seem to have figured out the correct relationship when constructing

their free agent salary offers.

Finally, the year coefficients are all negative relative to the free agents who signed their

contracts for the 2005 season. Interestingly, free agent salaries on average were higher in 2001

than in the later years (until 2005). This result contradicts the general thought that free agent

salaries escalate from year to year. These differences caused by years could derive from the

variability in free agent markets across years. Some years many teams may have more money to

spend due to generally favorable economic conditions or a higher general interest in baseball.

This higher demand for players would cause salaries to rise.

To test whether the coefficients on the same independent variables are equal across the

equations, I employ F-tests.⁹ The coefficient on SLGt-1 is larger in the player equation, and the

test of equality yields a p-value of only 0.0198. This test is critical in answering the question of

how teams make large commitments and how their behavior affects players. If teams were

basing salary decisions primarily on future slugging percentage, then they were undervaluing the

most recent slugging average. Teams obviously consider other factors in determining a player’s

9 To test the coefficients across the equations, I use the player equation coefficients in dollar terms.

value, but the teams still appear to be particularly cautious in using recent performance to predict

future performance. This undervaluing or cautiousness discredits the prediction that players are

rationally trying harder in their contract year with the expectation that their efforts will lead to

significantly higher rewards. Teams do not create the incentive to try harder during the contract

year. Instead, if players were to act more rationally, they would hold their effort relatively more

constant prior to signing new free agent deals.
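
A cross-equation test of this kind can be written in its generic single-restriction form. The expression below is a sketch, not the paper's exact computation, which is not shown; \(\hat\beta_1\) and \(\hat\gamma_1\) are labels chosen here for the estimated SLGt-1 coefficients in the player and team equations, both on the dollar scale.

```latex
\[
H_0:\ \beta_1 = \gamma_1,
\qquad
F = \frac{(\hat\beta_1 - \hat\gamma_1)^2}
         {\widehat{\operatorname{Var}}(\hat\beta_1) + \widehat{\operatorname{Var}}(\hat\gamma_1)
          - 2\,\widehat{\operatorname{Cov}}(\hat\beta_1, \hat\gamma_1)}
\]
```

The covariance term is available only because the two equations are estimated jointly, which is one practical reason to prefer seemingly unrelated regression over two separate OLS regressions for this comparison.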

The coefficients on the other lags of slugging average, however, are very similar across

the equations, as are the coefficients on 2001, AGE, and AGE2. These effects on slugging

percentage and salary appear to be relatively equivalent. AVGBASES though is a stronger

predictor of future slugging percentage than future salary, as the p-value for a test of equal

coefficients is 0.0008. This result runs counter to intuition, as I would expect teams to be quite

concerned with a player’s durability whereas durability’s effect on slugging percentage is less

clear. The problem may be that AVGBASES is very highly correlated with slugging percentage

because the statistics are so similar. A different measure of durability may prove more

enlightening. Finally, the coefficients on 2003 and 2004 are conclusively higher in the player

equation, while a test of equal coefficients on 2002 offers a p-value of 0.2805. Year effects on

player performance in 2003 and 2004, and to a lesser extent in 2002, were stronger than effects

on the free agent market.

While the results from how teams determine their large commitments are relatively

explainable, the statistical results from small commitments are puzzling. The Wilcoxon Sum

Rank Test results appear in Table 6, while Table 7 delineates the results from the logit regression

model.

Of the 69 team options in the sample, exactly two-thirds were declined. The predicted

salaries of the players with upcoming options, which serve as proxies for the player’s value to

their teams, are lower than the team’s option cost in 63 of the observations. Option amounts,

when negotiated at the beginning of a long-term contract, thus appear to exceed players’ value in

almost all cases. The surprise then is that only two-thirds of team options were declined. The

sum of ranks of the exercised options totals 851, which yields a two-sided p-value of 0.558. The

exercised options are fairly evenly distributed throughout the values of PREDICTSAL –

OPTIONCOST, with five in the first quartile, six in the second, five in the third, and seven in the

last. In fact, the player with the most negative value of PREDICTSAL – OPTIONCOST, Raul

Mondesi in 2003, saw his team exercise his option.
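
The reported p-value can be reproduced from the figures above. With 69 options of which two-thirds were declined, 23 were exercised and 46 declined; the calculation below assumes a normal approximation without tie or continuity corrections.

```latex
\[
\begin{aligned}
E[W] &= \frac{n_1 (n_1 + n_2 + 1)}{2} = \frac{23 \times 70}{2} = 805, \\
\operatorname{sd}(W) &= \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{23 \times 46 \times 70}{12}} \approx 78.56, \\
z &= \frac{851 - 805}{78.56} \approx 0.586,
\qquad
p = 2\,\bigl(1 - \Phi(0.586)\bigr) \approx 0.558 .
\end{aligned}
\]
```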

The logit results are very similar, as they show no discernible relationship between

PREDICTSAL – OPTIONCOST and whether the option was exercised. The coefficient on the

independent variable is only 0.034, meaning that an increase in PREDICTSAL – OPTIONCOST

does little to predict whether the option will be exercised. The p-value is 0.609, so the

relationship is very weak.
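
To put the coefficient's size in perspective, it can be read as an odds ratio. This sketch assumes that PREDICTSAL – OPTIONCOST is measured in millions of dollars, consistent with the salary units used elsewhere in the paper.

```latex
\[
e^{0.034} \approx 1.035
\]
```

Under that assumption, each additional $1 million of predicted net value raises the odds of the option being exercised by only about 3.5 percent.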

I expected teams to follow at least some pattern when making options decisions, but these

results offer no straightforward answer. The model for predicting salary could be erroneous,

though these option results are so devoid of a pattern that small changes to the model likely will

not change the conclusions. Teams may feel particularly comfortable with their own players

because their work habits are established or the fans have made a connection with the player, and

thus the teams may be more willing to exercise options even when a model using observable

variables advises otherwise. The size of the commitment may also affect a team’s decision.

Exercising an option is a short-term commitment with limited ramifications. Some teams could

be less discriminating when making options decisions because the effects are smaller. An

alternative hypothesis, which requires more data to be tested, is whether teams exercise options

to gain the goodwill of a player in the hope that the team and player can agree to a team-friendly

extension subsequently.

6. Conclusion

This paper has found that MLB clubs take different approaches in making decisions

regarding large commitments versus small commitments. When making often long-term

decisions about free agents, teams undervalue most recent on-field performance relative to its

ability to predict future on-field performance. Teams may base salary decisions on many factors

in addition to on-field performance, which could explain this possible undervaluation. Front

offices may also be particularly cautious when evaluating the value of free agents because their

actual value is so uncertain. The result that players perform better than expected during their

contract year complicates free agent decisions, and teams may react to such variation by taking a

circumspect approach. I would expect that as players (or their agents) observe this cautiousness

or undervaluation, they will adjust their behavior and no longer perform at a higher level in their

contract year but rather exert consistent effort over several years. This development, in turn,

could lead to more certainty when evaluating players, and teams may start to place more of an

emphasis on recent on-field performance when making free agent decisions. The cyclical nature

of this relationship reveals how interrelated player and team behavior are and justifies the use of

the seemingly unrelated regression in modeling these effects.

When making small commitments, baseball teams seem to be much less cautious and

exercise team options too often. The lack of any pattern between a player’s predicted profit to

the team and whether the option is exercised is startling. A team’s and its fans’ familiarity with a

player may explain this result, but more likely the team follows a strategy for small commitments

that is difficult to model. Because teams are less worried about the impact of these decisions,

they may use more arbitrary decision-making.

Many avenues exist for further research. The growth of the Internet and rising interest in

baseball research mean that data will become more readily available. Very few studies have

incorporated data from across many years. Such an effort could reveal how stable the

relationships found in this paper are across different eras.

More research into on-field performance measures might produce statistics that are more

representative of a player’s value to a team, particularly in terms of fielding and speed.

Qualitative research through interviews or other methods could reveal which performance

measures teams actually use to determine players’ values. Similar methods could determine

whether teams negotiate options in original contracts with the intention of exercising them.

Unfortunately, teams value secrecy because they operate in a competitive environment, so such

data likely will only emerge years after decisions are made.

Further research into how the length of free agent contracts affects salary amounts and

the evaluation of a player’s recent on-field performance could fine-tune the conclusions found in

this paper. The reverse causality problems of including contract length as an independent

variable precluded its direct use in this analysis, but future research may be able to relate contract

length to teams’ free agent and option decisions.

Table 1: Variable Definitions

Variable        Definition
SLGt            Batter’s slugging percentage in year t
SLGt-1          Batter’s slugging percentage in year (t-1)
SLGt-2          Batter’s slugging percentage in year (t-2)
SLGt-3          Batter’s slugging percentage in year (t-3)
AVGBASES        Average of total bases in 3 previous years
BASESt-1        Total bases in year (t-1)
BASESt-2        Total bases in year (t-2)
BASESt-3        Total bases in year (t-3)
2001            Dummy variable =1 if t=2001
2002            Dummy variable =1 if t=2002
2003            Dummy variable =1 if t=2003
2004            Dummy variable =1 if t=2004
AGE             Batter’s age
AGE2            Batter’s age squared
SALARY          Salary for first year of new free agent contract (in millions of dollars)
LENGTH          Length of new free agent contract
NEWCONTRACT     Dummy variable =1 if player signed new free agent contract starting in year t
CONTRACTYEAR    Dummy variable =1 if year t is the player’s last year before becoming a free agent
ALLSTAR         Dummy variable =1 if the player has made an All-Star team in the previous 3 years
GOLDGLOVE       Number of Gold Glove Awards prior to year t

Table 2: Descriptive Statistics

| Variables | ALL: Mean (S.E.) | 2001: Mean (S.E.) | 2002: Mean (S.E.) | 2003: Mean (S.E.) | 2004: Mean (S.E.) |
|---|---|---|---|---|---|
| SLGt | 0.419 (0.096) | 0.427 (0.104) | 0.410 (0.097) | 0.419 (0.093) | 0.423 (0.095) |
| SLGt-1 | 0.432 (0.093) | 0.453 (0.097) | 0.434 (0.094) | 0.422 (0.085) | 0.426 (0.095) |
| SLGt-2 | 0.433 (0.099) | 0.443 (0.100) | 0.443 (0.099) | 0.438 (0.109) | 0.414 (0.090) |
| SLGt-3 | 0.433 (0.108) | 0.435 (0.107) | 0.430 (0.113) | 0.443 (0.107) | 0.427 (0.114) |
| AVGBASES | 168.25 (88.60) | 185.55 (83.70) | 172.73 (88.87) | 162.67 (91.80) | 155.75 (89.50) |
| BASESt-1 | 173.92 (94.71) | 192.78 (93.07) | 175.06 (97.13) | 166.51 (92.56) | 166.75 (94.75) |
| BASESt-2 | 174.06 (99.40) | 193.47 (94.26) | 181.02 (98.02) | 170.06 (103.03) | 157.56 (100.07) |
| BASESt-3 | 147.61 (102.04) | 176.90 (104.99) | 177.71 (99.17) | 179.66 (103.59) | 163.71 (105.93) |
| 2001 | 0.191 (0.393) | 1 (0) | 0 (0) | 0 (0) | 0 (0) |
| 2002 | 0.244 (0.430) | 0 (0) | 1 (0) | 0 (0) | 0 (0) |
| 2003 | 0.259 (0.438) | 0 (0) | 0 (0) | 1 (0) | 0 (0) |
| 2004 | 0.254 (0.435) | 0 (0) | 0 (0) | 0 (0) | 1 (0) |
| AGE | 30.98 (4.18) | 31 (3.97) | 31.04 (4.07) | 30.76 (4.27) | 30.36 (4.10) |
| AGE2 | 977.09 (265.26) | 976.32 (246.02) | 980.05 (259.61) | 964.04 (267.78) | 938.40 (257.24) |
| SALARY | 3.357 (3.768) | 3.716 (3.476) | 3.377 (3.561) | 3.507 (4.015) | 2.963 (4.017) |
| LENGTH | 2.350 (1.810) | 2.770 (1.850) | 2.485 (1.861) | 2.323 (1.809) | 2.045 (1.773) |
| NEWCONTRACT | 0.248 (0.432) | 0.126 (0.333) | 0.198 (0.399) | 0.220 (0.415) | 0.261 (0.440) |
| CONTRACTYEAR | 0.250 (0.433) | 0.174 (0.380) | 0.256 (0.437) | 0.302 (0.460) | 0.269 (0.444) |
| ALLSTAR | 0.242 (0.429) | 0.297 (0.458) | 0.241 (0.428) | 0.242 (0.429) | 0.216 (0.412) |
| GOLDGLOVE | 0.433 (1.425) | 0.483 (1.450) | 0.401 (1.350) | 0.451 (1.468) | 0.375 (1.363) |

Table 3: Differences in Descriptive Statistics

| Variable | In Sample: Mean (S.E.) | Not In Sample: Mean (S.E.) | T-Statistic | P-Value |
|---|---|---|---|---|
| No. obs. | 1330 | 784 | n/a | n/a |
| Salary | 3.357 (3.768) | 1.171 (2.169) | 16.914 | 0.000 |
| Age | 30.76 (4.14) | 28.32 (4.31) | 12.741 | 0.000 |
| Born in USA | 0.711 (0.453) | 0.717 (0.451) | -0.268 | 0.606 |
| Switch Hitter | 0.137 (0.344) | 0.165 (0.358) | -1.683 | 0.954 |
| Experience | 6.478 (4.242) | 3.130 (4.195) | 18.247 | 0.000 |
| At-Bats | 380.16 (177.00) | 213.92 (150.56) | 22.938 | 0.000 |
| OBP | 0.333 (0.050) | 0.310 (0.047) | 10.503 | 0.000 |
| SLG | 0.420 (0.097) | 0.383 (0.085) | 9.007 | 0.000 |
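The t-statistics in Table 3 can be approximately reproduced from the reported summary statistics. The sketch below rests on two assumptions that the table does not state: that the test is an unequal-variance (Welch) two-sample statistic, and that the parenthesized values are sample standard deviations. The function names `welch_t` and `upper_tail_p` are illustrative. The reported p-values for the Born in USA and Switch Hitter rows are consistent with an upper-tail probability P(T > t), which a normal approximation recovers closely at these sample sizes.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t-statistic with unequal variances (assumed form of the test)."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

def upper_tail_p(stat):
    """P(Z > stat) for a standard normal Z (large-sample approximation)."""
    return 0.5 * math.erfc(stat / math.sqrt(2))

# Summary statistics copied from Table 3 (in sample: 1330 obs., not in sample: 784 obs.)
t_salary = welch_t(3.357, 3.768, 1330, 1.171, 2.169, 784)
t_age = welch_t(30.76, 4.14, 1330, 28.32, 4.31, 784)

# Upper-tail p-values computed from the t-statistics reported in the table
p_born_in_usa = upper_tail_p(-0.268)
p_switch_hitter = upper_tail_p(-1.683)

print(round(t_salary, 2), round(t_age, 2))
print(round(p_born_in_usa, 3), round(p_switch_hitter, 3))
```

Small gaps between the recomputed and reported t-statistics are expected because the inputs are rounded to the precision shown in the table.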

Table 4: Seemingly Unrelated Regression Results

SUREG: SLG*t (1330 obs.)

| Variable | Coefficient ($ scale) | Coefficient (SLG scale) | St. Error ($ scale) | Z | P>abs(Z) |
|---|---|---|---|---|---|
| SLGt-1 | 13.88125 | 0.470551 | 1.027305 | 13.51 | 0.000 |
| SLGt-2 | 3.587371 | 0.121606 | 0.9962874 | 3.60 | 0.000 |
| SLGt-3 | 1.597243 | 0.054144 | 0.8391799 | 1.90 | 0.057 |
| AVGBASES | 0.0048503 | 0.000164 | 0.0012833 | 3.78 | 0.000 |
| 2001 | -0.0519348 | -0.00176 | 0.374481 | -0.14 | 0.890 |
| 2002 | -0.1014643 | -0.00344 | 0.3739389 | -0.27 | 0.786 |
| 2003 | 0.3159927 | 0.010712 | 0.3765473 | 0.84 | 0.401 |
| 2004 | 0.6406431 | 0.021717 | 0.3765615 | 1.70 | 0.089 |
| AGE | -0.6128213 | -0.02077 | 0.2549867 | -2.40 | 0.016 |
| AGE2 | 0.007528 | 0.000255 | 0.0039441 | 1.91 | 0.056 |
| CONTRACTYEAR | 0.2795232 | 0.009475 | 0.1774078 | 1.58 | 0.115 |
| CONSTANT | 14.62528 | 0.495772 | 4.109621 | 3.56 | 0.000 |

SUREG: SALARY (266 obs.)

| Variable | Coefficient | St. Error | Z | P>abs(Z) |
|---|---|---|---|---|
| SLGt-1 | 9.82019 | 1.441412 | 6.81 | 0.000 |
| SLGt-2 | 3.886767 | 1.408454 | 2.76 | 0.006 |
| SLGt-3 | 1.891461 | 1.628608 | 1.16 | 0.245 |
| AVGBASES | 0.0128598 | 0.0020629 | 6.23 | 0.000 |
| 2001 | -0.2674216 | 0.3766442 | -0.71 | 0.478 |
| 2002 | -0.6364843 | 0.3437637 | -1.85 | 0.064 |
| 2003 | -1.252135 | 0.3254681 | -3.85 | 0.000 |
| 2004 | -1.236207 | 0.307357 | -4.02 | 0.000 |
| AGE | -0.5610292 | 0.3810154 | -1.47 | 0.141 |
| AGE2 | 0.0063262 | 0.0054899 | 1.15 | 0.249 |
| ALLSTAR | 1.393422 | 0.3146635 | 4.43 | 0.000 |
| GOLDGLOVE | 0.0641614 | 0.0676646 | 0.95 | 0.343 |
| CONSTANT | 5.956073 | 6.645668 | 0.90 | 0.370 |
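In the slugging equation of Table 4, the two coefficient columns appear to differ by a single constant factor. The sketch below checks this by dividing each dollar-scale coefficient by its SLG-scale counterpart. The variable names are illustrative, and the interpretation of the factor as the rescaling applied to the dependent variable is an inference from the table, not something the table states.

```python
# (dollar-scale, SLG-scale) coefficient pairs copied from the SLG equation in Table 4
pairs = {
    "SLGt-1": (13.88125, 0.470551),
    "SLGt-2": (3.587371, 0.121606),
    "SLGt-3": (1.597243, 0.054144),
    "AVGBASES": (0.0048503, 0.000164),  # few significant digits, so a looser match
    "CONSTANT": (14.62528, 0.495772),
}

# Ratio of the two scales for each regressor
ratios = {name: dollar / slg for name, (dollar, slg) in pairs.items()}

# Use the most precisely reported pair as the implied scale factor
scale = ratios["SLGt-1"]

def to_slg_scale(dollar_coefficient):
    """Convert a dollar-scale coefficient to the SLG scale (assumed constant factor)."""
    return dollar_coefficient / scale

# Example: the 2004 dummy, reported as 0.6406431 ($ scale) and 0.021717 (SLG scale)
converted_2004 = to_slg_scale(0.6406431)

for name, ratio in ratios.items():
    print(name, round(ratio, 2))
print(round(converted_2004, 6))
```

Every ratio comes out close to 29.5, with the small deviations attributable to rounding in the SLG-scale column.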

Table 5: SALARY (164 obs.), R-squared = 0.4587

| Variable | Coefficient | St. Error | t | P>abs(t) |
|---|---|---|---|---|
| SLGt-1 | 2.431705 | 1.387001 | 1.75 | 0.082 |
| SLGt-2 | 1.216455 | 1.054979 | 1.15 | 0.251 |
| SLGt-3 | 3.062118 | 1.346025 | 2.27 | 0.024 |
| BASESt-1 | 0.0036005 | 0.0018145 | 1.98 | 0.049 |
| BASESt-2 | 0.0030688 | 0.00157 | 1.95 | 0.052 |
| BASESt-3 | 0.0001124 | 0.00148 | 0.08 | 0.940 |
| 2001 | -0.1504599 | 0.3196281 | -0.47 | 0.639 |
| 2002 | -0.5734228 | 0.2899159 | -1.98 | 0.050 |
| 2003 | -0.6522764 | 0.2621081 | -2.49 | 0.014 |
| 2004 | -0.6734454 | 0.2505469 | -2.69 | 0.008 |
| AGE | 0.4423488 | 0.2842597 | 1.56 | 0.122 |
| AGE2 | -0.00673 | 0.0040092 | -1.66 | 0.098 |
| ALLSTAR | 0.9218431 | 0.2683967 | 3.43 | 0.001 |
| CONSTANT | -9.197668 | 5.067256 | -1.82 | 0.072 |
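To show how the Table 5 coefficients combine into a fitted salary, the sketch below evaluates the linear equation for one hypothetical batter. The batter's statistics are invented for illustration and do not come from the paper's data. The sketch also assumes that the year without a dummy is the baseline and that SALARY is in millions of dollars, as defined in Table 1.

```python
# Coefficients copied from Table 5 (dependent variable: SALARY)
COEF = {
    "SLGt-1": 2.431705, "SLGt-2": 1.216455, "SLGt-3": 3.062118,
    "BASESt-1": 0.0036005, "BASESt-2": 0.0030688, "BASESt-3": 0.0001124,
    "2001": -0.1504599, "2002": -0.5734228, "2003": -0.6522764, "2004": -0.6734454,
    "AGE": 0.4423488, "AGE2": -0.00673, "ALLSTAR": 0.9218431,
    "CONSTANT": -9.197668,
}

def fitted_salary(features):
    """Constant plus coefficient times value for every regressor supplied."""
    return COEF["CONSTANT"] + sum(COEF[name] * value for name, value in features.items())

# Hypothetical batter (illustrative values only): .450 slugging and 250 total
# bases in each of the three prior seasons, age 30, year 2004, no All-Star selection
batter = {
    "SLGt-1": 0.450, "SLGt-2": 0.450, "SLGt-3": 0.450,
    "BASESt-1": 250, "BASESt-2": 250, "BASESt-3": 250,
    "2004": 1, "AGE": 30, "AGE2": 30**2, "ALLSTAR": 0,
}

salary = fitted_salary(batter)
print(f"Fitted first-year salary: {salary:.3f} (millions of dollars)")
```

Omitted regressors contribute nothing to the sum, so leaving the other year dummies out of `batter` is equivalent to setting them to zero.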

Table 6: Wilcoxon Sum Ranks (69 obs.)

| Sum Exercised | No. Exercised | No. Not Exerc. | t | P>abs(t) |
|---|---|---|---|---|
| 851 | 23 | 46 | 0.58554 | 0.558 |
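The test statistic in Table 6 can be recovered from the three reported counts with the standard large-sample normal approximation to the rank-sum test. The sketch below applies that approximation without a tie correction, which is an assumption about how the statistic was computed; the function names are illustrative.

```python
import math

def rank_sum_z(rank_sum, n1, n2):
    """Normal approximation for the rank sum of group 1 (no tie correction)."""
    n = n1 + n2
    expected = n1 * (n + 1) / 2          # mean of the rank sum under the null
    variance = n1 * n2 * (n + 1) / 12    # variance of the rank sum under the null
    return (rank_sum - expected) / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Values from Table 6: rank sum of exercised options, 23 exercised, 46 not exercised
z = rank_sum_z(851, 23, 46)
p = two_sided_p(z)
print(round(z, 5), round(p, 3))
```

Under the null, the expected rank sum for the 23 exercised options is 805, so the observed sum of 851 lies well within one standard deviation of it.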

Table 7: Logit Regression on EXERCISE (69 obs.)

| Variable | Coefficient | St. Error | Z | P>abs(Z) |
|---|---|---|---|---|
| PREDICTSAL – OPTIONCOST | 0.0471769 | 0.0923196 | 0.51 | 0.609 |
| CONSTANT | -0.5739271 | 0.3421482 | -1.68 | 0.093 |
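The logit coefficients in Table 7 translate into exercise probabilities through the logistic function. The sketch below evaluates the fitted probability at a few values of PREDICTSAL – OPTIONCOST. It assumes the difference is measured in millions of dollars, which matches the SALARY units in Table 1 but is not stated for OPTIONCOST; the function name and the chosen gaps are illustrative.

```python
import math

# Coefficients copied from Table 7
CONSTANT = -0.5739271
SLOPE = 0.0471769  # coefficient on PREDICTSAL - OPTIONCOST

def exercise_probability(gap):
    """Fitted probability that an option is exercised, given the salary gap."""
    index = CONSTANT + SLOPE * gap
    return 1 / (1 + math.exp(-index))

# Gaps (assumed to be in millions of dollars): option overpriced, fairly priced, underpriced
for gap in (-5.0, 0.0, 5.0):
    print(gap, round(exercise_probability(gap), 3))

p_at_zero = exercise_probability(0.0)
```

At a gap of zero the fitted probability is about 0.36, close to the 23 of 69 options exercised in Table 6, and the small slope means the probability moves only modestly as the gap changes.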
