A recurring theme in the mathematics of sports · webspace.ship.edu/deensley/MathSportsMAA.pdf · 2010. 4....

A recurring theme in the mathematics of sports Doug Ensley Shippensburg University [email protected]


Page 1: A recurring theme in the mathematics ofwebspace.ship.edu/deensley/MathSportsMAA.pdf · 2010. 4. 15. · Scoring in tennis. Essentially a game in tennis is won by the first player

A recurring theme in the mathematics of sports

Doug Ensley

Shippensburg University

[email protected]


April is Mathematics Awareness Month!


April is also…

• Alcohol Awareness Month
• National Oral Health Month
• Stress Awareness Month
• Jazz Appreciation Month
• Train Safety Month


Esoterica(pedia)

The longest known singles tennis game was one of 80 points between Anthony Fawcett (Rhodesia) and Keith Glass (Great Britain) in the first round of the Surrey, Great Britain Championships on 26 May 1975.

QUESTION:

What is the probability of this happening by chance? What assumptions on a model of a tennis game can account for this freakish phenomenon?


Scoring in tennis

Essentially a game in tennis is won by the first player with 4 points, but that player must win by 2 points.

When the score is tied 3-3, 4-4, etc., we say the score is at deuce.

After a deuce score, when the server is up one point, we say the score is ad in, and when the receiver is up one point, the score is ad out.

The 1975 Fawcett-Glass game had deuce 37 times.


Expected Value

Problem. What is the expected length of a tennis game which begins tied at deuce and in which player A wins a point with probability p?

Background: For a random (quantitative, discrete) variable X (e.g., number of points in a tennis game), the expected value of X is a weighted average of the possible values of X; specifically, if the possible values of X are $v_0, v_1, v_2, \ldots$, then

\[ E(X) = \sum_{k} v_k \cdot \Pr(X = v_k) \]


Aside: Average Value

Suppose for the experiment of “choosing a random member of the Ensley family” on 04/09/2010, we define the variable X = the age of the person chosen. The following table shows the four possible values of X as well as the probability each is chosen.

Value 14 17 45 46

Pr(X=Value) 0.25 0.25 0.25 0.25


What is the average age of people in the Ensley house today?

\[ E(X) = (14)(0.25) + (17)(0.25) + (45)(0.25) + (46)(0.25) = 30.5 \]

Value 14 17 45 46

Pr(X=Value) 0.25 0.25 0.25 0.25
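
The weighted average in this aside can be checked with a few lines of Python. This is only an illustrative sketch: the function name `expected_value` and the variable names are assumptions made for the example, not part of the talk.

```python
# Ages and probabilities taken from the table in the aside.
ages = [14, 17, 45, 46]
probs = [0.25, 0.25, 0.25, 0.25]

def expected_value(values, probabilities):
    """Weighted average: sum of value * Pr(X = value)."""
    return sum(v * p for v, p in zip(values, probabilities))

average_age = expected_value(ages, probs)
print(average_age)  # 30.5
```

The same helper works for any finite table of values and probabilities, as long as the probabilities sum to 1.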


Tennis, anyone?

Problem. What is the expected length of a tennis game which begins tied at deuce and in which player A wins a point with probability p?

Let X = the number of points that are played after deuce. What is the set of all possible values of X?

According to the definition, the expected value is the infinite series

\[ E(X) = \sum_{k=0}^{\infty} k \cdot \Pr(X = k) \]

NOTE: The number of two-point rounds played after deuce, X/2, follows what is called a geometric distribution in probability theory.


Solution. Let S be the set of all outcomes of this experiment. That is,

S = {AA, BB, ABAA, ABBB, BAAA, BABB, ABBAAA, …}

Hence, every element of S is either

• AA alone, or
• BB alone, or
• of the form AB____ , or
• of the form BA____ ,

where the blank is filled by any element of S.


Let L be the average length of a string in S.

• AA alone: Probability $p \cdot p = p^2$, Length 2
• or BB alone: Probability $(1 - p)^2$, Length 2
• or AB____: Probability $p \cdot (1 - p)$, Length 2 + L
• or BA____: Probability $(1 - p) \cdot p$, Length 2 + L


Solution. S = {AA, BB, ABAA, ABBB, BAAA, BABB, …}

Elements of S        (Probability)∙(Length)
• AA alone,          p^2 (2)
• or BB alone,       (1 – p)^2 (2)
• or AB____,         p (1 – p) (2 + L)
• or BA____.         (1 – p) p (2 + L)

The average length L of elements of S satisfies the equation

\[ L = p^2 (2) + (1 - p)^2 (2) + 2p(1 - p)(2 + L) \]

which has solution

\[ L = \frac{2}{2p^2 - 2p + 1} = \frac{2}{p^2 + (1 - p)^2} \]


Average length of a tennis game beyond “deuce”

\[ L = \frac{2}{p^2 + (1 - p)^2} \]

NOTE: Probability theory tells us that the variance of the geometric distribution of X is given by $8p(1-p)/(p^2+(1-p)^2)^2$, which has maximum value of 8.
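
One way to sanity-check the closed form for L is a small Monte Carlo simulation of play from deuce. The sketch below is an illustration under the slide's model (independent points, fixed p); the function names, trial count and seed are assumptions chosen for the example.

```python
import random

def expected_length(p):
    """Closed form from the slide: L = 2 / (p^2 + (1 - p)^2)."""
    return 2 / (p**2 + (1 - p)**2)

def simulate_length(p, rng):
    """Play points from deuce until one player leads by two; return the point count."""
    lead = 0      # A's lead over B, starting level at deuce
    points = 0
    while abs(lead) < 2:
        lead += 1 if rng.random() < p else -1
        points += 1
    return points

def simulated_mean(p, trials=100_000, seed=1):
    """Average simulated length over many independent games."""
    rng = random.Random(seed)
    return sum(simulate_length(p, rng) for _ in range(trials)) / trials

for p in (0.5, 0.6, 0.75):
    print(p, round(expected_length(p), 3), round(simulated_mean(p), 3))
```

The two printed columns should agree to within sampling error, and the longest average (4 points) occurs at p = 0.5.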


A more general problem

In tennis a deuce point is always served from the right-hand service court; an ad point is always served from the left-hand service court. Tennis broadcasts often present data on players as if there is no difference. While this might be sound at the highest levels of tennis, it is certainly not true for amateur players. We will try the previous solution method, allowing the probability p of winning a deuce point and the probability q of winning an ad point to differ.


Seriously?


A more general problem

Problem. What is the expected length of a tennis game which begins tied at deuce and in which player A wins a deuce point with probability p and an ad point with probability q?

Solution. S = {AA, BB, ABAA, ABBB, BAAA, BABB, …}

Every element of S is either

• AA alone, or
• BB alone, or
• of the form AB____ , or
• of the form BA____ ,

where the blank is filled by any element of S.


Solution. S = {AA, BB, ABAA, ABBB, BAAA, BABB, …}

Elements of S        (Probability)∙(Length)
• AA alone,          p∙q∙(2)
• or BB alone,       (1 – p)∙(1 – q)∙(2)
• or AB____,         p∙(1 – q)∙(2 + L)
• or BA____.         (1 – p)∙q∙(2 + L)

The average length L of elements of S satisfies the equation

\[ L = 2pq + 2(1 - p)(1 - q) + p(1 - q)(2 + L) + (1 - p)q(2 + L) \]

which has solution

\[ L = \frac{2}{2pq - p - q + 1} = \frac{2}{pq + (1 - p)(1 - q)} \]
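
The algebra between the recursive equation and its solution is short enough to write out. The following is a worked sketch of one way to do it (collect the L terms on the left, then use the fact that the four case probabilities sum to 1); it is not taken from the slides.

```latex
\begin{align*}
L &= 2pq + 2(1-p)(1-q) + \bigl[p(1-q) + (1-p)q\bigr](2 + L) \\
L\,\bigl[1 - p(1-q) - (1-p)q\bigr]
  &= 2\,\bigl[pq + (1-p)(1-q) + p(1-q) + (1-p)q\bigr] = 2 \\
L &= \frac{2}{1 - p - q + 2pq} = \frac{2}{pq + (1-p)(1-q)}
\end{align*}
```

Setting q = p recovers the earlier single-probability result $L = 2/(p^2 + (1-p)^2)$.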


\[ f(p, q) = \frac{2}{pq + (1 - p)(1 - q)} \]


Theorem. The expected length L of a tennis game which begins tied at deuce and in which player A wins a deuce point with probability p and an ad point with probability q is given by

\[ L = \frac{2}{2pq - p - q + 1} = \frac{2}{pq + (1 - p)(1 - q)} \]

Examples

When q = 0.40, the maximum L is 5 with variance = 15.0.

When q = 0.30, the maximum L is 6.7 with variance = 31.1.

When q = 0.20, the maximum L is 10 with variance = 80.0.

When q = 0.10, the maximum L is 20 with variance = 360.0.

Each maximum occurs at p = 1, and the variance is $4(1 - s)/s^2$ with $s = pq + (1 - p)(1 - q)$, which reduces to the geometric-distribution formula $8p(1-p)/(p^2+(1-p)^2)^2$ when p = q.

Even in the last, extreme case a 74-point game is about 2.8 standard deviations above the mean.
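
The formula in the theorem is easy to evaluate numerically. The sketch below computes L(p, q) and finds the largest L over a grid of p values for each q in the examples; the function names and the grid search are assumptions made for illustration (a calculus argument would do the same job).

```python
def expected_length(p, q):
    """Expected points after deuce: L = 2 / (pq + (1 - p)(1 - q))."""
    return 2 / (p * q + (1 - p) * (1 - q))

def max_length_over_p(q, steps=1000):
    """Largest L over a grid of p in [0, 1] for a fixed q (grid search)."""
    return max(expected_length(k / steps, q) for k in range(steps + 1))

for q in (0.40, 0.30, 0.20, 0.10):
    print(q, round(max_length_over_p(q), 1))
```

For each q below 0.5 the maximum sits at the edge of the grid, p = 1, where the formula reduces to L = 2/q.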


Alternative game scoring

Some tennis matches or leagues employ "No-Ad" scoring. Each game proceeds as in regular tennis scoring, but if the score reaches deuce, then the winner of the next point, the seventh in the game, wins the game. The receiver selects which court to receive in. No-ad scoring is most notably used in World Team Tennis, in many recreational leagues, and some Major Mixed doubles events.

Note: This scoring system assumes that the server is not equally effective from deuce and ad courts.


Other problems to approach

Problem. What is the probability that player A (who has probability p of winning a point) wins a tennis game that begins tied at deuce?

Solution. Let t represent the probability of player A winning once the game is tied at deuce. Use recursive thinking to justify the equation

\[ t = p^2 + p(1 - p)\,t + (1 - p)\,p\,t \]

Solving this equation yields

\[ t = \frac{p^2}{p^2 + (1 - p)^2} \]


Other problems

Probability of winning a tennis game

\[ \Pr(A) = \frac{p^2}{p^2 + (1 - p)^2} \]


What does the data say?

In the 2009 Wimbledon Championship, Andy Roddick won 71% of his service points and Roger Federer won 78% of his service points.

Federer won 95% (35 of 37) of his service games and Roddick won 97% (37 of 38) of his service games. (There were two tie-breakers played, split between the two players.)

Based on these point probabilities, the model predicts 86% of service games won by Roddick and 93% by Federer.
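
The model's predictions can be reproduced by plugging the two service-point percentages into the win-from-deuce formula $p^2/(p^2 + (1-p)^2)$. The snippet is a minimal sketch; the function name `win_from_deuce` is an assumption for the example, and it treats every service game as if it started at deuce, which is what the formula describes.

```python
def win_from_deuce(p):
    """Chance of winning from deuce when each point is won with probability p."""
    return p**2 / (p**2 + (1 - p)**2)

for share in (0.71, 0.78):
    print(share, round(win_from_deuce(share), 3))
# 0.71 -> 0.857
# 0.78 -> 0.926
```

So a 71% point winner is predicted to hold about 86% of the time and a 78% point winner about 93% of the time, both below the observed hold rates.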


Tennis as a gambling problem

Suppose two players A and B have $2 each, and they play a sequence of games with $1 at stake each time until someone is out of money. This is also known as the Gambler’s Ruin – it generalizes nicely to other starting values.
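
The link to the deuce problem can be made concrete with the standard gambler's-ruin formula, which gives A's chance of ending up with all the money for any starting stakes. The sketch below uses exact fractions and assumes 0 < p < 1; the function name and the choice p = 2/3 are illustrative assumptions.

```python
from fractions import Fraction

def ruin_win_probability(p, a, b):
    """Chance that A, starting with a dollars against B's b dollars, wins everything.

    Each $1 round is won by A with probability p (standard gambler's-ruin formula).
    """
    p = Fraction(p)
    if p == Fraction(1, 2):
        return Fraction(a, a + b)
    r = (1 - p) / p                      # odds against A on a single round
    return (1 - r**a) / (1 - r**(a + b))

p = Fraction(2, 3)
print(ruin_win_probability(p, 2, 2))     # 4/5
print(p**2 / (p**2 + (1 - p)**2))        # 4/5, the win-from-deuce formula
```

With $2 each, the ruin formula and the deuce formula agree, which is the sense in which a game from deuce is a gambler's-ruin problem; other starting values only change `a` and `b`.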


Tennis as a board game

State 1: A wins game

State 2: A up 1 point

State 3: DEUCE

State 4: B up 1 point

State 5: B wins game


Markov chains

States of the game

State 1: A wins game
State 2: A up 1 point
State 3: DEUCE
State 4: B up 1 point
State 5: B wins game

Transition Matrix.

Say the probability of A winning any point is 2/3. The transition matrix gives the probabilities of moving between states in one point.

\[ M = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 2/3 & 0 & 1/3 & 0 & 0 \\ 0 & 2/3 & 0 & 1/3 & 0 \\ 0 & 0 & 2/3 & 0 & 1/3 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]


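Markov chains

Matrix multiplication

\[ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 2/3 & 0 & 1/3 & 0 & 0 \\ 0 & 2/3 & 0 & 1/3 & 0 \\ 0 & 0 & 2/3 & 0 & 1/3 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 2/3 & 0 & 1/3 & 0 & 0 \\ 0 & 2/3 & 0 & 1/3 & 0 \\ 0 & 0 & 2/3 & 0 & 1/3 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]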


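Markov chains

Matrix multiplication

Row 3 times Column 3 …

\[ \begin{pmatrix} 0 & 2/3 & 0 & 1/3 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1/3 \\ 0 \\ 2/3 \\ 0 \end{pmatrix} = (0)(0) + (2/3)(1/3) + (0)(0) + (1/3)(2/3) + (0)(0) \]

… gives the probability of going from State 3 to State 3 in two moves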


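Markov chains

Matrix multiplication

\[ M^2 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 2/3 & 0 & 1/3 & 0 & 0 \\ 0 & 2/3 & 0 & 1/3 & 0 \\ 0 & 0 & 2/3 & 0 & 1/3 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}^{2} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 2/3 & 2/9 & 0 & 1/9 & 0 \\ 4/9 & 0 & 4/9 & 0 & 1/9 \\ 0 & 4/9 & 0 & 2/9 & 1/3 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]

The entry in Row i, Column j of $M^2$ is the probability of the game progressing from State i to State j in exactly 2 moves.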
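
The squared matrix can be reproduced exactly with Python's `fractions` module. This is a minimal sketch: `transition_matrix` and `mat_mul` are helper names invented for the example, and the state order (A wins, A up 1, deuce, B up 1, B wins) follows the slides.

```python
from fractions import Fraction

def transition_matrix(p):
    """5x5 matrix for states: A wins, A up 1, deuce, B up 1, B wins."""
    q = 1 - p
    return [
        [1, 0, 0, 0, 0],
        [p, 0, q, 0, 0],
        [0, p, 0, q, 0],
        [0, 0, p, 0, q],
        [0, 0, 0, 0, 1],
    ]

def mat_mul(a, b):
    """Plain row-by-column matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M = transition_matrix(Fraction(2, 3))
M2 = mat_mul(M, M)
for row in M2:
    print([str(x) for x in row])
```

The deuce row of `M2` comes out as 4/9, 0, 4/9, 0, 1/9: after two points A has won with probability 4/9, the game is back at deuce with probability 4/9, and B has won with probability 1/9.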


Markov chains

General Matrix Powers

If M is a transition matrix for a game, then the entry in Row i, Column j of $M^k$ is the probability of the game progressing from State i to State j in exactly k moves.

This allows us to compute the probability that a game lasts a specified number of points!
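
As a sketch of how the matrix-power idea yields the chance that a game lasts a given number of points, the code below tracks only the deuce row of $M^k$ by advancing a probability vector one point at a time, which is equivalent to repeated multiplication by M. The function names are assumptions for illustration, and a single point-win probability p is assumed for every point.

```python
from fractions import Fraction

def step(dist, p):
    """Advance a probability vector over the five states by one point."""
    q = 1 - p
    a_wins, a_up, deuce, b_up, b_wins = dist
    return [
        a_wins + p * a_up,      # A converts the advantage
        p * deuce,              # A wins the deuce point
        q * a_up + p * b_up,    # back to deuce
        q * deuce,              # B wins the deuce point
        b_wins + q * b_up,      # B converts the advantage
    ]

def ends_at_exactly(k, p):
    """Probability that a game starting at deuce finishes on point k."""
    dist = [0, 0, 1, 0, 0]
    finished_before = 0
    for _ in range(k):
        finished_before = dist[0] + dist[4]
        dist = step(dist, p)
    return dist[0] + dist[4] - finished_before

p = Fraction(2, 3)
print([str(ends_at_exactly(k, p)) for k in range(1, 7)])
# ['0', '5/9', '0', '20/81', '0', '80/729']
```

Odd lengths have probability 0, and each extra two-point round multiplies the probability by 4/9, the chance of returning to deuce.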


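Markov chains

\[ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 2/3 & 0 & 1/3 & 0 & 0 \\ 0 & 2/3 & 0 & 1/3 & 0 \\ 0 & 0 & 2/3 & 0 & 1/3 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}^{74} \approx \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.933 & 5 \cdot 10^{-14} & 0 & 2 \cdot 10^{-14} & 0.067 \\ 0.800 & 0 & 9 \cdot 10^{-14} & 0 & 0.200 \\ 0.533 & 9 \cdot 10^{-14} & 0 & 5 \cdot 10^{-14} & 0.467 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]

This shows that the probability that the game is still going on after 74 points is about $10^{-13}$.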


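Probability of long games

With p = q = 0.5 …

\[ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 & 0 \\ 0 & 1/2 & 0 & 1/2 & 0 \\ 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}^{74} \approx \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.750 & 4 \cdot 10^{-12} & 0 & 4 \cdot 10^{-12} & 0.250 \\ 0.500 & 0 & 7 \cdot 10^{-12} & 0 & 0.500 \\ 0.250 & 4 \cdot 10^{-12} & 0 & 4 \cdot 10^{-12} & 0.750 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]

This shows that the probability that the game is still going on after 74 points is about $10^{-11}$. This is the most optimistic outcome for the case where p = q.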


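Probability of long games

With p = 0.67 and q = 0.25 …

\[ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1/4 & 0 & 3/4 & 0 & 0 \\ 0 & 2/3 & 0 & 1/3 & 0 \\ 0 & 0 & 1/4 & 0 & 3/4 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}^{74} \approx \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.550 & 2 \cdot 10^{-9} & 0 & 9 \cdot 10^{-10} & 0.450 \\ 0.400 & 0 & 2 \cdot 10^{-9} & 0 & 0.600 \\ 0.100 & 6 \cdot 10^{-10} & 0 & 3 \cdot 10^{-10} & 0.900 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]

This shows that the probability that the game is still going on after 74 points is about $10^{-9}$.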


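Probability of long games

With p = 0.90 and q = 0.05 …

\[ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1/20 & 0 & 19/20 & 0 & 0 \\ 0 & 9/10 & 0 & 1/10 & 0 \\ 0 & 0 & 1/20 & 0 & 19/20 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}^{74} \approx \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.354 & 4 \cdot 10^{-3} & 0 & 4 \cdot 10^{-4} & 0.642 \\ 0.320 & 0 & 4 \cdot 10^{-3} & 0 & 0.676 \\ 0.02 & 2 \cdot 10^{-4} & 0 & 2 \cdot 10^{-5} & 0.984 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]

This shows that the probability that the game is still going on after 74 points is about 0.004.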
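
The four "still going after 74 points" figures can be recomputed with a short matrix-power routine. The sketch below uses plain Python lists and floating point; the helper names are assumptions for illustration, and the matrix layout (A wins a deuce point with probability p and an ad point with probability q) follows the slides' model.

```python
def transition_matrix(p, q):
    """States: A wins, A up 1, deuce, B up 1, B wins.

    A wins a deuce point with probability p and an ad point with probability q.
    """
    return [
        [1, 0, 0, 0, 0],
        [q, 0, 1 - q, 0, 0],
        [0, p, 0, 1 - p, 0],
        [0, 0, q, 0, 1 - q],
        [0, 0, 0, 0, 1],
    ]

def mat_mul(a, b):
    """Plain row-by-column matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, k):
    """Repeated multiplication; fine for small k such as 74."""
    result = [[float(i == j) for j in range(len(m))] for i in range(len(m))]
    for _ in range(k):
        result = mat_mul(result, m)
    return result

def still_going(p, q, points=74):
    """Probability that a game starting at deuce is unfinished after `points` points."""
    deuce_row = mat_pow(transition_matrix(p, q), points)[2]
    return sum(deuce_row[1:4])

for p, q in [(2 / 3, 2 / 3), (0.5, 0.5), (2 / 3, 0.25), (0.90, 0.05)]:
    print(p, q, f"{still_going(p, q):.1e}")
```

Because the game returns to deuce every two points with probability p(1 – q) + (1 – p)q, each result should equal that quantity raised to the 37th power, which gives an independent check on the matrix computation.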


Other sports with recurrence

Baseball

A game cannot end in a tie so additional whole innings are played until there is a winner. The longest professional baseball game was a 33-inning affair played in 1981 at McCoy Stadium in Pawtucket, Rhode Island.

Notes.

Cal Ripken, Jr. was 2 for 13 for Rochester in this game.

Wade Boggs went 4 for 12 for Pawtucket.


Other sports with recurrence

Baseball

An “at bat” can last any number of pitches. We can list the possible states as counts of 0-0, 0-1, 1-0, 1-1, 0-2, 2-0, 1-2, 2-1, 3-0, 2-2, 3-1 or 3-2, base hit, strike out, or base on balls. We can then relate the probability p of getting a hit on any given pitch with the official batting average.

There are no official records for number of pitches in an “at bat,” but here is some baseball lore:

• Alex Cora had an 18-pitch at bat against Matt Clement in 2004.
• Roy Thomas (1901) supposedly had a 29-pitch at bat. His ability to foul away pitches supposedly brought about a rule change re: foul balls.
• Luke Appling supposedly fouled off 17 straight pitches before hitting a triple.
• Phillies’ pitcher Brett Myers had a 9-pitch at bat against CC Sabathia in the 2008 playoffs.


More tennis esoterica

Most games in a singles match before the introduction of the tiebreaker: In 1969 at Wimbledon, Pancho Gonzales took 112 games to defeat Charlie Pasarell in the first round 22–24, 1–6, 16–14, 6–3, 11–9.

Most games in a singles match after the introduction of the tiebreaker: In 2003 at the Australian Open, Andy Roddick took 83 games to defeat Younes El Aynaoui in the quarterfinals 4–6, 7–6(5), 4–6, 6–4, 21–19.

Most games in a doubles match before the introduction of the tiebreaker: In the American Zone Final of the 1973 Davis Cup, the United States team of Stan Smith and Erik Van Dillen took 122 games to defeat the Chile team of Patricio Cornejo and Jaime Fillol 7–9, 37–39, 8–6, 6–1, 6–3.

Most games in a doubles match after the introduction of the tiebreaker: In 2007 at Wimbledon, the team of Marcelo Melo and André Sá took 102 games to defeat the team of Paul Hanley and Kevin Ullyett 5–7, 7–6(4), 4–6, 7–6(7), 28–26.


References

Math Awareness Month at http://www.mathaware.org

Tennis Statistics at http://www.atpworldtour.com/

Baseball Statistics at http://www.baseball-reference.com/

Doug Ensley, [email protected]