Step to the Rear
Will everyone here kindly step to the rear
And let a winner lead the way
Here's where we separate
The notes from the noise
The men from the boys
The rose from the poison ivy
How many games does it take to determine a winner?
The 2004 Boston Red Sox beat the 2004 New York Yankees in Major League Baseball’s American League Championship Series, winning four games in an epic seven-game series. But were the Red Sox the better team? And why seven games? Why not a 1-, 2-, 3-, 4-, 5-, 6-, 8- or 9-game series? On any given Sunday, any team can win a game, and all that jazz.
So how many games does it take before you can be certain that the winner of a series of games IS the better team? In a one-game series there are two possible outcomes: Win or Lose. If a win is worth 1, a loss is worth 0, and both outcomes are equally likely, the mean is 0.5. The square root of the variance of those outcomes (the standard deviation) is also 0.5. If there were an infinite number of games, and both outcomes, Win and Lose, are equally likely, then the mean is always equal to half the number of games. If that mean is subtracted from the number of wins, so that the number of points is centered on zero, the distribution looks like the figure below.
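To make the mean and the spread concrete, here is a minimal sketch in Python. It assumes that every game in the series is played, that games are independent, and that each team wins any single game with a fixed probability. The function name `series_stats` and the parameter `p_win` are made up for this illustration.

```python
from math import comb, sqrt

def series_stats(n_games, p_win=0.5):
    """Mean and standard deviation of the number of wins in n_games.

    Assumption (not from the article): all n_games are played and each
    game is an independent coin flip with win probability p_win.
    """
    # Binomial probability of finishing with exactly k wins.
    probs = [
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(n_games + 1)
    ]
    mean = sum(k * pk for k, pk in enumerate(probs))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return mean, sqrt(variance)

if __name__ == "__main__":
    for n in (1, 7):
        mean, sd = series_stats(n)
        print(f"{n}-game series: mean wins = {mean}, std dev = {sd:.3f}")
```

With equally matched teams the mean comes out as half the number of games, and the standard deviation grows with the square root of the number of games.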
The problem is that we can’t afford to play an infinite number of games; we can only afford to play a finite number. For a finite series with an odd number of games, the mean is equal to the median, but neither is an integer number of games. By contrast, for a finite series with an even number of games, the mean is an integer, so the two teams can finish with the same number of wins. That tie could occur by chance, and thus the outcome cannot decide who is better.
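The odd-versus-even point can be checked with a few lines of Python. This is only a sketch under the same coin-flip assumption (every game is played, games are independent, fixed win probability); `tie_probability` is a hypothetical helper name.

```python
from math import comb

def tie_probability(n_games, p_win=0.5):
    """Probability that both teams finish with the same number of wins.

    Assumption: all n_games are played and each game is independent.
    """
    if n_games % 2 == 1:
        return 0.0  # an odd number of games cannot split evenly
    half = n_games // 2
    # Exactly half the games won by each side.
    return comb(n_games, half) * p_win**half * (1 - p_win) ** half

if __name__ == "__main__":
    for n in range(1, 10):
        print(f"{n} games: P(tie) = {tie_probability(n):.4f}")
```

For evenly matched teams a two-game series ends level half the time, and the chance of a tie shrinks only slowly as more even-numbered games are added.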
Infinity is both odd AND even. Its median is its mode is its mean, which is one of the hallmarks of a normal distribution. However, the other hallmark of a normal distribution is that it satisfies the 68/95/99 rule: roughly 68%, 95% and 99.7% of outcomes fall within one, two and three standard deviations of the mean. And with an infinite number of games you can satisfy this. But what is the fewest number of games in a series that satisfies the 68/95/99 rule? You also want to discard all even-game series because there should only be one winning outcome, not two.
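One way to compare a short series against the 68/95/99 rule is to add up the probability that the number of wins lands within one, two and three standard deviations of the mean. The Python sketch below does that under the same assumptions as before (all games played, independent games, fixed win probability); the name `coverage` is invented for the example. Because win totals are whole numbers, the result depends on whether outcomes sitting exactly on a boundary are counted, and this sketch counts them.

```python
from math import comb, sqrt

def coverage(n_games, p_win=0.5):
    """Probability mass within 1, 2 and 3 standard deviations of the mean.

    Assumption: wins follow a binomial distribution (all games played,
    independent games). Outcomes exactly on a boundary are included.
    """
    mean = n_games * p_win
    sd = sqrt(n_games * p_win * (1 - p_win))
    probs = [
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(n_games + 1)
    ]
    return tuple(
        sum(pk for k, pk in enumerate(probs) if abs(k - mean) <= z * sd)
        for z in (1, 2, 3)
    )

if __name__ == "__main__":
    for n in (1, 3, 5, 7, 9):
        within = ", ".join(f"{c:.3f}" for c in coverage(n))
        print(f"{n} games: within 1/2/3 std dev = {within}")
```

Comparing each row with 0.68, 0.95 and 0.997 shows how closely a given series length tracks the rule.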
A seven-game series satisfies more of the 68/95/99 rule. The 2004 ALCS was a seven-game series, and the Red Sox were thus the better team. Ain’t statistics wonderful!