
Statistical Error:
Baseball, Bayes, and information quality

10 Aug 2009

Wall Street Journal's "Numbers Guy" Carl Bialik writes in an August 7 blog post about the claim made by Sky Andrecheck that baseball would be no different if the numbers of balls and strikes were reduced to three and two, respectively:

Specifically, writing on Baseball Analysts, [Andrecheck] presents data suggesting that a game where three balls earned a batter a walk but two strikes ends his at bat would have very similar outcomes to what we know as baseball, but get to those outcomes a lot faster — and with fewer pitching changes.

Andrecheck commits an elementary statistical error, and he incorrectly assumes that the at-bat data are reliable.

First, the statistical error.

Andrecheck treats each pitch as an independent event rather than as part of a series of dependent events leading to a measurable outcome (the "at bat"). Each pitch during an at bat may be unique, but the behavior of both batters and pitchers depends on the existing four-ball, three-strike rule. There is no theoretical reason to believe that batters and pitchers would behave the same under different rules. For example, with fewer balls to give up, pitchers would be less inclined to waste pitches outside the strike zone and batters would be more defensive. The data on which Andrecheck relies are the product of conditional probabilities: what happens on each pitch depends on the count, and both pitcher and batter update their approach as the count changes, in the manner Bayesian analysis describes.

But Andrecheck has applied a classical frequentist approach that treats each pitch in an at bat as an independent event. This is wrong because only at bats matter in how the game is scored, not individual pitches within an at bat. There is a lesson in this for baseball statistics: Just because something is measurable doesn't mean it's important.
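To make the dependence argument concrete, here is a minimal sketch of a toy at-bat model. It is not Andrecheck's method. It assumes every pitch is either a ball or a strike (fouls and balls in play are ignored), and every probability in it is a hypothetical number chosen only for illustration. The function name walk_probability and the two behavior functions are likewise made up for this sketch.

```python
from functools import lru_cache


def walk_probability(max_balls, max_strikes, p_ball):
    """Chance an at bat ends in a walk in a toy ball-or-strike model.

    p_ball(balls, strikes) gives the chance the next pitch is a ball
    at that count, so pitch outcomes may depend on the count.
    """
    @lru_cache(maxsize=None)
    def from_count(balls, strikes):
        if balls == max_balls:
            return 1.0   # walk
        if strikes == max_strikes:
            return 0.0   # strikeout
        p = p_ball(balls, strikes)
        return (p * from_count(balls + 1, strikes)
                + (1 - p) * from_count(balls, strikes + 1))

    return from_count(0, 0)


# HYPOTHETICAL: pitchers throw a ball 35% of the time at every count.
def same_behavior(balls, strikes):
    return 0.35


# HYPOTHETICAL: under a 3-ball rule, pitchers bear down when one ball
# away from a walk and throw a ball only 25% of the time.
def adapted_behavior(balls, strikes):
    return 0.25 if balls == 2 else 0.35


current_rule = walk_probability(4, 3, same_behavior)
short_rule_same = walk_probability(3, 2, same_behavior)
short_rule_adapted = walk_probability(3, 2, adapted_behavior)

print(f"4 balls/3 strikes, unchanged behavior: {current_rule:.3f}")
print(f"3 balls/2 strikes, unchanged behavior: {short_rule_same:.3f}")
print(f"3 balls/2 strikes, adapted behavior:   {short_rule_adapted:.3f}")
```

The only point of the sketch is that the answer for the shortened rule depends on which behavior is assumed, and that assumption cannot be read off data collected under the current rule.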

Regarding the current rule, Andrecheck says "there's no real rhyme or reason" why the current 4-ball/3-strike rule was chosen. The rule "simply worked well and over time they became tradition." This is an internally inconsistent argument refuted by Andrecheck's own report:

The rules weren't always the same. In 1879, the rules were originally nine balls for a walk. The number of balls for a walk were gradually reduced to four balls to a walk by 1889. The number of strikes for an out was also temporarily changed in 1887 from three strikes to four. For the last 120 years however, the rules have been the same.

Of course, today nine balls to a walk sounds ludicrous - pitchers would simply dally and work around the strike zone trying to get a batter to chase a pitch outside, leading to interminable at-bats and increasingly long games. Clearly, reducing the number of balls required for a walk was a wise move and the same goes for reducing the number of strikes from four to three. But did the founders of the game go far enough?

To say the current rule "worked well" means that it was the result of trial and error, like the size and composition of the baseball and the choice of 90-foot base paths. (Consider how the game would be different if they were 30 meters apart instead.)

Andrecheck says that if the founders of baseball had continued to reduce the number of balls to three and strikes to two, little would be different. To reach this conclusion, he compares current at-bat outcomes with what happens when the count is 1-1. Andrecheck interprets a reduction in batting average from .268 to .250 as "hardly drastic," but it's not obvious that a reduction of about 7% would be trivial to the game or make the game more appealing to fans.
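The 7% figure is the relative drop from .268 to .250. A short sketch of the arithmetic follows; the 500 at-bat season is an illustrative assumption, not a number from Andrecheck's article.

```python
def relative_change(before, after):
    """Fractional change from before to after (negative means a drop)."""
    return (after - before) / before


drop = relative_change(0.268, 0.250)
print(f"Relative change in batting average: {drop:.1%}")

# ASSUMPTION: a hypothetical season of 500 at bats, for scale only.
AT_BATS = 500
hits_lost = round(0.268 * AT_BATS) - round(0.250 * AT_BATS)
print(f"Hits lost over {AT_BATS} at bats: {hits_lost}")
```

Under that assumption, the lower average costs a batter 9 hits over the season.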

But all of Andrecheck's analysis is suspect because batters and pitchers have no reason to behave the same with a 1-1 count under a 3-ball/2-strike rule as they do with a 1-1 count under the current rule. Andrecheck's own data show that batters and pitchers behave very differently on 1-1 counts than on 2-2 counts:

Outcome of Pitch
2007 data

Count   Ball    Strike   2-Strike Foul   In Play   Strike Zone
1-1     35.4%   41.6%    ---             23.0%     64.6%
2-2     29.6%   17.6%    24.8%           28.1%     70.4%

Reliable inferences cannot be made about what would happen under a 3-ball/2-strike rule using existing data. Yet Andrecheck draws a long list of inferences that implicitly assume batter and pitcher behavior would be no different if the rules were different.
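The gap between the two rows of the 2007 table can be tallied directly. The sketch below uses only the percentages in the table; the dictionary layout and function names are assumptions made for illustration, and the "---" entry is entered as 0.0 on the assumption that a two-strike foul cannot occur with one strike.

```python
# Per-pitch outcome shares (percent) from the 2007 table.
# ASSUMPTION: "---" (2-strike foul at 1-1) is entered as 0.0.
OUTCOMES = {
    "1-1": {"ball": 35.4, "strike": 41.6, "two_strike_foul": 0.0, "in_play": 23.0},
    "2-2": {"ball": 29.6, "strike": 17.6, "two_strike_foul": 24.8, "in_play": 28.1},
}


def row_total(count):
    """Sum of the outcome shares for one count, rounded to one decimal."""
    return round(sum(OUTCOMES[count].values()), 1)


def difference(count_a, count_b):
    """Percentage-point gap (count_a minus count_b) for each outcome."""
    a, b = OUTCOMES[count_a], OUTCOMES[count_b]
    return {outcome: round(a[outcome] - b[outcome], 1) for outcome in a}


for count in OUTCOMES:
    print(count, "row total:", row_total(count))

for outcome, gap in difference("1-1", "2-2").items():
    print(f"{outcome}: {gap:+.1f} points")
```

The rows sum to roughly 100% (the 2-2 row is off by a rounding tenth), and the share of balls differs by almost six percentage points between the two counts.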

Now, the information quality error.

During the 120-year history of the current rule, many changes have been made that affect the outcome of at bats. Possibly the two most important changes are the lowering of the pitcher's mound and the shrinkage of the strike zone. In 1968, Bob Gibson won 22 games (13 by shutout) with an ERA of 1.12 and 268 strikeouts in 304-2/3 innings. At least 25 NL pitchers and 22 AL pitchers finished that season with ERAs below 3.00.

In 1969, the mound was lowered from 15 to 10 inches, and the number of sub-3.00 ERAs declined to 13 in the NL and 11 in the AL. (Gibson was less affected than most by the change. He pitched 314 innings and struck out 269, and had an ERA of 2.18.)

Still, the significance of the height of the pitcher's mound has been well recorded:

The height of the mound has not been constant, or even well defined, through baseball history. Before 1893, the pitcher threw from a pitcher's box, which worked better with a level surface rather than a sloped one. In 1893, the pitching distance was changed, and the box was replaced with the pitcher's rubber. Pitchers discovered that they could get more speed on the ball if they were allowed to stride downhill, so their groundskeepers would provide them with a mound. Those early mounds were not regulated; in Pitching in a Pinch, Christy Mathewson commented that the height of the mound might be changed from day to day to suit the pitching style of the home team's pitcher.

The regular changing of mound height was eventually prohibited. Teams settled on a height of 15 inches for the mound. Despite this regulation, some teams were accused of using a higher than regulation height mound; Dodger Stadium was particularly notorious for having a high mound. Following the incredibly low scoring in 1968, the rules were changed to reduce the mound to the contemporary 10 inch height. Some accusations of gamesmanship with mounds continue, usually with visiting teams complaining that the mounds in the visitor's bullpen don't match the mound of the field, so that relievers entering the game aren't properly adapted to the game mound.

As for the strike zone, there is ample evidence that it has varied greatly over the years even though the formal rule hasn't changed. Statistical evidence has been reported indicating a home field bias among some umpires. Writing at BaseballGuru.com, John B. Holway discusses how batting averages vary by umpire:

[Richard] Kitchin found that umpires can add or subtract 15 points or more to a man’s batting average. Against Dave Phillips batters hit .271; Mike Reilly held them to .250. That’s almost a ten-percent difference, enough to change a .300-batter into a .315 hitter, or a .285 hitter.

Another scholar, David Driscoll, found an even greater disparity. He charted every umpire in the Blue Jays’ 1986 schedule and found that Phillips gave up a .301 average; Joe Brinkman, .215. That’s an almost 40-percent difference. Brinkman could turn a man’s [.300]-average into .240.

In 1990 Stanley Kaplan charted all Mets’ games and reported that when Joe West was behind home, the opponents out-hit the Mets by 67 points. When Charlie Williams was, the Mets out-hit their foes by 76 points.

Williams called one of the worst games in World Series history in 1993 between the Phillies and Blue Jays. TV commentator Tim McCarver repeatedly called for the over-head camera to replay Williams’ pitches in slow motion. It certainly seemed as if his “strikes” were 18 inches wide of the plate. Toronto won the game and Series. I don't know which team would have won if Williams had used the rule-book strike zone.

In the 1997 NL playoffs, umpire Eric Gregg also moved home plate a foot and a half west. Florida’s Livan Hernandez whiffed a record 15 batters, and his Florida Marlins won the game and Series. Would the Braves be wearing World Series rings with another umpire? Sometime around 1984 the umpires changed “baseball” into “low ball.” They took the strike zone and laid it on its side.

...

They took the bat out of the hands of hitters like Jimmie Foxx, who murdered high pitches. They also took the ball out of the glove of pitchers like Jim Palmer, who thrived on high heat at the throat. Luckily Jim retired that year. If he had been just starting out, he would have walked the first four men in spring training on 16 balls and would never have been heard of again.

Meantime, how many of Greg Maddux’ and Tom Glavine’s 300 victories were won on that now huge lower outside corner?

Both pitchers and batters take into account the identity of the umpire. It has been widely reported that managers consider which umpire will be calling balls and strikes when deciding which pitchers to use in short series such as the playoffs and World Series. These decisions are not necessarily superstitious.

As long as umpires, not machines, are calling balls and strikes, pitch-by-pitch data such as Andrecheck uses must be interpreted with great care.
