Springtime for Hitters Has Pundits on Wrong Trail
This article is from the archive of The New York Sun before the launch of its new website in 2022. The Sun has neither altered nor updated such articles but will seek to correct any errors, mis-categorizations or other problems introduced during transfer.

As many fans expected, the Toronto Blue Jays are scuffling through the first two weeks of the 2006 season, playing .500 ball and hanging around at the bottom of a deep AL East. Their $100 million off-season investment in pitching has yielded only 10 innings of work so far, as A.J. Burnett has made just one start and reliever B.J. Ryan pitches about as often as a modern closer does. The back end of the rotation has suffered from the winter’s defensive downgrades, especially Josh Towers, who’s allowed 25 hits in 12 2/3 innings.
The good news for Jays fans is that the team’s offense, which also underwent a costly makeover, is averaging six runs a game; the bad news is it’s not sustainable. They’re not going to hit .321 all year. Look for Toronto to stay within a handful of games of .500 throughout the season, and be disappointed by its final record.
Out in Oakland, the A’s were off to an unimpressive 6-7 start heading into last night’s game – though it had them tied for first in the shallow AL West. They’ve allowed more than five runs a game, a figure that belies a stat line showing them to have the highest strikeout rate in the AL, a better than 2-to-1 strikeout-to-walk ratio, and the fourth fewest home runs allowed in the league. That 5.26 ERA is going to fall, and when it does, the A’s will get separation in the West on their way to a division title.
These assessments of two of the AL’s most promising teams are factually accurate, deceptively analytical … and a load of crap.
It helps to be a bit dogmatic about the idea of not drawing conclusions from small sample sizes. The statistical reason is that performance in baseball, by both teams and players, varies widely during the course of the season. Two weeks of play simply isn’t enough time for underlying ability to shine through the variance, rendering the data essentially unusable.
The thornier problem, though, is confirmation bias, which is the human tendency to assign importance to the data that fits one’s hypothesis and dismiss the data that undermines it.
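To put a rough number on that variance, here is a minimal sketch. It treats every game as an independent coin flip with a fixed win probability, which is a simplifying assumption and not a claim about how real teams behave. The function name `win_pct_std_error` is made up for illustration. It compares the spread in winning percentage after 13 games (the length of Oakland's 6-7 start) with the spread over a full 162-game season.

```python
import math


def win_pct_std_error(true_pct: float, games: int) -> float:
    """Standard deviation of observed winning percentage over `games` games.

    Assumes independent games with the same win probability each time
    (a binomial model), which is a simplification.
    """
    return math.sqrt(true_pct * (1.0 - true_pct) / games)


# Compare a two-week sample with a full season for a true .500 team.
for games in (13, 162):
    spread = win_pct_std_error(0.500, games)
    print(f"{games:>3} games: about +/- {spread:.3f} in winning percentage")
```

Under these assumptions, one standard deviation after 13 games is about .139 of winning percentage, or nearly two wins in either direction, against roughly .039 over 162 games. A 6-7 record fits a wide range of true talent levels.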
The Blue Jays are off to a .500 start and playing lousy defense behind a contact staff? That’s to be expected when you overspend on so-so free-agent pitchers and trade away a great glove man like second baseman Orlando Hudson. And that six runs a game they’re scoring (in part due to Troy Glaus, who was acquired for Hudson, and his 1.037 OPS)? That won’t last. The A’s, however – their 6-7 start isn’t something to be taken seriously. The core talent is very good, and they’ve just been unlucky in how many runs they’ve allowed in the early part of the season.
It’s insidious and, quite frankly, a lot more dangerous in the work of statistical analyzers, like those of us at Baseball Prospectus, who lard arguments with information and data rather than random opinions about character and fortitude and clutch and heart and spleen. It’s easier to see through mainstream “analysis,” with its daily level of nonsense, but when a performance analyst is pointing to numbers and making a case, it’s harder to see through the biases.
Think about last season, when even at the All-Star break I was still dismissing the Chicago White Sox, focusing on their record in one-run games rather than their terrific defense and functional, if wildly misunderstood, offense. All of the points I made about the Sox were true, but the team’s record in one-run games meant more to me because it fit my preconception that they were a sub-.500 team that was getting lucky. They were actually a .565 team that was getting lucky, but because I dismissed the information that supported the idea, I didn’t see their true ability. That’s confirmation bias.
Perhaps the most notable example of this in the early part of the 2006 season is Barry Bonds. Bonds is hitting .192 AVG/.488 OBA/.269 SLG with no home runs in 26 at-bats, amid a press circus and with knee and elbow problems.
If you’re inclined to believe that Bonds is simply going through a normal decline, you point to the surgically-repaired knees, the bone chips in his elbow, and the meager sample size. You note that Bonds played the first two weeks in three of the lousiest hitting environments in baseball, much of the time spent in cool, damp weather, and that he’s dealing with a level of scrutiny we don’t impose on nominees for White House Cabinet positions.
If, instead, you believe that Bonds is a steroid-using cheater who has been scared off of the juice by new penalties, a book about his usage, and the opprobrium of the American populace, you look at the goose egg in the home-run column, the sub-Mendoza batting average, and the handful of warning-track flyballs, and you snicker. Bonds is done, you say; he’s stopped using steroids, which were the only reason he played so well so late in his career.
In either case, you’re wrong. There’s not enough evidence to support either conclusion at this point, and in choosing your data, you’re guilty of confirmation bias. Bonds may be done, or he may just be going through an injury-enhanced slump. We won’t know for some time, and that’s the only reasonable conclusion.
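A quick back-of-the-envelope check shows how little 26 at-bats can say. A .192 average in 26 at-bats works out to 5 hits. The sketch below asks how often a hitter with an assumed true average of .300 would collect 5 or fewer hits in 26 at-bats by chance alone. The .300 figure and the independence of at-bats are illustrative assumptions, not estimates of Bonds's actual ability, and `prob_at_most` is a hypothetical helper name.

```python
import math


def prob_at_most(hits: int, at_bats: int, true_avg: float) -> float:
    """Binomial probability of `hits` or fewer hits in `at_bats` at-bats.

    Assumes each at-bat is an independent trial with the same hit
    probability `true_avg` (a simplification).
    """
    return sum(
        math.comb(at_bats, k) * true_avg**k * (1.0 - true_avg) ** (at_bats - k)
        for k in range(hits + 1)
    )


# 5 hits in 26 at-bats is a .192 average; .300 is an assumed true talent level.
chance = prob_at_most(5, 26, 0.300)
print(f"Chance of 5 or fewer hits in 26 at-bats: {chance:.3f}")
```

Under these assumptions the chance comes out to roughly one in six, which is far too common to separate "done" from "slumping."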
Being aware of confirmation bias is important at all times, but it is especially so early in the season, when you can pretty much find information to support any conclusion you care to draw, and dismiss that which doesn’t suit you as “small sample size.” It’s all small sample size, which is why, as the season goes on, you should take any analysis that emphasizes performance in April with an enormous grain of salt. It’s definitely got a sample-size problem, and it most likely comes with a confirmation bias.
Mr. Sheehan is a writer for Baseball Prospectus. For more state-of-the-art commentary and information, visit www.baseballprospectus.com.