Spring Chaos with Graph

As Bryant kindly reminded us yesterday, it is, indeed, baseball season. For the next month, players will attempt to re-familiarize themselves with live action, conditioning their senses once again to react to that little white ball stitched 108 times over. Managers will experiment with various lineups and give many pitchers the opportunity to show their stuff, not only to find reasons to keep them on the major league squad, but also to avoid injury to those already locked into the opening-day roster.

It’s an exciting time of year to be a baseball fan. Everyone is loss-less. We get to witness kids get a crack at the show, and veterans try to ease back into another season. But what should we, as fans, take out of the spring league stats? If Saunders slashes .450/.550/.700, should we care?

Spoiler: we shouldn’t care very much.

Purely from a probability standpoint, there is a lot of variance in a sample of even 100 plate appearances—the most anyone will get this spring season. For instance, while a player who generally walks about 15% of the time is expected to walk 15 times in 100 plate appearances, the likelihood that he actually walks exactly 15 times is pretty slim. His 95% prediction interval—a range he will fall into with about 95% probability—is more like anywhere from 8 to 22 walks in 100 trips to the plate, or 8% to 22%. That’s a large spread! What this means is that we can’t be too sure how reliable spring training numbers really are. The difference between batting .270 and .300 is huge over the course of a career, but in spring training, that’s about 3 hits. Three broken-bat bloopers, and you can fool a naïve fan base into thinking you’re a .300 hitter.
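
For anyone who wants to check the walk-rate arithmetic, here is a minimal sketch that treats each plate appearance as an independent coin flip with a fixed 15% walk probability. That independence is a simplifying assumption, and the function names are just illustrative. It uses the normal approximation for the interval and the exact binomial formula for the probabilities.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_approx_interval(n, p, z=1.96):
    """Approximate 95% prediction interval for the number of successes."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean - z * sd, mean + z * sd

n, p = 100, 0.15  # 100 plate appearances, 15% walk rate (from the text)

low, high = normal_approx_interval(n, p)
prob_exactly_15 = binom_pmf(15, n, p)
prob_8_to_22 = sum(binom_pmf(k, n, p) for k in range(8, 23))

print(f"Approximate 95% interval: {low:.1f} to {high:.1f} walks")
print(f"Chance of exactly 15 walks: {prob_exactly_15:.3f}")
print(f"Chance of 8 to 22 walks:    {prob_8_to_22:.3f}")
```

Swapping in a different `n` shows how quickly the interval tightens as a share of plate appearances once the sample grows to a full season.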

I took a look at some spring training stats over the last three seasons to see if they were at all predictive of regular season performance. I took only the players who racked up enough spring at bats to be in the top 20 in each of the last three spring seasons. In other words, these are the guys with the largest spring sample sizes. First, I looked at the basic linear correlation between spring stats and season stats for batting average, OBP, slugging, K-rate and walk rate. The only correlations that were even significant came in strikeout rate and slugging percentage, and they weren’t all that strong (r = .33 and .48, respectively). Some research suggests that walks should be more predictive than slugging in small sample sizes, so this may just mean that hitters aren’t too worried about exercising their patience during the spring. But regardless of the reason, there’s still a poor correlation between spring ball stats and regular season stats.
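
To make "not all that strong" concrete, here is a rough sketch of the kind of calculation involved. It computes a plain Pearson correlation and the share of variance it explains (r squared). The function names are illustrative, and the paired rates in the example are made-up placeholders, not real player stats; only the r values of .33 and .48 come from the analysis above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson linear correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def variance_explained(r):
    """Share of the variation in one stat accounted for by the other."""
    return r * r

# Made-up placeholder strikeout rates, only to show the call shape.
spring_k = [0.18, 0.22, 0.15, 0.30, 0.25]
season_k = [0.20, 0.19, 0.17, 0.24, 0.26]
print(f"r for the placeholder data: {pearson_r(spring_k, season_k):.2f}")

# The correlations reported in the text.
for label, r in [("strikeout rate", 0.33), ("slugging", 0.48)]:
    print(f"{label}: r = {r:.2f}, variance explained = {variance_explained(r):.0%}")
```

Squaring the reported values puts the spring numbers at roughly a tenth of the season-to-season variation in strikeout rate and a bit under a quarter for slugging, which is why neither is much use for prediction.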

For every Miguel Cabrera (who slashed .356/.397/.603 and followed it up during the season with an equally productive .328/.420/.622) there were two Chone Figgins (Figginses?) who teased us with a .373/.448/.490 line… just last spring. I don’t need to remind us what he did during the season. Spring training is experimenting time. Managers experiment, hitters experiment, pitchers experiment. But the real driving force behind the lack of any helpful, predictive spring stats is small sample size.

I cut the sample down–at the risk of some sampling bias–to just players who stepped up to the plate at least 500 times the following season. These are the guys that were likely slated to start from day one, and didn’t fall off a cliff during the season. Again, the only significant correlations came in the strikeout and slugging categories, and they weren’t strong enough for any useful predictions. Fun graph included, as promised.

So as fans, we should take spring ball for what it is: a chance for players to gear back up for the season, and a chance for us to watch them do it without having to worry about falling below .500. But there is not a lot to be taken from the readily available stats. Someone is going to unexpectedly scorch the ball this spring for the M’s. It might even be Chone Figgins. We should only care because it’s fun to watch someone who’s made us suffer play well, not because it foretells any career revival.
