King Felix and Predicting ERA


After posting back-to-back seasons with ERAs under 2.50, Felix Hernandez recorded a 3.47 ERA in 2011, good for just 18th in the AL among starters with at least 100 innings pitched. So what gives?

I don’t think I need to go into too much detail around here on why ERA is not the best thing to be looking at in small sample sizes. With 500+ innings of data, it has been shown that ERA actually does become a respectable estimator of future ERA. However, the King pitched 233 innings last season, a hell of a workload for one year, but not a big enough sample for accurate forecasting. The best estimators based on just one season of data are, as you might assume, SIERA, xFIP and FIP. These are three ERA estimators based on peripheral statistics like strikeout and walk rates. Felix’s 2011 version of those stats went 3.22, 3.15 and 3.13, respectively. His three-year average ERA sits at 2.73. So which is it? About 2.7, 3.2, or somewhere in between?
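
For anyone who hasn't seen one of these estimators written out, here is a minimal sketch of FIP, the simplest of the three. The formula is the commonly published one. The stat line and the 3.10 league constant are made-up placeholders (the real constant shifts a little every season), not Felix's actual numbers.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from home runs, walks, hit batters,
    strikeouts and innings pitched.

    `constant` rescales the result to the league ERA; 3.10 is an assumed
    placeholder, not the 2011 value.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, for illustration only.
example = fip(hr=20, bb=60, hbp=5, k=200, ip=200.0)
print(f"FIP: {example:.2f}")
```

Notice that nothing in the formula depends on what happens to balls in play.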

There is one key figure that is ignored in the first three statistics, but heavily weighted in a player’s ERA. That, of course, would be BABIP. The ERA estimators are based on linear formulas which assume the pitcher has little, if any, control over his BABIP.* But some individuals (oh hey, Matt Cain!) have shown an ability to influence their own BABIPs to some degree, and thus these estimators are shaky in their predictions. The combination of pitching style and ballpark needs to be taken into account. So here we go!

*SIERA attempts to account for some BABIP variation using interaction terms in the regression formula. However, it struggles to capture the complexity of things like batted ball data.

Let’s kick things off with the King’s basics: strikeouts and walks.

The ability to strike hitters out while avoiding walks is obviously important to pitching well, but as you can see, these are also very predictive stats. Hernandez’s K and walk rates from 2009 and 2010 would have been nearly perfect estimators for his 2011 campaign. In this department, he performed no worse.

Moving on to his batted ball profile…


Last season, he saw his line drive rate increase while his groundball rate decreased. These are not good signs for a pitcher’s run prevention, and likely played a role in spiking Felix’s BABIP and homerun rate—two of the three primary culprits in his ballooning ERA.

The other major reason for the increase in Felix’s ERA came in the form of stranded runners, or lack thereof. His left-on-base percentage (LOB%) checked in at just 72.7% last season, 4.3 percentage points worse than his 2008-2010 average of nearly 77%. That may not seem like a big difference, but consider this: Felix allowed 292 runners to reach base on hits, walks and hit-by-pitches. 4.3% of 292 comes to roughly 12 or 13 additional runs scored, and over 233 innings that means an ERA about half a run higher.
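
The back-of-the-envelope math here is easy to check. This is a rough sketch using only the figures quoted above, and it makes the same simplifying assumption the paragraph does: every additional runner who fails to be stranded counts as one earned run. The function names are just mine for illustration.

```python
def extra_runs_from_lob_drop(baserunners, lob_drop_points):
    """Extra runners who score when LOB% falls by `lob_drop_points`
    percentage points (simplification: one unstranded runner = one run)."""
    return baserunners * lob_drop_points / 100.0


def era_change(extra_runs, innings):
    """Convert a change in earned runs into a change in ERA (runs per nine)."""
    return 9.0 * extra_runs / innings


# Figures from the text: 292 baserunners, a 4.3-point LOB% drop, 233 innings.
lob_runs = extra_runs_from_lob_drop(292, 4.3)
lob_era = era_change(lob_runs, 233)
print(f"extra runs: {lob_runs:.1f}, ERA change: {lob_era:.2f}")
```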

So we have some idea of where the ERA spike came from—a few more dingers, an increased BABIP and a greater proportion of baserunners scoring—but which of these factors, if any, foreshadow another season like 2011?

The decreased groundball rate looks pretty important. The drop from 53.9% in 2010 to 50.2% is significant enough to indicate that it’s probably not just some random error in categorizing contact. FanGraphs’ PITCHf/x data also informs us that Felix threw his sinker 4% less often in 2011 than the year before, giving us more reason to believe that his higher flyball and line drive numbers represent true changes. However, pitching in Safeco, this is not likely to affect his homerun rates all that much. Last year he allowed just two more homers than in 2010. The theoretical difference in homeruns should also be about two, since we’re talking about roughly 10 more flyballs and 20 more line drives allowed over a whole season. One or two extra homeruns might raise an ERA by a tenth of a run, which is not enough to greatly affect his results.
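
To put a number on that "tenth of a run," here is a quick sketch. The 1.4 runs per homerun is my assumption (a commonly used linear-weights ballpark figure), not something taken from Felix's game logs; the 233 innings come from above.

```python
RUNS_PER_HR = 1.4  # assumed average run value of a homerun, not a measured figure


def era_bump_from_homers(extra_hr, innings, runs_per_hr=RUNS_PER_HR):
    """Approximate ERA increase caused by `extra_hr` additional homeruns."""
    return 9.0 * extra_hr * runs_per_hr / innings


for homers in (1, 2):
    bump = era_bump_from_homers(homers, 233)
    print(f"{homers} extra HR -> ERA +{bump:.2f}")
```

Under that assumption, two extra homers work out to roughly a tenth of a run of ERA, and one extra homer to about half that.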

His BABIP likely hurt his ERA more than anything, as it was also the catalyst in his low LOB%. In 2011, his teammates recorded an average BABIP of .287, below the AL average of .293. Felix, however, put up a .307 figure, twenty points higher than his teammates. Compounding his problems, his BABIP jumped to .312 with runners in scoring position. His career BABIP with nobody on is .290, and just .294 with RISP, so the RISP jump was expected, but his BABIP should never be that high to begin with.

Using batted ball data from FanGraphs, and regressing some of it back to his 2009 and 2010 levels, we can estimate an expected BABIP. Weighting my predictions more heavily toward recent seasons, I project the following: 17.8% LD, 52.0% GB, 22.3% FB, 7.9% IFFB. With a profile like that, the average starter (that made the minimum innings cut last season) would expect a BABIP around .293. Felix pitches in Safeco, with defensive juggernauts at both short and in center. A BABIP in the .285 range would be a perfectly reasonable projection, jumping possibly as high as .290 with RISP. What does this mean for his ERA? Well, shaving 17 BABIP points off his RISP situations saves him four or five hits, and potentially 5-10 runs in the form of LOB%.
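
For the curious, the general idea behind turning a batted ball profile into an expected BABIP can be sketched as a weighted average. The profile is the projection quoted above. The per-type hit rates are round-number assumptions of mine, not the league figures used for the .293 estimate, so the output depends entirely on them and should not be expected to match exactly.

```python
# Projected batted ball mix from the text (shares of balls in play).
PROFILE = {"LD": 0.178, "GB": 0.520, "FB": 0.223, "IFFB": 0.079}

# Assumed league-average BABIP by batted ball type: placeholders only.
ASSUMED_HIT_RATES = {"LD": 0.70, "GB": 0.24, "FB": 0.15, "IFFB": 0.02}


def expected_babip(profile, hit_rates):
    """Weighted average of per-type hit rates, weighted by the share of
    balls in play of each type."""
    total_share = sum(profile.values())
    if abs(total_share - 1.0) > 1e-6:
        raise ValueError("batted ball shares should sum to 1")
    return sum(share * hit_rates[kind] for kind, share in profile.items())


xbabip = expected_babip(PROFILE, ASSUMED_HIT_RATES)
print(f"expected BABIP: {xbabip:.3f}")
```

A park or defense adjustment like the one for Safeco would then be applied on top of this baseline.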

To summarize, a fair chunk of his ERA spike can be explained by things that are likely to regress. It would be fair to assume that the King’s BABIP levels from 2009 and 2010 were artificially low, but I’m sure all Mariners fans can still appreciate a drop back into the .280s. Some natural groundball regression should keep his homerun tally from rising any more, and stranding runners (both RISP and on first) at his career levels could shave another couple runs off his total.

All in all, a fair projection for next season’s ERA would be something on the good side of the FIPs. I’ll go ahead and set his median predicted ERA performance at 3.00, trimming last season’s figure by nearly half a run. While we didn’t really have to look at all the numbers to come to this conclusion (anyone pitching in Safeco would naturally see his ERA undercut his FIP), understanding BABIP and the ERA estimators is crucial to identifying why a pitcher did what he did, and what he’s likely to do in the future. And I like playing with numbers.

Special thanks, as always, to Baseball Reference and FanGraphs for their collections of statistics. Without them, my life would be more meaninglesser.