Baseball projections are what they're designed to be

Some thoughts on projection systems on St. Louis Cardinals ZiPS Eve.

Jerome Miron-USA TODAY Sports

Dan Szymborski will release the ZiPS projections for the St. Louis Cardinals at Fangraphs tomorrow. It's one of my favorite days of the year. Sure, the Steamer projections are already out and the Baseball Prospectus PECOTA projections are forthcoming, but over the years ZiPS Day has become the one I look forward to most for the Cardinals. This is especially true for the 2015 ZiPS projections, which will include fielding and, consequently, a more well-rounded zWAR.

Last week, The Hardball Times ran a piece by Jack Moore that compared those who develop the projection systems to the barus of ancient Mesopotamia and the projections themselves to divination:

The baru practiced what we now call "divination" and would read the movement of drops of oil along a bowl, or the movement of smoke rising from incense, the patterns of hot candle wax dropping from the wick, or even the shapes and colors of the organs of sacrificed animals. The baru's ritual was always preceded by a prayer, in which the baru asked a god to reveal his intentions through the chosen medium.

Moore cites three sources on the divination of ancient Mesopotamia, quotes Nate Silver from The Signal and the Noise, and quotes Rob Arthur's Baseball Prospectus article from earlier this month analyzing various projection systems. While the sources Moore looks to are somewhat balanced, he focuses on the methodology of the Mesopotamian barus to the near exclusion of the methodology of those who project baseball stats. Moore doesn't delve into the methods used in the field or the way in which results are analyzed. The seeming justification for this is that the formulae for some projection systems are not public knowledge. But the general principles behind the methodology are public, and Moore gives no space to them. "Sometimes the explanation is too loaded with jargon for a layman to be reasonably expected to understand," Moore remarks.

I've taken one math class since Algebra 2 in high school and it was the only class I dropped in college. My brain is not well-suited for mathematics so I understand how intimidating something like a projection system or the calculation of wins above replacement can seem. But, over the years, I've tried to learn the mathematical principles behind advanced baseball stats. Doing so has enhanced my enjoyment of the game I love and I try to pass along some of what I've learned here at VEB, to make concepts more accessible for Cardinals fans like me who aren't particularly adept at math. This is why I bristle when baseball writers more mathematically talented than I refer to non-stats-heavy baseball writing as the "humanities," seemingly with a hint of condescension. To me, baseball stats do not represent an either/or proposition.

My attempts to gain a better understanding of advanced stats make me take umbrage when someone more humanities-inclined seemingly throws his or her hands up in the face of these intricate mathematical formulas and declares "Divination!" When they do so using as their frame ancient Mesopotamia, a place and time I'm rather familiar with due to my days studying art history, I take issue. It's a false comparison.

A projection system isn't so much trying to tell you what will happen in 2015 as what a player's true talent level for 2015 is. Various systems do this in various ways. Most emphasize the most recent seasons. For example, Tom Tango's Marcel uses a player's three most recent seasons, giving the most weight to the most recent season, the second-most weight to the second-most-recent season, and the least weight to the third-most-recent season. ZiPS uses four seasons. Bill James can use up to eight seasons of input, but emphasizes the most recent three seasons. Most projection systems use age as an input, thus pulling a player's numbers toward the typical aging curve for ballplayers. They also tend to regress a player's stats, especially BABIP, toward the MLB mean.
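
To make the weighting and regression ideas concrete, here is a minimal sketch in Python of a Marcel-style projection for a single rate stat. It is an illustration under stated assumptions, not the actual code behind Marcel, ZiPS, or any other system: the 5/4/3 season weights and the 1,200 plate appearances of league-average ballast are the figures commonly cited for Marcel hitters, the function name and the sample player are hypothetical, and the age adjustment is left out entirely.

```python
def weighted_rate_projection(seasons, league_rate,
                             weights=(5, 4, 3), regression_pa=1200):
    """Project a rate stat from recent seasons, regressed toward league average.

    seasons: list of (successes, opportunities) tuples, most recent season first.
    league_rate: league-average rate for the same stat.
    weights and regression_pa are assumptions (values commonly cited for
    Marcel hitters), not taken from any system's published source code.
    """
    weighted_successes = 0.0
    weighted_opportunities = 0.0
    for (successes, opportunities), weight in zip(seasons, weights):
        # More recent seasons get a larger weight.
        weighted_successes += weight * successes
        weighted_opportunities += weight * opportunities

    # Regression to the mean: mix in a fixed amount of league-average play.
    total_successes = weighted_successes + regression_pa * league_rate
    total_opportunities = weighted_opportunities + regression_pa
    return total_successes / total_opportunities


# Hypothetical hitter: times on base and plate appearances, most recent first.
recent_seasons = [(220, 600), (200, 600), (180, 600)]
league_obp = 0.320

raw_rate = (5 * 220 + 4 * 200 + 3 * 180) / (12 * 600)
projection = weighted_rate_projection(recent_seasons, league_obp)
print(f"weighted, unregressed: {raw_rate:.3f}")
print(f"regressed projection:  {projection:.3f}")
```

For this made-up hitter the regressed figure works out to about .336, which sits between his weighted recent performance (about .339) and the league average (.320). The less playing time a player has, the harder the league-average ballast pulls.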

Using overall trends across baseball history and a player's individual performance, the systems project what is most likely to happen. That often means the stats we see at Fangraphs, in the Bill James Handbook, or at Baseball Prospectus represent the mean projection for a player. Many simulations have been run and the mean projection represents roughly the 50th percentile, the one right smack dab in the middle. The mean projection represents the outcome most reflective of a player's true talent level.

Moore doesn't even discuss the ultimate aim of a projection or the use of mean projections in his THT article. Making this oversight all the worse is the basis for Moore's piece: Projection systems miss breakouts from players like Jose Bautista, R.A. Dickey, Michael Brantley, or Victor Martinez. In doing so, Moore stakes out the inconsistent position of at once decrying that these systems attempt to be as accurate as possible while criticizing them for not nailing breakout seasons.

Might a player over-perform or under-perform his mean projection? Of course. It happens every year. And the projection systems recognize this possibility. That's why they have projections above and below the 50th percentile. But a player reaching such heights or depths doesn't necessarily mean that his true talent changed. While a player may have undergone a change in true talent, it's also possible and perhaps even more likely that things just broke his way and he had an otherworldly season.
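
A rough way to see how much a season can swing without any change in true talent is to simulate it. The sketch below is a toy model, not how any real projection system generates its percentiles: it assumes a hypothetical hitter whose true-talent batting average is exactly .270, treats each of 500 at-bats as an independent coin flip at that rate, and replays the season 2,000 times. The function names, the seed, and all of the numbers are assumptions chosen for illustration.

```python
import random


def simulate_seasons(true_rate, at_bats, n_seasons, seed=0):
    """Return sorted simulated batting averages for a fixed true-talent rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = []
    for _ in range(n_seasons):
        # Each at-bat is an independent trial that succeeds with true_rate.
        hits = sum(1 for _ in range(at_bats) if rng.random() < true_rate)
        averages.append(hits / at_bats)
    return sorted(averages)


def percentile(sorted_values, pct):
    """Nearest-rank style percentile on an already sorted list."""
    index = round(pct / 100 * (len(sorted_values) - 1))
    return sorted_values[index]


seasons = simulate_seasons(true_rate=0.270, at_bats=500, n_seasons=2000)
p10, p50, p90 = (percentile(seasons, p) for p in (10, 50, 90))
print(f"10th percentile: {p10:.3f}")
print(f"50th percentile: {p50:.3f}")
print(f"90th percentile: {p90:.3f}")
```

The middle of the distribution lands close to the .270 true talent, while the 10th and 90th percentile seasons sit well apart from each other even though the simulated hitter never got any better or worse.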

If a system is based on a player's past production, it shouldn't project a player like Dickey, who was entering his age-37 season in 2012 with a career 4.34 ERA, 4.50 FIP, and 4.36 xFIP, to win the Cy Young with 20 wins (on a not-so-great Mets team) with a 2.73 ERA that defied a 3.27 FIP and xFIP. No one in their right mind would've predicted such an outcome in 2012, not even an ancient Mesopotamian reading sheep innards. And that's part of what makes baseball so great. In 2012, Dickey wasn't a 2.73 ERA pitcher in terms of true talent, but in 2012 he was a 2.73 ERA Cy Young winner, against all odds. That no projection system spit such a result out as Dickey's mean projection for 2012 isn't a glitch; it's a feature. And for me as a fan, comparing Dickey's final stat line to his pre-2012 projections makes the knuckleballer's improbable Cy Young season all the more fun.

Of course, even after Dickey's Cy Young-winning 2012 season, he wasn't projected to be that good again entering 2013. For example, ZiPS projected Dickey, as a 38-year-old, to post a 3.89 ERA and 4.06 FIP over 194 1/3 innings. While it considered Dickey's Cy Young season, ZiPS wasn't fazed by missing it. Nor should it have been. Dickey was unlikely to repeat his performance, especially after moving to the AL East. And that's precisely what baseball projections should do: Reflect the most likely forthcoming performance.