We have a pretty active community of commenters here at VEB. Our conversations in the discussion threads provide regular fodder for articles. We appreciate your input and insight! The site’s active commenters are dwarfed, though, by the large number of readers who faithfully visit VEB to get their daily Cardinals fix without jumping into the fray of discussion.
Today I want to feature one of these readers, Ryan, who wrote me a nice email in response to one of my articles. He had a question that I think is worth my time to answer and your time to read since the subject matters to almost every analysis article we write around here. Here’s his question:
I am a long time reader of Viva El Birdos, but don’t have a username. I truly enjoy the content and comments from the community, as I think it provides more insight…about Cardinals baseball than I can find anywhere else.
One issue I have is understanding all of the advanced metrics used in the articles. I don’t know if a post has ever been done explaining some of them, but I thought while baseball is locked out, a primer article that goes through some of the more commonly referenced metrics and stats, explains their usefulness, and what a good, bad, terrible and excellent number for each stat is would go a long way to help my comprehension and possibly others.
In any event, just an idea. Thanks for your articles and I truly appreciate the site.
Good question, Ryan, and a good idea! I know that a “Primer for Advanced Statistics” article – which I’ll title “Stats You Need to Know” – won’t be the entertainment highlight of your day, but it will be something that we can keep in our toolbox and reference in future articles.
For the sake of readability, I’m going to divide this into three different articles: offensive stats, pitching and defensive stats, and, lastly, WAR and the Statcast stats.
Here are 5 types of offensive stats you need to know:
1. Hitting Basics – Counting Stats, and BA/OBP/SLUG Slash Line
Let’s start with the stats that have been on the back of your favorite brand of baseball card for decades. We have home runs, runs batted in, stolen bases, doubles, etc. We can file all of those under “counting stats” – stats that count as they happen.
These stats are useful for their simplicity: the number of home runs a player has tells you how many home runs they have hit. Genius. Same with RBI or runs scored.
The problem with these stats is that they only tell you one thing and baseball is more than one thing. There are a lot of things that can happen on a baseball field and all of them contribute to winning or losing – which is the whole point of playing.
To illustrate this, let’s hop in the old DeLorean and travel back to 1993, which was about the year I started playing fantasy baseball. (Oh yes, I’m that special kind of life-long nerd.) That season the immortal Phil Plantier hit 34 home runs for the San Diego Padres. The same year, gangly, batting-helmet-in-the-field-wearing John Olerud hit 24.
How useful is that as a statistic? It’s useful in that it tells us that Plantier hit more homers than Olerud in 1993. It certainly doesn’t tell us that Plantier was better than Olerud. It doesn’t even tell us whether Plantier or Olerud were good players.
How do we find that out? Can we do it using traditional stats?
We sort of can. A traditional “slash line” helps provide context for the counting stats – how often a baseball event happens on average. What’s in a slash line? Three things: BA/OBP/SLUG.
Batting Average (BA) – how frequently a player gets a hit per at bat.
On-Base Percentage (OBP) – how frequently a player reaches base (not counting errors or fielder’s choice) per plate appearance.
Slugging Percentage (SLUG) – the average number of total bases a player generates per at bat.
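For anyone who likes to see the arithmetic, here is a minimal sketch of how the three slash stats come out of counting stats. The function name and the season line are made up for illustration (not a real player), and the OBP denominator uses the standard at bats plus walks, hit-by-pitches, and sacrifice flies rather than every plate appearance.

```python
def slash_line(ab, h, doubles, triples, hr, bb, hbp, sf):
    """Return (BA, OBP, SLUG) from basic counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    ba = h / ab
    # OBP counts hits, walks and hit-by-pitches; errors and fielder's choices are excluded
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    return ba, obp, slg


# Hypothetical season line, for illustration only
ba, obp, slg = slash_line(ab=500, h=150, doubles=30, triples=2, hr=20, bb=50, hbp=5, sf=5)
print(f"{ba:.3f}/{obp:.3f}/{slg:.3f}")
```

Notice that walks move OBP but never touch BA or SLUG, which is one reason the three numbers can tell such different stories about the same hitter.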
These stats also have their flaws. A .300 batting average does not mean that a batter gets 3 hits in every 10 at bats. A batter might have only 5 hits in 100 ABs and then bust out for 55 in their next 100. And over that span, they might have actually come to the plate 225 times (PAs) to get those 200 at bats (ABs).
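That streaky example is easy to check with a few lines of arithmetic. This is just a sketch using the numbers from the paragraph above; the variable names are mine.

```python
# The two 100-at-bat stretches described above
cold_hits, cold_ab = 5, 100
hot_hits, hot_ab = 55, 100

cold_ba = cold_hits / cold_ab    # batting average during the slump
hot_ba = hot_hits / hot_ab       # batting average during the hot streak
overall_ba = (cold_hits + hot_hits) / (cold_ab + hot_ab)

print(f"{cold_ba:.3f} then {hot_ba:.3f}, overall {overall_ba:.3f}")

# 225 plate appearances produced only 200 at bats. The other trips to the
# plate (walks, hit-by-pitches, sacrifices) never show up in batting average.
plate_appearances = 225
uncounted_trips = plate_appearances - (cold_ab + hot_ab)
print(f"{uncounted_trips} plate appearances ignored by batting average")
```

Same .300 hitter on paper, two completely different halves, and 25 plate appearances that batting average never sees.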
What does it tell you to know that a batter, on average, generates 1/2 a “total base” each time they have an AB? Not much, really. At that rate, a batter can go a week without “slugging” much of anything and still end up with a well-earned high slugging%.
As flawed as these stats are, they do provide a starting point for understanding how a player arrives at their counting stats. Let’s provide a little context for those HR stats above for Plantier and Olerud:
1993 Phil Plantier: .240/.335/.509
1993 John Olerud: .363/.473/.599
OK, big difference. Plantier might be the big power hitter the 90s chicks dug (digged? Dig Dugged?) and I wanted on my fantasy team. But nerdy John Olerud was a phenomenal hitter. Counting stats weren’t enough. Rate or average stats helped. We can still do better.
2. Not-So-Advanced “Advanced” Hitting Stats: OPS, OPS+
Back in the late 90s/early 00s, we started to realize that counting stats and batting average just weren’t descriptive enough for the reasons illustrated above. The OBP revolution was starting. Batting average fell out of favor with franchises and the rage-against-the-machine was finding players who could get on base and hit for power.
Some not-so-brilliant person somewhere said, “hey since getting on base and hitting for power are totally tubular, let’s just combine OBP and SLUG% and BOOYAH statistics, dude!”
No, that’s not how we really talked back then. But, yes, that’s all OPS is.
OPS – stands for “On-base% Plus Slug%”. OBP + SLUG.
Yup, it’s just two slash stats added together, with 1.000 being the “elite” standard for OPS more by an accident of math than by design.
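To show just how simple this is, here is a quick sketch that runs the 1993 Plantier and Olerud slash lines quoted earlier through the OPS "formula". The function and variable names are my own.

```python
def ops(obp, slg):
    """OPS is just on-base percentage plus slugging percentage."""
    return obp + slg


# 1993 slash lines quoted earlier in this article: (BA, OBP, SLUG)
plantier = (0.240, 0.335, 0.509)
olerud = (0.363, 0.473, 0.599)

plantier_ops = ops(plantier[1], plantier[2])
olerud_ops = ops(olerud[1], olerud[2])
print(f"Plantier: {plantier_ops:.3f}  Olerud: {olerud_ops:.3f}")
```

Batting average never enters the calculation, and Olerud lands above that 1.000 mark while Plantier sits in the .800s.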
Honestly? Despite its increasing usage on TV broadcasts, I find OPS to be mostly worthless. It gives us an indication of how productive a player is – an OPS in the .700s is pretty average and anything over .800 is pretty good – but it’s a less descriptive stat than the old slash line standby. OPS gives us no information on how a player arrived at his OPS.
And OPS+? You know those smart kids in high school who took the honors and AP courses? That’s OPS+. It takes OPS, finds league average, and then re-scales it to 100. An OPS+ above 100 means the player is above average offensively. Below 100? A player is below average. That’s better than OPS – it provides league context – but it doesn’t tell us much about how a player accomplished that production.
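Here is a rough sketch of that re-scaling. As I understand it, the commonly cited version compares OBP and SLUG to league average separately and then centers the result on 100, and the published stat also adjusts for ballpark, which this sketch leaves out. The league averages below are hypothetical placeholders, not real league numbers.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """Unadjusted OPS+ sketch: 100 is league average. Park adjustment is omitted."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical league averages, for illustration only
LG_OBP, LG_SLG = 0.320, 0.410

average_hitter = ops_plus(0.320, 0.410, LG_OBP, LG_SLG)
good_hitter = ops_plus(0.360, 0.480, LG_OBP, LG_SLG)
weak_hitter = ops_plus(0.290, 0.350, LG_OBP, LG_SLG)
print(round(average_hitter), round(good_hitter), round(weak_hitter))
```

A hitter who exactly matches the league comes out at 100, and everyone else lands above or below depending on which side of average they fall.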
3. Weighted Stats – wOBA and wRC+
Actual statisticians and sabermetricians (baseball stat people) might have appreciated the way OPS tried to be a cumulative offensive stat, but they weren’t satisfied with its simplistic approach. The problem with OPS is that it was “weighted” in an arbitrary way. The stat just added on base percentage and slugging percentage together assuming they were equal in value.
The sabermetricians started asking important (and sort of obvious) questions. Like, “is there a way to determine the ‘weight’ or value of an offensive event and compare those to each other?”
Sabermetricians, namely “friend of the blog” and current Senior Database Architect for MLB Tom Tango, not only asked those kinds of questions but had the math chops to figure out the answers. Baseball, after all, has always been a stats-oriented sport. Tango, and others, had 100+ years of data to work with. From that data and some fairly straightforward statistical models, they were able to calculate the value of each type of offensive event, whether that was a home run or a caught stealing. Tango then weighed those events against each other, added them up, set them onto a familiar scale (on-base percentage), and suddenly a whole new and extremely useful statistic was born: wOBA.
wOBA – a cumulative offensive production statistic that weighs a player’s offensive events by pre-determined values and averages them by adjusted plate appearances.
wOBA sits on the same scale as OBP. Anything below .300 is poor. Average is around .320. Then scale up from there, with .400 being elite.
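A minimal sketch of the wOBA calculation might look like this. The weights are illustrative round numbers in the neighborhood of published values, not the official ones for any season (they are recalculated every year), and the player line is hypothetical.

```python
# Illustrative linear weights only; real weights change season to season
WEIGHTS = {"ubb": 0.69, "hbp": 0.72, "1b": 0.89, "2b": 1.27, "3b": 1.62, "hr": 2.10}


def woba(ab, singles, doubles, triples, hr, bb, ibb, hbp, sf):
    """Weighted on-base average: weighted events over adjusted plate appearances."""
    ubb = bb - ibb  # intentional walks are excluded from the calculation
    weighted_events = (
        WEIGHTS["ubb"] * ubb
        + WEIGHTS["hbp"] * hbp
        + WEIGHTS["1b"] * singles
        + WEIGHTS["2b"] * doubles
        + WEIGHTS["3b"] * triples
        + WEIGHTS["hr"] * hr
    )
    adjusted_pa = ab + bb - ibb + sf + hbp
    return weighted_events / adjusted_pa


# Hypothetical season line, for illustration only
player_woba = woba(ab=500, singles=98, doubles=30, triples=2, hr=20,
                   bb=50, ibb=5, hbp=5, sf=5)
print(f"{player_woba:.3f}")
```

The thing to notice is in the weights: a home run is worth more than a single, but nowhere near the four-to-one ratio that slugging percentage assumes.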
wRC+ is very similar, except like OPS+ to OPS, it scales the same production as wOBA but puts it on a +/- 100 range, centered around league average (100). Above 100 is above average. Below 100 is below average. Anything over, say, 150, is really, really good. This provides vital context – because the offensive environment for a league can change quite a bit over time. wRC+ adjusts for that. It also includes park factors – whether a player’s home ballpark encourages or suppresses offense. There are huge variations in offense between, say, Coors Field and Busch Stadium. wRC+ neutralizes that.
So, a player can have a .380 wOBA in Coors but just a 120 wRC+. The player’s offensive events and production haven’t changed but because Coors has such a high offensive environment, wRC+ adjusts the value of those events downward. If we put the same player in Busch the ballpark would eat up some of their production events. Their wOBA could drop all the way to .340. Their wRC+, though, would probably still be a 120. Offensive events are worth more at Busch than they are at Coors. wOBA doesn’t care about this. wRC+ does.
wRC+ – a cumulative offensive production statistic that weighs a player’s offensive events by pre-determined values, averages them by adjusted plate appearances, and then contextualizes them by league and ballpark.
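For the curious, here is a sketch of how the league and park adjustment can work, following the wRC+ formula as I understand it from the FanGraphs library. Every input below (league wOBA, wOBA scale, league runs per plate appearance, park factors) is a hypothetical round number, and I am assuming a park factor written as a multiplier where 1.00 is a neutral park.

```python
def wrc_plus(woba, lg_woba, woba_scale, lg_r_per_pa, park_factor, lg_wrc_per_pa):
    """wRC+ sketch from wOBA, league context and a park factor (1.00 = neutral)."""
    # Runs above average per plate appearance
    wraa_per_pa = (woba - lg_woba) / woba_scale
    # A hitter-friendly park (factor above 1.00) makes this term negative
    park_adjustment = lg_r_per_pa - park_factor * lg_r_per_pa
    return 100 * (wraa_per_pa + lg_r_per_pa + park_adjustment) / lg_wrc_per_pa


# Hypothetical league context, for illustration only
LEAGUE = dict(lg_woba=0.320, woba_scale=1.20, lg_r_per_pa=0.120, lg_wrc_per_pa=0.120)

league_average = wrc_plus(0.320, park_factor=1.00, **LEAGUE)
hitter_park = wrc_plus(0.380, park_factor=1.10, **LEAGUE)
neutral_park = wrc_plus(0.380, park_factor=1.00, **LEAGUE)
pitcher_park = wrc_plus(0.380, park_factor=0.96, **LEAGUE)
print(round(league_average), round(hitter_park), round(neutral_park), round(pitcher_park))
```

The same .380 wOBA grades out lower in the hitter-friendly park and higher in the pitcher-friendly one, which is exactly the adjustment wOBA skips.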
Let’s see how this plays out practically with the 2021 Cardinals. Below is a simple list of ’21 Cards hitters who qualified for the batting title, with the stats we’ve talked about so far: slash line, OPS, wOBA, and wRC+, and sorted by wRC+:
From that chart, it’s easy to see how little correlation there is between batting average and overall production – by any of the cumulative metrics. OBP and SLUG% are a little better but the differences in Arenado and Carlson, for example, show where they break down as stand-alone evaluative tools. Carlson has about 30 points of OBP on Arenado, while Arenado has about 60 points of slug% on Carlson. That nets out to about 30 points of OPS in Arenado’s favor.
Was Arenado 30-60 points better than Carlson, depending on the metric?
When you weigh those metrics by historical data, the answer is a clear no. OPS over-weighs slugging% simply because it’s a larger number than on-base%. This gives an arbitrary bump to Arenado because he’s a power hitter. wOBA corrects that, weighing these events by real historical data instead of random accident. That allows Carlson and his higher walk rate and batting average to easily bridge that 30-point OPS and 60-point slugging gap: they have an identical .336 wOBA. Since they played in the same ballpark in the same season, their wRC+ is also identical at 113 – 13% above league average.
The same level of production. Completely different ways of getting to that production.
Therein lies the problem with these kinds of cumulative, weighted stats. We now know with a high level of confidence that Arenado and Carlson were pretty even offensively. We still don’t know how they achieved that production without going back to counting stats and slash lines.
4. The X Stats – Expected Stats (An Introduction)
If the sabermetrics revolution brought math into baseball, Statcast brought technology along with it. In the last few years, baseball has outfitted ballparks with an array of expensive motion capture tech, ranging from cameras and radar guns to imaging software that measures everything that happens on a baseball field.
Nowadays, if Tyler O’Neill hits a home run, we know how fast the pitch was going, how much it was spinning, how hard TON hit it, the angle at which it left the bat, how far it went, and how fast he ran around the bases.
Give Tango a few years and we’ll know the impact force of Wainwright slapping him on the butt in the dugout.
We’ll get into the specifics of these new Statcast statistics in a later article. For now, I want to consider a line drive.
What’s the difference between a line drive hit right at a fielder and a line drive that goes to the wall? Often there is no difference in how hard the ball was hit, the launch angle, or even the direction of the hit. Sometimes the only difference is the positioning or skill of a fielder.
Just as historical data allows us to know the value of a hit versus a walk or a double versus a triple, so does Statcast data let us understand what kind of batted ball event is expected to produce an out or a hit.
The smart people at Baseball Savant (the official MLB analytics website) can look at all that Statcast batted ball data and plug it into the basic formulas above – batting average, slugging%, and wOBA. This gives us what we call “X” or “expected” statistics.
xBA – expected batting average.
xSLUG – expected slugging percentage.
xwOBA – expected weighted on base average.
These “X” stats completely ignore what actually happened on a batted ball event, and, instead, tell us what should have happened based on what that event looked like.
Over the course of a season, those actually’s and should’ves tend to even out. How close a player’s X stats are to their actual stats tells us how much they did or did not “earn” the production they have, or whether they “deserved” better or worse.
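Here is a toy sketch of the idea behind xBA and the gap between actual and expected. As I understand it, Baseball Savant assigns each batted ball a hit probability based on things like exit velocity and launch angle; the probabilities and the five at bats below are invented stand-ins for that model, not real Statcast output.

```python
# Each at bat as (hit probability for contact like this, whether it actually was a hit).
# The probabilities are hypothetical; a strikeout is an at bat with no chance of a hit.
at_bats = [
    (0.90, True),   # barreled line drive into the gap
    (0.70, False),  # hard liner hit right at a fielder
    (0.10, True),   # weak grounder that found a hole
    (0.05, False),  # routine pop-up
    (0.00, False),  # strikeout
]

actual_ba = sum(was_hit for _, was_hit in at_bats) / len(at_bats)
expected_ba = sum(prob for prob, _ in at_bats) / len(at_bats)
gap = actual_ba - expected_ba  # positive means results beat the quality of contact

print(f"BA {actual_ba:.3f}, xBA {expected_ba:.3f}, gap {gap:+.3f}")
```

In this tiny sample the hitter "overperformed": two hits in five at bats on contact that would usually produce fewer. Stretch that over a few hundred at bats and you have the actual-versus-expected comparison.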
Let’s go back to the same chart we used above. We know how they did. What did Statcast expect them to do?
Here you can see the kind of variance that can show up in X stats versus actual stats. Based on how he hit the ball, Carlson overperformed. Arenado didn’t hit the ball hard enough to warrant his high slugging%. Edman, though, probably deserved better than his actual stat line.
How important are X stats? Like everything else, they are just part of the puzzle that makes up a baseball player’s season. Like adjusting for league or ballpark or weighing production, they provide additional context that helps us understand both what happened and what might happen down the road.
The Point: Don’t Use Just One Offensive Stat!
Maybe now you might see why I, and the other writers around here, try to use a variety of offensive stats when analyzing players.
There is no one perfect offensive stat. Not BA. Not wOBA. Not wRC+. Not OPS. Not even WAR – which we’ll get to in a later article.
It takes all of these stats – basic counting stats, rate stats, cumulative stats, weighted stats, and expected stats to paint an accurate picture of what a baseball player is.
It’s one of the things that makes baseball so great! It’s a very simple game – throw the ball, hit the ball, catch the ball. But it creates wonderfully complex stories!
Where do you find these stats? Most of what you’ll need is available for free at Fangraphs.com or baseball-reference.com. For more complex stuff, try Baseball Savant - baseballsavant.mlb.com.
For REALLY complex stuff, just DM John LaRue. That’s what I do!
We’ll hit pitching and defense and Statcast stuff soon. Thanks for the question, Ryan. And thanks to everyone for reading.