Stats to use in VEB: BATTLEDOME!
After the brilliant thread about ecks posted by larry, and another great NL MVP diary posted at the red reporter, I thought it'd be nice to have a stats deathmatch.
Baseball is probably the most misunderstood sport of all time. Just walk on over to the PD forums and take a gander, go to the local sports bar, or listen to the office water cooler talk (assuming it's not football season) and you can get a good gauge about just how incredibly off the general bandwagoner is.
Some comments here and there, but mostly the thread discussing replacement level, got me thinking there should be an official VEB glossary of preferred stats to use when discussing players, trades, etc. Hopefully we can discuss here and come to conclusions on which stats to use so you don't look silly and we can ride our SABRmetric high horse all over John Q. Edmondsjersey.
TEAM
First, when discussing overall team quality, most statheads like to use a team's Pythagorean winning percentage. Basically this is just a simple formula that involves the runs a team scored vs. the runs it allowed. "There is no explanation for the correlation between the formula and actual winning percentage in theory, rather the correlation has just been shown to work empirically". I personally like using this stat to see what a team needs to improve on and how it has evolved or regressed through the years.
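As a rough illustration, the formula can be sketched in a few lines of Python. This uses the traditional Bill James form with an exponent of 2 (other exponents are sometimes used); the function names and the 800/700 run totals are made up for the example.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the estimated winning percentage to a full schedule."""
    return games * pythagorean_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 800 runs scored, 700 runs allowed.
print(f"{pythagorean_pct(800, 700):.3f}")  # 0.566
print(f"{expected_wins(800, 700):.1f}")    # 91.8
```

Comparing the expected wins to a team's actual wins is the usual way to spot a club that was a bit lucky or unlucky.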
OFFENSE
Most of us know, or at least should know, that man cannot survive on batting average alone. Who would you rather have, a .287 hitter or a .278 hitter? Well, if you chose the .287 you just picked Mark Loretta over Lance Berkman (okay, that was predictable to almost everybody). If you must use average when discussing a player, it is more accepted to use the trio of slash stats, AVG/OBP/SLG. If you don't understand why, then you need to read Moneyball or BP's Baseball Between the Numbers (where most sabr stats are explained in great detail). Often people like to use OPS (OBP plus SLG) instead of slash stats, or if you want to get really fancy use OPS+, which is OPS adjusted for park and league and which you can find already calculated at the famed baseball-reference.com.
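Here is a small sketch of why the slash line tells you more than average alone. The two slash lines are hypothetical (they are not Loretta's or Berkman's actual numbers), and the OPS+ function is the commonly cited simplified form with the park adjustment left out, so treat it as an approximation rather than Baseball-Reference's exact calculation. The league OBP and SLG values are also assumptions.

```python
def ops(obp, slg):
    """OPS is simply on-base percentage plus slugging percentage."""
    return obp + slg


def ops_plus_simplified(obp, slg, league_obp, league_slg):
    """Simplified OPS+ (no park adjustment): 100 is league average."""
    return 100 * (obp / league_obp + slg / league_slg - 1)


# Hypothetical hitters: A has the higher average, B the better OBP and SLG.
hitters = {
    "A": {"avg": 0.287, "obp": 0.345, "slg": 0.401},
    "B": {"avg": 0.278, "obp": 0.386, "slg": 0.510},
}

# Assumed league averages, for illustration only.
LEAGUE_OBP, LEAGUE_SLG = 0.330, 0.420

for name, line in hitters.items():
    print(
        name,
        f"AVG {line['avg']:.3f}",
        f"OPS {ops(line['obp'], line['slg']):.3f}",
        f"OPS+ {ops_plus_simplified(line['obp'], line['slg'], LEAGUE_OBP, LEAGUE_SLG):.0f}",
    )
```

Hitter A wins on batting average, but hitter B comes out well ahead once on-base and slugging are counted.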
VORP
One of everybody's favorites, and the one that seems to dominate, is VORP, or Value Over Replacement Player. As I already said, there is a 9 page explanation in BP's book. Here's the long and short:
VORP is a cumulative stat; additional calculations are required before using it to compare multiple players' potential for future contribution. At the very least it must be normalized for PA. VORPr is a rate stat, and therefore is slightly better for comparing players who have had different amounts of playing time, as long as you account for sample size.[1]
Replacement level = freely available talent. Basically, if your player goes down, the team can replace him from the waiver wire or the farm.
How replacement level is calculated = It's actually based on RC/27 (runs created will be expanded on later) and it's recalculated yearly. Basically BP tracks each team's regular players, then their backups, then compares the two. They've found that backups generally perform at around 80% of the regulars' level (again, this will change year to year and by position).
Since they use RC/27 to determine replacement level, there is a published formula (they don't explain how they came up with it) to turn slash stats into replacement level.
VORP is OFFENSE only and does not consider defense, and it DOES consider position played, so A-Rod's VORP isn't technically comparable to Pujols's VORP, since VORP compares A-Rod to third basemen and Pujols to first basemen.
VORP is defined that way for a reason. A player putting up Hanley Ramirez numbers at SS is more valuable to his team than a player putting up those numbers at 1B, so if you are comparing "value" (IE trying to determine the league MVP) it is fine and common to use VORP.[2]
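To make the position point concrete, here is a toy sketch of the replacement-level idea. This is NOT BP's VORP formula. It assumes a flat 80% replacement share, made-up positional RC/27 averages, and a made-up hitter, just to show why the same bat is worth more at shortstop than at first base.

```python
# Assumed flat share; the real figure changes by year and by position.
REPLACEMENT_SHARE = 0.80


def replacement_rc27(position_avg_rc27, share=REPLACEMENT_SHARE):
    """Replacement level as a fraction of the regulars' RC/27 at a position."""
    return share * position_avg_rc27


def runs_over_replacement(player_rc27, position_avg_rc27, outs):
    """Convert the RC/27 gap over replacement into runs for the outs used."""
    gap = player_rc27 - replacement_rc27(position_avg_rc27)
    # RC/27 is runs per 27 outs, so outs / 27 turns the rate gap into runs.
    return gap * outs / 27


# Hypothetical positional averages (RC/27) for regulars.
position_avg = {"SS": 4.5, "1B": 5.5}

# The same hypothetical hitter (7.0 RC/27 over 400 outs) at each position.
for position, avg in position_avg.items():
    value = runs_over_replacement(7.0, avg, outs=400)
    print(f"{position}: {value:.1f} runs over replacement")
```

Because the replacement bar is lower at shortstop in this toy setup, the identical batting line produces more runs over replacement there.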
RC
One of my personal favorites is RC, because it's the easiest way to relate a player to a team's Pythagorean W/L. You can click on the link to see the different versions, but I just use Baseball-Reference.com's already calculated numbers, which use the "technical version".
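For anyone who wants to see the arithmetic, here is a sketch of Bill James's basic Runs Created formula. Note this is the basic version, not the "technical version"; the stat line is hypothetical, and the outs estimate (at-bats minus hits) is a simplification that ignores caught stealing, double plays, and sacrifices.

```python
def runs_created_basic(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)


def rc_per_27(runs_created, at_bats, hits):
    """Runs created per 27 outs, using AB - H as a rough outs estimate."""
    outs = at_bats - hits
    return 27 * runs_created / outs


# Hypothetical season line.
rc = runs_created_basic(hits=160, walks=70, total_bases=280, at_bats=550)
print(f"RC: {rc:.1f}")                            # RC: 103.9
print(f"RC/27: {rc_per_27(rc, 550, 160):.2f}")
```

Since RC is an estimate of runs, adding up a lineup's RC gives a runs-scored figure you can feed into a team's Pythagorean W/L.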
Clutch Hitting
Clutch hitting is still in dispute at BP because they can't empirically prove it yet. If you are a hardcore fan of clutch hitting, a lot of people like to use either WPA, which is tracked by fangraphs.com, OR Win Shares (the two are calculated very differently but are often thought of as the same stat).
This blog isn't meant to tell you what to use, so I'm not going to expand any further on WPA/Win Shares and clutch in general, I'm just saying if you like "clutch" these are the most accepted stats for clutch.
Please don't use RBIs as validation for a player. I like using RBI only when discussing a single hit. BP tracks players' performance in RBI situations, but even fans back when RBI was created in the 20s knew that RBIs would be team dependent. If you are in a fantasy league that uses RBIs, I like sorting players by RBI opportunities and then trying to balance who is the best with who has the most.
The scope of this blog was to inform readers of the most commonly used stats around here, maybe eliminate some, maybe add some, and also to help myself understand or hammer out any details. If I have any info wrong or missing, please feel free to add/discuss. Also, if you have a better way to data mine specific stats, PLEASE add. A lot of times I have to copy/paste stats into an Excel spreadsheet when there is probably a better way. If this blog is on the right track, next will come Pitching, then Defense.
Summary
Team: Pythagorean W/L
Offense: VORP, RC, Slash stats, OPS, OPS+
Clutch: WPA, Win Shares
NOT to use: Average alone, RBI
Comments
i think this is a great diary
I would add:
- A paragraph about "sample size" and "regression to the mean". Geovany Soto isn't going to have a 1.100 OPS next year.
- VORP and WARPx are cumulative stats; additional calculations are required before using them to compare multiple players' potential for future contribution. At the very least they must be normalized for PA. VORPr is a rate stat, and therefore is slightly better for comparing players who have had different amounts of playing time, as long as you account for sample size. I wish BP had a "WARP4", which would be WARP3 adjusted to 590 PA...
- while what you said about comparing players at different positions by VORP is true, VORP is defined that way for a reason. A player putting up Hanley Ramirez numbers at SS is more valuable to his team than a player putting up those numbers at 1B, so if you are comparing "value" (IE trying to determine the league MVP) it is fine to use VORP.
- The conditions surrounding statistical performance must be kept in mind when using statistics to compare players. For instance, tools like PECOTA and ZiPS are not very useful when trying to predict the future performance of a player like Ludwick or Ankiel, because they don't fit the model of a typical player's development that the tools are based on. Also, certain stats should not be looked at as "absolute", i.e. Chris Duncan is not an .830 OPS player; his true talent level is probably in the low .900s. His actual performance this season was hindered by a freak injury that isn't likely to recur. This is one of the reasons you still have to actually watch the games in order to make good decisions about players.
- Keep other external factors like park effects and platoon splits in mind- Aaron Rowand won't be nearly as good if you take him out of Philly, Flores wouldn't be a good closer, etc.
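The first two suggestions above (normalizing for PA and regressing small samples) can be sketched roughly like this. The 590 PA baseline comes from the "WARP4" wish above; the weighting constant k, the league-average OPS, and the sample numbers are all assumptions picked for illustration, not established values.

```python
def per_pa(total, plate_appearances, per=590):
    """Scale a cumulative stat to a common number of plate appearances."""
    return total * per / plate_appearances


def regress_to_mean(observed, plate_appearances, league_mean, k=200):
    """Blend an observed rate with the league mean.

    k is a hypothetical constant: the number of PA at which the observed
    rate and the league mean get equal weight.
    """
    weight = plate_appearances / (plate_appearances + k)
    return weight * observed + (1 - weight) * league_mean


# A hypothetical part-timer: 30 runs of cumulative value in 295 PA.
print(per_pa(30.0, 295))  # 60.0

# A hypothetical call-up with a 1.100 OPS in only 60 PA, league OPS .750.
print(f"{regress_to_mean(1.100, 60, 0.750):.3f}")  # 0.831
```

The fewer the plate appearances, the harder the estimate gets pulled back toward league average, which is the whole point about not expecting a 1.100 OPS to carry over.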
excellent feedback
Whoever can write it do so and I'll just add it with credit. I don't really want this to be an "all me" blog, so I have no issues with taking anything out or adding anything if it's correct to do so or more coherent etc.
I def. need to add in the disclaimers about rate stats.
Do most people use WARPx? I found that most people don't like the way BP rates fielding (FRAR and FRAA), so a strong offensive metric plus a lame fielding metric kinda kills the stat, no?
Great idea
by qwikimport on Oct 17, 2007 4:55 PM EDT
Question
"
Pythagorean winning percentage is an estimate of a team's winning percentage given their runs scored and runs allowed. Developed by Bill James, it can tell you when teams were a bit lucky or unlucky. It is calculated by
(Runs Scored)^1.83
---------------------------------------------------------
(Runs Scored)^1.83 + (Runs Allowed)^1.83
The traditional formula uses an exponent of two, but this has proven to be a little more accurate.
"
My question is this: what does it mean when the author states "a little more accurate"? Accurate implies there is a right answer. What is the Pythagorean win% being compared to in order to be deemed more or less accurate?
clarify
Accurate
Actual win percentage is dispreferred to pythagorean for some purposes since 162 games isn't a large enough set of contests to measure how good a team really is. (Although a hell of a lot better than 16.)
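One way to put a number on the "162 games isn't enough" point is a quick sketch that treats every game as an independent coin flip with a fixed win probability. That is a simplifying assumption, not how real seasons work, but it shows how much noise a schedule carries.

```python
import math


def win_pct_std(true_pct, games):
    """Standard deviation of observed win% if each game is an independent trial."""
    return math.sqrt(true_pct * (1 - true_pct) / games)


# A true .500 team over a 16-game and a 162-game schedule.
for games in (16, 162):
    sd = win_pct_std(0.500, games)
    print(f"{games} games: +/- {sd:.3f} win%, about {sd * games:.1f} wins")
```

Under this assumption a true .500 team swings by roughly six wins either way over 162 games just from chance, and by a far larger share of the schedule over 16.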
I believe
-Joe Morgan
that's a good point
Consistency is measurable
This is a great post
I would love to see an expanded version of this as a permanent link for reference in the future. Perhaps adding a list of Acronyms would help, too.
Thanks rockStark!
I also liked this post a lot.
Maybe an explanation of EqA, anybody? Don't know much about it.
I was gonna put in EqA
Like VORP, it's purely offensive, but it's a rate statistic.
It combines hitting for average, power, drawing walks, HBPs, and stolen base ability, and it's park adjusted and adjusted for league/year difficulty.
It's all mathed up to Equivalent Runs per Out and scaled to read like a batting average, i.e. a .260 EqA is an average hitter, .300 is good, .340 is great.
What I always do is check wiki and google. Most stathead sites have glossary pages, but they really have just definitions. I wanted this page to be a "most used" or "most accepted" type thing. I hate coming across a stat and not knowing if it's already been dated, disproved or otherwise useless.
You can't disprove a stat
But none of that disproves batting average, in the end. It is what it is, and it is certainly a measure of something, and certainly a hitter with a .400 average is more valuable to his team than a batter with a .200 average.
Now, the main thing is, sabrmetric type stats are great for a lot of things, and it makes comparing things a lot easier, but they also have a lot of drawbacks. A lot of the advanced metrics will depend on hidden statistical modeling* that the creator doesn't directly explain. This can be a problem, particularly for things like park factors, which have a LOT of statistical noise in them, IMO. So, when you say something like "VORP is a better measure of offense than OPS+, which in turn is better than unadjusted OPS," there is an implied value judgment there, that depends on what you mean by 'better' and 'offense.'
Some statistics make some things clearer and make other things more obscure. Having a big toolbox is important, and while saying 'VORP is the end all of measuring offense' might be a slightly more accurate statement than 'a player with a high BA is always best,' it is still better to look at a player through as wide an array of lenses as possible.
Sorry for the long rant.
*statistical modeling is the practice of assuming an adjustable relationship between two things, and then using a large data set to adjust the relationship to something that you think mirrors a true relationship--an example is taking a bunch of points that you expect to be a straight line, and drawing what you think to be the best straight line through them.
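The straight-line example in that footnote can be sketched with an ordinary least-squares fit. The data points are made up, and least squares is only one way to define the "best" line, which is exactly the kind of modeling choice being described.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noisy points that roughly follow a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

Swap in a different rule for "best" (or assume a curve instead of a line) and the fitted relationship changes, even though the raw data did not.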
.200 vs .400
This was especially helpful Val
by nycardfan on Oct 18, 2007 10:39 AM EDT
Ditto
I like
This isn't directed at you, NY, or anyone in particular, but I will say that I haven't met a saber geek yet that doesn't take the time to explain something when someone asks. The retarded straw men and knee-jerk defense mechanism of mocking something you don't understand gets old, but open-mindedness and a willingness to just ask is awesome. At least in my opinion.
What I have not heard much about
It's not just a matter of asking or explaining. It's learning more about varying perspectives and approaches of knowledgeable people, and how they perceive statistical modeling to be influencing conclusions. I don't read VEB everyday, so maybe I've missed these kinds of discussions. Nevertheless, I thought what Val said about that in relationship to needing "a big tool box" to be particularly helpful.
by nycardfan on Oct 19, 2007 8:35 AM EDT
I don't think they
What are the drawbacks? Most of them do what they say they do. If not they get their ass handed to them by Tangotiger or someone.
A lot of people are more interested in proving that "the game isn't played on paper" (like anyone fucking claims that) and why stats shouldn't be trusted than in figuring out what they are useful for. One of these things is easier than the other, and this extends to lots more than just baseball stats.
I should have added that I agree that most
by nycardfan on Oct 19, 2007 10:15 AM EDT
And one of the big problems is that
This is the 'black box' problem that you hear people referring to--BP takes a player's raw stats (AB, H, HR, 2B, etc.--they usually are pretty good about telling you what the inputs are, at least), puts them into one end of a 'black box' and gets VORP and FRAA and whatnot out. We're not allowed to look inside the black box. This is why it is important to see if everyone's system and the traditional methods agree on a guy.
None of this is to say that what they're doing isn't valid--more than anything, I'm just showing my bias as a physical scientist. To me, VORP is kinda the equivalent of a service telling me what I can use a Bessel function for, and allowing me to look up values on a case by case basis, but not telling me any of the theory behind it, or how to calculate its value for an arbitrary set of parameters.
(which, by the way, is actually how scientists did business back in the day--go to a secondhand book store, and you can find books that do nothing but list all of the values of sin x).
btw,
FJM is basically
I do agree with most of what you've said here though. The problem is generally in the utilization and not the stat itself.
Aside from the proprietary
If PECOTA shows me a good coefficient and Nate talks about his methods in detail here and there, and smart people agree that it's a good system, then I'll treat it like a good projection system. If FRAA is demonstrably deficient then I'll ignore it. Those are the only two things I think we don't know the formulas for.
What you are saying makes perfect sense
As I said above, I found your "big toolbox" image to be particularly helpful. It gives me a better idea of what to be aware of as I gradually learn about all these new things. Well, anyway, thanks for helping along a newbie.
by nycardfan on Oct 19, 2007 1:11 PM EDT
the VORP formula
Val
by cardsgirl95 on Oct 18, 2007 11:36 AM EDT



















