Comparing Fangraphs WAR and Baseball Reference WAR

My impression is that fWAR is more about predicting what could happen going forward, while bWAR is more about showing what actually happened in a given season. My goal in writing this is to learn more about that difference, and perhaps to gain some insight into what goes into each statistic.

The first thing to realize is that WAR is calculated differently for pitchers and hitters, so differently that I wonder whether pitcher WAR and hitter WAR should be compared at all. But let's get started with:

Hitting

Fangraphs.com uses wOBA and UZR as the key components of its version of WAR. Weighted On-Base Average (wOBA) sums up a position player's offense, and Ultimate Zone Rating (UZR) his defense. Once those are totaled, fWAR adds an adjustment for position (players at premium positions such as shortstop and center field get a bonus), a league adjustment, and a seasonal adjustment unique to each year's league average.

Baseball-reference.com puts its own spin on WAR, using OPS+ and Total Zone (TZ). OPS+ is an adjusted version of OPS, and Total Zone is a different type of defensive rating. Total Zone uses less information than UZR to determine its rating, so many say it is not as good at rating defense. However, it is much better suited to historical comparisons of defense, and it is perhaps less susceptible to statistical distortions (that is only my impression).

Perhaps the main difference is that bWAR downplays defensive value in its WAR calculation, while fWAR values defense more. Another difference (correct me if I'm wrong) is that OPS+ uses park factors and wOBA does not. However, wOBA uses linear weights (a run value assigned to each type of offensive event) for accuracy. So both fWAR and bWAR have their strengths when evaluating position players. One criticism I've had of fWAR is that it uses UZR, which is notorious for not being accurate on a year-by-year basis. But it also incorporates wOBA, arguably a much more advanced stat than OPS+. I'm pretty sure linear weights still mystify a number of VEBrs; I know they are still a bit of a mystery to me. However, I really like that OPS+ uses park factors.

Pitching

The difference between fWAR and bWAR is even more fundamental when it comes to pitching. fWAR uses FIP as its main component. This means that this version of WAR does not show exactly what a pitcher's outcomes were. But it does correct for the fielding a pitcher has behind him, which is a big deal. Think of a talented quarterback with a poor offensive line: he is just not going to be as good or show his potential. A mediocre pitcher with an amazing defense, on the other hand, has a much better chance of looking like an excellent pitcher.

This is the main difference, since both bWAR and fWAR use park factors. FIP is very different from ERA. However, bWAR's pitching equation is not based on raw ERA; it uses ERA+, which adjusts only for park and league factors. This version of pitching WAR shows what actually happened in a year, which makes it less predictive but a better record of results. That makes bWAR more useful for historical comparisons, and it matches up better with the results of a season. However, defense is often difficult to quantify, and groundball pitchers tend to buck certain trends in ERA/FIP. So again, both forms of WAR are useful.
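To make the ERA versus FIP contrast concrete, here is a rough sketch of the two calculations. The FIP form below is the commonly published one; the 3.10 constant is only a placeholder assumption (the real constant is set each year so that league FIP matches league ERA), and the pitcher line is invented purely for illustration.

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def fip(hr: int, bb: int, hbp: int, k: int, innings: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching.

    Uses only home runs, walks, hit batters and strikeouts, so balls in
    play (and the fielders behind the pitcher) do not enter the result.
    The constant is an assumed placeholder, not an official value.
    """
    return (13.0 * hr + 3.0 * (bb + hbp) - 2.0 * k) / innings + constant


# Hypothetical pitcher season, made up for this example.
line = {"earned_runs": 70, "hr": 20, "bb": 50, "hbp": 5, "k": 200, "innings": 210.0}

pitcher_era = era(line["earned_runs"], line["innings"])
pitcher_fip = fip(line["hr"], line["bb"], line["hbp"], line["k"], line["innings"])
print(f"ERA {pitcher_era:.2f}  FIP {pitcher_fip:.2f}")
```

Nothing about hits on balls in play appears in `fip`, which is exactly why it says less about what actually happened and more about what the pitcher controlled.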

So now for the fun part, how do WAR leaderboards differ?

2011 bWAR hitting leaders:

  1. Kemp: 10
  2. Bautista: 8.5
  3. Braun: 7.7*
  4. Ellsbury: 7.2
  5. Cabrera: 7.1
  6. Gonzalez: 6.9
  7. Pedroia: 6.8
  8. Votto: 6.5
  9. Longoria: 6.3
  10. Sandoval: 6.1

2011 fWAR hitting leaders:

  1. Ellsbury: 9.4
  2. Kemp: 8.7
  3. Bautista: 8.3
  4. Pedroia: 8.0
  5. Braun: 7.8
  6. Kinsler: 7.7
  7. Cabrera: 7.3
  8. Granderson: 7.0
  9. Gordon: 6.9
  10. Votto: 6.9

A few things jump out about the differences between the WARs. Ellsbury benefits greatly from his defense, but either way he was a top 5 player because he hit so well; I must say I never expected him to be this good. Kemp takes a big hit in his wins total because of differences in the calculations (10 by bWAR, 8.7 by fWAR). It's interesting to see how the two systems evaluate players. Conversely, Pedroia is rated more highly by fWAR (8.0 versus 6.8); his ridiculous fielding numbers really boost his wins. A player like Braun, meanwhile, is rated almost exactly the same by bWAR and fWAR.

Then there's Kinsler, who is not a top 10 player by bWAR but is the 6th most valuable player in baseball by fWAR. On the other hand, Sandoval breaks the Baseball-Reference top 10 but is barely top 25 by Fangraphs' WAR. Another thing that sticks out to me is Bautista vs. Kemp. Both players are similarly poor fielders, but Kemp gets a boost for playing center field in 2011 and Bautista takes a hit for not playing there, which illustrates the positional adjustment. This raises the question of just how large the positional adjustment should be, but that is perhaps a whole different discussion.

Since there are obvious differences, yet many similarities, I'd like to see a new best-of list. Here is what I'd like to call general WAR, or gWAR:

  1. Kemp: 9.35
  2. Bautista: 8.4
  3. Ellsbury: 8.3
  4. Braun: 7.75
  5. Pedroia: 7.4
  6. Cabrera: 7.2

I'm going to be a little lazy and leave out players who are not on both lists (and also Votto, who is on both but whom I'm not confident is #7). This list makes me happier, since personally I'd rather have the power and on-base percentage of Kemp and Bautista than Ellsbury's game. I could go on with more observations, but I'll just say the 2011 leaderboard is rather surprising to me. If Ellsbury and Pedroia continue to be this good, the Red Sox should be ridiculed if they cannot assemble a good enough cast around the two to be a great team.
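For anyone who wants to reproduce the gWAR list, here is a small sketch that treats gWAR as the plain average of bWAR and fWAR for players who appear on both top 10 lists above. The function name `gwar` is just a label for this example.

```python
# 2011 hitting leaders as listed above.
bwar = {"Kemp": 10.0, "Bautista": 8.5, "Braun": 7.7, "Ellsbury": 7.2,
        "Cabrera": 7.1, "Gonzalez": 6.9, "Pedroia": 6.8, "Votto": 6.5,
        "Longoria": 6.3, "Sandoval": 6.1}
fwar = {"Ellsbury": 9.4, "Kemp": 8.7, "Bautista": 8.3, "Pedroia": 8.0,
        "Braun": 7.8, "Kinsler": 7.7, "Cabrera": 7.3, "Granderson": 7.0,
        "Gordon": 6.9, "Votto": 6.9}


def gwar(bwar: dict, fwar: dict) -> list:
    """Average the two WARs for players present in both leaderboards,
    sorted from highest to lowest."""
    shared = bwar.keys() & fwar.keys()
    averaged = {name: (bwar[name] + fwar[name]) / 2 for name in shared}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


for rank, (name, value) in enumerate(gwar(bwar, fwar), start=1):
    print(f"{rank}. {name}: {value:.2f}")
```

Because only the two top 10 lists go in, a player missing from one list (Kinsler, for example) cannot be ranked here even though his average might beat someone who made both, which is why the bottom of the list is shaky.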

If nothing else I guess this illustrates that WAR is a generalization.

In 2010, Hamilton is #1 in fWAR and #4 in bWAR. Longoria is #1 & #2 f/bWAR, Crawford has a 1.5 win gap between the two systems, etc. Anyway, these discrepancies are probably no surprise to many here, but I hope this has helped some of you, as it has helped me, to see some of the differences between the two. I like that bWAR is more hitting oriented, but it does not have linear weights. One question though: how do the positional adjustments differ, or are they rather similar?

Making the switch over to pitching, here is where things really get different...

2011 bWAR pitching leaders:

  1. Verlander: 8.6
  2. Halladay: 7.4
  3. Kershaw: 7.0
  4. Lee: 6.9
  5. Sabathia: 6.9
  6. Weaver: 6.6
  7. Beckett: 6.3
  8. Shields: 6.1
  9. Romero: 5.9
  10. Fister: 5.7

2011 fWAR pitching leaders:

  1. Halladay: 8.2
  2. Sabathia: 7.1
  3. Verlander: 7.0
  4. Kershaw: 6.8
  5. Lee: 6.7
  6. Haren: 6.4
  7. Wilson: 5.9
  8. Weaver: 5.6
  9. Fister: 5.6
  10. Hernandez: 5.5

This particular season saw a pretty similar top 5, which is sort of surprising. (Verlander's ratings are quite a bit different though.) But #6 through #10 do not line up well, owing to the big differences between the two systems. One criticism that may be made of fWAR is that it is too objective. bWAR for pitching has some clutchness to it, for those who think that's a thing. Looking at WPA, Verlander is #1, but Weaver is #2 and Ian Kennedy is #3. That part is fascinating, as WPA sort of mystifies me. Kennedy did get really lucky BABIP-wise last year though.

So how to reconcile the two WARs? gWAR!

  1. Halladay: 7.8
  2. Verlander: 7.8
  3. Sabathia: 7.0
  4. Kershaw: 6.9
  5. Lee: 6.8

Giving Kershaw the Cy Young over Halladay is such a weird occurrence that no system can explain it. Kershaw did have a higher K rate, but he was also very lucky in terms of BABIP. Unfortunately I think they went with the W-L record, or wanted to change it up and not give too many awards to Roy.

If you like this examination of WAR here is a somewhat interesting related article: http://www.patrickfloodblog.com/2010/07/16/war-problems-part-two/

I do not mean to call WAR into question as a usable statistic, but to try to make it better understood as an estimation system and an attempt to show overall individual value in an objective way. It just depends on how you want to evaluate players. I suppose in the case of pitching fWAR is clearly more useful, but then again FIP leaves out some aspects of pitching. There is more to pitching than strikeouts, walks, and home runs, so bWAR has value.

How big a difference does fielding make? Teams like the Diamondbacks, Rays, Reds, Red Sox and Angels prevented a lot of runs last year. Those 5 teams were just a lot better at defense than other teams, at least by UZR. Except for the Reds (who had some bad starting pitching) they were all successful. The BoSox and Braves are a study in contrast. The Braves had an equal ERA and FIP, but not very good fielding. The Red Sox had a 4.20 ERA against a 4.05 FIP. Another poor fielding team, the Cardinals, had a great FIP that was very similar to their ERA. Then you have the Rays and Diamondbacks, who were not helped by their fielding, if this is to be read straightforwardly. Especially the Rays, who had a 4.03 FIP, a 3.58 ERA, very good UZR, and a very lucky BABIP. How does one explain that? On top of that, the Angels had a FIP that was similarly worse than their ERA, yet were great at fielding. Maybe I have been writing too long tonight, but those numbers seem rather odd. How does one explain all these gaps between a team's ERA and its FIP?

Thanks for reading. I'm interested to hear more from the community about the two types of WAR and player valuations.
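One way to look at the team question is simply to line up ERA minus FIP. The sketch below uses only the two teams whose numbers are quoted above, and it assumes the Red Sox figures mean a 4.20 ERA and a 4.05 FIP. A negative gap means the team allowed fewer earned runs than its strikeouts, walks and home runs would suggest, which is what good fielding or BABIP luck would be expected to produce.

```python
# (ERA, FIP) pairs as quoted in the post. The Red Sox pair is an
# assumed reading of the "4.20 to 4.05" remark.
teams = {
    "Rays": (3.58, 4.03),
    "Red Sox": (4.20, 4.05),
}


def era_minus_fip(era: float, fip: float) -> float:
    """Gap between results (ERA) and fielding-independent estimate (FIP)."""
    return round(era - fip, 2)


for team, (team_era, team_fip) in teams.items():
    gap = era_minus_fip(team_era, team_fip)
    verdict = "beat its FIP" if gap < 0 else "trailed its FIP"
    print(f"{team}: ERA - FIP = {gap:+.2f} ({verdict})")
```

By this reading the Rays beat their FIP by almost half a run while the Red Sox trailed theirs slightly, so good UZR alone does not seem to settle which way the gap goes.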

Comments

fWAR is park adjusted

The batting component of fWAR, which uses wRAA (linear weights), is definitely park adjusted in the fWAR calculation.

Also, the statement that fWAR puts more of an emphasis on defense is a common myth: http://www.fangraphs.com/blogs/index.php/how-much-is-fielding-weighted-in-war/

by dkappelman on Jan 21, 2012 2:41 AM EST reply actions  

so that solves the positional WAR problem

will continue to look at fWAR for that… but for pitching, it still seems a little weird.

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 21, 2012 1:32 PM EST up reply actions  

also

is there a version of WAR out there that uses tERA and, say, a 3-season window average for the UZR component?

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 21, 2012 1:34 PM EST up reply actions  

I'll add my request to David here

FanGraphs should display WAR for FIP, RA, xFIP and tRA and allow the reader to decide which one he would like to use in a given situation.

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 23, 2012 4:24 AM EST up reply actions   1 recs

Anyone know

where I can get 2011 WAR results for a list of 150 players input at once? I’m doing a little research project, and having to flip to each player’s page one by one gets a bit onerous.

by siddfynch on Jan 21, 2012 10:45 PM EST reply actions  

great question

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 21, 2012 11:54 PM EST up reply actions  

What do you mean by

“150 players input at once”? You can get 2011 fWAR at Fangraphs output to excel then write a little script to filter by name or BID ID.

by Xeifrank on Jan 23, 2012 2:24 AM EST up reply actions  

You can create custom lists on Fangraphs and then convert them to Excel.

"I'm gonna throw the nastiest curveball I have ever thrown...if he hits it, I'll tip my cap, but if not we're going to the Series."

--Adam Wainwright on the final pitch of the 2006 NLCS

by bgh on Jan 24, 2012 10:30 AM EST up reply actions  

I am disappoint, VEB

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 22, 2012 12:06 PM EST reply actions  

just discovered this.

i like it. gWAR is just the average of fWAR and bWAR, correct? that wasn’t immediately apparent to me.

it is what it is, not what we thought it'd be

by il rosso on Jan 23, 2012 3:52 PM EST up reply actions  

yep that's all it is

kind of puts a different spin on the WAR leaderboard. I can’t decide which I like better, but obviously fWAR is a bit more accepted at least on veb. I spent hours on this fanpost, it started as something that I thought wouldn’t take very long but there was more reading on the internet than I had anticipated re: comparing the two.

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 23, 2012 6:59 PM EST up reply actions  

Nice lol!

But didn't you use a bunch of other stuff?

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 25, 2012 9:47 PM EST via Android app up reply actions  

Oh yeah.

Way too much, probably. lol

by stlfan on Jan 25, 2012 10:48 PM EST up reply actions  

very cool

sounds interesting, I like the idea of incorporating even more stats into the fold. my offensive ability rating was pretty fun, but I got to the point where it was about as accurate as OPS and shelved it until/if I get better at mathematics. glad to see you are incorporating so many stats.

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 26, 2012 2:09 AM EST up reply actions  

at the top,

you said something about not being sure if WAR for pitchers and position players should even be compared. I assume this was (one of) the original goal(s) of the stat. What is it about comparing the two results that gives you pause? Is it because of how each system assigns “win” values to the stats in the formula? Is there a better way to do it?

Because Matheny

by WyoCardsFan on Jan 24, 2012 12:13 PM EST reply actions  

Im still wondering about the end of the piece

But I guess my main issue is that a hitter's production is so much more obvious than a pitcher's production.

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 24, 2012 1:34 PM EST via Android app up reply actions  

Clearly Theriot is the best gauge

Whichever system says he’s the worst is obviously the most accurate.

by DiscoJer on Jan 25, 2012 1:21 AM EST reply actions  

Are you sure bWAR for hitters uses OPS+? I'm almost positive they're using some sort of linear weights, similar to wOBA

So essentially the only difference is TZ vs UZR, and I don’t think there’s a whole lot of difference in accuracy between the two, but a whole lot difference in what they measure/how they measure

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 25, 2012 1:33 AM EST reply actions  

And I don't really like the idea of arbitrarily averaging the two WARs

For hitters, maybe, if you think TZ and UZR are exactly similarly accurate/inaccurate

For pitchers, I don’t think the differences between the WARs are equivalent

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 25, 2012 1:36 AM EST up reply actions  

I actually think averaging out FIP and defense adjusted RA is a good idea

give the pitcher some credit for BABIP, but not full.

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 25, 2012 1:38 AM EST up reply actions  

It's so arbitrary though

You’re saying bWAR = fWAR and I don’t think that’s true.

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 25, 2012 1:43 AM EST up reply actions  

yeah, but it's probably pretty close to true

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 25, 2012 1:50 AM EST up reply actions  

there are plenty of instances where a consensus approach outperforms the best individual approach

even when one of the individual approaches is clearly superior to the other. I have no idea whether it’s true here, but it’s certainly not unheard of.

by brackenthebox on Jan 25, 2012 12:22 PM EST up reply actions  

It's more likely that the two are equal than one is unequivocally better

so averaging it out is definitely an improvement. It just might not be the best system.

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 25, 2012 2:02 PM EST up reply actions  

Im sure there are better ways

But for the layman it makes sense

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 25, 2012 6:09 PM EST via Android app up reply actions  

BR uses linear weights

but they are custom to the team (they take the average value of a single to that particular team).

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 25, 2012 1:38 AM EST up reply actions  

That seems prone to SSS, yeah?

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 25, 2012 1:44 AM EST up reply actions  

well no, a team has like 6000 PA a year

it’s just going to hurt players who play on bad teams. bWAR is more concerned with value to the player's actual team than with value to a generic team.

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 25, 2012 1:51 AM EST up reply actions  

It seems there is not only a lot of confusion in the mind of the fan of what WAR is

But in articles and such on the web. I should have put my sources down…. But from what i read it said ops plus. Would like to see more on all this

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 25, 2012 11:03 AM EST via Android app up reply actions  

From bref
—Rbat, batting runs are a linear weights formula utilizing custom weights depending on team runs scored and run scoring environment. For the current season this is adjusted batting runs.

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 25, 2012 12:07 PM EST up reply actions  

Well I have no idea which system is better then

How many versions of war are there btw?

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 25, 2012 1:39 PM EST via Android app up reply actions  

Infinite-ish?

WAR’s just a framework. You can use whatever you want for the inputs

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 25, 2012 9:59 PM EST up reply actions  

ok

I knew that… but what are the main ones? anything other than bWAR and fWAR I should know about?

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 26, 2012 2:07 AM EST up reply actions  

statcorner might have WAR

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 26, 2012 3:47 AM EST up reply actions  

BPro has WARP, which is basically the same thing

Hardball Times has WAR (both past and projected) if you pay for their projection system.

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 26, 2012 3:48 AM EST up reply actions  

Statcorner has WAR, but it's not really "complete"

For example, the WAR on a hitter’s page is offense and position adjustment only. I think pitching WAR is only pitching too. They don’t appear to have defensive stats either.

"I'll be deep in the cold, cold ground before I recognize Missoura!"

by mattybobo on Jan 26, 2012 8:15 AM EST up reply actions  

I saw that they have their own wOBA

can you convert that into WAR stats for me? lol

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 26, 2012 8:21 PM EST up reply actions  

I'll give you the formula

= (wOBA – league wOBA)/1.15*PA + 20/600*PA. And then you add in their positional adjustments, which you can find listed somewhere.

Secretary of WAR and Defense of the Tyler Greene Fanclub.

by vivaelpujols on Jan 27, 2012 12:02 AM EST up reply actions  
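For anyone wanting to try the formula above on StatCorner's wOBA, here is a rough sketch. It reads "PAA" as plain PA, and it notes that the formula as written gives runs above replacement rather than wins. The conversion at about 10 runs per win is a common rule of thumb and an assumption here, as are the function names and the sample hitter.

```python
def runs_above_replacement(woba: float, league_woba: float, pa: int,
                           woba_scale: float = 1.15,
                           replacement_runs_per_600_pa: float = 20.0) -> float:
    """Batting runs above average plus the replacement-level credit,
    following (wOBA - league wOBA) / 1.15 * PA + 20 / 600 * PA."""
    batting_runs = (woba - league_woba) / woba_scale * pa
    replacement_runs = replacement_runs_per_600_pa / 600.0 * pa
    return batting_runs + replacement_runs


def offensive_wins(runs: float, positional_runs: float = 0.0,
                   runs_per_win: float = 10.0) -> float:
    """Convert runs (plus any positional adjustment, in runs) to wins.
    runs_per_win = 10 is an assumed rule of thumb, not an official value."""
    return (runs + positional_runs) / runs_per_win


# Hypothetical hitter: .400 wOBA in a .320 league over 600 PA.
runs = runs_above_replacement(0.400, 0.320, 600)
print(f"{runs:.1f} runs, about {offensive_wins(runs):.1f} wins before defense")
```

The positional adjustment goes in as runs through `positional_runs`, so shrinking or enlarging it is a one-argument change.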

I might just do that

but I will probably downplay the positional adj a bit

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 27, 2012 1:42 AM EST up reply actions  

...why?

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 28, 2012 4:44 AM EST up reply actions  

wOBA seasonal adjustments

I’m 110% confident that VEP already knows this better than I do; however, for others who are interested, Tom Tango explains how to calculate each coefficient, which is essential for making seasonal adjustments.

Cardinals fan from Korea

by FreeRedbird on Jan 31, 2012 11:59 PM EST up reply actions  

i don't think that bWAR for pitchers uses ERA+ at all, but rather runs allowed (not earned runs) weighted to the run

scoring environment (factoring in park and defense). it’s a little obtuse, but read the explanations for bWAR here.

i used to be disgusted, but now i try to be amused . . . - macmanus

by tom s. on Jan 30, 2012 5:25 PM EST reply actions  

the definitions of these things are even more murky than I had thought

lots of conflicting information out there

"young man, when you throw a strike, Mr. Hornsby will let you know"

by Cards Fan in Chitown on Jan 30, 2012 8:27 PM EST up reply actions  

Just look at the website glossary

Of all sad words of tongue or pen; the saddest are these: 'It might have been!' -- Whittier

by mysterui on Jan 31, 2012 1:54 AM EST up reply actions  
