My impression is that fWAR is more about predicting what could happen going forward, and that bWAR is more about showing what actually happened in a given season. My goal in writing this is to learn more about this difference, and perhaps to gain some insight into what goes into each statistic.
The first thing to realize is that WAR is calculated differently for pitchers and hitters. It is so different that I wonder whether pitchers' and hitters' WAR should be compared at all. But let's get started with position players:
Fangraphs.com uses wOBA and UZR as the key components of its version of WAR. Weighted On Base Average (wOBA) sums up the offense and Ultimate Zone Rating (UZR) the defense of a position player. Once those two are added up, fWAR makes an adjustment for position (players at premium positions such as shortstop and center field get bonus points), a league adjustment, and a seasonal adjustment unique to each year's league average.
Baseball-reference.com puts its own spin on WAR, using OPS+ and Total Zone. OPS+ is an adjusted version of OPS, and Total Zone is a different type of defensive rating. Total Zone uses less information to determine its rating than UZR, so many say it is not as good at rating defense. However, it is much better for making historical comparisons on defense, and it is perhaps less susceptible to statistical distortions (that is only my impression).
Perhaps the main difference is that bWAR gives defense less weight in its WAR calculation, and fWAR gives it more. Another difference (correct me if I'm wrong) is that OPS+ uses park factors, and wOBA on its own does not. However, wOBA uses linear weights, which credit each offensive event (walk, single, double, home run and so on) with its average run value instead of treating all times on base or all total bases alike. So both fWAR and bWAR have their strengths when evaluating position players. One criticism I've had of fWAR is that it uses UZR, which is notorious for being unreliable on a year-by-year basis. But it also incorporates wOBA, arguably a much more advanced stat than OPS+. I'm pretty sure linear weights still mystify a number of VEBrs; they are still a bit of a mystery to me. However, I really like that OPS+ uses park factors.
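To make the wOBA versus OPS+ distinction a little less mysterious, here is a minimal Python sketch of both. The wOBA weights are the ones from the original formulation in The Book, used purely for illustration (Fangraphs recalculates its weights every season, and the reached-on-error term is left out). The function names and the sample stat line are made up, and the OPS+ function assumes the league OBP and SLG passed in have already been park-adjusted.

```python
# Illustrative linear weights from the original wOBA formulation in "The Book".
# Assumption: these are NOT the exact weights Fangraphs used for 2011.
WOBA_WEIGHTS = {
    "ubb": 0.72,     # unintentional walk
    "hbp": 0.75,     # hit by pitch
    "single": 0.90,
    "double": 1.24,
    "triple": 1.56,
    "hr": 1.95,
}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr):
    """Weighted on-base average: each event is credited with its run value."""
    events = {
        "ubb": bb - ibb,
        "hbp": hbp,
        "single": singles,
        "double": doubles,
        "triple": triples,
        "hr": hr,
    }
    numerator = sum(WOBA_WEIGHTS[name] * count for name, count in events.items())
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ relative to the league, where 100 is league average.

    Assumption: lg_obp and lg_slg are already adjusted for the hitter's park.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical stat line, not a real player.
sample_woba = woba(ab=500, bb=60, ibb=5, hbp=5, sf=5,
                   singles=100, doubles=30, triples=5, hr=25)
sample_ops_plus = ops_plus(obp=0.380, slg=0.550, lg_obp=0.320, lg_slg=0.400)
print(round(sample_woba, 3), round(sample_ops_plus, 1))
```

The point of the sketch is the shape of the two formulas: wOBA gives a home run a little more than twice the credit of a single, while OPS+ only compares OBP and SLG to the (park-adjusted) league.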
The difference between fWAR and bWAR in evaluating pitching is even more fundamental. fWAR uses FIP as its main component. This means that this version of WAR does not show exactly what happened in terms of the runs a pitcher allowed. But it does correct for the type of fielding a pitcher has behind him, which is a big deal. An analogy would be a quarterback with talent but a poor offensive line: he is just not going to be as good or show his potential. By the same token, a mediocre pitcher with an amazing defense has a much better chance of looking like an excellent pitcher.
This is the main difference, since both bWAR and fWAR use park factors. FIP is very different from ERA. However, bWAR's pitching equation is not based on raw ERA either. bWAR uses ERA+, which only adjusts for park and league factors. This version of pitching WAR shows what actually happened in a year, which makes it less predictive but a better record of results. That makes bWAR more useful for historical comparisons, and it lines up more closely with the results of a season. However, defense is often difficult to quantify, and groundball pitchers tend to buck certain trends in ERA/FIP. So again, both forms of WAR are useful.
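For anyone who wants to see the FIP versus ERA+ difference in numbers, here is a rough Python sketch. The FIP formula is the standard one; the constant of 3.10 is an assumption (the real constant is recomputed every season), the league ERA is assumed to be park-adjusted already, and the pitcher is hypothetical. The same pitcher is shown with two different earned run totals, standing in for a good and a bad defense behind him.

```python
FIP_CONSTANT = 3.10  # assumption: the actual constant changes from year to year


def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """Fielding Independent Pitching: only home runs, walks, HBP and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def era(er, ip):
    """Earned run average: earned runs per nine innings."""
    return 9 * er / ip


def era_plus(pitcher_era, lg_era):
    """ERA+ where 100 is league average and higher is better.

    Assumption: lg_era has already been adjusted for the pitcher's park.
    """
    return 100 * lg_era / pitcher_era


# Hypothetical pitcher: identical strikeouts, walks and home runs in both cases.
peripherals = dict(hr=20, bb=50, hbp=5, k=200, ip=200)
pitcher_fip = fip(**peripherals)

# Same pitcher, two different outcomes once balls in play are fielded.
era_good_defense = era(er=70, ip=200)
era_poor_defense = era(er=90, ip=200)

print("FIP:", round(pitcher_fip, 2))
print("ERA with good defense:", round(era_good_defense, 2),
      "ERA+:", round(era_plus(era_good_defense, lg_era=4.00)))
print("ERA with poor defense:", round(era_poor_defense, 2),
      "ERA+:", round(era_plus(era_poor_defense, lg_era=4.00)))
```

FIP comes out the same in both cases, while ERA and ERA+ move with the earned runs. That is the whole disagreement between the two pitching WARs in miniature.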
So now for the fun part: how do the WAR leaderboards differ?
2011 bWAR hitting leaders:
2011 fWAR hitting leaders:
So a few things jump out about the differences between the WARs. Ellsbury benefits greatly from his defense. But either way he was a top 5 player because he hit so well. I must say I never expected him to be this good. Kemp takes a big hit in his wins total because of differences in the calculations. It's interesting to see how the two systems evaluate players. Conversely, Pedroia is rated more highly by fWAR; his ridiculous fielding numbers really boost his wins. Meanwhile, a player like Braun is rated almost exactly the same by bWAR and fWAR.
Then there's Kinsler, who is not a top 10 player by bWAR... and by fWAR he is the 6th most valuable player in baseball. On the other hand, Sandoval breaks the baseball-reference top 10, but is barely top 25 by fangraphs' WAR. Another thing that sticks out to me is Bautista vs Kemp. Both players are similarly poor fielders, but Kemp gets a boost for playing center field in 2011 and Bautista takes a hit for not playing CF in 2011... illustrating the positional adjustment. This raises the question of just how large the positional adjustment should be, but that is perhaps a whole different discussion.
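To put a rough size on the positional adjustment, here is a small Python sketch. The run values per 162 games are the ones Fangraphs has published, to the best of my knowledge, so treat them as an assumption; the 10 runs per win conversion is a rule of thumb, and scaling by games played is a simplification. The function names are made up.

```python
# Positional adjustment in runs per 162 games.
# Assumption: values as published by Fangraphs, to the best of my knowledge.
POSITIONAL_RUNS_PER_162 = {
    "C": 12.5,
    "SS": 7.5,
    "2B": 2.5,
    "3B": 2.5,
    "CF": 2.5,
    "LF": -7.5,
    "RF": -7.5,
    "1B": -12.5,
    "DH": -17.5,
}

RUNS_PER_WIN = 10.0  # assumption: rough rule of thumb, varies by season


def positional_runs(position, games):
    """Positional adjustment in runs, scaled by games played at the position."""
    return POSITIONAL_RUNS_PER_162[position] * games / 162


def positional_gap_in_wins(pos_a, pos_b, games=162):
    """How many wins separate two positions over the same number of games."""
    gap_in_runs = positional_runs(pos_a, games) - positional_runs(pos_b, games)
    return gap_in_runs / RUNS_PER_WIN


# A full season in center field versus a full season in right field.
print(positional_gap_in_wins("CF", "RF"))
```

Under those assumptions a full-time center fielder starts roughly one win ahead of a full-time right fielder before either of them fields a ball.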
So since there is an obvious difference, yet many similarities, I'd like to see a new best of list. Here is what I'd like to call general WAR, or gWAR:
I'm going to be a little lazy and not include players who are not on both lists (except Votto, who I don't feel confident is #7). This list makes me happier, since personally I'd rather have the power and on-base percentage of Kemp and Bautista over Ellsbury. I could go on with more observations, but I'll just say the leaderboard for 2011 is rather surprising to me. If Ellsbury and Pedroia continue to be this good, the Red Sox should be ridiculed if they cannot assemble a good enough cast around the two to be a great team.
If nothing else I guess this illustrates that WAR is a generalization.
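For anyone who wants to build their own gWAR list, here is a minimal Python sketch. It assumes gWAR is simply the average of a player's fWAR and bWAR, and it keeps only players who appear on both lists. The function name, the player names and the WAR values are all placeholders, not real 2011 numbers.

```python
def gwar(fwar, bwar):
    """Average fWAR and bWAR for players who appear on both leaderboards.

    Assumption: gWAR is a plain mean of the two values.
    Returns (player, gWAR) pairs sorted from best to worst.
    """
    on_both_lists = fwar.keys() & bwar.keys()
    averaged = {name: (fwar[name] + bwar[name]) / 2 for name in on_both_lists}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Placeholder leaderboards, not real data.
fwar_leaders = {"Player A": 9.0, "Player B": 7.0, "Player C": 6.0}
bwar_leaders = {"Player A": 7.0, "Player B": 8.0, "Player D": 6.5}

for name, value in gwar(fwar_leaders, bwar_leaders):
    print(name, value)
```

Players C and D drop out because each is missing from one list, which is the same lazy shortcut described above.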
In 2010, Hamilton is #1 in fWAR, and #4 in bWAR. Longoria is #1 & #2 f/bWAR, Crawford has a 1.5 win gap between the two systems, etc. Anyway, these discrepancies are probably no surprise to many here, but I hope this has helped some of you, as it has helped me, in understanding some of the differences between the two. I like that bWAR is more hitting oriented, but it does not have linear weights. One question though: how different are the positional adjustments in the two systems, or are they rather similar?
Making the switch over to pitching, here is where things really get different...
2011 bWAR pitching leaders:
2011 fWAR pitching leaders:
This particular season saw a pretty similar top 5, which is sort of surprising. (Verlander's ratings are quite a bit different, though.) But #s 6-10 are not very comparable, because of the big differences between the two systems. One criticism that may be made of fWAR is that it is too objective. bWAR for pitching has some clutchness baked into it, for those who think that's a thing. Looking at WPA, Verlander is #1, Weaver is #2, and Ian Kennedy is #3. That part is fascinating, as WPA sort of mystifies me. Kennedy did get really lucky BABIP-wise last year, though.
So how to reconcile the two WARs? gWAR!
Giving Kershaw the Cy Young over Halladay is such a weird occurrence that no system can explain it. Kershaw did have a higher K rate, but he was also very lucky in terms of BABIP. Unfortunately I think they went with the W-L record, or wanted to change it up and not give too many awards to Roy.
If you like this examination of WAR, here is a somewhat interesting related article: http://www.patrickfloodblog.com/2010/07/16/war-problems-part-two/
I do not mean to call WAR into question as a usable statistic, but to try to make it better understood as an estimation system, an attempt to show each individual's overall value in an objective way. It just depends on how you want to evaluate players. I suppose in the case of pitching, fWAR is clearly more useful, but then again FIP removes some aspects of pitching. There is more to pitching than just strikeouts, walks, and home runs... so bWAR has value.
How big of a difference does fielding make? Teams like the Diamondbacks, Rays, Reds, Red Sox and Angels prevented a lot of runs last year. Those 5 teams were just a lot better at defense than other teams, at least by UZR. Except for the Reds (who had some bad starting pitching) they were all successful. The BoSox and Braves are a study in contrast. The Braves had an equal ERA and FIP, but not very good fielding. The Red Sox had a 4.20 ERA and a 4.05 FIP. Another poor fielding team, the Cardinals, had a great FIP that was very similar to their ERA. Then you have the Rays and Diamondbacks, and I am not sure how to read them. Take the Rays especially, who had a 4.03 FIP, a 3.58 ERA, very good UZR, and a very lucky BABIP. How does one explain that? On top of that, the Angels had a similarly worse FIP than their ERA, yet were great at fielding. Maybe I have been writing too long tonight, but those numbers seem rather odd. How does one explain all these gaps between a team's ERA and its FIP?
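One way to keep the direction of these gaps straight is to compute ERA minus FIP for each team. Here is a small Python sketch using only the two teams whose numbers are quoted above. It reads the Red Sox figures as a 4.20 ERA and a 4.05 FIP, which is an assumption on my part, and the names in the code are made up.

```python
# ERA and FIP as quoted in the text.
# Assumption: the Red Sox figures are read as a 4.20 ERA and a 4.05 FIP.
TEAMS_2011 = {
    "Rays": {"era": 3.58, "fip": 4.03},
    "Red Sox": {"era": 4.20, "fip": 4.05},
}


def era_minus_fip(team):
    """Negative: the team allowed fewer earned runs than FIP alone suggests."""
    stats = TEAMS_2011[team]
    return round(stats["era"] - stats["fip"], 2)


for team in TEAMS_2011:
    gap = era_minus_fip(team)
    direction = "ERA better than FIP" if gap < 0 else "ERA worse than FIP"
    print(team, gap, direction)
```

A negative gap is the direction you would expect good fielding or BABIP luck to push a team, and a positive gap is the opposite. By that reading the Rays' numbers line up with their UZR and BABIP, while the Red Sox are the team whose ERA did not match their defensive rating.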
Thanks for reading, interested to hear from the community more about the two types of WAR and player valuations.