Runs Created by Decade: An Analysis in Variance
In the discussion over Albert Pujols's status as a Hall of Famer, a statistical debate arose over how we should compare players from bygone decades in terms of OPS+. Valentin pointed out that a comparison to the average player (what OPS+ does) might be biased if there's a significant difference in the variance in OPS across eras. IOW, Valentin suggested that it would matter if, say, a player who posted a 130+ in the 1950's was an extreme abnormality while a player who posted a 130+ in the 1990's was relatively common.
Well, this presents an empirical question: is there a substantial difference in the variance in offensive production across eras? To help answer this question, I have examined the Runs Created for MLB from 1950 to the present. Specifically, I've looked at Runs Created for each decade from 1950 to 2005 (latest year available), to see if there has been a substantial difference in the variation in runs created over those decades.
I employed the dataset on batting provided in Sean Lahman's Baseball Database. I excluded all players with fewer than 150 at-bats in a season. The following data include the mean, median, variance and standard deviation for Runs Created. I've used the following runs created formula:
rc_X1 = (H + BB - CS + HBP - GIDP);
rc_X2 = TB + (.26 * (BB - IBB + HBP));
rc_X3 = (.52 * (SH + SF + SB));
rc_X4 = AB + BB + HBP + SH + SF;
rc = rc_X1 * (rc_X2 + rc_X3) / rc_X4;
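For readers without SAS, the same formula can be sketched in Python. This is only a transcription of the five statements above, not the author's actual program: the function name, the keyword arguments, the zero-opportunity guard, and the sample season line are all assumptions made for illustration.

```python
def runs_created(h, bb, cs, hbp, gidp, tb, ibb, sh, sf, sb, ab):
    """Runs Created, mirroring the rc_X1..rc_X4 terms in the post."""
    on_base = h + bb - cs + hbp - gidp              # rc_X1
    advancement = (tb + 0.26 * (bb - ibb + hbp)     # rc_X2
                   + 0.52 * (sh + sf + sb))         # rc_X3
    opportunities = ab + bb + hbp + sh + sf         # rc_X4
    if opportunities == 0:
        # Assumption: treat an empty batting line as zero runs created.
        return 0.0
    return on_base * advancement / opportunities


# Hypothetical season line, invented purely for illustration.
example = dict(h=180, bb=70, cs=3, hbp=5, gidp=12, tb=310,
               ibb=8, sh=0, sf=6, sb=10, ab=580)
print(round(runs_created(**example), 2))
```

Applying this to each player-season with at least 150 at-bats would give the RC values that the tables below summarize.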
I've also included a quantile breakdown for runs created in each of the decades:
The Fifties (1950 - 1959)
N 1813
Mean 55.9209845
Std Dev 31.5128001
Var 993.056569
Median 49.45367
Quantile Estimate
100% Max 187.62018
99% 148.14395
95% 114.88000
90% 100.00556
75% Q3 76.41856
50% Median 49.45367
25% Q1 30.34681
10% 20.05333
5% 16.39892
1% 11.37026
0% Min 8.57665
The Sixties (1960 - 1969)
N 2289
Mean 52.29305
Std Dev 29.53063
Median 47.23479
Variance 872.05787
Range 169.64884
Quantile Estimate
100% Max 177.66250
99% 132.01471
95% 107.21445
90% 94.02014
75% Q3 70.89245
50% Median 47.23479
25% Q1 27.71607
10% 18.39132
5% 14.82995
1% 10.43723
0% Min 8.01366
The Seventies (1970 - 1979)
N 2918
Mean 52.75782
Std Dev 28.66373
Median 47.93912
Var 821.60919
Range 154.24352
Quantile Estimate
100% Max 160.26340
99% 129.96813
95% 103.82451
90% 92.60296
75% Q3 72.36598
50% Median 47.93912
25% Q1 28.57147
10% 19.13247
5% 15.67842
1% 10.78744
0% Min 6.01988
The Eighties (1980 - 1989)
N 3162
Mean 51.70840
Std Dev 28.05438
Median 46.96388
Var 787.04827
Range 147.99317
Quantile Estimate
100% Max 153.60612
99% 126.75796
95% 105.90848
90% 90.73343
75% Q3 70.23333
50% Median 46.96388
25% Q1 28.43210
10% 19.38759
5% 16.01681
1% 11.61000
0% Min 5.61295
The Nineties (1990 - 1999)
N 3395
Mean 56.06562
Std Dev 31.74681
Median 49.65593
Var 1008
Range 186.91802
Quantile Estimate
100% Max 193.33921
99% 146.74168
95% 115.88431
90% 101.28195
75% Q3 76.41611
50% Median 49.65593
25% Q1 30.25041
10% 20.69598
5% 16.86006
1% 12.17636
0% Min 6.42118
The Oughts (2000 - 2005)
N 2639
Mean 59.70025
Std Dev 34.02325
Median 53.48486
Var 1158
Range 223.95212
Quantile Estimate
100% Max 230.40970
99% 154.90486
95% 124.12382
90% 108.89175
75% Q3 80.00087
50% Median 53.48486
25% Q1 31.56901
10% 21.79130
5% 18.15000
1% 12.58929
0% Min 6.45758
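As a rough guide to how summaries like the ones above could be reproduced outside SAS, here is a minimal Python sketch. The helper names and the sample (year, RC) pairs are hypothetical, and the use of the sample (n - 1) standard deviation and variance is an assumption about how the figures in the post were computed.

```python
import statistics


def decade_label(year):
    """Map a season year to the first year of its decade, e.g. 1987 -> 1980."""
    return (year // 10) * 10


def summarize(rc_values):
    """Summary statistics in the same spirit as the per-decade tables."""
    return {
        "N": len(rc_values),
        "Mean": statistics.mean(rc_values),
        "Std Dev": statistics.stdev(rc_values),    # sample SD (n - 1), assumed
        "Median": statistics.median(rc_values),
        "Var": statistics.variance(rc_values),     # sample variance, assumed
        "Range": max(rc_values) - min(rc_values),
    }


# Hypothetical (year, RC) pairs, invented for illustration only.
seasons = [(1981, 40.0), (1984, 55.0), (1989, 70.0), (2001, 60.0), (2004, 90.0)]

by_decade = {}
for year, rc in seasons:
    by_decade.setdefault(decade_label(year), []).append(rc)

for decade in sorted(by_decade):
    print(decade, summarize(by_decade[decade]))
```

Quantiles are left out on purpose, since different packages use different interpolation rules and the results would not necessarily match the SAS output.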
============
OK. So what do all those numbers mean? Well, for one, it is apparent that there have been some shifts in the variance in runs created over time. The maximum RC's have increased in some eras relative to other eras (Max RC in the Oughts: 230.41, Max RC in the 1980's: 153.61). This is consistent with the notion that the runs created distribution has expanded over the last few decades.
However, there doesn't seem to be a substantial difference in the standard deviations from decade to decade. IOW, while we've seen an expansion of the distribution at the tails, the spread in the bulk of the distribution remains much the same. The standard deviations dip slightly from the 1950's through the 1980's and then rise slightly in the 1990's and 2000's:
1950's 31.51 runs
1960's 29.53 runs
1970's 28.66 runs
1980's 28.05 runs
1990's 31.75 runs
2000's 34.02 runs
So the largest difference in standard deviation between any two decades is about 6 runs created in a season (34.02 in the 2000's versus 28.05 in the 1980's). So even if we weighted our OPS+ calculations to account for this difference in SD, I doubt we'd get much of a difference in our results.
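The comparison can be checked with a few lines of Python that recompute the widest gap from the six standard deviations listed above. The last step, dividing each SD by its decade mean (the coefficient of variation), is an extra view that is not in the original analysis; it is offered only as one hedged way to look at spread relative to the mean, which is the scale OPS+ works on.

```python
from itertools import combinations

# Standard deviations and means as reported in the decade tables (rounded).
sd = {"1950's": 31.51, "1960's": 29.53, "1970's": 28.66,
      "1980's": 28.05, "1990's": 31.75, "2000's": 34.02}
mean = {"1950's": 55.92, "1960's": 52.29, "1970's": 52.76,
        "1980's": 51.71, "1990's": 56.07, "2000's": 59.70}


def gap(pair):
    """Absolute difference in SD between two decades."""
    first, second = pair
    return abs(sd[first] - sd[second])


# Pair of decades whose standard deviations differ the most.
widest_pair = max(combinations(sd, 2), key=gap)
widest_gap = gap(widest_pair)
print(widest_pair, round(widest_gap, 2))

# Assumption: SD / mean as a scale-free measure of spread (not in the post).
cv = {decade: sd[decade] / mean[decade] for decade in sd}
for decade, value in cv.items():
    print(decade, round(value, 3))
```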
I invite comment on my baseline assumptions, the results, and the conclusions I've drawn. D.GOOCH
Comments
In a nutshell...
by GOOCH24 on Feb 10, 2008 6:48 PM EST
So in terms
http://steroids-and-baseball.com/
Doesn't what they say match what you say?
The "steroid" era is a myth.
And if not, why?
by Harknights on Feb 10, 2008 6:56 PM EST
On the steroids tangent...
But you should also note that it is possible that steroids had a substantial influence on hitters AND pitchers. So while there may have been significant effects, they tend to cancel one another out. D.GOOCH
by GOOCH24 on Feb 10, 2008 7:32 PM EST
don't forget expansion...
now, perhaps, the dramatic increase in the number of Latin American players has diminished the second effect (i.e. less popularity), but i doubt that it makes up for the fact that football and basketball have so dramatically outgrown baseball on the youth level, for several decades now.
so, the truly talented can excel and pull the mean up on their own. we've seen this over the past decade or so; whether it's due to steroids or other factors (or some combination) is open to debate, of course, but we've still seen an offensive increase in the tails, as you've described.
by kindred on Feb 11, 2008 7:18 AM EST
On a somewhat related subject
by Hardcore Legend on Feb 10, 2008 7:00 PM EST
Differences in variance
1980s 2000s
N 3162 2639
Mean 51.7 59.7
Std Dev 28.1 34.0
Median 47.0 53.5
Var 787.0 1158.0
Range 148.0 224.0
by Zubin on Feb 10, 2008 10:39 PM EST
An answer to your question
Remember, OPS+ is calculated in relation to the mean. So a substantial increase in the mean isn't particularly important. The question for us is the shape of the distribution.
The 'problem' with RC (or most any offensive production stat for that matter) in terms of a substantial increase in the mean is that offense is left censored (with most offensive stats you can't have negative offensive production).
So we could envision an RC distribution that would approximate the Poisson distribution. As the mean increases, it would approach a normal distribution (as the center moves further away from the censor point).
But from what I can see from the distribution of RC scores, the eighties and the oughts both approximate normal distributions. Note the relationship between the medians and the means: the exact same slight right skew. So the left censor doesn't really come into play here.
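One hedged way to put a number on "the relationship between the median and the means" is Pearson's second skewness coefficient, 3 * (mean - median) / SD. The sketch below applies it to the 1980's and 2000's figures from the tables in the post; the choice of this particular coefficient is an assumption, since the comment does not say how the skew was judged.

```python
# Mean, median, and SD for the two decades, taken from the tables in the post.
decades = {
    "1980's": {"mean": 51.70840, "median": 46.96388, "sd": 28.05438},
    "2000's": {"mean": 59.70025, "median": 53.48486, "sd": 34.02325},
}


def pearson_median_skew(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


skew = {name: pearson_median_skew(**values) for name, values in decades.items()}
for name, value in skew.items():
    # A positive value means the mean sits above the median (right skew).
    print(name, round(value, 2))
```

Both values come out positive and close to one another, which is the pattern the comment describes.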
So while I agree that run creation has inflated in the oughts in comparison to the eighties, I think the shapes of the distributions are relatively the same. Hence we shouldn't have a concern regarding comparisons of OPS+. Consider the hypothetical distributions below:
DISTRIBUTION 1
0
0
0
5
5
5
10
10
15
15
15
30
30
30
DISTRIBUTION 2
5
5
5
15
15
15
20
20
40
40
40
Note that the means and ranges of these two distributions are distinct. The mean for distribution 1 is 12.14 and the mean for distribution 2 is 18.21. The range for D1 is 30 and the range for D2 is 35. But the shape of these distributions is relatively the same. So a '20' in the second distribution, relatively speaking, means the same as a '10' in the first.
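The quoted means and ranges can be reproduced with a short Python check. Note that the figure of 18.21 matches the corrected Distribution 2 posted in the follow-up comment (14 values), so that version is the one used here; the variable names are just for illustration.

```python
import statistics

# Distribution 1 as posted.
d1 = [0, 0, 0, 5, 5, 5, 10, 10, 15, 15, 15, 30, 30, 30]
# Distribution 2 as given in the author's correction comment.
d2 = [5, 5, 5, 10, 10, 10, 15, 15, 20, 20, 20, 40, 40, 40]

for name, values in (("D1", d1), ("D2", d2)):
    mean = statistics.mean(values)
    spread = max(values) - min(values)
    print(name, "mean:", round(mean, 2), "range:", spread)
```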
Valentin's concern regarding OPS+ was whether the distributions were substantially different. Consider the following:
DISTRIBUTION 3
0
0
0
5
5
5
10
10
15
15
15
20
20
20
DISTRIBUTION 4
30
30
40
40
40
40
40
50
60
60
60
Distribution 3 here approximates a normal distribution. Distribution 4 is a flat distribution. It is decidedly NOT normal. So if there were substantial differences in the shape of the distributions of runs created in the 1980's in relation to the 2000's...well, I'd have some concerns regarding comparisons. As it is, I think the real world is closer to D's 1 & 2 rather than anything like D's 3 & 4. D.GOOCH
by GOOCH24 on Feb 10, 2008 11:42 PM EST
Correction on Distribution 2
5
5
5
10
10
10
15
15
20
20
20
40
40
40
by GOOCH24 on Feb 10, 2008 11:47 PM EST
Yes, but Valentin and I were concerned
by Zubin on Feb 11, 2008 10:50 AM EST
Comparisons...
It doesn't matter whether they are in the upper 1% of the curve or not; OPS+ adjusts relative to the mean of the particular year. Since the standard deviation and distribution numbers are not much different from era to era, OPS+ for the upper 1% of players will not differ any more than the OPS+ of an average player. Hence, statistically it is a good tool for comparison between eras.
by fourstick on Feb 11, 2008 12:47 PM EST
Um, yes it does...
We are talking about two guys in the tails, not in the bulk.
GOOCH:
Any chance you'd be willing to put up histograms of the distributions?
by Zubin on Feb 11, 2008 2:50 PM EST
your wish...
http://www.donaldgooch.com/RC_HIST.txt
I don't have time to comment right now (off to class) but feel free to peruse and comment in the meantime. :)
by GOOCH24 on Feb 11, 2008 3:30 PM EST
ok...
There is a pronounced increase in the spread of RC.
So the distribution has changed...but it has mostly changed with regard to the extreme upper tail of the distribution. IOW, it seems to me that the elites of the modern era are substantially outperforming (or out-run-creating) the elites of the 80's in relation to the means of their distributions...however the bulk of players (i.e. most of the players) are distributed relatively similarly from era to era.
But is that really a problem of comparison? Maybe the elites of the oughts are just that much better than the elites of the eighties.
I guess I'd like to see Val's argument threshed out a little more so I can determine if I disagree with it. ;) D.GOOCH
by GOOCH24 on Feb 11, 2008 7:23 PM EST
My guess is that this phenomenon
As far as who is better, I guess we can't really determine for sure since both guys are in the upper tail of their respective distribution. That was my whole point when I said 3 or 4 of Mattingly's best years are comparable to Pujols.
by Zubin on Feb 11, 2008 9:49 PM EST
Note...
At the 100% cut point (i.e. the maximum RC for that decade) there is a significant difference between the 80's and the 00's:
2000's: 230
1980's: 153
That's a fairly substantial difference. The top score for the 2000's is 5.02 standard deviations above the mean, while the top for the 1980's is 3.63 SD's above the mean.
2000's: 5.02 SD's a/b mean
1980's: 3.63 SD's a/b mean
But that difference rapidly dissipates as we move past the very extreme of the distribution. At the 95th percentile:
2000's: 1.89 SD's a/b mean
1980's: 1.93 SD's a/b mean
Note how the difference here has actually flipped. The 95th percentile in the 1980's actually outperformed the 2000's relative to their mean, though these are virtually identical.
Let's look at the 75th percentile:
2000's: 0.60 SD's a/b mean
1980's: 0.66 SD's a/b mean
Again, same result. 1980's slightly outperform the oughts, though difference is statistically irrelevant.
And the 25th percentile:
2000's: 0.83 SD's b/l mean
1980's: 0.83 SD's b/l mean
So I would argue that other than the absolute extreme of the 2000's distribution, the relative shape of the two distributions is substantially the same...presenting little to no problem for OPS+.
Additionally, I'm not convinced the extreme values are problematic for OPS+ either, since this is calculated relative to the mean. IOW, OPS+ should reflect the fact that the elite of the elite players in the oughts have been that much better than the average player in the oughts when compared to their counterparts in the 1980's.
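The standard-deviation distances quoted in this comment can be recomputed directly from the decade tables in the post. The sketch below does that for the maximum and the 95th, 75th, and 25th percentiles; the dictionary layout and function name are illustrative, and small differences in the last digit from hand-rounded figures are to be expected.

```python
# Means, SDs, and selected quantiles copied from the decade tables in the post.
tables = {
    "1980's": {"mean": 51.70840, "sd": 28.05438,
               "max": 153.60612, "p95": 105.90848,
               "p75": 70.23333, "p25": 28.43210},
    "2000's": {"mean": 59.70025, "sd": 34.02325,
               "max": 230.40970, "p95": 124.12382,
               "p75": 80.00087, "p25": 31.56901},
}


def sds_from_mean(value, mean, sd):
    """Signed distance from the mean in standard-deviation units (a z-score)."""
    return (value - mean) / sd


z = {}
for decade, t in tables.items():
    z[decade] = {point: sds_from_mean(t[point], t["mean"], t["sd"])
                 for point in ("max", "p95", "p75", "p25")}
    print(decade, {point: round(score, 2) for point, score in z[decade].items()})
```

Negative values indicate a cut point below the mean, as with the 25th percentile.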
Val? D.GOOCH
by GOOCH24 on Feb 12, 2008 1:28 PM EST
I never really wanted to decry OPS+
And I think it's a valid concern for HOF consideration--not just to ask the question "How much did the guy beat the average player", which OPS+ answers pretty well, but also to ask "How many other guys performed at the same level?" or "How rare is it for a player to perform at this level?"
And I think you've shown that, at the extremes, these are empirically not the exact same question.
by Valatan on Feb 12, 2008 1:37 PM EST
Disagree...
I agree that your data validates using OPS+ for the middle ~95% or so of the distribution, but when we are discussing HoFers we are generally discussing guys in the upper 1-2%. And therefore OPS+ isn't as great a tool when measuring across eras.
by Zubin on Feb 12, 2008 2:29 PM EST
But...
As I see it, OPS+ is useful (whereas OPS may be much less so) because it factors in the mean OPS for that particular season (hence, contingent on the era of baseball the offense is produced in)...and thus accounts for the general inflation of offensive statistics in the 00's...as much as it accounts for the deflation of offensive statistics in the 70's and 80's.
I can see the problem with using OPS on its own. An .800 OPS in 2007 is much different than an .800 OPS in 1987. But OPS+ adjusts for the era effects by accounting for the average OPS in the league for that season.
So, as I see it, OPS+ should be an era-independent stat. D.GOOCH
by GOOCH24 on Feb 12, 2008 3:46 PM EST
I think you misunderstand me
Again, for 95%+ of players it will work great, but when we are writing about Pujols and Mattingly it is not as meaningful of a comparison.
by Zubin on Feb 13, 2008 10:33 AM EST
Not to mention
by Valatan on Feb 13, 2008 12:02 PM EST
Now that...
If we're interested in looking at offensive production relative to the middle offensive player (in a season, era, what have you) rather than the average offensive production, then we might recast OPS+ using the league median rather than the league mean OPS. The median, of course, is immune to tail effects. Call it OPS++. ;) D.GOOCH
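The point about the median being immune to tail effects can be illustrated with a deliberately simplified index: 100 times a player's OPS divided by a league "typical" OPS. This is not the official OPS+ calculation (which works from league OBP and SLG separately and adjusts for park); the function name and the league OPS values below are hypothetical.

```python
import statistics


def relative_index(player_ops, league_ops, center=statistics.mean):
    """Simplified index: 100 * player OPS / league center (mean or median).

    Assumption: a stand-in for OPS+, ignoring park factors and the
    separate OBP/SLG components used in the real statistic.
    """
    return 100 * player_ops / center(league_ops)


# Hypothetical league OPS values, invented for illustration.
league = [0.650, 0.700, 0.720, 0.740, 0.760, 0.780, 0.800]
# Same league, but the best hitter is replaced by an extreme outlier.
league_heavy_tail = league[:-1] + [1.400]

player = 0.900
for name, center in (("mean", statistics.mean), ("median", statistics.median)):
    normal = relative_index(player, league, center)
    heavy = relative_index(player, league_heavy_tail, center)
    print(name, round(normal, 1), round(heavy, 1))
```

With the mean as the baseline, the outlier pulls the same player's index down; with the median, the index does not move.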
by GOOCH24 on Feb 13, 2008 5:30 PM EST
OK...
My question is: why is that a problem of comparison? If OPS+ shows that Pujols was, say, 30% better than Mattingly...then it would tend to conform with the distributional differences we've noted between the eras.
So I guess my questions are these:
Do you think OPS+ would show that Mattingly is worse than Pujols?
If that's what OPS+ shows, why would that be a problem for era comparisons?
If the problem is the mean being drawn up by the tails, I would expect the opposite problem: OPS+ would show the difference between Pujols & Mattingly to be smaller than it should be.
Is that what you're arguing?
D.GOOCH
by GOOCH24 on Feb 13, 2008 5:35 PM EST
MHO...
As far as comparisons, I would expect OPS+ to show in their best seasons, Pujols was better than Mattingly.
That being said, OPS+ still presents a problem for comparisons because the environment is different and the populations you are basing your comparison upon are different.
I am not so sure about the tails drawing up the mean. I think the problem is the tails themselves. Again, once we hit the area where the distributions are no longer comparable, I think we lose the ability to compare players across eras.
by Zubin on Feb 14, 2008 3:17 PM EST
Also,
by Valatan on Feb 11, 2008 2:49 PM EST
And additionally,
by Valatan on Feb 11, 2008 2:52 PM EST
Now for gits and shiggles
Would also be interesting to see that graph overlaid with the average ERA...
Excellent work Gooch. I take back all the average things I've said about you. :)
by bukowski on Feb 10, 2008 11:06 PM EST
Excellent Post
by ajo080s on Feb 11, 2008 8:37 AM EST
Just to be clear
Also, if the distribution suffers from censoring, just use a censored normal distribution estimate of the standard deviation and mean.
http://www.ssicentral.com/lisrel/techdocs/censor.pdf
There's no easy function to do this in Excel, if that's what you're using, but it would only take a couple more cells to fill in.
by enoscountry on Feb 11, 2008 3:28 PM EST
some answers...
You're right, if censoring is the problem there is a correction factor that can be applied to get unbiased SD's and means.
I'm using SAS. D.GOOCH
by GOOCH24 on Feb 11, 2008 7:26 PM EST
Interesting
We're just seeing how many elite hitters and vice versa were around in each decade. You could just do this with a graph of OPS+ by decade to see how heavy the tails are or whatever.
Also, the decade cutoffs are arbitrary. 1967 to 1977 might get you a totally different answer because it causes two elite or terribly shitty players' peak or valley seasons to overlap when they did not.
by plh903 on Feb 13, 2008 2:47 AM EST
