Randomness in a shortened season

We know that small sample sizes can create weird results. What does that mean for the 2020 MLB season?

Divisional Series - St Louis Cardinals v Atlanta Braves - Game Five Photo by Carmen Mandato/Getty Images

I won’t beat around the bush: there is a nonzero chance the 2020 MLB season gets cancelled altogether. As the calendar prepares to flip to June, negotiations between the league and its players association are expected to–one way or another–reach a climax in the very near future.

It’s stating the obvious to note that, assuming there is baseball to be played, this season is going to be different. We’ve recently discussed at VEB how potential rule modifications such as a universal DH or expanded roster might affect the Cardinals, but one change that will undoubtedly affect every team is that a shortened season would be...well, shortened.

I think we can all agree the Dodgers, for example, are a better baseball team than the Marlins. If the two played a single game, we’d expect Los Angeles to “prove” they were better more often than not. But baseball is weird and Miami would occasionally pull off the upset. If they were to play each other, say, 10 times instead of just once, we’d be much more confident in the Dodgers to come out ahead.

This underscores the statistical principle that the variability of an average, such as winning percentage, decreases as sample size increases and vice versa. So if you cut the normal 162-game schedule down to the proposed 82 games or even fewer, it becomes more difficult for the good teams to separate themselves from the bad ones. In other words, things get more random.
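To put rough numbers on the Dodgers-Marlins example, here is a minimal sketch that computes how often each side comes out ahead over a set of games. The 60% single-game win probability for the favorite is purely a hypothetical figure chosen for illustration, and the sketch assumes every game is independent with the same odds.

```python
from math import comb

# Hypothetical: the favorite wins any single game 60% of the time.
ASSUMED_FAVORITE_WIN_PROB = 0.60


def series_outcomes(games, win_prob):
    """Return (favorite ahead, tied, underdog ahead) probabilities.

    Assumes independent games with a constant win probability,
    so the favorite's win total follows a binomial distribution.
    """
    ahead = tied = behind = 0.0
    for wins in range(games + 1):
        losses = games - wins
        prob = comb(games, wins) * win_prob**wins * (1 - win_prob) ** losses
        if wins > losses:
            ahead += prob
        elif wins == losses:
            tied += prob
        else:
            behind += prob
    return ahead, tied, behind


for games in (1, 10):
    ahead, tied, behind = series_outcomes(games, ASSUMED_FAVORITE_WIN_PROB)
    print(f"{games:>2} games: favorite ahead {ahead:.1%}, "
          f"tied {tied:.1%}, underdog ahead {behind:.1%}")
```

Under that assumption, the underdog comes out ahead 40% of the time in a single game but only about 17% of the time over 10 games. A different assumed win probability would shift the exact figures, not the direction.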

How does that alter the pennant race? Conveniently, Baseball Prospectus and FanGraphs both projected the Cardinals to finish the year 81-81 before the season was halted. Admittedly, that feels a bit low, but using a perfectly average team will serve as a decent proxy for how season length can influence playoff odds.

I created a hypothetical .500 team and ran 1,000 simulations of 162-game seasons and another 1,000 simulations of 82-game seasons. Below are the 25th, 50th, and 75th percentile records for both season lengths.
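For readers who want to try this at home, here is a minimal sketch of how such a simulation could be set up. It is not necessarily how the numbers in this article were generated: it assumes each game is an independent coin flip for the .500 team, and the function names and the random seed are my own choices for illustration.

```python
import random
import statistics


def simulate_wins(games, win_prob=0.5, seasons=1000, seed=2020):
    """Simulate win totals for `seasons` seasons of `games` games each.

    Assumption: every game is independent with the same win probability.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < win_prob for _ in range(games))
        for _ in range(seasons)
    ]


def summarize(wins):
    """Return the quartiles and sample standard deviation of win totals."""
    p25, p50, p75 = statistics.quantiles(wins, n=4)
    return {"p25": p25, "p50": p50, "p75": p75, "sd": statistics.stdev(wins)}


for games in (82, 162):
    summary = summarize(simulate_wins(games))
    print(f"{games}-game season: "
          f"25th={summary['p25']:.0f}, 50th={summary['p50']:.0f}, "
          f"75th={summary['p75']:.0f} wins, sd={summary['sd']:.2f}")
```

Because the draws are random, the exact percentiles and standard deviations will differ slightly from run to run and from seed to seed.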

Simulation Results

Metric | 82-Game Season | 162-Game Season
25th Percentile | 38-44 | 77-85
50th Percentile | 41-41 | 81-81
75th Percentile | 44-38 | 85-77
Standard Deviation | 4.54 wins | 6.40 wins

At first glance, there doesn’t seem to be much difference between the 82-game and 162-game seasons. After all, the 50th percentile outcome is a .500 record for both. But even though the 82-game season is only about half as long, its standard deviation is nowhere near halved: 4.54 wins is still about 71% of 6.40. This confirms what we predicted: a .500 club will, on average, post a .500 record, but is more prone to abnormally above or below average campaigns when limited to 82 games instead of 162.
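The simulated standard deviations line up with what basic probability would predict. As a sketch, assume each game is an independent trial with win probability p (the binomial model, which is an assumption on my part rather than a stated detail of the simulation). For a season of n games:

```latex
\begin{align*}
\sigma_{\text{wins}} &= \sqrt{n\,p\,(1-p)} = \frac{\sqrt{n}}{2} \quad \text{for } p = 0.5 \\
\sigma_{82} &= \frac{\sqrt{82}}{2} \approx 4.53, \qquad
\sigma_{162} = \frac{\sqrt{162}}{2} \approx 6.36 \\
\frac{\sigma_{162}}{\sigma_{82}} &= \sqrt{\frac{162}{82}} \approx 1.41 \\
\frac{\sigma_{\text{wins}}}{n} &= \frac{1}{2\sqrt{n}} \approx 0.055 \ (n = 82), \qquad \approx 0.039 \ (n = 162)
\end{align*}
```

In other words, the spread in wins grows with the square root of the schedule length, so roughly doubling the number of games only increases it by about 41%. Expressed as winning percentage, the spread is wider in the shorter season.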

As far as postseason implications go, let’s assume MLB follows through with its proposal to expand the 2020 playoffs to seven teams per league. Since 2012–when MLB added the second wild card–the average seventh-place NL team turned in an 82-80 record. If we want to look at “elite” teams in a given season, the third-best team in the NL typically has a 93-69 record. The top team in the league, presumably receiving a first-round bye in this new format, averages out at 100-62.

However, we need to account for those numbers changing under an 82-game schedule. Through the first 82 games of every season since 2012, the average seventh, third, and first place NL teams won 42, 47, and 52 games, respectively.

Returning to the simulation, we can take a look at how often our .500 team reached those benchmarks.

Simulation Playoff Odds

NL Rank | Target Record | 82-Game Odds | 162-Game Odds
7th | 42-40 or 82-80 | 46.7% | 46.4%
3rd | 47-35 or 93-69 | 12.0% | 3.8%
1st | 52-30 or 100-62 | 1.4% | 0.2%
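As a cross-check on the simulated odds, the same benchmarks can be evaluated exactly rather than by simulation. The sketch below assumes independent games with a constant 50% win probability (the binomial model) and sums the probability of every win total at or above each target; the function and variable names are mine, not part of the original analysis.

```python
from math import comb


def prob_at_least(target_wins, games, win_prob=0.5):
    """Exact probability of winning at least `target_wins` of `games`.

    Assumes independent games with a constant win probability.
    """
    return sum(
        comb(games, wins) * win_prob**wins * (1 - win_prob) ** (games - wins)
        for wins in range(target_wins, games + 1)
    )


# Win targets from the article: (82-game target, 162-game target).
BENCHMARKS = {"7th": (42, 82), "3rd": (47, 93), "1st": (52, 100)}

for rank, (short_target, full_target) in BENCHMARKS.items():
    short_odds = prob_at_least(short_target, 82)
    full_odds = prob_at_least(full_target, 162)
    print(f"{rank}: 82-game odds {short_odds:.1%}, "
          f"162-game odds {full_odds:.1%}")
```

The exact figures should land close to the simulated ones, with small gaps explained by the random noise inherent in 1,000 simulated seasons.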

The odds of finishing in the top seven (i.e. the chances of qualifying for the playoffs) are roughly a toss-up no matter how long the season is. In practice, most of the benefit for a middling .500 team comes from the fact that the playoffs would now include seven teams instead of five. (Although MLB was considering a larger postseason going forward even prior to the COVID-19 outbreak.)

That said, our .500 team has significantly better hopes at a top-three record when the season is shortened to 82 games (12.0% vs. 3.8% over 162 games), in addition to being seven times more likely–but still a longshot–to grab the bye as the #1 seed (1.4% vs. 0.2%).

Of course, playoff probabilities are a zero-sum game. Every extra percentage point a mediocre team gains comes at the expense of somebody else. In this case, the ones losing out are the teams at the top, which means if you believe St. Louis is one of the premier teams in the National League, a shorter season becomes bad news.
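The flip side can be sketched the same way. Suppose, hypothetically, a team with true 95-win talent (a .586 winning percentage; this figure is my assumption, not a projection for any real club) and hold the seventh-place benchmarks fixed at 42 and 82 wins. Under the same independent-games assumption, its chances of clearing the bar are:

```python
from math import comb

# Hypothetical strong team: true talent of 95 wins per 162 games.
ASSUMED_STRONG_WIN_PROB = 95 / 162


def prob_at_least(target_wins, games, win_prob):
    """Exact probability of winning at least `target_wins` of `games`,
    assuming independent games with a constant win probability."""
    return sum(
        comb(games, wins) * win_prob**wins * (1 - win_prob) ** (games - wins)
        for wins in range(target_wins, games + 1)
    )


# Seventh-place benchmarks from the article: 42 wins (82 games), 82 wins (162).
strong_short = prob_at_least(42, 82, ASSUMED_STRONG_WIN_PROB)
strong_full = prob_at_least(82, 162, ASSUMED_STRONG_WIN_PROB)

print(f"Strong team, 82-game season:  {strong_short:.1%}")
print(f"Strong team, 162-game season: {strong_full:.1%}")
```

With these assumptions, the strong team is less likely to reach the benchmark in the shorter season than in the full one. Treating the benchmark as fixed is a simplification, since the other teams' records would be noisier in a short season as well.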

The way I see it, though, baseball resuming would be good news whether the circumstances benefit the Cardinals or not.