
The clutchiness of the home team

In the comments to this post about the results of his/her simulator, Xeifrank recently mentioned that, due to the rules of baseball, the most likely outcomes between closely matched teams have the home team winning by a single run. In particular, even if the away team is favored, the most likely individual scores still involve the home team winning by one run, because the scores in which the away team wins are spread more evenly over a wider range of margins. This is basically a result of the home team not tacking on additional runs in the 9th inning or later when they win (since the game ends). I asked if this trend is visible in real games and not just the simulations and, on his/her suggestion, decided to take a look for myself. (Spoiler alert: the answer is yes.)
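To make the mechanism concrete, here is a minimal sketch of a toy simulation. It is not Xeifrank's simulator. The per-inning run distribution is a set of made-up illustrative weights, both teams are assumed to be identical, and every walk-off is assumed to end with a one-run margin (which ignores walk-off home runs with men on base). The names `half_inning`, `play_game`, and `simulate` are hypothetical.

```python
import random

# Hypothetical run distribution for one half-inning (illustrative values only).
RUNS = [0, 1, 2, 3, 4]
WEIGHTS = [0.73, 0.15, 0.07, 0.03, 0.02]


def half_inning(rng):
    """Draw the number of runs scored in one half-inning."""
    return rng.choices(RUNS, weights=WEIGHTS)[0]


def play_game(rng):
    """Return (away_runs, home_runs) for one game between identical teams."""
    away = home = 0
    inning = 1
    while True:
        away += half_inning(rng)
        if inning >= 9 and home > away:
            # Home team leads after the top half, so it does not bat.
            return away, home
        home += half_inning(rng)
        if inning >= 9 and home > away:
            # Walk-off: assume play stops as soon as the winning run scores.
            return away, away + 1
        if inning >= 9 and away > home:
            return away, home
        inning += 1  # innings 1-8, or still tied: keep playing


def simulate(n_games=20000, seed=0):
    """Return (overall home win share, home share of one-run games)."""
    rng = random.Random(seed)
    home_wins = one_run_home = one_run_away = 0
    for _ in range(n_games):
        away, home = play_game(rng)
        if home > away:
            home_wins += 1
        if home - away == 1:
            one_run_home += 1
        elif away - home == 1:
            one_run_away += 1
    return home_wins / n_games, one_run_home / (one_run_home + one_run_away)


if __name__ == "__main__":
    overall, one_run = simulate()
    print(f"home win share overall: {overall:.3f}")
    print(f"home win share in one-run games: {one_run:.3f}")
```

Because the two teams are identical in this sketch, the overall home win share should sit near .500, while the home share of one-run games should come out clearly higher. Any gap comes purely from the end-of-game rules.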

I took as my data the scores of all of the games from 2009 (through last night, 9/24), yielding a total of 2288 games. Overall, the home team had a winning percentage of .546, in line with the average over the history of baseball of 54%. To be honest, I wasn't aware prior to this analysis that home field advantage was actually that strong in baseball (even though it is much weaker than in most other sports, from what I understand). In any event, here's a heatmap of the joint histogram of final scores:

Because the home team already has a slight edge, the results aren't as surprising in this case, but the most common scores all had the home team winning by one run (4-3, 3-2, 5-4). Because the home team wins more games overall, it's a little hard to interpret the above plot in terms of the one-run bias we are looking for. One way to account for this is to normalize the home team and away team victories separately. The result is a plot that shows the probability of a given score, assuming that the home team wins (upper left triangle) or that the away team wins (lower right triangle). The resulting plot is:

If there were no bias towards one-run games, the above plot would look symmetric about the diagonal. While the effect isn't particularly strong, there clearly is an asymmetry in the data. Outcomes of 4-3 and 3-2 are more likely for the home team than the away team, meaning that a larger share of home team wins than of away team wins come by these scores. To offset this, the away team has more density further away from the diagonal. Note, for example, that 3-1, 4-1, and 5-1 victories are enriched in away team victories relative to home team victories. There also seems to be enrichment for away teams blowing out the home team (10+ runs vs 1-6 runs), but that might just be noise.
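For anyone who wants to reproduce the separate normalization, here is a minimal sketch. It assumes the data is simply a list of `(away_runs, home_runs)` final scores. The function name `conditional_score_probs` and the handful of sample games are made up for illustration and are not the 2009 data.

```python
from collections import Counter


def conditional_score_probs(games):
    """Map each (away_runs, home_runs) score to its probability, where
    home wins are normalized by the number of home wins and away wins
    by the number of away wins."""
    # Baseball has no ties, but drop any that slip in from bad data.
    counts = Counter((a, h) for a, h in games if a != h)
    home_total = sum(n for (a, h), n in counts.items() if h > a)
    away_total = sum(n for (a, h), n in counts.items() if a > h)
    probs = {}
    for (a, h), n in counts.items():
        total = home_total if h > a else away_total
        probs[(a, h)] = n / total
    return probs


# Hypothetical sample of final scores, written as (away, home).
sample_games = [(3, 4), (3, 4), (2, 3), (1, 5), (4, 3), (5, 1), (4, 1), (3, 2)]
probs = conditional_score_probs(sample_games)

# Compare mirror-image scores: a 4-3 home win versus a 4-3 away win.
print(probs[(3, 4)], probs[(4, 3)])
```

Checking for asymmetry then amounts to comparing each score with its mirror image across the diagonal, as in the last line.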

Finally, I wanted to look at the same data but focusing on the margin of victory instead of the exact score. This plot shows the total winning percentage of home teams given a particular margin of victory (but not conditioned on a home team victory):

[Figure: home team winning percentage by margin of victory]

This plot really shows the dramatic difference in one-run wins. In one-run games, the home team won almost 61% of the time; over a 162-game season, that's equivalent to a record of 99-63. In contrast, the winning percentage in games decided by two or more runs is closer to 52% (84-78). On average, obviously, this still comes out to the 54% winning percentage that home teams have overall. While I didn't demonstrate it here, this bias towards one-run home team victories is likely a result of the rules of baseball, and not some psychological lift that the home team gets in close games. This idea is supported by the fact that it still appears in the simulator, which obviously doesn't have any psychology or anything of the sort built in.
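The numbers above are easy to check. Here is a minimal sketch that computes the home winning percentage for each margin of victory and converts a winning percentage into a record over a 162-game season. The helper names `home_win_pct_by_margin` and `season_record` and the sample scores are hypothetical, and scores are again assumed to be `(away_runs, home_runs)` pairs.

```python
from collections import defaultdict


def home_win_pct_by_margin(games):
    """Map each margin of victory to the share of those games won by the home team."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for away, home in games:
        margin = abs(home - away)
        totals[margin] += 1
        if home > away:
            wins[margin] += 1
    return {m: wins[m] / totals[m] for m in sorted(totals)}


def season_record(win_pct, n_games=162):
    """Convert a winning percentage into a (wins, losses) record."""
    wins = round(win_pct * n_games)
    return wins, n_games - wins


# Hypothetical sample of final scores, written as (away, home).
sample_games = [(3, 4), (2, 3), (4, 3), (1, 5), (5, 1), (6, 2)]
print(home_win_pct_by_margin(sample_games))

print(season_record(0.61))  # one-run games
print(season_record(0.52))  # games decided by two or more runs
```

With the rounded percentages from the post, `season_record` gives 99-63 and 84-78, matching the records quoted above.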

So, what's the point of all of this? As my title alludes to, one way to view this result is as a caution against selection bias. Winning percentage in one-run games is often thrown around as some measure of how "clutch" a team is. While I know I'm preaching to the choir, this is just one example of how such discrepancies in results can arise without any human element at all. Next time you hear about how well a team has performed in one-run games, I'd at least take a look at how many of those games were won on their home turf.


Comments


lol...home runs.

I misread as HR’s at first…. til I read the x axis.

Yo MLBPA, I'm really happy for you, and I'mma let you finish, but Albert is the most ridiculous player of all time. OF ALL TIME!

by vexedtechie on Sep 25, 2009 6:36 PM EDT

Figure 3 is dramatic indeed

Nice work.

Guys like Bradley are exactly why we can't have a pumpkin patch anymore.

by liam on Sep 25, 2009 7:12 PM EDT

Awesome, awesome stuff

This seems like a great avenue for future study. I think the obvious reason is that the home team has the walk off, which is going to be one-sided in the favor of the home team and usually results in a 1 run victory. However, part of it could also be that closers will play better at home (as does everyone else in baseball) and they’ll have a better save% in 1 run games at home than on the road.

by vivaelpujols on Sep 26, 2009 12:44 AM EDT

Thanks much

In regards to further analysis, are there public databases for these kinds of stats? I scraped all the data here from cbssportsline, but that’s a little clunky and doesn’t have all the data I’d like. In particular, I was hoping to split it out by extra inning games, but that was gonna take a little more scripting than I felt up to since I couldn’t find a single page with that information on it.

by brackenthebox on Sep 26, 2009 11:41 AM EDT

You’d probably need to use Retrosheet, the Lahman DB or BIS data and write some scripts.
vr, Xei

by Xeifrank on Sep 26, 2009 1:49 PM EDT

Or

Oracle DBxml, which is what I’ve been messing around with lately.

Guys like Bradley are exactly why we can't have a pumpkin patch anymore.

by liam on Sep 27, 2009 12:26 AM EDT

I've got a decent amount of MySQL experience

I’m planning on loading the Retrosheet data into a new DB later this week. I know you’ve posted about the PITCHf/x data previously. Any other great data sources out there?
Thanks.

by brackenthebox on Sep 28, 2009 9:01 AM EDT

Well, if you want to have a shortcut to get a Retrosheet database

You can download the entire thing in an SQL dump at this website:

http://www.wantlinux.net/2009/04/retrosheet-baseball-mysql-database-download/

I haven’t tried it yet, because it’s a huge file and would probably kill my computer, but you could give it a shot.

by vivaelpujols on Sep 28, 2009 1:34 PM EDT

Also, you say that you are good with SQL?

Do you think you could drop me an email so I could pester you with some questions? I’m having some trouble figuring out how to set up a query. My email is listed on my profile.

by vivaelpujols on Sep 28, 2009 2:03 PM EDT

Bravo

Well done, and that’s a good job on the stats.

by zoomzoomj88 on Sep 26, 2009 11:24 AM EDT

I rec'd this

I figured I should in order to keep the rec/comment ratio nice and high.
Great stuff.

Albert Pujols does not have "down" years. He has "~6 WAR" years.

by mattybobo on Sep 26, 2009 11:43 AM EDT

These graphs look like extreme close-ups of Mega Man.

EXTREME CLOSE-UP!!!!!!

I once shot a man just to see him die...then I got distracted and missed it.

by TheDuke32 on Sep 26, 2009 6:17 PM EDT

GET EQUIPPED WITH

HEAT MAP!

Albert Pujols does not have "down" years. He has "~6 WAR" years.

by mattybobo on Sep 26, 2009 9:39 PM EDT

one of the best fanposts I've seen

bravo

Positronic Upgraded Juggernaut Optimized for Logical Sabotage

by Cards Fan in Chitown on Sep 26, 2009 6:53 PM EDT

what happens if you subtract 1-run walkoff victories?

I realise this isn’t 100% fair either, but would be interesting to see.

Felonius Monk - bitching to contact since 2008

by Felonius_Monk on Sep 28, 2009 6:44 AM EDT

Wow

well done…and very interesting data…rec’d

"Albert hits good pitches hard and bad pitches even harder. And when he gets in the batter's box, if you pray, then you start praying. And if you don't pray, you think about starting."--Brian Bannister

by VolsnCards5 on Oct 1, 2009 11:01 AM EDT
