An early look at Statcast's new batted ball data

MLB Advanced Media is releasing a small amount of batted ball data to the public. Here's a look at what this move entails, as well as what the early season data looks like for Cardinals hitters.

Jasen Vinlove-USA TODAY Sports

This season, MLB Advanced Media began using Statcast, a comprehensive data tracking system that captures an enormous amount of data for every single game using TrackMan radar-based technology. Since I will actually be interning for TrackMan this summer, I thought it would be interesting to find out more about Statcast and look at some of the data that has been collected and released to the public.

If you've been following games using MLB Gameday, you may have noticed some of the batted ball data that is being released to the public. Under the "feed" tab, there is a description of every at-bat result.

In addition to the usual description of each play, we are given an exit speed and a distance for every ball in play. While this batted ball data is only a small fraction of the data being collected by Statcast, it is still better than nothing, which is what we had (at least publicly) before this season. As this process of releasing batted ball data to the public is still very new, there are a few kinks that need to be ironed out.

First, the released data appears to be incomplete, as some parks are not (up to this point) reporting certain pieces of data for every game. Second, there appear to be some inconsistencies in the batted ball distance numbers. For example, most ground balls do not have a distance given at all, and even if they did, we would have to wonder whether the distance given was where the ball first hit the ground or where it was fielded. Batted balls that reach the outfield aren't all being measured the same way, either.

I noticed one example of this in Thursday's game against the Brewers. Yadier Molina's line drive RBI hit in the 8th was given a hit distance of 263 feet, and this distance appears to be where the ball was fielded in the outfield, not where it first hit the ground. However, Matt Carpenter's RBI double from the same game appeared to be measured by where it first landed, not where it was fielded. While it had a hit distance of 312 feet, it was fielded at the wall in right center field, which is definitely more than 312 feet from home plate.

Personally, I am not surprised that there are inconsistencies like this at this point in the process. When I interned at Baseball Info Solutions, tracking these kinds of details could get tricky at times, and it was common to make mistakes such as these. BIS has a number of set rules on how to assign hit distances to every kind of ball in play, and I'm sure that Statcast will eventually make these details more consistent as well.

With all this new data that is now publicly available, many people are wondering what kind of analysis can be done and what kind of things we can learn about the game by looking at this data. John Choiniere of Beyond the Box Score did a brief writeup on the topic, concluding that we probably need far more data before we can do meaningful analysis. In addition, Ben Lindbergh of Grantland explored this topic in more detail, mentioning outcome-independent player evaluations, injury/fatigue detection, and better defensive positioning as just a few areas of analysis that could improve through the use of this data. While we probably don't have enough data yet to draw any grand conclusions or dramatically alter the way we analyze players, I thought I'd at least take a look at some of the data available for Cardinals players and see if there's anything interesting.

Over at Baseball Savant, there is a basic leaderboard for this batted ball data, and it includes maximum, minimum, and average batted ball velocity as well as maximum and average batted ball distance. Unfortunately, there was no way to filter the leaderboard by team, so I created my own Cardinals-specific table with the same data. (I excluded pitcher batting stats.)

| Name | ABs With Data | Max Velocity (MPH) | Min Velocity (MPH) | Avg Velocity (MPH) | Max Distance (Feet) | Avg Distance (Feet) |
|---|---|---|---|---|---|---|
| Matt Holliday | 15 | 110.00 | 70.00 | 89.60 | 290.00 | 81.93 |
| Kolten Wong | 17 | 107.00 | 65.00 | 90.00 | 372.00 | 200.00 |
| Jason Heyward | 15 | 106.00 | 62.00 | 89.33 | 382.00 | 146.87 |
| Jon Jay | 17 | 106.00 | 45.00 | 86.65 | 324.00 | 200.65 |
| Randal Grichuk | 2 | 105.00 | 96.00 | 100.50 | 363.00 | 342.00 |
| Matt Carpenter | 19 | 103.00 | 60.00 | 91.32 | 378.00 | 198.53 |
| Matt Adams | 12 | 101.00 | 57.00 | 84.50 | 397.00 | 172.92 |
| Jhonny Peralta | 14 | 101.00 | 72.00 | 92.07 | 353.00 | 119.36 |
| Yadier Molina | 16 | 99.00 | 60.00 | 87.75 | 353.00 | 106.94 |
| Mark Reynolds | 3 | 96.00 | 61.00 | 77.67 | 225.00 | 75.00 |
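
For anyone who wants to build a similar team table from play-by-play records, here is a minimal Python sketch of the aggregation. The record layout, the `summarize` function, and the player names and numbers are all hypothetical and made up for illustration; they are not real Statcast data or a Baseball Savant API. The sketch also assumes that batted balls with no reported distance are skipped when averaging distance, which may not match how the leaderboard handles them.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (batter, exit_speed_mph, distance_ft or None).
# Values are invented for illustration only.
batted_balls = [
    ("Player A", 110.0, 290.0),
    ("Player A", 70.0, None),    # e.g. a ground ball with no distance reported
    ("Player A", 90.0, 150.0),
    ("Player B", 104.0, 360.0),
    ("Player B", 98.0, 320.0),
]


def summarize(records):
    """Return per-batter max/min/avg exit speed and max/avg distance."""
    speeds = defaultdict(list)
    distances = defaultdict(list)
    for batter, speed, distance in records:
        speeds[batter].append(speed)
        # Assumption: a missing distance is skipped rather than counted as 0.
        if distance is not None:
            distances[batter].append(distance)

    summary = {}
    for batter, s in speeds.items():
        d = distances.get(batter, [])
        summary[batter] = {
            "n": len(s),
            "max_mph": max(s),
            "min_mph": min(s),
            "avg_mph": round(mean(s), 2),
            "max_ft": max(d) if d else None,
            "avg_ft": round(mean(d), 2) if d else None,
        }
    return summary


leaderboard = summarize(batted_balls)
# Sort by hardest-hit ball, as in the table above.
for batter, row in sorted(leaderboard.items(), key=lambda kv: -kv[1]["max_mph"]):
    print(batter, row)
```

Counting missing distances as zero instead of skipping them would pull the average distance down sharply, so that choice matters a lot when many ground balls have no distance attached.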

The sample size here is almost too small to draw any meaningful conclusions, as no player has more than 19 at-bats' worth of data. I think it would be interesting, though, to look at this data later in the season, once we have a better idea of what a baseline is for some of these numbers, especially the average velocity and distance. With that said, here are some random thoughts that came to mind when I was looking at this data.

  • I find it fitting that Matt Holliday has the hardest-hit ball recorded so far this year, at least among Cardinals hitters. Even so, he is only fifth on the team in average batted ball velocity. While Holliday has started the season with a .360 wOBA and a 137 wRC+ in 37 plate appearances, he does not have a single extra-base hit. I would expect his average batted ball velocity and distance to increase significantly over the course of the season, especially once he starts getting more extra-base hits.
  • It appears that average batted ball velocity is starting to settle in the high 80s/low 90s range, at least for the team's regular position players (who have more than ten at-bats with data). So far, the team leaders in average batted ball velocity are Matt Carpenter and Jhonny Peralta, who have also been the team's two best hitters so far this season. This should not be a surprise, since it intuitively makes sense that hitting the ball harder will lead to better offensive performance. Still, I'd be interested in looking at this data later in the year to see how strong the correlation is between batted ball velocity and measures of offensive performance.
  • Randal Grichuk's two at-bats recorded by Statcast were pretty strong overall, and this doesn't even include his home run in Cincinnati, which was reportedly clocked at nearly 105 MPH and traveled around 420 feet. It's been clear for a while now that he has the raw power to hit the ball as hard as just about anybody, but the bigger question will be whether he can reduce his strikeout rate while still hitting the ball hard when he does make contact.
  • The average batted ball distances are still all over the place. Of course, the range on these is much wider than with the batted ball velocities, meaning that an especially small or large number (from ground balls and home runs) could have a big effect on the overall average, especially this early in the season. While it seems likely that better hitters, especially power hitters, will have a higher average batted ball distance, this number probably doesn't mean a lot in isolation. Instead, batted ball distances should probably be separated by batted ball type, especially when looking at fly balls.
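
Two of the follow-up ideas in the list above can be sketched in a few lines of Python: splitting average distance by batted ball type, and measuring how strongly average velocity tracks an offensive stat such as wOBA. Everything here is an assumption for illustration: the record layout, the function names `avg_distance_by_type` and `pearson_r`, and all of the numbers are invented, not real player data.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical batted balls: (batted_ball_type, distance_ft). Invented values.
balls = [
    ("ground_ball", 20.0),
    ("ground_ball", 35.0),
    ("line_drive", 250.0),
    ("fly_ball", 330.0),
    ("fly_ball", 400.0),
]


def avg_distance_by_type(records):
    """Average distance computed separately for each batted ball type."""
    groups = defaultdict(list)
    for kind, dist in records:
        groups[kind].append(dist)
    return {kind: mean(dists) for kind, dists in groups.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# The overall average blends ground balls and fly balls together.
overall_avg = mean([dist for _, dist in balls])
by_type = avg_distance_by_type(balls)

# Hypothetical per-player averages: exit velocity (MPH) and wOBA.
avg_velocity = [85.0, 88.0, 90.0, 92.0]
woba = [0.290, 0.310, 0.330, 0.360]
r = pearson_r(avg_velocity, woba)

print(overall_avg, by_type, round(r, 3))
```

With only a handful of players and a couple of weeks of games, a correlation like this would be very noisy, so it is best read as a template for a later-season check rather than a result.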

While it's hard to draw a lot of meaningful conclusions from this small amount of data, it is certainly nice that this data is being collected and released to the public. Personally, I'm looking forward to having a larger sample of data and a better understanding of how this data can be meaningfully analyzed.