FanPost

Pitch f/x Thread

Hi everybody!!  It seems that we have been discussing Pitch f/x a lot in the main threads recently, and I thought it might be nice if we could take it all to one place.  

First, I'd like to give some background info on how the system works, how the public can access it, and its practical applications to baseball.  Hopefully, this will clear up any misconceptions people might have.

 

 

The System

Implemented by Sportvision, Pitch f/x cameras capture certain attributes of each pitch, including velocity, movement, location, and much more.  Pitch f/x data has been recorded on every single pitch in the majors since Opening Day 2008, and in a lot of games in 2007 as well.

Where to find it

Pitch f/x data is made publicly available via MLB.com's Gameday service.  While Gameday is only meant for entertainment use and doesn't lend itself to serious analysis, MLB archives all of the files in XML form, like so:

http://gd2.mlb.com/components/game/mlb/year_2009/month_09/day_04/gid_2009_09_04_anamlb_kcamlb_1//pbp/pitchers.xml

These are all of the pitchers on the Royals who pitched today.  If you click on one of them, you'll notice that it just looks like a bunch of weird shit; however, you can open it in Excel by right-clicking on the XML file and downloading it to your computer.  Excel then organizes it into columns, and it becomes manageable.
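If you'd rather script that step than click through Excel, here is a minimal sketch in Python.  It assumes each pitcher file nests `<pitch>` elements inside `<atbat>` elements and stores the fields as attributes; the sample XML below is hand-written with made-up values, so check a real file before trusting the names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, hand-written stand-in for a Gameday pitcher file. The
# element and attribute names are assumed and the values are made up.
SAMPLE_XML = """
<player>
  <atbat inning="1">
    <pitch des="Called Strike" start_speed="92.1" pfx_x="-4.2" pfx_z="8.9" px="0.31" pz="2.40" pitch_type="FF"/>
    <pitch des="Swinging Strike" start_speed="75.3" pfx_x="7.8" pfx_z="-9.6" px="-0.55" pz="1.62" pitch_type="CU"/>
  </atbat>
  <atbat inning="2">
    <pitch des="Ball" start_speed="86.0" pfx_x="2.9" pfx_z="1.4" px="1.20" pz="2.05" pitch_type="SL"/>
  </atbat>
</player>
"""

FIELDS = ["inning", "des", "pitch_type", "start_speed", "pfx_x", "pfx_z", "px", "pz"]


def pitches_from_xml(xml_text):
    """Return one dict per <pitch>, tagged with its at-bat's inning."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for atbat in root.iter("atbat"):
        for pitch in atbat.iter("pitch"):
            # Missing attributes become empty strings instead of raising.
            row = {name: pitch.get(name, "") for name in FIELDS}
            row["inning"] = atbat.get("inning", "")
            rows.append(row)
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text that Excel or Google Sheets can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = pitches_from_xml(SAMPLE_XML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

To use it on a real game, read the downloaded file's text and pass it to `pitches_from_xml` in place of `SAMPLE_XML`.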

However, that only allows you to take a look at one game by one pitcher at a time.  If you want to aggregate every single pitch in the majors to do a more detailed study, or compare pitchers start to start, or even look at hitters, you'll have to parse all of the Gameday files to an SQL database.  I recently "wrote" a primer on how to do so, which you can read here:

http://www.beyondtheboxscore.com/2009/8/19/994666/saberizing-a-mac-4-pitch-f-x

It should be relatively easy to follow, and you should *definitely* look in the comment section for more info.  Be warned, it's a daunting task and may take up to a week to do, but it's definitely worthwhile.  
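To give a feel for what the database buys you, here is a small sketch using Python's built-in sqlite3 module.  The table layout, pitcher names and numbers are all made up for illustration; the primer above builds a much bigger database, and its schema may not look like this.

```python
import sqlite3

# Hypothetical schema and made-up rows. A real database built from the
# Gameday files would have many more columns and far more pitches.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE pitches (
           pitcher TEXT, game TEXT, pitch_type TEXT,
           start_speed REAL, pfx_x REAL, pfx_z REAL)"""
)
sample = [
    ("Pitcher A", "game_1", "FF", 92.0, -4.0, 9.0),
    ("Pitcher A", "game_1", "FF", 93.0, -5.0, 8.0),
    ("Pitcher A", "game_1", "CU", 75.0, 8.0, -10.0),
    ("Pitcher B", "game_1", "FF", 90.0, -6.0, 7.0),
]
conn.executemany("INSERT INTO pitches VALUES (?, ?, ?, ?, ?, ?)", sample)

# One query summarizes every pitcher and pitch type at once, which is
# the part that is painful to do one spreadsheet at a time.
query = """
    SELECT pitcher, pitch_type, COUNT(*), AVG(start_speed)
    FROM pitches
    GROUP BY pitcher, pitch_type
    ORDER BY pitcher, pitch_type
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```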

The Data

Whether you are looking at one game or all of the pitchers, you are provided with a boatload of data on each pitch.  Mike Fast put together an excellent description of each field on his blog:

http://fastballs.wordpress.com/2007/08/02/glossary-of-the-gameday-pitch-fields/ 

Read that and bookmark it.  

Analyzing the data

There are a ton of simple, yet revealing, things you can look at using Pitch f/x, just based on one game.  For example, here are all of the pitches thrown by Wainwright in his start against the Giants, when he struck out 12 hitters in 9 innings:

http://spreadsheets.google.com/ccc?key=0AmhtqthzQ8zFdGJpZDJaNENiendic2hPaW1sT1ZHeEE&hl=en

Obviously, his stuff was really good that night.  So how do we quantify that with Pitch f/x?  Well, we can take a look at the two main attributes of a pitcher's stuff: velocity and movement.  Velocity is found under the heading "start_speed" (they also track the end speed of each pitch, but that really isn't important as far as I know).  Basic movement is found under "pfx_x" and "pfx_z".  The first one is the horizontal movement of the pitch, and the second is the vertical movement, both in inches.

Given those two categories, you can reasonably show how good a pitcher's stuff was on a given night; however, first you have to separate the pitches by pitch type.  Pitch f/x data comes with a pitch type algorithm, but it's often wrong.  Fortunately, it seems to classify Waino pretty well, because he has 4 distinct pitches: a fastball (don't worry about breaking it up by 2 Seam and 4 Seam yet), slider, curve and change.

The pitches are marked under the heading pitch_type.  FF is 4 seam fastball, FT is 2 seam fastball (again, just combine the two for now), SL is slider, CU is curve, CH is changeup and KN is knuckleball.

So sort the data by pitch type, and figure out the average start_speed, pfx_x and pfx_z of each of his pitches.  Or, if you want to take the lazy way out, and make a pretty graph at the same time, you can graph out the movement like so:
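If you'd rather not do the sorting and averaging by hand, the same step can be sketched in a few lines of Python.  The rows below are made-up numbers, not Wainwright's actual pitches; in practice you would load them from the spreadsheet.  FF and FT are folded into one "FB" label, which is my own shorthand and not a Gameday code.

```python
from collections import defaultdict
from statistics import mean

# Made-up pitches for illustration only. Real rows would come from the
# spreadsheet columns pitch_type, start_speed, pfx_x and pfx_z.
pitches = [
    {"pitch_type": "FF", "start_speed": 92.0, "pfx_x": -4.0, "pfx_z": 9.0},
    {"pitch_type": "FT", "start_speed": 91.0, "pfx_x": -8.0, "pfx_z": 6.0},
    {"pitch_type": "SL", "start_speed": 86.0, "pfx_x": 3.0, "pfx_z": 2.0},
    {"pitch_type": "CU", "start_speed": 75.0, "pfx_x": 8.0, "pfx_z": -10.0},
    {"pitch_type": "CU", "start_speed": 77.0, "pfx_x": 6.0, "pfx_z": -8.0},
]

# Fold the two fastball codes into one label ("FB" is a hypothetical
# label chosen here, not something Gameday emits).
GROUPS = {"FF": "FB", "FT": "FB"}
FIELDS = ("start_speed", "pfx_x", "pfx_z")


def average_by_type(rows):
    """Average velocity and movement for each (merged) pitch type."""
    grouped = defaultdict(list)
    for row in rows:
        label = GROUPS.get(row["pitch_type"], row["pitch_type"])
        grouped[label].append(row)
    return {
        label: {field: mean(r[field] for r in members) for field in FIELDS}
        for label, members in grouped.items()
    }


averages = average_by_type(pitches)
for label, stats in sorted(averages.items()):
    print(label, stats)
```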

7_1_medium

 

That graph may seem a little obscure, but it is very informative.  You can see the average velocity on his pitches, and the range of break on each of them.  As you can see, the changeup and fastball have similar movement, with the changeup having a bit more drop to it.  The slider moves about 10 inches to the right (from the catcher's point of view) in comparison to those two pitches, and the curveball is way to the right and drops about 10 inches (that's one of the biggest breaking curves in the majors, obviously).  That one "fastball" that has the movement of a slider is probably a slider.

Of course, this is pretty worthless on its own.  Let's take a look at how his stuff looked against the Braves on April 29th.  In that start he gave up 3 runs and walked 5 hitters while only striking out two.  Here is the data for that start:

http://spreadsheets.google.com/ccc?key=0AmhtqthzQ8zFdGFwd2pnVkl0WHBZNWVsQjU5cnVhaEE&hl=en 

And here is how his stuff looked:

4_29_medium

 

You can see some subtle, yet important, differences.  His fastball velocity was over 1 MPH slower, and his slider velocity was faster, meaning the speed differential between the two was smaller.  The break on his fastball, changeup and slider also shifted slightly to the right, while the curveball break held constant, meaning he was getting less separation between the curve and those pitches.
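To put a number on "speed differential," subtract the slider's average velocity from the fastball's in each start and compare the two gaps.  Here is a quick sketch; the averages are made up for illustration and are not Wainwright's real numbers from either start.

```python
# Made-up per-start average velocities in mph, for illustration only.
starts = {
    "good start": {"FB": 92.0, "SL": 85.0},
    "bad start": {"FB": 90.5, "SL": 86.0},
}


def speed_gap(averages, fast="FB", slow="SL"):
    """Velocity gap between two pitch types within one start."""
    return averages[fast] - averages[slow]


gaps = {name: speed_gap(avgs) for name, avgs in starts.items()}
change = gaps["bad start"] - gaps["good start"]

for name, gap in gaps.items():
    print(f"{name}: fastball is {gap:.1f} mph faster than the slider")
print(f"change in gap: {change:+.1f} mph")
```

A negative change means the two pitches were closer together in speed in the second start.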

In any given start, there are really 3 things that a pitcher has control over: stuff, location and sequencing.  We already took a look at Waino's stuff, so now let's look at his location.

To plot location on a graph, you select px as the x axis and pz as the y axis.  I like to break it up by pitch type, or pitch outcome (swinging strike, ball, hit, called strike, etc.).  Let's take a look at Waino's night against the Giants by pitch type, with swinging strikes circled:

7_1_location_medium

As you can see, he was downright unhittable that night, totaling 19 swinging strikes.  His curveball was especially good, as he generated 10 swinging strikes on 36 curves.  He was able to pound the 1st base side of the zone with his curveball and slider, while always keeping his fastball around the strike zone.
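The swinging strike counts can be tallied straight from the data (10 whiffs on 36 curves works out to about 28%).  Below is a rough sketch that counts whiffs by pitch type and also flags whether each pitch was in the zone.  It assumes the pitch outcome column `des` labels a whiff as "Swinging Strike," that px and pz are in feet, and that the sz_top and sz_bot fields give the top and bottom of the batter's zone; the rows are made up, and the zone check ignores the width of the ball.

```python
from collections import Counter

# Home plate is 17 inches wide; px is measured in feet from its center.
PLATE_HALF_WIDTH_FT = 17 / 2 / 12

# Made-up rows for illustration. The "Swinging Strike" label and the
# sz_top / sz_bot columns are assumptions to check against real data.
pitches = [
    {"pitch_type": "CU", "des": "Swinging Strike", "px": -0.9, "pz": 1.3, "sz_top": 3.4, "sz_bot": 1.6},
    {"pitch_type": "CU", "des": "Ball", "px": -1.4, "pz": 0.8, "sz_top": 3.4, "sz_bot": 1.6},
    {"pitch_type": "FF", "des": "Called Strike", "px": 0.2, "pz": 2.5, "sz_top": 3.4, "sz_bot": 1.6},
    {"pitch_type": "FF", "des": "Swinging Strike", "px": 0.5, "pz": 3.1, "sz_top": 3.4, "sz_bot": 1.6},
    {"pitch_type": "SL", "des": "Foul", "px": -0.6, "pz": 2.0, "sz_top": 3.4, "sz_bot": 1.6},
]


def in_zone(pitch):
    """Rough rulebook-zone check that ignores the radius of the ball."""
    over_plate = abs(pitch["px"]) <= PLATE_HALF_WIDTH_FT
    right_height = pitch["sz_bot"] <= pitch["pz"] <= pitch["sz_top"]
    return over_plate and right_height


def whiff_summary(rows):
    """Map pitch type to (swinging strikes, pitches thrown, whiff rate)."""
    thrown = Counter(r["pitch_type"] for r in rows)
    whiffs = Counter(r["pitch_type"] for r in rows if r["des"] == "Swinging Strike")
    return {t: (whiffs[t], thrown[t], whiffs[t] / thrown[t]) for t in thrown}


summary = whiff_summary(pitches)
chases = [p for p in pitches if p["des"] == "Swinging Strike" and not in_zone(p)]

for ptype, (n_whiff, n_thrown, rate) in sorted(summary.items()):
    print(f"{ptype}: {n_whiff} whiffs on {n_thrown} pitches ({rate:.0%})")
print(f"whiffs on pitches out of the zone: {len(chases)}")
```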

This is just one thing that you can do with Pitch f/x.  Other simple things you can look at are:

  • Velocity and movement by inning
  • Location against righties and lefties
  • Pitch selection
  • Where a player gets his swinging strikes
  • Spin of each pitch
  • Release point

And if you want to take a look at some more complex and actionable things:

  • How a pitcher pitches from the stretch compared to with the bases empty
  • How effective offspeed pitches are following a fastball in comparison to following an offspeed pitch
  • How location can affect things like GB% and HR/FB ratio
  • How umpires affect individual pitchers and hitters

So play around with the two spreadsheets I gave you, or download your own data.  We are only starting to scratch the surface of what we can do with all this data, and I can guarantee you that it will end up helping major league clubs if it hasn't already.  

You can use this thread to ask questions about Pitch f/x, offer criticisms, or propose new ideas for studies.  Or anything else that you can think of.

Here are some links for further reading on the subject:

http://www.hardballtimes.com/main/article/pitch-identification-tutorial/ 

http://www.sonsofsamhorn.net/wiki/index.php/Pitchfx#The_Basics:_Starting_at_the_Data

http://www.hardballtimes.com/main/article/the-eye-of-the-umpire/

http://www.hardballtimes.com/main/article/inside-the-changeup/

http://www.hardballtimes.com/main/article/what-makes-a-home-run-pitch/

http://www.hardballtimes.com/main/article/a-zone-of-their-own/

http://bjays.wordpress.com/

http://fastballs.wordpress.com/

http://www.sbnation.com/users/Harry%20Pavlidis/blog

http://www.beyondtheboxscore.com/2009/6/3/896845/graphing-201-pitchf-x-flight-paths

http://www.beyondtheboxscore.com/2009/6/2/895971/pitch-f-x-primers

http://baseballanalysts.com/archives/fx_visualizatio_1/

Anything else written by Josh Kalk, Mike Fast, Jon Hale, Harry Pavlidis, Dave Allen, Jeff Zimmerman or Alan Nathan.

And some of my own work (/shameless self promotion):

http://www.hardballtimes.com/main/blog_article/measuring-the-umpires-affect-on-the-game/

http://www.drivelinemechanics.com/2009/8/7/979394/kyle-lohses-triumphiant-return
