one question raised last week about jaime was whether he gets spooked when errors happen behind him. many people are concerned about the number of unearned runs he has given up.
in 2010, jaime allowed 64 runs, only 49 of which were earned, meaning 23% of his runs were unearned. in 2011, he allowed 100 runs, only 77 of which were earned, meaning 23% of his runs were unearned again.
unearned runs on their face look like a terrible way to try to get at whether a pitcher blows up following an error. for one, the error may immediately cause an unearned run to score, even if the pitcher responds to the error by striking out the next batter.
for another, it's not clear how evenly distributed errors are among starting pitchers on the same staff. think about run support as a parallel; two pitchers in the same rotation may get vastly different run support from the same team over the course of a whole season. errors seem, if anything, to be far more volatile than run support, and far more likely to accrue randomly to one pitcher over another.
the trend in sabermetrics has been to try to get defense OUT of the pitching equation. for that reason, i am not aware of any site that provides splits like "after an error is committed." it would sure be nice if there were such a site with a virtually infinite set of possible splits. but there is not.
to be clear, i am not seeking to commend the "earned run" theory. earned runs are a pretty poor idea, as has been well stated previously. what i am seeking to test is whether a single season sample (or even a two season sample) of unearned runs is a good basis for declaring or even suspecting that a pitcher persistently pitches worse after an error is committed.
after thinking about it and poking around some, i did find a sort of a hack to get at some of this information without laboriously reviewing each game log for a whole season for jaime garcia and an adequate control group.
if you go to an individual pitcher's page on fangraphs and click on the play log, you'll see every play the pitcher was involved in over the season. if you search the page for the words "error by," you'll find the standard language used when an error occurs, and your browser will report how many times that phrase comes up.
i also found you could search for the term "scored on error" and find the number of players who scored as an immediate result of that error. the results are after the jump.
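the search-and-count step could be automated along these lines. this is only a minimal sketch, not the method used for the numbers in this post: it assumes the play log has already been saved as plain text, and the sample lines and the `count_phrase` helper are invented for illustration.

```python
def count_phrase(play_log_text: str, phrase: str) -> int:
    """Count non-overlapping, case-insensitive occurrences of a phrase."""
    return play_log_text.lower().count(phrase.lower())


# Hypothetical play-log lines, invented for illustration only.
# They are NOT copied from any real play log.
sample_log = "\n".join([
    "Batter A grounded out to shortstop.",
    "Batter B reached on error by third baseman. Runner C scored on error.",
    "Batter D struck out swinging.",
    "Batter E reached on error by shortstop.",
])

errors = count_phrase(sample_log, "error by")
scored = count_phrase(sample_log, "scored on error")
print(errors, scored)
```

this mirrors what the browser's find function reports: one hit per appearance of the phrase, regardless of who committed the error.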
so, i took a look at the game logs for 2011. i pulled the numbers for our six 2011 starters. note that these numbers are not limited to "starting for the cardinals" scenarios. edwin jackson's numbers include his time with the white sox. kyle mcclellan's numbers include his time in relief.
just taking a quick look at the number of errors behind each pitcher, they are not evenly distributed. the most errors happened behind chris carpenter (22), followed closely by garcia, with the other starters seeing far fewer errors.
of course it's hard to compare carpenter's long season to kyle mcclellan's when carpenter pitched almost 100 innings more. i started to look at errors by innings pitched, then by total batters faced, but decided to compare them based on balls in play (TBF - BB - K - HBP). while it's possible to have an error on a pickoff or a stolen base attempt, or even a dropped third strike, i thought the BIP numbers would most closely capture the "risk" of an error. if you'd rather look at the numbers by TBF or IP, feel free.
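the balls-in-play definition is simple enough to write down as a function. in this sketch the function name and the stat line are made up for illustration; the numbers are not any real pitcher's season.

```python
def balls_in_play(tbf: int, bb: int, k: int, hbp: int) -> int:
    """Balls in play as defined in the text: TBF - BB - K - HBP."""
    return tbf - bb - k - hbp


# Invented stat line, purely for illustration.
example_bip = balls_in_play(tbf=800, bb=50, k=150, hbp=5)
print(example_bip)  # 595
```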
using this approach, i got an errors per ball in play number, which shows that garcia had the highest rate of errors behind him per ball in play of any of the pitchers. with the exception of carpenter (3.0%), garcia's 3.2% error on ball in play rate far outstrips the rate for the other pitchers, who all range around 2.1-2.3%. this illustrates more pointedly that errors behind a pitcher are NOT evenly distributed: garcia had almost half again as many errors behind him per ball in play as most of the other starters on the staff.
| game log (2011) | carpenter | garcia | jackson | mcclellan | lohse | westbrook |
| --- | --- | --- | --- | --- | --- | --- |
| "error by" | 22 | 20 | 14 | 11 | 13 | 13 |
| "scored on error" | 8 | 4 | 0 | 2 | 2 | 1 |
| BIP | 744 | 618 | 649 | 482 | 619 | 630 |
| "error by"/BIP | 0.02957 | 0.032362 | 0.021572 | 0.022822 | 0.021002 | 0.020635 |
| UER | 7 | 23 | 8 | 5 | 9 | 8 |
| UER/error | 0.318182 | 1.15 | 0.571429 | 0.454545 | 0.692308 | 0.615385 |
| (UER-scored on error)/error | -0.04545 | 0.95 | 0.571429 | 0.272727 | 0.538462 | 0.538462 |
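for anyone who wants to check the "error by"/BIP row, here is a short sketch that recomputes it from the 2011 counts in the table. the variable and function names are mine, and comparing garcia to a simple average of the four lowest rates is just one reasonable way to put a number on "almost half again as many."

```python
# 2011 counts copied from the table: ("error by" hits, balls in play).
counts_2011 = {
    "carpenter": (22, 744),
    "garcia": (20, 618),
    "jackson": (14, 649),
    "mcclellan": (11, 482),
    "lohse": (13, 619),
    "westbrook": (13, 630),
}


def error_rate(errors: int, bip: int) -> float:
    """Errors committed behind the pitcher per ball in play."""
    return errors / bip


rates_2011 = {name: error_rate(e, b) for name, (e, b) in counts_2011.items()}

# Print pitchers from highest to lowest error rate.
for name, rate in sorted(rates_2011.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<10} {rate:.1%}")

# Compare garcia to the unweighted mean of the four starters other than
# garcia and carpenter (an illustrative choice, not the post's method).
others = [r for n, r in rates_2011.items() if n not in ("garcia", "carpenter")]
ratio = rates_2011["garcia"] / (sum(others) / len(others))
print(f"garcia vs. average of the other four: {ratio:.2f}x")
```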
next comes the dicier question, which relates more directly to the "mental makeup" question. even if garcia, through dumb luck, gets more errors behind him, do they proportionally turn into more unearned runs?
i looked at this two ways, for the reason shown above.
as stated above, if the run scores during the play on which the error occurs, it seems like we shouldn't blame the pitcher's mental makeup, because the pitcher's response in the plate appearances following the error would not change whether the run scores.
however, chris carpenter's line above illustrates the problem with this approach. 8 runs "scored on an error" against chris carpenter, but only 7 were unearned runs. i scratched my head on this one a little bit until i realized what had happened.
i tracked down a game log where three runs scored on errors, but only 2 unearned runs were listed. on one play, a runner on third scored on an error. but there were fewer than two outs, and carpenter followed the error up by allowing a single. even though the run scored on the error itself, the scorer called it an earned run, because the runner on third probably would have scored on the single anyway.
so, i present the data to you two ways - digest it how you will. you can look either at unearned runs per error, or at unearned runs minus runs scored on the error itself per error. the latter is an attempt to capture with more precision how much the pitcher's response to the error impacted the scoring of unearned runs.
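the two measures can be recomputed from the 2011 counts as follows. this is a sketch with names of my own choosing; the inputs are the "error by," "scored on error," and UER rows of the 2011 table.

```python
# 2011 counts from the table: (errors, runs scored on the error play, UER).
inputs_2011 = {
    "carpenter": (22, 8, 7),
    "garcia": (20, 4, 23),
    "jackson": (14, 0, 8),
    "mcclellan": (11, 2, 5),
    "lohse": (13, 2, 9),
    "westbrook": (13, 1, 8),
}


def uer_per_error(errors: int, scored_on_error: int, uer: int) -> float:
    """First measure: all unearned runs divided by errors."""
    return uer / errors


def adjusted_uer_per_error(errors: int, scored_on_error: int, uer: int) -> float:
    """Second measure: drop runs that scored on the error play itself."""
    return (uer - scored_on_error) / errors


for name, row in inputs_2011.items():
    raw = uer_per_error(*row)
    adjusted = adjusted_uer_per_error(*row)
    print(f"{name:<10} raw={raw:.3f} adjusted={adjusted:.3f}")
```

note that the adjusted measure goes slightly negative for carpenter, which is the scoring quirk described above: a run can score on the error play and still be charged as earned.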
i provide the numbers above not because i think they're significant, but because i think they illustrate this issue BETTER than just saying "jaime garcia gave up 23 unearned runs and that's a lot, therefore he has a bad mental makeup."
the number of incidents we are talking about - errors or unearned runs - is so tiny relative to the total number of batters faced that i think any real skill level here is almost certainly completely swallowed by noise. and you see this somewhat in the enormous variation in pitcher outcomes. some pitchers allow unearned runs following an error at an almost 60% rate, some at a 0% rate (that's what i'm calling carpenter's anomalous result), and jaime allowed a run following an error at a 95% rate. that's a pretty huge swing.
the same numbers from 2010 show this instability from year to year, and the role of luck. for one, jaime had the same unfortunate luck regarding the incidence of errors behind him, again suffering from the worst percentage of errors per ball in play (2.8% versus 1.8% to 2.7%). note: jackson's numbers come from the d-backs and the white sox.
| game log (2010) | carpenter | garcia | jackson | mcclellan | lohse | westbrook |
| --- | --- | --- | --- | --- | --- | --- |
| "error by" | 18 | 14 | 16 | 4 | 7 | 17 |
| "scored on error" | 7 | 4 | 4 | 0 | 1 | 3 |
| BIP | 714 | 496 | 631 | 221 | 339 | 636 |
| "error by"/BIP | 0.02521 | 0.028226 | 0.025357 | 0.0181 | 0.020649 | 0.02673 |
| UER | 15 | 15 | 7 | 1 | 8 | 4 |
| UER/error | 0.833333 | 1.071429 | 0.4375 | 0.25 | 1.142857 | 0.235294 |
| (UER-scored on error)/error | 0.444444 | 0.785714 | 0.1875 | 0.25 | 1 | 0.058824 |
kyle lohse was actually the worst pitcher in 2010 at preventing runs following an error, allowing one run following every error committed behind him. jaime's rate was still comparatively high, at 79%. but the larger point is that numbers that swing from 0.44 runs per error one year to roughly 0 the next (carpenter), or from 1.00 to 0.54 (lohse), and that rest on a very small number of incidents (fewer than 25), are extremely unlikely to be much more than noise.
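to see the year-to-year swings side by side, the adjusted rate can be computed for both seasons and differenced. this is a sketch using the counts from the two tables; the names are mine, and a simple 2011-minus-2010 difference is just one way to express the swing.

```python
# Counts from the two tables: (errors, runs scored on the error play, UER).
inputs = {
    "carpenter": {"2010": (18, 7, 15), "2011": (22, 8, 7)},
    "garcia": {"2010": (14, 4, 15), "2011": (20, 4, 23)},
    "jackson": {"2010": (16, 4, 7), "2011": (14, 0, 8)},
    "mcclellan": {"2010": (4, 0, 1), "2011": (11, 2, 5)},
    "lohse": {"2010": (7, 1, 8), "2011": (13, 2, 9)},
    "westbrook": {"2010": (17, 3, 4), "2011": (13, 1, 8)},
}


def adjusted_rate(errors: int, scored_on_error: int, uer: int) -> float:
    """(UER - runs scored on the error play) per error."""
    return (uer - scored_on_error) / errors


swings = {}
for name, seasons in inputs.items():
    rate_2010 = adjusted_rate(*seasons["2010"])
    rate_2011 = adjusted_rate(*seasons["2011"])
    swings[name] = rate_2011 - rate_2010
    print(f"{name:<10} 2010={rate_2010:.2f} 2011={rate_2011:.2f} swing={swings[name]:+.2f}")
```

four of the six pitchers move by more than a third of a run per error in one direction or the other, on samples of 4 to 22 errors each.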
in essence, it's not clear why looking at unearned runs scored in the 4 to 23 innings in which an error was committed is any more reliable than looking at any random 4 to 23 inning sample of a pitcher's performance. believing that jaime allowed 23 unearned runs last year because he mentally collapses after errors is no more reasonable than believing he mentally collapses in open air stadia (22 UER in 2011 out of 89 R) but not in domes (1 UER out of 11 R).
Also, recall the variability of one error over another, in terms of how difficult recovering from the error might be. If a third baseman throws a double play ball into right field, the run expectancy with runners on second and third with one out is very high. By contrast, a runner who reaches first on an error with the bases empty and two outs will have a much lower run expectancy. All errors are not created equal. Given the small number occurring behind one pitcher in one year, you should not assume that each pitcher will be facing similar errors in number or type.
if there were a site that collected these kinds of splits and allowed league-wide analysis of this issue, i believe it's very likely it would show little to no year-to-year consistency in the frequency of unearned runs a pitcher allows. i have no desire to put together the above numbers by hand for hundreds of pitchers in trying to prove this. the above suggests, but does not prove, the high variability in how many unearned runs a pitcher allows.
in short, i think it's very unlikely that jaime garcia's supposed inability to recover from errors behind him has a substantial relationship to the number of unearned runs he's allowed. he's been unlucky two years running to have his defense allow more errors per ball in play than any other starting pitcher on the cardinals (adam wainwright, not in the stats above, had errors behind him at a 1.6% rate per ball in play and did not allow an unearned run). it also looks unlikely that "recovering from an error" is a repeatable skill.