a couple of readers lately have called my attention to the fangraphs site, run by david appelman and featured regularly at the hardball times. i appreciate the heads-up; it's an outstanding site. don't head over there until you have time to waste, because if you like stats the site'll get its hooks into you pretty quickly.
among its vast stocks of data, fangraphs has numbers that can shed more light on the BABIP stuff we were wrestling with tuesday and wednesday -- viz., if a player's BABIP (or batting average on balls in play) ticks up sharply for a season, does it reflect improved hitting ability or just good luck? if it's the former, we can have some confidence that the player will repeat his BABIP figure for a second season (and beyond); if it's the latter, we'd probably expect the extra points of BABIP to evaporate in subsequent years.
so how do we tell the difference? batted-ball data would provide one clue -- and fangraphs has the goods. just type in the player's name and you can get, for every season going back to 2002, a breakdown of the guy's batted balls -- what percentage of his BIP were line drives, groundballs, and fly balls. here are the numbers for juan encarnacion, who (you'll recall from earlier this week) posted an atypically high BABIP in 2005:
| year | avg | babip | ld % | gb % | fb % | total BIP | k % |
what i see are two extremely similar seasons (2002-03), an injury-marred one (2004), and last year's "peak" performance. the variations in these numbers are suggestive. here's how joker24 interpreted them last night in the comments:
A look at this and we find that BABIP has strong correlation with %ld + .120. A 3% increase in line drives = a .03 increase in BABIP and Encarnacion's was .034.
For the OBP, to be honest, I have no idea where the .30 point increase came from because his BB%, K/BB, K%--everything except BA either stayed the same or got worse. Those last .10 points don't really seem to make sense.
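joker24's rule of thumb (BABIP is roughly line-drive pct plus .120) is easy to write down as a quick python sketch. the function name is mine, and the 20 pct line-drive rate in the example is a made-up input for illustration, not encarnacion's actual figure.

```python
LD_OFFSET = 0.120  # constant from the rule of thumb quoted above


def rule_of_thumb_babip(ld_share):
    """Estimate BABIP from line-drive share (0.20 means 20 pct of BIP)."""
    return ld_share + LD_OFFSET


# HYPOTHETICAL input: a hitter whose line drives are 20 pct of his BIP.
example_ld_share = 0.20
print(f"estimated BABIP: {rule_of_thumb_babip(example_ld_share):.3f}")

# Under this rule, each added point of line-drive pct is worth one point of
# BABIP, so a 3-point gain in line drives implies a .030 gain in BABIP.
gain = rule_of_thumb_babip(example_ld_share + 0.03) - rule_of_thumb_babip(example_ld_share)
print(f"BABIP gain from +3 pct line drives: {gain:.3f}")
```

because the rule is a straight one-for-one relationship, the starting line-drive rate doesn't matter for the gain; only the change does.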
my read of these numbers is slightly different from joker24's. i see a 2 percent increase in line drives (as a percentage of balls in play, or BIP) in 2005, and about a 1.5 pct increase in fly balls; both those increases came out of encarnacion's groundball budget, which shrank by 3.5 to 4 pct. those are small but potentially telling changes in encarnacion's BIP distribution. how do we measure the effect on BABIP and, ultimately, batting average?
let's start with the line drives. a 2 pct increase in line drives, taken over encarnacion's 396 balls in play from 2005, translates into about 8 additional line drives (ie, 396 x .02). as noted, those extra line drives came out of the groundball budget, so let's postulate that all 8 would have been grounders in 2002-03. according to this article, about 70 pct of line drives turn into base hits, while only 25 pct of groundballs do (and if somebody has better or more recent data on this, please post a link). taking those averages, we would expect 8 line drives to yield about 6 base hits, whereas 8 groundballs would yield just 2 base hits.
by that reasoning, encarnacion's increased line-drive pct netted him 4 base hits in 2005, or about an .010 increase in BABIP (ie, 4 hits divided by 396 BIP) over 2002-03. his BABIP for those two years combined was .291, so the extra line drives in and of themselves would have raised his 2005 BABIP to about .301 . . . but that still leaves .033 of marginal BABIP unaccounted for. i'm not gonna attribute that margin to luck, because you guys have convinced me that such is an unfairly dismissive label; but i do remain skeptical that these extra 33 points reflect a sustainable change in encarnacion's hitting ability.
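here's that line-drive arithmetic as a minimal python sketch. the function name is mine, and the 70 and 25 pct hit rates are the rough averages cited above, so treat the output as a ballpark. nothing here is rounded to whole hits, which is why the gain comes out to about .009 instead of the rounded .010.

```python
# Figures quoted in the post; the hit rates are rough league-wide averages.
BIP_2005 = 396          # balls in play in 2005
LD_SHARE_GAIN = 0.02    # extra share of BIP that were line drives
LD_HIT_RATE = 0.70      # approx. share of line drives that become hits
GB_HIT_RATE = 0.25      # approx. share of groundballs that become hits
BASELINE_BABIP = 0.291  # combined 2002-03 BABIP


def babip_gain_from_line_drives(bip, ld_share_gain, ld_hit_rate, gb_hit_rate):
    """BABIP gained when some grounders are swapped for line drives."""
    extra_line_drives = bip * ld_share_gain
    # Each swapped ball gains the difference between the two hit rates.
    extra_hits = extra_line_drives * (ld_hit_rate - gb_hit_rate)
    return extra_hits / bip


gain = babip_gain_from_line_drives(
    BIP_2005, LD_SHARE_GAIN, LD_HIT_RATE, GB_HIT_RATE
)
print(f"extra line drives: {BIP_2005 * LD_SHARE_GAIN:.1f}")
print(f"BABIP gain: {gain:.3f}")
print(f"BABIP from line drives alone: {BASELINE_BABIP + gain:.3f}")
```

note that the BIP total cancels out of the gain: it's just the change in line-drive share times the gap between the two hit rates.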
before we leave the subject, let's do one more piece of quick-and-dirty math: if encarnacion should revert to his 2002-03 level of skill, what difference would it make in the win column? let's say his batting average falls back from .287 to .267 and his obp drops back from .340 to .320; based on 500 at-bats and 550 plate appearances (about enc's averages the last two seasons), he's losing 10 base hits (500 x .020) and making an additional 11 outs (550 x .020). if we just assume all 10 hits are singles and all 11 outs are groundball outs, the likely cost is about 7-8 runs over the course of the year -- at most 1 game in the standings. (my calculation is based on tom ruane's value-added method, which is a subject for another post -- but i urge you to check it out via this link.)
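a rough python sketch of that back-of-the-envelope estimate follows. big caveat: the run values below are my own ballpark linear-weights assumptions (roughly half a run per single, roughly a quarter run per out, roughly 10 runs per win), not the figures from ruane's value-added method, so this only shows the shape of the calculation.

```python
AT_BATS = 500
PLATE_APPEARANCES = 550

# ASSUMPTIONS: generic linear-weights-style values, not Ruane's numbers.
RUNS_PER_SINGLE = 0.47  # runs lost for each single that disappears
RUNS_PER_OUT = 0.27     # runs lost for each additional out made
RUNS_PER_WIN = 10.0     # common rule of thumb


def regression_cost(avg_now, avg_then, obp_now, obp_then):
    """Return (hits lost, extra outs, runs lost) if the hitter regresses."""
    hits_lost = AT_BATS * (avg_now - avg_then)
    extra_outs = PLATE_APPEARANCES * (obp_now - obp_then)
    runs_lost = hits_lost * RUNS_PER_SINGLE + extra_outs * RUNS_PER_OUT
    return hits_lost, extra_outs, runs_lost


hits, outs, runs = regression_cost(0.287, 0.267, 0.340, 0.320)
print(f"hits lost: {hits:.0f}, extra outs: {outs:.0f}")
print(f"runs lost: {runs:.1f}, wins lost: {runs / RUNS_PER_WIN:.2f}")
```

with these assumed weights the cost lands between 7 and 8 runs, or a bit under one win; different run values would shift it a little but not change the conclusion.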
so whether or not encarnacion retains those few extra points of batting avg in 2006, they're not likely to make or break the cardinals' season. now if he should hit .230 -- or .310 -- different story. but .230's not very likely barring injury, and if this dude bats .310 in 2006 . . . . . well then we can hold the luck v skill debate about walt jocketty next off-season.