Defending JuanEnc

First off, I'm so happy to see everyone getting in on the Fangraphs fun.  That site deserves all the publicity in the world, and when people discover it, we get the kind of useful, stimulating discussion that's been going on this week.

So, coming late to the party as usual...I think I might have found a small piece of evidence to throw out in favor of Juan Encarnacion.  Up until now, all signs have pointed to this being a bad signing.  If what follows here makes sense, it might nudge things in a more positive light.  I've got an Excel worksheet on this, and would be happy to share if anyone wants to see all the numbers (no).  Here, in any case, is the white meat:

Okay, from the Hardball Times article that lboros referenced yesterday, we find that about 28% of groundballs turn into hits, 21% of flyballs turn into hits, and 74% of line drives turn into hits.  If we're just talking about balls in play, though, we need to tweak the flyball figure, because a decent chunk of flyballs turn into home runs.  Still using the numbers from that article, we now find that 11% of in-play flyballs turn into hits.
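
For anyone who wants to check the flyball tweak, here's a minimal sketch of the arithmetic as I understand it.  The 21% and 11% figures are the ones quoted above; the function names are my own, and the home-run-per-flyball rate is not quoted from the article but backed out from those two figures, so treat it as an assumption.

```python
def in_play_flyball_hit_rate(hit_rate_all_fb, hr_per_fb):
    """Hit rate on flyballs that stay in the park.

    Home runs are removed from both the hits (numerator) and the
    flyballs (denominator), since a home run is not a ball in play.
    """
    return (hit_rate_all_fb - hr_per_fb) / (1.0 - hr_per_fb)


def implied_hr_per_fb(hit_rate_all_fb, hit_rate_in_play_fb):
    """HR/FB rate implied by the two flyball hit rates (same equation, solved for HR/FB)."""
    return (hit_rate_all_fb - hit_rate_in_play_fb) / (1.0 - hit_rate_in_play_fb)


# Figures quoted in the post: 21% of all flyballs and 11% of in-play flyballs become hits.
hr_rate = implied_hr_per_fb(0.21, 0.11)                # assumption: derived, not quoted
fbip_rate = in_play_flyball_hit_rate(0.21, hr_rate)    # recovers the 11% figure
print(f"implied HR/FB: {hr_rate:.3f}, in-play flyball hit rate: {fbip_rate:.3f}")
```

In other words, the two flyball percentages hang together if a bit over one flyball in ten leaves the yard.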

Now, on Fangraphs, we can find JuanEnc's totals for each type of batted ball from 2002-2005: his GB, FB, and LD totals.  Again, we need to adjust for home runs, so subtracting those from his FB totals, we now have totals for GB, FBIP (flyballs in play), and LD.  So, multiplying those totals by the percentages we got from the Hardball Times, we can figure Juan's "expected" hits for each batted-ball type:  xGBH, xFBH, and xLDH.  Totaling those up gives us his expected hits, excluding home runs (xHits).  Using that, we can figure his "xBABIP" and check that against his actual BABIP.  Here's what I've got for those things:
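
The steps in that paragraph can be sketched roughly as follows.  Two loud assumptions: the batted-ball counts in the example call are made up purely for illustration (they are NOT Juan's Fangraphs totals), and I'm assuming the xBABIP denominator is simply GB + FBIP + LD, which may not match how the worksheet handles things like bunts or sacrifice flies.

```python
# Hit rates per batted-ball type, as quoted from the Hardball Times article.
HIT_RATES = {"GB": 0.28, "FBIP": 0.11, "LD": 0.74}


def expected_babip(gb, fb, ld, hr):
    """Return (xHits, xBABIP) from raw batted-ball counts.

    gb, fb, ld: groundball, flyball, and line-drive totals
    hr: home runs, subtracted from flyballs to get flyballs in play (FBIP)
    """
    fbip = fb - hr
    x_gbh = HIT_RATES["GB"] * gb       # expected groundball hits
    x_fbh = HIT_RATES["FBIP"] * fbip   # expected in-play flyball hits
    x_ldh = HIT_RATES["LD"] * ld       # expected line-drive hits
    x_hits = x_gbh + x_fbh + x_ldh
    balls_in_play = gb + fbip + ld     # assumption: nothing else counts as in play
    return x_hits, x_hits / balls_in_play


# Hypothetical season, for illustration only.
x_hits, x_babip = expected_babip(gb=200, fb=150, ld=100, hr=20)
print(f"xHits: {x_hits:.1f}, xBABIP: {x_babip:.3f}")
```

Swapping in a real season's GB, FB, LD, and HR totals from Fangraphs should reproduce the xHits and xBABIP columns, provided my guess about the denominator is right.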

Year    xHits  xBABIP  BABIP  Diff. (BABIP - xBABIP)
2002    148    .331    .300    -.031
2003    164    .328    .286    -.042
2004    111    .290    .257    -.034
2005    127    .329    .334     .005
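
As a quick sanity check on the table, here's a small sketch that recomputes the last column as BABIP minus xBABIP from the rounded figures shown.  Everything is kept in "points" (thousandths) so there's no floating-point fuzz.  One row comes out a point different from what's listed; my guess is that the worksheet subtracts before rounding, but that is an assumption.

```python
# (year, xBABIP, BABIP, listed Diff), all in points (thousandths), copied from the table.
ROWS = [
    (2002, 331, 300, -31),
    (2003, 328, 286, -42),
    (2004, 290, 257, -34),
    (2005, 329, 334, 5),
]


def recomputed_diffs(rows):
    """Diff = actual BABIP minus expected BABIP, using the rounded table values."""
    return {year: babip - xbabip for year, xbabip, babip, _listed in rows}


def mismatched_years(rows):
    """Years where the recomputed Diff differs from the listed Diff."""
    diffs = recomputed_diffs(rows)
    return [year for year, _x, _b, listed in rows if diffs[year] != listed]


print(recomputed_diffs(ROWS))
print("rows that differ from the listed Diff:", mismatched_years(ROWS))
```

Only 2004 differs (-.033 recomputed against -.034 listed), which doesn't change anything in the interpretation.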

So, how to interpret these data?  Here's one rather optimistic way:  throwing out 2004, Juan's xBABIP seems to be holding steady at around .330.  His actual BABIP in 2002 and 2003 was roughly 30 and 40 points below that, respectively.  However, in 2005 he essentially met his expected BABIP.  So, on this optimistic interpretation, perhaps he was unlucky in 2002 and 2003 and finally evened out in 2005.

By way of comparison, Albert Pujols' differences between BABIP and xBABIP from the same period are -.028, .007, -.008, -.008.  2002 was his "worst" year, 2003 was a monster, and 2004 and 2005 are roughly where his "typical" level seems to be.  For 3 of the last 4 years, his actual BABIP has been within plus-or-minus 10 points of his xBABIP.  If we take his xBABIP to represent his "real" skill level, then Albert's performance usually matches his skill level pretty closely.
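
To make the "within 10 points" comparison concrete, here's a tiny sketch that counts qualifying seasons for both hitters.  The differences are the ones listed in this post, in points (thousandths); the 10-point tolerance is just the rule of thumb used here, not an established threshold.

```python
# BABIP minus xBABIP for 2002-2005, in points, as listed in this post.
PUJOLS_DIFFS = [-28, 7, -8, -8]
ENCARNACION_DIFFS = [-31, -42, -34, 5]


def seasons_within(diffs, tolerance=10):
    """Count seasons where actual BABIP landed within `tolerance` points of xBABIP."""
    return sum(1 for d in diffs if abs(d) <= tolerance)


print("Pujols:", seasons_within(PUJOLS_DIFFS), "of", len(PUJOLS_DIFFS))
print("Encarnacion:", seasons_within(ENCARNACION_DIFFS), "of", len(ENCARNACION_DIFFS))
```

By this count Albert is within 10 points in three of four seasons and Juan in only one, which is exactly why the "unlucky in 2002-2003" reading is the optimistic one.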

If we take the optimistic interpretation of Juan's stats, then Juan's "real" skill level is that of a .330 BABIP hitter.  On that interpretation, 2002 and 2003 were hard-luck years.  In 2005, his luck evened out, and his BABIP better reflected his actual ability.  It seems fair to guess that if Juan's luck again breaks even in 2006, his actual BABIP will be within plus-or-minus 10 points of his xBABIP.  Assuming that JuanEnc's xBABIP stays right around .330, that would give him an actual BABIP somewhere in the .320 to .340 range--good, but not unrealistic.

Naturally, since these data (on this interpretation) suggest that Juan's high BABIP in 2005 was not lucky, but simply a fair reflection of the number of hits his batted-ball numbers should produce, they also suggest that Juan's 2006 ought to be more like 2005 than 2002-2003.  I honestly didn't expect to find these results, so there's been no deliberate number-twisting on my part; I'm sure I've made mistakes, but they were honest ones.  If this stuff is on target, though, it makes me feel better about Juan Enc. being our regular RF.  

Questions I Still Have:  
Does this make any sense at all?  
Even if it does, is it in any way useful to the "luck vs. new performance level" debate?  
How optimistic is my interpretation?  How realistic is it?
How reliable are those numbers from THT?  They are taken from a portion of the 2004 season; how useful is that sample?  
Also, we don't know what Juan's actual batted-ball results are (except for HR/FB%, really).  He topped his expected hits in 2005--but were those "extra" hits line drives or seeing-eye dribblers?  There aren't data telling us how many, e.g., groundball hits he got each year, although one would assume that Baseball Info Solutions has them.  
How much do park factors influence the difference between xBABIP and actual BABIP?  Juan played in a fairly extreme pitcher's park for most of 2002-2005; Albert played in a slight pitcher's park.
Does Albert's BABIP usually come close to his xBABIP because he's good and consistent, or is that typical for all levels of hitters?

I'd love to hear any feedback, good or bad.  Unless we've all moved on to bleating about McGwire.