Those who read AN know that one of my pet peeves is the notion that all stretches of unusually high BABIP (batting average on balls put in play) automatically reflect "good luck," while all stretches of unusually low BABIP must reflect "bad luck." Yet certainly the smaller the sample, the better chance that a freakishly high or low BABIP is little more than statistical noise -- after all, a flurry of "at 'em balls" or a cluster of badly hit balls that roll, bloop, and chop their way to base hit glory, is hardly unheard of.
A hitter's BABIP usually hovers somewhere between .280-.320. So when you see that a hitter who is having a bad season has a BABIP of .240, how do you know whether this is predominantly due to chance factors that will self-correct or whether the hitter is actually contributing, non-randomly, to his poor performance? Does Rajai Davis deserve credit for his excellent hitting line this year -- which includes an exceptional BABIP -- or has a cluster of lucky bounces and fortunate placement just made his hitting look better than it has actually been?

I have a couple theories that can be checked by those adept at unearthing the statistical data. I'm not suggesting the theories are accurate; I am putting them out there as "common sense" ideas, and asking readers to feel free to reveal that they are sensible or perhaps dead wrong...
It seems to me that two factors should influence "expected BABIP": One is putting balls into play on favorable counts, and the other is putting "strikes" into play.
I would imagine that when 3-1 pitches are put into play, the average BABIP is higher than when 0-2 pitches are put into play, that the average BABIP "following a 2-1 count" is higher than the average BABIP "following a 1-2 count," and so on. So if a hitter is generally getting into, and hitting from, favorable counts, I would expect a low BABIP to self-correct (a more "external locus of control") more than if I found that the hitter was getting into, and hitting from, unfavorable counts -- where a lower BABIP should be expected and the hitter is more responsible for his lack of success (a more "internal locus of control").
Similarly, I would imagine that hitters (at least those not named Vlad or Panda) do better when the ball they put in play is a strike as compared to when it is, say, a sinker below the knees or a fastball running in on the hands. So if a hitter with a high BABIP is also putting "strikes" in play at a higher than average rate, I would be less inclined to assume it to be luck and more inclined to give due credit for the stretch of success.
You get the idea. I'm just wondering whether, in addition to "line drive %," two good stats to check to help separate "luck/chance" from "skill/earned" might be how the counts on which balls are being put in play compare to league averages, and how the percentage of "putting strikes into play" compares to league average.
Or not.
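For anyone who wants to test the count idea, here is a minimal sketch of one way to do it: weight a league-average BABIP for each count by how often the hitter puts the ball in play on that count. The function name, the per-count league figures, and the hitter's totals are all made-up placeholders for illustration, not real data.

```python
def count_adjusted_babip(balls_in_play_by_count, league_babip_by_count):
    """Weighted average of league BABIP per count, weighted by the
    hitter's number of balls in play on each count."""
    total = sum(balls_in_play_by_count.values())
    if total == 0:
        raise ValueError("hitter has no balls in play")
    weighted = sum(
        n * league_babip_by_count[count]
        for count, n in balls_in_play_by_count.items()
    )
    return weighted / total


# Placeholder league BABIP by count (hypothetical numbers, not real data).
LEAGUE_BABIP_BY_COUNT = {"0-2": 0.280, "1-2": 0.285, "2-1": 0.310, "3-1": 0.325}

# Hypothetical hitter: balls put in play on each count.
hitter_counts = {"0-2": 40, "1-2": 60, "2-1": 50, "3-1": 50}

expected = count_adjusted_babip(hitter_counts, LEAGUE_BABIP_BY_COUNT)
print(f"count-adjusted expected BABIP: {expected:.3f}")
```

If a hitter's actual BABIP sits well below this count-adjusted figure, that would point more toward chance; if the count-adjusted figure is itself low, the hitter's counts may be part of the story.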
Note: If you haven't seen this post about the upcoming Community Service (9/19) and Chez Nico (9/29) gets-together, check it out and reply accordingly if you want to be a part of one or both!
That would seem to make sense to me without looking deeper
It just stands to reason that if the count is in your favor you’re going to get better pitches to hit and likely make better contact when you do compared to someone who’s always battling from behind, making contact more on pitcher’s pitches.
Flashfire - September 13, 2009
It makes sense to me too, but at the same time
here’s a case where sometimes “what you’d think” and “what the numbers say” could differ and I’ll trust the numbers over my sensibilities on this one.
But it seems to me that hitting strikes is a skill (and one that can come and go during better and worse stretches), as is hitting in better counts (by being properly selective/aggressive early in the count), and that fluctuations in BABIP might be attributable, at times, to changes in how well these areas are being managed.
Nico - September 13, 2009
great post
sfoakbay - September 13, 2009
+1.375
(i like odd numbers, but i’m not getting along right now with 9)
Gaijin_Suketto - September 14, 2009
Interesting
I’ve never gotten along with 8. Too full of itself thinking it’s somehow better than infinite just because it’s standing up. The bastard.
DMOAS - September 14, 2009
Another chance for me to plug Hardball Times's awesome xBABIP calculator.
I’m not getting paid, I swear.
http://www.hardballtimes.com/main/fantasy/article/simple-xbabip-calculator/
Instead of assuming a league-average BABIP, the quick tool at the bottom-left corner of this spreadsheet calculates what any player’s expected BABIP (xBABIP) should be, given perfectly neutral luck. For players with odd batted ball profiles, this is a great tool.
danmerqury - September 13, 2009
rec'ed
designatedforassignment - September 13, 2009
I recommend even more
the explanatory article linked from that page — for anyone who, like me, is more interested in understanding what the formula seeks to measure and how it is arrived at, than in just plugging numbers into the calculator and getting the answer.
In brief, the xBABIP team has identified a number of factors which they use as adjustments to the plain BABIP. The weight of the adjustments was determined by an analysis of the existing data on hitters, with the result that the xBABIP is believed to be an even better measurement of a hitter’s “true” skill than regular BABIP.
The added factors tend to address the issues Nico asks about in the lead post, albeit indirectly. For example, it adjusts for a hitter’s BB/SO ratio, which presumably is correlated with whether the pitches he does make contact with are balls or strikes. It also adjusts for a hitter’s average pitches per plate appearance, which presumably is connected to what sort of counts he’s likely to get into.
…
Two things jump out at me when reading that page. First, after a table that shows how “unlucky” a certain former Athletic was in 2008, the authors editorialize:
This was written before the 2009 season, when Giambi was still a free agent. Whatever other merits the xBABIP calculation may have, it failed miserably at predicting how Jason would fare in 2009.
The second is a table where the authors are comparing their own xBABIP figure with what they call “old-xBABIP”. Old-xBABIP is simply a hitter’s LD% (ie, the percentage of his balls in play that are line drives) plus a normalizing constant to scale it to BABIP. That calculation had previously been figured to be a better predictor than actual BABIP, but the authors say it is not as good as their new xBABIP.
The list in question, then, shows the hitters who, during the 2008 season, showed the greatest discrepancy between old xBABIP and new xBABIP. In other words, these are the guys who are doing something right that the new formula captures but the old line-drive-based figure does not. They list the top 19 (why 19 and not 20, I don’t know):
What fascinates me about this list is how heavily it is tilted toward certain teams. Out of 19 players listed, six are Minnesota Twins, four are LA Angels, three are Boston Red Sox, and three are Houston Astros, with only three more from all 26 other teams (one each for Brewers, Giants, and Braves). (Since the list is 2008 data, I’m counting where the player played in 2008, but only one of them, Crisp, moved between 2007 and 2009.)
That seems too extreme to just be a coincidence. I should think that these teams are either doing something different with their hitting, or they’re looking for something different in obtaining hitters in the first place.
iglew - September 13, 2009
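As a rough illustration of the "old-xBABIP" described in the comment above (line-drive rate plus a normalizing constant), here is a minimal sketch. The comment does not give the constant; the 0.120 default below is the commonly quoted rule of thumb and should be treated as an assumption, as should the function name.

```python
def old_xbabip(line_drives, balls_in_play, constant=0.120):
    """Line-drive rate plus a normalizing constant.

    The 0.120 default is an assumed rule-of-thumb value, not a figure
    taken from the article under discussion.
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return line_drives / balls_in_play + constant


# Hypothetical hitter: 20 line drives out of 100 balls in play.
print(f"old-xBABIP: {old_xbabip(20, 100):.3f}")
```

Nothing in this calculation knows about speed, handedness, or park, which is why a hitter who beats out grounders would look underrated by it.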
Those are all, with the exception of Mathis and Varitek, fast players.
The inclusion of the Speed Score may very well be the difference you see.
designatedforassignment - September 13, 2009
Ah, that makes sense.
I recognize many of the names by their team but am not familiar with their specific skills. I already knew the Angels like speed. So I guess the Twins and Astros do, too?
iglew - September 13, 2009
Yes
The Astros traded Brad Lidge for Michael Bourn. The Twins gave starts in Major League Baseball to Alexi Casilla and Carlos Gomez.
If I had to make a list of “franchises most willing to trade power for speed,” those would probably be at or near the top of the list.
PaulThomas - September 14, 2009
For the record.
The 2008 Speed Scores of the aforementioned list.
It’s pretty apparent that this list contains many of the fastest guys in baseball, but the presence of Varitek and Mathis makes it pretty clear that something else is going on. Their xBABIP formula uses stolen bases to measure speed (with the promise of a better formula down the road that uses Speed Score). I’d imagine that this old-xBABIP/xBABIP discrepancy has something to do with the difference between stolen bases and the true measure of speed down the line. The discrepancy should decrease with the advent of the Speed Score-derived xBABIP formula.
danmerqury - September 14, 2009
My understanding is the underlying xBABIP model
already uses Bill James’ speed score, and it’s only the spreadsheet calculator they provide that uses stolen bases. The spreadsheet calculator apparently has to take a couple of short cuts for data that is not readily available online for a regular user.
The list of players with the highest old-xBABIP to new-xBABIP discrepancy comes from the actual model, not the calculator, so I assume they are calculated with actual speed score.
iglew - September 14, 2009
This is accurate.
designatedforassignment - September 14, 2009
And anyway, regardless of whether you use
stolen bases or speed score, the speedy guys should show up in the old-xBABIP vs. new-xBABIP comparison. Old xBABIP is just based on line drives, so it doesn’t capture speed at all. New xBABIP does, and that’s one of the ways in which it’s better.
iglew - September 14, 2009
Ah, understood.
Sounds good.
danmerqury - September 14, 2009
with Boston on that list
I wonder if some of it could be due to park effects. The expected BABIP on fly balls is pretty low, but fly balls hit to left field in Fenway? Probably a lot higher.
I don’t really have explanations for why Minnesota, Anaheim, and Houston would also show up on the list, plus Ellsbury is a LHB and both Varitek and Crisp are switch-hitters, so maybe they aren’t hitting so many balls to left field.
Yeah, so it’s far from an airtight theory, but maybe interesting.
colin - September 14, 2009
oh wait
I also realized that the list iglew posted shows the major discrepancies between old xBABIP and new xBABIP. Assuming that neither of those formulas includes park effects, the idea I described wouldn’t show up here. Instead you might look for it in the actual vs. xBABIP outliers.
colin - September 14, 2009
and....
I see from iglew’s post way down below that xBABIP does correct for park factors. Just forget I said anything at all (unless you are entering me in a reply-to-self contest against cutthemullet, in which case this is really helping my chances).
colin - September 14, 2009
Metrodome BABIP
could be increased by three factors:
1) the Hefty bag in RF (similar to your observation about the Green Monster);
2) outfielders losing fly balls in the roof;
3) turf hits getting through the infield.
Nick - September 14, 2009
so applying Nico's fANpost here
the question would be: are making contact on strikes and making contact in hitter’s counts better predictors of BABIP than BB/SO ratio and pitches/PA? Hardball Times picked their parametrization, but that doesn’t mean it’s the best one.
colin - September 14, 2009
I assume their choice of parametrization
is a function of availability of the data. It’s pretty easy to get BB/SO ratio and pitches/PA. Not so easy to break down each ball-in-play by the count, and harder still to identify which balls-in-play were on strikes and which were on balls.
iglew - September 14, 2009
TWSS
monkeyball - September 14, 2009
I'm by no means one of the qualified stats experts here
but that’s mainly because once sophomore-year geometry ended, I became really bad at math, and I really don’t have the math skills to analyze more sophisticated baseball stats.
That being said, here is Rajai’s “more stats” page on baseball-reference.com. It has lots of data, mainly semi-obscure counting stats, not complex analytic stats, so although the sample sizes are small, it’s all pretty comprehensible.
Rajai’s BB% — the percentage of PAs that end in BBs — is more than twice last year’s (7.2% to 3.4%). He has a career-high line-drive % (22%), but his ground-ball/fly-ball ratio is pretty much the same as it’s always been (1/1).
Surprisingly, he’s had only 21 3-1 counts this year, and has made contact 8 times on those 3-1 counts. So I doubt that those 8 times are the difference for him this season.
Nick - September 13, 2009
*waiting for PaulThomas' input on the matter*
Cheezombie - September 13, 2009
"A hitter's BABIP usually hovers somewhere between .280-.320"
you forgot to put the word “average” before “hitter’s”.
The thing that’s wrong with BABIP is that there are two distinct kinds of hitters, both great, whose BABIP cannot be taken at face value, because BABIP is flawed and doesn’t take these kinds of hitters into consideration.
The first group is the above-average slap-hitting guys, who will always have a high BABIP because they know to “hit ’em where they ain’t.” Ichiro’s and Jeter’s have been between .350 and .400 many times in their careers. Hanley Ramirez is at over .400 this year; has he been “lucky”? No, he’s just awesome. They know what they’re doing; they know how to hit a ball over the INF and in front of the OFer.
There’s also the reverse in the great pull hitters who deal with the “shift.” Barry Bonds never got over .300 and was always closer to .220, mostly due to the effect of the shift. Giambi’s was .226 this year and hasn’t been over .300 since 2003, when managers started putting the shift on him.
What’s kind of funny is that the 2 best hitters in the game don’t fall under either of those categories, so Pujols & A-Rod actually have relevant, normal BABIPs. But my point is that BABIP doesn’t work for all hitters, and it’s really important to note this.
PL78 - September 13, 2009
Bonds' BABIP was never close to .220
If his BABIP was .220, then unless his HR > Ks it would be impossible for him to have the batting averages that he did.
designatedforassignment - September 13, 2009
I think Bonds had more HR than Ks in 2001
There really aren’t words for how ridiculous his 2001-2004 peak was.
In his case, I think his BABIP was lowish just because every time he got even decent wood on the ball, it was landing in McCovey Cove.
PaulThomas - September 13, 2009
I thought HRs weren't counted as "in play"
Cheezombie - September 13, 2009
That's why Bonds' BABIP was low
He didn’t get “credit” for all the HRs.
Nico - September 13, 2009
But HRs are subtracted on both sides.
Cheezombie - September 13, 2009
They aren't
but just as an example, in 2001, his BABIP was .268 while his batting average “on contact” was an eye-popping .407.
Incidentally, he did not have more HR than Ks that year, but that year also happened to see an unusual spike in his K rate (probably from chasing the HR record). In 2004 he actually did have more HR than Ks, although his BA on contact was an almost identical .406.
PaulThomas - September 13, 2009
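To make the gap between BABIP and batting average "on contact" concrete, here is a minimal sketch. It uses the standard BABIP formula, (H - HR) / (AB - K - HR + SF), and assumes that "on contact" means hits divided by at-bats that did not end in a strikeout. The stat line is invented for illustration and is not any real player's season.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Standard BABIP: home runs come out of both the numerator and the
    denominator, and strikeouts come out of the denominator."""
    denominator = at_bats - strikeouts - home_runs + sac_flies
    if denominator <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / denominator


def average_on_contact(hits, at_bats, strikeouts):
    """Assumed definition: hits per at-bat that did not end in a strikeout.
    Home runs count here, unlike in BABIP."""
    contact_at_bats = at_bats - strikeouts
    if contact_at_bats <= 0:
        raise ValueError("no at-bats with contact")
    return hits / contact_at_bats


# Hypothetical power hitter (made-up numbers).
line = dict(hits=150, home_runs=40, at_bats=500, strikeouts=100)

slugger_babip = babip(sac_flies=5, **line)
slugger_contact = average_on_contact(line["hits"], line["at_bats"], line["strikeouts"])
print(f"BABIP: {slugger_babip:.3f}, average on contact: {slugger_contact:.3f}")
```

The more of a hitter's best contact leaves the park, the further his BABIP falls below his average on contact, since every home run is removed from the BABIP numerator.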
Bonds had two years where his BABIP was that low
in 2005, when he was hurt, and 1999, when he hit an uncharacteristically low .262. Otherwise his career BABIP was .288, which is surprisingly close to normal considering his role as one of the greatest outliers ever.
designatedforassignment - September 13, 2009
it was .228 in 1999
PL78 - September 14, 2009
When he was hurt... your point?
designatedforassignment - September 15, 2009
I guess it's not ideally worded
What I meant by “A hitter’s BABIP usually hovers somewhere between .280-.320” was “For most hitters, their BABIP hovers…” I guess I should have visited the Department of Redundancy Department and rented the phrase, “Usually, a hitter’s BABIP usually hovers somewhere between .280-.320.”
Nico - September 13, 2009
Bonds BABIP was low because he hit a shitton of homeruns.
Also, sometimes people forget BABIP is a skill. Some guys have lower BABIP because they’re not as good at hitting. It’s variation in BABIP that suggests shifts in luck, as Nico did a good job of explaining.
travdog6 - September 13, 2009
It's also important to mention that hitters have much more control over BABIP than pitchers do.
Cheezombie - September 13, 2009
Hitting balls in hitters' counts is definitely a way to improve one's BABIP
Walk rate is very positively correlated with BABIP, which seems odd until you realize that the guys who walk a lot also tend to be the guys who are putting the ball in play on those 2-0 and 3-1 counts. Getting ahead in the count is always a good idea.
It drives me nuts when people treat plate discipline as “all about drawing walks”— no, it’s about getting ahead in the count and then either getting a walk or smashing the ball depending on what the pitcher gives you. It’s not about passivity, it’s about selective aggressiveness.
Also, if you do a search at THT, you should be able to unearth a study they did on the results of balls vs. strikes when they are put in play. Nico’s intuition is again correct here— balls are much, much more likely to become outs than strikes are. (One of the reasons Ichiro keeps hitting .340 is that his batting average on balls is as high as the typical player’s average on strikes!)
However: it remains the case that the best indicator of what a hitter is likely to do with balls in play is “what that hitter has done with them in the past,” and so if a guy is way over his own career average, that’s an extremely strong “tell” that he’s been lucky.
PaulThomas - September 13, 2009
Thanks for the summary and for the important additions
to my points, such as each hitter’s personal track record being a good predictor of their future outputs. I couldn’t agree more about “selective aggressiveness”; to me the best approach is to try to “zone hit” on a lot of counts, not to aim to hit on a certain count or get to a certain count. The pitcher has a lot to say about when they give you a hittable pitch; your job is to be ready to hit the “mistake” or “hittable pitch,” not to try to figure out when you’ll get the best pitch to hit.
Nico - September 13, 2009
I think this:
has been the A’s problem in the last three (if not five) years. We have mentioned before that the A’s seem to be all about drawing walks. Good hitters are not and shouldn’t be looking for a walk when they go up to bat. They are looking for the pitch they can drive. If they take the walk, it’s because they couldn’t do anything with the previous pitches. They don’t walk just to walk.
baseballgirl - September 14, 2009
No. The problem the last 3 years
Is that the hitters haven’t been very good. They haven’t been able to do anything even when they DO get that pitch they’re looking for.
mikev - September 14, 2009
We apparently haven't been watching the same team, then
PaulThomas - September 14, 2009
You are rightly pointing out the alternatives to luck
When you are flipping a coin the only factor is luck. So if you were to call heads a hit and tails an out, then you would expect that the “BABIP,” if you will, would be .500 on average. However, over certain stretches in your flipping you would expect to get a disproportionate number of either side, a fluctuation in the coin’s “BABIP.”
In the coin situation the only explanation for a “BABIP” above or below .500 would be luck. In baseball, however, there are many other alternatives, which you are rightly pointing out. Thus in baseball, when you look at a non-average BABIP, you have to weigh the odds of a person’s BABIP fluctuating due to luck against, as you are noticing, changes in their game or an opponent’s.
A while ago I looked at the sheer odds of Rajai Davis’s batting average increasing the way it did solely due to luck, so that fans could more easily evaluate whether his particular rise was due to luck or something else.
http://www.athleticsnation.com/2009/8/9/983066/old-rajai-new-success
5Tool - September 13, 2009
Actually, I disagree
If you flip a coin, and there’s a significant statistical anomaly away from a 1:1 ratio of heads to tails (the caveat being a suitable sample size), it’s probably an indication that someone is queering the pitch
bobnothing - September 14, 2009
True
At some point you may decide that your test is flawed. I was using the coin flip scenario as an example of when there is no skill involved, but obviously in real-world application there is. The grand question is where that point is. For example, if you flipped 5 heads in a row, would that be enough to make you think something was wrong? There is a 3.125% chance of even a true coin accomplishing that feat. For a coin I would say it depends mostly on sample size; for baseball there are infinite possibilities.
5Tool - September 14, 2009
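The coin-flip arithmetic in this exchange can be sketched in a few lines. The first function reproduces the 3.125% figure for five straight heads; the second asks the analogous baseball question under a pure-luck model in which every ball in play is an independent trial with a fixed hit probability. That independence assumption, the function names, and the example numbers are illustrative, not anything the commenters specified.

```python
from math import comb


def prob_all_heads(flips, p_heads=0.5):
    """Chance that every one of `flips` independent flips comes up heads."""
    return p_heads ** flips


def prob_at_least(hits, balls_in_play, true_rate):
    """Binomial tail: chance of `hits` or more hits in `balls_in_play`
    independent trials when each falls in with probability `true_rate`."""
    n, p = balls_in_play, true_rate
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(hits, n + 1))


print(f"five heads in a row: {prob_all_heads(5):.5f}")

# Hypothetical stretch: a true .300-BABIP hitter going 40-for-100 on balls in play.
hot_stretch = prob_at_least(40, 100, 0.300)
print(f"chance of 40+ hits in 100 balls in play by luck alone: {hot_stretch:.4f}")
```

A small tail probability does not prove skill; it only says luck alone is an uncomfortable explanation, which is where the other factors in this thread come in.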
I wonder if you're going against a straw man,
Nico, where you complain that high BABIP is attributed to “good luck” and low BABIP is attributed to “bad luck”. It is not my impression that serious stat people claim such a thing.
I wonder if you’re confusing BABIP of hitters with BABIP-against for pitchers. It is routinely claimed that a pitcher’s BABIP-against is primarily luck, but it doesn’t immediately follow that the same is true for the batter. This is a simple consequence of the fact/assumption that hitters can affect BABIP but pitchers can’t.
So when you complain about people saying a hitter’s BABIP is all luck, you’re either mischaracterizing what the stats people actually say, or you’re only railing against the partially-informed stats-wannabes who don’t get it right (in which case I think we would all join in agreement).
iglew - September 13, 2009
Thank you...
This is something that I was trying to figure out a noncombative way to say, since I failed so miserably to do that the last time we had this conversation. People who use only BABIP without looking at career averages, batted ball profiles, and speed are generally either doing back-of-the-napkin analysis that shouldn’t really be trusted or being lazy.
designatedforassignment - September 13, 2009
I've started using an eight-factor test for super-detailed analysis of a player's BABIP
1. Is the hitter right-handed or switch-hitting? (LHB is better.)
2. Does the hitter spray the ball to all fields or primarily pull the ball? (Spray is better.)
3. Does the hitter strike out often? (More strikeouts is better.)
4. Does the hitter walk often? (More walks is better.)
5. What is the hitter’s line drive rate?
6. What is the hitter’s GB/FB ratio? (More GB is better.)
7. What is the hitter’s IsoP? (More power is better.)
8. How fast is the hitter?
If a hitter scores above average in 6 or 7 of these categories (8 is almost impossible, not least because some of them fight each other, like GB rate and power), odds are he will be very strong on BABIP. If he’s only good at 1 or 2, his BABIP is likely to be Crosbyesque.
PaulThomas - September 13, 2009
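The eight-factor test above amounts to a tally, which can be sketched as follows. The comment gives no direction for line drive rate or speed, so this sketch assumes higher and faster are better; the factor names, the "unclear" middle band, and the treatment of zero favorable factors as weak are also assumptions added for illustration.

```python
# One flag per factor: True means the hitter is on the favorable side.
FACTORS = (
    "left_handed",           # 1. LHB is better
    "sprays_ball",           # 2. spray is better than pull
    "strikes_out_often",     # 3. more strikeouts is better (per the comment)
    "walks_often",           # 4. more walks is better
    "high_line_drive_rate",  # 5. direction assumed
    "high_gb_fb_ratio",      # 6. more ground balls is better
    "high_iso",              # 7. more power is better
    "fast",                  # 8. direction assumed
)


def favorable_count(profile):
    """Number of the eight factors on which the hitter rates favorably."""
    return sum(1 for factor in FACTORS if profile.get(factor, False))


def babip_outlook(profile):
    """'strong' at 6+ favorable factors, 'weak' at 2 or fewer,
    'unclear' in between (the middle band is an assumption)."""
    n = favorable_count(profile)
    if n >= 6:
        return "strong"
    if n <= 2:
        return "weak"
    return "unclear"


# Hypothetical speedy left-handed slap hitter.
slap_hitter = {
    "left_handed": True, "sprays_ball": True, "walks_often": True,
    "high_line_drive_rate": True, "high_gb_fb_ratio": True, "fast": True,
}
print(favorable_count(slap_hitter), babip_outlook(slap_hitter))
```

A plain tally treats every factor as equally important, which is the main thing a fitted model like xBABIP would do differently.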
Doesn't the ISO cut both ways with the HRs?
I was looking at Howard’s minor league numbers, where all of a sudden he started hitting fucktons of HRs but his number of doubles collapsed. Wouldn’t that produce a high ISO and a lower BA?
designatedforassignment - September 13, 2009
That should be a lower BABIP
designatedforassignment - September 13, 2009
This list is very similar to the list of factors
used in the xBABIP calculation discussed above.
Some of them they formulate a little differently in order to make them more workable for their calculation, but it’s pretty much the same stuff.
The only thing I see that xBABIP does that you aren’t including is a factor for the average number of pitches the hitter sees per AB.
They also do some basic corrections, one for park and one for how many years of data they have. (I’m not sure exactly what they do with year; presumably some factors are weighted differently with a longer history.)
iglew - September 13, 2009
I was under the impression that the calculator skipped some of the factors
but I haven’t actually had occasion to use it yet, so… well, that’s one of the reasons WHY I haven’t had occasion to use it yet. Though most of that is just complexity and lack of need. If I ever finish my article on the A’s’ BABIP, I’ll probably be digging it out.
The article you linked to in this post is where I got the test from, so if it’s not similar, that’s because I’m forgetting something…
PaulThomas - September 14, 2009
You're welcome.
Next time you’re looking for a “noncombative way”, feel free to borrow my weasely formulation:
iglew - September 14, 2009
Well the last time I think people thought I was calling Nico lazy when I wasn't.
designatedforassignment - September 14, 2009
WHAT THE HELL DID YOU JUST CALL ME?
mikev - September 14, 2009
You heard me.
designatedforassignment - September 14, 2009
You need to speak clearly.
Your voice is muffled by the walls in your mom’s basement!
mikev - September 14, 2009
iglew, I've seen it a lot with regard to hitters
Matt Holliday, at the start of the season, is one example, where I thought “luck” was inaccurately attributed to his low BABIP. Yes, maybe it’s just “partially-informed stats-wannabes” or maybe it’s well-informed stat-based folks just being lazy in that moment.
The purpose of the post isn’t to complain about it, though, it’s to try to look at how you might better be able to go deeper into the numbers, as PT does nicely with his list of 8 factors.
Nico - September 14, 2009
See, you're understanding what we are saying incorrectly.
A lower-than-true-talent BABIP indicates the presence of statistical noise or luck. Over a career, an average BABIP becomes closer to the true talent BABIP as the sample size increases. Therefore, significant deviations from a career norm without other obvious explanatory factors are likely to be related to luck. If Matt Holliday started hitting left-handed, for example, you would expect his BABIP to suffer. If Matt Holliday got traded to a bandbox that emphasized homers, you would expect his BABIP to decline. If Matt Holliday just stops hitting line drives, that is going to negatively affect his BABIP. Now, if the batted ball profiles are the same and those line drives stop falling in, then you are looking at statistical noise.
And for the record, over a larger sample size Matt Holliday responded just fine and regressed towards his previously excellent levels of awesomeness, which is what people looking at his numbers suggested would happen.
designatedforassignment - September 14, 2009
To be sure, there was some evidence that Holliday had "stopped hitting line drives" early in the season
but naturally, you’d expect his LD rate to regress toward his career norms as well.
It’s not so much that he wasn’t unusually average at the beginning of the year— he was— it’s just that half a season of average play isn’t enough to make our projections of what he’ll do in the second half of the season change very much.
Or to put that even more succinctly: regression has both a skill component and a luck component.
PaulThomas - September 14, 2009
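One simple way to picture why a slow half-season barely moves a projection is a weighted blend of the recent sample and the career norm. This is a generic shrinkage sketch, not the method any commenter or projection system here is known to use; the `prior_weight` of 400 balls in play is a hypothetical tuning value, and the example numbers are invented.

```python
def regressed_babip(observed, balls_in_play, career, prior_weight=400):
    """Blend an observed BABIP with a career norm.

    `prior_weight` is a hypothetical number of balls in play that the
    career figure is treated as being worth; larger values pull the
    estimate harder toward the career norm.
    """
    if balls_in_play < 0 or prior_weight <= 0:
        raise ValueError("balls_in_play must be >= 0 and prior_weight > 0")
    return (observed * balls_in_play + career * prior_weight) / (
        balls_in_play + prior_weight
    )


# Hypothetical: a career .340 BABIP hitter at .240 through 200 balls in play.
estimate = regressed_babip(0.240, 200, 0.340)
print(f"blended estimate going forward: {estimate:.3f}")
```

The blend says nothing about why the early sample was low; a real dip in line-drive rate and plain bad bounces both get pulled back toward the career figure.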
Whatever, stathead.
Holliday just hated Oakland. DUH!
mikev - September 14, 2009
... where "statistical noise or luck" is defined as
“anything our model does not measure”.
iglew - September 14, 2009
I saw it a bit differently with Holliday
I fully expected Holliday’s numbers to improve, but thought (and think) his low BABIP was the result of poor hitting, not poor luck. Good hitters will figure it out and make adjustments (and they are good enough that their adjustments are effective), just as hitters good and bad will see chance factors move towards the (or their) mean.
I really don’t think Holliday hit in bad luck at all in April and May. I think he just hit poorly but was unlikely to keep hitting poorly the rest of his career. In other words, his luck didn’t change, but rather his hitting predictably reverted to his high standard of excellence.
Nico - September 14, 2009
I think Holliday's change was all related to the change in his swing.
travdog6 - September 14, 2009
This was fascinating
Thanks to Nico and all the expert AN’ers. Discussions like this make AN the best sports blog of all. I am really glad that everyone is back.
redtopcowboy - September 14, 2009
Me too...this is so interesting to me.
baseballgirl - September 14, 2009
Question
How much of getting into hitters’ counts is luck, and how much is skill? I would think at least some of it is luck, considering a hitter has no control over a pitcher throwing balls/strikes.
travdog6 - September 14, 2009