Ultra Small Sample Size Defensive Stats, Buster Olney, and Alfonso Soriano

March 30, 2012; Mesa, AZ, USA; Chicago Cubs left fielder Alfonso Soriano (12) against the Los Angeles Dodgers in the first inning at HoHoKam Stadium.  Mandatory Credit: Rick Scuteri-US PRESSWIRE

Buster Olney tweeted this earlier today:

Superficially, Olney appears to have a point. Soriano is a bad defender. This would appear to present a flaw in the UZR mechanism.

But what if Olney had tweeted this?

GMs say there are no perfect offensive metrics. Some evidence: No. 9-ranked player in OPS is A.J. Pierzynski.

— Buster Olney (@Buster_ESPN) April 24, 2012

Or this:

GMs say there are no perfect offensive metrics. Some evidence: No. 4-ranked player in wRC+ is Chase Headley.

— Buster Olney (@Buster_ESPN) April 24, 2012

The reaction would be that such "evidence" is silly. Everyone knows that two weeks' worth of data for OPS or wRC+ can fluctuate dramatically, and that this sort of thing is not unusual.

For whatever reason, though, while people seem to be able to accept large fluctuations in small sample sizes in offensive stats, there is resistance to doing so with defensive statistics. The sense seems to be that there should not be slumps in defensive performance, that defensive performance should be more consistent from year to year than offensive stats.

This problem is exacerbated by the fact that defensive numbers generally require a longer period of time to normalize and stabilize than offensive numbers -- the inherent reliability of defensive numbers over a full season is about the same as the inherent reliability of offensive numbers over a half-season.

Look at Headley and Soriano, for example. The RZR data shows how many balls have been hit into Soriano's area of responsibility: 17 "Balls in Zone," plus another 13 "Out of Zone" balls that Soriano has made plays on. That is 30 balls in Soriano's territory driving his UZR.

Headley, meanwhile, has 74 plate appearances, almost 2.5 times as many opportunities to affect his offensive numbers as Soriano has had to affect his defensive numbers. Headley's sample is larger than Soriano's, yet it can still be dismissed as Small Sample Size fluctuation, while Soriano's 30 plays are enough for Olney to use them to try to discredit UZR.
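To put rough numbers on that comparison, here is a minimal Python sketch. The counts (17, 13, 74) come from the paragraphs above. Everything else is an assumption made for illustration: it treats every chance as an independent yes/no trial with a 50% success rate, which is not how UZR or wRC+ is actually calculated. The only point is to show how much noisier a rate over 30 chances is than the same rate over 74.

```python
import math

# Counts quoted in the article
balls_in_zone = 17
out_of_zone_plays = 13
headley_plate_appearances = 74

soriano_chances = balls_in_zone + out_of_zone_plays
ratio = headley_plate_appearances / soriano_chances


def binomial_standard_error(rate, trials):
    """Standard error of an observed success rate over independent trials."""
    return math.sqrt(rate * (1 - rate) / trials)


# Hypothetical assumption: a 50% success rate, picked only for illustration.
# Neither UZR nor wRC+ is a simple success rate.
ASSUMED_RATE = 0.5

se_soriano = binomial_standard_error(ASSUMED_RATE, soriano_chances)
se_headley = binomial_standard_error(ASSUMED_RATE, headley_plate_appearances)

print(f"Soriano chances: {soriano_chances}")
print(f"Headley PA per Soriano chance: {ratio:.2f}")
print(f"Standard error over {soriano_chances} trials: {se_soriano:.3f}")
print(f"Standard error over {headley_plate_appearances} trials: {se_headley:.3f}")
```

Under those simplifying assumptions, the spread over 30 chances is roughly one and a half times the spread over 74, so the smaller defensive sample should be expected to swing more, not less.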

Next time someone wants to point to, say, Carlos Lee's LF UZR numbers from 2011 (when he played 80 games in the outfield), think of Chris Shelton, or Endy Chavez, or any number of other bad offensive players who had a great month or two at the plate.