While reading the article on Nate Silver in Scientific
American this week, I noted that while working for Baseball Prospectus he
used a system of statistics called sabermetrics. After a bit of research on Wikipedia, I
found out that sabermetricians use baseball statistics to devise
new ways of analyzing the game that differ from the
traditional measures. An
example of a sabermetric statistic is batting average on balls
in play, or BABIP. The formula for
BABIP is (H - HR) / (AB - K - HR + SF), where H = hits,
HR = home runs, AB = at-bats, K = strikeouts, and SF = sacrifice flies. For a pitcher this formula makes quite a bit
of sense, since the traditional measure of pitching quality is earned run average
(ERA), which fails to account for many of the factors mentioned
above. One of the unique uses of BABIP
is comparing pitchers not only across leagues but also
across time! Although
we can do this with ERA as well, many of the factors that
influence ERA are not weighted the way they are in
BABIP. To illustrate, I
used an analysis tool that plots two
different pitchers against the average for all pitchers. For this analysis I chose Sandy Koufax and
Roger Clemens. Both
were known for amazing starts to their careers, so the question
I wanted to answer was, “Whose start was better?”
Luckily for me, baseball is discussed often enough that full BABIP statistics are available for both men. The graphical representation is shown below. Clemens clearly had a better start to his career, although, as BABIP would predict, the fluctuations from year to year are also extreme. Using this system of weighted averages, we could say (asterisks notwithstanding) that Clemens was a better pitcher than Koufax.
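For anyone who wants to check the arithmetic, here is a minimal Python sketch of the BABIP formula from earlier in the post, (H - HR) / (AB - K - HR + SF). The function name `babip` and the season line passed to it are my own made-up illustration, not real numbers for Koufax, Clemens, or anyone else.

```python
def babip(hits, home_runs, at_bats, strikeouts, sacrifice_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    # Denominator: at-bats that ended with the ball in play, plus sacrifice flies.
    balls_in_play = at_bats - strikeouts - home_runs + sacrifice_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play; BABIP is undefined")
    # Numerator: hits that stayed in the park.
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, invented purely for illustration.
example = babip(hits=150, home_runs=20, at_bats=600,
                strikeouts=100, sacrifice_flies=5)
print(round(example, 3))  # 130 / 485 -> 0.268
```

Note that the parentheses matter: home runs and strikeouts are removed because neither puts a ball in play for the fielders to handle.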

AJ-
Interesting post. As a point of clarification, is the green line a BABIP or ERA average for all pitchers? From your data, I would also argue that Koufax didn't have that excellent a start; he is only slightly above average for one data point.
However, I'd argue that neither is significantly better than average (statistically speaking) if you look at the variations throughout their careers. In fact, Koufax might even be slightly worse than average.
I'd try to prove this myself by comparing the Koufax and Clemens averages to the overall average, but I cannot read your axes on my computer. :(
It's cool, Kathleen. BABIP is an aggregate weighted score built from quite a few different statistics. The green line is a BABIP average for all pitchers. Apparently the year-to-year volatility in the graph is how BABIP is expected to behave. The skill of both Koufax and Clemens is assumed anecdotally (historically, Koufax's more so), and if you look at ERA, Koufax in his inaugural year would have been competitive with Clemens. The graphing tool I was able to find didn't allow ERA to be plotted over time as well. The average is at .3, which does not speak particularly well for Koufax; Clemens' BABIP, however, is well above .3, almost .4.