The Box-Cox Transformation

The Box-Cox transformation is a family of power transforms used to stabilize variance and make a dataset look more like a normal distribution. Lots of useful statistical tools assume roughly normal data, so applying the Box-Cox transformation to your wonky-looking dataset lets you use some of those tools.

Here’s the transformation in its basic form. For a positive value x and parameter \lambda:

\displaystyle \frac{x^{\lambda}-1}{\lambda} \quad \text{if} \quad \lambda\neq 0 

\displaystyle \log(x) \quad \text{if} \quad \lambda=0
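To make the two cases concrete, here is a minimal sketch in plain Python. The function name `box_cox` and the sample values are just illustrations, not code from the original analysis, and it assumes the input is strictly positive and that \lambda has already been chosen.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of one positive value x for parameter lam (hypothetical helper)."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        # The lambda = 0 case is defined as the natural log
        return math.log(x)
    # General power-transform case
    return (x ** lam - 1) / lam


# Illustrative values only: see how different lambdas reshape the same data
values = [1.0, 2.0, 4.0, 8.0]
for lam in (1.0, 0.5, 0.0):
    print(lam, [round(box_cox(v, lam), 4) for v in values])
```

The log case is not arbitrary: as \lambda approaches 0, the power form approaches log(x), so the family is continuous in \lambda. In practice \lambda is usually estimated from the data rather than picked by hand.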

Continue reading “The Box-Cox Transformation”

What Goes First – Speed or Strength?

I recently had access to a lot of baseball data, specifically data on every season of every player in MLB history going back to 1871. Here’s some analysis of how baseball players lose speed, strength, or both over the course of their careers. The analysis primarily consisted of variable creation and data queries. Unfortunately, the code is not available 😦

Continue reading “What Goes First – Speed or Strength?”