While sitting at my desk today, I started developing a somewhat satirical statistic to measure the overall value of a Home Run Derby participant (rewarding strikeouts and fly balls, penalizing inside-the-park home runs and infield hits). But this got me thinking: what if we used advanced stats to try to predict the Home Run Derby champion? Derby pitchers have historically suffered from inflated FIPs, a trend that's unlikely to change (ha). However, there are some statistical tendencies of the eight participants that I will be examining.
When I got home from work, I frantically put together a spreadsheet of the 2012 Home Run Derby standings and the corresponding players' statistics. The stats I used are batted-ball type percentages and other rates of production, or lack thereof. On another tab, I listed the same statistics for the 2013 participants. I then used the 2012 data to find the correlation between each stat I selected and the participant's finishing position in the Derby. For each 2013 player, I multiplied that correlation by the z-score of the player's value for the statistic relative to the rest of the field, which gave them a score for each category. I then summed those category scores to get their total Home Run Derby prediction score (I'll come up with a cool name for it later). From my impression of it, I'm pretty sure this is similar to linear weights. I didn't have enough time to look up the correct way to create a score based on correlation; I just did what made sense to me, and I want to get this post out before the Derby starts. Now, here are my results.
Cespedes takes the crown, Cano disappoints again
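
For anyone who would rather try this outside a spreadsheet, here is a rough Python sketch of the scoring described earlier: correlate each stat with last year's finishing position, multiply that correlation by each current player's z-score for the stat, and sum across stats. Everything specific in it is an assumption for illustration: the stat names (`fb_pct`, `k_pct`), the player labels, and all the numbers are made-up placeholders, not the actual spreadsheet data. It also assumes Pearson correlation, a sample standard deviation for the z-scores, and that position 1 means champion, so a more negative total points toward a better predicted finish.

```python
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def z_scores(values):
    """Z-score of each value relative to the field (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def derby_scores(last_year, finish, this_year):
    """Sum of (correlation x z-score) over every stat, one total per player.

    last_year: {stat: [value per last-year participant]}
    finish:    [finishing position per last-year participant], 1 = champion
    this_year: {stat: [value per this-year participant]}
    """
    n_players = len(next(iter(this_year.values())))
    totals = [0.0] * n_players
    for stat, values in this_year.items():
        r = pearson(last_year[stat], finish)
        for i, z in enumerate(z_scores(values)):
            totals[i] += r * z
    return totals


# Placeholder numbers, made up for illustration only (not real player data).
last_year = {"fb_pct": [45.0, 40.0, 38.0, 30.0], "k_pct": [22.0, 25.0, 18.0, 15.0]}
finish = [1, 2, 3, 4]
this_year = {"fb_pct": [44.0, 35.0, 41.0, 32.0], "k_pct": [20.0, 16.0, 24.0, 19.0]}
names = ["Player A", "Player B", "Player C", "Player D"]

totals = derby_scores(last_year, finish, this_year)
# Position 1 is the best finish, so the most negative total is the predicted winner.
ranking = sorted(zip(names, totals), key=lambda pair: pair[1])
for name, total in ranking:
    print(f"{name}: {total:+.2f}")
```

Because the correlation is taken against finishing position, a stat that helps in the Derby gets a negative weight. Flipping the sign (or correlating against something like total home runs instead) would make higher totals better; which convention the spreadsheet used is not something this sketch tries to settle.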