Oct 22, 2014; Kansas City, MO, USA; MLB newly elected commissioner Rob Manfred speaks at a press conference before game two of the 2014 World Series between the Kansas City Royals and the San Francisco Giants at Kauffman Stadium. Mandatory Credit: Christopher Hanewinckel-USA TODAY Sports
Last Friday at the Sloan Sports Analytics Conference, MLB Network’s Brian Kenny interviewed new MLB Commissioner Rob Manfred. The two discussed a wide variety of topics, but perhaps the most exciting one was StatCast, Major League Baseball Advanced Media’s flashy and impressive player tracking system that debuted last year.
StatCast measures things like the angle and velocity with which the ball leaves a player’s bat, how quickly fielders react and what route they take to the ball, and how big a lead a runner takes off a base. MLBAM has made some impressive videos that display the technology’s ability, but we haven’t heard much about how much of the data will be available for public analysis.
On Friday, Manfred indicated that StatCast data will be available on MLB.com and the MLB At Bat mobile apps, with some raw data becoming available to analysts at some point. While it’s still unclear how granular that data will be, we have another chance to wonder how this new data will impact how we look at the game.
I like to break down stats into two categories: valuation and description. Valuation refers to any stat that tries to put an actual win or run value on a player’s performance. Stats like Wins Above Replacement, Batting Runs Above Average, Defensive Runs Saved, and Win Probability Added all fall under this umbrella. They are important because they are designed to estimate how many runs or wins a player added to his team. A larger positive total is always better, since more runs scored (or fewer runs allowed, when looking at something like Defensive Runs Saved) give the player’s team a chance to win more games. Most valuation stats combine multiple categories: WAR combines offense, defense, base running, and playing time, while Defensive Runs Saved combines fielding plays and throwing plays.
Descriptive stats, on the other hand, simply help us understand how a player executes his game plan. Swing rate, contact rate, and even on-base percentage are all descriptive stats. While these stats often correlate strongly with value, they only tell part of the story. A player with a .300 OBP may be a more valuable offensive player than a player with a .330 OBP if the first guy hits more doubles and home runs. Strikeout rate is a classic example of this, with many hitters nowadays striking out often while still being valuable. While these stats have limited use cases, they are still very important for fully understanding how a player creates runs and wins, and they help us see which parts of his game are going well or poorly when his value is above or below what we expect.
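To make the OBP comparison concrete, here is a minimal sketch in Python. Both stat lines are made up purely for illustration (they are not real players), and the `StatLine` class and function names are my own. It uses the standard definitions of on-base percentage and slugging percentage to show how a .300 OBP hitter can out-slug a .330 OBP hitter by a wide margin.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    """A hypothetical season line; every number below is invented."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sac_flies: int = 0


def obp(s: StatLine) -> float:
    # On-base percentage: times on base divided by the plate appearances that count.
    reached = s.hits + s.walks + s.hit_by_pitch
    chances = s.at_bats + s.walks + s.hit_by_pitch + s.sac_flies
    return reached / chances


def slg(s: StatLine) -> float:
    # Slugging percentage: total bases divided by at-bats.
    singles = s.hits - s.doubles - s.triples - s.home_runs
    total_bases = singles + 2 * s.doubles + 3 * s.triples + 4 * s.home_runs
    return total_bases / s.at_bats


# Hypothetical power hitter: lower OBP, lots of doubles and home runs.
slugger = StatLine(at_bats=550, hits=140, doubles=40, triples=2, home_runs=30, walks=36)
# Hypothetical contact hitter: higher OBP, very little power.
slap_hitter = StatLine(at_bats=540, hits=150, doubles=18, triples=1, home_runs=4, walks=42)

for name, line in [("slugger", slugger), ("slap hitter", slap_hitter)]:
    print(f"{name}: OBP {obp(line):.3f}, SLG {slg(line):.3f}")
```

OBP treats every time on base the same, so it cannot see the gap in extra-base hits between these two lines. A valuation stat would weight each event by its run value, which is exactly the part a descriptive stat leaves out.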
With StatCast data, we’ll be able to create better valuation statistics AND better descriptive statistics. The actual mix depends on the type and quality of the raw data released to the public, but one can begin to imagine what the impact will be as long as there is some data made available.
The most common topic that arises regarding new stats is always defensive metrics. Stats like Ultimate Zone Rating (UZR) and the aforementioned Defensive Runs Saved (DRS) use video scouts to estimate each batted ball’s trajectory and location in order to come up with an estimate of the likelihood that a batted ball will turn into an out. This method works rather well for putting players into tiers after gathering decent samples (usually three or more years is considered relatively reliable), but always causes controversy when an outlier arises and is rated as 20-30 runs better than a league average player at that position. While some of this is simply people not liking things that go against their instincts, there is a significant margin of error on these numbers due to the inexact method with which the play probabilities are created.
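Here is a rough sketch of how a play-probability approach can turn into a fielding number. This is a simplification under my own assumptions, not the actual UZR or DRS formula: the `BattedBall` class, the function name, the sample probabilities, and the `RUNS_PER_PLAY` conversion factor are all hypothetical. The idea is that a fielder gets credit for the share of an out that an average fielder would not have made, and gets docked the expected share when he fails to make the play.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    out_probability: float  # estimated chance an average fielder converts this ball
    out_made: bool          # whether our fielder actually recorded the out


def plays_above_average(balls):
    """Sum credit over batted balls: +(1 - p) for an out made, -p for a miss."""
    total = 0.0
    for ball in balls:
        if ball.out_made:
            total += 1.0 - ball.out_probability
        else:
            total -= ball.out_probability
    return total


# Hypothetical conversion from plays to runs; real systems vary this by
# position and batted-ball type.
RUNS_PER_PLAY = 0.75

# Three invented batted balls: an easy out, a tough out, and a missed 60/40 ball.
sample = [
    BattedBall(out_probability=0.9, out_made=True),
    BattedBall(out_probability=0.3, out_made=True),
    BattedBall(out_probability=0.6, out_made=False),
]

# Same plays, but the tough out is now judged a coin flip instead of a 30% play.
regraded = [
    BattedBall(out_probability=0.9, out_made=True),
    BattedBall(out_probability=0.5, out_made=True),
    BattedBall(out_probability=0.6, out_made=False),
]

plays = plays_above_average(sample)
regraded_plays = plays_above_average(regraded)
print(f"original estimate: {plays:+.2f} plays, {plays * RUNS_PER_PLAY:+.2f} runs")
print(f"regraded estimate: {regraded_plays:+.2f} plays, {regraded_plays * RUNS_PER_PLAY:+.2f} runs")
```

Changing one probability estimate on one ball moves the fielder from above average to exactly average in this toy sample. Spread that kind of uncertainty over hundreds of batted balls and it is easy to see where the margin of error in these numbers comes from.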