Saturday, August 8, 2020

Small Sample Strikeout Woes

Gary Sanchez, through his first 25 plate appearances in this abbreviated 2020 season, accumulated 12 strikeouts, a rate of 48 percent. Needless to say, there has been a lot of hand-wringing about this slow start to the year, including questions about his focus (which New York fans love to talk about with respect to Sanchez) and his overall ability (despite the massive sample of Sanchez being an excellent hitter, the best among catchers since his debut in 2016 on a rate basis).

I wanted to pen a quick post about building narratives off of small samples of plate appearances. Given the short season and the general nature of baseball fandom, it is difficult to set aside what you are watching on a game-to-game basis and keep perspective on two things: the range of outcomes a player can produce in a small sample, and what that player should be expected to do at the macro level (single or multiple seasons) in the future.

Inspired by Sanchez's strikeout-riddled start to the season, I looked at theoretical hitters with true-talent strikeout rates from 20 percent to 40 percent in increments of two percentage points. For each hitter I simulated 10,000 samples of 25 plate appearances. I used a Monte Carlo method in which a randomly generated number below the hitter's true-talent strikeout rate counted as a "strikeout," while every other number counted as some other plate appearance outcome, which I was not concerned with for this study. The following are histograms for each theoretical hitter, showing the distribution of strikeout rates across the 10,000 samples of 25 plate appearances:
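For anyone who wants to reproduce this kind of simulation, here is a minimal Python sketch. It is an illustration of the approach described above, not the exact code behind the histograms: the use of NumPy, the function name `simulate_strikeout_counts`, the fixed seed, and the choice of uniform draws between 0 and 1 are all assumptions.

```python
import numpy as np

def simulate_strikeout_counts(k_rate, n_pa=25, n_samples=10_000, seed=0):
    """Return the number of strikeouts in each simulated sample.

    k_rate is the hitter's assumed true-talent strikeout rate (0 to 1).
    """
    rng = np.random.default_rng(seed)
    # One uniform draw in [0, 1) per plate appearance. A draw below the
    # true-talent rate is a strikeout; anything else is another outcome.
    draws = rng.random((n_samples, n_pa))
    return (draws < k_rate).sum(axis=1)

# True-talent strikeout rates from 20 to 40 percent in 2-point steps.
for pct in range(20, 41, 2):
    counts = simulate_strikeout_counts(pct / 100)
    # 13 or more strikeouts in 25 PA is a sample rate above 50 percent.
    share = (counts >= 13).mean()
    print(f"{pct}% talent: {share:.2%} of samples with 13+ strikeouts")
```

Because the draws are random, the shares printed here will differ slightly from run to run (or from seed to seed) and from the figures quoted in this post.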

The horizontal axis represents the strikeout rate in the sample. The dashed line represents a strikeout rate of 50 percent, so any result past that line means the hitter struck out in at least 13 of the 25 plate appearances in the sample. As you would imagine, a hitter with a true-talent strikeout rate of 20 percent rarely records 13 strikeouts in 25 plate appearances. Also not surprising, a 40 percent strikeout rate hitter is rung up at least 13 times more often than the hitter with a 20 percent talent. For more clarity on how often each hitter records a strikeout rate of at least 50 percent, I created another visualization using the distributions above:
The 20 percent strikeout rate hitter recorded at least 13 strikeouts in just six of the samples, or 0.06 percent. The 40 percent hitter recorded the value of interest in 16.1 percent of the samples. Even a 40 percent true-talent hitter rarely records this result, and there are not even any hitters of this type who regularly play in MLB. The average strikeout rate in MLB is 23 percent, and a hitter of that talent level would be expected to record this result about 0.15 percent of the time.

Let us now go back to Gary Sanchez. Depth Charts projections at FanGraphs have him at a 28.9 percent strikeout rate for the rest of the season, which we can consider that system's estimate of his true talent. We would expect a hitter of this caliber to strike out in at least 50 percent of a 25 plate appearance sample about 1.2 percent of the time. What this means is that Sanchez is performing (with respect to striking out) at a level that is certainly within his range of possible outcomes, but it is an incredibly bad outcome at that.

So I guess what you, the reader, should take away from this is maybe, just maybe, cut the guy a little slack. He has certainly been dreadful thus far, but I would not expect this level of performance to persist throughout the season.
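As a cross-check on the simulated frequencies, the same question can be answered exactly. If each plate appearance is treated as an independent trial with a fixed strikeout probability (which is the assumption built into the simulation), the number of strikeouts in 25 plate appearances follows a binomial distribution, and the chance of 13 or more is a simple sum. The sketch below uses only the Python standard library; the helper name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Talent levels discussed in the post: the 20 percent hitter, the MLB
# average (23 percent), Sanchez's projection (28.9 percent), and the
# 40 percent hitter.
for talent in (0.20, 0.23, 0.289, 0.40):
    p = prob_at_least(13, 25, talent)
    print(f"{talent:.1%} talent: {p:.2%} chance of 13+ strikeouts in 25 PA")
```

The exact values should land close to the simulated ones, though not identically, since 10,000 samples still leave some sampling noise, especially for the rarest outcomes.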

