Sabermetric Musings

Monday, December 4, 2023

Using Velocity-Based Splits in Analysis

One avenue of analysis I see most around the postseason, when individual batter-pitcher matchups or team matchups are more heavily scrutinized, is bucketing offensive performance by velocity faced, specifically against fastballs. An analyst might posit “This bullpen has thrown X percent of its fastballs over 95 mph and this team has produced a Y wOBA on such pitches.” Or something along the lines of “John Doe is going to have a tough go of it against this team’s cadre of high-octane arms; he has posted an X wOBA on fastballs over 95 mph, versus Y on fastballs below that threshold.” Similar points might be made in an analysis of a player in the midst of a slump during the championship season (instances that come to mind are various pieces related to Jose Abreu from 2023, for example). In all of these cases, the performance against what many would call “premium velocity” or “95+”, which at this point in time I would argue does not constitute premium velocity (reminds me of this Effectively Wild episode), is certainly descriptive. We take what happened in the past, create a cutoff (albeit an arbitrary one), and calculate the respective wOBAs.

That is all well and good, a description of what happened in the past. But I have a few points of contention with this type of analysis, either at the player or team level. The first, which I have already alluded to, is the arbitrary point at which most decide to bucket the data. This is almost always 95 mph. I am not aware of why this became the accepted line of demarcation; I would guess it just looks like a nice round number. Or maybe it is related to MLB’s definition of a hard hit ball, also 95 mph. Nevertheless, I can appreciate the desire to bucket information for ease of analysis or understanding. Modelling this in a continuous manner and presenting the results is often a more cumbersome endeavor and may lose some of the readers. A trivial comparison of performance on either side of a cutoff is more digestible. I suggest that in the future we use a cutoff based on velocity percentile. League-wide velocity is not a static figure, thus the significance of the 95 mph figure is different in, say, 2023 versus 2015.
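As a rough illustration of the percentile idea, here is a minimal sketch. The velocity samples and the choice of the 80th percentile are made up for the example; a real version would use every fastball thrown in each season.

```python
from statistics import quantiles

def velocity_cutoff(velocities, percentile=80):
    """Return the velocity at the given percentile (1-99) of the sample."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cut_points = quantiles(velocities, n=100, method="inclusive")
    return cut_points[percentile - 1]

def split_by_cutoff(velocities, cutoff):
    """Bucket pitches into at-or-above the cutoff and below it."""
    high = [v for v in velocities if v >= cutoff]
    low = [v for v in velocities if v < cutoff]
    return high, low

# Toy samples only (hypothetical values, not real league data).
fastballs_by_season = {
    2015: [90.1, 91.4, 92.0, 92.8, 93.3, 93.9, 94.6, 95.2, 96.0, 97.1],
    2023: [91.5, 92.7, 93.4, 94.0, 94.6, 95.1, 95.9, 96.4, 97.3, 98.6],
}

# One cutoff per season, so "high velocity" moves with the league.
cutoffs = {
    season: velocity_cutoff(velos, 80)
    for season, velos in fastballs_by_season.items()
}
```

The point of the design is that the bucket boundary is recomputed for each season, so the same share of pitches counts as high velocity every year.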

The arbitrary buckets are one issue, but like I said I can understand the impetus for wanting to structure analysis in this way. My main issue is the usefulness of this information altogether, in that I am very skeptical that this sort of analysis yields valuable insights into performance in the past or future. The usage of wOBA as the measure of choice belies a player/team’s performance against higher velocity pitches because it only considers pitches that mark the end of a plate appearance. All of the fastballs that a player swings through with either zero or one strike are not considered at all. Furthermore, using wOBA muddies the picture by not distinguishing whether a player/team struggles with making contact or producing damage on contact. If you want to demonstrate effectiveness (or lack thereof) against velocity, these two skills need to be separated out. The easiest method would be to consider whiff or swinging strike rate and wOBACON (or some measure concerned solely with contact quality). This criticism does not apply to all analyses of this type, but it is something that I see quite often and think needs to be addressed. I understand this distinction can be a bit tricky. It requires a few different Savant searches and some spreadsheet jockeying. But failing to make this distinction prevents any conclusions one draws from being revealing.
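To make the separation concrete, here is a small sketch that computes swinging strike rate and wOBACON on each side of a cutoff. The record fields (`velo`, `whiff`, `in_play`, `woba_value`) and the pitch data are hypothetical, and the wOBA values attached to each batted ball are illustrative rather than actual season weights.

```python
def bucket_summary(pitches):
    """Return swinging strike rate and wOBACON for a list of pitch records."""
    whiffs = sum(1 for p in pitches if p["whiff"])
    contact = [p for p in pitches if p["in_play"]]
    return {
        # Swinging strikes per pitch, so non-terminal pitches are counted too.
        "swstr_rate": whiffs / len(pitches) if pitches else None,
        # Average wOBA value on balls in play only (contact quality).
        "wobacon": (
            sum(p["woba_value"] for p in contact) / len(contact)
            if contact else None
        ),
    }

def split_summary(pitches, cutoff=95.0):
    """Summarize fastballs at or above the cutoff versus below it."""
    high = [p for p in pitches if p["velo"] >= cutoff]
    low = [p for p in pitches if p["velo"] < cutoff]
    return {"high": bucket_summary(high), "low": bucket_summary(low)}

# Toy pitch-level records (made up for illustration).
pitches = [
    {"velo": 96.2, "whiff": True, "in_play": False, "woba_value": 0.0},
    {"velo": 97.0, "whiff": False, "in_play": True, "woba_value": 0.0},
    {"velo": 95.4, "whiff": False, "in_play": True, "woba_value": 0.9},
    {"velo": 98.1, "whiff": True, "in_play": False, "woba_value": 0.0},
    {"velo": 92.3, "whiff": False, "in_play": True, "woba_value": 2.0},
    {"velo": 93.8, "whiff": False, "in_play": False, "woba_value": 0.0},
    {"velo": 91.9, "whiff": True, "in_play": False, "woba_value": 0.0},
    {"velo": 94.1, "whiff": False, "in_play": True, "woba_value": 0.0},
]

summary = split_summary(pitches)
```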

Now, let us move on to the functionality of bucketing performance by velocity faced, even after making the distinction between contact and contact quality. Looking at contact (in the form of swinging strike rate) or contact quality (in the form of wOBACON), there has been a standard bell curve for deviations in performance against higher and lower velocity, looking solely at players who accumulated 100 plate appearances in a given season since 2015.

There are a few conclusions we can draw from these figures. First, these look like standard Gaussian distributions. The swinging strike rate figure is centered close to 0.05 (i.e. the median player has a swinging strike rate about five percentage points higher against 95+ mph fastballs). The wOBACON plot is similarly shaped, but centered around 0 (i.e. the point where there is no difference between a player’s wOBACON when you bucket balls in play by velocity faced). The shapes of both figures indicate one of two things: either the talent in facing higher end velocity (relative to lower velocity) is normally distributed in MLB or the differences in the two figures are random around the league-wide mean. The fact that the wOBACON figure is centered around zero lends credence to the latter; intuitively players should not be expected to produce better against higher velocity. Therefore, using a threshold of 100 PAs, looking at a season’s worth of PAs is not sufficient in judging a player’s ability to face high end velocity. For swinging strikes, there is a bit more to consider given the center of the distribution is located at a value that makes sense; we expect (and know) that higher velocity pitches (all else being equal) result in more whiffs. But still, within a season, any difference that deviates from that median is mostly noise because year over year, there is little to no consistency in those deviations.
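One way to check year-over-year consistency is to correlate each player's deviation in one season with his deviation in the next. The sketch below assumes the deviations have already been computed per player; the player keys and values are invented for illustration and do not reproduce the figures discussed above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_over_year(dev_first, dev_second):
    """Correlate deviations for players who appear in both seasons."""
    shared = sorted(set(dev_first) & set(dev_second))
    xs = [dev_first[p] for p in shared]
    ys = [dev_second[p] for p in shared]
    return shared, pearson(xs, ys)

# Hypothetical swinging strike rate gaps (95+ minus below 95) per player.
dev_2022 = {"a": 0.08, "b": 0.02, "c": 0.05, "d": 0.07}
dev_2023 = {"a": 0.03, "b": 0.06, "c": 0.05, "e": 0.04}

shared_players, r = year_over_year(dev_2022, dev_2023)
```

A correlation near zero across a large sample of players would be consistent with the deviations being mostly noise; a toy sample this small says nothing either way.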

This whole post is a long-winded plea to caution analysts looking to find signal in any velocity-based splits when it comes to batting performance (this can undoubtedly be said for any analysis leveraging splits). We know facing higher end velocity is going to result in lower offensive performance on the whole (evident in the depressed offensive environment of the postseason). But there is little to no evidence that players are particularly adept or poor at facing high end velocity relative to lower velocity at the major league level. There might be something to explore when looking at minor league players, where the distribution of talent is much wider. Without extensive minor league pitch-tracking publicly available, however, this is not possible to verify at this point in time.

Thursday, March 4, 2021

RAPM Style wOBA Estimates

The most common all-in-one statistics in basketball are based on the adjusted plus-minus (APM) framework. The basic idea is as follows: how many points per 100 possessions does a player add to a team's scoring ledger. It is a regression that consists of the player in question, the other nine players on the court in the stints he plays, and the score differential in the segments that player is on the court. This has spawned a plethora of other metrics including box plus-minus, ESPN's real plus-minus, 538's RAPTOR, Basketball Index's LEBRON, and most notably regularized adjusted plus-minus (RAPM). Ryan Davis over at nbashotcharts.com has developed the current gold-standard for RAPM (at least in the public sphere) where he includes a raw RAPM, multi-year RAPM, and luck adjusted variants of both where he accounts for opponent three point shooting and free throw shooting, two quantities which a single player does not have control over. The newer variants of plus-minus metrics are usually developed with and compared to players' RAPM figures when building those metrics.

The key word in RAPM is regularized. What this means is that when fitting the model, the coefficients (both the external factors and the actual player coefficients) are regressed towards zero in a process called ridge regression. The rate at which the coefficients are regressed towards zero is controlled by the hyperparameter lambda. This technique has been utilized in other sports applications, most notably in hockey by a few public-facing analysts. For context, there are no confidence intervals given as an output of the ridge regression, but bootstrap-based distributions can be derived by fitting the model over and over again. This can take a while depending on the size of the training data set.
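To show the mechanics in the simplest possible setting, here is a sketch of ridge shrinkage and a bootstrap interval for a single predictor with no intercept. This is a deliberate simplification of a full RAPM fit (which has many indicator columns), and the data are toy values chosen only to make the arithmetic easy to follow.

```python
import random

def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for one predictor with no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # A larger lambda inflates the denominator and pulls the slope toward zero.
    return sxy / (sxx + lam)

def bootstrap_interval(xs, ys, lam, n_boot=1000, seed=0):
    """Percentile interval from refitting on resampled (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        estimates.append(
            ridge_slope([xs[i] for i in idx], [ys[i] for i in idx], lam)
        )
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper

# Toy data (hypothetical): y is exactly 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unregularized = ridge_slope(xs, ys, lam=0.0)   # ordinary least squares slope
regularized = ridge_slope(xs, ys, lam=14.0)    # shrunk toward zero
interval = bootstrap_interval(xs, ys, lam=14.0, n_boot=200)
```

The refitting loop is why the bootstrap gets slow on real data: every resample requires a full model fit.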

I set out to apply the ridge regression framework to derive estimates of true hitting and pitching talent in the form of wOBA and wOBA allowed, respectively. The inputs into the model were the pitcher, the batter, whether or not the batter had the platoon advantage, the park, and the month. The month was included because offenses get the benefit of the ball traveling further in the warmer air. The parameters added to the model were meant to give context neutral estimates of each batter and pitcher in the dataset. Players who play in favorable offensive or defensive parks are adjusted accordingly, as are players who are deployed in such a way that they often have the platoon advantage.

I trained the model on data from the 2019 and 2020 seasons. After training the model I pulled the hitters and pitchers who had 200 plate appearances or 200 batters faced, respectively. Below are tables including the top and bottom few names from my dataset. wOBA is the batter's actual wOBA from 2019 through 2020, woba_est is the player's context neutral wOBA estimate, and ERA- is the context neutral ERA indexed to 100 where anything below 100 is better than average:

Model data via MLBAM. Plate appearances, batters faced, and ERA- via FanGraphs

Saturday, February 20, 2021

The Effects of Curveball and Four-Seam Fastball Spin Axis Differential

A swinging strike is arguably the most important pitch result from the perspective of a pitcher. A strike, no matter how it is obtained, is a strike when you look at a scorecard. However, when evaluating and predicting the performance of a pitcher, swinging strike rates are vital. A swinging strike is different than a foul ball: in the case of a foul ball the batter makes contact with the pitch. A swinging strike is different from a called strike: in the case of a called strike the batter does not swing, thus we cannot know the effectiveness of the pitch if the batter swung. The swinging strike is the outcome that gives us, as analysts, the most positive information about the pitcher. We see the batter make the decision to swing at the pitch and upon making that decision he misses. So while all strikes are technically equal, the swinging strike gives us the most information about the effectiveness of a pitch.

For these reasons, modelling pitches in terms of their probability of generating a swinging strike is the best way of gauging pitch quality. What goes into a swinging strike? Well, we know velocity and spin rate are essential factors. Higher velocities give batters less time to make the optimal decision. Higher spin rates allow pitches to move as much as possible given their spin axis (transverse spin and seam-shifted wakes play into pitch movement but all else being equal spinning the ball as much as possible is beneficial, sometimes with the exception of changeups and splitters. That is a topic for another time).

Once the batter makes the decision to swing at a pitch, attempting to disguise its movement is of the utmost importance. Pitchers that throw both a fastball and a curveball have the benefit of deception because the spins of the two pitches are opposite. Four-seam fastballs have close to perfect back spin (depending on the pitcher). Curveballs have close to perfect top spin (again, depending on the pitcher). To the batter, these pitches look eminently similar. The orientation of the seams is seemingly the same, but the actual spin is the opposite. Leveraging the spin characteristics of these two pitches can give pitchers advantages in terms of deception.

With the implementation of Hawk-Eye cameras in all MLB stadiums in 2020, the spin axis of every pitch can be directly tracked. Previously with Trackman, MLBAM tracked the spin axis of pitches by inferring the axis from the movement of the pitch. The Hawk-Eye cameras can directly track the spin axis of the pitch. This gives analysts a better understanding of how pitches pair with each other and gives insight into pitches that move more than their movement-inferred spin axes would indicate.

In this post I wanted to look at the effect of spin axis differential between curveballs and four-seam fastballs on those pitches' swinging strike rates. There is a lot of research into the effect of spin axis on the effectiveness of pitches, all else being equal, but I wanted to look at how pitchers who throw both of these pitches can get an extra edge. I built a model for pitchers who throw both of these pitches and how often they should generate swinging strikes.

First let us look at which pitchers generate the most spin on both four-seam fastballs and curveballs. I took all pitchers in 2020 who threw 100 four-seamers and 100 curveballs and compared their spin efficiencies on each pitch type (the percent of spin that contributes to transverse movement).

There is not much of a relationship between the two quantities. Lance Lynn is notable because despite the fact that he does not have great spin efficiency on either pitch he is supremely effective. I will note that Lynn does not throw his curveball often and is known to manipulate the shape of his fastball. He is unusual in his ability to throw different fastballs to great effect. Shane Bieber was the best pitcher in the league in the shortened 2020 season and he shows well by these measures. Lucas Sims has elite spin rates overall but seems to have trouble generating high-end transverse movement given the ample spin he imparts on the ball. Hyun-Jin Ryu does not throw with great velocity but maintains great results through efficient spin. For further insight into spin rates and spin efficiency on four-seamers and curveballs I direct the reader to the following visual where the reader can get an idea of which pitchers have the best raw pitch characteristics (without consideration of arsenal and seam-shifted wake. Bauer, Lugo, and Sims stand out here):

For all pitchers who threw 100 curveballs and four-seam fastballs, I binned the swinging strike rates of the curveballs and fastballs by swinging strike rate (the percentage of pitches that resulted in swinging strikes). I also gave context into the effectiveness of the pitchers in each bin by summarizing the pitchers in each bin by wOBA allowed. The spin axis differential should be important but there are other factors at play so the qualitative importance of the spin axis differential can be derived from this visual:

The spin axis and spin axis differential are displayed in terms of hands on a clock. For example, a curveball with perfect top spin has a spin axis of 6:00 (it moves straight down). A fastball with perfect backspin has a spin axis of 12:00 (relative to other pitches it has ride. Obviously in an absolute sense the pitch does not rise). The spin axes were taken from the perspective of the batter. I will note that this is not literally the axis around which the ball spins. It describes the axis of rotation in terms of how the ball should move from transverse spin.
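The clock-face arithmetic can be sketched as follows. The direction of the subtraction (curveball minus fastball, wrapped onto a 12-hour face) is an assumption made for this example, chosen so that a 12:00 fastball paired with a 6:00 curveball gives a differential of 6:00; the function names are hypothetical.

```python
def clock_to_minutes(clock):
    """Convert an 'H:MM' clock-face axis to minutes past 12:00 (0-719)."""
    hours, minutes = clock.split(":")
    return (int(hours) % 12) * 60 + int(minutes)

def format_differential(total_minutes):
    """Format a differential in minutes as 'H:MM' on a 12-hour face."""
    hours, minutes = divmod(total_minutes % 720, 60)
    return f"{hours}:{minutes:02d}"

def axis_differential(fastball_axis, curveball_axis):
    """Clock-face gap from the fastball axis around to the curveball axis."""
    diff = clock_to_minutes(curveball_axis) - clock_to_minutes(fastball_axis)
    # Wrap negative gaps back onto the 12-hour (720 minute) clock face.
    return format_differential(diff % 720)

def minutes_to_degrees(total_minutes):
    """Each hour on the clock face spans 30 degrees (0.5 degrees per minute)."""
    return total_minutes * 0.5

perfect_mirror = axis_differential("12:00", "6:00")
tilted_pair = axis_differential("1:00", "7:30")
```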

The visual above gives the reader a qualitative idea of the importance of spin axis differential between curveballs and four-seamers in a pitcher's arsenal. In the case of generating swinging strikes on either pitch there is not a clear relationship. I wanted to account for the overall effectiveness of the pitchers in each bucket by accounting for their wOBA allowed. My thought was maybe the axis differential matters, but so do other aspects of a pitch. Maybe pitchers who happened to be really good do not get optimal axis differential on their four-seamers and curveballs and derive value via other means? It is impossible to totally discern these relationships from a chart. So I built a model to empirically evaluate the importance of the spin axis differential.

The model is a generalized additive model that takes the smoothed relationship between a pitch's velocity, location, and movement, whether or not the pitcher has the platoon advantage, and a variable for spin axis differential modelled as a random effect. The model was trained on 80 percent of the pitches thrown by the pitchers who threw 100 four-seamers and curveballs. It was then tested on the remaining 20 percent of those pitches. The model had an accuracy of 89.8 percent in that it was correct 89.8 percent of the time when it predicted whether or not a pitch would be a swinging strike. From here I could apply the model to every pitch in the data set. There are many ways you can look at this data. I created a column in the data frame that I called expected swinging strike rate, the probability that the pitch was a swinging strike. I grouped and summarized the expected swinging strike rate data by pitcher and posted a thread on twitter for those who are interested in which pitches fared best. Guys like Bieber, Glasnow, Gray, and Cole have curveballs at the top of this leaderboard, but it is worth noting that Griffin Canning by pure "stuff" has an excellent curveball. And Josh Staumont has the only fastball that appears at the top when you combine four-seamers and curveballs.
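The 80/20 evaluation described above can be sketched in a few lines. This is not the model itself, only the bookkeeping around it, and the labels and predicted probabilities are invented. Because swinging strikes make up a minority of all pitches, the sketch also computes the share of the majority class as a point of comparison for any accuracy figure, along with precision, which answers a slightly different question than accuracy.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(labels, probs, threshold=0.5):
    """Share of all predictions (positive or negative) that were correct."""
    preds = [p >= threshold for p in probs]
    return sum(pred == bool(y) for pred, y in zip(preds, labels)) / len(labels)

def precision(labels, probs, threshold=0.5):
    """Share of predicted positives that were actually positive."""
    flagged = [bool(y) for y, p in zip(labels, probs) if p >= threshold]
    return sum(flagged) / len(flagged) if flagged else None

def majority_baseline(labels):
    """Accuracy from always predicting the more common class."""
    positive_rate = sum(labels) / len(labels)
    return max(positive_rate, 1 - positive_rate)

# Toy labels (1 = swinging strike) and model probabilities, made up.
labels = [0, 0, 1, 1]
probs = [0.1, 0.6, 0.7, 0.2]

toy_accuracy = accuracy(labels, probs)
toy_precision = precision(labels, probs)
toy_baseline = majority_baseline([0] * 9 + [1])
```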

As interesting as the player-level data is, ultimately I wanted to check the importance of spin axis differential for a pitcher's four-seamer and curveball. The variable for spin axis differential was statistically significant based on its p value, though much less vital to predicting swinging strikes than the pitch's velocity, movement, and location. The p value in general is often misinterpreted, but the result is encouraging nonetheless. Similar to my last post about park effects, I could extract the distribution of possible effects for each spin-axis deviation. Here are the results:

Remember 6:00 is perfect spin mirroring. Based on the model, the spin axis differential is important in predicting swinging strike rates, but much less so than the other characteristics. Here you can see that while the spin axis differential is important, there is no discernible pattern. Perfect spin-mirroring has almost no effect while 6:30 has a negative effect. On the other hand 7:00 and 5:00 have the highest positive effects followed by 8:30, the latter of which is far off from perfect spin-mirroring.

What can we conclude from this? I think spin mirroring is still very important. Making your pitches appear the same to the batter's eye is essential in deceiving him. However, my model does not capture this phenomenon and my hypothesis is that the random effect in my model is not really addressing the effect of the axis differential. Instead it is capturing the effect of the pitcher overall and the best pitchers at generating swinging strikes happen to fall into the 7:00, 5:00, and 8:30 buckets. I will point out that if you look at the twitter thread with the expected swinging strike rate leaders, the top pitchers cluster around axis differentials between 5:30 and 6:30 (Bieber, Canning, Ray, Glasnow, Duffey, Cole, and Young) which is close to perfect mirroring and from the standpoint of the batter is probably barely discernible. Those guys make up half of the top of the leaderboard. Still, velocity and movement remain the most important factors in getting hitters to whiff but I remain convinced that small edges are to be had with effective spin-mirroring. That does not mean work on this type of data is over. There are better models that can be built that better isolate the effect of spin-mirroring from the overall quality of the pitcher. And we will have more data. Hawk-Eye was implemented in the shortened 2020 season so we have barely any pitch data relative to other seasons. I only had 82 pitchers who threw 100 four-seamers and curveballs and the pitchers who met this threshold obviously did not throw a full season's worth of pitches. In a sport with as much variation as baseball, this is not a sufficient sample. In the future, as our Hawk-Eye based dataset grows, analysts (including myself) will be able to generate better insights on the effects of spin and learn more about the hitter-pitcher interaction, the most consequential part of any baseball game.

Monday, February 15, 2021

Homerun Park Factors

Anyone who watches baseball understands a park can affect the outcome of a batted ball and the overall season lines for players. Certain parks are huge and seem to have an endlessly vast outfield. Others seem like they just require a flick of the wrist to punch the ball out of the ballpark. Homeruns are the most obvious manifestation of park factors. Other plate appearance outcomes (singles, doubles, triples, and even strikeouts) are affected by the park in which a given plate appearance takes place. Homeruns, however, are the most visible and most consequential aspect of park factors. With the influx of Statcast data, park factors can be refined with batted ball characteristics which require a much smaller sample to stabilize than homerun rates. Homerun rates are also very dependent on teams/players. The Statcast batted ball data lets us strip the identity of the player and team from the batted ball at hand and evaluate it solely on how well it is struck.

I built a generalized additive model with a smoothing term combining batted ball exit velocity, vertical launch angle, and spray angle (or horizontal launch angle). I also included a random effect for the home ballpark. The model was a logistic regression, in that the predictors were regressed against a binary output (whether or not a batted ball was a homerun).

To train the model, I used 80 percent of the batted balls from 2020 (selected randomly). The remaining 20 percent would be the test set. After training the model and applying it to the test set, I obtained an accuracy of 97.8 percent. This meant that of the balls in the test set the model deemed as having at least a 50 percent chance of leaving the ballpark, about 97.8 percent of them were homeruns. The following is the ROC curve from the model and you can see how small the false positive rate came out (the further away from the line with a slope of one and intercept of zero, the better):

Before I show the park effects, I applied the model to all of the batted balls in 2020 and looked at the results from the player level. Here is a visualization with all players with at least 15 expected homeruns (based on the model) in the 2020 season:

Voit stood out from the rest of the field. Many of the names are not too surprising but some less-heralded names include Adam Duvall (just signed a one-year deal with the Marlins), Wil Myers, who San Diego spent all last season trying to dump, and Teoscar Hernandez, who always showed plus raw power but never could parlay that with enough contact to garner significant playing time. As one can see from the chart, players can have significant deviations from their expected homerun totals. Players who hit more homeruns than their batted ball data would indicate got a bit lucky. Correspondingly players could have gotten unlucky based on the quality of the balls they put in play.

The further to the right and down a player sits, the more unlucky he was; up and to the left indicates the player was lucky. I included all players who deviated by at least five homeruns from their actual output. Harper and Freeman had excellent shortened seasons (Freeman won the NL MVP award) but were actually unlucky in terms of homerun output. Machado, Betts, and Ramirez were three of the top vote getters for MVP in their respective leagues and seem to have been buoyed by homerun luck.
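The expected-versus-actual comparison amounts to summing per-ball homerun probabilities for each batter and subtracting from his actual total. A minimal sketch follows; the field names (`batter`, `hr_prob`, `is_hr`), the probabilities, and the threshold used in the example are all hypothetical.

```python
from collections import defaultdict

def expected_vs_actual(batted_balls):
    """Sum model probabilities and actual homeruns for each batter."""
    totals = defaultdict(lambda: {"expected": 0.0, "actual": 0})
    for ball in batted_balls:
        entry = totals[ball["batter"]]
        entry["expected"] += ball["hr_prob"]   # model's homerun probability
        entry["actual"] += int(ball["is_hr"])
    # Positive diff: more homeruns than expected (lucky); negative: unlucky.
    return {
        batter: {**entry, "diff": entry["actual"] - entry["expected"]}
        for batter, entry in totals.items()
    }

def large_deviations(summary, min_gap):
    """Batters whose actual total differs from expected by at least min_gap."""
    return sorted(b for b, s in summary.items() if abs(s["diff"]) >= min_gap)

# Toy batted balls (made-up probabilities, not model output).
batted_balls = [
    {"batter": "A", "hr_prob": 0.9, "is_hr": True},
    {"batter": "A", "hr_prob": 0.5, "is_hr": False},
    {"batter": "A", "hr_prob": 0.1, "is_hr": False},
    {"batter": "B", "hr_prob": 0.2, "is_hr": True},
    {"batter": "B", "hr_prob": 0.3, "is_hr": True},
]

summary = expected_vs_actual(batted_balls)
flagged = large_deviations(summary, min_gap=1.0)
```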

Finally let's look at the park factors. Like I said above the park was a random effect in the model; I knew each park had some effect on whether or not a batted ball was a homerun but I was not sure how much of an effect. The following is a chart showing each park's effect on homeruns labeled by the home team for that park. The effects are based on how much the park is estimated to have an effect on the size of the intercept of the model. The more positive the value, the more of a positive effect the park has on the homerun probability. The more negative the effect the more a park shrinks the probability of a homerun. The model was fit many times over with different random effects for each park. I included the 95 percent confidence intervals for each park to show how much these effects may overlap.
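Since the park effect acts on the intercept of a logistic model, it shifts the log-odds of a homerun rather than the probability directly. A short sketch of what that shift looks like, using hypothetical effect sizes rather than the estimates from this model:

```python
from math import exp, log

def logit(p):
    """Convert a probability to log-odds."""
    return log(p / (1 - p))

def inv_logit(z):
    """Convert log-odds back to a probability."""
    return 1 / (1 + exp(-z))

def shift_probability(base_prob, park_effect):
    """Apply a park's intercept shift (in log-odds) to a base probability."""
    return inv_logit(logit(base_prob) + park_effect)

# Hypothetical effects: one hitter-friendly park, one pitcher-friendly park.
neutral = shift_probability(0.50, 0.0)
friendly = shift_probability(0.50, 0.2)
unfriendly = shift_probability(0.50, -0.2)
```

The same shift moves a coin-flip batted ball much more than one that is already nearly certain to stay in or leave the park, which is why the effects are easier to read on the log-odds scale.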

Cincinnati has the largest positive effect by far followed by Yankee Stadium and Dodger Stadium. On the other end of the spectrum, Miami brings up the rear followed by the two Bay Area parks. One may quibble with Coors Field in Denver only ranking ninth. I would note that Coors almost definitely has the largest effect on overall run scoring. But the outfield in Coors was made so large to combat the thin air in Denver that hitting homeruns is not as easy as the run-scoring environment would suggest. Also note the position of Arizona. Arizona used to be a bandbox like Coors due to the altitude but upon the introduction of a humidor in Chase Field both the homerun hitting and run-scoring environments have taken a tumble in favor of the pitchers. The new Rangers park (which now has a retractable roof) plays down compared to its older and completely outdoor counterpart. Arlington is one of, if not the hottest place to play in the summer facilitating the ball flying much further than in other parks. One final note, the Bay Area teams surprisingly play in the coldest weather in the summer months. This has a negative effect on the flight of the ball (from the perspective of the batter). This in conjunction with the large park dimensions make both parks especially tough to deposit the ball into the outfield bleachers.

Sunday, January 31, 2021

Approaching the Beginning of a Plate Appearance (1/n)

In my last post I talked about throwing pitches out of the zone in 0-2 counts. I concluded that it was an effective strategy because of the increased chance of a putaway and the cost of going from an 0-2 to 1-2 count was small from the perspective of the pitcher. If the batter takes the pitch, the pitcher maintains the advantage in the plate appearance. Plate appearances that reach a 1-2 count still result in strikeouts 39.9 percent of the time and a 0.243 wOBA compared to 46.2 percent and a 0.217 wOBA in plate appearances reaching an 0-2 count (based on 2020 data).

Now I am going to flip this analysis around and look at approaching the beginning of a plate appearance (from the pitcher's perspective). Investigating 0-2 waste pitches was of interest because of their controversial nature when talking about or watching baseball. But we saw how the decision to throw a pitch of this nature lacked import, in that the pitcher still had all of the leverage if the batter takes the pitch. The beginning of a plate appearance (I looked at the first three pitches) is another animal. The advantage for either the pitcher or batter swings wildly with these three pitches. Let us first look at the first pitch. The average OPS in the 2020 season was 0.740. After a 1-0 count that increased by 92 points to 0.832. For an 0-1 count the figure dropped by 118 points to 0.622. For context the difference between 1-2 and 0-2 counts was 50 points. If you consider leverage to be the potential swing in plate appearance outcomes in terms of OPS/wOBA, you would say there is about twice the leverage in the first pitch as there is in an 0-2 pitch (neglecting outcomes involving plate appearances ending which obviously have the most drastic OPS/wOBA swings). Here is another example: after a 1-0 count the league average OPS is 0.832 (cited above). That swings to 0.995 after a ball (163 points) and 0.679 after a strike (153 points). These early pitch outcomes are crucial to understanding how a trip to the plate plays out.
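The point swings quoted above are simple differences in OPS, expressed in thousandths. A small sketch that reproduces the arithmetic from the figures cited in this post (the count labels "2-0" and "1-1" are just the counts reached from 1-0 after a ball or a strike):

```python
# League OPS after reaching each count, as cited in the text (2020 data).
ops_after_count = {
    "overall": 0.740,
    "1-0": 0.832,
    "0-1": 0.622,
    "2-0": 0.995,
    "1-1": 0.679,
}

def swing_in_points(before, after):
    """Change in OPS expressed in points (thousandths), rounded."""
    return round((after - before) * 1000)

first_pitch_ball = swing_in_points(ops_after_count["overall"], ops_after_count["1-0"])
first_pitch_strike = swing_in_points(ops_after_count["overall"], ops_after_count["0-1"])
second_pitch_ball = swing_in_points(ops_after_count["1-0"], ops_after_count["2-0"])
second_pitch_strike = swing_in_points(ops_after_count["1-0"], ops_after_count["1-1"])
```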

This is going to be one of (hopefully) a series of posts about how pitchers approach plate appearances including (but not limited to) sequences of pitch types and locations and their effectiveness relative to one another. I ultimately want to look at macro trends and how different groups of pitchers solve this puzzle. Before undertaking this type of analysis I had to first figure out ways to manipulate the pitch data to answer these questions. Thus, I am focusing on one player for now and then plan to apply the necessary techniques to the league at large. I chose the best pitcher on the team I follow most closely: Gerrit Cole.

Cole unsurprisingly yields better results (like the rest of the league) when he throws a first pitch strike.

Cole gets first pitch strikes on balls not put into play about 60 percent of the time which is just about league average. Plate appearances that start with a strike are about 30 percent more effective, also in line with the league-wide splits I cited above. On the 23 balls in play on the first pitch against Cole, he gave up a 0.529 wOBA compared to the league average of 0.383. Cole has never been a standout in limiting damage on contact (to the extent that pitchers actually control it). His elite repertoire makes him not fear challenging the hitter in the strike zone. His approach has merit supported by his results in Houston and New York. The downside to attacking the zone is that when hitters make contact his results may go awry. Keep in mind that we are looking at just 23 batted balls. I would not expect this phenomenon to persist in the future.

Cole is extremely effective picking up first pitch strikes with his slider and the results with his curveball fall way behind. In 2021 if he wants to continue to be aggressive throwing his curveball to start a plate appearance he needs to be more efficient either by throwing the pitch in the zone or inducing swings and misses. Batters often let first pitch breaking pitches go by so the former approach may be the better option.

Naturally let us look at the second pitch and account for the pitch type he threw on the first pitch. The sequences are coded so that the pitch to the left of the dash is the second pitch.
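The sequence coding described above can be sketched like this. The pitch-type abbreviations, the `pitches` and `woba_value` fields, and the plate appearances are made up for illustration; extending the same most-recent-pitch-on-the-left convention to three pitches is an assumption.

```python
from collections import defaultdict

def sequence_code(pitch_types, length=2):
    """Code the first `length` pitches of a plate appearance.

    The most recent pitch goes on the left, so a fastball followed by a
    slider is coded 'SL-FF'.
    """
    if len(pitch_types) < length:
        return None  # plate appearance ended before the sequence finished
    return "-".join(reversed(pitch_types[:length]))

def woba_by_sequence(plate_appearances, length=2):
    """Average the wOBA value of plate appearances within each sequence code."""
    sums = defaultdict(lambda: [0.0, 0])
    for pa in plate_appearances:
        code = sequence_code(pa["pitches"], length)
        if code is None:
            continue
        sums[code][0] += pa["woba_value"]
        sums[code][1] += 1
    return {code: total / count for code, (total, count) in sums.items()}

# Toy plate appearances: pitch types in the order thrown, plus the result.
plate_appearances = [
    {"pitches": ["FF", "SL"], "woba_value": 0.0},
    {"pitches": ["FF", "SL", "CU"], "woba_value": 0.9},
    {"pitches": ["FF"], "woba_value": 0.0},
    {"pitches": ["SL", "FF", "FF"], "woba_value": 0.0},
]

two_pitch = woba_by_sequence(plate_appearances, length=2)
```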

The red dashed line in each panel represents Cole's average wOBA allowed overall so points below the line are better. Cole has better results on contact when throwing his fastball twice in a row. When leaning on his slider Cole was excellent which speaks to the pitch's quality in 2020. He had trouble with his curveball no matter the sequence. He did not throw his changeup enough to come to any firm conclusions but he mainly uses it when he is facing a platoon disadvantage so to be effective in that sample is positive. But the sample is small so I am reluctant to make bold proclamations.

Finally a similar visualization but with sequences for Cole's first three pitches to the batter.

There is a lot going on here so I will let you look over the panels on your own but I will make a few comments. Cole starts hitters with three straight fastballs frequently. Sequences involving at least two fastballs generally performed well which speaks to how tough his fastball is on hitters (despite the fact that hitters generally perform better on fastballs). The curveball worked well following his other pitches. That is not to say he suddenly got control over the pitch when sequencing it with others but I would imagine he does not mind throwing it outside the zone in favorable counts and it tunnels especially well off of his 98 percent active spin fastball (which means its vertical movement is top notch).

There is nothing predictive about these results. The 2020 season was short and I looked at one pitcher. When I eventually look at more pitchers and the league as a whole hopefully I can yield some more interesting insights on the best methods for attacking hitters to mitigate wOBA against. I did not consider the counts in which these sequences were thrown which I will do in the future. The issue here was I am looking at one pitcher in a shortened season so I thought I would be slicing the data too finely (if I was not already).

OPS count splits via data reference and pitch level data acquired via MLBAM

Wednesday, October 28, 2020

NBA Championship Equity Based on Best Players

The Lakers' recent NBA Finals victory got me thinking about how unusually their roster seemed to be shaped. The teams that win the championship almost always have at least a couple of high-end, All-NBA type talents on their roster (the 2011 Mavericks and 2004 Pistons are exceptions, but certainly not the norm). Still, this Lakers roster seemed especially top heavy, with LeBron James and Anthony Davis surrounded by a merely solid crop of players. According to the newest Box Plus-Minus (BPM) model available at basketball-reference (the write-up of which can be found here if you are like me and want to get into the weeds), the third best player on the Lakers was JaVale McGee, who was out of the rotation by the end of the Western Conference Finals and did not play a game in the final round. Of the players in the rotation against the Heat, the third best player during the regular season was Danny Green, who added about half a point above average per 100 possessions. McGee added 1.5; the difference between Davis (who posted a BPM of 8) and McGee was the second largest this century between the second and third best players on an NBA champion, after the gap between the second and third best players on the 2012 Heat, Wade and Bosh. The gap between the Lakers this year and the third largest difference in the sample (incidentally between Kobe Bryant and Rick Fox on the 2001 Lakers) was the same as the difference between third and eighth on the list (which was the gap on the 2009 Lakers; the Lakers have won a lot of championships).

It seemed like there were a lot of worthy challengers for the NBA crown this year who boasted much deeper rosters than the Lakers, including the Clippers, Bucks, Celtics, Heat (who they vanquished in the championship round), the Nuggets (who they beat in the semifinal round), and even the Rockets (who they beat in the second round). Yet the Lakers came out as the champion on the backs of their two superstars. We know the NBA is a star-driven league, much more so than any other sport, excluding maybe the quarterback on a football team. So it raises the question: how much championship equity does a team have going into the playoffs based on its best player and its best several players? To investigate, I took every team season since 2000 and pulled out the best player on each team and the three best players on each team based on BPM and a minimum of 500 minutes played in the regular season. I built two models: the first is a logistic regression where the target was whether or not a team won the championship and the only variable was the BPM of the best player. The second is another logistic regression where the target is whether or not the team won the championship, but the variables were the BPM of the best player, the second best player, and the third best player.
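For readers who want to experiment with this setup, here is a minimal sketch of the two logistic regressions. It is not the code behind the results in this post: the team-seasons are randomly generated stand-ins, the "true" coefficients used to generate them are invented, and the fitting routine (`fit_logistic`, plain gradient descent) is a hypothetical helper written with only the standard library.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(rows, labels, lr=0.1, epochs=1500):
    """Fit a logistic regression by batch gradient descent on the log loss.

    Returns [intercept, coef_1, ..., coef_k] on the raw (uncentered) scale.
    """
    n, k = len(rows), len(rows[0])
    # Center each feature so gradient descent converges faster.
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(centered, labels):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - y
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    # Fold the centering back into the intercept so the model takes raw BPM.
    intercept = w[0] - sum(wj * m for wj, m in zip(w[1:], means))
    return [intercept] + w[1:]

def predict(weights, x):
    """Predicted title probability for a list of BPM inputs."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))

# Synthetic stand-in data: (best BPM, second BPM, third BPM, won title).
# The generating coefficients below are invented for illustration only.
random.seed(7)
teams = []
for _ in range(300):
    best = random.uniform(0.0, 10.0)
    second = random.uniform(-1.0, best)
    third = random.uniform(-2.0, second)
    true_logit = -7.0 + 0.6 * best + 0.3 * second + 0.2 * third
    won = 1 if random.random() < sigmoid(true_logit) else 0
    teams.append((best, second, third, won))

labels = [t[3] for t in teams]
best_only = fit_logistic([[t[0]] for t in teams], labels)        # model 1
top_three = fit_logistic([list(t[:3]) for t in teams], labels)   # model 2

print("best-player-only weights:", best_only)
print("top-three weights:", top_three)
print("P(title | 8.0, 5.0, 1.5):", predict(top_three, [8.0, 5.0, 1.5]))
```

With real data, a statistics package would be the more natural choice since it also reports standard errors and fit statistics; the hand-rolled fit here just keeps the sketch dependency-free.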

First, I took a look at the predicted championship probabilities of the model using data from only the best player. I will note I tried incorporating other data about the best player into the model, such as usage and shooting efficiency, but it actually made the model worse (based on AIC, which weighs in-sample fit against the number of parameters). For context, here is the distribution of BPM figures for the best players on championship teams versus all other teams in the sample:

Unsurprisingly, the teams that win championships have top players significantly better than the average team's. The exceptions are Chauncey Billups on the 2004 Pistons, Dirk Nowitzki on the 2011 Mavericks, Kobe Bryant on the 2009 and 2010 Lakers, and Kawhi Leonard on the 2014 Spurs. Notable seasons on non-title winners were LeBron in 2009 and 2010 on the Cavaliers, Chris Paul on the 2009 Hornets, Steph Curry on the 2016 Warriors, Giannis Antetokounmpo on the 2020 Bucks, James Harden on the 2018 Rockets, Kevin Garnett on the 2004 Timberwolves, and Kevin Durant on the 2014 and 2015 Thunder. When you look at the output of the regression model, the best player has an outsized effect on a team's championship equity.

Interestingly, the marginal gains associated with your best player getting a little bit better have an outsized effect on your probability of winning a title at the higher end of the player production spectrum. For example, if a team had a best player that was about one point per 100 possessions better than average and added a player in the off-season who was about 6.25 points per 100 possessions better than average, that would add about five percentage points of championship win probability in the average season this century (based on the model). The same championship equity would be added if the best player on a team went from about 8.75 points per 100 better than average to about 10. So on the upper end of the player ability spectrum, adding about 1.25 points is worth the same as adding about 5.25 points lower down when considering just championship equity.
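The shape of the logistic curve is what drives this asymmetry. As a rough sketch, the snippet below inverts a one-variable logistic model to find how far the best player's BPM has to rise to add a fixed amount of title probability. The intercept and slope are made-up placeholders, not the fitted values from my model, so the exact numbers will not match the ones quoted above.

```python
import math

# Hypothetical coefficients for illustration only (not the fitted model).
INTERCEPT = -5.0
SLOPE = 0.3

def title_prob(bpm):
    """Title probability implied by the placeholder logistic curve."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * bpm)))

def bpm_needed(start_bpm, gain):
    """BPM the best player must reach to add `gain` to title probability."""
    target = title_prob(start_bpm) + gain
    if not 0.0 < target < 1.0:
        raise ValueError("target probability must be strictly between 0 and 1")
    # Invert the logistic: logit(target) = INTERCEPT + SLOPE * bpm.
    return (math.log(target / (1.0 - target)) - INTERCEPT) / SLOPE

low_end_jump = bpm_needed(1.0, 0.05) - 1.0      # starting from an average-ish star
high_end_jump = bpm_needed(8.75, 0.05) - 8.75   # starting from an MVP candidate

print(f"BPM increase needed from 1.00: {low_end_jump:.2f}")
print(f"BPM increase needed from 8.75: {high_end_jump:.2f}")
```

Whatever coefficients are plugged in, as long as the slope is positive and the probabilities stay on the lower half of the curve, the jump required at the high end will be smaller than the jump required at the low end.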

When incorporating the second and third best players into the model, the model becomes more accurate (by AIC). This is not surprising because I am incorporating more information about each team's roster. Theoretically, adding in all of the players would be the most accurate, but there would probably be significantly diminishing returns after the seventh or eighth best players since rotations shorten in the playoffs. But in any sort of modelling process, there is a balance that needs to be struck between accuracy and the amount of information required to make a good prediction. A more complicated model that requires 20 inputs but adds only five percent more accuracy compared to a model with four inputs is not really a better model. Chasing marginal gains in accuracy at the cost of monumental increases in inputs is not good process. Thus, I figured for now looking at just the three best players would suffice due to its simplicity and intuitiveness.
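For reference, AIC is computed as 2k minus twice the log-likelihood, where k is the number of fitted parameters, and lower values are better. Here is a small sketch of the comparison. The labels and predicted probabilities are invented for illustration, and the helper names are hypothetical.

```python
import math

def log_likelihood(probs, labels):
    """Bernoulli log-likelihood of 0/1 labels under predicted probabilities.

    Assumes every probability is strictly between 0 and 1.
    """
    return sum(math.log(p if y == 1 else 1.0 - p) for p, y in zip(probs, labels))

def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood (lower is better)."""
    return 2 * n_params - 2 * log_lik

# Invented example: six team-seasons, two of which ended in a title.
labels = [1, 0, 0, 0, 1, 0]
best_only_probs = [0.40, 0.10, 0.20, 0.05, 0.30, 0.15]  # hypothetical model 1 output
top_three_probs = [0.90, 0.02, 0.03, 0.02, 0.90, 0.02]  # hypothetical model 2 output

# Model 1 has an intercept plus one slope; model 2 has an intercept plus three.
aic_best_only = aic(log_likelihood(best_only_probs, labels), n_params=2)
aic_top_three = aic(log_likelihood(top_three_probs, labels), n_params=4)

print("AIC, best player only:", round(aic_best_only, 2))
print("AIC, top three players:", round(aic_top_three, 2))
```

The penalty term is what keeps a 20-input model from winning automatically: each extra parameter has to buy at least one unit of log-likelihood to lower the AIC.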

With the model trained, I could generate championship probabilities based on the play of the three best players on the team, just like the model that only incorporated the best player. My initial reaction was to look at the landscape of each team's top two players and outliers in championship probability.

The Warriors show up a few times here, as they put together one of the most dominant stretches of play in league history. Interestingly, we also see the 2018 and 2019 Rockets with James Harden and Chris Paul at the helm. In 2018 specifically, the model indicated that the Rockets actually had a better chance at the championship than the Warriors. This is based on regular season performance, and the Warriors after the Kevin Durant signing were known to loaf through games, so the results should be taken with a grain of salt. Another couple of teams that unfortunately ran into the Warriors juggernaut were the 2015 and 2016 Thunder, featuring the aforementioned Kevin Durant and Russell Westbrook. The 2001 Jazz were the last hurrah for Stockton and Malone as serious playoff contenders. I also isolated teams that were similar to this year's Lakers, where the third best player was much worse than the top two. I filtered by teams with an implied championship probability of at least 20 percent and a third best player who was at best worth 2.5 points per 100 above average.
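The filter itself is simple. A sketch of it, assuming each team-season is stored as a dictionary with hypothetical keys `title_prob` (the model output) and `third_bpm` (the BPM of the third best player), and using placeholder rows rather than real teams:

```python
def top_heavy_contenders(team_seasons, min_prob=0.20, max_third_bpm=2.5):
    """Keep contenders whose third best player is at most `max_third_bpm`."""
    return [
        t for t in team_seasons
        if t["title_prob"] >= min_prob and t["third_bpm"] <= max_third_bpm
    ]

# Placeholder rows for illustration; these are not real teams or model outputs.
sample = [
    {"team": "Team A", "title_prob": 0.31, "third_bpm": 1.5},
    {"team": "Team B", "title_prob": 0.27, "third_bpm": 3.4},
    {"team": "Team C", "title_prob": 0.08, "third_bpm": 0.9},
]

for row in top_heavy_contenders(sample):
    print(row["team"])
```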

The "Heattles" show up here as do Durant's Thunder and the Stockton and Malone Jazz. Famously, Kevin Garnett carried a relatively barren roster (by contender standards) to the Western Conference Finals in 2004. The second and third best players on that team were Sam Cassell and Fred Hoiberg; this masterpiece from Garnett was one of the best seasons by a big man in NBA history. Alternatively, here are some of the deepest contenders of the century (both the second and third best players were at least 3 points per 100 better than average):

Here we see different versions of great Spurs teams, an organization known for its depth and ability to develop players for most of the Gregg Popovich tenure. The earlier Spurs teams were led by the trio of Tim Duncan, Tony Parker, and Manu Ginobili while some of the later teams featured Kawhi Leonard and LaMarcus Aldridge. The Clippers led by Chris Paul and Blake Griffin also make an appearance. After recent renditions of the Rockets and Thunder, these Clippers were the best group to not win a title.

When taking into account each team's championship probability every year since 2000, we can look at which teams exceeded or disappointed their expected championship output. First, all the teams that won at least one championship in the 21st century:

The Lakers lead the way in titles over expected. Some of this can be attributed to the Shaq and Kobe teams not taking the regular season seriously. The rest is attributable to the back-to-back championship teams in 2009 and 2010 not being especially strong title winners. Miami is second, and they exceeded their expectations for reasons similar to the Shaq and Kobe teams. Detroit and Toronto come out as surprisingly "lucky" by this measure (a small ratio of expected to actual titles). Toronto did not have much championship equity until they traded for Kawhi Leonard, and in his lone season in Toronto Leonard often sat out regular season games to stay fresh. Detroit in 2004 was the worst champion by the model. Dallas had some great teams between 2000 and 2010 and, oddly, finally got a ring with Dirk when he was well past his MVP-level peak.

As I mentioned above, the Thunder, Rockets, and Clippers were very unfortunate given the players they employed this century. Milwaukee's misfortune is concentrated in the past couple of seasons where Giannis posted some of the best seasons of the century. Minnesota's misfortune was also concentrated in a couple of seasons, 2004 and 2005. Also showing up here is Orlando (2008 through 2010), the Suns (mostly due to the Seven Seconds or Less teams) and Utah (the tail end of the Stockton and Malone connection).

Finally, I looked at the strength of the top teams every season for the last 20 years. The model output does not consider other teams when computing championship probability. For example, the 2018 Rockets' title chances are not affected by the fact that the 2018 Warriors existed. The model was not trained on individual seasons but on the entire sample. Thus, the predicted probability is the probability of winning a championship given the players on hand in the average season in the 21st century. The issue for the 2018 Rockets is that the average NBA season does not also include a team of the Warriors' caliber. The top teams from each season have an outsized effect on the summation of title chances for all teams in a season. So when summing up all the title chances in each season, seasons with a sum over one can be considered top-heavy, while seasons with a sum under one have title contenders with worse players than the average contender. I call the sum of the probabilities the heavy weight index. The average heavy weight index in the 21st century is about one, which is not surprising. The landscape of the league, on the other hand, is noteworthy:
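As a sketch, the heavy weight index is just a grouped sum of the model's title probabilities by season. The probabilities below are placeholders chosen for illustration, not actual model output, and the function name is hypothetical.

```python
from collections import defaultdict

def heavy_weight_index(team_seasons):
    """Sum predicted title probabilities within each season.

    `team_seasons` is an iterable of (season, title_probability) pairs.
    """
    totals = defaultdict(float)
    for season, prob in team_seasons:
        totals[season] += prob
    return dict(totals)

# Placeholder probabilities for two hypothetical seasons.
predictions = [
    ("Season X", 0.45), ("Season X", 0.40), ("Season X", 0.30),
    ("Season Y", 0.20), ("Season Y", 0.15),
]

index_by_season = heavy_weight_index(predictions)
for season, index in index_by_season.items():
    label = "top-heavy" if index > 1.0 else "below average"
    print(season, round(index, 2), label)
```

Because each team's probability is computed independently of its competition, nothing forces the sum within a season to equal one, which is exactly what makes the index informative.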

Recent seasons, especially 2015 through 2018, were especially top heavy. Those seasons were dominated by the Warriors, Cavaliers, Thunder, Spurs, and Rockets. These teams had some of the most impressive top-end talent the league has ever seen and just happened to exist in the same universe, so the Rockets and Thunder missed out on titles. All of these seasons had heavy weight indices of at least 1.5, so they were at least 50 percent more top-heavy than average. Seasons in the first ten years of the sample featured much less impressive talent, especially in the years where Kobe's Lakers took home titles. These seasons were almost 40 percent less top heavy than average. With the break-up of the Thunder, Warriors, Cavaliers, and now Rockets, the league is starting to balance itself out again after some extremely top-heavy seasons. I think, subjectively, having an index a bit over one (the best teams are slightly better than the best teams in an average season), but not so dominant that they bowl over the competition, offers the best entertainment product. With many top players hitting the free agent market after the 2020-2021 season, it will be interesting to watch how the contenders sort themselves out and if any team adds talent to the extent of some of the great teams of the last 20 years.

Thursday, October 8, 2020

Addendum to Investigation of Gary Sanchez's Struggles

In my last post I talked about Gary Sanchez and his 2020 struggles at the plate. I noted that his excellent batted ball statistics persisted in 2020 and that his demise was mainly attributable to a bloated strikeout rate that is bound to regress and some bad luck on balls in play. After further reflection, my stance on his exceedingly low BABIP has not changed. It was one of the worst figures of the past five seasons and given how hard he hits the ball when it is put into play, I do not expect anything close to that figure to persist. Even though he hits the ball with so much authority, Sanchez may always post lower BABIP figures due to his propensity for hitting fly balls and the proportion of his hits that are home runs (home runs are not included in BABIP), but nevertheless there will be some positive regression on this front.

The strikeout rate requires more nuance. I noted a slight improvement in his approach; he improved his chase rate and swung at more pitches in the strike zone in 2020. His overall swing rate did not change much and he saw a slightly higher percentage of pitches in the strike zone. The issue was his contact rates, both inside and outside the zone. His zone contact rate declined a little bit from an already unimpressive figure, but the larger issue was that Sanchez, relative to league average, could not put the bat on the ball when he chased. This meant that even though he was chasing less, his rate of contact when he did chase was so low that swings on pitches outside of the zone had an outsized negative effect on his results. I also included some analysis on the probability that this increase in strikeout rate was purely the result of variance and found there was some non-zero possibility that this was the case. I threw the improved approach, substantial negative regression in contact rates, and variance into a blender and concluded that he was bound to be much better in the strikeout department in 2021.

Thinking more about the strikeout issue, I realized I had failed to offer context behind Sanchez's rising strikeout rate or to look at other players who saw large changes in strikeout rate and how they fared in later seasons. It is easy to say that any massive increase in strikeout rate should be followed by a corresponding decrease towards the player's "true talent".

I pulled data on every set of three consecutive hitter seasons since 2015 where the hitter had at least 150 plate appearances in each season. I then calculated the changes in strikeout rate between year N and year N-1 and between year N-1 and N-2.
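A minimal sketch of that windowing step for a single hitter, assuming his seasons are stored as a dictionary mapping year to a (plate appearances, strikeouts) pair. The function name and the sample numbers are hypothetical, not pulled from the actual data set.

```python
def k_rate_changes(seasons, min_pa=150):
    """Return (year_n, prior_change, current_change) for each qualifying window.

    `seasons` maps year -> (plate_appearances, strikeouts) for one hitter.
    Changes are in percentage points: prior is (N-1) minus (N-2),
    current is N minus (N-1).
    """
    windows = []
    for year in sorted(seasons):
        span = [year - 2, year - 1, year]
        # All three consecutive seasons must exist and clear the PA minimum.
        if all(y in seasons and seasons[y][0] >= min_pa for y in span):
            rates = [100.0 * seasons[y][1] / seasons[y][0] for y in span]
            windows.append((year, rates[1] - rates[0], rates[2] - rates[1]))
    return windows

# Made-up hitter: strikeout rates of 25, 28, and 35 percent in 2018-2020.
example_hitter = {2018: (400, 100), 2019: (400, 112), 2020: (200, 70)}
print(k_rate_changes(example_hitter))
```

A hitter who falls short of the plate appearance minimum in any one season drops every window that includes that season.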

Positive values indicate increases in strikeout rate, which are generally bad for hitters (but not always; they could be an indication of a hitter being more selective or selling out for some power). As you can see in the direction of the trend line, increases in strikeout rate are often followed by decreases the following season. When you isolate Sanchez, the results are concerning:

This is the same set of points with Sanchez highlighted. You do not want to find yourself on the top right of this chart, which indicates multiple seasons where your strikeout rate increased. Sanchez's 2020 is especially ugly in this regard. I will note that his 2019 season looks bad in this visualization, but in 2019 Sanchez posted his best numbers on contact. That might indicate that he was selling out for some power coming off a down 2018 season. Using the data presented above, I built a simple linear model predicting a player's change in strikeout rate. The model had two inputs: the prior change in strikeout rate and the player's age (we know changes in strikeout rate are partly a function of age). After fitting the model to the data, I wanted to look at how Sanchez looked in 2019 and 2020 relative to expectation (i.e. the output of the model). In 2019, based on changes from 2017 to 2018, we should have expected Sanchez to trim 0.57 percentage points off of his strikeout rate, while in actuality he added 3.47 percentage points. In 2020, we should have expected him to lose about 0.72 percentage points off of his strikeout rate. Instead, he added eight percentage points. His 8.72 percentage point change over expected was in the 98th percentile in the entire sample. His two-year total change is 7th in baseball from 2018 to 2020. Furthermore, I built another model that gauged the probability of a player trimming his strikeout rate based again on age and prior year strikeout change. Sanchez again sticks out.
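The "change over expected" figures are just the actual change minus the model's prediction. The sketch below reproduces that arithmetic with the Sanchez numbers quoted above (the 2020 actual change is the rounded "eight" from the text) and adds a simple percentile rank helper. The list of other residuals is a made-up placeholder, not the real sample, and both function names are hypothetical.

```python
def change_over_expected(actual_change, expected_change):
    """Actual minus predicted change in strikeout rate, in percentage points."""
    return actual_change - expected_change

def percentile_rank(value, sample):
    """Percent of the sample at or below `value`."""
    return 100.0 * sum(1 for v in sample if v <= value) / len(sample)

# Sanchez figures as quoted in the text (percentage points).
sanchez_2019 = change_over_expected(actual_change=3.47, expected_change=-0.57)
sanchez_2020 = change_over_expected(actual_change=8.0, expected_change=-0.72)

# Placeholder residuals standing in for the rest of the sample.
other_residuals = [-4.1, -2.0, -0.8, 0.0, 0.6, 1.9, 3.2, 5.5]

print("2019 change over expected:", round(sanchez_2019, 2))
print("2020 change over expected:", round(sanchez_2020, 2))
print("2020 rank in placeholder sample:",
      percentile_rank(sanchez_2020, other_residuals + [sanchez_2020]))
```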

Player seasons in the top right quadrant are disappointing relative to expectation. In both seasons Sanchez was expected to trim his strikeout rate and he did the opposite. I will note that most of the seasons where players added to their strikeout rate much more than expected came from 2020, due to the small samples. Still, I think there is some reason to be concerned with Sanchez and his increasing strikeout rate. If he wants to return to his 2019 level, he is going to have to buck the strikeout trend. Sanchez will be 28 going into next season and has been a major league regular for about four years now when you account for the fact that he did not play full seasons in 2016 and 2020. He is probably not going to get better from a batted ball perspective at this point; exit velocity peaks in a player's mid-20s. Improvements will have to be made in his contact rates and correspondingly his strikeout rate. Whether or not he has the ability to make these improvements will largely dictate whether or not the Yankees tender him a contract going into his second year of arbitration.