Monday, August 31, 2020

Rolling Averages of QB Performance in 2019

Leading up to the 2020 NFL season, I have been digging into the performance of quarterbacks and their offenses from the 2019 season. My previous post was a study into when a quarterback's (or by extension the offense's) sample of plays becomes reliable, in that we can be reasonably confident the results-based EPA analysis is a good gauge of his past performance. I recommend anyone interested go back and read the post, but if you are short on time, I came to the conclusion that about 275 plays provides a reliable sample (anywhere between six and eight games).

This post will not be as research intensive; instead I want to provide some visualizations on the performance trends for the most relevant 2019 NFL passers. I compiled the rolling averages in 50 play increments for the most frequent passers in the 2019 season and threw them all in the same chart for clarity and ease of comparison.
The dashed lines represent the average EPA per play in a window, the average plus two standard deviations, and the average minus two standard deviations. So passers who find themselves below that middle line more often than not were not especially effective. The opposite is true for passers who spend most of their time above it.
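
A minimal Python sketch of how rolling averages like these could be computed. It assumes a sliding 50-play window that advances one play at a time (the post does not say whether the windows overlap), and it pools every passer's windows to get the three dashed reference lines. The passer names and EPA values are simulated placeholders, not 2019 data.

```python
import random
import statistics


def rolling_mean(values, window=50):
    """Mean of each consecutive `window`-play stretch, stepping one play at a time."""
    if len(values) < window:
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest play, drop the oldest one.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def reference_lines(all_window_means):
    """Average window, plus the lines two standard deviations below and above it."""
    center = statistics.mean(all_window_means)
    spread = statistics.stdev(all_window_means)
    return center - 2 * spread, center, center + 2 * spread


# Simulated EPA values for two hypothetical passers (assumption, not real data).
rng = random.Random(2019)
plays = {
    "QB A": [rng.gauss(0.15, 1.4) for _ in range(400)],
    "QB B": [rng.gauss(-0.05, 1.4) for _ in range(400)],
}
rolling = {name: rolling_mean(epa, window=50) for name, epa in plays.items()}
pooled = [m for series in rolling.values() for m in series]
low, mid, high = reference_lines(pooled)
```

Each entry in `rolling` is the line for one passer, and `low`, `mid`, and `high` play the role of the dashed lines.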

Players have peaks and valleys. That is the nature of the beast with a small sample game played with a weirdly shaped ball. This applies to everyone except Patrick Mahomes. Mahomes somehow remained comfortably above average and often elite (plus one standard deviation from the mean) throughout the entire season with few, if any, dips in production. Even Lamar Jackson during his excellent MVP campaign had a bit of a downturn about a quarter of the way through the season. Jackson's best stretches were better than some of Mahomes' best, hence the MVP award. I should note, because it applies to Jackson, that these rolling averages include plays where the quarterback is the ball carrier because quarterback rushing attempts (unlike their running back counterparts) are drivers of efficient offense and yield a high percentage of positive EPA plays. Given Jackson's talent as a ball carrier, his EPA per play numbers are always going to be juiced compared to some of his more statuesque peers (though if you isolate his passing, Jackson was still great in 2019). This seems like a great time to mention Josh Allen. Allen, like Lamar, adds value through his proficiency as a ball carrier in the Buffalo offense. Given his reputation (and frankly ability) as a passer, one might expect Allen to look a bit worse. But since rushing is included, he looks merely fine.

Jameis Winston has the reputation of being a roller coaster: at times he is making deep completions all over the field and looks like he can play with the best the league has to offer, and at other times he looks like he forgot how to play football. His former teammate Ryan Fitzpatrick is similar in this way, but the range of outcomes is muted. Oddly enough, for someone who many evaluators think is one of the best quarterbacks in the league, Aaron Rodgers was both inconsistent and not great, especially towards the end of the season. This has been a trend since his MVP season in 2014 and subsequent bouts with staying healthy. Perhaps the most inconsistent player this past season was the number one overall pick Kyler Murray. Basically half the season he was awesome and half the season he was dreadful. Given some of the skill players around him with the addition of DeAndre Hopkins, our prior on him as a prospect, and some of his excellent play, I think we should expect a big season out of Murray.

From Murray let's jump to the other two rookies. Going into the year I was not excited about Daniel Jones after his time at Duke. I would say he was a little better than expectations, but he was still a mediocre to bad quarterback for most of the year. If the Giants hope to avoid being one of the worst teams in the league for the fourth straight season, Jones is going to have to break out in a big way and totally blow my prior out of the water. Haskins is not on the chart due to his lack of completions. Here is what his season looked like:
He steadily improved after he was named the starter in Washington. As I alluded to in my last post, I would be wary of extrapolating a few good games toward the end of the season and predicting Haskins to be an above-average starter in 2020. Nevertheless, I would say it is better that he improved upon his wretched start than continued to play poorly.

Other players of note: Goff and Wentz are going to have to be better if they want their teams to compete for a division title in the NFC West and East, respectively. They are both facing uphill battles in the form of either brutal competition (San Francisco and Seattle) or one offensive juggernaut primed for regression after an unlucky set of close games (Dallas). Goff's struggles last year are a bit overstated (he had seven throws where the receiver was tackled at the one yard line; if those were touchdowns, the narrative would be different). For Wentz it comes down to health.

Mayfield and Darnold need to play well to cash in on big extensions. I am way more confident in the former than the latter. Darnold fared much worse than Mayfield last year and that is without accounting for Mayfield's excellent stretch of play in his rookie season. Here is the same visualization as the one at the start of the post, but with 2018 included:
Mayfield was great in 2018 and had a nice end to 2019. Similar to Murray, though he is further removed from the draft, if you still lean on our prior for him as a prospect and consider that he has been more good than bad in the NFL, there is reason to expect Baker to establish himself as a high-end starting quarterback this season, and I do.

Finally, look at Ryan Tannehill's run of play in 2019, especially compared to 2018. Tannehill, despite the narrative surrounding Tennessee, was the driver of the success of that offense once he was inserted as the starter. I am expecting a healthy amount of regression in 2020, but if Tannehill is even 80 to 90 percent of the player he was in 2019, I think Tennessee should be heavily favored to win that division despite the additions made by Chris Ballard in Indianapolis.

Sunday, August 30, 2020

QB EPA per Play Reliability

With the rise in evaluating football teams based on expected points added (EPA) and the accessibility of play-by-play data, NFL fans are better equipped than ever to gauge the relative merits of every team and player around the league. Unlike metrics that either fail to address context (yards per carry/catch, total yards) or just are not reliable due to extremely small samples (touchdowns, interceptions, touchdown to interception ratio), EPA adjusts for down, distance, yard line, and a plethora of other factors (this provides a more in-depth explanation of EPA and the history of the statistic).

Anecdotally, EPA is most commonly used to evaluate team offensive and defensive efficiency (in the form of mean EPA per play). This does lack some nuance (the average EPA per play tells us nothing about the distribution of EPA for an offense or defense), but nevertheless if you are going to boil down a unit to one number, mean EPA per play is about as useful a measure as we have access to. Given the out-sized effects quarterbacks have on a team's offense, it is natural to use EPA to gauge the play of quarterbacks.

The natural question when using any statistic to analyze a quarterback is: when does that statistic become reliable? How large of a sample do we need before we have an idea of how well that quarterback has played? Using one play is meaningless; randomly select one Patrick Mahomes throw and you can come up with a 50 yard touchdown or a pick-six. The same idea holds for two throws, three throws, etc. So how many throws do we need before a sample is reliable? I tried to answer that question using the play-by-play data since 2010. I should note that measuring reliability or stability is different from measuring predictability. The point at which a metric stabilizes tells us how many plays we need to evaluate something that has already happened. So if a metric stabilizes or becomes reliable after 50 plays, we can look at the past 50 plays and see how well a player or team performed in those 50 plays. That does not mean that player or team will perform to that level over the course of the next 50 plays. We can just be reasonably confident that the level of performance displayed over the prior 50 plays is a good description of that past performance, but it should not be used to predict future performance.

To conduct my analysis, I randomly sampled a number of plays from all plays where a quarterback either ran or threw the ball. I did so twice on 1,000 separate occasions for each number of plays. The number of plays I sampled ranged from 25 plays on the low end to 400 plays on the high end in increments of 25 plays. For context, a starting quarterback who plays in every game will throw anywhere between 400 and 700 passes and will have between 10 and 100 carries (unless you are Lamar Jackson, Cam Newton, or Josh Allen). First I looked at the spread in average EPA per play between two samples.
The numbers above each window indicate the number of plays in the samples. The spread starts out very wide for 25 plays, so using 25 plays to evaluate a quarterback would be foolhardy. As the plays per sample grow, the spread tightens. Once we get to 125 or 150 plays, the spread stops shrinking quickly and does not change much as we add plays to a sample. So where can we say "this sample is reliable"? I took the average difference between sample pairs and the standard deviation in that difference for all pairs and compared them by number of plays per sample:
The heights of the error bars represent the average difference plus or minus the standard deviation of the average difference. At 25 plays, the deviation in sample difference is very large relative to the average. This effect is mitigated as the sample of plays grows. By the time we get to 275 plays, the change in standard deviation between sample sizes is about five percent of the total standard deviation. For the rest of the sample sizes, this five percent figure is relatively constant. Thus, I would say that a sample of about 275 plays is the point where EPA per play becomes reliable. 
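
The resampling procedure described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions rather than the exact code behind the charts: it samples without replacement from one pooled list of plays, uses the absolute difference between the two sample means as the "difference", and runs on simulated EPA values instead of real play-by-play data. The function names are hypothetical.

```python
import random
import statistics


def paired_sample_gaps(epa_values, n_plays, n_trials=1000, seed=0):
    """Draw two random samples of n_plays, n_trials times; return each absolute gap in mean EPA."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        first = rng.sample(epa_values, n_plays)
        second = rng.sample(epa_values, n_plays)
        gaps.append(abs(sum(first) / n_plays - sum(second) / n_plays))
    return gaps


def reliability_table(epa_values, sizes=range(25, 401, 25), n_trials=1000, seed=0):
    """Map each sample size to (average gap, standard deviation of the gap)."""
    table = {}
    for n in sizes:
        gaps = paired_sample_gaps(epa_values, n, n_trials, seed)
        table[n] = (statistics.mean(gaps), statistics.stdev(gaps))
    return table


# Simulated pool of play-level EPA values (assumption: stands in for 2010-2019 plays).
pool_rng = random.Random(42)
pool = [pool_rng.gauss(0.05, 1.4) for _ in range(5000)]

# A few sizes and fewer trials keep the demo quick; the defaults mirror the post.
table = reliability_table(pool, sizes=(25, 100, 275, 400), n_trials=200)
for n, (avg_gap, sd_gap) in table.items():
    print(n, round(avg_gap, 3), round(sd_gap, 3))
```

Both the average gap and its standard deviation should shrink as the sample size grows, which is the pattern the error-bar chart describes.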

275 plays is somewhere between six and eight games' worth of plays for a quarterback, if he starts the entire season. That means we need six, seven, or eight games to be reasonably confident of the level at which the quarterback played. Keep that in mind when you see various members of the media and talking-head types freak out after a game or two, or when a player has a great three games to end a season and people start to sing his praises going into the next season. Also remember, this does not mean that we need six games to predict how well a quarterback will perform in the future. 275 plays just provides a good snapshot of how well a quarterback played in the past.

Friday, August 28, 2020

Addendum to Saquon Barkley Analysis

A couple of weeks ago, I penned a piece looking at Saquon Barkley's effectiveness through two seasons with the New York Giants. I concluded the post by acknowledging the idea that his lack of efficiency could be attributed to a lack of support via the offensive line in front of him and the Giants' ineffectiveness in the passing game. I thought I should do more than pay lip service to these two arguments so I dug into these ideas some more. To the first point, the Giants offensive line the past two regular seasons has been middling according to Pro Football Focus (see here and here). This argument might seem like an appeal to authority and lack any sort of rigor, but given the data on offensive line play that is available free to the public, there is not much more I can do. ESPN has a stat called pass block win rate that uses tracking data (that is not public) to gauge whether or not an offensive lineman "won" his block on a given play. This can be parlayed into a team ranking. The methodology for pass block win rate can be found here. In 2019 the Giants offensive line ranked 12th (I cannot find a full leaderboard for 2018). Obviously pass blocking is not run blocking, but it is a decent proxy for gauging offensive line ability. This recent post at Football Outsiders talks about the correlation between passing and rushing effectiveness and one of the theories posited is that the correlation can be attributed to offensive line play. Football Outsiders also puts out its own offensive line statistics for passing and rushing. In 2019, the Giants offensive line ranked 25th in adjusted line yards in the run game and 17th in adjusted sack rate allowed. In 2018, those ranks were 29th and 20th respectively. This paints a more grim picture, but on the whole, when looking at the three different sources, the offensive line is middling. So not terrible.

To address the lack of an effective passing game, I would say that if Barkley is as amazing as his biggest supporters claim he is, he should be elevating the level of the Giants offense despite poor quarterback play. But we know rushing success on its own has almost no relationship to overall offensive performance (see studies conducted on the interplay between passing, rushing, and offensive efficiency here and here and my own analysis here). So even if Barkley were very effective, it would not have much of an effect on the Giants' ability to move the football down the field and piece together scoring drives.

Putting the idea of running back fungibility aside, I found there was not much evidence that Barkley was better than his peers and warranted a massive Christian McCaffrey-style extension within the next year. Isolating running back ability is always going to be difficult in a sport like football where there are 11 players on the offensive side and it is not clear how we should divvy up credit between those players. One thought I had was to look at something that running backs may have some control over, something that is independent of offensive line play: the ability to break tackles. For the past two seasons, pro-football-reference has given users access to some advanced stats via charting from Sportradar. For ball-carriers, this includes the number of tackles they broke and the corresponding rate at which they do so. My theory was that if breaking tackles is something that is consistent year over year, maybe we can better understand which running backs are most effective independent of context. The issue is there are only two seasons' worth of data, thus I had some sample size concerns. So any conclusions drawn from this analysis should be taken with a grain of salt. With that being said, let us dive into the data from the past two seasons.

I first looked at every ball-carrier with at least 140 carries (sort of arbitrary but I had to set the cut-off somewhere) the past two seasons and visualized their broken tackle rate in those two seasons.
I recommend clicking on the chart for more clarity. There is not much consistency, as I feared. When modelling the 2019 rates as a function of the 2018 rates weighted by attempts across seasons, I obtained a correlation of just 0.095 (about 9.5 percent of the variation in the 2019 rates can be explained by variation in the 2018 rates). Looking at the scale of the axes, the rate at which ball-carriers break tackles is small, thus we run into the aforementioned issue of sample size. To attempt to remedy the issue, I tried regressing the rates by padding the 2018 broken tackle rate with samples of league average broken tackle rates (this is a fantastic primer on the technique used to study three point percentage in basketball). The goal was to obtain a better correlation after padding the statistic in question. So I padded the 2018 rates with 100, 200, 300, and 400 league average carries (note that the typical league leader in carries has around 300). The rate I used to pad the results was 0.07 broken tackles per carry, which was the average in 2018. Upon doing so, I obtained correlations of 0.1055, 0.1043, 0.1031, and 0.1022 for rates with 100, 200, 300, and 400 padded carries respectively. So the correlation barely budged after padding and actually fell slightly as the number of league average carries added increased.
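
For readers unfamiliar with padding, here is a short Python sketch of the idea. The padded rate simply adds a fixed number of league-average carries to each back's 2018 line before dividing. The 0.07 league rate and the padding amounts come from the text above; the three backs and their numbers are made up for illustration, and the correlation here is a plain unweighted Pearson correlation, which is a simplification of the attempt-weighted model described in the post.

```python
import math


def padded_rate(broken_tackles, carries, pad_carries, league_rate=0.07):
    """Blend an observed broken tackle rate with pad_carries of league-average carries."""
    return (broken_tackles + league_rate * pad_carries) / (carries + pad_carries)


def pearson(xs, ys):
    """Plain (unweighted) Pearson correlation between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical backs: (2018 carries, 2018 broken tackles, 2019 broken tackle rate).
backs = [(260, 27, 0.06), (150, 8, 0.08), (215, 14, 0.05)]
rates_2019 = [rate for _, _, rate in backs]

for pad in (0, 100, 200, 300, 400):
    padded_2018 = [padded_rate(broken, carries, pad) for carries, broken, _ in backs]
    print(pad, round(pearson(padded_2018, rates_2019), 4))
```

Padding pulls low-volume backs toward the league rate more than high-volume backs, so it only changes the correlation to the extent that carry totals differ across players.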

So what can we conclude? Using the data I had available, the rate at which ball-carriers break tackles in year N is not very predictive of the rate in year N+1. Referencing this stat as a means for determining the best backs is not justifiable. If I had more data, maybe I could find more signal. But with the evidence at hand, I cannot say broken tackle rate is meaningful. Interestingly, Barkley posted an elite rate his rookie year and broke the second most tackles total behind Derrick Henry, but was merely average in his second season. So upon further reflection, I still think there is no justification for the Giants to pay up for his services as his rookie contract is set to expire after the 2021 season.

Sunday, August 23, 2020

Reviewing Tomas Tatar's Stop in Vegas

Having watched the Canadiens many times over the past three weeks, I find myself thinking a lot about the two major trades involving Tomas Tatar (from Detroit to Vegas and from Vegas to Montreal). Since coming over from the desert (along with a second round pick and, at the time, prospect Nick Suzuki, who has looked extremely dynamic in his 134 5v5 minutes during the playoffs) in the deal that netted Vegas Max Pacioretty, Tatar has been sensational. He has scored 2.63 points per hour at 5v5 (1.6 is average for a forward) and found the back of the net 0.9 times per hour (where average for a forward is about 0.6). He has been a mainstay on the top line for Montreal with Brendan Gallagher and Phillip Danault and the team has controlled about 60 percent of the expected goal and shot share when he is on the ice. Given Montreal received additional compensation besides Tatar in the Pacioretty deal, it would be difficult to frame the deal as anything other than a slam-dunk for Montreal. Tatar is getting paid just 4.8 million against Montreal's cap through next season, while Pacioretty signed a seven million AAV extension through his age 35 season in Vegas. Despite Pacioretty's exploits with Mark Stone the past season and a half, this was a lopsided deal.

This deal looks so great for Montreal with hindsight because Tatar was viewed as a distressed asset at the time of the trade. His brief stint with Vegas was a disaster. Vegas was getting caved in when Tatar was on the ice, to the point where he was scratched during the playoffs. Tatar only scored 0.92 points per hour at 5v5 and did not even record a primary assist. Prior to his move to the desert in 2018, Tatar was struggling in Detroit, where his scoring rate was cut in half from the previous season despite the team showing solid underlying performance when he was on the ice. Still, Vegas ponied up for him at the deadline, giving up first, second, and third round picks.

For some insight into how much he struggled during the 2017-2018 season, I scraped his individual rate stats from Natural Stat Trick starting in the 2015-2016 regular season and looked at his performance trends in two week samples. 

His ice time and shot attempt rates have been relatively consistent throughout the past five seasons, yet his scoring numbers have been a roller coaster.
In both cases there is a clear nadir in the 2017-2018 season. His goal scoring results were not even out of line with what should have been expected given the locations of his shots. His shot attempt rates were fine, but his expected goal rates took a big dip. He either was not receiving passes in dangerous scoring areas or his proficiency in carrying the puck to those areas just briefly deteriorated. 

The main indictment of Vegas in this ordeal is that despite Tatar's middling performance leading up to the trade deadline in 2018, they still had to put together a haul for Detroit to part ways with him. It was a lot to give up at the time for what many considered a very good second-line winger, but the principle of trying to acquire a player with middling scoring rates who was previously good while still showing good on-ice results is and was a sound one. But Vegas then flipped the script and moved him for an older (albeit very good) player who needed a new contract and had to attach a draft pick and an elite prospect (who looks like a top-six caliber player already). Maybe Tatar was just never going to fit in Vegas, but I am skeptical. This just seems like a misreading of the evidence we had on Tatar's ability and the variance of performance in hockey.

As I mentioned above, Tatar scored just 0.92 points per hour at 5v5 with Vegas. Tatar played with Vegas for 6 weeks in the regular season. Given his scoring rates from the past four regular seasons, what are the odds that Tatar would put together a six week stretch as wretched as the one at the end of 2017-2018? I took all of his biweekly scoring rates and stored them in a table. I randomly sampled three (for six weeks of time) 10,000 times and plotted the distribution of rates in each six week sample:
This does not look good for Vegas. He exceeds his scoring rate in Vegas in 89.92 percent of the samples. Trading a guy after he put up points at a rate almost two standard deviations (when you include data from his time in Montreal) from his mean is just bad business. Luckily for Vegas Pacioretty has worked out fine. But as they look across the continent and see how Tatar and Suzuki (who is just 21 years-old) have performed for Le Bleu-Blanc-Rouge, part of them has to bemoan the fact that they committed the cardinal sin of selling a player who was just down on his luck.
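
The resampling step can be sketched in a few lines of Python. This is a rough illustration under assumptions: each draw picks three biweekly rates without replacement and takes their simple average (rather than weighting by ice time), and the list of biweekly rates below is a made-up placeholder, not the Natural Stat Trick data. Only the 0.92 threshold, the three-period draw, and the 10,000 repetitions come from the post.

```python
import random


def share_of_samples_above(biweekly_rates, threshold, periods=3, n_draws=10_000, seed=0):
    """Fraction of random `periods`-long draws whose average rate exceeds `threshold`."""
    rng = random.Random(seed)
    above = 0
    for _ in range(n_draws):
        draw = rng.sample(biweekly_rates, periods)  # three two-week stretches = six weeks
        if sum(draw) / periods > threshold:
            above += 1
    return above / n_draws


# Placeholder biweekly points-per-hour rates at 5v5 (hypothetical values).
placeholder_rates = [0.0, 0.8, 1.1, 1.6, 1.9, 2.2, 2.4, 2.7, 3.0, 3.3, 0.5, 1.4]

share = share_of_samples_above(placeholder_rates, threshold=0.92)
print(f"{share:.1%} of six-week samples beat 0.92 points per hour")
```

With the real biweekly table in place of `placeholder_rates`, `share` corresponds to the percentage quoted above.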

Friday, August 21, 2020

Making Fielding Adjustments to Victor Robles

Victor Robles was one of the top prospects in MLB before his major league debut, appearing in most lists' top ten prior to his call-up. Through 780 plate appearances in the majors, Robles has been a tantalizing player, displaying top of the line sprint speed and defensive ability that demonstrate the elite tools scouts fawned over before his call-up. Despite these prodigious tools, Robles' batted ball profile has slightly disappointed. His career hard hit rate is only 23.5 percent (per Baseball Savant), much lower than the major league average of approximately 34 percent. Some have pointed out his lack of hard hit balls (as a portion of his total batted ball events) can be attributed to his propensity to bunt for base hits.  His maximum exit velocity is about average (110 mph) which might indicate he can unlock some more power with a few adjustments. Take a look at the hitters who are grouped around Robles in maximum exit velocity on balls in the "sweet spot" (defined by MLBAM as batted balls between eight and 32 degrees):
Note I filtered the query by players with at least 75 balls in play at those launch angles. Forget, for a moment, that unseemly average exit velocity and take a look at the maximum for Robles and the players around him. Rhys Hoskins is a player known for his power. Cano, even in the later part of his career, is a very good hitter as are Blackmon, Canha, Winker, and Jose Martinez. Set aside Severino (who has had a great start to the 2020 campaign), Smoak (who has aged out of relevance), and Ramos (a lot of injury issues in addition to aging), and this is a group of productive major league players. So there is some hope for Robles on the offensive side if he can get his approach in order.

Nevertheless, because he is an odd player that seems to be constantly talked about in baseball circles (especially by fantasy prognosticators), I thought he would be an interesting player to look at with regard to how an opposing team should deploy its defenders against him.

Robles has an unusual spray chart when you map out his batted balls from 2019 and 2020.
If the distinction I am trying to make is not clear, let me separate the batted balls by those which are ground balls and those which are fly balls. 
He has an extreme pull tendency when he puts the ball on the ground, but the opposite when he hits fly balls. The defense, therefore, needs to adjust accordingly. So where should each defender on the diamond place himself? First I looked at all of his ground balls. I bucketed each ground ball into one of ten buckets based on the spray angle of the batted ball. There are five buckets on each side of second base. Here are all his ground balls labeled by their bucket:
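
A small Python sketch of the bucketing step. It assumes spray angle is measured in degrees with 0 pointing at second base, negative values toward the left field line, and the foul lines at plus or minus 45 degrees, so each of the ten buckets is nine degrees wide. Numbering bucket one from the third base line is inferred from the alignment discussion later in the post, and the sample angles are invented.

```python
from collections import defaultdict


def spray_bucket(angle_deg, n_buckets=10, foul_line=45.0):
    """Bucket 1 hugs the third base line; bucket n_buckets hugs the first base line."""
    width = 2 * foul_line / n_buckets
    # Clamp anything hit outside the lines onto the nearest line.
    clamped = min(max(angle_deg, -foul_line), foul_line)
    index = int((clamped + foul_line) // width)
    return min(index, n_buckets - 1) + 1


def summarize_buckets(angles):
    """Map each bucket to (number of ground balls, average spray angle in that bucket)."""
    grouped = defaultdict(list)
    for angle in angles:
        grouped[spray_bucket(angle)].append(angle)
    return {b: (len(v), sum(v) / len(v)) for b, v in sorted(grouped.items())}


# Invented ground ball spray angles (hypothetical, pull-heavy for a right-handed hitter).
ground_balls = [-41.0, -38.5, -30.2, -24.0, -22.7, -8.1, -3.3, 12.4, 25.9, -40.2]
summary = summarize_buckets(ground_balls)
for bucket, (count, mean_angle) in summary.items():
    print(bucket, count, round(mean_angle, 1))
```

The counts give the frequency of ground balls per bucket, and the average angle (or an average x/y location, if coordinates are available) gives a natural spot to station a fielder.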

And the average locations of the batted balls in each bucket, in both tabular and graphical form, respectively: 
Based on the frequency with which Robles hits his ground balls, in addition to the locations of each average bucket position relative to the rest, I settled on this defensive alignment for the infielders: 
The third baseman is hugging the line, the shortstop is playing deep in the hole, the second baseman is around the bag, and the first baseman is shaded to the left (the further left the better, but obviously he needs to be able to get back to first base to receive throws. So consider the first baseman in the game at the time). Next I looked to position the outfielders. 
When hitting the ball in the direction of either of the corner outfielders, Robles hits the ball at similar distances and actually hits the ball the hardest to right field. Very unusual. Given the locations of his fly balls in the spray chart at the start of the post, I would position the left fielder around the cluster towards center field to the left of second, the center fielder shaded to the right towards the other large cluster, and have the right fielder hug the line. When all is said and done, the alignment should look as follows: 
As an addendum I wanted to see if there were certain pitch locations and types that induced contact at the locations of where I would place the fielders. First, I sorted the pitch locations that led to ground balls. Considering the infielders are placed around buckets one, three, five, and eight I filtered the entire data set for pitches that resulted in ground balls in those four buckets. 
Given his lack of fly balls and the locations of these pitches, it seems he has trouble getting under the ball when it's thrown down in the zone. He really pulls breaking balls down and inside; almost all of his balls in play on pitches with those characteristics were hit to the third baseman. Fastballs up led to a lot of grounders to bucket three, where the shortstop would be positioned.

Here is a similar figure, for fly balls in each portion of the outfield:

Most fastballs on the outer half of the plate are going to find their way to the right fielder. Similarly, fastballs in the middle third go to center and in the inner third go to left. Robles is strictly adherent to the idea of "going with the pitch", a possible explanation for his middling power results despite his loud tools. Then again, his weakest fly ball contact is to the pull side. Maybe he needs to make a swing adjustment to unlock some more power? It is difficult to be an even above-average power hitter without hitting the ball to the pull side with authority.

All data scraped from MLBAM using the baseballR package 

Tuesday, August 18, 2020

Rangers 5v5 Usage Patterns Over the Course of the Regular Season

With the Rangers season completed, I wanted to look at how coach David Quinn adjusted player usage throughout the course of the regular season. Even-strength usage, more specifically 5v5 usage, makes up a majority of a hockey game (to the tune of about 80 percent). First I looked at the 5v5 usage patterns on a weekly basis during the season for the players I deemed as "regulars".

And here is the same type of plot but with biweekly usage for the most used players:
Seems like Quinn trusted Fox and Lindgren more as the season progressed. Chytil started seeing much less ice time at 5v5 towards the end of the season, as did Howden (definitely to the benefit of the team). Smith saw a huge influx in usage, probably attributable to his switch back to playing defense. Skjei also saw a lot more time right up until he got traded. This makes me think the team was trying to showcase him and they came to the conclusion they were going to move him a few weeks before he got traded.

Overall, it will be interesting to see how much time Fox gets next season (whenever that happens). Given his on-ice results I would like to see him more consistently lead the team in ice time along with Trouba. The Chytil usage is a bit concerning. Given the lack of center talent in the organization, he is a big piece moving forward, so giving him ample ice time will be imperative in evaluating whether he can be the second-line center of the future.

Friday, August 14, 2020

How Much do Defensemen Drive Play Compared to Forwards?

Look at the most expensive players in the NHL and you will find that the majority are forwards. There is definitely a disparity in how teams pay free agents based on position. Do not mistake this point for the idea that defensemen do not get paid; they do. Filter the list of players from the link above to just defensemen and we still have a list of highly paid professional hockey players. Now, take a look at the highest paid forwards. If you do not feel the need to click all of those links, take my word on the following; if you do, you can come up with your own conclusions: subjectively, the list of the highest paid defensemen is riddled with a higher proportion of albatrosses. In the hockey fan community, there is definitely a consensus that the league is worse at evaluating defensemen. The box-score statistics do not do an adequate job of describing or predicting a defenseman's value, especially relative to forwards (where points, which are still not a great indicator of talent, have more standalone value). Part of the issue is that aforementioned discrepancy between gauging the value of forwards and defensemen. But another part of the issue is that, based on the roles each position plays on the ice, defensemen just matter less and their on-ice results are subject to way more noise than those of forwards.

I am going to define offensive and defensive play driving as the differential between a team's shot attempt and expected goal rates (xG for the rest of the post) when a given player is on or off the ice. These are called relative metrics and are displayed on any of the advanced stats sites available for public consumption (for this post all my data is from Natural Stat Trick). The obvious drawback of using these measures to approximate play driving ability is that they lack context. They do not account for who the player plays with or against (teammates are more important), zone starts, score state, home versus away, etc. But for the purpose of this study, I think they will do. If you believe that the results are invalidated because of the way I chose to use the data, I cannot stop you from doing so.
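
As a worked example of a relative metric, here is a minimal Python sketch. It assumes rates are expressed per 60 minutes of 5v5 play and that "relative" means the on-ice rate minus the off-ice rate; the function names and the sample totals are hypothetical.

```python
def per_60(events, minutes):
    """Convert a raw event count into a rate per 60 minutes of ice time."""
    return events / minutes * 60


def relative_rate(on_events, on_minutes, off_events, off_minutes):
    """Team rate with the player on the ice minus the rate with him off the ice."""
    return per_60(on_events, on_minutes) - per_60(off_events, off_minutes)


# Hypothetical skater: 900 shot attempts for in 1,000 on-ice minutes,
# versus 2,100 shot attempts for in 2,500 minutes with him on the bench.
offense_rel = relative_rate(900, 1000, 2100, 2500)
print(round(offense_rel, 1))
```

The same function works for the defensive side by feeding it shot attempts or xG against, in which case a negative number is the good outcome.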

The data set includes skaters from the past two regular seasons who played at least 200 minutes at 5v5 in both seasons (all the analysis is done looking at performance at 5v5). I first looked at the changes in each measure of offensive and defensive play driving and how they changed for each player year over year, by position.
In each instance, forward contributions are less variable year over year, even with regards to defensive play driving ability, though defensive performance is more variable overall. We should be less confident in how defensemen drive play in a given season compared to their forward counterparts. Theoretically this uncertainty should manifest itself in how we understand a given player's mean and distribution of performances in a season or set of games (defensemen have more variance, thus we should be less confident). 

I built four simple linear models, one for each of the measures of offensive and defensive play driving, with the goal of predicting 2019-2020 results from the 2018-2019 results and a player's position. The player's position (either forward or defenseman) was statistically significant in each case. Based on the collection of plots above, offensive play driving had much less noise than defensive play driving.
Offensive shot attempt driving had the least amount of variation, followed by driving xG on the offensive end. Therefore, we should be much more wary of attributing defensive results to defensive ability and instead chalk a lot of those results up to variance (which should still be done on the offensive end, but as I alluded to, much less so). With that being said, looking at the distributions of play driving results on offense paints a grim picture for defensemen.
Forwards have higher positive and negative impacts and less of the population is concentrated around the mean. This is another indicator that forwards matter more; they have a larger effect on the most predictive parts of a player's value.
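For anyone who wants to try the modeling step described above, here is a minimal sketch of one such model: ordinary least squares with an intercept, last season's value, and a forward/defenseman indicator. The exact specification is my assumption (additive, no interaction term), the data are toy values constructed to follow a known line, and the sketch does not compute significance tests.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(prev, is_defenseman, current):
    """Least-squares fit of current ~ intercept + prev + is_defenseman."""
    X = [[1.0, p, float(d)] for p, d in zip(prev, is_defenseman)]
    k = 3
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, current)) for i in range(k)]
    return solve(xtx, xty)  # normal equations: (X'X) beta = X'y


# Toy data built to follow current = 0.5 + 0.6 * prev - 1.0 * is_defenseman exactly.
prev = [2.0, -1.0, 0.5, 1.0, -2.0, 3.0]
is_d = [0, 0, 0, 1, 1, 1]
current = [0.5 + 0.6 * p - 1.0 * d for p, d in zip(prev, is_d)]

intercept, slope, position_effect = fit_ols(prev, is_d, current)
print(intercept, slope, position_effect)
```

With real data the fit will not be exact, and a statistics package would be the easier route to standard errors and p-values for the position term.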

What can we draw from this? First, whenever a team signs a defenseman to a massive long-term contract, our ears should perk up. I am not saying that no defenseman deserves a 40, 50, 60, 70, or 80 million dollar contract. I just want to caution fans on how confident they should be when considering the merits of a defenseman, especially compared to forwards. Considering that defenseman play driving ability is more variable year over year, I think teams should really be investing more in their forward corps and taking fliers on cheaper defensemen on short-term deals, where they can be useful pieces in the lineup, good trade chips, or sunk costs that are not too expensive (as Kyle Dubas has done in Toronto, to the dismay of many despite the team's consistent success the past few years). Keep this in mind during the next trade and free agency frenzy, sometime in October.

Thursday, August 13, 2020

Evaluating Saquon Barkley Through Two Seasons

Saquon Barkley has been the face of the New York Giants through two seasons in blue. He was drafted second overall in 2018, to the chagrin of the more analytically inclined football fan. On the surface Barkley seems to have been a slam-dunk pick. He has accumulated 2,310 yards on the ground and 1,159 yards through the air for a total of 3,469 scrimmage yards through two seasons (per Pro Football Reference). He led the NFL in scrimmage yards as a rookie and his two-year total is 13th all time through two seasons for all backs since the NFL-AFL merger despite only playing in 13 games last season. All of this looks like an excellent return for the Giants. But dig a little deeper and Barkley has not been the player Giants fans and the media have made him out to be.

Expected points added is a metric that values plays based on a variety of factors including, but not limited to, down, distance and yard-line. Each play has an expected points value and the change in expected points between plays is expected points added (EPA). This is the best way to gauge the value and success of plays in the NFL today. Success rate is the rate at which plays yield a positive expected points added. We can use this to show the benefits of throwing the ball relative to running. The average pass play yields an EPA of 0.07 while a running play on average results in an EPA of -0.04. So for the remainder of this post I will use success rate and EPA to evaluate the merits of player production.
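As a quick illustration of the two definitions, here is a minimal sketch that takes expected points before and after each play and returns average EPA and success rate. The expected points values are invented, and real EPA models also handle scoring plays and changes of possession, which this sketch leaves out.

```python
def epa(ep_before, ep_after):
    """Expected points added: the change in expected points across one play."""
    return ep_after - ep_before


def summarize(plays):
    """Return (average EPA, success rate) for (ep_before, ep_after) pairs."""
    values = [epa(before, after) for before, after in plays]
    mean_epa = sum(values) / len(values)
    # A play counts as a success when its EPA is positive.
    success_rate = sum(v > 0 for v in values) / len(values)
    return mean_epa, success_rate


# Hypothetical plays: (expected points before the snap, expected points after the play)
plays = [(1.0, 1.5), (2.0, 1.2), (0.5, 0.25), (3.0, 7.0)]
mean_epa, success_rate = summarize(plays)
print(round(mean_epa, 4), success_rate)
```

Note how one big play can carry the average while the success rate stays modest, which is the distinction that matters for a "home run" back.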

Looking at just the Giants running backs the past two seasons, we see Barkley stands out in both volume and efficiency.
The following is the distribution of EPA on each carry for each back on the Giants: 
Barkley looks good, relative to his teammates. But looking at where he stands compared to the other most used backs in the NFL the past two years and his performance looks much less impressive: 
Barkley has a slightly above average EPA per play figure compared to all of the backs, but compared to the other more-used backs he is below average. His success rate is below average period. Given his reputation for making "home run" plays, this is not especially surprising. He lags behind stalwarts such as Christian McCaffrey, Derrick Henry, and Nick Chubb. Check out the scale of the y-axis. Most backs, even the best of them, generally yield negative EPA per play on average. The following is the same information tabulated for backs commonly referred to as the superstars at the position. I considered above average plays those which yielded an above average EPA for a running play. High end plays are in the top 20th percentile of all plays and low end carries are those in the bottom 20th percentile: 
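The three categories just defined can be computed along these lines. This is a sketch under my own assumptions: the reference distribution is passed in as `league_epa` (a hypothetical name), the 20th and 80th percentile cut points come from Python's `statistics.quantiles`, and the numbers are toy values.

```python
from statistics import mean, quantiles


def carry_profile(player_epa, league_epa):
    """Share of a back's plays that are above average, high end, and low end."""
    league_avg = mean(league_epa)
    cuts = quantiles(league_epa, n=5)  # cut points at the 20/40/60/80th percentiles
    low_cut, high_cut = cuts[0], cuts[-1]
    n = len(player_epa)
    return {
        "above_avg": sum(v > league_avg for v in player_epa) / n,
        "high_end": sum(v >= high_cut for v in player_epa) / n,
        "low_end": sum(v <= low_cut for v in player_epa) / n,
    }


# Toy reference distribution (-1.0 to 1.0 in steps of 0.1) and a hypothetical back.
league_epa = [x / 10 for x in range(-10, 11)]
player_epa = [-0.9, -0.2, 0.1, 0.3, 0.9]
profile = carry_profile(player_epa, league_epa)
print(profile)
```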
But what about his ability as a receiver? Part of the appeal of Barkley is the ability to catch the ball out of the backfield. He does not fare much better.
To say his per catch productivity is mediocre would be generous. Furthermore, I pulled the numbers for some of his peers who are considered the very best at receiving: 
Note that running back targets are still not a great value proposition. Most backs yield less EPA per target than the average NFL target, which again is about 0.07 EPA. Finally, let's pull out the distributions of EPA per target from three backs who are considered the best at receiving the ball out of the backfield: 
Barkley fares much worse. Some might argue that Barkley has not had the best support in the form of quarterback and offensive line play. I would argue that if he is a transcendent player, he should be able to perform at the highest levels without that support. If that is not a satisfying answer, I would argue that making a heavy investment in a running back is a bad proposition in the first place. Two years in, Barkley's selection at 2nd overall still seems puzzling. The Giants could have used the selection on a player at a more valued position, such as a quarterback flier, a defensive back, a wide receiver, or a high-end pass-rusher, the types of players that command massive contracts on the open market. I worry that the Giants will double down on the selection by handing Barkley a McCaffrey-type extension that will hamstring the Giants' salary cap for the long-term future. All we can do now is wait and see if Dave Gettleman will accept the sunk cost and either play hardball or move on. I am dubious he will do so, but only time will tell.

Wednesday, August 12, 2020

Funneling Attempts on the Power Play

If you have vaguely followed the NHL in the recent past, arguably the most evergreen phenomenon is Alex Ovechkin shooting from the left face-off circle on a power play. Not familiar? Take a look at his shot attempt locations with the man advantage since the 2007-2008 regular season:
Distilling these locations into a contour map yields the following:
The Washington Capitals, in this time-frame, have ranked 3rd, 1st, 1st, 17th, 18th, 1st, 1st, 1st, 1st, 3rd, 6th, 14th, and 21st (this past season) in goals per 60 minutes on the power play. Over the entirety of the time between the 2007-2008 and 2019-2020 seasons, the Capitals have scored 7.94 goals per 60 minutes on the power play, 0.82 goals per 60 more than the second-place Bruins. The difference between the Capitals and Bruins is the same as the difference between Boston and Montreal, who ranks 17th. Recent struggles notwithstanding, the Capitals power play has consistently churned out goals despite the fact that everyone in the arena and on their couch knows where they want to move the puck. To boil down the Capitals power play to funneling Ovechkin the puck is a bit disingenuous, but one would think going to the well over and over again would lose effectiveness over time. One might argue that the Caps are seeing their predictable behavior come to haunt them in recent seasons, but I would argue that has more to do with relative effectiveness of some of the pieces as they age. For most of the last 13 seasons, the Capitals have had the best power play in the league. 

Ovechkin is a generational goal scorer and arguably the best of all time. Funneling shot attempts to a player of that caliber (to the tune of 38 percent the past three seasons) seems like a shrewd strategy. But what about other players? Can funneling the puck to a single player on the power play lead to overall effectiveness if that player is not Ovechkin? To investigate, I took every player who has played 100 minutes on the power play (specifically 5v4 power plays) over the past three regular seasons. Next I looked at each of those players' individual shot attempt rates, goal scoring rates, and expected goal scoring rates. I then paired those with their on-ice shot attempt, goal, and expected goal rates and took the percentage of each that the individual accounted for. 
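The share calculation described above can be sketched as follows. Because the individual and on-ice rates are measured over the same ice time, dividing the raw totals gives the same share as dividing the per-60 rates. The field names and player totals are hypothetical and only show the shape of the computation.

```python
MIN_MINUTES = 100  # 5v4 ice-time cutoff described in the post


def funnel_shares(player):
    """Fraction of on-ice shot attempts, goals, and expected goals the player took."""
    return {
        stat: player["ind_" + stat] / player["on_" + stat]
        for stat in ("attempts", "goals", "xg")
    }


# Hypothetical players with made-up 5v4 totals.
players = [
    {"name": "Shooter A", "minutes": 600, "ind_attempts": 380, "on_attempts": 1000,
     "ind_goals": 30, "on_goals": 80, "ind_xg": 20.0, "on_xg": 70.0},
    {"name": "Point man B", "minutes": 450, "ind_attempts": 150, "on_attempts": 750,
     "ind_goals": 5, "on_goals": 50, "ind_xg": 4.0, "on_xg": 50.0},
    {"name": "Depth C", "minutes": 40, "ind_attempts": 10, "on_attempts": 60,
     "ind_goals": 1, "on_goals": 3, "ind_xg": 0.8, "on_xg": 4.0},
]

# Keep only players who clear the minutes cutoff, then compute their shares.
qualified = {
    p["name"]: funnel_shares(p) for p in players if p["minutes"] >= MIN_MINUTES
}
for name, shares in qualified.items():
    print(name, {stat: round(value, 3) for stat, value in shares.items()})
```

Each share can then be plotted against the player's on-ice goal rate to look for a relationship.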

Once I collected all the data, I first plotted the effects of funneling for all the players in the data set on their on-ice goal rates (my barometer for power play effectiveness): 
There does not appear to be much of a relationship. At the tail end of the shot attempt and goal plots there appears to be a small negative relationship, so I filtered for players who I deemed the most used (at least 20% of shot attempt, goal, and expected goal share): 
Still nothing. For expected goals and goals, there seem to be some diminishing returns at around 28 percent, but this does not persist with shot attempts. Finally, I split the data by forwards and defensemen to see if funneling to a specific player type has any effect: 
Given the locations from where defensemen shoot, it is not surprising that funneling offense in their direction yields worse results. They are often shooting far from the goal line, above the face-off circles. For forwards, it is still difficult to come to any conclusion. 

In the future, I would like to dig into this deeper and see if funneling to specific locations for specific shot types can help explain the proficiency of a power play. Can a player like Mike Hoffman, who has a similar shot diet to Ovechkin but at the right face-off circle, prop up a good power play? For now, I think we can safely say the following: funneling the puck to forwards is not an inherently bad strategy (without knowledge of specifics) and funneling the puck to defensemen is bad (which is not a revolutionary finding).

For those who are curious, here are the most and least used players on the power play based on percentage of shot attempts: 



Shot location data via Moneypuck.com, all other data via Natural Stat Trick




Saturday, August 8, 2020

Small Sample Strikeout Woes

Gary Sanchez, through his first 25 plate appearances in this abbreviated 2020 season, accumulated 12 strikeouts, a rate of 48 percent. Needless to say, there has been a lot of hand-wringing about this slow start to the year, including questions about his focus (which New York fans love to talk about with respect to Sanchez) and his overall ability (despite the massive sample of Sanchez being an excellent hitter, the best among catchers since his debut in 2016 on a rate basis).

I wanted to pen a quick post about narrative building off of small samples of plate appearances. Given the short season and the general nature of baseball fandom, it is difficult to put aside what you are watching on a game to game basis and have perspective on the range of outcomes for a player in small samples and what that player should be expected to do, on a macro-level (single or multiple seasons), in the future.

Inspired by Sanchez's strikeout-riddled start to the season, I looked at theoretical hitters with true strikeout rate talents from 20 percent to 40 percent in increments of two percentage points. For each batter I simulated 25 plate appearance samples 10,000 times each. I used a Monte Carlo method where a number below the strikeout rate talent level resulted in a "strikeout" while all other numbers were other plate appearance outcomes, which I was not concerned with for this study. The following are histograms for each theoretical hitter and the distribution of strikeout rates in each of the 10,000 25 plate appearance samples:
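For readers who want to reproduce the idea, here is a minimal sketch of that kind of simulation using only Python's standard library. The function names, the fixed seed, and the use of `random.random()` are my choices for illustration; the original simulation may have been set up differently.

```python
import random


def simulate_k_rates(true_rate, pa=25, n_samples=10_000, seed=0):
    """Simulate strikeout rates over many samples of `pa` plate appearances."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_samples):
        # A uniform draw below the talent level counts as a strikeout.
        strikeouts = sum(rng.random() < true_rate for _ in range(pa))
        rates.append(strikeouts / pa)
    return rates


def share_at_least(rates, threshold=0.5):
    """Fraction of samples with a strikeout rate at or above the threshold."""
    return sum(r >= threshold for r in rates) / len(rates)


# True talent levels from 20 to 40 percent in steps of two percentage points.
talents = [t / 100 for t in range(20, 41, 2)]
results = {t: share_at_least(simulate_k_rates(t)) for t in talents}
for talent, share in results.items():
    print(f"{talent:.0%} talent: {share:.2%} of samples at 50% or worse")
```

With 25 plate appearances, a rate of at least 50 percent means 13 or more strikeouts, since 12 strikeouts is only 48 percent.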

The horizontal axis represents the strikeout rate in the sample. The dashed line represents a strikeout rate of 50 percent, so any result past that would indicate the hitter struck out in at least 13 plate appearances in the sample. As you would imagine, a true talent 20 percent strikeout rate hitter rarely records 13 strikeouts in 25 plate appearances. Also not surprising, a 40 percent strikeout rate hitter is rung up at least 13 times more often than the hitter with a 20 percent talent. For more clarity on how often each hitter records a strikeout rate of at least 50 percent, I created another visualization using the distributions above: 
The 20 percent strikeout rate hitter recorded at least 13 strikeouts in just six of the samples, or 0.06 percent. The 40 percent hitter recorded the value of interest in 16.1 percent of the samples. Even a 40 percent true talent hitter does not record this result often, and there are not even any of these types of hitters who regularly play in MLB. The average strikeout rate in MLB is 23 percent and a hitter of that talent level would be expected to record this result about 0.15 percent of the time. Let us now go back to Gary Sanchez. Depth Charts projections at FanGraphs have him at a 28.9 percent strikeout rate for the rest of the season, which we can consider that system's estimation of his true talent. We would expect a hitter of this caliber to strike out in at least 50 percent of a 25 plate appearance sample about 1.2 percent of the time. What this means is that Sanchez is performing (with respect to striking out) at a level that certainly is within his range of possible outcomes, but an incredibly bad outcome at that. So I guess what you, the reader, should take away from this is maybe, just maybe, cut the guy a little slack. He has certainly been dreadful thus far, but I would not expect this level of performance to persist throughout the season.
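The simulated frequencies can also be cross-checked without any simulation, since the number of strikeouts in 25 independent plate appearances follows a binomial distribution. The sketch below sums the exact tail probability; treating plate appearances as independent with a fixed strikeout probability is the same simplifying assumption the simulation makes, and the function name is my own.

```python
from math import comb


def tail_prob(p, n=25, k=13):
    """Exact probability of at least k strikeouts in n plate appearances."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Talent levels discussed above: 20%, league average, Sanchez's projection, 40%.
for talent in (0.20, 0.23, 0.289, 0.40):
    print(f"{talent:.1%} talent: {tail_prob(talent):.3%}")
```

Small gaps between these exact values and the simulated ones are expected, because 10,000 samples still leave some sampling noise.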


Saturday, August 1, 2020

Gerrit Cole's First Couple of Starts for the Yankees

Now a little more than one week through the (mess that is the) 2020 MLB season, we have seen Gerrit Cole take the mound for the Yankees on two occasions. Cole was the biggest free agent on the market and as such he signed a 324 million dollar contract with New York, the largest ever for a major league pitcher. Cole is coming off of two seasons in Houston where, after making some meaningful changes to his pitch mix, he blossomed into an ace-level starting pitcher to the tune of a 37.3 percent strike-out rate, a 6.9 percent walk rate and accumulated 13.4 wins above replacement over 412 and two thirds regular season innings, per FanGraphs. 

Having seen both of his starts thus far, I thought it would be a fun exercise to dig through some of the pitch-level data, even with the season's future hanging in the balance and see if I can spot any differences in his pitch characteristics thus far between his tenure in Houston and his starts with the Yankees. Before I take a look, I should provide the caveat that looking at just two starts worth of pitches should be taken with a massive grain of salt, on account of the extremely small sample. 

First, here are his average velocities by pitch type and the standard deviation of those velocities: 
Not much of a change at all. Even the standard deviations are very close. Next, pitch usage both in total and by count: 
Cole has been much more reliant on his four seamer in the early going, especially in hitter's counts. This might be some indication that he does not trust his secondary pitches yet for high leverage pitches, which given the unusual circumstances around the start of the season is not especially surprising. Still, given his success the last couple of years, I would hope as the season (hopefully) progresses, he begins to lean on his slider and curve more often to keep hitters uncomfortable even when they have the advantage. 

To the eye, Cole has not been very sharp (relative to our enormous expectations) and sure enough he seems to have some issues commanding the fastball and curve.
He really seems to be making an effort to bury his curve, but given his whiff rates in the past on the pitch he would probably see some benefit to putting it more in the strike zone and forcing opposing hitters to swing. The fastball has been left over the plate way too often thus far, and his whiff rate has suffered as a result (37.6 percent last year versus 22.0 percent through his first two starts; in 2018 it was 29.7 percent, so some regression from last year's figure is to be expected). The slider seems fine, but his curveball is generally his breaking ball of choice against lefties. In order to turn over the lineup more than a couple of times effectively, he needs to be more effective with that curve. Maybe you can attribute some of this apparent lack of command to where he is releasing the ball?
Looks like he is releasing the ball closer to his head on average. There is still overlap between the release positions in Houston and New York, but there is a clear separation between the clusters. Here is the same information in tabular form: 
There is a difference, but it is too early to attribute this difference to a mechanical change versus just natural variance in his release points. In the release point plots above, there is more variation in his release points from his Houston days, but a larger number of pitches also makes it more likely that some release points land far from the mean. If you consider his release point data from Houston as a distribution of possible release points for Cole, it is feasible that the data from his pitches in New York would fit in that distribution, which would suggest that this is just a product of variance and not a conscious mechanical change. Nevertheless, I would be interested in hearing whether he made this tweak on purpose or it is just the product of a small sample of pitches. Finally, I looked at Cole's pitch movement profile and found that nothing had really changed in his first two starts.
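One way to make the release point question more concrete is a resampling check: repeatedly draw groups of Houston pitches the same size as the New York sample and see how often their average sits as far from the overall Houston average as the New York average does. This is a technique I am swapping in for illustration, not something done for this post, and the release point values below are randomly generated stand-ins (horizontal release position in feet), not Cole's data. It also treats pitches as independent, which real pitches within a start are not.

```python
import random


def subsample_share(reference, new, n_draws=2000, seed=0):
    """Share of same-size reference subsamples whose mean is at least as far
    from the reference mean as the new sample's mean is."""
    rng = random.Random(seed)
    ref_mean = sum(reference) / len(reference)
    observed_gap = abs(sum(new) / len(new) - ref_mean)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(reference, len(new))
        if abs(sum(draw) / len(draw) - ref_mean) >= observed_gap:
            hits += 1
    return hits / n_draws


# Synthetic stand-in data, generated only to exercise the function.
data_rng = random.Random(42)
houston = [data_rng.gauss(-2.0, 0.2) for _ in range(3000)]
new_york = [data_rng.gauss(-1.8, 0.2) for _ in range(200)]

share = subsample_share(houston, new_york)
print(share)
```

A share near zero would say that a gap this large rarely shows up by chance in samples of this size, while a large share would support the "just variance" reading.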
And the data in tabular form: 
Given that neither the movement nor the velocity has changed, as I explained above, the apparent lack of command through two starts is probably the main culprit behind his strike-out rate falling 10 percentage points compared to his time in Houston and his going from 0.93 ground balls per fly ball to 0.35, per FanGraphs. His swinging strike rate is down (about 27 percent) as is the rate at which he puts away hitters with two strikes.
All of this is not cause for alarm. Cole has been very good through two starts, just not as dominant as we have been used to. But we are also talking about only two starts and two starts after a four month layoff. Given what has transpired in the world of baseball the last few days, I would be much more worried about seeing Cole pitch at all for the rest of 2020.