Last week, Tom Tango, Sabermetrician extraordinaire for the Chicago Cubs (and one of the original hockey analysts to, you know, actually make money doing this thing), posted an article on his site proposing a new metric to better weight the components of Corsi. Tango defined his new metric (which he proposed, with tongue planted firmly in cheek, be called Tango or, failing that, Weighted Shots) as a simple linear combination of goals and non-goal Corsi events:

wSH = Goals + 0.2*(Shots + Missed Shots + Blocked Shots)

The weighting of 0.2 was informed by (although not strictly derived from) a regression between half-season Corsi components and half-season Goals For (i.e. calculating the weights to maximize the predictive value of future Goal Differential). Tango’s goal was to preserve the predictive information that we see in Corsi while properly taking into account the fact that goals are really what the game is all about. And while no analyst has ever argued that Corsi alone is enough to evaluate a team or player with, Tango’s point was that we had the data to make intuitive improvements to Corsi in a relatively easy manner.
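As a quick illustration of the formula, here is a minimal Python sketch. The function and parameter names are my own, the team-game totals are made up, and "shots" is taken to mean saved shots on goal, since goals are counted separately.

```python
def weighted_shots(goals, saved_shots, missed_shots, blocked_shots, shot_weight=0.2):
    """Tango's wSH: goals count in full, every non-goal Corsi event counts shot_weight."""
    return goals + shot_weight * (saved_shots + missed_shots + blocked_shots)

# Hypothetical team-game: 3 goals, 27 saved shots, 12 misses, 11 blocked attempts.
example = weighted_shots(3, 27, 12, 11)
print(round(example, 1))  # 3 + 0.2 * 50 = 13.0
```

Keeping the 0.2 as a parameter makes it easy to try other weightings later.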

One of the problems with how wSH is formulated, though, is that it aims behind the current state of Hockey Analytics. As Micah McCurdy has illustrated, Score Adjusted metrics vastly outperform standard possession metrics, since both the location of the game and the current score state have significant impacts on how teams perform. Unless we take score and venue effects into account, even an improved metric like wSH is missing important information. Fortunately, if we follow along with Micah’s original methodology, we can figure out the appropriate adjustments to bring these factors into wSH.

Using data from 2008-2014, we can first calculate the probability of a given team recording an event based on the event type, score state and game location (home/away):

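| Score | Home Goal | Away Goal | Home Shot/Miss/Block | Away Shot/Miss/Block |
|-------|-----------|-----------|----------------------|----------------------|
| -3    | 52.87%    | 50.83%    | 59.25%               | 56.66%               |
| -2    | 51.19%    | 50.49%    | 57.34%               | 55.31%               |
| -1    | 53.20%    | 49.77%    | 54.96%               | 53.22%               |
| 0     | 52.91%    | 47.09%    | 50.95%               | 49.05%               |
| 1     | 50.23%    | 46.80%    | 46.78%               | 45.04%               |
| 2     | 49.51%    | 48.81%    | 44.69%               | 42.66%               |
| 3     | 49.17%    | 47.13%    | 43.34%               | 40.75%               |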
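Each cell is one side's share of that event type in a given score state. A minimal sketch of that calculation follows; the function name is mine and the counts are invented purely so the output lines up with one pair of cells.

```python
def event_shares(home_count, away_count):
    """Return (home_share, away_share) of the events recorded in one score state."""
    total = home_count + away_count
    if total == 0:
        raise ValueError("no events recorded in this score state")
    return home_count / total, away_count / total

# Hypothetical counts of shots/misses/blocks while the home team trails by 3
# (which is the same stretch of play as the away team leading by 3).
home_share, away_share = event_shares(home_count=5925, away_count=4075)
print(f"{home_share:.2%} {away_share:.2%}")  # 59.25% 40.75%
```

This is why the home value at a score of -3 and the away value at +3 sum to 100%: they describe the same stretch of play from opposite sides.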

Then, we can take the probabilities, along with Tango’s wSH weights (1 for goals, 0.2 for shots/misses/blocks) and combine them to calculate weighted adjustment factors for a Score Adjusted Weighted Shots metric (SAwSH):

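| Score | Home Goal Weight | Away Goal Weight | Home Shot/Miss/Block Weight | Away Shot/Miss/Block Weight |
|-------|------------------|------------------|-----------------------------|-----------------------------|
| -3    | 0.943            | 0.983            | 0.163                       | 0.173                       |
| -2    | 0.976            | 0.990            | 0.171                       | 0.179                       |
| -1    | 0.936            | 1.005            | 0.180                       | 0.187                       |
| 0     | 0.942            | 1.058            | 0.196                       | 0.204                       |
| 1     | 0.995            | 1.064            | 0.213                       | 0.220                       |
| 2     | 1.010            | 1.024            | 0.221                       | 0.229                       |
| 3     | 1.017            | 1.057            | 0.227                       | 0.237                       |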
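As a rough sketch of the arithmetic (an assumption inferred from the two tables, not a formula stated above): every published weight matches the base wSH weight scaled by twice the opposing side's share of that event, i.e. weight = base × 2 × (1 − share). The names below are illustrative.

```python
BASE_WEIGHT = {"goal": 1.0, "shot": 0.2}  # Tango's wSH weights

def adjusted_weight(kind, share):
    """Scale the base weight by twice the opposing side's share of events.

    Assumption: this formula is inferred from the published tables.
    """
    return BASE_WEIGHT[kind] * 2 * (1 - share)

# Shares from the probability table for a home team trailing by 3.
print(round(adjusted_weight("goal", 0.5287), 3))  # 0.943
print(round(adjusted_weight("shot", 0.5925), 3))  # 0.163
# Shares for an away team in a tied game.
print(round(adjusted_weight("goal", 0.4709), 3))  # 1.058
print(round(adjusted_weight("shot", 0.4905), 3))  # 0.204
```

The effect is that events which are more common in a given situation are marked down, while rarer ones are marked up.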

As you can see, the value of a goal relative to a shot isn’t constant in our new method. It ranges from one goal being worth 5.78 shots/misses/blocks (for a home team down 3 goals) all the way down to 4.46 shots/misses/blocks per goal (for a visiting team up by 3).
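To check those two extremes, here is a small sketch that works from the unrounded shares in the probability table (dividing the rounded weights, 0.943 / 0.163, gives 5.79 instead). It assumes each weight is the base weight times 2 × (1 − share), which is inferred from the tables rather than stated in them; the function name is mine.

```python
def shots_per_goal(goal_share, shot_share, shot_weight=0.2):
    """Adjusted goal weight divided by adjusted shot weight for one situation.

    Assumes weight = base * 2 * (1 - share), inferred from the published tables.
    """
    goal_weight = 2 * (1 - goal_share)
    adjusted_shot_weight = shot_weight * 2 * (1 - shot_share)
    return goal_weight / adjusted_shot_weight

print(round(shots_per_goal(0.5287, 0.5925), 2))  # home team down 3: 5.78
print(round(shots_per_goal(0.4713, 0.4075), 2))  # away team up 3: 4.46
```

With no adjustment at all the ratio is simply 1 / 0.2 = 5, so the score and venue effects move it by roughly 10-15% in either direction.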

Now that we’ve defined how to calculate SAwSH, let’s look at how well it performs compared to Score Adjusted Corsi. Whenever we evaluate a new stat there are two things we need to look at to decide how much trust to put in it: 1) the repeatability of the metric, that is how well our measurement over one period predicts the same measurement over another period; and 2) how well the metric predicts our result of interest (winning hockey games). A metric that’s not repeatable doesn’t do us much good when we’re evaluating a team or player, since we don’t know whether the results we observe are due to luck or talent. At the same time, a measure that’s highly repeatable but doesn’t relate to winning is a metric that we should just ignore.

The best way for us to test for repeatability at the team level is to look at the correlation between our results in odd-numbered games and in even-numbered games. Since there’s nothing in the game number that would relate to our results, if we see a high correlation it’s a good sign that what we’re observing is a talent.
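A minimal sketch of the odd/even split follows. The team labels and per-game values are made up, and averaging per-game percentages is a simplification: pooling the underlying events in each half before computing a percentage would be closer to how these metrics are usually built.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half(values_by_game):
    """Average a per-game metric over odd- and even-numbered games."""
    odd = values_by_game[0::2]   # games 1, 3, 5, ...
    even = values_by_game[1::2]  # games 2, 4, 6, ...
    return sum(odd) / len(odd), sum(even) / len(even)

# Hypothetical per-game values for three team-seasons (made-up numbers).
team_seasons = {
    "A": [0.55, 0.53, 0.57, 0.54],
    "B": [0.48, 0.50, 0.47, 0.49],
    "C": [0.51, 0.52, 0.50, 0.53],
}
halves = [split_half(values) for values in team_seasons.values()]
odd_values = [odd for odd, _ in halves]
even_values = [even for _, even in halves]
print(round(pearson(odd_values, even_values), 3))
```

Each team-season contributes one (odd, even) pair, and the correlation across all pairs is the repeatability figure.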

While Score Adjusted Corsi shows slightly more repeatability, the difference is more or less negligible. Both metrics show enough repeatability that we don’t have to worry that they’re influenced too heavily by luck. This is particularly important for SAwSH, as it dispels one of the biggest worries that many people had about it: that the inherent variability in shooting and save percentage would mean that we’d need a much larger sample before we could trust the results.

If we move on to predictability we can run a similar test, but instead of correlating the same metric between even and odd-numbered games, we’ll look at how well our Score Adjusted numbers in even games predict a team’s Goals For Percentage in odd games (and vice-versa).
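To make the two quantities concrete, here is a hedged sketch of one team's predictor (SAwSH% in even games) and outcome (raw GF% in odd games). The weights are the published ones; the event layout, the function names, the clamping of leads beyond ±3, and the two-game log are all assumptions of mine.

```python
# Published adjustment weights, keyed by (venue, kind), then by the shooting team's lead.
WEIGHTS = {
    ("home", "goal"): {-3: 0.943, -2: 0.976, -1: 0.936, 0: 0.942, 1: 0.995, 2: 1.010, 3: 1.017},
    ("away", "goal"): {-3: 0.983, -2: 0.990, -1: 1.005, 0: 1.058, 1: 1.064, 2: 1.024, 3: 1.057},
    ("home", "shot"): {-3: 0.163, -2: 0.171, -1: 0.180, 0: 0.196, 1: 0.213, 2: 0.221, 3: 0.227},
    ("away", "shot"): {-3: 0.173, -2: 0.179, -1: 0.187, 0: 0.204, 1: 0.220, 2: 0.229, 3: 0.237},
}

def event_weight(venue, kind, lead):
    """Look up a weight; leads beyond +/-3 are clamped (an assumption)."""
    return WEIGHTS[(venue, kind)][max(-3, min(3, lead))]

def sawsh_pct(events, team):
    """The team's share of score-adjusted weighted shots.

    Each event is (shooting_team, venue, kind, lead), all from the shooter's side.
    """
    weighted_for = weighted_against = 0.0
    for shooter, venue, kind, lead in events:
        weight = event_weight(venue, kind, lead)
        if shooter == team:
            weighted_for += weight
        else:
            weighted_against += weight
    return weighted_for / (weighted_for + weighted_against)

def gf_pct(events, team):
    """Raw goals-for percentage; assumes at least one goal in the sample."""
    goals_for = sum(1 for s, _, kind, _ in events if kind == "goal" and s == team)
    goals_all = sum(1 for _, _, kind, _ in events if kind == "goal")
    return goals_for / goals_all

# Hypothetical two-game log for team "A" (made-up events).
games = {
    1: [("A", "home", "shot", 0), ("A", "home", "goal", 0), ("B", "away", "shot", -1)],
    2: [("B", "home", "goal", 0), ("A", "away", "shot", -1), ("A", "away", "goal", -1)],
}
even_events = [e for n, evs in games.items() if n % 2 == 0 for e in evs]
odd_events = [e for n, evs in games.items() if n % 2 == 1 for e in evs]
predictor = sawsh_pct(even_events, "A")  # x value for this team
outcome = gf_pct(odd_events, "A")        # paired y value for this team
```

Repeating this for every team-season gives the paired lists to correlate, and swapping the halves gives the vice-versa comparison.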

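| Metric                        | Correlation (Even -> Odd GF%) | Correlation (Odd -> Even GF%) |
|-------------------------------|-------------------------------|-------------------------------|
| Score Adjusted Corsi          | 0.475                         | 0.421                         |
| Score Adjusted Weighted Shots | 0.495                         | 0.446                         |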

In both datasets, SAwSH does a better job of predicting out-of-sample Goals For %. This makes sense, of course, since SAwSH includes goal scoring/goaltending data where SAC doesn’t. The size of the difference between SAC and SAwSH is also interesting to note: the correlations with out-of-sample GF% are roughly 4-6% higher using wSH rather than Corsi (in terms of variance explained, r² rises by about two percentage points), illustrating the fact that shooting percentage and save percentage do matter at the team level. While they’re obviously not as important as possession (after all, we still do fairly well using only SAC), there’s clearly a benefit to including them in our analyses.

While the computational cost of SAwSH may be slightly higher than standard CF, the benefits are more than just an increase in predictive power: wSH makes much more sense intuitively, and is a direct counterpoint to the argument that analytics are too focussed on possession. SAwSH makes a much better argument for analytics while giving up very little in the way of repeatability. While there are obviously further areas to investigate (the weightings in Tango’s original regression equations are worth a deeper look as there’s likely further value to be extracted there), SAwSH is clearly a step forward for the analytics movement. And although some may argue that the power of SAwSH is a repudiation of Corsi as a metric, I instead look at it as a validation of possession-based analyses: the value of the sample that Corsi offers is obvious; SAwSH is just a small tweak to better reflect the inherent shooting and goaltending differences that Corsi can miss in some cases.

On his site Tango asked for correlations for Score Adjusted Goals, and so I’m happy to oblige:

| Comparison             | Correlation |
|------------------------|-------------|
| Odd(SAG%)->Even(SAG%)  | 0.365       |
| Odd(SAG%)->Even(GF%)   | 0.366       |
| Even(SAG%)->Odd(GF%)   | 0.354       |

Obviously, SAwSH is quite a bit better in both repeatability and predictability, but what’s more interesting is how little additional value we get from adjusting GF% for Score State/Venue. The correlation between raw GF% in odd and even games is 0.353, which means that we’re getting almost no additional information from our adjustments.
