In 2013-2014 Boyd Gordon and Manny Malhotra were two of the worst players in the league by raw CF%, at 42.3% and 41.6% respectively. Most people would argue that their results were not all that surprising given that they faced the toughest zone starts of any players in the league, with over 59% of their shifts starting in the defensive zone according to stats.hockeyanalysis.com, almost 10 percentage points higher than anyone else in the NHL. The problem with this argument, however, is that neither player actually started 59% of his shifts in his own end. While both players did see 59% of the faceoffs they were on the ice for come at their own end of the rink, if we look at where each shift actually started and ignore faceoffs that occurred mid-shift, we see a much different story. Both players still faced some of the toughest zone starts of any player in the league, but the actual percentage of Boyd Gordon’s shifts that started in front of his own goaltender was only about 32%, roughly half of what’s traditionally reported. Malhotra has an even larger gap: only 25% of his shifts actually started in the defensive zone, nearly 35 percentage points lower than his faceoff-based number.
It’s not just Malhotra, Gordon and the other players at the extreme ends of the spectrum who are grossly misrepresented by traditional zone start percentage either. Every player across the NHL has his usage numbers skewed by the fact that most sites use faceoffs to measure zone starts rather than looking at the actual shift data (I should point out that most of the main stats sites do make it very clear that they use faceoffs, and that Hockey Analysis actually refers to the metrics as OZFO%/DZFO%/NZFO% now). Part of the reason for the differences is that the traditional measurements don’t take into account shifts that start on-the-fly as opposed to at a stoppage in play. And while this explains some of the difference we see, it’s not the bulk of the problem. The main issue with the current approach to measuring zone starts is that the measurement is skewed, sometimes heavily, by the performance and talent of the player in question. Bad players tend to end up with more defensive zone faceoffs because their opponents tend to get more shot attempts against them, which leads to more opportunities for their goalie to freeze the puck and therefore more defensive zone faceoffs. The same idea is true in reverse for good players, and it all adds up to a spurious correlation between the traditional zone start measure and possession numbers.
Finding a player’s true zone start percentage, that is, the number of times he actually begins his shift in a given zone divided by his total number of shifts, isn’t all that difficult either. A simple approach, which only requires the NHL’s Play-by-Play files, is to check whether a player was on the ice for the stoppage directly before each faceoff. If he was on the ice for that stoppage, his shift was already in progress and the faceoff doesn’t count towards his zone start total; if he wasn’t, we know it’s a true zone start.
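The Play-by-Play check described above can be sketched roughly as follows. This is only an illustration: the event dictionaries, the `"STOP"`/`"GOAL"`/`"FAC"` type labels and the `on_ice` and `zone` fields are assumptions made for the example, not the actual layout of the NHL files, and `zone` is assumed to already be expressed from the player’s own team’s point of view.

```python
from collections import Counter

# Hypothetical labels for events that stop play; a real feed will use its own.
STOPPAGE_TYPES = {"STOP", "GOAL"}


def true_zone_starts_from_pbp(events, player):
    """Count faceoffs where `player` was NOT on the ice at the preceding stoppage.

    `events` is a chronological list of dicts with keys "type", "on_ice"
    (a set of player names) and, for faceoffs, "zone" ("DZ", "NZ" or "OZ").
    """
    counts = Counter()
    on_ice_at_stoppage = set()  # stays empty before a period-opening faceoff
    for event in events:
        if event["type"] in STOPPAGE_TYPES:
            on_ice_at_stoppage = set(event["on_ice"])
        elif event["type"] == "FAC":
            # On for the draw but not for the whistle: the shift starts here.
            if player in event["on_ice"] and player not in on_ice_at_stoppage:
                counts[event["zone"]] += 1
            on_ice_at_stoppage = set()  # each stoppage applies to one faceoff
    return counts


# Made-up events purely to show the call; not real game data.
sample_events = [
    {"type": "FAC", "zone": "NZ", "on_ice": {"A", "B"}},
    {"type": "STOP", "on_ice": {"A", "B"}},
    {"type": "FAC", "zone": "DZ", "on_ice": {"A", "C"}},
    {"type": "STOP", "on_ice": {"C"}},
    {"type": "FAC", "zone": "OZ", "on_ice": {"A", "C"}},
]
starts_for_a = true_zone_starts_from_pbp(sample_events, "A")
```

Note that this version cannot see shifts that begin on-the-fly, since those never coincide with a faceoff in the event list.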
Alternatively, and more accurately, we can use the NHL’s shift files to check when a player’s shift start coincides with a faceoff time. The benefit of this approach is that not only do we see when a player starts in a given zone, but we can also find those shifts where the player starts his shift on-the-fly. And when we look at the data, it turns out that including these on-the-fly shifts is actually really important in terms of determining a player’s true zone start percentage.
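A minimal sketch of the shift-file approach might look like the following. It assumes the shift and faceoff data have already been parsed into `(period, seconds_elapsed)` keys, that zones are labelled from the player’s own team’s perspective, and that a shift start matches a faceoff only when the recorded times are identical; real data may well need a tolerance of a second or two. The function and variable names are hypothetical.

```python
from collections import Counter

CATEGORIES = ("DZ", "NZ", "OZ", "OTF")


def true_zone_start_pct(shift_starts, faceoffs):
    """Classify each shift by where it started and return the share per category.

    `shift_starts` is a list of (period, seconds_elapsed) tuples for one player.
    `faceoffs` maps (period, seconds_elapsed) to "DZ", "NZ" or "OZ".
    A shift whose start time matches no faceoff is treated as on-the-fly.
    """
    if not shift_starts:
        return {category: 0.0 for category in CATEGORIES}
    counts = Counter(faceoffs.get(start, "OTF") for start in shift_starts)
    total = len(shift_starts)
    return {category: counts[category] / total for category in CATEGORIES}


# Made-up inputs purely to show the call; not real game data.
shifts = [(1, 0), (1, 95), (1, 180), (1, 260), (2, 0)]
draws = {(1, 0): "NZ", (1, 180): "DZ", (1, 300): "OZ", (2, 0): "NZ"}
shares = true_zone_start_pct(shifts, draws)
```

Because the denominator is every shift rather than every faceoff, the on-the-fly share falls out of the same pass.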
The graph above shows the number of players from 2008/09-2013/14 who recorded a given zone start percentage using both the traditional faceoff-count approach and our true zone start measure. While it’s obvious that our measures should decrease once we include on-the-fly shifts, the actual number of shifts that start on-the-fly (and hence don’t really offer any zone-start advantage) is quite staggering. An average player should expect roughly 60% of his shifts to start on-the-fly, with the remaining 40% divided up between the offensive, defensive and (most commonly) neutral zones. While our traditional metrics have players starting approximately 30% of their shifts in the offensive or defensive zones, when we use True Zone Starts we see that the number drops down to about 10-12%.
|True DZS%||Traditional DZS%||True NZS%||Traditional NZS%||True OZS%||Traditional OZS%||OTF%|
The other thing that’s important to note is that the distribution curves are a lot narrower for our true metrics than our traditional zone start numbers. The standard deviation of Traditional OZS% and DZS% is about 5%, while for True OZS% and DZS% it’s only about half that. What this implies is that the actual difference in coaching usage between players is a lot smaller than we previously thought it was, and that a good deal of the variance we were seeing in the original numbers was related to the differences in talent we mentioned above.
Even if we ignore differences in talent though, it’s pretty easy to show why we see a smaller variance in true zone start percentages. If we take every shift that began with a true zone start in a given zone and count the faceoffs the player saw in each zone during that shift, we can estimate how many traditional zone starts each true zone start is worth.
|Expected Faceoffs per True Zone Start||Defensive Zone Start||Neutral Zone Start||Offensive Zone Start|
|Expected DZ Faceoffs||1.14||0.34||0.07|
|Expected NZ Faceoffs||0.06||1.28||0.06|
|Expected OZ Faceoffs||0.06||0.32||1.13|
For every Defensive Zone start, we’d expect an average player to see 1.14 defensive zone faceoffs (including the original draw), plus 0.06 neutral zone faceoffs and 0.06 offensive zone faceoffs. In other words, even if we exclude the original faceoff, a player who starts in the defensive zone is more likely to have his second faceoff of the shift in the defensive zone (0.14) than in the offensive zone (0.06). The same trend holds in reverse for offensive zone starts, and it’s this difference that results in players with above average True DZS% having even more extreme Traditional DZS%.
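As a rough worked example, the table can be treated as a small lookup of expected faceoffs per true zone start and applied to a player’s true zone start counts. The values below are copied from the table; the function name and the sample counts are hypothetical, and the sketch ignores any faceoffs that occur during shifts that began on-the-fly, so it will understate a player’s full faceoff totals.

```python
# Expected faceoffs in each zone (inner keys) per true zone start (outer keys),
# copied from the table above.
EXPECTED_FACEOFFS = {
    "DZ": {"DZ": 1.14, "NZ": 0.06, "OZ": 0.06},
    "NZ": {"DZ": 0.34, "NZ": 1.28, "OZ": 0.32},
    "OZ": {"DZ": 0.07, "NZ": 0.06, "OZ": 1.13},
}


def expected_faceoffs(true_starts):
    """Estimate faceoffs seen in each zone from counts of true zone starts.

    `true_starts` maps "DZ"/"NZ"/"OZ" to the number of shifts started there.
    Shifts that started on-the-fly are deliberately left out.
    """
    totals = {"DZ": 0.0, "NZ": 0.0, "OZ": 0.0}
    for start_zone, per_start in EXPECTED_FACEOFFS.items():
        n_starts = true_starts.get(start_zone, 0)
        for faceoff_zone, rate in per_start.items():
            totals[faceoff_zone] += n_starts * rate
    return totals


# Hypothetical player: 30 defensive, 40 neutral and 10 offensive zone starts.
estimate = expected_faceoffs({"DZ": 30, "NZ": 40, "OZ": 10})
dz_share = estimate["DZ"] / sum(estimate.values())
```

Comparing `dz_share` with the same player’s share of true defensive zone starts gives a feel for how the two measures can drift apart even before talent enters the picture.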
While we’ve seen that traditional zone start numbers obviously differ a lot in magnitude from true zone start numbers, you may be wondering whether it’s still appropriate to use the traditional numbers as proxies for the true numbers. This is obviously a fair question to ask: after all, if the numbers are basically the same but just on different scales, you would still be able to get a decent idea of a player’s true usage from their traditional numbers. Unfortunately, however, the correlations between the true and traditional metrics aren’t all that high.
|Comparison||Correlation|
|True DZS% vs. Traditional DZS%||0.80|
|True NZS% vs. Traditional NZS%||0.53|
|True OZS% vs. Traditional OZS%||0.67|
These would be great correlations to see if we were looking at year-over-year comparisons, but in this case it’s not encouraging at all since we’re looking at two numbers that purportedly measure the same thing over the same time period. To give a point of comparison, the correlation between same season Weighted Shots % and Corsi For % at the team level is above 0.95, and we still prefer Weighted Shots because it’s a more accurate measure of a team’s talent.
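For anyone who wants to run the same comparison on their own numbers, a plain Pearson correlation is one way to do it. This is a sketch under that assumption, since the exact correlation measure behind the table isn’t spelled out here, and the sample percentages are made up purely to show the call.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-player values, one entry per player (not real data).
true_dzs = [9.5, 11.0, 12.5, 10.0, 14.0]
traditional_dzs = [27.0, 31.0, 30.5, 33.0, 38.0]
r_dz = pearson(true_dzs, traditional_dzs)
```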
So while it’s clear that traditional zone starts are a flawed metric, we only really have half the story. While true zone starts are much less prevalent than the traditional numbers indicate, we still haven’t looked into what effect a true zone start has on a player’s possession numbers. We know that the difference in zone starts between players isn’t as big as we may have thought previously, but it’s still important to figure out the size of the impact of a true zone start. After all, if a true defensive zone start makes a player 80% worse, a marginal difference in usage between players may result in an enormous difference in results. In Part II we’ll examine how we can measure these differences, and look at how we can properly adjust for true zone starts in our possession numbers.