2015 Playoff Predictions

After three seemingly never-ending nights without hockey, the playoffs are finally upon us, with the Habs and Sens, and Caps and Isles, set to kick off the postseason tonight at 7. As in years past, I’m going to be posting predictions for each matchup as well as overall Cup odds throughout the playoffs. Unlike the last two years, however, I’m using a new (and hopefully improved) model to predict how well each team will do.

This new model is based partly on work that I’ve completed and written about in the past (for example, a major element is Score Adjusted Weighted Shots), and partly on work that I’ve done but haven’t written about. I’ll hopefully get into more detail on the model later, but at a high level the key factors are:

  • 5v5 Score Adjusted Weighted Shots For Per 60 (over the last 25 games)
  • 5v5 Score Adjusted Weighted Shots Against Per 60 (over the last 25 games)
  • Penalties Drawn Per 60 (full season)
  • Penalties Taken Per 60 (full season)
  • Powerplay Weighted Shot Differential Per 60 (full season)
  • Penalty Kill Weighted Shot Differential Per 60 (full season)

Both the Powerplay and Penalty Kill shot weights are different from the even strength weights, but the principles used in deriving the weights are the same. All these factors are scaled against league average and then used to determine each team’s advantage on offense, defense and special teams. Based on these advantages we can predict each team’s likelihood of defeating their opponent in a single game, and extrapolate these single game numbers to find a total series probability.
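As a rough illustration of that last step (and not the model's actual calculation), here is a minimal sketch of turning a single game win probability into a best-of-seven series probability. It assumes every game is independent and has the same win probability, so it ignores home ice; the function name `series_win_probability` is made up for this example.

```python
from math import comb


def series_win_probability(p_game: float, wins_needed: int = 4) -> float:
    """Chance of winning a best-of-(2 * wins_needed - 1) series.

    Assumption for this sketch: every game is independent and is won
    with the same probability p_game (no home-ice adjustment).
    """
    q = 1.0 - p_game
    # Winning the series while dropping exactly k games means losing k of
    # the first (wins_needed - 1 + k) games and then winning the clincher.
    return sum(
        comb(wins_needed - 1 + k, k) * p_game**wins_needed * q**k
        for k in range(wins_needed)
    )


if __name__ == "__main__":
    for p in (0.50, 0.55, 0.60):
        print(f"single game {p:.2f} -> series {series_win_probability(p):.3f}")
```

The takeaway is that a longer series amplifies a small single game edge, which is why modest per-game advantages can still produce noticeably lopsided series odds.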

So who does our new model think will be accepting the Cup from Gary Bettman this spring?

Posted in Predictions

Puck++ Playoff Prediction Challenge

Do you like the playoffs? Do you like predicting things? And do you think the NHL’s claim that SAP has a guy who can get 85% of playoff series right is a bit unbelievable? Then do we here at Puck++ have a contest for you!

For the first (and hopefully not last) time, Puck++ is hosting a Playoff Prediction Challenge. To enter, all you need to do is provide the probability of the home team winning each series, round by round. The entry form and a full set of rules are available here. Entries will be accepted at any time; however, you’ll only be able to provide predictions for series that have not yet begun, so don’t wait until the last minute to get your picks in.

What do the winners get? Glory, and the right to call yourself the smartest person in the hockey world for the next year*.

So what are you waiting for? Entering takes as little or as much time as you’d like, so fire up R or Python or your crystal ball and get predicting!

*Puck++ provides no guarantee that you will be recognized as the smartest person in the hockey world.

Posted in Playoffs

Predicting Save Percentage: Danger Zones and Shot Volumes

A few days ago, Conor Tompkins of Null Hypothesis Hockey tweeted out an interesting set of graphs showing the correlation between a goalie’s save percentage in each of the War-On-Ice danger zones and their overall success rate. Conor found that (unsurprisingly) a goalie’s performance on high danger shots was most closely correlated with overall success, with medium danger shots having slightly less influence, and low danger shots showing almost no relationship. While Conor’s model focused on correlations within the same season, Sam Ventura suggested that a useful extension would be to look at how well the danger zone save percentages predicted future overall save percentages. After all, if performance on high danger shots is most critical for a goalie in determining his current season save percentage, it stands to reason that this would also be a key predictor of future success.

One way we can look at this is to run a multiple linear regression between a goalie’s current season save percentage and his past save percentages broken down by danger zone. We’ll focus on 5v5 data only to avoid the issue of varying penalty rates between teams, and look at goalies who played at least 1000 minutes in back-to-back seasons (all data from War On Ice).
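For readers who want to try this kind of regression themselves, here is a minimal sketch using only the Python standard library. The helper `fit_ols` and the goalie rows are assumptions for illustration: the numbers are made-up placeholders, not War On Ice data, and this is not necessarily how the post's regression was run.

```python
def fit_ols(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    rows    -- one tuple of predictor values per observation
    targets -- the response value for each observation
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0, *row] for row in rows]
    n = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Made-up placeholder rows: last season's (low, medium, high) danger Sv%.
past_zone_sv = [
    (0.975, 0.925, 0.830),
    (0.968, 0.915, 0.845),
    (0.980, 0.930, 0.810),
    (0.972, 0.905, 0.850),
    (0.965, 0.920, 0.825),
    (0.978, 0.910, 0.840),
]
# Made-up placeholder targets: the same goalies' overall Sv% this season.
current_sv = [0.925, 0.922, 0.927, 0.920, 0.918, 0.926]

intercept, b_low, b_med, b_high = fit_ols(past_zone_sv, current_sv)
print(f"low={b_low:.3f} medium={b_med:.3f} high={b_high:.3f}")
```

With real data, the relative size of the three coefficients (and their uncertainty, which this sketch does not compute) is what would tell us which danger zone carries the most predictive weight.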

Posted in Goaltending, Theoretical

Calculating Replacement Level for Faceoff Win Percentage

The idea of the replacement level player is one of the most important concepts in sports analytics. While not strictly necessary to do any basic player comparisons, the value of the replacement level lies in providing a baseline below which a professional player should not perform. After all, if a player is performing below replacement level, we should listen to the stats and do exactly what they tell us to: replace him with almost any other player.

In hockey, though, defining replacement level can be a difficult task. Part of that stems from the fact that we currently don’t have exact methods of rating players’ individual contributions. We can say which players generally perform well when they’re on the ice, and we can estimate how a player’s team performs with and without him, but distilling all the information we have down to an opinion about a player’s value is currently more art than science. Hockey is a complex game with many moving parts, and because of that, creating an aggregation method for all the data we gather is a complex task.

Posted in Faceoffs

Hockey Prospectus – Player Level Weighted Shots

Over at Hockey Prospectus I’ve got an article up on calculating Weighted Shots (or, more specifically, Score Adjusted Weighted Shots) at the individual player level. Give it a read here. The article expands on my presentation at the Ottawa Hockey Analytics Conference, which you can find here.

Lastly, if you’re interested in seeing the player level SAwSH data from 2008-2009 through to 2013-2014, it’s available here.

Posted in Hockey Prospectus

Improving Corsi Rel: Adjusting for Team Talent

Corsi Rel is a stat that, in theory at least, is meant to address the fact that a good player on a poor team is still likely to post a bad CF%. We don’t want to punish superstars who are surrounded by replacement level players in the same way that we don’t want to reward hangers-on playing on Cup winners (*cough* Dave Bolland *cough*). For defencemen in particular, Corsi Rel is often a better way to measure their impact, given that they have much less control over play in general and are driven heavily (at least in terms of raw results) by the talent up front that they’re paired with.

The problem with Corsi Rel, however, is that it’s too blunt an instrument: it assumes that each player can only affect his team’s results by a set amount, regardless of the talent of that team. A good player on a bad team is assumed to be a good player on any team he plays on, which we know is unlikely to be true in practice. A player with a +1% Corsi Rel on a 42% team is unlikely to make a 56% team into a 57% squad, but pure Corsi Rel assumes that this would be the case. So while we know that there’s value in the information that Corsi Rel contains, the question is how to maximize that value.
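To make the arithmetic concrete, here is a small sketch of the raw stat, assuming the common definition of Corsi Rel as on-ice CF% minus the team's CF% with the player off the ice. The function names and shot-attempt counts are invented for illustration.

```python
def cf_pct(corsi_for: int, corsi_against: int) -> float:
    """Corsi For Percentage from shot-attempt counts."""
    return 100.0 * corsi_for / (corsi_for + corsi_against)


def corsi_rel(on_for: int, on_against: int,
              team_for: int, team_against: int) -> float:
    """On-ice CF% minus off-ice CF%, in percentage points.

    team_for / team_against are the team's totals, including the
    player's own on-ice attempts (an assumption of this sketch).
    """
    off_for = team_for - on_for
    off_against = team_against - on_against
    return cf_pct(on_for, on_against) - cf_pct(off_for, off_against)


# Invented counts: a 43% player on a team that runs at 42% without him.
rel = corsi_rel(on_for=430, on_against=570, team_for=1270, team_against=1730)
print(f"Corsi Rel: {rel:+.1f}")
```

Nothing in this calculation knows how strong the rest of the team is: the same figure would be carried over unchanged to a 56% team, which is exactly the assumption being questioned here.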

Posted in Theoretical

How much do zone starts matter part II: A lot on their own, not that much in aggregate

In Part I of our review of zone starts, we looked at how the traditional definition of zone starts varied from what most people would consider a “true” zone start, and found that when we applied the true zone start definition to our data, the spread between players in zone start percentages decreased significantly. One key reason for the difference between methods is the inclusion of on-the-fly starts, which tend to make up around 60% of a player’s total shifts, and which drastically decrease the impact of each defensive/offensive/neutral faceoff. Another driver is the fact that often a player’s zone start percentage is impacted by their own performance: bad players end up with more defensive zone faceoffs due to their inability to drive possession, which incorrectly inflates their defensive zone start percentages. This also helps to create a false link between zone start percentages and possession numbers, leading people to incorrectly infer that tough zone starts are a key driver behind a player’s results.

While it’s useful to know that the true difference in zone starts between players is generally minimal, that doesn’t necessarily mean that we can just ignore them completely. To make a judgement about the overall impact that zone starts have, we first need to figure out what the impact of a single zone start is on possession. To do that, we can simply look at all the 5v5 shifts taken since 2008 in aggregate, and calculate the overall Corsi For Percentage broken down by starting location.
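A minimal sketch of that aggregation might look like the following. The shift record layout (start location, Corsi for, Corsi against), the location labels and the sample rows are all hypothetical stand-ins, not the actual data set.

```python
from collections import defaultdict


def cf_pct_by_start(shifts):
    """Pool shot attempts by shift start location and return CF% for each.

    shifts -- iterable of (start_location, corsi_for, corsi_against)
    Locations with no shot attempts at all are left out of the result.
    """
    totals = defaultdict(lambda: [0, 0])
    for start, cf, ca in shifts:
        totals[start][0] += cf
        totals[start][1] += ca
    return {
        start: 100.0 * cf / (cf + ca)
        for start, (cf, ca) in totals.items()
        if cf + ca > 0
    }


# Hypothetical shifts: OZ/NZ/DZ faceoff starts and on-the-fly (OTF) changes.
sample_shifts = [
    ("OZ", 2, 0), ("OZ", 1, 1), ("DZ", 0, 2),
    ("DZ", 1, 1), ("NZ", 1, 1), ("OTF", 1, 1),
]
for start, pct in sorted(cf_pct_by_start(sample_shifts).items()):
    print(f"{start}: {pct:.1f}%")
```

Pooling the raw counts before dividing, rather than averaging per-shift percentages, keeps short shifts with one or two attempts from dominating the result.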

Posted in Zone Starts