Every NHL contract negotiation features at least two common events: 1) reporters listing subjectively determined comparable players to peg a player to a given salary (“He’s going to be looking for Andrew MacDonald money”); and 2) fans picking and choosing stats to poke holes in the analysis performed by the media and come up with their own valuation. While the player evaluation tools available to the hockey community have grown over the past few years, the use of these tools to value players has always been somewhat lacking. Should we look at a player’s Corsi-Rel, or his raw CF%? How do we include contextual factors like zone starts in our analysis? And what if there’s something that our stats don’t measure that really should be included in a proper valuation? On the other hand, while finding comparable players is an admirable approach, the comparisons that are often used include such factors as “playing the tough minutes” or “bringing heart into the dressing room” or “he’s not Russian”. While these make for nice storylines (except the last one), they don’t always translate into good decisions (as the Flyers will likely find out with the aforementioned Mr. MacDonald).
As a fancy stats enthusiast and someone generally interested in tinkering in Excel, I wanted to look at whether we could blend the approaches to come up with a way to turn advanced statistics into comparable players. While I used a nearest neighbour analysis to find comparable players in my free agent preview this year, I found that it was tough to translate the results of a single variable (in this case, xGD20) into a salary, particularly considering that important factors such as usage weren’t included in the analysis. I wanted to put something together that was flexible enough that it could be modified to take different variables into account depending on the kind of analysis that was being undertaken (for example, comparing players on usage or comparing players on performance) to allow any analyst to decide which were the stats that mattered when determining player salary. And so, after the usual few weeks of procrastination, I’ve finally built a working Player Comparables Tool, which I’ve put up for download here.
The underlying spreadsheet that drives the whole tool was put together by the amazing Rob Vollman, who literally wrote the books on hockey analytics (which you should go purchase immediately if you haven’t yet). Pretty much every piece of data you could want to include is in there, including stats of both the fancy and non-fancy variety. The tool is set up to allow you to choose up to 10 stats to include in your analysis, and to weight them as you see fit. For best use, all changes should be made to the highlighted cells on the Player Comparables tab. If you’re an NHL exec whose tinkering leads to a bad deal, I’m not giving refunds.
The approach the calculator takes is to find a given player’s “nearest neighbours” across the selected variables, calculating a z-score for each variable to ensure that everything is scaled the same way. I’ll spare you the rest of the nitty-gritty details of the calculations, but if you want to dig into it a bit more, the calculation “steps” are more or less outlined in columns O-AR of the “Consolidated” sheet.
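For anyone who would rather read code than spreadsheet columns, here is a minimal Python sketch of the general idea. It is not a transcription of the formulas in the sheet: the weighted Euclidean distance, the averaging of neighbours' cap hits into a predicted salary, the function names, and the toy player data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Scale a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every player has the same value, so the stat carries no information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def nearest_neighbours(players, target, stats, weights, k=10):
    """Return the k players closest to `target` across the chosen stats.

    Distance is a weighted Euclidean distance on z-scores (an assumption;
    the spreadsheet may combine the stats differently).
    """
    names = list(players)
    z = {name: {} for name in names}
    for stat in stats:
        column = zscores([players[name][stat] for name in names])
        for name, value in zip(names, column):
            z[name][stat] = value

    def distance(a, b):
        return sum(weights[s] * (z[a][s] - z[b][s]) ** 2 for s in stats) ** 0.5

    others = [name for name in names if name != target]
    return sorted(others, key=lambda name: distance(target, name))[:k]


def predicted_salary(players, comparables, salary_key="cap_hit"):
    """Average the comparables' cap hits (an assumed way to get a salary)."""
    return mean(players[name][salary_key] for name in comparables)


# Invented numbers for four hypothetical players, purely for illustration.
players = {
    "Player A": {"cf_rel": 1.0, "es_toi": 15.0, "cap_hit": 2.5},
    "Player B": {"cf_rel": 1.1, "es_toi": 15.5, "cap_hit": 3.0},
    "Player C": {"cf_rel": -3.0, "es_toi": 10.0, "cap_hit": 1.0},
    "Player D": {"cf_rel": 6.0, "es_toi": 21.0, "cap_hit": 6.0},
}
stats = ["cf_rel", "es_toi"]
weights = {"cf_rel": 1.0, "es_toi": 1.0}

comparables = nearest_neighbours(players, "Player A", stats, weights, k=2)
print(comparables, predicted_salary(players, comparables))
```

Changing the entries in `weights` plays the same role as changing the weights in the tool: a heavier weight makes differences in that stat count for more when ranking neighbours.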
The one piece of data that’s missing is a differentiation between contracts that cover UFA years, versus deals that are only for a player’s RFA seasons. This unfortunately tends to understate the salary of some of the UFA deals that get signed, but with the data at hand it’s a necessary limitation.
A few other things to note:
- If you choose only a few stats you’re likely to get wonky results and/or errors (this is due to ties in the nearest neighbour ranking).
- If you want to add your own stats all you need to do is add them onto one of the tables (Main, PP, EV, PK), and they should automatically appear in the dropdowns.
- If you want to use more than 10 players in the calculation, you just need to add rows below the 1-10 names, and adjust the formulas as appropriate.
- All of the salaries in the sheet are based on 2013 cap hits. If you want to update, edit the Cap Cost column (CX) of the Main Page.
So how well does the predictor do in actually guessing free agent salaries? I took a very basic approach, equally weighting OZ%, ES CF%-Rel, ES TOI%, PP TOI%, PK TOI%, ES IPP, Total Sh%, ES PenD/60, ES Pts/60 and PP Pts/60, and compared the nearest-neighbour predicted salaries with the actual salaries for the top 60 free agents from TSN. While this model is obviously a fairly basic one, it does do a decent job, generating an R^2 between predicted and actual of 0.25. With a bit of tweaking of the weights (and ideally, the inclusion of UFA/RFA deal info), it would likely get a lot better.
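If you want to run the same kind of check on your own weights, one common way to get an R^2 between two columns is to square the Pearson correlation, which is what Excel's RSQ function does. The sketch below assumes that definition; the function names are my own and the sample numbers in the comment are not real salaries.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def r_squared(predicted, actual):
    """R^2 taken as the squared Pearson correlation (an assumed definition)."""
    return pearson_r(predicted, actual) ** 2


# Usage: pass the predicted and actual cap hits in the same player order,
# e.g. r_squared([2.1, 3.4, 1.0], [2.5, 3.0, 1.2]) with made-up values.
```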
If you have any questions or issues with the sheet (or, if you find any bugs) don’t hesitate to let me know either in the comments or at puckplusplus AT gmail DOT com.
[…] While on the topic of the annual super-spreadsheets, Matt Cane of Puck++ created a nearest-neighbour tool that can find a player’s same-season […]
Interesting. I was just wondering about this question (specifically, “How much should the Leafs pay RFA Kadri next year?”). Would you get better results from a non-parametric test (e.g., Spearman R)? Do we really have to predict the player’s actual salary? Or is it enough to say that Kadri should be somewhere between the 17th and 21st best paid player? Overall, interesting stuff. Thanks for sharing your hard work.