Corsi Rel is a stat that, in theory at least, is meant to address the fact that a good player on a poor team is still likely to post a bad CF%. We don’t want to punish superstars who are surrounded by replacement-level players, in the same way that we don’t want to reward hangers-on playing on Cup winners (*cough* Dave Bolland *cough*). For defencemen in particular, Corsi Rel is often a better way to measure their impact, given that they have much less control over play in general and are driven heavily (at least in terms of raw results) by the talent up front that they’re paired with.
The problem with Corsi Rel, however, is that it’s too blunt an instrument – it assumes that each player can only affect his team’s results by a set amount, regardless of the talent of that team. A player who outperforms his teammates by a given margin on a bad team is assumed to outperform them by the same margin on any team he plays on, which we know is unlikely to be true in practice. A player with a +1% Corsi Rel on a 42% team is unlikely to make a 56% team into a 57% squad, but pure Corsi Rel assumes that this would be the case. So while we know that there’s value in the information that Corsi Rel contains, the question is how to maximize that value.
One way we can do this is to create a function that modifies Corsi Rel based on the quality of a player’s teammates (TMCF%, his teammates’ CF%):
Adjusted Corsi Rel = f(TMCF%) + Corsi Rel
The question then is what function we should use – ideally we want something simple, since anything we choose will be arbitrary (you could, in theory, design a study to calculate the correct adjustment function, but I’ll leave that to someone more ambitious than me). There are, however, a few conditions that our function should satisfy to ensure that it makes sense logically. First, we know that when a player’s TMCF% = 50% we should make no adjustments – this will be the baseline that we work against.
Condition 1: f(50%) = 0
Second, as TMCF% moves n% away from 50% in either direction, we want the adjustment to be the same size but opposite in sign. So if we’re increasing a player’s Corsi Rel by x% on a 52% team, we should be decreasing it by x% on a 48% team.
Condition 2: f(50% + n%) = -f(50% - n%)
This limits us to odd functions (or more specifically functions that are odd around 50%), with linear and cubic functions being the simplest choices. Before we choose a function, however, we need one more constraint that will give us another point to fit our line/curve through. The easiest way to do this is to choose two players that we think should be equal, and to force our equation to ensure they are. So, for example, you might think that a +3% Corsi Rel player on a 45% team would post a CF% of 56% on a 55% team (for a Corsi Rel of +1%). If we use that as our equivalency and fit a cubic function (with percentages written as decimals, so 45% is 0.45), our adjustment function becomes:
f(TMCF%) = 80*(TMCF%-50%)^3
Or, if you prefer a linear approach we get:
f(TMCF%) = 0.2*(TMCF%-50%)
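Both constants fall out of the equivalency directly. Setting the two players' Adjusted Corsi Rel equal gives Rel_a + f(TM_a) = Rel_b + f(TM_b), and for a function of the form f(TMCF%) = k*(TMCF%-50%)^p that can be solved for k. The sketch below assumes all percentages are written as fractions (45% = 0.45); the name `fit_constant` is just an illustrative choice.

```python
def fit_constant(rel_a, tm_a, rel_b, tm_b, power):
    """Solve rel_a + k*(tm_a-0.5)**power == rel_b + k*(tm_b-0.5)**power for k.

    All inputs are fractions (0.03 for +3%, 0.45 for a 45% team).
    `power` should be odd (1 = linear, 3 = cubic) so that the resulting
    function is odd around 50%, as Conditions 1 and 2 require.
    """
    denom = (tm_b - 0.5) ** power - (tm_a - 0.5) ** power
    if denom == 0:
        raise ValueError("the two teams must differ in TMCF% to fit a constant")
    return (rel_a - rel_b) / denom


# Equivalency from the text: +3% Rel on a 45% team == +1% Rel on a 55% team.
k_cubic = fit_constant(0.03, 0.45, 0.01, 0.55, power=3)
k_linear = fit_constant(0.03, 0.45, 0.01, 0.55, power=1)

print(round(k_cubic, 6))   # 80.0
print(round(k_linear, 6))  # 0.2
```

The rounding is only there to hide floating-point noise; the underlying values are 80 and 0.2.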
So which approach works better? Well, that’s a matter of personal opinion, but one thing we can look at is what adjustment each function makes across a range of TMCF% values.
Each curve represents the amount we need to add to a player’s raw Corsi Rel to get their Adjusted Corsi Rel. So a 48% player on a 40% team (a raw Corsi Rel of +8%) would have an Adjusted Corsi Rel of 0% using our cubic function, or 6% using the linear method. And while arguments could be made for both functions, to me the cubic graph seems like a more reasonable plot. The linear function doesn’t seem to be able to penalize really bad players on really bad teams enough, or to give great players on top tier teams as much due as they probably deserve. The cubic function also makes less change to a player’s Corsi Rel in the 48-52% range, which is probably a reasonable assumption – most teams in that region are relatively average, and there’s little reason to make drastic changes to a player playing on an average team.
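To make the comparison concrete, the sketch below tabulates both adjustment functions over a range of TMCF% values and reruns the 48%-player-on-a-40%-team example. It assumes the constants 80 and 0.2 from the formulas above, percentages written as fractions, and (following the examples in the text) a raw Corsi Rel equal to the player's CF% minus TMCF%. The helper names are illustrative.

```python
def cubic_adj(tm_cf):
    """Cubic adjustment: 80 * (TMCF% - 50%)^3, with TMCF% as a fraction."""
    return 80 * (tm_cf - 0.5) ** 3


def linear_adj(tm_cf):
    """Linear adjustment: 0.2 * (TMCF% - 50%), with TMCF% as a fraction."""
    return 0.2 * (tm_cf - 0.5)


def adjusted_rel(cf, tm_cf, adj):
    """Adjusted Corsi Rel = f(TMCF%) + raw Corsi Rel.

    Assumption: raw Corsi Rel is taken as cf - tm_cf, as in the text's examples.
    """
    return adj(tm_cf) + (cf - tm_cf)


# Adjustment (in percentage points) added to raw Corsi Rel at each TMCF%.
print("TMCF%   cubic  linear")
for pct in range(40, 61, 2):
    tm = pct / 100
    print(f"{pct:>4}%  {cubic_adj(tm) * 100:+6.2f}  {linear_adj(tm) * 100:+6.2f}")

# Worked example from the text: a 48% player on a 40% team (raw Rel = +8%).
example_cubic = adjusted_rel(0.48, 0.40, cubic_adj)
example_linear = adjusted_rel(0.48, 0.40, linear_adj)
print(f"cubic: {example_cubic * 100:.2f}%  linear: {example_linear * 100:.2f}%")
```

Up to floating-point noise, the example comes out at roughly 0% for the cubic and 6% for the linear function. The table also shows the two functions agreeing at 45% and 55%, the points they were fitted through, with the cubic adjusting less inside that range and more outside it.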
You can, of course, choose to change the equivalencies: maybe you think a +3% player on a 45% team would be a -1% player on a 55% team. That simply changes the constant that the cubic is multiplied by. With a bit of digging in the data we have, we could probably verify whether the cubic or linear method is more appropriate and what constant should be used. This post isn’t meant to provide exact values to adjust by (although, admittedly, the cubic method doesn’t look too far out of line with what I’d expect), but rather to establish a framework upon which further revisions can be made, and to throw the idea out there to be critiqued and improved upon.
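As a worked example of changing the equivalency, take the case of a +3% player on a 45% team being equal to a -1% player on a 55% team, with percentages written as decimals. The linear constant is included only for comparison.

```latex
\begin{align*}
0.03 + f(0.45) &= -0.01 + f(0.55) \\
f(0.55) - f(0.45) &= 0.04 \\
2\,f(0.55) &= 0.04 \quad \text{(Condition 2: } f(0.45) = -f(0.55)\text{)} \\
f(0.55) &= 0.02 \\[4pt]
\text{cubic: } k\,(0.05)^3 = 0.02 &\;\Rightarrow\; k = 160 \\
\text{linear: } k\,(0.05) = 0.02 &\;\Rightarrow\; k = 0.4
\end{align*}
```

So under this stricter equivalency the constants double, from 80 to 160 for the cubic and from 0.2 to 0.4 for the linear function.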
While we’ve focussed primarily on percentage stats here, you could also design a similar method to address rate stats – simply replace the 50% with the league average CF60 or CA60 and repeat the same fitting process from above. There’s also no reason to limit ourselves to Corsi – the same methodology can be applied to Fenwick, Goals, or even Weighted Shots (assuming that they ever get fully explained by a certain procrastinating analyst). The point is that raw Rel stats can give us strange results for players on extreme clubs, and that by making simple adjustments such as the ones outlined here we should be able to make better estimates for those players.
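A rough sketch of what the rate-stat version might look like, under loud assumptions: the league average of 55.0 CF60, the equivalency used to fit the constant, and the function names are all hypothetical placeholders chosen for illustration, not real league figures or values from this post. For CA60, where lower is better, the same fit applies as long as the equivalency is stated in CA60 Rel terms.

```python
def fit_rate_constant(rel_a, tm_a, rel_b, tm_b, league_avg, power=3):
    """Fit k so two players judged equal get the same adjusted Rel.

    Solves rel_a + k*(tm_a-league_avg)**power == rel_b + k*(tm_b-league_avg)**power.
    Inputs are in the rate stat's own units (e.g. shot attempts per 60).
    """
    denom = (tm_b - league_avg) ** power - (tm_a - league_avg) ** power
    if denom == 0:
        raise ValueError("the two teammate rates must differ to fit a constant")
    return (rel_a - rel_b) / denom


def make_rate_adjustment(league_avg, k, power=3):
    """Return f(TM rate) = k * (TM rate - league_avg)**power (odd around league_avg)."""
    def f(tm_rate):
        return k * (tm_rate - league_avg) ** power
    return f


# PLACEHOLDER values, for illustration only (not real data):
LEAGUE_AVG_CF60 = 55.0
# Hypothetical equivalency: +3 CF60 Rel with 50 CF60 teammates
# equals +1 CF60 Rel with 60 CF60 teammates.
k = fit_rate_constant(3.0, 50.0, 1.0, 60.0, LEAGUE_AVG_CF60)
adjust = make_rate_adjustment(LEAGUE_AVG_CF60, k)

for tm in (50.0, 55.0, 60.0):
    print(tm, round(adjust(tm), 3))
```

Note that the fitted constant depends on the units of the stat, so a constant fitted for CF% cannot be reused for CF60.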