Quantifying Spacing: Part 1

Metrics

Sep 22

Intrigued by @crumpledjumper’s new lineup tool, I decided to attempt to ‘measure’ spacing, if that can be done short of processing every last bit of player location data. For most people, I imagine, the word ‘spacing’ brings to mind a spread court, with 4 players evenly distributed along the 3-pt line, itching to fire off a catch-and-shoot bomb. What’s often forgotten is that cutting, and more generally rim gravity, plays a part in warping the court as well, but I’ll focus on that in later parts; for now I just want to document my first ideas, as well as the process behind them, for the sake of transparency. With that being said, here’s my relatively simplistic measure of space: 3PA * ((3P% * 1.5) - EFG%). Let’s break down each component:

3PA (3 Point Attempts) - Here, 3PA is a player’s 3PA per 100 possessions, scaled to make results easily comparable.

3P% (3 Point Percentage) - This is simply the designated player’s 3P%. It is multiplied by 1.5 (a made 3 is worth 1.5 times a made 2) to put it on the same scale as EFG%, which expresses shooting efficiency as an equivalent 2P%.

EFG% (Effective Field Goal Percentage) - The EFG% value that I use here is the league-wide percentage. In this case, it is a better measure than TS%, because I’m looking at the value of the average field goal attempt, since that is what a 3 would be a better or worse alternative to (no one ‘defends’ free throws, so spacing does not affect them). In the context of the lineup finder (where spacing will be measured by percentile), the exact value matters less than it would for the face values, though because it is multiplied by each player’s 3PA it can still shuffle the order of players with different volumes. If I ever wish to use this equation for the values themselves, this will be the numerical component that I revisit.
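As a minimal sketch, the formula above can be written as a small Python function. The function and parameter names are my own for illustration, and the league-wide EFG% of 0.529 is an assumption: it is not stated in the breakdown, but it is the value implied by the numbers in the top-10 list further down.

```python
# Assumption: league-wide EFG% inferred from the listed top-10 values,
# not a figure quoted directly in the article.
LEAGUE_EFG = 0.529


def value_added(three_pa_per_100: float, three_pct: float,
                league_efg: float = LEAGUE_EFG) -> float:
    """Spacing value: 3PA * ((3P% * 1.5) - EFG%).

    three_pa_per_100: 3-point attempts per 100 possessions
    three_pct:        3P% as a fraction (0.446, not 44.6)
    league_efg:       league-wide EFG% as a fraction
    """
    return three_pa_per_100 * (three_pct * 1.5 - league_efg)


# Duncan Robinson's line from the top-10 list: 13.7 attempts at 44.6%
print(round(value_added(13.7, 0.446), 4))
```

A shooter whose 3P% * 1.5 exactly matches the league EFG% comes out at zero no matter how many 3's he takes, which is why volume only helps players who shoot above that break-even mark.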

Essentially, I am measuring the value that a player adds with his 3-point shooting on a per-100 possession basis, following the logic that the more value you add from 3, the more defensive attention you will receive, because both volume and percentage are significant. As much as being able to hit 3’s at a high percentage is a skill, so is being able to get as many high-percentage looks as possible. A 40% 3-pt shooter who can’t shoot on the move is limited, and their lower volume would justifiably reflect their impact. And just to be explicitly clear, I am NOT attempting to measure shooting ability. For example, let’s look at James Harden. His value added is 0.055, per this metric. If you took that at face value, you would be missing a lot of things, chiefly the context. A mere 17% of Harden’s 3’s were assisted this season, and a lot of them were tough, off-the-dribble bombs. The metric obviously favors those who are used more as specialists, which is actually the goal (spacing is being represented alongside Offensive Load in the lineup finder; the model is consistent in its biases regarding tough shot-makers, which is totally fine). Anyways, let’s start by looking at (thanks to this) the top-10 players in this ranking:

Here, the x-axis is 3PA per 100 possessions, the y-axis is 3P%, and the area/color of the points represents value added. Looking at the graph, you see a lot of the usual suspects, guys who are renowned for their marksmanship. The two biggest surprises are Dame, because he carries such a large load (however, his gravity this season was something else, and he definitely belongs), and George Hill, who shot a ridiculous 46% from 3 this season while taking a measly 6.3 3’s per 100 possessions. His unsustainable percentage oversells his middling volume, especially when you take into account that 93% of his shots were either open or wide open. Here are the numerical values, if you prefer (3PA per 100 possessions, 3P%, Value Added):

Duncan Robinson - 13.7 - 44.6% - 1.918
J.J. Redick - 11.6 - 45.3% - 1.7458
Dāvis Bertāns - 13.9 - 42.4% - 1.4873
Seth Curry - 9.8 - 45.2% - 1.4602
Doug McDermott - 10.4 - 43.5% - 1.2844
Kyle Korver - 11.3 - 41.8% - 1.1074
George Hill - 6.3 - 46.0% - 1.0143
Bojan Bogdanović - 10.7 - 41.4% - 0.9844
Joe Harris - 9.0 - 42.4% - 0.963
Damian Lillard - 13.0 - 40.1% - 0.9425
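For anyone who wants to check the arithmetic, here is a short Python sketch that recomputes the list above from the 3PA and 3P% columns and sorts by value added. The variable names are illustrative, and the league EFG% of 0.529 is an assumption inferred from these rows rather than a number quoted in the post.

```python
# Assumption: league-wide EFG% implied by the listed values, not quoted directly.
LEAGUE_EFG = 0.529

# (player, 3PA per 100 possessions, 3P% as a fraction, listed value added)
ROWS = [
    ("Duncan Robinson", 13.7, 0.446, 1.918),
    ("J.J. Redick", 11.6, 0.453, 1.7458),
    ("Dāvis Bertāns", 13.9, 0.424, 1.4873),
    ("Seth Curry", 9.8, 0.452, 1.4602),
    ("Doug McDermott", 10.4, 0.435, 1.2844),
    ("Kyle Korver", 11.3, 0.418, 1.1074),
    ("George Hill", 6.3, 0.460, 1.0143),
    ("Bojan Bogdanović", 10.7, 0.414, 0.9844),
    ("Joe Harris", 9.0, 0.424, 0.963),
    ("Damian Lillard", 13.0, 0.401, 0.9425),
]


def value_added(three_pa_per_100, three_pct, league_efg=LEAGUE_EFG):
    """3PA * ((3P% * 1.5) - EFG%)"""
    return three_pa_per_100 * (three_pct * 1.5 - league_efg)


# Recompute each row, then rank from most to least value added.
recomputed = [(name, value_added(pa, pct)) for name, pa, pct, _ in ROWS]
ranked = sorted(recomputed, key=lambda row: row[1], reverse=True)

for rank, (name, value) in enumerate(ranked, start=1):
    print(f"{rank:>2}. {name:<18} {value:.4f}")
```

With that assumed EFG%, every recomputed value matches the listed one to four decimal places, and the sorted order is the same as the order shown above.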

Before I show the full graph, let me share a couple of small details. Firstly, I got my player list from here, and bball-reference gave me 259 filtered players, including traditional centers. Because their specific data points distorted the graph (ahem, Dwight Howard and his 60% from 3) and added unnecessary noise, I (for the time being) removed anyone averaging fewer than 1 3PA per 100 possessions, all of whom were traditional bigs (plus TJ McConnell). I’ll hold my tongue on a tangent for a second, and instead just get the graphs out (if you want to look at my raw, unorganized spreadsheets, they’re right here). The first graph will show 3PA vs 3P%, with size/color indicating value added, while the second will show 3PA vs Value Added (with Value Added also represented by color intensity for extra visual clarity).

Two small things to note: the dots with no circle represent <0 value, and the top-right and bottom-left dots are fictitious data points used to stretch the graph, because you can’t set the ranges with the tool that I used.

Once again, ignore the corners
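For reference, the volume filter applied before plotting can be sketched in a few lines of Python. This is only an illustration of the rule (drop anyone under 1 3PA per 100 possessions), not my actual spreadsheet workflow; the `PlayerLine` type, the function name, and the "Placeholder Big" row are all hypothetical, with the last one being a made-up low-volume line rather than real data.

```python
from typing import List, NamedTuple


class PlayerLine(NamedTuple):
    name: str
    three_pa_per_100: float  # 3-point attempts per 100 possessions
    three_pct: float         # 3P% as a fraction


MIN_3PA_PER_100 = 1.0  # the cutoff described in the text


def keep_for_plot(players: List[PlayerLine],
                  min_attempts: float = MIN_3PA_PER_100) -> List[PlayerLine]:
    """Keep only players at or above the attempts cutoff."""
    return [p for p in players if p.three_pa_per_100 >= min_attempts]


sample = [
    PlayerLine("Duncan Robinson", 13.7, 0.446),  # from the top-10 list
    PlayerLine("George Hill", 6.3, 0.460),       # from the top-10 list
    PlayerLine("Placeholder Big", 0.2, 0.600),   # hypothetical, illustration only
]

kept = keep_for_plot(sample)
print([p.name for p in kept])
```

The placeholder row shows why the cutoff helps: a handful of makes on almost no attempts produces an eye-catching percentage that stretches the y-axis without saying anything about spacing.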

[Cut back to my tangent on filtering traditional centers]

Whenever I attempt to encompass rim gravity, centers will definitely come back into the mix, but this adjustment got me thinking about whether these centers are accurately represented by the metric as things currently stand. So let’s officially pivot to the next section of the article:

The Metric’s Shortcomings

The most important thing to keep in mind when viewing any advanced stat is what the stat is actually measuring (input), how it’s measuring it (formula), and therefore what the output can tell you. In other words, context. Many players in the data set technically subtract value, but what exactly does that tell us, especially when compared to players who don’t shoot at all? For example, should the list be interpreted as non-spacers and spacers, with non-spacing being a categorical variable and spacing a quantitative one (although some players who add less than -0.3 in the data are still thought of as capable shooters)? I’m not so sure that being a non-spacer scales the way being a spacer does, but beyond that, there is positional context to consider. Being a non-shooting big is a lot different from being a non-shooting shooting guard, because shooting guards are expected to, well, shoot. Non-shooting bigs pressure the rim (in theory), which also draws the defense and creates space, while range-challenged guards cannot say the same. But as long as we specifically measure spacing via shooting, do non-shooting 5’s deserve their 0’s, since their role doesn’t really ask for shooting and they have neither a positive nor a negative impact?

And what about a player like James Harden, who only adds 0.05495 despite being a high-volume, above-average 3-pt shooter? Here, I will yet again direct you to role. Harden plays a largely on-ball role, so even if he has the potential to add a lot of value via spacing, he doesn’t get the chance to, because the ball is often in his hands (this actually works out intuitively: a lot of on-ball stars’ shooting ability is undersold by their percentages, which lowers their value here, but that fits because they end up spending less time actually spacing the floor anyway). Now, time to address (the rest of) my own criticisms:

First off, the role-centric shortcomings are pretty irrelevant, because I conceived this idea as a complement to a lineup tool that finds similar lineups, where the other two measurements are Offensive Load and Rim Deterrence. The tool in part tries to find similar lineups based on the distribution of roles, so as long as the spacing values are consistently misrepresentative for all players who perform a certain role (such as on-ball creation), then there’s really nothing wrong here. To every other criticism: noted.

Plans for Part 2:

First on my to-do list is to create a measure of rim gravity. I may add it to the current spacing metric, or make a separate metric; only time will tell. Second, I will look at how I could work the amount of time spent on-/off-ball into my formula, essentially embracing the underlying bias of player roles that is present. Third (and I’m really in no rush to do this), I will look at smoothing out the rough edges of the league average EFG% by filtering out shots where spacing did not play a part. I’ll also consider any feedback I get, as some alternative perspectives could be huge.

I really hope you enjoyed reading this as much as I enjoyed writing it. Follow me on Twitter @thenbaundrgrnd to stay updated on my work, show support, or both.

Noah Leeman
