Last week, Ninjames put together an excellent look at what appears to be a strong class of pass rushing linebackers in the 2011 NFL Draft. It's fitting then that Football Outsiders produced their annual SackSEER results for the incoming rookie class of pass rushers. For those who've never heard of SackSEER, it is a model meant to project the sack totals of highly drafted 4-3 defensive ends and 3-4 outside linebackers in their first five years in the NFL. Nathan Forster, the man who developed the model, indicated it is composed of four metrics:
[T]he prospect's vertical leap, short shuttle time, per-game sack productivity in college (with certain adjustments), and missed games of NCAA eligibility.
Based on the metric, SackSEER apparently would have predicted success for Mario Williams and Shawne Merriman and predicted struggles for Robert Ayers and Jarvis Moss. As Forster admitted, last year was not a good one for SackSEER's projections due to struggles by Jerry Hughes and the success of Carlos Dunlap and Jason Pierre-Paul. Of course it's only been one year, so we'll see how those players move on.
I spoke with Florida Danny about SackSEER and he took some time to put together his assessment of the model, looking at both positives and negatives. I've posted his comments after the jump.
Even with last year's struggles, I still wanted to take a look at SackSEER for this year, given that a guy like Von Miller could very well end up in the 49ers' lap at the number seven pick. I mention Miller specifically because SackSEER projects him as the best of the bunch at the top of the draft. They project Miller to have 36.4 sacks through his first five seasons, which averages out to 7.28 per season.
In addressing Miller, Forster acknowledges that the projection doesn't address some of the issues that concern folks, such as:
[It] does nothing to address concerns about Miller's size and his ability to hold the point against the run. Size at the edge rusher position has been tricky. Prospects with good size and good SackSEER projections rarely bust, and there have been plenty of players such as Aaron Maybin and Manny Lawson who end up playing down to their size despite impressive athleticism. However, some of the best edge rushers have been undersized, and often severely so. Most recently, Clay Matthews took the NFL by storm despite weighing only 240 pounds at the Combine, and Trent Cole and Robert Mathis have been outstanding despite being well south of the 240-pound mark on draft day.
Although an injury or struggles against the run could certainly derail Miller's career, Miller has the potential to become an elite player at his position.
Outside linebacker Justin Houston is projected with the second most sacks over the next five years. The rest of the top prospects don't approach the level of awfulness projected for a guy like Jason Pierre-Paul. The system views this as a solid class at the top.
The system has one potential sleeper in Nevada's Dontay Moch thanks to his solid college production and vertical leap. They view him as worth a third round pick, which would bring him down from some of the second round projections we've seen for him at times this offseason.
Once again, you can check out FO's SackSEER projections HERE, and view Florida Danny's thoughts on the model below. What do folks think of some of these projections?
1) With an R-squared of .42, meaning that the model explains 42% of the statistical variation in sack totals during a player's first 5 NFL seasons, I'm pretty confident that the 4 variables making up SackSEER are predictive above and beyond other variables that might be intuitive (e.g., 40 times, bench press, draft pick, etc.).
2) Relatedly, the 4 variables that make up SackSEER make perfect sense from a theoretical perspective:
a. Vertical jump, as Nate Forster mentions, is a measure of explosive strength in the lower body. Exploding off the line is exactly the kind of physical attribute you want to see from an edge rusher.
b. Short shuttle time, again as Nate mentions, is a measure of change-of-direction speed and hip flexibility. Again, these attributes make sense in terms of the physical movements required to be a good edge rusher.
c. Modified Sack Rate (SRAM) is an adjusted measure of college production, and we know that, in pretty much every type of task performance, past behavior is indicative of future behavior.
d. Missed games is a measure of both injury risk and amount of experience doing the very thing the player's going to be asked to do in the NFL. Again, the more experience you have at executing a task, the better you're generally going to be at that task in the future. Also, being injury-prone is obviously going to reduce productivity.
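For anyone wondering what an R-squared like .42 actually measures, here is a minimal Python sketch of the calculation. The sack totals and projections below are made up purely for illustration; they are not SackSEER data, and the function name is my own.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that `predicted` accounts for."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the real outcomes around their own average.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left over after using the model's projections.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical 5-year sack totals and projections (NOT real SackSEER numbers).
actual_sacks = [2.0, 5.0, 10.0, 20.0, 38.0]
projected_sacks = [8.0, 10.0, 12.0, 18.0, 27.0]

fit = r_squared(actual_sacks, projected_sacks)
print(f"R-squared: {fit:.2f}")
```

A value of 1 would mean the projections matched every player exactly, and a value of 0 would mean they did no better than guessing the average sack total for everyone.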
Here's what I don't like (and they're both methodological critiques based on the info Nate explicitly stated about his methods in FOA 2010):
1) Far more edge rushers taken in the draft have ended up on the lower end of the NFL productivity spectrum than on the higher end. In other words, the performance distribution isn't bell-shaped; it's skewed. Statistically speaking, simple linear regression isn't optimal in this situation because it ends up being far better at predicting the highly populated end of the spectrum than the sparsely populated end. Not surprisingly, this is exactly what Nate says happens with SackSEER. He should have applied some kind of mathematical transformation to the 5-year sack totals to bring them more in line with a bell-shaped distribution, which would make simple linear regression a better modeling choice.
2) To build the regression model, he used player data from 1999-2008. However, he didn't test the accuracy of the resulting model by trying to predict 5-year sack totals for players outside that 1999-2008 range. Therefore, the success of SackSEER might be idiosyncratic to the 1999-2008 data.
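To illustrate the first critique, here is a rough Python sketch of how a transformation can pull a skewed set of sack totals closer to a symmetric shape. The log transform is just one common choice and is my assumption; nothing in the FOA 2010 chapter says which transformation, if any, would suit the real data. The sack totals are invented for the example.

```python
import math


def skewness(values):
    """Population skewness: about 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


# Hypothetical 5-year sack totals: lots of low totals, a couple of stars.
raw_sacks = [0.5, 1.0, 1.5, 2.0, 3.0, 4.5, 6.0, 9.0, 20.0, 45.0]

# log(1 + x) compresses the big totals far more than the small ones.
logged_sacks = [math.log1p(s) for s in raw_sacks]

print(f"skewness before: {skewness(raw_sacks):.2f}")
print(f"skewness after:  {skewness(logged_sacks):.2f}")
```

The regression would then be fit to the transformed totals, with projections converted back to sacks afterward (here, with `math.expm1`).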
That said, the two critiques aren't necessarily big deals, for 2 reasons:
1. Nate could have actually done the things I critiqued him for not doing, but just didn't include a discussion of it in the FOA 2010 chapter for the sake of brevity and/or because of the non-technical nature of the audience.
2. Not testing the model on data outside of 1999-2008 could have been because of insufficient data. I mean, I don't know offhand how far back official Combine data goes. Also, from the research on college stats that I've done, I know there's a certain point around 1999 where it becomes prohibitively difficult to find accurate stats for players. Hell, I was looking at QBs, and was having difficulty finding stats. It could have easily been the case that finding stats for college edge rushers was next to impossible. But, like I said, he didn't discuss it in the FOA 2010 chapter, so I'm just left to assume he didn't even consider it. I could easily be wrong, though.
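For the curious, the kind of out-of-sample check described in the second critique could look something like the Python sketch below. It is a big simplification: it uses a single predictor (vertical leap) instead of SackSEER's four, every number is invented, and the 2005 cutoff is an arbitrary assumption. The idea is simply to fit the model on earlier draft classes and grade it on later ones it never saw.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def r_squared(actual, predicted):
    """Share of the variation in `actual` that `predicted` accounts for."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical prospects: (draft year, vertical leap in inches, 5-year sacks).
prospects = [
    (1999, 30.0, 6.0), (2000, 33.0, 14.0), (2001, 36.0, 25.0),
    (2002, 31.0, 9.0), (2003, 38.0, 30.0), (2004, 34.0, 15.0),
    (2005, 35.0, 22.0), (2006, 32.0, 8.0), (2007, 37.0, 24.0),
    (2008, 33.5, 18.0),
]

# Build the model on the earlier classes only, then hold out the rest.
train = [p for p in prospects if p[0] <= 2005]
holdout = [p for p in prospects if p[0] > 2005]

slope, intercept = fit_line([p[1] for p in train], [p[2] for p in train])
holdout_predictions = [intercept + slope * p[1] for p in holdout]
holdout_r2 = r_squared([p[2] for p in holdout], holdout_predictions)

print(f"hold-out R-squared: {holdout_r2:.2f}")
```

If the hold-out R-squared came in far below the R-squared on the data used to build the model, that would suggest the model's success is idiosyncratic to the original sample.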