AUTHOR'S NOTE: This is Part 2 of a 5-part series on predicting the career performance of NFL QBs. Today's posting is at 2 p.m. PDT because the 2011 NFL schedule is coming out at 4 p.m. For the rest of the week, though, I'll be posting a new installment every day at 4 p.m. PDT.
Yesterday, in Part 1 of this series, I highlighted the fact that, despite popular belief to the contrary, the Wonderlic does not predict the future NFL performance of highly drafted college QBs. I also mentioned my post 2 years ago about the Lewin Career Forecast (LCF), wherein I noted that, although college games started (GS) and college completion percentage (Comp%) predicted QB performance, the margins of error for the LCF predictions were prohibitively large.
In times like these, when you're faced with numerous underwhelming results, it's usually best to take a step back and think about things from a more philosophical perspective. By this, I don't mean sitting down and reading Nietzsche or Lao Tzu. Rather, I mean trying to answer abstract questions like, "What exactly is the purpose of all this?" and "Are the things we're doing fulfilling that purpose?" That's what I'll attempt to do today. And just to be clear about this, I'm going to be waxing philosophical about using statistics to project QB (or player) performance, not team performance. You'll understand momentarily why I make this distinction.
After the jump, I kick it abstract...
FINDING ONE'S PURPOSE
I'm going to come right out and say it: the underlying (read: implicit) purpose of every player projection system I've seen is to make money. Make no mistake, either the producers are making money, the consumers are using the product to make money, or more likely both producers and consumers are in a symbiotic money-making relationship. You don't believe me? Here's some evidence:
- Football Outsiders (FO) sells their so-called KUBIAK player projections
- FO readers use FO's KUBIAK projections to gain an advantage in their fantasy football leagues (i.e., make money).
- CBS et al. sell their player projections
- CBS et al. readers use their projections to gain an advantage in their fantasy leagues (i.e., make money)
- NFL teams have statisticians who earn a salary developing math-based player evaluation systems
- NFL teams make more money when their math-based player evaluation systems lead to increased winning
- R.C. Fisher, who developed the fantasy-friendly QB projection system I mentioned yesterday, admits he's "not going into exact detail for many reasons, seeking fortune by having something no one else does is the main one."
Now, I'm not saying that making money is the primary purpose of these projection systems (except for Fisher's obviously); it's just one purpose that no one seems to explicitly articulate. Nor am I saying that wanting to make money is a bad thing. The US of A is still a capitalist country after all. (Please, no "we're all socialists now" comments in the comments section. Thanks in advance.) To reiterate, please do not construe this as me begrudging anyone for making money. My point here is a more methodological one. Namely, if it's true that the purpose of all these projection systems is to make money for someone, then the methods we use to develop such systems should be optimal for achieving that goal. Obviously, this means I'm of the opinion that they currently aren't. Well, more accurately, I think the methods behind the systems that focus on things other than winning fantasy leagues or prop bets aren't. For the rest of this post, I'll highlight some ways in which current QB projection systems don't match that purpose.
WHAT IS PERFORMANCE?
The first place where I've noticed the incongruity between purpose and method is in how several QB projection systems measure "performance." For instance, as I've already talked about at length, the LCF uses DYAR/G as its performance measure. Sure, DYAR/G is statistically correlated with FFPts/G, thereby allowing us to use it as a proxy for fantasy scoring, but why is that extra step necessary? If I want you to pay me for telling you how good a QB is going to be in his NFL career, and you're going to use that information to win money in your keeper league, I think the best way to fulfill our mutually capitalist purpose is to give you the projections in terms of FFPts/G. Similarly, if I'm going to develop a system that spits out these projections, I'm going to want it to optimally predict FFPts/G.
The same goes for Fisher's system. His measure of performance is a numerical rating that, given a certain threshold value, signifies that a QB is either going to be "elite" or "a bust." As a fantasy player, though, what am I supposed to do with this information? Create my personal cheatsheet by ranking QBs in terms of eliteness and bustworthiness? More importantly, does my fantasy league use a scoring system whereby "elite" QBs get a 5-point weekly bonus? The answers are "no" and "no," respectively. So, again, if Fisher's going to develop a system, which he freely admits is designed to make money, why not just have the performance projection be something that's actually applicable in money-making endeavors? Say, something - anything - that represents fantasy scoring.
In this regard, FO's KUBIAK projections for QBs, as well as those put out by the various websites, have it right. They've developed statistical models that specifically predict QB FFPts, and we can directly apply the projections spit out by those models towards the purpose of making money.
Of course, even the KUBIAK projections suffer from another performance-related issue: projecting total FFPts rather than FFPts/G. You might think this is splitting hairs, but it's statistically the case that absolute totals are dependent on the number of opportunities to amass said totals. After all, this is the very reason why FO doesn't, in principle, believe in basic "count stats" anywhere else in their work. Furthermore, the way most fantasy football leagues work is that you play one game per week, and your point total for that week is based on each player's point total for their game. Given this, and given that Gs and opportunities (e.g., attempts, receptions, etc.) are very, very highly correlated, the optimal thing to do is just predict a performance value that best corresponds with what the fantasy player needs to make money, i.e., FFPts/G.
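To make the totals-versus-per-game point concrete, here's a minimal sketch in Python. The two QBs and their numbers are entirely made up for illustration, and the `QBSeason` class is just a hypothetical container, not anything from KUBIAK or any other real system.

```python
from dataclasses import dataclass


@dataclass
class QBSeason:
    """Hypothetical container for one QB season (illustrative only)."""
    name: str
    games: int
    fantasy_points: float

    @property
    def points_per_game(self) -> float:
        # FFPts/G: the total divided by the number of games played.
        return self.fantasy_points / self.games


# Made-up numbers: QB A played a full season, QB B missed six games.
seasons = [
    QBSeason("QB A", games=16, fantasy_points=256.0),
    QBSeason("QB B", games=10, fantasy_points=200.0),
]

# Ranking by season total rewards the QB with more opportunities.
by_total = sorted(seasons, key=lambda s: s.fantasy_points, reverse=True)

# Ranking by FFPts/G reflects what a fantasy owner gets in a given week.
by_per_game = sorted(seasons, key=lambda s: s.points_per_game, reverse=True)

print([s.name for s in by_total])
print([s.name for s in by_per_game])
```

In this toy case the two rankings disagree: the QB with the bigger total is the weaker weekly starter, which is exactly why the choice of unit matters.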
Now, at this point, you might be asking yourself, "Danny, why are you focusing so much on fantasy football?" Well, the reason, not surprisingly, goes back to our original purpose. Specifically, if you think about the available applications for projection systems created outside of NFL team headquarters, you'll quickly realize that, pretty much, only 2 exist: fantasy football and sports betting. I mean, clearly, by the very nature of the amount of information and financial resources at their disposal, any projection system developed by an NFL team is going to be vastly superior to anything we can create on the outside (See "The Word 'Outsiders' Appears in 'Football Outsiders.'"). As much as Fisher might think there's a possibility he could sell his system to some NFL team, the fact of the matter is that he's far more likely to sell it to fantasy players (See "Name of Fisher's Site is Fantasy Football Metrics").
In terms of sports betting, although player prop bets do exist, the vast majority of them are game-specific, such that career-long projections and season-long projections are rendered irrelevant. Therefore, after a little musing, I'm of the opinion that, in terms of player projection, the money is in fantasy football. I emphasize player projection here because "figuring out what strategies and tactics win games" is a totally different animal, and people who can do it are actually attractive commodities to NFL teams. In addition, team projection is infinitely more suited to financial transactions situated in the Mojave desert.
WHAT PREDICTS PERFORMANCE?
Another place where methods run afoul of purposes is in the variables that QB projection systems have used to predict performance. For instance, I've mentioned before how the LCF's application is limited to QBs drafted in the first 2 rounds. Given that the LCF does not predict FFPts/G, FO does not sell the LCF projections, and no Vegas prop bets involve DYAR/G, it therefore follows that, with respect to my formulation so far in this post, the purpose of the LCF is not to make money via sports gambling or fantasy football. Furthermore, there's no way NFL teams are unaware that experience and accuracy are statistically related to QB performance, so no money to be made there either. Instead, the LCF is more of an academic exercise designed to provide a system whereby someone can statistically evaluate QBs before the draft, and make an educated guess about their future NFL careers.
If that's the case, then there's a glaring problem with how the LCF is supposed to be applied. You don't know before the draft which QBs are going to be taken in the first 2 rounds. You might know which ones will be drafted in the Top 10, but after that, it's a crap shoot. Just ask Aaron Rodgers and Brady Quinn. This "first 2 rounds" limitation is a specific critique that NN readers have raised in my previous posts about the LCF, and it's a valid one. Even I've twisted myself into circles trying to explain the logic of applying the LCF under this limitation.
So, given that we now consider the LCF to no longer be something designed for application before the draft, I'll introduce a predictor that correlates with QB career performance better than even college GS and college Comp%, and turns the LCF into a potential money-maker: namely, the pick number at which the QB was drafted. Well, actually, I'll save that for tomorrow's post. The point I'm making here, though, is that if we identify the purpose of the system, then we can (and should) optimize the methods used to develop the system. If the LCF is an academic exercise, then the after-the-draft, "first 2 rounds" limitation doesn't fit. The sample has to include all QBs for the method to be right. However, if it's a money-making exercise - fantasy drafts don't happen until August, after all - then we can either (a) keep the "first 2 rounds" thing because it's now a "feature" rather than a "limitation," (b) start throwing in all kinds of other variables that will optimize the precision of our predictions because we've loosened the constraints, or (c) both.
Another predictor-related methodological snafu has to do with the fallacy of multiple endpoints. I've discussed it before here on NN, and Brian Burke of Advanced NFL Stats has used it to take down FO's "Curse of 370." Basically, the fallacy of multiple endpoints is when you place paramount importance on a specific numerical threshold when, in actuality, multiple other thresholds (i.e., endpoints) would work just as well. In the case of the Curse of 370, Burke's argument was that 370 is not some magic number, and proved it by showing that there'd be no curse at all if you simply decreased the threshold to 368 (curse shifts away from high-carry RBs) or raised it to 373 (curse shifts towards low-carry RBs). This same sort of fallacy applies whenever you hear an announcer say, "Team X is Record A-B when they give up more than Y points in a game." There's absolutely nothing magical about Y, and, given the time, you could easily find point totals other than Y that result in a similar record. I mean, it's not like the team just quits playing once they've given up Y points. Hopefully, you see where I'm going with this.
Both the LCF and the Rule of 26-27-60 have applications that rely on magical thresholds. Do QBs become Neo in their 27th (or 37th per the LCF) start? Do a couple of failed 4th-quarter Hail Marys that drop a QB's completion percentage to 59.4% signify the end of his chances for NFL stardom? Does a QB stop going to practice after getting that "The 8th month of the year is _______?" question wrong? Of course not. Rather, these thresholds exist simply because people noticed a convenient value for each predictor that made performance projections look good.
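If you want to check a magic number for yourself, the test is simple: slide the threshold a few units in either direction and see whether the effect survives. Here's a rough sketch in Python. The carry totals and decline flags below are invented purely for illustration; they are not Burke's data and don't reproduce his analysis, and `decline_rate_gap` is a hypothetical helper of my own naming.

```python
def decline_rate_gap(seasons, threshold):
    """Decline rate at or above the threshold minus the rate below it.

    `seasons` is a list of (carries, declined_next_season) pairs.
    A large positive gap is what a "curse" at that threshold would look like.
    """
    above = [declined for carries, declined in seasons if carries >= threshold]
    below = [declined for carries, declined in seasons if carries < threshold]
    if not above or not below:
        raise ValueError("threshold leaves one group empty")
    # True counts as 1 and False as 0, so sum/len is the decline rate.
    return sum(above) / len(above) - sum(below) / len(below)


# Invented data: (carries, did the RB decline the next season?)
seasons = [
    (340, False), (355, True), (362, False), (368, False), (369, False),
    (370, True), (371, True), (372, True), (373, False), (380, False),
    (392, True),
]

# Compare the "magic" threshold with two nearby alternatives.
gaps = {t: decline_rate_gap(seasons, t) for t in (368, 370, 373)}
for threshold, gap in gaps.items():
    print(threshold, round(gap, 3))
```

With these made-up numbers, the gap is largest at 370, shrinks at 368, and flips sign at 373. An effect that only shows up at one hand-picked cutoff is a property of the cutoff, not of the players.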
It's not that using thresholds, per se, is bad or necessarily leads to the fallacy of multiple endpoints. It's just that, with respect to the LCF and the Rule of 26-27-60, the thresholds have replaced the predictors altogether in terms of applying the projections. No one besides Dave Lewin says, "X starts + Y Comp% = Projected DYAR/G" when applying the LCF. Instead, because of the thresholds, the vast majority of people say, "Oh. He had 34 starts? Well, he's going to be a bust." Relating this back to the philosophical stuff, bad methods lead to inaccurate predictions, and people will stop buying them after a while if they're perpetually bad. Using magic numbers in the way I just described is a bad method. What's a better method? Well, if you think 37 GS and 60% Comp is actually meaningful for some reason, then see whether or not belonging to 1 of the 4 possible combinations of those 2 stats predicts different levels of performance; either that or just abandon the thresholds altogether. Garbage in, garbage out.
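Here's one way the "4 possible combinations" idea could look in practice, sketched in Python. Everything about it is an assumption on my part: the QB rows are invented, the cutoffs are treated as inclusive (at least 37 GS, at least 60% Comp), and the performance column could be any per-game measure you like.

```python
from collections import defaultdict
from statistics import mean

GS_CUTOFF = 37       # college games started (inclusive cutoff is an assumption)
COMP_CUTOFF = 60.0   # college completion percentage (inclusive, also assumed)


def group_label(games_started, comp_pct):
    """Assign a QB to 1 of the 4 combinations of the two thresholds."""
    gs = "high GS" if games_started >= GS_CUTOFF else "low GS"
    cp = "high Comp%" if comp_pct >= COMP_CUTOFF else "low Comp%"
    return f"{gs} / {cp}"


def mean_performance_by_group(qbs):
    """Average the performance measure within each of the 4 groups."""
    groups = defaultdict(list)
    for games_started, comp_pct, performance in qbs:
        groups[group_label(games_started, comp_pct)].append(performance)
    return {label: mean(values) for label, values in groups.items()}


# Invented rows: (college GS, college Comp%, NFL performance per game)
qbs = [
    (40, 62.5, 15.0), (45, 64.0, 17.0),   # high GS, high Comp%
    (38, 57.0, 12.0), (41, 58.5, 10.0),   # high GS, low Comp%
    (25, 63.0, 13.0), (30, 61.0, 11.0),   # low GS, high Comp%
    (20, 55.0, 9.0), (28, 59.4, 7.0),     # low GS, low Comp%
]

summary = mean_performance_by_group(qbs)
for label, avg in sorted(summary.items()):
    print(label, avg)
```

If the four group averages don't differ in any meaningful way on real data, that's a strong hint the thresholds aren't doing any work and the raw predictor values should be used instead.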
Just to summarize my argument a little bit, it basically goes like this. Whether we like it or not, the audience for QB performance projections uses these projections to make money. The vast majority of the time, we enter into a trade whereby the audience gives us money, and we give them our projections. There is no market for selling our projection wares to the NFL because they already have better wares than we could ever develop. In this reality, we should therefore tailor our product to meet consumer need. In order to do so, we need to use the statistical methods that are most relevant to how our consumers actually apply the projections. This means that the optimal way to develop a model for QB projection would be to:
- Use FFPts/G as the unit of measurement for our performance predictions
- Abandon the idea that we're focused on projecting QBs before the NFL draft, and instead focus on projecting them before fantasy football drafts, thereby eliminating unnecessary model limitations.
- Abandon the use of thresholds or turn our predictor values into some kind of group-membership value
After this little mental exercise, I think that doing these things will meet everyone's needs. Better methods will lead to better predictions, and people on both sides of the producer/consumer divide will end up better off. To me, that's preferable to the current state of QB projection, wherein both sides are getting very little value. Tomorrow, I'll move things out of the abstract by concretely demonstrating what I'm talking about.