Back in October, I introduced the Lewin Career Forecast (LCF) to Niners Nation (NN). To put it mildly, the post stirred up some lively debate, and signaled the beginning of a still-raging internecine battle on NN over the appropriateness of football stat analysis. Just to refresh everyone's memory, the LCF, which was developed by David Lewin of Football Outsiders (FO), predicts a QB's NFL performance based on 2 of his college stats: games started (GS) and completion percentage (COMP%). A couple of other ancillary features of the LCF are that (a) it's most appropriate for predicting the NFL performance of 1st- and 2nd-round QB picks, and (b) 37 college GS and a 60% college COMP% are the implied statistical benchmarks for success.
Because of its (over)simplicity, the LCF has received a healthy amount of criticism since it was unveiled in Pro Football Prospectus (PFP) 2006. Whether we're talking about statistical modeling or plain ol' common sense, it just can't be right that predicting performance for an NFL QB only requires knowing 2 things about him (3 if you count his draft round). In other words, QB performance is way too complicated to be that simple. Well, at least that's what the critics say. There are other arguments against the LCF, but I'll get into those a little later.
So, I figured that, because the draft is tomorrow, and because the "QB at #10" rumors seem to be accelerating, now would be as good a time as any for me to revisit the LCF and evaluate - from a statistical perspective - just how useful it is to the average NFL fan (Warning: Just because I'm a stats guy, don't assume I'm going to conclude that LCF is über-useful). Therefore, in this article, I'm going to do 4 things:
- Discuss the statistical and non-statistical positives and negatives of the LCF
- Give a statistical rationale for a solution that addresses the negatives
- Detail my solution
- Apply my solution to the top QB prospects in the 2009 NFL Draft
After the jump, I'll tackle the LCF...
POSITIVES AND NEGATIVES
In order to explore this topic, we need some more background info about the LCF (Note: I'm going to try as hard as possible to write the next couple of paragraphs in English).
Basically, to come up with the LCF, Lewin first thought of as many college stats as he could that might influence how a QB is going to perform at the professional level. Some of these were performance-based (e.g., COMP%), while others were more trait-like (e.g., height). To measure "performance at the professional level," he chose defense-adjusted points per game above replacement (DPAR/G). DPAR was the FO predecessor to DYAR, which is a stat most of you should be familiar with if you've read my previous articles. After choosing DPAR/G as his performance measure, he got all of the relevant stats for every QB drafted from 1997-2006, and looked for statistically significant relationships between these stats.
Based on the result that GS and COMP% were the stats most related to DPAR/G, and that the DPAR/G relationships for other stats seemed to depend on GS (e.g., a QB who starts more games generally throws for more yards and TDs by definition), he ran an analysis called multiple linear regression (MLR) to come up with a simple equation that predicts DPAR/G from GS and COMP%. As part of this analysis, he ran an MLR for each round of the draft, and found that the 1st and 2nd rounds yielded the most accurate predictions. Now, he's able to simply plug in any college QB's GS and COMP% stats, and get a predicted value for DPAR/G, which he calls the LCF.
As for this magical equation he came up with, think of it in terms of fantasy football (FF). To figure out how many points your QB scored in a given week according to standard FF scoring, you just divide his passing yards (PAYDs) by 25, add to that his passing touchdowns (PATDs) multiplied by 4, and then subtract his interceptions (INTs) multiplied by 2. Really, all you're doing here is applying the following equation to the QB's stats: FFPts = (.04 x PAYDs) + (4 x PATDs) - (2 x INTs). The MLR-based equation for LCF is just this sort of thing.
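For the code-inclined, here's the FF scoring rule from the paragraph above as a tiny Python sketch. The function name and the sample stat line are made up for illustration; only the scoring weights come from the text.

```python
def fantasy_points(pass_yards: float, pass_tds: int, interceptions: int) -> float:
    """Standard FF scoring for a QB as described above:
    1 point per 25 passing yards, 4 per passing TD, -2 per INT."""
    return pass_yards / 25 + 4 * pass_tds - 2 * interceptions


# Hypothetical stat line: 250 yards, 2 TDs, 1 INT
print(fantasy_points(250, 2, 1))  # 16.0
```

The LCF equation has the same shape: a handful of stats, each multiplied by a fixed weight, then summed.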
Now, onto the positives and negatives...
The first good thing about the LCF is that the inclusion of GS and COMP% makes perfect intuitive sense. The more GS a college QB has, the more experience he has playing QB. Obviously, if you're going to hire a guy to run your company (or offense), it's better that he/she has more experience than less. It means he/she is more likely to know what the (site decorum) he/she is doing. Similarly, the higher a college QB's COMP%, the better he is at hitting his target. Obviously, the point of the passing game in football - at any level - is to move the ball down the field by, you know, completing the pass to your teammate. If a guy can't hit the broadside of a barn in college after having thrown 1,000s of passes in his life up to that point, then he's probably not going to become a sharpshooter when he goes to the pros.
The second good thing about the LCF is more statistical in nature. When a statistician wants to determine whether someone did an analysis correctly and reported accurate results, they do what's called a replication study. Well, I did one of those for this article. Except for (a) using data only from 1st- and 2nd-round picks, and (b) using DYAR/G instead of DPAR/G, I did the same thing that Lewin did to come up with the LCF, i.e., I followed the steps I described a minute ago. To Lewin's credit, I got the same results he did: GS and COMP% had the strongest relationships with DYAR/G, and were the best MLR predictors of DYAR/G. Just for the sake of providing info, here are some of the other things that I found:
- The correlation between GS and DYAR/G was .71. The correlation between COMP% and DYAR/G was .35. Correlations range from -1 to 1, with the size of the number indicating how strong the relationship is. Positive correlations mean that one stat goes up as the other goes up, whereas negative correlations mean that one stat goes down as the other goes up. So, for all intents and purposes, these 2 correlations mean that a higher GS is twice as important as a higher COMP% when it comes to predicting higher DYAR/G.
- Because some people have made this argument, I looked at two variables related to the NFL situation a QB draftee was entering at the time he was drafted. Whether we're talking about the previous year, the year before that, the year before that, or the average of these 3 years, neither a team's wins nor its offensive DVOA when its QB arrived from college was related at all to his future NFL stats (COMP%, PAYDs, PATDs, INTs, TD-INT ratio, DYAR/G).
- For sh*ts and giggles, I also looked at whether coming from a BCS school has any impact. It doesn't.
- The actual equation for LCF that I arrived at in my replication study was: NFL DYAR/G = -232.16 + (2.75 x college GS) + (250.31 x college COMP%), with COMP% entered as a proportion (e.g., .629 rather than 62.9). [1] Plug in some QB numbers and try it for yourself if you wish.
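If you'd rather let a computer do the plugging in, here's a minimal Python sketch of the replication equation. The function name is my own invention, and the coefficients are the rounded ones printed above, so results can differ by a few tenths from the predictions tabulated later in this article.

```python
def lcf_dyar_per_game(college_gs: int, college_comp_pct: float) -> float:
    """Replication equation from the text.

    college_comp_pct is a proportion (0.629), not a percentage (62.9).
    """
    return -232.16 + 2.75 * college_gs + 250.31 * college_comp_pct


# Peyton Manning's college line as listed later in the article: 45 GS, 62.9%
print(round(lcf_dyar_per_game(45, 0.629), 2))
```

For Manning this lands within a couple of tenths of the 48.85 shown in the big table further down, presumably because the printed coefficients are rounded.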
Now for the negatives, of which there are 3...
First, as you can tell from what I just discussed, coming up with the LCF requires a good bit of statistical acumen. In terms of deriving it, the outcome variable, DYAR/G, is a little complicated, and the analysis itself requires statistical software and MLR training. In terms of applying it, you need to have the equation handy and probably a calculator. Because of these issues, LCF is not really practical for the average football fan. For example, if you're having a football argument with a friend about some random QB prospect, it's not really practical for you to keep a copy of Lewin's equation in your wallet and a calculator in your pocket (more likely your pocket protector) in order to rebut your friend's claim that "QB X is going to be awesome in the pros!" Likewise, you're probably going to elicit bewilderment rather than agreement if you base your rebuttal on DYAR/G. And even if you simply wait for Lewin's predictions in PFP or on FO's website, you're probably not going to have the predictions handy when arguing with your friend.
Second, as some people have pointed out, the fact that the LCF is only accurate for 1st- and 2nd-rounders poses a problem with respect to applying it. In order for it to actually be a "prediction," you have to know in advance whether or not a given college QB is going to be drafted in the 1st or 2nd round. This limits your prediction window to the month or so prior to the draft, in which draft pick projections reach a consensus among teams and pundits like Mel Kiper, Jr. And just so you know, this whole "diminishing accuracy by round" thing is real. When I ran my MLR analysis looking only at 1st-rounders, the model was about 5% better, which brings me to the last problem.
The third problem with LCF is by far the most concerning. When it comes down to it, statistics is about explaining the variation among things. It answers questions like, "Why does Jane have an IQ 20 points higher than John?" Jane's IQ varies from John's by 20 points, and we want to know why. In the context of the LCF, this example can be rephrased, "Why does Peyton Manning (the 1st pick in 1998) have an NFL DYAR/G 161 yards higher than Ryan Leaf (the 2nd pick in 1998)?" More generally, the question is, "Why do certain NFL QBs have a higher DYAR/G than others?" The LCF answers these latter 2 questions by saying, "Because of the differences between their GS and COMP% in college."
When you run an MLR analysis, the results tell you a lot more than the multipliers you need to get your prediction. The most important of these supplementary results is called explained variance, which, in the context of LCF, tells us how much of the DYAR/G variation between top-2-round QBs is explained by their college GS and COMP% stats. In my replication of Lewin's analysis, that explained variance value was 56.2%. In other words, over 40% of the DYAR/G variation between QBs was not explained by GS and COMP%. While 56.2% isn't shabby in MLR, especially given the small sample size (n = 35), there's still a veritable sh*t-ton of variation for which LCF has no answer.
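For anyone who wants to see where an explained-variance number comes from, here's a minimal Python sketch. It takes the actual DYAR/G and LCF-predicted DYAR/G values for the 35 QBs from the big table later in this article and computes the share of variation the predictions capture. The names (PAIRS, explained_variance, adjusted) are mine, and the adjusted version is included only on the assumption that a sample-size-corrected figure may be what the stat software reported; the text doesn't say which kind the 56.2% is.

```python
# (actual DYAR/G, LCF-predicted DYAR/G) for the 35 QBs, copied from the
# table later in this article.
PAIRS = [
    (99.23, 48.85), (62.87, 22.13), (43.73, 48.61), (55.92, 66.83),
    (47.93, 36.14), (53.82, 66.58), (25.89, 25.12), (25.91, 35.14),
    (13.24, 37.13),                                      # Group A
    (67.66, 28.36), (63.22, 34.58), (33.86, 48.57), (15.34, 30.08),
    (-5.21, -2.18), (1.47, 13.85), (-12.20, 24.84),      # Group B
    (36.09, -12.05), (34.14, 11.92), (-20.73, 1.70), (-6.19, 8.40),
    (-15.65, -3.57), (0.70, 10.40), (-47.25, -3.05), (-9.71, 8.40),  # Group C
    (-8.60, -17.10), (-6.39, -33.07), (-4.05, 3.14), (0.95, -10.85),
    (10.24, -21.08), (-16.45, -13.34), (3.48, 0.86), (-62.24, -30.09),
    (-56.73, -38.31), (-11.23, -24.84), (-43.00, -31.59),  # Group D
]


def explained_variance(pairs):
    """1 minus (unexplained variation / total variation)."""
    actual = [a for a, _ in pairs]
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in pairs)
    return 1 - ss_resid / ss_total


def adjusted(r2, n, k):
    """Shrink r2 to account for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r2 = explained_variance(PAIRS)
print(f"explained variance: {r2:.1%}")
print(f"adjusted for n=35, 2 predictors: {adjusted(r2, len(PAIRS), 2):.1%}")
```

Because the table values are rounded, expect the output to land near, not exactly on, the figure quoted above.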
Going back to the previous example, the LCF basically says that it can account for about 90 yards (or 56.2%) of the 161-yard difference between Manning and Leaf, but has no clue about the other 70 or so yards. Indeed, if you plug their college numbers into the equation I gave you earlier, you get a predicted DYAR/G difference of 78.94 (Manning = 48.85; Leaf = -30.09), which is far below their actual 161-yard difference. You might say, "Well, it still accounted for half of the difference." My response would be this: Leaf's legacy would be considerably different if he had ended up being only 80 DYAR/G worse than Manning. In fact, trailing Manning's actual 99.23 by only 78.94 would raise his DYAR/G from -62.24 (worst among the draftees) to 20.29 (in the neighborhood of Jake Plummer).
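Here's the Manning-vs-Leaf arithmetic spelled out as a short Python sketch. The four numbers are taken from this article (actual and LCF-predicted DYAR/G); the variable names are mine.

```python
# Actual and LCF-predicted DYAR/G, as reported in this article
manning_actual, leaf_actual = 99.23, -62.24
manning_pred, leaf_pred = 48.85, -30.09

actual_gap = manning_actual - leaf_actual    # how far apart they really were
predicted_gap = manning_pred - leaf_pred     # how far apart the equation put them
share = predicted_gap / actual_gap           # fraction of the real gap predicted

print(f"actual gap:    {actual_gap:.2f}")
print(f"predicted gap: {predicted_gap:.2f}")
print(f"share of the actual gap predicted: {share:.0%}")
```

For this particular pair, the equation gets about half of the real gap, a little under the 56.2% it explains across all 35 QBs.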
And if you thought that wasn't enough, there's one more variation-related problem with LCF: variation in the multipliers themselves. Recall that what you get out of an MLR analysis is a bunch of numbers you have to multiply your stats by to get your predicted DYAR/G. It turns out that these multipliers, called parameters, are estimates just like your DYAR/G prediction is an estimate. In other words, they're statistics-based educated guesses. Think of political polls for this one. Just like every poll has a margin of error, every one of the MLR parameters has a margin of error. Just like the true value for a candidate's vote percentage is somewhere within that margin, the true value for the MLR parameter is somewhere within that margin. Obviously, the goal of estimation, whether we're talking QB performance or political polls, is to minimize your margin of error as much as possible. Later on, I'll tell you how that's done; though I bet you already have a clue.
So what does this "margin of error" problem mean for LCF? First, it means that Lewin gives you his LCF estimate, but doesn't give you his margin of error, which is the same thing as a pollster reporting their prediction for a candidate's vote percentage, but not telling you their margin of error. Second, and much more importantly, it turns out that the amount of parameter estimate variation for the replication MLR I did was huge. Here's a perfect example that is relevant for this year's draft:
- According to scout and pundit consensus, Matt Stafford, Mark Sanchez, and Josh Freeman are going to be the only QBs taken in the first 2 rounds of the 2009 draft. Stafford's stats were 33 GS and 56.9% COMP% (or .569 for the purposes of LCF). Sanchez's stats were 15 GS and .643 COMP%. Freeman's stats were 35 GS and .591 COMP%.
- Remember that the MLR parameter estimates for GS and COMP% (from the earlier equation) were 2.75 and 250.31, respectively. When you take each of their margins for error into account, the true GS parameter value is anywhere from 1.81 to 3.68, while the true COMP% parameter value is anywhere from 57.54 to 443.08.
- Without going into detail here, know that the first number in the equation (-232.16), which is called the intercept, ranges from -349.26 to -115.07 when you take its margin of error into account.
- Let's see what the LCF predictions are when the parameter estimates in the equation are exactly right.
- Let's also see what the LCF predictions are when the true parameters for GS, COMP%, and the intercept are all at the low end of their respective ranges; for example, 2 for GS, 60 for COMP%, and -340 for the intercept. In other words, let's use the following "low but just-as-likely" equation for LCF: DYAR/G = -340 + (2 x GS) + (60 x COMP%).
- Let's also see what the LCF predictions are when the true parameters for GS, COMP%, and the intercept are all at the high end of their respective ranges; for example, 3.5 for GS, 440 for COMP%, and -120 for the intercept. In other words, let's use the following "high but just-as-likely" equation for LCF: DYAR/G = -120 + (3.5 x GS) + (440 x COMP%).
Below is a table showing the differences between DYAR/G predictions using the 3 different LCF equations (low-end, exact, and high-end):
College QB | Low-End LCF | Exact LCF | High-End LCF
Stafford | -239.86 | 0.88 | 245.86
Sanchez | -271.42 | -30.22 | 215.42
Freeman | -234.54 | 11.88 | 262.54
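If you want to check the table yourself, here's a minimal Python sketch that runs the 3 prospects through the 3 equations given above. The dictionary and function names are my own; the coefficients and college stats are the ones quoted in the text.

```python
# College stats quoted above: (games started, completion proportion)
PROSPECTS = {
    "Stafford": (33, 0.569),
    "Sanchez": (15, 0.643),
    "Freeman": (35, 0.591),
}

# (intercept, GS multiplier, COMP% multiplier) for each equation in the text
EQUATIONS = {
    "low": (-340.0, 2.0, 60.0),
    "exact": (-232.16, 2.75, 250.31),
    "high": (-120.0, 3.5, 440.0),
}


def predict(equation, gs, comp):
    """Apply one linear equation to a QB's college stats."""
    intercept, gs_mult, comp_mult = equation
    return intercept + gs_mult * gs + comp_mult * comp


for name, (gs, comp) in PROSPECTS.items():
    row = {label: round(predict(eq, gs, comp), 2) for label, eq in EQUATIONS.items()}
    print(name, row)
```

The low-end and high-end columns reproduce the table to the penny. The "exact" column comes out a few tenths off (e.g., about 1.0 instead of 0.88 for Stafford), presumably because the printed coefficients are rounded.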
So basically, using 3 different-but-equally-likely LCF equations, it's just as likely that all 3 QBs are going to be 4 times worse than Leaf in the NFL as it is that they're going to be about 3 times better than Manning. Here's the same table using the range of parameter estimates in an LCF equation that predicts NFL COMP% instead of DYAR/G:
College QB | Low-End LCF | Exact LCF | High-End LCF
Stafford | 7.98% | 55.66% | 93.72%
Sanchez | 5.86% | 53.98% | 92.44%
Freeman | 8.82% | 57.37% | 96.28%
Not all is lost here, though. If you're arguing with your friend about how good Matt Stafford is going to be as a pro, the odds are great that the argument will end when you say, "I think he's going to have a completion percentage somewhere between 8% and 94%." Well, either the conversation will end or your friend will come back with the proverbial, "No sh*t, Sherlock!" Remember, although the exact LCF is obviously the best estimate when it comes to suppressing this reflex, it's just as likely from a statistical perspective that the low-end LCF and high-end LCF predictions end up being right!
I want to make something very clear here before moving on to my potential solution to these problems. My discussion of LCF negatives is in no way meant to suggest that the LCF is useless. Rather, my purpose here is to make it more useful for the average football fan; basically to remove the mystery that surrounds it. Furthermore, I am not saying that Lewin and FO are trying to be deceptive about variability. On the contrary, FO freely acknowledges the variability inherent in their statistics and predictions. The fact of the matter is that football stat analysis is prone to vast amounts of variability, a reality that I'll explore a little bit later. What's important to understand here is that the variability problem with LCF is a byproduct of football stats themselves, not Lewin's or FO's desire to assert certainty where there's uncertainty.
BOTTOM LINE: Here are the main points you should take away from my discussion of the LCF's strengths and weaknesses:
- The LCF is a DYAR/G prediction based on a QB's college GS and COMP%. It's based on a stat-analysis-derived equation that quantifies the relative influence of these 2 factors.
- The LCF is most accurate for QBs drafted in the 1st and 2nd rounds.
- The inclusion of GS and COMP% in the LCF makes intuitive sense from a non-statistical perspective.
- The inclusion of GS and COMP% in the LCF is justified from a statistical perspective, as is the exclusion of other seemingly important college stats.
- Based on my replication study, a QB's predicted NFL DYAR/G = -232.16 + (2.75 x college GS) + (250.31 x college COMP%).
- The LCF has an outcome variable, DYAR/G, that might make the average football fan cross-eyed.
- Based on LCF, over 40% of "NFL success" for a QB has nothing to do with his college GS and COMP%.
- The margin of error for an LCF prediction is very big.
- PFP is not engaged in statistical money laundering with respect to LCF.
SOLUTION IDEA
OK, so the main problems with the LCF - in terms of everyday application - are that it uses a potentially confusing measure of "NFL success," and that its prediction equation yields wildly different results when you account for the margin of error. If this is the case, it seems to me that the way to improve the LCF for average-football-fan consumption involves (a) using a more widely known outcome measure, and (b) side-stepping the prediction equation.
In terms of NFL outcome measures, I've mentioned several already in this article: COMP%, GS, PAYDs, PATDs, INTs, and TD-INT ratio. As college COMP% and GS are the two stats that make up the LCF, it makes sense to try NFL COMP% and GS as replacements for DYAR/G. Also, in the same way that college PAYDs, PATDs, and INTs are dependent on GS, the NFL versions of these stats are also dependent on GS; as is NFL TD-INT ratio. Therefore, GS is a viable candidate both in its own right and as a proxy for PAYDs, PATDs, INTs, and TD-INT ratio.
To evaluate whether NFL COMP% and GS are good replacements for DYAR/G, we need to find out how much of the variance in COMP% and GS - as compared to DYAR/G - is accounted for by college COMP% and GS. Here's a table summarizing the results:
NFL Performance Measure | Performance Variance Explained
DYAR/G | 56.2%
COMP% | 45.6%
GS | 20.9%
So it turns out that college COMP% and GS actually account for less of the variation in NFL COMP% and GS than they do for DYAR/G. This means that NFL COMP% and GS are worse than DYAR/G when it comes to choosing an outcome measure for a modified version of the LCF. NFL GS in particular is woefully bad because, as the table shows, almost 80% of a QB's NFL GS has nothing to do with his college COMP% and GS.
Because NFL GS is so bad, perhaps one of those outcomes we tossed out for being dependent on GS might work better. And here's a thought: What if we used NFL FFPts/G as a way to incorporate these outcomes into one summary measure? Well, that's exactly what I did. So how much of the variation in NFL FFPts/G is accounted for by college COMP% and GS? The answer is 48.8%, which means that - although still not as good as DYAR/G - FFPts/G is actually a more useful measure of NFL success than are COMP% and GS. So the moral of the story here is that, if we accept the cost of losing about 7 percentage points of explained variation (56.2% down to 48.8%), we can gain the benefit of using a more widely known outcome measure. Going back to my QB argument example, your friend will probably know what 10 FFPts/G means in terms of performance, while he probably won't have a clue what 10 DYAR/G means; so that's definitely a plus.
Now, how can we side-step the prediction equation altogether so that we don't have to worry about the LCF's margin-of-error problem? Well, perhaps we can use some other non-MLR system that gives us a prediction that's at least as accurate as LCF.
IDEAL SOLUTION
My solution here involves using categorization to replace prediction equations. Here's what I mean. Suppose we take the general benchmarks set by the LCF (37 GS and 60% COMP%), and group the 35 QBs who were taken in the first 2 rounds of the draft from 1997-2006 into the following categories:
- Group A: QBs who had 37 or more GS and a COMP% of 60% or higher
- Group B: QBs who had 37 or more GS, but a COMP% lower than 60%
- Group C: QBs who had fewer than 37 GS, but a COMP% of 60% or higher
- Group D: QBs who had fewer than 37 GS and a COMP% lower than 60%
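The 4 categories above boil down to 2 yes/no questions, which can be sketched in Python like this. The function name is mine; the 37 GS and 60% cutoffs are the LCF benchmarks from the text, and the sample QBs and their college stats come from the table later in this article.

```python
def lcf_group(college_gs: int, college_comp_pct: float) -> str:
    """Sort a QB into Group A-D using the 37 GS / 60% COMP% benchmarks.

    college_comp_pct is a proportion (0.629), not a percentage (62.9).
    """
    experienced = college_gs >= 37
    accurate = college_comp_pct >= 0.60
    if experienced and accurate:
        return "A"
    if experienced:
        return "B"
    if accurate:
        return "C"
    return "D"


print(lcf_group(45, 0.629))  # Peyton Manning -> A
print(lcf_group(41, 0.591))  # Carson Palmer  -> B
print(lcf_group(22, 0.638))  # Aaron Rodgers  -> C
print(lcf_group(24, 0.544))  # Ryan Leaf      -> D
```

No equation, no calculator, no pocket protector: just 2 comparisons.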
Here's how the group membership plays out:
Group A | Group B | Group C | Group D
Ben Roethlisberger | Cade McNown | Aaron Rodgers | Akili Smith
Byron Leftwich | Carson Palmer | Alex Smith | Charlie Batch
Chad Pennington | Donovan McNabb | David Carr | Jim Druckenmiller
Daunte Culpepper | Jake Plummer | Jason Campbell | Joey Harrington
Drew Brees | Jay Cutler | Kellen Clemens | JP Losman
Eli Manning | Kyle Boller | Rex Grossman | Marques Tuiasosopo
Matt Leinart | Shaun King | Tim Couch | Michael Vick
Peyton Manning | | Vince Young | Patrick Ramsey
Phillip Rivers | | | Quincy Carter
 | | | Ryan Leaf
 | | | Tarvaris Jackson
Now, rather than focusing on an exact prediction of NFL success, let's just predict some general standards of performance for each group. For instance, let's say that QBs in Groups A and B should have 10 or more FFPts/G (with rounding), whereas QBs in Groups C and D should have less than 10 FFPts/G (Aside: 10 FFPts works out to about 200 yards, 1 TD, and 1 INT). And for the sake of comparison, let's also say that QBs in Groups A and B should have 20 or more DYAR/G (with rounding), whereas QBs in Groups C and D should have less than 20 DYAR/G. [2] Below is a table showing the groups and their respective stats:
QB | Group | GS | COMP% | FFPts/G | DYAR/G | LCF FFPts/G | LCF DYAR/G
Peyton Manning | A | 45 | 62.9% | 16.06 | 99.23 | 12.47 | 48.85
Drew Brees | A | 37 | 61.0% | 14.25 | 62.87 | 10.06 | 22.13
Daunte Culpepper | A | 44 | 63.9% | 13.53 | 43.73 | 12.53 | 48.61
Phillip Rivers | A | 51 | 63.5% | 12.84 | 55.92 | 14.03 | 66.83
Ben Roethlisberger | A | 38 | 65.5% | 12.01 | 47.93 | 11.61 | 36.14
Chad Pennington | A | 51 | 63.4% | 11.48 | 53.82 | 14.00 | 66.58
Eli Manning | A | 38 | 61.1% | 11.36 | 25.89 | 10.32 | 25.12
Byron Leftwich | A | 38 | 65.1% | 9.72 | 25.91 | 11.49 | 35.14
Matt Leinart | A | 39 | 64.8% | 7.63 | 13.24 | 11.64 | 37.13
Carson Palmer | B | 41 | 59.1% | 14.14 | 67.66 | 10.43 | 28.36
Jay Cutler | B | 45 | 57.2% | 13.59 | 63.22 | 10.80 | 34.58
Donovan McNabb | B | 49 | 58.4% | 13.20 | 33.86 | 12.07 | 48.57
Jake Plummer | B | 45 | 55.4% | 10.43 | 15.34 | 10.27 | 30.08
Kyle Boller | B | 40 | 48.0% | 7.66 | -5.21 | 6.95 | -2.18
Shaun King | B | 39 | 55.5% | 7.14 | 1.47 | 8.91 | 13.85
Cade McNown | B | 43 | 55.5% | 6.02 | -12.20 | 9.84 | 24.84
Aaron Rodgers | C | 22 | 63.8% | 11.42 | 36.09 | 7.42 | -12.05
Jason Campbell | C | 30 | 64.6% | 10.66 | 34.14 | 9.50 | 11.92
Tim Couch | C | 24 | 67.1% | 9.15 | -20.73 | 8.84 | 1.70
Rex Grossman | C | 32 | 61.0% | 8.57 | -6.19 | 8.91 | 8.40
David Carr | C | 26 | 62.8% | 8.02 | -15.65 | 8.05 | -3.57
Vince Young | C | 32 | 61.8% | 6.74 | 0.70 | 9.14 | 10.40
Alex Smith | C | 23 | 66.3% | 6.29 | -47.25 | 8.38 | -3.05
Kellen Clemens | C | 32 | 61.0% | 4.30 | -9.71 | 8.91 | 8.40
Joey Harrington | D | 28 | 55.2% | 9.06 | -8.60 | 6.28 | -17.10
Michael Vick | D | 21 | 56.5% | 8.65 | -6.39 | 5.05 | -33.07
Patrick Ramsey | D | 32 | 58.9% | 8.35 | -4.05 | 8.29 | 3.14
Quincy Carter | D | 29 | 56.6% | 8.09 | 0.95 | 6.93 | -10.85
Charlie Batch | D | 24 | 58.0% | 7.62 | 10.24 | 6.18 | -21.08
JP Losman | D | 27 | 57.8% | 7.44 | -16.45 | 6.81 | -13.34
Tarvaris Jackson | D | 36 | 53.6% | 7.27 | 3.48 | 7.66 | 0.86
Ryan Leaf | D | 24 | 54.4% | 5.23 | -62.24 | 5.13 | -30.09
Akili Smith | D | 19 | 56.6% | 3.75 | -56.73 | 4.62 | -38.31
Marques Tuiasosopo | D | 25 | 55.4% | 1.24 | -11.23 | 5.65 | -24.84
Jim Druckenmiller | D | 24 | 53.8% | 0.93 | -43.00 | 4.95 | -31.59
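Stats displayed in bold italics are cases in which the general performance prediction was wrong (for FFPts/G: Leinart, Boller, King, McNown, Rodgers, and Campbell; for DYAR/G: Leinart, Plummer, Boller, King, McNown, Rodgers, and Campbell). With this in mind, here's how accurate you would have been at predicting NFL success if you simply used group membership and general performance standards: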
- Predicting QBs in Groups A and B to average 10 or more FFPts/G: 12 of 16 (75.0%)
- Predicting QBs in Groups A and B to average 20 or more DYAR/G: 11 of 16 (68.8%)
- Predicting QBs in Groups C and D to average less than 10 FFPts/G: 17 of 19 (89.5%)
- Predicting QBs in Groups C and D to average less than 20 DYAR/G: 17 of 19 (89.5%)
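The 4 accuracy rates above can be recounted from the table with a few lines of Python. This is a minimal sketch: the QBS list is the Group, FFPts/G, and DYAR/G columns copied from the table, the function name is mine, and "with rounding" is taken to mean rounding to the nearest whole number before comparing to the standard.

```python
# (group, actual FFPts/G, actual DYAR/G), copied from the table above
QBS = [
    ("A", 16.06, 99.23), ("A", 14.25, 62.87), ("A", 13.53, 43.73),
    ("A", 12.84, 55.92), ("A", 12.01, 47.93), ("A", 11.48, 53.82),
    ("A", 11.36, 25.89), ("A", 9.72, 25.91), ("A", 7.63, 13.24),
    ("B", 14.14, 67.66), ("B", 13.59, 63.22), ("B", 13.20, 33.86),
    ("B", 10.43, 15.34), ("B", 7.66, -5.21), ("B", 7.14, 1.47),
    ("B", 6.02, -12.20),
    ("C", 11.42, 36.09), ("C", 10.66, 34.14), ("C", 9.15, -20.73),
    ("C", 8.57, -6.19), ("C", 8.02, -15.65), ("C", 6.74, 0.70),
    ("C", 6.29, -47.25), ("C", 4.30, -9.71),
    ("D", 9.06, -8.60), ("D", 8.65, -6.39), ("D", 8.35, -4.05),
    ("D", 8.09, 0.95), ("D", 7.62, 10.24), ("D", 7.44, -16.45),
    ("D", 7.27, 3.48), ("D", 5.23, -62.24), ("D", 3.75, -56.73),
    ("D", 1.24, -11.23), ("D", 0.93, -43.00),
]
FFPTS, DYAR = 1, 2  # column positions within each tuple


def hit_rate(groups, column, standard, expect_above):
    """Count QBs in `groups` whose rounded stat falls on the predicted
    side of `standard`. Returns (hits, group size)."""
    members = [row for row in QBS if row[0] in groups]
    hits = sum((round(row[column]) >= standard) == expect_above for row in members)
    return hits, len(members)


print(hit_rate("AB", FFPTS, 10, True))   # (12, 16)
print(hit_rate("AB", DYAR, 20, True))    # (11, 16)
print(hit_rate("CD", FFPTS, 10, False))  # (17, 19)
print(hit_rate("CD", DYAR, 20, False))   # (17, 19)
```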
Those accuracy rates aren't shabby in the least. Although you'd be more accurate about the curds (i.e., Groups C and D) than about the whey (i.e., Groups A and B), simply predicting a general standard of performance based on group membership is pretty accurate overall. Now, for the sake of comparison, let's see how you would have done if you applied the stat-based predictions. In other words, let's see how many times the exact prediction of over (and under) 10 FFPts/G and/or 20 DYAR/G was right. The last 2 columns of the table show the exact predictions for FFPts/G and DYAR/G. Stats displayed in bold are cases in which the exact performance prediction was wrong (for FFPts/G: Leinart, McNown, and Rodgers; for DYAR/G: Leinart, Plummer, McNown, Rodgers, and Campbell). Here are the accuracy rates:
- Predicting a given QB to average 10 or more FFPts/G: 13 of 15 (86.7%)
- Predicting a given QB to average less than 10 FFPts/G: 19 of 20 (95.0%)
- Predicting a given QB to average 20 or more DYAR/G: 11 of 14 (78.6%)
- Predicting a given QB to average less than 20 DYAR/G: 19 of 21 (90.5%)
And what if you take group membership into account when evaluating exact prediction accuracy?
- Predicting QBs in Groups A and B to average 10 or more FFPts/G: 14 of 16 (87.5%)
- Predicting QBs in Groups A and B to average 20 or more DYAR/G: 13 of 16 (81.3%)
- Predicting QBs in Groups C and D to average less than 10 FFPts/G: 18 of 19 (94.7%)
- Predicting QBs in Groups C and D to average less than 20 DYAR/G: 17 of 19 (89.5%)
Obviously, the stat-based exact predictions are going to be slightly better than the non-stat-based general predictions because - as you'll recall - the margin of error associated with the exact prediction method (i.e., LCF equation) is slightly better than the unknown error associated with the general prediction method (i.e., grouped performance standards). Nevertheless, the point here is that the accuracy cost doesn't outweigh the applied benefit. In other words, what you lose in accuracy, you make up for by actually being able to come up with a prediction quickly and articulate that prediction using language your friend understands.
One thing I'll add here before moving on is that my solution complies entirely with FO's acknowledgement that variability - indeed, a lot of it - exists in their statistics and predictions. As they've taken great pains to admit, the aim of their stats is to show that some players (or teams) are better than others; not that their stats are the Da Vinci Code of football performance. In other words, they don't claim that knowing a secret code means being able to exactly predict the future (or describe/explain the past). Therefore, because our aims and variability admissions are similar, the main advantage of my solution is that it addresses the, "WTF is DYAR/G?" problem.
APPLYING THE SOLUTION
So I think I've come up with a much more practical application of the LCF that preserves Lewin's basic finding about the exclusive importance of college GS and COMP%. My goal here was not to rip Lewin to shreds. In fact, as I said earlier, his analysis was spot-on from both a methodological and results perspective. Unfortunately, the LCF just suffers from a serious variation problem.
Basically, Lewin's analysis was a slave to sample size. The reason there's such a huge margin of error associated with the LCF is that the analysis that produced it was based on data from only 35 QB draft picks. It's actually quite remarkable that he was able to obtain such strong relationships between DYAR/G, GS, and COMP% with such a small sample size. Therefore, given the strength of these relationships, there's no doubt in my mind that the margin of error will shrink as future QB draft picks are added to the data set. Indeed, the main way to reduce variability in MLR predictions (and parameters) is to increase your sample size. There's no statistical magic here: The laws of probability dictate that educated guesses get better and better as the amount of information they're based on increases (i.e., the more they're educated).
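To put a rough number on "the margin of error will shrink," here's a back-of-the-envelope Python sketch. It leans on the common rule of thumb that a margin of error shrinks in proportion to 1 over the square root of the sample size, which assumes the new QBs are about as spread out and about as unpredictable as the current 35. That assumption is mine, as are the function name and the future sample sizes; the starting range (1.81 to 3.68 for the GS multiplier) is from earlier in this article.

```python
import math


def margin_shrink_factor(n_now: int, n_future: int) -> float:
    """Rule-of-thumb factor by which a margin of error shrinks when the
    sample grows from n_now to n_future (assumes all else stays equal)."""
    return math.sqrt(n_now / n_future)


# Half-width of the GS multiplier's range quoted earlier (1.81 to 3.68)
current_margin = (3.68 - 1.81) / 2

for n in (35, 70, 140):  # hypothetical future sample sizes
    margin = current_margin * margin_shrink_factor(35, n)
    print(f"n = {n:3d}: GS multiplier = 2.75 +/- {margin:.2f}")
```

The takeaway is that progress is slow: under this rule of thumb, you need 4 times as many QBs to cut the margin of error in half.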
Nevertheless - and I hope you're reading this methodrampage - the analysis is what it is. The state of the LCF at this point in time dictates that we have to wait for more data (i.e., wait for the sample size to get big enough). As I said, I don't doubt that future data will vindicate LCF. I just can't get behind it right now as a valid application for the average NFL fan (or NN member).
With that said, let's apply my more practical application to this year's draft. As I mentioned earlier, Stafford, Sanchez, and Freeman are, by consensus, the QBs who will be selected in the first 2 rounds. Here are their stats, group membership, and predicted performances:
QB | Group | GS | COMP% | 10+ FFPts/G | 20+ DYAR/G | LCF FFPts/G | LCF DYAR/G
Matt Stafford | D | 33 | 56.9% | No | No | 7.94 | 0.88
Mark Sanchez | C | 15 | 64.3% | No | No | 5.94 | -30.22
Josh Freeman | D | 35 | 59.1% | No | No | 9.04 | 11.88
Based on my general standards of performance, these 3 QBs are likely to fail with respect to having 10+ FFPts/G during their NFL careers. If this doesn't give the Lions pause about taking Stafford at #1, I don't know what will. Likewise, if this doesn't give the 49ers, NN members, and pundits pause about taking Sanchez at #10, I don't know what will. Serviceable QBs perhaps; just not top-10 picks. And if we want to quantify just how unlikely it would be for Stafford or Sanchez to end up averaging 10+ FFPts/G in their NFL careers, all we have to do is go back to the accuracy rates I presented earlier: 17 of the 19 Group C and D QBs from 1997-2006 averaged less than 10 FFPts/G during their careers. This means that, based on previous history, there's only about a 10% chance that Stafford and Sanchez buck the trend. Notice that I'm not saying a 0% chance; it's just highly unlikely. If it happens, it doesn't mean the stats were wrong. It just means these QBs defied the odds, and there's nothing inherently wrong with that (esp. for their bank accounts).
Although I obviously prefer the general predictions, if we compare the slightly more-accurate exact predictions in this table with the earlier one detailing prior draft picks, we get some sobering results. Stafford is closest to Tarvaris Jackson and David Carr in predicted FFPts/G, and closest to Jackson and Tim Couch in predicted DYAR/G. Sanchez is closest to Marques Tuiasosopo and Charlie Batch in predicted FFPts/G, and closest to - wait for it - Ryan Leaf and Jim Druckenmiller in predicted DYAR/G. I'm going to pause for a few seconds and let that sink in.................................................................................OK, done vomiting; back to the show. Finally, Freeman is closest to Vince Young and Shaun King in predicted FFPts/G, and closest to Jason Campbell and Vince Young in predicted DYAR/G.
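The "closest to" comparisons above are just a nearest-neighbor lookup, which can be sketched in Python as follows. The LCF_DYAR dictionary is the LCF DYAR/G column copied from the table of 1997-2006 draft picks earlier in this article; the function name is mine.

```python
# LCF-predicted DYAR/G for the 35 prior draft picks, from the earlier table
LCF_DYAR = {
    "Peyton Manning": 48.85, "Drew Brees": 22.13, "Daunte Culpepper": 48.61,
    "Phillip Rivers": 66.83, "Ben Roethlisberger": 36.14,
    "Chad Pennington": 66.58, "Eli Manning": 25.12, "Byron Leftwich": 35.14,
    "Matt Leinart": 37.13, "Carson Palmer": 28.36, "Jay Cutler": 34.58,
    "Donovan McNabb": 48.57, "Jake Plummer": 30.08, "Kyle Boller": -2.18,
    "Shaun King": 13.85, "Cade McNown": 24.84, "Aaron Rodgers": -12.05,
    "Jason Campbell": 11.92, "Tim Couch": 1.70, "Rex Grossman": 8.40,
    "David Carr": -3.57, "Vince Young": 10.40, "Alex Smith": -3.05,
    "Kellen Clemens": 8.40, "Joey Harrington": -17.10, "Michael Vick": -33.07,
    "Patrick Ramsey": 3.14, "Quincy Carter": -10.85, "Charlie Batch": -21.08,
    "JP Losman": -13.34, "Tarvaris Jackson": 0.86, "Ryan Leaf": -30.09,
    "Akili Smith": -38.31, "Marques Tuiasosopo": -24.84,
    "Jim Druckenmiller": -31.59,
}


def closest_comps(predicted_dyar: float, k: int = 2) -> list:
    """Return the k prior picks whose predicted DYAR/G is nearest."""
    return sorted(LCF_DYAR, key=lambda name: abs(LCF_DYAR[name] - predicted_dyar))[:k]


print("Stafford:", closest_comps(0.88))    # Tarvaris Jackson, Tim Couch
print("Sanchez: ", closest_comps(-30.22))  # Ryan Leaf, Jim Druckenmiller
print("Freeman: ", closest_comps(11.88))   # Jason Campbell, Vince Young
```

Keep in mind these are comparisons between predictions, not between careers; they inherit every bit of the margin-of-error problem discussed earlier.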
So, based on exact predictions, the QBs should be ranked Freeman, Stafford, Sanchez. However, as is the major take-home of this article, we can't be so trusting of the exact predictions. The better thing to do - from a practical perspective - is simply to say that none of the 3 QBs are going to meet 2 general standards of good NFL performance. In this way, we can replace the exact rankings with general groupings that are just as accurate overall.
And if you want to really win that argument with your friend, then instead of saying, "Matt Stafford is going to have a completion percentage between 8% and 94%," or worse yet, "Matt Stafford is going to have a DYAR/G between -240 and 245," say, "I bet you Matt Stafford doesn't score more than 10 fantasy football points per game during his career." If your friend took that bet, you'd be a winner almost 90% of the time. Feel free to donate those proceeds to NN. Fooch needs the money (just kidding, Fooch).
[1] Let me stress that this is the LCF equation that I came up with based on my replication study. It's probably not the same exact equation used by Lewin and FO in their publications because Lewin has no doubt refined the equation since 2006 as more data has come in. Also, unlike me, he has full access to all FO stats that aren't on the website, along with all of their game charts, etc. The point is that if you want Lewin's actual LCF predictions, you should rely on Pro Football Prospectus and FO's website, rather than my replication-derived equation. And if you're wondering whether this is a copyright CYA on my part, it is.
[2] I used 10 FFPts/G and 20 DYAR/G as general standards because (a) that's what you get when you plug 37 GS and 60% COMP% into their respective equations, and (b) it makes the group membership accuracy as close as possible to the exact prediction accuracy.
**DYAR and DVOA statistics used to produce this article were obtained from Football Outsiders.