A Statistical Look at Drafting QBs: LCF Redux
Back in October, I introduced the Lewin Career Forecast (LCF) to Niners Nation (NN). To put it mildly, the post stirred up some lively debate, and signaled the beginning of a still-raging internecine battle on NN over the appropriateness of football stat analysis. Just to refresh everyone's memory, the LCF, which was developed by David Lewin of Football Outsiders (FO), predicts a QB's NFL performance based on 2 of his college stats: games started (GS) and completion percentage (COMP%). A couple of other ancillary features of the LCF are that (a) it's most appropriate for predicting the NFL performance of 1st- and 2nd-round QB picks, and (b) 37 college GS and a 60% college COMP% are the implied statistical benchmarks for success.
Because of its (over)simplicity, the LCF has received a healthy amount of criticism since it was unveiled in Pro Football Prospectus (PFP) 2006. Whether we're talking about statistical modeling or plain ol' common sense, it just can't be right that predicting performance for an NFL QB only requires knowing 2 things about him (3 if you count his draft round). In other words, QB performance is way too complicated to be that simple. Well, at least that's what the critics say. There are other arguments against the LCF, but I'll get into those a little later.
So, I figured that, because the draft is tomorrow, and because the "QB at #10" rumors seem to be accelerating, now would be as good a time as any for me to revisit the LCF and evaluate - from a statistical perspective - just how useful it is to the average NFL fan (Warning: Just because I'm a stats guy, don't assume I'm going to conclude that LCF is über-useful). Therefore, in this article, I'm going to do 4 things:
- Discuss the statistical and non-statistical positives and negatives of the LCF
- Give a statistical rationale for a solution that addresses the negatives
- Detail my solution
- Apply my solution to the top QB prospects in the 2009 NFL Draft
After the jump, I'll tackle the LCF...
POSITIVES AND NEGATIVES
In order to explore this topic, we need some more background info about the LCF (Note: I'm going to try as hard as possible to write the next couple of paragraphs in English).
Basically, to come up with the LCF, Lewin first thought of as many college stats as he could that might influence how a QB is going to perform at the professional level. Some of these were performance-based (e.g., COMP%), while others were more trait-like (e.g., height). To measure "performance at the professional level," he chose defense-adjusted points above replacement per game (DPAR/G). DPAR was the FO predecessor to DYAR (defense-adjusted yards above replacement), which is a stat most of you should be familiar with if you've read my previous articles. After choosing DPAR/G as his performance measure, he got all of the relevant stats for every QB drafted from 1997-2006, and looked for statistically significant relationships between these stats and DPAR/G.
Based on the result that GS and COMP% were the stats most related to DPAR/G, and that the DPAR/G relationships for other stats seemed to depend on GS (e.g., a QB who starts more games generally throws for more yards and TDs by definition), he ran an analysis called multiple linear regression (MLR) to come up with a simple equation that predicts DPAR/G from GS and COMP%. As part of this analysis, he ran an MLR for each round of the draft, and found that the 1st and 2nd rounds yielded the most accurate predictions. Now, he's able to simply plug in any college QB's GS and COMP% stats, and get a predicted value for DPAR/G, which he calls the LCF.
As for this magical equation he came up with, think of it in terms of fantasy football (FF). To figure out how many points your QB scored in a given week according to standard FF scoring, you just divide his passing yards (PAYDs) by 25, add to that his passing touchdowns (PATDs) multiplied by 4, and then subtract his interceptions (INTs) multiplied by 2. Really, all you're doing here is applying the following equation to the QB's stats: FFPts = (.04 x PAYDs) + (4 x PATDs) - (2 x INTs). The MLR-based equation for LCF is just this sort of thing.
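For anyone who'd rather see that scoring rule as code, here's a minimal Python sketch. The function name `fantasy_points` is made up for illustration; the weights are just the standard ones described above.

```python
def fantasy_points(pass_yards, pass_tds, interceptions):
    """Standard QB fantasy scoring: 1 point per 25 passing yards,
    4 points per passing TD, minus 2 points per interception."""
    return pass_yards / 25 + 4 * pass_tds - 2 * interceptions


# A 200-yard, 1-TD, 1-INT game.
print(fantasy_points(200, 1, 1))
```

The point is only that a prediction equation is a weighted sum: each stat gets multiplied by a fixed number and the pieces are added up.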
Now, onto the positives and negatives...
The first good thing about the LCF is that the inclusion of GS and COMP% makes perfect intuitive sense. The more GS a college QB has, the more experience he has playing QB. Obviously, if you're going to hire a guy to run your company (or offense), it's better that he/she has more experience than less. It means he/she is more likely to know what the (site decorum) he/she is doing. Similarly, the higher a college QB's COMP%, the better he is at hitting his target. Obviously, the point of the passing game in football - at any level - is to move the ball down the field by, you know, completing the pass to your teammate. If a guy can't hit the broadside of a barn in college after having thrown 1,000s of passes in his life up to that point, then he's probably not going to become a sharpshooter when he goes to the pros.
The second good thing about the LCF is more statistical in nature. When a statistician wants to determine whether someone did an analysis correctly and reported accurate results, they do what's called a replication study. Well, I did one of those for this article. Except for (a) using data only from 1st- and 2nd-round picks, and (b) using DYAR/G instead of DPAR/G, I did the same thing that Lewin did to come up with the LCF, i.e., I followed the steps I described a minute ago. To Lewin's credit, I got the same results he did: GS and COMP% had the strongest relationships with DYAR/G, and were the best MLR predictors of DYAR/G. Just for the sake of providing info, here are some of the other things that I found:
- The correlation between GS and DYAR/G was .71. The correlation between COMP% and DYAR/G was .35. Correlations range from -1 to 1, with the size of the number indicating how strong the relationship is. Positive correlations mean that one stat goes up as the other goes up, whereas negative correlations mean that one stat goes down as the other goes up. So, for all intents and purposes, these 2 correlations mean that a higher GS is twice as important as a higher COMP% when it comes to predicting higher DYAR/G.
- Because some people have made this argument, I looked at two variables related to the NFL situation a QB draftee was entering at the time he was drafted. Whether we're talking about the previous year, the year before that, the year before that, or the average of these 3 years, neither a team's wins nor its offense DVOA when its QB arrived from college were related at all to his future NFL stats (COMP%, PAYDs, PATDs, INTs, TD-INT ratio, DYAR/G).
- For sh*ts and giggles, I also looked at whether coming from a BCS school has any impact. It doesn't.
- The actual equation for LCF that I arrived at in my replication study was: NFL DYAR/G = -232.16 + (2.75 x college GS) + (250.31 x college COMP%).¹ Plug in some QB numbers and try it for yourself if you wish.
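If you'd rather not do the plugging by hand, the replication equation can be sketched in a few lines of Python. The function and constant names are hypothetical. Because the published coefficients are rounded to two decimals, the output can differ by a few tenths from the predictions shown in this article's tables.

```python
# Rounded coefficients from the replication equation in this article.
INTERCEPT = -232.16
GS_WEIGHT = 2.75
COMP_WEIGHT = 250.31


def predicted_dyar_per_game(college_gs, college_comp):
    """Predicted NFL DYAR/G from college games started and completion
    percentage, with completion percentage as a fraction (e.g., 0.60)."""
    return INTERCEPT + GS_WEIGHT * college_gs + COMP_WEIGHT * college_comp


# The benchmark QB: 37 college starts and a 60% completion rate.
print(round(predicted_dyar_per_game(37, 0.60), 2))
```

Note that COMP% goes in as a fraction, not a whole number: .629, not 62.9.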
Now for the negatives, of which there are 3...
First, as you can tell from what I just discussed, coming up with the LCF requires a good bit of statistical acumen. In terms of deriving it, the outcome variable, DYAR/G, is a little complicated, and the analysis itself requires statistical software and MLR training. In terms of applying it, you need to have the equation handy and probably a calculator. Because of these issues, LCF is not really practical for the average football fan. For example, if you're having a football argument with a friend about some random QB prospect, it's not really practical for you to keep a copy of Lewin's equation in your wallet and a calculator in your pocket (more likely your pocket protector) in order to rebut your friend's claim that "QB X is going to be awesome in the pros!" Likewise, you're probably going to elicit bewilderment rather than agreement if you base your rebuttal on DYAR/G. And even if you simply wait for Lewin's predictions in PFP or on FO's website, you're probably not going to have the predictions handy when arguing with your friend.
Second, as some people have pointed out, the fact that the LCF is only accurate for 1st- and 2nd-rounders poses a problem with respect to applying it. In order for it to actually be a "prediction," you have to know in advance whether or not a given college QB is going to be drafted in the 1st or 2nd round. This limits your prediction window to the month or so prior to the draft, in which draft pick projections reach a consensus among teams and pundits like Mel Kiper, Jr. And just so you know, this whole "diminishing accuracy by round" thing is real. When I ran my MLR analysis looking only at 1st-rounders, the model was about 5% better; which brings me to the last problem.
The third problem with LCF is by far the most concerning. When it comes down to it, statistics is about explaining the variation among things. It answers questions like, "Why does Jane have an IQ 20 points higher than John?" Jane's IQ varies from John's by 20 points, and we want to know why. In the context of the LCF, this example can be rephrased, "Why does Peyton Manning (the 1st pick in 1998) have an NFL DYAR/G 161 yards higher than Ryan Leaf (the 2nd pick in 1998)?" More generally, the question is, "Why do certain NFL QBs have a higher DYAR/G than others?" The LCF answers these latter 2 questions by saying, "Because of the differences between their GS and COMP% in college."
When you run an MLR analysis, the results tell you a lot more than the multipliers you need to get your prediction. The most important of these supplementary results is called explained variance, which, in the context of LCF, tells us how much of the DYAR/G variation between top-2-round QBs is explained by their college GS and COMP% stats. In my replication of Lewin's analysis, that explained variance value was 56.2%. In other words, over 40% of the DYAR/G variation between QBs was not explained by GS and COMP%. While 56.2% isn't shabby in MLR, especially given the small sample size (n = 35), there's still a veritable sh*t-ton of variation for which LCF has no answer.
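To make "explained variance" a little more concrete, here's a minimal Python sketch of how the number is computed: one minus the ratio of the leftover (unexplained) squared error to the total squared variation around the average. The function name is hypothetical and the inputs below are made-up numbers chosen only to show the mechanics; they are not real QB data.

```python
def explained_variance(actual, predicted):
    """Share of the variation in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    # Total variation: how far each actual value sits from the average.
    total = sum((a - mean_actual) ** 2 for a in actual)
    # Unexplained variation: how far each prediction misses its actual value.
    unexplained = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - unexplained / total


# Made-up illustration values, NOT real QB data.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(round(explained_variance(actual, predicted), 3))
```

Perfect predictions give 1.0, and predicting the average for everybody gives 0.0.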
Going back to the previous example, the LCF basically says that it can account for about 90 yards (or 56.2%) of the 161-yard difference between Manning and Leaf, but has no clue about the other 70 or so yards. Indeed, if you plug their college numbers into the equation I gave you earlier, you get a predicted DYAR/G difference of 78.94 (Manning = 48.85; Leaf = -30.09), which is far below their actual 161-yard difference. You might say, "Well, it still accounted for half of the difference." My response would be this: Leaf's legacy would be considerably different if he had ended up being only 80 DYAR/G worse than Manning. In fact, his DYAR/G would rise from -62.24 (worst among the draftees) to about 20.29 (roughly on par with Jake Plummer).
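The gap arithmetic is easy to check. Here's a small Python sketch using the actual and LCF-predicted DYAR/G values for Manning and Leaf from the results table later in this article; the variable names are just for illustration.

```python
# Actual and LCF-predicted DYAR/G, taken from this article's results table.
manning_actual, leaf_actual = 99.23, -62.24
manning_lcf, leaf_lcf = 48.85, -30.09

actual_gap = manning_actual - leaf_actual   # how far apart they really were
predicted_gap = manning_lcf - leaf_lcf      # how far apart the equation says
share_of_gap = predicted_gap / actual_gap   # fraction of the real gap covered

print(round(actual_gap, 2), round(predicted_gap, 2), round(share_of_gap, 3))
```

For this one pair, the equation covers a bit less than half of the real gap.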
And if you thought that wasn't enough, there's one more variation-related problem with LCF: variation in the multipliers themselves. Recall that what you get out of an MLR analysis is a bunch of numbers you have to multiply your stats by to get your predicted DYAR/G. It turns out that these multipliers, called parameters, are estimates just like your DYAR/G prediction is an estimate. In other words, they're statistics-based educated guesses. Think of political polls for this one. Just like every poll has a margin of error, every one of the MLR parameters has a margin of error. Just like the true value for a candidate's vote percentage is somewhere within that margin, the true value for the MLR parameter is somewhere within that margin. Obviously, the goal of estimation, whether we're talking QB performance or political polls, is to minimize your margin of error as much as possible. Later on, I'll tell you how that's done; though I bet you already have a clue.
So what does this "margin of error" problem mean for LCF? First, it means that Lewin gives you his LCF estimate, but doesn't give you his margin of error, which is the same thing as a pollster reporting their prediction for a candidate's vote percentage, but not telling you their margin of error. Second, and much more importantly, it turns out that the amount of parameter estimate variation for the replication MLR I did was huge. Here's a perfect example that is relevant for this year's draft:
- According to scout and pundit consensus, Matt Stafford, Mark Sanchez, and Josh Freeman are going to be the only QBs taken in the first 2 rounds of the 2009 draft. Stafford's stats were 33 GS and 56.9% COMP% (or .569 for the purposes of LCF). Sanchez's stats were 15 GS and .643 COMP%. Freeman's stats were 35 GS and .591 COMP%.
- Remember that the MLR parameter estimates for GS and COMP% (from the earlier equation) were 2.75 and 250.31, respectively. When you take each of their margins of error into account, the true GS parameter value is anywhere from 1.81 to 3.68, while the true COMP% parameter value is anywhere from 57.54 to 443.08.
- Without going into detail here, know that the first number in the equation (-232.16), which is called the intercept, ranges from -349.26 to -115.07 when you take its margin of error into account.
- Let's see what the LCF predictions are when the parameter estimates in the equation are exactly right.
- Let's also see what the LCF predictions are when the true parameters for GS, COMP%, and the intercept are all at the low end of their respective ranges; for example, 2 for GS, 60 for COMP%, and -340 for the intercept. In other words, let's use the following "low but just-as-likely" equation for LCF: DYAR/G = -340 + (2 x GS) + (60 x COMP%).
- Let's also see what the LCF predictions are when the true parameters for GS, COMP%, and the intercept are all at the high end of their respective ranges; for example, 3.5 for GS, 440 for COMP%, and -120 for the intercept. In other words, let's use the following "high but just-as-likely" equation for LCF: DYAR/G = -120 + (3.5 x GS) + (440 x COMP%).
Below is a table showing the differences between DYAR/G predictions using the 3 different LCF equations (low-end, exact, and high-end):
| College QB | Low-End LCF | Exact LCF | High-End LCF |
| --- | --- | --- | --- |
| Stafford | -239.86 | 0.88 | 245.86 |
| Sanchez | -271.42 | -30.22 | 215.42 |
| Freeman | -234.54 | 11.88 | 262.54 |
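For anyone who wants to reproduce the table, here's a minimal Python sketch that runs each prospect through the three equations. The names `PROSPECTS`, `EQUATIONS`, and `lcf` are hypothetical. The low-end and high-end columns use the round-number parameters given above; the "exact" column uses the rounded published coefficients, so it can land a few tenths away from the table's values.

```python
# College stats for the three prospects: (games started, completion fraction).
PROSPECTS = {
    "Stafford": (33, 0.569),
    "Sanchez": (15, 0.643),
    "Freeman": (35, 0.591),
}

# (intercept, GS multiplier, COMP% multiplier) for each version of the equation.
EQUATIONS = {
    "low": (-340, 2, 60),
    "exact": (-232.16, 2.75, 250.31),
    "high": (-120, 3.5, 440),
}


def lcf(intercept, gs_weight, comp_weight, gs, comp):
    """Predicted DYAR/G for one QB under one set of parameters."""
    return intercept + gs_weight * gs + comp_weight * comp


table = {
    name: {label: round(lcf(*params, gs, comp), 2)
           for label, params in EQUATIONS.items()}
    for name, (gs, comp) in PROSPECTS.items()
}

for name, row in table.items():
    print(name, row)
```

Same QB, same college stats, three wildly different answers: the only thing that changes is which end of each parameter's range you use.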
So basically, using 3 different-but-equally-likely LCF equations, it's just as likely that all 3 QBs are going to be 4 times worse than Leaf in the NFL as it is that they're going to be about 3 times better than Manning. Here's the same table using the range of parameter estimates in an LCF equation that predicts NFL COMP% instead of DYAR/G:
| College QB | Low-End LCF | Exact LCF | High-End LCF |
| --- | --- | --- | --- |
| Stafford | 7.98% | 55.66% | 93.72% |
| Sanchez | 5.86% | 53.98% | 92.44% |
| Freeman | 8.82% | 57.37% | 96.28% |
Not all is lost here, though. If you're arguing with your friend about how good Matt Stafford is going to be as a pro, the odds are great that the argument will end when you say, "I think he's going to have a completion percentage somewhere between 8% and 94%." Well, either the conversation will end or your friend will come back with the proverbial, "No sh*t, Sherlock!" Remember, although the exact LCF is obviously the best estimate when it comes to suppressing this reflex, it's just as likely from a statistical perspective that the low-end LCF and high-end LCF predictions end up being right!
I want to make something very clear here before moving on to my potential solution to these problems. My discussion of LCF negatives is in no way meant to suggest that the LCF is useless. Rather, my purpose here is to make it more useful for the average football fan; basically to remove the mystery that surrounds it. Furthermore, I am not saying that Lewin and FO are trying to be deceptive about variability. On the contrary, FO freely acknowledges the variability inherent in their statistics and predictions. The fact of the matter is that football stat analysis is prone to vast amounts of variability, a reality that I'll explore a little bit later. What's important to understand here is that the variability problem with LCF is a byproduct of football stats themselves, not Lewin's or FO's desire to assert certainty where there's uncertainty.
BOTTOM LINE: Here are the main points you should take away from my discussion of the LCF's strengths and weaknesses:
- The LCF is a DYAR/G prediction based on a QB's college GS and COMP%. It's based on a stat-analysis-derived equation that quantifies the relative influence of these 2 factors.
- The LCF is most accurate for QBs drafted in the 1st and 2nd rounds.
- The inclusion of GS and COMP% in the LCF makes intuitive sense from a non-statistical perspective.
- The inclusion of GS and COMP% in the LCF is justified from a statistical perspective, as is the exclusion of other seemingly important college stats.
- Based on my replication study, a QB's predicted NFL DYAR/G = -232.16 + (2.75 x college GS) + (250.31 x college COMP%).
- The LCF has an outcome variable, DYAR/G, that might make the average football fan cross-eyed.
- Based on LCF, over 40% of "NFL success" for a QB has nothing to do with his college GS and COMP%.
- The margin of error for an LCF prediction is very big.
- PFP is not engaged in statistical money laundering with respect to LCF.
SOLUTION IDEA
OK, so the main problems with the LCF - in terms of everyday application - are that it uses a potentially confusing measure of "NFL success," and that its prediction equation yields wildly different results when you account for the margin of error. If this is the case, it seems to me that the way to improve the LCF for average-football-fan consumption involves (a) using a more widely known outcome measure, and (b) side-stepping the prediction equation.
In terms of NFL outcome measures, I've mentioned several already in this article: COMP%, GS, PAYDs, PATDs, INTs, and TD-INT ratio. As college COMP% and GS are the two stats that make up the LCF, it makes sense to try NFL COMP% and GS as replacements for DYAR/G. Also, in the same way that college PAYDs, PATDs, and INTs are dependent on GS, the NFL versions of these stats are also dependent on GS; as is NFL TD-INT ratio. Therefore, GS is a viable candidate both in its own right and as a proxy for PAYDs, PATDs, INTs, and TD-INT ratio.
To evaluate whether NFL COMP% and GS are good replacements for DYAR/G, we need to find out how much of the variance in COMP% and GS - as compared to DYAR/G - is accounted for by college COMP% and GS. Here's a table summarizing the results:
| NFL Performance Measure | Performance Variance Explained |
| --- | --- |
| DYAR/G | 56.2% |
| COMP% | 45.6% |
| GS | 20.9% |
So it turns out that college COMP% and GS actually account for less of the variation in NFL COMP% and GS than they do for DYAR/G. This means that NFL COMP% and GS are worse than DYAR/G when it comes to choosing an outcome measure for a modified version of the LCF. NFL GS in particular is woefully bad because, as the table shows, almost 80% of a QB's NFL GS has nothing to do with his college COMP% and GS.
Because NFL GS is so bad, perhaps one of those outcomes we tossed out for being dependent on GS might work better. And here's a thought: What if we used NFL FFPts/G as a way to incorporate these outcomes into one summary measure? Well, that's exactly what I did. So how much of the variation in NFL FFPts/G is accounted for by college COMP% and GS? The answer is 48.8%, which means that - although still not as good as DYAR/G - FFPts/G is actually a more useful measure of NFL success than are COMP% and GS. So the moral of the story here is that, if we accept the cost of losing about 7 percentage points of explained variation (56.2% vs. 48.8%), we can gain the benefit of using a more widely known outcome measure. Going back to my QB argument example, your friend will probably know what 10 FFPts/G means in terms of performance, while he probably won't have a clue what 10 DYAR/G means; so that's definitely a plus.
Now, how can we side-step the prediction equation altogether so that we don't have to worry about the LCF's margin-of-error problem? Well, perhaps we can use some other non-MLR system that gives us a prediction that's at least as accurate as LCF.
IDEAL SOLUTION
My solution here involves using categorization to replace prediction equations. Here's what I mean. Suppose we take the general benchmarks set by the LCF (37 GS and 60% COMP%), and group the 35 QBs who were taken in the first 2 rounds of the draft from 1997-2006 into the following categories:
- Group A: QBs who had 37 or more GS and a COMP% of 60% or higher
- Group B: QBs who had 37 or more GS, but had a COMP% lower than 60%
- Group C: QBs who had fewer than 37 GS, but had a COMP% of 60% or higher
- Group D: QBs who had fewer than 37 GS and a COMP% lower than 60%
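The four groups boil down to two yes/no questions, which can be sketched in Python like this. The function name `lcf_group` and the constant names are hypothetical; the cutoffs are the 37 GS and 60% COMP% benchmarks from above.

```python
GS_BENCHMARK = 37       # college games started
COMP_BENCHMARK = 0.60   # college completion percentage, as a fraction


def lcf_group(college_gs, college_comp):
    """Sort a QB into Group A, B, C, or D using the two benchmarks."""
    experienced = college_gs >= GS_BENCHMARK
    accurate = college_comp >= COMP_BENCHMARK
    if experienced and accurate:
        return "A"
    if experienced:
        return "B"
    if accurate:
        return "C"
    return "D"


# Stafford's college numbers from earlier in the article: 33 GS, 56.9% COMP%.
print(lcf_group(33, 0.569))
```

No equation, no calculator: you only need to know which side of each benchmark a QB falls on.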
Here's how the group membership plays out:
| Group A | Group B | Group C | Group D |
| --- | --- | --- | --- |
| Ben Roethlisberger | Cade McNown | Aaron Rodgers | Akili Smith |
| Byron Leftwich | Carson Palmer | Alex Smith | Charlie Batch |
| Chad Pennington | Donovan McNabb | David Carr | Jim Druckenmiller |
| Daunte Culpepper | Jake Plummer | Jason Campbell | Joey Harrington |
| Drew Brees | Jay Cutler | Kellen Clemens | JP Losman |
| Eli Manning | Kyle Boller | Rex Grossman | Marques Tuiasosopo |
| Matt Leinart | Shaun King | Tim Couch | Michael Vick |
| Peyton Manning | | Vince Young | Patrick Ramsey |
| Phillip Rivers | | | Quincy Carter |
| | | | Ryan Leaf |
| | | | Tarvaris Jackson |
Now, rather than focusing on an exact prediction of NFL success, let's just predict some general standards of performance for each group. For instance, let's say that QBs in Groups A and B should have 10 or more FFPts/G (with rounding), whereas QBs in Groups C and D should have less than 10 FFPts/G (Aside: 10 FFPts works out to about 200 yards, 1 TD, and 1 INT). And for the sake of comparison, let's also say that QBs in Groups A and B should have 20 or more DYAR/G (with rounding), whereas QBs in Groups C and D should have less than 20 DYAR/G.² Below is a table showing the groups and their respective stats:
| QB | Group | GS | COMP% | FFPts/G | DYAR/G | LCF FFPts/G | LCF DYAR/G |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Peyton Manning | A | 45 | 62.9% | 16.06 | 99.23 | 12.47 | 48.85 |
| Drew Brees | A | 37 | 61.0% | 14.25 | 62.87 | 10.06 | 22.13 |
| Daunte Culpepper | A | 44 | 63.9% | 13.53 | 43.73 | 12.53 | 48.61 |
| Phillip Rivers | A | 51 | 63.5% | 12.84 | 55.92 | 14.03 | 66.83 |
| Ben Roethlisberger | A | 38 | 65.5% | 12.01 | 47.93 | 11.61 | 36.14 |
| Chad Pennington | A | 51 | 63.4% | 11.48 | 53.82 | 14.00 | 66.58 |
| Eli Manning | A | 38 | 61.1% | 11.36 | 25.89 | 10.32 | 25.12 |
| Byron Leftwich | A | 38 | 65.1% | 9.72 | 25.91 | 11.49 | 35.14 |
| Matt Leinart | A | 39 | 64.8% | 7.63 | 13.24 | 11.64 | 37.13 |
| Carson Palmer | B | 41 | 59.1% | 14.14 | 67.66 | 10.43 | 28.36 |
| Jay Cutler | B | 45 | 57.2% | 13.59 | 63.22 | 10.80 | 34.58 |
| Donovan McNabb | B | 49 | 58.4% | 13.20 | 33.86 | 12.07 | 48.57 |
| Jake Plummer | B | 45 | 55.4% | 10.43 | 15.34 | 10.27 | 30.08 |
| Kyle Boller | B | 40 | 48.0% | 7.66 | -5.21 | 6.95 | -2.18 |
| Shaun King | B | 39 | 55.5% | 7.14 | 1.47 | 8.91 | 13.85 |
| Cade McNown | B | 43 | 55.5% | 6.02 | -12.20 | 9.84 | 24.84 |
| Aaron Rodgers | C | 22 | 63.8% | 11.42 | 36.09 | 7.42 | -12.05 |
| Jason Campbell | C | 30 | 64.6% | 10.66 | 34.14 | 9.50 | 11.92 |
| Tim Couch | C | 24 | 67.1% | 9.15 | -20.73 | 8.84 | 1.70 |
| Rex Grossman | C | 32 | 61.0% | 8.57 | -6.19 | 8.91 | 8.40 |
| David Carr | C | 26 | 62.8% | 8.02 | -15.65 | 8.05 | -3.57 |
| Vince Young | C | 32 | 61.8% | 6.74 | 0.70 | 9.14 | 10.40 |
| Alex Smith | C | 23 | 66.3% | 6.29 | -47.25 | 8.38 | -3.05 |
| Kellen Clemens | C | 32 | 61.0% | 4.30 | -9.71 | 8.91 | 8.40 |
| Joey Harrington | D | 28 | 55.2% | 9.06 | -8.60 | 6.28 | -17.10 |
| Michael Vick | D | 21 | 56.5% | 8.65 | -6.39 | 5.05 | -33.07 |
| Patrick Ramsey | D | 32 | 58.9% | 8.35 | -4.05 | 8.29 | 3.14 |
| Quincy Carter | D | 29 | 56.6% | 8.09 | 0.95 | 6.93 | -10.85 |
| Charlie Batch | D | 24 | 58.0% | 7.62 | 10.24 | 6.18 | -21.08 |
| JP Losman | D | 27 | 57.8% | 7.44 | -16.45 | 6.81 | -13.34 |
| Tarvaris Jackson | D | 36 | 53.6% | 7.27 | 3.48 | 7.66 | 0.86 |
| Ryan Leaf | D | 24 | 54.4% | 5.23 | -62.24 | 5.13 | -30.09 |
| Akili Smith | D | 19 | 56.6% | 3.75 | -56.73 | 4.62 | -38.31 |
| Marques Tuiasosopo | D | 25 | 55.4% | 1.24 | -11.23 | 5.65 | -24.84 |
| Jim Druckenmiller | D | 24 | 53.8% | 0.93 | -43.00 | 4.95 | -31.59 |
Stats displayed in bold italics are cases in which the general performance prediction was wrong. With this in mind, here's how accurate you would have been at predicting NFL success if you simply used group membership and general performance standards:
- Predicting QBs in Groups A and B to average 10 or more FFPts/G: 12 of 16 (75.0%)
- Predicting QBs in Groups A and B to average 20 or more DYAR/G: 11 of 16 (68.8%)
- Predicting QBs in Groups C and D to average less than 10 FFPts/G: 17 of 19 (89.5%)
- Predicting QBs in Groups C and D to average less than 20 DYAR/G: 17 of 19 (89.5%)
Those accuracy rates aren't shabby at all. Although you'd be more accurate about the curds (i.e., Groups C and D) than about the whey (i.e., Groups A and B), simply predicting a general standard of performance based on group membership is pretty accurate overall. Now, for the sake of comparison, let's see how you would have done if you applied the stat-based predictions. In other words, let's see how many times the exact prediction of over (or under) 10 FFPts/G and/or 20 DYAR/G was right. The last 2 columns of the table show the exact predictions for FFPts/G and DYAR/G. Stats displayed in bold are cases in which the exact performance prediction was wrong. Here are the accuracy rates:
- Predicting a given QB to average 10 or more FFPts/G: 13 of 15 (86.7%)
- Predicting a given QB to average less than 10 FFPts/G: 19 of 20 (95.0%)
- Predicting a given QB to average 20 or more DYAR/G: 11 of 14 (78.6%)
- Predicting a given QB to average less than 20 DYAR/G: 19 of 21 (90.5%)
And what if you take group membership into account when evaluating exact prediction accuracy?
- Predicting QBs in Groups A and B to average 10 or more FFPts/G: 14 of 16 (87.5%)
- Predicting QBs in Groups A and B to average 20 or more DYAR/G: 13 of 16 (81.3%)
- Predicting QBs in Groups C and D to average less than 10 FFPts/G: 18 of 19 (94.7%)
- Predicting QBs in Groups C and D to average less than 20 DYAR/G: 17 of 19 (89.5%)
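For anyone who wants to check the tallies, here's a Python sketch that recomputes all three sets of accuracy rates from the table above. It's a sketch under stated assumptions, not necessarily how the counts were originally produced: every name in it is hypothetical, and it assumes "with rounding" means rounding half up to a whole number and that the same rounding applies to the LCF-predicted values.

```python
import math

# (QB, group, FFPts/G, DYAR/G, LCF FFPts/G, LCF DYAR/G), copied from the table.
QBS = [
    ("Peyton Manning", "A", 16.06, 99.23, 12.47, 48.85),
    ("Drew Brees", "A", 14.25, 62.87, 10.06, 22.13),
    ("Daunte Culpepper", "A", 13.53, 43.73, 12.53, 48.61),
    ("Phillip Rivers", "A", 12.84, 55.92, 14.03, 66.83),
    ("Ben Roethlisberger", "A", 12.01, 47.93, 11.61, 36.14),
    ("Chad Pennington", "A", 11.48, 53.82, 14.00, 66.58),
    ("Eli Manning", "A", 11.36, 25.89, 10.32, 25.12),
    ("Byron Leftwich", "A", 9.72, 25.91, 11.49, 35.14),
    ("Matt Leinart", "A", 7.63, 13.24, 11.64, 37.13),
    ("Carson Palmer", "B", 14.14, 67.66, 10.43, 28.36),
    ("Jay Cutler", "B", 13.59, 63.22, 10.80, 34.58),
    ("Donovan McNabb", "B", 13.20, 33.86, 12.07, 48.57),
    ("Jake Plummer", "B", 10.43, 15.34, 10.27, 30.08),
    ("Kyle Boller", "B", 7.66, -5.21, 6.95, -2.18),
    ("Shaun King", "B", 7.14, 1.47, 8.91, 13.85),
    ("Cade McNown", "B", 6.02, -12.20, 9.84, 24.84),
    ("Aaron Rodgers", "C", 11.42, 36.09, 7.42, -12.05),
    ("Jason Campbell", "C", 10.66, 34.14, 9.50, 11.92),
    ("Tim Couch", "C", 9.15, -20.73, 8.84, 1.70),
    ("Rex Grossman", "C", 8.57, -6.19, 8.91, 8.40),
    ("David Carr", "C", 8.02, -15.65, 8.05, -3.57),
    ("Vince Young", "C", 6.74, 0.70, 9.14, 10.40),
    ("Alex Smith", "C", 6.29, -47.25, 8.38, -3.05),
    ("Kellen Clemens", "C", 4.30, -9.71, 8.91, 8.40),
    ("Joey Harrington", "D", 9.06, -8.60, 6.28, -17.10),
    ("Michael Vick", "D", 8.65, -6.39, 5.05, -33.07),
    ("Patrick Ramsey", "D", 8.35, -4.05, 8.29, 3.14),
    ("Quincy Carter", "D", 8.09, 0.95, 6.93, -10.85),
    ("Charlie Batch", "D", 7.62, 10.24, 6.18, -21.08),
    ("JP Losman", "D", 7.44, -16.45, 6.81, -13.34),
    ("Tarvaris Jackson", "D", 7.27, 3.48, 7.66, 0.86),
    ("Ryan Leaf", "D", 5.23, -62.24, 5.13, -30.09),
    ("Akili Smith", "D", 3.75, -56.73, 4.62, -38.31),
    ("Marques Tuiasosopo", "D", 1.24, -11.23, 5.65, -24.84),
    ("Jim Druckenmiller", "D", 0.93, -43.00, 4.95, -31.59),
]
FF_BAR, DYAR_BAR = 10, 20


def clears(value, bar):
    """True if value meets the bar after rounding half up (an assumption)."""
    return math.floor(value + 0.5) >= bar


def tally(rows, is_right):
    """Return (number of correct calls, number of calls made)."""
    return sum(1 for row in rows if is_right(row)), len(rows)


def exact(rows, actual_col, lcf_col, bar, over):
    """Score only the QBs whose LCF value lands on the requested side."""
    called = [q for q in rows if clears(q[lcf_col], bar) == over]
    return tally(called, lambda q: clears(q[actual_col], bar) == over)


def agrees(rows, actual_col, lcf_col, bar):
    """Score how often the LCF value and the actual value land on the same side."""
    return tally(rows, lambda q: clears(q[actual_col], bar) == clears(q[lcf_col], bar))


top = [q for q in QBS if q[1] in ("A", "B")]
bottom = [q for q in QBS if q[1] in ("C", "D")]

# General standards: Groups A/B should clear the bar, Groups C/D should not.
general = {
    "A/B FFPts": tally(top, lambda q: clears(q[2], FF_BAR)),
    "A/B DYAR": tally(top, lambda q: clears(q[3], DYAR_BAR)),
    "C/D FFPts": tally(bottom, lambda q: not clears(q[2], FF_BAR)),
    "C/D DYAR": tally(bottom, lambda q: not clears(q[3], DYAR_BAR)),
}
# Exact predictions, split by which side of the bar the equation put the QB on.
exact_overall = {
    "FFPts 10+": exact(QBS, 2, 4, FF_BAR, True),
    "FFPts <10": exact(QBS, 2, 4, FF_BAR, False),
    "DYAR 20+": exact(QBS, 3, 5, DYAR_BAR, True),
    "DYAR <20": exact(QBS, 3, 5, DYAR_BAR, False),
}
# Exact predictions, scored within each pair of groups.
exact_by_group = {
    "A/B FFPts": agrees(top, 2, 4, FF_BAR),
    "A/B DYAR": agrees(top, 3, 5, DYAR_BAR),
    "C/D FFPts": agrees(bottom, 2, 4, FF_BAR),
    "C/D DYAR": agrees(bottom, 3, 5, DYAR_BAR),
}

for label, result in (("general", general), ("exact", exact_overall),
                      ("exact by group", exact_by_group)):
    print(label, result)
```

Each result is a (correct, total) pair, so dividing the first number by the second gives the percentages listed above.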
Obviously, the stat-based exact predictions are going to be slightly better than the non-stat-based general predictions because - as you'll recall - the margin of error associated with the exact prediction method (i.e., LCF equation) is slightly better than the unknown error associated with the general prediction method (i.e., grouped performance standards). Nevertheless, the point here is that the accuracy cost doesn't outweigh the applied benefit. In other words, what you lose in accuracy, you make up for by actually being able to come up with a prediction quickly and articulate that prediction using language your friend understands.
One thing I'll add here before moving on is that my solution complies entirely with FO's acknowledgement that variability - indeed, a lot of it - exists in their statistics and predictions. As they've taken great pains to admit, the aim of their stats is to show that some players (or teams) are better than others; not that their stats are the Da Vinci Code of football performance. In other words, they don't claim that knowing a secret code means being able to exactly predict the future (or describe/explain the past). Therefore, because our aims and variability admissions are similar, the main advantage of my solution is that it addresses the, "WTF is DYAR/G?" problem.
APPLYING THE SOLUTION
So I think I've come up with a much more practical application of the LCF that preserves Lewin's basic finding about the exclusive importance of college GS and COMP%. My goal here was not to rip Lewin to shreds. In fact, as I said earlier, his analysis was spot-on from both a methodological and results perspective. Unfortunately, the LCF just suffers from a serious variation problem.
Basically, Lewin's analysis was a slave to sample size. The reason why there's such a huge margin of error associated with the LCF is because the analysis that produced it was based on data from only 35 QB draft picks. It's actually quite remarkable that he was able to obtain such strong relationships between DYAR/G, GS, and COMP% with such a small sample size. Therefore, given the strength of these relationships, there's no doubt in my mind that the margin of error will shrink as future QB draft picks are added to the data set. Indeed, the main way to reduce variability in MLR predictions (and parameters) is to increase your sample size. There's no statistical magic here: The laws of probability dictate that educated guesses get better and better as the amount of information they're based on increases (i.e., the more they're educated).
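To put a rough number on "the margin of error will shrink," here's a back-of-the-envelope Python sketch. It leans on a standard rule of thumb: all else equal, the margin of error on a regression parameter shrinks in proportion to 1 divided by the square root of the sample size. That assumes future QBs look statistically like the 35 already in the data, which is an assumption, not a guarantee. The function name and the future sample sizes are hypothetical.

```python
import math


def relative_margin(n_now, n_future):
    """Approximate ratio of future margin of error to current margin of error,
    assuming the margin scales with 1 / sqrt(sample size)."""
    return math.sqrt(n_now / n_future)


# Hypothetical future sample sizes, starting from the current 35 QBs.
for n_future in (70, 140):
    print(n_future, round(relative_margin(35, n_future), 2))
```

In other words, under that assumption you have to quadruple the number of QBs just to cut the margin of error in half, which is why waiting for more data takes a while.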
Nevertheless - and I hope you're reading this methodrampage - the analysis is what it is. The state of the LCF at this point in time dictates that we have to wait for more data (i.e., wait for the sample size to get big enough). As I said, I don't doubt that future data will vindicate LCF. I just can't get behind it right now as a valid application for the average NFL fan (or NN member).
With that said, let's apply my more practical application to this year's draft. As I mentioned earlier, Stafford, Sanchez, and Freeman are, by consensus, the QBs who will be selected in the first 2 rounds. Here are their stats, group membership, and predicted performances:
| QB | Group | GS | COMP% | 10+ FFPts/G | 20+ DYAR/G | LCF FFPts/G | LCF DYAR/G |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Matt Stafford | D | 33 | 56.9% | No | No | 7.94 | 0.88 |
| Mark Sanchez | C | 15 | 64.3% | No | No | 5.94 | -30.22 |
| Josh Freeman | D | 35 | 59.1% | No | No | 9.04 | 11.88 |
Based on my general standards of performance, these 3 QBs are likely to fail with respect to having 10+ FFPts/G during their NFL careers. If this doesn't give the Lions pause about taking Stafford at #1, I don't know what will. Likewise, if this doesn't give the 49ers, NN members, and pundits pause about taking Sanchez at #10, I don't know what will. Serviceable QBs perhaps; just not top-10 picks. And if we want to quantify just how unlikely it would be for Stafford or Sanchez to end up averaging 10+ FFPts/G in their NFL careers, all we have to do is go back to the accuracy rates I presented earlier: 17 of the 19 Group C and D QBs from 1997-2006 averaged less than 10 FFPts/G during their careers. This means that, based on previous history, there's only about a 10% chance that Stafford and Sanchez buck the trend. Notice that I'm not saying a 0% chance; it's just highly unlikely. If it happens, it doesn't mean the stats were wrong. It just means these QBs defied the odds, and there's nothing inherently wrong with that (esp. for their bank accounts).
Although I obviously prefer the general predictions, if we compare the slightly more-accurate exact predictions in this table with the earlier one detailing prior draft picks, we get some sobering results. Stafford is closest to Tarvaris Jackson and David Carr in predicted FFPts/G, and closest to Jackson and Tim Couch in predicted DYAR/G. Sanchez is closest to Marques Tuiasosopo and Charlie Batch in predicted FFPts/G, and closest to - wait for it - Ryan Leaf and Jim Druckenmiller in predicted DYAR/G. I'm going to pause for a few seconds and let that sink in.................................................................................OK, done vomiting; back to the show. Finally, Freeman is closest to Vince Young and Shaun King in predicted FFPts/G, and closest to Jason Campbell and Vince Young in predicted DYAR/G.
So, based on exact predictions, the QBs should be ranked Freeman, Stafford, Sanchez. However, as is the major take-home of this article, we can't be so trusting of the exact predictions. The better thing to do - from a practical perspective - is simply to say that none of the 3 QBs are going to meet 2 general standards of good NFL performance. In this way, we can replace the exact rankings with general groupings that are just as accurate overall.
And if you want to really win that argument with your friend, then instead of saying, "Matt Stafford is going to have a completion percentage between 8% and 94%," or worse yet, "Matt Stafford is going to have a DYAR/G between -240 and 245," say, "I bet you Matt Stafford doesn't score more than 10 fantasy football points per game during his career." If your friend took that bet, you'd be a winner almost 90% of the time. Feel free to donate those proceeds to NN. Fooch needs the money (just kidding, Fooch).
1 Let me stress that this is the LCF equation that I came up with based on my replication study. It's probably not the same exact equation used by Lewin and FO in their publications because Lewin has no doubt refined the equation since 2006 as more data has come in. Also, unlike me, he has full access to all FO stats that aren't on the website, along with all of their game charts, etc. The point is that if you want Lewin's actual LCF predictions, you should rely on Pro Football Prospectus and FO's website, rather than my replication-derived equation. And if you're wondering whether this is a copyright CYA on my part, it is.
2 I used 10 FFPts/G and 20 DYAR/G as general standards because (a) that's what you get when you plug 37 GS and 60% COMP% into their respective equations, and (b) it makes the group membership accuracy as close as possible to the exact prediction accuracy.
**DYAR and DVOA statistics used to produce this article were obtained from Football Outsiders.
Comments
Isn't Sanchez the most skewed due to his lack of games played?
I would think on the most basic non in depth analysis level that the fact he played only 16 games heavily skews his results
btw, for the record my QB board looks like this
Sanchez >>>>>>>>>> Stafford >>>>>Freeman
by GreatOden'sRaven on Apr 24, 2009 12:11 PM PDT
sanchez's GS...
is lower than any QB used in the analysis. the 2 next-lowest are akili (19) and vick (21). not exactly a ringing endorsement for QBs with that little starting experience in college. granted, he’s helped out by the fact that his comp% was so high. that’s certainly different from akili and vick, who were both in the mid 50’s. as of right now, and only in terms of the stats, there’s no reason to believe he’s going to buck the trend. i think that it’s a debatable point from a non-stat perspective, but anyone who talks like they’re sure he’s going to be a stud is a snake-oil salesman. same goes for stafford and freeman for that matter. i guess the way to look at this is that all 3 QB’s stats (i.e., genes) predispose them to score less than 10 FFPts/G during their careers. whether or not they actually do is going to depend on whether that predisposition is overcome with coaching, work ethic, etc. (i.e., environment).
by (Florida) Danny Tuccitto on Apr 24, 2009 2:04 PM PDT
Wait a second
granted, [Sanchez is] helped out by the fact that his comp% was so high. that’s certainly different from akili and vick, who were both in the mid 50’s. as of right now, and only in terms of the stats, there’s no reason to believe he’s going to buck the trend.
What trend are we talking about? The trend that he’s not going to be a good NFL QB? I’ll believe that a lack of games started in college would make him riskier than a QB with a similar completion % with 37+ games started in college, but you’ve got one hell of a sale to make if you’re telling me that a college QB’s future success in the NFL is contingent on him making 37+ starts in college.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 2:26 PM PDT
the trend...
…i’m talking about is that 17 of 19 college QBs in groups C and D did not achieve the 200 yds, 1 TD, and 1 INT per game threshold during their careers.
also, i put the following phrase in the passage you cited, which kind of answers your question:
and only in terms of the stats
a college QB’s future success is “contingent on” a hell of a lot more than his stats.
by (Florida) Danny Tuccitto on Apr 24, 2009 2:36 PM PDT
Even in terms of stats
I’m not buying
Based on my general standards of performance, these 3 QBs are likely to fail
What is it about having 37+ games started that makes a QB better than one without? I think my issue is you’re saying that Sanchez isn’t going to be good because he didn’t start 37+ games which is nonsense. It may make him harder to project but it doesn’t mean he’s destined to fail.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 2:52 PM PDT
not destined, but more likely
No statistic is going to make someone destined to fail, but he’s certainly less likely to have immediate success in the NFL based on lack of experience.
One of the common traits of the guys in the A group was the ability to immediately contribute. I believe that some of the guys in groups C/D could have succeeded if given the Philip Rivers or Aaron Rodgers treatment and shown the bench for a year or two.
I think Sanchez’s chances of becoming a bust go up a lot if he’s thrown into a starting role this year. Experience is always needed as a professional in any industry or sport, so it’s clearly relevant when examining a QB prospect.
Rays in '08.... Desmond Jennings - the breakout continues.....
by youALREADYknow on Apr 24, 2009 3:00 PM PDT
That's my issue
I don’t think a lack of games started makes him less likely to have success. I think a lack of games started just makes him harder to project. I’m of the opinion that no QB should start his rookie season, as the majority of QBs, 37+ starts in college or not, would benefit by learning from the sidelines their first year. Maybe due to his lack of starts Sanchez is more of a “project,” and he might not be ready to contribute until after his 3rd year, but I don’t believe the LCF is talking about immediate success, just success in general.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 3:14 PM PDT
after that, i agree with you
I understand where you’re coming from now. I also find it funny that I consider Sanchez a more NFL-ready QB than Stafford or Freeman even though he has fewer game reps.
Still, for the purposes of this exercise and evaluating LCF I think that Danny did an excellent job turning their method into a more efficient one. Any true in-depth statistical analysis would have to go deeper into performance-related stats, but that wasn’t really the purpose of this forecasting model.
Rays in '08.... Desmond Jennings - the breakout continues.....
by youALREADYknow on Apr 24, 2009 3:30 PM PDT
I'll admit that what Danny has put together is an impressive piece of work
I just don’t like the LCF all that much because I think it’s somewhat hypocritical, insofar as it suggests that NFL scouts know enough about college talent that they can determine who’s worthy of being selected in the first two rounds but at the same time, by its nature, suggests that they don’t know [site decorum] because they keep drafting QBs that don’t meet the 37+ starts and high completion % requirements in those first two rounds.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 3:38 PM PDT
i see what you're saying...
…and i don’t disagree. what you’re proposing here is a potential mechanism for why lower GS for a Rd1 or Rd2 QB predicts lower NFL success, i.e., that he needs seasoning, but doesn’t get it at the pro level. if you look at my Group C, you find that the guys with fewer starts but a high comp % (like sanchez) turned out better when they sat and seasoned (rodgers, campbell) than when they were thrown into the fire (smith, carr, rex, vince, couch). so perhaps you’re right.
the only problem is that, when a QB is taken in the 1st or 2nd round, it’s far more likely that they’re not going to be allowed to season by sitting on the bench. the teams are too highly committed to the QB to wait on him. GB and WAS had the luxuries of veteran QBs for rodgers and campbell to sit behind.
so basically, yeah…sanchez can be successful if he sits for a while and seasons. however, getting selected in the 1st round makes him less likely to have that luxury. quite a catch-22 here. looking at potential teams for him, i think he’s a good fit in SEA for this very reason. SF, perhaps, but only if hill is able to hold down the fort for a few years, and the whole team isn’t blown up again because they suck this season. to boot, if SF were to suck this season, i have to believe singletary would turn to an unseasoned sanchez to save his job.
by (Florida) Danny Tuccitto on Apr 24, 2009 3:57 PM PDT
method...
…i get what you’re saying…
i used the word “likely” very consciously in that sentence, and for an obvious reason. i think i made it pretty clear in the piece that no one can be certain. that’s the whole point of my LCF revision. i’m not saying, as you quote:
Sanchez isn’t going to be good because he didn’t start 37+ games
that would be nonsense if i said that. i’m saying sanchez isn’t likely to be good based on the fact that both his comp% and GS put him in a group that has not historically achieved the 10 FFPts/G threshold in their careers. that’s all. no more no less. not certain, just an educated guess.
by (Florida) Danny Tuccitto on Apr 24, 2009 3:01 PM PDT
Ok, I'm guilty of using destined when you meant likely
But what is it about Sanchez’ completion % that you or the LCF doesn’t like? Please correct me if I’m wrong, but it’s the lack of games started that “puts him into a group that has not historically achieved the 10 FFPts/G threshold”. Now lack of starts and completion % would both be reasons why Stafford and Freeman are “likely to fail” but Sanchez’ exemplary completion % isn’t a negative.
By labeling a QB as a Group B QB are you suggesting that he’s a better prospect than a Group C QB? I think I would prefer a higher completion % with less starts than more starts with a lower completion %.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 3:32 PM PDT
just to clarify here
it’s not that LCF or i “don’t like” comp%. actually, we like it a lot. it’s just that, as i reported in the article, GS has twice as big of an impact as comp% when it comes to predicting performance. so basically, although only 2 stats predict NFL performance, comp% is only half as good of a predictor as GS. that’s why the 37+ GS guys are in groups A and B. the predictions are nowhere near accurate anymore if you switch the numbers in the equation so that comp% has double the impact that GS has.
by (Florida) Danny Tuccitto on Apr 24, 2009 4:02 PM PDT
well stafford is going to be throwing to calvin
so that’s what 10 TDs right there.
http://www.49ersboard.blogspot.com
Wow! Great work.. The next Nate Silver?
I found this pretty interesting as it confirmed my own intuitive placement of GS(experience) and completion percentage as the most important QB stats. But I find it less than useful except from a historical perspective because of the inability to apply the formula to QBs in rounds beyond the 2nd. The thing I found somewhat surprising is that the positive correlation between GS and success is so much higher than the completion % correlation – I have intuitively thought the reverse.
So how about adding some variables to look at in order to expand the range of application to the later rounds? I would add variables that cover the physical attributes of the player – height, 40 time, short-shuttle time, and 3-cone drill time. Why? Because a QB’s height, reaction time, quickness, and agility(elusiveness) affect his performance and can be measured. A QB with a slow reaction time, or one who cannot maneuver in the pocket or scramble a little is not likely to make it in the NFL.
So what are the correlations between these measurables and success in the NFL? I don’t know, someone must do the work. But I suspect that looking for correlations in these data, either positive or negative, could produce some interesting results.
But all in all, I don’t think statistical analysis is going to replace many good talent evaluators in the near future.
PS: In case you don’t know who Nate Silver is, he’s the sports and political statistician who makes the best predictions in baseball AND politics, bar none.
a couple of things...
1) FO uses something called a speed score, which is like an adjusted 40-time, to predict RB performance
2) i don’t think stats are going to replace scouts any time soon either
3) i love nate silver, as does basically anyone who’s ever encountered his work. it doesn’t hurt that he’s a poker player too. :-)
by (Florida) Danny Tuccitto on Apr 24, 2009 2:14 PM PDT
oops...
…here’s the speed score article by Doug Farrar
by (Florida) Danny Tuccitto on Apr 24, 2009 2:16 PM PDT
Thanks for the link. Interesting.
But I was really happy to see my favorite running backs in this draft all have Speed Scores over 100 – Andre Brown, Javarris Williams, and Kory Sheets.
Somebody would still have to do the work to see what kind of correlation there is between Speed Score and QB performance. I suspect the sample size isn’t large enough yet to measure this one. But who knows.
oh, and...
…thanks for the kind words.
by (Florida) Danny Tuccitto on Apr 24, 2009 2:18 PM PDT
Interested in doing a statistical draft post-mortem?
I have been intending to do an evaluation of the “rankers”, the people who rank players in numerical order for the entire draft class and the people who rank players in positional order, in order to see who can be believed about what. The methodology is simple – just take the absolute value of the difference between the ranking # and the actual selection #, and then find the mean and standard deviations out to the 3rd SD for various segments of the draft (1st round, 1-2 rounds, 3-5 rounds, and 6-7 rounds) in order to look at prediction accuracy.
If you are going to do something similar, I’ll defer to your expertise.
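The ranker evaluation described above could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the rows are made-up placeholder data, the function names are hypothetical, and "out to the 3rd SD" is read as reporting the mean error plus 1, 2, and 3 standard deviations for each draft segment.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical rows: (player, analyst's overall rank, actual pick number, round drafted).
# None of these values are real draft data.
picks = [
    ("Player 1", 3, 1, 1),
    ("Player 2", 10, 17, 1),
    ("Player 3", 40, 35, 2),
    ("Player 4", 90, 70, 3),
    ("Player 5", 120, 150, 5),
    ("Player 6", 200, 210, 7),
]

# Draft segments named in the comment, as inclusive round ranges.
segments = {
    "1st round": (1, 1),
    "rounds 1-2": (1, 2),
    "rounds 3-5": (3, 5),
    "rounds 6-7": (6, 7),
}


def segment_errors(rows, segs):
    """Collect |rank - pick| for every row that falls inside each segment."""
    errors = defaultdict(list)
    for _player, rank, pick, rnd in rows:
        for name, (lo, hi) in segs.items():
            if lo <= rnd <= hi:
                errors[name].append(abs(rank - pick))
    return errors


def summarize(errs):
    """Mean and population SD of absolute error, plus mean + 1/2/3 SD cutoffs."""
    m, sd = mean(errs), pstdev(errs)
    return {"mean": m, "sd": sd, "bands": [m + k * sd for k in (1, 2, 3)]}


for name, errs in segment_errors(picks, segments).items():
    print(name, summarize(errs))
```

A lower mean absolute error within a segment would mark a ranker as more believable for that part of the draft; comparing rankers would mean running the same summary once per ranker.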
Much better
* Group A: QBs that had 37 or more GS and a COMP% of 60% or higher
* Group B: QBs that had 37 or more GS, but had a COMP% lower than 60%
* Group C: QBs that had less than 37 starts, but had a COMP% of 60% or higher
* Group D: QBs that had less than 37 starts and a COMP% lower than 60%
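The four groups above reduce to two threshold checks. A minimal sketch in Python, where the function name `lcf_group` and the sample inputs are assumptions for illustration; only the 37 GS and 60% COMP% cutoffs come from the article:

```python
def lcf_group(games_started: int, comp_pct: float) -> str:
    """Return the article's group label (A-D) for a 1st- or 2nd-round QB.

    The thresholds (37 GS, 60% COMP%) are the ones quoted above; the function
    name and signature are hypothetical, not part of the LCF itself.
    """
    experienced = games_started >= 37
    accurate = comp_pct >= 60.0
    if experienced and accurate:
        return "A"
    if experienced:
        return "B"
    if accurate:
        return "C"
    return "D"


# Illustrative inputs only; these are not any real prospect's stats.
print(lcf_group(40, 62.5))  # A
print(lcf_group(40, 57.0))  # B
print(lcf_group(16, 65.0))  # C
print(lcf_group(20, 55.0))  # D
```

Note that the grouping only makes sense for QBs taken in the first two rounds, since that is the range the LCF is meant to cover.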
Those are better categories for comparison, but I think to truly evaluate a QB’s performance in college we have to go one step further. The problem is finding a way to normalize completion percentage.
Variables that I don’t feel are accounted for: Offense type (West coast, spread, vertical), Completion percentage by throw (short, intermediate, deep), # of drops by WR’s, offensive line protection, quality of opponents
I love what you’ve done here as a more simplified way of grouping comparable prospects though. I feel like based on your model, you would have Pat White in the first category (A) but his completion percentage is skewed by their offensive system. Whether or not he’s that kind of QB is debatable (I think he’s the 3rd best prospect at QB).
Rays in '08.... Desmond Jennings - the breakout continues.....
i think...
…it might be useful for sure to adjust college comp% for opponents’ pass defense, let’s say. i don’t think there’s any doubt that having a high comp% in the WAC or this year’s Big 12 is a lot easier than having a high comp% in the SEC. even if it was just a general adjustment based on conference rating, that might be a simple way of helping the prediction.
in terms of pat white, his stats do indeed put him in group A. two problems though:
1. even my version of the LCF doesn’t work unless he’s taken before the 3rd round
2. from what i’ve read, he’s projected more as a non-QB in the NFL
if any of these things change, then pat white is worth looking at in re LCF.
thanks for the compliments, by the way.
by (Florida) Danny Tuccitto on Apr 24, 2009 1:53 PM PDT
WAC/BigXII
A lot of the problem in the Big XII is that the conference is filled with teams running spread offenses. I’m sure if someone did a statistical analysis of completion percentages by offense type, the difference in average completion percentage between spread offenses and pro-style offenses would be ~5%. That makes a huge difference in a study like this.
As for next year’s class, all of the top QB’s are running some form of the spread offense so it will be interesting to see how they are graded going into the draft.
Overall great work you did here. If you ever decide to go further on the analysis of completion percentage in college, I’d gladly dig up some research.
Rays in '08.... Desmond Jennings - the breakout continues.....
by youALREADYknow on Apr 24, 2009 2:50 PM PDT
I have to Agree
The model doesn’t account for the offense emphasis of attack : completion % vs big playability
Also interested in seeing where David Klingler would rank and all those 90’s Florida QBs. I can see where he limited the model to 1st and 2nd round picks. He is eliminating QB’s who appeared to be carried by a great supporting cast.
Who wud of thunk it?
A more accurate QB in college should be better in the pros than a less accurate QB in college and a college QB with a long track record (sample size) should be a safer bet than a college QB with a short track record. Simply ground breaking. (Florida Danny do not take this as a knock on your work which I found impressive, this is a knock on the LCF).
So by the LCF Tim Tebow should be the first QB drafted next year? Interesting.
Don't sweat it. I'm illiterate.
Colt McCoy will have the starts too.
So I’ll retract my statement that the LCF would suggest that Tebow should be the top QB drafted but it’s at least pointing to Tebow being an elite QB prospect.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 2:28 PM PDT
that instantly makes stats invalid
tebow is a 4th rounder/pat white-esque qb in the pros
by GreatOden'sRaven on Apr 24, 2009 10:49 PM PDT
i appreciate your point...
…i would just say that, in the early stages of a field of study, there are a hell of a lot of “duh” results. this doesn’t diminish their worthiness. it’s one thing to theorize or give the opinion that comp% and GS are important. it’s another thing to actually prove it beyond 95% confidence (which is what lewin and i have done). similarly, for every “duh” result we might find, there are a good number of times in which the “duh” result doesn’t show up. i cited one in the article. namely, it’s pretty “duh” from a simple rhetorical and abstract perspective that going to a good offense (ala big ben and philip rivers) is better for a college QB than going to a bad offense (ala akili smith and tim couch). it turned out though that this supposedly “duh” relationship didn’t pan out. “inherited offensive situation,” which i measured in 4 different ways for the sake of thoroughness, had absolutely zero impact on NFL QB performance for the 1st and 2nd round picks. so i guess it wasn’t so “duh” after all.
so i think you shouldn’t be so quick to minimize the “duh” finding in the context of a very young field of study and a simultaneous weeding out of other supposedly “duh” predictors that aren’t so “duh” after all. if i was an astronomer, and i told you today that my analysis proves the world is round, you’d be totally justified in saying, “no sh*t, sherlock.” the scientific knowledge based on 500 or so years of astronomy renders most everything except for “there’s life on mars,” as a genuinely “duh” result. 30 or so years of football stat research doesn’t even come close to comparing. there are still a lot of “duh” things that need to be proven beyond a scientific doubt. basically, we’ve got to crawl before we walk here.
thanks for making it clear you’re not bashing me. :-) i think it’s because i gave you a mention in the piece. ;-) jk
by (Florida) Danny Tuccitto on Apr 24, 2009 2:54 PM PDT
If you limit your exposure you can prove anything
37+ games started, 60+ completion % and a top 2 round pick is extremely selective. 9 QBs have met those requirements in the last 12 years. So there are only 3 QBs worth drafting every 4 years.
Don't sweat it. I'm illiterate.
by methodrampage on Apr 24, 2009 3:05 PM PDT
ok, from a purely probabilistic perspective...
…you’re actually dead wrong about being able to prove anything by limiting exposure. the diametric opposite is true. the more you widen your exposure, the more likely you’re going to prove something that isn’t real, simply because you have so much information. it’s actually the very fact that lewin found such strong relationships for comp% and GS despite such limited sample size that is the major contribution of his study.
now, in terms of your “worthy to draft” argument, i’d again say you’re extrapolating way too much here. i would sure hope that no GM is basing his decision on whether a QB is worthy or not worthy of being drafted entirely on that QB’s college GS and comp%. also, this analysis identifies who was worthy to draft in the first 2 rounds, but doesn’t say anything about which QBs in the entire draft are worthy of drafting. so your 3 every 4 yrs conclusion assumes that LCF or I have anything to say at all about worthiness after round 2. we’re agnostic about it, not deist. so, if you limit things to only what i’m talking about in my article, then there have been 16 QBs in the past 10 1st- and 2nd-rounds who were worthy of drafting if “worthiness” to a given GM at the time meant the QB would average at least 200 yds-1 TD-1 INT in his career.
by (Florida) Danny Tuccitto on Apr 24, 2009 3:45 PM PDT
wow
great work Danny. If we end up taking Sanchez I’ll be furious
Still defending Rich Aurilia, and the Niners' classic unis
i will be excited in 2 years
when he starts to mature.
Sanchez has all the skills. mark my words. In 4 years he is a Pro Bowler
by GreatOden'sRaven on Apr 24, 2009 10:48 PM PDT
