2011 NFL Draft: An Expanded Model for Predicting QB Career Performance

AUTHOR'S NOTE: This is technically Part 5 of a 5-part series on predicting the career performance of NFL QBs. However, it's reporting such a massive analysis that I think it's best to split into two parts, you know, for sanity's sake. Oh, and p.s. WARNING: EXPLICIT STATISTICAL CONTENT.

Leading up to this final installment, I've offered up some basic philosophical and empirical thoughts about attempting to project college QBs into the NFL. First, methodologically speaking, it would be nice if we had a system that predicts fantasy football points per game (FFPts/G), is not limited to the top 2 rounds of the draft, and avoids the use of artificial make-or-break thresholds like "37 starts." In terms of actual results, I've shown that the Wonderlic and various team-related stats are meaningless, whereas factors related to draft order are potentially meaningful. Well, today, I'm going to put all of the above together, and present to you a statistical model for predicting QBs' career performance.

MO' QBS, MO' PROBLEMS

As easy as that sounds given everything I've already hashed out in previous posts, there are still a couple of additional hurdles we have to jump over before getting started, all of which are related to one inherent problem with QBs who get drafted into the NFL, especially in the later rounds: it's almost even money that they're never going to have much of an NFL career. In fact, out of the 163 QBs drafted from 1993-2006, 36 never attempted a pass in the NFL, and 64 played fewer than 8 games:

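| Rd | QBs | > 8 Gs | % > 8 Gs |
| --- | --- | --- | --- |
| 1 | 33 | 32 | 97.0% |
| 2 | 11 | 11 | 100.0% |
| 3 | 18 | 15 | 83.3% |
| 4 | 22 | 15 | 68.2% |
| 5 | 16 | 4 | 25.0% |
| 6 | 28 | 12 | 42.9% |
| 7 | 35 | 10 | 28.6% |
| Total | 163 | 99 | 60.7% |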

As you can see, this poses a problem for our desire to extend the Lewin Career Forecast (LCF) past the first 4 rounds of QBs. More on that in a moment. For now, the point here is that 36 goose eggs in the "performance" column are far too many to include in our data set; worse, including them would basically render the prediction model invalid. Here's a graph illustrating how it's too many:

[Figure: histogram of career FFPts/G for QB draft picks, Rounds 1-7]

As you can see in the graph (which, incidentally, is called a histogram), there's a massive spike between -0.29 and 1.09 FFPts/G, such that the career performance of QB draft picks isn't bell-shaped (i.e., normally distributed). In order to run a regression analysis, your outcome measure - in this case, career performance - must have a bell-shaped distribution. Therefore, to just run this analysis with the 7-round data as is would mean we'd be violating that requirement.
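If you want a number to go with the picture, one common way to quantify "not bell-shaped" is skewness. The snippet below is only a minimal sketch: the `skewness` helper and the sample values are made up for illustration (they are NOT my actual data), and it's not necessarily the check I ran. It just shows how a pile of zeroes drags a distribution away from symmetry.

```python
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: 0 for a symmetric sample, positive for a long right tail."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Hypothetical FFPts/G values, NOT the real data set:
# a stack of zeroes (QBs who never played) plus a spread of actual careers.
zero_heavy = [0.0] * 8 + [2.5, 4.0, 6.0, 7.5, 9.0, 11.0, 14.0]
no_zeroes = [v for v in zero_heavy if v > 0]

print("with zeroes:   ", round(skewness(zero_heavy), 2))
print("without zeroes:", round(skewness(no_zeroes), 2))
```

In this toy sample, the version with the zeroes is noticeably more lopsided than the version without them, which is the same problem the histogram shows.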

So what do we do? Well, in my mind, there are only 3 options besides "quit and go home":

  1. Eliminate some of the later rounds because that's where the vast majority of the zeroes are.
  2. Replace the zeroes with some appropriate value that better reflects minimum performance.
  3. Do both #1 and #2.

After the jump, I'll tell you which option I chose, and present the prediction model...

I decided to have my cake and eat it too on this one. That is, I chose to run the analyses 2 different ways: once using data from fewer than 7 rounds without replacing the zeroes, and once using data from fewer than 7 rounds with replacement. For the latter option, the question becomes, "What number do we replace the zeroes with?" Well, in a post last year evaluating the efficiency of NFL teams when it comes to correctly slotting QBs in the draft order, Brian Burke of Advanced NFL Stats defined minimum performance as "at least 200 career attempts," and then replaced the performance of all QB draft picks with fewer than 200 attempts with the average performance of those with 1-199 attempts.

That seems as good an approach as any to me, so I used it here with only one change: I changed 200 attempts to 8 games because my outcome measure here is a per-game statistic, not a per-attempt statistic like the one Burke used (hence the "> 8 Gs" column in the table above). As a quality control check for switching from an attempts qualification to a games qualification, I looked to see whether my non-qualifying QBs' average performance resembled that of Burke's non-qualifiers. In a happy bit of symmetry, it turns out that QBs in my sample who played 1-7 games averaged 1.21 FFPts/G, which falls at the 5th percentile of FFPts/G for the 99 qualifying QBs (for reference, the median among those who played 8 or more games was 6.78). Burke's non-qualifiers represented the 6th percentile. Not bad.
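For the code-inclined, here's a minimal sketch of that replacement rule. The function name, the record layout (`games`, `ffpts_g`), and the four sample QBs are all hypothetical; the only things taken from the post are the 8-game cutoff and the idea of filling in with the average of the QBs who played 1-7 games.

```python
from statistics import mean

MIN_GAMES = 8  # games-played qualification used in this post


def replace_nonqualifiers(qbs, min_games=MIN_GAMES):
    """Give every QB under min_games the mean FFPts/G of QBs with 1..min_games-1 games.

    Raises statistics.StatisticsError if nobody played between 1 and min_games-1 games.
    """
    partial = [q["ffpts_g"] for q in qbs if 1 <= q["games"] < min_games]
    fill = mean(partial)
    return [
        {**q, "ffpts_g": q["ffpts_g"] if q["games"] >= min_games else fill}
        for q in qbs
    ]


# Hypothetical records, NOT real players.
sample = [
    {"name": "QB A", "games": 0, "ffpts_g": 0.0},   # never played: gets the fill value
    {"name": "QB B", "games": 3, "ffpts_g": 1.0},   # 1-7 games: feeds the fill value
    {"name": "QB C", "games": 5, "ffpts_g": 2.0},   # 1-7 games: feeds the fill value
    {"name": "QB D", "games": 40, "ffpts_g": 9.5},  # qualifier: left alone
]

adjusted = replace_nonqualifiers(sample)
print([q["ffpts_g"] for q in adjusted])
```

Note that the zero-game QBs don't contribute to the fill value; they only receive it. That matches Burke's approach of averaging over the 1-199 attempt group.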

So, after substituting 1.21 FFPts/G for the zeroes, as well as for the actual career performance of QBs with fewer than 8 games played in their careers, and paring down the number of rounds one at a time from 7 down to 2 (aka Option 3), I found that 3 rounds is where the career performance distribution comes closest to bell-shaped:

[Figure: histogram of career FFPts/G for QB draft picks, Rounds 1-3, with replacement of non-qualifiers]

Furthermore, after paring down the number of rounds one at a time from 7 down to 2 without replacing the non-qualifiers' FFPts/G (aka Option 1), I found that 4 rounds is where the career performance distribution comes closest to bell-shaped:

[Figure: histogram of career FFPts/G for QB draft picks, Rounds 1-4, without replacement]

So, as I said, I did the analysis two ways: once using the data for QBs drafted in the first 3 rounds with replacement of non-qualifying QBs' career performance, and once using the data for QBs drafted in the first 4 rounds without replacement.* In both cases, we have bell-shaped distributions, so the regression analyses will spit out valid results.

Now, I know this doesn't fully satisfy our desire to predict performance for all QB draft picks, but sometimes you just have to make sacrifices for the greater good. If we consult that qualifying QB table at the top of the post, there's no getting around the fact that, if you just said all QBs drafted after Round 4 aren't going to amount to anything, you'd have 2-to-1 odds in your favor of being right (i.e., it would happen 67% of the time). That's a bet I'll take any day of the week, and twice on an NFL Sunday. Let me put it to you this way. Of the 79 QBs drafted in Rounds 5-7 from 1993-2006, only Tom Brady, Marc Bulger, and Mark Brunell have averaged at least 10 FFPts/G during their careers, and Mark Brunell was a 5th-round pick taken at a selection (#118) that would be a mid-4th-rounder today. So, just remember, for every Tom Brady, there are about 20 Chuck Clements (btw, Clements was Bill Walsh's sleeper pick in the '97 draft).
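If you'd like to check the 2-to-1 arithmetic, here's a quick sketch using the counts from the qualifying QB table. The dictionary and function names are just illustrative; the numbers are the table's.

```python
# Counts by round from the qualifying QB table (1993-2006 drafts).
drafted = {1: 33, 2: 11, 3: 18, 4: 22, 5: 16, 6: 28, 7: 35}
qualified = {1: 32, 2: 11, 3: 15, 4: 15, 5: 4, 6: 12, 7: 10}


def miss_rate(rounds):
    """Share of QBs drafted in the given rounds who played fewer than 8 games."""
    rounds = list(rounds)
    picks = sum(drafted[r] for r in rounds)
    hits = sum(qualified[r] for r in rounds)
    return (picks - hits) / picks


late = miss_rate([5, 6, 7])       # 53 of 79 late-round picks never qualified
overall = miss_rate(range(1, 8))  # 64 of 163 across all seven rounds

print(f"Rounds 5-7 miss rate: {late:.1%}")
print(f"Odds of a miss in Rounds 5-7: {late / (1 - late):.2f} to 1")
print(f"All-rounds miss rate: {overall:.1%}")
```

The late-round miss rate works out to 53 out of 79, which is where the 67% and the roughly 2-to-1 odds come from.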

One last thing I'll mention before getting to the meat of the analysis is that Burke also accounted for the increase in passing efficiency over time by era-adjusting his performance measure. He was using draft data from 1980-2000, so it made sense because the increase over time was real. For my analysis, I didn't adjust for era because, if you take the midpoint of the average QB draft pick's career length (his 3rd year), and look at the trend in league-wide FFPts/G from 1995-2008 (i.e., 3rd year for 1993 picks to 3rd year for 2006 picks), you get the following:

[Figure: league-wide passing FFPts/G by year, 1995-2008, with trend line]

That trend line is almost perfectly horizontal, and the number next to the "x" in the equation means that league-wide FFPts/G has increased a whopping .005 points per year. In other words, draft picks chosen from 1993-2006 have all played their careers during the same era of fantasy QB scoring; no era adjustment necessary.
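For the curious, the "number next to the x" is just the slope of a least-squares line through the yearly values. Here's a minimal sketch of that calculation. The `ols_slope_intercept` helper is my own illustration, and the yearly values below are hypothetical placeholders, NOT the real league-wide numbers from the graph.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


years = list(range(1995, 2009))  # 14 seasons, 1995 through 2008

# Hypothetical league-wide FFPts/G by season, made up to wobble around a flat line.
league_ffpts_g = [12.1, 11.8, 12.0, 12.3, 12.2, 11.9, 12.0,
                  12.4, 11.9, 12.5, 12.0, 11.8, 12.2, 12.1]

slope, intercept = ols_slope_intercept(years, league_ffpts_g)
print(f"slope per year: {slope:.4f}")
```

To put the real slope in perspective: at .005 points per year, the trend line rises only .005 × 13 = .065 FFPts/G between 1995 and 2008.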

THE BASICS OF THE ANALYSES

OK, so now you know I'm running the analysis on 2 different data sets related to QBs drafted from 1993-2006. For ease of interpretation, I'm going to call the analysis for those taken in the first 4 rounds without replacement, "FFPtsG1," and call the analysis for those taken in the first 3 rounds with replacement, "FFPtsG2." Because I've now expanded the data past 2 rounds, the next order of business was to recalculate the correlations I presented previously in this series. It turns out that only the following variables are meaningfully related to either FFPtsG1 or FFPtsG2 (correlations of at least .2 in absolute value were meaningful):

| Variable | FFPtsG1 | FFPtsG2 |
| --- | --- | --- |
| Pick Number | -0.571 | -0.533 |
| QB Order | -0.450 | -0.444 |
| College GS | 0.260 | 0.235 |
| Div 1A School? | 0.234 | 0.177 |
| Prev Yr Ws | -0.209 | -0.228 |
| College Comp% | 0.199 | 0.331 |
| Prev Yr Total DVOA | -0.178 | -0.142 |
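Here's a small sketch of applying the .2 cutoff to the correlations above, assuming the cutoff is applied strictly and to the absolute value (so a strong negative correlation counts as much as a strong positive one). The variable names are just for illustration; the numbers are the ones in the table.

```python
THRESHOLD = 0.2  # cutoff for a "meaningful" correlation, applied to |r|

# (FFPtsG1, FFPtsG2) correlations from the table above.
correlations = {
    "Pick Number": (-0.571, -0.533),
    "QB Order": (-0.450, -0.444),
    "College GS": (0.260, 0.235),
    "Div 1A School?": (0.234, 0.177),
    "Prev Yr Ws": (-0.209, -0.228),
    "College Comp%": (0.199, 0.331),
    "Prev Yr Total DVOA": (-0.178, -0.142),
}


def is_meaningful(r, threshold=THRESHOLD):
    """True when the correlation's magnitude meets the cutoff, regardless of sign."""
    return abs(r) >= threshold


flags = {
    name: (is_meaningful(r1), is_meaningful(r2))
    for name, (r1, r2) in correlations.items()
}

for name, (g1, g2) in flags.items():
    print(f"{name:20s} FFPtsG1: {g1!s:5s} FFPtsG2: {g2}")
```

Read strictly, College Comp% just misses for FFPtsG1 at 0.199, Div 1A School? misses for FFPtsG2, and Prev Yr Total DVOA falls short in both columns.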

Because of the riddle I discussed last Wednesday about the inherent relationships between pick number, QB order, previous year wins, and previous year Total DVOA, I had to tease out those relationships a little bit to guard against multicollinearity among predictors in the same model. In other words, I didn't want to include previous year wins and pick number in the same model, for instance, because previous year wins causes a team's pick number, so if either ended up predicting FFPts/G, the results for both would be useless. In contrast, pick number isn't caused by Total DVOA, so it's OK to include them together in the same model. Therefore, in the analyses, I quarantined pick number and Total DVOA away from QB order and previous year wins. That is, pick number was only allowed as a predictor in models with Total DVOA (and vice versa), and QB order was only allowed as a predictor in models with previous year wins (and vice versa).
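To make the quarantine rule concrete, here's a minimal sketch that enumerates which combinations of predictors are allowed. Everything about the code is hypothetical (the variable names, the 3-predictor cap, and the idea of brute-forcing every combination); the only thing taken from the post is the rule itself: pick number and Total DVOA on one side, QB order and previous year wins on the other, never mixed.

```python
from itertools import combinations

GROUP_A = {"pick_number", "prev_yr_total_dvoa"}
GROUP_B = {"qb_order", "prev_yr_wins"}
FREE = {"college_gs", "div_1a", "college_comp_pct"}  # may appear with either group


def is_allowed(model):
    """A model is allowed unless it mixes a Group A predictor with a Group B predictor."""
    chosen = set(model)
    return not (bool(chosen & GROUP_A) and bool(chosen & GROUP_B))


def candidate_models(max_size=3):
    """Yield every allowed combination of 1..max_size predictors (max_size is an assumption)."""
    pool = sorted(GROUP_A | GROUP_B | FREE)
    for size in range(1, max_size + 1):
        for combo in combinations(pool, size):
            if is_allowed(combo):
                yield combo


models = list(candidate_models())
print(len(models), "allowed models with up to 3 predictors")
print(is_allowed(("pick_number", "prev_yr_wins")))  # mixes the two groups
```

The free variables (college GS, Div 1A, completion percentage) can sit alongside either group because none of them is tied to draft position the way wins and QB order are.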

One other thing I did to improve on previous studies that try to predict QB career performance - it's been implicit in the discussion so far - was to test multiple models and see which one does the best job of prediction. It's pretty much conventional wisdom in statistics that you always want to test competing models because there are always competing explanations for the same reality. Except in the natural sciences, you pretty much never have one theory win out over all the others, so it's best to evaluate as many as you can.

The obvious choice here was to compare my models with the LCF, and see whether they improve upon the predictive ability of the LCF or not; and, of course, to compare my FFPtsG1 and FFPtsG2 models with each other as well. To evaluate the various models, I compared their R-squared values, which tell you how much variation in FFPts/G is being explained by each (the higher the better), and their average error for predicting career performance for the 17 QBs drafted from 2007 to 2009. Evaluating the ability of each model to predict QBs that weren't included in the data I used to create the model is always recommended because the models are, by definition, supposed to optimally explain the data that was used to create them. In other words, we're trying to predict the future, not explain the past.
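Here's a minimal sketch of those two yardsticks. The helper names and the sample numbers are hypothetical, and I'm assuming "average error" means mean absolute error (the average size of the miss, ignoring direction); treat it as an illustration rather than the exact calculation behind my results.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Share of the variation in the actual values that the predictions explain."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_abs_error(actual, predicted):
    """Average size of the prediction miss, ignoring direction (an assumed definition)."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


# Hypothetical career FFPts/G for a handful of held-out QBs, NOT the real 2007-2009 class.
actual_ffpts_g = [12.0, 4.5, 9.0, 1.2, 7.5]
model_predictions = [10.5, 6.0, 8.0, 3.0, 7.0]

print("R-squared: ", round(r_squared(actual_ffpts_g, model_predictions), 3))
print("Avg. error:", round(mean_abs_error(actual_ffpts_g, model_predictions), 3))
```

R-squared is computed on the QBs used to build the model, while the average error is the one that matters for the held-out QBs, since that's the part that resembles predicting the future.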

RESULTS OF THE ANALYSES

It turns out that the variables that best predicted FFPtsG1 were QB order, whether or not the QB played at a Division 1A school the year prior to his selection, and college GS. In contrast, the variables that best predicted FFPtsG2 were pick number, college GS, and college completion percentage. And, of course, for the LCF models, college GS and college completion percentage were the best predictors.

OK, this post is approaching dissertation-length, so it's probably best to cut it short. Come back tomorrow for the exciting conclusion, where I'll present the rest of the earth-shattering results.

*Given that we're well into "most QBs qualify" territory once we've pared down to 3 rounds, you might be wondering how the performance distribution was bell-shaped for Rounds 1-3 with replacement, but not without it. The answer is that, without replacement, the distribution ends up being slightly more skewed in the direction of high performers; the opposite of what was happening when non-qualifiers from later rounds weren't being replaced.