2011 NFL Draft: An Expanded Model for Predicting QB Career Performance (Cont'd)

AUTHOR'S NOTE: This is the continuation of yesterday's post. I'm just starting it from where I left off. Click here to read the first half. Oh, and again, WARNING: EXPLICIT STATISTICAL CONTENT!

All in all, I tested 11 models:

          Sample       Predictors         Result     Test
Name      Rounds  N    A    B       C     R-squared  N    M Error  SD Error
FD1       4NR     84   GS   Div1A?  Pick  0.385      15   3.49     1.67
FD2       3R      62   GS   Comp%   Pick  0.388      14   3.49     1.95
LCF1      4NR     84   GS   Comp%   -     0.093      15   3.67     2.22
LCF2      3R      62   GS   Comp%   -     0.150      14   3.95     2.52
LCF3      2NR     44   GS   Comp%   -     0.360      12   4.64     2.58
LCF4      2R      44   GS   Comp%   -     0.359      12   4.77     2.70
LCF1Rev1  4NR     84   GS   Comp%   Pick  0.374      15   3.36     1.62
LCF1Rev2  4NR     84   GS   -       Pick  0.359      15   3.33     1.41
LCF2Rev   3R      62   GS   Comp%   Pick  0.388      14   3.49     1.95
LCF3Rev   2NR     44   GS   Comp%   Pick  0.431      12   3.81     2.23
LCF4Rev   2R      44   GS   Comp%   Pick  0.431      12   3.94     2.38

After the jump, I'll discuss the table...

Here's how you'd read this table. In the "Name" column, FD stands for me, LCF stands for the Lewin Career Forecast, and "Rev" means "revised." In the "Rounds" column, the number is how many rounds of the draft were included in the sample, "R" means "with replacement," and "NR" means "without replacement." The "N" column under "Sample" is just the number of QBs in the sample. For the predictors, "GS" stands for "college games started," "Comp%" stands for "college completion percentage," and "Div1A?" stands for "did the QB enter the NFL draft from a Div1A school?" Finally, if you see a predictor crossed out, it means that it did not significantly predict FFPts/G in that particular model.

So, for instance, Model LCF1Rev1 was a revised version of the LCF wherein I used non-replaced data from the 84 QBs drafted in Rounds 1-4 from 1993-2006 to predict QBs' FFPts/G from their college games started, college completion percentage, and/or pick number. In that model, it turned out that Comp% did not have a significant impact on FFPts/G, so I then tested Model LCF1Rev2, which did not have Comp% in it. Capisce?

Basically, the general idea here was to (a) test a 4-round model and a 3-round model that were native to my analysis, (b) test 2 models that are exact replicas of the LCF, (c) test 4-round and 3-round versions of the LCF, and (d) test various revised LCF models that throw the pick variable into the mix given that it was the most significant predictor of FFPts/G back when I ran simple correlations.

OK, now for the good stuff. Each model test results in a linear equation that relates the predictors to FFPts/G. R-squared measures how well that regression equation fits the data. It goes from 0 to 1, can be expressed as a percentage, and the closer to 1 it is, the better.
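To make R-squared a little more concrete, here's a minimal Python sketch of how it can be computed from actual and predicted values. The function name and the toy numbers are made up purely for illustration; they are not from the QB dataset used in this post.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: the variation around the plain average.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical FFPts/G values, for illustration only.
toy_actual = [2.0, 4.0, 6.0, 8.0]
perfect_fit = [2.0, 4.0, 6.0, 8.0]      # predictions match exactly
mean_only = [5.0, 5.0, 5.0, 5.0]        # always guess the average

print(r_squared(toy_actual, perfect_fit))  # perfect fit
print(r_squared(toy_actual, mean_only))    # no better than the average
```

A model that nails every value gets 1, and a model that does no better than guessing the average every time gets 0.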

After running all the model tests, I then used the linear equation spit out by each model to test how well it predicted career FFPts/G for QBs drafted from 2007-2009. In the last 3 columns of the above table, "N" is again how many QBs were included in a particular test. The actual test measures here were the mean (M) and standard deviation (SD) of absolute error. Lower M Error means more accurate prediction, and lower SD Error means less varied prediction (i.e., fewer huge misses).
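For anyone who wants to see the mechanics, here's a rough Python sketch of the two error measures. The function name and the numbers are hypothetical, and I'm assuming the sample standard deviation here (`statistics.stdev`), since the post doesn't say which version was used.

```python
from statistics import mean, stdev


def absolute_error_summary(actual, predicted):
    """Return (mean, SD) of the absolute prediction errors."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    # Assumption: sample SD (n - 1 in the denominator).
    return mean(errors), stdev(errors)


# Hypothetical actual vs. predicted FFPts/G for three QBs.
toy_actual = [10.0, 8.0, 6.0]
toy_predicted = [9.0, 10.0, 3.0]

m_error, sd_error = absolute_error_summary(toy_actual, toy_predicted)
print(m_error, sd_error)
```

In this toy case the misses are 1, 2, and 3 points, so the average miss is 2 and the spread of the misses is 1.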

In essence, R-squared tells you how well the model explains the past, whereas the error values tell you how well the model predicts the future. Ideally, what we want is a model that does both well, but we lean a little bit towards the prediction side of things if the results are close.

MODEL EVALUATIONS

The first thing you probably notice in the table is that the 2 models that simply extend the LCF past 2 rounds (i.e., Models LCF1 and LCF2) were atrocious at both explanation and prediction. Basically, this represents the very reason why you're only supposed to use the LCF for QBs taken in the first 2 rounds. After that, GS and Comp% do a horrible job by themselves. Of course, you might also notice that even the basic LCF 2-round models predict badly despite explaining the past well (i.e., high errors, but high R-squared). Taken together, these results seem to suggest that the LCF, as originally specified, is an inadequate model if we want to (a) predict future FFPts/G, and (b) do so for players drafted after the 2nd round.

The next model I'll bring your attention to is my 4-round model, FD1, which seems to do a pretty good job at both explanation and prediction. One reservation I have about it, though, is that it's the only model to have identified Div1A? as a meaningful predictor, and I think that result was the simple byproduct of its 4-round nature. This is because, if you look at all the non-D1A QBs that were taken in the first 4 rounds from 1993-2006, most were taken in the 4th round, and the only one to have ever really amounted to anything was Steve McNair, who also happened to be the only one taken in the 1st round. Basically, the result is telling us that non-D1A QBs are going to suck. However, it should really say that non-D1A QBs taken in Rounds 2-4 are going to suck, especially because the only such QB taken from 2007-2009 was Joe Flacco, who also happens to have been a non-sucking 1st-rounder. All in all, I don't think Model FD1 is the best one.

So that leaves the revised LCF models (you'll notice my Model FD2 actually ended up being identical to Model LCF2Rev). Looking at these models, you see right off the bat that the 2-round versions (i.e., Models LCF3Rev and LCF4Rev) suffer the same fate as their non-revised counterparts. Namely, they're really good at explaining the past, but horrible at predicting the future (i.e., high errors, but high R-squared). So, out they go.

Of the remaining 3, I'm going to choose LCF1Rev2 as the winner because it's the most predictive model, and it explains the past nearly as well as any of the others. Another reason I prefer it to Model LCF2Rev is that it achieves our goal of extending the LCF as deeply into the draft as possible. This goal was achieved, however, at the expense of Comp%, the statistical importance of which washes out after the 3rd round. You can see this pretty clearly in the table, which shows that Comp%, although important in all of the 3-round models, was not a meaningful predictor in any of the 4-round models.

So, without further ado, the equation for Model LCF1Rev2, which we can now use to predict the career FFPts/G for QBs drafted in the first 4 rounds, is:

FFPts/G = 7.10 - 0.05*Pick + 0.08*GS

Basically, this equation says that FFPts/G decreases by 0.5 for every 10 additional picks farther into the draft, and it increases by about 1 for every 12 additional college games started. For example, Mark Sanchez started only 15 games at USC, but was the 5th pick in the draft. This translates to a career prediction of 8.01 FFPts/G, which happens to be only 1 FFPt/G off from his actual career average thus far (9.01). Incidentally, this prediction is 1.50 FFPts/G better than the original, 2-round LCF prediction, presumably because the LCF heavily penalized him for having started only 15 games. In our model, however, pick is the best predictor, so the fact that he was drafted 5th overall ends up (correctly) being more important.
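If you'd rather let a computer do the plugging in, here's a small Python sketch of the equation. The function name is just something made up for illustration, and it uses the rounded coefficients exactly as printed above. With those rounded coefficients, the Sanchez example comes out to about 8.05 rather than 8.01; the small gap is presumably because the reported figure came from unrounded coefficients.

```python
def predict_ffpts_per_game(pick, games_started):
    """Model LCF1Rev2, using the rounded coefficients printed in the post."""
    return 7.10 - 0.05 * pick + 0.08 * games_started


# Mark Sanchez: 5th overall pick, 15 college games started.
sanchez = predict_ffpts_per_game(pick=5, games_started=15)
print(round(sanchez, 2))

# Same college resume, taken 10 picks later: the prediction drops by 0.5.
later_pick = predict_ffpts_per_game(pick=15, games_started=15)
print(round(sanchez - later_pick, 2))
```

Keep in mind the model was only fit on QBs from the first 4 rounds, so plugging in later picks goes beyond what it was built for.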

Well, hope you all enjoyed that. After the draft, I'll come back and use this equation to make predictions about the QBs taken in the first 4 rounds, especially if the 49ers happen to take one.


Comments


Could you project the top QBs by where they're being projected to go?

Newton – 1
Gabbert – 10
Locker – 25
Ponder – 30
Dalton – 35
Mallett – 40
Kaepernick – 45

by whistlingmountain on Apr 26, 2011 11:17 AM PDT

Just plug into the equation: FFPts/G = 7.10 - 0.05*Pick + 0.08*GS

I was going to do it for you, but I can’t find a stupid website that just lists the stupid players’ stupid number of games started. Stupid ESPN and everything else.

Currently stifling the bacon, the world.

by howtheyscored on Apr 26, 2011 5:29 PM PDT

This is what I got assuming whistlingmountain is right about where they are picked.

Andy Dalton 9.35
Jake Locker 9.05
Colin Kaepernick 8.93
Blaine Gabbert 8.68
Christian Ponder 8.32
Cam Newton 8.17
Ryan Mallett 7.18

by Ougadas on Apr 27, 2011 12:12 AM PDT

Danny, just a few comments ...

1. While far from an expert, during my career I had occasion to have my staff apply statistical analysis and modeling on a number of occasions … I enjoy the process and, thus, enjoy your articles.
2. It’s obvious that you love football … I don’t know anyone (else) who would put this much time and effort into a QB predictive model just for the fun of it.
3. Can’t wait to see your post-draft article.
4. You’re amazing … keep up the great work.
5. Given the time that I’m sure that you put into this, if you’re married, extend my condolences to your wife.

by 49erFanSince1950 on Apr 26, 2011 2:44 PM PDT

I completely agree with Since1950

Danny, your stuff is always amazing and honestly one of my favorite reads out of all my “regular” sites. Thank you for all the work you put into it.

by lacrosse_cat on Apr 27, 2011 6:41 AM PDT
