Welcome back for Part 2 of my interview with Bill Barnwell, Managing Editor of Football Outsiders (FO). Just to recap Part 1, Bill offered up the following 49er-related thoughts:
- FO's 6.1-win projection for the 49ers this season resulted from simulating the 2010 NFL season 10,000 times.
- Statistically speaking, there's no clear-cut favorite in the NFC West this season.
- The 49ers rely on statistical analysis more heavily than the average NFL franchise.
- In addition to the Alex-Smith-won't-be-in-the-shotgun-as-much reason that FO cites in their book, 2 other reasons for their prediction of an underwhelming Niner offense this season are that the rookie OLs probably won't provide immediate help, and that Smith probably won't face the easiest QB schedule in the league for the 2nd year in a row.
- With respect to Dashon Goldson and Ahmad Brooks, it's business, not personal. Both are unlikely to duplicate their superb 2009 seasons because very few players at their positions have ever duplicated that level of production.
- FO repeatedly downplays the 49ers' playoff chances because, during the 6 years of FOA's existence, the 49ers have been consistently bad and therefore haven't given FO a statistical reason to project otherwise. In other words, they do not hate the 49ers...seriously!
He also made the following points on topics generally related to FO:
- Because of human nature, it's baked into the cake that nearly every NFL fanbase will think FO underestimated their team in the yearly projections; and oftentimes the fans were right.
- Just because DVOA said Pierre Thomas was the best RB in the league last season, it doesn't mean he's actually better than Adrian Peterson.
- Don't pay too much attention to Receiving DVOA for RBs.
- FO incorporates constructive, analysis-oriented criticism into their work; so much so that it's frequently the case that a reader ends up becoming an author on their site or in FOA.
Today, I'm posting Bill's answers to the other 10 questions I asked him, 5 of which were inspired by questions some of you submitted in response to my call for questions. I say "inspired by" because, although I didn't ask them verbatim (for obvious editorial reasons), I did preserve the spirit of the inquiry. Which reminds me: thanks again to everyone who helped me out by submitting questions. And because I won't be adding any commentary after the end of the interview transcript, let me also thank Bill one more time for giving Niners Nation the opportunity to interview him, and for bearing with my verbosity. If you're interested in reading more interviews with the FO people, here's a page providing links for all of their interviews with the various blogs here on SB Nation.
Oh, and finally, if you're interested in football statistics or simply want a really good preview of the upcoming NFL season, go buy Football Outsiders Almanac 2010 (here for .pdf, here for print). I highly recommend it...and, no, I wasn't paid to say that.
After the jump, Bill identifies what would complete him, and then identifies it again, and then identifies it again. He also expresses amazement in response to the revelation that FO's win projections are ONLY about 5 times less accurate than stat-based baseball projections, addresses the issue of FO-based gambling, and answers a few of your questions...
STATISTICAL ANALYSIS IN FOOTBALL
FD: Back in 2005, Aaron Schatz published an essay on football's Hilbert problems, 10 barriers to progress in the field that needed to be addressed in the medium-to-long term. The most glaring problem was the simple lack of publicly available football data for inquiring minds to analyze, along with the inadequate quality of what was available. In other words, we can't answer a lot of basic questions when NFL play-by-play data is so limited, with many of the 22 players on the field seemingly having never even been there according to the box score. Five years later, how much progress do you think has been made on this front?
BB: Since then, we've compiled five years of Game Charting data. That's an enormous step forward. Game Charting data -- tracking events like how many players rushed the quarterback on a given play, or whether there was play-action, or why a pass fell incomplete -- is far from a cure-all, but it allows us to attack questions that we would have no idea about when Aaron wrote that five years ago.
The next big step would be access to the "All-22" film, the coaches' film from an overhead angle that you'll see on NFL Matchup. This would allow us to get a read on defensive alignments and reliably track player participation data.
FD: If what you just described is the status quo, and we imagine a day in the future where data availability is nearly ideal, what are 1 or 2 of the biggest things standing in the way between today and that perfect day?
BB: Having the all-22 film is the biggest thing.
FD follow-up: I really was talking more about institutional barriers than any specific panacea like access to the All-22 video. For example, given our lack of access to the All-22 video, what's/who's standing in the way of it being released? Obviously, it's the NFL generally speaking. But is it specifically the league office, the team front offices, the coaches, the players, the cameramen, all of the above, etc.? I mean, given that you guys visit NFL Films every year, and I see Aaron on NFL Network's "Top 10" show from time to time, I have to assume FO's directly brought up the topic with the NFL gatekeepers. What do they say when they reject the idea? Over time, are they getting closer to accepting the idea?
BB follow-up: I see what you mean about the institutional barriers. The problem is, unfortunately, that the NFL is a mighty big place. The people who we speak to at the NFL would be happy to give us the all-22 film, but we've never been told anything but "There's no way we could give that to you." NFL Films/NFL Network stuff has nothing to do with the league's administrative office in NYC, which would make that decision. I have no idea who specifically would be the person to talk to in order to get them to change their mind. There's been no change, really, in the likelihood of the data becoming available; the only thing I can think of is that the NFL might see the success of MLB Advanced Media and start to make it available as a web feature to differentiate Game Replay from TV broadcasts. Even then, I doubt it will be for a number of years.
FD: Continuing this line of questioning, what do you think are the next frontiers in football stat analysis? What kinds of data collection or analytical methods do you see emerging over the next decade or so?
BB: I know it's going to sound repetitive, but the all-22 film is what stands out to me. Once you get that, you can start hacking away with video analysis the way that people in the NBA have, and that opens up myriad opportunities.
FD: For understandable reasons, people have a tendency to compare football stat analyses to baseball stat analyses (aka sabermetrics). A website called Vegas Watch has tracked the accuracy of your preseason W-L predictions over the past few years as well as the accuracy of various sabermetric models for predicting baseball W-L records. A quick, back-of-the-envelope calculation shows that, after accounting for the difference in season length between the 2 sports, you guys are about 5 times less accurate than the sabermetricians. Obviously, given the aforementioned lack of data, and the relative infancy of football stat analysis, this is to be expected; so I'm not bringing it up to focus the spotlight on your inaccuracies. Rather, what I'd like to know is, aside from having better data, what 1 or 2 things would most help football statisticians close that accuracy gap?
BB: Maybe they could expand the schedule to 162 games! I'm honestly shocked that we're that close; there's SO many factors that make baseball, especially now, easier to project than football.
I should probably mention a couple. One is, as I mentioned, the sample size. We can produce metrics that break down games to the play level, but that doesn't mean that the team with the better DVOA is going to win every game, or even the large majority of games. It's also entirely possible for a team that's "actually" good to lose a large number of their games. Take your favorite elite baseball team and look at their record in 16-game stretches. The best team in baseball right now by W-L record is the Yankees, who are 60-34. That's about the equivalent of a 10-6 team in football. The Yankees have had a 16-game stretch this year where they were 6-10, and another stretch where they were 8-8. Because baseball has 162 games, their stretches of bad luck or underachieving play in certain situations can regress over the long season. In football, even though the Yankees might "truly" be a .600 team, they wouldn't have had the chance to outlive that 6-10 record. There's a great post on the pro-football-reference blog about a simulation of seasons and how often the "best" team actually won the Super Bowl that I can't find, but it's a larger-scale example.
Another is the fact that, well, baseball's a lot further along in their statistical revolution than we are. By the time Baseball Prospectus came around, the work of Bill James had been circulating for a generation. We're certainly blessed with The Hidden Game of Football, but it's not exactly the same scope of work. We're still finding a lot of the things that actually help football teams win, and perhaps more importantly, we're still gathering the data. We're advancing at a faster rate than someone in the 80s would have, but we're still in the very early stages of the statistical revolution in football.
With all that in mind, I'd love to nail every win projection every year, but I know that's not going to be the case. For me, what's important is understanding why we're going to project a team to decline or improve and seeing if that actually happens; in a way, it's almost more interesting to be wrong. If the 49ers force a ton of turnovers next year and make the playoffs, I'll be upset that we weren't right in our prediction, but I'll be excited because it means that there may be something about the team we're not analyzing properly.
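Bill's sample-size point can be checked with a quick binomial calculation. The sketch below is only an illustration, not anything FO runs: it assumes every game is an independent coin flip with the same win probability, takes Bill's "truly a .600 team" figure as that probability, and uses a helper name (prob_at_most) made up for this example. It compares the chance of finishing .500 or worse over 16 games with the chance of doing so over 162 games.

```python
from math import comb


def prob_at_most(wins, games, p_win):
    """P(team wins `wins` or fewer of `games`), assuming independent games
    that each have the same win probability p_win (a simplifying assumption)."""
    return sum(
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(wins + 1)
    )


P_TRUE = 0.6  # Bill's hypothetical "truly .600" team

# Chance of an 8-8 record or worse over a 16-game NFL season.
short_season = prob_at_most(8, 16, P_TRUE)

# Chance of an 81-81 record or worse over a 162-game MLB season.
long_season = prob_at_most(81, 162, P_TRUE)

print(f"16 games:  P(.500 or worse) = {short_season:.3f}")
print(f"162 games: P(.500 or worse) = {long_season:.4f}")
```

Under these assumptions, the 16-game probability comes out many times larger than the 162-game one, which is the gap Bill is describing: a short season leaves much less room for luck to even out.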
(FD Update: I scoured the interwebs, and found the p-f-r blog post Bill's talking about. Pretty interesting stuff. After randomly assigning an SRS-inspired range of random ratings to 32 theoretical NFL teams, Doug Drinen of p-f-r simulated 10,000 NFL seasons, and found that the "highest-rated" team in a given simulated season ended up winning the Super Bowl only 24% of the time.
So, to Bill's point, the randomness of "any given Sunday" means that, over the course of millennia, it's very likely that the best statistical team in a given season will win that season's Super Bowl only once every 4 years. Because the ratings were randomly assigned to different teams for each simulated season, this result has very little to do with how trustworthy the team ratings are as stats. It has almost everything to do with simple random variation. Throw in the fact that Brian Burke's found NFL game results to be about 52.5% luck, and you quickly realize that we're basically talking about flipping a coin here. That's why it's so difficult to hit the bullseye with NFL projections.
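For anyone who wants to tinker with this kind of experiment, here is a rough sketch of a season simulation in the same spirit. It is not Drinen's code or FO's method. Every parameter is an assumption made up for illustration: the spread of team ratings (rating_sd), the game-to-game noise (game_sd), a random weekly schedule, and a simplified 8-team single-elimination playoff seeded by wins. The fraction it prints depends on those choices and shouldn't be expected to reproduce the 24% figure.

```python
import random
from statistics import NormalDist


def best_team_wins_title(rng, n_teams=32, n_weeks=16,
                         rating_sd=6.0, game_sd=13.5, playoff_size=8):
    """Simulate one season; return True if the top-rated team wins the title.

    All defaults are illustrative assumptions, not values from the p-f-r post.
    """
    ratings = [rng.gauss(0.0, rating_sd) for _ in range(n_teams)]
    noise = NormalDist(0.0, game_sd)
    wins = [0] * n_teams

    def play(a, b):
        # Team a wins with probability based on the rating gap.
        return a if rng.random() < noise.cdf(ratings[a] - ratings[b]) else b

    # Regular season: each week, shuffle the teams and pair them off.
    for _ in range(n_weeks):
        order = list(range(n_teams))
        rng.shuffle(order)
        for i in range(0, n_teams, 2):
            wins[play(order[i], order[i + 1])] += 1

    # Playoffs: seed by wins (random tiebreak), highest seed plays lowest.
    field = sorted(range(n_teams),
                   key=lambda t: (wins[t], rng.random()),
                   reverse=True)[:playoff_size]
    while len(field) > 1:
        field = [play(field[i], field[-1 - i]) for i in range(len(field) // 2)]

    best = max(range(n_teams), key=lambda t: ratings[t])
    return field[0] == best


def best_team_title_rate(n_seasons=10_000, seed=0):
    """Fraction of simulated seasons in which the top-rated team wins it all."""
    rng = random.Random(seed)
    return sum(best_team_wins_title(rng) for _ in range(n_seasons)) / n_seasons


if __name__ == "__main__":
    print(f"Best team won the title in {best_team_title_rate():.1%} of seasons")
```

Because new ratings are drawn for every simulated season, the result says nothing about any real team. It only shows how much a short schedule and a single-elimination playoff dilute the advantage of being the best team.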
If you're interested in reading the entirety of Drinen's post, here it is. It's tiny compared to the typical length of my stat posts, and provides links to the other posts in Drinen's simulation series for p-f-r. To see an example of Bill's point about the whole "best team wins" thing being dramatically different in sports with longer seasons, b-b-r blogger Neil Paine did nearly the same study for NBA basketball as the one Drinen did for NFL football.
Oh, and incidentally, Paine's simulation method was nearly identical to the one Bill described in Part 1 as being how FO calculates their win projections. So if you want to learn more about the method behind FO's win projection madness, Paine's post is a really good illustration. OK, enough randomness; back to the interview...)
FD: Speaking of Vegas, I happen to think that a non-trivial amount of the backlash you guys receive is a reflection of bitterness after over-relying on your projections when betting on football. What do you think of the unavoidable nexus between NFL stat analysis and NFL gambling?
BB: I think it's related to how you buy into the data, which is true of any sort of interaction with our work, whether it be for gambling, fantasy purposes, or as a model of the game that's played on the field. If you read it and think "Wow, that's great" and don't attempt to understand any of the concepts behind it, then I think you'll get angry very easily when the game doesn't fit what we expect to happen. If you take a longer view of how we do things and approach the game and view it as a useful piece of information that's evolving, I think you derive a lot more out of it, and I think that the people who make serious investments using our data for gambling purposes are intelligent enough to approach our work in that manner.
FD (h/t goatfather): Given the complexity of NFL team performance and the dynamic nature of football as a sport, does FO have any plans to start incorporating statistical interactions and more complex types of analyses into your stats?
BB: I really don't think that's a big problem right now. I mean, I'd love to start introducing more complex mathematics, but actually gathering good data is so much more important right now.
FD (h/t bignerd): As I too have found in my own work, a lot of year-to-year variation in NFL performance is simply regression to the mean. One entity in the NFL universe that seems to never regress to the mean is the 49ers' offense. In FO's projection model, do all teams regress to the mean equally or do adjustments for regression to the mean apply differently to different teams?
BB: The latter, although I suspect that there are more aspects of regression towards the mean that should be uniquely adjusted for specific teams that we're not capturing yet. Injuries are a good example; I know that the Titans should regress towards the mean in health, but they're so healthy on a year-to-year basis that I wouldn't expect them to go to 16th in health. I might suggest that they'd "regress" to 10th from the top two.
BB: We're not at the point where we can incorporate the things that lead to "wins" that aren't directly part of the statistical record, like the quality of a play-action fake for a quarterback or the ability of a wideout to block. It's disingenuous, I feel, to build a win-based metric for an individual player when you're knowingly ignoring a fair amount of the work he contributes to a win. It would be like putting together WAR for baseball and only considering hitting stats.
FD (h/t Andrew Davidson): Earlier I asked specifically about FO's projections for Goldson and Brooks. Obviously, it's a lot easier to project the performance of a league fixture at a stat-laden position like Peyton Manning than it is to project Johnny-come-lately defenders. In general, though, how do FO's individual stat models handle projections for players with very little historical data available?
BB: It's only really an issue for rookies and for quarterbacks. For rookies, we look at team variables (say age of the offensive line and the expected success of the passing game) and examine, say, how a rookie running back in a similar draft slot with a similar role played versus the running back of the previous season. Veteran running backs and wideouts with little experience don't get projected for big roles, so the projection isn't really a big deal.
For quarterbacks, we project their numbers to a full 16-game season, but it's again a lot of the team variables. That may seem murky, but it actually turns out pretty well -- Matt Cassel's 2008 projection actually turned out shockingly well.
FD (h/t BKisforSF): As was evidenced by John Taylor's #1 DVOA vis-a-vis Aaron's discussion of the 1993 DVOA season, it seems like there's a pattern of WR sidekicks having inflated efficiency stats. Intuitively, one would think these sidekicks are benefitting a great deal from the opposing defense focusing most of their resources on stopping the other WR. Have you guys found this to be a real phenomenon? If so, are you planning to adjust for it in future incarnations of individual DVOA, DYAR, etc.?
BB: I'd like to start examining the issue of usage curves for receivers, like Dean Oliver did in his Basketball on Paper. For now, I think you have to just acknowledge that complementary players are going to look more efficient than a fair amount of the stars, and just consider that when you're incorporating those numbers.