January 15th, 2023
A reminder that the website is in off-season mode: unlocked and free to explore.
As always in the off-season, all projections represent how the models would predict things today, “as if those games were played over again.” This view lets us explore “what ifs” and “if only we had known…” Have fun with it!
(If you have comments, this article was also posted to Reddit here.)
In keeping with the tradition of transparency and reflection, here is the final look back at 2022 streaming model accuracy and predictability. The previous monthly accuracy updates were: week 4, week 8, and week 12. And the 2021 summary.
Background context of the 2022 season: This season was apparently exceptional. I recently made a separate post about how this season's all-time record low in game score differentials (a uniform NFL) caused streaming results to be poorer than usual, as measured by the accuracy of "other" ranking sources. I also posted separately about how this year of low predictability affected my own models compared to the last 12 years. Those posts have some good meat to dig into, if you have off-season downtime!
The overall story is the same as I've stated all season: QB was exceptionally predictable, whereas RB, D/ST, and Kicker took the biggest hit from the league trend. As a reminder, this should be interpreted as the difficulty the season presented; I do not believe the models are in any way "broken"-- that is why there's a comparative assessment below!
Average of weekly correlation coefficients for each model (Correlation between fantasy point projection and fantasy point result)
Reminder / for newbies: My goal is to make sure my models perform at a similar level to sources that are historically accurate. Although I don't label the other sources' names this year, they are the dependable, year-after-year top-tier ranking sources; you would surely recognize most of them. It's great when I can "beat" them (for a few years that was the case at D/ST), but most of all, I want to confirm my models keep pace with the top sources. Because if they do, then we can count on 2 things: (1) you can treat the model outputs as useful, with trust in the statistical trends backed by hard data; and (2) it's not nonsense to extrapolate to future weeks. With good enough predictive models, longer-range planning for streaming is as dependable as possible, which can improve our strategy.
As highlighted in the post about season-to-season 2022 predictability: The very best ranking sources in the world all struggled at D/ST in 2022, compared to how they did previous years. Keep in mind, we really want to see the average correlation coefficient up in the 0.35 - 0.40 range.
I'd say I dead-tied for #2 with the 3rd source; he did poorly in weeks 15-16 (when I did great), but he had a great comeback in week 17 (when I fell flat). The surprise was the winner ("#5 in 2021"), who performed average all season but suddenly leap-frogged us all with stellar predictability in weeks 14 and 16.
Lessons learned: Being very open, I really want my D/ST model to do better. I think some of the input parameter processing got slightly worse when I thought I was improving things by integrating more years of data (from 5 years to 12 years). So I will be investigating whether I can fix this processing so it doesn't blur the useful information the data contains. I also have some other optimization ideas, including the creation of a different model for the first 4-6 weeks.
Things looked dire for much of the season-- I was doing as badly as the other sources. But my kicker model's accuracy picked up significant steam over the last 6 weeks. Since week 10, my model clearly jumped to the top.
I care a lot about this position, because it is the only model I expect to make a leap above conventional rankings. (For D/ST: I want to be at least top 3 to ensure my future-week projections are valid. For QB: top 5 would be great, and the aim is to provide a complementary view to other sources-- while also allowing look-ahead.)
This chart is less valid this year; ideally the bars should reach above 0.15, and in my case preferably up to 0.25.
In the above chart, you see the correlation coefficient summary (which is bad). But remember that I do not fully trust this for kickers, for reasons explained before. The more interesting chart is what I have been showing all along this 2022 season: the points produced by the top-ranked kickers. (Note how the ranking sources produced very different patterns, which is in contrast to the QB version below.)
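That points-produced comparison can be sketched like this, with hypothetical names and numbers: for each ranking source, take its top-N ranked kickers for a week and total the fantasy points those kickers actually scored. Sources whose lines diverge on this chart were valuing very different kickers.

```python
# Sketch of a "points per top-N" comparison, using hypothetical data.
def top_n_points(ranking, actual_points, n):
    """Sum the actual points scored by the n highest-ranked players."""
    return sum(actual_points[player] for player in ranking[:n])

# Hypothetical week: actual kicker scores, plus two sources' rankings.
actual = {"K_A": 22.0, "K_B": 18.5, "K_C": 15.0, "K_D": 9.0}
source_1 = ["K_A", "K_C", "K_B", "K_D"]
source_2 = ["K_B", "K_A", "K_D", "K_C"]

print(top_n_points(source_1, actual, 2))  # 37.0
print(top_n_points(source_2, actual, 2))  # 40.5
```

Summing this per week for each source, at a few values of N, reproduces the kind of lines shown in the chart.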
Lessons learned: My kicker model got a big update in week 10, and I'm pretty proud of what went into it and what came out of it. Nevertheless, the main thing I want to investigate is the early-season speed of adopting new data: I would like to pick up on rising kickers sooner.
This is actually the first time I've used a consistent model from beginning to end when evaluating my accuracy. In 2021, for example, I introduced a model update mid-season. I'm really happy with the 2022 result, because it means a hard, stats-driven model is able to keep pace with some great, highly respected sources. So I'm convinced the QB model is doing what I wanted it to.
Similar to the Kicker graph above, this QB chart shows just how closely all QB ranking sources performed to each other.
Comparatively: most QB sources really were all in the same tier. That becomes clearer when you look at the points-per-top-N chart: note how the lines largely overlap. Only one source stood out, at #1, who made great calls in weeks 4, 6, and 12 when others didn't.
Sorted by average of all 17 weeks of correlation coefficients
Lessons learned: I don't have a lot to say on this one, but I would like to incorporate some different rushing stats beyond the QB rushing stats I use now.
This is the only result that I'm clearly disappointed in. When I announced expectation levels based on historical simulations, I stated that you need a mindset that you could conceivably lose. But I did not expect it to actually happen!
The weighted win-rate (i.e. 2.0x returns) of model suggestions was 49%. But as anyone familiar with the field knows: you need to win more than 52.5% to break even. There were some other reasons I'll touch on below, but it seems pretty clear that week 10 specifically gets most of the blame. The simulations have rarely seen a loss of 70%+ in a single week. I made an extra rant about that week (things like the Packers beating the Cowboys and the Steelers beating the Saints). And while it's usually ridiculous to point to any single week as the cause (of losses or gains), in retrospect I think it still stands out as the one week that prevented the season from breaking even:
Lessons learned: There are a lot to take from this one. The main issue-- and anyone who followed along would know this-- is that the model did not adapt enough to a low-scoring season. Week after week in 2022, there were too many suggestions to bet the Over on games with a low Vegas O/U. Maybe that's because the models were trained on data from 12 seasons that mostly scored higher (excepting one year). Beyond this, I will be calibrating the model to adjust for magnitudes (e.g. what to do differently when spreads are very high), and I will revisit the calculation of confidence levels to give better risk-adjusted ROIs.
TL;DR-- We can all agree fantasy streaming didn't do what we usually expect. So it was a quirky season for launching a website, but I'm glad the fantasy models at least kept pace. Having kicker at #1 means a lot to me, and being among the top 3 is a good sign for D/ST and QB. I've outlined the areas for improvement I'll be working on-- starting now!
Thanks to all who followed the season, and I'm looking forward to rolling out these advancements with you when the 2023 season kicks off!
Tagged under: Accuracy, Current Season, Expectations