By Subvertadown
Tagged under
Accuracy, Current Season
[Week 14 Reddit post for D/ST] [Week 14 Reddit post for Kicker]
Congrats on the Charity Drive! We have donated the collected total of $750 to the Concussion Legacy Foundation. (Remember, you can still donate on your own!)
December 2, 2024
This one took a while to compile!
On Friday (after week 12), I posted updated Accuracy results from the season so far. Here they are:
TL;DR: Kicker is doing very well and sits at #1 by the FantasyPros accuracy measurement. D/ST and QB are also doing great, and I see no need for improvements (improvement areas are what I sometimes report on here). While the streaming models are all doing great, the Betting Lines are doing severely worse this year, which clearly calls for renewed attention to the selection logic.
First, some different news. For the first time ever, my Kicker (and partly D/ST) models are represented in the official FantasyPros Accuracy list in 2024. That's new, because for 6 years I have done my own accuracy evaluations, comparing to 5 other sources. This marks the first time they are being ranked against hundreds, in a "semi-official" way.
Kickers at #1! The great news is that my Kicker rankings are at #1 for the season! In fact, they have been at #1 for the last 4 weeks. Hoping to close out that way.
As I've posted about before, Keith Lott (of FantasySixPack) and I found a good chance to collaborate this season: one of the inputs he uses for his D/ST rankings is my own D/ST rankings. Meanwhile, we also both thought it would be a good idea if he entered my Kicker rankings into the accuracy competition. So you will find his name at the top, for Kickers.
By the way, you will also see that FantasySixPack's own Joe Bond is at #2 for kickers, which is just amazing. Blows my mind. And no, we are not copying each other, in any way.... It is just a great coincidence that two different approaches represented by FantasySixPack are giving great results this year.
Being at #1 is not just welcome news. I also feel it's extra validation for the various "extras" that I provide on the website. The "Kicker Opportunity Curves" and the "Why-so-high" tools are not just mere gimmicks. Instead, they represent some of the best contextualizations, helping you identify what's important to consider for kicker predictions. I really hope you use those tools to make your lineup decisions. It's not only about a rankings list!
D/ST. Meanwhile, as you can see in the above chart, Keith's D/ST rankings are currently 23rd in accuracy at FantasyPros, which is not at all shabby. Unlike Kickers, that doesn't reflect my pure Subvertadown rankings; they are his modified rankings, which include other inputs besides my own. I can compare the two with my own calculations (see below), but it doesn't really affect the main point: we're both providing accuracy in the top fraction of D/ST rankers.
Anyone who has read my previous Accuracy Reports knows that I care about this kind of credibility because I'm mostly concerned with my projections for future-weeks. The goal is to be just good enough among the top, so that future forecasts are credible.
I have long discouraged using these accuracy reports as a "judgement". Don't fade a ranker (including myself) simply for not standing at #1. My main goal is really to validate that my models continue to operate at a high-enough level. Usually I use this space to discuss areas for improvement. This time, I find no need to improve the streaming models, but there is a need to somehow improve the Betting Lines.
The other purpose of these reports is for us all to understand "what kind of season we're in". To say it differently, if you’re new to these: Examining predictive accuracy has been a long-standing tradition, underpinning a key purpose of Subvertadown: To give us an understanding of how predictable things are. We want to know "Was this season more or less predictable than normal?"
So as usual, here’s a look at how each individual model is doing, compared to other seasons.
This does not tell "how good the models are". It only tells us how predictable the current season is, compared to the historical norm.
Things look different this season, for the flex positions. Running backs and especially Wide receivers are less predictable than most seasons, whereas TE has been more predictable. You might have found it less reliable than usual to guess which games would be shoot-outs and which would be ground-and-pound. Apparently, the game script has flipped more often this season.
However, the good news is that it has been a great year for streaming D/ST-- and a great year for QB too.
Reminder / for newbies: As stated above, I’m not trying to be (and don’t expect to be) “#1”. That’s plain unrealistic. My goal is rather to check that the models are still performing at a similar level to other reliable rankers, i.e., sources that have been consistently good for at least a few years. Many sources are great one year but then poor the next. My chosen experts are good in a more consistent way. We naturally trade places at the top, year to year. Knowing that the models perform at least "similarly" to top sources lends confidence and gives reason to trust forecasting when we extrapolate the models to future weeks.
As I reported back in the week 6 accuracy report, D/ST accuracy has continued to be quite excellent, relative to other seasons. The other rankers are doing well, too.
The following accuracy graph is different in a few aspects, compared to how I've usually shown it.
I am showing the week-to-week fluctuation in accuracy, instead of showing a single average of the season.
Instead of showing the 5 other sources individually, I have combined them into 1 average. I made this choice because they are all great D/ST sources, and the purpose is to compare myself to the whole.
I am using the FantasyPros "Accuracy Gap" calculation, instead of my usual Correlation Coefficient evaluation. And I'm doing it for Yahoo scoring this time. I thought these changes were relevant, considering the news at the top of this post. (Disclaimer: My calculation appears to give different numbers from FantasyPros. I'm implementing their method as they describe... However they have not published their reference numbers, for converting weekly ordinal ranking into fantasy points. As an example, I assume each week's #1 D/ST scores "20.1" points, and I get this number from the average of the last 17 weeks. FantasyPros might be averaging multiple seasons, and they could end up at a different number, say "23" instead of "20.1". The rest of the calculation is exactly as they describe.)
Remember, lower is better.
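As a rough illustration of the accuracy-gap idea described above, here is a minimal Python sketch: each predicted rank is converted to a reference point value, and the gap is the absolute difference from what that team actually scored. Only the 20.1 value for rank #1 comes from this post. The function name, the other reference values, the sample teams, and the choice to average (rather than sum) the gaps are all assumptions for illustration, not FantasyPros' published numbers or code.

```python
def accuracy_gap(predicted_ranks, actual_points, reference_points):
    """Mean absolute gap between rank-implied points and actual points.

    predicted_ranks:  dict of team -> predicted rank (1 = best)
    actual_points:    dict of team -> fantasy points actually scored
    reference_points: list where index 0 holds the points assumed for rank #1
    """
    gaps = []
    for team, rank in predicted_ranks.items():
        implied = reference_points[rank - 1]  # convert ordinal rank to points
        gaps.append(abs(implied - actual_points[team]))
    return sum(gaps) / len(gaps)


# Hypothetical reference table: only the 20.1 for rank #1 is from the post.
reference = [20.1, 15.0, 12.0, 10.0]

# Made-up rankings and results for a single week.
ranks = {"Team A": 1, "Team B": 2, "Team C": 3, "Team D": 4}
actual = {"Team A": 12.0, "Team B": 18.0, "Team C": 12.0, "Team D": 4.0}

week_gap = accuracy_gap(ranks, actual, reference)
print(round(week_gap, 3))
```

Changing the reference table (say, "23" instead of "20.1" for rank #1) shifts every ranker's gap, which is one reason the absolute numbers here may not match the published ones even when the method is the same.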
The graph shows that my model (yellow line) has usually produced a lower gap than the Average of my other 5 favorite sources (black line). Exceptions: Weeks 1, 6, and 9.
For those of you who have been curious about the crowd-sourced, Pick6x6 "Consensus" rankings, I have plotted them in (brown) for comparison. It shows that your consensus rankings managed to "beat" my raw rankings in weeks: 1, 5, 10, and 11.
And finally, out of my own curiosity... I tried to compare the accuracy gap of the Stream-o-matic. Results are overall inconclusive, but Keith definitely improved on my own rankings in weeks: 1, 3, 9, and 11.
The most encouraging sign, from all these, is that the Subvertadown and modified rankings all tend to lie beneath the black line. That means we're competitive with historically reliable sources. We can trust that my D/ST future forecast projections are meaningful.
Having a version of my model in the FantasyPros accuracy competition provides a validation of my own calculations, where I also have found myself at #1 since week 4.
As with D/ST, here is a plot of my calculated accuracy gap, week-by-week. It shows that my accuracy gap has been lower for most weeks of the season. (Exceptions were weeks 1 and 11.)
After working so much on revisions, new sophistication, and new tools to go along, I'm extremely happy that there appears to be a consistent trend of producing a lower accuracy gap each week. I hope and expect to finish the season at #1 or #2.
As I announced in week 6, my QB model has been on quite a successful roll this season. After a poor week 1, it has done a great job. I don't make a separate Reddit post about streaming QBs, but I hope people are getting good use out of it. The other sources I compare against are incredible experts at ranking the position.
This plot shows correlation coefficients, so higher is better.
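For anyone unfamiliar with the measure, here is a minimal sketch of a correlation coefficient between one week's projections and the actual results. It assumes the Pearson form, which is an assumption (the exact variant is not specified here), and the projected and actual numbers are invented for illustration.

```python
from math import sqrt


def pearson_r(projected, actual):
    """Pearson correlation between projected and actual scores."""
    n = len(projected)
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / sqrt(var_p * var_a)


# Made-up projections and outcomes for five QBs in one week.
projected = [22.0, 19.5, 18.0, 16.5, 15.0]
actual = [25.0, 14.0, 20.0, 12.0, 16.0]

r = pearson_r(projected, actual)
print(round(r, 2))
```

A value near 1 means the projections ordered and spaced the players much like the real outcomes did; a value near 0 means the projections carried little information that week.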
The other sources did better especially in weeks 1 and 9 (and slightly better in weeks 3, 4, and 7). But my model has managed to outperform in 6 weeks by a good margin.
Based on probability alone, we should normally expect to lose slightly more than 1x per pathway during the last 6 weeks of Survivor. The average expectation for this duration is about 4-5 losses (summed across all 3 pathways tracked), where a loss means I need to switch to a “backup” pathway. The actual number has been 4 replacements, which puts the last 6 weeks at better than a 66% win probability each week. This was a big improvement compared to the first 6 weeks of the season (7 failures). The current results are close to normal for the total 12 weeks.
The losses were as follows:
Pathway 1: Bengals week 1 (Lions alternative), Ravens week 2 (Texans alternative), Seahawks week 5 (Chiefs alternative), Steelers week 12 (Buccaneers alternative)
Pathway 2: Buccaneers week 3 (Jets alternative), 49ers week 5 (Commanders alternative), 49ers week 11 (Rams alternative)
Pathway 3: 49ers week 3 (Bills alternative), Jets week 4 (Cowboys alternative), Ravens week 8 (Chargers alternative), Commanders week 12 (Dolphins alternative)
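The tallies above can be checked with a short Python sketch. It assumes exactly one pick per pathway per week (3 pathways over 12 weeks, so 36 picks), with a lost pathway continuing on its backup; that bookkeeping, and the helper name `win_rate`, are assumptions for illustration.

```python
PATHWAYS = 3

# Weeks in which each pathway lost, taken from the list above.
losses = {
    1: [1, 2, 5, 12],
    2: [3, 5, 11],
    3: [3, 4, 8, 12],
}


def win_rate(first_week, last_week):
    """Return (losses, share of picks won) for an inclusive week range."""
    weeks = last_week - first_week + 1
    picks = PATHWAYS * weeks  # assumes one pick per pathway per week
    lost = sum(
        1
        for pathway_weeks in losses.values()
        for week in pathway_weeks
        if first_week <= week <= last_week
    )
    return lost, (picks - lost) / picks


first_half = win_rate(1, 6)
second_half = win_rate(7, 12)
season = win_rate(1, 12)

for label, (lost, rate) in [("Weeks 1-6", first_half),
                            ("Weeks 7-12", second_half),
                            ("Weeks 1-12", season)]:
    print(f"{label}: {lost} losses, {rate:.0%} of picks won")
```

Under that assumption, the split is 7 losses in weeks 1-6 and 4 losses in weeks 7-12, which works out to roughly 61% and 78% of picks winning, respectively.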
While everything else has gone well this season, specifically the Betting Lines have gone poorly.
This has been hugely discouraging for me, because I invested hundreds of hours in the offseason. I wanted to turn 2023's success into "the best darn betting recommendations out there"-- and the results have clearly gone the opposite way. My off-season work was spent on all kinds of optimizations:
Deciding when to bet early versus late.
Studying how that decision depends on which week in the season.
Deciding between 2 different betting mechanisms ("relative" models versus "absolute" models), according to the type of game and time of season.
Well, obviously it's time to reflect, based on some bad weeks and a lack of the great upswings we should normally see. As I have posted in my weekly homepage articles, Betting Lines were hit fairly hard in a couple more weeks during the mid-season: a 30% loss in week 9, followed by a 40% loss in week 10.
It's easy to say "well it's gambling". But that doesn't change the fact that the extent of poor results is highly improbable, statistically speaking! Let me put it another way. If you had bet on everything in reverse from my recs, and we were riding high on a wave of earnings, then I would still be sitting here telling you "these are unrealistically positive returns; don't get your hopes up for the next week, because these past results are not reliable indications, and they're not consistent with past simulations."
I have identified a few specific ways to improve. First, there was some overfit in the models, which I took care of after week 6. Second, I started the season assuming I should eliminate betting suggestions when my two separate models conflict with each other. That was a problem, because it resulted in too few options each week, which was risky. I addressed this issue after week 4, by finding logic to choose the "more likely correct" model. Third, I next want to work on making sure the bets are more evenly distributed. We've had some weeks where the Spread bets recommended betting on all the favorites. There should usually be an equal amount of Underdog bets.
Here is the IMGUR link to the actual bets that were suggested.
Good luck in the final third of the season!
/Subvertadown