By Subvertadown
Tagged under: Betting Lines, Expectations
October 1st, 2022
I have been so eager to make this analysis since January!
It has just taken this long to get everything in place-- first the models, then the simulation programming, and now the communication. I hope you'll find some of the following interesting, whether you bet or not (I don't recommend gambling as a financial plan!).
First of all, I happen to be writing this after a good week 3. Week 2 was also quite okay. That gives me the chance to say: NO, the good returns of these last couple of weeks are NOT to be expected. Rather, I really want to temper expectations and drive home the opposite point: gambling is risky! You cannot expect the same returns, as you'll see below. [Last week had an ROI in the 60% range (assuming bets returning 1.9x) and the week before circa 30%. Week 1 had a loss of more than 30%. Of course, these amounts fluctuated depending on when bets were placed during the week.]
Yes, I could have called it "how much you can gain", but I want to emphasize the element of risk. So based on my models and historical simulations, here is the picture of what ROIs would have looked like historically, using the same formulas and making on average 15 bets per week. As you can see, it is absolutely possible to lose most (or all) of the bankroll in 1 week. In fact, you cannot call that out of the ordinary:
I hope you can see that: close to half the time, my strategy yields losses.
I hope for an average weekly ROI of 8% - 9% over the season (additive per 16 weeks giving 112%), but I reiterate that this might not happen. All I do is provide a method, and some people who are comforted by the approach might enjoy using it as a guideline.
Now that I've hopefully scared away anyone with high expectations, here's how this whole betting lines module emerged. For the last 5 years I have been working on game score models. It came out of doing the D/ST and kicker work, not just because I thought it could be used for gambling. In order to simulate future-week D/ST score projections etc., I sometimes needed to simulate future-week NFL scores as input.
Then, a year or two ago, I realized my approach was coming really close to Vegas accuracy, and (without revealing too much) I realized I could even identify statistical inefficiencies in some Vegas lines. Doing so doesn't even require that my accuracy is always better; it just requires recognizing the signals that betting lines might be slightly swayed. I was not comfortable sharing the results until I had developed a basic diversification scheme (it works close to the Kelly Criterion; think of it as a tweak of that). I finally did this in the middle of last year.
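For readers unfamiliar with it, here is a minimal sketch of the plain, textbook Kelly Criterion at the 1.9x return assumed in this post. It is not the tweaked diversification scheme mentioned above, whose details are not disclosed; the function names are illustrative only.

```python
def break_even_probability(decimal_odds: float) -> float:
    """Win probability at which a bet at these decimal odds has zero expected profit."""
    return 1.0 / decimal_odds


def kelly_fraction(win_prob: float, decimal_odds: float) -> float:
    """Textbook Kelly fraction of bankroll to stake; 0.0 when there is no positive edge."""
    net_odds = decimal_odds - 1.0  # profit per unit staked when the bet wins
    fraction = (win_prob * net_odds - (1.0 - win_prob)) / net_odds
    return max(fraction, 0.0)


if __name__ == "__main__":
    odds = 1.9  # the standard return assumed in this post
    print(f"break-even win probability: {break_even_probability(odds):.4f}")
    for p in (0.50, 0.53, 0.555, 0.60):
        print(f"p = {p:.3f} -> Kelly fraction = {kelly_fraction(p, odds):.4f}")
```

At 1.9x, the break-even win probability is 1/1.9, or about 52.6%, so any bet estimated below that gets a zero stake under plain Kelly.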
Last year this worked well for a while, but fell apart late-- especially weeks 14-15. I have been so curious-- since January-- to upgrade the models and find out whether this was an accident or something avoidable. So I tripled my seasons of data to add more robustness, and now I could finally investigate how my approach might be expected to perform. And it's not just the season-long performance that's interesting: It's especially important to understand the week-to-week variation AND season-to-season variation.
So for transparency's sake, here is how I simulated the results, to generate historical ROIs. It would be really important to understand whether I made any assumptions that could have been too optimistic. Here were the steps:
The spreads and totals are calculated by cross-validated models, like I normally do. And this part was done in the proper conservative way: NOT training the model with the results of the same year in question. For example, to get the 2021 spread estimates, I trained my models on the other years (2010-2020), and then applied the resulting formulas to 2021 data. Otherwise, using same-season data would be back-fitting the data onto itself. Note: I assumed the order of seasons shouldn't affect the approach, i.e. predicting 2013 using 2014 data ("future" data) is just as hard as using only past data. This could be the one source of bias, because it does not pick up on the risk of league-wide changes in trends (it assumes the same trends should always have applied).
Another conservative aspect was only using raw data with no attempt to account for injuries / trades or any other weekly variations. This means the model stuck with a high spread even in games when the starting QB had been ruled out.
Using the raw approach just described, it's easy to calculate which bets would have won, i.e. when my model turned out "more right" than Vegas. Interestingly (or maybe no surprise), the resulting models won close to 52.5% of the time-- in other words, no upfront betting advantage for a standard 1.9x return.
Now the critical step: analyzing how this resulting pattern of wins and losses was associated with all the inputs-- to yield a formula that represents probability assignment. There are details I don't want to disclose, but the trick is in identifying which bets have only a 50% chance and which ones have a 53%+ chance. From my own skeptical point of view, this is the one point in my process where there was potential for mathematical overfitting, so I applied the same cross validation to de-risk.
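The "hold out the season in question" idea from the steps above can be sketched as a simple split generator. This only illustrates the splitting logic; the season range comes from the 2021 example, and the function name is hypothetical, not part of the actual modeling code.

```python
from typing import Iterator, List, Tuple


def leave_one_season_out(seasons: List[int]) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_seasons, held_out_season) pairs, one per season.

    The held-out season never appears in its own training set, but later
    seasons may, matching the assumption that season order does not matter.
    """
    for held_out in seasons:
        training = [s for s in seasons if s != held_out]
        yield training, held_out


if __name__ == "__main__":
    seasons = list(range(2010, 2022))  # 2010..2021, assumed from the example above
    for training, held_out in leave_one_season_out(seasons):
        print(f"predict {held_out} using {len(training)} other seasons")
```

A stricter alternative would train only on seasons before the held-out one, which would address the ordering bias noted above at the cost of less training data for early seasons.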
Here is a picture of the resulting probability estimates, from which you can see the vast majority of bets fall below 53% and therefore are excluded. I don't want to make claims, as advice givers do, that 60% is the expectation.
There were so many things I was curious to see! Right away, returning to the question I had at the end of 2021, I could see that weeks 14-15 of last year were, in fact, simply unavoidable losses. Secondly (seen in the first graph above), I could understand that they were pretty normal losses in comparison to history.
I was also curious if some weeks were just harder to predict-- for example were earlier or later weeks harder to beat Vegas lines? You can see that in the following. This is a simulation of placing approximately 15 wagers each week-- i.e. a weekly “portfolio” with each wager at the appropriate risk-adjusted amount. By the way, “Cumulative ROI” includes each individual wager as “investment”, so it represents the average weekly ROI (8% on the season would be 112% on 16 weeks):
The cumulative ROI shown is the ROI of all weeks from week 1 up to the given point in time.
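As a minimal sketch of that definition, cumulative ROI can be computed as total profit divided by total amount wagered so far. The function name and the example figures are hypothetical, chosen only to show the arithmetic.

```python
from typing import List, Tuple


def cumulative_roi(weekly: List[Tuple[float, float]]) -> List[float]:
    """Given (amount_staked, profit) per week, return the ROI of all weeks so far.

    Every wager counts as "investment", so with equal weekly stakes the
    result equals the average of the weekly ROIs up to that week.
    """
    results: List[float] = []
    staked_so_far = 0.0
    profit_so_far = 0.0
    for staked, profit in weekly:
        staked_so_far += staked
        profit_so_far += profit
        results.append(profit_so_far / staked_so_far if staked_so_far > 0 else 0.0)
    return results


if __name__ == "__main__":
    # Hypothetical three weeks at 100 staked per week: -30%, +30%, +60%.
    print(cumulative_roi([(100.0, -30.0), (100.0, 30.0), (100.0, 60.0)]))
```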
Here's my summary of observations:
2 out of these 11 seasons ended with either a loss or no-gain. It is definitely possible to go through a season and have a net loss.
Across seasons, a median ROI of 8-9% resulted, and I consider that the target instead of any higher number.
Surprising to me, the first 4 weeks were often somehow the best. I was very surprised that week 1 appeared to actually provide more wins, and I'm still not sure how much I believe it. (In this simulation, week 1 estimates were based only on last-season data.)
However, this "early-season advantage" did not seem to apply over the prior 3 years. They have been running with losses at the start.
Looking at the black line, the cumulative-ROI starts to decline after week 4. That dip means that weeks 5-7 tend to bring losses, bringing down the season average. Are weeks 5-7 difficult because Vegas starts to "learn better"? Or because teams re-strategize? Or because injuries pile up? I have no idea.
Other interesting notes:
The average probability on bets that get suggested is about 55.5%.
The average number of suggested bets per week is 9: 4 from totals, 5 from spreads.
Spreads are harder to win against betting lines; totals are comparatively easier.
This analysis has led me to slightly revise my suggestions.
Week 1: In my week 1 post this year, I suggested not betting at all. But according to the graph, week 1 might actually be a chance to take advantage of overly reactive perceptions of team change from the previous season. It turns out my recommendation was right this particular year-- Week 1 brought a loss this year (2022)-- but next year I won't say the same.
Compounded betting: At some point I noted in a Reddit post that there would have been huge gains last year if the earnings had been reinvested, so that each week the bankroll increased and compounded exponentially. I did not mean anyone should follow this suggestion... and now you can clearly see the risk from the data. Do not do this. Rather plan for a certain amount of bets over the full season-- you might risk e.g. only 25% of that amount at a given time.
Reserves: Looking at the historical returns, the safe recommendation is to choose a fixed sum for a target weekly bankroll, while at the same time always reserving (NOT betting) an equal size amount "for future weeks"... "just in case". This would have worked, with room to spare, in every season except 2013 which looks like an outlier.
Number of bets. Historically, the updated models result in approximately 9 bets per week. This is different from my previous suggestion of 20 bets. Going forward, expect somewhere between 4 and 13 bet suggestions per week.
Special note for 2022: We are seeing an abnormally low number of recommendations-- because more often than not my model "agrees" with Vegas and finds little advantage. Therefore, we are seeing consecutive weeks with only a few bets recommended.
Size of bets. I have so far tried to recommend the simple approach of applying a fixed bankroll each week-- for example spreading out exactly $100 each week. However, it is clear that the optimal approach is to expect a different number of bets and a different bet amount each week, depending on the confidence level. From now on, I will not make an effort to have the bets add up to a fixed number every week. It will be a "floating" sum, reflecting the degree of risk.
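To illustrate the compounding warning above, here is a small sketch comparing a fixed weekly bankroll with reinvesting everything each week. The weekly ROIs are made-up numbers, not simulation results, and the function names are illustrative.

```python
from typing import List


def fixed_stake_profit(weekly_rois: List[float], weekly_bankroll: float) -> float:
    """Total profit when the same bankroll is wagered every week."""
    return sum(weekly_bankroll * roi for roi in weekly_rois)


def compounded_bankroll(weekly_rois: List[float], starting_bankroll: float) -> float:
    """Final bankroll when the entire bankroll is re-wagered every week."""
    bankroll = starting_bankroll
    for roi in weekly_rois:
        bankroll *= 1.0 + roi
    return bankroll


if __name__ == "__main__":
    # Hypothetical season fragment that includes one week losing the whole stake.
    rois = [0.6, 0.3, -1.0, 0.4]
    print("fixed-stake profit:", fixed_stake_profit(rois, 100.0))
    print("compounded bankroll:", compounded_bankroll(rois, 100.0))
```

With these made-up numbers, the fixed-stake bettor ends up ahead, while the compounding bettor is wiped out by the single total-loss week and has nothing left to wager afterwards.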
Remember, this is all just for fun; don't stress or go crazy with it. But I hope this description and the model either provide useful advice or, at a minimum, help your decision-making as you refer to other sources for betting suggestions.
/Subvertadown