On Walter's suggestion, I played with the binomial distribution. It's not perfect, but it's better. I now get an expectation of 1.53 teams going undefeated and 2008 USC with a 33.85% chance of having exactly one loss and 19.02% chance of going undefeated, both of which make a lot more empirical sense.
This isn't perfect: I tried to reverse engineer the E(wins) out of the binomial distribution until I realized I couldn't get half wins back out, so the E(wins) calculation uses the same assumptions as before (e.g. symmetry).
In any case, I'm not really planning on using this for anything other than making fun of retard fanboys who think Georgia is going to win the national title (these numbers give 2008 UGa a 17.1% chance of exactly one loss and a 5.1% chance of going undefeated -- prescient), so it doesn't need to be perfect, but I think Jonny will be able to write some good posts with these.
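For anyone who wants to check the arithmetic, here is a minimal sketch of the binomial approach. It assumes a 12-game schedule and a single per-game win probability of E(wins)/12, which is a simplification and not necessarily the exact calculation behind the numbers above. The 10.45 expected wins for USC is back-solved from the quoted percentages, not a figure from the post, and the function names are made up for illustration.

```python
from math import comb

def binomial_win_pmf(expected_wins, games=12):
    """P(exactly k wins) for k = 0..games, treating every game as an
    independent coin flip with the same win probability (an assumption)."""
    p = expected_wins / games
    return [comb(games, k) * p**k * (1 - p) ** (games - k)
            for k in range(games + 1)]

def expected_undefeated(expected_wins_by_team, games=12):
    """Expected number of undefeated teams: the sum of each team's
    probability of winning every game (linearity of expectation)."""
    return sum(binomial_win_pmf(w, games)[games] for w in expected_wins_by_team)

# Hypothetical input: 10.45 is back-solved, not taken from the post.
usc = binomial_win_pmf(10.45)
print(f"undefeated: {usc[12]:.1%}, exactly one loss: {usc[11]:.1%}")
```

With that input the sketch lands close to the 19% and 34% quoted above, which suggests the equal-probability binomial is at least consistent with those numbers.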
Future Win Totals Thinking Revisited
I've got time to put the full record update up since games don't start again until Thursday, so I figured I would add to this post from last week.
Since I figured out the expected wins of each team from the totals posted at BetUS, all I needed was a standard deviation to come up with a distribution of the probability of each team having a particular number of wins. I did this by using last year's totals and actual results to come up with a root mean squared error estimate of the standard deviation, which was around 2.16.
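The RMSE step can be sketched as follows. The totals and results in the example are invented placeholders, not last year's actual data, and the function name is hypothetical.

```python
from math import sqrt

def rmse(posted_totals, actual_wins):
    """Root mean squared error between posted win totals and actual wins,
    used here as a rough stand-in for the standard deviation."""
    if not posted_totals or len(posted_totals) != len(actual_wins):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(actual - total) ** 2
                      for total, actual in zip(posted_totals, actual_wins)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Placeholder data for four made-up teams.
posted = [10.5, 8.0, 6.5, 4.0]
actual = [12, 6, 7, 5]
print(round(rmse(posted, actual), 2))
```

Treating the RMSE as the standard deviation assumes the posted totals are unbiased estimates of expected wins, which is one of the things questioned further down.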
I'll let Jonny analyze each particular team, but I came up with an unexpected and almost assuredly erroneous result. The numbers show the expected number of teams that should go undefeated is 4.27.
There are a couple of reasons that immediately spring to mind why this number is wrong. First off, my methodology could be wrong. I used a normal distribution to estimate probabilities for discrete data. While not terrible, it's still not good (for math geeks, this is analogous to using the trapezoidal rule for estimating integrals).
Also, it may not be a good idea to use a symmetric normal distribution for totals that are 10, 11, or 12, since the tails of a normal distribution extend to infinity and you are butting up against the maximum possible number of wins.
Another reason would be that last year's RMSE is not representative of the true standard deviation of the distribution. The last thing that comes to mind is the books shading the lines high for the expected good teams, expecting to take a lot of over action.
To use a further example, last year Moneyline, when he was still running a blog, estimated USC's chances of going undefeated at 14.8% and their chances of one loss at 32.4%. Using this methodology, 2008 USC had a 31.3% chance of going undefeated and a 17.7% chance of one loss. Obviously, Moneyline's numbers are way closer to reality than these are.
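A rough sketch of the normal approach shows where the inflated undefeated number comes from. It assumes each win count k gets the probability mass between k - 0.5 and k + 0.5, with the top and bottom bins left open so the entire upper tail is counted as 12 wins. That binning is an assumption, though it does reproduce the USC figures when fed an expected win total of about 10.45 (back-solved, not taken from the post).

```python
from statistics import NormalDist

def normal_win_probs(expected_wins, sd=2.16, games=12):
    """Approximate P(exactly k wins) for k = 0..games from a normal curve.
    Each win count gets the area from k - 0.5 to k + 0.5; the end bins are
    open, so all the tail mass above games - 0.5 counts as undefeated."""
    dist = NormalDist(mu=expected_wins, sigma=sd)
    probs = []
    for wins in range(games + 1):
        lower = dist.cdf(wins - 0.5) if wins > 0 else 0.0
        upper = dist.cdf(wins + 0.5) if wins < games else 1.0
        probs.append(upper - lower)
    return probs

usc = normal_win_probs(10.45)  # 10.45 is a back-solved assumption
print(f"undefeated: {usc[12]:.1%}, exactly one loss: {usc[11]:.1%}")
```

Because the whole tail above 11.5 is assigned to 12 wins, any team with a high total picks up probability for seasons that can't happen, which is the problem with the ceiling on total wins described above.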
There is quite a bit of room for improvement here; I'm just not sure where.
Short List for the Naismith
Anyone else totally stoked at the thought of backing Greivis Vasquez again next year? No? I don't understand why he doesn't go to Italy now and get it over with.
Passing everything tonight. I want to play Cleveland, especially with the line movement, but I can't justify it with the numbers at Wagerline and Sportsbook.
Streak for the Cash
7p LSU/Arkansas o10.5 runs
Current Streak: 0
Flip CS-Fullerton. Two games at better than -200 and lost them both. If those lines were accurate, there was roughly an 8% chance of them losing both games.
I should probably play the Angels at 10p tonight, but flip them too.
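The "roughly 8%" figure can be sanity-checked with a quick moneyline conversion. This sketch ignores the bookmaker's vig and assumes the two games are independent. The -250 price is an assumption standing in for "better than -200", since the actual lines aren't given.

```python
def implied_probability(american_odds):
    """Win probability implied by an American moneyline, ignoring the vig."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

def chance_both_lose(odds_a, odds_b):
    """Probability that two independent favorites both lose."""
    return (1 - implied_probability(odds_a)) * (1 - implied_probability(odds_b))

# -250 is a guess at the actual prices, not a number from the post.
print(f"{chance_both_lose(-250, -250):.1%}")
```

At exactly -200 on both games the same calculation gives about 11%, so a figure near 8% implies prices somewhat steeper than -200.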
Football Predictors
I'm finally feeling like putting all of my academic statistical knowledge to use. With the help of some other people in the contrarian community, I'm going to try a first stab at handicapping this summer (yes, summer). Before I do all the necessary coding for the stats stuff, I need to figure out what the best predictors are for success in college, pros, and/or both so the data can be acquired and manipulated.
My initial list looks like this: points scored, points allowed, yards gained, yards allowed, and interceptions thrown. The first four should be pretty obvious. The last one is the proxy for turnovers. Clearly, turnovers affect a football game's outcome, but other research has shown that fumbles are fairly random. However, interceptions depend mostly on the quality of the quarterback and are predictable from week-to-week (as any Penn State fan from 2006-7 can attest).

If you wanted to predict outcomes of games, what statistics would you use, and why would they be better than the ones I'm looking at trying? Also, in the comments, feel free to make fun of me for even bothering to attempt a project that will surely fail.
9/19/09 Revisited
Doc Saturday, admittedly not from a gambling perspective, seems to think I'm not totally retarded.
Looking Ahead: 9/19/09
Yeah, I know it's basketball season, but obviously college football is my favorite sport...
Just guessing here, but given what I've seen written by the MSM so far:
Vanderbilt +6 +102 3x
Ole Miss was one of my favorite teams at the end of last season. They were, in my opinion, legitimately the third best team in the SEC last season. That said, let's take a look at their last 5 games:
W 17-7 Auburn (chicken garbage)
W 59-0 UL-Monroe (Sun Belt)
W 31-13 @ LSU (chicken garbage)
W 45-0 Mississippi State (chicken garbage)
W 47-34 Texas Tech @ Cotton Bowl (overrated?)
These games are what the public is going to remember heading into 2009. The last 5 teams were mainly garbage. Texas Tech, while having an outstanding season, still seemed to me like they had a better record than their actual skill level. However, some people are ranking Ole Miss in their Preseason Top 5. I'm not so sure. With only 5 returning offensive starters and 2 on the offensive line back, I'm not really expecting Ole Miss to put up the numbers they did at the end of last season, Jevan Snead notwithstanding.
I am of the opinion that Ole Miss will have a higher public perception than what they deserve, given the way they took care of the awful SEC West last season. The only part that I am unsure about is how good Vandy will be this season, but they are returning 19 of 22 starters from last year's bowl team. Hence, I'm thinking Vandy will be an anti-public pick in Week 3.