Looking at Time Series Analysis of am19psu's Results

Well, I went ahead and downloaded Minitab so I could do some ARIMA modeling of my data. I have to admit, the preliminary results are a little disheartening. One thing to keep in mind is that the strategy is ever-evolving, so the results here may not all have come from the same distribution, but I doubt that is much of a factor.

As you recall, the figure below shows my results through March 1st for the 2008-09 basketball season.



I used disheartening above because when I did the time series analysis I came up with this:

Type        Coef  SE Coef      T      P
AR 1      0.9712   0.0158  61.46  0.000
Constant  0.0645   0.1481   0.44  0.664
Mean      2.235    5.133

Basically, what that means is that my results are very close to a random walk. The AR1 coefficient is the coefficient on the previous value of the series (the serial correlation term), and it is highly significant (p-value 0.000), while the constant is not significant (p-value 0.664). Note that the coefficient is near 1. If it were exactly 1, with a constant of zero, it would be a true random walk. This pattern is one that is destined to fail over time. It's even been called gambler's ruin.
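To make the model concrete, here is a minimal Python sketch that simulates Y(t) = a * Y(t-1) + c + e and recovers a and c by ordinary least squares. It is not Minitab's ARIMA routine, which has its own estimation procedure, and the names `simulate_ar1` and `fit_ar1` are made up for illustration. The only point is that a series built with a = 1 and c = 0 gives back an estimated coefficient close to 1.

```python
import random

def simulate_ar1(a, c, n, seed=0, sigma=1.0):
    """Simulate Y(t) = a * Y(t-1) + c + e, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(a * y[-1] + c + rng.gauss(0.0, sigma))
    return y

def fit_ar1(y):
    """Least-squares regression of Y(t) on Y(t-1); returns (a_hat, c_hat)."""
    x, z = y[:-1], y[1:]          # x = lagged values, z = current values
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    a_hat = sxz / sxx             # slope = AR1 coefficient
    c_hat = mz - a_hat * mx       # intercept = constant
    return a_hat, c_hat

# A true random walk (a = 1, c = 0) versus a clearly stationary series.
walk = simulate_ar1(a=1.0, c=0.0, n=2000, seed=1)
stationary = simulate_ar1(a=0.5, c=1.0, n=5000, seed=2)

for label, series in (("random walk", walk), ("a=0.5, c=1", stationary)):
    a_hat, c_hat = fit_ar1(series)
    print(f"{label}: a_hat={a_hat:.4f}, c_hat={c_hat:.4f}")
```

Least squares tends to underestimate a slightly when the true value is 1, so an estimate a little below 1 does not by itself rule out a random walk.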

This result has many names: the level-crossing phenomenon, recurrence or the gambler's ruin. The reason for the last name is as follows: if you are a gambler with a finite amount of money playing a fair game against a bank with an infinite amount of money, you will surely lose.
And in general, we're not even playing a fair game because of juice.
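A rough simulation of that idea, as a sketch under simple assumptions: every bet is a coin flip, the gambler starts with 10 units, and a loss costs either 1 unit (fair game) or 1.1 units (standard -110 juice). The function name `ruin_fraction` and all of the numbers are illustrative and are not taken from my records.

```python
import random

def ruin_fraction(bankroll, max_bets, loss=1.0, trials=500, seed=0):
    """Fraction of simulated gamblers who go broke within max_bets bets.

    Each bet wins 1 unit or loses `loss` units, with probability 1/2 each.
    loss=1.0 is a fair game; loss=1.1 mimics -110 juice.
    """
    ruined = 0
    for trial in range(trials):
        # One random stream per gambler, so the same gambler sees the same
        # sequence of wins and losses no matter the horizon or the juice.
        rng = random.Random(seed * 1_000_003 + trial)
        money = float(bankroll)
        for _ in range(max_bets):
            money += 1.0 if rng.random() < 0.5 else -loss
            if money <= 0:
                ruined += 1
                break
    return ruined / trials

for horizon in (100, 1000, 5000):
    fair = ruin_fraction(10, horizon)
    juiced = ruin_fraction(10, horizon, loss=1.1)
    print(f"{horizon:>5} bets: fair ruin={fair:.2f}, with juice ruin={juiced:.2f}")
```

The longer the horizon, the larger the share of gamblers who have hit zero, and the juice only speeds that up.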

This graph shows the results from 2008-09 college football.



The time series analysis for this sucks even worse.

Type         Coef  SE Coef      T      P
AR 1       0.9382   0.0300  31.26  0.000
Constant  -0.8430   0.1800  -4.68  0.000
Mean      -13.636   2.912

In this particular example, the AR1 term is a bit farther away from 1, meaning it is likely not a random walk, but the constant term is negative. That means I am a loser when it comes to college football, at least in 2008-09. Not that I had any inkling of that from the sidebar.

The next figure shows my NCAAF results with the first two weeks left out. One of the things that I've considered is that contrarian gambling early in the season is not profitable. ML has told me stories about Squeeky cleaning up in college football early in a season, 2004 or 2005, I believe. However, I have not had any luck whatsoever. This graph doesn't necessarily support that hypothesis, but it is fair to say that I never got out of the hole I dug myself the first two weeks.



These results are much more indicative of what I want to see. With an AR1 coefficient around 0.85 and a positive constant, I am showing NCAAF to be profitable for 2008-09 after Week 2.

Type        Coef  SE Coef      T      P
AR 1      0.8451   0.0445  18.99  0.000
Constant  1.0931   0.1816   6.02  0.000
Mean      7.056    1.173
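The Mean row in each table is tied to the other two rows. For an AR(1) model with the AR1 coefficient a below 1 in absolute value, the long-run mean works out to c / (1 - a). Here is a quick check of that relation against the three tables, assuming this is how the Mean row is derived (the small gaps would then come from rounding in the printed coefficients). The labels and the `implied_mean` helper are mine.

```python
# (AR1 coefficient a, constant c, reported mean), copied from the tables above
fits = {
    "Basketball 2008-09": (0.9712, 0.0645, 2.235),
    "NCAAF 2008-09": (0.9382, -0.8430, -13.636),
    "NCAAF after Week 2": (0.8451, 1.0931, 7.056),
}

def implied_mean(a, c):
    """Long-run mean of Y(t) = a * Y(t-1) + c + e when |a| < 1."""
    return c / (1.0 - a)

for name, (a, c, reported) in fits.items():
    print(f"{name}: c/(1-a) = {implied_mean(a, c):.3f}, reported mean = {reported}")
```

This is why the sign of the constant matters so much: with a below 1, a negative constant pulls the series toward a negative level and a positive one toward a positive level.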

Whew. At least, debatably, I'm not completely retarded. That said, when I look at the first differences, the p-value of AR1 is not statistically significant, so it is still possible that results are a random walk.
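For anyone who wants to see what the first-differences check is getting at, here is a rough sketch. It is not the ARIMA fit on differenced data that I ran in Minitab. It uses the lag-1 autocorrelation of the differences and the usual rule-of-thumb band of 2/sqrt(n). The data are simulated coin-flip bets, not my results, and the function names are made up.

```python
import math
import random

def first_differences(y):
    """Turn a cumulative series into bet-to-bet changes."""
    return [b - a for a, b in zip(y, y[1:])]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((xi - m) ** 2 for xi in x)
    return num / den

# Simulated cumulative units from 500 independent bets:
# +1 unit on a win, -1.1 units on a loss (an assumption, mimicking -110 juice).
rng = random.Random(3)
results = [0.0]
for _ in range(500):
    results.append(results[-1] + (1.0 if rng.random() < 0.5 else -1.1))

d = first_differences(results)
r1 = lag1_autocorr(d)
band = 2.0 / math.sqrt(len(d))  # rough 95% band if the differences are uncorrelated
print(f"lag-1 autocorrelation of differences: {r1:.3f} (band: +/-{band:.3f})")
```

If the value sits inside the band, one bet's result says little about the next, which is what a random walk in the cumulative series looks like.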

I'm not even going to look at the last two NFL seasons, since I readily admit I have no skill in that sport.

I'm not entirely sure what to make of these results, other than I have not been gambling skillfully in 2008-09. More questions and suggestions are always welcome in the comments.

In part two of this post, which will run Friday at 6PM, I'll look into my 2007-08 results.

21 comments:

Anonymous said...

this is beyond my stats credentials but doesn't this test if every game is independent from the last ie. not correlated. in that case, shouldn't it be a random walk?
also, does the test measure any constant that would trend the walk upwards

sorry if this is way off

Vegas Watch said...

"The next figure shows my NCAAF results with the first two weeks left out."

Isn't this, like, not allowed? You'd be including the first two weeks if you'd done well over that period, no?

Anonymous said...

updating my ignorant questions:

if this shows that it is not quite a random walk, doesn't that mean we can predict the outcome of the next game based on the last game better than 50/50? therefore, it would be advantageous to make a play after a "win" but not a "loss". this seems crazy but is what i originally was referring to about serial correlation.

am19psu said...

Think of the results in terms of regression like this:

a = AR1 coefficient
t = current time step
Y = dependent variable
Y(r) = value of Y at time step r (r is a subscript)
c = constant
e = error

The model equation is then:

Y(t) = a * Y(t-1) + c + e

If a = 1 and c = 0, then it is a true random walk.

To answer your question, in time series, it is assumed that there is some dependence between Y(t-1) and Y(t). In fact, that is one of the ways to determine how many AR terms to use.

If there is an upward trend, it is instructive to look at the first differences to determine if there is a serial correlation.

I hope this helps.

am19psu said...

"Isn't this, like, not allowed? You'd be including the first two weeks if you'd done well over that period, no?"

Certainly. And I believe I did include the first two weeks in the analysis directly above it.

But you're missing the point. I tried to point out that the first two weeks of NCAAF (and for that matter, NCAAB and the NFL) are not good for contrarian gamblers. In other words, I'm not sure the distributions of results are the same.

I see the same thing in college hoops, which is why I've sat out the first month of hoops the last two years.

am19psu said...

"if this shows that it is not quite a random walk, doesn't that mean we can predict the outcome of the next game based on the last game better than 50/50? therefore, it would be advantageous to make a play after a "win" but not a "loss". this seems crazy but is what i originally was referring to about serial correlation."

I need more time to think about this. Since it is similar to a random walk, I'm thinking it may be more like the hot hand fallacy. But I'd be happy to be proven otherwise.

Vegas Watch said...

"But you're missing the point."

Not really. It just seems like it's a small sample size to begin with, and if you're taking out 1/8 of the season, which doubles as your worst run, you're not going to get a result that is meaningful.

am19psu said...

Right. Unless you have reason to believe that you are sampling from a different distribution. Which is what I am arguing. It's not like I am completely ignoring those results, nor am I touting myself as a superior college football gambler given I say this below the image:

"That said, when I look at the first differences, the p-value of AR1 is not statistically significant, so it is still possible that results are a random walk."

Vegas Watch said...

"It's not like I am completely ignoring those results, nor am I touting myself as a superior college football gambler"

You don't need to defend yourself, I didn't say you did. I was just wondering if the results would've been meaningful even if they'd been statistically significant.

Anonymous said...

"I'm thinking it may be more like the hot hand fallacy."

agree but this is what i originally was referring to. NS at 538.com wrote about how the stock market until recently did exhibit serial correlation so i thought maybe we could check it out.

i agree with VW that these are too small of samples to make conclusions but it can clue us in on what to further research

Sham said...

The only point I have to add is the first couple weeks of the NFL season were profitable, it was the rest of the season that went to shit for me.

am19psu said...

"NS at 538.com wrote about how the stock market until recently did exhibit serial correlation so i thought maybe we could check it out."

I didn't see that article. I'll have to check it out.

"i agree with VW that these are too small of samples to make conclusions but it can clue us in on what to further research"

That was kind of my point regarding the first two weeks of NCAAF. Right now it's anecdotal, but all three years I've gambled on college football, the first two weeks have been absolute murder for contrarians.

Maybe it is an odd coincidence and just unlucky that shitty weeks fell at the start and I remember it better because it was early in the season. It's not like shitty weeks can't happen in the middle of the season, too.

Anonymous said...

"Maybe it is an odd coincidence and just unlucky that shitty weeks fell at the start and I remember it better because it was early in the season. It's not like shitty weeks can't happen in the middle of the season, too."

Right. Even if there is no edge whatsoever early in the season, it still wouldn't explain our horrible starts to college football the last couple of years. You'd still expect to hit around 50% not 40-45%.

There is definitely lots of room for improvement in our process(es), but at the end of the day the difference between a good season and a great season and even a bad season and a terrible one is always going to be fucking luck, or variance if you will.

Erich said...

Sorry for the late email, looks like you were quite ambitious and nailed this one pretty quick.
Have you thought about testing other players' track records? I'm sure there are some diligently kept records out there, and I am wondering what you might need to quickly process this for another player's track record or perhaps a rule-based betting strategy.

Anonymous said...

There shouldn't be any reason to look at serial correlation. There is just no reason any previous game would relate to one in the future.

The hot hand fallacy has conflicting variables, like other players getting more touches and stronger defense on the player with the hot hand.

The stock market may have correlation just because it is driven by public demand. If the public even falsely believes there is a correlation they will drive it to have correlation.

Anonymous said...

I thought of an analogy. In blackjack one can read all he wants about betting systems but it will never make him win because each bet individually will have a losing probability. There is no effect from one bet to the next.

Anonymous said...

"There shouldn't be any reason to look at serial correlation. There is just no reason any previous game would relate to one in the future."

I would agree but just wanted to test it because we can.

Then I started trying to think of any reason why there would be correlation. What I came up with has less to do with how one game relates to the last and more to do with the "feelings" of Vegas at a given time. Maybe they set more "traps" in time clusters. This would lead to a serial correlation.

Again, this is somewhat outlandish, but i felt it was worth discussion.

Anonymous said...

dgon679, what do you think of the "trap time clusters"? this would fit serial correlation but doesn't imply one game's outcome affecting the next.

probably silly i suppose

Anonymous said...

"dgon679, what do you think of the "trap time clusters"? this would fit serial correlation but doesn't imply one game's outcome affecting the next."

Well, at least it's a reason. And I think it can imply one game affecting the next. If we continue to make plays on a consistent basis while Vegas is sporadic in making plays, we will win and lose in clusters (well, at least a little).

But why would Vegas only sporadically want to take good gambles?

Perhaps Vegas can't afford to take risks at certain times like when they have just lost a big game. Seems doubtful. I just don't see them ever having cash flow problems. Since every gamble they take we must assume is a good gamble, I don't know why they take caution at any time...

Maybe to throw off Contrarians? I wouldn't think so. If they set a line closer to the actual it will attract more public bets along with more contrarian bets. It just doesn't seem like they really would have a problem with contrarians. Anyway I'm sure someone else has put more thought into this question than me...

Anonymous said...

plausible and extremely unlikely

am19psu said...

"Have you thought about testing other players' track records?"

I'd be happy to run numbers for other players. At least for ML, Jonny, and Vegas, I'm not sure I'm going to see a whole lot of difference.

"There shouldn't be any reason to look at serial correlation. There is just no reason any previous game would relate to one in the future."

At first blush, this seems true. However, don't we end up on the same teams quite a bit (WVU, K-State, etc.)? One could argue that since a majority of our bets are on teams that we (and the books) think are undervalued by the public, there is some sort of co-dependence.

"In blackjack one can read all he wants about betting systems but it will never make him win because each bet individually will have a losing probability."

Unless you are card counting, in which case you are banking on the deck having a memory. Wouldn't it be analogous to any normal bet being a loser in EV because of juice, but here we are trying to identify market inefficiencies (i.e. lots of high cards in the deck) due to square perceptions?

plausible and extremely unlikely?

Lots of good back and forth last night guys. My only addition would be that books aren't infallible. Last I knew, from a guy that works in the market, the books haven't done well this year. If we are "betting with Vegas," and Vegas isn't performing, we are going to be screwed.