Friday, April 6, 2012

Final 2012 Full-Bracket Results

Posted by Lee

Hopefully everyone had a chance to watch the exciting game between Kentucky and Kansas this past Monday. This post only covers the results of the full tournament bracket and not the second chance Sweet Sixteen bracket.

Here are the full standings, including ESPN analysts (E) and my own picks.

TheMatrixFactorizer 127
Jay Bilas (E) 126
Lee's picks 124
The Pain Machine 122
Baseline 120
Danny's Dangerous Picks 117
By The Numbers 104
Dick Vitale (E) 102
Obama 102
Predict the Madness 99
Ryan Boesch 98
TheSentinel 86
AJsMadness 73
machine_learning_first_try 45

Great contest this year and congratulations to this year's winner, TheMatrixFactorizer! It not only won the full-bracket contest, it also squeezed past ESPN analyst Jay Bilas by a point. Once again, machines triumph over humans in our contest. I, for one, welcome our new March Madness predicting robot overlords.

Wednesday, March 21, 2012

Round 2 Update + Upset Analysis

Posted by Danny Tarlow
Here's another great guest post from Scott Turner, our #1 Machine March Madness guest poster. Great analysis -- thanks Scott! If you want more where this came from, check out his blog.

On my blog here I took a closer look at how the Pain Machine predicts upsets in the tournament and how effective it was this year. I thought it might be interesting to look at how the top competitors in the Machine Madness contest predicted upsets. I put together the following table with the competitors across the top and an X in every cell where they predicted an upset. Boxes are green for correct predictions and red for incorrect predictions. The final rows in the table show the current score and possible score for each competitor.

Game Pain Machine Predict the Madness Sentinel Danny's Conservative Picks AJ's Madness Matrix Factorizer
Texas over Cincy X X X X X
Texas over FSU X X
WVU over Gonzaga X X X
Purdue over St. Mary's X X X X X
NC State over SDSU X
South Florida over Temple X X
New Mexico over Louisville X X
Virginia over Florida X
Colorado State over Murray State X
Vandy over Wisconsin X
Wichita State over Indiana X
Murray State over Marquette X X
Upset Prediction Rate 43% 25% 33% 0% 25% 29%
Current Score 42 43 42 41 41 39
Possible Points 166 155 166 161 137 163


(I'm not counting #9 over #8 as an upset. That's why Danny has only 41 points; he predicted a #9 over #8 upset that did not happen.)

So what do you think?

One thing that jumps out immediately is that the competitors predicted many more upsets this year than in past years. Historically we'd expect around 7-8 upsets in the first two rounds. Last year the average number of predicted upsets was about 2 (discounting the Pain Machine and LMRC). The Pain Machine is forced to predict this many, but this year the Matrix Factorizer also predicts 7, and Predict the Madness and AJ's Madness each predict 4. From what I can glean from the model descriptions, none of these models (other than the Pain Machine) force a certain level of upsets.

Monte's model ("Predict the Madness") seems to use only statistical inputs, and not any strength measures, or strength of competition measures.  This sort of model will value statistics over strength of schedule, and so you might see it making upset picks that would not agree with the team strengths (as proxied by seeds).

The Sentinel uses a Monte Carlo type method to predict games, so rather than always producing the most likely result, it is only most likely to produce the most likely result. (If that makes sense :-) The model can be tweaked by choosing how long to run the Monte Carlo simulation. With a setting of 50 it seems to produce about half the expected number of upsets.

Danny's Dangerous Picks are anything but; it is by far the most conservative of the competitors.  The pick of Murray State over Marquette suggests that Danny's asymmetric loss function component might have led to his model undervaluing strength of schedule.

AJ's Madness model seems to employ a number of hand-tuned weights for different components of the prediction formula. That may account for the predicted upsets, including the somewhat surprising CSU over Murray State prediction.

The Matrix Factorizer has two features that might lead to a high upset rate.  First, there's an asymmetric reward for getting a correct pick, which might skew towards upsets.  Secondly, Jasper optimized his model parameters based upon the results of previous tournaments, so that presumably built in a bias towards making some upset picks.

What's interesting about the actual upsets?

First, Texas over Cincy and Purdue over St. Mary's were consensus picks (excepting Danny's Conservative Picks).   This suggests that these teams really were mis-seeded.  Purdue vs. St. Mary's is the classic trap seeding problem for humans -- St. Mary's has a much better record, but faced much weaker competition.  Texas came very close to beating Cincinnati -- they shot 16% in the first half and still tied the game up late -- which would have made the predictors 2-0 on consensus picks.

Second, the predictors agreed on few of the other picks.  Three predictors liked WVU over Gonzaga, and the Pain Machine and the Matrix Factorizer agreed on two other games.  Murray State over Marquette is an interesting pick -- another classic trap pick for a predictor that undervalues strength of schedule -- and both Danny's predictor and the Matrix Factorizer "fell" for this pick.

So how did the predictors do?

The Pain Machine was by far the best, getting 43% of its upset predictions correct.  Sentinel was next at 33%.  Perhaps not coincidentally, these two predictors have the most possible points remaining.

In terms of scoring, the Baseline is ahead of all the predictors, so none came out ahead (so far) due to their predictions.  The PM and Sentinel do have a slight edge in possible points remaining over the Baseline.

So who will win?

The contest winner will probably come down to predicting the final game correctly.  There's a more interesting spread of champion predictions than I expected -- particularly given the statistical dominance of Kentucky. 

If Kentucky wins, the likely winner will be the Baseline or Danny.  If Kansas wins, the Pain Machine will likely win unless Wisconsin makes it to the Final Four, in which case AJ should win.  If Michigan State wins, then the Sentinel will likely win.  And finally, if Ohio State wins, then Predict the Madness should win.

Monday, March 19, 2012

Second Chance Competition Announcement

Posted by Danny Tarlow
For all of you who didn't get your algorithms finished in time, and for all of the original competitors who'd like a fresh start, we're pleased to announce this year's "second chance" Sweet 16 contest.

This one will be run a little bit differently. For machines, the rules are all still the same. The difference is that there will now be a pool of human competitors in the mix -- Facebook friends and fans of our sponsor, a knee doctor who likes robots.

The prize pool for the second chance tournament will be $50 and $25 gift certificates for first and second place, respectively, and they will go to the top two entrants, whether they be human or computer.

If you want to participate as a human, you need to add Doctor Tarlow on Facebook and look for his announcement there. For those who wish to enter an algorithm, here are the instructions: That's it! Good luck to all the algorithmic competitors out there. I hope we can pull out a victory over those pesky humans.

"Predict the Madness" by Monte McNair

Posted by Danny Tarlow
This is a guest post by Monte McNair, the man behind team "Predict the Madness," which is the leader of the machine competitors after the second round.

Developing a system to fill out the best NCAA Tournament bracket is composed of two parts: matchup prediction and bracket optimization.

MATCHUP PREDICTION
The first thing to do is come up with a method to predict the likelihood of one team beating another. Since we only care about advancement, I want a system that produces a percentage as opposed to a point spread or something else. Therefore, I use a logistic regression with the outcome of the game as the dependent variable. For the independent variables, I use the location of the game, metrics for the team's offense and defense, and the average offensive and defensive metrics of the team's opponents. NCAA Tournament games are all played at neutral sites, but since I'm training on all games, I want to know how important playing at home is so that I can strip this out for neutral-site games.

The reason to use components of a team's offense and defense as opposed to simply points is that the different components that contribute to points have varying levels of reliability. As KenPom figured out this year, for example, defensive 3P% is extremely unreliable. My model takes this into account and weights it less than it would if we used only its influence on points against. By breaking it down, we let the model determine which factors are most reliable in predicting future performance.
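To make the idea concrete, here is a minimal sketch of a logistic regression with a home-court term that gets zeroed out for neutral-site games. It is not Monte's actual model: the two features (`home_flag`, `rating_diff`), the toy games, and the plain gradient-descent fit are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def win_probability(w, x):
    """Predicted chance that team 1 wins, given weights w and features x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in zip(rows, labels):
            err = win_probability(w, x) - y
            for k in range(n_features):
                grad[k] += err * x[k]
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

# Hypothetical toy data, not real games.
# home_flag: +1 if team 1 is at home, -1 if away, 0 if neutral site.
# rating_diff: a made-up efficiency difference (team 1 minus team 2).
games = [[1, 0.5], [1, -0.2], [-1, 0.8], [-1, -0.5], [1, 0.1],
         [-1, 0.3], [1, -0.6], [-1, -0.1], [-1, 0.2]]
won = [1, 1, 1, 0, 1, 0, 0, 0, 1]

weights = fit_logistic(games, won)
# Neutral-site prediction: home_flag = 0 strips out the home-court effect.
p_neutral = win_probability(weights, [0, 0.4])
print(weights, p_neutral)
```

Encoding location as a signed flag means the learned home-court weight drops out of the prediction whenever the flag is 0, which is one simple way to "strip out" home advantage for tournament games.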

The main thing we care about is that the model does a good job of predicting future games. Instead of waiting for future games, however, we can just use out-of-sample games. I took about 1/3 of our games and made them training games and left the other 2/3 as testing games. One thing I did that may be different from most is that I used all of a team's games for the season except for the game in question to create their profile. For example, say North Carolina played Duke on January 7th in one of my training games. For North Carolina's profile, I used stats from all of their games before AND after January 7th. I'm not sure what other systems do, but I think they might use all games (without excluding the game in question) or perhaps just games PRIOR to the game in question. In any case, after training the model, I can test it against the out-of-sample games I set aside for testing. I divided up all the test games into 100 buckets ordered by their predicted win percentage and compared it to the actual win percentage in those games. As we can see, the buckets are closely aligned, meaning the predictions are fairly accurate.
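The bucket comparison described above can be sketched roughly as follows. The function name `calibration_table` and the synthetic predictions are assumptions for illustration; the real check would use the model's predictions on the held-out games.

```python
import random

def calibration_table(predicted, outcomes, n_buckets=100):
    """Sort games by predicted win probability, split them into buckets,
    and return (mean predicted, actual win rate, game count) per bucket."""
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    table = []
    for b in range(n_buckets):
        chunk = pairs[b * n // n_buckets:(b + 1) * n // n_buckets]
        if not chunk:
            continue  # more buckets than games
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        win_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, win_rate, len(chunk)))
    return table

# Synthetic stand-in for test games: outcomes are drawn from the predicted
# probabilities, so the two columns should track each other closely.
rng = random.Random(0)
preds = [rng.random() for _ in range(5000)]
results = [1 if rng.random() < p else 0 for p in preds]

for mean_pred, win_rate, count in calibration_table(preds, results, n_buckets=10):
    print(f"predicted {mean_pred:.2f}  actual {win_rate:.2f}  games {count}")
```

If the model is well calibrated, the predicted and actual columns stay close in every bucket; a systematic gap at one end points to over- or under-confidence.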



BRACKET OPTIMIZATION
The next thing to do is to take our matchup predictions and maximize our expected points based on the scoring system we are presented with. While this is most beneficial when scoring systems provide bonuses for picking upsets or some other unique scoring, it can still be helpful in basic scoring systems and is better than simply advancing winners round by round.

As an example, take Louisville and New Mexico, the 4 and 5 seeds in the West region. My model predicts New Mexico as the favorite in a game against Louisville, projected to win 51.2% of the time. Both are favored in their 1st round matchups as well, so if we were to simply advance them both, we'd then choose New Mexico to advance over Louisville in the 2nd round. However, New Mexico has a tougher 1st round opponent in Long Beach State than Louisville does against Davidson. In the table below, we see that New Mexico wins just 65% against LBSU while Louisville wins 75% of the time against Davidson. This is enough to make it more likely that Louisville advances to the Sweet 16 than New Mexico, despite UNM being the better team.

1st 2nd
New Mexico 64.9% 37.2%
Louisville 75.3% 40.7%

New Mexico over Louisville: 51.2%

In a basic scoring system, this rarely comes into play and when it does, it provides little benefit. But it still is best to be accurate if you can.
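The Louisville and New Mexico example can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the first-round probabilities and the 51.2% head-to-head number come from the post, but the chances of each team beating the other side's first-round underdog are not given, so the two values marked below are placeholders picked to land near the table.

```python
def sweet16_probability(p_r1, p_opp_r1, p_vs_opp, p_vs_other):
    """Chance a team wins both of its first two games.

    p_r1       -- team wins its 1st-round game
    p_opp_r1   -- the expected 2nd-round opponent wins its 1st-round game
    p_vs_opp   -- team beats that expected opponent head to head
    p_vs_other -- team beats whoever knocked the expected opponent out
    """
    p_round2 = p_opp_r1 * p_vs_opp + (1.0 - p_opp_r1) * p_vs_other
    return p_r1 * p_round2

# ASSUMPTION: 0.76 (New Mexico over Davidson) and 0.64 (Louisville over
# Long Beach State) are placeholder values, not numbers from the model.
new_mexico = sweet16_probability(0.649, 0.753, 0.512, 0.76)
louisville = sweet16_probability(0.753, 0.649, 1.0 - 0.512, 0.64)

print(f"New Mexico reaches the Sweet 16: {new_mexico:.3f}")
print(f"Louisville reaches the Sweet 16: {louisville:.3f}")
```

Even though New Mexico is the slight favorite head to head, Louisville comes out ahead here because it is more likely to survive the first round, which is the effect a round-by-round "advance the favorite" approach misses.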

Saturday, March 17, 2012

Machine March Madness: Round 1 Update

Posted by Danny Tarlow
As usual, the first round was full of upsets, with two of the #2 ranked teams falling. None of our competitors predicted either of those upsets, but they are still putting on a respectable performance. Here are details of each competitor's entry, along with the current performance.

The favorites at this point look like "The Matrix Factorizer" and "The Pain Machine". Both did quite well in the first round, and both have 7/8 elite eight teams still surviving, along with all 4/4 final four teams still alive.

The Matrix Factorizer

Jasper

I modified Danny's starter code in two ways: First, I added an asymmetric component to the loss function, so the model is rewarded for getting the prediction correct even if the absolute predicted scores are wrong. Second, I changed the regularization so that latent vectors are penalized for deviating from the global average over latent vectors, rather than being penalized for being far from 0. This can be interpreted as imposing a basic hierarchical prior.

I then ran a search over model parameters (e.g., latent dimension, regularization strength, parameter that trades off the two parts of the loss function) to find the setting that did best on number of correct predictions made in the past 5 years' tournaments.
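One way to picture the two modifications is the objective below. This is only a sketch: the description above does not give the exact form of the loss, so the logistic win/loss term, the squared-error score term, and the names `alpha` and `lam` are all assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softplus(z):
    """Stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def objective(off, dfn, games, alpha=1.0, lam=0.1):
    """Loss for a matrix-factorization score model.

    off, dfn -- dicts mapping team -> latent offensive / defensive vector
    games    -- list of (team_a, team_b, score_a, score_b)
    alpha    -- trade-off between the score term and the win/loss term
    lam      -- strength of the pull toward the mean latent vector
    """
    loss = 0.0
    for a, b, score_a, score_b in games:
        pred_a = dot(off[a], dfn[b])  # points a is predicted to score on b
        pred_b = dot(off[b], dfn[a])
        # Part 1: squared error on the predicted scores.
        loss += (pred_a - score_a) ** 2 + (pred_b - score_b) ** 2
        # Part 2: small when the predicted margin has the same sign as the
        # real margin, large when the predicted winner is wrong.
        sign = 1.0 if score_a > score_b else -1.0
        loss += alpha * softplus(-sign * (pred_a - pred_b))
    # Regularizer: penalize distance from the mean vector, not from zero.
    for vecs in (off, dfn):
        dim = len(next(iter(vecs.values())))
        mean = [sum(v[k] for v in vecs.values()) / len(vecs) for k in range(dim)]
        loss += lam * sum((v[k] - mean[k]) ** 2
                          for v in vecs.values() for k in range(dim))
    return loss
```

Shrinking toward the mean rather than toward zero means a team with few games is treated as roughly average instead of as a team that scores nothing, which is the sense in which this acts like a basic hierarchical prior.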

24 of 33 Correct, 25 Pts, 171 Pts Possible

The Pain Machine

Scott Turner

Methodology: Linear regression on a number of statistics, including strength ratings to predict MOV (Margin of Victory). Some modifications for tournament use, particularly to force a likely number of upsets.

23 of 33 Correct, 24 Pts, 170 Pts Possible

TheSentinel

Chuck Dickens

Methodology: I used Ken Pomeroy's Pythag formula to rate teams, then calculated the actual game probabilities with the log5 formula. I used a random number generator to determine the outcome of each game. This provided some randomness, which created a few interesting upsets. I simulated the tournament 50 times and recorded each team's probability of reaching subsequent rounds. I then stepped through each round of the bracket, choosing as the winner the team that had the higher probability of winning that round.

I found that running the simulation 50 times gave me the most variability in the Final Four; running the simulation more than 100 times gave me a bracket that had almost no upsets, with almost all of the higher-seeded teams progressing through the tournament.
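The simulate-then-pick procedure can be sketched as below. The log5 formula is standard, but everything else is an assumption for illustration: the four-team bracket, the rating values, and the function names are made up, and the Pythag ratings are taken as given rather than computed.

```python
import random

def log5(p_a, p_b):
    """log5 estimate that A beats B, given each team's winning
    percentage against an average opponent."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2.0 * p_a * p_b)

def simulate_bracket(teams, ratings, n_sims=50, seed=0):
    """Return team -> list of per-round win frequencies.

    teams   -- list in bracket order; length must be a power of two
    ratings -- dict team -> rating in (0, 1), e.g. a Pythag-style value
    """
    rng = random.Random(seed)
    n_rounds = len(teams).bit_length() - 1
    counts = {t: [0] * n_rounds for t in teams}
    for _ in range(n_sims):
        alive = list(teams)
        for rnd in range(n_rounds):
            winners = []
            for i in range(0, len(alive), 2):
                a, b = alive[i], alive[i + 1]
                winner = a if rng.random() < log5(ratings[a], ratings[b]) else b
                counts[winner][rnd] += 1
                winners.append(winner)
            alive = winners
    return {t: [c / n_sims for c in row] for t, row in counts.items()}

def pick_bracket(teams, freqs):
    """Walk the bracket, advancing whichever team won that round more often."""
    alive, picks, rnd = list(teams), [], 0
    while len(alive) > 1:
        alive = [max(alive[i], alive[i + 1], key=lambda t: freqs[t][rnd])
                 for i in range(0, len(alive), 2)]
        picks.append(list(alive))
        rnd += 1
    return picks

# Hypothetical four-team region with made-up ratings.
teams = ["A", "B", "C", "D"]
ratings = {"A": 0.90, "B": 0.55, "C": 0.75, "D": 0.70}
freqs = simulate_bracket(teams, ratings, n_sims=50)
print(pick_bracket(teams, freqs))
```

With a small `n_sims` the frequencies are noisy, so a close matchup can tip toward the underdog; as `n_sims` grows the frequencies settle toward the true probabilities and the picks approach the chalk bracket, which matches the behavior described above.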

23 of 33 Correct, 24 Pts, 172 Pts Possible

Baseline

Always pick the higher seed.

23 of 33 Correct, 24 Pts, 168 Pts Possible

Ryan's Picks

Ryan

For each season (e.g. 2006-2007) I have enumerated the teams and compiled the scores of the games into a matrix S. For example, if team 1 beat team 2 with a score of 82-72, then S_12 = 82 and S_21 = 72. Ideally, each team would play every other team at least once, but this is obviously not the case, so the matrix S is sparse. Using the method proposed by George Dahl, I define vectors o and d, which correspond to each team's offensive and defensive ability. The approximation to the matrix S is then just the outer product od' (for example, (od')_12 = o_1 d_2, the estimate of S_12). This is a simple rank-one approximation of the matrix. If each team played every other team at least once, then the matrix S would be dense and the vectors o and d could be found from the SVD of S (see http://www.stanford.edu/~boyd/ee263/notes/low_rank_approx.pdf).

Because this is not the case, we instead define a matrix P that represents which teams played that season. For example, P_12 = P_21 = 1 if teams 1 and 2 played a game. Now the problem stated by George can be expressed compactly as "minimize ||P.*(o*d')-S||_F". Here, '.*' represents the Hadamard product and ||.||_F is the Frobenius norm. In this form, it is easy to see that, for constant vector o and variable vector d, this is a convex problem. Likewise, for constant vector d and variable vector o, this is a convex problem. Therefore, by solving a series of convex problems, alternating the variable between o and d, the procedure converges rapidly in about 5 to 10 steps (see the "Nonnegative Matrix Factorizations" code here: http://cvxr.com/cvx/examples/).
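The alternating scheme can be sketched without a convex solver, because with one vector held fixed the problem splits into independent one-dimensional least-squares fits with closed-form solutions. This is a swapped-in simplification, not the CVX code the post refers to, and apart from the 82-72 game the scores in the example are made up.

```python
def rank_one_als(S, P, n_iters=10):
    """Alternating least squares for  minimize ||P .* (o d') - S||_F.

    S -- n x n list of lists; S[i][j] = points team i scored against team j
    P -- n x n 0/1 mask; P[i][j] = 1 if teams i and j played
    Returns (o, d), the offensive and defensive vectors.
    """
    n = len(S)
    o = [1.0] * n
    d = [1.0] * n
    for _ in range(n_iters):
        # Hold o fixed: each d[j] is a 1-D least-squares fit.
        for j in range(n):
            num = sum(P[i][j] * o[i] * S[i][j] for i in range(n))
            den = sum(P[i][j] * o[i] ** 2 for i in range(n))
            if den > 0:
                d[j] = num / den
        # Hold d fixed: each o[i] is a 1-D least-squares fit.
        for i in range(n):
            num = sum(P[i][j] * d[j] * S[i][j] for j in range(n))
            den = sum(P[i][j] * d[j] ** 2 for j in range(n))
            if den > 0:
                o[i] = num / den
    return o, d

# Three teams (indices 0, 1, 2). Team 0 beat team 1 by 82-72 as in the
# example above; the other scores are hypothetical.
S = [[0, 82, 75],
     [72, 0, 68],
     [70, 80, 0]]
P = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
o, d = rank_one_als(S, P)
print("predicted points for team 0 against team 1:", o[0] * d[1])
```

Note that o and d are only determined up to a shared scale factor (multiply o by c and divide d by c), so only products such as `o[i] * d[j]` are meaningful.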

See this post for more details.

23 of 33 Correct, 24 Pts


Danny's Dangerous Picks

I started with the basic matrix factorization approach from my starter code, then I added small neural networks that applied a transformation to the base latent vectors based on whether the team was playing at home, away, or in the tournament. These transformation vectors were learned based on season and tournament performance of teams from other years. I split the data into 5 cross-validation sets, and looked for hyperparameter settings that did best on tournament prediction in past years. Like Jon, I also added an asymmetric component to the loss function.

Interestingly (disappointingly), after finding the setting of parameters that did best on past data, my method made some pretty conservative predictions for this year, predicting only 3 upsets.

22 of 33 Correct, 23 Pts, 165 Pts Possible

Predict the Madness

Monte McNair

Methodology: To determine the probability of any matchup (Team 1 beating Team 2), I use a logistic regression on statistics for the offense and defense of each team and of each team's opponents, plus location; the dependent variable is the outcome of the game. To select the bracket, I use a program that calculates the best possible bracket by maximizing expected points under the scoring system. This correctly accounts for situations where simply advancing favored teams round by round would fail.

22 of 33 Correct, 23 Pts, 157 Pts Possible


AJ's Madness

AJ Diliberto

The methodology is that I selected various stats and gave weight to those that I feel are important, such as points for and against, offensive rebounds, and turnover margin. I also factored in whether they were from one of the big conferences, the level of experience and success the coach has had, and then overlaid the formula with a strength of schedule formula that would reduce certain teams' scores based on how good or bad the competition was that they played to get those stats.

22 of 33 Correct, 23 Pts, 139 Pts Possible

Machine Learning First Try

Joe Gilbert

My methodology is as follows:
1. Develop a matrix that contains only 2011 scores (done using your data)
2. Develop a matrix that contains all of your teams and generate columns for averages over all players in 2011: minutes played, FT attempted/made, 3P attempted/made (done), rebounds, turnovers, fouls (again using your data)
3. Use machine learning, specifically a traditional Forest algorithm to predict each team's score for each game based on the 2011 data only
4. Select the winner for each round and repeat step 3 for the next round to determine the next winners
Currently, the algorithm predicted the first round modeling each team's score as an "Away" team since they are all technically on the road. I think I may change it so that the scores are based on a mean value of the model for an Away team and Home team because currently it is predicting LIU Brooklyn over MSU in the 1st round...if it comes true then so be it.

20 of 33 Correct, 21 Pts, 91 Pts Possible

By The Numbers

Tim Jacobs

Methodology:
I took the data so generously provided, trained a couple of neural networks on the past performance, then used average away performance for each team to predict performance in the tourney. The networks are training as I type.

17 of 33 Correct, 18 Pts, 166 Pts Possible

Wednesday, March 14, 2012

Data Usage Clarification

Posted by Lee

I just realized that the data rules and usage discussion happened on the Google Group and not everyone may have read it. The same goes for the clarification on hand-tweaking.

Basically, no human judgment data should enter your model except for your decisions on how to build the model and hyper-parameters for that model. Also, if you do use data that we did not provide, please let us know and please make it available to all the other competitors so that they might have the opportunity to use it as well.

Tuesday, March 13, 2012

Fast Company Article

Posted by Danny Tarlow
David Holmes over at Fast Company wrote a nice article about our Machine March Madness contest:
http://www.fastcompany.com/1824382/march-madness-ncaa-tournament-predictions-algorithms

Thanks David!

To everybody else: I hope you're hard at work on your algorithm.