Saturday, March 20, 2010

Current Standings and Introductions

Posted by Lee
We have eight entries in our inaugural March Madness Predictive Analytics Challenge. The standings after the first two rounds of play look like this:
  1. My Robots Wicked Smaht
  2. ebv
  3. Danny's Dangerous Picks
  4. Hugues
  5. The Pain Machine
  6. FTW
  7. Simple PageRank
  8. BrentsBracket
With the first week of basketball in the tournament over, let's introduce our competitors!

Entry Name: My Robots Wicked Smaht
Team Members: Rolf and Andrew
In their own words: Our backgrounds are more on the human learning side of things, so we took a fairly simple approach to creating a bracket-picking robot. Our robot uses a simple regression to identify key variables in order to enhance the RPI rankings.

Rolf has blogged about their entry.

Entry Name: ebv
Team Members: Eric Venner (venner at bcm dot edu)
In his own words: I'm using a very simple model based on PageRank. A loss is treated like a link from the losing team to the winning team, and weighted based on the point in the season at which it was played - later games are weighted higher.
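A minimal sketch of the idea described above, not Eric's actual code: each loss becomes a link from the loser to the winner, and later games get heavier links. The linear weighting schedule, the damping factor of 0.85, and the `(winner, loser, day)` input format are all assumptions made for illustration.

```python
def weighted_pagerank(games, damping=0.85, iterations=100):
    """games: list of (winner, loser, day) tuples, with day counted from 1."""
    teams = sorted({t for w, l, _ in games for t in (w, l)})
    last_day = max(day for _, _, day in games)

    # Build weighted links loser -> winner. The weight rises linearly from
    # about 0.5 early in the season to 1.0 on the last day (assumed schedule).
    out = {t: {} for t in teams}
    for winner, loser, day in games:
        weight = 0.5 + 0.5 * day / last_day
        out[loser][winner] = out[loser].get(winner, 0.0) + weight

    n = len(teams)
    rank = {t: 1.0 / n for t in teams}
    for _ in range(iterations):
        new = {t: (1.0 - damping) / n for t in teams}
        for loser in teams:
            total = sum(out[loser].values())
            if total == 0.0:
                # Undefeated team: spread its rank evenly over all teams.
                for t in teams:
                    new[t] += damping * rank[loser] / n
            else:
                for winner, w in out[loser].items():
                    new[winner] += damping * rank[loser] * w / total
        rank = new
    return rank
```

Calling `weighted_pagerank([("A", "B", 1), ("A", "C", 2), ("B", "C", 3)])` returns a dictionary of ranks that sum to 1, with the undefeated team ranked highest.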

Entry Name: Danny's Dangerous Picks
Team Members: Danny Tarlow (you're reading his blog)
In his own words: I used the offensive and defensive rankings from my probabilistic matrix factorization model to generate a predicted score for each game and determine the winner, as described here:

Entry Name: Hugues
Team Members: Hugues Salamin
In his own words: I am currently doing a PhD in Glasgow, Scotland. For the prediction, I use a CRF with one variable (winner), and the features are the sum of the states of the team members. I got an accuracy of 0.75 when training on 2006 to 2009 and testing on 2010. I was planning to extend the model (predict overtime and score delta) but did not have enough time. Maybe for the Sweet 16 part. The code is in Python, and training uses the SciPy L-BFGS implementation for the optimization.
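For readers curious what such a training loop might look like: a CRF with a single binary output variable and no pairwise factors reduces to logistic regression, so the sketch below fits a logistic model with SciPy's L-BFGS optimizer. This is a stand-in under that assumption, not Hugues's code; the feature layout (one row of team-versus-team feature differences per game), the L2 penalty, and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def fit_winner_model(X, y, l2=1.0):
    """X: (n_games, n_features) feature differences; y: 1 if the first team won."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def loss_and_grad(w):
        z = X @ w
        # Negative log-likelihood of the logistic model plus an L2 penalty.
        nll = np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * l2 * (w @ w)
        grad = X.T @ (1.0 / (1.0 + np.exp(-z)) - y) + l2 * w
        return nll, grad

    # jac=True tells SciPy that the objective returns (value, gradient).
    result = minimize(loss_and_grad, np.zeros(X.shape[1]),
                      jac=True, method="L-BFGS-B")
    return result.x


def predict_first_team_wins(w, x):
    """Predict a win for the first team when the linear score is positive."""
    return float(np.asarray(x, dtype=float) @ w) > 0.0
```

Accuracy on a held-out season can then be measured by comparing `predict_first_team_wins` against the actual results.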

Entry Name: The Pain Machine
Team Members: Dr. Scott Turner (srt19170 at gmail)
In his own words: My Ph.D. is in Artificial Intelligence from UCLA, where I wrote a program (MINSTREL) to tell stories about King Arthur and his knights as a way to look at creativity and storytelling. For the past twenty years I've worked for the Aerospace Corporation as a software architect and ground system expert for satellite programs.

My approach wasn't particularly sophisticated; it processed the first part of the season to develop a ranking for all the teams, and then did a simple genetic algorithm to evolve an equation to predict outcomes based upon the ranking, RPI, and a few other stats. It was able to correctly predict the outcomes of my test set of games at about 80% (not particularly good, IMO).
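As a rough illustration of the genetic-algorithm step, the sketch below evolves the coefficients of a fixed linear equation over per-game feature differences. It assumes a much simpler setup than Scott describes (his program evolved the equation itself, and his ranking step is not shown); the population size, mutation scale, and data format are all hypothetical.

```python
import random


def accuracy(weights, games):
    """Fraction of games where the sign of the weighted sum matches the result."""
    correct = 0
    for diff, first_team_won in games:
        score = sum(w * d for w, d in zip(weights, diff))
        correct += (score > 0) == first_team_won
    return correct / len(games)


def evolve(games, n_features, pop_size=30, generations=40, seed=0):
    """games: list of (feature_differences, first_team_won) pairs."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: accuracy(w, games), reverse=True)
        survivors = population[: pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            # Uniform crossover followed by a small Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0, 0.1)
                     for pair in zip(mom, dad)]
            children.append(child)
        population = survivors + children
    return max(population, key=lambda w: accuracy(w, games))
```

Because the better half of each generation is carried over unchanged, the best accuracy in the population never decreases from one generation to the next.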

Entry Name: FTW
Team Members: Matt Curry (matt at pseudocoder dot com); @mcurry
In his own words: A bunch of years ago I wasted hours per day writing programs to predict the outcomes of sporting events, mostly for pretend gambling purposes. My best was a program that could pick select NBA games with a tremendous success rate (focused on home teams that were huge underdogs). I didn't have the testicular fortitude to trust it with more than a few small bets. My program for this contest is awful. I fully expect to get destroyed.

Entry Name: Simple Page Rank Bracket
Team Members: Daniel Mack (dmack at isis dot vanderbilt dot edu) and @manieldack
In his own words: As a first go-around, I decided to step back and look at the problem as a network, with teams as nodes and wins represented as edges from the team that was beaten to the team that won. This structure has some interesting properties, and one of the most fascinating is that it resembles, in some fashion, a web infrastructure: good teams are linked frequently from other teams that are also linked frequently. Using the barest of PageRank algorithms, I calculated the teams' ranks and propagated the winners through the bracket. This means that when I predict a team to win, the PageRank is recalculated to take that win into account, so the ranks of teams in the Final Four have been affected by the earlier predicted results.
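The propagation step can be sketched as follows. This is an illustrative reading of the description above, not Daniel's code: the damping factor, the tie-breaking rule, and the `(loser, winner)` edge format are assumptions.

```python
def pagerank(edges, teams, damping=0.85, iterations=100):
    """edges: list of (loser, winner) links; returns {team: rank}."""
    n = len(teams)
    out = {t: [] for t in teams}
    for loser, winner in edges:
        out[loser].append(winner)
    rank = {t: 1.0 / n for t in teams}
    for _ in range(iterations):
        new = {t: (1.0 - damping) / n for t in teams}
        for t in teams:
            targets = out[t] or teams  # a team with no losses spreads rank evenly
            for target in targets:
                new[target] += damping * rank[t] / len(targets)
        rank = new
    return rank


def fill_bracket(edges, bracket):
    """bracket: team names in seeded order; length must be a power of two."""
    edges = list(edges)
    teams = sorted({t for e in edges for t in e} | set(bracket))
    rounds, alive = [], list(bracket)
    while len(alive) > 1:
        winners = []
        for a, b in zip(alive[0::2], alive[1::2]):
            # Recompute ranks so that earlier predicted wins are counted.
            rank = pagerank(edges, teams)
            winner, loser = (a, b) if rank[a] >= rank[b] else (b, a)
            edges.append((loser, winner))  # treat the prediction as a result
            winners.append(winner)
        rounds.append(winners)
        alive = winners
    return rounds
```

`fill_bracket` returns one list of predicted winners per round, ending with a single-element list holding the predicted champion.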

Entry Name: BrentsBracket
Team Members: Brent Castle
In his own words: Matrix Completion for Power Rating Differences


Danny Tarlow said...

Thanks again everybody for entering. This is turning out even better than I hoped!

An interesting note: both of the top two teams at this point mentioned using a temporal component, where more recent games were weighted more heavily. Correct me if I'm wrong, but I don't think any of the other methods did. Coincidence?

Danny Tarlow said...

And here is the Yahoo bracket group that has the full details of the standings: