Posted by Danny Tarlow
Every year, the NCAA College Basketball season ends with a tournament of 64 teams. Humans around the US (but also elsewhere in the world) fill in brackets with predictions of the outcome, enter pools, and wait excitedly for the results.
College basketball is a streaky and fairly high-variance game, so there are many chances for an underdog to make a run deep into the tournament. We see this often -- for example, last year's tournament featured a final four made up of 3, 4, 8, and 11 seeds -- leading to the colloquial tournament name, "March Madness".
So without further ado, it is my pleasure to announce that this year, this blog, in conjunction with commissioner Lee, will host another "Machine March Madness" contest. The big idea is simple: using data from this season and from past seasons (which we will provide -- e.g., past data here: full and simple), build a computer system that fills out a bracket, then pit yourself against the field of silicon competition. You can see posts from last season's tournament here, and some press coverage here.
More details are coming soon, including details about prizes. For now, you can do a few things.
- Download the past data (full and simple), and start thinking about how you'd model the tournament. To get some starter ideas, I recommend this timeless post by George Dahl.
- Let us know in the comments if there is any other data that you would like to use. The rule we have is that all systems must be built using the same data, but we're open to suggestions about what this data is.
- Get started!
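If you want something concrete to start from, here is a minimal sketch of the simplest kind of system: rate each team by its average point margin, then fill out the bracket by always advancing the higher-rated team. Everything in it is an assumption made for illustration. The `(team_a, score_a, team_b, score_b)` game tuples are a hypothetical format rather than the layout of the provided data files, and `average_margin` and `fill_bracket` are made-up names. Treat it as a baseline to beat, not as a recommended model.

```python
from collections import defaultdict


def average_margin(games):
    """Rate each team by its average point margin over the given games.

    `games` is an iterable of (team_a, score_a, team_b, score_b) tuples.
    This layout is hypothetical; adapt it to the real data files.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for team_a, score_a, team_b, score_b in games:
        total[team_a] += score_a - score_b
        total[team_b] += score_b - score_a
        count[team_a] += 1
        count[team_b] += 1
    return {team: total[team] / count[team] for team in total}


def fill_bracket(teams, rating):
    """Predict every round of a single-elimination bracket.

    `teams` lists the entrants in bracket order, so teams[0] plays teams[1],
    teams[2] plays teams[3], and so on. Returns one list of predicted
    winners per round; the last list holds the predicted champion.
    """
    if not teams or len(teams) & (len(teams) - 1):
        raise ValueError("bracket size must be a power of two")
    rounds = []
    current = list(teams)
    while len(current) > 1:
        # Advance the higher-rated team in each pairing (ties go to the first).
        winners = [
            a if rating[a] >= rating[b] else b
            for a, b in zip(current[0::2], current[1::2])
        ]
        rounds.append(winners)
        current = winners
    return rounds


# Toy season with four made-up teams.
toy_games = [
    ("A", 70, "B", 60),
    ("C", 80, "D", 50),
    ("A", 65, "C", 75),
    ("B", 55, "D", 50),
]
toy_ratings = average_margin(toy_games)
print(fill_bracket(["A", "B", "C", "D"], toy_ratings))  # [['A', 'C'], ['C']]
```

Because this picks the higher-rated team every time, it will never predict an upset. One natural next step would be turning the rating difference into a win probability.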
Update: Here's a question about additional data to use, posted on Quora.