Wednesday, January 19, 2011

Get ready for the 2011 March Madness Predictive Analytics Challenge

Posted by Lee

For all posts related to the contest, check the March Madness category


Your March Madness commissioner, Lee, here.

With less than two months to go before the opening tipoff of the 2011 NCAA Men's Basketball Tournament, it's time to get ready for the second annual March Madness Predictive Analytics Challenge! The goal is to write a computer program that takes historical scores and stats as input, then automatically produces bracket predictions. To get an idea of how one such program might work, check out Danny's original data-driven March Madness post that started it all.
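To make the task concrete, here is a minimal sketch of the input-to-bracket pipeline. It is not the method from Danny's post or from any contest entry; it simply assumes each team can be rated by its average point margin, and that the higher-rated team wins every matchup. The function names (`rate_teams`, `predict_bracket`), the tuple layout for games, and the sample scores are all made up for illustration.

```python
from collections import defaultdict


def rate_teams(games):
    """Average point margin per team.

    `games` is a list of (team_a, score_a, team_b, score_b) tuples.
    This layout is an assumption, not the contest's data format.
    """
    total_margin = defaultdict(float)
    games_played = defaultdict(int)
    for team_a, score_a, team_b, score_b in games:
        total_margin[team_a] += score_a - score_b
        total_margin[team_b] += score_b - score_a
        games_played[team_a] += 1
        games_played[team_b] += 1
    return {team: total_margin[team] / games_played[team] for team in games_played}


def predict_bracket(field, ratings):
    """Advance the higher-rated team in each pairing until one team remains.

    `field` lists teams in bracket order; its length must be a power of two.
    Returns a list of rounds, each a list of predicted winners.
    """
    rounds = []
    current = list(field)
    while len(current) > 1:
        winners = []
        for i in range(0, len(current), 2):
            a, b = current[i], current[i + 1]
            # Unrated teams default to 0.0; ties go to the first-listed team.
            winners.append(a if ratings.get(a, 0.0) >= ratings.get(b, 0.0) else b)
        rounds.append(winners)
        current = winners
    return rounds


# Made-up scores for four hypothetical teams.
games = [
    ("A", 70, "B", 60),
    ("C", 55, "D", 65),
    ("A", 80, "D", 75),
    ("B", 66, "C", 60),
]
ratings = rate_teams(games)
bracket = predict_bracket(["A", "B", "C", "D"], ratings)
print(bracket)
```

The same `predict_bracket` call would work for a field of 64 or of 16 teams, since it only needs the teams listed in bracket order. A real entry would presumably replace the average-margin rating with something that accounts for strength of schedule.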

This year's contest will follow the same format as last year's. We'll have one contest starting from the field of 64 and another starting from the Sweet Sixteen.

In terms of data, contestants should expect roughly the same types of data that were allowed last year. In addition, I am planning to compile all boxscore data (how many points, rebounds, etc. each player records in every game) for the last 4 years.

If you have ideas for other types of data you'd like us to include, please let us know in the comments and we'll try to add them. We will release all the data at once in the first week of February, and again once all the games leading up to the tournament have been completed.

Enter your email below to get updates related to the contest, and stay tuned for additional announcements!

5 comments:

Scott Turner said...

I don't want to scare anyone off but I'll be competing again this year :-)

Danny Tarlow said...

Great! Glad to have you on board, Scott!

Paul said...

I have been doing very, very light thinking about modeling this kind of data for the last couple of years. But every time I try I can't find decent data. Where are you going to get the data from?

Danny Tarlow said...

Hi Paul,

Lee will post an update about this soon, but there will be plenty of data. See here for a description of what will be available:
http://blog.smellthedata.com/2011/02/data-update.html

We should be able to provide this for all games dating back to 2006.

Danny Tarlow said...

See this post for the data:
http://blog.smellthedata.com/2011/02/ncaa-boxscore-data-2006-2010.html