Monday, June 28, 2010

Embracing Uncertainty

Posted by Danny Tarlow
If you're around London this week, try heading down to the Southbank Centre for the Royal Society's 350th anniversary event. The Machine Learning and Perception group at Microsoft Research Cambridge has a booth with a bunch of demos set up, all revolving around "Embracing Uncertainty: The New Machine Intelligence". I particularly like the Galton machine (or Quincunx).
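The Galton machine is a nice demo of probability in action: each bead bounces left or right at every pin, and the bins at the bottom fill out a bell curve. Here's a minimal simulation sketch (the function name and parameters are my own, not anything from the exhibit):

```python
import random
from collections import Counter

def galton(n_beads=10000, n_rows=12, seed=0):
    """Simulate a Galton board: each bead goes right at a pin with probability 1/2."""
    rng = random.Random(seed)
    # A bead's final bin is the number of rightward bounces it took.
    return Counter(sum(rng.random() < 0.5 for _ in range(n_rows))
                   for _ in range(n_beads))

bins = galton()
# The counts peak near n_rows / 2 and fall off roughly like a Gaussian.
```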

I'll be down there on Tuesday and Friday.

Tuesday, June 22, 2010

Machine Learning (ICML) Discussion Site

Posted by Danny Tarlow
The International Conference on Machine Learning (ICML) is happening now in Haifa, Israel. There have been attempts at paper-discussion sites in previous years, but this year there is a revamped version. Most notably, you can subscribe to an RSS feed of the discussion, which should make it easier to keep tabs on the conversations.

Like many others, I think discussion sites are a nice idea. They're a step toward making research more interactive beyond the conference poster session, they can help clarify ambiguities readers might have about a paper, and I think they have the potential to make authors more accountable for their work. Hopefully it will catch on.

Oh, and I haven't had a chance to read any of them in detail yet, but here are the papers that caught my eye and will probably make it to my reading list:

Continuous-Time Belief Propagation
Tal El-Hay (Hebrew University); Ido Cohn; Nir Friedman (Hebrew University); Raz Kupferman (Hebrew University)

Interactive Submodular Set Cover
Andrew Guillory (University of Washington); Jeff Bilmes (University of Washington, Seattle)

A fast natural Newton method
Nicolas Le Roux (Microsoft Cambridge); Andrew Fitzgibbon (Microsoft Research)

Non-Local Contrastive Objectives
David Vickrey (Stanford University); Cliff Chiung-Yu Lin (Stanford University); Daphne Koller (Stanford University)

Accelerated dual decomposition for MAP inference
Vladimir Jojic (Stanford University); Stephen Gould (Stanford University); Daphne Koller (Stanford University)

Monday, June 21, 2010

Bayesian NBA Basketball Predictions

Posted by Danny Tarlow
Those of you who read this blog regularly know that I like to play around each year predicting March Madness basketball scores. This last year, Lee got involved, and we ran the first annual March Madness Predictive Analytics Challenge, which by all measures was a great success.

Well, it's fun to run my model, but it's a pretty basic model. It doesn't use any information other than the scores of each game, so important things like when the game was played, whether it was a home or away game for each team, and various other pieces of side information are ignored. It's not that I don't think there's useful additional information, it's just tricky to figure out a good way to get it into the model.

So I'm quite pleased to report that some of my buddies from the Toronto machine learning group -- Ryan, George, and Iain -- had some great ideas. They wrote a paper about them, which will appear at the upcoming Uncertainty in Artificial Intelligence conference (UAI 2010). They're also releasing their data and code. The rough idea is to train a different model for each context (given by the approximate date, who is home/away, and other side information), but to constrain models with similar contexts to have similar parameters using Gaussian Process priors. As they say in the abstract:
We propose a framework for incorporating side information by coupling together multiple PMF problems via Gaussian process priors. We replace scalar latent features with functions that vary over the covariate space. The GP priors on these functions require them to vary smoothly and share information. We apply this new method to predict the scores of professional basketball games, where side information about the venue and date of the game are relevant for the outcome.
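To make the coupling idea concrete, here's a toy sketch of latent features that vary smoothly over a covariate (the date), rather than being fixed scalars as in plain PMF. This is my own simplification, not the authors' released code; all names and sizes are made up:

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=30.0):
    """Squared-exponential kernel: games at nearby dates get similar features."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
n_teams, k = 5, 2                      # toy sizes
dates = np.arange(0.0, 100.0, 10.0)    # covariate grid (days into the season)

K = rbf_kernel(dates, dates) + 1e-6 * np.eye(len(dates))
L = np.linalg.cholesky(K)

# Each latent feature of each team is a smooth function of the date,
# drawn from a GP prior (plain PMF would use a single scalar instead).
U = (L @ rng.standard_normal((len(dates), n_teams * k))).reshape(len(dates), n_teams, k)

# The predicted interaction between teams i and j at date index t
# is the inner product of their date-specific latent features.
t, i, j = 3, 0, 1
pred = U[t, i] @ U[t, j]
```

Because the GP prior ties together features at nearby dates, observations from one game can inform predictions for games played around the same time.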
It's a cool model and a really nice idea. If you followed the previous action related to March Madness, I encourage you to take a look. And of course, it's never too early to be thinking about your entry for the 2011 March Madness Predictive Analytics Challenge!

Saturday, June 19, 2010

Resources for Learning about Machine Learning

Posted by Danny Tarlow
I've been using Quora a bit lately, somewhat to the detriment of this blog (though that's not the full explanation for my slow posting schedule). Anyhow, Quora is a nice question and answer service that has been getting some press in the startup world recently. A while back, Quora released a Terms of Service that gives pretty liberal terms of use for the content on the website. Founder Adam D'Angelo summarizes:
You can reuse all new content on Quora by publishing it anywhere on the web, as long as you link back to the original content on Quora.
One question that has received some interest (49 followers and 11 answers) and might be relevant to readers here is this one:
What are some good resources for learning about machine learning?

I've read Programming Collective Intelligence, and am looking for any recommendations on follow-up books/resources.
There were some good answers, including some resources I didn't know about. Here's a sampling:

My answer was Andrew Ng's YouTube videos:

Some other good ones:
Jie Tang says...
Mike Jordan and his grad students teach a course at Berkeley called Practical Machine Learning which presents a broad overview of modern statistical machine learning from a practitioner's perspective. Lecture notes and homework assignments from last year are available at

A Google search will also turn up material from past years
Ben Newhouse says...
The textbook "Elements of Statistical Learning" has an obscene amount of material in it and is freely available in PDF form via

While more niche than general Machine Learning, I recently ripped through "Natural Image Statistics" (also downloadable at ). It's a great read both for its explanations of your standard ML algo's (PCA, ICA, mixed gaussians etc) and for its real-world applications/examples in trying to understand the models used for analysis in our neural vision system
Jeremy Leibs gives the staple recommendation of David MacKay's book (I believe MacKay would say that machine learning is just information theory):
"Information Theory, Inference, and Learning Algorithms" by David MacKay has some decent introductory material if I remember. Available online:
Incidentally, I haven't read Programming Collective Intelligence, but it seems popular amongst non researchers. Do any of you know more about it?

Also, I have a few more Quora invites left, so if anybody wants in, let me know, and we'll see what we can do.

Thursday, June 17, 2010

Uncertainty: Probability and Quantum Mechanics

Posted by Danny Tarlow
In machine learning, we often take probability for granted. We desire a system for representing uncertainty in the world, and Cox's theorem tells us that if we accept some basic postulates regarding what is desirable in a system of uncertainty, we will end up with probability.

So that should be the end of the story... right? Well, maybe not. The first Cox postulate is
Divisibility and comparability - The plausibility of a statement is a real number and is dependent on information we have related to the statement,
which seems quite innocent. However, who's to say that there is anything fundamental about real numbers? Real numbers have strange things like irrational numbers and negative numbers (crazy, I know), but they're limited compared to complex numbers: there is no real number that takes exactly four multiplications to bring you back to your starting value, yet multiplying by i does exactly that. It seems kind of arbitrary to choose real numbers. For a fun and interesting read, see the following link. It makes the point better than I can:
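The quarter-turn property of i is easy to check numerically:

```python
z = 2 + 0j
path = [z]
for _ in range(4):
    z *= 1j        # each multiplication by i is a 90-degree rotation
    path.append(z)

# 2 -> 2i -> -2 -> -2i -> 2: four distinct stops before returning home.
```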
Negative numbers aren’t easy. Imagine you’re a European mathematician in the 1700s. You have 3 and 4, and know you can write 4 – 3 = 1. Simple.

But what about 3-4? What, exactly, does that mean? How can you take 4 cows from 3? How could you have less than nothing?

Negatives were considered absurd, something that “darkened the very whole doctrines of the equations” (Francis Maseres, 1759). Yet today, it’d be absurd to think negatives aren’t logical or useful. Try asking your teacher whether negatives corrupt the very foundations of math.
Imaginary numbers come up in the context of systems of uncertainty when we deal with quantum mechanics. The basic idea is that interactions operate over amplitudes (expressed as complex numbers); then, to determine the likelihood of a final configuration, you look at the squared norms of the amplitudes. For a relatively straightforward explanation, see here:
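Here's a tiny numerical illustration of the amplitude rule (the numbers are mine, just for illustration): with two indistinguishable paths to the same outcome, you add the amplitudes before taking the squared norm, so paths with opposite phase can cancel in a way classical probabilities never do.

```python
import math

a1 = complex(1 / math.sqrt(2), 0)    # amplitude along path 1
a2 = complex(-1 / math.sqrt(2), 0)   # amplitude along path 2 (opposite phase)

# Classical rule: each path alone contributes |a|^2 = 0.5, so the
# outcome looks certain when the contributions are summed.
p_classical = abs(a1) ** 2 + abs(a2) ** 2

# Quantum rule: add amplitudes first, then take the squared norm.
p_quantum = abs(a1 + a2) ** 2    # destructive interference: the outcome never occurs
```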

So I don't necessarily have any well-formed thoughts on the matter (yet?), but it's fun to think about other principled ways of representing uncertainty. I'm curious to know if there are types of interactions useful for machine learning that would be hard to represent with standard probability models but that would be aided by these types of quantum models.

Finally, I leave you with this blog comment from The Blog of Scott Aaronson:
“graphical models with amplitudes instead of probabilities” is a fair definition of a quantum circuit (and therefore a quantum computer).
That seems to me worth understanding more deeply.