Posted by Danny TarlowDirk writes...
Thanks for posting your code and ideas on your blog. I was playing around with your PMF code applying it to chess games, where users are white and items are black. Because players can play more than once against each other, would you remove the duplicate pairs?Cool. It sounds like an interesting application.
I don't see any reason to remove duplicate pairs. If different games have different outcomes, you definitely want to keep both to prevent biasing the model. If there are duplicate games with the same outcome, we should be more confident that the outcome wasn't a fluke, so it roughly makes sense that we'd want to weight it a bit more (which is the effect of having duplicate entries in the data set).
It's a bit strange to apply the vanilla model chess, though. There is no relationship in the model between player X as white and player X as black--they might as well be two separate people. It may be interesting to look at extensions where the latent vectors for player X as white are constrained to be similar as the latent vectors for player X as black.
Do any readers disagree?
(The relevant code is here.)