My Gmail Spam Box

Something I do more and more is to mark mailing lists that I can't get off of as spam. Some require long-forgotten passwords or multiple responses to e-mail robots to unsubscribe. It's much easier to let Google treat it as spam than to work out some e-mail command list manager interface circa 1996. That's why you should make it really easy to unsubscribe from your mailing list - if multiple people are marking it as spam because they can't unsubscribe, Google is eventually going to decide that it is spam, and stop delivering it to anyone.

There are 4,603 spam e-mails in my spam folder today. I don't delete them anymore; Google trashes them after 30 days, so the number is a good indicator of the volume of spam in the previous month. It's getting worse: the number was hovering around 2,000 only a few months ago. It's half my own fault - I have dozens of e-mail addresses connected to one thing or another, all funneling into the same place. Google's spam filter is reasonably good, but some spam still gets through, and there's an occasional false positive.

Something I've asked Google to do is to let me direct all Chinese and Arabic language e-mails to my spam box. I might get one or two genuine foreign language e-mails a month, and if they're French, German, Spanish or even Danish, I might see they're genuine and machine translate them. But with Chinese or Arabic, I don't stand a chance and when I've tried translating, it just confirms what I already thought - it's spam. Another idea - maybe they could put a 'translate' button on the interface for foreign language e-mails. They already have the translation service.

Posted by Alexander on October 24, 2006

Netflix Million Dollar Prize

I find this contest remarkably interesting. Netflix has a database of their customers' ratings of various movies. They use this database to make recommendations to their customers on which movies they might like to see next. The recommendation is not so much 'Goodfellas is a great movie' - but more 'Goodfellas was rated highly by people who like the same kind of movies that you do'. The competition is to improve the algorithm making the recommendations - prize $1 million USD - open to anybody, anywhere (so long as you don't live somewhere the USA dislikes harshly enough, a list that apparently includes Quebec).
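To make the 'people who like the same kind of movies that you do' idea concrete, here is a minimal sketch of one simple way to do it: weight other viewers' ratings by how similar their taste is to yours. This is not how Netflix's own system works (that isn't public); the viewers, movies, ratings, and the function names `similarity` and `predict` are all made up for illustration.

```python
import math

# Hypothetical ratings, invented for this example: viewer -> {movie: stars}
ratings = {
    "ann": {"Goodfellas": 5, "Casino": 4, "Heat": 5},
    "bob": {"Goodfellas": 5, "Casino": 5, "Heat": 4, "Scarface": 5},
    "cat": {"Goodfellas": 1, "Casino": 2, "Scarface": 1},
}

MIDPOINT = 3  # middle of the 1-5 star scale


def similarity(a, b):
    """Cosine similarity over the movies both viewers rated,
    after centring each rating on the 3-star midpoint."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum((a[m] - MIDPOINT) * (b[m] - MIDPOINT) for m in shared)
    norm_a = math.sqrt(sum((a[m] - MIDPOINT) ** 2 for m in shared))
    norm_b = math.sqrt(sum((b[m] - MIDPOINT) ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(user, movie, ratings):
    """Similarity-weighted average of other viewers' ratings for `movie`.
    Viewers with zero or negative similarity are ignored.
    Returns None when nobody similar has rated the movie."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        weight = similarity(ratings[user], theirs)
        if weight <= 0:
            continue
        weighted_sum += weight * theirs[movie]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


print(predict("ann", "Scarface", ratings))
```

Here ann's taste lines up with bob's and runs opposite to cat's, so the prediction for ann leans entirely on bob's rating of Scarface.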

The target doesn't sound too ambitious. The ratings are between 1 and 5 stars. Their current system 'Cinematch' doesn't do too well. On average it's out by just under 1 star for each rating - so typically Cinematch will predict that a viewer will watch a movie and score it 4 stars, and the viewer will actually score it 3 or 5 stars. You could probably get close to that level of competence by guessing that the viewer will rate every movie at 3 stars. Many times, you'd be right, and many times, you'd only be one star away. The vast majority of movies that I've seen are 2, 3 or 4 stars.
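The 'always guess 3 stars' idea is easy to try out. The sketch below measures how far off a constant guess is, using two common error measures: the mean absolute error (the plain average miss in stars) and the root mean squared error, which punishes big misses more heavily. The list of ratings is invented for illustration, not taken from the Netflix data.

```python
import math


def mean_abs_error(predicted, actual):
    """Average size of the miss, in stars."""
    if len(predicted) != len(actual):
        raise ValueError("lists must be the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error: like the average miss, but large misses count for more."""
    if len(predicted) != len(actual):
        raise ValueError("lists must be the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ratings from one viewer, made up for this example
actual = [2, 3, 4, 3, 5, 1, 3, 4, 2, 3]

# The naive predictor: guess 3 stars for every movie
always_three = [3] * len(actual)

print("mean absolute error:", mean_abs_error(always_three, actual))
print("root mean squared error:", rmse(always_three, actual))
```

How well the constant guess does depends entirely on how the real ratings are spread out, so the numbers from this toy list say nothing about the real dataset.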

The contest is looking for just a 10% improvement on Cinematch to be eligible to win the prize. If I understand the small print of the contest correctly - it'll run for a minimum of 4 months, and provided at least one entry meets the 10% improvement, the algorithm that does best will win the prize.
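As a quick piece of arithmetic, a 10% improvement means the error has to shrink to 90% of the current figure. The sketch below works that out, using 0.95 as a stand-in for 'just under 1 star'; that number and the function names are placeholders for illustration, not figures from the contest rules.

```python
def target_error(baseline_error, improvement=0.10):
    """Error a new algorithm must reach to beat the baseline by `improvement`."""
    return baseline_error * (1.0 - improvement)


def qualifies(candidate_error, baseline_error, improvement=0.10):
    """True if the candidate's error is at or below the target."""
    return candidate_error <= target_error(baseline_error, improvement)


# Placeholder baseline: 'just under 1 star' taken as 0.95 for illustration
baseline = 0.95

print("target error:", target_error(baseline))
print("0.90 qualifies?", qualifies(0.90, baseline))
print("0.85 qualifies?", qualifies(0.85, baseline))
```

Put that way, shaving roughly a tenth of a star off the average miss is all it takes to qualify.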

This kind of contest is right up my street - I specialized in artificial intelligence subjects at university, but I've found it difficult to use those skills outside of personal projects. I'd have to regard this as a bit of fun too - but it's great to have this dataset to work with, and the potential prize makes it a lot more exciting.

Posted by Alexander on October 02, 2006