Project: Mail Anti-Spam Project



Yahoo! Labs Audience Sciences has partnered with the Yahoo! Mail team to cut down on the spam that reaches users' inboxes. Together they have created, for the first time, an efficient content-based filter that scales to the millions of Yahoo! Mail users worldwide.

The problem is more intriguing than the standard document classification problems in the academic literature. The tolerance for misclassifying a non-spam email as spam is extremely low, and malicious users actively submit feedback designed to cripple the Yahoo! anti-spam mechanism. In addition to tackling the content-based anti-spam problems, the team is also investigating different reputation-based schemes.
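
As a rough illustration of the strict limit on flagging non-spam email as spam, the sketch below picks a decision threshold for a classifier's spam scores so that the false-positive rate on legitimate mail stays under a chosen cap. It is not the Yahoo! Mail filter: the function name `pick_threshold`, the `max_fpr` parameter, and the sample scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def pick_threshold(scores: Sequence[float],
                   is_spam: Sequence[bool],
                   max_fpr: float) -> float:
    """Return the lowest observed score threshold whose false-positive
    rate on legitimate mail is at most max_fpr.

    A message is flagged as spam when its score >= threshold.
    (Hypothetical helper, not part of any Yahoo! system.)
    """
    # Scores of legitimate (non-spam) messages, highest first.
    ham_scores = sorted(
        (s for s, spam in zip(scores, is_spam) if not spam), reverse=True
    )
    if not ham_scores:
        raise ValueError("need at least one legitimate message")

    # Number of legitimate messages we are allowed to misflag.
    allowed = int(max_fpr * len(ham_scores))
    if allowed >= len(ham_scores):
        return float("-inf")  # the cap permits flagging everything

    # The threshold must sit strictly above this legitimate score,
    # otherwise more than `allowed` legitimate messages get flagged.
    boundary = ham_scores[allowed]
    candidates = sorted(s for s in set(scores) if s > boundary)
    return candidates[0] if candidates else float("inf")


# Made-up scores: higher means "more spam-like".
scores = [0.1, 0.2, 0.4, 0.6, 0.7, 0.9, 0.95]
is_spam = [False, False, False, True, False, True, True]

strict = pick_threshold(scores, is_spam, max_fpr=0.0)    # no false positives
relaxed = pick_threshold(scores, is_spam, max_fpr=0.25)  # up to 1 in 4
print(strict, relaxed)
```

With a zero-tolerance cap the threshold rises above every legitimate score, so the spam message scored 0.6 slips through; relaxing the cap catches it at the cost of one misflagged legitimate message. That trade-off is the tension the paragraph above describes.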

Solving these problems has high business value for Yahoo!. The problems are also novel scientific challenges from an academic and research standpoint, often called "adversarial classification" problems, and they require techniques that range from machine learning to streaming algorithms to game theory.

More information at: http://research.yahoo.com/project/2446