Tuesday, May 27, 2003


Spam is the price we pay for free mail. Response rates are tiny (1 in 10,000 recipients respond), but spammers make up for it with volume and still manage to turn a profit. A continuous arms race is on between spammers and anti-spammers. As in any arms race, the advantage keeps shifting from camp to camp, and it is unlikely there will ever be a clear winner.

Keyword filtering cut out the spam with obvious keywords like Viagra, penis enlargement or Teen Sex. Spammers simply stopped using these keywords.
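To see why keyword filtering is so easy to get around, here is a minimal sketch. The blocklist and the function name are made up for illustration, and the respelled word in the second call is just one assumed way a spammer might avoid a keyword.

```python
# Hypothetical blocklist built from the keywords mentioned above.
BLOCKED_KEYWORDS = ("viagra", "penis enlargement", "teen sex")


def is_spam_by_keyword(message: str) -> bool:
    """Return True if the message contains any blocked keyword (case-insensitive)."""
    text = message.lower()
    return any(keyword in text for keyword in BLOCKED_KEYWORDS)


print(is_spam_by_keyword("Cheap Viagra here"))  # True
print(is_spam_by_keyword("Cheap V1agra here"))  # False: the respelling slips past
```

The filter only knows the exact strings it was given, so any message that avoids them gets through.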

Now Bayesian filtering seems to have won a round for the anti-spammers. It uses probability to work out whether a mail is junk or real.

First, thousands of messages are statistically analysed to extract the top 15 features that mark them as spam.

The list includes some words, such as "teens", and other, less obvious things like formatting codes and routing information found in e-mail headers.
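A rough sketch of how such a filter might score a message is below. It is not the actual filter described above: the per-token probabilities, the default for unseen tokens, and the function name are all invented for illustration, and the combining step is the usual naive Bayes rule, which is one common choice. Only the limit of 15 features comes from the description above.

```python
from math import prod

# Hypothetical per-token spam probabilities, as if learned from
# thousands of already-classified messages. The values are made up.
TOKEN_SPAM_PROB = {
    "teens": 0.95,
    "free": 0.90,
    "click": 0.85,
    "meeting": 0.10,
    "invoice": 0.20,
}
UNKNOWN_TOKEN_PROB = 0.4  # assumed default for tokens never seen in training
MAX_FEATURES = 15         # number of features used, as described above


def spam_probability(message: str) -> float:
    """Combine the most telling token probabilities into one spam score."""
    tokens = set(message.lower().split())
    probs = [TOKEN_SPAM_PROB.get(token, UNKNOWN_TOKEN_PROB) for token in tokens]
    # Keep the most informative features: those farthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:MAX_FEATURES]
    spam_evidence = prod(top)
    ham_evidence = prod(1 - p for p in top)
    # Naive Bayes combination, treating the tokens as independent.
    return spam_evidence / (spam_evidence + ham_evidence)


print(spam_probability("free teens click") > 0.9)  # True
print(spam_probability("meeting invoice") < 0.1)   # True
```

A real filter would also look at headers and formatting codes, not just words in the body, and would learn its probabilities from actual mail rather than a hand-written table.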

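This filtering is claimed to be 99% effective. Even if it is only 90% effective, just one message in ten gets through, so a spammer has to send ten times as many messages to get the same number of responses. That raises spammers' costs by a factor of 10, hopefully driving most of them out of business.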
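The arithmetic behind that factor can be checked with a few lines. This is a back-of-the-envelope sketch that assumes a spammer's cost grows in proportion to the number of messages sent and that the response rate among delivered messages stays the same; the function name is made up.

```python
def cost_multiplier(filter_effectiveness: float) -> float:
    """How many times more mail must be sent to land the same number in inboxes.

    filter_effectiveness is the fraction of spam blocked, e.g. 0.90 for 90%.
    """
    if not 0 <= filter_effectiveness < 1:
        raise ValueError("filter_effectiveness must be in the range [0, 1)")
    return 1 / (1 - filter_effectiveness)


print(round(cost_multiplier(0.90)))  # 10
print(round(cost_multiplier(0.99)))  # 100
```

So if the 99% claim holds, the same reasoning gives a factor of 100 rather than 10.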

Anti-spammers have the courts on their side too. US states are passing laws that outlaw spam, and net service firms are filing lawsuits and installing basic filters. Some are adopting Bayesian filters to spot the most obvious spam.


