After worms, spam is easily the bane of existence for an increasing number of users — corporate and home alike. It is more than a full-time job to keep track of all the new spammer tricks.
Devising tests to probe antispam behaviour is a daunting task — but that doesn’t mean we shouldn’t try.
At first blush, it seems like a hopeless endeavour. After all, with spammers expending significant effort to find ways around whatever barriers are put in place, by the time one test is developed there are sure to be new methods of spamming out “in the wild”. Some say that because testing can never hope to mirror reality, no useful tests can be developed. I disagree.
We can apply concepts developed to test less amorphous things, LAN switches for example, to provide a basis for useful evaluation of antispam products. After all, one could dismiss the last decade or so of switch and router testing for the same “it-does-not-reflect-reality” reasons. Nobody contends that the traffic we use to test these devices is how networks work in the wild, yet we test them anyway.
We do it for two fundamental reasons. First and foremost, such testing provides direct points of comparison across products. Second, the results can give key insights into the underlying architecture of a product and thus how it is likely to behave in the wild.
And this is where testing antispam solutions pays off. With countless vendors and countless underlying architectures and implementations, it is difficult to judge products on their own merits. By devising structured tests and applying them across products and services, the key elements of each can be exposed.
We’ve done just this. For several reports that will be released soon, we devised a set of tests that exercised antispam products/services.
Given the variety of spam that any product will have to handle, coupled with the various addressing attributes used by spammers, we knew that we’d be looking at using thousands of test messages. While that volume is logistically challenging, it delivers thousands of data points and thus more reliable results.
We started by “collecting” several thousand examples of actual spam, loading them into a database and classifying them into categories based on content. Alongside these spam categories, we built a category of “benign” messages: real messages that should get through and not get snagged by the antispam product. Then we used a custom “spam-generator” program that we built to send these messages to the system under test, using both original and “faked” FROM addresses and variations on the TO, CC and BCC fields.
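To make the addressing variations concrete, here is a minimal sketch in Python of how one stored sample might be expanded into test messages. It is not the spam-generator program itself, whose internals are not described here. The names `Sample`, `Variant` and `build_variants`, and the placeholder addresses, are all hypothetical.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from itertools import product


@dataclass
class Sample:
    """One stored message (hypothetical schema for the test database)."""
    category: str   # "benign" or a spam content category
    sender: str     # FROM address as originally collected
    subject: str
    body: str


@dataclass
class Variant:
    """One message to send, plus the labels needed to score the result."""
    category: str
    from_kind: str      # "original" or "faked"
    rcpt_field: str     # where the test mailbox appears: "To", "Cc" or "Bcc"
    envelope_rcpt: str  # address the message is actually delivered to
    message: EmailMessage


def build_variants(sample, target,
                   faked_sender="forged@example.invalid",
                   decoy="decoy@example.invalid"):
    """Expand one sample into 2 FROM kinds x 3 recipient placements."""
    senders = {"original": sample.sender, "faked": faked_sender}
    variants = []
    for (from_kind, sender), field in product(senders.items(),
                                              ("To", "Cc", "Bcc")):
        msg = EmailMessage()
        msg["From"] = sender
        msg["Subject"] = sample.subject
        if field == "To":
            msg["To"] = target
        else:
            # Assumption: a decoy address fills the To header when the
            # test mailbox is only copied or blind-copied.
            msg["To"] = decoy
            if field == "Cc":
                msg["Cc"] = target
            # For "Bcc" the target is left out of the headers entirely
            # and appears only as the delivery (envelope) recipient.
        msg.set_content(sample.body)
        variants.append(Variant(sample.category, from_kind, field,
                                target, msg))
    return variants
```

Each variant could then be handed to something like the standard library's `smtplib.SMTP.send_message(v.message, to_addrs=[v.envelope_rcpt])`. Keeping the category and addressing labels next to each message is what allows the outcomes to be compared across products afterwards.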
With this fundamental test, we saw significant differences in behaviour across products and proved to our satisfaction that testing antispam products is not a futile endeavour.
Kevin Tolly is president of The Tolly Group