Sunday | 12 October, 2008
Computerworld
Researchers dig for hidden links in spam
Two researchers at the University of Quebec are refining a way to identify links advertised in spam messages.
Jeremy Kirk (IDG News Service) 01/11/2007 05:32:25

Filtering spam messages is a thankless job for software.

For every 100 spam e-mails, one message usually gets through, an irritating pitch with links to Web sites selling questionable drugs or sketchy Rolexes.

The links contained within spam are one indicator in determining whether it should be blocked. Often after a large spam run, the addresses of spammy Web sites will be added to blocklists that are used by antispam software to cull future messages with those links.
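A blocklist check of this kind can be sketched in a few lines of Python. This is only an illustration, not how any particular antispam product works: the hostnames in `BLOCKLIST` and the function names `is_blocklisted` and `should_block` are made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist of hostnames collected after earlier spam runs.
BLOCKLIST = {"pills.example", "watches.example"}

def is_blocklisted(url, blocklist=BLOCKLIST):
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Check the full host first, then each parent domain in turn.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))

def should_block(links):
    """Flag a message if any of its links points at a listed host."""
    return any(is_blocklisted(link) for link in links)
```

The check only works if the filter can find the links in the first place, which is the step spammers attack.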

To get around this, spammers construct e-mails with links that filters can't identify but that remain valid in the messages, said Christopher Fuhrman, a professor of software engineering in the Department of Software and IT Engineering at the University of Quebec.

Spammers do this by "munging" the HTML (Hypertext Markup Language) -- adding backslashes, taking out tags -- so that the message and its links are still readable by the rendering engines of browsers or e-mail clients, but appear as a garble of nonsense to filters. The technique is also known as obfuscation.
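The effect on a filter can be illustrated with a small Python sketch. It assumes a deliberately strict, made-up filter pattern (`STRICT_LINK`) and a hypothetical munged message; the munging shown here (odd letter case, missing quotes, stray whitespace, a character reference) is one plausible variety and is not taken from real spam.

```python
import re

# A deliberately strict pattern: lowercase tag, double-quoted href, no extra spacing.
STRICT_LINK = re.compile(r'<a href="(http://[^"]+)">')

clean = '<a href="http://pills.example/buy">Cheap pills</a>'

# The same link, munged (hypothetical): upper case, a line break inside the tag,
# no quotes, and the letter "p" written as the character reference &#112;.
munged = '<A\n  HREF=http://&#112;ills.example/buy >Cheap pills</A>'

def strict_links(html):
    """Return the links that the strict pattern manages to find."""
    return STRICT_LINK.findall(html)
```

Here `strict_links(clean)` finds the link, while `strict_links(munged)` returns an empty list, even though a forgiving rendering engine would still display a working link.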

It's a trial-and-error process, since spammers don't read HTML Web standards. "Spammers just want to get the cash," Fuhrman said.

Tamper with the HTML too much, and the message won't render at all. Too little, and filters snare the message.

So spammers aim for a narrow gap: Most browsers and e-mail clients can render a certain amount of munged HTML, although the tolerances vary depending on the application.

Fuhrman theorizes that spammers test their messages using Microsoft's widely used Outlook program, which uses the same HTML rendering engine as its Internet Explorer (IE) browser.

So Fuhrman and one of his graduate students, Hicham El Alami, are writing a program that uses IE's rendering engine to "parse" messages, that is, to extract the links.

Services such as SpamCop already do this. SpamCop -- part of IronPort Systems, a subsidiary of Cisco -- has a Web-based service that uses algorithms to parse links out of spam messages submitted by users.

Those algorithms are hard to write, although SpamCop's is pretty good, Fuhrman said. Fuhrman and El Alami are interested in creating an alternative way to do the same parsing without needing to constantly tweak an algorithm to keep up with new tricks used by spammers.

It's hard to write a parser that will read links the same way IE's rendering engine does since Microsoft's source code is secret, Fuhrman said. So a better idea would be just to use that engine as part of a program to parse messages. A variety of tools exist to manipulate IE's rendering engine through APIs (application programming interfaces), Fuhrman said.
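The general idea, letting a forgiving HTML engine do the parsing and then reading the links back out of it, can be sketched in Python. This is not Fuhrman's program and does not drive IE's engine: it substitutes the standard library's `html.parser` as a stand-in tolerant parser, and the names `LinkCollector`, `extract_links` and the sample message are invented for the example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from anchor tags as the parser encounters them."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser has already lowercased names and decoded character references.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value.strip())

def extract_links(html):
    """Feed a message body to the tolerant parser and return the links it saw."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links

# Hypothetical munged message: upper case, no quotes, an encoded letter.
munged = '<A\n  HREF=http://&#112;ills.example/buy >Cheap pills</A>'
```

Calling `extract_links(munged)` yields the decoded address `http://pills.example/buy`. A different engine may tolerate different damage, which is why the researchers want the same engine that spammers are presumed to test against.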

The links that IE's engine renders would be reported to a blocklist service. Fuhrman wrote a prototype of his idea in Java, and El Alami is now working on a version for .NET, Microsoft's application development framework.

"I want to ultimately get it as a Web-based engine so that users can paste spam, and when it comes out, it will reveal the links," Fuhrman said.
