Spam, Spam Spam Spam…

Previously, Professor Dan Remenyi kicked off a debate on how to control Spam and viruses, so I will follow up and, hopefully, move the debate forwards. But is Spam a security issue? I would say that it constitutes a Denial of Service attack; additionally, in some cases the spamming is part of a larger illegal scam.

The Spam problem is certainly growing. In just 7 days recently, a company with 25 user accounts under one domain in Hong Kong received about 140,000 messages for 9391 non-existent addresses. The domain has a high-profile name, which probably acted as an attractant, but that is still an incredible amount of randomly addressed email. China has also recognised the Spam problem: in January 2003 the China Internet Network Information Centre (CNNIC) published statistics showing that over 51% of users' emails were Spam. In the U.K., the managed service provider MessageLabs reports that Spam grew from 2.5% in June 2002 to 55% in June 2003.

When the Internet started growing, we talked excitedly about the "Global Village", a warm, fuzzy image of a small community where people care about their neighbours. Villages grow into towns, which grow into cities, and I now believe we have reached the stage of the "Global Inner City", heavily in decline. The Internet is crowded, dirty, jammed, rife with disease and frequented by numerous confidence tricksters. Fortunately, it is still exciting, full of opportunities, and the information you require is really "just next door".

Categorising Spam

Not all Spam is created equal, and Spam is not the only junk we want to eliminate from our inboxes, so defining some categories is in order. The most prolific group is the "professional spammers" - the messages arrive from a host with no obvious connection to the originator of the message and the From address is forged. These usually include the offers of drugs, porn, pirated software and money-laundering requests. A second group is the "over-zealous marketing" - the major differences between this and the first group are that, firstly, the offers are more legitimate, and secondly, the originator can be identified and contacted. A third group is not Spam at all - this is the mailing from the contact you met, or the list you had forgotten you subscribed to. Mass-mailing viruses are also not Spam, but something far worse. Many of the more recent ones, including Klez and Sobig, forge the sender's address, which leads to the next category - inappropriately directed virus detection alerts and other automated responses. Finally, there are miscellaneous chain letters sent by friends and acquaintances. These groups are not always clearly defined, and can blend into one another.

The over-zealous marketers may exacerbate their offence in various ways. Some appear to have poor mailing list management, so a single address receives multiple copies; others seem to have downloaded a domain registry database and are trying likely-sounding addresses: sales@, accounts@, marketing@. If those addresses do not exist, the postmaster probably ends up with an error message for each one. A diligent postmaster will work hard to improve communications by re-directing genuinely misaddressed messages, but why should their time be wasted sorting through junk from companies that could not be bothered to look for a valid address? Another case is where a list established for one purpose is used for something else: for example, a company runs a monthly newsletter, but sends additional junk (sorry, "timely messages about additional products and services that may interest you") to the addresses on that list at other times.

The professional spammers are often clearly criminals - the offers themselves are illegal in some sense: requests for assistance in money laundering demonstrate a lack of honesty, even before the probable scam comes into play, and software at too-good-to-be-true prices is probably pirated. Selling prescription medicines without a prescription is also an offence, and I am sure some of the things they claim to be able to do to my body must be at least misleading marketing. The delivery methods also show dishonesty: tracing the source IP sometimes reveals that an open relay has been used. This is taking advantage of a mis-configured mail server, consuming the bandwidth of the victim. In other cases, the source IP is part of a pool for broadband (probably home) users; usually this will mean that a Trojan has been installed on the victim's machine, so they are totally unaware of the spammer's activity. Breaking into a computer and taking it over is an overt criminal act in most jurisdictions. Here, we begin to see convergence with the world of viruses - sending out Trojans to thousands of potential victims is a tedious business, but a successful virus can compromise hundreds of thousands of machines in a few hours. This is a "business model" for viruses that Professor Remenyi found hard to see: a virus can bring large numbers of computers under the control of the attacker, who can use them for spamming or other activities. This is also where we reach the end of the available evidence: various recent viruses, including Sobig, do contain a backdoor component which could be used to download and run other software at the attacker's command, but I have not yet seen evidence that this has been used by spammers. Perhaps the attackers had a different purpose in mind.

Solutions

With such a complex and diverse variety of junk email, there will be no single, easy solution, and various solutions are being proposed that I think will be ineffective.

Email Charging

Introducing a small charge per item of email sent sounds attractive, but the fact is that legitimate Internet users are already paying for access - from a quick calculation, my company's leased line works out at 0.8 US cents per megabyte. We will still get lots of junk from the over-zealous marketers; after all, it will still be cheaper than paper. It will also do nothing to stop the professional spammers - they are already stealing the bandwidth they use, so it will be the victims whose PCs they subvert who will be faced with the bill.

White Lists

Refusing to accept email from an unknown source is not an option for any organisation that deals with the public. This includes Government agencies, retail companies, the service industries… and so on. Even if a white list is a viable option, the task of keeping it up to date will become onerous as the size of the organisation increases.

Digital Signatures

Digital signatures and a secure email system may, indeed, be the ultimate solution to Spam, but, at best, this will be a long-term solution. Large organisations will need time to deploy certificates and train everyone who needs to use the new system. The secure and insecure systems will continue working in parallel for a long time, as people will still need to communicate with contacts who have not yet migrated.

Legal Control

Last August, U.S. Federal Trade Commission (FTC) Chairman Timothy J. Muris said, "No one should expect any new law to make a substantial difference by itself. Eventually, the Spam problem will be reduced, if at all, through technological innovation... legislation cannot do much to solve the problem," but this omits the essential role that laws must play: they will define our community's criteria for acceptable use of this communications medium. Technology can help us block, but what to block is a human decision.

So laws will be part of an effective solution. Good laws would effectively control over-zealous marketers: they are not trying to hide their identities, they are legitimate companies, and we can assume they will, in general, respect the law. Laws will have little effect on the professional spammers - they already appear perfectly willing to break existing laws, and the problems of tracing the origin and of international jurisdiction make prosecution difficult.

South Korea enacted a new law on spamming last December, and a recent survey by the Korean Information Security Agency (KISA) shows a promising result. The law prohibited automatic generation of email addresses, the harvesting of email addresses from Web sites and the use of technical means to get around Spam blocks. It also protected children from Spam, and controlled labelling of commercial emails. KISA's survey revealed that in March this year, 90% of commercial email received by users was unsolicited, but the figure had fallen to 70% in July, and the organisation attributed the fall to the new law.

Belgium has also recently updated its laws on Spam. The new law, enacted in March 2003, provides a recent example with which to discuss good and bad legal approaches. One good point is that it defines email broadly as any message sent over a public communications network and stored until it is collected by the recipient, thus covering traditional email, SMS, MMS and voice mail.

Also good is that the sender must not use a third party's address, or falsify or hide information to prevent tracking the origin of the message.

It does require opt-in: the prior, free, specific and informed consent of the recipient, and the recipient can always withdraw consent. Unfortunately, the Belgian opt-in requirement only applies to natural persons, not legal persons, i.e. companies and organisations have to opt out.

Opt-Out Problems

Direct Marketers and some ISPs, including the Hong Kong ISP Association, favour opt-out lists, but they have serious drawbacks:

Firstly, professional spammers sometimes use "opt-out" links to confirm that an address is active - the victim then receives even more Spam.

Secondly, there is no requirement to remove an invalid address, but the messages still consume the bandwidth of the recipient domain. If the opt-out requires a message from the address to be removed, it is inconvenient for administrators to generate the necessary messages. In the extreme case of the high-profile Hong Kong domain mentioned above, 9391 dummy addresses would be required.

Thirdly, there is no standardised method for removal - it may need a message with the subject UNSUBSCRIBE, or REMOVE, or perhaps those words in the message body, or perhaps a visit to a website. This makes removal impossible to automate and time-consuming for the recipient.

Fourthly, there is no requirement for the data to be collected fairly, and the burden on the recipient is unjustifiable. Ignoring the international considerations for a moment, there are about 300,000 registered companies in Hong Kong. Suppose a company says, "Please send your sales enquiries to sales@" on its web page: each of those 300,000 companies could add the address to its mailing list. If unsubscribing from one list takes 30 seconds, then that is 2,500 hours, or about 1.25 man-years, of work.
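The 1.25 man-year figure can be checked with a few lines of arithmetic. The sketch below uses the article's figures (300,000 companies, 30 seconds per removal); the length of a working year is not stated in the text, so 8 hours a day for 250 days is an assumption, chosen because it reproduces the quoted figure.

```python
# Figures from the article.
COMPANIES = 300_000
SECONDS_PER_UNSUBSCRIBE = 30

# Assumption (not stated in the article): 8-hour days, 250 working days a year.
HOURS_PER_WORKING_YEAR = 8 * 250


def unsubscribe_burden(lists, seconds_each, hours_per_year=HOURS_PER_WORKING_YEAR):
    """Return (total hours, working years) needed to opt out of every list."""
    total_hours = lists * seconds_each / 3600
    return total_hours, total_hours / hours_per_year


hours, years = unsubscribe_burden(COMPANIES, SECONDS_PER_UNSUBSCRIBE)
print(f"{hours:.0f} hours, about {years:.2f} working years")
```

That is the cost for a single advertised address; a company that publishes several contact addresses would multiply it accordingly.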

Defining Spam

Echoing Justice Potter Stewart's famous comments on pornography, "I could never succeed in intelligibly [defining pornography], but I know it when I see it", some commentators suggest that defining Spam is difficult and that this would be a problem in drafting laws. However, reasonable criteria should not be difficult to define.

Although they are not couched in legal terminology, I suggest the following criteria as a starting point for further debate:

Messages should be solicited: either the recipient (or a qualified representative) asked for the message, or the address was advertised as a contact point for the purpose it is being used.

Messages should be in a language that the recipient understands. If the sender does not know what languages the recipient understands, then they have no business sending them a message.

The From: addresses in the message and its envelope should work, and must relate to the actual message sender.

The recipient should not have asked to be removed from the mailing list, or asked the sender to stop sending email.

Messages should not be sent via a mailing list that the recipient did not request to be added to, or any kind of purchased bulk mailing list.

Messages should not propose any kind of illegal or unethical activities. This will cover invitations to defraud the Nigerian Government, as well as any kind of porn-related message sent to a minor - if the sender does not know the age of the recipient, they have no business sending age-restricted content.

Blacklists

Various forms of blacklist exist. The Internet Society of China recently announced that their members would block emails from 127 servers that had been identified as sources of Spam by their Anti-Spam E-Mail Coordination Team. Unfortunately, that is a minuscule number, and it is very easy for a spammer to move to a new server.

A number of commercial and free DNS-based blacklists also exist, maintained by various independent organisations, including SpamCop, Spamhaus and MAPS. Technically, these provide a simple, low-maintenance way of blocking connections from known Spam sources - many mail servers and mail gateways can be simply configured to perform a lookup on these lists when they receive a connection. As the connection can be dropped on the basis of the source IP address, before the message has been transferred, it reduces the bandwidth consumed by spammers. The disadvantage is that legitimate mail will sometimes be blocked, when the sender is unfortunate enough to share their mail server with a spammer (they might share an ISP, or the server might be compromised, or accidentally mis-configured and abused).
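As a rough illustration of the lookup a mail server performs, the sketch below builds the DNS name that DNS-based blacklists conventionally use (the connecting IPv4 address with its octets reversed, followed by the list's zone) and treats any answer as "listed". The zone `dnsbl.example.org` and both function names are hypothetical placeholders, not the address of any real list; a real deployment would use the zone published by the chosen list operator and follow its usage policy.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the list answers with an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (not listed) or the lookup failed. Accepting the
        # connection in both cases is a design choice made for this sketch.
        return False


# Hypothetical zone, for illustration only; no network lookup happens here.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Because the check needs only the source IP address, it can run as soon as the connection arrives and before any message data is transferred, which is where the bandwidth saving comes from.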

An issue that blacklists should address is how they provide assurances that messages are not being censored for other reasons - a Government-managed blacklist might be accused of political censorship, and a commercial blacklist might be accused of blocking competitors' messages or of other motives. Transparency in the requirements of evidence and procedures for listing and de-listing will be necessary safeguards.

Rule-Based Filtering

Nowadays, most mail clients have some form of rule-based mail filtering that can be used to filter out Spam. Some mail gateways have similar options, which brings the efficiency that one rule set is written for the whole site. I do not see these as a viable long-term solution, as spammers simply tweak their messages to avoid the common rules. A lot of the current Spam already does this, with human-decipherable mis-spellings - porn is replaced by p0rn or p()rn, for example. The administrator has to continually update the rules to cope, and the rule set becomes unmanageably large - for one email client, the default anti-Spam rule set contains 417 rules.
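The rule-chasing problem can be seen in a small sketch. Both rules and all the sample subject lines below are made up for illustration and are not taken from any real mail client's rule set: a plain keyword rule misses the two mis-spellings mentioned above, and the rule has to be widened by hand for each new variant.

```python
import re

# A naive keyword rule, and a hand-widened version that also accepts the two
# mis-spellings mentioned in the text. Both rules are illustrative only.
NAIVE_RULE = re.compile(r"\bporn\b", re.IGNORECASE)
WIDENED_RULE = re.compile(r"\bp(?:o|0|\(\))rn\b", re.IGNORECASE)

# Made-up subject lines; the last one is legitimate mail.
SUBJECTS = [
    "Free porn here",
    "Free p0rn here",
    "Free p()rn here",
    "Popcorn recipes",
]


def matches(rule, subjects):
    """Return, for each subject line, whether the rule flags it."""
    return [bool(rule.search(subject)) for subject in subjects]


naive_hits = matches(NAIVE_RULE, SUBJECTS)
widened_hits = matches(WIDENED_RULE, SUBJECTS)
for subject, naive, widened in zip(SUBJECTS, naive_hits, widened_hits):
    print(f"{subject!r}: naive={naive}, widened={widened}")
```

Each further variant a spammer invents needs another alternative in the pattern, which is how a rule set grows to hundreds of entries.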

Ideally, the rules must be created and maintained with your organisation in mind: blocking all messages containing the word viagra is presumably not an option for Pfizer.

Another limitation of the rule-based methods is that the email must be received before the content can be examined, so it does not reduce the bandwidth consumption (redirecting your mail through a service provider who discards messages for you just makes the charging indirect - they pay for their bandwidth, and pass their costs on to you).

Bayesian Filtering

Bayesian filtering uses Bayes' theorem to calculate the probability that a message is Spam, based on the occurrence of the tokens it contains in a corpus of known Spam and non-Spam messages. Paul Graham has developed this method and reported a filtering rate of about 99.75%, and a false positive rate of 0.06%. It shares the limitation of the rule-based technique of not reducing bandwidth consumption.
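To make the idea concrete, here is a minimal sketch of a token-based Bayesian classifier. It is a textbook naive Bayes calculation with add-one smoothing, not Paul Graham's published formula, and the class name, tokeniser and four-message training corpus are all invented for illustration; a usable filter needs thousands of real messages of each kind.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a message into lower-case tokens (a deliberately crude tokeniser)."""
    return text.lower().split()


class NaiveBayesFilter:
    """Toy Spam/non-Spam classifier; needs at least one message of each kind."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        spam_total = sum(self.counts["spam"].values())
        ham_total = sum(self.counts["ham"].values())
        # Start from the prior odds, then add each token's evidence in log space.
        log_odds = math.log(self.messages["spam"]) - math.log(self.messages["ham"])
        for token in tokenize(text):
            if token not in vocab:
                continue  # tokens never seen in training carry no evidence
            p_spam = (self.counts["spam"][token] + 1) / (spam_total + len(vocab))
            p_ham = (self.counts["ham"][token] + 1) / (ham_total + len(vocab))
            log_odds += math.log(p_spam) - math.log(p_ham)
        return 1 / (1 + math.exp(-log_odds))


# Invented training corpus, for illustration only.
spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills online", "spam")
spam_filter.train("cheap software offer", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("project report attached", "ham")

print(spam_filter.spam_probability("cheap pills offer"))
print(spam_filter.spam_probability("monday project meeting"))
```

The first message scores well above 0.5 and the second well below it. Unlike a hand-written rule set, the token statistics come from the organisation's own mail, so a word that is common in its legitimate mail does not count against a message.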

A Hybrid Approach

As with any Denial of Service attack, our main objective is to minimise the impact on our organisations. Laws will help us define what is and is not acceptable, but the main impact-reduction will come from technical methods. Employee time is the most valuable resource affected, so the primary effort should be in reducing the time spent by humans categorising their email and defining rule sets. Reducing bandwidth is a desirable secondary objective.