CPU challenges for spam-free mail

In 1995, I proposed an anti-spam system where mail would include an analog of a stamp, or as I later refined it a cheque, which would offer a small amount of money with each mail. If the recipient decided the mail was spam, they could redeem the stamp and get the money. This would make spam un-economical, but as long as regular recipients were polite, would not cost money for ordinary users.

It would, however, require a digital money infrastructure and new mailing software for mail senders and recipients.

Many others have had the same idea. One of the most interesting alternatives was the idea I call a CPU stamp, first proposed at Crypto '92 by Cynthia Dwork and Moni Naor. In this case, the mail would come not with digital money but with proof that the sender had spent several seconds of computer time before sending the mail. Computer time is very cheap for ordinary PC users, and so if you need to spend 20 seconds of calculating to send your first E-mail to somebody, it's no big burden. A spammer, sending a million messages, would need 230 days of CPU time, which is not practical. (Proposed calculations can include CPU intensive and memory intensive operations which may be more consistent among platforms.)

One such project is the HashCash system. The name comes because the problem to be solved is inverting a hash function. This is something that can only be done by brute force and thus takes lots of CPU time.

This idea eliminates the need for money to be involved, which solves some of the problems, but doesn't help the legitimate host of a mailing list. It also requires new software for the sender, which again has been an non-starter up to now.

Challenge/Response

The first system I actually implemented against spam was a Challenge/Response mailer. This method has also become popular, and since newer systems aren't doing it quite right, I prepared a set of challenge/response best practices for such systems.

C/R systems are very effective against spam, and done right have the lowest false positive rates (with the requirement that the user scan over the lower-scoring messages that never responded to challenges.)

Certificates

Today, people are so fed up with spam they are actively proposing we move to fully authenticated E-mail. All mail senders would need to get a certificate which in most (but not all) proposals would verify your identity. You would need to sign all mail and use the certificate. People would only accept mail from a sender who was authenticated. The simple anonymous e-mail system of the past would be undone in the name of stopping spam.

This concerns me, since even though you can send Anthrax in the mail, we don't demand you show ID to drop a letter in a postbox. While most advocates of certificates claim that it will be up to the individual to decide if they want unsigned mail, the truth is almost nobody will choose to do so if the options are refuse unsigned mail or get a lot of spam. The decision to allow privacy and anonymity in communications is one society has to make as a whole.

Combining it all together

If we are ready to demand that senders get new mailing software, we don't have to go so far as to demand they sign all mail. Below I describe a better option.

Basic Whitelisting

Almost all mail you get is from people you know or mailing lists you subscribed to. As such, most spam systems try to "whitelist" such names, and let the mail through without further blocks. This is a good idea, and can be implemented with some sophistication. If this is done, whatever else we do only applies to a minority of mail, and that's good.

There are other techniques people have to mark mail as almost surely not spam. Almost all anti-spam systems include such a determination. Such mail should also go right through.

CPU stamp challenge

When mail arrives that is the first mail from an unknown correspondent, and other spam detectors are not sure if it is spam or not, a challenge can be sent. This challenge would be implemented both with a ESMTP response code and with a regular message offering options for response.

A special ESMTP response code which indicates the mail is getting a challenge. This will assist automating challenges on direct connections. However, it will not be failure code because the real challenge must come like an E-mail for users unable to get it any other way. The suspect mail is held, not rejected.
A challenge mail with ordinary human language instructions to allow the sender to prove they are not a spamming robot. This is often termed a "Turing test." If the receiver of the challenge has legacy software, they will need to respond to this challenge.
In the challenge will be the URL of a Java applet which is programmed to, when run, solve a demanded CPU intense problem, ie. generate a CPU stamp. Any user with a web-enabled E-mail and Java (which is the vast majority of users) can click on the link and forget about it. The Java applet will run on their machine, calculating the answer, and forwarding it back to make sure the mail gets delivered. This all takes place in a low-priority background task.
The CPU challenge will also be encoded in a standard form. For hashing, the problem might be "For X, Y and Z, find W such that MD5(X,W) modulo Z = Y." Cryptographers will know that the larger Z is, the harder this problem will be. As CPUs get faster, it can be made harder. Aware mail client would see the challenge, and fire off a background program to solve it and forward the answer to get the mail delivered. The user will not even be aware of this.

In this case, CPU stamps are only calculated on your first mail to a new correspondent if the mail does not pass other non-spam tests. In other words, not very often. As such the CPU required can be high, high enough to deter any spammer. Users with new mail clients (MUAs) are not even aware that the system is operating -- except for mailing lists.

Certificates

This system would also allow digital signature and certificates. Any signed mail with a valid certificate would be let through. Many mailing lists would consider getting such a certificate, since they can't afford to do CPU stamps for all mail.

Mailing lists could get a certificate from a central trusted certificate authority or one of its delegates, or they could generate their own. If they generate their own, then users must approve the mailing list manually.

Individual users could also sign their mail and get certificates, if they have other needs to sign their mail, or if they don't wish to spend CPU time on the CPU-challenges.

Mailing lists

Mailing lists could get a certificate from a central authority, but we don't like systems requiring authority (even secure DNS which has ICANN as a central authority.) Another simple system would work as follows.

Normally challenge/response systems must never challenge mailing list mail. However, mailing lists could explicitly ask for a one-time CPU-stamp challenge if their mail is unrecognized.

When the first mail arrives in your box from the list, with the flag indicating this is OK, your mail system would issue a CPU challenge. The mailing list would answer this. Your mail system would then notify you that some mailing list mail has come in for a mailing list you forgot to whitelist. You can then confirm or deny that you wanted to subscribe. If you do subscribe, you whitelist the mailing list and its mail arrives unimpeded. While a mailing list can't afford to do CPU stamps for each mail it sends out, it can afford them for each new subscriber who forgets to whitelist when subscribing.

(Note that people who subscribe using e-mail with a smart mailer would whitelist automatically.)

Anonymous Mailers

The best thing about this system is it preserves anonymous E-mail, if not anonymous mailing lists. (I am hard pressed to figure out how to do an anonymously hosted mailing list and stop spam.)

To mail anonymously, just pre-compute a sufficiently large CPU stamp. In effect, this is the hashcash system, which has all mail come with the CPU stamp in advance.

The recipient will see the CPU stamp (which is tied to a hash of the message itself so it can't be re-used) and let the anonymous mail in.

Windows into the system

Are there some security windows which will allow spammers in? Not many. If there are windows into the spam-scoring systems that allow low-scoring mail to get in unchallenged, they could be exploited for a time.

The other main window is the virus. Spammers may eventually write a virus to steal other's CPU time and calculate CPU stamps for them. They may also do this to have the victims send mail to their friends, as is common in an E-mail virus, to use the whitelisting that friends will tend to get.

While this is nasty, it's difficult, already a felony, and doesn't scale well. Almost all spam systems are vulnerable to this attack.

Note that as CPUs get faster, the challenge problem has to get harder over time. That's easy with a challenge based problem. Pre-calculated stamps for anonymous mail must keep up with the latest trends. If they send a stamp that's not "hard enough" they may get their mail rejected, and if they are anonymous, they have no way of getting back the error. As such, anonymous mailers must send an even harder pre-calculated stamp than normal. Fortunately, only a small volume of mail desires anonymity.

Once Implemented

If implemented, this system would have the following attributes:

Challenges would be rarely sent, only on opening of correspondence and first subscription to a mailing list.
Users with new mail clients would be unaware of the system. At most they would notice a slight delay (and a slight slowing of their machine) on mailings to brand new users, and a new to confirm subscriptions to mailing lists.
Users with old mail clients but Java software would be able to deal with challenges with a single click. This is the vast majority of all users.
Offline users and the few remaining users with old software and no Java would need to deal with occasional, rare Turing test challenges.
Mailing list hosters would wish to get updated mailing list software to allow signing and response to new-subscription challenges. So long as new subscription challenges have low volume, CPU costs would be manageable.
Of course, spammers would not be able to spam without buying supercomputer clusters. Should they do this the challenge difficulty can easily be boosted.
Anonymous mail would be protected and reasonably easy, but would require new mailing software
Strictly speaking, while this works best if senders get updated mail clients, it does not require it.

Other Methods

Of course, I believe it is possible to block spam without demanding new software on the part of senders, using throttle servers. I describe this in my best plan to end spam essay.