Just some stuff I gathered on spam.

Antispam resources:

A website helping spammers to spam themselves to death.

Spam is also known as unsolicited bulk email. Unsolicited means you didn't ask for it, bulk means there's a lot of it, and email means... email. And let me tell you, there is a lot of it. A spammer's computer can send millions of spam emails all across the internet in a very short period of time (minutes, I think?).

Why is spam bad?

Spam is a Denial of Service attack. You might have heard of Code Red, which forced the United States White House to move its web page to avoid being crushed by hundreds of thousands of innocent users' infected machines overloading its system. That's what is called a Distributed Denial of Service attack. (Distributed because the attacker controls more than one computer; Denial of Service because it denies the target the ability to provide a service, such as serving a website.) Now let me compare that to spam.

Spammers don't send you spam from their own computers. Instead, they send the mail to what are called anonymous open relays. These are computers whose mail servers (i.e. the software that sends and relays mail) are so badly configured that anyone can instruct the machine to send a completely anonymous email to as many people as they want. Spammers take advantage of this not only to be completely untrackable, but also because the open relays reduce the load on the spammer's own computer. So whereas a spammer alone might send out a million emails in 5 minutes, a spammer who has discovered 4 open relays can send 4 million more in those same 5 minutes, and each email will seem to originate from an anonymous open relay.

So the open relays play the role of the innocent computers infected by Code Red. The spammers send mail at such a high rate that it slows down network traffic and messes things up, albeit not as skillfully as Code Red. If you don't think email can clog a network, go look up the email virus hoaxes. They have successfully DoS'd whole companies to a standstill where email is concerned.

Unlike the senders of junk mail in your physical mailbox, spammers do not pay for sending you commercial email. It is your service provider who pays, using up space in your email box to hold spam. It is the Internet routers who pay, delivering the spam to your computer. And it is you who pay, wasting valuable seconds, even hours, trying to find useful email amongst the spam. These are not just aesthetic comfort issues; this stuff costs real money.

What can you do?

Don't distribute your email address anywhere. Don't fall for stupid promotional offers. Don't try to get free porn on the web. Don't think you can get away with anything without paying a cost. *sigh* Don't post your email on any webpages, including bulletin boards, mailing list archives, and home pages. Don't send one email to everyone in your address book: you may know everyone in it, but the chance that two of those people don't know each other is very high, and giving other people's email to strangers is bad. Don't use your real email on newsgroups, especially binary ones. (You're pretty much dead if you do this.) Use a fake email ending in ".invalid", and if your newsreader doesn't support that, get a better one. And if your ISP is naive enough to insist you use a valid email for newsgroups, use a different news server. If you can't get another news server, try Google.

I use RBLcheck in my Procmail filter to check whether sending addresses are on a Realtime Blackhole List (RBL). RBL servers work like DNS servers (the lookups are in fact ordinary DNS queries), except that instead of tying names to IP addresses, they tie "guilty" to IP addresses or blocks. If you ask about an IP and get the "guilty" answer, then that IP is considered blacklisted by that server. RBLcheck is just a nice utility to deal with the confusing protocol of RBLs.
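For anyone curious what such a lookup looks like on the wire, here is a minimal Python sketch of the usual DNS-based convention: reverse the IPv4 octets, append the blacklist's zone, and see whether the name resolves. This is not how RBLcheck itself is implemented (I haven't shown that here); the function names and the zone "dnsbl.example.org" are made up for illustration, and you would substitute the zone of whatever list you actually use.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a blacklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone has an entry for this IP.

    A name that does not resolve is treated as "not listed". Note that a
    DNS failure looks the same as "not listed" in this simple sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# "dnsbl.example.org" is a placeholder zone, not a real blacklist.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

The point of the reversed octets is that a list can then cover a whole block of addresses under one DNS subtree, the same way reverse DNS does.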

Warning: Realtime Blackhole Lists are extremely cloistered; they allow no outside parties to make adjustments to their lists. As far as I can gather it's like, "I'll spend all my time marking down spam sites, and won't take any outside help," but I might not have the clearest picture. I've been really happy with what those lists have done, though. They only mark a good email as spam every now and again, but you should understand that it does happen. I myself have a whitelist of users that don't get checked by the spam filter, and I absolutely don't delete spam on arrival, since someone valid, whose ISP is stupid enough to run an open relay, should at least get a chance to be heard.
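The policy described above (whitelisted senders skip the check, and suspected spam gets filed rather than deleted) boils down to something like the following Python sketch. My real setup is a Procmail recipe, so treat this only as an illustration of the decision logic; the function name and the folder names "inbox" and "suspected-spam" are invented for the example.

```python
def route_message(sender: str, relay_listed: bool, whitelist: set[str]) -> str:
    """Pick a destination folder; nothing is ever deleted outright."""
    if sender.lower() in whitelist:
        return "inbox"           # trusted senders skip the blacklist check
    if relay_listed:
        return "suspected-spam"  # filed for later review, not discarded
    return "inbox"


# Whitelist entries are assumed to be stored in lowercase.
friends = {"alice@example.com"}
print(route_message("Alice@example.com", True, friends))
print(route_message("stranger@example.net", True, friends))
```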

If you do this, I also recommend you write a program to translate the confusing Received: lines (the only semi-reliable way to track an email) into a simpler header. Mine ends up as "X-IP: host name [host IP]", where "host name" is the resolved host and "host IP" is the resolved IP. Now why couldn't Received: lines be this simple?
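To give a rough idea of the translation, here is a small Python sketch (not my actual code) that first joins continuation lines back onto their header and then pulls the host name and bracketed IP out of each Received: line. The regexp assumes the common "from helo-name (resolved-name [ip])" shape; other mail servers write these lines differently, so the pattern and the function names here are assumptions to adapt, not a complete parser.

```python
import re

# Assumed shape: "Received: from <helo> (<resolved-name> [<ip>]) ..."
# The resolved name is optional, since some servers record only "[ip]".
RECEIVED_RE = re.compile(
    r"^Received:\s+from\s+\S+\s+"
    r"\((?P<host>[^\s\[\]()]+)?\s*\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]",
    re.IGNORECASE,
)


def unfold(header_text: str) -> list[str]:
    """Join continuation lines (starting with space or tab) onto their header."""
    headers: list[str] = []
    for line in header_text.splitlines():
        if line[:1] in (" ", "\t") and headers:
            headers[-1] += " " + line.strip()
        else:
            headers.append(line)
    return headers


def x_ip_headers(header_text: str) -> list[str]:
    """Return one 'X-IP: host [ip]' line per Received: header that matches."""
    result = []
    for header in unfold(header_text):
        match = RECEIVED_RE.match(header)
        if match:
            host = match.group("host") or "unknown"
            result.append(f"X-IP: {host} [{match.group('ip')}]")
    return result


sample = (
    "Received: from mail.example.com\n"
    "\t(relay.example.org [198.51.100.7])\n"
    "\tby mx.example.net; Mon, 1 Jan 2001 00:00:00 +0000\n"
    "Subject: hello\n"
)
print(x_ip_headers(sample))
```

Unfolding first matters because the parenthesized part often lands on a continuation line, so a line-by-line regexp would miss it.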

The code I made to add the X-IP thing is here. I'm horribly embarrassed about it; it's a kludgy, hacked-together piece of work, but it's there just in case you know what you're doing and you need the regexps from me (as well as the handling for that "newlines are allowed in Received: lines" annoyance). I'd give you the code that does a fork/exec/waitpid on RBLcheck for every X-IP, but if you can understand it, you can probably write your own.