Tackle the SPAM problem with exim and spamassassin.

My platform of choice is Debian GNU/Linux; in particular I run a mix of "sarge" and "sid".

This version of this document refers to :

exim 4 4.01 , 4.04 (not yet packaged)
spamassassin 1.5 2.01 2.11 2.20

Other versions may or may not work the same. Usually the software works the same at least for minor version changes.

Note:


I am not using the configuration documented here. I have since installed Marc Merlin's sa-exim local_scan() implementation on my system. Even though it is still relatively new code it has been working quite well for me. This document remains for those of you who still need/want it.

Step 1 :

Edit /etc/exim/exim.conf to include scanning (filtering) by spamassassin in the delivery of a message.
In the Transports section add the following (order is insignificant) :

# Spam Assassin
spamcheck:
    driver = pipe
    command = /usr/local/bin/exim4 -oMr spam-scanned -bS
    use_bsmtp = true
    transport_filter = /usr/bin/spamc
    home_directory = "/tmp"
    current_directory = "/tmp"
    # must use a privileged user to set $received_protocol on the way back in!
    user = mail
    group = mail
    log_output = true
    return_fail_output = true
    return_path_add = false
    message_prefix =
    message_suffix =
        

Notes :
This pipes the message back to exim using the BSMTP (batched SMTP) protocol, which avoids any nasties with shell metacharacters in addresses. Before handing the message back to itself, exim filters it through the 'spamc' command. As the message returns to exim, its "received_protocol" is set to "spam-scanned".
The main problem with this setup is that users can't configure SA for themselves, since SA is run as user 'mail'.
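
To make the BSMTP hand-off described above more concrete, here is a minimal Python sketch of how one message is framed as a batched-SMTP transaction. It is an illustration only, not exim's code: the function name bsmtp_batch and the example addresses are made up, and it assumes plain newline line endings and no leading HELO command.

```python
def bsmtp_batch(sender, recipients, message):
    """Frame one message as a batched-SMTP (BSMTP) transaction.

    sender and recipients are envelope addresses; message is the full
    text (headers, blank line, body) as it comes out of the filter.
    """
    lines = [f"MAIL FROM:<{sender}>"]
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    for line in message.splitlines():
        # SMTP dot-stuffing: double a leading "." so that a body line
        # cannot be mistaken for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")  # a lone "." ends the DATA section
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical addresses, used only for the demonstration.
    print(bsmtp_batch("alice@example.org", ["bob@example.com"],
                      "Subject: hi\n\nhello\n.hidden line\n"), end="")
```

Because the envelope travels as SMTP commands on standard input rather than as command-line arguments, odd characters in an address never reach a shell.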




Insert the following in the Routers section. The order matters. Put it after any routers that should handle unscanned mail and before any routers that should handle only scanned mail.

# Spam Assassin
spamcheck_router:
  no_verify
  check_local_user
  # When to scan a message :
  #   -   it isn't already flagged as spam
  #   -   it isn't already scanned
  condition = "${if and { {!def:h_X-Spam-Flag:} {!eq {$received_protocol}{spam-scanned}}} {1}{0}}"
  driver = accept
  transport = spamcheck
        

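This router is used for any message addressed to a local user that does not already carry an X-Spam-Flag: header and was not received with the protocol "spam-scanned".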
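
The router's "condition" line is dense, so here is a small Python model of the same decision, offered as a sketch rather than as anything exim runs. The function name should_scan and the dict-of-headers representation are assumptions made for the example. It treats header names as case-insensitive and checks only whether X-Spam-Flag is present, not what it contains.

```python
def should_scan(headers, received_protocol):
    """Return True if the message should go to the spamcheck transport.

    headers: dict mapping header names to values.
    received_protocol: the protocol the message arrived with, or None.
    """
    names = {name.lower() for name in headers}
    already_flagged = "x-spam-flag" in names                # !def:h_X-Spam-Flag:
    already_scanned = received_protocol == "spam-scanned"   # !eq {...}{spam-scanned}
    return not already_flagged and not already_scanned
```

The second test is what stops the loop: a message coming back from the spamcheck transport arrives with the protocol "spam-scanned", so it falls through to the routers below.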

Step 2:

(apologies in advance for the annoying html to those who didn't try to put debian-specific shell configs in their spamassassin config files)
THIS FILE IS NOT THE SAME AS YOUR local.cf. IF YOU ARE NOT USING THE DEBIAN PACKAGE THEN YOU DON'T HAVE THIS FILE AND MUST DEAL WITH YOUR INIT SCRIPTS YOURSELF.

Edit /etc/default/spamassassin to start 'spamd' at boot time, to not create user preferences files automatically, and to not add a "From " line at the top of each message. The "From " line will really break your mail because it belongs only in mbox mailboxes.

# Change to one to enable spamd
ENABLED=1
OPTIONS="-F 0"
        

(this will make sense when you read your copy of the file)

Step 3:

Start spamd as root :

# /etc/init.d/spamassassin start

Note:
Thus far the spam hasn't been dealt with; each message has only been tagged as to whether or not it is spam. Users must now decide what they want to do with messages that have been tagged as spam. A variety of tools can be used for filtering, including procmail, maildrop, and exim.

I decided to use exim for filtering my mail into various folders. In my filter file I added the following above the other sorting rules. For details on what a filter file is and where it goes see section 21.10 of spec.txt (specifically the 'allow_filter' option) and section 5 of filter.txt.

if
    $h_X-Spam-Status: contains "Yes"
        or
    "${if def:h_X-Spam-Flag {def}{undef}}" is "def" 
then
    logwrite "    => junk : SPAM"
    save $home/Mail/junk/spam/
    finish
endif
        

This dumps all messages tagged as spam into their own folder (and mentions it in my logfile). At my leisure I can then check the messages for any false-positives. Note that because I skip scanning on messages with an X-Spam-Flag: header I must check for that in my filter. Otherwise a spammer could put the X-Spam-Flag: header in but omit the X-Spam-Status: header and slip past my filter.
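
The reason for testing both headers can be shown with a short Python model of the filter rule. This is a sketch under assumptions: the name is_tagged_spam and the dict-of-headers input are invented for the example, and the substring match is assumed to be case-insensitive.

```python
def is_tagged_spam(headers):
    """Model of the filter test: Status contains "Yes" OR Flag is present.

    headers: dict mapping header names to values.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    status_says_yes = "yes" in lowered.get("x-spam-status", "").lower()
    flag_present = "x-spam-flag" in lowered
    return status_says_yes or flag_present


if __name__ == "__main__":
    # A forged message: it carries X-Spam-Flag so the router skips
    # scanning, but it has no X-Spam-Status header at all.
    forged = {"From": "spammer@example.net", "X-Spam-Flag": "NO"}
    print(is_tagged_spam(forged))
```

With only the X-Spam-Status test, the forged message in the example would be neither scanned nor filed as spam; the second test closes that gap.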

If you find this info helpful or confusing or if you have any comments or questions or if you have any other reason (besides sending spam ;-)), just drop me a note : dman@dman.ddts.net. Unfortunately this host has some connectivity problems at the moment. The address dsh8290@rit.edu will work until I graduate.



Comments from readers:


Here are some comments that readers of this page have sent me. They are located here because they are not part of my setup, so I can't really vouch for them, but I think the information is worth sharing anyway.
(Note: This comment was given for exim3. I have tweaked it to use correct exim4 terminology.)

Couple of things which I do slightly differently (and I think better :-)

1) Put the router that delivers mail with the real- prefix before the spam 
check router - this allows false positives to reply to you if they wish, 
and therefore ...

2) .. my filter file bounces back a message thusly
========================== exim filter
# Exim Filter

if first_delivery and
   $h_X-Spam-Flag: contains "YES"
then
   logfile /var/log/exim/spamlog
   logwrite "$tod_log From: $h_From: Subject: $h_Subject: \n \t X-Spam-Status: $h_X-Spam-Status: Sender: $sender_address"
   if $h_From: is not ""
   then
      mail to $h_From: subject "Re: Your last message to me"
           expand file /etc/exim/spam-reply.txt
           once /var/log/exim/spamcount
           once_repeat 5d
   endif
   seen finish
endif
================== spam-reply.txt
Your mail with Subject: $h_Subject:
to domain <my_domain> appears to be unsolicited spam.

If you intended to contact a person at that email domain for
legitimate reasons then our apologies. Please would you resend to the
same address but add the prefix "real-" (without the quotes) to the
e-mail address and it will bypass the spam filter.

Thank you

postmaster@<my_domain>
=======================
        

Thanks

These people have helped me understand the system and correct some errors and sub-optimality in the configuration discussed above. This list is by no means exhaustive. If you feel your name should be added here, drop me a line.