The Tsonny Blog

Using DSPAM for spam fighting September 28, 2008

This post describes a DSPAM installation on a Red Hat Enterprise Linux 4 machine running Plesk with qmail as the MTA. Most of it should be applicable to other systems, too. It is the setup in use on my own mail server, and this post is no more than a condensed step-by-step guide.

Install DSPAM

If you're on Debian, use aptitude to install it. For RHEL 4, there's a repository on pramberger.at from which you can download the libdspam and dspam RPMs. After that, installation is just the usual rpm -i routine.

To configure it, do the following:

  1. Edit the configuration in /etc/dspam/dspam.conf as follows:
       Preference "signatureLocation=headers"  #to put the DSPAM signature into the header
       Trust popuser  #this is the user qmail runs with
    
  2. Make sure that the DSPAM daemon runs as popuser instead of the dspam user: edit /etc/init.d/dspam and change the line
    initlog -q -c "$SU - dspam -c \"/usr/sbin/$DSPAM --daemon ${OPTIONS} &\""
    into
    initlog -q -c "sudo -u popuser sh -c \" /usr/sbin/$DSPAM --daemon ${OPTIONS} & \"  "
    This is necessary because we cannot su to popuser, since popuser has no shell on RHEL 4.
  3. Make sure that popuser can execute and use dspam. To test, run "sudo -u popuser dspam --help" and check that it prints the help text. If it fails with a permission error, chmod +x the DSPAM executable.

Setting up Plesk Qmail with DSPAM

First, create the directory /opt/dspam-scripts to hold some helper scripts we're going to create in the following.

Create /opt/dspam-scripts/dspam-checkmail.sh with this content:

#!/bin/sh

DSPAM=/usr/bin/dspam
#p="${HOME%/*}"  # THIS DOES NOT YET WORK
#MAILNAME="${HOME##*/}@${p##*/}"

"$DSPAM" --client --stdout --deliver=innocent,spam --mode=teft --feature=noise,whitelist --user popuser@qmail
check=$?

exit $check

This script performs the actual DSPAM spam check. DSPAM can handle user-based filtering, but in our setup we'll just use one DSPAM user for all mail users. To keep it simple, this DSPAM user is called popuser@qmail, after the system user popuser that the daemon runs as (see above).
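
The two commented-out lines in the script hint at how a per-user DSPAM name could be derived from the mailbox path. As a minimal sketch, assuming the mailbox directory follows the /var/qmail/mailnames/DOMAIN/USER layout used elsewhere in this post, the parameter expansions work like this. The function name mailname_from_dir and the example.org address are made up for illustration, and whether $HOME really holds such a path when procmail runs the script is the part marked as not yet working.

```shell
#!/bin/sh
# Hypothetical helper: derive "user@domain" from a mailbox directory laid out
# as .../DOMAIN/USER, using the same parameter expansions as the commented-out
# lines in dspam-checkmail.sh.
mailname_from_dir() {
    dir=$1
    parent=${dir%/*}                               # strip the last component: .../DOMAIN
    printf '%s@%s\n' "${dir##*/}" "${parent##*/}"  # USER@DOMAIN
}

mailname_from_dir /var/qmail/mailnames/example.org/alice
# prints: alice@example.org
```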

Next, add a file procmail.log to each qmail user's Maildir and make popuser its owner. This file will be used for procmail logging.
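
With more than a handful of mailboxes, creating procmail.log by hand gets tedious. Here is a rough sketch of a loop that does it, assuming the /var/qmail/mailnames/DOMAIN/USER/Maildir layout that the scripts in this post use. The function name create_procmail_logs is hypothetical, and the owner is passed as an argument so you can try it on a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: create an empty procmail.log in every Maildir below
# MAILROOT and hand it to OWNER.
create_procmail_logs() {
    mailroot=$1   # e.g. /var/qmail/mailnames
    owner=$2      # e.g. popuser
    for maildir in "$mailroot"/*/*/Maildir; do
        [ -d "$maildir" ] || continue      # skip if the glob matched nothing
        touch "$maildir/procmail.log"
        chown "$owner" "$maildir/procmail.log"
    done
}

# Example call (run as root on the mail server):
# create_procmail_logs /var/qmail/mailnames popuser
```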

We can now start using the filter. To do so, create the directories .Junk and .Junk/cur in each user's Maildir. Then, change the .qmail files in the qmail user directories as follows:

| true
|preline /usr/bin/procmail -m -o .procmailrc

and each user's .procmailrc file as follows:

DEFAULT=./Maildir/
SPAMDIR=${DEFAULT}/.Junk/
LOGFILE=${DEFAULT}/procmail.log
LOG="--- Logging ${LOGFILE} for ${LOGNAME} "

# Begin spam treatment.
:0fw
| /opt/dspam-scripts/dspam-checkmail.sh

:0
*^X-DSPAM-Result: Spam
${SPAMDIR}
# End spam treatment.

The .qmail file just tells qmail to hand each message to procmail as the last delivery step. Note that whatever you do in Plesk, it won't overwrite this setting, so it is safe to do it this way.
The .procmailrc file calls the dspam-checkmail.sh script and then files the message either into the user's Maildir/.Junk folder or, as usual, into the inbox.
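
The .Junk directories mentioned above can also be created in one go. This is only a sketch under the same layout assumption (/var/qmail/mailnames/DOMAIN/USER/Maildir). The function name create_junk_folders is hypothetical, and creating new and tmp next to cur is my addition, following the usual Maildir folder convention rather than anything this setup strictly requires.

```shell
#!/bin/sh
# Hypothetical helper: create a .Junk Maildir folder in every Maildir below
# MAILROOT and hand it to OWNER.
create_junk_folders() {
    mailroot=$1   # e.g. /var/qmail/mailnames
    owner=$2      # e.g. popuser
    for maildir in "$mailroot"/*/*/Maildir; do
        [ -d "$maildir" ] || continue      # skip if the glob matched nothing
        # cur is what the post asks for; new and tmp are an assumption (see above)
        mkdir -p "$maildir/.Junk/cur" "$maildir/.Junk/new" "$maildir/.Junk/tmp"
        chown -R "$owner" "$maildir/.Junk"
    done
}

# Example call (run as root on the mail server):
# create_junk_folders /var/qmail/mailnames popuser
```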

As you can see from the scripts above, we use DSPAM in client mode. Therefore, you now have to start the DSPAM daemon via the /etc/init.d/dspam script.

Training day

At this point, DSPAM is filtering your mail, but it is not yet trained to do it really well. Let us change that. We shall train DSPAM with some old ham and spam:

  1. Add two temporary directories to hold training ham and spam (we'll use /tmp/ham-train and /tmp/spam-train, each with a cur subdirectory).
  2. If you used SpamAssassin previously, you have to remove the SA headers from existing mails before training on them. You can do so by executing
    for i in /var/qmail/mailnames/stoop.net/norbert/Maildir/cur/*; \
       do (spamassassin -d < $i > /tmp/ham-train/cur/`basename $i`); done &
    
    This places a clean copy of all mails in your inbox in /tmp/ham-train/cur. Do the same for existing spam, which we assume to be stored in a .Old_Junk subfolder:
    for i in /var/qmail/mailnames/stoop.net/norbert/Maildir/.Old_Junk/cur/*; \
      do (spamassassin -d < $i > /tmp/spam-train/cur/`basename $i`); done &
    
  3. Now you can train DSPAM using dspam_train USER SPAMDIR HAMDIR, i.e.:
    dspam_train popuser@qmail /tmp/spam-train/cur/ /tmp/ham-train/cur/
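
The two for loops in step 2 differ only in their paths, so they can be folded into one small function. This is a sketch, not what I ran at the time: the name filter_copy is hypothetical, and the filter command is passed as trailing arguments so that the function can be tried with something harmless like cat before pointing it at spamassassin -d. It also quotes file names, which the one-liners above do not.

```shell
#!/bin/sh
# Hypothetical helper: run every file in SRC through a filter command and
# write the result under the same file name into DST.
filter_copy() {
    src=$1
    dst=$2
    shift 2                      # the remaining arguments are the filter command
    mkdir -p "$dst"
    for f in "$src"/*; do
        [ -f "$f" ] || continue  # skip if the glob matched nothing
        "$@" < "$f" > "$dst/$(basename "$f")"
    done
}

# Example calls with the paths from the steps above:
# filter_copy /var/qmail/mailnames/stoop.net/norbert/Maildir/cur /tmp/ham-train/cur spamassassin -d
# filter_copy /var/qmail/mailnames/stoop.net/norbert/Maildir/.Old_Junk/cur /tmp/spam-train/cur spamassassin -d
```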
    

User-assisted training using IMAP directories

The following steps describe how you can let your users train DSPAM by placing wrongly classified mails into special IMAP folders. Assuming that your users put undetected spam into the Junk-Learn subfolder of their Junk folder and ham wrongly classified as spam into the Innocent-Learn subfolder (the directories .Junk.Junk-Learn and .Junk.Innocent-Learn in the Maildir), save the following script as /opt/dspam-scripts/junk_training_script.sh to retrain these mails:

#!/bin/sh
#  (c) 2005 Casey Allen Shobe 
#  Released under BSD license.  See http://opensource.org/licenses/bsd-license.php

# Bail if already running (in case we are slammed with spam to train from):
processes=$(ps ax)
if [ `echo "${processes}" | grep junk_training_script.sh | wc -l` -gt 1 ]; then
        echo "ERROR: junk_training_script is already running!"
        exit 1
fi

# Subscripts:
cutscript=/opt/dspam-scripts/dspam-antispam_update-cut
filterscript=/opt/dspam-scripts/dspam-antispam_update-filter
dspamuser=popuser@qmail
mailroot=/var/qmail/mailnames

for domain in `find "${mailroot}" -type d -maxdepth 1 -mindepth 1 | cut -d '/' -f 5 | sort`; do
  if [ `find "${mailroot}/${domain}/" -type d -name ".Junk.Junk-Learn" | wc -l` -gt 0 ]; then
    echo "Running for domain ${domain}"
    for user in `find "${mailroot}/${domain}/" -type d -maxdepth 1 -mindepth 1 | cut -d '/' -f 6 | sort`; do
      maildir="${mailroot}/${domain}/${user}/Maildir"
      if [ `find "${mailroot}/${domain}/${user}/" -type d -name ".Junk.Junk-Learn" | wc -l` -gt 0 ]; then
        echo " - ${user}@${domain}"
        if [ -d "${maildir}/.Junk.Junk-Learn/cur/" ]; then
          find "${maildir}/.Junk.Junk-Learn/cur/" -type f \
               -exec ${cutscript} {} Spam \; \
               -exec ${filterscript} "${dspamuser}" spam error {} \; \
               -exec mv {} "${maildir}/.Junk/cur/" \;
        fi
        if [ -d "${maildir}/.Junk.Innocent-Learn/cur" ]; then
          find "${maildir}/.Junk.Innocent-Learn/cur/" -type f \
               -exec ${cutscript} {} Innocent \; \
               -exec ${filterscript} "${dspamuser}" innocent error {} \; \
               -exec mv {} "${maildir}/cur/" \; #TODO: Should call maildrop? 
        fi
      fi
    done
  fi
done

Note: This script checks all your server's domains at once. It moves learned spam to Junk and learned ham to the user's inbox.
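
The "bail if already running" check at the top of the script counts matching lines in the ps output, which can be thrown off by unrelated processes that happen to have the script name in their command line. A different technique, shown here only as a sketch, is a lock directory: mkdir either creates the directory or fails if it already exists. The lock path and the function names acquire_lock and release_lock are made up for illustration.

```shell
#!/bin/sh
# Sketch of a mkdir-based lock. The lock path is a made-up example and can be
# overridden through the LOCKDIR variable.
LOCKDIR=${LOCKDIR:-/tmp/junk_training_script.lock}

acquire_lock() {
    # Succeeds only for the caller that actually creates the directory.
    mkdir "$LOCKDIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCKDIR"
}

# Intended use at the top of the training script:
# acquire_lock || { echo "ERROR: junk_training_script is already running!"; exit 1; }
# trap release_lock EXIT
```

One trade-off: if the script is killed before the trap runs, the lock directory stays behind and has to be removed by hand.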

The script also needs two helper scripts:
/opt/dspam-scripts/dspam-antispam_update-cut (just formats an output for logging etc):

#!/bin/sh
output=`echo "${1}" | cut -d '/' -f 10`
echo "    ${2} - ${output}"

/opt/dspam-scripts/dspam-antispam_update-filter (the actual DSPAM training):

#!/bin/sh

# Pull the signature value out of the X-DSPAM-Signature header of the mail file (${4})
signature=`grep -h X-DSPAM-Signature "${4}" | awk '{print $2}'`
echo "           ...retraining signature ${signature}"
/usr/bin/dspam --client --user "${1}" --class="${2}" --source="${3}" --signature="${signature}"
echo "           using DSPAM --client --user ${1} --class=${2} --source=${3} --signature=${signature}"
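
The grep in the filter script matches X-DSPAM-Signature anywhere in the file, so a forwarded or quoted mail that carries the header text in its body would yield more than one hit. As a hedged sketch, a stricter lookup could stop at the first empty line (the end of the headers) and take only the first match. The function name extract_signature is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first X-DSPAM-Signature header.
# It stops at the first empty line, so text in the message body is ignored.
extract_signature() {
    awk '/^$/ { exit }
         /^X-DSPAM-Signature:/ { print $2; exit }' "$1"
}

# Example use inside the filter script:
# signature=$(extract_signature "${4}")
```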

To train automatically, make a cron job for /opt/dspam-scripts/junk_training_script.sh which runs every 30 minutes or so.
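
For reference, here is a small sketch that prints a matching crontab line. The helper cron_line is made up for illustration, and the */30 step syntax assumes a cron implementation that supports it (check your system's crontab man page if in doubt).

```shell
#!/bin/sh
# Hypothetical helper: print a crontab line that runs COMMAND every N minutes.
cron_line() {
    printf '*/%s * * * * %s\n' "$1" "$2"
}

cron_line 30 /opt/dspam-scripts/junk_training_script.sh
# prints: */30 * * * * /opt/dspam-scripts/junk_training_script.sh
```

The printed line can be pasted into the crontab (crontab -e) of a user that is allowed to read the Maildirs and talk to the DSPAM daemon.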
