Using squid, squidGuard, havp, and ramdisk as an antivirus proxy & internet filter

For a couple of years I have used havp in my home. Since my web/email server is in my home and already running clamav as my antivirus, once I learned that I could route all my internet traffic through the havp proxy, it was a no-brainer that I needed to do this as another layer of protection for my home computers. Setup was straightforward: I just downloaded it, ran the usual configure/make/make install, and it worked. The only tricky part was setting up the mandatory locking on the file system. Initially I had a spare hard drive in the server that I just remounted with mandatory locking and all was fine.

Then a few things happened around the same time. First, I needed that spare hard drive to go into a refurbished computer that went to my father-in-law. Time to use a ramdisk for my havp scanning filesystem. Next, my son (now 10) started using the web more. Time for a filter.

This post documents how I set up squid with squidGuard as a content filter and havp as an antivirus proxy, with a ramdisk as the havp scanning filesystem. Getting the ramdisk to mount with each reboot took some work, so here’s what I did.

First, the ramdisk. Hopefully someone can correct me, but I found that I could not simply put a line in my /etc/fstab for the ramdisk and just have it get mounted with a system reboot. Reboots are pretty rare, but this is the problem… they are so rare that when they do happen, I have to re-learn (a la google) how to create the ramdisk properly, then when starting havp fails I have to re-remember that I have to reset permissions on the mounted ramdisk, etc. Time for a script. Here’s my script, called /usr/local/bin/mount-havp-ramdisk.sh:

#! /bin/bash
# HAVP requires a filesystem with mandatory locks.
# I use a ramdisk for the filesystem, which must be created
# before use by HAVP.
# The script is called from the /etc/init.d/havp startup script,
# and verifies that the ramdisk exists and is mounted, and if not
# it creates it and sets proper permissions.

# Set some variables
RAMDISK=/dev/ram0
MOUNTPOINT=/var/tmp/havp
HAVPUSER=havp

#
# If the ramdisk already exists and is mounted, then no need to continue.
#
MP="`/bin/mount |/bin/grep $RAMDISK`"
if [ "$MP" != "" ]; then
        # ramdisk is mounted; exit with success.
        exit 0;
fi

#
# Since ramdisk not mounted, we won't assume it exists.
# First we'll create the ramdisk, then mount it with mandatory locking
# and finally set permissions
#
/sbin/mke2fs -q -m 0 $RAMDISK && \
        /bin/mount -o mand $RAMDISK $MOUNTPOINT && \
        /bin/chown $HAVPUSER:root $MOUNTPOINT && \
        /bin/chmod 0750 $MOUNTPOINT
exit $?
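
Since havp depends on mandatory locking, it can be worth confirming that the mount really carries the mand option after the script has run. What follows is only a minimal sketch, assuming a Linux system where /proc/mounts lists the mount options in its fourth field. The function name has_mand_option and its optional second argument are hypothetical, made up for illustration, and are not part of havp.

```shell
# Hypothetical helper (not part of havp): succeed if MOUNTPOINT is
# mounted with the "mand" option.
# Usage: has_mand_option MOUNTPOINT [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts (assumption: Linux).
has_mand_option() {
    mp=$1
    mounts=${2:-/proc/mounts}
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    awk -v mp="$mp" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "mand") found = 1
        }
        END { exit !found }
    ' "$mounts"
}
```

For example, has_mand_option /var/tmp/havp || echo "mandatory locking is not enabled" would flag a ramdisk that got mounted without the option.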

As noted in the script, this gets called from the startup script for havp (/etc/init.d/havp). I have made a few modifications to the havp startup script provided by its author. First, within the start section, I added a call to my mounting script. Second, I wanted to make the startup script compatible with chkconfig on my Fedora server for easy management of starting and stopping havp on reboots. Finally, the init.d script provided by the author doesn’t really have a working “status” function, so I hacked around to get it like I wanted. Here’s my /etc/init.d/havp:

#!/bin/sh
#
# Provides: havp
# chkconfig: - 87 26
# pidfile: /var/run/havp/havp.pid
# config: /usr/local/etc/havp/havp.config
# Short-Description: starting and stopping HTTP AntiVirus Proxy
# Description: HAVP provides a proxy through which HTTP requests are \
# routed with antivirus on the response sent back to the client
#
####
# This init-script tries to be LSB conform but platform independent.
#
# Therefore check the following two variables to fit to your requests:
# HAVP_BIN HAVP_CONFIG PIDFILE
# Any configuration of HAVP is done in havp.config
# Type havp --help for help and read havp.config you should have received.

HAVP_BIN=/usr/local/sbin/havp
HAVP_CONFIG=/usr/local/etc/havp/havp.config
PIDFILE=/var/run/havp/havp.pid
HAVP=havp

# Return values acc. to LSB for all commands but status:
# 1       generic or unspecified error (current practice)
# 2       invalid or excess argument(s)
# 3       unimplemented feature (for example, "reload")
# 4       user had insufficient privilege
# 5       program is not installed
# 6       program is not configured
# 7       program is not running
# 8-99    reserved for future LSB use
# 100-149 reserved for distribution use
# 150-199 reserved for application use
# 200-254 reserved
# Note that starting an already running service, stopping
# or restarting a not-running service as well as the restart
# with force-reload (in case signaling is not supported) are
# considered a success.

reload_havp()
{
        echo "Reloading HAVP ..."
        PID="`cat $PIDFILE`"
        if [ "$PID" != "" ]; then
                kill -HUP "$PID" >/dev/null 2>&1
                if [ $? -ne 0 ]; then
                        echo "Error: HAVP not running"
                        exit 1
                fi
        else
                echo "Error: HAVP not running or PIDFILE not readable"
                exit 1
        fi
        exit 0
}

case "$1" in
        start)
                echo "Starting HAVP ..."
                # mount ramdisk
                /usr/local/bin/mount-havp-ramdisk.sh
                if [ $? -ne 0 ]; then
                        echo "Error: Could not mount ramdisk; cannot start HAVP"
                        exit 1
                fi
                if [ ! -f $HAVP_BIN ]; then
                        echo "Error: $HAVP_BIN not found"
                        exit 5
                fi
                $HAVP_BIN -c $HAVP_CONFIG
                exit $?
                ;;

        stop)
                echo "Shutting down HAVP ..."
                if [ ! -f "$PIDFILE" ]; then
                  echo "Error: HAVP not running or PIDFILE unreadable"
                  exit 1
                fi
                PID="`cat $PIDFILE`"
                if [ "$PID" != "" ]; then
                        kill -TERM "$PID" >/dev/null 2>&1
                        if [ $? -ne 0 ]; then
                                echo "Error: HAVP not running"
                                exit 1
                        fi
                else
                        echo "Error: HAVP not running or PIDFILE unreadable"
                        exit 1
                fi
                sleep 2
                exit 0
                ;;

        restart)
                echo "Shutting down HAVP ..."
                $0 stop >/dev/null 2>&1
                $0 start
                exit $?
                ;;

        reload-lists)
                reload_havp
                ;;

        force-reload)
                reload_havp
                ;;

        reload)
                reload_havp
                ;;

        status)
                PID="`cat $PIDFILE 2>/dev/null`"
                if [ "$PID" != "" ]; then
                        echo "$HAVP (pid $PID) is running..."
                        exit 0
                else
                        echo "$HAVP not running..."
                        exit 0
                fi
                ;;

        *)
                echo "Usage: $0 {start|stop|status|restart|force-reload|reload|reload-lists}"
                exit 0
                ;;
esac

The majority of this is straight from the provided sample script. With my modifications, a simple command (chkconfig havp on) makes sure that havp comes up with each reboot, and exits cleanly when the system goes down.
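
One limitation of the status section above is that it only looks at the contents of the PID file, so a stale file left behind after a crash still reports havp as running. A stricter check could look roughly like the sketch below. The function name havp_status is hypothetical, and returning 3 for every not-running case is a simplification (3 is the LSB status code for "program is not running"). It also assumes the caller is allowed to signal the havp process, which in practice means running it as root.

```shell
# Hypothetical helper: report whether the process named in PIDFILE is alive.
# Returns 0 if it is running, 3 otherwise.
# Usage: havp_status PIDFILE
havp_status() {
    pidfile=$1
    pid=$(cat "$pidfile" 2>/dev/null)
    # kill -0 delivers no signal; it only checks that the PID exists
    # and that we are permitted to signal it.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "havp (pid $pid) is running..."
        return 0
    fi
    echo "havp not running..."
    return 3
}
```

Called as havp_status /var/run/havp/havp.pid, it prints the same messages as the init script but lets the exit code tell a monitoring script whether havp is really up.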

OK, now to get havp and squid/squidGuard to work together.

For havp, you must edit havp.config and at a minimum enable a scanner. I have enabled the ClamAV Library Scanner, the preferred method for havp. See the havp.config file for details; it is pretty straightforward. Note that the scanner temp files have the default location of /var/tmp/havp/, which is the mount point for my ramdisk. The havp user (which I had to create) must have write permissions on this directory. My mount-havp-ramdisk.sh script takes care of those permissions. Also note that havp listens on its own port (8181 in my setup), meaning that you can run an antivirus proxy off of this port by itself, without squid in the picture at all.
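
To double-check which settings are actually active in havp.config (as opposed to commented out), a small lookup helper can be handy. This is a sketch only: the name havp_setting is hypothetical, it assumes the file uses plain "KEY value" lines with "#" starting a comment, and the key names in the usage note (PORT, TEMPDIR, ENABLECLAMLIB) are the ones I believe the stock havp.config uses, so verify them against your own file.

```shell
# Hypothetical helper: print the active value of KEY from a
# havp.config-style file ("KEY value" lines, "#" starts a comment).
# Usage: havp_setting KEY [CONFIG_FILE]
havp_setting() {
    key=$1
    conf=${2:-/usr/local/etc/havp/havp.config}
    awk -v key="$key" '
        $1 ~ /^#/ { next }           # skip commented-out lines
        $1 == key { value = $2 }     # remember the last active assignment
        END { if (value != "") print value; else exit 1 }
    ' "$conf"
}
```

For instance, havp_setting PORT should print the port that squid's cache_peer line needs to point at, and havp_setting TEMPDIR should print the ramdisk mount point. A non-zero exit status means the key is absent or commented out.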

For squid & squidGuard, it’s a little more complicated. I have based my setup on some notes made on the “Ideas” page for havp, whereby I have a sandwich setup:

Squid —> HAVP —> Squid

The rationale is that content comes into squid, which acts as a local cache for faster retrieval the next time it’s requested. At the same time, squid is configured to pass everything through squidGuard, which acts as a content filter. Squid then passes the content to HAVP for virus scanning. The content then goes back through squid, which pulls it from the local cache (instead of retrieving it again from the internet) for performance reasons. If at any point there is a problem (an inappropriate page as marked by the content filter squidGuard, or a virus as marked by HAVP), the content does not continue through and an appropriate message is sent to the requesting user.

The “ideas” page from the HAVP site lists a sample configuration, which must be tweaked for the latest version of squid. Here’s my /etc/squid/squid.conf:

##################################################
#
# /etc/squid/squid.conf
#
# Sandwich config for HAVP
# From http://server-side.de/ideas.htm
#

# bind port for users
http_port 3128

# Disabling icp
icp_port 0

# scanning through HAVP
cache_peer localhost parent 8181 0 no-query no-digest no-netdb-exchange default

# Memory usage values
cache_mem 64 MB
maximum_object_size 65536 KB
memory_pools off

# 4 GB store on disk
cache_dir aufs /var/spool/squid 4096 16 256

# no store log
cache_store_log none

# Passive FTP off
ftp_passive off

# no X-Forwarded-For header
forwarded_for off

# Speed up logging
#buffered_logs on

# no logfile entry stripping
strip_query_terms off

# Speed, speed, speed
pipeline_prefetch on
half_closed_clients off
shutdown_lifetime 1 second

# don't query neighbour at all
hierarchy_stoplist cgi-bin ?

# And now: define caching parameters
refresh_pattern ^ftp: 20160 50% 43200
refresh_pattern -i \.(jpe?g|gif|png|ico)$ 43200 100% 43200
refresh_pattern -i \.(zip|rar|arj|cab|exe)$ 43200 100% 43200
refresh_pattern windowsupdate.com/.*\.(cab|exe)$ 43200 100% 43200
refresh_pattern download.microsoft.com/.*\.(cab|exe)$ 43200 100% 43200
refresh_pattern -i \.(cgi|asp|php|fcgi)$ 0 20% 60
refresh_pattern . 20160 50% 43200

#
# Access ACLs
#
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
http_access deny to_localhost

acl SSL_ports port 443
acl Safe_ports port 80-81       # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 1025-65535  # unregistered ports
acl CONNECT method CONNECT
acl QUERY urlpath_regex cgi-bin \?
acl HTTP proto HTTP

# Do not scan the following file types (images)
acl noscan urlpath_regex -i \.(jpe?g|gif|png|ico)$

http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports

acl our_networks src 192.168.0.0/24 24.173.221.168
http_access allow our_networks

http_access allow localhost
http_access deny all

# For sandwich configuration we have to disable the "Via" header or we
# get a "forwarding loop".
request_header_access Via deny all
reply_header_access Via deny all

# Do not cache requests from localhost, SSL-encrypted or dynamic content.
no_cache deny QUERY
no_cache deny localhost
no_cache deny CONNECT
no_cache allow all

# Do not forward parent requests from localhost (loop-prevention) or
# to "noscan"-domains or SSL-encrypted requests to parent.
always_direct allow localhost
always_direct allow CONNECT
always_direct allow noscan
always_direct deny HTTP

never_direct deny localhost
never_direct deny CONNECT
never_direct deny noscan
never_direct allow HTTP

# Extras not in havp sample
access_log /var/log/squid/access.log squid
acl apache rep_header Server ^Apache

# Direct through squidGuard
url_rewrite_program /usr/bin/squidGuard
url_rewrite_children 8

The last section of this config file directs requests through squidGuard. Setting up squidGuard takes a little work in itself, but this is fairly well documented on the squidGuard site's install pages.
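
Before pointing squid at squidGuard, it helps to test the filter by hand. Squid talks to a url_rewrite_program over stdin, one request per line, in the form "URL client_ip/fqdn ident method". The sketch below just builds such a line; sg_request_line is a hypothetical name, and the squidGuard.conf path in the trailing comment is an assumption about where a yum install puts it.

```shell
# Hypothetical helper: build one request line in the format squid hands
# to a url_rewrite_program: "URL client_ip/fqdn ident method".
# A "-" stands in for an unknown fqdn or ident.
# Usage: sg_request_line URL CLIENT_IP [METHOD]
sg_request_line() {
    url=$1
    client_ip=$2
    method=${3:-GET}
    printf '%s %s/- - %s\n' "$url" "$client_ip" "$method"
}

# To try it against squidGuard itself (config path is an assumption):
#   sg_request_line http://www.example.com/ 192.168.0.10 \
#       | squidGuard -c /etc/squid/squidGuard.conf -d
```

When piped into squidGuard as in the comment, an empty reply line should mean the request passes unchanged, while a URL in the reply is the redirect target for a blocked page.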

By the way, I installed squid & squidGuard through yum (yum install squid squidGuard). This takes care of making sure the daemons get started at boot time. With my modifications above, havp along with a ramdisk for temp scanning files also start at boot time.

Comments

Ian Robson December 9, 2010

Thanks for posting this guide. Very useful…

Did you see any performance increase by using the ramdisk instead of the hard disk?

ehymel December 9, 2010

Glad this was helpful!

I have not done any benchmarking of ramdisk vs hard disk since I used ramdisk because 1. havp suggested it, and 2. I didn’t have a spare filesystem for scanning.
