For a couple of years I have used havp in my home. Since my web/email server is in my home and already runs clamav as my antivirus, once I learned that I could route all my internet traffic through the havp proxy, it was a no-brainer to add it as another layer of protection for my home computers. Setup was straightforward: I downloaded the source, ran the usual configure/make/make install, and it worked. The only tricky part was setting up mandatory locking on the file system. Initially I had a spare hard drive in the server that I just remounted with mandatory locking and all was fine.
Then a few things happened around the same time. First, I needed that spare hard drive to go into a refurbished computer that went to my father-in-law. Time to use a ramdisk for my havp scanning filesystem. Next, my son (now 10) started using the web more. Time for a filter.
This post will document how I implemented the squidGuard proxy server as a content filter, havp as an antivirus proxy, with a ramdisk as the havp scanning filesystem. Getting the ramdisk to mount with each reboot took some work, so here’s what I did.
First, the ramdisk. Hopefully someone can correct me, but I found that I could not simply put a line in my /etc/fstab for the ramdisk and just have it get mounted with a system reboot. Reboots are pretty rare, but this is the problem… they are so rare that when they do happen, I have to re-learn (a la google) how to create the ramdisk properly, then when starting havp fails I have to re-remember that I have to reset permissions on the mounted ramdisk, etc. Time for a script. Here’s my script, called /usr/local/bin/mount-havp-ramdisk.sh:
#!/bin/bash
# HAVP requires a filesystem with mandatory locks.
# I use a ramdisk for the filesystem, which must be created
# before use by HAVP.
# The script is called from the /etc/init.d/havp startup script,
# and verifies that the ramdisk exists and is mounted, and if not
# it creates it and sets proper permissions.

# Set some variables
RAMDISK=/dev/ram0
MOUNTPOINT=/var/tmp/havp
HAVPUSER=havp

#
# If the ramdisk already exists and is mounted, then no need to continue.
#
MP="`/bin/mount | /bin/grep $RAMDISK`"
if [ "$MP" != "" ]; then
    # ramdisk is mounted; exit with success.
    exit 0
fi

#
# Since the ramdisk is not mounted, we won't assume it exists.
# First we'll create the ramdisk, then mount it with mandatory locking
# and finally set permissions
#
/sbin/mke2fs -q -m 0 "$RAMDISK" && \
/bin/mount -o mand "$RAMDISK" "$MOUNTPOINT" && \
/bin/chown "$HAVPUSER":root "$MOUNTPOINT" && \
/bin/chmod 0750 "$MOUNTPOINT"

exit $?
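Since the whole point of the script is to get a filesystem mounted with mandatory locking, it can be handy to confirm afterwards that the mount really carries the mand option. Here is a minimal sketch of such a check. The function name has_mand_lock and the optional second argument (a mounts table to read instead of /proc/mounts) are my own inventions for illustration, not part of havp or of the script above.

```shell
#!/bin/bash
# Sketch: succeed (exit status 0) if the given mount point is mounted
# with the "mand" option, fail (exit status 1) otherwise.
# has_mand_lock is a hypothetical helper; the second argument lets the
# check read a sample mounts table instead of /proc/mounts.
has_mand_lock() {
    local mountpoint="$1"
    local mounts_file="${2:-/proc/mounts}"
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    awk -v mp="$mountpoint" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "mand") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$mounts_file"
}

# Example use after running mount-havp-ramdisk.sh:
#   has_mand_lock /var/tmp/havp || echo "no mandatory locking on /var/tmp/havp"
```

Comparing the options field entry by entry avoids false matches from a plain grep for "mand" that happens to hit a device name or path.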
As noted in the script, this gets called from the startup script for havp (/etc/init.d/havp). I have made a few modifications to the havp startup script provided by its author. First, within the start section, I added a call to my mounting script. Second, I wanted to make the startup script compatible with chkconfig on my Fedora server for easy management of starting and stopping havp on reboots. Finally, the init.d script provided by the author doesn’t really have a working “status” function, so I hacked around to get it like I wanted. Here’s my /etc/init.d/havp:
#!/bin/sh
#
# Provides: havp
# chkconfig: - 87 26
# pidfile: /var/run/havp/havp.pid
# config: /usr/local/etc/havp/havp.config
# Short-Description: starting and stopping HTTP AntiVirus Proxy
# Description: HAVP provides a proxy through which HTTP requests are \
#              routed with antivirus on the response sent back to the client
#
####
# This init-script tries to be LSB conform but platform independent.
#
# Therefore check the following three variables to fit to your requests:
# HAVP_BIN HAVP_CONFIG PIDFILE
# Any configuration of HAVP is done in havp.config
# Type havp --help for help and read havp.config you should have received.

HAVP_BIN=/usr/local/sbin/havp
HAVP_CONFIG=/usr/local/etc/havp/havp.config
PIDFILE=/var/run/havp/havp.pid
HAVP=havp

# Return values acc. to LSB for all commands but status:
# 1 generic or unspecified error (current practice)
# 2 invalid or excess argument(s)
# 3 unimplemented feature (for example, "reload")
# 4 user had insufficient privilege
# 5 program is not installed
# 6 program is not configured
# 7 program is not running
# 8-99 reserved for future LSB use
# 100-149 reserved for distribution use
# 150-199 reserved for application use
# 200-254 reserved
# Note that starting an already running service, stopping
# or restarting a not-running service as well as the restart
# with force-reload (in case signaling is not supported) are
# considered a success.

reload_havp()
{
  echo "Reloading HAVP ..."
  PID="`cat $PIDFILE`"
  if [ "$PID" != "" ]; then
    kill -HUP "$PID" >/dev/null 2>&1
    if [ $? -ne 0 ]; then
      echo "Error: HAVP not running"
      exit 1
    fi
  else
    echo "Error: HAVP not running or PIDFILE not readable"
    exit 1
  fi
  exit 0
}

case "$1" in
  start)
    echo "Starting HAVP ..."
    # mount ramdisk
    /usr/local/bin/mount-havp-ramdisk.sh
    if [ $? -ne 0 ]; then
      echo "Error: Could not mount ramdisk; cannot start HAVP"
      exit 1
    fi
    if [ ! -f $HAVP_BIN ]; then
      echo "Error: $HAVP_BIN not found"
      exit 5
    fi
    $HAVP_BIN -c $HAVP_CONFIG
    exit $?
    ;;

  stop)
    echo "Shutting down HAVP ..."
    if [ ! -f "$PIDFILE" ]; then
      echo "Error: HAVP not running or PIDFILE unreadable"
      exit 1
    fi
    PID="`cat $PIDFILE`"
    if [ "$PID" != "" ]; then
      kill -TERM "$PID" >/dev/null 2>&1
      if [ $? -ne 0 ]; then
        echo "Error: HAVP not running"
        exit 1
      fi
    else
      echo "Error: HAVP not running or PIDFILE unreadable"
      exit 1
    fi
    sleep 2
    exit 0
    ;;

  restart)
    echo "Shutting down HAVP ..."
    $0 stop >/dev/null 2>&1
    $0 start
    exit $?
    ;;

  reload-lists)
    reload_havp
    ;;

  force-reload)
    reload_havp
    ;;

  reload)
    reload_havp
    ;;

  status)
    PID="`cat $PIDFILE 2>/dev/null`"
    if [ "$PID" != "" ]; then
      echo "$HAVP (pid $PID) is running..."
      exit 0
    else
      echo "$HAVP not running..."
      exit 0
    fi
    ;;

  *)
    echo "Usage: $0 {start|stop|status|restart|force-reload|reload|reload-lists}"
    exit 0
    ;;
esac
The majority of this is straight from the provided sample script. With my modifications, a simple command (chkconfig havp on) makes sure that havp comes up with each reboot and exits cleanly when the system goes down.
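One limitation of the status check in my init script is that it only looks at whether the pidfile has content, so a stale pidfile left behind by a crash would still report havp as running. A possible refinement, sketched below, is to also probe the process with kill -0. The function name havp_status is hypothetical, and the return codes follow what I understand to be the LSB convention for status (0 running, 1 dead but pidfile exists, 3 not running).

```shell
#!/bin/bash
# Sketch: status check that confirms the PID in the pidfile is alive.
# havp_status is a hypothetical helper, not part of the havp distribution.
# Return codes (LSB "status" convention, as an assumption):
#   0 = running, 1 = pidfile exists but process is dead, 3 = not running
havp_status() {
    local pidfile="$1"
    local pid

    # No pidfile at all: treat the service as not running.
    [ -f "$pidfile" ] || return 3

    pid=$(cat "$pidfile" 2>/dev/null)
    [ -n "$pid" ] || return 1

    # kill -0 sends no signal; it only tests whether the process exists.
    # Run as root, otherwise a live process owned by another user
    # also makes kill -0 fail.
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}

# Example use inside the status) branch of the init script:
#   havp_status "$PIDFILE"; exit $?
```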
OK, now to get havp and squid/squidGuard to work together.
For havp, you must edit havp.config and at a minimum enable a scanner. I have enabled the ClamAV Library Scanner, the preferred method for havp. See the havp.config file for details; it is pretty straightforward. Note that the scanner temp files have the default location of /var/tmp/havp/, which is the mount point for my ramdisk. The havp user (which I had to create) must have write permissions on this directory. My mount-havp-ramdisk.sh script takes care of those permissions. Also note that the default port for havp is 8181, meaning that you can run an antivirus proxy off of this port without worrying about squid.
For squid & squidGuard, it’s a little more complicated. I have based my setup on some notes made on the “Ideas” page for havp, whereby I have a sandwich setup:
Squid —> HAVP —> Squid
The rationale is that content comes into squid, which acts as a local cache for faster retrieval the next time it’s requested. At the same time, squid is configured to pass everything through squidGuard, which acts as a content filter. Squid then passes the content to HAVP for virus scanning. The content then goes back through squid, which pulls it from the local cache (instead of retrieving it again from the internet) for performance reasons. If at any point there is a problem (an inappropriate page flagged by the content filter squidGuard, or a virus flagged by HAVP), the content does not continue through and an appropriate message is sent to the requesting user.
The “ideas” page from the HAVP site lists a sample configuration, which must be tweaked for the latest version of squid. Here’s my /etc/squid/squid.conf:
##################################################
#
# /etc/squid/squid.conf
#
# Sandwich config for HAVP
# From http://server-side.de/ideas.htm
#

# bind port for users
http_port 3128

# Disabling icp
icp_port 0

# scanning through HAVP
cache_peer localhost parent 8181 0 no-query no-digest no-netdb-exchange default

# Memory usage values
cache_mem 64 MB
maximum_object_size 65536 KB
memory_pools off

# 4 GB store on disk
cache_dir aufs /var/spool/squid 4096 16 256

# no store log
cache_store_log none

# Passive FTP off
ftp_passive off

# no X-Forwarded-For header
forwarded_for off

# Speed up logging
#buffered_logs on

# no logfile entry stripping
strip_query_terms off

# Speed, speed, speed
pipeline_prefetch on
half_closed_clients off
shutdown_lifetime 1 second

# don't query neighbour at all
hierarchy_stoplist cgi-bin ?

# And now: define caching parameters
refresh_pattern ^ftp: 20160 50% 43200
refresh_pattern -i \.(jpe?g|gif|png|ico)$ 43200 100% 43200
refresh_pattern -i \.(zip|rar|arj|cab|exe)$ 43200 100% 43200
refresh_pattern windowsupdate.com/.*\.(cab|exe)$ 43200 100% 43200
refresh_pattern download.microsoft.com/.*\.(cab|exe)$ 43200 100% 43200
refresh_pattern -i \.(cgi|asp|php|fcgi)$ 0 20% 60
refresh_pattern . 20160 50% 43200

#
# Access ACLs
#
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
http_access deny to_localhost
acl SSL_ports port 443
acl Safe_ports port 80-81 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 1025-65535 # unregistered ports
acl CONNECT method CONNECT
acl QUERY urlpath_regex cgi-bin \?
acl HTTP proto HTTP

# Do not scan the following domains
acl noscan urlpath_regex -i \.(jpe?g|gif|png|ico)$

http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
acl our_networks src 192.168.0.0/24 24.173.221.168
http_access allow our_networks
http_access allow localhost
http_access deny all

# For sandwich configuration we have to disable the "Via" header or we
# get a "forwarding loop".
request_header_access Via deny all
reply_header_access Via deny all

# Do not cache requests from localhost, SSL-encrypted or dynamic content.
no_cache deny QUERY
no_cache deny localhost
no_cache deny CONNECT
no_cache allow all

# Do not forward parent requests from localhost (loop-prevention) or
# to "noscan"-domains or SSL-encrypted requests to parent.
always_direct allow localhost
always_direct allow CONNECT
always_direct allow noscan
always_direct deny HTTP
never_direct deny localhost
never_direct deny CONNECT
never_direct deny noscan
never_direct allow HTTP

# Extras not in havp sample
access_log /var/log/squid/access.log squid
acl apache rep_header Server ^Apache

# Direct through squidGuard
url_rewrite_program /usr/bin/squidGuard
url_rewrite_children 8
The last section of this config file directs requests through squidGuard. Setting up squidGuard takes a little work in itself, but this is fairly well documented on the squidGuard site’s install pages.
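The sandwich only works if the port on squid’s cache_peer line matches the port havp listens on (8181 in my setup). Here is a rough sketch of a sanity check that pulls the parent port out of squid.conf and compares it to an expected value. The function names squid_parent_port and check_parent_port are hypothetical, and the check assumes a single parent cache_peer line like the one in my config.

```shell
#!/bin/bash
# Sketch: print the port squid forwards to via its parent cache_peer.
# cache_peer fields: cache_peer hostname type http-port icp-port [options]
# Both function names are hypothetical helpers for illustration.
squid_parent_port() {
    local conf="${1:-/etc/squid/squid.conf}"
    # Only the first parent cache_peer line is considered.
    awk '$1 == "cache_peer" && $3 == "parent" { print $4; exit }' "$conf"
}

# Sketch: compare the parent port against the port havp is expected on.
check_parent_port() {
    local expected="$1"
    local conf="${2:-/etc/squid/squid.conf}"
    local actual
    actual=$(squid_parent_port "$conf")
    if [ "$actual" = "$expected" ]; then
        echo "OK: squid forwards to parent port $actual"
    else
        echo "Mismatch: squid forwards to '$actual', expected $expected" >&2
        return 1
    fi
}

# Example use:
#   check_parent_port 8181 /etc/squid/squid.conf
```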
By the way, I installed squid & squidGuard through yum, as in <code>yum install squid squidGuard</code>. This takes care of making sure the daemons get started at boot time. With my modifications above, havp along with a ramdisk for temp scanning files also start at boot time.
Comments
Thanks for posting this guide. Very useful…
Did you see any performance increase by using the ramdisk instead of the hard disk?
Glad this was helpful!
I have not done any benchmarking of ramdisk vs hard disk since I used ramdisk because 1. havp suggested it, and 2. I didn’t have a spare filesystem for scanning.