"Anything that can go wrong, will - at the worst possible moment." I've mentioned Finagle's Law of Dynamic Negatives a few times earlier, and as I also mentioned; it's always there. So, when I decided, or rather attempted, to spice up my server a bit by tweaking some kernel parameters that handle TCP/IP traffic, it just had to go wrong!
So here's what happened: I had been reading up a little on improving latency, and on my local network it all seemed to work perfectly. Among other things I fiddled with the Maximum Segment Lifetime timeout, Explicit Congestion Notification and the time-out values for ACK, SYN and FIN packets in the firewall. But just because something works great on a finely tuned LAN doesn't mean that it will work great on the internet.
So when a couple of search engines visited my site and started sniffing around, shit started to hit the fan. For some reason, presumably the tweaks I did to the firewall and TCP/IP stack, communication between my server and the search engines got disrupted, and a lot of TCP packets were sent back and forth trying to (re-)establish communication. The result was that my server was pulling an average of 21 megabits per second for about 4 hours, burning up a substantial part of my monthly data bundle in the process. And the biggest hoot is that during that time the server registered only 147 page views.
Needless to say, I wasn't very happy with it burning through 96GB of useless traffic thanks to my "improvements". The phrase "If it's not broken, don't try to fix it" comes to mind. I've undone all the tweaks and will consider it a lesson learned.
*sigh* Yeah...
It's a common sight and source of annoyance for many system administrators: people attempting to do brute force hacking on your server. Often the /var/log/auth.log file will contain a truckload of error messages and authentication failures.
A common error message would be the "reverse mapping" message:
Feb 15 17:41:58 ams01 sshd[1516]: reverse mapping checking getaddrinfo for user.145.126.222.zhong-ren.net [222.126.145.202] failed - POSSIBLE BREAK-IN ATTEMPT!
Another variant is that the reverse DNS lookup doesn't match the forward DNS:
Feb 19 00:55:28 ams01 sshd[32031]: Address 212.156.126.210 maps to 212.156.126.210.static.turktelekom.com.tr, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
Or the log file contains a long chain of error messages where someone is trying to log on with a single user, using a vast array of passwords:
Feb 22 12:01:24 ams01 sshd[77812]: error: PAM: authentication error for root from 60.54.248.46
Feb 22 12:01:27 ams01 sshd[77815]: error: PAM: authentication error for root from 60.54.248.46
Feb 22 12:01:31 ams01 sshd[77818]: error: PAM: authentication error for root from 60.54.248.46
And in another scenario, the log file could contain a single IP that is trying to log on with a variety of usernames and passwords:
Feb 23 13:53:01 ams01 sshd[473]: Invalid user db2inst1 from 222.174.35.3
Feb 23 13:53:10 ams01 sshd[477]: Invalid user prueba from 222.174.35.3
Feb 23 13:53:19 ams01 sshd[481]: Invalid user postgres from 222.174.35.3
There are many variants of brute force hacking, and they can be annoying as hell, especially if your server sends you a daily security report containing all the log file entries. I don't know about you, but seeing all those hacking attempts makes me nervous... what if one of them succeeds? Given enough time, one of them has to come up with my username/password combination... I don't like that idea at all! Fortunately, there are things like bruteblock. Bruteblock is a little program that does just what its name says: it blocks brute force hacking attempts (Yay!).
The way it works is pretty simple: it reads each log entry, checks it against some regular expressions, and if it registers X matches within Y seconds, it adds the offending IP address to a firewall table so that it gets blocked. Of course this could lead to some problems, because some ISPs hand out dynamic IP addresses, so you could be blocking half a country after a while. Bruteblock solves this by storing a timestamp alongside each entry in the firewall table. A small daemon that runs in the background monitors the table and removes the blockade after a set amount of time.
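To make the "X matches within Y seconds" idea a bit more tangible, here is a minimal Python sketch of that kind of bookkeeping. It is not bruteblock's actual code: the class, method and constant names are made up for illustration, and the numbers are simply the defaults mentioned in this article (4 matches, 60 seconds, 600 seconds of blocking).

```python
from collections import defaultdict, deque

MAX_COUNT = 4      # matches needed to trigger a block (hypothetical name)
WITHIN_TIME = 60   # seconds in which those matches must occur
BLOCK_TIME = 600   # seconds before the block is lifted again


class BruteBlockSketch:
    """Toy model of 'X matches within Y seconds' blocking with expiry."""

    def __init__(self):
        self.hits = defaultdict(deque)  # ip -> timestamps of recent matches
        self.blocked = {}               # ip -> time at which the block expires

    def record_match(self, ip, now):
        """Register one matching log entry; return True if ip gets blocked."""
        if ip in self.blocked:
            return False
        window = self.hits[ip]
        window.append(now)
        # Drop matches that have fallen out of the time window.
        while window and now - window[0] > WITHIN_TIME:
            window.popleft()
        if len(window) >= MAX_COUNT:
            self.blocked[ip] = now + BLOCK_TIME
            del self.hits[ip]
            return True
        return False

    def expire(self, now):
        """Roughly what a cleanup daemon would do: lift blocks whose time is up."""
        for ip in [ip for ip, until in self.blocked.items() if until <= now]:
            del self.blocked[ip]


blocker = BruteBlockSketch()
# Four matches from the same (documentation-range) address within ten seconds.
results = [blocker.record_match("192.0.2.10", t) for t in (0, 3, 7, 10)]
print(results)
blocker.expire(10 + BLOCK_TIME)
print(blocker.blocked)
```

The fourth match inside the window is the one that triggers the block, and the expiry step is what keeps a dynamic IP address from staying blocked forever.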
The first step is to tell your logging facility that it needs to pipe the authentication messages to bruteblock. This is easily done by adding a single line to /etc/syslog.conf:
auth.info;authpriv.info |exec /usr/local/sbin/bruteblock -f /usr/local/etc/bruteblock/ssh.conf
We don't want to restart the syslog daemon just yet, because we still have to determine what bruteblock should check and how it should handle matches against the regular expressions. /usr/local/etc/bruteblock/ssh.conf contains some pre-fabricated regular expressions for common sshd authentication failures. To get rid of even more log file spam and brute force attempts we will add an additional two regular expressions:
regexp4 = sshd.*reverse mapping checking getaddrinfo for \S+ \[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\] failed - POSSIBLE BREAK-IN ATTEMPT!
regexp5 = sshd.*Address (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) maps to \S+, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
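Before restarting anything, it doesn't hurt to check that these two expressions actually match the log lines quoted earlier. The sketch below does that with Python's re module, which is an assumption of convenience: it is not the regex engine bruteblock itself uses, so treat it as a sanity check only. The helper name extract_ip is made up for this example.

```python
import re

# The two patterns added to ssh.conf, copied from this article.
REGEXP4 = re.compile(
    r"sshd.*reverse mapping checking getaddrinfo for \S+ "
    r"\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\] failed - POSSIBLE BREAK-IN ATTEMPT!"
)
REGEXP5 = re.compile(
    r"sshd.*Address (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) maps to \S+, "
    r"but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!"
)

# Sample entries taken from the auth.log excerpts earlier in the article.
SAMPLES = [
    "Feb 15 17:41:58 ams01 sshd[1516]: reverse mapping checking getaddrinfo "
    "for user.145.126.222.zhong-ren.net [222.126.145.202] failed - "
    "POSSIBLE BREAK-IN ATTEMPT!",
    "Feb 19 00:55:28 ams01 sshd[32031]: Address 212.156.126.210 maps to "
    "212.156.126.210.static.turktelekom.com.tr, but this does not map back "
    "to the address - POSSIBLE BREAK-IN ATTEMPT!",
]


def extract_ip(line):
    """Return the captured IP address if either pattern matches, else None."""
    for pattern in (REGEXP4, REGEXP5):
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None


ips = [extract_ip(line) for line in SAMPLES]
print(ips)
```

The single capture group in each pattern is the IP address, which is the part that needs to end up in the firewall table.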
The configuration file also contains settings for how many matches within how many seconds it should detect before blocking an IP address, how long the blockade lasts, and which firewall table to use for the IP addresses. The default duration is 600 seconds (10 minutes), which we will change to 84600 seconds (23.5 hours; a full 24 hours would be 86400). The other settings can be left at their defaults: 4 matches in 60 seconds will trigger action, and the IP addresses will be added to table 1.
Next, we'll want to decide what to do with the IP addresses that are trying to do brute-force hacking on us. We'll add a few rules to /etc/rc.firewall:
## Block SSH brute force scanners.
##
${fwcmd} add deny all from me to "table(1)"
${fwcmd} add deny all from "table(1)" to me
We'll want to place these rules as early as possible in the chain of rules for performance reasons; I'll write a separate article on rule chains sometime in the (near) future. Next we can either restart the entire firewall, or just add the two rules manually with an explicit rule number; just be sure to add the rules to the firewall script as well, so they don't get lost at the next reboot.
All right, after we've added the rules to the firewall, we can restart syslogd and start bruteblockd (the daemon that handles the cleanup for the firewall table):
/etc/rc.d/syslogd restart
/usr/local/etc/rc.d/bruteblockd start
And that's pretty much all there is to it. After a few days we can take a peek at /var/log/auth.log and see something like:
Feb 28 07:39:21 ams01 sshd[96155]: Did not receive identification string from 220.172.191.31
Feb 28 07:43:24 ams01 sshd[96253]: Invalid user arun from 220.172.191.31
Feb 28 07:43:27 ams01 sshd[96257]: Invalid user arun from 220.172.191.31
Feb 28 07:43:31 ams01 sshd[96261]: Invalid user arun from 220.172.191.31
Feb 28 07:43:34 ams01 sshd[96265]: Invalid user arun from 220.172.191.31
Feb 28 07:43:34 ams01 bruteblock[95217]: Adding 220.172.191.31 to the ipfw table 1
Great, it looks like it works! After four matching entries within a minute, bruteblock adds the IP address to the firewall table. A quick peek at the firewall table shows that the IP address has been added, including a timestamp:
ams01# ipfw table 1 list
124.95.128.162/32 1330485366
193.34.145.55/32 1330527853
200.113.185.227/32 1330491753
213.165.165.130/32 1330487447
220.172.191.31/32 1330496014
And the firewall rules also confirm that traffic has been blocked:
ams01# ipfw -a list | grep table
01200 210 103040 deny ip from me to table(1)
01300 152 14488 deny ip from table(1) to me
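The number next to each address in the table listing looks like a Unix timestamp. The sketch below decodes the one for 220.172.191.31 under two assumptions that the article itself doesn't spell out: that the value is the moment the block expires, and that the block duration is the 84600 seconds configured earlier. The function name describe is made up for this example.

```python
from datetime import datetime, timedelta, timezone

BLOCK_SECONDS = 84600  # block duration configured earlier in this article

# Values copied from the 'ipfw table 1 list' output.
table = {
    "124.95.128.162": 1330485366,
    "193.34.145.55": 1330527853,
    "200.113.185.227": 1330491753,
    "213.165.165.130": 1330487447,
    "220.172.191.31": 1330496014,
}


def describe(ip):
    """Return (assumed time of blocking, assumed expiry time) in UTC."""
    expires = datetime.fromtimestamp(table[ip], tz=timezone.utc)
    added = expires - timedelta(seconds=BLOCK_SECONDS)
    return added, expires


added, expires = describe("220.172.191.31")
print("blocked at (assumed):", added.isoformat())
print("expires at (assumed):", expires.isoformat())
```

Under those assumptions the block would have been added at 06:43:34 UTC on February 28th, which lines up with the 07:43:34 log entry if the server logs in Central European Time (also an assumption).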
So with a few simple steps, brute force hackers will have to be extremely patient and switch IP addresses every 4 attempts. It won't stop hackers completely, but it will definitely make it a bit harder for them.
About 3 years ago I migrated from FreeBSD to Windows 2008, because I was unhappy with the manageability. Most of this dissatisfaction with the manageability was due to my own lack of effort and interest at that time. Last year I switched back to FreeBSD with a renewed passion: this time I would make it work; and I did!
My server has been running flawlessly for almost a year on FreeBSD 8.2, and I'm very satisfied. There is however always room for improvement, and optimizing stuff is somewhat of a hobby (though some might call it an obsessive-compulsive disorder) of mine. Recently FreeBSD 9.0 was released, with a couple of new features that really interest me. I could just upgrade to this newer version on the fly, but I've decided to drive to the datacenter and install it from scratch.
The main reason for this is that FreeBSD 9.0 has a big improvement to the file system which I'm very keen on using. However, updating a file system "on the fly" is just something that I'm not willing to do just yet. So tomorrow is road-trip time. If I've got the time, I might take some pictures or maybe record some video while I'm at the datacenter.
With the new version of the operating system I'm also making some changes to the way stuff works on my server. These changes will include switching to high-performance web-server software, tightening up security even further without compromising performance, rewriting some code for the website and adding a decent CMS for my site. I might make tutorials, editorials, etc. for each of those individually though.
But first things first: Road-trip tomorrow!
Like most other system administrators, I put a lot of value in having a stable server. Unfortunately it is always possible that, for whatever reason, your server "hangs" and becomes unresponsive. One of the most common reasons is a Denial of Service attack (and sometimes bugged anti-virus software) which generates 100% CPU usage and causes your server to become unresponsive.
To prevent stuff like this from happening, something called a watchdog was invented. The basic principle is real simple: the watchdog has to be reset within X seconds, or else the system will reboot. FreeBSD has support for both hard- and software based watchdogs. Since my server has an Intel ICHxx chipset, I logically opted for the hardware based solution.
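The principle can be sketched in a few lines of Python. This is only a toy simulation, not how the ichwd driver or watchdogd work internally: the class and function names are made up, the 16-second timeout is an arbitrary example value, and instead of rebooting anything it just reports the second at which the watchdog would have fired.

```python
class WatchdogSketch:
    """Toy countdown: if nobody pats the dog in time, it 'reboots'."""

    def __init__(self, timeout, now=0):
        self.timeout = timeout
        self.last_pat = now

    def pat(self, now):
        """Reset the countdown; this is the job of the watchdog daemon."""
        self.last_pat = now

    def expired(self, now):
        """True once more than `timeout` seconds passed since the last pat."""
        return now - self.last_pat > self.timeout


def simulate(pat_times, timeout, until):
    """Return the second at which the watchdog fires, or None if it never does."""
    dog = WatchdogSketch(timeout)
    pats = set(pat_times)
    for second in range(until + 1):
        if second in pats:
            dog.pat(second)
        if dog.expired(second):
            return second
    return None


# A healthy system pats the watchdog every 5 seconds for a full minute.
healthy = simulate(pat_times=range(0, 60, 5), timeout=16, until=60)
# A hung system stops patting after 10 seconds.
hung = simulate(pat_times=[0, 5, 10], timeout=16, until=60)
print(healthy, hung)
```

As long as the pats keep coming, nothing happens; once they stop, the countdown runs out and a real watchdog would reset the machine.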
Before making permanent changes to my kernel, with the possibility of wrecking my server, I had to determine if my server would actually support the interface. Since my server has an elevated kernel security level I first had to reboot it with level 0 security before being able to load kernel modules:
ams01# kldload ichwd
Nothing happened, the world did not implode on itself, my server did not suddenly reboot itself; this was a good sign. Fetching a list of the loaded kernel modules confirmed that the module was in fact loaded:
ams01# kldstat
Id Refs Address            Size     Name
 1    7 0xffffffff80100000 6abc20   kernel
 2    1 0xffffffff807ac000 8b8      accf_data.ko
 3    1 0xffffffff807ad000 1580     accf_http.ko
 4    1 0xffffffff807af000 3818     ichwd.ko
And consequently, a quick peek in the kernel boot messages also told me that the interface was recognized and supported:
ams01# dmesg | grep ichwd
ichwd module loaded
ichwd0: on isa0
ichwd0: Intel ICH9R watchdog timer (ICH9 or equivalent)
Excellent! Of course, loading a kernel module manually means that it won't be loaded anymore after the next reboot (and I still had to reboot the server to restore the kernel security level). I had two options now: either compile a new kernel with the ichwd device enabled, or tell the system to load the kernel module at boot time. I decided to go for the second option:
ams01# echo 'ichwd_load="YES"' >> /boot/loader.conf
Once I update the system to a newer release of FreeBSD, I have to compile a new kernel anyway, but for now this will do just fine. The next step was to enable the watchdog daemon that will be doing the polling:
ams01# echo 'watchdogd_enable="YES"' >> /etc/rc.conf
ams01# /etc/rc.d/watchdogd start
I let the server run for a few minutes and nothing happened; which is good... it should only do something if something is wrong, after all. Since I had to reboot the server anyway to restore the kernel security level, and I wanted to see what would happen if something did go wrong, I killed the watchdogd process and waited. A few seconds later, suddenly my SSH connection was terminated. About 30 seconds later I received a text message on my phone that the server had rebooted itself.
Well well... It seems to work just fine! I sincerely hope that I never actually have to use this failsafe though.
It took some blood, sweat, tears and a lot of gasoline, but we're back on the air, and we're cruising on FreeBSD!
After postponing, delaying and deferring the issue for quite a bit of time, it was getting kind of embarrassing to put off the migration, and the worst part was that I didn't have an excuse not to do it. I had picked a date in my agenda to do the actual migration, which was on a Friday. But on Thursday I was bored, and decided to do it one day earlier. That decision may or may not have been rushed by the fact that my server was having yet another issue with the virus scanning software.
I downloaded FreeBSD-8.2-RELEASE-amd64-disc1.iso, made a final backup of my server data and got ready to make my way to the datacenter where the server is hosted. You can enter the datacenter 24/7, but they do require you to register on a website so they know who is coming. While trying to register I got an error on the website. I emailed the hosting company that I was unable to register on the website, but that I was en-route and would need access to the datacenter.
When I got to the datacenter and tried to log in, the system said there was no registration for me and therefore it could not let me in. I called the hosting company's helpdesk to ask why they hadn't arranged for access. The guy on the phone said that they had fixed the problem that was preventing me from registering, and that I should be able to register now. I told him that I was already at the datacenter, and asked if he could register access for me. He told me that they're not allowed to do that, and suggested that I use my smart phone to register. I told him that I had already tried that, but the website didn't work because it redirected to some kind of status page as soon as it detected that I was using a smart phone instead of a desktop pc. After some arguing with the helpdesk about how I would get access to the server without having to drive back to my home or harass Daniel at work, the security guard of the datacenter offered me use of his private laptop to register for access. Some bro-fists were exchanged and I was finally able to go inside.
I hooked up my USB CD-ROM player to the server, and made it boot from CD... or so I thought! While trying to boot, it got stuck halfway through loading the kernel. Switching USB ports, rubbing the CD; none of it seemed to help. Man, I was pissed! But I also facepalmed, because I had neglected to check if the CD was working before driving off to the datacenter. I bro-fisted the security guard again, told him I would be back in a bit, and drove back home grumpy and hungry.
Back home I downloaded FreeBSD-8.2-RELEASE-amd64-bootonly.iso to save some time. I double and triple checked that the CD was booting and working properly. A quick bite later I was on my way back to the datacenter. I hooked up the CD-ROM player to the server again and... it got stuck halfway through loading the kernel again! Needless to say, a small mushroom cloud would have manifested itself above the datacenter. I looked around the datacenter to see if someone else was there. I got lucky; some American guy was working on a couple of servers and had a CD-ROM with him that I was able to borrow for a few minutes. Unfortunately, it gave the same result as with my own CD-ROM.
After cooling down a bit, I decided to bring the server home to figure out what the deal was. The brand of CD-Rs, a driver issue, a BIOS configuration issue, the ISOs being broken... It could be a lot of different things. Back at home I decided to download FreeBSD-8.2-RELEASE-amd64-memstick.img and try to boot from a USB memory stick instead, which worked perfectly the first time; man, I was relieved! Since it was already late I decided to continue the next morning.
The next day, everything went as planned. I installed FreeBSD on the server, did some minimal configuration so that I would at least be able to receive some email, compiled a custom kernel and drove back to the datacenter to shove the server back in the rack. The rest of the weekend I spent tweaking the configuration and debugging some PHP scripts to fix case-sensitive pathnames, etc.
Over the next few days or weeks I will probably need to do some minor tweaks, but right now I have everything running pretty much the way I wanted, and couldn't be more happy with it. It's so nice when everything works out the way you had it in mind.
During the "downtime" caused by the Kaspersky update, I started to browse for alternatives. One that caught my eye was ClamAV, an open-source virus scanner for UNIX systems, although there is also a Windows port available. As I was browsing through the options and features, an idea sparked in my mind: a memory of an old love popped up, so to speak.
I tried to dismiss the idea but it kept haunting me, and eventually I surrendered to the unspoken desire: I wanted my old love back, no matter what it takes.
In the last week of January 2011, version 8.2 of the FreeBSD operating system will be released. You might wonder why I'm mentioning this on November 2nd, but there is a reason. Basically I've got 3 months to freshen up my UNIX skills, convert my sites and services so that they can work with FreeBSD and work out some new stuff. I've installed version 8.1 on my laptop, which will serve as a staging / development template.
I've added a link in the menu to give an overview of the project status. I've done a lot of research, and none of the issues that made me decide to migrate to Windows in January 2009 are a problem anymore. Maybe I was just lazy back then, or maybe I was just tired of doing the research... Whatever the real reasons were, they're in the past. My love for the FreeBSD operating system is revitalized and stronger than ever. After 2 years of Windows, we're going back to FreeBSD!
Apparently something is wrong with the latest update from Kaspersky Anti-virus, because over the last few days the CPU load on my server has skyrocketed to 80-100% on average. This is caused by two worker processes from Kaspersky Anti-virus (kavfswp.exe) that take up 40-50% each. I've never had this problem before, and reinstalling the software temporarily fixes it, but as soon as an update cycle for the anti-virus definitions kicks in, it starts all over again.
I'm not too happy with my server having high load. Aside from slowing down my websites, it also consumes more power and I don't know how happy the datacenter is with that. Technically I'm allowed to use 400mA for the server, but due to this nice CPU load bug it's been pumping 464mA. Some searching on Google only told me that in 2009 there was a similar problem. It was caused by an error in the anti-virus definitions and it was solved a week later when Kaspersky released new anti-virus definitions. I hope it's a similar issue, and that it will be fixed soon.
I could disable the anti-virus for the time being, but I don't know if that's such a good idea. Sure, I'm the only one that uploads files to it; but still... I don't like the idea of using an unprotected server. I've temporarily disabled videos till the problem is solved.
Update October 25th, 2010 - 12:19
It seems that I'm not the only one with this problem, judging by this thread on the Kaspersky support forums. Kaspersky promised to release an update that fixes the problem later on today.
Update October 25th, 2010 - 14:47
*phew* The update seems to have solved the problem.
I'm all set. This next Saturday (July 25th, 2009), I will be moving my new web server from the "staging area" (read: my bedroom) to the data center in Amsterdam. Sunday (July 26th, 2009) the old server in Canada will be powered down and dismantled.
Aside from departing from the server in Canada, I will also be departing from Xoops (the CMS that I've been using for 2 years now). I've decided to write my own website code, for a couple of reasons: security, speed and size (also related to speed I guess...).
The more code you have, the slower a site is, and the more can go wrong. Xoops is a very large CMS, with a lot of functionality (most of which I don't use). If I write a minimalistic CMS myself, with just the things that I use, it should - in theory - make the site smaller, faster and more secure.
So... this Saturday my server will go online, but my website will be offline for a while until I've made a basic blog module.
Wish me luck!
Today, my black magic woman arrived. Unfortunately, it turns out that the 2x1GB memory I had arranged doesn't fit. But with the 1GB that was in the server and 4GB extra... it's still 5 GB and that's still plenty for a webserver.
Some snapshots for your viewing pleasure:
*purr*
As some of you might or might not know, I rent a dedicated server at iWeb in Ontario, Canada. They have some decent deals going on, and starting at USD 69,- a month you can rent your own server. That is... if you stick to 1GB RAM and either Linux or FreeBSD. If you want a different operating system, you have to pay extra, and if you want more memory, you also have to pay... extra. In my case it ended up at USD 109 a month for an Intel Celeron D 3.0 GHz with 2GB RAM and a 300 GB IDE hard disk, equipped with Windows 2003 Standard Edition. The package includes a 10 Mbps uplink and 1 TB of traffic per month.
This is an average price for renting a dedicated server, but it always struck me as odd that I have to pay USD 10,- a month for a 1 GB memory module that costs € 15,- at my workplace. Time for a change, I thought... so I did a bit of research and found out that for € 49,- per month, I can colocate my own server on a 100 Mbps uplink with 1 TB of traffic per month at Trans|ip, the same company where I have my domains registered. It's not rocket science to see that it would save me about € 35,- to € 50,- per month, depending on the exchange rate of course.
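For the curious, the dependence on the exchange rate can be worked out with a couple of lines of Python. The two monthly prices come straight from this article; the function names are made up, no actual exchange rate is assumed, and the one-off cost of buying a server is left out of the picture.

```python
OLD_USD_PER_MONTH = 109.0  # rented dedicated server, from this article
NEW_EUR_PER_MONTH = 49.0   # colocation, from this article


def monthly_saving_eur(eur_per_usd):
    """Saving in euros per month for a given EUR-per-USD exchange rate."""
    return OLD_USD_PER_MONTH * eur_per_usd - NEW_EUR_PER_MONTH


def rate_for_saving(saving_eur):
    """Exchange rate (EUR per USD) at which the monthly saving equals saving_eur."""
    return (NEW_EUR_PER_MONTH + saving_eur) / OLD_USD_PER_MONTH


low = rate_for_saving(35)
high = rate_for_saving(50)
print(round(low, 3), round(high, 3))
```

In other words, a saving between € 35,- and € 50,- per month corresponds to one dollar being worth somewhere between roughly € 0,77 and € 0,91.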
I would get 10 times the uplink speed for less money. The only problem is that I didn't have my own server. I had already decided that I wanted a certain minimal configuration:
- Dual Core processor
- 4 GB memory
- 2 Hard Disks in raid 1 configuration
First I checked the website of my employer, Aces Direct, of course. Unfortunately, the cheapest server that met my desired specifications was way over the budget that I had in mind. The problem is that most servers are sold without disks, and server hard disks are a bit more expensive than consumer hard disks.
Next I ended up at a company that sells reasonably cheap web servers. For € 399,- (ex taxes of course) you can get a simple server with either an AMD Athlon 64, AMD Sempron 64, or Intel Dual Core E2220 processor. A bit of research told me that neither of the AMD's was Dual Core, and that the Intel was a first generation Dual Core processor and had performance that was comparable to a single core processor.
A co-worker told me that one of our suppliers might have something in stock. On our website we obviously prefer to sell the latest models, but the suppliers might have an older model on the shelf somewhere. A quick e-mail here and a phone call there told me that indeed one of our suppliers had some older models on the shelf that might fit within my budget. The price would be comparable to the cheap web server with the AMD or Intel Dual Core processor, but it would be a Hewlett Packard or IBM. Of course I feared that it would end up way too pricey again due to the more expensive hard disks, but the supplier told me not to worry about it.
A bit of haggling and ass kissing later, I had made a very nice deal on my new server, which should arrive next Monday or Tuesday. I went a bit over my planned budget, but also managed to make a deal with my boss so that I can pay for the server in parts. I'll pay half of the server in cash (which is well within my budget), and half of it will be deducted from my salary in 3 parts. So what did I get?
From a supplier I managed to get, for € 819,91 including taxes:
- Hewlett Packard Proliant DL 320 G5P with Intel Xeon 2.66 GHz processor.
- 4 GB memory upgrade kit.
- 2x 250 GB hard disk (hot plug)
Via another channel I also managed to get a 64-Bit Windows 2008 Web Edition license and an additional 2GB of memory for free (the guy owed me a favor :P). I think going from a Celeron D 3GHz with 2GB ram and IDE hard disk to a Dual Core Xeon 2.66GHz with 6 GB ram and raid 1 hard disks is quite a nice upgrade, and after the server has been paid off, I will save money and have more performance.
My server should arrive Monday or Tuesday, so stay tuned!