The most effective way to prevent bots from spamming your server is to drop them at the firewall. This is generally achieved using tools like Denyhosts or fail2ban, which monitor your logs, identify suspicious activity, and block the offending IP addresses before they cause harm.
Denyhosts works at the application level by adding entries to /etc/hosts.deny, whereas fail2ban operates at the firewall level using iptables, which makes it far more efficient.
However, on resource-constrained machines, fail2ban can still be taxing. A few years ago, we shared a demo of a lightweight log parser called banbylog, tailored specifically for our needs (SSH and WordPress activity monitoring) at a much lower resource cost. If that sounds like a good fit, feel free to check it out, but keep in mind that it’s not production-ready!
While the consensus is to parse logs and block hostile IPs at the firewall, it will not work under the Cloudflare umbrella!
That's because the IPs that are "attacking" your server are not the actual offending IPs, but Cloudflare machines that are proxying the requests to your server. Take a look at the diagram below:
Cloudflare acts as a middleman between your server and the users. The only IP addresses visible to your firewall are from Cloudflare, not from the original user.
In fact, you should explicitly whitelist the Cloudflare IP range. If you happen to block an IP that belongs to Cloudflare, legitimate users will see your site as down.
😱 While this article's focus is on securing servers while working under a Content Distribution Network, Cloudflare offers a variety of tools - free and paid - you may leverage to achieve these same results.
Every request Cloudflare sends to your server carries a header, CF-Connecting-IP, that holds the user's original IP. And that's what we can leverage to get them! Unfortunately, HTTP headers sit too far up the stack for iptables to read, so HAProxy comes to the rescue!
While iptables operates at Layer 4, HAProxy can operate at OSI Layer 7.
💡 Remember the OSI model:
Layer 7 - Application; protocols (HTTP, ...)
Layer 6 - Presentation; character encoding (ASCII, UTF8, ...)
Layer 5 - Session; stick client to server
Layer 4 - Transport; protocols (TCP, UDP, ...)
Layer 3 - Network; routing protocols (IP)
Layer 2 - Data Link; physical to network (ARP, Ethernet)
Layer 1 - Physical; cabling, Wi-Fi
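To make the Layer 4 versus Layer 7 difference concrete, here is a minimal sketch of the same block expressed in both HAProxy modes. The frontend and backend names, ports, and the 192.0.2.10 address are illustrative assumptions, not part of the setup described in this article.

```haproxy
# Layer 4 (mode tcp): HAProxy only sees the TCP connection, so the only
# address it can match is the peer's (Cloudflare's, when proxied)
frontend l4_example
    mode tcp
    bind *:8081
    tcp-request connection reject if { src 192.0.2.10 }
    default_backend l4_app

# Layer 7 (mode http): the request is parsed, so a header can be matched
frontend l7_example
    mode http
    bind *:8080
    http-request deny if { req.hdr_ip(CF-Connecting-IP) 192.0.2.10 }
    default_backend l7_app

# hypothetical backends, only here so the sketch is complete
backend l4_app
    mode tcp
    server app1 127.0.0.1:9001

backend l7_app
    mode http
    server app1 127.0.0.1:9002
```

In tcp mode the header is never parsed, which is the same limitation iptables has; only the http mode frontend can act on the original visitor's IP.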
The easiest way to test HAProxy configurations is to boot a Docker instance of HAProxy:
docker run --name haproxy -p 888:80 -v ${pwd}:/usr/local/etc/haproxy --sysctl net.ipv4.ip_unprivileged_port_start=0 haproxy:2.3
Start HAProxy listening on host port 888.
The above assumes you're using PowerShell and are in the directory that contains haproxy.cfg. Change ${pwd} accordingly if not.
docker kill -s HUP haproxy
This command sends HAProxy a HUP signal, forcing it to reload its configuration, which is useful when iterating over different configs.
Below is a straightforward haproxy.cfg that will put HAProxy listening on port 80, log to stdout so you can get an instant glimpse of what's going on, and also print the CF-Connecting-IP header.
global
    # output logs straight to stdout
    log stdout format raw daemon debug
defaults
    log global
    # put HAProxy in Layer 7 mode, allowing it to inspect the HTTP protocol
    mode http
frontend main
    bind *:80
    # capture original IP from header and put it in variable txn.cf_conn_ip
    http-request set-var(txn.cf_conn_ip) hdr(CF-Connecting-IP)
    # get the original IP on logs (just to check if things are working)
    log-format "%ci\ %hr\ %ft\ %b/%s\ %Tw/%Tc/%Tt\ %B\ %ts\ %r\ %ST\ %Tr CF-IP:%{+Q}[var(txn.cf_conn_ip)]"
    # use backend ok, which always returns a 200, for debug
    use_backend ok
backend ok
    http-request return status 200 content-type "text/plain" lf-string "ok"
This should be enough to test HAProxy.
Let's make a test request to HAProxy.
I'm partial to Bruno, a portable and offline alternative to Postman. Download it, add the CF-Connecting-IP header, and make a POST request to http://localhost:888/wp-login.php to test if everything's working.
If things went as planned, your HAProxy should have printed this:
172.17.0.1 main ok/<NOSRV> -1/-1/0 78 LR POST /wp-login.php HTTP/1.1 200 -1 CF-IP:"222.222.222.222"
Now that we have the offending IP (222.222.222.222), we can block it!
In our particular case, bots are hitting wp-login.php, xmlrpc.php and xmrlpc.php (the last one is a typo on the bots' side, but we've had more than 100k hits on it in the last 24h!). We also know that they're flooding the server with POST requests, trying to brute-force passwords.
frontend main
    # requests that will be monitored and blocked if abused
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST
Add this to the end of the frontend main block.
We now have a couple of options:
a) Now that we have the offending IPs in the log, we could change banbylog to write them to a file and have HAProxy deny those requests. While this would work, HAProxy would need to be reloaded every time the file changes.
b) Or we may simply leverage HAProxy stick tables and rate-limit the offending requests.
While "a" would allow us to ban the offending IP for an indefinite amount of time, "b" has fewer moving pieces.
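For comparison, option "a" could look roughly like the sketch below. The file path and the is_banned ACL name are assumptions made for illustration; the file would hold one IP or CIDR per line and would be written by an external tool such as a modified banbylog.

```haproxy
frontend main
    # hypothetical deny list: one IP or CIDR per line, maintained externally
    acl is_banned req.hdr_ip(CF-Connecting-IP) -f /usr/local/etc/haproxy/banned.lst
    # refuse every request whose original IP is on the list (403 by default)
    http-request deny if is_banned
```

The file is read when the configuration is loaded, which is why this approach depends on reloading HAProxy whenever the list changes. It also has to exist at startup, even if empty.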
frontend main
    # requests that will be monitored and blocked if abused
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST
    # table that can store 100k IPs, entries expire after 1 minute
    stick-table type ip size 100k expire 1m store http_req_rate(1m)
    # we'll track (save to table) the original IP only if the request hits
    # one of the monitored paths with a POST request
    http-request track-sc0 hdr(CF-Connecting-IP) if is_wp_login is_post
    # we now query the stick-table and if the IP has made more than
    # 5 requests of the offending type in the last minute,
    # current request is denied
    http-request deny if is_wp_login is_post { sc_http_req_rate(0) gt 5 }
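To confirm that the stick table is actually filling up, one option is to expose HAProxy's Runtime API and query the table directly. This is a sketch under assumptions: the socket path is hypothetical and must be writable by the HAProxy process (inside the container, when running under Docker).

```haproxy
global
    log stdout format raw daemon debug
    # assumption: expose the Runtime API on a local unix socket
    stats socket /tmp/haproxy.sock mode 600 level admin
```

Because the stick-table is declared inside frontend main, the table is named main. Sending `show table main` to the socket (with a tool such as socat) lists the tracked IPs and their current http_req_rate, and `clear table main key <ip>` removes a single entry.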
HAProxy has multiple ways to deny a request: tarpit, silent drop, reject, or shadowban. A tarpit denial would be something like this:
# make request hang for 20 seconds before replying back with a 403
timeout tarpit 20s
http-request tarpit deny_status 403 if is_wp_login is_post { sc_http_req_rate(0) gt 2 }
Note that this puts extra stress on both HAProxy and iptables, as they must keep the connection open.
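The other deny flavours follow the same pattern. The lines below are alternatives rather than a set: use one of them in frontend main, after the track-sc0 rule. The 429 status and the choice of threshold are illustrative assumptions.

```haproxy
    # answer with 429 Too Many Requests instead of the default 403
    http-request deny deny_status 429 if is_wp_login is_post { sc_http_req_rate(0) gt 5 }

    # alternative: close the connection without sending any response
    # http-request reject if is_wp_login is_post { sc_http_req_rate(0) gt 5 }

    # alternative: stop answering without notifying the peer
    # http-request silent-drop if is_wp_login is_post { sc_http_req_rate(0) gt 5 }
```

Keep in mind that, under Cloudflare, the TCP connection being rejected or dropped is Cloudflare's, not the bot's, so a plain deny with a status code is probably the most predictable choice.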
Just for reference, here is the full, simplified, HAProxy config file that blocks requests if they hit one of the monitored URLs with more than 5 hits in less than a minute:
global
    log stdout format raw daemon debug
defaults
    log global
    mode http
frontend main
    bind *:80
    # capture original IP from header and put it in variable txn.cf_conn_ip
    http-request set-var(txn.cf_conn_ip) hdr(CF-Connecting-IP)
    # get the original IP on logs (just to check if things are working)
    log-format "%ci\ %hr\ %ft\ %b/%s\ %Tw/%Tc/%Tt\ %B\ %ts\ %r\ %ST\ %Tr CF-IP:%{+Q}[var(txn.cf_conn_ip)]"
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST
    stick-table type ip size 100k expire 1m store http_req_rate(1m)
    http-request track-sc0 var(txn.cf_conn_ip) if is_wp_login is_post
    http-request deny if is_wp_login is_post { sc_http_req_rate(0) gt 5 }
    # obviously change this to whatever backend you're using
    use_backend ok
backend ok
    http-request return status 200 content-type "text/plain" lf-string "ok"
And there you have it; using your proxy as a gatekeeper!
While this article focuses on the server side of things, the first thing you should obviously do is protect your forms with reCAPTCHA, which you can easily do with a plugin such as Advanced Google reCAPTCHA by WebFactory.
EDIT: A user queried: «I have some WordPress installations under Cloudflare and some exposing the server directly. Can it work on both?»
It can. It's as simple as doing something like this:
# if CF-Connecting-IP is not set, put src IP in txn.client_ip
http-request set-var(txn.client_ip) hdr(CF-Connecting-IP)
http-request set-var(txn.client_ip) src if !{ var(txn.client_ip) -m found }
But beware, if some sites are under Cloudflare and some aren't, a spammer can forge CF-Connecting-IP to trick the simple if/else above. You must first check if src belongs to Cloudflare before extracting txn.client_ip from the header.
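A minimal sketch of that check follows. It assumes a hypothetical file, cloudflare-ips.lst, holding Cloudflare's published IP ranges with one CIDR per line; the from_cloudflare ACL name is also made up for this example.

```haproxy
frontend main
    bind *:80
    # hypothetical file with Cloudflare's published ranges, one CIDR per line
    acl from_cloudflare src -f /usr/local/etc/haproxy/cloudflare-ips.lst
    # trust CF-Connecting-IP only when the connection really comes from Cloudflare
    http-request set-var(txn.client_ip) req.hdr(CF-Connecting-IP) if from_cloudflare
    # direct connections (or requests missing the header) fall back to src
    http-request set-var(txn.client_ip) src if !{ var(txn.client_ip) -m found }
```

With this in place, a forged CF-Connecting-IP sent straight to the server is ignored, because the variable is only filled from the header when src matches a Cloudflare range. The ranges file needs to be kept in sync with the list Cloudflare publishes.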
This was originally published on wasteofserver.com. Check it there to see updates and comments.