Improve your site security with these rewrites.

By: Josiah Huckins - 1/10/2020


Today, I'd like to offer a few short tips to improve the security of web applications.

We'll take a look at two seemingly innocuous rewrite rules, each of which can prove to be a big help in preventing certain types of DoS attacks, among other things.

These rewrites apply to the Apache Web Server. At the time of this writing, it is one of the most widely deployed web servers, running the web layer for a large share of the sites and web applications in the wild. (In later posts we'll get into the differences between Apache, Lighttpd and Nginx, and where each shines.)

URL Structures

A simple and typical URL might look something like https://test.com/a/path/thats/great.html. As you'd expect, this will request a page called great.html. The server will respond with great.html and that's it.
However, there are many other options available for building a URL and requesting a resource from the server. You can manipulate the type of response with query parameters or navigate directly to a section of the page with fragments. There are many ways to configure your site to accept requests and respond with tailored content. For all the unique ways there are to serve up a web app, there are also many ways for malicious folks to break it or break in!

Issue #1: The Double Slash problem

One way a URL request can mess things up is by including a double forward slash.
Such a request might look something like this: https://test.com/a/path/thats/great.html//some/other/path/not--so-great.html. I think you see the problem here. Instead of just requesting great.html, this attempts to request something great and something not so great. This particular example is harmless: if the two pages exist they may be returned with a 200 status, and if they don't you get a 404. No harm done.
It quickly becomes a problem when the part after the double forward slash is a known system path:

https://test.com/a/path/thats/great.html//etc/passwd
https://test.com/a/path/thats/great.html//c:/windows/system32

In the case of certain unpatched application servers like Jetty, this would allow someone to bypass security checks and directly access the contents of the files. In other cases it can change servlet request filters or break the page. None of these options are, well, great.

Issue #2: Multiple extensions

Another issue occurs when there are multiple extensions in a single request. Take this for example: https://test.com/some/page.html/some/page2.html. Here the .html extension appears twice. In most cases this will result in a 404, as the entire part of the URL after the domain is taken as a single path. A REST-based framework like Sling (which is famously used in the AEM framework) may treat it differently.

Due to a feature known as Sling suffixes, Sling would treat the /some/page2.html portion of the URL above as a suffix of the URI path. Suffixes are normally used to specify additional information for the page, and in many cases they're used to specify which rendering script or component should render the requested page. This is a powerful and commonly used feature of Sling. Unfortunately, it can offer an easy way to execute a Denial of Service (DoS) attack against your site!
A bad apple could easily write a script to request the page with multiple suffixes, enumerating over numbered names in those suffixes (/some/page1.html, /some/page2.html, etc). Although they all request the same page, each of these suffixed URLs would be cached individually, because each is a cacheable variation of the page. The cache would soon fill to capacity. In the best scenario, all requests are then served from the app server (with a big performance loss); in the worst scenario, all requests are stopped cold. Even if the cache directories are not filled up, this could at least be used to flood the server with requests, driving processing to a standstill.


What a simple way to cause problems.

Now, there are many ways to fix this in your application layer. Most vendors will offer mitigation solutions in the form of patches or configuration changes. None of those methods are bad.
As an architectural best practice, I prefer to catch and prevent request-based vulnerabilities as early as possible. That is, at the entry point for users: the web layer. This can prevent a malicious request from making it to the app layer, saving processing power where it's needed the most. As an added benefit, you can apply these updates once in your web server and have them cover multiple types of application servers.

The Double Slash Fix

To fix the issue with double slashes, you can employ this rewrite:
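
A minimal sketch of such a rule, consistent with the description that follows. It assumes mod_rewrite is enabled and that the rule sits in the server or virtual host configuration. The ErrorDocument mapping to /404.html and the exact flags are assumptions for illustration, not necessarily the original configuration:

```apache
# Sketch only: assumes mod_rewrite is loaded and these lines sit in the
# server or virtual-host configuration.
RewriteEngine On

# Assumption: 404 responses are served from a custom /404.html page.
ErrorDocument 404 /404.html

# Condition: the URI path contains "//" anywhere.
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
# "-" leaves the URL untouched; R=404 answers with a 404 status.
RewriteRule ^ - [R=404,L]
```

One caveat: on Apache 2.4.39 and later, the MergeSlashes directive defaults to On and may collapse repeated slashes before this condition is evaluated. In that setup, matching against %{THE_REQUEST} (the raw request line) is one alternative worth testing.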


What this does is check whether the URI path contains a double slash anywhere within it. If it does, we have a match on the rewrite condition, and the rewrite rule that follows it will direct the request to a 404.html page.
There are a couple of things to note here. The condition is a regular expression. In regular expression syntax, (.*) means zero or more of any character except line breaks. This is essentially a catch-all. Generally, (.*) is broader than needed, and it's better to use something more specific, like a character class which only matches alphanumeric characters. For the sake of our example, we match anything before and after the double slashes (//). Most importantly, this includes multiple occurrences of double slashes.
So this will match on /some/path.html//some/more and /some/path.html//some/more//even/more.

The Multiple Extensions Fix

To fix the issue with multiple extensions, you can employ this rewrite:
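
A minimal sketch of such a rule, using the regular expression discussed in the paragraphs that follow. It assumes mod_rewrite is enabled and that the rule sits in the server or virtual host configuration. The 404 response and the flags are assumptions for illustration, not necessarily the original configuration:

```apache
# Sketch only: assumes mod_rewrite is loaded and these lines sit in the
# server or virtual-host configuration.
RewriteEngine On

# Condition: the URI path ends with two ".html" segments,
# e.g. /some/path.html/some/more.html
RewriteCond %{REQUEST_URI} ((((.*)\.html)){2})$
# "-" leaves the URL untouched; R=404 answers with a 404 status.
RewriteRule ^ - [R=404,L]
```

If your application legitimately relies on suffixes that end in .html, a blanket rule like this would block them too, so scope it to the paths where suffixes are not expected.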


What this does is check if the URI path contains more than one path and extension. So this will match on /some/path.html/some/more.html, but it will not match on a legitimate path structured like this:
/some/legit/path.html.

The key item here is the {2}. This indicates that the preceding regular expression grouping (((.*)\.html)) needs to match twice in a row. Because (.*) can itself absorb earlier text, including earlier .html segments, paths with more than two extensions match as well. If the grouping can only match once, the condition is not met.
Secondly, the entire expression needs to match against the end of the line, which is what the $ indicates. We also group the entire expression in parentheses: ((((.*)\.html)){2}). This outer grouping is not strictly required for the match, but it makes it visually clear that the whole repeated expression is what must sit at the end of the line.

Closing Thoughts

While you can separate these rules into their own lines, you can also consolidate them into:
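
One way to consolidate them, sketched under the same assumptions as before (mod_rewrite enabled, server or virtual host context, a 404 response as the chosen outcome), is to chain the two conditions with the [OR] flag so a single rule handles both cases:

```apache
# Sketch only: assumes mod_rewrite is loaded and these lines sit in the
# server or virtual-host configuration.
RewriteEngine On

# Either condition triggers the rule: a double slash in the path...
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$ [OR]
# ...or a path ending with two ".html" segments.
RewriteCond %{REQUEST_URI} ((((.*)\.html)){2})$
# "-" leaves the URL untouched; R=404 answers with a 404 status.
RewriteRule ^ - [R=404,L]
```

Without [OR], consecutive RewriteCond lines are combined with a logical AND, which would only block requests that exhibit both problems at once.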


With these simple rewrites, you eliminate two easy and dangerous attacks. This is one of many ways to harden your sites and applications against malicious folks who want nothing more than to break your stuff.

