Restricting Access with Apache mod_rewrite & Cookies

Posted: 2011-02-16 in tech, webdev

So let’s say you want to restrict a bunch of data or pages on your website to users that have agreed to what is commonly referred to as a “Terms of Service” agreement, or ToS. These agreements are typically filled with lots of legalese and set in a very small font size. Very few people read such agreements in their entirety; however, they are an important requirement in many scenarios of information dissemination on the internet. What are some scenarios where this could be important?

  1. Software distribution: You might want to get a user’s agreement that he or she will not redistribute an application your company or organization has created without written approval.
  2. Media and content distribution: Perhaps you publish music, video, online games or articles and blogs online and wish to protect that content with a ToS page.
  3. Forums and chat: You may wish to obtain consent from your users that they will not engage in abusive behavior in your online community.

The possibilities are basically endless and the aforementioned examples are only the most obvious scenarios. However, to make this work (and in many jurisdictions to make it legally enforceable) one should not be able to simply bypass the ToS page and link directly to the URL of the data that is being protected. A naive web administrator might throw up a ToS page as an intermediate navigation stop before arriving at the download area containing links to the data. However, if anyone has a bookmark, or enters the direct URL, or publishes the direct link to the data online, then the data can be easily retrieved without ever agreeing to your organization’s terms. Good luck enforcing your ToS in court if necessary…

For this post, I’m going to assume that you are using the Apache web server. In order to prevent direct linking and access to your content, we somehow have to instruct Apache not to serve it until some condition is satisfied. There are many ways of doing this, of course. One can write a custom Apache module or handler for the task in a language such as C, or Perl if you are using mod_perl. However, we can also use a very useful module called mod_rewrite to help us out. A nice advantage here is that mod_rewrite is included and enabled by default in many Apache installations. We are going to configure mod_rewrite to check for the existence of a special cookie that our ToS page will set. If you have the cookie (and the correct value in it), Apache will let you have the data. If not, Apache will redirect you to the ToS page. Sounds easy, right? Using cookies for this task has some additional benefits:

  1. We can set the cookie to expire whenever we like. Let’s say, 90 days. That way the user doesn’t need to accept the ToS every single time they want access to the data, just every 90 days… One can also make the cookie permanent if that behavior is desired.
  2. All browsers of any consequence support cookies, even text-based browsers such as lynx and w3m and download utilities such as wget.

Let us assume that the files we wish to protect all have the same extension: .tgz. In the Unix world, .tgz, or .tar.gz files are very common and are simply compressed archives of data, conceptually similar to .zip for you MS Windows users out there. This is a trivially easy example because we can define a mod_rewrite configuration on the web server as follows:

RewriteEngine on
RewriteCond %{REQUEST_URI} \.tgz$
RewriteCond %{HTTP_COOKIE} !tos=accepted
RewriteRule ^(.*) /terms-of-service.html [R,L]

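# Optional debugging output. These two directives exist in Apache 2.2 only;
# Apache 2.4 removed them in favor of e.g. "LogLevel alert rewrite:trace3".
RewriteLog /var/log/httpd/rewrite.log
RewriteLogLevel 9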

The first line simply activates the rewrite engine. The second and third lines set the conditions under which the RewriteRule on line 4 goes into effect. Both conditions need to be satisfied. The first condition, on line 2, is that the requested resource needs to end in .tgz. The “.” is escaped with a backslash because an unescaped “.” matches any character, and the trailing $ anchors the match to the end of the URI. Please consult the documentation on regular expressions for the details.

The second condition states that we must NOT have a cookie called “tos” with the value “accepted” in it. More on how a user acquires the cookie later.

So, with this configuration, if the server receives a request for a .tgz file and the client does not have the required cookie, the RewriteRule will be triggered. Basically, it redirects us to the terms-of-service.html page. The R flag makes Apache send an external redirect to the browser (a 302 by default), and the L flag ensures that this is the last rule to apply. The page redirected to doesn’t have to be a .html file of course. It could just as easily be a .php, or .jsp file…
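To make the two conditions concrete, here is a small JavaScript sketch that mimics the decision Apache makes for each request. It is only a model for reasoning about the rules, not how mod_rewrite is implemented; the function name decide and the sample paths are made up for illustration.

```javascript
// Illustrative model of the rewrite rules above; not Apache code.
// `decide` and the sample paths are hypothetical names.
function decide(requestUri, cookieHeader) {
  // Mirrors: RewriteCond %{REQUEST_URI} \.tgz$
  var isArchive = /\.tgz$/.test(requestUri);
  // Mirrors the pattern in the second RewriteCond (which is negated with "!").
  var hasCookie = /tos=accepted/.test(cookieHeader || "");

  if (isArchive && !hasCookie) {
    return { action: "redirect", location: "/terms-of-service.html" };
  }
  return { action: "serve", location: requestUri };
}

console.log(decide("/downloads/app.tgz", ""));
console.log(decide("/downloads/app.tgz", "lang=en; tos=accepted"));
console.log(decide("/index.html", ""));
// The cookie pattern is an unanchored search, so this request is served too:
console.log(decide("/downloads/app.tgz", "photos=accepted"));
```

The last call shows that the cookie pattern matches tos=accepted anywhere in the Cookie header. If that looseness matters to you, a stricter pattern along the lines of (^|;\s*)tos=accepted(;|$) would only match a cookie named exactly “tos”; treat that as a suggestion rather than part of the configuration above.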

You may ask, what if the user disables cookies or clears the cookies every time the browser is exited? Well, in the former case, the user will not be able to retrieve the data. It’s important that such limitations be discussed openly and up-front. This is an unfortunate consequence of relying on cookies to accomplish such a task. However, cookies have become such an integral part of modern web usage that their use hardly constitutes a radical departure from the norm. One can view a user with cookies disabled as a special case of a user that wishes to implicitly decline your ToS… One can add a note on the ToS page itself explaining that cookies must be enabled for the authorization mechanism to work. In the case of a user that routinely or automatically flushes cookies away, that user will simply need to agree to the ToS each time they wish to access or download content.
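One way to show such a note only to the people who need it is to test whether a cookie can actually be stored. The sketch below is an assumption on my part and not part of the setup described in this post: the helper name cookiesWork and the probe cookie name tos_probe are invented for the example.

```javascript
// Hypothetical helper: returns true if a cookie can be written and read back.
// `doc` is any object with a document.cookie-like property;
// in a browser you would pass the global `document`.
function cookiesWork(doc) {
  try {
    doc.cookie = "tos_probe=1; path=/";
    var ok = doc.cookie.indexOf("tos_probe=1") !== -1;
    // Remove the probe again by expiring it in the past.
    doc.cookie = "tos_probe=; path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT";
    return ok;
  } catch (e) {
    // Some environments throw when cookie access is blocked.
    return false;
  }
}
```

On the ToS page you could call cookiesWork(document) once the page has loaded and reveal the warning only when it returns false.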

Giving the user the Cookie
Once the server has been configured to deny (redirect) requests for data that arrive without the cookie, it’s time to consider just how we will grant the cookie to a user that accepts the Terms of Use/Service. Fortunately, this is a task that can be done with JavaScript. If you’re a fan of JQuery like I am, it can be done even more elegantly with the JQuery cookie plugin.

Our page will have two buttons, one to accept and one to deny the ToS. The markup should look something like so:

<form action="#" method="get">
    <input id="accept" type="button" value="Accept" />
    <input id="decline" type="button" value="Decline" />
</form>

The exact formatting of the form and inputs may depend on the doctype you’ve selected for your site and pages… Once this is inserted into the appropriate place beneath the text of your conditions, it’s time to wire the buttons up with a bit of JavaScript. We’ll create a new JavaScript file called tos.js:

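$(document).ready(function() {
  // Decline: forward the user to an alternate page.
  $("#decline").click(function () {
    window.location.href = "/index.html";
  });
  $("#accept").click(function () {
    // Set a cookie to expire in 90 days.
    $.cookie("tos", "accepted", { path: '/', domain: "", expires: 90 });
  });
});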

This code registers handler functions for when each button is clicked. In the terms-of-service.html file you would simply have to include the JQuery file, the JQuery cookie plugin, and the tos.js file, like so:

  <script type="text/javascript" src="/js/jquery-1.4.1.min.js"></script>
  <script type="text/javascript" src="/js/jquery.cookie.js"></script>
  <script type="text/javascript" src="/js/tos.js"></script>
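If you would rather not pull in the plugin, the same cookie can be set with plain JavaScript. The sketch below is a rough equivalent of the $.cookie call, not a description of what the plugin does internally; the function name buildTosCookie is made up for this example.

```javascript
// Hypothetical helper: builds the string for the "tos" cookie.
// `days` is the cookie lifetime; `now` is optional and defaults to the current time.
function buildTosCookie(days, now) {
  var start = now || new Date();
  var expires = new Date(start.getTime() + days * 24 * 60 * 60 * 1000);
  return "tos=accepted; path=/; expires=" + expires.toUTCString();
}

// Inside the Accept click handler you would store it like this:
//   document.cookie = buildTosCookie(90);
console.log(buildTosCookie(90));
```

Assigning the string to document.cookie adds or replaces only the “tos” cookie and leaves any other cookies for the site untouched.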

To recap, we now have a page where a user who hits the “Accept” button is issued a cookie that the web server honors by allowing the download of the content, in this case .tgz files. A user who presses the “Decline” button is forwarded to a configurable alternate page, in this case the index.html page. One can no longer get around the ToS page simply by linking or navigating directly to the content. A skilled person could still craft the cookie manually and send it along with the request without ever having agreed to the ToS, but to do that the user must examine the JavaScript and go through the trouble of building the cookie by hand… As for protecting your content from being republished by users who did agree to the ToS, well, that’s tough… You might want to ask the RIAA about that.

