Is Apache mod_rewrite a tool you're using?

  dklynn's Photo
Posted Oct 01 2010 05:16 PM

If not, you're missing the boat! mod_rewrite is an Apache module which uses regular expressions to match the requested URI and the strings available in Apache's variables, then rewrites or redirects the request according to your rules. That gives the webmaster a very powerful tool!

The basics:
Using two mod_rewrite directives (RewriteCond and RewriteRule) and a couple of "flags" (modifiers) along with a limited knowledge of regular expressions, a webmaster can easily redirect visitor requests. Most often, this is done for mundane reasons but there are some really powerful tasks that can also be performed.

The tasks:
WARNING:
The following mod_rewrite code snippets are designed for use on an Apache 2.x server. Unless noted otherwise, they are written for an .htaccess file and assume that RewriteEngine On has already been set. Because mod_rewrite's Perl-compatible (PCRE) regex engine is so powerful, you must take great care with the writing and ordering of mod_rewrite rules to prevent conflicts and/or infinite loops. These snippets are supplied to demonstrate the simple nature of the code, but to prevent problems they should only be used by webmasters who understand what the code is designed to do.
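
As a rough illustration of the kind of loop the warning refers to, here is a minimal sketch. The app/ directory name is hypothetical, and the rules assume an .htaccess file in the DocumentRoot:

```apache
RewriteEngine On

# Looping version (left commented out): in .htaccess the rewritten path
# app/foo is run through the rules again, matches ^(.*)$ again and becomes
# app/app/foo, and so on until Apache gives up with a 500 error.
# RewriteRule ^(.*)$ app/$1 [L]

# Guarded version: skip requests that have already been rewritten into app/
RewriteCond %{REQUEST_URI} !^/app/
RewriteRule ^(.*)$ app/$1 [L]
```

The same idea (a RewriteCond that fails once the rewrite has been applied) is what keeps most of the redirects below from repeating.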

  • Force or eliminate www for a domain - standardize on one form of the hostname for a website. This snippet forces www:
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteRule .? http://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

  • Test directory/file exists - special codes are available to allow mod_rewrite to check that the request exists as a file or directory; very useful in avoiding unintended redirections.
  • Redirect to a 404 handler - if you don't like the default ErrorDocument, create your own! Just negate the directory/file test then send the visitor to your 404 handler.
    # request is NOT an existing directory
    RewriteCond %{REQUEST_FILENAME} !-d
    # request is NOT an existing file
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule .? 404handler.php [L]

  • Prevent image hotlinking - bandwidth piracy is a problem so prevent your images from being displayed on someone else's site.
    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
    RewriteRule \.(gif|jpe?g|png)$ - [NC,F]

  • Block 'bots - they also steal bandwidth so have mod_rewrite deny 'bot access to your website.
    RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
    RewriteCond %{HTTP_USER_AGENT} ^User-Agent [OR]
    # continue list ... (the final RewriteCond in the list must NOT carry [OR])
    RewriteRule .? - [F]

  • Browser dependent content - don't struggle to determine what CSS file to serve when mod_rewrite can read the browser info, too, and serve all the required hacks to IE.
    # MS Internet Explorer - Mozilla v4	
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4(.*)MSIE	
    RewriteRule ^css\.css$ css-ie.css [L]

  • Changing file or directory names - Normally, that upsets your search engine rankings but mod_rewrite can redirect AND signal search engines that the change is permanent (so they'll update their records).
    # the leading / on the target assumes both directories sit in the DocumentRoot
    RewriteRule ^old_directory/([a-z\.]+)$ /new_directory/$1 [R=301,L]

  • Replace a character - Changing only one character in all file names is a different regex problem but mod_rewrite can do that, too.
    # change ALL '-'s to '_'s, one per redirect, until none remain (leading / assumes the DocumentRoot)
    RewriteRule ^/?(.*)-(.*)$ /$1_$2 [R=301,L]

  • Convert extensions - show one file extension but serve another.
    # redirection is NOT visible without the R=301
    RewriteRule ^([a-z/]+)\.html$ $1.php [L]

  • Extensionless URIs - this is far more than adding a PHP (or HTML) file extension as it's routinely used to provide a query string to a handler to fetch content from a database, i.e., use the (unique) title of an article or blog rather than the id (primary key).
    # check that it is a php script first!
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^([a-zA-Z0-9]+)$ $1.php [L]

  • URIs with key/value pairs - use mod_rewrite to parse the request and extract values or key/value pairs for the redirection. Taking advantage of mod_rewrite's looping, the number of key/value pairs is only limited by the length of the URL string.
    # check for file or directory first!	
    RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ handler.php?$1=$2 [L]

  • Query strings - they can be examined and manipulated, too, including removing a query string.
    # look for a required key
    RewriteCond %{QUERY_STRING} !uniquekey=
    RewriteRule ^script_that_requires_uniquekey\.php$ other_script.php [QSA,L]

    # remove query string from test.php (only when the query string is not empty)
    RewriteCond %{QUERY_STRING} !^$
    RewriteRule ^test\.php$ /test.php? [R=301,L]

  • Secure servers - just like the www, you can force or prevent the secure server (HTTPS) being used for specific files, directories or across the entire website.
    RewriteCond %{SERVER_PORT} !^443$	
    RewriteRule ^(secure_page\.php)$ https://www.example.com/$1 [R=301,L]

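    A more advanced code snippet can strip the www while keeping the current scheme, then enforce use of the http server on all scripts except those selected to enforce use of the secure server:
    # strip www only when it is present (an optional www would match every request and loop)
    RewriteCond %{HTTP_HOST}/s%{HTTPS} ^(www\.)([^/]+)/((s)on|s.*)$ [NC]
    RewriteRule .? http%4://%2%{REQUEST_URI} [R=301,L]
    # send everything except the listed scripts back to http
    RewriteCond %{HTTPS} on [NC]
    RewriteRule !^(page1|page2|page3|page4|page5)\.php$ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]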

  • Work with a CMS - WordPress is "greedy" in its use of mod_rewrite to redirect everything to its handler script. A little knowledge is often required to "punch holes" in its "greedy" nature.
    # BEGIN WordPress	
    RewriteEngine On
    RewriteBase /	
    RewriteCond %{REQUEST_FILENAME} !-f	
    RewriteCond %{REQUEST_FILENAME} !-d	
    # This line omits the private subdirectory
    RewriteCond %{REQUEST_URI} !^/private/
    RewriteRule . /index.php [L]	
    # END WordPress

  • Apache Environments - they can be manipulated (with your own environment variables, too) in order to serve time-of-day redirects, etc.
    RewriteCond %{TIME_HOUR}%{TIME_MIN} >0600	
    RewriteCond %{TIME_HOUR}%{TIME_MIN} <1800	
    RewriteRule ^page\.html$ page.day.html [L]	
    RewriteRule ^page\.html$ page.night.html [L]

  • RewriteMaps - the best is last for good reason: Improper coding can bring down the entire server! Reserved for System Administrators, RewriteMaps have built-in functions for mod_rewrite as well as allowing the use of programs (your scripts; useful to update links when site-wide changes just can't be defined by a regular expression).
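    # RewriteMap must be in the server or VHost configuration file
    RewriteMap lowercase int:tolower
    # only redirect when the host contains an uppercase letter (prevents a redirect loop)
    RewriteCond %{HTTP_HOST} [A-Z]
    RewriteCond ${lowercase:%{HTTP_HOST}} ^(.+)$
    RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]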
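
The first item in the list mentions eliminating www but only shows how to force it. A minimal sketch of the opposite direction might look like the following. It assumes plain http, as the force-www snippet does, and it should not be combined with that snippet on the same site or the two will redirect to each other forever:

```apache
# Eliminate www: capture everything after "www." in the host name
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# %1 is the captured host without www; the request path is kept as-is
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]
```

Because the condition only matches hosts that start with www., the redirected request no longer matches and the rule cannot loop.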


Summary:
With a few simple tools, mod_rewrite can perform "magic." mod_rewrite is so powerful and has so many varied uses that it is difficult to remember that there is also "magic" in knowing when not to use mod_rewrite in favor of a more appropriate tool.



3 Replies

Posted Nov 09 2010 02:39 PM
I used to use mod_rewrite, but it was a big hassle, so I do not use it anymore.
  dklynn's Photo
Posted Nov 19 2010 02:39 AM

The hassle is only minimal (if your host has enabled it) and the rewards are extraordinary. IMHO, it's well worth the little effort - at least if you know just a bit about regular expressions.

Regards,

DK
  gotenks05's Photo
Posted Nov 20 2010 12:48 PM

@dklynn It was useful, somewhat, but it interfered with Photo Feeds and that. I am currently my own host and my server is a LAMP, but I use a MAMP setup for testing.