In many rewrite rule answers, when testing the url for a certain condition, I often see a mix of using RewriteCond with REQUEST_URI, and the RewriteRule itself.
Is this just personal preference, or is there a performance reason, or is it for clarity of the rules? All of these are valid reasons, in my opinion; I'm just wondering if there's a particular reason.
I know there are conditions where RewriteCond is the only choice. I'm interested here in the cases where the RewriteRule would also work. Generally these are simpler rules.
Here are some examples:
EXAMPLE 1
This answer has a common pattern, allow certain folders as-is.
Htaccess maintenance mode allow certain directories
# always allow these folders
RewriteCond %{REQUEST_URI} ^/display_me_always [OR]
RewriteCond %{REQUEST_URI} ^/another_folder [OR]
RewriteCond %{REQUEST_URI} ^/even_more_folders
RewriteRule .+ - [L]
This could be done with just a RewriteRule as:
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders) - [L]
(Added the ?: for non-capturing. I'm never quite sure if it's faster to have a simpler rule, or not to capture.)
EXAMPLE 2
Modifying this for a more common scenario, redirect certain folders to other folders.
RewriteCond %{REQUEST_URI} ^/display_me_always [OR]
RewriteCond %{REQUEST_URI} ^/another_folder [OR]
RewriteCond %{REQUEST_URI} ^/even_more_folders
RewriteRule ^/[^/]+/(.*)$ /other_location/$1 [L]
This seems simpler with the rule only.
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders)/(.*)$ /other_location/$1 [R=301,NC]
EXAMPLE 3
This answer has a common pattern, redirect if not already in the target location.
Mod Rewrite rule to redirect all pdf request to another location
Test the folder with the RewriteCond, then test the file with the rule.
I can see a negative condition being much clearer to do with RewriteCond, but it's still possible with RewriteRule.
RewriteCond %{REQUEST_URI} !^/web_content/pdf/
RewriteRule ^(.+\.pdf)$ /web_content/pdf/$1 [L]
This could be written as
RewriteRule ^(?!/web_content/pdf/)(.+\.pdf)$ /web_content/pdf/$1 [L]
One simple reason: it's much easier to understand for someone who is not a regex guru.
The rewrite rule you give an OP today may need modifying a day later, and the easier the rule is to understand, the better the chance that they will look into it themselves instead of running back here for yet another small fix or a new, similar rule.
Yes, the combined rule is faster -- no doubt about it. But the time spent on URL rewriting is still so small compared to a single script execution that it can only make a difference on very CPU-busy servers with a lot of such rewrites.
Therefore, the best option is something in between -- still easy to read, yet compact and efficient. Instead of
# always allow these folders
RewriteCond %{REQUEST_URI} ^/display_me_always [OR]
RewriteCond %{REQUEST_URI} ^/another_folder [OR]
RewriteCond %{REQUEST_URI} ^/even_more_folders
RewriteRule .+ - [L]
offer
# always allow these folders
RewriteCond %{REQUEST_URI} ^/(display_me_always|another_folder|even_more_folders)
RewriteRule .+ - [L]
You cannot become a master in a day or two (unless, maybe, you are some kind of genius) -- everything takes time. And the more practical experience you have (by modifying these rules yourself), the easier it is to move on and produce more efficient and stable rules.
BTW:
This rule will not work if placed in .htaccess, because there the URL that the pattern is matched against has no leading slash:
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders) - [L]
But will work fine if placed in server config / virtual host context -- that's one of the "nuances" you need to know.
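To make the difference concrete, here is a small sketch of the same "allow these folders" rule written for each context, plus a variant with an optional leading slash that tolerates both. The folder names come from the question; the three rules are alternatives, not meant to be used together.

```apache
# Server config / <VirtualHost>: the pattern is matched against the
# URL-path including its leading slash.
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders) - [L]

# .htaccess in the document root: the leading slash is stripped before
# matching, so the pattern must not require it.
RewriteRule ^(?:display_me_always|another_folder|even_more_folders) - [L]

# Either context: the leading slash is optional.
RewriteRule ^/?(?:display_me_always|another_folder|even_more_folders) - [L]
```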
In general, I'd say you're right: RewriteCond is really largely for use with matching items OTHER than the REQUEST_URI. I'd expect that most of the cases you're looking at are in an .htaccess file. From the documentation:
If you wish to match against the full URL-path in a per-directory (htaccess) RewriteRule, use the %{REQUEST_URI} variable in a RewriteCond.
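A minimal sketch of what that means in practice, assuming a hypothetical .htaccess file that lives in a /blog/ subdirectory (the directory and path names are made up for illustration):

```apache
# Hypothetical layout: this file is /blog/.htaccess.
# For a request to /blog/archive/2020 the RewriteRule pattern is matched
# against "archive/2020" (the per-directory prefix is stripped), while
# %{REQUEST_URI} still holds the full "/blog/archive/2020".
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/blog/archive/
RewriteRule ^ - [L]
```

Here the RewriteCond is the only place where the full URL-path is visible, which is why answers written for .htaccess so often test REQUEST_URI there.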
Related
I'm trying to remove the .html file extension from my URLs to make them look nicer. I've seen many examples of this and have tried them, but I'm struggling to find something that works only in the root and doesn't apply to any subdirectories or subdomains.
Can I get any help with this?
Example:
example.org/test.html > example.org/test
example.org/food/xyz.html > example.org/food/xyz.html
login.example.org/something.html > login.example.org/something.html
Your question is a bit vague, so I will concentrate on the main issue you mention:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !/.+\.html
RewriteRule ^/?([^/]+)$ $1.html [L]
Alternative approach:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/?([^/]+)$ $1.html [L]
Whether this is limited to a single host name (as your examples might suggest) depends on your setup. That is easiest to achieve if you place the rule in the real host configuration of your http server instead of using .htaccess style files. That is also why I append a general hint:
You should always prefer to place such rules inside the http server's host configuration instead of using .htaccess style files. Those files are notoriously error prone, hard to debug and they really slow down the server. They are only provided as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
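As a rough sketch of what the host-configuration variant could look like (the DocumentRoot path is an assumption, and the file check is built from DOCUMENT_ROOT because in virtual host context REQUEST_FILENAME has not yet been mapped to a file on disk):

```apache
<VirtualHost *:80>
    ServerName example.org
    # Hypothetical path; use the real document root.
    DocumentRoot "/var/www/example.org"

    RewriteEngine On
    # In virtual host context the pattern sees the leading slash.
    # Only single-segment paths (the root level) are considered, and
    # only if a matching .html file exists under the document root.
    RewriteCond %{DOCUMENT_ROOT}/$1.html -f
    RewriteRule ^/([^/]+)$ /$1.html [L]
</VirtualHost>
```

Because the rule sits inside the virtual host for example.org, a separately configured host such as login.example.org would not be affected by it.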
In the comment you wrote that you also want to force the shorter notation, that is, redirect users who request a URL with such a "file name extension" to the version without it.
I don't really see any point in this, but it certainly is possible:
RewriteEngine on
# Stop on internal redirects (REDIRECT_STATUS is set) to avoid a redirect loop
RewriteCond %{ENV:REDIRECT_STATUS} .
RewriteRule ^. - [L]
RewriteCond %{REQUEST_URI} ^/.+\.html$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^/?(.+)\.html$ /$1 [L,R=301]
RewriteCond %{REQUEST_URI} !^/.+\.html$
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/?([^/]+)$ /$1.html [L]
I am looking for an if/else type of construct that can combine my rewrite rules based on host/domain name. So if the domain name is "domainA" then redirect a to b and c to d, and if the domain name is "domainB" then redirect x to y and z to p. Basically I don't want to write the condition again and again as I have done below:
I have written the following code for htaccess
RewriteCond %{HTTP_HOST} ^domain\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com$
RewriteRule ^home /? [L,R=301]
RewriteCond %{HTTP_HOST} ^domain\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com$
RewriteRule ^home-old /? [L,R=301]
Above, I am using the host because I have multiple domains pointed to the same hosting space, so the htaccess is shared between all of them.
Can I combine the above conditions for a domain into one, instead of writing the domain name again and again for each set of redirects for a specific domain?
Please advise, thanks!
There really is no such if-then-else structure in mod_rewrite. You have to either be clever with your rules (which sometimes makes them unreadable or impossible to amend) or just be explicit about everything.
For your specific example, you can just use regular expressions to combine them:
^domain\.com$ and ^www\.domain\.com$ gets turned into ^(www\.)?domain\.com$
and
^home and ^home-old is just ^home, since the first only matches if the URI starts with home, and home-old does indeed start with home (there is no $ symbol to indicate the end of the string). So you're looking at:
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.com$ [NC]
RewriteRule ^home /? [L,R=301]
If you need to be specific by using the $, you can just change the regex to:
RewriteRule ^home(-old)?$ /? [L,R=301]
EDIT:
If I have some other urls instead of home, and botique, and hairtips then I need to write RewriteCond every time? I guess I have to write that every time just confirming from you
Yes, you have to repeat the condition every time, or, you can make a passthrough at the very beginning of your rules:
RewriteCond %{HTTP_HOST} !^(www\.)?domain\.com$ [NC]
RewriteRule ^ - [L]
This means: anything that's not a request for the host www.domain.com or domain.com passes through without rewriting. You can then put your rules after this without repeating the condition, because only requests for those two hosts will reach them.
This is the "clever" bit that I was referring to before. This can make it tricky to change your rules or amend them later because you've set a strict condition at the top.
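Put together, the guard-then-rules layout might look like the sketch below. The second redirect is a made-up example added only to show that later rules need no host condition of their own.

```apache
RewriteEngine On

# Guard: requests for any other host stop here, untouched.
RewriteCond %{HTTP_HOST} !^(www\.)?domain\.com$ [NC]
RewriteRule ^ - [L]

# From here on, only domain.com / www.domain.com requests remain.
RewriteRule ^home(-old)?$ /? [L,R=301]
# Hypothetical extra redirect, for illustration only.
RewriteRule ^old-page$ /new-page [L,R=301]
```

The cost of this layout is ordering: any rules meant for the other domains have to come before the guard, or they will never be reached.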
I'm trying to better understand mod_rewrite and I've come across some differences, which I think do the same thing? In this case, no existing files or directories and rewriting to an index.php page.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
Do I need the [OR] or can I leave it off?
What are the differences or advantages of the following rules? I'm currently using the first one, but I've come across the last four in places like WordPress:
#currently using
RewriteRule ^(.+)$ index\.php?$1 [L]
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
Do I need the [OR] or can I leave it off?
In this case you need the [OR] because RewriteConds are ANDed by default, and a request is never both a file and a directory (as far as mod_rewrite is concerned).
RewriteRule ^(.+)$ index\.php?$1 [L]
This rewrites all requests that aren't for the document root (e.g. http://domain.com/) as a query string for index.php, thus a request for http://domain.com/some/path/file.html gets internally rewritten to index.php?some/path/file.html
RewriteRule ^index\.php$ - [L]
This is a rule to prevent rewrite looping. The rewrite engine will continue to loop through all the rules until the URI is the same before and after a rewrite iteration (ignoring the query string). If the URI is exactly index.php, leave it unchanged (that is what the - does) and stop the current rewrite iteration (the [L]). The rewrite engine then sees that the URI was index.php both before and after the rules, so it stops and all rewriting is done. Without this rule, the first rule would rewrite the request to index.php?index.php on the second pass through the rewrite engine.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
This is the catch-all. If the first rule never gets applied, and the request isn't for an existing file or directory, send the request to index.php. Though in this case, it looks like this rule will never get applied.
EDIT:
is there a way to ignore a certain rule if a condition is true? For example, www.domain.com/some/path > index.php?some/path, but if the URI is www.domain.com/this/path > no rewrite?
You'd have to add 2 conditions, one that checks to make sure the requested host isn't "www.domain.com" and one to check that the URI isn't "/this/path":
RewriteCond %{HTTP_HOST} !^(www\.)?domain\.com$ [NC,OR]
RewriteCond %{REQUEST_URI} !^/this/path
The [NC] indicates that the condition's match should ignore case, so when someone enters the URL http://WWW.domain.com/ in their address bar, it will match (or in this case, not match). The second condition fails when the URI starts with "/this/path", which means requests for http://www.domain.com/this/path/file.html will NOT get rewritten. If you want to match exactly "/this/path", then the regular expression needs to be !^/this/path$.
Why not use [OR] in the final block between !-f and !-d?
This is the logical negation of -f OR -d: "if the file exists, don't rewrite, OR if the directory exists, don't rewrite" turns into "if the file doesn't exist, AND if the directory doesn't exist, then rewrite"
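The two forms side by side, as a sketch using the rules already shown in the question (use one form or the other, not both):

```apache
# Form A: if the target exists, stop; otherwise fall through to the rewrite.
# The two "exists" tests are alternatives, so they need [OR].
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule . /index.php [L]

# Form B: rewrite only if the target is neither a file nor a directory.
# Negating "file OR directory" gives "not file AND not directory",
# and AND is the default, so no flag is needed.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```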
I'm trying to get my .htaccess file to work. Well it already works, but something is a bit annoying right now.
I have modified my .htaccess to support a multi-language website with the following:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/public/[a-z]{2}/ [NC]
RewriteRule ^(.*)$ index.php?url=$1 [PT,L]
# match those that DO have a language code
RewriteRule ^([a-z]{2})/(.*)$ index.php?lang=$1&url=$2 [PT,L]
This works like a charm. When a two-character language code is added to the beginning of the URL, ?lang is added to my index.php. However, this is a problem because my JavaScript folder is called "js", which means it will be treated as if it were a language code. I actually thought that the first two RewriteConds would prevent this from happening, but apparently they don't.
I must admit that the .htaccess isn't what I'm best at so I might have missed some sort of simple solution.
Option #1: Use this rule (instead of your last line):
RewriteRule ^(?!js)([a-z]{2})/(.*)$ index.php?lang=$1&url=$2 [PT,L]
This will do nothing for /js/hello but will rewrite if /en/hello (for example) is requested.
Option #2: Add a condition similar to what you have for the previous rule (I assume /public/js/ is the actual folder for your JavaScript files -- if not, adjust the name accordingly):
RewriteCond %{REQUEST_URI} !^/public/js/
RewriteRule ^([a-z]{2})/(.*)$ index.php?lang=$1&url=$2 [PT,L]
I recommend #1 -- it may be more difficult to understand, but it's all in a single line and a bit faster (a single regex versus two in #2).
I've installed a .htaccess file containing rewrite rules (as suggested by CodeIgniter) seen below.
How can I add to it so that any requests for urls that don't begin with members/ or admin/ get public/ put at the beginning?
i.e. if someone requests contact/, it should actually go for public/contact and so on unless they request members/something or admin/something.
.htaccess at present:
RewriteBase /
# Removes access to the system folder by users.
RewriteCond %{REQUEST_URI} ^_system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]
# prevent user access to the application folder
RewriteCond %{REQUEST_URI} ^myapp.*
RewriteRule ^(.*)$ /index.php?/$1 [L]
# This snippet re-routes everything through index.php, unless
# it's being sent to resources, or searching for robots.text
# Add any OR's in here if you need other directly accessable Files/folders
RewriteCond $1 !^(index\.php|resources|robots\.txt)
RewriteRule ^(.*)$ index.php/$1
.htaccess - edited - this is how it looks now:
RewriteCond %{REQUEST_URI} ^_system.*
RewriteRule ^(.*)$ /index.php?/$1
RewriteCond %{REQUEST_URI} ^myapp.*
RewriteRule ^(.*)$ /index.php?/$1
RewriteCond $1 !^(index\.php|resources|robots\.txt)
RewriteRule ^(.*)$ index.php/$1
RewriteCond %{REQUEST_URI} !^/(members/|admin/|public/|index\.php) [NC]
RewriteRule ^(.*)$ /public/$1 [L]
It's returning a 404 for all requests at the moment. I'm sure I've got something wrong but don't know what :)
File structure is:
site_root/.htaccess
site_root/index.php
site_root/myapp/
You can prepend the regex with ! to mean "must not match". So you can achieve what you want like this:
# REQUEST_URI always begins with a slash, so the pattern must include it
RewriteCond %{REQUEST_URI} !^/(public|members|admin)/ [NC]
RewriteRule ^(.*)$ /public/$1 [L]
Note: I'm adding a new answer because it is mostly unrelated to my old answer and is based on edits to your post. Leaving my old answer up for reference.
There are a number of problems here, but there is one big issue which needs resolving before it even makes sense to address the small problems. It appears that you are using a framework that works by routing everything (with a few exceptions) through index.php (which surely has its own internal router to the application controllers in myapp/). But you also say you want to route everything (with exceptions) through public/.
These are mutually exclusive possibilities -- you can't route all traffic through two places simultaneously. So I can go into fixing the above rules and show the code, but only after I know what the rules are supposed to be doing. Given what you said you want in your intro text, combined with what is already in your .htaccess, it's impossible to figure out what the combined rules are even supposed to do. I can edit this post with code details only after you answer some questions:
What framework are you using?
What is public/ supposed to be? Is it a directory (if so, what should it hold)? Or is it something that is supposed to be routed to and handled by the application?
Do your current admin and members sections work by going to admin/ and members/ subdirectories? Or are these routed to and handled by the application?
Do you have any other important details about how the app is structured?