.htaccess rewritecond match something except - .htaccess

I'm trying to get my .htaccess file to work. Well it already works, but something is a bit annoying right now.
I have modified my .htaccess to support a multi-language website with the following:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/public/[a-z]{2}/ [NC]
RewriteRule ^(.*)$ index.php?url=$1 [PT,L]
# match those that DO have a language code
RewriteRule ^([a-z]{2})/(.*)$ index.php?lang=$1&url=$2 [PT,L]
This works like a charm. When a two-character language code is added to the beginning of the URL, the ?lang parameter is added to my index.php. However, this is a problem because my JavaScript folder is called "js", which means it will be treated as if it were a language code. I thought the first two RewriteConds would prevent this from happening, but apparently they don't.
I must admit that the .htaccess isn't what I'm best at so I might have missed some sort of simple solution.

Option #1: Use this rule (instead of your last line):
RewriteRule ^(?!js)([a-z]{2})/(.*)$ index.php?lang=$1&url=$2 [PT,L]
This will do nothing for /js/hello but will rewrite if /en/hello (for example) requested.
Option #2: Add a condition similar to the one you have for the previous rule (I assume /public/js/ is the actual folder for your JavaScript files -- if not, adjust the name accordingly):
RewriteCond %{REQUEST_URI} !^/public/js/
RewriteRule ^([a-z]{2})/(.*)$ index.php?lang=$1&url=$2 [PT,L]
I recommend #1 -- it may be more difficult to understand, but it's all on a single line and a bit faster (a single regex versus two in #2).
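The negative lookahead in option #1 can be sanity-checked outside Apache. The sketch below uses Python's re module, which should treat this particular pattern the same way as the regex engine behind mod_rewrite. The helper name rewrite_lang is made up for illustration, and the paths are written without a leading slash, as a RewriteRule in .htaccess sees them.

```python
import re

# Pattern from option #1: two lowercase letters that are not "js", then a slash.
LANG_RULE = re.compile(r"^(?!js)([a-z]{2})/(.*)$")


def rewrite_lang(path):
    """Return the rewritten target, or None when the rule does not match.

    Hypothetical helper: it only mimics the pattern and substitution,
    not the rest of Apache's request handling.
    """
    match = LANG_RULE.match(path)
    if match is None:
        return None
    return f"index.php?lang={match.group(1)}&url={match.group(2)}"


print(rewrite_lang("en/hello"))   # language rule applies
print(rewrite_lang("js/app.js"))  # lookahead rejects "js", so no rewrite
```

Any other two-letter folder (for example a hypothetical "db" directory) would still be treated as a language code, so each such folder needs its own exclusion.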

Related

Rewrite multiple rules in .htaccess / remove .html extension [duplicate]

How to remove .html from the URL of a static page?
Also, I need to redirect any url with .html to the one without it. (i.e. www.example.com/page.html to www.example.com/page ).
I think some explanation of Jon's answer would be constructive. The following:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
checks that the requested path is neither an existing file nor an existing directory. Only if both conditions hold does the rewrite rule proceed:
RewriteRule ^(.*)\.html$ /$1 [L,R=301]
But what does that mean? It uses regex (regular expressions). Here is a little something I made earlier...
I think that's correct.
NOTE: When testing your .htaccess do not use 301 redirects. Use 302 until finished testing, as the browser will cache 301s. See https://stackoverflow.com/a/9204355/3217306
Update: I was slightly mistaken, . matches all characters except newlines, so includes whitespace. Also, here is a helpful regex cheat sheet
Sources:
http://community.sitepoint.com/t/what-does-this-mean-rewritecond-request-filename-f-d/2034/2
https://mediatemple.net/community/products/dv/204643270/using-htaccess-rewrite-rules
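As a rough illustration of what the pattern ^(.*)\.html$ captures, here is a minimal Python sketch. It only models the regex and the /$1 substitution, not Apache itself; the function name redirect_target is an assumption made for this example.

```python
import re

# Same pattern as in the RewriteRule: anything, followed by a literal ".html".
HTML_RULE = re.compile(r"^(.*)\.html$")


def redirect_target(path):
    """Return the redirect target "/$1", or None if the rule does not match."""
    match = HTML_RULE.match(path)
    if match is None:
        return None
    return "/" + match.group(1)


print(redirect_target("page.html"))  # captured group is "page"
print(redirect_target("page.htm"))   # no match, the pattern requires ".html"
```

The backslash before the dot matters: without it, the dot would match any character, not just a literal period.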
To remove the .html extension from your URLs, you can use the following code in your root .htaccess file:
RewriteEngine on
RewriteCond %{THE_REQUEST} /([^.]+)\.html [NC]
RewriteRule ^ /%1 [NC,L,R]
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html [NC,L]
NOTE: If you want to remove any other extension, for example to remove the .php extension, just replace the html everywhere with php in the code above.
Also see: How to remove .html and .php from URLs using htaccess.
This should work for you:
#example.com/page will display the contents of example.com/page.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ $1.html [L,QSA]
#301 from example.com/page.html to example.com/page
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*\.html\ HTTP/
RewriteRule ^(.*)\.html$ /$1 [R=301,L]
With .htaccess under apache you can do the redirect like this:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.html$ /$1 [L,R=301]
As for removing of .html from the url, simply link to the page without .html
page
You will need to make sure you have Options -MultiViews as well.
None of the above worked for me on a standard cPanel host.
This worked:
Options -MultiViews
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]
For those who are using Firebase Hosting, none of the answers on this page will work, because you can't use .htaccess there. You will have to configure the firebase.json file instead. Just add the line "cleanUrls": true to your file and save it. That's it.
After adding the line firebase.json will look like this :
{
  "hosting": {
    "public": "public",
    "cleanUrls": true,
    "ignore": [
      "firebase.json",
      "**/.*",
      "**/node_modules/**"
    ]
  }
}
Thanks for your replies. I have already solved my problem. Suppose I have my pages under http://www.yoursite.com/html, the following .htaccess rules apply.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /html/(.*)\.html\ HTTP/
RewriteRule .* http://localhost/html/%1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /html/(.*)\ HTTP/
RewriteRule .* %1.html [L]
</IfModule>
Good question, but it seems to have confused people. The answers are almost equally divided between those who thought Dave (the OP) was saving his HTML pages without the .html extension, and those who thought he was saving them as normal (with .html), but wanting the URL to show up without. While the question could have been worded a little better, I think it’s clear what he meant. If he was saving pages without .html, his two questions (‘how to remove .html’ and how to ‘redirect any url with .html’) would be exactly the same question! So that interpretation doesn’t make much sense. Also, his first comment (about avoiding an infinite loop) and his own answer seem to confirm this.
So let’s start by rephrasing the question and breaking down the task. We want to accomplish two things:
Visibly remove the .html if it’s part of the requested URL (e.g. /page.html)
Point the cropped URL (e.g. /page) back to the actual file (/page.html).
There’s nothing difficult about doing either of these things. (We could achieve the second one simply by enabling MultiViews.) The challenge here is doing them both without creating an infinite loop.
Dave’s own answer got the job done, but it’s pretty convoluted and not at all portable. (Sorry Dave.) Łukasz Habrzyk seems to have cleaned up Anmol’s answer, and finally Amit Verma improved on them both. However, none of them explained how their solutions solved the fundamental problem—how to avoid an infinite loop. As I understand it, they work because THE_REQUEST variable holds the original request from the browser. As such, the condition (RewriteCond %{THE_REQUEST}) only gets triggered once. Since it doesn’t get triggered upon a rewrite, you avoid the infinite loop scenario. But then you're dealing with the full HTTP request—GET, HTTP and all—which partly explains some of the uglier regex examples on this page.
I’m going to offer one more approach, which I think is easier to understand. I hope this helps future readers understand the code they’re using, rather than just copying and pasting code they barely understand and hoping for the best.
RewriteEngine on
# Remove .html (or htm) from visible URL (permanent redirect)
RewriteCond %{REQUEST_URI} ^/(.+)\.html?$ [nocase]
RewriteRule ^ /%1 [L,R=301]
# Quietly point back to the HTML file (internal rewrite, not a redirect):
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html [END]
Let’s break it down…
The first rule is pretty simple. The condition matches any URL ending in .html (or .htm) and redirects to the URL without the filename extension. It's a permanent redirect to indicate that the cropped URL is the canonical one.
The second rule is simple too. The first condition will only pass if the requested filename is not a valid directory (!-d). The second will only pass if the filename refers to a valid file (-f) with the .html extension added. If both conditions pass, the rewrite rule simply adds ‘.html’ to the filename. And then the magic happens… [END]. Yep, that’s all it takes to prevent an infinite loop. The Apache RewriteRule Flags documentation explains it:
Using the [END] flag terminates not only the current round of rewrite
processing (like [L]) but also prevents any subsequent rewrite
processing from occurring in per-directory (htaccess) context.
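To see why stopping after the second rule matters, here is a toy Python model of the two rules above. It is a sketch under simplifying assumptions, not Apache's real processing: the function trace_request and the set of existing files are invented for this example, the !-d check is left out, and use_end=False stands for a flag such as [L] that lets the rules run again on the rewritten URL.

```python
import re


def trace_request(uri, existing_files, use_end=True, max_rounds=5):
    """Return a list of (action, uri) steps for a toy version of the two rules."""
    trace = []
    for _ in range(max_rounds):
        # Rule 1: URL ends in .html or .htm -> external redirect without it.
        match = re.match(r"^/(.+)\.html?$", uri, re.IGNORECASE)
        if match:
            uri = "/" + match.group(1)
            trace.append(("redirect", uri))
            continue  # the browser sends a new request
        # Rule 2: a matching .html file exists -> internal rewrite.
        if uri + ".html" in existing_files:
            uri = uri + ".html"
            trace.append(("rewrite", uri))
            if use_end:
                trace.append(("serve", uri))  # [END]: no further rewriting
                return trace
            continue  # without [END], the rules see the rewritten URL again
        trace.append(("pass", uri))  # no rule applied
        return trace
    trace.append(("loop", uri))  # gave up: redirect/rewrite cycle
    return trace


files = {"/page.html"}
print(trace_request("/page.html", files))                 # settles after one redirect
print(trace_request("/page.html", files, use_end=False))  # ends in ("loop", ...)
```

In the model, the run with use_end=False keeps bouncing between the redirect and the rewrite, which is the infinite loop the answer describes.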
Resorting to using .htaccess to rewrite the URLs for static HTML is generally not only unnecessary, but also bad for your website's performance. Enabling .htaccess is also an unnecessary security vulnerability - turning it off eliminates a significant number of potential issues. The same rules for each .htaccess file can instead go in a <Directory> section for that directory, and it will be more performant if you then set AllowOverride None because it won't need to check each directory for a .htaccess file, and more secure because an attacker can't change the vhost config without root access.
If you don't need .htaccess in a VPS environment, you can disable it entirely and get better performance from your web server.
All you need to do is move your individual files from a structure like this:
index.html
about.html
products.html
terms.html
To a structure like this:
index.html
about/index.html
products/index.html
terms/index.html
Your web server will then render the appropriate pages - if you load /about/, it will treat that as /about/index.html.
This won't rewrite the URL if anyone visits the old one, though, so it would need redirects to be in place if it was retroactively applied to an existing site.
I use this .htaccess to remove the .html extension from my site's URLs. Please verify that this code is correct:
RewriteEngine on
RewriteBase /
RewriteCond %{http://www.proofers.co.uk/new} !(\.[^./]+)$
RewriteCond %{REQUEST_fileNAME} !-d
RewriteCond %{REQUEST_fileNAME} !-f
RewriteRule (.*) /$1.html [L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^.]+)\.html\ HTTP
RewriteRule ^([^.]+)\.html$ http://www.proofers.co.uk/new/$1 [R=301,L]
Making my own contribution to this question by improving the answer from #amit-verma (https://stackoverflow.com/a/34726322/2837434) :
In my case I had an issue where RewriteCond %{REQUEST_FILENAME}.html -f was triggering (believing the file existed) even when I was not expecting it :
%{REQUEST_FILENAME}.html was giving me /var/www/example.com/page.html for all these cases :
www.example.com/page (expected)
www.example.com/page/ (also quite expected)
www.example.com/page/subpage (not expected)
So the files it was trying to load (while believing the path was /var/www/example.com/page.html) were:
www.example.com/page => /var/www/example.com/page.html (ok)
www.example.com/page/ => /var/www/example.com/page/.html (not ok)
www.example.com/page/subpage => /var/www/example.com/page/subpage.html (not ok)
Only the first one actually points to an existing file; the other requests gave me 500 errors because Apache kept believing the file existed and appended .html repeatedly.
The solution for me was to replace RewriteCond %{REQUEST_FILENAME}.html -f with RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
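The effect of the replacement condition can be illustrated with a small Python sketch. It simply concatenates the document root, the request URI and ".html", as %{DOCUMENT_ROOT}%{REQUEST_URI}.html does, and looks the result up in a pretend file listing. The names html_candidate and condition_passes and the file set are assumptions for this example.

```python
DOCUMENT_ROOT = "/var/www/example.com"

# Pretend file listing: the only two HTML files on the server.
EXISTING_FILES = {
    "/var/www/example.com/index.html",
    "/var/www/example.com/page.html",
}


def html_candidate(request_uri):
    """Build the path tested by %{DOCUMENT_ROOT}%{REQUEST_URI}.html."""
    return DOCUMENT_ROOT + request_uri + ".html"


def condition_passes(request_uri):
    """Mimic the -f test against the pretend file listing."""
    return html_candidate(request_uri) in EXISTING_FILES


for uri in ("/page", "/page/", "/page/subpage"):
    print(uri, "->", html_candidate(uri), condition_passes(uri))
```

Because the full request URI is used, /page/ and /page/subpage produce paths that do not exist, so the condition fails for them instead of matching page.html.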
Here is my entire .htaccess (I also added a rule to redirect the user from /index to /) :
# Redirect "/page.html" to "/page" (only if "/page.html" exists)
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{THE_REQUEST} /(.+)\.html [NC]
RewriteRule ^(.+)\.html$ /$1 [NC,R=301,L]
# redirect "/index" to "/"
RewriteRule ^index$ / [NC,R=301,L]
# Load "/page.html" when requesting "/page" (only if "/page.html" exists)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
RewriteRule ^ /%{REQUEST_URI}.html [QSA,L]
Here is a result example to help you understand all the cases :
Considering I have only 2 html files on my server (index.html & page.html)
www.example.com/index.html => redirects to www.example.com
www.example.com/index => redirects to www.example.com
www.example.com => renders /var/www/example.com/index.html
www.example.com/page.html => redirects to www.example.com/page
www.example.com/page => renders /var/www/example.com/page.html
www.example.com/page/subpage => returns 404 not found
www.example.com/index.html/ => returns 404 not found
www.example.com/page.html/ => returns 404 not found
www.example.com/test.html => returns 404 not found
No more 500 errors 🚀
Also, just to help you debug your redirections, consider disabling the network cache in your browser (old 301 redirections may be cached, which can cause some headaches 😅).
First, create a .htaccess file and set its contents to:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Next, remove .html from all links to your files (e.g. test.html becomes just test). If you open one file from another, likewise drop .html from the link and use just the file name.
Use a hash tag.
May not be exactly what you want but it solves the problem of removing the extension.
Say you have a html page saved as about.html and you don't want that pesky extension you could use a hash tag and redirect to the correct page.
switch(window.location.hash.substring(1)){
case 'about':
window.location = 'about.html';
break;
}
Routing to yoursite.com#about will take you to yoursite.com/about.html. I used this to make my links cleaner.
To remove the .html extension from your URLs, you can use the following code in your root .htaccess file:
# mod_rewrite starts here
RewriteEngine On
# Does not apply to existing directories: if the folder exists on the server, don't change anything and don't run the rule.
RewriteCond %{REQUEST_FILENAME} !-d
# Check that a file with the .html extension exists
RewriteCond %{REQUEST_FILENAME}\.html -f
# Here we actually serve the page that has the .html extension
RewriteRule ^(.*)$ $1.html [NC,L]
Thanks
For this, you have to rewrite the URL from /page.html to /page
You can easily implement this on any extension like .html .php etc
RewriteRule ^(.*)\.html$ $1.html [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]
You will get a URL something like this:
example.com/page.html to example.com/page
Please note both URLs below will be accessible
example.com/page.html and example.com/page
If you don't want to show page.html
Try this
RewriteRule ^(.*)\.html$ $1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]
More info here
If you have a small static website and HTML files are in the root directory.
Open every HTML file and make the next changes:
Replace href="index.html" with href="/".
Remove .html in all local links. For example, href="about.html" should become href="about".
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /html/(.*)\.html\ HTTP/
RewriteRule .* https://example.com/html/%1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /html/(.*)\ HTTP/
RewriteRule .* %1.html [L]
It might work for you, because it's working in my case.
RewriteRule /(.+)(\.html)$ /$1 [R=301,L]
Try this :) don't know if it works.

.htaccess conflicting rewrite rules

I'm trying to change a website to multi-language, so I have URLs like this:
www.company.com/en/about
www.company.com/fr/about
which should point to index.php?lang=en&what=about
so I defined the following rewrite rule (which works)
RewriteRule ^en/(.*)$ ?lang=en&what=$1 [NC,L]
RewriteRule ^fr/(.*)$ ?lang=fr&what=$1 [NC,L]
but I also need the homepage url as www.company.com/en (pointing to index.php?lang=en)
which does not work for this rule.
The best solution would be something like
RewriteRule ^(.*)/(.*)$ ?lang=$1&what=$2 [NC,L]
but it converts all the urls, like href='css.css' kind of references, so it messes up the whole page.
so how should I restrict the first GET variable to be two chars? or one of the defined languages?
Try:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]{2})(?:/(.*)|)$ /?lang=$1&what=$2 [L]
The first grouping, ([a-z]{2}), captures the 2 letter language. The second optional grouping captures the "what". If there's nothing there, then "what" will be blank.
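The behaviour of the optional second group can be checked with a short Python sketch. It models only the pattern and the substitution; rewrite_lang_what is a made-up name. Where Python reports an unmatched group as None, the sketch substitutes an empty string to mirror the blank "what" described above.

```python
import re

# Pattern from the answer: two letters, then optionally "/" plus the rest.
LANG_RULE = re.compile(r"^([a-z]{2})(?:/(.*)|)$")


def rewrite_lang_what(path):
    """Return the rewritten query string, or None if the rule does not match."""
    match = LANG_RULE.match(path)
    if match is None:
        return None
    lang = match.group(1)
    what = match.group(2) or ""  # unmatched group -> blank "what"
    return f"/?lang={lang}&what={what}"


print(rewrite_lang_what("en"))        # homepage: "what" stays blank
print(rewrite_lang_what("fr/about"))  # language plus page
print(rewrite_lang_what("css.css"))   # not a language URL, so None
```

Note that the pattern accepts any two lowercase letters. To limit it to the defined languages, the first group could be written as an alternation such as (en|fr).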

Why use RewriteCond and RewriteRule to test a url condition?

In many rewrite rule answers, when testing the url for a certain condition, I often see a mix of using RewriteCond with REQUEST_URI, and the RewriteRule itself.
Is this just personal preferences, or is there a performance reason, or just clarity of the rules? All of these are valid reasons, in my opinion; I'm just wondering if there's a particular reason.
I know there are conditions where RewriteCond is the only choice. I'm interested here in the cases where the RewriteRule would also work. Generally these are simpler rules.
Here are some examples:
EXAMPLE 1
This answer has a common pattern, allow certain folders as-is.
Htaccess maintenance mode allow certain directories
# always allow these folders
RewriteCond %{REQUEST_URI} ^/display_me_always [OR]
RewriteCond %{REQUEST_URI} ^/another_folder [OR]
RewriteCond %{REQUEST_URI} ^/even_more_folders
RewriteRule .+ - [L]
This could be done with just a RewriteRule as:
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders) - [L]
(Added the ?: for non-capturing. I'm never quite sure if it's faster to have a simpler rule, or not to capture.)
EXAMPLE 2
Modifying this for a more common scenario, redirect certain folders to other folders.
RewriteCond %{REQUEST_URI} ^/display_me_always [OR]
RewriteCond %{REQUEST_URI} ^/another_folder [OR]
RewriteCond %{REQUEST_URI} ^/even_more_folders
RewriteRule ^/[^/]+/(.*)$ /other_location/$1 [L]
This seems simpler with the rule only.
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders)/(.*)$ /other_location/$1 [R=301,NC]
EXAMPLE 3
This answer has a common pattern, redirect if not already in the target location.
Mod Rewrite rule to redirect all pdf request to another location. Test the folder with the RewriteCond, then test the file with the rule.
I can see a negative condition being much clearer to do with RewriteCond, but it's still possible with RewriteRule.
RewriteCond %{REQUEST_URI} !^/web_content/pdf/
RewriteRule ^(.+\.pdf)$ /web_content/pdf/$1 [L]
This could be written as
RewriteRule ^(?!/web_content/pdf/)(.+\.pdf)$ /web_content/pdf/$1 [L]
One simple reason: it's much easier to understand for someone who is not a regex guru.
The rewrite rule you give an OP today may need modifying a day later, and the easier the rule is to understand, the better the chance that they will look into it themselves rather than come back here for yet another small fix or a new, similar rule.
Yes, the combined rule is faster -- no doubt about it. But the time spent on URL rewriting is still so small compared to a single script execution that it can only make a difference on very CPU-busy servers with a lot of such rewrites.
Therefore, the best option is something in between, which is still easy to read while being compact and efficient. Instead of
# always allow these folders
RewriteCond %{REQUEST_URI} ^/display_me_always [OR]
RewriteCond %{REQUEST_URI} ^/another_folder [OR]
RewriteCond %{REQUEST_URI} ^/even_more_folders
RewriteRule .+ - [L]
offer
# always allow these folders
RewriteCond %{REQUEST_URI} ^/(display_me_always|another_folder|even_more_folders)
RewriteRule .+ - [L]
You cannot become a master in a day or two (unless, maybe, you are some kind of genius) -- everything takes time. And the more practical experience you have (from modifying these rules yourself), the easier it is to move on and produce more efficient and stable rules.
BTW:
This rule will not work if placed in .htaccess (URL in matching pattern starts with no leading slash):
RewriteRule ^/(?:display_me_always|another_folder|even_more_folders) - [L]
But will work fine if placed in server config / virtual host context -- that's one of the "nuances" you need to know.
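The leading-slash nuance can be shown with a small Python sketch. It assumes a .htaccess file in the document root, so the string handed to the RewriteRule pattern is the URL path with its leading slash removed, while in server config or virtual host context the full path is used. The function names are invented for this illustration.

```python
import re

FOLDERS = "display_me_always|another_folder|even_more_folders"
WITH_SLASH = re.compile(rf"^/(?:{FOLDERS})")
WITHOUT_SLASH = re.compile(rf"^(?:{FOLDERS})")


def matches_in_vhost(pattern, url_path):
    """Server/vhost context: the pattern sees the full URL path."""
    return pattern.search(url_path) is not None


def matches_in_htaccess(pattern, url_path):
    """Root .htaccess (assumed): the leading slash is stripped first."""
    return pattern.search(url_path[1:]) is not None


url = "/display_me_always/img.png"
print(matches_in_vhost(WITH_SLASH, url))        # full path starts with "/"
print(matches_in_htaccess(WITH_SLASH, url))     # stripped path has no "/" to match
print(matches_in_htaccess(WITHOUT_SLASH, url))  # pattern without the slash matches
```

So for .htaccess use, the same rule would need its pattern written without the leading slash.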
In general, I'd say you're right: RewriteCond is really largely for use with matching items OTHER than the REQUEST_URI. I'd expect that most of the cases you're looking at are in an .htaccess file. From the spec:
If you wish to match against the full URL-path in a per-directory (htaccess) RewriteRule, use the %{REQUEST_URI} variable in a RewriteCond.

Excluding a script from the general UrlRewrite rules

I have the following rewrite rules for a website:
RewriteEngine On
# Stop reading config files
RewriteCond %{REQUEST_FILENAME} .*/web.config$ [NC,OR]
RewriteCond %{REQUEST_FILENAME} .*/\.htaccess$ [NC]
RewriteRule ^(.+)$ - [F]
# Rewrite to url
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !^(/bilder_losning/|/bilder/|/gfx/|/js/|/css/|/doc/).*
RewriteRule ^(.+)$ index.cfm?smartLinkKey=%{REQUEST_URI} [L]
Now I have to exclude a script, including any query string it may carry, from the above rules, so that I can access and execute it in the normal way. At the moment the whole URL is ignored and forwarded to the index page.
I need to have access to the script shoplink.cfm in the root which takes variables tduid and url (shoplink.cfm?tduid=1&url=)
I have tried to resolve it using this:
# maybe?:
RewriteRule !(^/shoplink.cfm [QSA]
but to be honest, I have not much of a clue of urlrewriting and have no idea what I am supposed to write. I just know that above will generate a nice 500 error.
I have been looking around a lot on Stack Overflow and other websites on the same subject, but all I see is people trying to exclude directories, not files. In the worst case I could move the script to a separate directory and exclude that directory from the rewrite rules, but I'd rather not, since the script should really remain in the root.
Just also tried:
RewriteRule ^/shoplink.cfm$ $0 [L]
but that didn't do anything either.
Anyone who can help me out on this subject?
Thanks in advance.
Steven Esser
ColdFusion programmer
Please try to put the following line at the top of your config (after RewriteEngine on):
RewriteRule ^shoplink.cfm$ - [L]

Invisible .htaccess Redirect from /public_html/ to /public_html/folder

I need to point the root domain of my hosting account to a subdirectory (joomla). I want this to be invisible (i.e. browser address bar doesn't change). Also, I need this to work when a user hits the root or a subfile/subfolder.
I've tried the following rules, which work individually, but I can't get them to work together.
This one works when no subfile/subfolder is specified:
RewriteEngine On
RewriteRule ^$ /joomla/ [L]
And this one works when a subfile/subfolder IS specified:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.+)$ /joomla/$1 [L]
I just can't figure out how to combine them.
RewriteEngine On
RewriteRule ^(.*)$ /joomla/$1 [L]
Should work (untested). The key difference between this and your second attempt is the + vs *. The + will match one or more, whereas the * will match 0 or more, so this should work also when no file/subdirectory is specified.
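The + versus * difference is easy to confirm with Python's re module. In this sketch the empty string stands for a request to the site root as a root .htaccess rule would see it (the same case the question's ^$ rule matches); rule_matches is a hypothetical helper.

```python
import re


def rule_matches(pattern, path):
    """Return True if the rule pattern matches the per-directory path."""
    return re.search(pattern, path) is not None


print(rule_matches(r"(.+)$", ""))                 # "+" needs at least one character
print(rule_matches(r"^(.*)$", ""))                # "*" also accepts the empty path
print(rule_matches(r"(.+)$", "images/logo.png"))  # both match a non-empty path
```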
This should do the trick:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /joomla/$1 [L]
.* will also match an empty string. You also more than likely want to do the -d check to make sure that they aren't accessing a directory that exists (though, thinking about it, this might mess with the / matching, I don't know).

Resources