Force removal of index.php with .htaccess - .htaccess

I'm currently using the following to rewrite http://www.site.com/index.php/test/ to also work directly with http://www.site.com/test/, but I would like to not only allow the second version, I would like to FORCE the second version. If a user goes to http://www.site.com/index.php/test/ it should immediately reroute them to http://www.site.com/test/. index.php should never appear in a url. Stipulation: this should only apply to the first index.php. If I have a title like http://www.site.com/index.php/2011/06/08/remove-index.php-from-urls/ it should leave the second index.php, as it is part of the URL.
Current rule that allows but does not force:
#Remove index.php
RewriteCond $1 !^(index.php|images|css|js|robots.txt)
RewriteRule ^(.*)$ /index.php/$1 [L]
Thanks.

As you wrote, if a user goes to http://www.site.com/index.php/test/, this rule will immediately redirect them to http://www.site.com/test/
RedirectMatch 301 /index.php/(.*)/$ /$1
I'm not sure if that is what you need as your current rewrite rule is opposite to mine.

First (and wrong) answer - see below
You can accomplish a redirection with these directives (in this order):
RewriteCond %{REQUEST_URI} ^/index\.php
RewriteRule ^index\.php/(.+)$ /$1 [R,L]
RewriteCond $1 !^(index.php|images|css|js|robots.txt)
RewriteRule ^(.*?)$ /index.php/$1 [L]
That will first redirect all the requests that begin with index.php to the corresponding shortened url, then silently serve index.php/etc with the second rule.
EDIT - Please read on!
In fact, the solution above generates an infinite redirection loop, because Apache takes the following actions (let's say we request /index.php/abc):
1. The first RewriteCond matches.
2. Apache redirects [R], that is, it makes the client issue a new HTTP request for /abc.
3. /abc fails the first RewriteCond.
4. /abc matches the second RewriteCond.
5. Apache does not redirect but rewrites this URI (so it makes a "hidden" internal request) to /index.php/abc. We are back at point 1, so that's a loop.
Please note...
By using the [L] (last rule) flag, we can only tell Apache not to process more rewrite rules, and only if the current rule matches. Since a new HTTP request is made, there is no information about how many redirections we have been through so far. So one of the two rules matches every time, and either way a new request is generated (hence the loop).
Using the [C] (chain rules) flag is pointless because it makes Apache process a rule only if the previous rule matches, while the two rules we have are mutually exclusive.
Using the [NS] (not if subrequest) flag on rule #1 is not an option either, because it simply does not apply to our case (see the Apache RewriteRule docs about it).
Setting environment variables is not an option (alas), since a new request is made at point 2, which destroys all the environment variables we set.
An alternative solution is to rewrite e.g. /abc to /index.php?path=abc. That is done by these rules (please delete your similar RedirectMatch rule before adding these):
RedirectMatch ^/index\.php(/.*) $1
RewriteCond %{REQUEST_URI} !^/(index.php|images|css|js|robots.txt|favicon.ico)
RewriteRule ^(.+) /index.php?path=$1 [L,QSA]
I don't know the internals of CodeIgniter's scripts, but like most MVC frameworks it probably reads $_SERVER['PATH_INFO'] to work out which page is requested. You could slightly modify the code that recognizes the page like this (I assumed that the page path is stored in the $page var):
$page = $_SERVER['PATH_INFO'];
if(isset($_GET['path']) && strlen($_GET['path'])) $page = $_GET['path']; // Add this line
This won't break the previous code, and it accomplishes what you asked for.
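Another commonly used way to avoid the loop described above, without touching the application code, is to test %{THE_REQUEST}. That variable holds the request line exactly as the browser sent it and is not changed by internal rewrites. The following is only a sketch: it assumes the .htaccess file sits in the document root and that mod_rewrite is enabled, and it has not been tested against the asker's setup.

```apache
RewriteEngine On

# Redirect only if the browser itself asked for /index.php/...
# Internal rewrites to /index.php/... do not change THE_REQUEST,
# so they cannot trigger this rule again.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index\.php/ [NC]
RewriteRule ^index\.php/(.*)$ /$1 [R=301,L]

# Silently route everything else through index.php (the asker's original rule).
RewriteCond $1 !^(index\.php|images|css|js|robots\.txt)
RewriteRule ^(.*)$ /index.php/$1 [L]
```

Because the pattern is anchored at the start of the path, only the leading index.php is removed; an index.php that appears later in the URL, as in the blog title example, is left alone.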

Related

.htaccess Redirect + Rewrite in one rule

I have a rewrite rule to hide index.php, which is working fine.
RewriteCond $1 !^(index\.php)
RewriteRule ^(.*)$ index.php/$1 [L,QSA]
I am currently redirecting a specific sub-domain to another domain.
RewriteCond %{HTTP_HOST} ^(www\.)?deutschland\.example\.com$ [NC]
RewriteRule ^ http://www.example.de%{REQUEST_URI} [NE,R=301,L]
The redirect is working fine, but now I am getting index.php in the URL too, which is coming in the REQUEST_URI.
http://www.example.de/index.php/search/result
So how can I remove 'index.php' from this redirected URL?
Note: It's the same PHP website application, just using multiple country-specific domains.
(1) Rule order is important. (2) The [L] (last) flag doesn't mean last; it means last on this cycle. (From Apache 2.4 the [END] flag does what you might think [L] does. See my Tips for debugging .htaccess rewrite rules for more discussion of this.) So in this case rule (1) fires, then mod_rewrite loops around again, and this time rule (2) fires, giving what you find.
Swap the two rules around and it will work as expected.
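For illustration, the swapped order would look roughly like this, using the two rules from the question unchanged. The external redirect is now evaluated before any internal rewrite has added index.php to the path:

```apache
# 1. External redirect first: REQUEST_URI is still the path the browser asked for.
RewriteCond %{HTTP_HOST} ^(www\.)?deutschland\.example\.com$ [NC]
RewriteRule ^ http://www.example.de%{REQUEST_URI} [NE,R=301,L]

# 2. Internal rewrite that hides index.php, only reached when no redirect happened.
RewriteCond $1 !^(index\.php)
RewriteRule ^(.*)$ index.php/$1 [L,QSA]
```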

Surprising rewriting of URL by htaccess rule

I've narrowed down my problem and have a specific question.
With only the following code in the .htaccess, why does index2.php get called if I type www.mysite.com/url2 as my URL?
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
I've also tested it at http://www.regextester.com, and the URL should not be replaced with index2.php.
In the end I want this rule to skip any URL starting with /url2 or /url2/*.
EDIT: I've made screen recording of this problem: http://screenr.com/BBBN
You have this in your .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
What does it do? It rewrites anything that ends with html, htm, feed, pdf or raw to index2.php. So, if the rule fires for a URL that does not appear to end with one of those extensions, there are two possible explanations:
There is another rewrite rule in an .htaccess in upper directories (or in server config files) that causes the URL to be rewritten.
Your URL actually ends with one of those extensions. Keep in mind that what you enter in your address bar may be changed and rewritten. For example, if you enter www.mysite.com/url2 in your address bar and that file doesn't exist on the server, your server will try to load the proper error document. So, if your error document is /404.html, it will be rewritten to index2.php in the end.
Update:
I think that's the case. Create a file named 404.php in your document root. Inside your main .htaccess (in your document root), put this:
ErrorDocument 404 /404.php
Delete all other ErrorDocument directives.
Inside 404.php, put this:
<?php
echo 'From 404.php file';
?>
Logic behind it:
When mod_rewrite behaves strangely, the best approach in my experience is to use the rewrite log. To enable it, put this in your virtual host or other server configuration:
RewriteLogLevel 9
RewriteLog "logs/RewriteLog.log"
Be careful: the code above enables the rewrite log at the highest possible level (logging everything). It will slow your server down and the log file will become huge very quickly. Do this only on your dev server.
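Note that RewriteLog and RewriteLogLevel exist only in Apache 2.2 and earlier. On Apache 2.4 the same information is written to the error log and is controlled through the LogLevel directive. A minimal equivalent, assuming you can edit the server or virtual host configuration (LogLevel is not allowed in .htaccess), might be:

```apache
# Apache 2.4+: mod_rewrite tracing goes to the ErrorLog.
# trace1 is the least verbose level and trace8 the most verbose.
LogLevel alert rewrite:trace3
```

The same warning applies: higher trace levels slow the server down noticeably, so use them only while debugging.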
Explanation: When you try to access www.mysite.com/url2, Apache gives your URL to the rewrite module. The rewrite module checks whether any of the RewriteRules applies to your URL. Because you have one rule and it doesn't apply to your URL, Apache tries to load the normal file. But this file does not exist, so Apache takes the next step, which is showing the proper error message. When you set a custom error file, Apache runs the test against the new address. For example, if the error document is /404.html, Apache checks whether your rule applies to /404.html or not. Since it does, it rewrites it.
The point to remember is that Apache will do this every time there is a change in the URL, whether the change is made by the rewrite module or not!
The rule you list should work as you expect if this is the only rule. Fact is that theory is fun, but apparently it doesn't work as expected. Please note that . will match ANY CHARACTER. If you want to match the full stop/period character, you'll need to escape it. That's why I use \.(html|htm|feed|pdf|raw)$ instead of (.html|.htm|.feed|.pdf|.raw)$ below.
You can add another RewriteCond that simply doesn't match if the url starts with /url2, like below. This might not be a viable solution if there are lots of urls that shouldn't be matched.
RewriteCond %{REQUEST_URI} !^/url2
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteRule (.*) index2.php [L]
To get a better understanding of what is happening, you can alter the rule to something like this. Now simply enter the URLs you don't want to be matched in the address bar and inspect the address bar after the redirect happens. The url parameter now shows which URL actually triggered this rule. This screencast shows you a similar version working with a sneaky RewriteRule that is working away on the URL.
#A way of finding out what is -actually- matched
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteCond %{REQUEST_URI} !/foo
RewriteRule (.*) /foo?url=$1 [R,L]
You can decide to match the %{THE_REQUEST} variable instead. This will always contain the request itself. If something else is rewriting the url, this variable doesn't change, meaning you can use this to overwrite any changes. Make sure the url won't be matching itself. You would get something like below. An example screencast can be found here.
#If it doesn't end on .html/htm/feed etc, this one won't match
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /.*\.(html|htm|feed|pdf|raw)\ HTTP [NC]
RewriteCond %{REQUEST_URI} !^/index2\.php$
RewriteRule (.*) /index2.php [L]

htaccess redirect and mask

How would I redirect from the root folder to a sub folder and then mask that folder?
So instead of http://root.com/sub_folder
It would be just http://root.com
I have tried:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^root\.com$
RewriteRule (.*) http://root.com/$1 [R=301,L]
RewriteRule ^$ /sub [L]
However, that does not work. Any help will be welcome.
To clarify what I think you're looking for:
You want users who enter http://root.com with no trailing path to be rewritten silently to http://root.com/sub.
If a user directly enters http://root.com/sub, however, you want them to be redirected to http://root.com.
Any other path within root.com should be left alone.
The following two rules accomplish this. If you have more than one domain and only want this to apply to one domain, add your original RewriteCond in front of each RewriteRule.
RewriteRule ^sub/?$ http://root.com/ [R=301,L]
RewriteRule ^$ /sub [END]
First rule redirects /sub with or without trailing slash to root.com. Second rule rewrites base domain to /sub.
EDIT: Per Jon Lin's comment, below, the [L] flag only stops the current round of processing, and internal rewrites are sent through the rules once more (I always forget that part). So, you can terminate the second line with [END] instead, which stops all rewrite processing. The catch is that [END] is only available in Apache 2.4 or higher, so if you're on an older version something trickier will need to be done.
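For servers older than 2.4, one possible workaround (not from the original answer, and untested here) is to guard the redirect with %{THE_REQUEST}, which always contains the request line the browser actually sent. The sketch assumes sub is a real directory under the document root, which is why the rewrite target carries a trailing slash:

```apache
RewriteEngine On

# Redirect /sub only when the browser itself requested it.
# The internal rewrite below does not change THE_REQUEST,
# so it cannot bounce back into this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/sub/?(\?|\s)
RewriteRule ^sub/?$ http://root.com/ [R=301,L]

# Silently serve the sub folder for the bare domain.
RewriteRule ^$ /sub/ [L]
```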

Mod rewrite to redirect except on a certain page

Trying to write a rewrite rule in my htaccess so that any request to /en/shop/index.html or /fr/shop/index.html stays on the server, but if the user goes to any other page it redirects to a different server. Here's what I've got so far and it doesn't work.
RewriteRule ^(.*)/shop/(.*) [L]
RewriteRule ^(.*)$ http://www.newwebsite.com/$1 [R=301]
Add a dash to tell the first RewriteRule that you want the matches to be passed through unchanged:
RewriteRule ^.*/shop(/.*)?$ - [L]
I also removed the first set of parentheses since you're not using the results of the match so there's no need to store the matched patterns. I assumed you might need to match /shop without a trailing slash so that's why the (/.*)? section is there.
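Put together with the redirect from the question, the rule set might look like the sketch below. The [L] on the redirect is an addition for clarity. If only the two index.html pages should stay on this server, a tighter pattern such as ^(en|fr)/shop/index\.html$ could be used in the first rule instead; that variant is an assumption based on the question's wording.

```apache
RewriteEngine On

# Anything under .../shop is passed through unchanged and processing stops.
RewriteRule ^.*/shop(/.*)?$ - [L]

# Every other request is redirected to the other server.
RewriteRule ^(.*)$ http://www.newwebsite.com/$1 [R=301,L]
```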

Redirect to fallback file if first attempt fails

I have this in my .htaccess:
RewriteRule ^images/([^/\.]+)/(.+)$ themes/current/images/$1/$2 [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^images/([^/\.]+)/(.+)$ modules/$1/images/$2 [L,NC]
The idea is that it does the following:
// Rewrite this...
images/calendar/gear.png
// ... to this
themes/current/images/calendar/gear.png
// HOWEVER, if that rewritten path doesn't exist, rewrite the original URL to this:
modules/calendar/images/gear.png
The only things that change here are calendar and gear.png, the first of which could be any other single word and the latter the file name (possibly with path) to an image file.
I can rewrite the original URL to the first rewrite as shown in the example just fine, but what I cannot do is get my .htaccess to serve up the file from the other, fallback location if the first location 404s. I was under the impression that not using [L] in my first RewriteRule would rewrite the URL for RewriteCond.
The problem I'm having is that instead of serving the fallback file, the browser just shows a 404 for the first rewritten path (themes/current/images/calendar/gear.png), instead of falling back to modules/calendar/images/gear.png. What am I doing wrong?
Please note that my regex isn't perfect, but I can refine that later. Right now I'm concerning myself with the rewrite logic itself.
Fallthrough rules are fraught with bugs. My general recommendation is that any rule with a replacement string other than - should trigger an internal redirect to restart the .htaccess parse. This avoids the subrequest and URI_PATH bugs.
Next, once you get to a 404, in my experience this is unrecoverable. I have a fragment which does something similar to what you are trying to do:
# For HTML cacheable blog URIs (a GET to a specific list, with no query params,
# guest user and the HTML cache file exists) then use it instead of executing PHP
RewriteCond %{HTTP_COOKIE} !blog_user
RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
RewriteCond %{ENV:DOCUMENT_ROOT_REAL}/blog/html_cache/$1.html -f
RewriteRule ^(article-\d+|index|sitemap.xml|search-\w+|rss-[0-9a-z]*)$ \
blog/html_cache/$1.html [L,E=END:1]
Note that I do the conditional test in filesystem space and not URI (Location) space. So this would map in your case to
RewriteCond %{DOCUMENT_ROOT}/themes/current/images/$1/$2 -f
RewriteRule ^images/(.+?)/(.+)$ themes/current/images/$1/$2 [L]
Do run phpinfo(), though, to check whether your hosting provider uses an alternative to DOCUMENT_ROOT if it is a shared hosting offering, e.g. an alternative environment variable; mine uses DOCUMENT_ROOT_REAL.
The second rule will be picked up on the second processing pass after the internal redirect.
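As a rough sketch of the whole idea, both locations can be guarded by a file-existence test in filesystem space, so a request is only rewritten to a path that actually exists. This assumes %{DOCUMENT_ROOT} points at the real document root on your host (see the note above) and reuses the patterns from the question:

```apache
# Serve the themed image if that file exists on disk.
RewriteCond %{DOCUMENT_ROOT}/themes/current/images/$1/$2 -f
RewriteRule ^images/([^/.]+)/(.+)$ themes/current/images/$1/$2 [L,NC]

# Otherwise fall back to the module's own image, again only if it exists.
RewriteCond %{DOCUMENT_ROOT}/modules/$1/images/$2 -f
RewriteRule ^images/([^/.]+)/(.+)$ modules/$1/images/$2 [L,NC]
```

Neither rewritten path starts with images/, so the rules do not match again when the .htaccess is re-parsed after the internal redirect. If neither file exists, the request is left as it is and results in an ordinary 404.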
