Htaccess - Rewrite engine (reverse engineering a line of code)

On a site I'm working on, if you enter the url, plus 1 directory, the htaccess adds a trailing slash.
So, this: http://www.mysite.com/shirts
Becomes this: http://www.mysite.com/shirts/
The htaccess that runs the site is quite long and complex, so it's not easy to find or test which rule is causing the rewrite. I was able to track down the issue to this line of code (I think):
RewriteRule (.*) http://www.mysite.com/$1 [R=301,L]
Does this rule match the behavior I'm describing above? It seems to be the cause, but it doesn't make logical sense to me. I don't understand where the trailing slash is coming from.
Can someone shed some light on this for me? Thanks in advance.
Edit: MORE:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^mysite\.com$
RewriteRule (.*) http://www.mysite.com/$1 [R=301,L]

By default Apache adds the trailing / to directory URLs. To turn that off, use:
DirectorySlash Off
This disables the behavior, which is caused by mod_dir; the mod_dir documentation covers it in more detail.
However, if you're trying to remove the / to fix images not showing, that is not the right way to do it. You should instead use the HTML base tag, for example:
<BASE href="http://www.yourdomain.com/">
See the HTML documentation for the base element for details.
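As a minimal sketch of the DirectorySlash approach, assuming mod_dir is loaded and overrides are allowed for the directory: the Apache documentation warns that with the slash redirect turned off and directory listings enabled, a request without the trailing slash can list the directory contents instead of serving its index file. The Options line below is an addition for illustration and is not part of the answer above.

```apache
# Stop mod_dir from redirecting /shirts to /shirts/
<IfModule mod_dir.c>
    DirectorySlash Off
</IfModule>

# Assumption: directory listings are not wanted, so keep mod_autoindex
# from exposing directory contents on slash-less requests
Options -Indexes
```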
Your current rule as you have updated on your question:
RewriteCond %{HTTP_HOST} ^mysite\.com$
RewriteRule (.*) http://www.mysite.com/$1 [R=301,L]
Means:
if domain on the URL is only mysite.com
redirect current URL to domain with www.
So an example of it would be, if you access:
http://domain.com/blog/some_blog_article
It will redirect the user to:
http://www.domain.com/blog/some_blog_article
Note how it retains everything and only adds the www. to the domain.
If you really want to add the trailing slash regardless, here is one way to do it:
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
RewriteRule (.*) http://www.mysite.com/$1 [R=301,L]
# check if it is a directory
RewriteCond %{REQUEST_FILENAME} -d
# check if the ending `/` is missing and redirect with slash
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]
# if file or directory does not exist
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# and we still want to append the `/` at the end
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]

Related

Trailing slash issue using mod_rewrite for SEO friendly URL

I consider myself reasonably competent with PHP. I am, however, completely and totally lost when it comes to mod_rewrite.
I have a URL structure that works like the following:
http://site/something/something-else/the-actual-page/
that redirects to:
http://site/index.php?page=the-actual-page
It's only ever the final 'folder' that is passed to the script. The preceding 'folders' (if any) are for SEO and structure purposes.
If there is a preceding folder "promotion" then it redirects to a separate file. This is along the lines of:
http://site/promotion/campaign-name/
redirecting to
http://site/promotion.php?campaign=campaign-name
I'm using the following code to achieve this:
<IfModule mod_rewrite.c>
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^promotion/(.*)/$ promotion.php?params=$1 [L]
RewriteRule ^(.*) index.php?page=$1 [L]
</IfModule>
This works as intended, with links redirecting properly EXCEPT when there is no trailing slash. For example
http://site/something/thepage/
will work, whilst
http://site/something/thepage
will not.
To solve this problem I'm attempting to set up a 301 that redirects any URI without a trailing slash to a URI with a trailing slash.
The code below (placed above the other rules) works to a degree, but I lose folder data.
Code:
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]
The problem?
http://site/something/thepage
redirects to
http://site/thepage/
I'm afraid all the googling in the world is not helping me, as I cannot wrap my brain around mod_rewrite at all!
Appreciate any help.
You'll be better off using this:
RewriteEngine On
RewriteCond %{REQUEST_URI} /+[^\.]+$
RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=301,L]
# ... your other rewrites
If you'd like to reverse it and strip the trailing slash instead, then use this:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [R=301,L]
You'd then need to change your rewrite accordingly:
RewriteRule ^promotion/(.*)$ /promotion.php?params=$1 [L]
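For the add-the-slash variant, a rough sketch of how the redirect and the question's two internal rewrites might fit together is shown here. It assumes the .htaccess sits in the document root, that the promotion script reads a parameter named campaign (the question's prose says campaign while its code says params, so adjust to whichever the script expects), and that only the last path segment should reach index.php.

```apache
RewriteEngine On
RewriteBase /

# 301 to the slash-terminated URL, keeping the full path
# (real files such as index.php or images are left alone)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]

# /promotion/campaign-name/ -> promotion.php?campaign=campaign-name
RewriteRule ^promotion/([^/]+)/$ promotion.php?campaign=$1 [L]

# /something/something-else/the-actual-page/ -> index.php?page=the-actual-page
# (?:.*/)? skips any preceding folders; only the last segment is captured
RewriteRule ^(?:.*/)?([^/]+)/$ index.php?page=$1 [L]
```

With these rules, http://site/something/thepage should redirect to http://site/something/thepage/ and then be served by index.php?page=thepage.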

htaccess rewrite for subdomain only

I'm trying to rewrite some parameters to beautiful links, but for a subdomain / a folder only. Unfortunately I can't get it to work, maybe also because there are some other rewrites in line before...
Here's my code:
<IfModule mod_rewrite.c>
# NON-WWW TO WWW
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
# WORDPRESS-BLOG
Options +FollowSymlinks
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# REDIRECT FOR SUBDOMAIN
RewriteCond %{HTTP_HOST} ^subdomain.example.com
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/.]+)(?:/)?$ index.php?cshort=$1 [L]
RewriteRule ^([^/.]+)/([^/.]+)(?:/)?$ /index.php?cshort=$1&cid=$2 [L]
RewriteRule ^([^/.]+)/([^/.]+)/([^/.]+)(?:/.*)?$ /index.php?cshort=$1&cid=$2&step=$3 [L]
</IfModule>
Basically only the last part is the one I want to rewrite to change URLs from something like
http://subdomain.example.com/index.php?cshort=abc&cid=123&step=1 to http://subdomain.example.com/abc/123/1
The other rewriting rules for www.example.com shouldn't be affected. Unfortunately my current code only applies the first two rule sets, for the www redirect and the blog, but nothing happens on the subdomain. What's wrong in my code?
When you say that you want to rewrite from http://subdomain.example.com/index.php?cshort=abc&cid=123&step=1 to http://subdomain.example.com/abc/123/1 you mean that you want the user to enter the pretty URL and to have it serve the full URL in the background, not that you want to redirect from the ugly to the pretty URL, right?
In your RewriteRules, what are you trying to accomplish with "(?:/)?"? That is just a non-capturing optional slash, which is a long way of writing "/?". If you're just trying to match whether or not the directory path ends with a slash, you can do that as follows:
RewriteRule ^([^/.]+)/?$ index.php?cshort=$1 [L]
EDIT: Additional suggestions:
Move the "Redirect for subdomain" section above the "Wordpress Blog" section. Since the Wordpress rule applies to "everything that's not a real file or directory, regardless of domain" that should go last.
RewriteConds only apply to a single RewriteRule that follows them. For each of the three rules you have listed under "Redirect for subdomain", after updating them per the above suggestion, you need to repeat the two RewriteCond lines in front of the RewriteRule.
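Applied to the question's rules, the two suggestions above might look roughly like this. It is a sketch that assumes the subdomain and the blog share one document root and one .htaccess file; the www redirect from the question is left out for brevity, and the host pattern is anchored and escaped, which the original was not.

```apache
RewriteEngine On
RewriteBase /

# REDIRECT FOR SUBDOMAIN: comes first, and every rule repeats its conditions
RewriteCond %{HTTP_HOST} ^subdomain\.example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/.]+)/?$ index.php?cshort=$1 [L]

RewriteCond %{HTTP_HOST} ^subdomain\.example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/.]+)/([^/.]+)/?$ index.php?cshort=$1&cid=$2 [L]

RewriteCond %{HTTP_HOST} ^subdomain\.example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/.]+)/([^/.]+)/([^/.]+)(?:/.*)?$ index.php?cshort=$1&cid=$2&step=$3 [L]

# WORDPRESS-BLOG: the catch-all goes last
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

Because [^/.]+ excludes dots, the rewritten index.php does not match the subdomain patterns again on the next pass and is stopped by the ^index\.php$ rule.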

mod rewrite to remove file extension, add trailing slash, remove www and redirect to 404 if no file/directory is available

I would like to create rewrite rules in my .htaccess file to do the following:
When accessed via domain.com/abc.php: remove the file extension, append a trailing slash and load the abc.php file. url should look like this after rewrite: domain.com/abc/
When accessed via domain.com/abc/: leave the url as is and load abc.php
When accessed via domain.com/abc: append trailing slash and load abc.php. url should look like this after rewrite: domain.com/abc/
Remove www
Redirect to 404 page (404.php) when accessed url doesn't resolve to folder or file, e.g. when accessing either domain.com/nothingthere.php or domain.com/nothingthere/ or domain.com/nothingthere
Make some permanent 301 redirects from old urls to new ones (e.g. domain.com/abc.html to domain.com/abc/)
All php files sit in the document root directory, but if there is a solution that would make urls such as domain.com/abc/def/ (would load domain.com/abc/def.php) also work it would be great as well, but not necessary
So here is what I have at the moment (thrown together from various sources and samples from around the web):
<IfModule mod_rewrite.c>
RewriteCond %{HTTPS} !=on
# redirect from www to non-www
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]
# remove php file extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{THE_REQUEST} ^GET\ /[^?\s]+\.php
RewriteRule (.*)\.php$ /$1/ [L,R=301]
# add trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*[^/]$ /$0/ [L,R=301]
# resolve urls to matching php files
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*)/$ $1.php [L]
</IfModule>
With this the first four requirements seem to work, whether I enter domain.com/abc.php, domain.com/abc/ or domain.com/abc, the final url always ends up being domain.com/abc/ and domain.com/abc.php is loaded.
When I enter a url that resolves to a file that doesn't exist I'm getting an error 310 (redirect loop), when really a 404 page should be loaded. Additionally I haven't tried if subfolders work, but as I said, that's low priority. I'm pretty sure I can just slap the permanent 301 redirects for legacy urls on top of that without any issues as well, just wanted to mention it. So the real issue is really the non-working 404 page.
I've had problems with getting ErrorDocument to work reliably with rewrite errors, so I tend to prefer to handle invalid pages correctly in my rewrite cascade. I've tried to cover a full range of test vectors with this and didn't find any gaps.
Some general points:
You need to use the DOCUMENT_ROOT environment variable in this. Unfortunately if you use a shared hosting service then this isn't set up correctly during rewrite execution, so hosting providers set up a shadow variable to do the same job. Mine uses DOCUMENT_ROOT_REAL, but I've also come across PHP_DOCUMENT_ROOT. Do a phpinfo to find out what to use for your service.
There's a debug info rule that you can trim as long as you replace DOCROOT appropriately
You can't always use %{REQUEST_FILENAME} where you'd expect to. This is because if the URI maps to DOCROOT/somePathThatExists/name/theRest then the %{REQUEST_FILENAME} is set to DOCROOT/somePathThatExists/name rather than the full pattern equivalent to the rule match string.
This is "Per Directory" so no leading slashes and we need to realise that the rewrite engine will loop on the .htaccess file until a no-match stop occurs.
This processes all valid combinations and at the very end redirects to the 404.php which I assume sets the 404 Status as well as displaying the error page.
It will currently decode someValidScript.php/otherRubbish in the SEO fashion, but extra logic can pick this one up as well.
So here is the .htaccess fragment:
Options -Indexes -MultiViews
AcceptPathInfo Off
RewriteEngine On
RewriteBase /
## Looping stop. Not needed in Apache 2.3 as this introduces the [END] flag
RewriteCond %{ENV:REDIRECT_END} =1
RewriteRule ^ - [L,NS]
## Debug info rule, then 301 redirections ##
RewriteRule ^ - [E=DOCROOT:%{ENV:DOCUMENT_ROOT_REAL},E=URI:%{REQUEST_URI},E=REQFN:%{REQUEST_FILENAME},E=FILENAME:%{SCRIPT_FILENAME}]
# redirect from HTTP://www to non-www
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]
# remove php file extension on GETs (no point in /[^?\s]+\.php as rule pattern requires this)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_METHOD} =GET
RewriteRule (.*)\.php$ $1/ [L,R=301]
# add trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*[^/]$ $0/ [L,R=301]
# terminate if file exists. Note this match may be after internal redirect.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [L,E=END:1]
# terminate if directory index.php exists. Note this match may be after internal redirect.
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{ENV:DOCROOT}/$1/index.php -f
RewriteRule ^(.*?)/?$ $1/index.php [L,NS,E=END:1]
# resolve urls to matching php files
RewriteCond %{ENV:DOCROOT}/$1.php -f
RewriteRule ^(.*?)/?$ $1.php [L,NS,E=END:1]
# Anything else redirect to the 404 script. This one does have the leading /
RewriteRule ^ /404.php [L,NS,E=END:1]
Enjoy :-)
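On Apache 2.3.9 or later, the [END] flag mentioned in the comment above can replace the REDIRECT_END environment check. The following is only a sketch of the terminating half of the cascade (the 301 rules would still go in front of it), and it assumes %{DOCUMENT_ROOT} is set correctly, which the shared-hosting caveat above says is not always the case.

```apache
RewriteEngine On
RewriteBase /

# serve existing files untouched; [END] also prevents any further rewrite pass
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [END]

# directory with an index.php: hand over to that script
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_FILENAME}/index.php -f
RewriteRule ^(.*?)/?$ $1/index.php [END]

# /abc/ or /abc -> abc.php when that file exists
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*?)/?$ $1.php [END]

# anything else goes to the 404 script, which is assumed to send the 404 status itself
RewriteRule ^ /404.php [END]
```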
You'll probably want to check if the php file exists before adding the trailing slash.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^.*[^/]$ /$0/ [L,R=301]
or if you really want a trailing slash for all 404 pages (so /images/error.jpg will become /images/error.jpg/, which I think is weird):
RewriteCond %{ENV:REDIRECT_STATUS} !200
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*[^/]$ /$0/ [L,R=301]
I came up with this:
DirectorySlash Off
RewriteEngine on
Options +FollowSymlinks
ErrorDocument 404 /404.php
#if it's www
# redirect to non-www.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ http://%1%{REQUEST_URI} [L,R=301,QSA]
#else if it has slash at the end, and it's not a directory
# serve the appropriate php
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1.php [L,QSA]
#else if it's an existing file, and it's not php or html
# serve the content without rewrite
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{REQUEST_URI} !\.(php|html?)$
RewriteRule ^ - [L,QSA]
#else
# strip php/html extension, force slash
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.*?)((\.php)|(\.html?))?/?$ /$1/ [L,NC,R=301,QSA]
Certainly not very elegant (env:redirect_status is quite a hack), but it passes my modest tests. Unfortunately I can't test the www redirection, as I'm on localhost and have no real access to a server, but that part should work too.
As you can see, I used the ErrorDocument directive to specify the error page, and the DirectorySlash Off directive to make sure Apache doesn't interfere with the slash-appending fun. I also used the QSA (Query String Append) flag, which appends the query string to the request so that it's not lost. It looks kind of silly after the trailing slash, but anyhow.
Otherwise it's pretty straightforward, and I think the comments explain it pretty well. Let me know if you run into any trouble with it.
Create a folder under the root of the domain
Place a .htaccess file in that folder containing RewriteRule ^$ index.php
Parse the URL
With PHP coding you can now strip the URL or file extension as required
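The rule in the steps above only matches the folder's own URL (an empty path). A common variant of the same idea, shown here as a sketch and not as what this answer prescribes, sends every request that is not an existing file or directory to index.php, where the script can parse $_SERVER['REQUEST_URI'] itself.

```apache
RewriteEngine On

# let real files and directories through untouched
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# everything else is handled by index.php in this folder
RewriteRule ^ index.php [L]
```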

htaccess rewrite rule, old URL to new

A bit of help fellow SO people.
What I have at the moment (based on some code I used for a different type of URL).
I want the first URL to redirect to the second, with no query string included afterwards
This is what I have so far.
RewriteRule ^(page.php?id=missionstatement+)/?$ http://example.com/why/mission-statement [R=301,L]
RewriteRule ^(page.php?id=ofsted+)/?$ http://example.com/how/ofsted-report [R=301,L]
RewriteRule ^(page.php?id=governingbody+)/?$ http://example.com/governors [R=301,L]
Here is the rule (will redirect 1 URL):
RewriteCond %{QUERY_STRING} ^id=whatever
RewriteRule ^page\.php$ http://%{HTTP_HOST}/how/somehow? [R=301,L]
This rule is intended to be placed in the .htaccess file in the website root folder. If placed elsewhere, some small tweaking may be required.
I have used %{HTTP_HOST} -- this will redirect to the same domain as the requested URL. If the domain name has to be different, replace it with the exact domain name.
The ? at the end of new URL will get rid of existing query string.
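Applied to the three URLs from the question, the same pattern might look like this. It is a sketch that assumes the id values are exactly missionstatement, ofsted and governingbody, as they appear in the question's attempted rules; the trailing $ keeps a longer id from matching by accident.

```apache
RewriteEngine On

# page.php?id=missionstatement -> /why/mission-statement
RewriteCond %{QUERY_STRING} ^id=missionstatement$
RewriteRule ^page\.php$ http://example.com/why/mission-statement? [R=301,L]

# page.php?id=ofsted -> /how/ofsted-report
RewriteCond %{QUERY_STRING} ^id=ofsted$
RewriteRule ^page\.php$ http://example.com/how/ofsted-report? [R=301,L]

# page.php?id=governingbody -> /governors
RewriteCond %{QUERY_STRING} ^id=governingbody$
RewriteRule ^page\.php$ http://example.com/governors? [R=301,L]
```

On Apache 2.4 the trailing ? can be replaced by the QSD flag, for example [R=301,L,QSD], which discards the query string in the same way.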
Ahoy!
Give this a whirl:
#check mod_rewrite is enabled
<IfModule mod_rewrite.c>
#enable mod rewrite
RewriteEngine On
#set working directory
RewriteBase /
#force trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [R=301,L]
#redirect page.php?id=...; the query string is not part of the RewriteRule pattern, so it is matched in a RewriteCond
RewriteCond %{QUERY_STRING} ^id=(.*)$
RewriteRule ^page\.php$ http://www.willans.com/page.php/%1? [R=301,L]
#end mod rewrite check
</IfModule>
It's been a while since I've done any web dev, but that should be a push in the right direction at least ;)

Trying to add trailing slash with htaccess, results in a absolute path

What I'm trying to achieve is to have all URLs on my page look like http://domain.com/page/, with no extensions but a trailing slash. If a user happens to enter http://domain.com/page or http://domain.com/page.php, it should redirect to the first URL. After some googling I found this code, and it's close to working, but when you leave out the trailing slash in your request the URL becomes something like http://domain.com/Users/"..."/page/ and therefore returns a 404.
My .htaccess looks like this:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{THE_REQUEST} ^GET\ /[^?\s]+\.php
RewriteRule (.*)\.php$ /$1/ [L,R=301]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*)/$ $1.php [L]
RewriteCond %{REQUEST_FILENAME} !(.*)/$
RewriteRule (.*)/$ $1.php [L]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule .*[^/]$ $0/ [L,R=301]
I've been trying to add an additional rule but I really don't get any of this and I haven't been able to find any answers.
For a scenario like this one, the .htaccess author has to consider both what the browser URL bar should display and what file the web server should return/execute. Note also that each external redirect starts the processing of the rewrite directives over.
With that in mind, start by taking care of which file is returned when the URL is in the correct format:
RewriteEngine on
RewriteRule ^/?$ /index.php [L]
RewriteRule ([^./]+)/$ /$1.php [L]
Then, deal with URLs with no trailing slash by redirecting them with [R=301]:
RewriteRule ^/(.*)\.[^.]*$ http://www.example.com/$1/ [R=301,L]
RewriteRule ^/(.*)$ http://www.example.com/$1/ [R=301,L]
Note that the first of these two rules should also take care of the case where there is a filename (like something.php) but also a trailing slash by eliminating the filename extension and re-adding the slash.
Keep in mind that, if your internal directory structure does not match what the web server is serving (as is often the case in shared hosting scenarios), you will likely need to add a RewriteBase directive immediately after the RewriteEngine directive. See the Apache docs for an explanation.
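In a .htaccess file, patterns are matched against the path without its leading slash, and the rules run again after each internal rewrite, so the redirects need a guard against looping. One possible sketch follows, assuming the .htaccess is in the document root, %{DOCUMENT_ROOT} points at it, and every page is a .php file of the same name. Testing %{THE_REQUEST} restricts the .php redirect to what the browser actually asked for.

```apache
RewriteEngine On
RewriteBase /

# /page.php -> /page/ (only when the client itself requested the .php URL)
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]+)\.php[?\s]
RewriteRule ^ /%1/ [R=301,L]

# /page -> /page/ when page.php exists and /page is not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]

# /page/ -> page.php internally; the address bar keeps /page/
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.+)/$ $1.php [L]
```

Using an absolute substitution such as /$1/ in the redirects is what keeps the filesystem path (the /Users/... prefix from the question) out of the resulting URL.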
