Redirecting non-www URL to www using .htaccess

I'm using Helicon's ISAPI Rewrite 3, which basically enables .htaccess in IIS. I need to redirect a non-www URL to the www version, i.e. example.com should redirect to www.example.com. I used the following rule from the examples but it affects subdomains:
RewriteCond %{HTTPS} (on)?
RewriteCond %{HTTP:Host} ^(?!www\.)(.+)$ [NC]
RewriteCond %{REQUEST_URI} (.+)
RewriteRule .? http(?%1s)://www.%2%3 [R=301,L]
This works for the most part, but it also redirects sub.example.com to www.sub.example.com. How can I rewrite the above rule so that subdomains do not get redirected?
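To see why subdomains get caught, here is a rough Python sketch of how the three conditions feed the rule. It assumes that (?%1s) means "emit s only when %1 matched", which is how the rule reads, and that Python's re module treats these patterns the same way the ISAPI_Rewrite engine does. redirect_target is a made-up helper name, not part of ISAPI_Rewrite.

```python
import re

def redirect_target(https, host, request_uri):
    """Mimic the three RewriteCond lines and the RewriteRule substitution."""
    # Condition 1: %1 is "on" when HTTPS is enabled, otherwise unset.
    https_group = re.match(r"(on)?", https).group(1)
    # Condition 2: any host that does not start with "www." ([NC] -> IGNORECASE).
    host_match = re.search(r"^(?!www\.)(.+)$", host, re.IGNORECASE)
    # Condition 3: capture the request URI.
    uri_match = re.search(r"(.+)", request_uri)
    if host_match is None or uri_match is None:
        return None  # a condition failed, so the rule does not fire
    scheme = "https" if https_group else "http"
    return f"{scheme}://www.{host_match.group(1)}{uri_match.group(1)}"

print(redirect_target("off", "example.com", "/page"))      # http://www.example.com/page
print(redirect_target("off", "sub.example.com", "/page"))  # http://www.sub.example.com/page
print(redirect_target("off", "www.example.com", "/page"))  # None
```

The second condition only asks "does the host lack a www. prefix?", so sub.example.com passes just like example.com does.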

Append the following RewriteCond:
RewriteCond %{HTTP:Host} ^[^.]+\.[a-z]{2,5}$ [NC]
That way the rule only applies to hosts of the form nondottedsomething.uptofiveletters. As you can see, subdomain.domain.com will not match the condition and thus will not be rewritten.
You can change [a-z]{2,5} for a stricter tld matching regex, as well as placing all the constraints for allowed chars in domain names (as [^.]+ is more permissive than strictly necessary).
All in all I think in this case that wouldn't be necessary.
EDIT: sadie spotted a flaw on the regex, changed the first part of it from [^.] to [^.]+
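A quick way to check which hosts pass this extra condition is to run the same pattern through Python's re module. This is only a sketch: it assumes Python's regex behaviour is close enough to the rewrite engine's for this pattern, and is_bare_domain is a name invented for the example.

```python
import re

# Pattern from the added RewriteCond; IGNORECASE stands in for the [NC] flag.
BARE_DOMAIN = re.compile(r"^[^.]+\.[a-z]{2,5}$", re.IGNORECASE)

def is_bare_domain(host):
    """Return True when the host would satisfy the added condition."""
    return BARE_DOMAIN.search(host) is not None

for host in ("example.com", "sub.example.com", "example.museum"):
    print(host, is_bare_domain(host))
# example.com True
# sub.example.com False
# example.museum False
```

The last line shows the cost of the {2,5} limit: a six-letter TLD is silently skipped.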

I've gotten more control using urlrewriter.net, something like:
<unless header="Host" match="^www\.">
<if url="^(https?://)[^/]*(.*)$">
<redirect to="$1www.domain.tld$2"/>
</if>
<redirect url="^(.*)$" to="http://www.domain.tld$1"/>
</unless>

Zigdon has the right idea except his regex isn't quite right. Use
^example\.com$
instead of his suggestion of:
^example\.com(.*)
Otherwise you won't just be matching example.com, you'll be matching things like example.comcast.net, example.com.au, etc.
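The difference between the two patterns is easy to confirm with a small Python check (a sketch only; the variable and function names are made up for the example, and Python's re is assumed to behave like the rewrite engine here):

```python
import re

LOOSE = re.compile(r"^example\.com(.*)", re.IGNORECASE)   # Zigdon's pattern
STRICT = re.compile(r"^example\.com$", re.IGNORECASE)     # anchored at both ends

def matches(pattern, host):
    return pattern.search(host) is not None

for host in ("example.com", "example.comcast.net", "example.com.au"):
    print(host, matches(LOOSE, host), matches(STRICT, host))
# example.com True True
# example.comcast.net True False
# example.com.au True False
```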

@Vinko
For your generic approach, I'm not sure why you chose to limit the length of the TLD in your regex. It's not very future-proof, and I'm unsure what benefit it provides. It's actually not even "now-proof", because there's at least one 6-character TLD out there (.museum) which won't be matched.
It seems unnecessary to me to do this. Couldn't you just do ^[^.]+\.[^.]+$? (note: the question mark is part of the sentence, not the regex!)
All that aside, there is a bigger problem with this approach: it will fail for domains that aren't directly beneath the TLD. This includes domains in Australia, the UK, Japan, and many other countries that have hierarchies: .co.jp, .co.uk, .com.au, and so on.
Whether or not that is of any concern to the OP, I don't know but it's something to be aware of if you're after a "fix all" answer.
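To make the hierarchy problem concrete, here is a small Python sketch using the "exactly one dot" pattern suggested above. The names are invented for the example, and Python's re is assumed to behave like the rewrite engine for this pattern.

```python
import re

# "One label, one dot, one label": the generic bare-domain test.
ONE_DOT = re.compile(r"^[^.]+\.[^.]+$")

def looks_bare(host):
    return ONE_DOT.search(host) is not None

for host in ("example.com", "sub.example.com", "example.co.uk"):
    print(host, looks_bare(host))
# example.com True
# sub.example.com False
# example.co.uk False
```

By shape alone, example.co.uk is indistinguishable from sub.example.com, so the bare .co.uk domain would never be redirected to its www form.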
The OP hasn't yet made it clear whether he wants a generic solution or a solution for a single (or small group) of known domains. If it's the latter, see my other note about using Zigdon's approach. If it's the former, then proceed with Vinko's approach taking into account the information in this post.
Edit: One thing I've left out until now, which may or may not be an option for you business-wise, is to go the other way. All our sites redirect http://www.domain.com to http://domain.com. The folks at http://no-www.org make a pretty good case (IMHO) for this being the "right" way to do it, but it's still certainly just a matter of preference. One thing is for sure though, it's far easier to write a generic rule for that kind of redirection than this one.

@org 0100h Yes, there are many variables left out of the description of the problem, and all your points are valid ones and should be addressed in the event of an actual implementation. There are both pros and cons to your proposed regex. On the one hand it's easier and future-proof; on the other, do you really want to match example.foobar if sent in the Host header? There might be some edge cases where you'll end up redirecting to the wrong domain. A third alternative is to modify the regex to use a list of the actual domains, if there is more than one, like
RewriteCond %{HTTP:Host} (example.com|example.net|example.org) [NC]
(Note to chris, that one will change %1)
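One thing to watch with a list like this: as written it is unanchored and its dots are unescaped, so it accepts more hosts than the three named ones. The Python sketch below compares it with an anchored, escaped variant. That variant is my own suggestion, not something from the answer, and the names are made up for the example.

```python
import re

# The condition as posted ([NC] -> IGNORECASE).
LOOSE = re.compile(r"(example.com|example.net|example.org)", re.IGNORECASE)
# Hypothetical stricter variant: anchored, with literal dots.
ANCHORED = re.compile(r"^(example\.com|example\.net|example\.org)$", re.IGNORECASE)

def accepts(pattern, host):
    return pattern.search(host) is not None

for host in ("example.com", "www.example.com", "examplexcom.evil.test"):
    print(host, accepts(LOOSE, host), accepts(ANCHORED, host))
# example.com True True
# www.example.com True False
# examplexcom.evil.test True False
```

Either form still uses capturing parentheses, so the note about %1 applies to both.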
@chrisofspades It's not meant to replace it; your condition number two ensures that the host doesn't start with www, whereas mine doesn't. It won't change the values of %1, %2, %3 because it doesn't store the matches (in other words, it doesn't use parentheses).

Can't you adjust the RewriteCond to only operate on example.com?
RewriteCond %{HTTP:Host} ^example\.com(.*) [NC]

Why don't you just have something like this in your vhost (httpd) file?
ServerName www.example.com
ServerAlias example.com
Of course that won't redirect; the site will just carry on as normal under both names.

Related

The Ultimate Generic .htaccess Wildcard Subdomain Rewrite to "www." - is this valid?

I think I have achieved the holy grail of generic wildcard subdomain redirection after going around in circles for a whole day battling "too many redirects" errors. It seems to work with any domain and subdomain, the only part you need to specify is a list of possible valid suffixes eg .com|.com.au|.co.uk etc. This code will take *yourdomain.suffix for any domain and turn it into http://www.yourdomain.suffix, but only for valid subdomains that could actually exist. You can have as many sequences of anything.anything-anything.anything-anything-anything.anything. before yourdomain.com as you want, it will all get turned into www. Now it seems to work perfectly, but I don't trust this sadistic language of regex one bit. I have absolutely no way of knowing if this code is valid, if it will cause server problems or fail under some important circumstances. Can anyone help bug-test or refine it?
Here it is:
RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9]+[a-zA-Z0-9\-]*[a-zA-Z0-9]+[\.]{1}|[a-zA-Z0-9]+[\.]{1})*([a-zA-Z0-9]+[a-zA-Z0-9\-]*[a-zA-Z0-9]+|[a-zA-Z0-9]+)\.(com|com[\.]{1}au)?$ [NC]
RewriteCond %{HTTP_HOST} !^www\.([a-zA-Z0-9]+[a-zA-Z0-9\-]*[a-zA-Z0-9]+|[a-zA-Z0-9]+)\.(com|com[\.]{1}au)?$ [NC]
RewriteRule .? http://www.%2.%3%{REQUEST_URI} [R=302,NC,L]
Note: The reason it's so long is that I'm trying to account for the possibility of dashes in the main domain or subdomain parts. So anything-anything.yourdomain.com. But I read that with domain names you're not allowed to have dashes without at least one alphanumerical character between the dash and any period. So www.anything-.yourdomain.com or www.-anything.yourdomain.com are both invalid and must be rejected. If I didn't have to consider this, the regex for the first 2 lines would be way simpler: it could just start with:
RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9\-]+[\.]{1})*([a-zA-Z0-9\-]+)\.(com|com[\.]{1}au)?$
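For anyone wanting to bug-test the first condition, one low-risk way is to run the pattern against sample hosts outside the server. The Python sketch below copies the first RewriteCond pattern verbatim. It assumes Python's re behaves like Apache's PCRE for this pattern, and domain_and_suffix is a helper name invented for the test.

```python
import re

# First RewriteCond pattern from the question; IGNORECASE mirrors [NC].
HOST_PATTERN = re.compile(
    r"^([a-zA-Z0-9]+[a-zA-Z0-9\-]*[a-zA-Z0-9]+[\.]{1}|[a-zA-Z0-9]+[\.]{1})*"
    r"([a-zA-Z0-9]+[a-zA-Z0-9\-]*[a-zA-Z0-9]+|[a-zA-Z0-9]+)"
    r"\.(com|com[\.]{1}au)?$",
    re.IGNORECASE,
)

def domain_and_suffix(host):
    """Return the (%2, %3) pair the RewriteRule would use, or None if no match."""
    match = HOST_PATTERN.search(host)
    if match is None:
        return None
    return match.group(2), match.group(3)

print(domain_and_suffix("sub.example.com"))               # ('example', 'com')
print(domain_and_suffix("a.b-c.example.com.au"))          # ('example', 'com.au')
print(domain_and_suffix("www.-anything.yourdomain.com"))  # None
print(domain_and_suffix("www.anything-.yourdomain.com"))  # None
print(domain_and_suffix("example."))                      # ('example', None)
```

The last case is worth a look: because the suffix group ends in ?, a host with no recognised suffix at all still matches, which may not be what was intended.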

How to redirect only when there is something after .html?

I have found that there are some people with bad syntax links to our articles.
For example, we have an article with URL
http://www.oursite.com/demo/article-179.html
The issue is that a lot of people have linked back to this article with bad syntax, such as
http://www.oursite.com/demo/article-179.html%5Cohttp:/www.oursite.com/demo/glossary.php
Now, I added the following ReWrite Rule in the .htaccess file to take care of such links.
RewriteRule article-179\.html(.*)$ "http\:\/\/www\.oursite\.com\/demo\/article-179\.html [301,L]
But this has resulted in a Redirect Loop message. How can we fix this issue via an .htaccess rewrite rule? Basically, we need something in our rewrite rule that works only when there is one or more characters after the .html. If not, then it should not redirect.
Any help would be highly appreciated!
With best regards!
Use + instead of *. The * quantifier matches zero or more characters, which causes the pattern to match the redirected path too; + matches one or more.
Also you should make the pattern as precise as possible, i.e. don't just check whether it ends with article-179.html; better to check for the full path. And if this all happens on the same domain, then there's no need to use the absolute URL for the redirect.
There's also no need for escaping the substitution parameter like you did, it's treated as a simple string except for:
back-references ($N) to the RewriteRule pattern
back-references (%N) to the last matched RewriteCond pattern
server-variables as in rule condition test-strings (%{VARNAME})
mapping-function calls (${mapname:key|default})
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule
Long story short, theoretically this should do it:
RewriteRule ^demo/article-179\.html(.+)$ /demo/article-179.html [R=301,L]
or this if you really need the absolute URL:
RewriteRule ^demo/article-179\.html(.+)$ http://www.oursite.com/demo/article-179.html [R=301,L]
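The loop is easy to reproduce outside Apache. This Python sketch (names invented for the example) tests both quantifiers against the clean path and a broken one. The broken suffix is written as it appears in the question; Apache matches against the decoded path, which does not change the outcome here.

```python
import re

STAR = re.compile(r"^demo/article-179\.html(.*)$")  # zero or more trailing chars
PLUS = re.compile(r"^demo/article-179\.html(.+)$")  # one or more trailing chars

CLEAN = "demo/article-179.html"
BROKEN = CLEAN + "%5Cohttp:/www.oursite.com/demo/glossary.php"

def would_redirect(pattern, path):
    return pattern.search(path) is not None

print(would_redirect(STAR, BROKEN), would_redirect(STAR, CLEAN))  # True True
print(would_redirect(PLUS, BROKEN), would_redirect(PLUS, CLEAN))  # True False
```

With *, the redirect target itself matches again, hence the loop; with +, the clean URL falls through.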

mod_rewrite: How to disable not clean urls navigation of rewrite rules

I've enabled the mod_rewrite module and everything works.
I created simple rules for the URL, but how do I disable navigation to the underlying (rewritten) URL with its parameters?
example:
# rewrite rule for cleaning
RewriteRule ^bookstore/([0-9]+)?$ /bookstore/book.php?id=$1 [L]
Now, if I navigate to http://mydomine.com/bookstore/123 everything works, but the URL http://mydomine.com/bookstore/book.php?id=123 is also navigable.
How can I make only the first one visible and navigable?
Add this to the same htaccess file:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /bookstore/book\.php\?id=([0-9]*)
RewriteRule ^bookstore/book\.php$ /bookstore/%1? [L,R=301]
This will 301 redirect requests for the URI with query strings to the one without.
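The reason for testing THE_REQUEST is that it holds the request line exactly as the browser sent it, so an internal rewrite to book.php does not retrigger the redirect. A rough Python sketch of the condition and substitution follows; redirect_target is a made-up helper name, and Python's re is assumed to match the way Apache's engine does for this pattern.

```python
import re

# Pattern from the RewriteCond; the escaped space separates method and URI.
REQUEST_LINE = re.compile(r"^[A-Z]{3,9}\ /bookstore/book\.php\?id=([0-9]*)")

def redirect_target(request_line):
    """Return the clean URL to 301 to, or None when the condition fails."""
    match = REQUEST_LINE.search(request_line)
    if match is None:
        return None
    # %1 is the captured id; the trailing "?" in the rule drops the query string.
    return "/bookstore/" + match.group(1)

print(redirect_target("GET /bookstore/book.php?id=123 HTTP/1.1"))  # /bookstore/123
print(redirect_target("GET /bookstore/123 HTTP/1.1"))              # None
```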
Not 100% sure about this, but I think that if you rewrite A to B, then both A and B will work.
I would like to ask why exactly it is a problem that http://mydomine.com/bookstore/book.php?id=123 is navigable too. What is the problem if that link is valid too, and the user can use both links... although it would take them some time and luck to discover the second option? What would they gain by doing that? What would you lose? If the answer in both cases is "nothing", then simply stop worrying. :) If you used the old links previously and are now replacing them with new links, then it is a good thing that your customers' old bookmarks will still work.
But assuming that you have a good reason for disabling the old URLs, how about changing them both? For example, rename "book.php" to "xyz.php" and then rewrite http://mydomine.com/bookstore/123 to http://mydomine.com/bookstore/xyz.php?id=123, and the old http://mydomine.com/bookstore/book.php?id=123 will stop working.
Ok, that is an ugly solution, but you can make it nicer if instead of renaming the files you just move them to a subdirectory, like http://mydomine.com/xyz/bookstore/book.php?id=123 . Alternatively, you could use the redirect to add a "secret" parameter and then check it in the PHP file, for example rewrite http://mydomine.com/bookstore/123 to http://mydomine.com/bookstore/book.php?id=123&secret=xyz . Sure, it's just a "security by obscurity", but again... what exactly would anyone gain by discovering your true URLs?

301 redirect question?

Is this a good example of redirecting a page to a page on another domain:
RewriteCond %{HTTP_HOST} ^dejan.com.au$ [OR]
RewriteCond %{HTTP_HOST} ^www.dejan.com.au$
RewriteRule ^seo_news_blog_spam\.html$ "http\:\/\/dejanseo\.com\.au\/blog\-spam\/" [R=301,L]
or does the good old way work too:
301 redirect seo_news_blog_spam.html http://dejanseo.com.au/blog/spam/
and what's the difference?
Presumably, the rules are functionally equivalent (well, assuming that http://dejanseo.com.au/blog/spam/ was supposed to be http://dejanseo.com.au/blog-spam/ like the first one redirects to, and the only host pointing at that location is dejanseo.com.au with or without the www).
The first example uses directives from mod_rewrite, whereas the second one uses a directive from mod_alias. I imagine that the preferred option is the second one for this particular case, if only because it's a bit simpler (there's also marginal additional overhead involved in compiling the regular expressions used by mod_rewrite, but that's very minor):
Redirect 301 /seo_news_blog_spam.html http://dejanseo.com.au/blog-spam/
However, I suspect the reason that you have the first one is that it was created using CPanel (based on the unnecessary escapes in the replacement that appeared before in another user's question where it was indicated CPanel was the culprit). They've gone with the mod_rewrite option because it provides conditional flexibility that the Redirect directive does not, and I assume this flexibility is reflected somewhat in whatever interface is used to create these rules.
You'll note that there is a condition on whether or not to redirect based upon your host name in the first example, identified by the RewriteCond. This allows for you to perform more powerful redirects that are based on more than just the request path. Note that mod_rewrite also allows for internal redirects invisible to the user, which mod_alias is not intended for, but that's not the capacity it's being used in here.
As a final aside, the host names in your RewriteCond statements should technically have their dots escaped, since the . character has special meaning in regular expressions. You could also combine them, change them to direct string comparisons, or remove them altogether (since I imagine they don't do anything useful here).
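The effect of the unescaped dots can be seen with a quick Python check (a sketch; the names are invented for the example, and Python's re is assumed to treat these patterns as Apache would):

```python
import re

UNESCAPED = re.compile(r"^dejan.com.au$")    # as written in the question
ESCAPED = re.compile(r"^dejan\.com\.au$")    # dots matched literally

def host_matches(pattern, host):
    return pattern.search(host) is not None

for host in ("dejan.com.au", "dejanxcom.au"):
    print(host, host_matches(UNESCAPED, host), host_matches(ESCAPED, host))
# dejan.com.au True True
# dejanxcom.au True False
```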
Unbelievable, the problem was that the syntax wasn't correct, so instead of:
redirect 301 seo_news_blog_spam.html http://dejanseo.com.au/blog/spam/
it should look like this:
Redirect 301 seo_news_blog_spam.html http://dejanseo.com.au/blog/spam/
One capital first letter was the source of all the trouble, what a waste of time :D
It works now as it's supposed to!
Thanks to everyone who participated, issue solved.

mod_rewrite Redirect Rule Variables question

I'm a bit of an .htaccess n00b, and can't for the life of me get a handle of regular expressions.
I have the following piece of RewriteRule code that works just fine:
RewriteRule ^logo/?$ /pages/logo.html
Basically, it takes /pages/logo.html and makes it /logo.
Is there a way for me to generalize that code with variables, so that it works automatically without having to have an independent line for each page?
I know $1 can work as a variable, but that's usually for queries, and I can't get it to work in this instance.
First you need to know that mod_rewrite can only handle requests to the server. So you would need to request /logo to have it rewritten to /pages/logo.html. And that’s what the rule does, it rewrites requests with the URL path /logo internally to /pages/logo.html and not vice versa.
If you now want to use portions of the matched string, you need to wrap them in groups, written (expr), which you can then reference with $n. In your case the pattern [^/], which describes any character other than the slash /, is suitable:
RewriteRule ^([^/]+)$ /pages/$1.html
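The way the group feeds $1 can be tried out in Python, where \1 plays the role of $1. This is only a sketch with a made-up helper name; it assumes the per-directory (.htaccess) context, where the path is matched without its leading slash.

```python
import re

PAGE = re.compile(r"^([^/]+)$")  # a single path segment, captured as group 1

def rewrite(path):
    """Return the internal target for a request path, or None if the rule skips it."""
    match = PAGE.search(path)
    if match is None:
        return None
    return match.expand(r"/pages/\1.html")

print(rewrite("logo"))        # /pages/logo.html
print(rewrite("about"))       # /pages/about.html
print(rewrite("about/team"))  # None
```

Paths containing a slash are left alone, which keeps the rule from touching requests for /pages/... itself.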
Try this:
RewriteRule ^/pages/(.*)\.html$ /$1
The (.*) matches anything between pages/ and .html. Whatever it matches is used in $1. So, /pages/logo.html becomes /logo, and /pages/subdir/other_page.html would become /subdir/other_page

Resources