.htaccess code questions about SSL and canonicalization - .htaccess

In short, my website has a single payments page. An SSL certificate is installed but is not required apart from that one payments page.
With regards to my .htaccess file - I currently separate my payments page with the following code. I also block visitors from semalt.com. Can't remember exactly why, but I think I was receiving unwanted attention (spam) from them at the time.
What I would like to know is:
1) is this code still valid 5 years on?
2) do I need to address canonicalization by directing to either a www or non-www version of mywebsite (importantly without affecting that one important https payments page); is it necessary?
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
# RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
# RewriteRule .* http://example.com%{REQUEST_URI} [L,R=301]
RewriteCond %{HTTPS} off
RewriteRule ^payment\.html$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# block visitors referred from semalt.com
RewriteEngine on
RewriteCond %{HTTP_REFERER} semalt\.com [NC]
RewriteRule .* - [F]
# End semalt block
# block referer spam buttons for website
RewriteEngine On
RewriteCond %{HTTP_REFERER} buttons\-for\-website\.com
RewriteRule ^.* - [F,L]
# End buttons for website block
ErrorDocument 404 /404.html

The main thing I would address is that you are only redirecting to HTTPS for your payments page. You should be forcing HTTPS for your entire site - everywhere. These days browsers alert users to the fact that they are browsing an insecure connection if on HTTP (Google Chrome states "Not Secure" next to the URL), which doesn't do anything for user trust. This is the main thing that would have changed in the last 5 years - HTTPS is mandatory everywhere.
There is no good reason not to use HTTPS everywhere these days.
Assuming the rest of your site is already HTTPS "ready" (I assume it must be and you aren't sending users back to HTTP from your payment page?!) then change the HTTP to HTTPS redirect to include your entire site:
# HTTP to HTTPS
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
2) do I need to address canonicalization by directing to either a www or non-www version of mywebsite (importantly without affecting that one important https payments page); is it necessary?
Yes, you should. You already have the directives at the top of your .htaccess file - but they are commented out? You may have already set the rel="canonical" element in the head of your pages, but otherwise, if www and non-www are both available then this is potentially duplicate content (same content available from 2 or more different URLs). You need to decide which: www or non-www? Which do you currently favour? Which (predominantly) is already indexed? Which does your payments page use? (Hopefully, the answer is the same to all the above.)
Also redirect directly to HTTPS as part of this redirect. And this should go before the current HTTP to HTTPS redirect (the same order as currently in your .htaccess file):
# Redirect to non-www
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^ https://example.com%{REQUEST_URI} [R=301,L]
Note that the above www to non-www redirect assumes you are not using any other subdomains. To redirect to www.example.com, just change both instances of example.com.
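As a sketch of that www variant (assuming example.com stands in for your own domain and that no other subdomains are in use), the same rule with both instances changed might look like this:

```apache
# Redirect to www (hypothetical variant of the rule above)
RewriteCond %{HTTP_HOST} !=www.example.com
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```

As with the non-www version, this would go before the general HTTP to HTTPS redirect so that a request needs at most one redirect to reach the canonical URL.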
RewriteCond %{HTTP_REFERER} semalt\.com [NC]
RewriteRule .* - [F]
OK, if it helps. Check your server logs to see whether this is actually doing anything for you. But change the .* regex to ^ (marginally more efficient). Any blocking directives should also go at the very top of the file (there is no point canonicalising requests that you are about to block).
RewriteCond %{HTTP_REFERER} buttons\-for\-website\.com
RewriteRule ^.* - [F,L]
Again, OK if it helps (does it?!). Optimise the regex as above. There is no need to backslash-escape literal hyphens in the CondPattern (unless they appear in the middle of a character class). The L flag is not required when used with F.
Other notes:
No need to repeat the RewriteEngine On directive.
You do not need the RewriteBase / directive with your current directives.
It's clearer to define your ErrorDocuments at the top of the file.
Summary
Bringing the above points together we have:
Options +FollowSymlinks
ErrorDocument 404 /404.html
RewriteEngine On
# block visitors referred from semalt.com
RewriteCond %{HTTP_REFERER} semalt\.com [NC]
RewriteRule ^ - [F]
# block referer spam buttons for website
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
RewriteRule ^ - [F]
# Redirect to non-www
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^ https://example.com%{REQUEST_URI} [R=301,L]
# HTTP to HTTPS
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

Related

Google is not indexing https

I have used the http protocol for a long time. After years I implemented a domain certificate. Now I am trying to get the website with the https:// protocol indexed but Google still indexes the http protocol.
I have tried several things. I enabled the 'Force SSL with https redirect' option in DirectAdmin.
I changed my .htaccess so the browser redirects every option to the https protocol:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^[^.]+\.[^.]+$
RewriteCond %{HTTPS}s ^on(s)|
RewriteRule ^ http%1://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RedirectMatch permanent index.php/(.*) https://www.***.com/$1
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.(php|html) [NC]
RewriteRule ^index\.php$ https://www.***.com/ [R=301,L]
RewriteCond %{THE_REQUEST} ^GET\ /.*/index\.(php|html)\ HTTP
RewriteRule (.*)index\.(php|html)$ /$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^domain\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com$
RewriteRule ^home\.html$ "https\:\/\/www\.domain\.com\/" [R=301,L]
RewriteCond %{HTTP_HOST} ^domain\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com$
RewriteRule ^home$ "https\:\/\/www\.domain\.com\/" [R=301,L]
I have created a sitemap.xml that contains only the https:// protocol.
In the Google Search Console I see that the website is indexed today and that the 'Google selected canonical URL' is still the http protocol.
Does someone know what I need to do to fix this problem?
I changed my .htaccess so the browser redirects every option to the https protocol:
Actually, you've not.
You don't have an HTTP to HTTPS redirect at all and the first rule (a non-www to www redirect) specifically maintains whatever protocol has been requested (HTTP or HTTPS).
Try the following two rules instead, replacing your first (non-www to www) rule:
# non-www to www (and HTTPS)
RewriteCond %{HTTP_HOST} ^[^.]+\.[^.]+$
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
# HTTP to HTTPS (already www)
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
The remaining directives can also be simplified...
# Redirect to path-info (removing "index.php")
RewriteRule ^(.+/)?index\.php/(.*) /$1$2 [R=301,L]
# Remove "index.php" from the end of the URL
RewriteRule ^(.+/)?index\.php$ /$1 [R=301,L]
# Redirect "home" and "home.html" to the document root
RewriteRule ^home(\.html)?$ / [R=301,L]
You don't seem to need to target the domain name in the later directives, so I've removed the seemingly superfluous conditions.
I've simplified/reduced the two rules that removed index.php into a single directive. Unless you have a front-controller whereby you are routing all requests to index.php then you don't need the additional condition that checks against THE_REQUEST. (There is no front-controller in the directives you've posted.)
(Although the rule that removes and redirects to the path-info is a little out of place if you don't have a front-controller?)
I've changed the mod_alias RedirectMatch to the corresponding*1 mod_rewrite rule. mod_alias directives are processed after mod_rewrite, despite the apparent order of directives in the config file, so it is advisable to avoid mixing redirects from both modules to avoid unexpected conflicts.
(*1 I've also "corrected" this so as to avoid matching URLs of the form <anything>index.php; only <anything>/index.php is matched, which I assume is the intention.)
Clear your browser cache and test first with 302 (temporary) redirects to avoid any potential caching issues.
I also tried to delete the canonical link element
You should not delete the "canonical link element" providing you are linking to the correct canonical URL, ie. HTTPS + www.
You've not actually stated how long it is since you've "switched to HTTPS", but this can take some time. Google naturally favours HTTPS, but you are likely to have many HTTP backlinks due to the age of your site. It is important that you 301 redirect HTTP to HTTPS.
You should register both properties: HTTP and HTTPS in GSC and monitor the index status of both.
NB: Questions of this nature are generally better asked on the Webmasters stack: https://webmasters.stackexchange.com/

Using .htaccess to redirect mobile and desktop from http to https versions

We have installed an SSL certificate for our site and I have created an .htaccess file with the following code:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
RewriteCond %{ENV:HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
# mobile redirect
RewriteCond %{HTTP_HOST} ^www\mobile.example\.com [NC]
RewriteCond %
{HTTP_USER_AGENT}"android|blackberry|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
RewriteRule ^(.*)$ https://mobile.example.com/$1 [L,R=302]
</IfModule>
This code works great coming from the desktop, but the mobile part is not working. What am I missing?
...but the mobile part is not working.
You've not stated explicitly what the "mobile" part is expected to do. However, the "mobile part" in your code would seem to just be a www to non-www redirect. The HTTP to HTTPS redirect is separate to this and does not differentiate between mobile and desktop (and neither would it necessarily need to).
However, there are several issues with the directives in the "mobile part" that will prevent it from "working" (and also with the HTTP to HTTPS redirect).
The directives are in the wrong order. Both of the external redirects (HTTP to HTTPS and "mobile" www to non-www) should be before the internal rewrite (the first couple of rules)
I assume ENV:HTTPS (that references an environment variable called HTTPS) is as per instruction from your webhost. This is non-standard, although not uncommon with some shared hosts.
RewriteCond %{HTTP_HOST} ^www\mobile.example\.com [NC] - You are missing a dot after the www subdomain (assuming that is what you are trying to match), so this will never match. You are also missing a backslash before the dot in the middle of the regex (to match a literal dot, not any character). The CondPattern should presumably read ^www\.mobile\.example\.com in order to match the www subdomain.
RewriteCond % {HTTP_USER_AGENT}"android|blackberry|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC] - You are missing a space after the first argument %{HTTP_USER_AGENT}<here>. Although you also appear to have an erroneous space after the %. Either way, this will fail to match. However, I would also question why you specifically need to match the mobile user-agent here? I would think you need to redirect www to non-www regardless of user-agent? Why would you permit a desktop user-agent access to www.mobile.example.com? So, this condition can perhaps be removed entirely.
Not a bug, but you probably don't need the <IfModule> wrapper, unless these directives are optional and you are porting the same code to multiple servers where mod_rewrite might not be available. See my answer to a related question on the Webmasters stack: https://webmasters.stackexchange.com/questions/112600/is-checking-for-mod-write-really-necessary
Again, not a bug, but the RewriteBase / directive in this block of code is entirely redundant.
Taking the above points into consideration, it should be written more like this:
RewriteEngine On
# HTTP to HTTPS redirect - all hosts
RewriteCond %{ENV:HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
# mobile redirect
RewriteCond %{HTTP_HOST} ^www\.mobile\.example\.com [NC]
RewriteCond %{HTTP_USER_AGENT} "android|blackberry|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
RewriteRule (.*) https://mobile.example.com/$1 [R=302,L]
# Front-controller
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
Like I said in the notes above, I question the use of the RewriteCond %{HTTP_USER_AGENT} directive to detect mobile-only user-agents. If all users should be redirected www to non-www (as it looks like they should) then simply remove this condition. This should also presumably be a 301 (permanent) redirect once you have confirmed that it works as intended.
Taking this a step further, don't you also want to canonicalise desktop clients as well? ie. Redirect www to non-www on all hosts?
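A minimal sketch of such a host-wide canonical redirect, assuming every www hostname served by this .htaccess file should lose its www prefix (kept as a 302 until it has been tested):

```apache
# Redirect www.<host> to <host> over HTTPS, whatever the hostname
# (assumption: no host is meant to keep its www prefix)
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [R=302,L]
```

Here %1 is the hostname captured by the preceding RewriteCond, so the same rule would cover both www.example.com and www.mobile.example.com. Like the other external redirects, it would need to go before the front-controller rules.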
This code works great coming from the desktop
Although there's no reason why this wouldn't have worked "great" from mobile as well, if you were requesting the canonical host, i.e. https://mobile.example.com/.
UPDATE: What I need for the .htaccess to do is redirect all traffic - desktop and mobile etc - to the new https instead of HTTP.
By the sounds of it you only need a "simple" HTTP to HTTPS redirect. The "front-controller pattern" that you have seemingly copied from the webhost's article may be in error?
Try the following instead in the root .htaccess file.
RewriteEngine On
# Redirect all requests from HTTP to HTTPS on the same host
RewriteCond %{ENV:HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
You should remove all other directives and make sure there are no other .htaccess files in subdirectories.
The REQUEST_URI server variable contains the requested URL-path. This will be required, instead of using a backreference as you had initially, if your mobile subdomain points to a subdirectory off the main domain's document root (which you hint at in comments, but not stated in the question).
You must clear the browser cache before testing and test first with 302 (temporary) redirects before changing to a 301 (permanent) redirect only once you have confirmed the redirect works as intended.

Forced SSL in htaccess going to only the home page

I have recently added an SSL certificate to my sites. I have added the code to the .htaccess file to force HTTPS. The issue is that my external links that go to pages within the site are now being redirected to the homepage. The code I am using is:
RewriteEngine On
RewriteBase /
RewriteCond %{ENV:HTTPS} !on [NC]
RewriteRule ^(.*)$ https://www.watsonelec.com%1 [R,L]
I think the issue is in the last line, as the rule is telling it to redirect to the homepage. What I can't seem to find is a rule that will say for it to go to the URL provided in the link but give it an https instead of the HTTP.
I did do a search for this topic, but all the code I found was similar to what I already had. Thank you for all your help.
Update
I have two sites I am trying to work this out for, watsonenerysolutions.com and watsonelec.com.
When I tried
RewriteOptions InheritDownBefore
RewriteCond %{ENV:HTTPS} !on [NC]
RewriteRule ^(.*)$ https://www.watsonenergysolutions.com/$1 [R,L]
It still sent me to the homepage.
When I tried
RewriteOptions InheritDownBefore
RewriteCond %{ENV:HTTPS} !on [NC]
RewriteRule ^ https://www.watsonenergysolution.com%{REQUEST_URI} [R,L]
I received an error message that said Safari can't open the page "https://www.watsonenergysolutions.com/index.php" because Safari can't find server "www.watsonenergysolutions.com"
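%N backreferences refer to groups captured in the preceding RewriteCond patterns. Your condition captures nothing, so %1 is empty. That's why everything is redirected to the homepage.
You need to use $1 or %{REQUEST_URI} instead. With the .htaccess file in the document root, both rules below are equivalent (the second may be marginally faster because it does not capture the whole path unnecessarily):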
RewriteRule ^(.*)$ https://www.watsonelec.com/$1 [R,L]
RewriteRule ^ https://www.watsonelec.com%{REQUEST_URI} [R,L]
Note 1: The %{REQUEST_URI} value always begins with a leading /, whereas the path matched by a RewriteRule in .htaccess never begins with a leading /
Note 2: R flag uses a 302 redirect by default. Maybe you'll want to use a 301 ([R=301,L])
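Putting the two notes together, a sketch of the complete block might look like the following. The %{ENV:HTTPS} condition is copied from the question and is assumed to be what this particular host requires; on many servers the standard %{HTTPS} variable would be used instead.

```apache
RewriteEngine On
# Redirect any non-HTTPS request to the same path on the HTTPS www host
# (ENV:HTTPS is host-specific, taken from the question)
RewriteCond %{ENV:HTTPS} !on [NC]
RewriteRule ^ https://www.watsonelec.com%{REQUEST_URI} [R=301,L]
```

It is probably safest to test with a plain R (302) first and only switch to R=301 once the redirect lands on the expected page.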

Redirect One Domain To https 2nd Domain http

We have a domain where every page is set to https in .htaccess.
There is also a 2nd domain for the Irish version of the site, where we don't have a certificate, so we need the pages to be set to http.
We also want to ensure both domains are set to www.
To further complicate the matter, users can login to part of the site, and we want those pages to be secure. What we plan is to detect the Irish domain and login page and redirect to the main domain.
I can code individual parts of this in .htaccess, but I am not sure how to code the whole scenario without it getting horribly complicated, messy and difficult to debug. I also don't want to double redirect pages unnecessarily.
So in summary:
(www.)example.com -> https://www.example.com
(www.)example.ie -> http://www.example.ie
(www.)example.ie/login.php -> https://www.example.com/login.php
I would be grateful for any help, haven't been able to find any similar scenarios. Thanks.
There are multiple ways of doing this. Assuming you're differentiating based on the Host header (vs, say, using unique IP-based virtual hosts), you could do something like:
# Special case for Irish login page
RewriteCond %{HTTP_HOST} ^(?:www\.)?example\.ie$
RewriteRule ^/login\.php$ https://www.example.com/login.php [R]
# By default use HTTPS
RewriteRule .* - [E=CORRECT_REQUEST_SCHEME:https]
# For (www.)example.ie use HTTP
RewriteCond %{HTTP_HOST} ^(?:www\.)?example\.ie$
RewriteRule .* - [E=CORRECT_REQUEST_SCHEME:http]
# Ensure www prefix
RewriteCond %{HTTP_HOST} ^example\.(com|ie)$
RewriteRule ^(/.*)$ %{ENV:CORRECT_REQUEST_SCHEME}://www.example.%1/$1 [R]
# Ensure correct request scheme
RewriteCond %{HTTPS} =off
RewriteCond %{ENV:CORRECT_REQUEST_SCHEME} =https
RewriteRule ^(/.*)$ https://%{HTTP_HOST}/$1 [R]
Edited code to reflect Avi's solution. For some reason it's still not forcing www, and it's not forcing https for the .com site. Unfortunately running under Apache 2.2.29 on our production web server we can't use "REQUEST_SCHEME":
RewriteEngine on
RewriteBase /
Options +FollowSymLinks
## ERROR DOCUMENT PROCESSING ##
ErrorDocument 403 /404.php
ErrorDocument 404 /404.php
ErrorDocument 410 /410.php
## ERROR DOCUMENT PROCESSING ##
## TRAP REQUESTS FOR IMAGE DOES NOT EXIST ##
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} \.(gif|jpg|jpeg|png)$
RewriteRule .* /products/sq/250sq/no_image_available.jpg [R]
## FORCE URL CHANGES FOR DOMAINS ##
# Special case for Irish login page
RewriteCond %{HTTP_HOST} ^(?:www\.)?example\.ie$
RewriteRule ^/login\.php$ https://www.example.com/login.php [R]
# By default use HTTPS
RewriteRule .* - [E=CORRECT_REQUEST_SCHEME:https]
# For (www.)example.ie use HTTP
RewriteCond %{HTTP_HOST} ^(?:www\.)?example\.ie$
RewriteRule .* - [E=CORRECT_REQUEST_SCHEME:http]
# Ensure www prefix
RewriteCond %{HTTP_HOST} ^example\.(com|ie)$
RewriteRule ^(/.*)$ %{ENV:CORRECT_REQUEST_SCHEME}://www.example.%1/$1 [R]
# Ensure correct request scheme
RewriteCond %{HTTPS} =off
RewriteCond %{ENV:CORRECT_REQUEST_SCHEME} =https
RewriteRule ^(/.*)$ https://%{HTTP_HOST}/$1 [R]
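One possible reason the rules above never fire: in a .htaccess file the path matched by a RewriteRule has no leading slash, so patterns such as ^/login\.php$ and ^(/.*)$ cannot match. A sketch of the same logic adapted for .htaccess follows. It assumes the file sits in the document root and keeps the CORRECT_REQUEST_SCHEME variable name from the answer; the redirects are left as 302s for testing.

```apache
RewriteEngine On

# Special case: Irish login page goes to the secure main domain
RewriteCond %{HTTP_HOST} ^(?:www\.)?example\.ie$ [NC]
RewriteRule ^login\.php$ https://www.example.com/login.php [R=302,L]

# Default scheme is https; (www.)example.ie overrides it to http
RewriteRule ^ - [E=CORRECT_REQUEST_SCHEME:https]
RewriteCond %{HTTP_HOST} ^(?:www\.)?example\.ie$ [NC]
RewriteRule ^ - [E=CORRECT_REQUEST_SCHEME:http]

# Ensure www prefix, using the correct scheme in the same redirect
RewriteCond %{HTTP_HOST} ^example\.(com|ie)$ [NC]
RewriteRule ^ %{ENV:CORRECT_REQUEST_SCHEME}://www.example.%1%{REQUEST_URI} [R=302,L]

# Ensure correct request scheme (host already has www)
RewriteCond %{HTTPS} off
RewriteCond %{ENV:CORRECT_REQUEST_SCHEME} =https
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=302,L]
```

Using %{REQUEST_URI} instead of a captured $1 also avoids the doubled slash that ^(/.*)$ followed by /$1 would produce. The L flags stop processing after each redirect so that a later rule (such as the image trap) cannot override it.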

RewriteRule for HTTP to HTTPS and WWW ISAPI Rewrite

I've trawled many forums and tried many solutions. None work correctly. I am using ISAPI Rewrite 3 for IIS.
I need to change all requests to our website to WWW and HTTPS.
For example:
https://example.com/a-page-here/
http://example.com/a-page-here/
http://www.example.com/a-page-here/
www.example.com/a-page-here/
example.com/a-page-here/
to all change to:
https://www.example.com/a-page-here/
I've used http://htaccess.madewithlove.be, which may be buggy because I'm getting seemingly incorrect results for so-called working solutions. I don't want to be testing umpteen things on the live site.
This supposedly correct example (one of many) I found gives incorrect results:
RewriteEngine on
RewriteBase /
RewriteCond %{SERVER_PORT} !443
# Extract non-www portion of HTTP_HOST
RewriteCond %{HTTP_HOST} ^(www\.)?(.*) [NC]
# Redirect to HTTPS with www
RewriteRule (.*) https://www.%2/$1 [R=301]
Example tests:
example.com/a-page-here/ = https://www./example.com/a-page-here
www.example.com/a-page-here/ = https://www./www.example.com/a-page-here/
Can anyone give me a set of rules that will cleanly and reliably turn any non www request to our website to the correct https://www version, and not add invalid slashes etc?
Try this one:
RewriteEngine On
# non-www to www
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule (.*) https\://www.example.com/$1 [R=301]
# HTTP to HTTPS
RewriteCond %{HTTPS} off [NC]
RewriteRule (.*) https\://www.example.com/$1 [R=301]
