HTTP to HTTPS .htaccess - .htaccess

I was unable to comment on a previous post!! I wanted to ask a question as I am new. Is someone able to explain to me what:
RewriteEngine On
RewriteCond %{HTTP:X-Forwarded-Proto} =http
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
Actually does in the .htaccess file? and what is the difference between "=" and "!=" I am using Cloudflare and Andrew suggested this config which works but I was looking for a greater understanding of it. To me it says if you get a http request use http. If I change http to https though it says too many redirects. All help appriciated.

The %{HTTP:X-Forwarded-Proto} is a variable that accesses the value of the X-Forwarded-Proto HTTP header, which is used by cloudflare or AWS (or other load balancing services) to indicate the protocol of the original request. This is almost always either "http" or "https".
The mod_rewrite documentation for RewriteCond says:
'=CondPattern' (lexicographically equal)
Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString is lexicographically equal to CondPattern (the two strings are exactly equal, character for character). If CondPattern is "" (two quotation marks) this compares TestString to the empty string.
So in other words, the test string %{HTTP:X-Forwarded-Proto} must equal "http".
Additionally:
You can prefix the pattern string with a '!' character (exclamation mark) to specify a non-matching pattern.
So, != means "not equal".
So the condition tests if the request was originally "http", and if so, the rule redirects the browser to https://%{HTTP_HOST}%{REQUEST_URI} (https).

Related

Use htaccess to change query parameter to iOS app-specific deep-link [duplicate]

I am trying to do the following:
User visits URL with query parameter: http://www.example.com/?invite=1234
I then want them to be deep linked into the app on their iOS device, so they go to: app_name://1234
Any suggestions on how to accomplish this in my .htaccess file?
I tried this but it doesn't work:
RewriteEngine On # Turn on the rewriting engine
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
If RewriteRule won't work, can anyone send me an example code for RewriteCond or JavaScript to achieve what I need?
Not sure how this will work with the iOS device, but anyway...
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
This doesn't match the given URL. This would match a requested URL of the form example.com/invite/1234/. However, you are also matching anything - your example URL contains digits only.
The RewriteRule pattern matches against the URL-path only, you need to use a RewriteCond directive in order to match the query string. So, to match example.com/?invite=1234 (which has an empty URL-path), you would need to do something like the following instead:
RewriteCond %{QUERY_STRING} ^invite=([^&]+)
RewriteRule ^$ app_name://%1 [R,L]
The %1 backreference refers back to the last matched CondPattern.
I've also restricted the invite parameter value to at least 1 character - or do you really want to allow empty parameter values through? If the value can be only digits then you should limit the pattern to only digits. eg. ^invite=(\d+).
I've include the R flag - since this would have to be an external redirect - if it's going to work at all.
However, this may not work at all unless Apache is aware of the app_name protocol. If its not then it will simply be seen as a relative URL and result in a malformed redirect.

Apache htaccess: Deny if REQUEST_URI not match cookie value

So, after searching for a solution all over this community, my question is as follow:
Im working within the Wordpress enviroment, Apache server. I have a folder within uploads named /restricted/. Everything in here (any file extension) can only be accessed if:
A cookie named 'custom_cookie' is set
And this cookie value must be a partial match of the URL request
If these conditions fail, an image is served. Inside this /restricted/ folder I got a .htaccess file. Everything must (prefered) be done in that htaccess file, not on root htaccess file.
The cookie is set by functions.php, no problem with that
part. And comments about security is not the question here
This is an url example (localhost): http://localhost/komfortkonsult/wp-content/uploads/restricted/some-file.jpg?r=870603c9d23f2b7ea7882e89923582d7
The first condition A cookie named custom_cookie is set, everything is working with this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /komfortkonsult/
RewriteCond %{REQUEST_URI} ^.*uploads/restricted/.*
RewriteCond %{HTTP_COOKIE} !custom_cookie
RewriteRule . /komfortkonsult/restricted.png [R,L]
</IfModule>
However, the next part Im totally out in the blue, But I tried and failed with the following approaches:
RewriteCond %{HTTP_COOKIE} custom_cookie=(.*)$
RewriteCond %1::%{REQUEST_URI} ^(.*?)::/\1/?
RewriteRule . /komfortkonsult/restricted.png [R,L]
Likewise:
RewriteCond %{QUERY_STRING} ^r=(.*)$
RewriteRule ^/ - [E=COOKIE_MATCH:%1]
RewriteCond %{HTTP_COOKIE} !custom_cookie="%{ENV:COOKIE_MATCH}"
RewriteRule . /komfortkonsult/restricted.png [R,L]
Likewise:
RewriteCond %{HTTP_COOKIE} custom_cookie=([^;]+) [NC]
RewriteCond %{REQUEST_URI} !%1 [NC]
RewriteRule . /komfortkonsult/restricted.png [R,L]
And so on. I really want to keep this inside the .htaccess, instead using validation through a .php file call. But if that is the only solution to my architechture, please provide a full working example (not foo=bar, your redirects goes here...)
Any other approaches of my objectives are welcome.
Thanks so much for helping me out with this.
/ Intervik
Update (after accepted answer and working) example of usage
The objectives are one layer of protection in a Wordpress single install. All media, images or other files, uploaded and attached to pages, are hidden (replaced by an image) if A) the user is not logged-in or B) The user is logged in but not with the capability of 'edit_post'.
But the restriction is only for files uploaded into a unique folder called /restricted/. The folder is resident in the Wordpress original /uploads/ root. This restricted material is not allowed to be direct-linked or accessable by search engines etc etc. No browser-cache is allowed and restriction must work immediately after log-out. And more... but I think you get it.
The namespace 'custom_cookie' is just a providing example. And the examples showing the Wordpress install is within a subfolder on localhost. LIKE h**p://example.com/workspace/. Remove 'workspace/' if in root.
The cookie architecture, functions.php
function intervik_theme_set_custom_cookie(){
if(is_user_logged_in()){
global $current_user;
if(current_user_can('edit_posts')){
if(!isset($_COOKIE['custom_cookie'])){
$cookie_value = $current_user->ID . '|' . $current_user->user_login . '|' . $current_user->roles;
$salt = wp_salt('auth');
$cookie_hash = hash_hmac('md5', $cookie_value, $salt);
setcookie('custom_cookie', $cookie_hash, time()+36, '/');
$_COOKIE['custom_cookie'] = $cookie_hash;
} else {
$cookie_value = $current_user->ID . '|' . $current_user->user_login . '|' . $current_user->roles;
$salt = wp_salt('auth');
$cookie_hash = hash_hmac('md5', $cookie_value, $salt);
if($cookie_hash != $_COOKIE['custom_cookie']){
setcookie('custom_cookie', '', 1, '/');
unset($_COOKIE['custom_cookie']);
}
}
} else {
if(isset($_COOKIE['custom_cookie'])){
setcookie('custom_cookie', '', 1, '/');
unset($_COOKIE['custom_cookie']);
}
}
} else {
if(isset($_COOKIE['custom_cookie'])){
setcookie('custom_cookie', '', 1, '/');
unset($_COOKIE['custom_cookie']);
}
}
}
add_action('init', 'intervik_theme_set_custom_cookie');
As you can see, Each cookie is unique for each valid user, for each +36 seconds period (enough for a page-load - but use +120 for 2 minutes). This "token" is applied to every request send to the the server:
The link to attachment url filter:
function intervik_restricted_wp_get_attachment_url($url, $post_id){
if(strpos($url, '/restricted/') !== FALSE){
if(isset($_COOKIE['custom_cookie'])){
$url = add_query_arg('r', $_COOKIE['custom_cookie'], $url);
}
}
return $url;
}
add_filter('wp_get_attachment_url', 'intervik_restricted_wp_get_attachment_url', 10, 2);
We are not allowing any other query strings. Remark, more filter must be added for sizes, like wp_get_attachment_image_src etc etc. But direct links to media, this is enough.
Replace the if(current_user_can('edit_posts') with another
if(is_user_logged_in() ... changes everything to just login/out
users. Then skip the filters in the admin backend with if(!is_admin()
&& strpos($url, '/restricted/')!== FALSE) ...
And finally the .htaccess file, in the root of the uploads/restricted/ folder:
# BEGIN Intervik
Options +FollowSymLinks
Options All -Indexes
<IfModule !mod_rewrite.c>
Deny from all
</IfModule>
<IfModule mod_headers.c>
Header set Cache-Control "no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires 0
</IfModule>
RewriteEngine On
RewriteCond %{HTTP_COOKIE}::%{QUERY_STRING} !\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$)
RewriteRule . /workspace/restricted.png? [R,L]
# END Intervik
I also placed the nice PNG IMAGE "Restriced Access timeout" in the Wordpress install root. This is also served as thumbnail in Library admin area for non valid administrators. The upload filter or backend is another area.
We are not protecting Englands financial plans here, but we wanna keep
away some paperwork for an organistion and some picures from Google and from
your wife.
Please comment
Its actually working and you are welcome to comment the flaws or security risks. However, there is also another layer validation with PHP above this layer in our install, but we need speed for not so important stuff.
You've got some of the correct bits in your different attempts, but you need to bring them together in the correct order.
Try the following instead:
RewriteEngine On
# custom_cookie value is 32 char hex and must match the value of the "r" URL parameter
RewriteCond %{HTTP_COOKIE}::%{QUERY_STRING} !\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$)
RewriteRule ^ /komfortkonsult/restricted.png [QSD,R,L]
The QSD flag (Apache 2.4+) is required to remove the query string from the redirected URL. Alternatively, if you are still using Apache 2.2 then you can append a ? to the susbstitution instead.
Note that the RewriteBase is not required here. The <IfModule> should also be removed. The <IfModule mod_rewrite.c> wrapper is only required if this is intended to work without mod_rewrite being available. It is not. If mod_rewrite is not available then your conditions will simply fail silently and access will be unrestricted. In this case, it is preferable to fail with an error and access is forbidden (for everyone).
Assumptions:
The cookie value is a 32 character hex value (as in your example).
The r URL parameter is always the first URL parameter (as in your example).
You mentioned "any file extension", however, redirecting to an image only really "works" if an image is being requested in the first place. If you have files other than images it may be preferable to simply return a 403 Forbidden. (Strictly speaking, sending a 403 is the correct response rather than a 302, followed by 200 OK.) To send a 403 instead, just change the RewriteRule directive to read:
RewriteRule ^ - [F]
How this works...
An important point, that is missed from all but one of your examples, is the r URL parameter is part of the query string, not the URL-path. The REQUEST_URI server variable contains the URL-path only, which notably excludes the query string. To match the query string you need to compare against the QUERY_STRING server variable.
%{HTTP_COOKIE}::%{QUERY_STRING} - The cookie HTTP request header is joined with the query string using a separater (::) that is guaranteed to not appear in either value. This forms the TestString.
!\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$) - This is the CondPattern that matches the TestString. \b is a word boundary, so we match only this specific cookie. The value of this cookie is captured using ([0-9a-f]{32}). We then skip over any remaining characters in the cookie header until we get to our separater (::). After this we are matching against the query string (value of the QUERY_STRING server variable in the TestString). The "magic" is the \1 backreference to the first captured group, ie. the cookie value.
The ! prefix on the CondPattern negates the entire pattern. So, the condition is successful when this pattern does not match, ie. when the values of the cookie and URL parameter are different (or not present at all).
Why your attempts were not working...
RewriteCond %{HTTP_COOKIE} custom_cookie=(.*)$
RewriteCond %1::%{REQUEST_URI} ^(.*?)::/\1/?
This assumes your cookie is the last cookie in the Cookie header. This is difficult to guarantee.
You are trying to match the cookie value with the entire URL-path (REQUEST_URI), so this will never match. It assumes your URL is of the form: http://localhost/870603c9d23f2b7ea7882e89923582d7.
RewriteCond %{QUERY_STRING} ^r=(.*)$
RewriteRule ^/ - [E=COOKIE_MATCH:%1]
RewriteCond %{HTTP_COOKIE} !custom_cookie="%{ENV:COOKIE_MATCH}"
Good, you are checking the query string for the URL parameter value. However...
The first RewriteRule never matches because the URL-path never starts with a slash in per-directory (.htaccess) context. Consequently, the COOKIE_MATCH environment variable is never set.
The CondPattern is a regex, not a plain string, so %{ENV:COOKIE_MATCH} is not evaluated - it is seen as a literal string. You've also enclosed this in double quotes, which aren't part of the cookie value either.
RewriteCond %{HTTP_COOKIE} custom_cookie=([^;]+) [NC]
RewriteCond %{REQUEST_URI} !%1 [NC]
Again, you are comparing against the URL-path, not the query string. However, as mentioned above, the %1 backreference is not evaluated in the CondPattern, so this is seen as a literal string anyway.
It is why the %{VARIABLE} (and %1 etc) expressions are not evaluated in the CondPattern that we need to use the seemingly complex expression that uses a regex backreference of the form:
%{VAR1}##%{VAR2} ^(.+)##\1$

htaccess RewriteRule with literal question marks (not query string)

I need to be able to match question marks because there was a translated text encoding mistake, and part of the URL ended up hardcoded with question marks in them. Here's a URL example that I need to rewrite:
https://example.com/Documentation/Product????/index.html
Here is my current rewrite rule. It works when the characters following "Product" are not question marks, but when they are, the rule doesn't apply.
RewriteRule "^Documentation/Product[^/]+/(.*)$" "https://s3.amazonaws.com/company-documentation/Help/Product/$1" [L,NC]
How would I make sure that question marks are considered to be characters too in this rule? I can't expect that only question marks and not the original non-English characters will be in the URL, so I want the rule above to match both question marks and any other character.
I found this topic which seems relevant, but the flags don't help, and the answer doesn't explain how to overcome the problem mentioned in the "Aside".
https://webmasters.stackexchange.com/questions/107259/url-path-with-encoded-question-mark-results-in-incorrect-redirect-when-copied-to
https://example.com/Documentation/Product????/index.html
You say it's "not a query string", but actually that is exactly what it is. And that is why you can't match it with the RewriteRule pattern. The above URL is split as follows:
URL-path: /Documentation/Product (matched by the RewriteRule pattern)
Query string: ???/index.html (note 3 ? - the first one starts the query string)
To match the query string you'll need an additional RewriteCond directive that checks against the QUERY_STRING server variable.
For example, to match the above URL, you would need to do something like:
RewriteCond %{QUERY_STRING} ^\?*/index\.html
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/index.html [NC,R,L]
This matches any number of erroneous ? at the start of the query string.
I've added the R (redirect) flag. Your directive (without the R flag) would trigger an external redirect anyway (because you specifying an absolute URL in the substitution), but it is far better to be explicit here. This is also a temporary (302) redirect. If this should be permanent (301) then change it to R=301, but only once you have confirmed that it's working OK (301s are cached hard by the browser so can make testing problematic).
UPDATE:
...so I want the rule above to match both question marks and any other character.
Only if there are question marks in the URL will there be a query string, so I think it is advisable to keep these two rules separate.
If there could be any erroneous characters at the start of the query string and if you want to capture the end part of the URL (like you are doing in your original directive, eg. index.html) then you can modify the above to read:
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
Note the %1 (as opposed to $1) backreference in the substitution string. This is a backreference to the captured group in the last matched CondPattern (ie. /(.*)$).
You can follow this with your existing directive (but remember to include the R flag) for more "normal" URLs that don't contain a ? (ie. query string).
NB: Surrounding the arguments in double quotes are entirely optional in this example. They are only required if you have unescaped spaces in the pattern or substitution arguments.
In summary
# Redirect URLs of the form:
# "/Documentation/Product?<anything#1>/<anything#2>"
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
# Redirect URL-paths of the form (no query string):
# "/Documentation/Product<something>/<anything>"
RewriteRule ^Documentation/Product[^/]+/(.*) https://s3.amazonaws.com/company-documentation/Help/Product/$1 [NC,R,L]

.htaccess: Multiple TLD's to 1 TLD excluding subdomains

I have multiple TLD's (domainX.com, domainY.net, ...) pointing towards the same folder. In this folder I would like to add a .htaccess file to redirect ALL www and non-www URL's that aren't domainY.com to domainY.com.
There is one twist here however. I have some subdomains: alfa.domainY.com and beta.domainY.com and gamma.domainY.com set-up, which in all my tests keep redirecting to domainY.com.
Any chance anyone can give me a successful bit of code here?
EDIT: Maybe also add some #Comments, I noticed most answers here lack that, and I think it means some of them can't be reused as people don't know what they do. I can also add this myself afterwards.
Try adding these rules to the htaccess file of your document root:
RewriteEngine On
RewriteCond %{HTTP_HOST} !domainY\.net$ [NC]
RewriteRule ^(.*)$ http://domainY.net/$1 [L,R=301]
The expression matching the host will fail for any alpha.domainY.net because it only matches against the TLD (.net) and domain (domainY).
The first line turns on the rewrite engine.
The second line, the condition, is a true/false expression that is applied to the immediately following rule. In this case, it checks the request's Host: header, and if it ends with domainY.net, then the condition fails, because of the ! in front.
The third line is the rule, the URI is used to match against the pattern ^(.*)$ which essentially matches everything, and is captured via the parentheses. Then the next bit is the target. If the rule matches, which it does because the pattern matches everything, then the target is applied, and in this case, it redirects the browser to domainY.net and passes the same URI along with it via the regex backreference $1.

RewriteRule in htaccess

Could anyone explain the following line please?
RewriteRule ^(.*)$ /index.php/$1 [L]
The parts of the rewrite rule break down as follows:
RewriteRule
Indicates this line will be a rewrite rule, as opposed to a rewrite condition or one of the other rewrite engine directives
^(.*)$
Matches all characters (.*) from the beggining ^ to the end $ of the request
/index.php/$1
The request will be re-written with the data matched by (.*) in the previous example being substituted for $1.
[L]
This tells mod_rewrite that if the pattern in step 2 matches, apply this rule as the "Last" rule, and don't apply anymore.
The mod_rewrite documentation is really comprehensive, but admittedly a lot to wade through to decode such a simple example.
The net effect is that all requests will be routed through index.php, a pattern seen in many model-view-controller implementations for PHP. index.php can examine the requested URL segments (and potentially whether the request was made via GET or POST) and use this information to dynamically invoke a certain script, without the location of that script having to match the directory structure implied by the request URI.
For example, /users/john/files/index might invoke the function index('john') in a file called user_files.php stored in a scripts directory. Without mod_rewrite, the more traditional URL would probably use an arguably less readable query string and invoke the file directly: /user_files.php?action=index&user=john.
That will cause every request to be handled by index.php, which can extract the actual request from $_SERVER['REQUEST_URI']
So, a request for /foo/bar will be rewritten as /index.php/foo/bar
(I'm commenting here because I don't yet have the rep's to comment the answers)
Point #2 in meagar's answer doesn't seem exactly right to me. I might be out on a limb here (I've been searching all over for help with my .htaccess rewrites...), and I'd be glad for any clarification, but this is from the Apache 2.2 documentation on RewriteRule:
What is matched?
The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string. If you wish to match against the hostname, port, or query string, use a RewriteCond with the %{HTTP_HOST}, %{SERVER_PORT}, or %{QUERY_STRING} variables respectively.
To me that seems to say that for a URL of
http: // some.host.com/~user/folder/index.php?param=value
the part that will actually be matched is
~user/folder/index.php
So that is not matching "all characters (.*) from the beggining ^ to the end $ of the request", unless "the request" doesn't mean what I thought it does.

Resources