Could someone explain this RewriteCond to me?
Especially the HTTP/ portion at the end:
RewriteEngine on
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.XXXXXXXXXXXXXXXX.com/ [R=301,L]
I found a very good explanation for the first half of the condition already here, but I'm still confused with the "/index.php\ HTTP/" portion.
And again: Why the "HTTP/" ?
That explanation omits some detail. The THE_REQUEST server variable contains the first line of the HTTP request (the request line), and will contain a string of the form:
GET /index.php HTTP/1.1
So, it contains 3 pieces of information, separated with spaces. You have the request method (GET, POST, OPTIONS, HEAD, etc.), the URI and the protocol/version used (HTTP v1.1 in this case).
For this particular regex you don't necessarily need to include the final HTTP/ part, just up to and including the space. However, including HTTP/ makes the regex easier to read and less prone to error. It would not be best practice to end on a space (that you can't see).
The following would be equivalent:
RewriteCond %{THE_REQUEST} \s/index\.php\s
This uses the shorthand character class for whitespace (\s) instead of escaping a literal space (so you can "see" the space).
Or, use double quotes to surround the argument:
RewriteCond %{THE_REQUEST} " /index\.php "
(Although, that almost looks like an error at first glance.)
The reason for including this condition (RewriteCond directive) in the first place is to prevent a rewrite loop should you be rewriting the request to index.php later (a front-controller pattern). Unlike other variables, THE_REQUEST is not updated when the request is internally rewritten by Apache.
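To see what the condition actually matches, here is a minimal sketch that uses Python's re module as a stand-in for Apache's regex engine (an approximation, since the two engines are not identical). The sample request lines and the helper name matches are made up for illustration.

```python
import re

# The pattern from the RewriteCond, unchanged. "\ " is an escaped literal space.
ORIGINAL = re.compile(r"^[A-Z]{3,9}\ /index\.php\ HTTP/")
# The shorter variant using the whitespace shorthand.
SHORTHAND = re.compile(r"\s/index\.php\s")

# Hypothetical request lines, in the form THE_REQUEST would hold them.
request_lines = [
    "GET /index.php HTTP/1.1",         # direct request for /index.php
    "GET / HTTP/1.1",                  # request for the site root
    "GET /index.php?page=2 HTTP/1.1",  # query string follows, so no space after .php
    "POST /blog/index.php HTTP/1.1",   # index.php in a subdirectory
]

def matches(pattern, line):
    """Return True if the pattern is found anywhere in the request line."""
    return pattern.search(line) is not None

for line in request_lines:
    print(f"{line!r}: original={matches(ORIGINAL, line)} shorthand={matches(SHORTHAND, line)}")
```

Only the first request line matches either pattern, which is why the trailing space (or HTTP/) matters: it stops the condition from matching longer URLs that merely start with /index.php.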
I need a solution. It seems that at some point in the past I generated some "bad" links that only bots follow.
Summary: malformed URLs contain a fake "page" parameter. When there are two "page" parameters, the first one is fake and must be removed.
X is random
Remove the page parameter only when "?/?page" is detected
Good: search?page=496
Bad: search?/?page=X
Good: https://example.com/search?page=496
Good: https://example.com/search?page=496&orderBy=oldest
Bad: https://example.com/search?/?page=X&page=496&orderBy=oldest
RewriteCond %{QUERY_STRING} ^(.*)&?^XXX[^&]+&?(.*)$ [NC]
RewriteRule...
Thank you guys!
UPDATE
In the end, I found a solution myself:
RewriteCond %{QUERY_STRING} ^(.*)&?^/\?page=[^&]+&?(.*)$ [NC]
RewriteRule ^/?(.*)$ /search$1?%1%2 [R=301,L]
RewriteCond %{QUERY_STRING} ^/\?page=.+&(page=.*)
RewriteRule ^(search)$ $1?%1 [R=301]
This will do the rewrite for all your URLs that have the extra page parameter you want to keep.
To make the last part optional, we would have to wrap &(page=.*) in another set of parentheses and add a ? as quantifier: (&(page=.*))?.
Then the back reference would need to be changed from %1 to %2 (because we only need the inner part, not the &). But for a URL without any real page parameter to keep, there would be no match in this place, and therefore the %2 would not be substituted with anything, but added to the URL literally.
So better to leave the above as-is, and simply add
RewriteCond %{QUERY_STRING} ^/\?page=.+
RewriteRule ^(search)$ $1 [QSD,R=301]
below those two existing lines. The pattern does not need to be any more specific, because the URLs that have a genuine page parameter at the end have already been handled by the previous two lines. And the QSD flag makes it simply drop the existing query string, so that https://example.com/search?/?page=20 results in https://example.com/search (which I assume is what you wanted here, because there is no actual page parameter to keep, correct?)
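The combined effect of the two rule pairs can be checked outside Apache. The following is a rough sketch in Python that mimics only the query-string logic (not the redirect itself); the function name cleaned_query is hypothetical, and Python's re is assumed to behave like Apache's regex engine for these simple patterns.

```python
import re
from typing import Optional

# Same patterns as the two RewriteCond directives above.
FAKE_THEN_REAL = re.compile(r"^/\?page=.+&(page=.*)")
FAKE_ONLY = re.compile(r"^/\?page=.+")

def cleaned_query(query: str) -> Optional[str]:
    """Return the query string to redirect to, or None if no rule applies."""
    m = FAKE_THEN_REAL.search(query)
    if m:
        return m.group(1)  # keep the genuine page parameter and what follows (%1)
    if FAKE_ONLY.search(query):
        return ""          # QSD: drop the query string entirely
    return None            # well-formed query string, left alone

for q in ["/?page=X&page=496&orderBy=oldest", "/?page=20", "page=496&orderBy=oldest"]:
    print(repr(q), "->", repr(cleaned_query(q)))
```

The order of the checks mirrors the order of the rules: the more specific pattern has to be tried first, otherwise the catch-all would also discard the genuine page parameter.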
I'm struggling to get this htaccess redirect to work. I want to redirect any URL that does not contain at least three numbers in a row. I started with the following and it worked perfectly for redirecting any URL that DID have three numbers in a row:
RewriteCond %{REQUEST_URI} [0-9]{3,20} [NC]
RewriteRule (.*) "https\:\/\/info\.mywebsite\.com\/" [R=301,L]
However, I tried to modify that with the exclamation mark to make the condition NOT match three numbers in a row:
RewriteCond %{REQUEST_URI} !([0-9]{3,20}) [NC]
RewriteRule (.*) "https\:\/\/info\.mywebsite\.com\/" [R=301,L]
But that doesn't seem to work as expected. Am I missing something with turning this expression into a not match?
If you previously experimented with the opposite 301 (permanent) redirect, the earlier result is most probably cached by the browser. It is a good idea to test with 302 (temporary) redirects to avoid caching issues.
Note also that the REQUEST_URI server variable contains the URL-path only, so digits that appear only in the query string are not seen by your condition.
The quantifier {3,20} matches from 3 to 20 characters. If you want "at least three", use the quantifier {3,} (no upper bound).
You don't need the capturing subpattern, i.e. the surrounding parentheses (...) on the regex, since you are not using backreferences anywhere. Incidentally, you can't capture subpatterns on a negated regex.
You don't need the additional condition (RewriteCond directive) - this can all be done with the RewriteRule directive only.
The NC flag is not required here - you are checking digits only.
For example:
RewriteRule !\d{3,} https://info.mywebsite.com/ [R=302,L]
As noted in comments, the RewriteRule substitution string is a regular string, not a regex, so does not require any special backslash escaping (although colons and slashes don't need escaping anyway in Apache regex).
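As a quick sanity check of the negated pattern, here is a small Python sketch that mimics the decision the rule makes. The function name should_redirect and the sample paths are hypothetical, and Python's re is used only as an approximation of Apache's regex engine.

```python
import re

THREE_DIGITS = re.compile(r"\d{3,}")

def should_redirect(url_path: str) -> bool:
    # The leading "!" in the RewriteRule pattern negates the match:
    # the rule fires only when the pattern does NOT match.
    return THREE_DIGITS.search(url_path) is None

for path in ["about/team", "item/12", "item/12345", "2024/post"]:
    print(path, should_redirect(path))
```

Paths with fewer than three consecutive digits are redirected; paths such as item/12345 are left alone.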
I have spent hours looking for a solution, but .htaccess rules seem way over my head. I have this rule:
RewriteRule ^(.*)$ wikka.php?wakka=$1 [QSA,L]
and I need it to be applied only if there is anything beyond the domain name, ie. www.example.com/xyz but NOT with just www.example.com because then I only need to display a simple index.php instead {no address translation}.
How do I do that?
You just need to change the * (0 or more) in the RewriteRule pattern to + (1 or more). For example:
RewriteRule (.+) wikka.php?wakka=$1 [QSA,L]
I also removed the anchors ^ and $, since they are not necessary here. You are grabbing everything anyway, so saying you are grabbing everything from the start to the end is just not necessary.
Also, unless you require the query string from the original request, I would remove the QSA flag. By itself this rule will result in a repeated query string of the form ?wakka=wikka.php&wakka=xyz, where xyz is the initial request, because in .htaccess the rewritten URL wikka.php is matched by the pattern again on the next pass. This still "works" when reading the URL parameters with PHP, as wakka=xyz will override the earlier parameter.
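The difference between * and + is easy to verify in isolation. A minimal Python sketch follows, assuming (as is the case for a .htaccess file in the document root) that a request for the bare domain gives the RewriteRule an empty URL-path to match against; the names STAR and PLUS are just labels for this example.

```python
import re

STAR = re.compile(r"^(.*)$")  # original pattern: zero or more characters
PLUS = re.compile(r"(.+)")    # suggested pattern: one or more characters

# "" stands for a request to the bare domain, "xyz" for www.example.com/xyz.
for path in ["", "xyz", "wikka.php"]:
    print(repr(path), "star:", bool(STAR.search(path)), "plus:", bool(PLUS.search(path)))
```

Note that wikka.php itself also matches (.+), which is why the rewritten URL can be picked up by the same rule a second time.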
Not necessarily a problem, just something that I am not yet knowledgeable enough to do. I have an .htaccess file that I am using for url rewriting. This is what I have now.
ErrorDocument 404 /inc/error_documents/404.php
ErrorDocument 503 /inc/error_documents/503.php
# For security reasons, Option followsymlinks cannot be overridden.
#Options +FollowSymLinks
Options +SymLinksIfOwnerMatch
RewriteEngine ON
RewriteRule ^home$ /index.php [nc]
RewriteRule ^(about|contact|giving-tree)/?$ /$1.php [nc]
RewriteRule ^giving-tree/([0-9+]?)/?$ giving-tree.php?ageBegin=$1 [nc]
RewriteRule ^giving-tree/([0-9+]?)/([0-9+]?)/?$ giving-tree.php?ageBegin=$1&ageEnd=$2 [nc]
RewriteRule ^giving-tree/([0-9+]?)/([0-9+]?)/([0-9+]?)/?$ giving-tree.php?ageBegin=$1&ageEnd=$2&page=$3 [nc]
What I want to be able to do is make some of the parts in the 3 bottom rules optional. I know that I can accomplish this with RewriteCond, but I'm not sure how. What I need is basically this:
RewriteCond %{HTTP_HOST} ^hearttohandparadise.org/giving-tree
RewriteRule /beginAge-([0-9+]) #make it send GET request with beginAge as the variable
RewriteRule /endAge-([0-9+]) \?beginAge=$1 #make it send GET request with endAge as the variable
etc... etc...
Is there any way to accomplish this just by relying on .htaccess? or am I just fantasizing?
Forgive me if I sound stupid.
No, it's a perfectly valid idea. You'd basically want to allow the user to write the URI in an unstructured manner, without a strict order imposed, right? Like, I could write giving-tree/page-6/endAge-23?
If so, this is what you're looking for:
RewriteRule /beginAge-([0-9]+) giving-tree.php?beginAge=$1 [QSA,NC]
RewriteRule /endAge-([0-9]+) giving-tree.php?endAge=$1 [NC,QSA]
RewriteRule /page-([0-9]+) giving-tree.php?page=$1 [NC,QSA]
You see, if any part of the URI matches the expression "/beginAge-([0-9]+)", it'll be rewritten to giving-tree.php?beginAge=$1; the magic is done by the QSA (Query String Append) option, which appends any existing query string to the resulting URI. So as more and more matches are found and more and more GET parameters added, the query string just grows.
If you want a stricter thing, where some parameters are optional but their order is fixed, then it's uglier by orders of magnitude:
RewriteRule /(beginAge-)?([0-9]+)/?(endAge-)?([0-9]+)?/?(page-)?([0-9]+)? giving-tree.php?beginAge=$2&endAge=$4&page=$6 [NC]
I just made everything optional by using the ? operator. This one may use some prettifying/restructuring.
(Alternatively, you could just do this:
RewriteRule ^giving-tree/(.+?)/?$ process.php?params=$1 [nc]
That is, grabbing the entire part of the URI after the giving-tree part, lumping the whole thing into a single parameter, then processing the thing with PHP (as it's somewhat better equipped for string manipulation). But the first version is certainly more elegant.)
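To illustrate the script-side processing, here is a rough sketch of the parsing step, written in Python rather than PHP purely for illustration. It assumes the captured value consists of slash-separated name-value segments such as page-6/endAge-23; the function name parse_params is hypothetical.

```python
def parse_params(raw: str) -> dict:
    """Split 'name-value' segments separated by slashes into a dict."""
    params = {}
    for segment in raw.strip("/").split("/"):
        name, sep, value = segment.partition("-")
        # Keep only well-formed segments whose value is made of ASCII digits.
        if sep and name and value.isascii() and value.isdigit():
            params[name] = int(value)
    return params

print(parse_params("page-6/endAge-23"))
```

Because each segment carries its own name, the order in which the segments appear in the URL does not matter, and malformed segments are simply ignored.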
By the way, are you sure about the ([0-9+]?) parts? This means "One or no single character, which may be a digit or the plus sign". I think you meant ([0-9]+), i.e. "one or more digits".
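The difference between the two sub-patterns shows up as soon as a value has more than one digit. A small check in Python follows (used here only as an approximation of Apache's regex engine); fullmatch stands in for the slashes and anchors that delimit each segment in the real rules.

```python
import re

AS_WRITTEN = re.compile(r"[0-9+]?")  # optional single character: a digit or "+"
INTENDED = re.compile(r"[0-9]+")     # one or more digits

for text in ["7", "23", "+", ""]:
    print(repr(text),
          "as written:", AS_WRITTEN.fullmatch(text) is not None,
          "intended:", INTENDED.fullmatch(text) is not None)
```

The pattern as written accepts a lone "+" and an empty segment but rejects "23", which is the opposite of what the rules need.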
How can I use a variable in a RewriteRule pattern?
If I have:
SetEnv MY_VAR (a|b|c)
RewriteRule ^%{ENV:MY_VAR}$ index.php?s=$1 [L,QSA]
RewriteRule ^%{ENV:MY_VAR}-some-one$ index.php?s=$1 [L,QSA]
I tried these examples, but they don't work.
Later edit
OK Tim, thank you for the answer. Let's say that I have:
RewriteRule ^(skoda|bmw|mercedes)-([0-9]+)-([0-9]+)-some-think$ index.php?a=$1 [L,QSA]
RewriteRule ^some-one-(skoda|bmw|mercedes)/pag-([0-9]+)$ index.php?a=$1 [L,QSA]
RewriteRule ^a-z-(skoda|bmw|mercedes)$ index.php?a=$1 [L,QSA]
(forget the second part of the RewriteRule) ... I don't want to repeat the (skoda|bmw|mercedes) list everywhere. It would be quicker to define a variable once and use it in each rule.
You can't do that, because mod_rewrite doesn't expand variables in the regular expression clauses.
You can only use variables in the input argument to a RewriteCond, and as the result argument to a RewriteRule. There's overhead in compiling the regular expressions (especially if you're forced to do it per-request as with .htaccess files), so if you allowed variable content in them, they'd have to be recompiled for every comparison to ensure accuracy at the cost of performance. It seems the solution therefore was to not let you do that.
What exactly did you want to do that for, anyway?
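Since the pattern cannot reference a variable, one possible workaround is to keep the list in a single place outside Apache and generate the rules from it. The following is a minimal sketch in Python; the script itself and the names BRANDS, TEMPLATES and build_rules are hypothetical, the rule templates are the ones from the question, and the brand names are assumed to contain no regex metacharacters.

```python
BRANDS = ["skoda", "bmw", "mercedes"]

# "{brands}" marks where the alternation group goes.
TEMPLATES = [
    r"RewriteRule ^{brands}-([0-9]+)-([0-9]+)-some-think$ index.php?a=$1 [L,QSA]",
    r"RewriteRule ^some-one-{brands}/pag-([0-9]+)$ index.php?a=$1 [L,QSA]",
    r"RewriteRule ^a-z-{brands}$ index.php?a=$1 [L,QSA]",
]

def build_rules(brands):
    """Expand each template with a (a|b|c) group built from the brand list."""
    group = "(" + "|".join(brands) + ")"
    # str.replace is used instead of str.format so other braces stay untouched.
    return [template.replace("{brands}", group) for template in TEMPLATES]

print("\n".join(build_rules(BRANDS)))
```

The output can be pasted into the .htaccess file; changing the list then means editing one line and regenerating, rather than touching every rule.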
I've received another answer on a mod_rewrite forum from jdMorgan:
Mod_rewrite cannot use a variable in a regex pattern. The .htaccess directives are not a scripting language...
I'd recommend:
RewriteCond $1<>$3 ^<>-([0-9]+)-([0-9]+)-some-think$ [OR]
RewriteCond $1<>$3 ^some-one-<>/pag-([0-9]+)$ [OR]
RewriteCond $1<>$3 ^a-z+-<>$
RewriteRule ^((?:[^\-/]+[\-/])*)(skoda|bmw|mercedes)([\-/].+)?$ index.php?a=$2 [QSA,L]
Here, the RewriteRule pattern is evaluated first (See Apache mod_rewrite documentation "Rule Processing").
If the pattern matches, then whatever comes before "(skoda|bmw|mercedes)" in the requested URL-path is placed into local variable "$1".
Whatever follows "(skoda|bmw|mercedes)" is placed into local variable $3.
The value of the requested URL-path matching "(skoda|bmw|mercedes)" is placed into $2.
Then each of the RewriteConds is processed to check that the format of the requested URL without the "(skoda|bmw|mercedes)" part is one of the formats to be accepted.
Note that the "<>" characters are used only as a separator to assist correct and unambiguous parsing, and have no special meaning as used here. They simply "take the place of" the variable-string that you do not want to include in each line. You can use any character or characters that you are sure will never appear in one of your URLs without first being URL-encoded. I prefer to use any of > or < or ~ myself.
Note also that the RewriteRule assumes that the "(skoda|bmw|mercedes)" substring will always be delimited by either a hyphen or a slash if any other substring precedes or follows it. I am referring to the two RewriteRule sub-patterns containing "[^-/]" ("NOT a hyphen or a slash") and "[-/]" ("Match a hyphen or a slash"). This greatly improves efficiency of regular-expression pattern matching, so use this method if possible instead of using an ambiguous and inefficient sub-pattern like ".*" ("Match anything, everything, or nothing").
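One detail worth verifying with this technique is what ends up in $1 when the prefix has more than one segment. In PCRE-style engines a repeated capturing group keeps only its last iteration, so a prefix written as a repeated group and a prefix wrapped in an outer group behave differently. The sketch below compares the two using Python's re as a stand-in for Apache's engine; the helper name cond_input and the sample paths are hypothetical.

```python
import re

# Prefix written as a repeated capturing group.
REPEATED = re.compile(r"^([^\-/]+[\-/])*(skoda|bmw|mercedes)([\-/].+)?$")
# Variant: an outer group captures the whole prefix, the inner group is non-capturing.
WRAPPED = re.compile(r"^((?:[^\-/]+[\-/])*)(skoda|bmw|mercedes)([\-/].+)?$")

def cond_input(pattern, path):
    """Build the "$1<>$3" test string; unset groups expand to an empty string."""
    m = pattern.match(path)
    if m is None:
        return None
    return (m.group(1) or "") + "<>" + (m.group(3) or "")

for path in ["skoda-1-2-some-think", "some-one-bmw/pag-3", "a-z-mercedes"]:
    print(path, "|", cond_input(REPEATED, path), "|", cond_input(WRAPPED, path))
```

With the wrapped form the test string contains the full prefix (for example some-one-<>/pag-3), which is what the RewriteCond patterns expect; with the repeated group only the last prefix segment survives.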