mod_rewrite - check for string - string

I want to check whether a URL contains the string "-EN.htm" and, if so, apply the rewrite.
That should be done with ^-EN.htm as follows, but the rule is not working:
RewriteCond %{REQUEST_URI} ^/(.*?)/([-_0-9a-zA-Z./=]*)^-EN.htm
RewriteRule ^(.*)$ /indexEN.php?folder=%1&follow=%2 [L]
What am I doing wrong?
Thank you for any help,
Scott

Your regular expression doesn't look right. You can also lose the condition and just move the pattern to the rewrite rule instead. Something along the lines of
RewriteRule ^/?(.*?)/([-_0-9a-zA-Z./=]*)^-EN.htm /indexEN.php?folder=$1&follow=$2 [L]
You need to make the leading slash optional (in htaccess this is stripped off) and instead of using % backreferences, use the $ ones.
Now on to your pattern, it's not valid. The ^ matches the beginning of the string (the URI), so if you have two of them and you're not trying to literally match the ^ character (which you'd need to escape), then the expression will never match anything. Without any examples of URLs that you're having to deal with, I assume you probably just want to ditch the second ^:
RewriteRule ^/?(.*?)/([-_0-9a-zA-Z./=]*)-EN.htm /indexEN.php?folder=$1&follow=$2 [L]
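The effect of the stray ^ can be checked outside Apache. Here is a minimal sketch in Python, whose re module treats ^ the same way as mod_rewrite's regex engine for these patterns. The sample path shop/page-EN.htm is made up, since the question gives no example URLs:

```python
import re

# Patterns from the answer above; in .htaccess the leading slash is already stripped.
BROKEN = r'^/?(.*?)/([-_0-9a-zA-Z./=]*)^-EN.htm'
FIXED = r'^/?(.*?)/([-_0-9a-zA-Z./=]*)-EN.htm'


def match_groups(pattern, path):
    """Return the captured groups, or None when the pattern does not match."""
    m = re.search(pattern, path)
    return m.groups() if m else None


# Hypothetical request path, not taken from the question.
sample = 'shop/page-EN.htm'
broken_result = match_groups(BROKEN, sample)  # second ^ can never be satisfied
fixed_result = match_groups(FIXED, sample)    # groups map to $1 and $2
print(broken_result)
print(fixed_result)
```

The second ^ sits after at least one consumed character (the slash), so it can never be at the start of the string and the broken pattern fails on every input.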

Related

Htaccess - Redirect if URL does not contain at least three numbers

I'm struggling to get this htaccess redirect to work. I want to redirect any URL that does not contain at least three numbers in a row. I started with the following and it worked perfectly for redirecting any URL that DID have three numbers in a row:
RewriteCond %{REQUEST_URI} [0-9]{3,20} [NC]
RewriteRule (.*) "https\:\/\/info\.mywebsite\.com\/" [R=301,L]
However, I tried to modify that with the exclamation mark to make the condition NOT match three numbers in a row:
RewriteCond %{REQUEST_URI} !([0-9]{3,20}) [NC]
RewriteRule (.*) "https\:\/\/info\.mywebsite\.com\/" [R=301,L]
But that doesn't seem to work as expected. Am I missing something with turning this expression into a not match?
Because you previously experimented with the opposite 301 (permanent) redirect, the results are most probably cached by the browser from that earlier redirect. It is a good idea to test with 302 (temporary) redirects to avoid caching issues.
Note also that the REQUEST_URI server variable contains the URL-path only, not the query string, so digits that appear only in the query string are not seen by your condition.
The quantifier {3,20} matches from 3 to 20 repetitions of the preceding token; if you want "at least three" then use the quantifier {3,} (no upper bound).
You don't need the capturing subpattern, i.e. the surrounding parentheses (...), on the regex since you are not using backreferences anywhere. Incidentally, you can't capture a subpattern on a negated regex.
You don't need the additional condition (RewriteCond directive) - this can all be done with the RewriteRule directive only.
The NC flag is not required here - you are checking digits only.
For example:
RewriteRule !\d{3,} https://info.mywebsite.com/ [R=302,L]
As noted in comments, the RewriteRule substitution string is a regular string, not a regex, so does not require any special backslash escaping (although colons and slashes don't need escaping anyway in Apache regex).
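To see which URL-paths the negated pattern would redirect, here is a rough Python sketch. It only mimics the "pattern does not match" test; the sample paths are invented for illustration:

```python
import re

THREE_DIGITS = re.compile(r'\d{3,}')


def should_redirect(url_path):
    # Mimics a negated RewriteRule pattern: redirect when the path
    # contains no run of three or more consecutive digits.
    return THREE_DIGITS.search(url_path) is None


# Hypothetical URL-paths (no leading slash, as seen in .htaccess).
for path in ['product/12345', 'about/team42', 'a1b2c3']:
    print(path, should_redirect(path))
```

Note that 'a1b2c3' is redirected even though it contains three digits, because they are not in a row.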

RewriteRule - only two directories and a file

I am trying to rewrite URLs through .htaccess and what I want is that a rule be taken into account only if the URL is like "www.mysite.com/basedir/directory/file.htm".
I don't want "www.mysite.com/basedir/file.htm" or "www.mysite.com/basedir/directory/directory/file.htm", I want the exact described structure. At the moment I am trying to do it with this:
RewriteRule ^(basedir)/([^/\.]+)/(.*)\.(htm)$ /template.php?&page=$3 [L]
but it doesn't work. It accepts any number of directories after basedir.
Thanks for your help
Your second match area needs to have a character exclusion like the first:
RewriteRule ^(basedir)/([^/\.]+)/([^/]*)\.(htm)$ /template.php?&page=$3 [L]
That way you don't allow additional slashes to be matched, only the two explicit slashes earlier in the pattern.
Also, what exactly are you trying to match with $3? You only need to use parentheses around things that you need to match. If $3 is supposed to match file.htm, then you could instead write:
RewriteRule ^basedir/[^/\.]+/([^/]*\.htm)$ /template.php?&page=$1 [L]
or if it should just match "file" then:
RewriteRule ^basedir/[^/\.]+/([^/]*)\.htm$ /template.php?&page=$1 [L]
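The difference between (.*) and ([^/]*) can be checked quickly in Python. This is only a sketch that reuses the two patterns from this thread against the three URL shapes in the question:

```python
import re

# Original pattern from the question: (.*) is free to swallow extra slashes.
LOOSE = re.compile(r'^(basedir)/([^/\.]+)/(.*)\.(htm)$')
# Last pattern from the answer: [^/]* cannot cross a directory boundary.
STRICT = re.compile(r'^basedir/[^/\.]+/([^/]*)\.htm$')

paths = [
    'basedir/directory/file.htm',
    'basedir/file.htm',
    'basedir/directory/directory/file.htm',
]

loose_hits = [p for p in paths if LOOSE.match(p)]
strict_hits = [p for p in paths if STRICT.match(p)]
page = STRICT.match(paths[0]).group(1)  # what $1 would hold

print(loose_hits)
print(strict_hits)
print(page)
```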

optional second directory in htaccess redirect

Hi
I want to write a mod_rewrite rule that handles the following:
www.domain.co.uk/brands
rewrites to www.domain.co.uk/index.php?p=brands
www.domain.co.uk/brands/5
rewrites to www.domain.co.uk/index.php?p=brands&go=5
Can this be achieved in one line without a conditional statement?
I have written this, but the second line is ignored:
RewriteRule ^(.*)\.html$ index.php\?p=$1 [L]
RewriteRule ^((.*)/*)(.*)\.html$ index.php?p=$1&go=$2 [L]
Any help would be much appreciated
Well, it seems to me that the first rule would match both versions, and rewrite www.domain.co.uk/brands/5 to www.domain.co.uk/index.php?p=brands/5 -- and then the [L] flag makes the matching stop. It wouldn't match the second rule after the rewrite, anyway.
The second regexp has an asterisk too many (after the slash) and one set of parens too many as well, but if you fix that and move it above the other one, it might help.
Just reverse the order. As a general practice, put the most specific rules first and the most generic ones last. Try this in your .htaccess:
RewriteRule ^([^/]*)/(.*)\.html$ /index.php?p=$1&go=$2 [L,NC,QSA]
RewriteRule ^(.*)\.html$ /index.php?p=$1 [L,NC,QSA]
This will rewrite a URI of '/brands/5.html' to /index.php?p=brands&go=5 and a URI of '/brands.html' to /index.php?p=brands
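Why the order matters can be illustrated with a small Python sketch that applies the first matching rule and stops, roughly what [L] does. It ignores QSA and the fact that Apache re-runs the rules on the rewritten URL, and the rewrite helper is purely illustrative:

```python
import re

# (pattern, substitution) pairs in the order they would appear in .htaccess.
RULES = [
    (re.compile(r'^([^/]*)/(.*)\.html$', re.IGNORECASE), r'/index.php?p=\1&go=\2'),
    (re.compile(r'^(.*)\.html$', re.IGNORECASE), r'/index.php?p=\1'),
]


def rewrite(path, rules=RULES):
    """Apply the first matching rule only, like a rule carrying the [L] flag."""
    for pattern, template in rules:
        m = pattern.match(path)
        if m:
            return m.expand(template)
    return path


print(rewrite('brands/5.html'))
print(rewrite('brands.html'))
# With the generic rule first, it swallows the two-segment URL as well.
print(rewrite('brands/5.html', RULES[::-1]))
```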
RewriteRule (.*)(/(.*))?$ index.php?p=$1&go=$3 [L]
should do it.
The reason your second rule fails is, you are matching one character (.) followed by a / at the start of the string, so immediately any urls with more than one character before the first slash will fail. You are also insisting that the url ends with html but in your examples they don't. Also for the future, remember . matches any single character so you probably meant to escape the . before html.

rewritecond, rewriterule and ignoring extra querystrings

I have an old url:
www.example.com/content.aspx?ID=227&ParentID=33&MicrositeID=0&Page=1
that I wish to rewrite to:
www.example.com/product/item
The only important bit is ID=227, everything after that can be stripped and is not required for the redirect. I need to not pass any querystrings to the new address, this is basically a hard rewrite from one address to another.
I have my rewrite rule:
RewriteCond %{QUERY_STRING} ^ID=227(.*)$ [NC]
RewriteRule ^content\.aspx$ http://www.example.com/product/item [R=301,L]
But as I'm a total noob at mod_rewrite I'm struggling - can any htaccess gurus out there help me out?
RewriteCond %{QUERY_STRING} (^|&)ID=(\d+)(&|$) [NC]
RewriteRule ^content\.aspx$ /product/item/%2? [R=301,L]
A few comments...
There's no need to add .* to match the whole string in this case. As long as you can pinpoint what you want to match, you'll be fine. (^|&) matches either the start of the string or & whereas (&|$) matches either the end of the string or &. This allows id=xxx to be anywhere in the query string, which is good practice. \d matches one digit whereas + is a repetition operator for "one or more".
Furthermore, you don't actually need to include the domain name so long as the resulting page is on the same domain. Just start the resulting string with a / to make it relative to the root level.
%2 means that you're inserting a submatch from the RewriteCond statement rather than the RewriteRule. The latter would be $1, $2, as you might know.
The trailing ? tells the rewrite engine not to append the querystring to the URL. (Don't worry, the question mark won't show up in the redirect URL)
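To check what the condition pattern captures, here is a short Python sketch using the same regex. The extract_id helper is just for illustration, and re.IGNORECASE stands in for the [NC] flag:

```python
import re

ID_PARAM = re.compile(r'(^|&)ID=(\d+)(&|$)', re.IGNORECASE)


def extract_id(query_string):
    """Return what %2 would hold, or None when the condition fails."""
    m = ID_PARAM.search(query_string)
    return m.group(2) if m else None


print(extract_id('ID=227&ParentID=33&MicrositeID=0&Page=1'))
print(extract_id('Page=1&id=227'))
print(extract_id('ParentID=33&MicrositeID=0'))
```

The last call returns None because the (^|&) anchor stops the ID inside ParentID and MicrositeID from being mistaken for the ID parameter.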

Variable in htaccess, RewriteRule question

How can I use a variable in a RewriteRule?
If i have:
SetEnv MY_VAR (a|b|c)
RewriteRule ^%{ENV:MY_VAR}$ index.php?s=$1 [L,QSA]
RewriteRule ^%{ENV:MY_VAR}-some-one$ index.php?s=$1 [L,QSA]
I tried these examples, but they don't work.
Later edit
OK Tim, thank you for the answer. Let's say that I have:
RewriteRule ^(skoda|bmw|mercedes)-([0-9]+)-([0-9]+)-some-think$ index.php?a=$1 [L,QSA]
RewriteRule ^some-one-(skoda|bmw|mercedes)/pag-([0-9]+)$ index.php?a=$1 [L,QSA]
RewriteRule ^a-z-(skoda|bmw|mercedes)$ index.php?a=$1 [L,QSA]
(Ignore the second part of each RewriteRule.) I don't want to repeat the (skoda|bmw|mercedes) list everywhere. It would be quicker to define it once as a variable and then use that in each rule.
You can't do that, because mod_rewrite doesn't expand variables in the regular expression clauses.
You can only use variables in the input argument to a RewriteCond, and as the result argument to a RewriteRule. There's overhead in compiling the regular expressions (especially if you're forced to do it per-request as with .htaccess files), so if you allowed variable content in them, they'd have to be recompiled for every comparison to ensure accuracy at the cost of performance. It seems the solution therefore was to not let you do that.
What exactly did you want to do that for, anyway?
I've received another answer on a mod_rewrite forum from jdMorgan:
Mod_rewrite cannot use a variable in a regex pattern. The .htaccess directives are not a scripting language...
I'd recommend:
RewriteCond $1<>$3 ^<>-([0-9]+)-([0-9]+)-some-think$ [OR]
RewriteCond $1<>$3 ^some-one-<>/pag-([0-9]+)$ [OR]
RewriteCond $1<>$3 ^a-z+-<>$
RewriteRule ^((?:[^\-/]+[\-/])*)(skoda|bmw|mercedes)([\-/].+)?$ index.php?a=$2 [QSA,L]
Here, the RewriteRule pattern is evaluated first (See Apache mod_rewrite documentation "Rule Processing").
If the pattern matches, then whatever comes before "(skoda|bmw|mercedes)" in the requested URL-path is placed into local variable "$1".
Whatever follows "(skoda|bmw|mercedes)" is placed into local variable $3.
The value of the requested URL-path matching "(skoda|bmw|mercedes)" is placed into $2.
Then each of the RewriteConds is processed to check that the format of the requested URL without the "(skoda|bmw|mercedes)" part is one of the formats to be accepted.
Note that the "<>" characters are used only as a separator to assist correct and unambiguous parsing, and have no special meaning as used here. They simply "take the place of" the variable-string that you do not want to include in each line. You can use any character or characters that you are sure will never appear in one of your URLs without first being URL-encoded. I prefer to use any of > or < or ~ myself.
Note also that the RewriteRule assumes that the "(skoda|bmw|mercedes)" substring will always be delimited by either a hyphen or a slash if any other substring precedes or follows it. I am referring to the two RewriteRule sub-patterns containing "[^-/]" ("NOT a hyphen or a slash") and "[-/]" ("Match a hyphen or a slash"). This greatly improves efficiency of regular-expressions pattern matching, so use this method if possible instead of using an ambiguous and inefficient sub-pattern like ".*" ("Match anything, everything, or nothing").
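The separator trick can be tried out in Python. This is a sketch under one assumption: the repeated prefix group is wrapped in an outer capture, so that group 1 holds the whole prefix. A bare repeated group keeps only its last repetition. The rewrite helper and the sample paths are illustrative only:

```python
import re

# Rule pattern: prefix, brand, optional suffix (groups 1, 2, 3).
RULE = re.compile(r'^((?:[^\-/]+[\-/])*)(skoda|bmw|mercedes)([\-/].+)?$')

# Condition patterns, tested against "$1<>$3" and combined with OR.
CONDS = [
    re.compile(r'^<>-([0-9]+)-([0-9]+)-some-think$'),
    re.compile(r'^some-one-<>/pag-([0-9]+)$'),
    re.compile(r'^a-z+-<>$'),
]


def rewrite(path):
    """Return the rewritten target, or None when the rule does not apply."""
    m = RULE.match(path)
    if not m:
        return None
    prefix, brand, suffix = m.group(1), m.group(2), m.group(3) or ''
    test_string = prefix + '<>' + suffix  # the brand is replaced by the separator
    if any(cond.search(test_string) for cond in CONDS):
        return 'index.php?a=' + brand
    return None


for path in ['skoda-12-34-some-think', 'some-one-bmw/pag-3',
             'a-z-mercedes', 'x-y-skoda']:
    print(path, rewrite(path))
```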
