Can I combine these 3 rewrite rules into 1? - .htaccess

I'm having a brain fade and need some help please. I'm using 3 RewriteRules to accomplish something that I think should take just one:
RewriteRule ^([0-9]+)$ /bar/$1.html [R=301,L]
RewriteRule ^([0-9]+)(-*)$ /bar/$1.html [R=301,L]
RewriteRule ^([0-9]+)-([0-9]+)$ /bar/$1.html#$2 [R=301,NE,L]
I need to take the following URLs:
http://foo.com/100
http://foo.com/100-1
http://foo.com/200-
http://foo.com/1999
http://foo.com/1999-99
...and rewrite them like this:
http://foo.com/bar/100.html
http://foo.com/bar/100.html#1
http://foo.com/bar/200.html
http://foo.com/bar/1999.html
http://foo.com/bar/1999.html#99
What I have works but seems like a bit of a hack. Is there a way to combine all of this into one rule?

I don't see a way to combine all three rules into a single rule, because the replacement structure is not always the same, with hash sometimes appearing and sometimes not appearing. But you can combine the first two rules:
RewriteRule ^([0-9]+)-?$ /bar/$1.html [R=301,L]
The third rule, which appends the fragment after a hash symbol, can remain as is:
RewriteRule ^([0-9]+)-([0-9]+)$ /bar/$1.html#$2 [R=301,NE,L]

You can combine all 3 rules into one with this trick:
RewriteCond %{REQUEST_URI} ^/(\d+)-?(\d+)?$
RewriteCond %1#%2 ^(\d+)#$ [OR]
RewriteCond %1#%2 ^(\d+)(#\d+)$
RewriteRule ^ /bar/%1.html%2 [R=301,L,NE]
In the first condition, we match a regex pattern that starts with a number, followed by an optional hyphen and an optional second number.
The next two conditions are joined with [OR], so only one of them needs to be true.
For URI /100, the second condition (the first [OR] alternative) will be true: 100 will be captured in %1 but %2 will be empty.
For URI /100-1, the third condition will be true: 100 will be captured in %1 and %2 will be #1.
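To make the capture flow concrete, here is the same rule set again with a trace of what each condition sees for three of the sample URLs. This is only an annotated sketch: it assumes the rules live in a .htaccess file at the document root, and it relies on mod_rewrite skipping the rest of an [OR] chain once one alternative has matched.

```apache
# Cond 1 captures from the request path: %1 = first number, %2 = second number (or empty)
#   /100    -> %1=100  %2=(empty)
#   /100-1  -> %1=100  %2=1
#   /200-   -> %1=200  %2=(empty)
RewriteCond %{REQUEST_URI} ^/(\d+)-?(\d+)?$
# Cond 2 tests "%1#%2" and matches only when nothing follows the "#":
#   "100#" and "200#" match -> %1 is recaptured, %2 is empty
RewriteCond %1#%2 ^(\d+)#$ [OR]
# Cond 3 is evaluated only if cond 2 failed:
#   "100#1" matches         -> %1=100, %2=#1
RewriteCond %1#%2 ^(\d+)(#\d+)$
# Result: /bar/100.html, /bar/100.html#1, /bar/200.html
RewriteRule ^ /bar/%1.html%2 [R=301,L,NE]
```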

Related

htaccess rewriterule generating multiple copies of wrong match

Trying to use pretty URLs rewritten to PHP query parameters using .htaccess rules.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^foo/?([^/]*)/?([^/]*)/?$ /foo.php?s=$1&c=$2 [NC,END,R=301,QSA]
RewriteRule ^bar/?([^/]*)/?$ /bar.php?s=$1 [NC,END,R=301,QSA]
The first rule works correctly, but the second one generates:
https://example.com/bar.php?s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=.php&s=45843
from
https://example.com/bar/45843
RewriteCond directives apply only to the single RewriteRule immediately following them. That means your second RewriteRule is not covered by any RewriteCond, which in turn means it creates an endless redirect loop.
You might object that you redirect to /bar.php, which the matching pattern should not match again, but ...
That is not actually true. Take a closer look at your rule:
RewriteRule ^bar/?([^/]*)/?$ /bar.php?s=$1 [NC,END,R=301,QSA]
The matching pattern uses /?, which makes the slash optional. So bar.php matches again when the browser follows the redirect (the query string is not part of what the pattern is tested against): this time .php is captured as $1, and QSA appends the previous query string, which is how the s=.php parameters pile up.
Solution:
apply the conditions to both rules and
use a proper matching pattern.
Actually, I am not sure what you are trying to match with those patterns ... Why the /? at all?
Are you trying to match a query string that way? That won't work; you need another RewriteCond for that, applying a matching pattern against %{QUERY_STRING}. That is documented, actually.
Or are you trying to make anything after /bar optional? Then use a pattern like ^/?bar(/[^/]*)?/?$ maybe ...
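Putting both suggestions together, the rule set might look like the sketch below. It assumes that a path segment after /foo or /bar is always present, which the original patterns did not require, so adjust the patterns if /foo or /bar on their own must also be rewritten. The flags are copied unchanged from the question.

```apache
RewriteEngine On

# Conditions guard only the next RewriteRule, so they are repeated per rule.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# The slash after "foo" is now mandatory; the second segment stays optional.
RewriteRule ^foo/([^/]+)(?:/([^/]*))?/?$ /foo.php?s=$1&c=$2 [NC,END,R=301,QSA]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# "bar.php" no longer matches, because a slash must follow "bar".
RewriteRule ^bar/([^/]+)/?$ /bar.php?s=$1 [NC,END,R=301,QSA]
```

With the mandatory slash, the redirect target /bar.php is no longer picked up by the pattern, and the !-f condition excludes it a second time as long as bar.php exists on disk.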

htaccess rewrite rule for Forbidding urls

I'm not really new to htaccess rewrites, but today I have seen a rule which I've not seen before:
# Access block for folders
RewriteRule _(?:recycler|temp)_/ - [F]
This rule is part of the Typo3 htaccess file.
What does the "?:" mean? Is this some kind of back reference? And what do the underlines stand for?
Many thanks!
The rule RewriteRule _(?:recycler|temp)_/ - [F] can be split into two rules for easier understanding:
RewriteRule _recycler_/ - [F]
and
RewriteRule _temp_/ - [F]
Now let us look at what that means:
As you can see, it is a shortcut that makes one rule out of two.
We can use a regex to match multiple patterns and perform the same action on all URIs that meet the same criteria (i.e. that the regex matches).
In this case we match _ (a literal character) followed by (?:recycler|temp), where ?: marks a non-capturing group. Whatever is matched inside (?:...) is NOT available as a backreference. The group simply matches the text recycler OR temp, preceded and followed by _.
So what is a capturing group? In .htaccess we can capture matched values and use them later, e.g. $1 for the first captured value (stored in memory). A non-capturing group says that we want to match part of a regex but do NOT want to store it in memory (because we do NOT need it later on).
Here is an example of capturing group rules in htaccess rules:
RewriteEngine ON
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.php -f [NC]
RewriteRule ^(first|second)/(.*)?$ $1.php?$2 [QSA,NC,L]
Explanation of the above example: it creates two capturing groups. The first holds either first OR second, and the second holds anything (because we used .*), so in the substitution we use $1 and $2 to get their values. You can also see that these values can be used in the condition part, which effectively becomes %{DOCUMENT_ROOT}/first.php OR %{DOCUMENT_ROOT}/second.php.
Here is an example of non-capturing groups in htaccess Rules:
RewriteEngine ON
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.php -f [NC]
RewriteRule ^(?:first|second)/(third|fourth)/?$ $1.php [QSA,NC,L]
Explanation of the above example: first OR second is matched by a non-capturing group, so $1 now holds the next group, i.e. either third OR fourth. The condition check therefore becomes %{DOCUMENT_ROOT}/third.php OR %{DOCUMENT_ROOT}/fourth.php.
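Applied to the Typo3 rule from the question, the difference is easy to see. The sketch below is for comparison only; the request paths in the comments are made-up examples.

```apache
# Non-capturing (as shipped): groups the alternatives, stores nothing.
RewriteRule _(?:recycler|temp)_/ - [F]

# Capturing variant: blocks exactly the same requests, but additionally
# stores "recycler" or "temp" in $1, which a "-" substitution never uses.
# RewriteRule _(recycler|temp)_/ - [F]

# Either form answers 403 Forbidden for paths such as
#   fileadmin/_temp_/upload.txt
#   uploads/_recycler_/old.jpg
# and ignores fileadmin/temp/upload.txt (no surrounding underscores).
```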

Loop over single RewriteRule

Would there be any way to continuously repeat execution of a single RewriteRule in .htaccess?
For example:
RewriteEngine On
# ...
RewriteCond %{QUERY_STRING} (.*?)(?:^|&)q=[^&]*(.*)
RewriteRule (.*) $1?%1%2
# ...
This rule will only be executed once before moving on to other rules. However, it could certainly match more than once:
Input: /?a=a&q=q&b=b&q=q&c=c
Expected: /?a=a&b=b&c=c
Actual: /?a=a&b=b&q=q&c=c
The closest way to do this seems to be the [N] flag, but this only works if the rule is at the very top of the file. Unfortunately, this is not feasible if the rule relies on other rules being executed beforehand.
An answer to a similar question suggests that it is not possible, but it never addresses the question directly.
To add more context, the end goal is to essentially encode a query string:
RewriteEngine On
# ...
RewriteCond %{REQUEST_URI} ^/folder/(.+)
RewriteRule .? - [E=QUERY:%1,S=1]
RewriteRule .? - [S=2]
# Repeat the next rule as many times as possible
RewriteCond %{ENV:QUERY} (.*?)&(.*)
RewriteRule .? - [E=QUERY:%1\%26%2]
RewriteRule .? /folder?query=%{ENV:QUERY}
# ...
This would result in the following:
Input: /folder/a&b&c
Expected: /folder?query=a%26b%26c
Actual: /folder?query=a%26b&c
However, I'm still curious if there would be a general way to loop on one rule continuously in .htaccess.
First you should create an index.php in folder/ with this content to display the various PHP variables:
<?php
phpinfo();
?>
Here is a generic way to loop the rules and replace each & by - using N flag.
RewriteEngine On
#
# your other rules here
#
# recursively replace & by - and build query string
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{QUERY_STRING} ^(?:query=-?(.+))?$
RewriteRule ^(folder)/([^&]+)(?:&(.+))?$ $1/$3?query=%1-$2 [N,NC]
# **will never execute** due to previous rule looping
RewriteRule ^(folder)/(.+)$ $1/?query=$2 [NC,L,R]
The [N] flag causes the ruleset to start over again from the top, using the result of the ruleset so far as a starting point. Use with extreme caution, as it may result in loop.
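For the simple case where the looping rule can tolerate the whole rule set being re-run, a minimal sketch looks like this. It replaces one & in the URL path per pass, and the iteration cap (N=20) is an arbitrary value chosen for illustration; the N=limit form needs Apache 2.4.8 or later.

```apache
RewriteEngine On

# Each pass rewrites the last "&" in the path to "-" and then restarts
# rule processing from the top; the loop ends once no "&" is left.
# N=20 caps the number of restarts (illustrative value).
RewriteRule ^(.*)&(.*)$ $1-$2 [N=20]
```

Because [N] restarts from the first rule, any rules placed above this one run again on every pass, so they need to be safe to repeat.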
At least in this specific case, there does appear to be a simple answer by using the [B] flag:
RewriteRule ^/?folder/(.+) /folder?query=$1 [B]

Stop hotlinking using htaccess and non-specific domain code

I need to write an anti-hotlink rule for my .htaccess file, but it cannot be specific to any domain name in particular. Here's what I found on another site so far, but I'm not sure exactly why it doesn't work. Can anyone spot the problem?
# Stop hotlinking.
#------------------------------
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} ^https?://([^/]+)/ [NC]
# Note the # is just used as a boundary. It could be any character that isn't used in domain-names.
RewriteCond %1#%{HTTP_HOST} !^(.+)#\1$
RewriteRule \.(bmp|gif|jpe?g|png|swf)$ - [F,L,NC]
Try this.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?([^/]+)/.*$ [NC]
RewriteCond %2#%{HTTP_HOST} !^(.+)#(www\.)?\1$ [NC]
RewriteRule \.(bmp|gif|jpe?g|png|swf)$ - [F,L,NC]
This works even when only one of the referrer or the target URL has a leading www.
EDIT : (how does this % thing work?)
%n references the content matched by the n-th bracket of the last matched rewrite condition.
So, in this case
%1 = either www. or "" (blank, because the group is optional; that is what ()? does)
%2 = the referring domain, always without www
So, for a hotlinked request, the rewrite condition actually tries to match
stealer.com#yourdomain.com OR stealer.com#www.yourdomain.com
with ^(.+)#(www\.)?\1$, which means: (.+)# is anything and everything before the #, followed by an optional www., followed by \1, the first bracket's matched content (within this regex, not the rewrite condition), i.e. the exact same thing that came before the #.
So a request referred from stealer.com fails the regex, while one referred from yourdomain.com (tested as yourdomain.com#yourdomain.com) passes. But since we've negated the pattern with a !, stealer.com passes the condition and hence the hot-link stopper rule is applied.
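A worked trace may help here. The sketch repeats the conditions with the values they would see for two hypothetical requests; stealer.com and yourdomain.com are the placeholder names used above, and the page paths are invented.

```apache
# Request A (hotlink):  Referer: http://www.stealer.com/page.html   Host: yourdomain.com
# Request B (own page): Referer: https://yourdomain.com/gallery/    Host: www.yourdomain.com
RewriteCond %{HTTP_REFERER} !^$
# A: %1="www."  %2="stealer.com"      B: %1=""  %2="yourdomain.com"
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?([^/]+)/.*$ [NC]
# A: "stealer.com#yourdomain.com"         does not match -> negated condition is true
# B: "yourdomain.com#www.yourdomain.com"  matches        -> negated condition is false
RewriteCond %2#%{HTTP_HOST} !^(.+)#(www\.)?\1$ [NC]
# A is answered with 403 Forbidden; B is served normally.
RewriteRule \.(bmp|gif|jpe?g|png|swf)$ - [F,L,NC]
```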

mod_rewrite regex (too many redirects)

I am using mod_rewrite to convert subdomains into directory URLs (solution from here). When I explicitly write a rule for one subdomain, it works perfectly:
RewriteCond %{HTTP_HOST} ^[www\.]*sub-domain-name.domain-name.com [NC]
RewriteCond %{REQUEST_URI} !^/sub-domain-directory/.*
RewriteRule ^(.*) /sub-domain-directory/$1 [L]
However, if I try to match all subdomains, it results in 500 internal error (log says too many redirects). The code is:
RewriteCond %{HTTP_HOST} ^[www\.]*([a-z0-9-]+).domain-name.com [NC]
RewriteCond %{REQUEST_URI} !^/%1/.*
RewriteRule ^(.*) /%1/$1 [L]
Can anyone suggest what went wrong and how to fix it?
Your second RewriteCond will never return false, because you can't use backreferences within the condition pattern (patterns are compiled during parsing, so no variable expansion takes place there; only the test string is expanded). You're actually testing for paths beginning with the literal text /%1/, which isn't what you wanted. Given that you're operating in a per-directory context, the rule set will end up being applied again, resulting in a transformation like the following:
path -> sub/path
sub/path -> sub/sub/path
sub/sub/path -> sub/sub/sub/path
...
This goes on for about ten iterations before the server gets upset and throws a 500 error. There are a few different ways to fix this, but I'm going to choose one that most closely resembles the approach you were trying to take. I'd also modify that first RewriteCond, since the regular expression is a bit flawed:
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !=www
RewriteCond %1#%{REQUEST_URI} !^([^#]+)#/\1/
RewriteRule .* /%1/$0 [L]
First, it checks the HTTP_HOST value and captures the subdomain, whatever it might be. Then, assuming you don't want this transformation to take place in the case of www, it makes sure that the capture does not match that. After that, it uses the regular expression's own internal backreferences to see if the REQUEST_URI begins with the subdomain value. If it doesn't, it prepends the subdomain as a directory, like you have now.
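The third condition is the least obvious part, so here is the rule set again with the strings that condition would compare on each pass. The host sub.example.com and the file page.html are placeholders.

```apache
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !=www
# Pass 1, request /page.html on sub.example.com:
#   test string "sub#/page.html"     -> ^([^#]+)#/\1/ does not match
#   negated condition is true        -> rewritten to /sub/page.html
# Pass 2, rules re-applied to /sub/page.html:
#   test string "sub#/sub/page.html" -> \1 = "sub", pattern matches
#   negated condition is false       -> no further rewrite, loop avoided
RewriteCond %1#%{REQUEST_URI} !^([^#]+)#/\1/
RewriteRule .* /%1/$0 [L]
```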
The potential problem with this approach is that it won't work correctly if you access a path beginning with the same name as the subdomain the request is sent to, like sub.example.com/sub/. An alternative is to check the REDIRECT_STATUS environment variable to see if an internal redirect has already been performed (that is, this prepending step has already occurred):
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !=www
RewriteCond %{ENV:REDIRECT_STATUS} =""
RewriteRule .* /%1/$0 [L]
