using mod_rewrite to strip out junk - .htaccess

We're seeing some really weird URLs in our logs and I've been told to start redirecting them.
I know of a couple of better ways to go about fixing this, but the boss wants it done this way. I apologize in advance.
We're seeing stuff like the following in our logs:
http://www.example.com/foo/bar/bla&ob=&ppg=&rpp=100&ob=&rpp=&ppg=&rpp=30&ppg=&ppg=1&rpp=10&rpp=50&ob=&ob=&ob=&rpp=40&ob=&rpp=5&rpp=30&rpp=&rpp=20&order_by=&results_per_pge=75
I've been told to 'toss some mod_rewrite rules in the .htaccess file' to take this and strip out all the ob, rpp, and ppg variables.
Now, I've found ways to strip everything out. And that wouldn't be too bad if I could leave the /foo/bar/bla in there. But I can't seem to do that. Basically, any help would be appreciated.

Try:
# strip out any ob=, rpp= or ppg= params
RewriteRule ^/?(.*)&ob=([^&]*)&(.*)$ /$1&$3 [L]
RewriteRule ^/?(.*)&rpp=([^&]*)&(.*)$ /$1&$3 [L]
RewriteRule ^/?(.*)&ppg=([^&]*)&(.*)$ /$1&$3 [L]
# if everything's gone, finally redirect and fix query string
RewriteCond %{REQUEST_URI} !&(ob|rpp|ppg)
RewriteRule ^/?(.*?)&(.*) /$1?$2 [L,R=301]
The problem here is that your URL:
http://www.example.com/foo/bar/bla&ob=&ppg=&rpp=100&ob=&rpp=&ppg=&rpp=30&ppg=&ppg=1&rpp=10&rpp=50&ob=&ob=&ob=&rpp=40&ob=&rpp=5&rpp=30&rpp=&rpp=20&order_by=&results_per_pge=75
has a lot of ob=, rpp=, and ppg= parameters in the URI: more than 10. Each pass through these rules removes only one parameter, so Apache has to loop once per parameter. By default, Apache sets the internal recursion limit to 10; if it needs to loop more than 10 times (and it will for the above URL), it bails out and returns a 500 internal server error. You need to set that limit higher:
LimitInternalRecursion 30
or some other sane number. Unfortunately, you can't use that directive in an .htaccess file; you'll need to set it in the server or vhost config.
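As a rough sketch of where that directive could live, here is a minimal vhost fragment. The ServerName and DocumentRoot values are placeholders, not taken from the question, and 30 is just the example number from above:

```apache
# Hypothetical vhost; ServerName and DocumentRoot are placeholders.
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"

    # Raise the internal redirect limit from the default of 10.
    # This directive is only valid in server or vhost context,
    # not in .htaccess.
    LimitInternalRecursion 30
</VirtualHost>
```

Apache needs a restart or reload before the new limit takes effect.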

Related

I am not able to 301 redirect domain.tld/?cur=usd to domain.tld

I am trying to redirect domain.tld/?cur=usd to domain.tld (there are many currencies, this is only an example of one; we do not use this solution anymore).
I need to redirect only the home page with the parameter to the home page without it. The other URLs work for me, I'm just having trouble getting this one to work.
I have searched and tried online generators, but none of the solutions work.
Here is what I am trying:
RewriteCond %{QUERY_STRING} (^|&)cur\=(.*)($|&)
RewriteRule ^$ /? [L,R=301]
// update
before this rule I have only
#bof redirects
RewriteEngine On
...and then there are redirects for other URLs, but I tested this rule separately first and the result was the same...
It does not redirect me.
Thanks for the help and maybe an explanation of what I'm doing wrong.
RewriteCond %{QUERY_STRING} (^|&)cur\=(.*)($|&)
RewriteRule ^$ /? [L,R=301]
As mentioned in comments, this should already do as you require, providing there are no conflicts with other directives in the .htaccess file.
However, the regex in the preceding condition is excessively verbose for what you are trying to achieve (ie. just testing for the presence of the cur URL parameter).
If you simply want to check for the cur URL parameter anywhere in the query string then the regex (^|&)cur= would suffice (and is more efficient). No need to backslash-escape the literal =. And if the URL parameter always appears at the start of the query string then just use ^cur=.
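Put together, the simplified condition might look like the sketch below. It assumes the rule sits in the .htaccess file in the document root and that no earlier directive interferes:

```apache
RewriteEngine On

# Match the home page only when a cur parameter appears anywhere
# in the query string.
RewriteCond %{QUERY_STRING} (^|&)cur=
# The trailing ? on the substitution drops the original query string.
RewriteRule ^$ /? [L,R=301]
```

On Apache 2.4 and later, the QSD flag is another way to discard the query string, so the substitution could be written as / with [QSD,L,R=301].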
I found the problem - it was something with the hosting, after a reboot everything started working as expected.
So I can confirm that this rule is fine.
Sorry for the question.

Redirects not working as expected

I have an .htaccess file with several lines. It does not work as expected. Mod_rewrite is enabled. RewriteLogLevel is set to 9.
The first two rules are there to forbid URIs with a length of more than 80 characters:
RewriteCond %{REQUEST_URI} ^.{80}
RewriteRule .* - [F]
It does not seem to get evaluated as every test url passes through and it does not generate an error either.
I also tried:
RewriteRule .{80} - [F]
But that did not do the trick either. The process ends with a 404, not a 403.
This next rule is not working either. It used to work.
RewriteRule ^(\/)?([\w]+)$ /index.php [L]
The URI /Contact was always handled by this index.php.
Whatever URL I type I get a 404. I should get a 403 or a 200. Not a 404. What am I missing?
Apache has read, write and execute permission on all directories, and read and write permission on all files.
The two urls for testing are:
127.0.0.4/asssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssddddddddddddddddddddd?p=s&s=psv
and
127.0.0.4/Contact
The alias for 127.0.0.4 used is considerate.lb.
Try this rule instead:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+\S{80}
RewriteRule ^ - [F]
This uses THE_REQUEST instead of REQUEST_URI because REQUEST_URI might get overwritten by other rules in your .htaccess, whereas THE_REQUEST always holds the original request line sent by the browser.
I have finally found a solution. The problem was not in the coding of the .htaccess. I replaced the file with a previous version, added the new lines to test the request, and it all worked fine.
It is not a satisfactory solution, because it can happen again and I do not have any clue what caused the error. If someone knows, I would love to hear what the exact cause might have been and how to solve it properly. I would like to change the tags of the question, as the current tags might be misleading (although other people might run into the same problem with how Apache handles a .htaccess file), but I do not know which tags I should use.

removing file extension with htaccess failing

I'm using an .htaccess script to remove the .php extension. On a testing server it runs fine, but on the live server, which is a different host, it tries rewriting the file based on the absolute path and the rewrite fails.
here is the htaccess:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php
this is taking a url like this www.example.com/services
and trying to point it to /n/c/example.com/public/service.php
I know that %{REQUEST_FILENAME} is supposed to be pulling the full local system path, but I don't understand why it's not finding the file. I know very little about .htaccess and mod_rewrite, so I'm not really sure what I should try in order to base everything on just the URL path, or whether there is a better solution. I'm really open to suggestions.
Any help would be greatly appreciated.
Thanks
Use RewriteRule .* %{REQUEST_URI}.php [L]
It is hard to tell why your rule did not work for you with so little info about your Apache setup and any other rewrite rules that you may have.
Quite possible that the [L] flag did the trick for you -- you may have other rewrite rules that were rewriting this URL further, producing incorrect result in the end. I don't think that %{REQUEST_URI} did such a big job on its own, unless you have some symbolic links / aliases or even some transparent proxy in use which could make a difference.
Keep in mind that the rules you have shown in your question cannot generate this sort of URL in the browser's address bar (example.com//service.php/) -- a redirect (3xx code) has to be involved, which suggests that you have other rules somewhere.
Most likely it is a combination of your Apache specific settings & combined rewrite rules logic (where the L flag can make a big difference depending on those other rules).
The only way to give a more precise answer would be to enable rewrite debugging and analyze how the rewrite was executed and what was involved.
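For reference, here is a minimal sketch of how rewrite debugging is usually switched on. It assumes you can edit the server or vhost config (these directives are not allowed in .htaccess); the log path is a placeholder:

```apache
# Apache 2.4 and later: mod_rewrite logs to the normal error log.
# Higher trace levels (up to trace8) are more verbose.
LogLevel alert rewrite:trace3

# Apache 2.2 and earlier used dedicated directives instead
# (the log path here is a placeholder):
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3
```

Tracing slows the server down noticeably, so it is best turned off again once the problem is found.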
Have you enabled mod_rewrite on the other server? On Apache 2.x that is a LoadModule line for rewrite_module (AddModule was only used on Apache 1.3).
Also - more likely - have you enabled .htaccess? You would need to have
AllowOverride All
or
AllowOverride FileInfo
for that.
These directives will need to go in the apache config files (usually /etc/httpd/conf/httpd.conf or one of the files in /etc/httpd/conf.d), and you will need to restart apache to get them to take effect.
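A minimal sketch of what those config lines might look like is below. The module path and the directory path are assumptions and vary by distribution and host:

```apache
# Load mod_rewrite (module path is an assumption; it differs per distribution).
LoadModule rewrite_module modules/mod_rewrite.so

# Allow .htaccess files under the site root to use rewrite directives.
# "/var/www/html" is a placeholder for the real document root.
<Directory "/var/www/html">
    AllowOverride FileInfo
</Directory>
```

AllowOverride FileInfo is enough for mod_rewrite directives. If the .htaccess file also sets Options (for example Options +FollowSymLinks), the Options override has to be allowed as well, or AllowOverride All used instead.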

Stop mod_rewrite returning REQUEST_URI when (.*) is empty

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^mocks/site/(.*)$ http://thelivewebsite.com/$1 [R=301,L]
That is my htaccess file's contents.
The htaccess file is in the root directory of the hosting account and I just want to redirect the directory mocks/site/ to the new domain (with or without any extra directories).
eg: if someone goes to http://mywebsite.com/mocks/site then it needs to redirect to http://thelivewebsite.com. If they go to http://mywebsite.com/mocks/site/another/directory then it needs to redirect to http://thelivewebsite.com/another/directory. I hope that makes sense.
So the problem I have is that the htaccess code above seems to work pretty well when there is something after mocks/site/ however when there isn't something after that then the $1 in the redirect seems to reference the whole REQUEST_URI (eg: mocks/site/ rather than nothing - as there is nothing after it).
I don't know how to stop this. I thought about using a RewriteCond, but I'm not sure what to use there. I can't find anything that helps me to determine if there is anything after mocks/site/ or not.
Any help will be much appreciated.
Thank you.
That's very strange behaviour -- I have never seen anything like that. Therefore I think it could be something else (another rule somewhere -- on the old or even the new site). I recommend enabling rewrite debugging (RewriteLogLevel 9) and checking the rewrite log (that's if you can edit Apache's config file / virtual host definition).
In any case, try this combination:
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^mocks/site/$ http://thelivewebsite.com/ [R=301,L]
RewriteRule ^mocks/site/(.+)$ http://thelivewebsite.com/$1 [R=301,L]
It will do matching/redirecting in 2 steps: first rule is for exact directory match (so no $1 involved at all) and 2nd will work if there is at least 1 character after the /mocks/site/.
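If a single rule is preferred, the same idea could be sketched with an optional group. This is an untested variant rather than the answerer's rule, and it assumes the same .htaccess location in the root directory:

```apache
Options +FollowSymLinks
RewriteEngine On

# One rule for mocks/site, mocks/site/ and anything below it.
# The capture group is optional, so $1 should expand to an empty
# string when nothing follows the directory name.
RewriteRule ^mocks/site(?:/(.*))?$ http://thelivewebsite.com/$1 [R=301,L]
```

Unlike the two-rule version, this also matches the directory without its trailing slash.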
Alternatively (the Apache docs even recommend this one), use the Redirect directive (no need for mod_rewrite at all for such simple redirects):
Redirect 301 /mocks/site/ http://thelivewebsite.com/

htaccess redirect if server returns 404

For example I have a page http://www.f1u.org/en/its-interesting/166-cricri.
How do I write a rule so that if that page exists, it opens,
and if it returns 404, it redirects to http://www.f1u.org/its-interesting/166-cricri?
Use this line in your .htaccess file:
ErrorDocument 404 http://www.f1u.org/its-interesting/166-cricri
It sounds like you want the apache server to look ahead to see if the current URL exists, if not, redirect them. I think you might be able to use mod_rewrite to accomplish this.
My first stab at it would be something like:
RewriteEngine On
RewriteCond %{IS_SUBREQ} false
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^/?en(/.*)$ $1 [R,L]
I'll note that I haven't tried it, so the syntax and effects could be slightly off, and you'd need to be careful that you don't put yourself into an infinite loop or wind up with too many subrequests (as that could impact the performance of your server). But hopefully it'll give you a starting point to play with. Alternatively, mod_rewrite could (depending on server permissions) let you invoke scripts to determine rewrites, which could be another option.

Resources