Is my htaccess file causing multiple calls to one page? - .htaccess

I think my htaccess file that makes use of mod_rewrite is causing my pages to be called more than once. Can anyone see if this could happen with my current htaccess file? Or if there is even a possibility? This happens in the view.php page only (from what I have seen).
# REWRITE DEFAULTS
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^www\.mysite\.com$ [NC]
RewriteRule ^(.*)$ http://mysite.com/$1 [R=301,L]
# /view.php?t=h5k6 externally to /h5k6
RewriteCond %{THE_REQUEST} ^GET\ /view\.php
RewriteCond %{QUERY_STRING} ^([^&]*&)*t=([^&]+)&?.*$
RewriteRule ^view\.php$ /%2? [L,R=301]
# /h5k6 internally to /view.php?t=h5k6
RewriteRule ^([0-9a-z]+)$ view.php?t=$1 [L]
What is happening in my PHP scripts is that they are being called more than once, or at the very least a function is being called more than once, even though I have made sure it's being called only once!
Thanks all

Those mod_rewrite conditions and rules will not cause a script to be called more than once. The rules themselves can be called multiple times. Every time a URL is successfully rewritten into a new request, the new request will invoke the rules again. However, this will stop as soon as a "real" resource (script, webpage, etc.) is identified and retrieved a single time.
Are there other references on your page that would make another request? For instance an IMG tag will cause a browser to make another request. Those requests will cause the rules to be run again. It looks like something with a dot (e.g. picture.jpg) will not match your rules, but something else might.
Other things to look for are CSS and scripts that are referenced.
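One way to rule out stray requests being routed to view.php, offered as a sketch rather than a diagnosis, is to guard the catch-all rule so that it only fires when the requested path is not an existing file or directory. The two conditions below are an assumed addition and are not part of the file in the question.

```apache
# /h5k6 internally to /view.php?t=h5k6, but only when the path is not
# an existing file or directory (the two conditions are an assumed addition)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([0-9a-z]+)$ view.php?t=$1 [L]
```

Comparing the request lines in the access log against each page view will then show whether the browser really asked for more than one URL.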

Without reading your pasted code, I want to say no. The .htaccess file is processed rule by rule, and the current pass stops at the first matching rule that carries the [L] flag.

Related

Is this .htaccess snippet a vulnerability for my site?

In some Joomla installations I found this .htaccess in an administrator component. Can someone explain what happens here and whether it looks like vulnerable code?
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(www\.)? [NC]
RewriteRule .*\.(csv)$ [R,L,NC]
It looks a bit dirty to me, but it's not a security issue for sure.
RewriteCond %{HTTP_REFERER} !^http://(www\.)? [NC]
... this means that the following rule is executed only if the referer does not start with http:// (with or without www.).
The referer method might be used to process the rule only 1 time, but not in further redirects, triggered by the htaccess file, because redirects don't preserve the referer. The whole intention is difficult to guess without seeing the rest of the file.
RewriteRule .*\.(csv)$ [R,L,NC]
... this means that there should be no more processing of the htaccess file if the url ends in .csv
The [L] means "last" rule, no further processing. As @adrianopolis said, NC means case-insensitive, so it will match .CSV as well as .Csv etc.
The [R] means redirect, but as there is no target URL, it won't do anything but prevent further processing.

Surprising rewriting of URL by htaccess rule

I've narrowed down my problem and have a specific question.
With only the following code in the .htaccess, why does index2.php get called if I type in my URL as www.mysite.com/url2 ?
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
I've also tested it at http://www.regextester.com and it should not replace it with index2.php.
In the end I want this rule to skip any URL starting with /url2 or /url2/*.
EDIT: I've made screen recording of this problem: http://screenr.com/BBBN
You have this in your .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
What does it do? It rewrites anything that ends with html, htm, feed, pdf, or raw to index2.php. So if index2.php is being served, the URL being matched must end with one of those extensions, and there are two possible explanations:
There is another rewrite rule in an .htaccess file in a parent directory (or in the server config files) that causes the URL to be rewritten.
Your URL actually ends with one of those extensions. Keep in mind that what you enter in your address bar may be changed and rewritten. For example, if you enter www.mysite.com/url2 in your address bar and that file doesn't exist on the server, your server will try to load the proper error document. So, if your error document is /404.html, it will be rewritten to index2.php in the end.
Update:
I think that's the case. Create a file named 404.php in your document root. Inside your main .htaccess (in your document root), put this:
ErrorDocument 404 /404.php
Delete all other ErrorDocument directives.
Inside 404.php, put this:
<?php
echo 'From 404.php file';
?>
Logic behind it:
When mod_rewrite behaves strangely, the best solution in my experience is to use the rewrite log. To enable it, put this in your virtual host or another server config section of your choice:
RewriteLogLevel 9
RewriteLog "logs/RewriteLog.log"
Be careful: the code above will enable the rewrite log and start logging at the highest level possible (logging everything). It will slow your server down and the log file will become huge very quickly. Do this only on your dev server.
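The RewriteLog and RewriteLogLevel directives above exist only in Apache 2.2 and earlier. In Apache 2.4 they were removed, and rewrite tracing is configured through LogLevel instead, with the output going to the error log. A minimal equivalent, assuming a 2.4 server and that warn is an acceptable base level for everything else:

```apache
# Apache 2.4+: place in the server or virtual host config, not in .htaccess.
# trace8 is the most verbose level; a lower one such as trace3 is less noisy.
LogLevel warn rewrite:trace8
```

The same warning applies: high trace levels are slow and verbose, so this belongs on a dev server only.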
Explanation: When you try to access www.mysite.com/url2, Apache gives your URL to the rewrite module. The rewrite module checks whether any of the RewriteRules applies to your URL. Because you have one rule and it doesn't apply to your URL, Apache tries to load the normal file. But this file does not exist. So Apache will do the next step, which is showing the proper error message. When you set a custom error file, Apache will run the test against the new address. For example, if the error document is /404.html, Apache checks whether your rule applies to /404.html or not. Since it does, it will rewrite it.
The point to remember is that Apache will do this every time there is a change in the URL, whether the change is made by the rewrite module or not!
The rule you list should work as you expect if this is the only rule. Fact is that theory is fun, but apparently it doesn't work as expected. Please note that . will match ANY CHARACTER. If you want to match the full stop/period character, you'll need to escape it. That's why I use \.(html|htm|feed|pdf|raw)$ instead of (.html|.htm|.feed|.pdf|.raw)$ below.
You can add another RewriteCond that simply doesn't match if the url starts with /url2, like below. This might not be a viable solution if there are lots of urls that shouldn't be matched.
RewriteCond %{REQUEST_URI} !^/url2
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteRule (.*) index2.php [L]
To get a better understanding of what is happening, you can alter the rule to something like this. Now simply enter the URLs you don't want to be matched in the URL bar and inspect the URL bar after the redirect happens. In the url parameter you now see which URL actually triggered this rule to match. This screencast shows you a similar version working with a sneaky RewriteRule that is working away on the URL.
#A way of finding out what is -actually- matched
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteCond %{REQUEST_URI} !/foo
RewriteRule (.*) /foo?url=$1 [R,L]
You can decide to match the %{THE_REQUEST} variable instead. This will always contain the request itself. If something else is rewriting the url, this variable doesn't change, meaning you can use this to overwrite any changes. Make sure the url won't be matching itself. You would get something like below. An example screencast can be found here.
#If it doesn't end on .html/htm/feed etc, this one won't match
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /.*\.(html|htm|feed|pdf|raw)\ HTTP [NC]
RewriteCond %{REQUEST_URI} !^/index2\.php$
RewriteRule (.*) /index2.php [L]

Targeting single directory for rewrite rule

I have edited this question to use the actual URLs. I need the url
http://westernmininghistory.com/mine_db/main.php?page=mine_detail&dep_id=10257227
To be rewritten like
http://westernmininghistory.com/mine_detail/10257227/
I have tried
RewriteRule ^([^/]*)/([^/]*)/$ /mine_db/main.php?page=$1&dep_id=$2 [L]
Which works on this page but breaks every other page on the site. I was wondering if there was a way to force the rewriterule to only operate on files within the mine_db directory. I had tried RewriteCond but with no success:
#RewriteCond %{REQUEST_URI} ^/mine_db
I really don't know the proper syntax for this though. Any ideas?
First off, your rule can be shortened and written without needing RewriteCond. Also it appears that you want to capture 2 variables after test_db.
You can try this rule instead:
RewriteRule ^(mine_detail)/([0-9]+)/?$ /mine_db/main.php?page=$1&dep_id=$2 [QSA,L,NC]
This will work with URIs like /mine_detail/12345 (the trailing slash is optional). Also note that the above rewrite will happen silently (internally) without changing the URL in the browser. If you want to change the URL in the browser, then use the R flag as well, like this:
RewriteRule ^(mine_detail)/([0-9]+)/?$ /mine_db/main.php?page=$1&dep_id=$2 [QSA,L,NC,R]
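If the goal is instead to keep the short URL in the address bar while old links to main.php keep working, the two-rule pattern from the first question on this page can be adapted. The following is only a sketch: it assumes the .htaccess file sits in the document root, and the first rule (matching on THE_REQUEST) is an addition that does not appear in the answer above.

```apache
RewriteEngine On

# Old URL requested directly by the client: redirect externally to the
# short form. THE_REQUEST holds the original request line, so this does
# not fire again after the internal rewrite below.
RewriteCond %{THE_REQUEST} \s/mine_db/main\.php\?page=mine_detail&dep_id=([0-9]+)\s
RewriteRule ^mine_db/main\.php$ /mine_detail/%1/? [R=301,L]

# Short URL: rewrite internally to the real script.
RewriteRule ^(mine_detail)/([0-9]+)/?$ /mine_db/main.php?page=$1&dep_id=$2 [QSA,L,NC]
```

The trailing ? in the first substitution drops the original query string from the redirect target. Using a 302 while testing avoids browsers caching a wrong redirect.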

.htaccess and SEO URLs - why is this an infinite loop?

I have a dirty URL like this: http://www.netairspace.com/photos/photo.php?photo=3392.
I want to do something like http://www.netairspace.com/photos/OH-LTU/Finnair_Airbus_330-202X/OUL_EFOU_Oulu/photo_3392/ (and later support short URLs like http://www.netairspace.com/pic/3392/ but I'll leave that out).
So I have a script photo_seo_url.php, which takes the photo ID, builds the SEO URL, and does a redirect (302 for testing, 301 when I'm happy with it). I then planned to add .htaccess mod_rewrite rules so that on calling the old URL:
the old URL would be rewritten internally to photo_seo_url.php
photo_seo_url.php would 301/302 redirect to the SEO URL
the SEO URL would be rewritten internally to the original photo.php
That way I would, in theory, get the benefits of the SEO URL while being able to retire the old ones at my leisure.
These are the rules I used:
RewriteEngine on
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1 [NC,L]
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
But that goes into an infinite redirect loop. Why, if these two are doing internal rewrites rather than external redirects - or is that what I'm missing?
I've solved the problem adding a new file showphoto.php, which does nothing but include the original photo.php, and changing line 2:
RewriteEngine on
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/showphoto.php?photo=$1 [NC,L]
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
But I'd still like to understand why the original version goes into an infinite loop. I've missed or misunderstood something. Is my approach sound?
To answer your question, why does this loop occur? This is what happens with an SEO URI, with a GET /photos/OH-LTU/Finnair_Airbus_330-202X/OUL_EFOU_Oulu/photo_3392/, say.
Rule 1 fires, converting this to a GET /photos/photo.php?photo=3392, which triggers an internal redirect that restarts the scan of the .htaccess file.
Rule 2 then fires, converting this to a GET photos/photo_seo_url.php?photo=3392, which triggers an internal redirect that again restarts the scan of the .htaccess file.
No further matches occur and hence this is passed to the script photos/photo_seo_url.php which then does a 302 to /photos/OH-LTU/Finnair_Airbus_330-202X/OUL_EFOU_Oulu/photo_3392/ and the browser detects a redirection loop.
What you need is for rule 1 firing to prevent rule 2 from firing, even after an internal redirect. One way to do this is to set an environment variable, say END (which gets converted to REDIRECT_END on the next pass), and to skip the rules if this is set:
RewriteEngine on
RewriteBase /
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1 [NC,E=END:1,L]
RewriteCond %{ENV:REDIRECT_END}:%{QUERY_STRING} ^:photo=([0-9]+)$
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
An alternative approach is to add a dummy noredir parameter to the rewritten URI and add a:
RewriteCond %{QUERY_STRING} !\bnoredir
to the original second rule. However, photo.php would need to ignore this. Hope this helps :-)
RewriteEngine on
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1&rewritten [NC,L]
RewriteCond %{QUERY_STRING} !rewritten
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
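On Apache 2.4 (the flag was introduced in 2.3.9), a further option not mentioned in the answers above is the END flag. Unlike L, which only ends the current pass, END stops rewrite processing for the request altogether, so rule 2 never sees the URL produced by rule 1. A sketch, assuming such a server version:

```apache
RewriteEngine on

# SEO URL -> photo.php. END prevents any further rewrite passes, so the
# rewritten URL is never handed to the second rule.
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1 [NC,END]

# Old URL requested directly -> redirect script, as in the question.
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
```

This avoids both the extra environment variable and the dummy query parameter, at the cost of not working on Apache 2.2.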

URL rewrite issues using .htaccess

I'm trying to write a URL like below, but when I try to call the seo queryparam, it always returns index.php. Any idea why it isn't returning the correct value for 'seo'?
RewriteRule ^([^/]*)$ index.php?c=home&m=details&seo=$1 [L]
The URL it should forward from would be something like this: http://domain.com/The-Name-of-the-Product. That URL should be rewritten to http://domain.com/index.php?c=home&m=details&seo=The-Name-of-the-Product, but instead it ends up as http://domain.com/index.php?c=home&m=details&seo=index.php
Various events cause a URL to go back through the rewrite process. You can use RewriteCond to prevent this:
RewriteCond $1 !^index\.php$
RewriteRule ^/?([^/]+)$ index.php?c=home&m=details&seo=$1 [L,NS]
From the mod_rewrite technical details:
When you manipulate a URL/filename in per-directory context mod_rewrite first rewrites the filename back to its corresponding URL (which is usually impossible, but see the RewriteBase directive below for the trick to achieve this) and then initiates a new internal sub-request with the new URL. This restarts processing of the API phases.
This catches people all the time.

Resources