Is this .htaccess snippet a vulnerability for my site? - .htaccess

In some Joomla installations I found this .htaccess in an administrator component. Can someone explain what happens here and whether it looks like vulnerable code?
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(www\.)? [NC]
RewriteRule .*\.(csv)$ [R,L,NC]

It looks a bit dirty to me, but it's not a security issue for sure.
RewriteCond %{HTTP_REFERER} !^http://(www\.)? [NC]
... this means that the following rule is applied only if the referer does not start with http:// (the www. part is optional), for example when the referer is empty.
The referer method might be used to process the rule only 1 time, but not in further redirects, triggered by the htaccess file, because redirects don't preserve the referer. The whole intention is difficult to guess without seeing the rest of the file.
RewriteRule .*\.(csv)$ [R,L,NC]
... this means that there should be no more processing of the htaccess file if the url ends in .csv
The [L] means "last" rule, no further processing. As #adrianopolis said, NC means case-insensitive, so it will match .CSV as well as .Csv etc.
The [R] means redirect. As quoted, though, the rule has no target URL. It was most likely lost when the snippet was posted, because with only two arguments Apache reads [R,L,NC] as the substitution string rather than as flags.
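For illustration only, here is a guess at the general shape such a referer check usually has. The host example.com and the target /index.php are placeholders that do not come from the original file, so treat this as a sketch rather than a reconstruction:

```apache
RewriteEngine on
# Apply the rule only when the referer is NOT our own site.
# example.com is a placeholder host, not taken from the original file.
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com [NC]
# Redirect requests for .csv files to a placeholder page.
# /index.php is hypothetical; the real target is unknown.
RewriteRule \.csv$ /index.php [R,L,NC]
```

With a target in place, [R,L,NC] is parsed as flags and the rule redirects direct or off-site requests for .csv files.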

Related

.htaccess - need to fix broken links coming from other sites to mine

I am having an issue where Google Webmaster Tools is reporting a ton of 404 links to my site which are coming from ask.com.
I have tried to get ask.com to fix their side, but of course they have not, so now I am stuck with over 11k bad links to my site, which I suspect are affecting my rankings right now.
Anyway, I have a possible way to 301 them, but I am not sure how to do it with .htaccess.
Here is the bad link pointing to my site
http://www.freescrabbledictionary.com/sentence-examples/fere-film/feverous/about.php
It should be
http://www.freescrabbledictionary.com/sentence-examples/fere-film/feverous/
Besides the about.php there are other variations of endings as well, I basically need to be able to remove the ending.
Problem is that the URL after /sentence-examples/ can change. The beginning is always:
http://www.freescrabbledictionary.com/sentence-examples/
So basically:
http://www.freescrabbledictionary.com/sentence-examples/<-keep but can change->/<-keep but can change->/<-remove this->
This .htaccess should be placed in the folder above sentence-examples:
RewriteEngine on
# Redirect /sentence-examples/anything/anything/remove to /sentence-examples/anything/anything/
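# The trailing [^\s?]+ requires at least one character after the second segment,
# so the already-clean URL is not redirected to itself.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(sentence-examples/[^/]+/[^/]+)/[^\s?]+ [NC]
RewriteRule ^ /%1/? [R=302,L]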
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/(.*)$ /sentence-examples/examplesentence.php?havethis=$1&word=$2 [L]
Change 302 to 301 once you confirm it's working as expected.
If you have a CMS installed you might need a different rule to work along with it without conflicting.
Keep in mind that if you previously tried other redirects using 301 (permanent redirect), it's recommended that you test this rule in a different browser to avoid cached redirects.
This is possibly quick and dirty but I've done a simple test on localhost and here just to make sure it works.
RewriteEngine On
RewriteRule ^sentence-examples/(.*)/(.*)/(.*)\.php http://www.freescrabbledictionary.com/sentence-examples/$1/$2/ [R=301,L]
You can see that I've added wildcard groups (.*) to the RewriteRule so that we can pick up the elements of the URL that we need for proper redirection, i.e. $1 and $2. You can also use the third one ($3) to see which destinations are being targeted a lot, for your SEO needs.
NB: The rule above assumes that the unwanted ending is always a .php file. To redirect regardless of what comes after the second segment, replace the RewriteRule with the one below. Its last group uses .+ rather than .* so that the already-clean URL does not match and get redirected to itself, and the first two groups use [^/]+ so that each captures exactly one path segment:
RewriteRule ^sentence-examples/([^/]+)/([^/]+)/(.+)$ http://www.freescrabbledictionary.com/sentence-examples/$1/$2/ [R=301,L]

Using mod_rewrite to mask a directory/file name in a URL

I've taken my site down for some prolonged maintenance and am using mod_rewrite to send all requests to a single page: www.mysite.com/temp/503.php
This is my .htaccess file which works fine.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/temp/503.php [NC]
RewriteRule .* /temp/503.php [R,L]
However, what I'd also like to be able to do is to hide /temp/503.php in the resulting URL from the visitor.
I know this is perhaps trivial and I'm sure fairly simple to achieve, but with my limited mod_rewrite skills I can't seem to get it to work.
Any help would be much appreciated.
Thanks.
Just get rid of the R flag in the rewrite rule, which tells the rule to redirect the request, thus changing the URL in the browser's location bar. So the rule would look like:
RewriteRule .* /temp/503.php [L]
which internally rewrites the requested URI instead of externally telling the browser that it's been moved to a new URL.
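Putting that together, the whole file might look like the sketch below. It is the asker's file with the R flag dropped. The escaped dot and the $ anchor in the condition are small tightenings of my own, and the comments are added for illustration:

```apache
RewriteEngine On
RewriteBase /
# Skip the maintenance page itself so the rewrite does not loop.
RewriteCond %{REQUEST_URI} !^/temp/503\.php$ [NC]
# No R flag: the rewrite is internal, so the address bar keeps the requested URL.
RewriteRule .* /temp/503.php [L]
```

The visitor keeps seeing whatever URL they asked for while the server quietly serves /temp/503.php.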

.htaccess and SEO URLs - why is this an infinite loop?

I have a dirty URL like this: http://www.netairspace.com/photos/photo.php?photo=3392.
I want to do something like http://www.netairspace.com/photos/OH-LTU/Finnair_Airbus_330-202X/OUL_EFOU_Oulu/photo_3392/ (and later support short URLs like http://www.netairspace.com/pic/3392/ but I'll leave that out).
So I have a script photo_seo_url.php, which takes the photo ID, builds the SEO URL, and does a redirect (302 for testing, 301 when I'm happy with it). I then planned to add .htaccess mod_rewrite rules so that on calling the old URL:
the old URL would be rewritten internally to photo_seo_url.php
photo_seo_url.php would 301/302 redirect to the SEO URL
the SEO URL would be rewritten internally to the original photo.php
That way I would, in theory, get the benefits of the SEO URL while being able to retire the old ones at my leisure.
These are the rules I used:
RewriteEngine on
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1 [NC,L]
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
But that goes into an infinite redirect loop. Why, if these two are doing internal rewrites rather than external redirects - or is that what I'm missing?
I've solved the problem adding a new file showphoto.php, which does nothing but include the original photo.php, and changing line 2:
RewriteEngine on
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/showphoto.php?photo=$1 [NC,L]
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
But I'd still like to understand why the original version goes into an infinite loop. I've missed or misunderstood something. Is my approach sound?
To answer your question, why does this loop occur? This is what happens with an SEO URI, with a GET /photos/OH-LTU/Finnair_Airbus_330-202X/OUL_EFOU_Oulu/photo_3392/, say.
Rule 1 fires, converting this to a GET /photos/photo.php?photo=3392, which triggers an internal redirect that restarts the scan of the .htaccess file.
Rule 2 then fires, converting this to a GET photos/photo_seo_url.php?photo=3392, which triggers an internal redirect that again restarts the scan of the .htaccess file.
No further matches occur, so this is passed to the script photos/photo_seo_url.php, which then does a 302 to /photos/OH-LTU/Finnair_Airbus_330-202X/OUL_EFOU_Oulu/photo_3392/, and the browser detects a redirection loop.
What you need to happen is for rule 1 firing to prevent rule 2 from firing, even after an internal redirect. One way to do this is to set an environment variable, say END (which gets converted to REDIRECT_END on the next pass), and to skip the rules if this is set:
RewriteEngine on
RewriteBase /
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1 [NC,E=END:1,L]
RewriteCond %{ENV:REDIRECT_END}:%{QUERY_STRING} ^:photo=([0-9]+)$
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
An alternative approach is to add a dummy noredir parameter to the rewritten URI and add a:
RewriteCond %{QUERY_STRING} !\bnoredir
to the original second rule. However, photo.php would need to ignore this. Hope this helps :-)
RewriteEngine on
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1&rewritten [NC,L]
RewriteCond %{QUERY_STRING} !rewritten
RewriteCond %{QUERY_STRING} photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
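Another option that needs neither an environment variable nor an extra query parameter is to test THE_REQUEST, which holds the request line as the browser sent it and does not change during internal rewrites. A minimal sketch, assuming the .htaccess file sits in the document root as in the question:

```apache
RewriteEngine on
# SEO URL -> real script (internal rewrite).
RewriteRule ^photos/.*/photo_([0-9]+)/?$ photos/photo.php?photo=$1 [NC,L]
# Fire only when the client itself asked for the old URL;
# after the internal rewrite above, THE_REQUEST still shows the SEO URL.
RewriteCond %{THE_REQUEST} \s/photos/photo\.php\?photo=([0-9]+)
RewriteRule ^photos/photo\.php$ photos/photo_seo_url.php?photo=%1 [NC,L]
```

Because the condition looks at the original request line, photo.php itself needs no changes.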

Redirect internally to higher directory

I have example.org and foo.example.org pointing to the same directory, /var/www/html/, and want foo.example.org to internally redirect to /var/www/foo/ using only mod_rewrite.
This is what I have so far, but no joy:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_HOST} ^foo [NC]
RewriteRule ^(.*)$ ../foo/$1 [L]
</IfModule>
This gets me 500s due to hitting the limit of 10 internal redirects, but I don't understand why.
The reason for the internal redirect loop is that your only RewriteCond is the host name check. The host name doesn't change after the internal redirect, so the rule is triggered again when the rules are parsed for the rewritten request. You can solve this by adding a RewriteCond that checks whether the path is already set to the expected value (i.e. only rewrite requests whose path is still under /var/www/html, and skip any other rewrite, as it has already been rewritten).
I'm going to suggest that it might be cleaner to do something like this through mod_vhost_alias, depending on your use case.

How do I get the [L] flag of RewriteRule (.htaccess) really working?

To newcomers: While trying to describe my problem comprehensively and phrase my questions, I produced a huge amount of text. If you don't want to read the whole thing: my observations about (read "proof of") the [L] flag not working, the misconception from which it all sprang, are in the Additional observations section. Why I misunderstood the apparent behaviour, and the solution to the problem, are described in my Answer.
Setup
I have following code in my .htaccess file:
# disallow directory indexing
Options -Indexes
# turn mod_rewrite on
Options +FollowSymlinks
RewriteEngine on
# allow access to robots file
RewriteRule ^robots.txt$ robots.txt [NC,L]
# mangle core request handler address
RewriteRule ^core/(\?.+)?$ core/handleCoreRequest.php$1 [NC,L]
# mangle web file addresses (move them to application root folder)
# application root folder serves as application GUI address
RewriteRule ^$ web/index.html [L]
# allow access to images
RewriteRule ^(images/.+\.(ico|png|bmp|jpg|gif))$ web/$1 [NC,L]
# allow access to stylesheets
RewriteRule ^(css/.+\.css)$ web/$1 [NC,L]
# allow access to javascript
RewriteRule ^(js/.+\.js)$ web/$1 [NC,L]
# allow access to library scripts, styles and images
RewriteRule ^(lib/js/.+\.js)$ web/$1 [NC,L]
RewriteRule ^(lib/css/.+\.css)$ web/$1 [NC,L]
RewriteRule ^(lib/(.+/)?images/.+\.(ico|png|bmp|jpg|gif))$ web/$1 [NC,L]
# redirect all other requests to application address
# RewriteRule ^(.*)$ /foo/ [R]
My web application (and its .htaccess file) is located in foo subfolder of DOCUMENT_ROOT (accessed from browser as http://localhost/foo/). It has PHP core part located in foo/core and JavaScript GUI part located in foo/web. As can be seen from the code above, I want to allow access only to single core script that handles all requests from GUI and to 'safe' web files and redirect all other requests to base application address (last commented directive).
Problem
Behaviour
It works until I try the last part by uncommenting the last redirecting directive. If I comment some more lines, the appropriate page parts stop working, etc.
However, when I uncomment last line, which should be performed only when matching of all previous rules fails (at least that's what I understand), page goes into redirection cycle (Firefox throws error page with something like "This page isn't redirecting properly"), because it's redirecting to http://localhost/foo/ again and again and again, forever.
Questions
What I don't understand is this processing of this rule:
RewriteRule ^$ web/index.html [L],
specifically the [L] flag. The flag apparently doesn't work for me. When the last line is commented, it correctly redirects, but when I uncomment it, it is always processed, even though rewriting should stop on [L] flag. Anyone got any ideas?
Also, on a sidenote, I'd be thrilled to know why my following attempt at fixing it doesn't work either:
RewriteEngine on
RewriteRule ^core/(\?.+)?$ core/handleCoreRequest.php$1 [NC,L]
RewriteRule ^(.*)$ web/$1 [L]
RewriteRule ^.*$ /foo/ [L]
This actually doesn't work at all. Even if I remove the last line, it still doesn't redirect anything correctly. How does the redirecting work in the first example, if it doesn't work in the second?
It would also be of great benefit to me if anybody knew a way to actually debug these directives. I spent hours on this without even the slightest clue what could possibly be wrong.
Additional observations
After trying the advice given by bbadour (not that I haven't tried it before, but now that I had a second opinion, I gave it another shot) and it didn't work, I've come up with the following observation. By rewriting last line to this:
RewriteRule ^(.*)$ /foo/?uri=$1 [R,L]
or this
RewriteRule ^(.*)$ /foo/?uri=%{REQUEST_URI} [R,L]
and using Firebug's Net panel, I found out more evidence, that the [L] flag is clearly not working as expected in the previously mentioned RewriteRule ^$ web/index.html [L] rule (let's call it THE RULE from now on). In first case I get [...]uri=web/index.html, in second case [...]uri=/foo/web/index.html. That means that THE RULE gets executed (rewrites ^$ to web/index.html), but the rewriting doesn't stop there. Any more ideas, please?
After hours of searching and testing, I finally found the real problem and solution. Hopefully this will help somebody else too, when they come across the same problem.
Cause of observed behavior
.htaccess file is processed after every redirect (even without [R] flag),
which means that after the RewriteRule ^$ web/index.html [L] is processed, mod_rewrite correctly stops rewriting, goes to the end of the file, redirects correctly to /foo/web/index.html, and then the server starts processing .htaccess file for the new location, which is the same file. Now only the last rewrite rule matches and redirects back to /foo/ (this time with [R], so the redirect can be observed in browser) ... and the .htaccess file is processed again, and again, and again...
Once more for clarity: Because only the hard redirects can be observed, it seems like the [L] flag is ignored, but it is not so. Instead, the .htaccess is processed two times redirecting back and forth between /foo/ and /foo/web/index.html.
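On newer servers there is also a flag aimed at exactly this re-processing. The sketch below is not part of the original answer and assumes Apache 2.4 (the [END] flag is not available in 2.2). It reduces the file to THE RULE plus the catch-all redirect, so it does not whitelist any css, js or image files. The [END] flag is unrelated to the environment variable of the same name used in another thread on this page.

```apache
RewriteEngine on
# [END] stops this pass like [L] and also suppresses the further passes
# that would otherwise follow the internal redirect.
RewriteRule ^$ web/index.html [END]
# Everything else, including direct requests to web/..., goes back to the application root.
RewriteRule ^(.*)$ /foo/ [R,L]
```

With [END], the request for /foo/ is rewritten to web/index.html once and the catch-all rule never sees the rewritten path.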
Solution
Disallow direct access to subfolder
To virtually move subdirectory to application root directory, additional complex conditional rewrites must be used. Variable THE_REQUEST is useful for distinguishing between hard and soft redirects:
RewriteCond %{THE_REQUEST} ^GET\ /foo/web/
RewriteRule ^web/(.*) /foo/$1 [L,R]
For this rewrite rule to match, two conditions must hold. First (second line), the "local URI" must start with web/ (which corresponds to the absolute web URI /foo/web/). Second (first line), the real request URI must start with /foo/web/ too. Together this means that the rule only matches when a file inside the web/ subfolder is requested directly from the browser, in which case we want to do a hard redirect.
Redirect to allowed content from root to subfolder (soft)
RewriteCond $1 !^web/
RewriteCond $1 ^(.+\.(html|css|js|ico|png|bmp|jpg|gif))?$
RewriteRule ^(.*)$ web/$1 [L,NC]
We want to redirect to allowed content only if we haven't done it already, hence the first condition. Second condition specifies mask for allowed content. Anything matching this mask will be softly redirected, possibly returning 404 error if the content doesn't exist.
Hide all content not in subfolder or not allowed
RewriteRule !^web/ /foo/ [L,R]
This will do a hard redirect to the application root for all URIs not beginning with web/ (and remember, the only requests that can begin with web/ at this point are internal redirects for allowed content).
Real example
My code shown in my "question" after using solution tips mentioned above gradually transformed into the following:
# disallow directory indexing
Options -Indexes
# turn mod_rewrite on
Options +FollowSymlinks
RewriteEngine on
# allow access to robots file
RewriteRule ^robots.txt$ - [NC,L]
# mangle core request handler address
# disallow direct access to core request handler
RewriteCond %{THE_REQUEST} !^(GET|POST)\ /foo/core/handleCoreRequest.php
RewriteRule ^core/handleCoreRequest.php$ - [L]
# allow access to request handler under alias
RewriteRule ^core/$ core/handleCoreRequest.php [NC,QSA,L]
# mangle GUI file addressing (move to application root folder)
# disallow direct access to GUI subfolder
RewriteCond %{THE_REQUEST} ^GET\ /foo/web/
RewriteRule ^web/(.*) /foo/$1 [L,R]
# allow access only to correct filetypes in appropriate locations
RewriteCond $1 ^$ [OR]
RewriteCond $1 ^(images/.+\.(ico|png|bmp|jpg|gif))$ [OR]
RewriteCond $1 ^(css/.+\.css)$ [OR]
RewriteCond $1 ^(js/.+\.js)$ [OR]
RewriteCond $1 ^(lib/js/.+\.js)$ [OR]
RewriteCond $1 ^(lib/css/.+\.css)$ [OR]
RewriteCond $1 ^(lib/(.+/)?images/.+\.(ico|png|bmp|jpg|gif))$
RewriteRule ^(.*)$ web/$1 [L,NC]
# hide all files not in GUI subfolder that are not whitelisted above
RewriteRule !^web/ /foo/ [L,R]
What I don't like about this approach is that the application root folder must be hardcoded in .htaccess file (as far as I know), so the file must be generated on application install, not simply copied.
To debug, try simplifying your regex and the URL you request (just a part of the full URL you want to match) and see if that works. Then, step by step, add more bits to the regex and the test URL until you find where things stop working properly.
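mod_rewrite can also log what it does with each rule. A minimal sketch, assuming Apache 2.4 and access to the main server or virtual host configuration (LogLevel cannot be set from .htaccess); trace level 3 is an arbitrary choice:

```apache
# Goes in httpd.conf or the <VirtualHost> block, not in .htaccess.
# Keep other modules at "alert" and raise only mod_rewrite to trace level 3.
LogLevel alert rewrite:trace3
```

The trace lines then show up in the error log, one per pattern test and rewrite step, which makes the repeated passes over the .htaccess file visible. Apache 2.2 used the RewriteLog and RewriteLogLevel directives for the same purpose.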
Try using:
RewriteRule ^(.*)$ /foo/ [R,L]
If it still loops, put a RewriteCond in front of it to skip the rule if it is already /foo/
