What is this line in the .htaccess doing?

I was having problems with an img folder returning a 403 Forbidden, and it turned out the issue was caused by this line. What is it doing?
RewriteRule .*\.(jpg|jpeg|gif|png|bmp|pdf|exe|zip)$ - [F,NC]
And why would it cause a 403 Forbidden on a /something/img folder that contained images?

See http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html
F = forbidden: forces a 403 error on every URL ending with one of these extensions.
NC = no case (i.e. the rule also matches .GIF, for example).

RewriteRule .*\.(jpg|jpeg|gif|png|bmp|pdf|exe|zip)$ matches any file name consisting of any number of characters followed by a period and one of those file extensions.
- tells mod_rewrite to leave the URL untouched; a technicality employed when showing a 403 Forbidden page.
[F,NC] F=Forbidden, NC=No case; a case-insensitive match.
Most likely this rule follows (or is supposed to follow) one or more RewriteConds, which are conditions under which the RewriteRule will trigger. The intention of the rule was probably to block images and other files from being hotlinked. Without the RewriteCond, images will always be blocked.
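To see why every image was blocked, the pattern can be tried against a few paths outside Apache. The following is a minimal Python sketch, not mod_rewrite itself: it assumes Python's re module behaves like PCRE for this simple pattern, and the sample paths are made up for illustration.

```python
import re

# The pattern from the rule; re.IGNORECASE stands in for the NC flag.
BLOCKED = re.compile(r".*\.(jpg|jpeg|gif|png|bmp|pdf|exe|zip)$", re.IGNORECASE)


def is_blocked(path: str) -> bool:
    """Return True if the bare rule (no RewriteCond) would answer 403 for this path."""
    return BLOCKED.search(path) is not None


# Hypothetical paths, written relative to the directory holding the .htaccess.
for path in ["something/img/photo.jpg", "something/img/LOGO.PNG", "index.html"]:
    print(path, "->", "403" if is_blocked(path) else "allowed")
```

The directory name plays no part in the match, which is why a folder that merely contains images ends up forbidden.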
A well-designed rule for preventing hotlinking will look something like this:
# Only apply the rule if the referrer isn't empty...
RewriteCond %{HTTP_REFERER} !^$
# ... and doesn't match your site.
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?mysite\.com(/|$) [NC]
# Also, only apply the rule for the specified file types.
RewriteRule \.(jpg|jpeg|gif|png|bmp|pdf|exe|zip)$ - [F,NC]
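The way the two conditions and the rule combine can be sketched in Python. This is only a model of the logic (consecutive RewriteCond lines are ANDed, and the rule fires only if its own pattern matches too); the domain mysite.com is a placeholder and the referrer pattern is an assumption chosen for the sketch.

```python
import re

# Assumption: "mysite.com" is a placeholder and this referrer pattern is illustrative.
OWN_SITE = re.compile(r"^https?://([^/]+\.)?mysite\.com(/|$)", re.IGNORECASE)
PROTECTED = re.compile(r"\.(jpg|jpeg|gif|png|bmp|pdf|exe|zip)$", re.IGNORECASE)


def is_hotlink(path: str, referer: str) -> bool:
    """Return True when the request would be answered with 403."""
    referer_present = referer != ""                      # first condition: !^$
    referer_foreign = OWN_SITE.search(referer) is None   # second condition: not our site
    protected_file = PROTECTED.search(path) is not None  # the RewriteRule pattern
    return referer_present and referer_foreign and protected_file


print(is_hotlink("img/a.png", "http://other.example/page"))            # foreign referrer
print(is_hotlink("img/a.png", ""))                                     # empty referrer
print(is_hotlink("img/a.png", "http://www.mysite.com/gallery.html"))   # own site
```

Requests with an empty referrer are let through on purpose, since many browsers and proxies omit the header.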

Htaccess caching system in subfolder not working

Sorry if this is a duplicate: I found many questions about caching systems, but my problem seems to be tied to the fact that the whole script is working within a subfolder.
All I need to do is implement a simple caching system for my website, but I can't get this to work.
Here's my .htaccess file (widely commented to be clear - sorry if too many comments are confusing):
RewriteEngine on
# Map for lower-case conversion of some case-insensitive arguments:
RewriteMap lc int:tolower
# The script lives in this subfolder:
RewriteBase /mydir/
# IMAGES
# Checks if cached version exists...
RewriteCond cache/$1-$2-$3-{lc:$4}.$5 -f
# ...if yes, redirects to cached version...
RewriteRule ^(hello|world)\/image\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_\s]+)\.(png|gif|jpeg?|jpg)$ cache/$1-$2-$3-{lc:$4}.$5 [L]
# ...if no, tries to generate content dynamically.
RewriteRule ^(hello|world)\/image\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_\s]+)\.(png|gif|jpeg?|jpg)$ index.php?look=$1&action=image&size=$2&data=$3&name=$4&format=$5 [L,QSA]
# OTHER
# This is always non-cached.
RewriteRule ^(hello|world)\/([a-zA-Z0-9\.\-_\s]+)\/([a-zA-Z0-9\.\-_\s]+)?\/?$ index.php?look=$1&action=$2&name=$3 [QSA]
Now, the issue is that the RewriteCond seems to always fail, as the served image is always generated by PHP. I also tried prepending %{DOCUMENT_ROOT}, but it is still not working. If I move the whole script to the root directory, it magically starts working.
What am I doing wrong?
Well, one thing that you are doing wrong is trying to declare a rewrite map in an .htaccess file in the first place. According to the Apache documentation:
The RewriteMap directive may not be used in <Directory> sections or .htaccess files. You must declare the map in server or virtualhost context. You may use the map, once created, in your RewriteRule and RewriteCond directives in those scopes. You just can't declare it in those scopes.
If your ISP / sysadmin has already defined the lc map, then you can use it. If not, then you can only do case-sensitive file caching on Linux, because its file system naming is case-sensitive. However, since these are internally generated images, just drop the case conversion and stick to lower case.
%{DOCUMENT_ROOT} may not be set correctly at the time mod_rewrite executes on some shared hosting configurations. See my Tips for debugging .htaccess rewrite rules for more hints. Also, here are the equivalent lines from my blog's .htaccess, FYI. The DOCUMENT_ROOT variable does work here, but it didn't for my previous ISP, so I had to hard-code the path:
# For HTML cacheable blog URIs (a GET to a specific list, with no query params,
# guest user and the HTML cache file exists) then use it instead of executing PHP
RewriteCond %{HTTP_COOKIE} !blog_user
RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
RewriteCond %{DOCUMENT_ROOT}html_cache/$1.html -f
RewriteRule ^(article-\d+|index|sitemap.xml|search-\w+|rss-[0-9a-z]*)$ \
html_cache/$1.html [L,E=END:1]
Note that I bypass the cache if the user is logged on, for POSTs, or if any query parameters are set.
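The three conditions can be modelled in a few lines of Python. This is a sketch of the decision only, under the assumption that the cookie test is a plain substring match and that the [NC] flag makes the =GET comparison case-insensitive; the function and parameter names are made up for illustration.

```python
def may_use_cache(method: str, query: str, cookie: str, cache_file_exists: bool) -> bool:
    """Model of the RewriteCond chain guarding the HTML cache."""
    # RewriteCond %{HTTP_COOKIE} !blog_user  -> only guests (no blog_user cookie)
    guest = "blog_user" not in cookie
    # RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
    # The concatenation equals "GET" only for a GET with an empty query string.
    plain_get = (method + query).lower() == "get"
    # RewriteCond ... -f  -> the cached file must exist
    return guest and plain_get and cache_file_exists


print(may_use_cache("GET", "", "", True))              # guest, plain GET, file cached
print(may_use_cache("GET", "page=2", "", True))        # query string present
print(may_use_cache("POST", "", "", True))             # not a GET
print(may_use_cache("GET", "", "blog_user=42", True))  # logged-on user
```

Concatenating the method and the query string lets a single condition check both "is a GET" and "has no parameters".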
Footnote
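Your match patterns are more complicated than they need to be because you aren't using the full regexp syntax: use \w, and you don't need to escape . inside [ ] or / anywhere. Also, the jpeg alternative isn't right, is it? jpeg? matches jpe or jpeg, whereas jpe?g matches jpg or jpeg. So why not: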
RewriteRule ^(hello|world)/image/([.\w\-]+)/([.\w\-]+)/([\w\-]+\.(png|gif|jpe?g))$ \
cache/$1-$2-$3-$4 [L]
etc. Or even (given that the -f condition only matches for files that actually exist in the cache):
RewriteRule ^(hello|world)/image/(.+?)/(.+?)/(.*?\.(png|gif|jpe?g))$ \
cache/$1-$2-$3-$4 [L]
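The non-greedy modifier makes (.+?) stop at the first / that lets the rest of the pattern match, so for well-formed URLs it behaves like ([^/]+). It is not strictly equivalent, though: the regex engine can still backtrack and let a lazy group, or the final (.*?), span a /. If you need a guarantee that hacks like ../../../../etc/passwd cannot end up in a capture and walk the file hierarchy, use ([^/]+) instead.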
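The comparison between (.+?) and ([^/]+) can be checked by running both forms on sample paths. The sketch below uses Python's re module as a stand-in for PCRE (an assumption that holds for these constructs); the sample paths are hypothetical.

```python
import re

# Lazy version, as in the shortened rule.
LAZY = re.compile(r"^(hello|world)/image/(.+?)/(.+?)/(.*?\.(png|gif|jpe?g))$")
# Stricter variant: no capture may contain a slash.
STRICT = re.compile(r"^(hello|world)/image/([^/]+)/([^/]+)/([^/]*\.(png|gif|jpe?g))$")


def captures(pattern, path):
    """Return (size, data, name) captures, or None when the path does not match."""
    m = pattern.match(path)
    return None if m is None else (m.group(2), m.group(3), m.group(4))


normal = "hello/image/64/abc/logo.png"
extra_segment = "hello/image/a/b/c/d.png"

print(captures(LAZY, normal), captures(STRICT, normal))
print(captures(LAZY, extra_segment), captures(STRICT, extra_segment))
```

On the well-formed path both patterns capture the same pieces. On the path with an extra segment the lazy pattern still matches, with the last capture containing a slash, while the strict pattern rejects it.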

How do I write an .htaccess redirect for a directory containing an ellipsis

Somehow I had an invalid directory indexed in Google, and because of some dynamic relative links I now have 2500 "missing" pages indexed. I'm trying to use an .htaccess 301 redirect to correct the problem, but I can't seem to get it to work. I need to redirect www.domain.com/shop/pc/.../pc/filename.asp to www.domain.com/shop/pc/filename.asp.
The rule I have written that doesn't want to work is RewriteRule ^shop/pc/\.\.\./pc/(.*)$ /shop/pc/$1 [R=301,L]
Any thoughts?
mod_rewrite uses PCRE, so you can match these Unicode characters by their UTF-8 byte sequences (I included the two dot leader as well, since I imagine that is more likely to sneak into a URL than an ellipsis):
# U+2026 … \xe2\x80\xa6 HORIZONTAL ELLIPSIS
RewriteRule ^shop/pc/\xe2\x80\xa6/pc/(.*)$ /shop/pc/$1 [R=301,L]
# U+2025 ‥ \xe2\x80\xa5 TWO DOT LEADER
RewriteRule ^shop/pc/\xe2\x80\xa5/pc/(.*)$ /shop/pc/$1 [R=301,L]
Note you may need the [B] flag (see flags) if the browser is percent-escaping the ellipsis.
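The byte sequences used in the rules can be double-checked with a short Python sketch. It is only an illustration: it decodes a percent-escaped path by hand and matches it against a byte pattern equivalent to the first rule, and the helper name redirect_target is made up.

```python
import re
from urllib.parse import quote, unquote_to_bytes

ELLIPSIS = "\u2026"        # HORIZONTAL ELLIPSIS
TWO_DOT_LEADER = "\u2025"  # TWO DOT LEADER

print(ELLIPSIS.encode("utf-8"))        # b'\xe2\x80\xa6'
print(TWO_DOT_LEADER.encode("utf-8"))  # b'\xe2\x80\xa5'
print(quote(ELLIPSIS))                 # %E2%80%A6 (what a browser would send)

# Byte-level pattern mirroring the first RewriteRule.
RULE = re.compile(rb"^shop/pc/\xe2\x80\xa6/pc/(.*)$")


def redirect_target(raw_path: str):
    """Decode percent-escapes, then apply the rule; None means no match."""
    decoded = unquote_to_bytes(raw_path)
    m = RULE.match(decoded)
    return None if m is None else "/shop/pc/" + m.group(1).decode("utf-8")


print(redirect_target("shop/pc/%E2%80%A6/pc/filename.asp"))
print(redirect_target("shop/pc/.../pc/filename.asp"))  # three ASCII dots: no match
```

A directory made of three ASCII dots is a different string from the single ellipsis character, so it would still need the \.\.\. form of the rule.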

.htaccess rewrite rule - custom structure

Is it possible to create an .htaccess rule which will take the middle out of a URL structure, but resume the normal REQUEST_URI? (Sorry for my terrible explanation.)
Take this URL for example:
/boats/283/manage/water
Now let's say I'm keeping the hierarchy standardisation as per the URL structure, minus the ID (the ID in this case is 283), so the actual script location is /boats/manage/water(.php).
But obviously I don't want to have a manual rule for each page, as that will get tedious.
E.g. (what I want to avoid per page):
RewriteRule ^boats/(\d+)/manage/water$ ./boats/manage/water.php?id=$1
RewriteRule ^boats/(\d+)/manage/bacon$ ./boats/manage/bacon.php?id=$1
I have no doubt I could find something relevant on Google, but I just can't quite come up with the proper keywords.
Any help/push in the right direction is much appreciated :)
You can try:
# group out the first path, the ID, then the rest
RewriteCond %{REQUEST_URI} ^/([^/]+)/(\d+)/(.*)$
# pre-check that the destination php file actually exists
RewriteCond %{DOCUMENT_ROOT}/%1/%3.php -f
# rewrite
RewriteRule ^ /%1/%3.php?id=%2 [L]
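The same three steps (split the URI, check that the script exists, build the target) can be traced in Python. This is a sketch of the logic rather than of mod_rewrite: the file_exists callback stands in for the -f test against the document root, and all names are hypothetical.

```python
import re
from typing import Callable, Optional

# Same grouping as the first RewriteCond: first path segment, numeric ID, the rest.
COND = re.compile(r"^/([^/]+)/(\d+)/(.*)$")


def rewrite(uri: str, file_exists: Callable[[str], bool]) -> Optional[str]:
    """Return the rewritten target, or None if the rule would not apply."""
    m = COND.match(uri)
    if m is None:
        return None
    section, item_id, rest = m.groups()
    script = f"/{section}/{rest}.php"   # corresponds to /%1/%3.php
    if not file_exists(script):         # corresponds to the -f pre-check
        return None
    return f"{script}?id={item_id}"     # corresponds to ?id=%2


# Hypothetical set of scripts present under the document root.
scripts = {"/boats/manage/water.php"}
print(rewrite("/boats/283/manage/water", scripts.__contains__))
print(rewrite("/boats/283/manage/bacon", scripts.__contains__))
```

Because the existence check runs before the rewrite, URLs whose script is missing fall through untouched and produce the normal 404.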

Mod_Rewrite to /subdirectory and /subdirectory/query

I'm having a difficult time getting into using mod_rewrite. I've been at this for about an hour googling stuff but nothing quite seems to work. What I want to do is change
example.com/species.php into example.com/species
and also
example.com/species.php?name=frog into example.com/species/frog.
Using
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^species/(.*)$ /species.php?name=$1
I can get example.com/species.php?name=frog to display as example.com/species/frog, and with
RewriteRule ^species/ /species.php
I can get example.com/species.php to display as example.com/species/, but I can't get both of them to work at the same time.
Also, example.com/species with no trailing slash always comes up as a 404.
I've considered just making a /species/ directory to catch any problems but I'd rather just have a few rules for one species.php file. Any help would gladly be appreciated!
Edit (because I can't answer my own question for 8 more hours):
I seem to have fixed both of my problems. I changed my .htaccess to:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^species/(.*)$ /species.php?name=$1
RewriteRule ^species/?$ /species.php
The second RewriteRule successfully redirects example.com/species to example.com/species.php while leaving the other RewriteRule working at the same time.
However, if I typed in example.com/species/ with a trailing slash, it was being read as example.com/species.php?name= and would throw an error because no name was submitted, so I just added
if(isset($_GET['name']) && empty($_GET['name'])) {header('location: http://example.com/species');}
so that if I used example.com/species/ it would redirect to /species and work as desired.
If you change the * (match zero or more) to a + (match one or more) in your first RewriteRule then you should stop seeing species.php?name= if a trailing slash is used.
This is because the + will require that something appears after the slash, otherwise the rule will not match. Then your second RewriteRule will match because it ends with an optional slash, but will not add the name= query string to the target URL.
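The difference between the two quantifiers is easy to confirm with a small Python sketch. It models the two rules in order and stops at the first one that matches; Python's re module is assumed to agree with PCRE for these patterns.

```python
import re

STAR = re.compile(r"^species/(.*)$")  # original first rule
PLUS = re.compile(r"^species/(.+)$")  # suggested first rule
BARE = re.compile(r"^species/?$")     # second rule, optional trailing slash


def target(path: str, first_rule) -> str:
    """Apply the first rule, then the second; return the rewritten URL or the path itself."""
    m = first_rule.match(path)
    if m:
        return "/species.php?name=" + m.group(1)
    if BARE.match(path):
        return "/species.php"
    return path


for path in ["species", "species/", "species/frog"]:
    print(path, "| with *:", target(path, STAR), "| with +:", target(path, PLUS))
```

With * the path species/ is rewritten to /species.php?name= (an empty name), while with + it falls through to the second rule and becomes /species.php.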
You may also want to add the [L] flag (last) after the first rule, because you don't need the second rule to execute if the first rule matches. (Note that this will not stop the RewriteCond and RewriteRule tests being run on the resulting rewritten URL, which will have to go through the .htaccess file just like any other request.)
See the Reference Documentation for mod_rewrite in Apache 2.4 (or see the docs for the version of Apache you're actually using).

URL Beautification using .htaccess

In search of a more user-friendly URL, how do I achieve both of the following elegantly, using only .htaccess?
/de/somepage
going to /somepage?ln=de
/zh-CN/somepage#7
going to /somepage?ln=zh-CN#7
summary:
/[language]/[pagefilenameWithoutExtension][optional anchor#][a number from 0-9]
should load (without changing url)
/[pagefilenameWithoutExtension]?ln=[language][optional anchor#][a number from 0-9]
UPDATE, after provided solution:
1. exception /zh-CN/somepage should be reachable as /cn/somepage
2. PHP-generated thumbnails don't load anymore, e.g.:
img src="imgcpu?src=someimage.jpg&w=25&h=25&c=f&f=bw"
RewriteRule ^([a-z][a-z](-[A-Z][A-Z])?)/(.*) /$3?ln=$1 [L]
You don't need to do anything for fragments (eg: #7). They aren't sent to the server. They're handled entirely by the browser.
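Both points can be illustrated with Python's standard URL parser. This is a sketch only: example.com is a placeholder host, the helper names are made up, and the regex is the rule's pattern applied to the path without its leading slash, as it would be in an .htaccess file.

```python
import re
from urllib.parse import urlsplit

RULE = re.compile(r"^([a-z][a-z](-[A-Z][A-Z])?)/(.*)")


def server_side_view(url: str) -> str:
    """Return the part of the URL a server receives: path and query, no fragment."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def rewrite(request_path: str):
    """Apply the rule to a request path; None means the rule does not match."""
    m = RULE.match(request_path.lstrip("/"))
    return None if m is None else f"/{m.group(3)}?ln={m.group(1)}"


url = "http://example.com/zh-CN/somepage#7"
seen = server_side_view(url)
print(urlsplit(url).fragment)  # the fragment stays in the browser
print(seen)                    # what reaches the server
print(rewrite(seen))           # what the rule turns it into
print(rewrite("/de/somepage"))
```

The browser keeps the #7 and applies it to whatever page the server returns, so the rewritten target never needs to carry it.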
Update:
If you really want to treat zh-CN as a special case, you could do something like:
RewriteRule ^zh-CN/(.*) /$1?ln=zh-CN [L]
RewriteRule ^cn/(.*) /$1?ln=zh-CN [L]
RewriteRule ^([a-z][a-z])/(.*) /$2?ln=$1 [L]
I would suggest the following -
RewriteEngine on
RewriteRule ^([a-z][a-z])/([a-zA-Z]+) /$2?ln=$1
RewriteRule ^([a-z][a-z])/([a-zA-Z]+#([0-9])+) /$2?ln=$1$3
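The first rule takes care of URLs like /de/somepage. The language must be exactly two characters long and contain only the letters a to z.
The second rule is meant to take care of URLs like /uk/somepage#7, although, as noted above, the fragment is handled by the browser and is not sent to the server, so this rule will not normally see it.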
