Case insensitive rewrite rule - linux

I have a rewrite rule that is needed because of the old site, and some images are linked from another website. The problem is that I can't fix the URLs manually because there are a lot of images.
The website used to be hosted on Windows, where there was no problem if you linked to an image like this:
http://www.example.com/Fder69.JPG and the filename was "fder69.JPG"; it just worked. Now I have a rewrite rule like this:
RewriteRule ^([^/.]+.JPG)$ /imgs/$1 [L,NC,R=302] which basically redirects the old links to the new structure, but the images whose filenames don't match the link exactly (including case) don't work.
Is there a way to accomplish this, with something like CheckSpelling Off or similar? Can I make the rewrite condition accept both .JPG and .jpg? Any tips?

One option is to rename all the files to be all-lowercase, which generally leads to nicer URLs, and then redirect any requests for mixed-case versions to all lowercase.
This approach has the advantage that each file ends up with only a single URL, rather than the same content appearing under multiple URLs as would be the case if you used mod_speling. This is good for search engine rankings, among other things.
One way to rename all the files would be to generate a bunch of mv commands in a shell script, like this:
find . -depth -name '*[A-Z]*' | perl -ne 'chomp; ($d, $f) = m{^(.*/)([^/]*)$} or next; print "mv \"$_\" \"$d", lc $f, "\"\n";' > rename-files.sh
Note that I make no warranties that this won't mess up all your files, but I think it's right...
The redirection is done using a "RewriteMap", which is a function which can be applied on the right hand side of a RewriteRule. One of the built-in mappings available is int:tolower, allowing you to do this:
# Alias the mapping function as "lc"
RewriteMap lc int:tolower
# Perform the substitution if the URL contains uppercase letters
RewriteCond %{REQUEST_URI} [A-Z]
# Issue a 301 redirect to the all-lowercase version
RewriteRule /(.*) /${lc:$1} [R=permanent,L]
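As a rough illustration of what the condition and rule above decide, here is a minimal Python sketch. It is not how Apache implements it: the function name lowercase_redirect is made up for this example, and Python's str.lower() stands in for int:tolower (the two agree for plain ASCII URLs, which is the assumption here).

```python
import re

def lowercase_redirect(request_uri):
    """Return the redirect target for a mixed-case URI, or None if no redirect is needed.

    Hypothetical helper that mimics the RewriteCond/RewriteRule pair above.
    """
    # RewriteCond %{REQUEST_URI} [A-Z]  -> only act if an uppercase letter is present
    if not re.search(r"[A-Z]", request_uri):
        return None
    # RewriteRule /(.*) /${lc:$1}  -> capture everything after the first slash
    match = re.search(r"/(.*)", request_uri)
    if match is None:
        return None
    # ${lc:$1} -> lowercase the captured part (assumes ASCII-only URLs)
    return "/" + match.group(1).lower()

print(lowercase_redirect("/imgs/Fder69.JPG"))
print(lowercase_redirect("/imgs/fder69.jpg"))
```

The second call returns None because the condition fails, which is what stops the redirect from looping once the browser requests the lowercase URL.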

Related

Using htaccess to redirect to certain extension

I want to know how to use .htaccess to redirect a certain path to a certain extension. To be more clear, I want to redirect something like this:
http://www.example.com/api/some/page
To this:
http://www.example.com/some/page.json
I understand that I could just do this using the router that is supplied by CakePHP, however, how would this be done with a .htaccess file?
To handle this rewrite, you may use this rule just below RewriteEngine On:
RewriteEngine On
RewriteRule ^api/(?!.+\.json$)(.+)$ $1.json [L,NC]
(?!.+\.json$) is a negative lookahead that skips matching URIs that end with .json (to avoid a rewrite loop)
Pattern ^api/(?!.+\.json$)(.+)$ matches URIs that start with /api/ and captures part after /api in $1
$1.json in target adds .json at the end of matched part
Flags: L is for Last and NC is Ignore case
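To see how the lookahead behaves, the same pattern can be tried in Python. This is only a sketch: Python's re module is used as a stand-in for the PCRE engine in mod_rewrite, re.IGNORECASE plays the role of NC, and rewrite_api is a hypothetical helper name. The paths have no leading slash, as they would in an .htaccess file.

```python
import re

# Same pattern as the RewriteRule; re.IGNORECASE stands in for the NC flag.
API_RULE = re.compile(r"^api/(?!.+\.json$)(.+)$", re.IGNORECASE)

def rewrite_api(path):
    """Return the rewritten path, or the path unchanged if the rule does not match."""
    m = API_RULE.match(path)
    return m.group(1) + ".json" if m else path

print(rewrite_api("api/some/page"))       # rewritten
print(rewrite_api("api/some/page.json"))  # left alone: the lookahead rejects it
print(rewrite_api("some/page.json"))      # left alone: does not start with api/
```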

Why doesn't this htaccess rewrite work?

Okay so I am trying to make it so that if people go to /?char=USERNAME it would show the contents of /game/CharWidget.swf?login=USERNAME. This is my code so far:
RewriteEngine on
RewriteCond %{QUERY_STRING} char=(.*)
RewriteRule ^index.php?char=(.*) /game/CharWidget.swf?login=%1
This makes the URL /game/CharWidget.swf on the server side, but it doesn't carry over ?char=username as ?login=username, so it won't show what I want it to show.
Edit: If it's easier to map /char/USERNAME to /game/CharWidget.swf?login=USERNAME, I wouldn't mind doing that if someone could give me the code for it.
The query string is not visible to RewriteRules, so ^index.php?char=(.*) will never match. (Except that, since you haven't escaped . or ?, it will match e.g. indexZphchar=foo, which is probably not what you want.)
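The effect of the unescaped . and ? is easy to check outside Apache. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine; the variable names are made up for the example.

```python
import re

# Pattern as written in the question: "." matches any character, "p?" makes the last p optional.
unescaped = re.compile(r"^index.php?char=(.*)")
# Pattern with the metacharacters escaped, matching the literal text "index.php?char=".
escaped = re.compile(r"^index\.php\?char=(.*)")

print(bool(unescaped.match("indexZphchar=foo")))    # matches the odd string
print(bool(unescaped.match("index.php?char=foo")))  # does not match the literal text
print(bool(escaped.match("index.php?char=foo")))    # matches the literal text
print(bool(escaped.match("index.php")))             # the path alone never matches
```

The last line is the real problem: even the escaped pattern cannot match, because the RewriteRule is only ever given the path, not the query string.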
Also, if the user visits /?char=USERNAME, what the RewriteRule would normally see is just /; no index.php. Finally, if this is in an .htaccess file, you'll generally also need a RewriteBase directive.
Putting all those fixes together, something like this should work:
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^char=(.*)$
RewriteRule ^/?(index\.php)?$ /game/CharWidget.swf?login=%1 [NS]
(The regexp ^/?(index\.php)?$ will match either an empty path or index.php, with or without a leading slash. That makes it a bit more complex than absolutely necessary, but also more robust. In particular, the /? lets it also work outside .htaccess files, where the leading slash will be present.)
Ps. The regexp ^char=(.*)$ will also allow URLs like /?char=foo&bar=baz to be rewritten to /game/CharWidget.swf?login=foo&bar=baz. If you don't want to allow such rewrites, replace it with e.g. ^char=([^&;]*)$.
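The difference between the two query-string patterns can be sketched in Python (again using re as a stand-in for mod_rewrite's regex engine; rewrite_query is a hypothetical helper that mimics the RewriteCond capture plus the %1 substitution):

```python
import re

loose = re.compile(r"^char=(.*)$")        # captures everything after char=
strict = re.compile(r"^char=([^&;]*)$")   # refuses query strings with extra parameters

def rewrite_query(pattern, query_string):
    """Return the rewritten URL, or None if the condition does not match."""
    m = pattern.match(query_string)
    return "/game/CharWidget.swf?login=" + m.group(1) if m else None

print(rewrite_query(loose, "char=foo&bar=baz"))
print(rewrite_query(strict, "char=foo&bar=baz"))
print(rewrite_query(strict, "char=foo"))
```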
Edit: Unfortunately, this isn't going to work for .swf files, because those execute on the client, and so won't see any changes to the query string made by server-side rewrites.
What you could do is make the rewrite external by replacing the [NS] flag with [NS,L,R=302]. However, this will also change the URL shown in the browser address bar, which may not be what you want. If so, another option would be to make the original request serve an HTML page on which you embed the .swf file.

Htaccess caching system in subfolder not working

Sorry if this is a duplicate: I found many questions about caching systems, but my problem seems to be tied to the fact that the whole script is running within a subfolder.
All I need to do is implement a simple caching system for my website, but I can't get it to work.
Here's my .htaccess file (heavily commented to be clear; sorry if the many comments are confusing):
RewriteEngine on
# Map for lower-case conversion of some case-insensitive arguments:
RewriteMap lc int:tolower
# The script lives in this subfolder:
RewriteBase /mydir/
# IMAGES
# Checks if cached version exists...
RewriteCond cache/$1-$2-$3-{lc:$4}.$5 -f
# ...if yes, redirects to cached version...
RewriteRule ^(hello|world)\/image\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_\s]+)\.(png|gif|jpeg?|jpg)$ cache/$1-$2-$3-{lc:$4}.$5 [L]
# ...if no, tries to generate content dynamically.
RewriteRule ^(hello|world)\/image\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_]+)\/([a-zA-Z0-9\.\-_\s]+)\.(png|gif|jpeg?|jpg)$ index.php?look=$1&action=image&size=$2&data=$3&name=$4&format=$5 [L,QSA]
# OTHER
# This is always non-cached.
RewriteRule ^(hello|world)\/([a-zA-Z0-9\.\-_\s]+)\/([a-zA-Z0-9\.\-_\s]+)?\/?$ index.php?look=$1&action=$2&name=$3 [QSA]
Now, the issue is that the RewriteCond seems to always fail, as the served image is always generated by PHP. I also tried prepending %{DOCUMENT_ROOT}, but it still doesn't work. If I move the whole script to the root directory, it magically starts working.
What am I doing wrong?
Well, one thing that you are doing wrong is trying to use a rewrite map in an .htaccess file in the first place. According to the Apache documentation:
The RewriteMap directive may not be used in <Directory> sections or .htaccess files. You must declare the map in server or virtualhost context. You may use the map, once created, in your RewriteRule and RewriteCond directives in those scopes. You just can't declare it in those scopes.
If your ISP / sysadmin has already defined the lc map then you can use it. If not then you can only do case-sensitive file caching on Linux, because its FS naming is case sensitive. However, since these are internally generated images, just drop the case conversion and stick to lower case.
%{DOCUMENT_ROOT} may not be set correctly at the time of mod_rewrite execution on some shared hosting configurations. See my Tips for debugging .htaccess rewrite rules for more hints. Also, here are the equivalent lines from my blog's .htaccess, FYI. The DOCUMENT_ROOT variable does work here, but it didn't for my previous ISP, so I had to hard-code the path.
# For HTML cacheable blog URIs (a GET to a specific list, with no query params,
# guest user and the HTML cache file exists) then use it instead of executing PHP
RewriteCond %{HTTP_COOKIE} !blog_user
RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
RewriteCond %{DOCUMENT_ROOT}html_cache/$1.html -f
RewriteRule ^(article-\d+|index|sitemap.xml|search-\w+|rss-[0-9a-z]*)$ \
html_cache/$1.html [L,E=END:1]
Note that I bypass the cache if the user is logged on, for POST requests, or if any query parameters are set.
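The three conditions can be read as a small decision function. The Python sketch below is only an illustration of that logic: use_html_cache and its parameters are made up for the example, and the set of cached files stands in for the -f file test. Note how concatenating the method and the query string and comparing the result to GET checks "GET with an empty query string" in a single condition.

```python
def use_html_cache(cookie_header, method, query_string, cached_files, name):
    """Return True if the cached HTML file should be served instead of running PHP.

    Hypothetical helper; cached_files is a set of relative paths that exist on disk.
    """
    # RewriteCond %{HTTP_COOKIE} !blog_user  -> skip the cache for logged-on users
    if "blog_user" in cookie_header:
        return False
    # RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
    # The concatenation equals "GET" only for a GET request with no query string.
    if (method + query_string).lower() != "get":
        return False
    # RewriteCond %{DOCUMENT_ROOT}html_cache/$1.html -f  -> the cache file must exist
    return "html_cache/{}.html".format(name) in cached_files

cache = {"html_cache/index.html", "html_cache/article-12.html"}
print(use_html_cache("", "GET", "", cache, "index"))
print(use_html_cache("blog_user=abc", "GET", "", cache, "index"))
print(use_html_cache("", "POST", "", cache, "index"))
print(use_html_cache("", "GET", "page=2", cache, "index"))
```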
Footnote
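Your match patterns are more complicated than they need to be because you are not using the full regexp syntax: use \w, and you don't need to escape . inside [ ], or / anywhere. Also, jpeg? isn't right, is it? It matches jpe and jpeg, when you presumably meant jpg and jpeg. So why not: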
RewriteRule ^(hello|world)/image/([.\w\-]+)/([.\w\-]+)/([\w\-]+\.(png|gif|jpe?g))$ \
cache/$1-$2-$3-$4 [L]
etc.. Or even (given that the file rule will only match for valid files in the cache)
RewriteRule ^(hello|world)/image/(.+?)/(.+?)/(.*?\.(png|gif|jpe?g))$ \
cache/$1-$2-$3-$4 [L]
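The non-greedy modifier means that (.+?) stops at the first / from which the rest of the pattern can match, so for well-formed URLs it captures the same text as ([^/]+). It is not strictly equivalent, though: a lazy group (and the final .*?) can still span slashes when that is the only way to get a match, so use ([^/]+) if you want to be certain that hacks like ../../../../etc/passwd can't walk the file hierarchy.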
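How close the lazy groups come to ([^/]+) can be checked with a quick Python sketch (re standing in for PCRE). The STRICT pattern and the cache_target helper are hypothetical, written only for this comparison. On a normal URL the two patterns capture the same text, but on a path with extra segments the lazy version still matches and lets a slash into the file name, while the strict one refuses.

```python
import re

# Lazy version from the rule above.
LAZY = re.compile(r"^(hello|world)/image/(.+?)/(.+?)/(.*?\.(png|gif|jpe?g))$")
# Hypothetical stricter variant: no captured group may contain a slash.
STRICT = re.compile(r"^(hello|world)/image/([^/]+)/([^/]+)/([^/]*\.(png|gif|jpe?g))$")

def cache_target(pattern, path):
    """Return the cache path the rule would rewrite to, or None if it does not match."""
    m = pattern.match(path)
    return "cache/{}-{}-{}-{}".format(*m.group(1, 2, 3, 4)) if m else None

normal = "hello/image/64/abc/name.png"
odd = "hello/image/../../x/y.png"

print(cache_target(LAZY, normal), cache_target(STRICT, normal))
print(cache_target(LAZY, odd), cache_target(STRICT, odd))

# The extension alternation: jpeg? accepts "jpe", jpe?g does not.
print(bool(re.fullmatch(r"png|gif|jpeg?|jpg", "jpe")))
print(bool(re.fullmatch(r"png|gif|jpe?g", "jpe")))
```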

.htaccess rewrite to truncate links with common mistakes?

Everyone who has tried to search through the error_log files of a large website has seen lots of entries like the ones below, caused by people who have screwed up some HTML on third-party sites or blogs...
File does not exist: /var/www/vhosts/mydomain.com/httpdocs/materias/137.html'http://...
File does not exist: /var/www/vhosts/mydomain.com/httpdocs/materias/137.html http://...
File does not exist: /var/www/vhosts/mydomain.com/httpdocs/materias/137.html/mydomain...
The problem is some extra chars after the .html...
It's easy to guess the correct URL in each case... we just have to truncate the URL after the ".html".
Is it possible with .htaccess to rewrite these problematic URLs to the correct syntax?
Just eliminating everything after the .html? And without messing up the query strings of dynamic URLs?
Here's what I would like to do ...
Replace ".html " with ".html#"
Replace ".html'" with ".html#"
Replace ".html/" with ".html#"
As everything after the # will be just ignored...
Any simple way to do that with .htaccess?
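Just use a regex (with a lazy prefix, so that it stops at the first .html and only fires when something follows it):
RewriteRule ^(.*?)\.html.+$ $1.html [L]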
This RedirectMatch rule should work:
RedirectMatch 301 ^(.+?\.html).+$ $1
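Whether the prefix is greedy or lazy matters when the junk after .html itself contains another .html. The Python sketch below (re standing in for Apache's regex engine; the helper names and sample paths are made up) compares a greedy pattern such as ^(.*)\.html(.*)$ with the lazy pattern used in the RedirectMatch rule.

```python
import re

greedy = re.compile(r"^(.*)\.html(.*)$")   # (.*) runs to the LAST .html
lazy = re.compile(r"^(.+?\.html).+$")      # (.+?) stops at the FIRST .html

def truncate_greedy(path):
    m = greedy.match(path)
    return m.group(1) + ".html" if m else path

def truncate_lazy(path):
    m = lazy.match(path)
    return m.group(1) if m else path

simple = "/materias/137.html/mydomain"
nested = "/materias/137.html'http://example.com/x.html"

print(truncate_greedy(simple), truncate_lazy(simple))
print(truncate_greedy(nested), truncate_lazy(nested))
print(truncate_lazy("/materias/137.html"))  # clean URLs are not touched
```

The lazy pattern also requires at least one character after .html, so an already clean URL does not match and cannot cause a redirect loop.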

.htaccess rewrite /files/users/1/file.pdf to /view/?file=file.pdf

I am terrible with mod_rewrite; however, I need to rewrite any request to the folder /files/users/*/ (* is a wildcard) to /view/ and insert the filename into a query parameter like so:
/files/users/9/test.pdf becomes /view/?file=test.pdf
How would I go about this assuming that the .htaccess file will be located inside /files/users/?
I would really appreciate if you explained how your solution works as I am slowly trying to become familiar with mod_rewrite.
So, you wanna have all my trade secrets on a silver plate?
Well, I'll try my best. ;-)
First of all, you must know where the documentation is. Look here for the reference: mod_rewrite. Or mod_rewrite, if your Apache version is 2.2.
You will find an overview with lots of links at Apache mod_rewrite. There, you will find a nice introduction to rewriting URLs. Also look here for lots of standard examples.
Since mod_rewrite supports PCRE regular expressions, you might need perlre and/or regular-expression.info from time to time.
Now to your question
RewriteEngine On
RewriteRule ^(?:.+?)/(.*) /view/?file=$1
This might already be sufficient. It looks for a subdirectory (?:.+?) in /files/users and captures the name of a file (.*) in this subdirectory. If this pattern matches, it rewrites the URL to /view/?file= and appends the captured file with $1, which gives /view/?file=$1.
All untested, of course, have fun.
P.S. Additional info is here at SO at .htaccess info and .htaccess faq.
Put the directive below in your .htaccess file to rewrite /files/users/9/test.pdf to /view/?file=test.pdf. In practical terms this means that if a visitor requests http://yourdomain.com/files/users/9/test.pdf, they will be served the rewritten URL, which is http://yourdomain.com/view/?file=test.pdf
RewriteRule ^[^/]+/(.*)$ /view/?file=$1 [L]
A RewriteRule directive is part of the Apache mod_rewrite module. It takes two arguments:
Pattern - a regular expression to match against the current URL path (note that the URL path is not the entire URL but e.g. /my/path; in an .htaccess context the leading slash / is stripped, giving us my/path).
Substitution - the destination URL or path that the request will be rewritten OR redirected to.
Explaining the rule
The pattern ^[^/]+/(.*)$:
^ - the regex must match from the start of the string
[^/] - matches any single character except a forward slash
+ - repetition operator which means: match one or more of the preceding element
/ - matches a forward slash
(.*) - matches any characters. The dot means match any character. The star operator means match any number of them (0 or more). The parentheses mean the match is grouped and can be used in backreferences.
$ - the regex must match until the end of the string
The substitution /view/?file=$1:
...means that we rewrite the URL path to the /view/ folder with the query parameter file. The query parameter file will contain our first grouped match from the pattern as we pass it the $1 value (which means the first match from our RewriteRule pattern).
The [L] flag:
...means that mod_rewrite will stop processing rewrite rules. This is handy to avoid unwanted behaviour and/or infinite loops.
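To try the pattern without touching the server, here is a minimal Python sketch (re standing in for mod_rewrite's regex engine; rewrite_file_request is a hypothetical helper). The input paths are relative to /files/users/, as they would be for an .htaccess file located there.

```python
import re

RULE = re.compile(r"^[^/]+/(.*)$")

def rewrite_file_request(path):
    """Return the rewritten URL, or None if the rule does not match."""
    m = RULE.match(path)
    # $1 is the first captured group: everything after the first subdirectory.
    return "/view/?file=" + m.group(1) if m else None

print(rewrite_file_request("9/test.pdf"))
print(rewrite_file_request("test.pdf"))        # no subdirectory, so no match
print(rewrite_file_request("9/sub/test.pdf"))  # deeper paths keep the remaining slashes
```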

Resources