RewriteCond Check if file exists in a subdirectory - iis

I'm using Iirf v2.0.
I have the following directory structure:
/
/library
/library/index.php
/webroot
/webroot/images
/Iirf.ini
Where I have a library folder which contains my application, a webroot folder (which contains images, stylesheets etc) and an Iirf.ini config file.
I want to rewrite all requests to /library/index.php unless the requested file exists under webroot.
eg:
Request Response
/images/blah.png -> /webroot/images/blah.png
/news -> /library/index.php
My Iirf.ini config has:
RewriteEngine ON
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /library/index.php [L]
Which redirects everything to /library/index.php but I'm having trouble working out how to check if the REQUEST_FILENAME exists under webroot.
I've looked at this question but I don't have access to DOCUMENT_ROOT. It gives me the following (taken from the log):
Thu Jul 15 11:46:21 - 760 - ReplaceServerVariables: VariableName='REQUEST_FILENAME' Value='C:\web\favicon.ico'
Thu Jul 15 11:46:21 - 760 - ReplaceServerVariables: in='%{DOCUMENT_ROOT}/webroot/%{REQUEST_FILENAME}' out='DOCUMENT_ROOT/webroot/C:\web\favicon.ico'
Any help would be greatly appreciated.
--- EDIT --
I've updated my config after more reading and the suggestions of Tim to be:
RewriteCond $0 !^/webroot
RewriteRule ^.*$ /webroot$0 [I]
RewriteCond $0 !-f
RewriteRule ^/webroot/(.*)$ /library/index.php [I,L,QSA]
And it passes to /library/index.php correctly but it still doesn't check for an existing file (even though it seems to say that it does).
Thu Jul 15 14:47:30 - 3444 - EvalCondition: checking '/webroot/images/buttons/submit.gif' against pattern '!-f'
Thu Jul 15 14:47:30 - 3444 - EvalCondition: cond->SpecialConditionType= 'f'
Thu Jul 15 14:47:30 - 3444 - EvalCondition: Special: it is not a file
I think I'm going to have to contact the author of the Filter.

Hmm...I hadn't heard about IIRF before, cool stuff. After browsing through the documentation to see what the differences between it and mod_rewrite are, I have two things you could try.
The first is to swap out %{DOCUMENT_ROOT} for %{APPL_PHYSICAL_PATH} in the answer that you found. DOCUMENT_ROOT is an Apache server variable, and from what I can tell the corresponding IIS variable should be APPL_PHYSICAL_PATH. I know based on the IIRF documentation that this variable is available, but admittedly I'm not 100% sure whether or not it points to your site root.
The other is to do the following, which again may or may not work based upon whether I understood the documentation correctly, how your index.php file gets the relevant path information to process the request, and a host of other things. Admittedly I think this is a less than ideal solution (compared to what I had originally thought to do based on how mod_rewrite does things), but maybe it'll work:
RewriteEngine ON
# This should rewrite to /webroot/whatever then restart the ruleset,
# apparently...On Apache in a per-dir context, this would alter the
# %{REQUEST_FILENAME} for the next run-through. I'm assuming it does
# here too, but I might be wrong.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^/webroot
RewriteRule ^.*$ /webroot/$0
# The file still doesn't exist, rewrite it back to its original form,
# but move on to the next rule instead of restarting processing. This
# may not even be necessary, but I was hoping this rewrite would have
# side-effects that would make it as if the above rewrite didn't happen.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(/webroot/)?(.*)$ $0 [NI]
# Now, if it still doesn't exist, we'll rewrite it to our
# /library/index.php file, but this may not work based on how you
# get the original request information. Adding the [U] flag will
# create a new header that preserves the "original" URL (I'm not
# sure what it takes the value from if the URL has already been
# rewritten in a previous step), which might be useful.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*$ /library/index.php

I ended up having to swap to using the Helicon Tech ISAPI_Rewrite 3 filter.
The htaccess file I ended up using was:
RewriteEngine On
# Check whether the file exists and if not, check whether the request starts
# with webroot. Prepend webroot if it doesn't.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^webroot
RewriteRule ^.*$ webroot/$0 [NI]
# Check whether the file exists, if not, send the request off to library/index.php
RewriteCond %{DOCUMENT_ROOT}/$0 !-f
RewriteRule ^(webroot/)?(.*)$ library/index.php [I,L,QSA]
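For comparison, the same "serve from webroot if the file exists, otherwise hand off to library/index.php" logic could be sketched for Apache's mod_rewrite roughly as follows. This is an untested sketch, not a drop-in IIRF or ISAPI_Rewrite configuration. It assumes the .htaccess file sits in the document root and that the directory layout matches the one in the question.

```apache
RewriteEngine On

# If the requested path exists as a file under webroot/, serve it from there.
# $1 is the path captured by the RewriteRule pattern on the next line.
RewriteCond %{DOCUMENT_ROOT}/webroot/$1 -f
RewriteRule ^(.+)$ webroot/$1 [L]

# Anything that has not already been mapped into webroot/ or library/
# goes to the front controller.
RewriteRule !^(webroot|library)/ library/index.php [L]
```

Building the test path from the captured URL path, rather than from %{REQUEST_FILENAME}, avoids depending on how the server maps the original URL to the filesystem.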

Related

Mod_rewrite only working with certain folder names

Context
I'm using mod_rewrite to make my links better for SEO. I made the following rule for my page expanded_debate.php:
Options -MultiViews
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+) expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
When I input this format in the URL (poll/filename/10, for example) I get a 404 error:
Object not found!
The requested URL was not found on this server. If you entered the URL manually please check your spelling and try again.
If you think this is a server error, please contact the webmaster.
Error 404
localhost
Apache/2.4.46 (Unix) OpenSSL/1.1.1h PHP/7.4.12 mod_perl/2.0.11 Perl/v5.32.0
However, when I change the first folder name to certain words, such as "debate" and "expanded_debate" (but not "expandedebate"), the file loads after page refresh. For example:
RewriteRule ^debate/([0-9a-zA-Z_-]+)/([0-9]+) expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
works fine.
I have an older .htaccess file, titled ".htaccess11", with the following info, in case it's of any use:
#forbids users from going to forbidden pages
IndexIgnore *
Options -Indexes
RewriteEngine On
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{REQUEST_URI} !^/\.well-known/acme-challenge/[0-9a-zA-Z_-]+$
RewriteCond %{REQUEST_URI} !^/\.well-known/cpanel-dcv/[0-9a-zA-Z_-]+$
RewriteCond %{REQUEST_URI} !^/\.well-known/pki-validation/(?:\ Ballot169)?
RewriteCond %{REQUEST_URI} !^/\.well-known/pki-validation/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteRule ^(.*)$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
#404 error directions
ErrorDocument 404 /404.php
Question
Any idea why only certain terms in the first folder position ("^debate" in example above) work when using mod_rewrite?
There are no "poll" folders in my project, if that's of any interest.
Let me know if there are any questions.
The line
RewriteCond %{REQUEST_FILENAME}\.php -f
means "Take the requested URL, map it to a full local path in the normal way, append .php to the resulting path, and then process the following rewrite rule only if there is an existing regular file at the modified path".
For example, the URL "poll/filename/10" will be rewritten only if there is a file called "poll/filename/10.php" in the relevant location.
Since the value of the AcceptPathInfo directive is evidently set to On, this condition will also be met if there is an existing file called "poll.php" or "poll/filename.php". That is why the rewrite rule works when you change "poll" to "debate" or "expanded_debate" – there are existing files called "debate.php" and "expanded_debate.php".
In any case, it sounds like this behavior is not what was intended. Removing the -f condition should give the desired result. Or, to prevent the rewrite rule from making existing files inaccessible, you could replace it with:
RewriteCond %{REQUEST_FILENAME} !-f
The exclamation point negates the -f test: "continue only if this file does not exist"
If you are using the %{REQUEST_FILENAME} server variable (anywhere), you should be aware of how the AcceptPathInfo directive will affect this, and consider setting that directive explicitly in the same .htaccess file.
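Putting those suggestions together, the rule from the question might look like the sketch below. Whether AcceptPathInfo Off suits the site is an assumption: it is only appropriate if none of the scripts in this directory rely on trailing path information. The trailing $ on the pattern is also an addition, so that only URLs ending in the numeric ID match.

```apache
Options -MultiViews
# Assumption: no script here depends on /script.php/extra/path style URLs
AcceptPathInfo Off

RewriteEngine On
# Rewrite only when the request does not map to an existing file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+)$ expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
```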
If Options +MultiViews is in effect, then %{REQUEST_FILENAME} will match existing files whether or not the extension is included in the request (GET /foo will match an existing file "foo.php", "foo.html", etc.). And GET /foo.php will match in any case. So, omit the string "\.php" from the original rule.
Other configuration may also have an effect, too. The important point is that, unlike %{REQUEST_URI}, %{REQUEST_FILENAME} invokes all the processing that Apache would otherwise do to translate a URL into a local path.
(source)
NB: although I don't think it was the intention here, you actually might want to test for the existence of a local file as part of this rule. You could use a RewriteCond to check whether the back-end data file for a given poll has been manually created, and return 404 by default if it has not. That would be a simple way to prevent users from making up their own poll URLs at will.
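A minimal sketch of that idea, assuming a hypothetical layout in which each poll has a data file at polls/<pollid>.json under the document root (the directory name and file extension are invented for illustration):

```apache
RewriteEngine On

# $2 is the numeric poll ID captured by the RewriteRule pattern below.
# The rewrite happens only if the matching data file exists.
RewriteCond %{DOCUMENT_ROOT}/polls/$2.json -f
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+)$ expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
```

Requests for poll IDs without a data file are left unrewritten, so they fall through to the normal 404 handling because no physical poll directory exists.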

How to rewrite directory for multiple level directories

This is my .htaccess
RewriteEngine On
# browser requests PHP
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^\ ]+)\.php
RewriteRule ^/?(.*)\.php$ /$1 [L,R=301]
# check to see if the request is for a PHP file:
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^/?(.*)$ /$1.php
RewriteCond %{REQUEST_URI} !^/(views|css|js|media|partials|php)
RewriteRule (.*) /views/$1
This is my project structure:
The idea is that all my HTML files are structured in the folder views, but I don't want my URL to be http://example.com/views/index but just http://example.com/index without the views-part.
This is working fine in the following case:
http://example.com/account/
But it fails as soon as I try to access a file in the account folder, e.g. http://example.com/account/voeg-kind-toe
That results in a 404. Seems like this .htaccess solution only works for one-level directories.
Edit:
Interesting: If I place the bottom two lines on top (so placing the code to rewrite the view-part before the code to remove the php-extension); the php-extension works but the /views//part don't.
Create .htaccess like this
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$
RewriteCond %{REQUEST_URI} !^/views/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /views/$1
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$
RewriteRule ^(/)?$ views/ [L]
As mentioned in my comment above, providing there are no other directives/conflicts then this should still "work" in a roundabout way. However, there is an issue with the following directive in the order you have placed it:
# check to see if the request is for a PHP file:
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^/?(.*)$ /$1.php
You aren't testing for files in the /views subdirectory. But also, REQUEST_FILENAME does not necessarily contain what (I think) you think it does. When you request /account/voeg-kind-toe (an entirely virtual URL path) then REQUEST_FILENAME contains /account (it actually contains an absolute filesystem path, but I've kept it brief). So, the above is testing whether /account.php exists, not /account/voeg-kind-toe.php, or even /views/account/voeg-kind-toe.php - which is presumably the intention.
So, on the first pass, the above condition fails, no rewrite occurs, and processing continues...
The second rule then rewrites the request for /account/voeg-kind-toe to /views/account/voeg-kind-toe. Providing there are no further mod_rewrite directives, the rewrite engine then starts over. This time, /views/account/voeg-kind-toe is the input.
On the second pass, REQUEST_FILENAME is /views/account/voeg-kind-toe (since /views/account is a physical directory) and the request is rewritten to /views/account/voeg-kind-toe.php (since the filesystem check should be successful). Providing you have no other directives then processing should now stop.
This is working fine in the following case: http://example.com/account/
/account/ is simply rewritten to /views/account/ by the last rule.
Edit: Interesting: If I place the bottom two lines on top (so placing the code to rewrite the view-part before the code to remove the php-extension); the php-extension works but the /views//part don't.
The same process as above occurs, EXCEPT this all occurs in a single pass and so is less dependent on other directives that might occur later in the file.
I'm not sure what you mean by "the /views//part don't"?
Assuming you only have .php files within the /views directory, all URLs are intended to target the /views subdirectory, and you don't need to reference directories directly, then you could do this in a single directive and rewrite everything that does not contain (what looks like) a file extension to /views/<whatever>.php.
For example:
RewriteRule !\.\w{2,4}$ /views%{REQUEST_URI}.php [L]
The L (last) flag is required if you have other mod_rewrite directives that follow - to prevent additional processing.
This does mean you can't rely on the directory index. ie. You need to request /index (as in your example), not simply / to serve /views/index.php.
(You still need your first rule that removes .php from the requested URL - although this is only strictly necessary if you are changing an existing URL structure, where you previously used .php on the URLs.)
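If you would rather keep a filesystem check than rely on the absence of a file extension, the condition can be pointed at the /views subdirectory directly. A sketch, assuming the .htaccess file is in the document root and the views live in /views as in the question:

```apache
RewriteEngine On

# Rewrite /account/voeg-kind-toe to /views/account/voeg-kind-toe.php,
# but only if that PHP file actually exists.
# $1 is the URL path captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/views/$1.php -f
RewriteRule ^(.+)$ /views/$1.php [L]
```

Because the test path is built from the captured URL path rather than from REQUEST_FILENAME, it does not depend on which leading path segments happen to exist on disk.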

removed .html extensions with htaccess now index.html give 403 error

After entering the code below, my home page gives a 403 error. The rest of the site works perfectly. All instances of .html were removed.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]
Any advice?
Thank you!
example.com leads to the 403 error. If I write example.com/index it works fine.
Something else must have changed for this to result in a 403 error. The code you posted won't actually do anything when you request example.com/ - the same as if that code didn't exist at all. (UPDATE: However, this assumes your .htaccess file is located in the document root - it appears this is not the case - see below.)
However, what will trigger a 403 in such cases is when "formatted directory listings" are disabled and the directory index document cannot be found (or has been disabled).
So, try setting the appropriate directory index at the top of your .htaccess file:
DirectoryIndex index.html
It is the DirectoryIndex that serves the appropriate file when requesting your "home page", not your directives in .htaccess.
UPDATE:
It [.htaccess] is located in my root directory. Would it be better to put it in the public_html folder?
Yes, the code you posted should go in the /public_html directory (ie. your document root). If these directives are in a .htaccess file above the document root then the RewriteRule pattern will match the URL-path public_html/ and rewrite the URL to public_html/.html which is possibly where your 403 error is coming from ("dot" files are usually hidden/protected OS files and you may also have a directive in your server config blocking access. However, this behaviour may also be dependent on other factors in the server config/OS). However, with that code in the document root then a request for example.com/ (your home page) won't be processed by these directives (which is good) - mod_dir should then serve the index.html file in this instance.
However, you don't want to process "directories" anyway (public_html is obviously a "directory", not a file). Which is what's happening above. eg. .html shouldn't be appended to public_html/ to begin with (or example.com/path/to/directory/ or any other directory). This can be avoided by adding an additional condition to your rule block to avoid directories (as well as files). For example:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]+)$ $1.html [L]
Simply adding that additional RewriteCond directive might be enough and still allow you to keep your .htaccess file above the document root. (However, you may still need to move the .htaccess file as well, as described above.)
Also, the NC flag is not required here and literal dots don't need to be escaped when used inside a character class.
You could also extend this code to first check the existence of the file (with a .html extension) before rewriting, although this may be unnecessary in your case. For example:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([^.]+)$ $1.html [L]
This requires an additional "file check" which may be an unnecessary overhead.

RewriteRule does not match if folder name set to "myfolder/"

Since we migrated to a new server, some of our pages are broken (404). Reason is we have 2 broken rewrite rules.
What's really strange is that they work if I change folder's name.
For example, this works:
RewriteRule ^anything/([a-zA-Z0-9-]+)/$ page.php?var=$1 [L]
This doesn't:
RewriteRule ^myfolder/([a-zA-Z0-9-]+)/$ page.php?var=$1 [L]
I can't even find a trick to make 301 redirects, because my original "myfolder/" virtual folder never matches.
Any ideas what's going on? I was thinking it could be a rule override or something like that (as it's hosted on a multidom solution), but I don't have such rules in my main site at the root. It drives me crazy.
Thx!
In practice you probably want to do 2 things. Disable multiviews and also bypass rules if the request is a real directory.
# turn off automatic URI matching, can cause weirdness
Options -MultiViews
RewriteEngine on
#stop here if the request is a real file or directory
RewriteCond %{REQUEST_FILENAME} -d [OR]
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [L]
RewriteRule ^myfolder/([a-zA-Z0-9-]+)/?$ /page.php?var=$1 [L]

How do I get the [L] flag of RewriteRule (.htaccess) really working?

To newcomers: While trying to describe my problem comprehensively and phrase my questions, I produced a huge amount of text. If you don't want to read the whole thing, my observations about the [L] flag not working (read: "proof" of the misconception from which it all sprang) are in the Additional observations section. Why I misunderstood the apparent behaviour, as well as the solution to the problem, is described in my Answer.
Setup
I have following code in my .htaccess file:
# disallow directory indexing
Options -Indexes
# turn mod_rewrite on
Options +FollowSymlinks
RewriteEngine on
# allow access to robots file
RewriteRule ^robots.txt$ robots.txt [NC,L]
# mangle core request handler address
RewriteRule ^core/(\?.+)?$ core/handleCoreRequest.php$1 [NC,L]
# mangle web file addresses (move them to application root folder)
# application root folder serves as application GUI address
RewriteRule ^$ web/index.html [L]
# allow access to images
RewriteRule ^(images/.+\.(ico|png|bmp|jpg|gif))$ web/$1 [NC,L]
# allow access to stylesheets
RewriteRule ^(css/.+\.css)$ web/$1 [NC,L]
# allow access to javascript
RewriteRule ^(js/.+\.js)$ web/$1 [NC,L]
# allow access to library scripts, styles and images
RewriteRule ^(lib/js/.+\.js)$ web/$1 [NC,L]
RewriteRule ^(lib/css/.+\.css)$ web/$1 [NC,L]
RewriteRule ^(lib/(.+/)?images/.+\.(ico|png|bmp|jpg|gif))$ web/$1 [NC,L]
# redirect all other requests to application address
# RewriteRule ^(.*)$ /foo/ [R]
My web application (and its .htaccess file) is located in foo subfolder of DOCUMENT_ROOT (accessed from browser as http://localhost/foo/). It has PHP core part located in foo/core and JavaScript GUI part located in foo/web. As can be seen from the code above, I want to allow access only to single core script that handles all requests from GUI and to 'safe' web files and redirect all other requests to base application address (last commented directive).
Problem
Behaviour
It works until I try the last part by uncommenting the last redirecting directive. If I comment some more lines, the appropriate page parts stop working, etc.
However, when I uncomment last line, which should be performed only when matching of all previous rules fails (at least that's what I understand), page goes into redirection cycle (Firefox throws error page with something like "This page isn't redirecting properly"), because it's redirecting to http://localhost/foo/ again and again and again, forever.
Questions
What I don't understand is this processing of this rule:
RewriteRule ^$ web/index.html [L],
specifically the [L] flag. The flag apparently doesn't work for me. When the last line is commented, it correctly redirects, but when I uncomment it, it is always processed, even though rewriting should stop on [L] flag. Anyone got any ideas?
Also, on a sidenote, I'd be thrilled to know why my following attempt at fixing it doesn't work either:
RewriteEngine on
RewriteRule ^core/(\?.+)?$ core/handleCoreRequest.php$1 [NC,L]
RewriteRule ^(.*)$ web/$1 [L]
RewriteRule ^.*$ /foo/ [L]
This actually doesn't work at all. Even if I remove the last line, it still doesn't redirect anything correctly. How does the redirecting work in the first example, if it doesn't work in the second?
It would also be of great benefit to me if anybody knew a way to actually debug these directives. I spent hours on this without even the slightest clue what could possibly be wrong.
Additional observations
After trying the advice given by bbadour (not that I haven't tried it before, but now that I had a second opinion, I gave it another shot) and it didn't work, I've come up with the following observation. By rewriting last line to this:
RewriteRule ^(.*)$ /foo/?uri=$1 [R,L]
or this
RewriteRule ^(.*)$ /foo/?uri=%{REQUEST_URI} [R,L]
and using Firebug's Net panel, I found out more evidence, that the [L] flag is clearly not working as expected in the previously mentioned RewriteRule ^$ web/index.html [L] rule (let's call it THE RULE from now on). In first case I get [...]uri=web/index.html, in second case [...]uri=/foo/web/index.html. That means that THE RULE gets executed (rewrites ^$ to web/index.html), but the rewriting doesn't stop there. Any more ideas, please?
After hours of searching and testing, I finally found the real problem and solution. Hopefully this will help somebody else too, when they come across the same problem.
Cause of observed behavior
.htaccess file is processed after every redirect (even without [R] flag),
which means that after the RewriteRule ^$ web/index.html [L] is processed, mod_rewrite correctly stops rewriting, goes to the end of the file, redirects correctly to /foo/web/index.html, and then the server starts processing .htaccess file for the new location, which is the same file. Now only the last rewrite rule matches and redirects back to /foo/ (this time with [R], so the redirect can be observed in browser) ... and the .htaccess file is processed again, and again, and again...
Once more for clarity: Because only the hard redirects can be observed, it seems like the [L] flag is ignored, but it is not so. Instead, the .htaccess is processed two times redirecting back and forth between /foo/ and /foo/web/index.html.
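On Apache 2.4 and later (an assumption; the setup in the question may well predate it), the END flag stops this second pass as well, so the loop can be avoided without extra conditions. A minimal sketch using the two rules discussed above:

```apache
RewriteEngine on

# END stops rewriting here and also prevents the rewritten request
# from being run through the per-directory rules again.
RewriteRule ^$ web/index.html [END]

# Only reached by requests that no earlier rule has handled.
RewriteRule ^ /foo/ [R,L]
```

The other whitelisting rules would need [END] in place of [L] for the same reason.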
Solution
Disallow direct access to subfolder
To virtually move subdirectory to application root directory, additional complex conditional rewrites must be used. Variable THE_REQUEST is useful for distinguishing between hard and soft redirects:
RewriteCond %{THE_REQUEST} ^GET\ /foo/web/
RewriteRule ^web/(.*) /foo/$1 [L,R]
For this rewrite rule to be matched, two conditions must apply. First, on second line, the "local URI" must start with web/ (which corresponds with absolute web URI /foo/web/). Second, on first line, the real request URI must start with /foo/web/ too. Together this means, that the rule only matches when the file inside the web/ subfolder is requested directly from the browser, in which case we want to do a hard redirect.
Redirect to allowed content from root to subfolder (soft)
RewriteCond $1 !^web/
RewriteCond $1 ^(.+\.(html|css|js|ico|png|bmp|jpg|gif))?$
RewriteRule ^(.*)$ web/$1 [L,NC]
We want to redirect to allowed content only if we haven't done it already, hence the first condition. Second condition specifies mask for allowed content. Anything matching this mask will be softly redirected, possibly returning 404 error if the content doesn't exist.
Hide all content not in subfolder or not allowed
RewriteRule !^web/ /foo/ [L,R]
This will do a hard redirect to application root for all URIs not beginning with web/ (and remember, the only requests that can begin with web/ at this point are internal redirects for allowed content).
Real example
My code shown in my "question" after using solution tips mentioned above gradually transformed into the following:
# disallow directory indexing
Options -Indexes
# turn mod_rewrite on
Options +FollowSymlinks
RewriteEngine on
# allow access to robots file
RewriteRule ^robots.txt$ - [NC,L]
# mangle core request handler address
# disallow direct access to core request handler
RewriteCond %{THE_REQUEST} !^(GET|POST)\ /asm/core/handleCoreRequest.php
RewriteRule ^core/handleCoreRequest.php$ - [L]
# allow access to request handler under alias
RewriteRule ^core/$ core/handleCoreRequest.php [NC,QSA,L]
# mangle GUI files addressing (move to application root folder)
# disallow direct access to GUI subfolder
RewriteCond %{THE_REQUEST} ^GET\ /foo/web/
RewriteRule ^web/(.*) /foo/$1 [L,R]
# allow access only to correct filetypes in appropriate locations
RewriteCond $1 ^$ [OR]
RewriteCond $1 ^(images/.+\.(ico|png|bmp|jpg|gif))$ [OR]
RewriteCond $1 ^(css/.+\.css)$ [OR]
RewriteCond $1 ^(js/.+\.js)$ [OR]
RewriteCond $1 ^(lib/js/.+\.js)$ [OR]
RewriteCond $1 ^(lib/css/.+\.css)$ [OR]
RewriteCond $1 ^(lib/(.+/)?images/.+\.(ico|png|bmp|jpg|gif))$
RewriteRule ^(.*)$ web/$1 [L,NC]
# hide all files not in GUI subfolder that are not whitelisted above
RewriteRule !^web/ /foo/ [L,R]
What I don't like about this approach is that the application root folder must be hardcoded in .htaccess file (as far as I know), so the file must be generated on application install, not simply copied.
To debug, try simplifying your regex and the URL you request (matching only part of the full URL you want to match) and check whether that works. Then, step by step, add more pieces to the regex and the test URL until you find the point where things stop working properly.
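Besides narrowing the pattern by hand, mod_rewrite's own trace log can show which rules are applied on each pass, including the repeated pass after an internal rewrite. A sketch, assuming you have access to the server or virtual host configuration (these directives are not allowed in .htaccess); the log path is a placeholder:

```apache
# Apache 2.4+: goes in httpd.conf or a <VirtualHost> block.
# Rewrite trace messages are written to the regular error log.
LogLevel alert rewrite:trace3

# Apache 2.2 equivalent (placeholder log path):
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3
```

Higher trace levels produce more detail but slow the server down noticeably, so this is best kept to a development machine.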
Try using:
RewriteRule ^(.*)$ /foo/ [R,L]
If it still loops, put a RewriteCond in front of it to skip the rule if it is already /foo/
