.htaccess RewriteCond for existing and non-existing files - .htaccess

I am trying to use .htaccess to redirect all url requests of a certain subpath ("URL/somefolders/main/..") to one basefile named "_index.php". So I implemented the following .htaccess to the "folder" URL/somefolders/main/ :
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^?]*)$ /main/_index.php?oldpath=$1 [NC,L,QSA]
</IfModule>
The redirection works fine for all non-existent files, but if the file exists then it is served without redirection. I suppose this is because of the "!" in the RewriteCond directives, but all my attempts to change it have failed.
How do I have to change the above code to redirect all files (existing or not)?
Edit:
All my attempts still end up either ineffective or erroneous.
The latter produce this Apache log error:
Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.
I now fear that most of my earlier attempts were in fact correct, but ran into an endless loop due to included files - is this possible? And if so, can .htaccess distinguish between "internal" and "external" file requests?

Your original rules are the most common implementation you'll see, where REQUEST_FILENAME is checked for existing files or directories to prevent things like CSS and images from being rewritten. But that's not what you want.
So you correctly attempted to remove the RewriteCond directives but ended up with an infinite rewrite loop. That is likely because the subsequent RewriteRule is also attempting to rewrite _index.php back to itself.
You can fix that by adding a RewriteCond which specifically matches _index.php to prevent it from looping on itself.
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
# Don't apply the rewrite to _index.php to prevent looping
RewriteCond %{REQUEST_URI} !main/_index\.php
RewriteRule ^([^?]*)$ /main/_index.php?oldpath=$1 [NC,L,QSA]
</IfModule>
I'll also simplify the matched group in RewriteRule. ([^?]*) captures everything up to the first ?, but the expression received by RewriteRule will never include the query string or ? anyway. You may instead simply use (.*) to capture whatever is present.
RewriteRule (.*) /main/_index.php?oldpath=$1 [NC,L,QSA]
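As a hedged alternative, assuming Apache 2.4 or later (the question does not state the version), the END flag stops all further rewrite processing for the request. The rewritten _index.php is then not passed through the rules a second time, so no extra condition is needed. A minimal sketch:

```apache
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
# Assumption: Apache 2.4+, where the [END] flag is available.
# Unlike [L], [END] also prevents the ruleset from running again
# on the rewritten URL, so _index.php cannot be rewritten onto itself.
RewriteRule (.*) /main/_index.php?oldpath=$1 [END,QSA]
</IfModule>
```

On older versions, the RewriteCond approach shown above remains the way to break the loop.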

Related

Mod_rewrite only working with certain folder names

Context
I'm using mod_rewrite to make my links better for SEO. I made the following rule for my page expanded_debate.php:
Options -MultiViews
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+) expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
When I input this format in the URL (poll/filename/10, for example) I get a 404 error:
Object not found!
The requested URL was not found on this server. If you entered the URL manually please check your spelling and try again.
If you think this is a server error, please contact the webmaster.
Error 404
localhost
Apache/2.4.46 (Unix) OpenSSL/1.1.1h PHP/7.4.12 mod_perl/2.0.11 Perl/v5.32.0
However, when I change the first folder name to certain words, such as "debate" and "expanded_debate" (but not "expandedebate"), the file loads after page refresh. For example:
RewriteRule ^debate/([0-9a-zA-Z_-]+)/([0-9]+) expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
works fine.
I have an older .htaccess file, titled ".htaccess11", with the following info, in case it's of any use:
#forbids users from going to forbidden pages
IndexIgnore *
Options -Indexes
RewriteEngine On
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{REQUEST_URI} !^/\.well-known/acme-challenge/[0-9a-zA-Z_-]+$
RewriteCond %{REQUEST_URI} !^/\.well-known/cpanel-dcv/[0-9a-zA-Z_-]+$
RewriteCond %{REQUEST_URI} !^/\.well-known/pki-validation/(?:\ Ballot169)?
RewriteCond %{REQUEST_URI} !^/\.well-known/pki-validation/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteRule ^(.*)$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
#404 error directions
ErrorDocument 404 /404.php
Question
Any idea why only certain terms in the first folder position ("^debate" in example above) work when using mod_rewrite?
There are no "poll" folders in my project, if that's of any interest.
Let me know if there are any questions.
The line
RewriteCond %{REQUEST_FILENAME}\.php -f
means "Take the requested URL, map it to a full local path in the normal way, append .php to the resulting path, and then process the following rewrite rule only if there is an existing regular file at the modified path".
For example, the URL "poll/filename/10" will be rewritten only if there is a file called "poll/filename/10.php" in the relevant location.
Since the value of the AcceptPathInfo directive is evidently set to On, this condition will also be met if there is an existing file called "poll.php" or "poll/filename.php". That is why the rewrite rule works when you change "poll" to "debate" or "expanded_debate" – there are existing files called "debate.php" and "expanded_debate.php".
In any case, it sounds like this behavior is not what was intended. Removing the -f condition should give the desired result. Or, to prevent the rewrite rule from making existing files inaccessible, you could replace it with:
RewriteCond %{REQUEST_FILENAME} !-f
The exclamation point negates the -f test: "continue only if this file does not exist"
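Putting that together, a minimal sketch of the adjusted rule set might look like this. The trailing $ anchor on the pattern is an addition that is not in the original post:

```apache
Options -MultiViews
RewriteEngine on
# Skip the rewrite only when the request maps to a real file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+)$ expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
```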
If you are using the %{REQUEST_FILENAME} server variable (anywhere), you should be aware of how the AcceptPathInfo directive will affect this, and consider setting that directive explicitly in the same .htaccess file.
If Options +MultiViews is in effect, then %{REQUEST_FILENAME} will match existing files whether or not the extension is included in the request (GET /foo will match an existing file "foo.php", "foo.html", etc.). And GET /foo.php will match in any case. So, omit the string "\.php" from the original rule.
Other configuration may also have an effect. The important point is that, unlike %{REQUEST_URI}, %{REQUEST_FILENAME} invokes all the processing that Apache would otherwise do to translate a URL into a local path.
NB: although I don't think it was the intention here, you actually might want to test for the existence of a local file as part of this rule. You could use a RewriteCond to check whether the back-end data file for a given poll has been manually created, and return 404 by default if it has not. That would be a simple way to prevent users from making up their own poll URLs at will.
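A rough sketch of that idea follows. The polldata directory and the .json extension are hypothetical, chosen only for illustration; the original post does not say how poll data is stored.

```apache
RewriteEngine on
# Hypothetical layout: one data file per poll at /polldata/<pollid>.json
# $2 is the pollid captured by the RewriteRule pattern below
RewriteCond %{DOCUMENT_ROOT}/polldata/$2.json -f
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+)$ expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
# Poll URLs without a matching data file are not rewritten,
# so they fall through to the normal 404 handling
```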

Apache2 htaccess RewriteCond with ErrorDocument 404

Apache2 htaccess RewriteCond
I found a strange condition that works, but I don't know why :)
Files
/error.php
/donate.php
/test/index.php
In an .htaccess file I use:
ErrorDocument 404 /error.php
RewriteEngine On
# WHY THIS LINE NEEDED TO GET IT WORKS
RewriteCond %{REQUEST_FILENAME} ^$
#
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]
With this:
http://localhost/donate calls internal http://localhost/donate.php
while
http://localhost/donate1 calls internal http://localhost/error.php
and
http://localhost/test/ call internal http://localhost/test/index.php
So far so good.
But when I comment it out:
#RewriteCond %{REQUEST_FILENAME} ^$
then I get an internal server error when requesting
/donate1, instead of /error.php.
Can someone explain the steps and why this happens?
#RewriteCond %{REQUEST_FILENAME} ^$
Because your directives are not actually doing what you think they are doing. In fact, with that "hacky" condition uncommented they are not doing anything at all, except to prevent the 500 Internal Server Error (which is due to an internal rewrite loop because the rule is strictly incorrect).
That condition checks if the REQUEST_FILENAME server variable is empty. It is never empty, so always fails, so the RewriteRule directive that follows is never triggered.
You could remove your mod_rewrite directives entirely and you'll get the same results.
http://localhost/donate calls internal http://localhost/donate.php
It's most probably MultiViews (mod_negotiation) that is rewriting /donate to /donate.php. Not the directives you posted (which, as I mentioned, don't actually do anything).
http://localhost/test/ call internal http://localhost/test/index.php
This is caused by mod_dir (DirectoryIndex). Again, nothing to do with the directives you posted.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]
then I get an internal server error when requesting /donate1, instead of /error.php
Because when you request /donate1 the above directives trigger an internal rewrite loop (which results in a 500 Internal Server Error response). /donate1 to /donate1.php to /donate1.php.php to /donate1.php.php.php etc. (see below).
MultiViews does not apply here because there is no file that /donate1 can conceivably map to, e.g. /donate1.php or /donate1.html or some other recognised resource, with a different file extension, that returns a text/html mime-type.
When you request /donate1 the following happens.
/donate1 does not map to a directory (1st condition) or a file (2nd condition) so is internally rewritten by this rule to donate1.php. Which is incorrect (but that is what this rule does).
The L flag then causes the current round of processing to stop and the rewrite engine starts over, passing the rewritten URL, ie. donate1.php back into the mix.
/donate1.php does not map to a directory or file so is rewritten to donate1.php.php.
The rewrite engine starts over...
/donate1.php.php does not map to a directory or file so is rewritten to donate1.php.php.php.
The rewrite engine starts over...
etc.
This repeats until 10 (default) internal rewrites are reached and the server "breaks" with a 500 error response. The server error log would contain the details of this error, for example:
AH00124: Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.
(You would very rarely need to change this internal redirect limit, though. Hitting it nearly always indicates an error in your configuration.)
Solution
You either remove your mod_rewrite directives entirely and just let MultiViews do its thing, OR you disable MultiViews and "correct" your mod_rewrite directives.
For example:
Options -MultiViews
ErrorDocument 404 /error.php
RewriteEngine On
# Rewrite extensionless URLs to ".php" if they exist
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule (.+) $1.php [L]
An optimisation... if your URLs (that map to .php files) don't contain dots then you could simply exclude URLs that contain dots so you don't unnecessarily test requests for your static resources (eg. image.jpg, styles.css, etc.) that already include a file extension (which naturally contain a dot before the file extension):
RewriteRule ^([^.]+)$ $1.php [L]
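Combined with the file-existence condition, the optimised version could be sketched as follows. This assumes the .htaccess file sits in the document root, as in the question. The ^ and $ anchors make the pattern match only when the whole URL path is free of dots.

```apache
Options -MultiViews
ErrorDocument 404 /error.php
RewriteEngine On
# Only URL paths without a dot are considered, which skips
# image.jpg, styles.css and requests that already end in .php
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)$ $1.php [L]
```

With these rules /donate is rewritten to /donate.php, while /donate1 is left alone and ends up at the ErrorDocument.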
Reference:
https://httpd.apache.org/docs/2.4/content-negotiation.html

mod-rewrite redirect but prevent direct access

I want to redirect all content to:
www.example.com/public/...
but prevent direct access to
www.example.com/public/file1/
www.example.com/public/file2/
etc
The final URL should be:
www.example.com/file1/
I've tried this for redirecting and it works, but I don't know how to prevent direct access:
RewriteCond %{REQUEST_URI} !/public/
RewriteRule ^(.*) public/$1 [L]
After spending an inordinate amount of time trying to solve this problem, I found that the solution lies with the under-documented REDIRECT_STATUS environment variable.
Add this to the beginning of your top-level /.htaccess code, and also to any .htaccess files you have under it (e.g. /public/.htaccess):
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} !=200
RewriteRule ^ /public%{REQUEST_URI} [L]
</IfModule>
Now, if the user requests example.com/file1 then they are served the file at /public/file1. However, if they request example.com/public/file1 directly then the server will attempt to serve the file at /public/public/file1, which will fail (unless you happen to have a file at that location).
IMPORTANT:
You need to add those lines to all .htaccess files, not just the top-level one in the web root, because if you have any .htaccess files below the web root (e.g. /public/.htaccess) then these will override the top-level .htaccess and users will again be able to access files in /public directly.
Note about variables and redirects:
Performing a redirect (or a rewrite) causes the whole process to start again with the new URI, so any variables that you set before the redirect will no longer be set afterwards. This is done deliberately, because usually you do not want the final result to depend on how you got there (i.e. whether it was via a direct request or via a redirect).
However, for those special occasions where you do want to know how you got to a particular URI, you can use REDIRECT_STATUS. Also, any environment variables set before the redirect (e.g. with SetEnvIf) will still be available after the redirect, but with REDIRECT_ prefixed to the name of the variable (so MY_VAR becomes REDIRECT_MY_VAR).
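The REDIRECT_ prefix can be illustrated with a small sketch that sets a marker using the E flag of RewriteRule. The variable name VIA_ROOT is hypothetical, and the layout (rules in the top-level /.htaccess, no rewrite rules in /public/.htaccess) is an assumption.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Second pass: the marker set on the first pass is now visible
# under its REDIRECT_-prefixed name, so leave the URL unchanged and stop
RewriteCond %{ENV:REDIRECT_VIA_ROOT} =1
RewriteRule ^ - [L]
# First pass: rewrite into /public and set the marker variable
RewriteRule ^ /public%{REQUEST_URI} [L,E=VIA_ROOT:1]
</IfModule>
```

This should behave like the REDIRECT_STATUS version: a direct request for /public/file1 is rewritten once to /public/public/file1 and fails there.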
Maybe you should clarify what's the expected behaviour when user tries to reach the real URL:
www.example.com/public/file1/
If by prevent you mean forbid, you could add a rule to respond with a 403:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} !/public/
RewriteRule ^(.*)$ /public/$1 [L]
RewriteCond %{REQUEST_URI} /public/
RewriteRule ^(.*)$ / [R=403,L]
</IfModule>
Update: The solution above doesn't work!
I realized my previous solution always throws the 403 so it's worthless. Actually, this is kinda tricky because the redirection itself really contains /public/ in the URL.
The solution that really worked for me is to append a secret query string to the redirection and check for this value on URL's containing /public/:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} !/public/
RewriteRule ^(.*)$ /public/$1?token=SECRET_TOKEN [L]
RewriteCond %{REQUEST_URI} /public/
RewriteCond %{QUERY_STRING} !token=SECRET_TOKEN
RewriteRule ^(.*)$ / [R=403,NC,L]
</IfModule>
This way www.example.com/file1/ will show file1, but www.example.com/public/file1/ will throw a 403 Forbidden error response.
Concerns about security of this SECRET_TOKEN are discussed here: How secure is to append a secret token as query string in a htaccess rewrite rule?
If your URLs are expected to have their own query string, like www.example.com/file1/?param=value, be sure to add the QSA flag:
RewriteRule ^(.*)$ /public/$1?token=SECRET_TOKEN [QSA,L]
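A different technique that avoids the token is to test THE_REQUEST, which holds the original request line sent by the browser and is not changed by internal rewrites. The sketch below is an alternative to the token approach, not the method used above, and assumes the rules live in the top-level .htaccess.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# THE_REQUEST looks like "GET /public/file1/ HTTP/1.1" and always
# reflects what the client asked for, even after an internal rewrite
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/public/
RewriteRule ^ - [F]
# Everything else is mapped into /public internally
RewriteCond %{REQUEST_URI} !^/public/
RewriteRule ^(.*)$ /public/$1 [L]
</IfModule>
```

The F flag returns 403 Forbidden for direct requests, while internally rewritten requests pass because their original request line does not start with /public/.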

url rewrite how to send everything else to index.php condition

Hi there,
This should be a simple task for anyone who knows, but I am new to Apache rewrites, so please bear with me.
I wrote 2 rewrite rules and they work. I need to write a third, so that everything else goes to the index.php file. The problem is that if I add the third rule, it is always applied, regardless of the first 2.
RewriteEngine On
RewriteRule ^new/?$ new.php [NC,L]
RewriteRule ^thanks(.*)$ thankyou.php [NC,L]
RewriteRule ^(.*)$ index.php
Thanks for help.
I believe the answer lies in the following paragraph about the L flag used with the RewriteRule directive:
If you are using RewriteRule in either .htaccess files or in
<Directory> sections, it is important to have some understanding of
how the rules are processed. The simplified form of this is that once
the rules have been processed, the rewritten request is handed back to
the URL parsing engine to do what it may with it. It is possible that
as the rewritten request is handled, the .htaccess file or <Directory>
section may be encountered again, and thus the ruleset may be run
again from the start. Most commonly this will happen if one of the
rules causes a redirect - either internal or external - causing the
request process to start over.
I think what happens is that after the rewrite is executed, somehow control is given back to the URL parsing engine and the rules are run again.
You can prevent this behaviour by adding a few rewrite conditions to the last rule:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^new/?$ new.php [NC,L]
RewriteRule ^thanks(.*)$ thankyou.php [NC,L]
# Only rewrite to index.php if the current request is not for an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
</IfModule>
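Another way to stop the second pass, sketched here as an alternative, is to let the rewrite targets themselves pass through untouched before any other rule is tried. Note that, unlike the version above, this one also sends requests for existing static files to index.php.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Already-rewritten targets: leave unchanged and stop processing
RewriteRule ^(new|thankyou|index)\.php$ - [L]
RewriteRule ^new/?$ new.php [NC,L]
RewriteRule ^thanks(.*)$ thankyou.php [NC,L]
# Everything else goes to index.php
RewriteRule ^ index.php [L]
</IfModule>
```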

How do I get the [L] flag of RewriteRule (.htaccess) really working?

To newcomers: While trying to comprehensively describe my problem and phrase my questions, I produced a huge amount of text. If you don't want to read the whole thing, my observations about the [L] flag apparently not working (the misconception from which it all sprang) are in the Additional observations section. Why I misunderstood the apparent behaviour is described in my answer, along with the solution to the problem.
Setup
I have the following code in my .htaccess file:
# disallow directory indexing
Options -Indexes
# turn mod_rewrite on
Options +FollowSymlinks
RewriteEngine on
# allow access to robots file
RewriteRule ^robots.txt$ robots.txt [NC,L]
# mangle core request handler address
RewriteRule ^core/(\?.+)?$ core/handleCoreRequest.php$1 [NC,L]
# mangle web file addresses (move them to application root folder)
# application root folder serves as application GUI address
RewriteRule ^$ web/index.html [L]
# allow access to images
RewriteRule ^(images/.+\.(ico|png|bmp|jpg|gif))$ web/$1 [NC,L]
# allow access to stylesheets
RewriteRule ^(css/.+\.css)$ web/$1 [NC,L]
# allow access to javascript
RewriteRule ^(js/.+\.js)$ web/$1 [NC,L]
# allow access to library scripts, styles and images
RewriteRule ^(lib/js/.+\.js)$ web/$1 [NC,L]
RewriteRule ^(lib/css/.+\.css)$ web/$1 [NC,L]
RewriteRule ^(lib/(.+/)?images/.+\.(ico|png|bmp|jpg|gif))$ web/$1 [NC,L]
# redirect all other requests to application address
# RewriteRule ^(.*)$ /foo/ [R]
My web application (and its .htaccess file) is located in foo subfolder of DOCUMENT_ROOT (accessed from browser as http://localhost/foo/). It has PHP core part located in foo/core and JavaScript GUI part located in foo/web. As can be seen from the code above, I want to allow access only to single core script that handles all requests from GUI and to 'safe' web files and redirect all other requests to base application address (last commented directive).
Problem
Behaviour
It works until I try the last part by uncommenting the last redirecting directive. If I comment some more lines, the appropriate page parts stop working, etc.
However, when I uncomment last line, which should be performed only when matching of all previous rules fails (at least that's what I understand), page goes into redirection cycle (Firefox throws error page with something like "This page isn't redirecting properly"), because it's redirecting to http://localhost/foo/ again and again and again, forever.
Questions
What I don't understand is this processing of this rule:
RewriteRule ^$ web/index.html [L],
specifically the [L] flag. The flag apparently doesn't work for me. When the last line is commented, it correctly redirects, but when I uncomment it, it is always processed, even though rewriting should stop on [L] flag. Anyone got any ideas?
Also, on a sidenote, I'd be thrilled to know why my following attempt at fixing it doesn't work either:
RewriteEngine on
RewriteRule ^core/(\?.+)?$ core/handleCoreRequest.php$1 [NC,L]
RewriteRule ^(.*)$ web/$1 [L]
RewriteRule ^.*$ /foo/ [L]
This actually doesn't work at all. Even if I remove the last line, it still doesn't redirect anything correctly. How does the redirecting work in the first example, if it doesn't work in the second?
It would also be of great benefit to me if anybody knew a way to actually debug these directives. I spent hours on this without even the slightest clue what could possibly be wrong.
Additional observations
After trying the advice given by bbadour (not that I haven't tried it before, but now that I had a second opinion, I gave it another shot) and it didn't work, I've come up with the following observation. By rewriting last line to this:
RewriteRule ^(.*)$ /foo/?uri=$1 [R,L]
or this
RewriteRule ^(.*)$ /foo/?uri=%{REQUEST_URI} [R,L]
and using Firebug's Net panel, I found out more evidence, that the [L] flag is clearly not working as expected in the previously mentioned RewriteRule ^$ web/index.html [L] rule (let's call it THE RULE from now on). In first case I get [...]uri=web/index.html, in second case [...]uri=/foo/web/index.html. That means that THE RULE gets executed (rewrites ^$ to web/index.html), but the rewriting doesn't stop there. Any more ideas, please?
After hours of searching and testing, I finally found the real problem and solution. Hopefully this will help somebody else too, when they come across the same problem.
Cause of observed behavior
.htaccess file is processed after every redirect (even without [R] flag),
which means that after the RewriteRule ^$ web/index.html [L] is processed, mod_rewrite correctly stops rewriting, goes to the end of the file, redirects correctly to /foo/web/index.html, and then the server starts processing .htaccess file for the new location, which is the same file. Now only the last rewrite rule matches and redirects back to /foo/ (this time with [R], so the redirect can be observed in browser) ... and the .htaccess file is processed again, and again, and again...
Once more for clarity: Because only the hard redirects can be observed, it seems like the [L] flag is ignored, but it is not so. Instead, the .htaccess is processed two times redirecting back and forth between /foo/ and /foo/web/index.html.
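One way to express "apply the final redirect only to the original browser request" is the REDIRECT_STATUS environment variable, which is empty on the first pass and set after an internal rewrite. The sketch below is an alternative to the THE_REQUEST approach this answer goes on to use, and abbreviates the whitelist from the question.

```apache
RewriteEngine on
RewriteRule ^$ web/index.html [L]
RewriteRule ^(css/.+\.css)$ web/$1 [NC,L]
# ... remaining whitelist rules as in the question ...
# Redirect everything else, but only on the first pass:
# after an internal rewrite REDIRECT_STATUS is no longer empty
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^ /foo/ [R,L]
```

A request for /foo/ is rewritten to web/index.html and served, because the final rule is skipped on the second pass. A direct request for /foo/web/index.html still gets the hard redirect to /foo/.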
Solution
Disallow direct access to subfolder
To virtually move subdirectory to application root directory, additional complex conditional rewrites must be used. Variable THE_REQUEST is useful for distinguishing between hard and soft redirects:
RewriteCond %{THE_REQUEST} ^GET\ /foo/web/
RewriteRule ^web/(.*) /foo/$1 [L,R]
For this rewrite rule to be matched, two conditions must apply. First, on the second line, the "local URI" must start with web/ (which corresponds to the absolute web URI /foo/web/). Second, on the first line, the real request URI must start with /foo/web/ too. Together this means that the rule only matches when a file inside the web/ subfolder is requested directly from the browser, in which case we want to do a hard redirect.
Redirect to allowed content from root to subfolder (soft)
RewriteCond $1 !^web/
RewriteCond $1 ^(.+\.(html|css|js|ico|png|bmp|jpg|gif))?$
RewriteRule ^(.*)$ web/$1 [L,NC]
We want to redirect to allowed content only if we haven't done it already, hence the first condition. Second condition specifies mask for allowed content. Anything matching this mask will be softly redirected, possibly returning 404 error if the content doesn't exist.
Hide all content not in subfolder or not allowed
RewriteRule !^web/ /foo/ [L,R]
This will do a hard redirect to the application root for all URIs not beginning with web/ (and remember, the only requests that can begin with web/ at this point are internal redirects for allowed content).
Real example
My code shown in my "question" after using solution tips mentioned above gradually transformed into the following:
# disallow directory indexing
Options -Indexes
# turn mod_rewrite on
Options +FollowSymlinks
RewriteEngine on
# allow access to robots file
RewriteRule ^robots.txt$ - [NC,L]
# mangle core request handler address
# disallow direct access to core request handler
RewriteCond %{THE_REQUEST} !^(GET|POST)\ /foo/core/handleCoreRequest.php
RewriteRule ^core/handleCoreRequest.php$ - [L]
# allow access to request handler under alias
RewriteRule ^core/$ core/handleCoreRequest.php [NC,QSA,L]
# mangle GUI files addressing (move to application root folder)
# disallow direct access to GUI subfolder
RewriteCond %{THE_REQUEST} ^GET\ /foo/web/
RewriteRule ^web/(.*) /foo/$1 [L,R]
# allow access only to correct filetypes in appropriate locations
RewriteCond $1 ^$ [OR]
RewriteCond $1 ^(images/.+\.(ico|png|bmp|jpg|gif))$ [OR]
RewriteCond $1 ^(css/.+\.css)$ [OR]
RewriteCond $1 ^(js/.+\.js)$ [OR]
RewriteCond $1 ^(lib/js/.+\.js)$ [OR]
RewriteCond $1 ^(lib/css/.+\.css)$ [OR]
RewriteCond $1 ^(lib/(.+/)?images/.+\.(ico|png|bmp|jpg|gif))$
RewriteRule ^(.*)$ web/$1 [L,NC]
# hide all files not in GUI subfolder that are not whitelisted above
RewriteRule !^web/ /foo/ [L,R]
What I don't like about this approach is that the application root folder must be hardcoded in .htaccess file (as far as I know), so the file must be generated on application install, not simply copied.
To debug, try simplifying your regex and the URL you request (use just a part of the full URL you want to match), and see if that works. Then, step by step, add more bits to the regex and the test URL until you find where things stop working properly.
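If you have access to the server configuration, mod_rewrite can also log each step it takes. The sketch below assumes Apache 2.4 or later; on Apache 2.2 the RewriteLog and RewriteLogLevel directives served the same purpose.

```apache
# Assumption: Apache 2.4+, placed in httpd.conf or a <VirtualHost>
# block (LogLevel is not allowed in .htaccess).
# Keeps other modules at "alert" and raises mod_rewrite to trace level 3.
LogLevel alert rewrite:trace3
```

The trace lines are written to the error log and show which patterns were tested and what each rule rewrote the URL to.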
Try using:
RewriteRule ^(.*)$ /foo/ [R,L]
If it still loops, put a RewriteCond in front of it to skip the rule if the request is already for /foo/.
