Apache2 htaccess RewriteCond with ErrorDocument 404

Apache2 htaccess RewriteCond
I found a puzzling condition that works, but I don't know why :)
Files
/error.php
/donate.php
/test/index.php
In an .htaccess file I use:
ErrorDocument 404 /error.php
RewriteEngine On
# WHY IS THIS LINE NEEDED TO MAKE IT WORK?
RewriteCond %{REQUEST_FILENAME} ^$
#
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]
With this:
http://localhost/donate internally calls http://localhost/donate.php
while
http://localhost/donate1 internally calls http://localhost/error.php
and
http://localhost/test/ internally calls http://localhost/test/index.php
So far so good.
But when I comment it out:
#RewriteCond %{REQUEST_FILENAME} ^$
then I get an Internal Server Error when calling
/donate1 instead of /error.php.
Can someone explain the steps and why this happens?

#RewriteCond %{REQUEST_FILENAME} ^$
Because your directives are not actually doing what you think they are doing. In fact, with that "hacky" condition uncommented they are not doing anything at all, except to prevent the 500 Internal Server Error (which is due to an internal rewrite loop because the rule is strictly incorrect).
That condition checks if the REQUEST_FILENAME server variable is empty. It is never empty, so always fails, so the RewriteRule directive that follows is never triggered.
You could remove your mod_rewrite directives entirely and you'll get the same results.
http://localhost/donate internally calls http://localhost/donate.php
It's most probably MultiViews (mod_negotiation) that is rewriting /donate to /donate.php. Not the directives you posted (which, as I mentioned, don't actually do anything).
http://localhost/test/ internally calls http://localhost/test/index.php
This is caused by mod_dir (DirectoryIndex). Again, nothing to do with the directives you posted.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) $1.php [L]
then I get an Internal Server Error when calling /donate1 instead of /error.php
Because when you request /donate1 the above directives trigger an internal rewrite loop (which results in a 500 Internal Server Error response). /donate1 to /donate1.php to /donate1.php.php to /donate1.php.php.php etc. (see below).
MultiViews does not apply here because there is no file that /donate1 can conceivably map to, e.g. /donate1.php or /donate1.html or some other recognised resource, with a different file extension, that returns a text/html mime-type.
When you request /donate1 the following happens.
/donate1 does not map to a directory (1st condition) or a file (2nd condition) so is internally rewritten by this rule to donate1.php. Which is incorrect (but that is what this rule does).
The L flag then causes the current round of processing to stop and the rewrite engine starts over, passing the rewritten URL, ie. donate1.php back into the mix.
/donate1.php does not map to a directory or file so is rewritten to donate1.php.php.
The rewrite engine starts over...
/donate1.php.php does not map to a directory or file so is rewritten to donate1.php.php.php.
The rewrite engine starts over...
etc.
This repeats until 10 (default) internal rewrites are reached and the server "breaks" with a 500 error response. The server error log would contain the details of this error, for example:
AH00124: Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.
(You would very rarely need to change this internal redirect limit. Hitting it nearly always indicates an error in your configuration.)
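The steps above can be sketched as a small simulation. This is only a simplified model of the behaviour described here, not Apache's actual implementation: the function name simulate and the set of existing paths are made up for illustration, and MultiViews is assumed to be out of the picture.

```python
# Simplified model of the rewrite loop described above; not Apache's real code.
LIMIT = 10  # default number of internal redirects before Apache gives up


def simulate(url_path, existing_paths, limit=LIMIT):
    """Repeatedly apply 'RewriteRule (.*) $1.php [L]' guarded by !-d / !-f.

    existing_paths is a hypothetical set of URL-paths that map to real
    files or directories. Returns (final_path, trace) or raises
    RuntimeError once the redirect limit is reached.
    """
    trace = [url_path]
    for _ in range(limit):
        # RewriteCond !-d and !-f: the rule only fires if nothing exists here.
        if url_path in existing_paths:
            return url_path, trace
        # RewriteRule (.*) $1.php: blindly append ".php" and start over.
        url_path = url_path + ".php"
        trace.append(url_path)
    raise RuntimeError("exceeded the limit of %d internal redirects" % limit)


if __name__ == "__main__":
    files = {"/donate.php", "/error.php"}
    print(simulate("/donate", files)[1])
    try:
        simulate("/donate1", files)
    except RuntimeError as exc:
        print("500:", exc)
```

In this model /donate stops after one rewrite because /donate.php exists, while /donate1 never reaches an existing path and runs into the limit.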
Solution
You either remove your mod_rewrite directives entirely and just let MultiViews do its thing, OR you disable MultiViews and "correct" your mod_rewrite directives.
For example:
Options -MultiViews
ErrorDocument 404 /error.php
RewriteEngine On
# Rewrite extensionless URLs to ".php" if they exist
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule (.+) $1.php [L]
An optimisation... if your URLs (that map to .php files) don't contain dots then you could simply exclude URLs that contain dots so you don't unnecessarily test requests for your static resources (eg. image.jpg, styles.css, etc.) that already include a file extension (which naturally contain a dot before the file extension):
RewriteRule ^([^.]+)$ $1.php [L]
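One detail worth checking with a "no dots" pattern is anchoring. The sketch below uses Python's re module as a stand-in for the PCRE matching that mod_rewrite performs (the two behave the same for a pattern this simple); the helper name captured is made up for illustration. It suggests that without ^ and $ the pattern still matches the part of image.jpg before the dot, so only the anchored form actually skips URLs that contain a dot.

```python
import re

UNANCHORED = re.compile(r"([^.]+)")
ANCHORED = re.compile(r"^([^.]+)$")


def captured(pattern, url_path):
    """Return what $1 would hold, or None if the pattern does not match."""
    match = pattern.search(url_path)
    return match.group(1) if match else None


if __name__ == "__main__":
    for path in ("donate", "image.jpg", "css/styles.css"):
        print(path, captured(UNANCHORED, path), captured(ANCHORED, path))
```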
Reference:
https://httpd.apache.org/docs/2.4/content-negotiation.html

Related

Mod_rewrite only working with certain folder names

Context
I'm using mod_rewrite to make my links better for SEO. I made the following rule for my page expanded_debate.php:
Options -MultiViews
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^poll/([0-9a-zA-Z_-]+)/([0-9]+) expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
When I input this format in the URL (poll/filename/10, for example) I get a 404 error:
Object not found!
The requested URL was not found on this server. If you entered the URL manually please check your spelling and try again.
If you think this is a server error, please contact the webmaster.
Error 404
localhost
Apache/2.4.46 (Unix) OpenSSL/1.1.1h PHP/7.4.12 mod_perl/2.0.11 Perl/v5.32.0
However, when I change the first folder name to certain words, such as "debate" and "expanded_debate" (but not "expandedebate"), the file loads after page refresh. For example:
RewriteRule ^debate/([0-9a-zA-Z_-]+)/([0-9]+) expanded_debate.php?poll_title=$1&pollid=$2 [NC,QSA,L]
works fine.
I have an older .htaccess file, titled ".htaccess11", with the following info, in case it's of any use:
#forbids users from going to forbidden pages
IndexIgnore *
Options -Indexes
RewriteEngine On
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{REQUEST_URI} !^/\.well-known/acme-challenge/[0-9a-zA-Z_-]+$
RewriteCond %{REQUEST_URI} !^/\.well-known/cpanel-dcv/[0-9a-zA-Z_-]+$
RewriteCond %{REQUEST_URI} !^/\.well-known/pki-validation/(?:\ Ballot169)?
RewriteCond %{REQUEST_URI} !^/\.well-known/pki-validation/[A-F0-9]{32}\.txt(?:\ Comodo\ DCV)?$
RewriteRule ^(.*)$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
#404 error directions
ErrorDocument 404 /404.php
Question
Any idea why only certain terms in the first folder position ("^debate" in example above) work when using mod_rewrite?
There are no "poll" folders in my project, if that's of any interest.
Let me know if there are any questions.
The line
RewriteCond %{REQUEST_FILENAME}\.php -f
means "Take the requested URL, map it to a full local path in the normal way, append .php to the resulting path, and then process the following rewrite rule only if there is an existing regular file at the modified path".
For example, the URL "poll/filename/10" will be rewritten only if there is a file called "poll/filename/10.php" in the relevant location.
Since the value of the AcceptPathInfo directive is evidently set to On, this condition will also be met if there is an existing file called "poll.php" or "poll/filename.php". That is why the rewrite rule works when you change "poll" to "debate" or "expanded_debate" – there are existing files called "debate.php" and "expanded_debate.php".
In any case, it sounds like this behavior is not what was intended. Removing the -f condition should give the desired result. Or, to prevent the rewrite rule from making existing files inaccessible, you could replace it with:
RewriteCond %{REQUEST_FILENAME} !-f
The exclamation point negates the -f test: "continue only if this file does not exist".
If you are using the %{REQUEST_FILENAME} server variable (anywhere), you should be aware of how the AcceptPathInfo directive will affect this, and consider setting that directive explicitly in the same .htaccess file.
If Options +MultiViews is in effect, then %{REQUEST_FILENAME} will match existing files whether or not the extension is included in the request (GET /foo will match an existing file "foo.php", "foo.html", etc.). And GET /foo.php will match in any case. In that case, omit the string "\.php" from the original condition.
Other configuration may have an effect, too. The important point is that, unlike %{REQUEST_URI}, %{REQUEST_FILENAME} invokes all the processing that Apache would otherwise do to translate a URL into a local path.
NB: although I don't think it was the intention here, you actually might want to test for the existence of a local file as part of this rule. You could use a RewriteCond to check whether the back-end data file for a given poll has been manually created, and return 404 by default if it has not. That would be a simple way to prevent users from making up their own poll URLs at will.

REQUEST_URI not matching explicit path and filename

Really stumped, because form and syntax seem fine.
RewriteCond for REQUEST_URI is not matching the explicit path and filename. When isolated, RewriteCond for REQUEST_FILENAME matches just fine. I have verified using phpinfo() that REQUEST_URI contains the leading slash, and have tested without the leading slash, also.
The goal here is to know that the request is for this file and, if it doesn't exist, then throw a 410.
RewriteCond %{REQUEST_URI} ^/dir1/dir2/dir3/v_9991_0726dd5b5e8dd67a214c0c243436d131_all\.css$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]
I don't want to omit the first Cond, because I only want to do this for a handful of files similar to this one.
UPDATE I
Trying to get a definitive test. Test set-up:
testmee.txt does not exist
request is for testmee.txt in the root
verified the request_uri is matching, by redirecting to google
cannot get 410 when using only first Cond
(when using only first Cond, server serves 404, not 410)
(using both Conds, server serves 404, not 410)
CAN get 410 when using only second Cond
RewriteCond %{REQUEST_URI} ^/testmee\.txt$
#RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]
versus
#RewriteCond %{REQUEST_URI} ^/testmee\.txt$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]
UPDATE II
Response for MrWhite:
ughh, same symptom. Might have to live with googlebot hitting 404s instead of a desired 410 for outdated css/js. No biggie in the long run, probably.
Thank you for that request_uri test redirect. Everything is working normally in those tests. Page names, etc. are returned as expected, in the var= rewrite URL.
At this point, I think it must be some internal handling of 404s related to the file type extensions. See clue below. I have Prestashop shopping cart software, and it must be forcing 404s on file types.
This will redirect to google (to affirm pattern match):
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^testmee\.txt$ http://www.google.com/ [L]
(L flag is needed or else other Rules further down will interfere.)
This will continue to return 404 instead of 410:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^testmee\.txt$ - [NC,R=410]
And as a control test, this will return a 410:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*$ - [NC,R=410]
If file type is css in the above failed test, then my custom 404 controller does not get invoked. I just get a plain 404 Response, w/o the custom 404 that is wrapped with all my site templating.
For example:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^testmee\.css$ - [NC,R=410]
I'm afraid I've wasted some of your time. My apologies. I never imagined that Prestashop's code would be forcing 404 based on file type, but I can't see any other explanation. I could dig into it and maybe find the spot in the Controllers that is doing it. Gotta take a break, though.
This isn't really a solid answer, more a list of things to try to help debug this and to quash some myths...
I have verified using phpinfo() that REQUEST_URI contains the leading slash
Yes, the REQUEST_URI Apache server variable does indeed contain the leading slash. It contains the full URL-path.
However, the REQUEST_URI Apache server variable is not necessarily the same as the $_SERVER['REQUEST_URI'] PHP superglobal - in fact, they aren't really the same thing at all. There are some significant differences between these variables (in some ways it's perhaps a bit unfortunate they share the same name). Notably, the PHP superglobal contains the initial URL from the request and includes the query string (if any) and is not %-decoded. Whereas the Apache server variable of the same name contains the rewritten URL (not necessarily the requested URL) and does not contain the query string and is %-decoded.
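As a rough illustration of two of those differences (query string and %-decoding), here is a small sketch. The function names and the sample request are made up, and it deliberately ignores rewriting, which is the third difference mentioned above.

```python
from urllib.parse import unquote, urlsplit


def php_style_request_uri(request_target):
    """Model of PHP's $_SERVER['REQUEST_URI']: the raw target as requested."""
    return request_target


def apache_style_request_uri(request_target):
    """Model of Apache's %{REQUEST_URI} before any rewrite has happened:
    URL-path only (no query string), with %-escapes decoded."""
    return unquote(urlsplit(request_target).path)


if __name__ == "__main__":
    target = "/dir%201/file.css?v=2"  # hypothetical request
    print("PHP   :", php_style_request_uri(target))
    print("Apache:", apache_style_request_uri(target))
```

So a regex written against what phpinfo() shows can fail against the Apache variable, even before any other directive has rewritten the URL.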
So, that's why I was asking whether you have other mod_rewrite directives. You could very well have had a conflict. If another directive rewrites the URL, then the condition will never match (despite the PHP superglobal suggesting that it should).
It seemed that if I put this at the top, the Last flag would end processing for that trip through, return the 410
This directive should certainly go at the top of the .htaccess file, to avoid the URL being rewritten earlier. The L flag is actually superfluous when used with an R=410 (or any status other than a 3xx); it is implied in this case.
Then I change the result to be "throw a 410" and it throws a 404.
That can certainly be caused by a server-side override. But you are able to throw a 410 in other situations, so that would seem to rule that out. However, you can reset the error document in .htaccess if in doubt (unless you are already using a custom error document):
ErrorDocument 410 default
RewriteCond %{REQUEST_URI} ^/dir1/dir2/dir3/v_9991_0726dd5b5e8dd67a214c0c243436d131_all\.css$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]
Whilst this doesn't really make a difference to how the rule behaves, you don't need the first RewriteCond directive that checks against the REQUEST_URI. You should be doing this check in the RewriteRule pattern instead (which will be more efficient, since this is processed first). For example:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^dir1/dir2/dir3/v_9991_0726dd5b5e8dd67a214c0c243436d131_all\.css$ - [NC,R=410]
The NC flag should be superfluous.
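The reason the RewriteRule pattern above has no leading slash, while the REQUEST_URI condition did, is that in .htaccess the pattern is matched against the URL-path relative to the directory containing the .htaccess file. A minimal sketch of that, with a made-up helper name and a shortened, hypothetical file name, assuming the .htaccess file sits in the document root:

```python
import re


def pattern_input(request_uri, htaccess_url_dir="/"):
    """Model of what a RewriteRule pattern sees in .htaccess context:
    the URL-path with the directory prefix (including its slash) removed."""
    if not request_uri.startswith(htaccess_url_dir):
        raise ValueError("request is outside this .htaccess directory")
    return request_uri[len(htaccess_url_dir):]


if __name__ == "__main__":
    uri = "/dir1/dir2/dir3/example_all.css"  # hypothetical file name
    seen = pattern_input(uri)
    print(seen)
    print(bool(re.search(r"^dir1/dir2/dir3/example_all\.css$", seen)))
    print(bool(re.search(r"^/dir1/dir2/dir3/example_all\.css$", seen)))
    print(bool(re.search(r"^/dir1/dir2/dir3/example_all\.css$", uri)))
```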
Still, a conflict with existing directives is the most probable cause. Remove all other directives. Do you still see the same behaviour?
You can test the value of the REQUEST_URI server variable. You could either issue a redirect and pass the REQUEST_URI as a URL parameter, or set environment variables (but you will need to look out for REDIRECT_<var> for each rewrite).
For example, at the top of your .htaccess (or wherever you are trying this):
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^ /test.php?var=%{REQUEST_URI} [NE,R,L]
Create a dummy test.php file to avoid an internal subrequest to an error document.
I was unable to determine why the server configuration or site code was overriding the '410 Gone' response set in .htaccess with a 404 response, so I had to do something like this to tell Googlebot to stop hunting for CSS/JS files that get purged periodically (and renamed when regenerated).
in .htaccess:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule v_(.*)_(.*)$ /410response.php [L]
in 410response.php placed in root:
<?php header($_SERVER['SERVER_PROTOCOL'].' 410 Gone');
UPDATE I
The 404 response when attempting to use .htaccess for the 410 directive was being forced by the server, apparently because the server had a custom 410 document that routed to a 404. Adding a directive to prevent that allowed .htaccess to return a 410 for pattern matches in RewriteRule. (I thought I had already checked this yesterday, since MrWhite said in his answer above to control for the server possibly having a custom 410; when I made the check today, it did work and indicated that the server's 410-to-404 redirection was overriding my 410 directive.)
ErrorDocument 410 default
RewriteRule test\.txt$ - [NC,R=410]
MrWhite! I located this solution in one of your posts on Stack Exchange.

.htaccess returns 500 Internal Server Error when slash on filename

My .htaccess file has a rewrite in it so instead of seeing example.com/login.php, one will see example.com/login.
Here is what is written in that file:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]
It works perfectly except for one thing: I am creating a file called api.php that will serve as the api.
I want clients to go to example.com/api/logout and example.com/api/auth etc. even though api is not a folder but instead a PHP file.
Problem is, whenever I go to a file like example.com/api/logout (or even example.com/login/foo), I always get a 500 Internal Server Error. If I go to example.com/api.php/logout, I get a 404 error.
I checked the error file for the 500 Internal Server Error, and it says Request exceeded the limit of 10 internal redirects due to probable configuration error which makes sense because mod_rewrite runs until all conditions are satisfied.
How should I change my rewrite to fix this?
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]
When you request a URL of the form /api/logout you will get a rewrite loop (500 error) because the REQUEST_FILENAME maps to <document-root>/api (which if you append .php is indeed a valid file), but you end up appending .php to /api/logout (the URL-path) in the RewriteRule directive (clearly not the same thing). /api/logout.php then gets rewritten to /api/logout.php.php etc. etc. repeatedly appending .php to the URL-path, while all the time checking that /api.php is a valid file.
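The mismatch described above can be sketched with a small model. Everything here is an assumption for illustration: the FILES and DIRS sets stand in for a hypothetical document root, and request_filename is only a simplified approximation of how REQUEST_FILENAME stops at the first path segment that is not a directory.

```python
import re

FILES = {"api.php", "login.php"}  # hypothetical files in the document root
DIRS = set()                      # no sub-directories in this sketch
RULE = re.compile(r"^(.+?)/?$")   # pattern from the question


def request_filename(url_path):
    """Simplified REQUEST_FILENAME: walk the segments and stop at the
    first one that is not an existing directory."""
    walked = []
    for segment in url_path.split("/"):
        walked.append(segment)
        if "/".join(walked) not in DIRS:
            break
    return "/".join(walked)


def one_pass(url_path):
    """Apply the three conditions and the rule once; None means no rewrite."""
    fs_path = request_filename(url_path)
    if fs_path in FILES or fs_path in DIRS:  # !-f and !-d
        return None
    if fs_path + ".php" not in FILES:        # %{REQUEST_FILENAME}.php -f
        return None
    return RULE.match(url_path).group(1) + ".php"


def run(url_path, limit=10):
    """Repeat one_pass until nothing changes or the limit is reached."""
    trace = [url_path]
    for _ in range(limit):
        rewritten = one_pass(url_path)
        if rewritten is None:
            return trace
        url_path = rewritten
        trace.append(url_path)
    raise RuntimeError("limit of %d internal redirects exceeded" % limit)


if __name__ == "__main__":
    print(run("login"))
    try:
        run("api/logout")
    except RuntimeError as exc:
        print("500:", exc)
```

In this model the condition keeps testing api.php (which exists) while the rule keeps appending .php to api/logout, so the loop never terminates on its own.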
If all your files are in the document root, then you can change your RewriteRule pattern to only match the first path segment. For example:
RewriteRule ^([^/]+) $1.php [L]
However, if you have files at different directory depths (which you are referencing without the .php extension) then you might need to create some specific rules / exceptions.
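A quick way to see what the first-segment pattern captures, and why deeper directory structures need extra care, is to try it on a few paths. This sketch uses Python's re module in place of mod_rewrite's regex engine, and the helper name rewrite_target is made up.

```python
import re

FIRST_SEGMENT = re.compile(r"^([^/]+)")


def rewrite_target(url_path):
    """Return what '$1.php' would expand to, or None if there is no match."""
    match = FIRST_SEGMENT.match(url_path)
    return match.group(1) + ".php" if match else None


if __name__ == "__main__":
    for path in ("login", "api/logout", "admin/users/list"):
        print(path, "->", rewrite_target(path))
```

Note that admin/users/list maps to admin.php here, which is only correct if admin is a PHP file in the document root rather than a directory.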
If I go to example.com/api.php/logout, I get a 404 error.
This will result in a 404 if additional pathname information (PATH_INFO) has been disabled (although it is usually enabled by default for PHP files). You can explicitly enable this at the top of your .htaccess file:
AcceptPathInfo On

.htaccess RewriteCond for existing and non-existing files

I am trying to use .htaccess to redirect all url requests of a certain subpath ("URL/somefolders/main/..") to one basefile named "_index.php". So I implemented the following .htaccess to the "folder" URL/somefolders/main/ :
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^?]*)$ /main/_index.php?oldpath=$1 [NC,L,QSA]
</IfModule>
The redirection works fine for all non-existent files, but if the file exists then it is called without redirection. I suppose this is because I told it to do so with the "!" in the RewriteCond, but all my attempts to change it failed.
How do I have to change the above code to redirect all files (existing or not)?
Edit:
All my attempts still end up ineffective or erroneous.
The latter with the Apache log error:
Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.
Currently I fear that I did in fact correctly allow non-existent files in most of my earlier attempts, but ran into an endless loop due to included files. Is this possible? And if so, can .htaccess distinguish between "internal" and "external" file requests?
Your original rules are the most common implementation you'll see, where REQUEST_FILENAME is checked for existing files or directories to prevent things like CSS and images from being rewritten. But that's not what you want.
So you correctly attempted to remove the RewriteCond directives but ended up with an infinite rewrite loop. That is likely because the subsequent RewriteRule is also attempting to rewrite _index.php back to itself.
You can fix that by adding a RewriteCond which specifically matches _index.php to prevent it from looping on itself.
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
# Don't apply the rewrite to _index.php to prevent looping
RewriteCond %{REQUEST_URI} !main/_index\.php
RewriteRule ^([^?]*)$ /main/_index.php?oldpath=$1 [NC,L,QSA]
</IfModule>
I'll also simplify the matched group in RewriteRule. ([^?]*) captures everything up to the first ?, but the expression received by RewriteRule will never include the query string or ? anyway. You may instead simply use (.*) to capture whatever is present.
RewriteRule (.*) /main/_index.php?oldpath=$1 [NC,L,QSA]
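To see why the extra condition matters, here is a toy model of the explanation given above. It only models the assumption that, without the guard, the rewritten URL is fed back into the same rule; the names GUARD and run are made up, and real Apache behaviour has more moving parts.

```python
import re

GUARD = re.compile(r"main/_index\.php")  # pattern from the RewriteCond above


def run(request_uri, guarded, limit=10):
    """Count how many times the rule fires before processing settles.

    Raises RuntimeError if the (assumed) internal redirect limit is hit.
    """
    rewrites = 0
    while rewrites < limit:
        # RewriteCond %{REQUEST_URI} !main/_index\.php
        if guarded and GUARD.search(request_uri):
            return rewrites
        # RewriteRule (.*) /main/_index.php?oldpath=$1
        request_uri = "/main/_index.php"
        rewrites += 1
    raise RuntimeError("limit of %d internal redirects exceeded" % limit)


if __name__ == "__main__":
    print(run("/main/some/page", guarded=True))
    try:
        run("/main/some/page", guarded=False)
    except RuntimeError as exc:
        print("500:", exc)
```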

How do I redirect all but one url to a script

I'm trying to get www.example.com and www.example.com/index.html to go to index.html, but I want all other urls e.g. www.example.com/this/is/another/link to still show www.example.com/this/is/another/link but be processed by a generic script. I've tried
RewriteEngine on
RewriteCond %{REQUEST_URI} !^index\.html$
RewriteCond %{REQUEST_URI} !^$
RewriteRule ^(.*)$ mygenericscript.php [L]
but it won't work. Can someone please help?
Instead of testing what %{REQUEST_URI} is, you can just test whether the resource exists:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* mygenericscript.php
This prevents your static resources (images, stylesheets, etc.) from being redirected if they're handled through the same directory your .htaccess is in as well.
What's probably happening now is that you're seeing an internal server error, caused by an infinite internal redirection loop when you try to access anything that isn't / or /index.html. This is because .* matches every request, and after you rewrite to mygenericscript.php the first time, the rule set is reprocessed (because of how mod_rewrite works in the context that you're using it in).
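A further detail that may contribute: %{REQUEST_URI} always starts with a slash, so the two negated conditions in the question can never exclude anything. The sketch below checks this with Python's re module standing in for mod_rewrite's regex engine; the helper name rule_applies and the "fixed" patterns are illustrative assumptions, not part of the original answer.

```python
import re

ORIGINAL = [r"^index\.html$", r"^$"]  # patterns from the question, each negated with "!"
FIXED = [r"^/index\.html$", r"^/$"]   # hypothetical variants with the leading slash


def rule_applies(request_uri, patterns):
    """True when every negated condition passes, i.e. no pattern matches."""
    return all(re.search(p, request_uri) is None for p in patterns)


if __name__ == "__main__":
    for uri in ("/", "/index.html", "/this/is/another/link"):
        print(uri, rule_applies(uri, ORIGINAL), rule_applies(uri, FIXED))
```

With the original patterns the rule applies to every request, including / and /index.html.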
The easiest way to do this is to install a 404 handler, which gets executed when the server does not find a file to display.
ErrorDocument 404 /mygenericscript.php
or
ErrorDocument 404 /cgi-bin/handler.cgi
or similar should do the trick.
It is not that RewriteRules cannot be used for this; they are just tricky to set up and require in-depth knowledge of how Apache handles requests. It is a bit of a black art.
It appears you're using PHP, so you could also use auto_prepend_file or auto_append_file:
http://php.net/manual/en/ini.core.php
