HTTP 404 Not Found when the URL contains a % symbol - .htaccess

A lot of URLs are showing as 404 Not Found. Of course, there is a badly formed query string in the URL:
http://www.example.com/ref=http%3A%2F%2Fwww.example.org/
The above URL never reaches this .htaccess rule:
RewriteRule ^(.*)$ index.php?request_url=$1 [QSA,L]
If that URL reached the above .htaccess rule, I could simply add R=301, but it never gets that far and a 404 error is shown instead.

It won't work because the URL path contains encoded path separators (%2F for / and %5C for \).
Apache has security limitations for these kinds of requests: by default it refuses them with a 404.
See these URLs for more info:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-0450
http://securitytracker.com/id/1018110 (Look at section 4. Solution)
To make it work, either pass the decoded request or enable AllowEncodedSlashes in the Apache config and restart the Apache service:
http://httpd.apache.org/docs/current/mod/core.html#allowencodedslashes
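As a minimal sketch of the second option, assuming you can edit the virtual host configuration (AllowEncodedSlashes is not permitted in .htaccess), something like the following should let such requests through to the .htaccess rule. The ServerName is taken from the question. NoDecode is my choice here rather than On: it accepts the URL but leaves %2F encoded instead of turning it into a real slash.

```apache
<VirtualHost *:80>
    ServerName www.example.com
    # Accept encoded slashes in the path, but do not decode them
    AllowEncodedSlashes NoDecode
</VirtualHost>
```

Setting it inside the virtual host that serves the site, rather than only globally, avoids relying on the value being inherited.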

Related

Apache error Document automatic re-write rule

I like to simplify code where possible, but I am not too familiar with .htaccess. I had the error document redirect rule hard-coded:
ErrorDocument 403 http://example.com/error/404
Then I changed it to:
ErrorDocument 403 http://%{HTTP_HOST}/error/404
My question is: so that the .htaccess file does not have to be modified manually, is there a way to tell whether the request is HTTPS or HTTP? With the above example, if I use HTTPS I will have to hard-code https. I would like this to be checked automatically.
Don't use an absolute URL in the ErrorDocument directive
ErrorDocument 403 http://example.com/error/404
You shouldn't be using an absolute URL in the ErrorDocument directive to begin with! This will trigger a 302 response (ie. a 302 temporary redirect) to the target URL. So, this won't send a 403 (or 404) response back to the user-agent on the first response.
(This format of the ErrorDocument directive should only be used in very exceptional circumstances since you also lose a lot of information about the URL that triggered the response in the first place.)
To internally serve a custom error document on the same server, this should be a root-relative URL, starting with a slash (no scheme or hostname). For example:
ErrorDocument 403 /error/404
However, /error/404 is unlikely to be a valid end-point. This should represent a valid resource that can be served. eg. /error/404.html.
(And this naturally gets round the issue of having to specify HTTP vs HTTPS.)
To answer your specific question...
because the above example if i use https ill have to hard code https
(Although, arguably, you should be HTTPS everywhere these days.)
However, to do what you are asking, you could do something like the following using the REQUEST_SCHEME server variable, for example:
ErrorDocument 403 %{REQUEST_SCHEME}://%{HTTP_HOST}/error/404
Or, if the REQUEST_SCHEME server variable is not available then you can construct this from the HTTPS server variable using mod_rewrite and assign this to an environment variable. For example:
RewriteEngine On
RewriteCond %{HTTPS}s ^on(s)|
RewriteRule ^ - [E=PROTO:http%1]
ErrorDocument 403 %{reqenv:PROTO}://%{HTTP_HOST}/error/404
The %1 backreference contains s when HTTPS is on and is empty otherwise. So the PROTO environment variable is set to either http or https.
This does assume that the SSL is managed by the application server and not a front-end proxy (like Cloudflare Flexible SSL etc.).
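If TLS is terminated at a front-end proxy, the HTTPS variable on the back end stays off. A rough sketch for that case, assuming the proxy sets an X-Forwarded-Proto request header (an assumption about your setup, and a header that should only be trusted when requests can only arrive via that proxy):

```apache
RewriteEngine On
# Default to http for every request
RewriteRule ^ - [E=PROTO:http]
# Assumption: the front-end proxy sends "X-Forwarded-Proto: https" for TLS requests
RewriteCond %{HTTP:X-Forwarded-Proto} =https
RewriteRule ^ - [E=PROTO:https]
ErrorDocument 403 %{reqenv:PROTO}://%{HTTP_HOST}/error/404
```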

Why is my rewrite rule in .htaccess working in some cases only?

I am trying to rewrite some of my URLs with a .htaccess file, but it doesn't work as expected.
This is the rewrite rule in my .htaccess file :
RewriteRule ^(index|administration)/([A-Za-z0-9-]+)(\.php)?$ index.php?c=$1&t=$2 [QSA]
When I go to www.example.com/index/main, I get a 404 error code.
So I tried changing my rewrite rule to:
RewriteRule ^index.php$ index.php?c=index&t=main [QSA]
Then I go to www.example.com/index.php and the webpage displays perfectly with all the data in $_GET (c = index and t = main).
So I don't know why my first rule is not working. Let me know if you have any ideas.
Is it possible that, for my first rule, the server tries to enter the index folder and then the main folder without taking my .htaccess into account (www.example.com/index/main)?
You need to ensure that MultiViews (part of mod_negotiation) is disabled for this to work correctly. So, add the following at the top of your .htaccess file:
Options -MultiViews
If MultiViews is enabled (it's disabled by default, but some hosts do sometimes enable this in the server config) then when you request /index/main where /index.php already exists as a physical file then mod_negotiation will make an internal request for index.php before mod_rewrite is able to process the request. (If index.html also exists, then this might be found first.)
(MultiViews essentially enables extensionless URLs by mocking up type maps and searching for files in the directory - with the same basename - that would return a response with an appropriate mime-type.)
If this happens then your mod_rewrite directive is essentially ignored (the pattern does not match, since it would need to check for index.php) and index.php is called without the URL parameters that your mod_rewrite directive would otherwise append.
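Putting this together, a minimal sketch of the resulting .htaccess file might look like this. The rule is the one from the question; the RewriteEngine line and the L flag are my additions, on the assumption that no later rules need to see the rewritten URL.

```apache
# Stop mod_negotiation from resolving /index/main to index.php first
Options -MultiViews

RewriteEngine On
# /index/main -> index.php?c=index&t=main (existing query string is kept by QSA)
RewriteRule ^(index|administration)/([A-Za-z0-9-]+)(\.php)?$ index.php?c=$1&t=$2 [QSA,L]
```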
it perfectly works by disabling the MultiViews Option in my .htaccess
This would ordinarily imply it's your script (ie. index.php) that is triggering the 404 (perhaps due to missing URL parameters?), rather than Apache itself?
However, if you were seeing an Apache generated 404 then it would suggest either:
You also have an index.html file, which is found before index.php. .html files do not ordinarily accept path-info (ie. /main) so would trigger a 404.
OR, AcceptPathInfo Off is explicitly set elsewhere in the config, which would trigger a 404 when the request is internally rewritten to /index.php/main (by mod_negotiation).

.htaccess redirect shows multiple %20 when URL contains space on HTTPS

I have some old URLs that I need redirected; unfortunately, some of them contain spaces. I redirect them to my redirect.php script, but for some reason, when the URL contains a space or %20, this %20 is repeated many times in the URL after redirection. This seems to have started only when we switched the server to HTTPS; when running on an HTTP subdomain or on my local machine it works correctly.
My rule is:
RewriteRule ^/?(gallery\.php)(.*) /redirect.php$2 [R,L]
This works correctly:
gallery.php?place=name --> redirect.php?place=name
But this happens when the URL contains a space:
gallery.php?place=long%20name -->
redirect.php?place=long%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20name
I tried adding the [B] and [NE] flags, but with no success. Is there anything I am missing?
UPDATE 1:
To rule out other rules in .htaccess, I have created a new example. I have an empty directory /test/, containing an empty file /test/index.php and a /test/.htaccess file, which contains:
RewriteEngine On
RewriteRule ^/?(index\.php)(.*) /$2 [NE,R,L]
That is all. Still the behaviour is weird, eg:
/test/index.php?a=xy works as expected, but /test/index.php?a=x%20y repeats the %20 sign.
So in the end I managed to work around the problem by reordering the .htaccess rules. I now do all redirects from HTTP to my redirect script, which is on HTTP as well. This script then resolves the correct new URL, which is already on SSL and has no spaces, and redirects there. Fortunately, all the old URLs are on HTTP, so this is sufficient; I don't need these parameter redirects to happen on SSL.
Also, the [NE] flag is helpful when redirecting URLs with %20, as it prevents the percent sign from being encoded again in the new URL.
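For reference, a RewriteRule pattern is only matched against the URL path, never the query string, and the query string is passed through unchanged when the substitution contains no "?". So the (.*) capture in the original rule is not what carries place=long%20name across. A simpler sketch of the same redirect, which I have not tested against the HTTPS setup described above, would be:

```apache
RewriteEngine On
# Match the path only; the original query string is appended automatically.
# NE stops mod_rewrite from escaping the % in %20 a second time.
RewriteRule ^/?gallery\.php$ /redirect.php [NE,R=302,L]
```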

Links starting with double slashes cause invalid requests

All links on my website are protocol-less and start with double slashes:
href="//site.com/page.html".
And in the log I see many requests like: 404 - site.com/site.com/page.html
Which means some browsers are interpreting these absolute links as relative. By looking at the user agents I assume those are mostly bots.
Can I fix requests such as site.com/site.com/page.html with .htaccess by directing them to the proper URI? (site.com/site.com/page.html => site.com/page.html)
Try adding this to your document root (of the site that's hosting these protocol relative links):
RedirectMatch 301 ^/site\.com/(.*)$ http://site.com/$1
or:
RewriteEngine On
RewriteRule ^site\.com/(.*)$ http://site.com/$1 [L,R=301]
If the site that is hosting these links is already site.com, you can remove the http://site.com bit from the targets.
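As a sketch of that shortened form, assuming the .htaccess file sits in the document root of site.com itself, the mod_alias variant would become:

```apache
# /site.com/page.html -> /page.html on the same host
RedirectMatch 301 ^/site\.com/(.*)$ /$1
```

A target that starts with a slash is completed with the scheme and hostname of the current server, so this also avoids hard-coding http:// in the rule.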

.htaccess redirect from subdirectory to another domain's subdirectory accordingly

I am trying to make a redirect from my primary domain to a secondary domain, but only if the primary domain's request is to a sub directory.
The sub directory I want to redirect from is FTP, so if the user makes the following request:
http://www.site1.com/FTP/free/50b694124bd63/SaMple+PicTure.PnG
it would be transformed to
http://www.site2.com/FTP/free/50b694124bd63/SaMple+PicTure.PnG
but if the user makes a request that does not involve the FTP folder, the user will not be redirected. Like so:
http://www.site1.com or http://www.site1.com/somethingelse/
I am, however, a bit lost when it comes to making .htaccess files. What I have tried to do so far is:
# Redirect users
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^ftp(.*)$ http://site2.com/FTP/$1 [L,R=301]
</IfModule>
Any directions or samples would be great :)
No need to use the rewrite engine for simple redirects. I think you just want to use the Redirect directive:
Redirect /FTP http://www.site2.com/FTP
By default, this will result in a "temporary" redirect response (HTTP status 302). If you're sure the URL of the second site will never change, you can cause a "permanent" redirect response (HTTP status 301) by adding the permanent argument:
Redirect permanent /FTP http://www.site2.com/FTP
Also, note that the path of URLs is case-sensitive. If you want http://www.site1.com/ftp to also redirect, you will either need to add a rule with the lowercase path,
Redirect /ftp http://www.site2.com/FTP
or use mod_speling.
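A third possibility, offered here as my own sketch rather than part of the answer above, is a single RedirectMatch with a case-insensitive pattern, so that /FTP, /ftp and /Ftp are all covered by one line. The 301 status is an assumption; drop it to get the default 302.

```apache
# (?i) makes the match case-insensitive; $1 is the rest of the path, if any
RedirectMatch 301 (?i)^/ftp(/.*)?$ http://www.site2.com/FTP$1
```

With this pattern, /ftp/free/50b694124bd63/SaMple+PicTure.PnG would be sent to http://www.site2.com/FTP/free/50b694124bd63/SaMple+PicTure.PnG, while a path such as /ftpfoo is left alone.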

Resources