How do I block a part of an url with htaccess? - .htaccess

So basically, I have a link looking like:
mydomain.com/file.php?id=m03u7dp255jiobi&type=mp3
How do I block access to this URL only for this ?id= part
So when users visit this link file.php?id=test it won’t work, but if they visit the other link looking like this ?id=validurl it will work.

To do this in .htaccess you can use the mod_rewrite module and set conditions against the query string.
RewriteCond %{QUERY_STRING} id=invalid_string [OR]
RewriteCond %{QUERY_STRING} id=another_invalid_string
RewriteRule . /new_destination.php [QSD,L,R=307]
OR - Literally an OR operator; as in Condition1 || Condition 2
QSD - Removes the original query string
L - Stops further rewrite rules from being applied
R=307 - Sets a 307 Temporary Redirect status code
R=401 - Sets a 401 Unauthorised status code. Because 401 is outside the redirect range (300-399), the substitution is dropped and the predefined 401 resource (ErrorDocument) is served instead.
You can set a custom ErrorDocument like this:
ErrorDocument 401 /errors/unauthorised.php
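
To see how the two conditions behave, here is a minimal Python sketch that mimics the unanchored match RewriteCond performs against %{QUERY_STRING}. The function name is_blocked and the stricter anchored patterns are illustrative assumptions, not part of the answer above, and Python's re module only stands in for Apache's regex engine.

```python
import re

# Patterns from the answer: unanchored, so they match anywhere in the query string.
LOOSE = [r"id=invalid_string", r"id=another_invalid_string"]

# Hypothetical stricter variant: the parameter must be exactly `id` and the
# value must end at `&` or at the end of the query string.
STRICT = [r"(^|&)id=invalid_string(&|$)", r"(^|&)id=another_invalid_string(&|$)"]


def is_blocked(query_string, patterns):
    """Return True if any pattern matches, like RewriteCond lines joined with [OR]."""
    return any(re.search(p, query_string) for p in patterns)


print(is_blocked("id=invalid_string&type=mp3", LOOSE))  # True
print(is_blocked("id=validurl&type=mp3", LOOSE))        # False
print(is_blocked("uid=invalid_string_2", LOOSE))        # True (substring match)
print(is_blocked("uid=invalid_string_2", STRICT))       # False
```

If only exact id values should be blocked, anchoring the condition as in STRICT may be the safer choice.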

Related

Surprising rewriting of URL by htaccess rule

I've narrowed down my problem and have a specific question.
With only the following code in the .htaccess, why does index2.php get called if I type in my URL as www.mysite.com/url2 ?
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
I've also tested it at http://www.regextester.com, and according to that the URL should not be rewritten to index2.php.
In the end I want this rule to skip any URL starting with /url2 or /url2/*.
EDIT: I've made screen recording of this problem: http://screenr.com/BBBN
You have this in your .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
What does it do? It rewrites anything that ends with html, htm, feed, pdf, or raw to index2.php. So if your URL is being rewritten as though it ended with one of those extensions, there are two possible explanations:
There is another rewrite rule in an .htaccess file in a parent directory (or in the server config files) that causes the URL to be rewritten.
Your URL actually ends with one of those extensions. Keep in mind that what you enter in your address bar can be changed and rewritten internally. For example, if you enter www.mysite.com/url2 in your address bar and that file doesn't exist on the server, your server will try to load the proper error document. So if your error document is /404.html, it will end up being rewritten to index2.php.
Update:
I think that's the case here. Create a file named 404.php in your document root. Inside your main .htaccess (in your document root), put this:
ErrorDocument 404 /404.php
Delete all other ErrorDocument directives.
Inside 404.php, put this:
<?php
echo 'From 404.php file';
?>
Logic behind it:
When mod_rewrite behaves strangely, the best approach in my experience is to use the rewrite log. To enable it, put this in your virtual host or other server-level configuration (these directives are for Apache 2.2; Apache 2.4 replaced them with LogLevel, e.g. LogLevel alert rewrite:trace8):
RewriteLogLevel 9
RewriteLog "logs/RewriteLog.log"
Be careful: the code above enables the rewrite log at the highest level possible (logging everything). It will slow down your server, and the log file will become huge very quickly. Do this only on your dev server.
Explanation: When you try to access www.mysite.com/url2, Apache passes your URL to the rewrite module. The rewrite module checks whether any RewriteRule applies to your URL. Because you have one rule and it doesn't apply to your URL, Apache tries to load the normal file. But this file does not exist, so Apache moves on to the next step, which is showing the proper error message. When you set a custom error file, Apache runs the rules against the new address. For example, if the error document is /404.html, Apache checks whether your rule applies to /404.html. Since it does, /404.html is rewritten.
The point to remember is that Apache does this every time there is a change in the URL, whether the change is made by the rewrite module or not!
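
The sequence described in the explanation can be traced with a small Python simulation. This is a simplified sketch, not how Apache is implemented: the set of existing files, the ERROR_DOCUMENT value, and the function names are assumptions made for the illustration.

```python
import re

# The rule from the question; the dots are unescaped, as in the original.
RULE = re.compile(r"(.html|.htm|.feed|.pdf|.raw)$", re.IGNORECASE)

# Hypothetical site state for the illustration.
EXISTING_FILES = {"/index2.php", "/404.html"}
ERROR_DOCUMENT = "/404.html"


def rewrite(uri):
    """Apply the single rule: matching URIs are rewritten to /index2.php."""
    return "/index2.php" if RULE.search(uri) else uri


def resolve(uri):
    """Return the URIs processed for one request (simplified model)."""
    trail = [uri]
    target = rewrite(uri)
    if target != uri:
        trail.append(target)
    if target not in EXISTING_FILES:
        # Missing file: the error document becomes a new internal request,
        # and the rewrite rule is evaluated against it as well.
        trail.append(ERROR_DOCUMENT)
        rewritten = rewrite(ERROR_DOCUMENT)
        if rewritten != ERROR_DOCUMENT:
            trail.append(rewritten)
    return trail


print(resolve("/url2"))  # ['/url2', '/404.html', '/index2.php']
```

In this model, pointing the error document at a .php file (as suggested above) keeps the rule from matching it.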
In theory, the rule you list should work as you expect if it is the only rule, but apparently it doesn't in practice. Please note that . will match ANY CHARACTER. If you want to match the full stop/period character, you'll need to escape it. That's why I use \.(html|htm|feed|pdf|raw)$ instead of (.html|.htm|.feed|.pdf|.raw)$ below.
You can add another RewriteCond that simply doesn't match if the url starts with /url2, like below. This might not be a viable solution if there are lots of urls that shouldn't be matched.
RewriteCond %{REQUEST_URI} !^/url2
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteRule (.*) index2.php [L]
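
The difference between the unescaped and escaped dot can be checked quickly in Python. This is a minimal sketch; the sample URIs are made up for the illustration, and Python's re module is assumed to behave like Apache's regex engine for these simple patterns.

```python
import re

UNESCAPED = re.compile(r"(.html|.htm|.feed|.pdf|.raw)$", re.IGNORECASE)
ESCAPED = re.compile(r"\.(html|htm|feed|pdf|raw)$", re.IGNORECASE)

# Columns: URI, unescaped pattern matches, escaped pattern matches
for uri in ["/page.html", "/newsfeed", "/my-pdf", "/url2"]:
    print(uri, bool(UNESCAPED.search(uri)), bool(ESCAPED.search(uri)))

# /page.html True True
# /newsfeed True False
# /my-pdf True False
# /url2 False False
```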
To get a better understanding of what is happening, you can alter the rule to something like this. Now simply enter the URLs you don't want to be matched in the URL bar and inspect the URL bar after the redirect happens. The url parameter shows which URL actually triggered the rule. This screencast shows you a similar version working with a sneaky RewriteRule that is working away on the URL.
#A way of finding out what is -actually- matched
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteCond %{REQUEST_URI} !/foo
RewriteRule (.*) /foo?url=$1 [R,L]
You can decide to match the %{THE_REQUEST} variable instead. This will always contain the request itself. If something else is rewriting the url, this variable doesn't change, meaning you can use this to overwrite any changes. Make sure the url won't be matching itself. You would get something like below. An example screencast can be found here.
#If it doesn't end on .html/htm/feed etc, this one won't match
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /.*\.(html|htm|feed|pdf|raw)\ HTTP [NC]
RewriteCond %{REQUEST_URI} !^/index2\.php$
RewriteRule (.*) /index2.php [L]

Htaccess rule to resolve the 404 error when host name is repeating twice

We are using Unlimited Sitemap Generator to generate an XML sitemap, and it picks up all URLs on the site. Unfortunately, errors are showing up in the webmaster crawl errors section: the sitemap generator is fetching a huge number of duplicate URLs.
For example, if the actual URL is "http://www.example.com/forum/viewtopic.php?f=5&t=221&st=0&sk=t&sd=a&start=10"
the sitemap generator fetches this URL and also a duplicate URL that returns a 404 error: "http://www.example.com/http://www.example.com:80/forum/viewtopic.php?f=5&t=221&st=0&sk=t&sd=a&start=10"
(This is only an example URL.)
All other URLs listed in the sitemap are correct. The issue is with the forum section only (we use phpBB for the forum).
Can anyone suggest a valid htaccess rule to avoid this 404?
I want to redirect all patterns like 'http://www.example.com/http://www.example.com:80/forum/....' to 'http://www.example.com/forum/.........'
Any help will be appreciated.
Enable mod_rewrite and .htaccess through httpd.conf and then put this code in your .htaccess under DOCUMENT_ROOT directory:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+.+?(/forum/[^\s]+) [NC]
RewriteRule ^ /%1 [R=302,L,NE]
Once you verify it is working fine, replace R=302 with R=301. Avoid using R=301 (Permanent Redirect) while testing your mod_rewrite rules.
Explanation:
%{THE_REQUEST} represents the original request as received by Apache which in your case may look like: GET /http://www.example.com:80/forum/viewtopic.php?f=5&t=221&st=0&sk=t&sd=a&start=10 HTTP/1.0
Breaking down my regex: ^[A-Z]{3,}\s/+.+?(/forum/[^\s]+) now
This part of regex ^[A-Z]{3,}\s matches 'GET ' part of input.
This part of regex /+.+? matches /http://www.example.com:80 part of input (.+? is a reluctant match that stops where the next part of the regex, i.e. /forum/, starts).
This part of regex /forum/ matches literal /forum/ part of input.
This part of regex [^\s]+ matches viewtopic.php?f=5&t=221&st=0&sk=t&sd=a&start=10 part of input (until a space is found).
(/forum/[^\s]+) puts /forum/viewtopic.php?f=5&t=221&st=0&sk=t&sd=a&start=10 in match group #1 (denoted by %1 in RewriteRule later).
Then RewriteRule ^ /%1 [R=302,L,NE] is executing when above RewriteCond is true. This rule then redirects the request to %1 captured above.
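
The breakdown above can be verified with a short Python sketch that runs the same regex against sample request lines. The function name redirect_target and the second, well-formed request line are assumptions added for illustration; Python's re module stands in for Apache's regex engine.

```python
import re

# Same pattern as the RewriteCond; [NC] is modeled with IGNORECASE.
PATTERN = re.compile(r"^[A-Z]{3,}\s/+.+?(/forum/[^\s]+)", re.IGNORECASE)


def redirect_target(request_line):
    """Return the captured path (%1), or None if the condition does not match."""
    m = PATTERN.search(request_line)
    return m.group(1) if m else None


bad = ("GET /http://www.example.com:80/forum/viewtopic.php"
       "?f=5&t=221&st=0&sk=t&sd=a&start=10 HTTP/1.0")
good = "GET /forum/viewtopic.php?f=5&t=221 HTTP/1.0"

print(redirect_target(bad))   # /forum/viewtopic.php?f=5&t=221&st=0&sk=t&sd=a&start=10
print(redirect_target(good))  # None
```

The second result shows why the rule should not loop: once the request starts directly with /forum/, .+? has nothing to consume before it, so the condition fails.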

I changed the structure of my site to reach index cards

Excuse my English.
I am building a brand directory website.
Previously, I accessed the brand pages with requests like this:
mydomain.com/fiche.php?id=115
where id is the id of the brand in my directory.
I changed the structure of the brand pages and now use this request:
mydomain.com/annuaire.php?type=fiche&id_marq=115
where id has become id_marq.
I tried to use a RewriteRule like this:
RewriteRule ^fiche.php$ http://www.annuaire-sites-officiels.com/annuaire.php?detail=fiche&id_marq=$1 [L,QSA,R=301]
to redirect the old links to the new pages, but the result doesn't pass the id_marq value, and the URL is:
http://www.annuaire-sites-officiels.com/annuaire.php?detail=fiche&id_marq=&id=115
The original &id= parameter is appended as well.
What am I doing wrong?
Your rule does not evaluate the query string, and that's why it's not capturing the id query parameter.
Change your code to:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^id=([^&]+) [NC]
RewriteRule ^fiche\.php$ /annuaire.php?detail=fiche&id_marq=%1 [R=302,L,QSA,NC]
Once you verify it is working fine, replace R=302 with R=301. Avoid using R=301 (Permanent Redirect) while testing your mod_rewrite rules.
Check out Regex Back Reference Availability:
You have to capture the query string. [QSA] passes it forward unaltered, so unless you're using id for anything you don't need that bit of code. Your 301 redirect is correct since this is a permanent redirect. Remember if you add a failed redirect your browser may cache that redirect so it might not look like it's working.
In this string match I'm only catching numbers to prevent someone from passing something like an asterisk * and XSS exploiting your site.
I've not included any [NC] flags in my code because when you allow multiple cases, they can look like different URLs to search engines (bad for SEO).
RewriteCond %{QUERY_STRING} id=([0-9]+)
RewriteRule ^fiche\.php$ http://%{HTTP_HOST}/annuaire.php?detail=fiche&id_marq=%1 [R=301,L]
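
As a rough illustration of what both answers do, the Python sketch below mimics the RewriteCond and RewriteRule pair and shows the effect of the QSA flag. The function name redirect_fiche and the anchored, digits-only pattern are assumptions that combine ideas from both answers; this is not a model of Apache itself.

```python
import re


def redirect_fiche(path, query, qsa=False):
    """Mimic the RewriteCond + RewriteRule pair for old fiche.php links.

    Returns the redirect target, or None when the rule does not apply.
    """
    if not re.search(r"^fiche\.php$", path):
        return None
    cond = re.search(r"^id=([0-9]+)", query)  # the group plays the role of %1
    if cond is None:
        return None
    target = "/annuaire.php?detail=fiche&id_marq=" + cond.group(1)
    if qsa:
        # QSA appends the original query string to the new one.
        target += "&" + query
    return target


print(redirect_fiche("fiche.php", "id=115"))            # /annuaire.php?detail=fiche&id_marq=115
print(redirect_fiche("fiche.php", "id=115", qsa=True))  # /annuaire.php?detail=fiche&id_marq=115&id=115
print(redirect_fiche("fiche.php", "id=*"))              # None
```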

Trying to redirect one page to another, but my redirect rule doesn't respond

I've removed and/or combined a couple of pages on a site, and now I need to set up a 301 redirect.
I thought doing so in my .htaccess was my best bet, but the rules I'm trying to add don't seem to get noticed. They don't respond at all...
These are the rules I've tried so far:
Redirect 301 /?Page=sPage&sPage=Our-Store %{SERVER_NAME}?Page=sPage&sPage=About-Us
RewriteRule ^/?Page=sPage&sPage=Our-Store$ %{SERVER_NAME}?Page=sPage&sPage=About-Us[R=301,NC,L]
RewriteCond %{HTTP_HOST} !^%{SERVER_NAME}$ [NC]
RewriteRule . %{SERVER_NAME}%{REQUEST_URI}?Page=sPage&sPage=About-Us [R=301,L]
This last one messed up the CSS and JS src's...
I have this at the top:
RewriteEngine On
RewriteBase /
Any suggestion?
UPDATE : follow up question
I have like 3000+ equal url strings with an ending ID that is different. How do I redirect all those requests?
This is the old url : ?Page=Tuninglist&Car=*
And this is the new one : ?Page=Tuning&view=vehicle&type=Car&id=*
* The value of id= is just integers...
Was hoping something like this could work, but no - got a 500 server error instead...
RewriteCond %{QUERY_STRING} ^Page=Tuninglist&Car=([0-9]+)$
RewriteRule ^ ?Page=Tuning&view=vehicle&type=Car&id=$1 [R=301,L]
*EDIT: The 500 server error occurred because I had a ? at the beginning of the condition.
The redirect now works, but the ending id value doesn't get included.
All I get is the correct page, but not the associated content based on that id...
You can't match against the query string in a redirect or rewrite rule, you need to do it using the %{QUERY_STRING} variable in a condition:
RewriteCond %{QUERY_STRING} ^Page=sPage&sPage=Our-Store$
RewriteRule ^ %{REQUEST_URI}?Page=sPage&sPage=About-Us [R=301,L]
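
For the follow-up question, note that $1 refers to groups captured by the RewriteRule pattern, while %1 refers to groups captured by the last matched RewriteCond. The Python sketch below illustrates the difference; the sample path and query values are assumptions, and it suggests (rather than proves) that using %1 in the follow-up rule would carry the id across.

```python
import re

path = ""  # assumed per-directory path for a request to the site root
query = "Page=Tuninglist&Car=42"

rule_match = re.search(r"^", path)  # RewriteRule ^ : matches, captures nothing
cond_match = re.search(r"^Page=Tuninglist&Car=([0-9]+)$", query)

dollar_groups = rule_match.groups()   # what $1, $2, ... could refer to
percent_groups = cond_match.groups()  # what %1, %2, ... could refer to

print(dollar_groups)   # ()
print(percent_groups)  # ('42',)

target = "?Page=Tuning&view=vehicle&type=Car&id=" + percent_groups[0]
print(target)  # ?Page=Tuning&view=vehicle&type=Car&id=42
```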

trouble with simple mod_rewrite redirect rule

I have mod_rewrite working in a development environment.
This testing domain is using these rules in an .htaccess file:
Options +FollowSymLinks
Options +Indexes
RewriteEngine on
# deal with potential pre-rewrite spidered / bookmarked urls
RewriteRule ^clothes/index.php?pg=([0-9]+)$ /clothes/index$1.php [R=301,L]
# deal with actual urls
RewriteRule ^clothes/[0-9a-z-]+-pr([0-9]+).php$ /clothes/product.php?pid=$1 [L]
The 2nd Rule works fine. Entering http ://testdomain.dev/clothes/shirt-pr32.php is silently delivered content from http ://testdomain.dev/clothes/product.php?pid=32 ...which is as desired and expected!
However, assuming this was applied to a live site, one that had originally used paths such as: http ://testdomain.dev/clothes/product.php?pid=32, I'd like to redirect any incoming requests following the old pattern to the new urls ...which is what the 1st Rule was intended to do.
My problem is my testing server seems to ignore the 1st Rule and serves the page as requested (page loads but address bar remains at http ://testdomain.dev/clothes/product.php?pid=32)
Any assistance or enlightenment would be most graciously accepted!
You need to match the query string within a RewriteCond, then backreference that RewriteCond from the rule. The RewriteRule only matches against the path, not the query string.
Here's a related post I previously answered with a similar request: Mod_rewrite rewrite example.com/page.php?v1=abc&v2=def to example.com/abc/def
You can't match against the query string in a rewrite rule; you need to use the %{QUERY_STRING} variable in a condition and use % to backreference groupings. So instead of:
RewriteRule ^clothes/index.php?pg=([0-9]+)$ /clothes/index$1.php [R=301,L]
You'll need:
RewriteCond %{QUERY_STRING} ^pg=([0-9]+)
RewriteRule ^clothes/index.php$ /clothes/index%1.php? [R=301,L]
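
To see why the original pattern never fires, the Python sketch below tests it against the path that a RewriteRule actually receives. The sample request /clothes/index.php?pg=3 is an assumption for the illustration, and Python's re module stands in for Apache's regex engine.

```python
import re

path = "clothes/index.php"  # what RewriteRule sees for /clothes/index.php?pg=3
query = "pg=3"              # what %{QUERY_STRING} holds

# Original pattern: `?` makes the preceding "p" optional, and "pg=" must then
# follow inside the path, so a real request path can never match.
original = re.search(r"^clothes/index.php?pg=([0-9]+)$", path)
print(original)  # None

# Corrected pair: condition on the query string, rule on the path only.
cond = re.search(r"^pg=([0-9]+)", query)
rule = re.search(r"^clothes/index.php$", path)
if cond and rule:
    print("/clothes/index" + cond.group(1) + ".php")  # /clothes/index3.php
```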

Resources