Prevent spiders from following URLs that match a pattern - .htaccess

How do I prevent spiders from crawling pages that start with mydomain.com/abc...?
For example mydomain.com/abcSGGSHS or mydomain.com/abc6bNNha
I think I need to add some sort of regular expression to the web root's .htaccess, right?

With mod_rewrite enabled, you can do the following:
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^YourBadSpiderName [OR]
RewriteCond %{HTTP_USER_AGENT} ^AnotherBadSpider
RewriteCond %{REQUEST_URI} ^/abc
RewriteRule .* http://mydomain.com/404.html [NC,L]
You'll have to update the spider names accordingly. If a bot changes its user agent, say to 'Mozilla/Firefox', you're out of luck.
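If the goal is simply to keep those crawlers out of everything under /abc, a slightly tighter variant (a sketch, reusing the same placeholder spider names) answers them with 403 Forbidden instead of redirecting to a 404 page:
RewriteEngine on
RewriteBase /
# Send the known bad crawlers a 403 for any path starting with "abc"
RewriteCond %{HTTP_USER_AGENT} (YourBadSpiderName|AnotherBadSpider) [NC]
RewriteRule ^abc - [F,L]
Well-behaved crawlers can also be asked to skip those paths with a Disallow: /abc line in robots.txt, but that only helps against bots that honor it.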

Related

Writing an .htaccess file properly

I have the following code in my .htaccess file; it was given to me in an attempt to redirect users from one website to another while masking the URL, so that the original domain is kept:
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^Shmoo.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.Shmoo\.com$ [NC]
RewriteRule ^(.*)$ https://wubbins.com/humnewum/$1 [R=301,L]
RewriteCond %{HTTPS_HOST} ^Shmoo.com$ [NC,OR]
RewriteCond %{HTTPS_HOST} ^www\.Shmoo\.com$ [NC]
RewriteRule ^(.*)$ https://wubbins.com/humnewum/$1 [R=301,L]
I cannot use the generic Web forwarding, redirect, or Alias tools on one.com as these seem to use iFrames and therefore the destination site does not present properly on mobile devices.
This code seems to work, for the most part (as in, no iframe), but the URL is not masked: it is displayed as the destination 'www.wubbins.com/humnewum', not the origin 'www.Shmoo.com'.
Complete noob so any help greatly appreciated.
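An [R=301] rule always changes what the browser shows, because it sends the visitor off to the new address; keeping the original domain in the address bar requires the server to fetch the content itself, e.g. with mod_rewrite's [P] (proxy) flag. Below is a minimal sketch, assuming mod_proxy is available on the host (shared hosts often disable it). Note also that %{HTTPS_HOST} is not a standard mod_rewrite variable; %{HTTP_HOST} is used for both HTTP and HTTPS requests.
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# Serve wubbins.com content while the browser keeps showing Shmoo.com
# (requires mod_proxy; proxying to an https backend may also need
# SSLProxyEngine, which cannot be set from .htaccess)
RewriteCond %{HTTP_HOST} ^(www\.)?shmoo\.com$ [NC]
RewriteRule ^(.*)$ https://wubbins.com/humnewum/$1 [P,L]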

htaccess conditions AND statements?

At the end of my .htaccess I redirect all URLs that I consider faulty.
This works perfectly, but I need an exception for my own PC:
for my Firefox browser I want the server to react as it does to every other request;
for my Chrome browsers I want full access to all files on the server.
I do it this way and it works:
RewriteCond %{REQUEST_URI} !=/index.php
RewriteCond %{REQUEST_URI} !=/retpic.php
RewriteCond %{REQUEST_URI} !^/pcs/.*$
RewriteCond %{REMOTE_HOST} ^11\.11\.1\.11$
RewriteCond %{HTTP_USER_AGENT} Firefox
RewriteRule . index.php [L]
RewriteCond %{REQUEST_URI} !=/index.php
RewriteCond %{REQUEST_URI} !=/retpic.php
RewriteCond %{REQUEST_URI} !^/pcs/.*$
RewriteCond %{REMOTE_HOST} !^11\.11\.11\.11$
RewriteRule . index.php [L]
But I end up doubling the whole code... Is there a more elegant solution, like combining the following with an AND statement? (I found something about an OR statement, but not about AND.)
RewriteCond %{REMOTE_HOST} ^11\.11\.1\.11$
RewriteCond %{HTTP_USER_AGENT} Firefox
Edit: added more explanation of how the code works:
RewriteCond %{REMOTE_HOST} !^11\.11\.11\.11$
This is from the second part; it excludes my IP from the rule so I can access all files on the server. This is important since I want to be able to access my CMS.
RewriteCond %{REMOTE_HOST} ^11\.11\.1\.11$
RewriteCond %{HTTP_USER_AGENT} Firefox
This part includes my Firefox browser when I am at home, so I can check whether the website works with all restrictions in place. Why do I have this rule? I was working on my site and restructuring some parts, and it kept working for me, but when I was at a friend's place I noticed it did not, so I needed a way to check this from home.
You need to apply a little bit of boolean algebra to combine these rules into one: (A AND B) OR (NOT A) simplifies to (NOT A) OR B.
Here is how you can do it:
RewriteCond %{REMOTE_ADDR} !=11.11.2.11 [OR]
RewriteCond %{HTTP_USER_AGENT} Firefox
RewriteRule !(index\.php|retpic\.php|pcs/) index.php [L,NC]
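For reference, here is the same combined block with the reasoning spelled out in comments (the IP address is kept as in the answer; substitute your own, and note that Apache requires comments to sit on their own lines):
# Original logic: rewrite if (my IP AND Firefox) OR (NOT my IP).
# (A AND B) OR (NOT A) simplifies to (NOT A) OR B; consecutive
# RewriteCond lines are ANDed unless flagged [OR].
# NOT A: the visitor is not at your own address ...
RewriteCond %{REMOTE_ADDR} !=11.11.2.11 [OR]
# ... or B: the request comes from a Firefox user agent
RewriteCond %{HTTP_USER_AGENT} Firefox
# The excluded files move into the negated rule pattern, which matches
# every URI that does not contain index.php, retpic.php or pcs/
RewriteRule !(index\.php|retpic\.php|pcs/) index.php [L,NC]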

.htaccess protect urls with query string combinations

I need to protect a "single logical" URL in a Joomla CMS with .htaccess. I found this solution here: .htaccess code to protect a single URL?
It works great for a specific URL:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/index\.php$
RewriteCond %{QUERY_STRING} ^option=com_content&task=view&id=76$
RewriteRule ^(.*)$ /secure.htm
However, how can I make sure that the URL parts can't be swapped around or amended, thereby circumventing the secure access? For example, I don't want to allow access to
option=com_content&task=view&id=76&dummy=1
option=com_content&id=76&task=view
either.
I have tried this, which doesn't seem to work:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/index\.php$
RewriteCond %{QUERY_STRING} option=com_content
RewriteCond %{QUERY_STRING} task=view
RewriteCond %{QUERY_STRING} id=76
RewriteRule ^(.*)$ /secure.htm
Your rules work fine for me when I go to any of these URLs:
http://localhost/index.php?blah=blah&option=com_content&task=view&id=76
http://localhost/index.php?option=com_content&task=view&id=76&dummy=1
http://localhost/index.php?option=com_content&id=76&task=view
I get served the content at /secure.htm
However, you could add sound boundaries to your query string rules:
RewriteCond %{QUERY_STRING} (^|&)option=com_content(&|$)
RewriteCond %{QUERY_STRING} (^|&)task=view(&|$)
RewriteCond %{QUERY_STRING} (^|&)id=76(&|$)
So that you don't end up matching something like id=761.
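Putting it together, the whole block would then look something like this (a sketch, using the same paths and parameter values as above):
RewriteEngine On
# Match only the Joomla front controller
RewriteCond %{REQUEST_URI} ^/index\.php$
# Each parameter is anchored at & or at the start/end of the query
# string, so values like id=761 no longer match
RewriteCond %{QUERY_STRING} (^|&)option=com_content(&|$)
RewriteCond %{QUERY_STRING} (^|&)task=view(&|$)
RewriteCond %{QUERY_STRING} (^|&)id=76(&|$)
RewriteRule ^(.*)$ /secure.htm [L]
Because each condition matches its parameter independently, the rule still fires when the parameters are reordered or when extra ones such as &dummy=1 are appended.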

.htaccess RewriteRule not working as I want

I have this link: http://www.domain.com.mk/lajmi.php?id=2790,
and I want to change it to http://www.domain.com.mk/lajmi/2790
With this code I can change the link to /lajmi/2790, but I get a 404 error.
I mean, I get the link http://www.domain.com.mk/lajmi/2790, but it returns a 404 error (I don't see the content).
This is my code:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com\.mk$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com\.mk$
RewriteCond %{QUERY_STRING} ^id=([0-9]*)$
RewriteRule ^lajmi\.php$ http://domain.com.mk/lajmi/%1? [R=302,L]
What am I doing wrong?
Try this one:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.com\.mk$
RewriteCond %{QUERY_STRING} ^id=(\d*)$
RewriteRule ^lajmi\.php$ http://domain.com.mk/lajmi/%1? [R=302,L]
RewriteRule ^lajmi/(\d*)$ lajmi.php?id=$1&r=0 [L]
(the &r=0 in the final rule prevents an infinite loop: with it appended, the rewritten query string no longer matches ^id=(\d*)$, so the redirect rule does not fire again)
Single direction rewrite:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.com\.mk$
RewriteRule ^lajmi/(\d*)$ lajmi.php?id=$1 [L,QSA]
This means that every URI of the form /lajmi/2790 will be passed to /lajmi.php?id=2790 in a sub-request.
However, in this case, if the user hits /lajmi.php?id=2790 directly, then that is the URL they will see in the browser, not the "beautified" one.
Bi-directional rewrite:
RewriteEngine On
RewriteBase /
# Redirect lajmi.php?id=2790 to a beautified version, but only if not in a sub-request!
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.com\.mk$
RewriteCond %{IS_SUBREQ} !=true
RewriteCond %{QUERY_STRING} ^id=(\d*)$
RewriteRule ^lajmi\.php$ lajmi/%1 [R=301,L]
# Make the beautified URI actually be served by lajmi.php
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.com\.mk$
RewriteRule ^lajmi/(\d*)$ lajmi.php?id=$1 [L]
Here, an additional RewriteCond was added to the first rule checking that this is not a sub-request, to ensure that the rules do not loop.
You can pick which way you like, but the first approach is enough if you build the links in your HTML in the 'beautified' way already (no need to redirect the browser twice just to see the page).
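A common alternative to the IS_SUBREQ check (or the &r=0 marker from the first answer) is to match %{THE_REQUEST}, which holds the client's original request line and is never modified by internal rewrites. A minimal sketch of that approach:
RewriteEngine On
RewriteBase /
# Redirect only requests that arrived from the client as lajmi.php?id=NNN;
# THE_REQUEST is untouched by internal rewrites, so this cannot loop
RewriteCond %{THE_REQUEST} \s/lajmi\.php\?id=(\d+)\s [NC]
RewriteRule ^lajmi\.php$ /lajmi/%1? [R=301,L]
# Internally map the pretty URL back to the real script
RewriteRule ^lajmi/(\d+)$ lajmi.php?id=$1 [L]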

Deny referrals from all domains except one

Is it possible to accept traffic from only one domain, ideally using a .htaccess file?
I want my site to only be accessible via a link on another site I have.
I know how to block one referring domain, but not all domains:
RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} otherdomain\.com [NC]
RewriteRule .* - [F]
This is my full rewrite code:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !domain\.co.uk [NC]
RewriteRule .? - [F]
# The Friendly URLs part
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
I think it is working, but none of the assets are getting loaded and I get a 500 error when I click on another link.
Make that something like:
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !yourdomain\.com [NC]
RewriteCond %{HTTP_REFERER} !alloweddomain\.com [NC]
RewriteRule .? - [F]
The first RewriteCond checks that the referrer is not empty. The second checks that it doesn't contain the string yourdomain.com, and the third that it doesn't contain the string alloweddomain.com. If all of these checks pass, the RewriteRule triggers and denies the request.
(Allowing empty referrers is generally a good idea, since browsers can generate them for various reasons, such as when:
the user has bookmarked the link,
the user entered the link manually into the address bar,
the user reloaded the page,
the browser is configured not to send cross-site referrer information, or
a proxy between your site and the browser strips away the referrer information.)
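Applied to the full rewrite code above, that would look something like this (a sketch; yourowndomain\.com is a placeholder for the site's own domain, so that its pages and assets, which arrive with the site itself as referrer, are not blocked):
RewriteEngine On
RewriteBase /
# Deny requests whose referrer is neither empty, the site itself,
# nor the allowed external domain
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !yourowndomain\.com [NC]
RewriteCond %{HTTP_REFERER} !domain\.co\.uk [NC]
RewriteRule .? - [F]
# The Friendly URLs part
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]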
