Whitelist specific websites in .htaccess

I have a problem on one of my websites. I'm using a custom-made PHP script to protect my images from being hotlinked by the new Google Image Search. The script works, but it blocks all other websites from hotlinking too, including Facebook, Google Plus, Pinterest...
Please help me whitelist at least these three websites in my .htaccess file: Facebook, Google Plus and Pinterest.
I tried, for example, this:
RewriteCond %{HTTP_REFERER} !^http://plus.google.com\. [NC]
RewriteCond %{HTTP_REFERER} !^https://plus.google.com\. [NC]
for Google Plus, but it doesn't seem to work... what am I missing here?
Thank you very much

Anyway, what you want is something like the following rule:
RewriteCond %{HTTP_REFERER} !^http(s)?://plus\.google\.com [NC]
I'm not sure why your rule has a \. at the end of the URL, but it really does not seem appropriate there.
You can also reduce it to one rule through the http(s)? part of the new rule.
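If you want to do the whole whitelist in .htaccess, here is a minimal sketch putting it together for all three referrers (untested; mydomain.com is a placeholder for your own site, and the extension list is an assumption about which files your script protects):
RewriteEngine On
# Allow empty referers (direct visits, some proxies and privacy tools)
RewriteCond %{HTTP_REFERER} !^$
# Whitelist your own site and the three requested referrers
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mydomain\.com [NC]
RewriteCond %{HTTP_REFERER} !^https?://plus\.google\.com [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?facebook\.com [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?pinterest\.com [NC]
RewriteRule \.(jpe?g|png|gif)$ - [F]
Because the conditions are ANDed, an image request is forbidden only when the referer is non-empty and matches none of the whitelisted hosts.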
But please, think really hard about what you want to do there. Instead of preventing hotlinking by Google, you should either consider robots.txt rules or just allow Google to link to your images. Everything else can (and, with some Google update, probably will) harm your site's rank in Google, as you are using something that could easily be filed under 'cloaking', which in turn would get your whole page marked as spam in the Google index. You can read more on the subject here: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355

Related

.htaccess hacked? google returns internal server error

I have been struggling with this for weeks and would greatly appreciate any help. When I do a Google search on my website, http://www.example.com, I get an internal server error. It took weeks, but I finally got hold of someone from Google, and they said it wasn't them; I needed to call GoDaddy. So I called GoDaddy, and they said I have a virus on my site but that for $300 they would clean it. I downloaded Wordfence and ran it; it cleaned up some items and says everything is now clear, but I'm still getting an internal server error.
I have pasted my htaccess file here
https://pastebin.com/NRDdFfZ0
and am wondering about the first three lines
RewriteCond %{HTTP_USER_AGENT} (google|yahoo|msn|aol|bing) [OR]
RewriteCond %{HTTP_REFERER} (google|yahoo|msn|aol|bing)
RewriteRule ^(.*)$ maggoty-haroun.php?$1 [L]
I do have a PHP file called maggoty-haroun.php in my main site files; it just strikes me as an odd name.
https://pastebin.com/qnDu8f0k
We are a small restaurant, in a small town, badly hit by the pandemic, and we have been closed (curbside and delivery only) for months. Not being able to be found on Google is going to be a killing blow. Is there anyone who can help?
Any help is greatly appreciated.
RewriteCond %{HTTP_USER_AGENT} (google|yahoo|msn|aol|bing) [OR]
RewriteCond %{HTTP_REFERER} (google|yahoo|msn|aol|bing)
RewriteRule ^(.*)$ maggoty-haroun.php?$1 [L]
You need to remove these directives.
These directives may simply result in a rewrite loop, hence your 500 Internal Server Error, so they may not have done anything too malicious beyond making your site inaccessible (which is bad enough).
However, here is what they are trying to do: for any request that comes from a search engine crawler (User-Agent) or from someone clicking on a result in the SERPs (Referer), internally rewrite the request to maggoty-haroun.php, passing the requested URL-path in the query string (although, due to the rewrite loop, they end up just passing the same URL, i.e. maggoty-haroun.php).
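To see why the loop occurs, here is a walk-through of a single request against the three quoted lines (the rest of the pasted file is ignored; the example request is hypothetical):
# Example request: GET /menu.html with Referer: https://www.google.com/
# 1. The User-Agent condition fails, but the Referer condition matches.
RewriteCond %{HTTP_USER_AGENT} (google|yahoo|msn|aol|bing) [OR]
RewriteCond %{HTTP_REFERER} (google|yahoo|msn|aol|bing)
# 2. The rule rewrites the request to maggoty-haroun.php?menu.html.
RewriteRule ^(.*)$ maggoty-haroun.php?$1 [L]
# 3. [L] only ends the current pass. mod_rewrite starts over on the rewritten
#    URL, ^(.*)$ matches maggoty-haroun.php itself, the Referer condition
#    still matches, and the cycle repeats until Apache hits its internal
#    recursion limit and returns a 500 Internal Server Error.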
This can only be malicious: had it worked, it would have de-indexed your pages in the search engines (and potentially damaged your ranking by indexing "other" content) and prevented anyone from reaching your site.
However, unless your site is now "clean", you can't be sure that these directives won't be added back again, so you need to keep a close eye on it.
If these directives are simply resulting in a 500 error, then your site should bounce back (since a 500 error is considered "temporary" by search engines), providing this has not been the case for too long.
I have pasted my htaccess file here ...
Wow, 4000+ lines of blocking user agents and IP addresses!?

How to whitelist Google, Bing, Yahoo and popular browsers user-agent?

I know this may not be the best practice for dealing with the problem at hand, but please answer this question rather than trying to convince me of another strategy.
I want to develop a .htaccess file to block any visitors with user agents that DO NOT contain: Mozilla, googlebot, yahoo, bing, Chrome
You can reverse the rules using negation in RewriteCond like this:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|msnbot|Slurp) [NC]
RewriteRule .* - [F]
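For the exact list in the question, the same pattern applies. A sketch (untested; note that almost every modern browser, and many bots, send "Mozilla" in their user agent, so this whitelist is very permissive):
RewriteEngine On
# Forbid any request whose User-Agent contains none of these strings
RewriteCond %{HTTP_USER_AGENT} !(Mozilla|Googlebot|Yahoo|Bing|Chrome) [NC]
RewriteRule .* - [F]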

htaccess auto redirect when attempting to view content directly

I currently administer an art website that contains lots of photos and other content files, and it bugs me that people find a way around the scripting and access the files directly; they download our copyright-protected materials.
I was thinking about an .htaccess file that does the following:
someone types an address directly into the browser: http://www.mydomain.com/photos/photo.jpg
.htaccess triggers and, instead of showing the content, redirects right away to http://www.mydomain.com/ (it is important that the redirect happens before the picture is displayed)
the redirect is extremely important, not just blocking without a redirect; that way, if someone uses software to download content by providing a direct link to it, the request is rejected
My knowledge of .htaccess is really thin; I could use some help on this one.
This should work:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://www\.mydomain\.com/ [NC]
RewriteRule \.(jpg|gif)$ /nolinking.html [R,L]
If you try to enter http://www.mydomain.com/photos/photo.jpg, it will redirect you to http://www.mydomain.com/nolinking.html, but it will still allow images to load on pages that link to them.
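If you want the redirect to go straight to the home page instead, as the question describes, a variant sketch (mydomain.com is a placeholder; the redirect target is an HTML page, not an image, so it cannot loop):
RewriteEngine on
# Redirect directly-requested or hotlinked images to the home page
RewriteCond %{HTTP_REFERER} !^http://www\.mydomain\.com/ [NC]
RewriteRule \.(jpg|gif)$ http://www.mydomain.com/ [R,L]
Once the home page loads, its own images are requested with a matching referer, so they display normally.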

Changing a website domain and redirecting

We're moving a fairly large website from domain.one, where it's been for a long time, onto domain.two. If people still follow links to domain.one, we want to redirect them to an appropriate place on domain.two (if possible).
Domain.one is no longer required after the switch. I don't know anything about moving an entire domain, so I could use some advice on the best way to go about switching whilst retaining the SEO gained over the years.
Any help is appreciated.
Many thanks
Put this in an htaccess file in your root web directory. It will forward your users, and search engines, to the new URL on the new domain.
Options +FollowSymLinks
RewriteEngine on
RewriteRule (.*) http://www.newdomain.com/$1 [R=301,L]
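One caveat: if both domains end up pointing at the same document root during the transition, you would want to limit the redirect to requests for the old host, roughly like this (a sketch; olddomain.com stands in for domain.one):
Options +FollowSymLinks
RewriteEngine on
# Only redirect requests that arrive for the old host
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]
RewriteRule (.*) http://www.newdomain.com/$1 [R=301,L]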
Add the redirect lines from John's answer above.
Also log in to Google Webmaster Tools -> Configuration -> Change of address
and change your URL there so that search engine results are also updated.
THIS IS VERY IMPORTANT.

Rewrite URL to remove first page pagination using htaccess

I have a paginated list of articles, and my search engine is indexing the first page twice, as /articles and /articles?start=1.
I want to write an .htaccess rewrite rule to rewrite any requests for /articles?start=1 to /articles to stop this from happening.
There are a couple of other article-based paginated lists on the site, so I need to match just the parameter rather than the full URL, so that the rule will work on those URLs as well.
Thanks for your help!
Try this:
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/articles/?$
RewriteCond %{QUERY_STRING} ^start=1$
RewriteRule (.*) /$1? [R=301,L]
I am assuming articles is a directory.
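Since the question asks to match just the parameter so the other paginated lists are covered too, a broader sketch (untested) that drops start=1 on any path:
RewriteEngine on
# Redirect any URL whose query string is exactly start=1 to the same
# path; the trailing ? in the substitution strips the query string
RewriteCond %{QUERY_STRING} ^start=1$
RewriteRule (.*) /$1? [R=301,L]
The redirected URL arrives with an empty query string, so the condition no longer matches and there is no loop.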
A bit of a necro, but I just found this question via Google when looking for something else.
Recently, Google offered up way more control over how URL parameters affect their indexing of your site. Go to Google Webmaster Tools and set up your site.
Google will crawl your site and add a list of URL parameters it's aware of (and I believe you can add your own to the list). You can specify how the URL parameters affect your page (pagination, filtering, internal, etc.), which in turn tells Google how to index it.
In this case, you'll be able to say that start should be either ignored, indexed only once, or treated however you see fit to make sure it only appears how it's meant to in the index.
This helps Google and also helps you, so it's entirely win-win.
