How to allow Google Doc to display hotlinked files? - .htaccess

I have restricted hotlinking of my files using .htaccess, but I need those files to be displayed in Google Docs Viewer.
I added a condition to let Google Docs hotlink my files, but it is not working. Please help me with this.
Below is the code I used in my .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomainname.com/ [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?docs.google.com [NC]
RewriteCond %{REQUEST_URI} !hotlink\.(gif|png|jpg|doc|ppt|xls|pdf|html|htm|xlsx|docx|pptx|swf) [NC]
RewriteRule .*\.(gif|png|jpg|doc|ppt|xls|pdf|html|htm|xlsx|docx|pptx|swf)$ http://mydomainname.com/ [NC]

Relying on the referrer will probably not be very successful for https URLs, because browsers usually omit the Referer header when going from an https page to an http one.
Note also that the files shown in Google Docs are not fetched from your server by the browser, but by a server process at Google.
I had the same problem, and the trick I found was to allow a specific User-Agent:
RewriteCond %{HTTP_USER_AGENT} !(.*Feedfetcher-Google.*)
Of course it is easily spoofable, but in "normal usage" your hotlink protection will still work.
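For context, here is a minimal sketch of how that User-Agent exception might sit inside a complete hotlink rule set. It reuses the placeholder domain from the question and assumes the fetcher still identifies itself as Feedfetcher-Google, which may no longer be true. It also swaps the redirect to the home page for a plain 403 via [F], which is a simplification and not what the question used.

```apache
RewriteEngine On
# Allow requests coming from the site's own pages (http or https)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mydomainname\.com/ [NC]
# Allow the Google fetcher by User-Agent (assumed name, easily spoofed)
RewriteCond %{HTTP_USER_AGENT} !Feedfetcher-Google [NC]
# Everything else asking for a protected file type gets 403 Forbidden
RewriteRule \.(gif|png|jpg|doc|ppt|xls|pdf|xlsx|docx|pptx|swf)$ - [F,NC]
```

Because the conditions are ANDed, a request is blocked only when it fails both checks: it does not come from the site and it is not the allowed user agent.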

Oliver S is correct, but the name of the user agent may have changed since then.
I tried
RewriteCond %{HTTP_USER_AGENT} !(.*Google.*)
and it worked perfectly.

Related

301 redirect with GET variables

Good morning all,
When I search for the name of my site on Google, I end up with lots of links like mysite.com/?page=1
mysite.com/?page=2
etc.
I would like to 301-redirect all of these links ending in mysite.com/?page=X
to mysite.com
because I am afraid that Google will see them as duplicate content, since each of them displays the home page of my site.
I tried
RewriteCond %{QUERY_STRING} ^page=1(&|$) [NC]
RewriteRule ^(mysite)/?$ /$1? [R=301,L]
but it doesn't work on my side.
Could you help me?
Thanks in advance,
To redirect such requests this should be what you are looking for:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)page=\d+(?:&|$) [NC]
RewriteRule ^/?$ / [QSD,R=301,END]
Or a more general variant that preserves the requested path:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)page=\d+(?:&|$) [NC]
RewriteRule ^ %{REQUEST_URI} [QSD,R=301,END]
Keep in mind, however, that even with such a redirect, those links are still being generated somewhere. Google does not make them up. So to fix the cause and not just a symptom, you will have to find where they come from.
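The QSD and END flags used above need Apache 2.4. As a hedged fallback for older servers, the same home-page redirect can be sketched with a trailing question mark on the target, which drops the query string, and the plain L flag:

```apache
RewriteEngine On
# Match a "page=<digits>" parameter anywhere in the query string
RewriteCond %{QUERY_STRING} (?:^|&)page=\d+(?:&|$) [NC]
# The trailing "?" on the target discards the original query string
RewriteRule ^/?$ /? [R=301,L]
```

It may be safer to test with R=302 first, since browsers cache 301 responses aggressively.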

Redirect incoming referral link from Youtube

A YouTube user has linked to my WordPress home page from a video.
However, I want the incoming traffic to go to another page.
I tried this in .htaccess:
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_REFERER} http\://www\.youtube\.com/watch\?v\=xxxxxxxxxx [NC]
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule ^(.*)$ http://www.domain.com/abc/post/ [R=302,L]
I am 100% sure the code is correct, yet it does not work. Are there any alternatives to the code above? Perhaps something in PHP that I can use in the header or index file?
For one thing, YouTube links are usually https, so you should add a check for that:
RewriteCond %{HTTP_REFERER} https?\://www\.youtube\.com/watch\?v\=xxxxxxxxxx [NC]
The other thing is that you need a space between the end of the video ID and the [NC] flag.
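Two further points may be worth checking; both are assumptions on my part, not something confirmed in the thread. Browsers generally keep the original Referer when following a redirect, so a rule that matches every path can redirect the target page again and loop. Modern browsers also often send only the origin (no /watch?v=... part) as the referrer to other sites. A sketch that sidesteps both by matching any YouTube referrer and redirecting only the home page, using the /abc/post/ path from the question:

```apache
RewriteEngine On
# Match any referrer from youtube.com, over http or https
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?youtube\.com/ [NC]
# Redirect only the home page, so /abc/post/ itself is never redirected again
RewriteRule ^/?$ /abc/post/ [R=302,L]
```

In a WordPress .htaccess this block would need to come before the standard WordPress rewrite rules.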

Why does Google add the notice "this site may be compromised"?

This morning, a lot of my websites were tagged "this site may be compromised" by Google in its results. These sites are under my supervision on my own VPS. I've run a deep scan on it and found nothing unusual. I've looked for suspicious .htaccess files and for JavaScript injection and found nothing wrong so far.
Yesterday, I put an .htaccess file in my web root to ensure that no SQL, JavaScript, base64 or other suspicious injection attempt can reach my server.
So I suspect that Google added "this site may be compromised" because I added this protection to all my websites.
Here is the content of this .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/robots.txt
RewriteCond %{REQUEST_URI} !^/sitemap.xml
RewriteCond %{HTTP_USER_AGENT} ^-?$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^[bcdfghjklmnpqrstvwxz\ ]{8,}|^[0-9a-z]{15,}|^[0-9A-Za-z]{19,}|^[A-Za-z]{3,}\ [a-z]{4,}\ [a-z]{4,} [OR]
RewriteCond %{HTTP_USER_AGENT} ^<sc|<\?|^adwords|#nonymouse|Advanced\ Email\ Extractor|almaden|anonymous|Art-Online|autoemailspider|blogsearchbot-martin|CherryPicker|compatible\ \;|Crescent\ Internet\ ToolPack|Digger|DirectUpdate|Download\ Accelerator|^eCatch|echo\ extense|EmailCollector|EmailWolf|Extractor|flashget|frontpage|Go!Zilla|grub\ crawler|HTTPConnect|httplib|HttpProxy|HTTP\ agent|HTTrack|^ia_archive|IDBot|id-search|Indy\ Library|^Internet\ Explorer|^IPiumBot|Jakarta\ Commons|^Kapere|Microsoft\ Data|Microsoft\ URL|^minibot\(NaverRobot\)|^Moozilla|^Mozilla$|^MSIE|MJ12bot|Movable\ Type|NICErsPRO|^NPBot|Nutch|Nutscrape/|^Offline\ Explorer|^Offline\ Navigator|OmniExplorer|^Program\ Shareware|psycheclone|PussyCat|PycURL|python|QuepasaCreep|SiteMapper|Star\ Downloader|sucker|SurveyBot|Teleport\ Pro|Telesoft|TrackBack|Turing|TurnitinBot|^user|^User-Agent:\ |^User\ Agent:\ |vobsub|webbandit|WebCapture|webcollage|WebCopier|WebDAV|WebEmailExtractor|WebReaper|WEBsaver|WebStripper|WebZIP|widows|Wysigot|Zeus|Zeus.*Webster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^curl|^Fetch\ API\ Request|GT\:\:WWW|^HTTP\:\:Lite|httplib|^Java/1.|^Java\ 1.|^LWP|libWeb|libwww|^PEAR|PECL\:\:HTTP|PHPCrawl|python|Rsync|Snoopy|^URI\:\:Fetch|WebDAV|^Wget [NC]
RewriteRule (.*) - [F]
RewriteCond %{REQUEST_METHOD} (GET|POST) [NC]
RewriteCond %{QUERY_STRING} ^(.*)(%3C|<)/?script(.*)$ [NC,OR]
RewriteCond %{QUERY_STRING} ^(.*)(%3D|=)?javascript(%3A|:)(.*)$ [NC,OR]
RewriteCond %{QUERY_STRING} ^(.*)document\.location\.href(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)(%3D|=)http(%3A|:)(/|%2F){2}(.*)$ [NC,OR]
RewriteCond %{QUERY_STRING} ^(.*)base64_encode(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)GLOBALS(=|\[|%[0-9A-Z]{0,2})(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)_REQUEST(=|\[|%[0-9A-Z]{0,2})(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)(SELECT(%20|\+)|UNION(%20|\+)ALL|INSERT(%20|\+)|DELETE(%20|\+)|CHAR\(|UPDATE(%20|\+)|REPLACE(%20|\+)|LIMIT(%20|\+))(.*)$ [NC]
RewriteRule (.*) - [F]
There are a lot of hacking-related keywords in this file. Is there any way that Google might be looking into the .htaccess file?
Should I block Google with a robots.txt entry for this .htaccess only, or could/should I add a line directly to the .htaccess to stop Google from scanning this file?
What do you think?
If .htaccess is visible from outside, then you have a serious problem. That file should never be visible to anybody accessing the site through HTTP. Blocking it in robots.txt would only prevent well-behaved bots from looking at it; bots that ignore robots.txt would still have access.
If you suspect that your .htaccess is the cause of the problem, you need to make sure that it can't be served. That's the default on Apache, but if you were mucking around with permissions I suppose you could have exposed it. If you did, you need to fix that.
I think you need to look somewhere else for the cause of Google's "this site may be compromised" message. A Google (or Bing) search on [this site may be compromised] reveals lots of information about why that warning might appear.
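For reference, the default protection mentioned above looks roughly like this in a stock Apache 2.4 main configuration. It is shown as a sketch to compare against your own server config, not as something to paste into .htaccess blindly:

```apache
# Refuse to serve .htaccess, .htpasswd and any other file starting with ".ht"
<Files ".ht*">
    Require all denied
</Files>
```

Requesting /.htaccess in a browser should return 403 Forbidden when this block is in effect.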

.htaccess which somehow prevents Google from giving my domains any PageRank

I have been using the following .htaccess code on all the domains of some projects for more than two years, but none of the websites built with it has ever received any Google PageRank, not even one bar. On all the websites where I don't use this code, I get a reasonable PageRank.
Could you tell me what I am doing wrong?
RewriteEngine On
RewriteBase /
# rewrite the non 'www' addresses
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
# rewrite REQUEST_URI
RewriteCond %{HTTP_HOST} ^www\.example\.com [OR]
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) index.php [L]
Some of my websites using this .htaccess:
http://www.kampril.bg/
http://www.milleniumbg.eu/
Register these domains in the Google Search Console and check whether Google returns some error messages or some feedback about these. Submit some sitemaps.
If you do not see any error messages or warnings, then it simply means Google does not find the content of your websites interesting.
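One thing that may be worth ruling out, offered as an assumption and not a diagnosis: the catch-all rule in the question sends every request to index.php, including robots.txt, sitemaps, images and stylesheets. A common front-controller sketch skips files and directories that really exist:

```apache
RewriteEngine On
RewriteBase /
# Redirect the bare domain to the www host
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Hand a request to index.php only if it is not an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
```

With this variant a real robots.txt or sitemap.xml in the web root is served directly and does not go through index.php.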

Hotlinking protection

OK, so I have a couple of working rules:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://site.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://site.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.site.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.site.com$ [NC]
RewriteRule .*\.(jpg|jpeg|gif|png|bmp|mp3|wav)$ - [F,NC]
These prevent hotlinking from other sites, but when I viewed my own gallery, which displays the JPGs, the images disappeared. Is it possible to still use the images on my own site while using hotlink protection?
You can use this tool http://www.htaccesstools.com/hotlink-protection/
Use this generator to create a .htaccess file for hotlink protection of your images and pictures. Hotlink protection can save you lots of bandwidth by preventing other sites from displaying your images.
After you have created a .htaccess for hotlink protection, you can use the tool to test hotlink protection and make sure that you prevent hotlink.
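As a hedged sketch of what such a generated file might contain, the four referrer conditions from the question can be collapsed into one. This version also accepts https and lets through requests with no Referer at all. Whether either of those explains the disappearing gallery is an assumption; site.com is the placeholder from the question.

```apache
RewriteEngine On
# Let through requests that send no Referer (some browsers and proxies strip it)
RewriteCond %{HTTP_REFERER} !^$
# Let through site.com and www.site.com, over http or https
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?site\.com(/|$) [NC]
# Any other referrer asking for these file types gets 403 Forbidden
RewriteRule \.(jpg|jpeg|gif|png|bmp|mp3|wav)$ - [F,NC]
```

If the gallery is served from a different hostname or subdomain, that host would need its own condition.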
