I tried to block bad bots via .htaccess with the code below.
I know these are two ways of doing it, but neither of them is working; I still see the bots in the access log. What am I doing wrong?
RewriteCond %{HTTP_USER_AGENT} ^BLEXBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SemrushBot [NC,OR]
SetEnvIfNoCase User-Agent "BLEXBot" rotbot
SetEnvIfNoCase User-Agent "SemrushBot" rotbot
<Limit POST GET HEAD PUT>
Order Allow,Deny
Allow from all
Deny from env=rotbot
</Limit>
The entries in the access log look like this:
domain.org:443 46.229.168.142 - - [22/Jul/2019:08:56:26 +0200] "GET /path/to/page/ HTTP/1.1" 403 3801 "-" "Mozilla/5.0 (compatible; SemrushBot/3~bl; +http://www.semrush.com/bot.html)"
domain.org:443 94.130.219.232 - - [22/Jul/2019:08:56:24 +0200] "GET /path/to/page/ HTTP/1.1" 403 760 "-" "Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)"
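The user-agent strings in your log start with "Mozilla/5.0", so a pattern anchored with ^ never matches them. Drop the anchors, remove the trailing OR on the last condition, and add the missing RewriteRule:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BLEXBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SemrushBot [NC]
RewriteRule ^.* - [F,L]
</IfModule>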
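Note that the log entries in the question already show status 403, so the SetEnvIfNoCase block is denying these requests; a denied request is still written to the access log. On Apache 2.4, where Order/Allow/Deny are deprecated, a sketch of the same environment-variable approach might look like this (the variable name rotbot is from the question; the rest assumes mod_setenvif and mod_authz_core are available):

```apache
# Tag requests whose User-Agent contains either bot name (case-insensitive, unanchored).
SetEnvIfNoCase User-Agent "BLEXBot" rotbot
SetEnvIfNoCase User-Agent "SemrushBot" rotbot

# Apache 2.4 authorization: allow everyone except tagged requests.
<RequireAll>
    Require all granted
    Require not env rotbot
</RequireAll>
```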
Related
My .htaccess blocks all user agents and only allows the ones I need.
I also need to let Cloudflare through. How can I allow it without matching on "Mozilla"?
This is the user agent I get:
Mozilla/5.0 (compatible; CloudFlare-AlwaysOnline/1.0; +http://www.cloudflare.com/always-online)
RewriteEngine on
AuthType Basic
AuthName "private"
AuthUserFile "/home/example/.htpasswds/public_html/exemple/passwd"
require valid-user
#-only allow-#
SetEnvIf User-Agent .0011 0011
Order deny,allow
Deny from all
Allow from env=0011
#-index only open for 0011-#
Options +Indexes
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{HTTP_USER_AGENT} !0011 [NC]
RewriteRule . - [F]
You can use:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} !CloudFlare-AlwaysOnline [NC]
RewriteRule ^ - [F]
But do not do that: for all normal requests, Cloudflare passes along the visitor's own browser user agent, so this rule would block regular visitors.
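If the goal is only to add Cloudflare's AlwaysOnline crawler to the existing allow list, one option is to tag it with an environment variable, as the question already does for its own agent. A minimal sketch, assuming the Apache 2.2-style access directives from the question; the variable name allow_cf is made up for illustration, and a user agent can be spoofed by anyone:

```apache
# Existing allow rule from the question (variable 0011).
SetEnvIf User-Agent .0011 0011
# Hypothetical extra variable for Cloudflare's AlwaysOnline crawler.
SetEnvIfNoCase User-Agent "CloudFlare-AlwaysOnline" allow_cf

Order deny,allow
Deny from all
Allow from env=0011
Allow from env=allow_cf
```

With the question's "require valid-user" still in place, a password is required as well unless Satisfy Any is set.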
I am getting error 500 in my logs when using the rewrite rules below to block by user agent and referrer URL. I have tried removing the referrer block, but I still get error 500 for the user-agent block. Traffic that is not blocked works properly; I only get 500 for the bots and referrers that are blocked. Any idea why the log shows 500 instead of 403?
# Block by REFURL
RewriteEngine on
RewriteCond %{HTTP_REFERER} sample\.com [NC]
RewriteRule .* - [F]
# Block by User Agent
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} bot1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} bot2 [NC]
RewriteRule .* - [F]
# BLOCK BLANK USER AGENTS
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule ^ - [F]
Here is a sample entry from the log. My server uses cPanel, so I pulled this from the raw access log. I only get an error log entry when an IP has been blocked. It doesn't show anything for the 500 errors.
xx.xx.xxx.xxx - - [02/Sept/2014:21:54:25 -0400] "GET / HTTP/1.0" 500 - "-" "Mozilla/5.0 (compatible; BadBot/5.0; +http://badbot.com/robot/)"
Even after setting up the default page in the view settings of Google Analytics, I want a request for /index.cfm to redirect to /.
RewriteEngine On
RewriteCompatibility2 On
RepeatLimit 200
RewriteBase
# unsupported directive: [ISAPI_Rewrite]
RewriteCond %{HTTP:Host} ^website_name\.com$ [NC]
RewriteRule ^/products.html$ http\://www.website_name.com/products.cfm [L,R=301]
RewriteCond %{REQUEST_URI} /index\.cfm?$ [NC]
RewriteRule ^(.*)index\.cfm?$ "/$1" [NC,R=301,NE,L]
ErrorDocument 404 /404.cfm
ErrorDocument 500 / http://www.website_name.com
How can I ban requests from Pingdom Tools?
Their hostnames look like this:
s464.pingdom.com
So how can I ban all hostnames ending with pingdom.com?
You could use a rewrite rule.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*tumblr\.com [NC]
RewriteRule .* - [F]
</IfModule>
http://davidwalsh.name/block-domain
Note: This blocks based on referer, which can be spoofed or left out entirely.
Update: On servers that do reverse DNS lookups you can try:
deny from .pingdom.com
https://httpd.apache.org/docs/2.0/mod/mod_access.html#allow
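On Apache 2.4 the deny directive is superseded by mod_authz_host. A sketch of the equivalent, assuming the server can resolve client IPs through reverse DNS (Require host performs a double reverse lookup, which adds latency to each request):

```apache
<RequireAll>
    Require all granted
    # Matches pingdom.com and any subdomain such as s464.pingdom.com.
    Require not host pingdom.com
</RequireAll>
```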
RewriteEngine on
# Options +FollowSymlinks
# No [OR] on the last condition: a trailing OR makes the rule apply to every request.
RewriteCond %{HTTP_REFERER} somesite\.com [NC]
RewriteRule .* - [F]
Well, since it's not easy to ban the hostname, I just banned the user agent:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Pingdom.com
RewriteRule ^(.*)$ http://go.away/
Works fine.
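Redirecting to a made-up host works, but the crawler still receives a redirect response and a Location header. A sketch that returns 403 directly instead, assuming the user agent really starts with Pingdom.com as in the rule above:

```apache
RewriteEngine on
# Escape the dot so it matches a literal "." rather than any character.
RewriteCond %{HTTP_USER_AGENT} ^Pingdom\.com [NC]
# "-" means no substitution; F sends 403 Forbidden.
RewriteRule ^ - [F]
```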
A strange bot (GbPlugin) is URL-encoding the URLs of my images and causing 404 errors.
I tried to block the bot by putting this at the bottom of my .htaccess, but it didn't work.
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^GbPlugin [NC]
RewriteRule .* - [F,L]
The log entry is below.
201.26.16.9 - - [10/Sep/2011:00:06:05 -0300] "GET /wp%2Dcontent/themes/my_theme%2Dpremium/scripts/timthumb.php%3Fsrc%3Dhttp%3A%2F%2Fwww.example.com%2Fwp%2Dcontent%2Fuploads%2F2011%2F08%2Fmy_image_name.jpg%26w%3D100%26h%3D65%26zc%3D1%26q%3D100 HTTP/1.1" 404 1047 "-" "GbPlugin"
Sorry for my language mistakes
Here's what you can put in your .htaccess file:
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
SetEnvIfNoCase Referer "^$" bad_user
SetEnvIfNoCase User-Agent "^GbPlugin" bad_user
SetEnvIfNoCase User-Agent "^Wget" bad_user
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_user
SetEnvIfNoCase User-Agent "^EmailWolf" bad_user
SetEnvIfNoCase User-Agent "^libwww-perl" bad_user
Deny from env=bad_user
This will return:
HTTP request sent, awaiting response... 403 Forbidden
2011-09-10 11:15:48 ERROR 403: Forbidden.
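For comparison, the mod_rewrite rule from the question can be made to work as well. The logged request has an empty referer ("-"), and the question's first condition (HTTP_REFERER !^$) requires a non-empty one, so the rule never fired. A minimal sketch with that condition dropped:

```apache
RewriteEngine On
# Block requests with an empty User-Agent or one starting with GbPlugin.
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^GbPlugin [NC]
RewriteRule ^ - [F,L]
```

Unlike the SetEnvIfNoCase Referer "^$" line above, this does not block visitors who arrive without a referer.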
May I recommend this method:
Put this in the .htaccess in the root of your site.
ErrorDocument 503 "Your connection was refused"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^(Mozilla.*537.36|Mozilla.*UCBrowser\/9.3.1.344)$ [NC]
RewriteRule .* - [R=503,L]
Where
^(Mozilla.*537.36|Mozilla.*UCBrowser\/9.3.1.344)$
are the two useragents I wanted to block in this example case.
You can use regex so a useragent like
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0
could be
Mozilla.*Firefox\/40.0
^ means match from the beginning and $ means match to the end, so you could block just one user agent with:
ErrorDocument 503 "Your connection was refused"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Firefox\/40.0$ [NC]
RewriteRule .* - [R=503,L]
Or add several using the | character to separate them inside ( and ) like in the first example.
RewriteCond %{HTTP_USER_AGENT} ^(Mozilla.*537.36|Mozilla.*UCBrowser\/9.3.1.344)$ [NC]
You can test it by putting your own user agent in the code and then trying to access the site. You can look up your user agent at http://whatsmyuseragent.com/
To block empty referers, you can use the following rule:
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^$
RewriteRule ^ - [F,L]
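This will forbid every request to your site whose HTTP_REFERER value is empty (pattern ^$). Note that direct visits and bookmarked pages also arrive without a referer, so this blocks those visitors too.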
To block user agents, you can use
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} opera|firefox|foo|bar [NC]
RewriteRule ^ - [F,L]
This will forbid all requests to your site if HTTP_USER_AGENT matches the Condition pattern.