Can't block AspiegelBot in htaccess or robots.txt - .htaccess

I have issues with AspiegelBot crawling one of the sites on a server, this results in a lot of cores getting used up. I've been trying to block the bot in both in the sites htaccess with no sucess. The bot still constantly appears in my access.log
114.119.165.232 - - [20/Apr/2020:07:38:40 +0200] "GET /tillbehor.html?size=98%2C422%2C423%2C1129%2C1378 HTTP/1.1" 301 296 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; AspiegelBot)"
Here is some of what I've tried:
htaccess
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^.(Mb2345Browser|AspiegelBot|LieBaoFast|MicroMessenger|zh-CN|Kinza|Mb2345Browser).$ [NC]
RewriteRule .* - [F,L]
robots.txt
User-agent: *
Allow: /
Disallow: */shopby
#######################################
################ PAGES ################
#######################################
Disallow: /privacy-policy-cookie-restriction-mode/
Disallow: /terms/
#######################################
############# Block Bots ##############
#######################################
User-agent: MJ12bot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: SemrushBot-SA
Disallow: /
User-agent: rogerbot
Disallow:/
User-agent: dotbot
Disallow:/
User-agent: AhrefsBot
Disallow: /
User-agent: Alexibot
Disallow: /
User-agent: SurveyBot
Disallow: /
User-agent: Xenu's
Disallow: /
User-agent: Xenu's Link Sleuth 1.1c
Disallow: /
User-agent: AspiegelBot
Disallow: /
Am I missing something or writing something incorrectly? Ï'm kinda at a loss here.

I posted this as a comment but seeing as it's what solved this for me I will add it as an answer.
I managed to get the bot blocked by blocking the starting IP sequence in the htaccess file. It might not be optimal way to do it but it worked.
Deny from 114.119.0.0/16

Try adding this to your .htaccess
Options All -Indexes
RewriteEngine on
# Block Bad Bots & Scrapers
SetEnvIfNoCase User-Agent "^AspiegelBot" bad_bot
SetEnvIfNoCase User-Agent "Aboundex" bad_bot
SetEnvIfNoCase User-Agent "80legs" bad_bot
SetEnvIfNoCase User-Agent "360Spider" bad_bot
SetEnvIfNoCase User-Agent "^Java" bad_bot
SetEnvIfNoCase User-Agent "^Cogentbot" bad_bot
SetEnvIfNoCase User-Agent "^Alexibot" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^attach" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot" bad_bot
SetEnvIfNoCase User-Agent "^BackWeb" bad_bot
SetEnvIfNoCase User-Agent "Bandit" bad_bot
SetEnvIfNoCase User-Agent "^BatchFTP" bad_bot
SetEnvIfNoCase User-Agent "^Bigfoot" bad_bot
SetEnvIfNoCase User-Agent "^Black.Hole" bad_bot
SetEnvIfNoCase User-Agent "^BlackWidow" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "Buddy" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^Bullseye" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^ChinaClaw" bad_bot
SetEnvIfNoCase User-Agent "Collector" bad_bot
SetEnvIfNoCase User-Agent "Copier" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Custo" bad_bot
SetEnvIfNoCase User-Agent "^AIBOT" bad_bot
SetEnvIfNoCase User-Agent "^DISCo" bad_bot
SetEnvIfNoCase User-Agent "^DIIbot" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Download\ Demon" bad_bot
SetEnvIfNoCase User-Agent "^Download\ Devil" bad_bot
SetEnvIfNoCase User-Agent "^Download\ Wonder" bad_bot
SetEnvIfNoCase User-Agent "^dragonfly" bad_bot
SetEnvIfNoCase User-Agent "^Drip" bad_bot
SetEnvIfNoCase User-Agent "^eCatch" bad_bot
SetEnvIfNoCase User-Agent "^EasyDL" bad_bot
SetEnvIfNoCase User-Agent "^ebingbong" bad_bot
SetEnvIfNoCase User-Agent "^EirGrabber" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^Exabot" bad_bot
SetEnvIfNoCase User-Agent "^Express\ WebPictures" bad_bot
SetEnvIfNoCase User-Agent "Extractor" bad_bot
SetEnvIfNoCase User-Agent "^EyeNetIE" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^flunky" bad_bot
SetEnvIfNoCase User-Agent "^FrontPage" bad_bot
SetEnvIfNoCase User-Agent "^Go-Ahead-Got-It" bad_bot
SetEnvIfNoCase User-Agent "^gotit" bad_bot
SetEnvIfNoCase User-Agent "^GrabNet" bad_bot
SetEnvIfNoCase User-Agent "^Grafula" bad_bot
SetEnvIfNoCase User-Agent "^Harvest" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^HMView" bad_bot
SetEnvIfNoCase User-Agent "^HTTrack" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^IlseBot" bad_bot
SetEnvIfNoCase User-Agent "^Image\ Stripper" bad_bot
SetEnvIfNoCase User-Agent "^Image\ Sucker" bad_bot
SetEnvIfNoCase User-Agent "Indy\ Library" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^InfoTekies" bad_bot
SetEnvIfNoCase User-Agent "^Intelliseek" bad_bot
SetEnvIfNoCase User-Agent "^InterGET" bad_bot
SetEnvIfNoCase User-Agent "^Internet\ Ninja" bad_bot
SetEnvIfNoCase User-Agent "^Iria" bad_bot
SetEnvIfNoCase User-Agent "^Jakarta" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^JetCar" bad_bot
SetEnvIfNoCase User-Agent "^JOC" bad_bot
SetEnvIfNoCase User-Agent "^JustView" bad_bot
SetEnvIfNoCase User-Agent "^Jyxobot" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin.Spider" bad_bot
SetEnvIfNoCase User-Agent "^Keyword.Density" bad_bot
SetEnvIfNoCase User-Agent "^larbin" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^lftp" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^likse" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a.Unix" bad_bot
SetEnvIfNoCase User-Agent "^LNSpiderguy" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^LWP::Simple" bad_bot
SetEnvIfNoCase User-Agent "^Magnet" bad_bot
SetEnvIfNoCase User-Agent "^Mag-Net" bad_bot
SetEnvIfNoCase User-Agent "^MarkWatch" bad_bot
SetEnvIfNoCase User-Agent "^Mass\ Downloader" bad_bot
SetEnvIfNoCase User-Agent "^Mata.Hari" bad_bot
SetEnvIfNoCase User-Agent "^Memo" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft.URL" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft\ URL\ Control" bad_bot
SetEnvIfNoCase User-Agent "^MIDown\ tool" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Mirror" bad_bot
SetEnvIfNoCase User-Agent "^Missigua\ Locator" bad_bot
SetEnvIfNoCase User-Agent "^Mister\ PiX" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/3.Mozilla/2.01" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla.*NEWT" bad_bot
SetEnvIfNoCase User-Agent "^NAMEPROTECT" bad_bot
SetEnvIfNoCase User-Agent "^Navroad" bad_bot
SetEnvIfNoCase User-Agent "^NearSite" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Netcraft" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^NetSpider" bad_bot
SetEnvIfNoCase User-Agent "^Net\ Vampire" bad_bot
SetEnvIfNoCase User-Agent "^NetZIP" bad_bot
SetEnvIfNoCase User-Agent "^NextGenSearchBot" bad_bot
SetEnvIfNoCase User-Agent "^NG" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^niki-bot" bad_bot
SetEnvIfNoCase User-Agent "^NimbleCrawler" bad_bot
SetEnvIfNoCase User-Agent "^Ninja" bad_bot
SetEnvIfNoCase User-Agent "^NPbot" bad_bot
SetEnvIfNoCase User-Agent "^Octopus" bad_bot
SetEnvIfNoCase User-Agent "^Offline\ Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Offline\ Navigator" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^OutfoxBot" bad_bot
SetEnvIfNoCase User-Agent "^PageGrabber" bad_bot
SetEnvIfNoCase User-Agent "^Papa\ Foto" bad_bot
SetEnvIfNoCase User-Agent "^pavuk" bad_bot
SetEnvIfNoCase User-Agent "^pcBrowser" bad_bot
SetEnvIfNoCase User-Agent "^PHP\ version\ tracker" bad_bot
SetEnvIfNoCase User-Agent "^Pockey" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^psbot" bad_bot
SetEnvIfNoCase User-Agent "^Pump" bad_bot
SetEnvIfNoCase User-Agent "^QueryN.Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^RealDownload" bad_bot
SetEnvIfNoCase User-Agent "Reaper" bad_bot
SetEnvIfNoCase User-Agent "Recorder" bad_bot
SetEnvIfNoCase User-Agent "^ReGet" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "Siphon" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^SlySearch" bad_bot
SetEnvIfNoCase User-Agent "^SmartDownload" bad_bot
SetEnvIfNoCase User-Agent "^Snake" bad_bot
SetEnvIfNoCase User-Agent "^Snapbot" bad_bot
SetEnvIfNoCase User-Agent "^Snoopy" bad_bot
SetEnvIfNoCase User-Agent "^sogou" bad_bot
SetEnvIfNoCase User-Agent "^SpaceBison" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^Sqworm" bad_bot
SetEnvIfNoCase User-Agent "Stripper" bad_bot
SetEnvIfNoCase User-Agent "Sucker" bad_bot
SetEnvIfNoCase User-Agent "^SuperBot" bad_bot
SetEnvIfNoCase User-Agent "^SuperHTTP" bad_bot
SetEnvIfNoCase User-Agent "^Surfbot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^tAkeOut" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^TurnitinBot/1.5" bad_bot
SetEnvIfNoCase User-Agent "^The.Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^Titan" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^TurnitinBot" bad_bot
SetEnvIfNoCase User-Agent "^URLy.Warning" bad_bot
SetEnvIfNoCase User-Agent "^Vacuum" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^VoidEYE" bad_bot
SetEnvIfNoCase User-Agent "^Web\ Image\ Collector" bad_bot
SetEnvIfNoCase User-Agent "^Web\ Sucker" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^Webclipping.com" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^WebEMailExtrac.*" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^WebFetch" bad_bot
SetEnvIfNoCase User-Agent "^WebGo\ IS" bad_bot
SetEnvIfNoCase User-Agent "^Web.Image.Collector" bad_bot
SetEnvIfNoCase User-Agent "^WebLeacher" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^WebReaper" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^Website\ eXtractor" bad_bot
SetEnvIfNoCase User-Agent "^Website\ Quester" bad_bot
SetEnvIfNoCase User-Agent "^Webster" bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^WebWhacker" bad_bot
SetEnvIfNoCase User-Agent "^WebZIP" bad_bot
SetEnvIfNoCase User-Agent "Whacker" bad_bot
SetEnvIfNoCase User-Agent "^Widow" bad_bot
SetEnvIfNoCase User-Agent "^WISENutbot" bad_bot
SetEnvIfNoCase User-Agent "^WWWOFFLE" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^Xaldon" bad_bot
SetEnvIfNoCase User-Agent "^Xenu" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "ZmEu" bad_bot
SetEnvIfNoCase User-Agent "^Zyborg" bad_bot
# Vulnerability Scanners
SetEnvIfNoCase User-Agent "Acunetix" bad_bot
SetEnvIfNoCase User-Agent "FHscan" bad_bot
# Aggressive Chinese Search Engine
SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
# Aggressive Russian Search Engine
SetEnvIfNoCase User-Agent "Yandex" bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>

Related

How in .htaccess enable gzip only for certain url of the site?

I have the site on yii2. I am using pretty URL.
And now I have the following settings for compressing:
<IfModule mod_deflate.c>
SetOutputFilter DEFLATE
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
SetEnvIfNoCase Request_URI \.(?:exe|t?gz|zip|bz2|sit|rar)$ no-gzip dont-vary
# Make proxies work as they should.
<IfModule mod_headers.c>
Header append Vary User-Agent
</IfModule>
</IfModule>
How can I enable compressing for url foo1/bar1 and foo2/bar2?

Apache htaccess order of compression and cache

I have seen examples Apache htaccess file code where compression is before cache and visa versa. Which is correct? I use redbot.org quite a bit and it seems compressing after cache produces fewer issues.
You may compress with gzip compression in this way:
<ifModule mod_deflate.c>
<filesMatch ".(css|js|x?html?|php)$">
SetOutputFilter DEFLATE
</filesMatch>
</ifModule>
To cache files, you could do this:
<ifModule mod_headers.c>
<filesMatch ".(ico|jpe?g|png|gif|swf)$">
Header set Cache-Control "max-age=2592000, public"
</filesMatch>
<filesMatch ".(css)$">
Header set Cache-Control "max-age=604800, public"
</filesMatch>
<filesMatch ".(js)$">
Header set Cache-Control "max-age=216000, private"
</filesMatch>
<filesMatch ".(x?html?|php)$">
Header set Cache-Control "max-age=600, private, must-revalidate"
</filesMatch>
</ifModule>
To more informations, visit this page:
http://samaxes.com/2009/01/more-on-compressing-and-caching-your-site-with-htaccess/

.htaccess file has no effect

I have read about 10 different articles explaining how to write an .htaccess file. I've followed their explanations and put the .htaccess.txt file in the root directory. But still it won't redirect my site from non-www to www.
Here's my .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^mecoder.co.il$ [NC]
RewriteRule ^(.*)$ http://www.mecoder.co.il/$1 [L,R=301]
<FilesMatch "\.(jpg|png)$">
Header set Cache-Control "public, max-age=321408000"
</FilesMatch>
<ifModule mod_headers.c>
ExpiresActive On
ExpiresDefault A86400
<FilesMatch "\.(ico|gif|jpg|jpeg\png|flv|pdf|swf|mov|mp3|wmv|ppt)$">
ExpiresDefault A1814400
Header append Cache-Control "public"
</FilesMatch>
<FilesMatch "\.(xml|txt|html)$">
ExpiresDefault A259200
Header append Cache-Control "proxy-revalidate"
</FilesMatch>
<FilesMatch "\.(js|css)$>
ExpiresDefault A10800
Header append Cache-Control "proxy-revalidate"
</FilesMatch>
<FilesMatch "\.(php|cgi|pl)$">
ExpiresDefault A0
Header set Cache-Control "no-store, no-cache, must revalidate, max-age=0"
Header set Pragma "no-cache"
</FilesMatch>
</IfModule>
<ifModule mod_deflate.c>
<FilesMatch "\.(js|css|html|htm|php|xml)$">
SetOutputFilter DEFLATE
</FilesMatch>
</IfModule>
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)(\.gz)?$">
Header unset ETag
FileETag None
</FilesMatch>
And here is my site
First, it's not .htaccess.txt but .htaccess (without .txt).
If that doesn't work, you must check if overriding with .htaccess file is enabled in your server configuration. To do that, you should add AllowOverride All to your vhost or apache configuration.

How to disable single file from being cached

I have a file called xxx.php and in htaccess i have
<ifModule mod_deflate.c>
<filesMatch "\.(js|css|html|php)$">
SetOutputFilter DEFLATE
</filesMatch>
</ifModule>
how can i disable xxx.php file being cached?
Use a negative lookahead to exclude xxx.php:
<FilesMatch "^(?!xxx\.php$).*\.(js|css|html|php)$">
SetOutputFilter DEFLATE
</FilesMatch>

CakePHP htaccess redirecting to live site when running site locally

I am working on a website which is using CakePHP. I need to start adding some features to this site so i pulled everything from the live site to my development machine(mac os lion with MAMP server). When i try to load up the site (localhost/example.com/), it automatically redirects me to the live site(www.example.com).
Here is the root htaccess file contents:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteRule ^$ webroot/ [L]
RewriteRule (.*) webroot/$1 [L]
</IfModule>
When i comment out the first two RewriteCond and first RewriteRule, my machine throws a 500 internal server error with no apache/php log output.
How can i run this website locally without automatically redirecting to the live site?
Is there any additional information i could provide?
edit: here is my document_root: /Applications/MAMP/htdocs. shouldn't it be /Applications/MAMP/htdocs/example.com?
edit: here is the /webroot/.htaccess file:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule img/([0-9]*x[0-9]*)/(.*)/(.*)$ files/image/$2/$1/$3 [QSA,L]
RewriteRule (js|css)/vendors/(.*)$ vendors.php?type=$1&file=$2 [QSA,L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?url=$1 [QSA,L]
</IfModule>
AddType application/x-javascript .js
AddType text/css .css
<IfModule mod_deflate.c>
# compress content with type html, text, and css
AddOutputFilterByType DEFLATE text/css text/javascript application/javascript application/x-javascript text/js
SetOutputFilter DEFLATE
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
# Don't compress images/flash
SetEnvIfNoCase Request_URI \
\.(?:swf|flv)$ no-gzip dont-vary
# Don't compress images
# SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
# SetEnvIfNoCase Request_URI \/image\.php no-gzip dont-vary
# Also don't compress PDF and Flash-files 17-01-2004 MM
SetEnvIfNoCase Request_URI \.pdf$ no-gzip dont-vary
<IfModule mod_headers.c>
# properly handle requests coming from behind proxies
Header append Vary User-Agent
</IfModule>
</IfModule>
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType text/css "access plus 10 years"
ExpiresByType text/js "access plus 10 years"
ExpiresByType text/javascript "access plus 10 years"
ExpiresByType application/x-javascript "access plus 10 years"
ExpiresByType application/javascript "access plus 10 years"
ExpiresByType image/png "access plus 10 years"
ExpiresByType image/gif "access plus 10 years"
ExpiresByType image/jpeg "access plus 10 years"
ExpiresByType video/x-flv "access plus 10 years"
</IfModule>
FileETag none
<IfModule mod_headers.c>
# 1 YEAR
<FilesMatch "\.(ico|pdf|flv)$">
Header set Cache-Control "max-age=29030400, public"
</FilesMatch>
# 1 WEEK
<FilesMatch "\.(jpg|jpeg|png|gif|swf)$">
Header set Cache-Control "max-age=604800, public"
</FilesMatch>
# 2 DAYS
<FilesMatch "\.(xml|txt|css|js)$">
Header set Cache-Control "max-age=172800, proxy-revalidate"
</FilesMatch>
# 1 MIN
<FilesMatch "\.(html|htm|php)$">
Header set Cache-Control "max-age=60, private, proxy-revalidate"
</FilesMatch>
</IfModule>
edit: i've enabled debug logging level for apache so i can see more log messages. here is the log details upon visiting the root of the project with an error 500:
[Fri Jun 01 11:00:35 2012] [debug] mod_deflate.c(615): [client ::1] Zlib: Compressed 0 to 2 : URL /example.com/webroot/index.php
[Fri Jun 01 11:00:35 2012] [debug] mod_headers.c(756): headers: ap_headers_output_filter()
I've also included the access_log data after visiting the site with error 500:
127.0.0.1 - - [01/Jun/2012:11:03:02 -0400] "GET /example.com/ HTTP/1.1" 500 20 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.53 Safari/536.5"
Replace your existing .htaccess code with this:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /example.com/
RewriteRule (?!^webroot/)^(.*)$ webroot/$1 [L]
</IfModule>

Resources