Varnish returns 404 for valid URLs

I have a Varnish setup to cache some specific URLs (mostly XML output) for a short period, around 5-10 seconds. DC guys set it up and created the config file, but unfortunately it just doesn't work.
I'd be glad if you could take a look at the configuration and help me spot the problem.
Here is my default.vcl:
backend default {
    .host = "127.0.0.1";
    .port = "80";
}
sub vcl_recv {
    if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    if (req.request != "GET" &&
        req.request != "HEAD" &&
        req.request != "PUT" &&
        req.request != "POST" &&
        req.request != "TRACE" &&
        req.request != "OPTIONS" &&
        req.request != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    if (req.request != "GET" && req.request != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    if(req.http.host == "foo.example1.com"){
        if(req.url ~ "^/somedir\?somevar=anothervar&andanother=.*"){
            return (lookup);
        }
    }else if(req.http.host == "bar.example2.com"){
        if(req.url ~ "^/foo/bar\?somevar=anothervar&andanother=.*"){
            return (lookup);
        }
    } else {
        return (pass);
    }
    # Currently not reached.
    return (lookup);
}
sub vcl_pipe {
    # Note that only the first request to the backend will have
    # X-Forwarded-For set. If you use X-Forwarded-For and want to
    # have it set for all requests, make sure to have:
    # set bereq.http.connection = "close";
    # here. It is not set by default as it might break some broken web
    # applications, like IIS with NTLM authentication.
    return (pipe);
}
sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (hash);
}
sub vcl_miss {
    return (fetch);
}
sub vcl_fetch {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        /*
         * Mark as "Hit-For-Pass" for the next 2 minutes
         */
        set beresp.ttl = 120 s;
        return (hit_for_pass);
    }
    set beresp.ttl = 5 s;
    return (deliver);
}
Here is the varnishlog output from calling the URL from localhost with this command:
curl --verbose --header 'Host: foo.example1.com' "http://127.0.0.1:6081/somedir?somevar=anothervar&andanother=123456"
14 BackendClose b default
14 BackendOpen b default 127.0.0.1 60502 127.0.0.1 80
14 TxRequest b GET
14 TxURL b /somedir?somevar=anothervar&andanother=123456
14 TxProtocol b HTTP/1.1
14 TxHeader b User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.1.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
14 TxHeader b Accept: */*
14 TxHeader b Host: foo.example1.com
14 TxHeader b X-Forwarded-For: 127.0.0.1
14 TxHeader b X-Varnish: 1405453128
14 TxHeader b Accept-Encoding: gzip
14 RxProtocol b HTTP/1.1
14 RxStatus b 404
14 RxResponse b Not Found
14 RxHeader b Date: Thu, 07 Nov 2013 17:07:50 GMT
14 RxHeader b Server: Apache/2.2.15 (Red Hat)
14 RxHeader b Content-Length: 295
14 RxHeader b Content-Type: text/html; charset=iso-8859-1
14 Fetch_Body b 4(length) cls 0 mklen 1
14 Length b 295
14 BackendReuse b default
12 SessionOpen c 127.0.0.1 48771 :6081
12 ReqStart c 127.0.0.1 48771 1405453128
12 RxRequest c GET
12 RxURL c /somedir?somevar=anothervar&andanother=123456
12 RxProtocol c HTTP/1.1
12 RxHeader c User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.1.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
12 RxHeader c Accept: */*
12 RxHeader c Host: foo.example1.com
12 VCL_call c recv lookup
12 VCL_call c hash
12 Hash c /somedir?somevar=anothervar&andanother=123456
12 Hash c foo.example1.com
12 VCL_return c hash
12 VCL_call c miss fetch
12 Backend c 14 default default
12 TTL c 1405453128 RFC 120 -1 -1 1383844070 0 1383844070 0 0
12 VCL_call c fetch
12 TTL c 1405453128 VCL 5 -1 -1 1383844070 -0
12 VCL_return c deliver
12 ObjProtocol c HTTP/1.1
12 ObjResponse c Not Found
12 ObjHeader c Date: Thu, 07 Nov 2013 17:07:50 GMT
12 ObjHeader c Server: Apache/2.2.15 (Red Hat)
12 ObjHeader c Content-Type: text/html; charset=iso-8859-1
12 VCL_call c deliver deliver
12 TxProtocol c HTTP/1.1
12 TxStatus c 404
12 TxResponse c Not Found
12 TxHeader c Server: Apache/2.2.15 (Red Hat)
12 TxHeader c Content-Type: text/html; charset=iso-8859-1
12 TxHeader c Content-Length: 295
12 TxHeader c Accept-Ranges: bytes
12 TxHeader c Date: Thu, 07 Nov 2013 17:07:50 GMT
12 TxHeader c X-Varnish: 1405453128
12 TxHeader c Age: 0
12 TxHeader c Via: 1.1 varnish
12 TxHeader c Connection: keep-alive
12 Length c 295
12 ReqEnd c 1405453128 1383844070.464804649 1383844070.465569973 0.000082016 0.000713110 0.000052214
12 SessionClose c EOF
12 StatSess c 127.0.0.1 48771 0 1 1 0 0 1 257 295

This varnishlog is very clear.
Your backend is responding with a 404 for the URL you requested. You will find the same return code in your backend access logs.
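A side effect visible in the same log is that the 404 is stored with the 5-second TTL set in vcl_fetch (the `TTL ... VCL 5` line). If error responses should not be cached while the backend is being fixed, a minimal sketch in the same Varnish 3 syntax could look like the following. The status check and the 10-second hit-for-pass window are assumptions for illustration and are not part of the original configuration:

```vcl
# Varnish 3 syntax, matching the configuration in the question.
sub vcl_fetch {
    # Assumption: only 200 responses are worth caching. Anything else
    # (such as the 404 in the log) is marked hit-for-pass briefly, so
    # the next request goes to the backend again.
    if (beresp.status != 200) {
        set beresp.ttl = 10 s;
        return (hit_for_pass);
    }
    # Unchanged from the question's vcl_fetch.
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        set beresp.ttl = 120 s;
        return (hit_for_pass);
    }
    set beresp.ttl = 5 s;
    return (deliver);
}
```

This does not make the 404 go away. It only keeps Varnish from serving a cached copy of it once the backend starts answering correctly.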

I have the same problem here, but when I turn off Varnish and let Apache listen on port 80 again, all pages work like a charm. Once Varnish is the one responding on port 80, everything goes down. So I don't think it is a backend problem...

Related

Why does 301 error occur when varnish and wordpress are linked?

I would like to use Varnish as a reverse proxy in front of WordPress.
However, even after configuring default.vcl, installing Proxy Cache Purge, and otherwise following the instructions, accessing the site through the Varnish port produces a 301 redirect that sends the client to the backend origin server.
The same thing happens on a freshly installed WordPress server with no additional plugins or themes.
Why is this happening and how can it be solved?
varnish default.vcl
vcl 4.1;
import std;
backend default {
    .host = "172.16.21.222";
    .port = "8000";
}
acl purge {
    "localhost";
    "127.0.0.1";
    "172.16.21.222";
    "::1";
}
sub vcl_recv {
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
    set req.url = std.querysort(req.url);
    unset req.http.proxy;
    if(req.method == "PURGE") {
        if(!client.ip ~ purge) {
            return(synth(405,"PURGE not allowed for this IP address"));
        }
        if (req.http.X-Purge-Method == "regex") {
            ban("obj.http.x-url ~ " + req.url + " && obj.http.x-host == " + req.http.host);
            return(synth(200, "Purged"));
        }
        ban("obj.http.x-url == " + req.url + " && obj.http.x-host == " + req.http.host);
        return(synth(200, "Purged"));
    }
    if (
        req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "PATCH" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE"
    ) {
        return (pipe);
    }
    if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=") {
        set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "");
        set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "?");
        set req.url = regsub(req.url, "\?&", "?");
        set req.url = regsub(req.url, "\?$", "");
    }
    if (req.method != "GET" && req.method != "HEAD") {
        set req.http.X-Cacheable = "NO:REQUEST-METHOD";
        return(pass);
    }
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        set req.http.X-Static-File = "true";
        unset req.http.Cookie;
        return(hash);
    }
    if (
        req.http.Cookie ~ "wordpress_(?!test_)[a-zA-Z0-9_]+|wp-postpass|comment_author_[a-zA-Z0-9_]+|woocommerce_cart_hash|woocommerce_items_in_cart|wp_woocommerce_session_[a-zA-Z0-9]+|wordpress_logged_in_|comment_author|PHPSESSID" ||
        req.http.Authorization ||
        req.url ~ "add_to_cart" ||
        req.url ~ "edd_action" ||
        req.url ~ "nocache" ||
        req.url ~ "^/addons" ||
        req.url ~ "^/bb-admin" ||
        req.url ~ "^/bb-login.php" ||
        req.url ~ "^/bb-reset-password.php" ||
        req.url ~ "^/cart" ||
        req.url ~ "^/checkout" ||
        req.url ~ "^/control.php" ||
        req.url ~ "^/login" ||
        req.url ~ "^/logout" ||
        req.url ~ "^/lost-password" ||
        req.url ~ "^/my-account" ||
        req.url ~ "^/product" ||
        req.url ~ "^/register" ||
        req.url ~ "^/register.php" ||
        req.url ~ "^/server-status" ||
        req.url ~ "^/signin" ||
        req.url ~ "^/signup" ||
        req.url ~ "^/stats" ||
        req.url ~ "^/wc-api" ||
        req.url ~ "^/wp-admin" ||
        req.url ~ "^/wp-comments-post.php" ||
        req.url ~ "^/wp-cron.php" ||
        req.url ~ "^/wp-login.php" ||
        req.url ~ "^/wp-activate.php" ||
        req.url ~ "^/wp-mail.php" ||
        req.url ~ "^/wp-login.php" ||
        req.url ~ "^\?add-to-cart=" ||
        req.url ~ "^\?wc-api=" ||
        req.url ~ "^/preview=" ||
        req.url ~ "^/\.well-known/acme-challenge/"
    ) {
        set req.http.X-Cacheable = "NO:Logged in/Got Sessions";
        if(req.http.X-Requested-With == "XMLHttpRequest") {
            set req.http.X-Cacheable = "NO:Ajax";
        }
        return(pass);
    }
    unset req.http.Cookie;
    return(hash);
}
sub vcl_hash {
    if(req.http.X-Forwarded-Proto) {
        hash_data(req.http.X-Forwarded-Proto);
    }
}
sub vcl_backend_response {
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
    if (!beresp.http.Cache-Control) {
        set beresp.ttl = 1h;
        set beresp.http.X-Cacheable = "YES:Forced";
    }
    if (bereq.http.X-Static-File == "true") {
        unset beresp.http.Set-Cookie;
        set beresp.http.X-Cacheable = "YES:Forced";
        set beresp.ttl = 1d;
    }
    if (beresp.http.Set-Cookie ~ "wfvt_|wordfence_verifiedHuman") {
        unset beresp.http.Set-Cookie;
    }
    if (beresp.http.Set-Cookie) {
        set beresp.http.X-Cacheable = "NO:Got Cookies";
    } elseif(beresp.http.Cache-Control ~ "private") {
        set beresp.http.X-Cacheable = "NO:Cache-Control=private";
    }
}
sub vcl_deliver {
    if(req.http.X-Cacheable) {
        set resp.http.X-Cacheable = req.http.X-Cacheable;
    } elseif(obj.uncacheable) {
        if(!resp.http.X-Cacheable) {
            set resp.http.X-Cacheable = "NO:UNCACHEABLE";
        }
    } elseif(!resp.http.X-Cacheable) {
        set resp.http.X-Cacheable = "YES";
    }
    unset resp.http.x-url;
    unset resp.http.x-host;
}
apache /etc/httpd/conf.d/wordpress.co.kr.conf
<VirtualHost *:8000>
ServerName 172.16.21.222
DocumentRoot /var/www/html/wordpress
ErrorLog /var/log/httpd/mywordpress-error-log
CustomLog /var/log/httpd/mywordpress-acces-log combined
<Directory /var/www/html/wordpress>
Options Indexes FollowSymLinks MultiViews
AllowOverride All
Order allow,deny
Allow from all
</Directory>
</VirtualHost>
varnishlog -g request -q "ReqUrl eq '/'"
* << Request >> 7
- Begin req 6 rxreq
- Timestamp Start: 1658301976.240571 0.000000 0.000000
- Timestamp Req: 1658301976.240571 0.000000 0.000000
- ReqStart 172.16.39.62 4667 a0
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: 172.16.21.222
- ReqHeader Connection: keep-alive
- ReqHeader Upgrade-Insecure-Requests: 1
- ReqHeader User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36
- ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept-Language: ko
- ReqHeader X-Forwarded-For: 172.16.39.62
- VCL_call RECV
- ReqUnset Host: 172.16.21.222
- ReqHeader Host: 172.16.21.222
- ReqURL /
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- VCL_call MISS
- VCL_return fetch
- Link bereq 8 fetch
- Timestamp Fetch: 1658301976.348547 0.107975 0.107975
- RespProtocol HTTP/1.1
- RespStatus 301
- RespReason Moved Permanently
- RespHeader Date: Wed, 20 Jul 2022 07:26:16 GMT
- RespHeader Server: Apache/2.4.37 (rocky)
- RespHeader X-Powered-By: PHP/7.2.24
- RespHeader X-Redirect-By: WordPress
- RespHeader Location: http://172.16.21.222:8000/
- RespHeader Content-Length: 0
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader x-url: /
- RespHeader x-host: 172.16.21.222
- RespHeader X-Cacheable: YES:Forced
- RespHeader X-Varnish: 7
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish (Varnish/6.0)
- VCL_call DELIVER
- RespUnset x-url: /
- RespUnset x-host: 172.16.21.222
- VCL_return deliver
- Timestamp Process: 1658301976.348631 0.108059 0.000084
- RespHeader Connection: keep-alive
- Timestamp Resp: 1658301976.348669 0.108098 0.000038
- ReqAcct 416 0 416 354 0 354
- End
** << BeReq >> 8
-- Begin bereq 7 fetch
-- VCL_use boot
-- Timestamp Start: 1658301976.240807 0.000000 0.000000
-- BereqMethod GET
-- BereqURL /
-- BereqProtocol HTTP/1.1
-- BereqHeader Upgrade-Insecure-Requests: 1
-- BereqHeader User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36
-- BereqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
-- BereqHeader Accept-Language: ko
-- BereqHeader X-Forwarded-For: 172.16.39.62
-- BereqHeader Host: 172.16.21.222
-- BereqHeader Accept-Encoding: gzip
-- BereqHeader X-Varnish: 8
-- VCL_call BACKEND_FETCH
-- VCL_return fetch
-- BackendOpen 30 boot.default 172.16.21.222 8000 172.16.21.222 11024
-- BackendStart 172.16.21.222 8000
-- Timestamp Bereq: 1658301976.241097 0.000290 0.000290
-- Timestamp Beresp: 1658301976.348336 0.107528 0.107239
-- BerespProtocol HTTP/1.1
-- BerespStatus 301
-- BerespReason Moved Permanently
-- BerespHeader Date: Wed, 20 Jul 2022 07:26:16 GMT
-- BerespHeader Server: Apache/2.4.37 (rocky)
-- BerespHeader X-Powered-By: PHP/7.2.24
-- BerespHeader X-Redirect-By: WordPress
-- BerespHeader Location: http://172.16.21.222:8000/
-- BerespHeader Content-Length: 0
-- BerespHeader Content-Type: text/html; charset=UTF-8
-- TTL RFC 120 10 0 1658301976 1658301976 1658301976 0 0 cacheable
-- VCL_call BACKEND_RESPONSE
-- BerespHeader x-url: /
-- BerespHeader x-host: 172.16.21.222
-- TTL VCL 3600 10 0 1658301976 cacheable
-- BerespHeader X-Cacheable: YES:Forced
-- VCL_return deliver
-- Storage malloc s0
-- Fetch_Body 0 none -
-- BackendReuse 30 boot.default
-- Timestamp BerespBody: 1658301976.348518 0.107711 0.000182
-- Length 0
-- BereqAcct 428 0 428 251 0 251
-- End
The issue you're experiencing is related to a Host header mismatch.
Your VSL log shows that you're sending the following Host header:
- ReqHeader Host: 172.16.21.222:6081
This is the result of calling http://172.16.21.222:6081. However, your WordPress setup doesn't recognize that host and redirects to the host it knows, which is 172.16.21.222:8000.
You can see that it redirects to this host in the following log line:
- RespHeader Location: http://172.16.21.222:8000/
You shouldn't be using port numbers in your Host header. I would advise you to change the default listening port of varnishd from 6081 to 80.
See https://www.varnish-software.com/developers/tutorials/installing-varnish-ubuntu/#4-configure-varnish for more information on how to change the listening port via systemd.
Please also ensure that port 8000 is not included in the WordPress base URL. Just use the IP address or a domain name instead.
For the time being, you could bypass the port mismatch by overriding your Host header. Here's an example using curl:
curl -H "Host: 172.16.21.222:8000" -I http://172.16.21.222:6081
This command overrides the Host header with 172.16.21.222:8000 while still accessing Varnish over 172.16.21.222:6081.
If this works, change the varnishd port to 80 and update the WordPress base URL to not include the port.
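The same temporary override can be sketched in VCL instead of on the curl command line. This is only an illustration under the assumption that WordPress is still configured with 172.16.21.222:8000 as its base URL. It would replace the `regsub` line in the question's vcl_recv that strips the port from the Host header:

```vcl
vcl 4.1;

backend default {
    .host = "172.16.21.222";
    .port = "8000";
}

sub vcl_recv {
    # Temporary workaround (assumption): present the host and port that
    # WordPress currently treats as its own, so it has no reason to
    # answer with a 301 to http://172.16.21.222:8000/.
    set req.http.Host = "172.16.21.222:8000";
}
```

WordPress would still generate links that point at port 8000, so clients following those links bypass Varnish. That is why moving varnishd to port 80 and removing the port from the base URL remains the actual fix.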

Get varnish to proxy and not redirect

I want to use Varnish as a "smart" proxy and it almost works. The idea is that some requests should pass through Varnish, hit the backend and return; all other requests should get a "synth" response saying that there is no result for that request.
This works apart from the fact that Varnish returns a 301 redirect to the backend instead of just the response from the actual backend.
Backend and Cache are not located on the same host (or not even on the same network in this case).
Backend is ALSO running a separate Varnish instance and this request is always passed through that.
// VCL.SHOW 0 1820 input
#
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.
# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;
### Here starts my part of the VCL
# Default backend definition. Set this to point to your content server.
backend default {
    .host = "myhost.mydomain";
    .port = "80";
}
sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    if (req.url ~ "^\/cgi-bin\/wspd_cgi\.sh/apiFlightSearch.p\?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27")
    {
        return (hash);
    }
    return (synth(750));
}
sub vcl_synth {
    if (resp.status == 750) {
        # Set a status the client will understand
        set resp.status = 200;
        # Create our synthetic response
        set resp.http.content-type = "text/xml";
        synthetic("<flights xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'><status><status>No flights</status></status></flights>");
        return(deliver);
    }
}
sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
    set beresp.ttl = 10 s;
}
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
}
### Here ends my part of the VCL. The rest I guess is built in.
// VCL.SHOW 1 5479 Builtin
/*-
* Copyright (c) 2006 Verdens Gang AS
* Copyright (c) 2006-2014 Varnish Software AS
* All rights reserved.
*
* Author: Poul-Henning Kamp <phk@phk.freebsd.dk>
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
*
* The built-in (previously called default) VCL code.
*
* NB! You do NOT need to copy & paste all of these functions into your
* own vcl code, if you do not provide a definition of one of these
* functions, the compiler will automatically fall back to the default
* code from this file.
*
* This code will be prefixed with a backend declaration built from the
* -b argument.
*/
vcl 4.0;
#######################################################################
# Client side
sub vcl_recv {
if (req.method == "PRI") {
/* We do not support SPDY or HTTP/2.0 */
return (synth(405));
}
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "DELETE") {
/* Non-RFC2616 or CONNECT which is weird. */
return (pipe);
}
if (req.method != "GET" && req.method != "HEAD") {
/* We only deal with GET and HEAD by default */
return (pass);
}
if (req.http.Authorization || req.http.Cookie) {
/* Not cacheable by default */
return (pass);
}
return (hash);
}
sub vcl_pipe {
# By default Connection: close is set on all piped requests, to stop
# connection reuse from sending future requests directly to the
# (potentially) wrong backend. If you do want this to happen, you can undo
# it here.
# unset bereq.http.connection;
return (pipe);
}
sub vcl_pass {
return (fetch);
}
sub vcl_hash {
hash_data(req.url);
if (req.http.host) {
hash_data(req.http.host);
} else {
hash_data(server.ip);
}
return (lookup);
}
sub vcl_purge {
return (synth(200, "Purged"));
}
sub vcl_hit {
if (obj.ttl >= 0s) {
// A pure unadultered hit, deliver it
return (deliver);
}
if (obj.ttl + obj.grace > 0s) {
// Object is in grace, deliver it
// Automatically triggers a background fetch
return (deliver);
}
// fetch & deliver once we get the result
return (fetch);
}
sub vcl_miss {
return (fetch);
}
sub vcl_deliver {
return (deliver);
}
/*
* We can come here "invisibly" with the following errors: 413, 417 & 503
*/
sub vcl_synth {
set resp.http.Content-Type = "text/html; charset=utf-8";
set resp.http.Retry-After = "5";
synthetic( {"<!DOCTYPE html>
<html>
<head>
<title>"} + resp.status + " " + resp.reason + {"</title>
</head>
<body>
<h1>Error "} + resp.status + " " + resp.reason + {"</h1>
<p>"} + resp.reason + {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} + req.xid + {"</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>
"} );
return (deliver);
}
#######################################################################
# Backend Fetch
sub vcl_backend_fetch {
return (fetch);
}
sub vcl_backend_response {
if (beresp.ttl <= 0s ||
beresp.http.Set-Cookie ||
beresp.http.Surrogate-control ~ "no-store" ||
(!beresp.http.Surrogate-Control &&
beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
beresp.http.Vary == "*") {
/*
* Mark as "Hit-For-Pass" for the next 2 minutes
*/
set beresp.ttl = 120s;
set beresp.uncacheable = true;
}
return (deliver);
}
sub vcl_backend_error {
set beresp.http.Content-Type = "text/html; charset=utf-8";
set beresp.http.Retry-After = "5";
synthetic( {"<!DOCTYPE html>
<html>
<head>
<title>"} + beresp.status + " " + beresp.reason + {"</title>
</head>
<body>
<h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
<p>"} + beresp.reason + {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} + bereq.xid + {"</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>
"} );
return (deliver);
}
#######################################################################
# Housekeeping
sub vcl_init {
return (ok);
}
sub vcl_fini {
return (ok);
}
Output from curl:
$ curl "thisandthatip.compute.amazonaws.com/cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0" -i
HTTP/1.1 301 Moved Permanently
Date: Wed, 15 Mar 2017 07:14:11 GMT
Server: Apache/2.2.15 (Red Hat)
Location: http://myserver.mydomain/cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0
Content-Length: 514
Content-Type: text/html; charset=iso-8859-1
X-Varnish: 529144137
Via: 1.1 varnish-v4
X-Varnish: 98309 11
Age: 2
Via: 1.1 varnish-v4
Connection: keep-alive
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved here.</p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at thisandthatip.eu-central-1.compute.amazonaws.com Port 80</address>
</body></html>
Backend apache access log
127.0.0.1 - - [15/Mar/2017:08:09:49 +0100] "GET /cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=arn&to=aok&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0 HTTP/1.1" 200 994 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
Sending the request from the AWS instance to the backend renders no 301 redirection:
$ curl "myserver.mydomain/cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=arn&to=aok&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0" -i
HTTP/1.1 200 OK
Date: Wed, 15 Mar 2017 08:54:14 GMT
Server: Apache/2.2.15 (Red Hat)
Cache-Control: max-age=1
Expires: Wed, 15 Mar 2017 08:54:15 GMT
Content-Type: text/xml
X-Varnish: 527559784
Age: 0
Via: 1.1 varnish-v4
Transfer-Encoding: chunked
Connection: keep-alive
Accept-Ranges: bytes
... Response body here ...
Complete varnishlog output of a single request
* << BeReq >> 98314
- Begin bereq 98313 fetch
- Timestamp Start: 1489568144.701450 0.000000 0.000000
- BereqMethod GET
- BereqURL /cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0
- BereqProtocol HTTP/1.1
- BereqHeader Host: thisandthatip.compute.amazonaws.com
- BereqHeader Upgrade-Insecure-Requests: 1
- BereqHeader User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
- BereqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
- BereqHeader Accept-Language: sv-SE,sv;q=0.8,en-US;q=0.6,en;q=0.4
- BereqHeader X-Forwarded-For: ip.ip.ip.ip
- BereqHeader Accept-Encoding: gzip
- BereqHeader X-Varnish: 98314
- VCL_call BACKEND_FETCH
- VCL_return fetch
- BackendClose 17 default(ip.ip.ip.ip,,80) toolate
- BackendOpen 17 default(ip.ip.ip.ip,,80) 172.31.31.195 42868
- Backend 17 default default(ip.ip.ip.ip,,80)
- Timestamp Bereq: 1489568144.730329 0.028878 0.028878
- Timestamp Beresp: 1489568144.759773 0.058322 0.029444
- BerespProtocol HTTP/1.1
- BerespStatus 301
- BerespReason Moved Permanently
- BerespHeader Date: Wed, 15 Mar 2017 08:55:44 GMT
- BerespHeader Server: Apache/2.2.15 (Red Hat)
- BerespHeader Location: http://myserver.mydomain/cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0
- BerespHeader Content-Length: 514
- BerespHeader Content-Type: text/html; charset=iso-8859-1
- BerespHeader X-Varnish: 526644873
- BerespHeader Age: 0
- BerespHeader Via: 1.1 varnish-v4
- BerespHeader Connection: keep-alive
- TTL RFC 120 -1 -1 1489568145 1489568145 1489568144 0 0
- VCL_call BACKEND_RESPONSE
- TTL VCL 10 10 0 1489568145
- VCL_return deliver
- Storage malloc s0
- ObjProtocol HTTP/1.1
- ObjStatus 301
- ObjReason Moved Permanently
- ObjHeader Date: Wed, 15 Mar 2017 08:55:44 GMT
- ObjHeader Server: Apache/2.2.15 (Red Hat)
- ObjHeader Location: http://myserver.mydomain/cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0
- ObjHeader Content-Length: 514
- ObjHeader Content-Type: text/html; charset=iso-8859-1
- ObjHeader X-Varnish: 526644873
- ObjHeader Via: 1.1 varnish-v4
- Fetch_Body 3 length stream
- BackendReuse 17 default(ip.ip.ip.ip,,80)
- Timestamp BerespBody: 1489568144.759849 0.058398 0.000076
- Length 514
- BereqAcct 578 0 578 415 514 929
- End
* << Request >> 98313
- Begin req 98312 rxreq
- Timestamp Start: 1489568144.701372 0.000000 0.000000
- Timestamp Req: 1489568144.701372 0.000000 0.000000
- ReqStart ip.ip.ip.ip 63485
- ReqMethod GET
- ReqURL /cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0
- ReqProtocol HTTP/1.1
- ReqHeader Host: thisandthatip.compute.amazonaws.com
- ReqHeader Connection: keep-alive
- ReqHeader Upgrade-Insecure-Requests: 1
- ReqHeader User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
- ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
- ReqHeader Accept-Encoding: gzip, deflate, sdch
- ReqHeader Accept-Language: sv-SE,sv;q=0.8,en-US;q=0.6,en;q=0.4
- ReqHeader X-Forwarded-For: ip.ip.ip.ip
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate, sdch
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Debug "XXXX MISS"
- VCL_call MISS
- VCL_return fetch
- Link bereq 98314 fetch
- Timestamp Fetch: 1489568144.759883 0.058511 0.058511
- RespProtocol HTTP/1.1
- RespStatus 301
- RespReason Moved Permanently
- RespHeader Date: Wed, 15 Mar 2017 08:55:44 GMT
- RespHeader Server: Apache/2.2.15 (Red Hat)
- RespHeader Location: http://myserver.mydomain/cgi-bin/wspd_cgi.sh/apiFlightSearch.p?from=ARN&to=AOK&date=2017-05-20&homedate=2017-05-27&adults=2&triptype=return&children=0&infants=0
- RespHeader Content-Length: 514
- RespHeader Content-Type: text/html; charset=iso-8859-1
- RespHeader X-Varnish: 526644873
- RespHeader Via: 1.1 varnish-v4
- RespHeader X-Varnish: 98313
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1489568144.759907 0.058535 0.000024
- Debug "RES_MODE 2"
- RespHeader Connection: keep-alive
- Timestamp Resp: 1489568144.759933 0.058561 0.000026
- Debug "XXX REF 2"
- ReqAcct 566 0 566 454 514 968
- End
Varnish 4.0.4 running on AWS Amazon Linux.
The 301 redirect is not issued by your Varnish. It is issued by an Apache server, probably your backend. You can see this from the Server header in your curl output.
What Varnish does is proxy the request and forward it to the backend you declared, myhost.mydomain. In fact, Varnish resolves the DNS name at startup and forwards requests to the IP address it got.
I see two things to check here:
Ban your request from the Varnish cache. (A 301 may have been returned at some point during your tests and still be cached. That does not seem to be the case, but it is better to start from a fresh cache.)
Make a curl request to your backend to see whether you get a 301 or a 200.
If that does not work, I would restart your Varnish service to refresh the DNS resolution.
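For the first check, one possible way to issue a ban from VCL is sketched below in VCL 4.0 syntax. The ACL name `ban_allowed`, the `BAN` request method, and the `/cgi-bin/` pattern are assumptions chosen for illustration; none of them come from the original configuration:

```vcl
# Fragment to merge into the existing VCL, ahead of the question's
# vcl_recv (which ends with an unconditional synth(750)).

# Hypothetical ACL: the hosts allowed to issue bans.
acl ban_allowed {
    "127.0.0.1";
}

sub vcl_recv {
    # "BAN" is a method name chosen for this sketch, not a standard
    # HTTP verb.
    if (req.method == "BAN") {
        if (!client.ip ~ ban_allowed) {
            return (synth(405, "BAN not allowed for this IP address"));
        }
        # Invalidate every cached object whose URL starts with /cgi-bin/.
        ban("req.url ~ ^/cgi-bin/");
        return (synth(200, "Banned"));
    }
}
```

Varnish concatenates subroutines that share a name in the order they appear, so this block has to come before the question's own vcl_recv to be reached at all.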
The Host header sent to the backend was that of the AWS instance. That triggered a redirect in the backend, not in the Varnish cache.
Overriding the req.http.host value in Varnish solved the problem:
sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    # Set req.http.host (Host header) to the backend's hostname, otherwise a redirect will be triggered
    set req.http.host = "myserver.mydomain";
    # ... More setting goes here
}
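A variation on the same idea, shown here only as a sketch, is to rewrite the Host header on the backend request rather than on the client request. In VCL 4.0 that is done in vcl_backend_fetch through `bereq.http.Host`. The hostname is the placeholder used in the question:

```vcl
vcl 4.0;

backend default {
    .host = "myserver.mydomain";
    .port = "80";
}

sub vcl_backend_fetch {
    # Only the request sent to the backend is changed. The client-side
    # req.http.host, which the built-in vcl_hash adds to the cache key,
    # keeps the value the client sent.
    set bereq.http.Host = "myserver.mydomain";
}
```

Which variant fits better depends on whether cached objects should be keyed on the client-facing hostname (this sketch) or on the backend hostname (the vcl_recv override above).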

Varnish request missing cache despite being in

I have this request in the Varnish cache:
ReqMethod GET
ReqURL /organisation/xyz/proposal_0000000/comments/comment_0000001/
Some PURGE requests are then sent to Varnish, resulting in this list of bans:
ban.list
200 2108
Present bans:
1458150360.937187 16 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/team0000000$
1458150360.929092 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz$
1458150360.926030 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/pitch0000000$
1458150360.923491 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/practicalrelevance0000000$
1458150360.921025 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/plan0000000$
1458150360.918480 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/target0000000$
1458150360.915931 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/duration0000000$
1458150360.913486 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/difference0000000$
1458150360.910710 0 - req.http.host == localhost:8088 && req.url ~ /$
1458150360.908150 0 - req.http.host == localhost:8088 && req.url ~ /organisation$
1458150360.906249 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/extrainfo0000000$
1458150360.904289 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/connectioncohesion0000000$
1458150360.901930 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/challenge0000000$
1458150360.899287 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/goal0000000$
1458150360.896989 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/partners0000000$
1458150360.894324 0 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000$
1458150360.891701 0 C
1458150348.035639 1 C
The same GET request is then executed again but with a MISS:
* << Request >> 32790
- Begin req 32789 rxreq
- Timestamp Start: 1458150371.759282 0.000000 0.000000
- Timestamp Req: 1458150371.759282 0.000000 0.000000
- ReqStart 127.0.0.1 43526
- ReqMethod GET
- ReqURL /organisation/xyz/proposal_0000000/comments/comment_0000001/
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost:8088
- ReqHeader Connection: keep-alive
- ReqHeader Pragma: no-cache
- ReqHeader Cache-Control: no-cache
- ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
- ReqHeader Upgrade-Insecure-Requests: 1
- ReqHeader Referer: http://localhost:8088/organisation/xyz/proposal_0000000/comments/?elements=paths
- ReqHeader Accept-Encoding: gzip, deflate, sdch
- ReqHeader Accept-Language: en-GB,en-US;q=0.8,en;q=0.6
- ReqHeader X-Forwarded-For: 127.0.0.1
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate, sdch
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- ExpBan 3 banned lookup
- VCL_call MISS
- VCL_return fetch
- Link bereq 32791 fetch
- Timestamp Fetch: 1458150371.779571 0.020289 0.020289
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Server: gunicorn/19.2.1
- RespHeader Date: Wed, 16 Mar 2016 17:46:11 GMT
- RespHeader X-Caching-Mode: with_proxy_cache
- RespHeader X-Caching-Strategy: HTTPCacheStrategyWeakAdapter
- RespHeader Cache-Control: max-age=0, proxy-revalidate, s-maxage=31104000
- RespHeader Vary: Accept-Encoding, X-User-Path, X-User-Token
- RespHeader Content-Type: application/json; charset=UTF-8
- RespHeader Access-Control-Allow-Origin: *
- RespHeader Access-Control-Allow-Methods: POST,GET,DELETE,PUT,OPTIONS
- RespHeader Access-Control-Allow-Headers: Origin, Content-Type, Accept, X-User-Path, X-User-Token
- RespHeader ETag: W/"0|1|2016-03-16 13:44:05.887212+00:00|None|None"
- RespHeader Content-Encoding: gzip
- RespHeader X-Varnish: 32790
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1458150371.779598 0.020317 0.000028
- RespHeader Accept-Ranges: bytes
- RespHeader Content-Length: 426
- Debug "RES_MODE 2"
- RespHeader Connection: keep-alive
- Timestamp Resp: 1458150371.779641 0.020359 0.000042
- ReqAcct 598 0 598 699 426 1125
- End
The ban list is then:
ban.list
200 147
Present bans:
1458150360.937187 17 - req.http.host == localhost:8088 && req.url ~ /organisation/xyz/proposal_0000000/team0000000$
I know regular expressions. How can /organisation/xyz/proposal_0000000/comments/comment_0000001/ match any of the patterns in the ban list? It does not make sense.
I'm using Varnish 4.1.1
The rule that matches your URL is:
1458150360.910710 0 - req.http.host == localhost:8088 && req.url ~ /$
The regex in req.url ~ /$ is not anchored at the start, so it matches your URL and any other URL that ends with a slash. To ban only the root URL, use req.url ~ ^/$.
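To see why the unanchored pattern matches, here is a minimal sketch in Python. It uses Python's `re` module as a stand-in for the PCRE engine that Varnish uses; for these two simple patterns the behaviour should be the same, but that equivalence is an assumption of the sketch.

```python
import re

# URL from the question and the two candidate ban patterns.
url = "/organisation/xyz/proposal_0000000/comments/comment_0000001/"

# Like the VCL "~" operator, re.search matches anywhere in the string,
# so "/$" only requires that the URL ends with a slash.
unanchored = re.search(r"/$", url) is not None

# "^/$" is anchored at both ends and matches only the root URL "/".
anchored = re.search(r"^/$", url) is not None
root_only = re.search(r"^/$", "/") is not None

print(unanchored, anchored, root_only)
```

The first result shows the ban hitting the comment URL; the second and third show that the anchored form leaves it alone while still matching the root page.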
A few observations:
Use a direct "==" comparison when you know the full URL: https://www.varnish-cache.org/docs/3.0/reference/varnish-cli.html#ban-expressions
Use varnishtest to debug complex situations.
Try to keep your bans lurker-friendly.
About lurker-friendly ban expressions
read more here
Lurker-friendly ban expressions are those that use only obj.* variables and no req.* variables. Since such expressions cannot refer to req.*, you may need to copy some of the request data into the object. This copy preserves the context of the client request in the cached object. For example, you may want to copy the requested URL from req to obj.
The following snippet shows an example on how to preserve the context of a client request in the cached object:
sub vcl_backend_response {
set beresp.http.x-url = bereq.url;
}
sub vcl_deliver {
# The X-Url header is for internal use only
unset resp.http.x-url;
}
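The idea can be illustrated with a toy model in Python. This is not Varnish code: `CachedObject`, `lurker_sweep`, and the dictionary-based header store are hypothetical names invented for the sketch. It only shows why a background process with no client request at hand can evaluate a ban on a value copied into the object (here x-url), but not one that refers to req.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CachedObject:
    # Headers stored with the object; "x-url" plays the role of the
    # x-url header copied from the request URL in the VCL snippet.
    headers: dict = field(default_factory=dict)


def lurker_sweep(cache, pattern):
    """Toy stand-in for a background ban sweep (hypothetical helper).

    No client request exists here, so only data stored on the object
    can be tested. Returns the objects that survive the ban.
    """
    regex = re.compile(pattern)
    return [obj for obj in cache
            if not regex.search(obj.headers.get("x-url", ""))]


cache = [
    CachedObject({"x-url": "/"}),
    CachedObject({"x-url": "/organisation/xyz/"}),
]

# Ban only the root page, using the anchored pattern.
survivors = lurker_sweep(cache, r"^/$")
print([obj.headers["x-url"] for obj in survivors])
```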
Varnish test example for regex:
You can run it with: varnishtest test_regex.vtc
test_regex.vtc content:
# act like a backend server
server s1 {
rxreq
txresp
expect req.url == "/organisation/xyz/proposal_0000000/comments/comment_0000001/"
expect req.http.Test == "dosent_match"
} -start
# define & start a varnish instance
varnish v1 -vcl {
backend default {
.host = "${s1_addr}";
.port = "${s1_port}";
}
sub vcl_recv {
if ( req.url ~ "/$" ) {
set req.http.Test="match";
} else {
set req.http.Test="dosent_match";
}
}
} -start
# make a client request
client c1 {
txreq -url "/organisation/xyz/proposal_0000000/comments/comment_0000001/"
rxresp
} -run
varnish v1 -expect client_req == 1

Varnish slow when the object is cached (memory)

One of my customers came to me with a Varnish speed problem.
Long debugging story short:
When Varnish serves an object from its cache (memory), it is really sluggish (> 5 seconds);
when Varnish has to fetch the object from the Apache backend, there is no speed problem (< 1 second).
Example of a slow request (from varnishlog):
193 ReqStart c <client ip> 59490 1329239608
193 RxRequest c GET
193 RxURL c /<my_url>/toto.png
193 RxProtocol c HTTP/1.1
193 RxHeader c Accept: */*
193 RxHeader c Referer: <my_referer>
193 RxHeader c Accept-Language: fr
193 RxHeader c User-Agent: <client_useragent>
193 RxHeader c Accept-Encoding: gzip, deflate
193 RxHeader c Host: <my_vhost>
193 RxHeader c Connection: Keep-Alive
193 VCL_call c recv lookup
193 VCL_call c hash
193 Hash c /<my_url>/toto.png
193 Hash c <my_vhost>
193 VCL_return c hash
193 Hit c 1329136358
193 VCL_call c hit deliver
193 VCL_call c deliver deliver
193 TxProtocol c HTTP/1.1
193 TxStatus c 200
193 TxResponse c OK
193 TxHeader c Server: Apache
193 TxHeader c Last-Modified: Mon, 18 Jun 2012 08:57:46 GMT
193 TxHeader c ETag: "c330-4c2bb5c0ef680"
193 TxHeader c Cache-Control: max-age=1200
193 TxHeader c Content-Type: image/png
193 TxHeader c Content-Length: 49968
193 TxHeader c Accept-Ranges: bytes
193 TxHeader c Date: Tue, 16 Oct 2012 06:54:03 GMT
193 TxHeader c X-Varnish: 1329239608 1329136358
193 TxHeader c Age: 391
193 TxHeader c Via: 1.1 varnish
193 TxHeader c Connection: keep-alive
193 TxHeader c X-Cache: HIT
193 TxHeader c X-Cache-Hits: 210
193 Length c 49968
193 ReqEnd c 1329239608 1350370443.778280735 1350370480.921206713 0.372072458 0.000045538 37.142880440
If I am reading this correctly, the problem shows on the last line (ReqEnd):
37.142880440 is the time in seconds taken to send the file.
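As a cross-check of that reading, the following Python sketch splits the ReqEnd record into its fields. The field order used here (XID, start and end timestamps, time from accept to start of processing, time from start of processing to start of delivery, delivery time) is an assumption based on the Varnish 3 log format as commonly described, and the names `ReqEnd` and `parse_req_end` are made up for the example.

```python
from decimal import Decimal
from typing import NamedTuple


class ReqEnd(NamedTuple):
    xid: int
    start: Decimal        # epoch time the request started
    end: Decimal          # epoch time the request ended
    accept_wait: Decimal  # accept -> start of processing
    processing: Decimal   # start of processing -> start of delivery
    delivery: Decimal     # start of delivery -> end of request


def parse_req_end(line: str) -> ReqEnd:
    # Expected shape: "<fd> ReqEnd c <xid> <start> <end> <d1> <d2> <d3>"
    parts = line.split()
    if len(parts) != 9 or parts[1] != "ReqEnd":
        raise ValueError("not a ReqEnd record: %r" % line)
    xid = int(parts[3])
    start, end, accept_wait, processing, delivery = map(Decimal, parts[4:])
    return ReqEnd(xid, start, end, accept_wait, processing, delivery)


line = ("193 ReqEnd c 1329239608 1350370443.778280735 "
        "1350370480.921206713 0.372072458 0.000045538 37.142880440")
record = parse_req_end(line)

# Under the assumed field order, end - start should equal
# processing time plus delivery time.
print(record.end - record.start)
print(record.processing + record.delivery)
```

If the two printed values agree, almost all of the elapsed time was spent in delivery, not in lookup or processing.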
I see the same problem when testing locally, so it is not a bandwidth problem.
The problem only happens in the morning, at peak traffic (~400 req/s).
Options for Varnish:
DAEMON_OPTS="-a :80 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-p thread_pool_min=100 \
-p thread_pool_max=1000 \
-p session_linger=100 \
-s malloc,8G"
Varnish seems to have enough RAM and looks healthy:
SMA.s0.c_req 4303728 38.35 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 169709790476 1512443.66 Bytes allocated
SMA.s0.c_freed 168334747402 1500189.36 Bytes freed
SMA.s0.g_alloc 172011 . Allocations outstanding
SMA.s0.g_bytes 1375043074 . Bytes outstanding
SMA.s0.g_space 7214891518 . Bytes available
n_wrk 200 . N worker threads
n_wrk_create 200 0.00 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 26 0.00 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 8495031 . N LRU moved objects
Varnish is up to date (3.0.3-1~lenny).
If you have an idea or a lead, I would be glad to hear it.
The Varnish configuration:
backend default {
.host = "127.0.0.1";
.port = "8000";
.connect_timeout = 10s;
.first_byte_timeout = 10s;
.between_bytes_timeout = 5s;
}
sub vcl_recv {
set req.grace = 1h;
if (req.http.Accept-Encoding) {
if (req.http.Accept-Encoding ~ "gzip") {
set req.http.Accept-Encoding = "gzip";
}
else if (req.http.Accept-Encoding ~ "deflate") {
set req.http.Accept-Encoding = "deflate";
}
else {
unset req.http.Accept-Encoding;
}
}
if (req.url ~ "(?i)\.(png|gif|jpeg|jpg|ico|swf|css|js|eot|ttf|woff|svg|htm|xml)(\?[a-z0-9]+)?$") {
unset req.http.Cookie;
}
if (req.url ~ "^/content/.+\.xml$") {
unset req.http.Cookie;
}
if (req.url ~ "^/min/") {
unset req.http.Cookie;
}
if (req.restarts == 0) {
if (req.http.x-forwarded-for) {
set req.http.X-Forwarded-For =
req.http.X-Forwarded-For + ", " + client.ip;
} else {
set req.http.X-Forwarded-For = client.ip;
}
}
if (req.request != "GET" &&
req.request != "HEAD" &&
req.request != "PUT" &&
req.request != "POST" &&
req.request != "TRACE" &&
req.request != "OPTIONS" &&
req.request != "DELETE") {
return (pipe);
}
if (req.request != "GET" && req.request != "HEAD") {
return (pass);
}
if (req.http.Authorization || req.http.Cookie) {
return (pass);
}
return (lookup);
}
sub vcl_fetch {
if (req.url ~ "(?i)\.(png|gif|jpeg|jpg|ico|swf|css|js|eot|ttf|woff|svg|htm|xml)(\?[a-z0-9]+)?$") {
unset beresp.http.set-cookie;
}
if (req.url ~ "^/(content|common)/.+\.xml$") {
unset req.http.Cookie;
}
if (req.url ~ "^/min/") {
unset req.http.Cookie;
}
set beresp.grace = 1h;
if (beresp.ttl <= 0s ||
beresp.http.Set-Cookie ||
beresp.http.Vary == "*") {
set beresp.ttl = 120s;
return (hit_for_pass);
}
return (deliver);
}
sub vcl_deliver {
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
set resp.http.X-Cache-Hits = obj.hits;
} else {
set resp.http.X-Cache = "MISS";
}
return (deliver);
}
You had
-p thread_pool_max=1000
which is really the minimum recommended value. You also had
n_wrk_queued 26
which indicates that requests had to wait for a free worker thread and that it is time to increase the thread count. I believe that if you had raised thread_pool_max to 2000, for example, and kept an eye on n_wrk_queued to make sure you did not need more, things would have worked out fine.
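Keeping an eye on n_wrk_queued can be scripted. The Python sketch below parses counter lines in the layout shown in the question (name, value, rate or ".", description) and reports whether any requests had to queue. The function names are hypothetical, and the threshold of zero is an assumption for illustration, not a documented rule.

```python
def parse_counters(text: str) -> dict:
    """Map counter name -> integer value from varnishstat-style lines."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines; the second field must be an integer.
        if len(parts) >= 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def threads_look_short(counters: dict, threshold: int = 0) -> bool:
    # n_wrk_queued counts requests that waited for a free worker thread.
    return counters.get("n_wrk_queued", 0) > threshold


sample = """\
n_wrk 200 . N worker threads
n_wrk_max 0 0.00 N worker threads limited
n_wrk_queued 26 0.00 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
"""

counters = parse_counters(sample)
print(counters["n_wrk_queued"], threads_look_short(counters))
```

Since the counter appears to be cumulative, comparing two samples taken some time apart is likely more telling than a single reading.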

Why isn't Varnish sending 304 unmodified when If-Modified-Since header is sent?

When sending a GET request directly to the backend with If-Modified-Since: Wed, 15 Feb 2012 07:25:00 CET set, Apache correctly returns a 304 with no content.
When I send the same request through Varnish 3.0.2, it responds with a 200 and resends all the content even though the client already has it. Obviously, this isn't a good use of bandwidth. My understanding is that Varnish supports intelligent handling of this header and should be sending a 304, so I figure I'd done something wrong with my .vcl file.
Varnishlog gives this:
16 SessionOpen c 84.97.17.233 64416 :80
16 ReqStart c 84.97.17.233 64416 1597323690
16 RxRequest c GET
16 RxURL c /fr/CS/CS_AU-Maboreke-6-6-2004.pdf
16 RxProtocol c HTTP/1.0
16 RxHeader c Host: www.quotaproject.org
16 RxHeader c User-Agent: Sprawk/1.3 (http://www.sprawk.com/)
16 RxHeader c Accept: */*
16 RxHeader c Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
16 RxHeader c Connection: close
16 RxHeader c If-Modified-Since: Wed, 15 Feb 2012 07:25:00 CET
16 VCL_call c recv lookup
16 VCL_call c hash
16 Hash c /fr/CS/CS_AU-Maboreke-6-6-2004.pdf
16 Hash c www.quotaproject.org
16 VCL_return c hash
16 Hit c 1597322756
16 VCL_call c hit
16 VCL_acl c NO_MATCH CTRLF5
16 VCL_return c deliver
16 VCL_call c deliver deliver
16 TxProtocol c HTTP/1.1
16 TxStatus c 200
16 TxResponse c OK
16 TxHeader c Server: Apache
16 TxHeader c Last-Modified: Wed, 09 Jun 2004 16:07:50 GMT
16 TxHeader c Vary: Accept-Encoding
16 TxHeader c Content-Type: application/pdf
16 TxHeader c Date: Wed, 22 Feb 2012 18:25:05 GMT
16 TxHeader c Age: 12432
16 TxHeader c Connection: close
16 Gzip c U D - 107685 115763 80 796748 861415
16 Length c 98304
16 ReqEnd c 1597323690 1329935105.713264704 1329935106.208528996 0.000071526 0.000068426 0.495195866
16 SessionClose c EOF mode
16 StatSess c 84.97.17.233 64416 0 1 1 0 0 0 203 98304
If I understand this correctly, the object is already in Varnish's cache so it doesn't need to contact the backend, but it already knows the Last-Modified so why would it not respond with 304?
And here's my VCL file:
backend idea {
# .host = "www.idea.int";
.host = "83.145.60.235"; # IDEA's public website IP
.port = "80";
}
backend qp {
# .host = "www.quotaproject.org";
.host = "83.145.60.235"; # IDEA's public website IP
.port = "80";
}
#
#Below is a commented-out copy of the default VCL logic. If you
#redefine any of these subroutines, the built-in logic will be
#appended to your code.
#
sub vcl_recv {
# force domain so that Apache handles the VH correctly
if (req.http.host ~ "^qp" || req.http.host ~ "quotaproject.org$") {
set req.http.Host = "www.quotaproject.org";
set req.backend = qp;
} else {
# default to idea.int
set req.http.Host = "www.idea.int";
set req.backend = idea;
}
# Before anything else we need to fix gzip compression
if (req.http.Accept-Encoding) {
if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
# No point in compressing these
remove req.http.Accept-Encoding;
} else if (req.http.Accept-Encoding ~ "gzip") {
set req.http.Accept-Encoding = "gzip";
} else if (req.http.Accept-Encoding ~ "deflate") {
set req.http.Accept-Encoding = "deflate";
} else {
# unknown algorithm
remove req.http.Accept-Encoding;
}
}
# ajax requests bypass cache. TODO: Make sure you Javascript implementation for AJAX actually sets XMLHttpRequest
if (req.http.X-Requested-With == "XMLHttpRequest") {
return(pass);
}
if (req.request != "GET" &&
req.request != "HEAD" &&
req.request != "PUT" &&
req.request != "POST" &&
req.request != "TRACE" &&
req.request != "OPTIONS" &&
req.request != "DELETE") {
/* Non-RFC2616 or CONNECT which is weird. */
return (pipe);
}
# Purge everything url - this isn't the squid way, but works
if (req.url ~ "^/varnishpurge") {
if (!client.ip ~ purge) {
error 405 "Not allowed.";
}
if (req.url == "/varnishpurge") {
ban("req.http.host == " + req.http.host + " && req.url ~ ^/");
error 841 "Purged site.";
}
else {
ban("req.http.host == " + req.http.host + " && req.url ~ ^" + regsub( req.url, "^/varnishpurge(.*)$", "\1" ) + "$");
error 842 "Purged page.";
}
}
# spoof the client IP (taken from http://utvbloggen.se/snabb-guide-till-varnish/)
remove req.http.X-Forwarded-For;
set req.http.X-Forwarded-For = client.ip;
# Force delivery from cache even if other things indicate otherwise
if (req.url ~ "\.(flv)") {
# pipe flash start away
return(pipe);
}
if (req.url ~ "\.(jpg|jpeg|gif|png|tiff|tif|svg|swf|ico|css|vsd|doc|ppt|pps|xls|pdf|mp3|mp4|m4a|ogg|mov|avi|wmv|sxw|zip|gz|bz2|tgz|tar|rar|odc|odb|odf|odg|odi|odp|ods|odt|sxc|sxd|sxi|sxw|dmg|torrent|deb|msi|iso|rpm)$") {
# cookies are irrelevant here
unset req.http.Cookie;
unset req.http.Authorization;
}
# Force short-circuit to the real site for these dynamic pages
if (req.url ~ "/customcf/" || req.url ~ "/uid/editData.cfm" || req.url ~ "^/private/") {
return(pass);
}
# Remove user agent, since Apache will serve these resources the same way
if (req.http.User-Agent) {
set req.http.User-Agent = "";
}
if (req.http.Cookie) {
# removes all cookies named __utm? (utma, utmb...) - tracking thing
set req.http.Cookie = regsuball(req.http.Cookie, "(^|; ) *__utm.=[^;]+;? *", "\1");
# remove cStates for RHM boxes (the server doesn't need to know these, JS will handle this client-side)
set req.http.cookie = regsub(req.http.cookie, "(; )?cStates=[^;]*", ""); #cStates might sometimes have a blank value
# remove ColdFusion session cookie stuff
if (!req.url ~ "^/publications/" && !req.url ~ "^/uid/admin/") {
set req.http.cookie = regsub(req.http.cookie, "(; )?CFID=[^;]+", "");
set req.http.cookie = regsub(req.http.cookie, "(; )?CFTOKEN=[^;]+", "");
}
# Remove the cookie header if it's empty after cleanup
if (req.http.cookie ~ "^;? *$") {
# The only cookie data left is a semicolon or spaces
remove req.http.cookie;
}
}
}
#
# Called when the requested object was found in the cache
#
sub vcl_hit {
# Allow administrators to easily flush the cache from their browser
if (client.ip ~ CTRLF5) {
if (req.http.pragma ~ "no-cache" || req.http.Cache-Control ~ "no-cache") {
set obj.ttl = 0s;
return(pass);
}
}
}
#
# Called when the requested object has been retrieved from the
# backend, or the request to the backend has failed
#
sub vcl_fetch {
set beresp.grace = 1h;
# strip the cookie before the image is inserted into cache.
if (req.url ~ "\.(jpg|jpeg|gif|png|tiff|tif|svg|swf|ico|css|vsd|doc|ppt|pps|xls|pdf|mp3|mp4|m4a|ogg|mov|avi|wmv|sxw|zip|gz|bz2|tgz|tar|rar|odc|odb|odf|odg|odi|odp|ods|odt|sxc|sxd|sxi|sxw|dmg|torrent|deb|msi|iso|rpm)$") {
remove beresp.http.set-cookie;
set beresp.ttl = 100w;
}
# Remove CF session cookies for everything but the publications subsite
if (!req.url ~ "^/publications/" && !req.url ~ "/customcf/" && !req.url ~ "^/uid/admin/" && !req.url ~ "^/uid/editData.cfm") {
remove beresp.http.set-cookie;
}
if (beresp.ttl < 48h) {
set beresp.ttl = 48h;
}
}
#
# Called before a cached object is delivered to the client
#
sub vcl_deliver {
# We'll be hiding some headers added by Varnish. We want to make sure people are not seeing we're using Varnish.
remove resp.http.X-Varnish;
remove resp.http.Via;
# We'd like to hide the X-Powered-By headers. Nobody has to know we can run PHP and have version xyz of it.
remove resp.http.X-Powered-By;
}
Can anyone see the problem or problems?
Update: According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3
Note: When handling an If-Modified-Since header field, some
servers will use an exact date comparison function, rather than a
less-than function, for deciding whether to send a 304 (Not
Modified) response.
It seems this may be Varnish's behaviour. I'm sending another date which is previous to the real file's last modified date, but not exactly what is cached in Varnish.
The problem is the non-GMT time zone in the If-Modified-Since request header:
If-Modified-Since: Wed, 15 Feb 2012 07:25:00 CET
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3
All HTTP date/time stamps MUST be represented in Greenwich Mean Time (GMT), without exception.
Varnish implements this as a strict requirement, whereas Apache handles nonstandard date formats more robustly. This is why you observed different behavior when querying Apache directly.
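To illustrate the difference between strict and lenient handling, here is a small Python sketch of a strict check. It accepts only the RFC 1123 form with a literal GMT zone and treats anything else as if the header were absent, so a full 200 response would follow. This models the behaviour described above; it is not taken from the Varnish source, and `parse_http_date_strict` and `not_modified` are hypothetical names.

```python
import calendar
import time
from typing import Optional


def parse_http_date_strict(value: str) -> Optional[int]:
    """Return epoch seconds for an RFC 1123 date in GMT, else None.

    %a and %b are locale-dependent; the default C locale is assumed.
    """
    try:
        parsed = time.strptime(value, "%a, %d %b %Y %H:%M:%S GMT")
    except ValueError:
        return None  # e.g. a "CET" suffix does not match the literal "GMT"
    return calendar.timegm(parsed)


def not_modified(if_modified_since: str, last_modified: str) -> bool:
    """True if a 304 could be sent under the strict rule sketched here."""
    ims = parse_http_date_strict(if_modified_since)
    lm = parse_http_date_strict(last_modified)
    if ims is None or lm is None:
        return False  # unparseable header is ignored -> full response
    return lm <= ims


last_modified = "Wed, 09 Jun 2004 16:07:50 GMT"
print(not_modified("Wed, 15 Feb 2012 07:25:00 CET", last_modified))
print(not_modified("Wed, 15 Feb 2012 06:25:00 GMT", last_modified))
```

With the CET header the strict parser gives up and the function reports that no 304 applies; the same instant expressed in GMT parses and compares as expected.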
Since this question is still open with no answers and several up votes, I'll post an answer.
This does not seem to be an issue with Varnish 3.0.0 (which we are using) or the current version of Varnish you are running on your site.
200 OK response when the If-Modified-Since date is earlier than the Last-Modified date:
# curl -z "Wed, 09 Jun 2010 16:07:50 GMT" --head "www.quotaproject.org/robots.txt"
HTTP/1.1 200 OK
Server: Apache
Last-Modified: Tue, 22 Jan 2013 13:23:41 GMT
Vary: Accept-Encoding
Cache-Control: public
Content-Type: text/plain; charset=UTF-8
Date: Mon, 25 Nov 2013 15:00:45 GMT
Age: 69236
Connection: keep-alive
X-Cache: HIT
304 response when If-Modified-Since is after Last-Modified date:
# curl -z "Wed, 09 Jun 2013 16:07:50 GMT" --head "www.quotaproject.org/robots.txt"
HTTP/1.1 304 Not Modified
Server: Apache
Last-Modified: Tue, 22 Jan 2013 13:23:41 GMT
Vary: Accept-Encoding
Cache-Control: public
Content-Type: text/plain; charset=UTF-8
Date: Mon, 25 Nov 2013 15:00:52 GMT
Age: 69243
Connection: keep-alive
X-Cache: HIT
The same with the example you gave in varnishlog output:
# curl -z "Wed, 15 Feb 2012 07:25:00 CET" --head "www.quotaproject.org/fr/CS/CS_AU-Maboreke-6-6-2004.pdf"
HTTP/1.1 304 Not Modified
Server: Apache
Last-Modified: Wed, 09 Jun 2004 16:07:50 GMT
Cache-Control: public
Content-Type: application/pdf
Accept-Ranges: bytes
Date: Mon, 25 Nov 2013 15:08:48 GMT
Age: 335802
Connection: keep-alive
X-Cache: HIT
I would say Varnish works as expected. Maybe this was a problem with the Varnish build you were using or there was something amiss with the testing methodology. I couldn't see any problems with your VCL either.

Resources