I have a very simple site and am setting up varnish cache on it. The server is nginx.
The cache seems to get automatically purged after 120 seconds as when I go on the site i see the Age header being reset.
Can anyone point me towards where to remove this and have pages cached indefinitely or until i manually purge varnish?
You did not mention your OS or distribution, but for example on CentOS /etc/sysconfig/varnish sets the defaults for Varnish. Amongst those defaults is VARNISH_TTL=120, which sets the default TTL to 120 seconds.
If you only wish to set a high TTL for all objects, you can just edit the default one in /etc/sysconfig/varnish.
If the backend sends to the Varnish age headers, the Varnish will consider them as a real expiration date just like a web browser and will purge it's content when the header expires.
You should make sure that the backend doesn't send cache-control headers to the varnish and only the varnish will add cache-control headers when sending data to the browsers.
Related
We are using a Standard Microsoft Azure CDN to serve images for a web application. These images are requested as /api/img?param1=aaa¶m2=bbb, so we cache every unique URL. The cache duration is 7 days. We also override the "Cache-Control" header so that the image is only cached for 1 hours by the client browser.
The problem is, the images do not stay in cache for 7 days. The first day after the images have been requested, they seem to be in CDN (I verify the X-Cache header and it returns "TCP_HIT"), however if I make the same requests 2-3 days later, around 25% of images are not cached anymore (the X-Cache header is "TCP_MISS"). The origin server receives and logs requests, so I am sure that they bypass CDN.
Is there any explanation for this? Do I have to set additional parameters for images to be cached correctly?
We use the following settings :
Caching rules "Cache every unique URL"
Rules Engine:
if URL path begins with /api/img
then Cache expiration: [cache behaviour] Override, [duration] 7 days
and then Modify response header: Overwrite, "Cache-Control", "public, max-age=3600"
From some folks on the CDN Product Group:
for all but the Verizon Premium SKU, the max-age and cache expiration are one and the same thing, so 2c overrides 2b.
The CDN reserves the right to flush entries from the CDN if they are not used - cache items are evicted using an LRU algorithm.
the Verizon Premium SKU offers the ability to have two different age values, one for browser-to-edge (the "External Max-Age") and one for edge-to-source (original expiration time, or forced override time - see the docs).
I set up an Azure Verizon Premium CDN a few days ago as follows:
Origin: An Azure web app (.NET MVC 5 website)
Settings: Custom Domain, no geo-filtering
Caching Rules: standard-cache (doesn't care about parameters)
Compression: Enabled
Optimized for: Dynamic site acceleration
Protocols: HTTP, HTTPS, custom domain HTTPS
Rules: Force HTTPS via Rules Engine (if request scheme = http, 301 redirect to https://{customdomain}/$1)
So - this CDN has been running for a few days now, but the ADN reports are saying that nearly 100% (99.36%) of the cache status is "CONFIG_NOCACHE" (Description: "The object was configured to never be cached in accordance with customer-specific configurations residing on the edge servers, so the response was served via the origin server.") A few (0.64%) of them are "NONE" (Description: "The cache was bypassed entirely for this request. For instance, the request was immediately rejected by the token auth module, or the client request method used an uncacheable request method such as "PUT".") Also, in the "Cache Hit" report, it says "0 hits, 0 misses" for every day. Nothing is coming through the "HTTP Large" side, only "ADN".
I couldn't find these exact messages while searching around, but I've tried:
Updating cache-control header to max-age, public (ie: cache-control: public,max-age=1209600)
Updating the cache-control header to max-age (cache-control: max-age=1209600)
Updating the expires header to a date way in the future (expires: Tue, 19 Jan 2038 03:14:07 GMT)
Using different browsers so the request cache info is different. In Chrome, the request is "cache-control: no-cache" in my browser. In Firefox, it'll say "Cache-Control: max-age=0". In any case, I'd assume the users on the website wouldn't have these same settings, right?
Refreshing the page a bunch of times, and looking at the real time report to see hits/misses/cache statuses, and it shows the same thing - CONFIG_NOCACHE for almost everything.
Tried running a "worldwide" speed test on https://www.dotcom-tools.com/website-speed-test.aspx, but that had the same result - a bunch of "NOCACHE" hits.
Tried adding ADN rules to set the internal and external max age to 864000 sec (10 days).
Tried adding an ADN rule to ignore "no-cache" requests and just return the cached result.
So, the message for "NOCACHE" says it's a node configuration issue... but I haven't really even configured it! I'm so confused. It could also be an application issue, but I feel like I've tried all the different permutations of "cache-control" that I can. Here's an example of one file that I'd expect to be cached:
Ultimately, I would hope that most of the requests are being cached, so I'd see most of the requests be "TCP Hit". Maybe that's incorrect? Thanks in advance for your help!
So, I eventually figured out this issue. Apparently Azure Verzion Premium CDN ADN platform has "bypass cache" enabled by default.
To disable this behavior you need to configure additional features to your caching rules.
Example:
IF Always
Features:
Bypass Cache Disabled
Force Internal Max-Age Response 200 864000 Seconds
Ignore Origin No-Cache 200
I've noticed an issue on one of my sites whereby my content pages (which shouldn't set any cookies, should all be returning "Cache-Control: public" with a max-age set, and don't require authorization).
My issue is that somehow HitPass objects are making it into my cache, removing the caching from that page. I need to debug this, but am confused at exactly how best to do this particularly as I'm unable to replicate the issue.
I notice that varnish gives me an ID beside the HitPass in the varnish log. I assume this is the varnish ID for the request that generated the HitPass, and that searching back in a varnish log would tell me exactly what was wrong with the response?
Would it be better to just remove the SetCookie header from pages that I want to cache? The problem is that vcl_fetch is called even if a URL is passed... Is there any way to tell in vcl_fetch whether or not the current request has been passed by vcl_recv?
SetCookie is indeed a reason why you get hit-for-pass objects in your cache. This is an important optimization for non-prepared sites. A hit-for-pass will let varnish go straight to the backend for each of these request instead of stall them and wait for the response of the previous one.
I'm not sure as to exactly what you are wanting to debug. If it's the set-cookie, you should probably either remove that from the backend or make your own rules on what ones to cache or what one's to ignore in your cache. If you still need the set-cookie and it has unique values, hit-for-pass is the way to do that best.
I'm running varnish on a dedicated server. When i load a page, it is delivered via Apache and on the second and subsequent hits it is then delivered via Varnish Cache (i.e. I can see two timestamps in X-Varnish headers).
But when i open up the same page from some other computer, it's again delivered from the backend (apache) for the first time and on further reloads it comes from Varnish.
If a page is already in Varnish Cache, isn't it supposed to be delivered via Varnish even on a new computer for the first time? I've tried simple hello world php files without any database calls with the same effect. Might it be something wrong with my vcl file or Varnish works this way only?
check whether you sending session data (cookies) which then look like unique calls to varnish. the docs show you how to strip cookies.
Jon is right. I had similar problem. You also need to clean up your cookie and cache before test. Check if the first visit response header, it tries to set cookie. If so, you can do "unset beresp.http.Set-Cookie under vcl_fetch.
I'm using Varnish without touching any configuration (just the PORT forwarding to Apache to 8080).
But I got two issues:
I visit a URL of an image, I delete the image, and I visit again and it exists … Varnish cached it … how can i tell varnish to look first if the file AT LEAST exists before serving it from his cache ?
The PHP files are not being cached (I mean, the HTML content generated by the PHP). I always see in the Headers: Age: 0 … any clue ?
Thank you !
I visit a URL of an image, I delete the image, and I visit again and
it exists … Varnish cached it … how can i tell varnish to look first
if the file AT LEAST exists before serving it from his cache ?
Eh, the whole purpose of caching is not having to do the same work (like checking for existence & loading a file, or generating a PHP response) over and over again, but to reuse the generated response. Varnish never new about the existence of some file to begin with (your backend server did the math) so it can never check if 'the file at least exists'.
There are however ways to instruct varnish not to cache urls forever. For instance; if your back-end response instructs any cache to not reuse the result (certain HTTP response headers indicate this), varnish will not cache it. Varnish will be smart enough (by default) to not cache responses with cookies too (which probably answers your second question). You can tell varnish to only cache a response for a certain period (like 30 seconds), so your deletes will be picked up pretty quickly. You could PURGE urls from varnish after you changed/delete a file. If your backend server does not tell this correctly with it's response headers, your can override this behavior by writing your own .vcl file.
The PHP files are not being cached (I mean, the HTML content generated
by the PHP). I always see in the Headers: Age: 0 … any clue ?
I can guess: you're setting cookies. But it would really help if you added the response headers to your question.