Why specify Last-Modified? - cache-control

I'm a bit confused by this header.
I already added "Expires" and "Cache-Control: max-age" in my .htaccess.
I also removed "ETag" and "Last-Modified", as per this advice:
If you remove the Last-Modified and ETag header, you will totally eliminate If-Modified-Since and If-None-Match requests and their 304 Not Modified Responses, so a file will stay cached without checking for updates until the Expires header indicates new content is available!
So why does Google's PageSpeed show the following?
Specify a cache validator:
The following resources are missing a cache validator.
Resources that do not specify a cache validator cannot be refreshed efficiently. Specify a Last-Modified or ETag header to enable cache validation for the following resources
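To make the warning concrete, here is a minimal Python sketch of the kind of check it describes. The function name and the sample headers are made up for illustration; this is not PageSpeed's actual code, only the rule quoted above: a response has a cache validator if it carries an ETag or a Last-Modified header.

```python
def missing_cache_validator(headers: dict) -> bool:
    """Return True when neither ETag nor Last-Modified is present.

    Header names are compared case-insensitively, as HTTP requires.
    """
    names = {name.lower() for name in headers}
    return not ({"etag", "last-modified"} & names)


# Hypothetical response after stripping both validators, as in the question.
stripped = {
    "Expires": "Thu, 31 Dec 2037 23:55:55 GMT",
    "Cache-Control": "max-age=31536000",
}
print(missing_cache_validator(stripped))  # True: nothing to revalidate with
```

With no validator, a cache whose copy has expired cannot ask "has this changed?" and has to download the whole file again instead of receiving a short 304 Not Modified.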

Related

How to set cache-control to always check for updates but always fall back to cache if server is unreachable

I'm missing something trying to understand cache-control (e.g., from https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control).
How do I set up cache control to accomplish the following (I'll be using an .htaccess file):
If client fetches a file, it should always store it in the cache.
When client needs a file, it should always check to see if the file has been changed and download a new copy if it has changed.
If the attempt to check fails -- e.g., server down or no Internet connection -- client should always use a cached copy if available, no matter how old. Any copy is better than none.
Use Cache-Control: no-cache and set the ETag header.
The resource will be stored in the cache. This is true of any cache header other than no-store.
no-cache tells the client that it must check with the server to see if the cached copy is valid. It does this by sending a conditional request, which requires that the cached response have an ETag (or Last-Modified) header.
Using a cached copy of a resource when there's no connectivity is the default behavior. You could prevent this with the must-revalidate directive.
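The behavior described above can be sketched as a small Python simulation. This is a toy model, not a real browser cache: `CachedResponse`, `load`, and the `fetch` callback are hypothetical names introduced here, and `fetch` stands in for a conditional request that sends the cached ETag in If-None-Match.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedResponse:
    etag: str
    body: bytes


# Hypothetical fetch callback: takes the cached ETag (or None) and returns
# (status, etag, body). It raises ConnectionError if the server is unreachable.
Fetch = Callable[[Optional[str]], Tuple[int, str, bytes]]


def load(cache: Optional[CachedResponse], fetch: Fetch) -> CachedResponse:
    """Revalidate on every use (no-cache); fall back to the stale copy on failure."""
    try:
        status, etag, body = fetch(cache.etag if cache else None)
    except ConnectionError:
        if cache is None:
            raise  # nothing cached, so there is nothing to fall back to
        return cache  # stale copy is better than none
    if status == 304 and cache is not None:
        return cache  # server confirmed the cached copy is still valid
    return CachedResponse(etag=etag, body=body)  # new or changed content
```

In this model, adding must-revalidate would correspond to removing the fallback branch, so that a failed check becomes an error instead of a stale response.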

Should HTTP 404 Not Found responses with Set-Cookie directive contain cache-control headers

In some situations, when my application responds with a 404 Not Found code, it also returns a Set-Cookie directive with a session identifier, but there is no Cache-Control or Pragma directive. Does this mean that the session identifier can be stored in the browser cache, and does this affect the security of the application? I am not sure whether all responses with Set-Cookie should contain caching directives.
Whether a cookie is permanently stored in the browser or not is controlled by the Expires and Max-Age properties. Cache-Control and Pragma headers only affect page contents. So I think you're good on 404 pages even without explicit cache headers (* but see the edit below).
Session cookies should always be set without an explicit expiry date, in which case they won't normally be stored on disk and will be removed when the user quits the browser.
(Note that there are cases beyond your control when such data from memory will still be persisted to disk, like for example when a user decides to hibernate, or the computer runs out of memory and starts to swap.)
Edit (see comments):
In case of normal pages that set cookies, you usually have headers to prevent caching of sensitive info like Cache-control: no-cache,no-store,must-revalidate. This I think inherently includes not caching cookie responses either, so you don't need to explicitly set it on normal pages.
So the question is then, what cookie is set on a 404 page? If an unauthenticated user downloads the 404 page and gets a session cookie, that cookie is useless for an attacker, as the application should not be vulnerable to session fixation (the cookie value should change upon logon anyway). If it is an authenticated user, why would the application set the session cookie again on a 404 page? If it does though, you should send headers to prevent caching, that's a good catch by Skipfish. (In fact, you can do this for unauthenticated users too, but I would rate that a very low risk.)
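As a rough illustration of the two points above (a session cookie without an explicit expiry, plus headers that prevent caching), here is a short Python sketch using the standard library's http.cookies module. The cookie name SESSIONID and the function name are assumptions made for this example, not anything from the application in question.

```python
from http.cookies import SimpleCookie


def not_found_headers(session_id: str) -> list:
    """Build headers for a 404 response that sets a session cookie.

    No Expires or Max-Age attribute is set, so the browser treats the
    cookie as a session cookie and discards it when it quits.
    """
    cookie = SimpleCookie()
    cookie["SESSIONID"] = session_id  # hypothetical cookie name
    cookie["SESSIONID"]["path"] = "/"
    cookie["SESSIONID"]["httponly"] = True
    return [
        ("Set-Cookie", cookie["SESSIONID"].OutputString()),
        # Same directives as in the answer: keep the response out of caches.
        ("Cache-Control", "no-cache, no-store, must-revalidate"),
    ]


for name, value in not_found_headers("abc123"):
    print(f"{name}: {value}")
```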

Varnish caching - age gets reset

I have a very simple site and am setting up varnish cache on it. The server is nginx.
The cache seems to get purged automatically after 120 seconds: when I visit the site, I see the Age header being reset.
Can anyone point me towards where to change this so that pages are cached indefinitely, or until I manually purge Varnish?
You did not mention your OS or distribution, but for example on CentOS /etc/sysconfig/varnish sets the defaults for Varnish. Amongst those defaults is VARNISH_TTL=120, which sets the default TTL to 120 seconds.
If you only wish to set a high TTL for all objects, you can just edit the default one in /etc/sysconfig/varnish.
If the backend sends caching headers (such as Cache-Control or Expires) to Varnish, Varnish will treat them as a real expiration time, just like a web browser, and will expire its cached content when that time passes.
You should make sure that the backend doesn't send Cache-Control headers to Varnish, and that only Varnish adds Cache-Control headers when sending data to browsers.
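The interaction between the default TTL and backend headers can be approximated with a few lines of Python. This is a deliberately simplified model, not Varnish's actual logic (which also looks at Expires, Age, and other inputs); the function name is invented for this sketch, and the default of 120 mirrors the VARNISH_TTL value mentioned above.

```python
import re
from typing import Optional

DEFAULT_TTL = 120  # seconds, mirroring VARNISH_TTL=120


def effective_ttl(cache_control: Optional[str], default_ttl: int = DEFAULT_TTL) -> int:
    """Pick a TTL: s-maxage wins over max-age, else fall back to the default."""
    if cache_control:
        for directive in ("s-maxage", "max-age"):
            match = re.search(
                rf"(?:^|[\s,]){directive}\s*=\s*(\d+)", cache_control, re.IGNORECASE
            )
            if match:
                return int(match.group(1))
    return default_ttl


print(effective_ttl(None))               # 120: no backend header, default applies
print(effective_ttl("max-age=60"))       # 60: backend header overrides the default
```

Under this model, a page that expires after exactly 120 seconds suggests the backend sends no usable caching header and the default TTL is in effect.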

How to remove the HTTP ETag header by IIS 7.5?

We wish to remove the ETag header added automatically by IIS 7.5. None of the online suggestions worked for us, which may be due to a different version of IIS.
One of the solutions we have come across repeatedly included creating a new ETag HTTP response header with "" as value. This approach adds , "" after the original ETag instead.
I have been able to remove the ETag HTTP header using an outbound rewrite rule.

HTTP Headers - Cache Question

I am making a request for an image, and the response headers that I get back are:
Accept-Ranges:bytes
Content-Length:4499
Content-Type:image/png
Date:Tue, 24 May 2011 20:09:39 GMT
ETag:"0cfe867f5b8cb1:0"
Last-Modified:Thu, 20 Jan 2011 22:57:26 GMT
Server:Microsoft-IIS/7.5
X-Powered-By:ASP.NET
Note the absence of the Cache-Control header.
On subsequent requests on Chrome, Chrome knows to go to the cache to retrieve the image. How does it know to use the cache? I was under the impression that I would have to tell it with the Cache-Control header.
You have both an ETag and a Last-Modified header. It probably uses those. But for that to happen, it still needs to make a request with If-None-Match or If-Modified-Since respectively.
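One possibility the answer above does not spell out: when a response carries Last-Modified but no explicit freshness information, the HTTP caching specification allows a cache to estimate a freshness lifetime heuristically, and a commonly suggested choice is 10% of the time elapsed since Last-Modified. The Python sketch below applies that rule to the headers from the question. The function name and the 10% default are assumptions for illustration; Chrome's actual heuristic may differ.

```python
from email.utils import parsedate_to_datetime


def heuristic_freshness_seconds(date: str, last_modified: str, fraction: float = 0.1) -> int:
    """Estimate a freshness lifetime as a fraction of (Date - Last-Modified)."""
    elapsed = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
    return int(elapsed.total_seconds() * fraction)


lifetime = heuristic_freshness_seconds(
    "Tue, 24 May 2011 20:09:39 GMT",   # Date header from the question
    "Thu, 20 Jan 2011 22:57:26 GMT",   # Last-Modified header from the question
)
print(lifetime)  # 1070353 seconds, a little over 12 days
```

Under this estimate, the image could be served from cache without any request for roughly 12 days, which would match what the question describes.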
To set Cache-Control, you have to specify it yourself. You can do it in web.config, in IIS Manager for selected folders (static, images, ...), or in code. The HTTP 1.1 standard recommends one year in the future as the maximum expiration time.
Setting the expiration date one year in the future is considered good practice for all static content on your site. Without it, the browser makes If-Modified-Since requests, which can take longer than first-time requests for small static files. The ETag header is used in these conditional calls.
When you have Cache-Control: max-age=315360000, plain cached responses will far outnumber If-Modified-Since calls, so it is worth removing the ETag header to get smaller response headers for static files. IIS doesn't have a setting for that, so you have to call Response.Headers.Remove("ETag"); in a PreSendRequestHeaders event handler.
And if you want to optimize your headers further, you can remove X-Powered-By: ASP.NET in the IIS settings, and the X-AspNet-Version header (although I don't see it in your response) in web.config by setting enableVersionHeader="false" on the system.web/httpRuntime element.
For more tips, I suggest this great book: http://www.amazon.com/Ultra-Fast-ASP-NET-Build-Ultra-Scalable-Server/dp/1430223839

Resources