I'm having trouble figuring out a Cache-Control header for delivering the files of an HTML5 app that uses the AppCache, one that works on all major browsers (Chrome/Safari, Opera, Firefox, IE10).
The problem I run into is that when one kind of header works for a certain browser, another one may break completely. For example:
Cache-Control: private
Works fine on WebKit browsers, which refresh, load the updated files and replace them in the cache. However, Firefox and IE10 both refuse to load the new files and instead serve them from the cache (not the appcache!), even though they recognize the updated manifest file.
Cache-Control: no-cache
also works fine on WebKit browsers, and it makes Firefox and IE10 load the new files instead of serving them from their cache, but it breaks offline functionality: as the header instructs, the browsers essentially don't cache the files, even though they are explicitly listed in the appcache manifest.
Lastly, I tried
Cache-Control: must-revalidate
Which works similarly to no-cache, except that this time it is WebKit that fails to retain the files for offline use rather than Firefox and IE10.
Sending no Cache-Control header yields the same results as private or public, presumably because browsers simply fall back to that behavior by default.
So what am I missing? public gives the same results as private, and setting a max-age is not an option, since updates (including hotfixes) are not delivered on a regular schedule but whenever they are available or needed.
Can someone shed some light on which Cache-Control header is the correct one to use and will work on all browsers?
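For reference, one commonly suggested piece of this puzzle is to make sure the manifest itself is never served stale, so update checks always reach the server. A minimal sketch of that part (assuming a Sinatra-style server and a hypothetical app.appcache file; this is illustrative, not a complete answer to the question):

require 'sinatra'

get '/app.appcache' do
  # The manifest is served with no-cache so that every appcache update
  # check hits the server and sees new versions immediately.
  content_type 'text/cache-manifest'
  headers "Cache-Control" => "no-cache"
  File.read('app.appcache')
end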
Related
I'm working on a web app and I'm having the following problem:
When I visit a page, my server sends the response with a Cache-Control: no-cache header.
Then I do some changes (graphql mutations) on that page.
When I go to another page and then click the browser's back button, the browser reads the outdated response from the disk cache instead of sending a request to the server to get the changed data.
browser loads response from cache although no-cache header is set
I'm wondering if there is something missing in my headers telling the browser not to use the disk cache?
Some info:
The browser does not send a request to my server. (So it is not cached somewhere else.)
It is not the back-forward cache. (There is already some logic handling the bfcache.)
I can reproduce it in all my browsers. (e.g. Firefox, Chrome, ...)
When I disable the disk cache in the Firefox settings, it works correctly. (Then the bfcache kicks in.)
I also found the following thread. Is there a better solution?
Chrome is caching even with HTTP no-cache headers
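For reference, a stronger directive than no-cache is no-store, which forbids writing the response to the disk cache at all. A minimal sketch of sending it (assuming a Sinatra-style server and a hypothetical /dashboard route, not the asker's actual stack):

require 'sinatra'

get '/dashboard' do
  # no-store keeps the response out of the disk cache entirely, so a
  # back navigation has to go to the server (or the bfcache) instead.
  headers "Cache-Control" => "no-store, no-cache, must-revalidate"
  "Dashboard rendered at #{Time.now}."
end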
We have 2 Microsoft add-ins written using the Office JS framework. As we understand it, the static website (taskpane.html) is loaded whenever the pane is opened.
Changes to our plugin are mostly cosmetic, so we usually do not update the version of the plugin and just push new code to the bucket hosting the static website.
The issue we are facing is caching of the build bundle: unless we manually clear the cache using developer tools, we do not get the updated website inside the plugin pane.
We have disabled caching on the S3 end so that the Cache-Control header value is no-cache, but even after that I see HTTP status code 304 on plugin refresh for the taskpane.html code.
Are we supposed to distribute a new version of the plugin even for website updates?
Are we supposed to distribute a new version of the plugin even for website updates?
No, that is not required.
We have disabled caching on the S3 end so that the Cache-Control header value is no-cache
But the files loaded by the web browser are already cached on the end users' machines. To get them requested anew, you need the users to clear their cache; only after that will new requests follow the HTTP headers set on the server. Note that the Cache-Control header field holds directives (instructions), in both requests and responses, that control caching in browsers and shared caches (e.g. proxies, CDNs).
It is not clear which directives are actually being used for the Cache-Control HTTP header.
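For comparison, a response that should always be revalidated typically carries directives along these lines (the values are illustrative and not taken from the question):

Cache-Control: no-cache, must-revalidate
Pragma: no-cache
Expires: 0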
I've been running some penetration tests using OWASP ZAP and it raises the following alert for all requests: X-Content-Type-Options Header Missing.
I understand the header, and why it is recommended. It is explained very well in this StackOverflow question.
However, I have found various references that indicate that it is only used for .js and .css files, and that it might actually be a bad thing to set the header for other MIME types:
Note: nosniff only applies to "script" and "style" types. Also applying nosniff to images turned out to be incompatible with existing web sites. [1]
Firefox ran into problems supporting nosniff for images (Chrome doesn't support it there). [2]
Note: Modern browsers only respect the header for scripts and stylesheets and sending the header for other resources (such as images) when they are served with the wrong media type may create problems in older browsers. [3]
The above references (and others) indicate that it is bad to simply set this header for all responses, but despite following any relevant-looking links and searching on Google, I couldn't find any reason behind this argument.
What are the risks/problems associated with setting X-Content-Type-Options: nosniff and why should it be avoided for MIME types other than text/css and text/javascript?
Or, if there are no risks/problems, why are Mozilla (and others) suggesting that there are?
The answer by Sean Thorburn was very helpful and pointed me to some good material, which is why I awarded the bounty. However, I have now done some more digging and I think I have the answer I need, which turns out to be the opposite of the answer given by Sean.
I will therefore answer my own questions:
The above references (and others) indicate that it is bad to simply set this header for all responses, but despite following any relevant-looking links and searching on Google, I couldn't find any reason behind this argument.
There is a misinterpretation here - this is not what they are indicating.
The resources I found during my research referred to the header only being respected for "script and style types", which I interpreted to mean files that were served as text/javascript or text/css.
However, what they are actually referring to is the context in which the file is loaded, not the MIME type it is served with. For example, <script> or <link rel="stylesheet"> tags.
Given this interpretation, everything makes a lot more sense and the answer becomes clear:
You need to serve all files with a nosniff header to reduce the risk of injection attacks from user content.
Serving up only CSS/JS files with this header is pointless, as these types of file would be acceptable in this context and don't need any additional sniffing.
However, for other types of file, by disallowing sniffing we ensure that only files whose MIME type matches the expected type are allowed in each context. This mitigates the risk of a malicious script being hidden in an image file (for example) in a way that would bypass upload checks and allow third-party scripts to be hosted from your domain and embedded into your site.
What are the risks/problems associated with setting X-Content-Type-Options: nosniff and why should it be avoided for MIME types other than text/css and text/javascript?
Or, if there are no risks/problems, why are Mozilla (and others) suggesting that there are?
There are no problems.
The problems described concern the risk of web browsers breaking compatibility with existing sites if they apply nosniff rules when loading content. Mozilla's research indicated that enforcing nosniff on <img> tags would break a lot of sites due to server misconfigurations, which is why the header is ignored in image contexts.
Other contexts (e.g. HTML pages, downloads, fonts, etc.) either don't employ sniffing, don't have an associated risk or have compatibility concerns that prevent sniffing being disabled.
Therefore they are not, in fact, suggesting that you should avoid using this header.
However, the issues that they talk about do result in an important footnote to this discussion:
If you are using a nosniff header, make sure you are also serving the correct Content-Type header!
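As a rough illustration of that footnote (a sketch assuming a Sinatra-style app and a hypothetical /uploads route; the route and MIME type are made up for the example):

require 'sinatra'

# Send nosniff on every response; it only helps if the declared
# Content-Type actually matches what is being served.
before do
  headers "X-Content-Type-Options" => "nosniff"
end

get '/uploads/:name' do
  # A user-uploaded image is served with an explicit, correct MIME type.
  # With nosniff set, a browser will refuse to execute it if someone
  # references it from a <script> or <link rel="stylesheet"> tag.
  content_type 'image/png'
  File.read(File.join('uploads', File.basename(params[:name])))
end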
Some references that helped me understand this a bit more fully:
The WhatWG Fetch standard that defines this header.
A discussion and code commit relating to this header for the webhint.io site checking tool.
I'm a bit late to the party, but here's my 2c.
This header makes a lot of sense when serving user-generated content, so that people can't upload a .png file that actually contains some JS code and then reference that .png in a <script> tag.
You don't necessarily have to set it for the static files that you have 100% control of.
I would stick to js, css, text/html, json and xml.
Google recommends using unguessable CSRF tokens provided by the protected resources for other content types, i.e. generating the token using a JS resource protected by the nosniff header.
You could add it to everything, but that would just be tedious and, as you mentioned above, you may run into compatibility and user issues.
https://www.chromium.org/Home/chromium-security/corb-for-developers
I'm having trouble getting compression to work with ServiceStack. I return .ToOptimizedResult from my server, and I get a log entry that tells me that the header is added:
ServiceStack.WebHost.Endpoints.Extensions.HttpResponseExtensions:
DEBUG: Setting Custom HTTP Header: Content-Encoding: deflate
However the content returned is not compressed. I've checked using both Fiddler and Network inspector in Chrome.
Sorry to all.
It seems that my antivirus (BitDefender) may be decompressing the data to scan for viruses, even though I disabled the AV. When testing on other computers, the output is compressed.
Background:
IIS 7
AspNet 3.5 web app
Chrome dev tools lists 98 requests for the home page of the web app (aspx + js + css + images). On subsequent requests, the status code is 200 for the css/image files. With no cache info, the browser asks the server each time whether a file has to be updated. OK.
In IIS 7 I set the HTTP cache-control header to 6 hours for the "ressources" folder. In Chrome, using dev tools, I can see that the header is set correctly in the response:
Cache-Control: max-age=21600
But I still get 98 requests... I thought the browser should not request a resource if its expiration date has not been reached, and I was expecting the number of requests to drop...
I got it. Google Chrome ignores the Cache-Control or Expires header if you make a request immediately after another request to the same URI in the same tab (by clicking the refresh button, pressing the F5 key or pressing Command + R). It probably has an algorithm to guess what the user really wants to do.
A way to test the Cache-Control header is to return an HTML document with a link to itself. When clicking the link, Chrome serves the document from the cache. E.g., name the following document self.html:
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Test Page</title>
  </head>
  <body>
    <p>
      <a href="self.html">Link</a> to the same page.
      If correctly cached, a request should not be made
      when clicking the link.
    </p>
  </body>
</html>
Another option is to copy the URL and paste it in the same tab or another tab.
UPDATE: A Chrome post published on January 26, 2017 describes the previous behavior and how it is changing so that only the main resource is revalidated, not the sub-resources:
Users typically reload either because a page is broken or the content seems stale. The existing reload behavior usually solves broken pages, but stale content is inefficiently addressed by a regular reload, especially on mobile. This feature was originally designed in times when broken pages were quite common, so it was reasonable to address both use cases at once. However, this original concern has now become far less relevant as the quality of web pages has increased. To improve the stale content use case, Chrome now has a simplified reload behavior to only validate the main resource and continue with a regular page load. This new behavior maximizes the reuse of cached resources and results in lower latency, power consumption, and data usage.
A Facebook post, also published on January 26, 2017, mentions that they found a piece of code where Chrome invalidates all cached resources after a POST request:
we found that Chrome would revalidate all resources on pages that were loaded from making a POST request. The Chrome team told us the rationale for this was that POST requests tend to be pages that make a change — like making a purchase or sending an email — and that the user would want to have the most up-to-date page.
It seems this is not the case anymore.
Finally, it is described that Firefox is introducing Cache-Control: immutable to completely stop revalidation of resources:
Firefox implemented a proposal from one of our engineers to add a new cache-control header for some resources in order to tell the browser that this resource should never be revalidated. The idea behind this header is that it's an extra promise from the developer to the browser that this resource will never change during its max-age lifetime. Firefox chose to implement this directive in the form of a cache-control: immutable header.
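A response using this directive might look like the following (header values are purely illustrative):

Cache-Control: public, max-age=31536000, immutable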
Chrome appears to be ignoring your Cache-Control settings if you're reloading in the same tab. If you copy the URL to a new tab and load it there, Chrome will respect the cache control tags and reuse the contents from the cache.
As an example I had this Ruby Sinatra app:
#!/usr/bin/env ruby
require 'sinatra'

before do
  content_type :txt
end

get '/' do
  headers "Cache-Control" => "public, must-revalidate, max-age=3600",
          "Expires" => Time.at(Time.now.to_i + (60 * 60)).to_s
  "This page rendered at #{Time.now}."
end
When I continuously reloaded it in the same Chrome tab it would display the new time.
This page rendered at 2014-10-08 13:36:46 -0400.
This page rendered at 2014-10-08 13:36:48 -0400.
The headers looked like this:
< HTTP/1.1 200 OK
< Content-Type: text/plain;charset=utf-8
< Cache-Control: public, must-revalidate, max-age=3600
< Expires: 2014-10-08 13:36:46 -0400
< Content-Length: 48
< X-Content-Type-Options: nosniff
< Connection: keep-alive
* Server thin is not blacklisted
< Server: thin
However, accessing the same URL, http://localhost:4567/, from multiple new tabs would recycle the previous result from the cache.
After doing some tests with Cache-Control: max-age=xxx:
Pressing the reload button: header ignored
Entering the same URL in any tab (current or not): honored
Using JS (window.location.reload()): ignored
Using Developer Tools (with Disable cache unselected) or incognito makes no difference
So, the best option while developing is to put the cursor in the omnibox and press Enter instead of clicking the refresh button.
Note: right-clicking the refresh icon shows refresh options (Normal, Hard, Empty Cache). Incredibly, none of these changes the behavior with respect to these headers.
If Chrome Developer Tools are open (F12), Chrome usually disables caching.
It is controllable in the Developer Tools settings - the Gear icon to the right of the dev-tools top bar.
While this question is old, I wanted to add that if you are developing over https using a self-signed certificate and there is an issue with the certificate, then Chrome will not cache the response no matter what cache headers you use.
This is noted in this bug report:
https://bugs.chromium.org/p/chromium/issues/detail?id=110649
This is an addition to kievic's answer.
To force the browser NOT to send a Cache-Control header in the request, open the Chrome console and type:
location = "https://your.page.com"
To force the browser to add this header, click the "reload" button.
Quite an old question, but I noticed just recently (2020) that Chrome sometimes ignores the Cache-Control headers for my image resources when browsing in an Incognito window.
"Sometimes" because in my case the Cache-Control directive was honored for small images (~60-200 KB), but not for larger ones (10 MB).
Not using an Incognito window resulted in Chrome using the disk-cached version even for the large images.
Another tip:
Do not forget to verify the "Date" header - if the server has an incorrect date/time (or is located in another time zone), Chrome will keep requesting the resource again and again.
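To illustrate with made-up values: if the server's clock is an hour behind the client's, a response such as

Date: Mon, 06 May 2024 10:00:00 GMT
Cache-Control: max-age=3600

arrives when the client clock already reads 11:05, so the cached copy immediately appears older than its max-age and gets re-requested every time.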