Does the browser always request a cached file again on each page load (e.g., a CSS style sheet or .js JavaScript file that has been sent previously)?
I'm not sure, but I think the answer is "no, it does not".
But then why does the Apache log show that the cached file was requested again?
What is the default behavior?
It really depends on how the page is coded; for example, one can force the web browser to request a script from the web server rather than use its cached copy. So, in short, the browser does not "always" request scripts from the server; most of the time it uses the cached copy. Note also that even when the cached copy is used, the browser often first sends a conditional request (If-Modified-Since / If-None-Match) to check whether the file has changed; the server answers 304 Not Modified, and that request still shows up in the Apache access log, which is probably what you are seeing.
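And here is one common way to force a fresh request, as a minimal sketch (the /js/app.js path is just a placeholder):

// Appending an ever-changing query string makes the URL unique on every
// page load, so the browser cannot reuse its cached copy and has to ask
// the web server for the script again.
var script = document.createElement('script');
script.src = '/js/app.js?nocache=' + Date.now();
document.head.appendChild(script);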
I am developing on a server which was initially running over plain HTTP. After switching to HTTPS, changes made to a JavaScript file no longer show up. I've made sure the file was in fact saved properly; I uploaded and re-downloaded the file to confirm that the changes to the code were really there, and they were.
Here is my question: why doesn't the site pick up the changes I made to the file over HTTPS, while as soon as I use HTTP the changes are displayed?
Your JavaScript source code appears to attempt to POST to an HTTP URL while you are using HTTPS. Most modern browsers block this as insecure mixed content. If your POST URL supports HTTPS, change it to https:// and you should see it work.
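As a rough illustration (the endpoint here is made up, and I'm assuming an XMLHttpRequest-style POST since you haven't shown your code):

// If the page is served over https://, the POST target must be https:// too,
// otherwise the browser blocks the request as mixed content.
var xhr = new XMLHttpRequest();
xhr.open('POST', 'https://myserver.example/api/save'); // was 'http://...' before
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.send(JSON.stringify({ changed: true }));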
I'm running an application on Express and my browser keeps fetching files that should've already been cached. The status code for the offending files is 304 and the size is consistently 220 B / 221 B. Other resources (that are getting served properly) are showing '(from cache)'.
A bit more information: the ETags / file contents haven't changed and I've set some response headers.
res.set('Cache-Control', 'max-age=345600'); // allow reuse without revalidation for 4 days (345,600 seconds)
res.set('Expires', new Date(Date.now() + 345600000).toUTCString()); // the same 4-day horizon, expressed as an absolute date for older caches
Admittedly, I'm no HTTP expert, but maybe someone can help me understand why this might be happening?
Essentially, the browser IS caching and serving the cached bundles (although it doesn't display the "from cache" message). In order to serve them, it sends a conditional request to the server to check whether the file has changed. If it hasn't, the server sends a 304 Not Modified response and the browser pulls the file from its cache. This takes about 15-50 ms, so it's not a substantial performance impact.
However, I CAN force the browser to use the file without sending a verification request at all (like externally hosted libraries, for example). That would require setting Expires/Cache-Control headers far in the future, time-stamping the filenames of static assets and serving them dynamically (by maybe writing the updated filenames to a configuration file or something like that), but honestly I think this would be more trouble than it's worth.
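For anyone who does want to go that route, here is a minimal sketch of the idea with Express (the paths, the hashed bundle name and the maxAge value are placeholders I made up, not something from my actual app):

const express = require('express');
const app = express();

// Content-hashed / time-stamped assets (e.g. bundle.3f9a2c.js) never change,
// so they can be cached for a long time and the browser will not even send
// the 304 verification request within that period.
app.use('/static', express.static('public', {
  maxAge: '365d',
  immutable: true   // adds "immutable" to Cache-Control (newer Express/serve-static versions)
}));

// The HTML that references those hashed filenames should still be
// revalidated, so a new deploy is picked up right away.
app.get('/', function (req, res) {
  res.set('Cache-Control', 'no-cache');
  res.sendFile(__dirname + '/views/index.html');
});

app.listen(3000);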
Just posting this response for anyone who runs into the same issue.
I'm running Varnish on a dedicated server. When I load a page, it is delivered via Apache, and on the second and subsequent hits it is delivered via Varnish Cache (i.e. I can see two timestamps in the X-Varnish header).
But when I open up the same page from some other computer, it's again delivered from the backend (Apache) the first time, and only on further reloads does it come from Varnish.
If a page is already in the Varnish cache, isn't it supposed to be delivered via Varnish even on the first visit from a new computer? I've tried a simple hello-world PHP file without any database calls, with the same effect. Might something be wrong with my VCL file, or does Varnish just work this way?
Check whether you are sending session data (cookies), which makes the requests look unique to Varnish. The docs show you how to strip cookies.
Jon is right. I had a similar problem. You also need to clear your cookies and cache before testing. Check the response headers of the first visit: if the server tries to set a cookie, you can unset beresp.http.Set-Cookie under vcl_fetch.
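For reference, a minimal sketch of what that looks like in VCL (Varnish 3 syntax, to match the vcl_fetch mentioned above; in Varnish 4+ the routine is called vcl_backend_response). Only do this for pages that really are the same for every visitor:

sub vcl_recv {
    # Drop the client's cookies so identical pages hash to the same cache object.
    unset req.http.Cookie;
}

sub vcl_fetch {
    # Drop the backend's Set-Cookie so Varnish is willing to cache the response.
    unset beresp.http.Set-Cookie;
}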
I'm using Varnish without touching any configuration (just the port forwarding to Apache on 8080).
But I got two issues:
I visit the URL of an image, I delete the image, and when I visit it again it still exists … Varnish cached it … how can I tell Varnish to first check whether the file at least exists before serving it from its cache?
The PHP pages are not being cached (I mean, the HTML content generated by the PHP). I always see Age: 0 in the headers … any clue?
Thank you!
I visit the URL of an image, I delete the image, and when I visit it again it still exists … Varnish cached it … how can I tell Varnish to first check whether the file at least exists before serving it from its cache?
Eh, the whole purpose of caching is not having to do the same work (like checking for existence and loading a file, or generating a PHP response) over and over again, but to reuse the generated response. Varnish never knew about the existence of the file to begin with (your backend server did that work), so it can never check whether 'the file at least exists'.
There are, however, ways to instruct Varnish not to cache URLs forever. For instance, if your backend response instructs any cache not to reuse the result (certain HTTP response headers indicate this), Varnish will not cache it. Varnish is also smart enough (by default) not to cache responses with cookies (which probably answers your second question). You can tell Varnish to cache a response only for a certain period (like 30 seconds), so your deletes will be picked up pretty quickly. You could also PURGE URLs from Varnish after you change or delete a file. If your backend server does not communicate this correctly with its response headers, you can override the behaviour by writing your own .vcl file.
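To illustrate the "let the backend decide" part, these are the kinds of response headers that make Varnish cache a page for a limited time. I'm showing them in a small Node/Express handler purely for illustration (your backend is PHP behind Apache; the same header strings can be sent with PHP's header() function), so treat this as a sketch of the idea, not your setup:

const express = require('express');
const app = express();

app.get('/page', function (req, res) {
  // s-maxage is aimed at shared caches like Varnish: cache for 30 seconds,
  // then fetch a fresh copy from the backend, so deletes and changes show up quickly.
  res.set('Cache-Control', 'public, s-maxage=30');
  // Deliberately no Set-Cookie here: by default Varnish refuses to cache
  // a response that sets a cookie, which commonly results in Age: 0.
  res.send('generated page content');
});

app.listen(8080); // placeholder for whatever backend port Varnish forwards to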
The PHP pages are not being cached (I mean, the HTML content generated by the PHP). I always see Age: 0 in the headers … any clue?
I can guess: you're setting cookies. But it would really help if you added the response headers to your question.
In Firefox or Chrome I'd like to prevent a private web page from making outgoing connections. That is, if the URL in a browser tab starts with http://myprivatewebpage/ or https://myprivatewebpage/, then that tab must be restricted so that it is allowed to load images, CSS, fonts, JavaScript, XMLHttpRequests, Java applets, Flash animations and all other resources only from http://myprivatewebpage/ or https://myprivatewebpage/. For example, an <img src="http://www.google.com/images/logos/ps_logo.png"> (or the corresponding new Image(...) created from a <script>) must not be able to load that image, because it's not on myprivatewebpage. I need a 100% foolproof solution: not even a single resource outside myprivatewebpage may be accessible, not even with low probability. There must be no resource-loading restrictions on web pages other than myprivatewebpage, e.g. http://otherwebpage/ must be able to load images from google.com.
Please note that I assume the users of myprivatewebpage are willing to cooperate to keep the web page private, unless it's too much work for them. For example, they would be happy to install a Chrome or Firefox extension once, and they wouldn't be offended by an error message stating that access to myprivatewebpage is denied until they install the extension in a supported browser.
The reason I need this restriction is to keep myprivatewebpage really private, without exposing any information about its use to the webmasters of other web pages. If http://www.google.com/images/logos/ps_logo.png were allowed, then each use of myprivatewebpage would be logged in the access.log of the server hosting Google's ps_logo.png, so Google's webmasters would have some information about how myprivatewebpage is used, and I don't want that. (In this question I'm not interested in whether the restriction is reasonable; I'm only interested in the technical solutions and their strengths and weaknesses.)
My ideas how to implement the restriction:
Don't impose any restrictions and just rely on the same-origin policy. (This doesn't provide the necessary protection; the same-origin policy lets all images pass through.)
Change the web application on the server so it generates HTML, JavaScript, Java applets, Flash animations etc. which never attempt to load anything outside myprivatewebpage. (This is almost impossible to make foolproof everywhere in a complicated web application, especially with user-generated content.)
Over-sanitize the web page using an HTML output filter on the server, i.e. remove all <script>, <embed> and <object> tags, restrict the target of <img src=, <link rel=, <form action= etc. and also restrict the links in the CSS files. (This can prevent all unwanted resources if I remember every HTML tag properly, e.g. I mustn't forget about <video>. But it is too restrictive: it removes all dynamic web page functionality like JavaScript, Java applets and Flash animations, and without these most web applications are useless.)
Sanitize the web page, i.e. add an HTML output filter to the webserver which removes all offending URLs from the generated HTML. (This is not foolproof, because tricky JavaScript can generate a disallowed URL at run time. It also doesn't protect against URLs loaded by Java applets and Flash animations.)
Install an HTTP proxy which blocks requests based on the URL and the HTTP Referer, and force all browser traffic (including myprivatewebpage, otherwebpage and google.com) through that proxy. (This would slow down traffic to sites other than myprivatewebpage, and it may not protect properly if XMLHttpRequests, Java applets or Flash animations can forge the HTTP Referer.)
Find or write a Firefox or Chrome extension which intercepts all outgoing connections, and blocks them based on the URL of the tab and the target URL of the connection. I've found https://developer.mozilla.org/en/Setting_HTTP_request_headers and thinkahead.js in https://addons.mozilla.org/en-US/firefox/addon/thinkahead/ and http://thinkahead.mozdev.org/ . Am I correct that it's possible to write a Firefox extension using that? Is there such a Firefox extension already?
Some links I've found for the Chrome extension:
http://www.chromium.org/developers/design-documents/extensions/notifications-of-web-request-and-navigation
https://groups.google.com/a/chromium.org/group/chromium-extensions/browse_thread/thread/90645ce11e1b3d86?pli=1
http://code.google.com/chrome/extensions/trunk/experimental.webRequest.html
As far as I can see, only the Firefox or Chrome extension approach from the list above is feasible. Do you have any other suggestions? Do you have any pointers on how to write or where to find such an extension?
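For reference, here is my rough, untested guess at what the Chrome side could look like with the (experimental) webRequest API from the links above; the permission names and event fields are assumptions on my part, not something I have verified:

// Background page of the extension; the manifest would need the
// "webRequest", "webRequestBlocking", "tabs" and "<all_urls>" permissions.
var PRIVATE_PREFIXES = ['http://myprivatewebpage/', 'https://myprivatewebpage/'];

function isPrivate(url) {
  return PRIVATE_PREFIXES.some(function (p) { return url.indexOf(p) === 0; });
}

// Remember which URL each tab is currently showing.
var tabUrls = {};
chrome.tabs.onUpdated.addListener(function (tabId, changeInfo, tab) {
  tabUrls[tabId] = tab.url;
});

chrome.webRequest.onBeforeRequest.addListener(function (details) {
  var tabUrl = tabUrls[details.tabId];
  // If the request comes from a tab showing myprivatewebpage and the target
  // is anywhere else, cancel it before any connection is made.
  if (tabUrl && isPrivate(tabUrl) && !isPrivate(details.url)) {
    return { cancel: true };
  }
  return {};
}, { urls: ['<all_urls>'] }, ['blocking']);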
I've found https://developer.mozilla.org/en/Setting_HTTP_request_headers and thinkahead.js in https://addons.mozilla.org/en-US/firefox/addon/thinkahead/ and http://thinkahead.mozdev.org/ . Am I correct that it's possible to write a Firefox extension using that? Is there such a Firefox extension already?
I am the author of the latter extension, though I have yet to update it to support newer versions of Firefox. My initial guess is that, yes, it will do what you want:
The user visits your web page without the plugin. The web page contains a ThinkAhead block that would send a simple version header to the server, but this is ignored because the plugin is not installed.
Since the server does not see that header, it redirects the client to a page to install the plugin.
The user installs the plugin.
The user visits the web page with the plugin. The page sends the version header to the server, so the server allows access.
The ThinkAhead block matches all pages that are not myprivatewebpage, and does something like setting the HTTP status to 403 Forbidden. Thus:
When the user visits any webpage that is in myprivatewebpage, there is normal behaviour.
When the user visits any webpage outside of myprivatewebpage, access is denied.
If you want to catch bad requests earlier, instead of modifying incoming headers, you could modify outgoing headers, perhaps screwing up "If-Match" or "Accept" so that the request is never honoured.
This solution is extremely lightweight, but might not be strong enough for your concerns. This depends on what you want to protect: given the above, the client would not be able to see blocked content, but external "blocked" hosts might still notice that a request has been sent, and might be able to gather information from the request URL.