We're working on a website. Our client wants to check the website daily, but they're facing a problem: whenever we make a change to the website, they have to clear their browser cache.
So I added the following header to my server configuration:
Cache-Control: no-cache
As far as I can see, Firefox is receiving this header, and I'm pretty sure it is obeying it.
My question is: is this "Cache-Control: no-cache" guaranteed to be respected, and does it work across all browsers (including the various versions of IE)?
I find it's handy to use a "useless" version number in the requests. For example, instead of requesting script.js, request script.js?v=1.0
If you are generating your pages dynamically (PHP, etc.), you can just keep the version number in a variable and only have to update it in one place whenever you make a change. If you want the content never to be cached, just use the output of time() as your version number.
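A minimal sketch of that in PHP (the variable name and file paths are just placeholders):

    <?php
    // Keep the version number in one place and bump it whenever you deploy a change.
    $assetVersion = '1.0';
    // Or, to defeat caching entirely, use the current timestamp instead:
    // $assetVersion = time();
    ?>
    <link rel="stylesheet" href="/css/style.css?v=<?php echo urlencode($assetVersion); ?>">
    <script src="/js/script.js?v=<?php echo urlencode($assetVersion); ?>"></script>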
EDIT: have you tried asking your client to change their browser caching settings? That way you can bypass the problem entirely.
There are many occasions where my client sees a cached version of the website from the link I shared with them. After a while it becomes very frustrating to keep making sure they're not seeing a cached version of the website. Are there ways to avoid this?
I often have to ask them to clear their browser cache and point them to online how-tos on clearing their respective browser's cache.
I am testing a hack where I use a query string to produce a unique URL. It's not processed on the server; the query string is simply there so the URL is unique every time I share it. It's currently working in my browser and for the people I've shared it with, but I'm not sure if it's a bulletproof solution.
For example:
http://examplesite.com/index.php?v=25
http://examplesite.com/index.php?d=09282017
Yes, this is a generally accepted practice to force the browser to pull down the latest copy of the content.
This will not prevent the browser from caching other resources that this page requests, for example separate CSS or JavaScript files.
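If those sub-resources are a concern, the same query-string trick can be applied to them. A rough PHP sketch, assuming the files live under the document root (the paths and the helper name are illustrative):

    <?php
    // Append each file's last-modified time as the "version", so the URL
    // changes automatically whenever the file itself changes.
    function busted(string $path): string
    {
        $file = $_SERVER['DOCUMENT_ROOT'] . $path;
        $version = is_file($file) ? filemtime($file) : time();
        return $path . '?v=' . $version;
    }
    ?>
    <link rel="stylesheet" href="<?php echo busted('/css/main.css'); ?>">
    <script src="<?php echo busted('/js/app.js'); ?>"></script>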
Do you know how to change the response header in CouchDB? Right now it sends Cache-Control: must-revalidate, and I want to change it to no-cache.
I do not see any way to configure CouchDB's cache header behavior in its configuration documentation for general (built-in) API calls. Since this is not a typical need, lack of configuration for this does not surprise me.
Likewise, last I tried even show and list functions (which do give custom developer-provided functions some control over headers) do not really leave the cache headers under developer control either.
However, if you are hosting your CouchDB instance behind a reverse proxy like nginx, you could probably override the headers at that level. Another option would be the usual "cache busting" hack of adding a random query parameter in the code accessing your server. This is sometimes necessary to work around broken client cache implementations, but it is not typical.
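For the reverse-proxy route, something along these lines in the nginx config would do it (a sketch only; the location and upstream address are assumptions about your setup, 5984 being CouchDB's default port):

    location / {
        proxy_pass http://127.0.0.1:5984;
        proxy_hide_header Cache-Control;      # drop CouchDB's must-revalidate
        add_header Cache-Control "no-cache";  # send no-cache instead
    }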
But taking a step back: why do you want to make responses no-cache instead of must-revalidate? I could see perhaps occasionally wanting to override in the other direction, letting clients cache documents for a little while without having to revalidate. Not letting clients cache at all seems a little curious to me, since the built-in CouchDB behavior using revalidated ETags should not yield any incorrect data unless the client is broken.
I have a web application that has some pretty intuitive URLs, so people have written some Chrome extensions that use these URLs to make requests to our servers. Unfortunately, these extensions cause problems for us, hammering our servers, issuing malformed requests, etc., so we are trying to figure out how to block them, or at least make it difficult to craft requests to our servers, to dissuade these extensions from being used (we provide an API they should use instead).
We've tried adding some custom headers to requests and junk-json-preamble to responses, but the extension authors have updated their code to match.
I'm not familiar with chrome extensions, so what sort of access to the host page do they have? Can they call JavaScript functions on the host page? Is there a special header the browser includes to distinguish between host-page requests and extension requests? Can the host page inspect the list of extensions and deny certain ones?
Some options we've considered are:
Rate-limiting QPS per user, but the problem is that not all queries are equal, and extensions typically kick off several expensive queries that look like user-entered queries.
Restricting the amount of server time a user can use, but the problem is that users might hit this limit by just navigating around or running expensive queries several times.
Adding static custom headers/response text, but they've updated their code to mimic our code.
Figuring out some sort of token (probably cryptographic in some way) that we include in our requests and that the extension can't easily guess. We minify/obfuscate our JS, so we're OK with embedding it in the JS source code (since the variable name it would have would be hard to guess).
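Roughly what we have in mind for that last option, sketched in PHP (the secret, header name, and expiry window are placeholders, not what we actually use):

    <?php
    // Issue a short-lived HMAC token when rendering the page; the minified JS
    // sends it back on every request (e.g. in an X-App-Token header).
    $secret = 'replace-with-a-long-random-secret';   // kept on the server, never in the JS
    $issued = time();
    $token  = $issued . ':' . hash_hmac('sha256', (string) $issued, $secret);

    // Verify the token on incoming requests.
    function tokenIsValid(string $token, string $secret, int $maxAge = 300): bool
    {
        [$issued, $mac] = array_pad(explode(':', $token, 2), 2, '');
        if (!ctype_digit($issued) || time() - (int) $issued > $maxAge) {
            return false;
        }
        return hash_equals(hash_hmac('sha256', $issued, $secret), $mac);
    }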
I realize this may not be a 100% solvable problem, but we hope to either gain the upper hand in combating it or make it sufficiently hard to scrape our UI that fewer people do it.
Welp, guess nobody knows. In the end we just sent a custom header and started tracking who wasn't sending it.
I read that spammers may be downloading a specific registration page on my site using curl. Is there any way to block that specific page from being fetched with curl, either through .htaccess or other means?
I don't think it's possible to block curl, since curl can send arbitrary user agents, cookies, etc. As far as I understand, it can completely emulate a normal user.
If you are worried about protecting a form, you can generate a random token which is submitted automatically when the form is submitted. That way, anyone who tries to make a script to automate registration will have to worry about scraping it first.
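A minimal PHP sketch of that token idea (the field name is arbitrary):

    <?php
    session_start();

    // When rendering the form: generate a random token and remember it.
    $_SESSION['form_token'] = bin2hex(random_bytes(32));
    echo '<input type="hidden" name="form_token" value="' . $_SESSION['form_token'] . '">';

    // When processing the submission: reject it if the token is missing or wrong.
    $expected  = $_SESSION['form_token'] ?? '';
    $submitted = $_POST['form_token'] ?? '';
    if ($expected === '' || !hash_equals($expected, $submitted)) {
        http_response_code(403);
        exit('Invalid form submission.');
    }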
There is one weakness in curl you can exploit: it cannot run JavaScript like a browser can. So you can take advantage of this fact. On first landing on the registration page, have your server-side code check for a cookie. If it isn't there, send some JavaScript to the browser; this code will set the cookie and do a redirect/reload. After the reload, the server-side code checks for the cookie again. In the case of a browser it will find it; in the case of curl, the cookie generation and the reload/redirect won't happen in the first place.
I hope I made some sense. Bottom line: utilize JavaScript to differentiate between curl and a browser.
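A rough PHP sketch of that cookie-and-redirect dance (the cookie name is arbitrary):

    <?php
    // registration.php (hypothetical) -- gate the page behind a JS-set cookie.
    if (!isset($_COOKIE['js_check'])) {
        // No cookie yet: a plain curl request stops here, because the JavaScript
        // below never runs. A real browser sets the cookie and reloads the page.
        echo '<script>document.cookie = "js_check=1; path=/"; location.reload();</script>';
        exit;
    }

    // Cookie present: almost certainly a JavaScript-capable browser.
    // ... render the real registration form here ...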
As Oren says, spammers can forge user-agents, so you can't just block the curl user-agent string. The typical solution here is some kind of CAPTCHA. These are often jumbled images (though non-visual forms exist) that sites (including Stack Overflow) have you transcribe to prove you're human.
If I set the content expiration for static files to something like 14 days and I decide to update some files later on, will IIS know to serve the updated files or will the client have to wait until the expiration date?
Or is it the other way around where the browser requests a new file if the modified date is different?
Sometimes I update a file on the server and I have to do a hard refresh (CTRL+F5) to see the difference. Currently I have it set to expire after 1 day.
The web browser, and any intermediate proxies, are allowed to cache the page until its expiration date. This means that IIS might not even be aware of the client viewing the page.
You want ETags
An ETag is an opaque identifier assigned by a web server to a specific version of a resource found at a URL. If the resource content at that URL ever changes, a new and different ETag is assigned. Used in this manner ETags are similar to fingerprints, and they can be quickly compared to determine if two versions of a resource are the same or not. [...]
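For static files the server typically handles this for you; purely to illustrate the mechanism, here is a minimal PHP sketch of ETag revalidation (how the ETag is derived is just an example):

    <?php
    // Fingerprint the current version of the resource (any stable hash will do).
    $content = file_get_contents('data.json');      // example resource
    $etag    = '"' . md5($content) . '"';

    header('ETag: ' . $etag);

    // If the client already has this exact version, answer 304 and send no body.
    if (($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
        http_response_code(304);
        exit;
    }

    echo $content;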