I've been doing a lot of research on output caching lately and have successfully implemented output caching in IIS via web.config using either varyByQueryString or varyByHeaders.
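For reference, the kind of user-mode caching profile I mean looks roughly like this (the extension, policy and duration here are just placeholders):

    <!-- Rough illustration only; extension, policy and duration are placeholders -->
    <system.webServer>
      <caching enabled="true" enableKernelCache="false">
        <profiles>
          <add extension=".aspx"
               policy="CacheForTimePeriod"
               duration="00:01:00"
               varyByQueryString="*"
               varyByHeaders="Cookie" />
        </profiles>
      </caching>
    </system.webServer>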
However, then there's the issue of Pingdom's Performance & Real User Monitoring (or PRUM). They have a "fun" little beforeunload routine that sets a PRUM_EPISODES cookie just as you navigate away from the page, so it can time your next page load. The value of this cookie is basically a Unix timestamp, which changes every second.
As you can imagine, this completely breaks user-mode output caching, because every subsequent request arrives with a different Cookie header.
So two questions:
My first inclination is to find a way to drop the PRUM_EPISODES cookie before it reaches the server, since it serves no purpose to the actual application (this is also my informal request for a ClientOnly flag in the next HTTP version). Is anyone familiar with a technique for dropping individual cookies before they reach IIS's output caching engine, or some other technique to leverage varyByHeaders="Cookie" while ignoring PRUM_EPISODES? I haven't found such a technique for web.config yet (one rough, untested idea is sketched after these questions).
Do all monitoring systems manipulate cookies in this manner (changing them on every page request) for their tracking mechanisms, and do they not realize that doing so breaks user-mode output caching?
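For what it's worth, the rough, untested idea for the first question is a URL Rewrite rule that strips just that cookie out of the incoming Cookie header (HTTP_COOKIE would have to be unlocked in URL Rewrite's allowed server variables list, and I don't know whether this runs early enough for the output cache module to see the cleaned header):

    <!-- Untested sketch: requires the URL Rewrite module, with HTTP_COOKIE
         added to its allowed server variables list -->
    <system.webServer>
      <rewrite>
        <rules>
          <rule name="StripPrumEpisodesCookie">
            <match url=".*" />
            <conditions>
              <add input="{HTTP_COOKIE}" pattern="(.*)PRUM_EPISODES=[^;]*;?\s*(.*)" />
            </conditions>
            <serverVariables>
              <set name="HTTP_COOKIE" value="{C:1}{C:2}" />
            </serverVariables>
            <action type="None" />
          </rule>
        </rules>
      </rewrite>
    </system.webServer>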
A few things up front: I don't want to use cookies, so something like express-session is not an option.
I use Node.js with Express, no front-end JavaScript, and MySQL as the database. I don't really know how to handle sessions in this setup, so I would like to hear your opinion.
I have already tried searching online.
When dealing with regular web pages, there are only four places in a request to store information that would identify a session.
Cookie sent with each request
Custom header on each request
Query parameter with each request
In the path of the URL
You've ruled out the cookie.
The custom header could work for programmatic requests and is regularly used by JavaScript code with various types of tokens. But if you need a web browser to maintain or send the session on its own, then custom headers are out too.
That leaves query parameters or in the path of the URL. These both have the same issues. You would create a sessionID and then attach something like ?sessionID=92347987 to every single request that your web page makes to your server. There are some server-side frameworks that do sessions this way (most have been retired in favor of cookies). This has all sorts of issues (which is why it isn't used very often any more). Here are some of the downsides:
You have to dynamically generate every single link in a web page so that it includes the right sessionID; that way, when the user clicks it, the resulting HTTP request carries the right sessionID.
All browser caching has to be disabled or bypassed because you don't want the browser to use cached web pages that might contain the wrong sessionID.
User bookmarks basically don't work because they end up bookmarking a URL with a sessionID in it that won't last forever.
The user sees sessionID=xxxx in all their URLs.
Network infrastructure that logs the URLs of requests will capture the sessionID (because it's in the URL). This is considered a security risk.
All that said and with those tradeoffs, it can be made to work, but it is not considered the "safest" way to do it.
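For completeness, here is a minimal sketch of what the query-parameter variant could look like with Express (the in-memory Map stands in for whatever MySQL-backed store you would actually use; all names are illustrative):

    // Minimal sketch only: the in-memory store stands in for a real (MySQL-backed) one,
    // and every rendered link must be passed through withSession().
    const express = require('express');
    const crypto = require('crypto');

    const app = express();
    const sessions = new Map();                          // sessionID -> session data

    app.use((req, res, next) => {
      let id = req.query.sessionID;
      if (!id || !sessions.has(id)) {
        id = crypto.randomBytes(16).toString('hex');     // start a new session
        sessions.set(id, {});
      }
      req.sessionID = id;
      req.session = sessions.get(id);
      // helper used when rendering links so they carry the sessionID
      res.locals.withSession = (url) =>
        url + (url.includes('?') ? '&' : '?') + 'sessionID=' + id;
      next();
    });

    app.get('/', (req, res) => {
      req.session.views = (req.session.views || 0) + 1;
      res.send('Views this session: ' + req.session.views +
               ' <a href="' + res.locals.withSession('/about') + '">about</a>');
    });

    app.get('/about', (req, res) => res.send('about page'));

    app.listen(3000);

Note how every link has to be run through the helper, which is exactly the first downside listed above.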
What is the difference between
1) turning Glimpse off via the web.config's setting:
<glimpse defaultRuntimePolicy="Off" endpointBaseUri="~/Glimpse.axd">
2) turning it off via Glimpse.axd?
As I understand it, 1) will turn off all tracing, whereas 2) will stop the traces being returned to that particular browser session, but tracing will still be happening on the server. If that's correct, the only way to turn Glimpse off, say for a production instance, and remove any Glimpse processing overhead, would be to use 1).
Is my understanding correct?
Thanks
In case 1, the GlimpseRuntime will detect that it should not trace anything going on during any of the requests. Off is one of the Glimpse RuntimePolicy values and the most restrictive one. Keep in mind that there will still be a little bit of overhead to make that check. If you want to take Glimpse completely out of the picture, you must make sure there are no Glimpse-related assemblies in your bin folder and that the registered HttpModule and HttpHandler are removed from the config.
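For example, something along these lines in web.config (or simply delete the <add> entries the Glimpse package created; the registered name may differ, so check your own modules and handlers sections):

    <!-- Sketch only: the name must match what the Glimpse package actually registered -->
    <system.webServer>
      <modules>
        <remove name="Glimpse" />
      </modules>
      <handlers>
        <remove name="Glimpse" />
      </handlers>
    </system.webServer>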
In case 2, tracing is also prevented, but only for a particular request, which is different from case 1, where the configuration value applies to all requests.
Let me clarify that a little. The GlimpseRuntime determines a specific RuntimePolicy value for each request, and it does that based on IRuntimePolicy implementations. Glimpse comes with a couple of policies out of the box; some decide whether or not to trace a request, or whether to return the Glimpse client as part of the response. They decide based on the returned content type (you don't want the Glimpse panel returned when an image is requested, for instance), the status code, the URI used, and so on. One of those policies is the ControlCookiePolicy, which effectively checks whether a specific Glimpse cookie is part of the request; if it is not, tracing is disabled completely for that particular request. When you go to the Glimpse.axd page and turn Glimpse on or off, you are basically creating or deleting that cookie.
So in case 1 no tracing is done at all, while in case 2 tracing can be done for request A if the cookie has been set, but disabled for request B if the cookie is no longer there.
It is possible to ignore this ControlCookiePolicy and to create your own policies to determine whether the Glimpse client should be returned, whether tracing should be done, and so on.
I'm using Varnish in front of the backend.
Because the backend is sometimes very slow, I've enabled grace mode to serve stale content to clients. However, even with grace mode, one user still has to go to the backend and gets a very bad experience.
Is it possible with Varnish to serve stale content to ALL users while the cache is being refreshed?
I've seen some people suggest using a cron job or script to refresh the cache on localhost. This is not an elegant solution, because there are so many URLs on our site that it would be very difficult to refresh each of them manually.
I know the underlying problem is the backend and we need to fix it there. But in the short term, I'm wondering if I can improve response times at the Varnish layer?
You can do this (in the average case) in Varnish 3 by using restarts and a helper process.
How you'd write a VCL for it is described here (disclosure: my own blog): http://lassekarstensen.wordpress.com/2012/10/11/varnish-trick-serve-stale-content-while-refetching/
It is fairly convoluted, but works when you have an existing object that just expired.
In (future) Varnish 4 there will be additional VCL hooks that will make such tricks easier.
Yes, it is possible to serve stale content to all users (during a specified amount of time). You should experiment with grace mode and saint mode to set appropriate time limits that suit your application.
Read more here: https://www.varnish-cache.org/docs/3.0/tutorial/handling_misbehaving_servers.html
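The VCL in that tutorial boils down to something like this (adapt the grace and saint periods to your application):

    # Along the lines of the linked tutorial; tune the grace/saint periods and TTLs.
    sub vcl_recv {
        if (req.backend.healthy) {
            set req.grace = 30s;     # backend healthy: accept only slightly stale objects
        } else {
            set req.grace = 1h;      # backend down: keep serving stale content for longer
        }
    }

    sub vcl_fetch {
        if (beresp.status == 500) {
            set beresp.saintmode = 10s;   # saint mode: back off from this object briefly
            return (restart);
        }
        set beresp.grace = 1h;            # keep objects in cache past their TTL
    }

Note that with plain grace in Varnish 3, a single client can still end up waiting for the backend fetch; the restart trick in the other answer is one way around that.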
We are about to build another server for XPages applications. In front of them there will be a failover/load-balancing component (Microsoft Forefront, IBM Web server) that will redirect HTTP requests to one of the two clustered servers.
I suppose scoped variables will be reinitialized in case of failover: the user is redirected to the other server, which will initialize the XPage from scratch (GET) or from a subset of data (POST). Anything bound to beans or scoped variables will be lost (pager state, application-specific data). This can cause odd behaviour for users: loss of entered data or the opening of an unexpected page. I am aware that this depends heavily on application design.
The situation is very similar to an expired session on a single server: how do you prevent loss of data in such a case?
Are there any coding best practices for avoiding the side effects of failover from one server to another?
While not a coding best practice as such, you first need to configure your load balancer to keep users on the same server once a session has started (probably using a cookie), so failover only happens when your box really goes down.
Secondly, don't assume scoped variables are there; always test for them. That is good practice anyway, since a session can time out and lose its variables on a single server too.
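For example, in SSJS (the variable name here is just an illustration):

    // SSJS: never assume the scoped variable survived a failover or session timeout
    if (viewScope.pagerState == null) {
        viewScope.pagerState = 1;   // re-initialize with a sensible default
    }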
A POST will fail due to the lack of an x-session, so you might resort to posting only via Ajax, which can have an error handler.
You could consider using cookies to capture state information.
I have a web application with some pretty intuitive URLs, so people have written Chrome extensions that use these URLs to make requests to our servers. Unfortunately, these extensions cause problems for us, hammering our servers, issuing malformed requests, etc., so we are trying to figure out how to block them, or at least make it difficult enough to craft requests to our servers that people stop using these extensions (we provide an API they should use instead).
We've tried adding some custom headers to requests and junk-json-preamble to responses, but the extension authors have updated their code to match.
I'm not familiar with Chrome extensions, so what sort of access to the host page do they have? Can they call JavaScript functions on the host page? Is there a special header the browser includes to distinguish host-page requests from extension requests? Can the host page inspect the list of extensions and deny certain ones?
Some options we've considered are:
Rate-limiting QPS per user, but the problem is that not all queries are equal, and extensions typically kick off several expensive queries that look like user-entered queries.
Restricting the amount of server time a user can use, but the problem is that users might hit this limit by just navigating around or running expensive queries several times.
Adding static custom headers/response text, but they've updated their code to mimic our code.
Figuring out some sort of token (probably cryptographic in some way) we include in our requests that the extension can't easily guess. We minify/obfuscate our JS, so we are OK with embedding it in the JS source code (since the variable name it would have would be hard to guess). A rough sketch of this idea is below.
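For illustration, one shape such a token could take (this assumes a Node backend purely for the example; the header name, secret handling, and lifetime are all placeholders):

    // Illustrative sketch only: assumes a Node backend; names and lifetimes are placeholders.
    // The server embeds a short-lived token in the (minified) page JS; legitimate UI
    // requests echo it back (e.g. in an X-UI-Token header) and the server verifies it.
    const crypto = require('crypto');
    const SECRET = process.env.UI_TOKEN_SECRET || 'dev-only-secret';  // server-side secret

    function issueToken(userId) {
      const expires = Date.now() + 10 * 60 * 1000;          // ~10 minute lifetime
      const sig = crypto.createHmac('sha256', SECRET)
                        .update(userId + '.' + expires)
                        .digest('hex');
      return userId + '.' + expires + '.' + sig;
    }

    function verifyToken(token) {
      const parts = String(token).split('.');
      if (parts.length !== 3) return false;
      const [userId, expires, sig] = parts;
      if (!userId || Date.now() > Number(expires)) return false;
      const expected = crypto.createHmac('sha256', SECRET)
                             .update(userId + '.' + expires)
                             .digest('hex');
      return sig.length === expected.length &&
             crypto.timingSafeEqual(Buffer.from(sig), Buffer.from(expected));
    }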
I realize this may not be a 100% solvable problem, but we hope either to gain an upper hand in combating it, or to make it sufficiently hard to scrape our UI that fewer people do it.
Welp, guess nobody knows. In the end we just sent a custom header and started tracking who wasn't sending it.