The behavior of Varnish on MISS

Consider this scenario:
The Varnish cache has a MISS and the backend server is now regenerating the requested content. During the generation time a second request comes in and also gets a MISS. Does Varnish send this request to the backend while the other one is still pending? What if a thousand requests come in during that time? The server would crash, right? Every request would make it slower.
Is this correct, or does Varnish "synchronize" these scenarios to prevent such a problem?
Thank you in advance!

Varnish sends all the requests to the backend. That is, it does not queue the other requests, issue just one backend request, and use its response for all of them.
However, Varnish has a grace option that lets you keep old, expired content in the cache for these types of situations.
For example consider the following VCL:
sub vcl_recv {
    # Serve stale objects for up to 5 minutes past TTL while a healthy
    # backend regenerates them; fall back to 24 hours if the backend is down.
    if (req.backend.healthy) {
        set req.grace = 5m;
    } else {
        set req.grace = 24h;
    }
}

sub vcl_fetch {
    # Keep expired objects around for 24 hours so they are available as grace copies.
    set beresp.grace = 24h;
}
Now if a backend is healthy (see backend polling) and a request results in a MISS, the first request is sent to the backend. If another request comes in for the same content and there is an item in the cache with age < TTL+req.grace (in this case TTL plus 5 minutes), that request will be served the "stale" content instead. This continues until either the first request (the one that resulted in a MISS) gets a response from the backend (and the cache is fresh again), or the age of the item exceeds TTL+req.grace.
If the backend is down (req.backend.healthy == FALSE), stale content will be served as long as age < TTL+24h.
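(Depending on your Varnish version, you can check what Varnish currently thinks about backend/probe health from the CLI, for example:)

varnishadm backend.list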
You might also want to check out the Saving a request section of the Varnish book for a more thorough example and an exercise.

I believe Ketola's (accepted) answer is wrong.
Multiple requests to Varnish for the same URI will be queued.
Then it depends on whether the result of the first request is cacheable or not. If it is, it will be used for the other (queued) requests as well.
If not, all the other queued requests will be sent to the backend.
So if you have some slow API endpoint you want to cache, and it's cacheable (according to Varnish's rules), multiple requests will hit the backend only once for that URI.

I don't have the points or whatever to comment on #max_i's answer, so I'm submitting another answer to verify his instead.
Ketola's accepted answer isn't completely wrong; it's possibly just out of date, and may have been true for older versions of Varnish. Specifically this part:
Varnish sends all the requests to the backend. I.e. it does not queue other requests and issue just one backend request and use its response for all.
To test this independently myself, I used a standard installation of Varnish 4.1 LTS and Apache 2.4 and created a basic PHP file containing the following:
<?php sleep(5); echo 'hello world!';
I then used ab (ApacheBench) to test the HTTP request cycle with 50 requests at a concurrency of 5. The results showed that while Varnish accepted every single connection, only one request was ever made to the backend, which, as expected, took roughly 5 seconds to resolve. Each Varnish connection subsequently had to wait at least that long before receiving a response.
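(For reference, the equivalent ab invocation, assuming the test script above is reachable at /sleep.php on the Varnish host, would be something along the lines of:)

ab -n 50 -c 5 http://127.0.0.1/sleep.php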
The downside to this is of course that requests after the first one are "queued" behind it, but this is a minor concern compared to all 50 requests hitting the backend at once (or, in the case of my test, 5 at a time).

Related

Varnish Pass - request coalescing

I have a Varnish 4 setup: nginx SSL termination -> Varnish -> Varnish round-robins to 4 Apache backends.
We basically need to not cache any request where a specific cookie isn't set on the incoming request, so in my vcl_recv I have:
if (!req.http.Cookie ~ "cookiename") {
    return(pass);
}
This works fine initially, but as it is a busy site, over time (10 minutes or so) our backend failures and busy sleep/wakeup counters increase, and we get 503s from Varnish itself, even though the backends are fine and don't appear to be under any real load. This makes me think that the requests are queued and sent sequentially to the backends, skipping any request coalescing.
I can't really find anything to support this. Is this the case? Or is there a better way to do this? I would appreciate the feedback.
Thanks
Passed requests aren't request coalescing candidates. Request coalescing only applies to cacheable resources.
This means requests that go through vcl_miss and don't end up becoming Hit-For-Miss/Hit-For-Pass objects in vcl_backend_response.
Please use the following command to monitor potential HTTP 503 errors:
varnishlog -g request -q "BerespStatus == 503"
It will allow you to figure out why the error is taking place.
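To also catch 503s that Varnish synthesizes itself (for example when a backend fetch fails or no backend is healthy), it can help to filter on the status actually delivered to the client as well, along the lines of:

varnishlog -g request -q "RespStatus == 503"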

Does IIS Request Content Filtering load the full request before filtering?

I'm looking into IIS Request Filtering by content length. I've set the max allowed content length:
appcmd set config /section:requestfiltering /requestlimits.maxallowedcontentlength:30000000
My question is about when the filter will occur.
Will IIS first read ALL of the request into memory and then throw an error, or will it raise the error as soon as it reaches the threshold?
The IIS Request Filtering module is processed very early in the request pipeline. Unwanted requests are quickly discarded before proceeding to application code which is slower and has a much larger attack surface. For this reason, some have reported performance increases after implementing Request Filtering settings.
Limitations
Request Filtering Limitations include the following:
Stateless - Request Filtering has no knowledge of application or session state. Each request is processed individually regardless of whether a session has or has not been established.
Request Header Only - Request Filtering can only inspect the request header. It has no visibility into the request body or any part of the response.
Basic Logic - Regular expressions and wildcard matches are not available. Most settings consist of establishing size constraints while others perform simple string matching.
maxAllowedContentLength
Request Filtering checks the value of the Content-Length request header. If the value exceeds that which is set for maxAllowedContentLength the client will receive an HTTP 404.13.
The IIS 8.5 STIG recommends a value of 30000000 or less.
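(If you want to double-check what is currently configured on your server, the whole section, including requestLimits.maxAllowedContentLength, can be dumped with something like:)

appcmd list config /section:requestfiltering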
IISRFBaseline
The information above is based on my PowerShell module, IISRFBaseline. It helps establish an IIS Request Filtering baseline by leveraging Microsoft Log Parser to scan a website's content directory and IIS logs.
Many of the settings have a dedicated markdown file providing more information about the setting. The one for maxAllowedContentLength can be found at the following:
https://github.com/phbits/IISRFBaseline/blob/master/IISRFBaseline-maxAllowedContentLength.md
Update - regarding #johnny-5's comment
The filtering happens immediately, which makes sense because Request Filtering only has visibility into the request header. This was confirmed via the following methods:
Failed Request Tracing - the Request Filtering module responded to the request with an HTTP 413 Request entity too large.
http.sys event tracing - the request is accepted and handed off to the IIS website. Shortly thereafter is an entry showing the HTTP 413 response. The time between was not nearly long enough for the upload to complete.
Packet capture - Using Microsoft Network Monitor, the HTTP conversation shows IIS immediately responded with an HTTP 413 Request entity too large.
The part you're rightfully concerned with is that IIS still accepts the upload regardless of file size. I found the limiting factor to be connectionTimeout, which has a default of 120 seconds. If the upload "completes" before the timeout, an HTTP 413 error message is displayed. When a timeout occurs, the browser shows a connection reset, since the TCP connection is destroyed by IIS after sending a TCP ACK/RST.
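(One crude way to observe this from the client side, using a hypothetical upload URL and a file larger than maxAllowedContentLength, is to watch how much of the body curl actually manages to send before the error comes back:)

curl -s -o /dev/null -w "uploaded %{size_upload} bytes, got HTTP %{http_code}\n" --data-binary @large-file.bin http://localhost/upload.aspx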
To test this further, the timeout was increased to connectionTimeout=6000. Then a large upload was submitted, and the following IIS components were stopped one at a time. After each stop, the upload was checked via Network Monitor and confirmed to still be running.
Website
Application Pool (Stop-WebAppPool -Name AppPoolName)
World Wide Web Publishing Service (Stop-Service -Name W3SVC)
With all three stopped, I verified there was no IIS process still running, and yet bytes were still being uploaded. This leads me to conclude that the connection is maintained by http.sys. The fact that connectionTimeout is closely tied to http.sys seems to support this. I do not know whether the uploaded bytes go to a buffer or are simply discarded. The event tracing messages didn't provide anything helpful in this context.
Leaving out the Content-Length request header will result in an RFC protocol error (i.e. HTTP 400 Bad Request) generated by http.sys, since the size of the HTTP payload isn't being declared.

Amazon CloudFront timeout error

I am working on a Node project which generates data using mongodb dataset-generator, and I've added my data-generation server code to AWS Lambda, which I've exposed through AWS API Gateway.
So now the issue is that CloudFront times out the request after 30 seconds. The problem is that the computation I am doing cannot be broken into multiple API hits. Can anyone from the community help me out here, or tell me some alternative which allows me to make this request without it timing out?
I believe I originally misinterpreted the nature of the problem you are experiencing.
So now the issue is that CloudFront times out the request after 30 seconds
I assumed, since you mentioned CloudFront, that you had explicitly configured CloudFront in front of your API Gateway endpoint.
It may be true that you didn't, since API Gateway implicitly uses services from "the AWS Edge Network" (a.k.a. CloudFront) to provide a portion of its service.
My assumption was that API Gateway's "hidden" CloudFront distributions had different behavior than a standard CloudFront distribution, but apparently that is not the case to any extent that is relevant here.
In fact, API Gateway also has a 30 second response timeout, and the documented answer to "Can Be Increased?" is No. So the "CloudFront" timeout is essentially the same timeout as the one imposed by API Gateway.
This, of course, takes precedence over any longer timeout configured on your Lambda function.
There isn't a simple and obvious workaround. This seems like a task that is outside the scope of the design of API Gateway.
One option -- which I personally tend to dislike when APIs force it on me -- is to require pagination. I really hate that... just give me the data, I can handle it... but it has its practical applications. If the request is for 1000000 rows, return rows 1 through 1000 and return a next_url that will fetch rows 1001 through 2000.
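(With made-up endpoint and parameter names, that pattern looks something like this from the client's point of view:)

# first page
curl -s "https://api.example.com/rows?offset=0&limit=1000"
# the response includes a next_url pointing at the next slice, which the client follows:
curl -s "https://api.example.com/rows?offset=1000&limit=1000"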
Another option is for the initial function to submit the request to a second Lambda function, using asynchronous invocation, for processing, and return a redirect that will send the user to a new URL where the data can be fetched. Now, stick with me, because this solution sounds really horrible, but it's theoretically viable. The asynchronous function would do the work in the background and store the response in S3. The URL where the data is fetched would be a third Lambda function that would poll the key in the S3 bucket where the data is to be stored, say once per second for 20 seconds. If the file shows up, it would pre-sign a URL for that location and issue a final redirect to the browser with the signed URL as the Location. If the file does not show up, it would redirect the browser back to itself again so that polling would continue until the file shows up or the browser gets tired of the redirect loop.
Sketchy? Yes. Viable? Probably. Good idea? That's debatable... but it seems as if you are doing something that really is outside the fundamental design parameters of API Gateway, so either a fairly complex workaround is needed, or you'll want to implement this somewhere other than with API Gateway.
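(As a rough sketch of the moving parts, using hypothetical names for the worker function and the results bucket, the asynchronous hand-off and the later pickup from S3 could look roughly like this from the AWS CLI:)

# fire-and-forget invocation of the worker function; the call returns immediately
aws lambda invoke --function-name my-worker --invocation-type Event --payload '{"job":"123"}' /dev/null

# once the worker has written its output, hand the client a short-lived URL to it
aws s3 presign s3://my-results-bucket/123.json --expires-in 300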
Of course, you could write your own "API Gateway" that runs on EC2 and invokes Lambda functions directly through the Lambda API and returns results to the caller -- so Lambda still handles the work and the scaling, but you avoid the 30 second timeout. 30 seconds is a long time to wait for a web response.
I see that this is an old question, but I need to say that starting from March 2017 it is possible to change the origin response timeout and keep-alive timeout:
https://aws.amazon.com/about-aws/whats-new/2017/03/announcing-configure-read-timeout-and-keep-alive-timeout-values-for-your-amazon-cloudfront-custom-origins/
The max value for the origin response timeout is 60 seconds, but if needed AWS can increase it to 180 seconds (with a support request).

Dynamic action calls are getting through Amazon CloudFront

We have configured a CDN to speed up our website. On the website we make some AJAX calls, basically action calls, which take some time to get a response from the origin server because they run heavy queries.
A query can take more than 40-50 seconds to execute, so for most of the actions that take more than 30 seconds we get a 504 timeout error from CloudFront.
Is there any option in CloudFront to increase this limit for dynamic calls, or can we exclude these actions from CloudFront? Since these are all dynamic actions, they shouldn't be routed through the CloudFront CDN.
There is no way to set CloudFront timeouts.
A couple of methods:
Route the dynamic calls directly to your server. As you suggested, CloudFront is going to offer zero benefit for those calls, so don't use the CloudFront URLs and instead use the backend URLs.
Polling. The goal is to change your long request into lots of short ones: one call to make the request, then subsequent calls to check on the status of the job. This is clearly much more effort as it will result in some coding changes - however, at some point your jobs are going to grow to the point of timing out at the browser level as well, so it might be something to think about now. (You could also use something like WebSockets, where there is a persistent connection that you pass data on.)
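(A crude command-line sketch of that pattern, with made-up endpoints, looks like this:)

# submit the job once; the server returns a job id immediately instead of holding the connection open
JOB=$(curl -s -X POST https://api.example.com/jobs | jq -r '.id')
# poll cheaply until the job reports completion, then fetch the result
until curl -s "https://api.example.com/jobs/$JOB" | grep -q '"status":"done"'; do sleep 2; done
curl -s "https://api.example.com/jobs/$JOB/result"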

Multiple requests triggered when using a browser but not when using the Java HttpClient

Here is my application's cloud environment:
I have an ELB with sticky sessions -> 2 HAProxy instances -> 1 machine which hosts my application on JBoss.
I am processing a request which takes more than 1 minute. I am logging IP addresses at the start of processing the request.
When I process this request through a browser, I see that a duplicate request is logged after 1 minute and a few seconds. If the first request routes through HAProxy1, then the other request routes through HAProxy2. In the browser I get an HttpStatus=0 response after about 2.1 minutes.
My hypothesis is that the ELB is triggering this duplicate request.
Kindly help me to verify this hypothesis.
When I use the Apache HttpClient for the same request, I do not see a duplicate request being triggered. I also get an exception after 1 minute and a few seconds:
org.apache.http.NoHttpResponseException: The target server failed to respond
Kindly help me to understand what is happening over here.
-Thanks
By ELB I presume you are referring to Amazon AWS's Elastic Load Balancer.
Elastic Load Balancer has a built-in idle timeout of 60 seconds which, at least historically, could not be changed. The browser has retry logic, hence you're seeing two requests; your server will process them as two separate, unrelated requests, which actually makes matters worse. With HttpClient, the timeout causes the NoHttpResponseException and no retry is attempted.
The solution is to either improve the performance of your request on the server, or have the initial request fire off a background task, and then a supplemental request (possibly using AJAX) which polls for completion.
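(Note: on current Classic ELBs the idle timeout has since become configurable; if that option is available in your setup, raising it for a load balancer hypothetically named my-elb looks roughly like this:)

aws elb modify-load-balancer-attributes --load-balancer-name my-elb --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":120}}"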
