I am experiencing Varnish (6.4) crashing very regularly once about 5K items are in the cache.
The problem is that I don't see any MAIN.n_lru_nuked entries in varnishstat.
Does that mean that no eviction is taking place?
We have set the storage to malloc with 5g. Varnish is running in a Docker container with 10g of memory allocated to it.
varnishd -F -f /etc/varnish/default.vcl -a http=:80,HTTP -a proxy=:8443,PROXY -s malloc,5g
Here is the VCL:
vcl 4.0;

import directors;

backend back1 {
    .host = "xxx.xx.xx.xx";
    .port = "80";
    .connect_timeout = 600s;
    .first_byte_timeout = 600s;
    .between_bytes_timeout = 600s;
}

acl purge {
    "localhost";
    #back1 1
    "xxx.xx.xx.xx";
}

sub vcl_init {
    new loadbalancer = directors.round_robin();
    loadbalancer.add_backend(back1);
}

sub vcl_backend_response {
    set beresp.grace = 30s;
    if (bereq.url ~ "assets") {
        unset beresp.http.set-cookie;
        set beresp.http.cache-control = "public, max-age=120";
        set beresp.ttl = 2h;
        return (deliver);
    }
    # Default: any other content is cached for 2 hours in Varnish and 120s
    # in the browser, except for the admin area backend.
    if (!(bereq.url ~ "adminarea")) {
        unset beresp.http.set-cookie;
        set beresp.http.cache-control = "public, max-age=120";
        set beresp.ttl = 2h;
        return (deliver);
    }
}

sub vcl_deliver {
    # Dynamically set the Expires header on every request from the web.
    set resp.http.Expires = "" + (now + 120s);
    # Delete internal headers from the response.
    unset resp.http.via;
    unset resp.http.x-powered-by;
    # unset resp.http.server;
    # unset resp.http.x-varnish;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ purge) {
            return (synth(403, "Not allowed."));
        }
        ban("obj.http.Pid == " + req.http.Varnish-Ban-Pid);
        # Throw a synthetic page so the
        # request won't go to the backend.
        return (synth(200, "Banned pid " + req.http.Varnish-Ban-Pid));
    }
    # Enable caching only for GET/HEAD methods.
    if (req.method != "GET" && req.method != "HEAD") {
        set req.http.X-Varnish-Pass = "y";
        return (pass);
    }
    # Do not cache multimedia.
    if (req.url ~ "\.(mp3|mp4|flv)$") {
        return (pass);
    }
    # Do not check the cache for TYPO3 backend and AJAX requests.
    if (req.url ~ "^/adminarea/") {
        set req.http.X-Varnish-Pass = "y";
        return (pass);
    }
    # Normalize Accept-Language to limit cache variations.
    if (req.http.Accept-Language) {
        if (req.http.Accept-Language ~ "^fr") {
            set req.http.Accept-Language = "fr";
        } elsif (req.http.Accept-Language ~ "^es") {
            set req.http.Accept-Language = "es";
        } elsif (req.http.Accept-Language ~ "^en") {
            set req.http.Accept-Language = "en";
        } else {
            set req.http.Accept-Language = "fr";
        }
    }
    # Force gzip compression if the client allows compression of any kind.
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } else {
            unset req.http.Accept-Encoding;
        }
    }
    # Update the X-Forwarded-For header by appending the client IP address.
    if (req.http.X-Forwarded-For) {
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
    } else {
        set req.http.X-Forwarded-For = client.ip;
    }
    # Tell Varnish to cache anything stored in /fileadmin /assets /Resources
    # (ignoring web server cache-control header directives).
    if (req.url ~ "assets") {
        return (hash);
    }
    # Tell Varnish to always cache the calendar.
    if (req.url ~ "calendar") {
        return (hash);
    }
    if (!(req.url ~ "adminarea")) {
        return (hash);
    }
    set req.http.X-Varnish-Pass = "y";
    return (pass);
}
DISCLAIMER: This is just a working theory, I cannot prove this
Theory: transient storage makes container go out of memory
I notice that over time 17.37G has been allocated to the Transient storage. Your stats show that this number has been freed as well.
Transient storage consumes memory that is not contained within the -s malloc,5g.
You say that your container has 10G allocated to it, so that means if the transient storage reaches 5G at some point, your container might crash.
What goes into transient?
As the name indicates, transient is temporary storage. This type of storage is used for:
Short-lived objects (objects with a TTL lower than the shortlived runtime parameter that defaults to 10 seconds)
Non-cacheable objects that are in-flight
Request bodies
Transient is primarily used to store items that aren't going to be in regular memory for long.
Even non-cacheable objects are temporarily put in transient, because you don't want fast backends to be blocked by slow clients. This means the backend streams the response to transient and can handle other tasks, while the client can pick this response up at its own convenience.
What happened in your case?
Does your Varnish container process large files, such as video or audio? Even if they are not cached, they need to be kept in transient while in flight.
Again, it's just a theory, no way to prove this. But if you can reproduce the problem, please check the transient varnishstat counters.
If you see the SMA.Transient.g_bytes increasing, you know that transient is the reason for the crash.
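To test this theory, watching the transient counters while the problem builds up is cheap; and if transient growth turns out to be the cause, one possible mitigation (a sketch, not a definitive fix) is to give Transient an explicit cap. Note the trade-off: a full, capped Transient can make uncacheable responses fail under pressure. The 512m size below is an assumption to adapt to your workload.

```shell
# Watch transient memory usage live (g_bytes should stay bounded)
varnishstat -f SMA.Transient.g_bytes

# Or take a one-shot snapshot of all transient counters while reproducing the crash
varnishstat -1 -f SMA.Transient.*

# Possible mitigation: cap Transient explicitly alongside the main storage
varnishd -F -f /etc/varnish/default.vcl \
    -a http=:80,HTTP -a proxy=:8443,PROXY \
    -s malloc,5g -s Transient=malloc,512m
```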
I'm running multiple Varnish cache servers on top of each other. I want to "combine" the headers of each of them, that is, when I make a request to my website, I can see which cache server it had a hit on. Right now, both of the cache servers have this code:
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
On my second cache server, I'd like to have something like this:
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT, " + responsefromfirst;
    } else {
        set resp.http.X-Cache = "MISS, " + responsefromfirst;
    }
}
With responsefromfirst being the "X-Cache" header from the previous cache. How can I do this?
How about:
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT, " + resp.http.X-Cache;
    } else {
        set resp.http.X-Cache = "MISS, " + resp.http.X-Cache;
    }
}
You really just want to prepend information to the header that is already there.
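One edge case worth guarding against (a sketch, under the assumption that the second tier may also be hit directly, in which case no upstream X-Cache header exists): only append the upstream value when it is actually present, to avoid a dangling ", ".

```vcl
sub vcl_deliver {
    # At this point resp.http.X-Cache still holds whatever the first-tier
    # Varnish set (if anything), so prepend this tier's verdict to it.
    if (resp.http.X-Cache) {
        if (obj.hits > 0) {
            set resp.http.X-Cache = "HIT, " + resp.http.X-Cache;
        } else {
            set resp.http.X-Cache = "MISS, " + resp.http.X-Cache;
        }
    } else {
        if (obj.hits > 0) {
            set resp.http.X-Cache = "HIT";
        } else {
            set resp.http.X-Cache = "MISS";
        }
    }
}
```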
I'm using Varnish 3 behind an nginx to proxy multiple sites into one domain.
The basic setup works fine, but I now have a problem with Varnish serving the wrong files if the filename already exists in its cache.
Basically all I do in my default.vcl is this:
if (req.url ~ "^/foo1") {
    set req.backend = foo1;
    set req.url = regsub(req.url, "^/foo1/", "/");
}
else if (req.url ~ "^/foo2") {
    set req.backend = foo2;
    set req.url = regsub(req.url, "^/foo2/", "/");
}
If I now call /foo1/index.html, /foo2/index.html will serve the same file. After a restart of varnish and a call of /foo2/index.html, /foo1/index.html will serve foo2's index.html.
As far as I found out this is an issue with the creation of the hash which does not respect the used backend but only the url (after shortening) and the domain:
11 VCL_call c hash
11 Hash c /index.html
11 Hash c mydomain
I solved this issue for now by altering my vcl_hash to also use the backend but I'm sure there must be a better, more convenient way:
sub vcl_hash {
    hash_data(req.url);
    hash_data(req.backend);
}
Any hint would be appreciated, thank you very much!
You have two different ways of doing this. First one, is to do what you suggested by adding extra values (e.g. req.backend) in vcl_hash.
sub vcl_hash {
    hash_data(req.url);
    hash_data(req.backend);
}
Second way, is to not update req in vcl_recv, but only bereq in vcl_miss/pass.
sub vcl_urlrewrite {
    if (req.url ~ "^/foo1") {
        set bereq.url = regsub(req.url, "^/foo1/", "/");
    }
    else if (req.url ~ "^/foo2") {
        set bereq.url = regsub(req.url, "^/foo2/", "/");
    }
}

sub vcl_miss {
    call vcl_urlrewrite;
}

sub vcl_pass {
    call vcl_urlrewrite;
}

sub vcl_pipe {
    call vcl_urlrewrite;
}
This second approach requires more VCL but it comes with advantages as well. For example, when analyzing logs with varnishlog, you can see the vanilla request (c column), and also the updated backend request (b column).
$ varnishlog /any-options-here/
(..)
xx RxURL c /foo1/index.html
(..)
xx TxURL c /index.html
(..)
$
In my Varnish 2 setup I have a purging/banning block like so:
acl purge {
    "localhost";
    "x.x.x.x"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
    if (req.request == "BAN") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url ~ " + req.url);
        # Throw a synthetic page so the
        # request won't go to the backend.
        error 200 "Ban added";
    }
}
I'd expect that I could simply replace the client.ip in the if-statements with req.http.x-forwarded-for, but when I do, the following compile error occurs:
Message from VCC-compiler:
Expected CSTR got 'purge'
(program line 944), at
('purging-banning.vcl' Line 16 Pos 41)
if (!req.http.x-forwarded-for ~ purge) {
----------------------------------------#####----
Running VCC-compiler failed, exit 1
VCL compilation failed
I have been searching Google and StackOverflow, but I haven't found a good solution to my problem yet, or the reason why req.http.x-forwarded-for wouldn't be in the right place here.
Who can help?
Try using std.ip() from vmod_std. See: https://varnish-cache.org/docs/trunk/reference/vmod_std.generated.html#func-ip
Like so:
if (std.ip(req.http.x-forwarded-for, "0.0.0.0") !~ purge) {
    error 405 "Not allowed.";
}
This simply converts a string object to an IP object. Then, the IP object can be compared to the IP acl lists.
I don't have the 'rep' to comment, so I'm trying to affirm the answer using std.ip. I have the identical situation, and using std.ip fixed it. Remember to add import std in your default.vcl.
Also, in my case, nginx was forwarding to varnish, and X-Forwarded-For sometimes had 2 IPs in it, so I used X-Real-IP, which was set to $remote_addr in my nginx forwarding config.
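Putting the pieces together, a minimal sketch of the relevant part of default.vcl in the question's Varnish 2/3 syntax (the 0.0.0.0 fallback is what std.ip returns when the header does not parse as an IP, so unparseable headers end up denied):

```vcl
import std;

acl purge {
    "localhost";
    "x.x.x.x"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        # Compare the forwarded client address rather than the proxy's IP.
        if (std.ip(req.http.x-forwarded-for, "0.0.0.0") !~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}
```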
I was trying to think of a way to help minimize the damage to my Node.js application if I ever get hit by a DDoS attack. I want to limit requests per IP to so many requests per time window. For example: no IP address may exceed 10 requests every 3 seconds.
So far I have come up with this:
http.createServer(function(req, res) {
    if (req.connection.remoteAddress ??????) {
        // block ip for 15 mins
    }
});
If you want to build this yourself at the app server level, you will have to build a data structure that records each recent access from a particular IP address so that when a new request arrives, you can look back through the history and see if it has been doing too many requests. If so, deny it any further data. And, to keep this data from piling up in your server, you'd also need some sort of cleanup code that gets rid of old access data.
Here's an idea for a way to do that (untested code to illustrate the idea):
function AccessLogger(n, t, blockTime) {
    this.qty = n;
    this.time = t;
    this.blockTime = blockTime;
    this.requests = {};
    // schedule cleanup on a regular interval (every 30 minutes)
    this.interval = setInterval(this.age.bind(this), 30 * 60 * 1000);
}

AccessLogger.prototype = {
    check: function(ip) {
        var info, accessTimes, now, limit, cnt;
        // add this access
        this.add(ip);
        // should always be an info here because we just added it
        info = this.requests[ip];
        accessTimes = info.accessTimes;
        // calc time limits
        now = Date.now();
        limit = now - this.time;
        // short circuit if already blocking this ip
        if (info.blockUntil >= now) {
            return false;
        }
        // short circuit an access that has not even had max qty accesses yet
        if (accessTimes.length < this.qty) {
            return true;
        }
        cnt = 0;
        for (var i = accessTimes.length - 1; i >= 0; i--) {
            if (accessTimes[i] > limit) {
                ++cnt;
            } else {
                // assumes cnts are in time order so no need to look any more
                break;
            }
        }
        if (cnt > this.qty) {
            // block from now until now + this.blockTime
            info.blockUntil = now + this.blockTime;
            return false;
        } else {
            return true;
        }
    },
    add: function(ip) {
        var info = this.requests[ip];
        if (!info) {
            info = {accessTimes: [], blockUntil: 0};
            this.requests[ip] = info;
        }
        // push this access time into the access array for this IP
        info.accessTimes.push(Date.now());
    },
    age: function() {
        // clean up any accesses that have not been here within this.time
        // and are not currently blocked
        var ip, info, accessTimes, now = Date.now(), limit = now - this.time, index;
        for (ip in this.requests) {
            if (this.requests.hasOwnProperty(ip)) {
                info = this.requests[ip];
                accessTimes = info.accessTimes;
                // if not currently blocking this one
                if (info.blockUntil < now) {
                    // if newest access is older than time limit, then nuke the whole item
                    if (!accessTimes.length || accessTimes[accessTimes.length - 1] < limit) {
                        delete this.requests[ip];
                    } else {
                        // in case an ip is regularly visiting so its recent access is
                        // never old, we must age out older access times to keep them
                        // from accumulating forever
                        if (accessTimes.length > (this.qty * 2) && accessTimes[0] < limit) {
                            index = 0;
                            for (var i = 1; i < accessTimes.length; i++) {
                                if (accessTimes[i] < limit) {
                                    index = i;
                                } else {
                                    break;
                                }
                            }
                            // remove index + 1 old access times from the front of the array
                            accessTimes.splice(0, index + 1);
                        }
                    }
                }
            }
        }
    }
};
var accesses = new AccessLogger(10, 3000, 15000);

// put this as one of the first middleware so it acts
// before other middleware spends time processing the request
app.use(function(req, res, next) {
    if (!accesses.check(req.connection.remoteAddress)) {
        // cancel the request here
        res.end("No data for you!");
    } else {
        next();
    }
});
This method also has the usual limitations around IP address monitoring. If multiple users are sharing an IP address behind NAT, this will treat them all as one single user and they may get blocked due to their combined activity, not because of the activity of one single user.
But, as others have said, by the time the request gets this far into your server, some of the DOS damage has already been done (it's already taking cycles from your server). It might help to cut off the request before doing more expensive operations such as database operations, but it is even better to detect and block this at a higher level (such as Nginx or a firewall or load balancer).
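The core of the idea above can also be sketched as a compact sliding-window limiter. This is a hypothetical helper, not the answer's tested code: it assumes single-process, in-memory state (limits are not shared across cluster workers), and it makes the clock injectable so the logic can be exercised without real delays.

```javascript
// Sliding-window rate limiter sketch: allow at most maxRequests per
// windowMs milliseconds per IP. `now` defaults to the real clock but can
// be passed explicitly in tests.
function createLimiter(maxRequests, windowMs) {
  const hits = new Map(); // ip -> array of recent request timestamps

  return function allow(ip, now = Date.now()) {
    const cutoff = now - windowMs;
    // drop timestamps that have fallen out of the window
    const recent = (hits.get(ip) || []).filter(t => t > cutoff);
    recent.push(now);
    hits.set(ip, recent);
    return recent.length <= maxRequests;
  };
}
```

The question's budget of 10 requests per 3 seconds would be `createLimiter(10, 3000)`; unlike the AccessLogger above, this sketch has no separate 15-minute block phase, it simply refuses while the window is full.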
I don't think that is something that should be done at the HTTP server level. Basically, it doesn't prevent users from reaching your server, even if they won't see anything for 15 minutes.
In my opinion, you should handle that within your system, using a firewall. Although it's more a discussion for ServerFault or SuperUser, let me give you a few pointers.
Use iptables to set up a firewall on your entry point (your server or whatever else you have access to up the line). iptables allows you to set a limit of maximum connections per IP. The learning curve is pretty steep, though, if you don't have a background in networking. That is the traditional way.
Here's a good resource geared towards beginners: Iptables for beginners
And something similar to what you need here: Unix StackExchange
I recently came across a really nice package called Uncomplicated Firewall (ufw). It happens to have an option to limit the connection rate per IP and is set up in minutes. For more complicated stuff, you'll still need iptables, though.
In conclusion, like Brad said,
let your application servers do what they do best... run your application.
And let firewalls do what they do best, kick out the unwanted IPs from your servers.
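As an illustration of the iptables route (a sketch only; port 80 and the question's 10-requests-per-3-seconds budget are assumptions to adapt), the recent module can track and drop noisy sources:

```shell
# Record every new TCP connection to port 80 under the "HTTP" list
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m recent --set --name HTTP

# Drop a source that opened 10 or more new connections in the last 3 seconds
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m recent --update --seconds 3 --hitcount 10 --name HTTP -j DROP

# The ufw equivalent (ufw's limit rule uses a fixed budget of
# 6 connections per 30 seconds per source):
ufw limit 80/tcp
```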
It is not a good idea to filter connections or apply a connection policy like that in Node.js itself.
It is better to put Nginx in front of Node.js:
Client --> Nginx --> Node.js application.
It is not difficult, and it is cheap, because Nginx is open source too.
Good luck.
You can use the limiting-middleware npm package:
npm i limiting-middleware
Code:
const LimitingMiddleware = require('limiting-middleware');
app.use(new LimitingMiddleware({ limit: 100, resetInterval: 1200000 }).limitByIp());
// 100 request limit. 1200000ms reset interval (20m).
For more information, see the package's npm page.
I've got a Varnish setup sitting in front of PHP machines. For 98% of pages, a single request timeout (req.connect_timeout in VCL) works. I've got a couple of pages, however, that we expect to take up to 3 minutes before they should time out. Is there a way to set req.connect_timeout for specific requests in Varnish? If so, please show me the light in VCL. I'd like to keep the same req.connect_timeout for all pages but raise that number for these few specific pages.
Unfortunately, this does not work for Varnish > 3.
Very sad. There does not seem to be a way to actually achieve this in versions after 3.0.
I've been banging my head on this issue for hours.
I now do have a solution:
Use vcl_miss!
Here is an example:
sub vcl_recv {
    set req.backend = director_production;
    if (req.request == "POST") {
        return (pipe);
    }
    else {
        return (lookup);
    }
}

sub vcl_miss {
    if (req.url ~ "/longrunning") {
        set bereq.first_byte_timeout = 1h; # one hour!
        set bereq.between_bytes_timeout = 10m;
    } else {
        set bereq.first_byte_timeout = 10s;
        set bereq.between_bytes_timeout = 1s;
    }
}
This works for me.
What got me worried was that the documentation of Varnish states that vcl_miss is always called when an object is not found in the cache. In my first version I omitted the if/else in vcl_recv. I then had to experience (once again) that somehow the documentation is wrong. One needs to explicitly state the "return(lookup)". Otherwise vcl_miss is not called. :(
I think connection_timeout limit the time for setting up the connection to the back-end, and first_byte_timeout and between_bytes_timeout limit the processing time. Have you tried setting the bereq.first_byte_timeout programmatically in vcl_recv? E.g. with something like:
backend mybackend {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 100ms;
    .first_byte_timeout = 5s;
    .between_bytes_timeout = 5s;
}

sub vcl_recv {
    set req.backend = mybackend;
    if (req.url ~ "/slowrequest") {
        # set req.connect_timeout = 180s; # old naming convention?
        set bereq.connect_timeout = 180s;
    }
    # .. do default stuff
}
Let me know if it works...
I would solve it by declaring multiple backends in Varnish, each with a different timeout - but probably refering to the very same IP and server. Then you can simply set a new backend for certain URLs, to force them to use the timeouts declared there.
if (req.url ~ "[something]") {
    set req.backend = backend_with_higher_timeout;
}
In VCL 4.0 you can define your backend and give Varnish a hint to use it:
sub vcl_recv {
    if (req.method == "POST" && req.url ~ "^/admin") {
        set req.backend_hint = backend_admin_slow;
    }
}
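backend_admin_slow itself is not defined in the snippet above; a hedged sketch of what it could look like (the host and port are placeholders, and the long first_byte_timeout is the point of the exercise):

```vcl
backend backend_admin_slow {
    .host = "127.0.0.1";          # placeholder: same origin as your default backend
    .port = "8080";               # placeholder
    .connect_timeout = 5s;
    .first_byte_timeout = 180s;   # allow the slow admin requests up to 3 minutes
    .between_bytes_timeout = 60s;
}
```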
Use vcl_backend_fetch and set the timeout there:
sub vcl_backend_fetch {
    if (bereq.method == "POST" && bereq.url == "/slow") {
        set bereq.first_byte_timeout = 300s;
    }
}