Programmatically adding/removing backends to/from Varnish-cache 4.x

I have a simple program for adding and removing backends for varnish 3.x.
It's done in a simple way: a Python program accepts HTTP requests to add and remove backends for named services. It adds or removes backends for specific directors by rewriting the VCL configuration and then reloading the Varnish config.
Now I wonder how to implement such a scheme for the new (4.x) Varnish-cache. I see from the documentation that directors have moved from VCL modules to loadable VMODs. As I see it, it is now impossible to add a new backend or director on the fly without restarting the whole Varnish instance. Or am I wrong?
So the questions are:
Does Varnish-cache 4.x have any external [to varnish itself] API for managing configurations, and directors in particular?
What's the best way to manage directors in Varnish 4.x in an automated way, without a Varnish restart?

Moving directors to VMODs is a step toward the API service you ask about in #1.
Your software that writes VCL and loads it on changes can continue to operate as before; it is just the notation/syntax that has changed in 4.0. Use vcl.load / vcl.discard with varnishadm as usual to do #2.
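For example (a minimal sketch; backend names, addresses and file paths are placeholders, not from the question), a generated round-robin director in the 4.0 syntax could look like this:

    vcl 4.0;
    import directors;

    backend web1 { .host = "192.0.2.10"; .port = "8080"; }
    backend web2 { .host = "192.0.2.11"; .port = "8080"; }

    sub vcl_init {
        # Directors are now objects from the bundled "directors" VMOD.
        new myservice = directors.round_robin();
        myservice.add_backend(web1);
        myservice.add_backend(web2);
    }

    sub vcl_recv {
        # Pick a backend from the director for each request.
        set req.backend_hint = myservice.backend();
    }

Your Python program can then hot-load the regenerated file exactly as before:

    varnishadm vcl.load myconf_v2 /etc/varnish/generated.vcl
    varnishadm vcl.use myconf_v2
    varnishadm vcl.discard myconf_v1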

Related

Isn't Nginx load balancing like a proxy server?

Mainly: is there any difference between using nginx as a load balancer for a bunch of upstream servers, versus using a small Node.js proxy server that sits between a bunch of servers and one public host?
It may look obvious to you, but nginx is very new to me and I barely know anything about it.
Also, I guess my question is: is there any performance advantage to using nginx as a proxy server that distributes load, versus running your own Node.js code that acts as a proxy for those requests?
If the concern is introducing one more technology, I'd say keep the custom Node.js proxy as a short-term solution.
Long term, Nginx as a reverse proxy in front of an array of backends makes a lot of sense, for a number of technical and maintenance reasons. An application rarely stays the same: you add new features, replace legacy code and deploy new services, so the right approach is to use the right tool for each task. Nginx is proven and has been chosen by many heavily loaded applications across the web; its memory consumption and CPU utilisation are low and stable.
Most people use Nginx as a reverse proxy (the biggest reason to use Nginx, by the way) rather than anything else, because it is so powerful and feature-rich in that role.
In the request-response life cycle, Nginx rotates between backends and resends the request to the next one if a given backend is dead, so not a single request is lost.
From a maintenance point of view, dynamic upstreams with a REST interface (part of the commercial offering) look good enough. Even with the open-source version it is easy to roll out an upstream update plus a graceful reload (HUP signal). Nginx also supports zero-downtime binary upgrades (USR2 + QUIT signals).
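As a rough sketch (server addresses and names here are made up), the open-source setup described above boils down to an upstream block plus a graceful reload:

    upstream app_servers {
        server 10.0.0.1:3000;
        server 10.0.0.2:3000;
        server 10.0.0.3:3000 backup;   # used only when the others are down
    }

    server {
        listen 80;
        location / {
            proxy_pass http://app_servers;
            # Retry the request on the next backend if one is dead.
            proxy_next_upstream error timeout;
        }
    }

After editing the upstream list, a graceful reload picks it up without dropping in-flight connections:

    nginx -s reload    # or: kill -HUP $(cat /var/run/nginx.pid)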

Leverage browser caching and specifying a cache validator in node.js

I am a new Node guy and ran some speed tests on my website. I got two really bad scores, in the "leverage browser caching" and "specify a cache validator" sections.
I looked into this and saw .htaccess solutions. From what I've read .htaccess is only supported on Apache servers. Node.js creates the server and therefore having an Apache install is useless.
I've seen some .htaccess alternatives for node but none seem to offer a fix to this issue.
I also do not want to install npm modules that are too large.
What's the best solution?
Since these resources come from an external domain (fonts.googleapis.com), there is nothing you can change in your node.js server for them.
For resources that are under your control, you can set HTTP cache headers with res.header(). See https://stackoverflow.com/a/8007793/1194584
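As a minimal sketch using Express (the framework choice, routes and max-age values here are assumptions, not part of the original question):

    // Serve static assets with long-lived cache headers and an ETag validator.
    var express = require('express');
    var app = express();

    app.use(express.static('public', {
      maxAge: '7d',  // Cache-Control: public, max-age=604800
      etag: true     // lets browsers revalidate with If-None-Match
    }));

    // For individual routes, set the header yourself:
    app.get('/data', function (req, res) {
      res.header('Cache-Control', 'public, max-age=3600');
      res.send('cached for one hour');
    });

    app.listen(3000);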

Securing elasticsearch

I am completely new to elasticsearch but I like it very much. The only thing I can't find and can't get done is to secure elasticsearch for production systems. I read a lot about using nginx as a proxy in front of elasticsearch but I never used nginx and never worked with proxies.
Is this the typical way to secure elasticsearch in production systems?
If so, are there any tutorials or nice reads that could help me implement this? I really would like to use elasticsearch in our production system instead of Solr and Tomcat.
There's an article about securing Elasticsearch which covers quite a few points to be aware of here: http://www.found.no/foundation/elasticsearch-security/ (Full disclosure: I wrote it and work for Found)
There are also some things here you should know: http://www.found.no/foundation/elasticsearch-in-production/
To summarize the summary:
At the moment, Elasticsearch does not consider security to be its job. Elasticsearch has no concept of a user. Essentially, anyone that can send arbitrary requests to your cluster is a “super user”.
Disable dynamic scripts. They are dangerous.
Understand that sometimes tricky configuration is required to limit access to specific indexes.
Consider the performance implications of multiple tenants: a weakness or a bad query in one can bring down an entire cluster!
Proxying ES traffic through nginx with, say, basic auth enabled is one way of handling this (but use HTTPS to protect the credentials). Beyond basic auth, your proxy rules can, for instance, restrict access to various endpoints to specific users or to specific IP addresses.
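A minimal sketch of such a proxy (the hostname, certificate paths, htpasswd file and IP range are all assumptions):

    server {
        listen 443 ssl;
        server_name es.example.com;

        ssl_certificate     /etc/nginx/certs/es.crt;
        ssl_certificate_key /etc/nginx/certs/es.key;

        location / {
            auth_basic           "Elasticsearch";
            auth_basic_user_file /etc/nginx/.htpasswd;
            proxy_pass           http://127.0.0.1:9200;
        }

        # Example of restricting a sensitive endpoint by source address.
        location /_cluster {
            allow 192.0.2.0/24;
            deny  all;
            proxy_pass http://127.0.0.1:9200;
        }
    }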
What we do in one of our environments is use Docker. Docker containers are only accessible to the outside world and/or to other Docker containers if you explicitly define them as such. By default, they are unreachable.
In our docker-compose setup, we have the following containers defined:
nginx - Handles all web requests, serves up static files and proxies API queries to a container named 'middleware'
middleware - A Java server that handles and authenticates all API requests. It interacts with the following three containers, each of which is exposed only to middleware:
redis
mongodb
elasticsearch
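A sketch of what such a compose file could look like (image names, build paths and network names are assumptions; explicit networks keep the data stores off the public-facing network):

    version: "2"
    services:
      nginx:
        image: nginx
        ports:
          - "80:80"            # the only port published to the outside world
        networks: [front]
      middleware:
        build: ./middleware    # the Java API server
        networks: [front, back]
      redis:
        image: redis
        networks: [back]
      mongodb:
        image: mongo
        networks: [back]
      elasticsearch:
        image: elasticsearch
        networks: [back]
    networks:
      front:
      back:

Only nginx publishes a port; the containers on the "back" network are reachable solely from other containers attached to that network.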
The net effect of this arrangement is that access to elasticsearch can only happen through the middleware piece, which ensures that authentication, roles and permissions are handled correctly before any query is sent through.
A full Docker environment is more work to set up than a simple nginx proxy, but the end result is more flexible, scalable and secure.
Here's a very important addition to the info presented in answers above. I would have added it as a comment, but don't yet have the reputation to do so.
While this thread is old(ish), people like me still end up here via Google.
Main point: this link is referenced in Alex Brasetvik's post:
https://www.elastic.co/blog/found-elasticsearch-security
He has since updated it with this passage:
Update April 7, 2015: Elastic has released Shield, a product which provides comprehensive security for Elasticsearch, including encrypted communications, role-based access control, AD/LDAP integration and Auditing. The following article was authored before Shield was available.
You can find a wealth of information about Shield in Elastic's Shield documentation: https://www.elastic.co/guide/en/shield/current/index.html
A key point to note is that Shield requires Elasticsearch version 1.5 or newer.
Yeah, I also had the same question, but I found a plugin provided by the Elasticsearch team: Shield. It is limited; for production use you need to buy a license. Please find the attached link for your perusal.
https://www.elastic.co/guide/en/shield/current/index.html

How to use mod_security as standalone?

I've seen the module named "standalone" in the ModSecurity package, but I'm not sure how to use it after building and installing it!
Are there any good resources for getting started?
It does not appear to be possible, based on what the ModSecurity website says about its modes of operation:
Reverse proxies are effectively HTTP routers, designed
to stand between web servers and their clients. When you install a
dedicated Apache reverse proxy and add ModSecurity to it, you get a
"proper" network web application firewall, which you can use to
protect any number of web servers on the same network. Many security
practitioners prefer having a separate security layer. With it you get
complete isolation from the systems you are protecting. On the
performance front, a standalone ModSecurity will have resources
dedicated to it, which means that you will be able to do more (i.e.,
have more complex rules). The main disadvantage of this approach is
the new point of failure, which will need to be addressed with a
high-availability setup of two or more reverse proxies.
They consider it "standalone" when you create a dedicated host that is used for proxying to internal hosts.
That works, but it's technically not standalone.
I also filed a bug, and it was confirmed by Felipe Zimmerle:
Standalone is a wrapper to Apache internals that allows ModSecurity to be executed. That wrapper still demand Apache pieces. It is true that you can extend your application using the Standalone version although, you will need some Apache pieces
As you have noted, ModSecurity is an add-on to an existing web server, originally as an Apache module (hence the name) but now also available for Nginx and IIS.
You can run it in embedded mode (i.e. as part of your main web server), or in reverse proxy mode (which is basically the same, except that you set up a separate web server, run ModSecurity on that, and direct all traffic through it).
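For reference, embedded mode is just a normal module load in the host web server's config; a minimal Apache sketch (module and rule-set paths vary by distribution and are assumptions here):

    # Load ModSecurity into Apache and turn the rule engine on.
    LoadModule security2_module modules/mod_security2.so

    <IfModule security2_module>
        SecRuleEngine On
        # Pull in a rule set, e.g. the OWASP Core Rule Set.
        Include /etc/modsecurity/crs-setup.conf
        Include /etc/modsecurity/rules/*.conf
    </IfModule>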
To be perfectly honest, I've never found much point in the reverse proxy method. I guess it does mean you could use it with unsupported web servers (i.e. if you are using neither Apache, Nginx nor IIS), and it would reduce the load on your main web server, but other than that it seems like an extra step and extra infrastructure for no real gain. Some people might also prefer to do the ModSecurity checks in front of several web servers, but I would argue that if you have several web servers, it is likely for performance and resiliency reasons, so why not spread ModSecurity across that level too rather than creating a single point of failure that might become a bottleneck in front of them? The only other reason would be to apply session-level rules (e.g. detecting people changing session ids), where a session might ultimately be spread between different web servers, but I've never been convinced those rules are that great anyway.
When I build ModSecurity I get a mod_security2.so library but no separate standalone file(s), so I presume you're just seeing this from hunting through the source (I do see a "standalone" folder there)? I'd say the mere presence of a "standalone" folder in the source is no guarantee that it can run as a completely separate, standalone piece.
I'd also question why you would want to run this as a standalone app even if you could. Web servers have a lot of functionality in them, and depending on ModSecurity, which was written for web security rather than for web security plus all the other things a web server does (e.g. being fast, understanding the HTTP protocol, gzipping and ungzipping, etc.), would needlessly stretch what ModSecurity has to handle. So why not use a web server to take care of all that and let ModSecurity do what it's good at?
If you are using ModSecurity then I guess you have web apps (presumably with a web server), so why not use it through that?
Finally is there any problem with installing this through Apache (or Nginx or IIS)? It's free software that's well supported and easy to set up.
I guess ultimately I don't understand the reason for your question. Is there a particular problem you are trying to solve, or is this more just curiosity?

Is it possible to make CouchApp send requests autonomously?

I want to write a very simple app which monitors the state of some sites. I also want to build it CouchApp-style, without using any environment except CouchDB.
So the question is: how can I make a CouchApp send requests to those sites on a schedule, by itself?
BTW, if I fail with the CouchApp approach, is there some way to do it that doesn't involve daemon stuff (or cron) in PHP or even Java? I want to keep it as simple as possible, but not simpler.
rsp is correct. Since CouchDB uses web protocols and Javascript, it has become a victim of its own success.
My rule of thumb is this: CouchDB is a database. It stores your data. I do not expect MySQL to automatically monitor external web sites. Why would I expect CouchDB to do that?
However I agree; CouchDB always benefits from some persistent processing to maintain the data.
Since CouchDB is completely web-based, you could start with a simple dedicated "worker" web browser. Fetch a password-protected HTML page from CouchDB; that page contains the Javascript that makes the browser query the sites and update CouchDB. This could work in the short term as a quick solution. However, browsers impose security restrictions on cross-site queries, and a browser is not a long-term computing platform.
The traditional way is to run your own client software to do these things. You can run it on a dedicated computer, or use PHP, Node.js, or any of the other hosting services out there.
You can't do it in CouchDB alone (CouchApps can only have pure functions without side effects so they can be guaranteed to be cacheable) but you can do it using simple scripts that talk to CouchDB. See this talk by Mikeal Rogers for details on how to do it.
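For example, a tiny Node.js worker along those lines (the database name, poll interval and target URL are all assumptions) could poll a site and write status documents into CouchDB:

    // Poll a site once a minute and record the result in CouchDB.
    var http = require('http');

    function saveStatus(url, code) {
      var body = JSON.stringify({
        url: url,
        code: code,
        checked_at: new Date().toISOString()
      });
      var req = http.request({
        host: '127.0.0.1',
        port: 5984,
        path: '/site_status',        // assumed database name
        method: 'POST',
        headers: { 'Content-Type': 'application/json' }
      });
      req.end(body);
    }

    function checkSite(url) {
      http.get(url, function (res) {
        saveStatus(url, res.statusCode);
      }).on('error', function () {
        saveStatus(url, 0);          // 0 = unreachable
      });
    }

    setInterval(function () { checkSite('http://example.com/'); }, 60 * 1000);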
