Isn't Nginx load balancing like a proxy server? - node.js

Mainly, is there any difference between using nginx as a load balancer for a bunch of upstream servers and using a small Node.js proxy server that sits between a bunch of servers and one public host?
It may look obvious to you, but nginx is very new to me and I barely know anything about it.
I guess my question is also whether there is any performance advantage to using nginx as a proxy server that distributes load versus running your own Node.js code that acts as a proxy for incoming requests.

If the concern is introducing one more technology, I'd say keep the custom Node.js proxy as a short-term solution.
As a long-term solution, Nginx as a reverse proxy in front of an array of backends makes a lot of sense for a number of technical and maintenance reasons. An application rarely stays the same: you add new features, replace legacy code and deploy new services, so the way to go is to use the right tool for the right task. Nginx is proven and has been chosen by many heavily loaded applications across the web. Its memory consumption and CPU utilisation are low and stable.
Most people use Nginx as a reverse proxy rather than anything else (it is the biggest reason to use Nginx, by the way) because it is so powerful and feature-rich in that role.
From the request-response life cycle point of view, Nginx rotates between backends and passes a request on to the next one if a given backend is dead, so a single dead backend does not have to mean a lost request.
From the maintenance point of view, dynamic upstreams with a REST interface (part of the commercial version) look good enough. Even with the open source version it is easy to roll out an upstream update plus a graceful reload (HUP signal). Nginx also supports zero-downtime binary upgrades (USR2 + QUIT).
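For comparison, here is a minimal sketch of what a hand-written Node.js proxy would have to do to get the same "rotate and retry on a dead backend" behaviour. The backend addresses and the names `backends`, `nextBackend`, `forward` and `createProxy` are made up for illustration, and the retry is only safe as written for requests without a body (such as GET), because a request body stream cannot be replayed.

```javascript
const http = require('http');

// Hypothetical backend list; hosts and ports are placeholders.
const backends = [
  { host: '127.0.0.1', port: 3001 },
  { host: '127.0.0.1', port: 3002 },
];

let cursor = 0;

// Round-robin: return the next backend and advance the cursor.
function nextBackend(list) {
  const target = list[cursor % list.length];
  cursor += 1;
  return target;
}

// Forward a request; on a connection error, try the next backend.
// After every backend has been tried once, answer with 502.
function forward(req, res, attemptsLeft = backends.length) {
  if (attemptsLeft === 0) {
    res.writeHead(502, { 'Content-Type': 'text/plain' });
    res.end('Bad Gateway');
    return;
  }
  const target = nextBackend(backends);
  const upstream = http.request(
    {
      host: target.host,
      port: target.port,
      path: req.url,
      method: req.method,
      headers: req.headers,
    },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(res);
    }
  );
  upstream.on('error', () => {
    // If part of a response was already sent, it is too late to retry.
    if (res.headersSent) {
      res.destroy();
      return;
    }
    forward(req, res, attemptsLeft - 1);
  });
  upstream.end(); // no body is forwarded in this sketch
}

// Build the proxy server; call createProxy().listen(8080) to run it.
function createProxy() {
  return http.createServer((req, res) => forward(req, res));
}
```

Even this small sketch leaves out health checks, timeouts, request bodies and connection reuse, which is part of the argument for letting Nginx do the job.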

Related

Express-rate-limit vs NGINX in a node server

I'm currently using the express-rate-limit module to block multiple requests from the same IP or logged-in user account in my node server, and this is working pretty well against DoS attacks. The server belongs to a small local business and requires only one instance, as it doesn't have too many users and its computing requirements aren't too intensive.
I've been reading a lot about nginx lately, and many people recommend using it with node servers, but I can't see the major advantages of using it in this kind of application.
How would nginx be better for my application? What can it do that npm modules can't in terms of security for a single-server application?
Well, I am not an NGINX expert, but I currently use NGINX in production on my EC2 instance. When it comes to rate limiting, there are a couple of options available with respect to Express:
You can use Redis as a store, get the IP address of each incoming request and check how many hits it currently has before deciding whether to service the request. This could be a middleware that works on all routes.
You could use a library like express-rate-limit or rate-limiter-flexible, which will handle the Redis part for you.
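As a rough sketch of the first option, the per-IP counting could look like the middleware below. It is an assumption-laden illustration, not how express-rate-limit or rate-limiter-flexible are implemented: it keeps a fixed-window counter in process memory instead of Redis, and `createRateLimiter`, `windowMs`, `max` and `now` are names chosen here for the example.

```javascript
// Fixed-window counter kept in process memory. With several Node
// processes, a shared store such as Redis would be needed instead.
function createRateLimiter({ windowMs = 60000, max = 100, now = Date.now } = {}) {
  const hits = new Map(); // ip -> { count, windowStart }

  // Express-style middleware signature: (req, res, next).
  return function rateLimit(req, res, next) {
    const ip = req.ip || (req.socket && req.socket.remoteAddress);
    const t = now();
    let entry = hits.get(ip);

    // Start a fresh window for new clients or when the old one expired.
    if (!entry || t - entry.windowStart >= windowMs) {
      entry = { count: 0, windowStart: t };
      hits.set(ip, entry);
    }

    entry.count += 1;
    if (entry.count > max) {
      res.statusCode = 429;
      res.end('Too Many Requests');
      return;
    }
    next();
  };
}
```

With Express it would be mounted with something like `app.use(createRateLimiter({ windowMs: 60000, max: 100 }))`. Note that the map grows with the number of distinct clients, so a real version would also need to evict old entries.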
NGINX, to be precise, is a web server whose strongest point is not rate limiting. It still supports rate limiting, though, if you modify the configuration. HERE is an insight into NGINX rate limiting.
Another option you haven't considered is HAProxy, a load balancer that is considered superior for tasks such as rate limiting. You can read about it HERE
Let's talk about the second part of your question.
Rate limiting inside an application is a bad idea. It does not belong in the application as such; it is not part of the business logic. Also, it does not work well in clustered mode (more than one core running Express at the same time) unless you tweak it to support clustering, since each process would otherwise keep its own counters.
Rate limiting in the NGINX configuration needs just 2 extra lines, as shown in the earlier link I posted. If you suddenly want to add an extra route or exempt some route from rate limiting, NGINX can easily do that.
If you want to exempt your CloudFront addresses or CDN server addresses from being rate limited, you can add a whitelist of IPs to the NGINX conf so that it exempts them. Doing this in the application would be a real pain, as you would have to git commit, redeploy etc. THIS answer covers how to exempt addresses

We have already had nodejs, why we need nginx or apache?

Recently, I set up my blog site, which is powered by Ghost, a lightweight, fast and static blog framework. I noticed that Ghost runs on Node.js, and that I don't need to install apache or nginx anymore.
In that case, why do we need apache or nginx? I know nginx is famous for its outstanding performance, but how about a Node.js server's performance?
The V8 engine your Node.js code runs on is meant to be a JavaScript runtime that executes JavaScript code, not to perform as a server.
Therefore, it is better to reverse-proxy your Node.js application through a server such as Nginx.
Moreover, when you require server-based features such as load balancing, caching, max post size, request timeouts etc., it is better to use proper server software on which you can configure these settings than to depend on the language's runtime. You can still do these things in the language's runtime, but that would be overkill.
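To give a feel for what "doing it in the runtime" means, here is a minimal sketch of enforcing a max post size by hand in Node.js. The function name `collectBody`, its callback shape and the `statusCode` property on the error are assumptions made for this example; in Nginx the same limit is a single configuration directive.

```javascript
// Collect a request body, giving up once it exceeds maxBytes.
// `req` is any readable stream that emits 'data', 'end' and 'error',
// such as the request object passed to an http server handler.
function collectBody(req, maxBytes, callback) {
  const chunks = [];
  let size = 0;
  let done = false;

  req.on('data', (chunk) => {
    if (done) return;
    size += chunk.length;
    if (size > maxBytes) {
      done = true;
      const err = new Error('Payload Too Large');
      err.statusCode = 413; // the caller decides how to respond
      callback(err);
      return;
    }
    chunks.push(chunk);
  });

  req.on('end', () => {
    if (done) return;
    done = true;
    callback(null, Buffer.concat(chunks));
  });

  req.on('error', (err) => {
    if (done) return;
    done = true;
    callback(err);
  });
}
```

A handler would call `collectBody(req, 1024 * 1024, (err, body) => { ... })` and reply with `err.statusCode` when the limit is hit. Timeouts, caching and load balancing would each need similar hand-written code.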

How to use mod_security as standalone?

I've seen the module named standalone in the package of Mod_Security; but I'm not sure how to use it after making and installing it!
Are there any good resources for getting started?
It does not appear to be possible; based on what the ModSecurity website says for its modes of operation:
Reverse proxies are effectively HTTP routers, designed to stand between web servers and their clients. When you install a dedicated Apache reverse proxy and add ModSecurity to it, you get a "proper" network web application firewall, which you can use to protect any number of web servers on the same network. Many security practitioners prefer having a separate security layer. With it you get complete isolation from the systems you are protecting. On the performance front, a standalone ModSecurity will have resources dedicated to it, which means that you will be able to do more (i.e., have more complex rules). The main disadvantage of this approach is the new point of failure, which will need to be addressed with a high-availability setup of two or more reverse proxies.
They consider it separate because a dedicated host is created and used for proxying to internal hosts.
That works, but it's technically not standalone.
I also filed a bug, and it was confirmed by Felipe Zimmerle:
Standalone is a wrapper to Apache internals that allows ModSecurity to be executed. That wrapper still demand Apache pieces. It is true that you can extend your application using the Standalone version although, you will need some Apache pieces
As you have noted ModSecurity is an add on to an existing web server - originally as an Apache module (hence the name) but now also available for Nginx and IIS.
You can run it in embedded mode (i.e. as part of your main web server) or run it in reverse proxy mode (which is basically the same but you set up a separate web server and run it on that, and then direct all traffic through that).
To be perfectly honest, I've never found much point in the reverse proxy method. I guess it does mean you could use it with non-supported web servers (i.e. if you are using neither Apache, Nginx nor IIS), and it would reduce the load on your main web server, but other than that it seems like an extra step and extra infrastructure for no real gain. Some people might also prefer to do the ModSecurity checks in front of several web servers, but I would argue that if you have several web servers, it is likely for performance and resiliency reasons, so why not spread ModSecurity to that level too rather than creating a single point of failure, and possibly a bottleneck, in front of them? The only other reason would be to apply session-level rules (e.g. if people are changing session ids) to sessions that might end up spread between different web servers, but I've never been convinced that those rules are that great anyway.
When I build ModSecurity I get a mod_security2.so library being built but no separate standalone file(s) so I presume you're just seeing this from hunting through the source (I do see a standalone)? I'd say just because there is a "standalone" folder in the source is not a guarantee that it can run as a completely separate, standalone piece.
I'd question why you want to run this as a standalone app even if you could? Web servers have a lot of functionality in them and depending on ModSecurity, which was written for web security, rather than web security and all the other things a web server does (e.g. be quick, understand HTTP protocol, gzip and ungzip...etc), needlessly stretches what ModSecurity would need to handle. So why not use a web server to take care of this and let ModSecurity do what it's good at?
If you are using ModSecurity then I guess you have web apps (presumably with a web server), so why not use it through that?
Finally is there any problem with installing this through Apache (or Nginx or IIS)? It's free software that's well supported and easy to set up.
I guess ultimately I don't understand the reason for your question. Is there a particular problem you are trying to solve, or is this more just curiosity?

Using Node.js only vs. using Node.js with Apache/Nginx [closed]

In what cases should one prefer to use Node.js only as a server in real deployment?
When one does not want to use Node.js only, what plays better with Node.js? Apache or Nginx?
There are several good reasons to stick another webserver in front of Node.js:
Not having to worry about privileges/setuid for the Node.js process. Only root can bind to port 80 typically. If you let nginx/Apache worry about starting as root, binding to port 80, and then relinquishing its root privileges, it means your Node app doesn't have to worry about it.
Serving static files like images, css, js, and html. Node may be less efficient compared to using a proper static file web server (Node may also be faster in select scenarios, but this is unlikely to be the norm). On top of files being served more efficiently, you won't have to worry about handling ETags or cache control headers the way you would if you were serving things out of Node. Some frameworks may handle this for you, but you would want to be sure. Regardless, it is still probably slower.
As Matt Sergeant mentioned in his answer, you can more easily display meaningful error pages or fall back onto a static site if your node service crashes. Otherwise users may just get a timed out connection.
Running another web server in front of Node may help to mitigate security flaws and DoS attacks against Node. For a real-world example, CVE-2013-4450 is prevented by running something like Nginx in front of Node.
I'll caveat the second bullet point by saying you should probably be serving your static files via a CDN, or from behind a caching server like Varnish. If you're doing this it doesn't really matter if the origin is Node or Nginx or Apache.
Caveat with nginx specifically: if you're using websockets, make sure to use a recent version of nginx (>= 1.3.13), since it only just added support for upgrading a connection to use websockets.
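To illustrate the ETag and cache control handling mentioned in the static files point, here is a minimal sketch of what Node would have to do by hand. The names `makeETag` and `respondWithCaching`, the SHA-1 based tag and the one-hour `max-age` are choices made for this example, not what any particular framework does.

```javascript
const crypto = require('crypto');

// Derive a strong ETag from the content (quoted, as the header requires).
function makeETag(body) {
  return '"' + crypto.createHash('sha1').update(body).digest('hex') + '"';
}

// Send `body`, or an empty 304 if the client's cached copy is current.
function respondWithCaching(req, res, body, contentType) {
  const etag = makeETag(body);
  res.setHeader('ETag', etag);
  res.setHeader('Cache-Control', 'public, max-age=3600');

  // Simplified check: a full implementation would also handle
  // comma-separated lists, weak tags and "*" in If-None-Match.
  if (req.headers['if-none-match'] === etag) {
    res.statusCode = 304;
    res.end();
    return;
  }

  res.statusCode = 200;
  res.setHeader('Content-Type', contentType);
  res.end(body);
}
```

A static file server such as Nginx does the equivalent for files on disk without any application code.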
Just to add one more reason to pauljz's answer, I use a front end server so that it can serve up 502 error pages when I'm restarting the backend server or it crashes for some reason. This way your users never get an error about being unable to establish a connection.
It is my belief that using Node to serve static files is fine in all circumstances as long as you know what you're doing. It is certainly a new paradigm to use the application server to serve static files as so many (every?) competing technologies (PHP, Ruby, Python, etc) require a web server like HTTPD or Nginx in front of the application server(s).
Every objective reason I have ever read against serving static files with Node revolves around the idea of using what you know best or using what is perceived as better-tested / more stable. These are very valid reasons practically speaking, but have little purely technical relevance.
Unless you find a feature that is possible with a classic web server that is not possible with Node (and I doubt you will), choose what you know best or what you'd prefer to work with as either approach is fine.
As for Nginx vs Apache -- they will "play" with Node the same. You should compare them without regard to Node.
Using Node.js only
Node.js can do all the tasks of a web server: serve static files, respond to API calls, run a server over HTTPS... There are also a lot of packages that provide extra functionality such as logging requests, compressing responses, setting cookies, preventing XSS attacks... A lack of functionality is unlikely to be a reason for adding another web server (Apache/Nginx/etc.) to complement Node.js. In other words, for a simple application that does not need to scale, you don't need to add an extra layer in front of Node.js; it just complicates things.
Using Node.js with another webserver
Each web server has its own advantages. For example, Apache allows additional per-directory configuration via the .htaccess file. Nginx is known for its performance when it comes to serving static files or acting as a reverse proxy. Node.js provides a huge benefit when dealing with I/O-heavy systems... Sometimes we need to combine the strengths of different web servers to satisfy the system's requirements.
Example: for an enterprise-level application that might scale up in the future, setting up Nginx as a reverse proxy in front of the Node.js application has some advantages:
Nginx can act as a load balancer to dispatch traffic to your NodeJS instances if you have more than 1.
Nginx can handle HTTPS, caching, and compression for you. Encryption and compression are computationally heavy operations that Node.js is not good at, so using Nginx will give you better performance.
Nginx will serve static content, which reduces the load on Node.js.
Separation of concerns: Nginx takes care of all the "configuration" part, and Node.js focuses on the application logic.
Placing NGINX in front of Node helps better handle high connection volumes. NGINX offers (to name a few) caching, load balancing, rate limiting (using the leaky bucket algorithm) and can help mitigate attacks if paired with a banning service like Fail2ban.
For production applications, you can run your application server behind NGINX as a reverse proxy, coupled with a caching server like Redis, all of which can sit behind a content delivery network as another line of defense against exposing your server's IPv4/IPv6 address.
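For readers unfamiliar with the leaky bucket algorithm mentioned above, here is a minimal sketch of the idea. It is an illustration of the general algorithm, not NGINX's actual implementation, and the class name `LeakyBucket` and its `capacity`, `leakPerSecond` and `now` options are invented for the example.

```javascript
// Leaky bucket used as a meter: each request adds one unit to the
// bucket, the bucket drains at a constant rate, and a request is
// rejected when adding it would overflow the bucket.
class LeakyBucket {
  constructor({ capacity, leakPerSecond, now = Date.now }) {
    this.capacity = capacity;
    this.leakPerSecond = leakPerSecond;
    this.now = now;
    this.level = 0;
    this.last = now();
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryAdd() {
    const t = this.now();
    // Drain whatever leaked out since the previous call.
    const leaked = ((t - this.last) / 1000) * this.leakPerSecond;
    this.level = Math.max(0, this.level - leaked);
    this.last = t;

    if (this.level + 1 > this.capacity) {
      return false;
    }
    this.level += 1;
    return true;
  }
}
```

With `capacity: 2` and `leakPerSecond: 1`, a client can send a burst of two requests and is then held to roughly one request per second. In practice one bucket would be kept per client key, such as the IP address.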
An extra: it is also important if you need a reverse proxy, for example to run a WebSocket server on the same port, or to mix some technologies (answer some requests with Node.js and others with PHP, or whatever).

What are my options when it comes to node.js lifecycle?

Are there any examples or conventions out there of how to use node.js to host multiple web apps?
I'm already aware that node itself can be used to build a server, but I'm curious as to whether there have been implementations where you aren't necessarily running it all the time, strictly for the reason that there may be multiple sites being hosted, each with its own copy of a framework, static files and custom functionality.
Or maybe you do run one instance of node and code a multi-site architecture to ensure one bad site doesn't take the server down in some way?
Virtual hosts, ensuring that one site can't crash others... these are all things that have been considered on other platforms, but I have had some difficulty finding them for node! :)
I am already aware of connect, express and other middleware, however it doesn't cover what I'm asking here.
If you're worried about runtime isolation, each "site" should run its own node process. Then use a proxy like node-http-proxy that will do host header based routing. Another great node based option is bouncy, but you don't necessarily need to use node to do the host based routing. You could just as well use haproxy, nginx, etc.
The baseline RAM overhead of each node process is very small (~10 MB to 15 MB). Also, if you do HTTP based routing you can spread your sites easily across machines, user home directories, etc.
If you want to handle the site/host registration programmatically, I would use seaport and then communicate the hostname and host + port details back to the proxy so that the routing table can be dynamic. This would also make it fairly easy to scale a site across multiple node processes.
Good luck!
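As a rough sketch of the host header based routing with a dynamic routing table described in this answer, using only Node's built-in `http` module rather than node-http-proxy, bouncy or seaport. The function names `register`, `unregister`, `resolve` and `handle` and the example hostnames are hypothetical.

```javascript
const http = require('http');

// Routing table: hostname -> { host, port } of the site's own process.
const routes = new Map();

function register(hostname, target) {
  routes.set(hostname.toLowerCase(), target);
}

function unregister(hostname) {
  routes.delete(hostname.toLowerCase());
}

// Look up the target for a Host header such as "blog.example.com:80".
// Simplification: IPv6 literals in the Host header are not handled.
function resolve(hostHeader) {
  if (!hostHeader) return undefined;
  const hostname = hostHeader.split(':')[0].toLowerCase();
  return routes.get(hostname);
}

// Request handler for the front proxy, usable with http.createServer(handle).
function handle(req, res) {
  const target = resolve(req.headers.host);
  if (!target) {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Unknown host');
    return;
  }
  const upstream = http.request(
    {
      host: target.host,
      port: target.port,
      path: req.url,
      method: req.method,
      headers: req.headers,
    },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(res);
    }
  );
  upstream.on('error', () => {
    // The site's process is down or unreachable.
    if (!res.headersSent) res.writeHead(502);
    res.end();
  });
  req.pipe(upstream);
}
```

Each site's process would be registered with something like `register('blog.example.com', { host: '127.0.0.1', port: 3001 })`, and a crashed site only affects its own entry in the table.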

Resources