I'm currently using the express-rate-limit module to block multiple requests from the same IP or logged-in user account in my Node server, and this is working pretty well against DoS attacks. The server is for a small local business and requires only one instance, since it doesn't have many users and its computing requirements aren't too intensive.
I've been reading a lot about nginx lately, and many people recommend using it with Node servers, but I can't see the major advantages of using it in this kind of application.
How would nginx be better for my application? What can it do that other npm modules can't in terms of security for a single server application?
Well, I am not an NGINX expert, but I use NGINX in production on my EC2 instance. When it comes to rate limiting, there are a couple of options available with Express:
You can use Redis as a store: get the IP address of each incoming request and check how many hits it currently has before deciding to serve it. This could be a middleware that runs on all routes (a sketch follows this list).
You could use a library like express-rate-limit or rate-limiter-flexible, which will handle the Redis part for you.
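A minimal sketch of the hand-rolled Redis option, assuming the ioredis package, a 60-second window, and a limit of 100 hits per IP (the window and limit are arbitrary):

```javascript
const Redis = require('ioredis');
const redis = new Redis(); // assumes Redis running on localhost:6379

// Count hits per IP in Redis so the limit also holds across
// clustered/multi-process deployments.
async function rateLimiter(req, res, next) {
  const key = `ratelimit:${req.ip}`;
  const hits = await redis.incr(key);
  if (hits === 1) {
    await redis.expire(key, 60); // start a fresh 60-second window
  }
  if (hits > 100) {
    return res.status(429).send('Too many requests');
  }
  next();
}

// Register it for all routes:
// app.use(rateLimiter);
```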
Now when you take NGINX, it is a web server whose strongest point, strictly speaking, is not rate limiting. It still supports rate limiting, though, if you modify the configuration. HERE is an insight into NGINX rate limiting.
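For illustration, a minimal sketch of that configuration, assuming the Node app listens on port 3000 (the rate and burst values are arbitrary):

```nginx
http {
    # Track clients by IP; allow roughly 10 requests per second each.
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    server {
        location / {
            # Queue up to 20 excess requests before rejecting with 503.
            limit_req zone=mylimit burst=20;
            proxy_pass http://127.0.0.1:3000;  # assumed Node app
        }
    }
}
```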
Another option you haven't considered is HAProxy, a load balancer that is considered superior for tasks such as rate limiting. You can read about it HERE.
Let's talk about the second part of your question.
Rate limiting inside an application is a bad idea. It does not belong in the application as such; it is not part of the business logic. It also does not work well in clustered mode (more than one core running Express at the same time) unless you tweak it to support clustering.
Rate limiting through NGINX configuration needs just two extra lines, as shown in the link I posted earlier. If you suddenly want to add an extra route, or exempt some route from rate limiting, NGINX can do that easily.
If you want to exempt your CloudFront or CDN server addresses from being rate limited, you can add a whitelist of IPs to the NGINX conf so that it will exempt them. Doing this in the application would be a real pain, as you would have to commit, redeploy, etc. THIS answer covers how to exempt addresses.
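A sketch of the whitelisting approach using the documented geo/map pattern; the CIDR range below is a placeholder for your CDN's addresses:

```nginx
# Whitelisted ranges map to an empty key, which limit_req_zone
# ignores; everyone else is keyed (and limited) by client IP.
geo $limit {
    default        1;
    203.0.113.0/24 0;  # hypothetical CDN range - replace with yours
}

map $limit $limit_key {
    0 "";
    1 $binary_remote_addr;
}

limit_req_zone $limit_key zone=req_zone:10m rate=10r/s;
```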
I want to create a Sails.js (Node.js) server app that will provide an API for a single-page app. This server will consist of multiple modules:
user management
forum
chat
admin GUI
content management
payment gateway
...
All these modules will share one database. The server must be able to handle as many requests and web sockets as possible. Clean architecture and performance are my primary goals.
My questions:
Should I create multiple servers running on multiple ports? I mean, one server for the content management module, another server for the forum module.
Or is it better to create only one big universal server that consists of multiple separate modules (hooks in Sails.js) and runs on one port? Will the performance of the server decrease in this case?
I was thinking about vertically scaling one big universal server running on a single port with pm2. Or is it better to scale Node.js horizontally and split the server into multiple smaller servers?
I'm new to Node.js, so I appreciate any advice.
I think it really boils down to the scale of the project.
For very simple things, there's no real reason to scale past a single but reliable server, is there?
However, for broader projects with a resource-intensive back end and a lot of users and traffic, you may want to split the back-end and front-end aspects depending on the requirements.
In that case you might have a single server (or more) dealing with the specific administrative requests or routines, then have the client/user API running through a load balancer, spread across multiple servers in multiple regions, or broken down further into an auto-scaling group to accommodate fluctuations in traffic.
It's also worth noting that this setup really suits higher volumes of traffic or resource usage, since you're dedicating server infrastructure to the purpose. For smaller applications with infrequent usage, breaking things down into microservices from the start and being billed for runtime rather than dedicated infrastructure utilization might make more sense to me. You could take a look at the AWS API Gateway and Lambda services for more information on that (I am not affiliated with AWS in any way, I just appreciate what they have managed to put together there).
Mainly: is there any difference between using nginx as a load balancer for a bunch of upstream servers, versus using a small Node.js proxy server that sits between a bunch of servers and one public host?
It may look obvious to you, but nginx is very new to me, and I barely know anything about it.
I guess my question is: is there any performance advantage to using nginx as a proxy server that distributes load, versus running your own Node.js code that proxies requests to other servers?
If it's a question of introducing one more technology, I'd say keep the custom Node.js proxy as a short-term solution.
As a long-term solution, Nginx as a reverse proxy in front of an array of backends makes a lot of sense, for a number of technical and maintenance reasons. An application rarely stays the same: you add new features, replace legacy code, and deploy new code, so the way to go is to use the right tool for the right task. Nginx is proven, and it is the choice of many heavily loaded applications across the web. Its memory consumption and CPU utilisation are low and stable.
Most people use Nginx as a reverse proxy (the biggest reason to use Nginx, by the way) rather than anything else, because it is so powerful and feature-rich.
From the request-response life cycle point of view, Nginx rotates between backends and resends the request if a given backend is dead, so not even one request is lost.
From a maintenance point of view, the dynamic upstream feature (part of the commercial installation) with its REST interface looks good enough. Even the open source version makes it easy to roll out an upstream update plus a graceful reload (HUP signal). Nginx also supports zero-downtime binary upgrades (USR2+QUIT).
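To make that concrete, a minimal reverse-proxy sketch; the backend ports are assumptions:

```nginx
upstream node_backend {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}

server {
    listen 80;
    location / {
        proxy_pass http://node_backend;
        # If one backend errors out or times out, retry the
        # request on the next backend in the pool.
        proxy_next_upstream error timeout;
    }
}
```

After editing the upstream block, `nginx -s reload` (which sends the HUP signal) applies the change gracefully without dropping connections.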
In what cases should one prefer to use Node.js only as a server in real deployment?
When one does not want to use Node.js only, what plays better with Node.js: Apache or Nginx?
There are several good reasons to stick another webserver in front of Node.js:
Not having to worry about privileges/setuid for the Node.js process. Only root can bind to port 80 typically. If you let nginx/Apache worry about starting as root, binding to port 80, and then relinquishing its root privileges, it means your Node app doesn't have to worry about it.
Serving static files like images, CSS, JS, and HTML. Node may be less efficient compared to a proper static file web server (Node may also be faster in select scenarios, but this is unlikely to be the norm). On top of serving files more efficiently, you won't have to worry about handling ETags or cache-control headers the way you would if you were serving things out of Node. Some frameworks may handle this for you, but you would want to be sure. Regardless, it is still probably slower.
As Matt Sergeant mentioned in his answer, you can more easily display meaningful error pages or fall back onto a static site if your node service crashes. Otherwise users may just get a timed out connection.
Running another web server in front of Node may help to mitigate security flaws and DoS attacks against Node. For a real-world example, CVE-2013-4450 is prevented by running something like Nginx in front of Node.
I'll caveat the second bullet point by saying you should probably be serving your static files via a CDN, or from behind a caching server like Varnish. If you're doing this it doesn't really matter if the origin is Node or Nginx or Apache.
Caveat with nginx specifically: if you're using websockets, make sure to use a recent version of nginx (>= 1.3.13), since it only just added support for upgrading a connection to use websockets.
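A minimal sketch tying these points together: nginx binds to port 80, serves static assets itself, and proxies everything else (including websocket upgrades) to an unprivileged Node process. The port and paths are assumptions:

```nginx
server {
    listen 80;

    # Static assets served directly by nginx, with cache headers.
    location /static/ {
        root /var/www/myapp;  # hypothetical document root
        expires 7d;
    }

    # Everything else goes to the Node app.
    location / {
        proxy_pass http://127.0.0.1:3000;
        # Required for websocket upgrades (nginx >= 1.3.13):
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```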
Just to add one more reason to pauljz's answer: I use a front-end server so that it can serve up 502 error pages when I'm restarting the backend server or it crashes for some reason. This way your users never get an error about being unable to establish a connection.
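A sketch of that fallback, assuming the backend runs on port 3000 and the error page lives at a hypothetical path:

```nginx
server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:3000;  # assumed Node backend
        # If the backend is down, show a friendly page instead of
        # letting the browser time out.
        error_page 502 /502.html;
    }

    location = /502.html {
        root /var/www/errors;  # hypothetical path to the static page
        internal;
    }
}
```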
It is my belief that using Node to serve static files is fine in all circumstances as long as you know what you're doing. It is certainly a new paradigm to use the application server to serve static files as so many (every?) competing technologies (PHP, Ruby, Python, etc) require a web server like HTTPD or Nginx in front of the application server(s).
Every objective reason I have ever read against serving static files with Node revolves around the idea of using what you know best or using what is perceived as better-tested / more stable. These are very valid reasons practically speaking, but have little purely technical relevance.
Unless you find a feature that is possible with a classic web server that is not possible with Node (and I doubt you will), choose what you know best or what you'd prefer to work with as either approach is fine.
As for Nginx vs Apache -- they will "play" with Node the same. You should compare them without regard to Node.
Using Node.js only
Node.js can do all the tasks of a web server: serve static files, respond to API calls, run the server over HTTPS... There are also a lot of packages that provide extra functionality, like logging requests, compressing responses, setting cookies, preventing XSS attacks... A lack of functionality is unlikely to be the reason for adding another web server (Apache/Nginx/etc.) alongside Node.js. In other words, for a simple application that does not need to scale, you don't need to add an extra layer on top of Node.js; it just complicates the problem.
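For example, a sketch of Node.js covering those duties by itself, using a few well-known npm packages (the route and paths here are made up):

```javascript
const express = require('express');
const compression = require('compression'); // gzip responses
const helmet = require('helmet');           // common security headers
const morgan = require('morgan');           // request logging

const app = express();
app.use(helmet());
app.use(compression());
app.use(morgan('combined'));
app.use(express.static('public'));          // serve static files

// A hypothetical API route:
app.get('/api/hello', (req, res) => res.json({ ok: true }));

app.listen(3000);
```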
Using Node.js with another webserver
Each web server has its own advantages. For example, Apache allows additional configuration per directory via the .htaccess file. Nginx is known for its performance when serving static files or acting as a reverse proxy. Node.js provides a huge benefit when dealing with I/O-heavy systems... Sometimes we need to combine the strengths of different web servers to satisfy the system's requirements.
Example: for an enterprise-level application that might scale up in the future, setting up Nginx as a reverse proxy in front of the Node.js application has some advantages (sketched after this list):
Nginx can act as a load balancer to dispatch traffic to your Node.js instances if you have more than one.
Nginx can handle HTTPS, caching, and compression for you. Encryption and compression are computationally heavy operations that Node.js is not good at, so using Nginx will give you better performance.
Nginx will serve static content, which reduces the load on Node.js.
Separation of concerns: Nginx takes care of all the "configuration" part, and Node.js focuses on the application logic.
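A sketch combining these points; the certificate paths and backend ports are assumptions:

```nginx
upstream node_app {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;  # a second instance, if you run one
}

server {
    listen 443 ssl;
    # nginx terminates HTTPS so Node never touches TLS:
    ssl_certificate     /etc/ssl/certs/example.crt;
    ssl_certificate_key /etc/ssl/private/example.key;

    gzip on;  # nginx handles compression

    # Static content served directly by nginx:
    location /static/ {
        root /var/www/app;
    }

    # Application traffic load-balanced across Node instances:
    location / {
        proxy_pass http://node_app;
    }
}
```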
Placing NGINX in front of Node helps better handle high connection volumes. NGINX offers (to name a few) caching, load balancing, rate limiting (using the leaky bucket algorithm) and can help mitigate attacks if paired with a banning service like Fail2ban.
As for production applications, you can run your application server behind NGINX as a reverse proxy, coupled with a caching server like Redis, all of which can sit behind a content delivery network as another line of defense against exposing your IPv4/IPv6 addresses.
An extra: a reverse proxy also matters if you need, for example, to run a websocket server on the same port, or to mix technologies (answering some requests with Node.js and others with PHP, or whatever).
Are there any examples or conventions out there of how to use node.js to host multiple web apps?
I'm already aware that node itself can be used to build a server, but I'm curious as to whether there have been implementations where you aren't necessarily running it all the time. Strictly for the reason that perhaps there are multiple sites being hosted, each with their own copy of a framework, static files and custom functionality.
Or maybe you do run one instance of node and code a multiple-site architecture to ensure one bad site doesn't take the server down in some way?
Virtual hosts, ensuring that one site can't crash the others... these are all things that have been considered on other platforms, but I have had some difficulty finding them for node! :)
I am already aware of connect, express and other middleware, however it doesn't cover what I'm asking here.
If you're worried about runtime isolation, each "site" should run its own node process. Then use a proxy like node-http-proxy that does host-header-based routing. Another great node-based option is bouncy, but you don't necessarily need to use node to do the host-based routing. You could just as well use haproxy, nginx, etc.
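A minimal sketch of host-header routing with node-http-proxy; the hostnames and ports are invented for illustration:

```javascript
const http = require('http');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({});

// Each site runs as its own node process on its own port.
const sites = {
  'blog.example.com': 'http://127.0.0.1:3001',
  'shop.example.com': 'http://127.0.0.1:3002',
};

http.createServer((req, res) => {
  const target = sites[req.headers.host];
  if (!target) {
    res.writeHead(404);
    return res.end('Unknown host');
  }
  proxy.web(req, res, { target }); // forward to the matching site
}).listen(80);
```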
The baseline RAM overhead of each node process is very small (~10mb - 15mb). Also, if you do HTTP based routing you can spread your sites easily across machines, user home directories, etc.
If you want to handle the site/host registration programmatically, I would use seaport and then communicate the hostname and host + port details back to the proxy so that the routing table can be dynamic. This would also make it fairly easy to scale a site across multiple node processes.
Good luck!
I'm a developer of an MMO game, and at my company we're currently facing some scalability issues which, I think, can be resolved with proper clustering of the game world.
I don't really want to reinvent the wheel that's why I think Linux Virtual Server could be a good choice especially with some Level 7 load balancing technique.
I'm currently looking at ktcpvs as a load balancing solution and wonder if it's a proper choice.
The main idea is to have a number of zones ("locations" in terms of my game) running on dedicated servers. When a player decides to go to some specific location, the load balancer decides which zone server will actually serve the player (that's why I need a Level 7 load balancer).
What do you folks think about all said above?
Update: I posted the same question to LVS users mailing list http://marc.info/?l=linux-virtual-server&m=124976265209769&w=2
Update: I also started the similar topic on the gamedev.net forum http://www.gamedev.net/community/forums/topic.asp?topic_id=544386
In order to address your question, we need to understand whether you need volume or response time; it is difficult to get both at the same time.
Layer 7 load balancing is data-based, application-level balancing: the data content of the network packet determines which end-point it is routed to. You can achieve volume (more users) by implementing routing at the application level, the service level, or the kernel level.
Scalability - I assume you are running out of memory, CPU resources and network bandwidth.
Application level - your application logic receives an application packet and routes accordingly.
Service level - your system framework (a front-end service of some kind) receives the packet and performs the routing through a module (think of a custom Apache module, or even network driver modules, like writing a network filter).
Kernel level - Performs routing at network packet level.
The closer you move to the metal, the better your response will be. I suggest using a dedicated Linux server up front to perform the routing: go native, not virtual. Use multiple or teamed network adapters for the WAN and a dedicated adapter for each end-point (one or more for the WAN, one for each connected app server).
If response time is important, then you need a kernel/supervisor-state solution; it will save you a few context switches. But be aware that you need to limit hops at all costs, that you could be better served by fewer, larger machines, and that your scalability will always be limited. There is a risk in using KTCPVS: it is quite old and not actively updated. If you judge that it works for you, great; otherwise consider writing something akin to a network filter, as long as it runs in system state.
If volume is important but response time is secondary, implement a custom-built high-speed socket switch in C++ running in problem/user state. It is the easiest to maintain and will offer the best scalability.
You will need to build some prototypes to figure out what suits your needs best.
Final thoughts -
Before doing any of the above first ensure that you have optimized your game design. You may know most of this, I list it here for the benefit of all.
(a) Messages should fit comfortably within one network packet, less than 1500 bytes for most home routers.
(b) Try to put the routing logic in your game client instead of your servers. A simple download of a small table of zones and IP addresses to the client would let you forego all of the above (a sketch follows this list).
(c) Try to limit zone visibility for the clients; they should know about their own zone and adjacent zones only (if you implement point (b) above).
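A sketch of the zone table from point (b); the zone names and addresses are placeholders:

```javascript
// Downloaded once by the client; no server-side router needed.
const zones = {
  forest:  { host: 'zone1.example.com', port: 7001 },
  desert:  { host: 'zone2.example.com', port: 7002 },
  harbour: { host: 'zone3.example.com', port: 7003 },
};

// The client picks the right server itself when changing location:
function zoneAddress(name) {
  const { host, port } = zones[name];
  return `${host}:${port}`; // e.g. open a socket to this address
}
```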
Hope this helps, sorry I cannot be more specific regarding KTCPVS.
You haven't specified where the bottleneck is. Network Traffic? Disk IO? CPU Cycles?
Assuming you mean a layer 7 load balancer and don't have enough CPU power, I think LVS is not the optimal choice. I have done web server load balancing with LVS, which works straightforwardly and isn't exactly complicated.
But I think load balancing an MMORPG this way needs considerable amounts of additional code in LVS; it might be easier to do the load balancing with a multithreaded application distributed over some multicore server. But this isn't fully scalable; it only gets you to 16 cores without a prohibitive cost increase.
The biggest issue in something like this is what happens when players are near a boundary. Obviously they need to be able to see and interact with each other, but they're on separate servers. So you need some pretty fancy inter-server communication, sometimes just duplicating messages to both servers. It can get even more complicated when someone is near a "corner", and then you have to deal with 4 servers!
The book Massively Multiplayer Game Development has a chapter on "The Pitfalls of Shared Server Boundaries" which covers this issue in detail.
I haven't heard of Linux Virtual Server before now, so I don't understand how it fits. I think your actual server application needs to support this game-specific load balancing, rather than trying to run a cluster and assuming that it will automatically know how to split up your application (which it won't). If I were you, I would write the server program to handle its own piece of land, and it should connect to the pieces of land around it, and then design a server-to-server protocol for the passing of these messages ("here comes a player, I'm going to start telling you about him!" "make sure to tell me about messages near our boundary", "okay the player is out of my territory and into yours, here's his detailed data", etc). I think it's a bit more complicated than just running a different flavor of Linux and assuming you'll get automatic load balancing.
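To illustrate, a sketch of what such a server-to-server protocol's messages might look like; every name and field here is invented:

```javascript
// Hypothetical handoff messages exchanged between adjacent
// zone servers at a shared boundary:
const boundaryMessages = [
  // "Here comes a player, I'm going to start telling you about him!"
  { type: 'player-approaching', playerId: 42, pos: [120, 87] },
  // "Make sure to tell me about messages near our boundary."
  { type: 'subscribe-boundary', edge: 'north' },
  // "The player is out of my territory and into yours, here's his data."
  { type: 'player-handoff', playerId: 42, state: { hp: 80, items: [] } },
];
```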
Why are you moving the distribution logic to the load balancer? It's a component that isn't free and can break. It seems your clients are quite aware of which zone they're in, so they could very well connect to zone<n>.example.com. You'd then handle load balancing at the DNS level.