Can I use Amazon ELB instead of nginx as a load balancer for my Node.js app?

I have a Node.js app and I've seen a lot of posts here on SO saying that it needs to sit behind nginx as a load balancer. Since I'm already accustomed to Amazon's services, hence my question.

Yes, but there are a few gotchas to keep in mind:
If you have a single server, make sure the page ELB uses for health checks returns nothing but a 200. We had a 301 redirect from our non-www to our www site, and that redirect made ELB mark the server unhealthy and stop sending it any traffic.
You'll see the ELB's IP instead of the client's in your logs. If nginx is in the mix, its ngx_real_ip module can restore the client address, but it takes some config hacking to get it to work; otherwise you can read the address from the X-Forwarded-For header that ELB sets, as in the sketch below.
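
A minimal Express sketch of both points (route path and port are illustrative): a dedicated health-check route that only ever returns a plain 200, plus trust proxy so req.ip is read from the X-Forwarded-For header the ELB sets:

```javascript
const express = require('express');
const app = express();

// Trust the ELB in front of us so req.ip comes from X-Forwarded-For
// instead of reporting the load balancer's address.
app.set('trust proxy', true);

// Point the ELB health check here: a bare 200, never a redirect,
// so the instance stays in rotation.
app.get('/health', (req, res) => {
  res.status(200).send('OK');
});

app.get('/', (req, res) => {
  console.log('request from %s', req.ip); // real client IP
  res.send('Hello');
});

app.listen(3000);
```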

ELB works great in front of a basic Node.js application. If you want WebSockets, you need to configure it for TCP balancing. TCP balancing doesn't support sticky sessions though, so you get one or the other.

Related

Problem running Express app over HTTPS on AWS

I have an ExpressJS backend and I want it to run over HTTPS on AWS (so I don't get a 'mixed content' error when connecting from my frontend, which runs over HTTPS). It runs great over HTTP, but over HTTPS it doesn't work.
I asked this question before and got answers like 'use nginx' or 'use a load balancer'. Unfortunately I don't know much about this stuff, as I'm not very experienced with all the AWS variations and options. Are there any tutorials I can follow step by step, or any easy way to serve my backend over HTTPS without complexity?
any easy way to serve my backend over https without complexity?
The easiest way (not to be confused with the cheapest way) is to change your EB environment to a load-balanced one. You can do this in the EB console's configuration settings.
This change will create an Application Load Balancer for your app and place it in front of your instance. Once the ALB is running you can follow this AWS guide:
How can I configure HTTPS for my Elastic Beanstalk environment?
In that guide, only the section 'Terminate HTTPS on the load balancer' is relevant.
Depending on the nature of your application (is it fully dynamic, or more on the static side?), you could also consider Using Elastic Beanstalk with Amazon CloudFront instead of an ALB. CloudFront can also easily be set up to use HTTPS between clients and CloudFront, but the catch is that traffic between CloudFront and your EB instance would go over the internet unencrypted (HTTP). Obviously, you could make that leg HTTPS too, but that requires further changes and configuration which do not fall into the category of "easy ways".

NGINX, The Edge, HAProxy

I was going through the Uber Engineering website where I came across this paragraph, and it confused me a lot. If anyone can make it clear for me, I would be thankful:
The Edge: The frontline API for our mobile apps consists of over 600 stateless endpoints that join together multiple services. It routes incoming requests from our mobile clients to other APIs or services. It’s all written in Node.js, except at the edge, where our NGINX front end does SSL termination and some authentication. The NGINX front end also proxies to our frontline API through an HAProxy load balancer.
This is the link.
NGINX is already a reverse proxy + load balancer, so where does the HAProxy load balancer come into the picture, and where exactly does it fit? What is "the edge" they talked about? Either the author wrote this in a confusing way, or I'm misreading it.
Please help.
It seems like they're using HAProxy strictly as a load balancer, and NGINX strictly to terminate SSL and handle authentication. As you mentioned, NGINX has load-balancing capabilities, so it isn't necessary in most cases to use HAProxy alongside it; but being Uber, they probably ran into some unique problems that required the use of both. According to what I've read, such as http://www.loadbalancer.org/blog/nginx-vs-haproxy/ and https://thehftguy.com/2016/10/03/haproxy-vs-nginx-why-you-should-never-use-nginx-for-load-balancing/, NGINX works extremely well as a web server, including the use case where it serves as a reverse proxy for a Node application, but its load-balancing capabilities are basic and not nearly as performant as HAProxy's. Additionally, HAProxy exposes many more metrics for monitoring and has more advanced routing capabilities.
Load balancing is not the core feature of NGINX. In the context of a Node.js application, what you would usually see NGINX used for is to act as a reverse proxy: NGINX is the web server, and HTTP requests come through it. Then, based on the hostname and other rules, it forwards each HTTP request to whatever port your Node.js application is running on. As part of this flow, NGINX will often handle SSL termination, so that this computationally intensive task is not handled by Node.js. Additionally, NGINX is often used to serve static assets for Node.js apps, as it is more efficient at this, especially when compressing assets.
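
As a sketch of the Node side of that flow (port and trust setting are illustrative; the NGINX configuration itself is omitted here), the app binds only to localhost and trusts the headers the local proxy sets:

```javascript
const express = require('express');
const app = express();

// Only trust proxy headers coming from the local NGINX instance.
app.set('trust proxy', 'loopback');

app.get('/', (req, res) => {
  // With SSL terminated at NGINX, req.protocol reflects X-Forwarded-Proto.
  res.send(`served via ${req.protocol}`);
});

// Bind to 127.0.0.1 so the app is only reachable through NGINX.
app.listen(3000, '127.0.0.1');
```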

AWS Load Balancer Proxy for Node.js

I have configured the load balancer to route requests to two EC2 instances running a Node.js server. I need to direct requests arriving on both HTTP (port 80) and HTTPS (port 443) to HTTP (port 80) of the EC2 instances. I have uploaded the SSL certificate to AWS and configured the load balancer to use it. The problem is that requests arriving over HTTP aren't automatically routed to HTTPS. That redirect has to happen in a server-side script in server.js; I tried to write it and ran into endless redirection. So, questions:
Is there any guide from AWS on how to do this?
If not, how can one achieve this? Any pointers or suggestions would be greatly appreciated.
On the server side you can check the X-Forwarded-Proto header (the original request protocol), and if its value is http you can send a redirect (HTTP 302) to the same URL with the https protocol.
Though with an ALB (Application Load Balancer) you can specify a set of listener rules, so it may be possible to do the redirect there.
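
A minimal sketch of that header check, assuming Express. Checking the header directly is what avoids the endless redirect loop from the question: the ELB terminates SSL, so every request reaches Node over plain HTTP, and the original scheme survives only in X-Forwarded-Proto.

```javascript
const express = require('express');
const app = express();

// The ELB terminates SSL, so every request arrives here over HTTP;
// the original scheme is only available in the X-Forwarded-Proto header.
app.use((req, res, next) => {
  if (req.headers['x-forwarded-proto'] === 'http') {
    return res.redirect(302, `https://${req.headers.host}${req.originalUrl}`);
  }
  next(); // already https (or a direct health check): serve normally
});

app.get('/', (req, res) => res.send('secure content'));

app.listen(80);
```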
I couldn't find a guide from AWS, but I will keep searching and update the answer in case I find one.
Usually, when you write applications in Node.js, you specify which port your app should run on. That means you will need two different servers listening: when your app receives a request on port 80 (HTTP), it should redirect to your HTTPS server, like in this answer and in the sketch below.
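
A sketch of that two-listener setup, for the case where Node terminates TLS itself rather than a load balancer (certificate paths are placeholders):

```javascript
const fs = require('fs');
const http = require('http');
const https = require('https');
const express = require('express');

const app = express();
app.get('/', (req, res) => res.send('hello over https'));

// The HTTPS listener serves the real application.
https.createServer({
  key: fs.readFileSync('/path/to/privkey.pem'),   // placeholder paths
  cert: fs.readFileSync('/path/to/fullchain.pem'),
}, app).listen(443);

// The plain-HTTP listener exists only to redirect to the HTTPS one.
http.createServer((req, res) => {
  res.writeHead(301, { Location: `https://${req.headers.host}${req.url}` });
  res.end();
}).listen(80);
```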
Another point that may be relevant to your question: in production environments you don't usually expose your Node.js server directly on a public port. You probably want a reverse proxy and load balancer like Nginx or HAProxy in front of it.
If you are using the AWS ALB (Application Load Balancer), they announced built-in HTTP-to-HTTPS redirects today. Take a look: https://exampleloadbalancer.com/redirect_demo.html
Put your ELB behind CloudFront and, in your distribution's settings, select 'Redirect HTTP to HTTPS'.
The following doc will be helpful:
https://docs.aws.amazon.com/waf/latest/developerguide/tutorials-ddos-cross-service-ELB.html
This method has two benefits:
1. Your problem will be solved.
2. You get the benefits of a powerful CDN. For more information about CloudFront, read https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Introduction.html
Update:
You can redirect traffic from HTTP to HTTPS by editing the Listeners settings in your ELB.

Node socket.io on load balanced Amazon EC2

I have a standard LAMP EC2 instance set up and running on Amazon AWS. Having also installed Node.js, socket.io and Express to meet the demands of live updating, I am now at the stage of load balancing the application. That's all working, but my sockets aren't. This is how my setup looks:
                 --- EC2 >> Node.js + socket.io
                /
Client >> ELB --
                \
                 --- EC2 >> Node.js + socket.io
[RDS MySQL - EC2 instances communicate with this]
As you can see, each instance has an installation of Node and socket.io. However, occasionally the Chrome console will show the socket request failing with a 400 and the response {"code":1,"message":"Session ID unknown"}, and I guess this is because it's hitting the other instance.
Additionally, let's say I am on page A and the socket needs to emit to page B; because of the load balancer, these two pages might well be served by different instances (they will both be open at the same time). To my knowledge, something like sticky sessions wouldn't work in that scenario, because both pages would be pinned to their respective instances.
How can I get around this issue? Will I need a whole dedicated instance just for Node? That seems somewhat overkill...
The issues come up when you consider both websocket traffic (roughly layer 4) and HTTP traffic (layer 7) moving across a load balancer that can only inspect one layer at a time. For example, if you set the ELB to balance at layer 7 (HTTP/HTTPS), then websockets will not work at all across the ELB. However, if you set the ELB to balance at layer 4 (TCP), then any fallback HTTP polling requests could end up at any of the upstream servers.
You have two options here. You can figure out a way to effectively load balance both HTTP and websocket requests or find a way to deterministically map requests to upstream servers regardless of the protocol.
The first one is pretty involved and requires another load balancer. A good walkthrough can be found here. It's worth noting that when that post was written, HAProxy didn't have native SSL support; now that it does, it might be possible to remove the ELB entirely, if that's the route you want to go. If so, the second option might be better.
Otherwise you can use HAProxy on its own (or a paid version of Nginx) to implement a deterministic load-balancing mechanism. In this case you would use IP hashing, since socket.io does not provide a route-based mechanism to identify a particular server the way sockjs does. This uses the first 3 octets of the IP address to determine which upstream server gets each request, so unless the user changes IP addresses between HTTP polls, this should work.
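
A related client-side mitigation (a sketch; the transports option is part of the socket.io client API) is to skip the HTTP polling fallback entirely, so there are no polling requests for a layer-4 balancer to scatter across upstreams:

```javascript
const io = require('socket.io-client'); // or the browser global from /socket.io/socket.io.js

// Connect with the websocket transport only: a single long-lived TCP
// connection stays pinned to one upstream, so no polling request can
// land on a server that doesn't know the session.
const socket = io('https://example.com', { transports: ['websocket'] });

socket.on('connect', () => {
  console.log('connected over websocket only');
});
```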
The solution would be for the two (or more) Node.js installs to use a common session source.
Here is a previous question on using Redis as a common session store for Node.js: How to share session between NodeJs and PHP using Redis?
and another
Node.js Express sessions using connect-redis with Unix Domain Sockets
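
Building on those links, a minimal sketch using the socket.io Redis adapter (the package name socket.io-redis and the Redis host are assumptions) so that an emit on one instance reaches sockets connected to any instance:

```javascript
const http = require('http');
const socketio = require('socket.io');
const redisAdapter = require('socket.io-redis'); // assumed adapter package

const server = http.createServer();
const io = socketio(server);

// Every instance points at the same Redis, so broadcasts are relayed
// between all of the load-balanced Node.js servers.
io.adapter(redisAdapter({ host: 'redis.internal', port: 6379 }));

io.on('connection', (socket) => {
  socket.on('message', (msg) => {
    io.emit('message', msg); // reaches clients on every instance
  });
});

server.listen(3000);
```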

Single domain on multiple servers

I have a domain that needs to be spread across several servers for load-balancing purposes.
My application also decides which server is supposed to handle certain requests.
Right now I have it set up to use subdomains like www1 and www2 and just redirect to each server, but that is ugly.
I need a way to proxy the requests so users see only www all the time, regardless of which IP is actually serving the request.
I read a bit about Apache's proxy features, but I am still confused about how such a setup would deliver the page and resources like videos without changing the www hostname.
You can enter multiple IP addresses per subdomain in your DNS table. If your DNS server supports it, you can rotate these entries on each request to get a simple round-robin load balancer (see http://en.wikipedia.org/wiki/Round-robin_DNS).
However, a much better solution is to have a load-balancing server that handles all requests to your website. This way you can add and remove web servers from the rotation instantly, so when you need to do maintenance on one server you just take it out of the rotation.
Many load balancers also check if the web servers are still alive and remove dead servers automatically. This will increase your uptime significantly.
