How to use an AWS Elastic Load Balancer with a React HTML5 page - node.js

I have a React web app that plays HTML5 video, hosted on EC2. I have two instances running, and I can view the web page on each of the two instances. If I set up an ELB that points to a single instance and use the load balancer's URL, the web page also works. However, with both machines in the target group, the page fails about 50% of the time when going through the load balancer's URL.
This is the error in __webpack_require__:
TypeError: modules[moduleId] is undefined
modules[moduleId]: undefined
modules: (1) […]
modules["./node_modules/#babel/runtime/helpers/arr…dules/#babel/runtime/helpers/arrayWithoutHoles.js(module, exports)
installedModules: {…}
moduleId: "./node_modules/babel-preset-react-app/node_modules/#babel/runtime/helpers/esm/objectWithoutProperties.js"
An internet search for an undefined modules[moduleId] turns up results that are all over the place, and I'm not sure I know enough to ask this question properly.
It looks like some people use nginx instead (https://www.freecodecamp.org/news/production-fullstack-react-express/), which presumably means not using the AWS ELB; I expect nginx would be much harder to manage than the AWS option.

Most streaming protocols require session affinity. Try setting up sticky sessions on your load balancer. Here's how to do it for a classic ELB: https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-sticky-sessions.html
Unfortunately that pins all the other traffic as well, but it's better than a 50% failure rate.
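If you would rather script it than click through the console, a minimal sketch with the AWS SDK for JavaScript looks like this (the load balancer name, policy name, port and region are placeholders for your own values):

    // Sketch: enable duration-based (LB-generated cookie) stickiness on a classic ELB.
    const AWS = require('aws-sdk');
    const elb = new AWS.ELB({ region: 'us-east-1' });

    async function enableStickySessions() {
      // Create a load-balancer-generated cookie stickiness policy (1 hour expiry).
      await elb.createLBCookieStickinessPolicy({
        LoadBalancerName: 'my-elb',          // placeholder name
        PolicyName: 'my-sticky-policy',      // placeholder name
        CookieExpirationPeriod: 3600
      }).promise();

      // Attach the policy to the HTTP listener on port 80.
      await elb.setLoadBalancerPoliciesOfListener({
        LoadBalancerName: 'my-elb',
        LoadBalancerPort: 80,
        PolicyNames: ['my-sticky-policy']
      }).promise();
    }

    enableStickySessions().catch(console.error);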

Related

Find external IP address of node js server in all scenarios

I have built a Node.js API server. Part of one feature is that when it starts, it should be able to determine its own IP, regardless of how the server it runs on is set up.
The classic scenario is not that hard (I think). There are several options, like using the os module to find the IP of the external interface. I am sure there are other ways and some might be better, but this is how I have been doing it so far. Feel free to add alternatives, as informative as possible.
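For reference, that os-module approach is roughly the following sketch; it just returns the first non-internal IPv4 address the OS reports:

    // Sketch: pick the first non-internal IPv4 address reported by the OS.
    const os = require('os');

    function getExternalInterfaceIp() {
      const interfaces = os.networkInterfaces();
      for (const name of Object.keys(interfaces)) {
        for (const addr of interfaces[name]) {
          // "internal" is true for loopback addresses such as 127.0.0.1
          if (addr.family === 'IPv4' && !addr.internal) {
            return addr.address;
          }
        }
      }
      return null;
    }

    console.log(getExternalInterfaceIp());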
Then there is a case I stumbled on: the web server was running on a Google Cloud instance. This instance has two IPs, one internal and one external. What I want is the external IP. However, when I use the method above, the actual external IP is not part of the object returned; the internal IP is the one reported as non-internal. Even when I run various commands from within the server's command line, the only IP returned is the one that is actually internal and cannot be used to access the Node server.
From what I understand, the instance itself is not aware of its external IP. There might be a DNS layer (I think) that redirects requests made to the external IP towards the correct instance.
While reading around the internet, I saw that problems getting the server's correct external IP can also arise when using load balancing or proxies.
The solution I thought about is to have the Node.js code make a request towards a service that I will build. This service will treat the Node.js servers as clients and return their external IPs. From experiments I have done, the req object contains, among other things, the client's IP. So I should check first req.connection.remoteAddress and then the first element of req.headers['x-forwarded-for']. Ideally the server would make a request towards itself, but...
I know there are external APIs like https://api.ipify.org?format=json that do just that: return the actual IP. But I would very much like to keep the Node.js servers independent of services I cannot control.
However, I really am hoping that there are better solutions out there than making a request from the server which returns the server IP.
However, I really am hoping that there are better solutions out there than making a request from the server which returns the server IP.
It is not really possible; you always rely on some kind of external observer / external request.
While reading around the internet, I saw that problems getting the server's correct external IP can also arise when using load balancing or proxies.
This is because your device is not able to discover its own external IP in every scenario. It might be sitting behind some network, meaning the external address is assigned to a device that forwards WAN traffic to it (for example, a router). So when you try to obtain the external IP from the device's own interface, you end up with an IP inside the scope of the router's LAN, not the one used for external requests.
So if you really want a method that works in all scenarios and don't want to rely on 3rd-party services, the only solution is to build your own IP echo service (which you maintain and can reuse for future projects).
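A minimal sketch of such an echo service in plain Node.js (no framework; port and deployment details are up to you). It prefers the first entry of x-forwarded-for in case the caller sits behind a proxy, and falls back to the socket's remote address:

    // Sketch: a tiny "what is my IP" echo service.
    const http = require('http');

    http.createServer((req, res) => {
      // If the caller came through a proxy or load balancer, x-forwarded-for
      // carries the original client IP as its first entry.
      const forwarded = req.headers['x-forwarded-for'];
      const ip = forwarded
        ? forwarded.split(',')[0].trim()
        : req.socket.remoteAddress; // same as req.connection.remoteAddress

      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ ip }));
    }).listen(3000);

Your API servers then simply request this endpoint at startup and read the ip field of the response.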

How to get haproxy to use a specific cluster computer via the URI

I have successfully set up haproxy on my server cluster. I have run into one snag that I can't find a solution for...
TESTING INDIVIDUAL CLUSTER COMPUTERS
It can happen that for one reason or another, one computer in the cluster gets a configuration variation. I can't find a way to tell haproxy that I want to use a specific computer out of a cluster.
Basically, mysite.com (and several other domains) are served up by boxes web1, web2 and web3. And they round-robin perfectly.
I want to add something to the URL to tell haproxy that I specifically want to talk to web2 only because in a specific case, only that server is throwing an error on one web page.
Does anyone know how to do that without building a new cluster with a URI filter and having only one computer in that cluster? I am hoping to use the cluster as-is but add something to the URI that will tell haproxy which server to use out of the cluster.
Thanks!
Have you thought about using a different port for this? You could define a new listen section with a different port, since, as I understand it, you can modify your URL by any means.
Basically, haproxy cannot do what I was hoping. There is no way to add a param to the URL to suggest which host in the cluster to use.
I solved my testing issue by setting up unique ports for each server in the cluster at the firewall. This could also be done at the haproxy level.
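For reference, doing it at the haproxy level would look something like this sketch (the port, server name and address are made-up values), with one extra listen section per server you want to reach directly:

    # Sketch: a dedicated test entry point that always hits web2
    listen web2_direct
        bind *:8082
        server web2 10.0.0.12:80 check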
To secure this path from the outside world, I told the firewall to only accept traffic from inside our own network.
This lets us test specific servers within the cluster. We did have to add a trap in our PHP app to deal with a session cookie that is too large, because we have haproxy manipulating this cookie to keep users on the server they first hit. So when the invalid session cookie is detected, we simply have the page drop the session and reload.
This is working well for our testing purposes.

502 Bad Gateway on Google Compute Engine, Nginx, and Meteor

I built a Meteor app and would like to run it on Google Compute Engine. I followed the guide found here to get a basic instance of my app up and running, changing the disk size/type, the instance type, and the disk and instance zones (to both match where I live), and adding export METEOR_SETTINGS={ ... } to the second-to-last line of the startup.sh file.
Everything seemed to work fine: I have the persistent disk and VM instance listed on my Google Cloud dashboard. I created a new firewall rule on my default network to allow incoming tcp:80 and tcp:443 traffic, but now when I access the instance's external IP address from my browser, I'm shown a 502 Bad Gateway nginx/1.8.0 page (where I'd instead expect the homepage of my Meteor app).
Is there anything in the configuration details in the startup.sh file that I'm missing or should modify? Could there be an issue with how the compute vm instance is communicating with the persistent disk? Frankly I'm far outside of my domain with this type of thing.
After ssh-ing into my instance and dinking around a bit, I set export ROOT_URL='<the_instances_external_ip>' rather than 'http://localhost', at which point everything started to work. It's unfortunate how little documentation there is on getting Meteor apps configured and running in production (I only thought to mess with ROOT_URL after searching for something unrelated), so hopefully this will at least be helpful for someone else.

Node socket.io on load balanced Amazon EC2

I have a standard LAMP EC2 instance set-up running on Amazon's AWS. Having also installed Node.js, socket.io and Express to meet the demands of live updating, I am now at the stage of load balancing the application. That's all working, but my sockets aren't. This is how my set-up looks:-
                 --- EC2 >> Node.js + socket.io
                /
Client >> ELB --
                \
                 --- EC2 >> Node.js + socket.io
[RDS MySQL - EC2 instances communicate to this]
As you can see, each instance has an installation of Node and socket.io. However, Chrome's debug console will occasionally show the socket request failing with a 400 and the reason {"code":1,"message":"Session ID unknown"}, and I guess this is because it's communicating with the other instance.
Additionally, let's say I am on page A and the socket needs to emit to page B - because of the load balancer these two pages might well be on a different instance (they will both be open at the same time). Using something like Sticky Sessions, to my knowledge, wouldn't work in that scenario because both pages would be restricted to their respective instances.
How can I get around this issue? Will I need a whole dedicated instance just for Node? That seems somewhat overkill...
The issues come up when you consider both websocket traffic (layer 4-ish) and HTTP traffic (layer 7) moving across a load balancer that can only inspect one layer at a time. For example, if you set the ELB to load balance on layer 7 (HTTP/HTTPS), then websockets will not work at all across the ELB. However, if you set the ELB to load balance on layer 4 (TCP), then any fallback HTTP polling requests could end up at any of the upstream servers.
You have two options here. You can figure out a way to effectively load balance both HTTP and websocket requests or find a way to deterministically map requests to upstream servers regardless of the protocol.
The first one is pretty involved and requires another load balancer. A good walkthrough can be found here. It's worth noting that when that post was written, HAProxy didn't have native SSL support. Now that it does, it might be possible to just remove the ELB entirely, if that's the route you want to go; in that case the second option might be better.
Otherwise you can use HAProxy on its own (or a paid version of Nginx) to implement a deterministic load-balancing mechanism. In this case you would use IP hashing, since socket.io does not provide a route-based mechanism to identify a particular server the way sockjs does. This would use the first 3 octets of the IP address to determine which upstream server gets each request, so unless the user changes IP addresses between HTTP polls, this should work.
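As a rough haproxy sketch of that deterministic approach (the backend name, server names and addresses are placeholders), source hashing keeps each client IP pinned to the same upstream for both polling and websocket traffic:

    # Sketch: hash on the client source IP so every request from the same
    # client lands on the same Node.js + socket.io instance
    backend socketio_nodes
        balance source
        server node1 10.0.0.11:3000 check
        server node2 10.0.0.12:3000 check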
The solution would be for the two (or more) Node.js installs to use a common session source.
Here is a previous question on using Redis as a common session store for Node.js: How to share session between NodeJs and PHP using Redis?
And another: Node.js Express sessions using connect-redis with Unix Domain Sockets
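A rough sketch of that shared-session idea with express-session and connect-redis (the Redis host and secret are placeholders, and the exact store options vary between connect-redis versions); the point is simply that every instance reads and writes sessions from the same Redis:

    // Sketch: both Node.js instances keep their sessions in one shared Redis,
    // so it no longer matters which EC2 box the load balancer picks.
    const express = require('express');
    const session = require('express-session');
    const RedisStore = require('connect-redis')(session); // older connect-redis API

    const app = express();

    app.use(session({
      // 'my-redis-host' is a placeholder; newer connect-redis versions take a
      // pre-built redis client instead of host/port options.
      store: new RedisStore({ host: 'my-redis-host', port: 6379 }),
      secret: 'replace-with-a-real-secret',
      resave: false,
      saveUninitialized: false
    }));

    app.listen(3000);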

Can I use Amazon ELB instead of nginx as load balancer for my Node.Js app?

I have a Node.js app, and I've seen a lot of posts here on SO saying that it needs to be behind nginx as a load balancer. I'm already accustomed to Amazon's services, hence my question.
Yes, but there are a few gotchas to keep in mind:
If you have a single server, ensure you don't return anything except a 200 for the page that the ELB uses to check health. We had a 301 redirect from our non-www to our www site, and that made the ELB not send anything to our server.
You'll get the ELB's IP instead of the client's in your logs. There is an nginx real_ip module, but it takes some config hacking to get it to work.
ELB works great in front of a basic Node.js application. If you want WebSockets, you need to configure it for TCP balancing. TCP balancing doesn't support sticky sessions though, so you get one or the other.
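On the client-IP gotcha above: if the Node.js app sits directly behind the ELB (no nginx in between), a small Express sketch like this is usually enough to log the real client address, since the ELB's HTTP listeners add an X-Forwarded-For header:

    // Sketch: make Express derive req.ip from X-Forwarded-For
    // instead of reporting the ELB's own address.
    const express = require('express');
    const app = express();

    app.set('trust proxy', true);

    app.get('/', (req, res) => {
      console.log('client ip:', req.ip); // the original client, not the ELB
      res.send('ok');
    });

    app.listen(3000);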

Resources