Scaling socket.io with HAProxy

So far I have had a single Node.js app running socket.io. As the number of users has grown, it sits at 100% CPU most of the day, so I decided to split users across multiple Node.js processes. I have split my application logic to allow sharding users across different subdomains. I also replaced the session code with token passing via URL, so cookies are not important.
I'd like to use at least 4 cores of my 8-core machine, so I want to run multiple Node.js processes, each serving the app on its own subdomain. To make all of the Node.js processes reachable on port 80, I decided to use HAProxy. The setup looks like this:
domain.com -> haproxy -> node on 127.0.0.1:5000
sub1.domain.com -> haproxy -> node on 127.0.0.1:5001
sub2.domain.com -> haproxy -> node on 127.0.0.1:5002
sub3.domain.com -> haproxy -> node on 127.0.0.1:5003
Now everything works, but the regular part of the application (the part not using socket.io) is very slow. It's written using Express.js and it is fast when I open the page directly (i.e. not through HAProxy). Also, connecting to socket.io is fast with the XHR transport, but with the WebSocket transport it also takes a long time to establish a connection. Once the connection is established, it works well and fast.
I have never used HAProxy before, so I probably misconfigured something. Here's my HAProxy config:
global
    maxconn 50000
    daemon
defaults
    mode http
    retries 1
    contimeout 8000
    clitimeout 120000
    srvtimeout 120000
frontend http-in
    bind *:80
    acl is_l1 hdr_end(host) -i sub1.domain.com
    acl is_l2 hdr_end(host) -i sub2.domain.com
    acl is_l3 hdr_end(host) -i sub3.domain.com
    acl is_l0 hdr_end(host) -i domain.com
    use_backend b1 if is_l1
    use_backend b2 if is_l2
    use_backend b3 if is_l3
    use_backend b0 if is_l0
    default_backend b0
backend b0
    balance source
    option forwardfor except 127.0.0.1 # stunnel already adds the header
    server s1 127.0.0.1:5000
backend b1
    balance source
    option forwardfor except 127.0.0.1 # stunnel already adds the header
    server s2 127.0.0.1:5001
backend b2
    balance source
    option forwardfor except 127.0.0.1 # stunnel already adds the header
    server s2 127.0.0.1:5002
backend b3
    balance source
    option forwardfor except 127.0.0.1 # stunnel already adds the header
    server s2 127.0.0.1:5003

I figured it out. I failed to find this in the docs, but the global maxconn setting does NOT apply to frontends. Each frontend has a default limit of 2000 concurrent connections, and everything beyond that was queued. Since I have long-lived socket.io connections, this created problems.
The solution is to explicitly set maxconn in the frontend section.
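A minimal sketch of that fix, reusing the frontend name and port from the question. The value 20000 is an illustrative assumption, not a figure from the original post; whatever value is chosen, the global maxconn still caps the process as a whole.

```haproxy
global
    maxconn 50000        # process-wide ceiling; does not raise the per-frontend limit
    daemon

frontend http-in
    bind *:80
    maxconn 20000        # assumed value; without this line the frontend limit stays at 2000
```

The same maxconn line can also go in the defaults section, in which case it applies to every frontend that does not set its own.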

Related

Target specific Azure Web App Instance with Header instead of a Cookie

I have an architecture where I have multiple instances but I want to maximize cache hits.
Users are defined in groups and I want to make sure that all users that belong to the same group hit the same server as much as possible.
The application is fully stateless, but having users from the same group hit the same server will dramatically improve performance and reduce memory load on all instances.
When loading the main page I already know which server I would like to send this user to on the XHR call.
Using the ARRAffinity cookie is almost impossible in this scenario (cross-domain, a server call has to be made first, etc.), and I would strongly prefer sending a hint myself through a custom header.
I'm trying some manual workarounds that delete and reassign the cookies, but it feels very hacky, I haven't got it fully working yet, and it doesn't work for XHR calls.
Question:
Is it possible to direct to a specific instance through a header, url or domain instead of a cookie?
Notes
Distributed cache does not work for me in this case. I need the performance of memory cache without extra network hops and serialization/deserialization.
This seems to be possible with Application Gateway, but it seems to need a lot of extra infrastructure and moving parts, while all my problems would be fixed by sending the "right" header.
I could fix this by duplicating the web app in its entirety and assigning a different hostname, but this also feels like adding a lot of extra moving parts that can break. Maintenance would also be harder and more confusing, I would lose autoscale, etc.
Maybe this can be fixed easily with a Kubernetes/Docker Swarm type of architecture (I have no experience with either), but as this is a large legacy project and I have a pretty strict deadline, I am very cautious about making such a dramatic switch at the last minute.

If I understand correctly, you want to set a custom header in a client application and, based on that header, proxy the connection to a particular backend server.
I like to use HAProxy for that; you could also look into Nginx.
You can install HAProxy on Linux from the distribution's package manager, or you can use the available HAProxy Docker container.
An example of installing it on Arch Linux:
sudo pacman -S haproxy
sudo systemctl start haproxy
Once it's installed, find out where your haproxy.cfg config file is located and replace the existing default config with the haproxy.cfg snippet posted below.
In my case haproxy.cfg was in /etc/haproxy/haproxy.cfg
To achieve what you want, point all clients at this main HAProxy server, which then forwards each connection to one of your backend servers based on the value of a custom header set client-side, for example "x-mycustom-header: server-one". As a bonus, you can also enable sticky sessions in HAProxy if needed, but that is not required for what you are looking for.
Here is a simple example of the HAProxy config file (haproxy.cfg) with only 2 backend servers, but you can add more.
The logic is that all clients make HTTP requests to the HAProxy server listening on port 80. HAProxy then checks the value of the custom header 'x-mycustom-header' that the client added and, based on that value, forwards the request to either backend_server_one or backend_server_two.
For testing purposes, HAProxy and the two backends are on the same box but listen on different ports: HAProxy on port 80, server1 on 127.0.0.1:3000, and server2 on 127.0.0.1:4000.
cat haproxy.cfg
#---------------------------------------------------------------------
# Example configuration. See the full configuration manual online.
#
# http://www.haproxy.org/download/1.7/doc/configuration.txt
#
#---------------------------------------------------------------------
global
    maxconn 20000
    log 127.0.0.1 local0
    user haproxy
    chroot /usr/share/haproxy
    pidfile /run/haproxy.pid
    daemon
frontend main
    bind :80
    mode http
    log global
    option httplog
    option dontlognull
    option http_proxy
    option forwardfor except 127.0.0.0/8
    maxconn 8000
    timeout client 30s
    # -i makes the header value match case-insensitive, so "Server-one" and "server-one" both route correctly
    use_backend backend_server_one if { req.hdr(x-mycustom-header) -i server-one }
    use_backend backend_server_two if { req.hdr(x-mycustom-header) -i server-two }
    default_backend backend_server_one #when the header is something else default to the first backend
backend backend_server_one
    mode http
    balance roundrobin
    timeout connect 5s
    timeout server 5s
    server static 127.0.0.1:3000 #change this ip to your 1st backend ip address
backend backend_server_two
    mode http
    balance roundrobin
    timeout connect 5s
    timeout server 30s
    server static 127.0.0.1:4000 #change this ip to your 2nd backend ip address
To test that this works, open two netcat listeners, one on port 3000 and the other on port 4000, in different screens or different SSH sessions.
nc -l 3000 # in the first screen
nc -l 4000 # in a second screen
Then, after running sudo systemctl reload haproxy to make sure HAProxy has picked up the new config file, make an HTTP GET request on port 80 with the "x-mycustom-header: server-one" header (for example, curl -H "x-mycustom-header: server-one" http://127.0.0.1/).
You will be able to see the request in the output of the netcat instance that is listening on port 3000.
Now change the header to "x-mycustom-header: server-two" and make a second GET request. This time the request reaches the second netcat instance, listening on port 4000, which shows that the routing works.
Tested on ArchLinux

The Microsoft team has responded and confirmed that this is not possible at the moment.

HAProxy not keeping HTTP connection open

I have a Node.js server that uses Server-Sent Events (SSE) to allow push notifications to connected web clients. It works beautifully when the browser talks to Node directly.
However, when I place haproxy in the middle, as it must be for production to meet other requirements, the connections are closed and reopened (thanks to SSE's auto-reconnect) every 30 seconds. I've changed and tried everything I know of and can find online in the haproxy configuration.
Most of the info out there, including their documentation examples, deals with sockets, and there is very little about SSE support. Should HAProxy support persistent HTTP connections for SSE? If so, what is the trick to configuring it?
My config follows:
global
    daemon
    # maximum number of concurrent connections
    maxconn 4096
    # drop privileges after port binding
    user nobody
    group nogroup
    # store pid of process in the file
    pidfile /var/run/haproxy.pid
    # create this socket for stats
    stats socket /var/run/socket-haproxy
defaults
    log global
    mode http
    # disable logging of null connections
    option dontlognull
    # I've tried all these to no avail
    #option http-server-close
    #option httpclose
    option http-keep-alive
    # Add x-forwarded-for header to forward clients IP to app
    option forwardfor
    # maximum time to wait for a server connection to succeed. Can be as low as few msec if Haproxy and server are on same LAN.
    timeout connect 1s
    # maximum inactivity time on client side. Recommended to keep it same as server timeout.
    timeout client 24d
    # maximum time given to server to respond to a request
    timeout server 24d
    # Long timeout for WebSocket connections.
    timeout tunnel 8h
    # timeout for keep alive
    timeout http-keep-alive 60s
    # maximum time to wait for client to send full request. Keep it like 5s for get DoS protection.
    timeout http-request 5s
    # enable stats web interface. very helpful to see what's happening in haproxy
    stats enable
    # default refresh time for web interface
    stats refresh 30s
# this frontend interface receives the incoming http requests and forwards to https then handles all SSL requests
frontend public
    # HTTP
    bind :80
    # by default, all incoming requests are sent to Node.js
    default_backend node_backend
    # redirect to the SSE backend if /ionmed/events (eventum #????)
    acl req_sse_path path_beg /ionmed/events
    use_backend node_sse_backend if req_sse_path
    # redirect to the tomcat backend if Time Clock, ViewerJS, Spell Checker, Tomcat Manager, or eScripts (eventum #1039, #1082)
    acl req_timeclock_path path_beg /TimeClock/
    acl req_manager_path path_beg /manager/
    acl req_spelling_path path_beg /jspellEvolution/
    acl req_escripts_path path_beg /ionmed/escripts
    use_backend tomcat_backend if req_timeclock_path or req_manager_path or req_spelling_path or req_escripts_path
    # for displaying HAProxy statistics
    acl req_stats path_beg /stats
    use_backend stats if req_stats
# node backend, transfer to port 8081
backend node_backend
    # Tell the backend that this is a secure connection,
    # even though it's getting plain HTTP.
    reqadd X-Forwarded-Proto:\ https
    server node_server localhost:8081
# node SSE backend, transfer to port 8082
backend node_sse_backend
    # Tell the backend that this is a secure connection,
    # even though it's getting plain HTTP.
    reqadd X-Forwarded-Proto:\ https
    server node_sse_server localhost:8082
# tomcat backend, transfer to port 8888
backend tomcat_backend
    # Tell the backend that this is a secure connection,
    # even though it's getting plain HTTP.
    reqadd X-Forwarded-Proto:\ https
    server tomcat_server localhost:8888

HAProxy Configuration - How to make TCP connection sticky (Node.js, socket.io, websocket, FlashSocket)

I have set up HAProxy on an EC2 server where I run two Node.js servers, on ports 3005 and 3006, for our multiplayer game. We use socket.io for real-time event updates between the client side and the server side. HAProxy works correctly with "balance source" (my working HAProxy configuration is below), but the problem with source balancing is that all events go to the same server every time. I have 40 computers on my network, and the events from all 40 go to port 3005. This does not change when I come back the next day. I would like to make TCP connections sticky using TCP mode in HAProxy. Is there any way to do that with balance roundrobin? We also tried using a cookie, but that does not work in our case because we use mode tcp.
We also have a Flash game that loads its Flash policy from port 3843.
Here is my HAProxy configuration:
global
    debug
    log 127.0.0.1 local0 # Enable per-instance logging of events and traffic.
    log 127.0.0.1 local1 notice # only send important events
    nbproc 1
    maxconn 65536
    pidfile /var/run/haproxy.pid
defaults
    log global
    srvtimeout 300s
    timeout connect 5s
    timeout queue 5s
    timeout server 1h
    timeout tunnel 1h
frontend flash_policy
    bind 0.0.0.0:843
    timeout client 5s
    default_backend nodejs_flashpolicy
frontend wwws
    bind 0.0.0.0:3000 ssl crt /home/certificate/final.crt
    timeout client 1h
    default_backend flashsocket_backend
    tcp-request inspect-delay 500ms
    tcp-request content accept if HTTP
    use_backend flashsocket_backend if !HTTP
backend flashsocket_backend
    mode tcp
    option log-health-checks
    balance source
    cookie JSESSIONID insert indirect nocache
    server 3006Game serverip:3006 cookie socket1 weight 1 maxconn 32536 check
    server 3005Game serverip:3005 cookie socket2 weight 1 maxconn 32536 check
backend nodejs_flashpolicy
    server flashpolicy serverip:3843 weight 1 maxconn 65536 check
# Configuration for HAProxy Stats
listen stats :1900
    mode http
    timeout client 1h
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /
    stats auth alpesh:alpesh

It is possible by using the following options in the backend:
stick-table type ip size 50k expire 10m
stick on src
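As a rough sketch, the two lines above could sit in the question's flashsocket_backend like this. The server lines and the serverip placeholder are copied from the question; switching to balance roundrobin and dropping the cookie options (which have no effect in mode tcp) are assumptions about what the asker wants, not something the answer spelled out.

```haproxy
backend flashsocket_backend
    mode tcp
    option log-health-checks
    balance roundrobin                        # new client IPs are spread across both servers
    stick-table type ip size 50k expire 10m   # remembers client IP -> server; entries expire after 10 minutes idle
    stick on src                              # a known source IP is sent back to its remembered server
    server 3006Game serverip:3006 weight 1 maxconn 32536 check
    server 3005Game serverip:3005 weight 1 maxconn 32536 check
```

Note that stickiness is keyed on the source IP, so clients that reach HAProxy through the same public address (for example, many machines behind one NAT) will still all land on the same server.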

HAProxy mangling Socket.IO request - reserved fields must be empty

Hoping someone can help me.
I'm using NodeJS v0.8.16, Socket.IO v0.9.13 and HAProxy 1.5-dev17.
My setup is on Amazon AWS using a VPC, HAProxy on a public facing instance and NodeJS on a separate instance which normally is not publicly accessible. I do have an IP address on it for testing.
I have a test setup which logs into the NodeJS server, then opens an authenticated websocket through Socket.IO, session details are saved in Redis to share between them. Once the websocket is connected successfully a request is made through NodeJS which prompts an event to be emitted back to the client.
This flow works correctly when the test references the NodeJS instance directly bypassing HAProxy. When it does route through HAProxy I get the errors from Socket.IO
"reserved fields must be empty"
and
"no handler for opcode " referencing a random opcode
From what I can see, there is an initial byte that Socket.IO parses using bitmasks to work out what the request is for. After routing through HAProxy, this value is no longer an expected one, and these errors are thrown.
My HAProxy configuration is from another StackOverflow question, "HAProxy + WebSocket Disconnection":
global
    maxconn 4096 # Total Max Connections. This is dependent on ulimit
    nbproc 2
defaults
    mode http
    retries 3
    option redispatch
    option http-server-close
frontend all 0.0.0.0:3000
    timeout client 5000
    default_backend www_backend
    acl is_websocket hdr(Upgrade) -i WebSocket
    acl is_websocket hdr_beg(Host) -i ws
    use_backend socket_backend if is_websocket
backend www_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout server 5000
    timeout connect 4000
    server server 10.0.0.214:3000 weight 1 maxconn 1024 check
backend socket_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout queue 5000
    timeout server 5000
    timeout connect 5000
    timeout tunnel 3600s
    timeout http-keep-alive 1s
    timeout http-request 15s
    server server1 10.0.0.214:3000 weight 1 maxconn 1024 check
I've also tried various other HAProxy configurations but end up with the same result.
Has anyone come across this issue? I'm not sure what I've done incorrectly.

multiple varnish servers with haproxy

We're planning to add a second Varnish server to our infrastructure.
What is the best way to balance traffic across the two servers? I think we can put HAProxy in front of them, but how should it be configured to load-balance traffic between the 2 Varnish servers? Ideally, if one Varnish is down, all traffic goes to the other.
Edit: The ideal behaviour is an active/active configuration with 50% of the load each, and if one goes down HAProxy sends 100% of the load to the other.

Another option would be to put both Varnish instances in the same backend, but balance requests by URI or URL parameter.
This lets you keep both active in the same backend and still maintain a high cache-hit ratio, since the same URI will always be balanced to the same Varnish cache (as long as it is available). balance uri also takes an optional length parameter (len) that limits how many characters of the URI are used when hashing.
Here's a quick example:
frontend http-in
    acl my_acl <whatever acl selects this traffic>
    use_backend varnish if my_acl
backend varnish
    mode http
    balance uri
    option httpchk GET /check.txt HTTP/1.0
    server varnish1 192.168.222.51:6081 check inter 2000
    server varnish2 192.168.222.52:6081 check inter 2000
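A possible variant using the optional length parameter mentioned above. The value 64 is an arbitrary assumption for illustration, and the backend name varnish is chosen to match the use_backend line in the example.

```haproxy
backend varnish
    mode http
    balance uri len 64          # hash only the first 64 characters of the URI (assumed value)
    option httpchk GET /check.txt HTTP/1.0
    server varnish1 192.168.222.51:6081 check inter 2000
    server varnish2 192.168.222.52:6081 check inter 2000
```

Because both servers carry check, a server that fails /check.txt is taken out of rotation and its URIs are served by the remaining one until it recovers.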

The idea I came up with is to use two backends, each listing the other server as a backup, so that when one server is down all requests go to the one that is still alive.
frontend http-in
    acl my_acl <whatever acl to split the traffic>
    use_backend varnish2 if my_acl
    default_backend varnish1
backend varnish1
    mode http
    option httpchk GET /check.txt HTTP/1.0
    server varnish1 192.168.222.51:6081 check inter 2000
    server varnish2 192.168.222.52:6081 check inter 2000 backup
backend varnish2
    mode http
    option httpchk GET /check.txt HTTP/1.0
    server varnish2 192.168.222.52:6081 check inter 2000
    server varnish1 192.168.222.51:6081 check inter 2000 backup
