Is Redis limiting the number of clients to 65K?

I am using Redis 2.4.6 Stable.
I have increased the number of redis file descriptors in file ae.h to over 200K:
#define AE_SETSIZE (1024*200)
But when running it I am reaching a limit of 65534.
I am running redis on ec2 on a RedHat instance: 2.6.32-220.2.1.el6.x86_64
and I am running redis with a ulimit -n 200000
I have set up tests with multiple ec2 nodes that try to push the concurrent connections to over 150K, but the count will not go beyond 65K.
Any ideas what I could be missing? Maybe a kernel limitation, or a bug in redis?
This is a dump of INFO on the redis server:
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
**connected_clients:65534**
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:572810560
used_memory_human:546.27M
used_memory_rss:305123328
used_memory_peak:572810528
used_memory_peak_human:546.27M
mem_fragmentation_ratio:0.53

Are you running afoul of network port limitations? Depending on how the clients close their connections, you could simply be running out of ports, because closed connections get stuck in the TIME_WAIT state.
If that is the case, one way around it is to bind multiple internal IPs to the server and distribute the connections across them. Alternatively, if you can modify the client you're using, make sure it closes the connection in such a way that the client takes on the burden of TIME_WAIT.
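To see whether TIME_WAIT is actually piling up, the connection states can be tallied on the server. The following is a minimal sketch, assuming a Linux host where `/proc/net/tcp` is readable (it lists IPv4 sockets only); `countTcpStates` is a hypothetical helper name and the sample rows are made up. The hex state codes used (01 ESTABLISHED, 06 TIME_WAIT, 0A LISTEN) are the kernel's.

```javascript
// Hypothetical helper: tallies TCP states from the text of /proc/net/tcp
// (Linux only). The fourth column of each row is the state as a hex code.
const STATE_NAMES = { '01': 'ESTABLISHED', '06': 'TIME_WAIT', '0A': 'LISTEN' };

function countTcpStates(procNetTcpText) {
  const counts = {};
  const rows = procNetTcpText.trim().split('\n').slice(1); // skip header row
  for (const row of rows) {
    const fields = row.trim().split(/\s+/);
    if (fields.length < 4) continue;
    const name = STATE_NAMES[fields[3]] || `OTHER(${fields[3]})`;
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

// Made-up sample rows, truncated to the first four columns.
// 18EB is port 6379 in hex.
const sampleTcp = [
  '  sl  local_address rem_address   st',
  '   0: 0100007F:18EB 00000000:0000 0A',
  '   1: 0100007F:18EB 0100007F:C350 01',
  '   2: 0100007F:C351 0100007F:18EB 06',
  '   3: 0100007F:C352 0100007F:18EB 06',
].join('\n');
console.log(countTcpStates(sampleTcp));
```

On a real host the same function could be fed `fs.readFileSync('/proc/net/tcp', 'utf8')`. A large TIME_WAIT count relative to ESTABLISHED would support the port-exhaustion theory.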

Related

How to find number of http(s) connections (TCP) opened by my node JS micro-service (using axios KeepAlive for http) config, in GKE k8s environment?

Problem Description/Context
I have a Node.js-based application that uses Axios to make outbound HTTP requests (REST API calls) against a web service (say https://any.example.restapis.com). These requests occasionally took more than 1 minute. After some debugging, we tried the httpsAgent property to keep the HTTP connections alive (persistent); that did the trick, and now the API calls take less than 1 second and the application is working OK. My understanding is that with this property the TCP connections used by the HTTP calls are now persistent, and the httpsAgent opens multiple socket connections against the web service (i.e., it keeps connections alive based on its default configuration and opens more TCP connections as the load requires, basically maintaining a pool of connections).
httpsAgent: new https.Agent({ keepAlive: true }),
Question
We are not yet sending 100% of the traffic to the micro-service (just 1%). So I would like to understand in detail what is happening underneath, to make sure the fix is indeed complete and my micro-service will scale to full traffic.
So, can anyone please tell me how, after SSHing into the pod's container, I can check whether my Node.js application is indeed opening a number of TCP (socket) connections against the web service, rather than using a single TCP connection and keeping it alive? (I tried the netstat -atp command as shown below, but I was not able to make the connection.) It would be great if anyone could help me check the number of TCP connections made by my micro-service.
// example cmd -
// Looking at cmds like netstat, lsof as they may (hoping!) give me details that I want!
netstat -atp | grep <my process ID>
In a microservices architecture, the number of server to server connections increases dramatically compared to alternative setups. Interactions which would traditionally have been an in-memory process in one application now often rely on remote calls to other REST based services over HTTP, meaning it is more important than ever to ensure these remote calls are both fast and efficient.
The netstat command is used to show network status.
# netstat -at : To list all tcp ports.
# netstat -lt : To list only the listening tcp ports.
It is used more for problem determination than for performance measurement. However, the netstat command can be used to determine the amount of traffic on the network to ascertain whether performance problems are due to network congestion.
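Besides netstat, the pool can also be inspected from inside the Node.js process. This is a minimal sketch, assuming you have access to the same `https.Agent` instance that is passed to Axios as `httpsAgent`; `countAgentSockets` is a hypothetical helper name. It relies on the `sockets`, `freeSockets` and `requests` properties of Node's `Agent`, each of which is an object keyed by origin and holding an array.

```javascript
const https = require('https');

// Hypothetical helper: sums the entries per state across all origins
// tracked by an Agent.
function countAgentSockets(agent) {
  const total = (byOrigin) =>
    Object.values(byOrigin).reduce((n, list) => n + list.length, 0);
  return {
    active: total(agent.sockets),   // sockets currently serving a request
    idle: total(agent.freeSockets), // kept-alive sockets waiting for reuse
    queued: total(agent.requests),  // requests waiting for a free socket
  };
}

// Assumed to be the same agent instance handed to Axios as `httpsAgent`.
const agent = new https.Agent({ keepAlive: true });
const counts = countAgentSockets(agent);
console.log(counts);
```

Logging these counts periodically under load should show whether `active + idle` grows beyond 1, which would indicate a pool of connections rather than a single reused one.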

What should be the ip and port for connecting redis-cluster?

I have a situation to deal with involving redis-cluster. We want to move to redis-cluster for high availability. Currently we have one transaction server, and we are using redis for managing mini-statements. We have a single instance of redis running on the default port, bound to 0.0.0.0. In my transaction server, I have a configuration file in which I put the redis ip and port for the connection.
My Question:
1) Suppose I have two machines running redis server, and I want my transaction server to automatically use the second machine if the first one dies, with all the keys still available. What ip and port should I configure in my transaction server config file, and how should redis be set up to achieve this?
A suggestion or a link will be helpful!
If you are looking for a high availability solution for Redis, you might want to look into Redis Sentinel rather than cluster.
Redis Sentinel offers exactly what you need, you can see the official document for more information.

Websocket (node.js) connection limit, clients are getting disconnected after reaching 400-450 connections

I have a big problem with the socket.io connection limit. Once there are more than 400-450 connected clients (connecting from browsers), users start getting disconnected. I increased the soft and hard limits for tcp but it didn't help.
The problem only occurs for browsers. When I connected with the socket-io-client module from another node.js server, I reached 5000 connected clients.
It's a very big problem for me and it has totally blocked me. Please help.
Update
I have tried with standard Websocket library (ws module with node.js) and problem was similar. I can reach only 456 connected clients.
Update 2
I divided the connected clients between a few instances of the server. Each group of clients connected on a different port. Unfortunately this change didn't help. The total number of connected users was the same as before.
Solved (2018)
There were not enough open ports for the Linux user that runs the pm2 manager ("pm2" or "pm" username).
You may be hitting a limit in your operating system. There are security limits in the number of concurrent files open, take a look at this thread.
https://github.com/socketio/socket.io/issues/1393
Update:
I wanted to expand this answer because I was answering from mobile before. Each new connection that gets established is going to open a new file descriptor under your node process. Of course, each connection is going to use some portion of RAM. You would most likely run into the FD limit first before running out of RAM (but that depends on your server).
Check your FD limits: https://rtcamp.com/tutorials/linux/increase-open-files-limit/
And lastly, I suspect your single-client concurrency test was not using the correct flag to force new connections. If you want to test concurrent connections from one client, you need to set a flag when connecting:
var socket = io.connect('http://localhost:3000', {'force new connection': true});
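The file descriptor limit mentioned above can also be read from inside the Node.js process, which helps when the process is started by a manager running as a different user. This is a minimal sketch, assuming a Linux host where `/proc/self/limits` exists; `parseOpenFileLimit` is a hypothetical helper name and the sample values are made up.

```javascript
const fs = require('fs');

// Hypothetical helper: extracts the "Max open files" row from the text of
// /proc/<pid>/limits (Linux only).
function parseOpenFileLimit(limitsText) {
  const m = /^Max open files\s+(\S+)\s+(\S+)/m.exec(limitsText);
  if (!m) return null;
  const toNum = (v) => (v === 'unlimited' ? Infinity : Number(v));
  return { soft: toNum(m[1]), hard: toNum(m[2]) };
}

// Sample text in the layout of /proc/<pid>/limits; the values are made up.
const sampleLimits = [
  'Limit                     Soft Limit           Hard Limit           Units',
  'Max processes             31204                31204                processes',
  'Max open files            1024                 4096                 files',
].join('\n');
console.log(parseOpenFileLimit(sampleLimits));

// On Linux, the running Node process can read its own effective limits.
if (fs.existsSync('/proc/self/limits')) {
  console.log(parseOpenFileLimit(fs.readFileSync('/proc/self/limits', 'utf8')));
}
```

The soft limit is the one that caps the process. If it is far below the number of clients you expect, connections will start failing once it is reached.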

socket.io max connection test on multicore machine

To answer my own question: it was a client issue, not a server one. For some unknown reason, my mac osx machine could not make more than ~7.8k connections. Using an ubuntu machine as the client solved the problem.
[Question]
I'm trying to estimate the maximum number of connections my server can hold. So I wrote a simple socket.io server and client test code. You can see it here: gist
The gist does a very simple job. The server accepts all incoming socket requests and periodically prints the number of established connections along with cpu and memory usage. The client opens a given number of connections to a given socket.io server and does nothing but keep them open.
When I ran this test with one server (ubuntu machine) and one client (from my mac osx), roughly 7800 connections were made successfully before connections started to drop. Next, I ran more servers on different cpu cores and ran the test again. I expected that more connections could be made in total, because I assumed the major bottleneck would be CPU power. Instead, no matter how many cores I utilized, the total number of connections the server could hold stayed around 7800. It's hard to understand why my server behaves like this. Can anyone explain this behavior or point out what I am missing?
Number of connections made before dropping any connection.
1 server : 7800
3 servers : 2549, 2299, 2979 (each)
4 servers : 1904, 1913, 1969, 1949 (each)
Server-side command
taskset -c [cpu_num] node --stack-size=99999 server.js -p [port_num]
Client-side command
node client.js -h http://serveraddress:port -b 10 -n 500
b=10, n=500 means that client should see 10 connections established before trying another 10 connections, until 10*500 connections are made.
Package versions
socket.io, socket.io-client : 0.9.16
express : 3.4.8
CPU is rarely the bottleneck in these types of situations. It is more likely the maximum number of TCP connections allowed by the operating system, or a RAM limitation.

Maximum number of concurrent connections on a single port (socket) of Server

What could be the maximum number of concurrent Clients (using different port number) that could communicate to a Server on the same port (Single socket) ? What are the factors that could influence this count ? I am looking for this information w.r.t telnet in Linux environment.
This depends in part on your operating system.
There is, however, no limit on a specific port. There is a limit on the total number of concurrent connections, typically set by the number of file descriptors the kernel supports (e.g. 2048).
The thing to remember is that a TCP connection is unique and a connection is a pair of end points (local and remote IP address and port) so it doesn't matter if 1000 connections connect to the same port on a server because the connections are all still unique because the other end is different.
The other limit to be aware of is that a machine can only make about 64K outbound connections or the kernel limit on connections, whichever is lower. That's because port is an unsigned 16 bit number (0-65535) and each outbound connection uses one of those ports.
You can extend this by giving a machine additional IP addresses. Each IP address is another address space of 64K addresses.
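The reasoning in the paragraphs above can be sketched as a little arithmetic. The function names and addresses are illustrative assumptions, and so is the default port range of 1024-65535, since the actual ephemeral port range is an operating system setting. Note that the bound applies to connections toward one remote address and port.

```javascript
// A TCP connection is identified by its full 4-tuple, so many clients can
// share one server port as long as their own address or port differs.
function connectionKey(localIp, localPort, remoteIp, remotePort) {
  return `${localIp}:${localPort}->${remoteIp}:${remotePort}`;
}

// Upper bound on outbound connections from one machine to ONE remote
// ip:port: one local port per connection, per local IP address.
function maxOutboundToOneTarget(localIpCount, portLow = 1024, portHigh = 65535) {
  return localIpCount * (portHigh - portLow + 1);
}

const seen = new Set();
// Two clients using the same source port against the same server port are
// still two distinct connections, because their source addresses differ.
seen.add(connectionKey('10.0.0.1', 40000, '10.0.0.9', 6379));
seen.add(connectionKey('10.0.0.2', 40000, '10.0.0.9', 6379));
console.log(seen.size);                 // 2
console.log(maxOutboundToOneTarget(1)); // 64512
console.log(maxOutboundToOneTarget(3)); // 193536
```

This is why adding IP addresses on the connecting side raises the ceiling, while the listening side is bounded by file descriptors and memory rather than by port numbers.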
More than you care about. Or rather:
More than your code can actually handle (for other reasons)
More than your clients will actually make
More than you can handle on a single box for performance reasons
More than you need on a single box because your load balancers will distribute them amongst several for availability reasons anyway
I can guarantee that it is more than all of those. There are scalability limitations with large numbers of sockets, which can be worked around (Google for the c10k problem). In practice it is possible to have more than 10,000 sockets usefully used by a single process under Linux. If you have multiple processes per server, you can increase that up again.
It is not necessary to use a single port, as your dedicated load-balancers will be able to round-robin several ports if needed.
If you are running a service for many 10s of 1000s of client processes, it is probably fairly important that it keeps working, so you will need several servers for redundancy ANYWAY. Therefore you won't have a problem deploying a few more servers.
I ran a test on Windows, making multiple loopback connections to a single socket. Windows refused to allocate anything after the 16372 mark.
