Throughput issue in Cassandra client communication - cassandra

I've got a single-node Cassandra cluster (3.11.2) on RHEL 6.5. I'm observing a huge difference in throughput between running my client on the same node as the database and running it on another computer. The difference is more than 4x! I don't think this is normal.
I had read that Cassandra uses port 9042 for client communication. If the same port is used in both scenarios, is the latency observed in the second scenario due to slow connectivity between the two nodes?
For the second scenario, I ran the following command on the client side (172.16.129.140 is the database node's IP):
time nc -zw30 172.16.129.140 9042
Connection to 172.16.129.140 9042 port [tcp/*] succeeded!
real 0m0.007s
user 0m0.005s
sys 0m0.001s
Are these values too high? What other Linux commands can give me a quantitative measure of client-communication latency in both scenarios?
I'm using the DataStax C++ driver for the client.
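To put numbers on both scenarios, you can time the TCP handshake (what `nc -z` measures) and an application-level round trip separately. Here is a minimal, runnable Python sketch; it spins up a local echo server as a stand-in for the Cassandra node so it works anywhere, and you would substitute your real host and port 9042 to measure the remote case:

```python
import socket
import threading
import time

# Local echo server: a hypothetical stand-in for the database node,
# so this sketch is runnable without a real Cassandra instance.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
host, port = srv.getsockname()

def echo_once():
    conn, _ = srv.accept()
    conn.sendall(conn.recv(64))  # echo the request back
    conn.close()

threading.Thread(target=echo_once, daemon=True).start()

t0 = time.perf_counter()
s = socket.socket()
s.connect((host, port))  # TCP handshake only (what `nc -z` times)
connect_ms = (time.perf_counter() - t0) * 1000

t1 = time.perf_counter()
s.sendall(b"ping")
s.recv(64)  # application-level round trip
rtt_ms = (time.perf_counter() - t1) * 1000

s.close()
srv.close()
print(f"connect: {connect_ms:.3f} ms, round trip: {rtt_ms:.3f} ms")
```

On a LAN, `ping` gives the raw network RTT for comparison; if the measured connect time is orders of magnitude above the ping time, the network path (not Cassandra) is the likely problem.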

Related

Is it possible to run an MQTT client with 100k concurrent sessions on a single computer?

I am benchmarking MQTT brokers for 100k concurrent clients with various open-source GitHub projects written in Go, Node.js, and Erlang, plus the JMeter tool.
MQTT stresser: https://github.com/inovex/mqtt-stresser
Node.js benchmark client: https://www.npmjs.com/package/mqtt-benchmark
JMeter
Erlang MQTT broker benchmark tool: https://github.com/emqtt/emqtt_benchmark
But all of these clients fail after establishing around 64,000 connections. I'm using Windows, and the same thing happens on Ubuntu. Does it require some tuning?
If each session uses a separate TCP port, you will run out of ports at around 65,535 (minus 1,024, since only root can use ports 0-1023), because that is the total range of TCP port numbers.
https://en.wikipedia.org/wiki/Port_(computer_networking)
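The cap comes from the client side: each outbound connection to the same broker address must use a distinct local (ephemeral) port. A small Python sketch (a loopback server stands in for a hypothetical broker) makes this visible:

```python
import socket

# Loopback listener standing in for an MQTT broker.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(8)
dst = srv.getsockname()

# Every concurrent connection to the same destination consumes
# one distinct ephemeral port on the client machine.
clients = [socket.socket() for _ in range(3)]
for c in clients:
    c.connect(dst)

local_ports = {c.getsockname()[1] for c in clients}
print(local_ports)  # three distinct local ports

for c in clients:
    c.close()
srv.close()
```

On Linux, `cat /proc/sys/net/ipv4/ip_local_port_range` shows the range actually usable for ephemeral ports (often far fewer than 64k by default); binding additional client IP addresses multiplies the budget, since the port space is per source address.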

Cassandra client port enable

How do I enable the Cassandra port to connect with a BI application? My Cassandra setup has multiple nodes (192.xxx.xx.01, 192.xxx.xx.02, 192.xxx.xx.03). In this scenario, which node will act as the master/coordinator for my application?
Although I have worked with listen_address, rpc_address, broadcast_rpc_address, and seeds, I opened both TCP ports 9042 and 9160.
Version: 3.10
Kindly lead me in the right direction.
Cassandra uses a masterless architecture; all nodes are equal. When you connect to a node, that node acts as the coordinator, and any node can be the coordinator.
The coordinator is selected by the driver based on the policy you have set. Common policies are DCAwareRoundRobinPolicy and TokenAwarePolicy.
For DCAwareRoundRobinPolicy, the driver selects the coordinator node based on its round robin policy. See more here: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.html
For TokenAwarePolicy, it selects a coordinator node that has the data being queried - to reduce "hops" and latency. More info: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/TokenAwarePolicy.html
native_transport_port defaults to 9042, and clients use the native transport by default.
Hence your BI application should connect to the Cassandra host on port 9042.
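Before wiring up the BI tool, it is worth verifying that port 9042 is reachable from the client machine at all (firewall or rpc_address problems show up here). A small Python helper that checks only the TCP connection, not the CQL handshake; the commented-out address is a placeholder for one of your contact points:

```python
import socket

def native_transport_reachable(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to Cassandra's
    native-transport port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical contact point; substitute one of your node addresses:
# print(native_transport_reachable("192.xxx.xx.01"))
```

Run it against every node you list as a contact point: the driver only needs one to succeed initially, but it will try to reach the rest for load balancing.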

socket.io max connection test on multicore machine

To answer my own question: it was a client issue, not a server one. For some unknown reason, my Mac OS X machine could not make more than ~7.8k connections. Using an Ubuntu machine as the client solved the problem.
[Question]
I'm trying to estimate the maximum number of connections my server can keep. So I wrote a simple socket.io server and client test code. You can see it here. : gist
The gist above does a very simple job. The server accepts all incoming socket requests and periodically prints the number of established connections along with CPU and memory usage. The client opens a given number of connections to the socket.io server and does nothing but keep them alive.
When I ran this test with one server (an Ubuntu machine) and one client (my Mac OS X machine), roughly 7,800 connections were made successfully before connections started to drop. So next I ran more server processes on different CPU cores and repeated the test. I expected that more connections could be made in total, because the major bottleneck would be CPU power. Instead, no matter how many cores I utilized, the total number of connections the server could keep was around 7,800. It's hard to understand why my server behaves like this. Can anyone explain this behavior or point out what I am missing?
Number of connections made before dropping any connection.
1 server : 7800
3 servers : 2549, 2299, 2979 (each)
4 servers : 1904, 1913, 1969, 1949 (each)
Server-side command
taskset -c [cpu_num] node --stack-size=99999 server.js -p [port_num]
Client-side command
node client.js -h http://serveraddress:port -b 10 -n 500
b=10, n=500 means the client waits until 10 connections are established before trying another 10, until 10*500 connections are made.
Package versions
socket.io, socket.io-client : 0.9.16
express : 3.4.8
CPU is rarely the bottleneck in these types of situations. It is more likely the maximum number of TCP connections allowed by the operating system, or a RAM limitation.
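The first OS limit most setups hit is the per-process file-descriptor cap, since every TCP connection consumes one descriptor. This Python sketch (assuming a Unix system; it temporarily lowers its own limit so the failure is quick and harmless) reproduces the "connections stop dead at N" symptom:

```python
import resource
import socket

# Remember the real limits, then lower the soft limit so we
# hit the wall after a few hundred sockets instead of thousands.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
limit = 256 if hard == resource.RLIM_INFINITY else min(256, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))

socks = []
try:
    while True:
        socks.append(socket.socket())
except OSError:
    pass  # EMFILE: "Too many open files"

opened = len(socks)
for s in socks:
    s.close()

# Restore the original limits.
resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
print(f"hit the descriptor wall after {opened} sockets (soft limit was {limit})")
```

The count comes in slightly under the soft limit because stdin/stdout and other descriptors are already open; the same arithmetic explains a server that drops connections at a suspiciously round number.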

Nodejs max concurrent connections limited to 1012? (probably only on my machine xD)

So, I've been trying to test my server code, but client sockets catch 'error' once 1,012 connections have been established. The client simulator keeps trying until it has attempted as many connections as I've told it to (obviously), but, as stated, the server is unwilling to serve more than 1,012 connections.
I'm running both the client simulator and the server on the same computer (which might be dumb, but it should work anyway, shouldn't it?).
(Running on socket.io)
To increase the limit of open connection/files in Linux:
ulimit -n 2048
Here is more info regarding ulimit
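Note that `ulimit -n` only affects the shell that runs it and that shell's children. A process can also raise its own soft limit, up to the hard limit, at startup; a Python sketch of the equivalent call (assuming a Unix system):

```python
import resource

# Python analogue of `ulimit -n 2048`: raise this process's soft
# open-file limit, clamped to the hard limit the kernel allows.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
target = max(2048, soft)  # never lower an already-higher limit
if hard != resource.RLIM_INFINITY:
    target = min(target, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
print(f"soft limit: {soft} -> {target} (hard limit: {hard})")
```

Raising the limit beyond the hard cap requires root (or editing /etc/security/limits.conf and re-logging in).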

Is redis limiting the number of clients to 65K?

I am using Redis 2.4.6 Stable.
I have increased the number of redis file descriptors in file ae.h to over 200K:
#define AE_SETSIZE (1024*200)
But when running it I am reaching a limit of 65534.
I am running redis on ec2 on a RedHat instance: 2.6.32-220.2.1.el6.x86_64
and I am running redis with a ulimit -n 200000
I have set up tests with multiple EC2 nodes that try to push concurrent connections over 150K, but it will not go beyond 65K.
Any ideas what I might be missing? Maybe a kernel limitation? A bug in Redis?
This is a dump of INFO on the redis server:
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
**connected_clients:65534**
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:572810560
used_memory_human:546.27M
used_memory_rss:305123328
used_memory_peak:572810528
used_memory_peak_human:546.27M
mem_fragmentation_ratio:0.53
Are you running afoul of network port limitations? Depending on how the clients close their connections, you could simply be running out of ports, since closed connections get stuck in the TIME_WAIT state.
If that is the case, one way around it is to bind multiple internal IPs to the server and distribute connections across them. Alternatively, if you can modify the client you're using, make sure it closes the connection in such a way that the client side takes on the burden of TIME_WAIT.
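If you control the client, one way to avoid TIME_WAIT accumulation entirely is an abortive close: SO_LINGER with a zero timeout makes close() send an RST instead of a FIN, so the port is freed immediately. A Python sketch (assuming Linux's `struct linger` layout of two ints; note an RST discards any unsent data, so this is only appropriate for benchmark-style clients):

```python
import socket
import struct

# Loopback connection pair to demonstrate on.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket()
cli.connect(srv.getsockname())
conn, _ = srv.accept()

# struct linger { int l_onoff; int l_linger; }:
# l_onoff=1, l_linger=0  =>  close() sends RST, skipping TIME_WAIT.
cli.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
onoff, linger = struct.unpack(
    "ii", cli.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8)
)

cli.close()   # abortive close: port reusable immediately
conn.close()
srv.close()
print(onoff, linger)
```

You can verify the effect with `ss -tan state time-wait` (or `netstat -tan | grep TIME_WAIT`) while a benchmark is cycling connections.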