How to scale a node.js / socket.io server?

I am currently running a node.js app and am about to introduce socket.io to allow real-time updates (chat, in-app notifications, ...). At the moment, I am running the smallest available setup from DigitalOcean (1 vCPU, 1 GB RAM) for my node.js server. I stress-tested the node.js app's socket.io connections using Artillery:
config:
  target: "https://my.server.com"
  socketio:
    transports: ["websocket"]  # optional, same results if I remove this
  phases:
    - duration: 600
      arrivalRate: 20
scenarios:
  - name: "A user that just connects"
    weight: 90
    engine: "socketio"
    flow:
      - get:
          url: "/"
      - think: 600
It can handle a couple hundred concurrent connections. After that, I start getting the following errors:
Errors:
ECONNRESET: 1
Error: xhr poll error: 12
When I resize my DigitalOcean droplet to 8 vCPUs and 32 GB RAM, I can get upwards of 1700 concurrent connections. No matter how much further I resize, it always stays around that number.
My first question: is this normal behavior? Is there any way to increase this number per droplet, so I can have more concurrent connections on a single node instance? Here is my configuration:
sysctl -p
fs.file-max = 2097152
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
net.ipv4.tcp_synack_retries = 2
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
net.core.rmem_default = 31457280
net.core.rmem_max = 12582912
net.core.wmem_default = 31457280
net.core.wmem_max = 12582912
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 65536
net.core.optmem_max = 25165824
net.ipv4.tcp_mem = 65536 131072 262144
net.ipv4.udp_mem = 65536 131072 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.udp_rmem_min = 16384
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_reuse = 1
ulimit
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 3838
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 10000000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
nginx.conf
user www-data;
worker_processes auto;
worker_rlimit_nofile 1000000;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
    multi_accept on;
    use epoll;
    worker_connections 1000000;
}
http {
    ##
    # Basic Settings
    ##
    client_max_body_size 50M;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 120;
    keepalive_requests 10000;
    types_hash_max_size 2048;
    # server_tokens off;
    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    ##
    # SSL Settings
    ##
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
    ssl_prefer_server_ciphers on;
    ##
    # Logging Settings
    ##
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    ##
    # Gzip Settings
    ##
    gzip on;
    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
    ##
    # Virtual Host Configs
    ##
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
Another question: I am thinking about scaling horizontally by spinning up more droplets, say 4 droplets that all connections get proxied to. How would this be set up in practice? I would use Redis to emit through socket.io to all connected clients. Do I use 4 droplets with the same configuration? Do I run the same stuff on all 4 of them? For instance, should I upload the same server.js app to all 4 droplets? Any advice is welcome.

I can't really answer your first question, but I can try my best on your second.
If you're setting up load balancing, you run the same server.js app on each droplet and have one entry point (a load balancer) distribute the traffic across them. I don't know much about Redis, but I found this: https://redis.io/topics/cluster-tutorial
I hope this helps.
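To make that concrete, here is a minimal sketch of what running the same server.js on every droplet with a shared Redis instance could look like. It is not taken from the question's code; the port, the Redis hostname and the socket.io-redis package are assumptions that fit the socket.io version of that era:

// server.js -- deployed identically on every droplet
const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');

// Every instance attaches the Redis adapter and points it at the same Redis
// server, so io.emit() and room broadcasts reach clients connected to any droplet.
io.adapter(redisAdapter({ host: 'redis.internal', port: 6379 }));

io.on('connection', (socket) => {
  socket.on('chat message', (msg) => {
    // The broadcast travels through Redis pub/sub to all instances.
    io.emit('chat message', msg);
  });
});

In front of the droplets, the load balancer (nginx here, since that is what the question already uses) needs the WebSocket upgrade headers, plus sticky sessions if HTTP long-polling is allowed. A sketch with made-up droplet IPs:

upstream socketio_nodes {
    ip_hash;                 # sticky sessions: same client IP -> same droplet
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
    server 10.0.0.3:3000;
    server 10.0.0.4:3000;
}

server {
    listen 443 ssl;
    server_name my.server.com;
    # ssl_certificate / ssl_certificate_key directives omitted in this sketch

    location /socket.io/ {
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_pass http://socketio_nodes;
    }
}

With a websocket-only transport (as in the Artillery test above) stickiness matters less, but the Redis adapter is still what lets all four instances see each other's events.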

Related

WSO2 Identity Server as Key Manager java.io.IOException: Too many open files

I have a problem with my wso2is installation as Key Manager:
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: [2017-12-18 21:15:30,855] ERROR {org.apache.tomcat.util.net.NioEndpoint$Acceptor} - Socket accept failed
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: java.io.IOException: Too many open files
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:825)
dic 18 21:15:30 autenticacion.dominio.info wso2server.sh[825]: at java.lang.Thread.run(Thread.java:748)
I have followed the steps from "WSO2 ESB too many open files", "ListeningIOReactor encountered a checked exception: Too many open files" and "Exception thrown when try to add documents to the lucene index continuously inside the for loop", but the error persists.
My current /etc/security/limits.conf
# /etc/security/limits.conf
#
#This file sets the resource limits for the users logged in via PAM.
#It does not affect resource limits of the system services.
#
#Also note that configuration files in /etc/security/limits.d directory,
#which are read in alphabetical order, override the settings in this
#file in case the domain is the same or more specific.
#That means for example that setting a limit for wildcard domain here
#can be overriden with a wildcard setting in a config file in the
#subdirectory, but a user specific setting here can be overriden only
#with a user specific setting in the subdirectory.
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
# - a user name
# - a group name, with #group syntax
# - the wildcard *, for default entry
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
#
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
#
#<domain> <type> <item> <value>
#
#* soft core 0
#* hard rss 10000
##student hard nproc 20
##faculty soft nproc 20
##faculty hard nproc 50
#ftp hard nproc 0
##student - maxlogins 4
# End of file
* soft nofile 4096
* hard nofile 65535
* soft nproc 20000
* hard nproc 20000
Using the ulimit command:
[root@autenticacion ~]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7283
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 20000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Using the ulimit command as the wso2 user:
[root@autenticacion ~]# su wso2
bash-4.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7283
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Thanks
In my case, the log was showing some errors like:
ERROR {org.wso2.carbon.metrics.jdbc.reporter.JDBCReporter} - Error when reporting gauges
org.h2.jdbc.JdbcSQLException: IO Exception: "JAVA.net.UnknownHostException: static-ip-1312323.cable.net.co: statict-ip
at org.h2.message.DbException.getJdbcSQLException
... 32 more
Caused by: java.net.UnknownHostException: static-ip-1312323.cable.net.co: static-ip-1312323.cable.net.co:Name or service not know
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
These errors went unnoticed because they had been there from the beginning and the services that WSO2 provides were working well.
I changed the IP in /etc/hostname from the internet IP to the local IP (or hostname).
Example:
autentication.organization.com (internet ip) to autentication.intranet (intranet ip)
This change is acceptable for the server configuration and provides a solution for these recurring limit errors. Apparently these errors caused connections/open files to accumulate and never be closed.
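The fix above addresses the hostname lookups, but if the descriptor limit itself also needs raising for the WSO2 process, a per-user entry in /etc/security/limits.conf takes precedence over the wildcard entries shown earlier. A sketch only, assuming the server runs as the wso2 user and the session goes through PAM; the values are illustrative, not tuned:

wso2 soft nofile 65535
wso2 hard nofile 65535

After adding the lines, start a fresh login session for the user (for example su - wso2) and confirm with ulimit -n.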

Varnish process not closed and taking huge memory

We are using Varnish Cache 4.1 on a CentOS server. When we start the Varnish server, lots of varnish processes start and never close; because of this we are facing a memory leak issue. Please let us know how we can resolve it.
My configuration (/etc/sysconfig/varnish):
#DAEMON_OPTS="-a :80 \
# -T localhost:6082 \
# -f /etc/varnish/default.vcl \
# -S /etc/varnish/secret \
# -p thread_pools=8 \
# -p thread_pool_max=4000 \
# -p thread_pool_add_delay=1 \
# -p send_timeout=30 \
# -p listen_depth=4096 \
# -s malloc,2G"
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}
top output (truncated in the original post):
... 89208 83360 S 0.0 4.3 0:00.00 /usr/sbin/varnishd -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -p thread_pool_min=50 -p ...
1678 varnish 20 0 345M 89208 83360 S 0.0 4.3 0:00.03 /usr/sbin/varnishd -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -p thread_pool_min=50 -p ...
1679 varnish 20 0 ...
You are not limiting the space for transient objects. By default an unbounded malloc is used (see the official docs: https://www.varnish-cache.org/docs/4.0/users-guide/storage-backends.html#transient-storage).
From what I see in your message, you are not using the DAEMON_OPTS parameter.
What are the contents of your varnishd.service file and of /etc/varnish/varnish.params?
EDIT
Nothing's wrong with your init.d script. It should use the settings found in /etc/sysconfig/varnish.
How much RAM is consumed by Varnish?
All the varnish threads share the same storage (malloc 2G + Transient malloc 100M), so storage should take up to about 2.1G. You need to add an average overhead of 1KB per object stored in cache to get the total memory used.
I don't think you are suffering from a memory leak; those processes are normal. Your command line keeps at least 50 threads around (thread_pool_min=50), so they are expected.
I'd recommend keeping the number of thread_pools small, somewhere between 2 and 8; at the same time it will help to increase thread_pool_max to 5000 and set thread_pool_min to 1000.
We are running a very large server with 2 pools * 1000-5000 threads and have no issues.
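To tie the recommendations together, here is a sketch of the /etc/sysconfig/varnish options with the transient storage capped and the thread pools sized as suggested above. The 2G malloc and 100M transient figures are carried over from the numbers already discussed, not values tuned for this particular server:

# Cap transient storage explicitly; by default it is an unbounded malloc.
# Thread pool sizing follows the suggestion above: few pools, larger min/max.
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -p thread_pools=2 \
             -p thread_pool_min=1000 \
             -p thread_pool_max=5000 \
             -s malloc,2G \
             -s Transient=malloc,100M"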

Ulimit change after reboot has no effect

I have changed /etc/security/limits.conf and rebooted the machine remotely. However, after the boot, the nproc parameter still has the old value.
[ost@compute-0-1 ~]$ cat /etc/security/limits.conf
* - memlock -1
* - stack -1
* - nofile 4096
* - nproc 4096 <=====================================
[ost@compute-0-1 ~]$
Broadcast message from root@compute-0-1.local
(/dev/pts/0) at 19:27 ...
The system is going down for reboot NOW!
Connection to compute-0-1 closed by remote host.
Connection to compute-0-1 closed.
ost@cluster:~$ ssh compute-0-1
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Last login: Tue Sep 27 19:25:25 2016 from cluster.local
Rocks Compute Node
Rocks 6.1 (Emerald Boa)
Profile built 19:00 23-Aug-2016
Kickstarted 19:08 23-Aug-2016
[ost@compute-0-1 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 516294
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 1024 <=========================
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Please note that I set max user processes to 4096, but after the reboot the value is still 1024.
Please take a look at the file /etc/pam.d/sshd.
If you can find it, open the file and insert the following line:
session required pam_limits.so
Then the new value will be effective even after rebooting.
PAM is a module stack related to authentication, so you need to enable the pam_limits module for SSH logins.
More details in man pam_limits.
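A quick way to verify it took effect (a sketch; the user and host names follow the question): reconnect over SSH so a new PAM session is created, then check the limit in that session:

ssh ost@compute-0-1
ulimit -u    # should now report 4096 instead of 1024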
Thanks!

How to set nginx max open files?

Though I have applied the following settings, and even restarted the server:
# head /etc/security/limits.conf -n2
www-data soft nofile -1
www-data hard nofile -1
# /sbin/sysctl fs.file-max
fs.file-max = 201558
The open files limit of the nginx processes is still 1024/4096:
# ps aux | grep nginx
root 983 0.0 0.0 85872 1348 ? Ss 15:42 0:00 nginx: master process /usr/sbin/nginx
www-data 984 0.0 0.2 89780 6000 ? S 15:42 0:00 nginx: worker process
www-data 985 0.0 0.2 89780 5472 ? S 15:42 0:00 nginx: worker process
root 1247 0.0 0.0 11744 916 pts/0 S+ 15:47 0:00 grep --color=auto nginx
# cat /proc/984/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 15845 15845 processes
Max open files 1024 4096 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 15845 15845 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
I've tried all possible solutions from googling but in vain. What setting did I miss?
On CentOS (tested on 7.x):
Create file /etc/systemd/system/nginx.service.d/override.conf with the following contents:
[Service]
LimitNOFILE=65536
Reload systemd daemon with:
systemctl daemon-reload
Add this to Nginx config file:
worker_rlimit_nofile 16384; (has to be smaller than or equal to the LimitNOFILE set above)
And finally restart Nginx:
systemctl restart nginx
You can verify that it works with cat /proc/<nginx-pid>/limits.
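For example, a one-liner sketch (assuming the default pid file location /run/nginx.pid):

grep "Max open files" /proc/$(cat /run/nginx.pid)/limits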
I found the answer a few minutes after posting this question...
# cat /etc/default/nginx
# Note: You may want to look at the following page before setting the ULIMIT.
# http://wiki.nginx.org/CoreModule#worker_rlimit_nofile
# Set the ulimit variable if you need defaults to change.
# Example: ULIMIT="-n 4096"
ULIMIT="-n 15000"
/etc/security/limits.conf is used by PAM, so it should have nothing to do with www-data (it's a nologin user).
For nginx, simply editing nginx.conf and setting worker_rlimit_nofile should change the limit.
I initially thought it was a self-imposed nginx limit, but it does increase the per-process limit:
worker_rlimit_nofile 4096;
You can test it by getting an nginx process ID (from top -u nginx) and then running:
cat /proc/{PID}/limits
to see the current limits.
Another way on CentOS 7 is to run systemctl edit SERVICE_NAME and add the variables there:
[Service]
LimitNOFILE=65536
save that file and reload the service.
For those looking for an answer for pre-systemd Debian machines, the nginx init script executes /etc/default/nginx. So, adding the line
ulimit -n 9999
will change the limit for the nginx daemon without messing around with the init script.
Adding ULIMIT="-n 15000" as in a previous answer didn't work with my nginx version.

JVM error on starting Wowza Media server

I have installed Wowza Media Server 3.5 on a CentOS-based VPS. While starting Wowza Media Server for the first time I got this error:
Error occurred during initialization of VM
Could not reserve enough space for object heap
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
I suppose this is due to the memory allocation for the JVM.
The output of ulimit -a:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 794624
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
And the output of free -m:
total used free shared buffers cached
Mem: 1024 1024 0 0 0 0
-/+ buffers/cache: 1024 0
Swap: 0 0 0
There is also an alias entry like
alias java='java -Xms8m -Xmx64m'
This could also be the reason for the JVM error.
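Given the free -m output above (roughly 1 GB in total, no swap, and nearly all of it reported as used), one plausible explanation is simply that the JVM cannot reserve the heap it asks for. A quick way to check this outside of Wowza; the heap sizes below are arbitrary test values, not recommended Wowza settings:

# A small explicit heap: if this starts, the JVM itself is fine.
java -Xms8m -Xmx64m -version

# Something closer to a media-server-sized heap: on a 1 GB VPS with no swap
# this may fail with the same "Could not reserve enough space" error.
java -Xms256m -Xmx512m -version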
