Why can't I create 50k processes in Linux?

Using Linux
$ uname -r
4.4.0-1041-aws
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
With limits allowing up to 200k processes
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 563048
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 524288
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
$ cat /proc/sys/kernel/pid_max
200000
$ cat /proc/sys/kernel/threads-max
1126097
And enough free memory to give 1MB each to 127k processes
$ free
total used free shared buff/cache available
Mem: 144156492 5382168 130458252 575604 8316072 137302624
Swap: 0 0 0
And I have fewer than 1k existing processes/threads.
$ ps -elfT | wc -l
832
But I cannot start 50k processes
$ echo '
seq 50000 | while read _; do
sleep 20 &
done
' | bash
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
...
Why can't I create 50k processes?

It was caused by that Linux cancer, systemd.
In addition to kernel.pid_max and ulimit, I also needed to change a third limit.
/etc/systemd/logind.conf
[Login]
UserTasksMax=70000
And then restart.
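For reference, a minimal sketch of applying and checking the change; the systemd-logind restart and the user-1000.slice name are assumptions about a typical Ubuntu 16.04 setup (a full reboot works too):
# after setting UserTasksMax=70000 in /etc/systemd/logind.conf
$ sudo systemctl restart systemd-logind
# check the per-user task limit systemd now applies (user ID 1000 assumed)
$ systemctl show -p TasksMax user-1000.slice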

Building on Basile's answer, you probably ran out of PIDs.
cat /proc/sys/kernel/pid_max gives me 32768 on my machine (2^15, the traditional default), which is less than 50k.
EDIT: I missed that /proc/sys/kernel/pid_max is set to 200000. That probably isn't the issue in this case.

Because each process requires some resources: some RAM (including some kernel memory), some CPU, etc.
Each process has its own virtual address space, including its own call stack (and some of it requires physical resources, including several pages of RAM; read more about resident set size; on my desktop the RSS of some bash process is about 6 Mbytes). So a process is actually quite a heavy thing.
BTW, this is not specific to Linux.
Read more about operating systems, e.g. Operating Systems : Three Easy Pieces
Try also cat /proc/$$/maps and cat /proc/$$/status and read more about proc(5). Read about failure of fork(2) and of execve(2). The resource temporarily unavailable is for EAGAIN (see errno(3)), and several reasons can make fork fail with EAGAIN. And on my system, cat /proc/sys/kernel/pid_max gives 32768 (and reaching that limit gives EAGAIN for fork).
BTW, imagine if you could fork ten thousand processes. Then the context switch time would dominate the actual running time.
Your Linux system looks like some AWS instance. Amazon won't let you create that many processes, because their hardware is not sized for that.
(On some costly supercomputer or server with, e.g., a terabyte of RAM and a hundred cores, perhaps you could run 50k processes; I guess that would need a particular kernel or kernel configuration. I recommend getting help from Amazon support.)
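For what it's worth, one way to check whether a per-user cgroup limit (rather than pid_max or ulimit) is the limit actually being hit on a systemd machine like this one is to look at the pids controller for your user slice; the cgroup v1 paths and user ID 1000 below are assumptions about a typical Ubuntu 16.04 layout:
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
If pids.max is far below 50000, that cgroup limit (the UserTasksMax setting from the accepted answer) is the resource that fork keeps running out of.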

Related

Getting Error ECONNRESET intermittently with mosquitto and node.js

I am getting an intermittent error on the node.js end while subscribing to a topic over MQTT.
I have configured the MQTT log files and found the below error:
Unable to accept new connection, system socket count has been exceeded. Try increasing "ulimit -n" or equivalent.
While the above message shows up in the MQTT log file, I am getting the ECONNRESET error on the node.js end at the same time.
I have checked the ulimit on the server and it gives me the below details:
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256380
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 62987
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
My Linux version is as below
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-1062.12.1.vz7.131.10
Architecture: x86-64
Is the problem related to ulimit? Do I need to increase the ulimit value at the server level?
How do I fix the ECONNRESET issue on the node.js end?
You need to increase the open files count on the broker.
You can do it for the running process with the prlimit command, but you should also set it for the user running mosquitto so it is persistent across restarts. You can do this by editing the /etc/security/limits.conf file. You will need to log out and back in for it to take effect for a normal user, and probably restart the service for a daemon user.
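A minimal sketch of both approaches; the mosquitto user name, the pidof lookup, and the 65536 value are assumptions to adapt to your setup:
# one-off, for the already-running broker process
$ sudo prlimit --pid $(pidof mosquitto) --nofile=65536:65536
# persistent: add to /etc/security/limits.conf (broker assumed to run as user "mosquitto")
mosquitto soft nofile 65536
mosquitto hard nofile 65536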

Error: EMFILE: too many open files, watch, unless I use sudo

Description
Recently I've run into a problem. I am not able to run yarn start in the element-web directory, I get these errors. Originally I thought it had something to do with element-web itself, so I created an issue. Some time after that I tried to run wintersmith preview in the bibviz directory and got the same errors. This was weird, so I tried to create an Angular project and run ng serve, and got errors again. I headed to the issue to close it as it wasn't an element-web issue. I found that there was another issue created with the same problem. It had already been closed by turt2live saying it looks like you've run out of memory on your system. Based on this I tried to turn off most programs running in the background, and then all the commands worked.
I am sure that ng serve used to work in the past.
My PC has 16 GB of RAM and the commands already fail when I am on 7/16 GB. I can't see any memory spikes when running the commands. Running the commands with sudo also completely eliminates the problem. This doesn't make any sense to me.
Research led me to ulimits, but changing them seemed to have no effect. I have also installed watchman, with no effect.
Can someone tell me what I am missing?
Thank you in advance!
Info
I am on Debian 11 Bullseye. This is the output of a few commands that could be useful.
As a regular user:
> uname -a
Linux Simon-s-PC 5.8.0-3-amd64 #1 SMP Debian 5.8.14-1 (2020-10-10) x86_64 GNU/Linux
> sudo sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 524288
> ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 46482
-n: file descriptors 8192
-l: locked-in-memory size (kbytes) unlimited
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 63664
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 95
-N 15: unlimited
> yarn --version
1.22.5
With sudo su:
> sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 524288
> ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 63664
-n: file descriptors 1024
-l: locked-in-memory size (kbytes) 2043392
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 63664
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 0
-N 15: unlimited
I think I've found a solution:
Set limits in /etc/sysctl.conf by adding:
fs.inotify.max_user_watches=524288
fs.inotify.max_user_instances=512
Open a new terminal or reload sysctl.conf variables with
sudo sysctl --system
Run yarn start
Everything should work fine now, hopefully. If it doesn't, try setting the limits higher.
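To confirm the new values are in effect, and to get a rough idea of how many inotify instances are currently in use (the /proc scan is a common trick, not part of the original fix):
$ sysctl fs.inotify.max_user_watches fs.inotify.max_user_instances
$ find /proc/*/fd -lname anon_inode:inotify 2>/dev/null | wc -l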

Linux fork(): resource temporarily unavailable

How do I debug the following points, just to find out exactly which resource is exceeding its limit?
How many processes are currently running
How many processes are running per user
Number of open files per process
Total number of open files across all processes
Process limit and open-file limit
There can be multiple ways to go about what you are trying to achieve, e.g. you could get all the information you need by inspecting the /proc filesystem. Below is a list of utilities you could use to debug the actual resource issue.
Good luck.
How many processes are currently running
ps -eaf | wc -l
How many processes are running per user
ps -fu [username] | wc -l
Number of open files per process
lsof -p <pid> | wc -l
Total number of open files across all processes
You could iterate over all the PIDs as shown above and make use of the lsof command. Here, you might have to execute the command as root, otherwise lsof will report permission denied.
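For example, a rough sketch; lsof prints a header line per invocation, so treat the total as approximate, and /proc/sys/fs/file-nr gives the kernel's own system-wide count for comparison:
$ for pid in $(ps -eo pid --no-headers); do sudo lsof -p "$pid"; done 2>/dev/null | wc -l
$ cat /proc/sys/fs/file-nr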
Process limit and open-file limit
For a specific terminal, you could do
$ ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15973
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 15973
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

forkpty fails for jailed linux user

I have an Ubuntu 12.04 setup on the server. Every registered user is also registered as a Linux user and jailed, with limited system resource access through /etc/security/limits.conf.
I tried running a server as one of the registered users. The app is a nodejs app - http://github.com/pocha/terminal-codelearn . It uses https://github.com/chjj/pty.js to create a pseudo terminal for every user who connects to the nodejs app.
The app fails with a 'forkpty(3) failed' error pointing to line 184 of https://github.com/chjj/pty.js/blob/65dd89fd8f87de914ff1814362918d7bd87c9cbf/src/unix/pty.cc
pid_t pid = pty_forkpty(&master, name, NULL, &winp);

if (pid) {
  for (i = 0; i < argl; i++) free(argv[i]);
  delete[] argv;
  for (i = 0; i < envc; i++) free(env[i]);
  delete[] env;
  free(cwd);
}

switch (pid) {
  case -1:
    return ThrowException(Exception::Error(
      String::New("forkpty(3) failed.")));
I am able to successfully deploy the app on http://nitrous.io . They probably have a similar way to jail users. I tried running ulimit -a and matched every value except pending signals. Somehow on my server, the maximum pending signals value does not exceed around 90k, while it is 548k on the Nitrous server.
Below is the ulimit -a output from Nitrous server
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 548288
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 512
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The app fails on Heroku with the exact same error.
Can anybody help with how to make the app run on my server the way it works on nitrous.io?
I know that Heroku fails to forkpty because they're not actually running POSIX, just something very POSIX-like. So some things, like forkpty, just don't work. I don't think there's a way around that :( wish there were.
I am not sure I understand the POSIX point. But I figured out that in my jailed environment there was no /dev/ptmx and no /dev/pts/*. I googled, created them, and it started working.
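For anyone hitting the same thing, a sketch of what creating those nodes inside the jail might look like (the device numbers are the standard ones for ptmx; these are not the exact commands from the original fix, and paths may need the chroot directory prefixed):
$ sudo mknod /dev/ptmx c 5 2
$ sudo chmod 666 /dev/ptmx
$ sudo mkdir -p /dev/pts
$ sudo mount -t devpts devpts /dev/pts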

Too many open files error on Ubuntu 8.04

mysqldump: Couldn't execute 'show fields from `tablename`': Out of resources when opening file './databasename/tablename#P#p125.MYD' (Errcode: 24) (23)
Checking error 24 in the shell says:
>>perror 24
OS error code 24: Too many open files
How do I solve this?
First, to identify the limits for the relevant user or group, you have to do the following:
root@ubuntu:~# sudo -u mysql bash
mysql@ubuntu:~$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 71680
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 71680
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
mysql@ubuntu:~$
The important line is:
open files (-n) 1024
As you can see, your operating system vendor ships this version with the basic Linux configuration - 1024 files per process.
This is obviously not enough for a busy MySQL installation.
Now, to fix this you have to modify the following file:
/etc/security/limits.conf
mysql soft nofile 24000
mysql hard nofile 32000
Some flavors of Linux also require additional configuration to get this to stick to daemon processes versus login sessions. In Ubuntu 10.04, for example, you need to also set the pam session limits by adding the following line to /etc/pam.d/common-session:
session required pam_limits.so
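After restarting MySQL, one way to confirm the daemon actually picked up the higher limit is to look at its limits in /proc; the pidof lookup assumes the process is named mysqld:
$ cat /proc/$(pidof mysqld)/limits | grep 'open files'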
Quite an old question but here are my two cents.
What you could be experiencing is that the mysql engine didn't set its open-files-limit variable right.
You can see how many files you are allowing mysql to open with:
mysql> SHOW VARIABLES;
It is probably set to 1024 even if you already set the limits to higher values.
You can use the option --open-files-limit=XXXXX in the command line for mysqld.
Cheers
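Equivalently, the limit can be set persistently in the MySQL configuration file; the path and value below are illustrative, not taken from the original answer:
# /etc/mysql/my.cnf (location varies by distribution)
[mysqld]
open_files_limit = 24000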
Add --single-transaction to your mysqldump command.
It is also possible that some code that accesses the tables didn't close them properly and, over time, the number of open files reached the limit.
Please refer to http://dev.mysql.com/doc/refman/5.0/en/table-cache.html for a possible reason as well.
Restarting mysql should cause this problem to go away (although it might happen again unless the underlying problem is fixed).
You can increase your OS limits by editing /etc/security/limits.conf.
You can also install the "lsof" (LiSt Open Files) command to see the relation between files and processes.
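For example, to count how many files the running mysqld has open (the pidof lookup is an assumption about the process name):
$ sudo lsof -p $(pidof mysqld) | wc -l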
There is no need to configure PAM, I think. On my system (Debian 7.2 with Percona 5.5.31-rel30.3-520.squeeze) I have:
Before my.cnf changes:
# cat /proc/12345/limits | grep "open files"
Max open files 1186 1186 files
After adding "open_files_limit = 4096" into my.cnf and mysqld restart, I got:
# cat /proc/23456/limits | grep "open files"
Max open files 4096 4096 files
12345 and 23456 are the mysqld process PIDs, of course.
SHOW VARIABLES LIKE 'open_files_limit' shows 4096 now.
Everything looks OK, while ulimit shows no changes:
# su - mysql -c bash
# ulimit -n
1024
There is no guarantee that "24" is an OS-level error number, so don't assume that this means that too many file handles are open. It could be some type of internal error code used within mysql itself. I'd suggest asking on the mysql mailing lists about this.
