OpenMPI and OpenFabrics registering physical memory warning - linux

I start mpirun with the command:
mpirun -np 2 prog
and get the following output:
--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory. This can cause MPI jobs to
run with erratic performance, hang, and/or crash.
This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered. You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.
See this Open MPI FAQ item for more information on these Linux kernel module
parameters:
http://www.open-mpi.org/faq/?category=openfabrics#ib-..
Local host: node107
Registerable memory: 32768 MiB
Total memory: 65459 MiB
Your MPI job will continue, but may be behave poorly and/or hang.
--------------------------------------------------------------------------
hello from 0
hello from 1
[node107:48993] 1 more process has sent help message help-mpi-btl-openib.txt / reg mem limit low
[node107:48993] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Other installed software (the Intel MPI library) works fine, without any errors, and uses all 64 GB of memory.
For OpenMPI I don't use any batch manager (Torque, Slurm, etc.); I work on a single node, which I reach with
ssh node107
For the command
cat /etc/security/limits.conf
I get the following output:
...
* soft rss 2000000
* soft stack 2000000
* hard stack unlimited
* soft data unlimited
* hard data unlimited
* soft memlock unlimited
* hard memlock unlimited
* soft nproc 10000
* hard nproc 10000
* soft nofile 10000
* hard nofile 10000
* hard cpu unlimited
* soft cpu unlimited
...
For the command
cat /sys/module/mlx4_core/parameters/log_num_mtt
I get the output:
0
Command:
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
output:
3
Command:
getconf PAGESIZE
output:
4096
With these parameters and the formula
max_reg_mem = (2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE
I get max_reg_mem = (2^0) * (2^3) * 4096 = 32768 bytes, not the 32768 MiB (32 GiB) reported in the OpenMPI warning.
What is the reason for this? Does OpenMPI not use the Mellanox parameters log_num_mtt and log_mtts_per_seg? How can I configure OpenFabrics to use all 64 GB of memory?
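The arithmetic above can be reproduced directly in the shell; this simply evaluates the formula with the values read from sysfs and is not an OpenMPI tool:
log_num_mtt=$(cat /sys/module/mlx4_core/parameters/log_num_mtt)           # 0 here
log_mtts_per_seg=$(cat /sys/module/mlx4_core/parameters/log_mtts_per_seg) # 3 here
page_size=$(getconf PAGESIZE)                                             # 4096 here
echo $(( (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size ))      # prints 32768 (bytes)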

I solved this problem by installing the newest version of OpenMPI (2.0.2).

In /etc/modprobe.d/mlx4_core.conf, put the following module parameter:
options mlx4_core log_mtts_per_seg=5
Reload the mlx4_core module (modprobe mlx4_ib pulls mlx4_core back in with the new parameter):
rmmod mlx4_ib
rmmod mlx4_core
modprobe mlx4_ib
Check if log_mtts_per_seg shows up as configured above:
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
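If upgrading OpenMPI is not an option, the parameters can also be sized explicitly from the formula in the question. This is only a sketch for a 64 GB node; the 2x-physical-RAM target is the usual Open MPI FAQ recommendation rather than anything verified on this system:
# /etc/modprobe.d/mlx4_core.conf -- target max_reg_mem >= 2 * 64 GB = 2^37 bytes
# 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE = 2^20 * 2^5 * 2^12 = 2^37
options mlx4_core log_num_mtt=20 log_mtts_per_seg=5
# reload the modules as above, then re-check:
cat /sys/module/mlx4_core/parameters/log_num_mtt
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg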

Related

StreamSets Data Collector Installation

I am having trouble manually installing the Full Tarball of StreamSets Data Collector. I am running Ubuntu in a VM setting with over 30GB of storage space.
I have read the Manual Page from the StreamSets website, but it's not useful.
Here is what I have done so far:
I have downloaded the full tarball to my Home/Downloads
I have extracted the tarball to my Home/Downloads folder and I have the directory streamsets-datacollector-3.13.0 with all of its subdirectories
Now when I try bin/streamsets dc I get the following errors:
WARN: could not determine Java environment version; expected 1.8, which are the supported versions
Configuration of maximum open file limit is too low: 1024 (expected at least 32768).
I have installed all the Java packages using apt install java*,
and I have tried to change the limits in /etc/security/limits.conf,
as shown below:
#* soft core 0
#root hard core 100000
#* hard rss 10000
##student hard nproc 20
##faculty soft nproc 20
##faculty hard nproc 50
#ftp hard nproc 0
#ftp - chroot /ftp
##student - maxlogins 4
# End of file
* soft nproc 33000
* hard nproc 33000
* soft nofile 33000
* hard nofile 33000
I even did a system reboot after. However, when I type ulimit -n it still gives me the default 1024.
How should I fix this error?
You just need to type "ulimit -n 32768" to change it.
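That only raises the limit for the current shell (and only if the hard limit allows it); a minimal sketch, assuming you launch Data Collector from the same shell:
ulimit -n 32768          # raise the soft open-file limit for this shell
ulimit -n                # verify it now reports 32768
bin/streamsets dc        # start Data Collector from the same shell
Note that changes to /etc/security/limits.conf are applied by pam_limits only to new login sessions, which may be why your current shell still showed 1024.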

Why can't I create 50k processes in Linux?

Using Linux
$ uname -r
4.4.0-1041-aws
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
With limits allowing up to 200k processes
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 563048
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 524288
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
$ cat /proc/sys/kernel/pid_max
200000
$ cat /proc/sys/kernel/threads-max
1126097
And enough free memory to give 1MB each to 127k processes
$ free
total used free shared buff/cache available
Mem: 144156492 5382168 130458252 575604 8316072 137302624
Swap: 0 0 0
And I have fewer than 1k existing processes/threads.
$ ps -elfT | wc -l
832
But I cannot start 50k processes
$ echo '
seq 50000 | while read _; do
sleep 20 &
done
' | bash
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
...
Why can't I create 50k processes?
It was caused by Linux cancer systemd.
In addition to kernel.pid_max and ulimit, I also needed to change a third limit.
/etc/systemd/logind.conf
[Login]
UserTasksMax=70000
And then restart.
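To check the effective per-user task limit before and after the change, something like this should work on a systemd system (the slice name is derived from your UID; treat this as a sketch):
systemctl show -p TasksMax "user-$(id -u).slice"   # effective task limit for your user slice
cat /proc/sys/kernel/pid_max                       # kernel-wide PID limit
ulimit -u                                          # max user processes (RLIMIT_NPROC) for this shell's user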
Building on Basile's answer: you probably ran out of PIDs.
cat /proc/sys/kernel/pid_max gives me 32768 on my machine (the kernel default of 2^15), which is less than 50k.
EDIT: I missed that /proc/sys/kernel/pid_max is set to 200000 here. That probably isn't the issue in this case.
Because each process requires some resources: some RAM (including some kernel memory), some CPU, etc.
Each process has its own virtual address space, including its own call stack (and some of it requires physical resources, including several pages of RAM; read more about resident set size; on my desktop the RSS of some bash process is about 6Mbytes). So a process is actually some quite heavy stuff.
BTW, this is not specific to Linux.
Read more about operating systems, e.g. Operating Systems : Three Easy Pieces
Try also cat /proc/$$/maps and cat /proc/$$/status and read more about proc(5). Read about failure of fork(2) and of execve(2). The resource temporarily unavailable is for EAGAIN (see errno(3)), and several reasons can make fork fail with EAGAIN. And on my system, cat /proc/sys/kernel/pid_max gives 32768 (and reaching that limit gives EAGAIN for fork).
BTW, imagine if you could fork ten thousand processes. Then the context-switch time would dominate the running time.
Your Linux system looks like an AWS instance. Amazon won't let you create that many processes, because their hardware is not provisioned for that.
(On some costly supercomputer or server with, e.g., a terabyte of RAM and a hundred cores, perhaps you could run 50k processes; I guess they would need some particular kernel or kernel configuration. I recommend getting help from Amazon support.)

Elasticsearch process memory locking failed

I have set bootstrap.memory_lock: true
Updated /etc/security/limits.conf, adding memlock unlimited for the elasticsearch user
My Elasticsearch had been running fine for many months. Suddenly it failed a day ago. In the logs I can see the error below, and the process never starts:
ERROR: bootstrap checks failed
memory locking requested for elasticsearch process but memory is not locked
I hit ulimit -as and I can see max locked memory set to unlimited. What is going wrong here? I have been trying for hours but all in vain. Please help.
OS is RHEL 7.2
Elasticsearch 5.1.2
ulimit -as output
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 83552
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Here is what I have done to lock the memory on my ES nodes on RedHat/Centos 7 (it will work on other distributions if they use systemd).
You must make the change in 4 different places:
1) /etc/sysconfig/elasticsearch
On sysconfig: /etc/sysconfig/elasticsearch you should have:
ES_JAVA_OPTS="-Xms4g -Xmx4g"
MAX_LOCKED_MEMORY=unlimited
(replace 4g with half of your available RAM, as Elasticsearch recommends)
2) /etc/security/limits.conf
On security limits config: /etc/security/limits.conf you should have
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
3) /usr/lib/systemd/system/elasticsearch.service
On the service script: /usr/lib/systemd/system/elasticsearch.service you should uncomment:
LimitMEMLOCK=infinity
You should run systemctl daemon-reload after changing the service script.
4) /etc/elasticsearch/elasticsearch.yml
On elasticsearch config finally: /etc/elasticsearch/elasticsearch.yml you should add:
bootstrap.memory_lock: true
That's it. Restart your node and the RAM will be locked; you should notice a major performance improvement.
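After the restart you can verify both the systemd limit and that Elasticsearch actually locked its memory; the curl query uses the standard _nodes API (adjust host and port to your setup):
systemctl show elasticsearch | grep -i limitmemlock       # should report LimitMEMLOCK=infinity
curl -s 'localhost:9200/_nodes?filter_path=**.mlockall'   # should show "mlockall" : true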
OS = Ubuntu 16
ElasticSearch = 5.6.3
I also used to have the same problem.
I set in elasticsearch.yml
bootstrap.memory_lock: true
and I got this in my logs:
memory locking requested for elasticsearch process but memory is not locked
I tried several things, but actually you need to do only one thing (according to https://www.elastic.co/guide/en/elasticsearch/reference/master/setting-system-settings.html):
file:
/etc/systemd/system/elasticsearch.service.d/override.conf
add
[Service]
LimitMEMLOCK=infinity
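One way to create that drop-in without editing files by hand (a sketch; systemctl edit creates exactly the override.conf path shown above):
sudo systemctl edit elasticsearch        # opens /etc/systemd/system/elasticsearch.service.d/override.conf
# paste the two lines:
#   [Service]
#   LimitMEMLOCK=infinity
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch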
A little bit of explanation.
The really funny thing is that systemd does not really care about ulimit settings at all (https://fredrikaverpil.github.io/2016/04/27/systemd-and-resource-limits/). You can easily check this fact.
Set in /etc/security/limits.conf
elasticsearch - memlock unlimited
Check that max locked memory is unlimited for the elasticsearch user:
$ sudo su elasticsearch -s /bin/bash
$ ulimit -l
disable bootstrap.memory_lock: true in /etc/elasticsearch/elasticsearch.yml
# bootstrap.memory_lock: true
start service elasticsearch via systemd
# service elasticsearch start
Check what max locked memory setting the elasticsearch service has after it has started:
# systemctl show elasticsearch | grep -i limitmemlock
OMG! Despite having set an unlimited max memlock size via ulimit, systemd
completely ignores it:
LimitMEMLOCK=65536
So we come to a conclusion:
to start Elasticsearch via systemd with
bootstrap.memory_lock: true
enabled, we don't need to care about the ulimit settings, but we do need to
set the limit explicitly in the systemd config file.
The end of story.
Try setting the following:
in the /etc/sysconfig/elasticsearch file:
MAX_LOCKED_MEMORY=unlimited
in /usr/lib/systemd/system/elasticsearch.service:
LimitMEMLOCK=infinity
Make sure that your Elasticsearch start process is configured with an unlimited memlock. If, for example, you start Elasticsearch as a different user than the one configured in /etc/security/limits.conf, or as root while limits.conf only defines a wildcard entry (which does not apply to root), it won't work.
Test it to be sure:
you could, e.g., put ulimit -a ; exit just after the "#Start Daemon" line in /etc/init.d/elasticsearch and start it with bash /etc/init.d/elasticsearch start (adapt accordingly to your start mechanism).
check for the actual limit when the process is running (albeit short) with:
cat /proc/<pid>/limits
You will find lines similar to this:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
<truncated>
Then, depending on the runner or container (in my case it was supervisord's minfds value), you can lift the actual limit in its configuration.
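For the supervisord case specifically, the knob referred to above lives in the main supervisord config file; the path and value here are assumptions for illustration (when run as root, supervisord raises its own file-descriptor limit to minfds, and child processes inherit it):
; e.g. /etc/supervisor/supervisord.conf
[supervisord]
minfds=65536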
I hope this gives a little hint for more general cases.
I followed this post.
On Ubuntu 18.04 with Elasticsearch 6.x, there was no LimitMEMLOCK=infinity entry in the file /usr/lib/systemd/system/elasticsearch.service.
So adding it to that file and setting MAX_LOCKED_MEMORY=unlimited in /etc/default/elasticsearch did the trick.
The JVM options can be set in the /etc/elasticsearch/jvm.options file.
If you use the tar distribution and want to monitor it with monit, you
have to tell monit to use unlimited; all other places for this configuration are ignored.
Add ulimit -s unlimited at the beginning of /etc/init.d/monit, then run systemctl daemon-reload, and then service monit restart and monit start $yourMonitLabel.
One thing it can be is that your /tmp is mounted with noexec (https://discuss.elastic.co/t/not-able-to-start-elasticsearch-due-to-failed-memory-lock/158009/6); check your logs and see if they complain about UnsatisfiedLinkError: Native library.
This applies especially to CentOS/RedHat, but maybe others. It might be fixed in ES 7?

How can the physical RAM size be determined in Linux programmatically?

On the command line this can be found out using the 'free' utility and 'cat /proc/meminfo'.
What would be the different ways to find out the physical RAM size in Linux programmatically from:
a userspace application
a kernel module
What API calls are available?
#include <unistd.h>
long long physical_mem_bytes = (long long) sysconf (_SC_PHYS_PAGES) * sysconf (_SC_PAGESIZE);
Other than the command line ulimit, I don't know of a way of finding maximum memory for an individual process.
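As a quick cross-check of the same computation from the shell (getconf exposes the same sysconf values; this is only an illustration):
echo $(( $(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE) ))   # physical RAM in bytes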
Programmatically, Linux won't tell you the actual installed physical size (MemTotal in /proc/meminfo excludes memory reserved by the firmware and the kernel). Instead you should read this info from SMBIOS with, e.g.,
sudo dmidecode -t memory | fgrep -ie 'size:'
This will give you results like the following (from a box with 4 RAM banks, only 2 installed):
Maximum Memory Module Size: 16384 MB
Maximum Total Memory Size: 65536 MB
Installed Size: 2048 MB (Single-bank Connection)
Enabled Size: 2048 MB (Single-bank Connection)
Installed Size: Not Installed
Enabled Size: Not Installed
Installed Size: 2048 MB (Single-bank Connection)
Enabled Size: 2048 MB (Single-bank Connection)
Installed Size: Not Installed
Enabled Size: Not Installed
Size: 2048 MB
Size: No Module Installed
Size: 2048 MB
Size: No Module Installed
Add the reported sizes (or Enabled Sizes, but some BIOSes empirically don't report that) to get (in this case) 4096 MB. (Extra points for code that automates the parsing and arithmetic, but you can probably do that in your head nearly as reliably.)
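A rough way to automate that parsing and summing (this assumes the per-module lines look like the "Size: 2048 MB" lines above; adjust if your BIOS reports sizes in GB):
sudo dmidecode -t memory | awk '$1 == "Size:" && $3 == "MB" { sum += $2 } END { print sum " MB installed" }'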
To check your computation, run
fgrep -e 'MemTotal:' /proc/meminfo
The value reported by /proc/meminfo should not be more than the value you compute from dmidecode. In this case, empirically I get
MemTotal: 3988616 kB
cat /proc/meminfo
Specifically for memory, I got this result from what Jared said:
sudo dmidecode -t memory
There you can read the specs for each individual memory slot, so you will see something like 2048 MB. In my case I have two of these, making 4 GB, even though my non-PAE kernel only shows about 3.3 GB; other applications won't report the real physical memory, only dmidecode. Thanks!

Too many open files error on Ubuntu 8.04

mysqldump: Couldn't execute 'show fields from `tablename`': Out of resources when opening file './databasename/tablename#P#p125.MYD' (Errcode: 24) (23)
On checking error 24 in the shell, it says:
>>perror 24
OS error code 24: Too many open files
How do I solve this?
First, to identify the limits for the particular user or group, do the following:
root@ubuntu:~# sudo -u mysql bash
mysql@ubuntu:~$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 71680
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 71680
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
mysql@ubuntu:~$
The important line is:
open files (-n) 1024
As you can see, your operating system vendor ships this version with the basic Linux configuration - 1024 files per process.
This is obviously not enough for a busy MySQL installation.
Now, to fix this you have to add the following lines to /etc/security/limits.conf:
mysql soft nofile 24000
mysql hard nofile 32000
Some flavors of Linux also require additional configuration to get this to stick to daemon processes versus login sessions. In Ubuntu 10.04, for example, you need to also set the pam session limits by adding the following line to /etc/pam.d/common-session:
session required pam_limits.so
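After editing limits.conf (and the PAM session config where needed) and restarting MySQL, you can check the limit the running daemon actually got; the pgrep pattern assumes the process is named mysqld:
cat /proc/$(pgrep -x mysqld | head -n1)/limits | grep -i 'open files'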
Quite an old question but here are my two cents.
What you could be experiencing is that the MySQL server didn't set its open_files_limit variable right.
You can see how many files you are allowing MySQL to open with:
mysql> SHOW VARIABLES LIKE 'open_files_limit';
It is probably set to 1024 even if you already set the OS limits to higher values.
You can use the option --open-files-limit=XXXXX on the command line for mysqld.
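To make that persistent instead of passing it on the command line, the equivalent my.cnf entry looks like this (the file path varies by distribution, and 24000 is just an example value):
# e.g. /etc/mysql/my.cnf
[mysqld]
open_files_limit = 24000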
Cheers
Add --single-transaction to your mysqldump command.
It could also be that some code that accesses the tables didn't close them properly, and over time the limit on open files was reached.
Please refer to http://dev.mysql.com/doc/refman/5.0/en/table-cache.html for a possible reason as well.
Restarting mysql should cause this problem to go away (although it might happen again unless the underlying problem is fixed).
You can increase your OS limits by editing /etc/security/limits.conf.
You can also install the "lsof" (LiSt Open Files) command to see the relation between files and processes.
There is no need to configure PAM, I think. On my system (Debian 7.2 with Percona 5.5.31-rel30.3-520.squeeze) I have:
Before my.cnf changes:
# cat /proc/12345/limits | grep "open files"
Max open files 1186 1186 files
After adding open_files_limit = 4096 to my.cnf and restarting mysqld, I got:
# cat /proc/23456/limits | grep "open files"
Max open files 4096 4096 files
12345 and 23456 are the mysqld process PIDs, of course.
SHOW VARIABLES LIKE 'open_files_limit' shows 4096 now.
Everything looks OK, while "ulimit" shows no changes:
# su - mysql -c bash
# ulimit -n
1024
There is no guarantee that "24" is an OS-level error number, so don't assume that this means that too many file handles are open. It could be some type of internal error code used within mysql itself. I'd suggest asking on the mysql mailing lists about this.
