SIPP: open file limit > FD_SETSIZE - linux

I am currently trying to start SIPp 3.3 on openSUSE 11 from Java, through a bash process.
When I start SIPp with
proc = Runtime.getRuntime().exec("/bin/bash", null, wd);
...
printWriter.println("./sipp -i "+Config.IP+" -sf uac.xml "+Config.IP+":5060");
the error stream gives the following output
Warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE = 1024
Resolving remote host '137.58.120.17'... Done.
What does the warning mean? And is it possible that the bash terminal freezes because of this warning?
How can I remove this warning?

I'm the maintainer of SIPp and I've been looking into the FD_SETSIZE issues recently.
As mentioned in "Increasing limit of FD_SETSIZE and select", FD_SETSIZE bounds the file descriptors that can be passed to the select() call: select() uses a fixed-size bit-field internally to keep track of file descriptors, so every descriptor it handles must be numerically lower than FD_SETSIZE. SIPp had code in it to check its own maximum open file limit (i.e. the one shown by ulimit -n), and if it was larger than FD_SETSIZE, to reduce it to FD_SETSIZE in order to avoid issues with select().
However, this has actually been unnecessary for a while: SIPp has used poll() rather than select() since before I became maintainer in 2012, and poll() doesn't have the FD_SETSIZE limit and has been POSIX-standardised and portable since 2001. As of the v3.4 release, SIPp also uses epoll where available for even better performance.
I've now removed this FD_SETSIZE check in the development code at https://github.com/SIPp/sipp, and replaced it with a more sensible check - making sure that the maximum number of open sockets (plus the maximum number of open calls, each of which may open its own media socket) is below the maximum number of file descriptors.
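As a rough illustration of the kind of check described above, here is a minimal Python sketch (SIPp itself is not written in Python). The function name fd_budget_ok and the exact comparison are assumptions made for illustration, not SIPp's actual code.

```python
import resource


def fd_budget_ok(max_sockets, max_calls, fd_limit=None):
    """Return True if sockets plus one media socket per call fit under the fd limit.

    Hypothetical helper: the name and the strict '<' comparison are assumptions.
    """
    if fd_limit is None:
        # The soft limit is what `ulimit -n` reports for this process.
        fd_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if fd_limit == resource.RLIM_INFINITY:
        return True
    return max_sockets + max_calls < fd_limit
```

With a limit of 1024, for example, fd_budget_ok(100, 100, 1024) passes while fd_budget_ok(50000, 10, 1024) does not.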

This warning is supposedly related to the multi-socket transport options in SIPp, e.g. -t un or -t tn (though I have observed it generating these warnings even when neither of these is specified).
SIPp includes an option that controls this warning message:
-skip_rlimit : Do not perform rlimit tuning of file descriptor limits. Default: false.
Although it has the desired effect for me of suppressing the warning output, on its own it seems like a slightly dangerous option. I'm not certain what will happen if you include this option and SIPp attempts to open more sockets than FD_SETSIZE allows, but you may avoid possible problems on that front by also including the -max_socket argument:
-max_socket : Set the max number of sockets to open simultaneously. This option is
significant if you use one socket per call. Once this limit is reached,
traffic is distributed over the sockets already opened. Default value is
50000

It means pretty much what it says... Your per-process open file limit (ulimit -n) is greater than the pre-defined constant FD_SETSIZE, which is 1024. So the program is adjusting your open file limit down to match FD_SETSIZE.
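The adjustment described above can be mimicked in a few lines of Python. This is only a sketch: the FD_SETSIZE value of 1024 is taken from the warning text, and the helper name effective_fd_limit is hypothetical.

```python
import resource

FD_SETSIZE = 1024  # assumed: the value printed in the warning above


def effective_fd_limit(soft_limit, fd_setsize=FD_SETSIZE):
    """Mimic the clamp: never allow more open files than FD_SETSIZE."""
    if soft_limit == resource.RLIM_INFINITY or soft_limit > fd_setsize:
        return fd_setsize
    return soft_limit


if __name__ == "__main__":
    # The soft RLIMIT_NOFILE value is what `ulimit -n` shows.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("ulimit -n:", soft, "-> effective limit:", effective_fd_limit(soft))
```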

Related

ArangoDB Too many open files

For a few days we have been encountering a problem with our ArangoDB installation. Between a few minutes and an hour after start-up, all connections to the database are refused. The arango log file says that there are "Too many open files". Running "lsof | grep arango | wc -l" shows that the database has around 50,000 open file handles, which is far below the maximum allowed by the Linux system (around 3 million).
Does anyone have an idea where this error comes from?
We are using Ubuntu Linux with a 3.13 kernel, 30 GB RAM and three cores. The database is still very small, with around 1.5 million entries and a size of 50 GB.
Thx, secana
EDIT:
"netstat -anpt | fgrep 2480" shows:
root#syssec-graphdb-001-test:~# netstat -anpt | fgrep 2480
tcp 0 0 10.215.17.193:2480 0.0.0.0:* LISTEN 7741/arangod
tcp 0 0 10.215.17.193:2480 10.215.50.30:53453 ESTABLISHED 7741/arangod
tcp 0 0 10.215.17.193:2480 10.215.50.31:49299 ESTABLISHED 7741/arangod
tcp 0 0 10.215.17.193:2480 10.215.50.30:53155 ESTABLISHED 7741/arangod
"ulimit -n" has a result of 1024, so I think that the ~50,000 are all arango processes together.
Last lines in log file before the database died:
2015-05-26T12:20:43Z [9672] ERROR cannot open datafile '/data/arangodb/databases/database-235999516/collection-28464454696/datafile-18806474509149.db': 'Too many open files'
2015-05-26T12:20:43Z [9672] ERROR cannot open datafile '/data/arangodb/databases/database-235999516/collection-28464454696/datafile-18806474509149.db': Too many open files
2015-05-26T12:20:43Z [9672] DEBUG [arangod/VocBase/collection.cpp:1632] cannot open '/data/arangodb/databases/database-235999516/collection-28464454696', check failed
2015-05-26T12:20:43Z [9672] ERROR cannot open document collection from path '/data/arangodb/databases/database-235999516/collection-28464454696'
It looks like it will make sense to increase the max. number of open files a process is allowed to manage. Given the stated database size of around 50 GB, the (presumably default) value of 1024 seems to be too low.
arangod will require one file descriptor for each parallel client connection. That may not be many, but in the face of HTTP keep-alive connections this could already account for several file descriptors.
Additionally, each datafile of an active collection will need to be memory-mapped and cost one file descriptor as well. With the default datafile size of 32 MB, a database size of 50 GB (on disk) will already consume 1,600 file descriptors:
50 GB database size / (32 MB default size / 1 datafile) = 1600 datafiles
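The same arithmetic as a small Python sketch, assuming binary units (1 GB = 1024 MB) and rounding up to whole datafiles; the function name is made up for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def datafiles_needed(database_bytes, datafile_bytes):
    """Number of datafiles (one descriptor each) needed to hold the database."""
    # Ceiling division: a partially filled datafile still costs a descriptor.
    return -(-database_bytes // datafile_bytes)


print(datafiles_needed(50 * GIB, 32 * MIB))  # 1600
```

This counts datafiles only; descriptors for client connections come on top of it.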
Increasing the ulimit -n value for the arangod user and environment therefore will make sense. You can confirm that arangod can actually use the configured number of file descriptors by starting it with option --server.descriptors-minimum <value>, e.g.
--server.descriptors-minimum 32768
for that many file descriptors. If arangod cannot effectively use that specified amount of file descriptors, it will fail at start with a fatal error. Of course that option can also be put into the arangod.conf file.
Additionally, the default size for (new) datafiles can be increased via the journalSize parameter for collections. That won't help right now, but will lower the number of required file descriptors for data saved in the future.
For emergencies where you can't restart the database, as in my case, you will find this blog post, which explains how to change the ulimit of a running process, very useful.
If your distribution has util-linux 2.21 or later, you can use the "prlimit" tool; otherwise you can compile the small example C program from the blog post, which worked great for me.
To check the actual limits of a process you can use:
cat /proc/<PID>/limits
Good luck!
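The limits file mentioned above can also be read from a script. The following Python sketch assumes the usual Linux layout of the "Max open files" row (name, soft limit, hard limit, units); the function names are hypothetical.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings from the text of /proc/<PID>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Assumed row layout: Max open files <soft> <hard> files
            return fields[3], fields[4]
    return None


def read_max_open_files(pid="self"):
    """Read the limits of a process; pid may be a number or 'self' (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())
```

The values are kept as strings because a limit may also be reported as "unlimited".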

Explain Node's st module fd and stat configuration

Node's st module documentation mentions fd and stat configuration:
cache: { // specify cache:false to turn off caching entirely
fd: {
max: 1000, // number of fd's to hang on to
maxAge: 1000*60*60, // amount of ms before fd's expire
},
stat: {
max: 5000, // number of stat objects to hang on to
maxAge: 1000 * 60, // number of ms that stats are good for
},
...
}
But what are these, and how do they impact st's delivery of static files? Can you give examples?
Those are configuration options for the cache module that st uses, which is lru-cache.
fd
This stands for file descriptor. Every time the st module wants to serve a file and needs to read its content, it has to open a file descriptor. Caching file descriptors removes the time taken to open a file.
If the file is moved or deleted, reading with the file descriptor will still result in the old content.
Each system has a maximum number of open file descriptors, both per process and globally, and once you run out, you can't open any more files. So make sure you set the cache.fd.max option lower than the per-process limit.
stat
It represents the result of calls to fs.stat and friends. It is needed for setting etag, or responding with a 304.
The max option is the maximum number of items (or total size) held in the cache, and maxAge is the maximum amount of time an item can remain in memory.
Obviously, for all the cache types (fd, stat, content, ...), the higher the numbers (max and maxAge) are, the more requests are served much faster, but the more memory is consumed.
Setting fd.max to an optimal value might be tricky, since technically a file descriptor is also opened for each connection being served. You would want to leave some room for the connections you want to handle, because if you hit the limit, your server won't accept any more connections. Set it according to the number of concurrent connections your server is expected to handle and the maximum number of open files for your process on your system. Here's how you would check/change the maximum on Linux: http://cherry.world.edoors.com/CPZKoGkpxfbQ
As for stat.max, I suggest setting it according to available memory. I suggest testing/measuring it in your production system to find how much memory is used per 1 stat object, so you can decide.
Setting maxAge depends on the frequency of your files being updated.
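To make the interplay of max and maxAge concrete, here is a minimal Python sketch of a cache with both settings. It is not lru-cache's actual implementation (that is a JavaScript module); the class name and the injectable clock are assumptions for illustration.

```python
import time
from collections import OrderedDict


class LruCache:
    """Keep at most max_items entries, each valid for max_age_ms milliseconds."""

    def __init__(self, max_items, max_age_ms, clock=time.monotonic):
        self.max_items = max_items
        self.max_age = max_age_ms / 1000.0
        self.clock = clock
        self._items = OrderedDict()  # key -> (stored_at, value), oldest first

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.max_age:
            del self._items[key]  # entry is older than maxAge: drop it
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        self._items[key] = (self.clock(), value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict the least recently used
```

A larger max keeps more entries (more memory, more hits), while a larger maxAge keeps each entry longer, at the risk of serving stale data.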

Where is OPEN_MAX defined for Linux systems?

OPEN_MAX is the constant that defines the maximum number of open files allowed for a single program.
According to Beginning Linux Programming 4th Edition, Page 101 :
The limit, usually defined by the constant OPEN_MAX in limits.h, varies from system to system, ...
On my system, the file limits.h in directory /usr/lib/gcc/x86_64-linux-gnu/4.6/include-fixed does not have this constant. Am I looking at the wrong limits.h, or has the location of OPEN_MAX changed since 2008?
For what it's worth, the 4th edition of Beginning Linux Programming was published in 2007; parts of it may be a bit out of date. (That's not a criticism of the book, which I haven't read.)
It appears that OPEN_MAX is deprecated, at least on Linux systems. The reason appears to be that the maximum number of files that can be opened simultaneously is not fixed, so a macro that expands to an integer literal is not a good way to get that information.
There's another macro, FOPEN_MAX, that should be similar; I can't think of a reason why OPEN_MAX and FOPEN_MAX, if they're both defined, should have different values. But FOPEN_MAX is mandated by the C language standard, so systems don't have the option of not defining it. The C standard says that FOPEN_MAX
expands to an integer constant expression that is the minimum number of files that
the implementation guarantees can be open simultaneously
(If the word "minimum" is confusing, it's a guarantee that a program can open at least that many files at once.)
If you want the current maximum number of files that can be opened, take a look at the sysconf() function; on my system, sysconf(_SC_OPEN_MAX) returns 1024. (The sysconf() man page refers to a symbol OPEN_MAX. This is not a count, but a value recognized by sysconf(). And it's not defined on my system.)
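For comparison, the same run-time query can be made from Python, which exposes sysconf() and getrlimit() through the os and resource modules. This is a small sketch; the function name is made up, and the printed values depend on the system.

```python
import os
import resource


def open_file_limits():
    """Return (sysconf value, soft rlimit, hard rlimit) for open files."""
    # Run-time equivalent of sysconf(_SC_OPEN_MAX) in C.
    sysconf_value = os.sysconf("SC_OPEN_MAX")
    # The soft limit is the one shown by `ulimit -n`.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return sysconf_value, soft, hard


if __name__ == "__main__":
    print(open_file_limits())
```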
I've searched for OPEN_MAX (word match, so excluding FOPEN_MAX) on my Ubuntu system, and found the following (these are obviously just brief excerpts):
/usr/include/X11/Xos.h:
# ifdef __GNU__
# define PATH_MAX 4096
# define MAXPATHLEN 4096
# define OPEN_MAX 256 /* We define a reasonable limit. */
# endif
/usr/include/i386-linux-gnu/bits/local_lim.h:
/* The kernel header pollutes the namespace with the NR_OPEN symbol
and defines LINK_MAX although filesystems have different maxima. A
similar thing is true for OPEN_MAX: the limit can be changed at
runtime and therefore the macro must not be defined. Remove this
after including the header if necessary. */
#ifndef NR_OPEN
# define __undef_NR_OPEN
#endif
#ifndef LINK_MAX
# define __undef_LINK_MAX
#endif
#ifndef OPEN_MAX
# define __undef_OPEN_MAX
#endif
#ifndef ARG_MAX
# define __undef_ARG_MAX
#endif
/usr/include/i386-linux-gnu/bits/xopen_lim.h:
/* We do not provide fixed values for
ARG_MAX Maximum length of argument to the `exec' function
including environment data.
ATEXIT_MAX Maximum number of functions that may be registered
with `atexit'.
CHILD_MAX Maximum number of simultaneous processes per real
user ID.
OPEN_MAX Maximum number of files that one process can have open
at anyone time.
PAGESIZE
PAGE_SIZE Size of bytes of a page.
PASS_MAX Maximum number of significant bytes in a password.
We only provide a fixed limit for
IOV_MAX Maximum number of `iovec' structures that one process has
available for use with `readv' or writev'.
if this is indeed fixed by the underlying system.
*/
Aside from the link given by cste, I would like to point out that there is a /proc/sys/fs/file-max entry that provides the number of files THE SYSTEM can have open at any given time.
Here's some docs:
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Directory_Server/8.2/html/Performance_Tuning_Guide/system-tuning.html
Note that this is not to say that there's a GUARANTEE you can open that many files - if the system runs out of some resource (e.g. "no more memory available"), then it may well fail.
FOPEN_MAX indicates that the C library allows at least this many files to be opened (as discussed), but there are other limits that may be hit first. Say, for example, the SYSTEM limit is 4000 files, and the applications already running have 3990 files open. Then you won't be able to open more than 7 files [since stdin, stdout and stderr take up three slots too]. And if rlimit is set to 5, then you can only open 2 files of your own.
In my opinion, the best way to know if you can open a file is to open it. If that fails, you have to do something else. If you have some process that needs to open MANY files [e.g. a multithreaded search/compare on a machine with 256 cores and 8 threads per core and each thread uses three files (file "A", "B" and "diff") ], then you may need to ensure that your FOPEN_MAX allows for 3 * 8 * 256 files being opened before you start creating threads, as a thread that fails to open its files will be meaningless. But for most ordinary applications, just try to open the file, if it fails, tell the user (log, or something), and/or try again...
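The "just try to open it" approach might look like this in Python. It is a sketch only: the helper name try_open and the wording of the returned reasons are assumptions.

```python
import errno


def try_open(path, mode="r"):
    """Return (file, None) on success or (None, reason) on failure."""
    try:
        return open(path, mode), None
    except OSError as exc:
        # EMFILE: per-process limit reached; ENFILE: system-wide limit reached.
        if exc.errno in (errno.EMFILE, errno.ENFILE):
            return None, "out of file descriptors"
        return None, exc.strerror or str(exc)
```

The caller can then log the reason, retry later, or give up, instead of trying to predict in advance whether the open will succeed.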
I suggest using grep to search for this constant under /usr/include:
grep -rn --col OPEN_MAX /usr/include
...
...
/usr/include/stdio.h:159: FOPEN_MAX Minimum number of files that can be open at once.
...
...
Hope it helps you

How do I get amount of queued data for UDP socket?

To see how well I'm doing in processing incoming data, I'd like to measure the queue length at my TCP and UDP sockets.
I know that I can get the queue size via the SO_RCVBUF socket option, and that ioctl(<sockfd>, SIOCINQ, &<some_int>) tells me the information for TCP sockets. But for UDP the SIOCINQ/FIONREAD ioctl returns only the size of the next pending datagram. Is there a way to get the queue size for UDP, without having to parse system tables such as /proc/net/udp?
FWIW, I did some experiments to map out the behavior of FIONREAD on different platforms.
Platforms where FIONREAD returns all the data pending in a SOCK_DGRAM socket:
Mac OS X, NetBSD, FreeBSD, Solaris, HP-UX, AIX, Windows
Platforms where FIONREAD returns only the bytes for the first pending datagram:
Linux
It might also be worth noting that some implementations include headers or other overhead bytes in the count, while others only count the payload bytes. Linux appears to return the payload size, not including IP headers.
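The behavior is easy to probe on a given platform. The following Python sketch sends two datagrams of 10 and 20 bytes over loopback and asks FIONREAD how much is pending; the function names are hypothetical, and the result depends on the platform as described above.

```python
import fcntl
import select
import socket
import struct
import termios


def pending_bytes(sock):
    """Return the byte count that the FIONREAD ioctl reports for this socket."""
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def probe_udp_fionread():
    """Queue two datagrams (10 and 20 bytes) and query FIONREAD once."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # let the kernel pick a free port
        address = receiver.getsockname()
        sender.sendto(b"a" * 10, address)
        sender.sendto(b"b" * 20, address)
        # Wait until at least one datagram is readable before querying.
        select.select([receiver], [], [], 1.0)
        return pending_bytes(receiver)
    finally:
        sender.close()
        receiver.close()
```

A result equal to the size of the first datagram indicates the Linux-style behavior; a result covering both datagrams indicates the other group.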
As ldx mentioned, it is not supported through ioctl or getsockopt.
It seems to me that the current implementation of SIOCINQ was aimed at determining how large a buffer is needed to read everything that is waiting (though I guess it is not so useful for that, as the amount can change between querying it and the actual read).
There are many other metrics which are not exposed through such system calls; I guess there is no real need for them in normal production usage.
You can check the drops/errors through "netstat -su", or better, using SNMP (udpInErrors), if you just want to monitor the machine state.
BTW: you always have the option of hacking the kernel code to expose this value (or others).

Increasing the number of file descriptors in Linux

I have a long running process which monitors the system and prints periodic logs. If I let it run for longer than 10-15 minutes, it exits with a message saying:
Too many open files.
The program is set up using the real-time timer_create() and timer_settime(), which raise a SIGUSR1 every 2 seconds. In the handler there is one fork(), with an exec() in the child. There is a wait in the parent, followed by mmap() and stream operations on the /proc/acpi/battery/state, /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq and scaling_setspeed files.
I have taken care to close the stream FILE * pointers in the periodic signal handler and at all other places. I have also ensured munmap() of all the mapped files.
How can I get around this?
Should I increase the maximum file descriptors allowed or should I increase the maximum open files shown by ulimit -aS?
Why is this happening if I am closing all the FILE * using fclose()?
Here are the values for my system as of now:
# cat /proc/sys/fs/file-max
152808
# ulimit -aS
.
.
.
.
open files (-n) 1024
Use lsof or a debugger to find what files your process has open. Increasing the limit will just delay the point at which you run out of descriptors.
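On Linux, a quick way to see what a process has open without lsof is to list /proc/<PID>/fd. Below is a minimal Python sketch of that idea; the function name open_fds is made up for illustration.

```python
import os


def open_fds(pid="self"):
    """Map fd number -> target for a process, read from /proc (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd vanished between listing and readlink, for example
            # the descriptor that was used to list the directory itself.
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(open_fds().items()):
        print(fd, target)
```

Running this periodically against the leaking process and comparing the output shows which kind of descriptor (file, pipe, socket) keeps accumulating.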

Resources