Explain Node's st module fd and stat configuration - node.js

Node's st module documentation mentions fd and stat configuration:
cache: { // specify cache:false to turn off caching entirely
  fd: {
    max: 1000, // number of fd's to hang on to
    maxAge: 1000*60*60, // amount of ms before fd's expire
  },
  stat: {
    max: 5000, // number of stat objects to hang on to
    maxAge: 1000 * 60, // number of ms that stats are good for
  },
}
But what are these, and how do they affect st's delivery of static files? Can you give examples?

Those are settings for st's cache module, which is lru-cache.
fd stands for file descriptor. Every time the st module wants to serve a file and needs to read its content, it has to open a file descriptor. Caching the file descriptor removes the time taken to open the file.
If the file is moved or deleted, reading with the file descriptor will still result in the old content.
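The stale-content effect can be reproduced in a few lines. This is a minimal sketch in Python (used only for illustration; st itself is JavaScript), assuming a POSIX system where a deleted file stays readable through descriptors opened earlier. The helper name read_after_unlink is made up for the example.

```python
import os
import tempfile


def read_after_unlink(content: bytes) -> bytes:
    """Open a file, delete it, then read it through the still-open descriptor."""
    tmp_fd, path = tempfile.mkstemp()
    try:
        os.write(tmp_fd, content)
    finally:
        os.close(tmp_fd)

    fd = os.open(path, os.O_RDONLY)   # plays the role of the cached fd
    try:
        os.unlink(path)               # the name is gone from the directory
        # The old bytes are still reachable through the open descriptor.
        return os.read(fd, len(content) + 1)
    finally:
        os.close(fd)
```

A server that keeps such a descriptor cached keeps serving the old bytes until the cache entry expires (maxAge) or is evicted.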
Each system has a maximum number of open file descriptors, both per process and globally, and once you run out you can't open any more files. So make sure you set the cache.fd.max option lower than the per-process limit.
stat represents the result of calls to fs.stat and friends. It is needed for setting the ETag or responding with a 304.
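To show why a stat result is enough to answer with a 304, here is a rough Python sketch. The ETag format (inode, size, mtime) and the function names are assumptions made for illustration; they are not taken from st's source.

```python
import os


def make_etag(st: os.stat_result) -> str:
    """Build a validator from stat fields (hypothetical format, not st's own)."""
    return '"%x-%x-%x"' % (st.st_ino, st.st_size, st.st_mtime_ns)


def status_for(path: str, if_none_match=None) -> int:
    """Return 304 when the client's cached ETag still matches, else 200."""
    etag = make_etag(os.stat(path))   # this stat result is what a cache would keep
    return 304 if if_none_match == etag else 200
```

With the stat result cached, the decision needs no disk access at all, which is the saving the stat cache buys. The flip side is that a changed file keeps getting the old answer until the cached stat expires.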
The max option is the maximum number of items (or total size) the cache will hold, and maxAge is the maximum amount of time an item can remain in memory.
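How the two limits interact can be sketched with a tiny cache. This is not lru-cache's implementation, only an illustrative Python model: the class name TinyLRU, the parameter names and the injectable clock are all assumptions of the example.

```python
import time
from collections import OrderedDict


class TinyLRU:
    """Illustrative cache with a size cap (max_items) and a time-to-live (max_age_ms)."""

    def __init__(self, max_items, max_age_ms, clock=time.monotonic):
        self.max_items = max_items
        self.max_age = max_age_ms / 1000.0
        self.clock = clock
        self.items = OrderedDict()          # key -> (stored_at, value)

    def set(self, key, value):
        self.items.pop(key, None)
        self.items[key] = (self.clock(), value)
        while len(self.items) > self.max_items:
            self.items.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.max_age:
            del self.items[key]             # older than max_age: treat as a miss
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return value
```

An entry disappears either because newer entries pushed it out (max) or because it got too old (maxAge), whichever happens first.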
Obviously, for all the cache types (fd, stat, content, ...), the higher the numbers (max and maxAge), the more requests are served from cache and the faster they get, but the more memory is consumed.
Setting fd.max to an optimal value can be tricky, because every connection being served also uses a file descriptor. You want to leave some room for the connections you need to handle, because if you hit the limit, your server won't accept any more connections. Set it according to the number of concurrent connections your server is expected to handle and the maximum number of open files for your process on your system. Here's how you would check/change the max number in linux: http://cherry.world.edoors.com/CPZKoGkpxfbQ
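The budgeting described above is simple arithmetic. A small Python sketch, assuming a Unix system; the function names and the reserve of 64 descriptors (for logs, stdio, libraries and so on) are arbitrary choices for the example, not recommendations from st.

```python
import resource


def fd_cache_budget(soft_limit, expected_connections, reserve=64):
    """Descriptors left for an fd cache after sockets and a safety reserve."""
    return max(0, soft_limit - expected_connections - reserve)


def current_soft_limit():
    """Soft per-process limit on open files (what `ulimit -n` reports)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```

For example, with a soft limit of 1024 and 500 expected concurrent connections, fd_cache_budget(1024, 500) leaves 460 descriptors, so the default fd.max of 1000 would be too high on such a system.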
As for stat.max, I suggest setting it according to available memory. Measure on your production system how much memory a single stat object uses, so you can decide.
Setting maxAge depends on the frequency of your files being updated.


How to monitor which files consume IOPS?

I need to understand which files consume the IOPS of my hard disk. Just using "strace" will not solve my problem: I want to know which files are really written to disk, not to the page cache. I tried to use "systemtap", but I cannot work out how to find which files (filenames or inodes) consume my IOPS. Are there any tools that will solve my problem?
Yeah, you can definitely use SystemTap for tracing that. When an upper layer (usually the VFS subsystem) wants to issue an I/O operation, it calls the submit_bio and generic_make_request functions. Note that these calls don't necessarily correspond to a single physical I/O operation. For example, writes to adjacent sectors can be merged by the I/O scheduler.
The trick is how to determine the file path name in generic_make_request. It is quite simple for reads, as this function is called in the same context as the read() call. Writes are usually asynchronous, so write() simply updates the page cache entry and marks it as dirty, while submit_bio gets called by one of the writeback kernel threads, which has no information about the original calling process.
Writes can be deduced by looking at the page reference in the bio structure: the page has a mapping field of type struct address_space. The struct file that corresponds to an open file also contains f_mapping, which points to the same address_space instance, and it also points to the dentry containing the name of the file (the path can be obtained with task_dentry_path).
So we need two probes: one to capture attempts to read/write a file and save the path and address_space into an associative array, and a second to capture generic_make_request calls (this is done by the ioblock.request probe).
Here is an example script which counts IOPS:
// maps struct address_space to path name
global paths;
// IOPS per file
global iops;

// Capture attempts to read and write by VFS
probe kernel.function("vfs_read"),
      kernel.function("vfs_write") {
    mapping = $file->f_mapping;
    // Assemble full path name for running task (task_current())
    // from open file "$file" of type "struct file"
    path = task_dentry_path(task_current(), $file->f_path->dentry,
                            $file->f_path->mnt);
    paths[mapping] = path;
}

// Attach to generic_make_request()
probe ioblock.request {
    for (i = 0; i < $bio->bi_vcnt ; i++) {
        // Each BIO request may have more than one page
        // to write
        page = $bio->bi_io_vec[i]->bv_page;
        mapping = @cast(page, "struct page")->mapping;
        iops[paths[mapping], rw] <<< 1;
    }
}

// Once per second drain iops statistics
probe timer.s(1) {
    foreach([path+, rw] in iops) {
        printf("%3d %s %s\n", @count(iops[path, rw]),
               bio_rw_str(rw), path);
    }
    delete iops
}
This example script works for XFS, but needs to be updated to support AIO and volume managers (including btrfs). Plus, I'm not sure how it will handle metadata reads and writes, but it is a good start ;)
If you want to know more on SystemTap you can check out my book: http://myaut.github.io/dtrace-stap-book/kernel/async.html
Maybe iotop can give you a hint about which processes are doing I/O, and from that you can get an idea of the related files.
iotop --only
The --only option shows only the processes or threads actually doing I/O, instead of all processes or threads.
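If iotop is not available, a similar per-process view can be approximated on Linux from /proc/<pid>/io, whose read_bytes and write_bytes counters are the kernel's attempt to count storage-layer traffic (as opposed to rchar/wchar, which count all read/write calls, including those served by the page cache). A minimal Python sketch; the function names are made up for the example, and like iotop it gives per-process numbers, not per-file ones.

```python
def parse_proc_io(text):
    """Parse the `name: value` lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            counters[name.strip()] = int(value)
    return counters


def read_proc_io(pid="self"):
    """Read the I/O counters of a process (Linux only; other users' PIDs need privileges)."""
    with open("/proc/%s/io" % pid) as f:
        return parse_proc_io(f.read())
```

Sampling read_proc_io(pid) twice and subtracting the write_bytes values gives the bytes that process pushed towards storage during the interval.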

SIPP: open file limit > FD_SETSIZE

I am trying to start SIPp 3.3 on openSUSE 11 from Java, through a bash console.
When I start SIPP with
proc = Runtime.getRuntime().exec("/bin/bash", null, wd);
printWriter.println("./sipp -i "+Config.IP+" -sf uac.xml "+Config.IP+":5060");
the error stream gives the following output
Warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE = 1024
Resolving remote host ''... Done.
What does the warning mean? And is it possible that the bash terminal freezes because of this warning?
How can I remove this warning?
I'm the maintainer of SIPp and I've been looking into the FD_SETSIZE issues recently.
As mentioned in "Increasing limit of FD_SETSIZE and select", FD_SETSIZE is the maximum file descriptor that can be passed to the select() call, as it uses a bit-field internally to keep track of file descriptors. SIPp had code in it to check its own maximum open file limit (i.e. the one shown by ulimit -n) and, if it was larger than FD_SETSIZE, to reduce it to FD_SETSIZE in order to avoid issues with select().
However, this has actually been unnecessary for a while: SIPp has used poll() rather than select() since before I became maintainer in 2012, and poll() doesn't have the FD_SETSIZE limit and has been POSIX-standardised and portable since 2001. SIPp now also uses epoll where available for even better performance, as of the v3.4 release.
I've now removed this FD_SETSIZE check in the development code at https://github.com/SIPp/sipp, and replaced it with a more sensible check - making sure that the maximum number of open sockets (plus the maximum number of open calls, each of which may open its own media socket) is below the maximum number of file descriptors.
This warning is supposedly related to the multi-socket transport options in SIPp, e.g. -t un or -t tn (though I have seen it generate these warnings even when none of them is specified).
SIPp includes an option that controls this warning message:
-skip_rlimit : Do not perform rlimit tuning of file descriptor limits. Default: false.
Though it has the desired effect for me of suppressing the warning output, on its own, it seems like a slightly dangerous option. Although I'm not certain of what will happen if you include this option and SIPp attempts to open more sockets than are available according to FD_SETSIZE, you may avoid possible problems on that front by also including the max_socket argument:
-max_socket : Set the max number of sockets to open simultaneously. This option is
significant if you use one socket per call. Once this limit is reached,
traffic is distributed over the sockets already opened. Default value is
It means pretty much what it says... Your per-process open file limit (ulimit -n) is greater than the pre-defined constant FD_SETSIZE, which is 1024. So the program is adjusting your open file limit down to match FD_SETSIZE.
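The adjustment described here amounts to a simple clamp. A Python sketch of that logic, for illustration only: it is not SIPp's code, the function name is made up, and FD_SETSIZE is hard-coded to the 1024 shown in the warning rather than read from the C headers.

```python
FD_SETSIZE = 1024  # value quoted in the warning; assumed, not queried from the system


def clamp_open_file_limit(soft_limit, fd_setsize=FD_SETSIZE):
    """Return (limit_to_use, warned), following the behaviour the warning describes."""
    if soft_limit > fd_setsize:
        return fd_setsize, True    # "limiting max. # of open files to FD_SETSIZE"
    return soft_limit, False
```

So with ulimit -n at 4096 the program would warn and carry on with 1024, while with ulimit -n at 1024 or below nothing is printed.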

TCP receiving and sending buffersizes in node.js

I have been working with node.js for the last 4 months and now want to increase the TCP receive and send buffer sizes.
My purpose is to speed up my application, and experimenting with buffer sizes may improve performance.
I have searched on Google but haven't found anything useful, except that you can change the default socket buffer sizes on Linux, as in the example on this website:
Is there any way to change/set the TCP send and receive buffer sizes for node.js I/O?
stream_wrap has an allocation callback passed to libuv, and that callback is given a suggested_size for the memory to allocate for receiving data. Right now it passes 64KB as the suggested size, and there's no way to change this afaik.
Is this along the line of your question?
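For comparison, the kernel-side buffers the question refers to are the per-socket SO_SNDBUF and SO_RCVBUF options, which are separate from the 64KB user-space read buffer discussed above. The sketch below uses Python only to show the OS-level knob; it is not a Node API, and the function name is made up for the example.

```python
import socket


def set_buffer_sizes(sock, send_bytes, recv_bytes):
    """Request new kernel buffer sizes and return what the kernel actually set."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_bytes)
    return (sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
            sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```

On Linux the values read back usually differ from the ones requested: the kernel doubles the request to leave room for bookkeeping and caps it at net.core.wmem_max / net.core.rmem_max.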
I found the stream_wrap on git:
git... src/stream_wrap src file
If you navigate to src/stream_wrap.cc in the node.js src folder and look up the following code:
  // If less than 64kb is remaining on the slab allocate a new one.
  if (SLAB_SIZE - slab_used < 64 * 1024) {
    slab = NewSlab(global, wrap->object_);
  } else {
    wrap->object_->SetHiddenValue(slab_sym, slab_obj);
  }
Then you might be able to change the size.
@Trev Norris, do you know anything about this?

linux write(): does it try to write as many bytes as possible?

If I use write in this way: write (fd, buf, 10000000 /* 10MB */) where fd is a socket and uses blocking I/O, will the kernel try to flush as many bytes as possible so that only one call is enough? Or do I have to call write several times according to its return value? If that happens, does it mean something is wrong with fd?
============================== EDITED ================================
Thanks for all the answers. Furthermore, if I put fd into poll and it returns successfully with POLLOUT, does that mean the call to write cannot block and will write all the data, unless something is wrong with fd?
In blocking mode, write(2) normally returns only once the specified number of bytes has been written. If it cannot write yet, it waits. It can still return early with a short count, for example when it is interrupted by a signal.
In non-blocking (O_NONBLOCK) mode it will not wait; it returns right away. If it can write all the bytes, the call succeeds; otherwise it writes what it can, or fails and sets errno accordingly. You then have to check errno: if it is EWOULDBLOCK or EAGAIN, you have to invoke the same write again.
From manual of write(2)
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
So yes, there can be something wrong with fd.
Also note this
A successful return from write() does not make any guarantee that data has been committed to disk. In fact, on some buggy implementations, it does not even guarantee that space has successfully been reserved for the data. The only way to be sure is to call fsync(2) after you are done writing all your data.
/etc/sysctl.conf is used in Linux to set parameters for the TCP protocol, which is what I assume you mean by a socket. There may be a lot of parameters there, but when you dig through it, basically there is a limit to the amount of data the TCP buffers can hold at one time.
So if you try to write 10 MB of data in one go, write() returns a ssize_t value telling you how many bytes it actually accepted. Always check the return value of the write() system call. If the system allowed 10 MB, write() would return that value.
The value is
net.core.wmem_max = [some number]
If you change some number to a value large enough to allow 10MB you can write that much. DON'T do that! You could cause other problems. Research settings before you do anything. Changing settings can decrease performance. Be careful.
has basic C information for TCP settings. Also check out /proc/sys/net on your box.
One other point: TCP is a two-way door, so just because you can send a zillion bytes at one time does not mean the other side can read it or even handle it. Your socket may just block for a while, and your write() return value may be less than you hoped for.
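Checking the return value usually ends up as a loop that keeps calling write until everything has been accepted. A minimal sketch in Python, whose os.write is a thin wrapper over write(2); it assumes a blocking descriptor, and write_all is a name chosen for the example.

```python
import os


def write_all(fd, data):
    """Call write() until every byte is accepted; return the number of calls made."""
    view = memoryview(data)
    calls = 0
    while len(view) > 0:
        written = os.write(fd, view)   # may accept fewer bytes than offered
        view = view[written:]          # retry with whatever is left
        calls += 1
    return calls
```

With a non-blocking descriptor the same loop would additionally have to handle EAGAIN/EWOULDBLOCK (BlockingIOError in Python) by waiting for POLLOUT before retrying.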

Increasing the number of file descriptors in Linux

I have a long running process which monitors the system and prints periodic logs. If I let it run for longer than 10-15 minutes, it exits with a message saying:
Too many open files.
The program is set up using real-time timer_create() and timer_settime(), which raise a SIGUSR1 every 2 seconds. In the handler, there is one fork()-exec() in the child. There is a wait in the parent, followed by mmap() and stream operations on the /proc/acpi/battery/state, /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq and scaling_setspeed files.
I have taken care to close the stream FILE * pointers in the periodic signal handler and at all other places. I have also ensured munmap() of all the mapped files.
How can I get around this?
Should I increase the maximum file descriptors allowed or should I increase the maximum open files shown by ulimit -aS?
Why is this happening if I am closing all the FILE * using fclose()?
Here are the values for my system as of now:
#cat /proc/sys/fs/file-max
#ulimit -aS
open files (-n) 1024
Use lsof or a debugger to find what files your process has open. Increasing the limit will just delay the point at which you run out of descriptors.
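On Linux, the same information lsof shows for one process is available under /proc/<pid>/fd, so a process can also inspect itself. A small Python sketch of the idea (the original program is C, where the equivalent would be reading that directory with readdir and readlink); it assumes /proc is mounted, and open_descriptors is a name made up for the example.

```python
import os


def open_descriptors():
    """Map each open descriptor of this process to what it points at."""
    fd_dir = "/proc/self/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            pass  # e.g. the descriptor listdir() itself used, already closed
    return result
```

Logging len(open_descriptors()) on every timer tick shows whether the count keeps growing, and the link targets (pipes, sockets, file paths) point at which kind of descriptor is being leaked.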
