Too many open files - KairosDB

Too many open files - KairosDB - linux

on running this query:
{ "start_absolute":1359695700000, "end_absolute":1422853200000,
"metrics":[{"tags":{"Building_id":["100"]},"name":"meterreadings","group_by":[{"name":"time","group_count":"12","range_size":{"value":"1","unit":"MONTHS"}}],"aggregators":[{"name":"sum","align_sampling":true,"sampling":{"value":"1","unit":"Months"}}]}]}
I am getting the following response:
500 {"errors":["Too many open files"]}
Here this link it is written that increase the size of file-max.
My file-max output is:
cat /proc/sys/fs/file-max
382994
it is already very large, do I need to increase its limit

What version are you using? Are you using a lot of grou-by in your queries?
You may need to restart kairosDB as a workaround.
Can you check if you have deleted (ghost) files handles (replace by kairosDB process ID in the command line below)?
ls -l /proc/<PID>/fd | grep kairos_cache | grep -v '(delete)' | wc -l
THere was a fix in 0.9.5 for unclosed file handles.
There's a fix pending for next release (1.0.1).
cf. https://github.com/kairosdb/kairosdb/pull/180, https://github.com/kairosdb/kairosdb/issues/132, and https://github.com/kairosdb/kairosdb/issues/175.

Related

Error: ENOSPC: no space left on device, write pm2 server

I am not able to identify what is causing my ec2 disk space to reach 100% of capacity.
I have a script which deletes files in tmp folder.But still randomly sometimes my disk capacity reaches 100%.
I have attached the logs of df -i to show disk utilization.
Error
PM2 | Error: ENOSPC: no space left on device, write
PM2 | at Object.writeSync (fs.js:679:3)
PM2 | at Object.writeFileSync (fs.js:1393:26)
PM2 | at ProcessContainer (/usr/lib/node_modules/pm2/lib/ProcessContainer.js:70:10)
PM2 | at Object.<anonymous> (/usr/lib/node_modules/pm2/lib/ProcessContainer.js:103:3)
PM2 | at Module._compile (internal/modules/cjs/loader.js:999:30)
I am using command df -i to find
[![enter image description here][1]][1]
[![enter image description here][2]][2]
du -h -d 1

Check the user .pm2/logs directory, if your node app as errors or many regular logs this can increase disk space used.
I think that 8 Go is too small. I think you should upgrade your server to allocate more space. This will solved your problem.
If you can't or if you don't want to add disk space, you can take a look at the /var/log directory to delete some extra log. On long term, you can use logrotate to compress log files and upload compressed one to another place in order to keep /var/log as small a possible.
UPDATE
Also, i am not a specialist of ubuntu and snap, but your /snap directory is 2,1Go in size. You can check this to see if snap retain old version of snap package or if there is some cache that can be cleared.
Here is a bash script to remove old snaps version that i find here : https://www.debugpoint.com/clean-up-snap/
#!/bin/bash
#Removes old revisions of snaps
#CLOSE ALL SNAPS BEFORE RUNNING THIS
set -eu
LANG=en_US.UTF-8 snap list --all | awk '/disabled/{print $1, $3}' |
while read snapname revision; do
snap remove "$snapname" --revision="$revision"
done
You can also delete files in /var/lib/snapd/cache it's a snap cache that can be cleared.
But as i say, not a specialist of Ubuntu, so not tested.

You can use the dh utility
cd /
du -h -d 1
it will show the disk usage for every folder in /, then you can cd in the biggest ones and repeat the same.
You can also run
du | sort -n
and you'll get (after a while) all the folders size in the filesystem (ordered by ascending size). By my experience I'd take a first look at /home, /tmp and /var.

How to calculate PostgreSQL memory usage on Linux?

I searched a lot about how to calculate memory usage of PostgreSQL process on Linux. I read article about how to calculate Memory usage for a generic process but I think that PostgreSQL has some peculiarity. For example, it has some basic processes:logger, checkpointer, bg writer, etc. But Linux creates also a process for each client connection on the master node.
The easy way to calculate the memory usage is with ps command listing the RSS of each process:
ps -aux | grep -v grep | grep postgres | awk '{ print $6 }'
and then sum it. But this doesn't work since the result is larger than the total memory.
Some articles suggest the use of:
/proc/PID/smaps
but as said above I have more PID and I am unable to find a website that shows a script that let me easily calculate this information.
I found this interesting article, but it's not clear to me how to convert it into a working script.
https://www.depesz.com/2012/06/09/how-much-ram-is-postgresql-using/
Does anyone know which is the best approach to solve this issue?

To apply the information from the blog you quote:
ps -u postgres o pid= | \
sed 's# *\(.*\)#/proc/\1/smaps#' | \
xargs sudo grep ^Pss: | \
awk '{A+=$2} END{print A}'
first, get the process numbers of all processes running under user postgres
then get the name of the corresponding smaps file (/proc/PID/smaps)
then get the Pss line from that file, which contains the “proportional stack size” which divides shared memory by the number of processes attached to it
finally, add up the numbers

Too many open files in system in kubernetes cluster [duplicate]

I am trying create a bunch of pods, services and deployment using Kubernetes, but keep hitting the following errors when I run the kubectl describe command.
for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container bbdb58770a848733bf7130b1b230d809fcec3062b2b16748c5e4a8b12cc0533a: [8] System error: too many open files in system\n"
I have already terminated all pods and try restarting the machine, but it doesn't solve the issue. I am not an Linux expert, so I am just wondering how shall find all the open files and close them?

You can confirm which process is hogging file descriptors by running:
lsof | awk '{print $2}' | sort | uniq -c | sort -n
That will give you a sorted list of open FD counts with the pid of the process. Then you can look up each process w/
ps -p <pid>
If the main hogs are docker/kubernetes, then I would recommend following along on the issue that caesarxuchao referenced.

Kubernetes can't start due to too many open files in system

I am trying create a bunch of pods, services and deployment using Kubernetes, but keep hitting the following errors when I run the kubectl describe command.
for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container bbdb58770a848733bf7130b1b230d809fcec3062b2b16748c5e4a8b12cc0533a: [8] System error: too many open files in system\n"
I have already terminated all pods and try restarting the machine, but it doesn't solve the issue. I am not an Linux expert, so I am just wondering how shall find all the open files and close them?

You can confirm which process is hogging file descriptors by running:
lsof | awk '{print $2}' | sort | uniq -c | sort -n
That will give you a sorted list of open FD counts with the pid of the process. Then you can look up each process w/
ps -p <pid>
If the main hogs are docker/kubernetes, then I would recommend following along on the issue that caesarxuchao referenced.

How do I find out what inotify watches have been registered?

I have my inotify watch limit set to 1024 (I think the default is 128?). Despite that, yeoman, Guard and Dropbox constantly fail, and tell me to up my inotify limit. Before doing so, I'd like to know what's consuming all my watches (I have very few files in my Dropbox).
Is there some area of /proc or /sys, or some tool I can run, to find out what watches are currently registered?

Oct 31 2022 update
While my script below works fine as it is, Michael Sartain implemented a native executable that is much faster, along with additional functionality not present in my script (below). Worth checking out if you can spend a few seconds compiling it! I have also added contributed some PRs to align the functionality, so it should be pretty 1:1, just faster.
Upvote his answer on the Unix Stackexchange.
Original answer with script
I already answered this in the same thread on Unix Stackexchange as was mentioned by #cincodenada, but thought I could repost my ready-made answer here, seeing that no one really has something that works:
I have a premade script, inotify-consumers, that lists the top offenders for you:
INOTIFY INSTANCES
WATCHES PER
COUNT PROCESS PID USER COMMAND
------------------------------------------------------------
21270 1 11076 my-user /snap/intellij-idea-ultimate/357/bin/fsnotifier
201 6 1 root /sbin/init splash
115 5 1510 my-user /lib/systemd/systemd --user
85 1 3600 my-user /usr/libexec/xdg-desktop-portal-gtk
77 1 2580 my-user /usr/libexec/gsd-xsettings
35 1 2475 my-user /usr/libexec/gvfsd-trash --spawner :1.5 /org/gtk/gvfs/exec_spaw/0
32 1 570 root /lib/systemd/systemd-udevd
26 1 2665 my-user /snap/snap-store/558/usr/bin/snap-store --gapplication-service
18 2 1176 root /usr/libexec/polkitd --no-debug
14 1 1858 my-user /usr/bin/gnome-shell
13 1 3641 root /usr/libexec/fwupd/fwupd
...
21983 WATCHES TOTAL COUNT
INotify instances per user (e.g. limits specified by fs.inotify.max_user_instances):
INSTANCES USER
----------- ------------------
41 my-user
23 root
1 whoopsie
1 systemd-ti+
...
Here you quickly see why the default limit of 8K watchers is too little on a development machine, as just WebStorm instance quickly maxes this when encountering a node_modules folder with thousands of folders. Add a webpack watcher to guarantee problems ...
Even though it was much faster than the other alternatives when I made it initially, Simon Matter added some speed enhancements for heavily loaded Big Iron Linux (hundreds of cores) that sped it up immensely, taking it down from ten minutes (!) to 15 seconds on his monster rig.
Later on, Brian Dowling contributed instance count per process, at the expense of relatively higher runtime. This is insignificant on normal machines with a runtime of about one second, but if you have Big Iron, you might want the earlier version with about 1/10 the amount of system time :)
How to use
inotify-consumers --help 😊 To get it on your machine, just copy the contents of the script and put it somewhere in your $PATH, like /usr/local/bin. Alternatively, if you trust this stranger on the net, you can avoid copying it and pipe it into bash over http:
$ curl -s https://raw.githubusercontent.com/fatso83/dotfiles/master/utils/scripts/inotify-consumers | bash
INOTIFY
WATCHER
COUNT PID USER COMMAND
--------------------------------------
3044 3933 myuser node /usr/local/bin/tsserver
2965 3941 myuser /usr/local/bin/node /home/myuser/.config/coc/extensions/node_modules/coc-tsserver/bin/tsserverForkStart /hom...
6990 WATCHES TOTAL COUNT
How does it work?
For reference, the main content of the script is simply this (inspired by this answer)
find /proc/*/fd \
-lname anon_inode:inotify \
-printf '%hinfo/%f\n' 2>/dev/null \
\
| xargs grep -c '^inotify' \
| sort -n -t: -k2 -r
Changing the limits
In case you are wondering how to increase the limits
$ inotify-consumers --limits
Current limits
-------------
fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 524288
Changing settings permanently
-----------------------------
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
sudo sysctl -p # re-read config

inotify filesystem options
sysctl fs.inotify
opened files
lsof | grep inotify | wc -l
Increase the values like this
sysctl -n -w fs.inotify.max_user_watches=16384
sysctl -n -w fs.inotify.max_user_instances=512

The default maximum number of inotify watches is 8192; it can be increased by writing to /proc/sys/fs/inotify/max_user_watches.
You can use sysctl fs.inotify.max_user_watches to check current value.
Use tail -f to verify if your OS does exceed the inotify maximum watch limit.
The internal implementation of tail -f command uses the inotify mechanism to monitor file changes.
If you've run out of your inotify watches, you'll most likely to get this error:
tail: inotify cannot be used, reverting to polling: Too many open files
To find out what inotify watches have been registered, you may refer to this, and this. I tried, but didn't get the ideal result. :-(
Reference:
https://askubuntu.com/questions/154255/how-can-i-tell-if-i-am-out-of-inotify-watches
https://unix.stackexchange.com/questions/15509/whos-consuming-my-inotify-resources
https://bbs.archlinux.org/viewtopic.php?pid=1340049

I think
sudo ls -l /proc/*/fd/* | grep notify
might be of use. You'll get a list of the pids that have a inotify fd registered.
I don't know how to get more info than this! HTH

Since this is high in Google results, I'm copy-pasting part of my answer from a similar question over on the Unix/Linux StackExchange:
I ran into this problem, and none of these answers give you the answer of "how many watches is each process currently using?" The one-liners all give you how many instances are open, which is only part of the story, and the trace stuff is only useful to see new watches being opened.
This will get you a file with a list of open inotify instances and the number of watches they have, along with the pids and binaries that spawned them, sorted in descending order by watch count:
sudo lsof | awk '/anon_inode/ { gsub(/[urw]$/,"",$4); print "/proc/"$2"/fdinfo/"$4; }' | while read fdi; do count=$(sudo grep -c inotify $fdi); exe=$(sudo readlink $(dirname $(dirname $fdi))/exe); echo -e $count"\t"$fdi"\t"$exe; done | sort -nr > watches
If you're interested in what that big ball of mess does and why, I explained in depth over on the original answer.

The following terminal command worked perfectly for me on my Ubuntu 16.04 Machine:
for foo in /proc/\*/fd/*; do readlink -f $foo; done | grep '^/proc/.*inotify' |cut -d/ -f3 |xargs -I '{}' -- ps --no-headers -o '%p %U %a' -p '{}' |uniq -c |sort -n
My problem was that I had a good majority of my HDD loaded as a folder in Sublime Text. Between /opt/sublime_text/plugin_host 8992 and /opt/sublime_text/sublime_text, Sublime had 18 instances of inotify while the rest of my programs were all between 1-3.
Since I was doing Ionic Mobile App development I reduced the number of instances by 5 by adding the large Node.js folder "node_modules" to the ignore list in the Sublime settings.
"folder_exclude_patterns": [".svn", ".git", ".hg", "CVS", "node_modules"]
Source: https://github.com/SublimeTextIssues/Core/issues/1195

Based on the excellent analysis of cincodenada, I made my own one-liner, which works better for me:
find /proc/*/fd/ -type l -lname "anon_inode:inotify" -printf "%hinfo/%f\n" | xargs grep -cE "^inotify" | column -t -s:
It helps to find all inotify watchers and their watching count. It does not translate process ids to their process names or sort them in any way but that was not the point for me. I simply wanted to find out which process consumes most of the watches. I then was able to search for that process using its process id.
You can omit the last column command if you don't have it installed. It's only there to make the output look nicer.
Okay, as you can see, there is a similar and less fork hungry approach from #oligofren. Better you use his simple script. It's very nice. I was also able to shrink my one-liner because I was not aware of the -lname parameter of find which comes in very handy here.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Too many open files - KairosDB - linux

Related

Error: ENOSPC: no space left on device, write pm2 server

How to calculate PostgreSQL memory usage on Linux?

Too many open files in system in kubernetes cluster [duplicate]

Kubernetes can't start due to too many open files in system

How do I find out what inotify watches have been registered?

Categories

Resources