RHEL Linux: Discover what processes are using a specific disk - linux

I need to discover what processes are using a specific disk. This is a multipath disk but I cannot find a way of setting up a way to record to a log file what processes are running when a particular disk is being read or written to. I know the major:minor block IDs using lsblk then lsof but these only show current activity and as there currently is none, I cannot find out the process that uses this disk.
Any ideas anyone?

You can use lsof. Lsof revision lists on its standard output file information about files opened by processes. for example this command below will list all files that are opened for writing:
lsof | grep -e "[[:digit:]]\+w > mylogfile.log"
you can redirect the command to log file if you which with the redirect operator >

Related

Linux File descriptors

I have a Java program after 2 weeks of running in average will become stuck and produce the following error:
Caused by: java.net.SocketException: Too many open files
at sun.nio.ch.Net.socket0(Native Method)
at sun.nio.ch.Net.socket(Net.java:415)
at sun.nio.ch.Net.socket(Net.java:408)
at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:105)
That hints to me that many sockets are opened but never closed.
Before diving into programmatic instrumentation i started to inspect what information i could draw from linux itself. I am using Redhat.
And then, a few questions came up as follows:
Why the following commands do not give the same output?
See
[ec2-user#ip-172-22-28-102 ~]$ sudo ls /proc/32085/fd | wc -l
592
[ec2-user#ip-172-22-28-102 ~]$ sudo lsof -a -p 32085 | wc -l
655
Is there a way to know from the proc stat info which thread created which file descriptor?
It seems like there is not because if i do the following, i am getting the same information:
[ec2-user#ip-172-22-28-102 ~]$ sudo ls /proc/32085/task/22386/fd | wc -l
592
[ec2-user#ip-172-22-28-102 ~]$ sudo ls /proc/32085/fd | wc -l
592
Same if i go to the thread directly from under /proc/ .
Thx
Is there a way to know from the proc stat info which thread created which file descriptor?
I am pretty sure the answer here is "no". File descriptors are opened by processes, not threads (and will be visible to all threads spawned by the same process).
Why the following commands do not give the same output?
First, the -a argument to lsof appears to be a no-op in this case. Specfically, the man says that it "causes list selection options to be ANDed, as described above". So you are really just running:
sudo lsof -p 32085
And that will print things other than open file descriptors (such as memory-mapped files, current working directory, etc), while /proc/<PID>/fd contains only open file descriptors. So you're getting different results because you're asking for different information.
The only reason you can receive that message is that you have opened files and you didn't close them after use. You have a file descriptor leak in your java application. Java programmers normally don't check memory as the garbage collector copes with unreferenced objects. If you save file descriptors without closing in some data structure or you don't close the files after using, you can reach the maximum limit allowed to a process (this is controlled per process and can be changed by the ulimit shell command)
But if your problem is a file descriptor leak, pushing up the ulimit will only delay the problem some time. File descriptors must be closed, or you'll run into trouble.
I've just ran across this difference today, the explanation is that lsof takes into account more types of files, like memory-mapped objects, run-time libraries etc

How to close open (deleted) file descriptor on linux shell

If i use
lsof -n | grep deleted
I have along list of php5-fpm list values.
two sample output of a list value:
(deleted)/dev/zero (stat: No such file or directory)
(deleted)/tmp/.ZendSem.JQTejx
1) How can i close them within an openVZ container?
2) Is this a result of forgetting to close a mysql handle within a php script?
df -h
shows 41% /var/lib/vz/root/102/var/www/clients/client1/web1/log
and within the log directory there are only a few MB
so how to restore the lost webspace??

How to detect file descriptor leaks in Node

I suspect that I have a file descriptor leak in my Node application, but I'm not sure how to confirm this. Is there a simple way to detect file descriptor leaks in Node?
Track open files
On linux you can use the lsof command to list the open files [for a process].
Get the PIDs of the thing you want to track:
ps aux | grep node
Let's say its PID 1111 and 1234, list the open files:
lsof -p 1111,1234
You can save that list and compare when you expect them to be released by your app.
Make it easier to reproduce
If it's taking a while to confirm this (because it takes a while to run out of descriptors) you can try to lower the limit for file descriptors available using ulimit
ulimit -n 500 #or whatever number makes sense for you
#now start your node app in this terminal

Linux: Which process is causing "device busy" when doing umount? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 4 years ago.
Improve this question
Linux: Which process is causing "device busy" when doing umount?
Look at the lsof command (list open files) -- it can tell you which processes are holding what open. Sometimes it's tricky but often something as simple as sudo lsof | grep (your device name here) could do it for you.
Just in case... sometimes happens that you are calling umount from the terminal, and your current directory belongs to the mounted filesystem.
You should use the fuser command.
Eg. fuser /dev/cdrom will return the pid(s) of the process using /dev/cdrom.
If you are trying to unmount, you can kill theses process using the -k switch (see man fuser).
Check for open loop devices mapped to a file on the filesystem with "losetup -a". They wont show up with either lsof or fuser.
Also check /etc/exports. If you are exporting paths within the mountpoint via NFS, it will give this error when trying to unmount and nothing will show up in fuser or lsof.
lsof +f -- /mountpoint
(as lists the processes using files on the mount mounted at /mountpoint. Particularly useful for finding which process(es) are using a mounted USB stick or CD/DVD.
lsof and fuser are indeed two ways to find the process that keeps a certain file open.
If you just want umount to succeed, you should investigate its -f and -l options.
That's exactly why the "fuser -m /mount/point" exists.
BTW, I don't think "fuser" or "lsof" will indicate when a resource is held by kernel module, although I don't usually have that issue..
Open files
Processes with open files are the usual culprits. Display them:
lsof +f -- <mountpoint or device>
There is an advantage to using /dev/<device> rather than /mountpoint: a mountpoint will disappear after an umount -l, or it may be hidden by an overlaid mount.
fuser can also be used, but to my mind lsof has a more useful output. However fuser is useful when it comes to killing the processes causing your dramas so you can get on with your life.
List files on <mountpoint> (see caveat above):
fuser -vmM <mountpoint>
Interactively kill only processes with files open for writing:
fuser -vmMkiw <mountpoint>
After remounting read-only (mount -o remount,ro <mountpoint>), it is safe(r) to kill all remaining processes:
fuser -vmMk <mountpoint>
Mountpoints
The culprit can be the kernel itself. Another filesystem mounted on the filesystem you are trying to umount will cause grief. Check with:
mount | grep <mountpoint>/
For loopback mounts, also check the output of:
losetup -la
Anonymous inodes (Linux)
Anonymous inodes can be created by:
Temporary files (open with O_TMPFILE)
inotify watches
[eventfd]
[eventpoll]
[timerfd]
These are the most elusive type of pokemon, and appear in lsof's TYPE column as a_inode (which is undocumented in the lsof man page).
They won't appear in lsof +f -- /dev/<device>, so you'll need to:
lsof | grep a_inode
For killing processes holding anonymous inodes, see: List current inotify watches (pathname, PID).
lsof and fuser didn't give me anything either.
After a process of renaming all possible directories to .old and rebooting the system every time after I made changes I found one particular directory (relating to postfix) that was responsible.
It turned out that I had once made a symlink from /var/spool/postfix to /disk2/pers/mail/postfix/varspool in order to minimise disk writes on an SDCARD-based root filesystem (Sheeva Plug).
With this symlink, even after stopping the postfix and dovecot services (both ps aux as well as netstat -tuanp didn't show anything related) I was not able to unmount /disk2/pers.
When I removed the symlink and updated the postfix and dovecot config files to point directly to the new dirs on /disk2/pers/ I was able to successfully stop the services and unmount the directory.
Next time I will look more closely at the output of:
ls -lR /var | grep ^l | grep disk2
The above command will recursively list all symbolic links in a directory tree (here starting at /var) and filter out those names that point to a specific target mount point (here disk2).
If you still can not unmount or remount your device after stopping all services and processes with open files, then there may be a swap file or swap partition keeping your device busy. This will not show up with fuser or lsof. Turn off swapping with:
sudo swapoff -a
You could check beforehand and show a summary of any swap partitions or swap files with:
swapon -s
or:
cat /proc/swaps
As an alternative to using the command sudo swapoff -a, you might also be able to disable the swap by stopping a service or systemd unit. For example:
sudo systemctl stop dphys-swapfile
or:
sudo systemctl stop var-swap.swap
In my case, turning off swap was necessary, in addition to stopping any services and processes with files open for writing, so that I could remount my root partition as read only in order to run fsck on my root partition without rebooting. This was necessary on a Raspberry Pi running Raspbian Jessie.
Filesystems mounted on the filesystem you're trying to unmount can cause the target is busy error in addition to any files that are in use. (For example when you mount -o bind /dev /mnt/yourmount/dev in order to use chroot there.)
To find which file systems are mounted on the filesystem run the following:
mount | grep '/mnt/yourmount'
To find which files are in use the advice already suggested by others here:
lsof | grep '/mnt/yourmount'

How do I find out what process has a lock on a file in Linux?

Today I had the problem that I couldn't delete a folder because "it was busy".
How can I find out which application to blame for that or can I just delete it with brute force?
Use lsof to find out what has what files are open.
man lsof or have a look here
The fuser Unix command will give you the PIDs of the processes accessing a file.
lslocks lists information about all the currently held file locks in a Linux system. (part of util-linux) this utility has support for json output, which is nice for scripts.
~$ sudo lslocks
COMMAND PID TYPE SIZE MODE M START END PATH
cron 873 FLOCK 4B WRITE 0 0 0 /run/crond.pid
..
..
fuser will show you which processes are accessing a file or directory.

Resources