I am using ssconvert (Gnumeric) to convert large Excel files into separate CSV files. Most files convert fine, but with some of the larger files that carry additional formatting the process dies abruptly and prints 'killed'.
ssconvert -S '/tmp/inputfile.xlsx' '/tmp/output.csv'
Is there any special handling for larger files that can be used?
If neither the user nor a sysadmin killed the program, the kernel may have. The kernel only kills a process under exceptional circumstances such as extreme resource starvation, for example when the system runs out of memory. Check the kernel log for a report of the kill:
dmesg | grep -E -i -B100 'killed process'
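If the -B100 context is too noisy, a narrower filter can help. The following is a minimal sketch, not a definitive check: the function name oom_kills is made up for illustration, and the patterns are an assumption based on common kernel message wording, which can differ between kernel versions.

```shell
# Filter kernel log text (read from stdin) down to OOM-killer reports.
# ASSUMPTION: the patterns reflect typical kernel wording and may need
# adjusting for your kernel version.
oom_kills() {
    grep -E -i 'out of memory|oom-killer|killed process'
}

# Usage (either source of kernel messages works):
#   dmesg | oom_kills
#   journalctl -k | oom_kills
```

If a line naming ssconvert shows up, the conversion most likely exhausted memory, and the practical options are more RAM or swap, or splitting the workbook before converting.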
I need to discover which processes are using a specific disk. This is a multipath disk, but I cannot find a way to record to a log file which processes are running when that disk is being read from or written to. I know the major:minor block IDs from lsblk and can use lsof, but these only show current activity, and since there currently is none, I cannot find the process that uses this disk.
Any ideas anyone?
You can use lsof, which lists on its standard output information about files opened by processes. For example, the command below lists all files that are open for writing and saves the result to a log file:
lsof | grep -e "[[:digit:]]\+w" > mylogfile.log
The redirect operator > is what sends the output to the log file; it must sit outside the quoted grep pattern.
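Since lsof only reports what is open at the moment it runs, one way to catch intermittent use is to run it repeatedly and keep a timestamped log. This is a rough sketch under that assumption: poll_log is a hypothetical helper name, and the device path in the usage line is a placeholder, not taken from the question. Accesses shorter than the polling interval can still be missed.

```shell
# poll_log LOGFILE INTERVAL COUNT COMMAND [ARGS...]
# Runs COMMAND every INTERVAL seconds, COUNT times, and appends its
# output to LOGFILE under a timestamp header.
poll_log() {
    logfile=$1
    interval=$2
    count=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
            # lsof exits non-zero when it finds nothing; ignore that.
            "$@" 2>&1 || true
        } >> "$logfile"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Usage (hypothetical device path; substitute your multipath device):
#   poll_log /tmp/disk-users.log 5 720 lsof /dev/mapper/mpatha
```

The helper takes the command as arguments so the same loop can wrap lsof on a device node or on a mount point.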
I am using strace to capture calls to open(), close() and read() on Linux. The target process is the jetty web server. As far as I can tell, strace is not logging all calls to open(). Maybe the others too, I have not tried to correlate the file descriptors to open() calls.
For example, starting strace:
strace -f -e trace=open,close,read -o/tmp/strace.out -p62881
I then use wget to fetch 100 static files; all were retrieved successfully. In one run, only 56 open events were logged; on another run of 100 different files, I got 66 open events.
I believe that using "-f" results in strace attaching to all the LWPIDs for the threads ("Process 62881 attached with 25 threads - interrupt to quit"); when I try to explicitly attach to all of them using multiple "-p" options, I get a single "attach" success message but multiple "Operation not permitted" messages, one for each child PID.
I restarted Jetty to clear its cache before my tests.
Kernel version is 2.6.32-504.3.3.el6.x86_64 (Red Hat). Strace package version is strace-4.5.19-1.19.el6.x86_64.
What am I missing?
Thanks
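On some systems you have to trace openat() instead of open(), because the files are opened through the openat system call (newer glibc versions, for example, implement open() that way).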
Try:
strace -f -e trace=openat,close,read -o/tmp/strace.out -p62881
Try -ff (in addition to -f):
-ff: If the -o filename option is in effect, each processes trace is written to filename.pid where pid is
the numeric process id of each process. This is incompatible with -c, since no per-process counts are
kept.
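To check whether the trace now accounts for all 100 fetches, you can count the open-type calls in the output file. A small sketch, assuming the default strace line format; count_opens is a name invented here. With -f, an interrupted call is logged once as "openat(... <unfinished ...>" and again as "<... openat resumed>", and matching on the opening parenthesis counts such a call only once.

```shell
# Count open() and openat() calls recorded in a strace output file.
# ASSUMPTION: default strace formatting, e.g.
#   62881 openat(AT_FDCWD, "/path", O_RDONLY) = 5
count_opens() {
    grep -c -E '(^|[^[:alnum:]_])(open|openat)\(' "$1"
}

# Usage:
#   count_opens /tmp/strace.out
```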
I have a Java program that, after about 2 weeks of running on average, becomes stuck and produces the following error:
Caused by: java.net.SocketException: Too many open files
at sun.nio.ch.Net.socket0(Native Method)
at sun.nio.ch.Net.socket(Net.java:415)
at sun.nio.ch.Net.socket(Net.java:408)
at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:105)
That hints to me that many sockets are opened but never closed.
Before diving into programmatic instrumentation, I started to inspect what information I could draw from Linux itself. I am using Red Hat.
And then, a few questions came up as follows:
Why do the following commands not give the same output?
See
[ec2-user@ip-172-22-28-102 ~]$ sudo ls /proc/32085/fd | wc -l
592
[ec2-user@ip-172-22-28-102 ~]$ sudo lsof -a -p 32085 | wc -l
655
Is there a way to know from the proc stat info which thread created which file descriptor?
It seems there is not, because if I do the following, I get the same information:
[ec2-user@ip-172-22-28-102 ~]$ sudo ls /proc/32085/task/22386/fd | wc -l
592
[ec2-user@ip-172-22-28-102 ~]$ sudo ls /proc/32085/fd | wc -l
592
The same happens if I go to the thread directly under /proc/.
Thx
Is there a way to know from the proc stat info which thread created which file descriptor?
I am pretty sure the answer here is "no". File descriptors are opened by processes, not threads (and will be visible to all threads spawned by the same process).
Why do the following commands not give the same output?
First, the -a argument to lsof appears to be a no-op in this case. Specifically, the man page says that it "causes list selection options to be ANDed, as described above", and here there is only one selection option (-p), so there is nothing to AND. So you are really just running:
sudo lsof -p 32085
And that will print things other than open file descriptors (such as memory-mapped files, current working directory, etc), while /proc/<PID>/fd contains only open file descriptors. So you're getting different results because you're asking for different information.
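To compare like with like, you can count only the lsof rows whose FD column is a real descriptor number. This is a sketch that assumes lsof's default column layout (COMMAND, PID, USER, FD, ...); if your lsof build adds a populated TID column, the field index would need adjusting. The name count_lsof_fds is made up for this example.

```shell
# Count lsof rows (read from stdin) that describe numbered descriptors.
# Rows such as cwd, rtd, txt and mem, plus the header line, are skipped.
# ASSUMPTION: FD is the 4th whitespace-separated column.
count_lsof_fds() {
    awk '$4 ~ /^[0-9]+/ { n++ } END { print n + 0 }'
}

# Usage:
#   sudo lsof -p 32085 | count_lsof_fds
```

The result should be much closer to the /proc/<PID>/fd count than the raw line count is.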
The only reason you can receive that message is that you have opened files (or sockets) and did not close them after use. You have a file descriptor leak in your Java application. Java programmers normally don't worry about memory, because the garbage collector copes with unreferenced objects, but file descriptors need explicit care. If you keep file descriptors in some data structure without closing them, or you don't close files after using them, you can reach the maximum number of descriptors allowed to a process (this limit is per process and can be changed with the ulimit shell command).
But if your problem is a file descriptor leak, raising the ulimit will only delay the problem for some time. File descriptors must be closed, or you'll run into trouble.
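To see what kind of descriptor is leaking, it can help to group the entries of /proc/<PID>/fd by what they point to, since sockets appear there as links of the form socket:[inode]. The following is only a sketch: fd_summary is a hypothetical helper, and it lumps everything that is not a socket, pipe, or anon_inode under "file".

```shell
# Summarise the link targets in an fd directory such as /proc/PID/fd.
# Prints one "kind count" line per kind, sorted by kind.
fd_summary() {
    dir=$1
    for fd in "$dir"/*; do
        readlink "$fd"
    done | awk -F: '
        /^(socket|pipe|anon_inode):/ { n[$1]++; next }
        { n["file"]++ }
        END { for (k in n) print k, n[k] }
    ' | sort
}

# Usage (run as a user allowed to read that directory):
#   fd_summary /proc/32085/fd
```

Running it periodically and watching which count keeps growing points at the resource that is never closed.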
I just ran across this difference today. The explanation is that lsof takes into account more types of files, such as memory-mapped objects, run-time libraries, etc.
Currently, I am doing this the long way, by getting a list of processes with the following command:
sudo ps -eo pid,command | grep -v grep | awk '{print $1}' > pids.txt
I then iterate through the process IDs, running strace on each process in the background and writing one log per process, with the process ID as the log file's extension:
filename="$1"
# Make the helper executable once, before the loop
chmod +x straceProgram.sh
while read -r line
do
    # Trace each listed PID in the background
    ./straceProgram.sh "$line" &
done < "$filename"
straceProgram.sh
pid="$1"
sudo strace -p $pid -o log.$pid
However, the problem with this approach is that any new process that starts afterwards will not be traced, since strace is only attached to the process IDs stored in pids.txt during the first run.
The list in pids.txt can be updated with new process IDs; however, I was curious whether strace could be run at the operating-system level so that it traces all activity being performed.
Could there be a better way to do this?
If your resulting filesystem is going to be a kernel filesystem driver, I would recommend using tracefs to gather the information you require. I would recommend against making this a kernel filesystem unless you have a lot of time and a lot of testing resources. It is not trivial.
If you want an easier, safer alternative, write your filesystem using fuse. The downside is that performance is not quite as good and there are a few places where it cannot be used, but it is often acceptable. Note that there is already an implementation of a logging filesystem under fuse.
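Use the strace -f (follow forks) option, so that children of a traced process are traced as well. I also suggest -s 9999, which raises the maximum printed string length, for more detail.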
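Note that -f only follows children of processes that are already traced. If you stay with the polling approach from the question, the list can be refreshed by diffing two PID snapshots and attaching only to the new entries. A minimal sketch, assuming one PID per line in each file; new_pids and the snapshot file names are invented for illustration. Short-lived processes can still slip through between snapshots, and tracing everything is expensive.

```shell
# Print the PIDs that appear in the NEW snapshot but not in the OLD one.
# Both files are expected to hold one PID per line.
new_pids() {
    awk 'FILENAME == ARGV[1] { seen[$1]; next } !($1 in seen)' "$1" "$2"
}

# Usage (hypothetical file names):
#   ps -eo pid= > pids.new
#   for pid in $(new_pids pids.txt pids.new); do
#       sudo strace -f -p "$pid" -o "log.$pid" &
#   done
#   mv pids.new pids.txt
```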
When doing
tail -f /var/log/apache2/access.log
It shows logs and then
Killed
I have to re-execute tail -f to see new logs.
How do I make tail -f continually display logs without killing itself?
The first thing I'd do is try --follow=name instead of -f. Your problem could be happening because your log file is being rotated out. From the man page:
With --follow (-f), tail defaults to following the file descriptor, which means that even if a tail'ed file is renamed, tail will continue to track its end. This default behavior is not desirable when you really want to track the actual name of the file, not the file descriptor (e.g., log rotation). Use --follow=name in that case. That causes tail to track the named file in a way that accommodates renaming, removal and creation.
tail -f should not get killed.
By the way, tail does not kill itself; it is killed by something else, for example because the system is out of memory or a resource limit is too restrictive.
Please figure out what kills your tail, using for example gdb or strace. Also check your environment, at least ulimit -a and dmesg for any clues.
If your description is correct, and tail actually displays
Killed
then it is probably not happening as a result of log rotation. Log rotation will cause tail to stop displaying new lines, but even if the file is deleted, tail will not be killed.
Rather, some other process on the system, or perhaps the kernel, is sending it a signal 9 (SIGKILL). Possible causes for this include:
A user in another terminal issuing a command such as kill -9 1234 or pkill -9 tail
Some other tool or daemon (although I can't think of any that would do this)
The kernel itself can send SIGKILL to your process. One scenario under which it would do this is if the OOM (Out of memory) killer kicked in. This happens when all RAM and swap in the system is used. The kernel will select a process which is using a lot of memory and kill it. If this was happening it would be visible in syslog, but it is quite unlikely that tail would use that much memory.
The kernel can send you SIGKILL if RLIMIT_CPU (the limit on the amount of CPU time your process may use) is exceeded; specifically, SIGKILL is sent once the hard limit is reached. If you leave tail running for long enough, and you have a ulimit set, then this can happen. To check for this (and other resource limitations) use ulimit -a
In my opinion, either the first or last of these explanations seems most likely.
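One quick way to confirm that a signal ended the process is to look at the exit status the shell reports, since shells encode death by signal N as status 128 + N. A small sketch; signal_from_status is a hypothetical helper name.

```shell
# Print the signal number encoded in a shell exit status, or 0 if the
# status does not indicate death by signal.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0
    fi
}

# Usage:
#   tail -f /var/log/apache2/access.log; signal_from_status $?
#   kill -l 9    # prints the name of signal number 9
```

A status of 137 decodes to signal 9 (SIGKILL), which matches the "Killed" message.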
You need to use tail -F logfile; it keeps following the log by name, so it will not stop when the log file is rotated.