How to follow the progress of a Linux command?

I am currently working with a large data set where even the file format conversion takes at least an hour per subject, and as a result I am often unsure whether my command is still executing or the program has frozen. I was wondering whether anyone has a tip on how to follow the progress of the commands/scripts I am trying to run in Linux?
Your help will be much appreciated.

In addition to @basile-starynkevitch's answer,
I have a bash script that measures how much of a file has been processed, as a percentage.
It looks in procfs, gets the current position from the fd information (/proc/pid/fdinfo), and expresses it as a percentage of the total file size.
See https://gist.github.com/azat/2830255
curl -s https://gist.github.com/azat/2830255/raw >| progress_fds.sh \
&& chmod +x progress_fds.sh
Usage:
./progress_fds.sh /path/to/file [PID]
It may be useful to someone.
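The idea behind the script can be sketched in a few lines. This is a minimal sketch and not the gist's actual code: the helper names percent and fd_progress are made up for illustration, and it assumes Linux procfs plus GNU stat and readlink.

```shell
# Integer percentage of POS relative to SIZE (hypothetical helper).
percent() {
    if [ "$2" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( $1 * 100 / $2 ))
}

# fd_progress FILE PID: print how far process PID has read into FILE
# (hypothetical helper, Linux only).
fd_progress() {
    file=$(readlink -f "$1") || return 1
    size=$(stat -c %s "$file") || return 1
    for fd in /proc/"$2"/fd/*; do
        # Each entry is a symlink to the file the descriptor refers to.
        if [ "$(readlink "$fd")" = "$file" ]; then
            # fdinfo contains a line of the form "pos:<TAB><offset>".
            pos=$(awk '/^pos:/ {print $2}' "/proc/$2/fdinfo/${fd##*/}")
            echo "$(percent "$pos" "$size")%"
            return 0
        fi
    done
    echo "file is not open in process $2" >&2
    return 1
}
```

Calling fd_progress /path/to/file PID once per minute (for example from watch) shows whether the position is still advancing.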

If the long-lasting command produces some output in a file foo.out, you could do watch ls -l foo.out or tail -f foo.out
You could also list /proc/$(pidof prog)/fd to find out the opened files of some prog

You can follow the syscalls of a program by using strace, which will enable you to follow the open calls.

You can use verbose output, but it will slow things down even more.

I guess there can't be a general answer to that; it depends on the type of program (and that doesn't even have anything to do with Linux, see the "halting problem").
If you happen to use a pipe during the conversion, I find the pv(1) tool pretty helpful. Even if pv can't know the total size of the data, it helps to see whether there is actual progress and how good the data rate is. It isn't part of most standard installations, though, and probably has to be installed explicitly.
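As a rough illustration of the pv approach, the sketch below feeds a file into a command through pv when pv is installed and falls back to plain redirection otherwise. The function name with_progress is invented for this example.

```shell
# with_progress FILE CMD...: feed FILE to CMD on stdin, through pv if available.
with_progress() {
    input=$1
    shift
    if command -v pv >/dev/null 2>&1; then
        # Given a file name, pv knows the total size and reports progress on stderr.
        pv "$input" | "$@"
    else
        # Fallback: same data flow, no progress display.
        "$@" < "$input"
    fi
}
```

For example, with_progress bigfile gzip -c > bigfile.gz would compress a (hypothetical) bigfile while showing how much of it has been read so far.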

Related

Why does my crontab not work?

I am planning to run some bash scripts every minute, and I wrote:
* * * * * bash ~/Dropbox/temp_scripts/run_all_scripts
in crontab.
It was supposed to run every minute, but it did not work. Does anyone have any idea why this happens?
Transferring a comment into an answer.
Add I/O redirection to the command line in the crontab entry:
>/tmp/run_all_scripts.out 2>/tmp/run_all_scripts.err
Review the contents of the files after a minute or two has passed. Consider recording the environment to see if that's part of the problem. And consider using bash -x instead of just bash.
If you still don't get anything (the files in /tmp are not created), then you've got issues with cron; the daemon isn't running, or your user does not have permission to use it (but crontab isn't telling you that), or you've not submitted your crontab to the program (what does crontab -l say?), or … whatever is really wrong.
Note, too, that the output from cron jobs is normally (well, at least sometimes — on Mac OS X for a system I currently use, and Solaris for another that I've used previously) emailed to the person whose job it is. You should review the email on the system.
Thank you! I have already fixed it! The reason it did not work is that I used "ls -a *.sh" in the script, and cron did not find any *.sh files in the folder it was executing in. After changing it to "ls -a $HOME/Dropbox/temp_scripts/*.sh", everything works! This debugging technique is quite helpful!
It is, in many ways, the most basic of debugging techniques — make sure you see what is actually happening. If you're not sure why a shell script isn't working, make sure you can see that it is executing and what it is producing in the way of output, and (very often) make sure you can see what it is executing with bash -x or equivalent. (AFAIK, all shells support -x to trace the execution.)
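The redirection advice from the answer can be wrapped in a small helper. This is only a sketch: run_logged and the LOG_DIR variable are names invented here, not part of cron or of the original answer.

```shell
# run_logged NAME CMD...: run CMD, capturing stdout, stderr and the environment
# in $LOG_DIR/NAME.out, NAME.err and NAME.env (LOG_DIR defaults to /tmp).
run_logged() {
    name=$1
    shift
    dir=${LOG_DIR:-/tmp}
    # cron usually starts jobs with a much smaller environment than a login
    # shell, so recording it helps explain "works in my terminal" failures.
    env > "$dir/$name.env"
    "$@" > "$dir/$name.out" 2> "$dir/$name.err"
}

# Equivalent one-off crontab entry, using the redirections from the answer:
# * * * * * bash -x ~/Dropbox/temp_scripts/run_all_scripts >/tmp/run_all_scripts.out 2>/tmp/run_all_scripts.err
```

With bash -x, the trace of every executed command ends up in the .err file, which is where a failing relative path such as the one in this thread would show up.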

View source for standard Linux commands e.g. cat, ls, cd

I would like to view the source code for a Linux command to see what is actually going on inside each command. When I attempt to open the commands in /bin in a text/hex editor, I get a bunch of garbage. What is the proper way to view the source on these commands?
Thanks in advance,
Geoff
EDIT:
I should have been more specific. Basically I have a command set that was written by someone who I can no longer reach. I would like to see what his command was actually doing, but without a way to 'disassemble' the command, I am dead in the water. I was hoping for a way to do this within the OS.
Many of the core Linux commands are part of the GNU core utils. The source can be found online here
The files you are opening are binary executables, which contain the machine code that the kernel loads for the CPU to run. These files are produced by a compiler that takes the source code you and I understand and turns it, through a number of stages, into this CPU-friendly format.
You can find out the system calls that are being made using strace
strace your_command
Most likely you can download the source code with your distribution's package manager. For example, on Debian and related distros (Ubuntu included), first find which package the command belongs to:
$ dpkg -S /bin/cat
coreutils: /bin/cat
The output tells you that /bin/cat is in the coreutils package. Now you can download the source code:
apt-get source coreutils
This question is related to reverse engineering.
Some keywords are static analysis and dynamic analysis.
Use gdb to check whether the binary has a symbol table. (If the binary was compiled with debugging flags, mapping it back to the source is much easier and you may be able to skip the steps below.)
Observe the program's behavior with strace/ltrace.
Write pseudocode using objdump, IDA Pro, or another disassembler.
Run it under gdb for dynamic analysis and correct the pseudocode.
A normal binary can be turned back into something close to source code if you want to and have the time. An obfuscated program, by contrast, is not easy to reverse, but that mostly appears in CTF competitions. (Special techniques such as strip/objcopy/packers, etc.)
You can see assembly code of /bin/cat with:
objdump -d /bin/cat
Then analyze it and see what commands can be launched.
Another approach is strings /bin/cat; it is useful for forming an initial idea before reversing the binary.
You can get the source code of every linux command online anyway :D

How can I use the intel pin tool to count the instruction executed on linux?

Hi everyone, I am new here as well as to Linux.
I want to use the Intel Pin tool to count the instructions executed in a quicksort program (it's just homework), but when I did what the readme document told me, like
cd source/tools/SimpleExamples
make obj-ia32/opcodemix.so
the system told me
make: *** No rule to make target `obi-ia32/opcodemix.so'. Stop.
and I also tried obj-intel64; nothing changed.
Can anybody tell me what is going on here? I am really confused by this Pin stuff.
cd pintool/source/tools/ManualExamples
Then type the command
make inscount0.test
This command compiles the tool and shows you the output file. Then use the following command in the same directory:
../../../pin -t obj-ia32/inscount0.so -- /bin/ls
This runs /bin/ls under the inscount0.so tool. After that, see the output with the following command:
cat inscount.out
I can't tell exactly what your question is. Format your commands with the code and separate them line by line, so I can know what you executed.
Anyway, if I'm right, you should just type:
make
(without targets) in source/tools/ManualExamples, and it should build them all.

Limit output of all Linux commands

I'm looking for a way to limit the amount of output produced by all command line programs in Linux, and preferably tell me when it is limited.
I'm working over a server which has a lag on the display. Occasionally I will accidentally run a command which outputs a large amount of text to the terminal, such as cat on a large file or ls on a directory with many files. I then have to wait a while for all the output to be printed to the terminal.
So is there a way to automatically pipe all output into a command like head or wc to prevent too much output having to be printed to terminal?
I don't know about the general case, but for each well-known command (cat, ls, find?)
you could do the following:
hardlink a copy to the existing utility
write a tiny bash function that calls the utility and pipes to head (or wc, or whatever)
alias the name of the utility to call your function.
So along these lines (utterly untested):
$ ln "$(which cat)" ~/bin/old_cat
function trunc_cat () {
    old_cat "$@" | head -n 100
}
alias cat=trunc_cat
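Making aliases of all your commands would be a good start. Something like
alias lm="ls -al | more"
cam() { cat "$@" | more; }
(cam is a function rather than an alias, because an alias cannot pass its arguments into the middle of a pipeline.)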
Perhaps using screen could help?
This makes me think of bash-completion.
Bash lets you define a handler that runs when a command is not found (the command_not_found_handle function),
so what about writing your own handler and clearing $PATH, in order to execute every command with redirection to a filtering pipe?
I did not try it myself.
Assuming you're working over a network connection, like ssh, into a remote server then try piping the output of the command to less. That way you can manage and navigate the output from the program on the server better. Use 'j' and 'k' to move up and down per line and 'ctrl-u' and 'ctrl-d' to move 1/2 a page up and down. When you do this only the relevant text (i.e. what fits on the screen) will be transmitted over the network.
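For the "tell me when it is limited" part of the question, a per-command wrapper is one option. The sketch below is an illustration only; the name limit is invented here, and it has to be typed explicitly in front of a command rather than applying to everything automatically.

```shell
# limit N CMD...: run CMD, print at most N lines of its output, and add a
# notice if anything was cut off.
limit() {
    max=$1
    shift
    "$@" 2>&1 | awk -v max="$max" '
        NR <= max { print }
        END {
            if (NR > max)
                printf "[output truncated: %d of %d lines shown]\n", max, NR
        }'
}
```

For example, limit 100 cat big.log prints the first 100 lines and then a one-line notice. The command still runs to completion, but only the first N lines travel to the terminal.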

How does the 'ls' command work in Linux/Unix?

I would like to know exactly how the "ls" command works in Linux and Unix.
As far as I know, ls forks & execs to the Linux/Unix shell and then gets the output (of the current file tree, e.g. /home/ankit/). I need a more detailed explanation, as I am not sure about what happens after calling fork.
Could anyone please explain the functionality of the 'ls' command in detail?
ls doesn't fork. The shell forks and execs in order to run any command that isn't built in, and one of the commands it can run is ls.
ls uses opendir() and readdir() to step through all the files in the directory. If it needs more information about one of them it calls stat().
To add to the answer, The C Programming Language book (K&R C) gives a small example of how to go about implementing ls. It explains the data structures and functions used very well.
To understand what ls does, you could take a gander at the OpenSolaris source: https://hg.java.net/hg/solaris~on-src/file/tip/usr/src/cmd/ls/ls.c.
If that's overwhelming, on Solaris you can start by using truss to look at the system calls that ls makes to understand what it does. Using truss, try:
truss -afl -o ls.out /bin/ls
then look at the output in ls.out
I believe that strace is the equivalent of truss on Linux.
If you really want to understand the detailed innards of ls, look at the source code. You can follow tpgould's link to the Solaris source, or it's easy to find the source online from any Linux or BSD distribution.
I'll particularly recommend the 4.4BSD source.
As I recall, ls starts by parsing its many options, then starts with the files or directories listed on the command line (default is "."). Subdirectories are handled by recursion into the directory list routine. There's no fork() or exec() that I recall.
This is an old thread, but I am still commenting because I believe the answer which was upvoted and accepted is partially incorrect. @Mark says that ls is built into the shell, so the shell doesn't fork and exec. When I studied the tldp document on bash (I have attached the link),
"ls" is not listed as a built-in command.
http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_01_03.html
Bash built-in commands:
alias, bind, builtin, command, declare, echo, enable, help, let, local, logout, printf, read, shopt, type, typeset, ulimit and unalias.
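You can also ask the shell directly whether a name is a builtin or a separate program. The sketch below uses the POSIX command -v lookup; the helper name is_external is invented for this example, and it assumes no alias shadows the command (true in scripts, not always in interactive shells).

```shell
# is_external NAME: succeed if NAME resolves to an executable file on PATH,
# i.e. something the shell must fork and exec to run.
is_external() {
    case "$(command -v "$1")" in
        /*) return 0 ;;   # absolute path: a separate program
        *)  return 1 ;;   # bare name (builtin, function), alias text, or not found
    esac
}

for cmd in ls cd; do
    if is_external "$cmd"; then
        echo "$cmd: external program, started with fork and exec"
    else
        echo "$cmd: not an external program (for cd, a shell builtin)"
    fi
done
```

On a typical Linux system this reports ls as an external program and cd as a builtin, which matches the point that the shell, not ls, does the forking.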
