How does the 'ls' command work in Linux/Unix? - linux

I would like to know exactly how the "ls" command works in Linux and Unix.
As far as I know, ls forks and execs in the Linux/Unix shell and then gets the output (of the current file tree, e.g. /home/ankit/). I need a more detailed explanation, as I am not sure what happens after fork is called.
Could anyone please explain the functionality of the 'ls' command in detail?

ls doesn't fork. The shell forks and execs in order to run any command that isn't built in, and one of the commands it can run is ls.
ls uses opendir() and readdir() to step through all the files in the directory. If it needs more information about one of them it calls stat().
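As a rough illustration of the loop described above (enumerate the directory's entries, then inspect each one), here is a minimal sketch in POSIX shell. It is not how ls is implemented: the real program is written in C and calls opendir(), readdir() and stat() directly, whereas here the shell's glob expansion does the directory reading and the -d test stands in for stat(). The name list_dir is made up for this example.

```shell
# list_dir DIR: print each non-hidden entry of DIR, marking directories with "/".
# Hypothetical helper, not taken from any real ls implementation.
list_dir() (
    dir=${1:-.}
    for entry in "$dir"/*; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        name=${entry##*/}
        if [ -d "$entry" ]; then
            printf '%s/\n' "$name"
        else
            printf '%s\n' "$name"
        fi
    done
)

list_dir .
```

Like plain ls, this skips dot files, because the * pattern does not match names that start with a dot.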

To add to the answer: The C Programming Language (K&R) includes a small example of how to go about implementing ls. It explains the data structures and functions involved very well.

To understand what ls does, you could take a gander at the OpenSolaris source: https://hg.java.net/hg/solaris~on-src/file/tip/usr/src/cmd/ls/ls.c.
If that's overwhelming, on Solaris you can start by using truss to look at the system calls that ls makes. Using truss, try:
truss -afl -o ls.out /bin/ls
then look at the output in ls.out.
I believe that strace is the Linux equivalent of truss.

If you really want to understand the detailed innards of ls, look at the source code. You can follow tpgould's link to the Solaris source, or it's easy to find the source online from any Linux or BSD distribution.
I'll particularly recommend the 4.4BSD source.
As I recall, ls starts by parsing its many options, then starts with the files or directories listed on the command line (default is "."). Subdirectories are handled by recursion into the directory list routine. There's no fork() or exec() that I recall.

This is an old thread, but I am still commenting because I believe the answer that was upvoted and accepted is partially incorrect. @Mark says that ls is built into the shell, so the shell doesn't fork and exec. When I studied the TLDP document on Bash (link below),
"ls" is not listed as a built-in command.
http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_01_03.html
Bash built-in commands:
alias, bind, builtin, command, declare, echo, enable, help, let, local, logout, printf, read, shopt, type, typeset, ulimit and unalias.
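One way to check this on your own system, rather than relying on a list, is to ask the shell how it resolves a name. The sketch below uses the POSIX command -v utility: for an external program it prints an absolute path, and for a built-in it prints just the name. The function name kind_of is invented for this example, and the path reported for ls will vary between systems.

```shell
# kind_of NAME: report whether the shell resolves NAME to a file on disk.
# Hypothetical helper built on the standard `command -v`.
kind_of() (
    resolved=$(command -v "$1") || { echo "$1: not found"; return 1; }
    case $resolved in
        /*) echo "$1: external program at $resolved" ;;
        *)  echo "$1: handled by the shell itself (built-in, keyword, function or alias)" ;;
    esac
)

kind_of ls
kind_of cd
```

In an interactive shell where ls is aliased, the first call reports the alias instead of the file, which is why the second branch is worded so broadly.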

Related

Checking whether a program exists

In the middle of my perl script I want to execute a bash command. The script takes a long time, so at the beginning of the script I want to see if the command exists. This answer says to just try and run it and this other answer suggests some bash commands to test if the program exists.
Is the latter option the best solution? Are there any better ways to do this check in perl?
My best guess is that you want to check for the existence of an executable file that you want to run using system or qx//.
But if you want your command line to behave the same way as the shell, then you can probably use File::Which
What if we assume that we don't know the command's location?
This means that syck's answer won't work, and zdim's answer is incomplete.
Try this function in perl:
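sub check_exists_command {
    # `command -v` prints the command's path (or its name, for a built-in)
    # when the shell can find it, and prints nothing otherwise.
    my $check = `sh -c 'command -v $_[0]'`;
    return $check;
}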
# two examples
check_exists_command 'pgrep' or die "$0 requires pgrep";
check_exists_command 'readlink' or die "$0 requires readlink";
I just tested it, because I just wrote it.
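For comparison, the same check can be sketched directly in POSIX shell, which is what the Perl function above delegates to. The function name missing_cmds is made up here, and pgrep and readlink are simply the two commands from the example above. Keep in mind that command -v also succeeds for built-ins and shell functions, so this only tells you that the shell can run something under that name.

```shell
# missing_cmds NAME...: print every NAME that the shell cannot find.
# Hypothetical helper; prints nothing when all commands are available.
missing_cmds() (
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
)

missing=$(missing_cmds pgrep readlink)
if [ -n "$missing" ]; then
    echo "missing required commands: $missing" >&2
fi
```

A real script would most likely exit at this point instead of only printing the message.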
With perl, you can test files for existence, readability, executability etc., take a look here.
Therefore just use
executeBashStuff() if -x $filename;
or stat it:
stat($filename);
executeBashStuff() if -x _;
To me a better check is to run the program at the beginning of the script (with -V say).
I'd use the same invocation as you use to run the job later (via shell or not, via execvp). Once at it, make sure to see whether it threw errors. This is also discussed in your link but I would in fact get the output back (not send it away) and check that. This is the surest way to see whether the thing actually runs out of your program and whether it is what you expect it to be.
Checking for the executable with -x (if you know the path) is useful, too, but it only tells you that a file with a given name is there and that it is executable.
The system's which seems to be beset with criticism for its possible (mis)behavior: it may or may not be a shell built-in (which complicates how exactly to use it), it may be an external utility, and its exact behavior is system dependent. The module File::Which pointed out in Borodin's answer would be better -- if it is indeed better than which. (Which it may well be; I just don't know.)
Note. I am not sure what "bash command" means: a bash shell built-in, or the fact that you use bash when on terminal? Perl's qx and system use the sh shell, not bash (if they invoke the shell, which depends on how you use them). While sh is mostly a link, and often to bash, it may not be and there are differences, and you cannot rely on your shell configuration.
You can also actually run a shell, qx(/path/bash -c 'cmd args'), if you must. Mind the quotes. You may need to play with it to find the exact syntax on your system. See this page and links.
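The idea of running the program once at startup and inspecting what comes back can be sketched in POSIX shell as follows. This is only an illustration of the approach, not code from the answer: probe_cmd is a hypothetical name, and the sh -c 'echo ready' call stands in for whatever invocation (for example a version flag) your real program supports.

```shell
# probe_cmd CMD [ARGS...]: run the command once, keep its combined output,
# and report whether it started and exited cleanly. Hypothetical helper.
probe_cmd() {
    if probe_out=$("$@" 2>&1); then
        printf 'ok: %s\n' "$probe_out"
    else
        rc=$?
        printf 'failed (exit %s): %s\n' "$rc" "$probe_out" >&2
        return 1
    fi
}

probe_cmd sh -c 'echo ready'    # prints "ok: ready"
```

Because the output is captured instead of discarded, the caller can also check that it looks like the program it expects, which a bare existence test cannot do.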

View source for standard Linux commands e.g. cat, ls, cd

I would like to view the source code for a Linux command to see what is actually going on inside each command. When I attempt to open the commands in /bin in a text/hex editor, I get a bunch of garbage. What is the proper way to view the source on these commands?
Thanks in advance,
Geoff
EDIT:
I should have been more specific. Basically I have a command set that was written by someone who I can no longer reach. I would like to see what his command was actually doing, but without a way to 'disassemble' the command, I am dead in the water. I was hoping for a way to do this within the OS.
Many of the core Linux commands are part of the GNU core utils. The source can be found online here
The files you are opening are binary executables, which are what the kernel loads for the CPU to run. They are produced by a compiler that takes the source code you and I understand and turns it, through a number of stages, into this CPU-friendly format.
You can find out the system calls that are being made using strace
strace your_command
Most likely you can download the source code with your distribution's package manager. For example, on Debian and related distros (Ubuntu included), first find which package the command belongs to:
$ dpkg -S /bin/cat
coreutils: /bin/cat
The output tells you that /bin/cat is in the coreutils package. Now you can download the source code:
apt-get source coreutils
This question is related to reverse engineering.
Some keywords are static analysis and dynamic analysis.
Use gdb to check whether the binary contains a symbol table. (If the binary was compiled with debugging information, you can recover much of the source-level structure and may be able to skip the steps below.)
Observe the program's behavior with strace/ltrace.
Write pseudo-code using objdump, IDA Pro or another disassembler.
Run it under gdb for dynamic analysis and correct the pseudo-code.
A normal binary can be turned back into source code if you want to and have the time. An abnormal program, by contrast, is not easy to reverse, but such programs mostly appear in specific CTF competitions. (They use special techniques such as strip, objcopy, packers, etc.)
You can see assembly code of /bin/cat with:
objdump -d /bin/cat
Then analyze it and see which commands it can launch.
Another approach is strings /bin/cat; it is useful for forming an initial idea before you reverse the binary.
You can get the source code of every linux command online anyway :D

API for whereis Command in linux

Is there an API similar to the "whereis" command in UNIX that can be called from a C program to find out all instances of a given command?
Use getenv("PATH") to get a list of ':'-separated directory names. Look for the command name in each directory (e.g. using stat() or access()) and check if it's a regular file and can be executed. (If the directory name is empty, assume "." instead.) That is essentially what the which command does; whereis works similarly, but it also searches a list of standard directories for sources and manual pages.
The execvp() and execlp() functions automatically do PATH lookups when executing the given command, although it seems they do not manually check each path but just call execv(); if an error code is returned, they just try the next path.
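The PATH walk described above can be sketched in a few lines of POSIX shell; the same steps map onto getenv(), stat() and access() in a C program. The function name find_in_path is made up for this example. One simplification to be aware of: shell field splitting drops a trailing empty PATH entry, so a PATH ending in ':' is not treated as including "." here.

```shell
# find_in_path NAME: print every executable regular file called NAME
# found in the directories listed in $PATH. Hypothetical helper.
find_in_path() (
    name=$1
    IFS=:      # split $PATH on colons (safe: the body runs in a subshell)
    set -f     # do not glob-expand directory names
    for dir in $PATH; do
        # An empty entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

find_in_path sh
```

Unlike which, this prints every match rather than stopping at the first, which is closer to what the question asks for.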
There are many different functions in C you can use to launch shell command from your program. I think you should particularly look in the exec(3) family.
Every example you may need is in the manual: man 3 exec in a terminal or here: http://linux.die.net/man/3/exec.
Hope this helps!

How to get bash built in commands using Perl

I was wondering if there is a way to get Linux commands with a perl script. I am talking about commands such as cd ls ll clear cp
You can execute system commands in a variety of ways, some better than others.
Using system();, which prints the output of the command, but does not return the output to the Perl script.
Using backticks (``), which don't print anything, but return the output to the Perl script. An alternative to using actual backticks is to use the qx(); function, which is easier to read and accomplishes the same thing.
Using exec();, which does the same thing as system();, but does not return to the Perl script at all, unless the command doesn't exist or fails.
Using open();, which allows you to either pipe input from your script to the command, or read the output of the command into your script.
It's important to mention that the system commands that you listed, like cp and ls, are much better done using built-in functions in Perl itself. Spawning an external command is a slow process, so use native functions when the desired result is something simple, like copying a file.
Some examples:
# Prints the output. Don't do this.
system("ls");
# Saves the output to a variable. Don't do this.
$lsResults = `ls`;
# Something like this is more useful.
system("imgcvt", "-f", "sgi", "-t", "tiff", "Image.sgi", "NewImage.tiff");
This page explains in a bit more detail the different ways that you can make system calls.
You can, as voithos says, use either system() or backticks. However, take into account that this is not recommended, and that, for instance, cd won't work (won't actually change the directory). Note that those commands are executed in a separate process (often a new shell), and won't affect the running Perl script.
I would not rely on those commands and try to implement your script in Perl (if you're decided to use Perl, anyway). In fact, Perl was designed at first to be a powerful substitute for sh and other UNIX shells for sysadmins.
You can surround the command in backticks:
`command`
The problem is that Perl tries to execute bash built-ins (e.g. source) as if they were real files, but it can't find them because they don't exist on disk. The answer is to tell Perl what to execute explicitly. In the case of bash built-ins like source, do the following and it works just fine.
my $XYZZY=`bash -c "source SOME-FILE; DO_SOMETHING_ELSE; ..."`;
Or, for the case of cd, do something like the following.
my $LOCATION=`bash -c "cd /etc/init.d; pwd"`;

Location of cd executable

I read that the executables for the commands issued using exec() calls are supposed to be stored in directories that are part of the PATH variable.
Accordingly, I found the executables for ls, chmod, grep, cat in /bin.
However, I could not find the executable for cd.
Where is it located?
A process can only affect its own working directory. When an executable is executed by the shell it executes as a child process, so a cd executable (if one existed) would change that child process's working directory without affecting the parent process (the shell), hence the cd command must be implemented as a shell built-in that actually executes in the shell's own process.
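A small experiment makes this visible. In the sketch below, a child sh process changes its own working directory to / and reports it, while the parent shell's directory is recorded before and after. The variable names are just for illustration.

```shell
before=$(pwd)
# The child shell changes its own working directory and reports it.
child_dir=$(sh -c 'cd / && pwd')
after=$(pwd)

echo "child ended up in: $child_dir"
echo "parent before: $before"
echo "parent after:  $after"
```

The last two lines print the same directory: the cd happened only inside the child process, which is exactly why an external cd executable would be useless to the shell that launched it.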
cd is a shell built-in, unfortunately.
$ type cd
cd is a shell builtin
...from http://www.linuxquestions.org/questions/linux-newbie-8/whereis-cd-sudo-doesnt-find-cd-464767/
But you should be able to get it working with:
sh -c "cd /somedir; do something"
Not all utilities that you can execute at a shell prompt need actually exist as actual executables in the filesystem. They can also be so-called shell built-ins, which means – you guessed it – that they are built into the shell.
The Single Unix Specification does, in general, not specify whether a utility has to be provided as an executable or as a built-in, that is left as a private internal implementation detail to the OS vendor.
The only exceptions are the so-called special built-ins, which must be provided as built-ins, because they affect the behavior of the shell itself in a manner that regular executables (or even regular built-ins) can't (for example set, which sets variables that persist even after set exits). Those special built-ins are:
break
:
continue
.
eval
exec
exit
export
readonly
return
set
shift
times
trap
unset
Note that cd is not on that list, which means that cd is not a special built-in. In fact, according to the specification, it would be perfectly legal to implement cd as a regular executable. It's just not possible, for the reasons given by the other answers.
And if you scroll down to the non-normative section of the specification, i.e. to the part that is not officially part of the specification but only purely informational, you will find that fact explicitly mentioned:
Since cd affects the current shell execution environment, it is always provided as a shell regular built-in.
So, the specification doesn't require cd to be a built-in, but it's simply impossible to do otherwise.
Note that sometimes utilities are provided both by the shell and as an executable. A good example is the time utility, which on a typical GNU system is provided both as an executable by the GNU time package and as a shell keyword by Bash. This can lead to confusion, because when you do man time, you get the manpage of the time executable (the Bash version is documented in man bash, or via help time), but when you execute time you get the shell's version, which does not support the same features as the time executable whose manpage you just read. You have to explicitly run /usr/bin/time (or whatever path the executable is installed in) to get the executable.
According to this, cd is always a built-in command and never an executable:
Since cd affects the current shell execution environment, it is always provided as a shell regular built-in.
cd is part of the shell; an internal command. There is no binary for it.
The command cd is built-in in your command line shell. It could not affect the working directory of your shell otherwise.
I also searched for an executable named "cd", and there is none.
You can work with chdir (pathname) in C, it has the same effect.

Resources