How are builtin commands implemented in shell? - linux

When a shell (e.g. bash) invokes an executable file, it first forks itself, and then the copy calls execve on the executable file.
When a shell invokes a builtin command, no new process is created. Besides, execve can only operate on executable files, and builtin commands are not stored in executable files.
So how are builtin commands stored, and how are they invoked in terms of system calls?

"builtin command" means that you don't have to run an external program. So, no, there's no execve involved at all, and no, there's not even any system call necessarily involved. Your shell really just parses a command string and sees "hey, that's a builtin command, let's execute this and that function".

You can think of them as being much like shell functions.
So instead of launching an external process, the shell invokes an internal function that reads the input, writes the result, and does pretty much what the main function of a regular program would do.

The shell process itself just handles the builtin and potentially modifies itself or its environment as a result. There might not be any system calls made at all.
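To make the answers above concrete, here is a minimal sketch of a toy command dispatcher in Python. It is not how bash is actually written; the names BUILTINS, builtin_cd, builtin_echo and run_line are invented for illustration. A builtin is just an entry in an in-memory table that maps a name to a function, and only names missing from that table go through fork and exec. Note that this toy cd does end up making a chdir system call, but it makes it from the shell's own process, which is exactly why cd has to be a builtin.

```python
import os
import shlex


def builtin_cd(args):
    # Changes the working directory of *this* process; no child is created.
    os.chdir(args[0] if args else os.environ.get("HOME", "/"))
    return 0


def builtin_echo(args):
    # Pure in-process work: join the words and print them.
    print(" ".join(args))
    return 0


# Hypothetical builtin table: command name -> function inside the shell.
BUILTINS = {"cd": builtin_cd, "echo": builtin_echo}


def run_line(line):
    words = shlex.split(line)
    if not words:
        return 0
    name, args = words[0], words[1:]

    if name in BUILTINS:
        # Builtin: an ordinary function call, no fork and no exec.
        return BUILTINS[name](args)

    # External command: fork, then exec in the child.
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(name, words)
        except OSError:
            os._exit(127)  # conventional "command not found" status
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)
```

Calling run_line("cd /") changes the directory of the Python process itself, whereas run_line("true") starts a separate process and only reports its exit status.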

Related

Is there any drawback to using functions instead of aliases?

Bash functions are more versatile than aliases. For example, they accept parameters.
Is there any drawback to going full function style and dropping aliases completely, even for simple cases? I can imagine that functions might be more resource intensive, but I have no data to back that up.
Any other reason to keep some of my aliases? They have easier syntax and are easier for humans to read, but apart from that?
Note: aliases take precedence over functions.
The following link may be relevant regarding function overhead; it seems there is no overhead compared to an alias: 3.6. Functions, Aliases, and the Environment
Quoting Dan again: "Shell functions are about as efficient as they can be. It is the approximate equivalent of sourcing a bash/bourne shell script save that no file I/O need be done as the function is already in memory. The shell functions are typically loaded from [.bashrc or .bash_profile] depending on whether you want them only in the initial shell or in subshells as well. Contrast this with running a shell script: Your shell forks, the child does an exec, potentially the path is searched, the kernel opens the file and examines enough bytes to determine how to run the file, in the case of a shell script a shell must be started with the name of the script as its argument, the shell then opens the file, reads it and executes the statements. Compared to a shell function, everything other than executing the statements can be considered unnecessary overhead."

Is the linux shell just an optional utility, or is it mandatory when running processes?

I'm still pretty confused about the role the linux shell plays in running programs, despite using linux a lot.
I understand there are two types of shells, interactive shells and non-interactive shells. A terminal session interacts with an interactive shell, and scripts run in a non-interactive shell. But is there really any difference other than the ability to read input and print output? If I invoke a script from a shell, does it run in this interactive shell or in a new non-interactive shell inside the shell?
Also, when I execute a binary, either by invoking it through an interactive shell or through a graphical interface, does it always run in the shell, or could a process run without a shell at all? It's said that all processes communicate with the kernel through the shell, but I'm confused because in docker, you can define the entrypoint to be either a binary or "sh -c binary".
The shell is just one possible interface. Every Linux system has a notion of a "first" process (usually called init) that is started directly by the kernel. Every other program on your computer is started by another process that first forks itself, then calls exec (actually, one of about 6 functions in the same family) to replace itself with a different program.
The shell is just one possible interface, one that parses text into requests to run other programs. The shell command line mv foo bar is parsed as a request to fork the shell and call exec in the new copy with the three words mv, foo, and bar as arguments.
Consider the following snippet of Python:
subprocess.call(["mv", "foo", "bar"])
which basically does the same thing: the Python program forks itself and calls exec with the three given strings as arguments. There is no shell involvement.
The shell is just a convenient UI that lets you run other processes the way you want to. It can also run scripts to do the same. That's all it does. It's not responsible for doing anything for the processes once it runs them.
You could entirely replace it with python, which lets you do the same things, but that's annoying because you have to type chepner's subprocess.call(["mv", "foo", "bar"]) just to run the mv program. If you wanted to pipe one program to another, you'd need 5-10 such lines. Not much fun to write interactively.
You could entirely replace it with KDE/Gnome/whatever and double click programs to run them, but that's not very flexible since you can't include arguments and such, and you can't automate it.
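As a rough illustration of the "5-10 lines" needed to pipe one program into another without a shell, here is a sketch using Python's standard subprocess module. The helper name pipe is made up for this example; it is the equivalent of typing cmd1 | cmd2 in a shell.

```python
import subprocess


def pipe(cmd1, cmd2):
    """Run `cmd1 | cmd2` without a shell and return cmd2's output as text."""
    p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(cmd2, stdin=p1.stdout, stdout=subprocess.PIPE)
    # Close our copy of the read end so p1 sees a broken pipe if p2 exits early.
    p1.stdout.close()
    out, _ = p2.communicate()
    p1.wait()
    return out.decode()


if __name__ == "__main__":
    # Same effect as the shell command line: echo hello | tr a-z A-Z
    print(pipe(["echo", "hello"], ["tr", "a-z", "A-Z"]), end="")
```

The shell one-liner does the same plumbing (create a pipe, start two processes, connect them, wait for both); it is just much shorter to type.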
I understand there are two types of shells, interactive shells and non-interactive shells. A terminal session interacts with an interactive shell, and scripts run in a non-interactive shell. But is there really any difference other than the ability to read input and print output?
They're just two different modes that you can run sh in. You want comfy keyboard shortcuts, aliases and options to help you type things manually (interactively), but they're pointless or annoying when running a pre-written script.
If I invoke script from shell, does it run in this interactive shell or new non-interactive shell inside shell?
It runs in a new, independent process. You can run it in the same interactive shell instance with source yourscript, which is basically the same as typing the script contents on the keyboard.
Also, when I execute binary either by invoking it through interactive shell or graphical interface, does it always run in the shell, or could a process run without shell at all?
The process always runs entirely independently of the shell, but may share the same terminal.
It's said that all processes communicate with the kernel through the shell,
Processes never talk to the kernel through the shell. They talk through syscalls.
but I'm confused because in docker, you can define the entrypoint to be either a binary or "sh -c binary".
For a simple binary, the two are identical.
If you want to e.g. set up pipes or redirections because the process doesn't do it on its own, you can use sh -c to have a shell do it instead.
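The difference between the two forms can be sketched with Python's subprocess module (the helper names run_direct and run_via_shell are invented for this example, and this is not Docker's own API). When a program is executed directly, characters such as | are just ordinary arguments; only when a shell is put in between does anyone interpret them as a pipe.

```python
import subprocess


def run_direct(argv):
    # The program is exec'd directly: every word is passed through verbatim.
    return subprocess.run(argv, capture_output=True, text=True).stdout


def run_via_shell(command_line):
    # A shell is started first; it parses the string and sets up the pipe.
    return subprocess.run(
        ["sh", "-c", command_line], capture_output=True, text=True
    ).stdout


if __name__ == "__main__":
    # echo receives "|", "tr", "a-z", "A-Z" as plain arguments and prints them.
    print(run_direct(["echo", "hi", "|", "tr", "a-z", "A-Z"]), end="")
    # sh builds a pipeline between echo and tr.
    print(run_via_shell("echo hi | tr a-z A-Z"), end="")
```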

system() call in perl [duplicate]

I want to make a system() call in a perl script and, because I want to redirect stdin and stdout, I think I need to pass a single string to system() and let the shell interpret the metacharacters. However, I do not seem to be able to correctly detect when the program called via system() segfaults.
The perl system() man page at http://perldoc.perl.org/functions/system.html cautions "When system's arguments are executed indirectly by the shell, results and return codes are subject to its quirks." Should I be concerned about this?
My code for testing the return value of system() is pretty much identical to the example given on the same man page (just above the warning I mention) but in retrospect that appears to be for calling system() with a LIST.
So, I think my core issue is: how do I detect how a program terminated when it was called in a shell from perl's system()? Apologies if this is a repeat question, but I cannot find it addressed anywhere. FWIW, I'm running the script on a Fedora distro of linux.
Many thanks.
I would suggest you have a look at IPC::Open2 and IPC::Open3
What you are trying to do is a bit too complicated for system, which is more or less geared towards running a command, waiting for it to finish, and reporting its exit status.
IPC::Open2 allows you to open an exec pipe to a process, and attach your own filehandles to STDIN and STDOUT, meaning you can do bidirectional communication. (Open3 also allows STDERR).
Catching signals and errors on your attached process is a bit more complicated - the only thing you're fairly sure to get is a return code. With system, $? should be set automatically, but with IPC::Open[23] you may need to use waitpid to catch the return code.
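The shell "quirk" the question worries about can be demonstrated with a small sketch. It is written in Python rather than Perl, so subprocess stands in for system() here, and the helper names crash_direct and crash_via_shell are invented. The child deliberately sends itself SIGSEGV. When the parent runs it directly, the wait status says "killed by a signal". When a shell sits in between, the shell is the process that sees the signal, and the parent only sees the shell exiting normally with a status that common shells such as bash and dash set to 128 plus the signal number.

```python
import shlex
import signal
import subprocess
import sys

# A child program that kills itself with SIGSEGV, standing in for a
# segfaulting external command.
CHILD_CODE = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
CHILD_ARGV = [sys.executable, "-c", CHILD_CODE]


def crash_direct():
    # No shell: a negative returncode means "terminated by that signal".
    return subprocess.run(CHILD_ARGV, stderr=subprocess.DEVNULL).returncode


def crash_via_shell():
    # The trailing "exit $?" keeps the shell alive until the child is done,
    # so the parent sees the shell's exit status, not the child's signal.
    command_line = shlex.join(CHILD_ARGV) + "; exit $?"
    return subprocess.run(
        command_line, shell=True, stderr=subprocess.DEVNULL
    ).returncode


if __name__ == "__main__":
    print("direct:   ", crash_direct())     # -SIGSEGV on Linux
    print("via shell:", crash_via_shell())  # typically 128 + SIGSEGV
```

In Perl terms, the direct case corresponds to a non-zero signal number in the low bits of $?, while the shell case typically leaves those bits at zero and puts the shell's exit code in $? >> 8.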

Difference between command, function and system call

What is the difference between a command, a function and a system call?
This is probably homework help, but anyhow:
Command - A program (or a shell built-in) you execute from (probably) your command shell.
Function - A logical subset of your program. Calling one is entirely within your process.
System call - A function that is executed by your operating system; the primary way of using OS features such as working with a file system or using the network.
A command can be a program, which in turn is comprised of functions, which themselves can execute system calls.
For example, the 'cp' command in Unix-like systems copies files. Its implementation includes functions which perform the copying. Those functions themselves execute system-calls like open() and read().
They are all just abstractions of a set of computer instructions which perform a given task.
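The cp example can be sketched in a few lines of Python. This is not the real cp source; copy_file is a made-up function whose body uses the thin wrappers in Python's os module, each of which maps closely onto the system call of the same name (open, read, write, close).

```python
import os


def copy_file(src, dst, bufsize=65536):
    """A function (in-process logic) built on top of system calls."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, bufsize)
                if not chunk:  # empty result means end of file
                    break
                while chunk:  # write() may write fewer bytes than asked
                    written = os.write(out_fd, chunk)
                    chunk = chunk[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
```

Wrapping copy_file in a script that reads two file names from its arguments would turn it into a command: the command contains the function, and the function makes the system calls.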

Difference between execv and just running an app?

We have a stub that we launch from inittab and that execv's our process. (ARM Linux Kernel 2.6.25)
When testing the process, it fails only if launched from inittab and execv'd. If launched from the command line, it works perfectly, every time.
The process makes heavy use of SYS V IPC.
Are there any differences between the two launch methods that I should be aware of?
As Matthew mentioned, it is probably an env variable issue. Try to dump your env list before calling your program in both cases: through the stub and 'by hand'.
BTW, it would help a lot if you could provide more information about why your program crashed. Log file? Core dump/gdb? Return value from execve?
Edit:
Other checks: are you sure you pass exactly the same parameter list (if there are parameters)?
To answer your question, there is no difference between the two methods. Your shell actually calls fork() and then execve() to launch your process, feeding it the parameters you've provided by hand and the environment variables you've set in your shell. By the way, when you launch your program through init, it could be started during an early stage of machine startup. Are you sure everything your application needs is ready at that point?
Could it be an issue of environment variables? If so, consider using execve or execle with an appropriate envp argument.
The environment variable suggestion is pretty good - specifically I'd check $PATH, and $LD_LIBRARY_PATH if your dependent libraries live in a non-standard location, to make sure everything is being found. Another thing you could check: are you running under the same uid/gid when run from inittab?
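Once the environment has been dumped in both situations, comparing the two dumps by eye is error-prone. A small helper along these lines can do it; diff_env and the sample variables are hypothetical and only meant as a sketch. It takes two name-to-value mappings and reports which variables exist on one side only and which differ.

```python
def diff_env(env_a, env_b):
    """Compare two environment dumps given as dicts of name -> value.

    Returns (only_in_a, only_in_b, changed) as sorted lists of names.
    """
    only_in_a = sorted(set(env_a) - set(env_b))
    only_in_b = sorted(set(env_b) - set(env_a))
    changed = sorted(
        name for name in set(env_a) & set(env_b) if env_a[name] != env_b[name]
    )
    return only_in_a, only_in_b, changed


if __name__ == "__main__":
    # Made-up sample data: an interactive shell versus a sparse init environment.
    from_shell = {"PATH": "/usr/local/bin:/usr/bin:/bin", "HOME": "/root", "TERM": "vt100"}
    from_init = {"PATH": "/sbin:/bin", "RUNLEVEL": "3"}
    print(diff_env(from_shell, from_init))
```

In a Python stub, dict(os.environ) gives the mapping for the current process; in a C stub, the same information is available by walking environ before calling execv.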
What if you replace your stub with a shell script?
If it works from the command line, it should work from a shell script, and then you will know whether the problem is your stub or the fact that it is started from inittab.
Could it be a controlling tty issue?
Another advantage of the shell script is that you can edit it and strace your program to see where it fails.
It was a mismatched kernel/library issue. Everything cleaned up after a full recompile.

Resources