How do subshells work under the hood in Linux?

My understanding is that a subshell forks a child process off the parent process and that any commands in the parentheses are executed using execve. The parent process waits for the child process to finish executing. Am I missing anything here?

The shell may fork to create the subshell, and then, for each external command in (), the subshell forks again before calling execve.
If the commands in () are shell built-ins, the subshell does not need to fork and execve for them.

Related

Fork child process to die when parent exits? (bash)

I'm working with parallel processing and rather than dealing with cvars and locks I've found it's much easier to run a few commands in a shell script in sequence to avoid race conditions in one place. The new problem is that one of these commands calls another program, which the OS has decided to put into a new process. I need to kill this process from the parent program, but the parent program only knows the pid of the parent (shell script), so this process keeps executing on its own.
Is there a way in bash to set a subprocess to die when the parent dies? I've tried to figure out how to execute it as a daemon because I read daemons exit when the parent dies, but it's tricky and I can't quite get it right. Thanks!
Found the problem, and this fixed it (except for some pesky messages that somehow cannot be redirected to /dev/null).
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT

When is the wait system call used?

The theory says that if wait is not called, the parent won't get information about the terminated child, and the child becomes a zombie. But when we create a process, zombies are not created even if we don't call wait. My question is whether wait is called automatically.
In many languages, starting a subprocess will call wait() for you. For example, in Ruby or Perl, you often shell out like this:
#!/usr/bin/ruby
system("ls /tmp")
`ls /tmp`
This is doing a bunch of magic for you, including calling wait(). In fact, Ruby must wait for the process to exit anyway to collect the output before the program can continue.
You can easily create zombies like this:
#!/usr/bin/ruby
if fork
  sleep 1000 # Parent ignoring the child
else
  exec "ls /tmp" # short-lived child
end
When we manually fork/exec, there is no magic calling wait() for us, and a zombie will be created. But when the parent exits, the zombie child will get re-parented to init, which will always call wait() to clean up zombies.

making parent process wait till child has called exec

In Linux, after calling fork(), my child process is going to call exec soon. Is there a way for the parent process to wait() and not do anything until the child has exec'ed?
Thanks.
There is no (API) way for the parent to know that the child is performing an exec().
But there is a nice pipe trick: have the child inherit a file descriptor for a pipe, and set the close-on-exec flag on the pipe before the fork(). The parent closes its own copy of the write end and reads from the pipe; it is notified by an EOF when the exec() closes the child's copy. The EOF also arrives if the child exits without exec'ing.
Please note that this does not need any collaboration from the child.
Use vfork() instead of fork(). That causes the parent to be suspended until the child either exits or calls one of the execve() family of functions.
You need to call waitpid with the process ID that fork returns to the parent.
EDIT
Or, if you mean that you want to know that the child is about to call exec, use pause in the parent. Have the child send a suitable signal to the parent with kill (the parent's process ID can be obtained from getppid). SIGUSR1 might be a useful signal for this. Do this just before the exec.

parent process after child does an exec

I have a scenario in which, after the fork, the child uses execle() to run
a Linux system command that executes a small shell script.
The parent only does a wait() after that. So my question is: does the parent execute
wait() after the execle() that the child process executes?
Thanks
Smita
I'm not too sure what you're asking, but if the parent is in a wait() system call, it will wait there until any child exits. There are other things, like signals, that will take it out of the wait too.
You do have to be careful in the child process that you don't accidentally fall through into the parent code on error.
This (a child process doing some execve after its parent fork-ed, and the parent wait- or waitpid-ing it) is a very common scenario; most shells work this way. You could e.g. strace -f an interactive bash shell to learn more, or study the source code of simple shells like sash.
Notice that after a fork(2) syscall, the parent and the child processes may run simultaneously (i.e. at the same time, especially on multi-core machines).

Get the child PID after system()

As far as I understand, the system() call internally uses fork() and exec() but encapsulates them for easier handling.
Is it possible to get the PID from the child process created with the system() call?
Aim: I want to be able to SIGINT any child process after a certain timeout. I could rebuild the system() function by using fork() and exec(). But all I need is the child's PID and perhaps there is shortcut by using system()?
Typically, system() is a synchronous operation. This means it won't return until the child has exited, i.e. there is no valid PID for the child process when system() returns, since the child process no longer exists.
There's no way (that I know of) when using system(). Besides, with system() there's the additional step of launching a shell which will execute your command, making this a tad more difficult. You're probably better off replacing it with fork() and exec().
I had this problem. Solved it by:
pid_t ppid = getpid();
pid_t syspid = ppid + 1;       /* a guess, not a guarantee: assumes the next PID is free */
int status = system(argv[1]);  /* here argv[1] was another program */
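This will not always work: the kernel does not guarantee sequential PIDs, and any other process on the machine may fork in between. Most of the time, though, the system() child's PID is the parent's PID + 1 (unless you have multiple forks).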
However, there is a way of doing what you want via the /proc file system. You can go through process directories (directory names are PIDs) and check the "status" files. There's a PPid entry in each of them specifying the parent pid.
This way, if you get a "status" file which specifies the PID of your process as PPID, then its folder name in /proc file system is the value you are looking for.

Resources