I'm a beginner in Linux and have some questions about jobs and process groups.
My textbook says: 'Unix shells use the abstraction of a job to represent the processes that are created as a result of evaluating a single command line. At any point in time, there is at most one foreground job and zero or more background jobs.'
Let's say we have this simple shell code (I left out some unimportant code, e.g. setting up argv, etc.):
When we type the first command, for example: ./exampleProgram &
Q1- Is a job created? If yes, what processes does the job contain?
The main() of shellex.c is invoked, so when line 15 executes, fork() creates a new child process. Let's say the parent process is p1 and the newly created child process is c1.
If it is a background job, we can then type another command at the prompt, ./anotherProgram &, and press Enter.
Q2- I'm pretty sure there are just p1 and c1 at this moment, and p1 is executing the second command. When it executes line 15 (fork()) again,
we have p1, c1 and the newly created c2. Is my understanding correct?
Q3- How many jobs are there? Is it just one job that contains p1, c1, and c2?
Q4- If it is only one job, then as we keep typing new commands we will only ever have one job containing one parent process p1 and many child processes c1, c2, c3, c4...
So why does my textbook say that a shell can have more than one job? And why is there at most one foreground job and zero or more background jobs?
There is quite a bit to say on this topic, some of which can fit in an answer, and most of which will require further reading.
For Q1, I would say conceptually yes, but jobs are not automatic, and job tracking and control are not magical. I don't see any logic in the code snippets you've shown that, e.g., establishes and maintains a jobs table. I understand it's just a sample, so maybe the job control logic is elsewhere. Job control is a feature of common, existing Unix shells, but if a person writes a new Unix shell from scratch, job control features need to be added as code.
For Q2, the way you've put it is not how I would put it. After the first call to fork(), yes there is a p1 and a c1, but recognize that at first, p1 and c1 are different instances of the same program (shellex); only after the call to execve() is exampleProgram running. fork() creates a child instance of shellex, and execve() causes the child instance of shellex to be replaced (in RAM) by exampleProgram (assuming that's the value of argv[0]).
There is no real sense in which the parent is "executing" the child, nor the process that replaces the child upon execve(), except just to get them going. The parent starts the child and might wait for the child execution to complete, but really a parent and its whole hierarchy of child processes are all executing each on its own, being executed by the kernel.
But yes, if told that the program to run should be run in the background, then shellex will accept further input, and upon the next call to fork(), there will be the parent shellex with two child processes. And again, at first the child c2 will be an instance of shellex, quickly replaced via execve() by whatever program has been named.
(Regarding running in the background, whether or not & has that effect depends upon the logic inside the function named parseline() in the sample code. Shells I'm familiar with use & to say "run this in the background", but there is nothing special or magical about that. A newly written Unix shell can do it some other way, with a trailing +, or a leading BG:, or whatever the shell author decides to do.)
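To make the p1/c1/c2 picture concrete, here is a minimal sketch in Python, whose os module wraps the same fork(), execvp() and waitpid() calls a C shell would use. The function name launch and its background flag are made up for illustration; they are not taken from shellex.c.

```python
import os

def launch(argv, background):
    """Fork a child, exec argv in it, and return (pid, exit_code_or_None)."""
    pid = os.fork()
    if pid == 0:
        # Child: still a copy of this program until exec replaces it.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into parent code.
            os._exit(127)
    if background:
        # Parent does not wait, so it is free to read the next command.
        return pid, None
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return pid, code

if __name__ == "__main__":
    c1, _ = launch(["sleep", "1"], background=True)
    c2, _ = launch(["sleep", "1"], background=True)
    print("parent", os.getpid(), "now has children", c1, "and", c2)
    for child in (c1, c2):
        os.waitpid(child, 0)  # reap them so no zombies are left behind
```

The parent never becomes either program; it only forks, and each child replaces itself via exec.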
For Q3 and Q4, the first thing to recognize is that the parent you are calling p1 is the shell program that you've shown. So, no, p1 would not be part of the job.
In Unix, a job is a collection of processes that execute as part of a single pipeline. Thus a job can consist of one process or many. Such processes remain attached to the terminal from which they are run, but might be in the foreground (running and interactive), suspended, or in the background (running, not interactive).
one process, foreground : ls -lR
one process, background : ls -lR &
one process, background : ls -lR, then CTRL-Z, then bg
many processes, foreground : ls -lR | grep perl | sed 's/^.*\.//'
many processes, background : ls -lR | grep perl | sed 's/^.*\.//' &
To see jobs vs. processes empirically, run a pipeline in the background (the 5th of the 5 examples above), and while it is running use ps to show you the process IDs and the process group IDs. e.g., on my Mac's version of bash, that's:
$ ls -lR | grep perl | sed 's/^.*\.//' &
[1] 2454                       <-- job 1, PID of the sed is 2454
$ ps -o command,pid,pgid
COMMAND           PID  PGID
vim              2450  2450    <-- running in a different tab
ls -lR           2452  2452    }
grep perl        2453  2452    }-- 3 PIDs, 1 PGID
sed s/^.*\.//    2454  2452    }
In contrast to this attachment to the shell and the terminal, a daemon detaches from both. When starting a daemon, the parent uses fork() to start a child process, but then exits, leaving only the child running, now with a parent of PID 1. The child closes stdin, stdout, and stderr, which are meaningless because a daemon runs "headless".
But in a shell, the parent -- which, again, is the shell -- stays running either wait()ing (foreground child program), or not wait()ing (background child program), and the child typically retains use of stdin, stdout, and stderr (although, these might be redirected to files, etc.)
And a shell can invoke sub-shells, and of course any program that is run can fork() its own child processes, and so on. So the hierarchy of processes can become quite deep. Without specific action otherwise, a child process will be in the same process group as its parent.
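The process-group inheritance described above can be checked with a short sketch. It uses Python's os.setpgid() and os.getpgid(), which wrap the system calls of the same name; the helper child_pgid is a made-up name, and a pipe is just one convenient way to get the answer back from the child.

```python
import os

def child_pgid(new_group):
    """Fork a child and return (child_pid, process group ID the child reports)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_end)
            if new_group:
                # Become the leader of a new group: PGID becomes our own PID.
                os.setpgid(0, 0)
            os.write(write_end, str(os.getpgid(0)).encode())
        finally:
            os._exit(0)
    os.close(write_end)
    data = os.read(read_end, 64)
    os.close(read_end)
    os.waitpid(pid, 0)
    return pid, int(data)

if __name__ == "__main__":
    print("parent pgid:", os.getpgid(0))
    print("inherited  :", child_pgid(new_group=False))
    print("own group  :", child_pgid(new_group=True))
```

A job-control shell does roughly the second variant for the first process of each pipeline, then puts the remaining processes of that pipeline into the same group.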
Here are some articles for further reading:
What is difference between a job and a process in Unix?
https://unix.stackexchange.com/questions/4214/what-is-the-difference-between-a-job-and-a-process
https://unix.stackexchange.com/questions/363126/why-is-process-not-part-of-expected-process-group
Bash Reference Manual; Job Control
Bash Reference Manual; Job Control Basics
A job is not a Linux thing, it's not a background process, it's something your particular shell defines to be a "job".
Typically a shell introduces the notion of "job" to do job control. This normally includes a way to identify a job and perform actions on it, like
bring into foreground
put into background
stop
resume
kill
If a shell has no means to do any of this, it makes little sense to talk about jobs.
Related
I have multiple processes (web-scrapers) running in the background (one scraper for each website). The processes are python scripts that were spawned/forked a few weeks ago. I would like to control (they listen on sockets to enable IPC) them from one central place (kinda like a dispatcher/manager python script), while the processes (scrapers) remain individual unrelated processes.
I thought about using the PID to reference each process, but that would require storing the PID whenever I (re)launch one of the scrapers because there is no semantic relation between a number and my use case. I just want to supply some text-tag along with the process when I launch it, so that I can reference it later on.
pgrep -f searches all processes by their name and calling pattern (including arguments).
E.g. if you spawned a process as python myscraper --scrapernametag=uniqueid01 then you can run:
TAG=uniqueid01; pgrep -f "scrapernametag=$TAG"
to discover the PID of a process later down the line.
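If the dispatcher is itself a Python script, the same lookup can be sketched without shelling out. This is a rough, Linux-only stand-in for pgrep -f that does a plain substring match on /proc/<pid>/cmdline (pgrep matches a regular expression). The names launch_tagged and find_pids_by_tag are hypothetical, and the sleeping child is only a placeholder for a real scraper.

```python
import os
import subprocess
import sys

def launch_tagged(tag, seconds=30):
    """Start a placeholder 'scraper' whose command line carries the tag."""
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})",
         f"--scrapernametag={tag}"]
    )

def find_pids_by_tag(tag):
    """Return PIDs whose full command line contains tag (reads Linux /proc)."""
    matches = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit() or int(entry) == os.getpid():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                # Arguments are NUL-separated in /proc/<pid>/cmdline.
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited meanwhile, or is not readable
        if tag in cmdline:
            matches.append(int(entry))
    return matches

if __name__ == "__main__":
    proc = launch_tagged("uniqueid01")
    print(find_pids_by_tag("scrapernametag=uniqueid01"))
    proc.kill()
    proc.wait()
```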
I have a Bash script that gets invoked like this:
script.sh < input_file.txt
All script.sh does is run some other program:
#!/bin/bash
otherprogram
Now when "otherprogram" reads from stdin, it gets text from input_file.txt, without any need to explicitly redirect the standard input of script.sh into otherprogram.
I don't know a lot about how processes get started, but I have read that when fork() gets called, all file descriptors from the parent process, including stdin, are shared with the child, which makes sense, since fork() just makes an identical copy of everything in the parent process' memory. But why would all file descriptors still be shared after the child process replaces the copy of the parent with a new program (presumably by calling exec...())?
If child processes do always inherit all file descriptors from their parent, can someone explain why that actually makes sense and is a good idea?
When fork() is called, most fields of the PCB (process control block) are copied from the original to the newly created PCB (and so are the open files, as you said).
To summarize, immediately after executing a fork:
There are 2 processes that are exactly the same, except for the differences described in the fork(2) man page if you want to have a look.
Both processes are at the same line of code (the line immediately after the fork).
In the child process, the return value of the fork is 0.
In the parent process, the return value of the fork is greater than 0.
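The two return values can be observed directly. The sketch below uses Python's os.fork(), a thin wrapper around fork(2); the function name fork_demo and the exit code 7 are arbitrary choices for illustration.

```python
import os

def fork_demo():
    """Return (value fork() gave the parent, exit code chosen by the child)."""
    pid = os.fork()
    if pid == 0:
        # Child branch: fork() returned 0 here.
        os._exit(7)
    # Parent branch: fork() returned the child's PID, which is > 0.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)

if __name__ == "__main__":
    child_pid, code = fork_demo()
    print("fork() returned", child_pid, "in the parent; child exited with", code)
```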
Let's move to the exec:
So we now have two copies of the shell. But they are still both running the shell program; we want the child to run another program. The child uses exec to replace itself with the program you pass as an argument. Exec does not create a new process; it just changes the program file that an existing process is running.
So exec first wipes out the memory state of the calling process. It then goes to the filesystem to find the requested program file, copies this file into the process's memory, and initializes the register state, including the PC.
exec doesn't alter most of the other fields in the PCB. This is important, because it means the process calling exec can set things up beforehand if it wants to, for example by changing the open files. That is what happens in your case, where the child copy inherits the stdin file descriptor, which points to your input file.
Another example can be:
Suppose you want the child process, when it prints to standard output (file descriptor 1), for example by means of an echo, to actually write to a file. Before calling exec, you change where file descriptor 1 points (for example with dup2()). This is normally done in the child, after the fork(), so that the parent's own standard output is left untouched; the exec'd program then inherits the redirected descriptor.
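As a rough illustration of that redirection, the sketch below forks, points file descriptor 1 (standard output) at a file with dup2() in the child, and only then calls exec. It is written with Python's os module, which mirrors the C calls; the function name run_with_stdout_to is made up for this example.

```python
import os

def run_with_stdout_to(path, argv):
    """Run argv with its standard output redirected to path; return exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)   # fd 1 (stdout) now refers to the file
            os.close(fd)     # the extra descriptor is no longer needed
            os.execvp(argv[0], argv)  # new program inherits the redirected fd 1
        finally:
            os._exit(127)    # only reached if something above failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

if __name__ == "__main__":
    run_with_stdout_to("/tmp/echo_demo.txt", ["echo", "hello"])
    with open("/tmp/echo_demo.txt") as f:
        print(repr(f.read()))
```

The program being run (echo here) needs no knowledge of the file; it just writes to descriptor 1 as usual.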
In Unix and related OSes, the general way to launch a program is:
The process forks.
The child process makes whatever changes to the environment are needed, such as (perhaps):
Changing where file descriptors point (especially stdin/stdout/stderr).
Changing the running user.
Changing the present working directory.
Changing/adding/removing environment variables. (Though you can also just do this as part of the next step.)
The child process then replaces itself with the desired program.
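The steps above can be sketched as follows, using Python's os module as a stand-in for the C calls. The helper name spawn and its cwd and extra_env parameters are assumptions made for this example; the pipe exists only so the caller can see what the child printed.

```python
import os

def spawn(argv, cwd=None, extra_env=None):
    """Fork, adjust the child's own state, exec argv, and return its output."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_end)
            os.dup2(write_end, 1)        # change where stdout points
            if cwd is not None:
                os.chdir(cwd)            # affects only the child
            env = dict(os.environ, **(extra_env or {}))
            os.execvpe(argv[0], argv, env)  # replace the child with the program
        finally:
            os._exit(127)                # only reached if setup or exec failed
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break                        # EOF: child closed its stdout
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode()

if __name__ == "__main__":
    print(spawn(["pwd"], cwd="/"), end="")
    print(spawn(["sh", "-c", "echo $DEMO_VAR"], extra_env={"DEMO_VAR": "42"}), end="")
```

Nothing the child changes between fork and exec leaks back into the parent, which keeps its own working directory and environment.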
This approach allows for a huge amount of flexibility; the parent program can decide exactly what should and should not be inherited. (It also fits well with the Unix philosophy, of doing one thing and doing it well: exec doesn't need 50 bajillion arguments for different things you might want the new program to have, because you just set those up in the process before switching over to that program.)
It also makes it very easy to write "wrapper" programs that delegate almost all of their functionality to other programs. For example, a script like this:
#!/bin/bash
cd /directory/that/foo/requires/
foo "$@"
which is a drop-in replacement for foo that just changes the directory it's run in, doesn't have to worry about all the things that foo should inherit, only the one thing that it shouldn't. (Which, again, fits very well with the Unix philosophy.)
I was wondering if there is some way to force Linux to assign a specific process ID to an application before running it. I need to know the process ID in advance.
Actually, there is a way to do this. Since kernel 3.3 with CONFIG_CHECKPOINT_RESTORE set (which is set in most distros), there is /proc/sys/kernel/ns_last_pid, which contains the last PID allocated by the kernel. So, if you want to set the PID for a forked program, you need to perform these actions:
Open /proc/sys/kernel/ns_last_pid and get fd
flock it with LOCK_EX
write PID-1
fork
Voilà! The child will have the PID that you wanted.
Also, don't forget to unlock (flock with LOCK_UN) and close ns_last_pid.
You can check out the C code at my blog here.
As many already suggested you cannot set directly a PID but usually shells have facilities to know which is the last forked process ID.
For example, in bash you can launch an executable in the background (by appending &) and find its PID in the variable $!.
Example:
$ lsof >/dev/null &
[1] 15458
$ echo $!
15458
On CentOS7.2 you can simply do the following:
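Let's say you want to execute the sleep command with a PID of 1894. Run this from a root shell (with sudo echo ... > file, the redirection is performed by the calling, unprivileged shell, so the write would be refused):
echo 1893 > /proc/sys/kernel/ns_last_pid; sleep 1000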
(However, keep in mind that if by chance another process executes in the extremely brief amount of time between the echo and sleep command you could end up with a PID of 1895+. I've tested it hundreds of times and it has never happened to me. If you want to guarantee the PID you will need to lock the file after you write to it, execute sleep, then unlock the file as suggested in Ruslan's answer above.)
There's no way to force a specific PID for a process. As Wikipedia says:
Process IDs are usually allocated on a sequential basis, beginning at
0 and rising to a maximum value which varies from system to system.
Once this limit is reached, allocation restarts at 300 and again
increases. In Mac OS X and HP-UX, allocation restarts at 100. However,
for this and subsequent passes any PIDs still assigned to processes
are skipped
You could just repeatedly call fork() to create new child processes until you get a child with the desired PID. Remember to call wait() often, or you will hit the per-user process limit quickly.
This method assumes that the OS assigns new PIDs sequentially, which appears to be the case, e.g., on Linux 3.3.
The advantage over the ns_last_pid method is that it doesn't require root permissions.
Every process on a Linux system is generated by fork(), so there should be no way to force a specific PID.
From Linux 5.5 you can pass an array of PIDs to the clone3 system call to be assigned to the new process, up to one for each nested PID namespace, from the inside out. This requires CAP_SYS_ADMIN or (since Linux 5.9) CAP_CHECKPOINT_RESTORE over the PID namespace.
If you are not concerned with PID namespaces, use an array of size one.
I have a process on a machine which I stopped (with a Ctrl-Z). After ssh'ing onto the machine, how do I resume the process?
You will need to find the PID and then issue kill -CONT <pid>.
You can find the PID by using ps with some options to produce extended output. Stopped jobs have a T in the STAT (or S) column.
If you succeed in continuing the process but it no longer has a controlling terminal (and it needs one) then it could possibly hang or go into a loop: just keep your eye on its CPU usage.
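What kill -CONT does can be sketched with signals sent from a parent process. The example below uses SIGSTOP (the uncatchable cousin of the SIGTSTP that Ctrl-Z sends) and then SIGCONT, and checks each state change with waitpid(). The function name stop_and_continue is hypothetical, and the sleeping child is only a placeholder for a real process.

```python
import os
import signal
import time

def stop_and_continue():
    """Stop a child, resume it with SIGCONT, and report what waitpid() saw."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(30)   # stand-in for real work
        finally:
            os._exit(0)
    os.kill(pid, signal.SIGSTOP)               # suspend the child
    _, status = os.waitpid(pid, os.WUNTRACED)  # returns once it is stopped
    stopped = os.WIFSTOPPED(status)
    os.kill(pid, signal.SIGCONT)               # same signal as `kill -CONT <pid>`
    _, status = os.waitpid(pid, os.WCONTINUED) # returns once it has resumed
    continued = os.WIFCONTINUED(status)
    os.kill(pid, signal.SIGKILL)               # clean up the demo child
    os.waitpid(pid, 0)
    return stopped, continued

if __name__ == "__main__":
    print(stop_and_continue())
```

Only the parent of a process can observe it with waitpid(); from an unrelated ssh session you would send the signal with kill and check the state with ps instead.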
You can type fg to resume the process. If you have multiple processes, you can type fg processname (e.g. fg vim) or fg job_id.
To find out the job IDs, use the jobs command.
Relevant quote from Wikipedia on what it does:
fg is a job control command in Unix and Unix-like operating systems that resumes execution of a suspended process by bringing it to the foreground and thus redirecting its standard input and output streams to the user's terminal.
To find out job-id and pid, use "jobs -l", like this:
$ jobs -l
[1]+ 3729 Stopped vim clustertst.cpp
The first column is job_id, and the second is pid.
I implemented a simple shell in C that takes in commands like sleep 3 &. I also implemented it to "listen" for SIGCHLD signals once the job completes.
But how do I get the job id and command to be printed out like the ubuntu shell once it is completed?
I would advise against catching SIGCHLD signals.
A neater way to do that is to call waitpid with the WNOHANG option. If it returns 0, you know that the job with that particular pid is still running, otherwise that process has terminated and you fetch its exit code from the status parameter, and print the message accordingly.
Moreover, bash doesn't print the job completion status at the time the job completes, but rather at the time when the next command is issued, so this is a perfect fit for waitpid.
A small disadvantage of that approach is that the job process will stay as a zombie in the period between its termination and the time you call waitpid, but that probably shouldn't matter for a shell.
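A minimal sketch of that polling approach, written with Python's os.waitpid(), which takes the same WNOHANG option as the C call. The names poll_child and start_child are made up here, and the negative return value for a signal death is just a convention chosen for this example.

```python
import os
import time

def start_child(seconds, exit_code):
    """Fork a child that sleeps and then exits with exit_code (demo helper)."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(seconds)
        finally:
            os._exit(exit_code)
    return pid

def poll_child(pid):
    """Return None if pid is still running, else its exit code (non-blocking)."""
    done_pid, status = os.waitpid(pid, os.WNOHANG)
    if done_pid == 0:
        return None                   # child has not changed state yet
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)       # killed by a signal

if __name__ == "__main__":
    pid = start_child(0.5, 3)
    print("right after fork:", poll_child(pid))
    time.sleep(1)
    print("one second later:", poll_child(pid))
```

A shell would typically call something like poll_child for each background job just before printing the next prompt.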
You need to remember the child pid (from the fork) and the command executed in your shell (in some sort of table or map structure). Then, when you get a SIGCHLD, you find the child pid and that gives you the corresponding command.
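Such a table could look roughly like the sketch below. It is written in Python for brevity; the class JobTable, the helper run_background and the simplified "[id] Done command" report format are all assumptions for illustration, not the layout bash uses internally. The reap() method can be called from a SIGCHLD handler or just before each prompt.

```python
import os

class JobTable:
    """Maps child PIDs to (job id, command line) so finished jobs can be named."""

    def __init__(self):
        self.jobs = {}      # pid -> (job_id, command string)
        self.next_id = 1

    def add(self, pid, command):
        job_id = self.next_id
        self.next_id += 1
        self.jobs[pid] = (job_id, command)
        return job_id

    def reap(self):
        """Collect finished jobs without blocking; return one line per job."""
        reports = []
        for pid in list(self.jobs):
            done_pid, _status = os.waitpid(pid, os.WNOHANG)
            if done_pid == 0:
                continue    # still running
            job_id, command = self.jobs.pop(pid)
            reports.append(f"[{job_id}] Done {command}")
        return reports

def run_background(table, argv):
    """Fork/exec argv without waiting and record it in the table."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)   # only reached if exec failed
    return table.add(pid, " ".join(argv)), pid

if __name__ == "__main__":
    table = JobTable()
    job_id, pid = run_background(table, ["sleep", "1"])
    print(f"[{job_id}] {pid}")
    os.waitpid(pid, os.WUNTRACED | os.WNOHANG)  # non-blocking peek, job still listed
```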