What is the first process a typical Linux kernel starts? - linux

I searched on the internet for which is the first process which gets executed upon system startup.
I found two answers which are init and sched. What is it really?
Which gets executed first? sched process or init process?

Typically it is the init process, the path of which is hard coded into the kernel itself. init performs very low level functions like starting upstart in the case of Ubuntu (prior to 15.40) or systemd in the case of Ubuntu 15.04 and later, Arch, Fedora, and others, which load the remaining processes and setup. Note that the system is not done booting when init runs - that is a common misconception. In fact, init sets up your login screen and other related tasks. Here's a WikiPedia page on init: https://en.wikipedia.org/wiki/Linux_startup_process#SysV_init
Init is the father of all processes. Its primary role is to create processes from a script stored in the file /etc/inittab. This file usually has entries which cause init to spawn gettys on each line that users can log in. It also controls autonomous processes required by any particular system. A run level is a software configuration of the system which allows only a selected group of processes to exist. The processes spawned by init for each of these run levels are defined in the /etc/inittab file.
However, the Linux kernel does start the scheduler but it is not in userspace, which is what most people associate as the home for a process. Also, the Bourne Shell (/bin/sh) can be substituted if the init is missing or cannot be called. You can also in theory substitute it for any executable by using the init=*some path here* Linux kernel boot option.

Its sched, as per Linux 3.13 start kernel() first calls sched_init() and runs first user space process init i.e rest_init() creates a kernel thread passing another function kernel_init() as the entry point and kernel goes to idle unless called.
start_kernel() {
...
sched_init();
rest_init(); calls function kernel_init();
}

The kernel at least has one runnable process, which is known as the idle task, swapper, init_task and sched. They are different names of the same process whose pid is 0. This init_task is a global variable of kernel, so it have a fixed address, you can see it from System.map by command grep 'D init_task' /boot/System.map-*. The address wouldn't change unless you recompile the kernel.
The program init whose pid is 1, spawned by init_task(pid 0). In Ubuntu, The program init is the Upstart process management daemon, while in other systems, it could be systemd. The address of init changed every time while system rebooting.
So, process 0 run first, then spawn process 1.

You can try
pstree 0
it will show the all process hierarchy in tree form right from the children on sched process (PID 0). No doubt init is the parent of all process but sched gets executed before init and spaws both init and kthread.
You can also see the PPID (i.e. process id of parent process ) using:
ps -eaf
You will notice it to be 0 for both init and kthread.

Swapper is the first process running. It has pid 0.

Related

Is changing parent process necessary when daemonize a process?

I am reading about daemonizing a process at https://en.wikipedia.org/wiki/Daemon_%28computing%29#Creation
In a strictly technical sense, a Unix-like system process is a daemon
when its parent process terminates and the daemon is assigned the init
process (process number 1) as its parent process and has no
controlling terminal. However, more commonly, a daemon may be any
background process, whether a child of the init process or not.
On a Unix-like system, the common method for a process to become a
daemon, when the process is started from the command line or from a
startup script such as an init script or a SystemStarter script,
involves:
Dissociating from the controlling tty
Becoming a session leader
Becoming a process group leader
Executing as a background task by forking and exiting (once or twice). This is required sometimes for the process to become a session
leader. It also allows the parent process to continue its normal
execution.
Setting the root directory (/) as the current working directory so that the process does not keep any directory in use that may be on
a mounted file system (allowing it to be unmounted).
Changing the umask to 0 to allow open(), creat(), and other operating system calls to provide their own permission masks and not
to depend on the umask of the caller
Closing all inherited files at the time of execution that are left open by the parent process, including file descriptors 0, 1 and 2
for the standard streams (stdin, stdout and stderr). Required files
will be opened later.
Using a logfile, the console, or /dev/null as stdin, stdout, and stderr
If the process is started by a super-server daemon, such as inetd,
launchd, or systemd, the super-server daemon will perform those
functions for the process[5][6][7] (except for old-style daemons not
converted to run under systemd and specified as Type=forking[7] and
"multi-threaded" datagram servers under inetd[5]).
Is there a step there that changes the parent process of a process
to be daemonized? It seems to me none of the steps does that?
Is changing parent process necessary when daemonize a process?
After changing the parent process of a process (a process not necessarily
to be daemonized), can the process be associated to the controlling
tty of the new parent process? (The purpose of the question is to
see whether "keeping a process disassociated from the the
controlling tty of the new parent process" is a necessary condition
of "changing the parent process of the process".)
See my related question https://unix.stackexchange.com/questions/266565/daemonize-a-process-in-shell
Thanks.
The parent of a Unix process can't be changed by the process itself. The typical method of creating a daemon involves a fork call (which creates the process that will become the daemon). The initial process then exits, and the newly-orphaned child process will be inherited by the init process which becomes it's new parent. That's handled in step 4. The only thing init will do is wait for all it's children to exit. init doesn't have a controlling TTY, so once inherited by init the daemon can't become associated with a controlling TTY anymore. The main reason to become disassociated is to prevent signals generated from the TTY (hangups and control-C's etc.) from getting to the daemon.
There are two ways daemons are usually run:
From a shell script. The script runs the daemon's executable with the & operator at the end of the command to put the daemon into the background, possibly with I/O redirection to set the daemon's stdin, stdout and/or stderr, and then exits leaving the daemon without a parent. Running an executable from the shell involves the shell doing a fork followed by an exec in the child process of the executable to be run.
The daemon program has an option to daemonize itself. When run with that option it does a fork followed in the child process by an exec of itself with an appropriate set of arguments. The parent will normally exit after the fork since the work it's been asked to do is done. If it doesn't, the child process needs an extra fork to give it a parent that can exit. NB: this is why so many programs that normally run as daemons can be run directly without becoming a daemon, the "become a daemon" option causes the child process to close stdin/stdout/stderr and then just exec it's own executable without the "become a daemon" option.
I would suggest to use daemon(3). See also credentials(7)
Your list does not mention explicitly setsid(2).
MUSL libc has a legacy/daemon.c which forks twice and do setsid

How to list threads were killed by the kernel?

Is there any way to list all the killed processes in a linux device?
I saw this answer suggesting:
check in:
/var/log/kern.log
but it is not generic. there is any other way to do it?
What I want to do:
list thread/process if it got killed. What function in the kernel should I edit to list all the killed tid/pid and their names, or alternitavily is there a sysfs does it anyway?
The opposite of do_fork is do_exit, here:
do_exit kernel source
I'm not able to find when threads are exiting, other than:
release_task
I believe "task" and "thread" are (almost) synonymous in Linux.
First, task and thread contexts are different in the kernel.
task (using tasklet api) runs in software interrupt context (meaning you cannot sleep while you are in the task ctx) while thread (using kthread api, or workqueue api) runs the handler in process ctx (i.e. sleep-able ctx).
In both cases, if a thread hangs in the kerenl, you cannot kill it.
if you run "ps" command from the shell, you can see it there (normally with "[" and "]" braces) but any attempt to kill it won't work.
the kernel is trusted code, such a situation shouldn't happen, and if it does, it indicates a kernel (or kernel module) bug.
normally the whole machine will hand after a while because the core running that thread is not responding (you will see a message in /var/log/messages or the console with more info) in some other cases the machine may survive but that specific core is dead. depends on the kernel configuration.

Secure method for killing a process and all offspring

I am creating a sandbox environment in Linux using apparmor, setrlimit, cap_set_rpoc to let anonymous users basically execute some arbitrary code on my server in the context of a scientific application. One thing that is specifically allowed in the sandbox is starting new processes by forking and calling executables (although the total number of processes by one user is limited by RLIMIT_NPROC).
After a given time period, say 1 minute, the system will kill the main process, and all of the potential children. I am currently relying on the process group id to identify children. However, in theory, a child process could call setpgid to change its process group, so that it will no longer be affected when I call kill(-1 * pid) on the main process id (correct?). Unfortunately, there is no linux capability that I can set to prevent processes from calling setpgid.
What would be a robust way of killing a process and all of its (recursive) children, which would make it very hard for the children to somehow "escape" the massacre and continue as orphan processes?
If you use lxc (Linux containers) to isolate each process tree, then you can use lxc-stop to kill all the processes in a container. See the "Starting / Stopping a container" section of the lxc manual page.

Is it possible to completely manage the life cycle of a process and its forks?

Consider a system that manages user-defined programs:
A program can be anything. Its command line is defined by non-privileged users in some configuration file. It could be /bin/ls, it could be /usr/sbin/apache; the user may specify whatever he is permitted to start.
Each program is run as a non-root user.
Any given user can configure any number of programs.
Each program runs for as long as it wants.
Each program may call fork(), exec() etc.
Each program may set itself as a session leader (ie., setsid()).
The system that starts the programs might not run continuously. It starts a program, then quits.
The action "stop all of program P's processes, including children/forks" must be possible.
The action "find all processes belonging to program P" must be possible.
Here's the question: How can one provide such a system within the Linux process model?
The naive method:
Start program with fork(), exec(), setuid(), etc..
Write the child PID (plus its start timestamp, from /proc/stat, to uniquely and permanently identify it) to a file.
To stop a single process, set SIGTERM to PID.
To find all processes, inspect /proc to build the process hiearchy based on the PID.
This method has a big hole: Any process may fork and break out of its process group. It's not sufficient to look at the process hierarchy. After a program has created new processes, it's not possible to trace their origin back to the original program.
A workaround would be to ensure that each program is started with a unique UID. This is not desirable or particularly workable, since a (human) user may define any number of programs; the system would then have to programmatically create new, unique users for each program.
My only idea so far is to inject a special, reserved environment variable into the program's initial process, ie., run the program with env PROGRAM=myprogram <command line>. The system could then mandate that all processes must inherit their parent's environment. At regular intervals, the system could trawl /proc and forcibly kill any process missing the PROGRAM environment variable.
Are there any secrets in the Linux syscall API that I could use?
(1) The action "stop all of program P's processes, including children/forks" must be possible. (2) The action "find all processes belonging to program P" must be possible.
cgroups implement this, and systemd is perhaps the heaviest user to date to make use of (2) to achieve (1). You can break out of progress groups, but not cgroups.

How is the pid (process id) allocated for a daemon process in Linux kernel?

I researched the Linux kernel code (2.6.11) about the creation of a process/thread, and followed do_fork()->alloc_pidmap()
It seems that alloc_pidmap always returns pid > 300 once the previous pid's ever reached the max pid, while actually daemon's pid is always < 300 (Is this correct?).
Does a daemon obtain its pid using a function other than alloc_pidmap()? If so, does it imply the daemon process is not created using do_fork?
AFAIK pid are allocated by the kernel; the limit of 300 (i.e. #define RESERVED_PIDS 300 private inside kernel/pid.c) you are seeing is perhaps because on most systems, several processes have been forked early in the bootstrap (e.g. from initrd perhaps).
You could test by booting from GRUB directly into a kernel with init=/bin/sh
Some processes are kernel processes (without userland code, e.g. kworker or kauditd), which are not started by fork from init or descendants. They are probably started with kthread_create inside the kernel (and without any syscall).
And you should explain why are you asking that. Is your question about determining if a process is a deamon or not?

Resources