Which one is PID1: /sbin/init or systemd - linux

I'm using Arch Linux. I have read about systemd, and as I understand it, systemd is the first process, and it starts the rest of the processes.
But when I use:
ps -aux
The result shows that /sbin/init has PID 1. And when I use:
pstree -Apn
The result shows that systemd has PID 1. Which is correct? Is /sbin/init starting systemd?

They're probably both right.
$ sudo ls -ltrh /proc/1/exe
[sudo] password for user:
lrwxrwxrwx 1 root root 0 May 30 21:22 /proc/1/exe -> /lib/systemd/systemd
$ echo $(tr '\0' ' ' < /proc/1/cmdline )
/sbin/init splash
$ stat /sbin/init
File: '/sbin/init' -> '/lib/systemd/systemd'
Size: 20 Blocks: 0 IO Block: 4096 symbolic link
Device: 801h/2049d Inode: 527481 Links: 1
Access: (0777/lrwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2017-05-30 21:27:12.058023583 -0500
Modify: 2016-10-26 08:04:58.000000000 -0500
Change: 2016-11-19 11:38:45.749226284 -0600
Birth: -
The commands above show us:
what is the file corresponding to pid 1's executable image?
what was invoked (passed to exec()) when pid 1 was started?
what are the characteristics of the path at /sbin/init?
On my system, /sbin/init is a symlink to /lib/systemd/systemd. Your system is likely set up the same way. We can see what information ps -aux is using by running it under strace.
$ strace ps -aux
...
open("/proc/1/cmdline", O_RDONLY) = 6
read(6, "/sbin/init\0splash\0", 131072) = 18
read(6, "", 131054) = 0
close(6) = 0
...
and likewise for pstree:
$ strace pstree -Apn
...
getdents(3, /* 332 entries */, 32768) = 8464
open("/proc/1/stat", O_RDONLY) = 4
stat("/proc/1", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
fstat(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(4, "1 (systemd) S 0 1 1 0 -1 4194560"..., 8192) = 192
read(4, "", 7168) = 0
open("/proc/1/task", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
...
So the difference in output arises because the two tools use different sources of information. /proc/1/cmdline tells us how the process was invoked, whereas /proc/1/stat shows that the process's name is systemd.
$ cat /proc/1/stat
1 (systemd) S 0 1 1 0 -1 4194560 34371 596544 1358 3416 231 144 298 1758 20 0 1 0 4 190287872 772 18446744073709551615 1 1 0 0 0 0 671173123 4096 1260 0 0 0 17 2 0 0 12188 0 0 0 0 0 0 0 0 0 0
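To compare these sources side by side, here is a minimal shell sketch that prints each name the kernel exposes for a process. The function name proc_names is made up for illustration. /proc/PID/comm holds the same short name that appears in parentheses in /proc/PID/stat.

```shell
# Print the names the kernel exposes for a process.
# Defaults to the current shell when no PID is given.
proc_names() {
    pid="${1:-$$}"
    # Short process name (what pstree shows; same as field 2 of stat).
    printf 'comm:    %s\n' "$(cat "/proc/$pid/comm")"
    # How the process was invoked (what ps -aux shows).
    printf 'cmdline: %s\n' "$(tr '\0' ' ' < "/proc/$pid/cmdline")"
    # The executable image; for another user's process this may need root.
    printf 'exe:     %s\n' "$(readlink "/proc/$pid/exe" 2>/dev/null || echo '(not readable)')"
}

proc_names 1
```

On a system like the one above, the comm line would be expected to say systemd while the cmdline line starts with /sbin/init.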

bash sh script with user permissions 755, cannot be run

First, the permissions, to show that the script is mode 755:
ls -l
-rw-rw-rw- 1 usuario users 4485 dic 2 11:35 x11vnc.log
-rwxr-xr-x 1 usuario users 117 nov 7 14:06 x11vnc.sh
Second, the script file:
cat x11vnc.sh
#!/bin/bash
x11vnc -nap -wait 30 -noxdamage -passwd somepass -display :0 -forever -o ~/x11vnc.log -bg -rfbport 5900
Third, I should clarify the structure of the disks:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 3,6T 0 disk
├─md126 9:126 0 3,6T 0 raid1
│ ├─md126p1 259:3 0 3,6T 0 part /home/usuario
│ └─md126p2 259:4 0 8G 0 part [SWAP]
└─md127 9:127 0 0B 0 md
sdb 8:16 0 3,6T 0 disk
├─md126 9:126 0 3,6T 0 raid1
│ ├─md126p1 259:3 0 3,6T 0 part /home/usuario
│ └─md126p2 259:4 0 8G 0 part [SWAP]
└─md127 9:127 0 0B 0 md
nvme0n1 259:0 0 232,9G 0 disk
├─nvme0n1p1 259:1 0 232,6G 0 part /
└─nvme0n1p2 259:2 0 256M 0 part /boot
I am the user usuario.
I can edit and modify the x11vnc.sh file as I wish, but I can't run it, and I need to run it so that I can add it to the Plasma session autostart.
[usuario@MyPC ~]$ ~/x11vnc.sh
-bash: /home/usuario/x11vnc.sh: permission denied
Why can't I run it?
If I run it in the following way, it works:
[usuario@MyPC ~]$ sh ./x11vnc.sh
PORT=5900
Thank you all, especially to @CharlesDuffy.
I changed the fstab line from
UUID=16b711b6-789f-4c27-9d6c-d0f744407f00 /home/usuario ext4 auto,exec,rw,user,relatime 0 2
to
UUID=16b711b6-789f-4c27-9d6c-d0f744407f00 /home/usuario ext4 auto,rw,user,exec,relatime 0 2
The position of exec is important, since user also implies noexec. Mount options are applied in order, so putting exec after user ensures that exec takes effect. Options that must take precedence should be listed last.
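The ordering rule can be illustrated with a small shell sketch that walks an option list from left to right and reports whether it ends up exec or noexec. This is only a model of the rule described in mount(8), not how mount itself is implemented, and the function name effective_exec is invented for this example.

```shell
# Resolve whether a comma-separated mount option list ends up exec or noexec.
# Options are applied left to right; `user` and `users` imply noexec
# unless a later option overrides it. `defaults` includes exec.
effective_exec() {
    state=exec                      # the default when nothing says otherwise
    old_ifs=$IFS
    IFS=,
    for opt in $1; do
        case "$opt" in
            exec|defaults)      state=exec ;;
            noexec|user|users)  state=noexec ;;
        esac
    done
    IFS=$old_ifs
    echo "$state"
}

effective_exec "auto,exec,rw,user,relatime"   # prints noexec
effective_exec "auto,rw,user,exec,relatime"   # prints exec
```

The first call uses the original fstab options and the second uses the corrected ones.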

Why does this strace on a pipeline not finish

I have a directory with a single file, one.txt. If I run ls | cat, it works fine. However, if I try to strace both sides of this pipeline, I do see the output of the command as well as strace, but the process doesn't finish.
strace ls 2> >(stdbuf -o 0 sed 's/^/command1:/') | strace cat 2> >(stdbuf -o 0 sed 's/^/command2:/')
The output I get is:
command2:execve("/usr/bin/cat", ["cat"], [/* 50 vars */]) = 0
command2:brk(0) = 0x1938000
command2:mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f87e5a93000
command2:access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
<snip>
command2:open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
command2:fstat(3, {st_mode=S_IFREG|0644, st_size=106070960, ...}) = 0
command2:mmap(NULL, 106070960, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f87def8a000
command2:close(3) = 0
command2:fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
command2:fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
command2:fadvise64(0, 0, 0, POSIX_FADV_SEQUENTIAL) = -1 ESPIPE (Illegal seek)
command2:read(0, "command1:execve(\"/usr/bin/ls\", ["..., 65536) = 4985
command1:execve("/usr/bin/ls", ["ls"], [/* 50 vars */]) = 0
command1:brk(0) = 0x1190000
command1:mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae869c3000
command1:access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
<snip>
command1:close(3) = 0
command1:fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
command2:write(1, "command1:close(3) "..., 115) = 115
command2:read(0, "command1:mmap(NULL, 4096, PROT_R"..., 65536) = 160
command1:mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae869c2000
one.txt
command1:write(1, "one.txt\n", 8) = 8
command2:write(1, "command1:mmap(NULL, 4096, PROT_R"..., 160) = 160
command2:read(0, "command1:close(1) "..., 65536) = 159
command1:close(1) = 0
command1:munmap(0x7fae869c2000, 4096) = 0
command1:close(2) = 0
command2:write(1, "command1:close(1) "..., 159) = 159
command2:read(0, "command1:exit_group(0) "..., 65536) = 53
command1:exit_group(0) = ?
command2:write(1, "command1:exit_group(0) "..., 53) = 53
command2:read(0, "command1:+++ exited with 0 +++\n", 65536) = 31
command1:+++ exited with 0 +++
command2:write(1, "command1:+++ exited with 0 +++\n", 31) = 31
and it hangs from then on. ps reveals that both commands in the pipeline (ls and cat here) are running.
I am on RHEL7 running Bash version 4.2.46.
I put a strace on your strace:
strace bash -c 'strace true 2> >(cat > /dev/null)'
It hangs on a wait4, indicating that it's stuck waiting on children. ps f confirms this:
24740 pts/19 Ss 0:00 /bin/bash
24752 pts/19 S+ 0:00 \_ strace true
24753 pts/19 S+ 0:00 \_ /bin/bash
24755 pts/19 S+ 0:00 \_ cat
Based on this, my working theory is that this is a deadlock, because:
strace waits on all of its children, even the ones it didn't spawn directly.
Bash spawns the process substitution as a child of the process that goes on to become strace. Since the process substitution reads from strace's stderr, it cannot see end-of-file and exit until strace itself exits.
This suggests at least two workarounds, both of which appear to work:
strace -D ls 2> >(nl)
{ strace ls; true; } 2> >(nl)
-D, to quote the man page, "[runs the] tracer process as a detached grandchild, not as parent of the tracee". The second one forces bash to fork once more to run strace, by giving it another command to run afterwards.
In both cases, the extra forks mean that the process substitution doesn't end up as strace's child, avoiding the issue.

Run program on boot with initramfs

I'm running uClinux on a SmartFusion2 as part of a university team building a small cube satellite. However, I'm not very experienced with the Linux kernel, and this issue has had me stumped for a few days. I'm trying to get the SmartFusion2 to run a program on bootup. Currently, the only .uImage that does this is the test 'hello' file. I'm trying to recreate the process for another program, but am running into some difficulties.
In my hello directory I have the following files: hello.busybox, hello.kernel.M2S, help.txt, hello.uImage, Makefile, hello.initramfs, hello (directory)
In the hello subdirectory (projects/hello/hello):
hello (executable), hello.c, hello.gdb, hello.h, hello.o, Makefile
To try to get the uImage to boot and run a different program, I made a copy of my projects/hello/hello directory and renamed it 'goodbye', with a few minor changes in the .h and .c files for testing purposes. Now I'm trying to get the executable 'hello' in projects/hello/goodbye to run on boot.
My initramfs file originally looked like this:
# This is a very simple, default initramfs
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
nod /dev/tty 0666 0 0 c 5 0
nod /dev/null 0600 0 0 c 1 3
nod /dev/mem 0600 0 0 c 1 1
nod /dev/kmem 0600 0 0 c 1 2
nod /dev/zero 0600 0 0 c 1 5
nod /dev/random 0600 0 0 c 1 8
nod /dev/urandom 0600 0 0 c 1 9
dir /dev/pts 0755 0 0
nod /dev/ptmx 0666 0 0 c 5 2
nod /dev/ttyS0 0666 0 0 c 4 64
nod /dev/ttyS1 0666 0 0 c 4 65
nod /dev/ttyS2 0666 0 0 c 4 66
nod /dev/ttyS3 0666 0 0 c 4 67
nod /dev/ttyS4 0666 0 0 c 4 68
nod /dev/ttyS5 0666 0 0 c 4 69
dir /bin 755 0 0
dir /proc 755 0 0
file /bin/hello ${INSTALL_ROOT}/projects/${SAMPLE}/hello/hello 755 0 0
slink /bin/init hello 777 0 0
I changed the last two lines of the initramfs to read as follows:
file /bin/hello ${INSTALL_ROOT}/projects/${SAMPLE}/hello/goodbye 755 0 0
slink /bin/init hello 777 0 0
But when I try to boot the SmartFusion2 after remaking the uImage, I get this, with the error at the bottom:
Starting kernel ...
Linux version 2.6.33-arm1 (ecenstudent@EE10308) (gcc version 4.4.1 (Sourcery G++ Lite 2010q1-189) ) #38 Thu May 25 09:09:08 MDT 2017
CPU: ARMv7-M Processor [412fc231] revision 1 (ARMv7M)
CPU: NO data cache, 8K instruction cache
Machine: Microsemi M2S
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 16256
Kernel command line: m2s_platform=m2s-fg484-som console=ttyS0,115200 panic=10 ip=10.2.118.102:10.2.118.101:192.168.0.1::m2s-fg484-som:eth0:off ethaddr=3C:FB:96:05:00:53
PID hash table entries: 256 (order: -2, 1024 bytes)
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 64MB = 64MB total
Memory: 64408k/64408k available, 1128k reserved, 0K highmem
Virtual kernel memory layout:
vector : 0x00000000 - 0x00001000 ( 4 kB)
fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
vmalloc : 0x00000000 - 0xffffffff (4095 MB)
lowmem : 0xa0000000 - 0xa4000000 ( 64 MB)
modules : 0xa0000000 - 0x01000000 (1552 MB)
.init : 0xa0008000 - 0xa0012000 ( 40 kB)
.text : 0xa0074bc0 - 0xa0083000 ( 58 kB)
.data : 0xa0084000 - 0xa008cce0 ( 36 kB)
Hierarchical RCU implementation.
NR_IRQS:83
Calibrating delay loop... 132.30 BogoMIPS (lpj=661504)
Mount-cache hash table entries: 512
Switching to clocksource mss_timer2
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
serial8250.0: ttyS0 at MMIO 0x40000000 (irq = 10) is a 16550A
console [ttyS0] enabled
serial8250.1: ttyS1 at MMIO 0x40010000 (irq = 11) is a 16550A
Freeing init memory: 40K
Kernel panic - not syncing: No init found. Try passing init= option to kernel.
Backtrace: no frame pointer
Rebooting in 10 seconds..
Can somebody help explain why this is happening and what I need to do to my initramfs to make it run the proper program on boot? Thanks!!
As it turns out, I was confused about how those two lines worked. When I finally figured it out, they looked like this:
file /bin/hello ${INSTALL_ROOT}/projects/${SAMPLE}/goodbye/hello 755 0 0
slink /bin/init hello 777 0 0
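The second field of the file line is the source path on the build host, so it had to point at the hello executable inside the goodbye directory. Then it worked as desired, and I was able to apply the same approach to other uImages.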
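A mistake like this can be caught before building the image by checking that every file entry points at something that exists on the build host. The following is a rough sketch under two assumptions: variables such as ${INSTALL_ROOT} and ${SAMPLE} have already been expanded in the list being checked, and the function name check_initramfs_sources is hypothetical.

```shell
# Report `file` entries in an initramfs list whose host-side source is missing.
# Entry format: file <name-in-image> <source-on-host> <mode> <uid> <gid>
check_initramfs_sources() {
    status=0
    # The `|| [ -n "$kind" ]` part also handles a last line with no newline.
    while read -r kind name source rest || [ -n "$kind" ]; do
        [ "$kind" = "file" ] || continue     # skip dir, nod, slink, comments
        if [ ! -f "$source" ]; then
            echo "missing source for $name: $source" >&2
            status=1
        fi
    done < "$1"
    return "$status"
}
```

Called with the path of an expanded list, it returns non-zero and names the offending entry if a source file is missing.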

chown does not set SGID

I am trying to create a directory with permissions 02770, so that the resulting permissions would be drwxrws---
When I run the below commands I get the expected behavior
rsam.svtest2.serendipity> (/home/svtest2)
$ mkdir abc
rsam.svtest2.serendipity> (/home/svtest2)
$ ls -lrt
drwxrwxr-x 2 svtest2 users 6 Apr 18 10:57 abc
rsam.svtest2.serendipity> (/home/svtest2)
$ chmod 02770 abc
rsam.svtest2.serendipity> (/home/svtest2)
$ ls -lrt
drwxrws--- 2 svtest2 users 6 Apr 18 10:57 abc
UPDATE#1
Following from above, after running mkdir and chmod on a directory, when I run chown the SGID bit gets cleared off.
rsam.svtest2.serendipity> (/home/svtest2)
$ chown svtest2:users abc
rsam.svtest2.serendipity> (/home/svtest2)
$ ls -lrt
drwxrwx--- 2 svtest2 users 6 Apr 18 10:57 abc
From the chown documentation,
Only a privileged process (Linux: one with the CAP_CHOWN capability)
may change the owner of a file. The owner of a file may change the
group of the file to any group of which that owner is a member. A
privileged process (Linux: with CAP_CHOWN) may change the group
arbitrarily.
The problem is that my user svtest2 does not have the CAP_CHOWN capability.
Now the question boils down to: how do I get the user to have the CAP_CHOWN capability?
It looks like there are some instructions here - SO - setting CAP_CHOWN
but I have yet to try them out.
However, when I run the C++ code below (part of a Tuxedo server)
// Check if the directory exists and if not creates the directory
// with the given permissions.
struct stat st;
int lreturn_code = stat(l_string, &st);
if (lreturn_code != 0 &&
    (mkdir(l_string, lpermission) != 0 ||
     chmod(l_string, lpermission) != 0)) {
    ....
    ....
}
....
....
// Convert group name to group id into lgroup
if (chown(l_string, -1, lgroup) != 0) {
    // System error.
}
The directory is created as below:
$ ls -l|grep DirLevel1
drwxrwx--- 2 svtest2 users 6 Apr 18 11:14 DirLevel1
Notice that the SGID bit is not set, unlike when the commands were run directly as shown above.
Excerpt from strace for the operation:
5864 stat("/home/svtest2/data/server/log/DirLevel1/", 0x7ffd235f29f0) = -1 ENOENT (No such file or directory)
5864 mkdir("/home/svtest2/data/server/log/DirLevel1/", 02770) = 0
5864 chmod("/home/svtest2/data/server/log/DirLevel1/", 02770) = 0
5864 socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 15
5864 connect(15, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
5864 close(15) = 0
5864 socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 15
5864 connect(15, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
5864 close(15) = 0
5864 open("/etc/group", O_RDONLY|O_CLOEXEC) = 15
5864 fstat(15, {st_mode=S_IFREG|0644, st_size=652, ...}) = 0
5864 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f414d7c4000
5864 read(15, "root:x:0:\nbin:x:1:\ndaemon:x:2:\ns"..., 4096) = 652
5864 close(15) = 0
5864 munmap(0x7f414d7c4000, 4096) = 0
5864 chown("/home/svtest2/data/server/log/DirLevel1/", 4294967295, 100) = 0
5864 write(7, "\0\0\2~\6\0\0\0\0\0\21i\216\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\1\0\0"..., 638) = 638
5864 read(7, "\0\0\0\300\6\0\0\0\0\0\10\0\0\0\0\250\0\0\0\0\0\0\0\0\0(\0\0\0\0\0\0"..., 8208) = 192
5864 write(7, "\0\0\1}\6\0\0\0\0\0\3h\221\1\0\0\0\0\0\0\0\376\377\377\377\377\377\377\377\250\0\0"..., 381) = 381
5864 read(7, "\0\0\0\26\6\0\0\0\0\0\10\4\0\0\0\t\1\0\0\0\215\f", 8208) = 22
5864 msgsnd(43679799, {805306373, "y\0\0\0007\200\232\2\0\0\0\0\f\2\0\0\0\0\0\200\0\0\0\0\0\0\0\0\0\0\0\0"...}, 516, IPC_NOWAIT) = 0
5864 msgrcv(43614264,
From http://man.sourcentral.org/RHEL7/2+chown,
When the owner or group of an executable file are changed by an
unprivileged user the S_ISUID and S_ISGID mode bits are cleared. POSIX
does not specify whether this also should happen when root does the
chown(); the Linux behavior depends on the kernel version. In case of
a non-group-executable file (i.e., one for which the S_IXGRP bit is
not set) the S_ISGID bit indicates mandatory locking, and is not
cleared by a chown().
The above highlights a possible scenario, but I'm not sure how it applies to my case, because this is a directory rather than an executable file.
Since *nix systems treat a file as executable based on its 'x' permission bit, I believe a searchable directory might be treated as executable, too.
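If the chown is what clears the bit, one possible workaround is to apply the mode again after changing the group, instead of before. The sketch below shows that ordering with shell commands in a throwaway directory. It uses the caller's own primary group, and whether the first chown clears the bit at all depends on privileges and on the file system.

```shell
# Create a setgid directory, change its group, then re-apply the mode.
# Uses a throwaway directory and the caller's own primary group.
dir="$(mktemp -d)/DirLevel1"
mkdir "$dir"
chmod 2770 "$dir"
chown ":$(id -g)" "$dir"    # may clear the setgid bit
chmod 2770 "$dir"           # re-apply after chown so the bit is set at the end
stat -c '%A %n' "$dir"      # GNU stat; the mode should read drwxrws---
```

In the C++ code, the equivalent change would be to call chmod() after chown() rather than before it.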

Linux proc/pid/fd for stdout is 11?

I am executing a script with stdout redirected to a file, so /proc/$$/fd/1 should point to that file (since the stdout fileno is 1). However, the actual fd of the file is 11. Please explain why.
Here is the session:
$ cat hello.sh
#!/bin/sh -e
ls -l /proc/$$/fd >&2
$ ./hello.sh > /tmp/1
total 0
lrwx------ 1 nga users 64 May 28 22:05 0 -> /dev/pts/0
lrwx------ 1 nga users 64 May 28 22:05 1 -> /dev/pts/0
lr-x------ 1 nga users 64 May 28 22:05 10 -> /home/me/hello.sh
l-wx------ 1 nga users 64 May 28 22:05 11 -> /tmp/1
lrwx------ 1 nga users 64 May 28 22:05 2 -> /dev/pts/0
I have a suspicion, but this is highly dependent on how your shell behaves. The file descriptors you see are:
0: standard input
1: standard output
2: standard error
10: the running script
11: a backup copy of the script's normal standard out
Descriptors 10 and 11 are close on exec, so won't be present in the ls process. 0-2 are, however, prepared for ls before forking. I see this behaviour in dash (Debian Almquist shell), but not in bash (Bourne again shell). Bash instead does the file descriptor manipulations after forking, and incidentally uses 255 rather than 10 for the script. Doing the change after forking means it won't have to restore the descriptors in the parent, so it doesn't have the spare copy to dup2 from.
The output of strace can be helpful here.
The relevant section is
fcntl64(1, F_DUPFD, 10) = 11
close(1) = 0
fcntl64(11, F_SETFD, FD_CLOEXEC) = 0
dup2(2, 1) = 1
stat64("/home/random/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/usr/local/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/usr/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/bin/ls", {st_mode=S_IFREG|0755, st_size=96400, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75a8938) = 22748
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 22748
--- SIGCHLD (Child exited) @ 0 (0) ---
dup2(11, 1) = 1
So, the shell duplicates the existing stdout onto the lowest available file descriptor numbered 10 or higher (here, 11), then copies the existing stderr onto its own stdout (due to the >&2 redirect), then restores 11 to its own stdout when the ls command has finished.
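The same save, redirect and restore sequence can be written out by hand with exec, which may make the strace output easier to follow. This is only an illustration of the idea: the function name is invented, and descriptor 9 is an arbitrary spare descriptor chosen for the sketch, whereas the shell in the trace picked 11 on its own.

```shell
# Do by hand what the shell does internally for `cmd >&2`.
demo_save_restore() {
    exec 9>&1            # save the current stdout on a spare descriptor
    exec 1>&2            # point stdout at stderr, as `>&2` would
    echo "to stderr"     # written to fd 1, which is now stderr
    exec 1>&9            # restore the saved stdout
    exec 9>&-            # close the spare copy
    echo "to stdout"     # fd 1 is the original stdout again
}

demo_save_restore
```

While the first echo runs, listing /proc/$$/fd would show the original stdout on the spare descriptor, just as it showed up on 11 in the question.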
