problems during linux kernel init - linux

What are the possible culprits for crashes during kernel init?
I am running a kernel that has initramfs, the inittab is very basic rcS (as sysinit) and getty (respawn). While booting I don't get any error message, however the init gives me this message:
S0 respawning too fast: disabled for 5 minutes, where S0 is actually the respawn::getty line(it seems as getty keep crashing), also none of the messages generated by the rcS are seen on the console (I assume that rcS commands also crashe).
If I force the kernel to go to /bin/sh (instead of /init) I can call rcS manually and I get no errors, same happens for getty (if I call getty with the same params from inittab it works fine).
I am wondering what are the difference between the way init spawns processes and the way /bin/sh does.

Some OS's log init respawns to wtmp, you might want to check there. Turning up your syslog might also help.
When you kick off getty via /bin/sh, does it stay running? AFAIK, the trick with init respawn is that the PID it generates is monitored and if it goes down it kicks off another one.

The stock /bin/sh is not built static, nor is getty. You need to look at the
shared library dependencies of /bin/sh and getty that all the libraries are present.
You can use ldd or 'readelf -a' to see the shared library dependencies.

Maybe nothing is setting up /dev/tty1, /dev/tty2, etc, but stuff is running ok on /dev/console (which is not the same thing as /dev/tty1). If you're depending on a /dev directory being in your initramfs or root filesystem, check those.
Probably the main difference between init=/bin/sh and letting init spawn stuff is the /dev/console vs. /dev/ttyx. I can't think of anything else that would be relevant. Keep in mind that the initramfs does run first, I think.
And BTW, you're obviously past the kernel init stage if init(8) or /bin/sh can run.

Related

inittab not restarting service after service crash in Red Hat 6.7

NOTE: I am running Red Hat 6.7
I have a service that is configured with the Linux init system to start a process as a service when the machine boots. This was done by doing this one-time configuration from the command line:
ln -snf /home/me/bin/my_service /etc/init.d/my_service
chkconfig --add my_service
chkconfig --level 235 my_service on
When the OS reboots, the service starts as expected.
I ALSO need the service to be restarted if the service (my_service) crashes. From what I've read, all I need to do is add an entry to /etc/inittab that looks like this:
mysvc:235:respawn:/home/me/bin/my_service_starter
Where my_service_starter looks like:
#!/bin/bash
/home/me/bin/my_service start
My understanding is that when the init system detects that my_service is not running, it will attempt to restart it by running "my_service_starter".
However this does not seem to be working.
I need to understand how to tell the Linux init system to restart my service when the service crashes.
Given an entry like:
mysvc:235:respawn:/home/me/bin/my_service_starter
Then inittab will:
call /home/me/bin/my_service_starter
which will call /home/me/bin/my_service start
...and then exit, so init will thing your service has failed
so init will call /home/me/bin/my_service_starter again
...and so forth, which will result in init deciding that your script is respawning too fast, after which it will ignore it completely.
A process started by inittab is not expected to exit. If you really want to use inittab to maintain your service, you could remove /etc/init.d/my_service, and then in /etc/inittab you would have something like:
mysvc:235:respawn:/home/me/bin/my_service
And you would need to ensure that my_service runs in the foreground (some programs automatically daemonize by default, although these will often have some sort of --run-in-foreground flag).
If you upgrade to CentOS 7 or something else with systemd, this all becomes easier.
You can also investigate "third-party" process supervisors like "supervisord" or "runit" that you could use for process monitoring/restarting on CentOS 6.
Update
As mangotang points out, and I forgot, RHEL 6 actually shipped with upstart, even though it used almost exclusively SysV-style init scripts. So a better solution would be to create an upstart service instead. There are some reasonable getting-started docs here.
On RHEL 6.X, at top of the /etc/inittab file it says:
# inittab is only used by upstart for the default runlevel.
#
# ADDING OTHER CONFIGURATION HERE WILL HAVE NO EFFECT ON YOUR SYSTEM
RHEL 6.X uses Upstart instead of the System V init system. See the man pages for initctl(8) and initctl(5), or ask Google about Upstart.

Running gvfs after building

I am trying to run a local build of gvfs. I have followed the Newcomers document to set up a working build environment, built gvfs from sources and am now trying to figure out how to run it.
The docs have instructions on running applications or the GNOME shell, which say I need to kill the current instance, then launch the newly-built binary with jhbuild run, as in:
$ killall gnome-weather
$ jhbuild run gnome-weather
or, in the case of the shell,
$ jhbuild run gnome-shell --replace
For gvfs, I see that it spawns a bunch of processes (all children of P1 running under my account), the first of them (lowest PID) being gvfsd. So I tried the following:
$ killall gvfsd
$ jhbuild run gvfs
Which gives me the error message:
jhbuild run: Unable to execute the command 'gvfs': [Errno 2] No such file or directory
If instead I try
$ jhbuild run gvfsd
I get the same message. Same when I try any of the above two with --replace.
Since gvfs is a daemon rather than an application, I searched around a bit and came across this post, which suggests launching daemons with
jhbuild run dbus-launch --exit-with-session name-of-daemon
No joy either... no matter whether I use gvfs or gvfsd for the name, I get the error message
Couldn't exec gvfs: No such file or directory
(reporting the name I specified in the command).
Is this the correct way to launch gvfs at all? If not, what is? If it is, how can I find out what's going wrong?
EDIT: Apparently, the code I intend to modify is part of the gvfs-mtp-volume-monitor binary – but essentially the same goes here. How do I launch my own version of the binary rather than the one that came with my OS distro?
jhbuild run can be used for gvfs in the same manner.
For gvfsd do the following:
jhbuild run ~/jhbuild/install/libexec/gvfsd -r
The -r switch tells gvfsd to replace any running version. gvfsd will also start gvfsd-fuse if it was built and you didn't disable it via a command-line switch.
You will also need to replace any volume monitors (and other processes you need), such as:
killall gvfs-mtp-volume-monitor
jhbuild run ~/jhbuild/install/libexec/gvfs-mtp-volume-monitor
Care must be taken with anything that is invoked over dbus:
Namespaces may change between versions. If that happened between the version shipped with your OS and the current one, the latter will not work unless you tweak your dbus config to reflect that.
If dbus is used to spawn processes, it will fall back to the binaries shipped with your OS. Again you would need to modify your dbus config (specifically .service entries) to point to your binaries.

Starting a program in a chroot environment returns immediately

I am working in a virtual environment, trying to start open vm tools in a chroot environment.
I tested with bash and it seems to work fine.
I used ./configure --options --prefix=/home/chroot_env to install the program, then using ldd on vmtoolsd, i copied the corresponding libraries to the /lib directory.
Now when I start chroot /home/chroot_env /bin/vmtoolsd, nothing happens, the chroot returns directly. Launching the same binary in the normal environment does work.
Does someone have an idea why it isn't working, the correct libraries are there, and it works with bash.
EDIT : strace showed that vmtoolsd is trying to access /dev/console, I added mount --bind /dev/ /home/chroot_env/dev/ but it is still failing.
EDIT2 : another strace showed it was looking for another plugin loaded dynamically, i added it and it worked, conclusion strace is great for debugging such issue!
When you run a program and nothing happens, you can always run it with strace in order to see which syscalls are made. This is an easy way to obtain the list of the files (regular or not) that are opened. In your case, check that your program doesn't try to access a file that is not in the chroot.

Is it correct to use the rc.local file to start a program when the system starts?

I want to start a java program ( class ) when the system starts ( even though when no user has logged in the system ). I read some documentation and I found that Linux executes the init command when booting and executes the scripts in the rc.d directory. And the rc.local file is executed last. So I wonder if it is right to edit this file in order to run an application when the system has finished booting up ?
It depends. If your program
is a purely local modification and
you don't need daemon control over the software (start/stop it later at system run) and
a simple SIGTERM suffices to stop your program on system shutdown,
then yes, rc.local is just the point to put it.
If you intend to install the file on other systems, put a file in /etc/init.d and use the system specific commands to let it run at the appropriate runlevels.

gdb appears to ignore executable capabilities

I am debugging a program that makes use of libnetfilter_queue. The documentation states that a userspace queue-handling application needs the CAP_NET_ADMIN capability to function. I have done this using the setcap utility as follows:
$ sudo setcap cap_net_raw,cap_net_admin=eip ./a.out
I have verified that the capabilities are applied correctly as a) the program works and b) getcap returns the following output:
$ getcap ./a.out
./a.out = cap_net_admin,cap_net_raw+eip
However, when I attempt to debug this program using gdb (e.g. $ gdb ./a.out) from the command line, it fails on account of not having the correct permissions set. The debugging functionality of gdb works perfectly otherwise and debugs as per normal.
I have even attempted to apply these capabilities to the gdb binary itself to no avail. I did this as it seemed (as documented by the manpages that the "i" flag might allowed the debugee to inherit the capability from the debugger.
Is there something trivial I am missing or can this really not be done?
I run into same problem and at beginning I thought the same as above that maybe gdb is ignoring the executable's capability due to security reason. However, reading source code and even using eclipse debugging gdb itself when it is debugging my ext2fs-prog which opens /dev/sda1, I realize that:
gdb is no special as any other program. (Just like it is in the matrix, even the agents themselves they obey the same physical law, gravity etc, except that they are all door-keepers.)
gdb is not the parent process of debugged executable, instead it is grand father.
The true parent process of debugged executable is "shell", i.e. /bin/bash in my case.
So, the solution is very simple, apart from adding cap_net_admin,cap_net_raw+eip to gdb, you have also apply this to your shell. i.e. setcap cap_net_admin,cap_net_raw+eip /bin/bash
The reason that you have also to do this to gdb is because gdb is parent process of /bin/bash before create debugged process.
The true executable command line inside gdb is like following:
/bin/bash exec /my/executable/program/path
And this is parameter to vfork inside gdb.
For those who have the same problem, you can bypass this one by executing gdb with sudo.
A while ago I did run into the same problem. My guess is that running the debugged program with the additional capabilities is a security issue.
Your program has more privileges than the user that runs it. With a debugger a user can manipulate the execution of the program. So if the program runs under the debugger with the extra privileges then the user could use these privileges for other purposes than for which the program intended to use them. This would be a serious security hole, because the user does not have the privileges in the first place.
For those running GDB through an IDE, sudo-ing GDB (as in #Stéphane J.'s answer) may not be possible. In this case, you can run:
sudo gdbserver localhost:12345 /path/to/application
and then attach your IDE's GDB instance to that (local) GDBServer.
In the case of Eclipse CDT, this means making a new 'C/C++ Remote Application' debug configuration, then under the Debugger > Connection tab, entering TCP / localhost / 12345 (or whatever port you chose above). This lets you debug within Eclipse, whilst your application has privileged access.
I used #NickHuang's solution until, with one of system updates, it broke systemd services (too much capabilities on bash for systemd to start it or some such). Switched to leaving bash alone and instead pass a command to gdb to invoke the executable directly. The command is
set startup-with-shell off
OK, so I struggled a bit with this so I thought I'd combine answers and summarise.
The easy solution is just to sudo gdb as suggested but just be a bit careful. What you're doing here is running the debugged program as root. This may well cause it to operate differently than when you run it from the command line as a normal user. Could be a bit confusing. Not that I would EVER fall into this trap... Oopsies.
This will be fine if you're running the debugged program as root with sudo OR if the debugged program has the setuid bit set. But if the debugged program is running with POSIX capabilities (setcap / getcap) then you need to mirror these more granular permissions in bash and gdb as Nick Huang suggested rather than just brute forcing permissions with 'sudo'.
Doing anything else may lead you to a bad place of extreme learning.

Resources