mount fails from spawned process - linux

I want to start a process that uses a USB hard drive once it gets inserted.
Since the udev documentation specifically says not to run long-running processes from a RUN command, I send a FIFO message to my service, which then launches the relevant process.
So the flow goes like this:
UDEV > runs action process > sends FIFO message to service > service gets message > runs the process that works with the HDD (aka HDD-PROCESS).
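For reference, the action process is essentially a one-liner; a minimal sketch (the FIFO path is an assumption, and it relies on udev's DEVNAME environment variable):
#!/bin/sh
# Invoked from the udev RUN key: hand the device name to the long-running service and exit at once.
# Assumes the service already has the FIFO open for reading, otherwise the echo would block.
echo "$DEVNAME" > /var/run/hdd_events.fifo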
If I run my service from shell-1 and run the 'action process' (the one that UDEV runs) from shell-2, everything works (including when triggering it through udev).
But in deployment the service is spawned from init, and when it is, the mount command fails with "No such device".
I then detached "HDD-PROCESS" with fork and setsid, but that didn't help either.
from inittab:
::respawn:/opt/spwn_frm_init
ps relevant output:
PID PPID PGID SID COMM ARGS
31112 1 31112 31112 spwn_frm_init /bin/sh /opt/spwn_frm_init
31113 31112 31112 31112 runSvc /bin/sh /app/sys/runSvc
31114 31113 31112 31112 python python /app/sys/mainSvc.py
24064 1 24064 24064 python /usr/bin/python /app/sys/hdd_proc.py sdb1
Everything runs as root (ps shows that too; I omitted it to save screen space).
So in short: when I run /opt/spwn_frm_init from a shell, everything works. When I kill it and let it re-spawn from inittab, it doesn't, and mount fails with the error above.
UPDATE:
There is no problem when mounting an ext3 drive; the failure happens only with the NTFS one (using ntfs-3g).

Found it!
One of the differences between a spawned process and one run from a shell is the environment variables, which usually shouldn't be a problem when all I want is to call mount.
But once I noticed that the problem happens only with the NTFS drive, it occurred to me that mount might need to call ntfs-3g, so it was worth checking whether the latter is reachable through the PATH variable.
which ntfs-3g led to /usr/local/bin/ntfs-3g; that directory is in the default shell's PATH but not in the PATH of processes spawned from init.
To solve it, I added /usr/local/bin to PATH in the HDD-PROCESS, and mount began to work :)
A better error message in mount could have saved a lot of time here...
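For completeness, the fix boils down to something like this before calling mount (a shell sketch only; the mount point is a placeholder, and in my case the PATH change went into the HDD-PROCESS itself):
# Processes spawned from init inherit a minimal PATH that may not include /usr/local/bin,
# which is where the ntfs-3g helper lives on this system.
export PATH="$PATH:/usr/local/bin"
mount -t ntfs-3g /dev/sdb1 /mnt/usb   # hypothetical mount point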

Related

Linux ssh bash fork retry: no child processes

I am on Arch Linux, accessing an account on a server over SSH. I ran a bash script containing recursion that results in an infinite loop of "no such file or directory", which continues despite any interrupt (Ctrl-C etc.); it is totally uninterruptible. This eventually results in an endless stream of "bash: fork: No child processes". I cannot execute any commands while this happens, and when it stops with "Resource temporarily unavailable" I am unable to execute any commands to kill the script, because "bash: fork: No child processes" starts up again. I have no idea what to do; any help?
ps doesn't work
Looks like you've caused a fork bomb. You can try the methods here to stop it, but you'll most likely end up needing to reboot.
Run kill -9 -1 from the user login that caused the fork bomb. No need to reboot.
PS: Consult your seniors before running it on a prod server.
1) ps faux (find the PID and use it in the second command)
2) kill [PID]
If this was caused by a virus it will come back again, so you need to enable the virus scanner in cPanel, then scan and remove it.
Important:
Hosting providers must install the following services for this interface to appear:
The ClamAV Scanner plugin in WHM’s Manage Plugins interface (WHM >> Home >> cPanel >> Manage Plugins).
The Exim Mail Server service on the server in WHM’s Service Manager interface (WHM >> Home >> Service Configuration >> Service Manager).
Run ps faux (you might need to run it as another user or with sudo) and search for the offending process (it may look like a big branch of the tree).
If needed, kill the process via its PID.
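Putting those steps together, a minimal sketch (the PID is hypothetical):
ps faux                # walk the process tree and note the PID at the top of the runaway branch
pid=12345              # hypothetical PID of that parent
kill -9 "$pid"         # kill the parent so it stops respawning children
kill -9 -1             # last resort, run as the affected user: kills all of that user's processes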

Is it possible to pass input to a running service or daemon?

I want to create a Java console application that runs as a daemon on Linux, I have created the application and the script to run the application as a background daemon. The application runs and waits for command line input.
My question:
Is it possible to pass command line input to a running daemon?
On Linux, all running processes have a special directory under /proc containing information and hooks into the process. Each numeric subdirectory of /proc is the PID of a running process. So if you know the PID of a particular process you can get information about it. E.g.:
$ sleep 100 & ls /proc/$!
...
cmdline
...
cwd
environ
exe
fd
fdinfo
...
status
...
Of note is the fd directory, which contains all the file descriptors associated with the process. 0, 1, and 2 exist for (almost?) all processes, and 0 is the default stdin. So writing to /proc/$PID/fd/0 will write to that process' stdin.
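For example (the PID is hypothetical; note that if fd 0 is a terminal rather than a pipe or file, the text is only echoed on the display and is not actually read by the process):
echo "some input" > /proc/1234/fd/0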
A more robust alternative is to set up a named pipe connected to your process' stdin; then you can write to that pipe and the process will read it without needing to rely on the /proc file system.
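A minimal sketch of that, assuming a program that reads commands from stdin (names are placeholders). Keeping a writer such as tail -f attached avoids the reader seeing EOF every time a writer closes the pipe, which is also why the systemd answer below uses it:
mkfifo /tmp/myprog.stdin
tail -f /tmp/myprog.stdin | ./myprog &
echo "some command" > /tmp/myprog.stdin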
See also Writing to stdin of background process on ServerFault.
The accepted answer above didn't quite work for me, so here's my implementation.
For context I'm running a Minecraft server on a Linux daemon managed with systemctl. I wanted to be able to send commands to stdin (StandardInput).
First, use mkfifo /home/user/server_input to create a FIFO file somewhere (also known as the 'named pipe' solution mentioned above).
Then, in your daemon's *.service file, execute the bash script that runs your server or background program and set the StandardInput directive to the FIFO file we just created:
[Service]
ExecStart=/usr/local/bin/minecraft.sh
StandardInput=file:/home/user/server_input
In minecraft.sh, the following is the key command that runs the server and gets input piped into the console of the running service:
tail -f /home/user/server_input | java -Xms1024M -Xmx4096M -jar /path/to/server.jar nogui
Finally, run systemctl start your_daemon_service, and to pass input commands simply use:
echo "command" > /home/user/server_input
Credit to the answers given on ServerFault.

Simulating a process stuck in a blocking system call

I'm trying to test a behaviour which is hard to reproduce in a controlled environment.
Use case:
Linux system; usually Redhat EL 5 or 6 (we're just starting with RHEL 7 and systemd, so it's currently out of scope).
There are situations where I need to restart a service. The script we use for stopping the service usually works quite well; it sends a SIGTERM to the process, which is designed to handle it. If the process doesn't handle the SIGTERM within a timeout (usually a couple of minutes), the script sends a SIGKILL, then waits a couple of minutes more.
The problem is: in some (rare) situations, the process doesn't exit after a SIGKILL; this usually happens when it's badly stuck on a system call, possibly because of a kernel-level issue (corrupt filesystem, or not-working NFS filesystem, or something equally bad requiring manual intervention).
A bug arose when the script didn't realize that the "old" process hadn't actually exited and started a new process while the old one was still running; we're fixing this with a stronger locking scheme (so that at least the new process doesn't start if the old one is running), but I find it difficult to test the whole thing because I haven't found a way to simulate a hard-stuck process.
So, the question is:
How can I manually simulate a process that doesn't exit when sending a SIGKILL to it, even as a privileged user?
If your process is stuck doing I/O, you can simulate the situation this way:
lvcreate -n lvtest -L 2G vgtest
mkfs.ext3 -m0 /dev/vgtest/lvtest
mount /dev/vgtest/lvtest /mnt
dmsetup suspend /dev/vgtest/lvtest && dd if=/dev/zero of=/mnt/file.img bs=1M count=2048 &
This way the dd process will get stuck waiting for I/O and will ignore every signal. (Note that on recent kernels signals are no longer ignored when a process is waiting for I/O on an NFS filesystem.)
Well... how about just not sending the SIGKILL? Then your environment will behave as if it had been sent, but the process didn't quit.
Once a process is in the "D" state (TASK_UNINTERRUPTIBLE), it is in a kernel code path where execution cannot be interrupted until the task is processed, which means sending any signal to the process would not be useful and would be ignored.
This can be caused by a device driver getting too many interrupts from the hardware, too many incoming network packets, data from NIC firmware, or by being blocked on a HDD performing I/O. Normally this happens very quickly and threads remain in this state only for a very short span of time.
Therefore, what you need to do is look at the syslog and sar reports from the time when the process was stuck in D state. If you find stack traces in the log, try searching kernel.bugzilla.org for similar issues or seek support from your Linux vendor.
I would code it the opposite way. Have your server process write its pid to e.g. /var/run/yourserver.pid (this is common practice). Have the starting script read that file and test that the process does not exist, e.g. with kill with signal 0, or with:
yourserver_pid=$(cat /var/run/yourserver.pid)
if [ -f /proc/$yourserver_pid/exe ]; then
    echo "old server (pid $yourserver_pid) is still running" >&2; exit 1
fi
You could improve that by running readlink /proc/$yourserver_pid/exe and comparing the result to /usr/bin/yourserver.
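For example (the binary path is the hypothetical one from above; this also guards against the pid having been recycled by an unrelated process):
if [ "$(readlink /proc/$yourserver_pid/exe)" = "/usr/bin/yourserver" ]; then
    echo "yourserver is still running as pid $yourserver_pid"
fi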
BTW, having a process still alive a few seconds after a SIGKILL is a serious situation (the common case when it could happen is if the process is stuck in a D state, waiting for some NFS server), and you probably should detect and syslog it (e.g. with logger in your script).
I also would first send SIGTERM, wait a few seconds, send SIGQUIT, wait a few seconds, finally send SIGKILL, and only a few seconds later test that the server process is gone.
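A minimal sketch of that escalation (the pid file path, delays, and logger message are assumptions):
pid=$(cat /var/run/yourserver.pid)
for sig in TERM QUIT KILL; do
    kill -s "$sig" "$pid" 2>/dev/null || break   # stop escalating once the process is gone
    sleep 5
done
if [ -e "/proc/$pid/exe" ]; then
    logger -p daemon.err "yourserver pid $pid survived SIGKILL"   # serious: likely stuck in D state
fi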
A bug arose when the script didn't realize that the "old" process hadn't actually exited and started a new process while the old was still running;
This is a bug at the OS/kernel level, not in your service script. The situation is rare and hard to simulate because the OS is supposed to kill the process when SIGKILL is delivered. So I guess your goal is to make your script work well even under a buggy kernel. Is that correct?
You can attach gdb to the process; SIGKILL won't remove such a process from the process list, but it will flag it as a zombie, which might still be acceptable for your purpose.
void#tahr:~$ ping 8.8.8.8 > /tmp/ping.log &
[1] 3770
void#tahr:~$ ps 3770
PID TTY STAT TIME COMMAND
3770 pts/13 S 0:00 ping 8.8.8.8
void#tahr:~$ sudo gdb -p 3770
...
(gdb)
Other terminal
void#tahr:~$ ps 3770
PID TTY STAT TIME COMMAND
3770 pts/13 t 0:00 ping 8.8.8.8
sudo kill -9 3770
...
void#tahr:~$ ps 3770
PID TTY STAT TIME COMMAND
3770 pts/13 Z 0:00 [ping] <defunct>
First terminal again
(gdb) quit

Simple replacement of init to just start console

On a very simple PC, I want to replace the Ubuntu 12.04 /sbin/init with the most simple bash script possible, in order to have the very minimum number of running processes. Obviously no X, no USB, no detection of new hardware, no upgrades, no apt, "nothing"; I just need a working console with a DHCP-based Wi-Fi IP address (the SSID and passphrase are already stored in /etc/network/interfaces). That's all. Currently, I have tried this as a replacement for /sbin/init:
#!/bin/sh
mount -o rw,remount /
mount -t proc none /proc
udevd --daemon
mkdir /run/network
ifup -a &
while [ 1 ]; do
/sbin/getty -8 115200 tty1 vt100
done
It's working as I'm getting an IP address and I can login but:
A) While running shutdown, I get "shutdown: Unable to shutdown system:"
B) control-c is not working in the console
C) After a login, I get: "bash: cannot set terminal process group (-1): Inappropriate ioctl for device"
D) After a login, I get: "bash: no job control in this shell"
Also, I have noticed that all the user-space processes have a "?" in the tty column when running ps avx. How can I fix these problems? I don't want to use Upstart, because I want to really control what is started on the PC and keep it to the very bare minimum.
I ended up using Busybox init. Great tiny init...
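For reference, a minimal BusyBox-style /etc/inittab along the lines of the script in the question (device names and paths are assumptions):
::sysinit:/bin/mount -o remount,rw /
::sysinit:/bin/mount -t proc proc /proc
::sysinit:/sbin/ifup -a
tty1::respawn:/sbin/getty -8 115200 tty1 vt100
::ctrlaltdel:/sbin/reboot
::shutdown:/bin/umount -a -r
BusyBox init takes care of respawning getty itself, so the while-loop from the original script is no longer needed.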
You could leverage runlevels; based on your question, runlevel 3 is what you want to use.
If there are services that you do not wish to start, you can turn them off for that runlevel too.
For booting into runlevel 3, you just append the boot argument to the kernel in your boot loader:
<EXISTING_BOOT_CMD> 3
If your distro uses systemd instead of sysvinit, these are called targets instead. The equivalent of runlevel 3 in systemd is usually named multi-user.target.
The kernel boot argument you would need to pass in this case is systemd.unit=multi-user.target
<EXISTING_BOOT_CMD> systemd.unit=multi-user.target
An alternative, if you do not want to touch the boot loader, is to change the default target:
systemctl set-default multi-user.target

I don't get a coredump with all processes

I am trying to get a coredump, so I use:
ulimit -c unlimited
I run my program in the background, and then I kill it:
kill -SEGV %1
But I just get:
[1]+ Exit 1 ./Test
And no coredumps are created.
I did the same with other programs and it works, so why doesn't it work with all of them? Can anybody help me?
Thanks. (GNU/Linux, Debian 2.6.26)
If your program traps the SEGV signal and does something else, it won't invoke the OS core dump routine. Check that it doesn't do that.
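One way to check that while the program is running (using the ./Test name from the question):
grep SigCgt /proc/$(pidof Test)/status   # hex mask of caught signals; bit 10 set (0x400) means a SIGSEGV handler is installed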
Under Linux, processes which change their user ID using setuid, seteuid or some other parameters get excluded from dumping core for security reasons (Think: /bin/passwd dumps core while reading /etc/shadow into memory)
You can re-enable core dumps for Linux programs which change their user ID by calling prctl(PR_SET_DUMPABLE, 1) after the change of UID.
Also, you might want to check that the program you're running is not changing its working directory (chdir()), because then it will create the core file in a different directory from the one you're running it from.
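You can check that while the process is still running, e.g. (again using the ./Test name from the question):
readlink /proc/$(pidof Test)/cwd   # the directory where a core file would be written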
And you can try this too:
kill -ABRT pid
Try (as root):
sysctl kernel.core_pattern=core
and then repeat your experiment. On some systems that variable is set to /dev/null by default.
However, if you see exit status 1, perhaps the program indeed intercepts the signal.
