In the documentation for Node's Child Processes, there is this sentence in the section on child_process.spawn():
On Windows, setting options.detached to true makes it possible for the
child process to continue running after the parent exits.
That makes it sound like (at least on Windows) when you leave options.detached at its default value of false, spawn()'d processes will automatically be killed. That's actually the behavior I want in my application, and in fact I was calling myChildProcess.kill("SIGINT") in my code, but commented it out, and the child processes still went away when my app quit. So that's great, but:
(1) My understanding is that it's necessary to do some tricky stuff with "job objects" as discussed here in order to make this work on Windows. Do you know if Node is doing something tricky like that to make child processes go away? Or perhaps it's more simple than that and Node just keeps a list of the spawned process IDs and kills any of them that are still around when shutting down? Which leads to the closely related question...
(2) If Node is indeed doing something special to kill child processes, do you know if there are cases (e.g., some kind of app crash) that would defeat what it's doing and leave the child processes running?
UPDATE: To clarify, the child processes I'm launching in my case are Python web server processes, not other Node processes. I don't know if there's a difference in behavior between a Node child process and some other child process for the purpose of this question.
A Node instance will exit once there is nothing left in the event queue (and no async code pending), so as long as you aren't leaving anything open, a Node process will naturally quit when it's done.
In terms of the process hanging on a crash: unless you are explicitly handling uncaught exceptions, the process will exit immediately.
If you want a child process to be long-running and to survive the termination of the Node process itself then, as you know, you set options.detached = true.
This business of stopping a child process when its parent stops is operating-system behavior. A parent process (written in any language, not just Node) owns its non-detached child processes, and the OS cleans up those child processes when the parent terminates.
Detaching a process tells the OS to make it no longer a child process, so the OS won't clean it up automatically.
A good practice for node child processes: whenever possible, have them do their assigned task and then exit. In other words, in most cases you should not need to rely on this child / detached behavior.
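For what it's worth, here is a rough C sketch of the kind of thing "detaching" corresponds to at the OS level on Unix-like systems: the child is started in its own session (and hence its own process group) before the target program is executed, so it is no longer tied to the parent. The my-server program name is only a placeholder, and this is an illustration of the mechanism, not Node's actual implementation:

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* create the child process */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: start a new session so it is no longer in the
             * parent's session or process group ("detached"). */
            if (setsid() < 0) {
                perror("setsid");
                _exit(1);
            }
            execlp("my-server", "my-server", (char *)NULL);  /* placeholder program */
            perror("execlp");            /* only reached if exec fails */
            _exit(1);
        }
        /* Parent: free to exit; the detached child keeps running. */
        printf("detached child pid: %d\n", (int)pid);
        return 0;
    }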
Related
I have a desktop Node.js application that spawns a child process. After spawning, all communications between the parent and child processes go through the network, and I need the processes to be completely independent of each other. spawn()'s option.detached allows for the child process to continue running after the parent exits, and subprocess.unref() allows the parent to exit independently of the child.
While everything works and is pretty easy to manually test (track the processes in Task Manager/Activity Monitor and see what happens when one of the processes is killed/exits), I'm wondering about how I could test this programmatically, specifically using Jest.
There are two testing scenarios:
option.detached – test whether the child process lives even after the parent process exits. But how can I simulate a parent exiting if the parent process is the process Jest tests run in? Perhaps I could simulate/mock process.exit() somehow?
unref() – test whether the parent process can exit without waiting for the child process. Now this is even more difficult and I have no idea how to test this. Perhaps mock the child process with a sample long-lived process, like setTimeout(fn, 999999)? Then again, how do I test whether the parent process exits?
I know that a child process whose parent has died becomes a zombie process, but when that happens, does it continue execution normally?
What I have read so far seems to suggest that yes, but I have not found confirmation, and my programming ventures seem to suggest otherwise.
Whether a child's parent has exited has no effect on whether it continues running. Assuming the child has access to the resources that it needs, it will continue to run normally.
This is important when writing a daemon, since typically the started process forks twice, and it is the grandchild that ultimately runs as a service.
Note that there are some reasons a child may end up exiting abnormally due to a parent exiting. For example, if the parent is an interactive shell and it exits, the terminal may disappear, and as a result the child may receive a SIGHUP. However, in that case, the reason the child will have exited is because it received a signal it didn't handle, and if it had set up a suitable handler, it would have continued running.
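For example, here is a minimal sketch of a child that ignores SIGHUP, so that losing the controlling terminal does not terminate it:

    #include <signal.h>
    #include <unistd.h>

    int main(void) {
        /* Ignore SIGHUP so the loss of the controlling terminal does not
         * terminate this process. A real handler could be installed here
         * instead of SIG_IGN if some cleanup is wanted. */
        signal(SIGHUP, SIG_IGN);

        for (;;) {
            /* Keep doing work; the death of the parent, by itself,
             * does not stop this loop. */
            sleep(60);
        }
    }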
I have a somewhat interesting setup. I have a wrapper script that does some stuff, and launches a child process. I want to be able to do the following:
have the wrapper script be able to kill the child process and any of its children that it may have created
Ensure that if the wrapper itself is signaled / killed it will pass it along to the child processes as well
From what I can see there's a bit of a contradiction here. To account for the first requirement, from what I have seen, I need to use killpg on the child process's process group. This is fine, but it also kills the wrapper script itself, since the child shares the parent script's process group.
So now if I setpgrp in the child so it gets a separate PG, I can killpg it together with its children properly, BUT now I have lost the second requirement (if the wrapper gets killed, it won't be passed on to the child).
I can manage this problem by registering a signal handler in the wrapper script and passing the signal along via killpg as well; however, this doesn't work for SIGKILL... which leaves me at a bit of a paradox.
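For concreteness, here is a rough sketch of the arrangement I'm describing (child-program is a placeholder): the wrapper puts the child in its own process group and forwards catchable signals to that group from a handler.

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static pid_t child_pgid = 0;

    /* Forward a received signal to the child's process group.
     * kill() with a negative pid is equivalent to killpg(). */
    static void forward_signal(int signo) {
        if (child_pgid > 0)
            kill(-child_pgid, signo);
    }

    int main(void) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            setpgid(0, 0);                     /* child: its own process group */
            execlp("child-program", "child-program", (char *)NULL);  /* placeholder */
            perror("execlp");
            _exit(1);
        }

        setpgid(pid, pid);     /* set it from the parent too, to avoid a race */
        child_pgid = pid;

        signal(SIGTERM, forward_signal);   /* forward catchable termination signals */
        signal(SIGINT,  forward_signal);   /* SIGKILL, however, cannot be caught */

        int status;
        /* Wait for the direct child; restart if interrupted by a signal. */
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
            ;
        return 0;
    }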
Any possible solutions to this?
I am writing an SDK in Go that integrators will communicate with via a local socket connection.
From the integrating application I need a way to start the SDK as a process but more importantly, I need to be able to cancel that process when the main application is closing too.
This question is language agnostic (I think), as I believe the challenge is Linux-related, i.e. how to start a program and cancel it at a later stage.
Some possible approaches:
I am thinking that it's a case of starting the program via exec, getting its PID or some other ID, then using that to kill it later. sudo may be required to do this, which is not ideal. Also, this is not good practice, as you will effectively be force-closing the SDK, giving it no time for cleanup.
Start the program via any means. Once ready to close, just send a "shutdown" command via the SDK API which will allow the SDK to cleanup, manage state then exit the application.
What is best practice for this please?
Assuming you're using Linux or similar Unix:
You are on the right track. You won't need sudo. The comments thus far are pointing in the right direction, but not spelling out the details.
See section 2 of the manual pages (man 2 ...) for details on the functions mentioned here. They are documented for calling from C. I don't have experience with Go to help determine how to use them there.
The integrator application will be called the "parent" process. The SDK-as-a-process will be called the "child" process. A process creates a child and becomes its parent by calling fork(). The new process is a duplicate of the parent, running the same code, and having for the most part all the same state (data in memory). But fork() returns different values to parent and child, so each can determine its role in the relationship. This includes informing the parent of the process identifier (pid) of the child. Hang on to this value. Then, the child uses exec() to execute a different program within the existing process, i.e. your SDK binary. An alternative to fork-then-exec is posix_spawn(), which has rather involved parameters (but gives greater control if you need it).
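Here is a minimal sketch of that fork-then-exec pattern from the parent's point of view (the ./sdk-process path is just a placeholder for your SDK binary):

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();               /* duplicate the calling process */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: replace this process image with the SDK binary. */
            execl("./sdk-process", "sdk-process", (char *)NULL);  /* placeholder path */
            perror("execl");              /* reached only if exec fails */
            _exit(127);
        }
        /* Parent: fork() returned the child's pid; keep it for kill()/waitpid(). */
        printf("started child with pid %d\n", (int)pid);
        return 0;
    }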
Designing the child to shutdown in response to a signal, rather than a command through the API, will allow processes other than the parent to initiate clean shutdown in standard fashion. For example, this might be useful for the administrator or user; it enables sending the shutdown signal from the shell command-line or script.
The child installs a signal handler function, that will be called when the child process receives a signal, by calling signal() (or the more complex sigaction() recommended for its portability). There are different signals that can be sent/received, identified by different integer values (and also given names like SIGTERM). You indicate which you're interested in receiving when calling signal(). When your signal handler function is invoked, you've received the signal, and can initiate clean shutdown.
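On the child (SDK) side, here is a sketch of installing a SIGTERM handler with sigaction() and using it to trigger a clean shutdown (the actual cleanup is left as a placeholder):

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t shutdown_requested = 0;

    /* Handler: just record that a shutdown was requested; do the real
     * work in the main loop, outside signal context. */
    static void handle_sigterm(int signo) {
        (void)signo;
        shutdown_requested = 1;
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = handle_sigterm;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGTERM, &sa, NULL);    /* register interest in SIGTERM */

        while (!shutdown_requested) {
            /* ... normal SDK work, e.g. servicing the local socket ... */
            sleep(1);
        }

        /* Clean shutdown: flush state, close the socket, etc. (placeholder). */
        puts("shutting down cleanly");
        return 0;
    }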
When the parent wants the child to shut down cleanly, the parent sends a signal to the child using the unfortunately named kill(). Unfortunately named because signals can be used for other purposes. Anyway, you pass to kill() the pid (returned by fork()) and the specific signal (e.g. SIGTERM) you want to send.
The parent can also determine when the child has completely shut down by calling waitpid(), again passing the pid returned by fork(), or alternatively by registering to receive the signal SIGCHLD. Register to receive SIGCHLD before calling fork()/exec(), or you might miss the signal.
Actually, it's important that you do call waitpid(), optionally after receiving SIGCHLD, in order to deallocate a resource holding the child process's exit status, so the OS can cleanup that last remnant of the process. Failing to do so keeps the child as a "zombie" process, unable to be fully reclaimed. Too many zombies and the OS will be unable to launch new processes.
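Putting the parent side together, here is a sketch of requesting shutdown with kill() and then reaping the child with waitpid(), assuming child_pid is the value returned by fork():

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Ask the child to shut down cleanly, then reap it so it doesn't
     * linger as a zombie. 'child_pid' is the value fork() returned. */
    void stop_child(pid_t child_pid) {
        int status;

        kill(child_pid, SIGTERM);                          /* request clean shutdown */

        if (waitpid(child_pid, &status, 0) == child_pid) { /* reap the child */
            if (WIFEXITED(status))
                printf("child exited with status %d\n", WEXITSTATUS(status));
            else if (WIFSIGNALED(status))
                printf("child killed by signal %d\n", WTERMSIG(status));
        }
        /* If it refuses to exit in time, kill(child_pid, SIGKILL) forces it. */
    }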
If a process refuses to shut down cleanly or as quickly as you require, you may force it to quit (without executing its cleanup code) by sending the signal SIGKILL.
There are variants of exec(), waitpid() and posix_spawn(), with different names and behaviors, mentioned in their man pages.
I have just started learning about fork and wait in Linux and came across this paragraph in the wait() manual page notes:
A child that terminates, but has not been waited for becomes a "zombie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes. If a parent process terminates, then its "zombie" children (if any) are adopted by init(8), which automatically performs a wait to remove the zombies.
A question that came to mind after reading this:
Isn't the fact that not using wait() causes a resource to be wasted until the parent terminates a problem that is amplified when the parent process is meant to be a long-lived process in the system?
Does this mean I should always use wait() as soon as possible after using fork?
Isn't the fact that not using wait() will cause a resource waste until the parent will terminate?
When a child process is running, there's no waste of resources; it's still doing its task. The resource waste that your citation talks about occurs only when a child dies but its parent hasn't reaped it yet, i.e. hasn't wait()ed on the child process.
a problem that amplifies when the parent process is meant to be a long lived process in the system?
When your application runs for a very long time and keeps forking children, there's a chance that the system might run out of resources if many child processes are still running or the parent process hasn't reaped the exited children. It's the job of the application process to manage the system's resources optimally and to reap child processes as soon as they are done.
Does this mean I should always use wait() as soon as possible after using fork?
There's no straightforward "as early as possible" or "as late as possible" answer to this. For example, the parent process might want to carry on doing something useful while the child is still running rather than waiting (it might even be unnecessary to periodically check the children's status with WNOHANG when the parent knows the children have long tasks to finish). So in that case, waiting as soon as you fork a process might not be what you want. In general, the parent should call wait() whenever it expects the child(ren) to have completed their task (or wants to know the status of the children). The responsibility lies with the programmer to code correctly and call wait() at the most appropriate time.
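As a concrete illustration of the non-blocking option, the parent can keep doing its own work and periodically reap any children that have already exited using waitpid() with WNOHANG; a minimal sketch:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Reap any children that have already exited, without blocking.
     * Call this periodically from the parent's main loop. */
    void reap_finished_children(void) {
        int status;
        pid_t pid;

        /* waitpid() returns 0 when children exist but none have exited yet,
         * and -1 when there are no children left to wait for. */
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
            printf("reaped child %d\n", (int)pid);
        }
    }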