node-sass module install script does not exit when npm install is launched by Ruby's popen3 - node.js

UPDATE 3: This problem appears to be part of node-sass's module installation. The stranded process has a working directory of ./node_modules/node-sass, and its command line, scripts/install.js, resolves to a file within the module. Furthermore, the last line of output that reaches the console matches a line output by node-sass's scripts/install.js:
if (cachedBinary) {
  console.log('Cached binary found at', cachedBinary);
  fs.createReadStream(cachedBinary).pipe(fs.createWriteStream(binaryPath));
  return;
}
This code does not have any issues when run from the command-line (i.e., simply issuing npm install at the command prompt with a blank node_modules directory), but when npm install is launched via popen3, it appears that the stream .pipe call here blocks indefinitely.
This is a head scratcher for me at the moment...
If I ^C the terminal where Ruby has launched these child processes, the interrupt makes it to the rogue process and causes it to terminate. However, forcing all the pipes closed (or simply terminating the parent process) does not cause the rogue node.exe to exit.
I contemplated an alternative version of popen3 that explicitly waits on the child process instead of just implicitly waiting for the streams to all come to an end, but while this does permit the calling side to proceed properly, the rogue child process still hangs around, and would interfere with subsequent invocations by holding an open handle to the ./node_modules/node-sass directory.
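The variant I contemplated might look something like the following sketch (popen3_wait_child is an illustrative name, not part of Open3): wait on the direct child process itself, then force the pipes closed from the reading side so the reader threads unblock even if a grandchild still holds the write ends open.

```ruby
require 'open3'

# Sketch: wait on the direct child process rather than on end-of-stream,
# then force-close the pipes so the reader threads unblock even if a
# grandchild keeps the write ends open. popen3_wait_child is an
# illustrative name, not part of Open3.
def popen3_wait_child(command, drain_timeout: 5)
  output_lines = []
  Open3.popen3(command) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    readers = [stdout, stderr].map do |io|
      Thread.new do
        begin
          io.each_line { |line| output_lines << line }
        rescue IOError
          # expected if the pipe is force-closed below
        end
      end
    end
    status = wait_thr.value                     # wait for the direct child to exit
    readers.each { |t| t.join(drain_timeout) }  # give readers a chance to drain normally
    [stdout, stderr].each { |io| io.close unless io.closed? }  # unblock stragglers
    readers.each(&:join)
    [status.exitstatus, output_lines]
  end
end
```

As noted above, this unblocks the calling side, but the stranded node.exe itself still lingers and keeps a handle on ./node_modules/node-sass.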
UPDATE 4: I have opened this bug report with the node-sass project: https://github.com/sass/node-sass/issues/2459
UPDATE: I'm pretty sure this is actually a Node issue. I tracked down the root cause of the hang, and it is that through a complex tree of child processes, npm install ultimately leaves behind an instance of node.exe that just sits there, apparently indefinitely, doing nothing, keeping the stdout and stderr pipes it has inherited open.
So, this leaves new questions:
Is there a way to make Node not leave behind a straggler process after an npm install completes?
Is there a way to explicitly wait for the direct child process of popen3 to exit, instead of waiting for the streams to end, and then possibly close the streams from the listening side to terminate the threads pumping the output?
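For the second question, here is one hedged sketch, assuming a Unix-like system (on Windows the closest analogue would be taskkill /T /PID <pid>): launch the command in its own process group, wait on the direct child, then signal any stragglers in the group as a unit. run_in_group is an illustrative name.

```ruby
require 'open3'

# Sketch: run the command in its own process group so that grandchild
# processes (like the stranded scripts/install.js node instance) can be
# signalled as a unit after the direct child exits. Unix-specific;
# run_in_group is an illustrative name.
def run_in_group(command)
  Open3.popen3(command, pgroup: true) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }
    status = wait_thr.value                  # wait for the direct child only
    begin
      Process.kill("-TERM", wait_thr.pid)    # SIGTERM the whole process group
    rescue Errno::ESRCH
      # group already empty: no stragglers were left behind
    end
    [status, out_reader.value, err_reader.value]
  end
end
```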
UPDATE 2: I have reproduced the problem with this minimalist code:
Open3::popen3 "npm install" do |stdin, stdout, stderr, thr|
  stdin.close
  stdout.each_line { |l| puts l }
end
With this code, the rogue node.exe process (command-line: scripts/install.js) hangs around after the npm install completes. Terminating the process unblocks the popen3 call (by causing stdout to come to an end, so the each_line loop terminates), and ^Cing the Ruby code (when running in an IRB window) causes the rogue node.exe to terminate (following a line in the console output: => #<IO:(closed)>).
This only happens when the process is run through popen3; the identical npm install from a CMD prompt exits normally.
Original question:
I'm having a problem with popen3 in a Ruby script. It's hanging, but I'm pretty sure it's not any of the usual candidates. I've updated my call to popen3 with tons of annotation so that I can see in the console output what's going on. Here is how I'm making the call:
command_output_lines = []
lock = Mutex.new
exit_code = nil

Logger.log("[MAIN] beginning popen3 block")

Open3.popen3(command_w_params) do |stdin, stdout, stderr, thr|
  Logger.log("[MAIN] closing stdin stream")
  stdin.close

  Logger.log("[MAIN] starting [STDOUT]")
  stdout_thread = Thread.new do
    Logger.log("[STDOUT] started")
    begin
      stdout.each_line do |stdout_line|
        Logger.log("[STDOUT] got a line, acquiring lock")
        lock.synchronize do
          command_output_lines <<= stdout_line
          Logger.log(stdout_line)
        end
        Logger.log("[STDOUT] lock released")
      end
    rescue Exception => e
      Logger.log("[STDOUT] exception: #{e}")
    end
    Logger.log("[STDOUT] exiting")
  end

  Logger.log("[MAIN] starting [STDERR]")
  stderr_thread = Thread.new do
    Logger.log("[STDERR] started")
    begin
      stderr.each_line do |stderr_line|
        Logger.log("[STDERR] got a line, acquiring lock")
        lock.synchronize do
          command_output_lines <<= "[STDERR] " + stderr_line
          Logger.warn(stderr_line)
        end
        Logger.log("[STDERR] lock released")
      end
    rescue Exception => e
      Logger.log("[STDERR] exception: #{e}")
    end
    Logger.log("[STDERR] exiting")
  end

  Logger.log("[MAIN] joining to [STDOUT]")
  stdout_thread.join

  Logger.log("[MAIN] joining to [STDERR]")
  stderr_thread.join

  Logger.log("[MAIN] threads joined, reading exit status")
  exit_code = thr.value.exitstatus
end

Logger.log("[MAIN] popen3 block completed")
(Never mind what exactly Logger.log is; just know that it sends output to the console.)
Where I'm seeing the problem, command_w_params is equal to npm install, and this code is running in the context of a bundle exec rake TaskName.
When it reaches this code, I see the following console output:
[MAIN] beginning popen3 block
[MAIN] closing stdin stream
[MAIN] starting [STDOUT]
[MAIN] starting [STDERR]
[MAIN] joining to [STDOUT]
[STDOUT] started
[STDERR] started
[STDOUT] got a line, acquiring lock
[STDOUT] lock released
[STDOUT] got a line, acquiring lock
> node-sass#4.9.2 install C:\Users\Jonathan Gilbert\RepositoryName\ProjectName\node_modules\node-sass
[STDOUT] lock released
[STDOUT] got a line, acquiring lock
> node scripts/install.js
[STDOUT] lock released
[STDOUT] got a line, acquiring lock
[STDOUT] lock released
[STDOUT] got a line, acquiring lock
Cached binary found at C:\Users\Jonathan Gilbert\AppData\Roaming\npm-cache\node-sass\4.9.2\win32-x64-57_binding.node
[STDOUT] lock released
...and then it just hangs. At this point, I can see in Process Explorer that the child process has exited. There is nothing left but ruby.exe, but it just sits there indefinitely until it is explicitly cancelled. The two threads are still running, indicating that the stdout and stderr streams haven't signalled end-of-stream yet.
Now, often when people have a problem with popen3, it's because they're not reading both stdout and stderr simultaneously, and one or the other fills up its pipe buffer while the parent process is only paying attention to the other. But my code is using separate threads and keeping the pipe buffers empty.
Another problem I've seen come up is that the child process may be sticking around waiting for stdin to be closed, but in this case:
stdin is being closed.
The child process doesn't even exist any more.
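For completeness, both of those usual pitfalls can be ruled out with a one-liner: Open3.capture3 closes stdin for the child and drains stdout and stderr on separate internal threads. (It would presumably hang here just the same, since it too waits for the streams to end; shown with a placeholder command rather than npm install.)

```ruby
require 'open3'

# Open3.capture3 closes the child's stdin and reads stdout/stderr on
# separate internal threads, avoiding the classic pipe-buffer deadlock.
# Placeholder command used for illustration.
stdout_str, stderr_str, status = Open3.capture3("echo hello from child")
puts "exit=#{status.exitstatus} out=#{stdout_str.strip}"
```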
Does anybody recognize these symptoms? Why are the stdout and stderr streams not hitting end-of-stream when the child process exits??

Related

Crossbar Thruway worker crashes

I have a Crossbar.io server with PHP Thruway workers. Recently, I started getting the following error. It happens about once a day now:
2016-04-17T21:08:12+0000 [Router 9572] Unable to format event {'log_logger': <Logger 'crossbar.router.protocol.WampWebSocketServerProtocol'>, 'log_time': 1460927292.17918, 'log_source': None, 'log_format': 'Traceback (most recent call last):\n File "/usr/local/lib/python2.7/site-packages/autobahn/wamp/websocket.py", line 88, in onMessage\n for msg in self._serializer.unserialize(payload, isBinary):\n File "/usr/local/lib/python2.7/site-packages/autobahn/wamp/serializer.py", line 106, in unserialize\n raise ProtocolError("invalid serialization of WAMP message ({0})".format(e))\nProtocolError: invalid serialization of WAMP message (Expected object or value)\n'}: tuple index out of range
2016-04-17T21:08:15+0000 [Guest 9583] The connected has closed with reason: close
2016-04-17T21:08:19+0000 [Guest 9583] PHP Fatal error: Call to a member function call() on null in /var/www/html/pickupServer/vendor/voryx/thruway/src/Thruway/ClientSession.php on line 106
2016-04-17T21:08:19+0000 [Guest 9583] Fatal error: Call to a member function call() on null in /var/www/html/pickupServer/vendor/voryx/thruway/src/Thruway/ClientSession.php on line 106
2016-04-17T21:08:19+0000 [Controller 9565] Guest worker2 exited with error A process has ended with a probable error condition: process ended with exit code 255.
Does anyone know how to prevent this?
How do I automatically restart the worker if it fails as in this case?
I solved it on Linux with Monit checking the crossbar-controller process and adding the following line:
if children < 3 then restart
3 is the number of child processes that the crossbar-controller has in my environment. If any of them exits, then Crossbar restarts itself and notifies me. You have to check the number of child processes you have running using:
sudo service crossbar status
This works around the worker exit, but at the cost of restarting the crossbar-controller. I am convinced there must be a Crossbar/Thruway way of solving the problem.
The ideal way is to try/catch all possible PHP errors to prevent fatal worker exits.
Thanks

What causes BitBake worker processes to exit unexpectedly?

I have a BitBake build process that runs in a Docker container (CentOS 7). The build fails during recipe gcc-cross-i586-5.2.0-r0, task do_compile, on every run I try.
An example of bitbake's output:
NOTE: recipe gcc-cross-i586-5.2.0-r0: task do_compile: Started
ERROR: Worker process (367) exited unexpectedly (-9), shutting down...
ERROR: Worker process (367) exited unexpectedly (-9), shutting down...
ERROR: Worker process (367) exited unexpectedly (-9), shutting down...
ERROR: Worker process (367) exited unexpectedly (-9), shutting down...
NOTE: Tasks Summary: Attempted 1538 tasks of which 17 didn't need to be rerun and all succeeded.
Is this a problem with recipe gcc-cross-i586-5.2.0-r0: task do_compile? Perhaps an out-of-memory error? I don't know what the -9 refers to or how to find out more information about it.
Try:
$ bitbake -c cleansstate gcc-cross ; bitbake -k gcc-cross
How much RAM do you have?
Post the error log here.
This worked for me:
Edit conf/local.conf (under the build directory) and decrease the number of BitBake threads by adding the following:
BB_NUMBER_THREADS = "6"
Just a long shot: an exit code of -9 means the worker was killed by signal 9 (SIGKILL), which often points to the kernel's out-of-memory killer. Is the issue reproducible? i.e., can you rm -rf tmp and does it happen again? Also check whether you have done some operations as root, leaving files inaccessible during the build, and make sure you don't have any permission issues in your project directory and associated file system(s).

abrtd: Node Process was killed by signal 6 (SIGABRT)

I am running a Node program that does a long-running data migration job. After about an hour of processing, the Node process is terminated by the abrt daemon, which creates a core dump.
Looking into the reason I see this:
node process was killed by signal 6 (SIGABRT)
Any ideas why the Node process is killed and how to deal with it?
It turned out to be a memory leak in the strong-oracle module I am using. I increased the Node.js process memory to 4 GB, and it is working fine now.

Why process terminates abnormally without coredump?

I am stuck on a problem where my C++ server program does not produce a core dump when it terminates abnormally. The program runs in daemon mode and chdirs to '/'.
I have done the following:
ulimit -c unlimited, so core dumps are enabled.
echo "/tmp/coredump/core.%e.%p.%t" > /proc/sys/kernel/core_pattern, and chmod a+w on the coredump directory, so there is permission to write the core dump file.
I have also tried these things:
Sending SIGABRT via kill -6 does produce a core dump.
dmesg shows no information about the abnormally terminated process.
Running the program not in daemon mode.
My OS version: CentOS release 6.4 (Final), x86_64
P.S. The server program installs a signal handler (sigaction() with the SA_RESETHAND flag) to catch {SIGHUP, SIGINT, SIGQUIT, SIGTERM} for normal termination (freeing resources), so signal blocking can be ruled out.

Quitting node.js gracefully

I'm reading through the excellent online book http://nodebeginner.org/ and trying out the simple code
var http = require("http");

function onRequest(request, response) {
  response.writeHead(200, {"Content-Type": "text/plain"});
  response.write("Hello World");
  response.end();
}

http.createServer(onRequest).listen(8888);
Now I didn't know (and I still don't know!) how to shut down node.js gracefully, so I just went ctrl+z. Now each time I try to run node server.js I get the following error messages.
node.js:134
throw e; // process.nextTick error, or 'error' event on first tick
^
Error: EADDRINUSE, Address already in use
at Server._doListen (net.js:1100:5)
at net.js:1071:14
at Object.lookup (dns.js:153:45)
at Server.listen (net.js:1065:20)
at Object.<anonymous> (/Users/Bob/server.js:7:4)
at Module._compile (module.js:402:26)
at Object..js (module.js:408:10)
at Module.load (module.js:334:31)
at Function._load (module.js:293:12)
at Array.<anonymous> (module.js:421:10)
So, two questions:
1) How do I shut down node.js gracefully?
2) How do I repair the mess I've created?
I currently use Node's event system to respond to signals. Here's how I use the Ctrl-C (SIGINT) signal in a program:
process.on('SIGINT', function() {
  console.log("\nGracefully shutting down from SIGINT (Ctrl-C)");
  // some other closing procedures go here
  process.exit();
});
You were getting the 'Address in Use' error because Ctrl-Z doesn't kill the program; it just suspends the process on a unix-like operating system and the node program you placed in the background was still bound to that port.
On Unix-like systems, Control+Z is the most common default keyboard mapping for the key sequence that suspends a process (SIGTSTP). When entered by a user at their computer terminal, the currently running foreground process is sent a SIGTSTP signal, which generally causes the process to suspend its execution. The user can later continue the process execution by typing the command 'fg' (short for foreground) or by typing 'bg' (short for background) and furthermore typing the command 'disown' to separate the background process from the terminal.
You would need to kill your processes by doing a kill <pid> or 'killall -9 node' or the like.
Use Ctrl+C to exit the node process gracefully
To clean up the mess depends on your platform, but basically you need to find the remains of the process in which node was running and kill it.
For example, on Unix: ps -ax | grep node will give you an entry like:
1039 ttys000 0:00.11 node index.js
where index.js is the name of your node file.
In this example, 1039 is the process id (yours will be different), so kill -9 1039 will end it, and you'll be able to bind to the port again.
As node.js is an event-driven runtime the most graceful exit is to exhaust the queue of pending events. When the event queue is empty the process will end. You can ensure the event queue is drained by doing things such as clearing any interval timers that are set and by closing down any servers with open socket connections. It gets trickier when using 3rd party modules because you are at the mercy of whether the module author has taken care to gracefully drain the pending events it created. This might not be the most practical way to exit a node.js process as you will spend a lot of effort tracking down 'leaked' pending events, but it is the most graceful I think.
Type either
process.exit()
or
.exit
to exit node gracefully.
Hitting Control + C twice will force an exit.
1) How do I shut down node.js gracefully?
Listen for a SIGINT signal. On Windows, you need to listen for Ctrl-C with the readline module.
I've written my own solution for giving an application a graceful shutdown with the use of domains: grace. It's worth a look.
