Is it possible to attach to a running background process with ruby? - node.js

I have a Node.js daemon running on my server. I would like to give it some input on stdin and read its stdout from a Rails controller. Is this possible with Ruby?
I am looking at Open3, but it seems to only let me spawn a new process.
I need to keep the Node.js process running because its startup overhead is too high to pay on every request.

In general there is no way to attach to a running process's IO streams unless it was set up to do so initially. It is easy if, for example, the process was set up to read from a pipe: just have Ruby write to that pipe like any other file (this is what the Open3 lib does).
For a daemon there are usually better ways to interact with it than hijacking its input with a pipe, though it depends on the particular daemon you are running and how it is being managed by the OS. For example, sockets are a popular way to communicate with a running process on *nix systems.
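To make the socket suggestion concrete, here is a minimal sketch in Ruby. It assumes the daemon listens on a Unix domain socket and speaks a line-based protocol: one line in, one line out. Both the socket path and the protocol are assumptions for illustration. A small in-process thread stands in for the Node.js daemon so that the example runs on its own.

```ruby
require "socket"
require "tmpdir"

# Hypothetical socket path; a real daemon would define its own.
SOCKET_PATH = File.join(Dir.mktmpdir, "daemon.sock")

# Stand-in for the long-running daemon: it answers every line it
# receives with the upper-cased version of that line.
server = UNIXServer.new(SOCKET_PATH)
Thread.new do
  loop do
    client = server.accept
    while (line = client.gets)
      client.puts(line.chomp.upcase)
    end
    client.close
  end
end

# What a Rails controller could do per request: connect to the
# already-running process, send one line, read one line back.
def ask_daemon(path, message)
  UNIXSocket.open(path) do |sock|
    sock.puts(message)
    sock.gets.chomp
  end
end

reply = ask_daemon(SOCKET_PATH, "hello")
puts reply # prints "HELLO"
```

The daemon pays its startup cost once, and each request only pays for a local socket connection.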

Related

How node IPC works between 2 processes

Using nodejs fork you can perform IPC between the parent process and the child process. Previously I was under the impression that the child process would have an extra environment variable with a file descriptor. I printed the process env but I can't see any variable with a file descriptor, and I don't see any open sockets either, so my question is: how does node IPC work behind the scenes?
"so my question is how does node IPC (for forked processes) work behind the scenes"
The source code for fork uses a Pipe object internally. Looking further into that Pipe object, it is a wrapper over the libuv Pipe object. Then, looking into libuv, its Pipe abstraction is a Unix domain socket on Unix and a named pipe on Windows.
Now, since these are all undocumented implementation details, nothing says it has to be done this way in the future, though one would not expect it to change without a really good reason.
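As an analogy only (this is Ruby, not Node's actual code), the same idea can be sketched with a connected pair of Unix domain sockets that is created before forking, so parent and child each hold one end. The newline-delimited JSON framing and the message fields are assumptions chosen to keep the example short.

```ruby
require "socket"
require "json"

# One connected pair of Unix domain sockets, created before the fork
# so that both processes inherit it.
parent_end, child_end = UNIXSocket.pair

pid = fork do
  parent_end.close # the child only needs its own end
  request = JSON.parse(child_end.gets)
  child_end.puts(JSON.generate("echo" => request["msg"], "pid" => Process.pid))
  child_end.close
end

child_end.close # the parent only needs its own end
parent_end.puts(JSON.generate("msg" => "ping"))
reply = JSON.parse(parent_end.gets)
Process.wait(pid)

puts reply["echo"] # prints "ping"
```

No environment variable or listening socket is needed in this sketch because the descriptor is simply inherited across the fork.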

Parent process <- child processes UNIdirectional communication in "real-world" Haskell?

Goal:
There is an IShell, which is nothing but an ordinary console able to consume a command such as do param1=value1 --option.
IShell should orchestrate the whole execution. It does not run commands itself; the only thing it does is start the appropriate process.
Any process started from the running IShell instance should be able to report back to it what is happening inside. So, say, IShell has started process A to do something complicated; process A should be able to report both progress and result back to the parent IShell. In practice, this means there should be a mechanism to, for example, print a message from process A in the appropriate IShell.
Finally, the code should work on both Windows and Linux.
I really like Haskell and I'd like to promote "real-world" Haskell usage. But I don't know the existing libraries well, and I haven't yet written any "real-world" Haskell app.
Thus, questions:
How can I establish IShell <- its processes communication? Is there a single library able to handle both Windows-specific and Linux-specific stuff?
The process package supports Linux and Windows and provides mechanisms for communicating with child processes via their stdin, stdout, stderr, and exit code.
The network package supports Linux and Windows and provides mechanisms for communicating with child processes via sockets.
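The shape of the stdout-based approach is the same in any language. Here is a rough sketch in Ruby rather than Haskell, purely to show the flow: the parent starts a child, reads its stdout line by line as progress reports, then checks the exit status. The child script and its "progress"/"result" line format are made up for the example.

```ruby
require "rbconfig"

# Hypothetical child: prints progress lines, then a final result line.
CHILD_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  3.times { |i| puts "progress #{i + 1}/3" }
  puts "result ok"
RUBY

reports = []
# The parent only reads; the child only writes (unidirectional).
IO.popen([RbConfig.ruby, "-e", CHILD_SCRIPT]) do |child_stdout|
  child_stdout.each_line { |line| reports << line.chomp }
end
exit_ok = $?.success? # exit status of the child, set when the pipe closes

puts reports
puts "child exited cleanly: #{exit_ok}"
```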

How to monitor open file descriptors in Ruby on Rails?

Background:
I had an issue with my Rails server recently where it would stop responding, requiring a bounce to get it back up and running. This issue was due to a controller that does some forking upon receiving a POST, to do some heavy-weight concurrent processing -- server response time kept increasing until the server completely stopped responding. I'm pretty sure I have fixed that issue (DB connections copied upon fork weren't getting closed in child processes), but it would be great to authoritatively test that.
Question:
Is there a way to monitor open file descriptors from inside my Rails app? It's running on Linux, so I've been mucking around with the proc filesystem and the lsof command to observe the open file descriptors; this is messy, because it only gives you a snapshot of the current processes. Ideally I would like to print the open file descriptors in the parent and child processes before, during, and after the processing, to ensure that file descriptors don't stay open past their welcome.
One method to consider (probably the simplest) is to use a background worker of some sort, such as Workling, have it run lsof at intervals, and capture the output with Ruby's backtick syntax:
`lsof | grep something` # shell command example.
Programs like lsof can really hurt performance if run too frequently. Every 10 to 30 seconds is reasonable; going down to 5 seconds is really pushing it. I'm assuming you have a dedicated server or a beefy virtual machine.
In your background worker, you can store these command results into a variable, or grep it down to what you're really looking for (as demonstrated), and access/manipulate the data as you please.
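Since the question mentions the proc filesystem, a lighter alternative to shelling out to lsof is to list /proc/<pid>/fd directly from Ruby. This is a Linux-only sketch; the helper name open_fds is made up for the example.

```ruby
# Linux only: /proc/<pid>/fd holds one entry per open descriptor of that process.
def open_fds(pid = Process.pid)
  Dir.open("/proc/#{pid}/fd") do |dir|
    # Reading the directory itself costs one descriptor; leave it out when
    # inspecting our own process so that it does not show up as a leak.
    own = pid == Process.pid ? dir.fileno : nil
    (dir.to_a - %w[. ..]).map(&:to_i).reject { |fd| fd == own }.sort
  end
end

before = open_fds
file = File.open(File::NULL)
file_fd = file.fileno
during = open_fds
file.close
after = open_fds

puts "opened during processing: #{(during - before).inspect}"
puts "still open afterwards:    #{(after - before).inspect}"
```

The same helper can be called with a child's pid from the parent (for example open_fds(child_pid) right after fork), which gives the before, during, and after picture the question asks for without spawning an external program.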

Maintaining a long-running task on Linux

My system includes a task which opens a network socket, receives pushed data from the network, processes it, and writes it out to disk or pings other machines depending on the messages. This task is intended to run forever, and the service is designed to have this task always running. But sometimes it crashes.
What's the best practice for keeping a task like this alive? Assume it's okay for the task to be dead for up to 30 seconds before we restart it.
Some obvious ideas include having a watchdog process that checks to make sure the process is still running. The watchdog could be triggered by cron. But how does it know if the process is alive or not? Write a pidfile? Touch a heartbeat file? An ideal solution wouldn't continuously spin up more processes if the machine gets bogged down to the point where the watchdog is running faster than the heartbeat.
Are there standard Linux tools for this? I can imagine a solution that uses a message queue, but I'm not sure if that's a good idea or not.
Depending on the nature of the task that you wish to monitor, one method is to write a simple wrapper to start up your task in a fork().
The wrapper task can then do a waitpid() on the child and restart it if it is terminated.
This does depend on modifying the source for the task that you wish to run.
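The fork()/waitpid() wrapper can be sketched in Ruby as follows. The method name supervise, the restart budget, and the delay are all assumptions for illustration. A real wrapper for a run-forever task would probably restart on any exit, whereas this sketch stops once the child exits cleanly so that it terminates.

```ruby
# Runs the given block in a forked child and restarts it until it exits
# cleanly or the start budget is used up. Returns the number of starts.
def supervise(max_starts: 5, delay: 1)
  1.upto(max_starts) do |attempt|
    pid = fork { yield attempt }       # child: run the task
    _, status = Process.wait2(pid)     # parent: block until the child ends
    return attempt if status.success?
    warn "child #{pid} ended abnormally (#{status}); restarting"
    sleep delay
  end
  raise "gave up after #{max_starts} starts"
end

# Demo task: fails on its first two starts, succeeds on the third.
starts = supervise(delay: 0) { |attempt| exit(1) if attempt < 3 }
puts starts # prints 3
```

Because the parent blocks in wait2, it uses no CPU while the task is healthy, and it never starts a second copy while the first is still running.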
sysvinit will restart processes that die, if added to inittab.
If you're worried about the process freezing without crashing and ending the process, you can use a heartbeat and hard kill the active instance, letting init restart it.
You could use monit along with daemonize. There are lots of tools for this in the *nix world.
Supervisor was designed precisely for this task. From the project website:
Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.
It runs as a daemon (supervisord) controlled by a command line tool, supervisorctl. The configuration file contains a list of programs it is supposed to monitor, among other settings.
The number of options is quite extensive; have a look at the docs for a complete list. In your case, the relevant configuration section might be something like this:
[program:my-network-task]
command=/bin/my-network-task  ; where your binary lives
autostart=true                ; start when supervisor starts?
autorestart=true              ; restart automatically when stopped?
startsecs=10                  ; consider start successful after how many secs?
startretries=3                ; try starting how many times?
I have used Supervisor myself and it worked really well once everything was set up. It requires Python, which should not be a big deal in most environments but might be.

Debugging utilities for Linux process hang issues?

I have a daemon process that does configuration management. All the other processes have to interact with this daemon in order to function. But when I execute a large action, after a few hours the daemon process becomes unresponsive for 2 to 3 hours. After those 2 to 3 hours it works normally again.
Are there debugging utilities for Linux process hang issues?
How can I find out at what point the Linux process hangs?
strace can show the last system calls and their result
lsof can show open files
the system log can be very effective when log messages are written to track progress. It lets you box the problem into smaller areas. Also correlate log messages with messages from other systems; this often turns up interesting results
wireshark, if the apps use sockets, makes the wire chatter visible
ps ax + top can show if your app is in a busy loop (i.e. running all the time), sleeping or blocked in IO, consuming CPU, or using memory.
Each of these may give a little bit of information which together build up a picture of the issue.
When using gdb, it might be useful to trigger a core dump when the app is blocked. Then you have a static snapshot which you can analyze using post mortem debugging at your leisure. You can have these triggered by a script. Then you quickly build up a set of snapshots which can be used to test your theories.
One option is to use gdb and use the attach command in order to attach to a running process. You will need to load a file containing the symbols of the executable in question (using the file command).
There are a number of different ways to do this:
Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.
Periodically touching a file with a preselected path. An external application can look at the timestamp for the file, and if it is stale, then it can assume that the application is dead or deadlocked.
You can use the alarm syscall repeatedly, having the signal terminate the process (use sigaction accordingly). As long as you keep calling alarm (i.e. as long as your program is running) it will keep running. Once you don't, the signal will fire.
You can seamlessly restart your process as it dies with fork and waitpid as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.
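The heartbeat-file variant from the list above is easy to sketch in Ruby. The file path, the 30-second threshold, and the helper names beat and stale? are assumptions for illustration. The monitored process calls beat from its main loop, and an external checker calls stale?.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical heartbeat location and staleness threshold.
HEARTBEAT = File.join(Dir.mktmpdir, "my-task.heartbeat")
STALE_AFTER = 30 # seconds

# Called by the monitored task on every pass through its main loop.
def beat(path)
  FileUtils.touch(path)
end

# Called by the external checker: a missing or old file means the
# task is dead or stuck. `now` is injectable to make testing easy.
def stale?(path, max_age, now: Time.now)
  !File.exist?(path) || (now - File.mtime(path)) > max_age
end

beat(HEARTBEAT)
fresh_check = stale?(HEARTBEAT, STALE_AFTER)
late_check  = stale?(HEARTBEAT, STALE_AFTER, now: Time.now + 60)

puts fresh_check # prints false
puts late_check  # prints true
```

A checker that finds the heartbeat stale can then kill the process and leave the restart to whatever supervises it.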

Resources