Reading application stdout data using node.js

Reading application stdout data using node.js - node.js

Let's take e.g. "top" application which displays system information and periodically updates it.
I want to run it using node.js and display that information (and updates!).
Code I've come up with:
#!/usr/bin/env node
var spawn = require('child_process').spawn;
var top = spawn('top', []);
top.stdout.on('readable', function () {
console.log("readable");
console.log('stdout: '+top.stdout.read());
});
It doesn't behave the way I expected. In fact it produces nothing:
readable
stdout: null
readable
stdout:
readable
stdout: null
And then exits (that is also unexpected).
top application is taken just as an example. Goal is to proxy those updates through the node and display them on the screen (so same way as running top directly from command line).
My initial goal was to write script to send file using scp. Done that and then noticed that I am missing progress information which scp itself displays. Looked around at scp node modules and they also do not proxy it. So backtracked to common application like top.

top is an interactive console program designed to be run against a live pseudo-terminal.
As to your stdout reads, top is seeing that its stdin is not a tty and exiting with an error, thus no output on stdout. You can see this happen in the shell if you do echo | top it will exit because stdin will not be a tty.
Even if it was actually running though, it's output data is going to contain control characters for manipulating a fixed-dimension console. (like "move the cursor to the beginning of line 2"). It is an interactive user interface and a poor choice as a programmatic data source. "Screen scraping" and interpreting this data and extracting meaningful information is going to be quite difficult and fragile. Have you considered a cleaner approach such as getting the data you need out of the /proc/meminfo file and other special files the kernel exposes for this purpose? Ultimately top is getting all this data from readily-available special files and system calls, so you should be able to tap into data sources that are convenient for programmatic access instead of trying to screen scrape top.
Now of course, top has analytics code to do averages and so forth that you may have to re-implement, so both screen-scraping and going through clean data sources have pros and cons, and aspects that are easy and difficult. But my $0.02 would be focus on good data sources instead of trying to screen scrape a console UI.
Other options/resources to consider:
The free command such as free -m
vmstat
and other commands described in this article
the expect program is designed to help automate console programs that expect a terminal
And just to be clear, yes it is certainly possible to run top as a child process, trick it into thinking there's a tty and all the associated environment settings, and get at the data it is writing. It's just extremely complicated and is analogous to trying to get the weather by taking a photo of the weather channel on a TV screen and running optical character recognition on it. Points for style, but there are easier ways. Look into the expect command if you need to research more about tricking console programs into running as subprocesses.

Related

How to get back raw binary output from python fabric remote command?

I am using Python 3.8.10 and fabric 2.7.0.
I have a Connection to a remote host. I am executing a command such as follows:
resObj = connection.run("cat /usr/bin/binaryFile")
So in theory the bytes of /usr/bin/binaryFile are getting pumped into stdout but I can not figure out what wizardry is required to get them out of resObj.stdout and written into a file locally that would have a matching checksum (as in, get all the bytes out of stdout). For starters, len(resObj.stdout) !== binaryFile.size. Visually glancing at what is present in resObj.stdout and comparing to what is in /usr/bin/binaryFile via hexdump or similar makes them seem about similar, but something is going wrong.
May the record show, I am aware that this particular example would be better accomplished with...
connection.get('/usr/bin/binaryFile')
The point though is that I'd like to be able to get arbitrary binary data out of stdout.
Any help would be greatly appreciated!!!

I eventually gave up on doing this using the fabric library and reverted to straight up paramiko. People give paramiko a hard time for being "too low level" but the truth is that it offers a higher level API which is pretty intuitive to use. I ended up with something like this:
with SSHClient() as client:
client.set_missing_host_key_policy(AutoAddPolicy())
client.connect(hostname, **connectKwargs)
stdin, stdout, stderr = client.exec_command("cat /usr/bin/binaryFile")
In this setup, I can get the raw bytes via stdout.read() (or similarly, stderr.read()).
To do other things that fabric exposes, like put and get it is easy enough to do:
# client from above
with client.open_sftp() as sftpClient:
sftpClient.put(...)
sftpClient.get(...)
also was able to get the exit code per this SO answer by doing:
# stdout from above
stdout.channel.recv_exit_status()
The docs for recv_exit_status list a few gotchas that are worth being aware of too. https://docs.paramiko.org/en/latest/api/channel.html#paramiko.channel.Channel.recv_exit_status .
Moral of the story for me is that fabric ends up feeling like an over abstraction while Paramiko has an easy to use higher level API and also the low level primitives when appropriate.

How can I select() (ie, simultaneously read from) standard input and a file in bash?

I have a program that accepts input on one FIFO and emits output to another FIFO. I want to write a small script to control this program. The script needs to listen both to standard input (so I can input commands to adjust things in real time) and the program's output FIFO (so it can respond to events happening there as well).
Essentially my control program needs to select between standard input and a file (my FIFO).
I like learning how to figure out how to develop simple and elegant bash-based solutions to complex problems, and after a little headscratching I remembered that that tail -f will happily select on multiple files and tell you when one of them changes in real time, so I initially tried
tail -f <(od -An -vtd1 -w1) <(cat fifo)
to read both standard input (I'd previously run stty icanon min 1; this od invocation shows each stdin character on a separate line alongside its ASCII code, and is great for escape sequence parsing) and my FIFO. This failed epically (as does cat <(cat)): od gets run here as a backgrounded task, so it doesn't get access to the controlling TTY, and fails with a cryptic "I/O error" that was explained incredibly well here.
So now I'm a bit stumped. I realize that I can use any scripting language like Perl/Python/Ruby/Tcl to solve this; my compsci/engineering question is whether/how I might be able to solve this using (Linux) shell scripting.

How do I "dump" the contents of an X terminal programmatically a la /dev/vcs{,a} in the Linux console?

Linux's kernel-level console/non-X terminal emulator contains a very cool feature (if compiled in): each /dev/ttyN device corresponds with /dev/vcsaN and /dev/vcsN devices which represent the in-memory (displayed) state of that tty, with and without attributes (color, flashing, etc) respectively. This allows you to very easily cat /dev/vcs7 and see a dump of /dev/tty7 wherever cat was launched. I used this incredibly practical capability the other day to login to a system via SSH and remotely watch a dd process I'd forgotten to put inside a screen (or similar) session - it was running off a text console, so I took a few moments to finetune the character ranges that I wanted to grab, and presently I was watching dd's transfer status over SSH (once every second, incidentally).
To reiterate and clarify, /dev/vcs{,a}* are character devices that retrieve the current in-memory representation the kernel console VT100 emulator, represented as a single "line" of text (there are no "newlines" at the end of each "line" of the screen). Just to remove confusion, I want to note that I can't tail -f this device: it's not a character stream like the TTY itself is. (But I've never needed this kind of behavior, for what it's worth.)
I've kept my ears perked for many years for a method to dump the character-cell memory state of X terminal emulators - or indeed any arbitrary process that needs to work with ttys, in some similar manner as I can with the Linux console. And... I am rather surprised that there is no practical solution to this problem - since it has, arguably, existed for approximately 30 years - X was introduced in 1984 - or, to be pedantic, at least 19 years - /dev/vcs{,a}* was introduced in kernel 1.1.94; the newest file in that release is dated 22 Feb 1995. (The oldest is from 1st Dec 1993 :P)
I would like to say that I do understand and realize that the tty itself is not a "screen buffer" as such but a character stream, and that the nonstandard feature I essentially exploited above is a quirky capability specific to the Linux VT102 emulator. However, this feature is cool enough (why else would it be in the mainline tree? :D) that, in my opinion, there should be a counterpart to it for things that work with /dev/pts*.
This afternoon, I needed to screen-scrape the output of an interactive ncurses application so I could extract metadata from the information it presented in my terminal. (There was no other practical way to achieve the goal I was aiming for.) Linux' kernel VT100 driver would permit such a task to be completed very easily, and I made the mistake of thinking that it, in light of this, it couldn't truly be that hard to do the same under X11.
By 9AM, I'd decided that the easiest way to experimentally request a dump of a remote screen would be to run it in dtach (think "screen -x" without any other options) and hack the dtach code to request a screen update and quit.
Around 11AM-12PM, I was requesting screen updates and dumping them to stdout.
Around 3:30PM, I accepted that using dtach would be impossible:
First of all, it relies on the application itself to send the screen redraws on request, by design, to keep the code simple. This is great, but, as luck would have it, the application I was using didn't support whole-screen repaints - it would only redraw on screen-size change (and only if the screen size was truly different!).
Running the program inside a screen session (because screen is a true terminal emulator and has an internal 2D character-cell buffer), then running screen -x inside dtach, also mysteriously failed to produce character cell updates.
I have previously examined screen and found the code sufficiently insane enough to remove any inclinations I might otherwise have to hack on it; all I can say is that said insanity may be one of the reasons screen does not already have the capabilities I have presented here (which would arguably be very easy to implement).
Other questions similar to this one frequently get answers to use typescript, or script; I just want to clarify that script saves the stream of the tty itself to a file, which I would need to push through a VT100 emulator to obtain a screen image of the current state of the tty in question. In other words, script would be a very insane solution to my problem.

I'm not marking this as accepted since it doesn't solve the actual core issue (which is many years old), but I was able to achieve the specific goal I set out to do.
My specific requirements were that I wanted to screen-scrape the output of the ncdu interactive disk usage browser, so I could simply press Enter in another terminal (or perform some similar, easy sequence) to add the directory currently highlighted/selected in ncdu to a file-list of files I wanted to work with.My goal was not to have to distract myself with endless copy+paste and/or retyping of directory names (probably with not a few inaccuracies to boot), so I could focus on the directories I wanted to select.
screen has a refresh feature, accessed by pressing (by default) CTRL+A, CTRL+L. I extended my copy of dtach to be capable of sending keystrokes in addition to dumping remote screens to stdout, and wrapped dtach in a script that transmitted the refresh sequence (\001\014) to screen -x running inside dtach. This worked perfectly, retrieving complete screen updates without any flicker.
I will warn anyone interested in trying this technique, however, that you will need to perfect the art of dodging VT100 escape sequences. I used regular expressions for this so I wasn't writing thousands of lines of code; here's the specific part of the script that extracted out the two pieces of information I needed:
sh -c "(sleep 0.1; dtach -k qq $'\001\014') &"; path="$(dtach -d qq -t 130000 | sed -n $'/^\033\[7m.*\/\.\./q;/---.*$/{s/.*--- //;s/ -\+.*//;h};/^\033\[7m/{s/.\033.*//g;s/\r.*//g;s/ *$//g;s/^\033\[7m *[^ ]\+ \[[# ]*\] *\(\/*\)\(.*\)$/\/\\2\\1/;p;g;p;q}' | sed 'N;s/\(.*\)\n\(.*\)/\2\1/')"
Since screenshots are cool and help people visualize things, here's a look at how it works when it's running:
The file shown inverted at the bottom of the ncdu-scrape window is being screen-scraped from the ncdu window itself; the four files in the list are there because I selected them using the arrow keys in ncdu, moved my mouse over to the ncdu-scrape window (I use focus-follows-mouse), and pressed Enter. That added the file to the list (a simple text file itself).
Having said this, I would like to clarify that the regular expression above is not a code sample to run with; it is, rather, a warning: for anything beyond incredibly trivial (!!) content extractions such as the one presented here, you're basically getting into the same territory as large corporations/interests who want to convert from VT100-based systems to something more modern, who have to spend tends of thousands commissioning large translation frameworks that perform the kind of conversion outlined above on an especially large scale.
Saner solutions appreciated.

Linux capture window of a large stream of data

Let's say I have a ton of data flowing through stdout over a long period of time, maybe an hour, and I want to capture a 30 second window of that data based on a trigger that occurs in the middle of that window. For instance, maybe something like
$ program-that-outputs-lots-of-data | program-that-captures-a-window-of-data
At some point, a line that contains "A-unique-string" will be output by the program, and at that point I want to save the 15 seconds worth of data before and after that string, discarding everything before that. Immediately afterward, I want to start monitoring again for the same string and capture another window when it comes in and save it to a new file. Any idea how I can do something like this with Linux tools?

The fact that you are trying to use time as a unit for buffering makes your problem very rare. Under the Unix command line, everything tends to be designed around the text line concept.
For example, if instead of 15 seconds of data you would like to capture 15 lines of text (before and after the special token), you could simply do:
$ program-that-outputs-lots-of-data | grep -C 15 A-unique-string
In your case, even if you are developing your own tailored filtering tool, deciding how much input to save and to discard is a pretty complex problem. I'd say that multimedia streaming is the area where there might be some ready to use tools.

I don't think anything exists that approaches these goals. Aside from the fact that your requirements are fairly specific, you also ask that the window be time-based, whereas most Unix-style text filters are line-oriented (e.g. grep -C 100 to get the hundred lines surrounding a match).
It should be fairly straightforward to do this in Python or Perl or Ruby or a similar scripting language.

Linux - Program Design for Debug - Print STDOUT streams from several programs

Let's say I have 10 programs (in terminals) working in tandem: {p1,p2,p3,...,p10}.
It's hard to keep track of all STDOUT debug statements in their respective terminal. I plan to create a GUI to keep track of each STDOUT such that, if I do:
-- Click on p1 would "tail" program 1's output.
-- Click on p3 would "tail" program 4's output.
It's a decent approach but there may be better ideas out there? It's just overwhelming to have 10 terminals; I'd rather have 1 super terminal that keeps track of this.
And unfortunately, linux "screen" is not an option. RESTRICTIONS: I only have the ability to either: redirect STDOUT to a file. (or read directly from STDOUT).

If you are looking for a creative alternative, I would suggest that you look at sockets.
If each program writes to the socket (rather than STDOUT), then your master terminal can act as a server and organize the output.
Now from what you described, it seems as though you are relatively constrained to STDOUT, however it could be possible to do something like this:
# (use netcat (or nc on some systems) to write to a socket on the provided port)
./prog1 | netcat localhost 12312
I'm not sure if this fits in the requirements of what you are doing (and it might be more effort than it is worth!), but it could provide a very stable solution.
EDIT: As was pointed out in the comments, netcat does exactly what you would need to make this work.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string