Why does console.log in a loop stop printing when piping to another command? - node.js

I'm using the fake-words module (npm install fake-words) with the following simple code:
#!/usr/bin/env node
const fake = require("fake-words");
while (true) {
  console.log(fake.sentence());
}
When I run ./genwords.js, everything works as expected.
However, when I pipe into an external program (in an Ubuntu shell), the generation of words stops after a second.
$ ./genwords.js | cat
...
(output generation stops after a second)
$ ./genwords.js | tee
...
(stuck as well)
$ ./genwords.js | pv -l
...
4.64k 0:00:13 [0.00 /s]
The same thing happens when I first assign the value to a variable, as a precaution against any caching (after reading this post, which is probably not relevant to Node.js):
while (true) {
  const words = fake.sentence();
  console.log(words);
}
What am I doing wrong?
I'm using Node v16 on Ubuntu:
$ node --version
v16.13.1

In Node.js, the behavior of console.log() in code that never relinquishes control to the event loop (such as the while loop in your example), especially when stdout is piped to another process, is a longstanding quirk. It's a lot harder to fix than you might think. Here's the relevant issue in the tracker: https://github.com/nodejs/node/issues/6379
Or, more specifically on the handling-when-piped-to-another-process issue: https://github.com/nodejs/node/issues/1741
You can work around the issue by restructuring the code to relinquish control to the event loop. Here is one possibility.
#!/usr/bin/env node
const fake = require("fake-words");
function getWords() {
  console.log(fake.sentence());
  // Schedule the next word on a later turn of the event loop so that
  // pending stdout writes get a chance to flush to the pipe.
  setImmediate(getWords);
}
getWords();
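Another workaround (a sketch of my own, not from the linked issues) is to respect stdout's backpressure directly: process.stdout.write() returns false once the stream's internal buffer is full, and the stream emits 'drain' when the downstream process has caught up, so generation can pause and resume accordingly.
#!/usr/bin/env node
const fake = require("fake-words");
function pump() {
  // write() returns false when the stream's internal buffer is full
  while (process.stdout.write(fake.sentence() + "\n")) {}
  // resume once the piped process has consumed the buffered output
  process.stdout.once("drain", pump);
}
pump();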

Related

NodeJS child spawn exits without even waiting for process to finish

I'm trying to create an Angular 11 application that connects to a NodeJS API which runs bash scripts when called; on exit it should either send an error or a 200 status with a confirmation message.
Here is one of the functions from that API. It runs a script called initialize_event.sh, supplies a few arguments when prompted, and once the script finishes it should send a success message (there is no error handler for this function):
exports.create_event = function (req, res) {
  var child = require("child_process").spawn;
  var spawned = child("sh", ["/home/ubuntu/master/initialize_event.sh"]);
  spawned.stdout.once("data", function (data) {
    spawned.stdin.write(req.body.name + "\n");
  });
  spawned.stdout.once("data", function (data) {
    spawned.stdin.write(req.body.domain_name + "\n");
  });
  spawned.on("exit", function (err) {
    res.status(200).send(JSON.stringify("Event created successfully"));
  });
};
The bash script is a long one, but what it basically does is take two variables (event name and domain name) and use them to create a new event instance. Here are the first few lines of the script:
#!/bin/bash
#GET EVENT NAME
echo -n "Enter event name: "; read event;
echo -n "Enter event domain: "; read eventdomain;
#LOAD VARIABLES
export eventdomain;
export event;
export ename=$event-env;
export event_rds=someurl.com;
export master_rds=otherurl.com;
export master_db=master;
# rest of code...
When called on its own directly from the terminal, the script takes around 30-40 seconds after taking input to create an event and then exits once completed. I can then check the list of events created using another script, and the new event shows up in the list. However, when I call this script from the NodeJS function, it manages to take the inputs and then exits within 5 or 6 seconds, saying the event has been created successfully. When I check the list of events, there is no event created. I wait to see if the process is still running and check back after a few minutes; still, no event created.
I suspect that the spawn exits before the script can run to completion. I thought that maybe the stdio streams were still open, so I tried to use spawned.on("close") instead of spawned.on("exit"), but the program still exits before it runs all the way through. I don't see any exceptions or errors appearing in the Node express console, so I can't really figure out why the program exits successfully without running completely.
I've used the same inputs when running from the terminal and on Postman, and have logged them as well to see if there are any empty variables being sent, but found nothing wrong with them either. I've double-checked the paths as well, literally copy-pasted from pwd to make sure I haven't been missing something, but still nothing.
What am I doing wrong here??
So here's the problem I found and solved:
The folder the Node Express app was being served from and the folder where the bash scripts were saved were different directories.
Problem:
So basically, whenever I created a child process, it was created with the following current directory:
/var/www/html/node/
But the bash scripts were run from:
/var/www/html/other/bash/scripts/
so any commands in the bash script that involved a directory change (like cd) were written relative to the bash scripts directory.
However, since the spawn's current directory was /var/www/html/node, the script being executed in the spawn also had the node folder as its current working directory, and any directory changes within the script were now invalid because those paths don't exist relative to the node directory.
E.g.
When run from terminal:
test.sh -> cd /savedir/ -> /var/www/html/other/bash/scripts/savedir/ -> exists
When run from spawn:
test.sh -> cd /savedir/ -> /var/www/html/node/savedir/ -> Doesn't exist!
Solution:
The easiest way I was able to solve this was to modify the test.sh file: at the start I added cd /var/www/html/other/bash/scripts/. This changed the spawn's current directory to the right place, which made all the mv, cd, and other path-dependent commands valid.
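An alternative worth noting (my own sketch, not part of the original fix) is to leave the script untouched and set the child's working directory through spawn's cwd option, so relative paths inside the script resolve against the scripts folder:
var spawn = require("child_process").spawn;
// Hypothetical variant of the handler above; the cwd path is the scripts
// directory mentioned in the answer, assumed rather than taken from real config.
var spawned = spawn("sh", ["/home/ubuntu/master/initialize_event.sh"], {
  cwd: "/var/www/html/other/bash/scripts/", // cd/mv inside the script now resolve here
});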

Read cmd to stream to BSON in Julia

I have a curl command whose output I want to load using BSON.
For performance reasons, I want to read the curl output directly into memory, without saving it to a file.
Also, I want to close curl as soon as possible, so I want to read the data from curl first and only then pass it to BSON; we had problems keeping curl open because the download was faster than the parsing that followed.
I know this works, but it keeps curl open for too long, which causes problems when we do this many times in parallel and the server we download from is a bit busy.
using BSON
cmd = `curl <some data>`
BSON.load(open(cmd))
To close cmd ASAP, I have this:
# created IOBuffer to wrap bytes
import BSON.load
function BSON.load(bytes::Vector{UInt8})
    io = IOBuffer()
    write(io, bytes)
    seekstart(io)
    BSON.load(io)
end
cmd = `curl <some data>`
BSON.load(read(cmd))
which works, but I consider it very ugly. Also, I'm not sure whether it has some performance penalty.
Is there a more elegant way to do this? Can I read(cmd) into some IO structure, which could be then passed to BSON.load?
I realized exactly the same problem holds for Serialization.deserialize. My solution for deserialization is the same, but I welcome any improvements.
It's a little unclear what your question means when you say that it "keeps curl open for too long", but here are two different ways to do this:
julia> using BSON
julia> url = "https://raw.githubusercontent.com/JuliaIO/BSON.jl/master/test/test.bson"
"https://raw.githubusercontent.com/JuliaIO/BSON.jl/master/test/test.bson"
julia> open(BSON.load, `curl -s $url`)
Dict{Symbol,Any} with 2 entries:
:a => Complex{Int64}[1+2im, 3+4im]
:b => "Hello, World!"
julia> BSON.load(IOBuffer(read(`curl -s $url`)))
Dict{Symbol,Any} with 2 entries:
:a => Complex{Int64}[1+2im, 3+4im]
:b => "Hello, World!"
The first version is similar to your first version but closes the curl process immediately when done downloading. The second version reads the result of the curl call into a byte vector, wraps it in an IOBuffer and then calls BSON.load on that.

NodeJS Specific Language Syntax Checker

I recently found an npm package called syntax-checker (https://www.npmjs.com/package/syntax-checker)
And I would like to integrate this into my JS script. I'm using a Discord chat bot which checks a message for a code block and the coding language. As the description of syntax-checker says, it supports Ruby, PHP, Perl, Lua, C/CPP, Bash, JavaScript and Python. How would I integrate this into the bot? For JS checking I currently use this script:
if (message.content.includes("```js"))
{
  let code = message.content.substring('```js '.length);
  var codebegin = code.split("```js").pop();
  var n = codebegin.indexOf('```');
  var codeend = codebegin.substring(0, n != -1 ? n : codebegin.length);
  var check = require('syntax-error');
  var err = check(codeend);
  if (err)
  {
    message.reply("Your code contains errors! ```" + err + "```");
  }
  else
  {
    message.reply("No Errors!");
  }
}
syntax-checker works by running whatever program on your machine is normally used to compile the code (with output suppressed) and checking whether it reports any errors. It analyzes every file in a directory passed to it and then writes the results to a file. You would need to create a temporary file for each request and then run the program via the shell (look into child_process or exec for this).
All that module ultimately does is decide what language the code is in from its file extension and run something like exec('php -l file/path/here.php', callbackFunctionHere). That's what it runs for PHP; the others are ruby -c, python -m py_compile, perl -c, luac -p, bash -n, gcc -fsyntax-only, and uglifyjs -o /dev/null.
With that knowledge, there's no sense in messing around with the file system at all. Just use something like exec("echo '" + codeStr + "' | php -l", callbackFunctionHere);. Replace php -l with whichever linter you need. Make sure you escape any single quotes that might occur in codeStr, since you'll end up with odd errors otherwise.
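If you'd rather avoid the quoting issue altogether, here is a sketch (my own, not from the module) that writes the code to the linter's stdin via spawn instead of interpolating it into a shell string. It assumes the chosen linter reads source from standard input when no file is given, as php -l does; substitute whichever linter you need.
const { spawn } = require("child_process");

function lint(codeStr, callback) {
  const child = spawn("php", ["-l"]); // assumption: php -l lints stdin when no file is passed
  let output = "";
  child.stdout.on("data", (chunk) => (output += chunk));
  child.stderr.on("data", (chunk) => (output += chunk));
  // call back with null on success, or the linter's output on failure
  child.on("close", (exitCode) => callback(exitCode === 0 ? null : output));
  child.stdin.end(codeStr); // send the code block and close stdin
}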

cannot create /dev/stdout: No such device or address

I want to run a shell command via node and capture the result from stdout. My script works fine on OSX, but not on Ubuntu.
I've simplified the problem and script to the following node script:
var execSync = require('child_process').execSync,
result = execSync('echo "hello world" >> /dev/stdout');
// Do something with result
Results in:
/bin/sh: 1: cannot create /dev/stdout: No such device or address
I have tried replacing /dev/stdout with /dev/fd/1
I have tried changing the shell to bash... execSync('echo ...', {shell : '/bin/bash'})
Like I said, the problem above is simplified. The real script accepts as a parameter the name of a file where results should be written, so I need to resolve this by providing access to the stdout stream as a file descriptor, i.e. /dev/stdout.
How can I execute a command via node, while giving the command access to its own stdout stream?
On /dev/stdout
I don't have access to an OSX box, but from this issue on phantomjs it seems that, while /dev/stdout is a symlink on both OSX/BSD and Linux, it nonetheless behaves differently between them. One of the commenters said it's standard on OSX to use /dev/stdout but not on Linux. In another random place I read statements that imply /dev/stdout is pretty much an OSX thing. There might be a clue in this answer as to why it doesn't work on Linux (the file descriptor seems to be implicitly closed when used this way).
Further related questions:
https://unix.stackexchange.com/questions/36403/portability-of-dev-stdout
bash redirect to /dev/stdout: Not a directory
The solution
I tried your code on Arch and it indeed gives me the same error, as do the variations mentioned - so this is not related to Ubuntu.
I found a blog post that describes how you can pass a file descriptor to execSync. Putting that together with what I got from here and here, I wrote this modified version of your code:
var fs = require('fs');
var path = require('path');
var fdout = fs.openSync(path.join(process.cwd(), 'stdout.txt'), 'a');
var fderr = fs.openSync(path.join(process.cwd(), 'stderr.txt'), 'a');
var execSync = require('child_process').execSync,
result = execSync('echo "hello world"', {stdio: [0,fdout,fderr] });
Unless I misunderstood your question, you want to be able to change where the output of the command in execSync goes. With this you can, using a file descriptor. You can still pass 1 and 2 if you want the called program to output to stdout and stderr as inherited by its parent, which you've already mentioned in the comments.
For future reference, this worked on Arch with kernel version 4.10.9-1-ARCH, on bash 4.4.12 and node v7.7.3.
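Since the real script takes an output file name as a parameter, one way to put this together is the sketch below (my own, under that assumption; reading the path from process.argv[2] is hypothetical): open the file when a name is given, otherwise pass fd 1 so the command writes to the parent's own stdout.
var fs = require('fs');
var execSync = require('child_process').execSync;

var outfile = process.argv[2]; // hypothetical: output path passed on the command line
var out = outfile ? fs.openSync(outfile, 'a') : 1; // 1 = the parent's stdout

execSync('echo "hello world"', { stdio: [0, out, 2] });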

Bash run a function in background

Have a relatively simple question here. I need to run a function in the background in bash. Normally I would do it just like so:
FUNCTION &
but things are a bit more complicated than that. I have the following line that runs the main function for each record in a text database. I can't really edit this code all that much without vastly changing the rest of the project, but I'm still open to new ideas.
cat databases/$WAN | grep -v \# | while read LINE; do MAIN; done
I want to spawn a new terminal in the background for each record to get a sort of parallel processing, making things go much faster. MAIN takes about a minute to process each record. This, however, does not work:
cat databases/$WAN | grep -v \# | while read LINE; do MAIN &; done
Any suggestions?
* UPDATE *
Thanks for all the responses. Let me see if I can answer some of those questions.
gniourf_gniourf - Yes I know using cat like this is wrong. This was early on, and critical code, so I have not updated it yet. I now read into the while loop for most things I do. I will fix it eventually. You may be right about syntax. When I break it up like so, things seem to work now:
cat databases/$WAN | grep -v \# | while read LINE
do
MAIN & > /dev/null 2>&1
done
So that fixes the background problem. I wonder what was messed up in my single line syntax. Thanks
chepner - I don't believe LINE is a variable. I could be wrong though; some things about Bash still confuse me. Maybe it is a variable that the entire record from the database gets stored to prior to processing.
Bruce K - Waiting is exactly what I was trying to avoid. If I let it run in the same terminal one at a time, it will slowly process each record in order. If I push each record to a separate terminal for processing, all records will be processed simultaneously (at least in our eyes). The additional overhead is intentional, in order to speed up how quickly the loop through the database completes.
Radix - Yes you're right. I'll read up on that. Thanks for the link.
This worked for me:
$ function testt(){ echo "lineee is <$lineee>";}
$ grep 5432 /etc/services|while read lineee;do testt&done
lineee is <postgres 5432/udp # POSTGRES>
lineee is <postgres 5432/tcp # POSTGRES>
If, for some reason, your MAIN function is not seeing a LINE variable, you can try:
"export" the LINE variable beforehand:
$ export LINE
$ # do your thing
Or, pass the line read as an argument to the function:
$ function testt(){ LINE="$1"; echo "LINE is <$LINE>";}
$ grep 5432 /etc/services|while read LINE;do testt "$LINE"&done
