Node.js stops when reading from multiple Readable streams

Node.js stops when reading from multiple Readable streams - node.js

After creating a stream (A), creating another stream (B) and reading stream (B), the reading process stops from the stream (A).
How can I solve this problem?
Node.js v14.18.1
import * as readline from 'readline';
import { Readable } from 'stream';
async function main() {
const streamA = Readable.from('a');
const readerA = readline.createInterface({
input: streamA,
crlfDelay: Infinity
});
var stopCase = false;
if (stopCase) {
const streamB = Readable.from('b');
const readerB = readline.createInterface({
input: streamB,
crlfDelay: Infinity
});
console.log('readB');
for await (const line of readerB) {
console.log(line);
}
}
console.log(`readerA.closed = ${'closed' in readerA}`);
console.log('readA');
for await (const line of readerA) {
console.log(line);
}
console.log('success');
}
main();
Output(stopCase=true):
readB
b
readerA.closed = true
readA
Output(stopCase=false):
readerA.closed = false
readA
a
success

The issue is that as soon as you do this:
const readerA = readline.createInterface({
input: streamA,
crlfDelay: Infinity
});
Then, streamA is now ready to flow and readerA is ready to generate events as soon as you hit the event loop. When you go into the stopCase block and hit the for await (const line of readerB), that will allow streamA to flow which will allow readerA to fire events.
But, you aren't listening for the readerA events when they fire and thus it finishes the streamA content it had while you aren't listening.
You can see how it works better if you don't create readerA until after you're done with the stopCase block. Because then streamA and readerA aren't yet flowing when you hit the await inside of the stopCase block.
This is what I would call a growing pain caused by trying to add promises onto the event driven streams. If you leave the stream in a flowing state and you were going to use await to read those events, but you then await some other promise, all your events on that first stream fire when you aren't yet listening. It doesn't know you're waiting to use await on it. You set it up to flow so as soon as the interpreter hits the event loop, it starts flowing, even though you aren't listening with await.
I've run into this before in my own code and the solution is to not set a stream up to flow until you're either just about to use await to read it or until you have a more traditional event handler configured to listen to any events that flow. Basically, you can't configure two streams for use with for await (...) at the same time. Configure one stream, use it with your for await (...), then configure the other. And, be aware of any other promises used in your processing of the for await (...) loop too. There are lots of ways to goof up when using that structure.
In my opinion, it would work more reliably if a stream was actually put in a different state to be used with promises so it will ONLY flow via the promise interface. Then, this kind of thing would not happen. But, I'm sure there are many challenges with that implementation too.
For example, if you do this:
import * as readline from 'readline';
import { Readable } from 'stream';
async function main() {
var stopCase = true;
console.log(`stopCase = ${stopCase}`);
if (stopCase) {
const streamB = Readable.from('b');
const readerB = readline.createInterface({
input: streamB,
crlfDelay: Infinity
});
console.log('readB');
for await (const line of readerB) {
console.log(line);
}
}
const streamA = Readable.from('a');
const readerA = readline.createInterface({
input: streamA,
crlfDelay: Infinity
});
console.log(`streamA flowing = ${streamA.readableFlowing}`);
console.log(`readerA.closed = ${!!readerA.closed}`);
console.log('readA');
for await (const line of readerA) {
console.log(line);
}
console.log('success');
}
main();
Then, you get all the output:
stopCase = true
readB
b
streamA flowing = true
readerA.closed = false
readA
a
success
The reason you never get the console.log('success') is probably because you hit the for await (const line of readerA) { ...} loop and it gets stopped there on a promise that has no more data. Meanwhile, nodejs notices that there is nothing left in the process that can create any future events so it exits the process.
You can see that same concept in play in an even simpler app:
async function main() {
await new Promise(resolve => {
// do nothing
});
console.log('success');
}
main();
It awaits a promise that never completes and there are no event creating things left in the app so nodejs just shuts down with ever logging success.

Related

What's the best way to create a RxJS Observable out of objects that are yielded from a node.js stream pipeline?

I was able to implement this, but I'm not able to explain why the process will exit with code 0 if I don't have another async function (processAfterDeclaration) always trying to pull from the Observable, recordObservable.
setup to run the source file
npm init -y
npm i rxjs#7.4.0 byline
source file that does what I want, but in a confusing way
// node.js 14
const fs = require('fs');
const pipeline = require('util').promisify(require('stream').pipeline);
const byline = require('byline');
const { Observable } = require('rxjs');
const { take } = require('rxjs/operators');
const sleep = ms => new Promise(r => setTimeout(r, ms));
let recordObservable;
(async () => {
const inputFilePath = 'temp.csv';
try {
const data = 'a,b,c\n' +
'1,2,3\n' +
'10,20,30\n' +
'100,200,300';
fs.writeFileSync(inputFilePath, data);
console.log('starting pipeline');
// remove this line, and the `await pipeline` resolves, but process exits early?
processAfterDeclaration().catch(console.error);
await pipeline(
fs.createReadStream(inputFilePath),
byline.createStream(),
async function* (sourceStream) {
console.log('making observable', inputFilePath);
recordObservable = new Observable(async subscriber => {
for await (const lineBuffer of sourceStream) {
subscriber.next(lineBuffer.toString());
}
subscriber.complete();
});
console.log('made observable', recordObservable);
}
);
console.log('pipeline done', recordObservable);
} catch (error) {
console.error(error);
} finally {
fs.unlinkSync(inputFilePath);
}
})();
async function processAfterDeclaration() {
while (!recordObservable) {
await sleep(100);
}
console.log('can process');
await recordObservable
.pipe(take(2))
.subscribe(console.log)
}
edit: It may be better to just forgo node.js stream.pipeline. I'd think using pipeline is best bc it should be the most efficient and offers backpressuring, but I want to test some things offered by RxJS.
edit2: More reasons to be able to forgo stream.pipeline is that I can still use pipe methods and provide any readable stream to the from function as an arg. I can then use subscribe method to write/append each thing from the observable to my output stream then call add on my subscription to add teardown logic, specifically for closing my write stream. I would hope that RxJS from would help determine when to close the read stream it's given as input. Finally, I would recommend await lastValueFrom(myObservable) or firstValueFrom possibly.

RxJS from operator
The RxJS from operator will turn an async iterator (like node stream) into an observable for you!
I can't run/test on your code, but something in this ballpark should work.
const fs = require('fs');
const byline = require('byline');
const { from } = require('rxjs');
const { map, take, finalize} = require('rxjs/operators');
const inputFilePath = 'temp.csv';
(async () => {
const data = 'a,b,c\n' +
'1,2,3\n' +
'10,20,30\n' +
'100,200,300';
fs.writeFileSync(inputFilePath, data);
console.log('starting pipeline');
from(byline(fs.createReadStream(inputFilePath)))
.pipe(
map(lineBuffer => lineBuffer.toString()),
take(2),
finalize(() => fs.unlinkSync(inputFilePath))
)
.subscribe(console.log);
})();
Your second async function
I'm not able to explain why the process will exit with code 0 if I don't have another async function (processAfterDeclaration) always trying to pull from the Observable
If you define a function and never call it, that function will never calculate anything.
If you define an observable and never subscribe to it, that observable will never do anything either. That's different from promises which start the moment they're defined. You just need to subscribe to that observable, it doesn't need a separate function though.
This should work the same:
recordObservable = new Observable(async subscriber => {
for await (const lineBuffer of sourceStream) {
subscriber.next(lineBuffer.toString());
}
subscriber.complete();
});
recordObservable.pipe(
take(2)
).subscribe(console.log)

The second async function
I'm not able to explain why the process will exit with code 0 if I
don't have another async function (processAfterDeclaration) always
trying to pull from the Observable
The logic error was that the await pipeline would never resolve or reject because the 3rd step in the pipeline would never yield anything because nothing would ever subscribe and pull from the recordObservable. It was a deadlock that was written accidentally.

Break and continue Node.js readline async for loop

I'm using the Node.js readline interface to read a file line-by-line using an async for-of loop. But I want to be able to control the flow and I'm not sure how to break and continue the loop where it left off.
Simplified example:
const fileStream = fs.createReadStream('input.txt')
const rl = readline.createInterface({
input: fileStream,
crlfDelay: Infinity
})
for await (const line of rl) {
console.log(line) // This works
break
}
for await (const line of rl) {
console.log(line) // This does not print anything
}
See this replit for a complete example.
How do I use the same readline interface to continue the loop where it left off?

Async generators close the underlying stream once you exit the loop. So I went for this instead:
const it = this.lineReader[Symbol.asyncIterator]()
while (true) {
const res = await it.next()
if (res.done) break
line = res.value.trim()
}

Asynchronous file read reading different number of lines each time, not halting

I built a simple asynchronous implementation of the readlines module built into nodejs, which is simply a wrapper around the event-based module itself. The code is below;
const readline = require('readline');
module.exports = {
createInterface: args => {
let self = {
interface: readline.createInterface(args),
readLine: () => new Promise((succ, fail) => {
if (self.interface === null) {
succ(null);
} else {
self.interface.once('line', succ);
}
}),
hasLine: () => self.interface !== null
};
self.interface.on('close', () => {
self.interface = null;
});
return self;
}
}
Ideally, I would use it like so, in code like this;
const readline = require("./async-readline");
let filename = "bar.txt";
let linereader = readline.createInterface({
input: fs.createReadStream(filename)
});
let lines = 0;
while (linereader.hasLine()) {
let line = await linereader.readLine();
lines++;
console.log(lines);
}
console.log("Finished");
However, i've observed some erratic and unexpected behavior with this async wrapper. For one, it fails to recognize when the file ends, and simply hangs once it reaches the last line, never printing "Finished". And on top of that, when the input file is large, say a couple thousand lines, it's always off by a few lines and doesn't successfully read the full file before halting. in a 2000+ line file it could be off by as many as 20-40 lines. If I throw a print statement into the .on('close' listener, I see that it does trigger; however, the program still doesn't recognize that it should no longer have lines to read.

It seems that in nodejs v11.7, the readline interface was given async iterator functionality and can simply be looped through with a for await ... of loop;
const rl = readline.createInterface({
input: fs.createReadStream(filename);
});
for await (const line of rl) {
console.log(line)
}
How to get synchronous readline, or "simulate" it using async, in nodejs?

How to force sequential statement execution in Node?

I am new to Node, so please forgive me if my question is too simple. I fully appreciate the Async paradigm and why it is useful in single threads. But some logical operations are synchronous by nature.
I have found many posts about the async/sync issue, and have been reading for whole two days about callbacks, promises, async/await etc...
But still I cannot figure what should be straight forward and simple thing to do. Am I missing something!
Basically for the code below:
const fs = require('fs');
var readline = require('readline');
function getfile (aprodfile) {
var prodlines = [];
var prodfile = readline.createInterface({input: fs.createReadStream(aprodfile)});
prodfile.on('line', (line) => {prodlines.push(line)});
prodfile.on('close', () => console.log('Loaded '+prodlines.length));
// the above block is repeated several times to read several other
// files, but are omitted here for simplicity
console.log(prodlines.length);
// then 200+ lines that assume prodlines already filled
};
the output I get is:
0
Loaded 11167
whereas the output I expect is:
Loaded 11167
11167
This is because the console.log statement executes before the prodfile.on events are completed.
Is there a nice clean way to tell Node to execute commands sequentially, even if blocking? or better still to tell console.log (and the 200+ lines of code following it) to wait until prodlines is fully populated?

Here's the execution order of what you wrote :
prodfile.on('line', line => { // (1) subscribing to the 'line' event
prodlines.push(line) // (4), whenever the 'line' event is triggered
});
prodfile.on('close', () => { // (2) subscribing to the 'close' event
console.log('Loaded '+prodlines.length) // (5), whenever the 'close' event is triggered
});
console.log(prodlines.length); // (3) --> so it logs 0, nothing has happened yet
What you can do is this :
function getfile (aprodfile) {
var prodlines = [];
var prodfile = readline.createInterface({input: fs.createReadStream(aprodfile)});
prodfile.on('line', line => { prodlines.push(line) });
prodfile.on('close', () => {
console.log('Loaded '+prodlines.length)
finishedFetching( prodlines );
});
};
function finishedFetching( prodlines ) {
console.log(prodlines.length) // 200!
}

Block for stdin in Node.js

Short explanation:
I'm attempting to write a simple game in Node.js that needs to wait for user input every turn. How do I avoid callback hell (e.g. messy code) internal to a turn loop where each turn loop iteration needs to block and wait for input from stdin?
Long explanation:
All the explanations I have read on StackOverflow when someone asks about blocking for stdin input seem to be "that's not what Node.js is about!"
I understand that Node.js is designed to be non-blocking and I also understand why. However I feel that it has me stuck between a rock and a hard place on how to solve this. I feel like I have three options:
Find a way to block for stdin and retain my while loop
Ditch the while loop and instead recursively call a method (like nextTurn) whenever the previous turn ends.
Ditch the while loop and instead use setTimeout(0, ...) or something similar to call a method (like nextTurn) whenever a turn ends.
With option (1) I am going against Node.js principles of non-blocking IO.
With option (2) I will eventually reach a stack overflow as each call adds another turn to the call stack.
With option (3) my code ends up being a mess to follow.
Internal to Node.js there are default functions that are marked **Sync (e.g. see the fs library or the sleep function) and I'm wondering why there is no Sync method for getting user input? And if I were to write something similar to fs.readSync how would I go about doing it and still follow best practices?

Just found this:
https://www.npmjs.com/package/readline-sync
Example code (after doing an npm install readline-sync)
var readlineSync = require('readline-sync');
while(true) {
var yn = readlineSync.question("Do you like having tools that let you code how you want, rather than how their authors wanted?");
if(yn === 'y') {
console.log("Hooray!");
} else {
console.log("Back to callback world, I guess...");
process.exit();
}
}
Only problem so far is the wailing of the "That's not how node is meant to be used!" chorus, but I have earplugs :)

I agree with the comment about moving towards an event based system and would ditch the loops. I've thrown together a quick example of text based processing which can be used for simple text games.
var fs = require('fs'),
es = require('event-stream');
process.stdin
.pipe(es.split())
.on('data', parseCommand);
var actionHandlers = {};
function parseCommand(command) {
var words = command.split(' '),
action = '';
if(words.length > 1) {
action = words.shift();
}
if(actionHandlers[action]) {
actionHandlers[action](words);
} else {
invalidAction(action);
}
}
function invalidAction(action) {
console.log('Unknown Action:', action);
}
actionHandlers['move'] = function(words) {
console.log('You move', words);
}
actionHandlers['attack'] = function(words) {
console.log('You attack', words);
}
You can now break up your actions into discrete functions which you can register with a central actionHandlers variable. This makes adding new commands almost trivial. If you can add some details on why the above approach wouldn't work well for you, let me know and I'll revise the answer.

ArtHare's solution, at least for my use case, blocks background execution, including those started by a promise. While this code isn't elegant, it did block execution of the current function, until the read from stdin completed.
While this code must run from inside an async function, keep in mind that running an async function from a top-level context (directly from a script, not contained within any other function) will block that function until it completes.
Below is a full .js script demonstrating usage, tested with node v8.12.0:
const readline = require('readline');
const sleep = (waitTimeInMs) => new Promise(resolve => setTimeout(resolve, waitTimeInMs));
async function blockReadLine() {
var rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
let result = undefined;
rl.on('line', function(line){
result = line;
})
while(!result) await sleep(100);
return result;
}
async function run() {
new Promise(async () => {
while(true) {
console.log("Won't be silenced! Won't be censored!");
await sleep(1000);
}
});
let result = await blockReadLine();
console.log("The result was:" + result);
process.exit(0);
}
run();

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Node.js stops when reading from multiple Readable streams - node.js

Related

What's the best way to create a RxJS Observable out of objects that are yielded from a node.js stream pipeline?

Break and continue Node.js readline async for loop

Asynchronous file read reading different number of lines each time, not halting

How to force sequential statement execution in Node?

Block for stdin in Node.js

Categories

Resources