Trouble processing data from async calls - node.js

So this is my task:
You must collect the complete content provided to you by each of the URLs and print it to the console (stdout). You don't need to print out the length, just the data as a String; one line per URL. The catch is that you must print them out in the same order as the URLs are provided to you as command-line arguments.
And I'm doing it like this:
var http = require('http');
var dataStream = [];
var dataArr = [];
var count = 0;

/*
Function to print results
#dataArr - array
*/
function printResults(dataArr) {
    for (var i = 0; i < process.argv.length - 2; i++) {
        console.log(dataArr[i]);
    }
}

/*
Function to get data over http
#i - int
Gets the URL from the command-line arguments.
*/
function httpGet(i) {
    http.get(process.argv[2 + i], function(res) {
        res.setEncoding('utf8');
        res.on('data', function(data) {
            dataStream.push(data);
        });
        res.on('end', function() {
            dataArr[i] = dataStream.join("");
            dataStream = [];
            count++;
            if (count == process.argv.length - 2) {
                printResults(dataArr);
            }
        });
        res.on('error', function(e) {
            console.log("Got error: " + e.message);
        });
    });
}

for (var i = 0; i < process.argv.length - 2; i++) {
    httpGet(i);
}
And for some reason it sometimes stores the data in the array as it's supposed to, but sometimes it breaks and outputs complete nonsense.
Some example results:
When working:
$ learnyounode verify program.js
Your submission results compared to the expected:
────────────────────────────────────────────────────────────────────────────────
1. ACTUAL: "Shazza got us some trackies when as stands out like dog's balls. Grab us a show pony heaps he hasn't got a lurk. She'll be right rubbish mate it'll be budgie smugglers. You little ripper bloke heaps we're going top end. He's got a massive bog standard also built like a freckle. "
1. EXPECTED: "Shazza got us some trackies when as stands out like dog's balls. Grab us a show pony heaps he hasn't got a lurk. She'll be right rubbish mate it'll be budgie smugglers. You little ripper bloke heaps we're going top end. He's got a massive bog standard also built like a freckle. "
2. ACTUAL: "As dry as a sook and as dry as a cleanskin. As cunning as a metho where get a dog up ya parma. "
2. EXPECTED: "As dry as a sook and as dry as a cleanskin. As cunning as a metho where get a dog up ya parma. "
3. ACTUAL: "Gutful of gyno how come a mokkies. It'll be clacker and built like a holy dooley!. Get a dog up ya boozer heaps come a captain cook. "
3. EXPECTED: "Gutful of gyno how come a mokkies. It'll be clacker and built like a holy dooley!. Get a dog up ya boozer heaps come a captain cook. "
4. ACTUAL: ""
4. EXPECTED: ""
And an example when it's not working:
$ learnyounode verify program.js
Your submission results compared to the expected:
────────────────────────────────────────────────────────────────────────────────
1. ACTUAL: "of bogan with it'll be rort. He hasn't got a give it a burl flamin you little ripper dinky-di. Watch out for the mate's rate to shazza got us some swag. "
1. EXPECTED: "He's got a massive op shop to you little ripper corker. Gutful of bogan with it'll be rort. He hasn't got a give it a burl flamin you little ripper dinky-di. Watch out for the mate's rate to shazza got us some swag. "
2. ACTUAL: "You little ripper thongs when as stands out like ropeable. Trent from punchy boardies bloody as cunning as a brisvegas. "
2. EXPECTED: "You little ripper thongs when as stands out like ropeable. Trent from punchy boardies bloody as cunning as a brisvegas. "
3. ACTUAL: "As dry as a uluru when come a scratchy. Flat out like a ute with get a dog up ya chrissie. As busy as a fair go no worries it'll be fair dinkum. She'll be right freo when it'll be cracker. He's Watch got out a for massive the op crook shop my to as you busy little as ripper a corker. brumby. Gutful "
3. EXPECTED: "As dry as a uluru when come a scratchy. Flat out like a ute with get a dog up ya chrissie. As busy as a fair go no worries it'll be fair dinkum. She'll be right freo when it'll be cracker. Watch out for the crook my as busy as a brumby. "
4. ACTUAL: ""
4. EXPECTED: ""

Don't treat the data you receive on res.on("data") as an array. Instead, treat it as a string, define it as a variable inside the http.get callback (not as a global variable), and do str += data.
Alternatively, you could look at using a library like async to manage the ordering of the asynchronous functions you need to execute, since the results need to be printed in sequential order.
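A minimal sketch of that first suggestion — a string accumulator local to each request, with the result array indexed by argument position so the output order is preserved (this is not the asker's final code, which follows below):
var http = require('http');
var urls = process.argv.slice(2);
var results = [];
var remaining = urls.length;

urls.forEach(function (url, i) {
    http.get(url, function (res) {
        var str = '';                        // local to this request, so responses can't interleave
        res.setEncoding('utf8');
        res.on('data', function (chunk) { str += chunk; });
        res.on('end', function () {
            results[i] = str;                // the slot index keeps the original URL order
            if (--remaining === 0) {
                results.forEach(function (s) { console.log(s); });
            }
        });
    }).on('error', console.error);
});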

So the problem was:
Your code makes the assumption that the 3 http responses will not
overlap - that a data event for one response will never occur before
the end event of the previous response; this is not always the case. I
recommend you move the definition of your dataStream variable inside
your httpGet function, this way each request/response will have its
own variable and they cannot interfere with each other, regardless of
timing.
So I refactored my solution to look like this, and now it works 100% of the time:
var http = require('http');
var dataArr = [];
var count = 0;

/*
Function to print results
#dataArr - array
*/
function printResults(dataArr) {
    for (var i = 0; i < process.argv.length - 2; i++) {
        console.log(dataArr[i].replace('undefined', ''));
    }
}

/*
Function to get data over http
#i - int
Gets the URL from the command-line arguments.
*/
function httpGet(i) {
    http.get(process.argv[2 + i], function(res) {
        res.setEncoding('utf8');
        res.on('data', function(data) {
            dataArr[i] += data;
        });
        res.on('end', function() {
            count++;
            if (count == process.argv.length - 2) {
                printResults(dataArr);
            }
        });
        res.on('error', function(e) {
            console.log("Got error: " + e.message);
        });
    });
}

for (var i = 0; i < process.argv.length - 2; i++) {
    httpGet(i);
}
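One optional tweak (my own suggestion, not part of the verified solution above): dataArr[i] starts out undefined, so the first += concatenates onto the string "undefined", which is why printResults has to strip it with .replace('undefined', ''). Initializing each slot to an empty string before the request avoids that workaround:
function httpGet(i) {
    dataArr[i] = ''; // start each slot as an empty string so += never sees undefined
    http.get(process.argv[2 + i], function(res) {
        res.setEncoding('utf8');
        res.on('data', function(data) {
            dataArr[i] += data;
        });
        res.on('end', function() {
            count++;
            if (count == process.argv.length - 2) {
                printResults(dataArr); // printResults can then just console.log(dataArr[i])
            }
        });
    });
}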
More info: https://github.com/nodeschool/discussions/issues/1270

Related

CosmosDB insertion loop stops inserting after a certain number of iterations (Node.js)

I'm doing a few tutorials on CosmosDB. I've got the database set up with the Core (SQL) API, and I'm using Node.js to interface with it. For development, I'm using the emulator.
This is the bit of code that I'm running:
const CosmosClient = require('@azure/cosmos').CosmosClient;
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";

const options = {
    endpoint: 'https://localhost:8081',
    key: REDACTED,
    userAgentSuffix: 'CosmosDBJavascriptQuickstart'
};

const client = new CosmosClient(options);

(async () => {
    let cost = 0;
    let i = 0;
    while (i < 2000) {
        i += 1;
        console.log(i + " Creating record, running cost: " + cost);
        let response = await client.database('TestDB').container('TestContainer').items.upsert({}).catch(console.log);
        cost += response.requestCharge;
    }
})()
This, without fail, stops at around iteration 1565 and doesn't continue. I've tried it with different payloads without much difference (it may do a few more or a few fewer iterations, but it almost always seems to be around that number).
On the flipside, a similar .NET Core example works great to insert 10,000 documents:
double cost = 0.0;
int i = 0;
while (i < 10000)
{
    i++;
    ItemResponse<dynamic> resp = await this.container.CreateItemAsync<dynamic>(new { id = Guid.NewGuid() });
    cost += resp.RequestCharge;
    Console.WriteLine("Created item {0} Operation consumed {1} RUs. Running cost: {2}", i, resp.RequestCharge, cost);
}
So I'm not sure what's going on.
So, after a bit of fiddling, this doesn't seem to have anything to do with CosmosDB or its library.
I was running this in the debugger, and Node would just crap out after x iterations. I noticed that if I didn't use console.log it would actually work. Also, if I ran the script with node file.js it also worked. So there seems to be some sort of issue with debugging the script while also printing to the console. Not exactly sure what's up with that, but I'm going to go ahead and mark this as solved.

Async Use of Http Get in Node

For each url in a list of urls, I'm trying to push a string response from that url to an array after requesting it. This basically maps a list of urls to their contents. The catch is, the resulting list must be in the same order as the urls. This is taken from this challenge.
I can't figure out why my solution doesn't work. As far as I can tell, it should put things in the right order. But it doesn't pass the tests.
var http = require('http')
var urls = process.argv.slice(2, process.argv.length)
var result = Array(3)

urls.forEach((url, ind) => {
    var str = ''
    http.get(url, response => {
        response.on('data', d => {
            str += d.toString()
        })
        response.on('end', () => {
            result[ind] = str
        })
        if (result.every(val => { return val.length !== 0 })) {
            result.forEach(s => { console.log(s) })
        }
    })
})
Here are the workshopper's test results:
ACTUAL: "She'll be right slaps how he's got a massive jillaroo. As busy as a esky flamin built like a hottie. "
EXPECTED: "You little ripper shag on a rock and trent from punchy bottlo. As cunning as a compo no dramas lets get some gyno. Lets get
some durry when built like a cranky. "
ACTUAL: "You little ripper shag on a rock and trent from punchy bottlo. As cunning as a compo no dramas lets get some gyno. Lets get
some durry when built like a cranky. "
EXPECTED: "She'll be right slaps how he's got a massive jillaroo. As busy as a esky flamin built like a hottie. "
ACTUAL: "She'll be right slaps how he's got a massive jillaroo. As busy as a esky flamin built like a hottie. "
EXPECTED: "We're going no dramas heaps trent from punchy christmas. As busy as a ironman mate stands out like a thingo. As dry
as a cream to stands out like a rip snorter. Watch out for the ratbag
with gutful of cut lunch. Shazza got us some shag on a rock with trent
from punchy throw-down. "
ACTUAL: ""
EXPECTED: ""

NODEJS: Uncork() method on writable stream doesn't really flush the data

I am writing a quite simple application to transform data - read one file and write to another. The files are relatively large - 2 GB. However, what I found is that the flush to the file system does not happen on each cork-uncork cycle; it only happens on end(), so end() basically hangs the system until everything is fully flushed.
I simplified the example so it just writes a line to the stream a lot of times.
var PREFIX = 'E:\\TEST\\';
var line = 'AA 11 999999999 20160101 123456 20160101 AAA 00 00 00 0 0 0 2 2 0 0 20160101 0 00';
var fileSystem = require('fs');

function writeStrings() {
    var stringsCount = 0;
    var stream = fileSystem.createWriteStream(PREFIX + 'output.txt');
    stream.once('drain', function () {
        console.log("drained");
    });
    stream.once('open', function (fileDescriptor) {
        var started = false;
        console.log('writing file ');
        stream.cork();
        for (var i = 0; i < 2000000; i++) {
            stream.write(line + i);
            if (i % 10000 == 0) {
                // console.log('passed ', i);
            }
            if (i % 100000 == 0) {
                console.log('uncorked ', i, stream._writableState.writing);
                stream.uncork();
                stream.cork();
            }
        }
        stream.end();
    });
    stream.once('finish', function () {
        console.log("done");
    });
}
writeStrings();
Going inside node's _stream_writable.js, I found that it flushes the buffer only on this condition:
if (!state.writing &&
!state.corked &&
!state.finished &&
!state.bufferProcessing &&
state.buffer.length)
clearBuffer(this, state);
and, as you can see from the example, the writing flag is not reset after the first uncork(), which prevents the uncork from flushing.
Also, I don't see any drain events firing at all. Playing with highWaterMark doesn't help (it actually doesn't seem to have an effect on anything). Manually setting writing to false (plus some other flags) did help, but that is surely wrong.
Am I misunderstanding the concept here?
From the node.js documentation I found that the number of uncork() calls should match the number of cork() calls. I am not seeing a matching stream.uncork() call for the stream.cork() that is called before the for loop. That might be the issue.
According to a guide on nodejs.org, you aren't supposed to call stream.uncork() twice in the same event loop tick. Here is an excerpt:
// Using .uncork() twice here makes two calls on the C++ layer, rendering the
// cork/uncork technique useless.
ws.cork();
ws.write('hello ');
ws.write('world ');
ws.uncork();
ws.cork();
ws.write('from ');
ws.write('Matteo');
ws.uncork();
// The correct way to write this is to utilize process.nextTick(), which fires
// on the next event loop.
ws.cork();
ws.write('hello ');
ws.write('world ');
process.nextTick(doUncork, ws);
ws.cork();
ws.write('from ');
ws.write('Matteo');
process.nextTick(doUncork, ws);
// as a global function
function doUncork(stream) {
    stream.uncork();
}
.cork() can be called as many times as we want; we just need to be careful to call .uncork() the same number of times to make the data flow again.
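To make that concrete for the code in the question, here is a rough sketch (my own adaptation, not from the answers above) that writes in corked batches, uncorks on the next tick as the guide recommends, and yields to the event loop between batches so the buffered data can actually be flushed:
var fs = require('fs');
var line = 'AA 11 999999999 20160101 123456 20160101 AAA 00 00 00 0 0 0 2 2 0 0 20160101 0 00';
var stream = fs.createWriteStream('output.txt'); // the output path here is a placeholder
var TOTAL = 2000000;
var BATCH = 100000;
var written = 0;

function writeBatch() {
    stream.cork();
    var end = Math.min(written + BATCH, TOTAL);
    for (; written < end; written++) {
        stream.write(line + written);
    }
    // uncork on the next tick so the whole batch goes out in one flush
    process.nextTick(function () {
        stream.uncork();
        if (written < TOTAL) {
            setImmediate(writeBatch); // let the event loop run before the next batch
        } else {
            stream.end();
        }
    });
}

stream.once('finish', function () {
    console.log('done');
});

writeBatch();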

Fast file copy with progress information in Node.js?

Is there any way to copy large files with Node.js quickly and with progress information?
Solution 1: fs.createReadStream().pipe(...) = useless, up to 5 times slower than native cp
See: Fastest way to copy file in node.js; progress information is possible (with the npm package 'progress-stream'):
fs = require('fs');
fs.createReadStream('test.log').pipe(fs.createWriteStream('newLog.log'));
The only problem with that approach is that it easily takes 5 times longer compared to "cp source dest". See also the appendix below for the full test code.
Solution 2: rsync --info=progress2 = just as slow as solution 1 = useless
Solution 3: My last resort, write a native module for node.js, using "CoreUtils" (the linux sources for cp and others) or other functions as shown in Fast file copy with progress
Does anyone know anything better than solution 3? I'd like to avoid native code, but it seems the best fit.
Thanks! Any package recommendations or hints (I've tried everything fs*) are welcome!
Appendix:
test code, using pipe and progress:
var path = require('path');
var progress = require('progress-stream');
var fs = require('fs');

var _source = path.resolve('../inc/big.avi'); // 1.5GB
var _target = '/tmp/a.avi';

var stat = fs.statSync(_source);
var str = progress({
    length: stat.size,
    time: 100
});
str.on('progress', function(progress) {
    console.log(progress.percentage);
});

function copyFile(source, target, cb) {
    var cbCalled = false;
    var rd = fs.createReadStream(source);
    rd.on("error", function(err) {
        done(err);
    });
    var wr = fs.createWriteStream(target);
    wr.on("error", function(err) {
        done(err);
    });
    wr.on("close", function(ex) {
        done();
    });
    rd.pipe(str).pipe(wr);

    function done(err) {
        if (!cbCalled) {
            console.log('done');
            cb && cb(err);
            cbCalled = true;
        }
    }
}
copyFile(_source, _target);
Update: a fast (with detailed progress!) C version is implemented here: https://github.com/MidnightCommander/mc/blob/master/src/filemanager/file.c#L1480. It seems like the best place to start from :-)
One aspect that may slow down the process is console.log itself. Take a look at this code:
const fs = require('fs');
const sourceFile = 'large.exe';
const destFile = 'large_copy.exe';

console.time('copying');
fs.stat(sourceFile, function(err, stat) {
    const filesize = stat.size;
    let bytesCopied = 0;

    const readStream = fs.createReadStream(sourceFile);
    readStream.on('data', function(buffer) {
        bytesCopied += buffer.length;
        let percentage = ((bytesCopied / filesize) * 100).toFixed(2);
        console.log(percentage + '%'); // run once with this line and later with it commented out
    });
    readStream.on('end', function() {
        console.timeEnd('copying');
    });
    readStream.pipe(fs.createWriteStream(destFile));
});
Here are the execution times copying a 400mb file:
with console.log: 692.950ms
without console.log: 382.540ms
cpy and cp-file both support progress reporting
I have the same issue. I want to copy large files as fast as possible and want progress information. I created a test utility that tests the different copy methods:
https://www.npmjs.com/package/copy-speed-test
You can run it simply with:
npx copy-speed-test --source someFile.zip --destination someNonExistentFolder
It does a native copy using child_process.exec(), a copy using fs.copyFile, and it uses createReadStream with a variety of different buffer sizes (you can change the buffer sizes by passing them on the command line; run npx copy-speed-test -h for more info).
Some things I learnt:
fs.copyFile is just as fast as native
you can get quite inconsistent results on all these methods, particularly when copying from and to the same disc and with SSDs
if using a large buffer then createReadStream is nearly as good as the other methods
if you use a very large buffer then the progress is not very accurate.
The last point is because the progress is based on the read stream, not the write stream. If you are copying a 1.5GB file and your buffer is 1GB, the progress immediately jumps to 66%, then jumps to 100%, and you then have to wait whilst the write stream finishes writing. I don't think you can display the progress of the write stream.
If you have the same issue I would recommend that you run these tests with file sizes similar to what you will be dealing with, and across similar media. My end use case is copying a file from an SD card plugged into a Raspberry Pi across a network to a NAS, so that's the scenario I ran the tests for.
I hope someone other than me finds it useful!
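For reference, the fs.copyFile call mentioned in the first point above is a one-liner; it reports no progress on its own, and the file names here are just placeholders:
const fs = require('fs');

// copies the source to the destination, overwriting the destination if it already exists
fs.copyFile('someFile.zip', 'someFile_copy.zip', function (err) {
    if (err) throw err;
    console.log('copy finished');
});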
I solved a similar problem (using Node v8 or v10) by changing the buffer size. I think the default buffer size is around 16kb, which fills and empties quickly but requires a full cycle around the event loop for each operation. I changed the buffer to 1MB and writing a 2GB image fell from taking around 30 minutes to 5, which sounds similar to what you are seeing. My image was also decompressed on the fly, which possibly exacerbated the problem. Documentation on stream buffering has been in the manual since at least Node v6: https://nodejs.org/api/stream.html#stream_buffering
Here are the key code components you can use:
let gzSize = 1; // do not initialize divisors to 0
let offset = 0; // running byte count for the progress output
let outStream; // created once the target device is open
const hwm = { highWaterMark: 1024 * 1024 };
const inStream = fs.createReadStream( filepath, hwm );

// Capture the filesize for showing percentages
inStream.on( 'open', function fileOpen( fdin ) {
    inStream.pause(); // wait for fstat before starting
    fs.fstat( fdin, function( err, stats ) {
        gzSize = stats.size;
        // openTargetDevice does a complicated fopen() for the output.
        // This could simply be inStream.resume()
        openTargetDevice( gzSize, targetDeviceOpened );
    });
});

inStream.on( 'data', function shaData( data ) {
    const bytesRead = data.length;
    offset += bytesRead;
    console.log( `Read ${offset} of ${gzSize} bytes, ${Math.floor( offset * 100 / gzSize )}% ...` );
    // Write to the output file, etc.
});

// Once the target is open, I convert the fd to a stream and resume the input.
// For the purpose of example, note only that the output has the same buffer size.
function targetDeviceOpened( error, fd, device ) {
    if( error ) return exitOnError( error );
    const writeOpts = Object.assign( { fd }, hwm );
    outStream = fs.createWriteStream( undefined, writeOpts );
    outStream.on( 'open', function fileOpen( fdin ) {
        // In a simpler structure, this is in the fstat() callback.
        inStream.resume(); // we have the _input_ size, resume read
    });
    // [...]
}
I have not made any attempt to optimize these further; the result is similar to what I get on the commandline using 'dd' which is my benchmark.
I left in converting a file descriptor to a stream and using the pause/resume logic so you can see how these might be useful in more complicated situations than the simple fs.statSync() in your original post. Otherwise, this is simply adding the highWaterMark option to Tulio's answer.
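For illustration only (this isn't code from either answer), here is the earlier timing snippet with the same highWaterMark option applied to both streams; the file names are placeholders:
const fs = require('fs');
const hwm = { highWaterMark: 1024 * 1024 }; // 1 MB buffers instead of the small defaults
const sourceFile = 'large.exe';
const destFile = 'large_copy.exe';

console.time('copying');
fs.stat(sourceFile, function (err, stat) {
    if (err) throw err;
    const filesize = stat.size;
    let bytesCopied = 0;

    const readStream = fs.createReadStream(sourceFile, hwm);
    readStream.on('data', function (buffer) {
        bytesCopied += buffer.length;
        console.log(((bytesCopied / filesize) * 100).toFixed(2) + '%');
    });
    readStream.on('end', function () {
        console.timeEnd('copying');
    });
    readStream.pipe(fs.createWriteStream(destFile, hwm));
});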
Here is what I'm trying to use now; it copies one file with progress:
String.prototype.toHHMMSS = function () {
    var sec_num = parseInt(this, 10); // don't forget the second param
    var hours = Math.floor(sec_num / 3600);
    var minutes = Math.floor((sec_num - (hours * 3600)) / 60);
    var seconds = sec_num - (hours * 3600) - (minutes * 60);
    if (hours < 10) { hours = "0" + hours; }
    if (minutes < 10) { minutes = "0" + minutes; }
    if (seconds < 10) { seconds = "0" + seconds; }
    return hours + ':' + minutes + ':' + seconds;
}

var purefile = "20200811140938_0002.MP4";
var filename = "/sourceDir/" + purefile;
var output = "/destinationDir/" + purefile;

var progress = require('progress-stream');
var fs = require('fs');

const convertBytes = function(bytes) {
    const sizes = ["Bytes", "KB", "MB", "GB", "TB"];
    if (bytes == 0) {
        return "n/a";
    }
    const i = parseInt(Math.floor(Math.log(bytes) / Math.log(1024)));
    if (i == 0) {
        return bytes + " " + sizes[i];
    }
    return (bytes / Math.pow(1024, i)).toFixed(1) + " " + sizes[i];
}

var copiedFileSize = fs.statSync(filename).size;
var str = progress({
    length: copiedFileSize, // length(integer) - If you already know the length of the stream, then you can set it. Defaults to 0.
    time: 200,              // time(integer) - Sets how often progress events are emitted in ms. If omitted then the default is to do so every time a chunk is received.
    speed: 1,               // speed(integer) - Sets how long the speedometer needs to calculate the speed. Defaults to 5 sec.
    // drain: true          // drain(boolean) - In case you don't want to include a readstream after progress-stream, set to true to drain automatically. Defaults to false.
    // transferred: false   // transferred(integer) - If you want to set the size of previously downloaded data. Useful for a resumed download.
});

/*
{
    percentage: 9.05,
    transferred: 949624,
    length: 10485760,
    remaining: 9536136,
    eta: 42,
    runtime: 3,
    delta: 295396,
    speed: 949624
}
*/
str.on('progress', function(progress) {
    console.log(progress.percentage + '%');
    console.log('elapsed: ' + progress.runtime.toString().toHHMMSS() + 's / remaining: ' + progress.eta.toString().toHHMMSS() + 's');
    console.log(convertBytes(progress.speed) + "/s" + ' ' + progress.speed);
});

// const hwm = { highWaterMark: 1024 * 1024 };
var hrstart = process.hrtime(); // measure the copy time
var rs = fs.createReadStream(filename)
    .pipe(str)
    .pipe(fs.createWriteStream(output, { emitClose: true }).on("close", () => {
        var hrend = process.hrtime(hrstart);
        var timeInSeconds = (hrend[0] * 1000000000 + hrend[1]) / 1000000000;
        var finalSpeed = convertBytes(copiedFileSize / timeInSeconds);
        console.log('Done: file copy: ' + finalSpeed + "/s");
        console.info('Execution time (hr): %ds %dms', hrend[0], hrend[1] / 1000000);
    }));
Refer to https://www.npmjs.com/package/fsprogress.
With that package, you can track progress while you are copying or moving files. The progress tracking is event and method-call based, so it's very convenient to use.
You can provide options to do a lot of things, e.g. the total number of files for concurrent operations, or the chunk size to read from a file at a time.
It was tested with a single file up to 17GB and with directories up to a size I don't really remember, but it was pretty large. And also :D, it is safe to use for large file(s).
So go ahead and have a look at it to see whether it matches your expectations or is what you are looking for :D

Why does the Node.js scripts console close instantly in Windows 8?

I've tried nearly every example for scripts I can find. Every sample opens the terminal for a split second. Even this closes as soon as input is entered. Is this normal?
var rl = require('readline');
var prompts = rl.createInterface(process.stdin, process.stdout);

prompts.question("How many servings of fruits and vegetables do you eat each day? ", function (servings) {
    var message = '';
    if (servings < 5) {
        message = "Since you're only eating " + servings +
            " right now, you might want to start eating " + (5 - servings) + " more.";
    } else {
        message = "Excellent, your diet is on the right track!";
    }
    console.log(message);
    process.exit();
});
There are 2 options that control this in Tools/Options/Node.js Tools/General:
Wait for input when process exits abnormally
Wait for input when process exits normally
Taken from https://nodejstools.codeplex.com/discussions/565665
