SharpJS pipe memory leak on linux - node.js

I'm reading from an fs read stream (a file from a GraphQL upload), piping it through a SharpJS resize transform, and then piping that into a write stream to save it to a file.
On my system (Windows) it works just fine, but on my host (Linux) it creates a 500 MB core dump file and the images it creates are 0 KB in size.
const transform = (dimen) => sharp().resize(dimen, dimen)
const fs512 = fs.createWriteStream(addSuffix(filePath))
const fs256 = fs.createWriteStream(addSuffix(filePath))
const fs128 = fs.createWriteStream(addSuffix(filePath))
await stream.pipe(transform(512)).pipe(fs512)
await stream.pipe(transform(256)).pipe(fs256)
await stream.pipe(transform(128)).pipe(fs128)
I have tried listening for the finish event and closing the streams, but it didn't work.
I think the issue is caused by SharpJS: if I remove the sharp transform from the pipe chain, it works (without resizing).
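There is no accepted answer here, but a common pattern for producing several sizes from one input is to pipe the source into a single sharp instance and clone() it once per output, then wait for each write with stream.pipeline (from stream/promises in recent Node versions). A minimal sketch, assuming the question's addSuffix helper takes a size argument (that signature is my guess, not from the original):
const fs = require('fs')
const sharp = require('sharp')
const { pipeline } = require('stream/promises')

// Assumed shape of the question's addSuffix helper (the real one isn't shown).
const addSuffix = (filePath, dimen) => `${filePath}-${dimen}`

async function writeResized(stream, filePath) {
  // One shared sharp instance decodes the upload once; each clone() reuses that input.
  const base = sharp()
  const sizes = [512, 256, 128]

  // Set up one resized output per size before the input starts flowing.
  const writes = sizes.map(dimen =>
    pipeline(
      base.clone().resize(dimen, dimen),
      fs.createWriteStream(addSuffix(filePath, dimen))
    )
  )

  // Feed the upload stream into the shared sharp instance once.
  stream.pipe(base)

  // Resolves only when every resized file has been fully written to disk.
  await Promise.all(writes)
}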

Related

nodejs - log to console stdio and file using only core modules

My application is simple and I want to avoid using a logging library like Winston. I need to log output both to the console and to a file. I found a few tutorials on how to do this using a child process, such as this, but I can't find anything that leverages the main process's stdio, like process.stdout and process.stdin.
The key to solving this was recognizing that process.stdout and process.stderr are writable streams, whereas a child process's stdio streams (via the child_process module) are readable streams (thanks to this article). Therefore I needed to create both writable and readable file streams and pipe the readable streams out to process.stdout and process.stderr. You could probably simplify this even further with a duplex stream, but for noobs like myself, this is a straightforward and easy-to-read approach.
const { Console } = require("console")
, process = require("process")
, path = require("path")
, fs = require('fs');
// Define the file paths to log to
const outputFilePath = path.join(__dirname, './stdout.log');
const errorFilePath = path.join(__dirname, './stderr.log');
// Create the empty files synchronously to guarantee they exist prior to stream creation.
// Change flag to 'w' to overwrite rather than append.
fs.closeSync(fs.openSync(outputFilePath, 'a+'));
fs.closeSync(fs.openSync(errorFilePath, 'a+'));
// Create a writable file stream for both stdout and stderr
const fileWriterOut = fs.createWriteStream(outputFilePath);
const fileWriterErr = fs.createWriteStream(errorFilePath);
// Create a new Console object using the file writers
const Logger = new Console({ stdout: fileWriterOut, stderr: fileWriterErr });
// Create readable file streams for process.stdio to consume
const fileReaderOut = fs.createReadStream(outputFilePath);
const fileReaderErr = fs.createReadStream(errorFilePath);
// Pipe out the file reader into process stdio
fileReaderOut.pipe(process.stdout);
fileReaderErr.pipe(process.stderr);
// Test the new logger
Logger.log("Logger initialized");
// Export
module.exports = Logger;
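If this module lives in, say, logger.js (the filename is an assumption, not from the original answer), other modules can then use it like any Console instance:
const Logger = require('./logger');

Logger.log('app started');            // written to stdout.log
Logger.error('something went wrong'); // written to stderr.log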

Write multiple files to http response with streams in nodejs

I have an array of files that I have to pack into a gzipped tar archive and send through the HTTP response on the fly. That means I can't keep whole files in memory, yet I have to pipe them into pack.entry one at a time, sequentially, or everything is going to break.
const tar = require('tar-stream'); //lib for tar stream
const { createGzip } = require('zlib'); //lib for gzip stream
//large list of huge files.
const files = [ 'file1', 'file2', 'file3', ..., 'file99999' ];
...
//http request handler:
const pack = tar.pack(); //tar stream, creates .tar
const gzipStream = createGzip(); //gzip stream so we could reduce the size
//pipe archive data trough gzip stream
//and send it to the client on the fly
pack.pipe(gzipStream).pipe(response);
//The issue comes here, when I need to pass multiple files to pack.entry
files.forEach(name => {
  const src = fs.createReadStream(name);    // create a stream from the file
  const size = fs.statSync(name).size;      // determine its size
  const entry = pack.entry({ name, size }); // create the tar entry
  // and this ruins everything, because if two different streams
  // write something into the entry, it'll fail and throw an error
  src.pipe(entry);
});
Basically I need each pipe to finish sending its data before the next one starts (something like await src.pipe(entry);), but pipes in Node.js don't work that way. So is there any way I could get around it?
Nevermind, just don't use forEach in this case
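The answer above doesn't show code; the sequential version it hints at might look roughly like this, replacing forEach with a for...of loop and awaiting each entry before creating the next (the use of stream.pipeline from stream/promises and the sendArchive wrapper are my own, not from the original post):
const fs = require('fs');
const tar = require('tar-stream');
const { createGzip } = require('zlib');
const { pipeline } = require('stream/promises');

async function sendArchive(files, response) {
  const pack = tar.pack();
  const gzipStream = createGzip();

  // Start streaming the archive to the client right away.
  pack.pipe(gzipStream).pipe(response);

  // Add entries strictly one after another: each entry must be fully
  // written before the next one is created.
  for (const name of files) {
    const size = (await fs.promises.stat(name)).size;
    const entry = pack.entry({ name, size });
    await pipeline(fs.createReadStream(name), entry);
  }

  // No more entries; this ends the tar stream (and, in turn, gzip and the response).
  pack.finalize();
}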

docker stats: block i/o incorrect for node.js app

A Node.js app inside a Docker container.
The app only periodically writes some data to a file on a volume mounted into the container (defined in my docker-compose.yml).
I tried both fs.writeFileSync and fs.writeSync.
Both ways result in correct data in the file. However, if I use the second way, docker stats reports incorrect (zero) Block I/O output for the container. Why?
1)
const fs = require('fs')
setInterval(() => fs.writeFileSync('/data/file.txt', (new Array(1024)).join('.')), 100)
2)
const fs = require('fs')
const fd = fs.openSync(`/data/file.txt`, 'w')
setInterval(() => fs.writeSync(fd, (new Array(1024)).join('.')), 100)

gunzip partials read from read-stream

I use Node.js to fetch files from my S3 bucket.
The files over there are gzipped (gz).
I know that the contents of each file are composed of lines, where each line is a JSON representation of some record that failed to be put on Kinesis.
Each file consists of ~12K such records, and I would like to be able to process the records while the file is being downloaded.
If the file were not gzipped, that could easily be done using streams and the readline module.
So the only thing stopping me from doing this is the gunzip step which, to my knowledge, needs to be run on the whole file.
Is there any way of gunzipping a partial of a file?
Thanks.
EDIT 1: (bad example)
Trying what @Mark Adler suggested:
const fileStream = s3.getObject(params).createReadStream();
const lineReader = readline.createInterface({input: fileStream});
lineReader.on('line', line => {
  const gunzipped = zlib.gunzipSync(line);
  console.log(gunzipped);
});
I get the following error:
Error: incorrect header check
at Zlib._handle.onerror (zlib.js:363:17)
Yes. node.js has a complete interface to zlib, which allows you to decompress as much of a gzip file at a time as you like.
A working example that solves the above problem
The following solves the problem in the above code:
const fileStream = s3.getObject(params).createReadStream().pipe(zlib.createGunzip());
const lineReader = readline.createInterface({input: fileStream});
lineReader.on('line', gunzippedLine => {
  console.log(gunzippedLine);
});
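Since the question mentions that each line is a JSON record, the handler above could parse each line directly; the JSON.parse call and the processRecord name below are my additions, not part of the original answer:
lineReader.on('line', gunzippedLine => {
  // Each decompressed line is expected to be a single JSON record.
  const record = JSON.parse(gunzippedLine);
  processRecord(record); // hypothetical per-record handler
});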

How to pipe one readable stream into two writable streams at once in Node.js?

The goal is to:
1) Create a file read stream.
2) Pipe it to gzip (zlib.createGzip()).
3) Then pipe the zlib output stream to:
3.1) the HTTP response object,
3.2) and a writable file stream to save the gzipped output.
Now I can do everything down to 3.1:
var gzip = zlib.createGzip(),
    sourceFileStream = fs.createReadStream(sourceFilePath),
    targetFileStream = fs.createWriteStream(targetFilePath);
response.setHeader('Content-Encoding', 'gzip');
sourceFileStream.pipe(gzip).pipe(response);
... which works fine, but I also need to save the gzipped data to a file, so that I don't have to re-gzip it every time and can stream the already-gzipped data directly as the response.
So how do I pipe one readable stream into two writable streams at once in Node?
Would sourceFileStream.pipe(gzip).pipe(response).pipe(targetFileStream); work in Node 0.8.x?
Pipe chaining/splitting doesn't work the way you're trying to do it here, sending the output of one stream to two different subsequent steps:
sourceFileStream.pipe(gzip).pipe(response);
However, you can pipe the same readable stream into two writable streams, e.g.:
var fs = require('fs');
var source = fs.createReadStream('source.txt');
var dest1 = fs.createWriteStream('dest1.txt');
var dest2 = fs.createWriteStream('dest2.txt');
source.pipe(dest1);
source.pipe(dest2);
I found that zlib returns a readable stream which can later be piped into multiple other streams. So I did the following to solve the above problem:
var sourceFileStream = fs.createReadStream(sourceFile);
// Even though we could chain like
// sourceFileStream.pipe(zlib.createGzip()).pipe(response);
// we need a stream with a gzipped data to pipe to two
// other streams.
var gzip = sourceFileStream.pipe(zlib.createGzip());
// This will pipe the gzipped data to response object
// and automatically close the response object.
gzip.pipe(response);
// Then I can pipe the gzipped data to a file.
gzip.pipe(fs.createWriteStream(targetFilePath));
You can use the "readable-stream-clone" package:
const fs = require("fs");
const ReadableStreamClone = require("readable-stream-clone");
const readStream = fs.createReadStream('text.txt');
const readStream1 = new ReadableStreamClone(readStream);
const readStream2 = new ReadableStreamClone(readStream);
const writeStream1 = fs.createWriteStream('sample1.txt');
const writeStream2 = fs.createWriteStream('sample2.txt');
readStream1.pipe(writeStream1)
readStream2.pipe(writeStream2)
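If you'd rather avoid the extra dependency, a similar effect can be had with core PassThrough streams; this variant is my own, not part of the original answers. Each branch is a fresh readable that can be piped onward (through further transforms, for example) on its own:
const fs = require('fs');
const { PassThrough } = require('stream');

const readStream = fs.createReadStream('text.txt');

// One PassThrough per consumer; the source is piped into both.
const branch1 = new PassThrough();
const branch2 = new PassThrough();
readStream.pipe(branch1);
readStream.pipe(branch2);

branch1.pipe(fs.createWriteStream('sample1.txt'));
branch2.pipe(fs.createWriteStream('sample2.txt'));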
