Using Electron's net module, the aim is to fetch a resource and, once the response is received, to pipe it to a writeable stream like so:
const stream = await fetchResource('someUrl');
stream.pipe(fs.createWriteStream('./someFilepath'));
A simplified implementation of fetchResource is as follows:
import { net } from 'electron';
async function fetchResource(url) {
return new Promise((resolve, reject) => {
const data = [];
const request = net.request(url);
request.on('response', response => {
response.on('data', chunk => {
data.push(chunk);
});
response.on('end', () => {
// Maybe do some other stuff with data...
});
// Return the response to then pipe...
resolve(response);
});
request.end();
});
}
The response ends up being an instance of IncomingMessage, which implements a Readable Stream interface according to the node docs, so it should be able to be piped to a write stream.
The primary issue is that there ends up being no data in the stream that gets piped through 😕
Answering my own question: the issue is consuming the stream in two places, via the resolved promise and via the 'data' event listener. The 'data' listener was draining all the data before the consumer of the resolved promise could get to it.
A solution is to fork the stream into a new one that won't compete with the original if more than one consumer tries to read from it.
import stream from 'stream';
// ...make a request and get a response stream, then fork the stream...
const streamToResolve = response.pipe(new stream.PassThrough());
// Listen to events on response and pipe from it
// ...
// Resolve streamToResolve and separately pipe from it
// ...
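Applied to the simplified fetchResource above, a minimal sketch of this fork looks like the following (error handling kept to a bare minimum):
import { net } from 'electron';
import stream from 'stream';
async function fetchResource(url) {
    return new Promise((resolve, reject) => {
        const data = [];
        const request = net.request(url);
        request.on('response', response => {
            // Fork the response so the listeners below don't drain the data
            // that the caller will pipe to its write stream
            const streamToResolve = response.pipe(new stream.PassThrough());
            response.on('data', chunk => {
                data.push(chunk);
            });
            response.on('end', () => {
                // Maybe do some other stuff with data...
            });
            // Resolve the forked stream; the caller pipes from this one
            resolve(streamToResolve);
        });
        request.on('error', reject);
        request.end();
    });
}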
Related
I am unable to understand how the event loop is processing my snippet. What I am trying to achieve is:
to read from a csv
download a resource found in the csv
upload it to s3
write it into a new csv file
const readAndUpload = () => {
fs.createReadStream('filename.csv')
.pipe(csv())
.on('data', ((row: any) => {
const file = fs.createWriteStream("file.jpg");
var url = new URL(row.imageURL)
// choose whether to make an http or https request
let client = (url.protocol=="https:") ? https : http
const request = client.get(row.imageURL, function(response:any) {
// file save
response.pipe(file);
console.log('file saved')
let filePath = "file.jpg";
let params = {
Bucket: 'bucket-name',
Body : fs.createReadStream(filePath),
Key : "filename.jpg"
};
// upload to s3
s3.upload(params, function (err: any, data: any) {
//handle error
if (err) {
console.log("Error", err);
}
//success
if (data) {
console.log("Uploaded in:", data.Location);
row.imageURL = data.Location
writeData.push(row)
// console.log(writeData)
}
});
});
}))
.on('end', () => {
console.log("done reading")
const csvWriter = createCsvWriter({
path: 'out.csv',
header: [
{id: 'id', title: 'some title'}
]
});
csvWriter
.writeRecords(writeData)
.then(() => console.log("The CSV file was written successfully"))
})
}
Going by the log statements that I have added, done reading and The CSV file was written successfully are printed before file saved. My understanding was that the end event is called after the data event, so I am unsure of where I am going wrong.
Thank you for reading!
I'm not sure if this is part of the problem or not, but you've got an extra set of parens in this part of the code. Change this:
.on('data', ((row: any) => {
.....
})).on('end', () => {
to this:
.on('data', (row: any) => {
.....
}).on('end', () => {
And, if the event handlers are set up properly, your .on('data', ...) event handler gets called before the .on('end', ....) for the same stream. If you put this:
console.log('at start of data event handler');
as the first line in that event handler, you will see it get called first.
But your data event handler uses multiple asynchronous calls, and nothing in your code makes the end event wait for all of that processing to finish. Since that processing takes a while, it's natural that the end event fires before you're done running all the asynchronous code started by the data events.
In addition, if you can ever get more than one data event (which you normally would), you're going to have multiple data events in flight at the same time, and since you're using a fixed filename, they will probably overwrite each other.
The usual way to solve something like this is to use stream.pause() to pause the read stream at the start of the data event processing and then, when all your asynchronous work is done, call stream.resume() to let it start flowing again.
You will need to get the right stream in order to pause and resume. You could do something like this:
let stream = fs.createReadStream('filename.csv')
.pipe(csv());
stream.on('data', (row: any) => {
stream.pause();
....
});
Then, way inside your s3.upload() callback, you can call stream.resume(). You will also need much, much better error handling than you have, or things will just get stuck if you hit an error.
It also looks like you have another concurrency issue where you call:
response.pipe(file);
And you then attempt to use the file without actually waiting for that .pipe() operation to finish (which is also asynchronous). Overall, this whole logic really needs a major cleanup; without knowing exactly what you're trying to do in each of the different steps, I can't write a totally clean and simpler version.
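To illustrate, here is a rough sketch of that pause/resume flow (not a drop-in replacement: it reuses the question's csv(), s3, writeData, http/https and fs objects, keeps the fixed file name, and still leaves out real error handling):
const stream = fs.createReadStream('filename.csv')
    .pipe(csv());
stream.on('data', (row) => {
    // Stop new rows until this one has been fully processed
    stream.pause();
    const url = new URL(row.imageURL);
    const client = (url.protocol === 'https:') ? https : http;
    const file = fs.createWriteStream('file.jpg');
    client.get(row.imageURL, (response) => {
        response.pipe(file);
        // Wait until the download has actually been flushed to disk
        file.on('finish', () => {
            const params = {
                Bucket: 'bucket-name',
                Body: fs.createReadStream('file.jpg'),
                Key: 'filename.jpg'
            };
            s3.upload(params, (err, data) => {
                if (err) {
                    console.log('Error', err);
                } else {
                    row.imageURL = data.Location;
                    writeData.push(row);
                }
                // Only now let the next row through
                stream.resume();
            });
        });
    });
});
stream.on('end', () => {
    // Because each row is processed before the stream is resumed, 'end' fires
    // only after the last row's upload has finished, so the output CSV can be
    // written here with csvWriter as in the original code.
    console.log('done reading');
});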
I am trying to fetch a PDF URL as a stream from axios. I need to then upload that file to another location and return the hash of the uploaded file. I have a third-party function which accepts the stream and uploads the file to the target location. How can I use the same stream to get the hash of the file?
I am trying to run the code below:
const getFileStream = await axios.get<ReadStream>(externalUrl, {
responseType: "stream"
});
const hashStream = crypto.createHash("md5");
hashStream.setEncoding("hex");
const pHash = new Promise<string>(resolve => {
getFileStream.data.on("finish", () => {
resolve(hashStream.read());
});
});
const pUploadedFile = externalUploader({
stream: () => getFileStream.data
});
getFileStream.data.pipe(hashStream);
const [hash, uploadedFile] = await Promise.all([pHash, pUploadedFile]);
return { hash, id: uploadedFile.id };
After running this code, when I download the same PDF, I get a corrupted file.
You can reuse the same axios getFileStream.data to pipe to multiple sinks as long as they are consumed simultaneously.
Below is an example of downloading a file using an axios stream and "concurrently" calculating the MD5 checksum of the file while uploading it to a remote server.
The example writes the following to stdout:
Incoming file checksum: 82c12f208ea18bbeed2d25170f3669a5
File uploaded. Awaiting server response...
File uploaded. Done.
Working example:
const { Writable, Readable, Transform, pipeline } = require('stream');
const crypto = require('crypto');
const https = require('https');
const axios = require('axios');
(async ()=>{
// Create an axios stream to fetch the file
const axiosStream = await axios.get('https://upload.wikimedia.org/wikipedia/commons/thumb/8/86/Map_icon.svg/128px-Map_icon.svg.png', {
responseType: "stream"
});
// To re-upload the file to a remote server, we can use multipart/form-data which will require a boundary key
const key = crypto.randomBytes(16).toString('hex');
// Create a request to stream the file as multipart/form-data to another server
const req = https.request({
hostname: 'postman-echo.com',
path: '/post',
method: 'POST',
headers: {
'content-type': `multipart/form-data; boundary=--${key}`,
'transfer-encoding': 'chunked'
}
});
// Create a promise that will be resolved/rejected when the remote server has completed the HTTP(S) request
const uploadRequestPromise = new Promise((resolve, reject) => req.once('response', (incomingMessage) => {
incomingMessage.resume(); // prevent response data from queuing up in memory
incomingMessage.on('end', () => {
if(incomingMessage.statusCode === 200){
resolve();
}
else {
reject(new Error(`Received status code ${incomingMessage.statusCode}`))
}
});
}));
// Construct the multipart/form-data delimiters
const multipartPrefix = `\r\n----${key}\r\n` +
'Content-Disposition: form-data; filename="cool-name.png"\r\n' +
'Content-Type: image/png\r\n' +
'\r\n';
const multipartSuffix = `\r\n----${key}--`;
// Write the beginning of a multipart/form-data request before streaming the file content
req.write(multipartPrefix);
// Create a promise that will be fulfilled when the file has finished uploading
const uploadStreamFinishedPromise = new Promise((resolve, reject) => {
pipeline(
// Use the axios request as a stream source
axiosStream.data,
// Piggyback a nodejs Transform stream because of the convenient flush() call that can
// add the multipart/form-data suffix
new Transform({
objectMode: false,
transform( chunk, encoding, next ){
next( null, chunk );
},
flush( next ){
this.push( multipartSuffix );
next();
}
}),
// Write the streamed data to a remote server
req,
// This callback is executed when all data from the stream pipe has been processed
(error) => {
if( error ){
reject( error );
}
else {
resolve();
}
}
)
});
// Create a MD5 stream hasher
const hasher = crypto.createHash("md5");
// Create a promise that will be resolved when the hash function has processed all the stream
// data
const hashPromise = new Promise((resolve, reject) => pipeline(
// Use the axios request as a stream source.
// Note that it's OK to use the same stream to pipe into multiple sinks. In this case, we're
// using the same axios stream for both calculating the hash and uploading the file above
axiosStream.data,
// The hash stream will process the stream data
hasher,
// This callback is executed when all data from the stream pipe has been processed
(error) => {
if( error ){
reject( error );
}
else {
resolve();
}
}
));
/**
* Note that there are no 'awaits' before both stream sinks have been established. That is
* important since we want both sinks to process data from the beginning of stream
*/
// We must wait to call the hash function's digest() until all the data has been processed
await hashPromise;
const hash = hasher.digest("hex");
console.log("Incoming file checksum:", hash);
await uploadStreamFinishedPromise;
console.log("File uploaded. Awaiting server response...");
await uploadRequestPromise;
console.log("File uploaded. Done.");
})()
.catch( console.error );
I want to get the binary data of an image so that I can then rotate it using sharp.rotate().
I tried content += chunk; but that doesn't work.
let Client = require('ftp');
let fs = require('fs');
let sharp = require('sharp');
let path = 'users/'+userId+'/headerImage/header';
let Ftp = new Client(); // create new instance of Ftp
//Start. Here we get image from server
await Ftp.on('ready', function(){
Ftp.get(path, async function(err, stream){
if(err){
res.status(400).send(err);
};
var content = '';
await stream.on('data', async (chunk) => {
content += chunk;
});
await stream.on('end', async function(){
console.log(content);
let image = await sharp(content);
await image
.rotate(90)
.toBuffer()
.then(async data => {
console.log(data);
})
.catch(error => {
console.log(error);
});
Ftp.end();
});
});
});
await Ftp.connect({
host: fileTransferProtocol.host,
port: fileTransferProtocol.port,
user: fileTransferProtocol.user,
password: fileTransferProtocol.pass
});
console: Error: [Error: Input file is missing]
I believe the problem you are having is that you are not handling the incoming data as a buffer. The stream variable inside the Ftp.get callback is of type ReadableStream. By default, stream data will be returned as Buffer objects unless you specify an encoding for the data, using the readable.setEncoding() method.
For your specific purpose, you want to handle the data as a Buffer object, since that is what the sharp function is expecting. To store the incoming data in a Buffer, modify what happens on the data event:
var content = Buffer.alloc(0);
stream.on("data", async chunk => {
content = Buffer.concat([content, chunk]);
});
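Put together inside the Ftp.get callback (reusing the question's path, res, Ftp and sharp objects), the handler could look roughly like this:
Ftp.get(path, function(err, stream){
    if(err){
        return res.status(400).send(err);
    }
    var content = Buffer.alloc(0);
    stream.on('data', chunk => {
        // Keep the data as raw bytes instead of converting it to a string
        content = Buffer.concat([content, chunk]);
    });
    stream.on('end', function(){
        // sharp accepts a Buffer as input
        sharp(content)
            .rotate(90)
            .toBuffer()
            .then(data => {
                console.log(data);
            })
            .catch(error => {
                console.log(error);
            });
        Ftp.end();
    });
});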
Also, I don't think you are using async/await properly. The ftp module works with callbacks and events, not promises, so putting await in front of those calls won't make the code wait for them to finish.
Please check the following link to find more information about this feature:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function
If you want to use async/await to handle your FTP requests, try this module:
https://www.npmjs.com/package/promise-ftp
It provides an asynchronous interface for communicating with an FTP server.
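For example, a rough async/await sketch using promise-ftp (assuming its connect(), get() and end() methods as documented by the package; fileTransferProtocol, path and sharp are the objects from the question):
const PromiseFtp = require('promise-ftp');
async function getRotatedHeader() {
    const ftp = new PromiseFtp();
    await ftp.connect({
        host: fileTransferProtocol.host,
        port: fileTransferProtocol.port,
        user: fileTransferProtocol.user,
        password: fileTransferProtocol.pass
    });
    // get() resolves with a readable stream for the remote file
    const stream = await ftp.get(path);
    // Collect the stream into a single Buffer
    const content = await new Promise((resolve, reject) => {
        const chunks = [];
        stream.on('data', chunk => chunks.push(chunk));
        stream.once('error', reject);
        stream.once('end', () => resolve(Buffer.concat(chunks)));
    });
    await ftp.end();
    return sharp(content).rotate(90).toBuffer();
}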
I have a GET request in Node that successfully receives data from an API.
When I pipe that response directly to a file like this, it works: the file created is a valid, readable PDF (as I expect to receive from the API).
var http = require('request');
var fs = require('fs');
http.get(
{
url:'',
headers:{}
})
.pipe(fs.createWriteStream('./report.pdf'));
Simple. However, the file gets corrupted if I use the request's event emitters like this:
http.get(
{
url:'',
headers:{}
})
.on('error', function (err) {
console.log(err);
})
.on('data', function(data) {
file += data;
})
.on('end', function() {
var stream = fs.createWriteStream('./report.pdf');
stream.write(file, function() {
stream.end();
});
});
I have tried all manner of writing this file and it always ends up as a totally blank PDF - the only time the PDF is valid is via the pipe method.
When I console.log the events, the sequence seems to be correct - i.e. all chunks are received and then end fires at the end.
This makes it impossible to do anything after the pipe. What is pipe doing differently from the write stream?
I assume that you initialize file as a string:
var file = '';
Then, in your data handler, you add the new chunk of data to it:
file += data;
However, this performs an implicit conversion to (UTF-8-encoded) strings. If the data is actually binary, like with a PDF, this will invalidate the output data.
Instead, you want to collect the data chunks, which are Buffer instances, and use Buffer.concat() to concatenate all those buffers into one large (binary) buffer:
var file = [];
...
.on('data', function(data) {
file.push(data);
})
.on('end', function() {
file = Buffer.concat(file);
...
});
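Putting it together with the request from the question (where http is the request module, and url/headers are left empty as in the original), a minimal sketch:
var file = [];
http.get(
    {
        url: '',
        headers: {}
    })
    .on('error', function (err) {
        console.log(err);
    })
    .on('data', function (data) {
        // Collect raw Buffer chunks
        file.push(data);
    })
    .on('end', function () {
        // One binary buffer, no string conversion
        var buffer = Buffer.concat(file);
        fs.writeFile('./report.pdf', buffer, function (err) {
            if (err) {
                console.log(err);
            }
        });
    });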
If you wanted to do something after the file is done being written by pipe, you can add an event listener for finish on the object returned by pipe.
.pipe(fs.createWriteStream('./report.pdf'))
.on('finish', function done() { /* the file has been written */ });
Source: https://nodejs.org/api/stream.html#stream_event_finish
I'm trying to use streams to send data to the browser with Hapi, but can't figure out how. Specifically, I am using the request module. According to the docs the reply object accepts a stream, so I have tried:
reply(request.get('https://google.com'));
That throws an error. The docs say the stream object must be compatible with streams2, so then I tried:
reply(streams2(request.get('https://google.com')));
Now that does not throw a server-side error, but in the browser the request never loads (using Chrome).
I then tried this:
var stream = request.get('https://google.com');
stream.on('data', data => console.log(data));
reply(streams2(stream));
And in the console data was outputted, so I know the stream is not the issue, but rather Hapi. How can I get streaming in Hapi to work?
Try using Readable.wrap:
var Readable = require('stream').Readable;
...
function (request, reply) {
var s = Request('http://www.google.com');
reply(new Readable().wrap(s));
}
Tested using Node 0.10.x and hapi 8.x.x. In my code example Request is the node-request module and request is the incoming hapi request object.
UPDATE
Another possible solution would be to listen for the 'response' event from Request and then reply with the http.IncomingMessage, which is a proper read stream.
function (request, reply) {
Request('http://www.google.com')
.on('response', function (response) {
reply(response);
});
}
This requires fewer steps and also allows the developer to attach user-defined properties to the stream before transmission, which can be useful for setting status codes other than 200.
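For instance (a sketch, not part of the original answer), the upstream status code could be forwarded like this:
function (request, reply) {
    Request('http://www.google.com')
        .on('response', function (response) {
            // Forward the upstream status code along with the streamed body
            reply(response).code(response.statusCode);
        });
}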
2020
I found it! The problem was the gzip compression.
To disable it just for text/event-stream, you need to provide the following config to the Hapi server:
const server = Hapi.server({
port: 3000,
...
mime:{
override:{
'text/event-stream':{
compressible: false
}
}
}
});
In the handler I use axios because it supports the streams2 protocol:
async function handler(req, h) {
const response = await axios({
url: `http://some/url`,
headers: req.headers,
responseType: 'stream'
});
return response.data.on('data',function (chunk) {
console.log(chunk.toString());
})
/* Another option with h2o2, not fully checked */
// return h.proxy({
// passThrough:true,
// localStatePassThrough:true,
// uri:`http://some/url`
// });
};