I am working on getting files from an SFTP server and piping the data to Box.com using their SDK. The Box SDK takes a readable stream as a parameter for uploading a file. The code I have written to fetch the files from the SFTP server uses the npm module ssh2-sftp-client.
The issue I am having is that a writable stream is "the end of the line" for streams unless you are using something like a Transform, which is a Duplex and implements both read and write. Below is the code that I am using. Because I am working on this for a client, I am intentionally leaving out some things that are not necessary.
Below is the method on the sftp class
async getFile(filepath: string): Promise<Readable> {
  logger.info(`Fetching file: ${filepath}`);
  const writable = new Writable();
  const stream = new PassThrough();
  await this.client.get(filepath, writable);
  return writable.pipe(stream);
}
Here is the implementation that gets a file and attempts to pipe it to box, which is an instance of an authorized Box SDK client.
try {
  for (const filename of filenames) {
    const stream: Readable = await tmsClient.getFile(
      'redacted' + filename,
    );
    logger.info(`Piping ${filename} to Box...`);
    await box.createFile(filename, 'redacted', stream);
    logger.info(`${filename} successfully downloaded`);
  }
} catch (error) {
  logger.error(`Failed to move files: ${error}`);
}
I am not super well versed in streams but based on my research I think this should work in theory.
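(A note on that first version: a bare new Writable() has no _write implementation, so anything written to it is never actually consumed. Since ssh2-sftp-client's get(srcPath, dst) accepts any writable stream as the destination, a PassThrough can be handed to get() directly and returned as the readable side. A rough sketch of that variant, untested against Box:)
// Rough sketch, inside the same sftp class (PassThrough comes from 'stream').
async getFile(filepath: string): Promise<Readable> {
  logger.info(`Fetching file: ${filepath}`);
  const passthrough = new PassThrough();
  // get() pipes the remote file into whatever writable it is given. Returning
  // the PassThrough right away lets the consumer start reading while the
  // transfer is still running; transfer errors are forwarded to the stream.
  this.client.get(filepath, passthrough).catch(err => passthrough.destroy(err));
  return passthrough;
}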
I have also tried an implementation where the ssh client returns a buffer and I then try to pipe that buffer as a readable stream. With that implementation, though, I keep getting errors from the Box SDK that the stream ended unexpectedly.
async getFile(filepath: string): Promise<Readable> {
  logger.info(`Fetching file: ${filepath}`);
  const stream = new Readable();
  const buffer = (await this.client.get(filepath)) as Buffer;
  stream._read = (): void => {
    stream.push(buffer);
    stream.push(null);
  };
  return stream;
}
And the error message: 2020-02-06 15:24:57 error: Failed to move files: Error: Unexpected API Response [400 Bad Request] bad_request - Stream ended unexpectedly.
Any insight is greatly appreciated!
So after doing some more research it turns out that the issue is actually with the Box SDK for Node. The SDK terminates the body of the stream before the stream is actually done. This is because under the hood it uses the request library, which needs a content-length header to send large payloads; without one, it keeps terminating the stream before the payload has been sent.
On the Box community forum they suggest adding properties to the stream prototype to pass information through to the underlying request library. I STRONGLY disagree with this because it is not the correct way to go about it. The Box SDK needs to provide a way to pass in the length of the content in bytes; as a user of their API I should not have to manipulate their underlying dependencies. I am going to open an issue with their SDK and hopefully get this fixed.
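In the meantime, the byte length itself is easy to get hold of up front: ssh2-sftp-client exposes a stat() method that returns the remote file's attributes, including its size, without downloading anything. Once the SDK accepts a content length, that is the value to hand it. A small sketch (the helper name is mine, not part of either library):
// Inside the sftp class: look up the remote file's size in bytes via stat().
async getFileSize(filepath: string): Promise<number> {
  const stats = await this.client.stat(filepath);
  return stats.size;
}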
Hope this is useful to someone else!
I'm trying to create a file reader that streams a file so I can store my data and use it in whichever way I need. However, when I create my read stream I get an error saying that it can't resolve the path bucketName/[object Object]. I'm confused as to why it is looking for [object Object] instead of my file name itself. I'm assuming it has something to do with how createReadStream reads a file directly instead of following a path or URL. I wasn't able to find any really clear documentation on how it handles working with a file directly, so if anyone has any good resources that would be great.
Error log:
Note that familyfilestorage is just the bucket name.
ApiError: No such object: familyfilestorage/[object Object]
Code:
const filepath = req.params;
const file = bucket.file(filepath);
let fileStream = file.createReadStream();
fileStream.setEncoding('utf8');
fileStream
  .on('error', (err) => {
    console.log(err);
    console.log('Log complete');
  })
  .on('end', () => {
    console.log("Completed");
  })
  .on('data', (chunk) => {
    console.log(chunk);
  });
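For reference, req.params in Express is an object keyed by the route's parameters, so passing it straight to bucket.file() makes it stringify to [object Object]. A minimal sketch of the fix, assuming the route declares a :filename parameter (that parameter name is an assumption here):
// Assuming a route like app.get('/files/:filename', ...)
const { filename } = req.params;     // pull the actual string out of the params object
const file = bucket.file(filename);  // now resolves to familyfilestorage/<filename>
const fileStream = file.createReadStream();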
Context
I am working on a proof of concept for an accounting bot. Part of the solution is the processing of receipts: the user takes a picture of a receipt, the bot asks some questions about it and stores it in the accounting solution.
Approach
I am using the BotFramework Node.js example 15.handling attachments, which loads the attachment into an arraybuffer and stores it on the local filesystem, ready to be picked up and sent to the accounting software's API.
async function handleReceipts(attachments) {
    const attachment = attachments[0];
    const url = attachment.contentUrl;
    const localFileName = path.join(__dirname, attachment.name);
    try {
        const response = await axios.get(url, { responseType: 'arraybuffer' });
        if (response.headers['content-type'] === 'application/json') {
            response.data = JSON.parse(response.data, (key, value) => {
                return value && value.type === 'Buffer' ? Buffer.from(value.data) : value;
            });
        }
        fs.writeFile(localFileName, response.data, (fsError) => {
            if (fsError) {
                throw fsError;
            }
        });
    } catch (error) {
        console.error(error);
        return undefined;
    }
    return (`success`);
}
Running locally it all works like a charm (also thanks to mdrichardson - MSFT). Deployed to Azure, I get:
There was an error sending this message to your bot: HTTP status code InternalServerError
I narrowed the problem down to the second part of the code, the part that writes to the local filesystem (fs.writeFile). Small files and big files result in the same error on Azure; fs.writeFile seems unable to find the file.
What is happening according to the streaming logs:
Attachment uploaded by user is saved on Azure
{ contentType: 'image/png',
  contentUrl: 'https://webchat.botframework.com/attachments//0000004/0/25753007.png?t=< a very long string>',
  name: 'fromClient::25753007.png' }
localFileName (the destination of the attachment) resolves to
localFileName: D:\home\site\wwwroot\dialogs\fromClient::25753007.png
Axios loads the attachment into an arraybuffer. Its response:
response.headers.content-type: image/png
This is interesting because locally it is 'application/octet-stream'
fs throws an error:
fsError: Error: ENOENT: no such file or directory, open 'D:\home\site\wwwroot\dialogs\fromClient::25753007.png
Some assistance really appreciated.
Removing the fromClient:: prefix from attachment.name solved it. As #Sandeep mentioned in the comments, the special characters were probably the issue. Not sure what the prefix's purpose is; I will mention it in the BotFramework samples GitHub repository.
[Update] The team will fix this. It was caused by the Direct Line service.
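In case anyone hits the same thing before the fix lands, a small sketch of the workaround; it assumes the prefix always appears as fromClient:: at the start of attachment.name (colons are not valid in Windows file names, which is presumably why the write fails on the Azure host):
// Strip the "fromClient::" prefix before building the local file name.
const safeName = attachment.name.replace(/^fromClient::/, '');
const localFileName = path.join(__dirname, safeName);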
I'm writing a Node Express server which connects via sftp to a file store. I'm using the ssh2-sftp-client package.
To retrieve files it has a get function with the following signature:
get(srcPath, dst, options)
The dst argument should either be a string or a writable stream, which will be used as the destination for a stream pipe.
I would like to avoid creating a file on my server and instead stream the file straight to my client to save memory, as described in this article. I tried to accomplish this with the following code:
const get = (writeStream) => {
  sftp.connect(config).then(() => {
    return sftp.get('path/to/file.zip', writeStream)
  });
};

app.get('/thefile', (req, res) => {
  get(res); // pass the res writable stream to sftp.get
});
However, this causes my Node server to crash due to an unhandled promise rejection. Is what I am attempting possible? Should I store the file on my server machine first before sending it to the client? I've checked the documentation and examples for the sftp package in question, but cannot find an example of what I am looking for.
I found the error, and it's a dumb one on my part: I was forgetting to end the SFTP connection. When this method was called a second time, it threw the exception when it tried to connect again. If anyone finds themselves in the same situation, remember to end the connection once you're finished with it, like this:
const get = (writeStream) => {
  return sftp.connect(config).then(() => {
    return sftp.get('path/to/file.zip', writeStream);
  }).then(response => {
    sftp.end(); // close the connection so the next request can connect again
    return response;
  });
};
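Since the original crash came from an unhandled rejection, it is also worth catching failures where the helper is called. A small sketch (the status code and message are just illustrative):
app.get('/thefile', (req, res) => {
  get(res).catch(err => {
    // Surface the failure instead of letting the rejection go unhandled.
    console.error(err);
    if (!res.headersSent) {
      res.status(500).send('Failed to fetch file');
    }
  });
});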
I have an AWS S3 object and a read stream created on it like this:
const s3 = new AWS.S3();
const readStream = s3
  .getObject(params)
  .createReadStream()
  .on('error', err => {
    // do something
  });
Now, when the stream is not read to the end (e.g. the streaming is aborted by the client), after 120 seconds the error event is triggered with: TimeoutError: Connection timed out after 120000ms
How can I close the stream (or the entire S3 object)?
I tried readStream.destroy() that is documented here, but it does not work.
I was looking for a solution to a similar case and bumped into this thread.
There is an AWS Request abort method documented here which allows you to cancel the request without receiving all the data (it's a similar concept to node's http request).
Your code should look somewhat like this:
const s3 = new AWS.S3();
const request = s3.getObject(params);
const readStream = request.createReadStream()
  .on('error', err => {
    request.abort(); // and do something else also...
  });
The abort doesn't have to happen in the error handler. In my case I'm fetching data and want to stop streaming once I've reached a certain point (i.e. I've found specific data in the file and only needed to check whether it exists; I don't need anything else).
The above works well with the request and node-fetch modules too.
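For that use case, a sketch of aborting from the data handler once the interesting content shows up (the marker string is just a placeholder; note that abort() usually surfaces as an error on the stream, which can be ignored when it was triggered deliberately):
const s3 = new AWS.S3();
const request = s3.getObject(params);
const readStream = request.createReadStream();
let done = false;

readStream.on('data', chunk => {
  if (!done && chunk.includes('SOME_MARKER')) {
    done = true;
    request.abort(); // stop the underlying HTTP request; no further chunks arrive
  }
});

readStream.on('error', err => {
  if (!done) {
    // a real failure, not the abort we triggered ourselves
    console.error(err);
  }
});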
I've been googling this and looking around stackoverflow for a while but haven't found a solution - hence the post.
I am playing around with Node.js and WebSockets out of curiosity. I am trying to stream some binary data (an mp3) to the client. My code so far is below but is obviously not working as intended.
I suspect that my problem is that I am not actually sending binary data from the server and would like some clarification/help.
Here's my server...
var fs = require('fs');
var WebSocketServer = require('ws').Server;
var wss = new WebSocketServer({port: 8080, host: "127.0.0.1"});

wss.on('connection', function(ws) {
  var readStream = fs.createReadStream("test.mp3", {
    'flags': 'r',
    'encoding': 'binary',
    'mode': 0666,
    'bufferSize': 64 * 1024
  });

  readStream.on('data', function(data) {
    ws.send(data, {binary: true, mask: false});
  });
});
And my client...
context = new webkitAudioContext();
var ws = new WebSocket("ws://localhost:8080");
ws.binaryType = 'arraybuffer';

ws.onmessage = function (evt) {
  context.decodeAudioData(
    evt.data,
    function(buffer) {
      console.log("Success");
    },
    function(error) {
      console.log("Error");
    });
};
The call to decode always ends up in the error callback. I am assuming this is because it is receiving bad data.
So my question is: how do I correctly stream the file as binary?
Thanks
What your server is doing is sending messages consisting of 64 KB chunks of binary audio data to your client. Your client should rebuild the audio file before calling decodeAudioData.
You are calling decodeAudioData every time your client gets a message on the websocket. Instead, you have to create a separate buffer and append all the chunks to it; once the transfer is complete, that buffer should be passed to decodeAudioData.
You have two options now:
You load the entire file (fs.readFile) without using stream events and send the whole file with ws.send (easy to do; see the sketch below)
You use stream events, modify your client to accept chunks of data and assemble them before calling decodeAudioData
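A sketch of the first option, reusing the same ws send call as your server (fs.readFile hands back the whole file as a single Buffer, which ws sends as one binary message):
var fs = require('fs');
var WebSocketServer = require('ws').Server;
var wss = new WebSocketServer({port: 8080, host: "127.0.0.1"});

wss.on('connection', function(ws) {
  fs.readFile("test.mp3", function(err, data) {
    if (err) { return console.error(err); }
    ws.send(data, {binary: true, mask: false});
  });
});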
Problem solved.
I fixed this issue with a combination of removing the "'encoding': 'binary'" parameter from the options passed to "createReadStream()" and the solution at...
decodeAudioData returning a null error
As per some of my comments, when I updated the createReadStream options, the first chunk was playing but all other chunks were executing the onError callback from decodeAudioData(). The solution in the link above fixed this for me.
It seems that decodeAudioData() is a bit picky about how the chunks it receives are formatted; apparently they need to be valid chunks, as discussed in:
Define 'valid mp3 chunk' for decodeAudioData (WebAudio API)