Downloading Binary File from OneDrive API Using Node/Axios - node.js

I am using the OneDrive API to fetch a file from a Node application using the axios library.
I am simply trying to save the file to the local machine (Node is running locally).
I use the OneDrive API to get the document's download link, which does not require authentication (with https://graph.microsoft.com/v1.0/me/drives/[location]/items/[id]).
Then I make this call with the download document link:
response = await axios.get(url);
I receive a response object that includes, among other things, the content-type, content-length and content-disposition headers, plus a data element holding the contents of the file.
When I display the JSON response to the console, the data portion looks like this:
data: 'PK\u0003\u0004\u0014\u0000\u0006\u0000\b\u0000\u0000\u0000!\u...'
If the document is simply text, I can save it easily using:
fs.writeFileSync([path], response.data);
But if the file is binary, like a docx file, I cannot figure out how to write it properly. Every encoding I have tried produces a file with the wrong encoding.
How do I save the file properly based on the type of file retrieved?

Have you tried explicitly setting the encoding option of fs.writeFileSync to null, signifying that the data is binary?
fs.writeFileSync([path], response.data, {
  encoding: null
});
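One possibility worth checking is that the damage happens before the write. By default axios decodes the response body as text, so the bytes of a .docx may already be altered by the time `response.data` reaches `fs.writeFileSync`. A minimal sketch, assuming that is the cause, asks axios for raw bytes with `responseType: 'arraybuffer'`. The `downloadBinary` helper and its `client` parameter are hypothetical names introduced here; `client` is expected to be the axios module.

```javascript
const fs = require('fs');

// Hypothetical helper (not from the original post).
// `client` is assumed to be axios, e.g. require('axios').
async function downloadBinary(client, url, destPath) {
  // Request raw bytes so axios does not decode the body as text.
  const response = await client.get(url, { responseType: 'arraybuffer' });
  // Buffer.from accepts both a Buffer and an ArrayBuffer.
  const bytes = Buffer.from(response.data);
  // A Buffer is written byte for byte, so no encoding option is needed.
  fs.writeFileSync(destPath, bytes);
  return bytes.length;
}
```

Usage would look like `await downloadBinary(require('axios'), url, [path])`, where `url` is the download link from the question.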

Related

How to read JSONL line-by-line after hitting url in Node.JS?

From the Shopify API, I receive a link to a large amount of JSONL. Using NodeJS, I need to read this data line-by-line, as loading it all at once would use lots of memory. When I hit the JSONL url from the web browser, it automatically downloads the JSONL file to my downloads folder.
Example of JSONL:
{"id":"gid:\/\/shopify\/Customer\/6478758936817","firstName":"Joe"}
{"id":"gid:\/\/shopify\/Order\/5044232028401","name":"#1001","createdAt":"2022-09-16T16:30:50Z","__parentId":"gid:\/\/shopify\/Customer\/6478758936817"}
{"id":"gid:\/\/shopify\/Order\/5044244480241","name":"#1003","createdAt":"2022-09-16T16:37:27Z","__parentId":"gid:\/\/shopify\/Customer\/6478758936817"}
{"id":"gid:\/\/shopify\/Order\/5057425703153","name":"#1006","createdAt":"2022-09-27T17:24:39Z","__parentId":"gid:\/\/shopify\/Customer\/6478758936817"}
{"id":"gid:\/\/shopify\/Customer\/6478771093745","firstName":"John"}
{"id":"gid:\/\/shopify\/Customer\/6478771126513","firstName":"Jane"}
I'm unsure how to process this data in NodeJS. Do I need to hit the url, download all of the data and store it in a temporary file, then process the data line-by-line? Or can I read the data line-by-line directly after hitting the url (via some sort of stream?) and process it without storing it in a temporary file on the server?
(The JSONL comes from https://storage.googleapis.com/ if that helps.)
Thanks.
Using axios, you can set the response type to stream and then use the built-in readline module to process the data line by line.
import axios from 'axios'
import { createInterface } from 'node:readline'

const response = await axios.get('https://raw.githubusercontent.com/zaibacu/thesaurus/master/en_thesaurus.jsonl', {
  responseType: 'stream'
})
const rl = createInterface({
  input: response.data
})
for await (const line of rl) {
  // do something with the current line
  const { word, synonyms } = JSON.parse(line)
  console.log('word, synonyms: ', word, synonyms);
}
In testing, this uses barely any memory.
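For the Shopify sample in the question, "do something with the current line" might mean attaching each child record to its parent through `__parentId`. A minimal sketch, assuming children directly follow their parent (as in the sample) and that blank lines should be skipped. `groupByParent` is a hypothetical helper; it accepts any iterable or async iterable of lines, such as the `rl` interface above, and holds only one parent group in memory at a time.

```javascript
// Hypothetical helper: yields each top-level record with its children attached.
// Assumes children directly follow their parent, as in the sample data.
async function* groupByParent(lines) {
  let current = null;
  for await (const line of lines) {
    if (line.trim() === '') continue; // skip blank lines
    const record = JSON.parse(line);
    if (record.__parentId && current && record.__parentId === current.id) {
      // Child of the record currently being assembled.
      current.children.push(record);
    } else {
      // A new top-level record starts: emit the finished one first.
      if (current) yield current;
      current = { ...record, children: [] };
    }
  }
  if (current) yield current;
}
```

With the stream from the answer, this could be consumed as `for await (const customer of groupByParent(rl)) { ... }`.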
You can also use a great CLI tool called jq.
Because it runs from the command line rather than inside your application code, you can use it anywhere you need to parse JSONL.
jq -cs '.' doodoo.myshopify.com.export.jsonl > out.json
This takes the bulk file I just downloaded from a query and turns it into a single JSON array that I can work with or save. Note that -s (slurp) reads the entire input into memory, so it does not avoid the memory cost raised in the question.

Node JS Image Binary String

I'm working with the Etsy API, uploading images like this example, and it requires the images to be in binary format. Here is how I'm getting the image binary data:
async function getImageBinary(url) {
  const imageUrlData = await fetch(url);
  const buffer = await imageUrlData.buffer();
  return buffer.toString("binary");
}
However Etsy says it is not a valid image file. How can I get the image in the correct format, or make it in a valid binary format?
Read this for a working example of the Etsy API:
https://github.com/etsy/open-api/issues/233#issuecomment-927191647
The Etsy API is buggy and its guide is inconsistent. You might be tempted to use the 'binary' encoding for the buffer because the docs say the data type is string, but you don't actually need to. Just use the default encoding.
Also, there is currently a bug in image upload; try removing the Content-Type header. It is best to read the link above.
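Following the advice above, one option is to keep the image as a Buffer rather than converting it to a 'binary' string. A minimal sketch, assuming a fetch implementation whose response exposes `arrayBuffer()` (the fetch built into Node 18+ does). `getImageBuffer` and its `fetchImpl` parameter are hypothetical names. How the Buffer is attached to the Etsy request is not shown here; see the linked issue for that.

```javascript
// Hypothetical helper: returns the raw image bytes as a Buffer.
// `fetchImpl` defaults to the global fetch (Node 18+); pass another
// fetch-compatible function if the global one is unavailable.
async function getImageBuffer(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Image request failed with status ${response.status}`);
  }
  // Keep the bytes as they are; no toString('binary') round trip.
  return Buffer.from(await response.arrayBuffer());
}
```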

Save an image file into a database with node/request/sequelize/mysql

I'm trying to save a remote image file into a database, but I'm having some issues with it since I've never done it before.
I need to download the image and pass it along (with node-request) with a few other properties to another node api that saves it into a mysql database (using sequelize). I've managed to get some data to save, but when I download it manually and try to open it, it's not really usable and no image shows up.
I've tried a few things: getting the image with node-request, converting it to a base64 string (I read about that somewhere) and passing it along in a JSON payload, but that didn't work. I tried sending it as multipart, but that didn't work either. I haven't worked with streams/buffers/multipart much before, and never in Node. I've tried looking into node-request pipes, but I couldn't figure out how to apply them in this context.
Here's what I currently have (it's part of an ES6 class, so there are no 'function' keywords; also, request is promisified):
function getImageData(imageUrl) {
  return request({
    url: imageUrl,
    encoding: null,
    json: false
  });
}
function createEntry(entry) {
  return getImageData(entry.image)
    .then((imageData) => {
      entry.image_src = imageData.toString('base64');
      var requestObject = {
        url: 'http://localhost:3000/api/entry',
        method: 'post',
        json: false,
        formData: entry
      };
      return request(requestObject);
    });
}
I'm almost 100% certain the problem is in this part because the api just takes what it gets and gives it to sequelize to put into the table, but I could be wrong. Image field is set as longblob.
I'm sure it's something simple once I figure it out, but so far I'm stumped.
This is not a direct answer to your question, but it is rarely necessary to actually store an image in the database. The usual approach is to keep the image in storage like S3, on a CDN like CloudFront, or even just in the file system of a static file server, and then store only the file name or some ID of the image in the actual database.
If there is any chance that you are going to serve those images to some clients then serving them from the database instead of a CDN or file system will be very inefficient. If you're not going to serve those images then there is still very little reason to actually put them in the database. It's not like you're going to query the database for specific contents of the image or sort the results on the particular serialization of an image format that you use.
The simplest thing you can do is save the images with a unique filename (either a random string, UUID or a key from your database) and keep the ID or filename in the database with other data that you need. If you need to serve it efficiently then consider using S3 or some CDN for that.
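The "unique filename" approach can be sketched as follows. This is a minimal illustration, assuming the image bytes are already in a Buffer (as `encoding: null` gives with node-request) and that a local directory is the storage target. `saveImage` and its parameters are hypothetical names; only the returned filename would go into the database row.

```javascript
const { randomUUID } = require('crypto');
const { mkdir, writeFile } = require('fs/promises');
const path = require('path');

// Hypothetical helper: writes the image under a UUID-based name
// and returns that name so it can be stored in the database.
async function saveImage(imageData, directory, extension) {
  const filename = `${randomUUID()}${extension}`;
  // Create the target directory if it does not exist yet.
  await mkdir(directory, { recursive: true });
  await writeFile(path.join(directory, filename), imageData);
  return filename;
}
```

In the question's flow, `entry.image_src` would then hold the returned filename instead of a base64 string.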

Express res.download() not actually downloading file

I'm attempting to return generated files to the front end through Express' res.download function. I'm using Chrome, but whenever I call the API that executes the following code, all that is returned is the same response I would get from the Express res.sendFile() function.
I know that res.download uses res.sendFile, but I would like the download function to actually save to the file system instead of just returning the file in the body of the response.
This is my code.
exports.download = function (req, res) {
  var filePath = '[path]'; // placeholder: some file that I want to download
  res.download(filePath, 'response.txt', function (err) {
    // err is only set when the transfer failed
    if (err) throw err;
  });
};
I know that the above code at least partly works because I'm getting back, in the response, the contents of the file. However, I want it to be saved onto the file system.
Am I misunderstanding what the download function is supposed to do? Do I just need to take the response data and write it to the file system manually?
res.download adds headers that suggest to the browser that the file should be downloaded rather than opened. However, there's no way to force the browser to do this; it's typically the user's choice whether to download a particular file.
If you're triggering this request with AJAX, that's not going to cause the file to be downloaded, because your JavaScript is asking for the data itself.
"Do I just need to take the response data and write it to the file system manually?"
You don't have file system access in browser-side JavaScript. I'm not sure how you intend to do this.
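If the goal is for the browser to save the file, one approach is to let the browser make the request itself (a navigation or link click) instead of fetching the data with AJAX. A minimal sketch, assuming the route is served from the same origin at the hypothetical path /api/download. `triggerDownload` and its `doc` parameter are names introduced here for illustration.

```javascript
// Hypothetical browser-side helper: asks the browser to request `url` through
// a temporary link, so the headers set by res.download apply to the response.
function triggerDownload(url, filename, doc = document) {
  const link = doc.createElement('a');
  link.href = url;
  // Suggested file name for the download (same-origin URLs only).
  link.download = filename;
  doc.body.appendChild(link);
  link.click();
  doc.body.removeChild(link);
  return link;
}
```

It could be called from a click handler as `triggerDownload('/api/download', 'response.txt')`. The browser or the user may still choose to open the file rather than save it.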

Chrome Extension: Local Storage, how to export

I have a chrome extension that saves a bunch of data to chrome.storage.local. I'm trying to find easy ways to export this data and package it into a file. I'm not constrained on what type of file it is (JSON, CSV, whatever), I just need to be able to export the contents into a standalone (and send-able) file. The extension is only run locally and the user would have access to all local files.
First, you need to get all data.
Then serialize the result.
Finally, offer it as a download to the user.
chrome.storage.local.get(null, function(items) { // null implies all items
  // Convert object to a string.
  var result = JSON.stringify(items);
  // Save as file
  var url = 'data:application/json;base64,' + btoa(result);
  chrome.downloads.download({
    url: url,
    filename: 'filename_of_exported_file.json'
  });
});
To use the chrome.downloads.download method, you need to declare the "downloads" permission in addition to the storage permission in the manifest file.
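One caveat with the snippet above: btoa only accepts characters in the Latin-1 range, so it throws if the stored data contains other characters (for example emoji or CJK text). A minimal sketch of an alternative, assuming a percent-encoded data URL is acceptable to chrome.downloads.download. `toJsonDataUrl` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a data: URL from the stored items without btoa.
function toJsonDataUrl(items) {
  // Pretty-print so the exported file is readable.
  const json = JSON.stringify(items, null, 2);
  // encodeURIComponent percent-encodes the text as UTF-8 bytes.
  return 'data:application/json;charset=utf-8,' + encodeURIComponent(json);
}
```

Inside the chrome.storage.local.get callback, the download call would then use `url: toJsonDataUrl(items)`.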
You should look here: https://groups.google.com/a/chromium.org/forum/#!topic/chromium-extensions/AzO_taH2b7U
It shows exporting chrome local storage to JSON.
Hope it helps
