I'm trying to pass an image to a Node Lambda through API Gateway, and the body is automatically base64 encoded. That's fine, and all of my form data comes out correctly, except that the image itself is being corrupted, and I'm not sure how to decode it properly to avoid this. Here is the relevant part of my code:
const multipart = require('aws-lambda-multipart-parser');

exports.handler = async (event) => {
  console.log({ event });
  const buff = Buffer.from(event.body, 'base64');
  // using utf-8 appears to lose some of the data
  const decodedEventBody = buff.toString('ascii');
  const decodedEvent = { ...event, body: decodedEventBody };
  const jsonEvent = multipart.parse(decodedEvent, false);
  const asset = Buffer.from(jsonEvent.file.content, 'ascii');
};
First off, it would be good to know whether aws-sdk has a way of parsing the multipart form data rather than relying on this unsupported third-party code. Next, asset ends up as a buffer that's exactly the same size as the original file, but some of the byte values are off. My assumption is that the data is being encoded and decoded slightly differently, and some of the characters are being interpreted differently.
Just an update in case anybody else runs into a similar problem: I changed 'ascii' to 'latin1' in both places and then it started working fine.
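For reference, a minimal sketch of the handler with that change applied (same aws-lambda-multipart-parser usage as above, so the caveat about it being unsupported third-party code still applies):

const multipart = require('aws-lambda-multipart-parser');

exports.handler = async (event) => {
  // API Gateway delivers the multipart body base64 encoded
  const buff = Buffer.from(event.body, 'base64');
  // 'latin1' maps every byte 0x00-0xFF to a character one-to-one,
  // so binary file content survives the string round trip
  const decodedEvent = { ...event, body: buff.toString('latin1') };
  const parsed = multipart.parse(decodedEvent, false);
  const asset = Buffer.from(parsed.file.content, 'latin1');
  return { statusCode: 200, body: JSON.stringify({ size: asset.length }) };
};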
I created a NodeJS application which should get some data from an external API server. That server provides its data only as 'Content-Type: text/plain;charset=ISO-8859-1'; I got that information from the server's response headers.
Now the problem is that special characters like 'ä', 'ö' or 'ü' are shown as �.
I tried to convert them with Iconv to UTF-8, but then I got these things: '�'...
My question is: what am I doing wrong?
For testing I use Postman. These are the steps I take to test everything:
1. Use Postman to trigger my NodeJS application
2. The app requests data from the API server
3. The API server sends data to the NodeJS app
4. My app prints out the raw response data of the API, which already contains those strange � characters
5. The app then tries to convert the data with Iconv to UTF-8, at which point it shows me these '�' characters
Another strange thing:
When I connect Postman directly to the API server, the special characters are displayed correctly without any problems. Therefore I guess my application causes the problem, but I cannot see where or why...
// JavaScript code:
const { Iconv } = require('iconv');

try {
  const response = await axios.get(
    URL,
    {
      params: params,
      headers: headers
    }
  );
  var iconv = new Iconv('ISO-8859-1', 'UTF-8');
  var converted = await iconv.convert(response.data);
  return converted.toString('UTF-8');
} catch (error) {
  throw new Error(error);
}
So after some deeper research I came up with the solution to my problem.
The cause of all the trouble seems to lie in axios's post-processing (or something similar): the step right after the data is received and converted to text, shortly before the response is handed to my nodejs application.
What I did was set the "responseType" of the axios GET request to "arraybuffer". That required a small adjustment to the axios call, like so:
var resArBuffer = await axios.get(
  URL,
  {
    responseType: 'arraybuffer',
    params: params,
    headers: headers
  }
);
Conveniently, in Node axios hands you the data as a Buffer when responseType is 'arraybuffer', and a Buffer provides a toString() method to convert the data to a string using an encoding of your choice:
var response = resArBuffer.data.toString("latin1");
Another thing worth mentioning is that I used "latin1" instead of "ISO-8859-1". Node's Buffer does not accept "ISO-8859-1" as an encoding name; "latin1" (alias "binary") is its name for that encoding. Some sources even recommended using "cp1252" instead, but "latin1" worked for me here.
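A quick sanity check of that encoding name (a small sketch you can run in any Node REPL):

// 0xE4 is 'ä' in ISO-8859-1 / Latin-1
console.log(Buffer.from([0xe4]).toString('latin1')); // ä
console.log(Buffer.isEncoding('latin1'));            // true
console.log(Buffer.isEncoding('ISO-8859-1'));        // false – Node only knows it as 'latin1'/'binary'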
Unfortunately that was not enough yet, since I needed the text in UTF-8 format. Using "toString('utf-8')" by itself was the wrong way too, since it would still print the "�" symbols. The workaround was simple: I used "Buffer.from(...)" to turn the latin1-decoded text into a UTF-8 string:
var text = Buffer.from(response, 'utf-8').toString();
Now I get the UTF-8 converted text I needed. I hope this thread helps anyone else out there, since this information was spread across many different threads for me.
I'm working in an environment where the available image processing library is NodeJS's Sharp, used for scaling images. It has been stable since it is pipe based, but I'm tasked with converting it to TypeScript and setting it up to use async/await where possible. I have most of the pieces ready, but the issue I'm facing is that all I have is the URL of an image, and Sharp expects either a string URI (local file only) or a Buffer.
Currently, I am using the package Axios to fetch the image as a string, retrievable via the data property on the response. I've been feeding a buffer created from that string with Buffer.from(response.data) into Sharp, and it doesn't have any issues until I try to "work" with the image by attempting to gather the metadata. At that point it throws an error: [Error: Input buffer contains unsupported image format]. But I know that the image is valid, as it worked in the old system, and I didn't change any dependencies.
I use QuokkaJS to test, and the following PoC fails; I need to get it into working order.
import axios from 'axios';
import Sharp from 'sharp';

const url = 'https://dqktdb1dhykn6.cloudfront.net/357882-TLRKytH3h.jpg';
const imageResponse = await axios({url: url, responseType: 'stream'});
const buffer = Buffer.from(imageResponse.data);
let src = new Sharp(buffer);
const src2 = src.clone(); // this is simply because it will end up being a loop, if this is the issue let me know
try {
  await src2.jpeg();
  await src2.resize(null, 1920);
  await src2.resize(1080, null);
  const metadata = await src2.clone().metadata(); // this is where it fails
  console.log(metadata);
} catch (e) {
  console.log(e); // logs the mentioned error
}
If anybody has any idea what I am doing incorrectly, or needs any specific information added, please let me know! If I need to pipe the image data, let me know. I've tried piping it directly and got "pipe is not a function" on the string (which makes sense).
Update #1:
A big thank you to @Thee_Sritabtim for the comment, which solved the issue. Basically, I had been trying to convert a stream-based string into a Buffer. Instead, I needed to declare that the request was for an arraybuffer, and then feed it into Sharp while declaring its encoding as binary. The working example of the PoC is below!
import axios from 'axios';
import Sharp from 'sharp';

const url = 'https://dqktdb1dhykn6.cloudfront.net/357882-TLRKytH3h.jpg';
const imageResponse = await axios({url: url, responseType: 'arraybuffer'});
const buffer = Buffer.from(imageResponse.data, 'binary');
let src = new Sharp(buffer);
try {
  await src.jpeg();
  await src.resize(null, 1920);
  await src.resize(1080, null);
  const metadata = await src.metadata(); // this was where it failed, but now it prints an object of metadata
  console.log(metadata);
} catch (e) {
  console.log(e); // doesn't catch anything any more!
}
To get a buffer from an axios response, you'll have to set responseType to 'arraybuffer'.
const imageResponse = await axios({url: url, responseType: 'arraybuffer'})
const buffer = Buffer.from(imageResponse.data, 'binary')
Alternatively, you could use a stream as the input for sharp(), so you could keep the responseType as 'stream':
const imageResponse = await axios({url: url, responseType: 'stream'})
const src = imageResponse.data.pipe(sharp())
//...
const metadata = await src.metadata()
Let's say I am creating a REST API with Node/Express and data is exchanged between the client and server via JSON.
A user is filling out a registration form and one of the fields is an image input to upload a profile image. Images cannot be sent through JSON and therefore must be converted to a base64 string.
How do I validate on the server side that this is indeed the base64 string of an image? Or is it best practice not to send the profile image as base64 at all?
You could start by checking if the string is a base64 image, with a proper mime type.
I found this library on the npm registry that does exactly that (not tested).
const isBase64 = require('is-base64');
let base64str_img = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAoHBwgHBgoIC...ljA5GC68sN8AoXT/AF7fw7//2Q==';
console.log(isBase64(base64str_img, { mime: true })); // true
Then you can verify that the mime type is allowed within your app, or make other checks, like trying to load the image file and catching any error.
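For example, here is a minimal sketch of an allowlist check on the data-URL prefix (the regex and the allowed list are just illustrative assumptions):

const ALLOWED_MIME_TYPES = ['image/jpeg', 'image/png', 'image/gif'];

function isAllowedImage(dataUrl) {
  // extract the mime type from a "data:<mime>;base64,<payload>" string
  const match = /^data:([a-z]+\/[a-z0-9.+-]+);base64,/i.exec(dataUrl);
  return match !== null && ALLOWED_MIME_TYPES.includes(match[1].toLowerCase());
}

console.log(isAllowedImage('data:image/jpeg;base64,/9j/4AAQ...')); // true
console.log(isAllowedImage('data:text/html;base64,PGh0bWw+'));     // false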
Anyway, if you want to be really sure about user input, you have to handle it yourself in the first place. That is the best practice you should care about.
A Base64 value is a valid image only if its decoded data has the correct MIME type and the width and height are greater than zero. A handy way to check all of this is to install the jimp package and use it as follows:
var b64 = 'R0lGODdhAQADAPABAP////8AACwAAAAAAQADAAACAgxQADs=',
    buf = Buffer.from(b64, 'base64');

require('jimp').read(buf).then(function (img) {
  if (img.bitmap.width > 0 && img.bitmap.height > 0) {
    console.log('Valid image');
  } else {
    console.log('Invalid image');
  }
}).catch(function (err) {
  console.log(err);
});
I wanted to do something similar, and ended up Googling it and found nothing, so I made my own base64 validator:
function isBase64(text) {
  let utf8 = Buffer.from(text, "base64").toString("utf8");
  return !(/[^\x00-\x7f]/.test(utf8));
}
This isn't great, because I used it for a different purpose, but you may be able to build on it. Here is an example using atob to reject invalid base64 chars (they are ignored otherwise):
function isBase64(text) {
  try {
    let utf8 = atob(text);
    return !(/[^\x00-\x7f]/.test(utf8));
  } catch (_) {
    return false;
  }
}
Now, about how it works:
Buffer.from(text, "base64") ignores any invalid base64 chars in the string and converts the rest to a buffer; toString("utf8") then converts that buffer back to a string. atob does something similar, but instead of ignoring the invalid chars, it throws an error when it encounters one (hence the try...catch).
!(/[^\x00-\x7f]/.test(utf8)) will return true if all the chars of the decoded string belong to the ASCII charset, otherwise it returns false. This can be altered to use a smaller charset; for example, [^\x30-\x39\x41-\x5a\x61-\x7a] will only return true if all the characters are alphanumeric.
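As a sketch of that variation, here is a stricter check that only accepts base64 strings whose decoded content is alphanumeric (purely illustrative):

function isAlphanumericBase64(text) {
  try {
    // decode, then require every character to be 0-9, A-Z or a-z
    let decoded = atob(text);
    return !(/[^\x30-\x39\x41-\x5a\x61-\x7a]/.test(decoded));
  } catch (_) {
    return false;
  }
}

console.log(isAlphanumericBase64(Buffer.from('abc123').toString('base64')));  // true
console.log(isAlphanumericBase64(Buffer.from('abc 123').toString('base64'))); // false (contains a space)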
For some reason I need to store some files (mostly images or pdfs) into my database (PG 9.2.20).
Those files are uploaded by users and when I download them back, they are corrupted.
I'm working with nodejs.
The column type I store the files in is BYTEA.
This is how I store them:
const { files, fields } = await asyncBusboy(ctx.req);
const fileName = files[0].filename;
const mimeType = files[0].mimeType;
const bufferedFile = fs.readFileSync(files[0].path, { encoding: 'hex' });
const fileData = `\\x${bufferedFile}`;
//Just a basic insert into with knex.raw
const fileId = await storageModel.create(fields.name, fields.description, fileName, mimeType, fileData, ctx.user);
And this is how I retrieve my file:
const file = await storageModel.find(ctx.params.fileId, ctx.user);
ctx.body = Buffer.from(file.file_bin, 'hex');
ctx.set('Content-disposition', `attachment; filename=${file.file_name}`);
The file is corrupted, and of course, if I look closely, the uploaded file and the one I downloaded are different.
See the hex screenshot; there is some additional data at the start of the downloaded one: http://imgur.com/a/kTRAB
After some more testing I can tell the problem lies in the koa part, when I put the buffer into ctx.body. It gets corrupted (???)
EDIT: I was working with Swagger UI: https://github.com/swagger-api/swagger-ui/issues/1605
You should not pass bytea values as regular text strings. You should pass in a Buffer directly and have the driver escape it for you correctly.
Not sure which driver you are using, but for example...
pg-promise does it automatically, see the example
node-postgres is supposed to do it automatically, which it does mostly, but I know there were issues with arrays, recently fixed here.
massive.js - based on pg-promise since v3.0, so the same story - it just works.
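For instance, a minimal sketch with node-postgres, passing the Buffer as a parameter instead of building a hex string by hand (the table and column names are made up for the example, and fileName, mimeType, files and ctx are reused from the question's snippet):

const { Pool } = require('pg');
const fs = require('fs');

const pool = new Pool();

// read the uploaded file as a Buffer (no hex encoding)
const fileData = fs.readFileSync(files[0].path);

// the driver serializes the Buffer to bytea correctly
const { rows } = await pool.query(
  'INSERT INTO storage (file_name, mime_type, file_bin) VALUES ($1, $2, $3) RETURNING id',
  [fileName, mimeType, fileData]
);

// when reading it back, file_bin already comes out as a Buffer
const result = await pool.query('SELECT file_bin FROM storage WHERE id = $1', [rows[0].id]);
ctx.body = result.rows[0].file_bin;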
I have a web service that takes a base64-encoded string representing an image, creates a thumbnail of that image using the imagemagick library, then stores both of them in MongoDB. I am doing this with the following code (approximately):
var buf = new Buffer(req.body.data, "base64"); // original image
im.resize({ srcData: buf, width: 256 }, function (err, stdout, stderr) {
  this.thumbnail = new Buffer(stdout, "binary");
  // store buf and stdout in mongo
});
You will notice that I am creating a Buffer object using the "binary" encoding, which the docs say not to do:
'binary' - A way of encoding raw binary data into strings by using
only the first 8 bits of each character. This encoding method is
deprecated and should be avoided in favor of Buffer objects where
possible. This encoding will be removed in future versions of Node.
First off I'm not sure what they are saying there. I'm trying to create a Buffer object and they seem to imply I should already have one.
Secondly, the source of the problem appears to be that the imagemagick resize method returns a string containing binary data. typeof(stdout) returns "string", and printing it to the screen certainly appears to show a bunch of non-character data.
So what do I do here? I can't change how imagemagick works. Is there another way of doing what I'm trying to do?
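As an aside, here is a sketch of the same callback using the non-deprecated Buffer.from API; it only restates the snippet above with the current API and does not change how imagemagick returns its output. In Node's Buffer encodings, 'binary' is an alias for 'latin1', so each character code 0x00-0xFF maps back to a single byte:

im.resize({ srcData: buf, width: 256 }, function (err, stdout, stderr) {
  if (err) throw err;
  // stdout is a JS string holding raw bytes; 'binary' (latin1) restores them one-to-one
  var thumbnail = Buffer.from(stdout, 'binary');
  // store buf and thumbnail in mongo
});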
That's how I am doing the same thing with success, storing images in MongoDB:
// original ---> base64
var thumbnail = Buffer.from(req.body.data).toString('base64');
// you can store this string value in a mongoose model property, and save to mongodb

// base64 ---> image
var buffer = Buffer.from(thumbnail, "base64");
I am not sure if storing images as base64 is the best way to do it
Please try this, as your base64 might not be pre-processed (it may still contain the data URL prefix):
// strip the data URL prefix before decoding
var imgRawData = req.body.images[0].replace(
  /^data:image\/png;base64,|^data:image\/jpeg;base64,|^data:image\/jpg;base64,|^data:image\/bmp;base64,/,
  ""
);
var yourBuffer = Buffer.from(imgRawData, "base64");
Then, save yourBuffer into MongoDB as a buffer.
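As an illustration, here is a minimal sketch of saving that buffer with Mongoose (the Image model and its fields are made-up names for the example):

const mongoose = require('mongoose');

// hypothetical schema: store the raw bytes plus the content type
const Image = mongoose.model('Image', new mongoose.Schema({
  data: Buffer,
  contentType: String
}));

await Image.create({
  data: yourBuffer,
  contentType: 'image/png'
});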