Node.js Buffer and Encoding

I have an HTTP endpoint where a user uploads a file. I need to read the file contents and then store them in a DB. I can read the upload into a Buffer and get a string from it.
The problem is, when the file content is not UTF-8 I see "strange" symbols in the output string.
Is it possible to detect the encoding of the Buffer contents and serialise it to a string correctly?
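One common approach (a sketch only, assuming the third-party npm packages jschardet and iconv-lite are acceptable; neither is part of Node core) is to guess the charset from the raw bytes and then decode with that guess:

const jschardet = require('jschardet') // heuristic charset detection
const iconv = require('iconv-lite')    // decoding for encodings Node doesn't support natively

function bufferToString(buf) {
  // Guess the character set from the raw bytes; returns { encoding, confidence }
  const guess = jschardet.detect(buf)
  // Fall back to UTF-8 if detection fails or the guess isn't supported by iconv-lite
  const encoding = guess.encoding && iconv.encodingExists(guess.encoding) ? guess.encoding : 'utf8'
  return iconv.decode(buf, encoding)
}

Keep in mind that detection is heuristic and can guess wrong on short or ambiguous inputs; if the client can send the charset explicitly (for example in the Content-Type header), that is more reliable.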

Related

Convert BASE64 String to PDF file in the IFS on AS400

We receive a BASE64 encoded representation of a courier label PDF in an xml file, which we store in the IFS of our AS400.
We would like to decode this BASE64 string and save it as a .PDF in the IFS so we can then either email it as an attachment or send it to a printer.
I have looked at the capability of the CPYSPLF command using the *PDF WSCST parameter, but this only seems relevant where we would have a Spooled File representation of the label we want to produce.
Does anyone know if this is possible via native iSeries commands/RPG?
One way is to:
IFS_READ_UTF8 to load the XML file
XMLPARSE to make it an XML object
XML_TABLE to extract the BASE64 data
BASE64_DECODE to decode the B64 data to a PDF binary stream
IFS_WRITE_BINARY to write that stream out as a .pdf file
You could make it a pure SQL procedure, or an SQLRPGLE program.
You could also extract the BASE64 data using RPGLE XML-INTO, then use Scott Klement's BASE64 SRVPGM to decode it and write it to the IFS.

Converting a nodejs buffer to string and back to buffer gives a different result in some cases

I created a .docx file.
Now, I do this:
const fs = require('fs')

// read the file to a buffer (inside an async function, hence the await)
const data = await fs.promises.readFile('<pathToMy.docx>')
// Converts the buffer to a string using 'utf8' but we could use any encoding
const stringContent = data.toString()
// Converts the string back to a buffer using the same encoding
const newData = Buffer.from(stringContent)
// We expect the values to be equal...
console.log(data.equals(newData)) // -> false
I don't understand at what step of the process the bytes are being changed...
I've already spent so much time trying to figure this out, without any result... If someone can help me understand what part I'm missing, it would be really awesome!
A .docx file is not a UTF-8 string (it's a binary ZIP file), so when you read it into a Buffer and then call .toString() on it, you're assuming the buffer already contains UTF-8-encoded text that you just want to move into a Javascript string. That's not what you have. Your binary data will contain byte sequences that are invalid in UTF-8, and those will be discarded or coerced into valid UTF-8, causing an irreversible change.
What Buffer.toString() does is take a Buffer that is ALREADY encoded in UTF-8 and put it into a Javascript string. See this comment in the doc:
If encoding is 'utf8' and a byte sequence in the input is not valid UTF-8, then each invalid byte is replaced with the replacement character U+FFFD.
So, the code you show in your question is wrongly assuming that Buffer.toString() takes binary data and reversibly encodes it as a UTF8 string. That is not what it does and that's why it doesn't do what you are expecting.
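You can see the lossiness directly (a minimal sketch, not from the original answer): 0xFF can never occur in valid UTF-8, so it gets replaced on the way into the string and the original byte is gone.

const raw = Buffer.from([0x50, 0x4b, 0xff, 0xfe]) // "PK" like the start of a ZIP, plus two bytes invalid in UTF-8
const asString = raw.toString('utf8')             // invalid bytes become '\uFFFD'
console.log(Buffer.from(asString))                // <Buffer 50 4b ef bf bd ef bf bd> – not the original bytes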
Your question doesn't describe what you're actually trying to accomplish. If you want to do something useful with the .docx file, you probably need to actually parse it from its binary ZIP form into the actual components of the file in their appropriate format.
Now that you explain you're trying to store it in localStorage, you need to encode the binary data into a string format. One popular option is Base64; it isn't super efficient size-wise, but it is better than many alternatives. See Binary Data in JSON String. Something better than Base64 for prior discussion on this topic. Ignore the notes about compression in that other answer, because your data is already ZIP-compressed.
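A minimal sketch of that approach (reusing the placeholder path from the question; Base64 round-trips losslessly because it never tries to interpret the bytes as text):

const data = await fs.promises.readFile('<pathToMy.docx>')
const stringContent = data.toString('base64')        // plain ASCII, safe for localStorage or JSON
const newData = Buffer.from(stringContent, 'base64') // decodes back to the exact same bytes
console.log(data.equals(newData)) // -> true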

Creating a file in node.js, using an encoding (CP437 / IBM) which is not part of the supported standard node encodings [ascii/base64/latin1/...]

I am processing files with different encoding types.
Right now, any encoded file is transformed to UTF-8 and saved to my SQL DB.
My goal is to generate new files with the same encoding as the original data.
I am able to decode hex as CP437/IBM but unable to write the resulting string to a file while maintaining the desired encoding.
// assumption: cptable comes from the npm "codepage" package
const decodedString = cptable.utils.decode(437, myHexString);
// note: fs.appendFile only accepts Node's built-in encodings in its options
fs.appendFile(filename, decodedString, options.encoding, (err) => {
  console.log("please help me")
})
The result is a file with faulty encoding, but also contains a hidden message.
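One way around this (a sketch, assuming the third-party iconv-lite package is an option; cp437 is not one of Node's built-in encodings) is to encode the string into a CP437 Buffer yourself and write the Buffer, so fs never re-encodes the data:

const fs = require('fs')
const iconv = require('iconv-lite') // supports legacy DOS/IBM codepages such as cp437

// Encode the JavaScript string into raw CP437 bytes
const cp437Bytes = iconv.encode(decodedString, 'cp437')
// Append the Buffer directly; with no encoding given, Node writes the bytes untouched
fs.appendFile(filename, cp437Bytes, (err) => {
  if (err) console.error(err)
})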

Binary data from mongodb gets corrupted

When I upload a photo, it is converted to base64 and then sent to MongoDB using Mongoose, where it is saved as Binary. But when I fetch the same picture back from the database, it comes back as a Buffer. After converting it to base64 I get a base64 string, but one that is completely different from the original. The new base64 cannot be rendered in the browser because it has been corrupted.
Below are pictures of the different strings
This is the initial base64
This is the Buffer array
This is the corrupted base64 after converting from the buffer array using Buffer.from(avatar).toString('base64').
Please note that I prepended "data:image/png;base64," to it before rendering in the browser and it still did not render.
Please can someone tell me what I am doing wrong?
The best solution is to save the image to a folder as a .png or .jpg file and store only the path in the database.
Here is how I solved it.
I converted from binary to utf8 instead of to base64.
There is a huge difference between
Buffer.from(binary_data, 'binary').toString('utf8')
and
Buffer.from(binary_data, 'binary').toString('base64')
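A likely explanation for why that works (an assumption about how the data was stored, not something stated in the question): if the field was originally set to the base64 string itself, the stored Buffer holds the UTF-8 bytes of that text, so .toString('base64') encodes it a second time, while .toString('utf8') simply gives the original string back. A minimal sketch of the effect:

const original = 'aGVsbG8='                       // hypothetical base64 string ("hello" encoded)
const stored = Buffer.from(original)              // what ends up in a Buffer/Binary field
console.log(stored.toString('base64'))            // a different, double-encoded string
console.log(stored.toString('utf8') === original) // true – utf8 recovers the original base64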

DocuSign Connect: pdfbytes leads to corrupted pdf file

I am trying to connect DocuSign with my Java application and I was successful.
I have created a listener that receives DocuSign's response after the user completes the signing process, so that the document is saved/updated automatically in my system.
I am able to get that response in XML format with pdfbytes, but as soon as I create a PDF from those pdfbytes I am not able to open it (the pdfbytes might be corrupted).
I am base64-decoding those bytes before generating the PDF.
This is a common problem when the pdfbytes are not managed as a run of binary bytes. At some point you may be treating the data as a string. The PDF file becomes corrupted at that point.
Issues to check:
When you Base64 decode the string, the result is binary. Is your receiving variable capable of receiving binary data? (No codeset transformations.)
When you write your binary buffer to the output file, check that your output file format is binary clean. This is especially an issue on Windows systems.
If you're still having a problem, edit your question to include your code.
