NodeJS crypto and binary encoding by default

We have a legacy NodeJS API we are looking to move away from. We are extracting the auth component out of it.
We have a function that hashes a password, that looks a bit like:
var crypto = require('crypto');
function encryptPassword(password) { // `secret` is a constant defined elsewhere in the legacy API
    var sha256 = crypto.createHash('sha256');
    sha256.update(password + secret);
    return sha256.digest('hex');
}
Aside from the obvious security implications of a function like this, the input is encoded using Node's legacy 'binary' encoding.
If you pass a String to update(), this version of Node uses 'binary' (an alias for latin1) as the encoding, which keeps only the low byte of each UTF-16 code unit. Unicode characters are therefore mangled before hashing: 'ł' is U+0142, so "Kiełbasa" is effectively hashed as "KieBbasa".
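The truncation is easy to reproduce on a modern Node by asking for the 'binary' encoding explicitly; a minimal sketch:
const buf = Buffer.from('Kiełbasa', 'binary'); // 'binary' is an alias for 'latin1'
console.log(buf.toString('latin1')); // "KieBbasa" – 'ł' (U+0142) kept only its low byte 0x42, i.e. 'B'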
In our Scala code we are now looking to rewrite this legacy function so we can auth old users. But we cannot find a charset to use on these strings that has the same resulting output. Our Scala code looks like:
def encryptPassword(password: String): String = {
  // `in` needs to reproduce exactly the string Node was hashing
  Hashing.sha256().hashString(in, Charsets.UTF_8).toString
}
We need the in string to be the same as what Node is using, but we can't figure it out.
Ideas? Node.js... not even once.

Well, turns out this is much easier than it seems. The following code works:
import com.google.common.base.Charsets
import com.google.common.hash.Hashing

def encryptPassword(password: String): String = {
  // Keep only the low byte of each character, mimicking Node's 'binary' encoding
  val in = (password + secret).map(_.toByte.toChar).mkString("")
  Hashing.sha256().hashString(in, Charsets.UTF_8).toString
}
Thanks #stefanobaghino
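For anyone reproducing this, a reference digest can be generated on current Node by requesting the legacy encoding explicitly (a sketch; the password and secret values are made up):
const crypto = require('crypto');
// Passing 'binary' forces the legacy low-byte behaviour; modern Node defaults to 'utf8' for string input
const reference = crypto.createHash('sha256').update('Kiełbasa' + 's3cret', 'binary').digest('hex');
console.log(reference); // should match encryptPassword("Kiełbasa") when secret is "s3cret"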

Related

Trouble with getting correct tag for 256-bit AES GCM encryption in Node.js

I need to write the reverse (encryption) of the following decryption function:
const crypto = require('crypto');
let AESDecrypt = (data, key) => {
  const decoded = Buffer.from(data, 'binary');
  const nonce = decoded.slice(0, 16);
  const ciphertext = decoded.slice(16, decoded.length - 16);
  const tag = decoded.slice(decoded.length - 16);
  let decipher = crypto.createDecipheriv('aes-256-gcm', key, nonce);
  decipher.setAuthTag(tag);
  decipher.setAutoPadding(false);
  try {
    let plaintext = decipher.update(ciphertext, 'binary', 'binary');
    plaintext += decipher.final('binary');
    return Buffer.from(plaintext, 'binary');
  } catch (ex) {
    console.log('AES Decrypt Failed. Exception: ', ex);
    throw ex;
  }
}
The above function allows me to properly decrypt encrypted buffers that follow this layout:
| Nonce/IV (First 16 bytes) | Ciphertext | Authentication Tag (Last 16 bytes) |
The reason AESDecrypt is written the way it is (auth tag as the last 16 bytes) is that this is how the default standard library implementations of AES-GCM lay out the data in both Java and Go. I need to be able to encrypt/decrypt bidirectionally between Go, Java, and Node.js. The crypto library in Node.js does not put the auth tag anywhere for you; it is left to the developer to decide how to store it and pass it to setAuthTag() during decryption. In the above code, I am baking the tag directly into the final encrypted buffer.
So the AES encryption function I wrote needs to meet the above constraints (without modifying AESDecrypt, since it is working properly). I have the following code, which is not working for me:
let AESEncrypt = (data, key) => {
  const nonce = 'BfVsfgErXsbfiA00'; // Do not copy-paste this line in production code (https://crypto.stackexchange.com/questions/26790/how-bad-it-is-using-the-same-iv-twice-with-aes-gcm)
  const encoded = Buffer.from(data, 'binary');
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce);
  try {
    let encrypted = nonce;
    encrypted += cipher.update(encoded, 'binary', 'binary');
    encrypted += cipher.final('binary');
    const tag = cipher.getAuthTag();
    encrypted += tag;
    return Buffer.from(encrypted, 'binary');
  } catch (ex) {
    console.log('AES Encrypt Failed. Exception: ', ex);
    throw ex;
  }
}
I am aware that hardcoding the nonce is insecure. I have it this way to make it easier to compare properly encrypted files with my broken implementation using a binary diff program like vbindiff.
The more ways I have looked at this, the more confounding the problem has become.
I am actually quite used to implementing 256-bit AES GCM based encryption/decryption, and have properly working implementations in Go and Java. Furthermore, because of certain circumstances, I had a working implementation of AES decryption in Node.js months ago.
I know this to be true because I can decrypt, in Node.js, files that I encrypted in Java and Go. I put up a quick repository that contains the source code of a Go server written just for this purpose and the broken Node.js code.
For easy access for people who understand Node.js but not Go, I put up a Go server web interface for encrypting and decrypting with the above scheme, hosted at https://go-aes.voiceit.io/. You can confirm my Node.js decrypt function works just fine by encrypting a file of your choice at https://go-aes.voiceit.io/ and decrypting it with decrypt.js (please look at the README for more information on how to run this if you need to confirm it works properly).
Furthermore, I know this issue is specifically with the following lines of AESEncrypt:
const tag = cipher.getAuthTag();
encrypted += tag;
Running vbindiff against the same file encrypted in Go and Node.js, the files only start showing differences in the last 16 bytes (where the auth tag gets written). In other words, the nonce and the encrypted payload are identical in Go and Node.js.
Since getAuthTag() is so simple, and I believe I am using it correctly, I have no idea what I could even change at this point. Hence, I have also considered the remote possibility that this is a bug in the standard library. However, I figured I'd try Stack Overflow first before opening a GitHub issue, as it's most likely something I'm doing wrong.
I have a slightly more expanded description of the code, and proof of how I know what is working works, in the repo I set up to try to get help solving this problem.
Thank you in advance.
Further info: Node v14.15.4; Go go1.15.6 darwin/amd64
In the NodeJS code, the ciphertext is generated as a binary string, i.e. using the binary/latin1 or ISO-8859-1 encoding. ISO-8859-1 is a single-byte charset which uniquely assigns each value between 0x00 and 0xFF to a specific character, and therefore allows arbitrary binary data to be converted into a string without corruption.
In contrast, the authentication tag is not returned as a binary string by cipher.getAuthTag(), but as a buffer.
When concatenating both parts with:
encrypted += tag;
the buffer is converted into a string implicitly using buf.toString(), which applies UTF-8 encoding by default.
Unlike ISO-8859-1, UTF-8 is a multi-byte charset that assigns characters to specific byte sequences between 1 and 4 bytes in length (see the UTF-8 table). Arbitrary binary data (such as the authentication tag) generally contains byte sequences that are not defined in UTF-8 and are therefore invalid. During conversion, invalid bytes are replaced by the Unicode replacement character with the code point U+FFFD (see also the comment by @dave_thompson_085). This corrupts the data because the original values are lost. UTF-8 encoding is therefore not suitable for converting arbitrary binary data into a string.
During the subsequent conversion into a buffer with the single byte charset binary/latin1 with:
return Buffer.from(encrypted, 'binary');
only the low byte (0xFD) of each replacement character is kept.
Bytes such as 0xBB, 0xA7, 0xEA etc. (visible in a hex diff of the Go and Node.js outputs) form invalid UTF-8 sequences and are therefore replaced by the NodeJS code with 0xFD, resulting in a corrupted tag.
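The corruption can be reproduced in isolation (a sketch with arbitrary, made-up byte values):
const tag = Buffer.from([0xbb, 0xa7, 0xea]); // bytes that are invalid as UTF-8
const asString = '' + tag; // implicit buf.toString() -> UTF-8 -> U+FFFD for each invalid sequence
console.log(Buffer.from(asString, 'binary')); // <Buffer fd fd fd> – only the low byte of U+FFFD survives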
To fix the bug, the tag must be converted with binary/latin1, i.e. consistent with the encoding of the ciphertext:
let AESEncrypt = (data, key) => {
  const nonce = 'BfVsfgErXsbfiA00'; // Static IV for test purposes only
  const encoded = Buffer.from(data, 'binary');
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce);
  let encrypted = nonce;
  encrypted += cipher.update(encoded, 'binary', 'binary');
  encrypted += cipher.final('binary');
  const tag = cipher.getAuthTag().toString('binary'); // Fix: decode with binary/latin1!
  encrypted += tag;
  return Buffer.from(encrypted, 'binary');
}
Please note that in the update() call the input encoding (the 2nd 'binary' parameter) is ignored, since encoded is a buffer.
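A quick round-trip check of the fixed function against the AESDecrypt from the question (a sketch; the 32-byte key is generated on the fly for illustration):
const key = crypto.randomBytes(32); // aes-256-gcm requires a 32-byte key
const encrypted = AESEncrypt(Buffer.from('hello gcm'), key);
console.log(AESDecrypt(encrypted, key).toString()); // "hello gcm"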
Alternatively, the buffers can be concatenated instead of the binary/latin1 converted strings:
let AESEncrypt_withBuffer = (data, key) => {
  const nonce = 'BfVsfgErXsbfiA00'; // Static IV for test purposes only
  const encoded = Buffer.from(data, 'binary');
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce);
  return Buffer.concat([ // Fix: concatenate buffers!
    Buffer.from(nonce, 'binary'),
    cipher.update(encoded),
    cipher.final(),
    cipher.getAuthTag()
  ]);
}
For GCM mode, NIST recommends a 12-byte nonce for performance and interoperability reasons (see NIST SP 800-38D, section 5.2.1.1). The Go code (via NewGCMWithNonceSize()) and the NodeJS code instead use a 16-byte nonce.
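If all three implementations were changed together, a 12-byte nonce would look like the following sketch (note that this changes the on-disk layout, so the slicing in AESDecrypt and the Go/Java sides would have to be adjusted as well; key is assumed to be a 32-byte buffer):
const nonce = crypto.randomBytes(12); // 96-bit IV as recommended by NIST SP 800-38D
const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce);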

Node.js encrypt a string into alphanumerical

I am currently using the npm module crypto-js to encrypt strings. The code is the following:
CryptoJS.AES.encrypt("hello", "secrete_key").toString()
This yields an encrypted string such as: U2FsdGVkX1/G276++MaPasgSVxcPUgP72A1wMaB8aAM=
Notice that the encrypted string contains special symbols such as /, + and =. These special symbols cause a problem in my app. Is there a way to encrypt a string into alphanumeric characters only? I like the way crypto-js does it, since it is simple: I only need to feed in the string and a password. If crypto-js cannot do this, what other modules can, in a simple and straightforward way?
CryptoJS also includes utilities for encoding/decoding, so you can re-encode the encrypted (Base64) string into a representation that contains only alphanumeric characters, such as hex. See the CryptoJS docs for more info.
For example:
var CryptoJS = require("crypto-js");
let encrypted = CryptoJS.AES.encrypt("hello", "secrete_key").toString()
var encoded = CryptoJS.enc.Base64.parse(encrypted).toString(CryptoJS.enc.Hex);
console.log(encrypted)
console.log(encoded)
var decoded = CryptoJS.enc.Hex.parse(encoded).toString(CryptoJS.enc.Base64);
var decrypted = CryptoJS.AES.decrypt(decoded, "secrete_key").toString(CryptoJS.enc.Utf8);
console.log(decoded)
console.log(decrypted)
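Wrapped up as a pair of helpers along the same lines (a sketch; the function names are my own):
function encryptToHex(plaintext, passphrase) {
  const b64 = CryptoJS.AES.encrypt(plaintext, passphrase).toString();
  return CryptoJS.enc.Base64.parse(b64).toString(CryptoJS.enc.Hex); // hex: digits and a-f only
}
function decryptFromHex(hex, passphrase) {
  const b64 = CryptoJS.enc.Hex.parse(hex).toString(CryptoJS.enc.Base64);
  return CryptoJS.AES.decrypt(b64, passphrase).toString(CryptoJS.enc.Utf8);
}
console.log(decryptFromHex(encryptToHex('hello', 'secrete_key'), 'secrete_key')); // "hello"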

Google Vision text detection for Node.js using base64 encoding

Just started exploring Google Cloud Vision APIs. From their guide:
const vision = require('@google-cloud/vision'); // Google Cloud client library

const client = new vision.ImageAnnotatorClient();
const fileName = 'Local image file, e.g. /path/to/image.png';
const [result] = await client.textDetection(fileName);
However, I want to use a base64 representation of the binary image data, since they claim it is possible to use one.
I found this reference on SO:
Google Vision API Text Detection with Node.js set Language hint
Instead of imageUri I used "content": string, as mentioned there. But the SO sample uses the const [result] = await client.batchAnnotateImages(request); method. I tried the same technique on the const [result] = await client.textDetection( method and it gave me an error.
So my question is: is it possible to use a base64 encoded string to represent the image in order to perform TEXT_DETECTION? And if so, how?
Any kind of help is highly appreciated.
You can start from the quickstart guide and edit the lines after the creation of the client to the following:
// Value of the image in base64
const img_base64 = '/9j/...';

const request = {
  image: {
    content: Buffer.from(img_base64, 'base64')
  }
};

const [result] = await client.textDetection(request);
console.log(result.textAnnotations);
console.log(result.fullTextAnnotation);
You can take a look at the documentation of the textDetection function and read the description of the request parameter, in particular the following part:
A dictionary-like object representing the image. This should have a
single key (source, content).
If the key is content, the value should be a Buffer.
This leads to the structure used in the sample code above, as opposed to when using imageUri or filename, which have to be nested inside another object under the key source, as shown in the sample.
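For reference, the base64 string itself can be produced from a local file first (a sketch; the path is made up):
const fs = require('fs');
const img_base64 = fs.readFileSync('/path/to/image.png').toString('base64');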
The content field needs to be a Buffer.
You are using the Node.js client library. The library uses the gRPC API internally, and the gRPC API expects a bytes type in the content field.
The JSON (REST) API, by contrast, expects a base64 string.
References
https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#image
https://googleapis.dev/nodejs/vision/latest/v1.ImageAnnotatorClient.html#textDetection

What is the correct way to encode SAML request to the Azure AD endpoint?

I am using the following code to encode the SAMLRequest value for the endpoint, i.e. the XYZ when calling https://login.microsoftonline.com/common/saml2?SAMLRequest=XYZ.
Is this the correct way to encode it?
private static string DeflateEncode(string val)
{
    var memoryStream = new MemoryStream();
    using (var writer = new StreamWriter(new DeflateStream(memoryStream, CompressionMode.Compress, true), new UTF8Encoding(false)))
    {
        writer.Write(val);
        writer.Close();
        return Convert.ToBase64String(memoryStream.GetBuffer(), 0, (int)memoryStream.Length, Base64FormattingOptions.None);
    }
}
If you just want to convert a string to a base64 encoded string, you can do it this way:
var encoded = Convert.ToBase64String(System.Text.Encoding.Default.GetBytes(val));
Console.WriteLine(encoded);
return encoded;
Yes, that looks correct for the HTTP Redirect binding.
But don't do this yourself unless you really know what you are doing. Sending the AuthnRequest is the simple part. Correctly validating the received response, including guarding against XML signature wrapping attacks, is hard. Use an existing library; there are both commercial and open source libraries available.
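For cross-checking, the same scheme (raw DEFLATE, then Base64) can be sketched in Node.js terms; the compressed bytes may differ between implementations, but either side should inflate back to the same XML:
const zlib = require('zlib');
function deflateEncode(xml) {
  // Raw DEFLATE (no zlib header), like .NET's DeflateStream, then Base64
  return zlib.deflateRawSync(Buffer.from(xml, 'utf8')).toString('base64');
}
// The result still has to be URL-encoded (encodeURIComponent) when placed in ?SAMLRequest=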

convert byte[] to string with charset Encoding in C#

I'm new to Windows Phone 8 development and C#. I'm trying to get a string from a remote server. This is my code:
var client = new HttpClient();
var response = await client.GetAsync(MyUrl);
string charset = GetCharset(response.Content.Headers.ContentType.ToString());
Debug.WriteLine("charset: " + charset); // debug shows: charset: ISO-8859-1
var bytesCharset = await response.Content.ReadAsByteArrayAsync();
Debug.WriteLine(Encoding.UTF8.GetString(bytesCharset, 0, bytesCharset.Length));
// debug shows: m\u1EF9 t�m
m\u1EF9 t�m is what I got, but I expected the Vietnamese text Mỹ Tâm. Can anyone give me an idea how to get the expected result? (I think the bytes are encoded in ISO-8859-1.)
Encoding.UTF8 only interprets the data correctly if it is limited to the ASCII range; ISO-8859-1 bytes above 0x7F are not valid UTF-8. So that the charset declared in the response headers is handled transparently, it is probably better to use HttpContent's ReadAsStringAsync() method instead of ReadAsByteArrayAsync().
I suspect you have the correct string there but your debug window simply cannot display those two unicode characters properly.
You can use the static GetEncoding method to request a specific codepage.
e.g.:
var encoding = Encoding.GetEncoding("ISO-8859-1");
Finally I found the solution. After using Json.NET to parse the JSON from the remote server, my problem is gone and I get the expected result:
JArray jArray = JArray.Parse(isoText);
String expectedText = jArray[1].ToString();
Actually, I don't know why my code works :( Thank you anyway!
