Generate a consistent sha256 hash from an object in Node - node.js

I have an object that I'd like to hash with sha256 in Node. The contents of the object are simple Javascript types. For example's sake, let's say:
var payload = {
"id": "3bab3f00-7d55-11e7-9b0a-4c32759242a5",
"foo": "a message",
"version": 7,
};
I create a hash like this:
const crypto = require('crypto');
var hash = crypto.createHash('sha256');
hash.update( ... ).digest('hex');
The question is, what to pass to update? The documentation for crypto says you can pass a <string> | <Buffer> | <TypedArray> | <DataView>, which seems to suggest an object is not a good thing to pass.
I can't use toString() because that prints "[object Object]". I could use JSON.stringify, however I have read elsewhere that the output from stringify is not guaranteed to be deterministic for the same input.
Are there any other options? I do not want to download a package from NPM.

The right term is "canonical", and the action is called "canonicalization" (I'm assuming EN-US spelling here). You can find a stringify that produces canonical output here.
Beware that you must also make sure that the output has the right character encoding (UTF-8 should be preferred) and line endings. Spurious data should not be present: a byte order mark or a terminating NUL character, for example, is enough to change the hash value.
After that you can pass it as string I suppose.
You can of course use any canonical encoding. Note that XML has XML-DSig (XML Signature), which performs canonicalization during signature generation and verification. This means that verification will still succeed if the XML code is altered (without altering the structure or contents, of course; but whitespace / indentation will not matter).
I'd still recommend regression testing between implementations and even version updates of the libraries.
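Since the question rules out NPM packages, here is a minimal sketch of a hand-written canonical serializer. The helper names canonicalStringify and sha256OfObject are made up for this example, and it assumes the object contains only strings, numbers, booleans, null, arrays and plain objects (no undefined, Date, Map, etc.). It sorts object keys so that insertion order no longer affects the output, and hashes the result as UTF-8.

```javascript
const crypto = require('crypto');

// Hypothetical helper: serialize JSON-compatible data with object keys sorted,
// so the same logical object always yields the same string.
function canonicalStringify(value) {
  if (Array.isArray(value)) {
    // Array order is significant, so it is preserved as-is.
    return '[' + value.map(canonicalStringify).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const members = Object.keys(value)
      .sort()
      .map((key) => JSON.stringify(key) + ':' + canonicalStringify(value[key]));
    return '{' + members.join(',') + '}';
  }
  // Strings, numbers, booleans and null: defer to JSON.stringify.
  return JSON.stringify(value);
}

// Hypothetical helper: SHA-256 over the canonical form, encoded as UTF-8.
function sha256OfObject(obj) {
  return crypto.createHash('sha256').update(canonicalStringify(obj), 'utf8').digest('hex');
}

const a = { id: '3bab3f00-7d55-11e7-9b0a-4c32759242a5', foo: 'a message', version: 7 };
const b = { version: 7, foo: 'a message', id: '3bab3f00-7d55-11e7-9b0a-4c32759242a5' };
console.log(sha256OfObject(a) === sha256OfObject(b)); // true: key order no longer matters
```

This is not a full canonicalization scheme (it does not normalize Unicode strings, for instance), so the advice about regression testing between implementations still applies.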

Related

How do I validate the Hmac using NodeJS?

I can successfully create an Hmac via NodeJS using the following code:
(slightly altered example from : https://nodejs.org/api/crypto.html#cryptocreatehmacalgorithm-key-options)
Crypto.createHmac('sha256', Crypto.randomBytes(16))
.update('I love cupcakes')
.digest('hex');
That results in a value like the following (hex-based string Hmac signature):
fb2937ca821264812d511d68ae06a643915931375633173ba64af9425f2ffd53
How do I use that signature to verify that the data was not altered? (using NodeJS, of course).
My Assumption
I'm assuming there is a method call where you supply the data and the signature and you get a boolean that tells you if the data was altered or not -- or something similar.
Another Solution?
Oh, wait, as I was writing that I started thinking...
Do I need to store the original random bytes I generated (Crypto.randomBytes(16)) and pass them to the receiver so they can just generate the HMac again and verify that the result is the same (fb2937ca821264812d511d68ae06a643915931375633173ba64af9425f2ffd53)?
If that is true that would be odd, because the parameter for Crypto.randomBytes(16) is named secret (in the official example)*. Seems like that needs to be kept secret??
Please let me know if there is a way to verify the signature on the receiving side & how I do that.
Official Documentation : A Bit Confusing
Here's the function as it is defined in the official docs:
crypto.createHmac(algorithm, key[, options])
In the function definition, you can see the second param is named key.
However, in the example they refer to it as secret
const secret = 'abcdefg';
const hash = crypto.createHmac('sha256', secret)
.update('I love cupcakes')
.digest('hex');
console.log(hash);
Just posting the answer so if anyone in future sees this they will be able to have the definitive answer.
As the commenter (Topaco) pointed out, the simple answer is that:
The receiver who wants to validate the Hmac simply needs to use the same key value & data, apply them to the method, and retrieve the hash value.
const secret = 'abcdefg';
const hash = crypto.createHmac('sha256', secret)
.update('I love cupcakes')
.digest('hex');
console.log(hash);
The original Hmac-creating party must provide three things for the verifying party:
data : (could be encrypted data from AES256, for example)
key : original key passed into the createHmac() method -- note: this item is called secret in the sample code by NodeJS (above).
hash : the (clearText) hash which the original creator generated when calling the createHmac() method.
With those three things the verifying party can now call the createHmac() method and determine if the hash they get matches the hash that the original hmac-creating party generated.
Doing this validates that the Data which was sent has not been corrupted or altered.
Additional Note On Key (secret)
I've come back after thinking about the Hmac a bit more.
It is required that both parties know the key (aka secret) but it does not mean that it should be exposed to others.
This must be kept secret (as the code implies) because if a nefarious type knew the key, they could alter the data, generate a new hash with that key, and pass it along as if the original creator had sent it (MITM - man-in-the-middle attack).
So, the point here is that yes, both parties have to know the key (secret) value, but it should not be shared where it might be discovered by nefarious types.
Instead, it will have to be agreed upon or based upon a secret password, etc.
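As far as I know there is no single "verify" call for Hmac in Node's crypto module; the receiver recomputes the Hmac and compares. Here is a rough sketch of that, assuming both sides already share the key out of band. The function names sign and verify are invented for this example; crypto.timingSafeEqual is used for the comparison so that the check does not leak how many leading bytes matched.

```javascript
const crypto = require('crypto');

// Hypothetical helper: compute the hex Hmac the same way the sender does.
function sign(key, data) {
  return crypto.createHmac('sha256', key).update(data).digest('hex');
}

// Hypothetical helper: recompute the Hmac on the receiving side and compare.
function verify(key, data, receivedHex) {
  const expected = Buffer.from(sign(key, data), 'hex');
  const received = Buffer.from(receivedHex, 'hex');
  // timingSafeEqual throws when the lengths differ, so check the length first.
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

const sharedKey = crypto.randomBytes(32); // known to both parties, never sent with the message
const tag = sign(sharedKey, 'I love cupcakes');

console.log(verify(sharedKey, 'I love cupcakes', tag)); // true
console.log(verify(sharedKey, 'I love cupcake', tag));  // false: data was altered
```

Only the data and the hash travel with the message; the key stays with the two parties.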

How to copy S3 object with special character in key

I have objects in an S3 bucket, and I do not have control over the names of the keys. Some of these keys have special characters and AWS SDK does not like them.
For example, one object key is: folder/‍Johnson, Scott to JKL-Discovery.pdf, it might look fine at first glance, but if I URL encode it: folder%2F%E2%80%8DJohnson%2C+Scott+to+JKL-Discovery.pdf, you can see that after folder/ (or folder%2F when encoded) there is a random sequence of characters %E2%80%8D before Johnson.
It is unclear where these characters come from, however, I need to be able to handle this use case. When I try to make a copy of this object using the Node.js AWS SDK,
const copyParams = {
Bucket,
CopySource,
Key : `folder/‍Johnson, Scott to JKL-Discovery.pdf`
};
let metadata = await s3.copyObject(copyParams).promise();
It fails and can't find the object, if I encodeURI() the key, it also fails.
How can I deal with this?
DO NOT SUGGEST I CHANGE THE ALLOWED CHARACTERS IN THE KEY NAME. I DO NOT HAVE CONTROL OVER THIS
Faced the same problem, but with PHP. The copyObject() method automatically encodes the destination parameters (Bucket and Key), but not the source parameter (CopySource), so it has to be encoded manually. In PHP it looks like this:
$s3->copyObject([
'Bucket' => $targetBucket,
'Key' => $targetFilePath,
'CopySource' => $s3::encodeKey("{$sourceBucket}/{$sourceFilePath}"),
]);
I'm not familiar with node.js but there should also exist that encodeKey() method that can be used?
Trying your string, there's a tricky zero-width Unicode character in it (the %E2%80%8D in your encoded key is the UTF-8 encoding of U+200D, ZERO WIDTH JOINER)...
http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=342+200+213&mode=obytes
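I would sanitize the string by removing the non-ASCII characters and then proceed with URL encoding, as requested by the official docs. Note that this only helps if the stored key does not actually contain that character; if it does, stripping it produces a different key.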
encodeURI('folder/‍johnson, Scott to JKL-Discovery.pdf'.replace(/[^\x00-\x7F]/g, ""))
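If the stored key really does contain the invisible character, an alternative to stripping it is to keep it and percent-encode the source path yourself, along the lines of the PHP answer. The sketch below is plain JavaScript with no SDK calls; buildCopySource is a made-up helper, the bucket name is a placeholder, and I am assuming (not verifying) that the Node SDK, like the PHP one, leaves CopySource unencoded. It encodes each path segment with encodeURIComponent and keeps the slashes.

```javascript
// Hypothetical helper: build a percent-encoded "bucket/key" string for CopySource.
// Each key segment is encoded separately so the '/' separators survive.
function buildCopySource(bucket, key) {
  const encodedKey = key
    .split('/')
    .map((segment) => encodeURIComponent(segment))
    .join('/');
  return `${bucket}/${encodedKey}`;
}

// '\u200D' is the zero-width joiner hiding in front of "Johnson".
const sourceKey = 'folder/\u200DJohnson, Scott to JKL-Discovery.pdf';

console.log(buildCopySource('source-bucket', sourceKey));
// source-bucket/folder/%E2%80%8DJohnson%2C%20Scott%20to%20JKL-Discovery.pdf
```

The result would be passed as CopySource, while Key stays as the raw, unencoded string.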

Node Js SHA 1 multiple times

Hello, I have this Java code, which uses the following method to hash a password.
MessageDigest digester = MessageDigest.getInstance("SHA-1");
value = digester.digest(password.getBytes());
digester.update(email.getBytes());
value = digester.digest(value);
This returns a base64-encoded string like qXO4aUUUyiue6arrcLAio+TBNwQ= (this is a sample, not the exact value).
I am converting this to Node.js and am not sure how to handle it. I have tried this:
var crypto = require('crypto');
var shasum = crypto.createHash('sha1');
var value = shasum.update('hello');
shasum.update('abc#xyz.com');
value = shasum.digest(value).toString('base64');
console.log(value);
The base64 string I get in Node.js does not match the one I get from Java, and I'm not sure why. I need the same output as Java because an old system is being migrated to a new one and I can't lose the old details.
Can someone help me get the same base64 string?
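In Java, digest() gives the result and resets the digester. So the first value is the hash of the password alone. Then update(email) followed by digest(value) feeds the email bytes and then that first hash into the fresh digester (digest(byte[]) performs a final update with its argument before completing), so the final value is the SHA-1 of the email bytes followed by the raw SHA-1 bytes of the password.
In JavaScript, on the other hand, update() does not return a digest at all, so value is not a hash after the first call. The two update() calls then accumulate, and you end up with the hash of (password concatenated with email). Note also that digest() in Node takes an output encoding, not more data to hash.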
PS that hash is conceptually wrong: you should always put a separator between two fields, to avoid ambiguity and, thus, possible attacks.
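Going by the Java MessageDigest documentation, digest(byte[]) performs a final update with its argument, so the Java snippet appears to compute SHA-1 over the email bytes followed by the raw SHA-1 of the password. A sketch of the same thing in Node follows. The function name legacyDigest is invented here, and it assumes the Java side's getBytes() used UTF-8 (it actually uses the platform default charset, so check that against a known stored value).

```javascript
const crypto = require('crypto');

// Hypothetical helper mirroring the Java snippet:
//   value = SHA1(password); then SHA1(email || value), base64-encoded.
function legacyDigest(password, email) {
  // First digest: raw 20-byte SHA-1 of the password (a Buffer, not hex/base64).
  const passwordHash = crypto.createHash('sha1').update(password, 'utf8').digest();

  // A Hash object cannot be reused after digest(), so create a fresh one.
  return crypto
    .createHash('sha1')
    .update(email, 'utf8')   // corresponds to digester.update(email.getBytes())
    .update(passwordHash)    // corresponds to the argument of digester.digest(value)
    .digest('base64');
}

console.log(legacyDigest('hello', 'abc@xyz.com'));
```

The simplest check is to run it with a password and email whose Java output is already known and compare the strings.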

What is the best way to safely read user input?

Let's consider a REST endpoint which receives a JSON object. One of the JSON fields is a String, so I want to validate that no malicious text is received.
@ValidateRequest
public interface RestService {
@POST
@Consumes(APPLICATION_JSON)
@Path("endpoint")
void postData (@Valid @NotNull Data data);
}
public class Data {
@ValidString
private String s;
// get,set methods
}
I'm using the bean validation framework via @ValidString to delegate the validation to the ESAPI library.
#Override
public boolean isValid (String value, ConstraintValidatorContext context) {
return ESAPI.validator().isValidInput(
"String validation",
value,
this.constraint.type(),
this.constraint.maxLength(),
this.constraint.nullable(),
true);
}
This method canonicalizes the value (i.e. decodes any encoded input) and then validates it against a regular expression provided in the ESAPI config. The regex is not that important to the question, but it mostly whitelists 'safe' characters.
All good so far. However, in a few occasions, I need to accept 'less' safe characters like %, ", <, >, etc. because the incoming text is from an end user's free text input field.
Is there a known pattern for this kind of String sanitization? What kind of text can cause server-side problems if SQL queries are considered safe (e.g. using bind variables)? What if the user wants to store <script>alert("Hello")</script> as his description, which at some point will be sent back to the client? Do I store that in the DB? Is that a client-side concern?
When dealing with text coming from the user, best practice is to white list only known character sets as you stated. But that is not the whole solution, since there are times when that will not work, again as you pointed out sometimes "dangerous" characters are part of the valid character set.
When this happens you need to be very vigilant in how you handle the data. What I, as well as the commenters, recommend is to keep the data from the user in its original state as long as possible. Dealing with the data securely then means using the proper functions for the target domain/output.
SQL
When putting free-format strings into a SQL database, best practice is to use prepared statements (in Java this is the PreparedStatement object) or an ORM that automatically parameterizes the data.
To read more on SQL injection attacks and other forms of injection attacks (XML, LDAP, etc.), I recommend OWASP's Top 10 - A1 Injection.
XSS
You also asked what to do when outputting this data to the client. In this case you want to make sure you HTML-encode the output for the proper context, a.k.a. contextual output encoding. ESAPI has an Encoder class/interface for this. The important thing to note is the context (HTML body, HTML attribute, JavaScript, URL, etc.) in which the data will be output. Each context is encoded differently.
Take for example the input: <script>alert('Hello World');<script>
Sample Encoding Outputs:
HTML: &lt;script&gt;alert('Hello World');&lt;script&gt;
JavaScript: \u003cscript\u003ealert(\u0027Hello World\u0027);\u003cscript\u003e
URL: %3Cscript%3Ealert%28%27Hello%20World%27%29%3B%3Cscript%3E
Form URL: %3Cscript%3Ealert%28%27Hello+World%27%29%3B%3Cscript%3E
CSS: \00003Cscript\00003Ealert\000028\000027Hello\000020World\000027\000029\00003B\00003Cscript\00003E
XML: &lt;script&gt;alert(&apos;Hello World&apos;);&lt;script&gt;
For more reading on XSS look at OWASP Top 10 - A3 Cross-Site Scripting (XSS)
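To make the "store the original, encode on output" idea concrete, here is a tiny sketch for one context only, the HTML body. It is written in JavaScript rather than with ESAPI's Encoder, and escapeHtml is a made-up helper, not a library function. A real application would normally rely on its templating engine or an encoder library, and would need different encoders for attribute, JavaScript, URL and CSS contexts.

```javascript
// Hypothetical helper: encode text for insertion into an HTML body.
// Not suitable for other contexts (attributes, script blocks, URLs, CSS).
function escapeHtml(text) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// The value is stored exactly as the user typed it...
const storedDescription = "<script>alert('Hello World');<script>";

// ...and only encoded at the point where it is written into the page.
console.log(escapeHtml(storedDescription));
// &lt;script&gt;alert(&#x27;Hello World&#x27;);&lt;script&gt;
```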

mongodb, node.js and encrypted data

I'm working on a project which involves a lot of encrypted data. Basically, these are JSON objects serialized into a String, then encrypted with AES256 into a Cyphertext, and then have to be stored in Mongo.
I could of course do this the way described above, which will store the cyphertext as String into a BSON document. However, this way, if for some reason along the way the Cyphertext isn't treated properly (for instance, different charset or whatever reason), the cyphertext is altered and I cannot rebuild the original string anymore. With millions of records, that's unacceptable (it's also slow).
Is there a proper way to save the cyphertext in some kind of native binary format, retrieve it binary and then return it to the original string? I'm used to working with strings, my skills with binary format are pretty rusty. I'm very interested in hearing your thoughts on the subject.
Thanks everyone for your input,
Fabian
yes :)
var Binary = require('mongodb').Binary;
var doc = {
// Buffer.alloc replaces the deprecated new Buffer(size) and zero-fills the memory
data: new Binary(Buffer.alloc(256))
}
or with 1.1.5 of the driver you can do
var doc = {
data: Buffer.alloc(256)
}
The data is always returned as a Binary object however and not a buffer. The link to the docs is below.
http://mongodb.github.com/node-mongodb-native/api-bson-generated/binary.html
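For the encryption side, the point is to keep the ciphertext as a Buffer from start to finish and never pass it through a string. A minimal sketch using only Node's crypto module follows. The helper names, the choice of AES-256-CBC with a random IV, and the iv/data field names are assumptions made for this example, and the MongoDB calls themselves are left out.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(32); // AES-256 key; in practice loaded from secure storage

// Hypothetical helper: JSON -> ciphertext Buffer, ready to place in a document field.
function encryptJson(obj) {
  const iv = crypto.randomBytes(16); // fresh IV per record, stored next to the ciphertext
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const data = Buffer.concat([cipher.update(JSON.stringify(obj), 'utf8'), cipher.final()]);
  return { iv, data }; // both values are Buffers, no string conversion involved
}

// Hypothetical helper: ciphertext Buffer -> original object.
function decryptJson(doc) {
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, doc.iv);
  const plaintext = Buffer.concat([decipher.update(doc.data), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8'));
}

const doc = encryptJson({ name: 'Fabian', balance: 42 });
console.log(Buffer.isBuffer(doc.data)); // true
console.log(decryptJson(doc));          // { name: 'Fabian', balance: 42 }
```

Since the driver returns the field as a Binary object, it would need converting back to a Buffer before being handed to decryptJson.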
