Why Are node scrypt Hashes the Same Given the Same Inputs? - node.js

I was trying to find a compare or verify function for node's built-in crypto module, specifically for scrypt, as most password-hashing modules I have used have such a function. Then, I discovered why this was an impossible task: All hashes generated with these algorithms using the same parameters generate the same string (technically buffer). This is the case for many of crypto's hashing functions, including its pbkdf2 implementation.
Why is this safe? Isn't the whole (modern) point of a password/message hashing function that you don't get the same hash again from the same input? This is how the various bcrypt modules work, as well as the original version of scrypt, from which the built-in version, the one I'm asking about, was derived.
For example:
const crypto = require('crypto');
const key1 = 'my secret key';
const key2 = 'my other secret key';
const salt = 'my salt';
// The synchronous variants are used so the results exist before the comparisons run;
// the asynchronous callbacks receive (err, derivedKey), not just the hash.
const scryptHash1 = crypto.scryptSync(key1, salt, 16);
const scryptHash2 = crypto.scryptSync(key1, salt, 16);
const scryptHash3 = crypto.scryptSync(key2, salt, 16);
scryptHash1.toString('hex') === scryptHash2.toString('hex'); // true
scryptHash1.toString('hex') === scryptHash3.toString('hex'); // false
// Arguments: password, salt, iterations, key length, digest
const pbkdfHash1 = crypto.pbkdf2Sync(key1, salt, 16, 16, 'sha256');
const pbkdfHash2 = crypto.pbkdf2Sync(key1, salt, 16, 16, 'sha256');
const pbkdfHash3 = crypto.pbkdf2Sync(key2, salt, 16, 16, 'sha256');
pbkdfHash1.toString('hex') === pbkdfHash2.toString('hex'); // true
pbkdfHash1.toString('hex') === pbkdfHash3.toString('hex'); // false
I originally asked this question on Cryptography, as I'm more concerned about the safety than anything else, as I want to move from bcrypt to scrypt. However, as multiple people pointed out, and as I feared, the question is more about API design. That being said, any accepted answer should include why this method is safe, or safe enough to switch over (granting that "safe enough" is never safe enough). I took security as my major, but I'm now a web dev, and security changes all the time, though the core concepts stay mostly the same.

You seem to have a fundamental misunderstanding about password hashing. First and foremost, like any hash function, a password hashing function is a function in the mathematical sense. That is, it is simply a mapping that assigns a fixed value from its range to every element of its input domain.
What sets password hashes apart from regular hashes is two things: First, they are designed to be slow and/or use large amounts of memory when evaluated. (This is irrelevant for our discussion here.) And second they take a second input, the salt.
For a password hashing function H you want that, for any fixed password m and any two salts s ≠ s', it not only holds that H(m,s) ≠ H(m,s'), but also that, given both hash values and salts, you cannot detect that they are hash values of the same m.
What you seem confused about are different choices of API design, specifically who gets to choose the salt. Every time a new password m is hashed (e.g. to be entered into a database), a fresh uniformly random salt s should be chosen, the hash value h:=H(m,s) is computed, and both h and s are stored in the database. Whenever someone claiming to be that same user submits a password m' to authenticate themselves, (h,s) is retrieved and it is checked whether h=H(m',s).
Now the question is who chooses the salt. It appears that APIs you are familiar with do not trust the user to do so. So when you make a call to hash password m, the library will choose a salt s, compute h and output h'=(h,s) as a "hash value". To check whether a password m' is correct, you then submit h',m' and the library will extract the salt, recompute the hash and compare.
The library you are now looking at expects the user to choose the salt. I.e., each time you create a new entry in a password database you have to choose a new salt, compute h=H(m,s) and store both (h,s). Since the library in this case does not attempt to "hide" anything from you, you need to take care of the comparison.
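The scheme described above can be sketched with Node's built-in crypto module. This is a minimal sketch, not a hardened implementation: hashPassword and verifyPassword are hypothetical helper names, and the 16-byte salt and 32-byte key length are assumed values chosen for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash a new password m with a fresh random salt s.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);                // s: fresh, uniformly random
  const hash = crypto.scryptSync(password, salt, 32); // h = H(m, s)
  return { salt, hash };                              // store both h and s
}

// Hypothetical helper: recompute H(m', s) and compare it with the stored h.
function verifyPassword(candidate, salt, hash) {
  const recomputed = crypto.scryptSync(candidate, salt, hash.length);
  // Both buffers have the same length, as timingSafeEqual requires.
  return crypto.timingSafeEqual(recomputed, hash);
}

const record = hashPassword('my secret key');
console.log(verifyPassword('my secret key', record.salt, record.hash));       // true
console.log(verifyPassword('my other secret key', record.salt, record.hash)); // false
```

Because the salt is drawn fresh on every call to hashPassword, hashing the same password twice yields two different records, which is the behaviour the bcrypt-style libraries provide by choosing the salt for you.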

Related

Encrypt/Decrypt aes256cbc in Nodejs

I'm working on a project where I need to encrypt and decrypt strings in Node.js.
I receive the string in the following format: pTS3JQzTxrSbd+cLESXHpg==
This string is generated on this page: https://encode-decode.com/aes-256-cbc-encrypt-online/
and uses the aes-256-cbc standard.
The code that I implemented is the following:
var CryptoJS = require("crypto-js");
var key = 'TEST_KEY';
var text = 'pTS3JQzTxrSbd+cLESXHpg==';
function decript(text, key) {
return CryptoJS.AES.decrypt(text.trim(), key);
}
console.log(decript(text, key).toString(CryptoJS.enc.Utf8));
But I always get an empty response.
Could you tell me what the issue is?
Thanks a lot!
As the documentation explains, and as I answered just yesterday, CryptoJS.AES treats a 'key' given as a string as a password and applies password-based key derivation compatible with openssl enc. That is different from, and incompatible with, what your linked website does. The site does not state this clearly, but based on its list of cipher names it is almost certainly calling OpenSSL's 'EVP' interface internally. Among other things, this means that if you specify a key too short for the algorithm, as you did, it uses whatever happens to be adjacent in memory, which apparently was zero-value bytes (not unusual for programs run on operating systems newer than about 1980). It also either uses the default IV of zero bytes or similarly sets the IV to something that is zero bytes. For CBC it uses PKCS5/7 padding, which is compatible with CryptoJS (and most other things). Therefore:
const CryptoJS = require('crypto-js');
var key = CryptoJS.enc.Latin1.parse("TEST_KEY\0\0\0\0\0\0\0\0"+"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0")
var iv = CryptoJS.enc.Latin1.parse("\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0")
var ctx = CryptoJS.enc.Base64.parse("pTS3JQzTxrSbd+cLESXHpg==")
var enc = CryptoJS.lib.CipherParams.create({ciphertext:ctx})
console.log( CryptoJS.AES.decrypt (enc,key,{iv:iv}) .toString(CryptoJS.enc.Utf8) )
->
Test text
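The same key and IV handling can be sketched with Node's built-in crypto module instead of CryptoJS. This is only an illustration under the assumptions stated in the answer above (key right-padded with zero bytes to 32 bytes, IV of 16 zero bytes); the encrypt and decrypt helper names are hypothetical, and the example shows a round trip rather than claiming to reproduce the website's exact output.

```javascript
const crypto = require('crypto');

// Assumption: the short key is right-padded with zero bytes to 32 bytes
// and the IV is 16 zero bytes, matching the CryptoJS snippet above.
const key = Buffer.alloc(32);     // zero-filled
key.write('TEST_KEY', 'latin1');  // overwrite the first 8 bytes
const iv = Buffer.alloc(16);      // zero-filled

// Hypothetical helper: AES-256-CBC with default PKCS#7 padding, Base64 output.
function encrypt(plaintext) {
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]).toString('base64');
}

// Hypothetical helper: the inverse of encrypt().
function decrypt(base64Ciphertext) {
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(base64Ciphertext, 'base64'), decipher.final()]).toString('utf8');
}

const roundTrip = decrypt(encrypt('Test text'));
console.log(roundTrip); // Test text
```

A zero-padded key and an all-zero IV are used here only to match the website's behaviour; they are not a recommendation for new code.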

node bcryptjs not deterministic?

I used bcrypt once for password authentication. For some reason I can't install it on several machines anymore, so I installed bcryptjs instead.
const bcrypt = require('bcryptjs')
const salt = bcrypt.genSaltSync(10);
const hash = bcrypt.hashSync("hallo", salt);
console.log(hash);
I ran the code six times and got these six outputs:
$2a$10$SnIj6q67OvPXINLeajqONebAjZltLwrqs8OU/5C871NyTib.SJeyu
$2a$10$8aLhlLvYi5RcuV40SansxOuQroS.SPmPG6GMjsRlcndjjzRSJkFRu
$2a$10$wZJCuAUwtG9v.oh8tgZ9M.unYBe/MRv0jO3IU51gLz8XI1ClYJni6
$2a$10$mGhPf85kGpn/PBdV3JjDsuXypnQ.E2pBTEoDtDZ/eW6qsq5DAb6M6
$2a$10$WkEro4eOiuqzE0.hB/ka2eyPUpWE/Dv5dWkqSZ3yujQ2PA3iRYJMC
$2a$10$l4GVALWSvWdcOin37WXsQeIufA7SHxvhU.9dIasXspsSPi1e1/IeG
But this additional code still compares them correctly:
const hallo = bcrypt.compareSync("hallo", hash);
console.log(hallo); //always true
const burger = bcrypt.compareSync("burger", hash);
console.log(burger); //always false
How does bcrypt actually compare the hash to the string?
Does it only look at the first 7 characters, which are the same?
Thanks
Amit
This is correct. Bcrypt is a salted hash and the salt is randomly generated. This means that each time you run the hash you will get a different result. This is intentional.
The first 22 characters after the last $ are the embedded salt (the dot is just one of the characters in bcrypt's Base64 alphabet, not a separator). You don't need to store the salt separately since it is a part of the hash. Since the salt is randomly generated, you get full protection from rainbow table attacks.
The way bcrypt checks the hash is to first extract the embedded salt, then run the string and salt through the algorithm again. If the resulting hash matches the stored hash, the check passes.
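To make the layout concrete, here is a small sketch that splits one of the hashes from the question into its parts. It assumes the standard modular crypt layout for bcrypt (version, cost, then a 22-character salt followed by a 31-character digest); parseBcryptHash is a hypothetical helper and not part of bcryptjs.

```javascript
// Hypothetical helper: split a bcrypt string of the form
// $<version>$<cost>$<22-char salt><31-char digest> into its fields.
function parseBcryptHash(hashString) {
  // The string starts with '$', so the first element of the split is empty.
  const [, version, cost, rest] = hashString.split('$');
  return {
    version,
    cost: Number(cost),
    salt: rest.slice(0, 22), // embedded salt
    digest: rest.slice(22),  // the hash value itself
  };
}

const parsed = parseBcryptHash('$2a$10$SnIj6q67OvPXINLeajqONebAjZltLwrqs8OU/5C871NyTib.SJeyu');
console.log(parsed.version); // 2a
console.log(parsed.cost);    // 10
console.log(parsed.salt);    // SnIj6q67OvPXINLeajqONe
console.log(parsed.digest);  // bAjZltLwrqs8OU/5C871NyTib.SJeyu
```

The shared "$2a$10$" prefix of the six outputs is therefore only the version and cost; everything a comparison needs, including the per-hash salt, comes after it.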

mongodb: Unique index on encrypted field

I'm encrypting SSNs in MongoDB. However, I need to use the SSN as a unique identifier to make sure that a person with that SSN does not insert a duplicate. So basically I want to check for duplicate SSNs before saving. However, I'm unsure if I'll be able to do this after encrypting this field with an AES function. If I encrypt and sign 2 strings which are identical with AES, will the output still be identical?
If not, what would be a good alternative? I had thought about hashing the SSN, but an SSN seems to have so little entropy (it's 9 numeric digits, some of which are somewhat predictable). If I salt, I lose the ability to assign a unique index on that field, unless I use a static salt, which doesn't really do much.
Addition
I would be encrypting at the application level using the node.js crypto core module.
Enciphering 2 identical strings with the same symmetric AES key and the same IV will produce an identical output. Therefore you can identify whether or not the encrypted field is unique by comparing it to a value enciphered with the same key and IV.
PoC:
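var crypto = require('crypto');
// crypto.createCipher is deprecated and no longer available in recent Node.js releases,
// so a key is derived explicitly ("someSalt" is a placeholder) and createCipheriv is used.
var key = crypto.scryptSync("someString", "someSalt", 32);
var iv = Buffer.alloc(16, 0); // fixed IV: this is what makes the output repeatable
var cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
var cipher2 = crypto.createCipheriv('aes-256-ctr', key, iv);
var crypted = cipher.update("hello world",'utf8','hex') + cipher.final('hex');
var crypted2 = cipher2.update("hello world",'utf8','hex') + cipher2.final('hex');
crypted === crypted2 //true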

why does compareSync not need salt string?

I am trying to use bcryptjs to generate hash of user passwords. However I am a bit confused in one matter.
Conventionally, according to this article, we need to:
keep the salt of our password hash relatively long and unique,
hash the user password salted with this salt
store the salted hashed password along with the salt
So when we compare the hash while authenticating the user, we append the stored salt to the password the user entered, hash it, and compare the result with the hash from the database.
However using hashSync and compareSync of bcryptjs as follows:
//hashSync to generate hash
var bcrypt = require('bcryptjs');
var password = "abc";
var hash = bcrypt.hashSync( <some string>, < integer length of salt>) // the salt of mentioned length(4-31) is self-generated which is random and fairly unique
//compareSYnc to compare hash
var testString="abc";
console.log(bcrypt.compareSync(testString, hash)) // compares with previously generated hash returns "true" in this case.
What confuses me is this: if we don't need the salt while authenticating, what is the significance of generating it? compareSync returns true without access to the salt. So wouldn't that make a brute-force attack on comparatively short passwords easy? All of the following return true regardless of salt size:
console.log(bcrypt.compareSync("abc", bcrypt.hashSync("abc"))); // consoles true. by default, if salt size is not mentioned, size is 10.
console.log(bcrypt.compareSync("abc", bcrypt.hashSync("abc", 4))); //consoles true
console.log(bcrypt.compareSync("abc", bcrypt.hashSync("abc", 8))); //consoles true
console.log(bcrypt.compareSync("abc", bcrypt.hashSync("abc", 32))); //consoles true
console.log(bcrypt.compareSync("ab", bcrypt.hashSync("abc", 4))); //consoles false
I hope I am clear enough in explaining my confusion.
The bcrypt standard makes storing salts easy - everything it needs to check a password is stored in the output string.
The prefix "$2a$" or "$2y$" in a hash string in a shadow password file indicates that hash string is a bcrypt hash in modular crypt format. The rest of the hash string includes the cost parameter, a 128-bit salt (base-64 encoded as 22 characters), and 184 bits of the resulting hash value (base-64 encoded as 31 characters).
That's from the Wikipedia page on bcrypt.

Node.js Crypto AES Cipher

For some odd reason, Node's built-in Cipher and Decipher classes aren't working as expected. The documentation states that cipher.update
"Returns the enciphered contents, and can be called many times with new data as it is streamed."
The docs also state that cipher.final
"Returns any remaining enciphered contents."
However, in my tests you must call cipher.final to get all of the data, thus rendering the Cipher object worthless, and to process the next block you have to create a new Cipher object.
var secret = crypto.randomBytes(16)
, source = crypto.randomBytes(8)
, cipher = crypto.createCipher("aes128", secret)
, decipher = crypto.createDecipher("aes128", secret);
var step = cipher.update(source);
var end = decipher.update(step);
assert.strictEqual(source.toString('binary'), end); // should not fail, but does
Note that this happens when using crypto.createCipher or crypto.createCipheriv, with the secret as the initialization vector. The fix is to replace lines 6 and 7 with the following:
var step = cipher.update(source) + cipher.final();
var end = decipher.update(step) + decipher.final();
But this, as previously noted, renders both cipher and decipher worthless.
This is how I expect Node's built-in cryptography to work, but it clearly doesn't. Is this a problem with how I'm using it or a bug in Node? Or am I expecting the wrong thing? I could go and implement AES directly, but that would be time-consuming and annoying. Should I just create a new Cipher or Decipher object every time I need to encrypt or decrypt? That seems expensive if I'm doing so as part of a stream.
I was having two problems: the first is that I assumed, incorrectly, that the size of a block would be 64 bits, or 8 bytes, which is what I use to create the "plaintext." In reality the internals of AES split the 128 bit plaintext into two 64 bit chunks, and go from there.
The second problem was that despite using the correct chunk size after applying the above changes, the crypto module was applying auto padding, and disabling auto padding solved the second problem. Thus, the working example is as follows:
var crypto = require('crypto'), assert = require('assert');
var secret = crypto.randomBytes(16)
, source = crypto.randomBytes(16)
, cipher = crypto.createCipheriv("aes128", secret, secret)
, decipher = crypto.createDecipheriv("aes128", secret, secret);
cipher.setAutoPadding(false);
decipher.setAutoPadding(false);
var step = cipher.update(source);
var end = decipher.update(step); // update() returns a Buffer unless an output encoding is given
assert.strictEqual(source.toString('binary'), end.toString('binary')); // does not fail
AES uses block sizes of 16 bytes (not two times 8 as you were suggesting). Furthermore, if padding is enabled it should always pad. The reason for this is that otherwise the unpadding algorithm cannot distinguish between padding and the last bytes of the plaintext.
Most of the time you should not expect the ciphertext to be the same size as the plaintext. Make sure that final() is always called. You should only use update() on its own like this for encryption / decryption if you are implementing your own encryption scheme.
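As an illustration of that advice, the sketch below feeds two chunks through a single Cipher object and calls final() once at the end, with padding left enabled. The key, IV, and chunk contents are made-up values; 'aes-128-cbc' is assumed as the mode.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(16);
const iv = crypto.randomBytes(16);
// 13 + 12 = 25 bytes of plaintext, deliberately not a multiple of the 16-byte block size.
const chunks = [Buffer.from('first chunk, '), Buffer.from('second chunk')];

// Encrypt: any number of update() calls, then exactly one final().
const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
const encrypted = Buffer.concat([...chunks.map((c) => cipher.update(c)), cipher.final()]);

// Decrypt: same pattern; final() strips the padding.
const decipher = crypto.createDecipheriv('aes-128-cbc', key, iv);
const decrypted = Buffer.concat([decipher.update(encrypted), decipher.final()]);

console.log(decrypted.toString()); // first chunk, second chunk
console.log(encrypted.length);     // 32 (25 bytes padded up to two full blocks)
```

update() only returns the blocks it can complete, so the remainder (plus padding) appears when final() is called; one Cipher object handles the whole stream, and a new one is needed only for the next message.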
There's a node.js issue with calling update multiple times in a row. I suppose it's been solved and reflected in the next release.
