Letting Rfc2898DeriveBytes calculate the salt


I've read a lot of posts here about Rfc2898DeriveBytes, and in all of them the salt is pre-calculated and passed to the constructor. However, there is a constructor that accepts a salt length instead and calculates the salt for you; the salt is available afterwards in the Salt property.
Is there any disadvantage to letting the constructor calculate the salt? In my case, the usage is password hashing.

Specifying the salt length instead of the salt itself may reduce the chance of choosing the salt insecurely when deriving a new key (or obscuring a password for storage). The salt should be chosen by a cryptographic random bit generator, and should be changed each time the password is updated. Presumably, this constructor will use a high-quality RNG that was properly seeded. Leaving that up to the application allows for mistakes at worst, and at best creates unnecessary complexity.
Of course, if you are recovering a key, for example to check user input against the stored password, you'd need to specify the salt that was used initially.
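To make the two cases concrete, here is a minimal sketch in Python rather than .NET (an assumption made for illustration; it is not the Rfc2898DeriveBytes API). Rfc2898DeriveBytes implements PBKDF2, and Python's hashlib.pbkdf2_hmac does as well. The names hash_password and verify_password, the salt length and the iteration count are all hypothetical choices.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor; tune for your hardware
SALT_BYTES = 16       # assumed salt length

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """New or changed password: draw a fresh salt from the OS CSPRNG."""
    salt = secrets.token_bytes(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key  # both are stored alongside the user record

def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Login check: re-derive with the salt that was stored initially."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected_key)  # constant-time comparison
```

The first function corresponds to letting the library pick the salt; the second is the "recovering a key" case, where the original salt has to be supplied.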

Related

how to get original value from hash value in node.js

I have created hashes of some fields and am storing them in a database, using the 'crypto' module.
var crypto = require('crypto');
var hashFirtName = crypto.createHash('md5').update(orgFirtName).digest("hex");
QUESTION: How can I get the original value from the hash value when needed?

The basic definition of a "hash" is that it's one-way. You cannot get the originating value from the hash. Mostly because a single value will always produce the same hash, but a hash isn't always related to a single value, since most hash functions return a string of finite/fixed length.
Additional Information
I wanted to provide some additional information, as I felt I may have left this too short.
As @xShirase pointed out in his answer, you can use a table to reverse a hash. These are known as Rainbow Tables. You can generate them or download them from the internet, usually from nefarious sources [ahem].
To expand on my other statement about a hash value possibly relating to multiple original values, let's take a look at MD5.
MD5 is a 128-bit hash. This means it can take 2^128 distinct values, or (unsigned) 0 through 340,282,366,920,938,463,463,374,607,431,768,211,455. That's a REALLY big number. So, for any two different inputs, there is roughly a 1 in 340,282,366,920,938,463,463,374,607,431,768,211,456 chance that they produce the same hash.
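The arithmetic can be checked quickly. This is just a sketch in Python (the question itself uses Node.js), and the input string is an arbitrary example:

```python
import hashlib

digest = hashlib.md5(b"example input").digest()

bits = len(digest) * 8        # MD5 digests are 16 bytes, i.e. 128 bits
distinct_values = 2 ** bits   # number of possible MD5 outputs
largest_value = distinct_values - 1

print(bits)
print(f"{largest_value:,}")
```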
Now, for simple data like passwords, the chances are astronomical. And for those purposes, who cares? Most of the time you are simply taking an input, hashing it, then comparing the hashes. For reasons I will not get into, when using hashes for passwords you should ALWAYS store the data already hashed. You don't want to leave plain-text passwords just lying about. Keep in mind that a hash is NOT the same as encryption.
Hashes can also be used for other reasons. For instance, they can be used to create a fast-lookup data structure known as a Hash Table. A Hash Table uses a hash as sort of a "primary key", allowing it to search a huge set of data in relatively few instructions, approaching O(1) (constant time). Depending on the implementation of the Hash Table and the hashing algorithm, you have to deal with collisions, usually by keeping a list of the entries that share a hash (chaining) or by probing for another slot. This is why the Hash Table isn't "exactly" O(1), but close. If your hash algorithm is bad, the performance of your Hash Table can begin to approach O(n).
Another use for a hash is to tell if a file's contents have been altered, or match an original. You will see many OSS projects provide binary downloads that also have MD5 and/or SHA-2 hash values. This is so you can download the files, do a hash locally, and compare the results against theirs to make sure the file you are getting is the file they posted. Again, since the odds of two given files producing the same MD5 hash are 1 in 340,282,366,920,938,463,463,374,607,431,768,211,456, the odds of a hacker successfully generating a file of the same size with a bad payload that hashes to the exact same MD5/SHA-2 hash are pretty low.
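A download check of this kind might look like the following sketch. It is in Python, uses SHA-256 as one member of the SHA-2 family, and the function names file_sha256 and matches_published are hypothetical:

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the local hash with the value the project published."""
    return file_sha256(path) == published_hex.strip().lower()
```

You would pass in the path of the downloaded file and the hex string copied from the project's download page.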
Hope this discussion can help either you or someone in the future.

If you could get the original value from the hash, it wouldn't be that secure.
If you need to compare a value to what you have previously stored as a hash, you can create a hash for this value and compare the hashes.
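As a rough sketch of that comparison (in Python rather than Node.js, with MD5 only because the question uses it; the helper names md5_hex and matches are made up for this example):

```python
import hashlib
import hmac

def md5_hex(value: str) -> str:
    """Hex MD5 of a string, mirroring the digest("hex") call in the question."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()

# What would have been saved in the database earlier.
stored = md5_hex("Alice")

def matches(candidate: str, stored_hex: str) -> bool:
    """Hash the candidate the same way and compare the two hashes."""
    return hmac.compare_digest(md5_hex(candidate), stored_hex)
```

Here matches("Alice", stored) is true and matches("alice", stored) is false; at no point is the original value recovered from the hash.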
In practice there is only one way to 'decrypt' a hash: using a massive database of pre-computed hashes and their original values, and comparing them to yours. An example here

Decrypted Hash and Encrypted hash

If the hash of this password ( qwqwqw123456 ) is $2a$07$sijdbfYKmgWdcGhPPn$$$.C98C0wmy6jsqA3fUKODD0OFBKJkHdn.
what is the password for this hash: $2a$07$sijdbfYKmgWdcGhPPn$$$.9PTdICzon3EUNHZvOOXgTY4z.UTQTqG
And can I find out which hash algorithm it is?

You could try to guess which algorithm was used, depending on the format and length of the hash, your known value, etc., but there is no definitive way to know it.
And the purpose of any "hash" function is that it is NOT reversible/decryptable/whatever.
Depending on some factors you could try to guess the original value too (brute-force attack: hash all possible values and check which hash is equal to yours), but depending on the number of possibilities, the algorithm used, etc., that could take millions of years. (You could also be lucky and get the correct value within a short time, but that's unlikely.)
There are other approaches than brute-forcing, but in the end it's pretty much impossible to reverse a good hash function.
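To illustrate what "try to hash all possible values" means, here is a toy sketch in Python. It assumes a fast, unsalted SHA-256 hash and a tiny search space (lowercase letters, at most three characters). That is an assumption for the demo and not the format of the hashes in the question, which would be far slower to attack. The function name brute_force is hypothetical.

```python
import hashlib
import itertools
import string
from typing import Optional

def brute_force(target_hex: str,
                alphabet: str = string.ascii_lowercase,
                max_len: int = 3) -> Optional[str]:
    """Hash every candidate up to max_len and return the one that matches."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hex:
                return candidate
    return None  # not found within the search space
```

The number of candidates grows as 26^n with the length n, which is why longer passwords and deliberately slow hash functions quickly push this out of reach.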

How safe is this procedure?

I'm going to use this kind of approach to store my password:
User enters password
Application salts password with random number
Then, using the salted password as the key, encrypt a randomly selected array of data with some encryption algorithm (the array consists of values from a predefined table of chars/bytes)
For simplicity the table could be just the digits, in which case the random array would simply be a long enough integer/biginteger.
Then I store the salt (a modified value) and the encrypted array in the DB
To check password validity:
Get the given password
Read the salt from the DB and calculate the decryption key
Try to decrypt the encrypted array
If successful (in the mathematical sense), check the decrypted value byte by byte:
does it contain only chars/bytes from the known table? For instance, is it an integer/biginteger? If so, the password counts as valid
What do you think about this procedure?
In a few words, it's a kind of alternative to using hash functions.
In this approach the encryption algorithm is used to calculate a non-invertible value.
EDIT
# Encrypt/decrypt function that works like this:
KEY=HASH(PASSWORD)
CYPHERTEXT = ENCRYPT(PLAINTEXT, KEY)
PLAINTEXT = DECRYPT(CYPHERTEXT, KEY)
# Encrypting the password when entered
KEY=HASH(PASSWORD)+SALT or HASH(PASSWORD+SALT)
ARRAY={A1, A2,... AI}
SOME_TABLE=RANDOM({ARRAY})
ENCRYPTED_TABLE = ENCRYPT(SOME_TABLE, KEY + SALT)
# Checking validity
DECRYPT(ENCRYPTED_TABLE, PASSWORD + SALT) == SOME_TABLE
if(SOME_TABLE contains only {ARRAY} elements) = VALID
else INVALID

From what you write I assume you want to do the following:
# You have some encryption function that works like this
CYPHERTEXT = ENCRYPT(PLAINTEXT, KEY)
PLAINTEXT = DECRYPT(CYPHERTEXT, KEY)
# Encrypting the password when entered
ENCRYPTED_TABLE = ENCRYPT(SOME_TABLE, PASSWORD + SALT)
# Checking validity
DECRYPT(ENCRYPTED_TABLE, PASSWORD + SALT) == SOME_TABLE
First off: No sane person would use such a homemade scheme in a production system. So if you were thinking about actually implementing this in the real world, please go back. Don't even try to write the code yourself, use a proven software library that implements widely accepted algorithms.
Now, if you want to think about it as a mental exercise, you could start off like this:
You should assume that an attacker will know all the parts of the equation, except the actual password. The attacker, who wants to retrieve the password, will therefore already know the encrypted text, the plaintext AND part of the key (the salt).
The chance of success will depend on the actual encryption scheme, and maybe the chaining mode.
I'm not a cryptanalyst myself, but without thinking about it too much I have the feeling that there could be a number of angles of attack.

The proposed scheme is, at best, slightly less secure than simply storing the hash of the password and salt.
This is because the encryption step simply adds a small constant amount of time to checking if each hash value is correct; but at the same time it also introduces classes of equivalent hashes, since there are multiple possible permutations of ARRAY that will be recognised as valid.

You would have to brute force the encryption on every password every time someone logs in.
Read salt from DB and calculate decrypt key
This can't be done unless you know what the password is beforehand.
Just salt (and repeatedly hash) the password.

Password salts: prepending vs. appending

I just looked at the implementation of password hashing in Django and noticed that it prepends the salt, so the hash is created like sha1(salt + password), for example.
In my opinion, salts are good for two purposes
Preventing rainbow table lookups
Alright, prepending/appending the salt doesn't really make a difference for rainbow tables.
Hardening against brute-force/dictionary attacks
This is what my question is about. If someone wants to attack a single password from a stolen password database, he needs to try a lot of passwords (e.g. dictionary words or [A-Za-z0-9] permutations).
Let's assume my password is "abcdef", the salt is "salt" and the attacker tries all [a-z]{6} passwords.
With a prepended salt, one must calculate hash("salt"), store the hash algorithm's state and then go on from that point for each permutation. That is, going through all permutations would take 26^6 copy-hash-algorithm's-state-struct operations and 26^6 hash(permutation of [a-z]{6}) operations. As copying the hash algorithm's state is freakin fast, the salt hardly adds any complexity here, no matter how long it is.
But with an appended salt, the attacker must calculate hash(permutation of [a-z]{6} + salt) in full for each permutation, i.e. 26^6 hash operations over 10-character inputs rather than 6-character ones. So appending salts adds complexity depending on the salt length.
I don't believe this is for historical reasons because Django is rather new. So what's the sense in prepending salts?
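The state-copying idea described above can be sketched with Python's hashlib, whose hash objects have a copy() method. This is only an illustration of the mechanism, not an attack tool, and the function name hash_with_prefix_state is made up:

```python
import hashlib
from typing import Iterable, Iterator, Tuple

def hash_with_prefix_state(salt: bytes,
                           candidates: Iterable[bytes]) -> Iterator[Tuple[bytes, str]]:
    """Compute sha1(salt + candidate) for each candidate, absorbing the salt once."""
    base = hashlib.sha1(salt)   # feed the prepended salt a single time
    for candidate in candidates:
        h = base.copy()         # copy of the internal state after the salt
        h.update(candidate)     # continue from that point with the candidate
        yield candidate, h.hexdigest()
```

Each result equals hashlib.sha1(salt + candidate).hexdigest(). Note that SHA-1 processes its input in 64-byte blocks, so with a 4-byte salt the copy only carries over buffered bytes; compression work is saved only once the salt fills at least one block.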

Do neither; use a standard key derivation function like PBKDF2. Never roll your own crypto. It's much too easy to get it wrong. PBKDF2 uses many iterations to protect against brute force, which is a much bigger improvement than the ordering of salt and password.
And your trick of pre-calculating the internal state of the hash function after processing the salt probably isn't that easy to pull off unless the length of the salt corresponds to the block length of the underlying hash function.

If the salt is prepended, an attacker can build a database of hash states for salts (assuming the salt is long enough to fill a hashing step) and then run a dictionary attack.
But if the salt is appended, the attacker can build such a database for the password dictionary and additionally compute only the salt's part of the hash. Given that the salt is usually shorter than the password (like a 4-char salt and an 8-char password), this will be the faster attack.

You are making a valid point, of course; but really, if you want to increase the time it takes to calculate the hash, just use a longer hash. SHA256 instead of SHA1, for example.

How can I store a salted password hash if I have only one database column?

I've read a number of SO questions on this topic, but grokking the applied practice of storing a salted hash of a password eludes me.
Let's start with some ground rules:
a password, "foobar12" (we are not discussing the strength of the password).
a language, Java 1.6 for this discussion
a database, postgreSQL, MySQL, SQL Server, Oracle
Several options are available for storing the password, but I want to think about one (1):
Store the password hashed with random salt in the DB, one column
The automatic fail of plaintext storage is not open for discussion. :) Found on SO and elsewhere are solutions with MD5/SHA1 and use of dual-columns, both with pros and cons.
MD5/SHA1 is simple. MessageDigest in Java provides MD5, SHA1 (through SHA512 in modern implementations, certainly 1.6). Additionally, most RDBMSs listed provide MD5 hashing functions for use on inserts, updates, etc. The problems become evident once one groks "rainbow tables" and MD5 collisions (and I've grokked these concepts).
Dual-column solutions rest on the idea that the salt does not need to be secret (grok it). However, a second column introduces a complexity that might not be a luxury if you have a legacy system with one (1) column for the password and the cost of updating the table and the code could be too high.
But it is storing the password hashed with a random salt in single DB column that I need to understand better, with practical application.
I like this solution for a couple of reasons: a salt is expected, and it respects legacy boundaries. Here's where I get lost: If the salt is random, and the password plus salt are hashed to produce a one-way value for storing, how can the system ever match a plaintext password and a new random salt?
I have a theory on this, and as I type I might be grokking the concept: Given a random salt of 128 bytes and a password of 8 bytes ('foobar12'), it could be programmatically possible to remove the part of the hash that was the salt, by hashing a random 128 byte salt and getting the substring of the original hash that is the hashed password. Then re-hashing to match using the hash algorithm...?
So... any takers on helping. :) Am I close?

There's no great mystery. The single-column solution is just like the multi-column solution, except that it combines the salt and the hash together into a single column. The checking code still has to know how to break that single value down into the salt and hash. (This has been how salted passwords have typically worked - for example, the UNIX /etc/shadow format stores an algorithm identifier, salt and hash together in a single field).
You don't have to worry about this too much though, because the password hashing algorithm should include the smarts to do this. For example, if you use jBCrypt, then you simply:
Store the string returned by BCrypt.hashpw() in the database password column when storing a password; and
Supply the value from the database password column as the second parameter to BCrypt.checkpw() when checking a password.

You could also store salt and hash in the same column (using a separator).
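A minimal sketch of the separator approach, in Python rather than the Java of the question. PBKDF2 from hashlib stands in for whatever hash you use, and the "$" separator, field order and function names are all assumptions for illustration:

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor
SEPARATOR = "$"       # assumed separator; hex digits never contain it

def encode_password(password: str) -> str:
    """Build the single-column value: iterations$salt_hex$hash_hex."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return SEPARATOR.join([str(ITERATIONS), salt.hex(), key.hex()])

def check_password(password: str, stored: str) -> bool:
    """Split the stored value, re-derive with the same salt, and compare."""
    iterations_text, salt_hex, key_hex = stored.split(SEPARATOR)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                              bytes.fromhex(salt_hex), int(iterations_text))
    return hmac.compare_digest(key, bytes.fromhex(key_hex))
```

The random salt is generated once when the password is set and read back from the same column at login, so no new salt is involved when checking.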

You always have to know the salt, by storing it in the DB (as you saw with multi column solutions) or be able to generate it in some other way (which defeats some, but not all, of the point of random salt).
If you only have a single column in which to store the password, then you can either:
generate the salt on the fly (ie not really random, but use some sort of function of the username or email address)
concatenate the salt with the hashed password in some way before storing it.
In the first case, you can come up with some trivial function:
public String getSalt(String username)
{
    // assuming Hash returns a String
    return Hash(username + " 1234 my site is totally awesome").substring(0, 16);
}
In the second:
// Passwords stored in the db as 16 characters of salt, and the rest is password hash
public boolean authenticate(String username, String authPassword)
{
    // 'SELECT saltyhash FROM users WHERE username=x'
    String saltyhash = getSaltyHashForUserFromDB(username);
    String salt = saltyhash.substring(0, 16);
    String dbPassword = salt + Hash(salt + authPassword);
    // perform the actual 'SELECT FROM users WHERE saltypassword=x' stuff
    return hitTheDatabaseToPerformLogin(username, dbPassword);
}

public void createUser(String username, String password)
{
    String salt = createSomeAwesomeSalt();
    String saltyhash = salt + Hash(salt + password);
    createTheUserInTheDatabase(username, saltyhash);
}

"Here's where I get lost: if the salt is random and hashed with the password, how can the system ever match the password?"
It matches it by computing the hash of the password the user entered with the same salt, which it reads from the database (at the same time as the hash).

See http://www.aspheute.com/english/20040105.asp
The user is authenticated with a salted hash, not the unsalted password or the random salt by itself. The salted hash and the salt (but not the actual password) are both stored in the database (you can store them in a single column if you like, but you will have to separate them again before use).
In order to recover the salted hash from the user (so that you can compare it to the stored salted hash), you need the salt from the database, and the password provided by the user.
The salted hash is created like this:
// Initialize the Password class with the password and salt
Password pwd = new Password(myPassword, mySalt);
// Compute the salted hash
// NOTE: you store the salt and the salted hash in the database
string saltedHash = pwd.ComputeSaltedHash();
Authentication is done like this:
// retrieve salted hash and salt from user database, based on username
...
Password pwd = new Password(txtPassword.Text, nSaltFromDatabase);
if (pwd.ComputeSaltedHash() == storedSaltedHash)
{
    // user is authenticated successfully
}
else
{
    ...
}
A new salt is generated for each user. Should two users accidentally choose the same password, the salted hash will still be different for both user accounts.
