Using hash of password to encrypt private key - security

I am developing a web application in which I need to encrypt sensitive information. My plan is to use AES-256, where the private key is encrypted by a hash of the user's password. I need to store the hash of the password for authentication purposes, but it obviously can't be the same hash that is used to encrypt the private key. My current thought is to use bcrypt to generate a key to be used to encrypt the private key. For authentication, my thought was to simply hash the password using bcrypt, then hash that hash using bcrypt again, and store the result in the database. Since it is one-way, there shouldn't be any way to use the stored hash to decrypt the private key? Are there any obvious security issues with doing this that I may be missing?
My other thought was to use two different encryption algorithms, such as using a bcrypt hash to encrypt the private key and storing a SHA-2 hash for authentication purposes.
Thanks for your help.

Don't use a hash to encrypt the AES key. A salted hash should be used only for authentication. When the user logs in, you have their password. Use this password to encrypt (the first time) and decrypt (later) the AES key, and then forget the password.

I'd recommend using PBKDF2 in this situation. You can use two different salts, one that would derive the symmetric key and the other one would derive the password hash to be stored. The salt should contain a deterministic part distinguishing the two different use cases, as well as a random part - cf. this comment:
Otherwise, the salt should contain data that explicitly
distinguishes between different operations and different key
lengths, in addition to a random part that is at least eight
octets long, and this data should be checked or regenerated by
the party receiving the salt. For instance, the salt could have
an additional non-random octet that specifies the purpose of
the derived key. Alternatively, it could be the encoding of a
structure that specifies detailed information about the derived
key, such as the encryption or authentication technique and a
sequence number among the different keys derived from the
password. The particular format of the additional data is left
to the application.
A plain, salted SHA-2 probably isn't enough because of the poor entropy of typical passwords, as was mentioned in the comments.
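A minimal sketch of the two-salt PBKDF2 idea described above, using only the Python standard library. The purpose bytes, the iteration count, and the names derive, PURPOSE_AUTH and PURPOSE_ENC are assumptions made for illustration and do not come from the answer.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware

# Hypothetical purpose labels: the deterministic part of each salt.
PURPOSE_AUTH = b"\x01"
PURPOSE_ENC = b"\x02"


def derive(password: str, purpose: bytes, random_salt: bytes) -> bytes:
    """Derive 32 bytes from the password for one specific purpose."""
    salt = purpose + random_salt  # deterministic part + random part
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Registration: generate both random salts and store them with the verifier.
auth_salt = secrets.token_bytes(16)
enc_salt = secrets.token_bytes(16)
stored_verifier = derive("correct horse", PURPOSE_AUTH, auth_salt)

# Login: check the password, then re-derive the key that wraps the AES key.
candidate = derive("correct horse", PURPOSE_AUTH, auth_salt)
login_ok = hmac.compare_digest(candidate, stored_verifier)
key_encryption_key = derive("correct horse", PURPOSE_ENC, enc_salt)  # never stored
```

Because the two derivations use different salts, the stored verifier reveals nothing directly usable as the key-encryption key.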

A suggestion: use two different salts. When the user enters their password, concatenate it with a random salt and hash it for the password recognition routine. Use a different salt and hash it again for the AES encryption key. Depending on how secure you want things, you can stretch the hashing as well.
Effectively you have:
storedPasswordCheck = SHA256(password + salt1);
AESkey = SHA256(password + salt2);
The AES keys are not stored of course, but are regenerated from the user's password as needed. You will need two separate salts, best at least 128 bits each, stored for each user.
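The pseudocode above translates fairly directly into Python. This is only a literal sketch of that answer (single-pass SHA-256, no stretching); the helper name salted_sha256 and the sample password are made up for illustration.

```python
import hashlib
import secrets


def salted_sha256(password: str, salt: bytes) -> bytes:
    # Mirrors SHA256(password + salt) from the pseudocode.
    return hashlib.sha256(password.encode("utf-8") + salt).digest()


salt1 = secrets.token_bytes(16)  # 128 bits, stored per user
salt2 = secrets.token_bytes(16)  # 128 bits, stored per user

stored_password_check = salted_sha256("hunter2", salt1)  # stored in the database
aes_key = salted_sha256("hunter2", salt2)  # 32 bytes, regenerated when needed
```

The 32-byte digest happens to be the right length for an AES-256 key, which is why the answer can use the hash output directly.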

Related

How does every system or server get a different hash function to store passwords

As I understand it, user passwords must be stored as hashes instead of encrypted, because an attacker can't deduce a password from its hash, while he can deduce a password from its encryption if he gets access to the encryption key.
Now, obviously every system must use a different hashing function to hash its keys. My question is, how do they create these different hashing functions? Do they use a standard hashing function and prime it with a big key? If so, wouldn't an attacker be able to deduce the passwords if he got access to this key, making it the same as encryption?
Cryptographic hash functions are always non-reversible; this is their purpose. Even discouraged "unsafe" functions like MD5 and SHA-1 are not reversible, and they don't need a key. The problem is that you can find possible matching passwords too fast by brute-forcing (more than 10 giga MD5 hashes per second).
The "big key" you mentioned is probably the salt. You generate a random salt and use this salt in the calculation. It is safe to store this salt together with the hash, because its purpose is to prevent the attacker from building one single rainbow table and finding matches for all passwords at once. Instead, (s)he must build a rainbow table for every salt separately, which makes those tables impractical.
The speed problem can only be overcome by iterating the hash function. A cost factor defines how many times the hash is calculated. Recommended algorithms are BCrypt, PBKDF2 and SCrypt.
Now, obviously every system must use a different hashing function to hash its keys
No, they don't.
If your password is s3cr3t, then it will have the same hash value in the database of a lot of servers, sadly likely A4D80EAC9AB26A4A2DA04125BC2C096A
The way to make this suck less is to generate a random code per password, called a salt, so that the hash of s3cr3t on server 1 is likely to be different than the hash of s3cr3t on server2: hashFunction('s3cr3t' + 'perUserSalt')
Use only bcrypt, scrypt, or PBKDF2 for password storage.
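To illustrate the point above, here is a small sketch comparing an unsalted hash with a salted, iterated one on two imaginary servers. PBKDF2 from the standard library stands in for "bcrypt, scrypt, or PBKDF2"; the iteration count and function names are assumptions.

```python
import hashlib
import secrets


def unsalted(password: str) -> str:
    # Same input always gives the same digest, on every server.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def salted(password: str, salt: bytes) -> str:
    # Salted and iterated; 100_000 is an assumed work factor.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000).hex()


server1_unsalted = unsalted("s3cr3t")
server2_unsalted = unsalted("s3cr3t")

server1_salted = salted("s3cr3t", secrets.token_bytes(16))
server2_salted = salted("s3cr3t", secrets.token_bytes(16))

print(server1_unsalted == server2_unsalted)  # identical everywhere
print(server1_salted == server2_salted)      # differs because the salts differ
```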

Encrypting a string

Is there a way to encrypt a string so that the effect is not reversible? For example, if you run some algorithm 100 times to encrypt a message, you can run it 100 times in reverse and get the original back. Is there a technology or method that eliminates this possibility?
There are two broad categories you should look into, depending on your needs:
Cryptographic Hash Functions
Cryptographic hash functions produce fixed-width values based on an arbitrarily long input, in such a way that even very minor changes in the input result in significantly different output. As a rule, they are irreversible (though flaws have been found in some algorithms). This is a good choice if you do not need to be able to recover the value of the string yourself. For example, good username/password verification systems store a hash of the password rather than the password itself, and authenticate by comparing that hash to the hash of the password provided by the user. This way, even if the username/password database is compromised, user passwords are not exposed.
Public-Key Cryptography
In public-key cryptography, a sender uses the intended recipient's "public" key to encrypt a message, and the recipient uses their "private" key to decrypt it. The message cannot be decrypted by the same key that encrypted it, so in that sense the algorithm is not strictly "reversible" (splitting hairs, I know). TLS, SSL, and PGP are all based on this technique, to name a few examples. This is probably your best option if you are transmitting data between two known parties.
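The two hash-function properties mentioned in the first category (fixed-width output, and large output changes for small input changes) can be seen with a short sketch. SHA-256 and the sample inputs are chosen here only for illustration.

```python
import hashlib

# Two inputs that differ in a single character.
digest_a = hashlib.sha256(b"hello world").hexdigest()
digest_b = hashlib.sha256(b"hello worle").hexdigest()

# A much longer input still yields a digest of the same width.
digest_long = hashlib.sha256(b"x" * 1_000_000).hexdigest()

# Count how many hex positions differ between the two short inputs.
differing_positions = sum(1 for x, y in zip(digest_a, digest_b) if x != y)

print(len(digest_a), len(digest_b), len(digest_long))
print(differing_positions, "of", len(digest_a), "hex characters differ")
```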

How can bcrypt have built-in salts?

Coda Hale's article "How To Safely Store a Password" claims that:
bcrypt has salts built-in to prevent rainbow table attacks.
He cites this paper, which says that in OpenBSD's implementation of bcrypt:
OpenBSD generates the 128-bit bcrypt salt from an arcfour
(arc4random(3)) key stream, seeded with random data the kernel
collects from device timings.
I don't understand how this can work. In my conception of a salt:
It needs to be different for each stored password, so that a separate rainbow table would have to be generated for each
It needs to be stored somewhere so that it's repeatable: when a user tries to log in, we take their password attempt, repeat the same salt-and-hash procedure we did when we originally stored their password, and compare
When I'm using Devise (a Rails login manager) with bcrypt, there is no salt column in the database, so I'm confused. If the salt is random and not stored anywhere, how can we reliably repeat the hashing process?
In short, how can bcrypt have built-in salts?
This is bcrypt:
Generate a random salt. A "cost" factor has been pre-configured. Collect a password.
Derive an encryption key from the password using the salt and cost factor. Use it to encrypt a well-known string. Store the cost, salt, and cipher text. Because these three elements have a known length, it's easy to concatenate them and store them in a single field, yet be able to split them apart later.
When someone tries to authenticate, retrieve the stored cost and salt. Derive a key from the input password, cost and salt. Encrypt the same well-known string. If the generated cipher text matches the stored cipher text, the password is a match.
Bcrypt operates in a very similar manner to more traditional schemes based on algorithms like PBKDF2. The main difference is its use of a derived key to encrypt known plain text; other schemes (reasonably) assume the key derivation function is irreversible, and store the derived key directly.
Stored in the database, a bcrypt "hash" might look something like this:
$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa
This is actually three fields, delimited by "$":
2a identifies the bcrypt algorithm version that was used.
10 is the cost factor; 2^10 iterations of the key derivation function are used (which is not enough, by the way. I'd recommend a cost of 12 or more.)
vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa is the salt and the cipher text, concatenated and encoded in a modified Base-64. The first 22 characters decode to a 16-byte value for the salt. The remaining characters are cipher text to be compared for authentication.
This example is taken from the documentation for Coda Hale's Ruby implementation.
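The field layout described above can be checked with a short standard-library sketch that splits the example string and decodes the salt. The function name split_bcrypt_hash is hypothetical, and the alphabet translation assumes the usual bcrypt Base-64 ordering ("./" first, then A-Z, a-z, 0-9).

```python
import base64

# bcrypt's modified Base-64 alphabet, mapped onto the standard one.
BCRYPT_ALPHABET = "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
STANDARD_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
TO_STANDARD = str.maketrans(BCRYPT_ALPHABET, STANDARD_ALPHABET)


def split_bcrypt_hash(stored: str) -> dict:
    """Split a stored bcrypt string into version, cost, salt and cipher text."""
    _, version, cost, payload = stored.split("$")
    salt_text, cipher_text = payload[:22], payload[22:]
    # 22 characters carry 16 bytes; pad to a multiple of 4 before decoding.
    salt = base64.b64decode(salt_text.translate(TO_STANDARD) + "==")
    return {
        "version": version,
        "cost": int(cost),
        "iterations": 2 ** int(cost),
        "salt": salt,
        "cipher_text": cipher_text,
    }


parts = split_bcrypt_hash("$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa")
print(parts["version"], parts["cost"], parts["iterations"], len(parts["salt"]))
```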
I believe that phrase should have been worded as follows:
bcrypt has salts built into the generated hashes to prevent rainbow table attacks.
The bcrypt utility itself does not appear to maintain a list of salts. Rather, salts are generated randomly and appended to the output of the function so that they are remembered later on (according to the Java implementation of bcrypt). Put another way, the "hash" generated by bcrypt is not just the hash. Rather, it is the hash and the salt concatenated.
In simple terms:
Bcrypt does not have a database in which it stores the salt.
The salt is added to the hash in Base64 format.
The question is: how does bcrypt verify the password when it has no database?
What bcrypt does is extract the salt from the password hash, use the extracted salt to encrypt the plain password, and compare the new hash with the old hash to see if they are the same.
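The standard library has no bcrypt, so the following sketch uses PBKDF2 to imitate the same pattern: the cost and salt travel inside the stored string, and verification extracts them again. The "$pbkdf2-demo$" format and both function names are invented for this example and are not a real bcrypt or library format.

```python
import base64
import hashlib
import hmac
import secrets


def _b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("ascii")


def hash_password(password: str, iterations: int = 100_000) -> str:
    """Return one self-describing field: scheme, cost, salt and digest."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"$pbkdf2-demo${iterations}${_b64(salt)}${_b64(digest)}"


def verify_password(password: str, stored: str) -> bool:
    """Extract the cost and salt from the stored field and redo the derivation."""
    _, _scheme, iterations, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, int(iterations)
    )
    return hmac.compare_digest(candidate, expected)


stored = hash_password("s3cr3t")
print(verify_password("s3cr3t", stored), verify_password("guess", stored))
```

No separate salt column is needed, which matches what the question observed with Devise.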
To make things even clearer,
Registration/Login direction ->
The password + salt is encrypted with a key generated from the cost, the salt, and the password. We call that encrypted value the cipher text. Then we attach the salt to this value and encode it using Base64. Attaching the cost to it gives the string produced by bcrypt:
$2a$COST$BASE64
This value is stored eventually.
What would the attacker need to do in order to find the password? (other direction <-)
If the attacker gets control over the DB, he can easily decode the Base64 value and will then be able to see the salt. The salt is not secret, though it is random.
Then he will need to decrypt the cipher text.
More importantly: there is no hashing in this process, but rather CPU-expensive encryption and decryption, so rainbow tables are less relevant here.
Let's imagine a table that has 1 hashed password. If a hacker gets access, he would know the salt, but he would have to calculate a big list for all the common passwords and compare after each calculation. This will take time, and he would have cracked only 1 password.
Imagine a second hashed password in the same table. The salt is visible but the same above calculation needs to happen again to crack this one too because the salts are different.
If no random salts were used, it would have been much easier, why? If we use simple hashing we can just generate hashes for common passwords 1 single time (rainbow table) and just do a simple table search, or simple file search between the db table hashes and our pre-calculated hashes to find the plain passwords.

Is It Possible To Reconstruct a Cryptographic Hash's Key

We would like to cryptographically (SHA-256) hash a secret value in our database. Since we want to use this as a way to lookup individual records in our database, we cannot use a different random salt for each encrypted value.
My question is: given unlimited access to our database, and given that the attacker knows at least one secret value and hashed value pair, is it possible for the attacker to reverse engineer the cryptographic key? IE, would the attacker then be able to reverse all hashes and determine all secret values?
It seems like this defeats the entire purpose of a cryptographic hash if it is the case, so perhaps I'm missing something.
There are no published "first pre-image" attacks against SHA-256. Without such an attack to open a shortcut, it is computationally infeasible for an attacker to recover a secret value from its SHA-256 hash.
However, the mention of a "secret key" might indicate some confusion about hashes. Hash algorithms don't use a key. So, if an attacker were able to attack one "secret-value–hash-value" pair, he wouldn't learn a "key" that would enable him to easily invert the rest of the hash values.
When a hash is attacked successfully, it is usually because the original message was from a small space. For example, most passwords are chosen from a relatively short list of real words, perhaps with some simple permutations. So, rather than systematically testing every possible password, the attacker starts with an ordered list of the few billion most common passwords. To avoid this, it's important to choose the "secret value" randomly from a large space.
There are message authentication algorithms that hash a secret key together with some data. These algorithms are used to protect the integrity of the message against tampering. But they don't help thwart pre-image attacks.
In short, yes.
No, a SHA hash is not reversible (at least not easily). When you hash something, you don't reverse it to check it; you reconstruct the hash instead. This is usually done with a private value (the salt) and a public key.
For example, if I'm trying to prevent access based off my user id. I would hash my user id and the salt. Let say MD5 for example. My user id is "12345" and the salt is "abcde"
So I will hash the string "12345_abcde", which return a hash of "7b322f78afeeb81ad92873b776558368"
Now I will pass the hash and the public key to the validating application; "12345" is the public key, and it is sent along with the hash.
The validating application knows the salt, so it hashes the same value, "12345_abcde", which in turn generates the exact same hash. I then compare the hash I computed with the one passed in, and they match. If I had somehow modified the public key without modifying the hash, a different hash would have been generated, resulting in a mismatch.
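The validation flow above can be sketched as follows. Note that this swaps in HMAC-SHA-256 from the standard library in place of the answer's MD5 over a concatenated string; the names SHARED_SECRET, sign and is_valid are hypothetical.

```python
import hashlib
import hmac

SHARED_SECRET = b"abcde"  # the private "salt", known to both applications


def sign(user_id: str) -> str:
    """Compute a tag over the public value using the shared secret."""
    return hmac.new(SHARED_SECRET, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid(user_id: str, tag: str) -> bool:
    """Recompute the tag on the validating side and compare in constant time."""
    return hmac.compare_digest(sign(user_id), tag)


tag = sign("12345")
print(is_valid("12345", tag))  # untouched user id
print(is_valid("12346", tag))  # modified user id, same tag
```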
Yes it's possible, but not in this lifetime.
Modern brute-force attacks using multiple GPUs could crack this in short order. I recommend you follow the guidelines for password storage for this application. Here are the current password storage guidelines from OWASP. Currently, they recommend a long salt value and PBKDF2 with 64,000 iterations, which iteratively stretches the key and makes it computationally expensive to brute-force the input values. Note that this will also make it computationally expensive for you to generate your key values, but the idea is that you will be generating keys far less frequently than an attacker would have to. That said, your design requires many more key derivations than a typical password storage/challenge application, so your design may be fatally flawed. Also keep in mind that the iteration count should be doubled every 18 months so that the computational cost keeps pace with Moore's Law. This means that your system would need some way of allowing you to rehash these values (possibly by combining hash techniques). Over time, you will find that old hash functions are broken by cryptanalysts, and you need to be ready to update your algorithms. For example, a single iteration of MD5 or SHA-1 used to be sufficient, but it is not anymore. There are other password hashing functions that could also suit your needs in place of PBKDF2 (such as bcrypt or scrypt), but PBKDF2 is currently the industry standard that has received the most scrutiny. One could argue that bcrypt or scrypt would also be suitable, but this is yet another reason why a pluggable scheme should be used to allow you to upgrade hash functions over time.

Difference between Hashing a Password and Encrypting it

The current top-voted answer to this question states:
Another one that's not so much a security issue, although it is security-related, is complete and abject failure to grok the difference between hashing a password and encrypting it. Most commonly found in code where the programmer is trying to provide unsafe "Remind me of my password" functionality.
What exactly is this difference? I was always under the impression that hashing was a form of encryption. What is the unsafe functionality the poster is referring to?
Hashing is a one-way function (well, a mapping). It's irreversible: you apply the secure hash algorithm and you cannot get the original string back. The most you can do is to generate what's called "a collision", that is, finding a different string that provides the same hash. Cryptographically secure hash algorithms are designed to prevent the occurrence of collisions. You can attack a secure hash by the use of a rainbow table, which you can counteract by adding a salt to the password before hashing and storing it.
Encrypting is a proper (two way) function. It's reversible, you can decrypt the mangled string to get original string if you have the key.
The unsafe functionality it's referring to is that if you encrypt the passwords, your application has the key stored somewhere and an attacker who gets access to your database (and/or code) can get the original passwords by getting both the key and the encrypted text, whereas with a hash it's impossible.
People usually say that if a cracker owns your database or your code he doesn't need a password, thus the difference is moot. This is naïve, because you still have the duty to protect your users' passwords, mainly because most of them do use the same password over and over again, exposing them to a greater risk by leaking their passwords.
Hashing is a one-way function, meaning that once you hash a password it is very difficult to get the original password back from the hash. Encryption is a two-way function, where it's much easier to get the original text back from the encrypted text.
Plain hashing is easily defeated using a dictionary attack, where an attacker just pre-hashes every word in a dictionary (or every combination of characters up to a certain length), then uses this new dictionary to look up hashed passwords. Using a unique random salt for each hashed password stored makes it much more difficult for an attacker to use this method. They would basically need to create a new unique dictionary for every salt value that you use, slowing down their attack terribly.
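A toy sketch of the dictionary attack described above, and of how a per-password salt defeats the precomputed lookup. The four-word password list and the single-pass SHA-256 are assumptions chosen to keep the example short.

```python
import hashlib
import secrets

COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]  # toy dictionary


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# The attacker pre-hashes the dictionary once and reuses it for every leak.
lookup = {sha256_hex(word.encode("utf-8")): word for word in COMMON_PASSWORDS}

# Unsalted hash from a leaked database: a single dictionary lookup recovers it.
leaked_unsalted = sha256_hex(b"letmein")
recovered = lookup.get(leaked_unsalted)

# Salted hash: the precomputed table no longer contains a matching entry.
salt = secrets.token_bytes(16)
leaked_salted = sha256_hex(salt + b"letmein")
recovered_salted = lookup.get(leaked_salted)

print(recovered, recovered_salted)
```

The attacker can still hash the dictionary again with the leaked salt, but that work has to be repeated for every salt value.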
It's unsafe to store passwords using an encryption algorithm because if it's easier for the user or the administrator to get the original password back from the encrypted text, it's also easier for an attacker to do the same.
If the password is encrypted, it is always a hidden secret from which someone can extract the plain text password. However, when the password is hashed, you can relax, as there is hardly any method of recovering the password from the hash value.
Extracted from Encrypted vs Hashed Passwords - Which is better?
Is encryption good?
Plain text passwords can be encrypted using symmetric encryption algorithms like DES or AES, or with any other algorithm, and be stored inside the database. At authentication (confirming the identity with user name and password), the application will decrypt the encrypted password stored in the database and compare it with the user-provided password for equality. In this type of password handling approach, even if someone gets access to the database tables, the passwords will not be simply reusable. However, there is bad news in this approach as well. If somehow someone obtains the cryptographic algorithm along with the key used by your application, he/she will be able to view all the user passwords stored in your database by decrypting them. "This is the best option I've got", a software developer may scream, but is there a better way?
Cryptographic hash function (one-way-only)
Yes, there is; maybe you have missed the point here. Did you notice that there is no requirement to decrypt and compare? Suppose there is a one-way-only conversion approach where the password can be converted into some converted-word, but the reverse operation (generating the password from the converted-word) is impossible. Now, even if someone gets access to the database, there is no way that the passwords can be reproduced or extracted using the converted-words. In this approach, there is hardly any way that someone could learn your users' top secret passwords, and this will protect users who use the same password across multiple applications. What algorithms can be used for this approach?
I've always thought that Encryption can be converted both ways, in a way that the end value can bring you to original value and with Hashing you'll not be able to revert from the end result to the original value.
Hashing algorithms are usually cryptographic in nature, but the principal difference is that encryption is reversible through decryption, and hashing is not.
An encryption function typically takes input and produces encrypted output that is the same, or slightly larger size.
A hashing function takes input and produces a typically smaller output, typically of a fixed size as well.
While it isn't possible to take a hashed result and "dehash" it to get back the original input, you can typically brute-force your way to something that produces the same hash.
In other words, if an authentication scheme takes a password, hashes it, and compares it to a hashed version of the required password, it might not be required that you actually know the original password, only its hash, and you can brute-force your way to something that will match, even if it's a different password.
Hashing functions are typically created to minimize the chance of collisions and make it hard to just calculate something that will produce the same hash as something else.
Hashing:
It is a one-way algorithm; once a value is hashed it cannot be rolled back, and this is its advantage over encryption.
Encryption
If we perform encryption, there will be a key to do this. If this key is leaked, all of your passwords could be decrypted easily.
On the other hand, even if your database is hacked or your server admin takes data from the DB, if you used hashed passwords the hacker will not be able to break them. This would be practically impossible if we use hashing with a proper salt and additional security with PBKDF2.
If you want to take a look at how should you write your hash functions, you can visit here.
There are many algorithms to perform hashing.
MD5 - Uses the Message Digest Algorithm 5 (MD5) hash function. The output hash is 128 bits in length. The MD5 algorithm was designed by Ron Rivest in the early 1990s and is not a preferred option today.
SHA1 - Uses the Secure Hash Algorithm (SHA1) hash published in 1995. The output hash is 160 bits in length. Although most widely used, this is not a preferred option today.
HMACSHA256, HMACSHA384, HMACSHA512 - Use the functions SHA-256, SHA-384, and SHA-512 of the SHA-2 family. SHA-2 was published in 2001. The output hash lengths are 256, 384, and 512 bits, respectively, as the hash functions' names indicate.
Ideally you should do both.
First hash the password for one-way security. Use a salt for extra security.
Then encrypt the hash to defend against dictionary attacks if your database of password hashes is compromised.
As correct as the other answers may be, in the context that the quote was in, hashing is a tool that may be used in securing information, encryption is a process that takes information and makes it very difficult for unauthorized people to read/use.
Here's one reason you may want to use one over the other - password retrieval.
If you only store a hash of a user's password, you can't offer a 'forgotten password' feature.

Resources