I've been doing some research about securely storing passwords in a database. It is generally suggested that you use a salt. As explained in one of the answers in Secure hash and salt for PHP passwords, this changes the value of hashes, making a password more difficult to compromise.
As part of the verification mechanism, the password entered by the user is combined with the salt and hashed as needed. Given that the salt is transparent to the user, how does using salt provide any added benefit?
As I see it, with or without hashing, the same password will successfully authenticate you, because the plumbing that makes it different will take place behind the scenes. That is why none of the articles I've read so far have clarified things.
Consider a scenario where you accept a password from your user and send it over the network or store it in a database as plain text.
Suppose your user enters a password 6-8 characters long. An attacker may have pre-generated hashes for all possible strings of 6-8 characters, and can deduce the password by comparing them with your hash. (Matching your hash against all the pre-generated hashes gives a set of possible candidates if collisions occur.)
But if you append a salt of, say, 30 characters to the plain-text password and then hash it, it becomes very difficult for any attacker to pre-generate all possible combinations of that length. That is the main reason we use a salt.
You can't require every user to enter a 30-character password for security purposes. If a user chooses a 4-character password, adding a 30-character salt makes it more secure against precomputed tables.
Salted passwords reduce the probability that a rainbow table will already have the salted password's hash contained in it.
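A minimal sketch of why the salt can be "transparent" to the user and still help: the user types the same password, but what gets stored depends on the salt. The function name `hash_password` and the sample password are assumptions made for illustration, and a single SHA-256 pass is used only to keep the example short.

```python
import hashlib
import os

def hash_password(password: str, salt: bytes) -> str:
    # The salt is prepended to the password bytes before hashing.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

# Two accounts that happen to choose the same password.
salt_a = os.urandom(16)
salt_b = os.urandom(16)
stored_a = hash_password("hunter2", salt_a)
stored_b = hash_password("hunter2", salt_b)

# Different salts give different stored hashes for the same password.
print(stored_a != stored_b)

# At login the server re-hashes the typed password with the stored salt.
print(hash_password("hunter2", salt_a) == stored_a)
```

The user never sees the salt; it only changes what an attacker would have had to precompute.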
I've been reading up on OWASP 10 and I came across the best practice to store information.
Salted hashing: you generate one random salt for every password, combine the two, hash the result, and store it.
My doubt is: if the salt is generated randomly, how can the password be authenticated when the user types it?
Is the salt saved along with the user name?
If so, this practice is still vulnerable.
OR how do they do it?
The salt is saved along with the user name. Salts are not secret. The point of a salt is to ensure that if two people have the same password, they won't have the same hashed password. This prevents pre-computed hash attacks (rainbow tables), and prevents leaking that two users in a database have the same password.
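As a rough sketch of that flow, assuming a hypothetical in-memory `users` dict in place of a real table, and PBKDF2 from Python's standard library as the hash (the iteration count of 100,000 is an arbitrary choice for this example):

```python
import hashlib
import hmac
import os

users = {}  # hypothetical stand-in for the user table

def register(username: str, password: str) -> None:
    salt = os.urandom(16)  # fresh random salt per password, stored in the clear
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    users[username] = {"salt": salt, "hash": digest}

def authenticate(username: str, password: str) -> bool:
    record = users.get(username)
    if record is None:
        return False
    # Look up the stored salt and repeat the same computation.
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], 100_000
    )
    return hmac.compare_digest(candidate, record["hash"])

register("alice", "correct horse")
register("bob", "correct horse")
print(authenticate("alice", "correct horse"))         # True
print(users["alice"]["hash"] != users["bob"]["hash"])  # same password, different hashes
```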
While per-user random salts are ideal, the benefits of salting can also be achieved with deterministic, but unique, salts. For example, you can use some fixed string for your database and join that with the userid (com.example.mygreatsystem:user1#example.com) and use that as the salt. Since it's unique to every user (not just within this system, but globally), it achieves the same goals as a random salt without requiring an extra database lookup. Like with random salts, this scheme does not need to be secret. The important part of a salt is it be unique. But when practical, a per-user random salt of sufficient length (typically 8 random bytes), stored with the user record, is best practice.
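The deterministic variant might look like the sketch below. The prefix comes from the example above; the function names and the use of PBKDF2 with 100,000 iterations are assumptions made for illustration.

```python
import hashlib

SYSTEM_PREFIX = "com.example.mygreatsystem"  # fixed string for this deployment

def deterministic_salt(user_id: str) -> bytes:
    # No random value to store: the salt is rebuilt from data already on hand.
    return f"{SYSTEM_PREFIX}:{user_id}".encode("utf-8")

def hash_password(user_id: str, password: str) -> str:
    salt = deterministic_salt(user_id)
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000
    ).hex()

h1 = hash_password("user1#example.com", "letmein")
h2 = hash_password("user2#example.com", "letmein")
print(h1 != h2)  # same password, different users, different hashes
```

One consequence of this design is that a user who changes their id needs their password re-hashed, since the salt changes with it.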
From Wikipedia on Salt (cryptography):
A new salt is randomly generated for each password. In a typical setting, the salt and the password are concatenated and processed with a cryptographic hash function, and the resulting output (but not the original password) is stored with the salt in a database.
But what if I don't have a discrete database? Is it okay to salt with an intrinsic property of the password, such as its reverse? Or even (better?) salting a password with the hash of the password? For example:
md5(md5("password") + "password")
Of course there are performance consequences, but if I'm working with a low-access system, would this kind of salting display any vulnerabilities?
Again, the main reason I would look into doing this would be to save myself a lot of trouble storing a salt.
Since you have to store the hash of the password plus any salt somewhere (else, how would you have anything to compare to when it comes time to authenticate), why not store them together?
It's not uncommon to store both the salt and the hash result of the password and salt in a single field. They can be teased apart when needed by using salts with constant lengths, or by using a separator character that is not part of the set of characters used in your salt.
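A small sketch of the separator approach, assuming hex encoding for both parts and `$` as the separator (it never appears in hex output). The function names are hypothetical, and a single SHA-256 pass is used only for brevity.

```python
import hashlib
import hmac
import os

SEPARATOR = "$"  # not part of the hex alphabet, so splitting is unambiguous

def encode_record(password: str) -> str:
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt.hex() + SEPARATOR + digest

def check_record(record: str, password: str) -> bool:
    # Tease the two parts apart again, then repeat the computation.
    salt_hex, digest_hex = record.split(SEPARATOR)
    salt = bytes.fromhex(salt_hex)
    candidate = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, digest_hex)

record = encode_record("s3cret")
print(check_record(record, "s3cret"))  # True
print(check_record(record, "guess"))   # False
```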
Would this kind of salting display any vulnerabilities?
Yes. A key purpose of the unique salt is to ensure that users who select the same password will have different password hashes. If the salt is calculated as a function of the password, then users who share the same password will also share the same password hash.
With a database of hashes, an attacker can simply find hashes that appear multiple times. Such passwords are likely to be weak and attractive targets for a brute-force attack.
If you must store passwords and cannot store a dedicated salt, a better approach would be to use an invariant field associated with the account (e.g., username or account identifier) as the salt. This approach will protect against duplicate password hashes in your database.
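To make the difference concrete, here is a short sketch comparing the two ideas. MD5 appears only because the question used it, and the helper names are made up for this example.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def self_salted(password: str) -> str:
    # md5(md5("password") + "password"), as proposed in the question
    return md5_hex(md5_hex(password) + password)

def username_salted(username: str, password: str) -> str:
    # Salt taken from an invariant account field instead
    return md5_hex(username + password)

# Two users pick the same password.
print(self_salted("letmein") == self_salted("letmein"))  # True: duplicates are visible
print(username_salted("alice", "letmein") == username_salted("bob", "letmein"))  # False
```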
This I think may be a silly question, but I have become quite confused on what I should do here for the best.
When salting a password hash, should the salt also be hashed or left as plaintext?
NOTE: I am hashing a password in SHA-256 and the Salt is a pre defined string as only one password will ever be stored at a time.
TIA
Chris (Shamballa).
It doesn't matter.
The purpose of a salt is to prevent pre-computation attacks.
Either way, hashing the salt or using it by itself, results in the same data being added as a salt each time. If you hash the salt, all you are effectively doing is changing the salt. By hashing it first, you convert it into a different string, which is then used as the salt. There is no reason to do this, but it will not do anything wrong if you do.
You just need to be consistent and use the same method every time or you will end up with a different password hash.
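A quick sketch of that point, using a made-up predefined salt: hashing the salt first simply yields a different (but equally fixed) salt, so each method is repeatable, but the two are not interchangeable.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

password = b"secret"
salt = b"predefined-salt"  # hypothetical fixed salt, as in the question

# Method 1: use the salt as-is.
plain_salt_hash = sha256_hex(salt + password)

# Method 2: hash the salt first; the digest becomes the effective salt.
hashed_salt = hashlib.sha256(salt).digest()
hashed_salt_hash = sha256_hex(hashed_salt + password)

print(plain_salt_hash == sha256_hex(salt + password))  # True: method 1 is repeatable
print(plain_salt_hash == hashed_salt_hash)             # False: the methods cannot be mixed
```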
You must not hash the salt, since hashes are one way. You need the salt so that you can add it to the password before hashing. You could encrypt it, but it's not necessary.
The critical thing about salts is that each password should have its own salt. Ideally, each salt should be unique, but random is good too. The salt should therefore be long enough to allow it to be unique for each password.
If all salts are the same, it's obvious to the cracker (who can see your hash values), which accounts have the same password. The hash values will be the same. This means that if they crack one password, they get more than one account with no additional work. The cracker might even target those accounts.
You should assume that the cracker will gain both the salt and the hash value, so the hash algorithm must be secure.
Having any salt at all prevents using existing precomputed rainbow tables to crack your hash value, and having a unique salt for each account removes the desire for your cracker to precompute their own rainbow tables using your salt.
The salt should not be hashed, as you need the original value to combine with the password before hashing it.
No, you must not hash the salt. The salt is stored in clear text, because you need it to recompute the hash of the entered password and compare it with the one stored in the hashed password file.
But if you need a strong salting procedure you can compute your salted password in this manner:
SaltedHashedPwd = H(...H(H(H(PWD-k + SALT-k) + SALT-k) + SALT-k)... + SALT-k)   (H applied N times)
H is the hash function
SALT-k is a k-random string you use as salt
PWD-k is the k-password
(every Password has a different salt)
N is the iterations number you compose the H function
The PKCS#5 standard recommends N of at least 1000!
In this manner a dictionary attack becomes impractical, because for every word in the dictionary and for every salt in the password file the attacker needs to compute the hash N times. Too expensive in time!
I think that N=100 should be enough for your uses :-)
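One way to read the iterated construction above is the loop below. This is a sketch under that reading, with SHA-256 standing in for H and the function name `iterated_hash` invented for the example. In practice, Python's `hashlib.pbkdf2_hmac` provides the PKCS#5 PBKDF2 construction directly and is usually a better choice than a hand-rolled loop.

```python
import hashlib

def iterated_hash(password: bytes, salt: bytes, n: int) -> bytes:
    # First round: H(PWD + SALT)
    digest = hashlib.sha256(password + salt).digest()
    # Remaining n - 1 rounds: H(previous + SALT)
    for _ in range(n - 1):
        digest = hashlib.sha256(digest + salt).digest()
    return digest

slow = iterated_hash(b"password", b"per-user-salt", 1000)

# The standard-library alternative (PBKDF2-HMAC-SHA256, 1000 iterations).
standard = hashlib.pbkdf2_hmac("sha256", b"password", b"per-user-salt", 1000)
print(len(slow), len(standard))  # both are 32-byte digests
```

Each guess now costs the attacker N hash computations instead of one.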
As the salt needs to be saved along with the hash (or at least must be retrievable along with the hash), an attacker could possibly get both the salt and the hashed password. In some of my applications, I've stored the salt encrypted in the database (with a key known only to the application). My reasoning was that storing the salt unencrypted along with the hashed password would make it easier to crack the passwords, as a hacker that would be able to retrieve the password table (and would know or make an assumption about the hash algorithm) would be able to find matches between hashes of well known words (dictionary attack) by hashing each word in the dictionary and then salting with the salt he also has access to. If the salt would be encrypted, such an attack wouldn't be possible unless he would also have access to the encryption key known to the application.
(If anybody sees a fault in this logic, please comment.)
Suppose you were at liberty to decide how hashed passwords were to be stored in a DBMS. Are there obvious weaknesses in a scheme like this one?
To create the hash value stored in the DBMS, take:
A value that is unique to the DBMS server instance as part of the salt,
And the username as a second part of the salt,
And create the concatenation of the salt with the actual password,
And hash the whole string using the SHA-256 algorithm,
And store the result in the DBMS.
This would mean that anyone wanting to come up with a collision should have to do the work separately for each user name and each DBMS server instance separately. I'd plan to keep the actual hash mechanism somewhat flexible to allow for the use of the new NIST standard hash algorithm (SHA-3) that is still being worked on.
The 'value that is unique to the DBMS server instance' need not be secret - though it wouldn't be divulged casually. The intention is to ensure that if someone uses the same password in different DBMS server instances, the recorded hashes would be different. Likewise, the user name would not be secret - just the password proper.
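A sketch of the scheme as described, with hypothetical names throughout. The NUL separator between the parts is an addition not in the description above; it is there so that, for example, ("ab", "c") and ("a", "bc") cannot produce the same input string.

```python
import hashlib

def scheme_hash(instance_id: str, username: str, password: str) -> str:
    # Salt = server-instance value + username; then the password is appended.
    # The "\x00" separator is an assumption of this sketch, not part of the scheme.
    material = "\x00".join([instance_id, username, password])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

h_prod = scheme_hash("instance-A", "jleffler", "pa55word")
h_test = scheme_hash("instance-B", "jleffler", "pa55word")
h_other = scheme_hash("instance-A", "someone", "pa55word")

print(h_prod != h_test)   # same user and password, different server instance
print(h_prod != h_other)  # same password, different user
```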
Would there be any advantage to having the password first and the user name and 'unique value' second, or any other permutation of the three sources of data? Or what about interleaving the strings?
Do I need to add (and record) a random salt value (per password) as well as the information above? (Advantage: the user can re-use a password and still, probably, get a different hash recorded in the database. Disadvantage: the salt has to be recorded. I suspect the advantage considerably outweighs the disadvantage.)
There are quite a lot of related SO questions - this list is unlikely to be comprehensive:
Encrypting/Hashing plain text passwords in database
Secure hash and salt for PHP passwords
The necessity of hiding the salt for a hash
Clients-side MD5 hash with time salt
Simple password encryption
Salt generation and Open Source software
Password hashes: fixed-length binary fields or single string field?
I think that the answers to these questions support my algorithm (though if you simply use a random salt, then the 'unique value per server' and username components are less important).
The salt just needs to be random and unique. It can be freely known as it doesn't help an attacker. Many systems will store the plain text salt in the database in the column right next to the hashed password.
The salt helps to ensure that if two people (User A and User B) happen to share the same password it isn't obvious. Without the random and unique salt for each password the hash values would be the same and obviously if the password for User A is cracked then User B must have the same password.
It also helps protect from attacks where a dictionary of hashes can be matched against known passwords. e.g. rainbow tables.
Using an algorithm with a built-in "work factor", such as bcrypt, also means that as computational power increases, the work required to create the hash can be increased too. This makes the economics of brute-force attacks untenable. Presumably it also becomes much more difficult to create tables of known hashes, because they take longer to create, and variations in the work factor mean more tables would have to be built.
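bcrypt is not in Python's standard library, so the sketch below swaps in PBKDF2, whose iteration count plays the same role as a work factor. The record layout (`iterations$salt$hash`) and the function names are assumptions made for this example. Storing the count with each hash lets old and new records coexist while the factor is raised.

```python
import hashlib
import hmac
import os

def make_record(password: str, iterations: int) -> str:
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # The work factor travels with the hash, next to the salt.
    return f"{iterations}${salt.hex()}${digest.hex()}"

def verify(password: str, record: str) -> bool:
    iterations_text, salt_hex, digest_hex = record.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations_text)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)

old_record = make_record("s3cret", 10_000)    # created when hardware was slower
new_record = make_record("s3cret", 200_000)   # same password, higher work factor
print(verify("s3cret", old_record), verify("s3cret", new_record))  # True True
```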
I think you are over-complicating the problem.
Start with the problem:
Are you trying to protect weak passwords?
Are you trying to mitigate against rainbow attacks?
The mechanism you propose does protect against a simple rainbow attack, because even if user A and user B have the SAME password, the hashed passwords will be different. It does seem like a rather elaborate, overly complicated way to salt a password.
What happens when you migrate the DB to another server?
Can you change the unique per-DB value? If so, a global rainbow table can be generated; if not, you cannot restore your DB.
Instead I would just add the extra column and store a proper random salt. This would protect against any kind of rainbow attack. Across multiple deployments.
However, it will not protect you against a brute force attack. So if you are trying to protect users that have crappy passwords, you will need to look elsewhere. For example if your users have 4 letter passwords, it could probably be cracked in seconds even with a salt and the newest hash algorithm.
I think you need to ask yourself "What are you hoping to gain by making this more complicated than just generating a random salt value and storing it?" The more complicated you make your algorithm, the more likely you are to introduce a weakness inadvertently. This will probably sound snarky no matter how I say it, but it's meant helpfully - what is so special about your app that it needs a fancy new password hashing algorithm?
Why not add a random salt to the password and hash that combination. Next concatenate the hash and salt to a single byte[] and store that in the db?
The advantage of a random salt is that the user is free to change their username. The salt doesn't have to be secret, since it's used to prevent dictionary attacks.
At work we have two competing theories for salts. The products I work on use something like a user name or phone number to salt the hash. Essentially something that is different for each user but is readily available to us. The other product randomly generates a salt for each user and changes each time the user changes the password. The salt is then encrypted in the database.
My question is if the second approach is really necessary? I can understand from a purely theoretical perspective that it is more secure than the first approach, but what about from a practicality point of view. Right now to authenticate a user, the salt must be unencrypted and applied to the login information.
After thinking about it, I just don't see a real security gain from this approach. Changing the salt from account to account, still makes it extremely difficult for someone to attempt to brute force the hashing algorithm even if the attacker was aware of how to quickly determine what it was for each account. This is going on the assumption that the passwords are sufficiently strong. (Obviously finding the correct hash for a set of passwords where they are all two digits is significantly easier than finding the correct hash of passwords which are 8 digits). Am I incorrect in my logic, or is there something that I am missing?
EDIT: Okay so here's the reason why I think it's really moot to encrypt the salt. (lemme know if I'm on the right track).
For the following explanation, we'll assume that the passwords are always 8 characters and the salt is 5 and all passwords are comprised of lowercase letters (it just makes the math easier).
Having a different salt for each entry means that I can't use the same rainbow table (actually technically I could if I had one of sufficient size, but let's ignore that for the moment). This is the real key to the salt from what I understand, because to crack every account I have to reinvent the wheel, so to speak, for each one. Now if I know how to apply the correct salt to a password to generate the hash, I'd do it, because a salt really just extends the length/complexity of the hashed phrase. So I would be cutting the number of possible combinations I would need to generate to "know" I have the password + salt from 26^13 to 26^8, because I know what the salt is. Now that makes it easier, but still really hard.
So onto encrypting the salt. If I know the salt is encrypted, I wouldn't try to decrypt it first (assuming I know it has a sufficient level of encryption). I would ignore it. Instead of trying to figure out how to decrypt it, going back to the previous example, I would just generate a larger rainbow table containing all keys for the 26^13 combinations. Not knowing the salt would definitely slow me down, but I don't think it would add the monumental task of trying to crack the salt encryption first. That's why I don't think it's worth it. Thoughts?
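For the numbers in this example: with lowercase letters only, there are 26 choices per position, so a string of length n has 26^n possibilities. A short calculation, with variable names invented for this sketch:

```python
ALPHABET_SIZE = 26   # lowercase letters only
PASSWORD_LENGTH = 8
SALT_LENGTH = 5

# Salt known: only the password characters have to be guessed.
known_salt_space = ALPHABET_SIZE ** PASSWORD_LENGTH

# Salt unknown: password and salt characters all have to be guessed.
unknown_salt_space = ALPHABET_SIZE ** (PASSWORD_LENGTH + SALT_LENGTH)

print(known_salt_space)                        # 208827064576
print(unknown_salt_space // known_salt_space)  # 11881376, i.e. 26**5
```

So hiding a 5-letter salt multiplies the search space by 26^5, roughly 12 million, for as long as the salt stays hidden.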
Here is a link describing how long passwords will hold up under a brute force attack:
http://www.lockdown.co.uk/?pg=combi
Hiding a salt is unnecessary.
A different salt should be used for every hash. In practice, this is easy to achieve by getting 8 or more bytes from cryptographic quality random number generator.
From a previous answer of mine:
Salt helps to thwart pre-computed dictionary attacks.
Suppose an attacker has a list of likely passwords. He can hash each
and compare it to the hash of his victim's password, and see if it
matches. If the list is large, this could take a long time. He doesn't
want to spend that much time on his next target, so he records the result
in a "dictionary" where a hash points to its corresponding input. If
the list of passwords is very, very long, he can use techniques like a
Rainbow Table to save some space.
However, suppose his next target salted their password. Even if the
attacker knows what the salt is, his precomputed table is
worthless—the salt changes the hash resulting from each password. He
has to re-hash all of the passwords in his list, affixing the target's
salt to the input. Every different salt requires a different
dictionary, and if enough salts are used, the attacker won't have room
to store dictionaries for them all. Trading space to save time is no
longer an option; the attacker must fall back to hashing each password
in his list for each target he wants to attack.
So, it's not necessary to keep the salt secret. Ensuring that the
attacker doesn't have a pre-computed dictionary corresponding to that
particular salt is sufficient.
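The quoted argument can be sketched in a few lines. The password list, the salt value, and the helper name `h` are made up for this example, and a plain dict stands in for the attacker's precomputed "dictionary".

```python
import hashlib

def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

likely_passwords = ["123456", "password", "letmein"]

# The attacker's precomputed dictionary: hash -> input.
precomputed = {h(p.encode("utf-8")): p for p in likely_passwords}

# Unsalted target: a single lookup recovers the password.
unsalted_target = h(b"letmein")
print(precomputed.get(unsalted_target))  # letmein

# Salted target: the salt is known, but the old table no longer matches.
salt = b"\x8f\x02\xa1\x5c\x33\x9e\x70\x14"
salted_target = h(salt + b"letmein")
print(precomputed.get(salted_target))    # None

# The attacker has to re-hash the whole list for this one salt.
rehashed = {h(salt + p.encode("utf-8")): p for p in likely_passwords}
print(rehashed.get(salted_target))       # letmein
```

The salt does not stop the attack on a weak password; it forces the work to be repeated per target.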
After thinking about this a bit more, I've realized that fooling yourself into thinking the salt can be hidden is dangerous. It's much better to assume the salt cannot be hidden, and design the system to be safe in spite of that. I provide a more detailed explanation in another answer.
However, recent recommendations from NIST encourage the use of an additional, secret "salt" (I've seen others call this additional secret "pepper"). One additional iteration of the key derivation can be performed using this secret as a salt. Rather than increasing strength against a pre-computed lookup attack, this round protects against password guessing, much like the large number of iterations in a good key derivation function. This secret serves no purpose if stored with the hashed password; it must be managed as a secret, and that could be difficult in a large user database.
The answer here is to ask yourself what you're really trying to protect from? If someone has access to your database, then they have access to the encrypted salts, and they probably have access to your code as well. With all that could they decrypt the encrypted salts? If so then the encryption is pretty much useless anyway. The salt really is there to make it so it isn't possible to form a rainbow table to crack your entire password database in one go if it gets broken into. From that point of view, so long as each salt is unique there is no difference, a brute force attack would be required with your salts or the encrypted salts for each password individually.
A hidden salt is no longer salt. It's pepper. It has its use. It's different from salt.
Pepper is a secret key added to the password + salt which makes the hash into an HMAC (Hash Based Message Authentication Code). A hacker with access to the hash output and the salt can theoretically brute force guess an input which will generate the hash (and therefore pass validation in the password textbox). By adding pepper you increase the problem space in a cryptographically random way, rendering the problem intractable without serious hardware.
For more information on pepper, check here.
See also hmac.
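A minimal sketch of salt plus pepper, assuming the pepper is a secret key held by the application (here passed in as an argument) and applied as an HMAC over a salted PBKDF2 digest. The names and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os

def peppered_hash(password: str, salt: bytes, pepper: bytes) -> bytes:
    # Salted, iterated hash first; the salt is stored with the record.
    inner = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    # Then one keyed round; the pepper is never stored in the database.
    return hmac.new(pepper, inner, hashlib.sha256).digest()

salt = os.urandom(16)
pepper = b"application-secret-kept-outside-the-db"  # hypothetical value
stored = peppered_hash("s3cret", salt, pepper)

print(hmac.compare_digest(stored, peppered_hash("s3cret", salt, pepper)))        # True
print(hmac.compare_digest(stored, peppered_hash("s3cret", salt, b"wrong key")))  # False
```

An attacker who dumps only the database has the salt and the hash, but cannot test guesses without the key.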
My understanding of "salt" is that it makes cracking more difficult, but it doesn't try to hide the extra data. If you are trying to get more security by making the salt "secret", then you really just want more bits in your encryption keys.
The second approach is only slightly more secure. Salts protect users from dictionary attacks and rainbow table attacks. They make it harder for an ambitious attacker to compromise your entire system, but are still vulnerable to attacks that are focused on one user of your system. If you use information that's publicly available, like a telephone number, and the attacker becomes aware of this, then you've saved them a step in their attack. Of course the question is moot if the attacker gets your whole database, salts and all.
EDIT: After re-reading over this answer and some of the comments, it occurs to me that some of the confusion may be due to the fact that I'm only comparing the two very specific cases presented in the question: random salt vs. non-random salt. The question of using a telephone number as a salt is moot if the attacker gets your whole database, not the question of using a salt at all.
... something like a user name or phone number to salt the hash. ...
My question is if the second approach is really necessary? I can understand from a purely theoretical perspective that it is more secure than the first approach, but what about from a practicality point of view?
From a practical point of view, a salt is an implementation detail. If you ever change how user info is collected or maintained – and both user names and phone numbers sometimes change, to use your exact examples – then you may have compromised your security. Do you want such an outward-facing change to have much deeper security concerns?
Does stopping the requirement that each account have a phone number need to involve a complete security review to make sure you haven't opened up those accounts to a security compromise?
Here is a simple example showing why it is bad to have the same salt for each hash
Consider the following table
UserId  UserName  Password
1       Fred      Hash1 = Sha(Salt1+Password1)
2       Ted       Hash2 = Sha(Salt2+Password2)
Case 1: Salt1 is the same as Salt2
If Hash2 is replaced with Hash1, then user 2 can log on with user 1's password.
Case 2: Salt1 is not the same as Salt2
If Hash2 is replaced with Hash1, then user 2 cannot log on with user 1's password.
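The two cases can be played through in a short sketch. The row layout, the passwords, and the function names are invented for this example, with SHA-256 standing in for Sha.

```python
import hashlib

def sha(salt: bytes, password: str) -> str:
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def can_log_in(row: dict, password: str) -> bool:
    return sha(row["salt"], password) == row["hash"]

def swap_attack(salt1: bytes, salt2: bytes) -> bool:
    fred = {"salt": salt1, "hash": sha(salt1, "fred-password")}
    ted = {"salt": salt2, "hash": sha(salt2, "ted-password")}
    # Someone with write access overwrites Hash2 with Hash1.
    ted["hash"] = fred["hash"]
    # Can Ted's account now be opened with Fred's password?
    return can_log_in(ted, "fred-password")

print(swap_attack(b"same-salt", b"same-salt"))  # True  (case 1)
print(swap_attack(b"salt-1", b"salt-2"))        # False (case 2)
```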
There are two techniques, with different goals:
The "salt" is used to make two otherwise equal passwords encrypt differently. This way, an intruder can't efficiently use a dictionary attack against a whole list of encrypted passwords.
The (shared) "secret" is added before hashing a message, so that an intruder can't create his own messages and have them accepted.
I tend to hide the salt. I use 10 bits of salt by prepending a random number from 1 to 1024 to the beginning of the password before hashing it. When comparing the password the user entered with the hash, I loop from 1 to 1024 and try every possible value of salt until I find the match. This takes less than 1/10 of a second. I got the idea to do it this way from the PHP password_hash and password_verify. In my example, the "cost" is 10 for 10 bits of salt. Or from what another user said, hidden "salt" is called "pepper". The salt is not encrypted in the database. It's brute forced out. It would make the rainbow table necessary to reverse the hash 1000 times larger. I use sha256 because it's fast, but still considered secure.
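A sketch of what this might look like, with hypothetical function names. One detail is an assumption that differs from the description: the number is zero-padded to four digits, because prepending a bare number is ambiguous ("1" + "5x" and "15" + "x" give the same string).

```python
import hashlib
import secrets

def hash_with_hidden_salt(password: str) -> str:
    hidden = secrets.randbelow(1024) + 1  # 1..1024, never stored anywhere
    # Zero-padding is this sketch's choice, to keep the prefix unambiguous.
    return hashlib.sha256(f"{hidden:04d}{password}".encode("utf-8")).hexdigest()

def verify(password: str, stored: str) -> bool:
    # Try every possible hidden value until one reproduces the stored hash.
    for candidate in range(1, 1025):
        attempt = hashlib.sha256(f"{candidate:04d}{password}".encode("utf-8")).hexdigest()
        if attempt == stored:
            return True
    return False

stored = hash_with_hidden_salt("correct horse")
print(verify("correct horse", stored))  # True
print(verify("wrong guess", stored))    # False
```

Note that every guess by an attacker costs the same 1024 attempts at most, so this behaves like a small fixed work factor rather than a per-user salt.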
Really, it depends on from what type of attack you're trying to protect your data.
The purpose of a unique salt for each password is to prevent a dictionary attack against the entire password database.
Encrypting the unique salt for each password would make it more difficult to crack an individual password, yes, but you must weigh whether there's really much of a benefit. If the attacker, by brute force, finds that this string:
Marianne2ae85fb5d
hashes to a hash stored in the DB, is it really that hard to figure out which part is the password and which part is the salt?