Why doesn't salt help against a dictionary attack? - security

From this site http://codahale.com/how-to-safely-store-a-password/:
It’s important to note that salts are useless for preventing dictionary attacks or brute force attacks.
If salt is useless for preventing dictionary attacks, why use a salt at all?

For single passwords, it doesn't make that much of a difference. Brute-forcing an unsalted password is just as hard as brute-forcing a salted password. You just try out keys until you get a hit.
The difference is when there are a lot of passwords, for example in a leaked database. The basic idea is that part of the necessary computations can be re-used when cracking many passwords. This is done by constructing a rainbow table. Doing that is computationally expensive, but once done it allows the attacker to crack a lot of passwords relatively fast. Cracking N passwords with a rainbow table is a lot faster than brute-forcing those N passwords individually.
If every password is hashed with an individual salt, you can't re-use information in the same way. You could still construct rainbow tables, but they would only be usable for exactly one password in the database, which renders them useless. So in order to crack N passwords, you really have to brute-force all N passwords individually, which is usually not practical for the attacker.
For unsalted passwords and popular hash algorithms, you can simply download pre-calculated rainbow tables from the Internet, so an attacker wouldn't even have to calculate them himself. He can just download a table and look up the password for a particular hash. A salt prevents that.
Unsalted hashes also have the drawback that the password hash for two users with the same password is identical. So if an attacker finds multiple users with the same password hash, he only has to crack that password once.
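To make the last point concrete, here is a minimal sketch of how identical passwords collide without a salt and diverge with one. The function names and the use of SHA-256 are my own assumptions for illustration; a fast hash like this is not a recommendation for real password storage.

```python
import hashlib
import os

def unsalted_hash(password: str) -> str:
    # Plain hash: the same password always gives the same output.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: bytes) -> str:
    # The salt is mixed into the input, so the output depends on both.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

# Two hypothetical users who picked the same password.
alice_salt = os.urandom(16)
bob_salt = os.urandom(16)

same_without_salt = unsalted_hash("letmein") == unsalted_hash("letmein")
same_with_salt = salted_hash("letmein", alice_salt) == salted_hash("letmein", bob_salt)

print("identical unsalted hashes:", same_without_salt)
print("identical salted hashes:", same_with_salt)
```

With random 16-byte salts the two salted hashes should differ, so cracking one user's hash tells the attacker nothing about the other user.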

If the 'attacker' has the password hash (and salt) used by your site/app they will simply brute force "salt" + "password".
However, using a salt offers more protection against rainbow tables (precalculated hash tables) so they're still worth using.

Salts prevent instant cracking from a dictionary via rainbow tables; the article and follow-up make the point that the CPU/Storage tradeoff is now such that rainbow tables don't make sense, and so salts don't help you. And of course, they never helped with brute-force attacks.

For illustration purposes, say you are using a 2-character string as the salt, chosen at random from the set
salts = {'00', '01', '02', ..., '99'}
The formula you use is:
salt = salts[rnd(100)] # gets a random element from the set above, say '87'
password_hash = MD5(password + salt) # say the hash is 'dai480hgld0'
Thereafter you'll save the hash and salt in your database, something like
+---------------+------+
| password_hash | salt |
+---------------+------+
| dai480hgld0   | 87   |
| sjknigu2948   | 23   |
| .             | .    |
| .             | .    |
+---------------+------+
We assume that in a compromised system an attacker has access to your code - so he knows how you calculated your hashes.
The attacker will also have access to your database, so he has all the password hashes and the salts.
Given this information, in order to crack your password (which has the hash 'dai480hgld0') he'll have to do the following:
for word in dictionary_words #iterate over all the words in dictionary
    for salt in salts #iterate over all possible salts (100 iterations)
        password_hash = MD5(word + salt)
        if password_hash == 'dai480hgld0'
            print "The password is " + word
            exit()
        endif
    next
next
Note that if you had not used any salt at all, the algorithm would have been
for word in dictionary_words #iterate over all the words in dictionary
    password_hash = MD5(word)
    if password_hash == 'dai480hgld0'
        print "The password is " + word
        exit()
    endif
next
From the above two code samples, it's obvious that adding a salt to the password increases the number of attempts in the brute force attack. In our case, since there are 100 possible salts, you've made the attacker try each word with 100 salts.
So, to conclude:
Salts are good. They make your passwords tough to crack. Even if your users enter weak passwords, the salt makes sure that the resultant hashes are not googlable. For example, it's easy to google the hash '3cc31cd246149aec68079241e71e98f6', which belongs to a password that is fairly complex and will meet almost all password policies. Still, cracking it requires not a single line of code!
Salts are not a panacea. They just increase the time it takes for a cracker to brute force your passwords. However, if your salt address space is fairly big then you are pretty good. For example, if you use a 32-character alphanumeric string as a salt, brute force will really take very long.
Slow algorithms like bcrypt help you in this regard just because they are well... 'slow'. For a brute force attack, it will take unrealistically long to break hashes that are slow to compute.

Salt makes the stored hash stronger. However, dictionary attacks don't try to decrypt the password hash, so salt or no salt, it doesn't matter: they will just try out many passwords until one works.

Now this doesn't seem like a programming question, so I'll just give you some info on salting and encryption:
The purpose of salting is to strengthen one-way functions such as hashing. Hashing is widely used in cryptography, often for passwords, because a hash is hard to reverse and attacks such as brute force take a long time to crack it.
If you want to securely store passwords, the best way is definitely encryption. Look up encryption on Wikipedia for more info on that.

That is not entirely accurate; as with most things, it depends on your assumptions.
The main assumptions are:
The attacker has the salt.
Hashes can be calculated "on the fly" quickly (with a salt, the attacker has to recalculate everything and can't use predefined lists).
The same salt is used for each user.

Two comments:
Regular hash algorithms can be iterated. There is no need to use a non-standard algorithm just because you want to increase the work factor.
Using a Salt is to be recommended even if you use a slow hash method. It might not necessarily increase the work load of the best attack, but it will stop trivial attacks in case a user chooses a password identical to that of another user, another account or to an old password.
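As a rough sketch of the first comment, a standard hash can be iterated through PBKDF2, which ships in Python's hashlib. The helper name, the sample password and the iteration counts below are illustrative assumptions, not tuned recommendations.

```python
import hashlib
import os

def iterated_hash(password: str, salt: bytes, iterations: int) -> bytes:
    # PBKDF2 applies HMAC-SHA256 repeatedly; more iterations means more work per guess.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt = os.urandom(16)

# Same password and salt, different work factors (values chosen for illustration).
single_pass = iterated_hash("letmein", salt, 1)
many_passes = iterated_hash("letmein", salt, 100_000)

print(single_pass.hex())
print(many_passes.hex())
```

The attacker has to pay the same per-guess cost as the server, so raising the iteration count slows every candidate password he tries.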

This belongs on security.stackexchange.com
The problem is one of compute capacity in combination with the speed of the hashing algorithm. Basically, he's pitching bcrypt which is slow.
If a hacker has both the hash and salt used as well as knows the algorithm used to hash the password, then it's simply a matter of time to crack it.
If using a very fast algorithm, then that time is pretty short. If using an extremely slow algorithm then the time is, obviously, much longer to find a hit.
Which brings us to the primary reason why we hash/salt things in the first place: to buy time. Time that can be used in order to change all of the passwords listed and time to contact all of the users to let them know in case they need to change their passwords on other systems.
The reason we use salt is to force the hacker to build a rainbow table per salt value. This way one table can't be used to crack all of your passwords. The only reasons to do this are to buy time and, hopefully, dissuade the common hackers from investing further resources in cracking all of them.
Hashed passwords, regardless of mechanism used, are not secure in the sense that most people take that word. Secure doesn't mean "can never be cracked". Rather it means "this is going to be expensive in terms of time/effort to crack". For most hackers, they want low hanging fruit such as clear text only. For some, they'll go to whatever extreme is required, such as building massive rainbow tables per salt value to get them all.
And, of course, underpinning this is whether any "super" user accounts are easily identified in your user table. For most systems just cracking the sys admin type of account is good enough and therefore the fact of using a different salt value per user is immaterial. The smart ones will just bother with that one account.

Related

Why do salts make dictionary attacks 'impossible'?

Update: Please note I am not asking what a salt is, what a rainbow table is, what a dictionary attack is, or what the purpose of a salt is. I am querying: If you know the user's salt and hash, isn't it quite easy to calculate their password?
I understand the process, and implement it myself in some of my projects.
s = random salt
storedPassword = sha1(password + s)
In the database you store:
username | hashed_password | salt
Every implementation of salting I have seen adds the salt either at the end of the password, or beginning:
hashed_Password = sha1(s + password )
hashed_Password = sha1(password + s)
Therefore, a dictionary attack from a hacker who is worth his salt (ha ha) would simply run each keyword against the stored salts in the common combinations listed above.
Surely the implementation described above simply adds another step for the hacker, without actually solving the underlying issue? What alternatives are there to step around this issue, or am I misunderstanding the problem?
The only thing I can think to do is have a secret blending algorithm that laces the salt and password together in a random pattern, or adds other user fields to the hashing process meaning the hacker would have to have access to the database AND code to lace them for a dictionary attack to prove fruitful. (Update, as pointed out in comments it's best to assume the hacker has access to all your information so this probably isn't best).
Let me give an example of how I propose a hacker would hack a user database with a list of passwords and hashes:
Data from our hacked database:
RawPassword (not stored) | Hashed | Salt
--------------------------------------------------------
letmein WEFLS... WEFOJFOFO...
Common password dictionary:
Common Password
--------------
letmein
12345
...
For each user record, loop the common passwords and hash them:
for each user in hacked_DB
    salt = users_salt
    hashed_pw = users_hashed_password
    for each common_password
        testhash = sha1(common_password + salt)
        if testhash = hashed_pw then
            //Match! User's password = common_password
            //Let's visit the webpage and login now.
        end if
    next
next
I hope this illustrates my point a lot better.
Given 10,000 common passwords, and 10,000 user records, we would need to calculate 100,000,000 hashes to discover as many user passwords as possible. It might take a few hours, but it's not really an issue.
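For anyone who wants to run it, the loop above can be sketched in Python roughly as follows. The two leaked rows, their salts and the three-word dictionary are made-up sample data, and the attempt counter is only there to show where the users x passwords cost comes from.

```python
import hashlib

def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Hypothetical leaked rows: (username, hashed_password, salt)
hacked_db = [
    ("alice", sha1_hex("letmein" + "a1b2"), "a1b2"),
    ("bob", sha1_hex("correct horse battery" + "c3d4"), "c3d4"),
]
common_passwords = ["letmein", "12345", "password"]

cracked = {}
attempts = 0
for username, hashed_pw, salt in hacked_db:
    for candidate in common_passwords:
        attempts += 1  # one hash computation per (user, candidate) pair
        if sha1_hex(candidate + salt) == hashed_pw:
            cracked[username] = candidate
            break  # stop at the first match for this user

print(cracked)
print("hashes computed:", attempts)
```

In the worst case every candidate is hashed for every user, which is where the 10,000 x 10,000 = 100,000,000 figure comes from.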
Update on Cracking Theory
We will assume we are a corrupt webhost, that has access to a database of SHA1 hashes and salts, along with your algorithm to blend them. The database has 10,000 user records.
This site claims to be able to calculate 2,300,000,000 SHA1 hashes per second using the GPU. (In real world situation probably will be slower, but for now we will use that quoted figure).
(((95^4)/2300000000)/2)*10000 = 177 seconds
Given the full range of 95 printable ASCII characters and a maximum length of 4 characters, divided by the rate of calculation (variable), divided by 2 (assuming that discovering a password requires, on average, 50% of the permutations), for 10,000 users it would take 177 seconds to work out all users' passwords where the length is <= 4.
Let's adjust it a bit for realism.
(((36^7)/1000000000)/2)*10000 = 391,821 seconds = about 4.5 days
Assuming case insensitivity, a password length <= 7, and only alphanumeric characters, it would take about 4.5 days to solve for 10,000 user records, and I've more than halved the speed of the algorithm to reflect overhead and non-ideal circumstances.
It is important to recognise that this is a linear brute force attack: all calculations are independent of one another, therefore it's a perfect task for multiple systems to solve (i.e. it is easy to set up 2 computers running the attack from different ends, which would halve the execution time).
Given the case of recursively hashing a password 1,000 times to make this task more computationally expensive:
(((36^7) / 1 000 000 000) / 2) * 1000 seconds = 10.8839117 hours
This represents a maximum length of 7 alphanumeric characters, at less than half the quoted speed, for one user.
Recursively hashing 1,000 times effectively blocks a blanket attack, but individual users are still vulnerable to targeted attacks.
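The three estimates above can be reproduced with a few lines of Python. This is just a sketch of the same arithmetic; the function name and parameters are mine, and like the formulas above it counts only passwords of exactly the given length.

```python
def seconds_to_crack(alphabet_size, length, hashes_per_second, users=1, iterations=1):
    # Keyspace for passwords of exactly `length` characters.
    keyspace = alphabet_size ** length
    # Divide by 2: on average half the keyspace is searched before a hit.
    return keyspace / hashes_per_second / 2 * users * iterations

# 95 printable ASCII chars, length 4, quoted GPU rate, 10,000 users
short = seconds_to_crack(95, 4, 2_300_000_000, users=10_000)
# 36 alphanumeric chars, length 7, reduced rate, 10,000 users
realistic = seconds_to_crack(36, 7, 1_000_000_000, users=10_000)
# Same keyspace, one user, 1,000 hash iterations per guess
stretched = seconds_to_crack(36, 7, 1_000_000_000, iterations=1_000)

print(f"{short:.0f} seconds")
print(f"{realistic / 86400:.1f} days")
print(f"{stretched / 3600:.2f} hours")
```

This gives roughly 177 seconds, 4.5 days and 10.88 hours respectively.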
It doesn't stop dictionary attacks.
What it does is stop someone who manages to get a copy of your password file from using a rainbow table to figure out what the passwords are from the hashes.
Eventually, it can be brute-forced, though. The answer to that part is to force your users to not use dictionary words as passwords (minimum requirements of at least one number or special character, for example).
Update:
I should have mentioned this earlier, but some (most?) password systems use a different salt for each password, likely stored with the password itself. This makes a single rainbow table useless. This is how the UNIX crypt library works, and modern UNIX-like OSes have extended this library with new hash algorithms.
I know for a fact that support for SHA-256 and SHA-512 was added in newer versions of GNU crypt.
To be more precise, a dictionary attack, i.e. an attack where all words in an exhaustive list are tried, gets not "impossible", but it gets impractical: each bit of salt doubles the amount of storage and computation required.
This is different from pre-computed dictionary attacks like attacks involving rainbow tables where it does not matter whether the salt is secret or not.
Example: With a 64-bit salt (i.e. 8 bytes) you need to check 2^64 additional password combinations in your dictionary attack. With a dictionary containing 200,000 words you will have to make
200,000 * 2^64 = 3.69 * 10^24
tests in the worst case - instead of 200,000 tests without salt.
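The figure is easy to check. A small sketch, assuming (as this example does) that the attacker does not know the salt and has to try every possible value:

```python
dictionary_words = 200_000
salt_bits = 64

# Every dictionary word must be combined with every possible salt value.
worst_case_tests = dictionary_words * 2 ** salt_bits

print(f"{worst_case_tests:.2e}")
```

Without a salt the worst case is simply dictionary_words tests.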
An additional benefit of using salt is that an attacker cannot pre-compute the password hashes from his dictionary. It would simply take too much time and/or space.
Update
Your update assumes that an attacker already knows the salt (or has stolen it). This is of course a different situation. Still, it is not possible for the attacker to use a pre-computed rainbow table. What matters a lot here is the speed of the hashing function. To make an attack impractical, the hashing function needs to be slow. MD5 or SHA are not good candidates here because they are designed to be fast. Better candidates are hashing algorithms based on Blowfish or some variation of it.
Update 2
A good read on the matter of securing your password hashes in general (going much beyond the original question but still interesting):
Enough With The Rainbow Tables: What You Need To Know About Secure Password Schemes
Corollary of the article: Use salted hashes created with bcrypt (based on Blowfish) or Eksblowfish that allows you to use a configurable setup time to make hashing slow.
Yes, you need just 3 days for sha1(salt | password). That's why good password storage algorithms use 1000-iteration hashing: you will need 8 years.
A dictionary is a structure where values are indexed by keys. In the case of a pre-computed dictionary attack, each key is a hash, and the corresponding value is a password that results in the hash. With a pre-computed dictionary in hand, an attacker can "instantly" lookup a password that will produce the necessary hash to log in.
With salt, the space required to store the dictionary grows rapidly… so rapidly, that trying to pre-compute a password dictionary soon becomes pointless.
The best salts are randomly chosen from a cryptographic random number generator. Eight bytes is a practical size, and more than 16 bytes serves no purpose.
Salt does much more than just "make an attacker's job more irritating." It eliminates an entire class of attack—the use of precomputed dictionaries.
Another element is necessary to completely secure passwords, and that is "key-strengthening." One round of SHA-1 is not good enough: a safe password hashing algorithm should be very slow computationally.
Many people use PBKDF2, a key derivation function, that feeds back results to the hash function thousands of times. The "bcrypt" algorithm is similar, using an iterative key derivation that is slow.
When the hashing operation is very slow, a precomputed table becomes more and more desirable to an attacker. But proper salt defeats that approach.
Comments
Below are the comments I made on the question.
Without salt, an attacker wouldn't use the method demonstrated in "Update 2". He'd simply do a lookup in a pre-computed table and get the password in O(1) or O(log n) time (n being the number of candidate passwords). Salt is what prevents that and forces him to use the O(n) approach shown in "Update 2".
Once reduced to an O(n) attack, we have to consider how long each attempt takes. Key-strengthening can cause each attempt in the loop to take a full second, meaning that the time needed to test 10k passwords on 10k users will stretch from 3 days to 3 years… and with only 10k passwords, you're likely to crack zero passwords in that time.
You have to consider that an attacker is going to use the fastest tools he can, not PHP, so thousands of iterations, rather than 100, would be a good parameter for key-strengthening. It should take a large fraction of a second to compute the hash for a single password.
Key-strengthening is part of the standard key derivation algorithms PBKDF1 and PBKDF2, from PKCS #5, which make great password obfuscation algorithms (the "derived key" is the "hash").
A lot of users on StackOverflow refer to this article because it was a response to Jeff Atwood's post about the dangers of rainbow tables. It's not my favorite article, but it does discuss these concepts in more detail.
Of course you assume the attacker has everything: salt, hash, user name. Assume the attacker is a corrupt hosting company employee who dumped the user table on your myprettypony.com fansite. He's trying to recover these passwords because he's going to turn around and see if your pony fans used the same password on their citibank.com accounts.
With a well-designed password scheme, it will be impossible for this guy to recover any passwords.
The point of salting is to prevent the amortization of the attacker's effort.
With no salt, a single table of precomputed hash-password entries (e.g. MD5 of all alphanumeric 5 character strings, easy to find online) can be used on every user in every database in the world.
With a site-specific salt, the attacker has to compute the table himself and can then use it on all users of the site.
With a per-user salt, the attacker has to expend this effort for every user separately.
Of course, this doesn't do much to protect really weak passwords straight out of a dictionary, but it protects reasonably strong passwords against this amortization.
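Here is a small sketch of that amortization. The candidate list, user names and the choice of SHA-256 are assumptions for illustration only. One table built once answers every unsalted hash with a dictionary lookup, while the same table finds nothing for a salted hash.

```python
import hashlib
import os

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

candidates = ["letmein", "12345", "password"]

# Built once; reusable against every unsalted database using this hash.
precomputed = {sha256_hex(p.encode("utf-8")): p for p in candidates}

# Hypothetical unsalted leak: each hash is recovered by a single lookup.
unsalted_leak = {"alice": sha256_hex(b"letmein"), "bob": sha256_hex(b"12345")}
recovered = {user: precomputed.get(h) for user, h in unsalted_leak.items()}

# Hypothetical salted record: the table has no entry for salt + password.
salt = os.urandom(16)
salted_leak_hash = sha256_hex(salt + b"letmein")
salted_lookup = precomputed.get(salted_leak_hash)

print(recovered)
print(salted_lookup)
```

To attack the salted record the attacker has to redo the hashing with that user's salt, and then redo it again for the next user.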
Also, one more important point: using a USER-specific salt prevents the detection of two users with the SAME password, since otherwise their hashes would match. That's why the hash is often hash(salt + username + password).
If you try and keep the hash secret the attacker also can not verify the hashes.
Edit- just noticed the main point was made in a comment above.
Salts are implemented to prevent rainbow table attacks. A rainbow table is a list of pre-calculated hashes, which makes translating a hash into its phrase much simpler. You need to understand that salting isn't effective as a modern defense against password cracking unless we also have a modern hashing algo.
So let's say we're working with SHA1, taking advantage of recent exploits discovered with this algo, and let's say we have a computer running at 1,000,000 hashes/second. It would take 5.3 million million million years to find a collision, so yeah, PHP can work 300 a second, big whoop, doesn't really matter. The reason we salt is because if someone did bother to generate all common dictionary phrases, (2^160 people, welcome to 2007 era exploits).
So here's an actual database, with 2 users I use for testing and admin purposes.
RegistrationTime UserName UserPass
1280185359.365591 briang a50b63e927b3aebfc20cd783e0fc5321b0e5e8b5
1281546174.065087 test 5872548f2abfef8cb729cac14bc979462798d023
In fact, the salting scheme is your sha1(registration time + user name). Go ahead, tell me my password, these are real passwords in production. You can even sit there and hash out a word list in php. Go wild.
I'm not crazy, I just know that this is secure. For fun sake, test's password is test.
sha1(sha1(1281546174.065087 + test) + test) = 5872548f2abfef8cb729cac14bc979462798d023
You would need to generate an entire rainbow table prepended with 27662aee8eee1cb5ab4917b09bdba31d091ab732 for just this user. That means my passwords cannot all be compromised by a single rainbow table: the hacker needs to generate an entire rainbow table for 27662aee8eee1cb5ab4917b09bdba31d091ab732 for test, and again for f3f7735311217529f2e020468004a2aa5b3dee7f for briang. Think back to the 5.3 million million million years for all hashes. Think of the size of storing just the 2^80 hashes (that's well over 20 yottabytes); it's not going to happen.
Don't confuse salting with a means of making a hash something you can't ever decode; it's a means of preventing a rainbow table from translating all your user passwords. It's impossible at this level of technology.
The idea behind a dictionary attack is that you take a hash and find the password from which this hash was calculated, without calculating any hashes. Now do the same with a salted password: you can't.
Not using a salt makes password search as easy as a lookup in a database. Adding a salt makes the attacker perform hash calculations for all possible passwords (even for a dictionary attack, this significantly increases the time of the attack).
In simplest terms: without salting, each candidate password need only be hashed once to check it against every user, anywhere in the "known universe" (collection of compromised databases), whose password is hashed via the same algorithm. With salting, if the number of possible salt values substantially exceeds the number of users in the "known universe", each candidate password must be hashed separately for each user against whom it will be tested.
Simply put salting does not prevent a hash from attack (bruteforce or dictionary), it only makes it harder; the attacker will either need to find the salting algorithm (which if implemented properly will make use of more iterations) or bruteforce the algo, which unless very simple, is nearly impossible. Salting also almost completely discards the option of rainbow table lookups...
Salt makes Rainbow table attacks much more difficult since it makes a single password hash much harder to crack. Imagine you have a horrid password of just the number 1. A rainbow table attack would crack this immediately.
Now imagine each password in the db is salted with a long random value of many random characters. Now your lousy password of "1" is stored in the db as a hash of 1 plus a bunch of random characters (the salt), so in this example the rainbow table needs to have the hash for something like: 1 followed by the salt.
So assuming your salt is something secure and random, say *()%ISLDGHASKLU(*%#%#, the hacker's rainbow table would need to have an entry for 1*()%ISLDGHASKLU(*%#%#. Now using a rainbow table on even this simple password is no longer practical.

Salt Generation and open source software

As I understand it, the best practice for generating salts is to use some cryptic formula (or even magic constant) stored in your source code.
I'm working on a project that we plan on releasing as open source, but the problem is that with the source comes the secret formula for generating salts, and therefore the ability to run rainbow table attacks on our site.
I figure that lots of people have contemplated this problem before me, and I'm wondering what the best practice is. It seems to me that there is no point having a salt at all if the code is open source, because salts can be easily reverse-engineered.
Thoughts?
Since questions about salting hashes come along on a quite regular basis and there seems to be quite some confusion about the subject, I extended this answer.
What is a salt?
A salt is a random set of bytes of a fixed length that is added to the input of a hash algorithm.
Why is salting (or seeding) a hash useful?
Adding a random salt to a hash ensures that the same password will produce many different hashes. The salt is usually stored in the database, together with the result of the hash function.
Salting a hash is good for a number of reasons:
Salting greatly increases the difficulty/cost of precomputed attacks (including rainbow tables)
Salting makes sure that the same password does not result in the same hash.
This makes sure you cannot determine if two users have the same password. And, even more important, you cannot determine if the same person uses the same password across different systems.
Salting increases the complexity of passwords, thereby greatly decreasing the effectiveness of both Dictionary- and Birthday attacks. (This is only true if the salt is stored separate from the hash).
Proper salting greatly increases the storage need for precomputation attacks, up to the point where they are no longer practical. (8 character case-sensitive alpha-numeric passwords with 16 bit salt, hashed to a 128 bit value, would take up just under 200 exabytes without rainbow reduction).
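The storage figure in the last point can be checked with a short calculation. This sketch assumes that only the 128-bit hash values are stored (no passwords or indexes) and that "exabyte" is taken as 2^60 bytes; both are my assumptions about how the number was derived.

```python
passwords = 62 ** 8          # 8-character case-sensitive alphanumeric passwords
salts = 2 ** 16              # every possible 16-bit salt
hash_bytes = 128 // 8        # one 128-bit hash per (password, salt) pair

total_bytes = passwords * salts * hash_bytes
total_eib = total_bytes / 2 ** 60

print(f"{total_eib:.1f} EiB")
```

Under those assumptions the total lands just under 200, which matches the claim.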
There is no need for the salt to be secret.
A salt is not a secret key, instead a salt 'works' by making the hash function specific to each instance. With salted hash, there is not one hash function, but one for every possible salt value. This prevents the attacker from attacking N hashed passwords for less than N times the cost of attacking one password. This is the point of the salt.
A "secret salt" is not a salt, it is called a "key", and it means that you are no longer computing a hash, but a Message Authentication Code (MAC). Computing MAC is tricky business (much trickier than simply slapping together a key and a value into a hash function) and it is a very different subject altogether.
The salt must be random for every instance in which it is used. This ensures that an attacker has to attack every salted hash separately.
If you rely on your salt (or salting algorithm) being secret, you enter the realms of Security Through Obscurity (won't work). Most probably, you do not get additional security from the salt secrecy; you just get the warm fuzzy feeling of security. So instead of making your system more secure, it just distracts you from reality.
So, why does the salt have to be random?
Technically, the salt should be unique. The point of the salt is to be distinct for each hashed password. This is meant worldwide. Since there is no central organization which distributes unique salts on demand, we have to rely on the next best thing, which is random selection with an unpredictable random generator, preferably within a salt space large enough to make collisions improbable (two instances using the same salt value).
It is tempting to try to derive a salt from some data which is "presumably unique", such as the user ID, but such schemes often fail due to some nasty details:
If you use for example the user ID, some bad guys, attacking distinct systems, may just pool their resources and create precomputed tables for user IDs 1 to 50. A user ID is unique system-wide but not worldwide.
The same applies to the username: there is one "root" per Unix system, but there are many roots in the world. A rainbow table for "root" would be worth the effort, since it could be applied to millions of systems. Worse yet, there are also many "bob" out there, and many do not have sysadmin training: their passwords could be quite weak.
Uniqueness is also temporal. Sometimes, users change their password. For each new password, a new salt must be selected. Otherwise, an attacker who obtained the hash of the old password and the hash of the new one could try to attack both simultaneously.
Using a random salt obtained from a cryptographically secure, unpredictable PRNG may be some kind of overkill, but at least it provably protects you against all those hazards. It's not about preventing the attacker from knowing what an individual salt is, it's about not giving them the big, fat target that will be used on a substantial number of potential targets. Random selection makes the targets as thin as is practical.
In conclusion:
Use a random, evenly distributed, high entropy salt. Use a new salt whenever you create a new password or change a password. Store the salt along with the hashed password. Favor big salts (at least 10 bytes, preferably 16 or more).
A salt does not turn a bad password into a good password. It just makes sure that the attacker will at least pay the dictionary attack price for each bad password he breaks.
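Put together, the conclusion could look roughly like the sketch below. The record layout, the function names and the use of PBKDF2 with 100,000 iterations are assumptions on my part for illustration, not something this answer prescribes.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a tuned value

def make_record(password: str) -> dict:
    # A fresh 16-byte random salt for every new or changed password.
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # The salt is stored right next to the hash; it does not need to be secret.
    return {"salt": salt, "hash": digest}

def verify(password: str, record: dict) -> bool:
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], ITERATIONS
    )
    # Constant-time comparison of the stored and recomputed hashes.
    return hmac.compare_digest(digest, record["hash"])

old_record = make_record("letmein")
new_record = make_record("letmein")  # user "changes" to the same password

print(verify("letmein", old_record))
print(old_record["hash"] == new_record["hash"])
```

Because a new salt is drawn on every change, even re-using the same password produces a different stored hash.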
Useful sources:
stackoverflow.com: Non-random salt for password hashes
Bruce Schneier: Practical Cryptography (book)
Matasano Security: Enough with the Rainbow Tables
usenix.org: Unix crypt used salt since 1976
owasp.org: Why add salt
openwall.com: Salts
Disclaimer:
I'm not a security expert. (Although this answer was reviewed by Thomas Pornin)
If any of the security professionals out there find something wrong, please do comment or edit this wiki answer.
Really salts just need to be unique for each entry. Even if the attacker can calculate what the salt is, it makes the rainbow table extremely difficult to create. This is because the salt is added to the password before it is hashed, so it effectively adds to the total number of entries the rainbow table must contain to have a list of all possible values for a password field.
Since Unix became popular, the right way to store a password has been to append a random value (the salt) and hash it. Save the salt away where you can get to it later, but where you hope the bad guys won't get it.
This has some good effects. First, the bad guys can't just make a list of expected passwords like "Password1", hash them into a rainbow table, and go through your password file looking for matches. If you've got a good two-byte salt, they have to generate 65,536 values for each expected password, and that makes the rainbow table a lot less practical. Second, if you can keep the salt from the bad guys who are looking at your password file, you've made it much harder to calculate possible values. Third, you've made it impossible for the bad guys to determine if a given person uses the same password on different sites.
In order to do this, you generate a random salt. This should generate every number in the desired range with uniform probability. This isn't difficult; a simple linear congruential random number generator will do nicely.
If you've got complicated calculations to make the salt, you're doing it wrong. If you calculate it based on the password, you're doing it WAY wrong. In that case, all you're doing is complicating the hash, and not functionally adding any salt.
Nobody good at security would rely on concealing an algorithm. Modern cryptography is based on algorithms that have been extensively tested, and in order to be extensively tested they have to be well known. Generally, it's been found to be safer to use standard algorithms rather than rolling one's own and hoping it's good. It doesn't matter if the code is open source or not, it's still often possible for the bad guys to analyze what a program does.
You can just generate a random salt for each record at runtime. For example, say you're storing hashed user passwords in a database. You can generate an 8-character random string of lower- and uppercase alphanumeric characters at runtime, prepend that to the password, hash that string, and store it in the database. Since there are 62^8 possible salts, generating rainbow tables (for every possible salt) will be prohibitively expensive; and since you're using a unique salt for each password record, even if an attacker has generated a couple matching rainbow tables, he still won't be able to crack every password.
You can change the parameters of your salt generation based on your security needs; for example, you could use a longer salt, or you could generate a random string that also contains punctuation marks, to increase the number of possible salts.
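As a rough illustration of the scheme described above, here is a minimal Python sketch. The function names are made up for this example, and SHA-256 is only a stand-in for whatever hash you actually use; treat it as a sketch under those assumptions rather than a recommended implementation.

```python
import hashlib
import secrets
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 characters
ALPHABET = string.ascii_letters + string.digits


def make_salt(length: int = 8) -> str:
    """Return a random alphanumeric salt (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def hash_password(password: str, salt: str) -> str:
    """Prepend the salt to the password and hash the result.

    SHA-256 is an assumption made for this sketch.
    """
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def new_record(password: str) -> tuple[str, str]:
    """Build the (salt, hash) pair to store in the user's row."""
    salt = make_salt()
    return salt, hash_password(password, salt)
```

To check a login attempt, read the stored salt for that row, call hash_password again with the submitted password, and compare the result with the stored hash. Passing a larger length to make_salt, or adding punctuation to ALPHABET, increases the number of possible salts.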
Use a random number generator to generate the salt, make it one salt per row, and store it in the database.
I like how salt is generated in django-registration. Reference: http://bitbucket.org/ubernostrum/django-registration/src/tip/registration/models.py#cl-85
salt = sha_constructor(str(random.random())).hexdigest()[:5]
activation_key = sha_constructor(salt+user.username).hexdigest()
return self.create(user=user,
activation_key=activation_key)
He uses a combination of sha generated by a random number and the username to generate a hash.
Sha itself is well known for being strong and unbreakable. Add multiple dimensions to generate the salt itself, with random number, sha and the user specific component, you have unbreakable security!
In the case of a desktop application that encrypts data and sends it to a remote server, how would you go about using a different salt each time?
Using PKCS#5 with the user's password, it needs a salt to generate an encryption key, to encrypt the data. I know that keeping the salt hardcoded (obfuscated) in the desktop application is not a good idea.
If the remote server must NEVER know the user's password, is it possible to use a different salt each time? If the user uses the desktop application on another computer, how will it be able to decrypt the data on the remote server if it does not have the key (it is not hardcoded in the software)?

Publicly viewable salt security

If the password salt is publicly viewable, does it still improve security compared to using no salt at all?
Would it be better to just not use the salt and gain some performance?
Even a publicly viewable salt increases the security a bit, because your attackers cannot use previously generated rainbow tables. They have to generate their own. This takes a very long time.
It prevents pre-calculated hash tables or rainbow tables from being used to merely look up an acceptable input.
Take a look at: http://en.wikipedia.org/wiki/Rainbow_table
Keep in mind that having the salt hidden increases security, because then the attacker does not know exactly what function is being used to generate the hashes. However, the main benefit of hashing passwords is in the event of them being obtained -- much more work to make use of a list of hashes than a list of plain passwords. If someone has your hashes, they likely have your salt as well. Just food for thought.
A unique salt per password will prevent a rainbow table attack with pre-computed hashes. Using a unique salt per password requires the attacker to calculate the hash for each individual password on each attempt.
Its main goal is to slow the attacker down enough to make the attack no longer feasible.

Are salts useless for security if the attacker knows them?

Let's say I have a table of users set up like this:
CREATE TABLE `users` (
`id` INTEGER PRIMARY KEY,
`name` TEXT,
`hashed_password` TEXT,
`salt` TEXT
)
When a user is created, a randomly-generated salt is produced and stored in the database alongside the results of something like get_hash(salt + plaintext_password).
I'm wondering whether a malicious user who gets their hands on this data would be able to use it to crack users' passwords. If so, how could that be prevented?
No, they're not useless.
So long as you use a unique salt for each row, the salt will slow down an attack. The attacker will need to mount a brute force attack, rather than using rainbow tables against the password hashes.
As mentioned in the comments, you should ensure that the salt is a sensible size.
Salting was introduced (or at least made popular) in UNIX /etc/passwd file, which was world-readable. It is usually assumed that the salt as well as the encrypted password is known to the cracker. The purpose of the salt is the slow-down of the cracking process (since the same password won't map to the same encrypted string); it is not a secret in itself.
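The point that the same password no longer maps to the same stored string can be seen in a few lines of Python. This is only a sketch: SHA-256 and the 8-byte salt size are assumptions chosen for the example, and salted_hash is a made-up helper name.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    """Hash salt + password; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


password = "Password1"

# Two users pick the same password but get different random salts.
salt_a = os.urandom(8)
salt_b = os.urandom(8)
hash_a = salted_hash(password, salt_a)
hash_b = salted_hash(password, salt_b)

# Without a salt, both users would store the same value.
unsalted = hashlib.sha256(password.encode("utf-8")).hexdigest()

print(hash_a == hash_b)    # the salted hashes differ unless the salts collide
print(hash_a == unsalted)  # a table of unsalted hashes does not contain hash_a
```

The salts are stored in the clear next to the hashes, as in the world-readable /etc/passwd case; the benefit comes from the salts being different, not from their being secret.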
Knowing the salt makes it possible to do a brute-force attack, but that doesn't make it useless. Salt prevents the attacker from using an already generated rainbow table (which you could find on the web).
The best way to prevent brute-forcing is simply to use long, complex passwords.
If an attacker knows the salt, the hashed password and the hash algorithm, then they can mount a brute-force dictionary attack (or rainbow attack).
This should give you an idea of how it works.
Let's say you want to encrypt a word "secret." After it is encrypted let's say it now looks like this 00110010.
If a hacker knows the encryption algorithm, they can create a table of words and their corresponding encrypted values. So they take the encrypted password "00110010" and find it in the table. Now they know that the password used to generate "00110010" was the word "secret." If you salt the word first, then a generic lookup table will be useless to the hacker. (A generic lookup table being a table of unsalted dictionary words and their encrypted values)
If you salt the word first ("saltsecret"), now the encrypted value will look different, and the hacker won't find it in the lookup table.
However, they can still start creating their own lookup table from scratch using your salt and eventually they will be able to reverse lookup passwords.
So to answer the question, if the passwords are sufficiently complex, it will take ages for the hacker to figure them out. You could change your salt every year and they would have to start creating a table all over again.
No, it's not worthless.
To successfully attack an account, an attacker needs to know the salt for that account (and every account's salt should be different), the hashing algorithm used, and the final stored password hash.
Given all of that information, you can write a program that keeps trying to hash different potential passwords until it finds one that matches.
If it's a bad salt (too simple or short), this can be made much faster because the program can use rainbow lookup tables to match the final stored password hash to the string that was hashed, and then just subtract the salt. But they still need all the information.
If it's a shared salt, this is bad because an attacker can use the salt to generate a rainbow table in advance that's good for any account on your system.
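The "program that keeps trying to hash different potential passwords" can be sketched in Python as follows. The helper names, the sample salt and the candidate list are invented for illustration, and SHA-256 over salt + password is an assumption about how the hash was produced.

```python
import hashlib
from typing import Iterable, Optional


def salted_hash(password: str, salt: str) -> str:
    """Assumed scheme for this sketch: SHA-256 over salt + password."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def guess_password(stored_hash: str, salt: str,
                   candidates: Iterable[str]) -> Optional[str]:
    """Hash each candidate with the known salt until one matches."""
    for candidate in candidates:
        if salted_hash(candidate, salt) == stored_hash:
            return candidate
    return None


# One account's row, as an attacker with the database would see it.
salt = "x9Qp"
stored = salted_hash("letmein", salt)

print(guess_password(stored, salt, ["123456", "password", "letmein"]))
```

Because the salt goes into every hash computation, the whole loop has to be repeated for each account that has a different salt; none of the work carries over from one account to the next.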
Assume a GPU brute force attack on MD5, SHA1 or SHA256 has a throughput greater than 1 billion tries per second, and around 300M/s on SHA512. If you use one of these algorithms, a salt will slow down an attacker who uses a rainbow table (less likely), but it will not slow down an attacker who uses brute force (more likely). It will definitely not protect you; it just adds a bit of security against outdated rainbow tables (for these algorithms). A bit is better than nothing.
But if you use a stronger algorithm (e.g. bcrypt), a salt is definitely worth it even if stored with the hash, because brute force is not feasible in terms of time, so a rainbow table is the attack that makes sense.
Have a look at this article; to summarize:
If you are a user:
Make sure all your passwords are 12 characters or more, ideally a lot more. I recommend adopting pass phrases, which are not only a lot easier to remember than passwords (if not type) but also ridiculously secure against brute forcing purely due to their length.
If you are a developer:
Use bcrypt or PBKDF2 exclusively to hash anything you need to be secure. These new hashes were specifically designed to be difficult to implement on GPUs. Do not use any other form of hash. Almost every other popular hashing scheme is vulnerable to brute forcing by arrays of commodity GPUs, which only get faster and more parallel and easier to program for every year.
Posted by Jeff Atwood
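For the developer advice above, here is a minimal sketch using PBKDF2 from Python's standard library (hashlib.pbkdf2_hmac). The iteration count, the 16-byte salt and the function names are assumptions made for this example, not values taken from the article; pick parameters based on current guidance and your own hardware.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for this sketch; tune it for your environment.
ITERATIONS = 100_000


def hash_password(password: str,
                  salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 hash, creating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

The salt and the derived bytes are both stored with the user record. Each guess now costs the attacker ITERATIONS rounds of HMAC instead of a single hash, which is what slows brute forcing down.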

The necessity of hiding the salt for a hash

At work we have two competing theories for salts. The products I work on use something like a user name or phone number to salt the hash. Essentially something that is different for each user but is readily available to us. The other product randomly generates a salt for each user and changes each time the user changes the password. The salt is then encrypted in the database.
My question is if the second approach is really necessary? I can understand from a purely theoretical perspective that it is more secure than the first approach, but what about from a practicality point of view. Right now to authenticate a user, the salt must be unencrypted and applied to the login information.
After thinking about it, I just don't see a real security gain from this approach. Changing the salt from account to account, still makes it extremely difficult for someone to attempt to brute force the hashing algorithm even if the attacker was aware of how to quickly determine what it was for each account. This is going on the assumption that the passwords are sufficiently strong. (Obviously finding the correct hash for a set of passwords where they are all two digits is significantly easier than finding the correct hash of passwords which are 8 digits). Am I incorrect in my logic, or is there something that I am missing?
EDIT: Okay so here's the reason why I think it's really moot to encrypt the salt. (lemme know if I'm on the right track).
For the following explanation, we'll assume that the passwords are always 8 characters and the salt is 5 and all passwords are comprised of lowercase letters (it just makes the math easier).
Having a different salt for each entry means that I can't use the same rainbow table (actually technically I could if I had one of sufficient size, but let's ignore that for the moment). This is the real key to the salt from what I understand, because to crack every account I have to reinvent the wheel so to speak for each one. Now if I know how to apply the correct salt to a password to generate the hash, I'd do it because a salt really just extends the length/complexity of the hashed phrase. So I would be cutting the number of possible combinations I would need to generate to "know" I have the password + salt from 26^13 to 26^8 because I know what the salt is. Now that makes it easier, but still really hard.
So onto encrypting the salt. If I know the salt is encrypted, I wouldn't try and decrypt (assuming I know it has a sufficient level of encryption) it first. I would ignore it. Instead of trying to figure out how to decrypt it, going back to the previous example I would just generate a larger rainbow table containing all keys for the 26^13. Not knowing the salt would definitely slow me down, but I don't think it would add the monumental task of trying to crack the salt encryption first. That's why I don't think it's worth it. Thoughts?
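To put numbers on this, the count of candidate strings is the alphabet size raised to the string length. A short Python sketch under the assumptions stated in the question (26 lowercase letters, 8-character passwords, 5-character salts); the variable and function names are chosen for illustration:

```python
ALPHABET_SIZE = 26    # lowercase letters only
PASSWORD_LENGTH = 8
SALT_LENGTH = 5


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of the given length."""
    return alphabet_size ** length


# Salt known: only the 8 password characters have to be guessed.
known_salt = keyspace(ALPHABET_SIZE, PASSWORD_LENGTH)

# Salt unknown: all 13 characters of salt + password have to be guessed.
unknown_salt = keyspace(ALPHABET_SIZE, PASSWORD_LENGTH + SALT_LENGTH)

# How many times more work the unknown salt adds.
factor = unknown_salt // known_salt

print(f"{known_salt:,} candidates with a known salt")
print(f"{unknown_salt:,} candidates with an unknown salt")
print(f"{factor:,} times more work when the salt is unknown")
```

So hiding a 5-letter salt multiplies the search by 26^5, about 12 million, for a single account.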
Here is a link describing how long passwords will hold up under a brute force attack:
http://www.lockdown.co.uk/?pg=combi
Hiding a salt is unnecessary.
A different salt should be used for every hash. In practice, this is easy to achieve by getting 8 or more bytes from a cryptographic-quality random number generator.
From a previous answer of mine:
Salt helps to thwart pre-computed dictionary attacks.
Suppose an attacker has a list of likely passwords. He can hash each
and compare it to the hash of his victim's password, and see if it
matches. If the list is large, this could take a long time. He doesn't
want to spend that much time on his next target, so he records the result
in a "dictionary" where a hash points to its corresponding input. If
the list of passwords is very, very long, he can use techniques like a
Rainbow Table to save some space.
However, suppose his next target salted their password. Even if the
attacker knows what the salt is, his precomputed table is
worthless—the salt changes the hash resulting from each password. He
has to re-hash all of the passwords in his list, affixing the target's
salt to the input. Every different salt requires a different
dictionary, and if enough salts are used, the attacker won't have room
to store dictionaries for them all. Trading space to save time is no
longer an option; the attacker must fall back to hashing each password
in his list for each target he wants to attack.
So, it's not necessary to keep the salt secret. Ensuring that the
attacker doesn't have a pre-computed dictionary corresponding to that
particular salt is sufficient.
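The argument above can be tried out with a small Python sketch. The password list, the sample salt and the helper names are invented for this example, and SHA-256 stands in for whatever hash the target uses.

```python
import hashlib
from typing import List, Optional


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


likely_passwords = ["123456", "password", "Password1", "letmein"]

# The attacker's pre-computed "dictionary": unsalted hash -> password.
precomputed = {sha256_hex(p): p for p in likely_passwords}

# First target stored an unsalted hash; second target used a salt.
unsalted_target = sha256_hex("Password1")
salt = "k3Xq9ZpL"
salted_target = sha256_hex(salt + "Password1")

print(precomputed.get(unsalted_target))  # a plain lookup recovers the password
print(precomputed.get(salted_target))    # the same table has no entry for this


def crack_with_salt(target: str, salt: str,
                    candidates: List[str]) -> Optional[str]:
    """Re-hash every candidate with the target's salt (no table reuse)."""
    for candidate in candidates:
        if sha256_hex(salt + candidate) == target:
            return candidate
    return None


print(crack_with_salt(salted_target, salt, likely_passwords))
```

Knowing the salt lets the attacker run crack_with_salt, but that is a fresh pass over the whole list for every distinct salt, which is exactly the work the pre-computed table was meant to avoid.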
After thinking about this a bit more, I've realized that fooling yourself into thinking the salt can be hidden is dangerous. It's much better to assume the salt cannot be hidden, and design the system to be safe in spite of that. I provide a more detailed explanation in another answer.
However, recent recommendations from NIST encourage the use of an additional, secret "salt" (I've seen others call this additional secret "pepper"). One additional iteration of the key derivation can be performed using this secret as a salt. Rather than increasing strength against a pre-computed lookup attack, this round protects against password guessing, much like the large number of iterations in a good key derivation function. This secret serves no purpose if stored with the hashed password; it must be managed as a secret, and that could be difficult in a large user database.
The answer here is to ask yourself what you're really trying to protect from? If someone has access to your database, then they have access to the encrypted salts, and they probably have access to your code as well. With all that could they decrypt the encrypted salts? If so then the encryption is pretty much useless anyway. The salt really is there to make it so it isn't possible to form a rainbow table to crack your entire password database in one go if it gets broken into. From that point of view, so long as each salt is unique there is no difference, a brute force attack would be required with your salts or the encrypted salts for each password individually.
A hidden salt is no longer salt. It's pepper. It has its use. It's different from salt.
Pepper is a secret key added to the password + salt which makes the hash into an HMAC (Hash Based Message Authentication Code). A hacker with access to the hash output and the salt can theoretically brute force guess an input which will generate the hash (and therefore pass validation in the password textbox). By adding pepper you increase the problem space in a cryptographically random way, rendering the problem intractable without serious hardware.
For more information on pepper, check here.
See also hmac.
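A minimal sketch of the pepper idea using Python's hmac module: the pepper is the HMAC key, and the salt plus password is the message. The function name is made up, and generating the pepper with os.urandom here is only a placeholder; in a real system it would be loaded from somewhere outside the password database, which is an assumption this sketch does not implement.

```python
import hashlib
import hmac
import os


def peppered_hash(password: str, salt: bytes, pepper: bytes) -> str:
    """HMAC-SHA256 keyed with the secret pepper over salt + password."""
    message = salt + password.encode("utf-8")
    return hmac.new(pepper, message, hashlib.sha256).hexdigest()


# Placeholder secret for the sketch; it must NOT be stored with the hashes.
pepper = os.urandom(32)

# Per-user random salt, stored in the clear next to the hash.
salt = os.urandom(16)

stored = peppered_hash("Password1", salt, pepper)
```

An attacker who copies only the database gets salt and stored but not pepper, so guesses cannot be checked offline until the pepper is also obtained.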
My understanding of "salt" is that it makes cracking more difficult, but it doesn't try to hide the extra data. If you are trying to get more security by making the salt "secret", then you really just want more bits in your encryption keys.
The second approach is only slightly more secure. Salts protect users from dictionary attacks and rainbow table attacks. They make it harder for an ambitious attacker to compromise your entire system, but are still vulnerable to attacks that are focused on one user of your system. If you use information that's publicly available, like a telephone number, and the attacker becomes aware of this, then you've saved them a step in their attack. Of course the question is moot if the attacker gets your whole database, salts and all.
EDIT: After re-reading over this answer and some of the comments, it occurs to me that some of the confusion may be due to the fact that I'm only comparing the two very specific cases presented in the question: random salt vs. non-random salt. The question of using a telephone number as a salt is moot if the attacker gets your whole database, not the question of using a salt at all.
... something like a user name or phone number to salt the hash. ...
My question is if the second approach is really necessary? I can understand from a purely theoretical perspective that it is more secure than the first approach, but what about from a practicality point of view?
From a practical point of view, a salt is an implementation detail. If you ever change how user info is collected or maintained – and both user names and phone numbers sometimes change, to use your exact examples – then you may have compromised your security. Do you want such an outward-facing change to have much deeper security concerns?
Does stopping the requirement that each account have a phone number need to involve a complete security review to make sure you haven't opened up those accounts to a security compromise?
Here is a simple example showing why it is bad to have the same salt for each hash
Consider the following table
UserId  UserName  Password
1       Fred      Hash1 = Sha(Salt1+Password1)
2       Ted       Hash2 = Sha(Salt2+Password2)
Case 1: Salt1 is the same as Salt2
If Hash2 is replaced with Hash1, then user 2 can log on with user 1's password.
Case 2: Salt1 is not the same as Salt2
If Hash2 is replaced with Hash1, then user 2 cannot log on with user 1's password.
There are two techniques, with different goals:
The "salt" is used to make two otherwise equal passwords encrypt differently. This way, an intruder can't efficiently use a dictionary attack against a whole list of encrypted passwords.
The (shared) "secret" is added before hashing a message, so that an intruder can't create his own messages and have them accepted.
I tend to hide the salt. I use 10 bits of salt by prepending a random number from 1 to 1024 to the beginning of the password before hashing it. When comparing the password the user entered with the hash, I loop from 1 to 1024 and try every possible value of salt until I find the match. This takes less than 1/10 of a second. I got the idea to do it this way from the PHP password_hash and password_verify. In my example, the "cost" is 10 for 10 bits of salt. Or from what another user said, hidden "salt" is called "pepper". The salt is not encrypted in the database. It's brute forced out. It would make the rainbow table necessary to reverse the hash 1000 times larger. I use sha256 because it's fast, but still considered secure.
Really, it depends on what type of attack you're trying to protect your data from.
The purpose of a unique salt for each password is to prevent a dictionary attack against the entire password database.
Encrypting the unique salt for each password would make it more difficult to crack an individual password, yes, but you must weigh whether there's really much of a benefit. If the attacker, by brute force, finds that this string:
Marianne2ae85fb5d
hashes to a hash stored in the DB, is it really that hard to figure out which part is the password and which part is the salt?

Resources