Should we salt information that is unique and random - security

Salt is used when storing passwords in databases in order to protect against dictionary attacks and rainbow tables.
However, let's assume we need to store unique and random (sensitive) information about users. Is there still an advantage in salting this information before hashing it?
Wouldn't the salt, in this case, just add randomness to already random data (unlike human-typed passwords)?

It depends on how confidential your information is and what the consequences are if the data is compromised. Is it PII such as an SSN or DOB?
You mentioned that your data is random and unique, which means it is difficult to identify a pattern. If the data is random enough, then salting it may not be required. If you do go with salting, you take on the added responsibility of protecting those salts as well.
I would recommend using a low-privileged account, server hardening, authentication, and authorization to protect your data and minimize the attack surface.
Again, you should reach your conclusion after classifying your data based on the CIA principles (confidentiality, integrity, availability).

This depends very heavily on the size of the search space. For example, we could pretend that social security numbers are both random and unique (they're not actually either, but for the purpose of this discussion we will pretend they are). If you're hashing SSNs, not only do you need a salt, but a salt isn't sufficient. Why? Because there are fewer than 10 billion SSNs in existence. Creating a rainbow table for those is trivial. Even with a salt, it isn't that hard to brute force, even if the values are unique and random.
So to protect a random and unique value that lives in a small search space we have to use a stretching algorithm like PBKDF2, not just a hash. The point of a stretching algorithm is to make computing the hash very slow.
Stretching algorithms always include a salt. But it doesn't have to be a random salt. It could be deterministic (some database identifier + the user id for example, "com.example.mygreatapp:alice"). But for a small search space, you still need it to be unique per user because there are so few items in the search space.
On the other hand, if your random and unique data represents a large search space (not less than 2^64, and ideally at least 2^80), and that search space is sparse (you only use a very small fraction of legal elements), then salting and stretching is likely not required.
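The stretching approach described above for small search spaces can be sketched roughly as follows. This is a minimal illustration, not a vetted implementation: the application identifier, the iteration count, and the sample value are all assumptions made up for the example, and the iteration count in particular should be tuned to your own hardware.

```python
import hashlib

# Hypothetical application identifier (assumption for this sketch).
APP_ID = "com.example.mygreatapp"
# Assumed work factor; raise it until hashing is as slow as you can tolerate.
ITERATIONS = 100_000


def protect_value(user_id: str, secret_value: str) -> str:
    """Stretch a low-entropy value with PBKDF2 and a deterministic per-user salt."""
    # Deterministic but unique per user: "<app id>:<user id>".
    salt = f"{APP_ID}:{user_id}".encode("utf-8")
    digest = hashlib.pbkdf2_hmac(
        "sha256", secret_value.encode("utf-8"), salt, ITERATIONS
    )
    return digest.hex()


# The same made-up value stored for two users yields two different digests.
alice = protect_value("alice", "123-45-6789")
bob = protect_value("bob", "123-45-6789")
print(alice != bob)
```

Because the salt is derived from the user id, the digest can be recomputed for lookups without storing a separate salt column, while an attacker still has to pay the full stretching cost per user and per guess.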

Related

Why is it considered OK to store both the password hash and salt in the same place?

Let's assume for each user, we store the user's password hash and a unique salt in the same table. For example, it looks like this:
user_email, sha256(raw_password+salt), salt
AFAIK, this is a conventional practice and considered safe because it prevents attacks using rainbow tables. The reasoning is since attackers don't have a precomputed list of sha256(raw_password+salt) they are forced to recompute this for every row, and this will take a lot of time.
But I don't understand the reasoning above. According to this old post, one core can run sha256 more than 20M times per second. Doesn't that make it trivial for attackers to just recompute sha256(raw_password+salt) for all rows if the entire users table is compromised?
Related:
Is It okay to save user's salt in the same table as password hash?
As John notes, your example is incorrect:
user_email, sha256(raw_password+salt), salt
This is not a good way to store passwords. You should replace sha256 here with a Key Derivation Function (KDF) such as PBKDF2 or scrypt. Then it would be fine. A properly tuned KDF can get the hashing rate down to dozens a second or fewer, even on good hardware (there are various competing factors here because the attacker doesn't have the same language and hardware restrictions you likely do, but even in the worst cases this value can be kept very low in cryptographic terms).
But even if you used sha256 here, it would be dramatically stronger than an unsalted hash. It makes every hash different. This means that if multiple people have the same password (very common), then breaking one doesn't break all users having the same password. This protects against rainbow tables, and particularly protects people who have very common passwords (password, dragon, mustang, etc.)
But it also protects against other password-collision attacks. For example, say I want to know Alice's password, and I can see it has the same hash as Bob. I now know that tricking either of them into revealing their password through some means will reveal both of their passwords.
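A rough sketch of the recommended layout (KDF output and salt stored side by side) might look like this. It uses PBKDF2 from Python's standard library; the function names and the iteration count are assumptions for illustration, not a prescribed scheme.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(raw_password: str) -> tuple[bytes, bytes]:
    """Return (digest, salt); both would be stored in the user's row."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", raw_password.encode("utf-8"), salt, ITERATIONS
    )
    return digest, salt


def verify_password(raw_password: str, digest: bytes, salt: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", raw_password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)


# Two users with the same (very common) password get different digests.
alice_digest, alice_salt = hash_password("dragon")
bob_digest, bob_salt = hash_password("dragon")
print(alice_digest != bob_digest)
print(verify_password("dragon", alice_digest, alice_salt))
```

Note that verification needs the stored salt, which is one practical reason the salt lives next to the hash.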
Doesn't that make it trivial for attackers to just recompute sha256(raw_password+salt) for all rows if the entire users table is compromised?
This is thinking about the problem backwards. If the attacker knew raw_password, then yes, this would be trivial. But that's exactly the thing the attacker does not know (and if they did, they wouldn't need to do any hashing). So the attacker must make a full search of each row of the database, which even with just a single SHA-256 is quite slow.
There are approximately 96 characters you can easily type on most English keyboards. The complete space of those for an 8-character string is 96^8 or about 7x10^15. At 20M per second, that's about 360M seconds or roughly 11 CPU-years per row. That's not an impossible space to crack, but it's still not fast. (Obviously there are many things pushing in both directions; users don't choose passwords randomly, but they also aren't limited to 8 characters. This computation is just for illustration.)
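The back-of-the-envelope figures above can be reproduced with a few lines of arithmetic. The variable names are just for this sketch; the inputs (96 characters, length 8, 20M hashes per second) are the ones quoted in the text.

```python
ALPHABET = 96            # easily typed characters
LENGTH = 8               # password length
RATE = 20_000_000        # SHA-256 hashes per second on one core (quoted figure)

keyspace = ALPHABET ** LENGTH
seconds = keyspace / RATE
years = seconds / (365.25 * 24 * 3600)

print(f"keyspace: {keyspace:.2e}")
print(f"seconds:  {seconds:.2e}")
print(f"years:    {years:.1f}")
```

This is the cost of exhausting the full space for a single salted row; since each row has its own salt, the work cannot be shared between rows.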
A key take-away is that knowing the salt gives you no information at all about sha256(salt+password) if you don't know the password, too. That's a key feature of all cryptographic hashes (including the SHA series). If knowing part of the data gave you any information about the hash of the entire data, then that would tell us that the hash isn't secure.

Is there a hash algorithm that is secure enough to hash passwords without a salt?

For example, is SHA-1 secure enough to verify user passwords without salt?
Or do we always need a salt to prevent attacks?
The general consensus is that a salt should be used regardless of the hash function: it's easy to do, and it makes attacks orders of magnitude harder. Modern security best practice encourages assuming that any part of your infrastructure could be (or already is) breached, requiring that all other components of your system be hardened to limit the damage such an attacker could do.
Salting passwords largely prevents rainbow table attacks.
A rainbow table is a list of possible passwords (either enumerated from possible combinations of characters, or built from a dictionary) that are run through a hash and the password stored with its resulting hash.
An attacker using a rainbow table would compare a hash recovered from their target, with their precomputed list, to find a password that results in the matching hash.
Pre-computed rainbow tables are freely available. http://project-rainbowcrack.com/table.htm, for example, provides SHA1 tables for all passwords of 1-8 characters using upper- and lower-case letters, numbers, and common symbols, or 1-9 characters for alphanumeric combinations.
If sites did not use salt, the passwords of all users with passwords of 8-9 characters or fewer would exist in these tables and could be trivially reversed should an attacker gain access to a site's password database.
Adding salt increases the complexity of the data entering the hash function, vastly increasing the pre-computation work required to build useful rainbow tables.
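To make the effect concrete, here is a small sketch of a precomputed lookup. It uses a plain hash-to-password dictionary rather than a true rainbow table (which uses hash chains to save space), and the word list is a made-up sample, but the way salt defeats the precomputation is the same.

```python
import hashlib
import os

# Made-up sample word list (assumption for this sketch).
WORDLIST = ["password", "dragon", "mustang", "letmein", "12345"]


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


# Built once by the attacker, reusable against every unsalted SHA1 database.
precomputed = {sha1_hex(word.encode("utf-8")): word for word in WORDLIST}

# Unsalted hash: a single dictionary lookup recovers the password.
unsalted = sha1_hex(b"dragon")
print(precomputed.get(unsalted))

# Salted hash: the table has no entry for salt + password.
salt = os.urandom(16)
salted = sha1_hex(salt + b"dragon")
print(precomputed.get(salted))
```

The attacker can still hash each word together with the known salt, but that work has to be done after the breach and separately for every salt.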
SHA1 alone is not an ideal algorithm for hashing passwords. Hash functions are fast by design, which makes them useful for generating indexes, for example, but also makes them efficient for generating rainbow tables. Multiple rounds of hashing are often used to make this harder (for example, hashing the password and then repeatedly hashing the result 1000 times). There are existing standards for deriving keys for use in storing passwords, such as PBKDF2.
Finally, I would consider building your own mechanisms for securing user credentials an absolute last resort. Most languages have pre-existing libraries that implement good, robust, reliable solutions.
This article covers the topic well.

Salting Hashes - why is the salt treated by the literature as being known to Eve?

The title says everything. I don't understand why you shouldn't keep your salt a secret like the password. Or did I misunderstand something?
The salt is treated as public primarily because keeping it secret isn't necessary.
The point of salt is primarily to make dictionary attacks more difficult/less practical. In a dictionary attack, the attacker hashes common words from a dictionary, and (if he's serious at all) supplements those with things like common names. Equipped with this, if he can get a hold of your list of hashed passwords, he can see if any of them matches a hash in his list. Assuming you have a significant number of users, he has a pretty good chance of finding at least one. When he does, he looks in his list to find what word produced that hash, and he can now use it to log in and impersonate that user.
Adding a salt means that instead of doing this once, he has to do it once for each possible salt value. For example, if you use a 24-bit salt, he has to hash each word in the dictionary ~16 million times, and store the results of all ~16 million hashes.
Just for the sake of argument, let's assume that without salt, it would take the attacker 8 hours to hash all the candidate words, and 16 megabytes to store the results (hashes and word that produced each). We'll further assume that the storage is equally divided between the hashes themselves and the list of words/names/whatever that produced them.
Using the same 24-bit salt, that means his time is multiplied by the same factor of ~16 million. His storage for the words that produced the hashes remains the same, but for the hashes themselves is (again) multiplied by the ~16 million. Working out the math, those come out to approximately 15,000 years of computation and 128 terabytes of storage.
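For anyone who wants to check the math, the figures work out as below. The inputs (8 hours, 16 megabytes split evenly, 24-bit salt) are the assumptions stated above; the variable names are only for this sketch.

```python
SALT_BITS = 24
salt_values = 2 ** SALT_BITS                  # 16,777,216 possible salts

base_hours = 8                                # unsalted hashing time
base_storage = 16 * 2**20                     # 16 MiB in bytes
hash_bytes = base_storage // 2                # half is hashes
word_bytes = base_storage - hash_bytes        # half is the words themselves

total_hours = base_hours * salt_values
total_years = total_hours / (24 * 365.25)

# The word list is stored once; the hashes are stored once per salt value.
total_storage = word_bytes + hash_bytes * salt_values
total_tib = total_storage / 2**40

print(f"{total_years:,.0f} years")
print(f"{total_tib:,.0f} TiB")
```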
In short, without salt, a dictionary attack is within easy reach of almost anybody. I could easily believe that (for example) somebody would let their computer run overnight to do the hashing just to pull a good April fools joke on a few of his co-workers (easy to believe, because I've seen it done).
When you get down to it, it's all a numbers game: a dictionary attack isn't betting that every user will have a password that's easy to guess, only that enough will for them to find at least a few open holes. Likewise, making the salt public does allow a somewhat simpler attack, by downloading the salt for each hash, and doing individual dictionary attacks on each, using the known salt for each one. Assuming a system has fewer users than possible salt values, this is a more practical attack. Nonetheless, he's now stuck with attacking each password individually, rather than using a single dictionary not only for an entire system, but in fact for all systems he might want to attack that use the same hash algorithm.
In summary: salt can do its job perfectly well even though it's made public. One of the aims of almost any security system is to minimize the amount of information that needs to be kept secret. Since salt can work even if it is public, it's generally assumed to be public knowledge. In a practical system, you certainly don't try to publish it to attackers, but you don't (shouldn't, anyway) rely on its remaining a secret either.
The purpose of salt is making an attack on several crypted passwords at the same time harder. It doesn't make an attack on a single crypted password harder.
With a salt, an attacker has to test each candidate plaintext password once for every different salt.
The reason, as I found in this article, is that you actually need the salt to check an incoming password against the salted and hashed one in your database.
You should keep your salt a secret for the same reason that you salt in the first place.
Hackers can and have created Rainbow Tables whereby they hash using (md5, sha1, sha256, sha512, etc.) a list of the top 1,000 or so most common passwords.
If a hacker manages to get hold of your database, it's good that your passwords are hashed, but if they do a quick comparison and find a hash that matches one in their list, they know the password for that account.
The key to the hack is having that rainbow table handy. If you've added a salt, their rainbow table is useless, but if you make the salt easy to find or you share it with others, then the hackers can rebuild a new rainbow table using your salt.(*) That is, you've made it easier for them to hack.
(*) Note this is a little harder than described, since the hacker may not know if you added the salt as a prefix, suffix, both, etc.
As said above, unique secret salt for each password will prevent anyone from pre-computing the hashes in a rainbow table; this is the sole purpose of unique salts.

Why do salts make dictionary attacks 'impossible'?

Update: Please note I am not asking what a salt is, what a rainbow table is, what a dictionary attack is, or what the purpose of a salt is. I am querying: If you know the user's salt and hash, isn't it quite easy to calculate their password?
I understand the process, and implement it myself in some of my projects.
s = random salt
storedPassword = sha1(password + s)
In the database you store:
username | hashed_password | salt
Every implementation of salting I have seen adds the salt either at the end of the password, or beginning:
hashed_Password = sha1(s + password )
hashed_Password = sha1(password + s)
Therefore, a dictionary attack from a hacker who is worth his salt (ha ha) would simply run each keyword against the stored salts in the common combinations listed above.
Surely the implementation described above simply adds another step for the hacker, without actually solving the underlying issue? What alternatives are there to step around this issue, or am I misunderstanding the problem?
The only thing I can think to do is have a secret blending algorithm that laces the salt and password together in a random pattern, or adds other user fields to the hashing process meaning the hacker would have to have access to the database AND code to lace them for a dictionary attack to prove fruitful. (Update, as pointed out in comments it's best to assume the hacker has access to all your information so this probably isn't best).
Let me give an example of how I propose a hacker would hack a user database with a list of passwords and hashes:
Data from our hacked database:
RawPassword (not stored) | Hashed   | Salt
--------------------------------------------------------
letmein                  | WEFLS... | WEFOJFOFO...
Common password dictionary:
Common Password
--------------
letmein
12345
...
For each user record, loop the common passwords and hash them:
for each user in hacked_DB
    salt = users_salt
    hashed_pw = users_hashed_password
    for each common_password
        testhash = sha1(common_password + salt)
        if testhash = hashed_pw then
            //Match! Users password = common_password
            //Lets visit the webpage and login now.
        end if
    next
next
I hope this illustrates my point a lot better.
Given 10,000 common passwords, and 10,000 user records, we would need to calculate 100,000,000 hashes to discover as many user passwords as possible. It might take a few hours, but it's not really an issue.
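For reference, the same per-user loop can be written as runnable Python. The two records, their salts, and the short word list are made up for this sketch, and it counts how many hashes were actually computed.

```python
import hashlib


def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Made-up stand-in for the hacked database (hash and salt per user).
hacked_db = [
    {"user": "alice", "salt": "a1b2c3",
     "hashed_pw": sha1_hex("letmein" + "a1b2c3")},
    {"user": "bob", "salt": "d4e5f6",
     "hashed_pw": sha1_hex("correct horse battery staple" + "d4e5f6")},
]
common_passwords = ["letmein", "12345", "password"]


def dictionary_attack(db, candidates):
    """Try every candidate against every record, using that record's salt."""
    cracked = {}
    hashes_computed = 0
    for record in db:
        for candidate in candidates:
            hashes_computed += 1
            if sha1_hex(candidate + record["salt"]) == record["hashed_pw"]:
                cracked[record["user"]] = candidate
                break  # stop once this user's password is found
    return cracked, hashes_computed


cracked, hashes_computed = dictionary_attack(hacked_db, common_passwords)
print(cracked, hashes_computed)
```

The work grows as (number of users) x (number of candidates) because nothing computed for one salt can be reused for another.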
Update on Cracking Theory
We will assume we are a corrupt webhost, that has access to a database of SHA1 hashes and salts, along with your algorithm to blend them. The database has 10,000 user records.
This site claims to be able to calculate 2,300,000,000 SHA1 hashes per second using the GPU. (In real world situation probably will be slower, but for now we will use that quoted figure).
(((95^4)/2300000000)/2)*10000 = 177 seconds
Given a full range of 95 printable ASCII characters, with a maximum length of 4 characters, divided by the rate of calculation (variable), divided by 2 (assuming the average time to discover password will on average require 50% of permutations) for 10,000 users it would take 177 seconds to work out all users passwords where the length is <= 4.
Let's adjust it a bit for realism.
(((36^7)/1000000000)/2)*10000 = 391,821 seconds = about 4.5 days
Assuming non case sensitivity, with a password length <= 7, only alphanumeric chars, it would take about 4.5 days to solve for 10,000 user records, and I've halved the speed of the algorithm to reflect overhead and non-ideal circumstances.
It is important to recognise that this is a linear brute force attack: all calculations are independent of one another, therefore it's a perfect task for multiple systems to solve. (I.e. it is easy to set up 2 computers running the attack from different ends, which would halve the execution time.)
Given the case of recursively hashing a password 1,000 times to make this task more computationally expensive:
(((36^7) / 1 000 000 000) / 2) * 1000 seconds = 10.8839117 hours
This represents a maximum length of 7 alpha-numeric characters, at a less than half speed execution from quoted figure for one user.
Recursively hashing 1,000 times effectively blocks a blanket attack, but individual users are still vulnerable to targeted attacks.
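The three estimates above all follow the same formula, so they can be checked with one small helper. The function name and parameters are assumptions for this sketch; it mirrors the formulas as written (keyspace of exactly the given length, halved for the average case). Evaluated this way, the second formula comes out at about 4.5 days.

```python
def brute_force_seconds(alphabet, length, hashes_per_second,
                        users=1, iterations=1):
    """Average brute-force time: half the keyspace, per user, per iteration."""
    keyspace = alphabet ** length
    return keyspace / hashes_per_second / 2 * users * iterations


# 95 printable characters, length 4, quoted GPU rate, 10,000 users.
t1 = brute_force_seconds(95, 4, 2_300_000_000, users=10_000)
# 36 characters, length 7, reduced rate, 10,000 users.
t2 = brute_force_seconds(36, 7, 1_000_000_000, users=10_000)
# Same as above for one user, with 1,000 hash iterations per guess.
t3 = brute_force_seconds(36, 7, 1_000_000_000, iterations=1_000)

print(f"{t1:.0f} seconds")
print(f"{t2 / 86400:.1f} days")
print(f"{t3 / 3600:.2f} hours")
```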
It doesn't stop dictionary attacks.
What it does is stop someone who manages to get a copy of your password file from using a rainbow table to figure out what the passwords are from the hashes.
Eventually, it can be brute-forced, though. The answer to that part is to force your users to not use dictionary words as passwords (minimum requirements of at least one number or special character, for example).
Update:
I should have mentioned this earlier, but some (most?) password systems use a different salt for each password, likely stored with the password itself. This makes a single rainbow table useless. This is how the UNIX crypt library works, and modern UNIX-like OSes have extended this library with new hash algorithms.
I know for a fact that support for SHA-256 and SHA-512 was added in newer versions of GNU crypt.
To be more precise, a dictionary attack, i.e. an attack where all words in an exhaustive list are tried, gets not "impossible", but it gets impractical: each bit of salt doubles the amount of storage and computation required.
This is different from pre-computed dictionary attacks like attacks involving rainbow tables where it does not matter whether the salt is secret or not.
Example: With a 64-bit salt (i.e. 8 bytes) you need to check 2^64 additional password combinations in your dictionary attack. With a dictionary containing 200,000 words you will have to make
200,000 * 2^64 = 3.69 * 10^24
tests in the worst case - instead of 200,000 tests without salt.
An additional benefit of using salt is that an attacker cannot pre-compute the password hashes from his dictionary. It would simply take too much time and/or space.
Update
Your update assumes that an attacker already knows the salt (or has stolen it). This is of course a different situation. Still, it is not possible for the attacker to use a pre-computed rainbow table. What matters here a lot is the speed of the hashing function. To make an attack impractical, the hashing function needs to be slow. MD5 or SHA are not good candidates here because they are designed to be fast. Better candidates for hashing algorithms are Blowfish or some variations of it.
Update 2
A good read on the matter of securing your password hashes in general (going much beyond the original question but still interesting):
Enough With The Rainbow Tables: What You Need To Know About Secure Password Schemes
Corollary of the article: Use salted hashes created with bcrypt (based on Blowfish) or Eksblowfish that allows you to use a configurable setup time to make hashing slow.
Yes, you need just 3 days for sha1(salt | password). That's why good password storage algorithms use 1000-iteration hashing: you will need 8 years.
A dictionary is a structure where values are indexed by keys. In the case of a pre-computed dictionary attack, each key is a hash, and the corresponding value is a password that results in the hash. With a pre-computed dictionary in hand, an attacker can "instantly" lookup a password that will produce the necessary hash to log in.
With salt, the space required to store the dictionary grows rapidly… so rapidly, that trying to pre-compute a password dictionary soon becomes pointless.
The best salts are randomly chosen from a cryptographic random number generator. Eight bytes is a practical size, and more than 16 bytes serves no purpose.
Salt does much more than just "make an attacker's job more irritating." It eliminates an entire class of attack—the use of precomputed dictionaries.
Another element is necessary to completely secure passwords, and that is "key-strengthening." One round of SHA-1 is not good enough: a safe password hashing algorithm should be very slow computationally.
Many people use PBKDF2, a key derivation function, that feeds back results to the hash function thousands of times. The "bcrypt" algorithm is similar, using an iterative key derivation that is slow.
When the hashing operation is very slow, a precomputed table becomes more and more desirable to an attacker. But proper salt defeats that approach.
Comments
Below are the comments I made on the question.
Without salt, an attacker wouldn't use the method demonstrated in "Update 2". He'd simply do a lookup in a pre-computed table and get the password in O(1) or O(log n) time (n being the number of candidate passwords). Salt is what prevents that and forces him to use the O(n) approach shown in "Update 2".
Once reduced to an O(n) attack, we have to consider how long each attempt takes. Key-strengthening can cause each attempt in the loop to take a full second, meaning that the time needed to test 10k passwords on 10k users will stretch from 3 days to 3 years… and with only 10k passwords, you're likely to crack zero passwords in that time.
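The 3-days-versus-3-years comparison is simple multiplication, sketched below. The helper name is made up for this example; the inputs (10k passwords, 10k users, one second per strengthened attempt) come from the comment above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def attack_seconds(passwords, users, seconds_per_guess):
    """Total time for an O(n) attack: every candidate against every user."""
    return passwords * users * seconds_per_guess


# With key-strengthening tuned to one second per attempt.
slow_seconds = attack_seconds(10_000, 10_000, 1.0)
slow_years = slow_seconds / SECONDS_PER_YEAR

# Guess rate implied by finishing the same 10^8 attempts in 3 days.
implied_rate = (10_000 * 10_000) / (3 * 86400)

print(f"{slow_years:.2f} years at 1 second per guess")
print(f"{implied_rate:.0f} guesses per second to finish in 3 days")
```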
You have to consider that an attacker is going to use the fastest tools he can, not PHP, so thousands of iterations, rather than 100, would be a good parameter for key-strengthening. It should take a large fraction of a second to compute the hash for a single password.
Key-strengthening is part of the standard key derivation algorithms PBKDF1 and PBKDF2, from PKCS #5, which make great password obfuscation algorithms (the "derived key" is the "hash").
A lot of users on StackOverflow refer to this article because it was a response to Jeff Atwood's post about the dangers of rainbow tables. It's not my favorite article, but it does discuss these concepts in more detail.
Of course you assume the attacker has everything: salt, hash, user name. Assume the attacker is a corrupt hosting company employee who dumped the user table on your myprettypony.com fansite. He's trying to recover these passwords because he's going to turn around and see if your pony fans used the same password on their citibank.com accounts.
With a well-designed password scheme, it will be impossible for this guy to recover any passwords.
The point of salting is to prevent the amortization of the attacker's effort.
With no salt, a single table of precomputed hash-password entries (e.g. MD5 of all alphanumeric 5 character strings, easy to find online) can be used on every user in every database in the world.
With a site-specific salt, the attacker has to compute the table himself and can then use it on all users of the site.
With a per-user salt, the attacker has to expend this effort for every user separately.
Of course, this doesn't do much to protect really weak passwords straight out of a dictionary, but it protects reasonably strong passwords against this amortization.
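The amortization argument can be put into numbers with a toy cost model. The function and scheme names here are invented for this sketch, and it only counts hash computations; it ignores storage and the fact that unsalted tables can often simply be downloaded.

```python
def hashes_required(candidates, sites, users_per_site, scheme):
    """Hash computations needed to test every candidate against every user."""
    if scheme == "no_salt":
        # One table serves every user on every site.
        return candidates
    if scheme == "site_salt":
        # One table per site, shared by all of that site's users.
        return candidates * sites
    if scheme == "per_user_salt":
        # The work must be repeated for each individual user.
        return candidates * sites * users_per_site
    raise ValueError(f"unknown scheme: {scheme}")


CANDIDATES, SITES, USERS = 10_000, 5, 10_000
for scheme in ("no_salt", "site_salt", "per_user_salt"):
    print(scheme, hashes_required(CANDIDATES, SITES, USERS, scheme))
```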
Also, one more important point: using a USER-specific salt prevents the detection of two users with the SAME password, since without it their hashes would match. That's why the hash is often computed as hash(salt + username + password).
If you try and keep the hash secret the attacker also can not verify the hashes.
Edit- just noticed the main point was made in a comment above.
Salts are implemented to prevent rainbow table attacks. A rainbow table is a list of pre-calculated hashes, which makes translating a hash into its phrase much simpler. You need to understand that salting isn't effective as a modern defense against password cracking unless it is paired with a modern hashing algorithm.
So let's say we're working with SHA1, taking advantage of recent exploits discovered with this algorithm, and let's say we have a computer running at 1,000,000 hashes/second. It would take 5.3 million million million years to find a collision, so yeah, PHP can work 300 a second, big whoop, it doesn't really matter. The reason we salt is because if someone did bother to generate all common dictionary phrases, (2^160 people, welcome to 2007 era exploits).
So here's an actual database, with 2 users I use for testing and admin purposes.
RegistrationTime UserName UserPass
1280185359.365591 briang a50b63e927b3aebfc20cd783e0fc5321b0e5e8b5
1281546174.065087 test 5872548f2abfef8cb729cac14bc979462798d023
In fact, the salting scheme is your sha1(registration time + user name). Go ahead, tell me my password, these are real passwords in production. You can even sit there and hash out a word list in php. Go wild.
I'm not crazy, I just know that this is secure. For fun sake, test's password is test.
sha1(sha1(1281546174.065087 + test) + test) = 5872548f2abfef8cb729cac14bc979462798d023
You would need to generate an entire rainbow table prepended with 27662aee8eee1cb5ab4917b09bdba31d091ab732 for just this user. That means my passwords cannot all be compromised by a single rainbow table: the hacker needs to generate an entire rainbow table for 27662aee8eee1cb5ab4917b09bdba31d091ab732 for test, and again for f3f7735311217529f2e020468004a2aa5b3dee7f for briang. Think back to the 5.3 million million million years for all hashes. Think of the size of storing just the 2^80 hashes (that's well over 20 yottabytes); it's not going to happen.
Don't confuse salting with a means of making a hash something you can't ever decode; it's a means of preventing a rainbow table from translating all your user passwords. That is impossible at this level of technology.
The idea behind dictionary attack is that you take a hash and find the password, from which this hash was calculated, without hash calculation. Now do the same with salted password - you can't.
Not using a salt makes a password search as easy as a lookup in the database. Adding a salt makes the attacker perform a hash calculation for all possible passwords (even for a dictionary attack, this significantly increases the time of the attack).
In simplest terms: without salting, each candidate password need only be hashed once to check it against every user, anywhere in the "known universe" (collection of compromised databases), whose password is hashed via the same algorithm. With salting, if the number of possible salt values substantially exceeds the number of users in the "known universe", each candidate password must be hashed separately for each user against whom it will be tested.
Simply put salting does not prevent a hash from attack (bruteforce or dictionary), it only makes it harder; the attacker will either need to find the salting algorithm (which if implemented properly will make use of more iterations) or bruteforce the algo, which unless very simple, is nearly impossible. Salting also almost completely discards the option of rainbow table lookups...
Salt makes Rainbow table attacks much more difficult since it makes a single password hash much harder to crack. Imagine you have a horrid password of just the number 1. A rainbow table attack would crack this immediately.
Now imagine each password in the db is salted with a long random value of many random characters. Now your lousy password of "1" is stored in the db as a hash of 1 plus a bunch of random characters (the salt), so in this example the rainbow table needs to have the hash for something like: 1.
So assuming your salt is something secure and random, say ()%ISLDGHASKLU(%#%#, the hacker's rainbow table would need to have an entry for 1*()%ISLDGHASKLU(*%#%#. Now using a rainbow table on even this simple password is no longer practical.

Non-random salt for password hashes

UPDATE: I recently learned from this question that in the entire discussion below, I (and I am sure others too) was a bit confused: what I keep calling a rainbow table is in fact called a hash table. Rainbow tables are more complex creatures, and are actually a variant of Hellman Hash Chains. Though I believe the answer is still the same (since it doesn't come down to cryptanalysis), some of the discussion might be a bit skewed.
The question: "What are rainbow tables and how are they used?"
Typically, I always recommend using a cryptographically-strong random value as salt, to be used with hash functions (e.g. for passwords), such as to protect against Rainbow Table attacks.
But is it actually cryptographically necessary for the salt to be random? Would any unique value (unique per user, e.g. userId) suffice in this regard? It would in fact prevent using a single Rainbow Table to crack all (or most) passwords in the system...
But does lack of entropy really weaken the cryptographic strength of the hash functions?
Note, I am not asking about why to use salt, how to protect it (it doesn't need to be), using a single constant hash (don't), or what kind of hash function to use.
Just whether salt needs entropy or not.
Thanks all for the answers so far, but I'd like to focus on the areas I'm (a little) less familiar with. Mainly implications for cryptanalysis - I'd appreciate most if anyone has some input from the crypto-mathematical PoV.
Also, if there are additional vectors that hadn't been considered, that's great input too (see @Dave Sherohman's point on multiple systems).
Beyond that, if you have any theory, idea or best practice - please back this up either with proof, attack scenario, or empirical evidence. Or even valid considerations for acceptable trade-offs... I'm familiar with Best Practice (capital B capital P) on the subject, I'd like to prove what value this actually provides.
EDIT: Some really good answers here, but I think, as @Dave says, it comes down to Rainbow Tables for common user names... and possibly less common names too. However, what if my usernames are globally unique? Not necessarily unique for my system, but per each user - e.g. email address.
There would be no incentive to build a RT for a single user (as @Dave emphasized, the salt is not kept secret), and this would still prevent clustering. The only issue would be that I might have the same email and password on a different site - but salt wouldn't prevent that anyway.
So, it comes back down to cryptanalysis - IS the entropy necessary, or not? (My current thinking is it's not necessary from a cryptanalysis point of view, but it is from other practical reasons.)
Salt is traditionally stored as a prefix to the hashed password. This already makes it known to any attacker with access to the password hash. Using the username as salt or not does not affect that knowledge and, therefore, it would have no effect on single-system security.
However, using the username or any other user-controlled value as salt would reduce cross-system security, as a user who had the same username and password on multiple systems which use the same password hashing algorithm would end up with the same password hash on each of those systems. I do not consider this a significant liability because I, as an attacker, would try passwords that a target account is known to have used on other systems first before attempting any other means of compromising the account. Identical hashes would only tell me in advance that the known password would work, they would not make the actual attack any easier. (Note, though, that a quick comparison of the account databases would provide a list of higher-priority targets, since it would tell me who is and who isn't reusing passwords.)
The greater danger from this idea is that usernames are commonly reused - just about any site you care to visit will have a user account named "Dave", for example, and "admin" or "root" are even more common - which would make construction of rainbow tables targeting users with those common names much easier and more effective.
Both of these flaws could be effectively addressed by adding a second salt value (either fixed and hidden or exposed like standard salt) to the password before hashing it, but, at that point, you may as well just be using standard entropic salt anyhow instead of working the username into it.
Edited to Add: A lot of people are talking about entropy and whether entropy in salt is important. It is, but not for the reason most of the comments on it seem to think.
The general thought seems to be that entropy is important so that the salt will be difficult for an attacker to guess. This is incorrect and, in fact, completely irrelevant. As has been pointed out a few times by various people, attacks which will be affected by salt can only be made by someone with the password database and someone with the password database can just look to see what each account's salt is. Whether it's guessable or not doesn't matter when you can trivially look it up.
The reason that entropy is important is to avoid clustering of salt values. If the salt is based on username and you know that most systems will have an account named either "root" or "admin", then you can make a rainbow table for those two salts and it will crack most systems. If, on the other hand, a random 16-bit salt is used and the random values have roughly even distribution, then you need a rainbow table for all 2^16 possible salts.
It's not about preventing the attacker from knowing what an individual account's salt is, it's about not giving them the big, fat target of a single salt that will be used on a substantial proportion of potential targets.
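To make the clustering argument concrete, here is a minimal sketch. It uses plain SHA-256 and a tiny, made-up candidate list purely for brevity (both are assumptions, not a recommendation): a table precomputed once for the salt "root" pays off on every system that salts with the username, and is useless against a random salt.

```python
import hashlib
import secrets


def salted_hash(salt: bytes, password: str) -> str:
    # Plain SHA-256 keeps the illustration short; it is not a recommended password hash.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Hypothetical candidate passwords an attacker might precompute.
CANDIDATES = ["123456", "password", "letmein", "hunter2"]

# Built once for the salt "root", reusable against every system that salts with the username.
root_table = {salted_hash(b"root", pw): pw for pw in CANDIDATES}

# Two unrelated systems, both with a "root" account salted by username.
system_a_root = salted_hash(b"root", "letmein")
system_b_root = salted_hash(b"root", "hunter2")
cracked = [root_table.get(system_a_root), root_table.get(system_b_root)]

# A third system that uses a random 16-byte salt: the precomputed table no longer matches.
random_salt = secrets.token_bytes(16)
system_c_root = salted_hash(random_salt, "letmein")
reusable_hit = root_table.get(system_c_root)
```

The attacker can still attack the third system once they read its salt, but they have to redo the work for that one account instead of reusing a shared table.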
Using a high-entropy salt is absolutely necessary to store passwords securely.
Taking my username 'gs' and prepending it to my password 'MyPassword' gives gsMyPassword. This is easily broken using a rainbow table, because if the username hasn't got enough entropy, this value may already be stored in the rainbow table, especially if the username is short.
Another problem is attacks where you know that a user participates in two or more services. There are lots of common usernames; probably the most important ones are admin and root. If somebody created a rainbow table with salts for the most common usernames, they could use it to compromise accounts.
Traditional Unix crypt used to have a 12-bit salt. 12 bits give 4096 different combinations. That was not secure enough, because that much information can easily be stored nowadays. The same applies to the 4096 most used usernames. It's likely that a few of your users will choose a username that is among the most common ones.
I've found this password checker which works out the entropy of your password. Having smaller entropy in passwords (like by using usernames) makes it much easier for rainbow tables, as they try to cover at least all passwords with low entropy, because they are more likely to occur.
It is true that the username alone may be problematic, since people may share usernames among different websites. But it should be rather unproblematic if the users had a different name on each website. So why not just make it unique on each website? Hash the password somewhat like this:
hashfunction("www.yourpage.com/"+username+"/"+password)
This should solve the problem. I'm not a master of cryptanalysis, but I sure doubt that the fact that we don't use high entropy would make the hash any weaker.
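A rough sketch of the site-qualified idea above, under some assumptions: the site name is the placeholder from the answer, and PBKDF2 with an illustrative iteration count stands in for the unspecified "hashfunction". The site and username go in as the salt and the password as the secret, so the two are never mixed in one concatenated string.

```python
import hashlib

SITE = "www.yourpage.com"  # placeholder site name taken from the answer above


def site_scoped_hash(username: str, password: str, iterations: int = 100_000) -> str:
    # Deterministic salt: unique per site and per user, but not random.
    salt = f"{SITE}/{username}".encode("utf-8")
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return digest.hex()


# Same user and password always reproduce the same hash on this site...
dave_first = site_scoped_hash("dave", "beer")
dave_again = site_scoped_hash("dave", "beer")
# ...while another user with the same password gets a different one.
admin_hash = site_scoped_hash("admin", "beer")
```

This removes the cross-site clustering on names like "admin", though a table built for one specific site and username would still be reusable for as long as that account exists.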
I like to use both: a high-entropy random per-record salt, plus the unique ID of the record itself.
Though this doesn't add much to security against dictionary attacks, etc., it does remove the fringe case where someone copies their salt and hash to another record with the intention of replacing the password with their own.
(Admittedly it's hard to think of a circumstance where this applies, but I can see no harm in belts and braces when it comes to security.)
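For illustration, one way the "both" approach might look. The function names, the 16-byte salt length, and the iteration count are all assumptions for this sketch. The record ID is mixed into the salt, so a salt and hash copied onto another record no longer verify.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a tuned value


def _derive(record_id: int, salt: bytes, password: str) -> bytes:
    # The salt has a fixed length (16 bytes), so appending the ID is unambiguous.
    bound_salt = salt + str(record_id).encode("ascii")
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), bound_salt, ITERATIONS)


def make_record(record_id: int, password: str) -> tuple[bytes, bytes]:
    # Returns (random per-record salt, hash bound to this record's ID).
    salt = secrets.token_bytes(16)
    return salt, _derive(record_id, salt, password)


def verify(record_id: int, salt: bytes, stored: bytes, password: str) -> bool:
    return hmac.compare_digest(stored, _derive(record_id, salt, password))


salt, stored = make_record(1, "MyPassword")
ok_on_own_record = verify(1, salt, stored, "MyPassword")
ok_when_copied = verify(2, salt, stored, "MyPassword")  # copied to record 2
```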
If the salt is known or easily guessable, you have not increased the difficulty of a dictionary attack. It may even be possible to create a modified rainbow table that takes a "constant" salt into account.
Using unique salts increases the difficulty of BULK dictionary attacks.
Having a unique, cryptographically strong salt value would be ideal.
I would say that as long as the salt is different for each password, you will probably be OK. The point of the salt is that you can't use a standard rainbow table to solve every password in the database. So if you apply a different salt to every password (even if it isn't random), the attacker would basically have to compute a new rainbow table for each password, since each password uses a different salt.
Using a salt with more entropy doesn't help a whole lot, because the attacker in this case is assumed to already have the database. Since you need to be able to recreate the hash, you have to already know what the salt is. So you have to store the salt, or the values that make up the salt in your file anyway. In systems like Linux, the method for getting the salt is known, so there is no use in having a secret salt. You have to assume that the attacker who has your hash values, probably knows your salt values as well.
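As a small sketch of why the salt cannot be secret: the verifier needs it to recompute the hash, so it is stored right next to the hash. The `salt$hash` layout, function names, and iteration count below are assumptions made for this example, not any particular system's format.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor


def create_entry(password: str) -> str:
    # A fresh random salt per password, stored in the clear beside the digest.
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def check_entry(entry: str, password: str) -> bool:
    # Anyone holding the entry can read the salt back out, attacker included.
    salt_hex, digest_hex = entry.split("$")
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest.hex(), digest_hex)


entry = create_entry("beer")
second_entry = create_entry("beer")  # same password, different salt
```

Here `check_entry(entry, "beer")` succeeds, and the two entries differ even though the password is the same, which is all the salt is meant to achieve.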
The strength of a hash function is not determined by its input!
Using a salt that is known to the attacker obviously makes constructing a rainbow table (particularly for hard-coded usernames like root) more attractive, but it doesn't weaken the hash. Using a salt which is unknown to the attacker will make the system harder to attack.
The concatenation of a username and password might still provide an entry for an intelligent rainbow table, so using a salt made of a series of pseudo-random characters, stored with the hashed password, is probably a better idea. As an illustration, if I had username "potato" and password "beer", the concatenated input for your hash is "potatobeer", which is a reasonable entry for a rainbow table.
Changing the salt each time the user changes their password might help to defeat prolonged attacks, as would the enforcement of a reasonable password policy, e.g. mixed case, punctuation, min length, change after n weeks.
However, I would say your choice of digest algorithm is more important. Use of SHA-512 is going to prove to be more of a pain for someone generating a rainbow table than MD5, for example.
Salt should have as much entropy as possible to ensure that should a given input value be hashed multiple times, the resulting hash value will be, as close as can be achieved, always different.
Using ever-changing salt values with as much entropy as possible maximizes the likelihood that hashing (say, password + salt) will produce entirely different hash values.
The less entropy in the salt, the more chance you have of generating the same salt value, and thus the more chance you have of generating the same hash value.
It is the fact that the hash value is "constant" when the input is known and "constant" that allows dictionary attacks or rainbow tables to be so effective. Varying the resulting hash value as much as possible (by using high-entropy salt values) ensures that hashing the same input+random-salt will produce many different hash values, thereby defeating (or at least greatly reducing the effectiveness of) rainbow table attacks.
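The link between salt entropy and repeated salt values can be estimated with the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n accounts and uniformly random b-bit salts. The sketch below is a back-of-the-envelope calculation; the account count is an arbitrary example.

```python
import math


def salt_collision_probability(num_accounts: int, salt_bits: int) -> float:
    # Birthday approximation for "at least two accounts share a salt",
    # assuming salts are drawn uniformly at random.
    pairs = num_accounts * (num_accounts - 1) / 2
    return -math.expm1(-pairs / 2.0 ** salt_bits)


# Example: 1000 accounts with a 12-bit salt versus a 128-bit salt.
p_12_bit = salt_collision_probability(1000, 12)
p_128_bit = salt_collision_probability(1000, 128)
```

With 12 bits, a repeated salt among 1000 accounts is practically certain; with 128 bits it is negligible.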
Entropy is the point of Salt value.
If there is some simple and reproducible "math" behind the salt, then it's the same as if the salt were not there. Just adding a time value should be fine.

Resources