We've got a legacy system that stores passwords using the MS membership provider and have just found out that it only used a SHA1 with a random salt to store the passwords, so obviously we are concerned about this situation. I know the ideal situation would be to force a global password reset, but for assorted reasons we would like to avoid this, if possible, and keep the existing passwords as-is. I've done some poking around and have managed to find some source-code, and can re-hash my password so I get the same result as the stored version of it, so I am wanting to override all the appropriate methods to reimplement the code in a secure manner.
What I am proposing to do is to rehash the stored passwords using a currently "secure" hash (AFAIK, the current methods are only classed as secure due to the amount of computation time it takes to brute-force a password, so if systems get a big performance upgrade the whole programming world could end up having to revisit this), then wrap this hash around the existing hash in the code, but I have 2 questions:-
Is this actually secure? As far as I have read, each hash needs to increase the amount of entropy and I'm 90% sure this will do so, but are there any issues in doing this that I need to be aware of? I'm also guessing that in chains of hashing it's the strongest hash function that determines the "base-line" security level, but I thought I'd double-check there weren't any weird mathematical quirks with hashing an "insecure" hash. I'm sure not, but due to the nature of the problem, I'd rather ask a stupid question than make any incorrect assumptions, as the technical aspect of hashing functions isn't something I've really looked into.
Should I re-apply the salt to the current hash before re-hashing? My thinking on this was in case there were existing precomputed tables that convert older hashes to newer ones, in case someone nefarious had done the grunt work to try to bypass this method. I believe bcrypt already includes a salt, but if I use an alternative that doesn't, I'm guessing I should include one?
Double hashing can be a good way to protect very weak password-hashes immediately, if you can't wait on the next user login and don't want to enforce a login. Weak password-hashes include unsalted hashes or very fast hashes like SHA-*/MD5.
So you can prepare your database like this:
Make the old salt persistent in the database; you need the oldSalt to verify the double hash.
Generate a new salt fulfilling the requirements of the new password-hash function, then calculate the double hash and store it in the database: newHash = newSafeHashFunction(oldHash, newSalt). Safe hash functions nowadays are BCrypt, SCrypt, Argon2 and PBKDF2.
After the next successful login, the double hash should be replaced with the pure new algorithm newSafeHashFunction(password, newSalt).
Most password-hash implementations will generate a safe salt on their own and include it in the resulting password-hash string, so there is no need to generate and store them separately. When the user logs in the next time, the password can be verified like this:
if (checkIfDoubleHash(storedHash))
    correctPassword = newSafeHashFunction(oldUnsafeHashFunction(password, oldSalt), storedHash)
else
    correctPassword = newSafeHashFunction(password, storedHash)
➽ Note the function checkIfDoubleHash(): it is crucial, and forgetting it is a common pitfall of double hashing. If we generally accepted newSafeHashFunction(password, storedHash), an attacker who got hold of an old backup, or who has values from an earlier SQL injection, could use the old hashes directly as passwords.
The implementation of checkIfDoubleHash() can be as easy as checking for the old salt, or it can be made future-proof by marking the hash as a double hash. Most frameworks already offer a password_hash() function which adds such a "mark", so they can switch to newer algorithms when necessary.
$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |
 hash-algorithm-descriptor = 2y = BCrypt
This is an often adopted format used by the Unix crypt() function. There is nothing preventing you from using your own descriptor for the double hashes. Of course the mark can also be stored in a separate database field.
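The whole procedure above can be sketched in Python. This is only an illustration under assumptions: PBKDF2 from the standard library stands in for the new safe hash, old_unsafe_hash() is a made-up stand-in for the legacy scheme (the real provider's byte layout may differ), and the two marker strings are invented descriptors, not a standard format.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000            # assumed work factor; tune for your hardware
MARK_DOUBLE = "$sha1+pbkdf2$"   # hypothetical mark for wrapped legacy hashes
MARK_NEW = "$pbkdf2$"           # hypothetical mark for pure new hashes


def old_unsafe_hash(password: str, old_salt: bytes) -> bytes:
    # Stand-in for the legacy scheme: SHA-1 over salt + password.
    return hashlib.sha1(old_salt + password.encode("utf-8")).digest()


def _slow_hash(data: bytes, salt: bytes) -> str:
    digest = hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def wrap_legacy_hash(old_hash: bytes) -> str:
    # Offline migration: needs only the stored old hash, not the password.
    return MARK_DOUBLE + _slow_hash(old_hash, os.urandom(16))


def hash_new(password: str) -> str:
    # Pure new hash, stored after the next successful login.
    return MARK_NEW + _slow_hash(password.encode("utf-8"), os.urandom(16))


def verify(password: str, stored: str, old_salt: bytes) -> bool:
    # The mark decides which path is taken (the checkIfDoubleHash() step).
    if stored.startswith(MARK_DOUBLE):
        data = old_unsafe_hash(password, old_salt)
        body = stored[len(MARK_DOUBLE):]
    elif stored.startswith(MARK_NEW):
        data = password.encode("utf-8")
        body = stored[len(MARK_NEW):]
    else:
        return False
    salt_hex, digest_hex = body.split("$")
    expected = hashlib.pbkdf2_hmac(
        "sha256", data, bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(expected, bytes.fromhex(digest_hex))
```

After a successful verify() on a record carrying the double-hash mark, the application would store hash_new(password) in its place.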
Related
I have a legacy browser game which historically uses a simple hashing function for password storage. I know that it's far from ideal. However, time has proven that most of the cheaters (multi-accounts) use the same password for all of their fake accounts.
In an update of my game I want to store passwords more safely. I already know that passwords should be randomly salted, hashed by safe algorithms, etc. That's all nice.
But is there any way to store passwords properly and still determine that two (or more) users use the same password? I don't want to know the password. I don't want to be able to search by password. I only need to tell that suspect users A, B and C use the same one.
Thanks.
If you store them correctly - no. This is one of the points of a proper password storage.
You could have very long passwords, beyond what is available on rainbow tables (not sure about the current state of the art, but it used to be 10 or 12 characters) and not salt them. In this case two passwords would have the same hash. This is a very bad idea (but a solution nevertheless) - if your passwords leak someone may be able to guess them indirectly (xkcd reference).
You may also look at homomorphic encryption, but this is in the realm of science fiction for now.
Well, if you use salt + hashing, you have all the salts as plain text. When a user enters a password, before storing/verifying it, you can hash it with all the salts available and see if you get the corresponding existing hash. :)
The obvious problem with this is that if you are doing it properly with bcrypt or pbkdf2 for hashing, this would be very slow - that's kind of the point in these functions.
I don't think there is any other way you can tell whether two passwords are the same - you need at least one of them plain text, which is only when the user enters it. And then you want to remove it from memory asap, which contradicts doing all these calculations with the plain text password in memory.
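For illustration, the "hash it with all the salts" check could look roughly like this in Python. It is a sketch under assumptions: PBKDF2 from the standard library stands in for whatever slow hash is actually used, and the records layout (user name mapped to salt and stored hash) is made up for the example.

```python
import hashlib
import hmac

ITERATIONS = 50_000  # assumed work factor for the sketch


def slow_hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def users_sharing_password(password: str, records: dict) -> list:
    """records maps user name -> (salt, stored_hash).

    Runs one slow hash per stored record, so the cost grows linearly
    with the number of users, which is the slowness described above.
    """
    return [
        user
        for user, (salt, stored) in records.items()
        if hmac.compare_digest(slow_hash(password, salt), stored)
    ]
```

This can only run at the moment a user types the password, since that is the only time the plain text is available.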
This will reduce the security of all passwords somewhat, since it leaks information about when two users have the same password. Even so, it is a workable trade-off and is straightforward to secure within that restriction.
The short answer is: use the same salt for all the passwords, but make that salt unique to your site.
Now the long answer:
First, to describe a standard and appropriate way to handle passwords. I'll get to the differences for you afterwards. (You may know all of this already, but it's worth restating.)
Start with a decent key-stretching algorithm, such as PBKDF2 (there are others, some even better, but PBKDF2 is ubiquitous and sufficient for most uses). Select a number of iterations depending on what client-side environment is involved. For JavaScript, you'll want something like 1k-4k iterations. For languages with faster math, you can use 10k-100k.
The key stretcher will need a salt. I'll talk about the salt in a moment.
The client stretches the password with PBKDF2 and sends the result to the server. The server applies a fast hash (SHA-256 is nice) and compares that to the stored hash. (For setting the password, the server does the same thing; it accepts a PBKDF2 hash, applies SHA-256, and then stores it.)
All that is standard stuff. The question is the salt. The best salt is random, but no good for this. The second-best salt is built from service_id+user_id (i.e. use a unique identifier for the service and concatenate the username). Both of these make sure that every user's password hash is unique, even if their passwords are identical. But you don't want that.
So now finally to the core of your question. You want to use a per-service, but not per-user, static salt. So something like "com.example.mygreatapp" (obviously don't use that actual string; use a string based on your app). With a constant salt, all passwords on your service that are the same will stretch (PBKDF2) and hash (SHA256) to the same value and you can compare them without having any idea what the actual password is. But if your password database is stolen, attackers cannot compare the hashes in it to hashes in other sites' databases, even if they use the same algorithm (because they'll have a different salt).
The disadvantage of this scheme is exactly its goal: if two people on your site have the same password and an attacker steals your database and knows the password of one user, they know the password of the other user, too. That's the trade-off.
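A rough Python sketch of this scheme, with the trade-off above in mind. The function names and the iteration count are assumptions made for the example, and the salt is the placeholder string from the text, which should of course be replaced with something specific to your app.

```python
import hashlib
from collections import defaultdict

SITE_SALT = b"com.example.mygreatapp"  # placeholder from the text; use your own
ITERATIONS = 100_000                   # assumed; pick per client environment


def stretch(password: str) -> bytes:
    # Client side: PBKDF2 with the per-service (not per-user) salt.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), SITE_SALT, ITERATIONS
    )


def stored_value(stretched: bytes) -> str:
    # Server side: fast hash of what the client sent; this is what gets stored.
    return hashlib.sha256(stretched).hexdigest()


def groups_sharing_password(stored_by_user: dict) -> list:
    # Identical passwords give identical stored values, so group by value.
    groups = defaultdict(list)
    for user, value in stored_by_user.items():
        groups[value].append(user)
    return [sorted(users) for users in groups.values() if len(users) > 1]
```

The grouping never sees a password, only the stored values, which is what the question asked for.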
I have a database of legacy passwords that were salted and hashed using MD5. I would like to update the system so that the data is more secure.
The first option is to transition the users to a new hashing scheme (Salt + Scrypt or PBKDF2 HMACSHA256) when they login and deactivate old users after a certain period of time so they have to use the password recovery feature which would automatically update their hash.
Another option that would allow me to instantly upgrade everyone would be to take the existing MD5 hashes, add a new random salt value to each, and then hash the result using the new hashing scheme (Salt + Scrypt or PBKDF2 HMACSHA256) and store that value to the database and delete the old value.
Then when users login, I would have to apply the old, and then the new method. I like the second option better since it allows me to remove all the old insecure hashes from the database sooner than later.
Is it secure to salt and rehash the existing hashes? Is MD5 so broken that I can run a script to de-hash the passwords and rehash them using the new scheme?
Or maybe the best solution is to do a combination of both options? This way I don't have to leave the existing MD5 hashes unsecured in the database and I can migrate users to the new system for a period of time?
MD5 is not so broken that you can de-hash all the passwords easily, but assuming the quality of the passwords isn't too good, you could probably brute force them and convert them to the new, more secure format. The brokenness of MD5 results from its relatively small length (more collision surface) and its computationally simple calculation (meaning brute force attacks are more feasible than against algorithms that have larger run-time complexity, such as SHA2).
If I were you I'd do both methods you listed (because as you mentioned, getting the passwords moved over quickly is important in case your DB is hacked). First I would brute force all the brute forcible MD5 passwords and convert them to the new format. I have done this in the past, and by far the best results have been using HashCat (the Cuda or OCL flavors preferably since they use the GPU and are 200 times faster). If Hashcat is too difficult (the learning curve can be steep), then try John the Ripper. It is a lot slower than HashCat but it's a lot easier to use.
For the passwords that you can't crack, expire the user's account and have them reset the password. Or to be nicer to your users, just update the password in the database to the new format the next time they log in by sending both hashes. If the MD5 checks out, then destroy it and replace it with the new format. These are just some ideas.
EDIT:
Forgot to mention that if you want to just hash the MD5 passwords into the new format that would be just fine security-wise, though it adds another layer of complexity to your code, and where there is complexity there is room for implementation flaws. Just something to think about.
This is actually a very ingenious idea you had. Normally I would have:
waited until a user returned
realized that their stored password needs to be updated
rehashed it with the new algorithm, now that I have their (known valid) password in memory
stored the new hash in the database
The downside to only having used MD5 is that it's easy to bruteforce. By (temporarily) treating the MD5 result as an intermediate step before applying the real scrypt/Argon2, you thwart bruteforcing attempts.
Using a fast hash algorithm as a pre-processing step before the "real" password hash is not unheard of - and can even be useful.
BCrypt has a known password length limitation of 72 bytes (71 utf-8 characters and then a null terminator). Dropbox applies SHA2-512 to the incoming plaintext password before running it through bcrypt. By running a long password through a hash first, they overcome the 71 character limit.
Not only does this overcome the password length limitation (avoiding having to truncate or limit the password size), but it can prevent a Denial of Service attack when someone supplies an extraordinarily long password. BCrypt and Scrypt are susceptible to attacks with longer passwords (I don't know about Argon2).
So there can be a virtue in using a pre-hash (although not necessarily MD5).
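As a small illustration of the pre-hash step on its own, here is a Python sketch. Note the assumptions: it uses SHA-256 rather than the SHA-512 mentioned above, because the Base64 form of a 64-byte SHA-512 digest is 88 bytes and would itself exceed the 72-byte limit, and it Base64-encodes the digest because a raw digest can contain zero bytes, which some bcrypt implementations treat as a terminator. The bcrypt call itself is left out, since bcrypt is not in the standard library.

```python
import base64
import hashlib

BCRYPT_MAX_BYTES = 72  # the input limit discussed above


def prehash(password: str) -> bytes:
    """Reduce a password of any length to a fixed 44 ASCII bytes."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return base64.b64encode(digest)  # 32 bytes -> 44 Base64 characters
```

The output of prehash() is what would be handed to the slow password hash in place of the raw password.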
I don't know how you're currently storing the MD5 hashes. MD5 is 128-bit. Assuming you store it in Base64, you can easily recognize it:
MD5: nMKuihunqT2jm0b8EBnEgQ==
The desired final goal is something like scrypt:
MD5: nMKuihunqT2jm0b8EBnEgQ==
scrypt: $s0$e0801$epIxT/h6HbbwHaehFnh/bw==$7H0vsXlY8UxxyW/BWx/9GuY7jEvGjT71GFd6O4SZND0=
So when validating credentials against a saved hash, you can figure out which hash it is and use the appropriate algorithm. Your intermediate step, which adds the computational complexity, is defining your own format for:
MD5 + scrypt
something like:
MD5: nMKuihunqT2jm0b8EBnEgQ==
MD5 + scrypt: $md5s0$e0801$eX8cPtmLjKSrZBJszHIuZA==$vapd0u4tYVdOXOlcIkFmrOEIr1Ml2Ue1l2+FVOJgbcI=
scrypt: $s0$e0801$epIxT/h6HbbwHaehFnh/bw==$7H0vsXlY8UxxyW/BWx/9GuY7jEvGjT71GFd6O4SZND0=
Now you recognize the algorithm being used based on the saved hash, and can upgrade passwords in pieces.
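Recognizing the three formats shown above could be done along these lines in Python. The function name and the returned labels are invented for the sketch, and the "$md5s0$" prefix is the made-up intermediate format from this answer, not a standard.

```python
import re

# A 128-bit MD5 digest in Base64 is 22 characters followed by "==".
_MD5_BASE64 = re.compile(r"[A-Za-z0-9+/]{22}==")


def identify_scheme(stored: str) -> str:
    if stored.startswith("$md5s0$"):
        return "md5+scrypt"   # scrypt over the old MD5 hash
    if stored.startswith("$s0$"):
        return "scrypt"       # final format, scrypt over the password
    if _MD5_BASE64.fullmatch(stored):
        return "md5"          # legacy record, not yet wrapped
    return "unknown"
```

The login code can then dispatch on the returned label and upgrade "md5+scrypt" records to "scrypt" after a successful login.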
I'm switching a site over to rails. It's quite a large site with 50k+ users. The problem is, the existing password hashing method is extremely weak. I have two options:
1) Switch to a new algorithm, generate random passwords for everyone and then email them those passwords and require the change immediately after
2) Implement the new algorithm but apply the old one first and then hash the result. For example:
Password: abcdef =Algorithm 1=> xj31ndn =Algorithm 2=> $21aafadsada214
Any new passwords would need to go through the original algorithm (md5) and then have the result of that hashed if that makes any sense? Is there any disadvantage to this?
Normally it's not necessary to reset the passwords; one can just wait until the user logs in the next time.
First try to verify the entered password with the new algorithm. That way, new passwords and already converted passwords will not take any longer to verify.
If it does not match, compare it with the old hash algorithm.
Should the old hash value match, then you can calculate and store the new hash, since you know the password then.
Every password-storing system must have the option to switch to a better hash algorithm; your problem is not a one-time migration problem. Good password hash algorithms like BCrypt have a cost factor. From time to time you have to increase this cost factor (because of faster hardware), and then you need the exact same procedure as for the migration.
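To make the cost-factor case concrete, here is a minimal Python sketch of "verify, then upgrade while the password is in memory". It assumes PBKDF2 from the standard library instead of BCrypt, with the iteration count playing the role of the cost factor, and the "pbkdf2-sha256$..." string layout is invented for the example.

```python
import hashlib
import hmac
import os


def make_hash(password: str, iterations: int) -> str:
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # The cost is stored next to the salt so it can be raised later.
    return f"pbkdf2-sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_and_upgrade(password: str, stored: str, target_iterations: int):
    """Return (ok, replacement); replacement is a new hash string or None."""
    _, iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    if not hmac.compare_digest(digest, bytes.fromhex(digest_hex)):
        return False, None
    if int(iterations) < target_iterations:
        # Password is known to be correct right now, so rehash at the new cost.
        return True, make_hash(password, target_iterations)
    return True, None
```

When a replacement is returned, the caller would write it to the database in place of the old string.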
Your option 2 with hashing the old hash is a good thing, if your first algorithm is really weak, and you want to give more protection immediately. In this case you can calculate a double-hash and replace the old hash in the database with the new double-hash.
$newHashToStoreInTheDb = new_hash($oldHashFromDb)
You should also mark this password-hash (see why), so you can recognize it as double-hash. This can be done in a separate database field, or you can include your own signature. Modern password hash functions also include a signature of the algorithm, so that they can upgrade to newer algorithms, and still can verify older hashes. The example shows the signature of a BCrypt hash:
$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 __
 |
 signature of hash-algorithm = 2y = BCrypt
The verification would run like this:
Decide whether it is a double-hash.
If it is a new hash, call the new hash-function to verify the entered password, and you are done.
If it is a double-hash, compare it with the double-hash algorithm new_hash(old_hash($password)).
Should the double-hash value match, then you can calculate and store the new hash.
The simplest solution is probably to add a "password hash type" column to the database. Set it initially to "old"; when a user logs in, re-hash the password using the new algorithm and set the database type to "new".
A variant of this method is to store the hash type as part of the hash string. This works just as well, as long as you can unambiguously tell the different hash formats apart, and has the advantage that you can also include any other needed parameters (such as the salt and the work factor for key stretching) in the same string without having to add extra fields for each to your database.
For example, this is the approach typically used by modern Unix crypt(3) implementations (and the corresponding functions in various high-level languages like PHP): a classic DES-based (and horribly weak) password hash would look something like abJnggxhB/yWI, while a (slightly) more modern hash might look like $1$z75qouSC$nNVPAk1FTd0yVd62S3sjR1, where 1 specified the hashing method, z75qouSC is the salt and nNVPAk1FTd0yVd62S3sjR1 the actual hash, and the delimiter $ is chosen because it cannot appear in an old-style DES hash.
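As an illustration, splitting the two example strings above into their parts might look like this in Python. The function name and the dictionary keys are made up, and the sketch only handles the plain "$method$salt$hash" layout plus the 13-character DES form (whose first two characters are the salt); formats with extra fields, such as a work factor, would need their own branch.

```python
def parse_crypt(stored: str) -> dict:
    if not stored.startswith("$"):
        # Old-style DES hash: no "$" delimiter, 2-character salt up front.
        return {"method": "des", "salt": stored[:2], "hash": stored[2:]}
    # "$1$salt$hash" splits into ["", "1", "salt", "hash"].
    _, method, salt, digest = stored.split("$")
    return {"method": method, "salt": salt, "hash": digest}
```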
The method you suggest, where the new hashes are calculated as:
hash = new_hash( old_hash( password ) )
can be useful in some cases, since it allows all existing records to be updated without having to wait for users to log in. However, it's only safe if the old hash function preserves enough of the entropy in the passwords.
For example, even a fairly old and weak cryptographic hash function, like unsalted MD5, would be good enough, since its output depends on the entire input and has up to 128 bits of entropy, which is more than almost any password will have (and more than enough to withstand a brute force attack, anyway). On the other hand, trying to apply this construction using the old DES-based crypt(3) function as the old hash would be disastrous, since old crypt(3) would ignore all but the first 8 characters of each password (as well as the most significant bits of even those characters).
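A toy demonstration of the difference, in Python. Everything here is an assumption made for illustration: truncating_old_hash() is not the real crypt(3), just MD5 over the first 8 characters to mimic a hash that ignores the rest of the input, and the wrapper uses PBKDF2 with a fixed demo salt.

```python
import hashlib


def truncating_old_hash(password: str) -> bytes:
    # Toy stand-in for a legacy hash that ignores everything past 8 characters.
    return hashlib.md5(password[:8].encode("utf-8")).digest()


def full_old_hash(password: str) -> bytes:
    # Unsalted MD5: weak, but its output depends on the entire input.
    return hashlib.md5(password.encode("utf-8")).digest()


def wrap(old_hash: bytes) -> bytes:
    # new_hash(old_hash(password)), with a fixed salt only for the demo.
    return hashlib.pbkdf2_hmac("sha256", old_hash, b"demo-salt", 10_000)


a = "correct horse battery"
b = "correct elephant"  # shares the first 8 characters with a

# With the truncating hash, both passwords end up with the same wrapped value.
same_after_truncation = wrap(truncating_old_hash(a)) == wrap(truncating_old_hash(b))
# With a hash over the whole input, they stay distinct.
same_after_full = wrap(full_old_hash(a)) == wrap(full_old_hash(b))
```

No amount of strength in the outer hash can bring back the information the inner hash threw away.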
You can add a new field that flags all users who have updated their password with the new password method, and just update everybody with your option 2.
Combining this with a forced password update on login for all users still on the old password method will automatically move all active users to the new password method.
An alternative could be to keep both hashes available for the migration phase in separate columns of the database:
If the new hash does not exist during login, check with the old hash and save the new hash and delete the old hash.
If the new hash exists, use only this to verify.
Thus, after some time you will be left with the new hashes only - at least for those users who logged in at least one time.
Does using multiple algorithms make passwords more secure? (Or less?)
Just to be clear, I'm NOT talking about doing anything like this:
key = Hash(Hash(salt + password))
I'm talking about using two separate algorithms and matching both:
key1 = Hash1(user_salt1 + password)
key2 = Hash2(user_salt2 + password)
Then requiring both to match when authenticating. I've seen this suggested as a way to eliminate collision matches, but I'm wondering about unintended consequences, such as creating a 'weakest link' scenario or providing information that makes the user database easier to crack, since this method provides more data than a single key does, e.g. by combining information from the two hashes to find passwords more easily. Also, if collisions were truly eliminated, you could theoretically brute force the actual password, not just a matching password. In fact, you'd have to in order to brute force the system at all.
I'm not actually planning to implement this, but I'm curious whether or not this is actually an improvement over the standard practice of single key = Hash(user_salt + password).
EDIT:
Many good answers, so just to summarize here: this should have been obvious looking back, but you do create a weakest link by using both, because the matches of the weaker of the two algorithms can be tried against the other. For example, if you used a weak (fast) MD5 and a PBKDF2, I'd brute force the MD5 first, then try any match I found against the other, so by having the MD5 (or whatever) you actually make the situation worse. Also, even if both are among the more secure set (bcrypt+PBKDF2 for example), you double your exposure to one of them breaking.
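The weakest-link effect can be shown with a small Python sketch. The record layout (salt1, key1, salt2, key2), the function names and the iteration count are assumptions for the example; the point is only to count how often the attacker has to pay for the slow hash.

```python
import hashlib


def fast_hash(salt: bytes, password: str) -> bytes:
    return hashlib.md5(salt + password.encode("utf-8")).digest()


def slow_hash(salt: bytes, password: str) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 50_000)


def crack(record: dict, wordlist: list):
    """Return (password or None, number of slow-hash evaluations)."""
    slow_calls = 0
    for candidate in wordlist:
        if fast_hash(record["salt1"], candidate) != record["key1"]:
            continue  # rejected by the cheap MD5 check alone
        slow_calls += 1
        if slow_hash(record["salt2"], candidate) == record["key2"]:
            return candidate, slow_calls
    return None, slow_calls
```

With the MD5 key stored alongside, almost every guess is filtered out cheaply and the expensive PBKDF2 runs only for candidates that already matched.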
The only thing this would help with would be reducing the possibility of collisions. As you mention, there are several drawbacks (weakest link being a big one).
If the goal is to reduce the possibility of collisions, the best solution would simply be to use a single secure algorithm (e.g. bcrypt) with a larger hash.
Collisions are not a concern with modern hashing algorithms. The point isn't to ensure that every hash in the database is unique. The real point is to ensure that, in the event your database is stolen or accidentally given away, the attacker has a tough time determining a user's actual password. And the chance of a modern hashing algorithm recognizing the wrong password as the right password is effectively zero -- which may be more what you're getting at here.
To be clear, there are two big reasons you might be concerned about collisions.
A collision between the "right" password and a supplied "wrong" password could allow a user with the "wrong" password to authenticate.
A collision between two users' passwords could "reveal" user A's password if user B's password is known.
Concern 1 is addressed by using a strong/modern hashing algorithm (and avoiding terribly anti-brilliant things, like looking for user records based solely on their password hash). Concern 2 is addressed with proper salting -- a "lengthy" unique salt for each password. Let me stress, proper salting is still necessary.
But, if you add hashes to the mix, you're just giving potential attackers more information. I'm not sure there's currently any known way to "triangulate" message data (passwords) from a pair of hashes, but you're not making significant gains by including another hash. It's not worth the risk that there is a way to leverage the additional information.
To answer your question:
Having a unique salt is better than having a generic salt. H(S1 + PW1) , H(S2 + PW2)
Using multiple algorithms may be better than using a single one H1(X) , H2(Y)
(But probably not, as svidgen mentions)
However,
The spirit of this question is a bit wrong for two reasons:
You should not be coming up with your own security protocol without guidance from a security expert. I know it's not your own algorithm, but most security problems start because they were used incorrectly; the algorithms themselves are usually air-tight.
You should not be using hash(salt+password) to store passwords in a database. This is because hashing was designed to be fast - not secure. It's somewhat easy with today's hardware (especially with GPU processing) to find hash collisions in older algorithms. You can of course use a newer secure Hashing Algorithm (SHA-256 or SHA-512) where collisions are not an issue - but why take chances?
You should be looking into a Password-Based Key Derivation Function (PBKDF2), which is designed to be slow to thwart this type of attack. Usually it combines salting with a secure hashing algorithm (SHA-256) and iterates a couple hundred thousand times.
Making the function take about a second is no problem for a user logging in where they won't notice such a slowdown. But for an attacker, this is a nightmare since they have to perform these iterations for every attempt; significantly slowing down any brute-force attempt.
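A back-of-the-envelope sketch of that asymmetry in Python. The function names are made up, and the estimate assumes the attacker's hardware is about as fast as the machine doing the measuring, which is optimistic for the defender.

```python
import hashlib
import os
import time


def seconds_per_guess(iterations: int) -> float:
    # Time a single PBKDF2 evaluation at the given iteration count.
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"candidate password", salt, iterations)
    return time.perf_counter() - start


def days_to_try(wordlist_size: int, per_guess_seconds: float) -> float:
    # Total time to run every candidate against one salted hash.
    return wordlist_size * per_guess_seconds / 86_400
```

At one second per guess, a million-word list costs roughly 11.6 days for a single user's hash, while the legitimate user waits one second per login.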
Take a look at libraries supporting PBKDF encryption as a better way of doing this. Jasypt is one of my favorites for Java encryption.
See this related security question: How to securely hash passwords
and this loosely related SO question
A salt is added to password hashes to prevent the use of generic pre-built hash tables. The attacker would be forced to generate new tables based on their word list combined with your random salt.
As mentioned, hashes were designed to be fast for a reason. To use them for password storage, you need to slow them down (large number of nested repetitions).
You can create your own password-specific hashing method. Essentially, nest your preferred hashes on the salt+password and repeat.
string MyAlgorithm(string data) {
    string temp = data;
    for i = 0 to X {
        temp = Hash3(Hash2(Hash1(temp)));
    }
    return temp;
}
result = MyAlgorithm("salt+password");
Where "X" is a large number of repetitions, enough so that the whole thing takes at least a second on decent hardware. As mentioned elsewhere, the point of this delay is to be insignificant to the normal user (who knows the correct password and only waits once), but significant to the attacker (who must run this process for every combination). Of course, this is all for the sake of learning and probably simpler to just use proper existing APIs.
I am looking for a password hash function that can stay with me for years. Picking the wrong one can be fatal, as it is impossible to upgrade the existing hashes without having the users log in.
It is often suggested to use bcrypt or sha256-crypt from glibc. These use key stretching, but I do not like the fact that I am unable to extend the stretching later on.
One should be able to keep up with Moore's law.
Right now, I am considering the simple algorithm from the Wikipedia link, with SHA-256 for the hash function. That one allows me to just keep adding iterations as I see fit.
However, that algorithm is not a standard. It is therefore unlikely that I will ever be able to use the password hash with LDAP, htaccess, and so on.
Is there a better option available?
You should use SHA1 for password hashing. However, more than the algorithm, you should also consider adding a salt to passwords. Ideally a random salt should be created for each password and stored along with the password.
This is to defeat rainbow tables.
Great discussion on this: Non-random salt for password hashes
I may be coming at this from another angle, but if you are saying that you may have users who will not log in for long periods of time then that presents a big risk. The longer you allow a user to stick with the same password, the greater the risk of bruteforce from an attacker who manages to grab your password hash file somehow. Don't rely on security preventing that ever happening...
Hash functions don't go out of date that rapidly, so I would imagine you should be fine reviewing this annually, as hopefully you will have your users change passwords more often than that.
It all depends on your exact requirements, obviously, but have a think about it.
In general bcrypt or sha256 can suit the requirement nicely.
Update: You could think about popping this query across to security.stackexchange.com, as it is a security management question.