How many attempts per second can a password cracker actually make? - security

Google searches reveal that password crackers can quickly try millions of combinations and easily crack many passwords.
My research does not show whether they can practically make that many attempts so quickly in a real-world attack. How do these password-crackers actually have to interface with servers? Are they filling out the forms in an automated way? When I submit a password IRL it takes up to several seconds to get a response. This would multiply the time required for password-cracking by a large factor! This should provide a lot of protection against these password crackers!
Do password crackers distribute password attempts among many many machines so that they can try them simultaneously? Isn't this trivial for website servers to recognize as an automated attack? Is there some faster way that crackers are allowed to make many attempts (and why would servers allow it)?

How fast passwords can be cracked varies - by hash type, hardware capability, software used, and number of hashes. There's also an arms race between attackers and defenders that ebbs and flows as time goes on, so the answer to your question will only apply to the rough era that it's asked. So even though another answer was already accepted, and even though the question is probably a duplicate, it's worth re-answering definitively once in a while.
First, it sounds like we need to clarify the difference between online and offline attacks.
If someone writes software to automate the process of an online attack - trying a list of usernames and passwords against an active web interface - they will (hopefully) quickly run into mechanisms designed to stop that (for example, allowing only 5 bad attempts for a given username or from a given IP address in a specific window of time, etc).
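To make the "only 5 bad attempts in a window of time" idea concrete, here is a minimal sketch of such a limiter. The class name, the in-memory storage and the 5-per-300-seconds defaults are assumptions for illustration only; a real site would keep this state somewhere shared and persistent.

```python
import time
from collections import defaultdict, deque


class AttemptLimiter:
    """Block a key (username or IP) after too many failures in a sliding window."""

    def __init__(self, max_failures=5, window=300.0):
        self.max_failures = max_failures  # assumed policy: 5 bad attempts
        self.window = window              # assumed window: 300 seconds
        self._failures = defaultdict(deque)

    def _prune(self, key, now):
        # Drop failures that are older than the window.
        q = self._failures[key]
        while q and now - q[0] >= self.window:
            q.popleft()
        return q

    def is_blocked(self, key, now=None):
        now = time.monotonic() if now is None else now
        return len(self._prune(key, now)) >= self.max_failures

    def record_failure(self, key, now=None):
        now = time.monotonic() if now is None else now
        self._prune(key, now).append(now)
```

The login handler would call `is_blocked(username)` (and again with the client IP) before checking the password, and `record_failure(...)` after each bad attempt. With a limit like this, an online attacker gets a handful of guesses per window instead of millions per second.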
By contrast, most password cracking software is designed to perform an offline attack - where an attacker has acquired the hashed passwords stored in the back end, and can move them to their own platform to attack in bulk.
So password-cracking discussions are usually centered on offline attacks, because the threat model that matters is a threat actor stealing your hashes and attacking them on a platform of their choosing.
Offline cracking speeds are dependent entirely on a variety of factors:
how well the password was stored (how "slow" the hash is);
the hardware available to the attacker (usually, more GPUs = better);
and for well-stored hashes that are "salted", how many hashes are being attacked (fewer unique salts = faster attack, so attacking a single hash would be much faster than attacking a million salted hashes, etc.)
So to put some real numbers to your question:
One of the most common benchmarks used to compare password-cracking performance is NTLM (the hash used by Windows systems to store local passwords). It's useful for benchmarks because it is extremely common, of high interest in many attack models, and also a very "fast" (easier to crack) hash. Recently (February 2019), hashcat demonstrated the ability to crack NTLM hashes on a single NVIDIA 2080Ti card at the speed of 100 billion hashes per second (disclosure: I'm a member of Team Hashcat). At speeds like that, the vast majority of password-remembering strategies that people use are very likely to be crackable by an attacker with the right tools and know-how. Only the strongest passwords (either random, or random-passphrase based - and of sufficient length/entropy) are entirely out of reach for an attacker.
By contrast, one of the slowest hashes (and best for the defender) is bcrypt. Bcrypt has a 'cost' factor, and each increment of that factor doubles the cost for the attacker. Bcrypt hashes of cost 12 or so are recommended, but even a relatively "fast" bcrypt cost (cost 5), on the same 2080Ti GPU, can only be cracked at a rate of about 28,000 hashes per second. At that speed, only the weakest passwords can be quickly cracked, middling-strength passwords have "strength in numbers" and are harder to crack in bulk (but can still be cracked if a single person's hash is targeted), and any reasonably strong password will usually be out of reach for the attacker.
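To put the two rates side by side, here is a rough back-of-the-envelope sketch. The two rates are the ones quoted above; the choice of an 8-character password drawn from the 95 printable ASCII characters is my own assumption, and a real attacker would use wordlists and rules rather than exhausting the whole keyspace.

```python
NTLM_RATE = 100e9       # hashes/second, single 2080Ti, figure quoted above
BCRYPT5_RATE = 28_000   # hashes/second, bcrypt cost 5, figure quoted above


def seconds_to_exhaust(alphabet_size, length, guesses_per_second):
    """Worst-case time to try every password of the given length."""
    return alphabet_size ** length / guesses_per_second


# Assumption: 8 random characters from the 95 printable ASCII characters.
ntlm_hours = seconds_to_exhaust(95, 8, NTLM_RATE) / 3600
bcrypt_years = seconds_to_exhaust(95, 8, BCRYPT5_RATE) / (365.25 * 86400)

print(f"NTLM:          about {ntlm_hours:.1f} hours")
print(f"bcrypt cost 5: about {bcrypt_years:.0f} years")
```

Under those assumptions the same keyspace takes well under a day for NTLM and several thousand years for bcrypt at cost 5 on one card, which is the whole point of a slow hash.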
Again, these are point-in-time answers, and have to be adapted to your specific threat model.
Also, keep in mind that password-hash leaks are forever. Defenders should store passwords today in a way that will be resistant to cracking for years into the future, including estimation of future hardware capabilities, Moore's law, etc.

Hashcat is the fastest and most advanced password cracking utility. It can run on CPUs and GPUs, and can be parallelized across multiple cores and multiple boards. The number of passwords it can test per second depends on the password protection mechanism in use. See a benchmark here. Modern password hashing schemes such as bcrypt and Argon2 have features that slow down fast password searches, such as Argon2's memorySizeKB and parallelism parameters.
A system administrator may use Hashcat to test their users' passwords. If a password is not found within a chosen time threshold, it is a good one. Otherwise, ask the user to change it. Of course, there should also be rules that prevent simple passwords: minimum length, digits, mixed alphanumerics, etc.
Once attackers gain access to a system, they download the password file and run Hashcat against it. They do not enter passwords into the login form again and again; if they did, the login system would start delaying responses or lock the user account.
The real benefit to the attacker is that people tend to reuse the same password on other sites. Once attackers recover some users' passwords from hacked site x, they can try them on another site to see whether the password is the same.

Related

Best Practices: Salting & peppering passwords?

I came across a discussion in which I learned that what I'd been doing wasn't in fact salting passwords but peppering them, and I've since begun doing both with a function like:
hash_function($salt.hash_function($pepper.$password)) [multiple iterations]
Ignoring the chosen hash algorithm (I want this to be a discussion of salts & peppers and not specific algorithms but I'm using a secure one), is this a secure option or should I be doing something different? For those unfamiliar with the terms:
A salt is a randomly generated value usually stored with the string in the database designed to make it impossible to use hash tables to crack passwords. As each password has its own salt, they must all be brute-forced individually in order to crack them; however, as the salt is stored in the database with the password hash, a database compromise means losing both.
A pepper is a site-wide static value stored separately from the database (usually hard-coded in the application's source code) which is intended to be secret. It is used so that a compromise of the database would not cause the entire application's password table to be brute-forceable.
Is there anything I'm missing and is salting & peppering my passwords the best option to protect my user's security? Is there any potential security flaw to doing it this way?
Note: Assume for the purpose of the discussion that the application & database are stored on separate machines, do not share passwords etc. so a breach of the database server does not automatically mean a breach of the application server.
Ok. Seeing as I need to write about this over and over, I'll do one last canonical answer on pepper alone.
The Apparent Upside Of Peppers
It seems quite obvious that peppers should make hash functions more secure. I mean, if the attacker only gets your database, then your users' passwords should be secure, right? Seems logical, right?
That's why so many people believe that peppers are a good idea. It "makes sense".
The Reality Of Peppers
In the security and cryptography realms, "make sense" isn't enough. Something has to be provable and make sense in order for it to be considered secure. Additionally, it has to be implementable in a maintainable way. The most secure system that can't be maintained is considered insecure (because if any part of that security breaks down, the entire system falls apart).
And peppers fit neither the provable nor the maintainable models...
Theoretical Problems With Peppers
Now that we've set the stage, let's look at what's wrong with peppers.
Feeding one hash into another can be dangerous.
In your example, you do hash_function($salt . hash_function($pepper . $password)).
We know from past experience that "just feeding" one hash result into another hash function can decrease the overall security. The reason is that both hash functions can become a target of attack.
That's why algorithms like PBKDF2 use special operations to combine them (hmac in that case).
The point is that while it's not a big deal, it is also not a trivial thing to just throw around. Crypto systems are designed to avoid "should work" cases, and instead focus on "designed to work" cases.
While this may seem purely theoretical, it's in fact not. For example, Bcrypt cannot accept arbitrary passwords. So passing bcrypt(hash(pw), salt) can indeed result in a far weaker hash than bcrypt(pw, salt) if hash() returns a binary string.
Working Against Design
The way bcrypt (and other password hashing algorithms) were designed is to work with a salt. The concept of a pepper was never introduced. This may seem like a triviality, but it's not. The reason is that a salt is not a secret. It is just a value that can be known to an attacker. A pepper on the other hand, by very definition is a cryptographic secret.
The current password hashing algorithms (bcrypt, pbkdf2, etc) all are designed to only take in one secret value (the password). Adding in another secret into the algorithm hasn't been studied at all.
That doesn't mean it is not safe. It means we don't know if it is safe. And the general recommendation with security and cryptography is that if we don't know, it isn't.
So until algorithms are designed and vetted by cryptographers for use with secret values (peppers), current algorithms shouldn't be used with them.
Complexity Is The Enemy Of Security
Believe it or not, Complexity Is The Enemy Of Security. Making an algorithm that looks complex may be secure, or it may be not. But the chances are quite significant that it's not secure.
Significant Problems With Peppers
It's Not Maintainable
Your implementation of peppers precludes the ability to rotate the pepper key. Since the pepper is used at the input to the one way function, you can never change the pepper for the lifetime of the value. This means that you'd need to come up with some wonky hacks to get it to support key rotation.
This is extremely important as it's required whenever you store cryptographic secrets. Not having a mechanism to rotate keys (periodically, and after a breach) is a huge security vulnerability.
And your current pepper approach would require every user to either have their password completely invalidated by a rotation, or wait until their next login to rotate (which may be never)...
Which basically makes your approach an immediate no-go.
It Requires You To Roll Your Own Crypto
Since no current algorithm supports the concept of a pepper, it requires you to either compose algorithms or invent new ones to support a pepper. And if you can't immediately see why that's a really bad thing:
Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can't break.
Bruce Schneier
NEVER roll your own crypto...
The Better Way
So, out of all the problems detailed above, there are two ways of handling the situation.
Just Use The Algorithms As They Exist
If you use bcrypt or scrypt correctly (with a high cost), all but the weakest dictionary passwords should be statistically safe. The current record for hashing bcrypt at cost 5 is 71k hashes per second. At that rate even a 6 character random password would take years to crack. And considering my minimum recommended cost is 10, that reduces the hashes per second by a factor of 32. So we'd be talking only about 2200 hashes per second. At that rate, even some dictionary phrases or modifications may be safe.
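The factor of 32 comes from the fact that each bcrypt cost step doubles the work, so going from cost 5 to cost 10 is 2^5. A tiny sketch of that arithmetic (the function name is just for illustration):

```python
def bcrypt_rate_at_cost(known_rate, known_cost, target_cost):
    """Scale a measured hashes/second figure to another bcrypt cost.

    Each +1 in cost doubles the work, so the rate halves per step.
    """
    return known_rate / 2 ** (target_cost - known_cost)


rate_cost_10 = bcrypt_rate_at_cost(71_000, known_cost=5, target_cost=10)
rate_cost_12 = bcrypt_rate_at_cost(71_000, known_cost=5, target_cost=12)
print(rate_cost_10, rate_cost_12)
```

This only rescales the single benchmark figure quoted above; actual rates depend on the attacker's hardware.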
Additionally, we should be checking for those weak classes of passwords at the door and not allowing them in. As password cracking gets more advanced, so should password quality requirements. It's still a statistical game, but with a proper storage technique, and strong passwords, everyone should be practically very safe...
Encrypt The Output Hash Prior To Storage
There exists in the security realm an algorithm designed to handle everything we've said above. It's a block cipher. It's good, because it's reversible, so we can rotate keys (yay! maintainability!). It's good because it's being used as designed. It's good because it gives the user no information.
Let's look at that line again. Let's say that an attacker knows your algorithm (which is required for security, otherwise it's security through obscurity). With a traditional pepper approach, the attacker can create a sentinel password, and since he knows the salt and the output, he can brute force the pepper. Ok, that's a long shot, but it's possible. With a cipher, the attacker gets nothing. And since the salt is randomized, a sentinel password won't even help him/her. So the best they are left with is to attack the encrypted form. Which means that they first have to attack your encrypted hash to recover the encryption key, and then attack the hashes. But there's a lot of research into the attacking of ciphers, so we want to rely on that.
TL/DR
Don't use peppers. There are a host of problems with them, and there are two better ways: not using any server-side secret (yes, it's ok) and encrypting the output hash using a block cipher prior to storage.
First we should talk about the exact advantage of a pepper:
The pepper can protect weak passwords from a dictionary attack, in the special case, where the attacker has read-access to the database (containing the hashes) but does not have access to the source code with the pepper.
A typical scenario would be SQL-injection, thrown away backups, discarded servers... These situations are not as uncommon as it sounds, and often not under your control (server-hosting). If you use...
A unique salt per password
A slow hashing algorithm like BCrypt
...strong passwords are well protected. It's nearly impossible to brute force a strong password under those conditions, even when the salt is known. The problem is the weak passwords, those that are part of a brute-force dictionary or are derivations of such entries. A dictionary attack will reveal those very fast, because you test only the most common passwords.
The second question is how to apply the pepper.
An often recommended way to apply a pepper, is to combine the password and the pepper before passing it to the hash function:
$pepperedPassword = hash_hmac('sha512', $password, $pepper);
$passwordHash = bcrypt($pepperedPassword);
There is another even better way though:
$passwordHash = bcrypt($password);
$encryptedHash = encrypt($passwordHash, $serverSideKey);
This not only lets you add a server-side secret, it also lets you exchange the $serverSideKey, should this be necessary. This method involves a bit more work, but once the code exists (as a library) there is no reason not to use it.
The point of salt and pepper is to increase the cost of a pre-computed password lookup, called a rainbow table.
In general, trying to find a collision for a single hash is hard (assuming the hash is secure). However, for short inputs it is possible to use a computer to generate all possible hashes and store them in a lookup table on a hard disk. This is called a Rainbow Table. If you create a rainbow table you can then go out into the world and quickly find plausible passwords for any (unsalted, unpeppered) hash.
The point of a pepper is to make the rainbow table needed to hack your password list unique to your site, so the attacker has to spend more time constructing it.
The point of the salt however is to make the rainbow table for each user be unique to the user, further increasing the complexity of the attack.
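A small sketch of why a per-user salt breaks a precomputed table. Note the assumptions: plain SHA-256 is used only because it is in the standard library and keeps the example short (it is far too fast to be a real password hash), and a plain dictionary stands in for a real rainbow table, which is a more compact structure with the same purpose.

```python
import hashlib
import secrets


def unsalted_hash(password):
    # Illustration only: a single fast hash is NOT safe password storage.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password, salt):
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# The attacker precomputes this once and reuses it against every leak.
common = ["123456", "password", "qwerty"]
lookup = {unsalted_hash(p): p for p in common}

# Unsalted leak: one dictionary lookup recovers the password.
found_unsalted = lookup.get(unsalted_hash("qwerty"))

# Salted leak: the same weak password no longer matches the table,
# so the attacker has to redo the work for this user's salt.
user_salt = secrets.token_bytes(16)
found_salted = lookup.get(salted_hash("qwerty", user_salt))

print(found_unsalted, found_salted)
```

The salt does not make the weak password strong; it only forces the attacker to attack each hash separately instead of all at once.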
Really the point of computer security is almost never to make it (mathematically) impossible, just mathematically and physically impractical (for example in secure systems it would take all the entropy in the universe (and more) to compute a single user's password).
I want this to be a discussion of salts & peppers and not specific algorithms but I'm using a secure one
Every secure password hashing function that I know of takes the password and the salt (and the secret/pepper if supported) as separate arguments and does all of the work itself.
Merely by the fact that you're concatenating strings and that your hash_function takes only one argument, I know that you aren't using one of those well tested, well analyzed standard algorithms, but are instead trying to roll your own. Don't do that.
Argon2 won the Password Hashing Competition in 2015, and as far as I know it's still the best choice for new designs. It supports pepper via the K parameter (called "secret value" or "key"). I know of no reason not to use pepper. At worst, the pepper will be compromised along with the database and you are no worse off than if you hadn't used it.
If you can't use built-in pepper support, you can use one of the two suggested formulas from this discussion:
Argon2(salt, HMAC(pepper, password)) or HMAC(pepper, Argon2(salt, password))
Important note: if you pass the output of HMAC (or any other hashing function) to Argon2 (or any other password hashing function), either make sure that the password hashing function supports embedded zero bytes or else encode the hash value (e.g. in base64) to ensure there are no zero bytes. If you're using a language whose strings support embedded zero bytes then you are probably safe, unless that language is PHP, but I would check anyway.
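Here is a minimal sketch of the first formula with the zero-byte precaution applied. Assumptions: Python's standard library has no Argon2, so `hashlib.scrypt` stands in for it here, and the function names and scrypt parameters are illustrative rather than a recommendation.

```python
import base64
import hashlib
import hmac
import secrets


def pre_hash_with_pepper(password, pepper):
    """HMAC(pepper, password), base64-encoded so it has no zero bytes."""
    mac = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(mac)


def hash_password(password, pepper, salt):
    # scrypt is a stand-in for Argon2 (assumption: stdlib only).
    return hashlib.scrypt(
        pre_hash_with_pepper(password, pepper),
        salt=salt, n=2**14, r=8, p=1, dklen=32,
    )


pepper = secrets.token_bytes(32)   # kept outside the database
salt = secrets.token_bytes(16)     # stored next to the hash
stored = hash_password("correct horse", pepper, salt)
```

Verification recomputes `hash_password` with the stored salt and compares with `hmac.compare_digest`. The base64 step costs nothing and removes the question of whether the underlying password hash handles embedded zero bytes.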
Can't see storing a hardcoded value in your source code as having any security relevance. It's security through obscurity.
If a hacker acquires your database, he will be able to start brute forcing your user passwords. It won't take long for that hacker to identify your pepper if he manages to crack a few passwords.

Are there any security measures that are resistant to a brute force attack?

I'm not talking in particular about encryption, but security as a whole. Are there any security measures that can be put in place to protect data and/or a system that can withstand even a hypothetical amount of resources being pitted against it over a hypothetical amount of time?
I think the answer is no, but I thought I'd double check before saying this out loud to people because I'm no security expert.
UPDATE: I should point out, I'm not asking this because I need to implement something. It's idle curiosity. I should also mention that I'm ok dealing with hypotheticals here. Feel free to bring things like quantum computing into the equation if there's any relevance.
The One-time pad is such an encryption technique: it's fundamentally secure against brute force, in other words, information-theoretically secure. If you don't have the key, it cannot be "broken" regardless of what computation power you throw at it. The trick is that it's impossible to distinguish the correct answer from all other possible answers, because every answer is equally likely.
Read more on Wikipedia
Unfortunately the one-time pad is almost useless in practice, because the key must be as long as your plaintext, the key may never be re-used, and it has to be random. All of this means that you can't derive the key from a memorable password, so you need a secure storage method for the key itself. But if you can already secure a massive key, you might as well put your plaintext there without encryption.
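A short sketch of why brute force gets nowhere against a one-time pad: for any ciphertext, every plaintext of the same length is produced by some key, so trying all keys just enumerates all possible messages. The messages below are made up for the example.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(plaintext))   # random, as long as the message
ciphertext = xor_bytes(plaintext, key)

# An attacker trying every key will also hit this one, which "decrypts"
# the same ciphertext to a completely different, equally plausible message.
decoy = b"RETREAT AT TEN"
decoy_key = xor_bytes(ciphertext, decoy)

print(xor_bytes(ciphertext, key))        # the real message
print(xor_bytes(ciphertext, decoy_key))  # the decoy message
```

Nothing in the ciphertext tells the attacker which of the two keys (or any other) was the real one.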
The first thing that comes to mind is shutting down access (at least for some time) after a number of failed attempts. Such as a bank card becoming invalid after the wrong PIN has been used a couple of times, or a phone that deletes its own data after you fail to unlock it repeatedly.
Of course, this will not work with files that the attacker can copy to his own machine.
First of all, you'd be better off trying this on ITsec.SE.
Now, to answer your question:
Yes, of course there are.
Brute force attacks can accomplish two things: "guessing" some sort of secret (e.g. password, encryption key, etc), and overwhelming resources (i.e. flooding, or Denial of service - DoS).
Any countermeasures aimed at preventing any other form of attack, would be irrelevant to bruteforce.
For example, take the standard recommendations to protect against SQL Injection: input validation, stored procedures (or parameterized queries), command/parameter objects, and the like.
What would you try to bruteforce here? If code was written correctly, there is no "secret" to guess.
Now, if you're asking, "How to prevent brute force attacks?", well the answer would depend on what the attacker is trying to brute force.
Assuming that we're talking about bruteforcing a password / login screen, there are several options: strong password policy (to make it harder), account lockout (to limit rate of bruteforce attempts), throttling (again limits the attempt rate), and more.
Ideally no, but in the solution you provide you can typically introduce an additional step: the data that could be subjected to direct brute force can be obfuscated to make the attack tough or meaningless.
For example, a password that is encrypted and sent over the wire can be subjected to brute force, but if it is first obfuscated by transforming it into some other form, then even brute force may not help unless the attacker also knows the transforming functions.
You can always try to look for repeated / large volume attempts (to log in for example) and ban the source (IP) temporarily or even permanently.
Talking about a distributed attack it's much more difficult of course, but you can still issue mass temporary bans and scale services down for unknown users.
I'm not sure if there's any silver bullet, just be creative :) Having a home-brewed solution will probably improve your chances, as there are no known exploits for it.

Increasing security of web-based login

Right now my login system is the following:
Password must be at least 8 characters long, and contain at least one upper and lowercase letter, a number and a symbol.
Password can't contain the username as its substring.
Username, salted+hashed (using SHA2) password stored on db.
The nonce (salt) is unique for each user and stored as plaintext along with the username and password.
The whole login process can only be made over TLS
How would you rank the effectiveness of the following measures to increase security?
Increase password length
Force the user to change the password every X period of time, and the new password can't be any of the last Y previous passwords
Increase nonce size from 32 bytes to 64 bytes (removed for uselessness)
Encrypt the salt using AES, with the key available only to the application doing authentication
Rehash the password multiple times
Use a salt that's a combination of a longer, application-wide salt + unique user salt on the db.
I am not very fond of 1 and 2 because it can inconvenience the user though.
4 and 6 of course are only effective when an attacker has compromised the db (eg: via SQL injection) but not the filesystem where the application is in.
The answers may depend somewhat on the nature of the website, its users and attackers. For instance, is it the kind of site where crackers might target specific accounts (perhaps "admin" accounts) or is it one where they'd just want to get as many accounts as possible? How technical are the users and how motivated are they to keep their own account secure? Without knowing the answers, I'll assume they're not technical and not motivated.
Measures that might make a difference
5) Rehash the password multiple times. This can slow down all brute force attacks significantly - hash 1000 times and brute force attacks become 1000 times slower.
4) Encrypt the salt using AES, with the key available only to the application doing authentication. How would you make it available only to the application? It has to be stored somewhere and, chances are, if the app is compromised the attacker can get it. There might be some attacks directly against the DB where this makes a difference, so I wouldn't call this useless, but it's probably not worthwhile. If you do make the effort, you might as well encrypt the password itself and any other sensitive data in the DB.
6) Use a salt that's a combination of a longer, application-wide salt + unique user salt on the db. If you're only concerned about the password then yes, this would be a better way of achieving the same result as 4) and, of course, it's very easy to implement.
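For measure 5, rather than writing the rehash loop by hand, a standard key-stretching function does the same job. A minimal sketch using PBKDF2 from the Python standard library; the record layout (`iterations$salt$digest`) and the function names are assumptions for illustration, chosen so the iteration count can be raised later without breaking old records.

```python
import hashlib
import hmac
import secrets


def stretch(password, salt, iterations):
    """Run the underlying hash `iterations` times via PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )


def make_record(password, iterations=100_000):
    # Store the iteration count with the hash (assumed layout).
    salt = secrets.token_bytes(16)
    digest = stretch(password, salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def check_record(password, record):
    iterations, salt_hex, digest_hex = record.split("$")
    candidate = stretch(password, bytes.fromhex(salt_hex), int(iterations))
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Every guess now costs the attacker `iterations` hash computations, which is the "1000 times slower" effect described above, while a legitimate login pays that cost only once.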
Ineffective measures
3) Increase nonce size from 32 bytes to 64 bytes. Computing rainbow tables is already completely impractical with any salt, so this would only make a difference if the salt was not known to the attacker. However, if they can get the hashed password they could also get the salt.
Ineffective and annoying measures
1) Increase password length. Increasing password length beyond 8 won't make a practical difference to the brute force time.
2) Force the user to change the password. I agree, this will always be worked around. In fact, it may make the site less secure, because people will write down the password somewhere!
Increasing password length adds a few bits of entropy to the password.
Requiring frequent password changes will generally force the users to use less secure passwords. They will need to figure out what the password is in May, June, July. Some#05x, Some#06x, Some#07x.
Can't say for sure, but I would expect the password length to be more significant in your case.
Slightly more secure. But if someone gains access to your data, they can likely gain access to the key.
Other than increasing CPU costs, you won't gain anything.
There are a number of well tried one-way password encryption algorithms which are quite secure. I would use one of them rather than inventing my own. Your original items 1, 2, and 5 are all good. I would drop 3, and 4.
You could allow pass phrases to ease password length issues.
I would suggest that you read http://research.microsoft.com/en-us/um/people/cormac/papers/2009/SoLongAndNoThanks.pdf
This paper discusses part of the reason it is hard to get users to follow good security advice; in short the costs lie with the users and they experience little or no benefit.
Increasing the password length and forcing more complex passwords can reduce security by leading to one or both of: reused passwords between sites/applications, and writing down of passwords.
3 Increase nonce size from 32 bytes to 64 bytes
4 Encrypt the salt using AES, with the key available only to the application doing authentication
5 Rehash the password multiple times
These steps only affect situations where the password file (DB columns) is stolen and visible to the attacker. The nonce only defeats pre-hashing (rainbow tables), but that's still a good thing and should be kept.
(Again, under the assumption you're trying to minimize the impact of a compromised DB.) Encrypting the nonce means the attacker has an extra brute-force step, but you didn't say where the encryption key for the nonce is stored. It's probably safe to assume that if the DB is compromised the nonce will be plaintext or trivially decrypted. So, the attacker's effort is once again a brute-force against each hash.
Rehashing just makes a brute-force attack take longer, but possibly not much more so depending on your assumptions about the potential attacker's cracks/second.
Regardless of your password composition requirements a user can still make a "more guessable" password like "P#ssw0rd" that adheres to the rule. So, brute force is likely to succeed for some population of users in any case. (By which I mean to highlight taking steps to prevent disclosure of the passwords.)
The password storage scheme sounds pretty good in terms of defense against disclosure. I would make sure other parts of the auth process are also secure (rate limiting login attempts, password expiration, SQL injection countermeasures, hard-to-predict session tokens, etc.) rather than over-engineering this part.
For existing:
e1: I see where you're coming from, but these rules are pretty harsh - it certainly increases security, but at the expense of user experience. As vulkanino mentions this is going to deter some users (depends on your audience - if this is an intranet application they have no choice... but they'll have a yellow sticky with their password on their monitor - cleaners and office loiterers are going to be your biggest issue).
e2: Is a start, but you should probably check against a list of bad passwords (eg: 'password', 'qwerty', the site URL)... there are several lists on the net to help you with this. Given your e1 conditions such a scan might be moot - but then surely users aren't going to have a username with 8 chars, upper+lower, a symbol and a number?
e3: Good call - prevents rainbow table attacks.
e4: Unique salt prevents identification of multiple users with the same password, but there are other ways to make it unique - by using the username as a secondary salt+hash for example.
e5: Solid, although TLS has built in fall-backs, the lower end TLS protocols aren't very secure so you may want to check you're not allowing these connections.
New ideas:
n1+n2: e1 is already painful enough.
n3: No discernible benefit
n4: No discernible benefit - whatever the encryption process is would be available in the code, and so also likely compromised. That is unless your DB and App servers are two different machines hardened for their own tasks - in this case anything you can avoid storing with the password is helpful in the event the DB is compromised (in this case dropping unique salt from the database will help).
n5: Rehashing decreases brute force attack speed through your application - a worthwhile idea in many ways (within reason - a user won't notice a quarter second login delay, but will notice a 5 second delay... note this is also a moving target as hardware gets better/faster/stronger/work it)
Further points:
Your security is only as good as the systems it is stored on and processed through. Any system that could be compromised, or already has a back door (think: number of users who can access the system - server admins, DBAs, coders, etc) is a weak link.
Attack detection scripts in your application could be beneficial - but you should be aware of Denial of Service (DoS) attacks. Tracking failed logins and source is a good start - but be aware if you lock the account at 5 failures, someone could DoS a known account (including the admin account). Being unable to use the App may be as bad as losing control of your account. Multi-hash (n5) slows down this process, picking a slower hash algorithm is a good step too, and perhaps building in re-attempt delays will help too (1 second on first fail, 2 on second, etc) - but again; be DoS aware. Two basic things you might want to filter: (1) multi attacks from the same source/IP (slow down, eventually prevent access from that IP - but only temporarily since it could be a legitimate user) perhaps further testing for multiple sets of multi attacks. (2) Multi attacks from different IPs - the first approach only locks a single user/source, but if someone uses a bot-net, or an anonymizing service you'll need to look for another type of suspicious activity.
Is it possible to piggy-back off another system? You could use an LDAP, or Active Directory server in your domain or use OpenID or OAuth or something similar. Save yourself all these headaches by offloading the work ;) {Inter-system security still needs to be addressed if you're a middle man} Plus the fewer passwords users have to remember (and rules attached to each) the more likely they are to have a good password, that isn't written down, etc.
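The re-attempt delay mentioned in the points above (1 second on first fail, 2 on second, etc) can be sketched as below. The linear growth and the 30 second cap are my assumptions; the cap is there so that an attacker cannot push a legitimate user's delay up without bound, which keeps the DoS concern in check.

```python
def retry_delay_seconds(failed_attempts, step=1.0, cap=30.0):
    """Delay to impose before the next login attempt is processed.

    Grows by `step` seconds per consecutive failure, up to `cap`.
    """
    if failed_attempts <= 0:
        return 0.0
    return min(step * failed_attempts, cap)


for failures in (0, 1, 2, 5, 100):
    print(failures, retry_delay_seconds(failures))
```

The failure counter would be reset on a successful login, and could be tracked per account, per source IP, or both, as discussed above.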
I don't consider any of those things to increase your password security. The security of the password stored in the database is only relevant if you expect someone to obtain a copy of the database. Using a (perceived) stronger hash function in the database only obfuscates your application. In fact a salted MD5 would be fine (I am aware of the attacks on MD5, and don't believe any of them to be relevant to password hashing).
You might be better off relaxing the password rules to get better security: if you require at least one upper and one lower case LATIN letter, you effectively force non-Latin keyboard users to use alien letters (try typing upper and lower case Latin letters on a Cyrillic keyboard). This makes them more likely to write the password down.
The best security would be to avoid using passwords entirely. If it is an enterprise application in a corporation that uses Active Directory, consider delegating authentication instead of writing your own. Other approaches can include using an Information Card by making your application claims-aware.
How about hashing the password in the client browser with MD5/SHA, then treating the hash as the user's password on the server side? This way the password isn't in plain text when it travels over SSL/TLS, nor is it ever in plain text on the server. Thus even if it is stolen by hackers at any point (man-in-the-middle, server/db hacks), it cannot be used to gain access to other web services where the user might have the same email/username+password combo (yes, it's very common...).
It doesn't help with YOUR site login security directly, but it certainly would stop hacked password lists spreading around the net if some server has been hacked. It might work to your advantage as well: if another, hacked site applies the same approach, your site's users aren't compromised.
It also guarantees all users will have a decent alphanumeric password with enough length and complexity, so you can perhaps relax your requirements for password strength a little :-)

Hashing SSNs and other limited-domain information

I'm currently working on an application where we receive private health information. One of the biggest concerns is with the SSN. Currently, we don't use the SSN for anything, but in the future we'd like to be able to use it to uniquely identify a patient across multiple facilities. The only way I can see to do that reliably is through the SSN. However, we (in addition to our customers) REALLY don't want to store the SSN.
So naturally, I thought of just SHA hashing it since we're just using it for identification. The problem with that is that if an attacker knows the problem domain (an SSN), then they can focus on that domain. So it's much easier to calculate the billion SSNs rather than a virtually unlimited number of passwords. I know I should use a site salt and a per-patient salt, but is there anything else I can do to prevent an attacker from revealing the SSN? Instead of SHA, I was planning on using BCrypt, since Ruby has a good library and it handles scalable complexity and salting automagically.
It's not going to be used as a password. Essentially, we get messages from many facilities, and each describes a patient. The only thing close to a globally unique identifier for a patient is the SSN. We are going to use the hash to identify the same patient at multiple facilities.
The algorithm for generating Social Security Numbers was created before the concept of a modern hacker, and as a consequence they are extremely predictable. Using an SSN for authentication is a very bad idea; it really doesn't matter what cryptographic primitive you use or how large your salt value is. At the end of the day the "secret" that you are trying to protect doesn't have much entropy.
If you never need to know the plain text then you should use SHA-256. SHA-256 is a very good function to use for passwords.
If you seriously want to hash a social security number in a secure way, do this:
Find out how much entropy is in an SSN (hint: there is very little. Far less than a randomly chosen 9 digit number).
Use any hashing algorithm.
Keep fewer (half?) bits than there is entropy in an SSN.
Result:
Pro: Secure hash of an SSN because of a large number of hash collisions.
Pro: Your hashes are short and easy to store.
Con: Hash collisions.
Con: You can't use it for a unique identifier because of Con#1.
Pro: That's good because you really really need to not be using SSNs as identifiers unless you are the Social Security Administration.
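The truncation idea in the steps above could be sketched roughly as follows. This is only an illustration: SHA-256 stands in for "any hashing algorithm", 15 bits is an assumed value for "half" of the at-most-30 bits in a nine-digit number, and the function name and sample SSN are made up.

```python
import hashlib
import math

# Upper bound on SSN entropy: nine decimal digits, assuming every
# combination were equally likely (real SSNs have less than this).
SSN_ENTROPY_BITS = math.log2(10**9)  # just under 30 bits


def truncated_ssn_hash(ssn: str, keep_bits: int = 15) -> int:
    """Hash an SSN and keep only the top `keep_bits` bits of the digest.

    `keep_bits=15` is an assumed choice (about half the upper bound above).
    """
    digits = ssn.replace("-", "")
    digest = hashlib.sha256(digits.encode("ascii")).digest()
    as_int = int.from_bytes(digest, "big")
    # Shift away everything except the leading keep_bits bits.
    return as_int >> (256 - keep_bits)


# Made-up SSN, for illustration only.
bucket = truncated_ssn_hash("123-45-6789")

# Average number of nine-digit values that share each bucket.
ssns_per_bucket = 10**9 / 2**15
```

With so few bits kept, tens of thousands of possible SSNs land in every bucket, which is exactly why the result is hard to invert and also why it cannot serve as a unique identifier.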
First, much applause and praise for storing a hash of the SSN.
It appears as if you're reserving the SSNs as a sort of 'backup username.' In this case, you need another form of authentication besides the username - a password, a driver's license number, a passport number, proof of residence, etcetera.
Additionally, if you're concerned that an attacker is going to predict the top 10,000 SSNs for a patient born in 1984 in Arizona, and attempt each of them, then you can put in an exponentially increasing rate limiter in your application.* For additional defense, build in a notification system that alerts a sys-admin when it appears that there is an unusually high number of failed login attempts.**
*Example exponentially increasing rate limiter:
After each failed request, delay the next request by (1.1^N) seconds, where N is the number of failed requests from that IP. Track IP and failed login attempts in a DB table; shouldn't add too much load, depending on the audience of your application (do you work for Google?).
**In the case where an attacker has access to multiple IPs, the notification will alert a sys-admin who can use his or her judgment to see if you have an influx of stupid users or it's a malicious attempt.
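As a rough sketch of the exponentially increasing rate limiter described above: the class and method names here are hypothetical, and an in-memory dictionary stands in for the DB table mentioned in the text.

```python
# Growth factor taken from the text: delay = 1.1 ** N seconds.
BASE = 1.1


class FailedLoginTracker:
    """Tracks failed logins per IP (in memory; a DB table in a real app)."""

    def __init__(self) -> None:
        self.failures: dict[str, int] = {}

    def record_failure(self, ip: str) -> None:
        self.failures[ip] = self.failures.get(ip, 0) + 1

    def record_success(self, ip: str) -> None:
        # Assumption: a successful login resets the counter for that IP.
        self.failures.pop(ip, None)

    def required_delay(self, ip: str) -> float:
        """Seconds to wait before the next attempt from this IP is served."""
        n = self.failures.get(ip, 0)
        return BASE ** n if n else 0.0


tracker = FailedLoginTracker()
for _ in range(10):
    tracker.record_failure("203.0.113.7")  # documentation-range IP

delay = tracker.required_delay("203.0.113.7")
```

With this factor the delay stays small for a user who mistypes a few times (roughly 2.6 seconds after 10 failures) but grows to roughly two minutes per attempt after 50 failures.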

Crypto, hashes and password questions, total noob?

I've read several stackoverflow posts about this topic, particularly this one:
Secure hash and salt for PHP passwords
but I still have a few questions, I need some clarification, please let me know if the following statements are true and explain your comments:
If someone has access to your database/data, then they would still have to figure out your hashing algorithm and your data would still be somewhat secure, depending on your algorithm? All they would have is the hash and the salt.
If someone has access to your database/data and your source code, then it seems like no matter what you do, your hashing algorithm can be reverse engineered, and the only thing you would have on your side would be how complex and time consuming your algorithm is?
It seems like the weakest link is: how secure your own systems are and who has access to it?
Lasse V. Karlsen ... brings up a good point, if your data is compromised then game over ... my follow up question is: what types of attacks are these hashes trying to protect against? I've read about rainbow table and dictionary attacks (brute force), but how are these attacks administered?
The security of cryptographic algorithms is always in their secret input. Reasonable cryptanalysis is based on an assumption that any attacker knows what algorithm you use. Good cryptographic hashes are non-invertible and collision resistant. This means that there's still a lot of work to do going from a hash to the value that generated it, regardless of whether you know the algorithm applied.
If you used a secure hash, access to the hash, salt, and algorithm will still leave a lot of work for a would-be attacker.
Yes, a secure hash puts a very hard to invert algorithm on your side. Note that this inversion is not 'reverse-engineering'.
The weak link is probably the processes and procedures that get those password hashes into the database. There are all sorts of ways to screw up and store sensitive data in the clear.
As I noted in a comment, there are attacks that these measures defend against. First, knowing the password may lead to authorization to do things beyond what the contents of the database suggest. Second, those passwords may be used elsewhere, and you expose your users to risk by revealing their passwords as a result of a break-in. Third, with hashing, an insider can't exploit read-only access to the database (subject to less auditing, etc.) to impersonate a user.
Dictionaries and rainbow tables are techniques for accelerating hash inversion.
Your question is about using passwords as an authentication mechanism and how to securely store these passwords in a database using a hash. As you probably already know, the goal is to be able to verify passwords without storing them in clear text in the database. In this context let me try to answer each of your questions:
If someone has access to your database/data, then they would still have to figure out your hashing algorithm and your data would still be somewhat secure, depending on your algorithm? All they would have is the hash and the salt.
The basic idea of hashing passwords is to assume that the attacker knows the hashing algorithm and has access to both the hash and the salt. By selecting a cryptographically strong hash function and a suitable salt value that is different for each password, the computational effort required to guess the password is so high that the cost exceeds the possible gain the attacker can get from guessing the password. So to answer your question, hiding the hash function does not improve the security.
If someone has access to your database/data and your source code, then it seems like no matter what you do, your hashing algorithm can be reverse engineered, and the only thing you would have on your side would be how complex and time consuming your algorithm is?
You should always use a well-known (and suitably strong) hashing algorithm, and reverse engineering this algorithm is not meaningful as there is nothing hidden in your code. If you didn't mean reverse engineer but actually reverse, then yes, the passwords are protected by the complexity of reversing the hash function (or guessing a password that matches a hash value). Good hash functions make this very hard.
It seems like the weakest link is: how secure your own systems are and who has access to it?
In general this is true, but when it comes to securing passwords by storing them as hashes you should still assume that the attacker has full access to the hashes and design your system accordingly by choosing an appropriate hash function and using salts.
What types of attacks are these hashes trying to protect against? I've read about rainbow table and dictionary attacks (brute force), but how are these attacks administered?
The basic attack that password hashing protects against is when the attacker gets access to your database. The clear text password cannot be read from the database and the password is protected.
A more sophisticated attacker can generate a list of possible passwords and compute the hash using the same algorithm as you. He can then compare the computed hash to the stored hash and if he finds a match he has a valid password. This is a brute force attack and it is generally assumed that the attacker has "offline" access to your database. By requiring the users to use long and complex passwords the effort required to "brute force" a password is significantly increased.
When the attacker wants to attack not one password but all the passwords in the database, a large table of password and hash value pairs can be precomputed and further improved by using what are called hash chains. Rainbow tables are an application of this idea and can be used to brute force many passwords simultaneously without increasing the effort significantly. However, if a unique salt is used to compute the hash for each password, a precomputed table becomes useless as it is different for each salt and cannot be reused.
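The offline guessing attack and the effect of a per-password salt can be illustrated with a small sketch. Everything here is an assumption made for illustration: SHA-256 as the hash, a four-word candidate list, a plain lookup table standing in for a real rainbow table, and a made-up salt value.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def dictionary_attack(stored_hash: str, salt: str,
                      candidates: Iterable[str]) -> Optional[str]:
    """Hash each guess with the known salt and compare to the stolen hash."""
    for guess in candidates:
        if sha256_hex(guess + salt) == stored_hash:
            return guess
    return None


candidates = ["123456", "password", "letmein", "secret"]

# Unsalted case: one table, built once, answers every lookup instantly.
precomputed = {sha256_hex(word): word for word in candidates}
unsalted_hash = sha256_hex("letmein")
found_by_table = precomputed.get(unsalted_hash)

# Salted case: the same table no longer matches, so the attacker has to
# redo the hashing work for this one salt.
salt = "x9Qf"  # hypothetical per-user salt stored next to the hash
salted_hash = sha256_hex("letmein" + salt)
table_hit = precomputed.get(salted_hash)
found_by_guessing = dictionary_attack(salted_hash, salt, candidates)
```

Note that the salt does not stop the guessing loop itself; it only forces the attacker to repeat that loop per password instead of reusing one precomputed table.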
To sum it up: Security by obscurity is not a good strategy for protecting sensitive information and modern cryptography allows you to secure information without having to resort to obscurity.
what types of attacks are these hashes trying to protect against?
The type where someone gets your password hash from a poorly secured site, reverses it, and then tries to access your bank/PayPal/etc. account. It happens all the time, and many people are still using the same (and often weak) passwords everywhere.
As a side note, from what I've read, key derivation functions (PBKDF2/scrypt/bcrypt) are considered better/more secure (#1, #2) than plain salted SHA-1/SHA-2 hashes by crypto people.
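For a concrete picture of what using a key derivation function looks like, here is a minimal sketch with PBKDF2 from Python's standard library. The iteration count, salt length, and function names are assumptions to be tuned for your own hardware, not recommendations from the answer above.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor; higher means slower logins AND slower cracking.
ITERATIONS = 200_000


def hash_password(password: str,
                  salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a fresh random salt is made if none given."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("correct horse battery staple")
ok = verify_password("correct horse battery staple", salt, stored)
```

The difference from a single salted SHA-1/SHA-2 call is the deliberate cost: every guess an attacker makes has to pay for all the iterations as well.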
If you have just a hash, no salt, then once they know your data (and algorithm) they can get your password via a rainbow table lookup. If you have a hash and a salt, they can get your password by burning a lot of CPU cycles and building a rainbow table.
If your salt is the same for all your data, they only need to burn a lot of CPU cycles once to build the table and then they have all the passwords. If your salt is not always the same, they need to burn through the CPU cycles to make a unique rainbow table for each record.
If the salt is long enough, the CPU cycles they need become very cost-prohibitive.
If you know your data security is breached, of course, you need to reset all the passwords immediately anyway, because as far as you know the attacker is willing to spend that time.
If someone has access to your database/data, then they would still have to figure out your hashing algorithm and your data would still be somewhat secure, depending on your algorithm? All they would have is the hash and the salt.
This might be all a really dedicated opponent would need. Much of this answer depends on how valuable the data is, which would tell you how motivated the opponent is. Credit card numbers are going to be extremely valuable, and criminal attackers seem to have plenty of time and accomplices to do their dirty work. Some bad guys have been known to farm out key decryption tasks to botnets!
If someone has access to your database/data and your source code, then it seems like no matter what you do, your hashing algorithm can be reverse engineered, and the only thing you would have on your side would be how complex and time consuming your algorithm is?
If they have access to your source and all the data, the question is going to be "how did you load your key into the memory of the server in the first place?" If it's embedded in the data or in the program code, it's game over and you've lost. If it was hand-keyed by an operator at the machine's boot time, it should be as secure as your trust in your operator. If it is stored in an HSM*, it should still be secure.
And if they have root-level authority access to your running machine, then they can probably trigger and recover a memory dump that will reveal the secret key.
It seems like the weakest link is: how secure your own systems are and who has access to it?
This is true. But there are alternatives that help improve security.
For bank-like protection, the kind that passes security and industry audits, it's recommended that you use a *Hardware Security Module (HSM) to perform key storage and encryption/decryption functions. The commercial-strength HSMs we're looking at cost tens of thousands of dollars or more each, depending on capacity. But I have seen hardware encryption cards that plug into a PCI slot that cost substantially less.
The idea behind an HSM is that the encryption happens on a secure, hardened platform that nobody has access to without the secret keys. Most of them have cabinets with intrusion detection switches, trip wires, epoxied chips, and memory that will self-destruct if tampered with. Not even the legitimate owner or the factory should be able to recover the database key from an HSM without the set of authorized crypto keys (usually carried on smart cards.)
For a very small installation, an HSM can be as simple as a smart card. Smart cards aren't high performance encryption devices, though, so you can't pump more than about one decryption transaction per second through them. Systems using smart cards usually just store the root key, then decrypt the working database key on the smart card and send it back to the database accessing system. These will still yield the working database key if the attacker can access running memory, or if the attacker can sniff the USB traffic to and from the smart card.
And I have no experience getting TPM chips to work (yet), but theoretically they can be used to securely store keys on a machine. Again, it is still no defense against an attacker taking a memory dump while the key is loaded in memory, but they would prevent a stolen hard drive containing code and data from revealing its secrets.
A hash cannot be reversed. Conceptually, think of a hash as taking the value to be hashed as the seed to a random number generator, then taking the 500th number that it generates. This is a repeatable process, but it is not a reversible process.
If you store a hashed password in your database, when your user logs in, you take his password from the input to the login page, you apply the same hash to it, and then you compare the result of that operation to what you have stored in the database. If they match, the user typed the right password. (Or, in theory, they could have typed something that happens to hash to the same value, but in practice, you can completely ignore this.)
The purpose of the salt is so that even if users have the same password, you can't tell, and also lots of other things which are equivalent to this idea. If the user's password is "secret", and the salt is "abc", then instead of making a hash of "secret", you hash "secretabc" and store the results of that in your database. You also store the salt, but this is perfectly safe to store -- you can't figure out any information about the password from it.
The only reason to safeguard the hashed passwords and salt is that if an attacker has a copy of it, he can test passwords offline on his own machine, rather than repeatedly trying to log in to your server, which you would probably lock him out after three attempts or something like that. Even if you don't lock him out, it's much faster to test locally than to wait for the network round-trip.
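The login check and the "secret" plus "abc" salt example above might look like the following sketch. It mirrors the answer's conceptual description with a single SHA-256 call; the user table, usernames, and function names are hypothetical, and this is not meant as a production storage scheme.

```python
import hashlib
import hmac


def salted_hash(password: str, salt: str) -> str:
    # Mirrors the text: hash "secretabc" rather than "secret".
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


# Hypothetical stored records: the salt sits next to the hash in the DB.
# Both users chose "secret", yet the stored hashes differ.
users = {
    "alice": {"salt": "abc", "hash": salted_hash("secret", "abc")},
    "bob": {"salt": "xyz", "hash": salted_hash("secret", "xyz")},
}


def check_login(username: str, typed_password: str) -> bool:
    """Apply the same hash to what was typed and compare with the DB value."""
    record = users.get(username)
    if record is None:
        return False
    candidate = salted_hash(typed_password, record["salt"])
    return hmac.compare_digest(candidate, record["hash"])


same_password_same_hash = users["alice"]["hash"] == users["bob"]["hash"]
```

Because the two salts differ, someone reading the table cannot tell that alice and bob share a password, which is the point the answer makes about salts.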
( OP )
brings up a good point, if your data is compromised then game over ... my follow up question is: what types of attacks are these hashes trying to protect against? I've read about rainbow table and dictionary attacks (brute force), but how are these attacks administered?
( discussion )
It's not a game, except to the attacker. Research these terms:
Sarbanes-Oxley
Gramm-Leach-Bliley Act (GLBA)
HIPAA
Digital Millennium Copyright Act (DMCA)
PATRIOT Act
Then tell us ( as thought provocation for you ) how do we protect against whom? For one thing, it is the efforts of innocents vis-a-vis intruders - and for another it is data-recovery if part of the system fails.
It is an interesting experiment that the original intent of tcp/ip and so on is advertised as being a weapon of war, survivability under attacks. Okay, so passwords are hashed - no one can recover them ...
Which, duh, includes the owner-operator of the system.
So you build a robust record locking tool that implements key controls, then political pressures force the use of brand-x tools.
You can read Federal Information Security Management Act (FISMA) and by the time you have read it some governmental entity somewhere will have had an entire disk either stolen or compromised.
How would you protect that disk if it was your personal identity information on that disk?
I can tell you from the caliber of Martin Liversage and jadeters they will be paying attention.
Here are my thoughts to your points:
If people have access to your database you have bigger security concerns than your hash algorithm and salt phrase. Hashes are somewhat secure, however there are problems such as hash collisions and hash lookups.
Hashes are one-way, so unless they can guess the input there is no way to reverse out the original text even with the algorithm and salt; hence the name one-way hash.
Security is about obscurity and layers of defense. If you layer your defenses and make it hard to determine what those defenses are, you stand a much better chance of staving off an attack than if you relied on a single approach to security such as password hashing and running OS/network hardware updates. Throw in some curveballs like obfuscation of the web server platform and clear boundaries between the prod web and database environments. Layers and hiding implementation details buy you valuable time.
When hashing a password, it is one way. So it is very difficult to get the password even if you have the salt, the source, and a lot of CPU cycles to burn.

Resources