Why are passphrases more secure than normal passwords? (p%9y#k&yFm?)
Wouldn't it be easier to crack a passphrase than a normal password, since it only contains letters?
And is there any way to make passphrases more secure?
In a brute force attack against a passphrase, since there are generally more characters, it takes longer to crack if just guessing.
Ex:
Password - e7%2b
Number of possible solutions: 128 ^ 5 = 34,359,738,368
(5 is number of characters, 128 is character amount in ascii for example)
Passphrase - iloveicecream
Number of possible solutions: (2 * 26)^13 = 20,325,604,337,285,010,030,592
(13 characters, 26 * 2 (lowercase and capital) amount of letters)
A detailed attack can try and guess words based on other words, ex. with ice it could guess cream, but it still takes a long time to process.
Obviously, a super long passphrase is ideal, but iloveicecream is very simple for a person to remember, which is why passphrases are often said to be the best choice.
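The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the function name `search_space` is made up for illustration, and the alphabet sizes (128 for ASCII, 52 for upper and lower case letters) are the ones assumed in the example.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of candidate strings a naive brute-force attacker has to consider."""
    return alphabet_size ** length

password_space = search_space(128, 5)        # e7%2b: 5 ASCII characters
passphrase_space = search_space(2 * 26, 13)  # iloveicecream: 13 letters, either case

print(f"{password_space:,}")    # 34,359,738,368
print(f"{passphrase_space:,}")  # 20,325,604,337,285,010,030,592
# How many times larger the passphrase search space is
print(passphrase_space // password_space)
```

Note that this only models blind guessing over characters; a word-based attack like the one mentioned above faces a much smaller space.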
Trying to understand more about password security.
How many bits are there in a character in a password?
For example, say I choose the following password for an account: 4T!h36*^^NQi!u*6m7qFT&3X$L6!x6^&
How many bits are in this password?
Say I choose a different password only composed of alphanumeric characters: 45v9Zu9tvrWTd5ew8qsp9w9d899zf6su
Is the number of bits influenced by the composition of the password (i.e. whether I include special characters or not) or is it only affected by the length?
What would constitute a 256-bit password on a website?
I'm assuming that what you mean is the amount of information contained in a password, i.e. the effort it takes to brute-force a password compared to a 256-bit key.
You can calculate the number of possible combinations as
(number of possible characters)^(length of password).
E.g. a 20-character password containing only lowercase letters has 26^20 possible combinations.
From that you can calculate the information contained in the password as log2(number of combinations).
That means a 20-character password containing only lowercase letters is as secure as a log2(26^20) = 94-bit key. They are equally hard to brute-force.
Calculating the bit strength of a password is a good measure of how good a password is.
Important: Please note that this assumes that the password is completely random and doesn't contain any words, i.e. every character is statistically independent of the others.
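As a rough illustration of the formula, here is a short Python sketch. The helper name `entropy_bits` is hypothetical, and the alphabet sizes 62 (alphanumeric) and 94 (printable ASCII without the space) are assumptions chosen to show how the character set affects the result at a fixed length.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of a uniformly random password: log2(alphabet_size ** length)."""
    return length * math.log2(alphabet_size)

print(round(entropy_bits(26, 20)))  # 94

# Same length (32), different character sets: the larger set gives more bits.
print(entropy_bits(62, 32))  # alphanumeric only
print(entropy_bits(94, 32))  # with special characters
```

So both the length and the size of the character set matter, but only under the randomness assumption stated above.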
Assuming a SHA 256 hash and a completely random password using the extended ASCII charset, is there a specific length after which additional characters offer no increase in entropy, and if so what is this?
Thanks.
SHA-256 has 256 bits, obviously. The minimum UTF-8 character length is one byte, i.e. 8 bits. Therefore, any password longer than 256/8 = 32 characters is extremely likely to collide with a shorter one.
Is this what you meant?
A hash doesn't increase entropy, it just, so to speak, distills it. Since SHA256 produces 256 bits of output, if you supply it with a password that's completely unpredictable (i.e., each bit of input represents one bit of entropy) then anything beyond 256 bits of input is more or less wasted.
Other than from a truly random source, however, it's really hard to get input that has one bit of entropy for every bit of input. For typical English text, Shannon's testing showed about one bit of entropy per character.
I have come to roughly the same conclusion as the others did, but with a different rationale.
Generally speaking, a preimage (brute force) attack on SHA-256 requires 2^256 evaluations, regardless of password length. In other words, a hash of a "password" that is thousands of characters long would still take an average of 2^256 tries to duplicate. 2^256 is about 1.2 x 10^77. However, a very short password, where the number of possibilities is less than 2^256, is even easier to break.
The threshold is passed when the number of possibilities is greater than 2^256.
If you are using ISO 8859-1, which has 191 characters, there are 191^n possible random passwords of length n, where n is the length of the password. 191^33 is about 1.9 x 10^75 and 191^34 is about 3.6 x 10^77, so the threshold would be at 33 characters.
If you were using plain ASCII, with 128 characters, there would be 128^n possible random passwords of length n, where n is the length of the password. 128^36 is about 7.2 x 10^75 and 128^37 is about 9.3 x 10^77, so the threshold would be at 36 characters.
Some of the other answers seem to imply that the threshold is always at 32 characters. However, if my logic is correct, the threshold varies, depending on the number of characters you have in your character set.
In fact, suppose that you used only characters a-z and 0-9, you would continue to add password strength up until your password was 49 characters long! (36^49 is about 1.8 x 10^76)
Hopefully this answer gives you a mathematical basis for answering the question.
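The thresholds quoted in this answer can be checked with exact integer arithmetic. The following is a minimal sketch; `max_useful_length` is a made-up name, and "threshold" is taken to mean the largest length n for which (character set size)^n is still below 2^256.

```python
def max_useful_length(alphabet_size: int, hash_bits: int = 256) -> int:
    """Largest n for which alphabet_size ** n is still below 2 ** hash_bits."""
    limit = 2 ** hash_bits
    n = 0
    while alphabet_size ** (n + 1) < limit:
        n += 1
    return n

for size in (191, 128, 36):
    print(size, max_useful_length(size))
# 191 33
# 128 36
# 36 49
```

Passing `hash_bits=128` gives the figure for the hypothetical birthday-attack case.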
As a side note, if a birthday (collision) attack were possible on SHA-256, it would theoretically require only 2^128 evaluations (on average), which is about 3.4 x 10^38. In that case, the threshold for ISO 8859-1 would be at only 16 characters (191^16 is about 3.1 x 10^36). Thankfully, such an attack has not yet been publicly demonstrated.
Please see the Wikipedia articles on SHA-2, preimage attacks, and birthday attacks.
I don't think there is an "effective" limit. A password of any length will be effective if it is created properly (the usual rules: no words, mixed numbers, letters, cases and characters). It is better to force users to follow these rules than to limit the length. But a minimum length should be imposed, something like 8-10 characters, to save the users from themselves.
If I have a user's 6 digit PIN (or n char string) and I wish to verify, say, 3 digits chosen at random from the PIN (or x chars) as part of a 'login' procedure, how would I store the PIN in a database, or some encrypted/hashed version of the PIN, in such a way that I could verify the user's identity?
Thoughts:
Store the PIN in a reversible (symmetrically or asymmetrically) encrypted manner, decrypt for digit checks.
Store a range of hashed permutations of the PIN against some ID, which links to the 'random digits' selected, e.g.:
ID: 123 = Hash of Digits 1, 2, 3
ID: 416 = Hash of Digits 4, 1, 6
Issues:
Key security: Assume that the key is 'protected' and that the app is not financial nor highly critical, but is 'high-volume'.
Creating a large number of hash permutations is both prohibitively high-storage (16 bytes x several permutations) and time-consuming, and probably overkill.
Are there any other options, issues or refinements?
Yes: I know storing passwords/PINs in a reversible manner is 'contentious' and ideally shouldn't be done.
Update
Just for clarification:
1. Random digits is a scheme I am considering to avoid key-loggers.
2. It is not possible to attempt more than a limited number of retries.
3. Other elements help secure and authenticate access.
As any encryption scheme you use to store the password/passphrase would be either prohibitively expensive or easily cracked, I am coming down on the side of just storing it in plain text and ensuring that the database and server security is up to scratch.
You could consider some lightweight encryption scheme to hide the passwords from a casual browser of the database, but you have to admit that any scheme will have two basic vulnerabilities. One: your program will need a password or key, which will have to be stored somewhere and will be almost as vulnerable to snooping as the actual passwords stored in plain text. Two: if you have a reasonable number of users, then a hacker who has access to the encrypted passwords has lots of "clues" to aid his brute force attack, and if your site is open to the public he can insert any number of "known texts" into your database.
Since 6C3 is 20 and 10C3 is 120, I'll get a false positive (be authenticated) on 1/6th of my guesses.
This scheme is only slightly better than no authentication at all regardless of how you store the token.
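The 1/6 figure follows directly from the two binomial coefficients. Here is a small Python sketch of that estimate; the function name is invented for illustration, and the counting model (order ignored, as in the answer above) is an assumption carried over from the text, not a full analysis of the scheme.

```python
from fractions import Fraction
from math import comb

def blind_guess_ratio(pin_length: int = 6, asked: int = 3, digit_values: int = 10) -> Fraction:
    """C(pin_length, asked) / C(digit_values, asked), as in the estimate above."""
    return Fraction(comb(pin_length, asked), comb(digit_values, asked))

print(comb(6, 3), comb(10, 3))  # 20 120
print(blind_guess_ratio())      # 1/6
```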
I totally agree with msw but that argument is only (or mostly) valid for the six digit scheme. For the n-char approach, the false positive ratio will (sometimes...) be much lower. One improvement would be that the random characters must be entered in the same order as in the password.
Also I think that storing hashed permutations would make it relatively easy to find the key using some brute force approach. For example, testing and combining different combinations of three characters and checking those against the stored hashes. This would defeat the purpose of hashing the key in the first place so you might as well store the key encrypted instead.
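To illustrate why hashed three-digit combinations offer little protection, here is a hedged Python sketch. The storage format (SHA-256 over a salt plus the three digits) and the helper names are assumptions made up for this example; the point is only that each stored hash has at most 1000 possible inputs.

```python
import hashlib
from itertools import product

def hash_digits(digits: str, salt: bytes) -> str:
    """Hypothetical storage format: SHA-256 over salt + the selected digits."""
    return hashlib.sha256(salt + digits.encode("ascii")).hexdigest()

def crack_triple(stored_hash: str, salt: bytes):
    """Try all 1000 three-digit strings against one stored hash."""
    for combo in product("0123456789", repeat=3):
        candidate = "".join(combo)
        if hash_digits(candidate, salt) == stored_hash:
            return candidate
    return None

pin = "402917"                      # example PIN, chosen arbitrarily
salt = b"demo-salt"                 # example salt, known to anyone who has the database
stored = hash_digits(pin[0] + pin[3] + pin[5], salt)  # digits 1, 4 and 6
print(crack_triple(stored, salt))   # 497
```

Repeating this for a few overlapping position sets reveals the whole PIN, which is why hashing the permutations gains little over encrypting the PIN.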
Another, totally different argument, is that your users might get very confused by this odd login procedure :)
One possible solution is to use Reed-Solomon (or something like it) to construct an n-of-m scheme: generate an nth degree polynomial f(x), where n is the number of digits needed to log in, and generate the pin digits by evaluating f(x) at x=1..6. The digits combined become your full pin. Any three of these digits can then be used (along with their x coordinate) to interpolate the polynomial constants. If they are equal to your original constants, the digits are correct.
The biggest problem, of course, is to form a field out of numbers 0..9 for polynomial constant arithmetic. Ordinary arithmetic will not cut it in this instance. And my finite field is too rusty to remember if it is possible. If you go 4 bits per digit, you can use GF(2^4) to overcome this deficiency. In addition, it is not possible to select your PIN. It will need to be assigned to you. Finally, assuming you can fix all the problems, there are only 1000 distinct polynomials for a 3 of n scheme, and it is too small for proper security.
Anyhow, I don't think this will be a good method, but I wanted to add some different ideas into the mix.
You say you've other elements for authentication. If you've also passwords, you might do the following:
1. Ask for a password (the password is stored as a hash only on your side)
2. Check the hash of the entered password against the stored password hash
3. On success, continue, otherwise go back to 1
4. Use the entered (unhashed) password as the key for the symmetrically encrypted PIN
5. Ask for some random digits of the PIN
This way the PIN is encrypted, but the key is not stored in plain text on your side. The online portal of my bank seems to do just that (at least I hope that the PIN is encrypted; from the user's point of view the login process is like the one described above).
The key is 'protected'
The app is not financial nor highly critical,
The app is 'high-volume'.
Creating a large number of hash permutations is both prohibitively high-storage (16 bytes x several permutations) and time-consuming, and probably overkill
Random digits is a scheme I am considering to avoid key-loggers.
It is not possible to attempt more than a limited number of retries.
Other elements help secure and authenticate access.
You seem to be arguing for storing the PIN in the clear. I say go for it. You're basically describing a challenge-response authentication method, and cleartext storage on the server side is common for that use-case.
Something similar to this is a one-time-pad, or a secret key matrix. The difference is that the user has to keep / have the pad with them to access. The benefit is that as long as you get the key distribution sufficiently secure, you're very safe from keyloggers.
If you want to make it so that exposure of the matrix / pad doesn't cause compromise alone, have the user use a short (3-4 number) PIN with the pad, and keep your sensitive locking mechanism.
Example of a matrix:
   1 2 3 4 5 6 7 8
A  ; k j l k a s g
B  f q 3 n 0 8 u 0
C  1 2 8 e g u 8 -
A challenge might be: "Enter your PIN, and then the character from square B3 from your matrix."
The response might be:
98763
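The challenge and response above could be checked on the server roughly as follows. This is a sketch only: the matrix is the example one, the PIN 9876 is inferred from the sample response, and the function names are hypothetical.

```python
import hmac

# The example matrix from above, one string per row (columns 1..8).
MATRIX = {
    "A": ";kjlkasg",
    "B": "fq3n08u0",
    "C": "128egu8-",
}

def expected_response(pin: str, square: str) -> str:
    """PIN followed by the character at the requested square, e.g. 'B3'."""
    row, column = square[0], int(square[1:])
    return pin + MATRIX[row][column - 1]

def verify(pin: str, square: str, response: str) -> bool:
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected_response(pin, square), response)

print(expected_response("9876", "B3"))  # 98763
print(verify("9876", "B3", "98763"))    # True
```

A real deployment would pick the square at random for each login and would keep the retry limit mentioned in the question.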
Well, from the discussion of the weaknesses of hashing methods, I gather that good old brute force is the only efficient way to break them.
So, the question is:
Is there a hashing algorithm which is more rigid against brute-force than others?
In case of hashing passwords.
The only protection against brute force is the fact that it takes an inordinately long time to perform a brute force.
Brute force works by simply going through every possible input string and trying it, one at a time. There's no way to protect against simply trying every possible combination.
This question is a decade old and now I've got the answer.
Yes, there are brute-force-resistant algorithms. The key property of such an algorithm is being slow. It does no harm if verifying a correct password takes a few milliseconds, but it drastically slows down a brute-force attack. Moreover, these algorithms can be tuned to keep up with future increases in CPU performance. Such algorithms include
bcrypt
argon2
In PHP in particular, the password_hash() function should be used for hashing passwords.
If you know that the input space is small enough for a brute force attack to be feasible, then there are two options for protecting against brute-force attacks:
Artificially enlarging the input space. This isn't really feasible - Password salting looks like that at first glance, but it really only prevents attackers from amortizing the cost of a brute force attack across multiple targets.
Artificially slowing down the hashing through key strengthening or using a hash algorithm that is inherently slow to compute - presumably, it's only a small extra cost to have the hash take a relatively long time (say, a tenth of a second) in production. But a brute-force attacker incurs this cost billions of times.
So that's the answer: the slower a hash algorithm is to compute, the less susceptible it is to brute-forcing the input space.
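Key strengthening of this kind is available in Python's standard library as `hashlib.pbkdf2_hmac`. The sketch below is one possible illustration, not a recommendation of specific parameters: the iteration count of 200,000, the 16-byte salt and the helper names are assumptions chosen for the example.

```python
import hashlib
import hmac
import os

def slow_hash(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Iterated HMAC-SHA256 (PBKDF2): deliberately costly to compute."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def verify(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    return hmac.compare_digest(slow_hash(password, salt, iterations), expected)

salt = os.urandom(16)  # per-user random salt, stored next to the hash
stored = slow_hash("correct horse battery staple", salt)

print(verify("correct horse battery staple", salt, 200_000, stored))  # True
print(verify("wrong guess", salt, 200_000, stored))                   # False
```

A legitimate login pays the iteration cost once, while an attacker pays it for every single guess.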
(Original Answer follows)
Any additional bit in the output format makes the algorithm twice as strong against a straightforward brute force attack.
But consider that if you had a trillion computers that could each try a trillion hashes per second, it would still take you on the order of 10 million years to brute-force a 128 bit hash (2^128 is about 3.4 x 10^38 candidates, tried at 10^24 per second), and you'll realize that a straightforward brute-force attack on the output is simply not worth wasting any thought on.
Of course, if the input of the hash has less than 128bits of entropy, then you can brute-force the input - this is why it's often feasible to brute-force password cracking (Nobody can actually remember a password with 128 bits of entropy).
Consider the output of the hash algorithms.
An MD5 (128 bit) or SHA-1 (160 bit) hash is certainly easier to brute-force than a SHA-2 hash (224, 256, 384 or even 512 bit).
Of course, there can be other flaws (like in MD5 or a bit less in SHA-1) which weaken the algorithm a lot more.
As Codeka said, no hashing algorithm is 100% secure against brute force attacks. However, even with hardware-assisted password cracking (using the GPU to try passwords), the time it takes to crack a sufficiently long password is astronomical. If you have a password of 8ish characters, you could be vulnerable to a brute force attack. But if you add a few more characters, the time it takes to crack increases radically.
Of course, this doesn't mean you're safe from rainbow table attacks. The solution to that is to add a salt to your password and use a hashing algorithm that isn't vulnerable to preimage attacks.
If you use a salted password of 12-14 characters, preferably hashed with a SHA-2 algorithm (or equivalent), you've got a pretty secure password.
Read more here: http://www.codinghorror.com/blog/2007/10/hardware-assisted-brute-force-attacks-still-for-dummies.html
All cryptographic systems are vulnerable to brute force. Another term for this is a "Trivial Attack".
A simple explanation for hashing is that all hashing algorithms we use accept an arbitrarily large input and produce a fixed-size output. Collisions are therefore unavoidable, and for something like sha256 it takes about 2^128 operations to find one by brute force (the birthday bound). md5() has a shortcut that brings finding a collision down to about 2^39 operations.
One thing you can do to make your passwords stronger is to hide your salt. A password hash cannot be broken until its salt is retrieved. John The Ripper can be given a Dictionary, a Salt and a Password to recover password hashes of any type. In this case sha256() and md5() will break in about the same amount of time. If the attacker doesn't have the salt he will have to make significantly more guesses. If your salt is the same size as sha256 (32 bytes) it will take (dictionary size)*2^256 guesses to break one password. This property of salts is the basis of CWE-760.
Brute force is the worst attack, nothing can be brute force proof...
right now ~80-90 bits is considered cryptographically safe from a brute force attack standpoint, so you only need 10 bytes if a Collision Resistant Hash function is perfect, but they aren't so you just do more bits...
the proof that nothing can be brute force proof is in the Pigeon Hole Principle.
since a hash function H accepts arbitrary sized input {0,1}^n and produces constant sized output {0,1}^k, when the size of the input exceeds the output size (n > k), there are necessarily some outputs that can be produced by more than one input.
you can visualize that with a square divided into 9 sub squares.
0 | 0 | 0
0 | 0 | 0
0 | 0 | 0
these are your 9 holes. We are a brute force attacker, we have unlimited chances to attack... we have unlimited pigeons... but we at most need 10 to find a collision...
after 4 pigeons and a good collision resistant hashing algorithm:
P | 0 | 0
0 | P | P
0 | 0 | P
after 9 pigeons:
P | P | P
P | P | P
P | P | P
so our 10th pigeon will necessarily be a collision, because all of the holes are full.
but it really isn't even that good, because of another numerical property called the Birthday Paradox where given a number of independent selections you will find a duplicate much much faster than it takes to fill all of your "holes".
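Both effects are easy to see on a deliberately tiny hash. The sketch below truncates SHA-256 to 16 bits (an assumption made purely so the demo finishes instantly; the function names are made up) and feeds it the inputs "0", "1", "2", ... until two of them land in the same hole.

```python
import hashlib

def tiny_hash(data: bytes, bits: int = 16) -> int:
    """SHA-256 truncated to its top `bits` bits: only 2 ** bits 'holes'."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int = 16):
    """Hash successive inputs until two different ones share a tiny_hash value."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        value = tiny_hash(message, bits)
        if value in seen:
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1

first, second, tries = find_collision()
print(first, second, tries)
```

The pigeonhole principle guarantees a collision within 2^16 + 1 tries; thanks to the birthday effect it typically shows up after a few hundred.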
Join me in the fight against weak password hashes.
A PBKDF2 password hash should contain the salt, the number of iterations, and the hash itself so it's possible to verify later. Is there a standard format, like RFC2307's {SSHA}, for PBKDF2 password hashes? BCRYPT is great but PBKDF2 is easier to implement.
Apparently, there's no spec. So here's my spec.
>>> from base64 import urlsafe_b64decode
>>> password = u"hashy the \N{SNOWMAN}"
>>> salt = urlsafe_b64decode('s8MHhEQ78sM=')
>>> encoded = pbkdf2_hash(password, salt=salt)
>>> encoded
'{PBKDF2}1000$s8MHhEQ78sM=$hcKhCiW13OVhmLrbagdY-RwJvkA='
Update: http://www.dlitz.net/software/python-pbkdf2/ defines a crypt() replacement. I updated my little spec to match his, except his starts with $p5k2$ instead of {PBKDF2}. (I have the need to migrate away from other LDAP-style {SCHEMES}).
That's {PBKDF2}, the number of iterations in lowercase hexadecimal, $, the urlsafe_base64 encoded salt, $, and the urlsafe_base64 encoded PBKDF2 output. The salt should be 64 bits, the number of iterations should be at least 1000, and the PBKDF2 with HMAC-SHA1 output can be any length. In my implementation it is always 20 bytes (the length of a SHA-1 hash) by default.
The password must be encoded to utf-8 before being sent through PBKDF2. No word on whether it should be normalized into Unicode's NFC.
This scheme should be on the order of iterations times more costly to brute force than {SSHA}.
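For readers who want to experiment, the format as described in words above can be sketched with the standard library. This is not the author's `pbkdf2_hash` implementation: the names `encode_pbkdf2` and `check_pbkdf2` are hypothetical, the iteration count is written in lowercase hexadecimal as the description says, and the sketch is not guaranteed to reproduce the exact string shown in the session above.

```python
import hashlib
import hmac
import os
from base64 import urlsafe_b64decode, urlsafe_b64encode

PREFIX = "{PBKDF2}"

def encode_pbkdf2(password: str, salt: bytes = None, iterations: int = 1000) -> str:
    """Build '{PBKDF2}<hex iterations>$<b64 salt>$<b64 output>' with HMAC-SHA1."""
    if salt is None:
        salt = os.urandom(8)  # 64-bit salt, as suggested above
    derived = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, iterations, dklen=20)
    return "{}{:x}${}${}".format(
        PREFIX,
        iterations,
        urlsafe_b64encode(salt).decode("ascii"),
        urlsafe_b64encode(derived).decode("ascii"),
    )

def check_pbkdf2(password: str, encoded: str) -> bool:
    """Re-derive the output from the stored parameters and compare."""
    if not encoded.startswith(PREFIX):
        return False
    iterations_hex, salt_b64, hash_b64 = encoded[len(PREFIX):].split("$")
    salt = urlsafe_b64decode(salt_b64)
    expected = urlsafe_b64decode(hash_b64)
    derived = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt, int(iterations_hex, 16), dklen=len(expected)
    )
    return hmac.compare_digest(derived, expected)

encoded = encode_pbkdf2("hashy the \N{SNOWMAN}", salt=urlsafe_b64decode("s8MHhEQ78sM="))
print(encoded)
print(check_pbkdf2("hashy the \N{SNOWMAN}", encoded))  # True
```

Because the salt and iteration count travel with the hash, the iteration count can be raised for new passwords without breaking old records.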
There is a specification for the parameters (salt and iterations) of PBKDF2, but it doesn't include the hash. This is included in PKCS #5 version 2.0 (see Appendix A.2). Some platforms have built-in support for encoding and decoding this ASN.1 structure.
Since PBKDF2 is really a key derivation function, it doesn't make sense for it to specify a way to bundle the "hash" (which is really a derived key) together with the derivation parameters. In normal usage, the key must remain secret, and is never stored.
But for usage as a one-way password hash, the hash can be stored in a record with the parameters, but in its own field.
I'll join you in the fight against weak hashes.
OWASP has a Password Storage Cheat Sheet (https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet) with some guidance; they recommend 64,000 PBKDF2 iterations minimum as of 2012, doubling every two years (i.e. 90,510 in 2013).
Note that storing a long, cryptographically random salt per userid is always a basic requirement.
Note that having a widely variable per-userid number of iterations and storing the number of iterations along with the salt will add some complexity to cracking software, and may help preclude certain optimizations. For instance, "bob" gets hashed with 135,817 iterations, while "alice" uses 95,121 iterations, i.e. perhaps a minimum of (90510 + RAND(90510)) for 2013.
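A per-user iteration count of this kind might be drawn as in the sketch below. The function name is hypothetical, and RAND(90510) is interpreted here as a uniform integer in [0, 90510), which is an assumption about the intended meaning.

```python
import secrets

def per_user_iterations(minimum: int = 90_510) -> int:
    """Pick an iteration count in [minimum, 2 * minimum): minimum + RAND(minimum)."""
    return minimum + secrets.randbelow(minimum)

# Store the chosen count next to the user's salt and hash.
print(per_user_iterations())
```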
Note also that all of this is useless if users are allowed to choose weak passwords like "password", "Password1!", "P#$$w0rd", and "P#$$w0rd123", all of which will be found by rules based dictionary attacks very quickly indeed (the latter is simply "password" with the following rules: uppercase first letter, 1337-speak, add a three digit number to the end). Take a basic dictionary list (phpbb, for a good, small starter wordlist) and apply rules like this to it, and you'll crack a great many passwords where people try "clever" tricks.
Therefore, when checking new passwords, don't just apply "All four of upper, lower, number, symbol, at least 11 characters long", since "P#$$w0rd123" complies with this seemingly very tough rule. Instead, use that basic dictionary list and see if basic rules would crack it. This is a lot simpler than actually trying a crack: you can lower-case your list and their word, and then simply write code like "if the last 4 characters are a common year, check all but the last four characters against the wordlist", "if the last 3 characters are digits, check all but the last 3 characters against the wordlist", "check all but the last two characters against the wordlist", and "de-1337 the password (turn # into a, 3 into e, and so on), then check it against the wordlist and try those other rules too."
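The rules listed above translate into very little code. The following Python sketch is one possible reading of them; the substitution table, the year range 1900-2099 and the function names are assumptions for illustration, and a real check would load a proper wordlist instead of the one-word set used here.

```python
# Hypothetical de-1337 table; extend it to cover the substitutions you care about.
LEET = str.maketrans({"#": "a", "@": "a", "$": "s", "0": "o", "3": "e", "1": "l", "7": "t"})

def candidate_roots(password: str) -> set:
    """Strings to look up in the wordlist after applying the simple rules above."""
    lowered = password.lower()
    tail4, tail3 = lowered[-4:], lowered[-3:]
    roots = set()
    for variant in (lowered, lowered.translate(LEET)):
        roots.add(variant)
        # Suffix tests use the untranslated password so digits are still digits.
        if len(lowered) > 4 and tail4.isdecimal() and 1900 <= int(tail4) <= 2099:
            roots.add(variant[:-4])  # strip a trailing "common year"
        if len(lowered) > 3 and tail3.isdecimal():
            roots.add(variant[:-3])  # strip three trailing digits
        if len(lowered) > 2:
            roots.add(variant[:-2])  # strip the last two characters
    return roots

def is_weak(password: str, wordlist: set) -> bool:
    return not candidate_roots(password).isdisjoint(wordlist)

wordlist = {"password"}  # stand-in for a real lower-cased wordlist
for candidate in ("password", "Password1!", "P#$$w0rd", "P#$$w0rd123"):
    print(candidate, is_weak(candidate, wordlist))  # all four are flagged as weak
```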
As far as passphrases go, in general they are a great idea, particularly if some other characters are added to the middle of words, but if and only if they're long enough, since you're giving up a lot of possible combinations.
Note that modern machines with GPUs are up to tens of billions of hash iterations (MD5, SHA1, SHA-256, SHA-512, etc.) per second, even in 2012. As far as word combination "correct horse battery staple" type passwords go, this one is at best a very modest password: it's only 4 all-lowercase English words of length 7 or less, with spaces. So, say we go looking for XKCD style passwords with an 18 billion guesses per second setup. A modern small American English dictionary has:
6k words of length 5 or less
21k words of length 7 or less
36k words of length 9 or less
46k words of length 11 or less
49k words of length 13 or less
With an XKCD style passphrase, and without bothering to filter words by popularity ("correct" vs. "chair's" vs. "dumpier" vs. "hemorrhaging") we have 21k^4, which is only about 2E17 possibilities. With the 18 billion/sec setup (a single machine with 8 GPU's if we're facing a single SHA1 iteration), that's about 4 months to exhaustively search the keyspace. If we had ten such setups, that's about two weeks. If we excluded unlikely words like "dumpier", that's a lot faster for a quick first pass.
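The estimate above is simple enough to recompute. Here is a minimal sketch, with the dictionary size (21k words) and guess rate (18 billion per second) taken from the text; the function name is made up.

```python
def days_to_exhaust(dictionary_size: int, words: int, guesses_per_second: float) -> float:
    """Days needed to try every passphrase of `words` words from the dictionary."""
    keyspace = dictionary_size ** words
    return keyspace / guesses_per_second / 86_400  # 86,400 seconds per day

print(f"{21_000 ** 4:.1e}")                             # 1.9e+17
print(f"{days_to_exhaust(21_000, 4, 18e9):.1f}")        # 125.1 (about 4 months)
print(f"{days_to_exhaust(21_000, 4, 10 * 18e9):.1f}")   # 12.5 (about two weeks)
```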
Now, if you get words out of a "huge" Linux American English wordlist, like "Balsamina" or "Calvinistically" (both chosen by using the "go to row" feature), then we'd have:
30k words of length 5 or less
115k words of length 7 or less
231k words of length 9 or less
317k words of length 11 or less
362k words of length 13 or less
Even with the 7 length max limit, with this huge dictionary as a base and randomly chosen words, we have 115k^4 ~= 1.8E20 possibilities, or about 12 years if the setup is kept up to date (doubling in power every 18 months). This is extremely similar to a 13 character, lower case + number only password. "300 years" is what most estimates will tell you, but they fail to take Moore's Law into account.
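The gap between "300 years" and "about 12 years" comes from the attacker's speed doubling over time. The sketch below models that as a step function (the rate doubles at the end of every 18-month period), which is an assumption about how the estimate was made; the function names are hypothetical, and a smoother growth model gives a somewhat smaller figure.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_static(keyspace: float, guesses_per_second: float) -> float:
    """Exhaustive search time if the hardware never improves."""
    return keyspace / (guesses_per_second * SECONDS_PER_YEAR)

def years_with_doubling(keyspace: float, guesses_per_second: float,
                        doubling_years: float = 1.5) -> float:
    """Search time if the guess rate doubles after every `doubling_years` years."""
    rate = guesses_per_second * SECONDS_PER_YEAR  # guesses per year
    remaining = keyspace
    years = 0.0
    while True:
        capacity = rate * doubling_years          # guesses possible in this period
        if remaining <= capacity:
            return years + remaining / rate
        remaining -= capacity
        years += doubling_years
        rate *= 2

keyspace = 115_000 ** 4
print(round(years_static(keyspace, 18e9)))            # roughly 300 years
print(round(years_with_doubling(keyspace, 18e9), 1))  # between 11 and 12 years
```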