In a web application I am reading some bytes from /dev/urandom to get a random salt for hashing the passwords.
Is it a good idea to Base64-encode the salt before hashing? Base64 encoding sometimes appends one or more = characters at the end, which could then enable a known-plaintext attack.
But it may be no problem, because the salt is stored in the database anyway. Or am I wrong?
Does this have an effect on the security of the application?
For the most part, probably not. The salt has to be known in order to verify a password, so we can assume that any attacker will obtain both the hashed password and the salt used. All the salt protects against is rainbow-table attacks, and it increases the amount of work (each candidate plaintext must now be hashed n times instead of once to compare against n passwords).
As long as your salt is of a reasonable length, you're probably fine.
Which alphabet of characters is accepted as a salt depends on the hash algorithm used. BCrypt, for example, accepts the following characters, which is nearly but not exactly the Base64 alphabet (standard Base64 uses + where BCrypt uses .):
./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
A known-plaintext attack is no problem here, since we do not encrypt anything, especially not the salt.
No, it's not secure.
You shouldn't use a plain hash function for user passwords. Instead you should use a password-based key derivation function such as PBKDF2 or scrypt, with an appropriate number of iterations to slow down hashing, which mitigates the risk of brute-force attacks.
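As a minimal sketch of that advice, here is PBKDF2 from Python's standard library. The function names, the 16-byte salt and the default iteration count are assumptions made for illustration, not values from the answer above; pick the count by measuring on your own hardware.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative assumption; tune so one hash takes a noticeable fraction of a second.
DEFAULT_ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = DEFAULT_ITERATIONS) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, derived_key); all three are stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, derived


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, _, candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

At login you would load the stored salt, iteration count and derived key for the user and call verify_password with the submitted password.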
What's the difference between a Key Derivation Function and a Password-Hash?
If you are using PHP for your web application:
Do I need base64 encode my salt (for hashing passwords)?
Secure hash and salt for PHP passwords
The purpose of a salt is to make sure that each password is stored differently, i.e. if two people use the same password, the two stored values are not identical. This protects against rainbow-table and lookup-table attacks if an attacker manages to extract the password table data.
Although there is no reason to Base64 it (the hash input is a sequence of bytes, not ASCII text), this should not affect the security of your hashed passwords. Yes, only a limited set of byte values will be used (just the ones that represent valid ASCII characters), but the encoded salt is longer and represents the same range of possible values.
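A small sketch of that point, using only the Python standard library (the 16-byte length is an assumption): the Base64 text is longer than the raw salt and ends in = padding, but it decodes back to exactly the same bytes, so no randomness is lost.

```python
import base64
import os

# 16 random bytes from the operating system's CSPRNG
salt = os.urandom(16)

# 16 bytes is not a multiple of 3, so Base64 pads the text with "=="
encoded = base64.b64encode(salt).decode("ascii")
print(len(salt), len(encoded))

# Decoding recovers the original bytes exactly
decoded = base64.b64decode(encoded)
print(decoded == salt)
```

The padding characters are predictable, but so is the whole salt once the database is read, which is why they do not matter.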
So let's say we somehow got the hashed password of a victim.
So the brute force approach is to take every possible string, hash it and check if it matches the victim's hashed password. If it does, we know that string is the password, and the account is hacked.
But this requires a great deal of computational power and a good amount of time, even for strings of 6-8 characters.
But what if we hash every possible string shorter than some length (say 10 characters) and store the results beforehand in something like a sorted database? Then when you get a hashed password you can simply look it up in the table and get the password.
P.S:-
For this example let's say we are working with only one type of hashing algorithm and have huge data servers to store data.
I'm new to security and this is a very, very basic question, but for some reason the answer was really hard to find on the internet.
This is called a rainbow table, and is very much a known concept.
It is also the reason you should never just store the hash of passwords. A salt (a random string added to the password and then stored with the hash as plaintext for verification) can easily mitigate this attack by making it effectively impossible to use a rainbow table and forcing recomputation.
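To make this concrete, here is a toy sketch. The word list and names are made up, and a plain dictionary stands in for the table (a real rainbow table is a more compact structure, but the lookup idea is the same). The precomputed table finds the unsalted hash instantly and is useless against the salted one.

```python
import hashlib
import os

# Hypothetical tiny "dictionary" of candidate passwords
WORDS = ["password", "letmein", "qwerty", "dragon"]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Precompute hash -> password once, ahead of time
table = {sha256_hex(word.encode("utf-8")): word for word in WORDS}

# Unsalted: a stolen hash is just a dictionary lookup away from the password
stolen_unsalted = sha256_hex(b"letmein")
found_unsalted = table.get(stolen_unsalted)
print(found_unsalted)  # letmein

# Salted: the same password hashes to a value the table has never seen
salt = os.urandom(16)
stolen_salted = sha256_hex(salt + b"letmein")
found_salted = table.get(stolen_salted)
print(found_salted)  # None
```

The attacker can still rebuild the table for that one salt, which is exactly why the hash itself also needs to be slow.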
Also just for completeness it's important to note that plain cryptographic hashes are not adequate anymore to be used for credentials (passwords), because they are too fast, which means it's too fast to generate a rainbow table for a given salt, effectively bruteforcing a password. Specialized hardware makes it feasible to recover relatively strong passwords if only hashed with a plain crypto hash, even if using a salt.
So the best practice is to use a key derivation function (KDF) to generate your password hashes in a way that makes it very slow (infeasible) to brute force, but fast enough to verify. Also in most known implementations adding a random salt to each hash is automatic and the whole thing is just secure.
Such algorithms are, for example, PBKDF2, bcrypt, scrypt or, more recently, Argon2. Each of these has different characteristics and is more resistant to different attacks.
I'd like to incorporate the encryption and decryption of files in one of my C# .NET apps. The scenario is simple: User A sends an AES256-encrypted file to user B. The clear text password is exchanged on a different channel (e.g. phone call or whatever).
From what I understand I should use Rfc2898DeriveBytes for converting the user's clear text password into a more secure password using maybe 10,000 rounds. (see this article).
What I don't understand is the role of salt in my scenario. Usually salt is used in hashing passwords to prevent dictionary attacks. But in my scenario the PBKDF2 algo is used to compensate weaknesses of short or easy to guess clear text passwords by adding extra calculations required by the PBKDF2-rounds.
If I choose a random salt then the receiver will need to know that salt also in order to decrypt correctly. If I use a constant salt, then hackers can easily reverse engineer my code and run brute force attacks using my constant salt (although they'll be really slow thanks to the PBKDF2 iterations).
From what I understand I have no choice but to use a constant salt in my scenario and enforce a good clear text password rule to make up for the weakness of constant salt. Is my assumption correct?
Salts, in the context of password hashing (and key derivation), are used to prevent precomputation attacks like rainbow tables.
Note that the salt must be different and unpredictable (preferably random) for every password. Also note that salts need not be secret – that's what the password is for. You gain no security by keeping the salt secret.
The recommended approach in your case is to generate a random salt every time a file is encrypted, and transmit the salt along with the ciphertext.
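A rough sketch of that flow in Python rather than C#, with the AES step left out because the standard library has no AES. The helper names, the salt-first file layout and the use of HMAC-SHA256 are assumptions for illustration; the point is only that the receiver reads the salt from the file and derives the same key from the shared password.

```python
import hashlib
import os

SALT_LEN = 16        # assumed salt size in bytes
ITERATIONS = 10_000  # the figure from the question; higher is better if you can afford it
KEY_LEN = 32         # 256-bit key


def derive_key(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                               ITERATIONS, dklen=KEY_LEN)


def pack(salt: bytes, ciphertext: bytes) -> bytes:
    # Hypothetical file layout: salt first, then the ciphertext
    return salt + ciphertext


def unpack(blob: bytes):
    return blob[:SALT_LEN], blob[SALT_LEN:]


# Sender: fresh random salt for every file
salt = os.urandom(SALT_LEN)
sender_key = derive_key("correct horse", salt)
ciphertext = b"<AES output would go here>"  # placeholder, encryption omitted
blob = pack(salt, ciphertext)

# Receiver: only the password travelled over the phone; the salt is in the file
received_salt, received_ciphertext = unpack(blob)
receiver_key = derive_key("correct horse", received_salt)
print(sender_key == receiver_key)  # True
```

In a real file format you would normally store the cipher's IV or nonce next to the salt in the same way.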
Is there a specific reason you're using AES-256 by the way? It's around 40% slower than AES-128 due to the extra rounds, and it offers no practical security benefit (particularly not in the case of password-based encryption).
It's also worth considering using a well-established standard like PGP rather than building your own protocol from cryptographic primitives, because building secure protocols is so hard that even experts don't always get it right.
Your assumption is correct. If they have access to the password, they will also have access to the salt. The BCrypt implementations I've seen put the number of iterations, the hash, and the salt all in the same result string!
The idea is: your hash should be secure even if the salt and number of iterations are known. (If we could always know that the salt and number of iterations and even the algorithm would be unknown to attackers, security would get a whole heck of a lot easier! Until attackers politely decline to read our salts, we must assume they will have access to them in the event of a breach.) So you're right, they can brute force it - if they have a few supercomputers and a couple million years of computing time at their disposal.
Coda Hale's article "How To Safely Store a Password" claims that:
bcrypt has salts built-in to prevent rainbow table attacks.
He cites this paper, which says that in OpenBSD's implementation of bcrypt:
OpenBSD generates the 128-bit bcrypt salt from an arcfour
(arc4random(3)) key stream, seeded with random data the kernel
collects from device timings.
I don't understand how this can work. In my conception of a salt:
It needs to be different for each stored password, so that a separate rainbow table would have to be generated for each
It needs to be stored somewhere so that it's repeatable: when a user tries to log in, we take their password attempt, repeat the same salt-and-hash procedure we did when we originally stored their password, and compare
When I'm using Devise (a Rails login manager) with bcrypt, there is no salt column in the database, so I'm confused. If the salt is random and not stored anywhere, how can we reliably repeat the hashing process?
In short, how can bcrypt have built-in salts?
This is bcrypt:
Generate a random salt. A "cost" factor has been pre-configured. Collect a password.
Derive an encryption key from the password using the salt and cost factor. Use it to encrypt a well-known string. Store the cost, salt, and cipher text. Because these three elements have a known length, it's easy to concatenate them and store them in a single field, yet be able to split them apart later.
When someone tries to authenticate, retrieve the stored cost and salt. Derive a key from the input password, cost and salt. Encrypt the same well-known string. If the generated cipher text matches the stored cipher text, the password is a match.
Bcrypt operates in a very similar manner to more traditional schemes based on algorithms like PBKDF2. The main difference is its use of a derived key to encrypt known plain text; other schemes (reasonably) assume the key derivation function is irreversible, and store the derived key directly.
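The store-and-verify flow above can be sketched with the standard library. This is not bcrypt: PBKDF2 stands in for the key derivation and the derived key is stored directly, as the "more traditional" schemes do. The record layout and function names are assumptions; the point is that cost, salt and result live in one field and are split apart again at login.

```python
import base64
import hashlib
import hmac
import os


def make_record(password: str, cost: int = 12) -> str:
    """Build a single storable string: cost$salt$derived_key."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 2 ** cost)
    return "$".join([
        str(cost),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(derived).decode("ascii"),
    ])


def check_record(password: str, record: str) -> bool:
    """Split the record, redo the derivation with the stored cost and salt, compare."""
    cost_text, salt_b64, derived_b64 = record.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(derived_b64)
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                                    2 ** int(cost_text))
    return hmac.compare_digest(candidate, expected)
```

Because the salt travels inside the record, no separate salt column is needed, which is the situation described in the question.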
Stored in the database, a bcrypt "hash" might look something like this:
$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa
This is actually three fields, delimited by "$":
2a identifies the bcrypt algorithm version that was used.
10 is the cost factor; 2^10 (1,024) iterations of the key derivation function are used (which is not enough, by the way. I'd recommend a cost of 12 or more.)
vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa is the salt and the cipher text, concatenated and encoded in a modified Base-64. The first 22 characters decode to a 16-byte value for the salt. The remaining characters are cipher text to be compared for authentication.
This example is taken from the documentation for Coda Hale's ruby implementation.
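Splitting that example string can be sketched as follows. The parse function is hypothetical, and the character order of the bcrypt alphabet is an assumption based on common bcrypt implementations rather than something stated above; the salt is decoded by translating to the standard Base64 alphabet first.

```python
import base64

# Assumed alphabet order of bcrypt's modified Base64
BCRYPT_ALPHABET = "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
STANDARD_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
TO_STANDARD = str.maketrans(BCRYPT_ALPHABET, STANDARD_ALPHABET)


def parse_bcrypt(stored: str):
    """Return (version, cost, salt_bytes, cipher_text_b64) from a bcrypt string."""
    _, version, cost, payload = stored.split("$")
    salt_b64, cipher_b64 = payload[:22], payload[22:]
    # bcrypt omits padding; 22 characters need "==" to decode as standard Base64
    salt = base64.b64decode(salt_b64.translate(TO_STANDARD) + "==")
    return version, int(cost), salt, cipher_b64


version, cost, salt, cipher_b64 = parse_bcrypt(
    "$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa"
)
print(version, cost, 2 ** cost, len(salt), len(cipher_b64))
```

For real verification you would hand the whole string to a bcrypt library rather than parse it yourself.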
I believe that phrase should have been worded as follows:
bcrypt has salts built into the generated hashes to prevent rainbow table attacks.
The bcrypt utility itself does not appear to maintain a list of salts. Rather, salts are generated randomly and appended to the output of the function so that they are remembered later on (according to the Java implementation of bcrypt). Put another way, the "hash" generated by bcrypt is not just the hash. Rather, it is the hash and the salt concatenated.
In simple terms:
Bcrypt does not keep a database of salts.
The salt is stored inside the hash string itself, in Base64 format.
So how does bcrypt verify a password when it has no database?
Bcrypt extracts the salt from the stored password hash, uses that salt to process the plain password, and compares the new hash with the old hash to see whether they are the same.
To make things even clearer,
Registration/Login direction ->
A well-known constant string is encrypted with a key derived from the cost, the salt and the password. We call the encrypted value the cipher text. The salt is then attached to it and the result is encoded in Base64. Finally the cost is prepended, which gives the string bcrypt produces:
$2a$COST$BASE64
This value is stored eventually.
What would the attacker need to do in order to find the password? (other direction <-)
If the attacker gets control over the DB, he can easily decode the Base64 value and see the salt. The salt is not secret, though it is random.
He would then need the key to decrypt the cipher text, which means guessing the password.
More importantly: there is no fast hashing in this process, but rather CPU-expensive encryption, so rainbow tables are much less relevant here.
Let's imagine a table that has 1 hashed password. If a hacker gets access he will know the salt, but he will have to run the calculation for a big list of common passwords and compare after each one. This will take time, and he will have cracked only 1 password.
Imagine a second hashed password in the same table. The salt is visible, but the same calculation has to happen all over again to crack this one too, because the salts are different.
If no random salts were used, it would be much easier. Why? With simple hashing we can generate the hashes for common passwords one single time (a rainbow table) and then do a simple table or file search between the db table hashes and our pre-calculated hashes to find the plain passwords.
This I think may be a silly question, but I have become quite confused on what I should do here for the best.
When salting a password hash, should the salt also be hashed or left as plaintext?
NOTE: I am hashing a password in SHA-256 and the Salt is a pre defined string as only one password will ever be stored at a time.
TIA
Chris (Shamballa).
It doesn't matter.
The purpose of a salt is to prevent pre-computation attacks.
Either way, hashing the salt or using it by itself, the same data is added as a salt each time. If you hash the salt, all you are effectively doing is changing the salt. By hashing it first, you convert it into a different string, which is then used as the salt. There is no reason to do this, but it does no harm if you do.
You just need to be consistent and use the same method every time or you will end up with a different password hash.
You must not hash the salt, since hashes are one way. You need the salt so that you can add it to the password before hashing. You could encrypt it, but it's not necessary.
The critical thing about salts is that each password should have its own salt. Ideally, each salt should be unique, but random is good too. The salt should therefore be long enough to allow it to be unique for each password.
If all salts are the same, it's obvious to the cracker (who can see your hash values), which accounts have the same password. The hash values will be the same. This means that if they crack one password, they get more than one account with no additional work. The cracker might even target those accounts.
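A quick sketch of that leak. The passwords and the shared salt are made up, and plain SHA-256 is used only to keep the example short. With one salt for everyone, identical passwords give identical stored values; with a salt per account they do not.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# One salt shared by all accounts: equal passwords are visible at a glance
shared_salt = b"same-salt-for-everyone"
alice_shared = salted_hash("hunter2", shared_salt)
bob_shared = salted_hash("hunter2", shared_salt)
print(alice_shared == bob_shared)  # True

# A random salt per account hides the fact that the passwords match
alice_unique = salted_hash("hunter2", os.urandom(16))
bob_unique = salted_hash("hunter2", os.urandom(16))
print(alice_unique == bob_unique)  # False
```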
You should assume that the cracker will gain both the salt and the hash value, so the hash algorithm must be secure.
Having any salt at all prevents using existing precomputed rainbow tables to crack your hash value, and having a unique salt for each account removes the desire for your cracker to precompute their own rainbow tables using your salt.
The salt should not be hashed, as you need the original value to combine with the password before hashing it.
No, you must not hash the salt. The salt is kept in clear text because you need it to recompute the hash of the entered password and check it against the one stored in the hashed password file.
But if you need a strong salting procedure you can compute your salted password in this manner:
SaltedHashedPwd = H(...H(H(PWD-k + SALT-k) + SALT-k)... + SALT-k), with H applied N times
H is the hash function
SALT-k is a k-random string you use as salt
PWD-k is the k-password
(every Password has a different salt)
N is the number of times the H function is applied
The PKCS#5 standard suggests N=1000!
In this manner a dictionary attack becomes impractical, because for every word in the dictionary and for every SALT in the password file the attacker needs to compute the hash N times. Too expensive in time!
I think that N=100 should be enough for your uses :-)
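The iterated construction above can be sketched like this. It is one reading of the formula, with SHA-256 chosen as H and simple concatenation standing for "+" (both assumptions). In practice a vetted function such as hashlib.pbkdf2_hmac is preferable to a hand-rolled loop.

```python
import hashlib


def iterated_hash(password: bytes, salt: bytes, n: int) -> bytes:
    """Apply H n times: H(...H(H(password + salt) + salt)... + salt)."""
    digest = hashlib.sha256(password + salt).digest()
    for _ in range(n - 1):
        # Feed the previous digest back in, re-adding the salt each round
        digest = hashlib.sha256(digest + salt).digest()
    return digest


stored = iterated_hash(b"my password", b"per-user-salt", 1000)
print(stored.hex())
```

Every guess the attacker makes now costs n hash computations instead of one.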
As the salt needs to be saved along with the hash (or at least must be retrievable along with the hash), an attacker could possibly get both the salt and the hashed password. In some of my applications, I've stored the salt encrypted in the database (with a key known only to the application). My reasoning was that storing the salt unencrypted along with the hashed password would make it easier to crack the passwords, as a hacker that would be able to retrieve the password table (and would know or make an assumption about the hash algorithm) would be able to find matches between hashes of well known words (dictionary attack) by hashing each word in the dictionary and then salting with the salt he also has access to. If the salt would be encrypted, such an attack wouldn't be possible unless he would also have access to the encryption key known to the application.
(If anybody sees a fault in this logic, please comment.)
The current top-voted answer to this question states:
Another one that's not so much a security issue, although it is security-related, is complete and abject failure to grok the difference between hashing a password and encrypting it. Most commonly found in code where the programmer is trying to provide unsafe "Remind me of my password" functionality.
What exactly is this difference? I was always under the impression that hashing was a form of encryption. What is the unsafe functionality the poster is referring to?
Hashing is a one way function (well, a mapping). It's irreversible: you apply the secure hash algorithm and you cannot get the original string back. The most you can do is to generate what's called "a collision", that is, find a different string that produces the same hash. Cryptographically secure hash algorithms are designed to make finding collisions infeasible. You can attack a secure hash with a rainbow table, which you can counteract by adding a salt to the password before hashing it.
Encrypting is a proper (two way) function. It's reversible, you can decrypt the mangled string to get original string if you have the key.
The unsafe functionality it's referring to is that if you encrypt the passwords, your application has the key stored somewhere and an attacker who gets access to your database (and/or code) can get the original passwords by getting both the key and the encrypted text, whereas with a hash it's impossible.
People usually say that if a cracker owns your database or your code he doesn't need a password, thus the difference is moot. This is naïve, because you still have the duty to protect your users' passwords, mainly because most of them do use the same password over and over again, exposing them to a greater risk by leaking their passwords.
Hashing is a one-way function, meaning that once you hash a password it is very difficult to get the original password back from the hash. Encryption is a two-way function, where it's much easier to get the original text back from the encrypted text.
Plain hashing is easily defeated using a dictionary attack, where an attacker just pre-hashes every word in a dictionary (or every combination of characters up to a certain length), then uses this new dictionary to look up hashed passwords. Using a unique random salt for each hashed password stored makes it much more difficult for an attacker to use this method. They would basically need to create a new unique dictionary for every salt value that you use, slowing down their attack terribly.
It's unsafe to store passwords using an encryption algorithm because if it's easier for the user or the administrator to get the original password back from the encrypted text, it's also easier for an attacker to do the same.
If the password is encrypted, it always remains a hidden secret from which someone can extract the plain text password. When the password is hashed, however, you can relax, as there is hardly any method of recovering the password from the hash value.
Extracted from Encrypted vs Hashed Passwords - Which is better?
Is encryption good?
Plain text passwords can be encrypted with a symmetric encryption algorithm such as DES or AES and stored in the database. At authentication (confirming the identity with user name and password), the application decrypts the encrypted password stored in the database and compares it with the password the user provided. With this approach, even if someone gets access to the database tables, the passwords are not directly reusable. There is bad news as well, however: if someone obtains the cryptographic algorithm along with the key used by your application, he or she can decrypt and view all the user passwords stored in your database. "This is the best option I got", a software developer may scream, but is there a better way?
Cryptographic hash function (one-way-only)
Yes there is; maybe you have missed the point here. Did you notice that there is no requirement to decrypt and compare? Suppose there is a one-way-only conversion, where the password can be converted into some converted-word but the reverse operation (generating the password from the converted-word) is impossible. Now even if someone gets access to the database, there is no way the passwords can be reproduced or extracted from the converted-words. With this approach there is hardly any way that someone could learn your users' top secret passwords, and this protects users who use the same password across multiple applications. What algorithms can be used for this approach?
I've always thought that Encryption can be converted both ways, in a way that the end value can bring you to original value and with Hashing you'll not be able to revert from the end result to the original value.
Hashing algorithms are usually cryptographic in nature, but the principal difference is that encryption is reversible through decryption, and hashing is not.
An encryption function typically takes input and produces encrypted output that is the same, or slightly larger size.
A hashing function takes input and produces a typically smaller output, typically of a fixed size as well.
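The fixed-size property is easy to see with SHA-256 from the Python standard library (the sample inputs are arbitrary): whatever the input length, the digest is 32 bytes.

```python
import hashlib

# Inputs of very different lengths
messages = [b"", b"a", b"a" * 10_000]

# Every SHA-256 digest has the same length
digest_sizes = [len(hashlib.sha256(message).digest()) for message in messages]

for message, size in zip(messages, digest_sizes):
    print(len(message), "->", size)
```

This is also why a hash cannot be reversed in general: many different inputs must share each possible output.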
While it isn't possible to take a hashed result and "dehash" it to get back the original input, you can typically brute-force your way to something that produces the same hash.
In other words, if an authentication scheme takes a password, hashes it, and compares it to a hashed version of the required password, it might not be necessary to know the original password, only its hash: you can brute-force your way to something that will match, even if it's a different password.
Hashing functions are typically created to minimize the chance of collisions and make it hard to just calculate something that will produce the same hash as something else.
Hashing:
It is a one-way algorithm: once a value is hashed it cannot be rolled back, and this is its advantage over encryption.
Encryption:
If we perform encryption, there will be a key to do this. If this key is leaked, all of your passwords can be decrypted easily.
On the other hand, if you used hashed passwords, then even if your database is hacked or your server admin takes data from the DB, the hacker will not be able to break these hashed passwords. This is practically impossible if we use hashing with a proper salt and the additional security of PBKDF2.
If you want to take a look at how should you write your hash functions, you can visit here.
There are many algorithms to perform hashing.
MD5 - Uses the Message Digest Algorithm 5 (MD5) hash function. The output hash is 128 bits in length. The MD5 algorithm was designed by Ron Rivest in the early 1990s and is not a preferred option today.
SHA1 - Uses the Secure Hash Algorithm (SHA1) hash published in 1995. The output hash is 160 bits in length. Although most widely used, this is not a preferred option today.
HMACSHA256, HMACSHA384, HMACSHA512 - Use the functions SHA-256, SHA-384, and SHA-512 of the SHA-2 family. SHA-2 was published in 2001. The output hash lengths are 256, 384, and 512 bits, respectively, as the hash functions’ names indicate.
Ideally you should do both.
First hash the password for the one-way security. Use a salt for extra security.
Then encrypt the hash to defend against dictionary attacks if your database of password hashes is compromised.
As correct as the other answers may be, in the context that the quote was in, hashing is a tool that may be used in securing information, whereas encryption is a process that takes information and makes it very difficult for unauthorized people to read or use.
Here's one reason you may want to use one over the other - password retrieval.
If you only store a hash of a user's password, you can't offer a 'forgotten password' feature.