I have read about password salting, and this might sound a little odd, but how do I store and secure the salt? For example, in a multi-tier architecture, if I use the client machine's GUID to generate the salt, the user is restricted to a single machine, but if I use a random salt it has to be stored somewhere. A few days ago I saw a sample application where the hash and the salt were generated on the client whenever a new user was created, and then the salt and the salted hash were transmitted to the server, where they were stored in SQL Server. But if I follow this method and the database is compromised, the password hashes and the salt for each password will be available to the attacker. So should I salt or encrypt the passwords and the received salt again on the server side? What is the best practice for salting?
Storing the salt unencrypted in the database next to the hashed passwords is not a problem.
The purpose of the salt is not to be secret. Its purpose is to be different for each hash (i.e. random), and long enough to defeat the use of rainbow tables when an attacker gets his hands on the database.
See this excellent post on the subject by Thomas Ptacek.
edit #ZJR: even if the salts were completely public, they would still defeat the benefit of rainbow tables. When you have a salt and hashed data, the best you can do to reverse it is brute force (provided that the hash function is cryptographically secure)
edit #n10i: See the Wikipedia article on secure hash functions. As for the salt size, the popular bcrypt.gensalt() implementation uses a 128-bit salt.
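To make the "random, per-hash, stored in the clear" idea concrete, here is a minimal Python sketch. It assumes a 128-bit salt from the standard `secrets` module and a made-up `salt$hash` storage format; the helper names and the iteration count are illustrative only, and PBKDF2 is just one possible hash function.

```python
import hashlib
import secrets

SALT_BYTES = 16  # 128 bits, the salt size mentioned above


def new_salt() -> bytes:
    """Return a fresh random salt; it does not need to be kept secret."""
    return secrets.token_bytes(SALT_BYTES)


def make_record(password: str, salt: bytes) -> str:
    """Build a 'salt$hash' string that can be stored as-is in one column."""
    # 100_000 iterations is an assumed cost, not a recommendation.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt.hex() + "$" + digest.hex()


record = make_record("correct horse", new_salt())
salt_hex, hash_hex = record.split("$")
```

Because the salt travels with the hash, the server can always recompute the hash at login, and two users with the same password still end up with different records.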
Please take a moment to read this very good description of salts and hashing
Salt Generation and open source software
Does using hash functions and algorithms remove the need to encrypt the data while communicating with the server?
I am applying a salt mechanism in my project: I will concatenate the salt with the entered password and then hash the result.
Do I still need to encrypt the result?
The usual workflow for a website to transmit user passwords looks like this:
The client sends the plaintext password to the server.
The transmission is done over an encrypted connection (HTTPS/SSL) to prevent a man-in-the-middle attack.
The server calculates a hash of the plaintext password, and this hash is then stored in the database. It is not necessary to encrypt the hash any further.
Make sure you use a random unique salt for each password, and a slow hash function with a cost factor for hashing passwords. Good algorithms are BCrypt, PBKDF2 or SCrypt.
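As a rough illustration of the server-side step, the sketch below uses PBKDF2 from Python's standard `hashlib`. The function names and the iteration count are assumptions made for the example; BCrypt or SCrypt from a maintained library would follow the same shape.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # cost factor; an assumed value, tune it for your hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple:
    """Return (salt, iterations, hash) for a newly chosen password."""
    salt = os.urandom(16)  # random and unique per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and cost, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Storing the iteration count next to the salt and hash lets you raise the cost later without invalidating existing records.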
Storing passwords
To store user passwords securely, you need 3 things:
Do not store the plain password, store a hash instead
The hash makes it extremely difficult to recover the password even if an attacker manages to capture the entire database
To prevent the use of rainbow tables, you need a salt
The salt is stored in the clear (it can sit right next to the hash) and is random, one for every user, and you can easily choose a new one whenever the user changes their password.
You need a SLOW hash, not a fast hash
What are fast hashes? MD5 (consider it broken), SHA-1, SHA-2, ... are unsuitable because an attacker can compute them very quickly, run dictionary attacks with common passwords, and in that way find up to 95% of your users' passwords in mere hours on modern rigs.
How slow does it need to be? As slow as you can afford.
And there's a rule 0: Do not invent crypto yourself, you will make serious mistakes before you know it.
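One way to find out what "as slow as you can afford" means on your own server is to time the hash and raise the cost until it reaches a budget. The sketch below does that for PBKDF2 from the standard library; the 0.1 second target and the function names are assumptions chosen for illustration.

```python
import hashlib
import os
import time


def time_pbkdf2(iterations: int) -> float:
    """Return the seconds one PBKDF2-HMAC-SHA256 call takes at this cost."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark password", salt, iterations)
    return time.perf_counter() - start


def calibrate(target_seconds: float = 0.1, start_iterations: int = 10_000) -> int:
    """Double the iteration count until one hash takes at least target_seconds."""
    iterations = start_iterations
    while time_pbkdf2(iterations) < target_seconds:
        iterations *= 2
    return iterations
```

Run something like `calibrate()` on the production hardware rather than on a development laptop, since the result depends on the CPU.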
Other privacy sensitive info
You're most probably also storing other sensitive information about your visitors in addition to the passwords: email addresses, IP addresses, names, postal addresses, CC numbers (you'd better not go there), ...
You need to protect that as well and using hashes isn't the way to do that in most cases. Some of these have requirements and regulations of their own (e.g. Credit Card data is regulated by the issuers who'll force you to be compliant with PCI-DSS).
In essence you need to do a risk analysis and manage that risk by either accepting it ("so be it"), transferring it ("get insurance"), avoiding it ("we're not storing that"), or mitigating it ("we're going to improve our way of working").
Encryption
While the media will make you believe there's a "magic" solution in that incomprehensible "encryption" thing, in reality it needs to be done right and under the right conditions to have any meaning at all.
E.g. if you encrypt the entire disk of a server, it will not protect you from an attacker abusing your server scripts and getting to the database (as the database engine and web server scripts already have access to the decrypted disk).
So you really need to go back to the risk analysis and choose your measures there, instead of getting ahead of yourself and picking encryption as a tool that's unlikely to help with your biggest risks.
Given that you really have to perform your password hashing on the client side, how can you implement server-side salting?
The first solution that I can think of is to ask for the user's salt from the server's users table before you perform the hash. But that means you're confirming that the user "exists" since you give him the valid salt of the user.
I've also thought that instead of storing the salt in the users table, you can make the salt something that is available to the user, for example, a variation of his username. But consistency problems might arise because the server and the client both need to remember exactly how the salt is derived from the provided user data.
What is the best way to do this?
I'm no expert on the topic, but how about using something like a one-time salt along with the solutions you mentioned?
Meaning, you provide the client a salting function that generates a salt based on a random seed for a short time frame. The seed itself is dynamic and changes after some time and must be the same between the server and client. After all, the salt need not be secret.
On the client side, generate the salt using the username (or whatever user data is available), assuming it is unique. Then generate the hash of the concatenated password and salt and send it to the server.
On the server side, you calculate the salt using the same salting function as the client, with the username as the input. You then generate the hash in the same way and determine if the two values match. You just have to make sure the time window is wide enough to allow successful authentication.
Hashing client-side is useful if you don't have HTTPS for logins, but it can have some disadvantages such as revealing your hashing and/or salting methods. That being said, if they have access to your password hash database, they probably already have access to that information.
To salt only on the server side, you will need to rehash the password hash received from the client together with the salt. In this scenario you would store only the username, the salt (unless the salt is derived from the username and password hash) and the second hash.
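A minimal Python sketch of that two-step idea follows. The client-side construction (a site name plus the username as salt, with `example.com` as a placeholder) is my own assumption for illustration, as are the function names and iteration counts; only the server-side "random salt plus second hash" part comes from the description above.

```python
import hashlib
import hmac
import os


def client_hash(username: str, password: str) -> bytes:
    """What the client would send instead of the raw password (assumed scheme)."""
    # Salted with a site name plus the username so the client can recompute it
    # without asking the server for anything.
    client_salt = ("example.com:" + username).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), client_salt, 100_000)


def server_store(received: bytes) -> tuple:
    """Rehash the received value with a random server-side salt before storing."""
    salt = os.urandom(16)
    second = hashlib.pbkdf2_hmac("sha256", received, salt, 100_000)
    return salt, second


def server_check(received: bytes, salt: bytes, second: bytes) -> bool:
    """Repeat the server-side hash at login and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", received, salt, 100_000)
    return hmac.compare_digest(candidate, second)
```

Note that the value the client sends is effectively the password as far as the server is concerned, so the connection still needs to be protected.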
If, as in your example, you wish to perform the salting on both client and server, I would suggest using a combination of username and the initial password hash as the salt. The salt won't be secret, since anyone could inspect your salting method and even build it into a password cracker, but it will prevent them from using a rainbow table to crack users who share the same password.
Don't use the username by itself as a salt. If it's a common username (e.g. admin), then there is probably a table out there already with this salt.
The problem with nyde1319's answer (sorry, I didn't have rights to comment on it) is that you need an unencrypted version of the password in your database to perform the password+salt hash, which defeats the purpose of the hash. If it were done using a hashed version of the password, you'd have to store the first hash, and an attacker could just crack that hash, defeating the purpose of the salt.
I have been looking for a good explanation of how to implement a password login system in a typical website environment. I have read some great Wikipedia articles, SO Q&As, blogs, etc., but they always seem to focus purely on generating the hash rather than on the whole process: creating the hash, which parts of it are sent, which parts are stored, what the server-side code does with it, and so on. If there is already a good answer on SO, I apologise for reposting; please link to it.
My current understanding is:
1) A new user creates a new account on your website. They enter a "password", the client side code then generates and appends a long random string "salt" to the end and generates a hash -> BCrypt(password+salt) for example. The client code then sends the full hash plus the unhashed salt to the server.
2) The server stores the full hash and the unhashed salt in the users entry in a DB.
3) During the user login they type their password which is then hashed with a salt again,
Question 1) How does the client side code generate the same 'random' salt value for each user?
Question 2) At this point, does the client-side code just send the full hash without the salt?
Question 3) What does the server side do with the full hash once it has received it? Does it simply compare the sent full hash with the stored full hash? If that's the case, can't an attacker who breaks into the DB and gets the stored full hash values just send them directly to the server to log in? This is based on my assumption that the login process essentially involves the server comparing the full hash sent from the client with the full hash stored in the DB.
Question 4) Should passwords always be sent over a secure connection? Or does salting and hashing them make it OK for anyone to see them?
You are confusing the purpose of the hashing. It is not intended to secure the password for transmission over the wire, and the client does not generate the hash. The purpose of the hash is to prevent an attacker who compromises your database from quickly using a pre-generated hash lookup table to determine what your users' passwords are.
A trivial example follows; as #jhoyla points out in the comments below, industrial-grade production schemes are even more complex.
To create an account:
The client establishes a secure (encrypted, e.g. SSL) connection with the server, and sends the username and password, usually in plaintext (which is OK, because it is encrypted).
The server generates a random salt, appends it to the password, hashes the result, and stores the hash and the unhashed salt value.
To log in:
The client establishes a secure (encrypted, e.g. SSL) connection with the server, and sends the username and password, usually in plaintext (which is OK, because it is encrypted).
The server retrieves the salt from storage, appends it to the password, hashes it, and compares the result to the hashed password in storage. If they match, the user is logged in.
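The two flows above could be sketched in Python roughly like this. It is only an illustration: the in-memory `users` dict stands in for the database, the names and iteration count are assumptions, and PBKDF2 takes the salt as a separate argument instead of literally appending it to the password.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed cost factor
users = {}  # username -> (salt, hash); stands in for the database table


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def create_account(username: str, password: str) -> None:
    """Server side of account creation: new random salt, store salt and hash."""
    salt = os.urandom(16)
    users[username] = (salt, _hash(password, salt))


def log_in(username: str, password: str) -> bool:
    """Server side of login: look up the salt, rehash, compare."""
    if username not in users:
        return False
    salt, stored = users[username]
    return hmac.compare_digest(_hash(password, salt), stored)
```

The plaintext password only exists in memory for the duration of the request; what is persisted is the salt and the hash.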
To establish why we do this, imagine that I have successfully attacked a website's database server and downloaded the database. I now have a list of usernames, probably email addresses, and password hashes. If the passwords are not salted, then there is a very high probability that many of the hashes will be the same (because many people use the same weak passwords). I know that the likelihood of one of those users having that same weak password on (for example) their email account is quite high. So I go to work and hash the whole dictionary, plus many other likely passwords, looking for a hash that matches one of these popular ones. If I get a hit, I've just broken a bunch of passwords. If I was smart, I'd have generated this list in advance so that I can do it quickly.
Now imagine that the passwords are salted. Now, even if two people use the same password, a different salt will have been generated for each of them, and the resulting hashes will be different. I have no way of knowing which passwords are weak, common passwords, and which ones are strong passwords. I can still try my dictionary attack by appending the salt to each candidate password, but I now have to repeat the whole attack for every salt instead of running it once against the entire database, so the time needed to crack the passwords goes up by roughly a factor of the number of users.
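The difference is easy to see in a small Python experiment. The usernames and passwords below are made up, and plain SHA-256 is used only to keep the demonstration short; it is a fast hash and not suitable for real password storage.

```python
import hashlib
import os

passwords = {"alice": "123456", "bob": "123456", "carol": "tr0ub4dor&3"}

# Unsalted: equal passwords give equal hashes, so shared weak passwords stand out.
unsalted = {
    user: hashlib.sha256(pw.encode("utf-8")).hexdigest()
    for user, pw in passwords.items()
}

# Salted: each user gets a random salt, so equal passwords give different hashes.
salted = {}
for user, pw in passwords.items():
    salt = os.urandom(16)
    salted[user] = (salt, hashlib.sha256(salt + pw.encode("utf-8")).hexdigest())

print(unsalted["alice"] == unsalted["bob"])    # True
print(salted["alice"][1] == salted["bob"][1])  # False
```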
Never ever implement it yourself! If you need it just for learning, then #Chris has answered you. But if you need it for working software, don't do it. Every language has security libraries, and every data store (LDAP, database) has a password storage mechanism already implemented. Use it and don't reinvent the wheel, because you will most probably miss some detail.
When authenticating a user to a website, should the hash generation and comparison be done in the database or the website?
My argument is that the website should pass the user-supplied password (possibly encrypted by the web server) to the database. The database then hashes it with the salt and compares the hashes. The database then responds to the web server with whether the user's credentials are valid or not. This way, the very minimum ever leaves the database: essentially a yes or no, and none of the stored credential info. The downside is that the database has to do more work.
The other argument is that the work should be done in the web server. Here the web server would request the stored hash from the database, create the hash itself, and compare. In this situation the salt needs to be passed from the database back to the web server for the hash to be created, but the work is spread out as the number of web servers increases.
Personally I see the second method as a potential security risk. Should the web server be compromised, salts and hashes can be requested from the database and easily cracked.
What is the best practise for performing the above operation? Am I overlooking/missing something?
Thanks
The first problem I suspect you will run into (and it's a big one) is that your database does not have a password hash function. Sure, it probably has MD5() and SHA1() but these are cryptographic hash functions. Does it have bcrypt() or scrypt() or PBKDF2()?
Using a cryptographic hash function rather than a password hash function is what meant that the LinkedIn passwords could be cracked so quickly. If you don't use one of the above functions then you will be similarly vulnerable if your hashes are leaked.
To answer your question, assume that your database does support a password hashing algorithm (I'll use bcrypt simply because I have to pick one). The two alternatives are:
Hashing in the database:
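// Assumes $db is a PDO connection; BCRYPT() stands in for a database-side password hash function.
$stmt = $db->prepare("SELECT COUNT(*) AS count FROM users WHERE username = ? AND password = BCRYPT(?, (SELECT salt FROM users WHERE username = ?))");
$stmt->execute([$username, $password, $username]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
if ($row['count'] != 1)
{
    // Not authenticated. Throw exception.
}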
In this case, the raw password is sent to the database and a simple yes or no (1 or 0) is returned. This database communication can be encrypted. The hash and salt are never held in the application.
Hashing in the application:
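// Assumes $db is a PDO connection; bcrypt() is a placeholder for the application's password hash function.
$stmt = $db->prepare("SELECT username, salt, password FROM users WHERE username = ?");
$stmt->execute([$username]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
// Unknown user or hash mismatch; hash_equals() compares in constant time.
if ($row === false || !hash_equals($row['password'], bcrypt($password, $row['salt'])))
{
    // Not authenticated. Throw exception.
}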
In this case, the hash and salt are pulled from the database into the application and the hashing of the raw password and comparison is done there. The communication to the database can still be encrypted. The raw password is never held in the database memory.
For efficiency, we can assume that both hashing algorithms are written in C (or some compiled language) and are possibly provided by the OS so take the same time. The application hashing option receives more data over the wire and the database hashing option sends more and has a more complex query (essentially two queries, one to get the salt and one to effect the comparison). It may not be possible to use an index the way I have written that query but the query could be rewritten. Since the size of the data in both cases is likely still one TCP packet, the speed difference will be negligible. I would call this one a win for the application hashing option due to the subquery.
For exposure, I would consider the raw password to be more sensitive than the hash and the salt together. Therefore, limiting the exposure of the raw password seems like the safer bet, making application hashing the best practice.
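For comparison, here is a self-contained Python sketch of the application hashing option using the standard `sqlite3` module and PBKDF2. The table layout, the in-memory database, the sample user and the iteration count are all assumptions made so the example can run on its own.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 100_000  # assumed cost factor


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def authenticate(db: sqlite3.Connection, username: str, password: str) -> bool:
    """Fetch salt and hash with a parameterized query, then hash and compare in the application."""
    row = db.execute(
        "SELECT salt, password FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(hash_password(password, salt), stored)


# Demo setup: an in-memory database with one user.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, password BLOB)")
salt = os.urandom(16)
db.execute("INSERT INTO users VALUES (?, ?, ?)", ("alice", salt, hash_password("hunter2", salt)))
```

The raw password never appears in the SQL text or its parameters, which is the exposure argument made above.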
There's a really good article on how to store passwords securely here:
http://throwingfire.com/storing-passwords-securely/
You are overlooking the purpose of a salt.
A salt is used to prevent a dictionary attack against hashed passwords. If your password is "peanut" and hashes to 12345, then I can pre-generate a list of hashes for every word in a dictionary (including your password) and quickly find your password by doing a lookup against my pre-generated set of password hashes. This is what happened to LinkedIn recently. If the passwords are salted, I'd have to pre-generate a dictionary for each salt value after compromising the database, which would be prohibitively expensive.
Furthermore, proper randomly-generated salts prevent an attacker from knowing that you and I have the same password (without the salt, we'd have the same hash).
My point is that the salts are not intended to be a secret. They are not public information, but an attacker getting access to the salt values + the hashes does not necessarily mean that the passwords have been compromised.
A good rule of thumb for computer security is that if you have to ask, you shouldn't do it yourself. But if your concern is exposure of password details if the web server is compromised, then one approach is to move authentication onto its own system, and don't give the web server access to the password database at all.
In the light of the big LinkedIn password leak, I've been thinking about password security. The web development frameworks that I have worked with in the past typically store a master, application-level salt as an app constant, then salt all user passwords with that value (randomly generated on a per-app basis). e.g. in pseudo-code: password = hash(App::salt + userPassword).
I've read a lot of advice that suggests generating a random salt for each user, then storing that in the database along with each user's password. My question is, how does this increase security? If an attacker procures a list of password digests from the database, they are likely also able to get the salt as well, right? Or is there some attack vector that I don't know of that will get password digests without access to the rest of the table?
Storing a random salt for each user defeats rainbow table attacks.
In case of a "master salt" it is still possible to precompute such table and use it in the attack. With a per-user salt this becomes impractical.
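A toy Python example of why that is. The master salt value, the word list and the leaked records are invented for the demonstration, and SHA-256 is used only for brevity.

```python
import hashlib
import os

MASTER_SALT = b"app-wide-constant"  # hypothetical application-level salt
dictionary = ["123456", "password", "letmein"]


def h(salt: bytes, password: str) -> str:
    # SHA-256 keeps the demo short; a real system would use a slow password hash.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# With a master salt, one table built once covers every user in the database.
table = {h(MASTER_SALT, word): word for word in dictionary}
leaked_master = {"alice": h(MASTER_SALT, "letmein"), "bob": h(MASTER_SALT, "password")}
cracked = {user: table.get(digest) for user, digest in leaked_master.items()}

# With per-user salts, that table matches nothing; each user needs a
# separate pass over the dictionary using that user's own salt.
alice_salt = os.urandom(16)
leaked_per_user = {"alice": (alice_salt, h(alice_salt, "letmein"))}
reused = table.get(leaked_per_user["alice"][1])
```

Here `cracked` recovers both passwords from the single precomputed table, while `reused` comes back as `None` for the per-user salted hash.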