Is it a security risk to use parts of a GUID as random passwords? - security

When users create an account in my web application, I generate a GUID and use the first 8 characters as their password which is then sent via email.
Is there a security risk I am overlooking in using GUIDs as passwords? I've taken a look at the question "Are GUIDs good passwords?", but that question pertains to personal passwords, not random/generated passwords. Ideally, users will log in and change their password if they want to.

Using GUIDs as passwords is a very bad idea. GUIDs are generated in a predictable, well-defined manner; in other words, given enough information, an attacker could predict the passwords of other users.
Predictable and well defined is the exact opposite of what you want in a password generator.

Yes, unless you know exactly how the GUID is built. For example, some GUIDs (version 1) bundle the MAC address of the host into the GUID. If you happen to use those bits, then that compromises a large amount of the bit space for the "random" password.
Simply put, GUIDs may be unique, but they are not necessarily random.

"Cryptanalysis of the WinAPI GUID generator shows that, since the sequence of V4 GUIDs is pseudo-random; given full knowledge of the internal state, it is possible to predict previous and subsequent values." http://en.wikipedia.org/wiki/Globally_unique_identifier
I wouldn't use it. It's not that hard to use a random number generator, after all; those are designed to be as random as possible, rather than to guarantee global uniqueness.

This article says don't use it.

GUIDs come in a number of flavors; some have parts that are predictable.
On the other hand, it is very, very easy to generate random numbers.
Why use a questionable technique when a secure alternative is readily available?
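As a minimal sketch of that secure alternative, assuming a Python backend: the standard-library `secrets` module draws from the operating system's cryptographically secure random source. The function name `generate_password`, the alphabet, and the default length are illustrative choices, not something from the question.

```python
import secrets
import string

# Illustrative alphabet: letters and digits only, to keep emailed passwords easy to type.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 12) -> str:
    """Return a random password whose characters come from a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

Unlike a slice of a GUID, every character here is chosen independently and unpredictably, so no part of the password depends on a timestamp, a MAC address, or generator state an attacker could reconstruct.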

Using part of the GUID, or even the whole thing, is a very bad idea. Even if most of it happens to be random, there's no guarantee that any particular portion will be.
I'm not sure there'd be much trouble using a hash of a GUID, or better yet a hash that combined a GUID with some other source of randomness (e.g. one might hash the time when the program starts, and then generate a passcode by returning part of a hash of the previous hash and a new GUID). If there's any randomness at all in GUID generation, the entropy of the hash should increase with each iteration. Note that the passcode should not reveal the entire hash value; some of that should be kept as secret internal state.

Related

Store user IP, but make it non traceable

I am working on a project where users (in a given and relatively short time period) answer statements, and I would like to store the entries anonymously.
After the collection period is over, I would like to be able to run statistics on the answers. But it is very important that the users' answers cannot be traced back to a specific user/IP.
The reason that I would still like to store the IP, despite my desire for the users to be anonymous, is that I would like to exclude entries where the user (with malicious intent or by accident) takes the same test multiple times in a short span.
I have ruled out using encryption, as it is, to my limited knowledge, not possible to compare a large set of encrypted strings like that.
My currently proposed method is then to store the user agent, a session identifier and a hashed IP address.
Regarding the hashing method, I am thinking of using SHA-512 where the IP is prepended with a 16-character salt (same salt for all entries).
I know that when hashing simple and common strings, SHA-512 and other hashing methods can be broken with tools like http://md5decrypt.net/en/Sha512/ and good old brute forcing.
My idea to then guarantee user anonymity is that after the collection period is over, I will delete the salt, making it (to my knowledge) near impossible to brute force the hash, even if a malicious party got hold of my source code.
I know it seems like a low-tech solution, and that part of the security is based on my own action of actually deleting, where I in theory could forget or change my mind. But it is the only solution I could come up with.
Thanks in advance
Don't hash the IPs, HMAC them. That's conceptually the same as what you want to do, but cryptographically robust.
https://en.wikipedia.org/wiki/Hash-based_message_authentication_code
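A rough sketch of what that could look like, assuming Python and HMAC-SHA-256. The names `SECRET_KEY` and `pseudonymize_ip` are hypothetical; the key plays the role of the asker's salt and would be deleted once the collection period ends.

```python
import hashlib
import hmac
import secrets

# Hypothetical key, generated once at the start of the collection period,
# kept outside the database, and destroyed when collection ends.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize_ip(ip: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed digest of the IP address for duplicate detection."""
    return hmac.new(key, ip.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # The same IP yields the same token while the key exists, so repeats can be spotted.
    print(pseudonymize_ip("203.0.113.7") == pseudonymize_ip("203.0.113.7"))
```

Because the IPv4 address space is small, the stored tokens are only as private as the key: anyone holding the key can enumerate every address, so the anonymity still rests on actually destroying it.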

Do I need to use hashes instead of user id's for invite system?

I'm implementing a referral system, and I'm wondering if I NEED to use a hash as the referrer_id, or if I can just use the user's id? I don't want to dogmatically use hashes for EVERYTHING, so can you give me some examples of potential pitfalls?
It depends on what is important to you. If you need to know for sure that the referrer_id originated from the user who provided the referral, then you will need some sort of hash combined with a token that cannot be guessed. If all you do is hash the user id, you aren't providing any real security, because an attacker could simply guess a user ID and hash it to make your server happy.
It is important to know that hashing does not produce any unpredictability. It will always produce the same output for the same input. Hashing is valuable because it makes tampering detectable (a small change in input produces a large change in output) and it normalizes the size of the data, but it should never be confused with encryption.
How much do you care about Confidentiality, Integrity, and Availability of the referrer_id? The answers to those questions will determine how much effort you need to exert to protect the referrer_id.
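One possible reading of "a hash combined with a token that cannot be guessed" is a keyed hash (HMAC) over the user id, using a secret only the server knows. The sketch below assumes Python and integer user ids. `SERVER_SECRET`, `make_referral_code`, and `parse_referral_code` are hypothetical names, and the 16-character tag length is an arbitrary choice.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; in practice it would be loaded from configuration.
SERVER_SECRET = b"replace-with-a-long-random-secret"


def make_referral_code(user_id: int, secret: bytes = SERVER_SECRET) -> str:
    """Return '<user_id>-<tag>', where the tag is a truncated HMAC of the id."""
    tag = hmac.new(secret, str(user_id).encode("ascii"), hashlib.sha256).hexdigest()[:16]
    return f"{user_id}-{tag}"


def parse_referral_code(code: str, secret: bytes = SERVER_SECRET) -> Optional[int]:
    """Return the user id if the tag is valid for it, otherwise None."""
    user_id_text, _, _tag = code.partition("-")
    if not (user_id_text.isascii() and user_id_text.isdigit()):
        return None
    expected = make_referral_code(int(user_id_text), secret)
    # Constant-time comparison of the whole code, including the tag.
    if hmac.compare_digest(code.encode("utf-8"), expected.encode("utf-8")):
        return int(user_id_text)
    return None
```

An attacker who guesses another user's id cannot produce the matching tag without the secret, which is the property a plain unkeyed hash of the id lacks. If forged referrals would do no harm in your application, the bare user id may be enough.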

Does adding a constant string to the user's password before hashing it make it more secure?

Does adding a constant string that is stored in the code to the password before hashing make it harder for an attacker to figure out the original password?
This constant string is in addition to a salt. So, Hash(password + "string in code added to every password" + randomSaltForEachPassword)
Normally, if an attacker gets their hands on the database, they can possibly figure out someone's password by brute force. The database contains the salts corresponding to each password, so they would know what to salt their brute force attempts with. But, with the constant string in code, the attacker would also have to obtain the source code to know what to append to each of their brute force attempts.
I think it would be more secure, but I wanted to get other people's thoughts, and also make sure I'm not inadvertently making it less secure.
Given that you already have a random salt, appending some other string neither adds nor detracts from the security level.
Basically, it's just a waste of time.
update
This was getting a little long to use the comments.
First off, if the attacker has the database and the only thing you've hashed is the password, then it's game over anyhow. They have the data, which is the truly important part.
Second, the salt means that they have to create a larger rainbow table to encompass the larger password length possibilities. The time this takes becomes impractical depending on salt length and the resources available to the cracker. See this question for a bit more info:
How to implement password protection for individual files?
update 2
;)
It is true that users reuse passwords (as some of the latest hacked sites reveal) and it's good that you want to prevent your data loss from impacting them. However, once you finish reading this update you'll see why that's not entirely possible.
The other questions will have to be taken together. The entire purpose of a salt is to ensure that the same two passwords result in a different hash value. Each salt value would require a rainbow table to be created encompassing all of the password hash possibilities.
Therefore not using a salt value means that a single global rainbow table can be referenced. It also means that if you use just one salt value for all passwords on the site, then, again, they can create a single rainbow table and grab all of the passwords at once.
However, when each password has a separate salt value, they have to create a rainbow table for each salt value. Rainbow tables take time and resources to build. One thing that can help limit the time it takes to create a table is knowing the password length restrictions. For example, if your passwords must be between 7 and 9 characters, then the hacker only has to compute hash values in that range.
Now the salt value has to be available to the function that is going to hash a password attempt. Generally speaking you could hide this value elsewhere; but quite frankly if they've stolen the database then they'll be able to track it down pretty easily. So, placing the values next to the actual password has zero impact on security.
Adding an extra bit of characters that is common to ALL passwords adds nothing to the mix. Once a hacker cracks the first one it will be obvious that the others have this value and they can code their rainbow table generator accordingly. Meaning that it essentially saves no time. Further, it leads to a false sense of security on your part which can lead to you making bad choices.
Which leads us back to the purpose of salting passwords. The purpose is not to make it impossible, as anyone with time and resources can crack them. The purpose is to make it difficult and time consuming. The time consuming part is to allow you the time to detect the break in, notify everyone you have to, and enforce password changes in your system.
In other words, once the database is lost then all users should be notified so that they can take the appropriate action of changing their passwords on yours and other systems. The salt is just buying you and them time to do this.
The reason I mentioned "impractical" before with regards to cracking them is that the question is really one of the hacker determining the value of the passwords versus the cost in cracking them. Using reasonable salt values you can drive the computational costs up enough that very few hackers would bother. They tend to be low hanging fruit kind of people; unless you have a reason to be a target. At which point you should look into other forms of authentication.
This only helps if your threat model includes a situation in which your attacker somehow obtains your password database, but cannot read the secret key stored in your code. For most, this isn't a terribly likely scenario, so it's not worth catering for.
Even in that limited case, it doesn't gain you a great deal of additional security, as the attacker can simply take their own password, and iterate over all possible secret key values. Once they find the right one (because it hashes their own password correctly), they can use that to attack all the other passwords in the database as they would normally.
If you're concerned about storing passwords securely, you should use a standard scheme like PBKDF2, which uses key stretching to make brute forcing much less practical.
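For illustration, a minimal PBKDF2 sketch using Python's standard library (`hashlib.pbkdf2_hmac`). The iteration count, salt length, and function names are assumptions made for this example; the count in particular should be tuned to your own hardware and current guidance.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed work factor for this sketch; higher values slow down each guess.
DEFAULT_ITERATIONS = 600_000


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; a fresh random salt is used per password."""
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = DEFAULT_ITERATIONS) -> bool:
    """Recompute the derived key with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

The per-password salt is stored next to the derived key, as the answers above describe. The key stretching, not a constant string hidden in code, is what raises the cost of each brute-force attempt.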

Ultimate Hash Protection - Discussion of Concepts

Ok, so the whole problem with hashes is that users don't enter passwords over 15 characters long. Most only use 4-8 characters, making them easy for attackers to crack with a rainbow table.
Solution, use a user salt to make hash input more complex and over 50 chars so that they will never be able to generate a table (way too big for strings that size). Plus, they will have to create a new table for each user. Problem: if they download the db they will get the user salt, so you are back to square one if they care enough.
Solution, use a site "pepper" plus the user salt, then even if they get the DB they will still have to know the config file. Problem: if they can get into your DB, chances are they might also get into your filesystem and discover your site pepper.
So, with all of this known, let's assume that an attacker makes it into your site and gets everything, EVERYTHING. So what do you do now?
At this point in the discussion, most people reply with "who cares at this point?". But that is just a cheap way of saying "I don't know what to do next so it can't be that important". Sadly, everywhere else I have asked this question that has been the reply. Which shows that most programmers miss a very important point.
Let's imagine that your site is like the other 95% of sites out there and the user data, or even full server access, isn't worth squat. The attacker happens to be after one of your users, "Bob", because he knows that "Bob" uses the same password on your site as he does on the bank's site. He also happens to know Bob has his life savings in there. Now, if the attacker can just crack our site's hashes, the rest will be a piece of cake.
So here is my question: how do you extend the length of the password without any traceable path? Or how do you make the hashing process too complex to duplicate in a timely manner? The only thing that I have come up with is that you can re-hash a hash several thousand times and increase the time it would take to create the final rainbow table by a factor of 1,000. This is because the attacker must follow that same path when creating his tables.
Any other ideas?
"Solution, use a user salt to make hash input more complex and over 50 chars so that they will never be able to generate a table (way too big for strings that size). Plus, they will have to create a new table for each user. Problem: if they download the db they will get the user salt, so you are back to square one if they care enough."
This reasoning is fallacious.
A rainbow table (which is a specific implementation of the general dictionary attack) trades space for time. However, generating a dictionary (rainbow or otherwise) takes a lot of time. It is only worthwhile when it can be used against multiple hashes. Salt prevents this. The salt does not need to be secret, it just needs to be unpredictable for a given password. This makes the chance of an attacker having a dictionary generated for that particular salt negligibly small.
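A small Python sketch of why a per-user salt defeats a shared table: the same password stored under two different salts produces two unrelated digests, so a table built for one salt says nothing about the other. Plain SHA-256 is used here only to keep the illustration short; it is not a recommendation for password storage, and the names and sample password are made up.

```python
import hashlib
import secrets


def salted_hash(password: str, salt: bytes) -> str:
    """Hash salt + password; for illustration only, not a storage scheme."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users who happen to choose the same password.
salt_alice = secrets.token_bytes(16)
salt_bob = secrets.token_bytes(16)

hash_alice = salted_hash("hunter2", salt_alice)
hash_bob = salted_hash("hunter2", salt_bob)

if __name__ == "__main__":
    # The salts are stored in the clear, yet the two digests still differ.
    print(hash_alice)
    print(hash_bob)
```

Neither salt is secret, but an attacker now needs a separate precomputed table (or a separate brute-force run) for every user, which removes the space-for-time advantage the table was supposed to provide.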
"The only thing that I have come up with is that you can re-hash a hash several thousand times and increase the time it would take to create the final rainbowtable by a factor of 1,000."
Isn't that exactly what the Blowfish-based BCrypt hash is about? Increasing the time it takes to compute a hash so that brute force cracking (and rainbow table creation) becomes infeasible?
"We present two algorithms with adaptable cost (...)"
More about adaptable cost hashing algorithms: http://www.usenix.org/events/usenix99/provos.html
How about taking the "pepper" idea and implementing it on a separate server dedicated to hashing passwords - and locked down except for this one simple and secure-as-possible service - possibly even with rate-limits to prevent abuse. Gives the attacker one more hurdle to overcome, either gaining access to this server or reverse engineering the pepper, custom RNG and cleartext extension algorithm.
Of course, if they have access to EVERYTHING they could just eavesdrop on user activity for a little while.
uhmm... Okay, my take on this:
You can't get the original password back from a hash. If I have your hash, I may find a password that fits that hash, but I cannot log in to any other site that uses this password, assuming they all use salting. So no real issue here.
If someone gets your DB or even your site to get your config, you're screwed anyway.
For Admin or other Super Accounts, implement a second means of verification, e.g. limit logins to certain IP ranges, use client-side SSL certificates etc.
For normal users, you won't have much chance. Everything you do with their password needs to be stored in some config or database, so if I have your site, I have your magic snake oil as well.
Strong Password limitations don't always work. Some sites require passwords to have a numeric character - and as a result, most users add 1 to their usual password.
So I'm not entirely sure what you want to achieve here. Adding a salt to the front of the user's password and protecting Admin accounts with a second means of authentication seems to be the best way, given the fact that users simply don't pick proper passwords and can't be forced to either.
I was hoping that someone might have a solution, but sadly I am no better off than when I first posted the question. It seems that there is nothing that can be done but to find a time-costly algorithm or re-hash thousands of times to slow down the whole process of generating rainbow tables for (or brute-forcing) a hash.

Is MD5 less secure than SHA et al. in a practical sense?

I've seen a few questions and answers on SO suggesting that MD5 is less secure than something like SHA.
My question is, Is this worth worrying about in my situation?
Here's an example of how I'm using it:
On the client side, I'm providing a "secure" checksum for a message by appending the current time and a password and then hashing it using MD5. So: MD5(message+time+password).
On the server side, I'm checking this hash against the message that's sent using my knowledge of the time it was sent and the client's password.
In this example, am I really better off using SHA instead of MD5?
In what circumstances would the choice of hashing function really matter in a practical sense?
Edit:
Just to clarify - in my example, is there any benefit moving to an SHA algorithm?
In other words, is it feasible in this example for someone to send a message and a correct hash without knowing the shared password?
More Edits:
Apologies for repeated editing - I wasn't being clear with what I was asking.
Yes, it is worth worrying about in practice. MD5 is so badly broken that researchers have been able to forge fake certificates that matched a real certificate signed by a certificate authority. This meant that they were able to create their own fake certificate authority, and thus could impersonate any bank or business they felt like with browsers completely trusting them.
Now, this took them a lot of time and effort using a cluster of PlayStation 3s, and several weeks to find an appropriate collision. But once broken, a hash algorithm only gets worse, never better. If you care at all about security, it would be better to choose an unbroken hash algorithm, such as one of the SHA-2 family (SHA-1 has also been weakened, though not broken as badly as MD5 is).
edit: The technique used in the link that I provided you involved being able to choose two arbitrary message prefixes and a common suffix, from which it could generate for each prefix a block of data that could be inserted between that prefix and the common suffix, to produce a message with the same MD5 sum as the message constructed from the other prefix. I cannot think of a way in which this particular vulnerability could be exploited in the situation you describe, and in general, using a secure hash for message authentication is more resistant to attack than using it for digital signatures, but I can think of a few vulnerabilities you need to watch out for, which are mostly independent of the hash you choose.
As described, your algorithm involves storing the password in plain text on the server. This means that you are vulnerable to any information disclosure attacks that may be able to discover passwords on the server. You may think that if an attacker can access your database then the game is up, but your users would probably prefer that their passwords stay safe even if your server is compromised. Because of the proliferation of passwords online, many users use the same or similar passwords across services. Furthermore, information disclosure attacks may be possible even in cases when code execution or privilege escalation attacks are not.
You can mitigate this attack by storing the password on your server hashed with a random salt; you store the pair <salt,hash(password+salt)> on the server, and send the salt to the client so that it can compute hash(password+salt) to use in place of the password in the protocol you mention. This does not protect you from the next attack, however.
If an attacker can sniff a message sent from the client, he can do an offline dictionary attack against the client's password. Most users have passwords with fairly low entropy, and a good dictionary of a few hundred thousand existing passwords plus some time randomly permuting them could make finding a password given the information an attacker has from sniffing a message pretty easy.
The technique you propose does not authenticate the server. I don't know if this is a web app that you are talking about, but if it is, then someone who can perform a DNS hijack attack, or DHCP hijacking on an unsecured wireless network, or anything of the sort, can just do a man-in-the-middle attack in which they collect passwords in clear text from your clients.
While the current attack against MD5 may not work against the protocol you describe, MD5 has been severely compromised, and a hash will only ever get weaker, never stronger. Do you want to bet that you will find out about new attacks that could be used against you and will have time to upgrade hash algorithms before your attackers have a chance to exploit it? It would probably be easier to start with something that is currently stronger than MD5, to reduce your chances of having to deal with MD5 being broken further.
Now, if you're just doing this to make sure no one forges a message from another user on a forum or something, then sure, it's unlikely that anyone will put the time and effort in to break the protocol that you described. If someone really wanted to impersonate someone else, they could probably just create a new user name that has a 0 in place of a O or something even more similar using Unicode, and not even bother with trying to forge message and break hash algorithms.
If this is being used for something where the security really matters, then don't invent your own authentication system. Just use TLS/SSL. One of the fundamental rules of cryptography is not to invent your own. And then even for the case of the forum where it probably doesn't matter all that much, won't it be easier to just use something that's proven off the shelf than rolling your own?
In this particular case, I don't think that the weakest link in your application is using md5 rather than sha. The manner in which md5 is "broken" is that given that md5(K) = V, it is possible to generate K' such that md5(K') = V, because the output-space is limited (not because there are any tricks to reduce the search space). However, K' is not necessarily K. This means that if you know md5(M+T+P) = V, you can generate P' such that md5(M+T+P') = V, thus giving a valid entry. However, in this case the message still remains the same, and P hasn't been compromised. If the attacker tries to forge message M', with a T' timestamp, then it is highly unlikely that md5(M'+T'+P') = md5(M'+T'+P) unless P' = P. In which case, they would have brute-forced the password. If they have brute-forced the password, then it doesn't matter if you used sha or md5, since checking if md5(M+T+P) = V is equivalent to checking if sha(M+T+P) = V (except that sha might take a constant factor longer to calculate, which doesn't affect the complexity of the brute-force on P).
However, given the choice, you really ought to just go ahead and use sha. There is no sense in not using it, unless there is a serious drawback to using it.
A second thing is you probably shouldn't store the user's password in your database in plain-text. What you should store is a hash of the password, and then use that. In your example, the hash would be of: md5(message + time + md5(password)), and you could safely store md5(password) in your database. However, an attacker stealing your database (through something like SQL injection) would still be able to forge messages. I don't see any way around this.
Brian's answer covers the issues, but I do think it needs to be explained a little less verbosely.
You are using the wrong crypto algorithm here.
MD5 is wrong to use here, SHA-1 is wrong to use here, SHA-2xx is wrong to use, and Skein is wrong to use.
What you should be using is something like RSA.
Let me explain:
Your secure hash is effectively sending the password out for the world to see.
You mention that your hash is "time + payload + password"; if a third party gets a copy of your payload and knows the time, it can find the password (using a brute force or dictionary attack). So it's almost as if you are sending the password in clear text.
Instead of this, you should look at public key cryptography: have your server send out public keys to your agents, and have the agents encrypt the data with the public key.
No man in the middle will be able to tell what's in the messages, and no one will be able to forge the messages.
On a side note, MD5 is plenty strong most of the time.
It depends on how valuable the contents of the messages are. The SHA family is demonstrably more secure than MD5 (where "more secure" means "harder to fake"), but if your messages are twitter updates, then you probably don't care.
If those messages are the IPC layer of a distributed system that handles financial transactions, then maybe you care more.
Update: I should add, also, that the two digest algorithms are essentially interchangeable in many ways, so how much more trouble would it really be to use the more secure one?
Update 2: this is a much more thorough answer: http://www.schneier.com/essay-074.html
Yes, someone can send a message and a correct hash without knowing the shared password. They just need to find a string that hashes to the same value.
How common is that? In 2007, a group from the Netherlands announced that they had predicted the winner of the 2008 U.S. Presidential election in a file with the MD5 hash value 3D515DEAD7AA16560ABA3E9DF05CBC80. They then created twelve files, all identical except for the candidate's name and an arbitrary number of spaces following, that hashed to that value. The MD5 hash value is worthless as a checksum, because too many different files give the same result.
This is the same scenario as yours, if I'm reading you right. Just replace "candidate's name" with "secret password". If you really want to be secure, you should probably use a different hash function.
If you are going to generate a hash-based MAC, don't invent your own scheme. Use HMAC. There are issues with doing HASH(secret-key || message) and HASH(message || secret-key). If you are using a password as a key, you should also be using a key derivation function. Have a look at PBKDF2.
Yes, it is worth worrying about which hash to use in this case. Let's look at the attack model first. An attacker might not only try to generate values md5(M+T+P), but might also try to find the password P. In particular, if the attacker can collect tuples of values Mi, Ti, and the corresponding md5(Mi+Ti+P), then he/she might try to find P. This problem hasn't been studied as extensively for hash functions as finding collisions. My approach to this problem would be to try the same types of attacks that are used against block ciphers, e.g. differential attacks. And since MD5 is already highly susceptible to differential attacks, I can certainly imagine that such an attack could be successful here.
Hence I do recommend that you use a stronger hash function than MD5 here. I also recommend that you use HMAC instead of just md5(M+T+P), because HMAC has been designed for the situation that you describe and has accordingly been analyzed.
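As a hedged sketch of what replacing md5(M+T+P) with HMAC might look like in Python, using HMAC-SHA-256 from the standard library. The function names, the `|` separator between timestamp and message, and the integer timestamp are assumptions for this example. The key is assumed to be a shared secret, ideally derived from the password with a key derivation function rather than the raw password itself.

```python
import hashlib
import hmac


def sign(message: bytes, timestamp: int, key: bytes) -> str:
    """Return a hex HMAC-SHA-256 tag over the timestamp and the message."""
    # The timestamp has a fixed, unambiguous encoding and is separated from the message.
    data = str(timestamp).encode("ascii") + b"|" + message
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(message: bytes, timestamp: int, key: bytes, tag: str) -> bool:
    """Recompute the tag on the server and compare it in constant time."""
    expected = sign(message, timestamp, key)
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

A server would additionally reject timestamps that are too old to limit replay, which this sketch leaves out. It also does nothing about the offline dictionary attack on weak passwords described in the earlier answers.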
There is nothing insecure about using MD5 in this manner. MD5 was only broken in the sense that there are algorithms that, given a bunch of data A, can generate additional data B to create a desired hash. Meaning, if someone knows the hash of a password, they could produce a string that will result in that hash. Though, these generated strings are usually very long, so if you limit passwords to 20 or 30 characters you're still probably safe.
The main reason to use SHA1 over MD5 is that MD5 functions are being phased out. For example the Silverlight .Net library does not include the MD5 cryptography provider.
MD5 produces more collisions than SHA, which means someone can actually get the same hash from a different word (but this is rare).
The SHA family has been known for its reliability. SHA1 has been the standard for daily use, while SHA256/SHA512 have been a standard for government and bank applications.
For your personal website or forum, I suggest you consider SHA1, and if you create something more serious like commerce, I suggest you use SHA256/SHA512 (SHA2 family).
You can check wikipedia article about MD5 & SHA
Both MD5 and SHA-1 have cryptographic weaknesses. MD4 and SHA-0 are also compromised.
You can probably safely use MD6, Whirlpool, and RIPEMD-160.
See the following powerpoint from Princeton University, scroll down to the last page.
http://gcu.googlecode.com/files/11Hashing.pdf
I'm not going to comment on the MD5/SHA1/etc. issue, so perhaps you'll consider this answer moot, but something that amuses me very slightly is whenever the use of MD5 et al. for hashing passwords in databases comes up.
If someone's poking around in your database, then they might very well want to look at your password hashes, but it's just as likely they're going to want to steal personal information or any other data you may have lying around in other tables. Frankly, in that situation, you've got bigger fish to fry.
I'm not saying ignore the issue, and like I said, this doesn't really have much bearing on whether or not you should use MD5, SHA1 or whatever to hash your passwords, but I do get tickled slightly pink every time I read someone getting a bit too upset about plain text passwords in a database.
