Does anyone see any downsides of doing the following to prevent CSRF? - security

I'm wondering if the following method will completely prevent CSRF, and be compatible with all users.
Here it is:
In the form just include an extra parameter that is: encrypted(user's userID + request time). Server-side just decrypt and make sure it's the right userID and the request time was reasonably recent.
Aside from someone sniffing the user's traffic or breaking the encryption, is this completely secure? Are there any downsides?

While your approach is safe, it is not standard. The standard way to prevent CSRF attacks is to generate a pseudo-random number, include it both in a hidden form field and in a cookie, and then verify on the server side that the two values match. Take a look at this post.
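The cookie-plus-hidden-field check described above could look roughly like the following. This is only a minimal sketch: the function names `new_csrf_token` and `csrf_ok` are made up for illustration, and reading the cookie and form field from the request is left to whatever web framework is in use.

```python
import hmac
import secrets


def new_csrf_token():
    """Return an unpredictable token (32 random bytes, hex-encoded) from the OS CSPRNG."""
    return secrets.token_hex(32)


def csrf_ok(cookie_value, form_value):
    """Accept the request only if both copies are present and identical."""
    if not cookie_value or not form_value:
        return False
    # compare_digest takes time independent of where the first mismatch occurs
    return hmac.compare_digest(cookie_value.encode("utf-8"), form_value.encode("utf-8"))
```

The server would set the token as a cookie and embed the same value in the form; an attacking site can make the browser send the cookie but cannot read it to fill in the matching form field.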

One major downside is that your page will 'timeout' if the user leaves their browser open longer than the time frame you decide is reasonable before they post the form. I prefer sites not to rush their user into committing their action unless the action is inherently time-sensitive.

It's not completely secure, as it may be possible for an attacking site to guess the User ID.
If you use a per-session encryption key, it is secure. (But then all you need to do is send the raw key, and it's already secure)
Also, remember about timezones and inaccurate clocks.

It should work, yes. Though I would suggest you use a random number for UserID when you create new users, rather than a simple incrementing number (obviously, make sure it's unique when you create the user). That way, it's hard for an attacker to "guess".

Having the UserID and a DateTime is a start, but you also want a pseudo-random value, preferably with high entropy, in the canary token. Basically, you need to reduce the predictability of the token in a page in a given context. A token built only from the UserID and a DateTime could in theory be broken after some time, as it is not "random enough". Having said that, CSRF attacks are generally scripted and not directly monitored, so depending upon the exposure of your application, it may be enough.
Also, be sure to use a secure encryption algorithm such as Rijndael/AES, with a key length sufficient for the security of your application and a pseudo-random initialization vector.
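As an illustration of a token that carries the user ID, a timestamp and a random value, here is a rough sketch. Note that it swaps in a different technique from the one suggested above: instead of AES encryption it authenticates the token with an HMAC, because that is available in the Python standard library. `SERVER_KEY`, `MAX_AGE_SECONDS` and the `|` separator are assumptions for the example, and user IDs are assumed not to contain `|`.

```python
import hashlib
import hmac
import secrets
import time

SERVER_KEY = secrets.token_bytes(32)  # hypothetical; a real key would come from protected config
MAX_AGE_SECONDS = 3600                # hypothetical validity window


def make_token(user_id, now=None):
    """Build 'user|issued|nonce|tag' where tag is an HMAC over the first three fields."""
    issued = int(time.time() if now is None else now)
    nonce = secrets.token_hex(16)  # random part, so tokens are not predictable
    payload = f"{user_id}|{issued}|{nonce}"
    tag = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"


def check_token(token, user_id, now=None):
    """Verify the tag first, then the user ID and the age of the token."""
    parts = token.split("|")
    if len(parts) != 4:
        return False
    token_user, issued, nonce, tag = parts
    payload = f"{token_user}|{issued}|{nonce}"
    expected = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), tag.encode()):
        return False
    if token_user != user_id:
        return False
    current = time.time() if now is None else now
    return 0 <= current - int(issued) <= MAX_AGE_SECONDS
```

Only the server's clock is involved, so client time zones and clock drift do not matter, but the form still stops working once `MAX_AGE_SECONDS` has passed.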

The security system you have proposed is vulnerable to attack.
Block ciphers like AES are commonly used as very secure random number generators. They are called CSPRNGs. However, as with any random number generator, you have to worry about what you are seeding the algorithm with. In this case you are using the user's userID + request time, both of which the attacker can know. Your implementation doesn't mention a key or IV, so I assume they are NULL. The attacker is building the request, so he will always know the request time. The userID is likely a primary key; if you have 100 users, the attacker could forge 100 requests and one of them will work. But the attacker might just want to force the administrator to change his password, and admins usually have a primary key of 1.
Do not reinvent the wheel: very good random number generators have already been built, and there are also anti-CSRF libraries.

Related

Unique hash as authorization for endpoint

I've seen that companies sometimes send customized links that give access to some resource without logging in.
For example, a company sent me an email with a link to my invoices:
www.financial.service.com/<SOME_HASHED_VALUE>
There is no authorization behind this endpoint; they rely only on the fact that I am the only person who knows this hash value. I have a very similar case, but I have concerns:
Firstly, is it a good approach?
Secondly, how should I make this hash? SHA-512 on some random data?
This can be a completely valid approach, and is its own type of authentication. If constructed correctly, it proves that you have access to that email (it doesn't prove anything else, but it does prove that much).
These values often aren't hashes. They're often random, and that's their power. If they are hashes, they need to be constructed such that their output is "effectively random," so usually you might as well just make them random in the first place. For this discussion, I'll call it a "token."
The point of a token is that it's unpredictable, and extremely sparse within its search space. By unpredictable, I mean that even if I know exactly who the token is for, it should be effectively impossible (i.e. within practical time constraints) to construct a legitimate token for that user. So, for instance, if this were the hash of the username and a timestamp (even a millisecond timestamp), that would be a terrible token. I could guess those very quickly. So random is best.
By "sparse" I mean that out of all the possible tokens (i.e. strings of the correct length and format), a vanishingly small number of them should be valid tokens, and those valid tokens should be scattered across the search space randomly. For example, if the tokens were sequential, that would be terrible. If I had a token, I could find other tokens by simply increasing or decreasing the value by one.
So a good token looks like this:
Select a random, long string
Store it in your database, along with metadata about what it means, and a timestamp
When a user shows up with it, read the data from the database
After some period of time, expire the token by deleting it from the database (optional, but preferred)
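The four steps above can be sketched as follows. This is only an illustration: the in-memory dict `_tokens` stands in for the database table, and `issue_token`, `redeem_token` and `TOKEN_TTL_SECONDS` are hypothetical names chosen for the example.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 7 * 24 * 3600  # hypothetical lifetime: one week
_tokens = {}                       # stand-in for a database table keyed by token


def issue_token(metadata, now=None):
    """Pick a long random string and store it with its metadata and a timestamp."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoding
    created = time.time() if now is None else now
    _tokens[token] = {"meta": metadata, "created": created}
    return token


def redeem_token(token, now=None):
    """Return the stored metadata, or None if the token is unknown or expired."""
    row = _tokens.get(token)
    if row is None:
        return None
    current = time.time() if now is None else now
    if current - row["created"] > TOKEN_TTL_SECONDS:
        del _tokens[token]  # expire by deleting the row
        return None
    return row["meta"]
```

The token itself carries no information; everything it means lives in the stored row, so revoking access is just a delete.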
Another way to implement this kind of scheme is to encrypt the metadata (i.e. the userid, what page this goes to, a timestamp, etc.) and encode it in the URL. Then you don't need to store anything in a database, because it's right there in the URL. I don't usually like this approach because it requires a very high-value crypto key that you then have to protect on production servers, and that can be used to connect as anyone. Even if I have good ways to protect such a key (generally an attached HSM), I don't like such a key even existing. So generally I prefer a database. But for some applications, encrypting the data is better. Storing the metadata in the URL also significantly restricts how much metadata you can store, so again, tokens are nicer.
There is no authorization behind this endpoint; they rely only on the fact that I am the only person who knows this hash value.
Usually there is authorization before accessing the endpoint (you authenticated before receiving the invoices). I see it as a common way to share resources with external parties. We use a similar approach with expiring AWS S3 URLs.
Firstly, is it a good approach?
It depends on your use case. It works for sharing internal resources when you want options to control access (revoking access, time-based access, one-time access, ...).
Secondly, how should I make this hash? SHA-512 on some random data?
As long as SOME_HASHED_VALUE is not guessable and has a negligible collision probability (salted hash, long random unique value, ...), it should be OK.

Hashing of api keys in Phoenix/Elixir and using comeonin for this

In my system each user can have multiple API keys. I want to hash the API keys and store their hashes in a database. I'm using comeonin for this.
1) Is it sensible to store hashes of API keys rather than their plain, original values?
2) When an API request comes in, there's only a plain API key value in it and no user email along with it -- this is how my system is designed.
How should I check if an API key is valid? Will I have to do this -- recalculate a hash?
given_api_plain_key = get_key_from_request()
# re-hash it again
# but how about the original salt???
given_api_hash_key = Comeonin.Bcrypt.hashpwsalt(given_api_plain_key)
case Repo.get_by(ApiKey, key_hash: given_api_hash_key) do
  nil -> IO.puts("not found")
  a -> IO.puts("gooood")
end
Or is there a better way?
(1) is it sensible to store hashes of api keys rather than their plain, original values?
Your goal seems to be to protect against grand compromises that might happen if somebody gets access to your database (for example, via SQL injection, command injection, or reverse shell on your system). In this case, yes it is sensible, especially if each user has a different API key. However this link is worth reading for other considerations that might affect your decision.
(2) How should I check if an api key is valid?
It is clear that you need to hash the input and see if it matches something in your database.
(3) Implementation.
You do not want to apply the same protection that you use for passwords. Passwords tend to have low entropy by nature, and therefore we need tools like bcrypt to process them. Bcrypt is slow by design (to prevent brute force attacks), and uses salts to help the security of poorly chosen passwords.
API keys should not have low entropy, and therefore you do not need slow functions to process them. In fact, slow functions bring DoS risks, so you definitely do not want to run bcrypt on every request that comes in. Salting also complicates your use case (salts work when you know who is providing the input, but in your case you do not know who it is coming from beforehand).
Short answer: Just use SHA256, no salt needed.

Remembering authenticated users with cookies

In the past I have written a CMS where authenticated users are remembered across HTTP requests with two cookies:
User Token - A random, multi-character (say 10-digit long) alphanumeric string that relates back to an actual User ID in the database.
Authentication Token - A random, multi-character (say 100-digit long) alphanumeric string that, once hashed, must match the stored value for said User ID in the database.
My question (for a new CMS) is as follows:
What is the point of using two cookies? Wouldn't it be just as secure if I instead used a single 110-digit long token that, once hashed, must match the stored value for some User ID in the database? When a match of this token is found in the database, the related User ID would be considered the authenticated user.
User and Auth Tokens vs. Combined Token
The best reason to keep them separate, is to keep your code more manageable, portable, and playing nice with others.
Security
If security is your only concern, then it would be more secure to combine the user and auth tokens into a single encrypted token, if and only if both sequences were generated via a method which does not result in any particular character being more heavily weighted. The reason being that the act of combining the two values essentially acts as an additional simple encryption step, as well as being a larger encryption key, and thus more difficult to spoof or crack.
The weight of a character refers to how likely it is to occur. Many methods of hashing or encrypting data produce results that are very easily cracked (Excel's horribly insecure password protection, for one). The main reason is that when certain characters are more likely to appear, many can also be substituted for others, so the final encryption result has hundreds of thousands of unintended and unknown encryption keys. (Try out that Excel cracker for an example.)
Maintainability and Performance
However, there are some drawbacks to combining the two values, primarily performance and maintainability related.
Creating or collecting the combined key will require at least 2 extra steps every time.
Any need to get or set either value will require extracting all of the information.
You may no longer update the auth-token, without also updating the user-token.
This can cause severe issues later on if you ever expect to tie the user to other sessions, e.g. a Google login paired with your auth system.
Anyone else looking at your code will have to reverse engineer how you create the user-auth combo, if they intend to add any functionality, such as group-level permissions.
Conform already
I'm not much of a conformist myself. However, in many things the crowd will flow to the path of least resistance. More often than not, common practices become common only after experience showed that the other ways had bigger problems. This is one of those cases.
Minimal security impact plus needing one cookie instead of two, traded for a less portable and less performing platform. At the end of the day, it's your call.
Finally
It may be best to stop bothering with keeping both keys, and instead just use a unique session hash. Then, just pair old sessions to users, IFF they re-authenticate after expiration.
NEVER use cookies (even encrypted) to auto-login without having several other checks and balances in place. Even with extra checks, if you're storing confidential information (names, addresses, phone, email), the security is between you and your user, so be extra cautious.
At the end of the day, you're the architect, go whichever route best fits your platform and environment.

Is there any advantage to re-hashing stored passwords at login time?

I'm in the process of updating several projects from using various insecure/horribly insecure MD5-based password hashes. I'm now at least somewhat better informed on best practices, but I still wonder if I'm doing something wrong. I haven't seen the specific process I'm implementing used elsewhere, but at least one SO user seems to want to do something similar. In my case:
Password hashes are generated using bcrypt. (Since the proper options seem to be bcrypt, scrypt, or pbkdf2 and bcrypt was most easily accessible to me in PHP.)
A different, random, salt is used for each hash. (To prevent attackers from generating a custom rainbow table calculated with a single, static salt.)
The hash, algorithm settings, and salt are stored together. (Since that's what PHP's crypt function gives me for the hash value.)
After a successful login, the hash is re-calculated with a new random salt.
It's that last step that I'm wondering about. My intention here is to allow updates to the hashing algorithm as time passes, so users who regularly log in will have their passwords stored in the most secure format available.
My questions are:
Is this a waste of time?
Are there any dangers in doing this?
UPDATE
Re delnan's comment : If you are re-hashing the already hashed password, don't -- You never know what vulnerabilities may occur and be found in chaining up hashes. Obviously the other side of that is you need to compute the entire hash-chain every time you validate the user secret -- so just re-hash the cleartext.
ORIGINAL
I upvoted halfway through reading. It seems like you're someone who's asking the right kind of questions to be doing this kind of work.
Not a waste of time.
There are always dangers. Someone could obtain your users' passwords by torture or, more likely, social engineering. Someone could have access to vast resources and, along with your shadow password file, still manage to crack the passwords. Someone could compromise your server and secretly insert a trojan that intercepts the users' cleartext passwords at successful login.
So there is no guarantee of perfect security. Ever. But I'm sure you know that already. Which is why I'd like to add only one thing:
Encourage users to choose hard to crack passwords.
And, strictly speaking, if your only reason for rehashing at every login is so that passwords are always stored using the latest update then yes -- your method IS a waste of time, assuming you will not be updating your algorithm at every user's login. So there will be rehashes which use the same algorithm and (presumed) security for two logins in a row. A waste of a few clock cycles on rehashing. Strictly speaking it's not optimized. Why not just include an algo version in your password store, and at login rehash only if the system algo is newer than the user's hash algo?
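The "store a version and rehash only when it is outdated" idea might look like the sketch below. It is an illustration under assumptions: it uses PBKDF2 from the Python standard library instead of the bcrypt/PHP setup in the question, it treats the iteration count as the stored "version", and the record format and the names `hash_password`, `verify_and_upgrade` and `CURRENT_ITERATIONS` are invented for the example.

```python
import hashlib
import hmac
import secrets

CURRENT_ITERATIONS = 200_000  # hypothetical current work factor


def hash_password(password, iterations=CURRENT_ITERATIONS):
    """Return 'algo$iterations$salt$digest' with a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_and_upgrade(password, stored):
    """Return (ok, new_record); new_record is None unless the stored hash is outdated."""
    _algo, iters, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iters)
    )
    if not hmac.compare_digest(candidate, bytes.fromhex(digest_hex)):
        return False, None
    if int(iters) < CURRENT_ITERATIONS:
        # Rehash the cleartext password (never the old hash) with current settings
        return True, hash_password(password)
    return True, None
```

The caller would write `new_record` back to the user row only when it is not None, so logins under the current settings cost no extra hashing.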
UPDATE
Sorry. Completely missed your point about the use of newer algos. This is a good thing. :-) But as stated in my original answer below, when the algo stays the same it is useless.
ORIGINAL
Rehashing passwords is useless, because if an attacker has already got hold of the hash you aren't preventing anything.
Consider the following:
I am a user on your site with the hash: 1234567890.
Some attacker gets hold of that hash.
I log in again and the hash is changed.
The attacker doesn't care the hash changes because he only needs one hash to try to break.
So nothing has been prevented. The attacker still has the hash and can still try to break it. A possible attacker is only interested in the final result (the password) and not in the hashes.
If someone gains access to the hash, changing it every time will not help at all unless that person has access to every update and willingly starts over. This isn't going to happen, and if it did you would have a much bigger problem than that.
No, there is no danger in it, only a waste of server resources.
Actually, it prevents a novice attacker from copying a cookie into his browser just to impersonate the user: if the owner later logs in and the hash changes, the attacker is logged out, thereby reducing havoc on the user account.

Is MD5 less secure than SHA et al. in a practical sense?

I've seen a few questions and answers on SO suggesting that MD5 is less secure than something like SHA.
My question is, Is this worth worrying about in my situation?
Here's an example of how I'm using it:
On the client side, I'm providing a "secure" checksum for a message by appending the current time and a password and then hashing it using MD5. So: MD5(message+time+password).
On the server side, I'm checking this hash against the message that's sent using my knowledge of the time it was sent and the client's password.
In this example, am I really better off using SHA instead of MD5?
In what circumstances would the choice of hashing function really matter in a practical sense?
Edit:
Just to clarify - in my example, is there any benefit moving to an SHA algorithm?
In other words, is it feasible in this example for someone to send a message and a correct hash without knowing the shared password?
More Edits:
Apologies for repeated editing - I wasn't being clear with what I was asking.
Yes, it is worth worrying about in practice. MD5 is so badly broken that researchers have been able to forge fake certificates that matched a real certificate signed by a certificate authority. This meant that they were able to create their own fake certificate authority, and thus could impersonate any bank or business they felt like with browsers completely trusting them.
Now, this took them a lot of time and effort using a cluster of PlayStation 3s, and several weeks to find an appropriate collision. But once broken, a hash algorithm only gets worse, never better. If you care at all about security, it would be better to choose an unbroken hash algorithm, such as one of the SHA-2 family (SHA-1 has also been weakened, though not broken as badly as MD5 is).
Edit: The technique used in the link that I provided you involved being able to choose two arbitrary message prefixes and a common suffix, from which it could generate for each prefix a block of data that could be inserted between that prefix and the common suffix, to produce a message with the same MD5 sum as the message constructed from the other prefix. I cannot think of a way in which this particular vulnerability could be exploited in the situation you describe, and in general, using a secure hash for message authentication is more resistant to attack than using it for digital signatures, but I can think of a few vulnerabilities you need to watch out for, which are mostly independent of the hash you choose.
As described, your algorithm involves storing the password in plain text on the server. This means that you are vulnerable to any information disclosure attacks that may be able to discover passwords on the server. You may think that if an attacker can access your database then the game is up, but your users would probably prefer that, even if your server is compromised, their passwords are not. Because of the proliferation of passwords online, many users use the same or similar passwords across services. Furthermore, information disclosure attacks may be possible even in cases when code execution or privilege escalation attacks are not.
You can mitigate this attack by storing the password on your server hashed with a random salt; you store the pair <salt,hash(password+salt)> on the server, and send the salt to the client so that it can compute hash(password+salt) to use in place of the password in the protocol you mention. This does not protect you from the next attack, however.
If an attacker can sniff a message sent from the client, he can do an offline dictionary attack against the client's password. Most users have passwords with fairly low entropy, and a good dictionary of a few hundred thousand existing passwords plus some time randomly permuting them could make finding a password given the information an attacker has from sniffing a message pretty easy.
The technique you propose does not authenticate the server. I don't know if this is a web app that you are talking about, but if it is, then someone who can perform a DNS hijack attack, or DHCP hijacking on an unsecure wireless network, or anything of the sort, can just do a man-in-the-middle attack in which they collect passwords in clear text from your clients.
While the current attack against MD5 may not work against the protocol you describe, MD5 has been severely compromised, and a hash will only ever get weaker, never stronger. Do you want to bet that you will find out about new attacks that could be used against you and will have time to upgrade hash algorithms before your attackers have a chance to exploit it? It would probably be easier to start with something that is currently stronger than MD5, to reduce your chances of having to deal with MD5 being broken further.
Now, if you're just doing this to make sure no one forges a message from another user on a forum or something, then sure, it's unlikely that anyone will put the time and effort in to break the protocol that you described. If someone really wanted to impersonate someone else, they could probably just create a new user name that has a 0 in place of a O or something even more similar using Unicode, and not even bother with trying to forge message and break hash algorithms.
If this is being used for something where the security really matters, then don't invent your own authentication system. Just use TLS/SSL. One of the fundamental rules of cryptography is not to invent your own. And then even for the case of the forum where it probably doesn't matter all that much, won't it be easier to just use something that's proven off the shelf than rolling your own?
In this particular case, I don't think that the weakest link in your application is using md5 rather than sha. The manner in which md5 is "broken" is that given that md5(K) = V, it is possible to generate K' such that md5(K') = V, because the output-space is limited (not because there are any tricks to reduce the search space). However, K' is not necessarily K. This means that if you know md5(M+T+P) = V, you can generate P' such that md5(M+T+P') = V, thus giving a valid entry. However, in this case the message still remains the same, and P hasn't been compromised. If the attacker tries to forge message M', with a T' timestamp, then it is highly unlikely that md5(M'+T'+P') = md5(M'+T'+P) unless P' = P. In which case, they would have brute-forced the password. If they have brute-forced the password, then it doesn't matter whether you used sha or md5, since checking if md5(M+T+P) = V is equivalent to checking if sha(M+T+P) = V (except that sha might take a constant factor longer to calculate, which doesn't affect the complexity of the brute-force on P).
However, given the choice, you really ought to just go ahead and use sha. There is no sense in not using it, unless there is a serious drawback to using it.
A second thing is you probably shouldn't store the user's password in your database in plain-text. What you should store is a hash of the password, and then use that. In your example, the hash would be of: md5(message + time + md5(password)), and you could safely store md5(password) in your database. However, an attacker stealing your database (through something like SQL injection) would still be able to forge messages. I don't see any way around this.
Brian's answer covers the issues, but I do think it needs to be explained a little less verbosely.
You are using the wrong crypto algorithm here.
MD5 is wrong here, SHA1 is wrong to use here, SHA2xx is wrong to use, and Skein is wrong to use.
What you should be using is something like RSA.
Let me explain:
Your secure hash is effectively sending the password out for the world to see.
You mention that your hash is "time + payload + password". If a third party gets a copy of your payload and knows the time, it can find the password (using a brute force or dictionary attack). So, it's almost as if you are sending the password in clear text.
Instead of this, you should look at public key cryptography: have your server send out public keys to your agents, and have the agents encrypt the data with the public key.
No man in the middle will be able to tell what's in the messages, and no one will be able to forge the messages.
On a side note, MD5 is plenty strong most of the time.
It depends on how valuable the contents of the messages are. The SHA family is demonstrably more secure than MD5 (where "more secure" means "harder to fake"), but if your messages are twitter updates, then you probably don't care.
If those messages are the IPC layer of a distributed system that handles financial transactions, then maybe you care more.
Update: I should add, also, that the two digest algorithms are essentially interchangeable in many ways, so how much more trouble would it really be to use the more secure one?
Update 2: this is a much more thorough answer: http://www.schneier.com/essay-074.html
Yes, someone can send a message and a correct hash without knowing the shared password. They just need to find a string that hashes to the same value.
How common is that? In 2007, a group from the Netherlands announced that they had predicted the winner of the 2008 U.S. Presidential election in a file with the MD5 hash value 3D515DEAD7AA16560ABA3E9DF05CBC80. They then created twelve files, all identical except for the candidate's name and an arbitrary number of spaces following, that hashed to that value. The MD5 hash value is worthless as a checksum, because too many different files give the same result.
This is the same scenario as yours, if I'm reading you right. Just replace "candidate's name" with "secret password". If you really want to be secure, you should probably use a different hash function.
If you are going to generate a hash-based MAC, don't invent your own scheme. Use HMAC. There are issues with doing HASH(secret-key || message) and HASH(message || secret-key). If you are using a password as a key, you should also be using a key derivation function. Have a look at PBKDF2.
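A minimal sketch of that advice, using only the Python standard library. The function names, the `|` separator between timestamp and message, the iteration count and the choice of SHA-256 are all assumptions for illustration; how the salt is shared between client and server is left out.

```python
import hashlib
import hmac


def derive_key(password, salt, iterations=100_000):
    """Stretch a password into a 32-byte MAC key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def sign(key, message, timestamp):
    """Return a hex HMAC-SHA256 tag over the timestamp and the message bytes."""
    data = str(timestamp).encode("ascii") + b"|" + message
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key, message, timestamp, tag):
    """Recompute the tag and compare it in constant time."""
    expected = sign(key, message, timestamp)
    return hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8"))
```

This replaces the hand-rolled MD5(message+time+password) construction with a standard MAC; the server would still need to reject stale timestamps separately to limit replay.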
Yes, it is worth worrying about which hash to use in this case. Let's look at the attack model first. An attacker might not only try to generate values md5(M+T+P), but might also try to find the password P. In particular, if the attacker can collect tuples of values Mi, Ti, and the corresponding md5(Mi+Ti+P), then he/she might try to find P. This problem hasn't been studied as extensively for hash functions as finding collisions. My approach to this problem would be to try the same types of attacks that are used against block ciphers, e.g. differential attacks. And since MD5 is already highly susceptible to differential attacks, I can certainly imagine that such an attack could be successful here.
Hence I do recommend that you use a stronger hash function than MD5 here. I also recommend that you use HMAC instead of just md5(M+T+P), because HMAC has been designed for the situation that you describe and has accordingly been analyzed.
There is nothing insecure about using MD5 in this manner. MD5 was only broken in the sense that there are algorithms that, given a bunch of data A, can generate additional data B to create a desired hash. Meaning, if someone knows the hash of a password, they could produce a string that will result in that hash. However, these generated strings are usually very long, so if you limit passwords to 20 or 30 characters you're still probably safe.
The main reason to use SHA1 over MD5 is that MD5 functions are being phased out. For example the Silverlight .Net library does not include the MD5 cryptography provider.
MD5 produces more collisions than SHA, which means someone can actually get the same hash from a different word (though this is rare).
The SHA family has been known for its reliability: SHA1 has been the standard for daily use, while SHA256/SHA512 have been the standard for government and bank applications.
For your personal website or forum, I suggest you consider SHA1, and if you are building something more serious, like a commerce site, I suggest you use SHA256/SHA512 (the SHA2 family).
You can check wikipedia article about MD5 & SHA
Both MD5 and SHA-1 have cryptographic weaknesses. MD4 and SHA-0 are also compromised.
You can probably safely use MD6, Whirlpool, and RIPEMD-160.
See the following powerpoint from Princeton University, scroll down to the last page.
http://gcu.googlecode.com/files/11Hashing.pdf
I'm not going to comment on the MD5/SHA1/etc. issue, so perhaps you'll consider this answer moot, but something that amuses me very slightly is whenever the use of MD5 et al. for hashing passwords in databases comes up.
If someone's poking around in your database, then they might very well want to look at your password hashes, but it's just as likely they're going to want to steal personal information or any other data you may have lying around in other tables. Frankly, in that situation, you've got bigger fish to fry.
I'm not saying ignore the issue, and like I said, this doesn't really have much bearing on whether or not you should use MD5, SHA1 or whatever to hash your passwords, but I do get tickled slightly pink every time I read someone getting a bit too upset about plain text passwords in a database.
