Is using email hash as user id a good solution to ensure anonymity?

Imagine you want to create a "secure" messaging app which must comply with the following requirement:
If someone has access to the server databases, he/she cannot identify the user from the field you're using to substitute for the normal username / email.
This solution seems interesting.
But I wonder:
If there are any better (more secure) alternatives
What hashing mechanism one should use

Not really. Hashes are good for hiding secret information, like passwords. Email addresses, however, are usually quite easily guessed or googled, so an attacker could pre-generate a huge list of hashes from a database of email addresses and then use a reverse lookup to find out whether a given hash (on your system) matches one of the addresses in that database. That's putting aside the fact that hashes are not guaranteed to be unique (collisions are possible), which probably isn't a problem with a big enough hash address space.
Generally, if you want anonymous IDs, you should use randomly generated ones.
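To make the reverse-lookup point concrete, here is a minimal sketch in Python. The addresses, the choice of SHA-256 and the helper names (`email_hash`, `build_reverse_lookup`, `new_anonymous_id`) are assumptions for illustration only, not something from the question.

```python
import hashlib
import secrets


def email_hash(email: str) -> str:
    """Unsalted SHA-256 of a normalized email; an attacker can compute the same value."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def build_reverse_lookup(candidates):
    """Precompute hash -> email for a list of guessed or leaked addresses."""
    return {email_hash(address): address for address in candidates}


def new_anonymous_id() -> str:
    """Random ID that carries no information about the user (128 bits of randomness)."""
    return secrets.token_urlsafe(16)


# Hypothetical data: an ID stored on the server and an attacker's candidate list.
stored_id = email_hash("alice@example.com")
lookup = build_reverse_lookup(["bob@example.com", "alice@example.com"])
recovered = lookup.get(stored_id)  # the attacker gets the address back

# The alternative: an ID that cannot be derived from the email at all.
random_id = new_anonymous_id()
```

With a random ID there is nothing for the attacker to precompute; the trade-off is that you need some other way to map a login to that ID.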


Is there a hashing technique that works both ways?

The hashing function generates a different hash every time for the same piece of data, but it can determine if a particular hash was generated with the piece of data or not.
hash_func(xyz): abc123
hash_func(xyz): jhg342 // different hash, even if the data was same.
decode_hash(jhg342) == xyz
This gives true, because the hash function determined that jhg342 is indeed the hash of xyz
The Question
For an Open Source website, I want to store the email in hashed form (because all the users will be public), but the site needs to know if an email was used to register for another account so that it can ensure one account per email.
However, all the emails are from one organization only. This means, they all look exactly like This means anyone can run through all the UIDs and find out which hash belongs to which email, and thus, which person.
Therefore, is there a way to hash the email such that the hash can be checked against the email it belongs to, but hashing the same email twice does not generate the same hash?
P.S. Please note that I cannot use Salting as the site will be Open Source and the salt will be publicly available.
This doesn't make sense - you're conflating hashing and encryption in a very strange way. What you're describing wouldn't really be a cryptographically secure hash function. By definition, cryptographically secure hash functions are one way. In fact, if you could reverse it, there would be little point to using it at all because it would no longer be secure. This would make it possible to brute-force passwords and would "break" passwords that were used in multiple places.
Also, why would you want it to hash to different values each time? That's what you use a salt for.
If you want to be able to reverse it later, just use an encryption algorithm like AES. Even better, many databases even offer features for securely storing sensitive information; see, for example, SQL Server's Always Encrypted feature.
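For what it's worth, "a different hash every time, but still checkable against the input" is what a random per-record salt gives you. Below is a rough sketch using PBKDF2 from Python's standard library; the iteration count, the `salt$digest` storage format and the function names are my own assumptions. Note that the salt is stored next to the hash and is not a secret, so this does not stop anyone from enumerating a small, predictable set of emails, and a uniqueness check would have to test the new email against every stored record.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for your hardware


def hash_with_salt(value: str) -> str:
    """Return 'salt$digest' (both hex); the salt is random, so output differs per call."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def matches(value: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", value.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), digest_hex)


first = hash_with_salt("xyz")
second = hash_with_salt("xyz")  # same input, different stored value
```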

Store user IP, but make it non traceable

I am working on a project where users (in a given and relatively short time period) answer statements, and I would like to store the entries anonymously.
After the collection period is over, I would like to be able to run statistics on the answers. But it is very important that the users' answers cannot be traced back to a specific user/IP.
The reason that I would still like to store the IP, regardless of my desire for the users to be anonymous, is that I would like to exclude entries where the user (with malicious intent or by accident) takes the same test multiple times in a short span.
I have ruled out using encryption, as it is, to my limited knowledge, not possible to compare a large set of encrypted strings like that.
My current proposed method is then to store: the user agent, a session identifier and a hashed IP address.
Regarding the hashing method, I am thinking of using SHA-512 where the IP is prepended with a 16-character salt (same salt for all entries).
I know that when hashing simple and common strings, that sha512 and other hashing methods can be broken with tools like: and good old brute forcing.
My idea to guarantee user anonymity is then to delete the salt after the collection period is over, making it (to my knowledge) nearly impossible to brute-force the hashes, even if a malicious party got hold of my source code.
I know it seems like a low-tech solution, and that part of the security rests on my own action of actually deleting the salt, which I could in theory forget to do or change my mind about. But it is the only solution I could come up with.
Thanks in advance
Don't hash the IPs, HMAC them. That's conceptually the same as what you want to do (the HMAC key plays the role of your secret salt), but cryptographically robust.
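A minimal sketch of that idea with Python's standard `hmac` module. The key size, the use of SHA-256 and the function names are assumptions on my part; the IPs are documentation addresses.

```python
import hashlib
import hmac
import secrets


def new_collection_key() -> bytes:
    """Random secret key for one collection period; delete it when the period ends."""
    return secrets.token_bytes(32)


def ip_tag(ip: str, key: bytes) -> str:
    """Keyed tag of an IP address: equal IPs give equal tags under the same key."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()


key = new_collection_key()
tag_a = ip_tag("203.0.113.7", key)
tag_b = ip_tag("203.0.113.7", key)  # repeat visit, same tag, so duplicates can be found
tag_c = ip_tag("203.0.113.8", key)  # different visitor, different tag
```

While the key exists you can group entries by tag to find repeat submissions. Once the key is destroyed, someone holding only the tags has no practical way to test candidate IPs against them.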

Store Secure Username in Database

I would like to protect my users' username in an online service, as it may be personally identifying (e.g., an email-address), but am wondering if it's even possible...
My first inclination was to hash it (unsalted), but I am worried about possible hash collisions. I'm not so much worried about the probability of a collision in a SHA-256 (32-byte) hash, but more about the possibility that the class of usernames used could be especially prone to collisions.
I also looked into perfect hashes, but as the users can be added dynamically, that's going to be too hard to manage.
Another option I've thought of was that (when adding the user) if there were a hash collision, I would reply to the client with a request for another hash, and repeat until there was no collision. I'd repeat this process during log-in. However, I am also wondering if this actually makes it easier for an attacker, as they'd have more feedback about what hashes were successful, and if the database were compromised, all the additional hashes would make recovering the original value easier.
I was also considering encrypting the username using the username as a password, but I'm guessing this also suffers from collisions (because each entry has a unique password--two different plain-texts with two different passwords could result in the same cipher-text), so I'm thinking it's not worth exploring this further.
I don't really want to go with a custom username (where the user has to come up with something that hasn't been taken when they sign-up), as I'm expecting the user to very rarely use the service, and are likely to forget their username.
I'm currently thinking I will just go with the first idea of hashing once, and if there is a collision, have the password decide (and hope there's no collision there too--I could put a warning when the user signs-up saying that their username/password is not allowed because it will log them in as another user perhaps /S).
Is there any non-colliding way of creating a secure form of username?
Thank you.
Assuming we are talking about emails, as there aren't many other options usable for login names.
I was also considering encrypting the username using the username as a password, but I'm guessing this also suffers from collisions (because each entry has a unique password--two different plain-texts with two different passwords could result in the same cipher-text), so I'm thinking it's not worth exploring this further.
Collisions are the wrong thing to worry about here ...
Mandatory disclaimer: Encryption keys are not the same things as passwords. And encrypting the plainText with itself as the key is even worse.
The problem with encryption is that cipherTexts aren't searchable; i.e. you cannot check for uniqueness unless you decrypt all user records each time, so this just isn't sustainable - the cost of every check grows linearly with the number of user records.
That's because encryption makes use of IVs (Initialization Vectors; i.e. the equivalent of salts in password hashing), which results in a different cipherText even if you encrypt the same plainText twice using the same key.
However, it is very likely that you will need to encrypt those emails, because if you need to send out password reset links, notifications, etc., you'll need a two-way mechanism. You can't do these things with hashes, because they are one-way only.
There's a reason why every website couples its user accounts with email addresses, even if they are not the login names. :)
What you can do for login checks only is to store an HMAC (Hash-based Message Authentication Code) of the email.
HMACs look just like regular hashes, but are actually "keyed hashes" (i.e. you would use a key while hashing, similarly to encryption). And in addition to that, nobody has managed to find collisions with the HMAC construct so far, even with the now famously insecure MD5 (still, please use a modern algorithm; at least SHA-2).
I should note that HMACs aren't nearly as strong as password hashing algorithms, so your users' emails certainly won't be as strongly protected as their passwords, but it's not like there's anything else you can do about it, and it should be OK.
In summary, you'll need to have two separate cryptographic keys configured in your application - one for encryption, and one for the HMACs - and the following data stored:
userLoginLookup - HMAC of the email, using one of the two keys
userLoginMailer - cipherText of the email, using the second configured key
userPassword - a standard password hash; using bcrypt, PBKDF2 or scrypt
Note: Cryptography is always case-sensitive, so to accommodate lookups, you need to always normalize the email addresses first; i.e. make them all-lowercase or all-uppercase.
When a user attempts to log in, you compute HMAC(emailInput, hmacKey) and search for a match with the userLoginLookup field in your database.
When you need to send a notification or password reset email, you decrypt the userLoginMailer.
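The lookup part can be sketched like this in Python. The dictionary standing in for the users table, the hard-coded demo key and the helper names are all hypothetical; in a real application the key would come from configuration. The userLoginMailer encryption is left out because the standard library has no AES.

```python
import hashlib
import hmac

# Hypothetical key for the sketch only; load the real one from your configuration.
HMAC_KEY = b"demo-key-load-from-config-in-practice"

users = {}  # stand-in for the users table, keyed by userLoginLookup


def normalize_email(email: str) -> str:
    """Normalize before hashing so lookups are not case-sensitive."""
    return email.strip().lower()


def login_lookup(email: str, hmac_key: bytes) -> str:
    """HMAC-SHA256 of the normalized email, used as the searchable column."""
    message = normalize_email(email).encode("utf-8")
    return hmac.new(hmac_key, message, hashlib.sha256).hexdigest()


def register(email: str, password_hash: str) -> bool:
    """Create the account unless the same email is already registered."""
    lookup = login_lookup(email, HMAC_KEY)
    if lookup in users:
        return False
    users[lookup] = {"userPassword": password_hash}
    return True


def find_user(email: str):
    """Return the stored record for this email, or None."""
    return users.get(login_lookup(email, HMAC_KEY))
```

Because the HMAC is deterministic for a given key, the userLoginLookup column can carry a unique index and be searched like any other value.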

I have a simple database of content. Should I hash the "id" so that people don't look over it in the URL?

Is it recommended to create a column (unique key) that is a hash.
When people view my URL, it is currently like this:
But, people can look over this and data-mine all the content, right?
Is it RECOMMENDED to go 1 extra step to make this through hash?
The first and most important step is to use some form of role-based security to ensure that no user can see data they aren't supposed to see. So, for example, if a user should only see their own information, then you should check that the id belongs to the logged-in user before you display it.
As a second level of protection, it's not a bad idea to have a unique key that doesn't let you predict other keys (a hash, as you suggest, or a UUID). However, that still means that, for example, a malicious user who obtained someone else's URL (e.g. by sniffing a network, by reading a log file, by viewing the history in someone's browser) could see that user's information. You need authentication and authorization, not simply obfuscating the IDs.
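A rough sketch of both layers together, in Python. The in-memory `documents` store, the field names and the two functions are made up for illustration; the point is only that the owner check happens even when the ID is unguessable.

```python
import uuid

documents = {}  # hypothetical store: public_id -> record


def create_document(owner_id: int, body: str) -> str:
    """Store a record under a random UUID instead of a sequential integer."""
    public_id = uuid.uuid4().hex
    documents[public_id] = {"owner_id": owner_id, "body": body}
    return public_id


def read_document(public_id: str, current_user_id: int) -> str:
    """Authorization check first: knowing the ID alone is not enough."""
    record = documents.get(public_id)
    if record is None:
        raise KeyError("no such document")
    if record["owner_id"] != current_user_id:
        raise PermissionError("this document belongs to another user")
    return record["body"]
```

If the content is meant to be public anyway, the owner check does not apply and the random ID only serves to make bulk enumeration harder.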
It sort of depends on your situation, but offhand: if you think you need to hash, you probably do. If someone could data mine by, say, iterating through:
Then using a hash for the id is necessary to avoid this, since it will be much harder to figure out the next one. Keep in mind, though, that you don't want to make the hash too obvious, so that a determined attacker would easily figure it out, e.g. just taking the MD5 of 2134 or whatever number you had.
Well, the problem here is that an actual Hash is technically one way. So if you hash the data you won't be able to recover it on the receiving side. Without knowing what technology you are using to create your web page it's hard to make any concrete suggestions, but if you must have sensitive information in your query string then I would recommend that you at least use a symmetric encryption algorithm on it to keep people from simply reading off the values and reverse engineering things.
Of course if you have the option - it's probably better to not have that information in the query string at all.

Are two security keys better than one?

I just implemented a "remember me" feature for a user login on a website. Most advice was to have the userid stored in a cookie, and then have some long, unguessable random key. If both of these match up, the user is considered authenticated.
Does having two strings actually help? Wouldn't a longer key do exactly the same thing?
In other words, aren't two keys equally susceptible to attacks as one longer key? (I imagine it would be the total length of the keys, regardless of how many you have)
Note: There might be some DB query efficiency issues here too, e.g., looking up a big UUID in the DB is not as easy as looking up a small number. (On a tangential note, Gmail uses a six digit number as their one-time login token along with the username.)
Robust discussion of that in this SO thread.
... the user is considered authenticated.
Should probably read authenticated but with limited authorization.
Per comment: Somewhat more secure since it's one-time use and it's hard to guess. So if the cookie is compromised, the attacker has to act quickly or the token will be invalidated by the legitimate user logging in, whereas the userid may not change for a long time.
I'm no crypto expert, but as long as you check for brute-forcing attempts, you should be able to use a short key (like Gmail's 6 digits). The real vulnerability is people listening when a user logs in (e.g. SideJacking).
In sites I have previously created I made use of a user_id and a salted hash of the user's password. The primary reason I used two fields to authenticate a user is because it saved me the trouble of adding another table (and thus complicating the database design.) With the user_id also being stored in the cookie I could do an indexed look-up in users table and efficiently match the salted hash to the user. Of course you could concatenate both the user_id and the hash into one value and just store that in the cookie.
If you just have a random unguessable string then you would have to have a separate table to associate the random string with a user-id and do another look-up for that particular user.
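Here is a minimal sketch of the userid-plus-random-token variant in Python. The `remember_tokens` dictionary stands in for a database table, and storing only a SHA-256 hash of the token (rather than the token itself) is an extra precaution I am assuming, not something the answers above require.

```python
import hashlib
import hmac
import secrets

remember_tokens = {}  # hypothetical table: user_id -> SHA-256 hex of the current token


def issue_token(user_id: int) -> str:
    """Create a new random token, replacing any previous one, and return the cookie value."""
    token = secrets.token_urlsafe(32)
    remember_tokens[user_id] = hashlib.sha256(token.encode("ascii")).hexdigest()
    return f"{user_id}:{token}"


def check_cookie(cookie: str):
    """Return the user id if the cookie matches the stored token, otherwise None."""
    user_id_text, _, token = cookie.partition(":")
    try:
        user_id = int(user_id_text)
    except ValueError:
        return None
    expected = remember_tokens.get(user_id)  # indexed lookup by user id
    if expected is None:
        return None
    presented = hashlib.sha256(token.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(presented, expected):
        return None
    return user_id
```

The user id in the cookie adds no secrecy; it only tells the server which row to compare against. All of the unguessability comes from the random token, and issuing a fresh token on each login is what invalidates a stolen cookie.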
