Hashing a solr input field - security

I would like to facilitate searching on a field that we cannot index or store in non hashed or encrypted form. Is there a way to tell solr to hash (or encrypt) a speicfic field prior to comparing against the index?

In a nutshell, I don't think it's easy, and it depends on what level of security you need.
As a generic, simple solution, you could store the whole index in an encrypted file system, e.g. eCryptfs or TrueCrypt (see difference between block-level encryption and fs-level encryption)
Depending on how you need to search in this field, if you can get away with just hashing the values then the solution would be purely client-side, i.e. hashing the value client-side, sending it to Solr and getting back the results.
Some years ago there was a patch to enable field-level encryption in Lucene, but for some reason it was rejected. Still, maybe you can borrow some ideas from that patch...

Related

How to encrypt a astring in nodeJs that should be used as a an index

I am designing a microservice architecture using nodeJs and mongoDb. I have a usecase to save driver's license number which also be used to validate the user. Now, as DL number is PII, I don't want to save the string as is, I want to encrypt it before saving. I can use encryption logic to generate a common encrypted string everytime, so I can encrypt the DL number and do a lookup in db. But I am worried about the hackers can decrypt and get all DL numbers if they know the encryption logic for one. Can someone suggest me the best approach for this kind of use case?
Ignoring the index for a moment, it sounds like the best approach is to hash the license number using a keyed hash, and store the hash. This is similar to symmetric encryption in that you need to keep a secret, the key. However, it's one-way so attackers that obtain the secret will still need to brute-force each entry to obtain the number.
If the key is compromised, depending on the license number scheme, brute-forcing each number will vary in difficulty from easy to trivial. But, it's better than plaintext.
However, if you really need it as an index you have what appears to be conflicting priorities. I'll defer to someone else, I don't know much about DB indexing.
If it were me and I had time to spare I'd setup one table with the hashes and one with the plaintext license number as an index. Add 10 million rows (or some ceiling that's relevant to you) of test data and profile a few thousand random lookups of each one.

Hashing of api keys in Phoenix/Elixir and using comeonin for this

In my system each user can have multiple api keys. I want to hash api keys and store in a database their hashes. I'm using comeonin for this.
1) is it sensible to store hashes of api keys rather than their plain, original values?
2) when an api request comes in, there's only a plain api key value in it and no user email along with it -- this is my system is designed.
How should I check if an api key is valid? Will I have to do this -- recalculate a hash?
given_api_plain_key = get_key_from_request()
# re-hash it again
# but how about the original salt???
given_api_hash_key = Comeonin.Bcrypt.hashpwsalt(given_api_plain_key)
case Repo.get_by(ApiKey, key_hash: given_api_hash_key) do
nil -> IO.puts("not found")
a -> IO.puts("gooood")
end
Or is there a better way?
(1) is it sensible to store hashes of api keys rather than their plain, original values?
Your goal seems to be to protect against grand compromises that might happen if somebody gets access to your database (for example, via SQL injection, command injection, or reverse shell on your system). In this case, yes it is sensible, especially if each user has a different API key. However this link is worth reading for other considerations that might affect your decision.
(2) How should I check if an api key is valid?
It is clear that you need to hash the input and see if matches to something in your database.
(3) Implementation.
You do not want to apply the same protection that you use for passwords. Passwords tend to have low entropy by nature, and therefore we need tools like bcrypt to process them. Bcrypt is slow by design (to prevent brute force attacks), and uses salts to help the security of poorly chosen passwords.
API keys should not have low entropy, and therefore you do not need slow functions to process them. In fact, slow functions bring DoS risks, so you definitely do not want to do a bcrypt every request that comes in. Salting also complicates your use case (salts work when you know who is providing the input, but in your case you do not know who it is coming from beforehand).
Short answer: Just use SHA256, no salt needed.

Full text search on encrypted data

Suppose I have a server storing encrypted text (end-to-end: server never sees plain text).
I want to be able to do full text search on that text.
I know this is tricky, but my idea is to use the traditional full text design ("list" and "match" tables where words are stored and matched with ids from the content table). When users submit the encrypted text, they also send a salted MD5 of the words and respective matches. The salt used is unique for each user and is recovered from their password.
(in short: the only difference is that the "list" table will contain hashed words)
Now, how vulnerable would this system be?
Note that I said "how vulnerable" instead of "how safe", because I acknowledge that it can't be totally safe.
I DO understand the tradeoff between features (full text search) and security (disclosing some information from the word index). For example, I understand that an attacker able to get the list and match tables could get information about the original, encrypted text and possibly be able to decipher some words with statistical analysis (however, being the salt unique for each user, this would need to be repeated for each user).
How serious would this threat be? And would there be any other serious threats?
DISCLAIMER
What I'm trying to build (and with the help of a cryptographer for actual implementation; right now I'm just trying to understand wether this will be possible) is a consumer-grade product which will deal with confidential yet not totally secret data.
My goal is just to provide something safe enough, so that it would be easier for an attacker to try stealing users' passwords (e.g. breaching into clients - they're consumers, eventually) rather than spending a huge amount of time and computing power trying to brute force the index or run complicated statistical analysis.
Comments in response to #Matthew
(may be relevant for anyone else answering)
As you noted, other solutions are not viable. Storing all the data inside the client means that users cannot access their data from other clients. Server-side encryption would work, but then we won't be able to give users the added security of client-side encryption.The only "true alternative" is just to not implement search: while this is not a required feature, it's very important to me/us.
The salt will be protected in the exactly same way as the users' decryption key (the one used to decrypt stored texts). Thus, if someone was able to capture the salt, he or she would likely be able to capture also the key, creating a much bigger issue.To be precise, the key and the salt will be stored encrypted on the server. They will be decrypted by the client locally with the user's password and kept in memory; the server never sees the decrypted key and salt. Users can change passwords, then, and they just need to re-encrypt the key and the salt, and not all stored texts. This is a pretty standard approach in the industry, to my knowledge.
Actually, the design of the database will be as follow (reporting relevant entries only). This design is like you proposed in your comment. It disallows proximity searches (not very relevant to us) and makes frequency less accurate.
Table content, containing all encrypted texts. Columns are content.id and content.text.
Table words, containing the list of all hashes. Columns are words.id and words.hash.
Table match, that matches texts with hashes/words (in a one-to-many relationship). Columns are match.content_id and match.word_id.
We would have to implement features like removing stopwords etc. Sure. That is not a big issue (will, of course, be done on the client). Eventually, those lists have always been of limited utility for international (i.e. non English-speaking) users.
We expect the lookup/insert ratio to be pretty high (i.e. many lookups, but rare inserts and mostly in bulk).
Decrypting the whole hash database is certainly possible, but requires a brute force attack.
Suppose the salt is kept safe (as per point 2 above). If the salt is long enough (you cited 32 bits... but why not 320? - just an example) that would take A LOT of time.
To conclude... You confirmed my doubts about the possible risk of frequency analysis. However, I feel like this risk is not so high. Can you confirm that?
Indeed, first of all the salt would be unique per each user. This means that one user must be attacked at time.
Second, by reporting words only once per text (no matter how many times they appear), frequency analysis becomes less reliable.
Third... Frequency analysis on hashed words doesn't sound as something as good as frequency analysis on a Caesar-shift, for example. There are 250,000 words in English alone (and, again, not all our users will be English-speaking), and even if some words are more common than others, I believe it'd be hard to do this attack anyway.
PS: The data we'll be storing is messages, like instant messages. These are short, contain a lot of abbreviations, slang, etc. And every person has a different style in writing texts, further reducing the risk (in my opinion) of frequency attacks.
TL;DR: If this needs to be secure enough that it requires per-user end-to-end encryption: Don't do it.
Too long for a comment, so here goes - if I understand correctly:
You have encrypted data submitted by the user (client side encrypted, so not using the DB to handle).
You want this to be searchable to the user (without you knowing anything about this - so an encrypted block of text is useless).
Your proposed solution to this is to also store a list (or perhaps paragraph) of hashed words submitted from the client as well.
So the data record would look like:
Column 1: Encrypted data block
Column 2: (space) delimited hashed, ordered, individual words from the above encrypted text
Then to search you just hash the search terms and treat the hashed terms as words to search the paragraph(s) of "text" in column 2. This will definitely work - just consider searching nonsense text with nonsense search terms. You would even still be able to do some proximity ranking of terms with this approach.
Concerns:
The column with the individually hashed words as text will incredibly weak in comparison to the encrypted text. You are greatly weakening your solution as not only are there limited words to work from, the resultant text will be susceptible to word frequency analysis, etc.
If you do this: separately store a salt unrelated to the password. Given that a rainbow table will be easy to create if your salt is captured (dictionary words only) store it encrypted somewhere.
You will lose many benefits of FTS like ignoring words like 'the' - you will need to re-implement this functionality on your own if you want it (i.e. remove these terms on the client side before submitting the data / search terms).
Other approaches that you imply are not acceptable/workable:
Implement searching client side (all data has to exist on the client to search)
Centralized encryption leveraging the databases built in functionality
I understand the argument being that your approach provides the user with the only access to their data (i.e. you cannot see/decrypt it). I would argue that this hashed approach weakens the data sufficiently that you could reasonably work out a users data (that is, you have lowered the effort required to the point that it is very plausible you could decrypt a user's information without any knowledge of their keys/salts). I wouldn't quite lower the bar to describe this as just obfuscation, but you should really think through how significant this is.
If you are sure that weakening your system to implement searching like this makes sense, and the another approach is not sufficient, one thing that could help is to store the hashes of words in the text as a list of uniquely occuring words only (i.e. no frequency or proximity information would be available). This would reduce the attack surface area of your implementation a little, but would also lose the benefits you are implying you want by describing the approach as FTS. You could get very fast results like this though as the hashed words essentially become tags attached to all the records that include them. The search lookup then could become very fast (at the expense of your inserts).
*Just to be clear - I would want to be REALLY sure my business needs demanded something like this before I implemented it...
EDIT:
Quick example of the issues - say I know you are using 32-bit salts and are hashing common words like "the". 2^32 possible salts = 4 billion possible salts (that is, not that many if you only need to hash a handful of words for the initial attack). Assume the salt is appended or prepended, this still is only 8 billion entries to pre-calculate. Even if it is less common words you do not need to create too many lists to ensure you will get hits (if this is not the case your data would not be worth searching).
Then lookup the highest frequency salts for a given block of text in our each of our pre-calculated salt tables and use the match to see if it correctly decrypts other words in the text. Once you have a plausible candidate generate the 250,000 word English language rainbow table for that salt and decrypt the text.
I would guess you could decrypt the hashed data in the system in hours to days with access to the database.
First, you have all of the normal vulnerabilities of password-based cryptography, which stem from users picking predictable passwords. It is common to crack more than 50% of passwords from real-world applications in offline attacks with less than two hours of desktop computing time.
I assume the full text encryption key is derived from the password, or is encrypted by a password-derived key. So an attacker can test guesses against a selection of hashed index keys, and once she finds the password, decrypt all of the documents.
But, even if a user picks a high-entropy password, frequency analysis on the index could potentially reveal a lot about the plain text. Although word order is lost in indexing (if you don't support proximity searches), you are essentially creating an electronic code book for each user. This index would be vulnerable to centuries of well-developed cryptanalytical techniques. Modern encryption protocols avoid ECB, and provide "ciphertext indistinguishability"—the same plain text yields different cipher text each time it's encrypted. But that doesn't work with indexes.
A less vulnerable approach would be to index and search on the client. The necessary tables would be bundled as a single message and encrypted on the client, then transported to the server for storage. The obvious tradeoff is the cost of transmission of that bundle on each session. Client-side caching of index fragments could mitigate this cost somewhat.
In the end, only you can weigh the security cost of a breach against the performance costs of client-side indexing. But the statistical analysis enabled by an index is a significant vulnerability.
MSSQL Enterprise TDE encrypts Full-Text index as well as other indices when you set whole database encryption (Since 2008). in practice, it works pretty well, without a huge performance penalty. Can't comment on how, b/c it's a proprietary algo, but heres the docs.
https://learn.microsoft.com/en-us/sql/relational-databases/security/encryption/transparent-data-encryption-tde
it doesn't cover any of your application stack besides your db, but your FTS indices will work like normal and won't exist in plain text like they do in MySQL or PostGres. MariaDB and of course Oracle have their own implementation as well, from what i remember. MySQL and PGSQL do not.
As for passwords, TDE on all the implementations use AES keys, which can be rotated (though not always easily) - so the password vulnerability fall on the DBA's.
The problem is you need to pay for full enterprise licensing for MSSQL TDE (ie features not available in "standard" or "basic" cloud and on premise editions), and you do probably for TDE in Oracle as well. But if what you need is a quick solution and have the cash for enterprise licensing (probably cheaper than developing your own implementation), implementations are out there.

Are MongoDB ids guessable?

If you bind an api call to the object's id, could one simply brute force this api to get all objects? If you think of MySQL, this would be totally possible with incremental integer ids. But what about MongoDB? Are the ids guessable? For example, if you know one id, is it easy to guess other (next, previous) ids?
Thanks!
Update Jan 2019: As mentioned in the comments, the information below is true up until version 3.2. Version 3.4+ changed the spec so that machine ID and process ID were merged into a single random 5 byte value instead. That might make it harder to figure out where a document came from, but it also simplifies the generation and reduces the likelihood of collisions.
Original Answer:
+1 for Sergio's answer, in terms of answering whether they could be guessed or not, they are not hashes, they are predictable, so they can be "brute forced" given enough time. The likelihood depends on how the ObjectIDs were generated and how you go about guessing. To explain, first, read the spec here:
Object ID Spec
Let us then break it down piece by piece:
TimeStamp - completely predictable as long as you have a general idea of when the data was generated
Machine - this is an MD5 hash of one of several options, some of which are more easily determined than others, but highly dependent on the environment
PID - again, not a huge number of values here, and could be sleuthed for data generated from a known source
Increment - if this is a random number rather than an increment (both are allowed), then it is less predictable
To expand a bit on the sources. ObjectIDs can be generated by:
MongoDB itself (but can be migrated, moved, updated)
The driver (on any machine that inserts or updates data)
Your Application (you can manually insert your own ObjectID if you wish)
So, there are things you can do to make them harder to guess individually, but without a lot of forethought and safeguards, for a normal data set, the ranges of valid ObjectIDs should be fairly easy to work out since they are all prefixed with a timestamp (unless you are manipulating this in some way).
Mongo's ObjectId were never meant to be a protection from brute force attack (or any attack, for that matter). They simply offer global uniqueness. You should not assume that some object can't be accessed by a user because this user should not know its id.
For an actual protection of your resources, employ other techniques.
If you defend against an unauthorized access, place some authorization logic in your app (allow access to legitimate users, deny for everyone else).
If you want to hinder dumping all objects, use some kind of rate limiting. Combine with authorization if applicable.
Optional reading: Eric Lippert on GUIDs.

Reversing an MD5 Hash [duplicate]

Someone told me that he has seen software systems that:
retrieve MD5 encrypted passwords from other systems;
decrypt the encrypted passwords and
store the passwords in the database of the system using the systems own algorithm.
Is that possible? I thought that it wasn't possible / feasible to decrypt MD5 hashes.
I know there are MD5 dictionaries, but is there an actual decryption algorithm?
No. MD5 is not encryption (though it may be used as part of some encryption algorithms), it is a one way hash function. Much of the original data is actually "lost" as part of the transformation.
Think about this: An MD5 is always 128 bits long. That means that there are 2128 possible MD5 hashes. That is a reasonably large number, and yet it is most definitely finite. And yet, there are an infinite number of possible inputs to a given hash function (and most of them contain more than 128 bits, or a measly 16 bytes). So there are actually an infinite number of possibilities for data that would hash to the same value. The thing that makes hashes interesting is that it is incredibly difficult to find two pieces of data that hash to the same value, and the chances of it happening by accident are almost 0.
A simple example for a (very insecure) hash function (and this illustrates the general idea of it being one-way) would be to take all of the bits of a piece of data, and treat it as a large number. Next, perform integer division using some large (probably prime) number n and take the remainder (see: Modulus). You will be left with some number between 0 and n. If you were to perform the same calculation again (any time, on any computer, anywhere), using the exact same string, it will come up with the same value. And yet, there is no way to find out what the original value was, since there are an infinite number of numbers that have that exact remainder, when divided by n.
That said, MD5 has been found to have some weaknesses, such that with some complex mathematics, it may be possible to find a collision without trying out 2128 possible input strings. And the fact that most passwords are short, and people often use common values (like "password" or "secret") means that in some cases, you can make a reasonably good guess at someone's password by Googling for the hash or using a Rainbow table. That is one reason why you should always "salt" hashed passwords, so that two identical values, when hashed, will not hash to the same value.
Once a piece of data has been run through a hash function, there is no going back.
You can't - in theory. The whole point of a hash is that it's one way only. This means that if someone manages to get the list of hashes, they still can't get your password. Additionally it means that even if someone uses the same password on multiple sites (yes, we all know we shouldn't, but...) anyone with access to the database of site A won't be able to use the user's password on site B.
The fact that MD5 is a hash also means it loses information. For any given MD5 hash, if you allow passwords of arbitrary length there could be multiple passwords which produce the same hash. For a good hash it would be computationally infeasible to find them beyond a pretty trivial maximum length, but it means there's no guarantee that if you find a password which has the target hash, it's definitely the original password. It's astronomically unlikely that you'd see two ASCII-only, reasonable-length passwords that have the same MD5 hash, but it's not impossible.
MD5 is a bad hash to use for passwords:
It's fast, which means if you have a "target" hash, it's cheap to try lots of passwords and see whether you can find one which hashes to that target. Salting doesn't help with that scenario, but it helps to make it more expensive to try to find a password matching any one of multiple hashes using different salts.
I believe it has known flaws which make it easier to find collisions, although finding collisions within printable text (rather than arbitrary binary data) would at least be harder.
I'm not a security expert, so won't make a concrete recommendation beyond "Don't roll your own authentication system." Find one from a reputable supplier, and use that. Both the design and implementation of security systems is a tricky business.
Technically, it's 'possible', but under very strict conditions (rainbow tables, brute forcing based on the very small possibility that a user's password is in that hash database).
But that doesn't mean it's
Viable
or
Secure
You don't want to 'reverse' an MD5 hash. Using the methods outlined below, you'll never need to. 'Reversing' MD5 is actually considered malicious - a few websites offer the ability to 'crack' and bruteforce MD5 hashes - but all they are are massive databases containing dictionary words, previously submitted passwords and other words. There is a very small chance that it will have the MD5 hash you need reversed. And if you've salted the MD5 hash - this won't work either! :)
The way logins with MD5 hashing should work is:
During Registration:
User creates password -> Password is hashed using MD5 -> Hash stored in database
During Login:
User enters username and password -> (Username checked) Password is hashed using MD5 -> Hash is compared with stored hash in database
When 'Lost Password' is needed:
2 options:
User sent a random password to log in, then is bugged to change it on first login.
or
User is sent a link to change their password (with extra checking if you have a security question/etc) and then the new password is hashed and replaced with old password in database
Not directly. Because of the pigeonhole principle, there is (likely) more than one value that hashes to any given MD5 output. As such, you can't reverse it with certainty. Moreover, MD5 is made to make it difficult to find any such reversed hash (however there have been attacks that produce collisions - that is, produce two values that hash to the same result, but you can't control what the resulting MD5 value will be).
However, if you restrict the search space to, for example, common passwords with length under N, you might no longer have the irreversibility property (because the number of MD5 outputs is much greater than the number of strings in the domain of interest). Then you can use a rainbow table or similar to reverse hashes.
Not possible, at least not in a reasonable amount of time.
The way this is often handled is a password "reset". That is, you give them a new (random) password and send them that in an email.
You can't revert a md5 password.(in any language)
But you can:
give to the user a new one.
check in some rainbow table to maybe retrieve the old one.
No, he must have been confused about the MD5 dictionaries.
Cryptographic hashes (MD5, etc...) are one way and you can't get back to the original message with only the digest unless you have some other information about the original message, etc. that you shouldn't.
Decryption (directly getting the the plain text from the hashed value, in an algorithmic way), no.
There are, however, methods that use what is known as a rainbow table. It is pretty feasible if your passwords are hashed without a salt.
MD5 is a hashing algorithm, you can not revert the hash value.
You should add "change password feature", where the user gives another password, calculates the hash and store it as a new password.
There's no easy way to do it. This is kind of the point of hashing the password in the first place. :)
One thing you should be able to do is set a temporary password for them manually and send them that.
I hesitate to mention this because it's a bad idea (and it's not guaranteed to work anyway), but you could try looking up the hash in a rainbow table like milw0rm to see if you can recover the old password that way.
See all other answers here about how and why it's not reversible and why you wouldn't want to anyway.
For completeness though, there are rainbow tables which you can look up possible matches on. There is no guarantee that the answer in the rainbow table will be the original password chosen by your user so that would confuse them greatly.
Also, this will not work for salted hashes. Salting is recommended by many security experts.
No, it is not possible to reverse a hash function such as MD5: given the output hash value it is impossible to find the input message unless enough information about the input message is known.
Decryption is not a function that is defined for a hash function; encryption and decryption are functions of a cipher such as AES in CBC mode; hash functions do not encrypt nor decrypt. Hash functions are used to digest an input message. As the name implies there is no reverse algorithm possible by design.
MD5 has been designed as a cryptographically secure, one-way hash function. It is now easy to generate collisions for MD5 - even if a large part of the input message is pre-determined. So MD5 is officially broken and MD5 should not be considered a cryptographically secure hash anymore. It is however still impossible to find an input message that leads to a hash value: find X when only H(X) is known (and X doesn't have a pre-computed structure with at least one 128 byte block of precomputed data). There are no known pre-image attacks against MD5.
It is generally also possible to guess passwords using brute force or (augmented) dictionary attacks, to compare databases or to try and find password hashes in so called rainbow tables. If a match is found then it is computationally certain that the input has been found. Hash functions are also secure against collision attacks: finding X' so that H(X') = H(X) given H(X). So if an X is found it is computationally certain that it was indeed the input message. Otherwise you would have performed a collision attack after all. Rainbow tables can be used to speed up the attacks and there are specialized internet resources out there that will help you find a password given a specific hash.
It is of course possible to re-use the hash value H(X) to verify passwords that were generated on other systems. The only thing that the receiving system has to do is to store the result of a deterministic function F that takes H(X) as input. When X is given to the system then H(X) and therefore F can be recalculated and the results can be compared. In other words, it is not required to decrypt the hash value to just verify that a password is correct, and you can still store the hash as a different value.
Instead of MD5 it is important to use a password hash or PBKDF (password based key derivation function) instead. Such a function specifies how to use a salt together with a hash. That way identical hashes won't be generated for identical passwords (from other users or within other databases). Password hashes for that reason also do not allow rainbow tables to be used as long as the salt is large enough and properly randomized.
Password hashes also contain a work factor (sometimes configured using an iteration count) that can significantly slow down attacks that try to find the password given the salt and hash value. This is important as the database with salts and hash values could be stolen. Finally, the password hash may also be memory-hard so that a significant amount of memory is required to calculate the hash. This makes it impossible to use special hardware (GPU's, ASIC's, FPGA's etc.) to allow an attacker to speed up the search. Other inputs or configuration options such as a pepper or the amount of parallelization may also be available to a password hash.
It will however still allow anybody to verify a password given H(X) even if H(X) is a password hash. Password hashes are still deterministic, so if anybody has knows all the input and the hash algorithm itself then X can be used to calculate H(X) and - again - the results can be compared.
Commonly used password hashes are bcrypt, scrypt and PBKDF2. There is also Argon2 in various forms which is the winner of the reasonably recent password hashing competition. Here on CrackStation is a good blog post on doing password security right.
It is possible to make it impossible for adversaries to perform the hash calculation verify that a password is correct. For this a pepper can be used as input to the password hash. Alternatively, the hash value can of course be encrypted using a cipher such as AES and a mode of operation such as CBC or GCM. This however requires the storage of a secret / key independently and with higher access requirements than the password hash.
MD5 is considered broken, not because you can get back the original content from the hash, but because with work, you can craft two messages that hash to the same hash.
You cannot un-hash an MD5 hash.
There is no way of "reverting" a hash function in terms of finding the inverse function for it. As mentioned before, this is the whole point of having a hash function. It should not be reversible and it should allow for fast hash value calculation. So the only way to find an input string which yields a given hash value is to try out all possible combinations. This is called brute force attack for that reason.
Trying all possible combinations takes a lot of time and this is also the reason why hash values are used to store passwords in a relatively safe way. If an attacker is able to access your database with all the user passwords inside, you loose in any case. If you have hash values and (idealistically speaking) strong passwords, it will be a lot harder to get the passwords out of the hash values for the attacker.
Storing the hash values is also no performance problem because computing the hash value is relatively fast. So what most systems do is computing the hash value of the password the user keyed in (which is fast) and then compare it to the stored hash value in their user database.
You can find online tools that use a dictionary to retrieve the original message.
In some cases, the dictionary method might just be useless:
if the message is hashed using a SALT message
if the message is hash more than once
For example, here is one MD5 decrypter online tool.
The only thing that can be work is (if we mention that the passwords are just hashed, without adding any kind of salt to prevent the replay attacks, if it is so you must know the salt)by the way, get an dictionary attack tool, the files of many words, numbers etc. then create two rows, one row is word,number (in dictionary) the other one is hash of the word, and compare the hashes if matches you get it...
that's the only way, without going into cryptanalysis.
The MD5 Hash algorithm is not reversible, so MD5 decode in not possible, but some website have bulk set of password match, so you can try online for decode MD5 hash.
Try online :
MD5 Decrypt
md5online
md5decrypter
Yes, exactly what you're asking for is possible.
It is not possible to 'decrypt' an MD5 password without help, but it is possible to re-encrypt an MD5 password into another algorithm, just not all in one go.
What you do is arrange for your users to be able to logon to your new system using the old MD5 password. At the point that they login they have given your login program an unhashed version of the password that you prove matches the MD5 hash that you have. You can then convert this unhashed password to your new hashing algorithm.
Obviously, this is an extended process because you have to wait for your users to tell you what the passwords are, but it does work.
(NB: seven years later, oh well hopefully someone will find it useful)
No, it cannot be done. Either you can use a dictionary, or you can try hashing different values until you get the hash that you are seeking. But it cannot be "decrypted".
MD5 has its weaknesses (see Wikipedia), so there are some projects, which try to precompute Hashes. Wikipedia does also hint at some of these projects. One I know of (and respect) is ophrack. You can not tell the user their own password, but you might be able to tell them a password that works. But i think: Just mail thrm a new password in case they forgot.
In theory it is not possible to decrypt a hash value but you have some dirty techniques for getting the original plain text back.
Bruteforcing: All computer security algorithm suffer bruteforcing. Based on this idea today's GPU employ the idea of parallel programming using which it can get back the plain text by massively bruteforcing it using any graphics processor. This tool hashcat does this job. Last time I checked the cuda version of it, I was able to bruteforce a 7 letter long character within six minutes.
Internet search: Just copy and paste the hash on Google and see If you can find the corresponding plaintext there. This is not a solution when you are pentesting something but it is definitely worth a try. Some websites maintain the hash for almost all the words in the dictionary.
MD5 is a cryptographic (one-way) hash function, so there is no direct way to decode it. The entire purpose of a cryptographic hash function is that you can't undo it.
One thing you can do is a brute-force strategy, where you guess what was hashed, then hash it with the same function and see if it matches. Unless the hashed data is very easy to guess, it could take a long time though.
It is not yet possible to put in a hash of a password into an algorithm and get the password back in plain text because hashing is a one way thing. But what people have done is to generate hashes and store it in a big table so that when you enter a particular hash, it checks the table for the password that matches the hash and returns that password to you. An example of a site that does that is http://www.md5online.org/ . Modern password storage system counters this by using a salting algorithm such that when you enter the same password into a password box during registration different hashes are generated.
No, you can not decrypt/reverse the md5 as it is a one-way hash function till you can not found a extensive vulnerabilities in the MD5.
Another way is there are some website has a large amount of set of password database, so you can try online to decode your MD5 or SHA1 hash string.
I tried a website like http://www.mycodemyway.com/encrypt-and-decrypt/md5 and its working fine for me but this totally depends on your hash if that hash is stored in that database then you can get the actual string.

Resources