MD5 hash reversing - security

I know it's not possible to reverse an MD5 hash back to its original value. But what about generating a set of random characters which would give the exact same value when hashed? Is that possible?

Finding a message that matches a given MD5 hash can happen in three ways:
You guess the original message. For passwords and other low entropy messages this is often relatively easy. That's why we use key-stretching in such situations. For sufficiently complex messages, this becomes infeasible.
You guess about 2^127 times and get a new message fitting that hash. This is currently infeasible.
You exploit a pre-image attack against that specific hash function, obtained by cryptanalyzing it. For MD5 there is one, with a work factor of 2^123, but that's still infeasible.
There is no efficient attack on MD5's pre-image resistance at the moment.
There are efficient collision attacks against MD5, but they only allow an attacker to construct two different messages with the same hash. They do not allow him to construct a message for a given hash.
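To make the first case concrete, here is a minimal sketch of guessing a low-entropy message from its MD5 hash. The word list and the function name `guess_preimage` are made up for illustration; a real attacker would use a list with millions of entries.

```python
import hashlib

def guess_preimage(target_hex, candidates):
    """Return the first candidate whose MD5 hex digest equals target_hex, else None."""
    for candidate in candidates:
        if hashlib.md5(candidate.encode("utf-8")).hexdigest() == target_hex:
            return candidate
    return None

# Hypothetical word list, purely for illustration.
wordlist = ["123456", "password", "letmein", "qwerty"]

# The attacker only sees the hash, not the original message.
target = hashlib.md5(b"letmein").hexdigest()
found = guess_preimage(target, wordlist)
print(found)
```

This only works because the message was in the list. For a long random message the loop would never finish in practice, which is the second case above.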

Yes it is possible to come up with a collision (since you map from a larger space to a smaller this is something that you can assume to happen eventually). Actually MD5 is already considered as "broken" in this respect.
From wiki:
However, it has since been shown that MD5 is not collision
resistant;[3] as such, MD5 is not suitable for applications like SSL
certificates or digital signatures that rely on this property. In
1996, a flaw was found with the design of MD5, and while it was not a
clearly fatal weakness, cryptographers began recommending the use of
other algorithms, such as SHA-1—which has since been found also to be
vulnerable. In 2004, more serious flaws were discovered in MD5, making
further use of the algorithm for security purposes
questionable—specifically, a group of researchers described how to
create a pair of files that share the same MD5 checksum.[4][5] Further
advances were made in breaking MD5 in 2005, 2006, and 2007.[6] In
December 2008, a group of researchers used this technique to fake SSL
certificate validity,[7][8] and US-CERT now says that MD5 "should be
considered cryptographically broken and unsuitable for further
use."[9] and most U.S. government applications now require the SHA-2
family of hash functions.[10]

In one sense, this is possible. If you have strings that are longer than the hash itself, then you will have collisions, so such a string will exist.
However, finding such a string would be equivalent to reversing the hash, as you would be finding a value that hashes to a particular hash, so it would not be any more feasible than reversing a hash any other way.

For MD5 specifically? Yes.
Several years ago, an article was published on an exploit of the MD5 hash that allowed easy generation of data which, when hashed, gave a desired MD5 hash (well, what they actually discovered was an algorithm to find sets of data with the same hash, but you get how that can be used the other way around). You can read an overview of the principle here. No similar algorithm has been found for SHA-2, although that may change in the future.

Yes, what you're talking about is called a collision. A collision in any hashing mechanism is when two different plaintexts create the same hash after being run through a hashing algorithm.

Related

Why doesn't having the code to the MD5 function help hackers break it?

I believe I can download the code to PHP or Linux or whatever and look directly at the source code for the MD5 function. Could I not then reverse engineer the encryption?
Here's the code - http://dollar.ecom.cmu.edu/sec/cryptosource.htm
It seems like any encryption method would be useless if "the enemy" has the code it was created with. Am I wrong?
That is actually a good question.
MD5 is a hash function -- it "mixes" input data in such a way that it should be unfeasible to do a number of things, including recovering the input given the output (it is not encryption, there is no key and it is not meant to be inverted -- rather the opposite). A handwaving description is that each input bit is injected several times in a large enough internal state, which is mixed such that any difference quickly propagates to the whole state.
MD5 has been public since 1992. There is no secret, and has never been any secret, to the design of MD5.
MD5 has been considered cryptographically broken since 2004, the year of publication of the first collision (two distinct input messages which yield the same output); it had been considered "weak" since 1996 (when some structural properties were found, which were believed to ultimately help in building collisions). However, there are other hash functions, which are as public as MD5 is, and for which no weakness is known yet: the SHA-2 family. Newer hash functions are currently being evaluated as part of the SHA-3 competition.
The really troubling part is that there is no known mathematical proof that a hash function may actually exist. A hash function is a publicly described efficient algorithm, which can be embedded as a logic circuit of a finite, fixed and small size. For the practitioners of computational complexity, it is somewhat surprising that it is possible to exhibit a circuit which cannot be inverted. So right now we only have candidates: functions for which nobody has found weaknesses yet, rather than functions for which no weakness exists. On the other hand, the case of MD5 shows that, apparently, getting from known structural weaknesses to actual collisions to attacks takes a substantial amount of time (weaknesses in 1996, collisions in 2004, applied collisions -- to a pair of X.509 certificates -- in 2008), so the current trend is to use algorithm agility: when we use a hash function in a protocol, we also think about how we could transition to another, should the hash function prove to be weak.
It is not an encryption, but a one way hashing mechanism. It digests the string and produces a (hopefully) unique hash.
If it were a reversible encryption, zip and tar.gz formats would be quite verbose. :)
The reason it doesn't help hackers too much (obviously knowing how one is made is beneficial) is that if they find a password to a system that is hashed, e.g. 2fcab58712467eab4004583eb8fb7f89, they need to know the original string used to create it, and also if any salt was used. That is because when you login, for obvious reasons, the password string is hashed with the same method as it is generated and then that resulting hash is compared to what is stored.
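As a rough sketch of that login check: the attempt is hashed with the same salt and method as the stored value, and the two hashes are compared. The function names are hypothetical, and the single SHA-256 pass is only there to keep the example short; it is not a recommendation for real password storage.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    # Illustrative only: one pass of a fast hash is weak for real password storage.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def check_login(attempt, salt, stored_hash):
    # Re-hash the attempt the same way and compare in constant time.
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)

salt = os.urandom(16)                       # stored next to the hash
stored = hash_password("correct horse", salt)

print(check_login("correct horse", salt, stored))
print(check_login("wrong guess", salt, stored))
```

Someone who steals `stored` and `salt` still has to find an input that hashes to it, which is why knowing the algorithm alone does not hand them the password.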
Also, many developers are migrating to bcrypt, which incorporates a work factor: if the hashing takes 1 second as opposed to .01 second, it greatly slows down generating a rainbow table for your application, and those old PHP sites using md5() become the low-hanging fruit.
Further reading on bcrypt.
One of the criteria of good cryptographic operations is that knowledge of the algorithm should not make it easier to break the encryption. So an encryption should not be reversible without knowledge of the algorithm and the key, and a hash function must not be reversible regardless of knowledge of the algorithm (the term used is "computationally infeasible").
MD5 and other hash functions (like SHA-1, SHA-256, etc.) perform a one-way operation on data that creates a digest or "fingerprint" that is usually much smaller than the plaintext. This one-way function cannot be reversed to retrieve the plaintext, even when you know exactly what the function does.
Likewise, knowledge of an encryption algorithm doesn't make it any easier (assuming a good algorithm) to recover plaintext from ciphertext. The reverse process is "computationally infeasible" without knowledge of the encryption key used.

What does "no collisions have been found yet for this hashing method" even mean?

I mean I don't need to look for the actual collisions, to know they exist. If there weren't collisions, then how would you have fixed-length results? That's why I don't understand what people mean when they claim 'md5 is insecure! someone found collisions!', or something like that.
The only thing I can think of, is that the collision search only looks for dictionary words, eg: If 'dog' and 'house' share the same hash, it would be a stupid hashing method IMO. It could also look for strings with a length < X, being X something between 5-10 (passwords that people could remember)
Am I totally wrong?
MD5 is a 128-bit hash, so there are 2^128 possible hashes. If the hash were perfect, then it would in theory require around 2^64 different hash attempts to find a collision (and you would have to store all 2^64 because each new hash would require comparison to all previous values). There isn't 2^64 bits of storage on the planet, so you would be safe.
The attacks on MD5 allow collisions to be found with significantly less than 2^64 hashes and significantly less than 128 x 2^64 bits of storage. That's why MD5 is considered broken.
Currently there are no similar attacks that work on full-strength SHA-1, but it's expected that such attacks will be publicly known within a few years.
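The square-root effect is easy to see on a deliberately shortened hash. The sketch below truncates MD5 to 24 bits (my own choice, so that it finishes quickly) and stores every digest it has seen until one repeats. The helper names are hypothetical.

```python
import hashlib

def truncated_md5(data, nbytes):
    """First nbytes of the MD5 digest, standing in for a small 'perfect' hash."""
    return hashlib.md5(data).digest()[:nbytes]

def find_collision(nbytes):
    """Birthday search: remember every truncated digest until one repeats."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        digest = truncated_md5(message, nbytes)
        if digest in seen:
            # Two different messages with the same truncated hash.
            return seen[digest], message, counter + 1
        seen[digest] = message
        counter += 1

first, second, attempts = find_collision(3)  # 3 bytes = 24-bit hash
print(first, second, attempts)
```

With 24 bits there are 2^24 possible values, yet a collision should usually turn up after something on the order of 2^12 attempts (a few thousand). Both the time and the storage grow as 2^(n/2), which is what puts a full 128-bit search out of reach without a shortcut.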
As you know, a collision is the term for the situation where two different things (e.g. documents) hash to the same value.
Clearly, collisions are always theoretically possible for a secure hashing algorithm. But the security of secure hashing comes from:
using a large domain of possible hash values, and
using a hashing algorithm with the property that trial and error is close to the best way to produce a document with a given hash.
If both of these criteria are satisfied, then the probability of someone being able to manufacture a collision for a given document is vanishingly small. This is sufficient to make it impractical to (for example) change the content of a document with a digital signature.
The problem is that clever people have figured out a way (or ways) that are a LOT faster than trial and error for creating documents whose MD5 signatures collide. Hence they can defeat digital signatures, and similar uses of MD5 to provide security.
FOLLOWUP
This quote comes from the Wikipedia page on MD5:
MD5 makes only one pass over the data, so if two prefixes with the same hash can be constructed, a common suffix can be added to both to make the collision more likely to be accepted as valid data by the application using it. Furthermore, current collision-finding techniques allow to specify an arbitrary prefix: an attacker can create two colliding files that both begin with the same content. All the attacker needs to generate two colliding files is a template file with a 128-byte block of data aligned on a 64-byte boundary that can be changed freely by the collision-finding algorithm.
I don't completely understand this, but it looks like a recipe for producing files with (different) meaningful content and the same signature.
In practice, it's not about whether a single sample was found, but about a method. These can be either based on some property "if you hash values of length N, ending with ..., etc. you will get the same hash" (silly example), or based on some algorithm "having this hash / value, this is how you get a new value with the same hash".
Collisions will of course always exist, but the interesting problem is how to find them. I'm not sure what is the source of that claim you quoted, but I'm pretty sure it was supposed to actually mean "no practical way to find collisions has been found yet for this hashing method".
When you see "No collisions found" for the SHA-256 hash, for example, it really means that no hash collisions have ever been found. You are right that theoretically collisions exist, and a SHA-256 collision may already have happened without anyone noticing, but this is irrelevant.
To find a collision by chance, you would need on average about 18 quintillion (2^64) hash attempts for an MD5 hash, and 340 undecillion (2^128) attempts for a SHA-256 hash, already accounting for the birthday problem.
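Those two figures are just 2^(n/2) for n = 128 and n = 256, which can be checked with a few lines of arithmetic (short-scale names: a quintillion is 10^18, an undecillion is 10^36):

```python
md5_bits = 128
sha256_bits = 256

# Birthday bound: roughly 2^(n/2) attempts for an n-bit hash.
md5_birthday = 2 ** (md5_bits // 2)        # 2^64
sha256_birthday = 2 ** (sha256_bits // 2)  # 2^128

print(md5_birthday // 10 ** 18, "quintillion")
print(sha256_birthday // 10 ** 36, "undecillion")
```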
As vy32 said, it is computationally infeasible to compute, store and compare so many hashes. So, in order to find a collision, you need a method that is many orders of magnitude faster than the random trial and error one. If there exists such a method for a secure hash, the hash is considered broken, at least in regards to general collision resistance.
So, to say "Someone found a collision in this xxxbit hash" is in fact synonymous with saying "A practical method of finding collisions was found for this hash, making it insecure". The alternative is a cosmically unlikely event, and would be reported in another way.

why should a good hash algorithm not allow attackers to find two messages producing the same hash?

I was reading wikipedia, and it says
Cryptographic hash functions are a third type of cryptographic algorithm.
They take a message of any length as input, and output a short,
fixed length hash which can be used in (for example) a digital signature.
For good hash functions, an attacker cannot find two messages that produce the same hash.
But why? What I understand is that you can put the long Macbeth story into the hash function and get a X long hash out of it. Then you can put in the Beowulf story to get another hash out of it again X long.
So since this function maps loads of things into a shorter length, there are bound to be overlaps, like I might put in the story of the Hobbit into the hash function and get the same output as Beowulf, ok, but this is inevitable right (?) since we are producing a shorter length output from our input? And even if the output is found, why is it a problem?
I can imagine if I invert it and get out Hobbit instead of Beowulf, that would be bad but why is it useful to the attacker?
Yes, of course there will be collisions for the reasons you describe.
I suppose the statement should really be something like this: "For good hash functions, an attacker cannot find two messages that produce the same hash, except by brute-force".
As for the why...
Hash algorithms are often used for authentication. By checking the hash of a message you can be (almost) certain that the message itself hasn't been tampered with. This relies on it being infeasible to find two messages that generate the same hash.
If a hash algorithm allows collisions to be found relatively easily then it becomes useless for authentication because an attacker could then (theoretically) tamper with a message and have the tampered message generate the same hash as the original.
Yes, it's inevitable that there will be collisions when mapping a long message onto a shorter hash, as the hash cannot contain all possible values of the message. For the same reason you cannot 'invert' the hash to uniquely produce either Beowulf or The Hobbit - but if you generated every possible text and filtered out the ones that had your particular hash value, you'd find both texts (amongst billions of others).
The article is saying that it should be hard for an attacker to find or construct a second message that has the same hash value as a first. Cryptographic hash functions are often used as proof that a message hasn't been tampered with - if even a single bit of data flips then the hash value should be completely different.
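The "single bit flips, hash changes completely" behaviour can be observed directly. This is a small sketch using SHA-256 from Python's hashlib (my choice of function; the message and the helper `bit_difference` are made up for the example).

```python
import hashlib

def bit_difference(a, b):
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"The quick brown fox"
# Flip the lowest bit of the first byte: 'T' (0x54) becomes 'U' (0x55).
tampered = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(tampered).digest()
changed = bit_difference(d1, d2)

print(bit_difference(original, tampered), "input bit changed")
print(changed, "of", len(d1) * 8, "digest bits changed")
```

For a well-behaved hash you would expect roughly half of the 256 output bits to differ, so the tampered message cannot be passed off under the original hash.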
A couple of years back, Dutch researchers demonstrated weaknesses in MD5 by publishing a hash of their "prediction" for the US presidential election. Of course, they had no way of knowing the outcome in advance - but with the computational power of a PS3 they constructed a PDF file for each candidate, each with the same hash value. The implications for MD5 - already on its way down - as a trusted algorithm for digital signatures became even more dire...
Cryptographic hashes are used for authentication. For instance, peer-to-peer protocols rely heavily on them. They use them to make sure that an ill-intentioned peer cannot spoil the download for everyone else by distributing packets that contain garbage. The torrent file that describes a download contains the hashes for each block. With this check in place, the victim peer can find out that he has been handled a corrupted block and download it again from someone else.
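A toy version of that per-block check might look like the following. The 4-byte block size, the payload and the function names are invented for the example, and SHA-1 is used only as a stand-in for whatever hash the protocol specifies.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, hypothetical block size so the example stays readable

def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each block separately, as a download description file would list them."""
    return [hashlib.sha1(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def verify_block(index, block, expected_hashes):
    """Accept a received block only if its hash matches the published one."""
    return hashlib.sha1(block).digest() == expected_hashes[index]

payload = b"Beowulf!Hobbit!!"
expected = block_hashes(payload)          # published alongside the download

print(verify_block(0, b"Beow", expected))  # genuine first block
print(verify_block(0, b"Hobb", expected))  # garbage from a bad peer
```

A peer that wants to slip in a different block has to find one with the same hash as the published entry, which is a second-preimage problem rather than a plain collision.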
The attacker would like to replace Beowulf by Hobbit to increase Saxon poetry's visibility, but the cryptographic hash that is used in the protocol won't let him.
If it is easy to find collisions then the attacker could create malicious data, and simply prepend it with dummy data until the collision is found. The hash check would then pass for the malicious data. That is why collisions should only be possible via brute force and be as rare as possible.
Alternatively collisions are also a problem with Certificates.

When is it safe to use a broken hash function?

It is trivial to use a secure hash function like SHA-256, and continuing to use MD5 for security is reckless behavior. However, there are some complexities to hash function vulnerabilities that I would like to better understand.
Collisions have been generated for MD4 and MD5. According to NIST, MD5 is not a secure hash function. It only takes 2^39 operations to generate a collision and should never be used for passwords. However SHA-1 is vulnerable to a similar collision attack in which a collision can be found in 2^69 operations, whereas brute force is 2^80. No one has generated a SHA-1 collision and NIST still lists SHA-1 as a secure message digest function.
So when is it safe to use a broken hash function? Even though a function is broken it can still be "big enough". According to Schneier a hash function vulnerable to a collision attack can still be used as an HMAC. I believe this is because the security of an HMAC is dependent on its secret key and a collision cannot be found until this key is obtained. Once you have the key used in an HMAC it's already broken, so it's a moot point. What hash function vulnerabilities would undermine the security of an HMAC?
Let's take this property a bit further. Does it then become safe to use a very weak message digest like MD4 for passwords if a salt is prepended to the password? Keep in mind the MD4 and MD5 attacks are prefixing attacks, and if a salt is prepended then an attacker cannot control the prefix of the message. If the salt is truly a secret, and isn't known to the attacker, then does it matter if it's appended to the password? Is it safe to assume that an attacker cannot generate a collision until the entire message has been obtained?
Do you know of other cases where a broken hash function can be used in a security context without introducing a vulnerability?
(Please post supporting evidence because it is awesome!)
Actually collisions are easier than what you list on both MD5 and SHA-1. MD5 collisions can be found in time equivalent to 2^26.5 operations (where one "operation" is the computation of MD5 over a short message). See this page for some details and an implementation of the attack (I wrote that code; it finds a collision within an average of 14 seconds on a 2.4 GHz Core2 x86 in 64-bit mode).
Similarly, the best known attack on SHA-1 is in about 2^61 operations, not 2^69. It is still theoretical (no actual collision was produced yet) but it is within the realm of the feasible.
As for implications on security: hash functions are usually said to have three properties:
No preimage: given y, it should not be feasible to find x such that h(x) = y.
No second preimage: given x1, it should not be feasible to find x2 (distinct from x1) such that h(x1) = h(x2).
No collision: it should not be feasible to find any x1 and x2 (distinct from each other) such that h(x1) = h(x2).
For a hash function with an n-bit output, there are generic attacks (which work regardless of the details of the hash function) in 2^n operations for the two first properties, and 2^(n/2) operations for the third. If, for a given hash function, an attack is found, which, by exploiting special details of how the hash function operates, finds a preimage, a second preimage or a collision faster than the corresponding generic attack, then the hash function is said to be "broken".
However, not all usages of hash functions rely on all three properties. For instance, digital signatures begin by hashing the data which is to be signed, and then the hash value is used in the rest of the algorithm. This relies on the resistance to preimages and second preimages, but digital signatures are not, per se, impacted by collisions. Collisions may be a problem in some specific signature scenarios, where the attacker gets to choose the data that is to be signed by the victim (basically, the attacker computes a collision, has one message signed by the victim, and the signature becomes valid for the other message as well). This can be counteracted by prepending some random bytes to the signed message before computing the signature (the attack and the solution were demonstrated in the context of X.509 certificates).
HMAC security relies on another property that the hash function must fulfill; namely, that the "compression function" (the elementary brick on which the hash function is built) acts as a Pseudo-Random Function (PRF). Details on what a PRF is are quite technical, but, roughly speaking, a PRF should be indistinguishable from a Random Oracle. A random oracle is modeled as a black box which contains a gnome, some dice and a big book. On some input data, the gnome selects a random output (with the dice) and writes down in the book the input message and the output which was randomly selected. The gnome uses the book to check whether he already saw the same input message: if so, then the gnome returns the same output as previously. By construction, you can know nothing about the output of a random oracle on a given message until you try it.
The random oracle model allows the HMAC security proof to be quantified in invocations of the PRF. Basically, the proof states that HMAC cannot be broken without invoking the PRF a huge number of times, and by "huge" I mean computationally infeasible.
Unfortunately, we do not have random oracles, so in practice we must use hash functions. There is no proof that hash functions really exist, with the PRF property; right now, we only have candidates, i.e. functions for which we cannot prove (yet) that their compression functions are not PRF.
If the compression function is a PRF then the hash function is automatically resistant to collisions. That's part of the magic of PRF. Therefore, if we can find collisions for a hash function, then we know that the internal compression function is not a PRF. This does not turn the collisions into an attack on HMAC. Being able to generate collisions at will does not help in breaking HMAC. However, those collisions demonstrate that the security proof associated with HMAC does not apply. The guarantee is void. That's just the same as with a laptop computer: opening the case does not necessarily break the machine, but afterwards you are on your own.
In the Kim-Biryukov-Preneel-Hong article, some attacks on HMAC are presented, in particular a forgery attack on HMAC-MD4. The attack exploits the shortcomings of MD4 (its "weaknesses") which make it a non-PRF. Variants of the same weaknesses were used to generate collisions on MD4 (MD4 is thoroughly broken; some attacks generate collisions faster than the computation of the hash function itself!). So the collisions do not imply the HMAC attack, but both attacks feed on the same source. Note, though, that the forgery attack has cost 2^58, which is quite high (no actual forgery was produced, the result is still theoretical) but substantially lower than the resistance level expected from HMAC (with a robust hash function with an n-bit output, HMAC should resist up to 2^n work factor; n = 128 for MD4).
So, while collisions do not per se imply weaknesses on HMAC, they are bad news. In practice, collisions are a problem for very few setups. But knowing whether collisions impact a given usage of hash functions is tricky enough, that it is quite unwise to keep on using a hash function for which collisions were demonstrated.
For SHA-1, the attack is still theoretical, and SHA-1 is widely deployed. The situation has been described like this: "The alarm is on, but there is no visible fire or smoke. It is time to walk towards the exits -- but not to run."
For more information on the subject, begin by reading the chapter 9 of the Handbook of Applied Cryptography, by Menezes, van Oorschot and Vanstone, a must-read for the apprentice cryptographer (not to be confused with "Applied Cryptography" by B. Schneier, which is a well-written introduction but nowhere as thorough as the "Handbook").
The only time it is safe to use a broken hash function is when the consequences of a collision are harmless or trivial, e.g. when assigning files to a bucket on a filesystem.
When you don't care whether it's safe or not.
Seriously, it doesn't take any extra effort to use a secure hash function in pretty much every language, and performance impact is negligible, so I don't see why you wouldn't.
[Edit after actually reading your question]
According to Schneier a hash function vulnerable to a collision attack can still be used as an HMAC. I believe this is because the security of an HMAC is dependent on its secret key and a collision cannot be found until this key is obtained.
Actually, it's essentially because being able to generate a collision for a hash does not necessarily help you generate a collision for the hash-of-a-hash (combined with the XORing used by HMACs).
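To show what "hash-of-a-hash combined with XORing" looks like, here is a hand-rolled HMAC following the standard construction (key XORed with 0x36 for the inner hash and 0x5C for the outer one), checked against Python's own hmac module. I use SHA-256 rather than MD4 or MD5 purely for the demonstration; the key and message are made up.

```python
import hashlib
import hmac

def manual_hmac_sha256(key, message):
    """HMAC built by hand: H((K ^ opad) || H((K ^ ipad) || message))."""
    block_size = 64  # SHA-256 processes 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()     # long keys are hashed first
    key = key.ljust(block_size, b"\x00")       # then zero-padded to a full block
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner).digest()

key = b"hypothetical secret key"
msg = b"transfer 100 to alice"
tag = manual_hmac_sha256(key, msg)

# Should agree with the library implementation.
print(tag == hmac.new(key, msg, hashlib.sha256).digest())
```

The attacker never gets to pick the inputs to either hash call freely, because both start with a block derived from the secret key. In real code you would of course call the hmac module rather than write this yourself.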
Does it then become safe to use a very weak message digest like md4 for passwords if a salt is prepended to the password?
No, not if the hash has a preimage attack which allows you to prepend data to the input. For instance, if the hash was H(pass + salt), we'd need a preimage attack which allows us to find pass2 such that H(pass2 + salt) = H(pass + salt).
There have been append attacks in the past, so I'm sure prepend attacks are possible.
Download sites use MD5 hash as a checksum to determine if the file was corrupted during download, and I would say a broken hash is good enough for that purpose.
Let's say that a MITM decides to modify the file (say a zip archive, or an exe). Now, the attacker has to do two things -
Find a hash collision and create a modified file out of it
Ensure that the newly created file is also a valid exe or a zip archive
With a broken hash, 1 is a bit easier. But ensuring that the collision simultaneously meets other known properties of the file is too expensive computationally.
This is totally my own answer, and I could be terribly wrong.
The answer entirely depends on what you're using it for. If you need to prevent somebody producing a collision within a few milliseconds I'd be less worried than if you need to prevent somebody producing a collision within a few decades.
What problem are you actually trying to solve?
Most of the worry about using something like MD4 for a password is related less to currently known attacks, than to the fact that once it has been analyzed to the point that collision generation is easy, it is generally presumed to be considerably more likely that somebody will be able to use that knowledge to create a preimage attack -- and when/if that happens, essentially all possible uses of that hash function become vulnerable.

What algorithm should I use to hash passwords into my database? [duplicate]

Is there anything available that isn't trivially breakable?
This 2008 answer is now dangerously out of date. SHA (all variants) is now trivially breakable, and best practice is now (as of Jan 2013) to use a key-stretching hash (like PBKDF2) or ideally a RAM intensive one (like Bcrypt) and to add a per-user salt too.
Points 2, 3 and 4 are still worth paying attention to.
See the IT Security SE site for more.
Original 2008 answer:
Use a proven algorithm. SHA-256 uses 64 characters in the database, but with an index on the column that isn't a problem, and it is a proven hash and more reliable than MD5 and SHA-1. It's also implemented in most languages as part of the standard security suite. However don't feel bad if you use SHA-1.
Don't just hash the password, but put other information in it as well. You often use the hash of "username:password:salt" or similar, rather than just the password, but if you play with this then you make it even harder to run a dictionary attack.
Security is a tough field, do not think you can invent your own algorithms and protocols.
Don't write logs like "[AddUser] Hash of GeorgeBush:Rep4Lyfe:ASOIJNTY is xyz"
First rule of cryptography and password storage is "don't invent it yourself," but if you must here is the absolute minimum you must do to have any semblance of security:
Cardinal rules:
Never store a plain text password (which means you can never display or transmit it either.)
Never transmit the stored representation of a password over an unsecured line (either plain text, encoded or hashed).
Speed is your enemy.
Regularly reanalyze and improve your process as hardware and cryptanalysis improve.
Cryptography and process is a very small part of the solution.
Points of failure include: storage, client, transmission, processing, user, legal warrants, intrusion, and administrators.
Steps:
Enforce some reasonable minimum password requirements.
Change passwords frequently.
Use the strongest hash you can get - SHA-256 was suggested here.
Combine the password with a fixed salt (same for your whole database).
Combine the result of previous step with a unique salt (maybe the username, record id, a guid, a long random number, etc.) that is stored and attached to this record.
Run the hash algorithm multiple times - like 1000+ times. Ideally include a different salt each time with the previous hash. Speed is your enemy and multiple iterations reduces the speed. Every so often double the iterations (this requires capturing a new hash - do it next time they change their password.)
Oh, and unless you are running SSL or some other line security then don't allow your password to be transmitted in plain text. And if you are only comparing the final hash from the client to your stored hash then don't allow that to be transmitted in plain text either. You need to send a nonce (number used once) to the client and have them hash that with their generated hash (using the steps above), and then they send you that one. On the server side you run the same process and see if the two one-time hashes match. Then dispose of them. There is a better way, but that is the simplest one.
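A bare-bones sketch of that nonce exchange, with both sides simulated in one script. All names here are hypothetical, and the single SHA-256 call that produces `stored_hash` merely stands in for the salted, iterated steps listed above.

```python
import hashlib
import hmac
import secrets

def make_nonce():
    """Server side: a fresh random value, used for one login attempt only."""
    return secrets.token_bytes(16)

def response(password_hash, nonce):
    """Hash the nonce together with the password hash (done by client and server)."""
    return hashlib.sha256(nonce + password_hash).digest()

def server_verify(stored_hash, nonce, client_response):
    return hmac.compare_digest(response(stored_hash, nonce), client_response)

# Placeholder for the result of the salted, iterated hashing steps.
stored_hash = hashlib.sha256(b"per-user-salt" + b"hunter2").digest()

nonce = make_nonce()                          # sent to the client
client_reply = response(stored_hash, nonce)   # computed by the client, sent back

print(server_verify(stored_hash, nonce, client_reply))
```

A reply captured on the wire is useless for a later login because the next attempt gets a different nonce. Note that in this simple scheme the stored hash itself acts as the secret, which is presumably one reason the answer says there is a better way.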
CodingHorror had a great article on this last year. The recommendation at the end of the article is bcrypt.
Also see: https://security.stackexchange.com/questions/4781/do-any-security-experts-recommend-bcrypt-for-password-storage/6415#6415
The aforementioned algorithms are cryptographically secure hashing algorithms (but MD5 isn't considered to be secure today).
However, there are algorithms that were specifically created to derive keys from passwords. These are the key derivation functions. They are designed for use with symmetric ciphers, but they are good for storing passwords too. PBKDF2, for example, uses a salt, a large number of iterations, and a good hash function. If you have a library that implements it (e.g. .NET), I think you should consider it.
Add a unique salt to the hashed password value (store the salt value in the db). When a unique salt is used the benefit of using a more secure algorithm than SHA1 or MD5 is not really necessary (at that point it's an incremental improvement, whereas using a salt is a monumental improvement).
Use a strong cryptographic hash function like MD5 or SHA1, but make sure you use a good salt, otherwise you'll be susceptible to rainbow table attacks.
Update Jan 2013
The original answer is from 2008, and things have moved a bit in the last 5 years. The ready availability of cloud computing and powerful parallel-processor graphics cards means that passwords up to 8 or 9 characters hashed as MD5 or SHA1 are now trivially breakable.
Now a long salt is a must, as is something tougher like SHA512.
However all SHA variant hashes are designed for communication encryption - messages back and forth where every message is encrypted, and for this reason they are designed to be fast.
In the password hashing world this design is a big disadvantage, as the quicker the hash is to generate, the less time it takes to generate large numbers of hashes.
A fast hash like SHA512 can be generated millions, even billions of times a second. Throw in cheap parallel processing and trying every possible permutation of a password becomes feasible.
Key-stretching is one way to combat this. A key-stretching algorithm (like PBKDF2) applies a quicker hash (like SHA512) thousands of times, typically causing the hash generation to take 1/5 of a second or so. Someone logging in won't notice, but if you can only generate 5 hashes per second brute force attacks are much tougher.
Secondly there should always be a per-user random salt. This can be randomly generated as the first n bytes of the hash (which are then stripped off and added to the password text to be checked before building the hashes to compare) or as an extra DB column.
So:
What algorithm should I use to hash passwords into my database?
Key-stretching to slow down hash generation. I'd probably go with PBKDF2.
Per-user salt means a new attack per user, and some work figuring out how to get the salt.
Computing power and availability are going up exponentially - chances are these rules will change again in another 4 years. If you need future-proof security I'd investigate bcrypt/scrypt style hashes - these take the slower key-stretching algorithms and add a step that uses a lot of RAM to generate the hash. Using so much RAM reduces the effectiveness of cheap parallel processors.
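For what it's worth, a minimal sketch of PBKDF2 with a per-user salt using Python's standard library looks like this. The iteration count and the function names are my own assumptions; pick a count that matches your hardware and whatever current guidance says.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value for the example; tune to your hardware

def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, iterations, derived_key); store all three per user."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each user
    derived = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return salt, iterations, derived

def verify_password(password, salt, iterations, expected):
    derived = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(derived, expected)

salt, iterations, stored = hash_password("correct horse battery staple")

print(verify_password("correct horse battery staple", salt, iterations, stored))
print(verify_password("Tr0ub4dor&3", salt, iterations, stored))
```

Storing the iteration count next to the salt lets you raise it later and re-hash each user's password the next time they log in.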
Original Sept 2008 (left in so comments make sense)
MD5+salt or SHA1+salt is not 'trivially breakable' - most hacks depend on huge rainbow tables and these become less useful with a salt [update, now they are].
MD5+salt is a relatively weak option, but it isn't going to be easily broken [update, now it is very easy to break].
SHA2 goes all the way up to 512 - that's going to be pretty impossible to crack with readily available kit [update, pretty easy up to 9 char passwords now] - though I'm sure there's a Cray in some military bunker somewhere that can do it [You can now rent this 'Cray' from Amazon]
MD5 or SHA in combination with a randomly generated salt value for every entry
As mentioned earlier, simple hashing algorithms should not be used. Here is the reason why:
http://arstechnica.com/security/2012/08/passwords-under-assault/
so use something else such as http://msdn.microsoft.com/en-us/library/system.security.cryptography.rfc2898derivebytes.aspx
All hashing algorithms are vulnerable to a "dictionary attack". This is simply where the attacker has a very large dictionary of possible passwords, and they hash all of them. They then see if any of those hashes match the hash of the password they want to decrypt. This technique can easily test millions of passwords. This is why you need to avoid any password that might be remotely predictable.
But, if you are willing to accept the threat of a dictionary attack, MD5 and SHA1 would each be more than adequate. SHA1 is more secure, but for most applications this really isn't a significant improvement.
MD5 / SHA1 hashes are both good choices. MD5 is slightly weaker than SHA1.

Resources