August 17, 2004

SHA-0 Broken, MD5 Rumored Broken

"Anyone who creates his or her own cryptographic primitive is either a genius or a fool. Given the genius/fool ratio for our species, the odds aren't very good."

- Bruce Schneier
So the Internet is a buzz with rumors of SHA0, SHA1 and MD5 being broken.

There is an email that started this, reporting that a collision was found in SHA0. Further to this, a paper was published yesterday showing collisions in MD4, MD5, HAVAL-128 and RIPEMD. This paper couldn't be verified originally, but this morning I found a post where a cryptographer was able to successfully reproduce the collision results on MD5. You can even download the source and verify it yourself if you are so inclined.

So what does this mean about the cryptography field? Nobody really knows. Is it still safe to buy your antique collection of Smurf memorabilia online? Absolutely. For now.

Theory and practice are two different things. Simply finding a collision doesn't make a cryptographic hash function (CHF) worthless. I think Edwin said it best when he says that the finding of a single collision in SHA-1 would not, by itself, cause much trouble, since one arbitrary collision won't do an attacker much good in practice. But history tells us that such discoveries are usually followed by a series of bigger discoveries that widen the breach, to the point that the broken primitive becomes unusable.

To be cryptographically sound, a CHF should have two main properties:

  1. Given a digest from a source, it must be essentially impossible to figure out what data generated that digest.
  2. It must be essentially impossible to find two different data values that have the same digest, which is commonly called a "collision".
A break in the SHA-1 primitive (which is one of the most used CHF, and is recommended by NIST) may lead for the need to refactor code using it, including applications that use technology like digital signatures. (Which could be ugly for the eCommerce world)

Of course right now this is just rumor, and hasn't been verified. Will be interesting to see what is published from the CRYPTO conference this week.

Posted by SilverStr at August 17, 2004 08:16 AM | TrackBack

perhaps someone should have taken a similar approach to this problem as the skeptic column in sciam a few months ago. The column discussed how one in a million miracles happen 295 times a day in America. It went on to explain that the sheer immensity of our society and the actions that we take in it are such that probability mathematics can show some interesting results(it can also get you thrown out of casinoes, but that's another story). SHA-1 hashing sees widespread use in p2p apps such as ed2k and bittorrent. MD5 is the hash of choice for download verification. The problem isn't that we've found a collision and the sky is falling because the primitive is bad, but rather that we simply have too much data and "actions" for the current implementations of the primitive to work properly. In quasi-pseudo english(similar, but completely different language), the amount of entropy that the current implementations of the primitive can handle are exceeded by the amount of entropy within the data that we want to use the primitive on. Solution: double or quadruple(or for you overkill types, x8) the size of the hash string. Unfortunately, as the size of any hash's given area of targets increases, it also becomes more apparent that the primitive isn't very good(you'll notice that it won't spread itself evenly). How many people use e-commerce, and in doing so, generate a SHA-1 hash? How many transactions do they do per day? How long do you want your function to last, if you factor in growth formulas?

Posted by: Steve at August 18, 2004 11:20 PM