Is SHA1 better than MD5 only because it generates a hash of 160 bits?
It is well known that SHA1 is recommended more than MD5 for hashing since MD5 is practically broken as lot of collisions have been found. With the birthday attack, it is possible to get a collision in MD5 with 264 complexity and with 280 complexity in SHA1
It is known that there are algorithms that are able to crack both of these in far lesser time than it takes for a birthday attack.
My question is: is MD5 considered insecure only for this reason that it is easy to produce collisions? Because looking at both, producing collisions in SHA1 is not that difficult either. So what makes SHA1 better?
I was under the impression that it is possible to determine a collision given 2^80 possible inputs. Also, [link]http://eprint.iacr.org/2008/469.pdf is a theoretical attack on SHA1. Correct me if am wrong.
There are theoretical attacks. But nobody managed to pull off an actual collision yet. Still I certainly wouldn't trust SHA-1's collision resistance, since it's likely the next practical attack against a popular crypto primitive we'll see. But it hasn't happened yet (at least publicly).
@CodesInChaos Took 5 years: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Producing SHA-1 collisions is not that easy. It seems reasonable that the attack with has been described on SHA-1 really works with an average cost of 261, much faster than the generic birthday attack (which is in 280), but still quite difficult (doable, but expensive).
That being said, we do not really know what makes hash functions resistant (see for instance this answer for a detailed discussion). With a lot of hand-waving, I could claim that SHA-1 is more robust than MD5 because it has more rounds and because the derivation of the 80 message words in SHA-1 is much more "mixing" than that of MD5 (in particular the 1-bit rotation, which, by the way, is the only difference between SHA-0 and SHA-1, and SHA-0 collisions have been produced).
For more of the same, look at SHA-256, which is much more "massive" (many more operations than SHA-1, yet with a similar structure), and currently unbroken. It is as if there was a minimal amount of operations for a hash function to be secure, for a given structure (but there I am moving my hands at stupendous speed, so don't believe that I said anything really scientific or profound).
The first collision was created today :) https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
No. It's not just the length of the output. There are significant differences in their level of security against cryptanalytic attacks.
There are devastating collision attacks on MD5. (The Wikipedia article on MD5 has some details.) These attacks mean that MD5 provides essentially no security against collisions: it is easy to find collisions in MD5.
In contrast, SHA1 appears to be much more secure. While there are some known attacks on SHA1, they are much less serious than the attacks on MD5. (The Wikipedia article on SHA1 has an overview.) For this reason, SHA1 is a much better choice than MD5 in many settings.
These days, instead of using MD5 or SHA1, you're probably even better off to use one of the more modern hash functions, like SHA256. Those have no known attacks of any practical relevance.
But certainly don't use MD5 in any setting where collision-resistance is needed, as that aspect of MD5 is completely broken.
When you say [quote]There are significant differences in their level of security against cryptanalytic attacks.[/quote], do you mean these are the implementation flaws in the md5 hash generation algorithm which makes it insecure ?
@asudhak: the flaws are not in the implementation of MD5, but in its specification. Resistance of MD5 depends on its structure, and it turned out that the structure of MD5 is not as resistant as was initially believed.
@asudhak By the way, I don't know what context you're asking this question in, but **do not** use a hash like MD5 or SHA256 for passwords. See how to securely hash passwords? and how to store salt? for more detailed explanations.
oh no, am not a developer... I'm more of a consultant and I wanted a concrete reason as to why SHA1 > MD5 rather than just accept blindly the fact...
@JMCF125, pretty much what the phrase would make you think. Roughly speaking, there are no known attacks. More precisely, there might be theoretical "attacks", but they have no practical relevance (they don't endanger any practical application).
@D.W., ah; so they no only weren't put to practice, as they would be practically impossible. My initial thoughts; but then I considered that it might have been a lengthy, inefficient attack. Thanks for clearing that up.
@JMCF125 An ideal n bit secure hash function (one where the only attack was brute force) would take roughly 2^n trials to find a preimage and 2^(n/2) trials to find a collision. It is common for an attack method to be discovered that requires less computational effort than brute force but is still computationally infeasible.
SHA-1 collision reported. https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Note that it is certainly possible to use reduced output of SHA-2 functions, in case that's required. There is SHA-224 for instance, but you could also reduce the output of SHA-256 or 512 by simply taking the leftmost bytes (which is the common default). Note that 160 bits of SHA-1 however only has 80 bits of security when it comes to producing collisions, which is not great; generally we aim for >= 128 bits of security.
The level of security provided by a hash function is based on the difficulty of generating a plain-text that will produce a given hash signature (the output of the hash). A hash is a quick method of simplifying a set of data to indicate that a user possesses the original data without actually revealing the data. This can be useful both for validating that someone is who they say they are (by comparing a hash of something you know they know to the stored value) as well as validating that a message has not been changed. Because a hash is many to one (many values will produce the same hash value), it is theoretically difficult to work from a hash to the original value. Because of the high number of possible hash values, it SHOULD be difficult to get a hash to produce a given output.
This is, however, unfortunately not always true. The expectation that certain values correspond to human readable input allow for dictionary attacks called rainbow tables against a hash to attempt to discover the original value. Salting (the addition of non-human readable input to the beginning or end of an input) is an attempt to prevent rainbow tables from working as they would have to be made for every different salt.
The other problem, and the one relevant to your question is that of hash collisions. A hash collision occurs any time that two given inputs produce the same hash output. In order to validate that data has not been altered, it must not be easily possible to generate a hash for an altered set of data that matches the hash of the original. Unfortunately, MD5 is thoroughly compromised in this regard with there being multiple ways to relatively easily find alterations that can be put on the end or beginning of a payload to make it appear to be valid. SHA-1 also has some minor compromises in this regard that were recently discovered, however they are less severe than the MD5 issues. Using something like SHA256 is even more secure as it does not currently have any known attacks against hash collision.