How to determine what type of encoding/encryption has been used?
Is there a way to find what type of encryption/encoding is being used? For example, I am testing a web application which stores the password in the database in an encrypted format (WeJcFMQ/8+8QJ/w0hHh+0g==). How do I determine what hashing or encryption is being used?
Some content of (or links pointing to) a methodology is in order to explain how to identify certain types of crypto or encoding in a completely zero-knowledge scenario. Most of these answers are "it's impossible" and my gut feeling tells me that nothing in our industry is impossible.
@atdre Thanks for the bounty. The question seems focussed on password hashing formats - is that your focus also? That seems best to me, and if people want to answer the question for file formats, they can ask another question.
@atdre: "Impossible" is usually a shortcut for "infeasible with current technology/won't finish before the heat death of the universe".
I asked a similar question on SE: http://stackoverflow.com/questions/988642/how-would-i-reverse-engineer-a-cryptographic-algorithm
Your example string (WeJcFMQ/8+8QJ/w0hHh+0g==) is Base64 encoding for a sequence of 16 bytes, which do not look like meaningful ASCII or UTF-8. If this is a value stored for password verification (i.e. not really an "encrypted" password, rather a "hashed" password) then this is probably the result of a hash function computed over the password; the one classical hash function with a 128-bit output is MD5. But it could be about anything.
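To make the "16 bytes" observation concrete, here is a minimal Python sketch that decodes the value and lists well-known hash functions with the same output size. The `describe` helper and the `DIGEST_SIZES` table are illustrative names, not part of any library, and a size match is only a hint, never proof.

```python
import base64

# Typical raw output sizes (in bytes) of some well-known hash functions.
# A length match is only a hint: 16 bytes is also one AES block, for example.
DIGEST_SIZES = {
    16: ["MD5", "MD4"],
    20: ["SHA-1", "RIPEMD-160"],
    32: ["SHA-256"],
    64: ["SHA-512"],
}

def describe(stored: str) -> tuple[int, list[str]]:
    """Decode a Base64 string; return its byte length and same-sized hash functions."""
    raw = base64.b64decode(stored, validate=True)
    return len(raw), DIGEST_SIZES.get(len(raw), [])

if __name__ == "__main__":
    size, candidates = describe("WeJcFMQ/8+8QJ/w0hHh+0g==")
    print(f"{size} bytes ({size * 8} bits); same size as: {candidates}")
```

For the string in the question this reports 16 bytes, which is consistent with MD5 but does not rule out a truncated longer hash or a single cipher block.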
The "normal" way to know that is to look at the application code. Application code is incarnated in a tangible, fat way (executable files on a server, source code somewhere...) which is not, and cannot be, protected as well as a secret key can be. So reverse engineering is the "way to go".
Barring reverse engineering, you can make a few experiments to try to make educated guesses:
- If the same user "changes" his password but reuses the same one, does the stored value change? If yes, then part of the value is probably a randomized "salt" or IV (assuming symmetric encryption).
- Assuming that the value is deterministic from the password for a given user, if two users choose the same password, does it result in the same stored value? If no, then the user name is probably part of the computation. You may want to try to compute MD5("username:password") or other similar variants, to see if you get a match.
- Is the password length limited? Namely, if you set a 40-character password and cannot successfully authenticate by typing only the first 39 characters, then this means that all characters are important. Since a 16-byte stored value is too short to hold a 40-character password in recoverable form, this implies that this really is password hashing, not encryption (the stored value is used to verify a password, but the password cannot be recovered from the stored value alone).
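The "try MD5 over a few variants" experiment can be sketched as below, assuming you control a test account and therefore know its user name and password. The function names, the list of combinations, and the two text encodings tried are all guesses made for illustration; the real application may use something else entirely (a salt, another hash, several iterations).

```python
import base64
import hashlib

def candidate_inputs(username: str, password: str):
    """Yield (label, text) pairs for a few guessed ways of combining the inputs."""
    yield "password", password
    yield "username:password", f"{username}:{password}"
    yield "username+password", username + password
    yield "password+username", password + username

def find_construction(stored_b64: str, username: str, password: str):
    """Return (label, encoding) of the first guess whose MD5 equals the stored value, else None."""
    target = base64.b64decode(stored_b64)
    for label, text in candidate_inputs(username, password):
        # UTF-16-LE is included as a guess because some platforms hash "wide" strings.
        for encoding in ("utf-8", "utf-16-le"):
            if hashlib.md5(text.encode(encoding)).digest() == target:
                return label, encoding
    return None
```

A `None` result only means that none of these particular guesses matched; it says nothing about which algorithm is actually in use.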
Thanks for the input. Please tell me more about how you confirmed it's a Base64 encoding for a sequence of 16 bytes. **Regarding your experiments:** Yes, this is a value stored for password verification. 1) If a user changes their password, then the stored value changes too. 2) If two users choose the same password, the stored value is the same. 3) Password length is not limited.
@Learner: _any_ sequence of 24 characters, such that the first 22 are letters, digits, '+' or '/', and the last two are '=' signs, is a valid Base64 encoding of a 128-bit value. And any 128-bit value, when encoded with Base64, yields such a sequence.
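That shape rule is easy to check mechanically. A small Python sketch follows, where `has_128_bit_shape` is a made-up helper name and the pattern covers only the standard Base64 alphabet (not the URL-safe variant):

```python
import base64
import os
import re

# 22 characters from the standard Base64 alphabet followed by two '=' signs.
B64_128_BIT = re.compile(r"[A-Za-z0-9+/]{22}==")

def has_128_bit_shape(s: str) -> bool:
    """True if s has the 24-character shape of a Base64-encoded 16-byte value."""
    return B64_128_BIT.fullmatch(s) is not None

if __name__ == "__main__":
    print(has_128_bit_shape("WeJcFMQ/8+8QJ/w0hHh+0g=="))
    # Any random 16-byte value encodes to a string of the same shape.
    print(has_128_bit_shape(base64.b64encode(os.urandom(16)).decode()))
```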
This is the right answer, even though I wanted to hear some potential tricks if the application code isn't available. If you are dealing with a closed-source binary application -- check out http://aluigi.org/mytoolz.htm#signsrch
If it is MD5, you could try any of the MD5 cracking websites like http://www.md5decrypter.co.uk/. None of these are fast or guaranteed to give a result. Google for "md5 cracking" to find more. That will also give you an extended list at http://www.stottmeister.com/blog/2009/04/14/how-to-crack-md5-passwords/