Finding encryption algorithm from source and resulting string


If we have a source string and an encrypted string, can we find out the algorithm/formula used to encrypt that source string?
EDIT
Here are a couple of such strings.
string, encrypted string
avtacarguy,c0e54a662e8d7adbf26e2515dcb2bfde
burris212,0c9fe74ce3abb1507108dba1f04497e5
directert,96336189003e59a2d4a3fdbb2cf02707

In general, no. There can be numerous algorithms that turn the source string into the encrypted string, based on what public and/or private keys are used.
In simple cases, such as the Caesar cipher, it may be possible to figure out how it was done, but even then you've only provided a 'most likely' explanation as to what encryption algorithm was used.

Technically (mathematically) speaking, no. Several encryption schemes could yield the same ciphertext for some particular input.
If you had the encryption key, you could of course try out all popular encryption schemes and see whether any gives an exact match, in which case you could be pretty sure you had found the algorithm.
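The "try the popular schemes and look for an exact match" idea can be sketched for the unkeyed case. This is only an illustration: the function name and the candidate list are assumptions, and nothing in the thread confirms how the sample strings were produced. Their 32 hex characters (128 bits) do match the size of an MD5 digest, which is one reason to try plain hashes first.

```python
import hashlib

# Hypothetical helper: hash the source with several common algorithms and
# report which ones (if any) reproduce the observed output exactly.
CANDIDATES = ["md5", "sha1", "sha256", "sha512"]  # assumed shortlist

def matching_algorithms(source: str, observed_hex: str) -> list[str]:
    matches = []
    for name in CANDIDATES:
        digest = hashlib.new(name, source.encode("utf-8")).hexdigest()
        if digest == observed_hex.lower():
            matches.append(name)
    return matches

# Pairs taken from the question; the result depends on how they were made.
samples = [
    ("avtacarguy", "c0e54a662e8d7adbf26e2515dcb2bfde"),
    ("burris212", "0c9fe74ce3abb1507108dba1f04497e5"),
    ("directert", "96336189003e59a2d4a3fdbb2cf02707"),
]
for source, observed in samples:
    print(source, matching_algorithms(source, observed))
```

An empty list does not rule out those algorithms: a salt, a key, or a different input encoding would also prevent an exact match.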

Related

XOR Encryption with "Padded Key"

I read that XOR encryption can be considered very safe as long as two conditions are fulfilled.
1. The length of the key is as long (or longer) than the data
2. The key is not following a notable pattern (i.e. it's a random jumble of characters)
In that case, how about this: Before the XOR operations you use the (short) key to generate a seed for a Random Number Generator. You then use this Generator to create characters which are added to the end of your key until it's as long as the data you want to encrypt.
Then you use this new key to XOR the data.
I have tested this and it does seem to have no problem working as intended (it can encrypt and decrypt without corruption of the data).
My question is how "secure" such an encryption would be. Anyone have an estimate of how hard it'd be to break/decrypt that data?

As others have said, your idea is a stream cipher. It is not provably secure the way a One Time Pad is, because it fails the first condition you state:
The length of the key is as long (or longer) than the data
You are using a "short key" to seed your RNG. That is a weakness, because that "short key" is the cryptographic key for the whole system. If an attacker knows the short key, she can plug it into a copy of the RNG, generate the entire keystream and decrypt the message. If the key is too short she can try every possible key and eventually decrypt the message -- a brute force attack.
You are right that this avoids the problems with the OTP, but in so doing it loses the absolute security. There are secure stream ciphers (see eSTREAM for some examples), and a block cipher running in counter mode is effectively a stream cipher.
Your idea is a reasonable one, but it has been thought of before. Sorry.
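A minimal sketch of the scheme and of the brute-force attack described above, under simplifying assumptions: the short key is modeled as a 16-bit integer seed, the whole keystream comes from the generator (rather than being appended to the key), and the attacker knows the first few plaintext bytes. The function names are made up for illustration. Python's `random` module is a Mersenne Twister, which is not a cryptographic generator.

```python
import random

def keystream(seed: int, n: int) -> bytes:
    # The short key seeds the generator; the generator produces the pad.
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

def xor_crypt(data: bytes, seed: int) -> bytes:
    # XOR is its own inverse, so this both encrypts and decrypts.
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))

def brute_force(ciphertext: bytes, known_prefix: bytes, key_bits: int = 16):
    # Try every possible short key until the known prefix decrypts correctly.
    n = len(known_prefix)
    for candidate in range(2 ** key_bits):
        if xor_crypt(ciphertext[:n], candidate) == known_prefix:
            return candidate
    return None

secret_seed = 0xBEEF  # the "short key"
plaintext = b"HELLO, this is a secret message"
ciphertext = xor_crypt(plaintext, secret_seed)

recovered = brute_force(ciphertext, b"HELLO")
print(xor_crypt(ciphertext, recovered))
```

The work for the attacker grows with the size of the short key, not with the length of the expanded keystream, which is the point the answer makes.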

Keying Hashes/Login Security?

I am going to use hashing function, with salt:
$stored_pass = md5(md5($salt).md5($plain_pass)); **
/// I am wanting to know an efficient way to key/authenticate that hash.
I read up a bit about keying hashes, and MAC's, but didn't quite grasp HMAC's; so just figured that wrapping the hash in an encryption function, like aes, would work. ///
EG
$stored_pass = aes(md5(md5($salt).md5($plain_pass))); **
I would like to know the following:
Why key hashes for login? (Examples would be nice)
Methods for keying hashes? (Specifically for use in login)
Disadvantages?
Are there still ways our hashing/password validation system could be more secure? (After we factor in hash, salted, keyed)
** What is the most "secure" hashing algorithm? "Secure" meaning hardest to crack.
/// I read that sha-512 was one of the most secure; but then read contradicting articles stating, sha in any form should not be used, and something like bcrypt/scrypt should be used or PBKDF2. Then I read that bcrypt shouldn't be used, and has limitations. So I'm a bit confused. ///
When providing hash algorithms, I'd like to know the following:
What are the limitations of the hash algorithm?
Upsides?
Downsides?
/// My main concern is user security, so if that means less 'speed' I'm not bothered. In my eyes the login function's purpose is user security, so reducing the credibility of that security for a few milliseconds seems silly to me. (Just personal opinion). ///
Also I'd appreciate comments on my function:
$stored_pass = aes(md5(md5($salt).md5($plain_pass))); **
And any alternatives would be appreciated.
Note: I know some suggest using some sort of api for this, with functions already written, but that's not really what I'm looking for. I'd prefer to learn more about it myself.

I am going to use hashing function, with salt:
$stored_pass = md5(md5($salt).md5($plain_pass));
Don't use MD5. Learn how to safely store passwords instead. That page might very well answer all of your questions.
/// I am wanting to know an efficient way to key/authenticate that hash.
I read up a bit about keying hashes, and MAC's, but didn't quite grasp HMAC's; so just figured that wrapping the hash in an encryption function, like aes, would work. ///
MACs aren't the proper tool for the job here either, even if it seems tempting to use them. Maybe this primer on cryptography concepts will help illuminate the difference, but basically:
MAC - Provide tamper-resistance for a message.
Password hash - Slow, salted hashing algorithm.
They're totally different use-cases. (Although, PBKDF2 uses a MAC algorithm internally, so I can understand if you were confused by that.)
Encrypting a hash isn't what a MAC does, either. HMAC in particular is basically:
1. Hash the key (XORed with an inner pad) followed by your message.
2. Hash the key (XORed with an outer pad) followed by the output of step 1.
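To make the two nested hashes concrete, here is a small Python sketch that builds HMAC-SHA256 by hand and checks it against the standard library's `hmac` module. The function name is made up for illustration; in real code you would call `hmac.new` directly.

```python
import hashlib
import hmac

def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    block_size = 64  # SHA-256 processes 64-byte blocks
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    # Step 1: hash (key XOR inner pad) followed by the message.
    inner = hashlib.sha256(inner_pad + message).digest()
    # Step 2: hash (key XOR outer pad) followed by the output of step 1.
    return hashlib.sha256(outer_pad + inner).digest()

key, message = b"server-side secret", b"message to protect"
expected = hmac.new(key, message, hashlib.sha256).digest()
print(hmac.compare_digest(hmac_sha256_manual(key, message), expected))
```

Note that nothing here is slow or salted, which is why a MAC on its own is not a password hash.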
/// I read that sha-512 was one of the most secure; but then read contradicting articles stating, sha in any form should not be used, and something like bcrypt/scrypt should be used or PBKDF2. Then I read that bcrypt shouldn't be used, and has limitations. So I'm a bit confused. ///
Easy answer:
Use password_hash() to create password hashes.
Use password_verify() to authenticate passwords against hashes.
Stop worrying about it.
The limitations of bcrypt (truncating after 72 characters OR the first NUL byte -- which are mentioned in the first article I linked to) aren't a practical concern, and rolling your own crypto is definitely less secure than using bcrypt.
If you are absolutely concerned about the bcrypt limitations, do this:
function bcrypt_sha384_hash($password, $cost = 10)
{
    // Pre-hash so bcrypt never sees more than 72 bytes; base64 keeps NUL bytes out of its input.
    $fasthash = base64_encode(
        hash('sha384', $password, true)
    );
    return password_hash($fasthash, PASSWORD_BCRYPT, ['cost' => $cost]);
}

function bcrypt_sha384_verify($password, $storedHash)
{
    // Apply the same pre-hash before checking against the stored bcrypt hash.
    $fasthash = base64_encode(
        hash('sha384', $password, true)
    );
    return password_verify($fasthash, $storedHash);
}
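For readers outside PHP, the same create/verify shape can be sketched in Python. This is not bcrypt: it swaps in PBKDF2-HMAC-SHA256 (mentioned earlier in this answer) because it ships with the standard library. The storage format, function names, and iteration count are assumptions for illustration, not a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware

def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Store everything needed to verify later in one string.
    return f"pbkdf2_sha256${iterations}${salt.hex()}${derived.hex()}"

def verify_password(password: str, stored: str) -> bool:
    _scheme, iterations, salt_hex, derived_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, bytes.fromhex(derived_hex))

stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", stored))
print(verify_password("wrong guess", stored))
```

As with password_hash(), the salt and the cost travel with the stored value, so the cost can be raised later without breaking existing records.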

Which encryption method would produce this result

I am doing a security review on a system.
From one part of the system to another, information is sent using an encrypted string.
This string is over 400 characters long, but within it are 4 sets of 10 identical characters. I am assuming that the data that was encrypted also has this pattern, for example the word "parameters".
I have tested encrypting a string containing several identical strings with DES, but do not get the same pattern.
Question is: Is there an encryption method that would produce this result? Or have the parts been encrypted separately and concatenated?

An encryption system with short key length and no correlation between blocks (e.g. ECB mode) would encrypt short runs of identical plain text identically. It could also just be coincidence, of course.

If what you're seeing is real, it's mostly about the encryption mode, not the cipher. Likely culprits are a block cipher in ECB mode (which is usually a bad idea), or the pseudo-"stream" cipher of XORing the plaintext with a short password string repeated over and over, which is a really bad idea. In the latter case, the odds of two copies of the same plaintext at random positions encoding to the same thing are 1 in passwordlength.
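The repeated-password XOR case is easy to reproduce. In this sketch (the key, filler bytes, and offsets are made up for illustration), two copies of "parameters" whose offsets differ by a multiple of the key length encrypt identically, while a third copy at another alignment does not.

```python
from itertools import cycle

def xor_repeating(data: bytes, key: bytes) -> bytes:
    # XOR each plaintext byte with the key, repeating the key as needed.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"secret"          # 6-byte repeating key (assumed)
word = b"parameters"     # 10 bytes

# Copies start at offsets 0, 12 and 25; 0 and 12 are both multiples of 6.
plaintext = word + b"--" + word + b"---" + word
ciphertext = xor_repeating(plaintext, key)

first = ciphertext[0:10]
second = ciphertext[12:22]
third = ciphertext[25:35]
print(first == second)   # same alignment relative to the key
print(first == third)    # different alignment
```

ECB mode shows the same symptom only when the repeated plaintext fills whole, block-aligned cipher blocks, so a 10-character repeat would not survive a cipher with 16-byte blocks unless its neighbors repeat too.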
By the way, it's best to be clear what format you're looking at the data in. If it's hex, okay. If it's base64, you should decode it before you look at it -- identical strings won't always look identical after base64 encoding depending on their alignment to a 3-byte boundary.
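The alignment effect can be seen with a few lines of Python. The filler bytes are arbitrary; the only thing that changes between the two inputs is whether the repeated word starts on a 3-byte boundary.

```python
import base64

word = b"parameters"

# Same word, shifted by one byte relative to the 3-byte base64 grouping.
aligned = base64.b64encode(b"xxx" + word + b"yy").decode("ascii")
shifted = base64.b64encode(b"xxxx" + word + b"y").decode("ascii")

# Encoding of the first 9 bytes of the word when it starts on a boundary.
marker = base64.b64encode(word[:9]).decode("ascii")

print(aligned)
print(shifted)
print(marker in aligned, marker in shifted)

# After decoding, the repeated bytes are plainly visible in both.
print(word in base64.b64decode(aligned), word in base64.b64decode(shifted))
```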
And just for illustration, here's a discussion of ECB mode on Wikipedia including pictures of the entropy problem with ECB -- scroll down to the pictures of Tux.

What do you mean with "4 sets of 10 identical characters"?
If you mean 4 identical substrings of length 10, it may be the Caesar cipher, which is totally insecure, as it can be deciphered by a human in no time. Another possibility is the use of an XOR cipher with a badly chosen key.

What's a good method/function to create a reversible hash?

I need to transmit some data over the wire and I don't want that data being plain text.
The text I'm sending needs to be reversed so I can't md5/sha256/etc...
What's a good way to encode a salted string?

You're looking for encryption.
What language are you using? You probably have a built-in encryption algorithm you can use.
The idea with hashing is that you can only go one-way.
[plain text]--->(HASH ALGORITHM)--->[hash]
Whereas the idea with encryption is that you can use a key together with some plaintext to create a ciphertext. Then you can use the key on the ciphertext to retrieve the plaintext at any time:
[plain text] + [key] --->(ENCRYPTION ALGORITHM)-->[ciphertext]
[ciphertext] + [key] --->(DECRYPTION ALGORITHM)-->[plain text]
The decryption algorithm for a given encryption algorithm is usually very similar to the encryption algorithm, and it allows for the retrieval of a plaintext message given a ciphertext and the correct key (i.e. the password).

You want to use an encryption function, not a hash - which by definition is one-way.
The AES encryption algorithm would be a good start, as it is probably the most widely used one at present.

You don't want a hash, you want encryption. You should look at Blowfish.

Initialization vector uniqueness

Best practice is to use unique IVs, but what is unique? Is it unique for each record, or absolutely unique (unique for each field too)?
If it's per field, that sounds awfully complicated. How do you manage the storage of so many IVs if you have 60 fields in each record?

I started an answer a while ago, but suffered a crash that lost what I'd put in. What I said was along the lines of:
It depends...
The key point is that if you ever reuse an IV, you open yourself up to cryptographic attacks that are easier to execute than those when you use a different IV every time. So, for every sequence where you need to start encrypting again, you need a new, unique IV.
You also need to look up cryptographic modes: Wikipedia has an excellent illustration of why you should not use ECB. CTR mode can be very beneficial.
If you are encrypting each record separately, then you need to create and record one IV for the record. If you are encrypting each field separately, then you need to create and record one IV for each field. Storing the IVs can become a significant overhead, especially if you do field-level encryption.
However, you have to decide whether you need the flexibility of field level encryption. You might - it is unlikely, but there might be advantages to using a single key but different IVs for different fields. OTOH, I strongly suspect that it is overkill, not to mention stressing your IV generator (cryptographic random number generator).
If you can afford to do encryption at a page level instead of the row level (assuming rows are smaller than a page), then you may benefit from using one IV per page.
Erickson wrote:
You could do something clever like generating one random value in each record, and using a hash of the field name and the random value to produce an IV for that field.
However, I think a better approach is to store a structure in the field that collects an algorithm identifier, necessary parameters (like IV) for that algorithm, and the ciphertext. This could be stored as a little binary packet, or encoded into some text like Base-85 or Base-64.
And Chris commented:
I am indeed using CBC mode. I thought about an algorithm to do a 1:many so I can store only 1 IV per record. But now I'm considering your idea of storing the IV with the ciphertext. Can you give me some more advice: I'm using PHP + MySQL, and many of the fields are either varchar or text. I don't have much experience with binary in the database; I thought binary was database-unfriendly, so I always base64_encoded when storing binary (like the IV, for example).
To which I would add:
IBM DB2 LUW and Informix Dynamic Server both use a Base-64 encoded scheme for the character output of their ENCRYPT_AES() and related functions, storing the encryption scheme, IV and other information as well as the encrypted data.
I think you should look at CTR mode carefully - as I said before. You could create a 64-bit IV from, say, 48-bits of random data plus a 16-bit counter. You could use the counter part as an index into the record (probably in 16 byte chunks - one crypto block for AES).
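The "48 bits of random data plus a 16-bit counter" layout might look like the following sketch. The function name and byte order are assumptions, and the code only builds the 64-bit values; how they are fed to a cipher depends on the crypto library in use.

```python
import os

def record_counter_values(n_chunks: int) -> list[bytes]:
    # One random 48-bit prefix per record, then a 16-bit chunk counter.
    if not 0 <= n_chunks <= 2 ** 16:
        raise ValueError("a 16-bit counter can index at most 65536 chunks")
    prefix = os.urandom(6)  # 48 bits from the OS random source
    return [prefix + i.to_bytes(2, "big") for i in range(n_chunks)]

values = record_counter_values(4)
for v in values:
    print(v.hex())
```

Uniqueness across records rests entirely on the 48-bit random prefix, so the collision risk grows with the number of records encrypted under one key.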
I'm not familiar with how MySQL stores data at the disk level. However, it is perfectly possible to encrypt the entire record including the representation of NULL (absence of) values.
If you use a single IV for a record, but use a separate CBC encryption for each field, then each field has to be padded to 16 bytes, and you are definitely indulging in 'IV reuse'. I think this is cryptographically unsound. You would be much better off using a single IV for the entire record and either one unit of padding for the record and CBC mode or no padding and CTR mode (since CTR does not require padding - one of its merits; another is that you only use the encryption mode of the cipher for both encrypting and decrypting the data).

Once again, Appendix C of NIST SP 800-38A might be helpful. E.g., according to this, you could generate an IV for CBC mode simply by encrypting a unique nonce with your encryption key. Even simpler: if you use OFB, then the IV just needs to be unique.
There is some confusion about what the real requirements are for good IVs in the CBC mode. Therefore, I think it is helpful to look briefly at some of the reasons behind these requirements.
Let's start by reviewing why IVs are even necessary. IVs randomize the ciphertext. If the same message is encrypted twice with the same key (but different IVs), then the ciphertexts are distinct. An attacker who is given two (equally long) ciphertexts should not be able to determine whether the two ciphertexts encrypt the same plaintext or two different plaintexts. This property is usually called ciphertext indistinguishability.
Obviously this is an important property for encrypting databases, where many short messages are encrypted.
Next, let's look at what can go wrong if the IVs are predictable. Let's take, for example, Erickson's proposal:
"You could do something clever like generating one random value in each record, and using a hash of the field name and the random value to produce an IV for that field."
This is not secure. For simplicity, assume that a user Alice has a record in which there exist only two possible values m1 or m2 for a field F. Let Ra be the random value that was used to encrypt Alice's record. Then the ciphertext for the field F would be
EK(hash(F || Ra) xor m).
The random Ra is also stored in the record, since otherwise it wouldn't be possible to decrypt. An attacker Eve, who would like to learn the value of Alice's record can proceed as follows: First, she finds an existing record where she can add a value chosen by her.
Let Re be the random value used for this record and let F' be the field for which Eve can submit her own value v. Since the record already exists, it is possible to predict the IV for the field F', i.e. it is
hash(F' || Re).
Eve can exploit this by selecting her value v as
v = hash(F' || Re) xor hash(F || Ra) xor m1,
let the database encrypt this value, which is
EK(hash(F || Ra) xor m1)
and then compare the result with Alice's record. If the two results match, then she knows that m1 was the value stored in Alice's record; otherwise it will be m2.
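The attack can be walked through in a short Python sketch. Two loud assumptions: E_K is modeled by truncated HMAC-SHA256, which is a deterministic keyed function but NOT a real (invertible) block cipher, and hash() is truncated SHA-256. The attack only needs E_K to be deterministic, so the stand-in is enough to show the logic. All names and sample values are illustrative.

```python
import hashlib
import hmac
import os

BLOCK = 16

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for block cipher encryption E_K (assumption: not a real cipher).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def field_iv(field: bytes, record_random: bytes) -> bytes:
    # The predictable IV from the proposal: hash(field name || record random).
    return hashlib.sha256(field + record_random).digest()[:BLOCK]

def db_encrypt(key: bytes, field: bytes, record_random: bytes, value: bytes) -> bytes:
    # First CBC block: E_K(IV xor m). Eve can trigger this but never sees the key.
    return E(key, xor(field_iv(field, record_random), value))

key = os.urandom(16)                      # known only to the database
Ra, Re = os.urandom(16), os.urandom(16)   # stored in the clear in each record
m1 = b"yes".ljust(BLOCK, b"\x00")
m2 = b"no".ljust(BLOCK, b"\x00")
alice_ct = db_encrypt(key, b"F", Ra, m2)  # Alice's real value is m2

def eve_confirms(candidate: bytes) -> bool:
    # v = hash(F' || Re) xor hash(F || Ra) xor candidate
    v = xor(xor(field_iv(b"F'", Re), field_iv(b"F", Ra)), candidate)
    return db_encrypt(key, b"F'", Re, v) == alice_ct

print(eve_confirms(m1), eve_confirms(m2))
```

Eve never touches the key: she only needs to predict both IVs and have the database encrypt one value of her choosing per guess.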
You can find variants of this attack by searching for "block-wise adaptive chosen plaintext attack" (e.g. this paper). There is even a variant that worked against TLS.
The attack can be prevented, possibly by encrypting the random value before putting it into the record and deriving the IV by encrypting the result. But again, probably the simplest thing to do is what NIST already proposes: generate a unique nonce for every field that you encrypt (this could simply be a counter), encrypt the nonce with your encryption key, and use the result as the IV.
Also note that the attack above is a chosen plaintext attack. Even more damaging attacks are possible if the attacker can mount chosen ciphertext attacks, i.e. if she can modify your database. Since I don't know how your databases are protected, it is hard to make any claims there.

The requirements for IV uniqueness depend on the "mode" in which the cipher is used.
For CBC, the IV should be unpredictable for a given message.
For CTR, the IV has to be unique, period.
For ECB, of course, there is no IV. If a field is a short, random identifier that fits in a single block, you can use ECB securely.
I think a good approach is to store a structure in the field that collects an algorithm identifier, necessary parameters (like IV) for that algorithm, and the ciphertext. This could be stored as a little binary packet, or encoded into some text like Base-85 or Base-64.
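One possible shape for such a packet, sketched in Python. The layout (one byte for an algorithm identifier, one byte for the IV length, then IV and ciphertext, all Base-64 encoded) and the identifier value are assumptions for illustration, not an established format.

```python
import base64
import struct

ALG_AES_128_CBC = 1  # hypothetical identifier for this sketch

def pack_field(alg_id: int, iv: bytes, ciphertext: bytes) -> str:
    # Header: algorithm id (1 byte) and IV length (1 byte), then the payload.
    header = struct.pack("!BB", alg_id, len(iv))
    return base64.b64encode(header + iv + ciphertext).decode("ascii")

def unpack_field(text: str) -> tuple[int, bytes, bytes]:
    raw = base64.b64decode(text)
    alg_id, iv_len = struct.unpack("!BB", raw[:2])
    iv = raw[2:2 + iv_len]
    ciphertext = raw[2 + iv_len:]
    return alg_id, iv, ciphertext

# Placeholder bytes stand in for a real IV and ciphertext.
stored = pack_field(ALG_AES_128_CBC, bytes(16), b"\x01\x02\x03\x04")
print(stored)
print(unpack_field(stored))
```

Because the result is plain ASCII, it fits in a varchar or text column, and the algorithm identifier leaves room to change ciphers or modes later without guessing how old rows were encrypted.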
