This method works fine in a program I've made. However I cannot really understand what is happening and where the encryption is actually performed. I read the related description from MSDN but not much information is given.
Can someone explain what is happening in general especially in line 8 and 9 please.
public byte[] Decrypt(byte[] input, byte[] key, byte[] iv)
{
DES des = new DESCryptoServiceProvider();
des.Mode = CipherMode.ECB;
des.Padding = PaddingMode.None;
des.Key = key;
ICryptoTransform ct = des.CreateDecryptor(key, iv);
byte[] result = ct.TransformFinalBlock(input, 0, input.Length);
return result;
}
If you want to understand what is going on, you should read about block cipher operations here:
http://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Electronic_codebook_.28ECB.29
In a nutshell, block ciphers chaining causes the input of one block operation to be fed into the next block operation. This obscures any block-level patterns in the ciphertext. Since there is a chaining structure, the last block gets an input from the second last block, and so on... until the second block gets an input from the first block. Now the first block needs to get an input from something, but there are no preceding blocks. So we use something called an Initialization Vector (iv) to start it off. This IV does not need to be secret like the key, but it does need to have a low probability of re-use (otherwise the attacker can use it to correlate the first blocks of all your ciphertexts). Typically random numbers are used, or sometimes increasing sequence numbers.
In regard to the specific call:
Your method works to decrypt a single block using DES. (Which is nowadays considered out of date and insecure, by the way, please consider using AES instead - the block cipher structures remain the same so all you need to do is swap the library). Anyway,
Since you're using a cipher in ECB mode, each block is decrypted independently with the same initialization vector, which is provided to your Decrypt method call. The call to CreateDecryptor initializes a decryption object using the provided secret key and initialization vector.
The actual decryption is performed using the call to TransformFinalBlock. The arguments are the input byte array, and then an offset and a length parameter (used for when you don't want to decrypt the entire byte array). In this case you do want to use the entire byte array so the starting offset is 0 and the size is the length of the whole byte array.
One thing you should probably add is to check that the input byte array is the correct block size for your cipher, otherwise it will throw an exception. In the case of DES, this is 64 bits. If you switch to AES as I recommended it will be 128 bits.
Related
I have a key for AES encryption and I'm trying to encrypt strings with constant length as well, does the resultant encrypted string will always have the same length?
That depends on what you mean by 'the same length'.
The same length as the original string: generally no; the original string will be padded to a multiple of the cipher block-length. Check padding modes for details.
The same length every time you encrypt: yes; as long as you stick to the same mode and padding the encrypted output will have the same length.
First of all, a block cipher requires a full block to encrypt, therefore AES requires 16-byte ( or 128-bit ) input to encrypt will output 16-byte ciphertext. This is always the case for block ciphers since they are Pseudo-Random Permutations (PRP) - they are always permutations we expect them to be PRP.
Block ciphers, like AES, are primitives and must be used with a proper mode of operation. Here we will talk only about some of them.
ECB mode is the default and insecure.
Padding: In the case of the message size is not a multiple of the 128, we apply padding. There are various paddings, however, the most common one is the PCKS#7 ( update of PCKS#5 that was for 64-bit block ciphers like DES). This padding append characters to the end so that the size is multiple of 128.
Therefore; the appended number of characters for ECB mode can be from 1 to 16 ( if the block size is already a multiple of 128) then a new block is added with 16 10 bytes.
CBC mode requires a random and unpredictable IV so that we can achieve probabilistic encryption. As in ECB, it requires padding.
Therefore; the appended number of characters for CBC mode can be from 1+16 to 32 since we need to add the IV to the ciphertext.
CTR mode requires Initial Value (IV) that can be random or deterministic. CTR mode converts a block cipher (PRP) into a stream cipher. The plaintext is x-ored with the ciphertext where the input was the IV and incremented for each encryption. There is no need for padding in CTR mode.
Therefore; the appended number of characters for CTR mode is (at most)16.
AAED: The above are archaic mode of operations and today we use Authenticated Encryption (with Associated Data) (AE/AEAD). Unlike the archaic modes, these modes can provide us not only confidentiality but also integrity and authentication.
The authentication part requires a tag (MAC tag) and this really depends on the scheme.
AES-GCM is the NIST-approved mode. AES-GCM's recommended IV size 12 since different size requires an additional process. It always produces a 16-byte tag, however, one can reduce the size of the tag if they want to, although not recommended.
At first one might consider that the size is incremented by 12-16, however, it is not. The GCM always used Associated Data (AD) and allows zeroAD and the size of the AD is always added. For details, see NIST GCM specification
This was the short story of the long list of how the mode of operations behaves in terms of input vs output length.
If you omit the IV size, it is possible with CTR mode.
I have the binary data that I need to decipher, the algorithm (RC4) and the key. However, to decipher the data, one instruction I got is that "the length of the key initially gets skipped" or that "len bytes are skipped initially".
What does this mean exactly? Does it mean that if my key is 10 bytes long, that I need to pass in the binary data without the first 10 bytes to the decipher and then concatenate the first 10 bytes with the deciphered bytes?
const decipher = crypto.createDecipheriv('RC4', 'mysuperkey', null);
const buffer = decipher.update(data.slice('mysuperkey'.length));
decipher.final();
This does not work, so I might not understand the instruction.
RC4 is insecure for the first bits, so often you are instructed to skip over some initial bytes of the key stream. The way a stream cipher works is that it creates a stream of pseudo random data that depends on the key. That stream is XOR'ed with the plaintext to create the ciphertext, and with the ciphertext to create the plaintext.
To skip a number of bytes of the key stream you can simply encrypt / decrypt some (zero valued) bytes and throw away the results. This goes both for encryption and decryption. If the API has a specific skip method then you should of course use that, but I don't think it is present in CryptoJS.
I have read one article about difference between the methods update() and dofinal() in cipher.
It was about what will happend if we want to encrypt 4 Bytes Array, when the block size of the cipher is for example 8 Bytes. If we call update here it will return null. My question is: what will happen if we call doFinal() with a 4 byte array to encrypt, and the buffer size is 8 bytes, how many bytes encoded data will we receive on the return?
update(): feed the data, again and again, enables you to encrypt long files, streams.
dofinal(): apply the requested padding scheme to the data, if requested and necessary, then encrypt. ECB and CBC mode requires padding but CTR mode doesn't. If NOPADDING has used some libraries may secretly pad, in others you have to handle the padding yourself.
When you call, dofinal() with 4-byte data, if NOPADDING is not set, it will be padded and then encrypted.
From Java Doc;
update(byte[] input)
Continues a multiple-part encryption or decryption operation (depending on how this cipher was initialized), processing another data part.
doFinal()
Finishes a multiple-part encryption or decryption operation, depending on how this cipher was initialized.
I have an API route that proxies a file upload from the browser/client to AWS S3.
This API route attempts to stream the file as it is uploaded to avoid buffering the entire contents of the file in memory on the server.
However, the route also attempts to calculate an MD5 checksum of the file's body. As each part of the file is chunked, the hash.update() method is invoked w/ the chunk.
http://nodejs.org/api/crypto.html#crypto_hash_update_data_input_encoding
var crypto = require('crypto');
var hash = crypto.createHash('md5');
function write (chunk) {
// invoked many times as file is uploaded
hash.update(chunk);
}
function done() {
// will hash buffer all chunks in memory at this point?
hash.digest('hex');
}
Will the instance of Hash buffer all the contents of the file in order to perform the hash calculation (thus defeating the goal of avoiding buffering the entire file's contents in memory)? Or can an MD5 hash be calculated incrementally, without ever having the entire input available to perform the calculation?
MD5 and some other hash functions are based on the Merkle–Damgård construction. It supports the incremental/progressive/streaming hashing of data. After the data is transformed into an internal state (which has a fixed size) a last finalization step is performed to generate the final hash by padding and processing the last block and afterwards by simply returning the final state.
This is probably also why many hashing library functions are designed in such a way with an update and a finalization step.
To answer your question: No, the file content is not kept in a buffer, but is rather transformed into a fixed size internal state.
All modern cryptographic hash functions are created in such a way that they can be updated incrementally.
To allow for incremental updates, the input data of the message is first arranged in blocks. These blocks are processed in order. To do this the implementation usually buffers the input internally until it has a full block, and then processes this block together with the current state to produce a new state, using a so called compression function. The initial state usually simply consists of predetermined constant values. During the call to digest the last block is padded - usually with bit padding and an encoding of the amount of processed bytes - and the final state is calculated; this may require an additional block without any message data. A final operation may be performed and finally the resulting hash value is returned.
For MD5 the Merkle–Damgård construction is used. This common construction is also used for SHA-1 and SHA-2. SHA-2 is a family of hashes based on the algorithms for SHA-256 (SHA-224) and SHA-512 (SHA-384, SHA-512/224 and SHA-512/256). MD5 in particular uses a block size of 512 bits and a internal state of 128 bits. The internal state of the last block (including padding) is simply output directly without any post-processing for MD5, SHA-1, SHA-256 and SHA-512.
Keccak has been chosen to be SHA-3. It is construction based on a sponge, a specific compression function. It isn't a Merkle–Damgård hash - which is a big reason why it has been chosen as SHA-3. It still has all the update properties of Merkle–Damgård hashes and has been designed to be compatible with SHA-2. It splits up and buffers blocks just like the previously mentioned hashes, but it has a larger internal state and performs final operations on the output, making it arguably more secure.
So when you were using a modern hash construction such as MD5 you were unknowingly performing additional buffering. Fortunately, the buffering of a single block of 512 bits + 128 bits for the state size will not likely make you run out of memory. It is certainly not required for the hash implementation to buffer the entire message before the final hash value can be calculated.
Notes:
MD5 and SHA-1 are considered insecure w.r.t. collision resistance and they should preferably not be used anymore, especially when it comes to validating contents;
A "compression function" is a specific cryptographic notion; it is not
LSZIP or anything similar;
There may be specialized, theoretical hashes that perform the calculate the values differently - theoretically speaking there is no requirement to split the input messages into blocks and operate on the blocks sequentially. No worry, those are unlikely to be in the libraries you are using;
Similarly, implementations may decide to buffer more blocks at once, but that is fortunately extremely uncommon as well. Commonly only one block is used as buffer - in some cases it could be more performant to buffer a few blocks instead;
Some low level implementations may require you to supply the blocks yourself for reasons of efficiency.
I want to use OpenSSL for data transmission between server and client. I want to do it using EVP with AES in CBC mode. But when I try to decode second message on client, EVP_EncryptFinal_ex returns 0.
The my scheme is shown on picture.
I think, this behavior because I call EVP_EncryptFinal_ex (and EVP_DecryptFinal_ex) twice for one EVP context. How to do it correctly?
You cannot call EVP_EncryptUpdate() after calling EVP_EncryptFinal_ex() according to the EVP docs.
If padding is enabled (the default) then EVP_EncryptFinal_ex()
encrypts the "final" data, that is any data that remains in a partial
block. It uses standard block padding (aka PKCS padding) as described
in the NOTES section, below. The encrypted final data is written to
out which should have sufficient space for one cipher block. The
number of bytes written is placed in outl. After this function is
called the encryption operation is finished and no further calls to
EVP_EncryptUpdate() should be made.
Instead, you should setup the cipher ctx for encryption again by calling EVP_EncryptInit_ex(). Note that unlike EVP_EncryptInit(), with EVP_EncryptInit_ex(), you can continue reusing an existing context without allocating and freeing it up on each call.