AES-256 encryption workflow in Scala with Bouncy Castle: salt and IV usage and transfer/storage

I'm trying to implement secure encryption of files to be sent over an insecure channel or stored in an insecure place. I use the Bouncy Castle framework, and my code is written in Scala. I decided to use AES-256 (to be more specific - Rijndael with a 256-bit block size, here is why). And it seems like I can use Rijndael with any (128|160|192|256) block length.
I cannot understand the whole process overview correctly. Here is one of the good answers, and in this question there is some useful code specific to Bouncy Castle. But both leave some questions unanswered for me (questions below).
So this is how I understand the workflow:
1. To create a block cipher instance I have to get an instance of a padded block cipher with some output feedback:
// create an instance of the engine
val engine = new RijndaelEngine(bitLength)
// wrap engine with some feedback-blocking cipher mode engine
val ofb = new OFBBlockCipher(engine, bitLength)
// wrap this with some padded-blocking cipher mode
val cipher = new PaddedBufferedBlockCipher(ofb, new PKCS7Padding())
2. Now I have to run init() on the cipher engine.
2.1. First generate a key. The best solution suggested here was to use scrypt to derive a secret from the password instead of using PBKDF2-HMAC-xxx. The Russian Wikipedia article on scrypt says that the recommended parameters are N = 16384, r = 8, p = 1.
So I've written this code to derive the key:
SCrypt.generate(password.getBytes(encoding), salt, 16384, 8, 1, bitLength / 8)
2.2. This means I need a salt. The salt should be an array of random bytes. Most answers here use 8 bytes. So I do:
// helper method to get a bunch of random bytes
def getRandomBytes(size: Int) = {
  val bytes = Array.ofDim[Byte](size)
  val rnd = new SecureRandom()
  rnd.nextBytes(bytes)
  bytes
}
// generate salt
val salt = getRandomBytes(8)
2.3. To initialize the cipher we need an initialization vector (please take a look at my question (2) below).
val iv = getRandomBytes(bitLength / 8)
2.4. Now we are ready to initialize the cipher.
cipher.init(mode, params(password, salt, iv, bitLength))
Questions:
1. What should be the size of the salt? Why do most respondents here use 8 bytes, not more?
2. What should be the size of the IV? Is it correct that it should be the same size as the cipher block size? Is it preferable to fetch it from the cipher like here: cipher.getParameters().getParameterSpec(IvParameterSpec.class).getIV(); or to just make it random as I did?
3. Is it correct that I need both the salt and the IV, or can I use just one of these? For example, use a random IV as the salt.
4. And the main question: I have to pass the salt and IV to the other side or else it would not be possible to decrypt the message. I need to somehow pass both over an unencrypted channel. Is it secure to just add both before the encrypted message (as a header)?
Thanks in advance!

1. I would go for a 16-byte salt, as suggested.
2. The IV should be the same size as the cipher's block size and should be random.
3. Yes, you need both the salt and the IV, because the salt is used to derive the key from the password and the IV is used to initialize the block cipher mode.
4. The salt and IV are designed to be public. You can send or store them unencrypted, but you do not use any authentication mechanism, so anyone can change the IV or salt in transit. You would not be able to detect it, and decryption would give you something different. To prevent that you should use an AEAD mode and include the IV and salt in the authenticated data.
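A minimal sketch of the workflow above, in Python using only the standard library (hashlib.scrypt, hmac, os.urandom). Since the standard library has no AEAD cipher, it substitutes an encrypt-then-MAC layout with HMAC-SHA256 for the AEAD mode, and the ciphertext is treated as opaque bytes produced by whatever cipher you use. The names derive_keys, seal and open_message, and the salt || IV || ciphertext || tag layout, are assumptions made for illustration and are not part of Bouncy Castle.

```python
import hashlib
import hmac
import os

SALT_LEN = 16  # bytes, the salt size suggested above
IV_LEN = 16    # bytes, equal to the AES block size (assumption: plain AES)
TAG_LEN = 32   # bytes, HMAC-SHA256 output size


def derive_keys(password, salt):
    """Derive an encryption key and a separate MAC key with scrypt."""
    material = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=16384, r=8, p=1,           # parameters quoted in the question
        maxmem=64 * 1024 * 1024,     # leave headroom above the ~16 MiB scrypt needs
        dklen=64,
    )
    return material[:32], material[32:]


def seal(mac_key, salt, iv, ciphertext):
    """Prefix the public salt and IV, then authenticate header and ciphertext."""
    body = salt + iv + ciphertext
    tag = hmac.new(mac_key, body, hashlib.sha256).digest()
    return body + tag


def open_message(password, message):
    """Split the message, re-derive the keys and verify the tag before use."""
    salt = message[:SALT_LEN]
    iv = message[SALT_LEN:SALT_LEN + IV_LEN]
    ciphertext = message[SALT_LEN + IV_LEN:-TAG_LEN]
    tag = message[-TAG_LEN:]
    enc_key, mac_key = derive_keys(password, salt)
    expected = hmac.new(mac_key, message[:-TAG_LEN], hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("salt, IV or ciphertext was modified")
    return enc_key, iv, ciphertext
```

The sender would generate salt and iv with os.urandom, encrypt with the first derived key, and call seal; the receiver calls open_message and only decrypts if no ValueError is raised. A changed salt yields different keys and a changed IV changes the MAC input, so either modification makes the tag check fail.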

Is it secure? Sure, they will still need to guess the passphrase. Is it as secure? No, because you're giving the attacker information that they need to simplify the decryption process. If the only way that you can get the salt/password to the other side is via an unencrypted channel then something is better than nothing I suppose, but why can't you exchange this information using PKI/SSL?

Related

How does an IV work and what would be the best way to store it?

I want to encrypt and decrypt strings. I'm using the Node.js crypto module for this. I've read that when encrypting and decrypting it's highly recommended to use an IV. I want to store the encrypted data inside a MySQL database and decrypt it later when needed. I understand that I also need the IV for the decryption process. But what exactly is an IV and how should I store it? I read something about that an IV does not need to be kept secret. Does this mean I can store it right next to the encrypted data it belongs to?

it's highly recommended to use an IV
No, it's required or you'll not get a fully secure ciphertext in most circumstances. At the very minimum, not supplying an IV for the same key and plaintext message will result in identical ciphertext, which will leak information to an adversary. In other words: encryption would be deterministic, and that's not a property that you want from a cipher. For CTR and GCM mode you may well leak all of the plaintext message though...
But what exactly is an IV ... ?
An IV just consists of binary bits. Its size and contents depend on the mode of operation (CBC/CTR/GCM). Generally it needs to be either a nonce or randomized.
CBC mode requires a randomized IV of 16 bytes; generally a cryptographically secure random number generator is used for that.
CTR mode commonly specifies both a nonce and the initial counter value within the IV of 16 bytes. So you already need to put the nonce in the left hand bytes (lowest index). This nonce may be randomized, but then it should be large enough (e.g. 12 bytes) to avoid the birthday problem.
GCM mode requires just a nonce of 12 bytes.
and how should I store it
Any way you can store the bytes is fine, as long as they can be retrieved or regenerated during decryption. If you need text you may need to encode it using base 64 or hexadecimals (this goes for the ciphertext as well, of course).
I read something about that an IV does not need to be kept secret.
That's correct.
Does this mean I can store it right next to the encrypted data it belongs to?
Correct, quite often the IV is simply prefixed to the ciphertext; if you know the block cipher and mode of operation then the size is predetermined after all.
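As a rough illustration of the prefixing described above, here is a small Python sketch (standard library only) that stores the IV in front of the ciphertext and base64-encodes the result so it fits in a text column. The helper names pack_record and unpack_record are hypothetical, the 16-byte IV length assumes CBC, and the ciphertext is whatever bytes your cipher produced.

```python
import base64

IV_LEN = 16  # assumption: AES-CBC; GCM would typically use 12


def pack_record(iv, ciphertext):
    """Return base64 text of IV || ciphertext, suitable for a text column."""
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    return base64.b64encode(iv + ciphertext).decode("ascii")


def unpack_record(record):
    """Split the stored text back into (iv, ciphertext)."""
    raw = base64.b64decode(record)
    return raw[:IV_LEN], raw[IV_LEN:]
```

Because the IV length is fixed by the mode, no separator or length field is needed to split the two parts again.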

Encryption & Decryption AES-256-CFB in Python

I have aes-256-cfb decryption code in ruby as follows.
data_cipher = OpenSSL::Cipher::Cipher.new "aes-256-cfb".freeze
data_cipher.decrypt
data_cipher.key = encryption_key
data_cipher.update(decode64(str)) << data_cipher.final
I need the Python equivalent of the code above. My problem is that wherever I found Python logic for aes-256-cfb, it always involved an initialization vector (IV). But in the Ruby logic above, the IV is not set.
I tried random values for the IV, but that does not give me the same result as the Ruby code.
Please advise.

For AES-256-CFB an IV is always needed for encryption. If no IV is given, it will most likely just be zero (which means 16 0x00 bytes, as the IV is equal to the block size, which is 128 bits). Another option would be that the IV is randomly generated at encryption time and is encapsulated in the message. That would mean that the first 16 bytes of the message are the IV. If you don't know how the encryption algorithm works, you will probably need to try this out.
However, since the iv is only used to decrypt the first block in CFB mode, if you have a long enough message, the decryption will work just fine even if the iv is wrong (except for the first 128 bit of the message).
Below is a code sample showing how decryption works in Python. You need to know the IV that was used for encryption. In the sample below I initialized it with zero bytes.
Note that this code only handles decryption of the raw message bytes. You will need to care about the encoding yourself.
from Crypto.Cipher import AES

def decrypt(key, enc):
    # All-zero 16-byte IV, assuming the Ruby side never set one
    iv = b'\x00' * 16
    # OpenSSL's aes-256-cfb is full-block CFB; PyCrypto defaults to 8-bit segments
    cipher = AES.new(key, AES.MODE_CFB, iv, segment_size=128)
    return cipher.decrypt(enc)
More info here (note that this thread is using CBC mode, which is a little different): Encrypt & Decrypt using PyCrypto AES 256
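The claim that a wrong IV only damages the first block can be checked with a small sketch of full-block CFB chaining. To stay within the Python standard library, AES is replaced here by a stand-in keyed function built from SHA-256 (toy_block_encrypt, a hypothetical name); CFB only ever runs the block cipher in the forward direction, so the chaining behaves the same way, but this is not AES and must not be used for real encryption.

```python
import hashlib

BLOCK = 16  # bytes, same block size as AES


def toy_block_encrypt(key, block):
    # Stand-in for AES block encryption: NOT a real cipher, illustration only.
    return hashlib.sha256(key + block).digest()[:BLOCK]


def cfb_encrypt(key, iv, plaintext):
    out = bytearray()
    prev = iv
    for i in range(0, len(plaintext), BLOCK):
        keystream = toy_block_encrypt(key, prev)
        block = bytes(p ^ k for p, k in zip(plaintext[i:i + BLOCK], keystream))
        out += block
        prev = block  # the next keystream block depends on this ciphertext block
    return bytes(out)


def cfb_decrypt(key, iv, ciphertext):
    out = bytearray()
    prev = iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        keystream = toy_block_encrypt(key, prev)
        out += bytes(c ^ k for c, k in zip(block, keystream))
        prev = block  # only the first block ever uses the IV
    return bytes(out)
```

Decrypting with the wrong IV garbles the first 16 bytes and leaves the rest intact, because every later block is keyed off the previous ciphertext block rather than the IV. That is also why a zero IV and a random IV are hard to tell apart on long messages.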

What is the equivalent cipher to bouncycastle's PaddedBufferedBlockCipher with AESEngine and PKCS7 padding?

I want to decrypt AES-256 encrypted string using nodejs. I am using crypto module for that.
The string is encrypted using the Bouncy Castle Java library. In Java the cipher is initialised using:
PaddedBufferedBlockCipher cipher = new PaddedBufferedBlockCipher(new AESEngine(), new PKCS7Padding());
The crypto module of Node.js uses OpenSSL's list of ciphers for initialising it, like:
var decipher = crypto.createDecipher('aes-256-cbc',key);
Which algorithm should I use?
Here is the list of algorithms to choose from:
-bash-4.1$ openssl list-cipher-algorithms|grep AES-256
AES-256-CBC
AES-256-CFB
AES-256-CFB1
AES-256-CFB8
AES-256-CTR
AES-256-ECB
AES-256-OFB
AES-256-XTS
AES256 => AES-256-CBC
aes256 => AES-256-CBC

If you encrypt something with a block cipher, you need
the block cipher which can take a single block of input and mangle it into a single block of output (for AES the block size is 16 bytes),
the mode of operation which enables you to encrypt more than one block in a structured fashion
the padding which enables you to encrypt something that is not exactly as long as a multiple of the block size.
The PaddedBufferedBlockCipher that you've shown only has two of them. The mode of operation is implied to be ECB mode, because it simply consists of applying the block cipher to each block separately.
You'll get the same behavior in node.js with:
var decipher = crypto.createDecipheriv('aes-xxx-ecb', key, '');
Exchange the xxx for the size of your key in bits. Valid sizes are 128 bit, 192 bit and 256 bit. Everything else will not work. Also, make sure that you get the encoding of your key right.
In case you're wondering why createDecipheriv is used here instead of createDecipher, I suggest that you carefully compare the documentation of both functions. createDecipher expects a password and not a key.
Other considerations:
Never use ECB mode. It's deterministic and therefore not semantically secure. You should at the very least use a randomized mode like CBC or CTR. It is better to authenticate your ciphertexts so that attacks like a padding oracle attack are not possible. This can be done with authenticated modes like GCM or EAX, or with an encrypt-then-MAC scheme.
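For reference, the padding component mentioned above is simple enough to sketch in a few lines of standard-library Python. This is an illustrative version of PKCS#7 padding for a 16-byte block, with hypothetical helper names pkcs7_pad and pkcs7_unpad; in practice the cipher library applies it for you.

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data, block=BLOCK):
    """Append n bytes of value n so the length becomes a multiple of block."""
    n = block - (len(data) % block)  # always 1..block, never 0
    return data + bytes([n]) * n


def pkcs7_unpad(padded, block=BLOCK):
    """Strip PKCS#7 padding, rejecting malformed input."""
    if not padded or len(padded) % block != 0:
        raise ValueError("length is not a multiple of the block size")
    n = padded[-1]
    if n < 1 or n > block or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]
```

Note that input which is already a multiple of the block size still gets a full extra block of padding, which is what lets the receiver remove it unambiguously.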

Decrypt the data with AES-256-ECB (I don't see any CBC or other modes.).
Call decipher.setAutoPadding(true) for using PKCS padding.

Which of these encryption methods is more secure? Why?

I am writing a program that takes a passphrase from the user and then writes some encrypted data to file. The method that I have come up with so far is as follows:
Generate a 128-bit IV from hashing the filename and the system time, and write this to the beginning of the file.
Generate a 256-bit key from the passphrase using SHA256.
Encrypt the data (beginning with a 32-bit static signature) with this key using AES in CBC mode, and write it to file.
When decrypting, the IV is read, and then the passphrase used to generate the key in the same way, and the first 32-bits are compared against what the signature should be in order to tell if the key is valid.
However I was looking at the AES example provided in PolarSSL (the library I am using to do the hashing and encryption), and they use a much more complex method:
Generate a 128-bit IV from hashing the filename and file size, and write this to the beginning of the file.
Generate a 256-bit key from hashing (SHA256) the passphrase and the IV together 8192 times.
Initialize the HMAC with this key.
Encrypt the data with this key using AES in CBC mode, and write it to file, while updating the HMAC with each encrypted block.
Write the HMAC to the end of the file.
I get the impression that the second method is more secure, but I don't have enough knowledge to back that up, other than that it looks more complicated.
If it is more secure, what are the reasons for this?
Is appending an HMAC to the end of the file more secure than having a signature at the beginning of the encrypted data?
Does hashing 8192 times increase the security?
Note: This is an open source project so whatever method I use, it will be freely available to anyone.

The second option is more secure.
Your method does not provide any message integrity. This means that an attacker can modify parts of the ciphertext and alter what the plain text decrypts to. So long as they don't modify anything that will alter your 32-bit static signature, you'll trust it. The HMAC in the second method provides message integrity.
By hashing the key 8192 times it adds extra computational steps for someone to try and bruteforce the key. Assume a user will pick a dictionary based password. With your method an attacker must perform SHA256(someguess) and then try and decrypt. However, with the PolarSSL version, they will have to calculate SHA256(SHA256(SHA256...(SHA256(someguess))) for 8192 times. This will only slow an attacker down, but it might be enough (for now).
For what it's worth, please use an existing library. Cryptography is hard and is prone to subtle mistakes.
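To show what the extra hashing rounds buy, here is a short Python sketch (standard library only) of both key derivations. It is loosely modelled on the description in the question and is not copied from PolarSSL; the function names, the zero-padding of the IV to 32 bytes, and the order in which IV and passphrase are combined are all assumptions.

```python
import hashlib


def single_hash_key(passphrase):
    """First method: one SHA-256 call, so one hash per password guess."""
    return hashlib.sha256(passphrase).digest()


def stretched_key(passphrase, iv, rounds=8192):
    """Second method: feed the running digest and passphrase back in `rounds` times."""
    digest = iv.ljust(32, b"\x00")  # start from the IV, zero-padded (assumption)
    for _ in range(rounds):
        digest = hashlib.sha256(digest + passphrase).digest()
    return digest
```

An attacker testing a dictionary now pays 8192 SHA-256 calls per guess instead of one, and because the IV is mixed in, work done for one file cannot be reused for another file with a different IV.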

Source and importance of nonce / IV for protocol using AES-GCM

I am making a protocol that uses packets (i.e., not a stream) encrypted with AES. I've decided on using GCM (based off CTR) because it provides integrated authentication and is part of the NSA's Suite B. The AES keys are negotiated using ECDH, where the public keys are signed by trusted contacts as a part of a web-of-trust using something like ECDSA. I believe that I need a 128-bit nonce / initialization vector for GCM because even though I'm using a 256 bit key for AES, it's always a 128 bit block cipher (right?) I'll be using a 96 bit IV after reading the BC code.
I'm definitely not implementing my own algorithms (just the protocol -- my crypto provider is BouncyCastle), but I still need to know how to use this nonce without shooting myself in the foot. The AES key used in between two people with the same DH keys will remain constant, so I know that the same nonce should not be used for more than one packet.
Could I simply prepend a 96-bit pseudo random number to the packet and have the recipient use this as a nonce? This is peer-to-peer software and packets can be sent by either at any time (e.g., an instant message, file transfer request, etc.) and speed is a big issue so it would be good not to have to use a secure random number source. The nonce doesn't have to be secret at all, right? Or necessarily as random as a "cryptographically secure" PRNG? Wikipedia says that it should be random, or else it is susceptible to a chosen plaintext attack -- but there's a "citation needed" next to both claims and I'm not sure if that's true for block ciphers. Could I actually use a counter that counts the number of packets sent (separate from the counter of the number of 128 bit blocks) with a given AES key, starting at 1? Obviously this would make the nonce predictable. Considering that GCM authenticates as well as encrypts, would this compromise its authentication functionality?

GCM is a block cipher counter mode with authentication. A counter mode effectively turns a block cipher into a stream cipher, and therefore many of the rules for stream ciphers still apply. It's important to note that the same Key+IV will always produce the same PRNG stream, and reusing this PRNG stream can lead to an attacker obtaining plaintext with a simple XOR. In a protocol the same Key+IV can be used for the life of the session, so long as the mode's counter doesn't wrap (int overflow). For example, a protocol could have two parties with a pre-shared secret key, and they could then negotiate a new cryptographic nonce that is used as the IV for each session (remember, nonce means use ONLY ONCE).
If you want to use AES as a block cipher you should look into CMAC mode or perhaps the OMAC1 variant. With CMAC mode all of the rules for CBC still apply. In this case you would have to make sure that each packet used a unique IV that is also random. However, it's important to note that reusing an IV doesn't have nearly as dire consequences as reusing a PRNG stream.
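The "simple XOR" mentioned above can be illustrated without any cipher library. In this Python sketch a random byte string stands in for the keystream that a counter mode would generate from one Key+IV pair; the messages and variable names are made up for the example.

```python
import os


def xor(a, b):
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-in for the keystream produced by one Key+IV pair (assumption).
keystream = os.urandom(21)

p1 = b"transfer 100 to alice"
p2 = b"transfer 900 to carol"

# Two packets encrypted under the SAME keystream, i.e. a reused nonce.
c1 = xor(p1, keystream)
c2 = xor(p2, keystream)

# The keystream cancels out: an eavesdropper learns p1 XOR p2 ...
leak = xor(c1, c2)

# ... and anyone who knows or guesses p1 recovers p2 without the key.
recovered = xor(leak, p1)
```

Here recovered equals p2 even though the key and keystream were never revealed, which is why a nonce must never repeat under the same key.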

I'd suggest against making your own security protocol. There are several things you need to consider that even a qualified cryptographer can get wrong. I'd refer you to the TLS protocol (RFC 5246) and the datagram TLS protocol (RFC 4347). Pick a library and use them.
Concerning your question with IV in GCM mode. I'll tell you how DTLS and TLS do it. They use an explicit nonce, i.e. the message sequence number (64-bits) that is included in every packet, with a secret part that is not transmitted (the upper 32 bits) and is derived from the initial key exchange (check RFC 5288 for more information).
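A counter-based nonce in the spirit of the scheme described above could be assembled like this in Python (standard library only). The function name build_nonce is hypothetical, and the layout of a 4-byte value from the key exchange followed by an 8-byte big-endian sequence number is a sketch of the idea, so check RFC 5288 for the exact wire format.

```python
import struct


def build_nonce(implicit_part, sequence_number):
    """Return a 96-bit GCM nonce: 4 implicit bytes || 64-bit packet counter."""
    if len(implicit_part) != 4:
        raise ValueError("implicit part must be 4 bytes")
    if not 0 <= sequence_number < 2 ** 64:
        raise ValueError("sequence number out of range, rekey before it wraps")
    return implicit_part + struct.pack(">Q", sequence_number)
```

Only the 8-byte sequence number has to travel with each packet. Since both peers can send at any time under the same key, each direction needs its own implicit part (or its own key) so that the two counters can never produce the same nonce.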
