How does RSA Passphrase Encryption work under the hood?


So in the .ssh directory, there is a file named "id_rsa" which is the private key file.
It contains the encrypted private key, the encryption algorithm (AES-128-CBC), and the initialization vector (IV).
I understand that it gets decrypted automatically when you enter your passphrase, and that the decryption algorithm takes the encrypted private key, the passphrase (as the key), and the IV (which I already have).
I just want to know how the passphrase is padded. AES-128 takes a 128-bit key, and a passphrase is usually a lot shorter than that.
I am trying to manually decrypt my private key (for learning purposes). The only thing I am missing is how I should pad my passphrase so I can pass it to the decryption algorithm.
To sum it all up: how is a passphrase, which is a string, converted to a 128-bit (16-byte) key?
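For reference, here is a minimal sketch of the derivation as it is usually described for the legacy PEM format (the one with a DEK-Info: AES-128-CBC,<IV> header). The passphrase is not padded. It is hashed: key = MD5(passphrase || first 8 bytes of the IV), which is OpenSSL's EVP_BytesToKey with MD5 and a single round. The function name, passphrase, and IV below are made up for illustration, and keys in the newer OpenSSH file format use a different derivation.

```javascript
const crypto = require('crypto');

// Assumption: legacy OpenSSL PEM scheme (EVP_BytesToKey, MD5, one round).
// The salt is the first 8 bytes of the IV taken from the DEK-Info header.
function deriveLegacyPemKey(passphrase, ivHex) {
  const iv = Buffer.from(ivHex, 'hex');
  const salt = iv.subarray(0, 8);
  // A single MD5 digest is 16 bytes, which is exactly one AES-128 key.
  return crypto.createHash('md5')
    .update(passphrase, 'utf8')
    .update(salt)
    .digest();
}

// Hypothetical passphrase and IV, for illustration only.
const pemKey = deriveLegacyPemKey('my passphrase', '3A5F0C9E12B4D67788990A1B2C3D4E5F');
console.log(pemKey.length, pemKey.toString('hex'));
```

The resulting 16 bytes would then be used as the AES-128-CBC key together with the full 16-byte IV.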

Related

How does an IV work and what would be the best way to store it?

I want to encrypt and decrypt strings. I'm using Node.js crypto for this. I've read that when encrypting and decrypting it's highly recommended to use an IV. I want to store the encrypted data inside a MySQL database and decrypt it later when needed. I understand that I also need the IV for the decryption process. But what exactly is an IV and how should I store it? I read somewhere that an IV does not need to be kept secret. Does this mean I can store it right next to the encrypted data it belongs to?

it's highly recommended to use an IV
No, it's required or you'll not get a fully secure ciphertext in most circumstances. At the very minimum, not supplying an IV for the same key and plaintext message will result in identical ciphertext, which will leak information to an adversary. In other words: encryption would be deterministic, and that's not a property that you want from a cipher. For CTR and GCM mode you may well leak all of the plaintext message though...
But what exactly is an IV ... ?
An IV just consists of binary bits. Its size and contents depend on the mode of operation (CBC/CTR/GCM). Generally it needs to be either a nonce or randomized.
CBC mode requires a randomized IV of 16 bytes; generally a cryptographically secure random number generator is used for that.
CTR mode commonly specifies both a nonce and the initial counter value within the IV of 16 bytes. So you already need to put the nonce in the left hand bytes (lowest index). This nonce may be randomized, but then it should be large enough (e.g. 12 bytes) to avoid the birthday problem.
GCM mode requires just a nonce of 12 bytes.
and how should I store it
Any way you can store the bytes, as long as they can be retrieved or regenerated during decryption. If you need text, you may need to encode it using base 64 or hexadecimals (this goes for the ciphertext as well, of course).
I read somewhere that an IV does not need to be kept secret.
That's correct.
Does this mean I can store it right next to the encrypted data it belongs to?
Correct, quite often the IV is simply prefixed to the ciphertext; if you know the block cipher and mode of operation then the size is predetermined after all.
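A minimal sketch of the "prefix the IV to the ciphertext" approach described above, using Node's crypto module with AES-256-CBC. The function names and the throwaway key are illustrative assumptions. In practice the key would come from your own key management, and an authenticated mode such as GCM may be preferable.

```javascript
const crypto = require('crypto');

// Encrypt with a fresh random 16-byte IV and store it in front of the ciphertext.
function encryptWithIv(plaintext, key) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  // Base64 so the result fits in a text column.
  return Buffer.concat([iv, ciphertext]).toString('base64');
}

// The IV size is fixed for the mode, so it can simply be sliced off again.
function decryptWithIv(stored, key) {
  const raw = Buffer.from(stored, 'base64');
  const iv = raw.subarray(0, 16);
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]).toString('utf8');
}

const demoKey = crypto.randomBytes(32); // hypothetical 256-bit key
const stored = encryptWithIv('hello', demoKey);
console.log(decryptWithIv(stored, demoKey));
```

Because the IV is random, encrypting the same string twice gives two different stored values, which is the non-deterministic behaviour the answer asks for.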

Can we encrypt data that must be decrypted with any private key plus a server generate bits?

I have come up with a scenario for securing data. Suppose I have a public encrypted file that anybody can download, but whenever anyone wants to decrypt that data they need to get a key from the server.
To make sure the key cannot be shared, the key from the server must not be able to decrypt the data directly. The data must afterwards be decrypted with the client's private key, without the server knowing that private key.
I hope the diagram below explains it clearly.
Is it possible? What algorithm could do this?

Make it so each time the file is downloaded, a random string is appended. The file is then encrypted with the user's public key, and symmetrically with an appropriate hash generated by that same string. For example a GPG file inside a password-protected ZIP file.
So Alice downloads Financial_Report_201809_d8a1b2e6.pdf.zip while Bob downloads Financial_Report_201809_ff2a91c3.pdf.zip.
If they want to decrypt the file, they need to send the server back the random string, and the server will supply them with the password for the outer ZIP. Then they're left with an encrypted file that only their private key can decode.
Note that once they have decrypted the file, nothing stops them from forwarding the file in the clear to someone else. On the other hand, sharing the encrypted PDF avails them nothing, as they would also need to share their private key.
Also note that since they need to be online to get the outer password, and they're left with a cleartext file at the end, this is (almost) functionally equivalent to the file being downloaded in the clear once user identity has been established.
The main differences are:
the ciphered file (PDF in the above example) might not have been encrypted by the server at all. It might have been supplied by the user, who is then satisfied that only he can read the file back (it makes little sense for anyone else to download it, though).
the transmitted file is very securely transmitted. An attacker with full access to the datastream would not be able to decode the file (but this is no more than could be gained by just encrypting with the user's public key - no extra ZIP stage required).
UPDATE
You want to encrypt the whole file only once (for all users), and then send the same file to Alice and Bob, and have them require two different keys at decryption time. The problem here is that Alice's key will also work on Bob's file, since it is the same file. There's no magic that's going to work here, unless you can hide some detail of the decryption process (e.g. use a program that you control, that can't be debugged, and that will always connect to your server: a proposition that has consistently been shown to be a losing one).
If you want to limit the encryption cost, you can send the massive file with both a symmetrically encrypted data payload (always the same) and a very short, asymmetrically encrypted key payload (always different), but still you will be vulnerable to the decrypted key being captured:
[ RSA(ALICE.PUB, "SQUEAMISH OSSIFRAGE") ][ RIJNDAEL("SQUEAMISH OSSIFRAGE", LARGE FILE) ]
In the above scenario some program has to read the encryption header and decrypt the 'Squeamish Ossifrage' password, then go on decrypting (e.g. playing) the extra payload without the password being intercepted. This means that you need to supply the program yourself.
This is functionally equivalent to the program connecting to the server and downloading a "yes" or "no" to the question (appropriately encrypted, signed and secured) "I am Alice's player. Can I decrypt and play 'Never Wanna Give You Up.avi'?", with no passwords or public keys being known or exchanged apart from the secret shared by Alice's player and the server.
UPDATE II
If the goal is to save encryption resources, the encryption could be made client side as hinted in the comment:
the file is encrypted the once, with a purpose-generated private key.
the private key is stored inside a binary (we must assume it to be unhackable).
the user has to supply his public key for the decryption to work
the program can verify the public key from a repository (or, alternately, the user can supply the public key to the server, which will generate and send the binary file for download)
the program then runs both the decryption and reencryption
the user is left with a file encrypted with his public key, that he alone can decrypt.
UPDATE III
In order for the cleartext file to never be exposed (i.e., it does not matter whether the algorithm gets leaked), you could devise the following scheme. Keep in mind that I'm not a cryptographer and there could be all sort of side channels left uncovered.
You prepare a conversion table that maps each 16-bit word to another 16-bit word. This is a flavour of symmetric encryption, even if you use two reciprocal matrices for encoding and decoding. Each matrix holds all possible 16-bit words, which means 65536 two-byte values, and is therefore 128 KB in size.
You encrypt the file, once, with the encryption matrix. Without the decryption matrix, the file is unusable.
The user has to send you his public key.
You prepare a transmogrification matrix by encrypting each word with that key, and use the decryption value as an index.
So, for example, say the first word of the cleartext file is A18B. In the encryption matrix, after the scramble, the A18B-th position will contain say 701C, and the decryption matrix, therefore, in the 701Cth position, will hold a18b.
The user has a file starting with 701c... which is of no use.
The user sends you his public key and you run 65536 encryptions on all words from 0000 to ffff. You then determine that the encryption of a18b is 791c. You prepare a re-encoding matrix that has 791c in the 701cth position.
You then send the user this matrix, which has 128K bytes, where the 701cth position is 791c.
The user runs the transmogrification, which is very fast, and is left with a file starting with 791c (as the 701c became 791c - I mistakenly chose two similar values in my example, that is of no significance). This value, once decrypted with his private key, will yield a18b which is the "readable" value.
The user has now a file that's been encrypted by his public key. The a18b value never appeared anywhere.
All that's left is for the user to decrypt the file using his private key and a code block size of 16 bits. This operation will be run by the client and be quite slow, and it's the reason why usually a large random quick symmetric key is RSA-encoded, and used to symmetrically quickly encrypt the large file, which can be quickly decrypted after the private key has unlocked the symmetric key.
The user cannot send the 128K to anyone, for they're useless without the private key.
(The problem here is still that the user can now decrypt the file with his private key, and send it around, even if it's unwieldy as it's a very large file).

the data must be decrypted with the client's private key after, without server knowing those client's privateKey
the original file can be decrypted only by a specific client, using their own private key,
There's a commonly used scheme called a hybrid cryptosystem.
The steps are:
The original data are encrypted with a random unique key.
The data encryption key is encrypted with the client's public key (the client's public key needs to be known to the server).
The client uses its private key to decrypt the data encryption key, and then uses that key to decrypt the file.
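The steps above can be sketched with Node's crypto module as follows. This is only an illustration under assumptions: the function names are hypothetical, AES-256-GCM and RSA with the default OAEP padding are my choices rather than anything the answer prescribes, and the key pair is generated on the spot where a real client would supply its existing public key.

```javascript
const crypto = require('crypto');

// Server side: encrypt the data with a random key, then wrap that key
// with the client's public key.
function hybridEncrypt(data, clientPublicKey) {
  const dataKey = crypto.randomBytes(32);
  const nonce = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', dataKey, nonce);
  const ciphertext = Buffer.concat([cipher.update(data), cipher.final()]);
  const tag = cipher.getAuthTag();
  const wrappedKey = crypto.publicEncrypt(clientPublicKey, dataKey);
  return { wrappedKey, nonce, tag, ciphertext };
}

// Client side: unwrap the data key with the private key, then decrypt the data.
function hybridDecrypt(msg, clientPrivateKey) {
  const dataKey = crypto.privateDecrypt(clientPrivateKey, msg.wrappedKey);
  const decipher = crypto.createDecipheriv('aes-256-gcm', dataKey, msg.nonce);
  decipher.setAuthTag(msg.tag);
  return Buffer.concat([decipher.update(msg.ciphertext), decipher.final()]);
}

// Hypothetical client key pair, generated here only so the sketch runs.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const hybridMsg = hybridEncrypt(Buffer.from('large file contents'), publicKey);
console.log(hybridDecrypt(hybridMsg, privateKey).toString('utf8'));
```

Only the short wrappedKey differs per client. The bulk ciphertext can be reused if the same data key is wrapped for several recipients.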

You can use any asymmetric encryption algorithm.
A public/private key pair is used. The public key is used to encrypt data that can only be decrypted with the private key. There are a lot of resources on this, for example the article from InfoSec Institute.
There are several well-proven asymmetric algorithms such as RSA, DSA, and Elliptic Curve Cryptography (used by the Ethereum blockchain); note that DSA only produces signatures and cannot encrypt. There are many Python libraries too.

What is the equivalent cipher to bouncycastle's PaddedBufferedBlockCipher with AESEngine and PKCS7 padding?

I want to decrypt AES-256 encrypted string using nodejs. I am using crypto module for that.
The string is encrypted using the Bouncy Castle Java library. In Java the cipher is initialised using:
PaddedBufferedBlockCipher cipher = new PaddedBufferedBlockCipher(new AESEngine(), new PKCS7Padding());
The crypto module of Node.js uses OpenSSL's list of ciphers for initialising it, like:
var decipher = crypto.createDecipher('aes-256-cbc',key);
Which algorithm should I use?
Here is the list of algorithms to choose from:
-bash-4.1$ openssl list-cipher-algorithms|grep AES-256
AES-256-CBC
AES-256-CFB
AES-256-CFB1
AES-256-CFB8
AES-256-CTR
AES-256-ECB
AES-256-OFB
AES-256-XTS
AES256 => AES-256-CBC
aes256 => AES-256-CBC

If you encrypt something with a block cipher, you need
the block cipher which can take a single block of input and mangle it into a single block of output (for AES the block size is 16 bytes),
the mode of operation which enables you to encrypt more than one block in a structured fashion
the padding which enables you to encrypt something that is not exactly as long as a multiple of the block size.
The PaddedBufferedBlockCipher that you've shown only has two of them. The mode of operation is implied to be ECB mode, because it simply consists of applying the block cipher to each block separately.
You'll get the same behavior in node.js with:
var decipher = crypto.createDecipheriv('aes-xxx-ecb', key, '');
Exchange the xxx for the size of your key in bits. Valid sizes are 128 bit, 192 bit and 256 bit. Everything else will not work. Also, make sure that you get the encoding of your key right.
In case you're wondering why createDecipheriv is used here instead of createDecipher, I suggest that you carefully compare the documentation to both of those functions. createDecipher expects a password and not a key.
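A minimal runnable sketch of that call, assuming a raw 16-, 24- or 32-byte key in a Buffer. The helper name is hypothetical, and the ciphertext here is produced by Node itself as a stand-in, so interoperability with real Bouncy Castle output is not demonstrated.

```javascript
const crypto = require('crypto');

// ECB has no IV, so null is passed; Node's automatic padding is PKCS#7-style.
function decryptEcbPkcs7(ciphertext, key) {
  const algorithm = `aes-${key.length * 8}-ecb`; // 128, 192 or 256 bits
  const decipher = crypto.createDecipheriv(algorithm, key, null);
  decipher.setAutoPadding(true); // already the default, shown for clarity
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Stand-in ciphertext created locally with the same algorithm.
const ecbKey = crypto.randomBytes(32);
const ecbCipher = crypto.createCipheriv('aes-256-ecb', ecbKey, null);
const ecbCiphertext = Buffer.concat([ecbCipher.update('secret', 'utf8'), ecbCipher.final()]);

console.log(decryptEcbPkcs7(ecbCiphertext, ecbKey).toString('utf8'));
```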
Other considerations:
Never use ECB mode. It's deterministic and therefore not semantically secure. You should at the very least use a randomized mode like CBC or CTR. It is better to authenticate your ciphertexts so that attacks like a padding oracle attack are not possible. This can be done with authenticated modes like GCM or EAX, or with an encrypt-then-MAC scheme.

Decrypt the data with AES-256-ECB (I don't see any CBC or other modes.).
Call decipher.setAutoPadding(true) for using PKCS padding.

How do I tell if OpenPGP encryption is symmetric or asymmetric?

Is there a way to tell if things encrypted via the GNU Privacy Guard are symmetric or asymmetric (without decrypting them or already knowing to start with)? How?
Anyway (for those who want to know what I'm doing), I used Python 3.x to program a GUI-based IDE of sorts that can open symmetrically encrypted files (and save them, too). It can open asymmetrically encrypted files (enter the passphrase to use your secret key instead of the passphrase to decrypt a symmetrically encrypted file). However, it doesn't know they're asymmetric and will overwrite them with symmetrically encrypted files if saved. It would be nice to be able to save them asymmetrically, too. My editor uses the gpg command-line program on Linux (no gpg libraries or anything like that).
I could have a checkbox on the password prompt for asymmetric encryption, but I'd rather not make it so it has to be a manual thing for the user.
For my own personal files, I could add some kind of marker to the saved files to distinguish, but I want it to be able to open them correctly even if they weren't created in my IDE.
I know there's a question with a similar title, but the question asked in the body is fundamentally different.

OpenPGP is a hybrid cryptosystem, which means messages (or files) are always encrypted symmetrically using a so-called session key. The session key again is encrypted using asymmetric encryption (using a public key) or symmetric encryption again (using a string to key function).
This has technical reasons (asymmetric cryptography is very slow for large amounts of data), but also practical ones: by encrypting the small session key multiple times (once for each recipient), you can also have multiple recipients with different keys and even mix asymmetric (public key) and symmetric (password based) encryption in a single OpenPGP message.
Each of those encrypted copies of the session key forms an OpenPGP packet, either a packet with tag 1 (Public-Key Encrypted Session Key Packet) or a packet with tag 3 (Symmetric-Key Encrypted Session Key Packet). Those packets in an OpenPGP message can be easily decomposed using pgpdump. An example using GnuPG to create an OpenPGP message encrypting for both my own key and symmetrically for the passphrase foo:
$ echo foo | gpg --recipient a4ff2279 --symmetric --passphrase foo --encrypt | pgpdump
Old: Public-Key Encrypted Session Key Packet(tag 1)(524 bytes)
    New version(3)
    Key ID - 0xCC73B287A4388025
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA m^e mod n(4096 bits) - ...
        -> m = sym alg(1 byte) + checksum(2 bytes) + PKCS-1 block type 02
Old: Symmetric-Key Encrypted Session Key Packet(tag 3)(46 bytes)
    New version(4)
    Sym alg - AES with 128-bit key(sym 7)
    Iterated and salted string-to-key(s2k 3):
        Hash alg - SHA512(hash 10)
        Salt - 0c a6 e6 1d d2 f4 9a 50
        Count - 102400(coded count 105)
    Encrypted session key
        -> sym alg(1 bytes) + session key
New: Symmetrically Encrypted and MDC Packet(tag 18)(63 bytes)
    Ver 1
    Encrypted data [sym alg is specified in sym-key encrypted session key]
        (plain text + MDC SHA1(20 bytes))
Each of the first two packets forms a key to open the encrypted string in the Symmetrically Encrypted and MDC Packet.
This also already explains how to analyze how a message was encrypted: look through the packets, looking for either tag 1 or 3 packets, indicating asymmetric or symmetric encryption (and be aware both might exist). You seem to be very lucky, and the Python GnuPG module already brings a ListPackets class, so you neither have to interface pgpdump nor write your own OpenPGP parser.
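If parsing by hand is ever needed, the packet walk described above is short. The following is a rough sketch based on the packet header layout in RFC 4880, assuming a binary (not ASCII-armored) message whose session-key packets come before the data packet. The function name and the sample bytes are made up, and this is not a full OpenPGP parser.

```javascript
// Returns one entry per leading session-key packet: tag 1 or tag 3.
function listSessionKeyPackets(buf) {
  const kinds = [];
  let pos = 0;
  while (pos < buf.length) {
    const hdr = buf[pos++];
    if ((hdr & 0x80) === 0) throw new Error('not a binary OpenPGP packet');
    let tag;
    let len;
    if (hdr & 0x40) {
      // New-format header: tag in the low 6 bits, variable-length length field.
      tag = hdr & 0x3f;
      const o1 = buf[pos++];
      if (o1 < 192) len = o1;
      else if (o1 < 224) len = ((o1 - 192) << 8) + buf[pos++] + 192;
      else if (o1 === 255) { len = buf.readUInt32BE(pos); pos += 4; }
      else break; // partial body length, only expected on data packets
    } else {
      // Old-format header: tag in bits 5..2, length type in bits 1..0.
      tag = (hdr >> 2) & 0x0f;
      const lengthType = hdr & 0x03;
      if (lengthType === 0) len = buf[pos++];
      else if (lengthType === 1) { len = buf.readUInt16BE(pos); pos += 2; }
      else if (lengthType === 2) { len = buf.readUInt32BE(pos); pos += 4; }
      else break; // indeterminate length
    }
    if (tag === 1) kinds.push('asymmetric');
    else if (tag === 3) kinds.push('symmetric');
    else break; // reached the encrypted data packet
    pos += len;
  }
  return kinds;
}

// Made-up bytes: a tag 1 packet, a tag 3 packet, then a tag 18 packet.
const sample = Buffer.from([0x84, 2, 0, 0, 0x8c, 1, 0, 0xd2, 1, 0]);
console.log(listSessionKeyPackets(sample)); // [ 'asymmetric', 'symmetric' ]
```

On real files it is likely simpler to call gpg --list-packets or pgpdump and look for the same two packet types in the output.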

Since you are using linux commands, I think you can try with the "file" utility to check the header and tell if the encryption is symmetric or asymmetric.
The output would be something like this (Tested in Ubuntu 14.04):
Command: file symm_encrypted.txt.gpg
Output: GPG symmetrically encrypted data (CAST5 cipher)
Command: file asymm_encrypted.txt.gpg
Output: GPG encrypted data

Which of these encryption methods is more secure? Why?

I am writing a program that takes a passphrase from the user and then writes some encrypted data to file. The method that I have come up with so far is as follows:
Generate a 128-bit IV from hashing the filename and the system time, and write this to the beginning of the file.
Generate a 256-bit key from the passphrase using SHA256.
Encrypt the data (beginning with a 32-bit static signature) with this key using AES in CBC mode, and write it to file.
When decrypting, the IV is read, and then the passphrase used to generate the key in the same way, and the first 32-bits are compared against what the signature should be in order to tell if the key is valid.
However I was looking at the AES example provided in PolarSSL (the library I am using to do the hashing and encryption), and they use a much more complex method:
Generate a 128-bit IV from hashing the filename and file size, and write this to the beginning of the file.
Generate a 256-bit key from hashing (SHA256) the passphrase and the IV together 8192 times.
Initialize the HMAC with this key.
Encrypt the data with this key using AES in CBC mode, and write it to file, while updating the HMAC with each encrypted block.
Write the HMAC to the end of the file.
I get the impression that the second method is more secure, but I don't have enough knowledge to back that up, other than that it looks more complicated.
If it is more secure, what are the reasons for this?
Is appending an HMAC to the end of the file more secure than having a signature at the beginning of the encrypted data?
Does hashing 8192 times increase the security?
Note: This is an open source project so whatever method I use, it will be freely available to anyone.

The second option is more secure.
Your method does not provide any message integrity. This means that an attacker can modify parts of the ciphertext and alter what the plaintext decrypts to. As long as they don't modify anything that alters your 32-bit static signature, you'll trust it. The HMAC in the second method provides message integrity.
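To illustrate the integrity point, here is a small sketch of computing and checking an HMAC over the IV and ciphertext with Node's crypto module. The names are hypothetical, the random bytes stand in for real AES-CBC output, and using a separate MAC key is my assumption, not something taken from the PolarSSL example.

```javascript
const crypto = require('crypto');

// Tag covers both the IV and the ciphertext.
function macTag(macKey, iv, ciphertext) {
  return crypto.createHmac('sha256', macKey).update(iv).update(ciphertext).digest();
}

// Constant-time comparison so the check does not leak how many bytes matched.
function verifyTag(macKey, iv, ciphertext, tag) {
  const expected = macTag(macKey, iv, ciphertext);
  return tag.length === expected.length && crypto.timingSafeEqual(tag, expected);
}

const macKey = crypto.randomBytes(32);
const macIv = crypto.randomBytes(16);
const macCiphertext = crypto.randomBytes(48); // stand-in for encrypted data
const tag = macTag(macKey, macIv, macCiphertext);

// Flip a single bit somewhere in the middle of the ciphertext.
const tampered = Buffer.from(macCiphertext);
tampered[20] ^= 0x01;

console.log(verifyTag(macKey, macIv, macCiphertext, tag)); // true
console.log(verifyTag(macKey, macIv, tampered, tag));      // false
```

A static signature at the start of the plaintext would not notice the flipped bit at offset 20, whereas the HMAC check rejects it.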
Hashing the key 8192 times adds extra computational work for someone trying to brute-force the key. Assume a user picks a dictionary-based password. With your method, an attacker must compute SHA256(someguess) and then try to decrypt. With the PolarSSL version, they have to compute SHA256(SHA256(...SHA256(someguess)...)), nested 8192 times, for every guess. This only slows an attacker down, but it might be enough (for now).
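A rough sketch of that kind of stretching, assuming each round hashes the previous digest together with the passphrase and starts from the IV. The exact order in which PolarSSL feeds the IV and passphrase is not given here, so treat the layout and the function name as assumptions.

```javascript
const crypto = require('crypto');

// Repeatedly hash (previous digest || passphrase), starting from the IV.
function stretchKey(passphrase, iv, rounds = 8192) {
  const pass = Buffer.from(passphrase, 'utf8');
  let digest = Buffer.from(iv);
  for (let i = 0; i < rounds; i++) {
    digest = crypto.createHash('sha256').update(digest).update(pass).digest();
  }
  return digest; // 32 bytes, usable as an AES-256 key
}

const stretchIv = crypto.randomBytes(16); // hypothetical IV, doubles as salt
const stretched = stretchKey('correct horse', stretchIv);
console.log(stretched.toString('hex'));
```

Each guess now costs the attacker 8192 hash computations instead of one. Node also ships purpose-built functions for this job, such as crypto.pbkdf2Sync and crypto.scryptSync.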
For what it's worth, please use an existing library. Cryptography is hard and is prone to subtle mistakes.
