Seeing a list of recipients in an encrypted message - pgp

I'm using PGP to encrypt and send messages to friends.
I've read that the message is encrypted using a symmetric key, and that the symmetric key is then encrypted using the recipient's public key. If you have multiple recipients, the symmetric key is encrypted multiple times, once for each recipient, and each copy is added to the encrypted message. If you set a flag, the key is also encrypted with your own public key and added to the message, so that you can decrypt it yourself later from your sent items folder.
Now I imagined that the encrypted symmetric keys would be embedded in the message as a table with columns email address and encrypted symmetric key. So one recipient, e.g. John, would look through this table for his email address, say john#somewhere.com, find it, and then know that that entry was for him to decode to get the symmetric key.
My question is: why can't I see a list of the recipients in the encrypted message? Without that, the recipient would have to go through each entry in the table and attempt to decrypt it until he finds one that he can. Given that the result is a random number (the symmetric key), how would any recipient know it was decrypted properly? I guess he would also have to try each result as a symmetric key until he finds one that works.
So again, I sort of assumed that I should be able to see a list of recipients in the encrypted message without decrypting, but I can't. What's going on?

In OpenPGP terminology, the packet that holds the symmetric key encrypted with the public key of the recipient is known as a 'Public-Key Encrypted Session Key Packet'. It is defined in RFC 4880: https://www.rfc-editor.org/rfc/rfc4880#page-17
In this packet only the Key ID of the public encryption key is stored (not its User Id - which is an email address in most cases). And the recipient finds the packet that she should decrypt by searching by Key ID (actually this should be done by the PGP software).
The recipient will know whether the symmetric key was decrypted properly, because otherwise the decryption will fail: the decrypted block has to carry valid padding and a two-byte checksum over the session key.
The same applies to decrypting the data packet with a random key: the integrity check stored at the end of the encrypted data (the MDC, where the message uses one) will not verify. Even if you removed that check from the implementation, you would just get garbage data :)
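The lookup and the checksum can be sketched in a few lines of Python. This is only an illustration of the idea, not a real OpenPGP implementation: the packet dictionaries and the function names are made up for the example, and the public-key decryption step itself is left out. The checksum rule (sum of the session key bytes modulo 65536) follows RFC 4880.

```python
def checksum(session_key: bytes) -> bytes:
    """Two-byte checksum: sum of the session key bytes modulo 65536."""
    return (sum(session_key) % 65536).to_bytes(2, "big")


def pack_session_key(sym_alg: int, session_key: bytes) -> bytes:
    """Data that gets public-key encrypted: algorithm id, key, checksum."""
    return bytes([sym_alg]) + session_key + checksum(session_key)


def unpack_session_key(blob: bytes):
    """Return (sym_alg, session_key); raise if the checksum does not match."""
    sym_alg, key, check = blob[0], blob[1:-2], blob[-2:]
    if checksum(key) != check:
        raise ValueError("checksum mismatch: wrong key or damaged packet")
    return sym_alg, key


def find_my_packet(packets: list, my_key_ids: set):
    """Pick the session key packet addressed to one of our keys, by Key ID.

    `packets` is a hypothetical list of dicts with a "key_id" entry.
    """
    for packet in packets:
        if packet["key_id"] in my_key_ids:
            return packet
    return None
```

So the software never has to try every entry: it compares Key IDs first, and the checksum tells it whether the result of the decryption makes sense.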

Related

Can we encrypt data that must be decrypted with any private key plus a server generate bits?

I have come up with a scenario for securing data. Suppose I have a public encrypted file that anybody can download, but whenever anyone wants to decrypt that data they need to get a key from a server.
To make sure the key cannot be shared, the key from the server should not be able to decrypt the data directly. The data must afterwards also be decrypted with the client's private key, without the server knowing that client's private key.
I hope the diagram below explains it clearly.
Is it possible? What is the algorithm that could do this?
Make it so each time the file is downloaded, a random string is appended. The file is then encrypted with the user's public key, and symmetrically with an appropriate hash generated by that same string. For example a GPG file inside a password-protected ZIP file.
So Alice downloads Financial_Report_201809_d8a1b2e6.pdf.zip while Bob downloads Financial_Report_201809_ff2a91c3.pdf.zip.
If they want to decrypt the file, they need to send the server back the random string, and the server will supply them with the password for the outer ZIP. Then they're left with an encrypted file that only their private key can decode.
Note that once they have decrypted the file, nothing stops them from forwarding the file in the clear to someone else. On the other hand, sharing the encrypted PDF avails them nothing, as they would also need to share their private key.
Also note that since they need to be online to get the outer password, and they're left with a cleartext file at the end, this is (almost) functionally equivalent to the file being downloaded in the clear once user identity has been established.
The main differences are:
the ciphered file (PDF in the above example) might not have been encrypted by the server at all. It might have been supplied by the user, who is then satisfied that only he can read the file back (it makes little sense for anyone else to download it, though).
the transmitted file is very securely transmitted. An attacker with full access to the datastream would not be able to decode the file (but this is no more than could be gained by just encrypting with the user's public key - no extra ZIP stage required).
UPDATE
You want to encrypt the whole file only once (for all users), and then send the same file to Alice and Bob, and have them require two different keys at decryption time. The problem here is that Alice's key will also work on Bob's file, since it is the same file. There's no magic that's going to work here, unless you can hide some detail of the decryption process (e.g. use a program that you control, that can't be debugged, and that will always connect to your server: a proposition that has consistently been shown to be a losing one).
If you want to limit the encryption cost, you can send the massive file with both a symmetrically encrypted data payload (always the same) and a very short, asymmetrically encrypted key payload (always different), but still you will be vulnerable to the decrypted key being captured:
[ RSA(ALICE.PUB, "SQUEAMISH OSSIFRAGE") ][ RIJNDAEL("SQUEAMISH OSSIFRAGE", LARGE FILE) ]
In the above scenario some program has to read the encryption header and decrypt the 'Squeamish Ossifrage' password, then go on decrypting (e.g. playing) the extra payload without the password being intercepted. This means that you need to supply the program yourself.
This is functionally equivalent to the program connecting to the server and downloading a "yes" or "no" to the question (appropriately encrypted, signed and secured) "I am Alice's player. Can I decrypt and play 'Never Wanna Give You Up.avi'?" , with no passwords or public keys being known or exchanged apart from the secret shared by Alice's player and the server.
UPDATE II
If the goal is to save encryption resources, the encryption could be made client side as hinted in the comment:
the file is encrypted once, with a purpose-generated private key.
the private key is stored inside a binary (we must assume it to be unhackable).
the user has to supply his public key for the decryption to work
the program can verify the public key from a repository (or, alternately, the user can supply the public key to the server, which will generate and send the binary file for download)
the program then runs both the decryption and reencryption
the user is left with a file encrypted with his public key, that he alone can decrypt.
UPDATE III
In order for the cleartext file to never be exposed (i.e., it does not matter whether the algorithm gets leaked), you could devise the following scheme. Keep in mind that I'm not a cryptographer and there could be all sort of side channels left uncovered.
You prepare a conversion table that maps each 16-bit word into another 16-bit word. This is a flavour of symmetrical encryption, even if you use two reciprocal matrices for encoding and decoding. Each matrix holds all possible 16-bit words, which means 65536 two-byte values, and is therefore 128 KB in size.
You encrypt the file, once, with the encryption matrix. Without the decryption matrix, the file is unusable.
The user has to send you his public key.
You prepare a transmogrification matrix by encrypting each word with that key, and use the decryption value as an index.
So, for example, say the first word of the cleartext file is A18B. In the encryption matrix, after the scramble, the A18B-th position will contain say 701C, and the decryption matrix, therefore, in the 701Cth position, will hold a18b.
The user has a file starting with 701c... which is of no use.
The user sends you his public key and you run 65536 encryptions on all words from 0000 to ffff. You then determine that the encryption of a18b is 791c. You prepare a re-encoding matrix that has 791c in the 701cth position.
You then send the user this matrix, which has 128K bytes, where the 701cth position is 791c.
The user runs the transmogrification, which is very fast, and is left with a file starting with 791c (as the 701c became 791c - I mistakenly chose two similar values in my example, that is of no significance). This value, once decrypted with his private key, will yield a18b which is the "readable" value.
The user has now a file that's been encrypted by his public key. The a18b value never appeared anywhere.
All that's left is for the user to decrypt the file using his private key and a code block size of 16 bits. This operation will be run by the client and be quite slow, and it's the reason why usually a large random quick symmetric key is RSA-encoded, and used to symmetrically quickly encrypt the large file, which can be quickly decrypted after the private key has unlocked the symmetric key.
The user cannot send the 128K to anyone, for they're useless without the private key.
(The problem here is still that the user can now decrypt the file with his private key, and send it around, even if it's unwieldy as it's a very large file).
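A rough Python sketch of the table scheme from UPDATE III follows. It makes one big simplifying assumption: the user's public-key encryption of a 16-bit word is replaced by a second random 16-bit permutation (`user_public`, with `user_private` as its inverse), because a real public-key operation would not map 16 bits to 16 bits. All names are invented for the illustration, and the fixed seed is only there to make the demo repeatable.

```python
import random

WORDS = 1 << 16  # every possible 16-bit word


def random_table(rng: random.Random) -> list:
    """A random permutation of all 16-bit words (one 'matrix')."""
    table = list(range(WORDS))
    rng.shuffle(table)
    return table


def invert(table: list) -> list:
    """The reciprocal matrix: inverse[table[w]] == w."""
    inverse = [0] * len(table)
    for index, value in enumerate(table):
        inverse[value] = index
    return inverse


def apply(table: list, words: list) -> list:
    return [table[w] for w in words]


rng = random.Random(2018)           # NOT a secure source of randomness
server_enc = random_table(rng)      # encryption matrix, kept by the server
server_dec = invert(server_enc)     # decryption matrix, never sent out

user_public = random_table(rng)     # stand-in for "encrypt with user's public key"
user_private = invert(user_public)  # stand-in for "decrypt with user's private key"

# Re-encoding matrix: position c holds the user-encryption of the cleartext of c.
transmog = [user_public[server_dec[c]] for c in range(WORDS)]

cleartext = [0xA18B, 0x0000, 0xFFFF, 0x1234]
published = apply(server_enc, cleartext)     # what everybody downloads
personal = apply(transmog, published)        # after the user's re-encoding
recovered = apply(user_private, personal)    # after private-key decryption
```

The cleartext words only reappear in the last step, on the user's side; the server sends out nothing but `published` and `transmog`.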
the data must be decrypted with the client's private key after, without server knowing those client's privateKey
the original file can be decrypted only by a specific client, using their own private key,
There's a commonly used cryptosystem called a hybrid cryptosystem.
The steps are:
The original data are encrypted with a random unique key.
The data encryption key is encrypted with a client's public key (the client's public key needs to be known to the server).
The client needs to use its private key to decrypt the file encryption key and then decrypt the file.
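The three steps can be illustrated with a deliberately toy Python sketch that uses only the standard library. Both building blocks are assumptions chosen for readability and are hopelessly insecure: the "symmetric cipher" is an XOR with a SHA-256 counter keystream, and the "public key" is textbook RSA with the tiny primes 61 and 53. A real system would use a vetted library instead.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 counter keystream (demo only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


# Textbook RSA with p=61, q=53: n=3233, e=17, d=2753. Demo only.
N, E, D = 3233, 17, 2753


def wrap_key(session_key: bytes, public) -> list:
    """Step 2: encrypt the data key with the client's public key (byte by byte)."""
    n, e = public
    return [pow(byte, e, n) for byte in session_key]


def unwrap_key(wrapped: list, private) -> bytes:
    n, d = private
    return bytes(pow(c, d, n) for c in wrapped)


def hybrid_encrypt(plaintext: bytes, public):
    session_key = secrets.token_bytes(16)  # step 1: random unique key
    return wrap_key(session_key, public), keystream_xor(session_key, plaintext)


def hybrid_decrypt(wrapped: list, ciphertext: bytes, private) -> bytes:
    # Step 3: the private key unlocks the data key, which unlocks the data.
    return keystream_xor(unwrap_key(wrapped, private), ciphertext)
```

The large payload is only ever processed by the fast symmetric step; the asymmetric step touches just the 16-byte key.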
You can use any asymmetric cryptography algorithm.
A public and private key pair is used. The public key is used to encrypt data that can only be decrypted with the private key. There are a lot of resources on this, for example the article from InfoSec Institute.
There are several proven asymmetric algorithms such as RSA, DSA (a signature scheme, so not usable for encryption on its own) and Elliptic Curve Cryptography (used by the Ethereum blockchain). There are many Python libraries too.

How do I tell if OpenPGP encryption is symmetric or asymmetric?

Is there a way to tell if things encrypted via the GNU Privacy Guard are symmetric or asymmetric (without decrypting them or already knowing to start with)? How?
Anyway (for those who want to know what I'm doing), I used Python 3.x to program a GUI-based IDE of sorts that can open symmetrically encrypted files (and save them, too). It can open asymmetrically encrypted files (enter the passphrase to use your secret key instead of the passphrase to decrypt a symmetrically encrypted file). However, it doesn't know they're asymmetric and will overwrite them with symmetrically encrypted files if saved. It would be nice to be able to save them asymmetrically, too. My editor uses the gpg command-line program on Linux (no gpg libraries or anything like that).
I could have a checkbox on the password prompt for asymmetric encryption, but I'd rather not make it so it has to be a manual thing for the user.
For my own personal files, I could add some kind of marker to the saved files to distinguish, but I want it to be able to open them correctly even if they weren't created in my IDE.
I know there's a question with a similar title, but the question asked in the body is fundamentally different.
OpenPGP is a hybrid cryptosystem, which means messages (or files) are always encrypted symmetrically using a so-called session key. The session key again is encrypted using asymmetric encryption (using a public key) or symmetric encryption again (using a string to key function).
This has technical reasons (asymmetric cryptography is very slow for large amounts of data), but also practical ones: by encrypting the small session key multiple times (once for each recipient), you can also have multiple recipients with different keys and even mix asymmetric (public key) and symmetric (password based) encryption in a single OpenPGP message.
Each of those encrypted copies of the session key forms an OpenPGP packet, either a packet with tag 1 (Public-Key Encrypted Session Key Packet) or a packet with tag 3 (Symmetric-Key Encrypted Session Key Packet). The packets in an OpenPGP message can easily be decomposed using pgpdump. Here is an example using GnuPG to create an OpenPGP message encrypted both for my own key and symmetrically for the passphrase foo:
$ echo foo | gpg --recipient a4ff2279 --symmetric --passphrase foo --encrypt | pgpdump
Old: Public-Key Encrypted Session Key Packet(tag 1)(524 bytes)
    New version(3)
    Key ID - 0xCC73B287A4388025
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA m^e mod n(4096 bits) - ...
        -> m = sym alg(1 byte) + checksum(2 bytes) + PKCS-1 block type 02
Old: Symmetric-Key Encrypted Session Key Packet(tag 3)(46 bytes)
    New version(4)
    Sym alg - AES with 128-bit key(sym 7)
    Iterated and salted string-to-key(s2k 3):
        Hash alg - SHA512(hash 10)
        Salt - 0c a6 e6 1d d2 f4 9a 50
        Count - 102400(coded count 105)
    Encrypted session key
        -> sym alg(1 bytes) + session key
New: Symmetrically Encrypted and MDC Packet(tag 18)(63 bytes)
    Ver 1
    Encrypted data [sym alg is specified in sym-key encrypted session key]
        (plain text + MDC SHA1(20 bytes))
Each of the first two packets forms a key to open the encrypted string in the Symmetrically Encrypted and MDC Packet.
This also already explains how to analyze how a message was encrypted: look through the packets, looking for either tag 1 or 3 packets, indicating asymmetric or symmetric encryption (and be aware both might exist). You seem to be very lucky, and the Python GnuPG module already brings a ListPackets class, so you neither have to interface pgpdump nor write your own OpenPGP parser.
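If you would rather not depend on an external tool, walking the packet headers yourself is not much code. The following is a minimal sketch based on the header layout in RFC 4880 (section 4.2), assuming binary (not ASCII-armored) input; the function names are made up for this example. It stops at the first packet with a partial or indeterminate length, which is fine here because the session key packets come before the encrypted data.

```python
def packet_tags(data: bytes) -> list:
    """Return the tags of the top-level packets in a binary OpenPGP message."""
    tags, pos = [], 0
    while pos < len(data):
        first = data[pos]
        if not first & 0x80:
            raise ValueError("not an OpenPGP packet header (armored input?)")
        pos += 1
        if first & 0x40:                      # new-format header
            tag = first & 0x3F
            o1 = data[pos]
            if o1 < 192:                      # one-octet length
                length, pos = o1, pos + 1
            elif o1 < 224:                    # two-octet length
                length = ((o1 - 192) << 8) + data[pos + 1] + 192
                pos += 2
            elif o1 == 255:                   # five-octet length
                length = int.from_bytes(data[pos + 1:pos + 5], "big")
                pos += 5
            else:                             # partial body length: stop here
                tags.append(tag)
                break
        else:                                 # old-format header
            tag = (first >> 2) & 0x0F
            length_type = first & 0x03
            if length_type == 3:              # indeterminate length: stop here
                tags.append(tag)
                break
            size = (1, 2, 4)[length_type]
            length = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        tags.append(tag)
        pos += length
    return tags


def encryption_kinds(data: bytes) -> set:
    """Tag 1 means public-key recipients, tag 3 means passphrase; both may occur."""
    tags = packet_tags(data)
    kinds = set()
    if 1 in tags:
        kinds.add("public-key")
    if 3 in tags:
        kinds.add("passphrase")
    return kinds
```

An editor could call `encryption_kinds()` on the file contents when opening it and remember the result for saving.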
Since you are using Linux commands, I think you can try the "file" utility to check the header and tell whether the encryption is symmetric or asymmetric.
The output would be something like this (Tested in Ubuntu 14.04):
Command: file symm_encrypted.txt.gpg
Output: GPG symmetrically encrypted data (CAST5 cipher)
Command: file asymm_encrypted.txt.gpg
Output: GPG encrypted data

using counter instead of salt for hashing

I'm developing my own protocol for secure message exchange.
Each message contains the following fields: HMAC, time, salt, and the message itself. The HMAC is computed over all other fields using a known secret key.
The protocol should protect against replay attacks. Over large time intervals, the "time" field protects against replay (both sides should have synchronized clocks). But for protection against replay over short time intervals (the clocks are not too accurate), I'm planning to replace the "salt" field with a counter that increases every time a new message is sent. The receiving party will throw away messages with a counter value less than or equal to the previous message's counter.
What am I doing wrong?
The initial counter value can be different (I can use the party identifier as the initial value), but it will be known to the attacker (the party identifier is transmitted in unencrypted form).
(https://security.stackexchange.com/questions/8246/what-is-a-good-enough-salt-for-a-saltedhash)
But can't the attacker precompute rainbow tables for counter+1, counter+2, counter+3... if I do not use a truly random salt?
I'm not certain of your design and requirements, so some of this may be off base; hopefully some of it is also useful.
First, I'm having a little trouble understanding the attack; I'm probably just missing something. Alice sends a message to Bob that includes a counter, a payload, and an HMAC of (counter||payload). Eve intercepts and replays the message. Bob has seen that one, so he throws it away. Eve tries to compute a new message with counter+1, but she is unable to compute the HMAC for this message (since the counter is different), so Bob throws it away. As long as there is a secret available, Eve should never be able to forge a message, and replaying a message does nothing.
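To make that walk-through concrete, here is a small Python sketch of the counter scheme as I understand it. The message layout (32-byte tag, 8-byte counter, payload), the use of HMAC-SHA256 and all names are my own assumptions, not the asker's actual protocol.

```python
import hashlib
import hmac


def seal(key: bytes, counter: int, payload: bytes) -> bytes:
    """Alice: message = HMAC(counter || payload) || counter || payload."""
    body = counter.to_bytes(8, "big") + payload
    return hmac.new(key, body, hashlib.sha256).digest() + body


class Receiver:
    """Bob: accepts a message only if the HMAC verifies and the counter is new."""

    def __init__(self, key: bytes):
        self.key = key
        self.last_counter = -1

    def accept(self, message: bytes):
        tag, body = message[:32], message[32:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None                      # forged or corrupted
        counter = int.from_bytes(body[:8], "big")
        if counter <= self.last_counter:
            return None                      # replayed or stale
        self.last_counter = counter
        return body[8:]
```

Eve knowing the next counter value does not help her: without the key she cannot produce a matching tag, so a predictable counter is not a weakness here the way a predictable salt is for password hashes.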
So what is the "known secret key?" Is this key known to the attacker? (And if it is, then he can trivially forge messages, so the HMAC isn't helpful.) Since you note that you have DH, are you using that to negotiate a key?
Assuming I'm missing the attack, thinking through the rest of your question: If you have a shared secret, why not use that to encrypt the message, or at least the time+counter? By encrypting the time and counter together, a rainbow table should be impractical.
If there is some shared secret, but you don't have the processor available to encrypt, you could still do something like MD5(secret+counter) to prevent an attacker guessing ahead (you must already have MD5 available for your HMAC-MD5).
I have attacked this problem before with no shared secret and no DH. In that case, the embedded device needed a per-device public/private keypair (ideally installed during manufacturing, but it can be computed during first power-on and stored in nonvolatile memory; randomness is hard, one option is to let the server provide a random number service; if you have any piece of unique non-public information on the chip, like a serial number, that can be used to seed your key, too. Worst case, you can use your MAC plus the time plus as much entropy as you can scrounge from the network.)
With a public/private key in place, rather than using HMAC, the device just signs its messages, sending its public key to the server in its first message. The public key becomes the identifier of the device. The nice thing about this approach is that there is no negotiation phase. The device can just start talking, and if the server has never heard of this public key, it creates a new record.
There's a small denial-of-service problem here, because attackers could fill your database with junk. The best solution to that is to generate the keys during manufacturing, and immediately insert the public keys into your database. That's impractical for some contract manufacturers. So you can resort to including a shared secret that the device can use to authenticate itself to the server the first time. That's weak, but probably sufficient for the vast majority of cases.

Should I encrypt the signature?

I know, according to this article that I should Sign the message, then Encrypt the message.
My program operates like so:
Get the bytes of the message
Digitally sign the message, and store the signature in a separate byte array
Encrypt the message
Send the signature, then the encrypted message in a packet
Should I do it like so?
Get the bytes of the message
Digitally sign the message, and concatenate it with the bytes of the message
Encrypt the array containing the message and signature
Send the encrypted data
Appreciate the assistance
Digitally sign the message, and concatenate it with the bytes of the message.
You need to know where one ends and the other starts, but sure. Some APIs just take a key and a message and produce an output of bytes, and then instead of having a separate verify (data)->boolean step, they take a single bunch of bytes and either return the verified message or fail.
So yes, you can send
encrypt(
    concat(
        sign(message, signerPrivateKey), message),
    encryptionKey)
To get a verified message, the receiver has to have received two keys ahead of time: the signer's public key and the decryption key, which is the same as the encryptionKey for symmetric crypto and which must be a guarded secret.
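On the point about knowing where the signature ends and the message starts: one common way is a length prefix. A minimal sketch in Python, where the 4-byte big-endian prefix and the function names are just choices made for this example; the signing and encryption steps themselves are left to whatever library you use.

```python
import struct


def frame(signature: bytes, message: bytes) -> bytes:
    """Concatenate as: 4-byte signature length, signature, message."""
    return struct.pack(">I", len(signature)) + signature + message


def unframe(blob: bytes):
    """Split a framed blob back into (signature, message)."""
    if len(blob) < 4:
        raise ValueError("blob too short to hold a length prefix")
    (sig_len,) = struct.unpack(">I", blob[:4])
    if len(blob) < 4 + sig_len:
        raise ValueError("blob shorter than the announced signature length")
    return blob[4:4 + sig_len], blob[4 + sig_len:]
```

The sender would encrypt `frame(signature, message)`; the receiver decrypts first, then calls `unframe()` and verifies the signature over the message.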
If you want to use asymmetric crypto so you only need to exchange public keys, and your message is not always shorter than a key, typically you generate a one-time use symmetric key and only encrypt that asymmetrically since asymmetric algos are typically more expensive than symmetric ones.
oneTimeUseSymmetricCryptoKey := generateKey()
concat(
    encryptAsymmetric(
        oneTimeUseSymmetricCryptoKey,
        recipientPublicKey),
    encryptSymmetric(
        concat(sign(message, signerPrivateKey), message),
        oneTimeUseSymmetricCryptoKey))
None of this though prevents the message forwarding attack described in the link above. To do that, you need to authenticate the sender, e.g. by choosing a public key to verify the signature AND a key to decrypt based on a sender address which is arrived at independently from the exchange of encrypted bytes.

How to avoid identical messages always looking the same after encryption, and being replayed by an attacker?

I'm looking to authenticate that a particular message is coming from a particular place.
Example: A repeatedly sends the same message to B. Let's say this message is "helloworld" which is encrypted to "asdfqwerty".
How can I ensure that a third party C doesn't learn that B always receives this same encrypted string, and C starts sending "asdfqwerty" to B?
How can I ensure that when B decrypts "asdfqwerty" to "helloworld", it is always receiving this "helloworld" from A?
Thanks for any help.
For the former, you want to use a Mode of Operation for your symmetric cipher that uses an Initialization Vector. The IV ensures that every encrypted message is different, even if it contains the same plaintext.
For the latter, you want to sign your message using the private key of A(lice). If B(ob) has the public key of Alice, he can then verify she really created the message.
Finally, beware of replay attacks, where C(harlie) records a valid message from Alice, and later replays it to Bob. To avoid this, add a nonce and/or a timestamp to your encrypted message (yes, you could make the IV play double-duty as a nonce).
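The effect of a fresh IV can be shown with a toy Python example. The cipher here is an assumption made so that the sketch runs with the standard library only: it XORs the plaintext with an HMAC-SHA256 keystream, and it stands in for a real cipher in a proper mode. It also provides no authentication, so signing (or a MAC) is still needed on top.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter), block after block."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)          # fresh random value per message
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, message: bytes) -> bytes:
    nonce, body = message[:16], message[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

Encrypting "helloworld" twice with the same key gives two different byte strings, yet both decrypt to "helloworld". To also stop replays, Bob would have to remember which nonces (or timestamps) he has already accepted.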
Add random value to the data being encrypted, and whenever it's decrypted, strip it from the original unencrypted data.
You need a decent (cryptographically secure) random number generator. I'm sure Google will help you with that.
C noticing that B receives twice the same encrypted message is an issue called traffic analysis and has historically been a heavy concern (but this was in times which predated public key encryption).
Any decent public encryption system includes some random padding. For instance, for RSA as described in PKCS#1, the encrypted message (of length at most 117 bytes for a 1024-bit RSA key) gets a header with at least eight random (non-zero) bytes, and a few extra data which allows the receiver to unambiguously locate the padding bytes, and see where the "real" data begins. The random bytes will be generated anew every time; hence, if A sends twice the same message to B, the encrypted messages will be different, but B will recover the original message twice.
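The padding layout described above (the PKCS#1 v1.5 encryption padding) is simple enough to sketch in Python. This shows only the padding step, not the RSA operation, and the function names are made up for the example; `k` is the key size in bytes, 128 for a 1024-bit key. Real implementations also have to take care not to leak why unpadding failed, which this sketch ignores.

```python
import secrets


def pad(message: bytes, k: int = 128) -> bytes:
    """Build 0x00 0x02 <random non-zero bytes> 0x00 <message>, k bytes long."""
    if len(message) > k - 11:
        raise ValueError("message too long for this key size")
    filler_len = k - 3 - len(message)        # always at least 8 bytes
    filler = bytes(secrets.randbelow(255) + 1 for _ in range(filler_len))
    return b"\x00\x02" + filler + b"\x00" + message


def unpad(block: bytes) -> bytes:
    """Locate the zero byte that ends the random filler and return the data."""
    if len(block) < 11 or block[:2] != b"\x00\x02":
        raise ValueError("bad padding")
    separator = block.find(b"\x00", 2)
    if separator < 10:                       # missing, or fewer than 8 filler bytes
        raise ValueError("bad padding")
    return block[separator + 1:]
```

Because the filler bytes are never zero, the first zero byte after the header unambiguously marks where the real data begins, even if the data itself contains zero bytes.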
Random padding is required for public key encryption precisely because the public key is public: if encryption was deterministic, then an attacker could "try" potential messages and look for a match (this is exhaustive search on possible messages).
Public key encryption algorithms often have heavy limitations on data size or performance (e.g. with RSA, you have a strict maximum message length, depending on the key size). Thus, it is customary to use a hybrid system: the public key encryption is used to encrypt a symmetric key K (i.e. a bunch of random bytes), and K is used to symmetrically encrypt the data (symmetric encryption is fast and does not have constraints on input message size). In a hybrid system, you generate a new K for every message, so this also gives you the randomness you need to avoid the issue of encrypting several times the same message with a given public key: at the public encryption level, you are actually never encrypting twice the same message (the same key K), even if the data which is symmetrically encrypted with K is the same as in a previous message. This would protect you from traffic analysis even if the public key encryption itself did not include random padding.
When symmetrically encrypting data with a key K, the symmetric encryption should use an "initial value" (IV) which is randomly and uniformly generated; this is integrated in the encryption mode (some modes only need a non-repeating IV without requiring a random uniform generation, but CBC needs random uniform generation). This is a third level of randomness, protecting you against traffic analysis.
When using asymmetric key agreement (static Diffie-Hellman), things are a bit more complex, because a key agreement results in a key K which you do not choose, and which could be the same over and over (between a given sender and receiver). In that situation, protection against traffic analysis relies on the symmetric encryption IV randomness.
Asymmetric encryption protocols, such as OpenPGP, describe how the symmetric encryption, public key encryption and randomness should all be linked together, ironing out the tricky details. You are warmly encouraged not to reinvent your own protocol: it is difficult to design a secure protocol, mostly because one cannot easily test for the presence or absence of any weakness.
You may want to study block cipher modes of operation. However, the modes are designed to work on a data stream that is sent over a reliable channel. If your messages are sent out of order over an unreliable transport (e.g. UDP packets), I don't think you can use it.

Resources