I'm building a client (C#)/server (PHP) application in which the server signs a message with RSA and the client verifies the message. So far so good.
I'm using HTTPS/JSON as the protocol for communication, and therefore I have to encode binary data in the message as base64.
My question here is: What's best practice? To sign the base64 encoded data or the original binary data?
Are there any positives or negatives for doing one of them?
Kind regards.
In the end, it doesn't matter: the signature ensures the integrity of whatever bytes you sign, encoded or not. Just make sure both sides agree: if the server signs the raw binary data but the client verifies the base64 string (or vice versa), verification will fail even though nothing was tampered with.
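A minimal sketch of the "sign the binary, use base64 only for transport" approach. HMAC stands in for RSA here purely to keep the example dependency-free (in the question's setup the PHP server would sign with its RSA private key); the key and field names are made up:

```python
import base64
import hashlib
import hmac

# Hypothetical shared key; in the question's setup this would be an RSA
# key pair, with the server holding the private key.
KEY = b"demo-key"

message = bytes([0x00, 0xFF, 0x10, 0x20])  # arbitrary binary payload

# Server side: sign the ORIGINAL binary data, then base64-encode both
# the payload and the signature for the JSON transport.
signature = hmac.new(KEY, message, hashlib.sha256).digest()
payload = {
    "data": base64.b64encode(message).decode("ascii"),
    "sig": base64.b64encode(signature).decode("ascii"),
}

# Client side: decode back to the exact bytes that were signed, then verify.
received = base64.b64decode(payload["data"])
expected = hmac.new(KEY, received, hashlib.sha256).digest()
assert hmac.compare_digest(expected, base64.b64decode(payload["sig"]))
```

The important invariant is that `received` is byte-for-byte identical to `message`; base64 is then just a transport detail.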
What is the benefit of using Authenticated Encryption schemes like GCM or EAX compared to simpler methods like CRC or hash functions like SHA?
As far as I understand, these methods basically add a Message Authentication Code (MAC) to the message so it can be validated. But the same would seem possible if a CRC or hash value were calculated and appended to the plaintext (a MIC). That way it would also not be possible to tamper with the message, because the hash would no longer match.
The linked Wikipedia article says MICs don't take a key into account, but I do not understand why this is a problem.
There's conceptually no difference between an Authenticated Encryption scheme (GCM, CCM, EAX, etc.) and providing an HMAC over the encrypted message; the AE algorithms simply constrain and standardize the byte layout (while tending to require less space and time than serially encrypting and then HMACing).
If you compute an unkeyed digest over the plaintext before encrypting, you do have a tamper-evident scheme. But computing the digest over the plaintext has two disadvantages compared to computing it over the ciphertext:
If you send the same thing twice, you send the same hash, even if your ciphertext is different (due to a different IV or key).
If the ciphertext has been tampered with in an attempt to confuse the decryption routine, you will still process it before discovering the tampering.
Of course, the disadvantage of digesting afterwards is that in the unkeyed approach anyone who tampers with the ciphertext can simply recompute the SHA-256 digest of the ciphertext after the tampering. The solution is not to use an unkeyed digest, but a keyed one, like HMAC.
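The difference can be shown in a few lines. This is an illustrative sketch (the key and "ciphertext" bytes are made up); it contrasts an attacker recomputing an unkeyed digest with the same attacker failing against a keyed HMAC:

```python
import hashlib
import hmac

key = b"shared-secret"                 # known only to the legitimate parties
ciphertext = bytearray(b"some encrypted bytes")

# Unkeyed digest: anyone can recompute it after tampering.
digest = hashlib.sha256(ciphertext).digest()
ciphertext[0] ^= 0xFF                               # attacker flips a byte...
forged_digest = hashlib.sha256(ciphertext).digest() # ...and "fixes" the digest
# A receiver comparing a recomputed digest to the forged one sees a match:
assert hashlib.sha256(ciphertext).digest() == forged_digest

# Keyed digest (HMAC): recomputing requires the key the attacker lacks.
mac = hmac.new(key, ciphertext, hashlib.sha256).digest()
ciphertext[0] ^= 0xFF                               # attacker flips a byte back
assert not hmac.compare_digest(
    hmac.new(key, ciphertext, hashlib.sha256).digest(), mac
)  # tampering detected: the MAC no longer matches
```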
The options are:
Encrypt-only: Subject to tampering. Assuming a new IV is used for each message (and ECB mode isn't used), it does not reveal when a message repeats.
Digest-only: Subject to tampering. Message is plain-text.
MAC-only: Not subject to tampering. Message is plain-text.
Digest-then-Encrypt (DtE - digest is itself encrypted): Ciphertext corruption attacks are possible. Tampering with the plaintext is possible, if it is known. Message reuse is not revealed.
Digest-and-Encrypt (D&E/E&D - digest plaintext, send digest as plaintext): Ciphertext corruption attacks are possible. Tampering with the plaintext is possible, if it is known. Message reuse is revealed via the digest not changing.
Encrypt-then-Digest (EtD): This guards against transmission errors, but since any attacker can just recompute the digest this is the same as encrypt-only.
MAC-then-Encrypt (MtE): Same strengths as DtE, but even if the attacker knew the original plaintext and what they had tampered it to they cannot alter the MAC (unless the plaintext is being altered to an already-known message+MAC).
MAC-and-Encrypt (M&E/E&M): Like D&E this reveals message reuse. Like MtE it is still vulnerable to ciphertext corruption and a very small set of tampering attacks.
Encrypt-then-MAC (EtM): Any attempt to alter the ciphertext is discovered by the MAC failing to validate, this can be done before processing the ciphertext. Message reuse is not revealed, since the MAC was over the ciphertext.
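A minimal sketch of the EtM pattern, using a toy XOR keystream in place of a real cipher and HMAC-SHA-256 as the MAC (all names here are illustrative, not a production design):

```python
import hashlib
import hmac
import os

def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Toy XOR stream 'cipher' for illustration only; not a real cipher."""
    stream = hashlib.sha256(key + nonce).digest()
    while len(stream) < len(plaintext):       # extend the keystream as needed
        stream += hashlib.sha256(stream).digest()
    return bytes(p ^ s for p, s in zip(plaintext, stream))

def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt-then-MAC: MAC covers nonce + ciphertext, never the plaintext."""
    nonce = os.urandom(16)
    ct = toy_encrypt(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    # Verify the MAC BEFORE touching the ciphertext.
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC check failed: message was tampered with")
    return toy_encrypt(enc_key, nonce, ct)    # XOR stream: decrypt == encrypt
```

Note the ordering: the receiver rejects a tampered blob before any decryption work happens, which is exactly the property the list above credits to EtM.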
EtM is the safest approach in the general case. One of the things that an AE algorithm solves is that it takes the question of how to combine a MAC and cipher out of the developer's hands and puts it into the hands of a cryptographer.
I am new to security issues. I am trying to implement some basic (limited) protection of intellectual property in an Android app by encrypting text assets. To do this, I have implemented the SimpleCrypto class described by S. Saurel here: http://www.ssaurel.com/blog/how-to-use-cryptography-in-android-applications-2/. I plan to encrypt the data using a separate program and include the encrypted data as asset files, then decrypt them for use in the app.
My question is: what is the best practice for dealing with the SALT string and SEED string that are needed to decrypt the files? Should they be generated at run time or stored somehow? Should they be included in the crypto class, or generated elsewhere in the app and passed to the crypto class?
Thanks in advance!
In this implementation "seed" is what you would think of as "password", and since you're not going to ask the user to provide a password, you can hardcode it, or store it in a file, or request it from a server at runtime, or whatever else. Be aware that a smart attacker will most likely be able to get at this password and use it to generate their own decryption key for your ciphertexts.
Salt is a non-secret value that is mixed into the key derivation (it plays a role similar to an initialization vector). Best practice would dictate that you generate a random salt per cleartext, then provide the ciphertext and the unencrypted salt to the client. The salt size is typically tied to the parameters of your cipher and key derivation; in your example you're generating a 256-bit key, so you could generate a random 256-bit (32-byte) salt per cleartext and ship it with the ciphertext. You could do something as simple as producing a final string which is:
[2 bytes indicating length of salt][salt][ciphertext]
Then from that you can recover the salt and ciphertext for decryption.
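The length-prefixed layout above can be sketched like this (function names and the placeholder ciphertext are made up for illustration):

```python
import os
import struct

def pack(salt: bytes, ciphertext: bytes) -> bytes:
    """[2 bytes: big-endian salt length][salt][ciphertext]"""
    return struct.pack(">H", len(salt)) + salt + ciphertext

def unpack(blob: bytes) -> tuple[bytes, bytes]:
    """Split a packed blob back into (salt, ciphertext)."""
    (salt_len,) = struct.unpack_from(">H", blob)
    return blob[2:2 + salt_len], blob[2 + salt_len:]

salt = os.urandom(32)                     # random per-cleartext salt
blob = pack(salt, b"encrypted-bytes")     # placeholder ciphertext
recovered_salt, recovered_ct = unpack(blob)
assert recovered_salt == salt and recovered_ct == b"encrypted-bytes"
```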
On the Wikipedia for UMAC, https://en.wikipedia.org/wiki/UMAC, it states:
The resulting digest or fingerprint is then encrypted to hide the identity of the hash function used.
Further, in this paper, http://web.cs.ucdavis.edu/~rogaway/papers/umac-full.pdf, it states:
A message is authenticated by hashing it with the shared hash function and then encrypting the resulting hash (using the encryption key).
My question is, if the set of hash functions H is large enough, and the number of hash buckets |B| is large enough, why do we need to encrypt -- isn't the secret hash secure enough?
For example, take the worst case scenario where every client is sending the same, short content, like "x". If we hash to 32 bytes and our hash depends on a secret 32 byte hash key, and the hashes exhibit uniform properties, how could an attacker ever hope to learn the secret hash key of any individual client, even without encryption?
And, if the attacker doesn't learn the key, how could the attacker ever hope to maliciously alter the message contents?
Thank you!
I don't know much about UMAC specifically but:
Having a rainbow table for a specific hash function defeats any encryption you have put on the message, so instead of having a single attack surface, you now have two.
As computational power increases with time, it becomes more and more likely that the plaintext of the message can be recovered, so perfect forward secrecy (https://en.wikipedia.org/wiki/Forward_secrecy) will never be possible if you leave the MAC unencrypted.
On top of all this, if you can figure out a single plaintext message from a MAC value, you get significantly closer to decrypting the rest of the messages by learning something about the PRNG, the context of the other data, the IV, etc.
Should I first encode my encrypted data with base64 to store it in a database? It will be larger when I encode it, but is it better or faster to decrypt?
(Especially for Rijndael and RSA encrypted data.)
No, you don't need to encode your ciphertext using base64. You should encode it only if your database field accepts only text values. Speed is normally not an issue, as cryptographic algorithms are much slower than conversion to or from base64.
If I understood you correctly, you do not need to encode with base64 before storing it in the database. If you use a proper database, you can just store the raw bytes (for example in a BLOB column). Base64 is good for transferring binary data over text-only media, but it also enlarges the data by roughly 33%. It would not make decryption faster either, because you would then need two steps, decode and then decrypt, instead of decrypting immediately.
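The ~33% overhead is easy to see: base64 emits 4 output bytes for every 3 input bytes. A quick sketch (the payload is just random stand-in ciphertext):

```python
import base64
import os

raw = os.urandom(3000)          # stand-in for AES/Rijndael or RSA ciphertext
encoded = base64.b64encode(raw)

# 4 output bytes per 3 input bytes: 3000 -> 4000, roughly 33% larger.
assert len(encoded) == 4000
# And reading it back costs an extra decode step before decryption.
assert base64.b64decode(encoded) == raw
```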
To build a secure system, can we assume that "encryption guarantees integrity" is true before starting secure programming?
Is this well-established for both symmetric and public-key encryption? If not, what are the vulnerabilities? Can you give an example?
No. This is easy to see if you consider the one-time pad, a simple (theoretically) perfectly secure system.
If you change any bit of the output, a bit of the clear text will change, and the recipient has no way to detect this.
This is an obvious case, but the same conclusion applies to most encryption systems. They only provide for confidentiality, not integrity.
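The one-time pad's malleability takes only a few lines to demonstrate (toy example; the message and attack target are made up):

```python
import os

plaintext = b"pay Bob 100"
pad = os.urandom(len(plaintext))                        # one-time pad key
ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))

# Attacker flips bits in the ciphertext WITHOUT knowing the pad:
# XOR-ing position 8 with ('1' ^ '9') turns "100" into "900".
tampered = bytearray(ciphertext)
tampered[8] ^= ord("1") ^ ord("9")

decrypted = bytes(c ^ k for c, k in zip(tampered, pad))
assert decrypted == b"pay Bob 900"    # decrypts fine, change is undetectable
```

Confidentiality is perfect here, yet the recipient has no way to notice the amount was altered; that is exactly the gap integrity mechanisms fill.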
Thus, you may want to add a digital signature. Interestingly, when using public-key cryptography, it is not sufficient to sign then encrypt (SE), or to encrypt then sign (ES). Both of these are vulnerable to subtle attacks such as surreptitious forwarding. You have to either sign-encrypt-sign or encrypt-sign-encrypt to have a generally secure solution. This paper explains why in detail.
If you use SE, the recipient can decrypt the message, then re-encrypt it to a different recipient. This then deceives the new recipient about the sender's intended recipient.
If you use ES, an eavesdropper can remove the signature and add their own. Thus, even though they can't read the message, they can take credit for it, pretending to be the original sender.
In short the answer is no. Message Integrity and Secrecy are different, and require different tools.
Let's take a simple coin flip into consideration; in this case we are betting on the result. The result is a simple boolean, and I encrypt it using a stream cipher like RC4, which yields 1 encrypted bit, and I email it to you. You don't have the key, and I ask you to email me back the answer.
A few attacks can happen in this scenario.
1) An attacker could modify the bit in transit: if it was a 0 there is a 50% chance it will become a 1, and the contrary is true. This is because RC4 produces a PRNG stream that is XOR'ed with the plaintext to produce the ciphertext, similar to a one-time pad.
2) Another possibility is that I could provide you with a different key to make sure your answer is wrong. This is easy to brute force: I just keep trying keys until I get the proper bit flip.
A solution is to use a block cipher in CMAC mode. A CMAC is a message authentication code similar to an HMAC, but it uses a block cipher instead of a message digest function. The secret key (K) is the same key that you use to encrypt the message. This adds n+1 blocks to the ciphertext. In my scenario this prevents both attacks 1 and 2. An attacker cannot flip a single bit, because the plaintext is padded: even if the message only takes up 1 bit, I must transmit a minimum of 1 block when using a block cipher. The additional authentication block prevents me from changing the key, and it also provides integrity against anyone attempting to modify the ciphertext in transit (although this would be very difficult to do in practice, the additional layer of security is useful).
WPA2 uses AES-CMAC for these reasons.
If data integrity is a specific concern to you, you should use a cryptographic hash function, combined with an encryption algorithm.
But it really does come down to using the correct tool for the job. Some encryption algorithms may provide some level of checksum validation built-in, others may not.