Best practice for decrypting text file using SALT, SEED, SecretKey in android - security

I am new to security issues. I am trying to implement some basic (limited) protection of intellectual property in an Android app by encrypting text assets. To do this, I have implemented the SimpleCrypto class described by S.Saurel here http://www.ssaurel.com/blog/how-to-use-cryptography-in-android-applications-2/. I plan to encrypt the data using a separate program and include the encrypted data as asset files, then decrypt them for use in the app. My question is: what is the best practice for dealing with the SALT string and SEED string that are needed to decrypt the files? Should they be generated at run time or stored somehow? Should they be included in the crypto class, or generated elsewhere in the app and passed to the crypto class?
Thanks in advance!

In this implementation "seed" is what you would think of as "password", and since you're not going to ask the user to provide a password, you can hardcode it, or store it in a file, or request it from a server at runtime, or whatever else. Be aware that a smart attacker will most likely be able to get at this password and use it to generate their own decryption key for your ciphertexts.
Salt is a non-secret value that plays a role similar to an initialization vector. Best practice is to generate a random salt per cleartext, then provide the ciphertext and the unencrypted salt to the client. An IV's size typically depends on the block size of the cipher, but here the salt feeds key generation, and your example generates a 256-bit key, so you should generate a random 256-bit (32-byte) salt per cleartext and ship it with the ciphertext. You could do something as simple as producing a final string which is:
[2 bytes indicating length of salt][salt][ciphertext]
Then from that you can get your salt and ciphertext for decryption.
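A minimal sketch of that layout, in Python for brevity (the same framing carries over to Java on Android). The helper names `pack_blob` and `unpack_blob` are made up for illustration, the length prefix is assumed to be big-endian, and the ciphertext is a placeholder rather than real cipher output:

```python
import os
import struct

def pack_blob(salt: bytes, ciphertext: bytes) -> bytes:
    """Build [2-byte big-endian salt length][salt][ciphertext]."""
    if len(salt) > 0xFFFF:
        raise ValueError("salt too long for a 2-byte length prefix")
    return struct.pack(">H", len(salt)) + salt + ciphertext

def unpack_blob(blob: bytes):
    """Split a blob produced by pack_blob back into (salt, ciphertext)."""
    if len(blob) < 2:
        raise ValueError("blob too short")
    (salt_len,) = struct.unpack(">H", blob[:2])
    if len(blob) < 2 + salt_len:
        raise ValueError("blob truncated")
    return blob[2:2 + salt_len], blob[2 + salt_len:]

salt = os.urandom(32)       # 256-bit random salt, fresh per cleartext
ciphertext = b"\x00" * 48   # placeholder standing in for real cipher output
blob = pack_blob(salt, ciphertext)
```

The salt travels in the clear; only the seed (password) has to be kept away from the attacker.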

Related

How to do 1-of-X or Y-of-X public key based encrypt/ decrypt in NodeJs?

I would like to be able to encrypt data using public keys, and decrypt the encrypted data using private keys.
Encryption essentially needs to accept inputs:
Clear data to be encrypted
A list of several public keys
The minimum number of private keys corresponding to those public keys that are needed to decrypt the encrypted data
How can this be done in NodeJs?
Scenarios
By way of concrete scenarios, where there are 5 users (A - E) with crypto key pairs in the system.
A 1-of-X scenario:
encrypted = crypto_encrypt(clearText, [A.publicKey, B.publicKey], 1) (1-of-2)
decrypted = crypto_decrypt(encrypted, [A.privateKey])
success: decrypted === clearText
because A.publicKey was used in encryption
decrypted = crypto_decrypt(encrypted, [C.privateKey])
failure: unable to decrypt
because C.publicKey was not used in encryption
A Y-of-X scenario:
encrypted = crypto_encrypt(clearText, [A.publicKey, B.publicKey, C.publicKey], 2) (2-of-3)
decrypted = crypto_decrypt(encrypted, [A.privateKey, C.privateKey])
success: decrypted === clearText
because both A.publicKey and C.publicKey were used in encryption
decrypted = crypto_decrypt(encrypted, [C.privateKey, E.privateKey])
failure: unable to decrypt
because while C.publicKey was used in encryption, E.publicKey was not
Ideally...
At minimum I need to be able to support the 1-of-X scenario, but if Y-of-X is also possible, that would be better
What the actual key pairs are is not so important here, could be RSA, could be any of the elliptic curves. If the method supports a number of different ones, and allows one to pick, that would be better
Preferably not tied to the use of any particular toolset or framework
PGP can do this.
Specifically for node, openpgpjs has a section in the README - https://github.com/openpgpjs/openpgpjs#encrypt-and-decrypt-string-data-with-pgp-keys - which could be condensed into:
const encryptedText = await openpgp.encrypt({ message: clearText, publicKeys });
const decryptedText = await openpgp.decrypt({ message: encryptedText, privateKeys });
However:
For the number of keys required to decrypt, it only supports the 1-of-many scenario, not the more general some-of-many scenario you'd ideally want.
It supports both RSA and many elliptic-curve keys, but the key format is designed for use by PGP, as the name of the library implies (so it is specific to the PGP toolchain).
As noted by Luke Joshua Park in the comments, this sounds like a textbook use case for a secret sharing scheme. Specifically, I would recommend that you:
Generate a random AES (or other symmetric cipher) key. Make sure to use a cryptographically secure RNG (such as crypto.randomBytes()) for this, since an attacker who can guess this key can also break the entire scheme!
Encrypt the data with this key, using an authenticated encryption mode such as AES-SIV (as provided e.g. by miscreant).
Split the AES key into multiple shares using Shamir's secret sharing scheme with the desired reconstruction threshold. (Some JS implementations I found with a quick Google search include secrets.js, jsss and ThresholdJS.)
Encrypt each share using a different user's public key.
Send each user their encrypted share and a copy of the AES-encrypted data.
Disclaimer: I have not reviewed the security or correctness of any of the APIs or libraries linked above. The cryptographic techniques they claim to use appear to be sound and suitable for this task, but I cannot guarantee that they have been implemented safely and correctly. Caveat emptor.
To decrypt the data, each user can first decrypt their share of the AES key using their private key, and a sufficient number of the decrypted shares can then be combined (using the same implementation of Shamir's secret sharing as used to create them) to reconstruct the original AES key, which can then be used to decrypt (and verify the integrity of) the data.
Note that Shamir's secret sharing implicitly assumes that the users who combine their shares to reconstruct the secret will trust each other and not lie about their shares or otherwise misbehave. If that's not necessarily true, there are various ways for a malicious user to trick the others — perhaps most simply by waiting for everyone else to reveal their share to them and then refusing to reveal their own share to the others. In general, preventing such attacks is all but impossible without the help of some kind of a mutually trusted party.
At the very least, though, using an encryption mode like AES-SIV with built-in authentication should ensure that users will detect if the reconstructed AES key is incorrect, since the decryption will then fail. If you want to be extra sure of this, you may wish to also send each of the users a secure cryptographic hash (e.g. SHA-512) of the AES key, so that they can verify its correctness before attempting decryption.
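The optional hash check could look roughly like this in Python. The names `key_fingerprint` and `key_matches` are hypothetical, and the sketch assumes the SHA-512 digest is distributed to each user alongside their encrypted share:

```python
import hashlib
import hmac
import secrets

def key_fingerprint(key: bytes) -> bytes:
    """SHA-512 digest of the key, sent to every user with their share."""
    return hashlib.sha512(key).digest()

def key_matches(candidate: bytes, fingerprint: bytes) -> bool:
    """Constant-time check of a reconstructed key against the fingerprint."""
    return hmac.compare_digest(hashlib.sha512(candidate).digest(), fingerprint)

original_key = secrets.token_bytes(32)   # random 256-bit key
fingerprint = key_fingerprint(original_key)
```

This only makes sense because the key is a full-entropy random value; a hash of a low-entropy secret would be open to guessing.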

How to securely encrypt many similar chunks of data with the same key?

I'm writing an application that will require the following security features: when launching the CLI version, you should pass some key to it. An undetermined number of same-size chunks of data will be generated and need to be stored remotely. This data is sensitive. I want it to be encrypted and accessible only by that one key that was passed to it initially. My question is, which algorithm will suit me? I read about AES but it says that
When you perform an encryption operation you initialize your Encryptor
with this key, then generate a new, unique Initialization Vector for
each record you’re going to encrypt.
which means I'll have to pass a key and an IV, rather than just the key and this IV should be unique for each generated chunk of data (and there is going to be a lot of those).
If the answer is AES, which encryption mode is it?
You can use any modern symmetric algorithm. The amount of data and how to handle your IVs is irrelevant because it applies no matter which symmetric algorithm you pick.
AES-128 is a good choice, as it isn't limited by law in the US and 128 bits is infeasible to brute force. If you aren't in the US, you could use AES-256 if you wanted to, but implementations in Java require additional installations.
You say you are going to generate n many chunks of data (or retrieve, whatever).
You could encrypt them all at once in CBC mode, which keeps AES as a block cipher, and you'll only end up with one IV. You'll need an HMAC here to protect the integrity. This isn't the most modern way, however.
You should use AES in GCM mode as a stream cipher. You'll still have one single IV (nonce) but the ciphertext will also be authenticated.
IVs should be generated randomly and prepended to the ciphertext. You can then retrieve the IV when it is time to decrypt. Remember: IVs aren't secret, they just need to be random!
EDIT: As pointed out below, IVs should be generated using a crypto-secure random number generator. IVs for CTR based modes, like GCM, only need to be unique.
In summary, what you are worried about isn't a problem. One key is fine. More than one IV is fine too, but there are ways to do it with just one. You will have to worry about IVs either way. Don't use ECB mode.
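A small Python sketch of the IV handling described above, covering both a random nonce and a counter-based one that is merely unique. Python's standard library has no AES, so the cipher call itself is left out and the ciphertext is a placeholder; the helper names and the 4-byte prefix plus 8-byte counter layout are assumptions of this sketch:

```python
import secrets
import struct

NONCE_LEN = 12  # 96 bits, the usual GCM nonce size

def random_nonce() -> bytes:
    """Random nonce from a cryptographically secure generator."""
    return secrets.token_bytes(NONCE_LEN)

def counter_nonce(prefix: bytes, counter: int) -> bytes:
    """Deterministic nonce: 4-byte fixed prefix + 8-byte big-endian counter."""
    if len(prefix) != 4:
        raise ValueError("prefix must be 4 bytes")
    return prefix + struct.pack(">Q", counter)

def frame(nonce: bytes, ciphertext: bytes) -> bytes:
    """Prepend the (non-secret) nonce to the ciphertext for storage."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("unexpected nonce length")
    return nonce + ciphertext

def unframe(blob: bytes):
    """Recover (nonce, ciphertext) from a stored blob."""
    if len(blob) < NONCE_LEN:
        raise ValueError("blob too short")
    return blob[:NONCE_LEN], blob[NONCE_LEN:]

prefix = b"\x00\x00\x00\x01"          # hypothetical per-device identifier
nonce_a = counter_nonce(prefix, 0)    # nonce for chunk 0
nonce_b = counter_nonce(prefix, 1)    # nonce for chunk 1
stored = frame(nonce_b, b"placeholder ciphertext")
```

With the counter variant, the counter must never repeat under the same key, for example across restarts of the CLI.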

Why is encrypting necessary for security after hashing in the UMAC (Universal Message Authentication Code) algorithm?

On the Wikipedia page for UMAC, https://en.wikipedia.org/wiki/UMAC, it states:
The resulting digest or fingerprint is then encrypted to hide the
identity of the hash function used.
Further, in this paper, http://web.cs.ucdavis.edu/~rogaway/papers/umac-full.pdf, it states:
A message is authenticated by hashing it with the shared hash function
and then encrypting the resulting hash (using the encryption key).
My question is, if the set of hash functions H is large enough, and the number of hash buckets |B| is large enough, why do we need to encrypt -- isn't the secret hash secure enough?
For example, take the worst case scenario where every client is sending the same, short content, like "x". If we hash to 32 bytes and our hash depends on a secret 32 byte hash key, and the hashes exhibit uniform properties, how could an attacker ever hope to learn the secret hash key of any individual client, even without encryption?
And, if the attacker doesn't learn the key, how could the attacker ever hope to maliciously alter the message contents?
Thank you!
I don't know much about UMAC specifically but:
Having a rainbow table for a specific hash function defeats any encryption you have put on the message, so instead of having a single attack surface, you now have two.
As computational power increases over time, it becomes more and more likely that someone can figure out the plaintext of the message, so PFS (https://en.wikipedia.org/wiki/Forward_secrecy) will never be possible if you leave the MAC unencrypted.
On top of all this, if you can figure out a single plaintext message from a MAC value, you get exponentially closer to decrypting the rest of the message by gaining some information about the PRNG, the context of the other data, the IV, etc.

Using hash of password to encrypt private key

I am developing a web application in which I need to encrypt sensitive information. My plan is to use AES-256 where the private key is encrypted by a hash of the user's password. I need to store the hash of the password for authentication purposes, but it obviously can't be the same one used to encrypt the private key. My current thought is to use bcrypt to generate a key to be used to encrypt the private key. For authentication, my thought was to simply hash the password using bcrypt and then hash that hash using bcrypt again and then store that hash in the database. Since it is one way, there shouldn't be any way to use the stored hash to decrypt the private key? Are there any obvious security issues with doing this that I may be missing?
My other thought was to use two different encryption algorithms, such as using a bcrypt hash to encrypt the private key and storing a SHA-2 hash for authentication purposes.
Thanks for your help.
Don't use a hash to encrypt the AES key. A salted hash should be used only for authentication. When the user logs in, you have their password: use it to encrypt (the first time) and decrypt (later) the AES key, and then forget the password.
I'd recommend using PBKDF2 in this situation. You can use two different salts, one that would derive the symmetric key and the other one would derive the password hash to be stored. The salt should contain a deterministic part distinguishing the two different use cases, as well as a random part - cf. this comment:
Otherwise, the salt should contain data that explicitly
distinguishes between different operations and different key
lengths, in addition to a random part that is at least eight
octets long, and this data should be checked or regenerated by
the party receiving the salt. For instance, the salt could have
an additional non-random octet that specifies the purpose of
the derived key. Alternatively, it could be the encoding of a
structure that specifies detailed information about the derived
key, such as the encryption or authentication technique and a
sequence number among the different keys derived from the
password. The particular format of the additional data is left
to the application.
A plain, salted SHA-2 probably isn't enough because of the poor entropy of typical passwords, as was mentioned in the comments.
A suggestion: use two different salts. When the user enters their password concatenate it with a random salt and hash it for the password recognition routine. Use a different salt and hash it again for the AES encryption key. Depending on how secure you want things, you can stretch the hashing as well.
Effectively you have:
storedPasswordCheck = SHA256(password + salt1);
AESkey = SHA256(password + salt2);
The AES keys are not stored of course, but are regenerated from the user's password as needed. You will need two separate salts, best at least 128 bits each, stored for each user.
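The PBKDF2 suggestion above, with a purpose octet plus a random part in each salt, might be sketched like this in Python. The purpose tags, the iteration count and the function name `derive` are assumptions for illustration, not values from the quoted specification:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000    # assumed work factor; tune for your hardware

PURPOSE_AUTH = b"\x01"  # hypothetical purpose tag: stored password verifier
PURPOSE_ENC = b"\x02"   # hypothetical purpose tag: AES-256 key

def derive(password: str, purpose: bytes, random_salt: bytes,
           iterations: int = ITERATIONS) -> bytes:
    """PBKDF2-HMAC-SHA256 with a salt made of a purpose octet + random part."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), purpose + random_salt,
        iterations, dklen=32,
    )

salt_auth, salt_enc = os.urandom(16), os.urandom(16)          # stored per user
verifier = derive("correct horse", PURPOSE_AUTH, salt_auth)   # stored in the DB
aes_key = derive("correct horse", PURPOSE_ENC, salt_enc)      # never stored
```

At login, recompute the verifier and compare it with `hmac.compare_digest`; the AES key is re-derived on demand and discarded afterwards.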

Does a Message authentication code (MAC) ensure authenticity of the key used?

I have to protect the confidentiality, integrity and authenticity of a file of records with a password. The number of records can potentially be more than 2^32 and each record can be accessed independently.
One way to implement it is
Generate a 256-bit random salt and store it in the file header.
Generate a derived key from the password and the salt using PBKDF2 with HMAC-SHA256 from PKCS #5.
For each record generate a 96-bit random initialization vector.
Encrypt each record's content using AES-256 in GCM mode using the derived key, the initialization vector, and (as additional authenticated data) the position of the record in a file.
As a result, each record will store an initialization vector, an encrypted content, and a MAC.
But the NIST Special Publication SP800-38D defining GCM and GMAC requires the number of records to be less than 2^32 for the initialization vectors to be unique.
So I devised another solution: create a key for each record with HMAC-SHA256 using the derived key as a key and the position of the record in a file as a message to be authenticated (salt).
So the question is: do I need to provide the position of the record in a file to the authenticated encryption algorithm as additional authenticated data, since I've already taken care of it when generating the key?
Additionally, do I really need to use initialization vectors at all, since all the records will be encrypted and authenticated using supposedly different keys generated by HMAC-SHA256(PBKDF2(HMAC-SHA256, password, salt, iterationCount, 256), blockAddress)?
I don't know what the size of the file will be, so I presume it can be very large.
If I understood you correctly (bit of a disclaimer, sorry) then you should be fine without adding the position of the record within the file.
No, you don't need a random IV if you only use a (session) key once. Using an IV consisting of zeros would be enough (deterministic construction, using one device and a counter set to zero, if we keep with the NIST nomenclature).
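For reference, the per-record key derivation from the question can be sketched in Python as below. The function names, the iteration count and the 8-byte big-endian encoding of the record position are assumptions of this sketch. The AES-GCM call itself is omitted because the standard library does not provide it:

```python
import hashlib
import hmac
import os
import struct

def master_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """PBKDF2-HMAC-SHA256 over the password and the file-header salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )

def record_key(master: bytes, position: int) -> bytes:
    """HMAC-SHA256(master, position), position encoded as 8 big-endian bytes."""
    return hmac.new(master, struct.pack(">Q", position), hashlib.sha256).digest()

# Fixed all-zero 96-bit IV; only safe while each derived key encrypts once.
ZERO_IV = bytes(12)

salt = os.urandom(32)                           # 256-bit salt in the file header
master = master_key("example password", salt)
k0, k1 = record_key(master, 0), record_key(master, 1)
```

Note that the "key used once" condition breaks if a record is ever rewritten in place: the same position gives the same key, and reusing a key with the same IV is unsafe in GCM.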
