Does a Message authentication code (MAC) ensure authenticity of the key used? - security

I have to protect the confidentiality, integrity and authenticity of a file of records with a password. The number of records can potentially be more than 2^32 and each record can be accessed independently.
One way to implement it is:
Generate a 256-bit random salt and store it in the file header.
Generate a derived key from the password and the salt using PBKDF2 with HMAC-SHA256 from PKCS #5.
For each record generate a 96-bit random initialization vector.
Encrypt each record's content using AES-256 in GCM mode using the derived key, the initialization vector, and (as additional authenticated data) the position of the record in a file.
As a result, each record will store an initialization vector, an encrypted content, and a MAC.
But NIST Special Publication 800-38D, which defines GCM and GMAC, requires the number of records encrypted under one key to be less than 2^32 when the initialization vectors are random, so that they remain unique with high probability.
So I devised another solution: create a key for each record with HMAC-SHA256 using the derived key as a key and the position of the record in a file as a message to be authenticated (salt).
So the question is: do I need to provide the position of the record in the file to the authenticated encryption algorithm as additional authenticated data, since I've already taken care of it when generating the key?
Additionally, do I really need to use initialization vectors at all, since all the records will be encrypted and authenticated using supposedly different keys generated by HMAC-SHA256(PBKDF2(HMAC-SHA256, password, salt, iterationCount, 256), blockAddress)?
I don't know what the size of the file will be, so I presume it can be very large.
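The per-record key derivation described above can be sketched with Python's standard library. This is only an illustration under assumptions: the function names, the iteration count and the 8-byte big-endian encoding of the record position are hypothetical choices, not part of the question, and the AES-GCM step itself is left out.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed value; tune for your hardware


def derive_master_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """PBKDF2 with HMAC-SHA256, producing a 256-bit derived key."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


def derive_record_key(master_key: bytes, position: int) -> bytes:
    """Per-record key: HMAC-SHA256 keyed with the derived key over the position."""
    message = position.to_bytes(8, "big")  # assumed fixed-width encoding
    return hmac.new(master_key, message, hashlib.sha256).digest()


salt = secrets.token_bytes(32)  # 256-bit random salt stored in the file header
master_key = derive_master_key("correct horse battery staple", salt)
key_0 = derive_record_key(master_key, 0)
key_1 = derive_record_key(master_key, 1)
```

Each record position yields its own 256-bit key, and the same password, salt and position always reproduce the same key.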

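If I understood you correctly (bit of a disclaimer, sorry), then you should be fine without adding the position of the record in the file as additional authenticated data.
No, you don't need a random IV if you only use a (session) key once. Using an IV consisting of zeros would be enough (deterministic construction, using one device and a counter set to zero, if we keep to the NIST nomenclature). This only holds as long as each derived key encrypts exactly one message; re-encrypting a modified record under the same key and IV would reuse the (key, IV) pair, which GCM does not tolerate.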

Related

Best practice for decrypting text file using SALT, SEED, SecretKey in android

I am new to security issues. I am trying to implement some basic (limited) protection of intellectual property in an android app by encrypting text assets. To do this, I have implemented the SimpleCrypto class described by S.Saurel here http://www.ssaurel.com/blog/how-to-use-cryptography-in-android-applications-2/. I plan to encrypt the data using a separate program and include the encrypted data as assets files, then decrypt them for use in the app. My question is what is the best practice for dealing with the SALT string and SEED string that are needed to decrypt the files? Should they be generated at run time or stored somehow? Should they be included in the crypto class, or generated elsewhere in the app and passed to the crypto class?
Thanks in advance!
In this implementation "seed" is what you would think of as "password", and since you're not going to ask the user to provide a password, you can hardcode it, or store it in a file, or request it from a server at runtime, or whatever else. Be aware that a smart attacker will most likely be able to get at this password and use it to generate their own decryption key for your ciphertexts.
Salt is a non-secret value that acts as an initialization vector for encryption. Best practice would dictate that you generate a random salt per cleartext, then provide the ciphertext and the unencrypted salt to the client. The IV is typically dependent on the block size of the cipher used, and in your example there you're generating a 256-bit key, so you should generate a random 256-bit (32-byte) salt per cleartext and ship it with the ciphertext. You could do something as simple as producing a final string which is:
[2 bytes indicating length of salt][salt][ciphertext]
Then from that you can get your salt and ciphertext for decryption.
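A minimal sketch of that layout in Python follows. The helper names pack_blob and unpack_blob are hypothetical, the length prefix is assumed to be big-endian, and the ciphertext is treated as opaque bytes produced elsewhere.

```python
import struct
from typing import Tuple


def pack_blob(salt: bytes, ciphertext: bytes) -> bytes:
    """Lay out [2-byte big-endian salt length][salt][ciphertext]."""
    return struct.pack(">H", len(salt)) + salt + ciphertext


def unpack_blob(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a packed blob back into (salt, ciphertext)."""
    (salt_len,) = struct.unpack_from(">H", blob, 0)
    if len(blob) < 2 + salt_len:
        raise ValueError("blob is shorter than its declared salt length")
    salt = blob[2:2 + salt_len]
    ciphertext = blob[2 + salt_len:]
    return salt, ciphertext
```

The decrypting side calls unpack_blob first, then feeds the recovered salt into the same key derivation that was used for encryption.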

Why is encrypting necessary for security after hashing in the UMAC (Universal Message Authentication Code) algorithm?

On the Wikipedia for UMAC, https://en.wikipedia.org/wiki/UMAC, it states:
The resulting digest or fingerprint is then encrypted to hide the
identity of the hash function used.
Further, in this paper, http://web.cs.ucdavis.edu/~rogaway/papers/umac-full.pdf, it states:
A message is authenticated by hashing it with the shared hash function
and then encrypting the resulting hash (using the encryption key).
My question is, if the set of hash functions H is large enough, and the number of hash buckets |B| is large enough, why do we need to encrypt -- isn't the secret hash secure enough?
For example, take the worst case scenario where every client is sending the same, short content, like "x". If we hash to 32 bytes and our hash depends on a secret 32 byte hash key, and the hashes exhibit uniform properties, how could an attacker ever hope to learn the secret hash key of any individual client, even without encryption?
And, if the attacker doesn't learn the key, how could the attacker ever hope to maliciously alter the message contents?
Thank you!
I don't know much about UMAC specifically but:
Having a rainbow table for a specific hash function defeats any encryption you have put on the message so instead of having a single attack surface, you now have two
As computational powers increase with time you will be more and more likely to figure out the plaintext of the message so PFS (https://en.wikipedia.org/wiki/Forward_secrecy) will never be possible if you leave the MAC unencrypted.
On top of all this, if you can figure out a single plaintext message from a MAC value, you exponentially get closer to decrypting the rest of the message by getting some information about the PRNG, context of the other data, IV, etc.

Using hash of password to encrypt private key

I am developing a web application in which I need to encrypt sensitive information. My plan is to use AES-256 where the private key is encrypted by a hash of the user's password. I need to store the hash of the password for authentication purposes, but it obviously can't be the same one used to encrypt the private key. My current thought is to use bcrypt to generate a key to be used to encrypt the private key. For authentication, my thought was to simply hash the password using bcrypt and then hash that hash using bcrypt again and then store that hash in the database. Since it is one way, there shouldn't be any way to use the stored hash to decrypt the private key? Are there any obvious security issues with doing this that I may be missing?
My other thought was to use two different encryption algorithms, such as using a bcrypt hash to encrypt the private key and storing a SHA-2 hash for authentication purposes.
Thanks for your help.
Don't use a hash to encrypt the AES key. A salted hash should be used only for authentication. When the user logs in, you have their password. Use this password to encrypt (first time) and decrypt (later) the AES key, and then forget the password.
I'd recommend using PBKDF2 in this situation. You can use two different salts, one that would derive the symmetric key and the other one would derive the password hash to be stored. The salt should contain a deterministic part distinguishing the two different use cases, as well as a random part - cf. this comment:
Otherwise, the salt should contain data that explicitly
distinguishes between different operations and different key
lengths, in addition to a random part that is at least eight
octets long, and this data should be checked or regenerated by
the party receiving the salt. For instance, the salt could have
an additional non-random octet that specifies the purpose of
the derived key. Alternatively, it could be the encoding of a
structure that specifies detailed information about the derived
key, such as the encryption or authentication technique and a
sequence number among the different keys derived from the
password. The particular format of the additional data is left
to the application.
A plain, salted SHA-2 probably isn't enough because of the poor entropy of typical passwords, as was mentioned in the comments.
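As a rough sketch of the PBKDF2 suggestion, the following derives a stored verifier and a key-wrapping key from one password using salts that carry a purpose octet plus a random part. The purpose values, the iteration count, the sample password and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

PURPOSE_AUTH = b"\x01"        # assumed tag for the stored password verifier
PURPOSE_ENCRYPTION = b"\x02"  # assumed tag for the key that wraps the AES key
ITERATIONS = 100_000          # assumed value; tune for your hardware


def derive(password: str, purpose: bytes, random_part: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256 with a salt made of a purpose octet and a random part."""
    salt = purpose + random_part
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


# Both random parts are stored per user; neither derived value reveals the other.
random_auth = secrets.token_bytes(16)
random_enc = secrets.token_bytes(16)
stored_verifier = derive("hunter2", PURPOSE_AUTH, random_auth)
wrapping_key = derive("hunter2", PURPOSE_ENCRYPTION, random_enc)


def check_password(candidate: str) -> bool:
    """Recompute the verifier and compare it in constant time."""
    return hmac.compare_digest(
        derive(candidate, PURPOSE_AUTH, random_auth), stored_verifier
    )
```

Only stored_verifier and the two random parts go into the database; wrapping_key is recomputed at login and discarded afterwards.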
A suggestion: use two different salts. When the user enters their password concatenate it with a random salt and hash it for the password recognition routine. Use a different salt and hash it again for the AES encryption key. Depending on how secure you want things, you can stretch the hashing as well.
Effectively you have:
storedPasswordCheck = SHA256(password + salt1);
AESkey = SHA256(password + salt2);
The AES keys are not stored of course, but are regenerated from the user's password as needed. You will need two separate salts, best at least 128 bits each, stored for each user.

Hashing vs. Signing Binaries

If you want to ensure that a file is valid (untampered and came from the correct/expected source), there are two things you can do: hashing, and signing.
For the purposes of my question, hashing means providing a hash of the file (along with the file) to download. The client downloads the hash and the file, re-computes the hash, and verifies that it matches the downloaded hash; this "proves" that the file was untampered with.
Signing means using a public-private encryption scheme, where you sign the binary with a public key, and the client uses the private key to verify that you really did sign the key.
Based on these definitions, I don't really see what is the main benefit of signing something vs. hashing something. Both of them are supposed to prove that the file was not tampered with.
The only thing I can see is that with hashing, a compromised server could mean someone also compromising the hash and replacing a malicious binary with a matching key; but with a public-private scheme, as long as the private key remains private, there is no way to forge a malicious file.
Or am I missing something?
The difference is as you said: a hacker can update a hash to match the tampered-with file, but cannot generate a valid signature.
Signing is done with the private key, verification with the public key. You said the opposite above. It's also typically done on the hash of the file and not the file itself for practical reasons.
Signing verifies two things -- that the file has not been tampered with, and the identity of the signer. If you can be sure that entity giving you the hash is absolutely the entity that is supposed to be giving you the file, then the two are equivalent. Signing and certificate authorities are a way of ensuring that trust relationship.
A hash is a fixed-length output (of characters, or bits if represented in binary) computed by a function from the specific data passed into it.
A hash is irreversible. The hash value for a particular piece of data is always the same. If a single bit in the data changes, almost the entire hash changes. The process of calculating a hash is called hashing.
In asymmetric cryptography, each communicating party has its own key pair (a private key and a public key). As the names suggest, the private key is kept secret and the public key is shared. The keys are related such that what is encrypted with one can only be decrypted with the other key of the pair.
To achieve non-repudiation (the sender cannot deny having sent the message) and to authenticate a specific entity to those receiving data, the public key is shared with the receivers so that they can decrypt anything the sender has encrypted with the corresponding private key, which only the sender holds.
But note that confidentiality is weak in this example, as the sender does not know and cannot guarantee that the public key has not reached an unknown party.
When a private key is used to encrypt a hash, the result is a signature, and the process is called signing. This achieves authenticity (the data comes from the genuine sender, since the private key was used), and integrity is also assured, because the receiver verifies the hash upon receiving the data: he decrypts the signed hash using the corresponding public key given to him by the sender, computes the same hash on his own, and checks that the two match.
The big difference between providing some data (an executable, a document, whatever) along with a hash and providing the same data with a signature is that with the hash, both the data and the hash value come from the same place. So, if someone can compromise one of them, he can probably also compromise the other.
For example, if I can hack into your web server, I can easily replace your executable with my own version and replace the hash value with the correct hash for my executable.
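A small Python sketch makes this concrete. The byte strings stand in for real files and the helper names are hypothetical; the point is only that a hash check accepts whatever pair the server hands out.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_download(data: bytes, published_hash: str) -> bool:
    """Client-side check: recompute the hash and compare with the published one."""
    return hmac.compare_digest(sha256_hex(data), published_hash)


# What the honest server publishes.
original = b"legitimate executable bytes"
published = sha256_hex(original)

# What an attacker who controls the server can publish instead.
tampered = b"malicious executable bytes"
forged = sha256_hex(tampered)

honest_ok = verify_download(original, published)   # genuine pair passes
mismatch_ok = verify_download(tampered, published)  # swapped file alone fails
attack_ok = verify_download(tampered, forged)       # swapped file and hash pass
```

The check detects a swapped file only when the published hash is left alone; once the attacker replaces both, the client has no way to tell.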
If you sign your executable, I can't just produce another signature for a different executable and replace your original signature. The signature verifies both the hash of the original data (the data has not changed since being signed) and that the signature was generated by your private key.
Of course, this all assumes that people who receive your signed executable have received your public key in some trusted way. If I can trick people into using my public key instead of yours, then I can hack into your website and replace your signed executable with my own. That's why we have certificate authorities.
This page has a high level overview of digital signatures.

Is It Possible To Reconstruct a Cryptographic Hash's Key

We would like to cryptographically (SHA-256) hash a secret value in our database. Since we want to use this as a way to lookup individual records in our database, we cannot use a different random salt for each encrypted value.
My question is: given unlimited access to our database, and given that the attacker knows at least one secret value and hashed value pair, is it possible for the attacker to reverse engineer the cryptographic key? IE, would the attacker then be able to reverse all hashes and determine all secret values?
It seems like this defeats the entire purpose of a cryptographic hash if it is the case, so perhaps I'm missing something.
There are no published "first pre-image" attacks against SHA-256. Without such an attack to open a shortcut, it is impossible for an attacker to recover a secret value from its SHA-256 hash.
However, the mention of a "secret key" might indicate some confusion about hashes. Hash algorithms don't use a key. So, if an attacker were able to attack one "secret-value–hash-value" pair, he wouldn't learn a "key" that would enable him to easily invert the rest of the hash values.
When a hash is attacked successfully, it is usually because the original message was from a small space. For example, most passwords are chosen from a relatively short list of real words, perhaps with some simple permutations. So, rather than systematically testing every possible password, the attacker starts with an ordered list of the few billion most common passwords. To avoid this, it's important to choose the "secret value" randomly from a large space.
There are message authentication algorithms that hash a secret key together with some data. These algorithms are used to protect the integrity of the message against tampering. But they don't help thwart pre-image attacks.
In short, yes.
No, a SHA hash is not reversible (at least not easily). When you hash something and later need to check it, you need to reconstruct the hash. This is usually done with a private key (the salt) and a public key.
For example, say I'm trying to control access based on my user id. I would hash my user id and the salt, using MD5 for example. My user id is "12345" and the salt is "abcde".
So I will hash the string "12345_abcde", which returns a hash of "7b322f78afeeb81ad92873b776558368".
Now I will pass to the validating application the hash and the public key, "12345".
The validating application knows the salt, so it hashes the same value, "12345_abcde", which in turn generates the exact same hash. It then compares the hash it computed with the one passed in, and they match. If I had somehow modified the public key without modifying the hash, a different hash would have been generated, resulting in a mismatch.
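The flow in this example can be sketched in Python as follows. The function names are hypothetical, and MD5 with an underscore separator is used only to mirror the example, not as a recommendation.

```python
import hashlib
import hmac

SALT = "abcde"  # shared secret known to both the issuer and the validator


def make_token(user_id: str, salt: str = SALT) -> str:
    """Hash the user id joined to the secret salt with an underscore."""
    return hashlib.md5(f"{user_id}_{salt}".encode("utf-8")).hexdigest()


def validate(user_id: str, token: str, salt: str = SALT) -> bool:
    """Recompute the hash from the public user id and compare in constant time."""
    return hmac.compare_digest(make_token(user_id, salt), token)


token = make_token("12345")
```

For real use, the standard keyed construction for this purpose is HMAC (for instance hmac.new(key, message, hashlib.sha256)) rather than hashing a concatenated string.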
Yes it's possible, but not in this lifetime.
Modern brute-force attacks using multiple GPUs could crack this in short order. I recommend you follow the guidelines for password storage for this application. Here are the current password storage guidelines from OWASP. Currently, they recommend a long salt value, and PBKDF2 with 64,000 iterations, which iteratively stretches the key and makes it computationally expensive to brute force the input values. Note that this will also make it computationally expensive for you to generate your key values, but the idea is that you will be generating keys far less frequently than an attacker would have to. That said, your design requires many more key derivations than a typical password storage/challenge application, so your design may be fatally flawed. Also keep in mind that the iteration count should be doubled every 18 months so that the computational cost keeps pace with Moore's Law. This means that your system would need some way of allowing you to rehash these values (possibly by combining hash techniques). Over time, you will find that old hash functions are broken by cryptanalysts, and you need to be ready to update your algorithms. For example, a single iteration of MD5 or SHA-1 used to be sufficient, but it is not anymore. There are other key-derivation functions that could also suit your needs that wouldn't require PBKDF2 (such as bcrypt or scrypt), but PBKDF2 is currently the industry standard that has received the most scrutiny. One could argue that bcrypt or scrypt would also be suitable, but this is yet another reason why a pluggable scheme should be used to allow you to upgrade hash functions over time.
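One way to picture such a pluggable scheme is a stored record that names its own algorithm and iteration count, sketched below in Python. The record format, the scheme label and the function names are assumptions. The sketch also uses a random salt per value, as in ordinary password storage, so it does not by itself address the lookup requirement from the question.

```python
import hashlib
import hmac
import secrets

SCHEME = "pbkdf2_sha256"     # assumed label for the current algorithm
CURRENT_ITERATIONS = 64_000  # figure quoted above; meant to be raised over time


def hash_secret(secret: str, iterations: int = CURRENT_ITERATIONS) -> str:
    """Return a self-describing record: scheme$iterations$salt$digest."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, iterations)
    return f"{SCHEME}${iterations}${salt.hex()}${digest.hex()}"


def verify_secret(secret: str, record: str) -> bool:
    """Recompute the digest using the parameters stored in the record."""
    scheme, iterations, salt_hex, digest_hex = record.split("$")
    if scheme != SCHEME:
        raise ValueError(f"unsupported scheme: {scheme}")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", secret.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


def needs_rehash(record: str) -> bool:
    """True when the record uses an old scheme or too few iterations."""
    scheme, iterations, _, _ = record.split("$")
    return scheme != SCHEME or int(iterations) < CURRENT_ITERATIONS
```

After a successful verify_secret, a record for which needs_rehash returns true can be replaced with a fresh hash_secret result while the secret is still in memory.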
