Encrypting database of secret information: Entirely or in parts? - security

Imagine you have a database full of secret information, e.g. a list of usernames + passwords.
If you want to encrypt this database using an algorithm such as AES-128, how would you encrypt the data?
Encrypt only the secret information fields, e.g. the passwords. Leave the usernames as they are. Output could be: "mike#example.org/AES_ENCRYPTED_PASSWORD;linda#example.org/AES_ENCRYPTED_PASSWORD"
Encrypt the entire database, output would be: "AES_ENCRYPTED_DATA"
The problem I am thinking of: Probably, the data is saved in XML format. So a possible attacker could try random passwords using brute force until he finds an XML-element in the encrypted data. So it's easier to crack than the first approach. Right?
Or is it safe to just save my data temporarily in XML format and then encrypt the whole XML file using AES?

As I understand the OP, the question is more towards "known plaintext" and the size and redundancy of the message to be encrypted.
So:
YES, it is in most cases "easier" to break an encryption if parts of the plaintext are known, like XML-tags.
YES, it may become easier to break an encryption if more encrypted data is available.
BUT: All common off-the-shelf encryption algorithms, that are not yet considered "broken", are pretty immune against both types of attack.
In theory, it should in fact be safer to encrypt only short messages of pretty much random content (such as passwords). If, however, you encrypt many such messages independently, you have to think about initialization vectors (similar to a "salt") and the like, so that identical plaintexts do not produce identical ciphertexts and reveal the very patterns you intend to hide.
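To make the point about initialization vectors concrete, here is a minimal sketch that encrypts each field with a fresh random IV. It assumes AES-128 in GCM mode from the standard JCE provider; the class name `PerFieldEncryption` and the `IV || ciphertext` layout are illustrative choices, not something the answer prescribes.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

final class PerFieldEncryption {
    private static final int IV_BYTES = 12;   // 96-bit IV, the usual size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns IV || ciphertext. The IV is not secret and is stored with the data.
    static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // fresh IV for every single message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decrypt(SecretKey key, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            // Read the IV back from the first IV_BYTES of the blob.
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] a = encrypt(key, "hunter2");
        byte[] b = encrypt(key, "hunter2");
        System.out.println("identical ciphertexts? " + Arrays.equals(a, b));
        System.out.println("decrypted: " + decrypt(key, a));
    }
}
```

Because the IV differs on every call, two users with the same password end up with different stored blobs, which is exactly the pattern hiding the paragraph above asks for.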
Conclusion:
Take a "good" algorithm with a good key/password/... and -if feasible- encrypt your "database" as one big plaintext message.
Or is it safe to just [...] encrypt the whole [database] file using AES?
Yes, that's what I would recommend in principle. But be very careful how and where you store your data "temporarily"; a file in a "temp" dir, for instance, may not be as "temporary" as one is tempted to believe.

If you are using the same passphrase the difficulty would be the same whether this is a bunch of usernames/passwords or sets of XML files.
Only encrypt what you must encrypt. When it comes to passwords, though, encryption is not a good idea, because encrypted passwords can be recovered (which will disclose them to a would-be attacker).
It is better to store a hash and have a mechanism to generate new passwords if a user can't recall their password.

Better to store the passwords as a hash with salt.
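As a rough sketch of "hash with salt", the following uses PBKDF2 from the standard JCE provider. The iteration count, salt length, and the class name `SaltedPasswordHash` are assumptions for illustration; the answer itself does not name an algorithm.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;

final class SaltedPasswordHash {
    private static final int ITERATIONS = 100_000; // assumed work factor, tune for your hardware
    private static final int HASH_BITS = 256;

    // A new random salt per user; it is stored next to the hash and is not secret.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, HASH_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Re-hash the attempt with the stored salt and compare in constant time.
    static boolean verify(char[] attempt, byte[] salt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(attempt, salt), storedHash);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        byte[] stored = hash("correct horse".toCharArray(), salt);
        System.out.println(verify("correct horse".toCharArray(), salt, stored));
        System.out.println(verify("wrong guess".toCharArray(), salt, stored));
    }
}
```

Only the salt and the hash are stored; the password itself never needs to be recoverable.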

Related

Derive salt from password - How (in)secure is it?

I got the following problem:
In a Java application I want to store some configuration data in an encrypted, local file. This file might be used for confidential data, like user credentials.
This file should be accessible by using a password (and only a password).
Now most trustworthy people and reference implementations use random salts. I completely understand that this is a reasonable choice. But if my application terminates and is started again later, the random salt is no longer available. This application is stand-alone, so no additional database could be used as a salt store.
For my software the user shall only type in the password (means: no user name, no salt, no favorite animal or colour).
Now my idea was deriving the salt from the password (e.g. by using the first 16 bytes of SHA-256).
My questions are:
How (in)secure would this implementation be?
What is a common way of encrypting stuff with only a password and would be a better alternative?
What is not the aim of this question:
Where to store salts
Secure algorithms and crypto implementation (of course, I did not implement crypto by myself)
Architectural improvements (nop, I do not want a global database for storing stuff)
First, I strongly recommend against devising novel encryption formats if you can help it. It is very difficult to do them correctly. If you want an encryption format that does what you're describing, see JNCryptor, which is an implementation of the RNCryptor format. The RNCryptor format is designed precisely for this problem, so the spec is a good source of information on how you could create your own if you don't want to use it directly. (I'm the author of RNCryptor.)
See also libsodium. It's a better encryption format than RNCryptor for various technical reasons, but it's a bit harder to install and use correctly. There are several Java bindings for libsodium.
When you say "of course, I did not implement crypto by myself," that's what you're doing. Crypto schemes are more than just the AES code. Deciding how to generate the salts in a novel way is implementing crypto. There are many ways to put together secure primitives (like salts) in simple ways and make them wildly insecure. That's why you want to use something well established.
The key take-away is that you store the salt with the data. I know you said this isn't about storing the salt, but that's how you do this. The simplest way to do this is to just glue the salt onto the start of the cipher text and store that. Then you just read the salt from the header. Similarly, you could put the whole thing in an envelope if that's more convenient. Something as simple as JSON:
{ "salt": "<base64-salt>",
"data": "<base64-data>" }
It's not the most efficient way to store the data, but it's easy, standard, and secure.
Remember, salts are not secrets. It is fine that everyone can read the salt.
OK, enough of how to do it right. Let's get to your actual question.
Your salting proposal is not a salt. It's just a slightly different hashing function. The point of a salt is if the same password is used twice (without intending to be the same password), then they will have different hashes. Your scheme fails that. If I implement the same approach as you do, and I pick the same password as yours, then the hash will be the same. Rainbow tables win.
The way you fix that is with a static salt, not a modified hash function. You should pick a salt that represents your system. I usually like reverse DNS for this, because it leads to uniqueness. For example: "com.example.mygreatapp". Someone else would naturally pick "org.example.ourawesomedb". You also can pick a long random string, but the important thing is uniqueness, so I like reverse DNS. (Random strings tend to make people think the salt is a secret, and the salt is not a secret.)
That's the whole system; just pick some constant salt, unique to your system. (If you had a username, you'd add the username to the salt. This is a standard way to construct a deterministic salt.)
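The difference between the two deterministic salts discussed above can be sketched as follows. The method names and the "/" separator between system identifier and username are assumptions made for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class DeterministicSalts {
    // The questioner's idea: first 16 bytes of SHA-256(password).
    static byte[] saltFromPassword(String password) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            return Arrays.copyOf(digest, 16);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // The answer's idea: a constant unique to the system, plus the username if there is one.
    // The "/" separator is an assumption of this sketch.
    static byte[] staticSalt(String systemId, String username) {
        return (systemId + "/" + username).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same password in two unrelated applications: the password-derived salt is identical.
        System.out.println(Arrays.equals(
                saltFromPassword("letmein"), saltFromPassword("letmein")));
        // The system-specific salts differ even though the password is the same.
        System.out.println(Arrays.equals(
                staticSalt("com.example.mygreatapp", "mike"),
                staticSalt("org.example.ourawesomedb", "mike")));
    }
}
```

A salt that depends only on the password adds nothing an attacker cannot compute in advance, while a system-specific constant at least forces a separate precomputation per system.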
But for file storage, I'd never do it that way.
To encrypt data, one needs a key not a password. There are key-derivation-functions to get a key from a user password.
A salt can be used for password hashing, but it cannot be used for encrypting data. There is a similar concept for encryption though, the random values there are called IV or Nonce and are stored together with the encrypted data.
The best thing you can do is
use a key-derivation-function with a salt, to get a key from the password.
With the resulting key you can encrypt the data.
In this case the salt can be stored inside the encrypted data container (the IV is already there), so there is no need for a global database.
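The steps above can be sketched as a single container that holds salt, IV, and ciphertext, so only the password is needed to open the file. This assumes PBKDF2 as the key-derivation function and AES-GCM for encryption; the layout `salt || iv || ciphertext`, the iteration count, and the class name are choices made for this example only. For production use, an established format such as the ones mentioned earlier is preferable.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

final class PasswordFileCrypto {
    private static final int SALT_BYTES = 16;
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final SecureRandom RANDOM = new SecureRandom();

    // Key-derivation function: password + salt -> 128-bit AES key.
    private static SecretKey deriveKey(char[] password, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 128);
        byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(keyBytes, "AES");
    }

    // Output layout: salt || iv || ciphertext (including the GCM tag).
    static byte[] encrypt(char[] password, byte[] plaintext) {
        try {
            byte[] salt = new byte[SALT_BYTES];
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(salt);
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(salt.length + iv.length + ct.length)
                    .put(salt).put(iv).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads salt and IV back from the container; a wrong password fails the GCM tag check.
    static byte[] decrypt(char[] password, byte[] container) {
        try {
            byte[] salt = Arrays.copyOfRange(container, 0, SALT_BYTES);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt),
                    new GCMParameterSpec(TAG_BITS, container, SALT_BYTES, IV_BYTES));
            int offset = SALT_BYTES + IV_BYTES;
            return cipher.doFinal(container, offset, container.length - offset);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("wrong password or corrupted file", e);
        }
    }

    public static void main(String[] args) {
        byte[] file = encrypt("s3cret".toCharArray(),
                "user=mike".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt("s3cret".toCharArray(), file),
                StandardCharsets.UTF_8));
    }
}
```

The application can terminate and restart freely: everything except the password lives in the file itself.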
To answer your original question: Deriving a salt from the password negates the whole purpose of the salt; it just becomes a more complex hash function.

Encrypting a string

Is there a way to encrypt a string so that the effect is not reversible? For example, if you run some algorithm 100 times to encrypt a message, you can run it 100 times in reverse and get the original back. Is there a technology or method that eliminates that possibility?
There are two broad categories you should look into, depending on your needs:
Cryptographic Hash Functions
Cryptographic hash functions produce fixed-width values based on an arbitrarily long input, in such a way that even very minor changes in the input result in significantly different output. As a rule, they are irreversible (though flaws have been found in some algorithms). This is a good choice if you do not need to be able to recover the value of the string yourself. For example, good username/password verification systems store a hash of the password rather than the password itself, and authenticate by comparing that hash to the hash of the password provided by the user. This way, even if the username/password database is compromised, user passwords are not exposed.
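A small sketch of both properties (fixed width, and large output changes for small input changes), using SHA-256 from the JDK's `MessageDigest`. The class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashDemo {
    static String sha256Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex characters per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One character differs, yet the digests look unrelated.
        System.out.println(sha256Hex("password1"));
        System.out.println(sha256Hex("password2"));
        // Input length does not matter: the output is always 64 hex characters.
        System.out.println(sha256Hex("a much longer input string, repeated ".repeat(100)).length());
    }
}
```

`String.repeat` requires Java 11 or later; on older versions a loop would serve the same purpose.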
Public-Key Cryptography
In public-key cryptography, a sender uses the intended recipient's "public" key to encrypt a message, and the recipient uses their "private" key to decrypt it. The message cannot be decrypted by the same key that encrypted it, so in that sense the algorithm is not strictly "reversible" (splitting hairs, I know). TLS, SSL, and PGP are all based on this technique, to name a few examples. This is probably your best option if you are transmitting data between two known parties.
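For illustration, a minimal round trip with RSA and OAEP padding from the standard JCE provider. The 2048-bit key size and the class name are assumptions of this sketch; real protocols such as TLS or PGP wrap considerably more machinery around this primitive.

```java
import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;

final class PublicKeyDemo {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone holding the public key can encrypt.
    static byte[] encrypt(PublicKey recipient, String message) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, recipient);
            return cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the holder of the private key can decrypt.
    static String decrypt(PrivateKey own, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, own);
            return new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        byte[] ct = encrypt(pair.getPublic(), "meet at noon");
        System.out.println(decrypt(pair.getPrivate(), ct));
    }
}
```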

Keeping Encrypted Strings Safe with Multiple Encrypts

A system I have been working on for a while requires DPA, and I asked a question about keeping the data passcodes safe. I have since then come up with an idea to fix that: store the password that decrypts the database's data in the database itself, but encrypted with the validated user's password (which is stored as an MD5 hash) after a different type of hashing.
The question is: does encrypting the password multiple times with different keys (at least 20 characters long, with possible extension) make it considerably easier to decrypt without prior knowledge or information on the password?
No, in general a good cipher should have the property that you cannot retrieve data even if you know the plaintext. Having the data encrypted should not have much influence, given a good cipher and a big enough key space.
First off, MD5 is a hash function, not an encryption algorithm, and it is no longer considered secure. See http://www.kb.cert.org/vuls/id/836068 for details.
Secondly, the encryption key for the data should not be stored in the database itself. It should be stored separately. That way there are at least two things that have to be obtained (the database file and the key) to decrypt the data. If the key is stored in the database itself, it probably wouldn't take long to find it once someone has the database file.
Find a separate method for storing the key. It should either be coded into the application or stored in a file that is obfuscated in some way.

Using asymmetric encryption to secure passwords

Due to our customer's demands, user passwords must be kept in some "readable" form in order to allow accounts to be converted at a later date. Unfortunately, just saving hash values and comparing them on authentication is not an option here. Storing plain passwords in the database is not an option either of course, but using an encryption scheme like AES might be one. But in that case, the key to decrypt passwords would have to be stored on the system handling authentication and I'm not quite comfortable with that.
Hoping to get "best of both worlds", my implementation is now using RSA asymmetric encryption to secure the passwords. Passwords are salted and encrypted using the public key. I disabled any additional, internal salting or padding mechanisms. The encrypted password will be the same every time, just like an MD5 or SHA1 hashed password would be. This way, the authentication system needs only the public key. The private key is not required.
The private key is printed out, sealed and stored offline in the company's safe right after it is created. But when the accounts need to be converted later, it will allow access to the passwords.
Before we deploy this solution, I'd like to hear your opinion on this scheme. Any flaws in design? Any serious drawbacks compared to the symmetric encryption? Anything else we are missing?
Thank you very much in advance!
--
Update:
In response to Jack's arguments below, I'd like to add the relevant implementation details for our RSA-based "hashing" function:
Security.addProvider(new org.bouncycastle.jce.provider.BouncyCastleProvider());
Cipher rsa = Cipher.getInstance("RSA/None/NoPadding");
rsa.init(Cipher.ENCRYPT_MODE, publicKey);
byte[] cryptRaw = rsa.doFinal(saltedPassword.getBytes());
Having quickly skimmed over the paper mentioned by Jack, I think I somewhat understand the importance of preprocessing such as OAEP. Would it be alright to extend my original question and ask if there is a way to apply the needed preprocessing and still have the function return the same output every time for each input, just as a regular hashing function would? I would accept an answer to that "bonus question" here. (Or should I make that a separate question on SOF?)
--
Update 2:
I'm having a hard time accepting one of the present answers because I feel that none really does answer my question. But I no longer expect any more answers to come, so I'll accept the one that I feel is most constructive.
I'm adding this as another answer because instead of answering the question asked (as I did in the first response) this is a workaround / alternative suggestion.
Simply put:
Use hashes BUT, whenever a user changes their password, also use your public key as follows:
Generate a random symmetric key and use it to encrypt the timestamp, user identifier, and new password.
The timestamp is to ensure you don't mess up later when trying to find the current / most up-to-date password.
Username so that you know which account you're dealing with.
Password because it is a requirement.
Store the encrypted text.
Encrypt the symmetric key using your public key.
Store the public key encrypted symmetric key with the encrypted text.
Destroy the in-memory plaintext symmetric key, leaving only the public key encrypted key.
When you need to 'convert' the accounts using the current password, you use the private key and go through the password change records. For each one:
Using the private key, decrypt the symmetric key.
Using the symmetric key, decrypt the record.
If you have a record for this user already, compare timestamps, and keep the password that is most recent (discarding the older).
Lather, rinse, repeat.
(Frankly I'm probably overdoing things by encrypting the timestamp and not leaving it plaintext, but I'm paranoid and I have a thing for timestamps. Don't get me started.)
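The record format described above can be sketched as a hybrid scheme: a one-time AES key encrypts the record, and the RSA public key encrypts that AES key. The newline-separated record layout, the `Entry` holder, and the choice of AES-GCM with RSA-OAEP are assumptions of this example; the answer does not specify algorithms.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;

final class PasswordChangeLog {
    private static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private static final String AES = "AES/GCM/NoPadding";

    /** One stored record: the RSA-encrypted AES key plus IV and ciphertext. */
    static final class Entry {
        final byte[] wrappedKey, iv, ciphertext;
        Entry(byte[] wrappedKey, byte[] iv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs on the online system at every password change; needs only the public key.
    static Entry seal(PublicKey pub, long timestamp, String user, String password) {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            SecretKey aesKey = gen.generateKey(); // random one-time symmetric key
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance(AES);
            aes.init(Cipher.ENCRYPT_MODE, aesKey, new GCMParameterSpec(128, iv));
            String record = timestamp + "\n" + user + "\n" + password;
            byte[] ct = aes.doFinal(record.getBytes(StandardCharsets.UTF_8));
            Cipher rsa = Cipher.getInstance(RSA);
            rsa.init(Cipher.ENCRYPT_MODE, pub);
            // Only the wrapped key is kept; the plaintext AES key goes out of scope here.
            return new Entry(rsa.doFinal(aesKey.getEncoded()), iv, ct);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs offline during conversion; needs the private key from the safe.
    static String open(PrivateKey priv, Entry e) {
        try {
            Cipher rsa = Cipher.getInstance(RSA);
            rsa.init(Cipher.DECRYPT_MODE, priv);
            SecretKey aesKey = new SecretKeySpec(rsa.doFinal(e.wrappedKey), "AES");
            Cipher aes = Cipher.getInstance(AES);
            aes.init(Cipher.DECRYPT_MODE, aesKey, new GCMParameterSpec(128, e.iv));
            return new String(aes.doFinal(e.ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        Entry entry = seal(pair.getPublic(), 1700000000L, "mike", "s3cret");
        System.out.println(open(pair.getPrivate(), entry));
    }
}
```

During conversion, the timestamp on the first line of each opened record decides which entry per user is the most recent.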
Since you only use the public key when changing passwords, speed isn't critical. Also, you don't have to keep the records / files / data where the plaintext password is encrypted on the server the user uses for authentication. This data can be archived or otherwise moved off regularly, as they aren't required for normal operations (that's what the hash is for).
There is not enough information in the question to give any reasonable answer. Anyway, since you disable padding there is a good chance that one of the attacks described in the paper "Why Textbook ElGamal and RSA Encryption are Insecure" by D. Boneh, A. Joux, and P. Nguyen is applicable.
That is just a wild guess of course. Your proposal could be susceptible to a number of other attacks.
In terms of answering your specific question, my main concern would have been management of the private key but given it's well and truly not accessible via any computer system breach, you're pretty well covered on that front.
I'd still question the logic of not using hashes though - this sounds like a classic YAGNI. A hashing process is deterministic so even if you decided to migrate systems in the future, so long as you can still use the same algorithm, you'll get the same result. Personally, I'd pick a strong hash algorithm, use a cryptographically strong, unique salt on each account and be done with it.
It seems safe enough in terms of what is online but have you given full consideration to the offline storage. How easy will it be for people within your company to get access to the private key? How would you know if someone within your company had accessed the private key? How easy would it be for the private key to be destroyed (e.g. is the safe fireproof/waterproof, will the printed key become illegible over time etc).
You need to look at things such as split knowledge, dual control, tamper-evident envelopes etc. As a minimum I think you need to print out two strings of data which, when XOR'd together, create the private key, and then keep one in your office and one in your customer's office.
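The two printed strings can be sketched as XOR shares. A plain bitwise OR could not be undone, so XOR is assumed here as the intended operation; the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

final class SplitKnowledge {
    // Splits a secret into two shares; either share alone is indistinguishable from random bytes.
    static byte[][] split(byte[] secret) {
        byte[] shareA = new byte[secret.length];
        new SecureRandom().nextBytes(shareA);  // random pad, same length as the secret
        byte[] shareB = xor(secret, shareA);   // secret XOR pad
        return new byte[][] { shareA, shareB };
    }

    // Both shares are required to rebuild the secret.
    static byte[] combine(byte[] shareA, byte[] shareB) {
        return xor(shareA, shareB);
    }

    private static byte[] xor(byte[] x, byte[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("shares must have equal length");
        }
        byte[] out = new byte[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = (byte) (x[i] ^ y[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] key = "stand-in for the encoded private key".getBytes(StandardCharsets.UTF_8);
        byte[][] shares = split(key);
        System.out.println(Arrays.equals(combine(shares[0], shares[1]), key));
    }
}
```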
One serious drawback I've not seen mentioned is the speed.
Symmetric encryption is generally much, much faster than asymmetric. That's normally fine because most people account for that in their designs (SSL, for example, only uses asymmetric encryption to share the symmetric key and to check certificates). You're going to be doing asymmetric (slow) for every login, instead of cryptographic hashing (quite fast) or symmetric encryption (pretty snappy). I don't know that it will impact performance, but it could.
As a point of comparison: on my machine AES symmetric encryption (aes-128 cbc) yields up to 188255kB/s. That's a lot of passwords. On the same machine, the peak performance for signatures per second (probably the closest approximation to your intended operation) using DSA with a 512 bit key (no longer used to sign SSL keys) is 8916.2 operations per second. That difference is (roughly) a factor of a thousand assuming the signatures were using MD5 sized checksums. Three orders of magnitude.
This direct comparison is probably not applicable directly to your situation, but my intention was to give you an idea of the comparative algorithmic complexity.
If you have cryptographic algorithms you would prefer to use or compare and you'd like to benchmark them on your system, I suggest the 'openssl speed' command for systems that have openssl builds.
You can also probably mitigate this concern with dedicated hardware designed to accelerate public key cryptographic operations.

Difference between Hashing a Password and Encrypting it

The current top-voted answer to this question states:
Another one that's not so much a security issue, although it is security-related, is complete and abject failure to grok the difference between hashing a password and encrypting it. Most commonly found in code where the programmer is trying to provide unsafe "Remind me of my password" functionality.
What exactly is this difference? I was always under the impression that hashing was a form of encryption. What is the unsafe functionality the poster is referring to?
Hashing is a one-way function (well, a mapping). It's irreversible: you apply the secure hash algorithm and you cannot get the original string back. The most you can do is to generate what's called "a collision", that is, find a different string that produces the same hash. Cryptographically secure hash algorithms are designed to make finding collisions computationally infeasible. You can attack a secure hash by the use of a rainbow table, which you can counteract by adding a salt to the password before hashing it and storing the salt alongside the hash.
Encrypting is a proper (two way) function. It's reversible, you can decrypt the mangled string to get original string if you have the key.
The unsafe functionality it's referring to is that if you encrypt the passwords, your application has the key stored somewhere and an attacker who gets access to your database (and/or code) can get the original passwords by getting both the key and the encrypted text, whereas with a hash it's impossible.
People usually say that if a cracker owns your database or your code he doesn't need a password, thus the difference is moot. This is naïve, because you still have the duty to protect your users' passwords, mainly because most of them do use the same password over and over again, exposing them to a greater risk by leaking their passwords.
Hashing is a one-way function, meaning that once you hash a password it is very difficult to get the original password back from the hash. Encryption is a two-way function, where it's much easier to get the original text back from the encrypted text.
Plain hashing is easily defeated using a dictionary attack, where an attacker just pre-hashes every word in a dictionary (or every combination of characters up to a certain length), then uses this new dictionary to look up hashed passwords. Using a unique random salt for each hashed password stored makes it much more difficult for an attacker to use this method. They would basically need to create a new unique dictionary for every salt value that you use, slowing down their attack terribly.
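A toy version of the dictionary attack, and of how a per-password salt defeats the precomputed table. Plain SHA-256 is used only to keep the lookup easy to see; a real system would use a deliberately slow function such as PBKDF2. The three-word dictionary and all names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DictionaryAttackDemo {
    // Hex of SHA-256(salt || password); pass an empty salt for plain hashing.
    static String sha256Hex(byte[] salt, String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt);
            byte[] digest = md.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // The attacker's precomputed table: unsalted hash -> dictionary word.
    static Map<String, String> precompute(List<String> words) {
        Map<String, String> table = new HashMap<>();
        for (String word : words) {
            table.put(sha256Hex(new byte[0], word), word);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> table = precompute(Arrays.asList("123456", "password", "letmein"));

        // Unsalted: a stolen hash is reversed with one lookup.
        String stolenUnsalted = sha256Hex(new byte[0], "letmein");
        System.out.println("unsalted lookup: " + table.get(stolenUnsalted));

        // Salted: the same password no longer appears in the table.
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        String stolenSalted = sha256Hex(salt, "letmein");
        System.out.println("salted lookup:   " + table.get(stolenSalted));
    }
}
```

To attack the salted hash, the table would have to be rebuilt for that specific salt, and again for every other user's salt.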
It's unsafe to store passwords using an encryption algorithm because if it's easier for the user or the administrator to get the original password back from the encrypted text, it's also easier for an attacker to do the same.
If the password is encrypted, it always remains a hidden secret from which someone holding the key can extract the plaintext password. When the password is hashed, however, you can relax, as there is hardly any method of recovering the password from the hash value.
Extracted from Encrypted vs Hashed Passwords - Which is better?
Is encryption good?
Plain text passwords can be encrypted using symmetric encryption algorithms like DES, AES, or any other algorithm and stored inside the database. At authentication (confirming the identity with user name and password), the application decrypts the encrypted password stored in the database and compares it with the user-provided password for equality. With this type of password handling, even if someone gets access to the database tables, the passwords are not simply reusable. However, there is bad news in this approach as well. If someone somehow obtains the cryptographic algorithm along with the key used by your application, he/she will be able to view all the user passwords stored in your database by decrypting them. "This is the best option I got", a software developer may scream, but is there a better way?
Cryptographic hash function (one-way-only)
Yes, there is; maybe you have missed the point here. Did you notice that there is no requirement to decrypt and compare? Suppose there is a one-way-only conversion where the password can be converted into some converted-word, but the reverse operation (generating the password from the converted-word) is impossible. Now even if someone gets access to the database, there is no way for the passwords to be reproduced or extracted from the converted-words. With this approach, there is hardly any way that someone could learn your users' top secret passwords, and this protects users who use the same password across multiple applications. What algorithms can be used for this approach?
I've always thought that Encryption can be converted both ways, in a way that the end value can bring you to original value and with Hashing you'll not be able to revert from the end result to the original value.
Hashing algorithms are usually cryptographic in nature, but the principal difference is that encryption is reversible through decryption, and hashing is not.
An encryption function typically takes input and produces encrypted output that is the same, or slightly larger size.
A hashing function takes input and produces a typically smaller output, typically of a fixed size as well.
While it isn't possible to take a hashed result and "dehash" it to get back the original input, you can typically brute-force your way to something that produces the same hash.
In other words, if an authentication scheme takes a password, hashes it, and compares it to a hashed version of the required password, you might not need to know the original password, only its hash, and you can brute-force your way to something that will match, even if it's a different password.
Hashing functions are typically created to minimize the chance of collisions and make it hard to just calculate something that will produce the same hash as something else.
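To see why output size matters for this kind of brute force, the following toy keeps only the first 16 bits of SHA-256 and searches for a different string with the same truncated value. The 16-bit truncation and the "guess-N" candidates are assumptions chosen so the search finishes quickly; against the full 256-bit output the same loop would be hopeless.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class TruncatedCollision {
    // First 16 bits of SHA-256(s), as an int in the range 0..65535.
    static int tag16(String s) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-256")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            return ((d[0] & 0xff) << 8) | (d[1] & 0xff);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Tries candidates until one hashes to the same 16-bit tag as the original.
    static String findMatch(String original) {
        int target = tag16(original);
        for (int i = 0; i < Integer.MAX_VALUE; i++) {
            String candidate = "guess-" + i;
            if (!candidate.equals(original) && tag16(candidate) == target) {
                return candidate;
            }
        }
        throw new IllegalStateException("no match found");
    }

    public static void main(String[] args) {
        String password = "correct horse";
        String other = findMatch(password);
        System.out.println(other + " has the same 16-bit tag as \"" + password + "\"");
    }
}
```

With 65536 possible tags, a match is expected after some tens of thousands of guesses; every extra output bit doubles that work.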
Hashing:
It is a one-way algorithm: once hashed, the value cannot be rolled back, and this is its strong point compared to encryption.
Encryption
If we perform encryption, there will be a key to do this. If this key is leaked, all of your passwords could be decrypted easily.
On the other hand, if you used hashed passwords, then even if your database is hacked or your server admin takes data from the DB, the attacker will not be able to reverse these hashed passwords. This is practically impossible if we use hashing with a proper salt and additional security from PBKDF2.
If you want to take a look at how should you write your hash functions, you can visit here.
There are many algorithms to perform hashing.
MD5 - Uses the Message Digest Algorithm 5 (MD5) hash function. The output hash is 128 bits in length. The MD5 algorithm was designed by Ron Rivest in the early 1990s and is not a preferred option today.
SHA1 - Uses the Secure Hash Algorithm (SHA1) hash published in 1995. The output hash is 160 bits in length. Although most widely used, this is not a preferred option today.
HMACSHA256, HMACSHA384, HMACSHA512 - Use the functions SHA-256, SHA-384, and SHA-512 of the SHA-2 family. SHA-2 was published in 2001. The output hash lengths are 256, 384, and 512 bits, respectively, as the hash functions’ names indicate.
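The output lengths listed above can be checked against the JDK's own implementations. This sketch queries the plain digests through `MessageDigest` (not the keyed HMAC variants); the class name is illustrative.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestLengths {
    // Output size in bits of the named digest algorithm.
    static int bits(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).getDigestLength() * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512"}) {
            System.out.println(name + ": " + bits(name) + " bits");
        }
    }
}
```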
Ideally you should do both.
First hash the password for the one-way security. Use a salt for extra security.
Then encrypt the hash to defend against dictionary attacks if your database of password hashes is compromised.
As correct as the other answers may be, in the context that the quote was in, hashing is a tool that may be used in securing information, encryption is a process that takes information and makes it very difficult for unauthorized people to read/use.
Here's one reason you may want to use one over the other - password retrieval.
If you only store a hash of a user's password, you can't offer a 'forgotten password' feature that sends the user their existing password; you can only let them set a new one.

Resources