Storing and searching for encrypted data fields like email - security

I wanted to know what the best practice is for storing sensitive fields like email and phone number in the database. Let's say you want to search by email and phone number, and the application sends emails and SMS messages to its users as well.
Because this data is sensitive, you need to encrypt it. Hashing is not an option because you can't unhash it.
Encryption standards like Rijndael or AES make the data secure, but you cannot search the db by it because the encrypted string produced for the same input is always different.
So in a case like this, do I need to store both the hash and the encrypted field in the table? Or is there some other strong encryption technique used for fields like these?

Check out CipherSweet. It's a very permissively-licensed open source library that provides searchable encryption in PHP.
Its implementation is similar to Ebbe's answer, but with a lot more caveats:
CipherSweet automatically handles key splitting, through a well-defined protocol.
CipherSweet supports multiple functional blind indexes (truncated hashes of transformations of the plaintext) to facilitate advanced searching.
More information about the security implications of its design is available here.
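To make the idea of a blind index concrete, here is a minimal Python sketch of "a truncated hash of a transformation of the plaintext". It is not CipherSweet's actual algorithm or key-derivation protocol; the function names, the hard-coded key, and the choice of HMAC-SHA256 are assumptions made for illustration only.

```python
import hashlib
import hmac


def last_four_digits(value: str) -> str:
    """Transformation: keep only the last four digits of the input."""
    digits = [ch for ch in value if ch.isdigit()]
    return "".join(digits[-4:])


def blind_index(index_key: bytes, plaintext: str, bits: int = 32) -> str:
    """Keyed hash of the transformed plaintext, truncated to `bits` bits.

    `bits` is assumed to be a multiple of 8 in this sketch.
    """
    transformed = last_four_digits(plaintext).encode("utf-8")
    digest = hmac.new(index_key, transformed, hashlib.sha256).digest()
    return digest[: bits // 8].hex()


# Hypothetical index key. A real system would derive a separate key per
# index from a master key instead of hard-coding one.
index_key = bytes.fromhex("00112233445566778899aabbccddeeff" * 2)

a = blind_index(index_key, "123-45-6789")
b = blind_index(index_key, "987-65-6789")  # same last four digits as `a`
c = blind_index(index_key, "123-45-6780")  # different last four digits
```

Here `a` and `b` are equal because both inputs end in the same four digits, so a query on the index column returns both rows. The short index is lossy on purpose: unrelated values can occasionally collide too, so the application decrypts the candidate rows and filters them.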
Furthermore, the API is relatively straightforward:
<?php
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\CompoundIndex;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;
/** @var CipherSweet $engine */
// Define two fields (one text, one boolean) that will be encrypted
$encryptedRow = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('ssn')
    ->addBooleanField('hivstatus');
// Add a normal Blind Index on one field:
$encryptedRow->addBlindIndex(
    'ssn',
    new BlindIndex(
        'contact_ssn_last_four',
        [new LastFourDigits()],
        32 // 32 bits = 4 bytes
    )
);
// Create/add a compound blind index on multiple fields:
$encryptedRow->addCompoundIndex(
    (
        new CompoundIndex(
            'contact_ssnlast4_hivstatus',
            ['ssn', 'hivstatus'],
            32, // 32 bits = 4 bytes
            true // fast hash
        )
    )->addTransform('ssn', new LastFourDigits())
);
Once you have your object instantiated and configured, you can insert rows like so:
<?php
/* continuing from previous snippet... */
list($encrypted, $indexes) = $encryptedRow->prepareRowForStorage([
    'extraneous' => true,
    'ssn' => '123-45-6789',
    'hivstatus' => false
]);
$encrypted['contact_ssnlast4_hivstatus'] = $indexes['contact_ssnlast4_hivstatus'];
$dbh->insert('contacts', $encrypted);
Then retrieving rows from the database is as simple as using the blind index in a SELECT query:
<?php
/* continuing from previous snippet... */
$lookup = $encryptedRow->getBlindIndex(
    'contact_ssnlast4_hivstatus',
    ['ssn' => '123-45-6789', 'hivstatus' => true]
);
$results = $dbh->search('contacts', ['contact_ssnlast4_hivstatus' => $lookup]);
foreach ($results as $result) {
    $decrypted = $encryptedRow->decrypt($result);
}
CipherSweet is currently implemented in PHP and Node.js, with additional Java, C#, Rust, and Python implementations coming soon.

Actually, encrypting the same message twice with AES with the same key and the same initialization vector (IV) will produce the same output - always.
However, using the same key and the same IV would leak information about the encrypted data. Because AES encrypts in blocks of 16 bytes, two email addresses starting with the same 16 bytes and encrypted with the same key and the same IV would also have the same first 16 bytes in the encrypted message, thus leaking the information that these two emails start the same way. One of the purposes of the IV is to counter this.
A secure search field can be created using an encrypted (with the same key and the same IV) one-way hash. The one-way hash ensures that the encryption doesn't leak data. Using only a one-way hash would not be enough for e.g. telephone numbers, as you can easily brute-force the one-way hashes of all valid phone numbers.
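The brute-force point can be illustrated with a small Python sketch. It swaps the "encrypted hash" described above for an HMAC, which is another way of binding a hash to a secret key; the phone number, the prefix, and the function names are made up for the example.

```python
import hashlib
import hmac
import secrets


def plain_hash(phone: str) -> str:
    """Unkeyed one-way hash: anyone can recompute it."""
    return hashlib.sha256(phone.encode("utf-8")).hexdigest()


def keyed_hash(key: bytes, phone: str) -> str:
    """Keyed hash: recomputing it requires the secret key."""
    return hmac.new(key, phone.encode("utf-8"), hashlib.sha256).hexdigest()


def brute_force(target: str, prefix: str, digits: int):
    """Hash every number of the form prefix + `digits` digits, unkeyed."""
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if plain_hash(candidate) == target:
            return candidate
    return None


secret_key = secrets.token_bytes(32)  # assumed to live only on the app server
phone = "5550004821"                  # made-up phone number

leaked_plain = plain_hash(phone)
leaked_keyed = keyed_hash(secret_key, phone)

# An attacker holding only the database dump tries 10,000 candidates.
recovered = brute_force(leaked_plain, "555000", 4)
not_recovered = brute_force(leaked_keyed, "555000", 4)
```

`recovered` is the original phone number, while the same attack on the keyed value finds nothing, because the attacker cannot compute candidate values without the key. The search itself still works: the application recomputes `keyed_hash` for the number being looked up and compares it with the stored column.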

If you want to encrypt your data, place the table on an encrypted filesystem or use a database that provides a facility for encrypted tables.
Encrypting data in the database itself would lead to very poor performance for a number of reasons, the most obvious being that a simple table scan (let's say you're looking for a user by email address) would require a decryption of the whole recordset.
Also, your application shouldn't deal with encryption/decryption of data: if it is compromised, then all of your data is too.
Moreover, this question probably shouldn't be tagged as a 'PHP' question.

Related

How to create a simple encryption endpoint in node such that only the encrypted text can be returned back for future decryption?

I need to implement an API endpoint that just takes an id as a query param and sends its encrypted value back. For that I was looking into the crypto module in Node, and I found it a bit complex. One thing that I do not get is how I am supposed to use the IV. I plan to store the encryption key in the env so that every id can be decrypted using that same key. So, should I also store the IV in the env? Is that a good practice?
I have seen some APIs randomly generate an IV for each request and return it alongside the encrypted text, so that the user can send both back later for decryption. But for my use case, I cannot send two separate pieces of data back to the user. I could concatenate the IV with the encrypted text, but for some values the encrypted text by itself is already too long for my use case. Any suggestions on what might be the best approach for my case?
Initialisation vectors matter when the same key encrypts many values: without one, identical plaintexts produce identical ciphertexts, which helps an attacker after a breach has occurred, i.e. in the event the DB has been copied/stolen.
In summary, if you encrypt the same value twice with the same key but without an IV, you will get the same encrypted output. By adding a random IV you get a different output for the same input, but you have to store the IV along with the encrypted data; see Cipher Block Chaining. Password storage relies on a similar idea, although passwords should be hashed rather than encrypted: a per-password random value, called a 'salt', makes it much harder to attack a breached password database, because the attacker cannot use precomputed dictionaries of common passwords to find matches within the data. In Postgres, for example, the pgcrypto extension lets you generate a new salt when storing each password, like so:
UPDATE "user" SET password = crypt('new password', gen_salt('bf'));
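The effect of a per-password salt can be sketched in Python with the standard library. This is an illustration under assumptions, not a drop-in replacement for the SQL above: the iteration count is arbitrary and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # illustrative work factor, tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# The same password stored twice yields two different digests,
# because each call draws its own salt.
salt1, hash1 = hash_password("new password")
salt2, hash2 = hash_password("new password")
```

The salt is not secret and is stored next to the digest. Its job is to ensure that two users with the same password do not share a stored value, so one precomputed dictionary cannot be reused across rows.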
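For your use case I'm not certain you need an IV; it depends on how the encrypted data is supposed to be used and/or stored. If you decide you don't need one, you can omit it in either of these ways:
1: Pass null instead of an IV. This only works with a cipher that does not take an IV, such as ECB mode; a mode like CCM requires one:
const cipher = crypto.createCipheriv('aes-192-ecb', key, null);
2: (Deprecated since Node 10 and removed in recent Node releases) Use the createCipher function, which derives both the key and the IV from the value you pass:
const cipher = crypto.createCipher('aes-192-cbc', key);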

search/filter/sort encrypted columns in postgresql

I have a user table which stores encrypted data (name, last name, DOB, email, ...). All of this data has been encrypted with the crypto package of Node.js using the AES-128-CBC algorithm. I use TypeORM and I have defined BeforeInsert and AfterLoad hooks which encrypt/decrypt the data accordingly, so all the results in the code are in plaintext.
Right now, due to features like search and filtering/sorting (based on a column), the performance of the application is very low.
Is there a way to put this encryption and decryption into a wrapper function in Postgres and get the plaintext when we query the database? (The same for inserting: insert plaintext, and have a function in Postgres take the values, encrypt them, and then insert them.) Can we have such functionality with Postgres? Can we then use filtering and sorting in the code and let the database take care of the rest (decrypt the fields, then filter/sort/search, and return the plaintext result)?

Retrieving an encrypted value when presented with the plaintext

I am required to use AES 256 CBC to encrypt some strings before I store them in a relational database. I prepend the ciphertext with the IV that was used. The plaintext is a unique string (what I call the "key") that has a one to one relationship with users in my application.
The problem is that when a user does something, they send the plaintext key and I have to go retrieve any metadata associated with it (such as the user's ID, permissions, etc.). But I've encrypted the key in the database, so I can't just filter like where encrypted_key = :plain_key. I want to be able to do this retrieval with only the plaintext key and not require that other data are sent with the plaintext key. (It may be necessary that I do use more information in my query; I'm just seeing if there's some clever way around it).
I could just retrieve ALL encrypted keys in the database, and then iterate over all ciphertexts and parse out the IV, re-encrypt the plaintext key I received from the user with the IV, and see if I find a match. I don't want to have to retrieve all ciphertexts though. If the IV were predictable somehow I could do it, but I don't want to use any part of the plaintext key or associated metadata as the IV.
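One way around scanning every row, in line with the blind index idea discussed above, is to store a keyed hash of the plaintext key in its own indexed column and filter on that. The Python sketch below uses SQLite and HMAC-SHA256; the table name, column names, and the separate `INDEX_KEY` are assumptions, and random bytes stand in for the real IV-prefixed AES-256-CBC ciphertext.

```python
import hashlib
import hmac
import secrets
import sqlite3

# Hypothetical secret used only for lookups, kept separate from the AES key.
INDEX_KEY = secrets.token_bytes(32)


def lookup_token(plain_key: str) -> bytes:
    """Deterministic keyed hash of the plaintext key, used as the search column."""
    return hmac.new(INDEX_KEY, plain_key.encode("utf-8"), hashlib.sha256).digest()


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE api_keys ("
    " key_index BLOB PRIMARY KEY,"
    " encrypted_key BLOB NOT NULL,"
    " user_id INTEGER NOT NULL)"
)


def store(plain_key: str, ciphertext: bytes, user_id: int) -> None:
    db.execute(
        "INSERT INTO api_keys VALUES (?, ?, ?)",
        (lookup_token(plain_key), ciphertext, user_id),
    )


def find_user(plain_key: str):
    row = db.execute(
        "SELECT user_id FROM api_keys WHERE key_index = ?",
        (lookup_token(plain_key),),
    ).fetchone()
    return row[0] if row else None


# Random bytes stand in for IV + AES-256-CBC ciphertext in this sketch.
store("key-alice", secrets.token_bytes(48), 1)
store("key-bob", secrets.token_bytes(48), 2)
```

`find_user("key-bob")` returns 2 with a single indexed lookup, and the IV for the encrypted column can stay fully random because it is never used for searching.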

How to store private data on blockchain?

I'm writing a DApp on Ethereum (Solidity) and I need to find a way to store private data on the blockchain when I also need to process it somehow. If it were only about storing, I could use normal encryption, but the problem is that I need to read the data IN the smart contract and process it there too.
Let's say:
1) I want to send some private number to a blockchain.
2) I need to check if the private number is bigger than the last stored private number and smaller than the second last stored number.
if (storage[n] < y && y < storage[n-1]) storage.push(y);
3) If yes, I want to store it privately.
Any ideas? Thank you.
It is better to use a data structure with two entries, e.g. a tuple: the first entry acts as a counter (it takes care of checking whether the private number is bigger than the last stored private number), and the second entry stores the encrypted data.
y = new Dapp(sno, value)
// compare y's sno with the record, store private data in value
You can encrypt the data before sending to the blockchain. Encryption and decryption should be done off-chain because the blockchain is public and you don't want your plaintext to be exposed.
You can compare those numbers on-chain without exposing your plaintext if you use homomorphic encryption.
Homomorphic encryption is a form of encryption that allows computation on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the operations as if they had been performed on the plaintext. - Wikipedia
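As a toy illustration of "computation on ciphertexts", here is a minimal Paillier sketch in Python. It is an assumption-laden example: the primes are far too small for real use, and Paillier only supports addition of encrypted values, not the ordering comparison asked about in the question, which would need a different and more involved scheme.

```python
import math
import secrets

# Toy parameters: these primes are far too small to be secure.
P, Q = 293, 433
N = P * Q
N_SQ = N * N
G = N + 1
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # modular inverse; this shortcut relies on G = N + 1


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with fresh randomness r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Recover m from a ciphertext using the private values LAMBDA and MU."""
    x = pow(c, LAMBDA, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQ


total = decrypt(add_encrypted(encrypt(5), encrypt(7)))
```

The party that multiplies the two ciphertexts never sees 5 or 7, yet decrypting the product gives their sum. Requires Python 3.8 or later for `pow` with a negative exponent.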

Security schema .NET

Good day!
I have thought of a security schema for a project and I am curious whether it is very, very secure.
I have stored in the database a base64 string produced by the GetNonZeroBytes(byte[16] object) method of RNGCryptoServiceProvider(), which I use as a Salt.
Next, I use the Salt to generate a Scrypt hash of the password (just like bCrypt, except that it allows me to choose the amount of RAM and other things like that; in this scenario, I use 8 MB of RAM to hash the password). I use the output and the Salt from before to initialise a Rfc2898DeriveBytes(encrypted output, Salt, 10000) instance.
public static string GetBase64StringSafeString(string SaltSource, string StringToEncrypt, int memoryCost)
{
    byte[] Salt = Encoding.ASCII.GetBytes(SaltSource);
    byte[] derivedBytes = new byte[64];
    SCrypt.ComputeKey(Encoding.ASCII.GetBytes(StringToEncrypt), (new Rfc2898DeriveBytes(SaltSource, Salt, 10000)).GetBytes(25), (memoryCost != 0 ? memoryCost : 8192), 8, 1, null, derivedBytes);
    return Convert.ToBase64String(derivedBytes);
}
I use this to generate the key and IV for a RijndaelManaged algorithm with a BlockSize of 256. This is what I use to encrypt data in the database, and thanks to this algorithm I don't have to store the password anywhere: all I have to do is check whether the password entered by the user successfully decrypts the data. If it does, the user is authenticated.
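For readers who want to experiment with the derivation step outside .NET, here is a rough Python sketch using `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later). It is not a port of the C# method above: it skips the extra Rfc2898DeriveBytes pass, and splitting the 64 output bytes into a 32-byte key and a 32-byte IV is an assumption about how the material is used.

```python
import hashlib
import secrets


def derive_material(password: str, salt: bytes) -> bytes:
    """Derive 64 bytes from the password with scrypt.

    n=8192, r=8, p=1 needs roughly 128 * n * r bytes = 8 MiB of memory,
    which matches the cost described in the question.
    """
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=8192, r=8, p=1, dklen=64
    )


salt = secrets.token_bytes(16)  # stored in plaintext next to the data
material = derive_material("correct horse", salt)

# Assumed split: 32 bytes of key plus 32 bytes of IV (256-bit block size).
enc_key, iv = material[:32], material[32:]

same = derive_material("correct horse", salt)
other = derive_material("wrong password", salt)
```

With the same password and salt the output is reproduced exactly (`same == material`), while any other password gives unrelated bytes. Note that an IV derived this way is fixed for a given password and salt, so every value encrypted under it shares the same IV.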
Because the main aim of the hacker is to get the data, he needs the password. If he has the password, he can either log in and get the data or decrypt the data in the DB. To get the password, he would have to run that version of Scrypt with the Salt until he finds a match, and to decrypt the data from the DB he would have to do that and also run RijndaelManaged with that Rfc2898DeriveBytes output.
The only way I see to make this even more secure is to find a way to store the Salt other than in plaintext.
What do you think?
