Coding strategy for securing sensitive data

A web application stores sensitive user data. Neither the operator of the web application nor the hosting provider should be able to see this data, so I want to store it in the DB encrypted with each user's login password.
dataInDB = encrypt (rawData, user password)
With this strategy, however, the usual password recovery use case cannot be implemented: since the web app usually stores only the hash of the password, it cannot send the old, forgotten password to the user. And if a new, random password is assigned, the encrypted data in the DB can no longer be read.
Is there another solution?

A possible solution (I am not responsible for any destruction):
When encrypting sensitive data, don't use the user's password as the key. Rather, derive the key from the user's password (preferably using a standard algorithm such as PBKDF2). In case the user forgets their password, you can keep a copy of this derived key, encrypted using a different key derived from the user's answer to a security question. If the user forgets their password, they can answer their security question. Only the correct answer will decrypt the original password key (not the original password). This gives you the opportunity to re-encrypt the sensitive information.
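As a minimal sketch of the key-derivation step only, the following uses PBKDF2-HMAC-SHA256 from Python's standard library. The function names, the 16-byte salt, the 32-byte key length and the default iteration count are illustrative assumptions, not part of the original answer; the iteration count in particular should be tuned to your own hardware.

```python
import hashlib
import secrets


def new_salt() -> bytes:
    """Return a fresh random salt (16 bytes is an assumed, common size)."""
    return secrets.token_bytes(16)


def derive_key(secret: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a password or security answer.

    The iteration count is an illustrative default, not a recommendation
    taken from the original answer.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", secret.encode("utf-8"), salt, iterations, dklen=32
    )
```

The same function can serve for both the password key and the answer key, provided each gets its own salt.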
I will demonstrate using (Python-esque) pseudo code, but first let's look at a possible table for the users. Don't get caught up in the columns just yet, they will become clear soon...
CREATE TABLE USERS
(
user_name VARCHAR,
-- ... lots of other, useful columns ...
password_key_iterations NUMBER,
password_key_salt BINARY,
password_key_iv BINARY,
encrypted_password_key BINARY,
question VARCHAR,
answer_key_iterations NUMBER,
answer_key_salt BINARY
)
When it comes time to register a user, they must provide a question and answer:
def register_user(user_name, password, question, answer):
    user = User()
    user.user_name = user_name
    # The question is simply stored for later use
    user.question = question
    # The password secret key is derived from the user's password
    # (low/high replace from/to, since "from" is a reserved word in Python)
    user.password_key_iterations = generate_random_number(low=1000, high=2000)
    user.password_key_salt = generate_random_salt()
    password_key = derive_key(password, iterations=user.password_key_iterations, salt=user.password_key_salt)
    # The answer secret key is derived from the answer to the user's security question
    user.answer_key_iterations = generate_random_number(low=1000, high=2000)
    user.answer_key_salt = generate_random_salt()
    answer_key = derive_key(answer, iterations=user.answer_key_iterations, salt=user.answer_key_salt)
    # The password secret key is encrypted using the key derived from the answer
    user.password_key_iv = generate_random_iv()
    user.encrypted_password_key = encrypt(password_key, key=answer_key, iv=user.password_key_iv)
    database.insert_user(user)
Should the user forget their password, the system will still have to ask the user to answer their security question. Their password cannot be recovered, but the key derived from the password can be. This allows the system to re-encrypt the sensitive information using the new password:
def reset_password(user_name, answer, new_password):
    user = database.retrieve_user(user_name)
    answer_key = derive_key(answer, iterations=user.answer_key_iterations, salt=user.answer_key_salt)
    # The answer key decrypts the old password key
    old_password_key = decrypt(user.encrypted_password_key, key=answer_key, iv=user.password_key_iv)
    # TODO: Decrypt sensitive data using the old password key
    new_password_key = derive_key(new_password, iterations=user.password_key_iterations, salt=user.password_key_salt)
    # TODO: Re-encrypt sensitive data using the new password key
    # Use a fresh IV: the same answer key must not encrypt two different values under one IV
    user.password_key_iv = generate_random_iv()
    user.encrypted_password_key = encrypt(new_password_key, key=answer_key, iv=user.password_key_iv)
    database.update_user(user)
Of course, there are some general cryptographic principles not explicitly highlighted here (cipher modes, etc...) that are the responsibility of the implementer to familiarize themselves with.
Hope this helps a little! :)
Update courtesy of Eadwacer's comment
As Eadwacer commented:
I would avoid deriving the key directly from the password (limited entropy and changing the password will require re-encrypting all of the data). Instead, create a random key for each user and use the password to encrypt the key. You would also encrypt the key using a key derived from the security questions.
Here is a modified version of my solution taking his excellent advice into consideration:
CREATE TABLE USERS
(
user_name VARCHAR,
-- ... lots of other, useful columns ...
password_key_iterations NUMBER,
password_key_salt BINARY,
password_encrypted_data_key BINARY,
password_encrypted_data_key_iv BINARY,
question VARCHAR,
answer_key_iterations NUMBER,
answer_key_salt BINARY,
answer_encrypted_data_key BINARY,
answer_encrypted_data_key_iv BINARY
)
You would then register the user as follows:
def register_user(user_name, password, question, answer):
    user = User()
    user.user_name = user_name
    # The question is simply stored for later use
    user.question = question
    # The randomly-generated data key will ultimately encrypt our sensitive data
    data_key = generate_random_key()
    # The password key is derived from the password
    # (low/high replace from/to, since "from" is a reserved word in Python)
    user.password_key_iterations = generate_random_number(low=1000, high=2000)
    user.password_key_salt = generate_random_salt()
    password_key = derive_key(password, iterations=user.password_key_iterations, salt=user.password_key_salt)
    # The answer key is derived from the answer
    user.answer_key_iterations = generate_random_number(low=1000, high=2000)
    user.answer_key_salt = generate_random_salt()
    answer_key = derive_key(answer, iterations=user.answer_key_iterations, salt=user.answer_key_salt)
    # The data key is encrypted using the password key
    user.password_encrypted_data_key_iv = generate_random_iv()
    user.password_encrypted_data_key = encrypt(data_key, key=password_key, iv=user.password_encrypted_data_key_iv)
    # The data key is encrypted using the answer key
    user.answer_encrypted_data_key_iv = generate_random_iv()
    user.answer_encrypted_data_key = encrypt(data_key, key=answer_key, iv=user.answer_encrypted_data_key_iv)
    database.insert_user(user)
Now, resetting a user's password looks like this:
def reset_password(user_name, answer, new_password):
    user = database.retrieve_user(user_name)
    answer_key = derive_key(answer, iterations=user.answer_key_iterations, salt=user.answer_key_salt)
    # The answer key decrypts the data key
    data_key = decrypt(user.answer_encrypted_data_key, key=answer_key, iv=user.answer_encrypted_data_key_iv)
    # Instead of re-encrypting all the sensitive data, we simply re-encrypt the data key
    new_password_key = derive_key(new_password, iterations=user.password_key_iterations, salt=user.password_key_salt)
    # A fresh IV is generated for the new ciphertext
    user.password_encrypted_data_key_iv = generate_random_iv()
    user.password_encrypted_data_key = encrypt(data_key, key=new_password_key, iv=user.password_encrypted_data_key_iv)
    database.update_user(user)
Hopefully my head is still functioning clearly tonight...

Related

How do I validate the Hmac using NodeJS?

I can successfully create an Hmac via NodeJS using the following code:
(slightly altered example from : https://nodejs.org/api/crypto.html#cryptocreatehmacalgorithm-key-options)
Crypto.createHmac('sha256', Crypto.randomBytes(16))
.update('I love cupcakes')
.digest('hex');
That results in a value like the following (hex-based string Hmac signature):
fb2937ca821264812d511d68ae06a643915931375633173ba64af9425f2ffd53
How do I use that signature to verify that the data was not altered? (using NodeJS, of course).
My Assumption
I'm assuming there is a method call where you supply the data and the signature and you get a boolean that tells you if the data was altered or not -- or something similar.
Another Solution?
Oh, wait, as I was writing that I started thinking...
Do I need to store the original random bytes I generated (Crypto.randomBytes(16)) and pass them to the receiver so they can just generate the HMac again and verify that the result is the same (fb2937ca821264812d511d68ae06a643915931375633173ba64af9425f2ffd53)?
If that is true, that would be odd, because the value passed in that position (where I use Crypto.randomBytes(16)) is named secret in the official example. Seems like that needs to be kept secret??
Please let me know if there is a way to verify the signature on the receiving side & how I do that.
Official Documentation : A Bit Confusing
Here's the function as it is defined in the official docs:
crypto.createHmac(algorithm, key[, options])
In the function definition, you can see the second param is named key.
However, in the example they refer to it as secret
const secret = 'abcdefg';
const hash = crypto.createHmac('sha256', secret)
.update('I love cupcakes')
.digest('hex');
console.log(hash);

Just posting the answer so if anyone in future sees this they will be able to have the definitive answer.
As the commenter (Topaco) pointed out, the simple answer is that:
The receiver who wants to validate the Hmac simply needs to apply the same key value and data to the method and compare the resulting hash value with the one received.
const secret = 'abcdefg';
const hash = crypto.createHmac('sha256', secret)
.update('I love cupcakes')
.digest('hex');
console.log(hash);
The original Hmac-creating party must provide three things for the verifying party:
data: (could be encrypted data from AES256, for example)
key: the original key passed into the createHmac() method. Note: this item is called secret in the sample code by NodeJS (above).
hash: the (clearText) hash which the original creator generated when calling the createHmac() method.
With those three things the verifying party can now call the createHmac() method and determine if the hash they get matches the hash that the original hmac-creating party generated.
Doing this validates that the Data which was sent has not been corrupted or altered.
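The verification step can be sketched as follows. This is written in Python rather than NodeJS, purely as an illustration of the recompute-and-compare idea; the function names sign and verify are hypothetical. The comparison uses a constant-time function so that timing does not leak how many characters matched (Node.js offers crypto.timingSafeEqual for the same purpose).

```python
import hashlib
import hmac


def sign(key: bytes, data: bytes) -> str:
    """Return the hex HMAC-SHA256 of data under key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key: bytes, data: bytes, received_hex: str) -> bool:
    """Recompute the HMAC and compare it with the received value."""
    expected = sign(key, data)
    # Compare as bytes, in constant time, instead of using ==
    return hmac.compare_digest(expected.encode("ascii"), received_hex.encode("utf-8"))
```

A receiver holding the shared key would call verify(key, data, hash) and reject the message when it returns False.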
Additional Note On Key (secret)
I've come back after thinking about the Hmac a bit more.
It is required that both parties know the key (aka secret) but it does not mean that it should be exposed to others.
This must be kept secret (as the code implies) because if a nefarious type knew the key, they could alter the data, compute a new valid Hmac for the altered data using that key, and pass both along as if the original creator had sent them (MITM - man in the middle attack).
So, the point here is that yes, both parties have to know the key (secret) value, but it should not be shared where it might be discovered by nefarious types.
Instead, it will have to be agreed upon or based upon a secret password, etc.

File encryption in Laravel and sudo users

I understand I can encrypt and store the contents of files (csv mainly) using the techniques explained here and here.
However, I am looking for a way to prevent anyone from accessing these files, even users with sudo access to the server. The only one (or group of people) who should be able to access the encrypted files would be those who have a password or encryption key chosen by me. Is this possible?

By default the file contains only encrypted data, so even if someone obtains the file they cannot read its contents. However, you can also add key protection, either by using the Encrypter class directly or by using Spatie's crypt package.
See the Spatie package for details.
You can also use the default crypt like this
use Illuminate\Encryption\Encrypter;
//Keys and cipher used by encrypter(s)
$fromKey = base64_decode("from_key_as_a_base_64_encoded_string");
$toKey = base64_decode("to_key_as_a_base_64_encoded_string");
$cipher = "AES-256-CBC"; //or AES-128-CBC if you prefer
//Create two encrypters using different keys for each
$encrypterFrom = new Encrypter($fromKey, $cipher);
$encrypterTo = new Encrypter($toKey, $cipher);
//Decrypt a string that was encrypted using the "from" key
$decryptedFromString = $encrypterFrom->decryptString("gobbledygook=that=is=a=from=key=encrypted=string==");
//Now encrypt the decrypted string using the "to" key
$encryptedToString = $encrypterTo->encryptString($decryptedFromString);

What is the best way to encrypt stored data in web2py?

I need to encrypt data stored in web2py, more precisely passwords.
This is not about authentication, but more something in the line of a KeePass-like application.
I've seen that PyMe is included in web2py, but M2Crypto and M2Secret could easily do that. With M2Secret I can use this:
import m2secret
# Encrypt
secret = m2secret.Secret()
secret.encrypt('my data', 'my master password')
serialized = secret.serialize()
# Decrypt
secret = m2secret.Secret()
secret.deserialize(serialized)
data = secret.decrypt('my master password')
But I would have to include the M2Crypto library in my appliance.
Is there a way to do this with PyMe which is already included with web2py?

By default web2py stores passwords hashed using HMAC+SHA512 so there is nothing for you to do. It is better than the mechanism that you suggest because encryption is reversible while hashing is not. You can change this and do what you ask above but it would not be any more secure than using plaintext (since you would have to expose the encryption key in the app).
Anyway. Let's say you have a
db.define_table('mytable', Field('myfield', 'password'))
and you want to use m2secret. You would do:
import m2secret

class MyValidator:
    def __init__(self, key):
        self.key = key

    def __call__(self, value):
        # Validators return a (value, error) tuple; the value is encrypted before storage
        secret = m2secret.Secret()
        secret.encrypt(value, self.key)
        return (secret.serialize(), None)

    def formatter(self, value):
        # formatter() returns the decrypted value for display
        secret = m2secret.Secret()
        secret.deserialize(value)
        return secret.decrypt(self.key)

db.mytable.myfield.requires = MyValidator("master password")
In web2py validators are also two way filters.

Password hashing - how to upgrade?

There's plenty of discussion on the best algorithm - but what if you're already in production? How do you upgrade without forcing users to reset their passwords?
EDIT/DISCLAIMER: Although I originally wanted a "quick fix" solution and chose orip's response, I must concede that if security in your application is important enough to be even bothering with this issue, then a quick fix is the wrong mentality and his proposed solution is probably inadequate.

One option is to make your stored hash include an algorithm version number - so you start with algorithm 0 (e.g. MD5) and store
0:ab0123fe
then when you upgrade to SHA-1, you bump the version number to 1:
1:babababa192df1312
(no, I know these lengths probably aren't right).
That way you can always tell which version to check against when validating a password. You can invalidate old algorithms just by wiping stored hashes which start with that version number.
If you've already got hashes in production without a version number, just choose a scheme such that you can easily recognise unversioned hashes - for example, using the above scheme of a colon, any hash which doesn't contain a colon must by definition predate the versioning scheme, so can be inferred to be version 0 (or whatever).
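A minimal sketch of the versioned format, assuming the "version:hash" layout described above. The HASHERS registry and the helper names are hypothetical, and MD5 and SHA-1 appear only because they are the examples used in this answer; a real upgrade target would be a salted, slow password hash.

```python
import hashlib
import hmac

# Hypothetical registry: version number -> hashing function
HASHERS = {
    0: lambda pw: hashlib.md5(pw.encode("utf-8")).hexdigest(),
    1: lambda pw: hashlib.sha1(pw.encode("utf-8")).hexdigest(),
}
CURRENT_VERSION = 1


def parse_stored(stored):
    """Split a stored value into (version, digest).

    A value without a colon predates the scheme and is treated as version 0.
    """
    if ":" not in stored:
        return 0, stored
    version, digest = stored.split(":", 1)
    return int(version), digest


def make_stored(password, version=CURRENT_VERSION):
    """Build the stored string for a password under the given version."""
    return "%d:%s" % (version, HASHERS[version](password))


def check(password, stored):
    """Validate a password against whichever version the stored value uses."""
    version, digest = parse_stored(stored)
    candidate = HASHERS[version](password)
    return hmac.compare_digest(candidate.encode("ascii"), digest.encode("utf-8"))
```

Retiring an algorithm then amounts to deleting its entry from the registry and wiping the stored values that carry its version number.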

A cool way to secure all the existing passwords: use the existing hash as the input for the new, and better, password hash.
So if your existing hashes are straight MD5s, and you plan on moving to some form of PBKDF2 (or bcrypt, or scrypt), then change your password hash to:
PBKDF2( MD5( password ) )
You already have the MD5 in your database so all you do is apply PBKDF2 to it.
The reason this works well is that the weaknesses of MD5 vs other hashes (e.g. SHA-*) don't affect password use. For example, its collision vulnerabilities are devastating for digital signatures but they don't affect password hashes. Compared to longer hashes MD5 reduces the hash search-space somewhat with its 128-bit output, but this is insignificant compared to the password search space itself which is much much smaller.
What makes a password hash strong is slowing down (achieved in PBKDF2 by iterations) and a random, long-enough salt - the initial MD5 doesn't adversely affect either of them.
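The wrapping idea can be sketched like this. It assumes the legacy values are lowercase hex MD5 digests of the UTF-8 password; the function names, the salt size and the iteration count are illustrative choices rather than anything prescribed by this answer.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative; tune for your hardware


def upgrade_legacy_hash(md5_hex: str, salt: bytes) -> bytes:
    """Wrap an existing MD5 hex digest in PBKDF2 without knowing the password."""
    return hashlib.pbkdf2_hmac("sha256", md5_hex.encode("ascii"), salt, ITERATIONS)


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute PBKDF2(MD5(password)) and compare it with the stored value."""
    md5_hex = hashlib.md5(password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(upgrade_legacy_hash(md5_hex, salt), stored)
```

Because upgrade_legacy_hash needs only the stored MD5 value, every row can be converted in one batch, with a fresh salt per user from secrets.token_bytes.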
And while you're at it, add a version field to the passwords too.
EDIT: The cryptography StackExchange has an interesting discussion on this method.

Wait until your user logs in (so you have the password in plaintext), then hash it with the new algorithm & save it in your database.

One way to do it is to:
Introduce new field for new password
When the user logs in check the password against the old hash
If OK, hash the clear text password with the new hash
Remove the old hash
Then gradually you will have only passwords with the new hash
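The steps above can be sketched as follows, with a plain dict standing in for the user row. The field names, the MD5 legacy hash and the PBKDF2 replacement are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative; tune for your hardware


def new_hash(password: str, salt: bytes) -> bytes:
    """The replacement hash; PBKDF2 is an assumed choice."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def login(user: dict, password: str) -> bool:
    """Check the password and migrate the record on the first successful login."""
    if user.get("new_hash") is not None:
        # Already migrated: only the new hash is consulted
        return hmac.compare_digest(new_hash(password, user["salt"]), user["new_hash"])
    old = hashlib.md5(password.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(old.encode("ascii"), user["old_hash"].encode("utf-8")):
        return False
    # The clear-text password is available now, so store the new hash and drop the old one
    user["salt"] = secrets.token_bytes(16)
    user["new_hash"] = new_hash(password, user["salt"])
    user["old_hash"] = None
    return True
```

Accounts that never log in keep their old hash, so this approach is usually paired with a cutoff date or a forced reset.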

You probably can't change the password hashing scheme now, unless you're storing passwords in plain text. What you can do is re-hash the member passwords using a better hashing scheme after each user has successfully logged in.
You can try this:
First add a new column to your members table, or which ever table stores passwords.
ALTER TABLE members ADD is_pass_upgraded tinyint(1) default 0;
Next, in your code that authenticates users, add some additional logic (I'm using PHP):
<?php
$username = $_POST['username'];
$password = $_POST['password'];
$auth_success = authenticateUser($username, $password);
if (!$auth_success) {
    /**
     * They entered the wrong username/password. Redirect them back
     * to the login page.
     */
} else {
    /**
     * Check to see if the member's password has been upgraded yet
     */
    $username = mysql_real_escape_string($username);
    $sql = "SELECT id FROM members WHERE username = '$username' AND is_pass_upgraded = 0 LIMIT 1";
    $results = mysql_query($sql);
    /**
     * Getting any results from the query means their password hasn't been
     * upgraded yet. We will upgrade it now.
     */
    if (mysql_num_rows($results) > 0) {
        /**
         * Generate a new password hash using your new algorithm. That's
         * what the generateNewPasswordHash() function does.
         */
        $password = generateNewPasswordHash($password);
        $password = mysql_real_escape_string($password);
        /**
         * Now that we have a new password hash, we'll update the member table
         * with the new password hash, and change the is_pass_upgraded flag.
         */
        $sql = "UPDATE members SET password = '$password', is_pass_upgraded = 1 WHERE username = '$username' LIMIT 1";
        mysql_query($sql);
    }
}
Your authenticateUser() function would need to be changed to something similar to this:
<?php
function authenticateUser($username, $password)
{
    $username = mysql_real_escape_string($username);
    /**
     * We need password hashes using your old system (md5 for example)
     * and your new system. generateNewPasswordHash() is the same function
     * used when upgrading the stored hash.
     */
    $old_password_hashed = md5($password);
    $new_password_hashed = generateNewPasswordHash($password);
    $old_password_hashed = mysql_real_escape_string($old_password_hashed);
    $new_password_hashed = mysql_real_escape_string($new_password_hashed);
    $sql = "SELECT *
            FROM members
            WHERE username = '$username'
            AND
            (
                (is_pass_upgraded = 0 AND password = '$old_password_hashed')
                OR
                (is_pass_upgraded = 1 AND password = '$new_password_hashed')
            )
            LIMIT 1";
    $results = mysql_query($sql);
    if (mysql_num_rows($results) > 0) {
        $row = mysql_fetch_assoc($results);
        startUserSession($row);
        return true;
    } else {
        return false;
    }
}
There are upsides and downsides to this approach. On the upside, an individual member's password becomes more secure once they've logged in. The downside is that the passwords of members who haven't logged in yet remain stored under the old scheme.
I'd only do this for maybe 2 weeks. I'd send an email to all my members, and tell them they have 2 weeks to log into their account because of site upgrades. If they fail to log in within 2 weeks they'll need to use the password recovery system to reset their password.

Just re-hash the plain text when they authenticate the next time. Oh, and use SHA-256 with a salt made of full bytes (base 256), 256 bytes in size.

Cryptographic security of Captcha hash cookie

My company's CRM system requires a captcha at each login and before certain administrative functions can be used. The original implementation stored the current captcha value in a server-side session variable.
We're now required to redevelop this to store all necessary captcha verification information in a hashed client-side cookie. This is due to a parent IT policy which is intended to reduce overhead by disallowing use of sessions for users who are not already authenticated to the application. Thus, the authentication process itself is disallowed from using server-side storage or sessions.
The design was a bit of a group effort, and I have my doubts as to its overall efficacy. My question is, can anyone see any obvious security issues with the implementation shown below, and is it overkill or insufficient in any way?
EDIT: Further discussion has led to an updated implementation, so I've replaced the original code with the new version and edited the description to reflect this revision.
(The code below is a kind of pseudo-code; the original uses some idiosyncratic legacy libraries and structure which make it difficult to read. Hopefully this style is easy enough to understand.)
// Generate a "session" cookie unique to a particular machine and timeframe
String generateSessionHash(timestamp) {
    return sha256( ""
        + (int)(timestamp / CAPTCHA_VALIDITY_SECONDS)
        + "|" + request.getRemoteAddr()
        + "|" + request.getUserAgent()
        + "|" + BASE64_8192BIT_SECRET_A
    );
}

// Generate a hash of the captcha, salted with secret key and session id
String generateCaptchaHash(captchaValue, session_hash) {
    return sha256( ""
        + captchaValue
        + "|" + BASE64_8192BIT_SECRET_B
        + "|" + session_hash
    );
}

// Set cookie with hash matching the provided captcha image
void setCaptchaCookie(CaptchaGenerator captcha) {
    String session_hash = generateSessionHash(time());
    String captcha_hash = generateCaptchaHash(captcha.getValue(), session_hash);
    response.setCookie(CAPTCHA_COOKIE, captcha_hash + session_hash);
}

// Return true if user's input matches the cookie captcha hash
boolean isCaptchaValid(userInputValue) {
    String cookie = request.getCookie(CAPTCHA_COOKIE);
    String cookie_captcha_hash = substring(cookie, 0, 64);
    String cookie_session_hash = substring(cookie, 64, 64);
    String session_hash = generateSessionHash(time());
    if (!session_hash.equals(cookie_session_hash)) {
        session_hash = generateSessionHash(time() - CAPTCHA_VALIDITY_SECONDS);
    }
    String captcha_hash = generateCaptchaHash(userInputValue, session_hash);
    return captcha_hash.equals(cookie_captcha_hash);
}
Concept:
The "session_hash" is intended to prevent the same cookie from being used on multiple machines, and enforces a time period after which it becomes invalid.
Both the "session_hash" and "captcha_hash" have their own secret salt keys.
These BASE64_8192BIT_SECRET_A and _B salt keys are portions of an RSA private key stored on the server.
The "captcha_hash" is salted with both the secret and the "session_hash".
Delimiters are added where client-provided data is used, to avoid splicing attacks.
The "captcha_hash" and "session_hash" are both stored in the client-side cookie.
EDIT: re:Kobi Thanks for the feedback!
(I would reply in comments, but it doesn't seem to accept the formatting that works in questions?)
Each time they access the login page, the captcha is replaced; This does however assume that they don't simply resubmit without reloading the login form page. The session-based implementation uses expiration times to avoid this problem. We could also add a nonce to the login page, but we would need server-side session storage for that as well.
Per Kobi's suggestion, an expiration timeframe is now included in the hashed data, but consensus is to add it to the session_hash instead, since it's intuitive for a session to have a timeout.
This idea of hashing some data and including another hash in that data seems suspect to me. Is there really any benefit, or are we better off with a single hash containing all of the relevant data (time, IP, User-agent, Captcha value, and secret key). In this implementation we are basically telling the user part of the hashed plaintext.
Questions:
Are there any obvious deficiencies?
Are there any subtle deficiencies?
Is there a more robust approach?
Is salting the hash with another hash helping anything?
Is there a simpler and equally robust approach?
New question:
I personally think that we're better off leaving it as a server-side session; can anybody point me to any papers or articles proving or disproving the inherent risk of sending all verification data to the client side only?

Assuming no other security than stated here:
It seems an attacker can solve the captcha once, and save the cookie.
She then has her constant session_hash and captcha_hash. Nothing prevents her from submitting the same cookie with the same hashed data - possibly breaking your system.
This can be avoided by using time as part of captcha_hash (you'll need to round it to an even time, possibly a few minutes - and checking for two options - the current time and the previous)
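A rough sketch of that time-window check, written in Python. It swaps the plain sha256 concatenation from the question for an HMAC, which is my substitution and not part of the original design; SECRET, WINDOW_SECONDS and the function names are hypothetical placeholders.

```python
import hashlib
import hmac

SECRET = b"server-side secret (hypothetical placeholder)"
WINDOW_SECONDS = 300  # assumed validity period


def captcha_token(captcha_value: str, now: float) -> str:
    """HMAC over the rounded time window and the captcha value."""
    window = int(now // WINDOW_SECONDS)
    message = ("%d|%s" % (window, captcha_value)).encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def is_captcha_valid(user_input: str, cookie_token: str, now: float) -> bool:
    """Accept a token issued in the current or the previous time window."""
    received = cookie_token.encode("utf-8")
    for t in (now, now - WINDOW_SECONDS):
        expected = captcha_token(user_input, t).encode("ascii")
        if hmac.compare_digest(expected, received):
            return True
    return False
```

This only bounds how long a solved captcha stays replayable; the same cookie can still be resubmitted any number of times within the window.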
To clarify, you said:
The "session_hash" is intended to prevent the same cookie from being used on multiple machines.
Is that true?
On isCaptchaValid you're doing String session_hash = substring(cookie, 64, 64); - that is: you're relying on data in the cookie. How can you tell it wasn't copied from another computer? - you're not hashing the client data again to confirm it (in fact, you have a random number there, so it may not be possible). How can you tell it's a new request that hasn't been used before?
I realize the captcha is replaced with each login, but how can you know that when a request was made? You aren't checking the new captcha on isCaptchaValid - your code will still validate the request, even if it doesn't match the displayed captcha.
Consider the following scenario (can be automated):
Eve opens the login page.
Gets a new cookie and a new captcha.
Replaces it with her old cookie, with hashed data of her old captcha.
Submits the old cookie, and userInputValue with the old captcha word.
With this input, isCaptchaValid validates the request - captcha_hash, session_hash, userInputValue and BASE64_8192BIT_SECRET are all the same as they were on the first request.
By the way, in most systems you'll need a nonce anyway, to avoid CSRF, and having one also solves your problem.
