Cryptographic security of Captcha hash cookie - security

My company's CRM system uses a captcha at each login and to gate certain administrative functions. The original implementation stored the current captcha value in a server-side session variable.
We're now required to redevelop this to store all necessary captcha verification information in a hashed client-side cookie. This is due to a parent IT policy which is intended to reduce overhead by disallowing use of sessions for users who are not already authenticated to the application. Thus, the authentication process itself is disallowed from using server-side storage or sessions.
The design was a bit of a group effort, and I have my doubts as to its overall efficacy. My question is, can anyone see any obvious security issues with the implementation shown below, and is it overkill or insufficient in any way?
EDIT: Further discussion has led to an updated implementation, so I've replaced the original code with the new version and edited the description to match this revision.
(The code below is a kind of pseudo-code; the original uses some idiosyncratic legacy libraries and structure which make it difficult to read. Hopefully this style is easy enough to understand.)
    // Generate a "session" cookie unique to a particular machine and timeframe
    String generateSessionHash(timestamp) {
        return sha256( ""
            + (int)(timestamp / CAPTCHA_VALIDITY_SECONDS)
            + "|" + request.getRemoteAddr()
            + "|" + request.getUserAgent()
            + "|" + BASE64_8192BIT_SECRET_A
        );
    }

    // Generate a hash of the captcha, salted with secret key and session id
    String generateCaptchaHash(captchaValue, session_hash) {
        return sha256( ""
            + captchaValue
            + "|" + BASE64_8192BIT_SECRET_B
            + "|" + session_hash
        );
    }

    // Set cookie with hash matching the provided captcha image
    void setCaptchaCookie(CaptchaGenerator captcha) {
        String session_hash = generateSessionHash(time());
        String captcha_hash = generateCaptchaHash(captcha.getValue(), session_hash);
        response.setCookie(CAPTCHA_COOKIE, captcha_hash + session_hash);
    }

    // Return true if user's input matches the cookie captcha hash
    boolean isCaptchaValid(userInputValue) {
        String cookie = request.getCookie(CAPTCHA_COOKIE);
        String cookie_captcha_hash = substring(cookie, 0, 64);
        String cookie_session_hash = substring(cookie, 64, 64);
        String session_hash = generateSessionHash(time());
        if (!session_hash.equals(cookie_session_hash)) {
            session_hash = generateSessionHash(time() - CAPTCHA_VALIDITY_SECONDS);
        }
        String captcha_hash = generateCaptchaHash(userInputValue, session_hash);
        return captcha_hash.equals(cookie_captcha_hash);
    }
Concept:
The "session_hash" is intended to prevent the same cookie from being used on multiple machines, and enforces a time period after which it becomes invalid.
Both the "session_hash" and "captcha_hash" have their own secret salt keys.
These BASE64_8192BIT_SECRET_A and _B salt keys are portions of an RSA private key stored on the server.
The "captcha_hash" is salted with both the secret and the "session_hash".
Delimiters are added where client-provided data is used, to avoid splicing attacks.
The "captcha_hash" and "session_hash" are both stored in the client-side cookie.
EDIT: re:Kobi Thanks for the feedback!
(I would reply in comments, but it doesn't seem to accept the formatting that works in questions?)
Each time they access the login page, the captcha is replaced. This does, however, assume that they don't simply resubmit without reloading the login form page. The session-based implementation uses expiration times to avoid this problem. We could also add a nonce to the login page, but we would need server-side session storage for that as well.
Per Kobi's suggestion, an expiration timeframe is now included in the hashed data, but consensus is to add it to the session_hash instead, since it's intuitive for a session to have a timeout.
This idea of hashing some data and including another hash in that data seems suspect to me. Is there really any benefit, or are we better off with a single hash containing all of the relevant data (time, IP, User-agent, Captcha value, and secret key)? In this implementation we are basically telling the user part of the hashed plaintext.
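For comparison, the single-hash alternative could look roughly like the Node.js sketch below. It is not the implementation described above: the names issueCaptchaToken and verifyCaptchaToken, the SECRET value, and the 300-second window are assumptions made for illustration. It also swaps the bare sha256 over concatenated strings for HMAC-SHA-256, and compares the result in constant time.

```javascript
const crypto = require("node:crypto");

// Hypothetical values, for illustration only.
const SECRET = crypto.randomBytes(32); // server-side key, never sent to the client
const CAPTCHA_VALIDITY_SECONDS = 300;

// Index of the time window that a timestamp (in seconds) falls into.
function bucket(nowSeconds) {
  return Math.floor(nowSeconds / CAPTCHA_VALIDITY_SECONDS);
}

// One MAC over every field; JSON encoding keeps the field boundaries unambiguous.
function mac(bucketIndex, ip, userAgent, captchaValue) {
  const data = JSON.stringify([bucketIndex, ip, userAgent, captchaValue]);
  return crypto.createHmac("sha256", SECRET).update(data).digest("hex");
}

function issueCaptchaToken(nowSeconds, ip, userAgent, captchaValue) {
  return mac(bucket(nowSeconds), ip, userAgent, captchaValue);
}

function verifyCaptchaToken(token, nowSeconds, ip, userAgent, userInput) {
  const received = Buffer.from(String(token), "hex");
  // Accept the current window, then the previous one.
  for (const b of [bucket(nowSeconds), bucket(nowSeconds) - 1]) {
    const expected = Buffer.from(mac(b, ip, userAgent, userInput), "hex");
    if (received.length === expected.length &&
        crypto.timingSafeEqual(received, expected)) {
      return true;
    }
  }
  return false;
}
```

Only the MAC goes into the cookie, so the client learns nothing about the hashed fields. Like any stateless scheme, though, this does nothing to stop the same token and answer from being resubmitted while the window is still open.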
Questions:
Are there any obvious deficiencies?
Are there any subtle deficiencies?
Is there a more robust approach?
Is salting the hash with another hash helping anything?
Is there a simpler and equally robust approach?
New question:
I personally think that we're better off leaving it as a server-side session; can anybody point me to any papers or articles proving or disproving the inherent risk of sending all verification data to the client side only?

Assuming no other security than stated here:
It seems an attacker can solve the captcha once, and save the cookie.
She then has her constant session_hash and captcha_hash. Nothing prevents her from submitting the same cookie with the same hashed data - possibly breaking your system.
This can be avoided by using the time as part of captcha_hash (you'll need to round it to a fixed interval, possibly a few minutes, and check two options: the current interval and the previous one).
To clarify, you said:
The "session_hash" is intended to prevent the same cookie from being used on multiple machines.
Is that true?
On isCaptchaValid you're doing String session_hash = substring(cookie, 64, 64); that is, you're relying on data in the cookie. How can you tell it wasn't copied from another computer? You're not hashing the client data again to confirm it (in fact, you have a random number there, so it may not be possible). How can you tell it's a new request that hasn't been used before?
I realize the captcha is replaced with each login, but how can you know that when a request is made? You aren't checking the new captcha in isCaptchaValid: your code will still validate the request, even if it doesn't match the displayed captcha.
Consider the following scenario (can be automated):
Eve opens the login page.
She gets a new cookie and a new captcha.
She replaces the new cookie with her old one, which holds the hashed data of her old captcha.
She submits the old cookie, with userInputValue set to the old captcha word.
With this input, isCaptchaValid validates the request: captcha_hash, session_hash, userInputValue and BASE64_8192BIT_SECRET are all the same as they were on the first request.
By the way, in most systems you'll need a nonce anyway, to avoid CSRF, and having one also solves your problem.
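The replay in the scenario above can be reproduced with a short model. This is an illustrative Node.js sketch, not the poster's code: makeCookie, isCaptchaValid, isCaptchaValidOnce and the in-memory spent set are hypothetical, and the set is exactly the kind of server-side state that the policy in the question rules out.

```javascript
const crypto = require("node:crypto");

const SECRET = crypto.randomBytes(32); // hypothetical server-side key

// Stateless cookie: a keyed hash of the captcha value.
function makeCookie(captchaValue) {
  return crypto.createHmac("sha256", SECRET).update(captchaValue).digest("hex");
}

// Stateless check: nothing records that a cookie has already been used.
function isCaptchaValid(cookie, userInput) {
  const expected = Buffer.from(makeCookie(userInput), "hex");
  const received = Buffer.from(String(cookie), "hex");
  return received.length === expected.length &&
         crypto.timingSafeEqual(received, expected);
}

// Single-use variant: remembers spent cookies (needs server-side storage).
const spent = new Set();
function isCaptchaValidOnce(cookie, userInput) {
  if (spent.has(cookie) || !isCaptchaValid(cookie, userInput)) return false;
  spent.add(cookie);
  return true;
}

const oldCookie = makeCookie("x7k2p");               // Eve solves this one by hand
console.log(isCaptchaValid(oldCookie, "x7k2p"));     // true
console.log(isCaptchaValid(oldCookie, "x7k2p"));     // true again: the replay succeeds
console.log(isCaptchaValidOnce(oldCookie, "x7k2p")); // true
console.log(isCaptchaValidOnce(oldCookie, "x7k2p")); // false: already spent
```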


How do I validate the Hmac using NodeJS?

I can successfully create an Hmac via NodeJS using the following code:
(slightly altered example from : https://nodejs.org/api/crypto.html#cryptocreatehmacalgorithm-key-options)
    Crypto.createHmac('sha256', Crypto.randomBytes(16))
        .update('I love cupcakes')
        .digest('hex');
That results in a value like the following (hex-based string Hmac signature):
fb2937ca821264812d511d68ae06a643915931375633173ba64af9425f2ffd53
How do I use that signature to verify that the data was not altered? (using NodeJS, of course).
My Assumption
I'm assuming there is a method call where you supply the data and the signature and you get a boolean that tells you if the data was altered or not -- or something similar.
Another Solution?
Oh, wait, as I was writing that I started thinking...
Do I need to store the original random bytes I generated (Crypto.randomBytes(16)) and pass them to the receiver so they can just generate the HMac again and verify that the result is the same (fb2937ca821264812d511d68ae06a643915931375633173ba64af9425f2ffd53)?
If that is true, it would be odd, because the value I generate with Crypto.randomBytes(16) is the argument named secret in the official example. It seems like that needs to be kept secret?
Please let me know if there is a way to verify the signature on the receiving side & how I do that.
Official Documentation : A Bit Confusing
Here's the function as it is defined in the official docs:
crypto.createHmac(algorithm, key[, options])
In the function definition, you can see the second param is named key.
However, in the example they refer to it as secret
    const secret = 'abcdefg';
    const hash = crypto.createHmac('sha256', secret)
        .update('I love cupcakes')
        .digest('hex');
    console.log(hash);
Just posting the answer so that anyone who sees this in future has a definitive answer.
As the commenter (Topaco) pointed out, the simple answer is that:
The receiver who wants to validate the HMAC simply needs to use the same key and data, apply them to the same method, and compare the resulting hash value with the one received.
    const secret = 'abcdefg';
    const hash = crypto.createHmac('sha256', secret)
        .update('I love cupcakes')
        .digest('hex');
    console.log(hash);
The original HMAC-creating party must make three things available to the verifying party:
data: the message itself (it could be AES-256 ciphertext, for example)
key: the original key passed into the createHmac() method. Note: this item is called secret in the Node.js sample code above.
hash: the (plain-text) hash which the original creator generated when calling the createHmac() method.
With those three things, the verifying party can call createHmac() themselves and check whether the hash they get matches the one the original party generated.
A match shows that the data has not been corrupted or altered since the hash was created.
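The recompute-and-compare step can be sketched in Node.js as follows. The Hmac object offers update() and digest() but no verify method, so the helper names sign and verify here are hypothetical. The comparison uses crypto.timingSafeEqual rather than === so that it does not leak how many leading characters matched.

```javascript
const crypto = require("node:crypto");

// Compute the hex HMAC-SHA-256 tag for `data` under `key`.
function sign(key, data) {
  return crypto.createHmac("sha256", key).update(data).digest("hex");
}

// Recompute the tag and compare it with the received one in constant time.
function verify(key, data, receivedHex) {
  const expected = Buffer.from(sign(key, data), "hex");
  const received = Buffer.from(String(receivedHex), "hex");
  return received.length === expected.length &&
         crypto.timingSafeEqual(received, expected);
}

const key = crypto.randomBytes(16); // must be kept and shared with the verifier
const tag = sign(key, "I love cupcakes");

console.log(verify(key, "I love cupcakes", tag)); // true
console.log(verify(key, "I love cupcake!", tag)); // false: the data was altered
```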
Additional Note On Key (secret)
I've come back after thinking about the Hmac a bit more.
It is required that both parties know the key (aka secret) but it does not mean that it should be exposed to others.
The key must be kept secret (as the name in the sample code implies): anyone who knows it can alter the data, compute a fresh HMAC over the altered data with the same key, and pass both along as if the original creator had sent them (a man-in-the-middle attack).
So, the point here is that yes, both parties have to know the key (secret) value, but it should not be shared where it might be discovered by nefarious types.
Instead, it will have to be agreed upon or based upon a secret password, etc.

What is double HMAC verification and how does it work?

From the Mega Security Whitepaper:
The API does not store the unhashed Authentication Key sent by the user. It only stores the Hashed Authentication Key to prevent “pass-the-hash” attacks (wherein the scenario of a leaked database, an attacker would just pass the Hashed Authentication Key to get authenticated and carry out actions as the real user). The server always hashes the Authentication Key received from the client which prevents this attack vector.
If the Hashed Authentication Key does not match the result in the database, the API responds with a negative response to indicate failure. The API side avoids timing attacks here by using Double HMAC Verification (https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2011/february/double-hmac-verification/).
Unfortunately the link is dead and a google search often points back to the same source. Can you please explain how double hmac verification works when you need the original key to verify the signature? Thank you
That URL has been archived by Wayback Machine so you can still read the full blog post: http://web.archive.org/web/20160203044316/https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2011/february/double-hmac-verification/
The post contains a C# code snippet that demonstrates how to perform a double HMAC verification. In the end it is pretty simple: as the name "double HMAC" indicates, it performs the HMAC a second time, using the same HMAC key, before comparing.
    public void validateHMACSHA256(byte[] receivedHMAC, byte[] message, byte[] key) {
        HashAlgorithm hashAlgorithm = new HMACSHA256(key);
        // size and algorithm choice are not secret; no weakness in failing fast here.
        if (receivedHMAC.Length != hashAlgorithm.HashSize / 8) {
            throw new CryptographicException("HMAC verification failure.");
        }
        byte[] calculatedHMAC = hashAlgorithm.ComputeHash(message);
        // Now we HMAC both values again before comparing to randomize byte order.
        // These two lines are all that is required to prevent many existing implementations
        // vulnerable to adaptive chosen ciphertext attacks using the timing side channel.
        receivedHMAC = hashAlgorithm.ComputeHash(receivedHMAC);
        calculatedHMAC = hashAlgorithm.ComputeHash(calculatedHMAC);
        for (int i = 0; i < calculatedHMAC.Length; i++) {
            if (receivedHMAC[i] != calculatedHMAC[i]) {
                throw new CryptographicException("HMAC verification failure.");
            }
        }
    }
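The verifier already holds the key (HMAC is a shared-secret scheme), so nothing beyond the message and the received tag is needed. A rough Node.js translation of the same idea is sketched below; the helper names hmac and verifyDoubleHmac are hypothetical and this is not code from the NCC Group post.

```javascript
const crypto = require("node:crypto");

function hmac(key, data) {
  return crypto.createHmac("sha256", key).update(data).digest();
}

// Returns true when `receivedTag` (a Buffer) is the HMAC-SHA-256 of `message`.
function verifyDoubleHmac(receivedTag, message, key) {
  const calculatedTag = hmac(key, message);
  // The tag length is not secret, so failing fast here leaks nothing useful.
  if (!Buffer.isBuffer(receivedTag) || receivedTag.length !== calculatedTag.length) {
    return false;
  }
  // HMAC both tags again under the same key. The bytes that are finally
  // compared are unpredictable to an attacker, so an early-exit comparison
  // no longer reveals how much of the submitted tag was correct.
  const a = hmac(key, receivedTag);
  const b = hmac(key, calculatedTag);
  return a.equals(b);
}

const key = crypto.randomBytes(32);
const message = Buffer.from("I love cupcakes");
const tag = hmac(key, message);
const forged = Buffer.from(tag);
forged[0] ^= 1; // flip one bit of the tag

console.log(verifyDoubleHmac(tag, message, key));    // true
console.log(verifyDoubleHmac(forged, message, key)); // false
```

In Node.js specifically, crypto.timingSafeEqual on the two original tags is the more direct way to get a constant-time comparison; the double HMAC is mainly useful where no such primitive is available.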

Generating and Storing API Keys - Node js [duplicate]

So with lots of different services around now, Google APIs, Twitter API, Facebook API, etc etc.
Each service has an API key, like:
AIzaSyClzfrOzB818x55FASHvX4JuGQciR9lv7q
All the keys vary in length and the characters they contain, I'm wondering what the best approach is for generating an API key?
I'm not asking for a specific language, just the general approach to creating keys: should they be an encryption of details of the user's app, or a hash, or a hash of a random string, etc.? Should we worry about the hash algorithm (MD5, SHA1, bcrypt), etc.?
Edit:
I've spoken to a few friends (email/Twitter) and they recommended just using a GUID with the dashes stripped.
This seems a little hacky to me though, hoping to get some more ideas.
Use a random number generator designed for cryptography. Then base-64 encode the number.
This is a C# example:
    var key = new byte[32];
    using (var generator = RandomNumberGenerator.Create())
        generator.GetBytes(key);
    string apiKey = Convert.ToBase64String(key);
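The same approach in Node.js might look like the sketch below. The function name createApiKey is hypothetical, and the "base64url" encoding assumes a reasonably recent Node.js release; it avoids the +, / and = characters that plain base-64 can produce, which is convenient for keys sent in URLs or headers.

```javascript
const crypto = require("node:crypto");

// 32 bytes (256 bits) from the operating system's CSPRNG,
// encoded as URL-safe base-64 without padding.
function createApiKey() {
  return crypto.randomBytes(32).toString("base64url");
}

console.log(createApiKey()); // a different 43-character string on every call
```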
API keys need to have the properties that they:
uniquely identify an authorized API user -- the "key" part of "API key"
authenticate that user -- cannot be guessed/forged
can be revoked if a user misbehaves -- typically they key into a database that can have a record deleted.
Typically you will have thousands or millions of API keys not billions, so they do not need to:
Reliably store information about the API user because that can be stored in your database.
As such, one way to generate an API key is to take two pieces of information:
a serial number to guarantee uniqueness
enough random bits to pad out the key
and sign them using a private secret.
The counter guarantees that they uniquely identify the user, and the signing prevents forgery. Revocability requires checking that the key is still valid in the database before doing anything that requires API-key authorization.
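One way to picture the serial-plus-random-plus-signature idea is the Node.js sketch below. Everything in it is an assumption for illustration: the <serial>.<random>.<tag> layout, the names createApiKey and parseApiKey, and the use of HMAC-SHA-256 under a server-side secret as the "signature".

```javascript
const crypto = require("node:crypto");

// Hypothetical signing secret; in practice it would come from secure configuration.
const SIGNING_SECRET = crypto.randomBytes(32);

function tagFor(payload) {
  return crypto.createHmac("sha256", SIGNING_SECRET).update(payload).digest("base64url");
}

// Key layout (an assumption of this sketch): <serial>.<random>.<tag>
function createApiKey(serial) {
  const payload = `${serial}.${crypto.randomBytes(16).toString("base64url")}`;
  return `${payload}.${tagFor(payload)}`;
}

// Returns the serial (as a string) if the tag is authentic, otherwise null.
// The caller must still check in the database that the key has not been revoked.
function parseApiKey(apiKey) {
  const parts = String(apiKey).split(".");
  if (parts.length !== 3) return null;
  const [serial, random, tag] = parts;
  const expected = Buffer.from(tagFor(`${serial}.${random}`));
  const received = Buffer.from(tag);
  if (received.length !== expected.length ||
      !crypto.timingSafeEqual(received, expected)) {
    return null;
  }
  return serial;
}

const apiKey = createApiKey(42);
console.log(parseApiKey(apiKey));                      // "42"
console.log(parseApiKey(apiKey.replace(/^42/, "43"))); // null: the tag no longer matches
```

Checking the tag first lets the server reject forged keys without a database lookup; the lookup is then only needed for revocation.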
A good GUID generator is a pretty good approximation of an incremented counter if you need to generate keys from multiple data centers or don't have otherwise a good distributed way to assign serial numbers.
Regarding "or a hash of a random string":
Hashing doesn't prevent forgery. Signing is what guarantees that the key came from you.
Update, in Chrome's console and Node.js, you can issue:
crypto.randomUUID()
Example output:
'4f9d5fe0-a964-4f11-af99-6c40de98af77'
Original answer (stronger):
You could try your web browser console by opening a new tab, hitting CTRL + SHIFT + i on Chrome, and then entering the following immediately invoked function expression (IIFE):
    (async function () {
        let k = await window.crypto.subtle.generateKey(
            {name: "AES-GCM", length: 256}, true, ["encrypt", "decrypt"]);
        const jwk = await crypto.subtle.exportKey("jwk", k)
        console.log(jwk.k)
    })()
Example output:
gv4Gp1OeZhF5eBNU7vDjDL-yqZ6vrCfdCzF7HGVMiCs
References:
https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/generateKey
https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/exportKey
I'll confess that I mainly wrote this for myself for future reference...
I use UUIDs, formatted in lower case without dashes.
Generation is easy since most languages have it built in.
API keys can be compromised, in which case a user may want to cancel their API key and generate a new one, so your key generation method must be able to satisfy this requirement.
If you want an API key with only alphanumeric characters, you can use a variant of the base64-random approach, only using a base-62 encoding instead. The base-62 encoder is based on this.
    public static string CreateApiKey()
    {
        var bytes = new byte[256 / 8];
        using (var random = RandomNumberGenerator.Create())
            random.GetBytes(bytes);
        return ToBase62String(bytes);
    }

    static string ToBase62String(byte[] toConvert)
    {
        const string alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        BigInteger dividend = new BigInteger(toConvert);
        var builder = new StringBuilder();
        while (dividend != 0) {
            dividend = BigInteger.DivRem(dividend, alphabet.Length, out BigInteger remainder);
            builder.Insert(0, alphabet[Math.Abs(((int)remainder))]);
        }
        return builder.ToString();
    }
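A comparable base-62 encoder can be sketched in Node.js with BigInt. It is not a byte-for-byte port of the C# above: this version reads the bytes as one unsigned big-endian integer, whereas the BigInteger(byte[]) constructor reads them as a signed little-endian value. The names toBase62 and createApiKey are hypothetical.

```javascript
const crypto = require("node:crypto");

const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Interpret the bytes as one unsigned big-endian integer and write it in base 62.
function toBase62(bytes) {
  let n = BigInt("0x" + (Buffer.from(bytes).toString("hex") || "0"));
  if (n === 0n) return ALPHABET[0];
  let out = "";
  while (n > 0n) {
    out = ALPHABET[Number(n % 62n)] + out; // least significant digit first
    n /= 62n;                              // BigInt division truncates
  }
  return out;
}

function createApiKey() {
  return toBase62(crypto.randomBytes(32)); // 256 random bits, at most 43 characters
}

console.log(toBase62(Buffer.from([62]))); // "10"
console.log(createApiKey());
```

Leading zero bytes are dropped by the conversion, so keys vary slightly in length; pad to a fixed width if that matters.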
An API key should be some random value. Random enough that it can't be predicted. It should not contain any details of the user or account that it's for. Using UUIDs is a good idea, if you're certain that the IDs created are random.
Earlier versions of Windows produced predictable GUIDs, for example, but this is an old story.


Is this a wise way to protect a datafeed?

I've been thinking of a way to protect my datafeed (JSON strings) from third-party apps and websites using it.
So I came up with a way of protecting it, but I'm curious how good my protection will be.
Client side:
    int passcode, dateint;
    passcode = 15987456; // random static code
    dateint = 20110528; // today's date as YYYYMMDD, all stuck together
    return (((Integer.parseInt(passcode + "" + dateint) * 9) / 2) * 15) / 3; // stick the two numbers together and do random math on it
On the server side (PHP):
    $passcode = '15987456'; // random static code
    $key = $_POST['key'];
    // reverse the random math, then turn the result back into a digit string
    $key = number_format(((($key / 9) * 2) / 15) * 3, 0, '', '');
    if (substr($key, 0, strlen($passcode)) === $passcode) {
        $dateyear = substr($key, strlen($passcode), 4);
        $datemonth = substr($key, strlen($passcode) + 4, 2);
        $dateday = substr($key, strlen($passcode) + 6, 2);
        if (!($dateyear === date('Y') && $datemonth === date('m') && $dateday === date('d'))) {
            die("access denied");
        }
    } else {
        die("access denied"); // wrong passcode
    }
Eventually the random static passcode could be fetched from another page, and it could then be dynamic...
Don't mind syntax/coding errors; I just wrote this off the top of my head.
One of the first rules of cryptography is to always use an existing standard. If you try to make your own, it will be weak. Either use the client's public key or Diffie-Hellman to establish the key at the client's site.
If your application (which uses the feed) is on the attacker's computer, and thus runs under his control, there effectively is no way to have data that your application can read but the attacker can't.
You can make it a bit harder by encrypting the data, but then the encryption key is in the program. There are some ways to protect the key (this is known as white-box cryptography, have a look at the white-box tag on crypto.stackexchange.com for details). Still, the attacker can simply execute the part of your program that decrypts the data.
You really need some user-specific key here (either a secret key shared between you and the user, or a user's private key, where you use the corresponding public key to encrypt the data).
There are three immediate problems I see:
I understand your code is just an example, but your random math isn't very random: x*9/2*15/3 == x*22.5. If someone wants to break that, they will. Using a real cryptographic primitive, such as an HMAC built on SHA-256 with a secret key, would be much more secure.
Using today's date in the algorithm isn't very reliable: the client could be on the other side of the world where it's already tomorrow or still yesterday, or the client computer's clock might just be plain off.
Finally, if the site that's authorized to use the data feed is a public site, anyone can just look at the JavaScript code and check what the protection algorithm is, making even the most (otherwise) secure algorithm useless.
Here's an example that demonstrates why the key is very easy to crack. If you run the algorithm with a couple of consecutive days you get:
20110905: 2235971776452495360
20110906: 2235971776452495388
20110907: 2235971776452495410
20110908: 2235971776452495428
20110909: 2235971776452495452
The difference between today and tomorrow is 28, between tomorrow and the day after 22, then 18, then 24... There's a clear pattern there and you don't need to observe the code for very long before you see it. The malicious party can just try a couple of numbers that match the pattern and hit the right one very soon.
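The linearity behind that pattern is easy to check. The sketch below is a Node.js illustration with hypothetical function names (obfuscate, recover); it uses small inputs so that the floating-point arithmetic stays exact, which is an assumption that does not hold for the full 16-digit value in the question.

```javascript
// The "random math" from the question, applied to a number.
function obfuscate(x) {
  return (((x * 9) / 2) * 15) / 3;
}

// Undo it with a single division; no knowledge of the individual steps is needed.
function recover(key) {
  return Math.round(key / 22.5);
}

console.log(obfuscate(1000));                   // 22500
console.log(obfuscate(1001) - obfuscate(1000)); // 22.5
console.log(recover(obfuscate(20110905)));      // 20110905
```

Because every step is a multiplication or division by a constant, two observed keys a day apart are enough to guess the scale factor and read the date straight back out.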
