How standard is HMAC(SHA-1)?

HMAC-SHA1 is a hash-based algorithm that also accepts a secret key as an input value. The algorithm follows well-defined rules and guarantees a certain level of security and resilience against attacks.
Moving on to implementation: is HMAC-SHA1 standardized to the point that all "official", correct implementations of it produce exactly the same result for a given input message and key? Or does the algorithm admit different implementations that might produce different results?

Any given implementation of HMAC-SHA1 will produce the same set of bytes given the same set of bytes as the input message and key.
That said, there can be a lot of variation in how various interfaces work and how they accept those bytes. For example, one library may output the hash as a hex string, and another may output it as an array of bytes. Or one may take a string as input and encode it as UTF-8, whereas another encodes it as UTF-16. You need to make sure the same bytes are hitting the algorithm in each library to get the same result.
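For instance, a minimal Python sketch (key and message are made up) showing the hex-string versus raw-bytes views of the same MAC:

    import hashlib
    import hmac

    key = "secret-key".encode("utf-8")   # encode explicitly so you know which bytes go in
    msg = "hello world".encode("utf-8")

    mac = hmac.new(key, msg, hashlib.sha1)
    print(mac.digest())      # the MAC as raw bytes
    print(mac.hexdigest())   # the very same bytes, rendered as a hex string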
Also, while HMAC-SHA1 is probably still okay from a security perspective, you should prefer HMAC-SHA256 for anything new.

It's very standard. It's a standard, even!
RFC2104 specifies the actual HMAC algorithm and block sizes.
RFC2202 contains test cases for both HMAC-MD5 and HMAC-SHA1.
For further study, RFC4868 gives more guidance on HMAC for the SHA2 family, with an emphasis on IPSec.
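As a quick sanity check, any correct implementation must reproduce the published test vectors. A sketch in Python, checking the first HMAC-SHA1 case from RFC 2202:

    import hashlib
    import hmac

    # Test case 1 from RFC 2202: key = 0x0b repeated 20 times, data = "Hi There"
    key = b"\x0b" * 20
    data = b"Hi There"
    expected = "b617318655057264e28bc0b6fb378c8ef146be00"

    assert hmac.new(key, data, hashlib.sha1).hexdigest() == expected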


Hashing and 'brute-force' permutations

So this is a two-part question:
Are there any hash functions that guarantee, for inputs of the same length, that every input generates a unique hash? As I remember, most are that way, but I just need to confirm it.
Based on the first question: given a file's hash and its length, is it then theoretically possible to brute-force all byte permutations of that length until the same hash is generated, i.e. until the original file has been recreated?
PS. I am aware that this will take ages (if theoretically possible), but I think it would be feasible for small files (sizes < 1KB)
1 KB, that'd be 256^1000 possible files, right? 1000 bytes, each with 256 possible values. It's a really big number: about 10^2408.
If you were to generate all of them, one would be the right one, but you'd also run into a huge number of collisions along the way.
MD5, for example, produces a 128-bit digest, so there are only 2^128 distinct outputs (the "1 in 2^64" collision figure quoted on security.SE is the birthday bound, the number of random inputs after which a collision becomes likely). Dividing the input space by the output space, each MD5 digest corresponds on average to about 256^1000 / 2^128 ≈ 10^2370 different 1 KB files.
That is still a lot of files to check.
I'd feel a lot better if someone critiqued my math here, but the point is that your first point is not true, because of collisions. You can use the same sort of math for calculating the chance of two 1000-character passwords having the same hash. It's the birthday problem: given two people, it is unlikely that they share a birthday, but in a room full of people the probability that some pair shares a birthday increases very quickly. Likewise, if you take all 1000-character passwords, some of them are going to collide. You are going from X bytes down to 16 bytes; you simply can't fit all of the combinations into 16 bytes.
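To put rough numbers on the birthday problem, here is a small Python sketch using the standard approximation p ≈ 1 - exp(-n^2 / 2^(b+1)) for the chance that n random inputs collide somewhere in a b-bit output space:

    from math import exp

    def collision_probability(n_items: int, space_bits: int) -> float:
        # Birthday-bound approximation: p = 1 - exp(-n^2 / 2^(b+1))
        return 1.0 - exp(-(n_items ** 2) / (2 ** (space_bits + 1)))

    # Hashing 2^64 random inputs into a 128-bit space (an MD5-sized output)
    # already gives a ~39% chance of at least one collision.
    print(collision_probability(2 ** 64, 128))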
Expanding upon the response to your first point: one of the goals of cryptographic hash functions is unpredictability. A function with zero collisions is a 1-1 (one-to-one) function, so called because every input has exactly one output and every output has exactly one input.
In order for a function to accept inputs of arbitrary length and complexity without ever generating a collision, the function must have arbitrary-length outputs: by the pigeonhole principle, a fixed-length output cannot have a distinct value for every longer input. As Gray obliquely points out, most hash functions have fixed-length outputs. (There are some newer algorithms that support arbitrary-length outputs, but they still don't guarantee zero collisions.) The reason is not stated clearly in the common crypto literature, but consider the difference between hashing and encrypting.
In hashing, you have the message (the unaltered original) and the message digest, the output of the hash function ("digest" here meaning "a summation or condensation of a body of information").
With encryption, you have the plaintext and the ciphertext. The implication is that the ciphertext is of equal length and complexity to the original.
The way I look at it, a cryptographic hash function with zero collisions would have to be of equal complexity to encryption. (Note that I'm unsure what the advantages of a variable-length hash output are, so I asked a question about it.)
Additionally, hash functions used for passwords are susceptible to attacks by precomputed rainbow tables, which is why password-hashing schemes still considered secure employ an extra random input, called a salt. The reason encryption isn't susceptible to a similar attack is that the encryption key is kept secret, and you can't precompute output values without knowing the key. Compare symmetric-key encryption (where there is one key that must be kept secret) with public-key encryption (where the encryption key is public and the decryption key is private).
The other thing that prevents encryption algorithms from pre-computation attacks is that the number of computations for arbitrary-length inputs grows exponentially, and it is literally impossible to store the output from every input you may be interested in.
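To make the salting idea concrete, a minimal Python sketch (the iteration count is an illustrative choice; real systems should use a tuned, dedicated password-hashing scheme):

    import hashlib
    import os

    # A fresh random salt per password makes precomputed rainbow tables useless:
    # the attacker would need a separate table for every possible salt value.
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", b"correct horse", salt, 100_000)

    # Store the salt alongside the derived key; the salt is not a secret.
    stored = salt + dk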

Is there a hash function to generate a hash with a given length?

Is there a function that generates a hash with the exact length I want? I know that MD5 always produces 16 bytes, but I want to define the length of the resulting hash.
Example:
hash('Something', 2) = 'gn'
hash('Something', 5) = 'a5d92'
hash('Something', 20) = 'RYNSl7cMObkPuXCK1GhF'
When the length increases, the result should be more secure from duplicates.
The upcoming SHAKE256 (or SHAKE128 for a security level of 128 bits instead of 256 bits), a so-called extendable-output function (XOF), is exactly what you are looking for. It will be defined alongside SHA-3; there is already a draft online.
If you need an established solution now, follow CodesInChaos's advice: truncate SHA-512 if a maximum of 64 bytes is enough, and otherwise seed a stream cipher with the hash of the original data.
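For what it's worth, both options are available in recent versions of Python's hashlib; a small sketch, using the question's example input:

    import hashlib

    data = b"Something"

    # SHAKE256 is an XOF: request exactly as many output bytes as you need.
    print(hashlib.shake_256(data).hexdigest(2))    # 2 bytes
    print(hashlib.shake_256(data).hexdigest(20))   # 20 bytes

    # Established alternative: truncate SHA-512 (fine up to 64 bytes).
    print(hashlib.sha512(data).digest()[:20].hex())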
Technical disclaimer: beyond an output length of 512 bits, the "security against duplicates" (collision resistance) no longer increases with longer output, as SHAKE256 has then reached the maximum collision-security level the primitive supports (256 bits). (Note that, because of the birthday paradox, the security level of an ideal hash function with an output length of n bits is only n/2 bits against collisions.) Any higher security level is pretty much meaningless anyway (even 256 bits is probably overkill), given that our solar system does not provide enough energy to even count from 0 to 2^256.
Please do not confuse "security levels" with key lengths: with symmetric algorithms one usually expects a security level equal to the key size, but with asymmetric algorithms the numbers are completely unrelated. A 512-bit RSA encryption scheme is far less secure than 128-bit AES (512-bit RSA moduli have already been factored in practice).
If a cryptographic primitive aims at a "security level of n bits", it means that there are supposed to be no attacks against it faster than about 2^n operations.
BLAKE2 can produce digests of any size between 1 and 64 bytes.
If you want a digest considered cryptographically secure, consider the birthday problem and what other algorithms use: e.g. SHA-1 uses 20 bytes and is considered insecure, while SHA-2 uses 28/32/48/64 bytes and is generally considered secure.
If you just want to avoid accidental collisions, still consider the birthday problem (above), but 16 or even 8 bytes might be sufficient depending on the application (see the probability table in the Wikipedia article on the birthday problem).
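For example, with Python's hashlib (which includes BLAKE2 since Python 3.6), digest_size selects the output length directly:

    import hashlib

    # blake2b accepts any digest_size from 1 to 64 bytes.
    for size in (2, 5, 20):
        print(size, hashlib.blake2b(b"Something", digest_size=size).hexdigest())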

Entropy for Nonce creation

In OAuth, a nonce is used to prevent replay attacks. In addition to the nonce, a timestamp is also used (and can be considered a second nonce, since, strictly sticking to the specification, there is no timeframe in which requests are considered valid: servers MAY, not MUST, limit the range).
The question that came to my mind when implementing an OAuth client is: do nonces have to be cryptographically secure?
Two points are important to me here:
Is it ok to use /dev/urandom instead of /dev/random and risk predictable values if the system is running low on entropy when many nonces are created in little time?
(For those not familiar with random/urandom: this has a performance advantage, as /dev/urandom doesn't block when little entropy is available, at the theoretical cost of less random values.)
As nonces have to be encoded before being sent if they contain non-ASCII characters, the easiest thing is to create them only out of those ASCII characters that can be sent as-is ([0-9A-Za-z_-+~] AFAIR). Of course this limits entropy, so the nonce has to be longer to be equally strong. In your opinion, what is a reasonable length for a nonce consisting only of those characters, and is it worth the advantage of not having to encode?
Normally it is hardly ever useful to use /dev/random instead of /dev/urandom. You can make a point of using it to seed other PRNGs if you don't want those PRNGs to rely on /dev/urandom, but for nonces you are certainly better off using /dev/urandom. Or you can, of course, use a well-seeded, thread-local, cryptographically secure PRNG implemented in your app or library.
If you want to send a nonce (or most binary data) over ASCII, you can use hexadecimals or base64: hex for the best readability of the value itself, base64 for efficiency. By default base64 uses digits, upper- and lowercase letters, plus the characters +, / and =; if you want to avoid some of those, you can always URL-encode the base64, replace the characters you do not want, or use one of the standard variants (such as the URL-safe alphabet).
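A small Python sketch covering both points (the 16-byte size is an illustrative choice): draw the nonce from the system CSPRNG, then encode it for transport:

    import base64
    import os

    nonce = os.urandom(16)   # system CSPRNG, the same pool /dev/urandom draws from

    print(nonce.hex())                                           # hex: most readable
    print(base64.urlsafe_b64encode(nonce).decode().rstrip("="))  # base64: more compact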

Which encryption method would produce this result

I am doing a security review on a system.
From one part of the system to another, information is sent using an encrypted string.
This string is over 400 characters long, but within it are 4 sets of 10 identical characters. I am assuming that the data that was encrypted also has this pattern, for example the word "parameters".
I have tested encrypting a string containing several identical substrings with DES, but I do not get the same pattern.
The question is: is there an encryption method that would produce this result? Or have the parts been encrypted separately and concatenated?
An encryption scheme with a short, repeating key or with no chaining between blocks (e.g. a block cipher in ECB mode) would encrypt runs of identical plaintext identically. It could also just be coincidence, of course.
If what you're seeing is real, it's mostly about the encryption mode, not the cipher. Likely culprits are a block cipher in ECB mode (which is usually a bad idea), or the pseudo-"stream" cipher of XORing the plaintext with a short password string repeated over and over (in which case the odds of two copies of the same plaintext at random positions encrypting to the same thing are about 1 in the password length); that one is a really bad idea.
By the way, it's best to be clear what format you're looking at the data in. If it's hex, okay. If it's base64, you should decode it before you look at it -- identical strings won't always look identical after base64 encoding depending on their alignment to a 3-byte boundary.
And just for illustration, here's a discussion of ECB mode on Wikipedia including pictures of the entropy problem with ECB -- scroll down to the pictures of Tux.
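And to make the repeating-key XOR failure concrete, a small pure-Python sketch (key and plaintext are made up): when two copies of the same plaintext land at key-aligned offsets, they encrypt to identical bytes:

    from itertools import cycle

    def xor_cipher(data: bytes, key: bytes) -> bytes:
        # "Encrypt" by XORing the plaintext with the key repeated over and over.
        return bytes(b ^ k for b, k in zip(data, cycle(key)))

    key = b"pass"
    pt = b"parameters__" * 2      # the same 12-byte run twice, aligned to the 4-byte key
    ct = xor_cipher(pt, key)

    print(ct[:12] == ct[12:24])   # True: the repetition survives "encryption"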
What do you mean by "4 sets of 10 identical characters"?
If you mean 4 identical substrings of length 10, it may be a Caesar cipher, which is totally insecure, as it can be deciphered by a human in no time. Another possibility is an XOR cipher with a badly chosen key.

Is there an encryption technique that could turn an 8-digit number into something 10 or 11 digits or less?

Many of the encryption techniques I've seen can easily encrypt a simple 8 digit number like "12345678" but the result is often something like "8745b34097af8bc9de087e98deb8707aac8797d097f" (made up but you get the idea).
Is there a way to encrypt this 8 digit number but have the resulting encrypted value be the same or at least only a slightly longer number? An ideal target would be to end up with a 10 digit number or less. Is this possible while still maintaining a fairly strong encryption?
Update: I didn't make the output clear enough - I am wanting an 8-digit number to turn into an 8-digit number, not 8 bytes.
A lot here is going to depend on how seriously you mean your "public-key-encryption" tag. Do you actually want public key encryption, or are you just taking that possibility into account?
If you're willing to use symmetric encryption, producing 8 bytes of output from 8 bytes of input is pretty easy: just run 3DES in ECB (Electronic Code Book) mode, and that's what you'll get. The main weakness of ECB is that a given input will always produce the same result, so if your inputs might repeat an attacker will be able to see that repetition, and may be able to notice a pattern of "encrypted value X leads to action Y", even if they can't/don't break the encryption itself at all. If you can live with that, 3DES/ECB is probably your answer.
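A sketch of that, assuming the third-party pycryptodome package (the key bytes are placeholders):

    from Crypto.Cipher import DES3  # pip install pycryptodome

    key = DES3.adjust_key_parity(b"0123456789abcdef01234567")  # 24-byte 3DES key
    cipher = DES3.new(key, DES3.MODE_ECB)

    ct = cipher.encrypt(b"12345678")   # exactly one 8-byte block in...
    print(len(ct))                     # ...and exactly 8 bytes out, no expansion

    assert DES3.new(key, DES3.MODE_ECB).decrypt(ct) == b"12345678"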
If you can't live with that, 3DES in CFB mode is probably the next best. This will produce 16 bytes of output from 8 bytes of input; note that it's not doubling the input size, but adding a fixed 8 bytes (the IV) to it.
3DES is hardly what anybody would call a cutting edge algorithm, but I'd say it still qualifies as "fairly strong encryption". Part of its weakness as an algorithm stems from its relatively small block size, but that also minimizes expansion of the output.
Edit: sorry, I forgot to address the public-key possibility. With most public-key cryptography, the smallest result is roughly equal to the key size. With RSA encryption, that typically means a minimum of something like 1024 bits (and often considerably more). To keep the result smaller, I'd probably use Elliptic Curve Cryptography, for which a ~200-bit key is reasonably secure against known attacks. That will still be larger than 3DES/CFB, but not outrageously so.
Well, you could look at a stream cipher, which encrypts bytes 1:1: with N bytes of input, you get N bytes of encrypted/decrypted output. Such ciphers are usually based on an algorithm that creates a stream of pseudo-random bytes from the encryption key/IV acting as a seed; that keystream is XORed with the plaintext.
For some stream ciphers, look at the eSTREAM candidates. I don't know of any relevant attacks on HC-128 and HC-256, for example.
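HC-128 itself isn't in the common Python libraries, but as a stand-in here is a sketch with ChaCha20 (a close relative of the eSTREAM finalist Salsa20), again assuming the third-party pycryptodome package:

    import os
    from Crypto.Cipher import ChaCha20  # pip install pycryptodome

    key = os.urandom(32)
    nonce = os.urandom(8)               # pycryptodome's ChaCha20 accepts an 8-byte nonce

    cipher = ChaCha20.new(key=key, nonce=nonce)
    ct = cipher.encrypt(b"12345678")

    print(len(ct))                      # 8 bytes in, 8 bytes out: no expansion

Note that the nonce still has to reach the receiver somehow, so the total transmitted size usually grows even though the ciphertext itself doesn't.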
