I need to sign a small string with an asymmetric key encryption scheme.
The signature will be stored on a small chip together with the signed string. I have very little space to spare (about 60 bytes for signature + string), so the generated signature should be as small as possible.
I looked around for how to do it, and what I found is that I could use RSA-SHA1, but the generated signature with a 512 bit key is 64 bytes. That is a bit much.
What secure algorithm could I use to generate a small asymmetric signature?
Would it still be secure if I store the SHA1 sum of the RSA-SHA1 signature, and later verify that instead?
What you're bumping up against here is one of the properties of a good hash function: the output should be long enough to protect against birthday attacks (where two different inputs result in the same output hash). Generally 128-512 bits is preferred. Note, though, that SHA-1 itself only produces 160 bits; the 512 bits (64 bytes) you are seeing is the size of the RSA modulus, which is what fixes the length of an RSA signature.
As with all things in cryptography, security is a trade-off. As you are using asymmetric signing, have you considered RSA-MD5 as your signature option? MD5 gives a shorter 128-bit digest, but the RSA signature is still as long as the key, and this comes with the caveat that MD5 is considered broken and is generally being moved away from.
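For reference, the digest lengths of the hash functions mentioned here can be checked with Python's hashlib. This is only a sketch: the 512-bit modulus is the figure from the question, and the variable names are made up for illustration. The length of an RSA signature is set by the modulus, not by the hash.

```python
import hashlib

# Digest sizes (in bits) of the hash functions discussed above.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha512")
}
for name, bits in digest_bits.items():
    print(f"{name}: {bits}-bit digest")

# An RSA signature is as long as the modulus, whichever hash is used.
rsa_modulus_bits = 512  # key size from the question
signature_bytes = rsa_modulus_bits // 8
print(f"RSA-{rsa_modulus_bits} signature: {signature_bytes} bytes")
```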
So this is a two-part question:
Are there any hashing functions that guarantee that for any combination of the same length, they generate a unique hash? As I remember - most are that way, but I just need to confirm this.
Based on the 1st question - so, given a file hash and a length - is it then theoretically possible to 'brute-force' all byte permutations of that same length until the same hash is generated - ie. the original file has been recreated?
PS. I am aware that this will take ages (if theoretically possible), but I think it would be feasible for small files (sizes < 1KB)
1KB, that'd be 1000^256, right? 1000 possible combinations of bytes (256 configurations each?). It's a real big number. 1 with 768 0s behind it.
If you were to generate all of them, one would be the right one, but you'd have some number of collisions.
According to this security.SE post, the collision rate for md5 (for example) is about 1 in 2^64. So, if we divide our original number by that, we'd get how many possible combinations, right? http://www.wolframalpha.com/input/?i=1000%5E256+%2F+2%5E64
~5.42 × 10^748
That is still a lot of files to check.
I'd feel a lot better if someone critiqued my math here, but the point is that your first point is not true because of collisions. You can use the same sort of math for calculating the chance of two 1000 character passwords having the same hash. It's the birthday problem. Given 2 people, it is unlikely that they'd have the same birthday, but if you take a room full, the probability of any two people having the same birthday increases very quickly. If you take all 1000 character passwords, some of them are going to collide. You are going from X bytes to 16 bytes. You can't fit all of the combinations into 16 bytes.
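Taking up the request to check the arithmetic: a byte has 256 possible values, so the number of 1000-byte files is 256^1000 = 2^8000 (about 10^2408), not 1000^256. The pigeonhole count can be sketched as follows, assuming a 128-bit digest such as MD5 and an idealized hash that spreads inputs evenly over its outputs (variable names are illustrative):

```python
import math

file_bytes = 1000   # a "1 KB" file, as in the question
digest_bits = 128   # MD5 output size

total_inputs = 256 ** file_bytes     # every possible 1000-byte file
total_digests = 2 ** digest_bits     # every possible digest value

# With inputs spread evenly, this many files share each digest on average.
avg_files_per_digest = total_inputs // total_digests

def decimal_exponent(n: int) -> int:
    """Return floor(log10(n)) for a positive integer, without float overflow."""
    return math.floor((n.bit_length() - 1) * math.log10(2))

print("possible files  ~ 10^%d" % decimal_exponent(total_inputs))
print("files per digest ~ 10^%d" % decimal_exponent(avg_files_per_digest))
```

So a brute-force search over all 1000-byte inputs would hit an astronomically large number of candidates with the right hash, and nothing in the hash alone says which one was the original file.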
Expanding upon the response to your first point, one of the points of cryptographic hash functions is unpredictability. A function with zero collisions is a 1-1 (or one-to-one) function, so called because every input has exactly one output and every output has exactly one input.
In order for a function to accept arbitrary length & complexity inputs without generating a collision, it is easy to see that the function must have arbitrary length outputs. As Gray obliquely points out, most hash functions have fixed-length outputs. (There are apparently some new algorithms that support arbitrary length outputs, but they still don't guarantee 0 collisions.) The reason is not stated clearly in the common crypto literature, but consider the difference between hashing and encrypting.
In hashing, you have the message (the unaltered original) and the message digest (the output of the hash function). (Digest here has the meaning "a summation or condensation of a body of information.")
With encryption, you have the plain text and the cipher text. The implication is that the cipher text is of equal length and complexity as the original.
I look at it as a cryptographic hash function with 0 collisions is of equal complexity as encryption. (Note that I'm unsure of what the advantages of a variable-length hash output are, so I asked a question about it.)
Additionally, hashes of secrets such as passwords are susceptible to attacks by pre-computed rainbow tables, which is why password-hashing schemes still considered secure employ extra random inputs, called salts. The reason encryption isn't susceptible to a similar attack is that the encryption key is kept secret and you can't pre-compute output values without knowing the key. Compare symmetric key encryption (where there is one key that must be kept secret) with public key encryption (where the encryption key is public and the decryption key is private).
The other thing that prevents encryption algorithms from pre-computation attacks is that the number of computations for arbitrary-length inputs grows exponentially, and it is literally impossible to store the output from every input you may be interested in.
Is there a function that generates a hash that has the exact length I want? I know that MD5 always has 16 bytes. But I want to define the length of the resulting hash.
Example:
hash('Something', 2) = 'gn'
hash('Something', 5) = 'a5d92'
hash('Something', 20) = 'RYNSl7cMObkPuXCK1GhF'
When the length increases, the result should be more secure from duplicates.
The upcoming SHAKE256 (or SHAKE128 for a security level of 128 bits instead of 256 bits), a so-called extendable-output function (XOF), is exactly what you are looking for. It will be defined alongside SHA3. There is already a draft online.
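As a rough sketch of how this looks in practice, Python's hashlib (3.6 and later) exposes SHAKE256 as `hashlib.shake_256`. The helper name `hash_hex` is made up here, and it returns hexadecimal characters rather than the mixed alphabet used in the question's example:

```python
import hashlib

def hash_hex(text: str, length: int) -> str:
    """Return exactly `length` hex characters derived from `text` with SHAKE256."""
    # hexdigest(n) yields 2*n hex characters, so round up and trim.
    n_bytes = (length + 1) // 2
    return hashlib.shake_256(text.encode("utf-8")).hexdigest(n_bytes)[:length]

for size in (2, 5, 20):
    print(size, hash_hex("Something", size))
```

Note that with an XOF a shorter output is simply a prefix of a longer one, so requesting more characters extends the value instead of producing an unrelated one.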
If you need an established solution now, follow CodesInChaos advice and truncate SHA512 if a maximum of 64byte is enough and otherwise seed a stream cipher with the output of a hash of the original data.
Technical disclaimer: After an output length of 512bit the "security against duplicates" (collision resistance) does not increase any more with longer output, as with SHAKE256 it has reached the maximum security level against collisions the primitive supports (256bit). (Note that because of the birthday paradox the security level of an ideal hash function with output length of n bit against collisions is only n/2 bit.) Any higher security level is pretty much meaningless anyway (probably 256bit is already an overkill) given that our solar system does not provide enough energy to even count from 0 to 2^256.
Please do not confuse "security levels" with key lengths: With symmetric algorithms one usually expects a security level equal to the key size, but with asymmetric algorithms the numbers are completely unrelated: A 512 bit RSA encryption scheme is far less secure than 128bit AES (i.e. 512bit RSA moduli can be factored by brute force already).
If a cryptographic primitive tries to achieve a "security level of n bits", it means that there are supposed to be no attacks against it that are faster than 2^n operations.
BLAKE2 can produce digests of any size between 1 and 64 bytes.
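A minimal sketch using the `blake2b` implementation in Python's hashlib, where the `digest_size` argument selects the output length in bytes (the helper name is only illustrative):

```python
import hashlib

def blake2_digest(data: bytes, size: int) -> bytes:
    """Return a BLAKE2b digest of `size` bytes (1 to 64)."""
    return hashlib.blake2b(data, digest_size=size).digest()

for size in (1, 8, 20, 64):
    print(size, blake2_digest(b"Something", size).hex())
```

Unlike truncating a fixed-size hash, the requested length is part of BLAKE2's parameters, so an 8-byte digest is not just the first 8 bytes of a 16-byte one.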
If you want a digest considered cryptographically secure, consider the Birthday problem and what other algorithms use — e.g. SHA-1 uses 20 bytes and is considered insecure, SHA-2 uses 28/32/48/64 bytes and is generally considered secure.
If you just want to avoid accidental collisions, still consider the Birthday problem (above), but 16 or even 8 bytes might be considered sufficient depending on the application (see table).
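To get a feel for the numbers, the usual birthday approximation p ≈ 1 - exp(-n(n-1) / 2N) can be evaluated directly. This is a sketch that assumes an ideal hash with N = 2^(8 * digest_bytes) equally likely outputs; the function name is made up:

```python
import math

def collision_probability(num_items: int, digest_bytes: int) -> float:
    """Approximate chance that any two of `num_items` random digests collide."""
    space = 2 ** (8 * digest_bytes)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-num_items * (num_items - 1) / (2 * space))

for digest_bytes in (8, 16, 20, 32):
    p = collision_probability(10 ** 9, digest_bytes)
    print(f"{digest_bytes:2d}-byte digest, 1e9 items: p ~ {p:.3g}")
```

With an 8-byte digest the probability approaches 40% at around 2^32 items, which is why 8 bytes only suits modest item counts.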
Today I was doing some leisurely reading and stumbled upon Section 5.8 (on page 45) of Recommendation for Pair-Wise Key Establishment Schemes Using Discrete Logarithm Cryptography (Revised) (NIST Special Publication 800-56A). I was very confused by this:
“An Approved key derivation function (KDF) shall be used to derive secret keying material from a shared secret. The output from a KDF shall only be used for secret keying material, such as a symmetric key used for data encryption or message integrity, a secret initialization vector, or a master key that will be used to generate other keys (possibly using a different process). Nonsecret keying material (such as a non-secret initialization vector) shall not be generated using the shared secret.”
Now I'm no Alan Turing, but I thought that initialization vectors need not be kept secret. Under what circumstances would one want a "secret initialization vector?"
Thomas Pornin says that IVs are public and he seems well-versed in cryptography. Likewise with caf.
An initialization vector need not be secret (it is not a key) but it need not be public either (sender and receiver must know it, but it is not necessary that the Queen of England also knows it).
A typical key establishment protocol will result in both involved parties computing a piece of data which they, but only they, both know. With Diffie-Hellman (or any Elliptic Curve variant thereof), the said shared piece of data has a fixed length and they have no control over its value (they just both get the same seemingly random sequence of bits). In order to use that shared secret for symmetric encryption, they must derive that shared data into a sequence of bits of the appropriate length for whatever symmetric encryption algorithm they are about to use.
In a protocol in which you use a key establishment algorithm to obtain a shared secret between the sender and the receiver, and will use that secret to symmetrically encrypt a message (possibly a very long streamed message), it is possible to use the KDF to produce the key and the IV in one go. This is how it goes in, for instance, SSL: from the shared secret (called "pre-master secret" in the SSL spec) is computed a big block of derived secret data, which is then split into symmetric keys and initialization vectors for both directions of encryption. You could do otherwise, and, for instance, generate random IV and send them along with the encrypted data, instead of using an IV obtained through the KDF (that's how it goes in recent versions of TLS, the successor to SSL). Both strategies are equally valid (TLS uses external random IV because they want a fresh random IV for each "record" -- a packet of data within a TLS connection -- which is why using the KDF was not deemed appropriate anymore).
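The "derive one big block, then split it" idea can be sketched as follows. This uses HKDF (RFC 5869) built from the standard hmac module as a stand-in; SSL/TLS define their own PRF, so this is not the actual SSL derivation. The label, the key and IV sizes, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac

def hkdf_sha256(secret: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (extract then expand) with HMAC-SHA256, following RFC 5869."""
    prk = hmac.new(salt or b"\x00" * 32, secret, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def derive_key_and_iv(shared_secret: bytes, key_len: int = 16, iv_len: int = 16):
    """Derive one block of secret material and split it into a key and an IV."""
    # "example key+iv" is a hypothetical context label, not taken from any spec.
    material = hkdf_sha256(shared_secret, b"", b"example key+iv", key_len + iv_len)
    return material[:key_len], material[key_len:]

# Both parties run the same derivation on the same shared secret.
key, iv = derive_key_and_iv(b"\x42" * 32)  # placeholder shared secret
print(key.hex(), iv.hex())
```

Because the IV here comes out of the same secret material as the key, it ends up being secret as a side effect, which matches the "secret initialization vector" wording in the quoted text.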
Well, consider that if two parties have the same cryptographic function, but don't have the same IV, they won't get the same results. So then, it seems like the proposal there is that the two parties get the same shared secret, and each generate, deterministically, an IV (that will be the same) and then they can communicate. That's just how I read it; but I've not actually read the document, and I'm not completely sure that my description is accurate; but it's how I'd start investigating.
Whether the IV is public or private doesn't matter.
Suppose the IV is known to the attacker. By looking at the encrypted packet/data, with knowledge of the IV but no knowledge of the encryption key, can he or she guess anything about the input data? (Think about it for a while.)
Let's go back a step and say there is no IV used in the encryption:
AES (input, K)= E1
The same input will always produce the same encrypted text.
An attacker can then learn about the input by looking at the encrypted text and using some prior knowledge of the input data (e.g. the initial exchange of some protocols).
This is where the IV helps. It is combined with the input value, so the encrypted text changes even for the same input data,
i.e. AES (input, IV, K)= E2
Hence the attacker sees that the encrypted packets are different (even with the same input data) and can't easily guess the input, even knowing the IV.
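The effect can be illustrated with a toy cipher. To be clear about the assumptions: Python's standard library has no AES, so the sketch below stands in a SHA-256 based keystream (in the spirit of counter mode) for the real cipher. It is not AES and not meant for real use; it only shows that the same key and input give different ciphertexts under different IVs.

```python
import hashlib

def toy_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR `data` with a keystream derived from key, IV and a block counter.

    Toy construction for illustration only. Decryption is the same operation.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

key = b"k" * 16
message = b"same input data"
c1 = toy_encrypt(key, b"\x00" * 16, message)
c2 = toy_encrypt(key, b"\x01" * 16, message)
print(c1.hex())
print(c2.hex())  # differs from c1 although key and message are identical
```

The IVs are fixed here only to keep the example reproducible; in practice each message would get a fresh IV.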
The starting value of the counter in CTR mode encryption can be thought of as an IV. If you make it secret, you end up with some amount of added security over the security granted by the key length of the cipher you're using. How much extra is hard to say, but not knowing it does increase the work required to figure out how to decrypt a given message.
I'm having a bit of difficulty getting an understanding of key length requirements in cryptography. I'm currently using DES which I believe is 56 bits... now, by converting an 8 character password to a byte[] my cryptography works. If I use a 7 digit password, it doesn't.
Now, forgive me if I'm wrong, but is that because ASCII characters are 7 bits, therefore 8 * 7 = 56 bits?
That just doesn't seem right to me. If I want to use a key, why can I not just pass in a salted hash of my secret key, i.e. an MD5 hash?
I'm sure this is very simple, but I can't get a clear understanding of what's going on.
DES uses a 56-bit key: 8 bytes where one bit in each byte is a parity bit.
In general, however, it is recommended to use an accepted, well-known key derivation algorithm to convert a text password to a symmetric cipher key, regardless of the algorithm.
The PBKDF2 algorithm described in PKCS #5 (RFC 2898) is a widely-used key derivation function that can generate a key of any length. At its heart, PBKDF2 combines the salt and the password via a hash function to produce the actual key. The hash is repeated many times so that it will be expensive and slow for an attacker to try each entry in her "dictionary" of most common passwords.
The older version, PBKDF1, can generate keys for DES encryption, but DES and PBKDF1 aren't recommended for new applications.
Most platforms with cryptographic support include PKCS #5 key-derivation algorithms in their API.
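As one example of such an API, Python's hashlib provides `pbkdf2_hmac`. The sketch below is illustrative only: the helper name, the choice of SHA-256 and the default iteration count are assumptions and not recommendations taken from the text above.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, length: int, iterations: int = 600_000) -> bytes:
    """Derive `length` key bytes from a text password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )

salt = os.urandom(16)  # stored alongside the ciphertext; it need not be secret
des_key = derive_key("1234567", salt, 8)    # 8 bytes, as a DES API expects
aes_key = derive_key("1234567", salt, 32)   # 32 bytes for AES-256
print(des_key.hex(), aes_key.hex())
```

The password can be any length, since the output size is chosen by the caller and not by the password.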
Each algorithm is designed to accept a certain key length. The key is used as part of the algorithm, and as such, can't be whatever your heart desires.
Common key sizes are:
DES: 56bit key
AES: 128-256bit key (commonly used values are 128, 192 and 256)
RSA (asymmetric cryptography): 1024, 2048, 4096 bit key
A number, such as 1234567, is only a 4-byte variable. The key is expected to be a byte array, such as "1234567" (implicitly convertible to one in C) or `{ '1', '2', '3', '4', '5', '6', '7' }`.
If you wish to pass the MD5 hash of your salted key to DES, you should use some key compression technique. For instance, you could take the top 7 bytes (somewhat undesirable), or perform DES encryption on the MD5 hash (with a known constant key), and take all but the last byte as the key for DES operation.
edit: The DES I'm talking about here is the implementation per the standard released by NIST. It could be (as noted above) that your specific API has different requirements on the length of the key, and derives the final 7-byte key from it.
The key must have size 64-bits but only 56-bits are used from the key. The other 8-bits are parity bits (internal use).
ASCII chars are stored as 8-bit bytes.
You shouldn't pass your passwords straight into the algorithm. Use for instance the Rfc2898DeriveBytes class that will salt your passwords, too. It will work with any length.
Have a look here for an example.
EDIT: D'Oh - your question is not C# or .Net tagged :/
According to MSDN DES supports a key length of 64 bits.
To avoid this issue and increase the overall security of one's implementation, typically we'll pass some hashed variant of the key to crypto functions, rather than the key itself.
Also, it's good practice to 'salt' the hash with a value which is particular to the operation you are doing and won't change (e.g., internal userid). This assures you that for any two instances of the key, the resulting hash will be different.
Once you have your derived key, you can pull off the first n bytes of it as required by your particular crypto function.
DES requires a 64 bit key. 56 bits for data and 8 bits for parity.
Each byte contains a parity bit at the last index. This bit is used to detect errors that may have occurred.
DES keys use odd parity: if the 7 data bits contain an even number of 1s the correct parity bit is 1, and for an odd number it's 0, so that every byte ends up with an odd number of 1s.
ASCII chars are stored in 8 bits, so 8 chars can be used as a key if error detection is not necessary. If it is necessary, use 7 chars (56 bits) and insert parity bits at indices (0 based) 7,15,23,31,39,47,55,63.
sources:
Wikipedia: https://en.m.wikipedia.org/wiki/Data_Encryption_Standard
“The key ostensibly consists of 64 bits; however, only 56 of these are actually used by the algorithm. Eight bits are used solely for checking parity, and are thereafter discarded. Hence the effective key length is 56 bits.”
“The key is nominally stored or transmitted as 8 bytes, each with odd parity. According to ANSI X3.92-1981 (Now, known as ANSI INCITS 92-1981), section 3.5:
One bit in each 8-bit byte of the KEY may be utilized for error detection in key generation, distribution, and storage. Bits 8, 16,..., 64 are for use in ensuring that each byte is of odd parity.”
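The odd-parity rule quoted above can be sketched in a few lines. The function names are made up for illustration, and the parity bit is taken to be the least significant bit of each byte (bits 8, 16, ..., 64 in the standard's numbering):

```python
def set_odd_parity(key: bytes) -> bytes:
    """Return `key` with the lowest bit of every byte adjusted to odd parity."""
    fixed = bytearray()
    for byte in key:
        data = byte & 0xFE                 # the 7 bits DES actually uses
        ones = bin(data).count("1")
        parity = 0 if ones % 2 == 1 else 1 # make the total number of 1s odd
        fixed.append(data | parity)
    return bytes(fixed)

def has_odd_parity(key: bytes) -> bool:
    """Check that every byte of `key` contains an odd number of 1 bits."""
    return all(bin(byte).count("1") % 2 == 1 for byte in key)

raw = b"password"                # 8 ASCII chars = 8 bytes = 64 bits
adjusted = set_odd_parity(raw)
print(raw.hex(), adjusted.hex(), has_odd_parity(adjusted))
```

Only the lowest bit of each byte can change, so two passwords that differ only in those bits map to the same effective 56-bit key.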
I would like to sign a device, and I have 64 bits to store my signature in the device. This device has a MAC address and some other details (about 30 bytes worth) I can mangle to create my signature.
If possible, I would like the method to be one-way, so that I can verify that the signature is valid without knowing how to create a valid signature. Most public-private keys have this feature but they generate signatures that are 48 bytes long (I only have 8 bytes).
Implementation in Python is a plus.
Thanks
EDIT:
Thanks for the advice everyone. It sounds like there is no secure way to do this, only a way that is moderately inconvenient to attackers. I'll probably use a cryptographic hash combined with secret bit-shuffling. This will be as secure as any other link in my (very weak) 'security'.
Hash functions and digital signatures are very different things.
The size of a digital signature depends on the underlying hash function and the key length. So in theory, you can create an RSA implementation that generates 64-bit signatures, but that'd be an extremely weak signature.
For smaller key lengths, you might want to look at elliptic curve cryptography.
EDIT: Yes, I'm a cryptographer.
EDIT 2: Yet if you only need a hash function, you can look at elf64 or RIPEMD-64 as Fernando Miguélez suggested.
EDIT 3: Doing the math, you'd need to use 16-bit keys in ECC to generate 64-bit signatures, which is very weak. For ECC, anything less than 128 bits can be considered weak. For RSA this is 1024 bits.
Basically what you need is a 64-bit cryptographic hash function, such as Ripemd-64 or elf-64. Then you encrypt the hash with a cryptographic method and you get a 64 bit signature. The only problem is, from the point of view of a non-cryptanalyst, that 64 bits offers a much weaker signature than a typical over-128 bit hash. Nonetheless it could still be suitable for your application.
You could just use a standard hashing function (MD5 SHA1) and only use the first or last 30 bytes.
The number of bytes a hashing function generates is fairly arbitrary - it's obviously a trade off between space and uniqueness. There is nothing special about the length of the signature they use.
Edit - sorry, I was thinking that MD5 returned 32 bytes - it actually returns 16 bytes but is usually written as 32 hex digits.
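Since a Python implementation was asked for, here is a rough sketch of the "hash and truncate" idea using a keyed hash (HMAC-SHA256) cut down to 8 bytes. This is a swapped-in, symmetric technique: whoever can verify a tag holds the secret and can also create one, so it does not give the one-way property the question hoped for. The key, the device data and the function names are placeholders.

```python
import hashlib
import hmac

def device_tag(secret_key: bytes, device_info: bytes, tag_len: int = 8) -> bytes:
    """Return a truncated HMAC-SHA256 tag over the device details."""
    return hmac.new(secret_key, device_info, hashlib.sha256).digest()[:tag_len]

def verify_tag(secret_key: bytes, device_info: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = device_tag(secret_key, device_info, len(tag))
    return hmac.compare_digest(expected, tag)

secret = b"replace with a real secret key"       # placeholder
info = bytes.fromhex("001122334455") + b"model-x"  # placeholder MAC address + details
tag = device_tag(secret, info)
print(tag.hex(), verify_tag(secret, info, tag))
```

With only 64 bits, an attacker who can test guesses against a verifier needs on the order of 2^63 attempts on average, so any real protection rests on keeping the key secret.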