In PBKDF2, is INT(i) signed?

Page 11 of RFC 2898 states that for U_1 = PRF(P, S || INT(i)), INT(i) is a four-octet encoding of the integer i, most significant octet first.
Does that mean that i is a signed value and if so what happens on overflow?

Nothing says that it would be signed. The fact that dkLen is capped at (2^32 - 1) * hLen suggests that it's an unsigned integer, and that it cannot roll over from 0xFFFFFFFF (2^32 - 1) to 0x00000000.
Of course, PBKDF2(MD5) wouldn't hit 2^31 until you've asked for 34,359,738,368 bytes. That's an awful lot of bytes.
SHA-1: 42,949,672,960
SHA-2-256 / SHA-3-256: 68,719,476,736
SHA-2-384 / SHA-3-384: 103,079,215,104
SHA-2-512 / SHA-3-512: 137,438,953,472
Since the .NET implementation (Rfc2898DeriveBytes) is an iterative stream, it could be polled for 32 GB via a (long) series of calls. Most platforms expose PBKDF2 as a one-shot, so you'd need to hand them an output buffer of 32 GB (or more) to discover whether they get the sign bit wrong that far out. So even if most platforms do get it wrong... it doesn't really matter.
PBKDF2 is a KDF (key derivation function), so it is used for deriving keys. AES-256 needs 32 bytes, or 48 if you use the same PBKDF2 stream to generate an IV (which you really shouldn't). Even generating a private key for an ECC curve over a 34,093-digit prime takes (if I did my math right) only 14,157 bytes. Well below the 32 GB mark.

i ranges from 1 to l = CEIL (dkLen / hLen), and dkLen and hLen are positive integers. Therefore, i is strictly positive.
You can, however, store i in a signed, 32-bit integer type without any special handling. If i rolls over (increments from 0x7FFFFFFF to 0x80000000), it will continue to be encoded correctly, and continue to increment correctly. With two's complement encoding, the bit patterns produced by addition, subtraction, and multiplication are identical whether the values are all treated as signed or all treated as unsigned.
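For illustration, a minimal sketch of the encoding in Python (the helper is named after the RFC's INT(i); the mask reproduces exactly what a two's-complement signed 32-bit counter would yield):
import struct

def INT(i):
    # Four-octet, big-endian encoding: most significant octet first.
    return struct.pack(">I", i & 0xFFFFFFFF)

assert INT(1) == b"\x00\x00\x00\x01"
# Crossing 2^31 (the "sign bit") changes nothing about the encoding:
assert INT(0x80000000) == b"\x80\x00\x00\x00"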

Optimizing find_first_not_of with SSE4.2 or earlier

I am writing a textual packet analyzer for a protocol, and while optimizing it I found that a major bottleneck is the call to find_first_not_of.
In essence, I need to determine whether a packet is valid (a packet is valid if it contains only valid characters), faster than the default C++ function does it.
For instance, if all allowed characters are f, h, o, t, and w, in C++ I would just call s.find_first_not_of("fhotw"), but in SSEx I have no clue how to proceed after loading the string into a set of __m128i variables.
Apparently, the documentation for the _mm_cmpXstrY functions (e.g. _mm_cmpistri) is not really helping me here. I could start by subtracting with _mm_sub_epi8, but I don't think that would be a great idea.
Moreover, I am stuck with SSE (any version).
This article by Wojciech Muła describes an SSSE3 algorithm to accept/reject any given byte value.
(Contrary to the article, saturated arithmetic should be used to conduct range checks, but we don't have ranges here.)
SSE4.2 string functions are often slower** than hand-crafted alternatives. For example, pcmpistri, the fastest of the SSE4.2 string instructions, is 3 uops with one-per-3-cycle throughput on Skylake. Compare that with this approach: 1 shuffle and 1 pcmpeqb per 16 bytes of input, plus a SIMD AND and a movemask to combine results. Add some load and register-copy instructions, and it is still very likely faster than 1 vector per 3 cycles. It doesn't handle short 0-terminated strings quite as easily, though; SSE4.2 is worth considering if you need to worry about that, instead of known-size blocks that are a multiple of the vector width.
For "fhotw" specifically, try:
#include <tmmintrin.h> // SSSE3 intrinsics (pshufb)
#include <stdbool.h>
#include <stdint.h>

bool is_valid_64bytes(const uint8_t* src) {
    // Table indexed by the low nibble of each input byte. Every allowed
    // character sits in the slot matching its own low nibble: 't' (0x74)
    // at index 4, 'f' (0x66) at 6, 'w' (0x77) at 7, 'h' (0x68) at 8,
    // 'o' (0x6f) at 15. '_' fills the unused slots and can never match.
    // Note that _mm_set_epi8 lists elements from index 15 down to index 0.
    const __m128i tab = _mm_set_epi8('o','_','_','_','_','_','_','h',
                                     'w','f','_','t','_','_','_','_');
    __m128i src0 = _mm_loadu_si128((const __m128i*)&src[0]);
    __m128i src1 = _mm_loadu_si128((const __m128i*)&src[16]);
    __m128i src2 = _mm_loadu_si128((const __m128i*)&src[32]);
    __m128i src3 = _mm_loadu_si128((const __m128i*)&src[48]);
    // A byte b is in the set exactly when tab[b & 0x0F] == b. (pshufb
    // zeroes lanes whose top index bit is set, so bytes >= 0x80 lose.)
    __m128i acc;
    acc = _mm_cmpeq_epi8(_mm_shuffle_epi8(tab, src0), src0);
    acc = _mm_and_si128(acc, _mm_cmpeq_epi8(_mm_shuffle_epi8(tab, src1), src1));
    acc = _mm_and_si128(acc, _mm_cmpeq_epi8(_mm_shuffle_epi8(tab, src2), src2));
    acc = _mm_and_si128(acc, _mm_cmpeq_epi8(_mm_shuffle_epi8(tab, src3), src3));
    // Valid only if all 64 byte comparisons passed.
    return _mm_movemask_epi8(acc) == 0xFFFF;
}
Using the low 4 bits of each data byte, we select a byte from our set that has that low-nibble value: e.g. 'o' (0x6f) goes in the highest byte of the table, so input bytes of the form 0x?f try to match against it. That makes it the first element for _mm_set_epi8, which lists elements from high to low.
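As a sanity check, here is a small scalar model of the same nibble-table trick (a sketch in Python, assuming the "fhotw" set from the question); it mirrors what each pshufb/pcmpeqb lane pair computes:
table = [None] * 16
for c in b"fhotw":
    table[c & 0x0F] = c   # each allowed byte, indexed by its low nibble

def is_allowed(b):
    # pshufb zeroes a lane when the top bit of the index byte is set,
    # and 0 never equals such a byte, so bytes >= 0x80 are rejected.
    if b & 0x80:
        return False
    return table[b & 0x0F] == b

assert all(is_allowed(c) for c in b"fhotw")
assert not any(is_allowed(c) for c in b"abc_xyz")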
See the full article for variations on this technique for other special / more general cases.
**If the search is very simple (doesn't need the functionality of the string instructions) or very complex (needs at least two string instructions), then it doesn't make much sense to use the string functions. Also, the string instructions don't scale to the 256-bit width of AVX2.

Getting an integer value from a byte-string in Python3

I am implementing an RSA and AES file encryption program. So far, I have RSA and AES implemented. What I wish to understand, however, is: if my AES implementation uses a 16-byte key (obtained from os.urandom(16)), how can I get an integer value from it to encrypt with RSA?
In essence, if I have a byte string like
b',\x84\x9f\xfc\xdd\xa8A\xa7\xcb\x07v\xc9`\xefu\x81'
How could I obtain an integer from this byte string (the AES key) which could subsequently be encrypted using RSA?
Flow of encryption
Encrypt file (AES Key) -> Encrypt AES key (using RSA)
TL;DR: use int.from_bytes and int.to_bytes to implement OS2IP and I2OSP, respectively.
For secure encryption, you don't directly turn the AES key into a number. This is because raw RSA is inherently insecure in many, many ways (and no list of its weaknesses is ever quite complete).
First you need to random-pad your key bytes to obtain a byte array that will represent a number close to the modulus. Then you can perform the byte array conversion to a number, and only then should you perform modular exponentiation. Modular exponentiation will also result in a number, and you need to turn that number into a statically sized byte array with the same size as the modulus (in bytes).
All this is standardized in the PKCS#1 RSA standard. In v2.2 there are two schemes specified, known as PKCS#1 v1.5 padding and OAEP padding. The first one is pretty easy to implement, but is more vulnerable to padding oracle attacks. OAEP is also vulnerable, but less so. You will however need to follow the implementation hints to the detail, especially during unpadding.
To circle back to your question, the number conversions are called the octet string to integer primitive (OS2IP) and the integer to octet string primitive (I2OSP). These are not mathematical operations that you need to perform yourself: they just describe how to encode the number as a statically sized, big-endian, unsigned integer.
Say that keysize is the key size (modulus size) in bits and em is the bytes or bytearray representing the padded key, then you'd just perform:
m = int.from_bytes(em, byteorder='big', signed=False)
for OS2IP where m will be the input for modular exponentiation and back using:
k = (keysize + 8 - 1) // 8   # ceiling division: the modulus size in bytes
em = m.to_bytes(k, byteorder='big', signed=False)
for I2OSP.
And you will have to perform the same two operations for decryption...
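Put together, a minimal sketch (assuming a hypothetical 2048-bit keysize and a stand-in padded block em; in real code em must come from a proper PKCS#1 padding step):
keysize = 2048                      # hypothetical modulus size in bits
em = b"\x00\x02" + b"\xff" * 254    # stand-in for a padded key block (256 bytes)
m = int.from_bytes(em, byteorder='big', signed=False)        # OS2IP
# ... modular exponentiation with the public key happens here ...
k = (keysize + 8 - 1) // 8          # modulus size in bytes
assert m.to_bytes(k, byteorder='big', signed=False) == em    # I2OSP round-trip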
To literally interpret the byte string as an integer (which you should be able to do; Python integers can get arbitrarily large), you could just sum up the byte values, each shifted left by the appropriate number of bits:
bytestr = b',\x84\x9f\xfc\xdd\xa8A\xa7\xcb\x07v\xc9`\xefu\x81'
le_int = sum(v << (8 * i) for i, v in enumerate(bytestr))
# = sum([44, 33792, 10420224, 4227858432, 949187772416, 184717953466368, 18295873486192640, 12033618204333965312, 3744689046963038978048, 33056565380087516495872, 142653246714526242615328768, 62206486974090358813680992256, 7605903601369376408980219232256, 4847495895272749231323393057357824, 607498732448574832538068070518751232, 171470411456254147604591110776164450304])
# = 172082765352850773589848323076011295788
That would be a little-endian interpretation; a big-endian interpretation would just start reading from the other side, which you could do with reversed():
be_int = sum(v << (8 * i) for i, v in enumerate(reversed(bytestr)))
# = sum([129, 29952, 15663104, 1610612736, 863288426496, 129742372077568, 1970324836974592, 14627691589699371008, 3080606260309495119872, 306953821386526938890240, 203099537695257701350637568, 68396187170517260188176613376, 19965496953594613073573075484672, 3224903126980615597407612954476544, 685383185326597246966025515457052672, 58486031814536298407767510652335161344])
# = 59174659937086426622213974601606591873
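Incidentally, these sums are exactly what int.from_bytes computes, so it can serve as a cross-check of the manual loop:
bytestr = b',\x84\x9f\xfc\xdd\xa8A\xa7\xcb\x07v\xc9`\xefu\x81'
le_int = sum(v << (8 * i) for i, v in enumerate(bytestr))
be_int = sum(v << (8 * i) for i, v in enumerate(reversed(bytestr)))
assert le_int == int.from_bytes(bytestr, byteorder='little')
assert be_int == int.from_bytes(bytestr, byteorder='big')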

ECDH private key size

I know that key sizes in ECDH depend on the size of the elliptic curve.
If it is a 256-bit curve (secp256k1), keys will be:
Public: 32 bytes * 2 + 1 = 65 (uncompressed)
Private: 32 bytes
384-bit curve (secp384r1):
Public: 48 bytes * 2 + 1 = 97 (uncompressed)
Private: 48 bytes
But with 521-bit curve (secp521r1) situation is very strange:
Public: 66 bytes * 2 + 1 = 133 (uncompressed)
Private: 66 bytes or 65 bytes.
I used the Node.js crypto module to generate these keys.
Why is the private key of the 521-bit curve variable in size?
The private keys of the other curves vary in size as well, but they are less likely to exhibit this variance when it comes to encoding them to bytes.
The public key is encoded as two statically sized integers (the X and Y coordinates), prefixed with the uncompressed point indicator 04. The size of each integer is identical to the key size in bytes.
The private key doesn't really have a pre-established encoding. It is a single random value (a scalar) within the range 1..N-1, where N is the order of the curve. Now if you encode this value as a variable-sized unsigned number, then usually it will be the same size as the key in bytes. However, it may by chance be one byte smaller, or two, or three, or more. Of course, the chance that it is much smaller is pretty low.
Now the 521-bit curve is a bit strange: the most significant byte of the order doesn't start with a bit set to 1; it only has the least significant bit set to 1 (the top byte is just 01). This means that there is a much higher chance that the most significant byte of the private value (usually called s) is absent, making the encoding a byte shorter.
The exact chance of course depends on the full value of the order:
01FF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
A5186878 3B2F966B 7FCC0148 F709A5D0
3BB5C9B8 899C47AE BB6FB71E 91386409
but as you may guess it is pretty close to 1 out of 2, because almost all of the bits that follow are set to 1. The chance that two bytes are missing is then about 1 out of 512, and three bytes about 1 out of 131072 (etc.).
Note that ECDSA signature sizes may fluctuate as well. The X9.62 signature scheme uses two DER-encoded signed integers. The fact that they are signed may introduce a leading zero byte: if the most significant bit of the most significant byte is set to 1, a 00 byte must be prepended, as otherwise the value would be interpreted as negative. The fact that the signature consists of two numbers, r and s, and that the size of the DER encoding itself also depends on the size of the encoded integers, makes the size of the full encoding rather hard to predict.
Another less common (flat) encoding of an ECDSA signature uses the same statically sized integers as the public key, in which case it is just twice the size of the order N in bytes.
ECDH doesn't have this issue. Commonly the shared secret is the statically encoded X coordinate of the point that is the result of the ECDH calculation - or at least a value that is derived from it using a Key Derivation Function (KDF).
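To see the size variance empirically, here is a small sketch (Python, using the secrets module and the P-521 order quoted above) that estimates how often the minimal encoding of a random private value needs fewer than 66 bytes:
import secrets

# Order N of secp521r1, as quoted above (66 bytes, top byte 0x01).
n_hex = "01" + "f" * 66 + "a51868783b2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409"
N = int(n_hex, 16)

trials, short = 100_000, 0
for _ in range(trials):
    s = secrets.randbelow(N - 1) + 1        # private value in 1..N-1
    if (s.bit_length() + 7) // 8 < 66:      # minimal encoding drops a byte
        short += 1
print(short / trials)   # close to 0.5, as argued above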

Why test for these numbers (2^16, 2^31, ...)?

Going through Elisabeth Hendrickson's test heuristics cheatsheet, I see the following recommendations:
Numbers: 32768 (2^15), 32769 (2^15 + 1), 65536 (2^16), 65537 (2^16 + 1), 2147483648 (2^31), 2147483649 (2^31 + 1), 4294967296 (2^32), 4294967297 (2^32 + 1)
Does someone know the reason for testing all these cases? My gut feeling is that it has to do with the data type the developer may have used (integer, long, double...).
Similarly, with strings:
Long (255, 256, 257, 1000, 1024, 2000, 2048 or more characters)
These represent boundaries
Integers
2^15 is at the bounds of signed 16-bit integers
2^16 is at the bounds of unsigned 16-bit integers
2^31 is at the bounds of signed 32-bit integers
2^32 is at the bounds of unsigned 32-bit integers
Testing for values close to common boundaries tests whether overflow is correctly handled (either arithmetic overflow in the case of various integer types, or buffer overflow in the case of long strings that might potentially overflow a buffer).
Strings
255/256 is at the bounds of numbers that can be represented in 8 bits
1024 is at the bounds of numbers that can be represented in 10 bits
2048 is at the bounds of numbers that can be represented in 11 bits
I suspect that the recommendations such as 255, 256, 1000, 1024, 2000, 2048 are based on experience/observation that some developers may allocate a fixed-size buffer that they feel is "big enough no matter what" and fail to check input. That attitude leads to buffer overflow attacks.
These are boundary values close to maximum signed short, maximum unsigned short and same for int. The reason to test them is to find bugs that occur close to the border values of typical data types.
E.g. your code uses a signed short, and you have tests that exercise something just below and just above the maximum value of that type. If the first test passes and the second one fails, you can easily tell that overflow/truncation on the short was the reason.
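For example, a quick way to watch that truncation happen (a Python sketch using ctypes to emulate a C signed short):
import ctypes

for value in (32767, 32768):                 # 2^15 - 1 and 2^15
    truncated = ctypes.c_int16(value).value  # wraps like a C signed short
    print(value, "->", truncated)            # 32767 -> 32767, 32768 -> -32768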
Those numbers are border cases on either side of the fence (+1, 0, and -1) for "whole and round" computer numbers, which are always powers of 2. Those powers of 2 are also not random and are representing standard choices for integer precision - being 8, 16, 32, and so on bits wide.

RSA signature size?

I would like to know: what is the length of an RSA signature? Is it always the same size as the RSA key size? For example, if the key size is 1024 bits, is the RSA signature 128 bytes, and if the key size is 512 bits, is it 64 bytes? And what is the RSA modulus?
Also, what does RSA-SHA1 mean?
Any pointers greatly appreciated.
You are right: the RSA signature size depends on the key size; the RSA signature size is equal to the length of the modulus in bytes. This means that for an "n-bit key", the resulting signature will be exactly n bits long. Although the computed signature value is not necessarily n bits, the result will be padded to match exactly n bits.
Now here is how this works: The RSA algorithm is based on modular exponentiation. For such a calculation the final result is the remainder of the "normal" result divided by the modulus. Modular arithmetic plays a large role in Number Theory. There the definition for congruence (≡) is
m is congruent to n mod k if k divides m - n
Simple example - let n = 2 and k = 7, then
2 ≡ 2 (mod 7) because: 7 divides 2 - 2
9 ≡ 2 (mod 7) because: 7 divides 9 - 2
16 ≡ 2 (mod 7) because: 7 divides 16 - 2
...
7 actually does divide 0; the definition of divisibility is
An integer a divides an integer b if there is an integer n with the property that b = na
For a = 7 and b = 0 choose n = 0. This implies that every integer divides 0, but it also implies that congruence can be expanded to negative numbers (won't go into details here, it's not important for RSA).
So the gist is that the congruence principle expands our naive understanding of remainders, the modulus is the "number after mod", in our example it would be 7. As there are an infinite amount of numbers that are congruent given a modulus, we speak of this as the congruence classes and usually pick one representative (the smallest congruent integer > 0) for our calculations, just as we intuitively do when talking about the "remainder" of a calculation.
In RSA, signing a message m means exponentiation with the "private exponent" d. The result r is the smallest integer > 0 and smaller than the modulus n so that
m^d ≡ r (mod n)
This implies two things:
The length of r (in bits) is bounded by n (in bits)
The length of m (in bits) must be <= n (in bits, too)
To make the signature exactly n bits long, some form of padding is applied. Cf. PKCS#1 for valid options.
The second fact implies that a message larger than n would have to be signed by breaking m into several chunks <= n, but this is not done in practice since it would be way too slow (modular exponentiation is computationally expensive). So we need another way to "compress" our messages to something smaller than n. For this purpose we use cryptographically secure hash functions such as the SHA-1 you mentioned. Applying SHA-1 to an arbitrary-length message m produces a "hash" that is 20 bytes long: smaller than the typical size of an RSA modulus (common sizes are 1024 or 2048 bits, i.e. 128 or 256 bytes), so the signature calculation can be applied to any arbitrary message.
The cryptographic properties of such a hash function ensure (in theory; signature forgery is a huge topic in the research community) that it is not possible to forge a signature other than by brute force.
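To make the size relationship concrete, here is a toy sketch (textbook RSA with a tiny, hypothetical key and no padding; insecure, for illustration only) showing that the emitted signature always has the modulus length:
import hashlib

# Tiny textbook-RSA parameters (hypothetical, insecure; illustration only).
p, q = 61, 53
n = p * q                       # modulus: 3233, i.e. 2 bytes
e, d = 17, 2753                 # e*d ≡ 1 (mod (p-1)*(q-1))
k = (n.bit_length() + 7) // 8   # modulus length in bytes

# Hash the message, then reduce below n (only because this toy n is far
# smaller than a real modulus; real schemes pad the hash instead).
m = int.from_bytes(hashlib.sha1(b"hello").digest(), "big") % n
r = pow(m, d, n)                  # m^d ≡ r (mod n)
signature = r.to_bytes(k, "big")  # padded to exactly k bytes
assert len(signature) == k
assert pow(r, e, n) == m          # verification: r^e ≡ m (mod n)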
