Multiplication of numbers larger than a 64-bit register

For a 64-bit processor, the size of the registers is 64 bits. So the largest operands that can be multiplied together at a time are a 32-bit number by a 32-bit number.
Is this true or are there any other factors that determine the number of bits multiplied by each other?

No, you cannot say that.
It depends on how the multiplication algorithm is implemented.
Think about this:
x * y = x + x + x + ... + x // y times
You can think of every multiplication as a sum of additions.
Adding one number to another is not limited by register size: you add two digits at a time, save the result and the carry, and move on to the next pair of digits until the two numbers have been added completely.
This way you can multiply very long numbers with a very small register, as long as you have enough memory to save the result.
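As a rough sketch of that idea (in C; the 8-bit limb size and the helper name are just for illustration), here is schoolbook multiplication where the operands and the result live in memory as arrays of 8-bit limbs, so no single step ever needs more than a 16-bit working value:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* out must have room for xlen + ylen limbs and start zeroed;
   limbs are stored least significant first */
void long_mul(const uint8_t *x, size_t xlen,
              const uint8_t *y, size_t ylen, uint8_t *out)
{
    for (size_t i = 0; i < xlen; i++) {
        uint16_t carry = 0;                       /* one limb product plus carry fits in 16 bits */
        for (size_t j = 0; j < ylen; j++) {
            uint16_t t = (uint16_t)(x[i] * y[j]) + out[i + j] + carry;
            out[i + j] = (uint8_t)t;              /* keep the low 8 bits here         */
            carry = t >> 8;                       /* pass the high 8 bits to the left */
        }
        out[i + ylen] = (uint8_t)(out[i + ylen] + carry);
    }
}

int main(void)
{
    /* 0x1234 * 0x00FF = 0x001221CC */
    uint8_t x[] = {0x34, 0x12}, y[] = {0xFF, 0x00}, out[4] = {0};
    long_mul(x, 2, y, 2, out);
    printf("%02X%02X%02X%02X\n", out[3], out[2], out[1], out[0]);   /* prints 001221CC */
    return 0;
}

Real bignum libraries use machine-word limbs and the hardware's full-width multiply, but the carry-and-store structure is the same.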

Related

Why does this hash calculating bit hack work?

For practice I've implemented the QOI specification in Rust. In it there is a small hash function to store recently used pixels:
index_position = (r * 3 + g * 5 + b * 7 + a * 11) % 64
where r, g, b, and a are the red, green, blue and alpha channels respectively.
I assume this works as a hash because it creates a unique combination of prime multiples, with the mod to limit the range of the result. Anyway, I implemented it naively in my code.
While looking at other implementations I came across this bit hack to optimize the hash calculation:
fn hash(rgba: [u8; 4]) -> u8 {
    let v = u32::from_ne_bytes(rgba);
    let s = (((v as u64) << 32) | (v as u64)) & 0xFF00FF0000FF00FF;
    s.wrapping_mul(0x030007000005000Bu64.to_le()).swap_bytes() as u8 & 63
}
I think I understand most of what's going on but I'm confused about the magic number (the multiplicand). To my understanding it should be flipped. As a step by step example:
let rgba = [0x12, 0x34, 0x56, 0x78].
On my machine (little endian) this gives v the value 0x78563412.
The bit shifting spreads the values, giving s = 0x7800340000560012.
Now here's where I get confused. The magic number has the values that should be multiplied aligned in a 64 bit field (3, 5, 7, 11), spaced the same way that the original values are. However they seem to be in reverse order from the values:
0x7800340000560012
0x030007000005000B
When multiplying it would seem that the highest value, the alpha channel (0x78), is being multiplied by 3, while the lowest value, the red channel (0x12), is being multiplied by 11. I'm also not entirely sure why this multiplication works anyway, after multiplying the values by various powers of 2.
I understand that the bytes are then swapped to big endian and trimmed, but that doesn't happen until after the multiplication step, which is what loses me.
I know that the code produces the correct hash, but I don't understand why that's the case. Can anyone explain to me what I'm missing?
If you think about the way the math works, you want this flipped order, because it means all the results from each of the "logical" multiplications cluster in the same byte. The highest byte in the first value multiplied by the lowest byte in the second produces a result in the highest byte. The lowest byte in the first value's product with the highest byte in the second value produces a result in the same highest byte, and the same goes for the intermediate bytes.
Yes, the 0x78... and 0x03... are also multiplied by each other, but that product overflows way past the top of the 64-bit value and is lost. Having the order "backwards" means the results of the multiplications we care about all end up summed in the uppermost byte: the total shift of each result we want is always 56 bits, because the component at bit offset 56 is multiplied by the one at offset 0, the one at 40 by the one at 16, the one at 16 by the one at 40, and the one at 0 by the one at 56. The rest of the multiplications, the ones we don't want, either overflow (and are lost) or land in lower bytes (which we ignore). If you flipped the bytes in the second value, the 0x78 * 0x0B (alpha value & multiplier) component would be lost to overflow, while the 0x12 * 0x03 (red value & multiplier) component wouldn't reach the target byte; every component we cared about would end up somewhere other than the uppermost byte.
For a possibly more intuitive example, imagine doing the same work, but where all the bytes of one input except a single component are zero. If you multiply:
0x7800000000000000 * 0x030007000005000B
the logical result is:
0x1680348000258052800000000000000
but removing the overflow reduces that to:
0x2800000000000000
//^^ result we care about (actual product of 0x78 and 0x0B is 0x528, but only keeping low byte)
Similarly,
0x0000340000000000 * 0x030007000005000B
produces:
0x9c016c000104023c0000000000
overflowing to:
0x04023c0000000000
//^^ result we care about (actual product of 0x34 and 0x5 was 0x104, but only 04 kept)
In that case, the other multiplications did leave data in the result (not all of them overflowed), but since we only look at the high byte, the rest gets ignored.
If you keep doing this math step by step and adding the results, you'll find that the high byte ends up the correct answer to the four individual multiplications you expected (mod 256); flip the order, and it won't work out that way.
The advantage to putting all the results in that high byte is that it allows you to use swap_bytes to move it cheaply to the low byte, and read the value directly (no need to even mask it on many architectures).
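As a quick cross-check of this explanation, here is a sketch in C rather than Rust; reading the top byte with p >> 56 plays the role of swap_bytes() followed by as u8, and the byte layout of v assumes the little-endian case from the question (r in the low byte, a in the high byte):

#include <stdint.h>
#include <stdio.h>

/* the naive hash from the QOI spec */
static uint8_t hash_naive(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return (uint8_t)((r * 3 + g * 5 + b * 7 + a * 11) % 64);
}

/* the multiply trick */
static uint8_t hash_trick(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    uint64_t v = (uint64_t)a << 24 | (uint64_t)b << 16 | (uint64_t)g << 8 | r;
    uint64_t s = ((v << 32) | v) & 0xFF00FF0000FF00FFULL;   /* spread the four bytes out */
    uint64_t p = s * 0x030007000005000BULL;                 /* 64-bit multiply, wraps    */
    return (uint8_t)(p >> 56) & 63;   /* top byte = r*3 + g*5 + b*7 + a*11 (mod 256)     */
}

int main(void)
{
    uint8_t px[][4] = {{0x12, 0x34, 0x56, 0x78}, {255, 255, 255, 255}, {1, 2, 3, 4}};
    for (int i = 0; i < 3; i++)
        printf("naive=%d trick=%d\n",
               hash_naive(px[i][0], px[i][1], px[i][2], px[i][3]),
               hash_trick(px[i][0], px[i][1], px[i][2], px[i][3]));
    return 0;
}

For the pixel from the question, [0x12, 0x34, 0x56, 0x78], both functions print 60.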

Maximum bit-width to store a summation of M n-bit binary numbers

I am trying to find the formula to calculate the maximum bit-width required to contain a sum of M n-bit unsigned binary numbers. Thanks!
The maximum bit-width needed should be ceil(log_2(M * (2^n - 1))).
Edit: Thanks to @MBurnham I realize now that it should be floor(log_2(M * (2^n - 1))) + 1 instead.
Assuming positive integers, you need floor(log2(x)) + 1 bits to store x, and the largest value the sum of M n-bit numbers can produce is M * (2^n - 1).
So I believe the formula should be
floor(log2(M * (2^n - 1))) + 1
bits.
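A quick sanity check of that formula (a sketch assuming unsigned values; bits_needed simply counts bits, which equals floor(log2(x)) + 1 for x >= 1):

#include <stdint.h>
#include <stdio.h>

/* number of bits needed to store x, i.e. floor(log2(x)) + 1 for x >= 1 */
static unsigned bits_needed(uint64_t x)
{
    unsigned bits = 0;
    while (x) { bits++; x >>= 1; }
    return bits;
}

int main(void)
{
    for (unsigned n = 1; n <= 8; n++)
        for (unsigned M = 1; M <= 16; M++) {
            uint64_t max_sum = (uint64_t)M * ((1ULL << n) - 1);   /* widest possible sum */
            printf("M=%2u n=%u -> %u bits\n", M, n, bits_needed(max_sum));
        }
    return 0;
}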
If I add 2 numbers, I need 1 bit more than the wider of the 2 numbers to store the result. So, if I add 2 n-bit numbers, I need n+1 bits to store the result.
If I add another n-bit number, I need (n+1)+1 bits to store the result (that's 3 n-bit numbers added so far).
If I add another n-bit number, I need ((n+1)+1)+1 bits to store the result (that's 4 n-bit numbers added so far).
If I add another n-bit number, I need (((n+1)+1)+1)+1 bits to store the result (that's 5 n-bit numbers added so far).
So, I think your formula is
n + M - 1
(Note that this assumes every addition carries into a new bit, so it is a safe upper bound; the exact maximum, floor(log2(M * (2^n - 1))) + 1, is smaller once M is 4 or more.)

Verilog code to compute cosx using Taylor series approximation

I'm trying to implement the cos X function in Verilog using a Taylor series. The problem statement presented to me is as below:
"Write a Verilog code to compute cosX using Taylor series approximation. Please attach the source and test bench code of the 8-bit outputs in signed decimal radix format for X = 0° to 360° at the increment of 10° "
I need to understand a couple of things before I proceed. Please correct me if I am wrong somewhere.
Resolution calculation: 10° increments to cover 0° to 360° => 36 positions.
36 positions can be represented by 6 bits. Since 6 bits can address 64 words, the resolution can be made slightly better: let the 64 words cover 0° to 360°, so each word represents 360°/64 = 5.625°, i.e. all values of cos from 0° to 360° in increments of 5.625°. Thus the resolution is 5.625°.
Taylor series calculation
The Taylor series approximation for cosine is:
COS X = 1 − (X^2/2!) + (X^4/4!) − (X^6/6!) ..... (using only 3~4 terms)
I have a couple of queries
1) While it is easy to generate the X*X (X squared) or X cubed terms using a multiplier, I am not sure how to deal with the extra bits generated during the calculation of the X squared or X cubed terms. The output is only 8 bits.
e.g. X = 6 bits; X squared = 12 bits; X cubed = 18 bits.
Do I generate them anyway and later ignore them by keeping just the 8 MSBs of the entire result? Such a cosine wave would be very inaccurate, right?
2) I am not sure how to handle the +1 at the start of the Taylor series, cos X = 1 − (X^2/2!) + (X^4/4!) .... Do I add binary 1 directly, or do I have to scale the 1, e.g. to 2^8 = 256 or 2^6 = 64, since I am using 6 bits at the input and 8 bits at the output?
This series gives a number in the range -1 to +1, so you have to decide how you are going to use your 8 bits.
With a signed number that has 1 integer bit and 7 fractional bits you will not be able to represent 1 exactly, but you can get very close.
I have a previous answer explaining how to use fixed-point with Verilog. Once you're comfortable with that, you need to look at how bit growth occurs during a multiply.
Just because you are outputting 1 integer bit and 7 fractional bits does not mean you have to compute with that width; internally you could (and should) use more bits to compute the answer.
With 7 fractional bits, an integer 1 would look like 9'b0_1_0000000, i.e. 1*2**7 = 128.
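To make that bit growth concrete, here is a minimal sketch in C rather than Verilog, assuming a Q*.7 style format where a real value v is stored as v * 2^7; the same scaling carries over to an HDL implementation:

#include <stdint.h>
#include <stdio.h>

#define FRAC_BITS 7   /* Q*.7: a real value v is stored as v * 2^7 */

/* multiplying two Q*.7 numbers gives a Q*.14 result (bit growth);
   shifting right by 7 renormalizes back to Q*.7 */
static int16_t fx_mul(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a * (int32_t)b;   /* keep the extra bits of the product */
    return (int16_t)(wide >> FRAC_BITS);
}

int main(void)
{
    int16_t one = 1 << FRAC_BITS;             /* 1.0  -> 128 */
    int16_t x   = 1 << (FRAC_BITS - 1);       /* 0.5  -> 64  */
    int16_t x2  = fx_mul(x, x);               /* 0.25 -> 32  */
    int16_t c   = one - (x2 >> 1);            /* 1 - x^2/2! ~ cos(0.5) */
    printf("cos(0.5) ~ %d/128 = %f\n", c, c / 128.0);
    return 0;
}

The same pattern applies in Verilog: a 9-bit by 9-bit product needs 18 bits, and you shift right by 7 to get back to 7 fractional bits before truncating to the 8-bit output.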

Lookup table for counting number of set bits in an Integer

I was trying to solve this popular interview question - http://www.careercup.com/question?id=3406682
There are 2 approaches to this that I was able to grasp:
1) Brian Kernighan's algorithm - Bits counting algorithm (Brian Kernighan) in an integer time complexity
2) Lookup table.
I assume when people say use a lookup table, they mean a HashMap with the Integer as the key and the count of set bits as the value.
How does one construct this lookup table? Do we use Brian Kernighan's algorithm to count the number of bits the first time we encounter an integer, put it in the hashtable, and the next time we encounter that integer, retrieve the value from the hashtable?
PS: I am aware of the hardware and software APIs available to perform popcount (Integer.bitCount()), but in the context of this interview question we are not allowed to use those methods.
I was looking for an answer everywhere but could not find a satisfactory explanation.
Let's start by understanding the concept of left shifting. When we shift a number left we multiply it by 2, and shifting right divides it by 2.
For example, if we want to generate the number 20 (binary 10100) from the number 10 (01010), we have to shift 10 to the left by one. The number of set bits in 10 and 20 is the same, except that the bits in 20 are shifted one position to the left compared to 10. From this we can conclude that the number of set bits in n is the same as the number of set bits in n/2 (if n is even).
In the case of odd numbers, like 21 (10101), all bits are the same as in 20 except for the last bit, which is set to 1 in 21, giving one extra set bit for an odd number.
Let's generalize this formula:
the number of set bits in n is the number of set bits in n/2 if n is even;
the number of set bits in n is the number of set bits in n/2 plus 1 if n is odd (since for an odd number the last bit is set).
A more generic formula would be:
BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
where BitsSetTable256 is the table we are building for the bit counts. For the base case we can set BitsSetTable256[0] = 0; the rest of the table can be computed with the above formula in a bottom-up fashion.
Integers can be used directly to index arrays;
e.g. you just have a simple array of unsigned 8-bit integers containing the set-bit count for 0x0001, 0x0002, 0x0003, ... and look it up with array[number_to_test].
You don't need to implement a hash function to map a 16-bit integer to something you can order so that you can have a lookup function!
To answer your question about how to compute this table:
int table[256] = {0}; /* For 8-bit lookup; table[0] must start at 0 */
for (int i = 0; i < 256; i++) {
    table[i] = table[i/2] + (i&1);
}
Lookup this table on every byte of the given integer and sum the values obtained.
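For example, a small sketch that builds the 8-bit table with the recurrence above and then sums the lookups for the four bytes of a 32-bit integer:

#include <stdint.h>
#include <stdio.h>

static int table[256];   /* set-bit count for every byte value */

static void build_table(void)
{
    table[0] = 0;
    for (int i = 1; i < 256; i++)
        table[i] = table[i / 2] + (i & 1);
}

/* look up each of the four bytes and sum the counts */
static int popcount32(uint32_t x)
{
    return table[x & 0xFF] + table[(x >> 8) & 0xFF] +
           table[(x >> 16) & 0xFF] + table[(x >> 24) & 0xFF];
}

int main(void)
{
    build_table();
    printf("%d\n", popcount32(0xF0F00001u));   /* prints 9 */
    return 0;
}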

Python 3 - What is ">>"

This is the confusing line: x_next = (x_next + (a // x_prev)) >> 1
It is a bit-wise shift. The following will give you some intuition:
>>> 16 >> 1
8
>>> 16 >> 2
4
>>> 16 >> 3
2
>>> bin(16)
'0b10000'
>>> bin(16 >> 1)
'0b1000'
>>> bin(16 >> 2)
'0b100'
The >> operator is the same operator as in C and many other languages.
It is a bit shift to the right. If your number is 0100 in binary, then it will be 0010 after >> 1. With >> 2 it will be 0001.
So basically it's a nice way to divide your number by 2 (flooring the result) ;)
It is the right shift operator.
Here it is being used to divide by 2. It would be far more clear to write this as
x_next = (x_next + (a // x_prev)) // 2
Sadly a lot of people try to be clever and use shift operators in place of multiplication and division. Typically this just leads to lots of confusion for the poor individuals who have to read the code at a later date.
Most newer/younger programmers do not worry about efficiency because the computers are so fast.
But if you are working on a 8-bit or 16-bit processor that may or may not have a hardware multiply and rarely has a hardware divide, then shifting integers takes one machine cycle while a multiply may take 16 or more and a divide may take 50-200 machine cycles. When your processor clock is in the GHz range you do not notice the difference, but if your instruction rate is 8 MHz or less it adds up very quickly.
So for efficiency, people use shifts in place of multiplication or division by powers of two, especially in C, which is the most common language for small processors and controllers.
I see it so often that I do not even think about it anymore.
Some of the things I do in C:
x = y >> 3;        // the same as the floor of a divide by 8
if (y & 0x04) x++; // add 1 when the remainder is 4 or more, so the answer is rounded
For most microcontrollers, the compiler lets you see the resulting machine code that is generated, so you can see what different statements produce. With that type of feedback, after a while you just start writing more efficient code.
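As a complete, self-contained version of that shift-and-round idea (a sketch; halves round upward):

#include <stdio.h>

/* divide by 8, rounding to nearest:
   shift for the floor, then check the remainder bit just below the shift */
static unsigned div8_rounded(unsigned y)
{
    unsigned x = y >> 3;        /* floor(y / 8)                  */
    if (y & 0x04) x++;          /* remainder is 4..7, so round up */
    return x;
}

int main(void)
{
    for (unsigned y = 0; y <= 20; y++)
        printf("%2u -> %u\n", y, div8_rounded(y));
    return 0;
}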
It means "right shift". It works the same as floor division by 2:
>>> a = 7
>>> a >> 1
3
>>> a // 2
3
