Lookup table for counting number of set bits in an Integer

Lookup table for counting number of set bits in an Integer - hashmap

Was trying to solve this popular interview question - http://www.careercup.com/question?id=3406682
There are 2 approaches to this that I was able to grasp -
Brian Kernighan's algo -
Bits counting algorithm (Brian Kernighan) in an integer time complexity
Lookup table.
I assume that when people say to use a lookup table, they mean a hash map with the integer as key and the number of set bits as value.
How does one construct this lookup table? Do we use Brian Kernighan's algorithm to count the bits the first time we encounter an integer, put the result in the hash table, and retrieve the value from the hash table the next time we encounter that integer?
PS: I am aware of the hardware and software APIs available to perform popcount (Integer.bitCount()), but in the context of this interview question, we are not allowed to use those methods.

I was looking for an answer everywhere but could not find a satisfactory explanation.
Let's start by understanding the concept of shifting. Shifting a number left by one bit multiplies it by 2, and shifting it right by one bit divides it by 2 (discarding the remainder).
For example, to generate 20 (binary 10100) from 10 (binary 01010), we shift 10 to the left by one. The number of set bits in 10 and 20 is the same; the bits in 20 are just shifted one position to the left. From this we can conclude that the number of set bits in n is the same as the number of set bits in n/2 if n is even.
For odd numbers such as 21 (10101), all bits are the same as in 20 except the last bit, which is set to 1, so an odd number has one extra set bit.
Let's generalize this into a formula (n/2 is integer division):
number of set bits in n is number of set bits in n/2 if n is even
number of set bits in n is number of set bits in n/2 + 1 if n is odd (because the last bit of an odd number is set)
A more generic formula would be:
BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
where BitsSetTable256 is the table we are building for the bit counts. For the base case we set BitsSetTable256[0] = 0; the rest of the table can be computed bottom-up using the formula above.

Integers can directly be used to index arrays;
e.g. you have just a simple array of unsigned 8-bit integers containing the set-bit count for 0x0000, 0x0001, 0x0002, 0x0003... and do a lookup with array[number_to_test].
You don't need to implement a hash function to map a 16-bit integer to something you can order just to get a lookup function!

To answer your question about how to compute this table:
int table[256]; /* For 8 bit lookup */
table[0] = 0; /* Base case: zero has no set bits */
for (int i = 1; i < 256; i++) {
    table[i] = table[i / 2] + (i & 1);
}
Look up each byte of the given integer in this table and sum the values obtained.
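As a rough Python sketch of the same idea, the table can be built once and then consulted one byte at a time. The names POPCOUNT_TABLE and count_set_bits are made up for illustration, and the input is assumed to be a non-negative integer that fits in 32 bits:

```python
# Build the 256-entry table bottom-up: bits(i) = bits(i // 2) + lowest bit of i.
POPCOUNT_TABLE = [0] * 256
for i in range(1, 256):
    POPCOUNT_TABLE[i] = POPCOUNT_TABLE[i // 2] + (i & 1)


def count_set_bits(n):
    """Count set bits of a non-negative 32-bit integer, one byte per lookup."""
    total = 0
    for shift in (0, 8, 16, 24):
        # Isolate one byte and use it directly as an array index.
        total += POPCOUNT_TABLE[(n >> shift) & 0xFF]
    return total
```

Four array lookups cover any 32-bit value, so no hash map or hash function is involved.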

Related

Why does this hash calculating bit hack work?

For practice I've implemented the qoi specification in rust. In it there is a small hash function to store recently used pixels:
index_position = (r * 3 + g * 5 + b * 7 + a * 11) % 64
where r, g, b, and a are the red, green, blue and alpha channels respectively.
I assume this works as a hash because it creates a unique prime factorization for the numbers with the mod to limit the number of bytes. Anyways I implemented it naively in my code.
While looking at other implementations I came across this bit hack to optimize the hash calculation:
fn hash(rgba: [u8; 4]) -> u8 {
let v = u32::from_ne_bytes(rgba);
let s = (((v as u64) << 32) | (v as u64)) & 0xFF00FF0000FF00FF;
s.wrapping_mul(0x030007000005000Bu64.to_le()).swap_bytes() as u8 & 63
}
I think I understand most of what's going on but I'm confused about the magic number (the multiplicand). To my understanding it should be flipped. As a step by step example:
let rgba = [0x12, 0x34, 0x56, 0x78].
On my machine (little endian) this gives v the value 0x78563412.
The bit shifting spreads the values, giving s = 0x7800340000560012.
Now here's where I get confused. The magic number has the values that should be multiplied aligned in a 64 bit field (3, 5, 7, 11), spaced the same way that the original values are. However they seem to be in reverse order from the values:
0x7800340000560012
0x030007000005000B
When multiplying it would seem that the highest value, the alpha channel (0x78), is being multiplied by 3, while the lowest value, the red channel (0x12), is being multiplied by 11. I'm also not entirely sure why this multiplication works anyway, after multiplying the values by various powers of 2.
I understand that the bytes are then swapped to big endian and trimmed, but that's not until after the multiplication step which loses me.
I know that the code produces the correct hash, but I don't understand why that's the case. Can anyone explain to me what I'm missing?

If you think about the way the math works, you want this flipped order, because it means all the results from each of the "logical" multiplications cluster in the same byte. The highest byte in the first value multiplied by the lowest byte in the second produces a result in the highest byte. The lowest byte in the first value's product with the highest byte in the second value produces a result in the same highest byte, and the same goes for the intermediate bytes.
Yes, the 0x78... and 0x03... are also multiplied by each other, but the result overflows way past the top of the value and is lost. Having the order "backwards" means the results of the multiplications we care about all end up summed in the uppermost byte (the total shift of the results we want is always 56 bits, because the 56th bit offset value is multiplied by the 0th, the 40th by the 16th, the 16th by the 40th, and the 0th by the 56th), with the results of the multiplications we don't want either overflowing (and being lost) or appearing in lower bytes (which we ignore). If you flipped the bytes in the second value, the 0x78 * 0x0B (alpha value & multiplier) component would be lost to overflow, while the 0x12 * 0x03 (red value & multiplier) component wouldn't reach the target byte (every component we cared about would end up somewhere that wasn't the uppermost byte).
For a possibly more intuitive example, imagine doing the same work, but where all the bytes of one input except a single component are zero. If you multiply:
0x7800000000000000 * 0x030007000005000B
the logical result is:
0x1680348000258052800000000000000
but removing the overflow reduces that to:
0x2800000000000000
//^^ result we care about (actual product of 0x78 and 0x0B is 0x528, but only keeping low byte)
Similarly,
0x0000340000000000 * 0x030007000005000B
produces:
0x9c016c000104023c0000000000
overflowing to:
0x04023c0000000000
//^^ result we care about (actual product of 0x34 and 0x5 was 0x104, but only 04 kept)
In that case, the other multiplications did leave data in result (not all overflowed), but since we only look at the high byte, the rest gets ignored.
If you keep doing this math step by step and adding the results, you'll find that the high byte ends up the correct answer to the four individual multiplications you expected (mod 256); flip the order, and it won't work out that way.
The advantage to putting all the results in that high byte is that it allows you to use swap_bytes to move it cheaply to the low byte, and read the value directly (no need to even mask it on many architectures).
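To check this reasoning without doing the hex arithmetic by hand, here is a small Python sketch that mimics the 64-bit wrapping multiply with an explicit mask and compares it against the plain formula. It assumes the little-endian byte layout from the question; the function names naive_hash and packed_hash are invented for this illustration and are not part of any QOI library:

```python
MASK64 = (1 << 64) - 1  # emulates the wrap-around of a u64 multiplication


def naive_hash(r, g, b, a):
    """The hash exactly as written in the question."""
    return (r * 3 + g * 5 + b * 7 + a * 11) % 64


def packed_hash(r, g, b, a):
    """The multiply trick, assuming r is the lowest byte of v (little endian)."""
    v = r | (g << 8) | (b << 16) | (a << 24)
    # Spread the channels out: a at bit 56, g at bit 40, b at bit 16, r at bit 0.
    s = ((v << 32) | v) & 0xFF00FF0000FF00FF
    product = (s * 0x030007000005000B) & MASK64
    # All four wanted partial products land in the top byte.
    return (product >> 56) & 63
```

For the question's example pixel [0x12, 0x34, 0x56, 0x78], both functions return 60.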

Find if two strings are anagrams

Faced this question in an interview, which basically stated
Find if the given two strings are anagrams of each other in O(N) time without any extra space
I tried the usual solutions:
Using a character frequency count (O(N) time, O(26) space) (as a variation, iterating 26 times to calculate frequency of each character as well)
Sorting both strings and comparing (O(NlogN) time, constant space)
But the interviewer wanted a "better" approach. At the extreme end of the interview, he hinted at "XOR" for the question. Not sure how that works, since "aa" XOR "bb" should also be zero without being anagrams.
Long story short, are the given constraints possible? If so, what would be the algorithm?

Given word_a and word_b of the same length, I would try the following:
1. Define a variable counter and initialise it to 0.
2. For each letter ii in the alphabet do the following:
2.1. for jj in length(word_a):
2.1.1. if word_a[jj] == ii increase counter by 1: counter += 1
2.1.2. if word_b[jj] == ii decrease the counter by 1: counter -= 1
2.2. if counter is different from 0 after passing over all the characters in the words, the two words contain a different number of ii characters and so are not anagrams; break out of the loop and return False
3. Return True
Explanation
If the words are anagrams, they contain the same number of each character, so a histogram is the natural tool, but histograms require space. With this method, you run over the n characters of the words exactly 26 times for the English alphabet, or c times for any other alphabet of constant size c. Therefore, the runtime of the process is O(c*n) = O(n) since c is constant, and you do not use any space besides the one variable.
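The steps above could be sketched in Python roughly as follows. The function name are_anagrams and the default lowercase English alphabet are assumptions made for the example; characters outside the chosen alphabet are simply ignored by this sketch:

```python
import string


def are_anagrams(word_a, word_b, alphabet=string.ascii_lowercase):
    """Compare per-letter counts using a single integer counter."""
    if len(word_a) != len(word_b):
        return False
    for letter in alphabet:
        counter = 0
        for jj in range(len(word_a)):
            if word_a[jj] == letter:
                counter += 1
            if word_b[jj] == letter:
                counter -= 1
        # A non-zero balance means the words disagree on this letter.
        if counter != 0:
            return False
    return True
```

The only working storage is the counter and the loop variables, at the cost of scanning both words once per letter of the alphabet.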

I haven't proven to myself that this is infallible yet, but it's a possible solution.
Go through both strings and calculate 3 values: the sum, the accumulated xor, and the count. If all 3 are equal then the strings should be anagrams.
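A quick Python sketch makes it easy to experiment with this idea. The helper names signature and maybe_anagrams are invented here. Note that equal values are a necessary condition for anagrams but not a sufficient one: for instance "hkmn" and "ijlo" have the same sum, the same xor and the same count without being anagrams, so this check can report false positives:

```python
def signature(text):
    """Return (sum, accumulated xor, count) of the character codes."""
    total = 0
    xor = 0
    for ch in text:
        code = ord(ch)
        total += code
        xor ^= code
    return total, xor, len(text)


def maybe_anagrams(a, b):
    """True when all three values match; a match does not prove anagrams."""
    return signature(a) == signature(b)
```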

Coin Change Algorithm - DP with 1D array

I came across a solution to the Coin Change problem here : Coin Change. I was able to understand the first recursive method and the second method, which uses DP with a 2D array, but I am not able to understand the logic behind the third solution.
As far as I can tell, the last method works for problems in which the order of the coins used to make change matters. Am I correct? Can anyone please explain if I am wrong?

Well I figured it out myself!
This can be easily proved using induction. Let table[k] denote the ways change can be given for a total of k. Now the algorithm consists of two loops, one which is controlled by i and iterates through the array containing all the different coins and the other is the j controlled loop which for a given i, updates all the values of elements in array table. Now consider for a fixed i we have calculated the number of ways change can be given for all values from 1 to n and these values are stored in table from table[1] to table[n]. When the i controlled loop iterates for i+1, the value in table[j] for an arbitrary j is incremented by table[j-S[i + 1]] which is nothing but the ways we can create j using at least one coin with value S[i + 1] (the array which stores coin values). Thus the total value in table[j] equals the number of ways we can create a change with coins of value S[1]....S[i] (this was already stored before) and the value table[j-S[i + 1]]. This is same as the optimal substructure of the problem used in the recursive algorithm.
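For reference, a minimal Python sketch of the 1D table described above, assuming all coin values are positive integers (the function name count_ways is made up for this example). Because the loop over coins is the outer loop, each combination of coins is counted once regardless of the order in which the coins are handed out:

```python
def count_ways(coins, total):
    """Number of ways to make `total` from `coins`, ignoring coin order."""
    table = [0] * (total + 1)
    table[0] = 1  # one way to make 0: use no coins
    for coin in coins:
        # After this pass, table[j] counts the ways using the coins seen so far.
        for j in range(coin, total + 1):
            table[j] += table[j - coin]
    return table[total]
```

For example, count_ways([1, 2, 3], 4) gives 4 (1+1+1+1, 1+1+2, 2+2, 1+3).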

int arr[size];              // arr[j] = number of ways to form the sum j (size is assumed to exceed both n and sum)
memset(arr, 0, sizeof(arr));
int n;                      // number of coin denominations
cin >> n;
int sum;                    // target amount
cin >> sum;
int a[size];                // a[i] = value of the i-th coin
for (int i = 0; i < n; i++)
    cin >> a[i];
arr[0] = 1;
for (int i = 0; i < n; i++)
    for (int j = a[i]; j <= sum; j++)
        arr[j] += arr[j - a[i]];
cout << arr[sum];
The array arr is initialised to 0 to show that the number of ways a sum i can be represented is zero until it has been computed. However, the number of ways in which a sum of 0 can be represented is 1 (by using no coins).
Next, we take each coin in turn and update every position in the array, starting from the coin's denomination.
arr[j] += arr[j - a[i]] means that we increase the number of ways to represent the sum j by the number of ways already found for the remaining sum (j - a[i]).
In the end, we output arr[sum].

Binary search - worst/avg case

I'm finding it difficult to understand why/how the worst and average case for searching for a key in an array/list using binary search is O(log(n)).
log(1,000,000) is only 6. log(1,000,000,000) is only 9 - I get that, but I don't understand the explanation. If one did not test it, how do we know that the avg/worst case is actually log(n)?
I hope you guys understand what I'm trying to say. If not, please let me know and I'll try to explain it differently.

Worst case
Every time the binary search code makes a decision, it eliminates half of the remaining elements from consideration. So you're dividing the number of elements by 2 with each decision.
How many times can you divide by 2 before you are down to only a single element? If n is the starting number of elements and x is the number of times you divide by 2, we can write this as:
n / (2 * 2 * 2 * ... * 2) = 1 [the '2' is repeated x times]
or, equivalently,
n / 2^x = 1
or, equivalently,
n = 2^x
So log base 2 of n gives you x, which is the number of decisions being made.
Finally, you might ask, if I used log base 2, why is it also OK to write it as log base 10, as you have done? The base does not matter because the difference is only a constant factor which is "ignored" by Big O notation.
Average case
I see that you also asked about the average case. Consider:
There is only one element in the array that can be found on the first try.
There are only two elements that can be found on the second try. (Because after the first try, we chose either the right half or the left half.)
There are only four elements that can be found on the third try.
You can see the pattern: 1, 2, 4, 8, ... , n/2. To express the same pattern going in the other direction:
Half the elements take the maximum number of decisions to find.
A quarter of the elements take one fewer decision to find.
etc.
Since half of the elements take the maximum amount of time, it doesn't matter how much less time the other elements take. We could assume that all elements take the maximum amount of time, and even if half of them actually take 0 time, our assumption would not be more than double whatever the true average is. We can ignore "double" since it is a constant factor. So the average case is the same as the worst case, as far as Big O notation is concerned.
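If you would rather see the numbers than trust the argument, a short Python sketch can count the probes for every key in a sorted array. The helper names probes_needed and worst_and_average are invented for this example:

```python
def probes_needed(sorted_values, key):
    """Return how many middle elements binary search inspects for key."""
    lo, hi = 0, len(sorted_values) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_values[mid] == key:
            return probes
        if sorted_values[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes


def worst_and_average(n):
    """Search for every element of a sorted array of size n."""
    values = list(range(n))
    counts = [probes_needed(values, key) for key in values]
    return max(counts), sum(counts) / n
```

For n = 1023 this reports a worst case of 10 probes and an average of just over 9, which illustrates why the average case is within a constant factor of the worst case.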

For binary search, the array should be arranged in ascending or descending order.
In each step, the algorithm compares the search key value with the key value of the middle element of the array.
If the keys match, then a matching element has been found and its index, or position, is returned.
Otherwise, if the search key is less than the middle element's key, then the algorithm repeats its action on the sub-array to the left of the middle element.
Or, if the search key is greater, then the algorithm repeats its action on the sub-array to the right.
If the remaining array to be searched is empty, then the key cannot be found in the array and a special "not found" indication is returned.
So, a binary search is a dichotomic divide and conquer search algorithm. It therefore takes logarithmic time to perform the search, as the number of remaining elements is halved in each iteration.

For sorted lists, which we can binary search, each "decision" compares your key to the middle element. If the key is greater, the search takes the right half of the list; if it is less, it takes the left half; if it matches, it returns the element at that position. You effectively halve the list with every decision, yielding O(log n).
Binary search, however, only works for sorted lists. For unsorted lists you can do a linear search starting with the first element, yielding a complexity of O(n).
O(logn) < O(n)
However, which approach is best depends entirely on how many searches you'll be doing, your inputs, etc.

For Binary search the prerequisite is a sorted array as input.
• As the list is sorted:
• Certainly we don't have to check every word in the dictionary to look up a word.
• A basic strategy is to repeatedly halve our search range until we find the value.
• For example, look for 5 in the list of 9 #s below.
v = 1 1 3 5 8 10 18 33 42
• We would first start in the middle: 8
• Since 5<8, we know we can look at just the first half: 1 1 3 5
• Looking at the middle # again, narrow down to 3 5
• Then we stop when we're down to one #: 5
Number of comparisons needed: 4 = floor(log(base 2)(9)) + 1, which is O(log(base 2) n)
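#include <vector>
using std::vector;

// Returns the index of val in the sorted vector v, or -1 if it is absent.
// v is taken by const reference so the call does not copy all n elements.
int binary_search(const vector<int>& v, int val) {
    int from = 0;
    int to = static_cast<int>(v.size()) - 1;
    while (from <= to) {
        int mid = from + (to - from) / 2;  // avoids overflow of from + to
        if (val == v[mid])
            return mid;
        else if (val > v[mid])
            from = mid + 1;
        else
            to = mid - 1;
    }
    return -1;
}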

CodeJam 2014: Solution for The Repeater

I participated in Code Jam and successfully solved the small input of The Repeater challenge, but I can't figure out an approach for multiple strings.
Can anyone give the algorithm used for multiple strings? For 2 strings (small input) I compare the strings character by character and perform operations to make them equal. However, this approach would time out for the large input.
Can someone explain the algorithm they used? I can see other users' solutions but can't figure out what they have done.

I can tell you my solution which worked fine for both small and large inputs.
First, we have to see if there is a solution. You do that by reducing all strings to their "simplest" form. If any of them does not match the others, then there is no solution.
e.g.
aaabbbc => abc
abbbbbcc => abc
abbcca => abca
If only the first two were given, then a solution would be possible. As soon as the third is thrown into the mix, it becomes impossible. The algorithm to do the "simplification" is to parse the string and collapse every run of repeated characters into a single character. As soon as a string does not equal the simplified form of the batch, bail out.
As for the actual solution to the problem, I simply converted the strings to a [letter, repeat] format. So for example
qwerty => 1q,1w,1e,1r,1t,1y
qqqwweeertttyy => 3q,2w,3e,1r,3t,2y
(mind you the outputs are internal structures, not actual strings)
Imagine now you have 100 strings, you have already passed the test that there is a solution and you have all strings into the [letter, repeat] representation. Now go through every letter and find the least 'difference' of repetitions you have to do, to reach the same number. So for example
1a, 1a, 1a => 0 diff
1a, 2a, 2a => 1 diff
1a, 3a, 10a => 9 diff (to bring everything to 3)
The way to do this (I'm pretty sure there is a more efficient way) is to try every target from the min number to the max number and calculate the sum of all diffs for each one. You are not guaranteed that the number will be one of the numbers in the set. For the last example, you would calculate the diff to bring everything to 1 (0,2,9 =11), then to 2 (1,1,8 =10), then to 3 (2,0,7 =9) and so on up to 10, and choose the minimum. Strings are limited to 1000 characters so this is an easy calculation. On my moderate laptop, the results were instant.
Repeat the same for every letter of the strings and sum everything up and that is your solution.
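A compact Python sketch of the whole procedure, under the assumption that returning None is an acceptable way to signal "no solution" (the names run_lengths and min_moves are invented for this example):

```python
from itertools import groupby


def run_lengths(s):
    """Split s into its simplest form and the length of each run."""
    groups = [(ch, len(list(run))) for ch, run in groupby(s)]
    letters = "".join(ch for ch, _ in groups)
    counts = [n for _, n in groups]
    return letters, counts


def min_moves(strings):
    """Minimum number of single-character edits, or None if impossible."""
    encoded = [run_lengths(s) for s in strings]
    form = encoded[0][0]
    if any(letters != form for letters, _ in encoded):
        return None  # simplified forms differ, so there is no solution
    total = 0
    # Each column holds the repeat counts of one letter across all strings.
    for column in zip(*(counts for _, counts in encoded)):
        total += min(
            sum(abs(c - target) for c in column)
            for target in range(min(column), max(column) + 1)
        )
    return total
```

With the examples above, min_moves(["a", "aaa", "a" * 10]) returns 9, and adding "abbcca" to "aaabbbc" and "abbbbbcc" makes the result None.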

This answer gives an example to explain why finding the median number of repeats produces the lowest cost.
Suppose we have values:
1 20 30 40 100
And we are trying to find the value which has shortest total distance to all these values.
We might guess the best answer is 50, with cost |50-1|+|50-20|+|50-30|+|50-40|+|50-100| = 159.
Split this into two sums, left and right, where left is the cost of all numbers to the left of our target, and right is the cost of all numbers to the right.
left = |50-1|+|50-20|+|50-30|+|50-40| = 50-1+50-20+50-30+50-40 = 109
right = |50-100| = 100-50 = 50
cost = left + right = 159
Now consider changing the value by x. Providing x is small enough such that the same numbers are on the left, then the values will change to:
left(x) = |50+x-1|+|50+x-20|+|50+x-30|+|50+x-40| = 109 + 4x
right(x) = |50+x-100| = 50 - x
cost(x) = left(x)+right(x) = 159+3x
So if we set x=-1 we will decrease our cost by 3, therefore the best answer is not 50.
The amount our cost changes per unit moved is given by the difference between the count of values to our left (4) and the count of values to our right (1).
Therefore, as long as these are different we can always decrease our cost by moving towards the median.
Therefore the median gives the lowest cost.
If there are an even number of points, such as 1,100 then all numbers between the two middle points will give identical costs, so any of these values can be chosen.
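The claim can be checked numerically with a few lines of Python. This is only a sanity check on the example values, not a proof; total_distance is a name made up for the sketch, while statistics.median_low is the standard-library function that returns the lower of the two middle values when the count is even:

```python
import statistics


def total_distance(values, target):
    """Sum of absolute differences between each value and target."""
    return sum(abs(v - target) for v in values)


values = [1, 20, 30, 40, 100]
median = statistics.median_low(values)
cost_at_50 = total_distance(values, 50)
cost_at_median = total_distance(values, median)
# Brute force over every candidate target for comparison.
best_cost = min(total_distance(values, t) for t in range(1, 101))
```

Here the median is 30, with a cost of 119 compared to 159 for the initial guess of 50, and the brute-force search finds no cheaper target.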

Since Thanasis already explained the solution, I'm providing my source code in Ruby here. It's really short (only 400B) and follows his algorithm exactly.
def solve(strs)
  form = strs.first.squeeze
  strs.map { |str|
    return 'Fegla Won' if form != str.squeeze
    str.chars.chunk { |c| c }.map { |arr|
      arr.last.size
    }
  }.transpose.map { |row|
    Range.new(*row.minmax).map { |n|
      row.map { |r|
        (r - n).abs
      }.reduce :+
    }.min
  }.reduce :+
end
gets.to_i.times { |i|
  result = solve gets.to_i.times.map { gets.chomp }
  puts "Case ##{i+1}: #{result}"
}
It uses the squeeze method on strings, which collapses each run of consecutive identical characters into a single character. This way, you just compare every squeezed line to the reference (variable form). If there's an inconsistency, you just return that Fegla Won.
Next you use the chunk method on the char array, which groups consecutive equal characters together. This way you can count them easily.
