Efficient bit-on enumeration when counting the number of cases - dynamic-programming

I am wondering about an efficient way of turning bits on ("bit-on") when counting the number of cases.
SITUATION
Check all the possible cases with exactly n/2 bits set out of n bits.
MY APPROACH
bool possible(int state)
This function counts the '1' bits in state; if the count equals n/2 it returns true, otherwise it returns false.
inline bool possible(int state){
    int cnt = 0;
    for(int t = 1; ; t *= 2){
        if(t == (1 << n)) break;
        if(cnt > n/2) break;
        if((t & state) == t) ++cnt;
    }
    if(cnt == n/2) return true;
    else return false;
}
void query()
it searches all possible states.
inline void query(){
    int tar = n/2;
    int ss = (1 << tar) - 1;
    int ee = (1 << n) - 1;
    for(int i = ss; i <= ee; ++i){
        if(possible(i)) process(i);
    }
}
I want to use bitmasks to cover all the possible cases with n/2 bits set out of n.
But I think the query() function is inefficient, because it visits every state in the range. Is there a more efficient way to approach this problem?
THE MEANING OF BIT-ON
For example, if n=4, then we have to bit-on exactly two indices (0-based):
0001 [fail]
0010 [fail]
0011 [indices of 0,1 bit-on]
0100 [fail]
0101 [indices of 0,2 bit-on]
0110 [indices of 1,2 bit-on]
0111 [fail]
1000 [fail]
1001 [indices of 0,3 bit-on]
1010 [indices of 1,3 bit-on]
1011 [fail]
1100 [indices of 2,3 bit-on]
1101 [fail]
1110 [fail]
1111 [fail]
So 4C2 = 6 cases are selected, and only the states
[0011, 0101, 0110, 1001, 1010, 1100] should be searched.

The problem can be solved recursively:
first you choose the position of your left-most "1",
then call the function recursively to place the remaining "1"s,
the recursion ends when there are no "1"s left to place.
That means you need to define a more general function that generates k "1"s in an n-bit number.
A handy trick to avoid returning and processing sub-results is passing an accumulator down. You can then call your process() function deep down in the recursion.
Example code in Python; it should translate to C easily enough.
def process(i):
    '''prints decimal and binary representation of i'''
    print(i, bin(i))

def helper1(length, ones, acc):
    if ones == 0:
        process(acc)
    else:
        for i in range(ones-1, length):  # iterates from ones-1 to length-1
            helper1(i, ones-1, acc | (1 << i))

def solution1(n):
    helper1(n, n >> 1, 0)
On a modern CPU this should run just fine. It can be "improved" though by using bitmasks instead of indices as parameters. However, the code gets harder to understand.
def helper2(first_mask, last_mask, acc):
    if last_mask == 0:
        process(acc)
    else:
        mask = last_mask
        while mask <= first_mask:
            helper2(mask >> 1, last_mask >> 1, acc | mask)
            mask = mask << 1

def solution2(n):
    helper2(1 << (n-1), 1 << (n//2 - 1), 0)
first_mask represents the left-most position where a "1" can be inserted
last_mask represents the right-most position where a "1" can be inserted (so that there is still enough room for the remaining "1"s). It doubles as a counter for remaining "1"s.
Bit-twiddling-hack
It just occurred to me that you can also solve the problem without recursion:
Start with the smallest number and find the next greater number in a loop.
To find a greater number you need to move a "1" to the position of the next "0" to the left and then move all the "1"s that are on the right of your new "1" to the very right.
While this sounds complicated, it can be done fast using bit-twiddling hacks.
def helper3(total, ones):
    if ones == 0:
        process(0)
        return
    i = (1 << ones) - 1
    while i < (1 << total):
        process(i)
        last_one_mask = (((i - 1) ^ i) >> 1) + 1
        temp = i + last_one_mask
        i = temp | (((temp ^ i) // last_one_mask) >> 2)

def solution3(n):
    helper3(n, n >> 1)
If your language has constant width integers you could get an overflow when calculating temp. To avoid bad behaviour you must abort the loop if temp<i.
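For example, a hypothetical helper3_guarded below sketches that abort check; the masking only emulates width-bit unsigned arithmetic, since plain Python integers never overflow:

def helper3_guarded(total, ones, width=32):
    if ones == 0:
        process(0)
        return
    wrap_mask = (1 << width) - 1
    i = (1 << ones) - 1
    while i < (1 << total):
        process(i)
        last_one_mask = (((i - 1) ^ i) >> 1) + 1
        temp = (i + last_one_mask) & wrap_mask   # a fixed-width add would wrap here
        if temp < i:                             # wrapped around: no next pattern, stop
            break
        i = temp | (((temp ^ i) // last_one_mask) >> 2)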

Related

Maximum Sum of XOR operation on a selected element with array elements with an optimize approach

Problem: choose an element from the array that maximizes the sum of XORing it with every element in the array.
Input for problem statement:
N=3
A=[15,11,8]
Output:
11
Approach:
Choosing 15 gives (15^15)+(15^11)+(15^8) = 0+4+7 = 11.
My Code for brute force approach:
def compute(N, A):
    ans = 0
    for i in A:
        xor_sum = 0
        for j in A:
            xor_sum += (i ^ j)
        if xor_sum > ans:
            ans = xor_sum
    return ans
The above approach gives the correct answer, but I want to optimize it to solve the problem in O(n) time complexity. Please help me to get this.
If you have integers with a fixed (constant) number of c bits then it should be possible, because O(c) = O(1). For simplicity I assume unsigned integers and n to be odd. If n is even then we sometimes have to check both paths in the tree (see solution below). You can adapt the algorithm to cover even n and negative numbers.
find max in array with length n O(n)
if max == 0 return 0 (just 0s in array)
find the position p of the most significant bit of max O(c) = O(1)
p = -1
while (max != 0)
    p++
    max /= 2
so 1 << p gives a mask for the highest set bit
build a tree where the leaves are the numbers and every level stands for a bit position: if there is an edge to the left from the root then there is a number that has bit p set, and if there is an edge to the right there is a number that has bit p not set; for the next level we have an edge to the left if there is a number with bit p - 1 set and an edge to the right if bit p - 1 is not set, and so on; this can be done in O(cn) = O(n)
go through the array and count how many times a bit at position i (i from 0 to p) is set => sum array O(cn) = O(n)
assign the root of the tree to node x
now for each i from p to 0 do the following:
if x has only one edge => x becomes its only child node
else if sum[i] > n / 2 => x becomes its right child node
else x becomes its left child node
in this step we choose the best path through the tree, the one that gives us the most ones when XORing, O(cn) = O(n)
XOR all the elements in the array with the value of x and sum them up to get the result; actually you could have built the result already in the step before, by adding (n - sum[i]) * (1 << i) to the result when going left and sum[i] * (1 << i) when going right, O(n)
All the sequential steps are O(n) and therefore overall the algorithm is also O(n).
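For what it's worth, here is a rough Python sketch of those steps (max_xor_sum and the dict-based trie are just my illustration; it only covers the simple case assumed above, i.e. non-negative values and no tie handling for even n):

def max_xor_sum(A):
    # choose x from A so that sum(x ^ a for a in A) is as large as possible
    n = len(A)
    if max(A) == 0:
        return 0
    p = max(A).bit_length() - 1                     # position of the most significant bit

    # count how many elements have bit i set, O(c*n)
    ones = [sum((a >> i) & 1 for a in A) for i in range(p + 1)]

    # build the bit tree (a binary trie) over bits p..0, O(c*n)
    root = {}
    for a in A:
        node = root
        for i in range(p, -1, -1):
            node = node.setdefault((a >> i) & 1, {})

    # walk the tree: prefer the bit of x that makes most elements differ,
    # falling back to the only existing edge when there is just one child
    node, result = root, 0
    for i in range(p, -1, -1):
        want = 0 if ones[i] > n // 2 else 1
        bit = want if want in node else 1 - want
        # bit i contributes once for every element whose bit differs from x's bit
        result += (ones[i] if bit == 0 else n - ones[i]) << i
        node = node[bit]
    return result

print(max_xor_sum([15, 11, 8]))                     # 11, matching the example above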

What is the time complexity of this division function (no divide or multiply operators used)?

I solved this leetcode question https://leetcode.com/problems/divide-two-integers/ . The goal is to get the quotient of the division of dividend by divisor without using a multiplication or division operator. Here is my solution:
def divide(dividend, divisor):
    """
    :type dividend: int
    :type divisor: int
    :rtype: int
    """
    sign = [1, -1][(dividend < 0) != (divisor < 0)]
    dividend, divisor = abs(dividend), abs(divisor)
    res = 0
    i = 0
    Q = divisor
    while dividend >= divisor:
        dividend = dividend - Q
        Q <<= 1
        res += (1 << i)
        i += 1
        if dividend < Q:
            Q = divisor
            i = 0
    if sign == -1:
        res = -res
    if res < -2**31 or res > 2**31 - 1:
        return 2**31 - 1
    return res
So I am having trouble analyzing the time complexity of this solution. I know it should be O(log(something)). Usually we say an algorithm is O(log(n)) when the input gets divided by 2 at each iteration, but here I multiply the divisor by 2 (Q <<= 1) at each iteration, so at each step I take a bigger step towards the solution. Obviously, for the same dividend, a bigger divisor makes my algorithm faster. Similarly, the bigger the dividend for the same divisor, the slower the run time.
My guess is that the equation governing the runtime of this algorithm is basically of the form O(dividend/divisor) (duh, that's division) with some logs in there to account for multiplying Q by 2 at each step (Q <<= 1)... I can't figure out what exactly.
EDIT:
When I first posted the question, the algorithm I posted was the one below; Alain Merigot's answer is based on that algorithm. The difference between the version on top and that one is that in the version above I never let my dividend go below 0, resulting in a faster run time.
def divide(dividend, divisor):
    """
    :type dividend: int
    :type divisor: int
    :rtype: int
    """
    sign = [1, -1][(dividend < 0) != (divisor < 0)]
    dividend, divisor = abs(dividend), abs(divisor)
    res = 0
    i = 0
    tmp_divisor = divisor
    while dividend >= divisor:
        old_dividend, old_res = dividend, res
        dividend = dividend - tmp_divisor
        tmp_divisor <<= 1
        res += (1 << i)
        i += 1
        if dividend < 0:
            dividend = old_dividend
            res = old_res
            tmp_divisor >>= 2
            i -= 2
    if sign == -1:
        res = -res
    if res < -2**31 or res > 2**31 - 1:
        return 2**31 - 1
    return res
Your algorithm is O(m^2) in the worst-case, where m is the number of bits in the result. In terms of the inputs, it would be O(log(dividend/divisor) ^ 2).
To see why, consider what your loop does. Let a=dividend, b=divisor. The loop subtracts b, 2b, 4b, 8b, ... from a as long as it's big enough, then repeats this sequence again and again until a<b.
It can be equivalently written as two nested loops:
while dividend >= divisor:
    Q = divisor
    i = 0
    while Q <= dividend:
        dividend = dividend - Q
        Q <<= 1
        res += (1 << i)
        i += 1
For each iteration of the outer loop, the inner loop will perform less iterations because dividend is smaller. In the worst case, the inner loop will do only one iteration less for each iteration of the outer loop. This happens when the result is 1+3+7+15+...+(2^n-1) for some n. In this case, it can be shown that n = O(log(result)), but the total number of inner loop iterations is O(n^2), i.e. quadratic in the size of the result.
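To see that quadratic growth empirically, the inner loop of the rewrite above can be instrumented with a step counter (a quick sketch, not part of either solution):

def count_inner_iterations(dividend, divisor):
    steps = res = 0
    while dividend >= divisor:
        Q, i = divisor, 0
        while Q <= dividend:
            dividend -= Q
            Q <<= 1
            res += 1 << i
            i += 1
            steps += 1
    return res, steps

# worst-case inputs: result = 1 + 3 + 7 + ... + (2**n - 1) = 2**(n+1) - n - 2, divisor 1
for n in range(1, 6):
    print(count_inner_iterations(2**(n + 1) - n - 2, 1))   # inner steps grow as n*(n+1)/2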
To improve this to be linear in the size of the result, first calculate the largest needed values of Q and i. Then work backwards from that, subtracting 1 from i and shifting Q right each iteration. This guarantees no more than 2n iterations total.
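A sketch of that improvement (divide_linear is a made-up name; positive operands assumed):

def divide_linear(dividend, divisor):
    Q, i = divisor, 0
    while (Q << 1) <= dividend:     # grow Q to the largest needed divisor * 2**i
        Q <<= 1
        i += 1
    res = 0
    while i >= 0:                   # walk back down, using each power-of-two multiple at most once
        if dividend >= Q:
            dividend -= Q
            res += 1 << i
        Q >>= 1
        i -= 1
    return res

print(divide_linear(103, 7))        # 14, i.e. 103 // 7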
Worst case complexity is easy to find.
Every iteration generates a bit of the result, and the number of iterations is equal to the number of bits in the quotient.
When divisor=1, quotient=dividend, and in that case the number of iterations is equal to the number of bits in dividend after the leading (most significant) 1. It is maximized when dividend=2^(n-1)+k, where n is the number of bits and k is any number such that 1≤k<2^(n-1). This will obviously be the worst case.
After the first iteration, dividend=dividend-divisor (=dividend-1) and divisor=2^1.
After iteration m, divisor=2^m and dividend=dividend-(1+2^1+..+2^(m-1))=dividend-(2^m-1).
Iterations stop when dividend is <0. As dividend=2^(n-1)+k, with k>0, this happens for m=n.
Hence, the number of steps in the worst case is n, and the complexity is linear in the number of bits of the dividend.

Am I doing this while loop correctly? [duplicate]

This question already has answers here:
How do I plot this logarithm without a "while True" loop?
(2 answers)
Closed 3 years ago.
I am trying to plot the logarithm of twelve tone equal temperament on a scale of hertz.
Is this while loop that breaks in the middle the best way to iterate all of the audible notes in the scale? Could I do the same thing more accurately, or with less code?
I do not want to use a for loop because then the range would be defined arbitrarily, not by the audible range.
When I try to use "note > highest or note < lowest" as the condition for the while loop, it doesn't work. I'm assuming that's because of the scope of where "note" is defined.
highest = 20000
lowest = 20
key = 440
TET = 12
equal_temper = [key]
i = 1
while True:
    note = key * (2**(1/TET))**i
    if note > highest or note < lowest:
        break
    equal_temper.append(note)
    i += 1
i = 1
while True:
    note = key * (2**(1/TET))**-i
    if note > highest or note < lowest:
        break
    equal_temper.append(note)
    i += 1
equal_tempered = sorted(equal_temper)
for i in range(len(equal_temper)):
    print(equal_tempered[i])
The code returns a list of pitches (in hertz) that are very close to other tables I have looked at, but the higher numbers are further off. Setting a while loop to loop indefinitely seems to work, but I suspect there may be a more elegant way to write the loop.
As it turns out, you actually know the number of iterations! At least you can calculate it by doing some simple math. Then you can use a list comprehension to build your list:
import math
min_I = math.ceil(TET*math.log2(lowest/key))
max_I = math.floor(TET*math.log2(highest/key))
equal_tempered = [key * 2 ** (i / TET) for i in range(min_I, max_I + 1)]
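With the question's values the bounds come out as follows (a quick sanity check, reusing the names above):

import math

key, TET, lowest, highest = 440, 12, 20, 20000
min_I = math.ceil(TET * math.log2(lowest / key))    # -53, i.e. 440 * 2**(-53/12) ≈ 20.6 Hz
max_I = math.floor(TET * math.log2(highest / key))  #  66, i.e. 440 * 2**(66/12) ≈ 19912 Hz
print(max_I - min_I + 1)                            # 120 notes in the audible range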
You can use the piano key formula:
freq_n = freq_ref * (2 ** (1/12)) ** (n - a)
The reference note is A4, 440 Hz and 49th key on the piano:
def piano_freq(key_no: int) -> float:
    ref_tone = 440
    ref_no = 49
    freq_ratio = 2 ** (1/12)
    return ref_tone * freq_ratio ** (key_no - ref_no)
Then you can do things like:
print(piano_freq(40)) # C4 = 261.6255653005985
print([piano_freq(no) for no in range(49, 49+12)]) # A4 .. G#5
Based on: https://en.wikipedia.org/wiki/Piano_key_frequencies

Calculating checksum for ICMP echo request in Python

I am trying to implement a ping server in Python, and I am going through Pyping's source code as a reference: https://github.com/Akhavi/pyping/blob/master/pyping/core.py
I am not able to understand the calculate_checksum function that has been implemented to calculate the checksum of the ICMP echo request. It is implemented as follows:
def calculate_checksum(source_string):
    countTo = (int(len(source_string) / 2)) * 2
    sum = 0
    count = 0

    # Handle bytes in pairs (decoding as short ints)
    loByte = 0
    hiByte = 0
    while count < countTo:
        if (sys.byteorder == "little"):
            loByte = source_string[count]
            hiByte = source_string[count + 1]
        else:
            loByte = source_string[count + 1]
            hiByte = source_string[count]
        sum = sum + (ord(hiByte) * 256 + ord(loByte))
        count += 2

    # Handle last byte if applicable (odd-number of bytes)
    # Endianness should be irrelevant in this case
    if countTo < len(source_string):  # Check for odd length
        loByte = source_string[len(source_string) - 1]
        sum += ord(loByte)

    sum &= 0xffffffff  # Truncate sum to 32 bits (a variance from ping.c, which
                       # uses signed ints, but overflow is unlikely in ping)

    sum = (sum >> 16) + (sum & 0xffff)  # Add high 16 bits to low 16 bits
    sum += (sum >> 16)                  # Add carry from above (if any)
    answer = ~sum & 0xffff              # Invert and truncate to 16 bits
    answer = socket.htons(answer)

    return answer
sum &= 0xffffffff is used for truncating the sum to 32 bits. However, what happens to the extra bit (the 33rd bit)? Shouldn't that be added to the sum as a carry? Also, I am not able to understand the code after this.
I read the RFC1071 documentation (http://www.faqs.org/rfcs/rfc1071.html) that explains how to implement the checksum, but I haven't been able to understand much.
Any help would be appreciated. Thanks!
I was finally able to figure out the working of the calculate_checksum function, and I have tried to explain it below.
The checksum calculation is as follows (as per RFC1071):
Adjacent octets in the source_string are paired to form 16-bit integers, and the 1's complement sum of these integers is calculated. In case of an odd number of octets, pairs are created out of the first n-1 octets and added, and the remaining octet is added to the sum.
The resulting sum is truncated to 16 bits (carry bits are to be taken care of) and the checksum is calculated by taking its 1's complement. The final checksum should be 16 bits long.
Let's take an example.
If the checksum is to be computed over the sequence of octets [A, B, C, D, E], the pairs created would be [A, B] and [C, D], with the remaining octet E. The pair [A, B] is turned into a 16-bit integer as follows:
A*256 + B, where A and B are the octets.
Say A is 11001010 and B is 00010001, then A*256 + B = 1100101000010001, which is just the concatenation of the two octets.
The 1's complement sum is thus computed as follows:
sum = [A,B] +' [C,D] +' E, where +' represents 1's complement addition and [A,B] denotes the 16-bit word A*256 + B
Now coming back to the code, everything before the line sum &= 0xffffffff calculates the 1's complement sum that we have calculated before.
sum &= 0xffffffff
is used for truncating the sum to 32 bits, although the sum exceeding 32 bits is unlikely in ping, as the size of the source_string is not very large
(source_string = header (8 bytes) + payload (variable length)).
sum = (sum >> 16) + (sum & 0xffff)
This piece of code is implemented for the case when the sum is greater than 16-bits. The sum is broken down into 2 parts:
(sum >> 16): the higher order 16-bits
(sum & 0xffff): the lower order 16-bits
and then these two parts are added. The final result can be 16 bits or longer than 16 bits.
sum += (sum >> 16)
This line is used in case the resulting sum from the previous computation is longer than 16-bits and is used to take care of the carry, similar to the previous line.
Finally, the 1's complement is calculated and truncated to 16 bits. The socket.htons() function is used to keep the byte order sent to the network consistent with the architecture of your device (little-endian or big-endian).
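As a side note, the same pair-and-fold logic can be written compactly over Python 3 bytes objects. The following is a generic RFC 1071 sketch (big-endian word order, no final htons() call), so it is illustrative rather than a drop-in replacement for Pyping's function:

def internet_checksum(data):
    # RFC 1071: one's complement of the one's complement sum of 16-bit words
    if len(data) % 2:                           # odd number of octets: pad with a zero byte
        data += b'\x00'
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # each octet pair becomes one 16-bit word
    while total >> 16:                          # fold any carries back into the low 16 bits
        total = (total & 0xffff) + (total >> 16)
    return ~total & 0xffff                      # invert and truncate to 16 bits

print(hex(internet_checksum(b'\x01\x02\x03')))  # 0xfbfd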

efficiently generating all integers within a binary mask

Suppose I have some binary mask mask. (e.g. 0b101011011101)
Is there an efficient method of computing all integers k such that k & mask == k? (where & is the bitwise AND operator) (alternatively, k & ~mask == 0)
If mask has m ones, then there are exactly 2^m numbers that satisfy this property, so it seems like there should be some kind of process that is O(2^m). Enumerating the integers less than the mask is wasteful (though easy to eliminate values that do not apply).
I figured it out... you can identify all the single-bit patterns as follows, since the least significant 1 bit of any integer k is cleared when calculating k & (k-1):
def onebits(x):
    while x > 0:
        # find least significant 1 bit
        xprev = x
        x &= x-1
        yield x ^ xprev
and then I can use the ruler function to XOR in various combinations of 1 bits to emulate which bits of a counter are toggled each time:
def maskcount(mask):
    maskbits = []
    m = 0
    for ls1 in onebits(mask):
        m ^= ls1
        maskbits.append(m)

    # ruler function modified from
    # http://lua-users.org/wiki/LuaCoroutinesVersusPythonGenerators
    def ruler(k):
        for i in range(k):
            yield i
            for x in ruler(i): yield x

    x = 0
    yield x
    for k in ruler(len(maskbits)):
        x ^= maskbits[k]
        yield x
which looks like this:
>>> for x in maskcount(0xc05):
... print format(x, '#016b')
0b00000000000000
0b00000000000001
0b00000000000100
0b00000000000101
0b00010000000000
0b00010000000001
0b00010000000100
0b00010000000101
0b00100000000000
0b00100000000001
0b00100000000100
0b00100000000101
0b00110000000000
0b00110000000001
0b00110000000100
0b00110000000101
An easy way to solve the problem is to find the bits that are set in mask, and then simply count with i, but then replacing the bits of i with corresponding bits from the mask.
def codes(mask):
    bits = filter(None, (mask & (1 << i) for i in xrange(mask.bit_length())))
    for i in xrange(1 << len(bits)):
        yield sum(b for j, b in enumerate(bits) if (i >> j) & 1)

print list(codes(39))
That gives you O(log(N)) work per iteration (where N is the number of integers generated, i.e. 2^m for m bits set in mask).
It's possible to be more efficient, and do O(1) work per iteration by counting using gray codes. With gray code counting, only a single bit changes each iteration so it's possible to efficiently update the current value, v. Obviously this is much harder to understand than the simple solution above.
def codes(mask):
    bits = filter(None, (mask & (1 << i) for i in xrange(mask.bit_length())))
    blt = dict((1 << i, b) for i, b in enumerate(bits))
    p, v = 0, 0
    for i in xrange(1 << len(bits)):
        n = i ^ (i >> 1)
        v ^= blt.get(p ^ n, 0)
        p = n
        yield v

print list(codes(39))
A disadvantage of using gray codes is that the results are not returned in numeric order. But luckily that wasn't a condition in the question!
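For example, with mask = 39 (binary 100111) both versions of codes() visit the same 16 values, just in a different order; running the two print statements above gives:

# simple version:    [0, 1, 2, 3, 4, 5, 6, 7, 32, 33, 34, 35, 36, 37, 38, 39]
# gray-code version: [0, 1, 3, 2, 6, 7, 5, 4, 36, 37, 39, 38, 34, 35, 33, 32]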

Resources