Count the number of repeatable binary strings of length n

The problem is to find the number of repeatable binary strings of length n. A binary string is repeatable if it can be obtained by taking some shorter substring of it and repeating that substring two or more times.
Example
"1010" is a repeatable string, as it can be obtained from "10" by repeating it 2 times.
"1001" is not a repeatable string, as it cannot be obtained from any shorter substring of "1001" by repeating it any number of times.
The solution I thought of is to generate all possible binary strings of length n and check whether each one is repeatable using the KMP algorithm, but this is not feasible even for small n like n=40, since there are 2^40 strings to check.
The second approach I thought of is:
for each divisor k of n (with k < n), take every binary string of length k and repeat it n/k times.
Example: for n = 6 we have the divisors 1, 2, 3.
For length 1 we have 2 substrings, "1" and "0", each repeated 6 times, so "111111" and "000000" are repeatable strings.
For length 2 we have 4 substrings, "00" "01" "10" "11", repeated 3 times, so "000000" "010101" "101010" and "111111" are repeatable strings.
Similarly, for length 3 we have 8 strings that are repeatable.
Sum up the strings generated for all divisors and subtract the duplicates.
In the above example the strings "111111" and "000000" were counted 3 times, once for each divisor, so clearly I am overcounting. I need to subtract the duplicates, but I can't think of any way to remove them from my count. How can I do that?
Am I headed in the right direction, or do I need another approach?

When you use the second scheme, drop the substrings that are themselves made of a repeated block. For instance, 00 and 11 are repeats of 0 and 1 respectively, so for length 2 only consider "01" and "10",
for length 3 only consider "001", "010", "011", "100", "101", "110"
...
Generally, reading a string of length n as a binary number:
for odd length n, remove 0 and (2^n)-1,
for even length n, remove the multiples of (2^(n/2)+1): 0, (2^(n/2)+1), (2^(n/2)+1)*2, ..., (2^n)-1
and if n is divisible by 3, remove the multiples of (1+2^(n/3)+2^(2n/3)): (1+2^(n/3)+2^(2n/3)), (1+2^(n/3)+2^(2n/3))*2, ...
Continue this for every divisor.

One idea: if for each proper divisor d of n we count only the length-d strings that are not themselves made from repeated substrings, then the strings built from repeated blocks are already accounted for by the divisors of d, and nothing is counted twice. Let f(n) be the number of repeatable strings of length n:
f(1) = 0
f(n) = sum(2^d - f(d)), where 1 <= d < n and d divides n
...meaning we sum, over the proper divisors d of n, only the length-d strings that are not made from repeated substrings.
f(2) = 2^1-0
f(3) = 2^1-0
f(4) = 2^1-0 + 2^2-2
f(6) = 2^1-0 + 2^2-2 + 2^3-2
...
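The recurrence above can be sketched in Python and cross-checked against a brute-force count for small n. The function names `count_repeatable` and `brute_force` are illustrative choices, not part of the original answer.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def count_repeatable(n):
    """f(n) = sum of (2^d - f(d)) over the proper divisors d of n; f(1) = 0."""
    return sum(2 ** d - count_repeatable(d) for d in range(1, n) if n % d == 0)


def brute_force(n):
    """Count repeatable strings by testing every binary string of length n."""
    total = 0
    for bits in product("01", repeat=n):
        s = "".join(bits)
        # s is repeatable if a prefix of proper-divisor length tiles it exactly
        if any(s == s[:d] * (n // d) for d in range(1, n) if n % d == 0):
            total += 1
    return total


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, count_repeatable(n), brute_force(n))
```

The memoized recurrence only visits divisors of n, so it stays cheap for sizes such as n = 40 where enumerating all strings is out of reach.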

Related

Python: How do I select multiples that contain a specific number only?

For example, among all the multiples of the factor digit, I'd like to find the number of integers that have the digit "d" in one of the 2 digits of the integer. n is the number of multiples I'd like to search through.
def find_integers():
    Factor = int(input("Enter Factor-digit:"))
    d = int(input("Enter must-have-digit:"))
    n = int(input("Enter the total number of integers:"))
    for i in range(0, n):
        Multiples = (Factor * i)
How do I carry on to pick out the multiples that have the digit "d" in them?
multiples=[x*2 for x in range(0,1000000)]
# get 1 million multiples of 2
print(list(filter(lambda x:"4" in str(x),multiples)))
# this only prints the values in multiples if the lambda function returns true
# it only returns true if the string "4" is in the string representation of the number
I hope this helps
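The same string-membership test can be applied to the question's own inputs. This is a minimal sketch that assumes "n" means "look at the first n multiples, starting from factor*0"; the function name `multiples_with_digit` is hypothetical.

```python
def multiples_with_digit(factor, digit, n):
    """Return those of the first n multiples of factor whose decimal form contains digit."""
    wanted = str(digit)
    # factor * i for i = 0 .. n-1, kept only if the digit appears in its string form
    return [factor * i for i in range(n) if wanted in str(factor * i)]


if __name__ == "__main__":
    matches = multiples_with_digit(2, 4, 13)
    print(matches)       # the matching multiples
    print(len(matches))  # how many there are
```

Calling `len()` on the result gives the count the question asks for.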

Given an integer N greater than zero, how many sequences of 1's and 2's are there

Given an integer N greater than zero,
how many sequences of 1's and 2's are there such that the sum of the numbers in the sequence equals N?
(It is not necessary that every sequence contains both 1 and 2.)
Example:
for N = 2 ; 11,2 => ans = 2 sequences of 1's and 2's
for N = 3 ; 111,12,21 => ans = 3 sequences of 1's and 2's
One can think of a recursive formula, for instance by characterizing the last digit. A sequence summing to N+1 can be obtained by appending a 1 to a sequence summing to N, or a 2 to a sequence summing to N-1. So it gives:
R(N+1) = R(N) + R(N-1)
So we have a Fibonacci-type sequence with R(1)=1 and R(2)=2.
See https://en.wikipedia.org/wiki/Fibonacci_number
It gives
R(N) = (phi^(N+1) - psi^(N+1)) / sqrt(5)
where phi = (1 + sqrt(5)) / 2 and psi = (1 - sqrt(5)) / 2.
So you can program the answer using a constant number of operations.
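As a rough sketch, the recurrence R(N+1) = R(N) + R(N-1) with R(1)=1 and R(2)=2 can be coded directly, and compared with the closed form from the linked Fibonacci article (Binet's formula, shifted by one index). The function names are illustrative, and the floating-point version is only assumed to be reliable for moderate N, since rounding errors eventually exceed 0.5.

```python
from math import sqrt


def count_sequences(n):
    """R(n) by iterating the recurrence; exact for any n >= 1."""
    prev, cur = 1, 1  # R(0) = 1 (the empty sequence) and R(1) = 1
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur


def count_sequences_closed_form(n):
    """R(n) = F(n+1) via Binet's formula; floating point, so moderate n only."""
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return round((phi ** (n + 1) - psi ** (n + 1)) / sqrt(5))


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, count_sequences(n), count_sequences_closed_form(n))
```

The iterative version uses exact integers and is the safer choice when N is large.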

Number of substrings with given constraints

I am given a sorted string and I wish to count the number of substrings (not necessarily contiguous, i.e. subsequences) that are possible with the following constraints:
All the letters in the substring should be in sorted order.
The substring must contain exactly 1 vowel.
The length of the substring should be greater than or equal to 3.
For example:
for "aabbc",
we have 3 substrings, "abc", "abb" and "abbc", that match the above constraints. So here the answer is 3.
How do I go about for a general string?
I have tried this for 2-3 hours, but couldn't find a proper way. I was asked this question in a programming coding round today and I fear the same question would be asked in the interview tomorrow. Even hints or approach would be appreciated.
Suppose we have k distinct vowels, and an array A giving the histogram of the non-vowels (i.e. A[0] is the number of occurrences of the first non-vowel, A[1] is the number of occurrences of the second non-vowel).
Then (ignoring the length constraint) we have k choices for the vowel, and (A[0]+1)*(A[1]+1)*(A[2]+1)*... choices for the remaining letters (each non-vowel can be used 0,1,2,...,A[i] times).
This overcounts by k (the single-letter cases) and by k*len(A) (the two-letter cases), so simply subtract these from the total.
Example Python code:
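from collections import Counter

s = 'aabbc'
vowels = 'aeiou'
C = Counter(s)
t = 1            # product of (count + 1) over the distinct consonants
vowel_count = 0  # number of distinct vowels
cons_count = 0   # number of distinct consonants
for letter, count in C.items():
    if letter in vowels:
        vowel_count += 1
    else:
        cons_count += 1
        t *= count + 1
# subtract the vowel-only case (1) and the vowel-plus-one-consonant cases (cons_count)
print(vowel_count * (t - cons_count - 1))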
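To gain some confidence in the counting argument, the formula can be compared with a brute-force enumeration on short inputs. This is only a sketch: both function names are hypothetical, the input is assumed to be already sorted, and the brute force counts distinct strings, which is how the "aabbc" example is counted.

```python
from collections import Counter
from itertools import combinations


def count_by_formula(s, vowels="aeiou"):
    """k * (product of (A[i] + 1) - len(A) - 1), as described in the answer."""
    counts = Counter(s)
    k = sum(1 for ch in counts if ch in vowels)
    cons = [c for ch, c in counts.items() if ch not in vowels]
    t = 1
    for c in cons:
        t *= c + 1
    return k * (t - len(cons) - 1)


def count_by_brute_force(s, vowels="aeiou"):
    """Enumerate every subsequence of length >= 3 and keep distinct ones with one vowel."""
    found = set()
    for r in range(3, len(s) + 1):
        for idx in combinations(range(len(s)), r):
            sub = "".join(s[i] for i in idx)
            if sum(ch in vowels for ch in sub) == 1:
                found.add(sub)
    return len(found)


if __name__ == "__main__":
    for text in ("aabbc", "abcde"):
        print(text, count_by_formula(text), count_by_brute_force(text))
```

The brute force is exponential in the string length, so it is useful only as a check on small cases.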

Subsequences whose sum of digits is divisible by 6

Say I have a string whose characters are all digits in the range [0 - 9], e.g. "2486". Now I want to find all the subsequences whose sum of digits is divisible by 6. E.g. in "2486" the subsequences are "6", "246" (2 + 4 + 6 = 12 is divisible by 6), "486" (4 + 8 + 6 = 18 is divisible by 6) etc. I know we can do this by generating all 2^n combinations, but that's very costly. What is the most efficient way to do this?
Edit:
I found the following solution somewhere on Quora.
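#include <stdio.h>
#include <string.h>

/* Assumed bounds; the original snippet does not define them. */
#define MAXLEN 1000
#define MAXN 100

int len, n, ar[MAXLEN], dp[MAXLEN][MAXN];

/* Number of ways to finish from position idx when the value built so far is m (mod n). */
int fun(int idx, int m)
{
    if (idx == len)
        return (m == 0);
    if (dp[idx][m] != -1)
        return dp[idx][m];
    int ans = fun(idx + 1, m);                     /* skip ar[idx] */
    ans += fun(idx + 1, (m * 10 + ar[idx]) % n);   /* take ar[idx] */
    return dp[idx][m] = ans;
}

int main()
{
    // input len , n , array
    memset(dp, -1, sizeof(dp));
    printf("%d\n", fun(0, 0));
    return 0;
}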
Can someone please explain the logic behind the expression '(m*10+ar[idx])%n'? Why is m multiplied by 10 here?
Say you have a sequence of 16 digits. You could generate all 2^16 subsequences and test them, which is 65536 operations.
Or you could take the first 8 digits and generate the 2^8 possible subsequences, and sort them based on the result of their sum modulo 6, and do the same for the last 8 digits. This is only 512 operations.
Then you can generate all subsequences of the original 16 digit string that are divisible by 6 by taking each subsequence of the first list with a modulo value equal to 0 (including the empty subsequence) and concatenating it with each subsequence of the last list with a modulo value equal to 0.
Then take each subsequence of the first list with a modulo value equal to 1 and concatenate it with each subsequence of the last list with a modulo value equal to 5. Then 2 with 4, 3 with 3, 4 with 2 and 5 with 1.
So after an initial cost of 512 operations you can generate just those subsequences whose sum is divisible by 6. You can apply this algorithm recursively for larger sequences.
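The split-and-combine idea above can be sketched in Python for a single level of splitting. The helper names `residue_buckets` and `divisible_subsequences` are hypothetical, and the empty subsequence is kept inside each half (so that a result may come entirely from one half) but excluded from the final output.

```python
from itertools import combinations


def residue_buckets(digits, mod=6):
    """Group every subsequence of digits (including the empty one) by digit sum mod `mod`."""
    buckets = {r: [] for r in range(mod)}
    for size in range(len(digits) + 1):
        for idx in combinations(range(len(digits)), size):
            sub = "".join(digits[i] for i in idx)
            buckets[sum(int(ch) for ch in sub) % mod].append(sub)
    return buckets


def divisible_subsequences(s, mod=6):
    """Pair residue r from the left half with residue (mod - r) % mod from the right half."""
    half = len(s) // 2
    left = residue_buckets(s[:half], mod)
    right = residue_buckets(s[half:], mod)
    result = []
    for r in range(mod):
        for a in left[r]:
            for b in right[(mod - r) % mod]:
                if a or b:  # skip the case where both halves are empty
                    result.append(a + b)
    return result


if __name__ == "__main__":
    print(sorted(divisible_subsequences("2486")))
```

Only complementary buckets are ever combined, so no candidate that fails the divisibility test is generated after the two halves have been bucketed.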
Create an array with a 6-bit bitmap for each position in the string. Work from right to left and fill the array so that bit r of the bitmap at a position is set when some subsequence of the digits from that position onward has a digit sum of r modulo 6. Each bitmap can be computed from the bitmap just after the current position. If you see a 3 and the bitmap just after the current position is 010001 (bits 0 to 5, left to right) then sums 1 and 5 are already accessible by just skipping the 3. Using the 3, sums 4 and 2 are now available, so the new bitmap is 011011.
Now do a depth-first search for subsequences from left to right, with the choice at each character being either to take that character or not. As you do this, keep track of the mod 6 sum of the characters taken so far. Use the bitmaps to work out whether there is a subsequence to the right of that position that, added to the sum so far, yields zero. Carry on as long as you can see that the current sum leads to a subsequence of sum zero, otherwise stop and backtrack.
The first stage has cost linear in the size of the input (for fixed values of 6). The second stage has cost linear in the number of subsequences produced. In fact, if you have to actually write out the subsequences visited (E.g. by maintaining an explicit stack and writing out the contents of the stack) THAT will be the most expensive part of the program.
The worst case is of course input 000000...0000 when all 2^n subsequences are valid.
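A minimal sketch of the two stages described above, under one stated assumption: the empty subsequence is counted as reachable (bit 0 is always set), which differs slightly from the 010001 example but keeps the pruning test uniform. The names `reachable_masks` and `enumerate_divisible` are illustrative.

```python
def reachable_masks(s, mod=6):
    """masks[i] has bit r set if some (possibly empty) subsequence of s[i:] sums to r mod `mod`."""
    full = (1 << mod) - 1
    masks = [0] * (len(s) + 1)
    masks[len(s)] = 1  # past the end only the empty subsequence exists, sum 0
    for i in range(len(s) - 1, -1, -1):
        d = int(s[i]) % mod
        after = masks[i + 1]
        # rotate the bitmap left by d: bit r moves to bit (r + d) % mod (taking s[i])
        rotated = ((after << d) | (after >> (mod - d))) & full
        masks[i] = after | rotated  # either skip s[i] or take it
    return masks


def enumerate_divisible(s, mod=6):
    """Depth-first search that only follows branches able to reach a sum of 0 mod `mod`."""
    masks = reachable_masks(s, mod)
    out = []

    def dfs(i, total, chosen):
        need = (-total) % mod
        if not (masks[i] >> need) & 1:
            return  # nothing to the right can bring the sum back to 0
        if i == len(s):
            if chosen:  # leave out the empty subsequence
                out.append("".join(chosen))
            return
        chosen.append(s[i])
        dfs(i + 1, (total + int(s[i])) % mod, chosen)  # take s[i]
        chosen.pop()
        dfs(i + 1, total, chosen)                      # skip s[i]

    dfs(0, 0, [])
    return out


if __name__ == "__main__":
    print(enumerate_divisible("2486"))
```

Subsequences are reported per choice of positions, so an input such as "000" yields repeated strings, consistent with the worst case mentioned above.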
I'm pretty sure a user named amit recently answered a similar question, for combinations rather than subsequences and with a divisor of 4, although I can't find it right now. His answer was to create, in this case, six arrays (call them Array_i) in O(n), where Array_i contains the elements congruent to i modulo 6. With subsequences we also need a way to record element order. For example, in your case of 2486, our arrays could be:
Array_0 = [null,null,null,6]
Array_1 = []
Array_2 = [null,4,null,null]
Array_3 = []
Array_4 = [2,null,8,null]
Array_5 = []
Now just cross-combine the appropriate arrays, maintaining element order: Array_0, Array_2 & Array_4, Array_0 & any other combination of arrays:
6, 24, 48, 246, 486

Substrings and Subsequences

In a string s of length n, how many substrings and subsequences can I have? A substring is obtained by deleting any prefix and any suffix from s, while a subsequence is any string formed by deleting zero or more, not necessarily consecutive, positions of s.
Assuming you are not ignoring duplicates:
substrings = n(n+1)/2
number of substrings of length 1 = n
number of substrings of length 2 = n-1
number of substrings of length 3 = n-2
....
number of substrings of length n = n - (n-1) = 1
This generalizes to the sum of the numbers from 1 to n.
subsequences = 2^n
Think of the string as a bit array: each character is either included in the subsequence or not, so there are 2^n combinations (including the empty subsequence).
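Both counts can be checked by direct enumeration on a short string. This is a small sketch with illustrative function names; it counts by position, so duplicates are not merged, and the subsequence count includes the empty one.

```python
from itertools import combinations


def count_substrings(s):
    """Count non-empty contiguous substrings as (start, end) pairs."""
    n = len(s)
    return sum(1 for i in range(n) for j in range(i + 1, n + 1))


def count_subsequences(s):
    """Count subsequences as subsets of positions, including the empty subset."""
    n = len(s)
    return sum(1 for size in range(n + 1) for _ in combinations(range(n), size))


if __name__ == "__main__":
    text = "abcde"
    n = len(text)
    print(count_substrings(text), n * (n + 1) // 2)
    print(count_subsequences(text), 2 ** n)
```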

Resources