Generate a subset without disturbing the order - subset

Given a set of letters S = { a, b, c, d, e}
How do I generate the following subsets if the input k = 3?
abc
abd
abe
acd
ace
ade
bcd
bce
bde
cde
The subsets must not violate the order of the letters as given in S.
What is this problem called, and how can it be solved?

Consider a binary string with 5 places (or as many places as there are characters in S), consisting of k 1's and (|S|-k) 0's.
Generate all (|S|! / (k! * (|S|-k)!)) distinct permutations of that string.
For each one, output the characters of S at the positions of the 1's. Reading the positions from left to right never violates the order of the characters in S.
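A minimal Python sketch of the binary-string idea above. The function name `ordered_subsets` is made up for illustration; `itertools.combinations` is used to pick the positions of the 1's, which is one way (not the only way) to enumerate the binary strings with exactly k ones. These selections are the k-combinations of S.

```python
from itertools import combinations

def ordered_subsets(letters, k):
    """Return every k-letter selection from `letters`, keeping the original order."""
    n = len(letters)
    results = []
    # Each choice of k positions corresponds to one binary string with k ones.
    for ones in combinations(range(n), k):
        mask = ["1" if i in ones else "0" for i in range(n)]
        # Output the characters sitting under the 1's, left to right.
        results.append("".join(ch for ch, bit in zip(letters, mask) if bit == "1"))
    return results

print(ordered_subsets("abcde", 3))
```

For S = "abcde" and k = 3 this produces the ten strings listed in the question, in the same order.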

Related

Unique numbers with missing digits

I have this problem that I must solve in time that is polynomial in N, K and D given below:
Let N, K be some natural numbers.
Then a, b, c ... are N numbers of exactly K digits each.
a, b, c ... contain only the digits 1 and 2 in some order given by the input.
However, only D digits of each number are visible; the rest are hidden (a hidden digit is written as the character "?").
There may be different numbers such that one or more of a, b, c ... are generalizations of the said number:
e.g.
2?122 is a generalization for 21122 and 22122
2?122 is not a generalization for 11111
12??? and ?21?? are both generalizations for 12112
???22 and ???11 cannot be generalizations of the same number
Basically, some number is a generalization of the other if the latter can be one of the "unhidden" versions of the former.
Question:
How many different numbers are there such that at least one of a, b, c, ... is their generalization?
Quick Reminder:
N = nº of numbers
K = nº of digits in each number
D = nº of visible digits in each number
Conditions & Limitations:
N, K, D are natural numbers
1 ≤ N
1 ≤ D < K
Input / Output snippets for verification of the algorithm:
Input:
N = 3, K = 5, D = 3
112??
?122?
1?2?1
Output:
8
Explanation:
The numbers are 11211, 11212, 11221, 11222, 12211, 12221, 21221, 21222, which are 8 numbers.
11211, 11212, 11221, 11222 are the generalizations of 112??
11221, 11222, 21221, 21222 are the generalizations of ?122?
11211, 11221, 12211, 12221 are the generalizations of 1?2?1
Input:
N = 2, K = 3, D = 1
1??
?2?
Output:
6
Explanation:
The numbers are 111, 112, 121, 122, 221, 222, which are 6 numbers.
From my calculations, I found out that there are 2^(K-D) possible numbers in total that have a as their generalization, 2^(K-D) possible numbers in total that have b as their generalization etc., leaving me with N*2^(K-D) numbers.
My big problem is that I found cases where a number has multiple generalizations and therefore it repeats inside N*2^(K-D), so the real nº of different numbers will be, in this case, something smaller than N*2^(K-D).
I don't know how to find only the different numbers and I need your help.
Thank you very much!
EDIT: Answer to «n. 1.8e9-where's-my-share m.»'s question from the comments:
If you have two generalisations, can you find out how many numbers they both generalise?
For two given general numbers a and b (general meaning that they both contain "?"), it is possible to find nº of numbers generalised by both a and b in polynomial time by using the following logic:
1 - we declare some variable q = 1
2 - we start "scanning" the digits of the two numbers simultaneously from left to right:
2.1 - if we find two unhidden digits and they are different, then no numbers are generalized by both a and b and we return 0
2.2 - if we find two hidden digits, then we multiply q by 2: if both general numbers generalize some number, that number can have either 1 or 2 in place of this "?", so each position that is hidden in both doubles the count of numbers generalized by both a and b, as long as step 2.1 is never true.
2.3 - if exactly one of the two digits is hidden, the visible digit fixes that position and q is unchanged.
3 - if we have scanned all the digits and step 2.1 was never true, then we return q
Therefore, the nº of numbers both a and b generalize is either 0 or q (a power of 2), according to the cases presented above.
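A minimal Python sketch of the pairwise scan described above, assuming both patterns are equal-length strings over '1', '2' and '?'. The name `common_count` is hypothetical. It returns the running product directly: 1, doubled once for every position that is hidden in both patterns, or 0 on a conflict.

```python
def common_count(p, q):
    """Count the numbers generalized by both patterns p and q."""
    count = 1
    for x, y in zip(p, q):
        if x == "?" and y == "?":
            # Hidden in both: the digit can be 1 or 2.
            count *= 2
        elif x != "?" and y != "?" and x != y:
            # Two visible digits that disagree: nothing is generalized by both.
            return 0
        # Otherwise at least one digit is visible and fixes the position.
    return count

print(common_count("112??", "?122?"))
print(common_count("???22", "???11"))
```

This only handles two patterns at a time; it does not by itself remove the double counting across N patterns.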
Unfortunately, this is impossible to do in polynomial time (unless P=NP, and maybe not even then). Your problem is equivalent to the problem of counting satisfying assignments to a formula in Disjunctive Normal Form, called the DNF counting problem. DNF counting is #P-hard, so a polynomial time solution could be used to solve all problems in NP in polynomial time too (and more).
To see the relationship, note that each pattern is equivalent to an AND of several literals. If you take '1' in a position to be a literal, and '2' in that position to be that literal negated, you can convert each pattern to a conjunctive clause, that is, one term of the DNF formula.
For example:
1 1 2 ? ?
becomes
(x_1 ∧ x_2 ∧ ¬x_3)
? 1 2 2 ?
becomes
(x_2 ∧ ¬x_3 ∧ ¬x_4)
1 ? 2 ? 1
becomes
(x_1 ∧ ¬x_3 ∧ x_5)
The question of how many numbers satisfy at least one of these patterns is exactly the question of how many assignments satisfy at least one of the equivalent clauses.
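For checking small inputs, the count can be obtained by plain enumeration. The sketch below is exponential in K, so it is only a verification aid and not a polynomial-time solution; the function names are made up for illustration.

```python
from itertools import product

def matches(pattern, number):
    """True if `pattern` (with '?' wildcards) generalizes `number`."""
    return all(p == "?" or p == d for p, d in zip(pattern, number))

def count_generalized(patterns):
    """Count the K-digit numbers over {1, 2} matched by at least one pattern."""
    k = len(patterns[0])
    return sum(
        1
        for digits in product("12", repeat=k)  # all 2^K candidate numbers
        if any(matches(p, digits) for p in patterns)
    )

print(count_generalized(["112??", "?122?", "1?2?1"]))
print(count_generalized(["1??", "?2?"]))
```

On the two inputs from the question this gives 8 and 6, matching the expected outputs.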

Minimum no of operations required to create String A By appending subsequence of String B to a empty string C

You are given two strings A and B, and an empty string C. In one operation you can pick any subsequence of B (that is, delete any number of characters from anywhere in B) and append it to C. What is the minimum number of operations required to turn C into A?
e.g if
A is "ABCDE" and
B is "ABDEC" then
In 1st operation you will choose subsequence ABC from B and in 2nd operation DE.
So two operations are required.
if
A is "ABCDE"
B is "EDCBA" then
operations required 5.
Linear complexity expected O(n)
Just use a greedy algorithm.
1 - Let i = 0
2 - Let j = 0
3 - Search for the first A[i] in B after j
4 - If it exists, let j be its index in B, remove it from B, append it to C, increment i, and repeat from 3
5 - If it doesn't exist, repeat from 2
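Each time you get to 5, a new operation begins, so the number of operations is one more than the number of times you reach step 5 (assuming every character of A occurs in B).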
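A Python sketch of this greedy scan, under two assumptions that are not spelled out above: B is available in full again at the start of every operation, and a character of A that never occurs in B makes the task impossible. The name `min_operations` is hypothetical. Because it uses `str.find`, the worst case is O(|A| * |B|), not linear.

```python
def min_operations(a, b):
    """Greedy count of subsequences of b needed to build a."""
    if not a:
        return 0
    operations = 1
    j = 0  # first position of b still usable in the current operation
    for ch in a:
        pos = b.find(ch, j)
        if pos == -1:
            # Cannot extend the current subsequence: start a new operation
            # and rescan b from the beginning.
            operations += 1
            pos = b.find(ch)
            if pos == -1:
                raise ValueError(f"{ch!r} does not occur in b")
        j = pos + 1
    return operations

print(min_operations("ABCDE", "ABDEC"))
print(min_operations("ABCDE", "EDCBA"))
```

The two calls reproduce the examples from the question (2 and 5 operations).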
Assuming all the characters of A (and B) are different, then here is a solution with linear complexity. You need a hashmap or something similar, as well as an array of indices, Y, of equal length to A and B.
1 - Put each character of A in the hashmap as key, with its index as value.
2 - Look up each character of B in the hashmap to get the value i, and put its index into Y at the position i.
3 - Go through Y counting the number of times that Y[i] < Y[i-1]. The number of operations is that count plus one, since every such drop starts a new operation after the first.
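A sketch of the hashmap approach in Python, assuming A and B contain the same distinct characters. The name `min_operations_distinct` is made up for illustration. It counts the positions where Y drops and adds one for the first operation, which is what the worked examples in the question require.

```python
def min_operations_distinct(a, b):
    """Linear-time count, assuming a and b are permutations of distinct characters."""
    if not a:
        return 0
    index_in_a = {ch: i for i, ch in enumerate(a)}  # step 1
    y = [0] * len(a)
    for idx, ch in enumerate(b):                    # step 2
        y[index_in_a[ch]] = idx
    # Step 3: every drop in y forces a new pass over b.
    drops = sum(1 for i in range(1, len(y)) if y[i] < y[i - 1])
    return drops + 1

print(min_operations_distinct("ABCDE", "ABDEC"))
print(min_operations_distinct("ABCDE", "EDCBA"))
```

For A = "ABCDE" and B = "ABDEC", Y is [0, 1, 4, 2, 3], which has one drop, giving 2 operations.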

Regular expressions for strings not containing specific substring

What is a regular expression for all words over the alphabet {a, b} that do not contain the substring baa?
Is it:
a* ((aa) * b *)?
Can a string of length 2 be acceptable for the above condition to hold?
a*(ba?)*
The word can start with arbitrarily many a's, but once a b has appeared, every later a must be isolated: it must come directly after a b and cannot be followed by another a.
a*(b+(ba))*
Here + denotes union. Once a b has been reached, there can be any number of further b's, and if an a follows a b, the word must either end there or continue with b or ba.
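The expressions above can be checked against the definition by enumerating all short words. This Python sketch uses the `re` module, writing the union + as `|`; the helper name `agrees_with_definition` and the length bound of 8 are arbitrary choices for illustration, and passing the check is evidence rather than a proof.

```python
import re
from itertools import product

def agrees_with_definition(pattern, max_len=8):
    """True if `pattern` accepts exactly the words up to max_len that lack 'baa'."""
    regex = re.compile(pattern)
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            word = "".join(chars)
            if bool(regex.fullmatch(word)) != ("baa" not in word):
                return False
    return True

print(agrees_with_definition(r"a*(ba?)*"))
print(agrees_with_definition(r"a*(b|ba)*"))
print(agrees_with_definition(r"a*((aa)*b*)?"))
```

The first two expressions agree with the definition on every word up to length 8. The expression from the question does not: for example, it rejects "ba", which contains no "baa".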

How to find all strings that do not contain substring palindromes

Disclaimer: This is a problem lifted from HackerRank, but their editorial answer wasn't sufficient so I hoped to get better answers. If it's against any policy, please let me know and I'll take this down.
Problem:
You are given two integers, N and M. Count the number of strings of length N over an alphabet of size M that do not contain any palindromic string of length greater than 1 as a consecutive substring.
N=2, M=2 -> 2 :: AB, BA (AA and BB are excluded)
N=2, M=3 -> 6 :: AB, AC, BA, BC, CA, CB (AA, BB and CC are excluded)
ABCDE counts as it does not contain any palindromic substrings.
ABCCC does not count as it does contain "CCC", a palindrome of length >1.
Editorial
Here is the provided answer which I think is wrong:
For N>=3, there are (M−2) ways to choose any next symbol (after the first two) - basically it should not coincide with the previous and pred-previous symbols, that aren't equal.
If N=1, return M
If N=2, return M * (M-1)
If N>=3, return M * (M-1) * (M-2)^(N-2)
counterexample: N=4, M=3, "ABCC"
My Solution Try
When I was working on this problem, I tried to find all the strings that contained palindromic substrings and subtracting that from the total, M^N. I ran into a lot of problems with over counting. For example, "ABABA" has "ABA","BAB","ABA" of n=3, and "ABABA" of n=5.
Thanks for any help in elucidating this problem. I really hope for a good answer to figure this out!
Suppose you build up palindrome-free strings one letter at a time. For the first letter, you have M choices, and for the second, you have M-1, since you can't use the first letter. This much is obvious.
For every letter after the first two, you can't use the previous letter, and you can't use the letter before that, so that's two choices eliminated. What about the other letters? Well, if using one of those creates a palindrome, it would have to be a palindrome of length at least 4 - but if adding a letter creates a palindrome of length K+2 for K>=2, the string must already have had a palindrome of length K for the new palindrome to build off of. (For K<2, this is okay.) Since the string didn't have any palindromes of length >=2, we can conclude that adding any letter other than the previous two letters is fine.
Thus, we have M choices for the first letter, M-1 choices for the second, and M-2 for every letter after that.
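The counting argument can be compared with exhaustive enumeration for small N and M. In this Python sketch the function names are hypothetical, and the brute-force check tests every substring of length 2 or more rather than relying on the argument above.

```python
from itertools import product

def count_formula(n, m):
    """M choices, then M-1, then M-2 for every further letter."""
    if n == 1:
        return m
    if n == 2:
        return m * (m - 1)
    return m * (m - 1) * (m - 2) ** (n - 2)

def has_palindrome(s):
    """True if s contains a palindromic substring of length >= 2."""
    return any(
        s[i:j] == s[i:j][::-1]
        for i in range(len(s))
        for j in range(i + 2, len(s) + 1)
    )

def count_brute_force(n, m):
    """Enumerate all m^n strings (as tuples of letter indices) and filter."""
    return sum(1 for s in product(range(m), repeat=n) if not has_palindrome(s))

for m in range(1, 5):
    for n in range(1, 7):
        assert count_formula(n, m) == count_brute_force(n, m)
```

The proposed counterexample "ABCC" from the question is not counted by either function, since it contains "CC".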

Algorithm to form a given pattern using some strings

Given are 6 strings of any length. The words are to be arranged in the pattern shown below. They can be arranged either vertically or horizontally.
--------
|      |
|      |
|      |
---------------
       |      |
       |      |
       |      |
       --------
The pattern need not be symmetric, and there must be two empty areas as shown.
For example:
Given strings
PQF
DCC
ACTF
CKTYCA
PGYVQP
DWTP
The pattern can be
DCC...
W.K...
T.T...
PGYVQP
..C..Q
..ACTF
where dot represent empty areas.
The other example is
RVE
LAPAHFUIK
BIRRE
KZGLPFQR
LLHU
UUZZSQHILWB
Pattern is
LLHU....
A..U....
P..Z....
A..Z....
H..S....
F..Q....
U..H....
I..I....
KZGLPFQR
...W...V
...BIRRE
If multiple patterns are possible then pattern with lexicographically smallest first line, then second line and so on is to be formed. What algorithm can be used to solve this?
Find an assignment of the strings that satisfies these constraints:
strlen(a) + strlen(b) - 1 = strlen(c)
strlen(d) + strlen(e) - 1 = strlen(f)
After that, try every possible arrangement and check whether it is valid. For example:
aaa.....
d.f.....
d.f.....
d.f.....
cccccccc
..f....e
..f....e
..bbbbbb
There will be 2*2*2 = 8 different arrangements to check.
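A brute-force Python sketch using the labels a to f from the diagram above: a and b are the top and bottom horizontal strings, c is the middle horizontal string, d and e are the left and right vertical strings, and f is the middle vertical string. It simply tries all 6! assignments. Assumptions made here, not stated in the question: strings are read left to right and top to bottom only, the four outer strings need length at least 3 so that each box encloses an empty area, and no string contains a '.'. The function name `arrange` is made up for illustration.

```python
from itertools import permutations

def arrange(words):
    """Return the lexicographically smallest grid as a list of rows, or None."""
    best = None
    for a, b, c, d, e, f in permutations(words):
        # Assumption: outer strings need length >= 3 to leave an empty area.
        if min(len(a), len(b), len(d), len(e)) < 3:
            continue
        if len(a) + len(b) - 1 != len(c) or len(d) + len(e) - 1 != len(f):
            continue
        width, height = len(c), len(f)
        mid_row, mid_col = len(d) - 1, len(a) - 1
        # (word, start row, start column, row step, column step)
        placements = [
            (a, 0, 0, 0, 1),                 # top edge
            (d, 0, 0, 1, 0),                 # left edge
            (c, mid_row, 0, 0, 1),           # middle horizontal
            (f, 0, mid_col, 1, 0),           # middle vertical
            (e, mid_row, width - 1, 1, 0),   # right edge
            (b, height - 1, mid_col, 0, 1),  # bottom edge
        ]
        grid = [["."] * width for _ in range(height)]
        valid = True
        for word, row, col, d_row, d_col in placements:
            for k, ch in enumerate(word):
                r, q = row + k * d_row, col + k * d_col
                if grid[r][q] not in (".", ch):
                    valid = False  # two strings disagree where they cross
                grid[r][q] = ch
        if valid:
            rows = ["".join(line) for line in grid]
            if best is None or rows < best:
                best = rows  # compares first line, then second, and so on
    return best

for line in arrange(["PQF", "DCC", "ACTF", "CKTYCA", "PGYVQP", "DWTP"]):
    print(line)
```

For the first example in the question this prints the grid shown there, starting with DCC... and ending with ..ACTF.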
There are a number of heuristics that you can apply, but before that, let's go over some properties of the puzzle.
+aa+
c  f
+ee+eee+
   f   d
   +bbb+
Let each letter in the diagram above also denote the length of the string it labels. We have:
a + b - 1 = e
c + d - 1 = f
I will refer to the 2 strings for the cross in the middle as middle strings.
We also infer that the length of the string cannot be less than 2. Therefore, we can infer:
e > a, e > b
f > c, f > d
From this, we know that the 2 shortest strings cannot be middle strings, due to the inequality above.
The 3 largest strings also cannot all have the same length: at most two of them can be middle strings, so the third would be an outer string as long as a middle string, which is impossible according to the inequality above.
The puzzle is only tricky when the lengths are regular. When the lengths are irregular, you can do direct mapping from length to position.
If we have the 2 largest strings being equal, due to the inequality above, they are the 2 middle strings. The worst case for this one is a "regular" puzzle, where the length a, b, c, d are equal.
If the 2 largest strings are unequal, the largest string's position can be determined immediately (since its length is unique in the puzzle): it is one of the middle strings. In the worst case, there can be 3 candidates for the other middle string, so just brute force and check all of them.
Algorithm:
Try to map unique length string to the position.
Brute force the 2 strings in the middle (taking into consideration what I mentioned above), and brute force to fill in the rest.
Even with naive brute force, there are only 6! = 720 cases if the strings can only go from left to right and top to bottom (no reversal). There will be 720 * 2^6 = 46080 cases if each string is allowed to go in either direction.

Resources