Algorithm to find number of possible string variations - string

For a password-related project, I'm looking for an algorithm that calculates the number of possible variations a given string can have, based on a few options. For now, the string variation options are upper/lowercase and character-to-number replacements (like E=3).
For example, let's take the string 'abc#def'.
With just upper/lower variations, there are 6 characters that can vary, and the total number of possible variations is 2^6 = 64.
With just character-to-number replacements, there are 2 characters that qualify (A=4, E=3). That makes the number of variations 2^2 = 4.
I'm struggling with calculating the number of variations when both methods are enabled.
I've tried (2^6 * 2^2), but obviously this doesn't account for the overlap that occurs when applying both.
For example, the variations 'abc#def' and 'abc#dEf' both result in 'abc#d3f' with number substitution on the character E, and should be counted as one.
Somehow I can't figure this out :)

Just count all possibilities for each letter within the password and multiply them together:
letter   options   count
a        a A 4     3
b        b B       2
c        c C       2
#        #         1
d        d D       2
e        e E 3     3
f        f F       2
Finally we have 3 * 2 * 2 * 1 * 2 * 3 * 2 == 144 variants
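The per-letter counting above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the `SUBSTITUTIONS` mapping and the function name `count_variations` are assumptions made for illustration, with the mapping taken from the two replacements mentioned in the question (A=4, E=3).

```python
# Assumed substitution table, taken from the question (A=4, E=3).
SUBSTITUTIONS = {"a": "4", "e": "3"}

def count_variations(s, use_case=True, use_subs=True):
    """Multiply the number of distinct options available at each position."""
    total = 1
    for ch in s:
        options = {ch}  # the character itself is always one option
        if use_case and ch.isalpha():
            options |= {ch.lower(), ch.upper()}
        if use_subs and ch.lower() in SUBSTITUTIONS:
            options.add(SUBSTITUTIONS[ch.lower()])
        # Using a set means overlapping results (e -> 3, E -> 3) count once.
        total *= len(options)
    return total

print(count_variations("abc#def"))
```

Collecting the options for each position in a set is what removes the overlap the question worries about: 'e' and 'E' both map to '3', but '3' is stored only once.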


TRUE/FALSE ← VLOOKUP ← Identify the ROW! of the first negative value within a column

Firstly, we have an array of predetermined factors, i.e. V-Z;
each has 3 attributes, the first two (• × m) multiplied giving the third.
f ... factor
• ... cap, the maximum the values in the data set may increase to
m ... fixed multiplier
pwr ... let's call it power
This is a separate, standalone array that we'd access with e.g. VLOOKUP.
f   •   m   pwr
V   1   9   9
W   2   8   16
X   3   7   21
Y   4   6   24
Z   5   5   25
—————————————————————————————————————————————
Then we have 6 columns holding the actual data to be processed, from which the next-level result is derived, based on the interaction of the two samples introduced.
In addition, two columns are added, for balance & profit.
Here's a short, 6-row data sample (header row included):
f   •   m   bal   profit
V   2   3   377   1
Y   2   3   156   7
Y   1   1   122   0
X   1   2   -27   2
Z   3   3   223   3
—————————————————————————————————————————————
Ultimately, starting at the end, we are checking IF -27, inverted to 27, is within X's power range, i.e. 21 (as per the first sample). The result is then fed into a bigger formula, beyond the scope of this post.
This comparison can be done with VLOOKUP, so all is fine up to here.
—————————————————————————————————————————————
To get to that, the working example happens to focus on row 5, since that's the one with the first negative value in the 'balance' column, so:
factor X = which factor it is exactly is unknown to us, &
balance -27 = the value we have to locate among potentially dozens to hundreds of rows.
Why!?
Once we know that the factor is X, based on the • & multiplier pertaining to it, we also know which 'power' (top array) to compare -27, the first negative value identified in the balance column, against.
Is that clear?
I'd like to know the formula to achieve that, & (get to) move on with the broader-scope work.
—————————————————————————————————————————————
The main issue for me is not knowing how to identify the first negative value, i.e. the row that -27 belongs to, and then how to use that piece of information to get the X, i.e. identify the factor type. This is especially tricky since the factor is positioned to the left of the balance, and to the best of my knowledge I cannot use a negative column index number in VLOOKUP (so that route is out of the question anyway).
To recap;
IF(21>27) = IF(-21<-27)
27 → LOCATE ROW with the first negative number (-27)
21 → IDENTIFY the FACTOR TYPE, same row as (-27)
→ VLOOKUP pwr, based on factor type identified (top array, 4th column right)
→ invert either 21 to a negative number or (-27) to the positive number
= TRUE/FALSE
Guessing at your layout, I'll assume your first table is in columns A to D, and the second in columns G to K.
You could find the letter of that factor with something like this:
=INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0)))
INDEX(J:J<0) converts that column to TRUE and FALSE depending on being negative or not and with XMATCH you find the first TRUE. You could then use that in VLOOKUP:
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0)
That would return the 21. You can use the first concept too to find the -27, and with ABS get its "positive value":
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0) > ABS(INDEX(J:J,XMATCH(TRUE,INDEX(J:J<0))))
That should return TRUE or FALSE for the comparison (FALSE for the sample data, since 21 is not greater than 27).
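To cross-check the logic of the formula outside the spreadsheet, here is a minimal Python sketch of the same three steps (find the first negative balance, look up the power of that row's factor, compare against the absolute balance). The names `POWER`, `ROWS` and `first_negative_within_power` are hypothetical, and the data is just the sample from the question reduced to the factor and balance columns.

```python
# Power column of the first table, keyed by factor (sample data from the question).
POWER = {"V": 9, "W": 16, "X": 21, "Y": 24, "Z": 25}

# (factor, balance) pairs from the second table, in row order.
ROWS = [("V", 377), ("Y", 156), ("Y", 122), ("X", -27), ("Z", 223)]

def first_negative_within_power(rows, power):
    """Return whether power > |balance| for the first row with a negative balance."""
    for factor, balance in rows:
        if balance < 0:
            # Same comparison as the spreadsheet formula: pwr > ABS(balance).
            return power[factor] > abs(balance)
    return None  # no negative balance anywhere in the column

print(first_negative_within_power(ROWS, POWER))
```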

Hacker rank problem - code optimisation and debugging logical errors required to pass all the test cases for the below python program

This problem is about sets. There is an array arr of integers. There are also two disjoint sets, A and B, each containing integers. You like all the integers in set A and dislike all the integers in set B. Your initial happiness is 0. For each integer i in the array, if i belongs to A, you add 1 to your happiness. If i belongs to B, you add -1 to your happiness. Otherwise, your happiness does not change. Output your final happiness at the end.
Note: A and B are sets, so they have no repeated elements. However, the array might contain duplicate elements.
In the code below, I first read the input n, m:
k = list(map(str,input().split(' ')))
n,m =k
arr=[]
arr = [int(i) for i in input().split()]
arr1 = list( dict.fromkeys(arr) )
A=set(int(i) for i in input().split())
B=set(int(i) for i in input().split())
a=len(set(arr1).intersection(A))
b=len(set(arr1).intersection(B))
print(a-b)
Input Format
The first line contains the integers n and m, separated by a space.
The second line contains n integers, the elements of the array.
The third and fourth lines contain m integers each, the elements of A and B, respectively.
Input
Sample 1:
3 2
1 5 3
3 1
5 7
Sample 2:
13 4
1 7 8 5 3 7 9 4 9 8 2 1 4
1 5 3 9
7 4 2 8
Output
Sample 1: 1
Sample 2: 0
The above piece of code works for small input test cases, but it gets a Wrong Answer verdict for the rest.
Follow the link for the actual problem statement
This is the code I used, but it failed most test cases. I need help.
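One likely cause, judging from the problem statement alone: the statement says the array may contain duplicates and that happiness changes for each integer in the array, whereas the code above deduplicates arr before intersecting it with A and B, so repeated elements are counted only once. A minimal sketch that counts every occurrence instead is shown below; the function name `happiness` is made up for illustration, and the input reading is left out so the logic can be checked directly.

```python
def happiness(arr, liked, disliked):
    """Add 1 for every occurrence in liked, subtract 1 for every occurrence in disliked."""
    # (x in liked) and (x in disliked) are booleans, which behave as 1/0 in arithmetic.
    # Membership tests on sets are O(1) on average, so the scan is linear in len(arr).
    return sum((x in liked) - (x in disliked) for x in arr)

sample_1 = happiness([1, 5, 3], {3, 1}, {5, 7})
sample_2 = happiness([1, 7, 8, 5, 3, 7, 9, 4, 9, 8, 2, 1, 4], {1, 5, 3, 9}, {7, 4, 2, 8})
print(sample_1, sample_2)
```

On the second sample, counting each occurrence gives -1 (six liked occurrences, seven disliked), while the deduplicated version gives 0, which is where the two approaches diverge.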

How can I solve this classical dynamic programming problem?

There are N jewellery shops. Each jewellery shop has three kinds of coins - Gold, Platinum, and Diamond - with worth A, B, and C respectively. You decide to visit each of the N jewellery shops and take coins from them, subject to the following conditions:
You can take at most 1 coin from an individual shop.
You can take at most X coins of Gold type.
You can take at most Y coins of Platinum type.
You can take at most Z coins of Diamond type.
You want to collect coins from shops in such a way that worth value of coins collected is maximised.
Input Format :
The first line contains an integer N. Where N is the number of jewellery shops.
The second line contains three integers X, Y, Z. Where X, Y, Z denotes the maximum number of coins you can collect of type Gold, Platinum, and diamond respectively.
Then N lines contain three space-separated integers A, B, C. Where A, B, C is the worth value of the Gold, Platinum, and diamond coin respectively.
Output Format :
Print a single integer representing the maximum worth value you can get.
Constraints :
1 <= N <= 200
1 <= X, Y, Z <= N
1 <= A, B, C < 10^9
Example : -
4
2 1 1
5 4 5
4 3 2
10 9 7
8 2 9
Answer:-
27 (9+9+5+4)
I tried the obvious greedy approach but it failed :-)
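One way to approach this with dynamic programming is to treat it like a 0/1 knapsack with three capacities. The sketch below is only that, a sketch: the function name `max_worth` and the state layout are assumptions, and the straightforward O(N·X·Y·Z) loop is meant to show the recurrence on small inputs, not to be fast enough for the largest constraints in pure Python.

```python
def max_worth(shops, X, Y, Z):
    """shops: list of (gold, platinum, diamond) values; X, Y, Z: per-type limits."""
    NEG = float("-inf")
    # best[g][p][d] = max worth after taking exactly g gold, p platinum, d diamond coins.
    best = [[[NEG] * (Z + 1) for _ in range(Y + 1)] for _ in range(X + 1)]
    best[0][0][0] = 0
    for a, b, c in shops:
        # Visit states in descending order so that a value written for this shop
        # is never read again for the same shop (at most one coin per shop).
        for g in range(X, -1, -1):
            for p in range(Y, -1, -1):
                for d in range(Z, -1, -1):
                    cur = best[g][p][d]
                    if cur == NEG:
                        continue
                    if g < X:
                        best[g + 1][p][d] = max(best[g + 1][p][d], cur + a)
                    if p < Y:
                        best[g][p + 1][d] = max(best[g][p + 1][d], cur + b)
                    if d < Z:
                        best[g][p][d + 1] = max(best[g][p][d + 1], cur + c)
    # Skipping a shop is allowed, so the answer is the best over all reachable states.
    return max(v for plane in best for row in plane for v in row)

print(max_worth([(5, 4, 5), (4, 3, 2), (10, 9, 7), (8, 2, 9)], 2, 1, 1))
```

Greedy fails here because taking the locally best coin at one shop can use up a type limit that a later shop needs more; the table keeps every combination of counts so that trade-off is resolved globally.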

Understanding solution to online test

The question is in the following link:
http://www.spoj.com/problems/AEROLITE/
Input:
1 1 1 1
0 0 6 3
1 1 1 2
[and 7 test cases more]
Output:
6
57
8
[and 7 test cases more]
How does the output come from the input?
Consider the outputs corresponding to the following letters:
a. 1 1 1 1 = 6
b. 0 0 6 3 = 57
c. 1 1 1 2 = 8
Restating the definitions from the problem in a more practical way, the 4 inputs correspond to the following:
The number of "{}" pairs
The number of "[]" pairs
The number of "()" pairs
The depth d that the expression must reach (exactly, not just at most, as the sample outputs below show)
The output is a single number: how many expressions use exactly the given pairs, have nesting depth d, and obey the priority rules that "()" cannot contain "{}" or "[]" and "[]" cannot contain "{}".
The walkthrough below shows how to arrive at the outputs, but it doesn't try to break the problem into sub-problems. Hopefully, it will at least help you connect the numbers and start to find the sub-problems to break down.
Taking those examples explicitly, start with "a" for 1 1 1 1 = 6:
The inputs mean a depth of 1 with 1 pair each of "{}", "[]", "()". Nothing may be nested, so this is the number of orderings of 3 pairs: 3! = 6.
Actual: {}[](), {}()[], []{}(), [](){}, (){}[], ()[]{}
Then go to "c" for 1 1 1 2 = 8
This is just like "a" except that the depth is 2 instead of 1.
The 6 flat arrangements from "a" only have depth 1, so they are not counted here; only arrangements whose deepest nesting is 2 count.
** One pair nested in another, third pair beside it: {[]}(), (){[]}, {()}[], []{()}, [()]{}, {}[()]
** Two pairs side by side inside "{}": {[]()}, {()[]}
That gives 6 + 2 = 8. (The one depth-3 arrangement, {[()]}, is not counted either.)
Finally, consider "b", where there are only 6 "()" pairs and d = 3.
Because only parentheses exist here, the total number of balanced arrangements is the Catalan number Cn with n = 6 (for more on this: https://en.wikipedia.org/wiki/Catalan_number), C(6) = 132. Split by depth, 1 of them has depth 1, 31 have depth 2, 57 have depth 3, and the remaining 43 are deeper, so the answer is 57.
Alternatively and much more tediously, you can iterate over all the arrangements of parentheses and keep those that reach depth 3:
** d = 1 is just ()()()()()() (not counted)
** d = 2 covers examples like (())()()()(), ()(())()()(), ()()(())()(), ()()()(())(), ()()()()(()), and so on (not counted)
** d = 3 covers examples like ((()))()()(), ()((()))()(), ()()((()))(), ()()()((())), and so on (these are the 57)
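To check sample outputs like these by machine, a small brute-force counter can help. The sketch below is not the intended efficient solution for the judge; it simply enumerates bracket sequences with memoization. It assumes that a bracket may contain brackets of its own kind, and that the count is over sequences whose deepest nesting equals the given depth, which is the reading that reproduces the three sample outputs. The name `count_expressions` is made up for illustration.

```python
from functools import lru_cache

def count_expressions(n_curly, n_square, n_round, depth):
    """Count valid sequences whose deepest nesting is exactly `depth`."""
    # Kinds are ranked 0 = {}, 1 = [], 2 = (); a bracket may only contain
    # kinds of the same or higher rank, so checking the innermost open one suffices.
    @lru_cache(maxsize=None)
    def go(remaining, stack, reached):
        if sum(remaining) == 0 and not stack:
            return 1 if reached else 0
        total = 0
        if stack:  # close the innermost open bracket
            total += go(remaining, stack[:-1], reached)
        lowest = stack[-1] if stack else 0
        if len(stack) < depth:  # never open beyond the target depth
            for kind in range(lowest, 3):
                if remaining[kind]:
                    left = list(remaining)
                    left[kind] -= 1
                    total += go(tuple(left), stack + (kind,),
                                reached or len(stack) + 1 == depth)
        return total

    return go((n_curly, n_square, n_round), (), False)

print(count_expressions(1, 1, 1, 1), count_expressions(0, 0, 6, 3), count_expressions(1, 1, 1, 2))
```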

loop rolling algorithm

I have come up with the term loop rolling myself with the hope that it does
not overlap with an existing term. Basically I'm trying to come up with an
algorithm to find loops in a printed text.
Some examples from simple to complicated
Example1
Given:
a a a a a b c d
I want to say:
5x(a) b c d
or algorithmically:
for 1 .. 5
print a
end
print b
print c
print d
Example2
Given:
a b a b a b a b c d
I want to say:
4x(a b) c d
or algorithmically:
for 1 .. 4
print a
print b
end
print c
print d
Example3
Given:
a b c d b c d b c d b c e
I want to say:
a 3x(b c d) b c e
or algorithmically:
print a
for 1 .. 3
print b
print c
print d
end
print b
print c
print e
It didn't remind me of any algorithm that I know of. I feel like some of the
problems can be ambiguous but finding one of the solutions is enough to me for
now. Efficiency is always welcome but not mandatory. How can I do this?
EDIT
First of all, thanks for all the discussion. I have adapted an LZW algorithm
from rosetta and ran it on my
input:
abcdbcdbcdbcdef
which gave me:
a
b
c
d
8 => bc
10 => db
9 => cd
11 => bcd
e
f
where I have a dictionary of:
a a
c c
b b
e e
d d
f f
8 bc
9 cd
10 db
11 bcd
12 dbc
13 cdb
14 bcde
15 ef
7 ab
It looks good for compression but it's not quite what I wanted. What I need
is more like compression in the algorithmic representation from my examples,
which would have:
- subsequent sequences (if a sequence is repeating, there would be no other sequence in between)
- no dictionary but only loops
- irreducible
- with maximum sequence sizes (which would minimize the algorithmic representation)
- and let's say nested loops are allowed (contrary to what I said before in the comment)
I'll start with an algorithm that gives maximum sequence sizes. It does not always minimize the algorithmic representation, but it may be used as an approximation algorithm, or it may be extended to an optimal algorithm.
1. Start by constructing a suffix array for your text, along with an LCP array.
2. Sort an array of indexes of the LCP array so that indexes of larger LCP elements come first. This groups together repeating sequences of the same length and allows sequences to be processed greedily, starting from the maximum sequence sizes.
3. Extract suffix array entries, grouped by LCP value (by group I mean all the entries with the selected LCP value as well as all entries with larger LCP values), and sort them by position in the text.
4. Filter out entries whose positional difference is not equal to LCP. For the remaining entries, take the prefixes of length equal to LCP. This gives all possible sequences in the text.
5. Add sequences, sorted by starting position, to an ordered collection (for example, a binary search tree). Sequences are added in order of appearance in the sorted LCP, so longer sequences are added first. Sequences are added only if they are independent or if one of them is completely nested inside the other one. Intersecting intervals are ignored. For example, in caba caba bab the sequence ab intersects with caba and so it is ignored. But in cababa cababa babab one instance of ab is dropped, 2 instances are completely inside the larger sequence, and 2 instances are completely outside of it.
At the end, this ordered collection contains all the information needed to produce the algorithmic representation.
Example:
Text:                  ababcabab
Suffix array:          ab abab ababcabab abcabab b bab babcabab bcabab cabab
LCP array:             2 4 2 0 1 3 1 0
Sorted LCP:            4 3 2 2 1 1 0 0
Positional difference: 5 5 2 2 2 2 - -
Filtered LCP:          - - 2 2 - - - -
Filtered prefixes:     (ab ab) (ab ab)
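The rows of this example can be reproduced with a short Python sketch. It is deliberately naive (suffixes are sorted by direct string comparison, which is fine for small texts but not efficient), it only looks at neighbouring suffix array entries rather than the full groups described in step 3, and the function names are made up for illustration.

```python
def suffix_array(text):
    """Start positions of all suffixes, in lexicographic order of the suffixes."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def lcp_array(text, sa):
    """Longest common prefix length of each pair of neighbouring suffixes."""
    def common(a, b):
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        return n
    return [common(sa[i], sa[i + 1]) for i in range(len(sa) - 1)]

def adjacent_repeats(text):
    """Keep neighbouring suffix pairs whose positional difference equals their LCP."""
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    found = []
    # Larger LCP values first, as in the "Sorted LCP" row.
    for i in sorted(range(len(lcp)), key=lambda i: -lcp[i]):
        diff = abs(sa[i] - sa[i + 1])
        if lcp[i] > 0 and diff == lcp[i]:
            start = min(sa[i], sa[i + 1])
            found.append((start, text[start:start + lcp[i]]))
    return found

text = "ababcabab"
print(lcp_array(text, suffix_array(text)))
print(adjacent_repeats(text))
```

Each entry returned by `adjacent_repeats` is a start position and the block that is immediately followed by a copy of itself, which corresponds to the two "(ab ab)" prefixes in the last row.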
Here is a sketch of an algorithm that produces the minimal algorithmic representation.
Start with the first 4 steps of the previous algorithm. The fifth step should be modified: it is no longer possible to ignore intersecting intervals, so every sequence is added to the collection. Since the collection now contains intersecting intervals, it is better to implement it as a more advanced data structure, for example an interval tree.
Then recursively determine the length of the algorithmic representation for all sequences that contain any nested sequences, starting from the smallest ones. When every sequence has been evaluated, compute the optimal algorithmic representation for the whole text. The algorithm for processing either a sequence or the whole text uses dynamic programming: allocate a matrix with a number of columns equal to the text/sequence length and a number of rows equal to the length of the algorithmic representation; doing an in-order traversal of the interval tree, update this matrix with all sequences possible for each text position; when more than one value is possible for some cell, either choose any of them, or give preference to longer or shorter sub-sequences.
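For comparison, here is a much simpler greedy baseline in Python. It is not the suffix-array algorithm described above and makes no claim of minimality: at each position it tries every block length, counts how often the block repeats back to back, and keeps the choice that covers the most tokens (preferring the shorter block on ties). It does not produce nested loops. The function name `roll_loops` and the output notation are assumptions chosen to match the examples in the question.

```python
def roll_loops(tokens):
    """Greedy, non-nested loop rolling over a list of tokens."""
    out = []
    i, n = 0, len(tokens)
    while i < n:
        best_period, best_count = 1, 1
        for period in range(1, (n - i) // 2 + 1):
            block = tokens[i:i + period]
            count = 1
            # Count back-to-back copies of the block starting at position i.
            while tokens[i + count * period:i + (count + 1) * period] == block:
                count += 1
            # Strict '>' keeps the shorter block when two choices cover the same span.
            if count > 1 and count * period > best_period * best_count:
                best_period, best_count = period, count
        if best_count > 1:
            out.append(f"{best_count}x({' '.join(tokens[i:i + best_period])})")
        else:
            out.append(tokens[i])
        i += best_period * best_count
    return " ".join(out)

for example in ("a a a a a b c d", "a b a b a b a b c d", "a b c d b c d b c d b c e"):
    print(roll_loops(example.split()))
```

This runs in roughly cubic time in the worst case, which fits the question's remark that efficiency is welcome but not mandatory.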
