"Consolidation" algorithm name / implementation - string

Not quite sure how to describe this, but I have a word game I like to play that I'd like to implement as a computer program.
The basic gist is that you look at the values of the letters (A=1..Z=26) and consolidate them into the fewest possible letters, whose values are as close to each other as possible.
As an example:
s t a c k
Sum the values
19 + 20 + 1 + 3 + 11 = 54
Find the fewest number of letters:
ceil(54/26) = 3
Choose letters closest to each other
54/3 = 18
Letters to be displayed should be rrr.
That happens to be an easy example. What would it look like when you need to have, say, rrs (if your initial string was 'a stack' instead)?
Does this already have a name that I can look up and implement?

I think your problem boils down to this: given n and k, find numbers r1, r2, ..., rk such that r1 + r2 + ... + rk = n and max(r1, r2, ..., rk) - min(r1, r2, ..., rk) is as small as possible.
The solution is to pick r = floor(n / k), set n mod k of the numbers to r + 1, and set the rest to r.
For example, if n = 55 and k = 3 (your 'a stack' example), we have floor(55/3) = 18 and 55 mod 3 is 1, so the solution is 19, 18, 18.
All that remains is converting between numbers and letters.
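A minimal Python sketch of the whole procedure, assuming only the letters a to z count and everything else (spaces, punctuation) is ignored. The function name `consolidate` is made up for illustration.

```python
def consolidate(text):
    """Collapse text into the fewest letters with values as equal as possible."""
    # A=1 .. Z=26; anything that is not a-z is ignored (assumption).
    total = sum(ord(ch) - ord("a") + 1 for ch in text.lower() if "a" <= ch <= "z")
    if total == 0:
        return ""
    k = -(-total // 26)            # ceil(total / 26) using integer arithmetic
    r, extra = divmod(total, k)    # `extra` of the letters get the value r + 1
    values = [r] * (k - extra) + [r + 1] * extra
    return "".join(chr(ord("a") + v - 1) for v in values)


print(consolidate("stack"))    # rrr
print(consolidate("a stack"))  # rrs
```

The larger values are placed last only to match the "rrs" spelling in the question; any ordering of the same values is equally balanced.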

Related

String manipulation with dynamic programming

I have a problem where I have a string of length N, where (1 ≤ N ≤ 10^5). This string will only have lower case letters.
We have to rewrite the string so that it has a series of "streaks", where the same letter is included at least K (1 ≤ K ≤ N) times in a row.
It costs a_ij to change a single specific letter in the string from i to j. There are M different possible letters you can change each letter to.
Example: "abcde" is the input string. N = 5 (length of "abcde"), M = 5 (letters are A, B, C, D, E), and K = 2 (each letter must be repeated at least 2 times) Then we are given a M×M matrix of values a_ij, where a_ij is an integer in the range 0…1000 and a_ii = 0 for all i.
0 1 4 4 4
2 0 4 4 4
6 5 0 3 2
5 5 5 0 4
3 7 0 5 0
Here, it costs 0 to change from A to A, 1 to change from A to B, 4 to change from A to C, and so on. It costs 2 to change from B to A.
The optimal solution in this example is to change the a into b, change the d into e, and then change both e's into c's. This costs 1 + 4 + 0 + 0 = 5 in total, and the final string will be "bbccc".
It becomes complicated because it might be cheaper to change letter i into an intermediate letter k and then letter k into letter j rather than changing i into j directly (or more generally, there may be a chain of changes starting with i and ending with j that gives the best overall cost for turning i into j).
To handle this, I am treating the matrix as a graph and running Floyd-Warshall to find the cheapest cost of changing each letter into every other letter. This takes O(M^3), which is at most 26^3.
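As a rough illustration of that preprocessing step, here is a small Python sketch of Floyd-Warshall applied to the example matrix. The function name `cheapest_conversions` is an assumption made for this example.

```python
def cheapest_conversions(cost):
    """Floyd-Warshall: cheapest total cost to turn letter i into letter j."""
    m = len(cost)
    dist = [row[:] for row in cost]          # copy so the input is left untouched
    for via in range(m):                     # allow `via` as an intermediate letter
        for i in range(m):
            for j in range(m):
                through = dist[i][via] + dist[via][j]
                if through < dist[i][j]:
                    dist[i][j] = through
    return dist


example = [
    [0, 1, 4, 4, 4],
    [2, 0, 4, 4, 4],
    [6, 5, 0, 3, 2],
    [5, 5, 5, 0, 4],
    [3, 7, 0, 5, 0],
]
dist = cheapest_conversions(example)
print(dist[3][2])  # 4: D -> E -> C (4 + 0) beats the direct D -> C cost of 5
```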
My next step is to perform dynamic programming on each additional letter to find the answer. If someone could give me advice on how to do this, I would be thankful!
Here are some untested ideas. I'm not sure if this is efficient enough (or completely worked out), but it looks like roughly 26 * 3 * 10^5 states. The recurrence could be converted to a table, although for larger K, memoisation might be more efficient because fewer states are reachable.
Assume we've recorded 26 prefix arrays for conversion of the entire list to each of the characters using the best conversion schedule, using a path-finding method. This lets us calculate the cost of a conversion of a range in the string in O(1) time, using a function, cost.
A letter in the result can be one of three things: either it's the kth instance of character c, or it's before the kth, or it's after the kth. This leads to a general recurrence:
f(i, is_kth, c) ->
    cost(i - k + 1, i, c) + A
  where
    A = min(
      f(i - k, is_kth, c'),
      f(i - k, is_after_kth, c')
    ) forall c'
A takes constant time since the alphabet is constant, assuming earlier calls to f have been tabled.
f(i, is_before_kth, c) ->
    cost(i, i, c) + A
  where
    A = min(
      f(i - 1, is_before_kth, c),
      f(i - 1, is_kth, c'),
      f(i - 1, is_after_kth, c')
    ) forall c'
Again A is constant time since the alphabet is constant.
f(i, is_after_kth, c) ->
    cost(i, i, c) + A
  where
    A = min(
      f(i - 1, is_after_kth, c),
      f(i - 1, is_kth, c)
    )
A is constant time in the latter. We would seek the best result of the recurrence applied to each character at the end of the string with either state is_kth or state is_after_kth.
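For comparison, here is a Python sketch of a simplified variant of this idea, not the exact three-state recurrence above. It keeps a single state per (position, letter), namely "the prefix is valid and its last streak uses this letter", and either extends that streak by one character or starts a fresh streak of exactly k characters. It assumes `dist` already holds the cheapest letter-to-letter costs (for example after Floyd-Warshall); the name `min_streak_cost` is made up.

```python
def min_streak_cost(s, k, dist):
    """Cheapest cost to rewrite s so every run of equal letters has length >= k."""
    m, n = len(dist), len(s)
    idx = [ord(ch) - ord("a") for ch in s]
    # prefix[c][i] = cost of converting s[:i] entirely into letter c
    prefix = [[0] * (n + 1) for _ in range(m)]
    for c in range(m):
        for i, x in enumerate(idx):
            prefix[c][i + 1] = prefix[c][i] + dist[x][c]
    inf = float("inf")
    dp = [[inf] * m for _ in range(n + 1)]   # dp[i][c]: s[:i] valid, last streak is c
    best = [inf] * (n + 1)                   # best[i] = min over c of dp[i][c]
    best[0] = 0                              # the empty prefix costs nothing
    for i in range(1, n + 1):
        for c in range(m):
            extend = dp[i - 1][c] + dist[idx[i - 1]][c]       # lengthen current streak
            start = inf
            if i >= k:                                        # new streak of exactly k
                start = best[i - k] + prefix[c][i] - prefix[c][i - k]
            dp[i][c] = min(extend, start)
        best[i] = min(dp[i])
    return best[n]


# Cheapest conversion costs for the example matrix (shortest paths already applied).
closed = [
    [0, 1, 4, 4, 4],
    [2, 0, 4, 4, 4],
    [5, 5, 0, 3, 2],
    [5, 5, 4, 0, 4],
    [3, 4, 0, 3, 0],
]
print(min_streak_cost("abcde", 2, closed))  # 5
```

This runs in O(N * M) time after the prefix arrays are built. The full table is kept here for readability; only the previous row of `dp` and the `best` array are actually needed.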

Detecting when Y is a vowel

While learning the programming language Rust, I want to make a numerology calculator based on a person's full name. This numerology calculator would, later on, provide computations on other pieces of texts as well as other numbers.
The numerology calculator would take a name such as John Edward Smith and calculate his life path like this, for example:
1 + 6 + 8 + 4 + 5 + 4 + 5 + 1 + 9 + 4 + 1 + 4 + 9 + 2 + 8
J O H N E D W A R D S M I T H
Adding up all the letters yields 71 and 7+1 = 8
Another calculation uses the vowels in the name, and here is the problem: most algorithms check whether the letter in question is A, E, I, O or U and, if so, declare it to be a vowel. Otherwise, it's a consonant. However, Y could be either a vowel or a consonant. W is also sometimes a vowel, but for the purposes of numerology it is always a consonant.
I came across a cargo package called Eudex (supposedly better than Soundex) but I don't know how to use it to detect whether Y is a vowel or consonant. Could someone point me in the right direction? Thanks.
use eudex::Hash;

fn main() {
    assert!((Hash::new("jumpo") - Hash::new("jumbo")).similar());
    assert!(!(Hash::new("Horse") - Hash::new("Norse")).similar());
    println!("{:?}", Hash::new("hello"));
    // Wished-for call: `listVowels` does not exist in eudex.
    // println!("{}", Hash::new("Sydney").listVowels());
}
Output of the "hello" line:
Hash { hash: 144115188075855872 }
Fantasy result of the last print line:
ye
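One possible starting point, sketched in Python rather than Rust and not based on Eudex at all: a plain rule of thumb that treats Y as a vowel only when neither neighbouring letter is A, E, I, O or U. That rule is an assumption made for this example, not an established numerology rule, and it will misclassify some names. The function name `vowels_with_y` is made up.

```python
VOWELS = set("aeiou")


def vowels_with_y(word):
    """Return the vowels of word, counting y only when it has no vowel neighbour."""
    w = word.lower()
    found = []
    for i, ch in enumerate(w):
        if ch in VOWELS:
            found.append(ch)
        elif ch == "y":
            prev_is_vowel = i > 0 and w[i - 1] in VOWELS
            next_is_vowel = i + 1 < len(w) and w[i + 1] in VOWELS
            # Assumed heuristic: a y next to a vowel is treated as a consonant.
            if not prev_is_vowel and not next_is_vowel:
                found.append(ch)
    return "".join(found)


print(vowels_with_y("Sydney"))  # ye
```

Under this rule the first Y in "Sydney" counts as a vowel and the final Y (right after E) does not, which happens to match the "ye" wished for above.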

Sort numbers into groups so that the difference of their sums is minimal

I found a few threads that were similar however I believe mine is a bit unique. This will be difficult to write so please bear with me.
I have a set of 10 accounts; each account has a fixed value that cannot be split up. I have 3 employees who need these accounts divided among them as evenly as possible. They cannot share an account.
For example:
(A)lpha = 15
(B)eta = 30
(C)harlie = 22
(D)elta = 19
(E)cho = 28
(F)ranklin = 3
(G)roto = 7
(H)enry = 28
(I)ndia = 38
(J)uliet = 48
The total is 238. In a perfect world, 2 people would get 79 and one person would get 80. However, remember we cannot break apart an account, so we would need to add accounts together to get as close to an even spread as possible.
I need a formula for this since situations like this occur regularly and it takes some time to figure this out. I believe this would be best executed with a helper column.
The closest I have come to is:
FHJ = 79
ABCG = 74
DEI = 85
But since this is recurring and can happen with even more accounts, I need something I can reuse over and over.
Another, less complex but approximate, solution would be to:
1. Sort your accounts from highest to lowest number.
2. Start three groups (A, B, C) with the 3 highest numbers (48|J, 38|I, 30|B).
3. Put the next highest number (28|E) into the group with the lowest sum (C).
4. Put the next highest number (28|H) into the group with the lowest sum (B).
5. And so on …
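You should end up with three groups whose sums are 81, 80 and 77 (the grouping table from the original answer is not reproduced here).
This differs from your manual solution but is more balanced. Compare the spreads: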
Solution from above: 81 - 77 = 4
Your manual solution: 85 - 74 = 11
This algorithm is an approximation: it will not always find the best solution, but if the difference between the lowest and highest numbers is not too large, the result is very close to the best solution.
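The greedy steps described in this answer can be sketched in Python as follows. The function name `greedy_split` and the tie-breaking rule (the first group with the lowest sum wins) are assumptions made for this example.

```python
def greedy_split(accounts, groups=3):
    """Largest-first greedy: each account goes to the group with the lowest sum."""
    totals = [0] * groups
    members = [[] for _ in range(groups)]
    # Highest value first; ties keep their original order.
    for name, value in sorted(accounts.items(), key=lambda kv: kv[1], reverse=True):
        g = totals.index(min(totals))        # first group with the lowest sum
        totals[g] += value
        members[g].append(name)
    return totals, members


accounts = {"A": 15, "B": 30, "C": 22, "D": 19, "E": 28,
            "F": 3, "G": 7, "H": 28, "I": 38, "J": 48}
totals, members = greedy_split(accounts)
print(totals)   # [80, 81, 77]
print(members)  # [['J', 'C', 'G', 'F'], ['I', 'H', 'A'], ['B', 'E', 'D']]
```

With a different tie-breaking rule the last account (F) may land in another group, but the three sums stay the same for this data.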
This is known as a partition problem. You could try implementing the pseudo-polynomial time algorithm from the Wikipedia page. You'll have to modify it for 3 partitions instead of 2.
INPUT: A list of integers S
OUTPUT: True if S can be partitioned into two subsets that have equal sum
function find_partition(S):
    n ← |S|
    K ← sum(S)
    P ← empty boolean table of size (floor(K/2) + 1) by (n + 1)
    initialize top row (P(0, x)) of P to True
    initialize leftmost column (P(x, 0)) of P, except for P(0, 0), to False
    for i from 1 to floor(K/2)
        for j from 1 to n
            if (i - S[j-1]) >= 0
                P(i, j) ← P(i, j-1) or P(i - S[j-1], j-1)
            else
                P(i, j) ← P(i, j-1)
    return P(floor(K/2), n)
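A compact Python sketch of the same two-subset check, using a one-dimensional table instead of the two-dimensional P. It adds an explicit test for an odd total, which the pseudocode leaves out, and it does not cover the 3-group extension the question needs. The name `can_partition` is made up, and non-negative integers are assumed.

```python
def can_partition(values):
    """True if values can be split into two subsets with equal sums."""
    total = sum(values)
    if total % 2:                      # an odd total can never be split evenly
        return False
    half = total // 2
    reachable = [True] + [False] * half   # reachable[s]: some subset sums to s
    for v in values:
        # Go downwards so each value is used at most once.
        for s in range(half, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    return reachable[half]


print(can_partition([1, 5, 11, 5]))  # True  (11 versus 1 + 5 + 5)
print(can_partition([1, 2, 5]))      # False
```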

Find the maximum benefit value from a sub-set of non-prefix neighbours?

1 < n <= 4 x 10^5
Length of each string can be up to 11
Each string contains only uppercase letters
Example - If there are 3 strings, A, B and AE, output is 200.
Explanation - S = {"A", "B", "AE"}
Strings A and AE are prefix neighbors, so they cannot both be in Mark's subset of S. String B has no prefix neighbor, so we include it in Mark's subset.
To maximize the benefit value, we choose AE and B for our subset. We then calculate the following benefit values for the chosen subset:
Benefit value of AE = 65+69 = 134
Benefit value of B = 66
Total benefit value = 134 + 66 = 200.
Insert the input words into a radix tree and splice out the non-key words. Compute the maximum-weight independent set of the tree; the linked algorithm is unweighted, so you'll need to replace 1 with the weight of the node as defined by this question. All of this is linear-time.
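A rough Python sketch of this approach, with two stated assumptions because the full problem statement is not shown here: two strings count as prefix neighbours when one is the longest proper prefix of the other within S, and the benefit value of a string is the sum of its character codes (as in the example). Instead of a radix tree, the sketch finds each word's parent with a hash set, which is simpler but not strictly linear. The name `max_benefit` is made up.

```python
def max_benefit(words):
    """Maximum-weight independent set on the forest of nearest-prefix links."""
    keys = set(words)
    take = {w: sum(map(ord, w)) for w in keys}   # best for w's subtree if w is chosen
    skip = {w: 0 for w in keys}                  # best for w's subtree if w is left out
    total = 0
    # Longest words first, so every descendant is handled before its ancestor.
    for w in sorted(keys, key=len, reverse=True):
        best_here = max(take[w], skip[w])
        # Parent = longest proper prefix of w that is itself in the set (assumption).
        parent = next((w[:i] for i in range(len(w) - 1, 0, -1) if w[:i] in keys), None)
        if parent is None:
            total += best_here                   # w is the root of its own tree
        else:
            take[parent] += skip[w]              # choosing the parent excludes w
            skip[parent] += best_here
    return total


print(max_benefit(["A", "B", "AE"]))  # 200
```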

multiplicative inverse? [closed]

I know that an affine cipher substitutes BD with SG. I need to find the encryption formula, in the form y = a x + b, where a and b are coefficients.
From the information above I end up with two equations:
a + b = 18 and
3a + b = 6
So I am working like this:
a + b = 18 and 3a + b = 6 -> 3a + 18 - a = 6 -> 2a = 6 - 18 -> 2a = 14 (as it is mod 26)
b = 18 - a
2a = ?
So, I want to multiply by the multiplicative inverse of 2 mod 26.
I can't find a multiplicative inverse of 2 mod 26 (y = ax + b mod 26).
Can anyone please help me find a and b?
That's because 2 doesn't have a multiplicative inverse mod 26: since 13*2 = 0 (mod 26), there does not exist K such that K * 2 = 1 (mod 26). For every nonzero element to have an inverse, the modulus must be prime. Try looking up the Chinese Remainder Theorem for more information.
To be more specific, the integers mod 26 are not a field (a mathematical set where every element, except 0, has a multiplicative inverse). Any ring in which a * b = 0, for some a != 0 and b != 0, is not a field.
In fact, a finite field will always have p^n elements, where p is a prime number and n is a positive integer. The simplest fields are just the integers mod a prime number, but for prime powers you need to construct a more elaborate system. So, in short, use a different modulus like 29.
Does a = 7 work? 2*7 = 14. Thus, b = 11.
Let's check the 2 equations to see if that works:
7+11 = 18 (check for the first equation).
3*7+11=21+11 = 32 = 6.
What is wrong with the above?
EDIT: Ok, now I see what could go wrong with trying to do a division by 2 in a non-prime modulus as it is similar to a division by 0. You could take ribond's suggestion of using the Chinese Remainder Theorem and split the equations into another pair of pairs:
mod 13: a+b=5, 3a+b=6. (2a = 1 = 14 => a=7. b = 18-7 = 11.)
mod 2: a+b=0, 3a+b=0. (Note this is the same equation twice and has a pair of possible solutions, where a and b are either both 0 or both 1.)
Thus there is a unique solution for your problem, I think.
Other posters are right in that there is no inverse of 2 modulo 26, so you can't solve 2a=14 mod 26 by multiplying through by the inverse of 2. But that doesn't mean that 2a=14 mod 26 isn't solvable.
Consider the general equation cx = d mod n (c=2, d=14, n=26 in your case). Let g = gcd(c,n). The equation cx = d has a solution if and only if g divides d. If g divides d, then there are in fact multiple solutions (g of them). The equation (c/g)x = d/g mod n/g has a unique solution (call it x_0) because c/g is relatively prime to n/g and therefore has an inverse. The solutions to the original equation are x_0, x_0 + n/g, ..., x_0 + (g-1)n/g.
In your case c=2,d=14,n=26, and g=2. g divides d, so first solve the equation (2/2)x = (14/2) mod (26/2) which gives 7. So both 7 and 7+13=20 solve your original equation.
Note that this means you haven't uniquely determined your affine transformation: two possibilities still exist. You need another data point...
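The gcd-based procedure from this answer can be sketched in Python as below. It relies on the three-argument `pow` with a negative exponent, which needs Python 3.8 or later; the function name `solve_linear_congruence` is made up for this example.

```python
from math import gcd


def solve_linear_congruence(c, d, n):
    """All x in range(n) with c * x == d (mod n), in increasing order."""
    g = gcd(c, n)
    if d % g:
        return []                      # no solution unless g divides d
    c_r, d_r, n_r = c // g, d // g, n // g
    # c_r is coprime to n_r, so it has an inverse mod n_r.
    x0 = (pow(c_r, -1, n_r) * d_r) % n_r
    return [x0 + i * n_r for i in range(g)]


for a in solve_linear_congruence(2, 14, 26):
    b = (18 - a) % 26
    print(a, b, (a + b) % 26, (3 * a + b) % 26)
# 7 11 18 6
# 20 24 18 6
```

Both candidate pairs satisfy the two equations from the question, which is why a further data point is needed to choose between them.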

Resources