Longest Common Prefix property

Longest Common Prefix property - string

I was going through suffix array and its use to compute longest common prefix of two suffixes.
The source says:
"The lcp between two suffixes is the minimum of the lcp's of all pairs of adjacent suffixes between them on the array"
i.e. lcp(x,y)=min{ lcp(x,x+1),lcp(x+1,x+2),.....,lcp(y-1,y) }
where x and y are two index of the string from where the two suffix of the string starts.
I am not convinced with the statement as in example of string "abca".
lcp(1,4)=1 (considering 1 based indexing)
but if I apply the above equation then
lcp(1,4)=min{lcp(1,2),lcp(2,3),lcp(3,4)}
and I think lcp(1,2)=0.
so the answer must be 0 according to the equation.
Am i getting it wrong somewhere?

I think the index referred by the source is not the index of the string itself, but index of the sorted suffixes.
a
abca
bca
ca
Hence
lcp(1,2) = lcp(a, abca) = 1
lcp(1,4) = min(lcp(1,2), lcp(2,3), lcp(3,4)) = 0

You can't find LCP of any two suffixes by simply calculating the minimum of the lcp's of all pairs of adjacent suffixes between them on the array.
We can calculate the LCPs of any suffixes (i,j)
with the Help of Following :
LCP(suffix i,suffix j)=LCP[RMQ(i + 1; j)]
Also Note (i<j) as LCP (suff i,suff j) may not necessarly equal LCP (Suff j,suff i).
RMQ is Range Minimum Query .
Page 3 of this paper.
Details:
Step 1:
First Calculate LCP of Adjacents /consecutive Suffix Pairs .
n= Length of string.
suffixArray[] is Suffix array.
void calculateadjacentsuffixes(int n)
{
for (int i=0; i<n; ++i) Rank[suffixArray[i]] = i;
Height[0] = 0;
for (int i=0, h=0; i<n; ++i)
{
if (Rank[i] > 0)
{
int j = suffixArray[Rank[i]-1];
while (i + h < n && j + h < n && str[i+h] == str[j+h])
{
h++;
}
Height[Rank[i]] = h;
if (h > 0) h--;
}
}
}
Note: Height[i]=LCPs of (Suffix i-1 ,suffix i) ie. Height array contains LCP of adjacent suffix.
Step 2:
Calculate LCP of Any two suffixes i,j using RMQ concept.
RMQ pre-compute function:
void preprocesses(int N)
{
int i, j;
//initialize M for the intervals with length 1
for (i = 0; i < N; i++)
M[i][0] = i;
//compute values from smaller to bigger intervals
for (j = 1; 1 << j <= N; j++)
{
for (i = 0; i + (1 << j) - 1 < N; i++)
{
if (Height[M[i][j - 1]] < Height[M[i + (1 << (j - 1))][j - 1]])
{
M[i][j] = M[i][j - 1];
}
else
{
M[i][j] = M[i + (1 << (j - 1))][j - 1];
}
}
}
}
Step 3: Calculate LCP between any two Suffixes i,j
int LCP(int i,int j)
{
/*Make sure we send i<j always */
/* By doing this ,it resolve following
suppose ,we send LCP(5,4) then it converts it to LCP(4,5)
*/
if(i>j)
swap(i,j);
/*conformation over*/
if(i==j)
{
return (Length_of_str-suffixArray[i]);
}
else
{
return Height[RMQ(i+1,j)];
//LCP(suffix i,suffix j)=LCPadj[RMQ(i + 1; j)]
//LCPadj=LCP of adjacent suffix =Height.
}
}
Where RMQ function is:
int RMQ(int i,int j)
{
int k=log((double)(j-i+1))/log((double)2);
int vv= j-(1<<k)+1 ;
if(Height[M[i][k]]<=Height[ M[vv][ k] ])
return M[i][k];
else
return M[ vv ][ k];
}
Refer Topcoder tutorials for RMQ.
You can check the complete implementation in C++ at my blog.

Related

Number of palindromic subsequences of length 5

Given a String s, return the number of palindromic subsequences of length 5.
Test case 1:
input : "abcdba"
Output : 2
"abcba" and "abdba"
Test case 2:
input : "aabccba"
Output : 4
"abcba" , "abcba" , "abcba" , "abcba"
Max length of String: 700
My TLE Approach: O(2^n)
https://www.online-java.com/5YegWkAVad
Any inputs are highly appreciated...

Whenever 2 characters match, we only have to find how many palindromes of length 3 are possible in between these 2 characters.
For example:
a bcbc a
^ ^
|_ _ _ |
In the above example, you can find 2 palindromes of length 3 which is bcb and cbc. Hence, we can make palindromic sequence of length 5 as abcba or acbca. Hence, the answer is 2.
Computing how many palindromes of length 3 are possible for every substring can be time consuming if we don't cache the results when we do it the first time. So, cache those results and reuse them for queries generated by other 2 character matches. (a.k.a dynamic programming)
This way, the solution becomes quadratic O(n^2) time where n is length of the string.
Snippet:
private static long solve(String s){
long ans = 0;
int len = s.length();
long[][] dp = new long[len][len];
/* compute how many palindromes of length 3 are possible for every 2 characters match */
for(int i = len - 2;i >= 0; --i){
for(int j = i + 2; j < len; ++j){
dp[i][j] = dp[i][j-1] + (dp[i + 1][j] == dp[i + 1][j-1] ? 0 : dp[i + 1][j] - dp[i + 1][j - 1]);
if(s.charAt(i) == s.charAt(j)){
dp[i][j] += j - i - 1;
}
}
}
/* re-use the above data to calculate for palindromes of length 5*/
for(int i = 0; i < len; ++i){
for(int j = i + 4; j < len; ++j){
if(s.charAt(i) == s.charAt(j)){
ans += dp[i + 1][j - 1];
}
}
}
//for(int i=0;i<len;++i) System.out.println(Arrays.toString(dp[i]));
return ans;
}
Online Demo
Update:
dp[i][j] = dp[i][j-1] + (dp[i + 1][j] == dp[i + 1][j-1] ? 0 : dp[i + 1][j] - dp[i + 1][j - 1]);
The above line basically mean this,
For any substring, say bcbcb, with matching first and last b, the total 3 length palindromes can be addition of
The total count possible for bcbc.
The total count possible for cbcb.
The total count possible for bcbcb (which is (j - i - 1) in the if condition).
dp[i][j] For the current substring at hand.
dp[i][j-1] - Adding the previous substring counts of length 3. In this example, bcbc.
dp[i + 1][j], Adding the substring ending at current index excluding the first character. (Here, cbcb).
dp[i + 1][j] == dp[i + 1][j-1] ? 0 : dp[i + 1][j] - dp[i + 1][j - 1] This is to basically avoid duplicate counting for internal substrings and only adding them if there is a difference in the counts.

Observation:
The preceding method is too cool because it gives the impression of a number of palindrome substrings of length 5, whereas the preceding method is o(n^2) using 2 DP. Cant we reduced to o(n) by using 3 dp.  Yes we can becuase here n should should be length of string but next  2 parameters length lie in range 0 to 25.
Eg: dp[i][j][k]  j, k is between 0 and 25. i is between 0 and the length of the string.
We can't get an idea directly from observation, so go to the institution.
Intitution:
Of length 3
For palindromic substring of length 3 should be  number of palindrome substring of length 3 would count the occurence of left to the index multiply with right side of the index .
Eg: _ s[i]_  => number of palindromic substring of length 3 should be at index is occurence of each alphabet before multiply with after index. So that it becomes palindrome of length 3.
Time complexity : o(n)
Of length 5
Similary for the case of length if 5 => _ _ s[i] _ _ Here number of occurence of combination of 2 characters before index multiply with after index, So that it becomes palindrome of length 5.
Eg: x y s[i] y x ; x,y belongs to a to z. Here we need to store occurence of xy before index and after index.
Time complexity : o(26 * 26 * n)
Of length 7
Similary for the case of length if 7 => _ _ _ s[i] _ _ _ Here number of occurence of combination of 3 characters before index multiply with after index, So that it becomes palindrome of length 7.
Eg: x y z s[i] z y x ; x,y ,z belongs to a to z. Here we need to store occurence of xyz before index and after index.
Time complexity : o(26 * 26 * 26*n)
Code
int pre[10000][26][26], suf[10000][26][26], cnts[26] = {};
int countPalindromes(string s) {
int mod = 1e9 + 7, n = s.size(), ans = 0;
for (int i = 0; i < n; i++) {
int c = s[i] - '0';
if (i)
for (int j = 0; j < 26; j++)
for (int k = 0; k < 26; k++) {
pre[i][j][k] = pre[i - 1][j][k];
if (k == c) pre[i][j][k] += cnts[j];
}
cnts[c]++;
}
memset(cnts, 0, sizeof(cnts));
for (int i = n - 1; i >= 0; i--) {
int c = s[i] - '0';
if (i < n - 1)
for (int j = 0; j < 26; j++)
for (int k = 0; k < 26; k++) {
suf[i][j][k] = suf[i + 1][j][k];
if (k == c) suf[i][j][k] += cnts[j];
}
cnts[c]++;
}
for (int i = 2; i < n - 2; i++)
for (int j = 0; j < 26; j++)
for (int k = 0; k < 26; k++)
ans = (ans + 1LL * pre[i - 1][j][k] * suf[i + 1][j][k]) % mod;
return ans;
}
Reference
Here's a link! for related problem , there 0 to 9, Most voted blog for problem.

Dynamic Programming, choosing the highest total value

The Data:
A list of integers increasing in order (0,1,2,3,4,5.......)
A list of values that belong to those integers. As an example, 0 = 33, 1 = 45, 2 = 21, ....etc.
And an incrementing variable x which represent a minimum jump value.
x is the value of each jump. For example if x = 2, if 1 is chosen you cannot choose 2.
I need to determine the best way to choose integers, given some (x), that produce the highest total value from the value list.
EXAMPLE:
A = a set of 1 foot intervals (0,1,2,3,4,5,6,7,8,9)
B = the amount of money at each interval (9,5,7,3,2,7,8,10,21,12)
Distance = the minimum distance you can cover
- i.e. if the minimum distance is 3, you must skip 2 feet and leave the money, then you can
pick up the amount at the 3rd interval.
if you pick up at 0, the next one you can pick up is 3, if you choose 3 you can
next pick up 6 (after skipping 4 and 5). BUT, you dont have to pick up 6, you
could pick up 7 if it is worth more. You just can't pick up early.
So, how can I programmatically make the best jumps and end with the most money at the end?

So I am using the below equation for computing the opt value in the dynamic programming:
Here d is distance.
if (i -d) >= 0
opt(i) = max (opt(i-1), B[i] + OPT(i-d));
else
opt(i) = max (opt(i-1), B[i]);
Psuedo-code for computing the OPT value:
int A[] = {integers list}; // This is redundant if the integers are consecutive and are always from 0..n.
int B[] = {values list};
int i = 0;
int d = distance; // minimum distance between two picks.
int numIntegers = sizeof(A)/sizeof(int);
int opt[numIntegers];
opt[0] = B[0]; // For the first one Optimal value is picking itself.
for (i=1; i < numIntegers; i++) {
if ((i-d) < 0) {
opt[i] = max (opt[i-1], B[i]);
} else {
opt[i] = max (opt[i-1], B[i] + opt[i-d]);
}
}
EDIT based on OP's requirement about getting the selected integers from B:
for (i=numIntegres - 1; i >= 0;) {
if ((i == 0) && (opt[i] > 0)) {
printf ("%d ", i);
break;
}
if (opt[i] > opt[i-1]) {
printf ("%d ", i);
i = i -d;
} else {
i = i - 1;
}
}
If A[] does not have consecutive integers from 0 to n.
int A[] = {integers list}; // Here the integers may not be consecutive
int B[] = {values list};
int i = 0, j = 0;
int d = distance; // minimum distance between two picks.
int numAs = sizeof(A)/sizeof(int);
int numIntegers = A[numAs-1]
int opt[numIntegers];
opt[0] = 0;
if (A[0] == 0) {
opt[0] = B[0]; // For the first one Optimal value is picking itself.
j = 1;
}
for (i=1; i < numIntegers && j < numAs; i++, j++) {
if (i < A[j]) {
while (i < A[j]) {
opt[i] = opt[i -1];
i = i + 1:
}
}
if ((i-d) < 0) {
opt[i] = max (opt[i-1], B[j]);
} else {
opt[i] = max (opt[i-1], B[j] + opt[i-d]);
}
}

Finding the ranking of a word (permutations) with duplicate letters

I'm posting this although much has already been posted about this question. I didn't want to post as an answer since it's not working. The answer to this post (Finding the rank of the Given string in list of all possible permutations with Duplicates) did not work for me.
So I tried this (which is a compilation of code I've plagiarized and my attempt to deal with repetitions). The non-repeating cases work fine. BOOKKEEPER generates 83863, not the desired 10743.
(The factorial function and letter counter array 'repeats' are working correctly. I didn't post to save space.)
while (pointer != length)
{
if (sortedWordChars[pointer] != wordArray[pointer])
{
// Swap the current character with the one after that
char temp = sortedWordChars[pointer];
sortedWordChars[pointer] = sortedWordChars[next];
sortedWordChars[next] = temp;
next++;
//For each position check how many characters left have duplicates,
//and use the logic that if you need to permute n things and if 'a' things
//are similar the number of permutations is n!/a!
int ct = repeats[(sortedWordChars[pointer]-64)];
// Increment the rank
if (ct>1) { //repeats?
System.out.println("repeating " + (sortedWordChars[pointer]-64));
//In case of repetition of any character use: (n-1)!/(times)!
//e.g. if there is 1 character which is repeating twice,
//x* (n-1)!/2!
int dividend = getFactorialIter(length - pointer - 1);
int divisor = getFactorialIter(ct);
int quo = dividend/divisor;
rank += quo;
} else {
rank += getFactorialIter(length - pointer - 1);
}
} else
{
pointer++;
next = pointer + 1;
}
}

Note: this answer is for 1-based rankings, as specified implicitly by example. Here's some Python that works at least for the two examples provided. The key fact is that suffixperms * ctr[y] // ctr[x] is the number of permutations whose first letter is y of the length-(i + 1) suffix of perm.
from collections import Counter
def rankperm(perm):
rank = 1
suffixperms = 1
ctr = Counter()
for i in range(len(perm)):
x = perm[((len(perm) - 1) - i)]
ctr[x] += 1
for y in ctr:
if (y < x):
rank += ((suffixperms * ctr[y]) // ctr[x])
suffixperms = ((suffixperms * (i + 1)) // ctr[x])
return rank
print(rankperm('QUESTION'))
print(rankperm('BOOKKEEPER'))
Java version:
public static long rankPerm(String perm) {
long rank = 1;
long suffixPermCount = 1;
java.util.Map<Character, Integer> charCounts =
new java.util.HashMap<Character, Integer>();
for (int i = perm.length() - 1; i > -1; i--) {
char x = perm.charAt(i);
int xCount = charCounts.containsKey(x) ? charCounts.get(x) + 1 : 1;
charCounts.put(x, xCount);
for (java.util.Map.Entry<Character, Integer> e : charCounts.entrySet()) {
if (e.getKey() < x) {
rank += suffixPermCount * e.getValue() / xCount;
}
}
suffixPermCount *= perm.length() - i;
suffixPermCount /= xCount;
}
return rank;
}
Unranking permutations:
from collections import Counter
def unrankperm(letters, rank):
ctr = Counter()
permcount = 1
for i in range(len(letters)):
x = letters[i]
ctr[x] += 1
permcount = (permcount * (i + 1)) // ctr[x]
# ctr is the histogram of letters
# permcount is the number of distinct perms of letters
perm = []
for i in range(len(letters)):
for x in sorted(ctr.keys()):
# suffixcount is the number of distinct perms that begin with x
suffixcount = permcount * ctr[x] // (len(letters) - i)
if rank <= suffixcount:
perm.append(x)
permcount = suffixcount
ctr[x] -= 1
if ctr[x] == 0:
del ctr[x]
break
rank -= suffixcount
return ''.join(perm)

If we use mathematics, the complexity will come down and will be able to find rank quicker. This will be particularly helpful for large strings.
(more details can be found here)
Suggest to programmatically define the approach shown here (screenshot attached below) given below)

I would say David post (the accepted answer) is super cool. However, I would like to improve it further for speed. The inner loop is trying to find inverse order pairs, and for each such inverse order, it tries to contribute to the increment of rank. If we use an ordered map structure (binary search tree or BST) in that place, we can simply do an inorder traversal from the first node (left-bottom) until it reaches the current character in the BST, rather than traversal for the whole map(BST). In C++, std::map is a perfect one for BST implementation. The following code reduces the necessary iterations in loop and removes the if check.
long long rankofword(string s)
{
long long rank = 1;
long long suffixPermCount = 1;
map<char, int> m;
int size = s.size();
for (int i = size - 1; i > -1; i--)
{
char x = s[i];
m[x]++;
for (auto it = m.begin(); it != m.find(x); it++)
rank += suffixPermCount * it->second / m[x];
suffixPermCount *= (size - i);
suffixPermCount /= m[x];
}
return rank;
}

#Dvaid Einstat, this was really helpful. It took me a WHILE to figure out what you were doing as I am still learning my first language(C#). I translated it into C# and figured that I'd give that solution as well since this listing helped me so much!
Thanks!
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
namespace CsharpVersion
{
class Program
{
//Takes in the word and checks to make sure that the word
//is between 1 and 25 charaters inclusive and only
//letters are used
static string readWord(string prompt, int high)
{
Regex rgx = new Regex("^[a-zA-Z]+$");
string word;
string result;
do
{
Console.WriteLine(prompt);
word = Console.ReadLine();
} while (word == "" | word.Length > high | rgx.IsMatch(word) == false);
result = word.ToUpper();
return result;
}
//Creates a sorted dictionary containing distinct letters
//initialized with 0 frequency
static SortedDictionary<char,int> Counter(string word)
{
char[] wordArray = word.ToCharArray();
int len = word.Length;
SortedDictionary<char,int> count = new SortedDictionary<char,int>();
foreach(char c in word)
{
if(count.ContainsKey(c))
{
}
else
{
count.Add(c, 0);
}
}
return count;
}
//Creates a factorial function
static int Factorial(int n)
{
if (n <= 1)
{
return 1;
}
else
{
return n * Factorial(n - 1);
}
}
//Ranks the word input if there are no repeated charaters
//in the word
static Int64 rankWord(char[] wordArray)
{
int n = wordArray.Length;
Int64 rank = 1;
//loops through the array of letters
for (int i = 0; i < n-1; i++)
{
int x=0;
//loops all letters after i and compares them for factorial calculation
for (int j = i+1; j<n ; j++)
{
if (wordArray[i] > wordArray[j])
{
x++;
}
}
rank = rank + x * (Factorial(n - i - 1));
}
return rank;
}
//Ranks the word input if there are repeated charaters
//in the word
static Int64 rankPerm(String word)
{
Int64 rank = 1;
Int64 suffixPermCount = 1;
SortedDictionary<char, int> counter = Counter(word);
for (int i = word.Length - 1; i > -1; i--)
{
char x = Convert.ToChar(word.Substring(i,1));
int xCount;
if(counter[x] != 0)
{
xCount = counter[x] + 1;
}
else
{
xCount = 1;
}
counter[x] = xCount;
foreach (KeyValuePair<char,int> e in counter)
{
if (e.Key < x)
{
rank += suffixPermCount * e.Value / xCount;
}
}
suffixPermCount *= word.Length - i;
suffixPermCount /= xCount;
}
return rank;
}
static void Main(string[] args)
{
Console.WriteLine("Type Exit to end the program.");
string prompt = "Please enter a word using only letters:";
const int MAX_VALUE = 25;
Int64 rank = new Int64();
string theWord;
do
{
theWord = readWord(prompt, MAX_VALUE);
char[] wordLetters = theWord.ToCharArray();
Array.Sort(wordLetters);
bool duplicate = false;
for(int i = 0; i< theWord.Length - 1; i++)
{
if(wordLetters[i] < wordLetters[i+1])
{
duplicate = true;
}
}
if(duplicate)
{
SortedDictionary<char, int> counter = Counter(theWord);
rank = rankPerm(theWord);
Console.WriteLine("\n" + theWord + " = " + rank);
}
else
{
char[] letters = theWord.ToCharArray();
rank = rankWord(letters);
Console.WriteLine("\n" + theWord + " = " + rank);
}
} while (theWord != "EXIT");
Console.WriteLine("\nPress enter to escape..");
Console.Read();
}
}
}

If there are k distinct characters, the i^th character repeated n_i times, then the total number of permutations is given by
(n_1 + n_2 + ..+ n_k)!
------------------------------------------------
n_1! n_2! ... n_k!
which is the multinomial coefficient.
Now we can use this to compute the rank of a given permutation as follows:
Consider the first character(leftmost). say it was the r^th one in the sorted order of characters.
Now if you replace the first character by any of the 1,2,3,..,(r-1)^th character and consider all possible permutations, each of these permutations will precede the given permutation. The total number can be computed using the above formula.
Once you compute the number for the first character, fix the first character, and repeat the same with the second character and so on.
Here's the C++ implementation to your question
#include<iostream>
using namespace std;
int fact(int f) {
if (f == 0) return 1;
if (f <= 2) return f;
return (f * fact(f - 1));
}
int solve(string s,int n) {
int ans = 1;
int arr[26] = {0};
int len = n - 1;
for (int i = 0; i < n; i++) {
s[i] = toupper(s[i]);
arr[s[i] - 'A']++;
}
for(int i = 0; i < n; i++) {
int temp = 0;
int x = 1;
char c = s[i];
for(int j = 0; j < c - 'A'; j++) temp += arr[j];
for (int j = 0; j < 26; j++) x = (x * fact(arr[j]));
arr[c - 'A']--;
ans = ans + (temp * ((fact(len)) / x));
len--;
}
return ans;
}
int main() {
int i,n;
string s;
cin>>s;
n=s.size();
cout << solve(s,n);
return 0;
}

Java version of unrank for a String:
public static String unrankperm(String letters, int rank) {
Map<Character, Integer> charCounts = new java.util.HashMap<>();
int permcount = 1;
for(int i = 0; i < letters.length(); i++) {
char x = letters.charAt(i);
int xCount = charCounts.containsKey(x) ? charCounts.get(x) + 1 : 1;
charCounts.put(x, xCount);
permcount = (permcount * (i + 1)) / xCount;
}
// charCounts is the histogram of letters
// permcount is the number of distinct perms of letters
StringBuilder perm = new StringBuilder();
for(int i = 0; i < letters.length(); i++) {
List<Character> sorted = new ArrayList<>(charCounts.keySet());
Collections.sort(sorted);
for(Character x : sorted) {
// suffixcount is the number of distinct perms that begin with x
Integer frequency = charCounts.get(x);
int suffixcount = permcount * frequency / (letters.length() - i);
if (rank <= suffixcount) {
perm.append(x);
permcount = suffixcount;
if(frequency == 1) {
charCounts.remove(x);
} else {
charCounts.put(x, frequency - 1);
}
break;
}
rank -= suffixcount;
}
}
return perm.toString();
}
See also n-th-permutation-algorithm-for-use-in-brute-force-bin-packaging-parallelization.

Find the index of a specific combination without generating all ncr combinations

I am trying to find the index of a specific combination without generating the actual list of all possible combinations. For ex: 2 number combinations from 1 to 5 produces, 1,2;1,3,1,4,1,5;2,3,2,4,2,5..so..on. Each combination has its own index starting with zero,if my guess is right. I want to find that index without generating the all possible combination for a given combination. I am writing in C# but my code generates all possible combinations on fly. This would be expensive if n and r are like 80 and 9 and i even can't enumerate the actual range. Is there any possible way to find the index without producing the actual combination for that particular index
public int GetIndex(T[] combination)
{
int index = (from i in Enumerable.Range(0, 9)
where AreEquivalentArray(GetCombination(i), combination)
select i).SingleOrDefault();
return index;
}

I found the answer to my own question in simple terms. It is very simple but seems to be effective in my situation.The choose method is brought from other site though which generates the combinations count for n items chosen r:
public long GetIndex(T[] combinations)
{
long sum = Choose(items.Count(),atATime);
for (int i = 0; i < combinations.Count(); i++)
{
sum = sum - Choose(items.ToList().IndexOf(items.Max())+1 - (items.ToList().IndexOf(combinations[i])+1), atATime - i);
}
return sum-1;
}
private long Choose(int n, int k)
{
long result = 0;
int delta;
int max;
if (n < 0 || k < 0)
{
throw new ArgumentOutOfRangeException("Invalid negative parameter in Choose()");
}
if (n < k)
{
result = 0;
}
else if (n == k)
{
result = 1;
}
else
{
if (k < n - k)
{
delta = n - k;
max = k;
}
else
{
delta = k;
max = n - k;
}
result = delta + 1;
for (int i = 2; i <= max; i++)
{
checked
{
result = (result * (delta + i)) / i;
}
}
}
return result;
}

Generate all compositions of an integer into k parts

I can't figure out how to generate all compositions (http://en.wikipedia.org/wiki/Composition_%28number_theory%29) of an integer N into K parts, but only doing it one at a time. That is, I need a function that given the previous composition generated, returns the next one in the sequence. The reason is that memory is limited for my application. This would be much easier if I could use Python and its generator functionality, but I'm stuck with C++.
This is similar to Next Composition of n into k parts - does anyone have a working algorithm?
Any assistance would be greatly appreciated.

Preliminary remarks
First start from the observation that [1,1,...,1,n-k+1] is the first composition (in lexicographic order) of n over k parts, and [n-k+1,1,1,...,1] is the last one.
Now consider an exemple: the composition [2,4,3,1,1], here n = 11 and k=5. Which is the next one in lexicographic order? Obviously the rightmost part to be incremented is 4, because [3,1,1] is the last composition of 5 over 3 parts.
4 is at the left of 3, the rightmost part different from 1.
So turn 4 into 5, and replace [3,1,1] by [1,1,2], the first composition of the remainder (3+1+1)-1 , giving [2,5,1,1,2]
Generation program (in C)
The following C program shows how to compute such compositions on demand in lexicographic order
#include <stdio.h>
#include <stdbool.h>
bool get_first_composition(int n, int k, int composition[k])
{
if (n < k) {
return false;
}
for (int i = 0; i < k - 1; i++) {
composition[i] = 1;
}
composition[k - 1] = n - k + 1;
return true;
}
bool get_next_composition(int n, int k, int composition[k])
{
if (composition[0] == n - k + 1) {
return false;
}
// there'a an i with composition[i] > 1, and it is not 0.
// find the last one
int last = k - 1;
while (composition[last] == 1) {
last--;
}
// turn a b ... y z 1 1 ... 1
// ^ last
// into a b ... (y+1) 1 1 1 ... (z-1)
// be careful, there may be no 1's at the end
int z = composition[last];
composition[last - 1] += 1;
composition[last] = 1;
composition[k - 1] = z - 1;
return true;
}
void display_composition(int k, int composition[k])
{
char *separator = "[";
for (int i = 0; i < k; i++) {
printf("%s%d", separator, composition[i]);
separator = ",";
}
printf("]\n");
}
void display_all_compositions(int n, int k)
{
int composition[k]; // VLA. Please don't use silly values for k
for (bool exists = get_first_composition(n, k, composition);
exists;
exists = get_next_composition(n, k, composition)) {
display_composition(k, composition);
}
}
int main()
{
display_all_compositions(5, 3);
}
Results
[1,1,3]
[1,2,2]
[1,3,1]
[2,1,2]
[2,2,1]
[3,1,1]
Weak compositions
A similar algorithm works for weak compositions (where 0 is allowed).
bool get_first_weak_composition(int n, int k, int composition[k])
{
if (n < k) {
return false;
}
for (int i = 0; i < k - 1; i++) {
composition[i] = 0;
}
composition[k - 1] = n;
return true;
}
bool get_next_weak_composition(int n, int k, int composition[k])
{
if (composition[0] == n) {
return false;
}
// there'a an i with composition[i] > 0, and it is not 0.
// find the last one
int last = k - 1;
while (composition[last] == 0) {
last--;
}
// turn a b ... y z 0 0 ... 0
// ^ last
// into a b ... (y+1) 0 0 0 ... (z-1)
// be careful, there may be no 0's at the end
int z = composition[last];
composition[last - 1] += 1;
composition[last] = 0;
composition[k - 1] = z - 1;
return true;
}
Results for n=5 k=3
[0,0,5]
[0,1,4]
[0,2,3]
[0,3,2]
[0,4,1]
[0,5,0]
[1,0,4]
[1,1,3]
[1,2,2]
[1,3,1]
[1,4,0]
[2,0,3]
[2,1,2]
[2,2,1]
[2,3,0]
[3,0,2]
[3,1,1]
[3,2,0]
[4,0,1]
[4,1,0]
[5,0,0]
Similar algorithms can be written for compositions of n into k parts greater than some fixed value.

You could try something like this:
start with the array [1,1,...,1,N-k+1] of (K-1) ones and 1 entry with the remainder. The next composition can be created by incrementing the (K-1)th element and decreasing the last element. Do this trick as long as the last element is bigger than the second to last.
When the last element becomes smaller, increment the (K-2)th element, set the (K-1)th element to the same value and set the last element to the remainder again. Repeat the process and apply the same principle for the other elements when necessary.
You end up with a constantly sorted array that avoids duplicate compositions

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Longest Common Prefix property - string

I think the index referred by the source is not the index of the string itself, but index of the sorted suffixes. a abca bca ca Hence lcp(1,2) = lcp(a, abca) = 1 lcp(1,4) = min(lcp(1,2), lcp(2,3), lcp(3,4)) = 0

Related

Number of palindromic subsequences of length 5

Dynamic Programming, choosing the highest total value

Finding the ranking of a word (permutations) with duplicate letters

Find the index of a specific combination without generating all ncr combinations

Generate all compositions of an integer into k parts

Categories

Resources