dynamic programming for minimum cost of breaking the string - string

A certain string-processing language offers a primitive operation which splits a string into two pieces. Since this operation involves copying the original string, it takes n units of time for a string of length n, regardless of the location of the cut. Suppose, now, that you want to break a string into many pieces. The order in which the breaks are made can affect the total running time. For example, if you want to cut a 20-character string at positions 3 and 10, then making the first cut at position 3 incurs a total cost of 20+17=37, while doing position 10 first has a better cost of 20+10=30.
Give a dynamic programming algorithm that, given the locations of m cuts in a string of length n, finds the minimum cost of breaking the string into m + 1 pieces.
This problem is from "Algorithms" chapter6 6.9.
Since there is no answer for this problem, This is what I thought.
Define OPT(i,j,n) as the minimum cost of breaking the string, i for start index, j for end index of String and n for the remaining number of cut I can use.
Here is what I get:
OPT(i,j,n) = min {OPT(i,k,w) + OPT(k+1,j,n-w) + j-i} for i<=k<j and 0<=w<=n
Is it right or not? Please help, thx!

I think your recurrence relation can become more better. Here's what I came up with, define cost(i,j) to be the cost of cutting the string from index i to j. Then,
cost(i,j) = min {length of substring + cost(i,k) + cost(k,j) where i < k < j}

void s_cut()
{
int l,p;
int temp=0;
//ArrayList<Integer> al = new ArrayList<Integer>();
int al[];
Scanner s=new Scanner(System.in);
int table[][];
ArrayList<Integer> values[][];
int low=0,high=0;
int min=0;
l=s.nextInt();
p=s.nextInt();
System.out.println("The values are "+l+" "+p);
table= new int[l+1][l+1];
values= new ArrayList[l+1][l+1];
al= new int[p];
for(int i=0;i<p;i++)
{
al[i]=s.nextInt();
}
for(int i=0;i<=l;i++)
for(int j=0;j<=l;j++)
values[i][j]=new ArrayList<Integer>();
System.out.println();
for(int i=1;i<=l;i++)
table[i][i]=0;
//Arrays.s
Arrays.sort(al);
for(int i=0;i<p;i++)
{
System.out.print(al[i]+ " ");
}
for(int len=2;len<=l;len++)
{
//System.out.println("The length is "+len);
for(int i=1,j=i+len-1;j<=l;i++,j++)
{
high= min_index(al,j-1);
low= max_index(al,i);
System.out.println("Indices are "+low+" "+high);
if(low<=high && low!=-1 && high!=-1)
{
int cost=Integer.MAX_VALUE;;
for(int k=low;k<=high;k++)
{
//if(al[k]!=j)
temp=cost;
cost=Math.min(cost, table[i][al[k]]+table[al[k]+1][j]);
if(temp!=cost)
{
min=k;
//values[i][j].add(al[k]);
//values[i][j].addAll(values[i][al[k]]);
//values[i][j].addAll(values[al[k]+1][j]);
//values[i][j].addAll(values[i][al[k]]);
}
//else
//cost=0;
}
table[i][j]= len+cost;
values[i][j].add(al[min]);
//values[i][j].addAll(values[i][al[min]]);
values[i][j].addAll(values[al[min]+1][j]);
values[i][j].addAll(values[i][al[min]]);
}
else
table[i][j]=0;
System.out.println(" values are "+i+" "+j+" "+table[i][j]);
}
}
System.out.println(" The minimum cost is "+table[1][l]);
//temp=values[1][l];
for(int e: values[1][l])
{
System.out.print(e+"-->");
}
}
The above solution has the complexity of O(n^3).

Related

O(n) time complexity and O(1) space complexity way to see if two strings are permutations of each other

Is there an algorithm that can see if two strings are permutations of each other with O(n) time complexity and O(1) space complexity?
Yes sure there is a very nice way. You have to use count sort for this. There is no reason to generate prime numbers at all. Here is a C code snippet that describes the algorithm:
bool is_permutation(string s1, string s2) {
if(s1.length() != s2.length()) return false;
int count[256]; //assuming each character fits in one byte, also the authors sample solution seems to have this boundary
for(int i=0;i<256;i++) count[i]=0;
for(int i=0;i<s1.length();i++) { //count the digits to see if each digits occur same number of times in both strings
count[ s1[i] ]++;
count[ s2[i] ]--;
}
for(int i=0;i<256;i++) { //see if there is any digit that appeared in different frequency
if(count[i]!=0) return false;
}
return true;
}
EDIT: (I decided to add this after some comments related to order of my program)
The Lets try to calculate the time complexity of the algorithm I have used in my program:
n = max len of strings
m = max allowed different characters, assuming will having all consecutive ascii value in range [0,m-1]
Time complexity: O(max(n,m))
Memory Complexity O(m)
Now assuming m is a constant here the order becomes
Time complexity: O(n)
Memory Complexity O(1)
Here is a simple program I wrote in java that gives the answer in O(n) for time complexity and O(1) for space complexity. It works by mapping every character to a prime number and then multiplying together all of the characters in the string's prime mappings. If the two strings are permutations then they should have the same unique characters each with the same number of occurrences.
Here is some sample code that accomplishes this:
// maps keys to a corresponding unique prime
static Map<Integer, Integer> primes = generatePrimes(255); // use 255 for
// ASCII or the
// number of
// possible
// characters
public static boolean permutations(String s1, String s2) {
// both strings must be same length
if (s1.length() != s2.length())
return false;
// the corresponding primes for every char in both strings are multiplied together
int s1Product = 1;
int s2Product = 1;
for (char c : s1.toCharArray())
s1Product *= primes.get((int) c);
for (char c : s2.toCharArray())
s2Product *= primes.get((int) c);
return s1Product == s2Product;
}
private static Map<Integer, Integer> generatePrimes(int n) {
Map<Integer, Integer> primes = new HashMap<Integer, Integer>();
primes.put(0, 2);
for (int i = 2; primes.size() < n; i++) {
boolean divisible = false;
for (int v : primes.values()) {
if (i % v == 0) {
divisible = true;
break;
}
}
if (!divisible) {
primes.put(primes.size(), i);
System.out.println(i + " ");
}
}
return primes;
}

Maximum repeating substring of size n

Find the substring of length n that repeats a maximum number of times in a given string.
Input: abbbabbbb# 2
Output: bb
My solution:
public static String mrs(String s, int m) {
int n = s.length();
String[] suffixes = new String[n-m+1];
for (int i = 0; i < n-m+1; i++) {
suffixes[i] = s.substring(i, i+m);
}
Arrays.sort(suffixes);
String ans = "", tmp=suffixes[0].substring(0,m);
int cnt = 1, max=0;
for (int i = 0; i < n-m; i++) {
if (suffixes[i].equals(suffixes[i+1])){
cnt++;
}else{
if(cnt>max){
max = cnt;
ans =tmp;
}
cnt=0;
tmp = suffixes[i];
}
}
return ans;
}
Can it be done better than the above O(nm) time and O(n) space solution?
For a string of length L and a given length k (not to mess up with n and m which the question interchanges at times), we can compute polynomial hashes of all substrings of length k in O(L) (see Wikipedia for some elaboration on this subproblem).
Now, if we map the hash values to the number of times they occur, we get the value which occurs most frequently in O(L) (with a HashMap with high probability, or in O(L log L) with a TreeMap).
After that, just take the substring which got the most frequent hash as the answer.
This solution does not take hash collisions into account.
The idea is to just reduce the probability of collisions enough for the application (if it's too high, use multiple hashes, for example).
If the application demands that we absolutely never give a wrong answer, we can check the answer in O(L) with another algorithm (KMP, for example), and re-run the whole solution with a different hash function as long as the answer turns out to be wrong.

Time complexity of a string compression algorithm

I tried to do an exercice from cracking the code interview book about compressing a string :
Implement a method to perform basic string compression using the counts of
repeated characters. For example, the string aabcccccaaa would become
a2blc5a3. If the "compressed" string would not become smaller than the original
string, your method should return the original string.
The first proposition given by the author is the following :
public static String compressBad(String str) {
int size = countCompression(str);
if (size >= str.length()) {
return str;
}
String mystr = "";
char last = str.charAt(0);
int count = 1;
for (int i = 1; i < str.length(); i++) {
if (str.charAt(i) == last) {
count++;
} else {
mystr += last + "" + count;
last = str.charAt(i);
count = 1;
}
}
return mystr + last + count;
}
About the time complexity of thie algorithm, she said that :
The runtime is 0(p + k^2), where p is the size of the original string and k is the number of character sequences. For example, if the string is aabccdeeaa, then there are six character sequences. It's slow because string concatenation operates in 0(n^2) time
I have two main questions :
1) Why the time complexity is O(p+k^2), it should be only O(p) (where p is the size of the string) no? Because we do only p iterations not p + k^2.
2) Why the time complexity of a string concatenation is 0(n^2)???
I thought that if we have a string3 = string1 + string2 then we have a complexity of size(string1) + size(string2), because we have to create a copy of the two strings before adding them to a new string (create a string is O(1)). So, we will have an addition not a multiplication, no? It isn't the same thing for an array (if we used a char array for example) ?
Can you clarify these points please? I didn't understand how we calculate the complexity...
String concatenation is O(n) but in this case it's being concatenated K times.
Every time you find a sequence you have to copy the entire string + what you found. for example if you had four total sequences in the original string
the cost to get the final string will be:
(k-3)+ //first sequence
(k-3)+(k-2) + //copying previous string and adding new sequence
((k-3)+(+k-2)+k(-1) + //copying previous string and adding new sequence
((k-3)+(k-2)+(k-1)+k) //copying previous string and adding new sequence
Thus complexity is O(p + K^2)

Generate 50 random numbers and store them into an array c++

this is what i have of the function so far. This is only the beginning of the problem, it is asking to generate the random numbers in a 10 by 5 group of numbers for the output, then after this it is to be sorted by number size, but i am just trying to get this first part down.
/* Populate the array with 50 randomly generated integer values
* in the range 1-50. */
void populateArray(int ar[], const int n) {
int n;
for (int i = 1; i <= length - 1; i++){
for (int i = 1; i <= ARRAY_SIZE; i++) {
i = rand() % 10 + 1;
ar[n]++;
}
}
}
First of all we want to use std::array; It has some nice property, one of which is that it doesn't decay as a pointer. Another is that it knows its size. In this case we are going to use templates to make populateArray a generic enough algorithm.
template<std::size_t N>
void populateArray(std::array<int, N>& array) { ... }
Then, we would like to remove all "raw" for loops. std::generate_n in combination with some random generator seems a good option.
For the number generator we can use <random>. Specifically std::uniform_int_distribution. For that we need to get some generator up and running:
std::random_device device;
std::mt19937 generator(device());
std::uniform_int_distribution<> dist(1, N);
and use it in our std::generate_n algorithm:
std::generate_n(array.begin(), N, [&dist, &generator](){
return dist(generator);
});
Live demo

Sorting a string using another sorting order string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
I saw this in an interview question ,
Given a sorting order string, you are asked to sort the input string based on the given sorting order string.
for example if the sorting order string is dfbcae
and the Input string is abcdeeabc
the output should be dbbccaaee.
any ideas on how to do this , in an efficient way ?
The Counting Sort option is pretty cool, and fast when the string to be sorted is long compared to the sort order string.
create an array where each index corresponds to a letter in the alphabet, this is the count array
for each letter in the sort target, increment the index in the count array which corresponds to that letter
for each letter in the sort order string
add that letter to the end of the output string a number of times equal to it's count in the count array
Algorithmic complexity is O(n) where n is the length of the string to be sorted. As the Wikipedia article explains we're able to beat the lower bound on standard comparison based sorting because this isn't a comparison based sort.
Here's some pseudocode.
char[26] countArray;
foreach(char c in sortTarget)
{
countArray[c - 'a']++;
}
int head = 0;
foreach(char c in sortOrder)
{
while(countArray[c - 'a'] > 0)
{
sortTarget[head] = c;
head++;
countArray[c - 'a']--;
}
}
Note: this implementation requires that both strings contain only lowercase characters.
Here's a nice easy to understand algorithm that has decent algorithmic complexity.
For each character in the sort order string
scan string to be sorted, starting at first non-ordered character (you can keep track of this character with an index or pointer)
when you find an occurrence of the specified character, swap it with the first non-ordered character
increment the index for the first non-ordered character
This is O(n*m), where n is the length of the string to be sorted and m is the length of the sort order string. We're able to beat the lower bound on comparison based sorting because this algorithm doesn't really use comparisons. Like Counting Sort it relies on the fact that you have a predefined finite external ordering set.
Here's some psuedocode:
int head = 0;
foreach(char c in sortOrder)
{
for(int i = head; i < sortTarget.length; i++)
{
if(sortTarget[i] == c)
{
// swap i with head
char temp = sortTarget[head];
sortTarget[head] = sortTarget[i];
sortTarget[i] = temp;
head++;
}
}
}
In Python, you can just create an index and use that in a comparison expression:
order = 'dfbcae'
input = 'abcdeeabc'
index = dict([ (y,x) for (x,y) in enumerate(order) ])
output = sorted(input, cmp=lambda x,y: index[x] - index[y])
print 'input=',''.join(input)
print 'output=',''.join(output)
gives this output:
input= abcdeeabc
output= dbbccaaee
Use binary search to find all the "split points" between different letters, then use the length of each segment directly. This will be asymptotically faster then naive counting sort, but will be harder to implement:
Use an array of size 26*2 to store the begin and end of each letter;
Inspect the middle element, see if it is different from the element left to it. If so, then this is the begin for the middle element and end for the element before it;
Throw away the segment with identical begin and end (if there are any), recursively apply this algorithm.
Since there are at most 25 "split"s, you won't have to do the search for more than 25 segemnts, and for each segment it is O(logn). Since this is constant * O(logn), the algorithm is O(nlogn).
And of course, just use counting sort will be easier to implement:
Use an array of size 26 to record the number of different letters;
Scan the input string;
Output the string in the given sorting order.
This is O(n), n being the length of the string.
Interview questions are generally about thought process and don't usually care too much about language features, but I couldn't resist posting a VB.Net 4.0 version anyway.
"Efficient" can mean two different things. The first is "what's the fastest way to make a computer execute a task" and the second is "what's the fastest that we can get a task done". They might sound the same but the first can mean micro-optimizations like int vs short, running timers to compare execution times and spending a week tweaking every millisecond out of an algorithm. The second definition is about how much human time would it take to create the code that does the task (hopefully in a reasonable amount of time). If code A runs 20 times faster than code B but code B took 1/20th of the time to write, depending on the granularity of the timer (1ms vs 20ms, 1 week vs 20 weeks), each version could be considered "efficient".
Dim input = "abcdeeabc"
Dim sort = "dfbcae"
Dim SortChars = sort.ToList()
Dim output = New String((From c In input.ToList() Select c Order By SortChars.IndexOf(c)).ToArray())
Trace.WriteLine(output)
Here is my solution to the question
import java.util.*;
import java.io.*;
class SortString
{
public static void main(String arg[])throws IOException
{
BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
// System.out.println("Enter 1st String :");
// System.out.println("Enter 1st String :");
// String s1=br.readLine();
// System.out.println("Enter 2nd String :");
// String s2=br.readLine();
String s1="tracctor";
String s2="car";
String com="";
String uncom="";
for(int i=0;i<s2.length();i++)
{
if(s1.contains(""+s2.charAt(i)))
{
com=com+s2.charAt(i);
}
}
System.out.println("Com :"+com);
for(int i=0;i<s1.length();i++)
if(!com.contains(""+s1.charAt(i)))
uncom=uncom+s1.charAt(i);
System.out.println("Uncom "+uncom);
System.out.println("Combined "+(com+uncom));
HashMap<String,Integer> h1=new HashMap<String,Integer>();
for(int i=0;i<s1.length();i++)
{
String m=""+s1.charAt(i);
if(h1.containsKey(m))
{
int val=(int)h1.get(m);
val=val+1;
h1.put(m,val);
}
else
{
h1.put(m,new Integer(1));
}
}
StringBuilder x=new StringBuilder();
for(int i=0;i<com.length();i++)
{
if(h1.containsKey(""+com.charAt(i)))
{
int count=(int)h1.get(""+com.charAt(i));
while(count!=0)
{x.append(""+com.charAt(i));count--;}
}
}
x.append(uncom);
System.out.println("Sort "+x);
}
}
Here is my version which is O(n) in time. Instead of unordered_map, I could have just used a char array of constant size. i.,e. char char_count[256] (and done ++char_count[ch - 'a'] ) assuming the input strings has all ASCII small characters.
string SortOrder(const string& input, const string& sort_order) {
unordered_map<char, int> char_count;
for (auto ch : input) {
++char_count[ch];
}
string res = "";
for (auto ch : sort_order) {
unordered_map<char, int>::iterator it = char_count.find(ch);
if (it != char_count.end()) {
string s(it->second, it->first);
res += s;
}
}
return res;
}
private static String sort(String target, String reference) {
final Map<Character, Integer> referencesMap = new HashMap<Character, Integer>();
for (int i = 0; i < reference.length(); i++) {
char key = reference.charAt(i);
if (!referencesMap.containsKey(key)) {
referencesMap.put(key, i);
}
}
List<Character> chars = new ArrayList<Character>(target.length());
for (int i = 0; i < target.length(); i++) {
chars.add(target.charAt(i));
}
Collections.sort(chars, new Comparator<Character>() {
#Override
public int compare(Character o1, Character o2) {
return referencesMap.get(o1).compareTo(referencesMap.get(o2));
}
});
StringBuilder sb = new StringBuilder();
for (Character c : chars) {
sb.append(c);
}
return sb.toString();
}
In C# I would just use the IComparer Interface and leave it to Array.Sort
void Main()
{
// we defin the IComparer class to define Sort Order
var sortOrder = new SortOrder("dfbcae");
var testOrder = "abcdeeabc".ToCharArray();
// sort the array using Array.Sort
Array.Sort(testOrder, sortOrder);
Console.WriteLine(testOrder.ToString());
}
public class SortOrder : IComparer
{
string sortOrder;
public SortOrder(string sortOrder)
{
this.sortOrder = sortOrder;
}
public int Compare(object obj1, object obj2)
{
var obj1Index = sortOrder.IndexOf((char)obj1);
var obj2Index = sortOrder.IndexOf((char)obj2);
if(obj1Index == -1 || obj2Index == -1)
{
throw new Exception("character not found");
}
if(obj1Index > obj2Index)
{
return 1;
}
else if (obj1Index == obj2Index)
{
return 0;
}
else
{
return -1;
}
}
}

Resources