I was asked the following question in an onsite interview:
A string is considered "balanced" when every letter in the string appears in both uppercase and lowercase. For example, CATattac is balanced (a, c, t occur in both cases), while Madam is not (a, d only appear in lowercase). Write a function that, given a string, returns the shortest balanced substring of that string. For example:
“azABaabza” should return “ABaab”
“TacoCat” should return -1 (not balanced)
“AcZCbaBz” should return the entire string
Doing it with the brute-force approach is trivial: enumerate all substrings, check whether each one is balanced, and keep track of the length and starting index of the shortest one.
How do I optimize this? I have a strong feeling it can be done with a sliding-window/two-pointer approach, but I am not sure how. When should the pointers of the sliding window be updated?
Edit: Removing the sliding-window tag since this is not a sliding-window problem (as discussed in the comments).
This exploits a special property of the input: there are only 26 uppercase and 26 lowercase letters.
For each of the 26 letters j and each starting position i, let len[i][j] be the minimum length of a substring starting at position i that contains both the uppercase and the lowercase form of letter j.
Demo C++ code:
string s = "CATattac";
// if len[i][j] >= s.size() + 1, there is no matching pair for letter j from position i
vector<vector<int>> len(s.size(), vector<int>(26, 0));
for (int i = 0; i < 26; ++i) {
    // sentinel positions beyond the string mean "not seen yet"
    int upperPos = s.size() * 2;
    int lowerPos = s.size() * 2;
    for (int j = s.size() - 1; j >= 0; --j) {
        if (s[j] == 'A' + i) {
            upperPos = j;
        } else if (s[j] == 'a' + i) {
            lowerPos = j;
        }
        len[j][i] = max(lowerPos - j + 1, upperPos - j + 1);
    }
}
We also keep track of the count of characters.
// cnt[i][j] denotes the number of characters j in substring s[0..i-1]
// cnt[0][j] is always 0
vector<vector<int>> cnt(s.size() + 1, vector<int>(26, 0));
for (int i = 0; i < s.size(); ++i) {
    for (int j = 0; j < 26; ++j) {
        cnt[i + 1][j] = cnt[i][j];
        if (s[i] == 'A' + j || s[i] == 'a' + j) {
            ++cnt[i + 1][j];
        }
    }
}
Then we can loop over s.
int m = s.size() + 1;
for (int i = 0; i < s.size(); ++i) {
    bool done = false;
    int minLen = 1;
    while (!done && i + minLen <= s.size()) {
        // execute at most 26 times, a new character must be added to change minLen
        int prevMinLen = minLen;
        done = true;
        for (int j = 0; j < 26 && i + minLen <= s.size(); ++j) {
            if (cnt[i + minLen][j] - cnt[i][j] > 0) {
                // character j exists in the substring, have to find pair of it
                minLen = max(minLen, len[i][j]);
            }
        }
        if (prevMinLen != minLen) done = false;
    }
    // find overall minLen
    if (i + minLen <= s.size())
        m = min(m, minLen);
    cout << minLen << '\n';
}
Output (a value is valid if i + minLen <= s.size(); otherwise no balanced substring starts at that position):
The large values printed for invalid positions are an artifact of the sentinel used when building the array len.
8
4
15
14
13
12
11
10
I'm not sure whether there is a simpler solution, but this is the best I could think of right now.
Time complexity: O(N) with a constant factor of 26 * 26.
Edit: I previously had O(N log N) due to an unnecessary binary search.
I thought of a solution, which is technically O(n), where n is the length of the string, but the constant is pretty large.
For simplicity's sake, let's consider an analogous situation with only two letters, A and B (and their lowercase counterparts), and let l be the size of the alphabet for future reference. I worked on an example string ABabBaaA.
We start by computing the prefix counts of the number of occurrences of each letter. In this case, we get
i: 0, 1, 2, 3, 4, 5, 6, 7, 8
----------------------------
A: 0, 1, 1, 1, 1, 1, 1, 1, 2
a: 0, 0, 0, 1, 1, 1, 2, 3, 3
B: 0, 0, 1, 1, 1, 2, 2, 2, 2
b: 0, 0, 0, 0, 1, 1, 1, 1, 1
This way, assuming we are indexing the string starting from 1 (for implementation's sake you can add an extra character to the beginning, like a dollar sign $), we can get the number of occurrences of each letter on any substring in constant time (or rather -- in O(l), but in my case l is set to 2 and in your case l = 26 so technically this is constant time).
OK now we prepare arrays / vectors / queues of character indices, so if the character A appears on indices 1 and 8, the structure will consist of 1 and 8. We get
A: 1, 8
a: 3, 6, 7
B: 2, 5
b: 4
What is important is that, because these index lists are sorted, we can look up the "lowest element greater than" a given index in amortized constant time: as the left index advances, we discard, one by one, the indices that have fallen behind it.
Now, the algorithm. Starting at each (left) index greater than 0, we will find the earliest right index for which the substring bound by [left_index, right_index] is balanced. We do that as follows:
0. Start with left_index = right_index = i for i = 1, ..., n.
1. Read the array of prefix counts for right_index and subtract the prefix counts for left_index - 1, receiving the counts for the substring [left_index, right_index]. Find any letter which fails the "balance" check. If there is none, you have found the shortest balanced substring starting at left_index.
2. Find the first occurrence of the "missing" letter that is greater than left_index. Set right_index to the index of that occurrence. Go to step 1, keeping the modified right_index.
For example: starting with left_index = right_index = 1 we see that the number of occurrences of each letter in the substring is 1, 0, 0, 0, so a fails the check. The earliest occurrence of a is 3, so we set right_index = 3. We go back to step 1 receiving a new array of occurrences: 1, 1, 1, 0. Now b fails the check, and its earliest occurrence greater than 1 is 4, so we set right_index to 4. We go to step 1 receiving an array of occurrences 1, 1, 1, 1, which passes the balance check.
Another example: starting with left_index = right_index = 2 we get in step 1 an array of occurrences 0, 0, 1, 0. Now b fails the check. The earliest occurrence of b greater than left_index is 4, so we set right_index to 4. Now we get an array of occurrences 0, 1, 1, 1, so A fails the check. The earliest occurrence of A greater than left_index is 8, so we set right_index to that. Now, the array of occurrences is 2-1, 3-0, 2-0, 1-0, which is 1, 3, 2, 1 and it passes the balance check.
Ultimately we will find the shortest balanced substring to be bB with left_index = 4.
The complexity of this algorithm is O(nl^2) because: we start at n different indices and we perform a maximum of l lookups (for l different letters which can fail the check) in O(1). For each lookup, we have to calculate l differences of prefix sums. But as l is constant (albeit it may be large, like 26), this simplifies to O(n).
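The procedure above could be sketched in Python roughly as follows. This is only an illustration under assumptions of my own: the input contains ASCII letters only, indices are 0-based, and a single table of next occurrences (called nxt here, a name not used in the answer) stands in for both the prefix counts and the per-letter index lists. The function name shortest_balanced is likewise made up for this example.

```python
import string

def shortest_balanced(s):
    """Return the shortest balanced substring of s, or -1 if none exists."""
    n = len(s)
    # column of each letter: a-z -> 0..25, A-Z -> 26..51
    col = {ch: k for k, ch in enumerate(string.ascii_letters)}
    # nxt[i][c] = smallest index >= i where letter c occurs, or n if it never does
    nxt = [[n] * 52 for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        nxt[i] = nxt[i + 1][:]
        nxt[i][col[s[i]]] = i
    best = None
    for left in range(n):
        right = left
        ok = True
        changed = True
        while ok and changed:
            changed = False
            for k in range(26):
                lo, up = nxt[left][k], nxt[left][26 + k]
                if (lo <= right) != (up <= right):
                    # one case is inside the window and the other is not:
                    # extend the window to the first occurrence of the missing case
                    right = max(lo, up)
                    if right == n:
                        ok = False  # the missing case never occurs after left
                        break
                    changed = True
        if ok and (best is None or right - left + 1 < len(best)):
            best = s[left:right + 1]
    return best if best is not None else -1

print(shortest_balanced("azABaabza"))
print(shortest_balanced("TacoCat"))
```

Each left index extends the window at most 52 times, and every pass checks 26 letters, which matches the O(n l^2) bound discussed above.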
I'm using a recursive approach for this; I'm not sure what its time complexity is, though.
The idea is to check which characters in the string are present in both their lowercase and uppercase forms. Any character that doesn't appear in both forms is replaced with a space ' '. We then split the remaining string on ' ' into a list.
In the first case, if only one string is left afterwards, we return its length.
In the second case, if no characters are left, we return -1.
In the third case, if more than one string is left, we recursively evaluate each of them and return the largest length found.
from collections import Counter
def findMutual(s):
    lower = dict(Counter( [x for x in s if x.lower() == x] ))
    upper = dict(Counter( [x for x in s if x.upper() == x] ))
    mutual = {}
    for charr in lower:
        if charr.upper() in upper:
            mutual[charr] = upper[charr.upper()] + lower[charr]
    matching_charrs = ''.join([x if x.lower() in mutual else ' ' for x in s ]).split()
    print(s)
    print(matching_charrs)
    return matching_charrs
def smallestSubstring(s):
    matching_charrs = findMutual(s)
    if len(matching_charrs) == 1:
        return(len(matching_charrs[0]))
    elif len(matching_charrs) == 0:
        return(-1)
    else:
        list_lens = []
        for i in matching_charrs:
            list_lens.append(smallestSubstring(i))
        return max(list_lens)
print(smallestSubstring('azABaabza'))
print(smallestSubstring('dAcZCbaBz'))
print(smallestSubstring('TacoCat'))
print(smallestSubstring('Tt'))
print(smallestSubstring('T'))
print(smallestSubstring('TaCc'))
Consider the sequence of numbers from 1 to 𝑁. For example, for 𝑁 = 9,
we have 1, 2, 3, 4, 5, 6, 7, 8, 9.
Now, place one of the following three operators between each pair of adjacent numbers:
"+" sum
"-" subtraction
"#" paste operator --> pastes the previous and the next operands together.
For example, 1#2 = 12
How can I calculate the number of possible sequences that yield zero?
Example for N = 7:
1+2-3+4-5-6+7
1+2-3-4+5+6-7
1-2#3+4+5+6+7
1-2#3-4#5+6#7
1-2+3+4-5+6-7
1-2-3-4-5+6+7
Note the fourth sequence: it is the same as 1-23-45+67, and the result is 0.
All of the above sequences evaluate to zero.
Here is my recursion-based solution, just to build your intuition so that you can approach and improve this solution using dynamic programming on your own (implemented in C++):
#include <iostream>
#include <cmath>
#include <string>
using namespace std;
// N is the input
// index_count is the index count in the given sequence
// sum is the total sum of a given sequence
// note: only pairs of adjacent numbers are pasted here; 1#2 and chains
// such as 1#2#3 are not generated
int isEvaluteToZero(int N, int index_count, int sum){
    // if N==1, then the sequence only contains 1 which is not 0, so return 0
    if(N==1){
        return 0;
    }
    // Base case
    // if index_count is equal to N and total sum is 0, return 1, else 0
    if(index_count==N){
        if(sum==0){
            return 1;
        }
        return 0;
    }
    // recursively call by considering '+' between index_count and index_count+1
    // increase index_count by 1
    int placeAdd = isEvaluteToZero(N, index_count+1, sum+index_count+1);
    // recursively call by considering '-' between index_count and index_count+1
    // increase index_count by 1
    int placeMinus = isEvaluteToZero(N, index_count+1, sum-index_count-1);
    // place '#'
    int placePaste;
    if(index_count+2<=N){
        // paste the previous and the next operands
        // For e.g., (8#9) = 8*(10^1)+9 = 89
        // (9#10) = 9*(10^2)+10 = 910
        // (99#100) = 99*(10^3)+100 = 99100
        // (999#1000) = 999*(10^4)+1000 = 9991000
        int num1 = index_count+1;
        int num2 = index_count+2;
        // shift num1 left by the number of decimal digits in num2
        int digits = to_string(num2).size();
        int concat_num = num1*(int)(pow(10, digits) + 0.5)+num2;
        placePaste = isEvaluteToZero(N, index_count+2, sum+concat_num) + isEvaluteToZero(N, index_count+2, sum-concat_num);
    }else{
        // in case index_count+2>N
        placePaste = 0;
    }
    return (placeAdd+placeMinus+placePaste);
}
int main(){
    // res starts at 1 because the sequence always begins with the number 1
    int N, res=1, index_count=1;
    cout<<"Enter N:";
    cin>>N;
    cout<<isEvaluteToZero(N, index_count, res)<<endl;
    return 0;
}
output:
N=1 output=0
N=2 output=0
N=3 output=1
N=4 output=1
N=7 output=6
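For cross-checking small cases, here is a brute-force sketch in Python that tries every operator placement, including pastes that involve the leading 1 and chains such as 1#2#3. It assumes that '#' simply joins decimal digits, as in the 1#2 = 12 example; the function name count_zero_sequences is made up for this illustration. It runs in O(3^(N-1)) time, so it is only meant for small N.

```python
from itertools import product

def count_zero_sequences(n):
    """Count the operator placements between 1..n whose expression evaluates to 0."""
    count = 0
    for ops in product("+-#", repeat=n - 1):
        total = 0  # sum of the terms that are already complete
        sign = 1   # sign of the term currently being built
        term = 1   # term currently being built; it starts as the number 1
        for number, op in zip(range(2, n + 1), ops):
            if op == "#":
                # paste: append the decimal digits of `number` to the current term
                term = int(str(term) + str(number))
            else:
                total += sign * term
                sign = 1 if op == "+" else -1
                term = number
        total += sign * term
        if total == 0:
            count += 1
    return count

for n in (1, 2, 3, 4, 7):
    print(n, count_zero_sequences(n))
```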
Let's say I have a list of numbers: 2, 2, 5, 7
Now the result of the algorithm should contain all possible sums.
In this case: 2+2, 2+5, 5+7, 2+2+5, 2+2+5+7, 2+5+7, 5+7
I'd like to achieve this by using Dynamic Programming. I tried using a matrix but so far I have not found a way to get all the possibilities.
Based on the question, I think that the answer posted by AT-2016 is correct, and there is no solution that can exploit the concept of dynamic programming to reduce the complexity.
Here is how you can exploit dynamic programming to solve a similar question that asks to return the sum of all possible subsequence sums.
Consider the array {2, 2, 5, 7}: The different possible subsequences are:
{2},{2},{5},{7},{2,5},{2,5},{5,7},{2,5,7},{2,5,7},{2,2,5,7},{2,2},{2,7},{2,7},{2,2,7},{2,2,5}
So, the question is to find the sum of all these elements from all these subsequences. Dynamic Programming comes to the rescue!!
Arrange the subsequences based on the ending element of each subsequence:
subsequences ending with the first element: {2}
subsequences ending with the second element: {2}, {2,2}
subsequences ending with the third element: {5},{2,5},{2,5},{2,2,5}
subsequences ending with the fourth element: {7},{5,7},{2,7},{2,7},{2,2,7},{2,5,7},{2,5,7},{2,2,5,7}.
Here is the code snippet:
The array 's[]' calculates the sums for 1,2,3,4 individually, that is, s[2] calculates the sum of all subsequences ending with third element. The array 'dp[]' calculates the overall sum till now.
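s[0]=array[0];
dp[0]=s[0];
k = 2; // number of subsequences (including the empty one) of the first i elements
for(int i = 1; i < n; i ++)
{
    // every earlier subsequence (sum dp[i-1] in total), plus the empty one,
    // is extended by array[i]; there are k of them
    s[i] = dp[i-1] + k*array[i];
    dp[i] = dp[i-1] + s[i];
    k = k * 2;
}
return dp[n-1];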
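A short Python sketch of the same idea, with a brute-force check next to it. The names sum_of_subsequence_sums and brute_force are made up for this illustration. The sketch keeps only the running total and the number of subsequences seen so far: the subsequences ending at the current element are all earlier subsequences, plus the empty one, each extended by that element.

```python
from itertools import combinations

def sum_of_subsequence_sums(array):
    """Sum of the sums of all non-empty subsequences, in O(n)."""
    total = 0  # sum over all non-empty subsequences of the prefix seen so far
    count = 1  # number of subsequences of that prefix, including the empty one
    for value in array:
        # sum of all subsequences that end with `value`
        ending = total + count * value
        total += ending
        count *= 2
    return total

def brute_force(array):
    """Enumerate every non-empty subsequence by position and add up the sums."""
    return sum(sum(c)
               for r in range(1, len(array) + 1)
               for c in combinations(array, r))

print(sum_of_subsequence_sums([2, 2, 5, 7]))
print(brute_force([2, 2, 5, 7]))
```

For {2, 2, 5, 7} both functions agree, and the result also matches the closed form 2^(n-1) * (2 + 2 + 5 + 7), since every element appears in exactly half of the subsequences.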
This is a C# approach I used earlier; it enumerates the possible sums with an array of bitmasks:
static void Main(string[] args)
{
    //Set up array of integers
    int[] items = { 2, 2, 5, 7 };
    //Figure out how many bitmasks is needed
    //4 bits have a maximum value of 15, so we need 15 masks.
    //Calculated as: (2 ^ ItemCount) - 1
    int len = items.Length;
    int calcs = (int)Math.Pow(2, len) - 1;
    //Create array of bitmasks. Each item in the array represents a unique combination from our items array
    string[] masks = Enumerable.Range(1, calcs).Select(i => Convert.ToString(i, 2).PadLeft(len, '0')).ToArray();
    //Spit out the corresponding calculation for each bitmask
    foreach (string m in masks)
    {
        //Get the items from array that correspond to the on bits in the mask
        int[] incl = items.Where((c, i) => m[i] == '1').ToArray();
        //Write out the mask, calculation and resulting sum
        Console.WriteLine(
            "[{0}] {1} = {2}",
            m,
            String.Join("+", incl.Select(c => c.ToString()).ToArray()),
            incl.Sum()
        );
    }
    Console.ReadKey();
}
First few lines of output:
[0001] 7 = 7
[0010] 5 = 5
[0011] 5+7 = 12
[0100] 2 = 2
This is not an answer to the question because it does not demonstrate the application of dynamic programming. Rather it notes that this problem involves multisets, for which facilities are available in Sympy.
>>> from sympy.utilities.iterables import multiset_combinations
>>> numbers = [2,2,5,7]
>>> sums = [ ]
>>> for n in range(2,1+len(numbers)):
...     for item in multiset_combinations([2,2,5,7],n):
...         item
...         added = sum(item)
...         if not added in sums:
...             sums.append(added)
...
[2, 2]
[2, 5]
[2, 7]
[5, 7]
[2, 2, 5]
[2, 2, 7]
[2, 5, 7]
[2, 2, 5, 7]
>>> sums.sort()
>>> sums
[4, 7, 9, 11, 12, 14, 16]
I have a solution that prints a list of all possible subset sums.
It's not dynamic programming (DP), but this solution is faster than the DP approach.
void solve(){
    int n;
    cin>>n;
    vector<int> arr(n);
    const int maxPossibleSum=1000000;
    for(int i=0;i<n;i++){
        cin>>arr[i];
    }
    // b[k] is set when some subset of the input sums to k
    bitset<maxPossibleSum> b;
    b[0]=1;
    for(int i=0;i<n;i++){
        // either skip arr[i] (old bits) or add it (shifted bits)
        b|=b<<arr[i];
    }
    for(int i=0;i<maxPossibleSum;i++){
        if(b[i])
            cout<<i<<endl;
    }
}
Input:
First line has the number of elements N in the array.
The next line contains N space-separated array elements.
4
2 2 5 7
----------
Output:
0
2
4
5
7
9
11
12
14
16
The time complexity of this solution is O(N * maxPossibleSum/32)
The space complexity of this solution is O(maxPossibleSum/8)
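The same shift-and-OR trick can be sketched in Python, where an arbitrary-precision int plays the role of the bitset. This is an illustration only: it assumes non-negative integer inputs, and the function name subset_sums is made up here.

```python
def subset_sums(values):
    """Return every achievable subset sum (0 for the empty subset included), sorted."""
    bits = 1  # bit k is set when sum k is reachable; initially only sum 0
    for v in values:
        # keep the old sums (v skipped) and add the shifted ones (v included)
        bits |= bits << v
    return [k for k in range(bits.bit_length()) if (bits >> k) & 1]

print(subset_sums([2, 2, 5, 7]))
```

Because the integer grows as needed, no maxPossibleSum constant is required in this version.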
Given a value N, if we want to make change for N cents, and we have an infinite supply of each of S = { S1, S2, .. , Sm} valued coins, how many ways can we make the change? The order of coins doesn’t matter. There is an additional restriction though: you can only give change with exactly K coins.
For example, for N = 4, K = 2 and S = {1,2,3}, there are two solutions: {2,2},{1,3}. So the output should be 2.
Solution:
#include <stdio.h>
int getways(int coins, int target, int total_coins, int *denomination, int size, int idx)
{
    int sum = 0, i;
    if (coins > target || total_coins < 0)
        return 0;
    if (target == coins && total_coins == 0)
        return 1;
    if (target == coins && total_coins < 0)
        return 0;
    /* start at idx so that each combination is generated in one order only */
    for (i=idx;i<size;i++) {
        sum += getways(coins+denomination[i], target, total_coins-1, denomination, size, i);
    }
    return sum;
}
int main()
{
    int target = 49;
    int total_coins = 15;
    int denomination[] = {1, 2, 3, 4, 5};
    int size = sizeof(denomination)/sizeof(denomination[0]);
    printf("%d\n", getways(0, target, total_coins, denomination, size, 0));
}
Above is the recursive solution. However, I need help with my dynamic programming solution:
Let dp[i][j][k] represent sum up to i with j elements and k coins.
So,
dp[i][j][k] = dp[i][j-1][k] + dp[i-a[j]][j][k-1]
Is my recurrence relation right?
I don't really understand your recurrence relation:
Let dp[i][j][k] represent sum up to i with j elements and k coins.
I think you're on the right track, but I suggest simply dropping the middle dimension [j], and use dp[sum][coinsLeft] as follows:
dp[0][0] = 1 // coins: 0, desired sum: 0 => 1 solution
dp[i][0] = 0 // coins: 0, desired sum: i => 0 solutions
dp[sum][coinsLeft] = dp[sum - S1][coinsLeft-1]
+ dp[sum - S2][coinsLeft-1]
+ ...
+ dp[sum - SM][coinsLeft-1]
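The answer is then to be found at dp[N][K] (= number of ways to add K coins to get N cents). Note that filling this table sum by sum counts {1,3} and {3,1} as different solutions; since the order of coins doesn't matter, apply the coins one denomination at a time (denomination loop outermost) so that each combination is counted once.
Here's some sample code (I advise you not to look until you've tried to solve it yourself. It's a good exercise):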
public static int combinations(int numCoinsToUse, int targetSum, int[] denom) {
    // dp[numCoins][sum] == ways to get sum using exactly numCoins coins
    int[][] dp = new int[numCoinsToUse + 1][targetSum + 1];
    // Any sum (except 0) is impossible with 0 coins
    dp[0][0] = 1;
    // Process one denomination at a time (outermost loop) so that each
    // combination of coins is counted once, regardless of order
    for (int d : denom)
        for (int c = 1; c <= numCoinsToUse; c++)
            for (int sum = d; sum <= targetSum; sum++)
                dp[c][sum] += dp[c - 1][sum - d];
    return dp[numCoinsToUse][targetSum];
}
Using your example input:
combinations(2, 4, new int[] {1, 2, 3} ) // gives 2
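As a hedged illustration, here is a small Python sketch of a table indexed by number of coins and sum. It assumes distinct positive denominations; the names count_change and ways are made up for this example. The denomination loop is outermost so that {1,3} and {3,1} are counted as one solution, as the question requires.

```python
def count_change(target, num_coins, denominations):
    """Number of ways to reach `target` with exactly `num_coins` coins, order ignored."""
    # ways[c][s] = number of coin combinations of exactly c coins that sum to s
    ways = [[0] * (target + 1) for _ in range(num_coins + 1)]
    ways[0][0] = 1  # zero coins make the sum 0 in exactly one way
    for d in denominations:
        # increasing c and s lets denomination d be reused any number of times
        for c in range(1, num_coins + 1):
            for s in range(d, target + 1):
                ways[c][s] += ways[c - 1][s - d]
    return ways[num_coins][target]

print(count_change(4, 2, [1, 2, 3]))
```

With the denomination loop innermost instead, the same table would count ordered sequences, so the example above would yield 3 rather than 2.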
I was asked this in an Amazon telephonic interview, round 1.
For length = 1:
0 1 (0 1)
For length = 2:
00 01 11 10 (0, 1, 3, 2)
and so on.
Write a function that, for length x, returns the numbers in decimal (base 10) form.
That's called Gray code. There are several different kinds, some of which are easier to construct than others. The Wikipedia article shows a very simple way to convert from binary to Gray code:
unsigned int binaryToGray(unsigned int num)
{
    return (num >> 1) ^ num;
}
Using that, you only have to iterate over all numbers of a certain size, put them through that function, and print them however you want.
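A minimal Python sketch of that iteration, using the same shift-and-XOR conversion; the function name gray_codes is made up for this illustration.

```python
def gray_codes(length):
    """Return the reflected Gray code sequence for `length` bits as decimal numbers."""
    return [i ^ (i >> 1) for i in range(1 << length)]

print(gray_codes(1))
print(gray_codes(2))
```

Consecutive values in the returned list differ in exactly one bit, which is the defining property of a Gray code.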
This is one way to do it:
int nval = 1 << n;
for (int i = 0; i < nval; i++)
{
    // i ^ (i >> 1) is the i-th value of the reflected Gray code
    Console.WriteLine(i ^ (i >> 1));
}