Counting substrings that begin with character 'A' and end with character 'X' - string

PYTHON QN:
Using just one loop, how do I devise an algorithm that counts the number of substrings that begin with character A and end with character X? For example, given the input string CAXAAYXZA there are four substrings that begin with A and end with X, namely: AX, AXAAYX, AAYX, and AYX.
For example:
>>>count_substring('CAXAAYXZA')
4

Since you didn't specify a language, I'm doing this in C++-ish code:
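int count_substring(string s)
{
    int inc = 0;              // number of 'A' characters seen so far
    int substring_count = 0;
    for(int i = 0; i < s.length(); i++)
    {
        if(s[i] == 'A') inc++;
        // every earlier 'A' starts a substring that ends at this 'X'
        if(s[i] == 'X') substring_count += inc;
    }
    return substring_count;
}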
and in Python
def count_substring(s):
    inc = 0  # number of 'A' characters seen so far
    substring_count = 0
    for c in s:
        if c == 'A':
            inc += 1
        if c == 'X':
            # every earlier 'A' starts a substring that ends at this 'X'
            substring_count += inc
    return substring_count

First count the number of "A" characters in the string, then count the number of "X" characters, using
Public Function CountCharacter(ByVal value As String, ByVal ch As Char) As Integer
    Dim cnt As Integer = 0
    For Each c As Char In value
        If c = ch Then cnt += 1
    Next
    Return cnt
End Function
Then take each "A" as a start position and each "X" as an end position and get the substring (only pairs where the "X" comes after the "A" form a valid substring). Do this for every "X", then move on to the second "A" and repeat for every "X" again. Continue like this and you will get all the substrings starting with "A" and ending with "X".
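The pairing described above can be sketched in Python as follows. This is only an illustration, and the name count_substring_pairs is made up for it. It uses more than one loop, so it does not meet the "one loop" requirement of the question. It also shows why multiplying the two counts is not enough on its own: an "X" only counts for the "A" characters that come before it.

```python
def count_substring_pairs(s):
    # Hypothetical helper: collect the positions of every 'A' and every 'X'.
    a_positions = [i for i, c in enumerate(s) if c == 'A']
    x_positions = [i for i, c in enumerate(s) if c == 'X']
    # A substring exists for each (A, X) pair where the 'A' comes first.
    return sum(1 for a in a_positions for x in x_positions if a < x)


print(count_substring_pairs('CAXAAYXZA'))  # 4
```

For 'CAXAAYXZA' there are four "A" characters and two "X" characters, so the plain product would be 8, while only 4 pairs have the "A" before the "X".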

Just another solution in Python:
def count_substring(s):
    length = len(s) + 1
    found = []
    # try every start index i and every end index j (exclusive)
    for i in range(0, length):
        for j in range(i + 1, length):
            if s[i] == 'A' and s[j - 1] == 'X':
                found.append(s[i:j])
    return found
string = 'CAXAAYXZA'
print(count_substring(string))
Output:
['AX', 'AXAAYX', 'AAYX', 'AYX']

Related

Given a string s, find the length of the longest substring without repeating characters? (I need to find the bug in code I wrote)

Please help, as this is getting on my nerves. I can't figure out what I'm doing wrong, and I have tried tracing the code.
Link to problem: https://leetcode.com/problems/longest-substring-without-repeating-characters/
I created a solution using a sliding window. It works on most test cases but fails for a few (such as "ad"). I can't figure out where the bug is. I keep a dictionary of the characters I've seen and the last index at which I saw each one, updated on every loop iteration. I use two indices, i and j; i gets updated when I find a repeated character. I return the max of the current max and the length of the current substring, which is j - i. Here is my code:
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        if len(s) < 2:
            return len(s)
        m = 1
        i = 0
        j = 1
        d = {}
        d[s[0]] = 0
        while j < len(s):
            if s[j] in d and d[s[j]] >= i:
                m = max(m, j - i)
                i = j
            d[s[j]] = j
            j += 1
        return max(m, j - i - 1)
Why does this fail for some cases? Example:
"au"
Output
1
Expected
2
The last line should be return max(m, j - i). i is the start index of the current window (it moves whenever a repeated character is seen), and the final window runs from i to the end of the string, so its length is len(s) - i. Since the while loop ends when j = len(s), that length is j - i, not j - i - 1.
i is also updated incorrectly. Take s = "abcadf". When the loop reaches the second "a" (j = 3), i should become 1, not 3, because the longest substring then starts at "b". So i should be updated as i = d[s[j]] + 1. The final result:
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        if len(s) < 2:
            return len(s)
        m = 1
        i = 0
        j = 1
        d = {}
        d[s[0]] = 0
        while j < len(s):
            if s[j] in d and d[s[j]] >= i:
                m = max(m, j - i)
                i = d[s[j]] + 1
            d[s[j]] = j
            j += 1
        return max(m, j - i)

Longest common prefix length of all substrings and a string

I found similar questions on StackOverflow, but my question is different.
Given a string s containing lowercase letters, I want to find the length of the longest common prefix (LCP) of s and each of its substrings s(i, n), that is, each suffix of s.
For example
s = 'ababac'
Then substrings are as follow:
1: s(1, 6) = ababac
2: s(2, 6) = babac
3: s(3, 6) = abac
4: s(4, 6) = bac
5: s(5, 6) = ac
6: s(6, 6) = c
Now, The lengths of LCP of all substrings are as follow
1: len(LCP(s(1, 6), s)) = 6
2: len(LCP(s(2, 6), s)) = 0
3: len(LCP(s(3, 6), s)) = 3
4: len(LCP(s(4, 6), s)) = 0
5: len(LCP(s(5, 6), s)) = 1
6: len(LCP(s(6, 6), s)) = 0
I am using character-by-character matching:
int commonPrefix(const string& s1, const string& s2) {
    // Length of the longest common prefix, compared character by character
    int minlen = min(s1.length(), s2.length());
    int result = 0;
    for (int i = 0; i < minlen; i++) {
        if (s1[i] != s2[i])
            return result;
        result++;
    }
    return result;
}
But it's still O(n^2). I know all the substrings overlap one another, so it can be optimized further. Can anyone help to optimize this code?
As mentioned by Aditya, this can be solved using the Z-algorithm. A detailed explanation with an implementation is here: https://www.hackerearth.com/practice/algorithms/string-algorithm/z-algorithm/tutorial/
This is similar to the Z-algorithm for pattern matching, except for the first case, where len(LCP(s(1, 6), s)) = len(s).
We need to create a Z array.
For a string str[0..n-1], the Z array has the same length as the string. An element Z[i] stores the length of the longest substring starting at str[i] that is also a prefix of str[0..n-1]. The first entry of the Z array is meaningless, as the complete string is always a prefix of itself.
Visualize the algorithm here:
https://personal.utdallas.edu/~besp/demo/John2010/z-algorithm.htm
Below is a solution:
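public static int[] computeZ(String s) {
    int len = s.length();
    int[] Z = new int[len];
    // [l, r) is the rightmost segment found so far that matches a prefix of s
    int l = 0, r = 0;
    for (int k = 1; k < len; k++) {
        int j;
        if (k < r) {
            // reuse the value already computed inside the current segment
            j = Math.min(Z[k - l], r - k);
        } else {
            j = 0;
        }
        // extend the match character by character
        while (k + j < len && s.charAt(k + j) == s.charAt(j)) {
            j++;
        }
        Z[k] = j;
        if (k + j > r) {
            l = k;
            r = k + j;
        }
    }
    if (len > 0) {
        Z[0] = len; // the whole string is a prefix of itself
    }
    return Z;
}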
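For comparison, here is a minimal Python sketch of the same Z-array idea, checked against the 'ababac' example from the question. The function name z_array is an assumption made for this illustration, and as in the answer above the first entry is set to the full length of the string.

```python
def z_array(s):
    """Return z where z[k] is the length of the longest common prefix
    of s and the suffix s[k:] (illustrative sketch)."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n  # the whole string is a prefix of itself
    l = r = 0  # [l, r) is the rightmost segment matching a prefix of s
    for k in range(1, n):
        # reuse what is already known inside the current segment
        j = min(z[k - l], r - k) if k < r else 0
        # extend the match character by character
        while k + j < n and s[k + j] == s[j]:
            j += 1
        z[k] = j
        if k + j > r:
            l, r = k, k + j
    return z


print(z_array("ababac"))  # [6, 0, 3, 0, 1, 0]
```

The printed values match the six LCP lengths listed in the question, and each character is compared only a bounded number of times, so the whole array is built in O(n).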

Modified longest common substring

Given two strings what is an efficient algorithm to find the number and length of longest common sub-strings with the sub-strings being called common if :
1) they have at-least x% characters same and at same position.
2) the start and end indexes of the sub-strings being same.
Ex :
String 1 -> abedefkhj
String 2 -> kbfdfjhlo
Suppose the x% being asked for is 40; then the answer is
5 1
where 5 is the longest length and 1 is the number of sub-strings in each string satisfying the given property. Sub-String is "abede" in string 1 and "kbfdf" in string 2.
You can use something like the Levenshtein distance, without deletions and insertions.
Build a table where every element [i, j] is the error (number of mismatches) for the substring from position [i] to position [j].
foo(string a, string b, int x):
    # assumption: x is the allowed mismatch percentage,
    # i.e. 100 minus the required match percentage
    len = min(a.length, b.length)
    # error[start][end] = number of mismatched positions in [start, end]
    for (end: [0 -> len-1]):
        for (start: [end -> 0]):
            mismatch = 0 if a[end] == b[end] else 1
            if start == end:
                error[start][end] = mismatch
            else:
                error[start][end] = error[start][end - 1] + mismatch
    best_len = 0
    best_pos = 0
    for (end: [0 -> len-1]):
        for (start: [end -> 0]):
            cur_len = end - start + 1
            error_percent = 100 * error[start][end] / cur_len
            if (error_percent <= x and cur_len > best_len):
                best_len = cur_len
                best_pos = start
    return (best_len, best_pos)

Finding minimum moves required for making 2 strings equal

This is a question from an online coding challenge (which has now finished).
I just need some pointers on how to approach it.
Problem Statement:
We have two strings A and B with the same super set of characters. We need to change these strings to obtain two equal strings. In each move we can perform one of the following operations:
1. swap two consecutive characters of a string
2. swap the first and the last characters of a string
A move can be performed on either string.
What is the minimum number of moves that we need in order to obtain two equal strings?
Input Format and Constraints:
The first and the second line of the input contain the two strings A and B. It is guaranteed that the supersets of their characters are equal.
1 <= length(A) = length(B) <= 2000
All the input characters are between 'a' and 'z'
Output Format:
Print the minimum number of moves to the only line of the output
Sample input:
aab
baa
Sample output:
1
Explanation:
Swap the first and last character of the string aab to convert it to baa. The two strings are now equal.
EDIT: Here is my first try, but I'm getting the wrong output. Can someone tell me what is wrong with my approach?
int minStringMoves(char* a, char* b) {
    int length, pos, i, j, moves=0;
    char *ptr;
    length = strlen(a);
    for(i=0;i<length;i++) {
        // Find the first occurrence of b[i] in a
        ptr = strchr(a,b[i]);
        pos = ptr - a;
        // If it's the last element, swap with the first
        if(i==0 && pos == length-1) {
            swap(&a[0], &a[length-1]);
            moves++;
        }
        // Else swap from current index till pos
        else {
            for(j=pos;j>i;j--) {
                swap(&a[j],&a[j-1]);
                moves++;
            }
        }
        // If equal, break
        if(strcmp(a,b) == 0)
            break;
    }
    return moves;
}
Take a look at this example:
aaaaaaaaab
abaaaaaaaa
Your solution: 8
aaaaaaaaab -> aaaaaaaaba -> aaaaaaabaa -> aaaaaabaaa -> aaaaabaaaa ->
aaaabaaaaa -> aaabaaaaaa -> aabaaaaaaa -> abaaaaaaaa
Proper solution: 2
aaaaaaaaab -> baaaaaaaaa -> abaaaaaaaa
You should check if swapping in the other direction would give you better result.
But sometimes you will also ruin the previous part of the string. eg:
caaaaaaaab
cbaaaaaaaa
caaaaaaaab -> baaaaaaaac -> abaaaaaaac
You need another swap here to put back the 'c' to the first place.
The proper algorithm is probably even more complex, but you can see now what's wrong in your solution.
The A* algorithm might work for this problem.
The initial node will be the original string.
The goal node will be the target string.
Each child of a node will be all possible transformations of that string.
The current cost g(x) is simply the number of transformations thus far.
The heuristic h(x) is half the number of characters in the wrong position.
Since h(x) is admissible (because a single transformation can't put more than 2 characters in their correct positions), the path to the target string will give the least number of transformations possible.
However, an elementary implementation will likely be too slow. Calculating all possible transformations of a string would be rather expensive.
Note that there's a lot of similarity between a node's siblings (its parent's children) and its children. So you may be able to just calculate all transformations of the original string and, from there, simply copy and recalculate data involving changed characters.
You can use dynamic programming. Go over all swap possibilities while storing all the intermediate results along with the minimal number of steps it took to get there. In effect, you calculate the minimum number of steps for every possible target string that can be obtained by applying the given rules any number of times. Once that is calculated, you can print the minimum number of steps needed to reach the target string. Here's sample code in JavaScript, and its usage for the "aab" and "baa" example:
function swap(str, i, j) {
    var s = str.split("");
    s[i] = str[j];
    s[j] = str[i];
    return s.join("");
}
function calcMinimumSteps(current, stepsCount)
{
    if (typeof(memory[current]) !== "undefined") {
        if (memory[current] > stepsCount) {
            memory[current] = stepsCount;
        } else if (memory[current] < stepsCount) {
            stepsCount = memory[current];
        }
    } else {
        memory[current] = stepsCount;
        calcMinimumSteps(swap(current, 0, current.length-1), stepsCount+1);
        for (var i = 0; i < current.length - 1; ++i) {
            calcMinimumSteps(swap(current, i, i + 1), stepsCount+1);
        }
    }
}
var memory = {};
calcMinimumSteps("aab", 0);
alert("Minimum steps count: " + memory["baa"]);
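The same "explore every reachable string and remember the fewest steps" idea can also be sketched as a breadth-first search in Python. This is an illustration under two assumptions, not a solution for the full constraints: the name min_moves is made up here, and the number of reachable strings grows very quickly, so it is only practical for short inputs (nowhere near length 2000). It relies on every move being its own inverse, so a move applied to B can be replayed on A instead, and the answer is the shortest path from A to B.

```python
from collections import deque


def min_moves(a, b):
    """Fewest swaps (adjacent, or first with last) turning a into b.
    Returns -1 if b is not a rearrangement of a. Illustrative sketch."""
    if sorted(a) != sorted(b):
        return -1
    if a == b:
        return 0
    n = len(a)
    # allowed swaps: every adjacent pair, plus the first/last pair
    pairs = [(i, i + 1) for i in range(n - 1)]
    if n > 2:
        pairs.append((0, n - 1))
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        cur, dist = queue.popleft()
        chars = list(cur)
        for i, j in pairs:
            chars[i], chars[j] = chars[j], chars[i]
            nxt = "".join(chars)
            chars[i], chars[j] = chars[j], chars[i]  # undo the swap
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return -1


print(min_moves("aab", "baa"))                # 1
print(min_moves("aaaaaaaaab", "abaaaaaaaa"))  # 2
```

Because breadth-first search visits strings in order of increasing distance, the first time the target is produced is guaranteed to use the fewest moves, which the recursive version above does not guarantee on its own.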
Here is the Ruby logic for this problem. Copy this code into an .rb file and run it.
str1 = "education" #Sample first string
str2 = "cnatdeiou" #Sample second string
moves_count = 0
no_swap = 0
count = str1.length - 1
def ends_swap(str1,str2)
  str2 = swap_strings(str2,str2.length-1,0)
  return str2
end
def swap_strings(str2,cp,np)
  current_string = str2[cp]
  new_string = str2[np]
  str2[cp] = new_string
  str2[np] = current_string
  return str2
end
def consecutive_swap(str,current_position, target_position)
  counter=0
  diff = current_position > target_position ? -1 : 1
  while current_position!=target_position
    new_position = current_position + diff
    str = swap_strings(str,current_position,new_position)
    # p "-------"
    # p "CP: #{current_position} NP: #{new_position} TP: #{target_position} String: #{str}"
    current_position+=diff
    counter+=1
  end
  return counter,str
end
while(str1 != str2 && count!=0)
  counter = 1
  if str1[-1]==str2[0]
    # p "cross match"
    str2 = ends_swap(str1,str2)
  else
    # p "No match for #{str2}-- Count: #{count}, TC: #{str1[count]}, CP: #{str2.index(str1[count])}"
    str = str2[0..count]
    cp = str.rindex(str1[count])
    tp = count
    counter, str2 = consecutive_swap(str2,cp,tp)
    count-=1
  end
  moves_count+=counter
  # p "Step: #{moves_count}"
  # p str2
end
p "Total moves: #{moves_count}"
Please feel free to suggest any improvements in this code.
Try this code. Hope this will help you.
import java.util.Scanner;

public class TwoStringIdentical {
    // Length of the longest common subsequence of str1 and str2
    static int lcs(String str1, String str2, int m, int n) {
        int L[][] = new int[m + 1][n + 1];
        int i, j;
        for (i = 0; i <= m; i++) {
            for (j = 0; j <= n; j++) {
                if (i == 0 || j == 0)
                    L[i][j] = 0;
                else if (str1.charAt(i - 1) == str2.charAt(j - 1))
                    L[i][j] = L[i - 1][j - 1] + 1;
                else
                    L[i][j] = Math.max(L[i - 1][j], L[i][j - 1]);
            }
        }
        return L[m][n];
    }
    static void printMinTransformation(String str1, String str2) {
        int m = str1.length();
        int n = str2.length();
        int len = lcs(str1, str2, m, n);
        System.out.println((m - len)+(n - len));
    }
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        String str1 = scan.nextLine();
        String str2 = scan.nextLine();
        // use the two strings read from the input
        printMinTransformation(str1, str2);
    }
}

Check if a string is rotation of another WITHOUT concatenating

There are 2 strings. How can we check if one is a rotated version of the other?
For Example : hello --- lohel
One simple solution is by concatenating first string with itself and checking if the other one is a substring of the concatenated version.
Is there any other solution to it ?
I was wondering if we could use circular linked list maybe ? But I am not able to arrive at the solution.
One simple solution is by concatenating them and checking if the other one is a substring of the concatenated version.
I assume you mean concatenate the first string with itself, then check if the other one is a substring of that concatenation.
That will work, and in fact can be done without any concatenation at all. Just use any string searching algorithm to search for the second string in the first, and when you reach the end, loop back to the beginning.
For instance, using Boyer-Moore the overall algorithm would be O(n).
There's no need to concatenate at all.
First, check the lengths. If they're different then return false.
Second, use an index that increments from the first character to the last of the source. Check if the destination starts with all the letters from the index to the end, and ends with all the letters before the index. If at any time this is true, return true.
Otherwise, return false.
EDIT:
An implementation in Python:
def isrot(src, dest):
    # Make sure they have the same size
    if len(src) != len(dest):
        return False
    # Rotate through the letters in src
    for ix in range(len(src)):
        # Compare the end of src with the beginning of dest
        # and the beginning of src with the end of dest
        if dest.startswith(src[ix:]) and dest.endswith(src[:ix]):
            return True
    return False
print(isrot('hello', 'lohel'))
print(isrot('hello', 'lohell'))
print(isrot('hello', 'hello'))
print(isrot('hello', 'lohe'))
You could compute the lexicographically minimal string rotation of each string and then test if they were equal.
Computing the minimal rotation is O(n).
This would be good if you had lots of strings to test as the minimal rotation could be applied as a preprocessing step and then you could use a standard hash table to store the rotated strings.
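As a rough illustration of the canonical-form idea, here is a Python sketch. It uses a well-known two-pointer method for finding the least rotation in O(n), which is not necessarily the algorithm the answer had in mind, and the names least_rotation and is_rotation_canonical are made up for this example.

```python
def least_rotation(s):
    """Return the lexicographically smallest rotation of s (sketch)."""
    n = len(s)
    i, j, k = 0, 1, 0  # two candidate start positions and the matched length
    while i < n and j < n and k < n:
        a = s[(i + k) % n]
        b = s[(j + k) % n]
        if a == b:
            k += 1
            continue
        # the candidate with the larger character cannot be the minimum,
        # and neither can any start inside the part already compared
        if a > b:
            i += k + 1
        else:
            j += k + 1
        if i == j:
            j += 1
        k = 0
    start = min(i, j)
    return s[start:] + s[:start]


def is_rotation_canonical(a, b):
    # Two strings are rotations of each other exactly when they have the
    # same length and the same least rotation.
    return len(a) == len(b) and least_rotation(a) == least_rotation(b)


print(least_rotation("lohel"))                  # elloh
print(is_rotation_canonical("hello", "lohel"))  # True
```

With many strings to compare, least_rotation can be applied once per string and the results used as dictionary or set keys, as suggested above.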
Trivial O(min(n,m)^2) algorithm: (n - length of S1, m - length of S2)
isRotated(S1 , S2):
    if (S1.length != S2.length)
        return false
    for i : 0 to n-1
        res = true
        index = i
        for j : 0 to n-1
            if S1[j] != S2[index]
                res = false
                break
            index = (index+1)%n
        if res == true
            return true
    return false
EDIT:
Explanation -
Two strings S1 and S2 of lengths m and n respectively are cyclically identical if and only if m == n and there exists an index 0 <= j <= n-1 such that S1 = S2[j]S2[j+1]...S2[n-1]S2[0]...S2[j-1].
So in the above algorithm we check that the lengths are equal and that such an index exists.
A very straightforward solution is to rotate one of the words n times, where n is the length of the word. For each of those rotations, check to see if the result is the same as the other word.
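A short Python sketch of this rotate-and-compare approach, using collections.deque as a stand-in for the circular list mentioned in the question. The function name is hypothetical, and the approach is O(n^2) in the worst case because each of the n rotations may need a full comparison.

```python
from collections import deque


def is_rotation_by_rotating(a, b):
    """Rotate a one step at a time and compare with b (illustrative sketch)."""
    if len(a) != len(b):
        return False
    ring = deque(a)
    target = deque(b)
    # at most len(a) distinct rotations; check once even for empty strings
    for _ in range(max(len(a), 1)):
        if ring == target:
            return True
        ring.rotate(-1)  # move the first character to the end
    return False


print(is_rotation_by_rotating("hello", "lohel"))   # True
print(is_rotation_by_rotating("hello", "lohell"))  # False
```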
You can do it in O(n) time and O(1) space:
def is_rot(u, v):
    n, i, j = len(u), 0, 0
    if n != len(v):
        return False
    while i < n and j < n:
        k = 1
        while k <= n and u[(i + k) % n] == v[(j + k) % n]:
            k += 1
        if k > n:
            return True
        if u[(i + k) % n] > v[(j + k) % n]:
            i += k
        else:
            j += k
    return False
See my answer here for more details.
Simple solution in Java. No need of iteration or concatenation.
private static boolean isSubString(String first, String second){
    int firstIndex = second.indexOf(first.charAt(0));
    if(first.length() == second.length() && firstIndex > -1){
        if(first.equalsIgnoreCase(second))
            return true;
        int finalPos = second.length() - firstIndex ;
        return second.charAt(0) == first.charAt(finalPos)
            && first.substring(finalPos).equals(second.subSequence(0, firstIndex));
    }
    return false;
}
Test case:
String first = "bottle";
String second = "tlebot";
Logic:
Take the first character of the first string and find its index in the second string. Subtract that index from the length of the second string to get a split position in the first string. Then check that the first character of the second string equals the character of the first string at that position, and that the substring of the first string from that position onward equals the part of the second string before the found index.
Another Python implementation (without concatenation). It is not especially efficient, but it is O(n). Comments are welcome.
Assume that there are two strings s1 and s2.
Obviously, if s1 and s2 are rotations of each other, s2 splits into two substrings that both occur in s1, and their lengths add up to the length of the string.
The question is how to find that partition. To do so, I increment an index into s2 whenever a character of s2 matches a character of s1.
def is_rotation(s1, s2):
    if len(s1) != len(s2):
        return False
    n = len(s1)
    if n == 0: return True
    j = 0
    for i in range(n):
        if s2[j] == s1[i]:
            j += 1
    return (j > 0 and s1[:n - j] == s2[j:] and s1[n - j:] == s2[:j])
The second and condition is just to ensure that the counter incremented for s2 are a sub string match.
input1 = "hello", input2 = "llohe", input3 = "lohel" (input3 is a special case).
If the lengths of input1 and input2 are not the same, return false. Let i and j be two indexes pointing into input1 and input2 respectively, and initialize count to input1.length. Keep a flag called isRotated, initially set to false.
while (count != 0):
    when the characters of input1 and input2 match:
        increment i and j
        decrement count
    if the characters do not match:
        if isRotated is true (there is a mismatch even after rotation), break
        else reset j to 0, as there's a mismatch
Please find the code below and let me know if it fails for some other combination I may not have considered.
public boolean isRotation(String input1, String input2) {
    boolean isRotated = false;
    int i = 0, j = 0, count = input1.length();
    if (input1.length() != input2.length())
        return false;
    while (count != 0) {
        if (i == input1.length() && !isRotated) {
            isRotated = true;
            i = 0;
        }
        if (input1.charAt(i) == input2.charAt(j)) {
            i++;
            j++;
            count--;
        }
        else {
            if (isRotated) {
                break;
            }
            if (i == input1.length() - 1 && !isRotated) {
                isRotated = true;
            }
            if (i < input1.length()) {
                j = 0;
                count = input1.length();
            }
            /* To handle the duplicates. This is the special case.
             * This occurs when input1 contains two duplicate elements placed side-by-side as "ll" in "hello" while
             * they may not be side-by-side in input2 such as "lohel" but are still valid rotations.
             * Eg: "hello" "lohel"
             */
            if (input1.charAt(i) == input2.charAt(j)) {
                i--;
            }
            i++;
        }
    }
    if (count == 0)
        return true;
    return false;
}
public static void main(String[] args) {
    // TODO Auto-generated method stub
    System.out.println(new StringRotation().isRotation("harry potter",
            "terharry pot"));
    System.out.println(new StringRotation().isRotation("hello", "llohe"));
    System.out.println(new StringRotation().isRotation("hello", "lohell"));
    System.out.println(new StringRotation().isRotation("hello", "hello"));
    System.out.println(new StringRotation().isRotation("hello", "lohe"));
}
Solving the problem in O(n)
void isSubstring(string& s1, string& s2)
{
    if(s1.length() != s2.length())
        cout<<"Not rotation string"<<endl;
    else
    {
        int firstI=0;
        int len = s1.length();
        // Find the first offset in s1 where the first two characters of s2 appear.
        // Note: only this first candidate offset is tried.
        while( firstI < len )
        {
            if(s1[firstI%len] == s2[0] && s1[(firstI+1) %len] == s2[1])
                break;
            firstI++;
        }
        int len2 = s2.length();
        int i=0;
        bool isSubString = true;
        // Compare s2 against s1 read cyclically from that offset
        while(i < len2)
        {
            if(s1[(firstI+i)%len] != s2[i])
            {
                isSubString = false;
                break;
            }
            i++;
        }
        if(isSubString)
            cout<<"Is Rotation String"<<endl;
        else
            cout<<"Is not a rotation string"<<endl;
    }
}
String source = "avaraavar";
String dest = "ravaraava";
System.out.println();
if(source.length()!=dest.length())
    try {
        throw (new IOException());
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
int i = 0;
int j = 0;
int totalcount=0;
while(true)
{
    i=i%source.length();
    if(source.charAt(i)==dest.charAt(j))
    {
        System.out.println("i="+i+" , j = "+j);
        System.out.println(source.charAt(i)+"=="+dest.charAt(j));
        i++;
        j++;
        totalcount++;
    }
    else
    {
        System.out.println("i="+i+" , j = "+j);
        System.out.println(source.charAt(i)+"!="+dest.charAt(j));
        i++;
        totalcount++;
        j=0;
    }
    if(j==source.length())
    {
        System.out.println("Yes, it's a rotation");
        break;
    }
    if(totalcount >(2*source.length())-1)
    {
        System.out.println("No, it's not a rotation");
        break;
    }
}

Resources