A friend of mine just had his interview at Google and got rejected because he couldn't give a solution to this question.
I have my own interview in a couple of days and can't seem to figure out a way to solve it.
Here's the question:
You are given a pattern, such as [a b a b]. You are also given a
string, example "redblueredblue". I need to write a program that tells
whether the string follows the given pattern or not.
A few examples:
Pattern: [a b b a] String: catdogdogcat returns 1
Pattern: [a b a b] String: redblueredblue returns 1
Pattern: [a b b a] String: redblueredblue returns 0
I thought of a few approaches, like getting the number of unique characters in the pattern and then finding that many unique substrings of the string then comparing with the pattern using a hashmap. However, that turns out to be a problem if the substring of a is a part of b.
It'd be really great if any of you could help me out with it. :)
UPDATE:
Added Info: There can be any number of characters in the pattern (a-z). Two characters won't represent the same substring. Also, a character can't represent an empty string.
The simplest solution that I can think of is to divide the given string into four parts and compare the individual parts. You don't know how long a or b is, but both as are of the same length as well as bs are. So the number of ways how to divide the given string is not very large.
Example:
pattern = [a b a b], given string = redblueredblue (14 characters in total)
|a| (length of a) = 1, then that makes 2 characters for as and 12 characters is left for bs, i.e. |b| = 6. Divided string = r edblue r edblue. Whoa, this matches right away!
(just out of curiosity) |a| = 2, |b| = 5 -> divided string = re dblue re dblue -> match
Example 2:
pattern = [a b a b], string = redbluebluered (14 characters in total)
|a| = 1, |b| = 6 -> divided string = r edblue b luered -> no match
|a| = 2, |b| = 5 -> divided string = re dblue bl uered -> no match
|a| = 3, |b| = 4 -> divided string = red blue blu ered -> no match
The rest is not needed to be checked because if you switched a for b and vice versa, the situation is identical.
What is the pattern that has [a b c a b c] ?
Don't you just need to translate the pattern to a regexp using backreferences, i.e. something like this (Python 3 with the "re" module loaded):
>>> print(re.match('(.+)(.+)\\2\\1', 'catdogdogcat'))
<_sre.SRE_Match object; span=(0, 12), match='catdogdogcat'>
>>> print(re.match('(.+)(.+)\\1\\2', 'redblueredblue'))
<_sre.SRE_Match object; span=(0, 14), match='redblueredblue'>
>>> print(re.match('(.+)(.+)\\2\\1', 'redblueredblue'))
None
The regexp looks pretty trivial to generate. If you need to support more than 9 backrefs, you can use named groups - see the Python regexp docs.
Here is java backtracking solution. Source link.
public class Solution {
public boolean isMatch(String str, String pat) {
Map<Character, String> map = new HashMap<>();
return isMatch(str, 0, pat, 0, map);
}
boolean isMatch(String str, int i, String pat, int j, Map<Character, String> map) {
// base case
if (i == str.length() && j == pat.length()) return true;
if (i == str.length() || j == pat.length()) return false;
// get current pattern character
char c = pat.charAt(j);
// if the pattern character exists
if (map.containsKey(c)) {
String s = map.get(c);
// then check if we can use it to match str[i...i+s.length()]
if (i + s.length() > str.length() || !str.substring(i, i + s.length()).equals(s)) {
return false;
}
// if it can match, great, continue to match the rest
return isMatch(str, i + s.length(), pat, j + 1, map);
}
// pattern character does not exist in the map
for (int k = i; k < str.length(); k++) {
// create or update the map
map.put(c, str.substring(i, k + 1));
// continue to match the rest
if (isMatch(str, k + 1, pat, j + 1, map)) {
return true;
}
}
// we've tried our best but still no luck
map.remove(c);
return false;
}
}
One more brute force recursion solution:
import java.io.IOException;
import java.util.*;
public class Test {
public static void main(String[] args) throws IOException {
int res;
res = wordpattern("abba", "redbluebluered");
System.out.println("RESULT: " + res);
}
static int wordpattern(String pattern, String input) {
int patternSize = 1;
boolean res = findPattern(pattern, input, new HashMap<Character, String>(), patternSize);
while (!res && patternSize < input.length())
{
patternSize++;
res = findPattern(pattern, input, new HashMap<Character, String>(), patternSize);
}
return res ? 1 : 0;
}
private static boolean findPattern(String pattern, String input, Map<Character, String> charToValue, int patternSize) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < pattern.length(); i++) {
char c = pattern.charAt(i);
if (charToValue.containsKey(c)) {
sb.append(charToValue.get(c));
} else {
// new character in pattern
if (sb.length() + patternSize > input.length()) {
return false;
} else {
String substring = input.substring(sb.length(), sb.length() + patternSize);
charToValue.put(c, substring);
int newPatternSize = 1;
boolean res = findPattern(pattern, input, new HashMap<>(charToValue), newPatternSize);
while (!res && newPatternSize + sb.length() + substring.length() < input.length() - 1) {
newPatternSize++;
res = findPattern(pattern, input, new HashMap<>(charToValue), newPatternSize);
}
return res;
}
}
}
return sb.toString().equals(input) && allValuesUniq(charToValue.values());
}
private static boolean allValuesUniq(Collection<String> values) {
Set<String> set = new HashSet<>();
for (String v : values) {
if (!set.add(v)) {
return false;
}
}
return true;
}
}
My Implementation on C#. Tried to look for something clean in C#, couldn't find. So I'll add it to here.
private static bool CheckIfStringFollowOrder(string text, string subString)
{
int subStringLength = subString.Length;
if (text.Length < subStringLength) return false;
char x, y;
int indexX, indexY;
for (int i=0; i < subStringLength -1; i++)
{
indexX = -1;
indexY = -1;
x = subString[i];
y = subString[i + 1];
indexX = text.LastIndexOf(x);
indexY = text.IndexOf(y);
if (y < x || indexX == -1 || indexY == -1)
return false;
}
return true;
}
I solved this as a language production problem using regexen.
def wordpattern( pattern, string):
'''
input: pattern 'abba'
string 'redbluebluered'
output: 1 for match, 2 for no match
'''
# assemble regex into something like this for 'abba':
# '^(?P<A>.+)(?P<B>.+)(?P=B)(?P=A)$'
p = pattern
for c in pattern:
C = c.upper()
p = p.replace(c,"(?P<{0}>.+)".format(C),1)
p = p.replace(c,"(?P={0})".format(C),len(pattern))
p = '^' + p + '$'
# check for a preliminary match
if re.search(p,string):
rem = re.match(p,string)
seen = {}
# check to ensure that no points in the pattern share the same match
for c in pattern:
s = rem.group(c.upper())
# has match been seen? yes, fail, no continue
if s in seen and seen[s] != c:
return 0
seen[s] = c
# success
return 1
# did not hit the search, fail
return 0
#EricM
I tested your DFS solution and it seems wrong, like case:
pattern = ["a", "b", "a"], s = "patrpatrr"
The problem is that when you meet a pattern that already exists in dict and find it cannot fit the following string, you delete and try to assign it a new value. However, you haven't check this pattern with the new value for the previous times it occurs.
My idea is about providing addition dict (or merge in this dict) new value to keep track of the first time it appears and another stack to keep track of the unique pattern I meet. when "not match" occurs, I will know there is some problem with the last pattern and I pop it from the stack and modify the corresponding value in the dict, also I will start to check again at that corresponding index. If cannot be modified any more. I will pop until there is none left in the stack and then return False.
(I want to add comments but don't have enough reputation as a new user.. I haven't implement it but till now I haven't find any error in my logic. I am sorry if there is something wrong with my solution== I will try to implement it later.)
I can't think of much better than the brute force solution: try every possible partitioning of the word (this is essentially what Jan described).
The run-time complexity is O(n^(2m)) where m is the length of the pattern and n is the length of the string.
Here's what the code for that looks like (I made my code return the actual mapping instead of just 0 or 1. Modifying the code to return 0 or 1 is easy):
import java.util.Arrays;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class StringBijection {
public static void main(String[] args) {
String chars = "abaac";
String string = "johnjohnnyjohnjohncodes";
List<String> stringBijection = getStringBijection(chars, string);
System.out.println(Arrays.toString(stringBijection.toArray()));
}
public static List<String> getStringBijection(String chars, String string) {
if (chars == null || string == null) {
return null;
}
Map<Character, String> bijection = new HashMap<Character, String>();
Deque<String> assignments = new ArrayDeque<String>();
List<String> results = new ArrayList<String>();
boolean hasBijection = getStringBijection(chars, string, 0, 0, bijection, assignments);
if (!hasBijection) {
return null;
}
for (String result : assignments) {
results.add(result);
}
return results;
}
private static boolean getStringBijection(String chars, String string, int charIndex, int stringIndex, Map<Character, String> bijection, Deque<String> assignments) {
int charsLen = chars.length();
int stringLen = string.length();
if (charIndex == charsLen && stringIndex == stringLen) {
return true;
} else if (charIndex == charsLen || stringIndex == stringLen) {
return false;
}
char currentChar = chars.charAt(charIndex);
List<String> possibleWords = new ArrayList<String>();
boolean charAlreadyAssigned = bijection.containsKey(currentChar);
if (charAlreadyAssigned) {
String word = bijection.get(currentChar);
possibleWords.add(word);
} else {
StringBuilder word = new StringBuilder();
for (int i = stringIndex; i < stringLen; ++i) {
word.append(string.charAt(i));
possibleWords.add(word.toString());
}
}
for (String word : possibleWords) {
int wordLen = word.length();
int endIndex = stringIndex + wordLen;
if (endIndex <= stringLen && string.substring(stringIndex, endIndex).equals(word)) {
if (!charAlreadyAssigned) {
bijection.put(currentChar, word);
}
assignments.addLast(word);
boolean done = getStringBijection(chars, string, charIndex + 1, stringIndex + wordLen, bijection, assignments);
if (done) {
return true;
}
assignments.removeLast();
if (!charAlreadyAssigned) {
bijection.remove(currentChar);
}
}
}
return false;
}
}
If you are looking for a solution in C++, here is a brute force solution:
https://linzhongzl.wordpress.com/2014/11/04/repeating-pattern-match/
Plain Brute Force, not sure if any optimization is possible here ..
import java.util.HashMap;
import java.util.Map;
import org.junit.*;
public class Pattern {
private Map<Character, String> map;
private boolean matchInt(String pattern, String str) {
if (pattern.length() == 0) {
return str.length() == 0;
}
char pch = pattern.charAt(0);
for (int i = 0; i < str.length(); ++i) {
if (!map.containsKey(pch)) {
String val = str.substring(0, i + 1);
map.put(pch, val);
if (matchInt(pattern.substring(1), str.substring(val.length()))) {
return true;
} else {
map.remove(pch);
}
} else {
String val = map.get(pch);
if (!str.startsWith(val)) {
return false;
}
return matchInt(pattern.substring(1), str.substring(val.length()));
}
}
return false;
}
public boolean match(String pattern, String str) {
map = new HashMap<Character, String>();
return matchInt(pattern, str);
}
#Test
public void test1() {
Assert.assertTrue(match("aabb", "ABABCDCD"));
Assert.assertTrue(match("abba", "redbluebluered"));
Assert.assertTrue(match("abba", "asdasdasdasd"));
Assert.assertFalse(match("aabb", "xyzabcxzyabc"));
Assert.assertTrue(match("abba", "catdogdogcat"));
Assert.assertTrue(match("abab", "ryry"));
Assert.assertFalse(match("abba", " redblueredblue"));
}
}
class StringPattern{
public:
int n, pn;
string str;
unordered_map<string, pair<string, int>> um;
vector<string> p;
bool match(string pat, string str_) {
p.clear();
istringstream istr(pat);
string x;
while(istr>>x) p.push_back(x);
pn=p.size();
str=str_;
n=str.size();
um.clear();
return dfs(0, 0);
}
bool dfs(int i, int c) {
if(i>=n) {
if(c>=pn){
return 1;
}
}
if(c>=pn) return 0;
for(int len=1; i+len-1<n; len++) {
string sub=str.substr(i, len);
if(um.count(p[c]) && um[p[c]].fi!=sub
|| um.count(sub) && um[sub].fi!=p[c]
)
continue;
//cout<<"str:"<<endl;
//cout<<p[c]<<" "<<sub<<endl;
um[p[c]].fi=sub;
um[p[c]].se++;
um[sub].fi=p[c];
um[sub].se++;
//um[sub]=p[c];
if(dfs(i+len, c+1)) return 1;
um[p[c]].se--;
if(!um[p[c]].se) um.erase(p[c]);
um[sub].se--;
if(!um[sub].se) um.erase(sub);
//um.erase(sub);
}
return 0;
}
};
My solution, as two side hashmap is needed, and also need to count the hash map counts
My java script solution:
function isMatch(pattern, str){
var map = {}; //store the pairs of pattern and strings
function checkMatch(pattern, str) {
if (pattern.length == 0 && str.length == 0){
return true;
}
//if the pattern or the string is empty
if (pattern.length == 0 || str.length == 0){
return false;
}
//store the next pattern
var currentPattern = pattern.charAt(0);
if (currentPattern in map){
//the pattern has alredy seen, check if there is a match with the string
if (str.length >= map[currentPattern].length && str.startsWith(map[currentPattern])){
//there is a match, try all other posibilities
return checkMatch(pattern.substring(1), str.substring(map[currentPattern].length));
} else {
//no match, return false
return false;
}
}
//the current pattern is new, try all the posibilities of current string
for (var i=1; i <= str.length; i++){
var stringToCheck = str.substring(0, i);
//store in the map
map[currentPattern] = stringToCheck;
//try the rest
var match = checkMatch(pattern.substring(1), str.substring(i));
if (match){
//there is a match
return true;
} else {
//if there is no match, delete the pair from the map
delete map[currentPattern];
}
}
return false;
}
return checkMatch(pattern, str);
}
A solution in Java I wrote (based on this HackerRank Dropbox Challenge practice).
You can play with the DEBUG_VARIATIONS and DEBUG_MATCH flags to have a better understanding of how the algorithm works.
It may be too late now, but you might want to attempt to tackle the problem at HackerRank first before reading through the proposed solutions! ;-)
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class Solution {
private static final boolean DEBUG_VARIATIONS = false;
private static final boolean DEBUG_MATCH = true;
static int wordpattern(final String pattern, final String input) {
if (pattern.length() == 1) {
return 1;
}
final int nWords = pattern.length();
final List<List<String>> lists = split(input, nWords);
for (final List<String> words : lists) {
if (DEBUG_VARIATIONS) {
System.out.print("-> ");
for (int i = 0; i < words.size(); i++) {
System.out.printf("%s ", words.get(i));
}
System.out.println();
}
if (matches(pattern, words)) {
return 1;
}
}
return 0;
}
// Return every possible way to split 'input' into 'n' parts
private static final List<List<String>> split(final String input, final int n) {
final List<List<String>> variations = new ArrayList<>();
// Stop recursion when n == 2
if (n == 2) {
for (int i = 1; i < input.length(); i++) {
final List<String> l = new ArrayList<>();
l.add(input.substring(0, i));
l.add(input.substring(i));
variations.add(l);
}
return variations;
}
for (int i = 1; i < input.length() - n + 1; i++) {
final List<List<String>> result = split(input.substring(i), n - 1);
for (List<String> l : result) {
l.add(0, input.substring(0, i));
}
variations.addAll(result);
}
return variations;
}
// Return 'true' if list of words matches patterns
private static final boolean matches(final String pattern, final List<String> words) {
final Map<String, String> patterns = new HashMap<>();
for (int i = 0; i < pattern.length(); i++) {
final String key = String.valueOf(pattern.charAt(i));
final String value = words.get(i);
boolean hasKey = patterns.containsKey(key);
boolean hasValue = patterns.containsValue(value);
if (!hasKey && !hasValue) {
patterns.put(key, value);
} else if (hasKey && !hasValue) {
return false;
} else if (!hasKey && hasValue) {
return false;
} else if (hasKey && hasValue) {
if (!value.equals(patterns.get(key))) {
return false;
}
}
}
if (DEBUG_MATCH) {
System.out.print("Found match! -> ");
for (int i = 0; i < words.size(); i++) {
System.out.printf("%s ", words.get(i));
}
System.out.println();
}
return true;
}
public static void main(final String[] args) {
System.out.println(wordpattern("abba", "redbluebluered"));
}
}
Python solution based on Java solution at: https://www.algo.monster/problems/word_pattern_ii
def helper(pattern, s, idxPattern, idxString, myMap, mySet):
if (idxPattern == len(pattern)) and (idxString == len(s)):
return True
if (idxPattern >= len(pattern)) or (idxString >= len(s)):
return False
thisChar = pattern[idxPattern]
#print ("At Char: ", thisChar, " at location: ", idxPattern)
for idxK in range(idxString + 1, len(s) + 1):
subString = s[idxString:idxK]
if (thisChar not in myMap) and (subString not in mySet) :
myMap[thisChar] = subString
mySet.add(subString)
# print ("Before Map {0}, Set: {1}".format(myMap, mySet))
if helper(pattern, s, idxPattern + 1, idxK, myMap, mySet):
return True
myMap.pop(thisChar)
mySet.remove(subString)
# print ("After Map {0}, Set: {1}".format(myMap, mySet))
elif (thisChar in myMap) and (myMap[thisChar] == subString):
if helper(pattern, s, idxPattern + 1, idxK, myMap, mySet):
return True
def word_pattern_match(pattern: str, s: str) -> bool:
# WRITE YOUR BRILLIANT CODE HERE
print ("Pattern {0}, String {1}".format(pattern, s))
if (len(pattern) == 0) and (len(s) == 0):
return True
if (len(pattern) == 0):
return False
myMap = dict()
mySet = set()
return helper(pattern, s, 0, 0, myMap, mySet)
if __name__ == '__main__':
pattern = input()
s = input()
res = word_pattern_match(pattern, s)
print('true' if res else 'false')
recursively check each combination.
#include <bits/stdc++.h>
using namespace std;
/**
* Given a string and a pattern, check if the whole string is following the given pattern.
* e.g.
* string pattern return
* redblueredblue abab a:red, b:blue true
* redbb aba false
*
* Concept:
* Recursively checking
* point_pat:0 point_str:0 a:r point_pat:1 point_str:1 b:e/ed/edb...
* point_pat:0 point_str:1 a:re point_pat:1 point_str:2 b:d/db/dbl...
*/
bool isMatch(const string &str, const string &pattern, unordered_map<char, string> &match_table, int point_str, int point_pat)
{
if (point_pat >= pattern.size() && point_str >= str.size())
return true;
if (point_pat >= pattern.size() || point_str >= str.size())
return false;
if (match_table.count(pattern[point_pat]))
{
auto &match_str = match_table[pattern[point_pat]];
if (str.substr(point_str, match_str.size()) == match_str)
return isMatch(str, pattern, match_table, point_str + match_str.size(), point_pat + 1);
else
return false;
}
else
{
for (int len = 1; len <= str.size() - point_str; ++len)
{
match_table[pattern[point_pat]] = str.substr(point_str, len);
if (isMatch(str, pattern, match_table, point_str + len, point_pat + 1))
{
return true;
}
}
return false;
}
}
bool isMatch(const string &str, const string &pattern)
{
unordered_map<char, string> match_table;
bool res = isMatch(str, pattern, match_table, 0, 0);
for (const auto &p : match_table)
{
cout << p.first << " : " << p.second << "\n";
}
return res;
}
int main()
{
string str{"redblueredblue"}, pattern{"abab"};
cout << isMatch(str, pattern) << "\n";
cout << isMatch(str, "ab") << "\n";
cout << isMatch(str, "ababa") << "\n";
cout << isMatch(str, "cba") << "\n";
cout << isMatch(str, "abcabc") << "\n";
cout << isMatch("patrpatrr", "aba") << "\n";
}
Depending on what patterns are given, you can answer a 'different' question (that really is the same question).
For patterns like [a b b a] determine whether or not the string is a palindrome.
For patterns like [a b a b] determine if the second half of the string equals the first half of the string.
Longer patterns like [a b c b c a], but you still break it up into smaller problems to solve. For this one, you know that the last n characters of the string should be the reverse of the first n characters. Once they stop being equal, you simply have another [b c b c] problem to check for.
Although possible, in an interview, I doubt they'd give you anything more complex than maybe 3-4 different substrings.
Can someone explain to me how to solve the substring problem iteratively?
The problem: given two strings S=S1S2S3…Sn and T=T1T2T3…Tm, with m is less than or equal to n, determine if T is a substring of S.
Here's a list of string searching algorithms
Depending on your needs, a different algorithm may be a better fit, but Boyer-Moore is a popular choice.
A naive algorithm would be to test at each position 0 < i ≤ n-m of S if Si+1Si+2…Si+m=T1T2…Tm. For n=7 and m=5:
i=0: S1S2S3S4S5S6S7
| | | | |
T1T2T3T4T5
i=1: S1S2S3S4S5S6S7
| | | | |
T1T2T3T4T5
i=2: S1S2S3S4S5S6S7
| | | | |
T1T2T3T4T5
The algorithm in pseudo-code:
// we just need to test if n ≤ m
IF n > m:
// for each offset on that T can start to be substring of S
FOR i FROM 0 TO n-m:
// compare every character of T with the corresponding character in S plus the offset
FOR j FROM 1 TO m:
// if characters are equal
IF S[i+j] == T[j]:
// if we’re at the end of T, T is a substring of S
IF j == m:
RETURN true;
ENDIF;
ELSE:
BREAK;
ENDIF;
ENDFOR;
ENDFOR;
ENDIF;
RETURN false;
Not sure what language you're working in, but here's an example in C#. It's a roughly n2 algorithm, but it will get the job done.
bool IsSubstring (string s, string t)
{
for (int i = 0; i <= (s.Length - t.Length); i++)
{
bool found = true;
for (int j = 0; found && j < t.Length; j++)
{
if (s[i + j] != t[j])
found = false;
}
if (found)
return true;
}
return false;
}
if (T == string.Empty) return true;
for (int i = 0; i <= S.Length - T.Length; i++) {
for (int j = 0; j < T.Length; j++) {
if (S[i + j] == T[j]) {
if (j == (T.Length - 1)) return true;
}
else break;
}
}
return false;
It would go something like this:
m==0? return true
cs=0
ct=0
loop
cs>n-m? break
char at cs+ct in S==char at ct in T?
yes:
ct=ct+1
ct==m? return true
no:
ct=0
cs=cs+1
end loop
return false
This may be redundant with the above list of substring algorithms, but I was always amused by KMP (http://en.wikipedia.org/wiki/Knuth–Morris–Pratt_algorithm)
// runs in best case O(n) where no match, worst case O(n2) where strings match
var s = "hippopotumus"
var t = "tum"
for(var i=0;i<s.length;i++)
if(s[i]==t[0])
for(var ii=i,iii=0; iii<t.length && i<s.length; ii++, iii++){
if(s[ii]!=t[iii]) break
else if (iii==t.length-1) console.log("yay found it at index: "+i)
}
Here is my PHP variation that includes a check to make sure the Needle does not exceed the Haystacks length during the search.
<?php
function substring($haystack,$needle) {
if("" == $needle) { return true; }
echo "Haystack:\n$haystack\n";
echo "Needle:\n$needle\n";
for($i=0,$len=strlen($haystack);$i<$len;$i++){
if($needle[0] == $haystack[$i]) {
$found = true;
for($j=0,$slen=strlen($needle);$j<$slen;$j++) {
if($j >= $len) { return false; }
if($needle[$j] != $haystack[$i+$j]) {
$found = false;
continue;
}
}
if($found) {
echo " . . . . . . SUCCESS!!!! startPos: $i\n";
return true;
}
}
}
echo " . . . . . . FAILURE!\n" ;
return false;
}
assert(substring("haystack","hay"));
assert(!substring("ack","hoy"));
assert(substring("hayhayhay","hayhay"));
assert(substring("mucho22","22"));
assert(!substring("str","string"));
?>
Left in some echo's. Remove if they offend you!
Is a O(n*m) algorithm, where n and m are the size of each string.
In C# it would be something similar to:
public static bool IsSubtring(char[] strBigger, char[] strSmall)
{
int startBigger = 0;
while (startBigger <= strBigger.Length - strSmall.Length)
{
int i = startBigger, j = 0;
while (j < strSmall.Length && strSmall[j] == strBigger[i])
{
i++;
j++;
}
if (j == strSmall.Length)
return true;
startBigger++;
}
return false;
}
I know I'm late to the game but here is my version of it (in C#):
bool isSubString(string subString, string supraString)
{
for (int x = 0; x <= supraString.Length; x++)
{
int counter = 0;
if (subString[0] == supraString[x]) //find initial match
{
for (int y = 0; y <= subString.Length; y++)
{
if (subString[y] == supraString[y+x])
{
counter++;
if (counter == subString.Length)
{
return true;
}
}
}
}
}
return false;
}
Though its pretty old post, I am trying to answer it. Kindly correct me if anything is wrong,
package com.amaze.substring;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class CheckSubstring {
/**
* #param args
* #throws IOException
*/
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Please enter the main string");
String mainStr = br.readLine();
System.out.println("Enter the substring that has to be searched");
String subStr = br.readLine();
char[] mainArr = new char[mainStr.length()];
mainArr = mainStr.toCharArray();
char[] subArr = new char[subStr.length()];
subArr = subStr.toCharArray();
boolean tracing = false;
//System.out.println("Length of substring is "+subArr.length);
int j = 0;
for(int i=0; i<mainStr.length();i++){
if(!tracing){
if(mainArr[i] == subArr[j]){
tracing = true;
j++;
}
} else {
if (mainArr[i] == subArr[j]){
//System.out.println(mainArr[i]);
//System.out.println(subArr[j]);
j++;
System.out.println("Value of j is "+j);
if((j == subArr.length)){
System.out.println("SubString found");
return;
}
} else {
j=0;
tracing = false;
}
}
}
System.out.println("Substring not found");
}
}
I'm trying to find the easiest way to search a string for an array of possible strings. I know the easy way to do this for characters is to use myString.IndexOfAny(charArray). But how what if I'd like to search my string for strings and not just characters? Are there any .net tricks or methods that make this easier?
Basically, I'd like to do something like this:
string myName = "rahkim";
string[] names = new string[] {"joe","bob","chris"};
if(myName.IndexOfAny(names) >= 0)
{
//success code//
}
I know there are ways to do this with loops, etc. But I was hoping for something inherent in the framework.
You should define if you want to to find equal strings or search for a matching substring. Both ways are easy pre-LINQ and with LINQ.
string myName = "rahkim";
string[] names = new string[] { "joe", "bob", "chris" };
Equal Strings, LINQ
bool contains = names.Contains(myName);
Equal Strings, Pre-LINQ
bool contains = new List<string>(name).Contains(myName);
Substrings, LINQ
bool contains = names.Any(name => name.Contains(myName));
Substring, Pre-LINQ
bool contains = false;
foreach(string name in names)
if (name.Contains(myName))
contains = true;
If anyone else found this while trying to search for a .Net method like String.IndexOfAny(String[]), this is my solution:
C#
public int IndexOfAny(string test, string[] values)
{
int first = -1;
foreach (string item in values) {
int i = test.IndexOf(item);
if (i >= 0) {
if (first > 0) {
if (i < first) {
first = i;
}
} else {
first = i;
}
}
}
return first;
}
VB
Public Function IndexOfAny(test As String, values As String()) As Integer
Dim first As Integer = -1
For Each item As String In values
Dim i As Integer = test.IndexOf(item)
If i >= 0 Then
If first > 0 Then
If i < first Then
first = i
End If
Else
first = i
End If
End If
Next
Return first
End Function
You can do a LastIndexOfAny(String[]) by just switching the
i < first
to
i > first
You can (also) use the static IndexOf method of the Array class:
bool hasName = Array.IndexOf(names, myName) > -1;
int IndexOfAny(String[] rgs) would indeed be nice but it's nominally an O(n^2) operation. If, in your application, the set of strings rgs is large and always the same, the most efficient approach is to load them into a trie data structure once, and then use the trie repeatedly to search for them within the unknown strings given at runtime.
Here is the relevant code, adapted from a C# trie source I found on the web, attributed to "Kerry D. Wong." In my version, each string in the trie has a "payload" of generic type TValue. To use this trie to simply search for substrings, the payload could always be set to true, as illustrated with simple_trie.
The other thing I changed here is that this trie automatically adapts allow for storage of arbitrary Unicode strings. The array at each node—which characterizes a trie—adjusts its base and length to accomodate the range of Unicode characters which need to be stored at that node. This allows for case-sensitive matching, for example.
The C# 3.0 initialization syntax is handy for this trie, but enabling it requires a dummy implementation of IEnumerable in order to compile. The CLR doesn't seem to call GetEnumerator() and I suggest that you don't try to enumerate with its result either.
using System;
using System.Collections.Generic;
using System.Linq; // only used in Main()
class Program
{
// trie with payload of type <String>
static Trie<String> value_trie = new Trie<String>
{
{ "rabbit", "cute" },
{ "giraffe", "tall" },
{ "ape", "smart" },
{ "hippo", "large" },
};
// degenerate case of a trie without payload
static Trie<bool> simple_trie = new Trie<bool>
{
{ "rabbit", true },
{ "giraffe", true },
{ "ape", true },
{ "hippo", true },
};
static void Main(String[] args)
{
String s = "Once upon a time, a rabbit met an ape in the woods.";
// Retrieve payloads for words in the string.
//
// output:
// cute
// smart
foreach (String word in value_trie.AllSubstringValues(s))
Console.WriteLine(word);
// Simply test a string for any of the words in the trie.
// Note that the Any() operator ensures that the input is no longer
// traversed once a single result is found.
//
// output:
// True
Console.WriteLine(simple_trie.AllSubstringValues(s).Any(e=>e));
s = "Four score and seven years ago.";
// output:
// False
Console.WriteLine(simple_trie.AllSubstringValues(s).Any(e => e));
}
}
class TrieNode<TValue>
{
private TrieNode<TValue>[] nodes = null;
private TValue m_value = default(TValue);
private Char m_base;
public Char Base { get { return m_base; } }
public bool IsEnd { get { return !m_value.Equals(default(TValue)); } }
public TValue Value
{
get { return m_value; }
set { m_value = value; }
}
public IEnumerable<TrieNode<TValue>> Nodes { get { return nodes; } }
public TrieNode<TValue> this[char c]
{
get
{
if (nodes != null && m_base <= c && c < m_base + nodes.Length)
return nodes[c - m_base];
return null;
}
}
public TrieNode<TValue> AddChild(char c)
{
if (nodes == null)
{
m_base = c;
nodes = new TrieNode<TValue>[1];
}
else if (c >= m_base + nodes.Length)
{
Array.Resize(ref nodes, c - m_base + 1);
}
else if (c < m_base)
{
Char c_new = (Char)(m_base - c);
TrieNode<TValue>[] tmp = new TrieNode<TValue>[nodes.Length + c_new];
nodes.CopyTo(tmp, c_new);
m_base = c;
nodes = tmp;
}
TrieNode<TValue> node = nodes[c - m_base];
if (node == null)
{
node = new TrieNode<TValue>();
nodes[c - m_base] = node;
}
return node;
}
};
class Trie<TValue> : System.Collections.IEnumerable
{
private TrieNode<TValue> _root = new TrieNode<TValue>();
// This dummy enables C# 3.0 initialization syntax
public System.Collections.IEnumerator GetEnumerator()
{
return null;
}
public void Add(String s, TValue v)
{
TrieNode<TValue> node = _root;
foreach (Char c in s)
node = node.AddChild(c);
node.Value = v;
}
public bool Contains(String s)
{
TrieNode<TValue> node = _root;
foreach (Char c in s)
{
node = node[c];
if (node == null)
return false;
}
return node.IsEnd;
}
public TValue Find(String s_in)
{
TrieNode<TValue> node = _root;
foreach (Char c in s_in)
{
node = node[c];
if (node == null)
return default(TValue);
}
return node.Value;
}
public IEnumerable<TValue> FindAll(String s_in)
{
TrieNode<TValue> node = _root;
foreach (Char c in s_in)
{
node = node[c];
if (node == null)
break;
if (node.Value != null)
yield return node.Value;
}
}
public IEnumerable<TValue> AllSubstringValues(String s)
{
int i_cur = 0;
while (i_cur < s.Length)
{
TrieNode<TValue> node = _root;
int i = i_cur;
while (i < s.Length)
{
node = node[s[i]];
if (node == null)
break;
if (node.Value != null)
yield return node.Value;
i++;
}
i_cur++;
}
}
};
Here's the right syntax:
if(names.Contains(myName))
{
//success code//
}
if (names.Contains(myName))
{
//success code//
}