I have a search box and want to check whether a string contains certain words (case-insensitively) and/or numbers.
search = "Brown lazy 46"
textline = "The quick brown fox jumped over 46 lazy dogs"
if string.match(textline, search) then
result = textline
end
It should work just like a web search.
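search = "Brown lazy 46"
textline = "The quick brown fox jumped over 46 lazy dogs"
-- lowercase the text once so the comparison is case-insensitive on both sides
local haystack = string.lower(textline)
for item in string.gmatch(search, "%S+") do
  -- plain find (4th argument true) so search words are not treated as Lua patterns
  if string.find(haystack, string.lower(item), 1, true) then
    result = textline
  end
end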
You break the search string up into its individual words, then loop over those words and check whether each one occurs in the line of text you are searching.
If I understood correctly what you want to do, this should solve your problem.
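For comparison, here is a minimal Python sketch of the same idea. It assumes "like a web search" means that every search word must appear somewhere in the text (the Lua loop above accepts the line as soon as any one word matches). The function name matches_all_words is made up for this example.

```python
def matches_all_words(textline, search):
    """Return True if every whitespace-separated word of `search`
    occurs in `textline`, ignoring case (assumed AND semantics)."""
    haystack = textline.lower()
    # Plain substring test per word; no pattern characters are interpreted
    return all(word in haystack for word in search.lower().split())


textline = "The quick brown fox jumped over 46 lazy dogs"
print(matches_all_words(textline, "Brown lazy 46"))  # True
print(matches_all_words(textline, "Brown cat"))      # False
```

Because this is a substring test, "46" would also match inside "1467"; splitting the text into words first would avoid that.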
I've tried to break a sentence down into letters and rearrange them alphabetically.
Please let me know if I could improve this code in some way.
Kind regards
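sen = "the quick brown fox jumps over the lazy dog"
smallest=[]
re=''
counter=0  # must be initialised before the loop, otherwise counter+=1 raises NameError
while len(sen) >0:
    smallest.append( min(sen))
    print(ord(min(sen)))
    re=re+min(sen)
    # remove the first occurrence of the smallest character
    sen = sen[:sen.index(min(sen))]+sen[sen.index(min(sen))+1:]
    counter+=1
print(smallest) #list
print(re) #string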
What you're doing is exactly like sorting an array of numbers, since each character has a numeric value. Your loop repeatedly pulls out the smallest remaining character, which is essentially a selection sort. There are many ways to sort an array. Insertion sort, bubble sort, and selection sort are the simplest to write, but they take O(n^2) time; merge sort, heapsort, and quicksort are faster on larger inputs at O(n log n).
You can code them yourself or find them already implemented in many languages.
There are other ways to sort an array; you can find them all here:
https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms
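As a short sketch of the "already done" option: Python's built-in sorted() accepts a string and returns its characters in ascending order, so the whole loop in the question can be replaced by one call. The variable name result is chosen here to avoid shadowing the standard re module.

```python
sen = "the quick brown fox jumps over the lazy dog"

smallest = sorted(sen)      # list of characters in ascending code-point order
result = "".join(smallest)  # the same characters as a single string

print(smallest)  # list
print(result)    # string; the spaces sort first because ' ' has a lower code point than letters
```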
In MATLAB, how can we compare two strings and print out the words they have in common? For example, with string1 = "hello my name is bob"; and string2 = "today bob went to the park"; the word bob is common to both. What is the structure to follow?
Use intersect with strsplit for a one-liner -
common_word = intersect(strsplit(string1),strsplit(string2))
strsplit splits each string to cells of words and then intersect finds the common one out.
If you would like to avoid strsplit, you can use regexp instead -
common_word =intersect(regexp(string1,'\s','Split'),regexp(string2,'\s','Split'))
Bonus: Removing stop-words from the common words
Let's add some stop-words that are common to these two strings -
string1 = 'hello my name is bob and I am going to the zoo'
string2 = 'today bob went to the park'
Using the solution presented earlier, you would get -
common_word =
'bob' 'the' 'to'
Now, the words 'the' and 'to' are stop-words. If you would like to have them removed, let me suggest the question "Removing stop words from single string" and its accepted solution.
The final output would be 'bob', whom you were looking for!
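The same split-intersect-filter idea can be sketched outside MATLAB as well. The Python below is only an illustration: the stop_words set is a hypothetical list made up for this example, not the one used in the linked solution.

```python
string1 = "hello my name is bob and I am going to the zoo"
string2 = "today bob went to the park"

# Hypothetical stop-word list, for illustration only
stop_words = {"the", "to", "is", "and", "am", "my"}

# Split on whitespace and intersect the two word sets
common = set(string1.split()) & set(string2.split())

# Drop the stop-words from the common words
common_words = sorted(common - stop_words)
print(common_words)  # ['bob']
```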
If you are looking for matching words only, that are separated by spaces, you can use strsplit to change each string into cell arrays of words, then loop through and search for each one.
str1 = 'test if this works';
str2 = 'does this work?';
cell1 = strsplit(str1);
cell2 = strsplit(str2);
for n = 1:length(cell1)
    for m = 1:length(cell2)
        if strcmp(cell1{n},cell2{m})
            disp(cell1{n});
        end
    end
end
Notice that in my example the last member of cell2 is 'work?' so if you have punctuation in your strings, you'll have to do a check for that (isletter might help).
I need to know if two strings "match" where "matching" basically means that there is significant overlap between the two strings. For example, if string1 is "foo" and string2 is "foobar", this should be a match. If string2 was "barfoo", that should also be a match with string1. However, if string2 was "fobar", this should not be a match. I'm struggling to find a clever way to do this. Do I need to split the strings into lists of characters first or is there a way to do this kind of comparison already in Groovy? Thanks!
Using Apache commons StringUtils:
@Grab( 'org.apache.commons:commons-lang3:3.1' )
import static org.apache.commons.lang3.StringUtils.getLevenshteinDistance
int a = getLevenshteinDistance( 'The quick fox jumped', 'The fox jumped' )
int b = getLevenshteinDistance( 'The fox jumped', 'The fox' )
// Assert a is more similar than b
assert a < b
Levenshtein distance tells you the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into another.
So to get from 'The quick fox jumped' to 'The fox jumped', you need to delete 6 chars (so it has a score of 6)
And to get from 'The fox jumped' to 'The fox', you need to delete 7 chars.
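To make the scores above concrete, here is a minimal Python sketch of the standard dynamic-programming computation of Levenshtein distance. It is not the Apache Commons implementation; the function name levenshtein is just for this example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 chars of a and the first j chars of b
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("The quick fox jumped", "The fox jumped"))  # 6
print(levenshtein("The fox jumped", "The fox"))               # 7
```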
As per your examples, plain old String.contains may suffice:
assert 'foobar'.contains('foo')
assert 'barfoo'.contains('foo')
assert !'fobar'.contains('foo')
For any input string, we need to find its super strings by word match in any order, i.e. all words in the input string have to occur, in any order, in the output string.
e.g. given data set:
"string search"
"java string search"
"manual c string search equals"
"java search code"
"c java code search"
...
input: "java search"
output:
1) "java string search"
2) "java search code"
3) "c java code search"
input: "search c"
output:
1) "manual c string search equals"
2) "c java code search"
This can be done in a very trivial way with word by word matching. Here mainly I am looking for an efficient algo.
Input: A few billion records in given data set (mostly 1 to 10 words length string).
I need to find super string for millions of strings.
Note: words are extended dictionary ones.
Preprocess your input (if possible), and index the words that appear in the dataset. Generate a mapping from each word to a set of possible output strings. For example, with the dataset
0 string search
1 java string search
2 manual c string search equals
3 java search code
4 c java code search
we get
c {2,4}
code {3,4}
equals {2}
java {1,3,4}
...
Then, searching for the matches for a given input is as simple as intersecting the sets corresponding to the input words:
input: "java c"
output: {1,3,4} intersect {2,4} = {4}
If you store the sets just as sorted lists, intersection can be done in linear time (linear in the total length of the input sets) by scanning across the lists in parallel.
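A minimal Python sketch of this approach, under the assumption that the dataset fits in memory (a few billion records would need a sharded or on-disk index). The names build_index, intersect_sorted, and search are invented for this example.

```python
from collections import defaultdict

records = [
    "string search",
    "java string search",
    "manual c string search equals",
    "java search code",
    "c java code search",
]


def build_index(records):
    """Map each word to the sorted list of record ids that contain it."""
    index = defaultdict(list)
    for record_id, record in enumerate(records):
        for word in set(record.split()):
            # ids arrive in increasing order, so every list stays sorted
            index[word].append(record_id)
    return index


def intersect_sorted(a, b):
    """Intersect two sorted lists by scanning them in parallel (linear time)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search(index, query):
    """Return ids of records containing every word of the query."""
    words = query.split()
    if not words:
        return []
    # Start from the shortest list so intermediate results stay small
    postings = sorted((index.get(w, []) for w in words), key=len)
    result = postings[0]
    for other in postings[1:]:
        result = intersect_sorted(result, other)
    return result


index = build_index(records)
print(search(index, "java c"))       # [4]
print(search(index, "java search"))  # [1, 3, 4]
```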
You basically need to find the intersection of two sets of words, input_words and data_words. If the intersection equals input_words, you have a match.
Here are efficient algorithms for set intersection: Efficient list intersection algorithm
A simple algorithm that comes to mind, running in O(n*m) time [n = number of input words, m = number of data words], is:
Python:
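input_words = "java search"        # example input string
data_words = "java string search"  # example record from the data set
match = True
for word in input_words.split():
    if word in data_words.split():  # linear search comparing word to each word
        continue
    else:
        match = False
        break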
A search on a sorted list would be quicker, and hash lookups would be quicker still. Those are detailed in the link above.
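As a sketch of the hash-lookup variant, Python sets give average constant-time membership tests, so checking one input against one record becomes roughly linear in the number of words. The function name is_super_string is made up for this example.

```python
def is_super_string(input_string, data_string):
    """True if every word of input_string also occurs in data_string, in any order."""
    # <= on sets is the subset test; each membership check is a hash lookup
    return set(input_string.split()) <= set(data_string.split())


print(is_super_string("java search", "c java code search"))  # True
print(is_super_string("search c", "java search code"))       # False
```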
I have this string:
"The quick brown f0x jumps 0ver the lazy d0g, the quick brown f0x jumps 0ver the lazy d0g.".
I need a function that will replace all zeros between "brown" and "lazy" with "o". So the output will look like this:
"The quick brown fox jumps over the lazy d0g, the quick brown fox jumps over the lazy d0g.".
So it will look all over the string and most importantly will leave all other zeros intact.
function(text, leftBorder, rightBorder, searchString, replaceString) : string;
Is there any good algorithm?
If you have Python, here's an example using just string manipulation, e.g. split(), indexing, etc. Your programming language should have these features as well.
>>> s="The quick brown f0x jumps 0ver the lazy d0g, the quick brown f0x jumps 0ver the lazy d0g."
>>> words = s.split("lazy")
>>> for n,word in enumerate(words):
...     if "brown" in word:
...         w = word.split("brown")
...         w[-1]=w[-1].replace("0","o")
...         word = 'brown'.join(w)
...     words[n]=word
...
>>> 'lazy'.join(words)
'The quick brown fox jumps over the lazy d0g, the quick brown fox jumps over the lazy d0g.'
>>>
The steps:
Split the string on "lazy" into an array A.
Go through each element of A and look for "brown".
If found, split on "brown" into an array B. The part you are going to change is the last element.
Replace the characters in it with whatever method your programming language provides.
Join array B back together using "brown".
Update this element in the first array A.
Lastly, join the whole string back together using "lazy".
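If regular expressions are available, the same result can be sketched more compactly. The function below follows the signature asked for in the question, but it is a different technique from the split/join steps above: it uses a non-greedy match between the two borders and a replacement callback. The name replace_between and its parameter names are assumptions for this example.

```python
import re


def replace_between(text, left_border, right_border, search_string, replace_string):
    """Replace search_string with replace_string only inside spans that
    start at left_border and end at the next right_border."""
    pattern = re.compile(
        re.escape(left_border) + r"(.*?)" + re.escape(right_border),
        re.DOTALL,  # let a span run across line breaks
    )

    def fix(match):
        # Only the text between the borders is changed; the borders are kept as-is
        inner = match.group(1).replace(search_string, replace_string)
        return left_border + inner + right_border

    return pattern.sub(fix, text)


s = ("The quick brown f0x jumps 0ver the lazy d0g, "
     "the quick brown f0x jumps 0ver the lazy d0g.")
print(replace_between(s, "brown", "lazy", "0", "o"))
```

Unlike the split/join version, this leaves a trailing "brown ..." segment untouched when it has no closing "lazy".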