Find multiple words or strings in an if statement [duplicate] - python-3.x

How can I check if any of the strings in an array exists in another string?
For example:
a = ['a', 'b', 'c']
s = "a123"
if a in s:
    print("some of the strings found in s")
else:
    print("no strings found in s")
How can I replace the if a in s: line to get the appropriate result?

You can use any:
a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]
if any(x in a_string for x in matches):
    print("found a match")
Similarly to check if all the strings from the list are found, use all instead of any.
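As a rough sketch of both checks side by side (the helper names contains_any and contains_all are made up for this illustration and are not part of any library):

```python
def contains_any(text, needles):
    """Return True if at least one needle is a substring of text."""
    return any(needle in text for needle in needles)


def contains_all(text, needles):
    """Return True only if every needle is a substring of text."""
    return all(needle in text for needle in needles)


a_string = "A string is more than its parts!"
print(contains_any(a_string, ["more", "wholesome", "milk"]))  # True
print(contains_all(a_string, ["more", "wholesome", "milk"]))  # False
print(contains_all(a_string, ["more", "parts"]))              # True
```

Passing a generator expression rather than a list lets any and all stop at the first deciding element.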

any() is by far the best approach if all you want is True or False, but if you want to know specifically which string/strings match, you can use a couple things.
If you want the first match (with False as a default):
match = next((x for x in a if x in str), False)
If you want to get all matches (including duplicates):
matches = [x for x in a if x in str]
If you want to get all non-duplicate matches (disregarding order):
matches = {x for x in a if x in str}
If you want to get all non-duplicate matches in the right order:
matches = []
for x in a:
    if x in str and x not in matches:
        matches.append(x)

Be careful as the strings in a or str get longer. The straightforward solutions take O(S*(A^2)), where S is the length of str and A is the sum of the lengths of all strings in a. For a faster solution, look at the Aho-Corasick algorithm for string matching, which runs in linear time O(S+A).
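To make the idea concrete, here is a compact, unoptimized sketch of Aho-Corasick in pure Python. The function names build_automaton and find_keywords are made up for this illustration and do not come from any library; treat it as a teaching aid under those assumptions rather than a production implementation.

```python
from collections import deque


def build_automaton(keywords):
    """Build the trie, failure links and output lists for the keywords."""
    goto = [{}]   # goto[node] maps a character to the next node
    fail = [0]    # fail[node] is the node to fall back to on a mismatch
    out = [[]]    # out[node] lists the keywords that end at this node
    for kw in keywords:
        if not kw:
            continue  # assumption: empty keywords are ignored
        node = 0
        for ch in kw:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(kw)

    # Breadth-first pass: failure links of shallower nodes are ready first.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, nxt in goto[node].items():
            queue.append(nxt)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback node.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_keywords(text, keywords):
    """Return the set of keywords that occur somewhere in text."""
    goto, fail, out = build_automaton(keywords)
    found = set()
    node = 0
    for ch in text:
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        found.update(out[node])
    return found


print(find_keywords("a123", ["a", "b", "c"]))
print(find_keywords("ushers", ["he", "she", "his", "hers"]))
```

The text is scanned once, character by character, which is where the linear behaviour comes from. For short lists and short strings the plain any approach is usually simpler and fast enough.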

Just to add some diversity with regex:
import re
if any(re.findall(r'a|b|c', str, re.IGNORECASE)):
    print('possible matches thanks to regex')
else:
    print('no matches')
Or, if your list is too long to write out by hand, build the pattern from it: any(re.findall('|'.join(map(re.escape, a)), str, re.IGNORECASE))
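One thing to watch with a joined pattern is that list items are treated as regular expressions unless they are escaped. A minimal sketch of a helper that escapes each item first (regex_contains_any is a hypothetical name chosen for this example):

```python
import re


def regex_contains_any(text, needles, ignore_case=True):
    """Search text once for any of the literal strings in needles."""
    if not needles:
        return False
    # re.escape keeps characters such as '.', '+' or '|' literal.
    pattern = "|".join(re.escape(n) for n in needles)
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None


print(regex_contains_any("a123", ["a", "b", "c"]))  # True
print(regex_contains_any("1+1=2", ["+"]))           # True
print(regex_contains_any("a123", ["x.z", "q"]))     # False
```

re.search is used instead of re.findall because only a yes/no answer is needed, so the scan can stop at the first hit.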

A surprisingly fast approach is to use set:
a = ['a', 'b', 'c']
str = "a123"
if set(a) & set(str):
    print("some of the strings found in str")
else:
    print("no strings found in str")
This only works if a contains single-character values; if it contains multiple-character values, use any as listed above. When every value is a single character, it is simpler to specify a as a string: a = 'abc'.
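A short illustration of why the set approach is limited to single characters (the list multi is an extra example value, not from the question):

```python
a = ['a', 'b', 'c']
multi = ['a1', '23']
s = "a123"

# set(s) is the set of single characters {'a', '1', '2', '3'}.
print(bool(set(a) & set(s)))       # True: 'a' is one of the characters
print(bool(set(multi) & set(s)))   # False, although 'a1' and '23' occur in s
print(any(x in s for x in multi))  # True: the substring test handles this case
```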

You need to iterate on the elements of a.
a = ['a', 'b', 'c']
str = "a123"
found_a_string = False
for item in a:
    if item in str:
        found_a_string = True
if found_a_string:
    print("found a match")
else:
    print("no match found")

a = ['a', 'b', 'c']
str = "a123"
a_match = [True for match in a if match in str]
if True in a_match:
    print("some of the strings found in str")
else:
    print("no strings found in str")

jbernadas already mentioned the Aho-Corasick algorithm as a way to reduce complexity.
Here is one way to use it in Python:
Download aho_corasick.py from here
Put it in the same directory as your main Python file and name it aho_corasick.py
Try the algorithm with the following code:
from aho_corasick import aho_corasick #(string, keywords)
print(aho_corasick(string, ["keyword1", "keyword2"]))
Note that the search is case-sensitive

The regex module, which the Python docs recommend, supports this:
import regex
words = {'he', 'or', 'low'}
p = regex.compile(r"\L<name>", name=words)
m = p.findall('helloworld')
print(m)
output:
['he', 'low', 'or']
Some details on implementation: link

A compact way to find multiple strings in another list of strings is to use set.intersection. This executes much faster than list comprehension in large sets or lists.
>>> astring = ['abc','def','ghi','jkl','mno']
>>> bstring = ['def', 'jkl']
>>> a_set = set(astring) # convert list to set
>>> b_set = set(bstring)
>>> matches = a_set.intersection(b_set)
>>> matches
{'def', 'jkl'}
>>> list(matches) # if you want a list instead of a set
['def', 'jkl']
>>>

Just some more info on how to get all the list elements available in the string:
a = ['a', 'b', 'c']
str = "a123"
list(filter(lambda x: x in str, a))

It depends on the context.
If you want to check for a single literal (any single character such as a, e, w, etc.), in is enough:
original_word = "hackerearcth"
if 'h' in original_word:
    print("YES")
If you want to check whether any of the characters of original_word appear in the input, use any:
if any(your_required in yourinput for your_required in original_word):
    print("YES")
If you want all of the characters of original_word to appear in the input, use all:
original_word = ['h', 'a', 'c', 'k', 'e', 'r', 'e', 'a', 'r', 't', 'h']
yourinput = str(input()).lower()
if all(requested_word in yourinput for requested_word in original_word):
    print("yes")

flog = open('test.txt', 'r')
flogLines = flog.readlines()
strlist = ['SUCCESS', 'Done','SUCCESSFUL']
res = False
for line in flogLines:
    for fstr in strlist:
        if line.find(fstr) != -1:
            print('found')
            res = True
if res:
    print('res true')
else:
    print('res false')

I would use this kind of function for speed:
def check_string(string, substring_list):
    for substring in substring_list:
        if substring in string:
            return True
    return False

Yet another solution with set, using set intersection as a one-liner:
subset = {"some", "words"}
text = "some words to be searched here"
if len(subset & set(text.split())) == len(subset):
    print("All values present in text")
if subset & set(text.split()):
    print("At least one value present in text")

If you want exact matches of words then consider word tokenizing the target string. I use the recommended word_tokenize from nltk:
from nltk.tokenize import word_tokenize
Here is the tokenized string from the accepted answer:
a_string = "A string is more than its parts!"
tokens = word_tokenize(a_string)
tokens
Out[46]: ['A', 'string', 'is', 'more', 'than', 'its', 'parts', '!']
The accepted answer gets modified as follows:
matches_1 = ["more", "wholesome", "milk"]
[x in tokens for x in matches_1]
Out[42]: [True, False, False]
As in the accepted answer, the word "more" is still matched. If "mo" becomes a match string, however, the accepted answer still finds a match. That is a behavior I did not want.
matches_2 = ["mo", "wholesome", "milk"]
[x in a_string for x in matches_2]
Out[43]: [True, False, False]
Using word tokenization, "mo" is no longer matched:
[x in tokens for x in matches_2]
Out[44]: [False, False, False]
That is the additional behavior that I wanted. This answer also responds to the duplicate question here.
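If installing nltk is not an option, a similar whole-word effect can be approximated with the standard library using regex word boundaries. This is a sketch under that assumption, and matches_whole_word is a name invented for the example:

```python
import re


def matches_whole_word(text, word):
    """True if word occurs in text delimited by word boundaries."""
    return re.search(r"\b" + re.escape(word) + r"\b", text) is not None


a_string = "A string is more than its parts!"
print([matches_whole_word(a_string, w) for w in ["more", "wholesome", "milk"]])
# [True, False, False]
print([matches_whole_word(a_string, w) for w in ["mo", "wholesome", "milk"]])
# [False, False, False]
```

Note that \b is defined in terms of word characters, so search terms that begin or end with punctuation may not behave the way a tokenizer would treat them.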

data = "firstName and favoriteFood"
mandatory_fields = ['firstName', 'lastName', 'age']
# for each
for field in mandatory_fields:
    if field not in data:
        print("Error, missing req field {0}".format(field))
# still fine, multiple if statements
if ('firstName' not in data or
        'lastName' not in data or
        'age' not in data):
    print("Error, missing a req field")
# not very readable, list comprehension
missing_fields = [x for x in mandatory_fields if x not in data]
if len(missing_fields) > 0:
    print("Error, missing fields {0}".format(", ".join(missing_fields)))

Related

Using Python to identify if two words contain the same letter

Problem to solve:
Write a function that will find all the anagrams of a word from a list. You will be given two inputs a word and an array with words. You should return an array of all the anagrams or an empty array if there are none.
Solution Tested:
a = ['aabb', 'abcd', 'bbaa', 'dada']
b = ['abab']
listA = []
sorted_defaultword = sorted(b[0])
print (sorted_defaultword)
for i in range(len(a)):
    # print(a[i])
    sorted_word = sorted(a[i])
    # print(sorted_word)
    if sorted_word == sorted_defaultword:
        listA.append(a[i])
print(listA)
Test Output:
['a', 'a', 'b', 'b']
['aabb', 'bbaa']
Using the test, I then tried to write my function but apparently it will not work. Can someone please suggest why:
def anagrams(word, words):
    sorted_defaultword = sorted(word[0])
    anagram_List = []
    for i in range(len(words)):
        sorted_word = sorted(words[i])
        if sorted_word == sorted_defaultword:
            anagram_List.append(words[i])
    return anagram_List
Why is this failing when I put it in a function?
You are passing the wrong arguments to the function.
Test.assert_equals(anagrams('abba', ['aabb', 'abcd', 'bbaa', 'dada']), ['aabb', 'bbaa'])
Here you are passing the first parameter as a string, while the function expects a list.
Change your code to:
Test.assert_equals(anagrams(['abba'], ['aabb', 'abcd', 'bbaa', 'dada']), ['aabb', 'bbaa'])
Note that I have just wrapped 'abba' in a list, because your function expects it to be a list.
If you want to keep passing a string, as in your previous code, change the line sorted_defaultword = sorted(word[0]) in your function to sorted_defaultword = sorted(word)
And this should do the job...
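For reference, a minimal sketch of the function with the second fix applied, so that word is a plain string (the list comprehension is a stylistic choice, not something the question requires):

```python
def anagrams(word, words):
    """Return the entries of words that are anagrams of word."""
    target = sorted(word)
    return [candidate for candidate in words if sorted(candidate) == target]


print(anagrams('abba', ['aabb', 'abcd', 'bbaa', 'dada']))  # ['aabb', 'bbaa']
print(anagrams('xyz', ['aabb', 'abcd']))                   # []
```

With word[0], only the first letter ('a') was sorted, so no four-letter candidate could ever compare equal to it.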

How to Sort Alphabets

Input : abcdABCD
Output : AaBbCcDd
ms=[]
n = input()
for i in n:
    ms.append(i)
ms.sort()
print(ms)
It gives me ABCDabcd.
How to sort this in python?
Without having to import anything, you could probably do something like this:
arr = "abcdeABCDE"
temp = sorted(arr, key = lambda i: (i.lower(), i))
result = "".join(temp)
print(result) # AaBbCcDdEe
The key will take in each element of arr and sort it first by lower-casing it, then if it ties, it will sort it based on its original value. It will group all similar letters together (A with a, B with b) and then put the capital first.
Use a sorting key:
ms = "abcdABCD"
sorted_ms = sorted(ms, key=lambda letter:(letter.upper(), letter.islower()))
# sorted_ms = ['A', 'a', 'B', 'b', 'C', 'c', 'D', 'd']
sorted_str = ''.join(sorted_ms)
# sorted_str = 'AaBbCcDd'
Why this works:
You can specify the criterion to sort by using the key argument of the sorted function or the list.sort() method. It expects a function or lambda that takes the element in question and returns a new value to sort by. If that value is a tuple, then the first element takes precedence; if it's equal, then the second element is compared, and so on.
So, the lambda I provided here returns a 2-tuple:
(letter.upper(), letter.islower())
letter.upper() as the first element here means that the strings are going to be sorted lexicographically, but case-insensitively (as it will sort them as if they were all uppercase). Then, I use letter.islower() as the second element, which is True if the letter was lowercase and False otherwise. When sorting, False comes before True, which means that if you give a capital letter and a lowercase letter, the capital letter will come first.
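To see the tuple comparison at work, this small sketch prints the keys for one pair of letters before sorting the whole string (the variable name key is just for illustration):

```python
ms = "abcdABCD"
key = lambda letter: (letter.upper(), letter.islower())

print(key('A'))                      # ('A', False)
print(key('a'))                      # ('A', True)
print(key('A') < key('a'))           # True, because False sorts before True
print(''.join(sorted(ms, key=key)))  # AaBbCcDd
```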
Try this:
>>>s='abcdABCD'
>>>''.join(sorted(s,key=lambda x:x.lower()))
'aAbBcCdD'

why it show me this in result (list index out of range)?

Write a function called stop_at_z that iterates through a list of strings. Using a while loop, append each string to a new list until the string that appears is “z”. The function should return the new list.
def stop_at_z(str):
    d = 0
    x = []
    str1 = list(str)
    while True:
        if str1[d] != 'Z':
            x.append(str1[d])
            d += 1
        if str1[d] == 'Z':
            break
    return x
You're getting this error because d keeps increasing infinitely if there is no uppercase 'Z' in the string. Instead, you should only stay in the while loop while the full length of the input string has not been reached:
def stop_at_z(inputstr):
    d = 0
    x = []
    str1 = list(inputstr)
    while d < len(inputstr):
        if str1[d] == 'z':
            break
        else:
            x.append(str1[d])
        d += 1
    return x
Note that you can achieve the same thing using takewhile() from the itertools module:
from itertools import takewhile
def stop_at_z(inputstr):
    return list(takewhile(lambda i: i != 'z', inputstr))
print(stop_at_z("hello wzrld"))
Output:
['h', 'e', 'l', 'l', 'o', ' ', 'w']
The way you are doing it, the search for "z" is case-sensitive. Try something like:
if str1[d].strip().lower() == "z":
It strips off leading and trailing whitespace and then converts the str1 element to lower case (both of these simply return the modified string, so the original is unchanged) and compares it to a lower-case z.
What if the string 'z' is never in the list?
Then it keeps on increasing the index and eventually runs into an error.
Just restricting the loop to the length of the list should help.
def stop_at_z(str):
    d = 0
    x = []
    str1 = list(str)
    for d in range(0, len(str1)):
        print(d)
        if str1[d] != 'Z':
            x.append(str1[d])
        else:
            break
    return x
Basically, we needed to have a list that could have all the characters until we get "z". One way we could do that is we first convert the string into a list and iterate that list and add every character to a new list ls until we get "z". But the problem is we may get a string that doesn't have "z" so we need to iterate till the length of that list. I hope it is clear.
def stop_at_z(s):
    ls = []
    idx = 0
    x = list(s)
    while idx < len(x):
        if x[idx] == "z":
            break
        ls.append(x[idx])
        idx += 1
    return ls
It's my first time posting here, but I use this while loop:
def stop_at_z(input_list):
    print(input_list)
    output_list = []
    index = 0
    while index < len(input_list):
        if input_list[index] != "z":
            output_list.append(input_list[index])
            index += 1
        else:
            break
    return output_list
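The answers above share one idea: keep the index inside the bounds of the input and stop at the first "z". A condensed sketch that folds both conditions into the while test, with a few inputs including one without any "z":

```python
def stop_at_z(items):
    """Collect items until the first "z"; stop at the end if there is none."""
    result = []
    index = 0
    # The bounds check comes first, so items[index] is never out of range.
    while index < len(items) and items[index] != "z":
        result.append(items[index])
        index += 1
    return result


print(stop_at_z(["a", "b", "z", "c"]))  # ['a', 'b']
print(stop_at_z(["a", "b", "c"]))       # ['a', 'b', 'c']
print(stop_at_z([]))                    # []
```

Because and short-circuits, the element lookup is skipped as soon as index reaches len(items), which is what prevents the IndexError from the question.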

Python3 TypeError: list indices must be integers or slices, not str

i have the task to get the String 'AAAABBBCCDAABBB' into a list like this: ['A','B','C','D','A','B']
I am working on this for 2 hours now, and i can't get the solution. This is my code so far:
list = []
string = 'AAAABBBCCDAABBB'
i = 1
for i in string:
    list.append(i)
print(list)
for element in list:
    if list[element] == list[element-1]:
        list.remove(list[element])
print(list)
I am a newbie to programming, and the error "TypeError: list indices must be integers or slices, not str" always shows up...
I already changed the comparison
if list[element] == list[element-1]
to
if list[element] is list[element-1]
But the error stays the same. I already googled a few times, but there were always lists which didn't need the string-format, but i need it (am i right?).
Thank you for helping!
NoAbL
First of all, don't name your variables after built-in Python names or data structures like list or tuple, or after a module you import; this also applies to files. For example, naming your file socket.py and importing the socket module is definitely going to lead to an error (I'll leave you to try that out by yourself).
In your code element is a string. Indexes of a list must be integers, not strings, so you can tell Python
give me the item at position 2.
But right now you're trying to say give me the item at position A, and that's not even valid in English, let alone a programming language.
You should use the enumerate function if you want to get the indexes of an iterable as you loop through it, or you could just do
for i in range(len(list))
and loop through the range of the length of the list; you don't really need the elements anyway.
Here is a simpler approach to what you want to do
s = string = 'AAAABBBCCDAABBB'
ls = []
for i in s:
    if ls:
        if i != ls[-1]:
            ls.append(i)
    else:
        ls.append(i)
print(ls)
It is a different approach, but your problem can be solved using itertools.groupby as follows:
from itertools import groupby
string = 'AAAABBBCCDAABBB'
answer = [group[0] for group in groupby(string)]
print(answer)
Output
['A', 'B', 'C', 'D', 'A', 'B']
According to the documentation, groupby:
Make an iterator that returns consecutive keys and groups from the iterable
In my example we use a list comprehension to iterate over the consecutive keys and groups, and use the index 0 to extract just the key.
You can try the following code:
list = []
string = 'AAAABBBCCDAABBB'
# remove the duplicate character before append to list
prev = ''
for char in string:
    if char == prev:
        pass
    else:
        list.append(char)
    prev = char
print(list)
Output:
['A', 'B', 'C', 'D', 'A', 'B']
In your loop, element is the string. You want to have the index.
Try for i, element in enumerate(list).
EDIT: i will now be the index of the element you're currently iterating through.
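A small sketch of the enumerate suggestion applied to the original task. It builds a new list instead of removing items from the list being looped over, which is an assumption about the cleanest way to apply the hint; the names chars and result are made up for the example:

```python
string = 'AAAABBBCCDAABBB'
chars = list(string)
result = []
for i, element in enumerate(chars):
    # i is an integer index, element is the character at that index.
    if i == 0 or element != chars[i - 1]:
        result.append(element)
print(result)  # ['A', 'B', 'C', 'D', 'A', 'B']
```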

Is it possible to convert a String into a List data type?

Okay so, my code:
def isPalindrome():
    string = requestString("give me a Palendrom!, add spaces between each letter")
    list = string.split()
    print list
    reverseList = list.reverse()
    print reverseList
this is unfinished, but the idea is to detect Palindromes, the user is to input a word and, what I want to be able to do is say.
if list == reverseList:
    print "yes"
else:
    print "no!"
But unfortunately the return from what I have is:
======= Loading Progam =======
>>> isPalindrome()
['r', 'a', 'd', 'a', 'r']
None
>>>
My class mates are taking a different approach to this question, but I have a reputation for 'unique' code, so I was hoping this would work.
My question is
1 is this even possible?
2 is there a better approach to this problem?
Side note, I am very new to this, I am using JES, Jython and this is my first question on stackoverflow, be kind :D
Edit:
def isPalindrome2():
    string = requestString("give me a Palindrome, make sure the letters are spaced")
    print string
    reversedString = string[::-1]
    print reversedString
    if string == reversedString:
        print ("this is a Palindrome")
    else:
        print ("this is not a Palindrome")
OutPut:
>>> isPalindrome2()
r a d a r
r a d a r
this is a Palindrome
string[::-1]
It should return the string reversed.
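The None in the first attempt comes from list.reverse(), which reverses the list in place and returns nothing. A short sketch in plain Python (no JES requestString, and is_palindrome is a name chosen for the example) that shows this and a slice-based check that ignores spaces and case:

```python
def is_palindrome(text):
    """Compare the text with its reverse, ignoring spaces and case."""
    letters = text.replace(" ", "").lower()
    return letters == letters[::-1]


print(is_palindrome("r a d a r"))  # True
print(is_palindrome("Radar"))      # True
print(is_palindrome("python"))     # False

letters = list("radar")
print(letters.reverse())           # None: reverse() works in place
print(letters)                     # ['r', 'a', 'd', 'a', 'r']
```

To get a reversed copy of a list, slicing with [::-1] or list(reversed(letters)) returns a new object instead of None.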
