Pattern matching in Python? - string

I have a list of string representations of binary numbers. X characters are wildcards that can be 0 or 1.
I have a string '10101011' to search for in the below list.
line = '10101011'
my_list = ['1000101X','1000101X','11XXXXXX','111010XX','101XXXXX','100100XX','1000001X','1010110X']
The search string line has to match with '101XXXXX', which is in the list.
I tried:
if any(line in s for s in my_list):
But this cannot work, because the in operator only tests for a literal substring and knows nothing about the X wildcards.
Is there any way? How can I do it?

Find the index of the first X and compare the strings up to that point. This works when all the X's are at the end of the string:
def match(line, s):
    i = s.find('X')  # -1 when s contains no wildcard
    if i == -1:
        return s == line
    # compare only the part before the first wildcard
    return len(s) == len(line) and s[:i] == line[:i]
line = '10101011'
my_list = ['1000101X','1000101X','11XXXXXX','111010XX','101XXXXX','100100XX','1000001X','1010110X']
if any(match(line,s) for s in my_list):
    print('match found')  # do something
If the X's are not all at the end of the string, generate every possible binary string of the same length, then keep only those that agree with the pattern on the non-wildcard digits:
import itertools
def expand_list(l):
    new_list = []
    for s in l:
        # every binary string with the same length as the pattern
        perms = ["".join(seq) for seq in itertools.product("01", repeat=len(s))]
        for i in range(len(s)):
            if s[i] != 'X':
                # drop candidates that disagree with a fixed digit
                perms = [p for p in perms if p[i] == s[i]]
        new_list.extend(perms)
    return new_list
line = '10101011'
my_list = ['1000101X','1000101X','11XXXXXX','111010XX','101XXXXX','100100XX','1000001X','1010110X']
if line in expand_list(my_list):
    print('match found')  # do something
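A lighter alternative to expanding every pattern, offered only as a sketch: compare the two strings position by position and let X stand for either digit. The names wildcard_match and first_match are made up for this example, and it assumes line contains only 0 and 1.

```python
def wildcard_match(line, pattern):
    """Return True if line fits pattern, where 'X' in pattern is a wildcard."""
    if len(line) != len(pattern):
        return False
    # Each position must either be a wildcard or hold the same character.
    return all(p == 'X' or p == c for c, p in zip(line, pattern))


def first_match(line, patterns):
    """Return the first pattern that line fits, or None if there is none."""
    return next((p for p in patterns if wildcard_match(line, p)), None)


line = '10101011'
my_list = ['1000101X', '1000101X', '11XXXXXX', '111010XX',
           '101XXXXX', '100100XX', '1000001X', '1010110X']
print(first_match(line, my_list))  # 101XXXXX
```

This avoids building the 2**n candidate strings per pattern and works wherever the X's are placed.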

Related

List comprehension works but normal for loop doesn't - Python

I defined a function that replaces an element of a list with the number 0 if the element is not a number. It works using a list comprehension, but it doesn't work when I use a normal for loop.
I'm trying to understand what the error in the for loop is.
See the code below:
def zerozero(mylist):
    mylist = [0 if type(x) == str else x for x in mylist]
    return mylist
def zerozero2(mylist):
    for x in mylist:
        if type(x) == str:
            x = 0
        else:
            x = x
    return mylist
Your second function is not quite equivalent. You would need something like this:
def zerozero2(mylist):
    new_list = []
    for x in mylist:
        if type(x) == str:
            new_list.append(0)
        else:
            new_list.append(x)
    return new_list
In this manner you can mimic the functionality of the list comprehension, creating a new list and appending items to it as you iterate through.
If you want to modify your list 'in place', you can use this sort of construction:
for idx, x in enumerate(mylist):
    if type(x) == str:
        mylist[idx] = 0
    else:
        mylist[idx] = x
However, practically speaking this is unlikely to have much impact on your code efficiency. A plain list comprehension always builds a new list rather than modifying the existing one, but in either case you can just assign the result back to your original variable when you return from the function:
mylist = zerozeroX(mylist)
So what happens is that your function returns the same list it was given. Assigning to x inside the loop only rebinds the loop variable; it never changes the list itself.
What you should do is create an empty list first, for example my_list_0 = [], and append to it:
def zerozero2(mylist):
    my_list_0 = []
    for x in mylist:
        if type(x) == str:
            x=0
        else:
            x=x
        my_list_0.append(x)
    return my_list_0
The list comprehension builds a brand-new list and then binds the name mylist to it, which is why it behaves differently.
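To make the difference concrete, here is a small sketch contrasting rebinding the loop variable with assigning through an index. The function name zero_strings_in_place and the sample data are invented for illustration, and isinstance is used in place of the type(x) == str check from the question.

```python
def zero_strings_in_place(items):
    """Replace every str element of items with 0, modifying the list itself."""
    for idx, value in enumerate(items):
        if isinstance(value, str):
            items[idx] = 0  # assigning through the index changes the list
    return items


data = [1, 'a', 2.5, 'b']
for x in data:
    if isinstance(x, str):
        x = 0  # only rebinds the local name x; data is untouched
print(data)  # [1, 'a', 2.5, 'b']

zero_strings_in_place(data)
print(data)  # [1, 0, 2.5, 0]
```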

Capitalize a character Before and After Nth character in a string in a Python list

Here is my code. I am trying to uppercase the letters immediately before and after each "z" in the name of a capital city. All other letters should be lowercase. All cities that contain that letter should be stored in a list and returned. If I could get some input that would be great. If I need to change the code completely, please let me know of other ways. I am new to this, so any input would be appreciated. Thanks
lst = ['brazzaville', 'zagreb', 'vaduz']
lst2 = []
for wrd in lst:
    newwrd = ''
    for ltr in wrd:
        if ltr in 'ua':
            newwrd += ltr.capitalize()
        else:
            newwrd += ltr
    lst2.append(newwrd)
print(lst2)
I keep getting this:
['brAzzAville', 'zAgreb', 'vAdUz']
But I need this:
['brAzzAville', 'zAgreb', 'vadUz']
The following strategy consists of iterating through the word and replacing the letters at index-1 and index+1 of each z (if they exist) with upper case letters. The characters are edited by position in a list, so other occurrences of the same letter elsewhere in the word are left alone:
lst2 = []
for wrd in lst:
    wrd = wrd.lower()
    chars = list(wrd)
    for idx, letter in enumerate(wrd):
        if letter == 'z':
            # uppercase the neighbours of this z, skipping neighbouring z's
            if idx - 1 >= 0 and wrd[idx - 1] != 'z':
                chars[idx - 1] = wrd[idx - 1].upper()
            if idx + 1 < len(wrd) and wrd[idx + 1] != 'z':
                chars[idx + 1] = wrd[idx + 1].upper()
    if "z" in wrd:
        lst2.append(''.join(chars))
print(lst2)
#['brAzzAville', 'zAgreb', 'vadUz']
I think this code gives the correct answer; please verify:
def findOccurrences(s, ch):
    return [i for i, letter in enumerate(s) if letter == ch]
lst = ['brazzaville', 'zagreb', 'vaduz']
lst2 = []
result = []
for wrd in lst:
    newwrd = ''
    # positions of every 'z' in the word
    result = findOccurrences(wrd, 'z')
    for i in range(len(wrd)):
        # uppercase a letter when one of its neighbours is a 'z'
        if (i + 1 in result or i - 1 in result) and wrd[i] != 'z':
            newwrd += wrd[i].capitalize()
        else:
            newwrd += wrd[i]
    lst2.append(newwrd)
print(lst2)
Capitalize the Nth character in a string:
res = lambda test_str,N: test_str[:N] + test_str[N].upper() + test_str[N + 1:] if test_str else ''
Pseudocode
Loop through the list and keep only the strings that contain 'z':
[check(i) for i in lst if 'z' in i]
For each item in the list:
Find the index of the character preceding the first occurrence of 'z', without wrapping around to the end of the string, and capitalize it:
preind = i.index('z')-1 if i.index('z')-1>=0 else None
stri = res(i,preind) if preind is not None else i
Find the index of the character succeeding the last occurrence of 'z', without wrapping around to the start of the string, and capitalize it:
postind = i.rfind('z')+1 if i.rfind('z')+1<len(i) else None
k = res(stri,postind) if postind is not None else stri
Note that this only looks at the letter before the first 'z' and the letter after the last 'z', which is enough for the three example cities.
Code
lst = ['brazzaville', 'zagreb', 'vaduz']
def check(i):
    i = i.lower()
    # lambda expression to capitalise the Nth character in a string
    res = lambda test_str,N: test_str[:N] + test_str[N].upper() + test_str[N + 1:] if test_str else ''
    # index of the character preceding the first 'z' (None if 'z' comes first)
    preind = i.index('z')-1 if i.index('z')-1>=0 else None
    # index of the character succeeding the last 'z' (None if 'z' comes last)
    postind = i.rfind('z')+1 if i.rfind('z')+1<len(i) else None
    # capitalise the preceding character; index 0 is valid, so test against None
    stri = res(i,preind) if preind is not None else i
    # capitalise the succeeding character
    k = res(stri,postind) if postind is not None else stri
    # return the processed string
    return k
print([check(i) for i in lst if 'z' in i ])
#output
#['brAzzAville', 'zAgreb', 'vadUz']
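For comparison, the same rule (uppercase every non-z letter that touches a z) can be written as a single regular expression. This is a sketch that is not taken from the answers above; capitalize_around_z and NEIGHBOUR_OF_Z are hypothetical names, and 'paris' is added to the sample list only to show the filtering.

```python
import re

# A non-z character that directly follows a z, or directly precedes one.
NEIGHBOUR_OF_Z = re.compile(r'(?<=z)[^z]|[^z](?=z)')


def capitalize_around_z(word):
    """Lowercase word, then uppercase each letter adjacent to a 'z'."""
    return NEIGHBOUR_OF_Z.sub(lambda m: m.group().upper(), word.lower())


lst = ['brazzaville', 'zagreb', 'vaduz', 'paris']
# Keep only the cities that contain a 'z', as the question asks.
print([capitalize_around_z(w) for w in lst if 'z' in w.lower()])
# ['brAzzAville', 'zAgreb', 'vadUz']
```

Unlike the first-z/last-z approach, this treats every z in the word, including ones in the middle.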

How to convert this nested string list into the actual list

How to convert string into new_list without using eval(), exec() or any other libraries?
string = '[1,2,3,[4,5,6],7,[8,[9,10]],11]'
new_list = [1,2,3,[4,5,6],7,[8,[9,10]],11]
The biggest problem you're facing with this question is that you cannot say how deep the list will nest, therefore the best option is to make a recursive function that evaluates the string and calls itself whenever it encounters a new list.
Another issue you will run into is that a number can be multiple digits long, and assuming at most two digits would not be a clean solution. A while loop that collects all consecutive digits is probably the best way to handle that. Here is my implementation:
def get_number(s, idx):
    """Find all consecutive digits and return them and the updated index"""
    number = ""
    while idx < len(s) and s[idx].isdigit():
        number += s[idx]
        idx += 1
    return int(number), idx
def recursive_list(s, idx):
    """Transform a string to list, recursively calling this function whenever a nested list is encountered."""
    lst = []
    # Loop over the entire string
    while idx < len(s):
        char = s[idx]
        # When encountering a digit, get the entire number
        if char.isdigit():
            # get_number leaves idx on the first character after the number,
            # so it must not be advanced again (that would skip a ']')
            number, idx = get_number(s, idx)
            lst.append(number)
        # When encountering a closing bracket, return the (sub-)list
        elif char == ']':
            return lst, idx + 1
        # When encountering an opening bracket, append the nested list
        elif char == '[':
            nested, idx = recursive_list(s, idx + 1)
            lst.append(nested)
        else:
            # Go to the next character if we encounter anything else
            idx += 1
    return lst, idx
def main():
    """Transform a string to a list."""
    string = "[1,2,3,[4,5,6],7,[8,[9,10]],11]"
    # start at index 1 to skip the outermost opening bracket
    new_list, _ = recursive_list(string, 1)
    print(new_list)
if __name__ == "__main__":
    main()
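The same idea can also be written without recursion by keeping a stack of the lists that are currently open. The following is only a sketch under the assumption that the input is well formed and contains nothing but non-negative integers, commas, brackets and spaces; parse_nested is a name invented here.

```python
def parse_nested(text):
    """Parse a string such as '[1,[2,3]]' into nested Python lists."""
    stack = [[]]   # stack[0] is a dummy root that will hold the result
    number = ''    # digits of the number currently being read
    for ch in text:
        if ch.isdigit():
            number += ch
            continue
        # Any non-digit ends the current number, if there is one.
        if number:
            stack[-1].append(int(number))
            number = ''
        if ch == '[':
            new_list = []
            stack[-1].append(new_list)  # attach to the enclosing list
            stack.append(new_list)      # and make it the current list
        elif ch == ']':
            stack.pop()                 # back to the enclosing list
    return stack[0][0]


print(parse_nested('[1,2,3,[4,5,6],7,[8,[9,10]],11]'))
# [1, 2, 3, [4, 5, 6], 7, [8, [9, 10]], 11]
```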

Finding partial string matches between list and elements of list of lists

I have a list of strings:
mylist = ['foo hydro', 'bar']
and a list of lists of strings called test:
testI = ['foo', 'bar'] ## should succeed
testJ = ['foo'] ## should fail
testK = ['foo hydro'] ## should fail
testL = ['foo hydro', 'bar'] ## should succeed
testM = ['foo', 'bar', 'third'] ## should fail
test = [testI,testJ,testK,testL,testM]
I need to be able to check if there's a (partial or whole) string match between each element of each list in test and each element of mylist.
So, testI should succeed because testI[0] is a partial string match of mylist[0] and because testI[1] is a complete string match for mylist[1].
However, testJ and testK should each fail because they only match one of the two strings in mylist, and testM should fail because it contains an element which doesn't match with any element in mylist
So far, I've tried to play around with any:
for i in mylist:
    for j in test:
        for k in j:
            if any(i in b for b in k):
                print("An element of mylist matches an element of test")
So I can catch if any element of mylist matches any element in each list in test, but I can't work out a way to meet all the requirements.
Any suggestions? I'm happy to refactor the question if it makes dealing with it easier.
I want to suggest a solution to your problem.
Firstly, we create a function that checks whether a word is a substring of any string in another list and, if so, returns the index of the first such string:
def is_substring_of_element_in_list(word, list_of_str):
    # hits[k] is True when word occurs inside list_of_str[k]
    hits = [word in s for s in list_of_str]
    if True in hits:
        return True, hits.index(True)
    return False, -1
Now, we can use this function to check whether each word from the test list is a substring of a string in your list. Notice that every string in your list may be used only once, so we remove a string as soon as a word has matched it.
def is_list_is_in_mylist(t, mylist):
    mylist_now = sorted(mylist, key=len)
    counter = 0
    # try the shortest words first, against the shortest remaining strings
    for word in sorted(t, key=len):
        is_sub, index = is_substring_of_element_in_list(word, mylist_now)
        if is_sub:
            # each string in mylist may be matched only once
            mylist_now.pop(index)
            counter += 1
    if counter == len(t) and counter == len(mylist):
        print("success")
    else:
        print("fail")
Pay attention: we sort both lists by length to avoid mistakes caused by the order of the words. For example, with my_list = ['foo', 'f'] and a test list ['f', 'foo'], without sorting 'f' would claim 'foo' first, leaving nothing for 'foo' to match, and the test would wrongly fail.
Now, you can iterate over your test with simple for loop:
for t in test:
    is_list_is_in_mylist(t, mylist)
I think this code probably matches your conditions:
for t in test:
    counter = 0
    if len(t) == len(mylist):
        # remove duplicates while keeping the order
        t = list(dict.fromkeys(t))
        temp = []
        for s in t:
            # keep only words that are not contained in another word of t
            if not any([s in r for r in t if s != r]):
                temp.append(s)
        for l in temp:
            for m in mylist:
                if l in m:
                    counter = counter + 1
        if counter == len(mylist):
            print('success')
        else:
            print('fail')
    else:
        print('fail')
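If a function that returns a boolean is more convenient than printing, the two requirements from the question can be stated directly with all and any. This is a hedged sketch rather than a replacement for the answers above: lists_match is a made-up name, and unlike the first answer it does not insist that each string in mylist is used only once.

```python
def lists_match(candidate, reference):
    """True if every candidate string occurs inside some reference string
    and every reference string contains some candidate string."""
    every_candidate_fits = all(
        any(c in r for r in reference) for c in candidate
    )
    every_reference_hit = all(
        any(c in r for c in candidate) for r in reference
    )
    return every_candidate_fits and every_reference_hit


mylist = ['foo hydro', 'bar']
test = [['foo', 'bar'], ['foo'], ['foo hydro'],
        ['foo hydro', 'bar'], ['foo', 'bar', 'third']]
print([lists_match(t, mylist) for t in test])
# [True, False, False, True, False]
```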

Python split string every n character

I need help finding a way to split a string every nth character, but I need it to overlap so as to get all the
An example should be clearer:
I would like to go from "BANANA" to "BA", "AN", "NA", "AN", "NA"
Here's my code so far
import string
import re
def player1(s):
    pos1 = []
    inP1 = "AN"
    p = str(len(inP1))
    n = re.findall()
    for n in range(len(s)):
        if s[n] == inP1:
            pos1.append(n)
    points1 = len(pos1)
    return points1
if __name__ == '__main__':
    = "BANANA"
You can do this pretty simply with a list comprehension:
input_string = "BANANA"
[input_string[i]+input_string[i+1] for i in range(0,len(input_string)-1)]
or, for overlapping substrings of any length n (here n is index_range):
index_range = 3
[''.join([input_string[j] for j in range(i, i+index_range)]) for i in range(0,len(input_string)-index_range+1)]
This iterates over the indices 0 through 4 of "BANANA", that is, every index except the last, and prints each letter followed by the next one.
Stopping one short of the end means there is always a next letter, so no extra bounds check or else branch is needed.
def splitFunc(word):
    for i in range(0, len(word)-1):
        print(word[i] + word[i+1])
splitFunc("BANANA")
Hope this helps
Those are called n-grams.
This should work :)
text = "BANANA"
n = 2
chars = [c for c in text]
ngrams = []
# slide a window of width n across the characters
for i in range(len(chars)-n + 1):
    ngram = "".join(chars[i:i+n])
    ngrams.append(ngram)
print(ngrams)
output: ['BA', 'AN', 'NA', 'AN', 'NA']
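Since strings support slicing directly, the loop can be folded into a small reusable function. This is a minimal sketch; the function name ngrams, the ValueError for n below 1, and the count of "AN" (which appears to be what the question's player1 was after) are assumptions added for illustration.

```python
def ngrams(text, n):
    """Return all overlapping substrings of length n, left to right."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The last window starts at index len(text) - n.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(ngrams("BANANA", 2))  # ['BA', 'AN', 'NA', 'AN', 'NA']
print(ngrams("BANANA", 3))  # ['BAN', 'ANA', 'NAN', 'ANA']

# Counting how often a particular substring occurs, overlaps included.
print(ngrams("BANANA", 2).count("AN"))  # 2
```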

Resources