How to use re.compile within a for loop to extract substring indices

I have a list of data from which I need to extract the indices of some strings within that list:
str=['cat','monkey']
list=['a cat','a dog','a cow','a lot of monkeys']
I've been using re.compile to match (even partial match) individual elements of the str list to the list:
regex=re.compile(".*(monkey).*")
b=[m.group(0) for l in list for m in [regex.search(l)] if m]
>>> list.index(b[0])
3
However, when I try to iterate over the str list to find the indices of those elements, I obtain empty lists:
>>> for i in str:
...     regex=re.compile(".*(i).*")
...     b=[m.group(0) for l in list for m in [regex.search(l)] if m]
...     print(b)
...
[]
[]
I imagine that the problem is with regex=re.compile(".*(i).*"), but I don't know how to pass the ith element as a string.
Any suggestion is very welcome, thanks!!

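The pattern ".*(i).*" contains the literal letter i, not the value of the loop variable, so nothing matches. You need string formatting to build the pattern from each element:
for i in str:
    match_pattern = ".*({}).*".format(i)
    regex = re.compile(match_pattern)
    b = [m.group(0) for l in list for m in [regex.search(l)] if m]
    print(b)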
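Since the question asks for the indices of the matching elements, a minimal sketch along the same lines could collect positions directly with enumerate. It also passes each search string through re.escape, so that any regex metacharacters in it are treated as literal text. The names find_indices, words, and items are illustrative only, chosen to avoid shadowing the built-ins str and list.

```python
import re

def find_indices(words, items):
    """Map each word to the indices of the items that contain it."""
    result = {}
    for word in words:
        # re.escape makes the word match as literal text, not as a pattern
        regex = re.compile(re.escape(word))
        result[word] = [i for i, item in enumerate(items) if regex.search(item)]
    return result

words = ['cat', 'monkey']
items = ['a cat', 'a dog', 'a cow', 'a lot of monkeys']
print(find_indices(words, items))  # {'cat': [0], 'monkey': [3]}
```

Because search already looks for the word anywhere in the string, the surrounding ".*" is not needed when only the position of the element matters.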

Related

How to convert a tuple list string value to integer

I have the following list:
l = [('15234', '8604'), ('15238', '8606'), ('15241', '8606'), ('15243', '8607')]
I would like to convert it so that the tuple values are integers and not strings. How do I do that?
Desired output:
[(15234, 8604), (15238, 8606), (15241, 8606), (15243, 8607)]
What I tried so far?
l = [('15234', '8604'), ('15238', '8606'), ('15241', '8606'), ('15243', '8607')]
new_list = []
for i in l:
    new_list.append((int(i[0]), i[1]))
print(tuple(new_list))
This only converts the first element i.e. 15234, 15238, 15241, 15243 into int. I would like to convert all the values to int. How do I do that?
The easiest and most concise way is via a list comprehension:
>>> [tuple(map(int, item)) for item in l]
[(15234, 8604), (15238, 8606), (15241, 8606), (15243, 8607)]
This takes each tuple in l and maps the int function to each member of the tuple, then creates a new tuple out of them, and puts them all in a new list.
You can change the second numbers into integers the same way you did the first. Try this:
new_list.append((int(i[0]), int(i[1])))

How to subtract adjacent items in list with unknown length (python)?

I am given a list of lists, for example myList = [[70,83,90],[19,25,30]], and need to return a list of lists containing the absolute differences between adjacent elements. For this example the result would be [[13,7],[6,5]], i.e. the absolute values of (70-83), (83-90), (19-25), and (25-30). I'm not sure how to iterate through each list to subtract adjacent elements without already knowing its length. So far I have just separated the list of lists into two separate lists.
list_one = myList[0]
list_two = myList[1]
Please let me know what you would recommend, thank you!
A custom generator can return two adjacent items at a time from a sequence without knowing the length:
def two(sequence):
    i = iter(sequence)
    a = next(i)
    for b in i:
        yield a,b
        a = b
original = [[70,83,90],[19,25,30]]
result = [[abs(a-b) for a,b in two(sequence)]
          for sequence in original]
print(result)
[[13, 7], [6, 5]]
Well, for each list, you can simply get its number of elements like this:
res = []
for my_list in list_of_lists:
    res.append([])
    for i in range(len(my_list) - 1):
        # Do some stuff
You can then add the results you want to res[-1].
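As a rough sketch of how the index-based loop might be completed, the placeholder can be replaced by appending the absolute difference of each neighbouring pair to res[-1]. The function names here are made up for illustration. The second function is an alternative, not taken from either answer, that pairs each list with itself shifted by one using zip.

```python
def adjacent_differences(list_of_lists):
    """Absolute differences between neighbouring items of each inner list."""
    res = []
    for my_list in list_of_lists:
        res.append([])
        # len(my_list) - 1 pairs exist; an empty or one-item list yields none
        for i in range(len(my_list) - 1):
            res[-1].append(abs(my_list[i] - my_list[i + 1]))
    return res

def adjacent_differences_zip(list_of_lists):
    """Same result, pairing each item with its successor via zip."""
    return [[abs(a - b) for a, b in zip(seq, seq[1:])] for seq in list_of_lists]

my_list = [[70, 83, 90], [19, 25, 30]]
print(adjacent_differences(my_list))      # [[13, 7], [6, 5]]
print(adjacent_differences_zip(my_list))  # [[13, 7], [6, 5]]
```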

Matching character lists of unequal length

I want to match two lists of unequal length. If an element of one list matches an element of the other, the matching element should be placed in a new list at the same index rather than at a different one. The code below shows what I mean:
list1=['AF','KN','JN','NJ']
list2=['KNJ','NJK','JNJ','INS','AFG']
matchlist = []
smaller_list_len = min(len(list1),len(list2))
for ind in range(smaller_list_len):
    elem2 = list1[ind]
    elem1 = list2[ind][0:2]
    if elem1 in list2:
        matchlist.append(list1[ind])
Obtained output
>>> matchlist
['KNJ', 'NJK', 'JNJ']
Desired Output
>>> matchlist
['AFG', 'KNJ', 'JNJ', 'NJK']
Is there a way to get the desired output?
Use a nested loop iterating over the 3-char list. When an item in that list contains the current item in the 2-char list, append it and break out of the inner loop:
list1=['AF','KN','JN','NJ']
list2=['KNJ','NJK','JNJ','INS','AFG']
matchlist = []
smaller_list_len = min(len(list1),len(list2))
for ind in range(smaller_list_len):
    for item in list2:
        if list1[ind] in item:
            matchlist.append(item)
            break
Given the question doesn't specify any constraints, in a more pythonic way, using a list comprehension:
list1=['AF','KN','JN','NJ']
list2=['KNJ','NJK','JNJ','INS','AFG']
matchlist=[e2 for e1 in list1 for e2 in list2 if e2.startswith(e1)]
produces
['AFG', 'KNJ', 'JNJ', 'NJK']
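The two answers above test different things: the in operator finds the 2-character string anywhere inside an item, while startswith only accepts it at the beginning. With the sample data this matters for 'NJ', which also occurs inside 'KNJ'. The following sketch puts both approaches side by side. The function names are illustrative and not part of either answer.

```python
def first_containing(prefixes, candidates):
    """For each prefix, take the first candidate that contains it anywhere."""
    out = []
    for p in prefixes:
        for c in candidates:
            if p in c:
                out.append(c)
                break
    return out

def all_starting_with(prefixes, candidates):
    """For each prefix, take every candidate that begins with it."""
    return [c for p in prefixes for c in candidates if c.startswith(p)]

list1 = ['AF', 'KN', 'JN', 'NJ']
list2 = ['KNJ', 'NJK', 'JNJ', 'INS', 'AFG']
print(first_containing(list1, list2))   # ['AFG', 'KNJ', 'JNJ', 'KNJ']
print(all_starting_with(list1, list2))  # ['AFG', 'KNJ', 'JNJ', 'NJK']
```

Only the startswith version reproduces the desired output for this data. Note that it appends every candidate with a given prefix, so the result can be longer than list1 if several items share one.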

How to remove tuple from zip?

I have a bunch of numbers that I've zipped into tuples, but I am having difficulty removing an item from the zipped list.
So far I've tried .remove on the list, but that gave me an error.
Is there an easy way of doing this?
This is my current code:
Example data:
QueenRowColumn: 3,3
TheComparisonQueen: 7,3
def CheckQueenPathDown(self, QueenRowColumn, TheComparisonQueen):
    row = []
    column = []
    CurrentLocation = QueenRowColumn
    #MoveLocation = TheComparisonQueen
    a = QueenRowColumn[0]
    b = QueenRowColumn[1]
    for i in range (-7,0):
        row.append(CurrentLocation[1] - i)
        column.append(a)
    Down = zip(row,column)
    #Down.remove(TheComparisonQueen)
    return Down
If I wanted to remove, for example, TheComparisonQueen from the list of tuples, how would I do it?
If you are just looking to drop TheComparisonQueen from the iterator of tuples, you can keep only the values that are not equal to TheComparisonQueen, using a list comprehension or a generator expression.
# List Comprehension
Down = [(i,j) for i,j in zip(row,column) if (i,j) != TheComparisonQueen]
# Generator Expression
Down = ((i,j) for i,j in zip(row,column) if (i,j) != TheComparisonQueen)
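In Python 3, zip returns an iterator rather than a list, and iterators have no .remove method, which is a likely cause of the error mentioned in the question (the actual error message is not shown, so this is an assumption). A minimal sketch with made-up coordinates that converts the result to a list before removing:

```python
rows = [4, 5, 6, 7]
columns = [3, 3, 3, 3]
comparison_queen = (7, 3)  # must be a tuple: (7, 3) != [7, 3]

# Materialise the zip iterator so that list methods are available
down = list(zip(rows, columns))

# list.remove raises ValueError if the item is absent, so check first
if comparison_queen in down:
    down.remove(comparison_queen)

print(down)  # [(4, 3), (5, 3), (6, 3)]
```

Unlike the comprehension above, remove only deletes the first occurrence of the tuple.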

iteration and matching items in lists

I am trying to check if elements of a list match elements of another. But there is a slight twist to the problem.
alist = ['949', '714']
blist = ['(714)824-1234', '(419)312-8732', '(949)555-1234', '(661)949-2867']
I am trying to match the elements of alist to the blist, but only the area code part (in blist). Here is my current code:
def match_area_codes(alist, blist):
    clist =[]
    for i in alist:
        for j in blist:
            if i in j:
                clist.append(j)
    return clist
The code works for the most part, except when there is a string matching the area code anywhere else in the list. It should only print:
['(714)824-1234', '(949)555-1234']
but it ends up printing
['(714)824-1234', '(949)555-1234', '(661)949-2867']
as there is a '949' in the last phone number. Is there a way to fix this?
You can use a regular expression to get the part within (...) and compare that part to alist.
import re
def match_area_codes(alist, blist):
    p = re.compile(r"\((\d+)\)")
    return [b for b in blist if p.search(b).group(1) in alist]
Example:
>>> alist = set(['949', '714'])
>>> blist = ['(714)824-1234', '(419)312-8732', '(949)555-1234', '(661)949-2867']
>>> match_area_codes(alist, blist)
['(714)824-1234', '(949)555-1234']
If you really really want to do it without regular expressions, you could, e.g., find the position of the ( and ) and thus get the slice from the string corresponding to the region code.
def match_area_codes(alist, blist):
    find_code = lambda s: s[s.index("(") + 1 : s.index(")")]
    return [b for b in blist if find_code(b) in alist]
However, I would strongly suggest to just take this as an opportunity for getting started with regular expressions. It's not all that hard, and definitely worth it!
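Both versions above assume every entry in blist contains a parenthesised area code: p.search(b) returns None otherwise, and s.index raises ValueError. A slightly more defensive sketch, assuming the area code always appears at the very start of the number, uses match and skips entries that do not fit. The constant name AREA_CODE is illustrative.

```python
import re

# Parenthesised digits at the start of the string, e.g. "(714)"
AREA_CODE = re.compile(r"\((\d+)\)")

def match_area_codes(alist, blist):
    codes = set(alist)  # set membership tests are fast
    result = []
    for number in blist:
        m = AREA_CODE.match(number)  # match() only matches at the beginning
        if m and m.group(1) in codes:
            result.append(number)
    return result

alist = ['949', '714']
blist = ['(714)824-1234', '(419)312-8732', '(949)555-1234', '(661)949-2867', '555-0000']
print(match_area_codes(alist, blist))  # ['(714)824-1234', '(949)555-1234']
```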
