A more efficient way for nested loops - python-3.x

I currently have nested loops in Python that iterate over lists, where each inner list depends on the value selected in the enclosing loop. Consider this snippet:
my_combinations = []
list1 = ['foo', 'bar', 'baz']
for l1 in list1:
    list2 = my_func1(l1)  # Some user-defined function which queries some dataset
    for l2 in list2:
        list3 = my_func2(l1, l2)  # Some other user-defined function which queries some dataset
        for l3 in list3:
            my_combinations.append((l1, l2, l3))
Is there a more efficient way to collect all permissible combinations (as defined by my_func1 and my_func2) in my_combinations? list1, list2 and list3 each run to 4-5 digits of elements, and the current approach is clearly too slow.
As a thought process, if I had list1, list2 and list3 pre-defined before entering the outermost loop, itertools.product might have given me the required combinations efficiently. However, I don't think I can use it in this situation.
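One possible restructuring, offered only as a sketch: because each inner list depends on the outer value, itertools.product does not apply directly, but a generator keeps the same three-level structure without building intermediate lists, and functools.lru_cache can skip repeated queries. The bodies of my_func1 and my_func2 below are hypothetical stand-ins for the real dataset queries, and caching only helps if those functions are pure and are called again with the same hashable arguments. The total work is still proportional to the number of combinations produced.

```python
from functools import lru_cache

# Hypothetical stand-ins for the real dataset queries; replace with your own.
@lru_cache(maxsize=None)
def my_func1(l1):
    return tuple(f"{l1}-{i}" for i in range(2))

@lru_cache(maxsize=None)
def my_func2(l1, l2):
    return tuple(f"{l2}-{j}" for j in range(2))

def iter_combinations(list1):
    """Lazily yield every permissible (l1, l2, l3) triple."""
    for l1 in list1:
        for l2 in my_func1(l1):
            for l3 in my_func2(l1, l2):
                yield (l1, l2, l3)

list1 = ['foo', 'bar', 'baz']
my_combinations = list(iter_combinations(list1))
print(len(my_combinations))  # 12
```

If the combinations are only consumed once, iterating over iter_combinations(list1) directly avoids holding the full list in memory.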

Related

Time complexity of a function in Python

I have two functions that perform the same task: checking whether two lists have any element in common. I want to analyze their time complexity.
What I know is that a for loop iterated n times gives O(n) complexity. But I am confused about the case where the 'in' operator is used, e.g. if element in mylist.
Please look at the functions to better understand the scenario:
list1 = ['a', 'b', 'c', 'd', 'e']
list2 = ['m', 'n', 'o', 'd']
def func1(list1, list2):
    for i in list1:  # O(n), assuming number of items in list1 is n
        if i in list2:  # What will be the BigO of this statement??
            return True
    return False
z = func1(list1, list2)
print(z)
I have another function func2, please help determine its BigO as well:
def func2(list1, list2):
    dict = {}
    for i in list1:
        if i not in dict.keys():
            dict[i] = True
    for j in list2:
        if j in dict.keys():
            return True
    return False
z = func2(list1, list2)
print(z)
What is the time complexity of func1 and func2? Is there any difference in performance between the two functions?
Regarding func1:
Searching a list is a linear operation with respect to the number of elements.
Assuming the items are randomly ordered and the order of checking is unrelated to it, you statistically find an existing element after about n/2 steps, and need n steps when it is not found (both simplify to O(n)).
if x in list_ is a linear search as described above, so func1 has a worst-case complexity of O(n*m) for lists of lengths n and m, i.e. O(n^2) when the lists are about the same size.
Regarding func2:
Instead of a dictionary, consider using a set. Checking whether an element exists in a set is O(1) on average, which improves the overall complexity over func1 to O(n + m). You can also build the set with set(list1) rather than looping over the list in Python code; the loop is slower, but only by a constant factor, so it does not affect the O complexity.
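The set-based suggestion can be sketched as follows. The function name have_common_element is made up for illustration, and the sketch assumes the list elements are hashable.

```python
def have_common_element(list1, list2):
    """Return True if the two lists share at least one element."""
    # Building the set is O(n); isdisjoint then does an O(1) average lookup
    # for each item of list2 and stops at the first shared element.
    return not set(list1).isdisjoint(list2)

list1 = ['a', 'b', 'c', 'd', 'e']
list2 = ['m', 'n', 'o', 'd']
print(have_common_element(list1, list2))       # True
print(have_common_element(list1, ['x', 'y']))  # False
```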

How to find match between two 2D lists in Python?

Let's say I have two 2D lists like this:
list1 = [ ['A', 5], ['X', 7], ['P', 3]]
list2 = [ ['B', 9], ['C', 5], ['A', 3]]
I want to compare these two lists and find where the 2nd item matches between them. For example, here the numbers 5 and 3 appear in both lists. The first item is not relevant to the comparison.
How do I compare the lists and copy the values that appear in the 2nd column of both? Using 'x in list' does not work since these are 2D lists. Do I need to create copies of the lists containing only the 2nd column?
It is possible that this can be done using list comprehension but I am not sure about it so far.
There might be a duplicate for this but I have not found it yet.
The pursuit of one-liners is a futile exercise. They aren't always more efficient than the regular loopy way, and almost always less readable when you're writing anything more complicated than one or two nested loops. So let's get a multi-line solution first. Once we have a working solution, we can try to convert it to a one-liner.
The solution you shared in the comments works, but it doesn't handle duplicate elements, and it is O(n^2) because it contains a nested loop (see https://wiki.python.org/moin/TimeComplexity):
list_common = [x[1] for x in list1 for y in list2 if x[1] == y[1]]
A few key things to remember:
A single loop O(n) is better than a nested loop O(n^2).
Membership lookup in a set O(1) is much quicker than lookup in a list O(n).
Sets also get rid of duplicates for you.
Python includes set operations like union, intersection, etc.
Let's code something using these points:
# Create a set containing all numbers from list1
set1 = set(x[1] for x in list1)
# Create a set containing all numbers from list2
set2 = set(x[1] for x in list2)
# Intersection contains numbers in both sets
intersection = set1.intersection(set2)
# If you want, convert this to a list
list_common = list(intersection)
Now, to convert this to a one-liner:
list_common = list(set(x[1] for x in list1).intersection(x[1] for x in list2))
We don't need to explicitly convert x[1] for x in list2 to a set, because set.intersection() accepts any iterable, including a generator expression.
This gives you the result in O(n) time on average, and also gets rid of duplicates in the process.
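A set does not keep the values in their original order. If the order in which values appear in list1 matters, a variant along these lines may help; it is a sketch rather than part of the answer above, and it relies on dict.fromkeys keeping insertion order (Python 3.7+). The names values_in_list2 and common_in_order are made up for illustration.

```python
list1 = [['A', 5], ['X', 7], ['P', 3]]
list2 = [['B', 9], ['C', 5], ['A', 3]]

# Set of second-column values from list2, for O(1) average membership tests.
values_in_list2 = {row[1] for row in list2}

# Keep list1's values that also occur in list2; dict.fromkeys drops
# duplicates while preserving first-seen order.
common_in_order = list(
    dict.fromkeys(row[1] for row in list1 if row[1] in values_in_list2)
)
print(common_in_order)  # [5, 3]
```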

Math-like way to define a set in Python: technical name [duplicate]

Can someone explain the last line of this Python code snippet to me?
Cell is just another class. I don't understand how the for loop is being used to store Cell objects into the Column object.
class Column(object):
    def __init__(self, region, srcPos, pos):
        self.region = region
        self.cells = [Cell(self, i) for i in xrange(region.cellsPerCol)]  # Please explain this line.
The line of code you are asking about uses a list comprehension to create a list and assign it to self.cells. It is equivalent to
self.cells = []
for i in xrange(region.cellsPerCol):
    self.cells.append(Cell(self, i))
Explanation:
A few simple examples may help you understand the code you have. If you continue working with Python code, you will come across list comprehensions again, and you may want to use them yourself.
Note that in each example below, both code segments are equivalent: they build the same list and store it in myList.
For instance:
myList = []
for i in range(10):
    myList.append(i)
is equivalent to
myList = [i for i in range(10)]
List comprehensions can be more complex too, so for instance if you had some condition that determined if values should go into a list you could also express this with list comprehension.
This example only collects even numbered values in the list:
myList = []
for i in range(10):
    if i%2 == 0:  # could be written as "if not i%2" more tersely
        myList.append(i)
and the equivalent list comprehension:
myList = [i for i in range(10) if i%2 == 0]
Two final notes:
You can have "nested" list comprehensions, but they quickly become hard to comprehend :)
A list comprehension usually runs somewhat faster than the equivalent for loop with append, which makes it a favorite with regular Python programmers who are concerned about efficiency.
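To illustrate the first note, here is a small sketch of a comprehension with two for clauses next to the loops it replaces. The matrix data is made up for the example.

```python
matrix = [[1, 2, 3], [4, 5, 6]]  # hypothetical sample data

# Loop form: the outer loop comes first, the inner loop second.
flat_loop = []
for row in matrix:
    for value in row:
        flat_loop.append(value)

# Comprehension form: the for clauses appear in the same order as the loops.
flat_comp = [value for row in matrix for value in row]

print(flat_comp)  # [1, 2, 3, 4, 5, 6]
```

Beyond two clauses, the plain loops are usually easier to read.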
Ok, one last example showing that you can also apply functions to the items you are iterating over in the list. This uses float() to convert a list of strings to float values:
data = ['3', '7.4', '8.2']
new_data = [float(n) for n in data]
gives:
>>> new_data
[3.0, 7.4, 8.2]
It is the same as if you did this:
def __init__(self, region, srcPos, pos):
    self.region = region
    self.cells = []
    for i in xrange(region.cellsPerCol):
        self.cells.append(Cell(self, i))
This is called a list comprehension.

Matching character lists of unequal length

I want to match two lists, one of which is shorter than the other. When a match occurs between the two lists, the matching element should be put in a new list at the same index rather than at a different one. The code below shows what I mean:
list1=['AF','KN','JN','NJ']
list2=['KNJ','NJK','JNJ','INS','AFG']
matchlist = []
smaller_list_len = min(len(list1),len(list2))
for ind in range(smaller_list_len):
    elem2 = list1[ind]
    elem1 = list2[ind][0:2]
    if elem1 in list2:
        matchlist.append(list1[ind])
Obtained output
>>> matchlist
['KNJ', 'NJK', 'JNJ']
Desired Output
>>> matchlist
['AFG', 'KNJ', 'JNJ', 'NJK']
Is there a way to get the desired output?
Use a nested loop that iterates over the 3-character list. When an item in that list starts with the current item from the 2-character list, append it and break out of the inner loop:
list1 = ['AF', 'KN', 'JN', 'NJ']
list2 = ['KNJ', 'NJK', 'JNJ', 'INS', 'AFG']
matchlist = []
for prefix in list1:
    for item in list2:
        # startswith avoids false matches such as 'NJ' inside 'KNJ'
        if item.startswith(prefix):
            matchlist.append(item)
            break
Since the question doesn't specify any constraints, here is a more Pythonic way using a list comprehension:
list1=['AF','KN','JN','NJ']
list2=['KNJ','NJK','JNJ','INS','AFG']
matchlist=[e2 for e1 in list1 for e2 in list2 if e2.startswith(e1)]
produces
['AFG', 'KNJ', 'JNJ', 'NJK']
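For longer lists, the nested scan can be replaced by a dictionary lookup. The following is only a sketch and assumes every element of list1 is exactly 2 characters long, as in the example; first_by_prefix is a name made up for illustration.

```python
list1 = ['AF', 'KN', 'JN', 'NJ']
list2 = ['KNJ', 'NJK', 'JNJ', 'INS', 'AFG']

# Map each 2-character prefix to the first list2 item that starts with it.
first_by_prefix = {}
for item in list2:
    first_by_prefix.setdefault(item[:2], item)

# One dictionary lookup per list1 element instead of a scan of list2.
matchlist = [first_by_prefix[p] for p in list1 if p in first_by_prefix]
print(matchlist)  # ['AFG', 'KNJ', 'JNJ', 'NJK']
```

Unlike the list comprehension above, this keeps only the first match per prefix, like the loop with break.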

Python: Faster way to filter a list using list comprehension

Consider the following problem: I want to keep the elements of list1 that belong to list2. So I can do something like this:
filtered_list = [w for w in list1 if w in list2]
I need to repeat this same procedure for different examples of list1 (about 20000 different examples) and a "constant" (frozen) list2.
How can I speed up the process?
I also know the following properties:
1) list1 has repeated elements, is not sorted, and has about 10000 (ten thousand) items.
2) list2 is a giant sorted list (about 200000, i.e. two hundred thousand, entries) and each element is unique.
The first thing that comes to mind is that maybe I can use some kind of binary search. Is there a way to do this in Python?
Furthermore, I do not care whether filtered_list keeps the order of the items in list1. So maybe I can check only a deduplicated version of list1 and, after removing the elements that do not belong to list2, restore the repeated items.
Is there a fast way to do this in Python 3?
Convert list2 to a set:
# do once
set2 = set(list2)
# then every time
filtered_list = [w for w in list1 if w in set2]
x in list2 is sequential; x in set2 uses the same mechanism as dictionaries, resulting in a very quick lookup.
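On the binary search idea from the question: the standard library's bisect module provides it for sorted lists. The sketch below is for comparison only; contains_sorted is a made-up helper name and the data is a small hypothetical sample. Each lookup is O(log n), whereas a set lookup is O(1) on average, so the set is usually the better choice unless building it is not an option.

```python
from bisect import bisect_left

def contains_sorted(sorted_items, value):
    """Binary search: True if value occurs in the sorted list."""
    i = bisect_left(sorted_items, value)
    return i < len(sorted_items) and sorted_items[i] == value

list2 = [1, 3, 5, 7, 9]   # hypothetical small sorted list
list1 = [5, 2, 9, 5, 4]   # hypothetical unsorted list with repeats
filtered_list = [w for w in list1 if contains_sorted(list2, w)]
print(filtered_list)  # [5, 9, 5]
```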
If list1 didn't have duplicates, converting both to sets and taking set intersection would be the way to go:
filtered_set = set1 & set2
but with duplicates you're stuck with iterating over list1 as above.
(As you said, you could also identify the elements to delete using set1 - set2, but you would still need a loop to delete them. There shouldn't be any difference in performance between filtering keepers and filtering trash: either way you have to iterate over list1, so that's no win over the method above.)
EDIT in response to comment: Converting list1 to a Counter might (EDIT: or might not; testing needed!) speed it up if you can work with it in that form throughout (i.e. you never have a list, you always just deal with a Counter). But if you have to preprocess list1 into counter1 each time you do the above operation, again it's no win, since creating a Counter also involves a loop.
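For reference, the Counter idea might look like the sketch below, using a small hypothetical sample. Each distinct element is tested against set2 once and then repeated according to its count. Whether this beats the plain comprehension depends on how many duplicates list1 has, so it needs measuring on real data.

```python
from collections import Counter

list1 = ['a', 'b', 'a', 'c', 'b', 'a']   # hypothetical sample with repeats
set2 = {'a', 'c', 'z'}                   # built once from list2

counts = Counter(list1)                  # one pass over list1

# One membership test per distinct element, then expand by its count.
filtered_list = [w for w, n in counts.items() if w in set2 for _ in range(n)]
print(sorted(filtered_list))  # ['a', 'a', 'a', 'c']
```

The result groups equal elements together instead of keeping list1's order, which the question says is acceptable.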

Resources