Different behavior in list comprehension - python-3.x

In my mind, these two pieces of code do the same thing:
l = [[1,2], [3,4],[3,2], [5,4], [4,4],[5,7]]
1)
In [4]: [list(g) for k,g in groupby(sorted(l,key=lambda x:x[1]),
key = lambda x:x[1]) if len(list(g)) == 2]
Out[4]: [[]]
2)
In [5]: groups = [list(g) for k,g in groupby(sorted(l,
key=lambda x:x[1]), key = lambda x:x[1])]
In [6]: [g for g in groups if len(g) == 2]
Out[6]: [[[1, 2], [3, 2]]]
But as you can see, the first one gives [[]] while the second one gives what I need. Where am I mistaken?

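Each group is an iterator, so you cannot consume it (e.g. by calling list on it) twice. In your first snippet, len(list(g)) in the condition exhausts the group, so the list(g) that builds the result is empty. For example: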
>>> from operator import itemgetter
>>> from itertools import groupby
>>> l = [[1,2], [3,4],[3,2], [5,4], [4,4],[5,7]]
>>> for _, group in groupby(sorted(l, key=itemgetter(1)), key=itemgetter(1)):
...     print('first', list(group))
...     print('second', list(group))
...
first [[1, 2], [3, 2]]
second []
first [[3, 4], [5, 4], [4, 4]]
second []
first [[5, 7]]
second []
Instead, you need to call list once per group and filter on the results of that, e.g. by using map:
>>> [lst for lst in map(list, (group for _, group in groupby(sorted(l, key=itemgetter(1)), key=itemgetter(1)))) if len(lst) == 2]
[[[1, 2], [3, 2]]]
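The same fix can also be written without map. This is a minimal sketch rather than the only option: an inner generator turns each group into a list exactly once, and the outer comprehension filters those lists. The names second, materialized and groups_of_two are only illustrative.

```python
from itertools import groupby
from operator import itemgetter

l = [[1, 2], [3, 4], [3, 2], [5, 4], [4, 4], [5, 7]]
second = itemgetter(1)

# Each group iterator is converted to a list exactly once (inner generator);
# the outer comprehension then filters those lists by length.
materialized = (list(group) for _, group in groupby(sorted(l, key=second), key=second))
groups_of_two = [grp for grp in materialized if len(grp) == 2]

print(groups_of_two)  # [[[1, 2], [3, 2]]]
```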

Related

Print multiple columns from a matrix

I have a list of column indices and I want to print only those columns from a matrix.
Note: the list can be of arbitrary length, and the indices can also be arbitrary.
For instance, the following does what I want:
import numpy as np
column_list = [2,3]
a = np.array([[1,2,6,1],[4,5,8,2],[8,3,5,3],[6,5,4,4],[5,2,8,8]])
new_matrix = []
for i in column_list:
    new_matrix.append(a[:,i])
new_matrix = np.array(new_matrix)
new_matrix = new_matrix.transpose()
print(new_matrix)
However, I was wondering if there is a shorter method?
Yes, there's a shorter way. You can pass a list (or numpy array) to an array's indexer. Therefore, you can pass column_list to the columns indexer of a:
>>> a[:, column_list]
array([[6, 1],
       [8, 2],
       [5, 3],
       [4, 4],
       [8, 8]])
# This is your new_matrix produced by your original code:
>>> new_matrix
array([[6, 1],
       [8, 2],
       [5, 3],
       [4, 4],
       [8, 8]])
>>> np.all(a[:, column_list] == new_matrix)
True

What is the role of [:] in overwriting a list in a for loop?

I came across a weird syntactical approach at work today that I couldn't wrap my head around. Let's say I have the following list:
my_list = [[1, 2, 3], [4, 5, 6]]
My objective is to filter each nested list according to some criteria and overwrite the elements of the list in place. So, let's say I want to remove odd numbers from each nested list such that my_list contains lists of even numbers, where the end result would look like this:
[[2], [4, 6]]
If I try to do this using a simple assignment operator, it doesn't work.
my_list = [[1, 2, 3], [4, 5, 6]]
for l in my_list:
    l = [num for num in l if num % 2 == 0]
print(my_list)
Output: [[1, 2, 3], [4, 5, 6]]
However, if I "slice" the list, it provides the expected output.
my_list = [[1, 2, 3], [4, 5, 6]]
for l in my_list:
    l[:] = [num for num in l if num % 2 == 0]
print(my_list)
Output: [[2], [4, 6]]
My original hypothesis was that l was a newly created object that didn't actually point to the corresponding object in the list, but comparing the outputs of id(x[i]), id(l), and id(l[:]) (where i is the index of l in x), I realized that l[:] was the one with the differing id. So, if Python is creating a new object when I assign to l[:] then how does Python know to overwrite the existing object of l? Why does this work? And why doesn't the simple assignment operator l = ... work?
It's subtle.
Snippet one:
my_list = [[1, 2, 3], [4, 5, 6]]
for l in my_list:
    l = [num for num in l if num % 2 == 0]
Why doesn't this work? Because when you do l = ..., you're only rebinding the variable l to a new list. The list it referred to before (the one inside my_list) is left unchanged.
If we write the loop out "manually", it hopefully will become more clear why this strategy fails:
my_list = [[1, 2, 3], [4, 5, 6]]
# iteration 1
l = my_list[0]
l = [num for num in l if num % 2 == 0]
# iteration 2
l = my_list[1]
l = [num for num in l if num % 2 == 0]
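To make the rebinding visible, here is a small sketch that checks object identity with `is` before and after the assignment. The names inner, same_before and same_after are only for illustration.

```python
my_list = [[1, 2, 3], [4, 5, 6]]

inner = my_list[0]          # the list object stored inside my_list
l = my_list[0]              # l is a second name for that same object
same_before = l is inner    # True

# Plain assignment builds a new list and points the name l at it.
l = [num for num in l if num % 2 == 0]
same_after = l is inner     # False: l now names a different object

print(same_before, same_after)  # True False
print(l)                        # [2]
print(my_list)                  # [[1, 2, 3], [4, 5, 6]]  (unchanged)
```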
Snippet two:
my_list = [[1, 2, 3], [4, 5, 6]]
for l in my_list:
    l[:] = [num for num in l if num % 2 == 0]
Why does this work? Because by using l[:] = ..., you're actually modifying the list that l references, not just rebinding the variable l. Let me elaborate.
Generally speaking, using [:] notation (slice notation) on lists allows one to work with a section of the list.
The simplest use is for getting values out of a list; we can write a[n:k] to get the nth item, the (n+1)st item, and so on, up to item k-1. For instance:
>>> a = ["a", "very", "fancy", "list"]
>>> print(a[1:3])
['very', 'fancy']
Python also allows slice notation on the left-hand side of an =. In this case, it interprets the notation to mean that we want to update only part of a list. For instance, we can replace "very", "fancy" with "not", "so", "fancy" like so:
>>> print(a)
['a', 'very', 'fancy', 'list']
>>> a[1:3] = ["not", "so", "fancy"]
>>> print(a)
['a', 'not', 'so', 'fancy', 'list']
When using slice syntax, Python also provides some convenient shorthand. Instead of writing [n:k], we can omit n or k or both.
If we omit n, then our slice looks like [:k], and Python understands it to mean "up to k", i.e., the same as [0:k].
If we omit k, then our slice looks like a[n:], and Python understands it to mean "n and after", i.e., the same as a[n:len(a)].
If we omit both, then both rules take place, so a[:] is the same as a[0:len(a)], which is a slice over the entire list.
Examples:
>>> print(a)
['a', 'not', 'so', 'fancy', 'list']
>>> print(a[2:4])
['so', 'fancy']
>>> print(a[:4])
['a', 'not', 'so', 'fancy']
>>> print(a[2:])
['so', 'fancy', 'list']
>>> print(a[:])
['a', 'not', 'so', 'fancy', 'list']
Crucially, this all still applies if we are using our slice on the left-hand side of a =:
>>> print(a)
['a', 'not', 'so', 'fancy', 'list']
>>> a[:4] = ["the", "fanciest"]
>>> print(a)
['the', 'fanciest', 'list']
And using [:] means to replace every item in the list:
>>> print(a)
['the', 'fanciest', 'list']
>>> a[:] = ["something", "completely", "different"]
>>> print(a)
['something', 'completely', 'different']
Okay, so far so good.
The key thing to note is that using slice notation on the left-hand side of an assignment updates the list in place. In other words, when I do a[1:3] = ..., the variable a is never rebound; the list that it references is modified.
We can see this with id(), as you were doing:
>>> print(a)
['something', 'completely', 'different']
>>> print(id(a))
139848671387072
>>> a[1:] = ["truly", "amazing"]
>>> print(a)
['something', 'truly', 'amazing']
>>> print(id(a))
139848671387072
Perhaps more pertinently, this means that if a were a reference to a list within some other object, then using a[:] = will update the list within that object. Like so:
>>> list_of_lists = [ [1, 2], [3, 4], [5, 6] ]
>>> second_list = list_of_lists[1]
>>> print(second_list)
[3, 4]
>>> second_list[1:] = [2, 1, 'boom!']
>>> print(second_list)
[3, 2, 1, 'boom!']
>>> print(list_of_lists)
[[1, 2], [3, 2, 1, 'boom!'], [5, 6]]
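Tying this back to the loop in the question, the following sketch checks that slice assignment keeps the same inner list objects while changing their contents. It also hints at why id(l[:]) looked different: used as an expression, l[:] builds a new copy of the list, whereas on the left-hand side of = it replaces the contents of the existing list. The names ids_before, ids_after and copy_of_last are only illustrative.

```python
my_list = [[1, 2, 3], [4, 5, 6]]
ids_before = [id(inner) for inner in my_list]

for l in my_list:
    # Slice assignment: replaces the contents of the existing list object.
    l[:] = [num for num in l if num % 2 == 0]

ids_after = [id(inner) for inner in my_list]

print(my_list)                  # [[2], [4, 6]]
print(ids_before == ids_after)  # True: same objects, new contents

# Used as an expression, l[:] creates a shallow copy, i.e. a different object.
copy_of_last = l[:]
print(copy_of_last == l, copy_of_last is l)  # True False
```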

All possible combinations (including all subsets combinations) of two or more lists

I searched the net but couldn't find anything. I am trying to get all possible combinations including all subsets combinations of two lists (ideally n lists). All combinations should include at least one item from each list.
list_1 = [1,2,3]
list_2 = [5,6]
output = [
[1,5], [1,6], [2,5], [2,6], [3,5], [3,6],
[1,2,5], [1,2,6], [1,3,5], [1,3,6], [2,3,5], [2,3,6], [1,5,6], [2,5,6], [3,5,6],
[1,2,3,5], [1,2,3,6],
[1,2,3,5,6]
]
All I can get is pair combinations like [1,5], [1,6], .. by using
combs = list(itertools.combinations(itertools.chain(*ls_filter_columns), cnt))
What is the pythonic way of achieving this?
Here is one way:
from itertools import combinations, product
def non_empties(items):
    """returns nonempty subsets of list items"""
    subsets = []
    n = len(items)
    for i in range(1,n+1):
        subsets.extend(combinations(items,i))
    return subsets
list_1 = [1,2,3]
list_2 = [5,6]
combs = [list(p) + list(q) for p,q in product(non_empties(list_1),non_empties(list_2))]
print(combs)
Output:
[[1, 5], [1, 6], [1, 5, 6], [2, 5], [2, 6], [2, 5, 6], [3, 5], [3, 6], [3, 5, 6], [1, 2, 5], [1, 2, 6], [1, 2, 5, 6], [1, 3, 5], [1, 3, 6], [1, 3, 5, 6], [2, 3, 5], [2, 3, 6], [2, 3, 5, 6], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 3, 5, 6]]
This has more elements than the output you gave, though I suspect that your intended output is in error. Note that my code might not correctly handle the case in which the two lists have a non-empty intersection. Then again, it might -- you didn't specify what the intended output should be in such a case.
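As a quick sanity check on the size of that output (a sketch, assuming the two lists share no elements): a list of n items has 2**n - 1 non-empty subsets, so the product should contain (2**3 - 1) * (2**2 - 1) = 21 combinations, against the 18 listed in the question. The helper is rewritten here in a compact form so that the block runs on its own; expected and missing are illustrative names.

```python
from itertools import combinations, product

def non_empties(items):
    """Return all non-empty subsets of items as tuples."""
    return [c for i in range(1, len(items) + 1) for c in combinations(items, i)]

list_1 = [1, 2, 3]
list_2 = [5, 6]

combs = [list(p) + list(q) for p, q in product(non_empties(list_1), non_empties(list_2))]
expected = (2 ** len(list_1) - 1) * (2 ** len(list_2) - 1)

print(len(combs), expected)  # 21 21

# The three combinations that are absent from the list in the question:
missing = [c for c in combs if len(c) == 4 and 5 in c and 6 in c]
print(missing)  # [[1, 2, 5, 6], [1, 3, 5, 6], [2, 3, 5, 6]]
```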
Here is another way.
Even though it is not fancy, it handles n lists easily.
def change_format(X):
    output = []
    for x in X:
        output += list(x)
    return output
import itertools
list_1 = [1,2,3]
list_2 = [5,6]
list_3 = [7,8]
lists = [list_1, list_2, list_3]
lengths = list(map(len, lists))
rs_list = itertools.product(*[list(range(1, l+1)) for l in lengths])
output = []
for rs in rs_list:
    temp = []
    for L, r in zip(lists, rs):
        temp.append(list(itertools.combinations(L, r)))
    output += list(itertools.product(*temp))
output = list(map(change_format, output))
I have managed to make it work for n-lists with less code, which was a challenge for me.
from itertools import chain, combinations, product
def get_subsets(list_of_lists):
    """
    Get all possible combinations of subsets of given lists
    :param list_of_lists: consists of any number of lists with any number of elements
    :return: list
    """
    ls = [chain(*map(lambda x: combinations(e, x), range(1, len(e)+1))) for e in list_of_lists if e]
    ls_output = [[i for tpl in ele for i in tpl] for ele in product(*ls)]
    return ls_output
list_1 = [1, 2, 3]
list_2 = [5, 6]
list_3 = [7, 8]
ls_filter_columns = [list_1, list_2, list_3]
print(get_subsets(ls_filter_columns))

How to transform a list of lists like [1, [2, 3, 4], 5 ] to a list [[1,2,5], [1,3,5], [1,4,5]] in Python?

I'm just starting with Python and trying to find a general solution to transform a list of lists [1, [2, 3, 4], 5 ] to a list [[1,2,5], [1,3,5], [1,4,5]] in Python.
I've tried creating some dynamic lists but I'm not getting what I want, not even for the simple list in the example. Any help will be greatly appreciated.
inter_l = []
aba = []
v = [1, [2, 3], 4, 5, 6]
g = globals()
for elem in v:
    if isinstance(elem, (list,)):
        l_ln = len(elem)
        indx = v.index(elem)
        for i in range(0, l_ln):
            g['depth_{0}'.format(i)] = [elem[i]]
            inter_l.append(list(g['depth_{0}'.format(i)]))
    else:
        aba.append(elem)
t = aba.extend(inter_l)
w = aba.extend(inter_l)
print(v)
print(aba)
print(inter_l)
[1, [2, 3], 4, 5, 6]
[1, 4, 5, 6, [2], [3], [2], [3]]
[[2], [3]]
The easiest way would be to use the itertools.product function. Since it expects iterables as its inputs, the input has to be transformed a little first. One way to do that:
transformed = [e if isinstance(e, list) else [e] for e in v]
This converts all non-list elements into one-element lists. Then pass the transformed input to product:
list(itertools.product(*transformed))
Note that the * in front of transformed unpacks the list into positional arguments, so each of its elements is passed as a separate argument instead of the whole list being passed as a single argument.
The entire pipeline looks something like this:
>>> import itertools
>>> v = [1, [2, 3, 4], 5]
>>> t = [e if isinstance(e, list) else [e] for e in v]
>>> list(itertools.product(*t))
[(1, 2, 5), (1, 3, 5), (1, 4, 5)]
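Since product yields tuples while the question asks for lists, one possible wrapper converts each result back to a list. This is only a sketch: the function name expand is made up for this example, and it assumes a single level of nesting.

```python
from itertools import product

def expand(values):
    """Expand [1, [2, 3, 4], 5] into [[1, 2, 5], [1, 3, 5], [1, 4, 5]]."""
    # Wrap scalars in one-element lists so every position is iterable.
    wrapped = [v if isinstance(v, list) else [v] for v in values]
    # product picks one item per position; convert each tuple to a list.
    return [list(combo) for combo in product(*wrapped)]

print(expand([1, [2, 3, 4], 5]))     # [[1, 2, 5], [1, 3, 5], [1, 4, 5]]
print(expand([1, [2, 3], 4, 5, 6]))  # [[1, 2, 4, 5, 6], [1, 3, 4, 5, 6]]
```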

Generating a list from complex dictionary

I have a dictionary dict1['a'] = [ [1,2], [3,4] ] and need to generate a list out of it as l1 = [2, 4]. That is, a list out of the second element of each inner list. It can be a separate list or even the dictionary can be modified as dict1['a'] = [2,4].
Given a list:
>>> lst = [ [1,2], [3,4] ]
You can extract the second element of each sublist with a simple list comprehension:
>>> [x[1] for x in lst]
[2, 4]
If you want to do this for every value in a dictionary, you can iterate over the dictionary. I'm not sure exactly what you want your final data to look like, but something like this may help:
>>> dict1 = {}
>>> dict1['a'] = [ [1,2], [3,4] ]
>>> [(k, [x[1] for x in v]) for k, v in dict1.items()]
[('a', [2, 4])]
dict.items() returns the (key, value) pairs from the dictionary (as a list in Python 2, as a view object in Python 3). So this code pairs each key in your dictionary with a list generated as above.
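If the goal is the second form mentioned in the question, where the dictionary itself ends up as dict1['a'] = [2, 4], a dict comprehension is one option. This is a sketch that assumes every value is a list of sequences with at least two elements; the key 'b' and the name seconds are added only for illustration.

```python
dict1 = {'a': [[1, 2], [3, 4]], 'b': [[5, 6], [42, 69]]}

# Build a new dictionary mapping each key to the second elements of its pairs.
seconds = {k: [pair[1] for pair in v] for k, v in dict1.items()}
print(seconds)  # {'a': [2, 4], 'b': [6, 69]}

# To change the original dictionary instead, assign back key by key.
# (Reassigning values for existing keys does not change the dict's size,
# so it is safe to do while iterating over the keys.)
for k in dict1:
    dict1[k] = [pair[1] for pair in dict1[k]]
print(dict1['a'])  # [2, 4]
```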
Assuming that each value in the dictionary is a list of pairs, then this should do it for you:
[pair[1] for pairlist in dict1.values() for pair in pairlist]
As you can see:
dict1.values() gets you just the values in your dict,
for pairlist in dict1.values() gets you all the lists of pairs,
for pair in pairlist gets you all the pairs in each of those lists,
and pair[1] gets you the second value in each pair.
Try it out. The Python shell is your friend!...
>>> dict1 = {}
>>> dict1['a'] = [[1,2], [3,4]]
>>> dict1['b'] = [[5, 6], [42, 69], [220, 284]]
>>>
>>> list(dict1.values())
[[[1, 2], [3, 4]], [[5, 6], [42, 69], [220, 284]]]
>>>
>>> [pairlist for pairlist in dict1.values()]
[[[1, 2], [3, 4]], [[5, 6], [42, 69], [220, 284]]]
>>> # No real difference here, but we can refer to each list now.
>>>
>>> [pair for pairlist in dict1.values() for pair in pairlist]
[[1, 2], [3, 4], [5, 6], [42, 69], [220, 284]]
>>>
>>> # Finally...
>>> [pair[1] for pairlist in dict1.values() for pair in pairlist]
[2, 4, 6, 69, 284]
While I'm at it, I'll just say: ipython loves you!
"a list out of the second element of each inner list"
That sounds like [sl[1] for sl in dict1['a']] -- so what's the QUESTION?!-)
