If I have duplicates in a list with brackets, what should I do - python-3.x

Suppose I have the following list:
m=[1,2,[1],1,2,[1]]
I wish to take away all duplicates. If it were not for the brackets inside the list, I could use:
m=list(set(m))
but when I do this, I get the error:
TypeError: unhashable type: 'list'
How can I remove the duplicates so that I am left with only the list
m=[1,2,[1]]
Thank you

You can do something along these lines:
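m=[1,2,[1],1,2,[1]]
seen=set()
nm=[]
for e in m:
    try:
        x={e}   # raises TypeError if e is unhashable (e.g. a list)
        x=e
    except TypeError:
        x=frozenset(e)   # hashable stand-in; note it ignores order and repeats inside e
    if x not in seen:
        seen.add(x)
        nm.append(e)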
>>> nm
[1, 2, [1]]
From comments: This method preserves the order of the original list. If you want the numeric types in order first and the other types second, you can do:
sorted(nm, key=lambda e: 0 if isinstance(e, (int,float)) else 1)

The first step will be to convert the inner lists to tuples:
>>> new_list = [tuple(i) if isinstance(i, list) else i for i in m]
Then create a set to remove duplicates:
>>> no_duplicates = set(new_list)
>>> no_duplicates
{1, 2, (1,)}
You can convert that back into a list if you wish. Note that a set does not preserve the original order, and the inner lists remain tuples.
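If the original order and the original inner lists matter, the tuple idea above can be combined with a seen-set. The following is only a sketch: the helper name dedupe is hypothetical, and it assumes a single level of nesting whose inner lists contain hashable items. It would also treat a list [1] and a tuple (1,) as the same element.

```python
def dedupe(items):
    """Drop duplicates, keeping first occurrences in their original order."""
    seen = set()
    result = []
    for item in items:
        # Lists are unhashable, so compare them via an equivalent tuple.
        key = tuple(item) if isinstance(item, list) else item
        if key not in seen:
            seen.add(key)
            result.append(item)  # append the original object, not the key
    return result


m = [1, 2, [1], 1, 2, [1]]
print(dedupe(m))  # [1, 2, [1]]
```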

For a more generic solution you can serialize each list item with pickle.dumps before passing them to set(), and then de-serialize the items with pickle.loads:
import pickle
m = list(map(pickle.loads, set(map(pickle.dumps, m))))
If you want the original order to be maintained, you can use a dict (dicts preserve insertion order in CPython 3.6 and, as a language guarantee, from Python 3.7) instead of a set:
import pickle
m = list(map(pickle.loads, {k: 1 for k in map(pickle.dumps, m)}))
Or if you need to be compatible with Python 3.5 or earlier versions, you can use collections.OrderedDict instead:
import pickle
from collections import OrderedDict
m = list(map(pickle.loads, OrderedDict((k, 1) for k in map(pickle.dumps, m))))

result = []
for i in m:
    flag = True
    # compare against the items already kept, not against m itself
    for j in result:
        if i == j:
            flag = False
    if flag:
        result.append(i)
Result will be: [1,2,[1]]
There are ways to make this code shorter, but I'm writing it more verbosely for readability. Also, note that this method is O(n^2), so I wouldn't recommend it for long lists. Its benefit is simplicity.

Simple solution:
m=[1,2,[1],1,2,[1]]
l= []
for i in m:
    if i not in l:
        l.append(i)
print(l)
[1, 2, [1]]

Related

Python list comprehension ignore None results?

I have the following toy example function and list comprehension:
def foo(lst):
    if lst:
        return lst[0] + lst[1]
[foo(l) for l in [[], [1,2], [1,4]]]
The result is:
[None, 3, 5]
How can I avoid the None values? I would like to avoid calling if foo(l) is not None inside the list comprehension.
Please advise.
If you want to avoid calling the function more than once, you can make a generator that yields based on the result of the function. It's a little more code, but avoids making a list with a bunch of None values which have to be filtered later, and also avoids calling the function twice in the list comprehension:
def expensive_function(lst):
    # sometimes returns None...hard to know when without calling
    if lst:
        return lst[0] + lst[1]

def gen_results(l):
    for a in l:
        res = expensive_function(a)
        if res is not None:  # keep falsy results such as 0
            yield res

inp = [[], [1,2], [1,4]]
list(gen_results(inp))
# [3, 5]
Also, since generators are lazy, you don't need to make a list if you don't need a list.
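On Python 3.8 or later, an assignment expression gives the same single-call behaviour inside the comprehension itself. This is a sketch of an alternative, not what the answer above proposes; the names data, results, and lazy_results are illustrative.

```python
def foo(lst):
    if lst:
        return lst[0] + lst[1]


data = [[], [1, 2], [1, 4]]

# (y := foo(l)) calls foo once per element and binds the result to y,
# which is then both tested and collected.
results = [y for l in data if (y := foo(l)) is not None]
print(results)  # [3, 5]

# Swapping the brackets for parentheses gives a lazy generator instead of a list.
lazy_results = (y for l in data if (y := foo(l)) is not None)
print(next(lazy_results))  # 3
```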

How to overwrite a nested list using conditional list comprehension

I'm trying to check each nested list for its biggest number and overwrite each nested list with just that number.
I have done this using nested loops but I was wondering how to do this using conditional list comprehension.
Here's my nested loop solution:
list1 = [[1,2,4,3], [1,2,755,244], [1,2,6,1000] , [5,3,7,13]]
iterator = 0
for val1 in list1:
    for num in val1:
        if num == max(val1):
            list1[iterator] = num
    iterator +=1
Here's what I tried with list comprehension but the syntax is wrong:
num for x in list1 for num in x if num ==max(x)
The error is: invalid syntax
The code you pasted works just fine. That being said, you can write it much more cleanly:
list1 = [[1,2,4,3], [1,2,755,244], [1,2,6,1000] , [5,3,7,13]]
for index, val in enumerate(list1):
    list1[index] = max(val)
print(list1) # [4, 755, 1000, 13]
Cleaner yet is the list comprehension version with max:
list1 = [[1,2,4,3], [1,2,755,244], [1,2,6,1000] , [5,3,7,13]]
list1 = [max(lst) for lst in list1]
print(list1) # [4, 755, 1000, 13]

How to delete certain element(s) from an array?

I have a 2d array, how can I delete certain element(s) from it?
x = [[2,3,4,5,2],[5,3,6,7,9,2],[34,5,7],[2,46,7,4,36]]
for i in range(len(x)):
    for j in range(len(x[i])):
        if x[i][j] == 2:
            del x[i][j]
This corrupts the iteration and raises the error "list index out of range".
You can use pop on the inner list. For example:
>>> array = [[1,2,3,4], [6,7,8,9]]
>>> array[1].pop(3)
9
>>> array
[[1, 2, 3, 4], [6, 7, 8]]
I think this can solve your problem.
x = [[2,3,4,5,2],[5,3,6,7,9,2],[34,5,7],[2,46,7,4,36]]
for i in range(len(x)):
    for j in range(len(x[i])):
        if j<len(x[i]):
            # caveat: after a deletion the next element slides into position j and is skipped
            if x[i][j] == 2:
                del x[i][j]
I have tested it locally and it works as expected for this input. Hope it helps.
Mutating a list while iterating over it is error-prone. Just make a new list and add everything except those items you want to exclude. Such as:
x = [[2,3,4,5,2],[5,3,6,7,9,2],[34,5,7],[2,46,7,4,36]]
new_array = []
temp = []
delete_val = 2
for list_ in x:
    for element in list_:
        if element != delete_val:
            temp.append(element)
    new_array.append(temp)
    temp = []
x = new_array
print(x)
Edit: made it a little more pythonic by omitting list indices.
I think this is more readable at the cost of temporarily more memory usage (making a new list) compared to the solution that Sai prateek has offered.
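The same "build a new list" idea can be sketched more compactly with a nested list comprehension. The variable names follow the answer above; row is an illustrative name.

```python
x = [[2, 3, 4, 5, 2], [5, 3, 6, 7, 9, 2], [34, 5, 7], [2, 46, 7, 4, 36]]
delete_val = 2

# Rebuild every inner list, keeping only the elements that differ from delete_val.
x = [[element for element in row if element != delete_val] for row in x]
print(x)  # [[3, 4, 5], [5, 3, 6, 7, 9], [34, 5, 7], [46, 7, 4, 36]]
```

If other references to the inner lists must see the change, assigning to a slice (row[:] = ...) inside a plain for loop over x updates each inner list in place instead.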

Fastest way to find all the indexes of maximum value in a list - Python

I have a list as follows:
input_list= [2, 3, 5, 2, 5, 1, 5]
I want to get all the indexes of the maximum value and need an efficient solution. The output should be as follows:
output = [2,4,6] (5 is the maximum value in the list above)
I have tried the code below:
m = max(input_list)
output = [i for i, j in enumerate(input_list) if j == m]
Is there a more efficient solution?
from collections import defaultdict
dic=defaultdict(list)
input_list=[2, 3, 5, 2, 5, 1, 5]
# map every value to the list of indexes where it occurs
for i in range(len(input_list)):
    dic[input_list[i]]+=[i]
max_value = max(input_list)
Sol = dic[max_value]
You can use numpy (numpy arrays are very fast):
import numpy as np
input_list= np.array([2, 3, 5, 2, 5, 1, 5])
i, = np.where(input_list == np.max(input_list))
print(i)
Output:
[2 4 6]
Here's the approach described in the comments. Even if you use a library, you fundamentally need to traverse the list at least once to solve this problem (assuming the input list is unsorted), so the lower bound for the algorithm is Omega(size_of_list). If the list is sorted, we can leverage binary search instead.
def max_indexes(l):
    try:
        assert l != []
        max_element = l[0]
        indexes = [0]
        for index, element in enumerate(l[1:]):
            if element > max_element:
                max_element = element
                indexes = [index + 1]
            elif element == max_element:
                indexes.append(index + 1)
        return indexes
    except AssertionError:
        print('input_list is empty')
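For the sorted case mentioned above, a minimal sketch using the standard-library bisect module might look like this. The function name max_indexes_sorted is hypothetical, and the list is assumed to be sorted in ascending order, so the maximum is the last element.

```python
from bisect import bisect_left


def max_indexes_sorted(sorted_list):
    """Indexes of the maximum value in a list sorted in ascending order."""
    if not sorted_list:
        return []
    # Binary search for the first position holding the maximum (the last element).
    first = bisect_left(sorted_list, sorted_list[-1])
    return list(range(first, len(sorted_list)))


print(max_indexes_sorted([1, 2, 2, 3, 5, 5, 5]))  # [4, 5, 6]
```

The search itself is O(log n), although building the result list still costs time proportional to the number of maxima.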
Use a for loop that iterates over the list just once, for an O(n) solution:
from itertools import islice
input_list= [2, 3, 5, 2, 5, 1, 5]
def max_indexes(l):
    max_item = l[0]
    indexes = [0]
    for i, item in enumerate(islice(l, 1, None), 1):
        if item < max_item:
            continue
        elif item > max_item:
            max_item = item
            indexes = [i]
        elif item == max_item:
            indexes.append(i)
    return indexes
Think of it this way: unless you iterate through the whole list once, which is O(n) with n being the length of the list, you cannot compare the maximum with every value in the list. So the best you can do is O(n), which you already seem to be doing in your example.
So I am not sure you can do it faster than O(n) with the list approach.

Split/partition list based on invariant/hash?

I have a list [a1,a2,...] and would like to split it based on the value of a function f(a).
For example if the input is the list [0,1,2,3,4] and the function def f(x): return x % 3,
I would like to return the lists [0,3], [1,4], [2], since the elements of the first group all take the value 0 under f, those of the second group take the value 1, etc.
Something like this works:
return [[x for x in lst if f(x) == val] for val in set(map(f,lst))]
But it does not seem optimal (nor pythonic) since the inner loop unnecessarily scans the entire list and computes same f values of the elements several times.
I'm looking for a solution that would compute the value of f ideally once for every element...
If you're not irrationally ;-) set on a one-liner, it's straightforward:
from collections import defaultdict
lst = [0,1,2,3,4]
f = lambda x: x % 3
d = defaultdict(list)
for x in lst:
    d[f(x)].append(x)
print(list(d.values()))
displays what you want. f() is executed len(lst) times, which can't be beaten.
EDIT: or, if you must:
from itertools import groupby
print([[pair[1] for pair in grp]
       for ignore, grp in
       groupby(sorted((f(x), x) for x in lst),
               key=lambda pair: pair[0])])
That doesn't require that f() produce values usable as dict keys, but incurs the extra expense of a sort, and is close to incomprehensible. Clarity is much more Pythonic than striving for one-liners.
@Tim Peters is right; here are the setdefault option he mentioned and another itertools.groupby option.
Given
import itertools as it
iterable = range(5)
keyfunc = lambda x: x % 3
Code
setdefault
d = {}
for x in iterable:
    d.setdefault(keyfunc(x), []).append(x)
list(d.values())
groupby
[list(g) for _, g in it.groupby(sorted(iterable, key=keyfunc), key=keyfunc)]
See also more on itertools.groupby
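Both groupby variants above sort first because itertools.groupby only merges consecutive items that share a key. A small sketch of the difference, where data, unsorted_groups, and sorted_groups are illustrative names:

```python
import itertools as it

keyfunc = lambda x: x % 3
data = [0, 1, 2, 3, 4]

# Without sorting, equal keys are never adjacent here, so nothing is merged.
unsorted_groups = [list(g) for _, g in it.groupby(data, key=keyfunc)]
print(unsorted_groups)  # [[0], [1], [2], [3], [4]]

# Sorting by the same key first brings equal keys together.
sorted_groups = [list(g) for _, g in it.groupby(sorted(data, key=keyfunc), key=keyfunc)]
print(sorted_groups)  # [[0, 3], [1, 4], [2]]
```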
