Remove redundant sublists within list in python - python-3.x

Hello everyone, I have a list of lists such as:
list_of_values=[['A','B'],['A','B','C'],['D','E'],['A','C'],['I','J','K','L','M'],['J','M']]
and I would like to keep only the sublists whose values are not all contained in another, longer sublist.
For instance, sublist 1, ['A','B'], is removed because A and B are also present in sublist 2, ['A','B','C'].
The same goes for sublist 4.
Sublist 6 is also removed because J and M are present in the longer sublist 5.
At the end I should get:
list_of_no_redundant_values=[['A','B','C'],['D','E'],['I','J','K','L','M']]
Another example:
list_of_values=[['A','B'],['A','B','C'],['B','E'],['A','C'],['I','J','K','L','M'],['J','M']]
Expected output:
[['A','B','C'],['B','E'],['I','J','K','L','M']]
Does anyone have an idea?

mylist=[['A','B'],['A','C'],['A','B','C'],['D','E'],['I','J','K','L','M'],['J','M']]
def remove_subsets(lists):
    outlists = lists[:]  # work on a copy so the input list is not modified
    for s1 in lists:
        for s2 in lists:
            # s1 is redundant if all of its elements also appear in another sublist s2
            if set(s1).issubset(set(s2)) and (s1 is not s2):
                outlists.remove(s1)
                break
    return outlists
print(remove_subsets(mylist))
This should result in [['A', 'B', 'C'], ['D', 'E'], ['I', 'J', 'K', 'L', 'M']]
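One caveat with remove_subsets: if the input contains two sublists with the same elements, each one is a subset of the other, so both copies are removed. The following is a minimal sketch of a variant that keeps the first copy instead. It is not the original answer's method, and the name drop_redundant_sublists is made up for illustration.

```python
def drop_redundant_sublists(lists):
    """Keep sublists that are not contained in another sublist.

    Hypothetical helper: sublists with identical contents are collapsed
    to their first occurrence instead of all being removed.
    """
    sets = [set(sub) for sub in lists]
    kept = []
    for i, current in enumerate(sets):
        redundant = any(
            current < other                  # strictly contained in a larger sublist
            or (current == other and j < i)  # same contents already seen earlier
            for j, other in enumerate(sets)
            if j != i
        )
        if not redundant:
            kept.append(lists[i])
    return kept


values = [['A', 'B'], ['A', 'B', 'C'], ['B', 'E'], ['A', 'C'],
          ['I', 'J', 'K', 'L', 'M'], ['J', 'M']]
print(drop_redundant_sublists(values))
# [['A', 'B', 'C'], ['B', 'E'], ['I', 'J', 'K', 'L', 'M']]
print(drop_redundant_sublists([['A', 'B'], ['B', 'A']]))
# [['A', 'B']]
```

Converting every sublist to a set once up front avoids rebuilding the sets inside the nested loop.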

Related

Apparently empty groups generated with itertools.groupby

I am having some trouble with groupby from itertools:
from itertools import groupby
for k, grp in groupby("aahfffddssssnnb"):
    print(k, list(grp), list(grp))
output is:
a ['a', 'a'] []
h ['h'] []
f ['f', 'f', 'f'] []
d ['d', 'd'] []
s ['s', 's', 's', 's'] []
n ['n', 'n'] []
b ['b'] []
This works as expected: itertools._grouper objects seem to be readable only once (maybe they are iterators?).
But:
li = [grp for k, grp in groupby("aahfffddssssnnb")]
list(li[0])
[]
list(li[1])
[]
The groups seem to be empty, and I don't understand why.
This one works:
["".join(grp) for k, grp in groupby("aahfffddssssnnb")]
['aa', 'h', 'fff', 'dd', 'ssss', 'nn', 'b']
I am using Python 3.9.9.
This question was already asked on the comp.lang.python newsgroup, without any answers.
grp is a sub-iterator over the same underlying iterator that was given to groupby. A new one is created for every key.
When you advance to the next key, the old grp is no longer usable, because the main iterator has already moved past that group.
This is stated clearly in the Python documentation:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list:
groups = []
uniquekeys = []
data = sorted(data, key=keyfunc)
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)
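Applied to the string from the question, the same idea might look like the sketch below. The variable names stale and groups are chosen here for illustration only.

```python
from itertools import groupby

text = "aahfffddssssnnb"

# Storing the group iterators themselves: by the time the comprehension
# has finished, groupby has advanced past these groups, so the stored
# iterators no longer yield their items (as seen in the question).
stale = [grp for _, grp in groupby(text)]
print(list(stale[0]))

# Copying each group into a list while it is still the current group.
groups = [(key, list(grp)) for key, grp in groupby(text)]
print(groups[:2])
# [('a', ['a', 'a']), ('h', ['h'])]
```

The "".join(grp) version in the question works for the same reason: each group is consumed immediately, before groupby moves on.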

Find cartesian product of the elements in a program generated dynamic "sub-list"

I have a program that produces and modifies a list of "n" elements, with n remaining constant throughout a particular run of the program. (The value of "n" might change in the next run.)
Each member of the list is itself a "sub-list". These sub-lists not only have variable lengths, they are also dynamic and may keep changing while the program runs.
So, at some given point, my list would look something like this (assuming n=3):
[['1', '2'], ['a', 'b', 'c', 'd'], ['x', 'y', 'z']]
I want the output to be like the following:
['1ax', '1ay', '1az', '1bx', '1by', '1bz',
'1cx', '1cy', '1cz', '1dx', '1dy', '1dz',
'2ax', '2ay', '2az', '2bx', '2by', '2bz',
'2cx', '2cy', '2cz', '2dx', '2dy', '2dz']
i.e. a list with exactly (2 * 4 * 3) = 24 elements, where each element has length exactly 3 and contains exactly 1 member from each of the "sub-lists".
The easiest way is itertools.product:
from itertools import product
lst = [['1', '2'], ['a', 'b', 'c', 'd'], ['x', 'y', 'z']]
output = [''.join(p) for p in product(*lst)]
# OR
output = list(map(''.join, product(*lst)))
# ['1ax', '1ay', '1az', '1bx', '1by', '1bz',
# '1cx', '1cy', '1cz', '1dx', '1dy', '1dz',
# '2ax', '2ay', '2az', '2bx', '2by', '2bz',
# '2cx', '2cy', '2cz', '2dx', '2dy', '2dz']
A manual implementation specific to strings could look like this:
def prod(*pools):
    if pools:
        *rest, pool = pools
        for p in prod(*rest):
            for el in pool:
                yield p + el
    else:
        yield ""
list(prod(*lst))
# ['1ax', '1ay', '1az', '1bx', '1by', '1bz',
# '1cx', '1cy', '1cz', '1dx', '1dy', '1dz',
# '2ax', '2ay', '2az', '2bx', '2by', '2bz',
# '2cx', '2cy', '2cz', '2dx', '2dy', '2dz']
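Since the sub-lists in the question keep changing, it may help to wrap the call in a small function that is run whenever the combinations are needed. The sketch below is only an illustration: joined_product is a made-up name, the str() conversion assumes elements might not always be strings, and math.prod requires Python 3.8 or later.

```python
import math
from itertools import product


def joined_product(sublists):
    """Hypothetical helper: join one element from each sub-list, in order."""
    # str() lets the sketch also accept non-string elements (an assumption).
    return ["".join(map(str, combo)) for combo in product(*sublists)]


lst = [['1', '2'], ['a', 'b', 'c', 'd'], ['x', 'y', 'z']]
output = joined_product(lst)

# The number of results is the product of the sub-list lengths: 2 * 4 * 3.
assert len(output) == math.prod(len(sub) for sub in lst) == 24

# The helper reads the sub-lists at call time, so later changes are picked up.
lst[0].append('3')
print(len(joined_product(lst)))  # 36
```

If any sub-list is empty at the time of the call, the result is an empty list, because there is no way to pick one member from every sub-list.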

Sliding Window algorithm in python 3.x. Extracting the before and after values of an element from a list

I have 2 lists; terms and key_terms. I need to extract the before and after elements from the terms list using the elements from the key_terms list. I have tried the below and it works but it has a bug.
terms=['b','a','f','s','w','c','g']
key_terms=['a','w','g']
context_terms=[]
for kt in key_terms:
    if(kt!=0):
        before=terms[(terms.index(kt))-1]
        if(terms.index(kt)==len(terms)-1):
            context_terms.append(before)
            break
        else:
            after=terms[(terms.index(kt))+1]
            context_terms.append(before)
            context_terms.append(after)
print(context_terms)
Output: ['b', 'f', 's', 'c', 'c']
The problem with the above is that if a key term appears twice in the terms list, the second occurrence is ignored.
terms=['b','a','f','s','a','c','g']
key_terms=['a','g']
context_terms=[]
for kt in key_terms:
    if(kt!=0):
        before=terms[(terms.index(kt))-1]
        if(terms.index(kt)==len(terms)-1):
            context_terms.append(before)
            break
        else:
            after=terms[(terms.index(kt))+1]
            context_terms.append(before)
            context_terms.append(after)
print(context_terms)
Output: ['b', 'f', 'c']
The correct output should be ['b', 'f', 's', 'c', 'c']
After some research I noticed that I have to use a sliding window. Can someone please help me? I can't understand how to apply a sliding window to my case. Thank you. (P.S. This is my first ever question, sorry if my issue is not clear.)
Try looping over terms instead of key_terms. For every element of terms that is present in key_terms, add the elements immediately before and after it.
The pseudo-code would be:
for e in terms:
    if e present in key_terms:
        ans.add(element_to_left_of_e)
        ans.add(element_to_right_of_e)
Rather than looking up the indices afterwards, it may be simpler to iterate over the indices directly:
for index in range(0, length of terms):
    if terms[index] present in key_terms:
        ans.add(terms[index-1])
        ans.add(terms[index+1])
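A runnable sketch of the index-based pseudo-code could look like the following. The function name neighbours_of_keys is made up here, and the two boundary checks are an addition: without them, index - 1 at position 0 would wrap around to the last element, and index + 1 at the last position would raise an IndexError.

```python
def neighbours_of_keys(terms, key_terms):
    """Hypothetical helper: collect the elements next to every key term."""
    keys = set(key_terms)  # a set makes the membership test cheap
    context = []
    for index, term in enumerate(terms):
        if term in keys:
            if index > 0:                 # there is an element to the left
                context.append(terms[index - 1])
            if index < len(terms) - 1:    # there is an element to the right
                context.append(terms[index + 1])
    return context


terms = ['b', 'a', 'f', 's', 'a', 'c', 'g']
print(neighbours_of_keys(terms, ['a', 'g']))
# ['b', 'f', 's', 'c', 'c']
```

Because the loop walks over terms once, every occurrence of a key term is handled, including repeated ones.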
If I understand your problem correctly, the following may help:
terms=['b','a','f','s','a','c','g']
key_terms=['a','g']
context_terms=[]
for k in key_terms:
    indices = [i for i, item in enumerate(terms) if item == k]
    for kt in indices:
        before=terms[kt - 1]
        if kt == len(terms)-1:
            context_terms.append(before)
            break
        else:
            after=terms[kt + 1]
            context_terms.append(before)
            context_terms.append(after)
print(context_terms)
Output: ['b', 'f', 's', 'c', 'c']

Retrieving repeated items in a list comprehension

Given a sorted list, I would like to retrieve the first repeated item in the list using list comprehension.
So I ran the line below:
list=['1', '2', '3', 'a', 'a', 'b', 'c']
print(k for k in list if k==k+1)
I expected the output "a". But instead I got:
<generator object <genexpr> at 0x0021AB30>
I'm pretty new at this, would someone be willing to clarify why this doesn't work?
You seem to be confusing list elements with their indices.
For example, a generator expression that yields every item of a list xs that is equal to its predecessor would look like this:
g = (xs[k] for k in range(1, len(xs)) if xs[k] == xs[k - 1])
Since you are only interested in the first such item, you could write
next(xs[k] for k in range(1, len(xs)) if xs[k] == xs[k - 1])
However, you'll get a StopIteration exception if there is no such item.
As general advice, prefer simple, readable functions over clever long one-liners,
especially when you are new to the language. Your task could be accomplished as follows:
def first_duplicate(xs):
    for k in range(1, len(xs)):
        if xs[k] == xs[k - 1]:
            return xs[k]
chars = ['1', '2', '3', 'a', 'a', 'b', 'c']
print(first_duplicate(chars)) # 'a'
P.S. Beware of using list as a variable name: you're shadowing the built-in type.
If you want just the first repeated item in the list you can use the next function with a generator expression that iterates through the list zipped with itself but with an offset of 1 to compare adjacent items:
next(a for a, b in zip(lst, lst[1:]) if a == b)
so that given lst = ['1', '2', '3', 'a', 'a', 'b', 'c'], the above returns: 'a'.
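If the list might contain no adjacent repeats at all, next accepts a second argument that is returned instead of raising StopIteration. A small sketch, with first_adjacent_duplicate as a made-up name and None as an assumed default:

```python
def first_adjacent_duplicate(items, default=None):
    """Hypothetical helper: first item equal to its successor, else default."""
    pairs = zip(items, items[1:])  # (items[0], items[1]), (items[1], items[2]), ...
    # The generator must be parenthesised when next() also gets a default.
    return next((a for a, b in pairs if a == b), default)


print(first_adjacent_duplicate(['1', '2', '3', 'a', 'a', 'b', 'c']))  # a
print(first_adjacent_duplicate(['1', '2', '3']))                      # None
```

This only compares neighbouring items, so it relies on the list being sorted, as stated in the question.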

Based on a condition perform reranking of elements in the list in python

I have a ranked list of elements as below:
ranked_list_1 = ['G','A','M','S','D']
I need to re-rank the list as below:
1) Re-rank as:
re_ranked_list_1 = ['A','M','D','E','G','S']
Logic: 'G' and 'S' should always be in the last 2 positions, and the new element 'E' should be added to the list just before those last 2 positions.
I think this is what you want. This will put 'G' and 'S' at the end, in the order they appear in the list.
ordered_list = list()
final_terms = ['E']
for item in ranked_list_1:
    if item in ['G', 'S']:
        final_terms.append(item)
    else:
        ordered_list.append(item)
output_list = ordered_list + final_terms
print(output_list)
Output: ['A', 'M', 'D', 'E', 'G', 'S']
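The same logic can be wrapped in a reusable function. This is only a sketch: the name rerank and its parameters are made up for illustration, and like the loop above it keeps the trailing items in the order in which they appear in the input.

```python
def rerank(ranked, last_items, new_item):
    """Hypothetical helper: move last_items to the end, new_item just before them."""
    last = set(last_items)
    head = [item for item in ranked if item not in last]  # everything else, order kept
    tail = [item for item in ranked if item in last]      # items forced to the end
    return head + [new_item] + tail


ranked_list_1 = ['G', 'A', 'M', 'S', 'D']
print(rerank(ranked_list_1, ['G', 'S'], 'E'))
# ['A', 'M', 'D', 'E', 'G', 'S']
```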
