Retrieving repeated items in a list comprehension - python-3.x

Given a sorted list, I would like to retrieve the first repeated item in the list using list comprehension.
So I ran the line below:
list=['1', '2', '3', 'a', 'a', 'b', 'c']
print(k for k in list if k==k+1)
I expected the output "a". But instead I got:
<generator object <genexpr> at 0x0021AB30>
I'm pretty new at this, would someone be willing to clarify why this doesn't work?

You seem to confuse the notion of list element and index.
For example the generator expression iterating over all items of list xs equal to its predecessor would look like this:
g = (xs[k] for k in range(1, len(xs)) if xs[k] == xs[k - 1])
Since you are interested only in first such item, you could write
next(xs[k] for k in range(1, len(xs)) if xs[k] == xs[k - 1])
however you'll get an exception if there is in fact no such items.
As a general advice, prefer simple readable functions over clever long one-liners,
especially when you are new to language. Your task could be accomplished as follows:
def first_duplicate(xs):
for k in range(1, len(xs)):
if xs[k] == xs[k - 1]:
return xs[k]
chars = ['1', '2', '3', 'a', 'a', 'b', 'c']
print(first_duplicate(chars)) # 'a'
P.S. Beware using list as your variable name -- you're shadowing built-in type

If you want just the first repeated item in the list you can use the next function with a generator expression that iterates through the list zipped with itself but with an offset of 1 to compare adjacent items:
next(a for a, b in zip(lst, lst[1:]) if a == b)
so that given lst = ['1', '2', '3', 'a', 'a', 'b', 'c'], the above returns: 'a'.

Related

Is there a way to split strings inside a list?

I am trying to split strings inside a list but I could not find any solution on the internet. This is a sample, but it should help you guys understand my problem.
array=['a','b;','c','d)','void','plasma']
for i in array:
print(i.split())
My desired output should look like this:
output: ['a','b',';','c','d',')','void','plasma']
One approach uses re.findall on each starting list term along with a list comprehension to flatten the resulting 2D list:
inp = ['a', 'b;', 'c', 'd)', 'void', 'plasma']
output = [j for sub in [re.findall(r'\w+|\W+', x) for x in inp] for j in sub]
print(output) # ['a', 'b', ';', 'c', 'd', ')', 'void', 'plasma']

Apparently empty groups generated with itertools.groupby

I have some troubles with groupby from itertools
from itertools import groupby
for k, grp in groupby("aahfffddssssnnb"):
print(k, list(grp), list(grp))
output is:
a ['a', 'a'] []
h ['h'] []
f ['f', 'f', 'f'] []
d ['d', 'd'] []
s ['s', 's', 's', 's'] []
n ['n', 'n'] []
b ['b'] []
It works as expected.
itertools._grouper objects seems to be readable only once (maybe iterators ?)
but:
li = [grp for k, grp in groupby("aahfffddssssnnb")]
list(li[0])
[]
list(li[1])
[]
It seems empty ... I don't understand why ?
This one works:
["".join(grp) for k, grp in groupby("aahfffddssssnnb")]
['aa', 'h', 'fff', 'dd', 'ssss', 'nn', 'b']
I am using version 3.9.9
Question already asked to newsgroup comp.lang.python without any answsers
grp is a sub-iterator over the same major iterator given to groupby. A new one is created for every key.
When you skip to the next key, the old grp is no longer available as you advanced the main iterator beyond the current group.
It is stated clearly in the Python documentation:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list:
k, g in groupby(data, keyfunc):
groups.append(list(g)) # Store group iterator as a list
uniquekeys.append(k)

Find cartesian product of the elements in a program generated dynamic "sub-list"

I have a program which producing and modifying a list of "n" elements/members, n remaining constant throughout a particular run of the program. (The value of "n" might change in the next run).
Each member in the list is a "sub-list"! Each of these sub-list elements are not only of variable lengths, but are also dynamic and might keep changing while the program keeps running.
So, eventually, at some given point, my list would look something like (assuming n=3):
[['1', '2'], ['a', 'b', 'c', 'd'], ['x', 'y', 'z']]
I want the output to be like the following:
['1ax', '1ay', '1az', '1bx', '1by', '1bz',
'1cx', '1cy', '1cz', '1dx', '1dy', '1dz',
'2ax', '2ay', '2az', '2bx', '2by', '2bz',
'2cx', '2cy', '2cz', '2dx', '2dy', '2dz']
i.e. a list with exactly (2 * 3 * 4) elements where each element is of length exactly 3 and has exactly 1 member from each of the "sub-lists".
Easiest is itertools.product:
from itertools import product
lst = [['1', '2'], ['a', 'b', 'c', 'd'], ['x', 'y', 'z']]
output = [''.join(p) for p in product(*lst)]
# OR
output = list(map(''.join, product(*lst)))
# ['1ax', '1ay', '1az', '1bx', '1by', '1bz',
# '1cx', '1cy', '1cz', '1dx', '1dy', '1dz',
# '2ax', '2ay', '2az', '2bx', '2by', '2bz',
# '2cx', '2cy', '2cz', '2dx', '2dy', '2dz']
A manual implementation specific to strings could look like this:
def prod(*pools):
if pools:
*rest, pool = pools
for p in prod(*rest):
for el in pool:
yield p + el
else:
yield ""
list(prod(*lst))
# ['1ax', '1ay', '1az', '1bx', '1by', '1bz',
# '1cx', '1cy', '1cz', '1dx', '1dy', '1dz',
# '2ax', '2ay', '2az', '2bx', '2by', '2bz',
# '2cx', '2cy', '2cz', '2dx', '2dy', '2dz']

Remove redundant sublists within list in python

Hello everyone I have a list of lists values such as :
list_of_values=[['A','B'],['A','B','C'],['D','E'],['A','C'],['I','J','K','L','M'],['J','M']]
and I would like to keep within that list, only the lists where I have the highest amount of values.
For instance in sublist1 : ['A','B'] A and B are also present in the sublist2 ['A','B','C'], so I remove the sublist1.
The same for sublist4.
the sublist6 is also removed because J and M were present in a the longer sublist5.
at the end I should get:
list_of_no_redundant_values=[['A','B','C'],['D','E'],['I','J','K','L','M']]
other exemple =
list_of_values=[['A','B'],['A','B','C'],['B','E'],['A','C'],['I','J','K','L','M'],['J','M']]
expected output :
[['A','B','C'],['B','E'],['I','J','K','L','M']]
Does someone have an idea ?
mylist=[['A','B'],['A','C'],['A','B','C'],['D','E'],['I','J','K','L','M'],['J','M']]
def remove_subsets(lists):
outlists = lists[:]
for s1 in lists:
for s2 in lists:
if set(s1).issubset(set(s2)) and (s1 is not s2):
outlists.remove(s1)
break
return outlists
print(remove_subsets(mylist))
This should result in [['A', 'B', 'C'], ['D', 'E'], ['I', 'J', 'K', 'L', 'M']]

Convert a string within a list to an element in the list in python

I am using python data to create a ReportLab report. I have a list that looks like this:
mylist = [['a b c d e f'],['g h i j k l']]
and want to convert it to look like this:
mylist2 = [[a,b,c,d,e],[g,h,i,j,k,l]]
the first list gives me a "List out of index" error when building the report.
the second list works in ReportLab, but columns and formatting in this list aren't what I want.
What is the best method to convert mylist 1 to mylist2 in python?
string to list can be done using split() method.
try mylist[1][0].split() and mylist[0][0].split()
Borrowing idea from Jibin Mathews, I tried the following
new_list = [mylist[0][0].split(), mylist[1][0].split()]
and it prints
[['a', 'b', 'c', 'd', 'e', 'f'], ['g', 'h', 'i', 'j', 'k', 'l']]
I saw 'f' is missing in your final list. Is that the mistake?
mylist = [['a b c d e f'],['g h i j k l']]
import re
space_re = re.compile(r'\s+')
output = []
for l in mylist:
element = l[0]
le = re.split(space_re, element)
output.append(le)
This not best answer but it will work fine.!

Resources