How to only print next iteration of a permutation - python-3.x

So basically I have a list of 26 objects (the letters of the alphabet).
I would like to be able to find what the next permutation in the sequence would be.
However, computing the entire list of permutations and storing it to iterate through would take far too much time and memory, as the total number of permutations is 26! = 403291461126605635584000000.
import itertools
print(list(itertools.permutations(['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'], 26)))
For example.
If I had a small list of say 3 letters this would be much more straight forward.
import itertools
x = list(itertools.permutations(['a','b','c'], 3))
print(x[2])

itertools.permutations returns a lazy iterator, which creates the next permutation only when requested (by calling next() on it or iterating over it). All permutations are only created if you convert it to a list (which you are doing) or iterate over all of them.
from itertools import permutations
permutation_generator = permutations("ABCD", 2)
print(next(permutation_generator))
# Output: ('A', 'B')
print(next(permutation_generator))
# Output: ('A', 'C')
print(next(permutation_generator))
# Output: ('A', 'D')
# Example of iterating with a generator and stopping after two iterations
for i, permutation in enumerate(permutation_generator):
    if i > 1:
        break
    print(permutation)
# Output:
# ('B', 'A')
# ('B', 'C')
# Generating the next five permutations in a list
permutations_first_five = [next(permutation_generator) for _ in range(5)]
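If the goal is to get the permutation that follows a given one without stepping through everything before it, the standard "next lexicographic permutation" algorithm may help. The following is a minimal sketch, not part of the original answer; the function name next_permutation is made up for illustration, and it assumes the elements are comparable and that "next" means next in lexicographic order (which matches the order of itertools.permutations when the input is sorted and has no duplicates).

```python
def next_permutation(seq):
    """Return the next lexicographic permutation of seq as a list,
    or None if seq is already the last permutation."""
    items = list(seq)
    # Find the rightmost position i where items[i] < items[i + 1]
    i = len(items) - 2
    while i >= 0 and items[i] >= items[i + 1]:
        i -= 1
    if i < 0:
        return None  # items are in descending order: no next permutation
    # Find the rightmost element greater than items[i] and swap the two
    j = len(items) - 1
    while items[j] <= items[i]:
        j -= 1
    items[i], items[j] = items[j], items[i]
    # Reverse the suffix so it becomes the smallest possible ordering
    items[i + 1:] = reversed(items[i + 1:])
    return items


print("".join(next_permutation("abc")))  # acb
print("".join(next_permutation("abcdefghijklmnopqrstuvwxyz")))  # abcdefghijklmnopqrstuvwxzy
print(next_permutation("cba"))  # None
```

Each call does work proportional to the length of the sequence, so no earlier permutations have to be generated or stored.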

Related

Keep duplicate items in list of tuples if only the first index matches between the tuples

Input [(1,3), (3,1), (1,5), (2,3), (2,4), (44,33), (33,22), (44,22), (22,33)]
Expected Output [(1,3), (1,5), (2,3), (2,4), (44,33), (44,22)]
I am trying to figure out the above and have tried lots of stuff. So far my only success has been,
for x in range(len(list1)):
    if list1[0][0] == list1[x][0]:
        print(list1[x])
Output: (1, 3) \n (1, 5)
Any sort of advice or help would be appreciated.
Use a collections.defaultdict(list) keyed by the first value, and keep only the values that are ultimately duplicated:
from collections import defaultdict # At top of file, for collecting values by first element
from itertools import chain # At top of file, for flattening result
dct = defaultdict(list)
inp = [(1,3), (3,1), (1,5), (2,3), (2,4), (44,33), (33,22), (44,22), (22,33)]
# For each tuple
for tup in inp:
    first, _ = tup # Extract first element (and verify it's actually a pair)
    dct[first].append(tup) # Collect with other tuples sharing the same first element
# Extract all lists which have two or more elements (first element duplicated at least once)
# Would be list of lists, with each inner list sharing the same first element
onlydups = [lst for firstelem, lst in dct.items() if len(lst) > 1]
# Flattens to make single list of all results (if desired)
flattened_output = list(chain.from_iterable(onlydups))
Importantly, this doesn't require ordered input, and it scales well, doing O(n) work (scaling your solution naively would produce an O(n²) solution, considerably slower for larger inputs).
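A two-pass variant with collections.Counter is another way to get the same O(n) behaviour. This is a sketch rather than part of the original answer; the names first_counts and result are made up for illustration. Unlike grouping, it keeps the surviving tuples in their original input order.

```python
from collections import Counter

inp = [(1, 3), (3, 1), (1, 5), (2, 3), (2, 4), (44, 33), (33, 22), (44, 22), (22, 33)]

# First pass: count how often each first element occurs
first_counts = Counter(first for first, _ in inp)

# Second pass: keep tuples whose first element occurs more than once
result = [tup for tup in inp if first_counts[tup[0]] > 1]

print(result)
# [(1, 3), (1, 5), (2, 3), (2, 4), (44, 33), (44, 22)]
```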
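Another approach is the following, which drops any tuple containing the same elements as an earlier tuple (so it removes reversed pairs rather than reproducing the expected output above):
def sort(L: list):
    K = []
    for i in L:
        if set(i) not in K:
            K.append(set(i))
    output = [tuple(m) for m in K]
    return output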
output :
[(1, 3), (1, 5), (2, 3), (2, 4), (33, 44), (33, 22), (44, 22)]

What is the best possible way to find the first AND the last occurrences of an element in a list in Python?

The basic way I usually use is by using the list.index(element) and reversed_list.index(element), but this fails when I need to search for many elements and the length of the list is too large say 10^5 or say 10^6 or even larger than that. What is the best possible way (which uses very little time) for the same?
You can build auxiliary lookup structures:
lst = [1,2,3,1,2,3] # super long list
last = {n: i for i, n in enumerate(lst)}
first = {n: i for i, n in reversed(list(enumerate(lst)))}
last[3]
# 5
first[3]
# 2
The construction of the lookup dicts takes linear time, but then the lookup itself is constant.
Calls to list.index(), by contrast, take linear time, and doing so repeatedly is then quadratic (given that the number of lookups you make depends on the size of the list).
You could also build a single structure in one iteration:
from collections import defaultdict
lookup = defaultdict(lambda: [None, None])
for i, n in enumerate(lst):
    lookup[n][1] = i
    if lookup[n][0] is None:
        lookup[n][0] = i
lookup[3]
# [2, 5]
lookup[2]
# [1, 4]
Well, someone needs to do the work of finding the element, and in a large list this can take time! Without more information or a code example, it'll be difficult to help you, but usually the go-to answer is to use another data structure. For example, if you can keep your elements in a dictionary instead of a list, with the key being the element and the value being an array of indices, lookups will be much quicker.
You can just remember first and last index for every element in the list:
In [9]: l = [random.randint(1, 10) for _ in range(100)]
In [10]: first_index = {}
In [11]: last_index = {}
In [12]: for idx, x in enumerate(l):
    ...:     if x not in first_index:
    ...:         first_index[x] = idx
    ...:     last_index[x] = idx
    ...:
In [13]: [(x, first_index.get(x), last_index.get(x)) for x in range(1, 11)]
Out[13]:
[(1, 3, 88),
(2, 23, 90),
(3, 10, 91),
(4, 13, 98),
(5, 11, 57),
(6, 4, 99),
(7, 9, 92),
(8, 19, 95),
(9, 0, 77),
(10, 2, 87)]
In [14]: l[0]
Out[14]: 9
Your approach sounds good, I did some testing and:
import numpy as np
long_list = list(np.random.randint(0, 100_000, 100_000_000))
# This takes 10ms on my machine
long_list.index(999)
# This takes 1,100ms on my machine
long_list[::-1].index(999)
# This takes 1,300ms on my machine
list(reversed(long_list)).index(999)
# This takes 200ms on my machine
long_list.reverse()
long_list.index(999)
long_list.reverse()
But at the end of the day, a Python list does not seem like the best data structure for this.
As others have suggested, you can build a dict:
indexes = {}
for i, val in enumerate(long_list):
    if val in indexes:
        indexes[val].append(i)
    else:
        indexes[val] = [i]
This is memory expensive, but solves your problem (depends on how often you modify the original list).
You can then do:
# This takes 0.02ms on my machine
ix = indexes.get(999)
ix[0], ix[-1]

numpy selecting elements in sub array using slicing [duplicate]

I have a list like this:
a = [[4.0, 4, 4.0], [3.0, 3, 3.6], [3.5, 6, 4.8]]
I want an outcome like this (EVERY first element in the list):
4.0, 3.0, 3.5
I tried a[::1][0], but it doesn't work
You can get the index [0] from each element in a list comprehension
>>> [i[0] for i in a]
[4.0, 3.0, 3.5]
Use zip:
rows = a # the list of rows from the question
columns = list(zip(*rows)) # transpose rows to columns (list() makes the result indexable in Python 3)
print(columns[0]) # print the first column
# you can also do more with the columns
print(columns[1]) # or print the second column
columns.append([7,7,7]) # add a new column to the end
backToRows = list(zip(*columns)) # now we are back to rows with a new column
print(backToRows)
You can also use numpy:
import numpy
a = numpy.array(a)
print(a[:,0])
Edit:
In Python 3, a zip object is not subscriptable. It needs to be converted to a list before it can be indexed:
columns = list(zip(*rows))
You could use this:
import numpy as np
a = ((4.0, 4, 4.0), (3.0, 3, 3.6), (3.5, 6, 4.8))
a = np.array(a)
a[:,0]
which returns array([4. , 3. , 3.5])
You can get it like
[ x[0] for x in a]
which will return a list of the first element of each list in a
Compared the 3 methods
2D list: 5.323603868484497 seconds
Numpy library : 0.3201274871826172 seconds
Zip (Thanks to Joran Beasley) : 0.12395167350769043 seconds
import time
import numpy as np

D2_list = [list(range(100))] * 100

# Method 1: list comprehension over the rows of the 2D list
t1 = time.time()
for i in range(10**5):
    for j in range(10):
        b = [k[j] for k in D2_list]
D2_list_time = time.time() - t1

# Method 2: numpy column slicing
array = np.array(D2_list)
t1 = time.time()
for i in range(10**5):
    for j in range(10):
        b = array[:, j]
Numpy_time = time.time() - t1

# Method 3: transpose once with zip, then index the columns
D2_trans = list(zip(*D2_list))
t1 = time.time()
for i in range(10**5):
    for j in range(10):
        b = D2_trans[j]
Zip_time = time.time() - t1

print('2D List:', D2_list_time)
print('Numpy:', Numpy_time)
print('Zip:', Zip_time)
The Zip method works best.
It was quite useful when I had to do some column wise processes for mapreduce jobs in the cluster servers where numpy was not installed.
If you have access to numpy,
import numpy as np
a = np.array(a) # convert the nested list to an array first
a_transposed = a.T
# Get first row of the transposed array (the first column of a)
print(a_transposed[0])
The benefit of this method is that if you want the "second" element of each row in a 2D list, all you have to do now is a_transposed[1]. The a_transposed object is already computed, so you do not need to recalculate it.
Description
Finding the first element of each row in a 2D list can be rephrased as finding the first column of the 2D list. Because your data structure is a list of rows, an easy way of sampling the value at the first index in every row is to transpose the matrix and take its first row.
Try using
for i in a:
    print(i[0])
Here i represents an individual row in a, so i[0] represents the first element of each row.

Effective ways to group things into list

I am doing a K-means project and I have to do it by hand, which is why I am trying to figure out what is the best ways to group things according to their last values into a list or a dictionary. Here is what I am talking about
list_of_tuples = [("honey",1),("bee",2),("tree",5),("flower",2),("computer",5),("key",1)]
Now my ultimate goal is to sort the list into 3 different lists, each with its respective elements:
"""This is the goal"""
list_1 = ["honey","key"]
list_2 = ["bee","flower"]
list_3 = ["tree","computer"]
I can use a lot of if statements and a for loop, but is there a more efficient way to do it?
If you're not opposed to using something like pandas, you could do something along these lines:
import pandas as pd
list_1, list_2, list_3 = pd.DataFrame(list_of_tuples).groupby(1)[0].apply(list).values
Result:
In [19]: list_1
Out[19]: ['honey', 'key']
In [20]: list_2
Out[20]: ['bee', 'flower']
In [21]: list_3
Out[21]: ['tree', 'computer']
Explanation:
pd.DataFrame(list_of_tuples).groupby(1) groups your list of tuples by the value at index 1, then you extract the values as lists of index 0 with [0].apply(list).values. This gives you an array of lists as below:
array([list(['honey', 'key']), list(['bee', 'flower']),
list(['tree', 'computer'])], dtype=object)
Something to that effect can be achieved with a dictionary and a for loop, using the second element of each tuple as the key.
list_of_tuples = [("honey",1),("bee",2),("tree",5),("flower",2),("computer",5),("key",1)]
dict_list = {}
for t in list_of_tuples:
    # create key and a single element list if key doesn't exist yet
    # append to existing list otherwise
    if t[1] not in dict_list:
        dict_list[t[1]] = [t[0]]
    else:
        dict_list[t[1]].append(t[0])
list_1, list_2, list_3 = dict_list.values()
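The same grouping can be written a little more compactly with collections.defaultdict, which creates the empty list for a new key automatically. This is a sketch under the assumption that the items are strings, as in the answer above; the names groups, name and label are made up for illustration.

```python
from collections import defaultdict

list_of_tuples = [("honey", 1), ("bee", 2), ("tree", 5),
                  ("flower", 2), ("computer", 5), ("key", 1)]

# Map each cluster label to the list of items carrying that label
groups = defaultdict(list)
for name, label in list_of_tuples:
    groups[label].append(name)

print(dict(groups))
# {1: ['honey', 'key'], 2: ['bee', 'flower'], 5: ['tree', 'computer']}
```

Keeping the result as a dictionary rather than unpacking it into three variables also works when the number of clusters is not known in advance.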

Check if element is occurring very first time in python list

I have a list with values occurring multiple times. I want to loop over the list and check if value is occurring very first time.
For eg: Let's say I have a one list like ,
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
Now, at every first occurrence of element, I want to perform some set of tasks.
How to get the first occurrence of element?
Thanks in Advance!!
Use a set to check if you had processed that item already:
visited = set()
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
for e in L:
    if e not in visited:
        visited.add(e)
        # process first time tasks
    else:
        # process not first time tasks
        pass
You can use unique_everseen from itertools recipes.
This function returns a generator which yields only the first occurrence of each element.
Code
from itertools import filterfalse
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Example
lst = ['a', 'a', 'b', 'c', 'b']
for x in unique_everseen(lst):
    print(x) # Do something with the element
Output
a
b
c
The function unique_everseen also accepts a key for comparing elements. This is useful in many cases, for example if you also need to know the position of each first occurrence.
Example
lst = ['a', 'a', 'b', 'c', 'b']
for i, x in unique_everseen(enumerate(lst), key=lambda x: x[1]):
    print(i, x)
Output
0 a
2 b
3 c
Why not use this?
L = ['a','a','a','b','b','b','b','b','e','e','e'.......]
for idxL, L_idx in enumerate(L):
    if L.index(L_idx) == idxL:
        print("This is first occurrence")
For very long lists, this is less efficient than tracking seen elements in a set, since each L.index() call scans the list from the start (making the loop quadratic overall), but it seems more direct to write.
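If the position of each first occurrence is needed without the repeated scans, a single pass with a dictionary is one option. This is a sketch only; the sample list is a shortened stand-in for the one in the question, and the names first_index and first_hits are made up for illustration.

```python
L = ['a', 'a', 'a', 'b', 'b', 'e', 'e']  # shortened sample list (assumed)

first_index = {}   # element -> index of its first occurrence
first_hits = []    # (index, element) pairs seen for the first time

for idx, item in enumerate(L):
    # setdefault stores idx only if item is new, and returns the stored index
    if first_index.setdefault(item, idx) == idx:
        first_hits.append((idx, item))  # first-occurrence tasks would go here

print(first_hits)   # [(0, 'a'), (3, 'b'), (5, 'e')]
print(first_index)  # {'a': 0, 'b': 3, 'e': 5}
```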
