How can I generate all possible combinations of the items in a list, STORE them in different lists, and access them later? I have given an example below (python-3.x).

# To generate possible combinations
from itertools import combinations
main_list=('a1','a2','a3','a4')
abc=combinations(main_list,3)
for i in list(abc):
    print(i)
# Creating number of empty lists
n=6
obj={}
for i in range(n):
    obj['set'+str(i)]=()
# I want to combine these: take each list generated by combinations and store it in its own set, e.g. list1 in set1.
I want to generate all possible combinations of the items in a list and STORE them in different lists. E.g. for main_list=('a1','a2','a3') I want combination lists like set1=('a1'), set2=('a2'), set3=('a3'), set4=('a1','a2'), set5=('a1','a3'), set6=('a2','a3'), set7=('a1','a2','a3'). How can I then access the lists set1, set2, ...?

If I understand correctly, you want to generate the power set, i.e. the set of all subsets of the list. The itertools documentation provides a nice recipe for this:
from itertools import chain, combinations
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

While Nathan's answer helps with the first part of the question, my code will help you make a dictionary, so you can access the sets like sets['set1'] as asked.
from itertools import chain, combinations
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
def make_dict(comb):
    sets = {}
    # 'subset' rather than 'set', to avoid shadowing the built-in
    for i, subset in enumerate(comb):
        sets['set'+str(i)] = subset
    return sets
if __name__ == '__main__':
    sets = make_dict(powerset(['a1','a2','a3','a4']))
    print(sets['set1'])
Output
('a1',)
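Note that the power set includes the empty subset, so numbering from 0 puts the empty tuple at 'set0'. If you would rather have the numbering match the question (set1 to set7 for three items, with no empty subset), one possible variation is sketched below. The names nonempty_subsets and make_numbered_dict are made up for this illustration; they are not part of itertools.

```python
from itertools import chain, combinations

def nonempty_subsets(iterable):
    """Return every subset of size 1..len(iterable) as a tuple (hypothetical helper)."""
    s = list(iterable)
    # Start r at 1 so the empty tuple is skipped
    return chain.from_iterable(combinations(s, r) for r in range(1, len(s) + 1))

def make_numbered_dict(subsets):
    """Map 'set1', 'set2', ... to the subsets, in generation order."""
    return {f"set{i}": subset for i, subset in enumerate(subsets, start=1)}

sets = make_numbered_dict(nonempty_subsets(("a1", "a2", "a3")))
print(sets["set1"])  # ('a1',)
print(sets["set4"])  # ('a1', 'a2')
print(len(sets))     # 7
```

A plain list would work just as well if the names themselves are not needed; the dictionary only mirrors the set1, set2, ... naming from the question.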

Related

Find an index in a list of lists using an index inside one of the lists in Python

I'm trying to determine whether there is a way to look up an entry in a list of lists, where each inner list holds a tuple of grid coordinates, e.g.:
example = [
['a', (0,0)], ['b',(0,1)], ['c', (0,2)],
['d', (1,0)], ['e',(1,1)], ['d', (1,2)],
.....
]
and so on.
So, if I have the coordinates (0,1), I want to be able to return example[1][0], or at the very least example[1], since these coordinates correspond to example[1].
I tried using index(), but this doesn't go deep enough. I also looked into itertools, but I cannot find a tool that finds it and doesn't return a boolean.
Using a number pad as an example:
from itertools import chain
def pinpad_test():
    pad=[
        ['1',(0,0)],['2',(0,1)],['3',(0,2)],
        ['4',(1,0)],['5',(1,1)],['6',(1,2)],
        ['7',(2,0)],['8',(2,1)],['9',(2,2)],
        ['0',(3,1)]
    ]
    tester = '1234'
    print(tester)
    for dig in tester:
        print(dig)
        if dig in chain(*pad):
            print(f'Digit {dig} in pad')
        else:
            print('Failed')
    print('end of tester')
    new_test = pad.index((0,1)in chain(*pad))
    print(new_test)
if __name__ == '__main__':
    pinpad_test()
I get a ValueError on the line that assigns new_test.
You can just call next() on a simple generator expression:
coords = (0, 1)
idx = next((sub_l[0] for sub_l in pad if sub_l[1] == coords), None)
print(idx)
2
You can create a function that will give you what you want:
def on_coordinates(coordinates:tuple, list_coordinates:list):
    return next(x for x in list_coordinates if x[1] == coordinates)
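As for why the original attempt fails: (0,1) in chain(*pad) is evaluated first and produces a boolean, so pad.index(...) ends up searching pad for True, which is not one of its elements, hence the ValueError. If many lookups are needed, one option is to build a dictionary keyed by the coordinates once and reuse it. This is only a sketch; pad_by_coords is a name invented here.

```python
pad = [
    ['1', (0, 0)], ['2', (0, 1)], ['3', (0, 2)],
    ['4', (1, 0)], ['5', (1, 1)], ['6', (1, 2)],
    ['7', (2, 0)], ['8', (2, 1)], ['9', (2, 2)],
    ['0', (3, 1)],
]

# Map each coordinate tuple to (position in pad, label); built once, O(1) per lookup
pad_by_coords = {coords: (i, label) for i, (label, coords) in enumerate(pad)}

index, label = pad_by_coords[(0, 1)]
print(index, label)               # 1 2
print(pad_by_coords.get((9, 9)))  # None, since no key sits at (9, 9)
```

The generator-based answers scan the list on every call, which is fine for a ten-key pad; the dictionary only pays off when the same grid is queried repeatedly.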

return unique python lists of chars ignoring order

Problem:
Consider a python list of lists that contains a sequence of chars:
[['A', 'B'],['A','B','C'],['B','A'],['C','A','B'],['D'],['D'],['Ao','B']]
The goal is to return the unique lists, regardless of order:
[['A','B'],['A','B','C'],['D'],['Ao','B']]
Attempt:
I'm able to achieve my goal using many if/else statements with try/exceptions. What would be the most pythonic (faster) way to approach this problem? Thanks!
def check_duplicates(x,list_):
    for li in list_:
        if compare(x,li):
            return True
def compare(s, t):
    t = list(t) # make a mutable copy
    try:
        for elem in s:
            t.remove(elem)
    except ValueError:
        return False
    return not t
vars_list = [['A', 'B'],['A','B','C'],['B','A'],['C','A','B'],['D'],['D'],['Ao','B']]
second_list = []
for i in vars_list:
    if check_duplicates(i,second_list):
        continue
    else:
        second_list.append(i)
        print(i)
Assuming that the elements of the nested lists are hashable, you can isolate the unique collections by constructing a set of frozensets from the nested list:
unique_sets = {frozenset(l) for l in vars_list}
# {frozenset({'D'}),
# frozenset({'A', 'B'}),
# frozenset({'A', 'B', 'C'}),
# frozenset({'Ao', 'B'})}
If you need a list-of-lists as the output, you can obtain one trivially with [list(s) for s in unique_sets].
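Two caveats with the frozenset approach: the original order of the lists is lost, and repeated elements inside a list are collapsed (['A','A','B'] and ['A','B'] would count as the same). If the first occurrence of each list should be kept in its original order, a sketch along these lines may help. unique_ignoring_order is a hypothetical name, and the elements are still assumed to be hashable.

```python
def unique_ignoring_order(lists):
    """Keep the first occurrence of each list, comparing contents as sets."""
    seen = set()
    result = []
    for item in lists:
        key = frozenset(item)  # order-insensitive; also ignores repeats within a list
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result

vars_list = [['A', 'B'], ['A', 'B', 'C'], ['B', 'A'], ['C', 'A', 'B'], ['D'], ['D'], ['Ao', 'B']]
print(unique_ignoring_order(vars_list))
# [['A', 'B'], ['A', 'B', 'C'], ['D'], ['Ao', 'B']]
```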

faster method for comparing two lists element-wise

I am building a relational DB using python. So far I have two tables, as follows:
>>> df_Patient.columns
[1] Index(['NgrNr', 'FamilieNr', 'DosNr', 'Geslacht', 'FamilieNaam', 'VoorNaam',
'GeboorteDatum', 'PreBirth'],
dtype='object')
>>> df_LaboRequest.columns
[2] Index(['RequestId', 'IsComplete', 'NgrNr', 'Type', 'RequestDate', 'IntakeDate',
'ReqMgtUnit'],
dtype='object')
The two tables are quite big:
>>> df_Patient.shape
[3] (386249, 8)
>>> df_LaboRequest.shape
[4] (342225, 7)
The NgrNr column of df_LaboRequest is a foreign key (FK) that references the column of the same name in df_Patient. To avoid any integrity errors, I need to make sure that all the values in df_LaboRequest['NgrNr'] are present in df_Patient['NgrNr'].
With list comprehension I tried the following (to pick up the values that would throw an error):
[x for x in list(set(df_LaboRequest['NgrNr'])) if x not in list(set(df_Patient['NgrNr']))]
Though this is taking ages to complete. Would anyone recommend a faster method ("method" in the general sense of procedure, not in the Python sense) for such a comparison?
One-liners aren't always better.
Don't check for membership in lists. Why on earth would you create a set (which is the recommended data structure for O(1) membership checks) and then cast it to a list which has O(N) membership checks?
Build the set of df_Patient values once, outside the list comprehension, and use that instead of rebuilding it on every iteration:
patients = set(df_Patient['NgrNr'])
lab_requests = set(df_LaboRequest['NgrNr'])
result = [x for x in lab_requests if x not in patients]
Or, if you like to use set operations, simply find the difference of both sets:
result = lab_requests - patients
Alternatively, use the pandas isin() method on the Series themselves rather than on sets:
patients = df_Patient['NgrNr'].drop_duplicates()
lab_requests = df_LaboRequest['NgrNr'].drop_duplicates()
result = lab_requests[~lab_requests.isin(patients)]
Let's test how much faster these changes make the code:
import pandas as pd
import random
import timeit
# Make dummy dataframes of patients and lab_requests
randoms = [random.randint(1, 1000) for _ in range(10000)]
patients = pd.DataFrame("patient{0}".format(x) for x in randoms[:5000])[0]
lab_requests = pd.DataFrame("patient{0}".format(x) for x in randoms[2000:8000])[0]
# Do it your way
def fun1(pat, lr):
    return [x for x in list(set(lr)) if x not in list(set(pat))]
# Do it my way: Set operations
def fun2(pat, lr):
    pat_s = set(pat)
    lr_s = set(lr)
    return lr_s - pat_s
# Or explicitly iterate over the set
def fun3(pat, lr):
    pat_s = set(pat)
    lr_s = set(lr)
    return [x for x in lr_s if x not in pat_s]
# Or using pandas
def fun4(pat, lr):
    pat = pat.drop_duplicates()
    lr = lr.drop_duplicates()
    return lr[~lr.isin(pat)]
# Make sure all 4 functions return the same thing
assert set(fun1(patients, lab_requests)) == set(fun2(patients, lab_requests)) == set(fun3(patients, lab_requests)) == set(fun4(patients, lab_requests))
# Time it
timeit.timeit('fun1(patients, lab_requests)', 'from __main__ import patients, lab_requests, fun1', number=100)
# Output: 48.36615000000165
timeit.timeit('fun2(patients, lab_requests)', 'from __main__ import patients, lab_requests, fun2', number=100)
# Output: 0.10799920000044949
timeit.timeit('fun3(patients, lab_requests)', 'from __main__ import patients, lab_requests, fun3', number=100)
# Output: 0.11038020000069082
timeit.timeit('fun4(patients, lab_requests)', 'from __main__ import patients, lab_requests, fun4', number=100)
# Output: 0.32021789999998873
Looks like we have a ~150x speedup with pandas and a ~450x speedup with set operations!
I don't have pandas installed right now to try this. But you could try removing the list(..) cast. I don't think it adds anything meaningful to the program, and sets are much faster for lookups, e.g. x in set(...), than lists.
You could also try doing this with the pandas API rather than lists and sets; sometimes this is faster. Try searching for unique. Then you could compare the sizes of the two columns and, if they are the same, sort them and do an equality check.
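To see which rows of the request table would violate the foreign key, rather than just which values, an isin() mask can be applied to the whole dataframe. The sketch below uses tiny made-up frames that only borrow the column names from the question; the data is invented.

```python
import pandas as pd

# Toy stand-ins for the real tables (values are invented for illustration)
df_Patient = pd.DataFrame({"NgrNr": [1, 2, 3, 4]})
df_LaboRequest = pd.DataFrame({
    "RequestId": [10, 11, 12, 13],
    "NgrNr": [2, 5, 3, 6],
})

# True where the request's NgrNr has no matching patient
orphan_mask = ~df_LaboRequest["NgrNr"].isin(df_Patient["NgrNr"])
orphans = df_LaboRequest[orphan_mask]

print(orphans["RequestId"].tolist())  # [11, 13]
```

Keeping the full rows makes it easier to decide what to do with the offending requests (drop them, or add the missing patients) before loading the tables.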

Python/Pandas element wise union of 2 Series containing sets in each element

I have 2 pandas data Series that I know are the same length. Each Series contains sets() in each element. I want to figure out a computationally efficient way to get the element wise union of these two Series' sets. I've created a simplified version of the code with fake and short Series to play with below. This implementation is a VERY inefficient way of doing this. There has GOT to be a faster way to do this. My real Series are much longer and I have to do this operation hundreds of thousands of times.
import pandas as pd
set_series_1 = pd.Series([{1,2,3}, {'a','b'}, {2.3, 5.4}])
set_series_2 = pd.Series([{2,4,7}, {'a','f','g'}, {0.0, 15.6}])
n = set_series_1.shape[0]
for i in range(0,n):
    set_series_1[i] = set_series_1[i].union(set_series_2[i])
print set_series_1
>>> set_series_1
0 set([1, 2, 3, 4, 7])
1 set([a, b, g, f])
2 set([0.0, 2.3, 15.6, 5.4])
dtype: object
I've tried combining the Series into a data frame and using the apply function, but I get an error saying that sets are not supported as dataframe elements.
pir4
After testing several options, I finally came up with a good one... pir4 below.
Testing
def jed1(s1, s2):
    s = s1.copy()
    n = s1.shape[0]
    for i in range(n):
        s[i] = s2[i].union(s1[i])
    return s
def pir1(s1, s2):
    return pd.Series([item.union(s2[i]) for i, item in enumerate(s1.values)], s1.index)
def pir2(s1, s2):
    # Series.iteritems() was removed in pandas 2.0; items() is the equivalent
    return pd.Series([item.union(s2[i]) for i, item in s1.items()], s1.index)
def pir3(s1, s2):
    return s1.apply(list).add(s2.apply(list)).apply(set)
def pir4(s1, s2):
    return pd.Series([set.union(*z) for z in zip(s1, s2)])
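For reference, here is how the zip-based approach behaves on the small Series from the question. Note that pir4 builds its result with a fresh default index rather than reusing s1.index; the variant below passes index=s1.index, which is an addition made here in case the Series are not indexed 0..n-1. The name pir4_keep_index is made up for this sketch.

```python
import pandas as pd

def pir4_keep_index(s1, s2):
    # zip pairs the sets positionally; set.union(*z) merges each pair
    return pd.Series([set.union(*z) for z in zip(s1, s2)], index=s1.index)

set_series_1 = pd.Series([{1, 2, 3}, {'a', 'b'}, {2.3, 5.4}])
set_series_2 = pd.Series([{2, 4, 7}, {'a', 'f', 'g'}, {0.0, 15.6}])

result = pir4_keep_index(set_series_1, set_series_2)
print(sorted(result[0]))  # [1, 2, 3, 4, 7]
```

Because zip pairs elements by position, this assumes both Series are in the same order; it does not align them by index label.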

Python 3.x random input for lists

from random import randint
List = [randint(0,99)*20]
print(List)
How could I turn this into a list of 20 different random numbers between 0 and 99? The code I have ends up multiplying one random number by 20.
You can use a list comprehension:
List = [randint(0,99) for i in range(20)]
from random import randint
List = [randint(0,99) for i in range(20)]
print("%3s" % List)
List.sort()
print("%3s" % List)
My primary language is Java. I hope this helps!
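One thing to watch: a list comprehension over randint draws each number independently, so the same value can appear more than once. If "20 different random numbers" means 20 distinct values, random.sample is one way to get that. A small sketch (the variable names are arbitrary):

```python
from random import randint, sample

# 20 independent draws from 0..99; repeats are possible
with_repeats = [randint(0, 99) for _ in range(20)]

# 20 distinct values drawn from 0..99 without replacement
distinct = sample(range(100), 20)

print(len(with_repeats), len(distinct))  # 20 20
print(len(set(distinct)))                # 20, every value is unique
```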

Resources