Python construction of value set dictionary - python-3.x

I've been working with list comprehensions recently and came across a problem I can't seem to solve.
Let's say I have pairs of the form:
A,B,C,X="ABCX"
init = {(A,B),(B,C),(C,X)}
I am trying to construct a dictionary where each key is an individual letter and each value is the set of all letters it is connected to, so:
{A:{B},B:{A,C},C:{B,X},X:{C}}
Things I tried:
final_dict = {k : {j for p,j in init if p==k} for k,v in init}
but this only finds the partner when it is in the second position of the pair.
Trying to also cover the first position:
final_dict = {k : {j for p,j in init if p==k or p if j == k} for k,v in init}
An error occurs.

Here is a solution without a dict comprehension:
init = {('A','B'),('B','C'),('C','X')}
d = {}
for k, v in init:
    d.setdefault(k, set()).add(v)
    d.setdefault(v, set()).add(k)
The problem is that, with your current data format, each pair is stored in only one direction, so a comprehension cannot easily derive every key and its values. Ideally the data would contain both directions:
init = {('A','B'),('B','A'),('B','C'),('C','B'),('C','X'),('X','C')}
You can obtain this as follows if you don't want to (or can't) change how you currently produce the pairs:
init2 = {(y, x) for x, y in init}.union(init)
So you can then do
d = { key : { v for k, v in init2 if k == key } for key, _ in init2 }
There is also the following, but it does not produce a key for X, and making it work with the current data format would probably make it much longer:
d = { k : {v1 if k == v2 else v for v1, v2 in init } for k, v in init }
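A single comprehension can also work directly on the original one-directional pairs if the outer loop walks over both elements of every pair. The following is only a sketch of that idea, reusing the `init` pairs from the question with string literals:

```python
init = {("A", "B"), ("B", "C"), ("C", "X")}

# The outer loops visit every letter of every pair, so X also becomes a key.
# The inner set takes, from each pair that contains k, the element that is not k.
final_dict = {
    k: {b if a == k else a for a, b in init if k in (a, b)}
    for pair in init
    for k in pair
}
print(final_dict)
```

This rescans `init` once per letter, which is fine for small inputs; the `setdefault` loop above does the same job in a single pass.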

Related

inner function changing the variable value of outer function

import numpy as np

def swap(i,r,c,mat):
    for j in range(i+1,c):
        if(abs(mat[j][j])>0):
            mat[[j,i]] = mat[[i,j]]
            break
    return mat
def upper_triMat(matA,r,c):
    np.set_printoptions(precision=4)
    # forward elimination
    for i in range(0,c-1):
        if matA[i][i] == 0:
            matA = swap(i,r,c,matA)
        for j in range(i+1,r):
            multiplier = matA[j][i]/matA[i][i]
            for k in range(0,c):
                matA[j][k] = matA[j][k] - multiplier*matA[i][k]
    return matA
def dolittle(A):
    A = np.array(A)
    r,c = np.shape(A)
    print(A)
    U = upper_triMat(A,r,c) # after this call, A holds the same values as U
    print(A)
    l = np.eye(r,c)
    for i in range(0,r-1):
        for j in range(i+1,r):
            sum = 0
            for k in range(0,r):
                if i != k:
                    sum = sum + U[k][i]*l[j][k]
            l[j][i] = (A[j][i]-sum)/U[i][i]
    return l,U
A = [[3,-0.1,-0.2],
     [0.1,7,-0.3],
     [0.3,-0.2,10]]
dolittle(A)
When I call the upper_triMat function, A changes inside the dolittle function. Why? A is A, and the result of upper_triMat is assigned to U, but A also ends up with the value of U. I am using Jupyter Notebook and doing an LU decomposition.
upper_triMat mutates its parameter matA. Since matA refers to the same array object as A, A is modified as well.
You could fix it by passing a copy:
U = upper_triMat(A.copy(),r,c) # pass a copy of the array instead of a reference to the original
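The same aliasing effect can be reproduced without NumPy. The sketch below uses plain nested lists and a hypothetical function `halve_in_place` (not part of the question); here `copy.deepcopy` plays the role that `A.copy()` plays for a NumPy array.

```python
import copy


def halve_in_place(rows):
    # Hypothetical example function: it modifies the object it was given.
    for row in rows:
        for k in range(len(row)):
            row[k] = row[k] / 2
    return rows


A = [[2.0, 4.0], [6.0, 8.0]]

U = halve_in_place(copy.deepcopy(A))  # works on an independent copy
unchanged = (A == [[2.0, 4.0], [6.0, 8.0]])
print(unchanged)  # A has not been touched

V = halve_in_place(A)  # no copy: the function mutates A itself
print(V is A)  # V and A are the same object
```

Assigning the return value to a new name does not create a new object; only an explicit copy does.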

How do I remove all duplicate entries from one nested dictionary that appear in another?

Assuming my dictionaries are set up like this:
dict_a = {"first_key": {"second_key": "value1", "third_key": "value2"}}
dict_b = {"first_key": {"third_key": "value2"}}
I want to be left with this:
dict_a = {"first_key": {"second_key": "value1"}}
I've tried a few different ways of getting there like this:
dict(dict_a.items() - dict_b.items())
But that tells me dicts are unhashable. Trying this method:
dict_c = {k:dict_a[k] for k in dict_a if k not in dict_b}
Leaves me with an empty dictionary. I also tried this:
for k, v in dict_b.items():
    if (k, v) in dict_a.items():
        dict_a.pop(k, v)
But again, no luck there. It ended up not modifying dict_a at all.
additional_key_to_remove = []
for key, value in dict_b.items():
    if key not in dict_a:
        continue  # nothing to remove for keys missing from dict_a
    if isinstance(dict_a[key], dict) and isinstance(value, dict):
        # both values are dictionaries: drop the shared nested keys
        sub_dict = dict_a[key]
        for k in value:
            sub_dict.pop(k, None)
    elif isinstance(dict_a[key], str) and isinstance(value, str):
        # collect the non-nested keys for later removal
        additional_key_to_remove.append(key)
for key in additional_key_to_remove:
    del dict_a[key]
print(dict_a)
Output:
{'first_key': {'second_key': 'value1'}}
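The loop above removes a key whenever it appears in both dictionaries, even if the two values differ. If only entries whose key and value both match should be removed, and the nesting may be deeper than two levels, a recursive variant is one option. This is a sketch; `remove_shared` is a name made up for the example.

```python
def remove_shared(a, b):
    """Remove from dict a (in place) every entry that also appears in dict b."""
    for key, b_val in b.items():
        if key not in a:
            continue
        a_val = a[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            # Both sides are nested: compare their contents recursively.
            remove_shared(a_val, b_val)
        elif a_val == b_val:
            # Same key and same value: this entry is a duplicate.
            del a[key]
    return a


dict_a = {"first_key": {"second_key": "value1", "third_key": "value2"}}
dict_b = {"first_key": {"third_key": "value2"}}
print(remove_shared(dict_a, dict_b))
```

Note that a nested dictionary whose entries are all removed is left behind as an empty dict rather than deleted.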

Pulling values from lists nested in a dictionary using for loop

Background on what I am doing for context:
I used a scraping tool to return prices for items on various sites to compare them. The information was originally stored as nested dictionaries of the form
{'55" 4K HDR': {'BEST BUY': 279.99, "KOHL'S": 279.99,'TARGET': 279.99},
'55" 4K UHD LED': {'BEST BUY': 329.99,'COSTCO': 349.99,'TARGET': 329.99, 'WALMART': 328.0}...}
and so on. I used for loops to then reorder the nested dictionaries to only have the lowest price, but in doing so converted them to lists.
def sortKey(keyValue):
    g = {}
    for k, subdic in keyValue.items():
        g[k] = {}
        for subk, v in sorted(subdic.items(), key=lambda x: x[1], reverse=True):
            g[k] = [subk, v]
    return g
This resulted in the following output
{'55" 4K HDR': ['BEST BUY', 279.99],
'55" 4K UHD LED': ['WALMART', 328.0]...}
Now I am trying to switch the format of the nested lists into a single dictionary so I can use a greedy algorithm to find all the ways I can spend a certain budget. I am hoping to get an output like
{'55" 4K HDR': 279.99, '55" 4K UHD LED': 328.0...}
and so on. I am trying to use a similar for loop to the one I used before
def greedyKey(keyGreed):
    f= {}
    for g, subGreed in keyGreed.items():
        f[g] = ()
        for subg, v in subGreed:
            f = v
    return f
but am getting
ValueError: too many values to unpack (expected 2)
I know this has to do with the values of my lists, but I am confused because I thought each nested list only had 2 values
['WALMART', 328.0]
Minimal executable example:
import pprint
dataDict = {'55" 4K HDR': {'BEST BUY': 279.99, "KOHL'S": 279.99,'TARGET': 279.99},
            '55" 4K UHD LED': {'BEST BUY': 329.99,'COSTCO': 349.99,'TARGET': 329.99, 'WALMART': 328.0}}
def sortKey(keyValue):
    g = {}
    for k, subdic in keyValue.items():
        g[k] = {}
        for subk, v in sorted(subdic.items(), key=lambda x: x[1], reverse=True):
            g[k] = [subk, v]
    return g
def greedyKey(keyGreed):
    f= {}
    for g, subGreed in keyGreed.items():
        f[g] = ()
        for subg, v in subGreed:
            f = v
    return f
masterList = sortKey(dataDict)
pprint.pprint(masterList)
greedyList = greedyKey(masterList)
pprint.pprint(greedyList)
Please refer to this answer which states:
Python employs assignment unpacking when you have an iterable being assigned to multiple variables.
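The loop for subg, v in subGreed iterates over the elements of the list, so on its first pass it tries to unpack the store name (a string) into two variables, which raises the ValueError. What you want is a single unpacking: subg, v = subGreed is equivalent to subg, v = subGreed[0], subGreed[1].
There is no need to initialise f[g] to a tuple, either, and f = v replaces the whole dictionary instead of setting f[g].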
def greedyKey(keyGreed):
    f = {}
    for g, subGreed in keyGreed.items():
        subg, v = subGreed
        f[g] = v
    return f
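If the intermediate lists are not needed for anything else, the item-to-lowest-price mapping can also be built straight from the original nested dictionary with `min`. A sketch using the data from the question; `lowest` and `cheapest_store` are names chosen for the example:

```python
dataDict = {
    '55" 4K HDR': {'BEST BUY': 279.99, "KOHL'S": 279.99, 'TARGET': 279.99},
    '55" 4K UHD LED': {'BEST BUY': 329.99, 'COSTCO': 349.99,
                       'TARGET': 329.99, 'WALMART': 328.0},
}

# Lowest price per item.
lowest = {item: min(prices.values()) for item, prices in dataDict.items()}

# Store offering that price (the first one listed when several stores tie).
cheapest_store = {item: min(prices, key=prices.get)
                  for item, prices in dataDict.items()}

print(lowest)
print(cheapest_store)
```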

Pyspark Runtime Error Dictionary Changed size during iteration [duplicate]

I have an object like this:
{hello: 'world', "foo.0.bar": v1, "foo.0.name": v2, "foo.1.bar": v3}
It should be expanded to:
{ hello: 'world', foo: [{'bar': v1, 'name': v2}, {bar: v3}]}
I wrote the code below, which splits each key on '.', removes the old key, and adds the new nested key if the key contains '.', but it fails with RuntimeError: dictionary changed size during iteration.
def expand(obj):
    for k in obj.keys():
        expandField(obj, k, v)
def expandField(obj, f, v):
    parts = f.split('.')
    if(len(parts) == 1):
        return
    del obj[f]
    for i in xrange(0, len(parts) - 1):
        f = parts[i]
        currobj = obj.get(f)
        if (currobj == None):
            nextf = parts[i + 1]
            currobj = obj[f] = re.match(r'\d+', nextf) and [] or {}
        obj = currobj
    obj[len(parts) - 1] = v
for k, v in obj.iteritems():
RuntimeError: dictionary changed size during iteration
Like the message says: you changed the number of entries in obj inside expandField() while in the middle of looping over these entries in expand().
You might try instead creating a new dictionary of the form you wish, or somehow recording the changes you want to make, and then making them AFTER the loop is done.
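Building a new dictionary avoids the problem entirely, because the input is only read and never modified. The following sketch takes that route in Python 3. It is not the asker's code: `_listify` is a helper invented for the example, and the sketch assumes that a key never appears both as a plain value and as a prefix of a dotted key (for instance `foo` together with `foo.0`).

```python
def _listify(node):
    # Turn dicts whose keys are all digit strings into lists, recursively.
    if not isinstance(node, dict):
        return node
    converted = {k: _listify(v) for k, v in node.items()}
    if converted and all(k.isdigit() for k in converted):
        return [converted[k] for k in sorted(converted, key=int)]
    return converted


def expand(flat):
    """Return a new nested structure; the input dict is left untouched."""
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})  # create intermediate levels
        node[leaf] = value
    return _listify(nested)


flat = {"hello": "world", "foo.0.bar": "v1",
        "foo.0.name": "v2", "foo.1.bar": "v3"}
print(expand(flat))
```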
You might want to copy your keys into a list and iterate over that list instead, e.g.:
def expand(obj):
    keys = list(obj.keys()) # snapshot the keys in a list
    for k in keys:
        expandField(obj, k, obj[k])
I'll let you check whether the resulting behavior matches what you expect.
Edited as per comments, thank you!
I had a similar issue when I wanted to change a dictionary's structure (removing/adding dicts within other dicts).
In my case I created a deepcopy of the dict. With a deepcopy of my dict, I was able to iterate through it and remove keys as needed. From the Python documentation on deepcopy:
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
Hope this helps!
For those experiencing
RuntimeError: dictionary changed size during iteration
also make sure you're not iterating through a defaultdict when trying to access a non-existent key! I caught myself doing that inside the for loop, which caused the defaultdict to create a default value for this key, causing the aforementioned error.
The solution is to convert your defaultdict to dict before looping through it, i.e.
d = defaultdict(int)
d_new = dict(d)
or make sure you're not adding/removing any keys while iterating through it.
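The defaultdict case is easy to reproduce. In this small sketch (the key names are made up), merely reading a missing key inside the loop inserts it, while a plain dict read with `.get()` leaves the size unchanged:

```python
from collections import defaultdict

counts = defaultdict(int, {"a": 1, "b": 2})

# Looking up a missing key on a defaultdict inserts it, which changes the
# dictionary's size in the middle of the loop.
try:
    for key in counts:
        counts[key + "_missing"]
    error = None
except RuntimeError as exc:
    error = str(exc)
print(error)

# Converting to a plain dict and using .get() reads without inserting.
plain = dict(counts)
size_before = len(plain)
for key in plain:
    plain.get(key + "_missing", 0)
print(len(plain) == size_before)
```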
Rewriting this part
def expand(obj):
    for k in obj.keys():
        expandField(obj, k, v)
to the following
def expand(obj):
    keys = list(obj.keys())  # in Python 3, keys() is a live view, so copy it
    for k in keys:
        if k in obj:
            expandField(obj, k, obj[k])
should make it work.

How to apply multiprocessing in python3.x for the following nested loop

for i in range(1,row):
    for j in range(1,col):
        if i > j and i != j:
            x = Aglo[0][i][0]
            y = Aglo[j][0][0]
            Aglo[j][i] = offset.myfun(x,y)
            Aglo[i][j] = Aglo[j][i]
Aglo[][] is a 2D array that contains lists in its first row.
offset.myfun() is a function defined elsewhere.
This might be a trivial question, but I couldn't work out how to use multiprocessing for these nested loops, since x and y (used in myfun()) are different for each process (if multiprocessing is used).
Thank you
If I'm reading your code right, you are not overwriting any previously calculated values. If that's true, then you can use multiprocessing. If not, then you can't guarantee that the results from multiprocessing will be in the correct order.
To use something like multiprocessing.Pool, you would need to gather all valid (x, y) pairs to pass to offset.myfun(). Something like this might work (untested):
from multiprocessing import Pool
pairs = [(i, j, Aglo[0][i][0], Aglo[j][0][0])
         for i in range(1, row) for j in range(1, col)
         if i > j and i != j]
# offset.myfun now needs to take a tuple instead of x, y
# it additionally needs to emit i and j in addition to the return value
# e.g. (i, j, result)
p = Pool(4)
results = p.map(offset.myfun, pairs)
# fill in Aglo with the results
for i, j, value in results:
    Aglo[j][i] = value
    Aglo[i][j] = value
You will need to pass in i and j to offset.myfun because otherwise there is no way to know which result goes where. offset.myfun should then return i and j along with the result so you can fill in Aglo appropriately. Hope this helps.
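If offset.myfun cannot be changed, a small wrapper can carry i and j through the pool instead. The sketch below is self-contained and makes several assumptions: `myfun` is a stand-in for offset.myfun, `worker`, `make_tasks` and `fill` are hypothetical helper names, and the grid is assumed to be square with one-element lists in its first row and first column.

```python
from multiprocessing import Pool


def myfun(x, y):
    # Hypothetical stand-in for offset.myfun; replace with the real function.
    return x + y


def worker(task):
    # Compute one cell and return it together with its indices, so the
    # parent process knows where the result belongs.
    i, j, x, y = task
    return i, j, myfun(x, y)


def make_tasks(aglo):
    # Mirror the loops in the question: only cells with i > j are computed.
    n = len(aglo)
    return [(i, j, aglo[0][i][0], aglo[j][0][0])
            for i in range(1, n) for j in range(1, n) if i > j]


def fill(aglo, results):
    # Write each value to both symmetric positions, as in the question.
    for i, j, value in results:
        aglo[j][i] = value
        aglo[i][j] = value


if __name__ == "__main__":
    # Toy 3x3 grid: row 0 and column 0 hold one-element lists.
    aglo = [[[0], [1], [2]],
            [[10], None, None],
            [[20], None, None]]
    with Pool(2) as pool:
        results = pool.map(worker, make_tasks(aglo))
    fill(aglo, results)
    print(aglo[1][2], aglo[2][1])
```

The worker must be defined at module level so it can be pickled and sent to the worker processes, and the `if __name__ == "__main__":` guard keeps child processes from re-running the pool setup on platforms that start them by importing the script.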
