Intersecting two Dictionaries and getting average scores

I have two Python dictionaries; each maps a city name to a score for that city.
I need to compare the two dictionaries to find the city with the maximum score. To do this, I first take the intersection of the two dictionaries to get the common cities. This is where I am running into problems.
For example, let's say the two dictionaries are:
d1 = {"delhi": 40, "Jaipur": 50, "Gurgaon": 10}
d2 = {"Jaipur(Rajasthan)": 30, "Gurugram(Gurgaon)": 25}
Because some city names carry extra text, such as an alternative name in brackets, the intersection fails.
So my question is: is there a way to include a city in the intersection when its name appears as only part of a key?
Finally, I need to give each common city an average score.
I want the end result to be:
d3 = {"gurgaon": 17.5, "jaipur": 40}  # (10 + 25) / 2 and (50 + 30) / 2
How would I achieve this?

You can create normalized dicts where the keys used for matching are extracted from the original keys. Since names both inside and outside parentheses in the keys of the input dicts can be used for matching, create redundant keys for both names in the normalized dict:
import re
n1, n2 = (
    {t.lower(): v for k, v in d.items() for t in re.findall('[^()]+', k)}
    for d in (d1, d2)
)
print({k: (n1[k] + n2[k]) / 2 for k in n1.keys() & n2.keys()})
This outputs:
{'gurgaon': 17.5, 'jaipur': 40.0}
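The same idea can be wrapped in a reusable function. This is only a sketch: the names normalize and average_common_scores are made up for illustration, and it assumes that alternative names always appear in parentheses, as in the example data.

```python
import re

def normalize(scores):
    """Map every name fragment (inside or outside parentheses) to its score."""
    normalized = {}
    for key, value in scores.items():
        for fragment in re.findall(r'[^()]+', key):
            fragment = fragment.strip().lower()
            if fragment:  # skip fragments that were only whitespace
                normalized[fragment] = value
    return normalized

def average_common_scores(first, second):
    """Average the scores of cities whose normalized names occur in both dicts."""
    n1, n2 = normalize(first), normalize(second)
    return {city: (n1[city] + n2[city]) / 2 for city in n1.keys() & n2.keys()}

d1 = {"delhi": 40, "Jaipur": 50, "Gurgaon": 10}
d2 = {"Jaipur(Rajasthan)": 30, "Gurugram(Gurgaon)": 25}
d3 = average_common_scores(d1, d2)
print(d3)
print(max(d3, key=d3.get))  # the common city with the highest average score
```

Because the result is an ordinary dict, the city with the maximum score can be picked with max and d3.get, as in the last line.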

If you only have to compare two dicts you can do something like this using the filter function:
def get_avg_scores(d1, d2):
    d3 = {}
    for key, item in d1.items():
        # Get match key d1 vs. d2
        d2_similar_key = list(filter(lambda x: key.lower() in x.lower(), d2.keys()))
        # Get match key d2 vs. d1
        d2_similar_key_rev = list(filter(lambda x: x.lower() in key.lower(), d2.keys()))
        # Keep the simplest key (to avoid brackets in d3)
        if len(d2_similar_key) > 0:
            d3[key] = (item + d2[d2_similar_key[0]])/2
        if len(d2_similar_key_rev) > 0:
            d3[d2_similar_key_rev[0]] = (item + d2[d2_similar_key_rev[0]])/2
    return d3
d3 = get_avg_scores(d1, d2)

Related

Looking for a specific combination algorithm to solve a problem

Let's say I have a purchase total and a CSV file full of purchases, where some of them make up that total and some don't. Is there a way to search the CSV to find the combination or combinations of purchases that add up to that total? Let's say the purchase total is $155 and my CSV file has the purchases [$5.00, $40.00, $7.25, $100.00, $10.00]. Is there an algorithm that will tell me which combinations of purchases make up the total?
Edit: I am still having trouble with the solution you provided. When I feed this spreadsheet into your code snippet with pandas, it shows only one solution summing to $110.04 when there are three. It seems to stop early without finding the remaining solutions. This is the output I get in the terminal: [57.25, 15.87, 13.67, 23.25]. The output should be [10.24, 37.49, 58.21, 4.1], [64.8, 45.24], and [57.25, 15.87, 13.67, 23.25].
from collections import namedtuple
import pandas
df = pandas.read_csv('purchases.csv', parse_dates=["Date"])
values = df["Purchase"].to_list()
S = 110.04
Candidate = namedtuple('Candidate', ['sum', 'lastIndex', 'path'])
tuples = [Candidate(0, -1, [])]
while len(tuples):
    next = []
    for (sum, i, path) in tuples:
        # you may range from i + 1 if you don't want repetitions of the same purchase
        for j in range(i+1, len(values)):
            v = values[j]
            # you may check for strict equality if no purchase is free (0$)
            if v + sum <= S:
                next.append(Candidate(sum = v + sum, lastIndex = j, path = path + [v]))
                if v + sum == S :
                    print(path + [v])
    tuples = next

A DP solution:
Let S be your goal sum.
Build all 1-combinations. Keep those whose sum is less than or equal to S. Whenever one equals S, output it.
Build all 2-combinations, reusing the previous ones.
Repeat with larger combinations until no candidates remain.
from collections import namedtuple
values = [57.25,15.87,13.67,23.25,64.8,45.24,10.24,37.49,58.21,4.1]
S = 110.04
Candidate = namedtuple('Candidate', ['sum', 'lastIndex', 'path'])
tuples = [Candidate(0, -1, [])]
while len(tuples):
    next = []
    for (sum, i, path) in tuples:
        # start from i + 1 so the same purchase is not used twice
        for j in range(i + 1, len(values)):
            v = values[j]
            # you may check for strict equality if no purchase is free (0$)
            if v + sum <= S:
                next.append(Candidate(sum = v + sum, lastIndex = j, path = path + [v]))
                if abs(v + sum - S) <= 1e-2 :
                    print(path + [v])
    tuples = next
More detail about the tuple structure:
What we want to do is augment a tuple with a new value.
Assume we start with a tuple holding a single value, say the tuple associated with 40:
its sum is trivially 40
the last index added is 1 (the index of 40 itself)
the used values are [40], since it is the sole value
To generate the next tuples, we iterate from the position after the last index (1) to the end of the array.
So the candidates are 7.25, 100.00, and 10.00.
The new tuple associated with 7.25 is:
sum: 40 + 7.25
last index: 2 (7.25 has index 2 in the array)
used values: the values of the old tuple plus 7.25, so [40, 7.25]
The purpose of tracking the last index is to avoid considering both [7.25, 40] and [40, 7.25], which are the same combination.
So to generate tuples from an old one, only consider values that occur after the old tuple's last value in the array.
At every step we thus have tuples of the same size; each one records the values taken, the sum they amount to, and the index from which to look for further values when growing it to a bigger size.
edit: to handle floats, you may replace v + sum == S with abs(v + sum - S) <= 1e-2, so that a solution is reported when the sum is very close to the target (here the distance is arbitrarily set to 0.01).
edit2: the same code is available at https://repl.it/repls/DrearyWindingHypertalk, which does give:
[64.8, 45.24]
[57.25, 15.87, 13.67, 23.25]
[10.24, 37.49, 58.21, 4.1]
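An alternative to a float tolerance is to do the arithmetic in integer cents, where equality is exact. The sketch below is not the answer's algorithm: it swaps in a plain brute-force search with itertools.combinations (no pruning, so it is exponential in the number of purchases), and the function name combinations_summing_to is made up for illustration. It also assumes every amount has at most two decimal places.

```python
from itertools import combinations

def combinations_summing_to(values, target):
    """Return every combination of values whose total equals target exactly."""
    # Work in integer cents so float rounding cannot hide a match.
    cents = [round(v * 100) for v in values]
    target_cents = round(target * 100)
    matches = []
    for size in range(1, len(values) + 1):
        # Combining indices rather than values keeps equal purchases distinct.
        for indices in combinations(range(len(values)), size):
            if sum(cents[i] for i in indices) == target_cents:
                matches.append([values[i] for i in indices])
    return matches

values = [57.25, 15.87, 13.67, 23.25, 64.8, 45.24, 10.24, 37.49, 58.21, 4.1]
for match in combinations_summing_to(values, 110.04):
    print(match)
```

For larger inputs, the candidate-pruning loop from the answer is the better starting point; the cents conversion can be applied to it in the same way.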

Python losing track of index location in for loop when my list has duplicate values

I'm trying to iterate over pairs of integers in a list. I'd like to return pairs where the sum equals some variable value.
This seems to be working just fine when the list of integers doesn't have repeat numbers. However, once I add repeat numbers to the list the loop seems to be getting confused about where it is. I'm guessing this based on my statements:
print(list.index(item))
print(list.index(item2))
Here is my code:
working_list = [1,2,3,4,5]
broken_list = [1,3,3,4,5]
def find_pairs(list, k):
    pairs_list = []
    for item in list:
        for item2 in list:
            print(list.index(item))
            print(list.index(item2))
            if list.index(item) < list.index(item2):
                sum = item + item2;
                if sum == k:
                    pair = (item, item2)
                    pairs_list.append(pair)
    return pairs_list
### First parameter is the list to check.
### Second parameter is the integer you're looking for each pair to sum to.
find_pairs(broken_list, 6)
working_list is fine. When I run broken_list looking for pairs that sum to 6, I get back (1,5), but I should also get back (3,3) and I don't.

You are trying to use list.index(item) < list.index(item2) to ensure that you do not double count the pairs. However, broken_list.index(3) returns 1 for both the first and second 3 in the list. I.e. the return value is not the actual index you want (unless the list only contains unique elements, like working_list). To get the actual index, use enumerate. The simplest implementation would be
def find_pairs(list, k):
    pairs_list = []
    for i, item in enumerate(list):
        for j, item2 in enumerate(list):
            if i < j:
                sum = item + item2
                if sum == k:
                    pair = (item, item2)
                    pairs_list.append(pair)
    return pairs_list
For small lists this is fine, but we could be more efficient by only looping over the elements we want using slicing, hence eliminating the if statement:
def find_pairs(list, k):
    pairs_list = []
    for i, item in enumerate(list):
        for item2 in list[i+1:]:
            sum = item + item2
            if sum == k:
                pair = (item, item2)
                pairs_list.append(pair)
    return pairs_list
Note on variable names
Finally, I have to comment on your choice of variable names: list and sum are already defined by Python, and so it's bad style to use these as variable names. Furthermore, 'items' are commonly used to refer to a key-value pair of objects, and so I would refrain from using this name for a single value as well (I guess something like 'element' is more suitable).
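Putting both points together, here is a minimal sketch that avoids shadowing the built-ins and lets itertools.combinations generate each position pair exactly once. The parameter names numbers and target are my own choice, not from the question.

```python
from itertools import combinations

def find_pairs(numbers, target):
    """Return all pairs (by position) whose elements sum to target."""
    # combinations() pairs elements by position, so duplicate values
    # such as the two 3s are treated as separate elements.
    return [(a, b) for a, b in combinations(numbers, 2) if a + b == target]

working_list = [1, 2, 3, 4, 5]
broken_list = [1, 3, 3, 4, 5]
print(find_pairs(working_list, 6))
print(find_pairs(broken_list, 6))
```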

How to predict key from its value in python? [duplicate]

I made a function which will look up ages in a Dictionary and show the matching name:
dictionary = {'george' : 16, 'amber' : 19}
search_age = raw_input("Provide age")
for age in dictionary.values():
    if age == search_age:
        name = dictionary[age]
        print name
I know how to compare and find the age; I just don't know how to show the name of the person. Additionally, I am getting a KeyError because of line 5. I know it's not correct, but I can't figure out how to make it search backwards.

mydict = {'george': 16, 'amber': 19}
print mydict.keys()[mydict.values().index(16)] # Prints george
Or in Python 3.x:
mydict = {'george': 16, 'amber': 19}
print(list(mydict.keys())[list(mydict.values()).index(16)]) # Prints george
Basically, it separates the dictionary's values in a list, finds the position of the value you have, and gets the key at that position.
More about keys() and .values() in Python 3: How can I get list of values from dict?

There is none. dict is not intended to be used this way.
dictionary = {'george': 16, 'amber': 19}
search_age = int(input("Provide age"))  # input() returns a string, so convert it before comparing
for name, age in dictionary.items():  # dictionary.iteritems() for Python 2.x
    if age == search_age:
        print(name)

If you want both the name and the age, you should be using .items(), which gives you (key, value) tuples:
for name, age in mydict.items():
    if age == search_age:
        print(name)
You can unpack the tuple into two separate variables right in the for loop, then match the age.
You should also consider reversing the dictionary if you're generally going to be looking up by age, and no two people have the same age:
{16: 'george', 19: 'amber'}
so you can look up the name for an age by just doing
mydict[search_age]
I've been calling it mydict instead of list because list is the name of a built-in type, and you shouldn't use that name for anything else.
You can even get a list of all people with a given age in one line:
[name for name, age in mydict.items() if age == search_age]
or if there is only one person with each age:
next((name for name, age in mydict.items() if age == search_age), None)
which will just give you None if there isn't anyone with that age.
Finally, if the dict is long and you're on Python 2, you should consider using .iteritems() instead of .items() as Cat Plus Plus did in his answer, since it doesn't need to make a copy of the list.

I thought it would be interesting to point out which methods are the quickest, and in what scenario:
Here's some tests I ran (on a 2012 MacBook Pro)
def method1(dict, search_age):
    for name, age in dict.iteritems():
        if age == search_age:
            return name
def method2(dict, search_age):
    return [name for name,age in dict.iteritems() if age == search_age]
def method3(dict, search_age):
    return dict.keys()[dict.values().index(search_age)]
Results from profile.run() on each method 100,000 times:
Method 1:
>>> profile.run("for i in range(0,100000): method1(dict, 16)")
200004 function calls in 1.173 seconds
Method 2:
>>> profile.run("for i in range(0,100000): method2(dict, 16)")
200004 function calls in 1.222 seconds
Method 3:
>>> profile.run("for i in range(0,100000): method3(dict, 16)")
400004 function calls in 2.125 seconds
So this shows that for a small dict, method 1 is the quickest. This is most likely because it returns the first match, as opposed to all of the matches like method 2 (see note below).
Interestingly, performing the same tests on a dict I have with 2700 entries, I get quite different results (this time run 10,000 times):
Method 1:
>>> profile.run("for i in range(0,10000): method1(UIC_CRS,'7088380')")
20004 function calls in 2.928 seconds
Method 2:
>>> profile.run("for i in range(0,10000): method2(UIC_CRS,'7088380')")
20004 function calls in 3.872 seconds
Method 3:
>>> profile.run("for i in range(0,10000): method3(UIC_CRS,'7088380')")
40004 function calls in 1.176 seconds
So here, method 3 is much faster. Just goes to show the size of your dict will affect which method you choose.
Notes:
Method 2 returns a list of all names, whereas methods 1 and 3 return only the first match.
I have not considered memory usage. I'm not sure if method 3 creates 2 extra lists (keys() and values()) and stores them in memory.
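The functions above use Python 2 APIs (iteritems, indexable keys()). A rough Python 3 equivalent for repeating the comparison is sketched below; the function names, the synthetic 2700-entry dict, and the use of timeit instead of profile are my assumptions, and the timings will differ from machine to machine.

```python
import timeit

def first_match_loop(d, target):
    """Method 1: return the first key whose value equals target."""
    for name, age in d.items():
        if age == target:
            return name
    return None

def all_matches(d, target):
    """Method 2: return every key whose value equals target."""
    return [name for name, age in d.items() if age == target]

def index_lookup(d, target):
    """Method 3: find the value's position, then take the key at that position."""
    return list(d.keys())[list(d.values()).index(target)]

# Synthetic data standing in for the 2700-entry dict mentioned above.
ages = {"person{}".format(i): i for i in range(2700)}

for func in (first_match_loop, all_matches, index_lookup):
    seconds = timeit.timeit(lambda: func(ages, 2000), number=200)
    print("{}: {:.4f} s".format(func.__name__, seconds))
```

In Python 3, keys() and values() return views, so method 3 has to build two lists explicitly, which is worth keeping in mind when comparing memory usage.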

One-line version (i is the original dictionary, p is the reversed dictionary):
Explanation: i.keys() and i.values() return the keys and values of the dictionary, respectively, in matching order. zip pairs them up, and dict builds a dictionary from those pairs.
p = dict(zip(i.values(), i.keys()))
Warning: this works only if the values are hashable and unique.

I found this answer very effective but not very easy to read for me.
To make it clearer, you can invert the keys and values of the dictionary, so that the keys become values and the values become keys, as seen here.
mydict = {'george':16,'amber':19}
res = dict((v,k) for k,v in mydict.iteritems())
print(res[16]) # Prints george
or for Python 3, (thanks #kkgarg)
mydict = {'george':16,'amber':19}
res = dict((v,k) for k,v in mydict.items())
print(res[16]) # Prints george
Also
print(res.get(16)) # Prints george
which is essentially the same as this other answer.

a = {'a':1,'b':2,'c':3}
{v:k for k, v in a.items()}[1]
or better
{k:v for k, v in a.items() if v == 1}

key = next((k for k in my_dict if my_dict[k] == val), None)

Try this one-liner to reverse a dictionary:
reversed_dictionary = dict(map(reversed, dictionary.items()))

If you want to find the key by the value, you can use a dictionary comprehension to create a lookup dictionary and then use that to find the key from the value.
lookup = {value: key for key, value in self.data.items()}
lookup[value]

We can get the keys of a dict for a given value with:
def getKey(dct,value):
    return [key for key in dct if (dct[key] == value)]

You can get key by using dict.keys(), dict.values() and list.index() methods, see code samples below:
names_dict = {'george':16,'amber':19}
search_age = int(raw_input("Provide age"))
key = names_dict.keys()[names_dict.values().index(search_age)]

Here is my take on this problem. :)
I have just started learning Python, so I call this:
"The Understandable for beginners" solution.
#Code without comments.
list1 = {'george':16,'amber':19, 'Garry':19}
search_age = raw_input("Provide age: ")
print
search_age = int(search_age)
listByAge = {}
for name, age in list1.items():
    if age == search_age:
        age = str(age)
        results = name + " " +age
        print results
        age2 = int(age)
        listByAge[name] = listByAge.get(name,0)+age2
print
print listByAge
.
#Code with comments.
#I've added another name with the same age to the list.
list1 = {'george':16,'amber':19, 'Garry':19}
#Original code.
search_age = raw_input("Provide age: ")
print
#Because raw_input gives a string, we need to convert it to int,
#so we can search the dictionary list with it.
search_age = int(search_age)
#Here we define another empty dictionary, to store the results in a more
#permanent way.
listByAge = {}
#We use double variable iteration, so we get both the name and age
#on each run of the loop.
for name, age in list1.items():
    #Here we check if the User Defined age = the age parameter
    #for this run of the loop.
    if age == search_age:
        #Here we convert Age back to string, because we will concatenate it
        #with the person's name.
        age = str(age)
        #Here we concatenate.
        results = name + " " +age
        #If you want just the names and ages displayed you can delete
        #the code after "print results". If you want them stored, don't...
        print results
        #Here we create a second variable that uses the value of
        #the age for the current person in the list.
        #For example if "Anna" is "10", age2 = 10,
        #integer value which we can use in addition.
        age2 = int(age)
        #Here we use the method that checks or creates values in dictionaries.
        #We create a new entry for each name that matches the User Defined Age
        #with default value of 0, and then we add the value from age2.
        listByAge[name] = listByAge.get(name,0)+age2
#Here we print the new dictionary with the users with User Defined Age.
print
print listByAge
.
#Results
Running: *\test.py (Thu Jun 06 05:10:02 2013)
Provide age: 19
amber 19
Garry 19
{'amber': 19, 'Garry': 19}
Execution Successful!

get_key = lambda v, d: next(k for k in d if d[k] == v)  # compare with ==, not is: equal values need not be the same object

Consider using Pandas. As stated in Wes McKinney's "Python for Data Analysis":
Another way to think about a Series is as a fixed-length, ordered
dict, as it is a mapping of index values to data values. It can be
used in many contexts where you might use a dict.
import pandas as pd
list = {'george':16,'amber':19}
lookup_list = pd.Series(list)
To query your series do the following:
lookup_list[lookup_list.values == 19]
Which yields:
Out[1]:
amber    19
dtype: int64
If you need to do anything else with the output, transforming the answer into a list might be useful:
answer = lookup_list[lookup_list.values == 19].index
answer = pd.Index.tolist(answer)

d= {'george':16,'amber':19}
dict((v,k) for k,v in d.items()).get(16)
The output is as follows:
-> prints george

Here, recover_key takes a dictionary and a value to find in it. We loop over the keys, compare each key's value with the given value, and return the first key that matches.
def recover_key(dicty,value):
    for a_key in dicty.keys():
        if (dicty[a_key] == value):
            return a_key

One line solution using list comprehension, which returns multiple keys if the value is possibly present multiple times.
[key for key,value in mydict.items() if value == 16]

for name in mydict:
    if mydict[name] == search_age:
        print(name)
        #or do something else with it.
        #if in a function append to a temporary list,
        #then after the loop return the list

my_dict = {'A': 19, 'B': 28, 'carson': 28}
search_age = 28
Take only the first match:
name = next((name for name, age in my_dict.items() if age == search_age), None)
print(name) # 'B'
Get all matches:
name_list = [name for name, age in filter(lambda item: item[1] == search_age, my_dict.items())]
print(name_list) # ['B', 'carson']

I skimmed all the answers and did not see one that simply uses a list comprehension over the keys.
This Pythonic one-line solution can return all keys for any number of given values (tested in Python 3.9.1):
>>> dictionary = {'george' : 16, 'amber' : 19, 'frank': 19}
>>>
>>> age = 19
>>> name = [k for k in dictionary.keys() if dictionary[k] == age]; name
['amber', 'frank']
>>>
>>> age = (16, 19)
>>> name = [k for k in dictionary.keys() if dictionary[k] in age]; name
['george', 'amber', 'frank']
>>>
>>> age = (22, 25)
>>> name = [k for k in dictionary.keys() if dictionary[k] in age]; name
[]

It's answered, but it could be done with a fancy 'map/reduce' use, e.g.:
def find_key(value, dictionary):
    return reduce(lambda x, y: x if x is not None else y,
                  map(lambda x: x[0] if x[1] == value else None,
                      dictionary.iteritems()))

I tried to read as many solutions as I could to avoid giving a duplicate answer. However, if you are working with a dictionary whose values are lists and you want the keys whose lists contain a particular element, you could do this:
d = {'Adams': [18, 29, 30],
     'Allen': [9, 27],
     'Anderson': [24, 26],
     'Bailey': [7, 30],
     'Baker': [31, 7, 10, 19],
     'Barnes': [22, 31, 10, 21],
     'Bell': [2, 24, 17, 26]}
Now let's find the names that have 24 in their values.
for key in d.keys():
    if 24 in d[key]:
        print(key)
This would work with multiple values as well.
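One way to handle several search values at once is to test each list against a set. This is only a sketch; the function name keys_containing is hypothetical, and it returns keys whose list contains at least one of the wanted values.

```python
def keys_containing(mapping, wanted):
    """Return keys whose list of values shares at least one element with wanted."""
    wanted = set(wanted)
    return [key for key, values in mapping.items() if wanted.intersection(values)]

d = {'Adams': [18, 29, 30],
     'Allen': [9, 27],
     'Anderson': [24, 26],
     'Bailey': [7, 30],
     'Baker': [31, 7, 10, 19],
     'Barnes': [22, 31, 10, 21],
     'Bell': [2, 24, 17, 26]}

print(keys_containing(d, [24]))
print(keys_containing(d, [7, 31]))
```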

Just my answer in lambda and filter.
filter( lambda x, dictionary=dictionary, search_age=int(search_age): dictionary[x] == search_age , dictionary )

This has already been answered, but since several people mentioned reversing the dictionary, here's how you do it in one line (assuming a 1:1 mapping), along with some perf data:
python 2.6:
reversedict = dict([(value, key) for key, value in mydict.iteritems()])
2.7+:
reversedict = {value:key for key, value in mydict.iteritems()}
If you think it's not 1:1, you can still create a reasonable reverse mapping with a couple of lines:
from collections import defaultdict
reversedict = defaultdict(list)
for key, value in mydict.iteritems():
    reversedict[value].append(key)
How slow is this? Slower than a simple search, but not nearly as slow as you'd think. On a 'straight' 100000-entry dictionary, a 'fast' search (i.e. looking for a value that should be early in the keys) was about 10x faster than reversing the entire dictionary, and a 'slow' search (towards the end) about 4-5x faster. So after at most about 10 lookups, the reversed dictionary has paid for itself.
The second version (with lists per item) takes about 2.5x as long as the simple version.
largedict = dict((x,x) for x in range(100000))
# Should be slow, has to search 90000 entries before it finds it
In [26]: %timeit largedict.keys()[largedict.values().index(90000)]
100 loops, best of 3: 4.81 ms per loop
# Should be fast, has to only search 9 entries to find it.
In [27]: %timeit largedict.keys()[largedict.values().index(9)]
100 loops, best of 3: 2.94 ms per loop
# How about using iterkeys() instead of keys()?
# These are faster, because you don't have to create the entire keys array.
# You DO have to create the entire values array - more on that later.
In [31]: %timeit islice(largedict.iterkeys(), largedict.values().index(90000))
100 loops, best of 3: 3.38 ms per loop
In [32]: %timeit islice(largedict.iterkeys(), largedict.values().index(9))
1000 loops, best of 3: 1.48 ms per loop
In [24]: %timeit reversedict = dict([(value, key) for key, value in largedict.iteritems()])
10 loops, best of 3: 22.9 ms per loop
In [23]: %%timeit
....: reversedict = defaultdict(list)
....: [reversedict[value].append(key) for key, value in largedict.iteritems()]
....:
10 loops, best of 3: 53.6 ms per loop
Also had some interesting results with ifilter. Theoretically, ifilter should be faster, in that we can use itervalues() and possibly not have to create/go through the entire values list. In practice, the results were... odd...
In [72]: %%timeit
....: myf = ifilter(lambda x: x[1] == 90000, largedict.iteritems())
....: myf.next()[0]
....:
100 loops, best of 3: 15.1 ms per loop
In [73]: %%timeit
....: myf = ifilter(lambda x: x[1] == 9, largedict.iteritems())
....: myf.next()[0]
....:
100000 loops, best of 3: 2.36 us per loop
So, for small offsets, it was dramatically faster than any previous version (2.36 µs vs. a minimum of 1.48 ms for the previous cases). However, for large offsets near the end of the list, it was dramatically slower (15.1 ms vs. the same 1.48 ms). The small savings at the low end are not worth the cost at the high end, imho.

Cat Plus Plus mentioned that this isn't how a dictionary is intended to be used. Here's why:
The definition of a dictionary is analogous to that of a mapping in mathematics. In this case, a dict is a mapping of K (the set of keys) to V (the values) - but not vice versa. If you dereference a dict, you expect to get exactly one value returned. But, it is perfectly legal for different keys to map onto the same value, e.g.:
d = { k1 : v1, k2 : v2, k3 : v1}
When you look up a key by its corresponding value, you're essentially inverting the dictionary. But a mapping isn't necessarily invertible! In this example, asking for the key corresponding to v1 could yield k1 or k3. Should you return both? Just the first one found? That's why indexof() is undefined for dictionaries.
If you know your data, you could do this. But an API can't assume that an arbitrary dictionary is invertible, hence the lack of such an operation.
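When the mapping is not invertible, one reasonable choice is to invert it into a dict of lists, so that every value maps to all of its keys. The sketch below assumes the values are hashable; the function name invert and the sample data are only illustrative.

```python
from collections import defaultdict

def invert(mapping):
    """Map each value to the list of keys that carry it."""
    inverted = defaultdict(list)
    for key, value in mapping.items():
        inverted[value].append(key)
    return dict(inverted)

ages = {'george': 16, 'amber': 19, 'frank': 19}
names_by_age = invert(ages)
print(names_by_age[19])           # every name with age 19
print(names_by_age.get(42, []))   # no match: fall back to an empty list
```

This makes the "both or just the first?" decision explicit: the caller receives all matching keys and chooses what to do with them.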

Here is my take on it. This is good for displaying multiple results in case you need them, so I added a result list as well.
myList = {'george':16,'amber':19, 'rachel':19,
          'david':15 } #Setting the dictionary
result=[] #Preparing the result list
search_age = int(input('Enter age '))
for keywords in myList.keys():
    if myList[keywords] ==search_age:
        result.append(keywords) #Here we build the list of results
for res in result: #We are now printing the results
    print(res)
And that's it...

There is no easy way to find a key in a dictionary by 'looking up' the value. However, if you know the value, you can iterate through the keys and look up each key's value in the dictionary. If D[element], where D is a dictionary object, is equal to the value you're trying to find, you can execute some code.
D = {'Ali': 20, 'Marina': 12, 'George':16}
age = int(input('enter age:\t'))
for element in D.keys():
    if D[element] == age:
        print(element)

You need a dictionary and the reverse of that dictionary, which means you need another data structure. If you are on Python 3, use the enum module; if you are on Python 2.7, use enum34, which is a backport for Python 2.
Example:
from enum import Enum
class Color(Enum):
    red = 1
    green = 2
    blue = 3
>>> print(Color.red)
Color.red
>>> print(repr(Color.red))
<Color.red: 1>
>>> type(Color.red)
<enum 'Color'>
>>> isinstance(Color.green, Color)
True
>>> member = Color.red
>>> member.name
'red'
>>> member.value
1
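The example shows member.name and member.value; the lookup in either direction can be sketched as follows. Calling the class with a value and indexing it with a name are standard Enum features; the helper name_for is hypothetical and only there to show how a missing value surfaces.

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3

# value -> name: call the class with the value
print(Color(2).name)
# name -> value: index the class with the name
print(Color['green'].value)

def name_for(value):
    """Return the member name for value, or None if no member has that value."""
    try:
        return Color(value).name
    except ValueError:  # raised when no member has this value
        return None

print(name_for(99))
```

Note that an Enum is fixed when the class is defined, so this fits constant mappings rather than data entered at runtime.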

def get_Value(dic,value):
    for name in dic:
        if dic[name] == value:
            # Note: this also removes the matching entry from dic before returning its key
            del dic[name]
            return name

Filling a dictionary with lists of values - why is my nested loop only running once?

I'm trying to create a function that takes an array, bins the data in that array (by quantile), and fills a dictionary with the binned data. In the dictionary that gets produced, I want the keys to correspond to bin numbers, and the values to be lists of data from the input array that fall within the jth and (j+1)th bin limits.
Here is my code:
output = []
def binning(array1):
    d1 = {} # empty dictionary to fill with lists of values
    bin_edges = sp.stats.mstats.mquantiles(array1, prob=[0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875,1.00])
    j = 0
    while j < len(bin_edges):
        for i in range(0, len(array1)):
            if float(array1[i]) > bin_edges[j] and float(array1[i]) <= bin_edges[j+1]:
                output.append(array1[i])
        d1["bin_number{0}".format(j)]= output
        j+=1
        return d1
The problem is, the inner loop only runs once, so I'm getting an output like
d1 = {'bin_number0': [value1, value2, etc.]}.
What I want to see is:
d1 = {'bin_number0': [value1, value2, etc.],'bin_number1': [value3, value4, etc.],'bin_number2': [value5, value6, etc.]}
...and so on, so there are 8 keys corresponding to 8 lists of values.
Can anyone tell me why the inner loop only runs once (for j = 0)? I've looked at it so many times I need a fresh pair of eyes.

return d1 should not be indented inside the while loop. Unindent it so that it is indented only once, at the function level. As written, the function returns at the end of the first pass through the while loop, which is why you only get one bin.
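A corrected version might look like the sketch below. To keep it self-contained, it takes the bin edges as a parameter instead of calling mquantiles, and the name bin_values is made up. Beyond moving the return, it makes three changes that are my own assumptions about the intent: it stops one edge early so that bin_edges[j + 1] always exists, it creates a fresh list for each bin instead of sharing one output list, and it puts a value equal to the lowest edge into the first bin.

```python
def bin_values(values, bin_edges):
    """Group values into bins; bin j holds values in (bin_edges[j], bin_edges[j + 1]]."""
    bins = {}
    # There is one bin fewer than there are edges.
    for j in range(len(bin_edges) - 1):
        lower, upper = bin_edges[j], bin_edges[j + 1]
        members = []  # a fresh list per bin, so bins do not share contents
        for value in values:
            x = float(value)
            # Assumption: a value equal to the lowest edge belongs to the first bin.
            if lower < x <= upper or (j == 0 and x == lower):
                members.append(value)
        bins["bin_number{0}".format(j)] = members
    return bins  # returned only after every bin has been filled

data = [1, 2, 3, 4, 5, 6, 7, 8]
edges = [1, 3, 5, 8]  # hypothetical edges; the question computes them with mquantiles
print(bin_values(data, edges))
```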

How to get keys from nested dictionary of arbitrary length in Python

I have a dictionary object in Python. Let's call it dict. This object could contain another dictionary, which may in turn contain another dictionary, and so on.
dict = { 'k': v, 'k1': v1, 'dict2':{'k3': v3, 'k4':v4} , 'dict3':{'k5':v5, dict4:{'k6':v6}}}
This is just an example. The length of the outermost dictionary could be anything. I want to extract keys from such a dictionary object in the following two ways:
1. Get a list of only the keys:
[k,k1,k2,k3,k4,k5,k6]
2. Get the keys together with the parent dictionary they belong to, something like this:
outer_dict_keys = [k ,dict2, dict3]
dict2_keys = [k3,k4]
dict3_keys = [k5, dict4]
dict4_keys = [k6]
The length of the outermost dictionary dict is always changing, so I cannot hard-code anything.
What is the best way to achieve the above result?

Use a mix of iteration and recursion. After quoting undefined names, making spacing uniform, and removing 'k2' from the first result, I came up with the code below. (Written and tested for 3.4; it should run on any 3.x and might on 2.7.) A key thing to remember is that, in the Python version this was written for, the iteration order of dicts is essentially arbitrary and can vary from run to run, which is why the code sorts the keys (since Python 3.7, dicts preserve insertion order). The recursion as done here visits sub-dicts in depth-first rather than breadth-first order. For dict0, both are the same, but if dict4 were nested in dict2 rather than dict3, they would not be.
dict0 = {'k0': 0, 'k1': 1, 'dict2':{'k3': 3, 'k4': 4},
         'dict3':{'k5': 5, 'dict4':{'k6': 6}}}
def keys(dic, klist=None):
    # Use None as the default so each top-level call starts with a fresh list
    if klist is None:
        klist = []
    subdics = []
    for key in sorted(dic):
        val = dic[key]
        if isinstance(val, dict):
            subdics.append(val)
        else:
            klist.append(key)
    for subdict in subdics:
        keys(subdict, klist)
    return klist
result = keys(dict0)
print(result, '\n', result == ['k0','k1','k3','k4','k5','k6'])
def keylines(dic, name='outer_dict', lines=None):
    # Same reason as above: avoid a shared mutable default argument
    if lines is None:
        lines = []
    vals = []
    subdics = []
    for key in sorted(dic):
        val = dic[key]
        if isinstance(val, dict):
            subdics.append((key,val))
        else:
            vals.append(key)
    vals.extend(pair[0] for pair in subdics)
    lines.append('{}_keys = {}'.format(name, vals))
    for subdict in subdics:
        keylines(subdict[1], subdict[0], lines)
    return lines
result = keylines(dict0)
for line in result:
    print(line,)
print()
expect = [
    "outer_dict_keys = ['k0', 'k1', 'dict2', 'dict3']",
    "dict2_keys = ['k3', 'k4']",
    "dict3_keys = ['k5', 'dict4']",
    "dict4_keys = ['k6']"]
# Report any character-level differences between actual and expected lines
for actual, want in zip(result, expect):
    if actual != want:
        print(want)
        for i, (c1, c2) in enumerate(zip(actual, want)):
            if c1 != c2:
                print(i, c1, c2)
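On Python 3.7 and later, where dicts keep insertion order, the same two results can be sketched without sorting. The function names iter_leaf_keys and keys_by_parent are made up for illustration, and keys_by_parent assumes that nested dictionary names are unique, since a repeated name would overwrite the earlier entry. Unlike the code above, iter_leaf_keys descends into a sub-dict as soon as it meets it instead of collecting the non-dict keys of each level first.

```python
def iter_leaf_keys(mapping):
    """Yield keys whose values are not dicts, depth-first, in insertion order."""
    for key, value in mapping.items():
        if isinstance(value, dict):
            yield from iter_leaf_keys(value)
        else:
            yield key

def keys_by_parent(mapping, name='outer_dict'):
    """Return {dict_name: [its direct keys]} for mapping and every nested dict."""
    result = {name: list(mapping)}
    for key, value in mapping.items():
        if isinstance(value, dict):
            result.update(keys_by_parent(value, key))
    return result

dict0 = {'k0': 0, 'k1': 1, 'dict2': {'k3': 3, 'k4': 4},
         'dict3': {'k5': 5, 'dict4': {'k6': 6}}}

print(list(iter_leaf_keys(dict0)))
for parent, child_keys in keys_by_parent(dict0).items():
    print('{}_keys = {}'.format(parent, child_keys))
```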
