how to maintain the keys in TransformDict in python - python-3.x

https://github.com/fluentpython/example-code/blob/master/03-dict-set/transformdict.py
I see the demo:
'''Dictionary that calls a transformation function when looking
up keys, but preserves the original keys.
>>> d = TransformDict(str.lower)
>>> d['Foo'] = 5
>>> d['foo'] == d['FOO'] == d['Foo'] == 5
True
>>> set(d.keys())
{'Foo'}
'''
But I don't understand how the object maintains the original keys.
In particular, how does the keys method work? Thanks.

It keeps two dictionaries, one for the original keys and one for the values, both indexed by the transformed key. See getitem at line 51:
    def getitem(self, key):
        'D.getitem(key) -> (stored key, value)'
        transformed = self._transform(key)
        original = self._original[transformed]  # original keys!
        value = self._data[transformed]  # values!
        return original, value
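To make the two-dictionary idea concrete, here is a minimal sketch of how keys() can end up returning the original keys. It is not the code from the linked file: the class name MiniTransformDict and the choice to keep the first-seen spelling of a key are assumptions made for illustration. The keys() method itself is inherited from MutableMapping and simply iterates the object, so everything depends on what __iter__ yields.

```python
from collections.abc import MutableMapping


class MiniTransformDict(MutableMapping):
    """Hypothetical, stripped-down illustration of the two-dict approach."""

    def __init__(self, transform):
        self._transform = transform
        self._original = {}  # transformed key -> original key
        self._data = {}      # transformed key -> value

    def __getitem__(self, key):
        return self._data[self._transform(key)]

    def __setitem__(self, key, value):
        transformed = self._transform(key)
        # Assumption: remember only the first spelling seen for a key.
        self._original.setdefault(transformed, key)
        self._data[transformed] = value

    def __delitem__(self, key):
        transformed = self._transform(key)
        del self._data[transformed]
        del self._original[transformed]

    def __iter__(self):
        # keys() is built on this, so it yields the original spellings.
        return iter(self._original.values())

    def __len__(self):
        return len(self._data)


d = MiniTransformDict(str.lower)
d['Foo'] = 5
print(d['FOO'], set(d.keys()))
```

Lookups go through the transformed key, while iteration walks the stored originals, which is why 'Foo' is what comes back from keys().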

Related

Iterating thru a not so ordinary Dictionary in python 3.x

This may be an ordinary question about iterating through a dict. Below is the content of my imovel.txt file:
{'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
As you can see, this is not an ordinary dictionary of simple key-value pairs: each key holds a list, and I want to treat one list as the keys and the other list as the values.
My code is:
#!/usr/bin/python
def load_dict_from_file():
    f = open('../txt/imovel.txt','r')
    data=f.read()
    f.close()
    return eval(data)
thisdict = load_dict_from_file()
for key,value in thisdict.items():
    print(value)
and yields:
['primeiro', 'segundo', 'terceiro']
['101', '201', '301']
I would like to print key-value pairs like
{'primeiro':'101', 'segundo':'201', 'terceiro':'301'}
Given the txt file above, is it possible?
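You should parse the file with something safer than eval. The builtin json module would be the usual choice, but it only accepts double-quoted strings, so for this single-quoted content ast.literal_eval is the closer fit. Either way, you'll still have the same structure.
There are a few things you can do.
If you know both of the base key names ('Andar' and 'Apto'), you can do it as a one-line dict comprehension by zipping the values together.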
# what you'll get from the file
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
# One line dict comprehension
newdict = {key: value for key, value in zip(thisdict['Andar'], thisdict['Apto'])}
print(newdict)
If you don't know the names of the keys, you could call next on an iterator assuming they're the first 2 lists in your structure.
# what you'll get from the file
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
# create an iterator of the values since the keys are meaningless here
iterator = iter(thisdict.values())
# the first group of values are the keys
keys = next(iterator, None)
# and the second are the values
values = next(iterator, None)
# zip them together and have dict do the work for you
newdict = dict(zip(keys, values))
print(newdict)
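For the loading step itself, here is a small sketch that avoids eval. It assumes the file content is a plain Python literal like the one in the question; the helper name load_dict_from_text is made up for this example, and the text is passed in as a string so the snippet runs without the file.

```python
import ast


def load_dict_from_text(text):
    """Parse a Python-literal dict without executing arbitrary code."""
    return ast.literal_eval(text)


# Stand-in for the content read from imovel.txt
raw = "{'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}"

thisdict = load_dict_from_text(raw)
newdict = dict(zip(thisdict['Andar'], thisdict['Apto']))
print(newdict)
```

ast.literal_eval only accepts literals (strings, numbers, lists, dicts and so on), so a file containing anything else raises an error instead of being run.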
As other folks have noted, that looks like JSON, and it'd probably be easier to parse it and read through it as such. But if that's not an option for some reason, you can loop through your dictionary this way, as long as the lists at each key are all the same length (thisdict is the loaded dictionary; naming it dict would shadow the builtin):
for i, res in enumerate(thisdict[list(thisdict)[0]]):
    ith_values = [elem[i] for elem in thisdict.values()]
    print(ith_values)
If they're all different lengths, then you'll need to put some logic to check for that and print a blank or do some error handling for looking past the end of the list.
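One possible way to handle lists of different lengths is itertools.zip_longest, which pads the shorter lists. This is only a sketch: the shortened 'Apto' list and the empty-string fill value are assumptions chosen to show the padding.

```python
from itertools import zip_longest

# 'Apto' is deliberately one item short to show the padding
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101', '201']}

# Each row holds the i-th element of every list; gaps become ''
rows = list(zip_longest(*thisdict.values(), fillvalue=''))
for row in rows:
    print(list(row))
```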

Create dictionary with count of values from list

I'm trying to figure out how to create a dictionary with the school as the key and the wins-losses-draws record as the value, based on each item in the list. For example, calling my_dict['Clemson'] would return the string "1-1-1".
team_score_list =[['Georgia', 'draw'], ['Duke', 'loss'], ['Virginia Tech', 'win'], ['Virginia', 'loss'], ['Clemson', 'loss'], ['Clemson', 'win'], ['Clemson', 'draw']]
The output for the above list should be the following dictionary:
{'Georgia': '0-0-1', 'Duke': '0-1-0', 'Virginia Tech': '1-0-0', 'Virginia': '0-1-0', 'Clemson': '1-1-1'}
For context, the original data comes from a CSV, where each line is in the form of Date,Opponent,Location,Points For,Points Against.
For example: 2016-12-31,Kentucky,Neutral,33,18.
I've managed to wrangle the data into the above list (albeit probably not in the most efficient manner), but I'm not sure how to get it into the format above.
Any help would be greatly appreciated!
Not beautiful but this should work.
team_score_list = [
    ["Georgia", "draw"],
    ["Duke", "loss"],
    ["Virginia Tech", "win"],
    ["Virginia", "loss"],
    ["Clemson", "loss"],
    ["Clemson", "win"],
    ["Clemson", "draw"],
]
def gen_dict_lst(team_score_list):
    """Generates dict of list based on team record"""
    team_score_dict = {}
    for team_record in team_score_list:
        if team_record[0] not in team_score_dict.keys():
            team_score_dict[team_record[0]] = [0, 0, 0]
        if team_record[1] == "win":
            team_score_dict[team_record[0]][0] += 1
        elif team_record[1] == "loss":
            team_score_dict[team_record[0]][1] += 1
        elif team_record[1] == "draw":
            team_score_dict[team_record[0]][2] += 1
    return team_score_dict
def convert_format(score_dict):
    """formats list to string for output validation"""
    output_dict = {}
    for key, value in score_dict.items():
        new_val = []
        for index, x in enumerate(value):
            if index == 2:
                new_val.append(str(x))
            else:
                new_val.append(str(x) + "-")
        new_str = "".join(new_val)
        output_dict[key] = new_str
    return output_dict
score_dict = gen_dict_lst(team_score_list)
out_dict = convert_format(score_dict)
print(out_dict)
You can first make a dictionary and insert or increment the win, loss, and draw counts while iterating over the list. Here the variable names match the strings used for win, loss, and draw, and each variable holds the position of that count in the list; globals()[name] (from another answer) then looks the position up by name.
dct={}
for i in team_score_list:
    draw=2
    win=0
    loss=1
    if i[0] in dct:
        dct[i[0]][globals()[i[1]]]+=1
    else:
        dct[i[0]]=[0,0,0]
        dct[i[0]][globals()[i[1]]]=1
You can then convert your list to string by using '-'.join(...) to get it in a format you want in the dictionary.
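As a sketch of the same counting idea without going through globals(), an explicit mapping from result name to list position can be used, followed by the '-'.join(...) step. The names RESULT_INDEX, counts, and records are made up for this example.

```python
team_score_list = [
    ['Georgia', 'draw'], ['Duke', 'loss'], ['Virginia Tech', 'win'],
    ['Virginia', 'loss'], ['Clemson', 'loss'], ['Clemson', 'win'],
    ['Clemson', 'draw'],
]

# Position of each result in the wins-losses-draws list
RESULT_INDEX = {'win': 0, 'loss': 1, 'draw': 2}

counts = {}
for team, result in team_score_list:
    # Start every team at [0, 0, 0], then bump the matching slot
    counts.setdefault(team, [0, 0, 0])[RESULT_INDEX[result]] += 1

# Turn each list of counts into the "W-L-D" string
records = {team: '-'.join(map(str, c)) for team, c in counts.items()}
print(records)
```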
I now get what you mean:
You could do
a = dict()
f = lambda x,s: str(int(m[x]=='1' or j==s))
for (i,j) in team_score_list:
    m = a.get(i,'0-0-0')
    a[i] = f"{f(0,'win')}-{f(2,'loss')}-{f(4,'draw')}"
{'Georgia': '0-0-1',
'Duke': '0-1-0',
'Virginia Tech': '1-0-0',
'Virginia': '0-1-0',
'Clemson': '1-1-1'}
Note that this only works for this example, where no count goes above 1. With more data, it would be better to keep a list of counts and join at the end, e.g.:
b = dict()
g = lambda x,s: str(int(m[x]) + (j==s))
for (i,j) in team_score_list:
    m = b.get(i,[0,0,0])
    b[i] =[g(0,"win"),g(1,"loss"),g(2,"draw")]
{key:'-'.join(val) for key,val in b.items()}
{'Georgia': '0-0-1',
'Duke': '0-1-0',
'Virginia Tech': '1-0-0',
'Virginia': '0-1-0',
'Clemson': '1-1-1'}

Python delete dictionary items with same values

How would I remove dictionary items that have different keys but identical values? I am sure there is a better way than my novice algorithm...?
Example: (abends with error "dictionary changed size during iteration")
The goal of this example is to remove either 'car_id' or 'truck_id' from dict.
key_fields_obj = {}
key_fields_obj['car_id'] = 'bob'
key_fields_obj['bike_id'] = 'sam'
key_fields_obj['truck_id'] = 'bob' #goal: remove this one, so left with only car_id and bike_id
for item in key_fields_obj:
    tst = key_fields_obj[item]
    for comp in key_fields_obj:
        if item == comp:
            continue
        cmp = key_fields_obj[comp]
        if cmp == tst:
            del key_fields_obj[comp]
print(key_fields_obj)
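Create a new dict with the values as keys and the keys as values, then do the same again on the new dict. Duplicate values collapse because dict keys are unique; the last key seen for each value is the one that survives (here 'truck_id' rather than 'car_id').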
>>> key_fields_obj = {key_fields_obj[key]: key for key in key_fields_obj}
>>> key_fields_obj
{'bob': 'truck_id', 'sam': 'bike_id'}
>>>
>>> key_fields_obj = {key_fields_obj[key]: key for key in key_fields_obj}
>>> key_fields_obj
{'truck_id': 'bob', 'bike_id': 'sam'}
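Alternatively, keep track of the values already seen and delete later duplicates, iterating over a copy of the items so the dict can be changed inside the loop:
seen_values = set()
for key, value in list(key_fields_obj.items()):
    if value in seen_values:
        del key_fields_obj[key]
    else:
        seen_values.add(value)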

Retrieving dict value via hardcoded key, works. Retrieving via computed key doesn't. Why?

I'm generating a common list of IDs by comparing two sets of IDs (the ID sets are from a dictionary, {ID: XML "RECORD" element}). Once I have the common list, I want to iterate over it and retrieve the value corresponding to the ID from a dictionary (which I'll write to disc).
When I compute the common ID list using my diff_comm_checker function, I'm unable to retrieve the dict value the ID corresponds to. It doesn't however fail with a KeyError. I can also print the ID out.
When I hard code the ID in as the common_id value, I can retrieve the dict value.
I.e.
common_ids = diff_comm_checker( list_1, list_2, "text")
# does nothing - no failures
common_ids = ['0603599998140032MB']
#gives me:
0603599998140032MB {'R': '0603599998140032MB'} <Element 'RECORD' at 0x04ACE788>
0603599998140032MB {'R': '0603599998140032MB'} <Element 'RECORD' at 0x04ACE3E0>
So I suspected there was some difference between the strings. I checked both the function output and compared it against the hard-coded values using:
print [(_id, type(_id), repr(_id)) for _id in common_ids][0]
I get exactly the same for both:
>>> ('0603599998140032MB', <type 'str'>, "'0603599998140032MB'")
I have also followed the advice of another question and used difflib.ndiff:
common_ids1 = diff_comm_checker( [x.keys() for x in to_write[0]][0], [x.keys() for x in to_write[1]][0], "text")
common_ids = ['0603599998140032MB']
print "\n".join(difflib.ndiff(common_ids1, common_ids))
>>> 0603599998140032MB
So again, doesn't appear that there's any difference between the two.
Here's a full, working example of the code:
from StringIO import StringIO
import xml.etree.cElementTree as ET
from itertools import chain, islice
def diff_comm_checker(list_1, list_2, text):
    """Checks 2 lists. If no difference, pass. Else return common set between two lists"""
    symm_diff = set(list_1).symmetric_difference(list_2)
    if not symm_diff:
        pass
    else:
        mismatches_in1_not2 = set(list_1).difference( set(list_2) )
        mismatches_in2_not1 = set(list_2).difference( set(list_1) )
        if mismatches_in1_not2:
            mismatch_logger(
                mismatches_in1_not2,"{}\n1: {}\n2: {}".format(text, list_1, list_2), 1, 2)
        if mismatches_in2_not1:
            mismatch_logger(
                mismatches_in2_not1,"{}\n2: {}\n1: {}".format(text, list_1, list_2), 2, 1)
    set_common = set(list_1).intersection( set(list_2) )
    if set_common:
        return sorted(set_common)
    else:
        return "no common set: {}\n".format(text)
def chunks(iterable, size=10):
    iterator = iter(iterable)
    for first in iterator:
        yield chain([first], islice(iterator, size - 1))
def get_elements_iteratively(file):
    """Create unique ID out of image number and case number, return it along with corresponding xml element"""
    tag = "RECORD"
    tree = ET.iterparse(StringIO(file), events=("start","end"))
    context = iter(tree)
    _, root = next(context)
    for event, record in context:
        if event == 'end' and record.tag == tag:
            xml_element_2 = ''
            xml_element_1 = ''
            for child in record.getchildren():
                if child.tag == "IMAGE_NUMBER":
                    xml_element_1 = child.text
                if child.tag == "CASE_NUM":
                    xml_element_2 = child.text
            r_id = "{}{}".format(xml_element_1, xml_element_2)
            record.set("R", r_id)
            yield (r_id, record)
            root.clear()
def get_chunks(file, chunk_size):
    """Breaks XML into chunks, yields dict containing unique IDs and corresponding xml elements"""
    iterable = get_elements_iteratively(file)
    for chunk in chunks(iterable, chunk_size):
        ids_records = {}
        for k in chunk:
            ids_records[k[0]]=k[1]
        yield ids_records
def create_new_xml(xml_list):
    chunk = 5000
    chunk_rec_ids_1 = get_chunks(xml_list[0], chunk)
    chunk_rec_ids_2 = get_chunks(xml_list[1], chunk)
    to_write = [chunk_rec_ids_1, chunk_rec_ids_2]
    ######################################################################################
    ### WHAT'S GOING HERE ??? WHAT'S THE DIFFERENCE BETWEEN THE OUTPUTS OF THESE TWO ? ###
    common_ids = diff_comm_checker( [x.keys() for x in to_write[0]][0], [x.keys() for x in to_write[1]][0], "create_new_xml - large - common_ids")
    #common_ids = ['0603599998140032MB']
    ######################################################################################
    for _id in common_ids:
        print _id
        for gen_obj in to_write:
            for kv_pair in gen_obj:
                if kv_pair[_id]:
                    print _id, kv_pair[_id].attrib, kv_pair[_id]
if __name__ == '__main__':
    xml_1 = """<?xml version="1.0"?><RECORDSET><RECORD><CASE_NUM>140032MB</CASE_NUM><IMAGE_NUMBER>0603599998</IMAGE_NUMBER></RECORD></RECORDSET>"""
    xml_2 = """<?xml version="1.0"?><RECORDSET><RECORD><CASE_NUM>140032MB</CASE_NUM><IMAGE_NUMBER>0603599998</IMAGE_NUMBER></RECORD></RECORDSET>"""
    create_new_xml([xml_1, xml_2])
The problem is not in the type or value of common_ids returned from diff_comm_checker. The problem is that calling diff_comm_checker, or more precisely constructing its arguments, consumes the contents of to_write.
If you try this you will see what I mean:
common_ids = ['0603599998140032MB']
diff_comm_checker( [x.keys() for x in to_write[0]][0], [x.keys() for x in to_write[1]][0], "create_new_xml - large - common_ids")
This gives the erroneous behavior without using the return value from diff_comm_checker().
This is because to_write holds generators, and the list comprehensions that build the arguments to diff_comm_checker exhaust them. The generators are then empty when they are iterated in the loop, so the if-statement is never reached. You can create a list from a generator by using list:
chunk_rec_ids_1 = list(get_chunks(xml_list[0], chunk))
chunk_rec_ids_2 = list(get_chunks(xml_list[1], chunk))
But this may have other implications (memory usage...)
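The exhaustion effect is easy to reproduce in isolation. The following sketch uses a made-up generator, id_chunks, standing in for get_chunks; it is not the code from the question, only a demonstration that a generator can be consumed once while a list can be walked repeatedly.

```python
def id_chunks():
    """Hypothetical stand-in for get_chunks: yields one dict of IDs."""
    yield {'0603599998140032MB': 'record'}


gen = id_chunks()
first_pass = [list(chunk) for chunk in gen]   # consumes the generator
second_pass = [list(chunk) for chunk in gen]  # nothing left to iterate

print(first_pass)
print(second_pass)

# Materializing into a list allows repeated iteration
chunks_list = list(id_chunks())
again_1 = [list(chunk) for chunk in chunks_list]
again_2 = [list(chunk) for chunk in chunks_list]
print(again_1 == again_2)
```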
Also, what is the intention of this construct in diff_comm_checker?
if not symm_diff:
    pass
In my opinion, this branch does nothing: symm_diff is a set, so it is never None, and when it is empty the code just passes.

Return only a few keys in a dictionary? Python

I have a dictionary with values attached. I am able to get all the keys. I have searched around, and a lot of people say to put the keys in a list. However, I need the values attached to each key, and the values must stay the same.
mydict = {'Car':'BMW','Speed':'kph','Range':33}
for keys in mydict:
    print(keys)
What I am after is any two of the keys and their values to be printed out.
I don't fully understand what you are looking for.
Do you want to print the values as well? Go for:
mydict = {'Car':'BMW','Speed':'kph','Range':33}
for keys in mydict:
    print(keys,":",mydict[keys])
Do you just want to print two of them?
mydict = {'Car':'BMW','Speed':'kph','Range':33}
from itertools import islice
def take(n, iterable):
    return list(islice(iterable, n))
n_items = take(2, mydict.items())
print(n_items)
itertools is part of the standard library, so there is nothing to install.
Well, even if you put them in a list, you can still get the values:
mydict = {'Car':'BMW','Speed':'kph','Range':33}
keys = list(mydict)
for key in keys:
    print(mydict[key])
If you want only two keys you can do:
keys = keys[:2]
And if you want a new dictionary using only those two keys:
mynewdict = {k:v for k,v in mydict.items() if k in keys}
And probably the shortest:
for key in list(mydict)[:2]:
    print(key, mydict[key])
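As a variation on the slicing idea, the first two key-value pairs can also be collected straight into a new dict with islice. This sketch assumes Python 3.7 or later, where dicts keep insertion order; first_two is a made-up name.

```python
from itertools import islice

mydict = {'Car': 'BMW', 'Speed': 'kph', 'Range': 33}

# Take the first two (key, value) pairs and rebuild a dict from them
first_two = dict(islice(mydict.items(), 2))
for key, value in first_two.items():
    print(key, value)
```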
