I'm looping through a CSV, and would like to change "Gotham" to "Home". I've tried a couple ways, after searching around online, but can't seem to get it to work.
import csv
csv_file = "test.csv"
def process_csv(file):
    headers=[]
    data = []
    csv_data = csv.reader(open(file))
    for i, row in enumerate(csv_data):
        if i == 0:
            headers = row
            continue;
        field = []
        for i in range(len(headers)):
            field.append((headers[i],row[i]))
        data.append(field)
    return data
def create_merge_fast(city, country, contact):
    lcl = locals()
    ## None of these do what I'd think - if city is "Gotham" change it to "Home"
    for key, value in lcl.items():
        if value == "Gotham":
            lcl[value] = "Home"
        print(key, value)
    for value in lcl.values():
        if value == "Gotham":
            lcl[value] = "Home"
        print(value)
def set_fields_create_doc(data):
    city = data[4][1]
    country = data[6][1]
    contact = data[9][1]
    create_merge_fast(city, country, contact)
data = process_csv(csv_file)
for i in data:
    set_fields_create_doc(i)
I always seem to get
RuntimeError: dictionary changed size during iteration
right after
Gotham
is printed...
You cannot add or remove keys in your dict while iterating over it. The moment its size changes, the iterator in the for..in loop becomes invalid and raises the RuntimeError shown above. (Assigning a new value to a key that already exists is fine.)
You can simply fix that by stopping the iteration once the match is found and changes were made to the dict, i.e.
for key, value in lcl.items():
    if value == "Gotham":
        lcl[key] = "Home"
        break # exit here
    print(key, value)
However, if it's possible to have multiple items that match this condition, simply breaking away won't work, but you can freeze the key list instead before you start iterating through it:
for key in list(lcl.keys()):
    if lcl[key] == "Gotham":
        lcl[key] = "Home"
Change
lcl[value] = "Home"
to
lcl[key] = "Home"
The former will actually create a new entry in your dictionary with the key as "Gotham" and value as "Home", rather than modifying the existing value from "Gotham" to "Home".
So, you want:
for key in lcl:
    if lcl[key] == "Gotham":
        lcl[key] = "Home"
    print(key, lcl[key])
Also, you don't need the second loop.
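To see both behaviours side by side, here is a minimal sketch. It uses a plain dict named fields with made-up values as a stand-in for the dict returned by locals(), which is an assumption for illustration only. Writing to fields[value] adds a new key and triggers the error, while writing to fields[key] only replaces an existing value.

```python
# Plain dict standing in for the dict returned by locals() (assumed sample values).
fields = {"city": "Gotham", "country": "USA", "contact": "Gotham"}

# Adding a key while iterating changes the dict's size and raises RuntimeError.
try:
    for key, value in fields.items():
        if value == "Gotham":
            fields[value] = "Home"  # bug: creates the new key "Gotham"
except RuntimeError as exc:
    error_message = str(exc)

# Remove the stray key so the second loop starts from the original data.
fields.pop("Gotham", None)

# Reassigning the value of an existing key keeps the size constant, so it is safe.
for key in fields:
    if fields[key] == "Gotham":
        fields[key] = "Home"

print(error_message)
print(fields)
```

Note that inside a function, writing to the dict returned by locals() is not a reliable way to rebind the actual local variables, so a plain dict (or a direct check such as if city == "Gotham": city = "Home") is probably the simpler route.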
The function has to return a dictionary of dictionaries, not a list of dictionaries,
and I'm having trouble putting a dictionary inside an empty dictionary. Since it's a nested dictionary,
appending won't work, because it's not a list of dictionaries.
Can you please pinpoint where the error is?
This is my code.
def run():
    name_dict = ['Product', 'Brand', 'Cost']
    product = {}
    while True:
        item = {}
        for key in name_dict:
            input_item = input("Enter {}: ".format(key))
            if input_item == 'quit':
                return product
            item[key] = input_item
        print()
product = run()
print(product)
This is the question.
You forgot to update product with item. Put this line after the inner for loop, once item[key] = input_item has run for every field:
product[str(item['Product'])] = item
This adds item to the product dictionary, using the product name the user entered as the key. The str() call is a safeguard for names made only of digits; in Python 3, input() already returns a string, so it is optional.
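Put together, the fix could look like the sketch below. The read parameter and the scripted answers are my own additions so the function can be tried without typing; they are not part of the original code.

```python
def run(read=input):
    """Collect products until the user types 'quit'; return a dict of dicts."""
    field_names = ['Product', 'Brand', 'Cost']
    product = {}
    while True:
        item = {}
        for key in field_names:
            answer = read("Enter {}: ".format(key))
            if answer == 'quit':
                return product
            item[key] = answer
        # Store the finished item under its product name.
        product[item['Product']] = item


# Scripted answers stand in for keyboard input (hypothetical sample data).
answers = iter(['Soap', 'Dove', '2', 'quit'])
result = run(read=lambda prompt: next(answers))
print(result)
```

Calling run() with no arguments falls back to the real input(), so interactive use stays the same.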
I have just started learning Python and I have been given an assignment to create a list of players and stats using different loops.
I can't work out how to create a function that searches the player list and outputs the player's name and the player's stat.
Here is the assignment:
Create an empty list called players
Use two input() statements inside a for loop to collect the name
and performance of each player (the name will be in the form of a
string and the performance as an integer from 0 – 100.) Add both
pieces of information to the list (so in the first iteration of the
loop players[0] will contain the name of the first player and
players[1] will contain their performance.) You are not required to
validate this data.
Use a while loop to display all the player information in the
following form:
Player : Performance
Use a loop type of your choice to copy the performance values from
the players list and store these items in a new list called results
Write a function that accepts the values “max” or “min” and
returns the maximum or minimum values from the results list
Write a function called find_player() that accepts a player name
and displays their name and performance from the players list, or an
error message if the player is not found.
Here is what I have so far:
print ("Enter 11 Player names and stats")
# Create player list
playerlist = []
# Create results list
results = []
# for loop setting amount of players and collecting input/appending list
for i in range(11):
    player = (input("Player name: "))
    playerlist.append(player)
    stats = int(input("Player stats: "))
    playerlist.append(stats)
# While loop printing player list
whileLoop = True
while whileLoop == True:
    print (playerlist)
    break
# for loop append results list, [start:stop:step]
for i in range(11):
    results.append(playerlist[1::2])
    break
# max in a custom function
def getMax(results):
    results = (playerlist[1::2])
    return max(results)
print ("Max Stat",getMax(results))
# custom function to find player
def find_player(playerlist):
    list = playerlist
    name = str(input("Search keyword: "))
    return (name)
    for s in list:
        if name in str(s):
            return (s)
print (find_player(playerlist))
I have tried many different ways to create the find player function without success.
I think I am having problems because my list consists of strings and integers eg. ['john', 6, 'bill', 8]
I would like it to display the player that was searched for and the stats ['John', 6]
Any help would be greatly appreciated.
PS:
I know there is no need for all these loops but that is what the assignment seems to be asking for.
Thank you
I cut down on the fat and made a "dummy list", but your find_player function seems to work well, once you remove the first return statement! Once you return something, the function just ends.
All it needs is to also display the performance like so:
# Create player list
playerlist = ["a", 1, "b", 2, "c", 3]
# custom function to find player
def find_player(playerlist):
    name = str(input("Search keyword: "))
    searchIndex = 0
    for s in playerlist:
        try:
            if name == str(s):
                return ("Player: '%s' with performance %d" % (name, playerlist[searchIndex+1]))
        except Exception as e:
            print(e)
        searchIndex += 1
print (find_player(playerlist))
>>Search keyword: a
>>Player: 'a' with performance 1
I also added a try/except in case something goes wrong.
Also: NEVER USE "LIST" AS A VARIABLE NAME! It shadows Python's built-in list type.
Besides, you already have a name for it inside the function, so why assign it another one? You can just use playerlist there.
Your code didn't work because you read the search key and immediately returned it. For the code to work, you must use the key to find the value. In this task, the list is in the format ['key1', value1, 'key2', value2, ...]. In the function, index is a variable that stores the position of the key, and the loop finds that position. The function then returns list[index + 1], which is the value corresponding to the key.
playerlist = []
def find_player(playerlist):
    list = playerlist
    name = str(input("Search keyword: "))
    index = 0
    for s in list:
        if name == str(s):
            return ("This keyword's value: %d" % (list[index+1]))
        index+=1
print (find_player(playerlist))
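Since names and performances alternate in the list, another option is to step through it two items at a time, so a performance value can never be mistaken for a name. This is only a sketch: taking the name as a parameter (instead of calling input() inside the function) and the exact wording of the messages are my assumptions, chosen to match the "Player : Performance" form and the "error message if the player is not found" requirement from the assignment.

```python
def find_player(players, name):
    """Return 'name : performance' for a player, or an error message."""
    # Names sit at even indexes; each performance follows its name.
    for index in range(0, len(players) - 1, 2):
        if players[index] == name:
            return "{} : {}".format(players[index], players[index + 1])
    return "Player '{}' not found".format(name)


players = ["john", 6, "bill", 8]  # sample data in the question's layout
print(find_player(players, "bill"))
print(find_player(players, "zoe"))
```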
I've created a dictionary of different keys and different values. When I update the values for one key, all the values update and are equal. I don't understand why that is. I don't understand why the values of each key are pointing to the same memory location especially when they were created at different times.
I've tried using the update methods.
I've tried assigning values by doing diction['new_value']= 'key_value'.
I've tried dict.fromkeys().
def transform(self, rows):
    # Rows are a row from a csv, tsv, or txt file and I'm splitting them by white space, tabs, or commas.
    # I've created an inherited function that does that for me called split_line.
    data = self.split_line(rows)
    for idx, column in enumerate(data):
        if idx != 0:
            if self.metadata_types[idx].lower() == 'numeric':
                column = round(float(column), 3)
            elif self.metadata_types[idx].lower() == 'group':
                if column not in self.uniqueValues:
                    self.uniqueValues.append(column)
            annotation =self.header[idx]
            self.annotation_subdocs[annotation]['value'].append(column)
def create_annotation_subdocs(self):
    annotation_subdocs = dict()
    # For every value in the header I want to create a new dictionary with the key being the header value
    for idx, value in enumerate(self.header):
        if value == 'name':
            self.annotation_subdocs[value]= create_metadata_subdoc('text', idx, 'cells')
            print(id(annotation_subdocs[value]))
        elif value in ('x', 'y', 'z'):
            self.annotation_subdocs[value] = create_metadata_subdoc(value, idx, 'coordinates')
            print(id(annotation_subdocs[value]))
        else:
            self.annotation_subdocs[value]=create_metadata_subdoc(value, idx, 'annotations')
            print(id(annotation_subdocs[value]))
def create_metadata_subdoc(name, idx, header_value_type, *, value=[], subsampled_annotation=None):
    return {
        'name': name,
        'array_index': idx,
        'value': value,
        'array_type': header_value_type,
        'subsampled_annotation': subsampled_annotation,
        'subsamp_threashold': "",
    }
I expect the values for each key to be different. Instead, all the values are updated at the same time even though I'm accessing specific keys.
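Going only by the code shown, a likely culprit is the value=[] default in create_metadata_subdoc. Python evaluates default arguments once, when the function is defined, so every subdoc created without an explicit value ends up holding the same list object. The sketch below reproduces that with two stripped-down helpers (shared_subdoc and fresh_subdoc are hypothetical names, not from the original code) and shows the usual None-sentinel workaround.

```python
def shared_subdoc(name, *, value=[]):
    # The default list is created once, when the function is defined.
    return {'name': name, 'value': value}


def fresh_subdoc(name, *, value=None):
    # Use None as a sentinel and build a new list on every call.
    if value is None:
        value = []
    return {'name': name, 'value': value}


a, b = shared_subdoc('x'), shared_subdoc('y')
a['value'].append(1.0)
print(b['value'])                # the append made through a is visible via b
print(a['value'] is b['value'])  # both keys point at one list object

c, d = fresh_subdoc('x'), fresh_subdoc('y')
c['value'].append(1.0)
print(d['value'])                # d keeps its own, still empty, list
```

If that is indeed the cause, the dictionaries themselves are distinct (their id() values differ); it is only the 'value' lists inside them that are shared.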
I have a list of dictionaries, and I am trying to check whether each dictionary in the list contains a particular value and, if the value matches, insert a new item into the matching dictionary.
emp_name = "Jack"
my_list = [{'name':'Jack', 'age':'42', 'j_id':'1'}, {'name':'charles', 'age':'32', 'j_id':'34'}, {'name':'john', 'age':'44', 'j_id':'3'}, {'name':'jacob', 'age':'24', 'j_id':'5'}]
for item in my_list:
    name = item.get('name')
    print(name)
    if name == emp_name:
        item['date'] = "something"
        print(item)
        # add this item value to the dictionary
    else:
        print("not_matching")
Here is my expected output:
[{'name':'Jack', 'age':'42', 'j_id':'1', 'date':'something'},
{'name':'charles', 'age':'32', 'j_id':'34'}, {'name':'john', 'age':'44',
'j_id':'3'}, {'name':'jacob', 'age':'24', 'j_id':'5'}]
Is there any other pythonic way to simplify this code?
Here's a simplified version of the for loop.
for item in my_list:
    if 'name' in item and item['name'] == emp_name:
        item['date'] = 'something'
EDIT: An alternative solution (as suggested by @brunodesthuilliers in the comments) is to use dict's get() method, which returns a default instead of raising KeyError when the key is missing.
for item in my_list:
    if item.get("name", "") == emp_name:
        item['date'] = 'something'
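As a quick check of the get() variant, here is a small self-contained run. The second record deliberately has no 'name' key; that record is made up for illustration and is not in the original data.

```python
emp_name = "Jack"
my_list = [
    {'name': 'Jack', 'age': '42', 'j_id': '1'},
    {'age': '32', 'j_id': '34'},  # hypothetical record without a name
    {'name': 'john', 'age': '44', 'j_id': '3'},
]

for item in my_list:
    # get() returns None for a missing key instead of raising KeyError.
    if item.get('name') == emp_name:
        item['date'] = 'something'

print(my_list[0])
```

Only the matching dictionary is changed in place; the others are left as they were.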
I'm generating a common list of IDs by comparing two sets of IDs (the ID sets are from a dictionary, {ID: XML "RECORD" element}). Once I have the common list, I want to iterate over it and retrieve the value corresponding to the ID from a dictionary (which I'll write to disc).
When I compute the common ID list using my diff_comm_checker function, I'm unable to retrieve the dict value the ID corresponds to. It doesn't however fail with a KeyError. I can also print the ID out.
When I hard code the ID in as the common_id value, I can retrieve the dict value.
I.e.
common_ids = diff_comm_checker( list_1, list_2, "text")
# does nothing - no failures
common_ids = ['0603599998140032MB']
#gives me:
0603599998140032MB {'R': '0603599998140032MB'} <Element 'RECORD' at 0x04ACE788>
0603599998140032MB {'R': '0603599998140032MB'} <Element 'RECORD' at 0x04ACE3E0>
So I suspected there was some difference between the strings. I checked both the function output and compared it against the hard-coded values using:
print [(_id, type(_id), repr(_id)) for _id in common_ids][0]
I get exactly the same for both:
>>> ('0603599998140032MB', <type 'str'>, "'0603599998140032MB'")
I have also followed the advice of another question and used difflib.ndiff:
common_ids1 = diff_comm_checker( [x.keys() for x in to_write[0]][0], [x.keys() for x in to_write[1]][0], "text")
common_ids = ['0603599998140032MB']
print "\n".join(difflib.ndiff(common_ids1, common_ids))
>>> 0603599998140032MB
So again, doesn't appear that there's any difference between the two.
Here's a full, working example of the code:
from StringIO import StringIO
import xml.etree.cElementTree as ET
from itertools import chain, islice
def diff_comm_checker(list_1, list_2, text):
    """Checks 2 lists. If no difference, pass. Else return common set between two lists"""
    symm_diff = set(list_1).symmetric_difference(list_2)
    if not symm_diff:
        pass
    else:
        mismatches_in1_not2 = set(list_1).difference( set(list_2) )
        mismatches_in2_not1 = set(list_2).difference( set(list_1) )
        if mismatches_in1_not2:
            mismatch_logger(
                mismatches_in1_not2,"{}\n1: {}\n2: {}".format(text, list_1, list_2), 1, 2)
        if mismatches_in2_not1:
            mismatch_logger(
                mismatches_in2_not1,"{}\n2: {}\n1: {}".format(text, list_1, list_2), 2, 1)
    set_common = set(list_1).intersection( set(list_2) )
    if set_common:
        return sorted(set_common)
    else:
        return "no common set: {}\n".format(text)
def chunks(iterable, size=10):
    iterator = iter(iterable)
    for first in iterator:
        yield chain([first], islice(iterator, size - 1))
def get_elements_iteratively(file):
    """Create unique ID out of image number and case number, return it along with corresponding xml element"""
    tag = "RECORD"
    tree = ET.iterparse(StringIO(file), events=("start","end"))
    context = iter(tree)
    _, root = next(context)
    for event, record in context:
        if event == 'end' and record.tag == tag:
            xml_element_2 = ''
            xml_element_1 = ''
            for child in record.getchildren():
                if child.tag == "IMAGE_NUMBER":
                    xml_element_1 = child.text
                if child.tag == "CASE_NUM":
                    xml_element_2 = child.text
            r_id = "{}{}".format(xml_element_1, xml_element_2)
            record.set("R", r_id)
            yield (r_id, record)
            root.clear()
def get_chunks(file, chunk_size):
    """Breaks XML into chunks, yields dict containing unique IDs and corresponding xml elements"""
    iterable = get_elements_iteratively(file)
    for chunk in chunks(iterable, chunk_size):
        ids_records = {}
        for k in chunk:
            ids_records[k[0]]=k[1]
        yield ids_records
def create_new_xml(xml_list):
    chunk = 5000
    chunk_rec_ids_1 = get_chunks(xml_list[0], chunk)
    chunk_rec_ids_2 = get_chunks(xml_list[1], chunk)
    to_write = [chunk_rec_ids_1, chunk_rec_ids_2]
    ######################################################################################
    ### WHAT'S GOING HERE ??? WHAT'S THE DIFFERENCE BETWEEN THE OUTPUTS OF THESE TWO ? ###
    common_ids = diff_comm_checker( [x.keys() for x in to_write[0]][0], [x.keys() for x in to_write[1]][0], "create_new_xml - large - common_ids")
    #common_ids = ['0603599998140032MB']
    ######################################################################################
    for _id in common_ids:
        print _id
        for gen_obj in to_write:
            for kv_pair in gen_obj:
                if kv_pair[_id]:
                    print _id, kv_pair[_id].attrib, kv_pair[_id]
if __name__ == '__main__':
    xml_1 = """<?xml version="1.0"?><RECORDSET><RECORD><CASE_NUM>140032MB</CASE_NUM><IMAGE_NUMBER>0603599998</IMAGE_NUMBER></RECORD></RECORDSET>"""
    xml_2 = """<?xml version="1.0"?><RECORDSET><RECORD><CASE_NUM>140032MB</CASE_NUM><IMAGE_NUMBER>0603599998</IMAGE_NUMBER></RECORD></RECORDSET>"""
    create_new_xml([xml_1, xml_2])
The problem is not in the type or value of common_ids returned from diff_comm_checker. The problem is that calling diff_comm_checker, or rather constructing its arguments, destroys the values of to_write.
If you try this you will see what I mean
common_ids = ['0603599998140032MB']
diff_comm_checker( [x.keys() for x in to_write[0]][0], [x.keys() for x in to_write[1]][0], "create_new_xml - large - common_ids")
This will give the erroneous behavior without using the return value from diff_comm_checker()
This is because to_write holds generators, and the list comprehensions that build the arguments for diff_comm_checker exhaust them. The generators are then finished/empty by the time they are used in the if-statement in the loop. You can create a list from a generator by using list:
chunk_rec_ids_1 = list(get_chunks(xml_list[0], chunk))
chunk_rec_ids_2 = list(get_chunks(xml_list[1], chunk))
But this may have other implications (memory usage...)
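The effect is easy to reproduce in isolation. The sketch below is written for Python 3 (the question's code is Python 2) and uses a simplified stand-in for get_chunks with made-up records, so treat it as an illustration of generator exhaustion rather than a drop-in replacement.

```python
def get_chunks(records, size):
    """Yield dicts of up to `size` (id, value) pairs (simplified stand-in)."""
    for start in range(0, len(records), size):
        yield dict(records[start:start + size])


records = [("id1", "a"), ("id2", "b"), ("id3", "c")]

gen = get_chunks(records, 2)
first_keys = [sorted(chunk) for chunk in gen][0]  # consumes every chunk
leftover = list(gen)                              # the generator is now empty

chunks = list(get_chunks(records, 2))             # materialize once
keys_again = [sorted(chunk) for chunk in chunks][0]
still_there = len(chunks)                         # the list can be reused

print(first_keys, leftover, keys_again, still_there)
```

A list can be iterated as often as needed, at the cost of holding all chunks in memory at once.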
Also, what is the intention of this construct in diff_comm_checker?
if not symm_diff:
    pass
As far as I can tell, that branch does nothing: symm_diff is a set (never None), and the code after the if/else runs the same way whether it is empty or not.