I'm stuck on this problem: when I ran the test cases, I kept getting a KeyError. Is there a way to fix it?
All of the files are in the shared google drive.
https://drive.google.com/drive/folders/1OqrHxY42Cka9_H9pfA9VLQOkIuqoSQKN?usp=sharing
Code:
import csv
def read_votes(filename):
    rows = []
    columns = []
    try:
        with open(filename, 'r') as file:
            csvreader = csv.reader(file)
            column = next(csvreader)
            for row in csvreader:
                row.append(row)
        dict{}
        vote_dbase = {}
        for row in rows:
            state = row[0]
            candidate = (row[1], row[2], row[3], row[4])
            if int(row[3]) > 0:
                if state in vote_dbase:
                    flag = 0
                    for i in range(len(vote_dbase[state])):
                        if row[1] < vote_dbase[state][i][0]:
                            vote_dbase[state].insert(i, candidate)
                            flag = 1
                            break
                    if flag == 0:
                        vote_dbase[state].append(candidate)
                else:
                    vote_dbase[state] = [candidate]
        return vote_dbase
    except:
        return False
Fail case with KeyError
It's not clear from your code what behaviour you want to occur, as we can't see the tests where the error occurs.
That said, doing some basic debugging, it seems you're not processing the input values properly. The problem is at
for row in csvreader:
    row.append(row)
You are appending row to itself rather than to your list rows. I think you want
for row in csvreader:
    rows.append(row)
I would also recommend against wrapping everything in a try block and doing nothing with the exception. That way you can hit an error and never see the message that would help you debug your code. Either drop the try block or handle the exception like this:
except Exception as exception_instance:
    print(exception_instance)
    return False
There's also a stray dict{} in your code, which is not valid Python and should be removed.
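Putting those fixes together, here is a minimal sketch of how the function could look. It assumes the column layout implied by the question's indexing (state in column 0, the sort key in column 1, a numeric value in column 3); the sample column names in the usage below are made up, and the tests from the shared drive are not visible here, so treat it as a starting point rather than the expected solution.

```python
import csv

def read_votes(filename):
    """Return {state: [candidate tuples sorted by row[1]]}, or False if the file can't be read."""
    try:
        with open(filename, "r", newline="") as file:
            csvreader = csv.reader(file)
            next(csvreader, None)      # skip the header row
            rows = list(csvreader)     # collect every data row
    except OSError as error:
        print(error)                   # show why the file could not be opened
        return False

    vote_dbase = {}
    for row in rows:
        if int(row[3]) <= 0:           # ignore rows without a positive count
            continue
        candidate = (row[1], row[2], row[3], row[4])
        vote_dbase.setdefault(row[0], []).append(candidate)
    # Sorting once at the end replaces the manual insert loop.
    for candidates in vote_dbase.values():
        candidates.sort(key=lambda c: c[0])
    return vote_dbase
```

Because only OSError is caught, a malformed row (for example a non-numeric value in column 3) raises a visible ValueError instead of silently returning False.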
Related
CSV file:
Acct,phn_1,phn_2,phn_3,Name,Consent,zipcode
1234,45678,78906,,abc,NCN,10010
3456,48678,,78976,def,NNC,10010
Problem:
Based on the Consent value, which has one letter per phone (in the 1st row, the first N is for phn_1, the C is for phn_2, and so on), I need to retain only the phn column marked C and move the remaining columns to the end of the file.
Below is what I have; I don't think my approach is great. I'm trying to get the index of each N and C and map it to the corresponding phone, but I'm unable to iterate through the phn headers and compare them with the indexes of the Ns and Cs.
with open('file.csv', 'rU') as infile:
    reader = csv.DictReader(infile)
    data = {}
    for row in reader:
        for header, value in row.items():
            data.setdefault(header, list()).append(value)
# print(data)
Consent = data['Consent']
for i in range(len(Consent)):
    # print(list(Consent[i]))
    for idx, val in enumerate(list(Consent[i])):
        # print(idx, val)
        if val == 'C':
            #print("C")
            print(idx)
        else:
            print("N")
Could someone provide me with the solution for this?
Please Note: Do not want the solution to be by using pandas.
You’ll find my answer in the comments of the code below.
import csv
def parse_csv(file_name):
    """Return the CSV rows as a string, with the first consented phone in a "phn" column."""
    # Prepare the output. Note that all rows of a CSV file must have the same structure.
    # So it is actually not possible to put the phone numbers with no consent at the end
    # of the file, but what you can do is to put them at the end of the row.
    # To ensure that the structure is the same on all rows, you need to put all phone numbers
    # at the end of the row. That means the phone number with consent is duplicated, and that
    # is not very efficient.
    # I chose to put the result in a string, but you can use other types.
    output = "Acct,phn,Name,Consent,zipcode,phn_1,phn_2,phn_3\n"
    with open(file_name, "r") as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            # Search the letter “C” in “Consent” and get the position of the first match.
            # Add one to the result because the “phn_×” keys are 1-based and not 0-based.
            first_c_pos = row["Consent"].find("C") + 1
            # If there is no “C”, then the “phn” key is empty.
            if first_c_pos == 0:
                row["phn"] = ""
            # If there is at least one “C”, create a key string that will take the values
            # phn_1, phn_2 or phn_3.
            else:
                key = f"phn_{first_c_pos}"
                row["phn"] = row[key]
            # Add the current row to the result string.
            output += ",".join([
                row["Acct"], row["phn"], row["Name"], row["Consent"],
                row["zipcode"], row["phn_1"], row["phn_2"], row["phn_3"]
            ])
            output += "\n"
    # Return the string.
    return output
if __name__ == "__main__":
    output = parse_csv("file.csv")
    print(output)
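If any field could ever contain a comma or a quote, joining with "," would produce a broken row. As a hedged variation on the code above, the same logic can be written with csv.DictWriter, which handles quoting. The function name move_consented_phone is made up for this sketch, and it reads from and writes to file-like objects so the question's two sample rows can be run directly.

```python
import csv
import io

def move_consented_phone(infile, outfile):
    """Copy rows from infile to outfile, adding a 'phn' column for the first consented phone."""
    reader = csv.DictReader(infile)
    fieldnames = ["Acct", "phn", "Name", "Consent", "zipcode", "phn_1", "phn_2", "phn_3"]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        pos = row["Consent"].find("C")  # -1 when no phone has consent
        row["phn"] = row[f"phn_{pos + 1}"] if pos >= 0 else ""
        writer.writerow(row)

# The sample data from the question.
sample = (
    "Acct,phn_1,phn_2,phn_3,Name,Consent,zipcode\n"
    "1234,45678,78906,,abc,NCN,10010\n"
    "3456,48678,,78976,def,NNC,10010\n"
)
result = io.StringIO()
move_consented_phone(io.StringIO(sample), result)
print(result.getvalue())
```

With real files, pass the objects returned by open(..., newline="") instead of the StringIO buffers.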
I am learning to code, and I am stuck on an error I can't explain. Basically, I want to read a .csv file with birth statistics from the US to find the most popular name over the recorded period.
My code looks like this:
# 0:Id, 1: Name, 2: Year, 3: Gender, 4: State, 5: Count
names = {} # initialise dict names
maximum = 0 # store for maximum
l = []
with open("Filepath", "r") as file:
    for line in file:
        l = line.strip().split(",")
        try:
            name = l[1]
            if name in names:
                names[name] = int(names[name]) + int(l(5))
            else:
                names[name] = int(l(5))
        except:
            continue
print(names)
max(names)
def max(values):
    for i in values:
        if names[i] > maximum:
            names[i] = maximum
        else:
            continue
    return(maximum)
print(maximum)
It seems like the dictionary does not take any values at all, since the print command does not return anything. Where did I go wrong? (Incidentally, the file path is correct; it takes a while to get the result since the .csv is quite big. So my assumption is that I somehow made a mistake writing into the dictionary, but I have been staring at the code for a while now and I don't see it!)
A few suggestions to improve your code:
names = {} # initialise dict names
with open("Filepath", "r") as file:
    for line in file:
        l = line.strip().split(",")
        try:
            name = l[1]
            names[name] = names.get(name, 0) + int(l[5])
        except (IndexError, ValueError):
            continue # skip the header and malformed lines
maximum = [(v,k) for k,v in names.items()]
maximum.sort(reverse=True)
print(maximum[0])
You will want to look into Python dictionaries and learn about get. It lets you build your names dictionary in fewer lines of code (more Pythonic).
Also, in your version l(5) calls the list instead of indexing it; l[5] is what you want. The resulting TypeError is swallowed by the bare except, so nothing is ever stored in the dictionary.
You also call max(names) before your def max(...) has run, so that call never reaches your function.
I propose the shorter code above. Ask if you have questions!
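For comparison, here is a hedged sketch of the same counting done with the standard library's csv.reader and collections.Counter. The function name most_popular_name and the three sample rows are made up for illustration; the column positions follow the comment in the question (1: Name, 5: Count), and the first row is assumed to be a header.

```python
import csv
import io
from collections import Counter

def most_popular_name(lines):
    """Sum the Count column per Name and return (name, total) for the largest total."""
    totals = Counter()
    reader = csv.reader(lines)
    next(reader, None)  # assumes the first row is a header
    for row in reader:
        try:
            totals[row[1]] += int(row[5])
        except (IndexError, ValueError):
            continue  # skip short or non-numeric rows
    return totals.most_common(1)[0]

sample = io.StringIO(
    "Id,Name,Year,Gender,State,Count\n"
    "1,Mary,1910,F,AK,14\n"
    "2,Anna,1910,F,AK,12\n"
    "3,Mary,1911,F,AK,9\n"
)
print(most_popular_name(sample))  # ('Mary', 23)
```

For a real file, pass the object returned by open(path, newline="") instead of the StringIO buffer. Note that most_common(1)[0] raises IndexError if no row could be counted.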
Figured it out.
I think there were a few flow issues: I called a function before defining it... is that an issue, or is Python okay with that?
Also, I think I used max as a name, but there is a built-in function with the same name, which might cause an issue, I guess?! Same with value.
This is my final code:
names = {} # initialise dict names
l = []
def maxval(val):
    maxname = max(val.items(), key=lambda x : x[1])
    return maxname
with open("filepath", "r") as file:
    for line in file:
        l = line.strip().split(",")
        name = l[1]
        try:
            names[name] = names.get(name, 0) + int(l[5])
        except:
            continue
        #print(str(l))
#print(names)
print(maxval(names))
The header line in my csv file is:
Number,Name,Type,Manufacturer,Material,Process,Thickness (mil),Weight (oz),Dk,Orientation,Pullback distance (mil),Description
I can open it and read the line, with no problems:
infile = open('CS_Data/_AD_LayersTest.csv','r')
csv_reader = csv.reader(infile, delimiter=',')
for row in csv_reader:
But I want to find out what the item number is for "Dk".
The problem is that not only can the items be in any order, as decided by the user in a different application, but there can also be up to 25 items in the line.
How do I quickly determine which item is "Dk", so I can write Dk = (row[i]) for it and extract it for all the data after the header?
I have tried the code below on each of the potential 25 items and it works, but it seems like a waste of time, energy and my ocd.
while True:
    try:
        if (row[0]) == "Dk":
            DkColumn = 0
            break
        elif (row[1]) == "Dk":
            DkColumn = 1
            break
        ...
        elif (row[24]) == "Dk":
            DkColumn = 24
            break
        else:
            f.write('Stackup needs a "Dk" column.')
            break
    except:
        print ("Exception occurred")
        break
Can't you get the index of the column (using list.index()) that has the value Dk in it? Something like:
infile = open('CS_Data/_AD_LayersTest.csv','r')
csv_reader = csv.reader(infile, delimiter=',')
# Store the header
headers = next(csv_reader, None)
# Get the index of the 'Dk' column
dkColumnIndex = headers.index('Dk')
for row in csv_reader:
    # Access values that belong to the 'Dk' column
    rowDkValue = row[dkColumnIndex]
    print(rowDkValue)
In the code above, we store the first line of the CSV as a list in headers. We then search that list for the index of the item whose value is 'Dk'. That is the column index.
Once we have the column index, we can use it on each row to access the value in the column whose header is Dk.
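One detail worth handling: list.index() raises ValueError when the value is missing, which maps naturally onto the "Stackup needs a Dk column" case from the question. A minimal sketch follows, with a made-up helper name (column_values) and made-up sample rows, reading from an in-memory buffer so it can be run as-is.

```python
import csv
import io

def column_values(lines, column_name):
    """Return every value in the column whose header equals column_name."""
    reader = csv.reader(lines)
    headers = next(reader, None)
    if headers is None:
        raise ValueError("file is empty")
    try:
        index = headers.index(column_name)  # raises ValueError if the header is absent
    except ValueError:
        raise ValueError(f'Stackup needs a "{column_name}" column.') from None
    return [row[index] for row in reader]

sample = "Number,Name,Dk,Description\n1,Top,3.5,signal\n2,Core,4.2,dielectric\n"
print(column_values(io.StringIO(sample), "Dk"))  # ['3.5', '4.2']
```

The position of the Dk header does not matter, so the column order chosen in the other application has no effect on the lookup.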
Use the pandas library to keep the column order and access each column by name with row["column_name"]:
import pandas as pd
df = pd.read_csv(
    "",
    usecols=["Number","Name","Type" ....])
for index, row in df.iterrows():
    pass  # do something
If I understand your question correctly, and you're not interested in using pandas (as suggested by Mikey; you should really consider his suggestion, however), you should be able to do something like the following:
with open('CS_Data/_AD_LayersTest.csv','r') as infile:
    csv_reader = csv.reader(infile, delimiter=',')
    header = next(csv_reader)
    col_map = {col_name: idx for idx, col_name in enumerate(header)}
    for row in csv_reader:
        row_dk = row[col_map['Dk']]
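The standard library's csv.DictReader builds this header-to-value mapping for you, so each row can be indexed by column name directly. A small sketch, using made-up sample rows in place of the real file:

```python
import csv
import io

# Made-up sample standing in for CS_Data/_AD_LayersTest.csv.
sample = io.StringIO(
    "Number,Name,Dk,Description\n"
    "1,Top,3.5,signal\n"
    "2,Core,4.2,dielectric\n"
)
reader = csv.DictReader(sample)
# fieldnames holds the header row; check it before reading any data.
if "Dk" not in (reader.fieldnames or []):
    raise SystemExit('Stackup needs a "Dk" column.')
dk_values = [row["Dk"] for row in reader]
print(dk_values)  # ['3.5', '4.2']
```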
One solution would be to use pandas.
import pandas as pd
df=pd.read_csv('CS_Data/_AD_LayersTest.csv')
Now you can access 'Dk' easily as long as the file is read correctly.
dk=df['Dk']
and you can access individual values of dk like
for i in range(0,10):
    temp_var=df.loc[i,'Dk']
or however you want to access those indexes.
I'm trying to look up a time for a user. Let's say they input 13 (minutes): my code scans the csv, finds each row that has 13 in the time column, and prints those rows one at a time. I don't know how to give the user the option of revisiting a previous match. My code currently just reverses the order of the csv and starts from the bottom, even when those rows are not the selected 13-minute rows.
I'm a total newbie, so please explain as simply as possible. Thanks
Please see code:
def time():
    while True:
        find = input("Please enter a time in minutes(rounded)\n"
                     "> ")
        if len(find) < 1:
            continue
        else:
            break
    print("Matches will appear below\n"
          "If no matches were made\n"
          "You will return back to the previous menu.\n"
          "")
    count = -1
    with open("work_log1.csv", 'r') as fp:
        reader = csv.DictReader(fp, delimiter=',')
        for row in reader:
            count+=1
            if find == row["time"]:
                for key, value in row.items(): # This part iterates through csv dict with no problems
                    print(key,':',value)
                changee = input("make change? ")
                if changee == "back": # ISSUE HERE******
                    for row in reversed(list(reader)): # I'm trying to use this to reverse the order of whats been printed
                        for key, value in row.items(): # Unfortunately it doesnt start from the last item, it starts from
                            print(key,':',value) # The bottom of the csv. Once I find out how to iterate through it
                        # Properly, then I can apply my changes
                        make_change = input("make change? or go back")
                        if make_change == 'y':
                            new_change = input("input new data: ")
                            fp = pd.read_csv("work_log1.csv")
                            fp.set_value(count, "time", new_change) # This part makes the changes to the row i'm currently on
                            fp.to_csv("work_log1.csv", index=False)
                            print("")
You can keep a list holding the last n lines so you can go back through it: after reading each new line, call history.pop(0) and then history.append(last_line).
Alternatively, you can implement the same logic with the stream's seek function.
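A variant of the history-list idea that may be easier for a beginner: collect only the matching rows into a list first, then move an index forward or backward through that list. The sketch below is not the asker's program; the function name browse_matches, the name column and the sample rows are made up, and the user's answers are passed in as a list so the example runs without input(). In the real program each command would come from input().

```python
import csv
import io

def browse_matches(lines, minutes, commands):
    """Show rows whose 'time' equals minutes; 'back' steps to the previous match."""
    matches = [row for row in csv.DictReader(lines) if row["time"] == minutes]
    if not matches:
        return []
    position = 0
    shown = [matches[position]]          # the first match is shown immediately
    for command in commands:
        if command == "back":
            position = max(position - 1, 0)                 # stay on the first match
        else:                                               # anything else means "next"
            position = min(position + 1, len(matches) - 1)  # stay on the last match
        shown.append(matches[position])
    return shown

sample = "name,time\nemail,13\nlunch,30\nreview,13\nstandup,13\n"
shown = browse_matches(io.StringIO(sample), "13", ["next", "next", "back"])
print([row["name"] for row in shown])  # ['email', 'review', 'standup', 'review']
```

Because only matching rows are stored, going back can never land on a row with a different time, which was the problem with reversing the whole reader.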
I am trying to process JSON lines from an API, and I am running into an issue where requests.iter_lines() does not return lines in a timely manner. I now have to incorporate requests.iter_content(chunk_size=1024*1024) instead. I am trying to work out the logic needed to take an incomplete JSON line [1] and attach it to the start of the next chunk so that it becomes a complete one.
My current attempt runs a series of if statements to detect an undesirable state [2], then rebuilds the line and continues processing, but I'm failing to reassemble it in all the various states this could end up in. Does someone have an example of a well-thought-out solution to this problem?
[1]
Example:
Last item from first chunk:
{'test1': 'value1', 'test2': 'valu
first item from second chunk:
e2', 'test3': 'value3'}
[2]
def incomplete_processor(main_chunk):
    if not main_chunk[0].startswith('{') and not main_chunk[-1].endswith('\n'):
        first_line = str(main_chunk[0])
        last_line = str(main_chunk[-1])
        main_chunk.pop(0)
        main_chunk.pop(-1)
        return first_line, last_line
    if not main_chunk.startswith('{') and main_chunk[-1].endswith('\n'):
        first_line = str(main_chunk[-1])
        main_chunk.pop(0)
        return first_line
    if main_chunk.startswith('{') and not main_chunk[-1].endswith('\n'):
        last_line = str(main_chunk[-1])
        main_chunk.pop(-1)
        return last_line
I solved this problem by converting the result of my original rsplit('\n') into a deque and then catching any ValueError raised by the incomplete JSON. I stored the first value that errored out, waited for the next value to error out, and then combined them.
while True:
    try:
        jsonline = main_chunk_deque.popleft()
        jsonline = json.loads(jsonline)
    except ValueError as VE:
        if not jsonline.endswith('}'):
            next_line = jsonline
        elif not jsonline.startswith('{'):
            first_line = jsonline
            jsonline = json.loads(next_line + first_line)
        continue
    except IndexError:
        break
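A different way to handle the split lines, offered as a sketch rather than as the approach above: keep a text buffer, append each chunk to it, and parse only the part that ends in a newline, carrying the unfinished tail over to the next chunk. The generator name iter_json_lines is made up, and the chunks are assumed to be already-decoded strings containing valid JSON (double quotes); bytes coming from iter_content would need to be decoded first.

```python
import json

def iter_json_lines(chunks):
    """Yield one parsed object per complete line, carrying partial lines across chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; the tail may be partial.
        *complete, buffer = buffer.split("\n")
        for line in complete:
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        yield json.loads(buffer)  # final line that had no trailing newline

# The split from the question's example, rewritten as valid JSON.
chunks = ['{"test1": "value1", "test2": "valu', 'e2", "test3": "value3"}\n{"a": 1}\n']
print(list(iter_json_lines(chunks)))
```

Since a line is parsed only once its newline has arrived, there is no need to inspect how each chunk starts or ends.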