I'm using a csv.DictWriter as follows:
csv.DictWriter(output_file, keys, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
which gives me the desired output when all keys have non-numeric values:
key1;key2;key3
"value1";"value2";"value3"
Now I have keys without values, and the DictWriter quotes the empty strings as well:
dic.update(key2=None)
{'key1':'value1', 'key2': None, 'key3':'value3'}
key1;key2;key3
"value1";"";"value3"
What I would like to have instead is:
key1;key2;key3
"value1";;"value3"
How can I achieve that? Any ideas?
I wasn't able to find a trivial solution, so I decided to open the existing file and replace those "" values:
with open(filename, 'r') as f:
    text = f.read().replace('""', '')

with open(filename, 'w') as f:
    f.write(text)
That gave me the desired output,
from:
key1;key2;key3
"value1";"";"value3"
to:
key1;key2;key3
"value1";;"value3"
All good... but is there a neater way of doing this?
Use csv.QUOTE_MINIMAL when the dictionary contains a None value and csv.QUOTE_NONNUMERIC otherwise:
import csv

d = {'key1': 'value1', 'key2': None, 'key3': 'value3'}

# Pick the quoting mode once, depending on whether any value is None
quoting = csv.QUOTE_MINIMAL if any(v is None for v in d.values()) else csv.QUOTE_NONNUMERIC

with open('test.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, d.keys(), quoting=quoting)
    w.writeheader()
    w.writerow(d)
Output:
key1,key2,key3
value1,,value3
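If there is more than one row, the same idea can be applied per row. Here is a minimal sketch of that (the two-writer arrangement and the sample rows are assumptions of mine, not part of the original answer):

import csv

rows = [
    {'key1': 'value1', 'key2': None, 'key3': 'value3'},
    {'key1': 'value4', 'key2': 'value5', 'key3': 'value6'},
]
keys = ['key1', 'key2', 'key3']

with open('test.csv', 'w', newline='') as f:
    # Two writers over the same file handle, differing only in quoting mode
    minimal = csv.DictWriter(f, keys, quoting=csv.QUOTE_MINIMAL)
    nonnumeric = csv.DictWriter(f, keys, quoting=csv.QUOTE_NONNUMERIC)
    minimal.writeheader()
    for row in rows:
        # Rows that contain a None fall back to minimal quoting
        writer = minimal if any(v is None for v in row.values()) else nonnumeric
        writer.writerow(row)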
My CSV file looks like this:
ID,Product,Price
1,Milk,20
2,Bottle,200
3,Mobile,258963
4,Milk,24
5,Mobile,10000
My code for extracting rows is as follows:
def search_data():
    fin = open('Products/data.csv')
    word = input()  # "Milk"
    found = {}
    for line in fin:
        if word in line:
            found[word] = line
    return found

search_data()
When I run the above code I get this output:
{'Milk': '1,Milk ,20\n'}
I want that if I search for "Milk", I get all the rows that have "Milk" as the Product.
Note: do this in plain Python only; don't use pandas.
The expected output should be like this:
[{"ID": "1", "Product": "Milk ", "Price": "20"},{"ID": "4", "Product": "Milk ", "Price": "24"}]
Can anyone tell me where I am going wrong?
In your script, every time you assign found[word] = line it overwrites the value that was there before. A better approach is to load all the data and then do the filtering.
If file.csv contains:
ID Product Price
1 Milk 20
2 Bottle 200
3 Mobile 10,000
4 Milk 24
5 Mobile 15,000
Then this script:
# load data:
with open('file.csv', 'r') as f_in:
    lines = [line.split() for line in map(str.strip, f_in) if line]
    data = [dict(zip(lines[0], l)) for l in lines[1:]]

# print only items with 'Product': 'Milk'
print([i for i in data if i['Product'] == 'Milk'])
Prints only items with Product == Milk:
[{'ID': '1', 'Product': 'Milk', 'Price': '20'}, {'ID': '4', 'Product': 'Milk', 'Price': '24'}]
EDIT: If your data are separated by commas (,), you can use the csv module to read them:
If file.csv contains:
ID,Product,Price
1,Milk ,20
2,Bottle,200
3,Mobile,258963
4,Milk ,24
5,Mobile,10000
Then the script:
import csv

# load data:
with open('file.csv', 'r') as f_in:
    csvreader = csv.reader(f_in, delimiter=',', quotechar='"')
    lines = [line for line in csvreader if line]
    data = [dict(zip(lines[0], l)) for l in lines[1:]]

# print only items with 'Product': 'Milk'
print([i for i in data if i['Product'].strip() == 'Milk'])
Prints:
[{'ID': '1', 'Product': 'Milk ', 'Price': '20'}, {'ID': '4', 'Product': 'Milk ', 'Price': '24'}]
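Since the expected output in the question is a list of dicts keyed by the header row, csv.DictReader gets there directly. A minimal sketch (the function signature and default filename are mine, not from the original answer):

import csv

def search_data(word, filename='file.csv'):
    # DictReader uses the header row as keys, so every row is already a dict
    with open(filename, newline='') as f_in:
        return [row for row in csv.DictReader(f_in) if row['Product'].strip() == word]

print(search_data('Milk'))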
I have a CSV file that I read with the csv module using csv.DictReader().
I have an output like this:
{'biweek': '1', 'year': '1906', 'loc': 'BALTIMORE', 'cases': 'NA', 'pop': '526822.1365'}
{'biweek': '2', 'year': '1906', 'loc': 'BALTIMORE', 'cases': 'NA', 'pop': '526995.246'}
{'biweek': '3', 'year': '1906', 'loc': 'BALTIMORE', 'cases': 'NA', 'pop': '527170.1981'}
{'biweek': '4', 'year': '1906', 'loc': 'BALTIMORE', 'cases': 'NA', 'pop': '527347.0136'}
And I need to get the 'loc' as the key for a new dict and the count of that 'loc' as its value, since each 'loc' is repeated many times in the file.
with open('Dalziel2015_data.csv') as fh:
    new_dct = {}
    cities = set()
    cnt = 0
    reader = csv.DictReader(fh)
    for row in reader:
        data = dict(row)
        cities.add(data.get('loc'))
        for (k, v) in data.items():
            if data['loc'] in cities:
                cnt += 1
                new_dct[data['loc']] = cnt + 1

print(new_dct)
example_file:
biweek,year,loc,cases,pop
1,1906,BALTIMORE,NA,526822.1365
2,1906,BALTIMORE,NA,526995.246
3,1906,BALTIMORE,NA,527170.1981
4,1906,BALTIMORE,NA,527347.0136
5,1906,BALTIMORE,NA,527525.7134
6,1906,BALTIMORE,NA,527706.3183
4,1906,BOSTON,NA,630880.6579
5,1906,BOSTON,NA,631295.9457
6,1906,BOSTON,NA,631710.8403
7,1906,BOSTON,NA,632125.3403
8,1906,BOSTON,NA,632539.4442
9,1906,BOSTON,NA,632953.1503
10,1907,BRIDGEPORT,NA,91790.75578
11,1907,BRIDGEPORT,NA,91926.14732
12,1907,BRIDGEPORT,NA,92061.90153
13,1907,BRIDGEPORT,NA,92198.01976
14,1907,BRIDGEPORT,NA,92334.50335
15,1907,BRIDGEPORT,NA,92471.35364
17,1908,BUFFALO,NA,413661.413
18,1908,BUFFALO,NA,413934.7646
19,1908,BUFFALO,NA,414208.4097
20,1908,BUFFALO,NA,414482.3523
21,1908,BUFFALO,NA,414756.5963
22,1908,BUFFALO,NA,415031.1456
23,1908,BUFFALO,NA,415306.0041
24,1908,BUFFALO,NA,415581.1758
25,1908,BUFFALO,NA,415856.6646
6,1935,CLEVELAND,615,890247.9867
7,1935,CLEVELAND,954,890107.9192
8,1935,CLEVELAND,965,889967.7823
9,1935,CLEVELAND,872,889827.5956
10,1935,CLEVELAND,814,889687.3781
11,1935,CLEVELAND,717,889547.1492
12,1935,CLEVELAND,770,889406.9283
13,1935,CLEVELAND,558,889266.7346
I have done this and I got the keys all right, but I didn't get the count right.
My results:
{'BALTIMORE': 29, 'BOSTON': 59, 'BRIDGEPORT': 89, 'BUFFALO': 134, 'CLEVELAND': 174}
I know pandas is a very good tool, but I need the code to use the csv module.
If any of you could help me get the count right, I'd appreciate it.
Thank you!
Paulo
You can use collections.Counter to count the occurrences of the cities in the CSV file. Counter.keys() will also give you all the cities found in the CSV:
import csv
from collections import Counter

with open('csvtest.csv') as fh:
    reader = csv.DictReader(fh)
    c = Counter(row['loc'] for row in reader)

print(dict(c))
print('Cities={}'.format([*c.keys()]))
Prints:
{'BALTIMORE': 6, 'BOSTON': 6, 'BRIDGEPORT': 6, 'BUFFALO': 9, 'CLEVELAND': 8}
Cities=['BALTIMORE', 'BOSTON', 'BRIDGEPORT', 'BUFFALO', 'CLEVELAND']
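If you also want the cities ordered by how often they appear, Counter.most_common() gives that directly (a small usage note, not part of the original answer):

print(c.most_common())
# e.g. [('BUFFALO', 9), ('CLEVELAND', 8), ('BALTIMORE', 6), ('BOSTON', 6), ('BRIDGEPORT', 6)]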
You are updating a global counter rather than a counter for the specific location. You are also iterating over every column of every row and updating the counter for no reason.
Try this:
with open('Dalziel2015_data.csv') as fh:
    new_dct = {}
    cities = set()
    reader = csv.DictReader(fh)
    for row in reader:
        data = dict(row)
        new_dct[data['loc']] = new_dct.get(data['loc'], 0) + 1

print(new_dct)
The line new_dct[data['loc']] = new_dct.get(data['loc'], 0) + 1 takes the current count for that city and increments it by one; if the counter does not exist yet, get() returns the default 0.
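The same pattern can also be written with collections.defaultdict, which supplies the 0 automatically. A minimal sketch of that variant (not part of the original answer):

import csv
from collections import defaultdict

with open('Dalziel2015_data.csv') as fh:
    new_dct = defaultdict(int)  # missing keys start at 0
    for row in csv.DictReader(fh):
        new_dct[row['loc']] += 1

print(dict(new_dct))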
SEE UPDATE BELOW!
For my Python program I need to write 3 different lists to a CSV file, each in a different column. Each list has a different size.
import csv

l1 = ['1', '2', '3', '4', '5']
l2 = ['11', '22', '33', '44']
l3 = ['111', '222', '333']
headerNames = ['Name1', 'Name2', 'Name3']

f = 'test.csv'
resultFile = open(f, 'w', newline='')
resultWriter = csv.writer(resultFile, delimiter=';')
resultWriter.writerow(headerNames)

for r in l3:
    resultFile.write(';' + ';' + r + '\n')
for r in l2:
    resultFile.write(';' + r + '\n')
for r in l1:
    resultFile.write(r + '\n')
resultFile.close()
Unfortunately this doesn't work. The values of each list are written below the previous list, shifted one column to the right. I would prefer to have the list values written beside one another, like this:
1;11;111
2;22;222
etc.
I am sure there is an easy way to get this done, but after hours of trying I still cannot figure it out.
UPDATE:
I tried the following. It is progress, but I am still not there yet.
import csv

f = input('filename: ')

l1 = ['1', '2', '3', '4', '5']
l2 = ['11', '22', '33', '44']
l3 = ['111', '222', '333']
headerNames = ['Name1', 'Name2', 'Name3']

rows = zip(l1, l2, l3)

with open(f, 'w', newline='') as resultFile:
    resultWriter = csv.writer(resultFile, delimiter=';')
    resultWriter.writerow(headerNames)
    for row in rows:
        resultWriter.writerow(row)
It writes the data in the format I would like; however, the values 4, 5 and 44 are not written.
Your first attempt doesn't use the csv module properly, nor does it transpose the rows the way your second attempt does.
Now, zip() stops as soon as the shortest list is exhausted. You want itertools.zip_longest instead (with a fill value of 0, for instance):
import itertools, csv

f = "out.csv"

l1 = ['1', '2', '3', '4', '5']
l2 = ['11', '22', '33', '44']
l3 = ['111', '222', '333']
headerNames = ['Name1', 'Name2', 'Name3']

rows = itertools.zip_longest(l1, l2, l3, fillvalue=0)

with open(f, 'w', newline='') as resultFile:
    resultWriter = csv.writer(resultFile, delimiter=';')
    resultWriter.writerow(headerNames)
    resultWriter.writerows(rows)  # write all rows in a row :)
output file contains:
Name1;Name2;Name3
1;11;111
2;22;222
3;33;333
4;44;0
5;0;0
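If empty cells are preferred over 0 in the short columns (an assumption about the desired output, not something stated in the question), the only change needed is the fill value:

rows = itertools.zip_longest(l1, l2, l3, fillvalue='')

The last two lines of the output file would then be:
4;44;
5;;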
I am given a .txt file which looks like this:
2:rain
3:odd
5:yes
6:go
I need to convert it into a dictionary.
This is what I have done so far.
words_dict = {}
file = open(filename, "r")
for word in file:
    k, v = word.split(":")
    words_dict[k.strip()] = v.strip()
file.close()
return words_dict
However, when I print the dictionary, it does not match my expected output of {2: 'rain', 3: 'odd', 5: 'yes', 6: 'go'}.
# split on whitespace, then split each "key:value" pair on ':'
l = "2:rain 3:odd 5:yes 6:go".split()
{x.split(":")[0]: x.split(":")[1] for x in l}
list_ = [x for x in open('text.txt').read().split()]
dict_ = {k: v for k, v in [x.split(':') for x in list_]}
# list_ = ['2:rain', '3:odd', '5:yes', '6:go']
# dict_ = {'2': 'rain', '3': 'odd', '5': 'yes', '6': 'go'}
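Note that both of these give string keys ('2', '3', ...). To match the expected output in the question exactly (integer keys), the key can be converted with int(). A minimal sketch of that, reading straight from the file (the filename is assumed):

with open('text.txt') as fh:
    words_dict = {int(k): v for k, v in (line.strip().split(':', 1) for line in fh if line.strip())}

# words_dict == {2: 'rain', 3: 'odd', 5: 'yes', 6: 'go'}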