Csv file writing a new row for each letter - python-3.x

import csv
email = 'someone#somemail.com'
password = 'password123'
with open('test.csv', 'a', newline='') as accts:
    b = csv.writer(accts, delimiter=',')
    b.writerow(email)
    b.writerow(password)
I'm trying to append to a csv file with the email and password on the same row, but every time I run the program it creates a new row for each letter and the password is written under the email. What am I doing wrong?
Output:
s,o,m,e,o,n,e,#,s,o,m,e,m,a,i,l,.,c,o,m
p,a,s,s,w,o,r,d,1,2,3
Desired output:
someone#somemail.com,password123

A string is itself a sequence of individual characters, and writerow expects a sequence of the column values, so you end up with one column per character.
Instead, pass a list of the column values:
b.writerow([email, password])
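For completeness, putting that fix back into the question's script would look like this (same file and variables as above):
import csv

email = 'someone#somemail.com'
password = 'password123'

with open('test.csv', 'a', newline='') as accts:
    writer = csv.writer(accts, delimiter=',')
    # One row with two columns: the whole strings, not their characters
    writer.writerow([email, password])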

Related

list not split into proper csv columns using python

I wrote the following code to split my data matrix into a csv file:
f = open('midi_data.csv', 'w', newline="")
writer = csv.writer(f, delimiter=',', quotechar=',', quoting=csv.QUOTE_MINIMAL)
for item in data:
    writer.writerow(item)
    print(item)
f.close()
But the csv file ends up looking like this:
(the tuples end up in a single column, joined by commas, instead of being split across columns)
What am I doing wrong?
The data inside the tuples seems to be written correctly, because the print(item) calls show the expected values when the code runs.
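For reference, this is how csv.writer normally splits a list of tuples into columns; a minimal sketch with made-up MIDI-style rows and the default quotechar of '"':
import csv

data = [(60, 0.5, 'C4'), (62, 0.5, 'D4')]  # hypothetical rows, stand-ins for the real matrix

with open('midi_data.csv', 'w', newline='') as f:
    # The default quotechar ('"') keeps a comma inside a field from
    # clashing with the delimiter; writerows handles the whole matrix
    writer = csv.writer(f, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerows(data)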

Extract numbers only from specific lines within a txt file with certain keywords in Python

I want to extract the numbers from lines in a txt file that contain a certain keyword, add them up per line, compare the totals, and then print the highest and the lowest total. How should I go about this?
I want to print the highest and the lowest valid totals.
I managed to extract the lines with the "valid" keyword in them, but now I want to get the numbers from these lines, add up the numbers on each line, compare those totals across the lines with the same keyword, and print the highest and the lowest valid totals.
My code so far:
# get a file object reference to the file
file = open("shelfs.txt", "r")
# read the content of the file into a string
data = file.read()
# close the file
file.close()
# count occurrences of each substring in the string
totalshelfs = data.count("SHELF")
totalvalid = data.count("VALID")
totalinvalid = data.count("INVALID")
print('Number of total shelfs :', totalshelfs)
print('Number of valid books :', totalvalid)
print('Number of invalid books :', totalinvalid)
The txt file:
HEADER|
SHELF|2200019605568|
BOOK|20200120000000|4810.1|20210402|VALID|
SHELF|1591024987400|
BOOK|20200215000000|29310.0|20210401|VALID|
SHELF|1300001188124|
BOOK|20200229000000|11519.0|20210401|VALID|
SHELF|1300001188124|
BOOK|20200329001234|115.0|20210331|INVALID|
SHELF|1300001188124|
BOOK|2020032904567|1144.0|20210401|INVALID|
FOOTER|
What you need is to use the pandas library.
https://pandas.pydata.org/
You can read a csv file like this:
import pandas as pd

data = pd.read_csv('shelfs.txt', sep='|')
It returns a DataFrame object that makes it easy to select or sort your data. By default it uses the first row as the header, and you can then select a specific column like a dictionary:
header = data['HEADER']
header is a Series object.
To filter rows you can do:
shelfs = data.loc[data['HEADER'] == 'SHELF', :]
which selects only the rows where the first column (labelled 'HEADER' here) is 'SHELF'.
I'm just not sure how pandas will handle the fact that you only have one header field but two to six columns per row. Maybe you should try to create one header per column in your csv, and add separators so that every row has the same number of fields first.
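Alternatively, you can supply the column names yourself instead of relying on the first row; a minimal sketch, assuming pandas is installed and using made-up column names since the file has no real header:
import pandas as pd

# Hypothetical names, one per field of the widest (BOOK) rows
cols = ["type", "f1", "f2", "f3", "f4", "f5"]
data = pd.read_csv("shelfs.txt", sep="|", header=None, names=cols)

# Keep only the BOOK rows flagged VALID, then sum the numeric fields per row
books = data[(data["type"] == "BOOK") & (data["f4"] == "VALID")]
totals = books[["f1", "f2", "f3"]].astype(float).sum(axis=1)
print("highest:", totals.max(), "lowest:", totals.min())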
Edit (No External libraries or change in the txt file):
# Split into rows ('data' holds the file contents read in the code above)
data = data.split('\n')
# Split each row into columns
data = [d.split('|') for d in data]
# Pad short rows so every row has the same number of cells
n_cols = max(len(d) for d in data)
for i in range(len(data)):
    while len(data[i]) < n_cols:
        data[i].append('')
# List the VALID rows (the flag is the 5th field of a BOOK row)
valid_rows = [d for d in data if d[4] == 'VALID']
# Sum the numeric fields of each valid row, then take min and max
valid_sum = [float(d[1]) + float(d[2]) + float(d[3]) for d in valid_rows]
valid_minimum = min(valid_sum)
valid_maximum = max(valid_sum)
It's maybe not exactly what you want to do, but it solves part of your problem. I didn't test the code.

Using DictReader to read a csv file that contains a variable number of fields that have the same fieldname

Using DictReader and given a file that contains data like so:
First ,Last,fruit,fruit,fruit,fruit,fruit,fruit
Carl,Yung,apple,watermelon,,,,
Louis,Pasteur,banana,grape,mango,,,
Marie,Curie,watermelon,apple,banana,,,
How do I assign any non-empty "fruit" fields to a list, so that when the following code executes, row['fruit'] contains that list?
with open(csv_file) as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['First'], row['Last'], row['fruit'], sep='--->')
If fieldnames is omitted, the values in the first row will be used as the fieldnames. But you may specify it explicitly. If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname specified by restkey (which defaults to None).
import csv

with open("myfile.csv") as f:
    reader = csv.DictReader(f, fieldnames=("First", "Last"), restkey="fruit")
    for row in reader:
        print(row)
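Building on that, a small sketch (assuming the sample file above) that also skips the header row, since fieldnames is supplied explicitly, and drops the empty trailing fields from the fruit list:
import csv

with open("myfile.csv") as f:
    next(f)  # skip the header row; it would otherwise be read as data
    reader = csv.DictReader(f, fieldnames=("First", "Last"), restkey="fruit")
    for row in reader:
        fruits = [v for v in row.get("fruit", []) if v]  # keep non-empty fields only
        print(row["First"], row["Last"], fruits, sep='--->')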

How do I take the punctuation off each line of a column of an xlsx file in Python?

I have an excel file (.xlsx) with a column having rows of strings. I used the following code to get the file:
import pandas as pd
df = pd.read_excel("file.xlsx")
db = df['Column Title']
I am removing the punctuation for the first line (row) of the column using this code:
import string
translator = str.maketrans('', '', string.punctuation)
sent_pun = db[0].translate(translator)
I would like to remove the punctuation for each line (until the last row). How would I correctly write this with a loop? Thank you.
Well, given that this code works for one value and produces the right kind of result, you can write it in a loop:
translator = str.maketrans('', '', string.punctuation)
sent_pun = []
for cell in db:
    sent_pun.append(cell.translate(translator))
sent_pun will then hold one punctuation-free string per row of the column.
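If you prefer to clean the whole column in one go, pandas also has a vectorised route; a minimal sketch, reusing the file and column name from the question:
import string
import pandas as pd

df = pd.read_excel("file.xlsx")
translator = str.maketrans('', '', string.punctuation)
# astype(str) guards against non-string cells; str.translate applies the
# same translation table to every row of the column
df['Column Title'] = df['Column Title'].astype(str).str.translate(translator)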

Sort excel worksheet using python

I have an excel sheet like this:
I would like to output the data into an excel file like this:
Basically, for the common elements in columns 2, 3 and 4, I want to concatenate the values in the 5th column.
Please suggest how I could do this.
The easiest way to approach an issue like this is exporting the spreadsheet to CSV first, in order to help ensure that it imports correctly into Python.
Using a defaultdict, you can create a dictionary that has unique keys and then iterate through lines adding the final column's values to a list.
Finally you can write it back out to a CSV format:
from collections import defaultdict

results = defaultdict(list)

with open("in_file.csv") as f:
    header = f.readline()
    for line in f:
        cols = line.strip().split(",")
        key = ",".join(cols[0:4])
        results[key].append(cols[4])

with open("out_file.csv", "w") as f:
    f.write(header)
    for k, v in results.items():
        line = '{},"{}"\n'.format(k, ", ".join(v))
        f.write(line)
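If pandas is an option, the same concatenation can be done with a groupby directly on the spreadsheet; a minimal sketch with hypothetical file and column names, since the real headers aren't shown:
import pandas as pd

df = pd.read_excel("in_file.xlsx")
# Group on the first four (hypothetical) columns and join the values of the 5th
out = (
    df.groupby(["col1", "col2", "col3", "col4"], as_index=False)["col5"]
      .agg(lambda vals: ", ".join(vals.astype(str)))
)
out.to_excel("out_file.xlsx", index=False)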
