I have a CSV file containing a list of names and means.
For example:
ali,5.0
hamid,6.066666666666666
mandana,7.5
soheila,7.833333333333333
sara,9.75
sina,11.285714285714286
sarvin,11.375
I want to rewrite the CSV so that it contains only the three lowest means. I have written the code below, but I have a problem writing the CSV back out. I need to keep the mean values exactly as they appear in the input.
import csv
import itertools
from collections import OrderedDict

with open('grades4.csv', 'r') as input_file:
    reader = csv.reader(input_file)
    val1 = []
    key = list()
    threelowval = []
    for row in reader:
        k = row[0]
        val = [num for num in row[1:]]  # separate the numbers in each row
        key.append(k)                   # build the list of names (keys)
        val1.append(val)                # build the list of values

value = list(itertools.chain.from_iterable(val1))  # flatten the list of lists
value = [float(i) for i in value]                  # convert the strings to floats
#print(key)
#print(value)
dictionary = dict(zip(key, value))
#print(dictionary)
findic = OrderedDict(sorted(dictionary.items(), key=lambda t: t[1]))  # sort by value
#print(findic)

# pull the three lowest means out of the sorted dict
lv = []
for item in findic.values():
    lv.append(item)
#print(lv)
for item in lv[0:3]:
    threelowval.append(item)
print(threelowval)
I have tried the code below, but I get an error.
with open('grades4.csv', 'w', newline='') as output_file_name:
    writer = csv.writer(output_file_name)
    writer.writerows(threelowval)
expected result:
5.0
6.066666666666666
7.5
You should try this:
with open('grades4.csv', 'w', newline='') as output_file_name:
    writer = csv.writer(output_file_name)
    for i in threelowval:
        writer.writerow([i])
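This works because writer.writerow() expects one sequence of fields per call. The original writer.writerows(threelowval) fails because writerows() treats each item of the list as a whole row, and a bare float is not iterable; wrapping each value in a single-element list turns it into a one-column row.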
I have tried the code below and received the correct results.
with open('grades4.csv', 'w', newline='') as output_file_name:
    writer = csv.writer(output_file_name)
    writer.writerows(map(lambda x: [x], threelowval))
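For completeness, since the requirement is to keep the mean values exactly as they appear in the input, here is a minimal sketch that never converts the strings at all except for sorting (the output filename lowest3.csv is made up for illustration):

import csv

# Read the name,mean rows, keeping each mean as its original string.
with open('grades4.csv', 'r', newline='') as f:
    rows = list(csv.reader(f))

# Sort numerically, but leave the stored strings untouched.
rows.sort(key=lambda r: float(r[1]))

# Write the three lowest means back out, one per row, exactly as read.
with open('lowest3.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for name, mean in rows[:3]:
        writer.writerow([mean])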
I have a dataframe like below:
import pandas as pd

data = {'Words': ['actually', 'he', 'came', 'from', 'home', 'and', 'played'],
        'Col2': ['2', '0', '0', '0', '1', '0', '3']}
data = pd.DataFrame(data)
The dataframe looks like this:

      Words Col2
0  actually    2
1        he    0
2      came    0
3      from    0
4      home    1
5       and    0
6    played    3
I write this dataframe to disk using the command below (np is NumPy, imported as import numpy as np):

np.savetxt('/folder/file.txt', data.values, fmt='%s', delimiter='\t')
And the next script reads it with the line of code below:
data = load_file('/folder/file.txt')
Below is the load_file function that reads a text file.
def load_file(filename):
    with open(filename, 'r', encoding='utf-8') as f:
        data = f.readlines()
    return data
The data will be a tab-separated list.
print(data)
gives me the following output:
['actually\t2\n', 'he\t0\n', 'came\t0\n', 'from\t0\n', 'home\t1\n', 'and\t0\n', 'played\t3\n']
I don't want to write the file to disk and then read it back for processing. Instead, I want to convert the dataframe to a tab-separated list and process it directly. How can I achieve this?
I checked for existing answers, but most of them just convert a list to a dataframe, not the other way around.
Thanks in advance.
Try using .to_csv()
df_list = data.to_csv(header=None, index=False, sep='\t').rstrip('\n').split('\n')
df_list:
['actually\t2',
'he\t0',
'came\t0',
'from\t0',
'home\t1',
'and\t0',
'played\t3'
]
And to keep the trailing newline on each element:

df_list = data.to_csv(header=None, index=False, sep='\t').splitlines(keepends=True)
df_list:
['actually\t2\n',
'he\t0\n',
'came\t0\n',
'from\t0\n',
'home\t1\n',
'and\t0\n',
'played\t3\n'
]
I think this achieves the same result without writing to the drive:
df_list = list(data.apply(lambda row: row['Words'] + '\t' + row['Col2'] + '\n', axis=1))
Try:
data.apply("\t".join, axis=1).tolist()
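Note that "\t".join only works here because both columns already hold strings. If a column were numeric, casting to str first would be needed, e.g.:

# Hedged variant; the sample frame above already stores Col2 as strings.
df_list = data.astype(str).apply("\t".join, axis=1).tolist()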
I'm using pandas to open a CSV file that contains data from Spotify. Meanwhile, I have a txt file that contains various artist names from that CSV file. What I'm trying to do is take the value from each row of the txt file and automatically search for it with the function I've written.
import pandas as pd
import time

df = pd.read_csv("data.csv")
df = df[['artists', 'name', 'year']]

def buscarA():
    start = time.time()
    newdf = df.loc[df['artists'].str.contains(art)]
    stop = time.time()
    tempo = stop - start
    print(newdf)
    e = '{:.2f}'.format(tempo)
    print(e)

with open("teste3.txt", "r") as f:
    for row in f:
        art = row
        buscarA()
but the output is always the same:
Empty DataFrame
Columns: [artists, name, year]
Index: []
The problem here is that when you read the lines of your file in Python, each line keeps its trailing line break, so you have to strip it off.
Let's suppose the first line of your teste3.txt file is "James Brown". It'd be read as "James Brown\n" and so it would not be found in the search.
Changing the last chunk of your code to:
with open("teste3.txt", "r") as f:
for row in f:
art = row.strip()
buscarA()
should work.
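One caveat worth adding: Series.str.contains treats its argument as a regular expression by default, so an artist name containing characters such as ( or + can raise an error or match the wrong rows. Passing regex=False turns it into a plain substring test:

# Substring match instead of regex match (assumes the same df/art as above).
newdf = df.loc[df['artists'].str.contains(art, regex=False)]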
I am trying to read a CSV and then transpose one column into a row.
I tried following a tutorial for reading a CSV and then one for writing, but the data doesn't stay saved in the list when I try to write the row.
import csv

f = open('bond-dist-rep.csv')
csv_f = csv.reader(f)
bondlength = []

with open("bond-dist-rep.csv") as f:
    for row in csv_f:
        bondlength.append(row[1])
    print(bondlength)
    print(len(bondlength))

with open('joined.csv', 'w', newline='') as csvfile:
    csv_a = csv.writer(csvfile, delimiter=',', quotechar='"',
                       quoting=csv.QUOTE_ALL)
    csv_a.writerow(['bondlength'])

with open('joined.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        print(row)
        print(row[0])

f.close()
The problem is that you read only a single value from each line, and then you write only a header string to the new file.
In order to transpose the read lines, you can use the zip function.
I also removed the first open call, which is redundant given the with statement that opens the file.
Here is the final code:
import csv

bondlength = []

with open("bond-dist-rep.csv") as csv_f:
    read_csv = csv.reader(csv_f)
    for row in read_csv:
        bondlength.append(row)

# delete the header if you have one
bondlength.pop(0)

with open('joined.csv', 'w', newline='') as csvfile:
    csv_a = csv.writer(csvfile, delimiter=',')
    for transpose_row in zip(*bondlength):
        csv_a.writerow(transpose_row)
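To see why zip(*bondlength) transposes the data, here is a tiny standalone example (illustrative values only):

rows = [['1.09', 'a'], ['1.10', 'b'], ['1.12', 'c']]
print(list(zip(*rows)))  # [('1.09', '1.10', '1.12'), ('a', 'b', 'c')]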
from bs4 import BeautifulSoup  # missing import in the original

def gameinfo():
    lines = []
    html_doc = 'STATIC.html'
    soup = BeautifulSoup(open(html_doc), 'html.parser')
    for mytable in soup.find_all('table'):
        for trs in mytable.find_all('tr'):
            tds = trs.find_all('td')
            row1 = [elem.text.strip() for elem in tds]
            row = str(row1)
            sausage = False
            with open("FIRE.txt", "r+") as file:
                for line in file:
                    if row+"\n" in line:
                        break
                    else:
                        if row.split(",")[:4] == line.split(",")[:4]:
                            print(row)
                            print(line)
                            file.write(line.replace(line+"\n", row+"\n"))
                            print('Already exists with diff date')
                            sausage = True
                            break
                if sausage == False:
                    print(row.split(",")[:4])
                    print(line.split(",")[:4])
                    print(row)
                    print(line)
                    file.write(row+"\n")
                    print('appended')

while True:
    gameinfo()
    gameinfo()
This program is supposed to keep searching the text file FIRE.txt for lines that match the variable row. When I run it, it mostly works, but the part of the code that is supposed to check whether the first four elements of the list are the same, and then skip the appending section below, doesn't work. When the program detects that the first 4 elements of row (a string built from a list) match the first 4 elements of a line in the text file, it should overwrite that line in the file. However, when it detects a matching line, it loops forever and never breaks out.
My string looks like this:
['Infield Upper Deck Reserved 529', '$17.29', '4', '2', '175']
and I compare it to a list that looks like this:
['Infield Upper Deck Reserved 529', '$17.29', '4', '2', '170']
When it sees that the first 4 elements of the two lists are the same, it should overwrite the line that was in the text file to begin with, but instead it just loops.
Methinks you want to use the csv module. If you iterate through a csv.reader object instead of the file object directly, you'll get each line as a list.
Example:
import csv

row = ["this", "is", "an", "example"]

with open("FIRE.txt", "r+") as file:
    reader = csv.reader(file)
    for line in reader:
        if line == row:  # each line comes back as a list of fields
            break
Alternatively, if you don't need to use this in anything other than Python, you could pickle a collections.OrderedDict with a tuple of the first four items as the keys:
import collections
import pickle
import contextlib

@contextlib.contextmanager
def mutable_pickle(path, default=object):
    try:
        with open(path, "rb") as f:
            obj = pickle.load(f)
    except (IOError, EOFError):
        obj = default()
    try:
        yield obj
    finally:
        with open(path, "wb") as f:
            pickle.dump(obj, f)

with mutable_pickle("fire.bin",
                    default=collections.OrderedDict) as d:
    for row in rows:  # rows = the parsed rows from the HTML table
        d[tuple(row[:4])] = row
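If pickle is not an option, the same keyed-by-first-four-fields idea works directly on the text file: read everything into an OrderedDict, overwrite or append the record, and rewrite the file. A minimal sketch, assuming one comma-separated record per line in FIRE.txt (the helper name upsert_row is made up for illustration):

import collections

def upsert_row(path, row):
    # row is a comma-separated record, e.g. the str(row1) built above.
    records = collections.OrderedDict()
    try:
        with open(path, "r") as f:
            for line in f:
                line = line.rstrip("\n")
                records[tuple(line.split(",")[:4])] = line
    except FileNotFoundError:
        pass  # first run: no file yet
    # Same first four fields -> replaces that record; otherwise appends.
    records[tuple(row.split(",")[:4])] = row
    with open(path, "w") as f:
        for line in records.values():
            f.write(line + "\n")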
I am writing code that takes rows from a CSV file and turns them into lists of integers. However, if I leave some entries in a row blank, I get a "list index out of range" error. Here is the code:
import csv

with open('Test.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    rows = [[int(row[0]), int(row[1]), int(row[2]), int(row[3])] for row in reader]

for row in rows:
    print(row)
I looked up some similar questions on this website, and the best idea for a solution that I found was:
rows = [[int(row[0]), int(row[1]),int(row[2]),int(row[3])] for row in reader if len(row)>1]
However, it resulted in the same error.
Thanks in advance!
The problem is that if a field doesn't contain an int, or is empty, the cast will fail.
The example below inserts a zero (0) when the value is not an int or is empty; replace it with whatever you want.
You can optimize the code but this should work:
Edit: Shorter version
import csv

def RepresentsInt(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

l = []
with open('test.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        l.append([int(r) if RepresentsInt(r) else 0 for r in row])

for row in l:
    print(row)
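If the IndexError comes from short or completely blank lines (which csv.reader yields as lists with fewer than four items), padding each row to a fixed width before the cast also works. A small sketch, assuming four columns and that the non-blank fields really are integers:

import csv

with open('test.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    # Pad every row with empty strings, then keep exactly four fields.
    rows = [[int(v) if v.strip() else 0 for v in (row + [''] * 4)[:4]]
            for row in reader]

for row in rows:
    print(row)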