I'm using pandas to open a CSV file that contains data from Spotify. I also have a txt file that contains various artist names from that CSV file. What I'm trying to do is take the value from each row of the txt file and automatically search for it with the function I've written.
import pandas as pd
import time

df = pd.read_csv("data.csv")
df = df[['artists', 'name', 'year']]

def buscarA():
    start = time.time()
    newdf = (df.loc[df['artists'].str.contains(art)])
    stop = time.time()
    tempo = (stop - start)
    print (newdf)
    e = ('{:.2f}'.format(tempo))
    print (e)

with open("teste3.txt", "r") as f:
    for row in f:
        art = row
        buscarA()
but the output is always the same:
Empty DataFrame
Columns: [artists, name, year]
Index: []
The problem here is that when you iterate over the lines of a file in Python, each line keeps its trailing line break, so you have to strip it off.
Let's suppose that the first line of your teste3.txt file is "James Brown". It'd be read as "James Brown\n" and not recognized in the search.
Changing the last chunk of your code to:
with open("teste3.txt", "r") as f:
    for row in f:
        art = row.strip()
        buscarA()
should work.
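To see why the unstripped value never matches, here is a small sketch that needs neither the CSV nor the txt file. It uses io.StringIO as a stand-in for teste3.txt; the artist names and the artists_cell value are made up for illustration.

```python
import io

# Stand-in for teste3.txt; the names are made-up sample data.
fake_file = io.StringIO("James Brown\nNina Simone\n")

raw_lines = list(fake_file)                       # iterating a file keeps the trailing "\n"
clean_lines = [line.strip() for line in raw_lines]

# Hypothetical value of one cell in the 'artists' column.
artists_cell = "['James Brown']"

print(repr(raw_lines[0]))                # shows the newline at the end of the line
print(raw_lines[0] in artists_cell)      # False: the newline is part of the search text
print(clean_lines[0] in artists_cell)    # True once the newline is stripped
```

It may also be cleaner to pass the artist as an argument, e.g. buscarA(art), instead of relying on a global variable. Since str.contains treats its pattern as a regular expression by default, passing regex=False may be safer for names that contain characters such as ( or $.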
import csv

with open('C:/Users/dkarar/Desktop/Mapping project/RC_Mapping.csv', 'r') as file1:
    with open('C:/Users/dkarar/Desktop/Mapping project/Thinclient_mapping.csv', 'r') as file2:
        with open('C:/Users/dkarar/Desktop/Mapping project/output.csv', 'w') as outfile:
            writer = csv.writer(outfile)
            reader1 = csv.reader(file1)
            reader2 = csv.reader(file2)
            for row in reader1:
                if not row:
                    continue
                for other_row in reader2:
                    if not other_row:
                        continue
                    # if we found a match, let's write it to the csv file with the id appended
                    if row[1].lower() == other_row[1].lower():
                        new_row = other_row
                        new_row.append(row[0])
                        writer.writerow(new_row)
                        continue
                # reset file pointer to beginning of file
                file2.seek(0)
You seem to be getting at least one row with only a single element. That's why accessing row[1] raises an IndexError: there is only one element in the list row.
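One way to guard against such rows is to check len(row) before indexing. The sketch below also indexes the first file in a dict so the second file is read only once. It uses io.StringIO with made-up sample data in place of the two CSV files, match_rows is a hypothetical helper name, and it assumes the names in the first file are unique (a repeated name would keep only the last id).

```python
import csv
import io

# Made-up sample data standing in for the two CSV files.
rc_text = "101,Alpha\n\nbroken\n102,Beta\n"
thin_text = "x1,alpha,site A\nx2,BETA,site B\nonly_one_field\n"

def match_rows(rc_file, thin_file):
    """Return rows of thin_file with the id of the matching rc_file row appended."""
    # Index the first file by its lower-cased second column, skipping short rows.
    ids_by_name = {}
    for row in csv.reader(rc_file):
        if len(row) < 2:          # empty or malformed row: row[1] would raise IndexError
            continue
        ids_by_name[row[1].lower()] = row[0]

    matched = []
    for other_row in csv.reader(thin_file):
        if len(other_row) < 2:
            continue
        key = other_row[1].lower()
        if key in ids_by_name:
            matched.append(other_row + [ids_by_name[key]])
    return matched

result = match_rows(io.StringIO(rc_text), io.StringIO(thin_text))
print(result)
```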
I tried to convert the contents of a text file into .csv format by reading each line with Python's csv module and converting it to a list. But I couldn't get the expected output: the first line is stored in the first row, but the second line is stored in the 3rd row, the next in the 5th, and so on. Since I am new to Python, I don't know how to avoid the skipped rows and store the lines in the right order.
import csv

def FileConversion():
    try:
        with open('TextToCSV.txt', 'r') as textFile:
            LineStripped = (eachLine.strip() for eachLine in textFile)
            lines = (eachLine.split(" ") for eachLine in LineStripped if eachLine)
            with open('finalReport.csv', 'w') as CSVFile:
                writer = csv.writer(CSVFile)
                writer.writerow(('firstName', 'secondName', 'designation', "age"))
                writer.writerows(lines)
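Why not try something simpler with pandas:
import pandas as pd

# header=None keeps the first line as data instead of treating it as column names
aux = pd.read_csv("TextToCSV.txt", sep=" ", header=None,
                  names=['firstName', 'secondName', 'designation', "age"])
# index=False avoids writing the row index as an extra column
aux.to_csv("result.csv", index=False)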
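If you would rather stay with the csv module, the skipped rows are most likely caused by opening the output file without newline='': on Windows the writer's \r\n line endings then get an extra \r, which shows up as blank rows. A minimal sketch, assuming the fields are separated by single spaces; the sample names are made up and convert_lines is a hypothetical helper:

```python
import csv
import os
import tempfile

HEADER = ("firstName", "secondName", "designation", "age")

def convert_lines(lines, csv_path):
    """Write space-separated text lines to csv_path, skipping blank lines."""
    stripped = (line.strip() for line in lines)
    rows = (line.split(" ") for line in stripped if line)
    # newline='' lets the csv module control the line endings itself.
    with open(csv_path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(HEADER)
        writer.writerows(rows)

# Made-up sample input standing in for TextToCSV.txt.
sample = ["John Smith Engineer 30\n", "\n", "Jane Doe Manager 41\n"]
out_path = os.path.join(tempfile.mkdtemp(), "finalReport.csv")
convert_lines(sample, out_path)

with open(out_path, newline="") as f:
    written = list(csv.reader(f))
print(written)
```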
from bs4 import BeautifulSoup

def gameinfo():
    lines = []
    html_doc = 'STATIC.html'
    soup = BeautifulSoup(open(html_doc), 'html.parser')
    for mytable in soup.find_all('table'):
        for trs in mytable.find_all('tr'):
            tds = trs.find_all('td')
            row1 = [elem.text.strip() for elem in tds]
            row = str(row1)
            sausage = False
            with open("FIRE.txt", "r+") as file:
                for line in file:
                    if row+"\n" in line:
                        break
                    else:
                        if row.split(",")[:4] == line.split(",")[:4]:
                            print(row)
                            print(line)
                            file.write(line.replace(line+"\n", row+"\n"))
                            print('Already exists with diff date')
                            sausage = True
                            break
                if sausage == False:
                    print(row.split(",")[:4])
                    print(line.split(",")[:4])
                    print(row)
                    print(line)
                    file.write(row+"\n")
                    print('appended')

while True:
    gameinfo()
gameinfo()
This program is supposed to keep searching the text file FIRE.txt for lines that match the variable row. When I run it, it mostly works, but the part of the code that is supposed to check whether the first four elements of the list are the same, and then skip the appending section below, doesn't work. When the program detects that the first four elements of row (a list turned into a string) match the first four elements of a string in the text file, it should overwrite that string in the text file. However, when it detects a list with the same first four elements, it loops forever and never breaks out.
My string looks like this:
['Infield Upper Deck Reserved 529', '$17.29', '4', '2', '175']
and I compare it to a list that looks like this:
['Infield Upper Deck Reserved 529', '$17.29', '4', '2', '170']
When it sees that the first four elements in the list are the same, it should overwrite the one that was in the text file to begin with, but instead it loops.
Question has changed; most recent version last.
Methinks you want to use the csv module. If you iterate through a csv.reader object instead of the file object directly, you'll get each line as a list.
Example:
import csv

row = ["this", "is", "an", "example"]
with open("FIRE.txt", "r+") as file:
    reader = csv.reader(file)
    for line in reader:
        # line is a list of fields, so it can be compared to row directly
        if row == line:
            break
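Building on the csv.reader idea, one way to avoid writing to the file while still iterating over it is to read every row first, update the list in memory, and then rewrite the file. A rough sketch, assuming FIRE.txt holds plain comma-separated rows rather than the str() of a list; upsert_row is a hypothetical helper name:

```python
import csv
import os
import tempfile

def upsert_row(path, row, key_len=4):
    """Replace the stored row whose first key_len fields match row, else append row."""
    rows = []
    if os.path.exists(path):
        with open(path, newline="") as f:
            rows = [r for r in csv.reader(f) if r]

    for i, existing in enumerate(rows):
        if existing[:key_len] == row[:key_len]:
            rows[i] = row          # same key: overwrite in place
            break
    else:
        rows.append(row)           # loop finished without a match: append

    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows

# Temporary file so the example does not touch a real FIRE.txt.
path = os.path.join(tempfile.mkdtemp(), "FIRE.txt")
upsert_row(path, ["Infield Upper Deck Reserved 529", "$17.29", "4", "2", "170"])
rows = upsert_row(path, ["Infield Upper Deck Reserved 529", "$17.29", "4", "2", "175"])
print(rows)
```

Because the file is only written after the read loop has finished, the loop cannot keep running over lines it has just added.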
Alternatively, if you don't need to use this in anything other than Python, you could pickle a collections.OrderedDict with a tuple of the first four items as the keys:
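import collections
import pickle
import contextlib

@contextlib.contextmanager
def mutable_pickle(path, default=object):
    # Load the pickled object, or start from default() if the file is missing or empty
    try:
        with open(path, "rb") as f:
            obj = pickle.load(f)
    except (IOError, EOFError):
        obj = default()
    try:
        yield obj
    finally:
        # Save the (possibly modified) object back when the with block exits
        with open(path, "wb") as f:
            pickle.dump(obj, f)

# rows is assumed to be your list of scraped rows (each a list of strings)
with mutable_pickle("fire.bin",
                    default=collections.OrderedDict) as d:
    for row in rows:
        d[tuple(row[:4])] = row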
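The reason the dict approach handles the overwrite for you is that assigning to an existing key replaces its value. A tiny illustration using the two rows from the question plus one made-up row:

```python
import collections

rows = [
    ["Infield Upper Deck Reserved 529", "$17.29", "4", "2", "170"],
    ["Infield Upper Deck Reserved 529", "$17.29", "4", "2", "175"],
    ["Outfield Lower Deck 101", "$9.50", "2", "1", "60"],   # made-up extra row
]

d = collections.OrderedDict()
for row in rows:
    d[tuple(row[:4])] = row   # a repeated key replaces the stored row

print(len(d))                      # number of distinct keys
print(list(d.values())[0][-1])     # last field of the first stored row
```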
I create a new column (named Account) in the CSV, then try to build a sequence (c = float(a) + float(b)) and append each number in the sequence to the corresponding original line in the CSV as the value of the new column. Here is my code:
# -*- coding: utf-8 -*-
import csv

with open('./tradedate/2007date.csv') as inf:
    reader = csv.reader(inf)
    all = []
    row = next(reader)
    row.append('Amount')
    all.append(row)
    a =50
    for i, line in enumerate(inf):
        if i != 0:
            size = sum(1 for _ in inf) # count the line number
            for b in range(1, size+1):
                c = float(a) + float(b) # create the sequence: in 1st line add 1, 2nd line add 2, 3rd line add 3...etc
                line.append(c) # this is the error message: AttributeError: 'str' object has no attribute 'append'
                all.append(line)

with open('main_test.csv', 'w', newline = '') as new_csv:
    csv_writer = csv.writer(new_csv)
    csv_writer.writerows(all)
The csv is like this:
日期,成交股數,成交金額,成交筆數,發行量加權股價指數,漲跌點數,Account
96/01/02,"5,738,692,838","141,743,085,172","1,093,711","7,920.80",97.08,51
96/01/03,"5,974,259,385","160,945,755,016","1,160,347","7,917.30",-3.50,52
96/01/04,"5,747,756,529","158,857,947,106","1,131,747","7,934.51",17.21,53
96/01/05,"5,202,769,867","143,781,214,318","1,046,480","7,835.57",-98.94,54
96/01/08,"4,314,344,739","115,425,522,734","888,324","7,736.71",-98.86,55
96/01/09,"4,533,381,664","120,582,511,893","905,970","7,790.01",53.30,56
The Error message is:
Traceback (most recent call last):
  File "main.py", line 21, in <module>
    line.append(c)
AttributeError: 'str' object has no attribute 'append'
Many thanks for any help!
I'm a little confused about why you're structuring your code this way, but the simplest fix would be to replace the append (you can't append to a string) with += and a string version of c, i.e.
line += str(c)
or
line += ',{}'.format(c)
(It's not clear to me from how you've written this whether you need the comma or not. Note also that line still ends with a newline when it comes straight from the file object, so you may want line.rstrip('\n') first.)
The biggest problem is that you're not using your csv reader; below is a better implementation. With the csv reader each row is a list, so the append you want is cleaner than working with the file object directly.
import csv

with open('./tradedate/2007date.csv') as old_csv:
    with open('main_test.csv', 'w') as new_csv:
        writer = csv.writer(new_csv, lineterminator='\n')
        reader = csv.reader(old_csv)
        all = []
        row = next(reader)
        row.append('Line Number')
        all.append(row)
        line_number = 51
        for row in reader:
            row.append(line_number)
            all.append(row)
            line_number += 1
        writer.writerows(all)
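If the file is large, the same idea can be written without collecting every row in a list first. This is only a sketch: it uses io.StringIO in place of the real files, a cut-down version of the sample data with a shortened ASCII header, and enumerate(..., start=51) to produce the counter.

```python
import csv
import io

# Cut-down stand-in for 2007date.csv (two columns only, ASCII header for brevity).
source = io.StringIO(
    'date,volume\n'
    '96/01/02,"5,738,692,838"\n'
    '96/01/03,"5,974,259,385"\n'
)
target = io.StringIO()   # stand-in for main_test.csv

reader = csv.reader(source)
writer = csv.writer(target, lineterminator="\n")

# Copy the header and add the new column name.
header = next(reader)
writer.writerow(header + ["Account"])

# enumerate(..., start=51) yields 51, 52, ... alongside each data row.
for number, row in enumerate(reader, start=51):
    writer.writerow(row + [number])

print(target.getvalue())
```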