I want to convert the contents of the file "text.txt", which holds 101 rows × 1 column, into 1 row × 1 column, with the values separated by commas (',').
I tried this code:
tweets_data = []
with open("text.txt", "r", encoding="utf8") as f:
    for tweet in f:
        ayu = tweet.rstrip('\n').split(',')
        print(ayu)
I expected the output [{'text'},{'text'},....,{'text'}]
but the actual output is
[{'text\n'},
{'text\n'},
...
{'text\n'},]
Can anyone help me?
From what I can see, the solution would be something like this:
with open("fileName.txt", "r", encoding="utf-8") as f:
    listOfTweets = []
    for tweet in f:
        # strip the trailing line break before storing the line
        listOfTweets.append(tweet.rstrip('\n'))
print(listOfTweets)
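If the goal is a single comma-separated row rather than a list, the stripped lines can also be joined. This is only a minimal sketch, assuming each line of the file holds one tweet; the helper name lines_to_single_row and the in-memory sample standing in for text.txt are made up for illustration:

```python
import io


def lines_to_single_row(f, sep=","):
    """Strip the trailing newline from each line and join the lines into one string."""
    return sep.join(line.rstrip("\n") for line in f)


# io.StringIO stands in for open("text.txt", "r", encoding="utf8")
sample = io.StringIO("first tweet\nsecond tweet\nthird tweet\n")
single_row = lines_to_single_row(sample)
print(single_row)  # first tweet,second tweet,third tweet
```

With a real file, the same helper can be called inside the with open(...) block.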
I'm using pandas to open a CSV file that contains data from Spotify. I also have a txt file that contains various artist names from that CSV file. What I'm trying to do is read each row of the txt file and automatically search for it with the function I've written.
import pandas as pd
import time

df = pd.read_csv("data.csv")
df = df[['artists', 'name', 'year']]

def buscarA():
    start = time.time()
    newdf = (df.loc[df['artists'].str.contains(art)])
    stop = time.time()
    tempo = (stop - start)
    print (newdf)
    e = ('{:.2f}'.format(tempo))
    print (e)

with open("teste3.txt", "r") as f:
    for row in f:
        art = row
        buscarA()
but the output is always the same:
Empty DataFrame
Columns: [artists, name, year]
Index: []
The problem is that when you iterate over the lines of a file in Python, each line keeps its trailing line break, so you have to strip it off.
Suppose the first line of your teste3.txt file is "James Brown". It is read as "James Brown\n", which does not match anything in the search.
Changing the last chunk of your code to:
with open("teste3.txt", "r") as f:
    for row in f:
        art = row.strip()
        buscarA()
should work.
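The effect of the trailing line break can be seen without pandas. Here is a small sketch that uses plain substring matching as a stand-in for str.contains; the artist string is invented sample data, not taken from the real CSV:

```python
import io

artists_cell = "['James Brown', 'Fred Wesley']"  # invented sample cell value

# io.StringIO stands in for open("teste3.txt", "r")
for row in io.StringIO("James Brown\n"):
    raw_match = row in artists_cell               # the line still ends with "\n"
    stripped_match = row.strip() in artists_cell  # line break removed
    print(repr(row), raw_match)
    print(repr(row.strip()), stripped_match)
```

Note also that str.contains interprets its argument as a regular expression by default; passing regex=False requests a literal match, which may be safer for artist names containing characters such as "(" or "+".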
I want to create a DataFrame from existing lists (each row of the file should become a row of the DataFrame).
with open(filename, mode='r', encoding='cp1252') as f:
    lines=f.readlines()
liste1 = str(lines[0])
df1 = pd.DataFrame(liste1)
Who can help me, please?
Below are the first 3 rows of file f1.
['x1', 'major', '1198', 'TCP']
['x1', 'minor', '1198', 'UDP']
['x2', 'major', '1198', 'UDP']
If I understand this properly, you want each row in the DataFrame to be a string you read from a line in the file?
Note that liste1 in your case is a string, so I am not sure what you are going for.
This approach should work anyway.
import pandas as pd

df1 = pd.DataFrame()
with open(filename, mode='r', encoding='cp1252') as f:
    lines=f.readlines()
liste1 = str(lines[0])
# Note: DataFrame.append was removed in pandas 2.0;
# on newer versions use pd.concat instead.
df1 = df1.append(pd.Series(liste1), ignore_index=True)
So if liste1 has form
> "This is a string"
then your DataFrame will look like this
df1.head()
                  0
0  This is a string
if liste1 has form
> ["This", "is", "a", "list"]
then your DataFrame will look like this
df1.head()
      0   1  2     3
0  This  is  a  list
You can then call this append() routine as many times as you want inside a loop.
However, I suspect that there is a function, such as pd.read_table(), that can do all of this for you automatically (as @jezrael suggested in the comments to your question).
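If the lines really are list literals like the three sample rows in the question, they could be parsed first and turned into a DataFrame in one step. This is a sketch under the assumption that the file uses plain straight quotes; parse_list_lines is a hypothetical helper, and the in-memory sample replaces the real file:

```python
import ast
import io


def parse_list_lines(f):
    """Parse each non-empty line as a Python list literal."""
    return [ast.literal_eval(line.strip()) for line in f if line.strip()]


# io.StringIO stands in for open(filename, mode='r', encoding='cp1252')
sample = io.StringIO(
    "['x1', 'major', '1198', 'TCP']\n"
    "['x1', 'minor', '1198', 'UDP']\n"
    "['x2', 'major', '1198', 'UDP']\n"
)
rows = parse_list_lines(sample)
print(rows[0])  # ['x1', 'major', '1198', 'TCP']
```

The resulting list of lists can be passed directly to pd.DataFrame(rows), which gives one DataFrame row per line of the file and avoids appending inside a loop.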
I'm merging two text files, file1.tbl and file2.tbl, that share a common column. I used pandas to build a DataFrame from each and the merge function to produce the output.
The problem is that the output file does not contain the whole data: there is a row of "..." in the middle, and at the end it just prints [9997 rows x 5 columns].
I need a file containing all 9997 rows.
import pandas

with open("file1.tbl") as file:
    d1 = file.read()
with open("file2.tbl") as file:
    d2 = file.read()

df1 = pandas.read_table('file1.tbl', delim_whitespace=True, names=('ID', 'chromosome', 'strand'))
df2 = pandas.read_table('file2.tbl', delim_whitespace=True, names=('ID', 'NUClen', 'GCpct'))
merged_table = pandas.merge(df1, df2)

with open('merged_table.tbl', 'w') as f:
    print(merged_table, file=f)
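print() writes only the abbreviated display form of a DataFrame, which is where the "..." row comes from. One way to get every row into the file is DataFrame.to_csv. The following is a sketch, with small in-memory tables as made-up stand-ins for file1.tbl and file2.tbl and a tab as an assumed output separator:

```python
import io

import pandas as pd

# Made-up stand-ins for file1.tbl and file2.tbl (whitespace-separated).
file1 = io.StringIO("g1 chr1 +\ng2 chr2 -\n")
file2 = io.StringIO("g1 1200 41.5\ng2 900 38.0\n")

df1 = pd.read_csv(file1, sep=r"\s+", names=["ID", "chromosome", "strand"])
df2 = pd.read_csv(file2, sep=r"\s+", names=["ID", "NUClen", "GCpct"])

# merge() joins on the column the two frames share, here "ID"
merged_table = pd.merge(df1, df2)

# to_csv writes every row, never the abbreviated display form
out = io.StringIO()
merged_table.to_csv(out, sep="\t", index=False)
text = out.getvalue()
print(text)
```

With a real file, merged_table.to_csv('merged_table.tbl', sep='\t', index=False) writes the same content to disk.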
I am writing code which takes rows from a CSV file and turns them into lists of integers. However, if I leave some blank entries in a row, I get a "list index out of range" error. Here is the code:
import csv

with open('Test.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    rows = [[int(row[0]), int(row[1]),int(row[2]),int(row[3])] for row in reader]
    for row in rows:
        print(row)
I looked up some similar questions on this website and the best idea for the solution I got was:
rows = [[int(row[0]), int(row[1]),int(row[2]),int(row[3])] for row in reader if len(row)>1]
However, it resulted in the same error.
Thanks in advance!
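The problem is that the cast fails if a value is empty or is not an int.
The example below inserts a zero (0) whenever a value is empty or not an int; replace it with whatever you want. Because it loops over the values actually present in each row instead of indexing row[0] to row[3], short rows no longer raise "list index out of range".
You can optimize the code, but this should work: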
Edit: Shorter version
import csv

def RepresentsInt(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

l = []
with open('test.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        l.append([int(r) if RepresentsInt(r) else 0 for r in row])

for row in l:
    print(row)
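A variant of the same idea can also pad rows that have too few fields, so that every output row has the same length. This is only a sketch; to_int, parse_rows, the width of 4 and the default of 0 are assumptions for illustration, and the in-memory sample stands in for the CSV file:

```python
import csv
import io


def to_int(value, default=0):
    """Convert value to int, falling back to default for empty or non-numeric text."""
    try:
        return int(value)
    except ValueError:
        return default


def parse_rows(f, width=4, default=0):
    """Read CSV rows, padding or truncating each one to exactly `width` fields."""
    rows = []
    for row in csv.reader(f, delimiter=","):
        padded = (row + [""] * width)[:width]
        rows.append([to_int(v, default) for v in padded])
    return rows


# io.StringIO stands in for open('test.csv', 'r')
sample = io.StringIO("1,2,3,4\n5,,7\n\n8,x,9,10\n")
rows = parse_rows(sample)
for row in rows:
    print(row)
```

Whether a missing value should become 0, None, or cause the row to be skipped depends on what the numbers mean, so the default here is only a placeholder.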
I have a CSV file and I want to split it into text files based on the first column, which holds the ids. Each file should then contain the remaining columns. For example:
file.csv
id  val1  val2  val3
1   50    52    60
2   45    84    96
and so on.
Here is my code:
dir_name = '/Users/user/My Documents/test/'
with io.open('file1.csv', 'rt',encoding='utf8') as f:
    reader = csv.reader(f, delimiter=',')
    next(reader)
    xx = []
    for row in reader:
        with open(os.path.join(dir_name, row[0] + ".txt"),'a') as f2:
            xx = row[1:2]
            f2.write(xx +"\n")
So the result should be:
1.txt
50 52 60
2.txt
45 84 96
But it only creates files without content.
Can anyone help me? Thanks in advance.
There were a couple of issues:
1. It's actually a whitespace-separated values file, not a comma-separated values file, so you have to change the delimiter from ',' to ' '. The whitespace is also repeated, so you can additionally pass skipinitialspace=True to csv.reader.
2. The slice row[1:2] keeps only one column, and the resulting list has to be converted to a string before it can be written.
This program meets your requirements:
#!/usr/bin/python
import io
import csv
import os

dir_name = './'
with io.open('input.csv', 'rt',encoding='utf8') as f:
    reader = csv.reader(f, skipinitialspace=True, delimiter=' ')
    next(reader)
    xx = []
    for row in reader:
        filename = os.path.join(dir_name, row[0])
        with open(filename + ".txt", 'a') as f2:
            xx = row[1:]
            f2.write(" ".join(xx) +"\n")
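Since the values are separated by runs of spaces, str.split() with no arguments may be another option, as it treats any run of whitespace as a single separator. Below is a sketch that writes into a temporary directory; the sample data mirrors the question and split_by_id is a hypothetical name:

```python
import io
import os
import tempfile


def split_by_id(f, dir_name):
    """Write the columns after the id of each line to <dir_name>/<id>.txt."""
    next(f)  # skip the header line
    for line in f:
        fields = line.split()  # splits on any run of whitespace
        if not fields:
            continue  # ignore blank lines
        path = os.path.join(dir_name, fields[0] + ".txt")
        with open(path, "a") as out:
            out.write(" ".join(fields[1:]) + "\n")


# io.StringIO stands in for the real input file
sample = io.StringIO("id val1 val2 val3\n1   50 52  60\n2 45   84 96\n")
out_dir = tempfile.mkdtemp()
split_by_id(sample, out_dir)

with open(os.path.join(out_dir, "1.txt")) as fh:
    first = fh.read()
print(first)
```

Opening the output files in append mode ('a') keeps the behaviour of the program above: several rows with the same id end up in the same file.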