Print A Pandas Data Frame to a Text File (Python 3)

I have a large data file like this
Words
One
Two
Three
....
Threethousand
I am trying to print this list to a text file with this code:
df1 = df[['Words']]
with open('atextfile.txt', 'w', encoding='utf-8') as outfile:
    print(df1, file=outfile)
But what happens is that it doesn't print out the whole DF, it ends up looking like this:
Words
One
Two
Three
....
Threethousand
Fourthousand
Fivethousand
How can I print out the whole DF?

I would use to_string for this; unlike print, it does not abbreviate long output:
df['Words'].to_string('atextfile.txt')
# or
df[['Words']].to_string('atextfile.txt')
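To make the difference visible, here is a minimal sketch. It assumes a hypothetical 3000-row frame built on the fly and a temporary output path (neither comes from the question), and it compares what print would write with what to_string writes.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the asker's data: 3000 one-word rows.
df = pd.DataFrame({"Words": [f"word{i}" for i in range(3000)]})

# str(df) is the text that print(df) writes; with the default display
# options a frame this long is truncated and the middle rows are elided.
abbreviated = str(df)

# to_string renders every row, regardless of the display options.
out_path = os.path.join(tempfile.mkdtemp(), "atextfile.txt")
df[["Words"]].to_string(out_path)

with open(out_path, encoding="utf-8") as infile:
    written_lines = infile.read().splitlines()

print(len(abbreviated.splitlines()), "lines from print")
print(len(written_lines), "lines from to_string")
```

The file holds one header line plus one line per row, so nothing is lost even for very long frames.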

Related

list not split into proper csv columns using python

I wrote the following code to split my data matrix into a csv file:
f = open('midi_data.csv', 'w', newline="")
writer = csv.writer(f, delimiter=',', quotechar=',', quoting=csv.QUOTE_MINIMAL)
for item in data:
    writer.writerow(item)
    print(item)
f.close()
But the csv file ends up looking like this:
[Image: tuples not separated by columns but by commas in one column only]
What am I doing wrong?
The data seems to be written correctly inside the tuples, because when running the code it outputs the following:
[Image: console output of the printed tuples]

Instead of printing to console create a dataframe for output

I am currently comparing the text of one file to that of another file.
The method: for each row in the source text file, check each row in the compare text file.
If the word is present in the compare file then write the word and write 'present' next to it.
If the word is not present then write the word and write not_present next to it.
So far I can do this fine by printing to the console, as shown below:
import sys
filein = 'source.txt'
compare = 'compare.txt'
source = 'source.txt'
# change to lower case
with open(filein,'r+') as fopen:
    string = ""
    for line in fopen.readlines():
        string = string + line.lower()
with open(filein,'w') as fopen:
    fopen.write(string)
# search and list
with open(compare) as f:
    searcher = f.read()
    if not searcher:
        sys.exit("Could not read data :-(")
#search and output the results
with open(source) as f:
    for item in (line.strip() for line in f):
        if item in searcher:
            print(item, ',present')
        else:
            print(item, ',not_present')
The output looks like this:
dog ,present
cat ,present
mouse ,present
horse ,not_present
elephant ,present
pig ,present
What I would like is to put this into a pandas dataframe, preferably with 2 columns: one for the word and the second for its state. I can't seem to get my head around doing this.
I am making several assumptions here:
compare.txt is a text file consisting of a list of single words, one word per line.
source.txt is a free-flowing text file with multiple words per line, each word separated by a space.
A compare word counts as found in source if and only if no punctuation mark (", ', comma, period, ?, etc.) is attached to the word in source.
The output dataframe will only contain the words found in compare.txt.
The final output is a printed version of the pandas dataframe.
With these assumptions:
import pandas as pd
from collections import defaultdict
compare = 'compare.txt'
source = 'source.txt'
rslt = defaultdict(list)
def getCompareTxt(fid: str) -> list:
    clist = []
    with open(fid, 'r') as cmpFile:
        for line in cmpFile.readlines():
            clist.append(line.lower().strip('\n'))
    return clist
cmpList = getCompareTxt(compare)
if cmpList:
    with open(source, 'r') as fsrc:
        items = []
        for item in (line.strip().split(' ') for line in fsrc):
            items.extend(item)
    print(items)
    for cmpItm in cmpList:
        rslt['Name'].append(cmpItm)
        if cmpItm in items:
            rslt['State'].append('Present')
        else:
            rslt['State'].append('Not Present')
    df = pd.DataFrame(rslt, index=range(len(cmpList)))
    print(df)
else:
    print('No compare data present')
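The question asked for one row per source word, while the code above reports one row per compare word. A minimal sketch of the other direction follows; the function name presence_frame, the column names, and the sample words are hypothetical and not from either post. It also uses set membership rather than the substring test `item in searcher`, so that "cat" does not match inside a longer word such as "category".

```python
import pandas as pd

def presence_frame(source_words, compare_words):
    """Return one row per source word with a 'present'/'not_present' state."""
    # A set gives whole-word matching and fast lookups.
    known = {word.lower() for word in compare_words}
    rows = [
        (word, "present" if word.lower() in known else "not_present")
        for word in source_words
    ]
    return pd.DataFrame(rows, columns=["word", "state"])

# Hypothetical sample data standing in for source.txt and compare.txt.
df = presence_frame(["dog", "cat", "horse"], ["cat", "dog", "pig"])
print(df)
```

In practice the two word lists would be read from the files, for example with `[line.strip() for line in f]`.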

How to write a list of floats to csv in columns?

I have been searching everywhere for a way to write a list of floats to a CSV file, but it must be in column format.
My code for writing the CSV is as follows:
csvfile=open('Test.csv','w', newline='')
obj=csv.writer(csvfile)
obj.writerow(list_dis_B1_avg)
csvfile.close()
It turns out that the floats are all written in one row.
I have a list of floats stored under "list_dis_B1_avg".
How can I write it as a column instead?
You don't need the csv module to do that:
with open("Test.csv", "w") as f: # use with to close the file in any case
    # str.join only accepts strings, so convert each float first
    f.write("\n".join(str(x) for x in list_dis_B1_avg)) # newline between the elements
More about the with keyword: https://www.geeksforgeeks.org/with-statement-in-python/
More about str.join(): https://www.programiz.com/python-programming/methods/string/join
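If you would rather stay with the csv module, wrapping each float in its own one-element row also gives a single column. This is a minimal sketch: the helper name write_column and the sample values are hypothetical, and an in-memory buffer stands in for the real file.

```python
import csv
import io

def write_column(values, fileobj):
    """Write each value on its own row, i.e. as a single CSV column."""
    writer = csv.writer(fileobj)
    writer.writerows([value] for value in values)

list_dis_B1_avg = [1.5, 2.25, 3.0]  # hypothetical sample values

buffer = io.StringIO()  # stands in for open('Test.csv', 'w', newline='')
write_column(list_dis_B1_avg, buffer)
column_text = buffer.getvalue()
print(column_text)
```

With a real file, pass `open('Test.csv', 'w', newline='')` instead of the buffer, as in the question's code.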

How to loop through a list of dictionaries and write the values as individual columns in a CSV

I have a list of dictionaries
d = [{'value':'foo_1', 'word_list':['blah1', 'blah2']}, ...., {'value': 'foo_n', 'word_list':['meh1', 'meh2']}]
I want to write this to a CSV file with all the 'value' entries in one column, and then each individual word from that dictionary's word_list in its own column. So I have the first row as
foo_1 blah_1 blah_2
and so on.
I don't know how many dictionaries I have, or how many words I have in "word_list".
How would I go about doing this in Python 3?
Thanks!
I figured out a solution, but it's kind of messy (wow, I can't write a bit of code without it being in the "proper format"...how annoying):
# note: this works only if d is a dict mapping each value to its word list,
# not the list of dictionaries shown above
with open('filename', 'w') as f:
    for key in d.keys():
        f.write("%s,"%(key))
        for word in d[key]:
            f.write("%s,"%(word))
        f.write("\n")
You can loop through the dictionaries one at a time, build a list for each row, and then use the csv module to write the data, as shown here:
import csv
d = [{'value':'foo_1', 'word_list':['blah1', 'blah2']}, {'value': 'foo_n', 'word_list':['meh1', 'meh2']}]
# newline='' stops the csv module from writing blank lines between rows on Windows
with open('test_file.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    for val_dict in d:
        csv_row = [val_dict['value']] + val_dict['word_list']
        writer.writerow(csv_row)
It should work for word lists of arbitrary length and as many dictionaries as you want.
It would probably be easiest to flatten each row into a normal list before writing it to the file. Something like this:
import csv
with open(filename, 'w', newline='') as file:
    writer = csv.writer(file)
    for row in data:
        out_row = [row['value']]
        for word in row['word_list']:
            out_row.append(word)
        writer.writerow(out_row)
    # Shorter alternative to the two loops:
    # writer.writerows((row['value'], *row['word_list']) for row in data)
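Because the word lists can differ in length, the rows written this way can have different numbers of columns. If a rectangular file is wanted, the rows can be padded first. The sketch below assumes empty-string padding is acceptable; the helper name flatten_rows and the sample records are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io

def flatten_rows(records, pad=""):
    """Flatten each {'value': ..., 'word_list': [...]} dict into one row,
    padded so that every row has the same number of columns."""
    rows = [[rec["value"], *rec["word_list"]] for rec in records]
    width = max((len(row) for row in rows), default=0)
    return [row + [pad] * (width - len(row)) for row in rows]

# Hypothetical sample data with word lists of different lengths.
records = [
    {"value": "foo_1", "word_list": ["blah1", "blah2"]},
    {"value": "foo_2", "word_list": ["meh1"]},
]

buffer = io.StringIO()  # stands in for open('test_file.csv', 'w', newline='')
csv.writer(buffer).writerows(flatten_rows(records))
csv_text = buffer.getvalue()
print(csv_text)
```

Skipping the padding step is fine when ragged rows are acceptable to whatever reads the file.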

How to print multiple lines from a file python

I'm trying to print several lines from a text file in Python. My current code is:
f = open("sample.txt", "r").readlines()[2 ,3]
print(f)
However, I'm getting this error message:
TypeError: list indices must be integers, not tuple
Is there any way of fixing this, or of printing multiple lines from a file without printing them out individually?
You are trying to pass a tuple to the [...] subscription operation; 2 ,3 is a tuple of two elements:
>>> 2 ,3
(2, 3)
You have a few options here:
Use slicing to take a sublist from all the lines. [2:4] slices from the 3rd line and includes the 4th line:
f = open("sample.txt", "r").readlines()[2:4]
Store the lines and print specific indices, one by one:
f = open("sample.txt", "r").readlines()
print(f[2].rstrip())
print(f[3].rstrip())
I used str.rstrip() to remove the newline that's still part of the line before printing.
Use itertools.islice() and use the file object as an iterable; this is the most memory-efficient method, as each line is held in memory only while it is being printed:
from itertools import islice
with open("sample.txt", "r") as f:
    for line in islice(f, 2, 4):
        print(line.rstrip())
I also used the file object as a context manager to ensure it is closed again properly once the with block is done.
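The islice option can be wrapped in a small reusable helper. This is a minimal sketch: the function name read_line_range and the sample file contents are hypothetical, and the file is created in a temporary directory only so that the example runs on its own.

```python
import os
import tempfile
from itertools import islice

def read_line_range(path, start, stop):
    """Return lines start..stop-1 (0-based) without their trailing newline."""
    with open(path, "r") as f:
        # islice skips the first `start` lines and stops reading at `stop`.
        return [line.rstrip("\n") for line in islice(f, start, stop)]

# Hypothetical sample file so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w") as f:
    f.write("first\nsecond\nthird\nfourth\nfifth\n")

for line in read_line_range(sample_path, 2, 4):
    print(line)
```

As with the [2:4] slice, the indices are 0-based and the stop index is exclusive, so this selects the 3rd and 4th lines.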
Assign the whole list of lines to a variable, and then print lines 2 and 3 separately.
with open("sample.txt", "r") as fin:
    lines = fin.readlines()
print(lines[2])
print(lines[3])
