How to write .xlsx data to a .txt file ensuring each column has its own text file, then each row is a new line? - python-3.x

I believe I am close to cracking this; however, I can't add multiple lines of text to the .txt files. The columns do relate to their own .txt files.
import openpyxl
from pathlib import Path
# create workbook
wb = openpyxl.Workbook()
sheet = wb.active
listOfTextFiles = []
# Create a workbook 5x5 with dummy text
for row in range(1, 6):
    for col in range(1, 6):
        file = sheet.cell(row=row, column=col).value = f'text row:{row}, col:{col}'
        listOfTextFiles.append(file)
print(listOfTextFiles) # for testing
wb.save('testSS.xlsx')
for i in range(row): # create 5 text files
    textFile = open(f'ssToTextFile{i}.txt', 'w')
    textFile.write(listOfTextFiles[i])
The output for each text file is below. I know it has something to do with the 'textFile.write(listOfTextFiles[i])' and I've tried many ways such as replacing [i] with [j] or [file]. I think I am overwriting the text through each loop.
Current output:
ssToTextFile.txt -> text row:1, col:1
What I want the output to be in each .txt file:
ssToTextFile.txt -> text row:1, col:1
text row:2, col:1
text row:3, col:1
text row:4, col:1
text row:5, col:1
Then, the next .txt file to be:
text row:1, col:2
text row:2, col:2 etc
Would appreciate any feedback and the logic behind it please?

Solved. Using sheet.columns on the outer loop I could use [x-1] as the index.
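# x counts columns, so use the column bounds (these equal the row bounds only on a square sheet)
for x in range(sheet.min_column, sheet.max_column + 1):
    # 'with' closes each file, so the text is flushed to disk
    with open(f'ssToTextFile{x-1}.txt', 'w') as textFile:
        for y in list(sheet.columns)[x-1]:
            textFile.write(str(y.value) + '\n')
            print(y.value)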
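Since the question also asks for the logic: writing one file per column is a transpose of the grid. Below is a minimal sketch of that idea using only the standard library. It assumes the cell values are already held in a plain list of rows; with openpyxl, something like sheet.iter_rows(values_only=True) could supply them. The helper name write_columns_to_files is hypothetical and not part of the original answer.

```python
from pathlib import Path
import tempfile

def write_columns_to_files(rows, out_dir, prefix="ssToTextFile"):
    """Write each column of `rows` to its own text file, one cell per line."""
    paths = []
    # zip(*rows) transposes the grid: it yields one tuple per column.
    for index, column in enumerate(zip(*rows)):
        path = Path(out_dir) / f"{prefix}{index}.txt"
        path.write_text("".join(f"{cell}\n" for cell in column))
        paths.append(path)
    return paths

# Same 5x5 dummy grid as in the question, held in a plain list of rows.
grid = [[f"text row:{r}, col:{c}" for c in range(1, 6)] for r in range(1, 6)]

with tempfile.TemporaryDirectory() as tmp:
    files = write_columns_to_files(grid, tmp)
    first_file_lines = files[0].read_text().splitlines()
    file_count = len(files)
```

The original attempt wrote only listOfTextFiles[i], a single cell, to each file. Here every file receives a whole column, which is what the expected output describes.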

Related

How do I convert multiple multiline txt files to excel - ensuring each file is its own line, then each line of text is its own row? Python3

Using openpyxl and Path I aim to:
Create multiple multiline .txt files,
then insert .txt content into a .xlsx file ensuring file 1 is in column 1 and each line has its own row.
I thought to create a nested list and then loop through it to insert the text. I cannot figure out how to ensure that every string in the nested list is displayed. This is what I have so far; it nearly does what I want, but it just repeats the first line of text.
from pathlib import Path
import openpyxl
listOfText = []
wb = openpyxl.Workbook() # Create a new workbook to insert the text files
sheet = wb.active
for txtFile in range(5): # create 5 text files
    createTextFile = Path('textFile' + str(txtFile) + '.txt')
    createTextFile.write_text(f'''Hello, this is a multiple line text file.
My Name is x.
This is text file {txtFile}.''')
    readTxtFile = open(createTextFile)
    listOfText.append(readTxtFile.readlines()) # nest the list from each text file into a parent list
    textFileList = len(listOfText[txtFile]) # get the number of lines of text from the file. They are all 3 as made above
    # Each column displays text from each text file
    for row in range(1, txtFile + 1):
        for col in range(1, textFileList + 1):
            sheet.cell(row=row, column=col).value = listOfText[txtFile][0]
wb.save('importedTextFiles.xlsx')
The output is 4 columns/4 rows. All of which say the same 'Hello, this is a multiple line text file.'
Appreciate any help with this!
The problem is in the for loop while writing, change the line sheet.cell(row=row, column=col).value = listOfText[txtFile][0] to sheet.cell(row=col, column=row).value = listOfText[row-1][col-1] and it will work
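To make the index mapping in that fix explicit, here is a small sketch with plain lists, under the assumption that the nested list has the same shape as listOfText in the question. The generator name cell_assignments is hypothetical. It uses enumerate rather than range(1, txtFile + 1), because that range appears to stop one file short of the five that were created.

```python
def cell_assignments(list_of_text):
    """Yield (row, column, value) so that file N fills column N + 1."""
    for file_index, lines in enumerate(list_of_text):
        for line_index, line in enumerate(lines):
            # Spreadsheet rows and columns are 1-based; list indexes are 0-based.
            yield line_index + 1, file_index + 1, line.rstrip("\n")

# Stand-in for the nested list built with readlines() in the question.
list_of_text = [
    [
        "Hello, this is a multiple line text file.\n",
        "My Name is x.\n",
        f"This is text file {n}.",
    ]
    for n in range(5)
]

assignments = list(cell_assignments(list_of_text))
# Each triple could then be applied with:
#     sheet.cell(row=row, column=column).value = value
```

The first index selects the file (the column) and the second selects the line (the row). The original code fixed both at [txtFile][0], which is why every cell showed the same first line.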

How can I create an excel file with multiple sheets that stores content of a text file using python

I need to create an excel file and each sheet contains the contents of a text file in my directory, for example if I've two text file then I'll have two sheets and each sheet contains the content of the text file.
I've managed to create the excel file but I could only fill it with the contents of the last text file in my directory; however, I need to read all my text files and save them into excel.
This is my code so far:
import os
import glob
import xlsxwriter
file_name='WriteExcel.xlsx'
path = 'C:/Users/khouloud.ayari/Desktop/khouloud/python/Readfiles'
txtCounter = len(glob.glob1(path,"*.txt"))
for filename in glob.glob(os.path.join(path, '*.txt')):
    f = open(filename, 'r')
    content = f.read()
    print (len(content))
workbook = xlsxwriter.Workbook(file_name)
ws = workbook.add_worksheet("sheet" + str(i))
ws.set_column(0, 1, 30)
ws.set_column(1, 2, 25)
parametres = (
    ['file', content],
)
# Start from the first cell. Rows and
# columns are zero indexed.
row = 0
col = 0
# Iterate over the data and write it out row by row.
for name, parametres in (parametres):
    ws.write(row, col, name)
    ws.write(row, col + 1, parametres)
    row += 1
workbook.close()
example:
if I have two text file, the content of the first file is 'hello', the content of the second text file is 'world', in this case I need to create two worksheets, first worksheet needs to store 'hello' and the second worksheet needs to store 'world'.
but my two worksheets contain 'world'.
I recommend using pandas. It can use xlsxwriter under the hood to write data (whole tables) to excel files, but makes it much easier: it takes only a couple of lines of code.
import pandas as pd
df_1 = pd.DataFrame({'data': ['Hello']})
sn_1 = 'hello'
df_2 = pd.DataFrame({'data': ['World']})
sn_2 = 'world'
filename_excel = '1.xlsx'
with pd.ExcelWriter(filename_excel) as writer:
    for df, sheet_name in zip([df_1, df_2], [sn_1, sn_2]):
        df.to_excel(writer, index=False, header=False, sheet_name=sheet_name)
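The answer above hard-codes two data frames, so the step of reading every text file is still missing. A possible sketch of that step follows, using only the standard library. The function name sheets_from_text_files and the file names first.txt and second.txt are made up for illustration, and using the file stem as the sheet name is an assumption.

```python
from pathlib import Path
import tempfile

def sheets_from_text_files(folder):
    """Map a sheet name (the file stem) to the full text of each .txt file."""
    sheets = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # Excel limits sheet names to 31 characters, so truncate the stem.
        sheets[path.stem[:31]] = path.read_text()
    return sheets

with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical input files matching the example in the question.
    Path(tmp, "first.txt").write_text("hello")
    Path(tmp, "second.txt").write_text("world")
    sheets = sheets_from_text_files(tmp)

# Each (name, content) pair could then become one sheet inside the
# ExcelWriter block from the answer, for example:
#     pd.DataFrame({'data': [content]}).to_excel(
#         writer, index=False, header=False, sheet_name=name)
```

Collecting the contents first keeps each file's text in its own dictionary entry, so a later file cannot overwrite an earlier one. Note that truncating long stems could make two names collide.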

Read file and output specific fields to CSV file

I'm trying to search for data based on a key word and export that data to an Excel or text file.
When I "print" the variable/list it works no problem. When I try and output the data to a file it only outputs the last entry. I think something is wrong with the iteration, but I can't figure it out.
import xlsxwriter
#Paths
xls_output_path = 'C:\\Data\\'
config = 'C:\\Configs\\filename.txt'
excel_inc = 0 #used to increment the excel columns so not everything is written in "A1"
lines = open(config,"r").read().splitlines()
search_term = "ACL"
for i, line in enumerate(lines):
    if search_term in line:
        split_lines = line.split(' ') #Split lines via a space.
        linebefore = lines[i - 1] #Print the line before the search term
        linebefore_split = linebefore.split(' ') #Split the line before via space
        from_obj = linebefore_split[2] #[2] holds the data I need
        to_object = split_lines[4] #[4] holds the data I need
        print(len(split_lines)) #Prints each found line with no problem.
        excel_inc = excel_inc + 1 #Increments for column A so not all of the data is placed in A1
        excel_inc_str = str(excel_inc) #Change type to string so it can concatenate.
        workbook = xlsxwriter.Workbook(xls_output_path + 'Test.xlsx') #Creates the xls file
        worksheet = workbook.add_worksheet()
        worksheet.write('A' + excel_inc_str, split_lines[4]) #Write data from split_lines[4] to column A
        workbook.close()
I created this script so it will go and find all lines in the "config" file with the keyword "ACL".
It then has the ability to print the line before and the actual line the data is found. This works great.
My next step is outputting the data to an excel spreadsheet. This is where I get stuck.
The script only prints the very last item in the column A row 10.
I need help figuring out why it'll print the data correctly, but it won't output it to an excel spreadsheet or even a .txt file.
Try this - I moved your workbook and worksheet definitions outside the loop, so it doesn't keep getting redefined.
import xlsxwriter
#Paths
xls_output_path = 'C:\\Data\\'
config = 'C:\\Configs\\filename.txt'
excel_inc = 0 #used to increment the excel columns so not everything is written in "A1"
lines = open(config,"r").read().splitlines()
search_term = "ACL"
workbook = xlsxwriter.Workbook(xls_output_path + 'Test.xlsx') #Creates the xls file
worksheet = workbook.add_worksheet()
for i, line in enumerate(lines):
    if search_term in line:
        split_lines = line.split(' ') #Split lines via a space.
        linebefore = lines[i - 1] #Print the line before the search term
        linebefore_split = linebefore.split(' ') #Split the line before via space
        from_obj = linebefore_split[2] #[2] holds the data I need
        to_object = split_lines[4] #[4] holds the data I need
        print(len(split_lines)) #Prints each found line with no problem.
        excel_inc = excel_inc + 1 #Increments for column A so not all of the data is placed in A1
        excel_inc_str = str(excel_inc) #Change type to string so it can concatenate.
        worksheet.write('A' + excel_inc_str, split_lines[4]) #Write data from split_lines[4] to column A
workbook.close()
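Since the question also mentions CSV or text output, the same "create the writer once, before the loop" idea can be sketched with the standard csv module. The sample lines below are invented, because the real config format is not shown in the post. The function name find_pairs is hypothetical, and the length guards are an addition that the original script does not have.

```python
import csv
import io

def find_pairs(lines, search_term="ACL"):
    """Return (from_obj, to_object) for every line containing search_term."""
    pairs = []
    for i, line in enumerate(lines):
        if search_term not in line or i == 0:
            continue  # no match, or no previous line to read from
        before = lines[i - 1].split(' ')
        current = line.split(' ')
        # Skip lines that are too short instead of raising IndexError.
        if len(before) > 2 and len(current) > 4:
            pairs.append((before[2], current[4]))
    return pairs

# Invented sample lines; the real config format is not shown in the post.
sample = [
    "interface eth 0/1",
    "ip access-group ACL in OUT_B",
    "unrelated line",
    "interface eth 0/2",
    "ip access-group ACL in OUT_C",
]

pairs = find_pairs(sample)

# Write all rows through one writer that is created once, before the loop.
buffer = io.StringIO()  # stands in for open('Test.csv', 'w', newline='')
writer = csv.writer(buffer)
writer.writerows(pairs)
csv_text = buffer.getvalue()
```

Separating the search from the writing makes it harder to reopen, and so truncate, the output file on every match, which is what caused only the last entry to survive.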

Convert and concatenate data from two columns of a csv file

I have a csv file which contains data in two columns, as follows:
40500 38921
43782 32768
55136 49651
63451 60669
50550 36700
61651 34321
and so on...
I want to convert each value into its hex equivalent, then concatenate them, and write them into a column in another csv file.
For example: hex(40500) = 9E34, and hex(38921) = 9809.
So, in output csv file, element A1 would be 9E349809
So, I am expecting column A in the output csv file to be:
9E349809
AB068000
D760C1F3
F7DBECFD
C5768F5C
F0D38611
I referred to sample code that concatenates two columns, but I am struggling with converting the values to hex and then concatenating them. The code follows:
import csv
inputFile = 'input.csv'
outputFile = 'output.csv'
with open(inputFile) as f:
    reader = csv.reader(f)
    with open(outputFile, 'w') as g:
        writer = csv.writer(g)
        for row in reader:
            new_row = [''.join([row[0], row[1]])] + row[2:]
            writer.writerow(new_row)
How can I convert the data in each column to its hex equivalent, then concatenate the results and write them to another file?
You could do this in 4 steps:
Read the lines from the input csv file
Use formatting options to get the hex values of each number
Perform string concatenation to get your result
Write to new csv file.
Sample Code:
with open (outputFile, 'w') as outfile:
    with open (inputFile,'r') as infile:
        for line in infile: # Iterate through each line
            left, right = int(line.split()[0]), int(line.split()[1]) # split left and right blocks
            newstr = '{:x}'.format(left)+'{:x}'.format(right) # create new string using hex values excluding '0x'
            outfile.write(newstr + '\n') # write to output file, one result per line
        print ('Conversion completed')
    print ('Closing outputfile')
Sample Output:
In[44] line = '40500 38921'
Out[50]: '9e349809'
ParvBanks's solution is good (clear and functional); I would simplify it a little, like this:
with open (inputFile,'r') as infile, open (outputFile, 'w+') as outfile:
    for line in infile:
        # the trailing '\n' keeps each result on its own line
        outfile.write("".join(["{:x}".format(int(v)) for v in line.split()]) + "\n")
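Both answers produce lowercase hex without padding, whereas the expected output in the question is uppercase with four digits per value. A hedged variant that matches that format is sketched below. It assumes every value fits in 16 bits, as the sample data does. The names hex_concat and convert are hypothetical.

```python
import csv
import io

def hex_concat(values, width=4):
    """Join integers as zero-padded, uppercase hex strings without '0x'."""
    return "".join(f"{int(v):0{width}X}" for v in values)

def convert(in_lines, out_file):
    """Write one concatenated hex string per input line to out_file."""
    writer = csv.writer(out_file)
    for line in in_lines:
        # Accept either comma- or whitespace-separated columns.
        fields = line.replace(",", " ").split()
        if fields:  # ignore blank lines
            writer.writerow([hex_concat(fields)])

sample = ["40500 38921", "43782 32768"]
out = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
convert(sample, out)
result = out.getvalue().splitlines()
```

The padding matters for small values: without it, 255 would contribute "FF" rather than "00FF", and the concatenated string could no longer be split back into its two halves.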

How to convert a tab delimited text file to a csv file in Python

I have the following problem:
I want to convert a tab delimited text file to a csv file. The text file is the SentiWS dictionary which I want to use for a sentiment analysis ( https://github.com/MechLabEngineering/Tatort-Analyzer-ME/tree/master/SentiWS_v1.8c ).
The code I used to do this is the following:
txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"
in_txt = csv.reader(open(txt_file, "r"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'w'))
out_csv.writerows(in_txt)
This code writes everything in one row but I need the data to be in three rows as normally intended from the file itself. There is also a blank line under each data and I don't know why.
I want the data to be in this form:
Row1 Row2 Row3
Word Data Words
Word Data Words
instead of
Row1
Word,Data,Words
Word,Data,Words
Can anyone help me?
import pandas
# Convert the tab-delimited text file into a dataframe
dataframe = pandas.read_csv("SentiWS_v1.8c_Positive.txt",delimiter="\t")
# Write the dataframe to CSV
dataframe.to_csv("NewProcessedDoc.csv", encoding='utf-8', index=False)
Try this:
import csv
txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"
with open(txt_file, "r") as in_text:
    in_reader = csv.reader(in_text, delimiter = '\t')
    # newline='' is an argument of open(), not of csv.writer()
    with open(csv_file, "w", newline='') as out_csv:
        out_writer = csv.writer(out_csv)
        for row in in_reader:
            out_writer.writerow(row)
There is also a blank line under each data and I don't know why.
You're probably using a file created or edited in a Windows-based text editor. According to the Python 3 csv module docs:
If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
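As a self-contained illustration of that advice, the sketch below converts a small tab-delimited file and checks that no doubled carriage return ends up in the output. The sample rows are made up in a word / score / inflections layout, the function name tsv_to_csv is hypothetical, and UTF-8 is an assumed encoding.

```python
import csv
import tempfile
from pathlib import Path

def tsv_to_csv(txt_path, csv_path):
    """Copy a tab-delimited file to a comma-delimited one, row by row."""
    # newline='' goes on open(), so the csv module controls line endings.
    with open(txt_path, "r", newline="", encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="\t")
        csv.writer(dst).writerows(reader)

with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp, "input.txt")
    dst_path = Path(tmp, "output.csv")
    # Made-up rows; the last field contains a comma on purpose.
    src_path.write_text("gut\t0.37\tgute,guter\nschoen\t0.02\tschoene\n",
                        encoding="utf-8")
    tsv_to_csv(src_path, dst_path)
    raw = dst_path.read_bytes()
    with open(dst_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
```

Without newline='' on the output file, a platform that translates "\n" to "\r\n" would turn the writer's own "\r\n" into "\r\r\n", which spreadsheet programs tend to show as an empty row after each record.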
