I created a table using PrettyTable. I would like to save the output as a .pdf file, but the only thing I can do is save it as .txt.
How can I save it as a .pdf file?
I installed the FPDF library but I am stuck at this point.
# my table 'x' is converted to text and saved in the 'data' variable
# I saved the table ('data') as a .txt file
data = x.get_string()
with open('nameoffile.txt', 'w') as f:
    f.write(data)
print(data)
PrettyTable is not used to export data to a PDF file; it is used to display ASCII tables.
The following code is a homemade method that answers your problem.
Let's assume you have this PrettyTable that you want to export:
from prettytable import PrettyTable
x = PrettyTable()
x.field_names = ["City name", "Area", "Population", "Annual Rainfall"]
x.add_row(["Adelaide", 1295, 1158259, 600.5])
x.add_row(["Brisbane", 5905, 1857594, 1146.4])
x.add_row(["Darwin", 112, 120900, 1714.7])
x.add_row(["Hobart", 1357, 205556, 619.5])
x.add_row(["Sydney", 2058, 4336374, 1214.8])
x.add_row(["Melbourne", 1566, 3806092, 646.9])
x.add_row(["Perth", 5386, 1554769, 869.4])
First, you need to get the content of your table. The module isn't meant to work in this direction: it assumes you already have table content that you want to display. Let's do the opposite:
def get_data_from_prettytable(data):
    """
    Get a list of lists from a PrettyTable table
    Arguments:
    :param data: data table to process
    :type data: PrettyTable
    """
    def remove_space(liste):
        """
        Remove spaces from each word in a list
        Arguments:
        :param liste: list of strings
        """
        list_without_space = []
        for mot in liste:                                  # For each word in the list
            word_without_space = mot.replace(' ', '')      # Word without spaces
            list_without_space.append(word_without_space)  # List of words without spaces
        return list_without_space

    # Get each row of the table
    string_x = str(data).split('\n')        # Get a list of rows
    header = string_x[1].split('|')[1:-1]   # Column names
    rows = string_x[3:len(string_x) - 1]    # List of data rows
    list_word_per_row = []
    for row in rows:                        # For each row of the table
        row_resize = row.split('|')[1:-1]   # Remove the first and last (empty) pieces
        list_word_per_row.append(remove_space(row_resize))  # Remove spaces
    return header, list_word_per_row
Then you can export it to a PDF file. Here is one solution:
from fpdf import FPDF

def export_to_pdf(header, data):
    """
    Create a table in a PDF file from a list of rows
    :param header: column names
    :param data: list of rows (a row = a list of cells)
    """
    pdf = FPDF()                                # New pdf object
    pdf.set_font("Arial", size=12)              # Font style
    epw = pdf.w - 2 * pdf.l_margin              # Width of document
    col_width = pdf.w / 4.5                     # Column width in table
    row_height = pdf.font_size * 1.5            # Row height in table
    spacing = 1.3                               # Spacing factor for each cell

    pdf.add_page()                              # Add new page
    pdf.cell(epw, 0.0, 'My title', align='C')   # Create title cell
    pdf.ln(row_height * spacing)                # Line break after the title

    # Add header
    for item in header:                         # For each column
        pdf.cell(col_width, row_height * spacing,   # Add a new cell
                 txt=item, border=1)
    pdf.ln(row_height * spacing)                # New line after the header

    for row in data:                            # For each row of the table
        for item in row:                        # For each cell in the row
            pdf.cell(col_width, row_height * spacing,   # Add cell
                     txt=item, border=1)
        pdf.ln(row_height * spacing)            # New line at the end of the row

    pdf.output('simple_demo.pdf')               # Write the PDF file
    pdf.close()                                 # Optional: output() already finalises the document
Finally, you just have to call the two methods:
header, data = get_data_from_prettytable(x)
export_to_pdf(header, data)
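As a side note, if you are on the maintained fpdf2 fork (roughly version 2.7 or newer) rather than the legacy pyfpdf, it ships a built-in table() helper that draws borders and sizes columns for you. This is only a minimal sketch under that assumption, reusing the header and data lists returned above:

from fpdf import FPDF  # assumes the fpdf2 package, not the legacy pyfpdf

pdf = FPDF()
pdf.add_page()
pdf.set_font("Helvetica", size=12)

with pdf.table() as table:              # fpdf2 draws the grid and sizes columns automatically
    for source_row in [header] + data:  # header first, then every data row
        row = table.row()
        for item in source_row:
            row.cell(str(item))         # table cells must be strings

pdf.output("simple_demo_fpdf2.pdf")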
from tkinter import Tk, Button, TOP, mainloop
from tkinter.filedialog import askopenfile

root = Tk()
root.geometry('200x100')

def open_file():
    file = askopenfile(mode='r', filetypes=[('Measurement Files', '*.dat')])
    if file is not None:
        content = file.read()
        print(content)

btn = Button(root, text='Open', command=lambda: open_file())
btn.pack(side=TOP, pady=10)
mainloop()
Python version 3.7
The code returns the content of the file, but how can I save each column of the data into a different variable?
The data file structure is shown in the attached picture (not reproduced here).
I want to create 5 lists, one with the data of each column.
Here is a solution where the datafile is parsed into a list of five lists, where each inner list contains one column of data:
content = file.read()

cols = [[], [], [], [], []]
for line in content.split('\n'):            # loop over all data lines
    for n, col in enumerate(line.split()):  # split each line into five columns and loop over them
        cols[n].append(float(col))          # convert each value to float and append it to the corresponding column
EDIT (after your comment):
If you really need 5 different variables to store your 5 columns, you can simply add the following line after the previous code:
a, b, c, d, e = cols
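As an aside, if the .dat file contains nothing but whitespace-separated numbers, numpy can do the same in a single call. A sketch, assuming numpy is installed, the file really has exactly five numeric columns, and using a made-up file name:

import numpy as np

# unpack=True transposes the result, so each variable receives one whole column
a, b, c, d, e = np.loadtxt('measurement.dat', unpack=True)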
I have data in a CSV file with 2 columns: the first column contains a member id and the second contains characteristics as key-value pairs (nested one under another).
I have seen code online that converts simple key-value pairs, but I am not able to transform data like what I have shown above.
I want to transform this data into an Excel table as below.
I did it with the XlsxWriter package, so first you have to install it by running the pip install XlsxWriter command.
import csv          # to read the csv file
import xlsxwriter   # to write the xlsx file
import ast

# you can change these names according to your local ones
csv_file = 'data.csv'
xlsx_file = 'data.xlsx'

# read the csv file and get all the JSON values into the data list
data = []
with open(csv_file, 'r') as csvFile:
    # read line by line in the csv file
    reader = csv.reader(csvFile)
    # convert every line into a list and select the JSON values
    for row in list(reader)[1:]:
        # csv is comma separated, so combine all the necessary
        # parts of the json with commas
        json_to_str = ','.join(row[1:])
        # convert it to a python dictionary
        str_to_dict = ast.literal_eval(json_to_str)
        # append the completed JSON into the data list
        data.append(str_to_dict)

# define the excel file
workbook = xlsxwriter.Workbook(xlsx_file)
# create a sheet for our work
worksheet = workbook.add_worksheet()

# cell format for merged fields: bold, centred text
# and a drawn border
merge_format = workbook.add_format({
    'bold': 1,
    'border': 1,
    'align': 'center',
    'valign': 'vcenter'})

# other cell format to draw the border
cell_format = workbook.add_format({
    'border': 1,
})

# create the header section dynamically
first_col = 0
last_col = 0
for index, value in enumerate(data[0].items()):
    if isinstance(value[1], dict):
        # this branch means the JSON key holds something other
        # than a single value, like a dict or a list
        last_col += len(value[1].keys())
        worksheet.merge_range(first_row=0,
                              first_col=first_col,
                              last_row=0,
                              last_col=last_col,
                              data=value[0],
                              cell_format=merge_format)
        for k, v in value[1].items():
            # this goes deeper into the nested value
            worksheet.write(1, first_col, k, merge_format)
            first_col += 1
        first_col = last_col + 1
    else:
        # 'age' has only one value, so this else section
        # creates normal headers like 'age'
        worksheet.write(1, first_col, value[0], merge_format)
        first_col += 1

# now we know how many columns exist in the
# excel file, and set their width to 20
worksheet.set_column(first_col=0, last_col=last_col, width=20)

# filling values into the excel file
for index, value in enumerate(data):
    last_col = 0
    for k, v in value.items():
        if isinstance(v, dict):
            # this handles values that are dictionaries
            for k1, v1 in v.items():
                if isinstance(v1, list):
                    # this captures the last 'type' list (['Grass', 'Hardball'])
                    # in the 'conditions'
                    worksheet.write(index + 2, last_col, ', '.join(v1), cell_format)
                else:
                    # just fill in other values that are not lists
                    worksheet.write(index + 2, last_col, v1, cell_format)
                last_col += 1
        else:
            # this handles single values other than dict or list
            worksheet.write(index + 2, last_col, v, cell_format)
            last_col += 1

# finally close the workbook to create the excel file
workbook.close()
I commented almost every line to make it easier to understand and to reduce the complexity, since you are very new to Python. If any point is unclear, let me know and I'll explain as much as I can. Additionally, I used the enumerate() Python built-in function. Check this small example, which I took directly from the original documentation. enumerate() is useful when numbering items in a list.
Return an enumerate object. iterable must be a sequence, an iterator, or some other object which supports iteration. The __next__() method of the iterator returned by enumerate() returns a tuple containing a count (from start which defaults to 0) and the values obtained from iterating over iterable.
>>> seasons = ['Spring', 'Summer', 'Fall', 'Winter']
>>> list(enumerate(seasons))
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
>>> list(enumerate(seasons, start=1))
[(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
Here is my csv file,
and here is the final output of the excel file. I just merged the duplicate header values (matchruns and conditions).
I'm new to Python and I need some help with reading a file and counting the words in a column.
I have 2 data files, category.csv and data.csv.
category.csv:
CATEGORY
Technology
Furniture
Office Supplies
and below is data.csv
CATEGORY
Technology
Furniture
Technology
Furniture
Office Supplies
First, I want to select 'Technology' in category.csv and match it against data.csv; after that, it should count how many times 'Technology' appears in data.csv.
import csv  # import csv module

filePath1 = "category.csv"
filePath2 = "data.csv"

with open(filePath1) as csvfile1:           # open category file
    with open(filePath2) as csvfile2:       # open data file
        reader1 = csv.DictReader(csvfile1)  # dict-read file
        reader2 = csv.DictReader(csvfile2)  # dict-read file
        for row1 in reader1:                # read all rows in the category file
            for row2 in reader2:
                for row1['CATEGORY'] in row2['CATEGORY']:
                    total_tech = row2['CATEGORY'].count('Technology')
                    total_furn = row2['CATEGORY'].count('Furniture')
                    total_offi = row2['CATEGORY'].count('Office Supplies')

print("=============================================================================")
print("Display category average stock level")
print("=============================================================================")
print("Technology :", total_tech)
print("Furniture :", total_furn)
print("Office Supplies :", total_offi)
print("=============================================================================")
But I failed to count them with the above code. Can somebody help me? Thank you so much.
Here is the solution -
import csv  # import csv module

filePath1 = "category.csv"
filePath2 = "data.csv"

categories = {}
with open(filePath1) as csvfile:        # open category file
    reader = csv.DictReader(csvfile)    # dict-read file
    for row in reader:                  # Create a dictionary map of all the categories, and initialise count to 0
        categories[row["CATEGORY"]] = 0

with open(filePath2) as csvfile:        # open data file
    reader = csv.DictReader(csvfile)    # dict-read file
    for row in reader:
        categories[row["CATEGORY"]] += 1  # For every item in the data file, increment the count of its category

print("=============================================================================")
print("Display category average stock level")
print("=============================================================================")
for key, value in categories.items():
    print("{:<20} :{:>4}".format(key, value))
print("=============================================================================")
The output is like this -
=============================================================================
Display category average stock level
=============================================================================
Technology : 2
Office Supplies : 1
Furniture : 2
=============================================================================
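For comparison, the same tally can also be written with collections.Counter from the standard library. This is only a sketch under the same assumptions about the two files (each has a CATEGORY column as shown above):

import csv
from collections import Counter

with open("data.csv") as f:
    counts = Counter(row["CATEGORY"] for row in csv.DictReader(f))

with open("category.csv") as f:
    for row in csv.DictReader(f):
        # categories missing from data.csv simply print 0
        print("{:<20} :{:>4}".format(row["CATEGORY"], counts.get(row["CATEGORY"], 0)))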
I want to set a numeric format for a column or cell in an XLSX file using a Python script.
The conversion script takes a CSV file and converts it to XLSX. I deliberately treat the header as a regular line, because the final script handles it at the end of the conversion in various ways, according to the specified command-line parameters.
The example below shows only my attempt to set a numeric format for a column or cell.
What am I doing wrong?
With this code I manage to set the alignment to the right, but every way of setting the numeric format fails. The XLSX file still keeps that green triangle in the upper left corner of the cell and refuses to treat it as a numeric cell.
The attached screenshot shows the "wrong" result.
---- data file ----
a,b,c,d,e
q,1,123,0.4,1
w,2,897346,.786876,-1.1
e,3,9872346,7896876.098098,2.098
r,4,65,.3,1322
t,5,1,0.897897978,-786
---- python script ----
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import os
import pandas
import xlsxwriter

def is_type(value):
    '''Function to identify true type of the value passed
    Input parameters: value - some value which type need to be identified
    Returned values: Type of the value
    '''
    try:
        int(value)
        return "int"
    except:
        try:
            float(value)
            return "float"
        except:
            return "str"

csv_file_name = "test37.csv"
xls_file_name = "test37.xlsx"

# Read CSV file to DataFrame
df = pandas.read_csv(csv_file_name, header=None, low_memory=False, quotechar='"', encoding="ISO-8859-1")

# Output DataFrame to Excel file
df.to_excel(xls_file_name, header=None, index=False, encoding="utf-8")

# Create writer object for output of XLSX file
writer = pandas.ExcelWriter(xls_file_name, engine="xlsxwriter")

# Write our DataFrame object to newly created file
xls_sheet_name = os.path.basename(xls_file_name).split(".")[0]
df.to_excel(writer, header=None, index=False, sheet_name=xls_sheet_name, float_format="%0.2f")

# get objects for workbook and worksheet
wb = writer.book
ws = writer.sheets[xls_sheet_name]

ws.set_zoom(120)

num_format1 = wb.add_format({
    'align': 'right'
})
num_format2 = wb.add_format({
    'align': 'right',
    'num_format': '0.00'
})
num_format3 = wb.add_format()
num_format3.set_num_format('0.00')

ws.set_column('D:D', None, num_format1)
ws.set_column('D:D', None, num_format2)

for column in df.columns:
    for row in range(1, len(df[column])):
        if is_type(df[column][row]) == "int":
            # print("int " + str(df.iloc[row][column]))
            ws.write(row, column, df.iloc[row][column], num_format2)
        elif is_type(df[column][row]) == "float":
            # print("float " + str(df.iloc[row][column]))
            ws.write(row, column, df.iloc[row][column], num_format2)
        else:
            pass

wb.close()
writer.save()
exit(0)
The problem has nothing to do with your xlsxwriter script, but lies in the way you import the csv in Pandas. Your csv-file has a header, but you specify in pd.read_csv() that there isn't a header. Therefore, Pandas also parses the header row as data. Because the header is a string, the entire column gets imported as a string (instead of integer or float).
Just remove the 'header=None' in pd.read_csv and df.to_excel() and it should work fine.
so:
...<first part of your code>
# Read CSV file to DataFrame
df = pandas.read_csv(csv_file_name, low_memory=False, quotechar='"', encoding="ISO-8859-1")
# Output DataFrame to Excel file
df.to_excel(xls_file_name, index=False, encoding="utf-8")
<rest of your code>...
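For reference, once the header row is parsed as a header and the numeric columns come in as real numbers, the column format you already create is enough to get "0.00" cells. Below is a minimal sketch of that reduced flow, not your exact script; it keeps your file names, the xlsxwriter engine and writer.save(), which newer pandas versions replace with writer.close():

import os
import pandas

csv_file_name = "test37.csv"
xls_file_name = "test37.xlsx"

# header row is parsed as a header, so the data columns keep their numeric dtypes
df = pandas.read_csv(csv_file_name, low_memory=False, quotechar='"', encoding="ISO-8859-1")

writer = pandas.ExcelWriter(xls_file_name, engine="xlsxwriter")
xls_sheet_name = os.path.basename(xls_file_name).split(".")[0]
df.to_excel(writer, index=False, sheet_name=xls_sheet_name)

wb = writer.book
ws = writer.sheets[xls_sheet_name]
num_format = wb.add_format({'align': 'right', 'num_format': '0.00'})
ws.set_column('D:D', None, num_format)  # every plain cell in column D now renders as 0.00

writer.save()  # use writer.close() on recent pandas versions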
I have a csv file which is structured like this. What I want to achieve is to merge the colors: for example, for product code 1001 there are different colors, i.e. BLACK, CREAM, GRAPHITE, and I want one row for 1001 with all the colors in one cell, separated by ";" (semicolon). I want to do this for all products.
EDIT
Required Output:
1001-BLACK-P-OS ,BLACK;CREAM;Graphite
1002-BLACK-P-OS ,BLACK;CREAM
Given CSV
1001-BLACK-P-OS , BLACK
1001-CREAM-P-OS , CREAM
1001-GRAPH-P-OS , GRAPHITE
1002-BLACK-P-OS ,BLACK
1002-CREAM-P-OS ,CREAM
I am trying this in Python but am not able to do it.
import csv

with open('ascolor.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        serial = row[0]
        d = ''
        for r in readCSV:
            if serial is r[0]:
                d = d + r[1]
                d = d + ';'
Create your data file:
data = """1001-BLACK-P-OS , BLACK
1001-CREAM-P-OS , CREAM
1001-GRAPH-P-OS , GRAPHITE
1002-BLACK-P-OS ,BLACK
1002-CREAM-P-OS ,CREAM"""
fn = 'ascolor.csv'
with open(fn, "w") as f:
    f.write(data)
With that we can start reformatting it:
fn = 'ascolor.csv'

import csv

data = {}

with open(fn) as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        if row:  # weed out any empty rows - they would cause index errors
            num = row[0].split("-")[0]                   # use only the number as key into our dict
            d = data.setdefault(num, [row[0].strip()])   # create the default entry with num as key
                                                         # and the old "1001-BLACK-P-OS" text as first entry
            if len(d) == 1:                  # first time we add something for this key
                d.append([row[1].strip()])   # add the first color into an inner list
            else:                            # this is the second/third color for this key
                d[1].append(row[1].strip())  # append it to the inner list

# after that you've got a dictionary of your data:
# print(data)
# {'1001': ['1001-BLACK-P-OS', ['BLACK', 'CREAM', 'GRAPHITE']],
#  '1002': ['1002-BLACK-P-OS', ['BLACK', 'CREAM']]}

# when writing csv with the module, always open the file with newline = "",
# else you get silly empty lines inside your file. The csv module will do
# all the newlines needed. See the example at
# https://docs.python.org/3/library/csv.html#csv.writer
with open("done.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter=",")
    for k in sorted(data.keys()):
        # this would put the 1001-BLACK-P-OS in front - I don't like that
        # writer.writerow([data[k][0], ';'.join(data[k][1])])
        # I like this better - it's just 1001 and then the colors
        writer.writerow([k, ';'.join(data[k][1])])

print("")
with open("done.csv", "r") as f:
    print(f.read())
Output:
1001,BLACK;CREAM;GRAPHITE
1002,BLACK;CREAM
or with the commented line:
1001-BLACK-P-OS,BLACK;CREAM;GRAPHITE
1002-BLACK-P-OS,BLACK;CREAM
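If pandas is an option, the same aggregation can be written as a short groupby. This is only a sketch under the assumptions that the file has exactly the two-column layout shown above and no header row; the column names and the output file name are made up for the example:

import pandas as pd

# two columns, no header row in the file; "code" and "color" are invented names for the sketch
df = pd.read_csv("ascolor.csv", header=None, names=["code", "color"], skipinitialspace=True)
df["code"] = df["code"].str.strip()
df["color"] = df["color"].str.strip()

# group on the numeric part before the first dash and join the colors with ';'
key = df["code"].str.split("-").str[0]
merged = df.groupby(key).agg(code=("code", "first"), colors=("color", ";".join))

merged.to_csv("done_pandas.csv", index=False, header=False)
print(merged.to_string(index=False))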
HTH