Giving names to Excel files programmatically with pandas ExcelWriter - naming

How can I set the file name programmatically for pandas ExcelWriter?
That is, I need to define or change the name 'filename.xlsx' below from within the code:
results_to_export =[.......]
excel_file_name = pandas.ExcelWriter('filename.xlsx')
results_to_export.to_excel(excel_file_name)
# ExcelWriter.save() was removed in pandas 2.0; close() writes and closes the file
excel_file_name.close()
Thanks
I tried the code above.
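A minimal sketch of one way to do this, assuming the name should be built from a prefix and a date. The helpers build_excel_name and export_results, the "results" prefix and the date-based naming scheme are all made up for illustration; any string or path built at runtime can be passed to ExcelWriter in the same way.

```python
from datetime import date
from pathlib import Path

import pandas as pd


def build_excel_name(prefix, day, folder="."):
    """Build a path like <folder>/<prefix>_<YYYY-MM-DD>.xlsx (hypothetical naming scheme)."""
    return Path(folder) / f"{prefix}_{day.isoformat()}.xlsx"


def export_results(df, path):
    """Write a DataFrame to the given path."""
    # The context manager saves and closes the workbook when the block exits.
    with pd.ExcelWriter(path) as writer:
        df.to_excel(writer, index=False)


# Example usage (writing .xlsx needs an Excel engine such as openpyxl):
# results_to_export = pd.DataFrame({"value": [1, 2, 3]})
# export_results(results_to_export, build_excel_name("results", date.today()))
```

The point is only that the first argument of ExcelWriter is an ordinary value, so it can come from a variable, an f-string or a function call instead of a literal.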

Related

Convert pandas Data frame to existing Excel keeping the worksheet format

I have a DataFrame that I want to write into an existing Excel file using openpyxl. The file already exists and has formatting (shown in the image) that I want to keep once the data is transferred from the DataFrame.
import pandas as pd
import openpyxl
dataframe=pd.read_excel('info.xlsx')
with pd.ExcelWriter('file.xlsx', engine='openpyxl', if_sheet_exists='replace', mode='a', keep_format=True) as writer:
    dataframe.to_excel(writer, sheet_name='DATAFRAME INFO', startrow=1, index=None)
I can't find a way to do it. I tried passing something like keep_format=True as a keyword argument, but it still does not work: the existing format is always removed.
Thank you very much.
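One approach that may help here, offered as a sketch rather than a confirmed fix: open the existing workbook with openpyxl directly and assign only the cell values, since setting a cell's value in openpyxl leaves that cell's existing style alone. The function name write_values_keep_format and its parameters are hypothetical, and the sketch assumes the sheet already exists in the file.

```python
import pandas as pd
from openpyxl import load_workbook


def write_values_keep_format(df, path, sheet_name, start_row=1, start_col=1):
    """Write df (header row, then data rows) into an existing sheet, setting values only."""
    wb = load_workbook(path)
    ws = wb[sheet_name]
    # Header first, then one worksheet row per DataFrame row.
    rows = [list(df.columns)] + [list(t) for t in df.itertuples(index=False)]
    # openpyxl rows and columns are 1-based (pandas' startrow is 0-based).
    for r, row in enumerate(rows, start=start_row):
        for c, value in enumerate(row, start=start_col):
            ws.cell(row=r, column=c, value=value)
    wb.save(path)


# Example usage, with the file and sheet names from the question:
# write_values_keep_format(pd.read_excel('info.xlsx'), 'file.xlsx', 'DATAFRAME INFO', start_row=2)
```

Unlike if_sheet_exists='replace', this never recreates the sheet, so fonts, fills and number formats already applied to the target cells should survive. Features that openpyxl does not round-trip may still be lost when the workbook is saved.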

How to tokenize/parse data in an excel sheet using spacy

I'm trying to convert an Excel sheet into a Doc object using spaCy. I have spent the last couple of days trying to work around it, but it has proven challenging. I have opened the sheet with both openpyxl and pandas; I can read it and output the content, but I couldn't integrate spaCy to create Doc/Token objects.
Is it possible to process excel sheets in spacy's pipeline?
Thank you!
spaCy has no built-in support for Excel.
You could use pandas to read either the CSV file (if the data is in CSV format) or the Excel file:
import pandas as pd
df = pd.read_csv(file)
or
df = pd.read_excel(file)
respectively.
Then select the required text column, iterate over its values, and pass each one to spaCy's nlp().
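The last step could look roughly like this. It is only a sketch: the helper name column_to_docs, the file name reviews.xlsx and the column name text are invented for illustration, and nlp is assumed to be a loaded spaCy pipeline (anything with a pipe() method).

```python
import pandas as pd


def column_to_docs(df, column, nlp):
    """Run every non-empty cell of `column` through the given pipeline."""
    # Drop empty cells and make sure every value is a string before processing.
    texts = df[column].dropna().astype(str).tolist()
    # nlp.pipe processes the texts as a stream and yields one Doc per text.
    return list(nlp.pipe(texts))


# Example usage (assumes spaCy is installed; spacy.blank("en") is a tokenizer-only pipeline):
# import spacy
# nlp = spacy.blank("en")
# docs = column_to_docs(pd.read_excel("reviews.xlsx"), "text", nlp)
# tokens = [[token.text for token in doc] for doc in docs]
```

Using nlp.pipe rather than calling nlp() in a loop is a design choice for larger sheets; for a handful of rows either works.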

How to write the data to excel with python and keep excel number format?

I'm trying to write time data to Excel with Python (I'm using pandas). When I write the time data, the cells get the Excel number format 'General'.
But I need the number format to be 'Time', which is what I get when I manually paste the data as values into the Excel file.
Is it possible to do the same with Python? If so, how?
I have tried converting the values to datetime objects, but writing the data always removes the cell formatting in the Excel file:
df['starttime'] = pd.to_datetime(df['starttime']).dt.strftime('%I:%M:%S %p')
Have you tried using pandas?
Writing Excel files using pandas
We store the information we'd like to write to an Excel file in a DataFrame. Using the built-in to_excel() method, we can write this information to an Excel file.
Step 1: install pandas in your Python environment:
pip install pandas
Step 2: import the pandas module:
import pandas as pd
Step 3: use the to_excel() method to write the contents to a file. The only required argument is the file path:
df.to_excel('./states.xlsx')
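The steps above write the data but do not set a number format by themselves. Note also that strftime() in the question turns the values into plain strings, which Excel shows as text. A possible sketch, assuming the openpyxl engine and a new output file: keep the column as real datetimes and pass a display format through ExcelWriter's datetime_format parameter. The helper names and the output file name times.xlsx are hypothetical.

```python
import pandas as pd


def as_datetime_column(df, column):
    """Return a copy of df with `column` parsed to datetimes (not strings)."""
    out = df.copy()
    out[column] = pd.to_datetime(out[column])
    return out


def write_with_time_format(df, path, time_format="hh:mm:ss AM/PM"):
    """Write df so that datetime cells are displayed with an Excel time format."""
    # datetime_format is applied as the number format of datetime cells.
    with pd.ExcelWriter(path, engine="openpyxl", datetime_format=time_format) as writer:
        df.to_excel(writer, index=False)


# Example usage with the column name from the question:
# df = as_datetime_column(df, 'starttime')
# write_with_time_format(df, 'times.xlsx')
```

The cells still hold full datetime values; only the display format is a time. This creates a new file, so it does not address keeping formats that already exist in another workbook.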

How to read text data and convert it to a pandas DataFrame

I am reading from the URL
import pandas as pd
url = 'https://ticdata.treasury.gov/Publish/mfh.txt'
Data = pd.read_csv(url, delimiter= '\t')
But when I export the DataFrame, all the columns are combined into a single column.
I tried different separators, but none of them worked. How can I get a proper DataFrame?
Please help
I just opened this file in a text editor. It is not a tab-delimited file, or a delimited file of any kind. It is a fixed-width file, with the data on lines 9 to 48. You should use pd.read_fwf instead, skipping some lines.
This would work:
import pandas as pd
url = 'https://ticdata.treasury.gov/Publish/mfh.txt'
Data = pd.read_fwf(url, widths=(31,8,8,8,8,8,8,8,8,8,8,8,8,8), skiprows=8, skipfooter=16, header=(0,1))
Data
The file you linked is not in CSV/TSV format. You would need to transform the data into a delimited format before loading it this way.
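To see what read_fwf does with widths, skiprows and skipfooter without downloading anything, here is a small self-contained sketch. The sample text, its column widths and its values are made up and are not taken from the Treasury file.

```python
import io

import pandas as pd

# Build a tiny fixed-width "report": two preamble lines, a header row,
# two data rows and one footer line (all values are invented).
lines = [
    "Sample report",
    "Values are hypothetical",
    f"{'Country':<12}{'Jan':>6}{'Feb':>6}",
    f"{'Japan':<12}{1100:>6}{1090:>6}",
    f"{'South Korea':<12}{120:>6}{118:>6}",
    "Footnote: not real data",
]
text = "\n".join(lines) + "\n"

# widths gives the character width of each field; skiprows and skipfooter
# drop the preamble and footer lines that do not follow the column layout.
df = pd.read_fwf(io.StringIO(text), widths=(12, 6, 6), skiprows=2, skipfooter=1)
print(df)
```

For the real file, the widths and the number of lines to skip have to be read off the file itself, as in the answer above.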

Read excel with a particular word in title

I have a folder that may or may not contain multiple Excel files.
The names of the Excel files can change over time, but one specific keyword will always be in the name.
For test purposes, let the keyword be Fruits.
For an Excel file with a fixed name like Fruits_Pineapple.xlsx, this code works:
import pandas as pd
pd.read_excel(r'c:\mypath\Fruits_Pineapple.xlsx')
But I can have Excel files like Fruits_Pineapple, Fruits_Apple, Vegetables, etc. How can I read the Excel files whose names contain the keyword?
I have searched SO but surprisingly couldn't find a solution.
Since you don't know how many (if any) matching Excel files are in your folder, you can use glob:
import glob
import pandas as pd
# Use whatever extension you have; the pattern can include a relative or absolute path.
excel_list = glob.glob("*Fruits*.xlsx")
for excel in excel_list:
    df = pd.read_excel(excel)
    # Whatever else you need to do goes here
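The same idea can be wrapped in a small helper that takes the folder and keyword as arguments. This is a sketch using pathlib instead of the glob module; the function name find_keyword_files and its parameters are made up for illustration.

```python
from pathlib import Path


def find_keyword_files(folder, keyword, extension=".xlsx"):
    """Return the sorted paths in `folder` whose file name contains `keyword`."""
    pattern = f"*{keyword}*{extension}"
    # Path.glob yields nothing if there is no match, so the result may be empty.
    return sorted(Path(folder).glob(pattern))


# Example usage with the folder and keyword from the question:
# import pandas as pd
# frames = {p.name: pd.read_excel(p) for p in find_keyword_files(r'c:\mypath', 'Fruits')}
```

An empty list simply means no file matched, which covers the "may or may not have" case without extra checks. Whether the match is case-sensitive depends on the operating system.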
