Add multiple alignments to excel generated from pandas dataframe - python-3.x

I'm using the pandas to_excel method to export a dataframe to excel as follows:
df.to_excel(writer, sheet_name=k.measures[0].alias, header=False, index=False)
I'd like to left-align the names, right-align the values and centre-align the headers for a dataframe. I'm using the pandas ExcelWriter class. As far as I can tell, df.style.set_properties applies only one type of alignment to all text in the dataframe.
Is there a way to do what I want?
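One possible approach, offered as a sketch rather than a confirmed answer: write the dataframe with the openpyxl engine, then set each cell's alignment on the worksheet before the file is saved. The sample data, the write_aligned helper and the assumption that the names sit in the first column are all hypothetical. Headers are written here (unlike header=False in the question) so that there is a header row to centre.

```python
import io

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Alignment

# Hypothetical data standing in for the asker's dataframe.
df = pd.DataFrame({"name": ["alpha", "beta"], "value": [1.5, 20.0]})


def write_aligned(frame, target, sheet_name="Sheet1"):
    """Write frame to target, then set a different alignment per region."""
    with pd.ExcelWriter(target, engine="openpyxl") as writer:
        frame.to_excel(writer, sheet_name=sheet_name, index=False)
        ws = writer.sheets[sheet_name]
        # Row 1 holds the headers: centre them.
        for cell in ws[1]:
            cell.alignment = Alignment(horizontal="center")
        # Data starts at row 2. Assumption: column 1 holds names, the rest hold values.
        for row in ws.iter_rows(min_row=2):
            for cell in row:
                side = "left" if cell.column == 1 else "right"
                cell.alignment = Alignment(horizontal=side)


# Write to an in-memory buffer; a file path such as "out.xlsx" works the same way.
buffer = io.BytesIO()
write_aligned(df, buffer)

# Read the workbook back to inspect the alignment that was stored.
buffer.seek(0)
check = load_workbook(buffer)["Sheet1"]
print(check["A1"].alignment.horizontal,
      check["A2"].alignment.horizontal,
      check["B2"].alignment.horizontal)
```

The alignment is set per cell, so any rule for choosing left, right or centre can replace the column test used here.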

Related

Read excel using pandas and join two pandas dataframes without losing formatting styles

Initially I have two Excel files. The first input file has colors applied to some of its columns.
The other Excel file looks like this.
I have to join these two Excel files using openpyxl or xlsxwriter (Python libraries) or by any other method, and I don't want to lose the colors in the output file. The output file should look like the image below.
Please use the code below to create the pandas dataframes for the two input files.
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'name': ['rahul', 'raju', 'mohan', 'ram'],
                   'salary': [20000, 34000, 10000, 998765]
                   })
print(df)
df1 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'state': ['gujrat', 'bhopal', 'mumbai', 'kolkata']
                    })
print(df1)

Using a for loop in pandas

I have 2 different tabular files in Excel format. I want to know whether an ID number from one of the columns in the first Excel file (the "ID" column) exists in a specific column of the proteome file (take "IHD" for example) and, if so, to display the value associated with it. Is there a way to do this in pandas, possibly using a for loop?
After loading the Excel files with read_excel(), you should merge() the dataframes, matching the ID column of the first against the protein column of the second. This is the recommended approach in pandas rather than looping.
import pandas as pd
clusters = pd.read_excel('clusters.xlsx')
proteins = pd.read_excel('proteins.xlsx')
# merge() returns a new dataframe, so keep the result
merged = clusters.merge(proteins, left_on='ID', right_on='protein')
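To see which IDs were actually found and what value goes with them, a left merge with indicator=True may help. This is a minimal sketch: the two small frames are hypothetical stand-ins for the spreadsheets, and the column names ID, protein and IHD are taken from the question and answer above.

```python
import pandas as pd

# Hypothetical stand-ins for the two spreadsheets; real data would come from read_excel().
clusters = pd.DataFrame({"ID": ["P1", "P2", "P3"]})
proteins = pd.DataFrame({"protein": ["P1", "P3", "P4"], "IHD": [0.8, 1.2, 0.5]})

# A left merge keeps every ID from clusters; IDs missing from proteins get NaN in IHD.
# indicator=True adds a _merge column saying where each row came from.
merged = clusters.merge(
    proteins, how="left", left_on="ID", right_on="protein", indicator=True
)

# Keep only the IDs that exist in the proteome table, with the associated value.
found = merged.loc[merged["_merge"] == "both", ["ID", "IHD"]]
print(found)
```

The default how="inner" would drop unmatched IDs silently; the left merge keeps them visible, which makes it easier to check what did not match.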

Create a Dataframe from an excel file

I want to create a DataFrame from an Excel file. I am using the pandas read_excel function. My requirement is to create a DataFrame containing all rows where a column matches some value.
For example, below is my Excel file, and I want to create the DataFrame with all rows that have Module equal to 'DC-Prod'.
(Excel file image)
Welcome, Saagar Sheth!
To make a DataFrame, first import pandas like so:
import pandas as pd
Then create a variable for the file to access, like this:
file_var_pandas = 'customer_data.xlsx'
And then create its dataframe using read_excel:
customers = pd.read_excel(file_var_pandas,
                          sheet_name=0,  # spelled sheet_name; the old sheetname keyword is no longer accepted
                          header=0,
                          index_col=False,
                          keep_default_na=True
                          )
Finally, use the head() method to preview the first rows:
customers.head()
If you want to know more, just go to this website:
Packet Pandas Dataframe
And have fun!
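The answer above loads the whole sheet but does not show the filtering step. A minimal sketch of that step, assuming the sheet has a column named Module as in the question; the sample rows and the Server column are hypothetical and stand in for the result of read_excel:

```python
import pandas as pd

# Hypothetical rows standing in for the loaded spreadsheet.
# Only the 'Module' column name and the 'DC-Prod' value come from the question.
data = pd.DataFrame({
    "Module": ["DC-Prod", "DC-Test", "DC-Prod"],
    "Server": ["a01", "b02", "c03"],
})

# A boolean mask keeps only the rows whose Module equals 'DC-Prod'.
dc_prod = data[data["Module"] == "DC-Prod"].reset_index(drop=True)
print(dc_prod)
```

reset_index(drop=True) is optional; it renumbers the kept rows from 0.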

Dask Dataframe View Entire Row

I want to see the entire row of a Dask dataframe without the fields being cut off. In pandas the command is pd.set_option('display.max_colwidth', -1). Is there an equivalent for Dask? I was not able to find anything.
You can import pandas and use pd.set_option() and Dask will respect pandas' settings.
import pandas as pd
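# Don't truncate text fields in the display.
# Since pandas 1.0, None is the value for "no limit"; -1 is deprecated.
pd.set_option("display.max_colwidth", None)
# Here dd is the Dask dataframe itself, not the dask.dataframe module
dd.head()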
And you should see the long columns. It 'just works.'
Dask does not normally display the data in a dataframe at all, because it represents lazily-evaluated values. You may want to get a specific row by index, using the .loc accessor (same as in Pandas, but only efficient if the index is known to be sorted).
If you meant to get the whole list of columns only, you can get this by the .columns attribute.
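If the wider display is only needed for one preview, pd.option_context can limit the change to a single block instead of setting it globally. A minimal sketch, using a plain pandas frame as a hypothetical stand-in for what head() on a Dask dataframe returns:

```python
import pandas as pd

# Hypothetical frame standing in for the pandas object returned by head().
preview = pd.DataFrame({"text": ["x" * 120]})

# With default settings, long cell values are cut off and end in "...".
truncated = repr(preview)

# Inside the context manager the column width limit is lifted;
# the previous setting is restored when the block exits.
with pd.option_context("display.max_colwidth", None):
    full = repr(preview)

print(truncated)
print(full)
```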

How to write summary of spark sql dataframe to excel file

I have a very large DataFrame with 8000 columns and 50000 rows.
I want to write its summary statistics to an Excel file.
I think we can use the describe() method, but how can I write the result to Excel in a readable format? Thanks.
describe() returns a PySpark dataframe. The easiest way to get it into a format Excel can read is to convert it to a pandas dataframe and then write that out as a CSV file, as below:
import pandas
df.describe().toPandas().to_csv('fileOutput.csv')
If you want it in Excel format, you can try the following:
import pandas
df.describe().toPandas().to_excel('fileOutput.xls', sheet_name = 'Sheet1', index = False)
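Note that the above requires the xlwt package to be installed (pip install xlwt on the command line). pandas 2.0 and later no longer support xlwt; there, write to a .xlsx file with openpyxl installed instead.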
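With 8000 columns, the describe() output is a handful of rows by about 8000 columns. That is hard to read and exceeds the 256-column limit of the .xls format. Transposing it on the pandas side gives one row per original column. A minimal sketch, assuming every column is numeric; summary_pdf is a hypothetical stand-in for df.describe().toPandas(), and col_a and col_b are made-up column names:

```python
import pandas as pd

# Hypothetical stand-in for df.describe().toPandas(): Spark returns a 'summary'
# column plus one string-typed column per input column.
summary_pdf = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col_a": ["3", "2.0", "1.0", "1", "3"],
    "col_b": ["3", "20.0", "10.0", "10", "30"],
})

# One row per original column, one column per statistic.
# astype(float) assumes all statistics are numeric strings.
tall = summary_pdf.set_index("summary").T.astype(float)
tall.index.name = "column"
print(tall)

# Writing is left commented out because it needs an Excel engine such as openpyxl:
# tall.to_excel("fileOutput.xlsx", sheet_name="Sheet1")
```

The index is kept when writing here (no index=False), since after the transpose it holds the original column names.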
