I have a folder which may or may not contain multiple Excel files.
The names of the Excel files can change over time, but one specific keyword will always appear in the file name.
For test purposes, let the keyword be Fruits.
For an Excel file with a fixed name like Fruits_Pineapple.xlsx, this code works:
import pandas as pd
pd.read_excel(r'c:\mypath\Fruits_Pineapple.xlsx')
But I can have Excel files like Fruits_Pineapple, Fruits_Apple, Vegetables, etc. I want to know how I can read only the Excel files whose names contain the keyword.
I have searched SO but surprisingly couldn't find any solution!
Since you don't know how many (if any) matching Excel files are in your folder, you can do the following using glob:
import glob
import pandas as pd
excel_list = glob.glob("*Fruits*.xlsx")
# or whatever extension you have; you can give a relative path or a complete path.
for excel in excel_list:
    df = pd.read_excel(excel)
    # Whatever else you need to do below
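As a minimal sketch of the same idea, the matching and the reading can be split into two small functions. The names find_keyword_files and read_keyword_files, and the folder argument, are assumptions for illustration and not part of the answer above; pathlib's glob is used in place of glob.glob, and reading .xlsx files with pandas needs an engine such as openpyxl installed.

```python
from pathlib import Path


def find_keyword_files(folder, keyword="Fruits", extension=".xlsx"):
    """Return sorted paths in `folder` whose file name contains `keyword`."""
    pattern = f"*{keyword}*{extension}"
    return sorted(Path(folder).glob(pattern))


def read_keyword_files(folder, keyword="Fruits"):
    """Read every matching workbook into a dict keyed by file name."""
    import pandas as pd  # imported here so the matching helper works without pandas

    frames = {}
    for path in find_keyword_files(folder, keyword):
        frames[path.name] = pd.read_excel(path)
    return frames  # an empty dict when nothing matches
```

Returning a dict keyed by file name keeps each DataFrame traceable to its source file, and an empty folder simply gives an empty dict instead of an error.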
How can I set the file name programmatically for the pandas ExcelWriter?
i.e.
the name 'filename.xlsx' below: I need to define/change it within the code
results_to_export =[.......]
excel_file_name = pandas.ExcelWriter('filename.xlsx')
results_to_export.to_excel(excel_file_name)
excel_file_name.close()  # ExcelWriter.save() was removed in pandas 2.0; close() writes the file
Thanks
I tried the thing above
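One possible way to set the name in code, as a minimal sketch: build the file name as an ordinary string and pass it to to_excel. The helper names make_excel_name and export_results, and the prefix-plus-date naming scheme, are hypothetical; results is assumed to be a pandas DataFrame, since a plain list has no to_excel method.

```python
from datetime import date


def make_excel_name(prefix, day=None):
    """Build a name such as 'results_2024-01-31.xlsx' from a prefix and a date."""
    day = day or date.today()
    return f"{prefix}_{day.isoformat()}.xlsx"


def export_results(results, prefix):
    """Write `results` (assumed to be a pandas DataFrame) to a generated file name."""
    file_name = make_excel_name(prefix)
    results.to_excel(file_name, index=False)  # to_excel accepts a path string directly
    return file_name
```

Any string expression works in place of make_excel_name; the point is only that the name is a normal Python value and not a fixed literal.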
Pandas DataFrames - how do I export list 'X' to a CSV so it appears as a string? The problem is when I open the CSV using Excel it appears in date format.
X=['1-4', '1-5', '2-3', '4-8']
i.e. when list 'X' is exported to a CSV and opened with Excel, it appears as a date:
I would like list 'X' to appear in Excel as is, that is, not converted to date format.
Desired output for Excel is:
I have tried the following code - but it throws an error:
import pandas as pd
X=['1-4', '1-5', '2-3', '4-8']
Y=[1,4,3,5]
df=pd.DataFrame(list(zip(X,Y)))
column_names=['A','B']
df.columns=[column_names]
df.A.to_string()
df.to_csv('yyy.csv', mode='a', header=True)
Thank you
It worked fine for me...
Maybe Excel, or whatever program you use to open the file, is casting it. Try opening it as a text file.
Even if Excel displays the values in date format, when you open the file in pandas they come back in the original format (at least in my case). If someone only wants to save data as CSV and work with it in pandas again, it should be fine.
I also tried the 2nd option here (https://www.winhelponline.com/blog/stop-excel-convert-text-to-number-date-format-csv-file/), which imports the data as text, and then saved the file again. It worked for me.
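The point made in the answers above can be checked with a small sketch: the CSV on disk holds the plain text 1-4, and only Excel's display converts it. This assumes the columns are named with a flat list (A and B as in the question); the function name round_trip and its path argument are made up for illustration. dtype=str is passed to read_csv so that nothing is reinterpreted on the way back in.

```python
import pandas as pd


def round_trip(path):
    """Write the question's data to CSV and read it back as text."""
    X = ['1-4', '1-5', '2-3', '4-8']
    Y = [1, 4, 3, 5]
    df = pd.DataFrame({'A': X, 'B': Y})  # flat column names, not a nested list
    df.to_csv(path, index=False)
    # dtype=str keeps every cell as the literal text stored in the file
    return pd.read_csv(path, dtype=str)
```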
I am coming from a Java background and have minimal knowledge of Python. I have to read an Excel file and validate the values of one of its columns against the DB, to verify whether those rows exist in the DB or not.
I know the exact libraries and steps in Java that I could use to do this work.
But I am having trouble choosing how to do it in Python.
So far I have identified a few things I can do:
Read the Excel file in Python.
Use pyodbc to validate the values.
Can pandas help me refine those steps, rather than doing things the hard way?
Yes pandas can help. But you phrase the question in a "please google this for me" way. Expect this question to be down-voted a lot.
I will give you the answer for the excel part. Surely you could have found this yourself with a little effort?
import pandas as pd
df = pd.read_excel('excel_file.xls')
Read the documentation.
Using the xlrd module, one can retrieve information from a spreadsheet in Python. xlrd only reads data (and since version 2.0 only the old .xls format); writing or modifying a spreadsheet needs a different library. A user might also have to go through various sheets and retrieve data based on some criteria.
The xlrd module is used to extract data from a spreadsheet.
# Reading an excel file using Python
import xlrd
# Give the location of the file
loc = ("path of file")
# To open Workbook
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
# For row 0 and column 0
sheet.cell_value(0, 0)
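Put open_workbook and the pyodbc calls under a try statement, and use pyodbc.Error as exe in the except clause to catch a database error if there is any. Note that pyodbc.Error does not cover errors raised by xlrd itself.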
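The validation step could be sketched as below, assuming a DB-API cursor such as the one returned by a pyodbc connection's cursor() method. The function name find_missing_values and the table and column arguments are hypothetical; the ? placeholder is the parameter style pyodbc uses. The values would come from the Excel column, for example df['some_column'].tolist() after pd.read_excel, where some_column is a placeholder name.

```python
def find_missing_values(cursor, table, column, values):
    """Return the values that have no matching row in `table`.`column`.

    `table` and `column` are inserted into the SQL text, so they must come
    from trusted code, never from user input; each value is a bound parameter.
    """
    query = f"SELECT 1 FROM {table} WHERE {column} = ?"
    missing = []
    for value in values:
        cursor.execute(query, (value,))
        if cursor.fetchone() is None:  # no row found for this value
            missing.append(value)
    return missing
```

One query per value is the simplest form; for a large sheet it may be faster to fetch the column once and compare in Python, but that is a design choice outside this sketch.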
I am looking to merge about 15 different Excel files to create one dataset. I know the variables are coded the same way in each file. The problem is that the row where the data starts is different in each xls file. Is there a way to use proc import and identify the specific rows to import for each file?
Thanks!
Assuming you are using DBMS=EXCEL, you have the RANGE option available to you:
proc import file="myfile.xlsx" out=mydataset dbms=excel replace;
range="'Sheet1$A1:Z1000'";
run;
Obviously change Sheet1, A1, and Z1000 to match what you need.
This manual page contains further information on other DBMS options, including for DBMS=XLS.
I have little to no experience in SAS. What I would like to do is read 2 Excel spreadsheets into 2 separate temporary datasets.
The file names are C:\signature_recruit.xls and C:\acceptance_recruit.xls.
How do I accomplish this?
For simplicity, you will want your excel files to look like a SAS data set. That means that you should only have rows and columns of data. If desired, the first row can be the names of the columns(variables).
Now you can either write the proc import code yourself to read the Excel file, or you can use the Import wizard to click through the process. The wizard has a helpful feature: after you click through the dialog, it can save a program containing the proc import code it generated to read the Excel file. You can then save and reuse this code if needed.
To start the import wizard, go to File->Import Data. The default option is to import an Excel file. Browse to the spreadsheet and answer the questions. Repeat for both spreadsheets.
With luck, this should be all you need to do to get the file into SAS. Here is a link to some more info and examples.
An alternative to cmjohns' PROC IMPORT approach above is to use DDE. It's an older technology and is more difficult to use, but it does provide greater flexibility for complex scenarios.
Plenty has been written on doing this. For example:
http://www.lexjansen.com/wuss/2010/DataPresentation/3015_4_DPR-Smith.pdf
Cheers
Rob