I'm trying to display data from Python in Excel. Ideally, a pandas dataframe's worth of data would appear in a new, unsaved excel instance. My search has turned up ways to create excel files, and lots of ways to 'open' an excel file to read data from it, but no way to display it. My current approach was to create a file and then figure out how to open it, but I consider that approach second-best.
I found this. Another way would be to export your data to CSV and import it in Excel.
I found you can 'run' files with the OS library. As long as your computer knows what to do with it, you can create the xlsx file with whatever method and then run it to display:
import xlsxwriter
import os
w = xlsxwriter.Workbook(r"C:\Temp\test.xlsx")
s=w.add_worksheet("Sheet1")
s.write(1,3,"7")
w.close()
os.startfile(r"C:\Temp\test.xlsx")
Still not sure if you can work with an unsaved open instance of excel
Related
I have been searching the web for hours trying to find something that might help me convert a file that was saved in the ppt file type to the pptx file type using python. I found "python-pptx" and was planning on using it to save the files, however this was not possible due to the continuous error:
Package not found at 'FileName.ppt'
I discovered another post (Convert ppt file to pptx in Python) which did not help me at all. I assume it is because my python version might be too high. (3.9) After reading up on getting the win32com.client to work and installing multiple pip and pip3 commands, it is still not working. If anyone could assist me with this manner I would be very thankful. My Current Code:
from pptx import *
prs = Presentation("FileName.ppt")
prs.save("FileName.pptx")
You can use Aspose.Slides for .NET and Python.NET package for converting PPT to PPTX as shown below:
import clr
clr.AddReference('Aspose.Slides')
from Aspose.Slides import Presentation
from Aspose.Slides.Export import SaveFormat
# Instantiate a Presentation object that represents a PPT file
presentation = Presentation("presentation.ppt")
# Save the presentation as PPTX
presentation.Save("presentation.pptx", SaveFormat.Pptx)
Our web applications use our libraries and you can see conversion results here.
I work at Aspose.
I doubt python-pptx can parse a .ppt file. (It's a completely different file format.) You're better off automating PowerPoint itself - somehow - to read one and write the other.
The "somehow" depends on the platform you're running on - and the automation capabilities available to you.
I have trouble trying to keep the sharing properties of an excel. I tried this :Python and openpyxl is saving my shared workbook as unshared but the part of vout just cancels all the modification I made with the script
To explain the problem :
There's an excel file that is shared in which people can do some modification
Python reads and writes on it
When I save the workbook in the excel file, it automatically either drops the sharing property or when I try to keep it, it just doesn't do any modification
Can someone help me please ?
I'll get a little more precise, as requested.
The sharing mode is the one Microsoft provides. You can see the button below:
Share button Excel
The excel is stored on a server. Several users can write on it at the same time but when I launch my script, it stops automatically the sharing property, so everyone that is writing on it just can't do modification anymore and every modif they did is lost.
First I treated my Excel normally :
DLT=openpyxl.load_workbook(myPath)
ws=DLT['DLT']
...my modifications on ws...
DLT.save()
DLT.close()
But then I tried this (Python and openpyxl is saving my shared workbook as unshared)
DLT=openpyxl.load_workbook(myPath)
ws=DLT['DLT']
zin = zipfile.ZipFile(myPath, 'r')
buffers = []
for item in zin.infolist():
buffers.append((item, zin.read(item.filename)))
zin.close()
...my modif on ws...
DLT.save()
zout = zipfile.ZipFile(myPath, 'w')
for item, buffer in buffers:
zout.writestr(item, buffer)
zout.close()
DLT.close()
The second one just doesn't save my modification on ws.
The thing I would like to do, is not to get rid of the sharing property. I would need to keep it while I write on it. Not sure if it is possible. I have one alternative solution that is to use another file, and just copy/paste by hand the new data from this file to the DLT one.
well... after playing with it back and forth, for some weird reason zipfile.infolist() does contains the sheet data as well, so here's my way to fine tune it, using the shared_pyxl_save example the previous gentleman provided
basically instead of letting the old file overriding the sheet's data, use the old one
def shared_pyxl_save(file_path, workbook):
"""
`file_path`: path to the shared file you want to save
`workbook`: the object returned by openpyxl.load_workbook()
"""
zin = zipfile.ZipFile(file_path, 'r')
buffers = []
for item in zin.infolist():
if "sheet1.xml" not in item.filename:
buffers.append((item, zin.read(item.filename)))
zin.close()
workbook.save(file_path)
""" loop through again to find the sheet1.xmls and put it into buffer, else will show up error"""
zin2 = zipfile.ZipFile(file_path, 'r')
for item in zin2.infolist():
if "sheet1.xml" in item.filename:
buffers.append((item, zin2.read(item.filename)))
zin2.close()
#finally saves the file
zout = zipfile.ZipFile(file_path, 'w')
for item, buffer in buffers:
zout.writestr(item, buffer)
zout.close()
workbook.close()
I use Windows 64bit with 8GB RAM and Matlab 64bit.
I tried to load a .xlsx file into matlab. The file size is around 700MB, containing a sheet with 673928 rows and 43 columns.
First I use the GUI tool 'uiimport'. After choosing the file path and name, the GUI tool needs around 3 minutes to read the .xlsx file, and then shows the data in a table. If I choose "cell array", it needs around 10 minutes to import the data into workspace.
>>whos
Name Size Bytes Class Attributes
NBPPdataV3YOS1 673928x43 3473588728 cell
It works very well, but I have many .xlsx files to import. It is impossible to import each file using GUI tool. So I use the GUI tool to generate function like this
function data = importfile(workbookFile, sheetName, range)
%% Import the data
[~, ~, data] = xlsread(workbookFile, sheetName, range);
data(cellfun(#(x) ~isempty(x) && isnumeric(x) && isnan(x),data)) = {''};
For simply, I ignore some irrelevant code. However, when I use this function to import the data, It does not work well. The used RAM by Matlab and Excel increases dramatically until almost all RAM is used. The data cannot be imported even after 30 minutes.
I also try to do it like this,
filename='E:\data.xlsx';
excelObj = actxserver('Excel.Application');
fileObj = excelObj.Workbooks.Open(filename);
sheetObj = fileObj.Worksheets.get('Item', 'sheet2');
%Read in ranges the same way as xlsread!
indata = sheetObj.Range('A1:AQ673928').Value;
The same problem occurs as xlsread().
My questions are:
1. Does the GUI import tool use xlsread() to read .xlsx file? If yes, why the generated function does not work? If no, which interface it uses?
2. Is there an efficient way to load Excel file into Matlab?
Thanks!
It sounds like you may be keeping the excel file in memory in Matlab. I would suggest looking into making sure you close the connection to each excel file after you have imported its data.
You may also find that the Matlab table class is more memory efficient than the cell class.
Good luck.
I am totally new to Stata and am wondering how to import .xlsx data in Stata. Let's say the data is in the subdirectory Data and has name "a b c.xlsx". So, from working directory, the data is in /Data
I am trying to do
import excel using "\Data\a b c.xlsx", sheet("a")
but it's not working
it's not working
is anything but a useful error report. For future questions, please report the exact error given by Stata.
Let's say the file is in the directory /home/roberto then
clear
set more off
import excel using "/home/roberto/a b c.xlsx"
list
should work.
If you are already in /home/roberto (which you can verify using display c(pwd)), then
import excel using "a b c.xlsx"
should work.
Using backslashes to refer to directories is not encouraged. See Stata tip 65: Beware the backstabbing backslash, by Nick Cox.
See also help cd.
I have an excel file with ~10000 rows and ~250 columns, currently I am using RODBC to do the importing:
channel <- odbcConnectExcel(xls.file="s:/demo.xls")
demo <- sqlFetch(channel,"Sheet_1")
odbcClose(channel)
But this way is a bit slow (I need a minute or two to import them), and the excel is originally encrypted, I need to remove the password to work on it, which is something that I prefer not to, I wonder if there is any better way (i.e. import faster, and capable of importing encrypted excel files)
Thanks.
I recommend to try using the XLConnect package instead of RODBC.
http://cran.r-project.org/web/packages/XLConnect/index.html