CSV to Excel Doc? - excel

I was curious what the community thinks is the easiest way to take a CSV file and 'save as' a Excel document with only a couple formulas pasted in?
I am trying to do this behind the scenes, and not physically navigating. e.g. opening, selecting save As, etc -- even though this is already VERY simple I **need to do this in code (Think automation)
Background: I have a c++ command line program generating the .csv, and a C# GUI starting this process. Either programs could hold the code, but I figure this is easiest in C# (InterOp?) The reasons I don't directly send code into the csv is because of the amount of comma characters that will mess up the csv and because other Excel documents need to reference the sheets so they need to be in .xls format.
=AVERAGE(C2:C999)
=COUNTIFS(C:C,">0",C:C,"<31")
=COUNTIFS(C:C,">31",C:C,"<55")
=COUNTIF(C:C,">55")

Have a look and see whether command-line scipting of openoffice will do the job. It can do quite a lot of conversions very easily. Otherwise there are a lot of Excel-producing libraries, for example PHPExcel, but you'd need to wrap some programming around them.

Related

How do I extract text and formatting from a ppt file in python?

I'm trying to write a context-sensitive text parser in python. To do that, I need to be able to open ppt files and extract both the text and information about how it was formatted. I need to be able to tell if a sentence was in a header or if it's bolded, for instance.
It's supposed to run on large batches of files, so manually converting all the ppts to pptxs isn't practical.
I tried tika, but it doesn't give formatting information.
I tried python-pptx, but it doesn't seem like it can open ppts.
And I'm hoping to make the parser OS agnostic, so the command-line converters I've seen proposed on other variants of this question won't work for me, unless they'll somehow work on linux, mac, and windows.

Update linked excel path in PowerPoint via Python

I want to automate creating of a powerpoint ppt via linking template charts to some Excel files. Updating the excel file values changes the powerpoint slides automatically. I have created my powerpoint template and linked charts to sample excel files data.
I want to send the folder with the powerpoint and excel files to someone else. But this will break the link to excel files due to change in the path. (As path is not relative). I can edit the paths manually by going under the "edit links to files" option under File Menu but this is tedious as charts are numerous with multiple files.
I want to update the same via Python code using the Python-Pptx package.
Please help!
There's no API support for this in the current version of python-pptx.
You would need to modify the underlying XML directly, perhaps using python-pptx internals as a starting point and using lxml calls on the appropriate element objects. If you search on "python-pptx workaround function" you will find some examples.
Another thing to consider is modifying the XML by cruder but still possibly effective means by accessing the XML files in the .pptx package directly (the .pptx file is a Zip archive of largely XML files) and using regular expressions or perhaps a command line tool like sed or awk to do simple text substitution.
Either way you're going to need to want it pretty badly, depending on your Python skill level. You'll also of course need to discover just which strings in which parts of the XML are the ones that need changing. opc-diag can be helpful for that, but it's a bit of detective work even with the best tools.

File that normally opens our application, but will fall back to Excel

Our application exports snippets of databases in XLSX format. We wrote our own code on top of System.Packaging as it is many (many!) times faster than using the Excel objects.
Right now we save these files with a .xlsx format, and that works OK. However, it would be much nicer if double-clicking one of these opened our app instead, but failed back to Excel on machines without it.
I know that SpreadsheetML has a feature to do this. If you insert this near the top of the file:
<?mso-application progid=""Excel.Sheet""?>
some sort of magic occurs that causes Excel to open on Win machines. While this might work in SML files, it does not appear to work in "real" xlsx files - I tried adding this line to various parts of the workbook structure but it remained unrecognized.
So is there a similar mechanism we can use in "true" XLSX files generated by System.Packaging? Or some other Windows mechanism we should use in these situations?

Is this possible in Excel: Open XLS via commandline, OnLoad import CSV data, Print as PDF, Close Doc?

Thinking that to solve a problem I've got this is the fastest solution:
Generate a custom CSV file on the file (this is already done via Perl).
Have a XLS document opened via commandline via a scripting language (clients already got a few Perl scripts running in this pipeline.)
Write VBA or record a macro that executes the following OnLoad:
Imports a the data from the CSV file into the report template,
Print the file via PDF driver to fixed location using data in the CSV to name the file.
Closes the XLS file.
So, is this possible via Excel macros, if not is it possible via VBA -- thanks!
NOTE: Appears I've got to have a copy of MS Office anyway, so this is much faster to get going than using Visual Studio Tools for Office (VSTO). The report template is going to be on a server, and this way the end user can build as many reports as they like, "test" by printing a PDF using a demo CSV file, and import/embed the marco or VBA when they're done. I'd looked in Jasper Reports, but the end user is putting ad-hoc static text and groupings all over the report and I figure this way they can build reports how ever they want and then automate them. Both of these questions by me and the resulting comments/feedback are related to this question:
In Excel, is it possible to automate reading of CSV data into a template and printing it to PDF from the commandline?
Is it possible to deploy a VB application made in Excel as a stand alone app?
FOCUS OF QUESTION: Again, focus of the question is if this is possible via Excel marcos, if not macros VBA, and if there's any huge issue with this approach; for example, I know this is going to be "slow" since Excel would be loaded per job, but there's 16GB of ram on the server and it's not used at all. Figure since I've got to have a copy of office on the server anyway, this is a much faster approach.
If you've got any questions, let me know via comments.
I suppose you could launch the report file from perl and then have a macro inside the report file automatically look for the newest csv file to import. Then you could process and output. So you just need to launch the proper excel file with the embedded macros from perl and then let excel and VBA take over.

Using Office to programmatically convert documents?

I'm interested in using Office 2007 to convert between the pre-2007 binary formats (.doc, .xls, .ppt) and the new Office Open XML formats (.docx, .xlsx, .pptx)
How would I do this? I'd like to write a simple command line app that takes in two filenames (input and output) and perhaps the source and/or destination types, and performs the conversion.
Microsoft has a page which gives several examples of writing scripts to "drive" MS Word. One such example shows how to convert from a Word document to HTML. By changing the last parameter to any values listed here, you can get the output in different formats.
The easiest way would be to use Automation thru the Microsoft.Office.Interop. libraries. You can create an instance of a Word application, for example. There are methods attached to the Application object that will allow you to open and close documents, plus pretty much anything else you can accomplish in VBA by recording a macro.
You could also just write the VBA code in your Office application to do roughly the same thing. Both approaches are equally valid, depending on your comfort in programming in C#, VB.NET or VBA.

Resources