Convert .xls to .pdf using LibreOffice via Command Line - excel

I'm trying to convert a .xls file to .pdf using LibreOffice via the command line on Ubuntu. The .xls file contains a report with formatting such as cell background colors.
The problem is that the converted .pdf loses the original formatting. Each page is split roughly in half, so the content of one page ends up on two different pages.
Does anybody know how to convert the .xls file to .pdf via the command line while keeping the original formatting?
Or is there some trick to set the .pdf page size so that pages are not broken? (Also via the command line.)
The code I used to make the conversion was:
soffice --headless --convert-to pdf:"impress_pdf_Export" filename.xls

If you use LibreOffice to convert Microsoft Excel (XLS) files to PDF documents, this is a two-step process (even if your command does look like it is a one-step process):
Import the XLS into LibreOffice (even if started with --headless).
Export the PDF from LibreOffice.
If the result does not look like you expect (not similar enough to Excel's native PDF export), then start with debugging the first step from above:
Open the XLS file with LibreOffice in a GUI. Does it look like you expect it to look? Or are some formatting options looking weird?
Export the PDF from there (with the GUI). Are the page dimensions as you expect? Did you set them up the way you prefer? Are the margins what you want? And so on.
If you are working on Windows, you may also want to consider OfficeToPDF.exe. It is hosted on CodePlex, licensed with the Apache 2.0 License and available in binary and in source code.
It requires a working Office 2013, Office 2010 or Office 2007 installation. With that in place, it can convert various MS Office file formats to PDF from the command line or in batches, including XLS(X), PPT(X), DOC(X), VSD(X) and PUB, as well as the LibreOffice/OpenOffice formats ODT, ODS and ODC.
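Back on the LibreOffice side, one detail in the question's command is worth checking: `impress_pdf_Export` is the PDF export filter of Impress (presentations), while spreadsheets are exported by Calc's `calc_pdf_Export`. The following is a minimal sketch of a wrapper around the conversion, assuming `soffice` is on the PATH; the function names and the `--outdir` target are illustrative choices, not part of the original post.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(xls_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless spreadsheet-to-PDF conversion."""
    return [
        soffice,
        "--headless",
        # calc_pdf_Export is the Calc (spreadsheet) PDF filter;
        # impress_pdf_Export belongs to Impress (presentations).
        "--convert-to", "pdf:calc_pdf_Export",
        "--outdir", str(out_dir),
        str(xls_path),
    ]


def convert(xls_path, out_dir):
    """Run the conversion and return the path where the PDF is expected."""
    if shutil.which("soffice") is None:
        raise FileNotFoundError("soffice not found on PATH")
    subprocess.run(build_convert_command(xls_path, out_dir), check=True)
    return Path(out_dir) / (Path(xls_path).stem + ".pdf")
```

This only changes which export filter is used. Page size, margins and scaling still come from the page setup stored in the document, so the GUI checks described above remain the place to fix split pages.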

Although this is a little bit off from the initial question (you don't really need LibreOffice if you have the Office suite and are on a Windows machine),
I do appreciate the follow-up provided by Kurt. It prompted me to post the following Gist, which offers some clear instructions on how to use the .exe in a for loop.
https://gist.github.com/einsty/2189cae4175f619cff0f

Try copying the appropriate font file (for me it's a simsun.ttc file) to your LibreOffice installation directory, for example '/opt/libreoffice4.2/share/fonts/truetype'. But if a single Excel sheet is too wide for a print page (something like A4), the layout will still break.

Related

Unoconv - Maintain scaling converting xlsx to pdf

I'm using unoconv to convert an Excel file to PDF. The converted PDF retains the correct scaling when converting from an .xls file, however the PDF scaling reverts to 100% when converting from an .xlsx file. In other words, unoconv converts the same file, albeit with different extensions and Excel format, differently.
The operating system is Ubuntu, and I'm running unoconv from the command line. I've scoured the web for a solution and have found none. I believe it's a bug in unoconv, i.e. unoconv does not seem to maintain the scaling when converting from .xlsx as it does when converting from .xls.
Has anyone else encountered this and, if so, is there a workaround?
You could convert the .xlsx files to PDF directly from Excel:
File > Save As > Browse, then choose PDF from the file type drop-down menu,
or
File > Save as Adobe PDF, then follow the dialog.
This applies to Office 2013.

Batch convert xls-Files to csv

I need to convert over 100 Excel files to CSV. Worse these files consist of multiple sheets and I only need one of them.
At first I stumbled upon the Perl program xls2csv. Luckily, at the bottom of the XLS file conversion page, I even found a convenient script that converts all sheets into separate CSV files. Unfortunately, this converter is broken and skips lines.
I also tried pyodconverter but that only converts the first sheet.
Any suggestions? It would be ok if that conversion had to be done on Windows though I would really prefer Linux. And if it has to be Windows it would be nice if it wouldn't need an Excel installation.
There's a very useful Java library called Apache POI at http://poi.apache.org/
The following link provides an example application that converts xls to csv.
http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
If you know Java, you can adjust it to your needs. Since it's Java, it also runs on Linux.
You could also have a look at StatTransfer (Windows only, I'm afraid).
I know this is late but there is actually an HTA (HTML Application) which can do this. The details and download link can be found here.
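None of the answers shows the "one named sheet out of many workbooks" case directly. Below is a rough sketch of that batch job in Python rather than Java, swapping in the third-party `xlrd` package (assumed installed) in place of Apache POI. The function names, the sheet name argument and the directory layout are hypothetical.

```python
import csv
from pathlib import Path


def rows_to_csv(rows, csv_path):
    """Write an iterable of rows to a CSV file using the stdlib csv module."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def sheet_to_csv(xls_path, sheet_name, out_dir):
    """Export one named sheet of a legacy .xls workbook to <stem>.csv."""
    import xlrd  # third-party, assumed installed (pip install xlrd)

    book = xlrd.open_workbook(str(xls_path))
    sheet = book.sheet_by_name(sheet_name)
    rows = (sheet.row_values(i) for i in range(sheet.nrows))
    csv_path = Path(out_dir) / (Path(xls_path).stem + ".csv")
    rows_to_csv(rows, csv_path)
    return csv_path


def convert_all(in_dir, sheet_name, out_dir):
    """Convert the same sheet from every .xls file found in in_dir."""
    return [
        sheet_to_csv(path, sheet_name, out_dir)
        for path in sorted(Path(in_dir).glob("*.xls"))
    ]
```

A call such as `convert_all("reports", "Summary", "csv_out")` would process every workbook in a `reports` directory. Note that `xlrd` returns numeric cells as floats and dates as serial numbers, so some post-processing may be needed depending on the data.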

Is this possible in Excel: Open XLS via commandline, OnLoad import CSV data, Print as PDF, Close Doc?

I think the following is the fastest way to solve a problem I have:
Generate a custom CSV file on the fly (this is already done via Perl).
Have an XLS document opened from the command line via a scripting language (the client already has a few Perl scripts running in this pipeline).
Write VBA or record a macro that executes the following OnLoad:
Imports the data from the CSV file into the report template,
Print the file via PDF driver to fixed location using data in the CSV to name the file.
Closes the XLS file.
So, is this possible via Excel macros and, if not, is it possible via VBA? Thanks!
NOTE: It appears I've got to have a copy of MS Office anyway, so this is much faster to get going than using Visual Studio Tools for Office (VSTO). The report template is going to be on a server, and this way the end user can build as many reports as they like, "test" by printing a PDF using a demo CSV file, and import/embed the macro or VBA when they're done. I looked into JasperReports, but the end user is putting ad-hoc static text and groupings all over the report, and I figure this way they can build reports however they want and then automate them. Both of these questions by me, and the resulting comments/feedback, are related to this question:
In Excel, is it possible to automate reading of CSV data into a template and printing it to PDF from the commandline?
Is it possible to deploy a VB application made in Excel as a stand alone app?
FOCUS OF QUESTION: Again, the focus of the question is whether this is possible via Excel macros or, if not macros, via VBA, and whether there's any huge issue with this approach. For example, I know this is going to be "slow" since Excel would be loaded per job, but there's 16 GB of RAM on the server and it's not used at all. I figure that since I've got to have a copy of Office on the server anyway, this is a much faster approach.
If you've got any questions, let me know via comments.
I suppose you could launch the report file from Perl and then have a macro inside the report file automatically look for the newest CSV file to import. Then you could process and output. So you just need to launch the proper Excel file with the embedded macros from Perl and then let Excel and VBA take over.
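The "look for the newest CSV file" step is simple enough to sketch. The version below shows the selection logic in Python purely as an illustration; in the setup described above it would live in the VBA macro (or in the Perl launcher), and the function name and directory argument are hypothetical.

```python
from pathlib import Path


def newest_csv(directory):
    """Return the most recently modified .csv file in directory, or None."""
    candidates = list(Path(directory).glob("*.csv"))
    if not candidates:
        return None
    # Modification time decides which file counts as "newest".
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

Picking by modification time is an assumption; if two jobs can write CSV files at nearly the same moment, passing the file name explicitly to the macro would be more robust.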

an HTML file is NOT an Excel file, right?

We use an application that has an "export to Excel" feature that doesn't work on PCs that don't have Outlook Express installed.
I know, you're thinking "WTF does Outlook Express have to do with Excel files?"
I asked the same thing, and here's what I found:
The file being generated is actually one of those Microsoft Single File Web Pages (.mht) and NOT an Excel file.
You need to have Outlook Express installed to actually view a .mht file.
I've explained to their support people that just because you can slap a .xls extension on a file and Excel will open it does not mean it's an Excel file, and does not mean that this is the right way to do it.
How would you explain that this is not proper?
Many people (especially managers) confuse Excel files with reporting files. In my opinion, a file is only qualified as an Excel file if it meets all of these conditions:
Is a spreadsheet formatted in one of the many Microsoft Excel formats.
Can be opened in the most recent version of Microsoft Excel.
Is editable in Microsoft Excel.
In your case, I'm guessing only condition #3 is met, so it's no Excel file. But your support people may still call it a reporting file.
If a clean Windows image with only Excel installed can't open it, then it isn't in Excel format. Period.
If a Windows machine with Outlook Express, but without Excel can open it (if you change the extension) then it can't be an Excel file. I'd combine that with Ignacio's suggestion for a slam-dunk.
Plus, surely if it's MHT, then you can't actually do spreadsheet operations on it? Or am I misunderstanding how it works?
I don't think your statements are correct. Excel (2007) has import and export filters for single-file HTML documents (.mht) even if there is no Outlook Express installed. However, this is not a native format and worksheet features such as formulas cannot be retained (see http://office.microsoft.com/en-us/excel/HP100141051033.aspx#7)
So what you should make clear to your customers is that there is a difference between an application's native file format and a format that isn't designed to contain spreadsheet functionality and is only supported via an import/export filter.
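One way to make the "extension is not the format" point concrete is to look at the first bytes of the file. The sketch below is a rough heuristic, not a validator: binary .xls files are OLE2 compound documents and .xlsx files are ZIP archives, while an MHT export is plain MIME text. The function name and the returned labels are made up for this example, and the MHT check assumes the file starts with a `MIME-Version:` header, which is common but not guaranteed.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound file (.xls, also .doc/.ppt)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP container (.xlsx and other archives)


def sniff_spreadsheet(data):
    """Guess the container format from the leading bytes of a file."""
    head = data[:512]
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    text = head.lstrip().lower()
    if text.startswith(b"mime-version:"):
        return "mhtml"
    if text.startswith((b"<html", b"<!doctype html")):
        return "html"
    return "unknown"
```

For example, `sniff_spreadsheet(open("export.xls", "rb").read(512))` returning `"mhtml"` would show that the exported file is a web archive with an .xls extension rather than a native workbook.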

SWT OleClientSite: How to load XML file in Excel?

I have an Excel file in OfficeML format, MyData.xls. Since I upgraded to Office 2007 from Office 2003 I get a warning message saying that the file content does not match the file extension. It seems that OfficeML now must have the extension 'xml'.
In my application I use OleClientSite to display the file in an OleFrame object. If I change the file extension to 'xml' then Excel is not started. If I leave the extension as 'xls' then I get the above warning message.
How can I force the file with the 'xml' extension to be opened in the OleFrame using Excel?
The easiest solution is to switch back to the 2003 format, which should not require any changes to your application. To do this, open your file with the extension set to *.xls. When prompted with the warning ("... do you want to open the file now?"), proceed to open (this is a warning to make sure you don't unintentionally open a macro-enabled file). Once in Excel and the file is open, simply save it as *.xls. This can be done by going to "Office Button / Save As / Excel 97-2003 Workbook".
Now, the harder solution will be upgrading your application to deal with the new OfficeML format. I don't know about the component you're using, but it will likely still work for some of the binary parts in the new standard (most notably VBA projects), but you're going to have to unpack and start reading XML files.
If you haven't already done this, create a new Excel workbook, save it as *.xlsx (the 2007 format) and, in Explorer, change its extension to *.zip. Open it up and take a look around. For more depth on the files, I would start by digesting this MSDN article.
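The same "look around" can be done without renaming the file, since an .xlsx workbook is a ZIP package. Here is a small sketch using Python's standard `zipfile` module; the function name is illustrative.

```python
import zipfile


def list_package_parts(xlsx_path):
    """Return the sorted names of all parts inside an .xlsx (ZIP) package."""
    with zipfile.ZipFile(xlsx_path) as package:
        return sorted(package.namelist())
```

For a workbook saved by Excel, the listing typically includes entries such as `[Content_Types].xml` and `xl/workbook.xml`, which are the XML files an application would need to read.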
Maybe I'm missing something, but shouldn't you just use .xlsx as your extension? I'm assuming that by OfficeML, you're referring to Office Open XML.
The <?mso-application progid="Excel.Sheet"?> should be present in the XSL template used.
The below link explains clearly how to include the processing instruction. I had to do something similar and it worked for me.
http://www.shareyourwork.org/roller/ralphsjavablog/entry/generating_excel_sheets_with_xslt

Resources