Conversion of Excel into .pdf while using 'IncludeDocProperties' field

I use the following code in MATLAB to convert an Excel worksheet into PDF, but I am getting the info (name) of the Excel file at the top of the PDF.
How can I get rid of it?
I know there is an optional field IncludeDocProperties, but I am not sure how to use it.
hExcel = actxserver('Excel.Application');
hWorkbook = hExcel.Workbooks.Open(sprintf('%s',excel_filepath));
hWorksheet = hWorkbook.Sheets.Item(1);
hWorksheet.ExportAsFixedFormat('xlTypePDF',pdf_filepath);
hWorkbook.Close;
hExcel.Quit;
hExcel.delete;

As far as I know, converting to PDF with the ExportAsFixedFormat method gives you a PDF that looks exactly like what you get when you print the Excel file.
So the info (name) that you mentioned is not created by MATLAB, but by Excel. Open the file in Excel and check whether a header/footer is set in the page setup. If the file has a header/footer turned on, the PDF exported through MATLAB will contain that info as well (the file name, page number, author name, etc.). As far as I know, IncludeDocProperties only controls whether the document properties are embedded in the PDF, not what is printed on the page.

Related

How to export cells with links, when saving to a csv?

I am trying to export an Excel file (.xlsx) to a csv, with LibreOffice. Some columns have hyperlinks, which I can open when the sheet is open in LibreOffice. The cell does not show the link, but a short summary text: the link is somehow a property of the cell (or the text, not sure).
I would like the CSV to contain the links for the affected columns (I don't care about the short summary text), but by doing a "Save As csv" I am losing the links. What can I do?
EDIT
I have investigated a bit: the hyperlink can be created manually in LibreOffice in a cell with Ctrl-K or from the menu Insert -> Hyperlink. When I try to export the CSV, I am offered two relevant options:
save cell content as shown
save cell formulas instead of calculated values
I have played around with them, but those are not helping at all.
Is there any way of exporting the hyperlinks instead of the text?
From what I can tell, the CSV export filter always saves the link text, not the link URL. This behavior occurs when saving from LibreOffice format as well, so your question does not need to involve Excel or the .xlsx format.
What I would probably do is write a macro to create a CSV file with the URLs. If you want to try that, then have a look at https://wiki.openoffice.org/wiki/Documentation/DevGuide/Accessibility/XAccessibleHyperlink.
Depending on what you are trying to do and how much time you are willing to invest, you can create your own filter.
Another option that requires programming would be to use the HTML export filter, which saves both the link text and link URL, and then write some code to parse out the URL.
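The HTML route described above could look roughly like the Python sketch below. It assumes the sheet was first saved with the HTML export filter and that each linked cell holds an ordinary <a href="..."> element; the names LinkCellParser and html_table_to_csv are made up for this illustration, and real exported HTML may need extra handling (nested tables, several links per cell).

```python
import csv
import io
from html.parser import HTMLParser


class LinkCellParser(HTMLParser):
    """Collect table rows, replacing each linked cell's text with its URL."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read
        self._href = None   # first link URL found in the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell, self._href = [], None
        elif tag == "a" and self._cell is not None and self._href is None:
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = "".join(self._cell).strip()
            # Prefer the URL; fall back to the visible text for plain cells.
            self._row.append(self._href or text)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html_text):
    """Return CSV text for the table rows found in html_text."""
    parser = LinkCellParser()
    parser.feed(html_text)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


sample = (
    "<table><tr><td>Name</td><td>Site</td></tr>"
    '<tr><td>Docs</td><td><a href="https://example.org/docs">read more</a></td></tr>'
    "</table>"
)
print(html_table_to_csv(sample))
```

Cells without a link keep their text, so the rest of the sheet comes through unchanged.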

Opening CSV file

I am generating CSV files. The first row contains the column names and looks like this:
User ID;First Name;Last Name;Email;...
But if I change User ID to ID, MS Office cannot open the CSV and shows this error:
Cannot read record(number of record)
The file opens correctly in Notepad++, though. I am using Excel 2013. Any ideas what is wrong?
You can solve the problem by inserting the following line at the very beginning (as the first line) of your .csv file:
sep=;
This line is not shown when the file is opened in Excel. It explicitly tells Excel that the delimiter is ;, so the values are split into separate cells. You will also be able to use ID as the title of a column. Unfortunately, I cannot say why Excel does not like this title at the beginning of the file.
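If the CSV is produced by a program, the hint line can be written before the data. The Python sketch below is only an illustration: write_excel_csv is a made-up helper and the sample rows are invented. A likely reason for the original error is that Excel takes a text file whose first two characters are ID for a SYLK file, so putting the sep= line first also moves ID away from the start of the file.

```python
import csv
import io


def write_excel_csv(rows, delimiter=";"):
    """Return CSV text whose first line is the 'sep=' hint for Excel."""
    out = io.StringIO()
    out.write(f"sep={delimiter}\r\n")  # hint line, read by Excel only
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\r\n")
    writer.writerows(rows)
    return out.getvalue()


text = write_excel_csv([
    ["ID", "First Name", "Last Name", "Email"],
    ["1", "Ada", "Lovelace", "ada@example.com"],
])
print(text)
```

Keep in mind that other CSV readers do not know about this hint and will see sep=; as an ordinary first record.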

Excel and Tab Delimited Files Question

I am encountering what I believe to be a strange issue with Excel (in this case Excel 2007, and maybe also Excel 2003, but I don't have access to it as I write this).
I can reliably convert some server data into a tab-delimited format (I have been doing this for years) and then open it using Excel with no issue.
However, if I have an HTML <table> inside one of the fields, Excel 2007 seems to think it should convert the table into rows and columns inside Excel (not what I want). As you might imagine, this throws off the entire spreadsheet.
So the question is: is there any way to set up Excel NOT to do this (perhaps some setting that pertains to reading tab-delimited files), or am I missing something?
Thanks.
Save your file as .txt.
Now open the file in Excel using drag and drop (rather than double-clicking your hokey .xls).
It is slightly more work to open the file, but your tab-delimited text formatting will now be respected.
When you open the tab-delimited file, you are shown an import mapping dialog that lets you pick each column's data type (date, text, currency, etc.). For the columns that contain HTML, choose text. This tells Excel to import the data as-is and not try to parse it automatically into a derived format.
Excel 2003 does the same. I don't think there is a way to do it with a setting, because Excel finds delimiters in the HTML table and breaks the HTML into cells and columns, as it does for the other columns.
If the column containing HTML is always the same, you can use JYelton's suggestion of renaming the file as csv and record a small VBA macro that loads the file and automatically selects the HTML column as text in the import mapping dialog. You then load the file by calling the macro instead of double-clicking on it.
If nothing else, import it into OpenOffice.org Calc, save as an .xls file, then open in Excel.

Creating a new pdf document using AcroEXch in VBScript

I am looking to automate the conversion of an Excel sheet into a PDF document (I do not want to manually print the report generated in Excel as a PDF document every morning). For now, I would like to create a button in Excel that runs a macro to generate the PDF document automatically, but this button will eventually not be used.
I'm also new to VB, but I have read up on the AcroEXch SDK. It seems like I should be using AcroEXch.PDDoc.Create, but this is not quite right (because I cannot specify an input file to be printed/created as a new PDF document).
Any ideas on how I can create a brand new PDF file? Thanks in advance.
I think I found the answer. Here is one solution someone at work suggested (if anyone finds it useful, then great).
There is no method in the AcroEXch class (or set of methods that I know of) to convert a non-PDF file to a PDF file. Instead, you first print the file to PostScript and then convert the PostScript file to PDF using the PDFDistiller class. Here's a snippet of the code:
' 1. Activate the Excel report that is being converted to PDF
xlReport.activate
xlReport.range("a1").select
' Output paths (left empty in this snippet; set them before running)
dim PdfFilePath
PdfFilePath = ""
dim PsFilePath
PsFilePath = ""
' 2. Print the Excel sheet to a PostScript file
'    (arguments: From, To, Copies, Preview, ActivePrinter, PrintToFile, Collate, PrToFileName)
xlBook.activesheet.PrintOut , , 1, , "Adobe PDF on Ne01:", TRUE, , PsFilePath
' 3. Convert the PostScript file to PDF with Distiller
Dim oDistiller
Set oDistiller = CreateObject("PDFDistiller.PDFDistiller.1")
oDistiller.FileToPDF PsFilePath, PdfFilePath, ""
' Close Excel - do not save.
' (Comment out the 3 lines below while debugging.)
xlApp.displayalerts=false
xlApp.quit
set xlApp=nothing
I don't know exactly what your circumstances are or what tools you have access to, but from your description it sounds like you simply want an Excel file converted for you with a click.
It would have been helpful if you had posted whether you have Adobe Acrobat Professional, the latest version of Excel, or other converters that are available on the market.
If you have Acrobat Pro installed, your Office apps (Word, Excel, Outlook, etc.) should already have a "Convert to PDF" button in the toolbars. In combination with Excel command-line arguments, it shouldn't be too hard to cook up a Windows Scheduled Task that periodically converts Excel files for you.
Have you considered CutePDF or PDFCreator? Both are free. I have successfully used PDFCreator with VBA, and I have heard that CutePDF is good.

Excel CSV - Number cell format

I produce a report as a CSV file.
When I try to open the file in Excel, it makes an assumption about the data type based on the contents of the cell, and reformats it accordingly.
For example, if the CSV file contains
...,005,...
Then Excel shows it as 5.
Is there a way to override this and display 005?
I would prefer to do something to the file itself, so that the user could just double-click on the CSV file to open it.
I use Excel 2003.
There isn’t an easy way to control the formatting Excel applies when opening a .csv file. However, listed below are three approaches that might help.
My preference is the first option.
Option 1 – Change the data in the file
You could change the data in the .csv file as follows: ...,="005",...
This will be displayed in Excel as ...,005,...
Excel will have kept the data as a formula, but copying the column and using Paste Special > Values will get rid of the formula while retaining the formatting.
Option 2 – Format the data
If it is simply a format issue and all the data in that column is three digits long, open the data in Excel and format the column with the custom format 000.
Option 3 – Change the file extension to .dif (Data Interchange Format)
Change the file extension and use the file import wizard to control the formats.
Files with a .dif extension are automatically opened by Excel when double-clicked.
Step by step:
Change the file extension from .csv to .dif
Double click on the file to open it in Excel.
The 'File Import Wizard' will be launched.
Set the 'File type' to 'Delimited' and click on the 'Next' button.
Under Delimiters, tick 'Comma' and click on the 'Next' button.
Click on each column of your data that is displayed and select a 'Column data format'. The column with the value '005' should be formatted as 'Text'.
Click on the 'Finish' button. The file will be opened by Excel with the formats that you have specified.
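Option 1 above can be applied by the program that writes the report. The Python sketch below is one way to do it, under the assumption that the report rows are lists of strings; as_excel_text and write_report are hypothetical helper names. Note that the csv module doubles the quotes when it writes the field, so the raw file contains "=""005""", which is the standard CSV encoding of the same ="005" value; it is worth checking with your Excel version that it is still evaluated as a formula.

```python
import csv
import io


def as_excel_text(value):
    """Wrap a value as the formula ="value" so Excel displays it unchanged."""
    escaped = value.replace('"', '""')  # double any quotes inside the formula string
    return f'="{escaped}"'


def write_report(rows, text_columns):
    """Return CSV text, wrapping the cells in text_columns (0-based indexes)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\r\n")
    for row in rows:
        writer.writerow([
            as_excel_text(cell) if i in text_columns else cell
            for i, cell in enumerate(row)
        ])
    return out.getvalue()


report = write_report([["widget", "005", "12.50"]], text_columns={1})
print(report)
```

Only the columns that need protecting are wrapped, so numeric columns stay numeric in Excel.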
Don't use CSV, use SYLK.
http://en.wikipedia.org/wiki/SYmbolic_LinK_(SYLK)
It gives much more control over formatting, and Excel won't try to guess the type of a field by examining the contents. It looks a bit complicated, but you can get away with using a very small subset.
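To give an idea of how small that subset can be, here is a Python sketch that writes every cell as a quoted string, following the record layout shown on the Wikipedia page linked above (an ID record, one C record per cell, and a closing E record). The function name to_sylk, the CRLF line endings and the refusal to handle quotes or semicolons are assumptions of this sketch, not part of any official tool.

```python
def to_sylk(rows):
    """Render rows of strings as a minimal SYLK document, every cell as text."""
    lines = ["ID;P"]  # file header record
    for y, row in enumerate(rows, start=1):
        for x, value in enumerate(row, start=1):
            if '"' in value or ";" in value:
                # Escaping rules are left out of this sketch on purpose.
                raise ValueError("this sketch does not handle quotes or semicolons")
            # C = cell record; Y/X = row/column; K"..." = string value
            lines.append(f'C;Y{y};X{x};K"{value}"')
    lines.append("E")  # end-of-file record
    return "\r\n".join(lines) + "\r\n"


print(to_sylk([["code", "qty"], ["005", "3"]]))
```

Saving the result with a .slk extension should let Excel open it on double-click, with 005 kept as text.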
This works for Microsoft Office 2010, Excel Version 14
I misread the OP's preference "to do something to the file itself." I'm still keeping this for those who want a solution to format the import directly:
Open a blank (new) file (File -> New from workbook)
Open the Import Wizard (Data -> From Text)
Select your .csv file and Import
In the dialogue box, choose 'Delimited', and click Next.
Choose your delimiters (uncheck everything but 'comma'), choose your Text qualifiers (likely {None}), click Next
In the Data preview field select the column you want to be text. It should highlight.
In the Column data format field, select 'Text'.
Click Finish.
You can simply format your range as Text.
Also here is a nice article on the number formats and how you can program them.
Actually I discovered that, at least starting with Office 2003, you can save an Excel spreadsheet as an XML file.
Thus, I can produce an XML file and when I double-click on it, it'll be opened in Excel.
It provides the same level of control as SYLK, but XML syntax is more intuitive.
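For illustration, a file in that XML Spreadsheet format can be produced with a few lines of Python. The sketch below writes every cell with ss:Type="String", which is what keeps 005 intact; the function name to_spreadsheet_xml and the worksheet name Report are invented for the example, and a real report would probably want Number cells for its numeric columns.

```python
from xml.sax.saxutils import escape

HEADER = (
    '<?xml version="1.0"?>\n'
    '<?mso-application progid="Excel.Sheet"?>\n'
    '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"\n'
    '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">\n'
    ' <Worksheet ss:Name="Report">\n'
    '  <Table>\n'
)
FOOTER = '  </Table>\n </Worksheet>\n</Workbook>\n'


def to_spreadsheet_xml(rows):
    """Render rows of strings as XML Spreadsheet text, every cell typed as String."""
    parts = [HEADER]
    for row in rows:
        parts.append('   <Row>\n')
        for value in row:
            # escape() protects &, < and > inside the cell text
            parts.append(
                f'    <Cell><Data ss:Type="String">{escape(value)}</Data></Cell>\n'
            )
        parts.append('   </Row>\n')
    parts.append(FOOTER)
    return "".join(parts)


document = to_spreadsheet_xml([["code", "qty"], ["005", "3"]])
print(document)
```

The mso-application processing instruction is what lets Windows hand the .xml file to Excel on double-click.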
Adding a non-breaking space in the cell could help.
For instance:
"firstvalue";"secondvalue";"005 ";"othervalue"
It forces Excel to treat the value as text, and the space is not visible.
On Windows you can add a non-breaking space by typing Alt+0160.
See here for more info: http://en.wikipedia.org/wiki/Non-breaking_space
Tried on Excel 2010.
Hope this helps people who are still searching for a proper solution to this problem.
I had this issue when exporting CSV data from C# code, and resolved it by prepending a tab character (\t) to the data with leading zeros, so that Excel interpreted the data as text rather than as a number (and, unlike other prepended characters, the tab is not visible).
I did like the ="001" approach, but it wouldn't allow exported CSV data to be re-imported into my C# application without removing all this formatting from the imported CSV file (instead I'll just trim the import data).
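The same round trip can be sketched in Python (the answer above used C#; export_rows and import_rows are hypothetical names for this illustration). The export prepends a tab to the chosen columns, and the import trims surrounding whitespace so the original values come back.

```python
import csv
import io


def export_rows(rows, text_columns):
    """Return CSV text with a tab prepended to cells in text_columns (0-based)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\r\n")
    for row in rows:
        writer.writerow([
            "\t" + cell if i in text_columns else cell
            for i, cell in enumerate(row)
        ])
    return out.getvalue()


def import_rows(text):
    """Read the CSV back, trimming the tab (and any other surrounding whitespace)."""
    return [[cell.strip() for cell in row] for row in csv.reader(io.StringIO(text))]


original = [["widget", "005", "12.50"]]
exported = export_rows(original, text_columns={1})
print(import_rows(exported))
```

Trimming on import also removes ordinary leading and trailing spaces, which is fine for codes like 005 but would matter for fields where such spaces are significant.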
I believe when you import the file you can select the Column Type. Make it Text instead of Number. I don't have a copy in front of me at the moment to check though.
Load csv into oleDB and force all inferred datatypes to string
I asked the same question and then answered it with code.
Basically, when the CSV file is loaded the OLE DB driver makes assumptions about the data types, and you can tell it which assumptions to make.
My code forces all datatypes to string, though it is very easy to change the schema.
For my purposes I used an XSLT to get it the way I wanted, but I am parsing a wide variety of files.
I know this is an old question, but I have a solution that isn't listed here.
When you produce the CSV, add a space after the comma but before your value, e.g. , 005,
This worked to prevent auto date formatting in Excel 2007, anyway.
The Text Import Wizard method does NOT work when the CSV file being imported has line breaks within a cell. This method handles that scenario (at least with tab-delimited data):
Create new Excel file
Ctrl+A to select all cells
In Number Format combobox, select Text
Open tab delimited file in text editor
Select all, copy and paste into Excel
Just add ' before the number in the CSV doc.
This has been driving me crazy all day (since indeed you can't control the Excel column types before opening the CSV file), and this worked for me, using VB.NET and Excel Interop:
'Requires: Imports System.IO
'          Imports Excel = Microsoft.Office.Interop.Excel
'xlTextFormat and xlGeneralFormat are the Excel.XlColumnDataType values (2 and 1).
'Convert .csv file to .txt file.
FileName = ConvertToText(FileName)
'Each pair is {column number, column data type}.
Dim ColumnTypes(,) As Integer = New Integer(,) {{1, xlTextFormat}, _
                                                {2, xlTextFormat}, _
                                                {3, xlGeneralFormat}, _
                                                {4, xlGeneralFormat}, _
                                                {5, xlGeneralFormat}, _
                                                {6, xlGeneralFormat}}
'We are using OpenText() in order to specify the column types (the FieldInfo argument).
mxlApp.Workbooks.OpenText(FileName, , , Excel.XlTextParsingType.xlDelimited, , , True, , True, , , , ColumnTypes)
mxlWorkBook = mxlApp.ActiveWorkbook
mxlWorkSheet = CType(mxlApp.ActiveSheet, Excel.Worksheet)
Private Function ConvertToText(ByVal FileName As String) As String
    'Copy the .csv file to a .txt file.
    'If the file is a text file, we can specify the column types.
    'Otherwise, the Codes are first converted to numbers, which loses trailing zeros.
    Try
        'ChangeExtension works whatever the case of the original extension.
        Dim NewFileName As String = Path.ChangeExtension(FileName, ".txt")
        'Using blocks close and dispose the reader and writer even if an error occurs.
        Using MyReader As New StreamReader(FileName)
            Using MyWriter As New StreamWriter(NewFileName, False)
                Do While Not MyReader.EndOfStream
                    MyWriter.WriteLine(MyReader.ReadLine())
                Loop
            End Using
        End Using
        Return NewFileName
    Catch ex As Exception
        MsgBox(ex.Message)
        Return ""
    End Try
End Function
When opening a CSV, you get the text import wizard. At the last step of the wizard, you should be able to import the specific column as text, thereby retaining the '00' prefix. After that you can format the cell any way that you want.
I tried with Excel 2007 and it appeared to work.
Well, Excel never pops up the wizard for CSV files. If you rename the file to .txt, you'll see the wizard the next time you do a File > Open in Excel.
Put a single quote before the field. Excel will treat it as text, even if it looks like a number.
...,'005,...
EDIT: This is wrong. The apostrophe trick only works when entering data directly into Excel. When you use it in a CSV file, the apostrophe appears in the field, which you don't want.
http://support.microsoft.com/kb/214233

Resources