Excel file and program structure - excel

For a school project, I need to know how Excel works. Specifically, I need to know what kind of structure is behind an Excel file and how the Excel program works with this file.
I know Excel is proprietary Microsoft software and not open source, so I know I can't find much on this subject... But anything that helps me understand how Excel works is useful.
If I can't find anything about Excel, I will take a look at Open Office or the OpenDocument format, so information about those would be really useful too.
Thanks to all

You can find details of the MS Office BIFF file formats here in the microsoft.com library, while the Office Open XML format is published here on the ECMA site and here in the microsoft.com library.
You can find specifications for the OpenDocument format used by Open Office on the OASIS site

It is simpler than you may think.
An Excel (.xlsx) file is just a zip file of multiple XML documents. Each worksheet in the Excel file is stored as its own XML document.
You will find the XML sheets at xl\worksheets inside the zip folder.
You can script reading from and writing to it.
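As a minimal sketch of that idea, the following Python uses only the standard-library zipfile module to list the worksheet parts of an .xlsx file and read one as text. The function names are made up for illustration, and the sketch assumes the usual xl/worksheets/ layout described above (inside the archive the separator is a forward slash).

```python
import zipfile


def list_worksheet_parts(xlsx_file):
    """Return the sorted names of the worksheet XML parts in an .xlsx archive.

    xlsx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # Keep only the sheet documents, not the _rels entries next to them.
            if name.startswith("xl/worksheets/") and name.endswith(".xml")
        )


def read_part(xlsx_file, part_name):
    """Return one part of the archive decoded as UTF-8 text."""
    with zipfile.ZipFile(xlsx_file) as archive:
        return archive.read(part_name).decode("utf-8")
```

For example, list_worksheet_parts("book.xlsx") (a hypothetical file name) returns the sheet parts, and read_part can then show the raw XML of any of them.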

Related

Can I read other than the first sheet with the Read Delimited Spreadsheet.vi function in LabVIEW?

I am using Read Delimited Spreadsheet.vi in LabVIEW and need to read data from a sheet other than the first one. How do I tell LabVIEW that I want to use a sheet other than the first?
CSV files are plain text files and there are no multiple sheets inside.
Sheets are present within Excel files, but the "Read Delimited Spreadsheet" function does not work with these.
Unfortunately LabVIEW still doesn't have built-in support for reading Excel files as far as I know, although it can write them with the Save to Measurement File express VI.
There are third-party toolkits available for reading Excel files in LabVIEW, or you might be able to use some Python code with openpyxl or pandas.
I suppose you need to read an Excel file (.xls or .xlsx) and not a CSV file (as Mateusz pointed out, a CSV file doesn't have sheets).
Anyway, in LabVIEW you can read, write, and otherwise manipulate Excel files by using ActiveX. It is verbose, but you can use it like any other LabVIEW library.
See this post or the built-in examples in your LabVIEW environment.
As mentioned above, you can use Excel to read a spreadsheet. Alternately, you can use LibreOffice.
LibreOffice to LV library

Convert Excel with internal links to pdf

I am trying to convert an Excel workbook with internal links (i.e. links to different places within that same workbook) to a PDF. I have gone through several posts available online on this and couldn't find any proper solution for such a conversion. The solutions provided mostly work for external hyperlinks and not internal ones. Is it even possible to do so? Is there any software that can achieve this?
Basically, I am looking to migrate the whole Excel workbook to one single PDF such that every link between different worksheets in that workbook still works. For example, if I have provided a link in worksheet one that points to a named section in worksheet two, I would like this relation to be maintained within the PDF as well. So, when I click on the link in the resulting PDF, it should take me to that named section's location in the PDF.
Solution to keep internal and external links that works:
Install Office 2013/2016
Install Adobe Acrobat PRO 11 (for Office 2013) or Adobe Acrobat DC (for Office 2016)
Open the Excel file and use the "Save as Adobe PDF" entry from the File menu. This pops up another dialog where you select the sheets to save.
The main idea is to have the Adobe add-in active inside the Office ribbon and File dialog.
Works like a charm! Finally!

xla vs xlam addin, what is the difference?

Could someone please explain the difference between the xla Excel add-in format and the xlam Excel add-in format? Googling didn't provide anything useful.
The m stands for macro-enabled, which is the new format (as of Excel 2007).
These are add-ins that may call macros.
On the other hand, you could also have xlax extensions, which are meant for macro-free workbooks.
Note also the difference between xlsx and xlsm, where xlsx files also don't contain macros.
Why? My guess is that the main reason would be security.
Some people don't like to receive files without knowing whether they contain potentially harmful macros. In the old format, you could not make that distinction based on the file extension.
Both files are macro enabled files:
XLA files are Excel files from the Office 97-2003 era that are loaded as add-ins.
XLAM files are Excel 2007+ files, which are actually zip files that have XML documents inside them, per the Office Open XML format.

Programmatically Determine If An Excel File (.xls) Contains Macros

Is there any way to programmatically determine if an .xls contains macros, without actually opening it in Excel?
Also are there any methods to examine which certificate (including timestamp cert) these macros are signed with? Again without using Excel.
I'm wondering in particular if there are any strings that always show up in the raw data of an Excel file when macros are present.
Yes, you can open the .xls file as a compound document file and check whether it contains a VBA folder and streams containing VBA code.
Sample code is available in this CodeProject article:
Another OLE Doc Viewer but with editing facility
The certificate information is stored in the DocumentSummaryInformation stream. If you want to read out the information from there you should dig into the file format specifications available from Microsoft:
[MS-OSHARED]: Office Common Data Types and Objects Structure Specification
[MS-OFFCRYPTO]: Office Document Cryptography Structure Specification
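As a rough illustration of the "strings in the raw data" idea, here is a hedged Python sketch using only the standard library. It assumes two things that are not stated above: that a compound document starts with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1, and that the VBA storage shows up in the file's directory as a UTF-16LE name beginning with _VBA_PROJECT. It is a byte-level heuristic with made-up function names, not a parser; a reliable check would walk the compound document's directory as described above.

```python
# Signature found at the start of OLE compound documents (assumption of this sketch).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Directory entry names are stored as UTF-16LE; this prefix also matches
# the longer name "_VBA_PROJECT_CUR".
VBA_MARKER = "_VBA_PROJECT".encode("utf-16-le")


def looks_like_compound_document(data: bytes) -> bool:
    """Return True if the bytes start with the compound document signature."""
    return data[:8] == OLE_MAGIC


def may_contain_vba(data: bytes) -> bool:
    """Heuristic: compound document whose raw bytes mention a VBA project entry."""
    return looks_like_compound_document(data) and VBA_MARKER in data
```

A typical use would be may_contain_vba(open(path, "rb").read()), where path is whatever .xls file you want to inspect. A match only says that a VBA project entry name appears in the bytes, so treat it as a hint and not as proof.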
An xls file containing a macro should contain a string looking something like
Keyboard Shortcut:
Don't know if this is a surefire solution though

Identifying Different Excel File Formats

Is anyone familiar with a library or tool that can determine which format an excel file is in? Or, failing that, documentation on the different formats that would allow me to write my own?
The Excel file format is called the Binary Interchange File Format (BIFF). Different versions of Excel can use the same version of BIFF.
Open Office document on the Excel File Format.
Take a look at the Open Office API, this should help you.
Excel 97-2003 workbooks are known as Biff8. They are actually OLE Compound documents which are essentially a file system within a file. They store the main workbook in a stream named "Workbook" and they have other streams for VBA modules, OLE objects, document properties, etc...
Win32 includes APIs for reading OLE Compound Documents. They are far from trivial. Once you get the "Workbook" stream, the first Biff record identifies the file as being an Excel file.
You can find excellent documentation from Microsoft on the Biff8 file format on the Microsoft Office Binary File Formats page.
The new Excel 2007 Open XML (xlsx) format is actually a zip file with workbook parts and is documented at OpenXmlDeveloper.org.
I am not aware of a tool which will simply tell you the format of a workbook. You could take the easy, but not very reliable approach of just looking at the extension which will be right 99%+ of the time - if accuracy is not an issue.
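If the extension is not trustworthy enough, a slightly more robust approach is to sniff the first bytes of the file. The Python sketch below is only an illustration under stated assumptions: the function name and the returned labels are made up, the compound document signature is assumed to be D0 CF 11 E0 A1 B1 1A E1, and a zip archive is treated as an Excel Open XML workbook only if it contains a part named xl/workbook.xml (the usual location, though not guaranteed).

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound document signature
ZIP_MAGIC = b"PK\x03\x04"                      # zip local file header signature


def sniff_excel_format(data: bytes) -> str:
    """Classify raw file bytes as 'ole', 'ooxml', 'zip' or 'unknown'."""
    if data.startswith(OLE_MAGIC):
        # Could be an xls workbook, but also any other compound document;
        # confirming that requires finding the "Workbook" stream.
        return "ole"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                if "xl/workbook.xml" in archive.namelist():
                    return "ooxml"
        except zipfile.BadZipFile:
            return "unknown"
        return "zip"
    return "unknown"
```

Note that "ole" only narrows things down: Word and PowerPoint binary files use the same container, so telling them apart still means reading the streams inside.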
There are many tools to read xls and xlsx workbooks, including SpreadsheetGear for .NET which reads both.
Disclaimer: I own SpreadsheetGear LLC

Resources