Parsing Excel files in MonoTouch

Are there any good ways of parsing Excel files in MonoTouch? Most methods of working with Excel seem to be based on the Excel Object Library, which doesn't seem to be an option in MonoTouch. I read that Objective-C doesn't have any native support for Excel files, so I don't know whether that would change anything.

You would need to do one of the following:
write your own parser,
find an Objective-C library that does it and write MonoTouch bindings for it, or
find an open source .NET library and port it to MonoTouch.
If all you want to do is display a file, you can use the existing iOS document APIs to do it.
The newest Office formats are XML based, so depending on how complex the files are, writing your own parser might be feasible to do.
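To give a feel for how small such a parser can be for simple files, here is a minimal sketch in Python (not MonoTouch code; it only illustrates the file structure). It assumes the first worksheet is stored at xl/worksheets/sheet1.xml, which is typical but not guaranteed, and it ignores inline strings, dates, styles and gaps left by empty cells. The function name read_first_sheet is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the SpreadsheetML parts inside an .xlsx package.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_first_sheet(source):
    """Return the first worksheet as a list of rows (lists of strings).

    `source` is a path or a binary file-like object.
    Assumption: the sheet lives at xl/worksheets/sheet1.xml.
    """
    with zipfile.ZipFile(source) as zf:
        shared = []
        # Shared strings are optional; text cells refer to them by index.
        if "xl/sharedStrings.xml" in zf.namelist():
            root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
            for si in root.findall("m:si", NS):
                # A string may be split into several <t> runs; join them.
                runs = si.iter("{%s}t" % NS["m"])
                shared.append("".join(t.text or "" for t in runs))

        sheet = ET.fromstring(zf.read("xl/worksheets/sheet1.xml"))
        rows = []
        for row in sheet.findall("m:sheetData/m:row", NS):
            values = []
            for cell in row.findall("m:c", NS):
                v = cell.find("m:v", NS)
                text = v.text if v is not None and v.text is not None else ""
                # t="s" marks a cell whose value is a shared-string index.
                if cell.get("t") == "s" and text:
                    text = shared[int(text)]
                values.append(text)
            rows.append(values)
        return rows
```

The same two steps (unzip, then walk the XML) carry over to any platform that has a ZIP reader and an XML parser.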

I ended up just writing the middle step, a web service that fetches the Excel file, parses it and serves up the content as xml/json.

Related

How to convert a QuickReport .QRP files into XML, HTML or Text?

I need to create a service to convert a series of QuickReport _(.QRP)_ files into something more parsable such as Text or HTML.
What is the best way of doing that?
.QRP files are definition files for the Delphi/QuickReport engine.
It is not immediately clear from the question whether you want to do away with the Delphi/QuickReport engine completely or just get the results in a browser.
If it is the former, then you may have a complete rewrite on your hands. If it is the latter, then you probably need to adapt a Delphi app to export, say, PDF files.
The Delphi app approach is likely only viable if you have full control of the security of your web-server.
Re-reading the question, you specifically want to parse the reports (presumably for data extraction).
As QuickReport uses a relational database as a source, the only reason I can think of for doing this is that you don't have access to the Delphi source or the database.
This may imply a one-off data migration away from an ex-supplier. So maybe just print to a PDF print driver, then convert the PDFs to text and parse away.
I hope this helps.
I would recommend checking out Gnostice eDocEngine, which will allow you to export .QRP files to many formats, including Excel.
https://www.gnostice.com/nl_article.asp?id=248&t=Export_From_Quickreport_To_PDF_And_Other_Formats
eDocEngine VCL has several interesting components:
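TQRPQuickrep, TgtQRExportInterface and, for export to Excel, TgtExcelEngine. Rendering a .QRP file takes 4 lines of code (assuming the components are initialized). The example below renders to PDF through a PDF engine component; exporting to Excel would presumably use a TgtExcelEngine in its place: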
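// Name of the output file written by the PDF engine
gtPDFEngine1.FileName := 'eDoc_QuickReport_Demo.pdf';
// Do not show the interactive setup dialog
gtPDFEngine1.Preferences.ShowSetupDialog := false;
// Route the QuickReport export through the PDF engine
gtQRExportInterface1.Engine := gtPDFEngine1;
// Render the saved report file
gtQRExportInterface1.RenderDocument('eDoc_QuickReport.QRP');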
It is fairly priced and is provided with the source code. The only limitation is that it is Delphi and C++ Builder compliant only, but if you feel comfortable with either of these languages you can easily create a CLI utility or service and call it from your code.

How to generate application forms/documents programmatically?

At the moment, we use MS Word and MS Excel to mail merge documents that need to be sent to multiple recipients.
For example, say there is a complaint form where the complainant needs to fill in his/her name, address, etc. So we have a .doc file set up with the content and the dynamic entities set up for mail merging, with the name and address details put in an excel file, from where we can happily mail merge to generate all or just the necessary forms/documents.
However, I would like to automate this process, for example with a form on a website where the complainant can fill in his/her name, address and other details, which we could then use to generate the complaint form automatically and offer it for download (preferably as a PDF).
Now, the only solution that comes to mind is LaTeX, so that I can just replace the needed entities and compile to PDF. However, that has to be negotiated with the webhost, depending on whether they offer LaTeX or not.
Is there any other solution? Any other way we could get this done, with something that shouldn't be a problem for most webhosting solutions to offer?
EDIT: I would prefer a non-.NET, or rather non-Microsoft, solution, since the servers are running Linux and, while Mono might be capable of getting the job done, none of our devs know any .NET languages. However, if required, we might have to delve into it.
Generating PDF using an XSL. Check the following: Apoc XSL-FO
You will need to create an XML file with the required fields and transform that with this tool.
If you wish to avoid .NET then XSL-FO is worth a look. Try the FOray project.
XSLT can have a steep learning curve if you do not already have experience with it. Also, users will not be able to change the templates without asking the XSLT guru to do it.
If your templates are already in MS Word and MS Excel then I would stick with generating MS docs on the server. These are now easy to work with from code since OpenXML - check out OfficeOpenXML and OpenXMLDeveloper
Apache FOP : http://xmlgraphics.apache.org/fop/
I suggest generating RTF on the server: it's easy enough to generate automatically using CPAN's RTF::Writer, has converters that generate good PDF, can be edited by hand in Word, OpenOffice Writer and TextEdit, doesn't have any really bad compatibility issues between the main editing applications, and has decent text and resource extraction tools, with text extraction being rather better than for PDF.
There's some support for moving between RTF and LaTeX, although the best rtf -> latex converter, docx2tex, depends on the System.IO.Packaging .NET module, whose Mono implementation isn't yet rock solid.
Postscript — Not a recommendation: it's too much of an unwieldy sledgehammer for this job, but iText will generate the pdf directly from the form data. If you wanted to do fancy things like signed pdf, that would be the way to go.
Postscript #2 — If you break up the Word document into individual files using word's master document representation, then you can clobber one of the parts with hand-generated content. This makes it easy to do something approximating form-filling on word .doc files using just standard file-utils and some trivial rtf->doc tweaking.

Exporting Native Excel 2007 Files From .NET

Does anyone know of resources that can help me export the simple contents of a GridView to a native Excel 2007 format (i.e. the Office Open XML format)?
I've already seen solutions like Matt Berseth's, and in fact I have been using that for a while, but it comes with an annoying warning from Excel 2007, as documented here. The warning stems from the fact that the generated file is not a native Excel file; it is HTML.
My initial research shows that, at the core, xlsx files are zip files, but I have no idea how to produce these or what goes in them.
Any suggestions (or tutorials) would be greatly appreciated.
CarlosAg has an ExcelXML writer which works really well. It isn't a native excel 2007 formatted file, but it will be readable in excel 2007.
You will need to write a little method to do the exporting manually, but the API is very straightforward. You create a sheet object, then a row object, then a cell object. You can just loop through your data and output it. The examples on the site are pretty decent.
I prefer using Microsoft's own Open XML Format SDK. It is free, it is released by Microsoft and it creates real .xlsx files.
You can find the reference documentation here, as you can see, it is pretty straightforward to use.
SpreadsheetGear for .NET can read and write native xls and xlsx files and is easier to use (takes less of your time) than other solutions because it has an Excel like API so you don't have to learn anything about Open XML.
You can see some live ASP.NET (C# and VB) Excel Reporting examples here and download an evaluation version here.
Disclaimer: I own SpreadsheetGear LLC

How to read an Open Office spreadsheet?

How can I read an Open Office 3.0 spreadsheet (.ods) from Groovy? I'd like to select specific columns from a named worksheet. Ideally, it would be useful to add a 'where' clause, or other criteria clause.
I've never used it, but Open Office has a Java API, which of course you could use from Groovy as well. It looks like the best places to start reading are the Developer's Guide, the Java UNO Reference, and the samples in Java and (hey!) Groovy. Hope that helps!
Might be something here at Spring Factories or here at Groovy and JMX. There is a forum for Groovy and Open Office.
Could you export the table / spreadsheet as SQL entries and then use that? You could also look at this plugin for Groovy -- http://www.ifcx.org/
OpenOffice documents are ZIP files which contain the document data as XML plus some other files (style sheets for word documents). Details can be found here.
The main problem with Calc is formulas. If you just have tabular data, then you can simply read the cell values and use those. So you can open the ZIP archive, read the content.xml in it and parse that with any XML parser.
But when a cell contains a formula, then you need to execute it. In this case, you will have to open the document via the UNO API. Here is the Java version. There is a link where you can download example code that explains how to open ODF documents and how to examine their content. There are also snippets but none of them show how to examine a sheet.
The main disadvantage of UNO is the documentation. Each method is explained somewhere but you have to find the method which solves your problem, first.
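For the plain tabular case, the "open the ZIP, parse content.xml" route can be sketched as follows. This is a Python sketch rather than Groovy, under the assumption that the first row of the sheet holds column headers; the helper names read_ods_sheet and select are made up for this example. The select helper approximates the "specific columns plus a where clause" from the question.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml.
TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def read_ods_sheet(source, sheet_name):
    """Return the named sheet as a list of rows (lists of strings)."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    for table in root.iter("{%s}table" % TABLE):
        if table.get("{%s}name" % TABLE) != sheet_name:
            continue
        rows = []
        for row in table.iter("{%s}table-row" % TABLE):
            values = []
            for cell in row.findall("{%s}table-cell" % TABLE):
                # The displayed text of a cell is held in <text:p> children.
                paragraphs = cell.findall("{%s}p" % TEXT)
                text = "\n".join("".join(p.itertext()) for p in paragraphs)
                # Runs of identical cells are stored once with a repeat count.
                repeat = int(cell.get("{%s}number-columns-repeated" % TABLE, "1"))
                values.extend([text] * repeat)
            # Drop the empty padding cells at the end of a row.
            while values and values[-1] == "":
                values.pop()
            if values:  # skip completely empty rows
                rows.append(values)
        return rows
    raise KeyError(sheet_name)


def select(rows, columns, where=lambda record: True):
    """Yield the requested columns of each data row that passes `where`.

    Assumption: the first row contains the column headers.
    """
    header, *data = rows
    for values in data:
        record = dict(zip(header, values))
        if where(record):
            yield [record.get(column, "") for column in columns]
```

For example, `list(select(read_ods_sheet("people.ods", "People"), ["name"], lambda r: int(r["age"]) >= 18))` would return the names of all adults, given a sheet named People with name and age columns (the file and sheet names here are hypothetical).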
Since the title does not mention Groovy (only the question body does), I didn't want to make this a new question.
How to generally read an Open Office spreadsheet document? There are tools for creating one (ooo-python) but not for reading one. They are XML but just bluntly diving into that and trying to get the right logic of extracting the data I want seems so sub-optimal.
What I'd like is features similar to Excel COM support, but from a command line tool (or scripting language).

How best to export native data to Excel without introducing dependency on Office?

Our product has the requirement of exporting its native format (essentially an XML file) to Excel for viewing/editing. However, what this entails is having a dependency on Excel (or Office) itself for our product build - something that we do not want.
What we have done is export the data from our native format to a CSV file which can be opened in Excel. If the user also selects an option to open the generated report, we (try to) launch Excel to open it (of course, this requires Excel to be already present on the client system).
The data for most part is flat list of records.
Is there a better format (or even a better way) to handle this requirement? This is a common requirement for many products - how do you handle this?
Excel versions, both 2007 and several previous ones, have native XML formats. 2007, obviously, is XML by default, and earlier versions have the ability to save as XML. This SO question deals with the issue. I'd guess a little inspection would give an idea of what's required. I don't know if an XSD/DTD exists for older versions, but a little creative Googling might yield something.
As other people pointed out, it is reasonably easy to generate Excel XML files. You can do this in multiple ways. For example:
By creating a template Excel XML document, and then using XML DOM to stuff your data into the template, or
Converting the template Excel XML into an XSLT, and then simply passing your proprietary XML as input to XSLT.
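The first option (a template plus XML DOM) can be sketched like this in Python. The template below is a hand-written, stripped-down Excel 2003 XML workbook, assumed here for illustration; a real template saved from Excel would carry styles and more metadata. The function name fill_template is made up for this example.

```python
from xml.dom import minidom

# Minimal Excel 2003 XML (SpreadsheetML) template with an empty table.
TEMPLATE = """<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Report">
    <Table/>
  </Worksheet>
</Workbook>"""


def fill_template(template_xml, records):
    """Append one <Row> per record to the first <Table> of the template."""
    doc = minidom.parseString(template_xml)
    table = doc.getElementsByTagName("Table")[0]
    for record in records:
        row = doc.createElement("Row")
        for value in record:
            cell = doc.createElement("Cell")
            data = doc.createElement("Data")
            # Numbers get ss:Type="Number"; everything else is a string.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            data.setAttribute("ss:Type", "Number" if is_number else "String")
            # createTextNode takes care of escaping when serialized.
            data.appendChild(doc.createTextNode(str(value)))
            cell.appendChild(data)
            row.appendChild(cell)
        table.appendChild(row)
    return doc.toxml()
```

Writing the returned string to a file with an .xml extension gives something Excel can open without the application having any build-time dependency on Office.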
I'm using ExcelPackage to create spreadsheets in one of my side projects. It works pretty well, but (at least the version I'm using) it's a bit limited when it comes to styling and calculations.
ExcelPackage lets you create OOXML docs (.xlsx files) that are natively compatible with Excel 2007, but you can download a plugin for previous versions of Office from Microsoft.
We export our data either using Excel objects (COM-based code) on the client side or as a CSV file (usually on the server side, but it can be used on the client side too). We also allow copying data from grids in simple HTML format, which can be pasted into Excel without problems.
For one customer we even had to export data [from sql stored procedure] into csv-like tab-separated format, but named file like xxxxx.xls - this way excel opened that file in more correct way than csv file. Ugly hack, but worked well.
CSV is the most compatible format (no dependencies on external applications or libraries), but customers don't like it. Maybe we need to incorporate some XLS export code, so that all users will be happy :)
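As a rough illustration of the CSV and tab-separated exports described above, here is a small Python sketch using the standard csv module. The function name to_delimited and the sample field names are made up for this example.

```python
import csv
import io


def to_delimited(records, fieldnames, delimiter=","):
    """Serialize a flat list of dict records as CSV (or TSV) text."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(
        buf,
        fieldnames=fieldnames,
        delimiter=delimiter,
        extrasaction="ignore",  # silently skip keys that are not exported
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = [{"id": 1, "name": "Smith, J"}, {"id": 2, "name": "Lee"}]
csv_text = to_delimited(records, ["id", "name"])
tsv_text = to_delimited(records, ["id", "name"], delimiter="\t")
```

The csv module quotes fields that contain the delimiter, so "Smith, J" survives the comma-separated export intact; with a tab delimiter no quoting is needed for that value.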
If .csv isn't formatted enough, you could create a template in Excel, and use a little bit of VBA code to import the CSV and format it appropriately. This way your app is only concerned with generating the .CSV, and will use the same .XLS for each export.
If you're careful, you should be able to get this to work with most versions of Excel seamlessly.
With Perl there are several modules that can be used to produce .xlsx files without requiring an Office installation. Among those :
https://metacpan.org/pod/Excel::Writer::XLSX is the most well-known, with support for many Excel features like colors, formatting, etc.
https://metacpan.org/pod/Excel::ValueWriter::XLSX (I'm actually the author) has fewer features but is optimized for fast writing of large amounts of data
If you are working in Java, check out the POI project from Apache.
http://poi.apache.org/
Simple, nice, complete, powerful.
We started with Office on the server, but that's not very nice. We had to kill processes that hung, and had quite a bit of a performance dip. We thought about putting it on a different machine, but didn't bother after trying and using Aspose (commercial). We don't have a very large number of simultaneous users, but complex documents. Simple ones can be handled easier with csv.
I've used FlexCel Studio for a couple of projects now. It's very functional and fast. 100% managed code, no dependencies. It sounds like you'd use the "Reports" feature, which allows you to define an empty report template in Excel, then pass in a DataTable and voilà, it's populated with your data.
TMS Software
We use a combination of OleDB and Interop. We found that Interop was much faster and used less memory, but it's a pain for compatibility issues, especially when using different language installs of Office.
OleDb has the advantage that you don't require Excel to be installed on the client machine. Both Interop and OleDb support multiple sheets (tables) per workbook which you cannot do with csv.
If you're using C# or VB.Net, and your data is in a DataSet, DataTable or List<>, then you can use my free "Export to Excel" class.
It uses the free Microsoft OpenXML libraries (so you don't need to have Excel on your server), and lets you export your data into a "real" .xlsx file with just one line of code, eg:
DataSet ds = CreateSampleData();
CreateExcelFile.CreateExcelDocument(ds, "C:\\Sample.xlsx");
All source code is provided on the following page along with a demo project, completely free of charge (and popups !)
http://mikesknowledgebase.com/pages/CSharp/ExportToExcel.htm
Hope this helps !

Resources