Coming from many different languages, I got stuck in ERP development. The thing is that our ERP is using XSL transformations to generate Excel files in SpreadsheetML format, which is saved with .xls extension, so that Windows open it within Excel.
This results in many unnecessary clicks that our users are forced into; excel screaming about "repairing" the file, wanting to save it with .xml extension - so our users have to pick correct extension in order to send it to client / colleagues... That's just unnecessary load from user standpoint.
Now, due to historical reasons we are unable to ditch the XSLT (and hey! it's awesome anyway) but what we do want to is to make the entire process streamlined from the user standpoint.
Generating 2007+ formats
In the end - after researching google thoroughly, I ended up writing xslt transformations "engine" which allows us to generate valid 2007 formats. All we do now is to generate the contents of sheet.xml in OpenXML format. It also allows us to generate macro-enabled excel files.
But seriously, is there something I missed? Is there any other way how to generate the Excel OpenXML format with some already maintained solution?
The SpreadsheetML is one way how to do it, I don't like it because it introduces a huge demand on end-user and does not contain images/macros. Or maybe I'm doing something wrong?
SpreadsheetML confusion
When I say SpreadsheetML I mean the "XML Spreadsheet" . I've noticed that Microsoft is calling the OpenXML Excel format also as SpreadsheetML - which makes me confused.
Related
I have an excel worksheet with 10 tabs.
For each tab, the data is structured as follows:
All tabs follow this same basic structure.
In Power BI, when I go to "Get Data", and then choose the .xlsx file, I get the following error:
Unable to connect
We encountered an error while trying to connect.
Details: "The input couldn't be recognized as a valid Excel document."
This is very frustrating and I don't know how such a simple task cannot be accomplished in Power BI.
Thank you.
Such alert could appear when you try to use Power BI connector on Excel file. It's understandable if the source file is corrupted and can't be opened in Excel. However, it looks strange if Excel opens the file in question and shows nothing wrong.
Based on our experience above is usually means what something is wrong with XML scheme of the Excel workbook.
Mushup trace (Data->New Query->Query Options->Diagnostics->Enable tracing) could give some additional information, but often not enough to find the reason.
We had two main scenarios
XML scheme is not complete
Usually if Excel file was generated by third-party tool. Such tool could generate quite limited XML scheme which is enough to open the file in Excel and to work with it, but not enough for Power BI connector. As an example, trace log shows
[DataFormat.Error] The input couldn't be recognized as a valid Excel document.\r\nStackTrace:\n…
…
[DataFormat.Error] We couldn't find a part named '/xl/sharedStrings.xml' in the Excel package.\r\nStackTrace:\n…
Such case is easy to fix – it's enough to open the file in Excel and save it (without any changes) – Excel is clever enough to fix the scheme. For the routine regular tasks we use poweshell script which does exactly the same in background.
There is the link within Excel file which is not recognizable as valid
Usually if Excel file is synced/kept with some cloud storage. One of the variants, wrong link could appear with copy/paste from another such file. That could be active link in one of the cells; or the link within conditional formatting formula; or even the link which actually isn't used by Excel but kept somewhere inside the scheme. For example, in one of the files I found in Data->Consolidate->All references the link like
'\drive.tresorit.com#7235\Tresors….[file.xlsx]Sheet'!$AC$6:$AC$357
on the file which was deleted long ago and isn't used, but for some strange reason the link was kept within the scheme.
Unfortunately for such case trace log doesn't give enough information to localize the issue, it looks like
[DataFormat.Error] The input couldn't be recognized as a valid Excel document.\r\nStackTrace:\n…
…
nExceptionType: System.UriFormatException, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\r\nMessage: Invalid URI: The hostname could not be parsed.\r\nStackTrace:\n
Perhaps I have not enough knowledge for more straight forward localization of the problem, but the only way is to exclude Excel file parts one by one and check if the issue disappeared. Another way could be to unzip Excel file and check if wookbook.xml or sheetNN.xml have something suspicious inside.
With newly released Webi there's no way to manipulate reports with VBA like it was in DESKI era.
I'd like to know if there's a way for me to click a button with parameters in Excel sheet and get a report from the server?
I've been thinking of using the RESTful Web-services but it seems that there is a performance problem.
I also considered using a JAVA app in the middle using the SDK but it's not really satisfying as I add one layer.
Do you know if there's an other way to download a Webi report from and to Excel?
For this type of requirement, you'd normally use the OpenDocument feature. There is one thing that it won't do however, at least not for Webi documents, and that is deliver the output in Excel format (HTML and PDF are the two possible formats for Webi). In all fairness, the export to Excel option is only about two or three clicks away, but I can understand that this wouldn't be an ideal solution.
Another option is the Java SDK, which I would not recommend, as the ReBEAN SDK (the part of the Java SDK you need to interface with Webi documents) is deprecated and replaced by the REST SDK.
The REST SDK would be the way to go if the OpenDocument feature is not sufficient. Keep in mind that this would involve quite a few steps, each time sending a command to the WACS server and then decoding the answer. The steps would be:
Authenticate and get a logon token
Refresh the document (if necessary pass prompt values)
Export the document to Excel
Close the document
The REST interface is only supported on the WACS server, which should run on your BI4 server (unless you have a customised landscape). If it's slow, I would suggest looking into the root cause of this performance issue, instead of discarding the SDK altogether.
If you're going to use the REST interface, I would recommend opting for JSON to communicate through REST instead of XML. It's easier to read and parse.
A last option, which I wouldn't recommend, is LiveOffice. This is a separate product which allows you to embed contents from Webi documents into Office documents (most notably Excel). LiveOffice has always had its share of problems and has not received much love from SAP regarding much needed updates.
One final thought: the report will never appear in the same sheet, at least not without an additional amount of coding. Whatever SDK you end up choosing, you will always end up with an Excel file. If you want to show the results in the Excel file you started from, you'll need to code the steps to open the generated file, grab the contents and then copy those to your worksheet.
At the moment, we use MS WORD and MS EXCEL to mail merge documents that needs to be sent to multiple recepients.
For example, say there is a complaint form where the complainant needs to fill in his/her name, address, etc. So we have a .doc file set up with the content and the dynamic entities set up for mail merging, with the name and address details put in an excel file, from where we can happily mail merge to generate all or just the necessary forms/documents.
However, I would like to automate this process, like a form in a website where the complainant can fill in his/her name, address and other details, and we could use that to generate the complaint form automatically and offer it to be downloaded (preferrably as a pdf).
Now, the only solution that comes to mind, is Latex, so that I can just replace the needed entities and just compile to PDF. However, that bit has to be negotiated with the webhost, if they are offering Latex or not.
Is there any other solution? Any other way we could get this done, with something that shouldn't be a problem for most webhosting solutions to offer?
EDIT: I would prefer a non .NET or rather non microsoft solution since, the servers are running linux and while mono might be capable of getting the job done, none of our devs know any .NET languages. However, if required we might have to dwelve into it.
Generating PDF using an XSL. Check the following: Apoc XSL-FO
You will need to create an XML file with the required fields and transform that with this tool.
If you wish to avoid .NET then XSL-FO is worth a look. Try the FOray project.
XSLT can be a steep learn if you do not have experience already. Also users will not be able to change the templates without asking the XSLT guru to do it.
If your templates are already in MS Word and MS Excel then I would stick with generating MS docs on the server. These are now easy to work with from code since OpenXML - check out OfficeOpenXML and OpenXMLDeveloper
Apache FOP : http://xmlgraphics.apache.org/fop/
I suggest generating rtf on the server: it's easy enough to automatically generate using cpan's RTF::Writer, has converters generating good pdf, can be edited by hand in word, oo-writer & TextEdit, doesn't have any really bad compatibility issues between the main editing applications, and has decent text & resource extraction tools, with text extraction being rather better than pdf.
There's some support for moving between rtf & latex, although the best rtf -> latex converter, docx2tex, depends on the System.IO.Packaging .net module, whose mono implementation isn't yet rock solid.
Postscript — Not a recommendation: it's too much of an unwieldy sledgehammer for this job, but iText will generate the pdf directly from the form data. If you wanted to do fancy things like signed pdf, that would be the way to go.
Postscript #2 — If you break up the Word document into individual files using word's master document representation, then you can clobber one of the parts with hand-generated content. This makes it easy to do something approximating form-filling on word .doc files using just standard file-utils and some trivial rtf->doc tweaking.
Does anyone know of resources that can help me export simple contents of a GridView to a native Excel 2007 format (i.e. the OpenOfficeXML format).
I've already seen solutions like Matt Berseth's, and in fact I have been using that for a while, but it comes with an annoying warning produced by Excel 2007 as documented here stemming from the fact that a native Excel file is not generated; rather it is HTML.
My initial research shows that, at the core, xlsx files are zip files, but I have no idea how to produce these or what goes in them.
Any suggestions (or tutorials) would be greatly appreciated.
CarlosAg has an ExcelXML writer which works really well. It isn't a native excel 2007 formatted file, but it will be readable in excel 2007.
You will need to write a little method to do the exporting manually, the API is very straight forward though. You will create a sheet object, then a row object, then a cell object. You can just loop through your data and output it. The examples on the site are pretty decent.
I prefer using Microsoft's own Open XML Format SDK. It is free, it is released by Microsoft and it creates real .xlsx files.
You can find the reference documentation here, as you can see, it is pretty straightforward to use.
SpreadsheetGear for .NET can read and write native xls and xlsx files and is easier to use (takes less of your time) than other solutions because it has an Excel like API so you don't have to learn anything about Open XML.
You can see some live ASP.NET (C# and VB) Excel Reporting examples here and download an evaluation version here.
Disclaimer: I own SpreadsheetGear LLC
Our product has the requirement of exporting its native format (essentially an XML file) to Excel for viewing/editing. However, what this entails is having a dependency on Excel (or Office) itself for our product build - something that we do not want.
What we have done is export the data from our native format to a csv file which can be opened in Excel. If user selects an option to open the generated report as well, we (try to) launch Excel application to open it (ofcourse it requires Excel to be already present on the client system).
The data for most part is flat list of records.
Is there a better format (or even a better way) to handle this requirement? This is a common requirement for many products - how do you handle this?
Excel versions, both 2007 and several previous, have native XML formats. 2007, obviously, is XML by default, and earlier versions have the ability to save as XML. This SO question deals with the issue. I'd guess a little inspection would give an idea of what's required. I don't know if a XSD/DTD exists for older versions, but a little creative Googling might yield something.
As other people pointed out, it is reasonably easy to generate Excel XML files. You can do this in multiple ways. For example:
By creating a template Excel XML document, and then using XML DOM to stuff your data into the template, or
Converting the template Excel XML into an XSLT, and then simply passing your proprietary XML as input to XSLT.
I'm using ExcelPackage to create spreadsheets in one of my side projects. Works pretty good, but (at least the version I'm using) its a bit limited when it comes to styling and calculations.
ExcelPackage lets you create OOXML docs (.xslx files) that are natively compat with 2k7, but you can download a plugin for previous versions of Office from MS.
We export our data either using Excel objects (COM based code) on client side or CSV file (usually on server side, but can be used on client side too). And we allow copy data from grids in simple html format, what can be pasted into Excel without problems.
For one customer we even had to export data [from sql stored procedure] into csv-like tab-separated format, but named file like xxxxx.xls - this way excel opened that file in more correct way than csv file. Ugly hack, but worked well.
CSV is most compatible format (no dependencies on external applications or libraries), but customers don't like it. Maybe we need to incorporate some XLS export code, this way all users will be happy :)
If .csv isn't formatted enough, you could create a template in Excel, and use a little bit of VBA code to import the CSV and format it appropriately. This way your app is only concerned with generating the .CSV, and will use the same .XLS for each export.
If you're careful, you should be able to get this to work with most versions of Excel seamlessly.
With Perl there are several modules that can be used to produce .xlsx files without requiring an Office installation. Among those :
https://metacpan.org/pod/Excel::Writer::XLSX is the most well-known, with support for many Excel features like colors, formatting, etc.
https://metacpan.org/pod/Excel::ValueWriter::XLSX (I'm actually the author) has less features but is optimized for fast writing of large amounts of data
If you are working in Java, Checkout the POI project from APACHE.
http://poi.apache.org/
Simple, nice, complete, powerful.
We started with Office on the server, but that's not very nice. We had to kill processes that hung, and had quite a bit of a performance dip. We thought about putting it on a different machine, but didn't bother after trying and using Aspose (commercial). We don't have a very large number of simultaneous users, but complex documents. Simple ones can be handled easier with csv.
I've used FlexCel Studio for a couple of projects now. It's very functional and fast. 100% managed code, no dependencies. Sounds like you'd use the "Reports" feature which allows you to define an empty report template in Excel, then pass datatable and volia, it's populated with your data.
TMS Software
We use a combination of OleDB and Interop. We found that Interop was much faster and used less memory, but it's a pain for compatibility issues, especially when using different language installs of Office.
OleDb has the advantage that you don't require Excel to be installed on the client machine. Both Interop and OleDb support multiple sheets (tables) per workbook which you cannot do with csv.
If you're using C# or VB.Net, and your data is in a a DataSet, DataTable or List<>, then you can use my free "Export to Excel" class.
It uses the free Microsoft OpenXML libraries (so you don't need to have Excel on your server), and lets you export your data into a "real" .xlsx file with just one line of code, eg:
DataSet ds = CreateSampleData();
CreateExcelFile.CreateExcelDocument(ds, "C:\\Sample.xlsx");
All source code is provided on the following page along with a demo project, completely free of charge (and popups !)
http://mikesknowledgebase.com/pages/CSharp/ExportToExcel.htm
Hope this helps !