I am running a tool in Jenkins, which reads a .csv file. The computer I'm using doesn't have Microsoft Office installed, just OpenOffice.
Compared to Excel, OpenOffice displays this file strangely. For example, it:
merges several columns into one;
splits a single line across many columns;
does not show the column titles.
I wonder if anyone has encountered this issue and can advise me.
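Symptoms like merged or split columns often point to a field-separator mismatch at import time rather than a damaged file, although the thread does not confirm the cause here. A minimal sketch, assuming the file is plain delimited text, that guesses the separator with Python's standard `csv.Sniffer` so the same one can be chosen in the spreadsheet's import dialog. The sample data and the `describe_csv` helper are hypothetical.

```python
import csv
import io

def describe_csv(text, candidates=",;\t|"):
    """Guess the delimiter of CSV text and return (delimiter, header, rows)."""
    # Restrict the guess to the separators a spreadsheet import dialog offers.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows[0], rows[1:]

# Hypothetical sample: a semicolon-separated file that would collapse into
# one column if it were imported with a comma as the separator.
sample = "name;city;score\nAnn;Oslo;3\nBob;Rome;5\n"
delimiter, header, rows = describe_csv(sample)
print(repr(delimiter), header, len(rows))
```

If the detected separator differs from the one the spreadsheet application assumes, selecting it explicitly during import is the first thing to try.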
TL;DR: Excel always reports the workbook generated by Docx4J as corrupted, but I can't determine what Excel doesn't like about the underlying XML, let alone how to fix it.
My use case is as follows:
I am trying to produce an Excel workbook with charts and graphs automatically on a regular basis. Only the raw data will change, but everything else will update dynamically as the raw data changes.
So I built an Excel workbook in which a number of charts and graphs are generated from a sheet of raw data, and I am using it as a template. All values of the raw data are numeric. The intent was to use Docx4J to read this 'template', populate the raw data sheet, and save it as a new file; opening that file would then trigger recalculation and update the charts and graphs. Since I am new to Docx4J, I decided to take baby steps by first seeing if I could open and read the contents of the cells, which I could. So far so good. I could also change the values of the cells, but I could only verify this programmatically by writing the location and value to the console before and after the change (e.g. A1=45 followed by A1=55).
My problem starts when I try to open the resulting file. It generates, looks to be about the right size but Excel claims it is corrupted. It does try to recover what it can, but ultimately fails and the workbook won't even open. For troubleshooting, I opened up the generated xlsx and confirmed all the various XML files that make up an xlsx file were present and readable so I am concluding either something is missing or some part of the XML coming out the other side is not what Excel wants. Further troubleshooting involved creating an empty workbook (no data, 1 sheet) as my 'template', opening it and then saving it back to the file system with a different name and simply trying to see if I could open it in Excel but no dice. This has me ruling out anything to do with my attempts to write or add data to the sheet.
Relevant Environment Information:
'template' workbook is being generated on a Windows 10 64bit machine
My docx4j code is executing on a Debian 10 Linux machine running OpenJDK 11.0.4
My version of Excel both to create the 'template' and open the copy is Excel for Office365
I am running Docx4J v11.1.3, but I also tried v8.1.5 (in both cases I had to use the Reference Implementation of JAXB to get around a marshalling error when trying to save)
I did see another post on Stack Overflow here about an issue related to fonts in Linux environments, so I made sure to install the MS TT Corefonts, but it didn't help my problem.
I ran the entire unzipped directory through BeyondCompare and there are some differences, but I don't know which are just artifacts of the two different OSes or even which differences matter. Mostly they are:
small differences in file size
boolean values showing as "1", "yes", or "true" but not the same way for both files
namespaces and attributes in one file but not the other
Sheet1 from my blank workbook, before and after
All ideas are welcome.
Please try the just-released docx4j 8.1.6, which fixes handling of xlsx files created by recent releases of Excel. This was https://github.com/plutext/docx4j/issues/389
I want to export Excel data in multiple languages to resource files in Visual Studio, and I do not want to copy and paste every single row (key and value) into the resource file, as I have data for about 7 web pages with 20 rows each.
I have previously exported .resx to .xlsx using a very good tool, http://www.zeta-resource-editor.com/index.html, but it does not work in the other direction (.xlsx to .resx).
One major challenge I am facing is that the Excel file has fewer rows than the .resx has key-value pairs, i.e. I do not have the whole content translated in Excel.
I have tried Microsoft's built-in tool (https://msdn.microsoft.com/en-us/library/bbwz4bhx(v=vs.90).aspx), but it imports the Excel file as Excel/CSV itself and not as .resx.
I am also trying http://resxresourcemanager.codeplex.com/
which throws the error: could not find a matching file in Solution..
Any suggestions for a tried and tested tool will be highly appreciated :)
http://resxresourcemanager.codeplex.com/ could help, but it always throws errors saying that the sheet name and the solution's .resx file name do not match, even if you create a .resx file in the root and give the Excel sheet the same name.
So I opted for manual copy-paste. It was not very time consuming :)
I'm working on using CSV files to create Highcharts – but running into an odd problem: when Excel 'touches' a CSV file, the chart breaks immediately. Here's the simplest example:
Highcharts online documentation has a handy example of a bar chart generated from a CSV file: http://www.highcharts.com/studies/data-from-csv.htm
The data underlying this chart can be downloaded from: http://www.highcharts.com/studies/data.csv
But here's the odd thing. If I download those files, and recreate the chart on my own web server, everything works fine... until I open the data.csv file in Excel, then save it. This breaks the Highchart immediately, even if no changes are made to the underlying data. No error messages are thrown up in the console – the chart simply goes blank as soon as Excel makes a save.
I know what you're thinking – "this moron is saving a CSV file as a .xlsx, then wondering why his chart breaks." But that's not what's happening – using the 'Save As... .csv' option in Excel also breaks the chart immediately.
Here's the content of the CSV file before I open it in Excel (cut and pasted from TextEdit):
Categories,Apples,Pears,Oranges,Bananas
John,8,4,6,5
Jane,3,4,2,3
Joe,86,76,79,77
Janet,3,16,13,15
And here's the content of the CSV file after opening it in Excel:
Categories,Apples,Pears,Oranges,Bananas
John,8,4,6,5
Jane,3,4,2,3
Joe,86,76,79,77
Janet,3,16,13,15
To my eyes, those are the same file! And yet the first one renders perfectly, the second (which has been saved by Excel) creates an invisible chart.
Any help greatly appreciated. I'm using Excel 2008 for Mac, if that's relevant. Thanks in advance...
Seems the solution was to save the CSV file as a 'Windows comma separated CSV'.
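A plausible reason, not confirmed in the thread, is line endings: older Mac versions of Excel are known to write their default CSV with bare carriage returns (CR), which look identical when pasted but are not split by a parser that splits rows on "\n". A minimal sketch, under that assumption, that rewrites any mix of CR, LF, and CRLF endings consistently. The `normalize_newlines` helper is hypothetical.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\r\n") -> bytes:
    """Rewrite CR-only, LF-only, or CRLF line endings as one consistent ending."""
    # Collapse CRLF first so the lone-CR replacement does not double the breaks.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)

# Assumed shape of the file after an old-style Mac save: rows separated by CR only.
mac_style = b"Categories,Apples,Pears,Oranges,Bananas\rJohn,8,4,6,5\rJane,3,4,2,3\r"
fixed = normalize_newlines(mac_style)
print(fixed.split(b"\r\n")[:2])
```

Working on bytes avoids any newline translation by the text layer, so the function sees exactly what is in the file.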
Is there some way to write a CSV file such that, when opened in MS Excel, it will open in different tabs in the workspace?
The short answer is, NO.
For that matter, the long answer is NO too.
CSV is a continuous run of lines of values separated by commas. Each line doesn't even have to have the same number of values. There's no concept of worksheets or different "areas" in CSV. Excel cannot be cajoled into opening a CSV into multiple worksheets... well, at least not without writing VBA to parse the CSV file yourself.
The OOXML format, or whatever they've ended up calling the XML file spec for Office, allows multiple worksheets and is still easy to deal with, being text-based. Do you have to use CSV, or can you switch (at least part way through) to XML?
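As an illustration of the XML route, here is a minimal sketch that writes several worksheets as one plain-text file in the older "XML Spreadsheet 2003" format. This is an assumption about which XML format is meant. It is not the zipped .xlsx format, and the `workbook_xml` helper and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

NS = "urn:schemas-microsoft-com:office:spreadsheet"

def workbook_xml(sheets):
    """Build an XML Spreadsheet 2003 document from {sheet_name: rows}."""
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<?mso-application progid="Excel.Sheet"?>',
        f'<Workbook xmlns="{NS}" xmlns:ss="{NS}">',
    ]
    for name, rows in sheets.items():
        # Each Worksheet element becomes one tab in the spreadsheet application.
        parts.append(f"<Worksheet ss:Name={quoteattr(name)}><Table>")
        for row in rows:
            cells = []
            for value in row:
                kind = "Number" if isinstance(value, (int, float)) else "String"
                cells.append(
                    f'<Cell><Data ss:Type="{kind}">{escape(str(value))}</Data></Cell>'
                )
            parts.append("<Row>" + "".join(cells) + "</Row>")
        parts.append("</Table></Worksheet>")
    parts.append("</Workbook>")
    return "\n".join(parts)

# Hypothetical data: two tabs, each with a header row and one data row.
document = workbook_xml({
    "Fruit": [["Name", "Count"], ["Apples", 8]],
    "Veg": [["Name", "Count"], ["Leeks", 3]],
})

# Sanity check: the result is well-formed XML containing two worksheets.
root = ET.fromstring(document.encode("utf-8"))
sheet_names = [ws.get(f"{{{NS}}}Name") for ws in root.iter(f"{{{NS}}}Worksheet")]
print(sheet_names)
```

Saving `document` to a file with an .xml extension should let Excel open it with one tab per `Worksheet` element, though this sketch has not been checked against every Excel version.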
I am encountering what I believe to be a strange issue with Excel (in this case, Excel 2007, but maybe also Excel 2003, but don't have access to it as I write this).
I can reliably convert some server data over into a tab-delimited format (been doing this for years) and then open it using Excel - no issue.
However, what seems to be happening is if I have an html <table> inside one of the fields, it looks like Excel 2007 thinks it should be converting the table into rows and columns inside Excel (not what I want). As you might imagine, this throws off the entire spreadsheet.
So the question is: is there any way to set up Excel to NOT do this (perhaps some setting in Excel that pertains to reading tab-delimited files), or am I missing something?
Thanks.
Save your file as .txt.
Now open the file in Excel using drag and drop (rather than double-clicking your hokey .xls).
Slightly more work to open the file, but your tab text formatting will now be respected.
When you open the tab-delimited file, you are shown an import mapping dialog that lets you pick each column's data type (date, text, currency, etc.). For the columns that contain HTML data, choose text. This tells Excel to import the data as-is and not try to parse it automatically into a derived format.
Excel 2003 does the same. I don't think there is a way to do it with a config setting, because Excel finds delimiters in the HTML table and breaks the HTML into cells and columns, as it does for the other columns.
If the column containing HTML is always the same, you can use JYelton's suggestion of renaming the file as csv and record a small VBA macro that loads the file and automatically selects the HTML column as text in the import mapping dialog. You then load the file by calling the macro instead of double-clicking on the file.
If nothing else, import it into OpenOffice.org Calc, save as an .xls file, then open in Excel.