Reading XLSB file - Apache POI - apache-poi

I have gone through all the Stack Overflow posts about reading XLSB files with Apache POI.
I tried the links and examples mentioned in those posts, but I keep running into issues.
I am using the latest Apache POI, 3.17, with the code from this post:
Exception reading XLSB File Apache POI java.io.CharConversionException
(the answer posted by "Gagravarr").
I am getting the following errors:
The method getLocale() is undefined for the type XSSFBEventBasedExcelExtractor
The method getFormulasNotResults() is undefined for the type XSSFBEventBasedExcelExtractor
The constructor XSSFEventBasedExcelExtractor.SheetTextExtractor() is not visible
The method getIncludeSheetNames() is undefined for the type XSSFBEventBasedExcelExtractor
.......................... etc
I checked the base class "XSSFEventBasedExcelExtractor" in poi-ooxml-3.17.jar (source files) and I was able to find implementations for all of these methods.
Is this a known issue? Does it mean that there is no working example for reading XLSB files in Java?
I hope this question is not a duplicate.

I recently looked into how to use POI to read XLSB.
If you only need to read an XLSB file, you can use the Apache POI test code as an example:
https://svn.apache.org/repos/asf/poi/trunk/src/ooxml/testcases/org/apache/poi/xssf/eventusermodel/TestXSSFBReader.java
Note that XLSB stores its parts as .bin files instead of .xml files.
If you want to do more with XLSB files, see the format documentation:
https://msdn.microsoft.com/en-us/library/office/cc313133(v=office.12).aspx
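To see the .bin parts for yourself, here is a minimal sketch that uses only java.util.zip. It assumes the XLSB file is an ordinary ZIP-based package; the class and method names are made up for illustration and are not part of Apache POI. It only lists part names and does not decode the binary records.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Apache POI.
class XlsbParts {
    // Returns the names of all package parts that end in ".bin",
    // in the order they appear in the ZIP stream.
    static List<String> listBinaryParts(InputStream in) throws IOException {
        List<String> parts = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().endsWith(".bin")) {
                    parts.add(entry.getName());
                }
            }
        }
        return parts;
    }
}
```

Calling it with Files.newInputStream(Paths.get("book.xlsb")) (the file name is a placeholder) should show names such as xl/workbook.bin where an xlsx file would have xl/workbook.xml.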

Related

How to read and write same excel sheet in Jmeter?

I have a CSV with the following columns:
url, Expected Response, Actual Response, Status
I found various sites explaining how to write to an Excel file, but I can't find a solution for writing the Actual Response back to the same file that I read from.
How do we achieve this?
To update an existing file, you must write a new file with the updated content, delete the old file, and rename the new file to the old file's name.
In JMeter you can execute Java using a JSR223 Sampler with the language set to Java.
See java example.
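The write-new, delete-old, rename steps can be sketched in plain Java as follows. This is only an illustration under a few assumptions: the file has the four-column header from the question, every data row has at least three columns, no field contains a quoted comma, and the class and method names are invented here. Files.move with REPLACE_EXISTING does the delete and rename in one call.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the column layout follows the CSV header in the question.
class CsvResultWriter {
    // Sets the "Actual Response" column (index 2) of one data row (0-based,
    // not counting the header), then replaces the original file.
    static void writeActualResponse(Path csv, int dataRow, String actual) throws IOException {
        List<String> out = new ArrayList<>(Files.readAllLines(csv, StandardCharsets.UTF_8));
        int lineIndex = dataRow + 1; // skip the header line

        // Naive split: assumes no quoted fields that contain commas.
        String[] cols = out.get(lineIndex).split(",", -1);
        cols[2] = actual;
        out.set(lineIndex, String.join(",", cols));

        // Write the new content next to the original, then swap it in.
        Path tmp = Files.createTempFile(csv.toAbsolutePath().getParent(), "results", ".tmp");
        Files.write(tmp, out, StandardCharsets.UTF_8);
        Files.move(tmp, csv, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Inside a JSR223 Sampler the same logic could be called with the response of the previous sampler as the actual value. If several threads write to the same file, the calls would need to be synchronized, which this sketch does not do.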

How to parse strict *.xlsx file in Java

I need to parse data from an xlsx file. Currently I'm using Jakarta-POI (v. 3.11) to do that. It handles some xlsx files fine, but not all. I noticed that the files that are not parsed properly are "strict xlsx" files saved with Office 2013. To be more exact, these files are compliant with ISO 29500 rather than ECMA-376. The difference is that an ISO 29500 file contains relationships with the type:
http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument
and Jakarta-POI is looking for:
String CORE_DOCUMENT =
"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
Is there a way to make Jakarta-POI read these files?
OOXML Strict Converter for Office 2010 may help if you need to resave the docs using an older format.
Some of the purl namespaces are listed on http://pyxb.sourceforge.net/PyXB-1.2.2/bundles.html (Jethro's link above appears to no longer work).
The up-to-date XML schema files can be found at:
http://www.ecma-international.org/publications/standards/Ecma-376.htm
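To find out which files need converting before they reach POI, one option is to look for the strict relationship type directly. The sketch below is only an illustration: it assumes the root relationships part is at _rels/.rels and is UTF-8 encoded, and the class name is hypothetical. It does not make POI read the file; it only tells you which flavour you have.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of POI.
class StrictOoxmlCheck {
    // Relationship type quoted in the question for ISO 29500 strict files.
    static final String STRICT_CORE_DOCUMENT =
        "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument";

    // Returns true if the package's root relationships part mentions
    // the strict officeDocument relationship type.
    static boolean isStrict(InputStream xlsx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals("_rels/.rels")) {
                    continue;
                }
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                // Assumes the part is UTF-8 encoded.
                String rels = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                return rels.contains(STRICT_CORE_DOCUMENT);
            }
        }
        return false; // no root relationships part found
    }
}
```

Files for which this returns true are the ones to resave or convert.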

Excel biff5 to biff8 conversion

My system uses Apache POI to manage some xls files. I now have almost 300 xls files, but it appears that they are in an old format, so I get this exception:
The supplied spreadsheet seems to be Excel 5.0/7.0 (BIFF5) format. POI only supports BIFF8 format (from Excel versions 97/2000/XP/2003)
Is there a way to handle that, or to automatically convert all those files to the BIFF8 format?
Convert them to the OOXML format; POI supports both BIFF8 and the newer OOXML. Download the official Microsoft converter pack:
http://www.microsoft.com/en-us/download/details.aspx?id=3
Convert files by running excelcnv.exe -oice <input file> <output file>. You can try running it directly from your code as an external program, or create a batch file. There is a good explanation from mrdivo at social msdn here.
EDIT
The download mentioned above from microsoft.com is no longer available as of 6/21/2018. However, excelcnv.exe is a standard part of some Microsoft Office installations. It has been confirmed to be deployed with Office 2010 (the Office14 folder) and Office 2016, and possibly other versions. It can be found at:
C:\Program Files (x86)\Microsoft Office\root\Office16 (or Office14).
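Running the converter from Java as an external program could look roughly like this. It is a sketch, not a tested integration: the -oice flag and argument order come from the command line above, while the class and method names are invented, and the location of excelcnv.exe has to be supplied by the caller.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the excelcnv.exe command line shown above.
class BiffConverter {
    // Builds: <excelcnv> -oice <input file> <output file>
    static List<String> buildCommand(Path excelcnv, Path input, Path output) {
        return Arrays.asList(excelcnv.toString(), "-oice", input.toString(), output.toString());
    }

    // Starts the converter, waits for it to finish, and returns its exit code.
    static int convert(Path excelcnv, Path input, Path output)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(excelcnv, input, output))
                .inheritIO() // show the tool's output in this process's console
                .start();
        return process.waitFor();
    }
}
```

For the 300 files, convert could be called in a loop over a directory listing. The exit code is returned as-is; checking that the output file actually exists afterwards is probably a safer success test than relying on the code alone.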
It seems apache-POI can't handle BIFF5 format.
You should try to use Java Excel API instead : http://jexcelapi.sourceforge.net/

Wrong text encoding when parsing json data

I am fetching a website with curl and writing the result to a .json file. This file is the input to my Java code, which parses it using a JSON library and writes the necessary data to a CSV file that I later load into a database.
As you know, data coming from a website can be in different encodings, so I make sure that I read and write in UTF-8. I still get wrong output.
For example, Østerriksk becomes �sterriksk.
I am doing all this on Linux. I think there is some encoding problem, because the same code runs fine on Windows but not on Unix/Linux.
I am quite sure my Java code is correct, but I am not able to find out what I'm doing wrong.
You're reading the data as ISO 8859-1 but the file is actually UTF-8. I think there's an argument (or setting) to the file reader that should solve that.
Also: curl isn't going to care about the encodings. It's really something in your Java code that's wrong.
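A minimal sketch of reading and writing with an explicit charset, using only the standard library. The class and method names are made up for illustration. The point is that nothing here depends on the platform default encoding, which is what FileReader and FileWriter use when no charset is given and which can differ between a Windows and a Linux machine.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper that always uses UTF-8, never the platform default.
class Utf8Files {
    // Decodes the whole file as UTF-8.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Encodes the text as UTF-8 and overwrites the file.
    static void writeUtf8(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }
}
```

The same idea applies to streams: new InputStreamReader(in, StandardCharsets.UTF_8) and new OutputStreamWriter(out, StandardCharsets.UTF_8) take the charset as an argument.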
What kind of IDE are you using? For example, this can happen if you are using the Eclipse IDE and have not set the default encoding to UTF-8 in the properties.

Working with excel files using apache poi

Is there any way to read or write both the Excel 2003 and Excel 2007 formats using Apache POI? I know that we can use the HSSF workbook for the 2003 format and XSSF for 2007 (correct me if I am wrong). But is there a way to read both formats through a single workbook type instead of handling them separately?
Yes, you can do it. In fact, it's fairly widely documented on the Apache POI website!
If you already have code that uses HSSF, then you should follow the HSSF to SS converting guide for help on updating your code to be general across the two formats.
If you don't have any code yet, then follow the User API guide to get started - all the code in that is general for both formats. You can also look at the Quick Guide for some specific problems and how to solve them in the general way.
Use
WorkbookFactory.create(in);
According to the javadoc, it "Creates the appropriate HSSFWorkbook / XSSFWorkbook from the given InputStream."
Try Workbook wb = WorkbookFactory.create(pkg); where pkg is an OPCPackage.
It should work. However, if the XSSF file is too big you will get an OutOfMemoryError, so you should use the event user model to read the file instead. In that case, check the extension of the input path, as follows:
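// Returns true if the path ends with ".xls", ignoring case.
// regionMatches returns false (instead of throwing) when the path is
// shorter than the extension, unlike the substring-based check.
private boolean isXLS(String inputPath) {
    return inputPath.regionMatches(true, inputPath.length() - 4, ".xls", 0, 4);
}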
Read the How-to for more information about the event user model.
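Checking the extension trusts the file name. A different technique, swapped in here as an alternative and not taken from the answer above, is to look at the first bytes of the file: OLE2 compound files (the container used by .xls) start with D0 CF 11 E0 A1 B1 1A E1, and ZIP packages (the container used by .xlsx and .xlsb) start with "PK" followed by 03 04. The class and enum names below are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper that guesses the container format from magic bytes.
class ExcelSniffer {
    enum Kind { OLE2_XLS, ZIP_OOXML, UNKNOWN }

    // Classifies a file by the bytes at its start.
    static Kind sniff(byte[] header) {
        if (startsWith(header, 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)) {
            return Kind.OLE2_XLS;
        }
        if (startsWith(header, 0x50, 0x4B, 0x03, 0x04)) {
            return Kind.ZIP_OOXML;
        }
        return Kind.UNKNOWN;
    }

    // Reads up to eight bytes from the file and classifies them.
    static Kind sniff(Path file) throws IOException {
        byte[] header = new byte[8];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break; // file is shorter than eight bytes
                }
                filled += n;
            }
        }
        return sniff(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

This only identifies the container. It cannot tell BIFF5 from BIFF8 (both are OLE2) or xlsx from xlsb (both are ZIP), so it complements the extension check rather than replacing it.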
