Apache POI - How to write XSSFWorkbook to POIFSFileSystem?

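Using Apache POI HSSF, we can create an .xls file like this:
private void write(HSSFWorkbook workbook) throws IOException {
    POIFSFileSystem filesystem = new POIFSFileSystem();
    // Store the serialized workbook as the "Workbook" stream of the OLE2 container
    filesystem.createDocument(new ByteArrayInputStream(workbook.getBytes()), "Workbook");
    // try-with-resources closes the output stream even if writing fails
    try (FileOutputStream stream = new FileOutputStream("test.xls")) {
        filesystem.writeFilesystem(stream);
    }
}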
Similarly, how can I write with XSSFWorkbook? This does not have the getBytes() method.
I tried to create ByteArrayInputStream from XSSFWorkbook like this -
ByteArrayOutputStream baos = new ByteArrayOutputStream();
workbook.write(baos); //XSSFWorkbook here
ByteArrayInputStream bias = new ByteArrayInputStream(baos.toByteArray());
But the xlsx file created was corrupt. How can I write the workbook to disk using POIFSFileSystem?
The same XSSFWorkbook was written successfully when I did this:
FileOutputStream stream = new FileOutputStream("test.xlsx");
workbook.write(stream);
When I extracted and compared the xlsx files, there was no difference. However, when I do a plain text compare on the xlsx files directly (without extracting), there are a few differences in the bytes.
So the problem should be in the createDocument() and/or writeFilesystem() methods of POIFSFileSystem. Can someone let me know how to write XSSFWorkbook using POIFSFileSystem?

You can't!
POIFSFileSystem works with OLE2 files, such as .xls, .doc, .ppt, .msg etc. The POIFS code handles reading and writing the individual streams within that for you.
With the OOXML files (.xlsx, .docx, .pptx etc), the container for the file is no longer OLE2. Instead, the files are stored within a Zip container. In POI, this is handled by OPCPackage, which takes care of reading and writing from Zip files with the required OOXML metadata.
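The container difference is visible in the first few bytes of a file. The following is a minimal sketch, not part of POI, that guesses the container from those bytes; the class name `ContainerSniffer` and its `detect` method are hypothetical names chosen for illustration. It relies only on the well-known OLE2 and ZIP signatures.

```java
import java.util.Arrays;

/** Hypothetical helper: guesses a file's container format from its leading bytes. */
final class ContainerSniffer {
    // OLE2 compound document signature (used by .xls, .doc, .ppt, .msg)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\3\4" (used by .xlsx, .docx, .pptx)
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Returns "OLE2", "ZIP", or "UNKNOWN" for the given leading bytes of a file. */
    static String detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return "OLE2";
        if (startsWith(header, ZIP_MAGIC)) return "ZIP";
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        // Leading bytes as they would appear at the start of a ZIP-based file
        byte[] xlsxStart = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 };
        System.out.println(detect(xlsxStart)); // ZIP
    }
}
```

A file whose leading bytes match neither signature, such as plain text, comes back as "UNKNOWN" in this sketch.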
If you want to write an XSSF file to disk, simply do:
FileOutputStream stream = new FileOutputStream("test.xlsx");
workbook.write(stream);
stream.close();
And XSSFWorkbook will handle talking to OPCPackage for you to make that happen.

Related

Can't read an .xlsx file with [BlobInput]

I'm trying to read an .xlsx file from blob storage but the only option I have is to read it as a string from the binding parameter.
[BlobInput("templates/myTemplate.xlsx", Connection = "StorageAccountConnStr")] string template
To load the .xlsx file I need to make a MemoryStream. Thus I wrote:
var templateBytes = Encoding.Unicode.GetBytes(template);
var templateStream = new MemoryStream(templateBytes);
It fails and tells me the file might be corrupt.
Any ideas how to properly read an .xlsx file from blob storage as an input?
It turns out that, besides string, byte[] is also supported.
With that, I was able to read and open my file. The Azure documentation does not mention this yet.
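Why the string binding corrupts the file can be illustrated with a small sketch. It is written in Java rather than C#, and it assumes the binary content passes through a UTF-8 text decoding step; which encoding the binding actually applies is not stated above, so treat that as an assumption. `BinaryAsTextDemo` and `roundTripThroughString` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical demo: binary data does not survive a trip through a text string. */
final class BinaryAsTextDemo {
    /** Decodes the bytes as UTF-8 text (assumed encoding), then encodes the text back to bytes. */
    static byte[] roundTripThroughString(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ZIP signature followed by two bytes that are not valid UTF-8
        byte[] binary = { 0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xFE };
        byte[] result = roundTripThroughString(binary);
        // Invalid sequences are replaced during decoding, so the bytes differ
        System.out.println(Arrays.equals(binary, result)); // false
    }
}
```

Binding to byte[] avoids the decoding step entirely, which is consistent with the file opening correctly afterwards.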

Apache POI appending data to xlsx file when task ran twice

I have a template.xls file that I'm adding data to from some database queries. I add the data and generate a new file named yyyyMMddHHmmss.xls. This works great. The file size is getting large so I'm trying to do the same with an xlsx file. When I generate the file the first time it works great. If I run the process again (even if I restart my java app) it's somehow retaining the last file in memory and appending the data to that file. In both cases it's pulling the source file from template.xls(x) which is an unmodified file.
The code between the two is identical except I'm passing in xlsx instead of xls in the latter case.
ClassLoader classLoader = getClass().getClassLoader();
File file = new File(Objects.requireNonNull(classLoader.getResource("template.xlsx")).getFile());
Workbook workbook = WorkbookFactory.create(file);
// write data
Date date = new Date();
SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMddHHmmss");
String currentDate = formatter.format(date);
FileOutputStream fileOutputStream = new FileOutputStream(currentDate + ".xlsx");
workbook.write(fileOutputStream);
fileOutputStream.close();
workbook.close();
I'm using Java 8u201 and org.apache.poi:poi:4.1.0 (also tried 4.0.1)
As already explained in "Apache POI - FileInputStream works, File object fails (NullPointerException)", creating an XSSFWorkbook from a File has the disadvantage that all changes made in that workbook are always stored back into that file when XSSFWorkbook.write is called. This is true even if write writes to another file. Writing explicitly to the same file is not even possible, because the File stays open after the workbook is created, so writing into that same file leads to exceptions.
So creating an XSSFWorkbook from a File using
Workbook workbook = WorkbookFactory.create(file);
is not a good idea when file is a *.xlsx file. Instead, the Workbook needs to be created using a FileInputStream:
Workbook workbook = WorkbookFactory.create(new FileInputStream(file));
Although the linked SO Q/A is from 2017, the same problem still occurs today using Apache POI 4.1.0.
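The point of the stream variant is that the workbook is built from a copy of the data rather than bound to the file on disk. A minimal stdlib-only sketch of that idea follows; `TemplateLoader` and `openDetached` are hypothetical names, and the POI call itself is deliberately left out. The returned stream is the kind of argument that could be handed to WorkbookFactory.create(InputStream).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: hands out an in-memory copy of a template file. */
final class TemplateLoader {
    /** Reads the whole template into memory; the file is closed before this returns. */
    static InputStream openDetached(Path template) {
        try {
            byte[] content = Files.readAllBytes(template);
            return new ByteArrayInputStream(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in file for illustration only; not a real workbook
        Path template = Files.createTempFile("template", ".xlsx");
        Files.write(template, new byte[] { 0x50, 0x4B, 0x03, 0x04 });
        try (InputStream in = openDetached(template)) {
            System.out.println(in.available()); // 4
        }
        Files.delete(template);
    }
}
```

The trade-off is memory: the whole template is held in a byte array, which is acceptable for a template but may matter for very large files.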

Not saving over template

I have an XLS template which I am modifying via Apache POI.
The aim is to modify the template and then email this modified xls spreadsheet.
Once the workbook has been modified accordingly, is there a way to keep this file in memory without saving over the original?
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("template.xls"));
HSSFWorkbook wb = new HSSFWorkbook(fs);
//do some processing
FileOutputStream fileOut = new FileOutputStream("template.xls");
wb.write(fileOut);
fileOut.close();
I want to keep this template intact for the next run. I would have used Apache FreeMarker but couldn't see xls support.

Error while reading CSV file using Apache POI

I am trying to read a CSV file from a local drive using the Apache POI API.
FileInputStream fInputStream = new FileInputStream(inputName);
Workbook workBook = WorkbookFactory.create(fInputStream);
When I try to read a CSV file, which is created in Windows, the API reads it perfectly.
However, when I take a CSV file that was downloaded from a UNIX environment and read it in a Windows environment, I get the exception below.
java.lang.IllegalArgumentException: Your InputStream was neither an OLE2 stream, nor an OOXML stream
at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:81)
Can somebody shed some light on this behavior?

Open XML SDK - Save a template file (.xltx to .xlsx)

I have the following code to open Excel template file and save it as .xlsx file and I get the error below when I try to open the new file. Please help to resolve this.
Excel cannot open the file ‘sa123.xlsx’ because the file format or the extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.
string templateName = "C:\\temp\\sa123.xltx";
byte[] docAsArray = File.ReadAllBytes(templateName);
using (MemoryStream stream = new MemoryStream())
{
    stream.Write(docAsArray, 0, docAsArray.Length); // THIS performs doc copy
    File.WriteAllBytes("C:\\temp\\sa123.xlsx", stream.ToArray());
}
In order to do this you will need to use the Open XML SDK 2.0. Below is a snippet of code that worked for me when I tried it:
byte[] byteArray = File.ReadAllBytes("C:\\temp\\sa123.xltx");
using (MemoryStream stream = new MemoryStream())
{
    stream.Write(byteArray, 0, (int)byteArray.Length);
    using (SpreadsheetDocument spreadsheetDoc = SpreadsheetDocument.Open(stream, true))
    {
        // Change from template type to workbook type
        spreadsheetDoc.ChangeDocumentType(SpreadsheetDocumentType.Workbook);
    }
    File.WriteAllBytes("C:\\temp\\sa123.xlsx", stream.ToArray());
}
This code loads your template file into a SpreadsheetDocument object. The document's type is Template, but since you want a Workbook, you call the ChangeDocumentType method to change it from Template to Workbook. This works because the underlying XML is the same in a .xltx and a .xlsx file; only the document type was causing the issue.
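The document type mentioned above is recorded in the package as the content type of the main workbook part, listed in [Content_Types].xml. The following Java sketch, using only java.util.zip, checks which of the two types a package declares. `PackageTypeCheck`, `isTemplate`, and `fakePackage` are hypothetical names, and `fakePackage` builds a stand-in ZIP holding only that one entry, not a valid spreadsheet.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: inspects the declared main-part type of an OOXML package. */
final class PackageTypeCheck {
    static final String TEMPLATE_TYPE =
        "application/vnd.openxmlformats-officedocument.spreadsheetml.template.main+xml";
    static final String WORKBOOK_TYPE =
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml";

    /** Returns true if [Content_Types].xml in the package declares the template type. */
    static boolean isTemplate(byte[] packageBytes) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals("[Content_Types].xml")) continue;
                // Read the current entry fully; read() returns -1 at the end of the entry
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) buffer.write(chunk, 0, n);
                String xml = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                return xml.contains(TEMPLATE_TYPE);
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a stand-in package for illustration: a ZIP with only [Content_Types].xml. */
    static byte[] fakePackage(String mainPartType) {
        String xml = "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
            + "<Override PartName=\"/xl/workbook.xml\" ContentType=\"" + mainPartType + "\"/></Types>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("[Content_Types].xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(isTemplate(fakePackage(TEMPLATE_TYPE))); // true
        System.out.println(isTemplate(fakePackage(WORKBOOK_TYPE))); // false
    }
}
```

Copying the bytes of a .xltx file to a .xlsx name leaves this declaration unchanged, which is why a plain byte copy is not enough.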
Excel sees the .xlsx extension and tries to open it as a workbook file. But it isn't one; it's a template file. When you have a template open in Excel and save it as a .xlsx file, Excel converts it to the workbook format. What you are doing is the same as changing the extension in the filename. Try it in Windows Explorer and you will get the same result.
I believe you should be able to accomplish what you want by using the Excel Object Model. I have not used this though.
