Is disposing SXSSFWorkbook necessary when used in try with resource - apache-poi

Below is the sample code snippet to create SXSSFWorkbook:
try(SXSSFWorkbook wb = new SXSSFWorkbook()) {
//...
} finally {
wb.dispose(); //wb not accessible over here, so can't use try with resource
}
The problem here is that if I use try-with-resources, I can't call dispose() on the SXSSFWorkbook in the finally block, because the variable wb is not accessible there.
I would like to know whether disposing of the workbook is necessary to delete the temporary files, or whether try-with-resources takes care of it, since SXSSFWorkbook is AutoCloseable.

I am not sure whether any of the Apache POI developers will answer this. But Apache POI is open source, so every programmer can answer it by looking at the code.
State as of May 2018, Apache POI version 3.17.
SXSSFWorkbook.java:
public class SXSSFWorkbook implements Workbook
So why can this be used as a resource in try-with-resources? Because of
Workbook.java:
public interface Workbook extends Closeable, Iterable<Sheet>
So org.apache.poi.ss.usermodel.Workbook extends java.io.Closeable, and therefore every class that implements it must provide a close method.
SXSSFWorkbook.close
As you can see, the individual SheetDataWriters are closed first, and then the internal XSSFWorkbook _wb is closed.
SheetDataWriter.close
SheetDataWriter.close only flushes and closes the Writer _out.
So no, as of May 2018 (Apache POI version 3.17), dispose is not called anywhere during auto-closing.
And only SheetDataWriter.dispose deletes the TempFile _fd created for each sheet.

This is a formal resolution of the problem.
SXSSFWorkbook t_wb = null;
try(SXSSFWorkbook wb = t_wb = new SXSSFWorkbook()) {
//...
} finally {
if(t_wb != null) t_wb.dispose();
}
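The same idea can also be written without the helper variable, by declaring the resource before a plain try block so that it is still in scope in finally. The following is a minimal, JDK-only sketch: TempBackedResource and DisposeSketch are hypothetical stand-ins, not POI classes, and they only mimic the split between close() (release handles) and dispose() (delete the temporary file) described above.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class DisposeSketch {
    // The variable is declared outside the try block, so finally can still see it.
    static TempBackedResource useAndDispose() throws IOException {
        TempBackedResource res = new TempBackedResource();
        try {
            // ... work with res ...
        } finally {
            try {
                res.close();   // first release the handles
            } finally {
                res.dispose(); // then delete the temporary file, even if close() failed
            }
        }
        return res;
    }

    public static void main(String[] args) throws IOException {
        TempBackedResource res = useAndDispose();
        System.out.println("closed: " + res.closed);
        System.out.println("temp file still exists: " + Files.exists(res.tempFile));
    }
}

// Hypothetical stand-in (assumption, not the POI class): close() leaves the
// temporary file on disk, and only dispose() removes it.
class TempBackedResource implements Closeable {
    final Path tempFile;
    boolean closed;

    TempBackedResource() throws IOException {
        tempFile = Files.createTempFile("sketch-", ".tmp");
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean dispose() throws IOException {
        return Files.deleteIfExists(tempFile);
    }
}
```

With the real SXSSFWorkbook, the two calls in the finally block would be wb.close() followed by wb.dispose().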

This question bothers me too, so my solution is to override the close method, like this:
//a utility method somewhere
Workbook createMyCustomWorkbook() {
return new SXSSFWorkbook() {
@Override
public void close() throws IOException {
try {
dispose();
} catch (Exception e) {
//some logging
}
super.close();
}
};
}
//use in a try-with-resources block
try (Workbook wb = createMyCustomWorkbook()) {
//do stuff with wb
}

Related

NPOI Write Corrupts File - Bare Ampersands

Using NPOI 2.1.3.1, I am trying to read an existing Excel (*.xlsx) workbook, modify it, and then write it back to the original file. After reading various threads (including this one), I still cannot find a solution to the problem I'm having.
When I write the file to disk and then try to open it again in Excel, I get the following error:
We found a problem with some content in (filename). Do you want us to
try to recover as much as we can? If you trust the source of this
workbook, click Yes.
Clicking "Yes" fixes various problems in the Excel file, after which I see the following report of the fixes performed:
Replaced Part: /xl/worksheets/sheet3.xml part with XML error. Illegal
name character. Line 3, column 3891168.
Replaced Part: /xl/worksheets/sheet19.xml part with XML error. Illegal name
character. Line 1, column 699903.
Removed Records: Formula from /xl/calcChain.xml part (Calculation properties)
I unzipped the *.xlsx file, found the sheets mentioned, and discovered that the character it was referring to is a bare ampersand (&) that was not written as "&amp;" in the XML. The original does use "&amp;", but the file NPOI wrote does not. I have no idea what the issue is with the formula (third issue).
Here is a complete program that reproduces this issue every single time with the workbook I'm using, with the file name removed:
using System.IO;
using NPOI.XSSF.UserModel;
namespace NpoiTest
{
public sealed class NpoiTest
{
public static void Main(string[] args)
{
XSSFWorkbook workbook;
using (FileStream file = new FileStream(@"C:\Path\To\File.xlsx", FileMode.Open, FileAccess.Read))
{
workbook = new XSSFWorkbook(file);
}
using (FileStream file = new FileStream(@"C:\Path\To\File.xlsx", FileMode.Create, FileAccess.Write, FileShare.ReadWrite))
{
workbook.Write(file);
}
}
}
}
As a test, I wrote pretty much the same program using Apache POI, to see if it was just a universal problem with my workbook, and the result was that POI didn't have any problems.
Here is the complete program:
package poitest;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class PoiTest
{
public static void main(String[] args)
{
XSSFWorkbook workbook;
try (FileInputStream file = new FileInputStream(new File("C:\\Path\\To\\File.xlsx")))
{
workbook = new XSSFWorkbook(file);
}
catch (IOException e)
{
System.out.println(e.getMessage());
return;
}
try (FileOutputStream out = new FileOutputStream(new File("C:\\Path\\To\\File.xlsx")))
{
workbook.write(out);
}
catch (IOException e)
{
System.out.println(e.getMessage());
}
}
}
So the question is why is NPOI leaving the bare ampersands? Is this just a bug in NPOI?
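To confirm that a bare ampersand alone is enough to make a sheet part ill-formed, the check can be reproduced outside Excel with the JDK's own XML parser. This is only a sketch: the class AmpersandCheck, its method names, and the sample strings are made up for illustration and have nothing to do with NPOI's or POI's internals.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class AmpersandCheck {
    // Returns true if the string parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Escapes the characters that may not appear literally in XML text content.
    // The ampersand must be replaced first so that the other entities are not double-escaped.
    static String escapeText(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String raw = "Tom & Jerry";
        System.out.println("bare ampersand well-formed: " + isWellFormed("<v>" + raw + "</v>"));
        System.out.println("escaped ampersand well-formed: "
                + isWellFormed("<v>" + escapeText(raw) + "</v>"));
    }
}
```

Running the same kind of check against the unzipped sheet XML from both the NPOI and the POI output would show which writer produced the unescaped character.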

I am unable to read Excel data into my Selenium code on Ubuntu

public class ReadAndWrite {
public static void main(String[] args) throws InterruptedException, BiffException, IOException
{
System.out.println("hello");
ReadAndWrite.login();
}
public static void login() throws BiffException, IOException, InterruptedException{
WebDriver driver=new FirefoxDriver();
driver.get("URL");
System.out.println("hello");
FileInputStream fi = new FileInputStream("/home/sagarpatra/Desktop/Xpath.ods");
System.out.println("hiiiiiii");
Workbook w = Workbook.getWorkbook(fi);
Sheet sh = w.getSheet(1);
//or w.getSheet(Sheetnumber)
//String variable1 = s.getCell(column, row).getContents();
for(int row=1; row < sh.getRows();row++)
{
String username = sh.getCell(0, row).getContents();
System.out.println("Username "+username);
driver.get("URL");
driver.findElement(By.name("Email")).sendKeys(username);
String password= sh.getCell(1, row).getContents();
System.out.println("Password "+password);
driver.findElement(By.name("Passwd")).sendKeys(password);
Thread.sleep(10000);
driver.findElement(By.name("Login")).click();
System.out.println("Waiting for page to load fully...");
Thread.sleep(30000);
}
driver.quit();
}
}
I don't know what is wrong with my code, or how to fix it. It outputs the following error:
Exception in thread "main" jxl.read.biff.BiffException: Unable to recognize OLE stream
at jxl.read.biff.CompoundFile.<init>(CompoundFile.java:116)
at jxl.read.biff.File.<init>(File.java:127)
at jxl.Workbook.getWorkbook(Workbook.java:221)
at jxl.Workbook.getWorkbook(Workbook.java:198)
at test.ReadTest.main(ReadTest.java:19)
I would try using Apache MetaModel instead. I have had better luck with that than with JXL. Here is an example project I wrote that reads from a .XLSX file. I use this library to run tests on a Linux Jenkins server from .XLS files generated on MS Windows.
Also, it should be noted that this library is also perfect for making a parameterized DataProvider that queries a database with JDBC.
Using JXL, you limit yourself to one data type, either .XLS or .CSV. I believe MetaModel is actually using JXL under the hood and wrapping it to make it easier to use. So, it also would support the OpenOffice documents in the same fashion and suffer the same file compatibility issues.
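As for the exception itself: JXL reads the binary .xls format, which is stored in an OLE2 compound file, whereas an .ods document (like .xlsx) is a ZIP archive, so "Unable to recognize OLE stream" is what one would expect when it is handed Xpath.ods. A quick way to check what a file really is, regardless of its extension, is to look at its first bytes. The class ContainerSniffer below is a hypothetical helper written for this illustration, and it assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ContainerSniffer {
    // Signature of an OLE2 compound file (used by binary .xls workbooks).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header signature of a ZIP archive (used by .ods and .xlsx).
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Classifies a file by the first bytes of its content.
    static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    // Reads up to eight bytes from the file and classifies them.
    static String sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(sniff(Path.of(args[0])));
        } else {
            System.out.println("usage: java ContainerSniffer.java <file>");
        }
    }
}
```

If the spreadsheet reports ZIP, saving it as an Excel 97-2003 (.xls) workbook, or switching to a library that reads the format it is actually in, would be the next thing to try.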

VSTO Excel: Restore ListObject data source when reopening a file

I am working on an Excel 2010 template project. In my template I have many sheets with static ListObject controls in each of them. To initialize my ListObject, I bind a BindingList<MyCustomType> so it generates a column for each of my MyCustomType public properties. It is really handy because when the user adds some rows in the ListObject, it automatically fills up my BindingList instance. I added a button in the Excel ribbon so that the program can validate and commit these rows through an EDM. This is how I bind my data to the ListObject in the startup event handler of one of my Excel sheets.
public partial class MyCustomTypesSheet
{
private BindingList<MyCustomType> myCustomTypes;
private void OnStartup(object sender, System.EventArgs e)
{
ExcelTools.ListObject myCustomTypeTable = this.MyCustomTypeData;
BindingList<MyCustomType> customTypes = new BindingList<MyCustomType>();
myCustomTypeTable.SetDataBinding(customTypes);
}
// Implementation detail...
}
Now my issue is that the user of this template will very likely enter these rows over many sessions. That is, he will enter data, save the file, close it, reopen it, enter some new rows, and eventually try to commit these rows when he thinks he is done. What I noticed is that when the Excel file created from the template is reopened, the DataSource property of my ListObject controls is null, which means I have no way to get the data back from the ListObject into a BindingList<MyCustomType>. I have been searching and found no automatic way to do that, and I don't really want to write a piece of code that would crawl through all of the columns to recreate my MyCustomType instances. In an ideal world I would have done it like this.
private void OnStartup(object sender, System.EventArgs e)
{
ExcelTools.ListObject myCustomTypeTable = this.MyCustomTypeData;
BindingList<MyCustomType> customTypes = null;
if (myCustomTypeTable.DataSource == null) // Will always be null and erase previous data.
{
customTypes = new BindingList<MyCustomType>();
myCustomTypeTable.SetDataBinding(customTypes);
}
else
{
customTypes = myCustomTypeTable.DataSource as BindingList<MyCustomType>;
}
}
I have been doing a lot of research on this but I was not able to find a solution, so I hope some of you can help me resolve this issue.
Thanks.
As a last resort, I decided that I would serialize my object list to XML and then add it as an XML custom part to my Excel file on save. But when I looked into the MSDN documentation to achieve this, I found out that there are two ways to persist data: XML custom parts and data caching. And data caching was exactly the functionality I was looking for.
So I have been able to achieve my goal by simply using the CachedAttribute.
public partial class MyCustomTypesSheet
{
[Cached]
public BindingList<MyCustomType> MyCustomTypesDataSource { get; set; }
private void OnStartup(object sender, System.EventArgs e)
{
ExcelTools.ListObject myCustomTypeTable = this.MyCustomTypeData;
if (this.MyCustomTypesDataSource == null)
{
this.MyCustomTypesDataSource = new BindingList<MyCustomType>();
this.MyCustomTypesDataSource.Add(new MyCustomType());
}
myCustomTypeTable.SetDataBinding(this.MyCustomTypesDataSource);
}
private void InternalStartup()
{
this.Startup += new System.EventHandler(OnStartup);
}
}
It works like a charm. You can find more information about data caching in MSDN documentation.

Opening multiple sessions simultaneously in NHibernate

I finally figured out what's wrong with my code, but I'm not sure how to fix it. I have some background processes running on a separate thread that perform some database maintenance tasks. Here's an example of what's happening:
//Both processes share the same instance of ISessionFactory
//but open separate ISessions
//This is running on its own thread
public void ShortRunningTask()
{
using(var session = _sessionFactory.OpenSession())
{
//Do something quickly here
session.Update(myrecord);
}
}
//This is running on another thread
public void LongRunningTask()
{
using(var session = _sessionFactory.OpenSession())
{
//Do something long here
}
}
Let's say I start LongRunningTask first. While it's running, I start ShortRunningTask on another thread. ShortRunningTask finishes and closes its session. Once LongRunningTask finishes, it tries to do something with the session it created, but an error gets thrown saying that the session has already been closed.
Clearly what's happening is that ISessionFactory.OpenSession() is not honoring the fact that I've opened 2 separate sessions. Closing the session opened in ShortRunningTask also closes the session in LongRunningTask. How can I fix this? Please help!
Thanks!
UPDATE
So apparently everyone thinks my fix is totally wrong. So here's the configuration I am using:
_sessionFactory = Fluently.Configure()
.Database(
FluentNHibernate.Cfg.Db.MsSqlConfiguration.MsSql2008
.ConnectionString(db => db.Is(
WikipediaMaze.Core.Properties.Settings.Default.WikipediaMazeConnection)))
.Mappings(m => m.FluentMappings.AddFromAssemblyOf<IRepository>())
.BuildSessionFactory();
I have no configuration taking place in an XML file. Should there be? What am I missing? Here's another example of how opening multiple sessions fails:
public void OpenMultipleSessionsTest()
{
using(var session1 = _sessionFactory.OpenSession())
{
var user = session1.Get<Users>().ById(1);
user.Name = "New Name";
using(var session2 = _sessionFactory.OpenSession())
{
//Do some other operation here.
}
session1.Update(user);
session1.Flush(); // Throws error 'ISession already closed!'
}
}
I figured out how to fix the problem. I set up my SessionFactory as a singleton and made it [ThreadStatic] like this:
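//[ThreadStatic] only has an effect on static fields
[ThreadStatic]
private static ISessionFactory _sessionFactory;
[ThreadStatic]
private static bool _isInitialized;
public ISessionFactory SessionFactory
{
get
{
if(!_isInitialized)
{
//Initialize the session factory here
_isInitialized = true;
}
return _sessionFactory;
}
}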
The crux of the problem is that creating sessions on separate threads from the same ISessionFactory is problematic. ISessionFactory does not like multiple ISessions being opened at the same time. Closing one, automatically closes any others that are open. Using the [ThreadStatic] attribute creates a separate ISessionFactory for each thread. This allows me to open and close ISessions on each thread without affecting the others.
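For readers more familiar with Java, the per-thread behaviour that [ThreadStatic] gives here corresponds roughly to a ThreadLocal with a lazy initializer. The sketch below is only an analogue of the mechanism: PerThreadFactory and its nested Factory class are hypothetical stand-ins and do not involve NHibernate.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PerThreadFactory {
    // Hypothetical stand-in for an object that is expensive to build.
    static final class Factory {
        final int id;

        Factory(int id) {
            this.id = id;
        }
    }

    private static final AtomicInteger CREATED = new AtomicInteger();

    // Each thread lazily gets its own Factory the first time it calls current().
    private static final ThreadLocal<Factory> FACTORY =
            ThreadLocal.withInitial(() -> new Factory(CREATED.incrementAndGet()));

    static Factory current() {
        return FACTORY.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Factory mine = current();
        Factory[] other = new Factory[1];
        Thread worker = new Thread(() -> other[0] = current());
        worker.start();
        worker.join();
        System.out.println("same thread, same instance: " + (mine == current()));
        System.out.println("other thread, same instance: " + (mine == other[0]));
    }
}
```

One consequence of this design is that the initialization cost is paid once per thread rather than once per process.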

File Read/Write Locks

I have an application where I open a log file for writing. At some point in time (while the application is running), I opened the file with Excel 2003, which said the file should be opened as read-only. That's OK with me.
But then my application threw this exception:
System.IO.IOException: The process cannot access the file because another process has locked a portion of the file.
I don't understand how Excel could lock the file (to which my app has write access), and cause my application to fail to write to it!
Why did this happen?
(Note: I didn't observe this behavior with Excel 2007.)
Here is a logger that takes care of synchronization locks. (You can modify it to fit your requirements.)
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
namespace Owf.Logger
{
public class Logger
{
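// Dedicated lock object; string.Empty is a shared instance, so locking on it could contend with unrelated code.
private static readonly object syncContoller = new object();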
private static Logger _logger;
public static Logger Default
{
get
{
if (_logger == null)
_logger = new Logger();
return _logger;
}
}
private Dictionary<Guid, DateTime> _starts = new Dictionary<Guid, DateTime>();
private string _fileName = "Log.txt";
public string FileName
{
get { return _fileName; }
set { _fileName = value; }
}
public Guid LogStart(string mesaage)
{
lock (syncContoller)
{
Guid id = Guid.NewGuid();
_starts.Add(id, DateTime.Now);
LogMessage(string.Format("0.00\tStart: {0}", mesaage));
return id;
}
}
public void LogEnd(Guid id, string mesaage)
{
lock (syncContoller)
{
if (_starts.ContainsKey(id))
{
TimeSpan time = (TimeSpan)(DateTime.Now - _starts[id]);
LogMessage(string.Format("{1}\tEnd: {0}", mesaage, time.TotalMilliseconds.ToString()));
}
else
throw new ApplicationException("Logger.LogEnd: Key doesn't exist.");
}
}
public void LogMessage(string message)
{
lock (syncContoller)
{
string filePath = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);
if (!filePath.EndsWith("\\"))
filePath += "\\owf";
else
filePath += "owf";
if (!Directory.Exists(filePath))
Directory.CreateDirectory(filePath);
filePath += "\\" + _fileName; // honor the FileName property (defaults to Log.txt)
lock (syncContoller)
{
using (StreamWriter sw = new StreamWriter(filePath, true))
{
sw.WriteLine(DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss.fff") + "\t" + message);
}
}
}
}
}
}
How do you write the log? Do you have your own open/close code, or do you use some third-party product?
I think that the log is opened and locked only while something is being written. Once the write is finished, the code closes the file and, of course, releases the lock.
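The open-write-close pattern described here can be sketched as follows. This is a JDK-only illustration under assumptions: AppendPerWriteLog is a made-up class, and whether another program such as Excel can still interfere between two writes depends on the operating system and on how that program opens the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class AppendPerWriteLog {
    private final Path file;

    AppendPerWriteLog(Path file) {
        this.file = file;
    }

    // Opens the file, appends one line, and closes it again before returning,
    // so no handle is held between two log calls.
    synchronized void log(String message) throws IOException {
        Files.write(file, List.of(message), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch-log-", ".txt");
        AppendPerWriteLog log = new AppendPerWriteLog(tmp);
        log.log("first");
        log.log("second");
        System.out.println(Files.readAllLines(tmp));
        Files.delete(tmp);
    }
}
```

The trade-off is one open and one close per message, which is slower than keeping a writer open but keeps the window in which the file is held as short as possible.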
This seems like a .NET issue. (Well; a Bug if you ask me).
Basically I have replicated the problem by using the following multi-threaded code:
Dim FS As System.IO.FileStream
Dim BR As System.IO.BinaryReader
Dim FileBuffer(-1) As Byte
If System.IO.File.Exists(FileName) Then
Try
FS = New System.IO.FileStream(FileName, System.IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read)
BR = New System.IO.BinaryReader(FS)
Do While FS.Position < FS.Length
FileBuffer = BR.ReadBytes(&H10000)
If FileBuffer.Length > 0 Then
' ... do something with the file here ...
End If
Loop
BR.Close()
FS.Close()
Catch
ErrorMessage = "Error(" & Err.Number & ") while reading file:" & Err.Description
End Try
End If
Basically, the bug is that trying to READ the file with any of the different share modes (READ, WRITE, READ_WRITE) has absolutely no effect on the file locking. No matter what you try, you always end up with the same result: the file is LOCKED and not available to any other user.
Microsoft won't even admit to this problem.
The solution is to use the internal Kernel32 CreateFile APIs to get the proper access done as this would ensure that the OS LISTENs to your request when requesting to read files with a share-locked or locked access.
I believe I'm having the same type of locking issue, reproduced as follows:
User 1 opens Excel2007 file from network (read-write) (WindowsServer, version unkn).
User 2 opens same Excel file (opens as ReadOnly, of course).
User 1 successfully saves file many times
At some point, User 1 is UNABLE to save the file due to message saying "file is locked".
Close down User 2's ReadOnly version...lock is released, and User 1 can now save again.
How could opening the file in ReadOnly mode put a lock on that file?
So, it seems to be either an Excel2007 issue, or a server issue.
