Apache POI in Windows Server 2012 R2

We have a set of utility programs that read an .xlsx file for some input data and generate reports; Apache POI is used for this purpose. The Excel file has 8 sheets with an average of 50 rows and 20 columns of data each. Everything was working fine on a normal Windows 7 box (read: a developer machine), where reading the file finishes in a few seconds.
Recently we moved these jobs to a Windows Server 2012 R2 box and noticed that the last sheet in the Excel file takes a very long time to finish reading. I duplicated the last sheet to confirm that this is not a data issue and re-ran the job: the second-to-last sheet (which was the last one in the previous execution) finished reading in milliseconds, while the last one (the duplicated sheet) again got stuck for 15 minutes. My best guess is that the time taken to close the file is getting too high, but that is just a guess with no concrete evidence, and if it is the case I am not sure why. The only difference between the working and non-working Windows boxes is the OS; all other configuration is the same. I have analyzed the heap and thread dumps and found no issues.
Are there any known compatibility issues between POI and Windows Server boxes? Or is it something related to the code? We are using the POI-XSSF implementation.
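For reference, a minimal sketch of the kind of read loop involved (the file name is a placeholder, and this is assumed typical XSSF usage rather than the exact job code), with timing added to localize the slowdown per sheet:

    import java.io.FileInputStream;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.usermodel.Sheet;
    import org.apache.poi.xssf.usermodel.XSSFWorkbook;

    public class SheetTimer {
        public static void main(String[] args) throws Exception {
            long t0 = System.nanoTime();
            try (FileInputStream in = new FileInputStream("report-input.xlsx"); // placeholder name
                 XSSFWorkbook workbook = new XSSFWorkbook(in)) {
                // XSSF parses the whole file on open, so time that step separately
                System.out.println("open: " + (System.nanoTime() - t0) / 1_000_000 + " ms");
                for (int i = 0; i < workbook.getNumberOfSheets(); i++) {
                    long t1 = System.nanoTime();
                    Sheet sheet = workbook.getSheetAt(i);
                    int rows = 0;
                    for (Row row : sheet) {
                        rows++; // touch every row so nothing is skipped
                    }
                    System.out.println(sheet.getSheetName() + ": " + rows + " rows in "
                            + (System.nanoTime() - t1) / 1_000_000 + " ms");
                }
            }
        }
    }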

OK, we finally found the problem: the issue was with the VM itself. Disk I/O was constantly at 100%, and file reads/writes were taking a very long time to complete, which caused the program to get stuck there. However, we couldn't identify why the disk I/O was so high; we tried suggestions from some blogs, but they didn't work, so we downgraded the OS to Windows Server 2008 and it worked well.
Note that this had nothing to do with POI; it was purely a VM/OS issue.
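For anyone debugging something similar: a quick way to confirm the bottleneck is the disk rather than POI is to time a raw read of the same file with no parsing involved. A minimal sketch (the file name is a placeholder):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class RawReadTimer {
        public static void main(String[] args) throws Exception {
            Path file = Paths.get("report-input.xlsx"); // placeholder path
            long start = System.nanoTime();
            byte[] bytes = Files.readAllBytes(file);    // raw disk read only, no POI parsing
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(bytes.length + " bytes in " + elapsedMs + " ms");
        }
    }

If this alone takes minutes, the problem is the disk or the VM, not the workbook parsing.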

Related

When redemption.dll is loaded and operational, Excel 2016 pauses before launching saved files from disk

I have identified a possible issue with Redemption slowing down Excel when you open a file from disk (with Excel closed): it introduces a delay of at least 4-5 seconds.
If Excel is already open, files open from disk immediately. The problem also goes away when the program that loaded the redemption.dll process is closed.
If you launch Excel from the command line with the file as an argument, it also opens immediately, e.g. "c:\path to office\excel.exe" myfile.xls. It is only when you double-click the file that there is a pause.
All the important helpful bits
Machine is a 2016 RDS host (latest patches, including the Jan 2020 patch), 8 cores, 16 GB memory; both users and admins are affected (tested multiple accounts)
Office is 2016 standard edition with all items installed (again fully patched)
Redemption.dll version is 5.20.0.5298
I have tried just de-registering redemption.dll, but I suspect the program that launches it looks for it where it expects to find it and re-registers it (kind of expected, to be honest). If I de-register it and also delete the DLL from disk, the program falls back to the Outlook security-prompt ("tick the box") method, and Excel is then not slowed down at all when launching files with Excel closed.
Side note: Word is unaffected
Thanks in advance
The issue turned out to be a setting within the software that uses Redemption, something calendar-related: the setting we altered to resolve it related to diary monitoring.
Thanks.

Jet.OLEDB or ACE.OLEDB MS Access

I'm using Excel VBA to pull data from an MS Access DB, with Excel 2013 and Access 2013 32-bit. The code has historically used:
Provider=Microsoft.Jet.OLEDB.4.0;
However, some computers have been upgraded to Excel 2016 64-bit, and the Jet provider is not available for 64-bit. I have changed the code to:
Provider=Microsoft.ACE.OLEDB.12.0;
which works on both 64-bit and 32-bit systems. However, I have noticed a significant drop in loading/saving speed just from changing this line. Does anyone know why this might be and how I can improve it?
You are correct that you have to choose the ACE provider for x64.
And the big advantage of JET was that it was (and still is) installed on all copies of Windows by default, so there was no need to install Access, the Access runtime, or (previously) the Office connectivity package.
As for performance? There have been a few reports about the performance of ACE x64.
However, one trick or suggestion is to ensure that the connection stays open. In other words, are you sure it is the row processing that is slow, rather than the overall time?
Perhaps put a test message box, or a timing test, in your code, e.g.:
Dim T As Single
T = Timer()
' your code here
Debug.Print Timer() - T
The above will print the elapsed time to the debug window (in the VBA IDE, hit Ctrl+G to display the Immediate window).
The reason I suggest the force-open idea is that you often find ACE takes a VERY long time to open, but once open, data reading performs well (the same as before).
So, I suggest you check for this and try the following fix.
Open a table (any table) and KEEP it open. Now run your existing code (which may well open and close other tables). The issue is that when ACE attempts to open a table, it tries to put locks on the mdb/accdb file, and it is this locking process that takes a VERY long time.
However, if you force (keep) one table open, this very slow process of ACE attempting to lock the file for read/write does not occur each time you execute a query or create additional recordsets in code.
So, if the row-reading speed is fast but the time to START and open is very slow, then before you run and test your routines, force open a table into some recordset (keep it active and in scope), and THEN try your code.
I find that 9 times out of 10 this eliminates the slowness, and often the results are nothing short of spectacular (it will run faster than before!).

Excel 2016 Upgraded From 2007; Any Workarounds for Apparent Memory Issue?

This isn't really a programming question per se, but at work we have been forced to upgrade from Excel 2007 to Excel 2016, which has caused some productivity issues with respect to opening multiple workbooks at once.
The problem is that our entire file system has a bunch of linked formulas and iterative calculations for which it is necessary to have many workbooks open at once (around 60+). With Excel 2007 we could do this easily on relatively low-powered computers (4 GB of RAM with a mediocre hyper-threaded i3 processor), but since the change to Excel 2016 we keep getting an error telling us to "upgrade to 64-bit Excel" or "install more physical memory". We have already upgraded to 64-bit Excel, and on top of that I tried co-workers' computers with 8 and 16 GB of RAM, which made little to no difference. Does this imply that Excel 2016 is simply not well optimized for having this many workbooks open at the same time? Are there solutions or workarounds for this problem? I find it hard to believe that 16 GB of RAM is still insufficient when 2007 handled this process easily; perhaps the change from MDI to SDI means less efficiency?
As an aside: yes, I do wish we were not using Excel for such a computationally expensive process, which Excel may not have been designed for, but moving everything to something like SAS would take a lot of time, I imagine, and ultimately I don't make the decisions :(
Thanks for any help.

QFileSystemWatcher - does it need to run in another thread?

I have a class that does some parsing of two large CSV files (~90K rows and 11 columns in the first, ~20K rows and 5 columns in the second). According to the specification I'm working with, the CSV files can be changed externally (rows removed/added; the columns remain constant, as do the paths). Such updates can happen at any time (though it is highly unlikely that updates will arrive less than a couple of minutes apart), and an update to either of the two files has to terminate the current processing of all that data (CSV, XML from an HTTP GET request, UDP telegrams), followed by re-parsing the content of both files (or just one, if only one has changed).
I keep the CSV data (quite reduced, since I apply multiple filters to remove unwanted entries) in memory to speed up working with it and to avoid unnecessary I/O operations (opening, reading, and closing files).
Right now I'm looking into QFileSystemWatcher, which seems to be exactly what I need. However, I'm unable to find any information on how it actually works internally.
Since all I need is to monitor two files for changes, the number of files shouldn't be an issue. Do I need to run the watcher in a separate thread (it lives in the same class where the CSV parsing happens), or is it safe to say it can run without too much fuss (that is, it works asynchronously, like QNetworkAccessManager)? My dev environment for now is a 64-bit Ubuntu VM (VirtualBox) on a relatively powerful host (an HP Z240 workstation), but the target system is an embedded one. While parsing the CSV files takes 2-3 seconds at most, I don't know how much of a performance impact there will be once the application is deployed, so additional overhead is a concern of mine.
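No answer is recorded for this one, but for comparison only, here is a minimal sketch of the same watch-two-files idea in plain Java using java.nio's WatchService (not Qt; the directory and file names are placeholders). Note that WatchService registers directories rather than individual files, so you watch the parent directory and filter events:

    import java.nio.file.*;

    public class TwoFileWatcher {
        public static void main(String[] args) throws Exception {
            Path dir = Paths.get("/data");        // placeholder directory holding both CSVs
            Path first = Paths.get("first.csv");  // placeholder file names
            Path second = Paths.get("second.csv");

            WatchService watcher = FileSystems.getDefault().newWatchService();
            dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY,
                                  StandardWatchEventKinds.ENTRY_CREATE,
                                  StandardWatchEventKinds.ENTRY_DELETE);

            while (true) {
                WatchKey key = watcher.take();             // blocks until an event arrives
                for (WatchEvent<?> event : key.pollEvents()) {
                    Path changed = (Path) event.context(); // path relative to the watched dir
                    if (changed.equals(first) || changed.equals(second)) {
                        // cancel current processing and re-parse the changed file here
                        System.out.println("re-parse needed: " + changed);
                    }
                }
                if (!key.reset()) {
                    break; // watched directory is no longer accessible
                }
            }
        }
    }

Because take() blocks, this loop would live on its own thread in a real application; QFileSystemWatcher, by contrast, delivers its signals through the Qt event loop of the thread that owns it.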

Pocket PC 2003 C# Performance Issues...Should I Thread It?

Environment
Windows XP SP3 x32
Visual Studio 2005 Standard Edition
Honeywell Dolphin 9500 Pocket PC/Windows Mobile 2003 Platform
Using the provided Honeywell Dolphin 9500 VS2005 SDK
.NET Framework 1.1 and .NET Compact Framework 1.0 SP3
Using VC#
Problem
When I save an image from the built-in camera and the Honeywell SDK ImageControl to the device's storage card or internal memory, it takes 6-7 seconds.
I am currently saving the image as a PNG but have the option of a BMP or JPG as well.
Relevant lines in the code: 144-184 and 222, specifically 162,163 and 222.
Goal
I would like to reduce that time down to something like 2 or 3 seconds, and even less if possible.
As a secondary goal, I am looking for a profiling suite for Pocket PC 2003 devices that specifically supports .NET Compact Framework 1.0. Ideally free, but an unfettered trial with a short tutorial would work as well.
Things I Have Tried
I looked into asynchronous I/O via System.Threading a little, but I do not have the experience to know whether this is a good idea, nor exactly how to implement threading for a single operation.
With threading implemented as it is in the code below, there seems to be a trivial speed increase of maybe a second or less. However, something on the next Form requires the image, which may still be in the process of being saved, and I do not know how to mitigate the wait or handle that scenario at all, really.
EDIT: Changing the save format from PNG to BMP or JPG, combined with the threading, seems to reduce the save time considerably.
Code
http://friendpaste.com/3J1d5acHO3lTlDNTz7LQzB
Let me know if the code should just be posted here in code tags. It is a little long (~226 lines), so I went ahead and friendpasted it, as that seemed to be acceptable in my last post.
By changing the save format from PNG to BMP and including the threading code shown at the Code link, I was able to reduce the save time to ~1 second.
You're at the mercy of the Honeywell SDK on this one, since their control does the actual saving of the image. Calling it on a separate thread (i.e., not the UI thread) isn't going to help (as you've found out), and it will actually make things more difficult for you, since you need to wait until the save task has completed before moving on to the next form.
The only suggestion I can make is to ensure you're saving the image to internal memory (and not to the SD card), since writing to an SD card usually takes significantly longer than writing to memory. Or see if you can get technical support from Honeywell; 6-7 seconds seems far too long for a task like this.
Or see if the Honeywell SDK lets you get the image as a byte array (instead of saving it to disk). If that call returns in less than 6-7 seconds, you can handle persisting it yourself.
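This thread is about .NET CF 1.0, but purely as an illustration of the save-in-the-background-then-wait pattern described above, here is a minimal sketch in Java (all names and the output path are hypothetical):

    import java.io.FileOutputStream;

    public class BackgroundSave {
        public static void main(String[] args) throws Exception {
            byte[] imageBytes = new byte[640 * 480]; // stand-in for the camera image bytes
            Thread saver = new Thread(() -> {
                try (FileOutputStream out = new FileOutputStream("capture.bmp")) { // placeholder path
                    out.write(imageBytes); // persist off the main/UI thread
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
            saver.start();
            // ... other work that does not need the saved file can happen here ...
            saver.join(); // block before anything (e.g., the next form) uses the image
            System.out.println("image saved; safe to move on");
        }
    }

The join() is the key point: the background save only buys you time if there is useful work to do before the image is needed again.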
