Apache POI: Transfer time from cell to cell

Need in short: I'd like to copy a time value from one cell to another cell.
Problem in short: POI 5.2.2 (more specifically, DateUtil.internalGetExcelDate) turns the time 08:00 from the input cell into the numeric value -1.00.
More details: There's an xlsx file (created with LibreOffice 7.0.4.2) with the time '08:00:00' in it:
I can read that value with sourceCell.getLocalDateTimeCellValue(), which is fine.
But when I try to transfer that value into another cell (targetCell.setCellValue(sourceCell.getLocalDateTimeCellValue())), in the targetCell there is the value -1 instead of the expected 08:00:00 o'clock.
Here's a screenshot while debugging the setCellValue call:
And here's a screenshot while debugging the DateUtil.internalGetExcelDate call:
Possible workaround: I guess it would work to inspect the LocalDateTime from the sourceCell and, if its year is < 1904, add some years so that the resulting LocalDateTime is not transformed to -1.00 by DateUtil.internalGetExcelDate.
This is something I don't want to do because that would set another value in the targetCell than there was in the sourceCell.
Another workaround: Another workaround would be to use LocalDateTime.now(), set the hour and minute, call targetCell.setCellValue(...) and then change the format like this:
short format = workbook.createDataFormat().getFormat("HH:MM:SS");
CellStyle cellStyle = workbook.createCellStyle();
cellStyle.setDataFormat(format);
targetCell.setCellStyle(cellStyle);
Unfortunately I don't know whether the sourceCell just contains a time or whether it contains a full timestamp. I just want to copy cell contents (which works fine with String, Number, ...).
Actual workaround: As a current (working) workaround I check the year and if it's <1900 then I set another year at the date, set the modified LocalDateTime into the cell and set the dataformat (see workaround description above).
Question: How can I transfer a LocalTime value from one cell to another cell (without manipulating the year)? I guess that my (working) workaround should not be the answer ...

Excel cell data types
Microsoft Excel only has the following cell data types:
String (alphanumeric)
Numeric (floating point number)
Boolean (true or false)
Formula (formula strings)
Error (internally error codes)
Empty (empty cell)
There is no special date cell data type, nor is there a special integer cell data type.
How Excel stores date or time or date-time
Excel stores a date-time as a floating-point number in which 1.00 is 1900-01-01 00:00:00.000. (The exception is a workbook set to the 1904 date system, but that only changes what 1.00 means.)
Adding 1.00 means adding one day. Adding 1/24 means adding one hour. Adding 1/24/60 means adding one minute. Adding 1/24/60/60 means adding one second. Adding 1/24/60/60/10 means adding a tenth second and so on.
For cell values lower than 1.00, Excel interprets that as time in day 0 of month 1 in year 1900 (or 1904 if Excel has set 1904-Date). There 1/24 means one hour. 1/24 + 1/24/60 means one hour and one minute and so on. So your 08:00:00 is the cell value 8*1/24 (8 hours) = 1/3.
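The arithmetic above can be checked outside Excel. Here is a minimal sketch in plain Java (no POI involved; the class and method names are made up for illustration) that converts a fraction of a day to a time of day and back, rounding to whole milliseconds:

```java
import java.time.LocalTime;

final class ExcelTimeFraction {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Converts a fraction of a day (0.0 <= fraction < 1.0) to a time of day.
    static LocalTime toLocalTime(double fraction) {
        if (fraction < 0.0 || fraction >= 1.0) {
            throw new IllegalArgumentException("expected 0 <= fraction < 1, got " + fraction);
        }
        long millis = Math.round(fraction * MILLIS_PER_DAY);
        // Rounding can land exactly on the next midnight; clamp to the last millisecond.
        millis = Math.min(millis, MILLIS_PER_DAY - 1);
        return LocalTime.ofNanoOfDay(millis * 1_000_000L);
    }

    // Converts a time of day to the fraction of a day that a cell would hold.
    static double toFraction(LocalTime time) {
        return time.toNanoOfDay() / (MILLIS_PER_DAY * 1_000_000.0);
    }

    public static void main(String[] args) {
        System.out.println(toLocalTime(8 * 1.0 / 24));          // eight hours
        System.out.println(toLocalTime(1.0 / 24 + 1.0 / 24 / 60)); // one hour and one minute
        System.out.println(toFraction(LocalTime.of(8, 0)));
    }
}
```

Rounding before converting is a deliberate choice here: values such as 1/3 are not exactly representable as a double, so truncating could come out one millisecond short.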
When reading Excel cell values the only way to determine whether Excel interprets a cell value as a date or time or date-time is to get the cell's number format too. If that is a date or time or date-time format, then Excel interprets a cell value of 1.00 as 1900-01-01, a cell value of 1/24 as 01:00:00, a cell value of 1/24 + 1/24/60 as 01:01:00 and so on.
But your observation is correct. When Apache POI reads a date-time from an Excel cell that holds only a time, that is, a numeric (double) value between 0.00 and 1.00, it returns a date-time on the day 1899-12-31. That is not what Excel does: for Excel, the time-only value lies on day 0 of month 1 in year 1900. If Apache POI is then asked to set a date-time value such as 1899-12-31 08:00:00, it writes -1, because Excel cannot hold date-time values before 1900.
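To make the failure mode concrete, here is a simplified model of the conversion in plain Java. It is not POI's source code. It only mimics the behaviour reported in the question (-1 for any date before 1900) and assumes the 1900 date system; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

final class ExcelSerialModel {
    private static final LocalDate DAY_ZERO = LocalDate.of(1899, 12, 31);
    private static final double NANOS_PER_DAY = 86_400_000_000_000.0;

    // Simplified model: Java date-time -> Excel serial number (1900 date system).
    static double toSerial(LocalDateTime dateTime) {
        if (dateTime.getYear() < 1900) {
            return -1; // no valid serial exists before 1900, as observed in the question
        }
        long days = ChronoUnit.DAYS.between(DAY_ZERO, dateTime.toLocalDate());
        if (days >= 60) {
            days++; // Excel counts a 29 February 1900 that never existed
        }
        return days + dateTime.toLocalTime().toNanoOfDay() / NANOS_PER_DAY;
    }

    public static void main(String[] args) {
        // The time-only value as read back from the source cell: year 1899, so no serial.
        System.out.println(toSerial(LocalDateTime.of(1899, 12, 31, 8, 0)));
        System.out.println(toSerial(LocalDateTime.of(1900, 1, 1, 0, 0)));
    }
}
```

The point of the sketch is that the round trip loses information: the fraction 1/3 becomes 1899-12-31T08:00 on the way in, and that date has no serial number on the way out.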
So the only way to set time values in Excel cells is to set numeric (double) values between 0.00 and 1.00 and set a cell style having a number format of HH:MM:SS. One cannot set a time only Excel cell value from a Java date-time value, because there is not a Java date-time value which can have day 0 of month 1 in year 1900.
So if DateUtil.isCellDateFormatted tells that the source cell is date formatted and the numeric (double) cell value of that cell is lower than 1, then set that numeric (double) cell value to the new cell and format that cell the same as the source cell.
Complete example to test:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
public class ExcelSetCellValue {
    public static void main(String[] args) throws Exception {
        // For the binary format use "./ExcelWithTime.xls" and "./ExcelWithTimeNew.xls" instead.
        String filePath = "./ExcelWithTimeNew.xlsx";
        try (FileInputStream in = new FileInputStream("./ExcelWithTime.xlsx");
             Workbook workbook = WorkbookFactory.create(in)) {
            Sheet sheet = workbook.getSheetAt(0);
            Cell sourceCell = sheet.getRow(0).getCell(0); // get cell value from A1: 08:00:00
            // Get or create the target cell F6.
            Row targetRow = sheet.getRow(5);
            if (targetRow == null) targetRow = sheet.createRow(5);
            Cell targetCell = targetRow.getCell(5);
            if (targetCell == null) targetCell = targetRow.createCell(5);
            if (sourceCell.getCellType() == CellType.NUMERIC && DateUtil.isCellDateFormatted(sourceCell)) {
                System.out.println(sourceCell.getNumericCellValue()); // 8*1/24 = 1/3
                System.out.println(sourceCell.getLocalDateTimeCellValue()); // 1899-12-31T08:00
                // Does not work because sourceCell.getLocalDateTimeCellValue() is in year 1899:
                //targetCell.setCellValue(sourceCell.getLocalDateTimeCellValue());
                // Copy the raw numeric value and the style (which carries the number format) instead.
                targetCell.setCellValue(sourceCell.getNumericCellValue());
                targetCell.setCellStyle(sourceCell.getCellStyle());
            }
            try (FileOutputStream out = new FileOutputStream(filePath)) {
                workbook.write(out);
            }
        }
    }
}

Related

How to extract data from a field that is apparently of the date / time format but is not really such a format

I have copied (copy/paste) a part of a home-page into an empty excel. One of the fields looks like this: 3140:01:00. If I check the format, it shows that the category is custom, and that the type is [t]:mm:ss. The problem is that I am only interested in the first 4 digits shown plus digits 6 and 7. If I change the format to e.g. text, I end up with a number. Probably a number, that identifies a specific date. In fact the first 4 digits are the length of a horse race! :-) I'm new at VB, but I have managed to clean up the rest of the information - but not this. Probably a known problem. Please help!
You will need to understand the difference between the value and the representation of your data (note that this is not VBA but Excel related). When you enter 3140:01:00 in a cell in Excel, Excel tries to understand what you entered. With the colons, it looks somewhat like a time value, so Excel guesses that this is a time, converts what you entered into a date value (a date in Excel automatically has a time part) and applies a number format that displays this date+time as [h]:mm:ss.
As I said, internally, what you entered is converted into a Date. A Date in Excel is stored internally as a number. If you set the number format to "Number", the cell will display 130.83402. This is because 3140 hours = 130 days + 20 hours. The 20 hours (plus the 1 minute) are stored as a fraction of a day (0.83402).
If you format the same value as Date/Time, you will see (depending on your regional settings) something like 05/09/1900 20:01:00 - because that is the 130th day in the Excel calendar (day 1 in Excel is 1/1/1900). Note that the value of the cell doesn't change, only the way it is displayed.
If you could prevent Excel from converting your input into a date, the solution would be to do string handling, e.g. use the Split function. When you format a cell as Text and enter 3140:01:00 manually, Excel leaves the string untouched and this would work. However, it seems that when you paste the value into the cell, the number format is set automatically and the value is converted into a date even if the cell was formatted as Text before. I don't know if there is a way to tell Excel not to convert the data when it is pasted.
So what we can do instead is to convert the date value back into "hours", "minutes" and "seconds" - even if the "hours" are in fact something else (meters? yards? horse length?), and the minutes are probably also not minutes but whatever.
Several ways to do so.
If you don't mind that the strange pseudo-date value remains in your Excel (you can hide the column with that value), use just 2 simple formulas. Assuming your "date" is in D2:
use the formula =TRUNC(24*D2) to get the horse race length (the first number). We cannot use the Hour-formula here as this would return only 20 and not 3140.
use the formula =MINUTE(D2) to get the second number
use the formula =SECOND(D2) to get the third number
If you want to involve VBA:
Sub SplitStrangeDate(cell As Range)
    If Not IsDate(cell) Then Exit Sub
    Dim d As Date
    d = cell.Value
    Dim v1 As Long, v2 As Long, v3 As Long
    ' Int truncates like TRUNC in the formula above; CLng would round up from 30 minutes on
    v1 = Int(d * 24)
    v2 = Minute(d)
    v3 = Second(d)
    Debug.Print v1, v2, v3
End Sub
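For comparison, the same split can be sketched outside Excel in plain Java. This is an illustration only, with made-up class and method names; it assumes the cell's underlying serial value (130.83402... for 3140:01:00) has already been read as a double. It rounds to whole seconds first so that floating-point noise does not push the hour count down by one.

```java
final class ElapsedTimeSplit {
    // Splits an Excel-style serial value (days, with the time as a fraction of a day)
    // into elapsed hours, minutes and seconds, the way an [h]:mm:ss format displays it.
    static long[] split(double serial) {
        if (serial < 0) {
            throw new IllegalArgumentException("negative serial values are not supported");
        }
        long totalSeconds = Math.round(serial * 86_400); // 86,400 seconds per day
        long hours = totalSeconds / 3_600;               // may exceed 23
        long minutes = (totalSeconds % 3_600) / 60;
        long seconds = totalSeconds % 60;
        return new long[] {hours, minutes, seconds};
    }

    public static void main(String[] args) {
        // 3140 hours plus 1 minute, expressed in days.
        long[] parts = split(3140 / 24.0 + 1 / 1440.0);
        System.out.println(parts[0] + ":" + parts[1] + ":" + parts[2]);
    }
}
```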

day and month are reversed

I have a cell with the following content:
01/02/2015
The cell is date formatted.
Then I copy the value and put it in my module class:
Set Line = New CTravelLine
Line.Date= Cells(1, 8).value
Everything works fine until the moment I put this value in another cell:
The value 01/02/2015 becomes 02/01/2015.
I am using this format (dd/mm/yyyy). I have the impression that when the days are numerically lower than the month, the 2 values are reversed. The values are reversed whatever the method I tried:
Method 1:
Dim WrdArray() As String, datetest As String
WrdArray() = Split(travelLine.Date, "/")
datetest= WrdArray(0) & "/" & WrdArray(1) & "/" & WrdArray(2)
Cells(5, 5) = datetest
Method 2:
Cells(5, 5) = travelLine.Date
I don't understand how I can solve this problem.
This might have happened due to 'Regional formatting problem'.
Excel has a habit of forcing the American date format (mm/dd/yyyy) when the dates have been imported from another data source. So, if the day in your date happens to be 1 - 12, then Excel will switch the date to mm/dd/yyyy.
When dates are imported from a text file, there is an option in the VBA code to apply regional format which corrects this problem.
OR
Change the number format of the date column in the Excel sheet from the 'date' category to 'text' and save it.
(After saving, run the VBA code if you have any. Then check whether the format is still 'text' or has changed back to 'date'.)
If it has changed back to 'date', try to set it to 'text' again.
If it is 'text', correct the erroneous date cells and save the Excel sheet. This stops the dates from being changed automatically to the American format.
Long story short, I had a similar problem where the dates worked just fine in some cells but kept flipping in others, regardless of whether I copy-pasted or entered them manually. I tried the Text to Columns and cell formatting solutions, and none of that worked.
The solution actually is not in excel, it's in the region and language setting.
To have the dates display as MM/DD/YYYY in the formats tab change the format to US.
To have the dates display as DD/MM/YYYY in the formats tab change the format to UK.
I had the same issue as you.
Let me explain what I want to do:
I have a csv file with some dates.
I copy a range of my sheet into an array variable. The range contains some columns with dates.
I do some (very basic) manipulations on my array.
I transpose my array (since only the last dimension of an array can be resized).
I write my array to a new sheet.
What I found:
There is no date issue after executing step 1-4. The date issue shows up when writing on the sheet...
Considering what Avidan said on the post of Feb 24 '15 at 13:36, I guess it is excel which forces the American format mm/dd/yyyy... So I just change the date format at the very beginning of my program :
Before starting any manipulation date:
do
.Cells("where my date is") = Format(.Cells("where my date is"), "mm dd yy")
execute your program
write the table on a sheet
back up to the date format you like directly on the sheet
Just use:
Line.Date = CDate(Cells(1, 8).Value2)

Date format dd/mm/yyyy read as mm/dd/yyyy

I have a spreadsheet with a column formatted as:
Category: Date
Type: *dd/mm/yyyy
Location: UK
When I read the data in this column via VBA, it reads in the format mm/dd/yyyy.
For example, 10/06/2014 (10 June 2014) is reading 06/10/2014 (06 Oct 2014).
My code: sDate = SourceSheet.Range("AB" & CurRow.Row).Value
I have this issue with my forms too and the best method for me is to format the textbox like this:
sDate = format(SourceSheet.Range("AB" & CurRow.Row).Value, "mm/dd/yyyy")
Even though the date format is wrong in VBA, it seems to work the right way round in Excel. It's weird, I can't explain why it happens, but this fixes it for me. Whenever I go from VBA to Excel, I almost always find this issue if the value is stored as a date.
Consider:
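Sub luxation()
    Dim sDate As Date, CurRow As Range
    Dim SourceSheet As Worksheet, ary() As String
    Set SourceSheet = ActiveSheet
    Set CurRow = Range("A1")
    ' Read the displayed text (dd/mm/yyyy) and build the date from its parts explicitly
    ary = Split(SourceSheet.Range("AB" & CurRow.Row).Text, "/")
    sDate = DateSerial(ary(2), ary(1), ary(0))
    MsgBox Format(sDate, "dd mmmm yyyy")
End Sub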
This question of mine - .NumberFormat sometimes returns the wrong value with dates and times - gives some background which may help.
I first encountered this VBA bug many years ago and it is worse than it seems. I noticed that many - but not all - dates in a worksheet that I had been updating for a year were wrong. It took me a long time to diagnose the problem. Those dates that could be interpreted as middle endian dates had been corrupted but those that could not be interpreted as middle endian dates were unchanged. So 12/06/2014 will become 6 December but 13/06/2014 will remain 13 June. If 13/06/2014 had been rejected as an invalid date or left as a string, I would have spotted the error immediately. The dual interpretation meant that every date was imported as a date (the wrong date, but still a date), which ensured I did not notice until much later, maximising the cost of correcting for the bug.
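The dual interpretation is easy to reproduce in any language that parses dates from strings. A minimal Java sketch (the class and method names are made up, and this illustrates the ambiguity only, not VBA's internal conversion logic):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class DateAmbiguity {
    private static final DateTimeFormatter DAY_FIRST = DateTimeFormatter.ofPattern("dd/MM/uuuu");
    private static final DateTimeFormatter MONTH_FIRST = DateTimeFormatter.ofPattern("MM/dd/uuuu");

    static LocalDate parseDayFirst(String text) {
        return LocalDate.parse(text, DAY_FIRST);
    }

    static LocalDate parseMonthFirst(String text) {
        return LocalDate.parse(text, MONTH_FIRST);
    }

    public static void main(String[] args) {
        // Valid under both readings, so a wrong guess goes unnoticed.
        System.out.println(parseDayFirst("12/06/2014"));
        System.out.println(parseMonthFirst("12/06/2014"));
        // Only valid day-first; a strict month-first parser rejects it.
        try {
            parseMonthFirst("13/06/2014");
        } catch (DateTimeParseException e) {
            System.out.println("rejected: " + e.getParsedString());
        }
    }
}
```

A strict parser fails loudly on 13/06/2014, which is the behaviour the paragraph above wishes VBA had; a lenient one that silently falls back to the other reading produces exactly the mixed corruption described.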
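Excel holds dates and times as numbers. "17 June 2014" is held as 41807 and "1 January 1900" is held as 1. The value is a day count in which 1 stands for 1 January 1900 (for dates after February 1900 the count also includes a 29 February 1900 that never existed but that Excel treats as valid). Times are held as a fraction: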
number of seconds since midnight
--------------------------------
seconds in a day
So 06:00, 12:00 and 18:00 are held as 0.25, 0.5 and 0.75.
This bug is encountered when the transfer of a date involves a conversion to and from string format. I have not discovered a single case in which the conversion from date to string has been wrong. It is the conversion from string to date that hits this bug.
I can see that SilverShotBee's solution will avoid the bug but it would not appeal to me. I no longer use any ambiguous dates ever.
One choice is to transfer the value as a number. If cell A3 contains the date and time "17 June 2014 9:00" then CDbl(Range("A3").Value) returns 41807.375. When you store this number in a cell you will need to set the cell's NumberFormat to the date format of your choice but that might be a good thing.
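To show what such a transferred number means, here is a small Java sketch that turns a serial value back into a date and time. It assumes the 1900 date system and whole-second precision, the names are hypothetical, and it is not taken from any library.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

final class ExcelSerialToDate {
    private static final LocalDate DAY_ZERO = LocalDate.of(1899, 12, 31);

    // Converts an Excel serial (1900 date system) to a date-time, rounded to whole seconds.
    static LocalDateTime fromSerial(double serial) {
        if (serial < 1.0) {
            throw new IllegalArgumentException("values below 1 carry no calendar date: " + serial);
        }
        long wholeDays = (long) Math.floor(serial);
        if (wholeDays == 60) {
            throw new IllegalArgumentException("serial 60 is Excel's nonexistent 29 February 1900");
        }
        long secondsInDay = Math.round((serial - wholeDays) * 86_400);
        // Skip the phantom leap day for everything after it.
        long dayOffset = wholeDays > 60 ? wholeDays - 1 : wholeDays;
        return DAY_ZERO.plusDays(dayOffset).atStartOfDay().plusSeconds(secondsInDay);
    }

    public static void main(String[] args) {
        System.out.println(fromSerial(41807.375)); // the "17 June 2014 9:00" example above
    }
}
```

Because the number carries no day/month ordering at all, there is nothing for a string parser to misread along the way.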
If I were going to use middle endian dates, I would be explicit. #06/13/2014# is always interpreted as middle endian.
I prefer unambiguous strings. "2014-06-13" or "13 June 2014" are not misinterpreted by VBA or by a human reader.
Have just come up against this issue! Reading records from a .csv and storing in an .xls
I found the following sequence works to overcome the misinterpreted dates:
Read the date field from the .csv file
Store it into a cell in the .xls file
Read it back into vba
Store into its required destination in the .xls
Date is in original format
I found this issue to be incredibly complex and was trying to keep it as simple as possible but have indeed left a few vital details out! Apologies. Here is a fuller version of what I found:
First of all I should explain I was reading dates (and other fields) from a .csv and storing back into an .xls
I am on Office 2002 running on Windows/7
Using 2 example dates: 27/4/2015 and 7/5/2015 in dd/mm/yyyy string format (from the csv)
What I found was:
Reading the 27/4/2015 text date field from csv into a variable dimensioned as STRING and storing into an xls field in dd/mm/yyyy DATE format produces a cell that reads 27/4/2015 but converting it into a cell formatted as Number also produces 27/4/2015. 7/5/2015 on the other hand produces a string that reads 7/5/2015 and converting it into a cell formatted as Number produces 42131.
Reading the 27/4/2015 text date field from csv into an undimensioned variable and storing into an xls field in dd/mm/yyyy DATE format produces a cell that reads 27/4/2015 but converting it into a cell formatted as Number also produces 27/4/2015 while 7/5/2015 reads 5/7/2015 and converting it into a cell formatted as Number produces 42190.
Reading the 27/4/2015 text date field from csv into a variable dimensioned as DATE and storing into an xls field in dd/mm/yyyy DATE format produces a cell that reads 27/4/2015 and converting it into a cell formatted as Number produces 42121. 7/5/2015 on the other hand produces a string that reads 5/7/2015 and converting it into a cell formatted as Number produces 42190.
The first 3 scenarios above therefore do not produce the desired results for all date specifications.
To fix this I do the following:
Input_Workbook.Activate
ilr = Range("A5000").End(xlDown).End(xlDown).End(xlDown).End(xlUp).Row
For i = 1 To ilr
    Input_Workbook.Activate
    If IsDate(Cells(i, 1).Value) Then
        d1 = Cells(i, 1).Value
        d1 = Replace(d1, "/", "-")
        ThisWorkbook.Activate
        Cells(14, 5).Value = d1
        d1 = Cells(14, 5).Value
        If VarType(d1) = vbString Then
            d1 = CDate(d1)
        End If
        Cells(i, 1).Value = d1
    End If
Next
The cell used to store the date initially is formatted GENERAL and the ultimate target cells is formatted as DATE (dd/mm/yyyy).
I don't have enough brain cells left to fully explain what happens to the dates during this process but it works for me and of course the choice of target cells is completely random in the above code block.
The problem was VBA was opening the csv with the reverse dates for single digit days.
This way of opening the workbook worked the same as when I did it manually so had the correct dates in dd/mm/yyyy format. Then copied across correctly:
Workbooks.OpenText FileName:=fpathO, datatype:=xlDelimited, comma:=True, local:=True

apache poi reading date from excel date() function gives wrong date

I'm using an xlsx file that contains a DATE(year, month, day) formula within a cell. This cell is formatted as date, so Excel/OpenOffice shows the proper date.
e.g. the cell content is '=DATE(2013;1;1)' which produces : '01.01.2013'
So far - so good.
When reading this file with the poi library I do:
XSSFWorkbook workbook = new XSSFWorkbook(new FileInputStream(new File("/home/detlef/temp/test.xlsx")));
XSSFSheet sheet = workbook.getSheet("Sheet A");
XSSFCell cell = sheet.getRow(0).getCell(0);
System.out.println(cell.getDateCellValue());
This will print out:
Sun Dec 31 00:00:00 CET 1899
I am using POI 3.9.
Can anybody tell me why this happens?
Found a way to do it:
if (cell.getCellType() == Cell.CELL_TYPE_FORMULA) {
    FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
    CellValue evaluate = evaluator.evaluate(cell);
    Date date = DateUtil.getJavaDate(evaluate.getNumberValue());
    System.out.println(date);
}
That produces:
Tue Jan 01 00:00:00 CET 2013
Thanks anyway.
Everything you have described is entirely correct and expected
When Excel stores a formula in a cell, not only does it store the formula itself (either a string in .xlsx, or in parsed token form for .xls), it also stores the last evaluated answer of that formula. This means that when Excel loads, it doesn't have to grind away for ages calculating the formula results to display, it can just render them with the last value like any other cell.
This is why when you make changes to your excel file with Apache POI, you then need to run an evaluation to update all those cached formula values, so it looks right in Excel before you go to that cell
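However, there are a few special formula functions in Excel which are defined as volatile. These are functions which can return a different value on every recalculation, for example NOW, TODAY or RAND, and for these the cached value cannot be relied on; a dummy value (e.g. 0 or -1) may be written to the cell and the formula re-evaluated on load. (DATE with constant arguments is not normally counted among the volatile functions, so the dummy value in this file may instead come from the application that saved it.)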
When you read the numeric value of a formula cell in POI, it gives you the cached value back. If you want POI to evaluate the formula, you need to ask it to, as you're doing
Dates in Excel are stored as fractional days counted from 1/1/1900 or 1/1/1904 (depending on a flag). In POI a cached value of 0 comes back as the 31st of December 1899, so that's what you see when a dummy 0 was written to the cell and you request it as a date.

Reference cell value as string in Excel

In Excel, if the cell A1 has some value that gets formatted in a specific way, is there a way for cell B1 to reference the string displayed in A1?
To clarify:
If A1 displays, for instance, the time 10:31:48, I wish to have B1 reference this outputted string as shown to the user ("10:31:48", not the underlying numerical representation "0.43875").
I'm well aware that there are functions for manually formatting values. However, what I'm looking for is copying an already formatted value from another cell, no matter what format that cell may have.
Is something like this possible?
In fact, Excel stores datetime as a number, so you have to explicitly set format of the cell to see the proper value.
You may want to use TEXT function, but anyway, you have to specify format of output string:
=TEXT(A1,"hh:mm:ss")
Another option is to write your own VBA function, which can convert the value of a cell based on its format:
Public Function GetString(ByVal cell As Range) As String
    GetString = Format(cell, cell.NumberFormat)
End Function
This will give you a result based on source cell's format
So that does not quite work as the VBA Format function isn't compatible with Excel formats.
The table below shows the difference between "GetString()" above, and "GetText()"
Public Function GetText(ByVal cell As Range) As String
    GetText = Application.WorksheetFunction.Text(cell, cell.NumberFormat)
End Function
Short Date and Long date are interesting -- they are off by 1 day.
Format      Value                     GetString                    GetText                     GetFormat
general     3.141592638               'Ge23eral'                   '3.141592638'               'General'
number      3.14                      '3.14'                       '3.14'                      '0.00'
Currency    $3.14                     '$3.14'                      '$3.14'                     '$#,##0.00'
Accounting  $3.14                     '_($3.14_)'                  ' $3.14 '                   '_($* #,##0.00_);_($* (#,##0.00);_($* "-"??_);_(#_)'
Short Date  1/3/1900                  '1/2/1900'                   '1/3/1900'                  'm/d/yyyy'
Long Date   Tuesday, January 3, 1900  'Tuesday, January 02, 1900'  'Tuesday, January 3, 1900'  '[$-F800]dddd, mmmm dd, yyyy'
Time        3:23:54 AM                '3:23:54 AM'                 '3:23:54 AM'                '[$-F400]h:mm:ss AM/PM'
Percentage  314.16%                   '314.16%'                    '314.16%'                   '0.00%'
Fraction    3 2/16                    '3 ??/16'                    '3 2/16'                    '# ??/16'
Scientific  3.14E+00                  '3.14E+00'                   '3.14E+00'                  '0.00E+00'
Text        3.141592638               '3.141592638'                '3.141592638'               '#'
