What indicates an Office Open XML Cell contains a Date/Time value? - excel

I'm reading an .xlsx file using the Office Open XML SDK and am confused about reading Date/Time values. One of my spreadsheets has this markup (generated by Excel 2010)
<x:row r="2" spans="1:22" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:c r="A2" t="s">
<x:v>56</x:v>
</x:c>
<x:c r="B2" t="s">
<x:v>64</x:v>
</x:c>
.
.
.
<x:c r="J2" s="9">
<x:v>17145</x:v>
</x:c>
Cell J2 has a date serial value in it and a style attribute s="9". However, the Office Open XML Specification says that 9 corresponds to a followed hyperlink. This is a screen shot from page 4,999 of ECMA-376, Second Edition, Part 1 - Fundamentals And Markup Language Reference.pdf.
The presetCellStyles.xml file included with the spec also refers to builtinId 9 as a followed hyperlink.
<followedHyperlink builtinId="9">
All of the styles in the spec are simply visual formatting styles, not number styles. Where are the number styles defined and how does one differentiate a style reference s="9" from indicating a cell formatting (visual) style vs a number style?
Obviously I'm looking in the wrong place to match styles on cells with their number formats. Where's the right place to find this information?

The s attribute is a zero-based index into the cellXfs list in styles.xml; it is unrelated to the builtinId values of the preset cell styles, which is why s="9" does not mean "followed hyperlink". The xf entry it selects in turn references a number format mask. To identify a cell that contains a date, you need to perform the xf -> numberformat lookup, then identify whether that numberformat mask is a date/time mask (rather than, for example, a percentage or an accounting mask).
The styles.xml file has elements like:
<xf numFmtId="14" ... applyNumberFormat="1" />
<xf numFmtId="1" ... applyNumberFormat="1" />
These are the xf entries, which in turn give you a numFmtId that references the number format mask.
You should find the numFmts section somewhere near the top of styles.xml, as part of the styleSheet element:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<numFmts count="3">
<numFmt numFmtId="164" formatCode="[$-414]mmmm\ yyyy;@" />
<numFmt numFmtId="165" formatCode="0.000" />
<numFmt numFmtId="166" formatCode="#,##0.000" />
</numFmts>
The number format id may be here, or it may be one of the built-in formats. Number format codes (numFmtId) less than 164 are "built-in".
The list that I have is incomplete:
0 = 'General';
1 = '0';
2 = '0.00';
3 = '#,##0';
4 = '#,##0.00';
9 = '0%';
10 = '0.00%';
11 = '0.00E+00';
12 = '# ?/?';
13 = '# ??/??';
14 = 'mm-dd-yy';
15 = 'd-mmm-yy';
16 = 'd-mmm';
17 = 'mmm-yy';
18 = 'h:mm AM/PM';
19 = 'h:mm:ss AM/PM';
20 = 'h:mm';
21 = 'h:mm:ss';
22 = 'm/d/yy h:mm';
37 = '#,##0 ;(#,##0)';
38 = '#,##0 ;[Red](#,##0)';
39 = '#,##0.00;(#,##0.00)';
40 = '#,##0.00;[Red](#,##0.00)';
44 = '_("$"* #,##0.00_);_("$"* \(#,##0.00\);_("$"* "-"??_);_(@_)';
45 = 'mm:ss';
46 = '[h]:mm:ss';
47 = 'mmss.0';
48 = '##0.0E+0';
49 = '@';
27 = '[$-404]e/m/d';
30 = 'm/d/yy';
36 = '[$-404]e/m/d';
50 = '[$-404]e/m/d';
57 = '[$-404]e/m/d';
59 = 't0';
60 = 't0.00';
61 = 't#,##0';
62 = 't#,##0.00';
67 = 't0%';
68 = 't0.00%';
69 = 't# ?/?';
70 = 't# ??/??';
The missing values are mainly related to East Asian variant formats.
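The lookup described above can be sketched in plain Java. This is only a heuristic under stated assumptions: the set of date IDs is taken from the (incomplete) list above, the class and method names are made up for illustration, and the custom-format check simply strips literals, bracketed sections and escapes before looking for date/time letters.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DateFormatSketch {
    // Built-in numFmtIds from the list above whose masks are date/time masks.
    static final Set<Integer> BUILT_IN_DATE_IDS = new HashSet<>(Arrays.asList(
            14, 15, 16, 17, 18, 19, 20, 21, 22, 27, 30, 36, 45, 46, 47, 50, 57));

    // Heuristic: remove everything that is not a format token, then look for d/m/y/h/s.
    static boolean isDateFormatCode(String formatCode) {
        String stripped = formatCode
                .replaceAll("\"[^\"]*\"", "")      // quoted literals such as "$"
                .replaceAll("\\[[^\\]]*\\]", "")   // [Red], [$-414], [h]
                .replaceAll("[\\\\_*].", "");      // \x escapes, _x padding, *x fill
        return stripped.matches("(?i).*[dmyhs].*");
    }

    // customFormats maps numFmtId -> formatCode as read from the numFmts element.
    static boolean isDateFormat(int numFmtId, Map<Integer, String> customFormats) {
        if (BUILT_IN_DATE_IDS.contains(numFmtId)) {
            return true;
        }
        String code = customFormats.get(numFmtId);
        return code != null && isDateFormatCode(code);
    }
}
```

With the sample numFmts above, ID 164 would be classified as a date and IDs 165 and 166 would not. A format consisting only of an elapsed-time token such as [h] would be missed, so treat this as a starting point rather than a complete classifier.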

The chosen answer is spot-on, but note that Excel defines some number format (numFmt) codes differently from the OpenXML spec. Per the Open XML SDK 2.5 Productivity Tool's documentation (on the "Implementer Notes" tab for the NumberingFormat class):
The standard defines built-in format ID 14: "mm-dd-yy"; 22: "m/d/yy h:mm"; 37: "#,##0 ;(#,##0)"; 38: "#,##0 ;[Red]"; 39: "#,##0.00;(#,##0.00)"; 40: "#,##0.00;[Red]"; 47: "mmss.0"; KOR fmt 55: "yyyy-mm-dd".
Excel defines built-in format ID
14: "m/d/yyyy"
22: "m/d/yyyy h:mm"
37: "#,##0_);(#,##0)"
38: "#,##0_);[Red]"
39: "#,##0.00_);(#,##0.00)"
40: "#,##0.00_);[Red]"
47: "mm:ss.0"
55: "yyyy/mm/dd"
Most are minor variations, but #14 is a doozy. I wasted a couple of hours troubleshooting why leading zeros weren't being added to single-digit months and days (e.g. 01/05/14 vs. 1/5/14).
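To make the difference for ID 14 concrete, here is a small Java sketch that renders the same date with both masks. The Java patterns "MM-dd-yy" and "M/d/yyyy" are my own translations of the two Excel masks into java.time syntax, not something either specification defines, and the class name is hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class BuiltIn14Sketch {
    // Java equivalent of the mask the standard lists for ID 14 ("mm-dd-yy").
    static String asStandardMask(LocalDate date) {
        return date.format(DateTimeFormatter.ofPattern("MM-dd-yy"));
    }

    // Java equivalent of the mask Excel actually applies for ID 14 ("m/d/yyyy").
    static String asExcelMask(LocalDate date) {
        return date.format(DateTimeFormatter.ofPattern("M/d/yyyy"));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2014, 1, 5);
        System.out.println(asStandardMask(date)); // 01-05-14
        System.out.println(asExcelMask(date));    // 1/5/2014
    }
}
```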

Thought I'd add the solution I put together for deciding whether a cell's double value should be converted with FromOADate, i.e. whether it is really a date. I needed this because my Excel file also contains zip codes. The numberingFormat will be null if it's text.
Alternatively you could take the numberingFormatId and check it against a list of the IDs that Excel uses for dates.
In my case I've explicitly determined the formatting of all fields for the client.
/// <summary>
/// Creates the datatable and parses the file into a datatable
/// </summary>
/// <param name="fileName">the file upload's filename</param>
private void ReadAsDataTable(string fileName)
{
try
{
DataTable dt = new DataTable();
using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(string.Format("{0}/{1}", UploadPath, fileName), false))
{
WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
SheetData sheetData = workSheet.GetFirstChild<SheetData>();
IEnumerable<Row> rows = sheetData.Descendants<Row>();
var cellFormats = workbookPart.WorkbookStylesPart.Stylesheet.CellFormats;
var numberingFormats = workbookPart.WorkbookStylesPart.Stylesheet.NumberingFormats;
// columns omitted for brevity
// skip first row as this row is column header names
foreach (Row row in rows.Skip(1))
{
DataRow dataRow = dt.NewRow();
for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
{
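bool isDate = false;
Cell currentCell = row.Descendants<Cell>().ElementAt(i);
// StyleIndex is null when the cell has no s attribute; such cells use style 0
var styleIndex = currentCell.StyleIndex != null ? (int)currentCell.StyleIndex.Value : 0;
var cellFormat = (CellFormat)cellFormats.ElementAt(styleIndex);
if (cellFormat.NumberFormatId != null)
{
var numberFormatId = cellFormat.NumberFormatId.Value;
// NumberingFormats is null when the workbook defines no custom formats
var numberingFormat = numberingFormats == null ? null : numberingFormats.Cast<NumberingFormat>()
.SingleOrDefault(f => f.NumberFormatId.Value == numberFormatId);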
// Here's yer string! Example: $#,##0.00_);[Red]($#,##0.00)
if (numberingFormat != null && numberingFormat.FormatCode.Value.Contains("mm/dd/yy"))
{
string formatString = numberingFormat.FormatCode.Value;
isDate = true;
}
}
// replace '-' with empty string
string value = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i), isDate);
dataRow[i] = value.Equals("-") ? string.Empty : value;
}
dt.Rows.Add(dataRow);
}
}
this.InsertMembers(dt);
dt.Clear();
}
catch (Exception ex)
{
LogHelper.Error(typeof(MemberUploadApiController), ex.Message, ex);
}
}
/// <summary>
/// Reads the cell's value
/// </summary>
/// <param name="document">current document</param>
/// <param name="cell">the cell to read</param>
/// <returns>cell's value</returns>
private string GetCellValue(SpreadsheetDocument document, Cell cell, bool isDate)
{
string value = string.Empty;
try
{
SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
value = cell.CellValue.InnerXml;
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
}
else
{
// check if this is a date or zip.
// integers will be passed into this else statement as well.
if (isDate)
{
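// cell values are stored culture-independently, so parse with the invariant culture
value = DateTime.FromOADate(double.Parse(value, System.Globalization.CultureInfo.InvariantCulture)).ToString();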
}
return value;
}
}
catch (Exception ex)
{
LogHelper.Error(typeof(MemberUploadApiController), ex.Message, ex);
}
return value;
}
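For readers not on .NET, the conversion that FromOADate performs in the code above can be approximated in Java. This is a minimal sketch assuming the 1900 date system and serials of 61 or more (after Excel's fictitious 29 February 1900); the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

class ExcelSerialSketch {
    // Day 0 of the serial scale for dates after February 1900.
    static final LocalDateTime EPOCH = LocalDate.of(1899, 12, 30).atStartOfDay();

    // The integer part counts days, the fractional part is the time of day.
    static LocalDateTime toDateTime(double serial) {
        long days = (long) Math.floor(serial);
        long seconds = Math.round((serial - days) * 86_400);
        return EPOCH.plusDays(days).plusSeconds(seconds);
    }

    public static void main(String[] args) {
        // 17145 is the value stored in cell J2 of the question.
        System.out.println(toDateTime(17145).toLocalDate()); // 1946-12-09
    }
}
```

The conversion should only be applied once the cell's number format has been identified as a date format; the raw value alone cannot distinguish a date from a zip code.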

In styles.xml see if there is a numFmt node. I think that will hold a numFmtId of "9" which will relate to the date format that's used.
I don't know where that is in the ECMA, but if you search for numFmt, you might find it.

It was unclear to me how to reliably determine whether a cell has a date/time value. After spending some time experimenting I came up with code (see post) that looks for both built-in and custom date/time formats.

In case anyone else is having a hard time with this, here is what I've done:
1) Create a new Excel file and put a date/time string in cell A1.
2) Change the formatting on the cell to whatever you want, then save the file.
3) Run the following PowerShell script to extract the stylesheet from the .xlsx:
[Reflection.Assembly]::LoadWithPartialName("DocumentFormat.OpenXml")
$xlsx = (ls C:\PATH\TO\FILE.xlsx).FullName
$package = [DocumentFormat.OpenXml.Packaging.SpreadsheetDocument]::Open($xlsx, $true)
[xml]$style = $package.WorkbookPart.WorkbookStylesPart.Stylesheet.OuterXml
Out-File -InputObject $style.OuterXml -FilePath "style.xml"
style.xml now contains the information that you can inject to DocumentFormat.OpenXml.Spreadsheet.Stylesheet(string outerXml), leading to
4) Use the extracted file to construct excel object model
var style = File.ReadAllText(@"c:\PATH\TO\EXTRACTED\Style.xml");
var stylesheetPart = WorkbookPart_REFERENCE.AddNewPart<WorkbookStylesPart>();
stylesheetPart.Stylesheet = new Stylesheet(style);
stylesheetPart.Stylesheet.Save();

@RobScott, regarding your code snippet:
I always get null for the style index of a particular cell on this line:
var styleIndex = (int)row.Descendants<Cell>().ElementAt(i).StyleIndex.Value;
My requirement is to read the Excel sheet below and transform the row and column data to JSON.
Excel reference:
StockInvoiceNo | StockInvoiceOn | Name | Description
DC3320012989 | 23-01-2021 00:00:00:00 | item1 | description
DC3320012989 | 24-01-2021 00:00:00:00 | item2 | description
DC3320012989 | 25-01-2021 00:00:00:00 | item3 | description

Related

How to HIDE Un-used Rows and Columns in Excel Using Apache POI

Title says it all -- I need to hide all rows and columns that are outside of the rows and columns containing my data.
I have tried several options:
How to hide the following Un-used rows in Excel sheet using Java Apache POI?
Permanently Delete Empty Rows Apache POI using JAVA in Excel Sheet
But these never produce the desired effect. I'm using apache poi version 4.1.1
See the following screenshots showing the excel format I have versus the format I want. (Since I am new on stackoverflow, it doesn't allow me to embed the pictures directly. Weird I know.)
What I have
What I need
Hiding unused rows and columns is not provided by the high-level classes of Apache POI so far.
Hiding unused rows is a setting in the sheet format properties of Office Open XML, the format of XSSF (*.xlsx). These properties define how rows are handled by default, for example the default row height. They can also specify that rows have zero height by default, so that only used rows, which have cells with content or formatting, are visible. As Apache POI does not have a method to set SheetFormatPr.setZeroHeight, we need to use the underlying org.openxmlformats.schemas.spreadsheetml.x2006.main.* classes.
In the binary BIFF format of HSSF (*.xls), hiding unused rows is a setting in the DEFAULTROWHEIGHT record within the worksheet's record stream. Option flags can be set there; option flag 0x0002 means hiding unused rows. To set it using Apache POI we need access to the org.apache.poi.hssf.record.DefaultRowHeightRecord, which can only be obtained from InternalSheet.
Hiding columns can be done using Sheet.setColumnHidden, but only for single columns. So to hide 100 columns one needs to call Sheet.setColumnHidden 100 times.
Excel also provides settings for column ranges from a min column to a max column, but Apache POI does not provide high-level methods for this.
Using XSSF (Office Open XML) we need the org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCols to get or set an org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCol having the appropriate min and max and setHidden(true).
Using HSSF (BIFF) we need to get or set the COLINFO record from/to the worksheet's record stream which has the appropriate min and max and setHidden(true).
The following complete example shows the code for the above. It uses ExcelExampleIn.xlsx or ExcelExampleIn.xls as input, sets unused rows hidden, and sets columns hidden from a given min to max column.
Tested and works using Apache POI 4.1.1.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.hssf.usermodel.*;
import org.apache.poi.hssf.model.InternalSheet;
import org.apache.poi.hssf.record.DefaultRowHeightRecord;
import org.apache.poi.hssf.record.ColumnInfoRecord;
import org.apache.poi.hssf.record.RecordBase;
import java.util.List;
public class ExcelHideUnusedRowsAndColumns {
static void setUnusedRowsHidden(Sheet sheet) throws Exception {
if (sheet instanceof XSSFSheet) {
// in OOXML set zeroHeight property true for all undefined rows, so only rows having special settings are visible
XSSFSheet xssfSheet = (XSSFSheet)sheet;
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet ctWorksheet = xssfSheet.getCTWorksheet();
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetFormatPr ctSheetFormatPr = ctWorksheet.getSheetFormatPr();
if (ctSheetFormatPr == null) ctSheetFormatPr = ctWorksheet.addNewSheetFormatPr();
ctSheetFormatPr.setZeroHeight(true);
} else if (sheet instanceof HSSFSheet) {
// in BIFF file format set option flag 0x0002 in DEFAULTROWHEIGHT record
HSSFSheet hssfSheet= (HSSFSheet)sheet;
java.lang.reflect.Field _sheet = HSSFSheet.class.getDeclaredField("_sheet");
_sheet.setAccessible(true);
InternalSheet internalSheet = (InternalSheet)_sheet.get(hssfSheet);
java.lang.reflect.Field defaultrowheight = InternalSheet.class.getDeclaredField("defaultrowheight");
defaultrowheight.setAccessible(true);
DefaultRowHeightRecord defaultRowHeightRecord = (DefaultRowHeightRecord)defaultrowheight.get(internalSheet);
defaultRowHeightRecord.setOptionFlags((short)2);
}
}
static void setColumnsHidden(Sheet sheet, int min, int max) throws Exception {
if (sheet instanceof XSSFSheet) {
// respect max column count 16384 (1 to 16384) for OOXML
if (max > 16384) max = 16384;
// in OOXML set cols settings in XML
XSSFSheet xssfSheet = (XSSFSheet)sheet;
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet ctWorksheet = xssfSheet.getCTWorksheet();
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCols ctCols = ctWorksheet.getColsArray(0);
boolean colSettingFound = false;
for (org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCol ctCol : ctCols.getColList()) {
if (ctCol.getMin() == min && ctCol.getMax() == max) {
colSettingFound = true;
ctCol.setHidden(true);
}
System.out.println(ctCol);
}
if (!colSettingFound) {
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCol ctCol = ctCols.addNewCol();
ctCol.setMin(min);
ctCol.setMax(max);
ctCol.setHidden(true);
System.out.println(ctCol);
}
} else if (sheet instanceof HSSFSheet) {
// in BIFF min and max are 0-based
min = min -1;
max = max -1;
// respect max column count 256 (0 to 255) for BIFF
if (max > 255) max = 255;
// in BIFF file format set hidden property in COLINFO record
HSSFSheet hssfSheet= (HSSFSheet)sheet;
java.lang.reflect.Field _sheet = HSSFSheet.class.getDeclaredField("_sheet");
_sheet.setAccessible(true);
InternalSheet internalSheet = (InternalSheet)_sheet.get(hssfSheet);
java.lang.reflect.Field _records = InternalSheet.class.getDeclaredField("_records");
_records.setAccessible(true);
@SuppressWarnings("unchecked")
List<RecordBase> records = (List<RecordBase>)_records.get(internalSheet);
boolean colInfoFound = false;
for (RecordBase record : records) {
if (record instanceof ColumnInfoRecord) {
ColumnInfoRecord columnInfoRecord = (ColumnInfoRecord)record;
if (columnInfoRecord.getFirstColumn() == min && columnInfoRecord.getLastColumn() == max) {
colInfoFound = true;
columnInfoRecord.setHidden(true);
}
System.out.println(columnInfoRecord);
}
}
if (!colInfoFound) {
ColumnInfoRecord columnInfoRecord = new ColumnInfoRecord();
columnInfoRecord.setFirstColumn(min);
columnInfoRecord.setLastColumn(max);
columnInfoRecord.setHidden(true);
records.add(records.size()-1, columnInfoRecord);
System.out.println(columnInfoRecord);
}
}
}
public static void main(String[] args) throws Exception {
String inFilePath = "./ExcelExampleIn.xlsx"; String outFilePath = "./ExcelExampleOut.xlsx";
//String inFilePath = "./ExcelExampleIn.xls"; String outFilePath = "./ExcelExampleOut.xls";
try (Workbook workbook = WorkbookFactory.create(new FileInputStream(inFilePath));
FileOutputStream out = new FileOutputStream(outFilePath ) ) {
Sheet sheet = workbook.getSheetAt(0);
//set unused rows hidden
setUnusedRowsHidden(sheet);
//set multiple columns hidden, here column 7 (G) to last column 16384 (XFD)
setColumnsHidden(sheet, 7, 16384);
workbook.write(out);
}
}
}
Mark the first "outside" column, hold CTRL + SHIFT and then right arrow. Then, all columns should be highlighted. Right click, select "Hide".
Repeat the same with rows, select the first row outside of your data, hold CTRL + SHIFT and press Arrow Down.
Best of luck! ^_^

EPPlus DataField with PercentOfTotal

I'm using EPPlus to create a pivot table in Excel, but I wish to show data as a percent of the total in one of my DataFields. How can I do that?
public static void createTableMotivo(Worksheet ws, ExcelRangeBase range)
{
const string FORMATCURRENCY = "#,###;[Red](#,###)";
ExcelWorksheet worksheet = ws.EPPlusSheet;
//The pivot table
ExcelPivotTable pivotTable = worksheet.PivotTables.Add(worksheet.Cells["B12"], range, "pivot_table1");
//The label row field
pivotTable.RowFields.Add(pivotTable.Fields["FIELD1"]);
pivotTable.DataOnRows = false;
pivotTable.ShowCalcMember = true;
//The data fields
ExcelPivotTableDataField fieldSum = pivotTable.DataFields.Add(pivotTable.Fields["FIELD2"]);
fieldSum.Name = "Quantidade de Faturas";
fieldSum.Function = DataFieldFunctions.Sum;
fieldSum.Format = FORMATCURRENCY;
ExcelPivotTableDataField fieldPercent = pivotTable.DataFields.Add(pivotTable.Fields["FIELD2"]);
fieldPercent.Name = "%";
fieldPercent.Function = DataFieldFunctions.None;
fieldPercent.Format = "0.00%";
pivotTable.PageFields.Add(pivotTable.Fields["FIELD3"]);
pivotTable.PageFields.Add(pivotTable.Fields["FIELD4"]);
}
I am trying to do something similar. The only way I can get this to work is by manipulating the XML of the Excel file.
This was interesting in itself - I can't remember where I saw that you can rename .xlsx to .zip and then just view all of the XML files inside that zip.
In my case, I first set "Show Values As" to "% of Grand Total" on the pivot table of the actual Excel file. Then, after changing the file extension from .xlsx to .zip and extracting, there was a pivotTable3.xml file that had:
<dataFields count="2">
<dataField name="Number of Orders" fld="10" subtotal="count"/>
<dataField name="Percent of Total" fld="10" subtotal="count" showDataAs="percentOfTotal" numFmtId="10"/>
</dataFields>
The goal is to get showDataAs="percentOfTotal" in the dataField element whose name is "Percent of Total".
I tried to use an xpath expression to get the dataField element, but it returns null:
pivotTable.PivotTableXml.SelectSingleNode("//dataField[name='Percent of Total']")
So I had to fall back to walking down the xml:
pivotTable.DataFields[1].Format = "#0.00%";
pivotTable.DataFields[1].Function = DataFieldFunctions.Count;
pivotTable.DataFields[1].Name = "Percent of Total";
foreach (XmlElement documentElementChild in pivotTable.PivotTableXml.DocumentElement.ChildNodes)
{
if (documentElementChild.Name.Equals("dataFields"))
{
foreach (XmlElement dataFieldChild in documentElementChild.ChildNodes)
{
foreach (XmlAttribute attribute in dataFieldChild.Attributes)
{
if (attribute.Value.Equals("Percent of Total"))
{
// found our dataField element; add the attribute
dataFieldChild.SetAttribute("showDataAs", "percentOfTotal");
break;
}
}
}
}
}
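A likely reason the XPath attempt above returned null is twofold: name without a leading @ selects a child element rather than the attribute, and the pivot table XML lives in the spreadsheetml namespace, so an unprefixed dataField step matches nothing. The sketch below illustrates this with the JDK's XPath API rather than EPPlus; the sample document, the "x" prefix and the class name are assumptions for the demo. In .NET the same idea would be expressed with the SelectSingleNode overload that takes an XmlNamespaceManager.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class PivotXPathSketch {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Cut-down stand-in for a pivot table part, using the default namespace as Excel does.
    static final String SAMPLE =
            "<pivotTableDefinition xmlns=\"" + MAIN_NS + "\">"
            + "<dataFields count=\"2\">"
            + "<dataField name=\"Number of Orders\" fld=\"10\" subtotal=\"count\"/>"
            + "<dataField name=\"Percent of Total\" fld=\"10\" subtotal=\"count\"/>"
            + "</dataFields></pivotTableDefinition>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this the namespace is ignored entirely
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates the expression with the prefix "x" bound to the spreadsheetml namespace.
    static Element select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "x".equals(prefix) ? MAIN_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return MAIN_NS.equals(uri) ? "x" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return MAIN_NS.equals(uri)
                        ? Collections.singletonList("x").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return (Element) xpath.evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        // Unprefixed step and missing @: nothing is selected.
        System.out.println(select(doc, "//dataField[name='Percent of Total']"));
        // Prefixed step and attribute test: the element is found and can be modified.
        Element field = select(doc, "//x:dataField[@name='Percent of Total']");
        field.setAttribute("showDataAs", "percentOfTotal");
        System.out.println(field.getAttribute("showDataAs"));
    }
}
```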

NPOI set Explicit Column Type Not working properly

I'm using the NPOI Excel library to generate an Excel file. In that file I explicitly define the column type for columns such as Date, String, etc.
I'm using the following code to achieve this.
var row = sheet.CreateRow(currentNPOIRowIndex++);
for (var colIndex = 0; colIndex < exportData.Columns.Count; colIndex++)
{
ICell cell = null;
cell = row.CreateCell(colIndex);
if (exportData.Columns[colIndex].DataType == typeof(DateTime))
{
if (exportData.Rows[rowIndex][colIndex].ToString() != "")
{
cell.SetCellValue((DateTime)exportData.Rows[rowIndex][colIndex]);
cell.CellStyle = (NPOI.HSSF.UserModel.HSSFCellStyle)book.CreateCellStyle();
cell.CellStyle.DataFormat = book.CreateDataFormat().GetFormat("yyyyMMdd HH:mm:ss");
cell = null;
}
else
cell.SetCellValue(exportData.Rows[rowIndex][colIndex].ToString());
}
else
cell.SetCellValue(exportData.Rows[rowIndex][colIndex].ToString());
}
}
The above code works fine for 42 rows, i.e. it correctly sets the column type, but after 42 rows the column type is not applied.
Any help will be highly appreciated.
You'll need to set a default column style if you want to set the column format for all cells of that column. Please see the example below for the XSSF format. The syntax may differ for your HSSF format, but it will give you an idea of what you are missing.
This is from my working code. I am using NPOI version 2.2.1.0.
Can you also comment out the line cell = null;?
XSSFWorkbook workbook = new XSSFWorkbook();
XSSFSheet sheet = (XSSFSheet)workbook.CreateSheet("Template");
XSSFFont defaultFont = (XSSFFont)workbook.CreateFont();
defaultFont.FontHeightInPoints = (short)10;
XSSFCellStyle headerStyle = (XSSFCellStyle)workbook.CreateCellStyle();
headerStyle.WrapText = true;
XSSFCellStyle defaultStyle = (XSSFCellStyle)workbook.CreateCellStyle();
XSSFDataFormat defaultDataFormat = (XSSFDataFormat)workbook.CreateDataFormat();
defaultStyle.SetDataFormat(defaultDataFormat.GetFormat("000-000-0000"));
defaultStyle.FillBackgroundColor = IndexedColors.LightYellow.Index;
defaultStyle.FillForegroundColor = IndexedColors.LightTurquoise.Index;
defaultStyle.SetFont(defaultFont);
var row = sheet.CreateRow(0);
for (int headerCount = 0; headerCount < headers.Count(); headerCount++)
{
row.CreateCell(headerCount).SetCellValue(headers[headerCount]);
row.Cells[headerCount].CellStyle = headerStyle;
sheet.SetDefaultColumnStyle(headerCount, defaultStyle);
}

How can I change the format of the ColumnField column headings in EPPlus?

I create a column field in EPPlus like so:
// Column field[s]
var monthYrColField = pivotTable.Fields["MonthYr"];
pivotTable.ColumnFields.Add(monthYrColField);
...that displays like so (the "201509" and "201510" columns):
I want those values to display instead as "Sep 15" and "Oct 15"
In Excel Interop it's done like this:
var monthField = pvt.PivotFields("MonthYr");
monthField.Orientation = XlPivotFieldOrientation.xlColumnField;
monthField.NumberFormat = "MMM yy";
...but in EPPlus the corresponding variable (monthYrColField) has no "NumberFormat" (or "Style") member.
I tried this:
pivotTableWorksheet.Column(2).Style.Numberformat.Format = "MMM yy";
...but, while it didn't complain or wreak havoc, also did not change the vals from "201509" and "201510"
How can I change the format of my ColumnField column headings in EPPlus from "untransformed" to "MMM yy" format?
UPDATE
For VDWWD:
As you can see by the comments, there are many things related to PivotTables which don't work or are hard to get to work in EPPlus; Excel Interop is a bear (and not a teddy or a koala, but more like a grizzly) compared to EPPlus, but as to PivotTables, it seems that EPPlus is kind of half-baked compared to Exterop's fried-to-a-crispness.
private void PopulatePivotTableSheet()
{
string NORTHWEST_CORNER_OF_PIVOT_TABLE = "A6";
AddPrePivotTableDataToPivotTableSheet();
var dataRange = pivotDataWorksheet.Cells[pivotDataWorksheet.Dimension.Address];
dataRange.AutoFitColumns();
var pivotTable = pivotTableWorksheet.PivotTables.Add(
pivotTableWorksheet.Cells[NORTHWEST_CORNER_OF_PIVOT_TABLE],
dataRange,
"PivotTable");
pivotTable.MultipleFieldFilters = true;
pivotTable.GridDropZones = false;
pivotTable.Outline = false;
pivotTable.OutlineData = false;
pivotTable.ShowError = true;
pivotTable.ErrorCaption = "[error]";
pivotTable.ShowHeaders = true;
pivotTable.UseAutoFormatting = true;
pivotTable.ApplyWidthHeightFormats = true;
pivotTable.ShowDrill = true;
// Row field[s]
var descRowField = pivotTable.Fields["Description"];
pivotTable.RowFields.Add(descRowField);
// Column field[s]
var monthYrColField = pivotTable.Fields["MonthYr"];
pivotTable.ColumnFields.Add(monthYrColField);
// Data field[s]
var totQtyField = pivotTable.Fields["TotalQty"];
pivotTable.DataFields.Add(totQtyField);
var totPriceField = pivotTable.Fields["TotalPrice"];
pivotTable.DataFields.Add(totPriceField);
// Don't know how to calc these vals here, so had to put them on the data sheet
var avgPriceField = pivotTable.Fields["AvgPrice"];
pivotTable.DataFields.Add(avgPriceField);
var prcntgOfTotalField = pivotTable.Fields["PrcntgOfTotal"];
pivotTable.DataFields.Add(prcntgOfTotalField);
// TODO: Get the sorting (by sales, descending) working:
// These two lines don't seem that they would do so, but they do result in the items
// being sorted by (grand) total purchases descending
//var fld = ((PivotField)pvt.PivotFields("Description"));
//fld.AutoSort(2, "Total Purchases");
//int dataCnt = pivotTable.ra //DataBodyRange.Columns.Count + 1;
FormatPivotTable();
}
private void FormatPivotTable()
{
int HEADER_ROW = 7;
if (DateTimeFormatInfo.CurrentInfo != null)
pivotTableWorksheet.Column(2).Style.Numberformat.Format =
DateTimeFormatInfo.CurrentInfo.YearMonthPattern;
// Pivot Table Header Row - bold and increase height
using (var headerRowFirstCell = pivotTableWorksheet.Cells[HEADER_ROW, 1])
{
headerRowFirstCell.Style.VerticalAlignment = ExcelVerticalAlignment.Center;
headerRowFirstCell.Style.Font.Bold = true;
headerRowFirstCell.Style.Font.Size = 12;
pivotTableWorksheet.Row(HEADER_ROW).Height = 25;
}
ColorizeContractItemBlocks(contractItemDescs);
// TODO: Why is the hiding not working?
HideItemsWithFewerThan1PercentOfSales();
}
You can use the built-in date format YearMonthPattern, which would give "september 2016" as the format.
pivotTableWorksheet.Column(2).Style.Numberformat.Format = DateTimeFormatInfo.CurrentInfo.YearMonthPattern;
If you really want MMM yy as pattern, you need to overwrite the culture format:
Thread.CurrentThread.CurrentCulture = new CultureInfo("nl-NL")
{
DateTimeFormat = { YearMonthPattern = "MMM yy" }
};
pivotTableWorksheet.Column(2).Style.Numberformat.Format = DateTimeFormatInfo.CurrentInfo.YearMonthPattern;
It doesn't seem that you can set the format on the field itself. You have to access through the pivot table object:
pivotTable.DataFields[0].Format = "MMM yy";
Any formatting applied to the underlying worksheet seems to be completely ignored.

string automatically converted in spring

I'm working on a project in Spring using Spring MVC. I'm importing data from .xls files.
The problem is that:
I'm reading the value "945854955" as a String, but it is saved in the DB as "9.45854955E8"
the value "26929" is saved as "26929.0"
the value "21/05/1987" is saved as "31918.0"
The /read code:
// import ...
@RequestMapping(value="/read")
public String Read(Model model,@RequestParam CommonsMultipartFile[] fileUpload)
throws IOException, EncryptedDocumentException, InvalidFormatException {
List<String> liste = new ArrayList();
Employe employe = new Employe();
String modelnom = null;
liste = extraire(modelnom); //See the second code
for (int m=0, i=29;i<liste.size();i=i+29) {
if(i % 29 == 0) {
m++;
}
employe.setNomEmploye(liste.get(29*m+1));
//...
employe.setDateNaissance((String)liste.get(29*m+8).toString()); // here i had the date problem
employe.setDateEntree((String)liste.get(29*m+9).toString()); // here i had the date problem
employe.setDateSortie((String)liste.get(29*m+10).toString()); // here i had the date problem
// ...
employe.setNumCpteBanc(liste.get(29*m+17)); // here i had the first & second case problem
employe.setNumCIMR(liste.get(29*m+19)); // here i had the first & second case problem
employe.setNumMUT(liste.get(29*m+20)); // here i had the first & second case problem
employe.setNumCNSS(liste.get(29*m+21)); // here i had the first & second case problem
boolean bool=true;
List<Employe> employes = dbE.getAll();// liste des employes
for (int n=0;n<employes.size();n++) {
if (employes.get(n).getMatriculeMY() == (int)mat ) {
bool= false;
}
}
if (bool) {
dbE.create(employe);
}
}
return "redirect";
}
extraire code
private List<String> extraire (String nomFichier) throws IOException {
List<String> liste = new ArrayList();
FileInputStream fis = new FileInputStream(new File(nomFichier));
HSSFWorkbook workbook = new HSSFWorkbook(fis);
HSSFSheet spreadsheet = workbook.getSheetAt(0);
Iterator < Row > rowIterator = null;
// recup une ligne
rowIterator = spreadsheet.iterator();
while (rowIterator.hasNext()) {
int i = 0;
HSSFRow row = (HSSFRow) rowIterator.next();
Iterator < Cell > cellIterator = row.cellIterator();
while ( cellIterator.hasNext()) {
Cell cell = cellIterator.next();
i++;
/**
* Pour verifier si une ligne est vide. (for verifing if the line is empty)
*/
if (i % 29 == 0 || i == 1) {
while ( cellIterator.hasNext() && cell.getCellType() == Cell.CELL_TYPE_BLANK) {
cell = cellIterator.next();
}
}
switch (cell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
String cellule = String.valueOf(cell.getNumericCellValue());
liste.add(cellule);
break;
case Cell.CELL_TYPE_STRING:
liste.add(cell.getStringCellValue());
break;
case Cell.CELL_TYPE_BLANK:
cellule = " ";
liste.add(cellule);
break;
}
}
}
fis.close();
return liste;
}
}
Excel tries to infer a data type for each cell, and sometimes, even when you explicitly specify the data type, Excel may still try to cast the cell. You can try right-clicking the cell, selecting 'Format Cells', and then selecting 'Text' as the type (Category). However, at parse time it may still get hosed up.
Your quickest solution might be to save the file as a CSV and use that. You can still edit it in Excel, although you will need to do some validation to ensure Excel isn't applying the above conversions when saving as CSV. There are a lot of good Java CSV parsers out there, such as OpenCSV and Super CSV.
The most time-consuming, but probably the most correct way, if you want to continue to use Excel, is to build a middleware layer that parses each row and correctly identifies and formats the cell values. Apache POI with HSSF and XSSF can be used. Be warned that handling both xls and xlsx requires two different sets of libraries and often enough abstraction to handle both.
See https://poi.apache.org/spreadsheet/
As an Example:
protected String getCellValue(final Cell cell) {
if (null == cell) { return null; }
// For Excel binaries 97 and below, setting the cell type to CELL_TYPE_STRING converts a
// date-formatted value to a plain number. To avoid this we check that the cell type is numeric and then that it is
// date formatted. If we don't check that it is numeric first, an exception is thrown.
if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC && isCellDateFormated(cell)) {
// isCellDateFormated is a separate util function that checks whether the numeric cell carries a date format
// (POI's DateUtil.isCellDateFormatted(cell) performs this kind of check).
// The dd/MM/yyyy pattern is an assumption taken from the question's sample data.
return new java.text.SimpleDateFormat("dd/MM/yyyy").format(cell.getDateCellValue());
}
cell.setCellType(Cell.CELL_TYPE_STRING);
return cell.toString();
}
Hope this helps.
============Update==================
Instead of calling methods like getNumericCellValue(), try setting the cell type to String and using toString() as in the example above. Here is my test code.
Note the xls file has one row with 5 cells; as CSV: "abba,1,211,q123,11.22"
public void testExtract() throws Exception{
InputStream is = new FileInputStream("/path/to/project/Test/src/test/java/excelTest.xls");
HSSFWorkbook wb = new HSSFWorkbook(is);
HSSFSheet sheet = wb.getSheetAt(0);
Iterator<Row> rowIter = sheet.iterator();
while (rowIter.hasNext()){
HSSFRow row = (HSSFRow) rowIter.next();
Iterator<Cell> cellIter = row.cellIterator();
while (cellIter.hasNext()){
Cell cell = cellIter.next();
System.out.println("Raw to string: " + cell.toString());
// Check for data format here. If you set a date cell to string and to string the response the output is funky.
cell.setCellType(Cell.CELL_TYPE_STRING);
System.out.println("Formatted to string: " + cell.toString());
}
}
is.close();
}
Output is
Raw to string: abba
Formatted to string: abba
Raw to string: 1.0
Formatted to string: 1
Raw to string: 211.0
Formatted to string: 211
Raw to string: q1123
Formatted to string: q1123
Raw to string: 11.22
Formatted to string: 11.22
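The "9.45854955E8" and "26929.0" symptoms from the question come from String.valueOf(double), not from the spreadsheet itself. As an alternative to changing the cell type, here is a minimal sketch that turns the double back into plain digits using only the JDK; the class and method names are hypothetical. When POI is available, its DataFormatter.formatCellValue(cell) is the library route to the text as Excel displays it.

```java
import java.math.BigDecimal;

class NumericCellTextSketch {
    // Renders a numeric cell value without exponent notation or a trailing ".0".
    static String plain(double value) {
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(String.valueOf(945854955.0)); // 9.45854955E8
        System.out.println(plain(945854955.0));          // 945854955
        System.out.println(plain(26929.0));              // 26929
        System.out.println(plain(11.22));                // 11.22
    }
}
```

The date case ("21/05/1987" stored as 31918) is different in kind: that number is the date's serial value, so it needs a date conversion based on the cell's format rather than number formatting.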
