Apache POI: handling "Special" formats?

I am looking into generating an xlsx file and, for certain cells, applying a "Special" category (that is the exact word used in Excel, in Format Cells > Category) with a specific locale. For instance, my local installation of Excel comes with "Social Insurance Number" for the locale "English (Canada)".
I have checked the POI API and Googled a bit, but I am puzzled about how to do that.
I have tried creating such cells manually (using Excel directly) and then reading them using POI.
If I call getCellStyle().getDataFormat() on my cell, I get values greater than or equal to 164, which I guess means the format is considered user-defined, since POI's org.apache.poi.ss.usermodel.BuiltinFormats#FIRST_USER_DEFINED_FORMAT_INDEX constant is 164.
Is what I am trying to do achievable at all? I do not even know where Excel's Special types are defined, generally speaking. They do seem to be built-in.

All Excel number formats are based on format patterns, including the special ones. See How to control and understand settings in the Format Cells dialog box in Excel.
To find out which exact format pattern a special format needs, apply the special format to a cell and then look at the corresponding custom format.
The following is the special format "Social Insurance Number (CH)" - Switzerland:
This is a German Excel: "Sonderformat" is "Special".
This corresponds to the custom format 000\.00\.000\.000. You can deduce that by simply changing to the category Custom in Excel's Format Cells - Number dialog and then reading the pattern from the text field below Type:
This is a German Excel: "Benutzerdefiniert" is "Custom".
So the general way is: first choose your special format in Excel's Format Cells - Number dialog. Then change the category to Custom and read the corresponding pattern from the text field below Type:.
Once you have the number format pattern, it can be used with Apache POI as follows:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
class CreateExcelNumberFormat {
    public static void main(String[] args) throws Exception {
        try (Workbook workbook = new XSSFWorkbook();
             FileOutputStream fileout = new FileOutputStream("Excel.xlsx")) {
            DataFormat format = workbook.createDataFormat();
            CellStyle specialSIN = workbook.createCellStyle();
            specialSIN.setDataFormat(format.getFormat("000\\.00\\.000\\.000"));
            Sheet sheet = workbook.createSheet();
            Cell cell = sheet.createRow(0).createCell(0);
            cell.setCellStyle(specialSIN);
            cell.setCellValue(12345678901d);
            sheet.setColumnWidth(0, 14*256);
            workbook.write(fileout);
        }
    }
}
Your Social Insurance Number for the locale English (Canada) would be:
...
specialSIN.setDataFormat(format.getFormat("000 000 000"));
...
cell.setCellValue(46454286);
...
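To see what the two patterns above do to a number, here is a minimal plain-Java sketch that emulates only their digit-placeholder behavior. The helper applyDigitMask is hypothetical: it is not part of POI or Excel, and it assumes a non-negative whole number and a pattern made only of 0 placeholders and literal characters (optionally backslash-escaped).

```java
final class DigitMaskSketch {

    // Hypothetical helper, for illustration only: fills the '0' placeholders
    // of the mask from right to left with the digits of the value, pads any
    // unused placeholders with '0' and copies literal characters as they are.
    static String applyDigitMask(String mask, long value) {
        String digits = Long.toString(value); // assumes value >= 0
        StringBuilder reversed = new StringBuilder();
        int d = digits.length() - 1;
        for (int i = mask.length() - 1; i >= 0; i--) {
            char c = mask.charAt(i);
            if (c == '0') {
                reversed.append(d >= 0 ? digits.charAt(d--) : '0');
            } else if (c != '\\') {
                // A backslash only escapes the next character, so it is dropped.
                reversed.append(c);
            }
        }
        // Digits that did not fit into the mask stay in front of it.
        while (d >= 0) {
            reversed.append(digits.charAt(d--));
        }
        return reversed.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(applyDigitMask("000\\.00\\.000\\.000", 12345678901L));
        System.out.println(applyDigitMask("000 000 000", 46454286L));
    }
}
```

The second call shows why the Canadian pattern is useful: the stored number 46454286 has only eight digits, and the leading placeholder supplies the missing zero.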

Related

Data extraction from excel with operators is unable to store values

I have an Excel file with two columns: one has a name, the other has the corresponding mass. I have used the lines below to read it and find the position of the name. But when I try to look up the mass for that name, as shown below, the value is not stored. In the Excel file, the mass values are written as 1.989*10^30. This seems to be the cause, since the same code works fine when the cells in the Excel file contain plain numeric values.
majbod = 'Sun';
minbod = 'Earth';
majbodin = readtable("Major_and_Minor_Bodies.xlsx","Sheet",1);
minbodin = readtable("Major_and_Minor_Bodies.xlsx","Sheet",2);
MAJORBODY = table2array(majbodin(:,"Major_Body"));
MINORBODY = table2array(minbodin(:,"Minor_Body"));
mmaj = table2array(majbodin(:,"Mass"));
mmin = table2array(minbodin(:,"Mass"));
selected_majbody = find(strcmp(MAJORBODY,majbod));
selected_minbody = find(strcmp(MINORBODY,minbod));
M = mmaj(selected_majbody);
m = mmin(selected_minbody);
disp([M ;m])
Is there a better way to write the code compared to the way which I wrote?
Thanks.
Excel does its best to figure out what kind of data is in each cell. Since your data contains something besides plain numbers, Excel treats it as a string. You have a couple of options for getting around that:
If you put an equals sign in front of it, Excel will treat it as a formula and calculate the value of 1.989*10^30 for you. The result will be a number.
Since scientific notation is so common, programmers have created a shortcut for representing it. They often use the character 'E' where you use "*10^". This means that if you type "1.989E30", Excel will recognize it as a number.
If keeping the current string format is very important, you could modify the string during extraction: replace '*10^' with 'E', and then use the string-to-number parser of whatever language you are using.
If the real problem is that the numbers are just too long in Excel, you can always format the cells that they are in (right-click the cell, select Format Cells, then select Scientific).
Good luck
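As an illustration of the "replace '*10^' with E" option, here is a minimal Java sketch. The class and method names are made up for this example, and it assumes the cell text is either a plain number or has exactly the form mantissa*10^exponent.

```java
final class SciNotationSketch {

    // Turns text such as "1.989*10^30" into "1.989E30" and parses it.
    // Plain numeric text such as "42.5" passes through unchanged.
    static double parseMass(String cellText) {
        String normalized = cellText
                .replaceAll("\\s+", "")   // tolerate "1.989 * 10^30"
                .replace("*10^", "E");    // literal replacement, no regex
        return Double.parseDouble(normalized);
    }

    public static void main(String[] args) {
        System.out.println(parseMass("1.989*10^30"));
        System.out.println(parseMass("5.972 * 10^24"));
        System.out.println(parseMass("42.5"));
    }
}
```

Double.parseDouble throws a NumberFormatException for anything else, which makes unexpected cell content visible instead of silently storing nothing.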

SAS import excel date format changes

I need to import an Excel file. It has a few columns, and the first column, A, is a date column. Column A has the date format DDMMMYYYY, e.g. '01Jan2017', and in Excel its data type is date. When I import it into SAS, all the other columns keep their data type (numeric, character, etc.) and value, but column A becomes a number (e.g. '42736' for '01Jan2017'). How do I import the data as it is, without converting the data type?
libname out '/path';
proc import out=out.sas_output_dataset
datafile='/path/excel_file.xlsx'
DBMS=XLSX
REPLACE;
sheet="Sheet1";
run;
It is hard to know without seeing the data. The below is general information, it may not answer your precise problem.
To avoid common errors you should set mixed=yes in your libname statement. You may also want to include the stringdates=yes option.
mixed=yes allows for any out-of-range Excel date values.
stringdates=yes brings all dates into SAS in character format, so you will need to use the input() function to convert them into SAS dates.
Date = input( Date , mmddyy10. );
I would suggest that you import the excel with the import wizard in SAS. Afterwards right-click on the query and extract the code, see here: SAS Import Query DE
In the generated code itself you can format each imported column into the desired format.
For the possible format see: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/leforinforref/n0p2fmevfgj470n17h4k9f27qjag.htm
Hope this helps.
A value of '42736' for '01Jan2017' is an indication that the column in the Excel file has a mix of cells with date values and cells with string values. In that case SAS will make the variable character and store the date values as a digit string that represents the raw number excel uses for the date. To convert '42736' to a date value you need to first convert it to a number and then adjust the number for the difference in the base date used by Excel.
date_value = input(date_string,32.) + '30DEC1899'd ;
To convert the strings that look like '01JAN2017' use the DATE informat instead.
date_value = input(date_string,date11.);
You could add logic to do both to handle a column with mixed values.
date_value = input(date_string,??date11.);
if missing(date_value) then
  date_value = input(date_string,??32.) + '30DEC1899'd;
To have the new variable print the date values in a human readable style attach a date type format to the variable.
format date_value date9. ;
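For readers outside SAS, the same mixed-column logic can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names are invented, the text form is assumed to be DDMMMYYYY with English month abbreviations, and the base date 30 December 1899 is the one used in the SAS code above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;

final class ExcelSerialDateSketch {

    // Day 0 of Excel's date serial numbers, as in '30DEC1899'd above.
    static final LocalDate EXCEL_BASE = LocalDate.of(1899, 12, 30);

    // Accepts "01Jan2017" as well as "01JAN2017".
    static final DateTimeFormatter DDMMMYYYY = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("ddMMMuuuu")
            .toFormatter(Locale.ENGLISH);

    // Tries the text form first, then falls back to the raw serial number.
    static LocalDate parse(String raw) {
        String s = raw.trim();
        try {
            return LocalDate.parse(s, DDMMMYYYY);
        } catch (DateTimeParseException notADateString) {
            return EXCEL_BASE.plusDays(Long.parseLong(s));
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("42736"));     // 2017-01-01
        System.out.println(parse("01Jan2017")); // 2017-01-01
    }
}
```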

Apache POI : How to format numeric cell values

I am using Apache POI 3.9 for XLS/XLSX file processing.
In the XLS sheet, there is a column with numeric value like "3000053406".
When I read it with POI with..
cell.getNumericCellValue()
It gives me a value like "3.00E+08". This creates a huge problem in my application.
How can I set the number formatting while reading data in Apache POI?
One way that I know of is to set the column to the "text" type. But I want to know if there is any other way on the Apache POI side while reading the data. Or can we format it by using a simple Java DecimalFormat?
This one comes up very often....
Picking one of my past answers to an almost identical question
What you want to do is use the DataFormatter class. You pass this a cell, and it does its best to return you a string containing what Excel would show you for that cell. If you pass it a string cell, you'll get the string back. If you pass it a numeric cell with formatting rules applied, it will format the number based on them and give you the string back.
For your case, I'd assume that the numeric cells have an integer formatting rule applied to them. If you ask DataFormatter to format those cells, it'll give you back a string with the integer string in it.
The problem can also be strictly Java-related rather than POI-related.
Since your call returns a double,
double val = cell.getNumericCellValue();
you may want to format it like this:
DecimalFormat df = new DecimalFormat("#");
int fractionalDigits = 2; // say 2
df.setMaximumFractionDigits(fractionalDigits);
String text = df.format(val); // format() returns a String, not a double
Creating a BigDecimal from the double value of the numeric cell, using BigDecimal.toPlainString() to convert it to a plain string, and then storing that string back in the same cell after erasing the old value solved the whole problem of exponential representation of numeric values.
The below code solved the issue for me.
Double dnum = cellContent.getNumericCellValue();
BigDecimal bd = new BigDecimal(dnum);
System.out.println(bd.toPlainString());
cellContent.setBlank();
cellContent.setCellValue(bd.toPlainString());
System.out.println(cellContent.getStringCellValue());
long varA = (long) cellB1.getNumericCellValue();
This puts the exact value into the variable varA, as long as the cell holds a whole number.
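The pure-Java conversions mentioned in these answers can be compared side by side without POI. The following is a small sketch with invented class and method names; the double literal stands in for what cell.getNumericCellValue() would return for the value in the question.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class PlainNumberSketch {

    // Exact decimal expansion of the double, without an exponent.
    static String viaBigDecimal(double value) {
        return new BigDecimal(value).toPlainString();
    }

    // Whole-number pattern; Locale.ROOT keeps the digits locale-independent.
    static String viaDecimalFormat(double value) {
        DecimalFormat df =
                new DecimalFormat("#", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(value);
    }

    public static void main(String[] args) {
        double value = 3000053406d; // stand-in for cell.getNumericCellValue()
        System.out.println(String.valueOf(value));   // 3.000053406E9
        System.out.println(viaBigDecimal(value));    // 3000053406
        System.out.println(viaDecimalFormat(value)); // 3000053406
        System.out.println((long) value);            // 3000053406
    }
}
```

The exponent only appears when the double is turned into text with the default conversion; the stored value itself is unchanged. For fractional values, new BigDecimal(double) prints the exact binary expansion, so BigDecimal.valueOf(double) is usually the more readable choice there.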

JExcel/POI: Creating a cell as type Text

I have a requirement that involves reading values from an excel spreadsheet, and populating a spreadsheet for users to modify and re-upload to our application. One of these cells contains a text string of 5 characters that may be letters, numbers, or a combination of both. Some of these strings contain only numbers, and begin with a zero. Because of this, the cell type is Text; however, when I use Apache POI or JExcel to populate a spreadsheet for the users to modify it is always set as cell type General.
Is there a way using either of these libraries, or some other excel api that I have not seen yet, to specify that a cell have type Text?
My co-worker just found a way to accomplish this. In JExcel, it can be accomplished by using a WritableCellFormat such as:
WritableCellFormat numberAsTextFormat = new WritableCellFormat(NumberFormats.TEXT);
Then, when you are creating your cell to add to a sheet you just pass in the format as normal:
Label l = new Label(0, 0, stringVal, numberAsTextFormat);
If you are using Apache POI, you would create an HSSFCellStyle and then set its data format like this:
HSSFCellStyle style = book.createCellStyle();
style.setDataFormat(HSSFDataFormat.getBuiltinFormat("text"));
Often, when a user enters a number in a cell whose type (formatting) is text (string), the spreadsheet software (OpenOffice or MS Office) changes its formatting automatically. I am using Apache POI, and this is how I wrote my code:
cell = row.getCell();
switch (cell.getCellType()) {
    case HSSFCell.CELL_TYPE_STRING:
        value = cell.getRichStringCellValue().getString();
        break;
    case HSSFCell.CELL_TYPE_NUMERIC:
        // The cell formatting or the cell data is numeric.
        // Suppose the user enters 0123456 (meant as a string):
        // read as a number, it comes back as 123456.0.
        // To get the string instead, switch the cell type to string first.
        cell.setCellType(Cell.CELL_TYPE_STRING); // set cell type string, so it will retain original value "0123456"
        value = cell.getRichStringCellValue().getString(); // value read is now "0123456"
        break;
    default:
}

Excel turning my numbers to floats

I have a bit of ASP.NET code that exports data in a datagrid into Excel but I noticed that it messes up a particular field when exporting.
E.g. I have a value like 89234010000725515875 in a column in the datagrid, but when exported, it turns into 8.9234E+19.
Is there any Excel formatting that will bring back my original number? Thanks.
Excel isn't really messing up the field. Two things are happening:
Excel formats large numbers in scientific notation. So "89234010000725515875" becomes "8.9234E+19" or "8.9234 x 10 ^ 19".
The size of the number "89234010000725515875" exceeds the precision that Excel uses to store values (15 significant digits). Excel stores your number as "89234010000725500000", so you're losing the last five digits.
Depending on your needs you can do one of two things.
Your first option is to change the formatting from "General" to "0" (Number with zero decimal places). This will give you "89234010000725500000", so you will have lost precision but you will be able to perform calculations on the number.
The second option is to format the cell as text ("@"), or to paste your field with an apostrophe at the beginning of the value to force it to be text. You'll get all of the digits, but you won't be able to do calculations on the value.
I hope this helps.
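The two effects described above can be reproduced in plain Java. This is only a sketch with invented names: it rounds to 15 significant digits, the precision limit of Excel, and formats with four decimals in scientific notation to match the display quoted above. It does not call Excel or any spreadsheet library.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Locale;

final class ExcelPrecisionSketch {

    // Excel keeps at most 15 significant digits of a number.
    static final MathContext FIFTEEN_DIGITS = new MathContext(15);

    // What remains of the digits once the value is stored as a number.
    static String asStoredNumber(String digits) {
        return new BigDecimal(digits).round(FIFTEEN_DIGITS).toPlainString();
    }

    // Scientific notation with four decimals, like the display quoted above.
    static String scientificDisplay(String digits) {
        return String.format(Locale.ROOT, "%.4E", Double.parseDouble(digits));
    }

    public static void main(String[] args) {
        String id = "89234010000725515875";
        System.out.println(asStoredNumber(id));    // 89234010000725500000
        System.out.println(scientificDisplay(id)); // 8.9234E+19
        System.out.println(id);                    // kept as text: nothing is lost
    }
}
```

Identifiers of this length are not quantities, so exporting them as text is usually the safer of the two options.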
You can add a space to the field, then when you export it to Excel, it's considered as string:
lblTest.Text = DTInfo.Rows(0).Item("Test") & " "
Good luck.
Below is the C# source code to do this with SpreadsheetGear for .NET. Since the SpreadsheetGear API is similar to Excel's API, you should be able to easily adapt this code to Excel's API to get the same result.
You can download a free trial here if you want to try it yourself.
Disclaimer: I own SpreadsheetGear LLC
using System;
using SpreadsheetGear;
namespace Program
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a new workbook and get a reference to A1.
            IWorkbook workbook = Factory.GetWorkbook();
            IWorksheet worksheet = workbook.Worksheets[0];
            IRange a1 = worksheet.Cells["A1"];
            // Format A1 as Text using the "@" format so that the text
            // will not be converted to a number, and put the text in A1.
            a1.NumberFormat = "@";
            a1.Value = "89234010000725515875";
            // Show the formatted value and the raw value.
            Console.WriteLine("FormattedValue={0}, Raw Value={1}", a1.Text, a1.Value);
            // Save the workbook.
            workbook.SaveAs(@"c:\tmp\Text.xls", FileFormat.Excel8);
            workbook.SaveAs(@"c:\tmp\Text.xlsx", FileFormat.OpenXMLWorkbook);
        }
    }
}
