Treat all cells as strings while using the Apache POI XSSF API - apache-poi

I'm using the Apache POI framework for parsing large Excel spreadsheets. I'm using this example code as a guide: XLSX2CSV.java
I'm finding that cells that contain just numbers are implicitly being treated as numeric fields, while I wanted them to be treated always as strings. So rather than getting 1.00E+13 (which I'm currently getting) I'll get the original string value: 10020300000000.
The example code uses a XSSFSheetXMLHandler which is passed an instance of DataFormatter. Is there a way to use that DataFormatter to treat all cells as strings?
Or as an alternative: in the implementation of the interface SheetContentsHandler.cell method there is string value that is the cellReference. Is there a way to convert a cellReference into an index so that I can use the SharedStringsTable.getEntryAt(int idx) method to read directly from the strings table?
To reproduce the issue, just run the sample code on an xlsx file of your choice with a number like the one in my example above.
UPDATE: It turns out that the string value I get seems to match what you would see in Excel. So I guess that's going to be "good enough" generally. I'd expect the data I'm sent to "look right" and therefore it'll get parsed correctly. However, I'm sure there will be mistakes and in those cases it'd be nice if I could get at the raw string value using the streaming API.

To resolve this issue I created my own class based on XSSFSheetXMLHandler
I copied that class, renamed it and then in the endElement method I changed this part of the code which is formatting the raw string:
case NUMBER:
String n = value.toString();
if (this.formatString != null && n.length() > 0)
thisStr = formatter.formatRawCellContents(Double.parseDouble(n), this.formatIndex, this.formatString);
else
thisStr = n;
break;
I changed it so that it would not format the raw string:
case NUMBER:
thisStr = value.toString();
break;
Now every number in my spreadsheet has its raw value returned rather than a formatted version.

Related

Unity: Comparing strings doesn't seem to work in this case?

I'm trying to compare to strings in a function to be able to do stuff if it's true. However, for some reason the two strings I'm comparing are never equal to eachother.
I am creating a new string in a function, getting data from a Json and then comparing that string with another string in another class. This doesn't work for some reason. The Debug shows that both strings are equal, but the if-statement returns false.
I created two other string variables (hello & hello2) with the same value ("Hello") and this time it's comparing correctly.
Check the images below. What am I doing wrong?
As you can see in the console, both strings have the same value:
Image 1. Here's where I create the string (zoneId).
Image 2. Further down in the same function, same for-loop.
Here's where I'm trying to compare the string created in this function with another string from another class.
Image 3. As you can see in the console it's looping through the jsonArray. But it's returning false even though it's clear that both strings have the same value.
Image 4 and 5. Here I am testing with two other strings inside the function and they are working fine.
Does it has something to do calling a string from another class?
Here's how my other string in userInfo is set up:
public string userID { get; private set; }
In Image 3, is there an additional space before the second 2?
How are you processing Main.Instance.userInfo.zoneID?
You need to use string.Equals for string comparison instead of operator == or !=.
Are string.Equals() and == operator really same?

Error converting string feature to numeric in Azure ML studio

QuotedPremium column is a string feature so I need to convert it to numeric value in order to use algorithm.
So, for that I am using Edit Metadata module, where I specify data type to be converted is Floating Point.
After I run it - I got an error:
Could not convert type System.String to type System.Double, inner exception message: Input string was not in a correct format.
What am I missing here?
As mentioned in comments, you must change column where numbers are handled as text to numeric type data and it shouldn't have any null values. Now answering the question of how to substitute NULL's in data using ML studio and converting to numeric type.
Substitute NULL's in data
Use Execute R Script module for that, and add this code in it.
dataset1 <- maml.mapInputPort(1); # class: data.frame
dataset1[dataset1 == "NULL"] = 0; # Wherever cell's value is "NULL", replace it with 0
maml.mapOutputPort("dataset1"); # return the modified data.frame
Image for same:
Convert to numeric data
As you have added in your answer, this can be done using the Edit Metadata module.

Processing Split (server)

I am doing 2player game and when I get informations from server, it's in format "topic;arg1;arg2" so if I am sending positions it's "PlayerPos;x;y".
I then use split method with character ";".
But then... I even tried to write it on screen "PlayerPos" was written right, but it cannot be gained through if.
This is how I send info on server:
server.write("PlayerPos;"+player1.x+";"+player1.y);
And how I accept it on client:
String Get=client.readString();
String [] Getted = split(Get, ';');
fill(0);
text(Get,20,20);
text(Getted[0],20,40);
if(Getted[0]=="PlayerPos"){
text("HERE",20,100);
player1.x=parseInt(Getted[1]);
player1.x=parseInt(Getted[2]);
}
It writes me "PlayerPos;200;200" on screen, even "PlayerPos" under it. But it never writes "HERE" and it never makes it into the if.
Where is my mistake?
Don't use == when comparing String values. Use the equals() function instead:
if(Getted[0].equals("PlayerPos")){
From the Processing reference:
To compare the contents of two Strings, use the equals() method, as in if (a.equals(b)), instead of if (a == b). A String is an Object, so comparing them with the == operator only compares whether both Strings are stored in the same memory location. Using the equals() method will ensure that the actual contents are compared. (The troubleshooting reference has a longer explanation.)

Cognos query calculation - how to obtain a null/blank value?

I have a query calculation that should throw me either a value (if conditions are met) or a blank/null value.
The code is in the following form:
if([attribute] > 3)
then ('value')
else ('')
At the moment the only way I could find to obtain the result is the use of '' (i.e. an empty character string), but this a value as well, so when I subsequently count the number of distinct values in another query I struggle to get the correct number (the empty string should be removed from the count, if found).
I can get the result with the following code:
if (attribute='') in ([first_query].[attribute]))
then (count(distinct(attribute)-1)
else (count(distinct(attribute))
How to avoid the double calculation in all later queries involving the count of attribute?
I use this Cognos function:
nullif(1, 1)
I found out that this can be managed using the case when function:
case
when ([attribute] > 3)
then ('value')
end
The difference is that case when doesn't need to have all the possible options for Handling data, and if it founds a case that is not in the list it just returns a blank cell.
Perfect for what I needed (and not as well documented on the web as the opposite case, i.e. dealing with null cases that should be zero).

C# 4.0 function to check for first four characters in the string

I need to validate for valid code name.
So, my string can have values like below:
String test = "C000. ", "C010. ", "C020. ", "C030. ", "CA00. ","C0B0. ","C00C. "
So my function needs to validate below conditions:
It should start with C
After that next 3 characters should be numeric before .
Rest it can be anything.
So in above string values, only ["C000.", "C010.", "C020.", "C030."] are valid ones.
EDIT:
Below is the code I tried:
if (nameObject.Title.StartsWith(String.Format("^[C][0-9]{3}$",nameObject.Title)))
I'd suggest a regex, for example (written off the top of my head, may need work):
string s = "C030.";
Regex reg = new Regex("C[0-9]{3,3}\\.");
bool isMatch = reg.IsMatch(s);
This regex should do the trick:
Regex.IsMatch(input, #"C[0-9]{3}\..*")
Check out http://www.techotopia.com/index.php/Working_with_Strings_in_C_Sharp
for a quick tutorial on (among other things) individual access of string elements, so you can test each element for your criteria.
If you think your criteria may change, using regular expressions gives you maximum flexibility (but is more runtime intensive than regular string-element evaluation). In your case, it may be overkill, IMHO.

Resources