Not able to read £1,000.00 from excel using poi


I know how to read a number, a string, etc., but I am not sure how to read alphanumeric values or special characters. Currently I am trying to read/write £1,000.00 to Excel using Apache POI, but I get a format exception. Could you please give me some lines of code for it?
Below is the code:
String str = driver.findElement(By.xpath("//*[@id='wrapper']/div[4]/div/form/div[1]/div/div/div[1]/table/tbody/tr[3]/td[2]")).getText().trim().replaceAll("\\D", "");
System.out.println(str);
env.setInputFile(new FileInputStream("C://ABC//DataNew.xls"));
env.setBook(new HSSFWorkbook(env.getInputFile()));
env.setOutFile(new FileOutputStream("C://ABC//DataNew.xls"));
env.getBook().createSheet("test").createRow(15).createCell((short)6).setCellValue(str);
env.getBook().write(env.getOutFile());
env.getOutFile().flush();
env.getOutFile().close();
And the error I am getting is:
org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance, the following exception occured: null
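
One thing visible in the code above: replaceAll("\\D", "") removes every non-digit character, including the decimal point, so "£1,000.00" becomes "100000". Below is a minimal sketch, assuming the intent is to keep the numeric value 1000.00. The class and method names (CurrencyText, parseAmount) are hypothetical, and it uses only the JDK, not POI. It does not address the RecordFormatException itself, which appears to be raised while the existing .xls file is being read rather than by the string.

```java
import java.math.BigDecimal;

class CurrencyText {
    // Hypothetical helper: keeps digits and the decimal point, drops "£" and ",".
    // Assumes '.' is the decimal separator and ',' is only a grouping separator.
    static BigDecimal parseAmount(String text) {
        String cleaned = text.trim().replaceAll("[^0-9.]", "");
        return new BigDecimal(cleaned);
    }

    public static void main(String[] args) {
        // What the original replaceAll("\\D", "") produces: the decimal point is lost.
        System.out.println("£1,000.00".replaceAll("\\D", ""));
        // What the helper produces: the numeric value with its two decimals.
        System.out.println(parseAmount("£1,000.00"));
    }
}
```

If a numeric cell is wanted rather than text, the result could be passed to setCellValue as a double via doubleValue().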

Related

PySpark - data mismatch error when trying to split a column content

I'm trying to use PySpark's split() method on a column that has data formatted like:
[6b87587f-54d4-11eb-95a7-8cdcd41d1310, 603, landing-content, landing-content-provider]
My intent is to extract the 4th element, i.e. the one after the last comma.
I'm using syntax like this:
mydf.select("primary_component").withColumn("primary_component_01",f.split(mydf.primary_component, "\,").getItem(0)).limit(10).show(truncate=False)
But I'm consistently getting this error:
"cannot resolve 'split(mydf.`primary_component`, ',')' due to data
type mismatch: argument 1 requires string type, however,
'mydf.`primary_component`' is of
struct<uuid:string,id:int,project:string,component:string>
type.;;\n'Project [primary_component#17,
split(split(primary_component#17, ,)[1], \,)...
I've also tried escaping the "," using \, \\ or not escaping it at all and this doesn't make any difference. Also, removing the ".getItem(0)" produces no difference.
What am I doing wrong? I feel like a dumbass, but I don't know how to fix this...
Thank you for any suggestions

You are getting the error:
"cannot resolve 'split(mydf.`primary_component`, ',')' due to data
type mismatch: argument 1 requires string type, however,
'mydf.`primary_component`' is of
struct<uuid:string,id:int,project:string,component:string>
because your column primary_component is of struct type, while split expects a string column.
Since primary_component is already a struct and you are interested in the value after the last comma, you can select that field directly using dot notation:
mydf.withColumn("primary_component_01", f.col("primary_component.component"))
In the error message, Spark has shared the schema of your struct as
struct<uuid:string,id:int,project:string,component:string>
i.e.
column     data type
uuid       string
id         int
project    string
component  string
For future debugging purposes, you may use mydf.printSchema() to show the schema of the Spark DataFrame in use.

NumberFormatException when i try to toDouble() the String, even when the input String is a valid representation of a Double

I have an issue with my code. I'm using Android Studio with Kotlin.
I have an EditText field with android:inputType="numberDecimal".
When I try to convert that string to a Double:
val preis = etProduktGekauftPreis.text.toString().toDouble()
and then try to create a new object with "preis":
val produkt = Produkt(name, anzahl.toInt(), preis)
I get an error, for example: NumberFormatException: For input string: "3.00"
The string is a valid representation of a number, isn't it? Why do I keep getting this error?
Thanks for the help :)

It's due to your locale: in German, "3,00" would be expected instead of "3.00". You would need to parse the string differently, for example by replacing the comma depending on whichever locales you support, or by removing the comma, converting to a double, and then dividing by 100.
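
Locale-aware parsing can be sketched with java.text.NumberFormat, which is part of the JDK and can also be called from Kotlin. This is only an illustration under assumptions: the class and method names (LocaleNumbers, parse) are hypothetical, and the locales shown are examples rather than the ones the app actually supports.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class LocaleNumbers {
    // Hypothetical helper: parses text using the separators of the given locale
    // and rejects input that is not consumed completely.
    static double parse(String text, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number number = format.parse(text, pos);
        if (number == null || pos.getIndex() != text.length()) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + text);
        }
        return number.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(parse("3,00", Locale.GERMANY)); // comma as decimal separator
        System.out.println(parse("3.00", Locale.US));      // dot as decimal separator
    }
}
```

Passing the locale explicitly keeps the result independent of the device settings, which avoids the divide-by-100 workaround.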

How to fix 'Unclosed quotation mark after the character string \')\'.' error

I'm generating a dynamic sql query based on some user input. Here is the code that prepares the query:
var preparedParamValues = paramValues.map(paramValue => `'${paramValue}'`).join(',');
var sql = `INSERT INTO [DB] (${paramNames}) VALUES (${preparedParamValues})`;
When I send the following string to the DB:
'They're forced to drive stupid cars.'
I get this error:
'Unclosed quotation mark after the character string \')\'.'
I'm trying to find a way to escape all those characters but I don't understand the error or at least the last part of it with all the symbols.

You have to use two single quotes when a single quote appears in the string:
'They''re forced to drive stupid cars.'
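
The doubling rule can be sketched as a small helper. The question's code is JavaScript, where the equivalent step would be paramValue.replace(/'/g, "''") inside the map; the Java version below only illustrates the same rule, and the names SqlQuoting and quote are hypothetical. For user input, a parameterized query is generally the safer choice, because it avoids building the literal by hand at all.

```java
class SqlQuoting {
    // Hypothetical helper: wraps a value in single quotes and doubles every
    // single quote inside it, so the SQL string literal stays closed.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(quote("They're forced to drive stupid cars."));
        // prints 'They''re forced to drive stupid cars.'
    }
}
```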

invalid input syntax for type numeric: " "

I'm getting this message in Redshift: invalid input syntax for type numeric: " " , even after trying to implement the advice found in SO.
I am trying to convert text to number.
In my inner join, I try to make sure that the text being processed is first converted to null when there is an empty string, like so:
nullif(trim(atl.original_pricev::text),'') as original_price
... I noticed from a related post on coalesce that you have to convert the value to text before you can try and nullif it.
Then in the outer join, I test to see that there's a limited set of acceptable characters and if this test is met I try to do the to_number conversion:
,case
when regexp_instr(trim(atl.original_price),'[^0-9.$,]')=0
then to_number(atl.original_price,'FM999999999D00')
else null
end as original_price2
At this point I get the above error and unfortunately I can't see the details in datagrip to get the offending value.
So my questions are:
1. I notice that there is an empty space in my error message: invalid input syntax for type numeric: " ". Does this error have exactly the same meaning as invalid input syntax for type numeric: '', which is what I see in similar posts?
2. Of course: what am I doing wrong?
Thanks!

It's hard to know for sure without some data and the complete code to try and reproduce the example, but as some have mentioned in the comments the most likely cause is the to_number() function you are using.
In the earlier code fragment you are converting original_price to text (string) and then substituting an empty string ('') if the value is NULL. Calling the to_number() function on an empty string will give you the error described.
Without the full SQL statement it's not clear why you're putting the nullif() function around original_price in the "inner join", or whether the CASE statement is really in an outer join clause or is one of the columns returned by the query. However, you could perhaps alter the nullif() to substitute a value that can be converted to a number, e.g. '0.00' instead of ''.

Sorry I couldn't share real data. I spent the weekend testing small sets to try and trap the error. I found that the error was caused by the input string having no digits, which is permitted by my regex filter:
when regexp_instr(trim(atl.original_price),'[^0-9.$,]')=0
I wrongly expected that a non-numeric string like "$" would evaluate to NULL and that the to_number function would then return NULL. But from experimenting, it seems that it needs at least one digit somewhere in the string. Otherwise it reduces the string argument to an empty string before applying the to_number format and chokes.
For example, select to_number(trim('$1'::text),'FM999999999999D00') will evaluate to 1, but select to_number(trim('$A'::text),'FM999999999999D00') will throw the empty string error.
My fix was to add an additional regex to my initial filter:
and regexp_instr(atl.original_price2,'[0-9]')>0
This ensures that at least one digit is in the string, and after that the empty string error went away.
Hope my learning experience helps someone else.
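
The two-step filter described above (no characters outside [0-9.$,], and at least one digit) can be illustrated outside the database. This is a sketch in Java rather than Redshift SQL: the names PriceFilter and toNumber are hypothetical, and BigDecimal stands in for to_number, so it only mirrors the logic of the checks, not Redshift's exact behavior.

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Pattern;

class PriceFilter {
    // Same character classes as the two regexp_instr checks in the post.
    private static final Pattern DISALLOWED = Pattern.compile("[^0-9.$,]");
    private static final Pattern DIGIT = Pattern.compile("[0-9]");

    // Hypothetical helper: returns a number only when both checks pass.
    static Optional<BigDecimal> toNumber(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String text = raw.trim();
        // Reject disallowed characters, and strings such as "$" or "" with no digit.
        if (DISALLOWED.matcher(text).find() || !DIGIT.matcher(text).find()) {
            return Optional.empty();
        }
        try {
            // Drop the currency sign and grouping commas before converting.
            return Optional.of(new BigDecimal(text.replaceAll("[$,]", "")));
        } catch (NumberFormatException e) {
            return Optional.empty(); // still malformed, e.g. "1.2.3"
        }
    }

    public static void main(String[] args) {
        System.out.println(toNumber("$1"));
        System.out.println(toNumber("$"));
    }
}
```

Without the digit check, "$" and " " would pass the first filter and reach the conversion as an empty string, which matches the failure reported in the post.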

Treat all cells as strings while using the Apache POI XSSF API

I'm using the Apache POI framework for parsing large Excel spreadsheets. I'm using this example code as a guide: XLSX2CSV.java
I'm finding that cells that contain just numbers are implicitly being treated as numeric fields, while I wanted them to be treated always as strings. So rather than getting 1.00E+13 (which I'm currently getting) I'll get the original string value: 10020300000000.
The example code uses an XSSFSheetXMLHandler, which is passed an instance of DataFormatter. Is there a way to use that DataFormatter to treat all cells as strings?
Or, as an alternative: in the implementation of the SheetContentsHandler.cell method there is a string value that is the cellReference. Is there a way to convert a cellReference into an index so that I can use the SharedStringsTable.getEntryAt(int idx) method to read directly from the strings table?
To reproduce the issue, just run the sample code on an xlsx file of your choice with a number like the one in my example above.
UPDATE: It turns out that the string value I get seems to match what you would see in Excel. So I guess that's going to be "good enough" generally. I'd expect the data I'm sent to "look right" and therefore it'll get parsed correctly. However, I'm sure there will be mistakes and in those cases it'd be nice if I could get at the raw string value using the streaming API.

To resolve this issue I created my own class based on XSSFSheetXMLHandler.
I copied that class, renamed it, and then in the endElement method I changed this part of the code, which formats the raw string:
case NUMBER:
    String n = value.toString();
    if (this.formatString != null && n.length() > 0)
        thisStr = formatter.formatRawCellContents(Double.parseDouble(n), this.formatIndex, this.formatString);
    else
        thisStr = n;
    break;
I changed it so that it would not format the raw string:
case NUMBER:
    thisStr = value.toString();
    break;
Now every number in my spreadsheet has its raw value returned rather than a formatted version.
