Java 11 UTF-16 BOM issues with com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl

I have a UTF-16 XML file:
<?xml version="1.0" encoding="utf-16" standalone="yes"?>
It starts with BOM FE FF.
Migrating my code to Java 11, I get:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
This happens when unmarshalling the file with JAXB.
It happens whether I use the Reference Implementation:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:134) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:385) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:356) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
or MOXy:
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
at org.eclipse.persistence.internal.oxm.record.XMLStreamReaderReader.parse(XMLStreamReaderReader.java:98) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.internal.oxm.record.XMLStreamReaderReader.parse(XMLStreamReaderReader.java:86) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.internal.oxm.record.SAXUnmarshaller.unmarshal(SAXUnmarshaller.java:895) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.oxm.XMLUnmarshaller.unmarshal(XMLUnmarshaller.java:659) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.jaxb.JAXBUnmarshaller.unmarshal(JAXBUnmarshaller.java:585) ~[org.eclipse.persistence.moxy-2.5.2.jar:?]
Both go through com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.
Unmarshalling that file worked fine using Java 6 to 8. Did something change in Java 9 or 11?
If I remove the FE FF BOM, it unmarshals OK with Java 11.

It turns out my problem was caused by maven-resources-plugin with filtering set to true. Filtering was mangling every UTF-16 resource, changing the first 2 bytes to EF BF, so the file no longer started with a valid BOM by the time the parser read it.
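One way to confirm this kind of damage is to look at the first bytes of the resource before and after the build. The following is a minimal sketch, not part of the original post: the class name BomCheck and the method detectBom are made up for illustration, and the file path is whatever you pass on the command line (for example the copy of the resource in your build output directory).

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class BomCheck {

    /** Returns a label for the byte-order mark at the start of head, or "none". */
    public static String detectBom(byte[] head) {
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        return "none";
    }

    // Compares the leading bytes of data with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java BomCheck <file>");
            return;
        }
        byte[] head = new byte[4];
        int n;
        // Read at most the first four bytes; that is enough for the BOMs checked here.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.readNBytes(head, 0, head.length);
        }
        System.out.println(detectBom(Arrays.copyOf(head, n)));
    }
}
```

Running it against the source file and against the copy produced by the build should show whether the BOM survived resource filtering. A file whose first bytes have become EF BF reports "none".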

Related

AvalonEdit TextDocument crash reading large files

I have an editor implemented using AvalonEdit. When I open a new file, I populate the Document with a new TextDocument, and populate that TextDocument from an IEnumerable<char>. That IEnumerable just reads the file, char by char, from a stream.
We recently had it crash when opening a large file. Large meaning 600MB of plaintext. Don't ask, I think it's ridiculous, but our application needs to be able to view files that big. After a bunch of debugging, I found that the TextDocument constructor is crashing with an OutOfMemoryException, specifically with the message "Array dimensions exceeded supported range." Here's the relevant part of the stack trace:
System.OutOfMemoryException: Array dimensions exceeded supported range.
at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
at ICSharpCode.AvalonEdit.Utils.Rope`1.ToArray(IEnumerable`1 input)
at ICSharpCode.AvalonEdit.Utils.Rope`1..ctor(IEnumerable`1 input)
at ICSharpCode.AvalonEdit.Document.TextDocument..ctor(IEnumerable`1 initialText)
...
In my debugging I found that it happened consistently on the 134217728th character (2^27). That seems like an arbitrary limit. Is there any reason for this restriction? I'd expect the limit on an array's length to be int.MaxValue. It looks like it has something to do with the Rope creation, but I haven't been able to go through the Rope<T> source code yet.

Failed to parse the expression [#{myGalleryBean.photos.stream.filter((i)->i.selected eq true).toList()}]

I am trying to loop over a list and show only the images that are selected.
I am using this code in my index.xhtml:
But I keep getting the following error:
/index.xhtml @87,73 value="#{myGalleryBean.photos.stream.filter((i)->i.selected eq true).toList()}" Failed to parse the expression [#{myGalleryBean.photos.stream.filter((i)->i.selected eq true).toList()}]
I have this code working on Java EE 7, but after changing to Java EE 6 I got that error.
Any help on how to do this in Java EE 6 would be really appreciated.

The XML parser detected error code 302

I am using the XML-INTO op-code to parse a web service request. Every now and then I get errors in the logs
(RNX0351 - "The XML parser detected error code 302").
The help for a 302 is
302 The parser does not support the requested CCSID value or
the first character of the XML document was not '<'
To the best of my knowledge, the first character is "<", and the request is generated from a previous web service call, so I would be very surprised if the CCSID had changed.
The error is repeatable for the specific query, so it is almost certainly data related. I am just unsure how to go about identifying the offending item.
Any thoughts on how to determine the issue or, better yet, how to overcome it?
Cheers

CCSID is an AS400/iSeries/Power System attribute, and it applies to the whole IFS. It is like a declaration of what is inside the file, or in other words what its internal encoding "should be".
The encoding of the data in the file and the CCSID of the file itself (the envelope) are supposed to match, and the system uses this attribute to display and handle the corresponding characters.
It sounds like you receive data in one encoding, but the file's CCSID doesn't match it.
Try changing the CCSID on your file (only the envelope), e.g. 37 (american), 500 (latin-1), 819 (utf-8), 850 (dos), 1252 (win), and display the file afterwards. You can check the current value first using ls -Sla yourfile in QSH or QP2TERM, or with EDTF. CHGATTR lets you change the CCSID, as does setccsid in QSH.
This approach helped me track down related issues. Remember that although data may be readable on the AS/400, it may not be readable through a shared folder in Windows. That means the file's CCSID and the content encoding don't match.
Hope it helps.

Hi, I've seen this error with XML data uploaded to AS400/iSeries/IBM i over FTP with CCSID 819 (ISO 8859-1 ASCII), where the file had some binary garbage in its first few positions. I changed the encoding to CCSID 1208 (UTF-8 with IBM PUA) using the FTP command "quote type c 1208", after which the problem cleared and XML-INTO was successful.
So, my suggestion for XML parser error 302 from XML-INTO is to look at the file (wrklnk ...), and if the first character is not "<" but some binary garbage instead, try CCSID 1208 for UTF-8.
The statements in this answer about what 819 is and which CCSID represents UTF-8 do not agree with the previous answer, but they are correct according to the IBM documentation:
https://www-01.ibm.com/software/globalization/ccsid/ccsid819.html
https://www-01.ibm.com/software/globalization/ccsid/ccsid1208.html

I worked on this problem for a couple of hours.
For me the solution was to use the option ccsid=UCS2 when a data structure or variable is used to store the XML.
Something like this:
XML-INTO customer %XML( xmlSource : 'ccsid=UCS2');
My program runs with CCSID 870, and no CCSID conversion on the xmlSource field worked.
The strange thing is that when I use the file with CCSID 850, everything works fine.
I mention this because this is the first page that comes up when you search for this problem.
Maybe this helps someone.

Groovy: setPropertyValue()

I need to update a value in a Properties test step. The value arrives dynamically, and the value of 'line1' needs to be written to the 'abc' property in the Properties test step.
testRunner.testCase.getTestStepByName("Properties1").setPropertyValue(%s,"abc",line1)
This gives the following error message:
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
Script99.groovy: 19: expecting EOF, found '(' @ line 19, column 70.
Properties1").setPropertyValue(%s,"abc",
                              ^
org.codehaus.groovy.syntax.SyntaxException: expecting EOF, found '(' @ line 19, column 70.
at org.codehaus.groovy.antlr.AntlrParserPlugin.transformCSTIntoAST(AntlrParserPlugin.java:139)
at org.codehaus.groovy.antlr.AntlrParserPlugin.parseCST(AntlrParserPlugin.java:107)
at org.codehaus.groovy.control.SourceUnit.parse(SourceUnit.java:236)
at org.codehaus.groovy.control.CompilationUnit$1.call(CompilationUnit.java:163)
at org.codehaus.groovy.control.CompilationUnit.applyToSourceUnits(CompilationUnit.java:839)
at org.codehaus.groovy.control.CompilationUnit.doPhaseOperation(CompilationUnit.java:544)
at org.codehaus.groovy.control.CompilationUnit.processPhaseOperations(CompilationUnit.java:520)
at org.codehaus.groovy.control.CompilationUnit.compile(CompilationUnit.java:497)
at groovy.lang.GroovyClassLoader.doParseClass(GroovyClassLoader.java:306)
at groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:287)
at groovy.lang.GroovyShell.parseClass(GroovyShell.java:731)
at groovy.lang.GroovyShell.parse(GroovyShell.java:743)
at groovy.lang.GroovyShell.parse(GroovyShell.java:770)
at groovy.lang.GroovyShell.parse(GroovyShell.java:761)
at com.eviware.soapui.support.scripting.groovy.SoapUIGroovyScriptEngine.compile(SoapUIGroovyScriptEngine.java:148)
at com.eviware.soapui.support.scripting.groovy.SoapUIGroovyScriptEngine.run(SoapUIGroovyScriptEngine.java:93)
at com.eviware.soapui.impl.wsdl.teststeps.WsdlGroovyScriptTestStep.run(WsdlGroovyScriptTestStep.java:148)
at com.eviware.soapui.impl.wsdl.panels.teststeps.GroovyScriptStepDesktopPanel$RunAction$1.run(GroovyScriptStepDesktopPanel.java:274)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: Script99.groovy:19:70: expecting EOF, found '('
at groovyjarjarantlr.Parser.match(Parser.java:211)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.compilationUnit(GroovyRecognizer.java:780)
at org.codehaus.groovy.antlr.AntlrParserPlugin.transformCSTIntoAST(AntlrParserPlugin.java:130)
... 20 more
1 error

I fixed your question. You may want to take the time to understand how to format questions. Your error message was being hidden because it was inside HTML brackets, and it was also all on one line, making it hard to read.
As for the error, it is a compilation error (MultipleCompilationErrorsException). This means that the code itself is invalid.
Simply looking at your code, I see this:
.setPropertyValue(%s,"abc",line1)
                  ^^
The marked value is not valid Groovy code. I don't know what you were going for, but it looks like it was copied and pasted here from something else. You probably meant:
.setPropertyValue("abc", line1)
Fix this, and your code should at least get past this compilation error.

The following line should help:
testRunner.testCase.getTestStepByName("Properties Test Step Name").getProperty("Prop1").setValue("MyValue")
or, to set the property if it does not exist yet:
testRunner.testCase.testSteps["Properties Test Step Name"].setPropertyValue( "Prop1", "MyValue" )
I've tested this in my own code and it works fine.

Error reading excel (.xlsx) file using apache poi xssf eventmodel only

I am trying to read an Excel file that contains text rather than numeric data, using the code from the Apache site: http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api
I get the following error:
Processing new sheet:
A1 - Have a nice day
Exception in thread "main" java.lang.NumberFormatException: For input string: "Have a nice day"
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1111)
at ExcelExtract.processAllSheets(ExcelExtract.java:48)
at ExcelExtract.main(ExcelExtract.java:119)
Caused by: java.lang.NumberFormatException: For input string: "Have a nice day"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at ExcelExtract$SheetHandler.endElement(ExcelExtract.java:99)
at org.apache.xerces.parsers.SAXParser.endElement(SAXParser.java:1403)
at org.apache.xerces.validators.common.XMLValidator.callEndElement(XMLValidator.java:1550)
at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1204)
at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1098)
... 2 more
Also, is there any way to read an xlsx file using only the POI XSSF event model, without using xerces.jar? Please let me know if any other sample code is available.

That exception seems to be coming from your own code. ExcelExtract looks to be your program rather than a core bit of POI.
It looks like you're treating a cell that contains a string as if it contained a number. That won't work. You need to check the type of the cell and handle the contents appropriately. You can't just parse something to an int without first ensuring it is one!
It doesn't look to be a POI issue, though.
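A minimal sketch of that type check, under stated assumptions: in sheet XML a cell element `c` carries a `t` attribute, and `t="s"` means the `v` value is an index into the shared strings table, so only then is Integer.parseInt applied. This is not the asker's ExcelExtract code or the POI how-to sample. It uses only the JDK's built-in SAX parser on an inline XML string, and the names CellTypeSketch, readCells, and the sample data are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CellTypeSketch {

    /** Returns "ref - value" for each cell, resolving shared-string indexes only when t="s". */
    public static List<String> readCells(String sheetXml, List<String> sharedStrings) throws Exception {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String ref;       // cell reference, e.g. A1
            private String type;      // value of the t attribute, may be null
            private boolean inValue;  // true while inside a <v> element
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("c".equals(qName)) {
                    ref = attrs.getValue("r");
                    type = attrs.getValue("t");
                } else if ("v".equals(qName)) {
                    inValue = true;
                    text.setLength(0); // start each value with an empty buffer
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inValue) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("v".equals(qName)) {
                    inValue = false;
                    String raw = text.toString();
                    // Parse as an int only when the cell is marked as a shared string.
                    String value = "s".equals(type)
                            ? sharedStrings.get(Integer.parseInt(raw))
                            : raw;
                    out.add(ref + " - " + value);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(sheetXml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<worksheet><sheetData><row>"
                + "<c r=\"A1\" t=\"s\"><v>0</v></c>"
                + "<c r=\"B1\"><v>42</v></c>"
                + "</row></sheetData></worksheet>";
        readCells(xml, Arrays.asList("Have a nice day")).forEach(System.out::println);
    }
}
```

The text buffer is reset at the start of every `v` element, so content left over from an earlier element cannot end up being passed to Integer.parseInt.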
