XSD validation with XML reader, collecting validation errors. (C#)

I'm currently struggling to use an XmlSerializer to perform XSD validation and collect the validation errors in the files. The task is to validate a file against custom XSDs that contain value-set information, presence information, etc.
My problem is the following: when using the XmlReader it stops at the first error, if we attach a listener to the validation events of the reader (through XmlReaderSettings). So I simply catch the exception where I log the error. So far everything is fine; the problems start after logging the exception. Right after that the XmlReader moves to the end tag of the failed field, but I cannot validate the next field because of an exception I cannot explain.
To put it in practice, here's the code where I catch the exception:
private bool TryDeserialize(XmlSerializer ser, XmlReader read, out object item)
{
    string Itemname = read.Name;
    XmlReader read2 = read.ReadSubtree();
    try
    {
        item = ser.Deserialize(read2);
        return true;
    }
    catch (Exception e)
    {
        // InnerException can be null, so fall back to the outer exception.
        _ErrorList.Add("XSD error at " + Itemname + ": " + (e.InnerException ?? e).Message);
        item = null;
        return false;
    }
}
This routine works well, but what follows is problematic. Assume I pass the following XML snippet to this code:
<a>2885</a>
<b>ABC</b>
<c>5</c>
Assume that 'b' may not have 'ABC' as a value, so I get an XSD error. After this, the XmlReader will be at
'EndElement, Name=b'
from which I simply cannot move without getting an exception. If I call xmlreader.Read(), I get the following exception (namespace cut here):
"e = {"The element 'urn:iso:.....b' cannot contain child element 'urn:iso:.....:c' because the parent element's content model is text only."}"
After this the xmlreader is at 'Element, Name=c', so it seems good, but when trying to deserialize it with the code above, I get the following exception:
'_message = "The transition from the 'ValidateElement' method to the 'ValidateText' method is not allowed."'
I don't really see how to get past this. I tried without a second reader reading the subtree, but I have the same problem. Any suggestions would be appreciated; I'm really stuck. Thanks a lot in advance!
Greets

You may have to consider the following things:
In general, it is not always possible to "collect" all the errors, simply because validating parsers are free to abandon the validation process when certain types of errors occur, particularly those that put the validator in a state from which it can't reliably recover. For example, a validator may continue after running into a constraining facet violation for a simple type, but it will skip a whole section if it runs into unexpected content.
Unlike parsing into a DOM, where loading the DOM is not affected by a validating reader failing validation, deserializing into an object is (or at least should be) totally different: DOM is about being well formed; deserializing, i.e. strong typing, is about being valid.
Intuitively, I would ask: if you get a validation error, what is the point in continuing with the deserialization and further validation?
Try validating your XML independently of deserialization. If you indeed get more errors flagged with this approach, then the above should explain why. If not, then you're chasing something else.
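To illustrate validating independently of deserialization, here is a minimal sketch that only validates and collects every reported error instead of stopping at the first one. It is written with Java's standard javax.xml.validation API rather than the .NET classes from the question (in .NET the comparable hook is the ValidationEventHandler on XmlReaderSettings). The schema, the sample documents and the class name XsdErrorCollector are hypothetical and invented for the example: 'b' only allows the value XYZ and 'c' may be at most 3, so both fields of the bad document violate a simple-type facet.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class XsdErrorCollector {
    // Hypothetical schema: <b> must be "XYZ" and <c> must be at most 3.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:element name='a' type='xs:int'/>"
      + "<xs:element name='b'><xs:simpleType><xs:restriction base='xs:string'>"
      + "<xs:enumeration value='XYZ'/></xs:restriction></xs:simpleType></xs:element>"
      + "<xs:element name='c'><xs:simpleType><xs:restriction base='xs:int'>"
      + "<xs:maxInclusive value='3'/></xs:restriction></xs:simpleType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Hypothetical documents: the first violates the facets on <b> and <c>.
    static final String BAD_XML = "<root>\n<a>2885</a>\n<b>ABC</b>\n<c>5</c>\n</root>";
    static final String GOOD_XML = "<root>\n<a>2885</a>\n<b>XYZ</b>\n<c>3</c>\n</root>";

    // Validates xml against xsd and returns all recoverable errors that were reported.
    static List<String> validate(String xsd, String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }
                @Override public void error(SAXParseException e) {
                    // Recoverable error: record it and let validation continue.
                    errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    // Not recoverable (e.g. malformed XML): abort validation.
                    throw e;
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            errors.add("validation aborted: " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String error : validate(XSD, BAD_XML)) {
            System.out.println(error);
        }
    }
}
```

Because the error callback returns normally instead of throwing, the validator is free to keep going after a facet violation. Whether it reports anything further after structural errors such as unexpected elements is up to the validator, as described above.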

Related

Mongoose - Difference between Error.ValidationError, and Error.ValidatorError

The question pretty much says it all.
As far as I can tell from the docs, one is for general validation errors (a required field not being included, maxLength being exceeded, etc.).
And one is given as the reason within the other, whenever an error occurs within a custom validator... Is that correct?
If so, which is which? The naming convention used is really confusing here!

After a little more research and investigation, it seems that all validation-type errors (which include Error.CastError errors) always exist within an Error.ValidationError object. This is the parent object, and by itself it doesn't offer much information other than that validation has failed somewhere along the line.
Within this Error.ValidationError object there can be many other error objects. It's these children that will be instances of Error.ValidatorError (or Error.CastError).
A validation error will therefore look something like:
// When receiving some error, with the variable name "err".
err instanceof mongoose.Error.ValidationError
// true
console.log(err);
errors: {
    some_doc_field_name: {}, // Possibly an instanceof "Error.ValidatorError"
    another_doc_field_name: {} // Possibly an instanceof "Error.CastError"
}
TL;DR:
ValidationError = the parent type of the error.
ValidatorError = one of the child error types that a ValidationError can contain.

Nice Error Messages for "no viable alternative at input '<EOF>'" in ANTLR4

I want to show nicer error messages to my users.
For example, if someone types integer i= the error message no viable alternative at input '<EOF>' appears. That's totally fine and predictable given my grammar rules, but I'm looking for ways to improve those messages. If the = is missing in the example above, the message changes to mismatched input '<EOF>' expecting '='. Again predictable, but I can do more in my code with a specific case like this than with a general input error.
Should I catch them in the code and try to work out which case is meant? Or is there a better way to handle this?

Typically you'd create your own error listener and add it to the parser, where you can deal with the error yourself. To do that, first remove any existing error listeners (one for the console is registered automatically by default) by calling parser.removeErrorListeners();. Then define your own error listener class derived from BaseErrorListener and add an instance of it to your parser via parser.addErrorListener(yourListener);. You can see an example of such a custom error listener in the ANTLR runtime (search for XPathLexerErrorListener). Override the syntaxError method and use the provided info to generate your own error message. A message is already passed in to this method (in addition to other details such as line and char position, the exception, etc.), but you cannot customize it because it comes directly from generated code. So it is probably best to leave that alone and build your message from scratch (the passed-in exception is the best source you have).

How to detect if an element is "Stale" in Geb?

I'm trying to detect if a module has gone "stale" in Geb. That is, if using it will throw:
org.openqa.selenium.StaleElementReferenceException
The code below seems to work, but I feel like it's excessively hacky: I'm just calling an arbitrary method on the module (toString() seemed like a decent choice) and checking whether it throws the stale element exception.
static boolean isStale(Module module)
{
    boolean isStale = false
    try {
        module.toString() // arbitrary method call
    } catch (StaleElementReferenceException e) {
        isStale = true
    }
    return isStale
}
Is there a cleaner way to do this?

If you are trying to detect page changes, however arbitrary those pages are, then I would probably tackle it the other way around: detect the new content rather than the stale content. First, find something (an element or a state of an element) that you can use in your at checker to detect that the new page has "loaded". Then execute the page-changing action and wrap an at checker verification inside a waitFor {} call. This should be more reliable than your current approach, especially because Geb doesn't cache any content elements by default.

GCHandle, AppDomains managed code and 3rd party dll

I have been looking at many threads about the exception "cannot pass a GCHandle across AppDomains" but I still don't get it....
I'm working with an RFID reader which is driven by a DLL. I don't have the source code for this DLL, only a sample that shows how to use it.
The sample works great, but I have to copy some code into another project to add the reader to the middleware Microsoft Biztalk.
The problem is that the Microsoft Biztalk process works in another AppDomain. The reader handles events when a tag is read. But when I run it under Microsoft Biztalk I get this annoying exception.
I can't see any solution for making it work...
Here is some code that may be of interest:
// Let's connect the result handlers.
// The reader calls a command-specific result handler if a command is done and the answer is ready to send.
// So let's tell the reader which functions should be called if a result is ready to send.
// result handler for reading EPCs synchronously
Reader.KSRWSetResultHandlerSyncGetEPCs(ResultHandlerSyncGetEPCs);
[...]
var readerErrorCode = Reader.KSRWSyncGetEPCs();
if (readerErrorCode == tKSRWReaderErrorCode.KSRW_REC_NoError)
{
    // No error occurred while sending the command to the reader. Let's wait until the result handler has been called.
    if (ResultHandlerEvent.WaitOne(TimeSpan.FromSeconds(10)))
    {
        // The reader's work is done and the result handler was called. Let's check the result flag to make sure everything is ok.
        if (_readerResultFlag == tKSRWResultFlag.KSRW_RF_NoError)
        {
            // The command was successfully processed by the reader.
            // We'll display the result in the result handler.
        }
        else
        {
            // The command can't be processed by the reader. To know why, check the result flag.
            logger.error("Command \"KSRWSyncGetEPCs\" returns with error {0}", _readerResultFlag);
        }
    }
    else
    {
        // We got no answer from the reader within 10 seconds.
        logger.error("Command \"KSRWSyncGetEPCs\" timed out");
    }
}
[...]
private static void ResultHandlerSyncGetEPCs(object sender, tKSRWResultFlag resultFlag, tKSRWExtendedResultFlag extendedResultFlag, tKSRWEPCListEntry[] epcList)
{
    if (Reader == sender)
    {
        // Let's store the result flag in a global variable to get access from everywhere.
        _readerResultFlag = resultFlag;
        // Display all available EPCs in the antenna field.
        Console.ForegroundColor = ConsoleColor.White;
        foreach (var resultListEntry in epcList)
        {
            handleTagEvent(resultListEntry);
        }
        // Let's set the event so that the calling process knows the command was processed by the reader and the result is ready to be processed.
        ResultHandlerEvent.Set();
    }
}

You are having a problem with the gcroot<> helper class. It is used in the code that nobody can see, inside that DLL. It is frequently used by C++ code that was designed to interop with managed code; gcroot<> stores a reference to a managed object. The class uses the GCHandle type to add the reference. The GCHandle.ToIntPtr() method returns a pointer that the C++ code can store. The operation that fails is GCHandle.FromIntPtr(), used by the C++ code to recover the reference to the object.
There are two basic explanations for getting this exception:
1. It can be accurate. That will happen when you initialize the code in the DLL from one AppDomain and use it in another. It isn't clear from the snippet where the Reader class object gets initialized, so there are non-zero odds that this is the explanation. Be sure to keep the initialization close to the code that uses the Reader class.
2. It can be caused by another bug, present in the C++ code inside the DLL. Unmanaged code often suffers from pointer bugs, the kind of bug that can accidentally overwrite memory. If that happens with the field that stores the gcroot<> object, then nothing goes wrong for a while, until the code tries to recover the object reference again. At that point the CLR notices that the corrupted pointer value no longer matches an actual object handle and generates this exception. This is certainly the hard kind of bug to solve, since it happens in code you cannot fix, and showing the programmer who worked on it a repro for the bug is very difficult: such memory corruption problems never repro well.
Chase bullet #1 first. There are decent odds that Biztalk runs your C# code in a separate AppDomain. And that the DLL gets loaded too soon, before or while the AppDomain is created. Something you can see with SysInternals' ProcMon. Create a repro of this by writing a little test program that creates an AppDomain and runs the test code. If that reproduces the crash then you'll have a very good way to demonstrate the issue to the RFID vendor and some hope that they'll use it and work on a fix.
Having a good working relationship with the RFID reader vendor to get to a resolution is going to be very important. That's never not a problem, always a good reason to go shopping elsewhere.

testing for empty groovy closure?

I want to let users supply a groovy class with a property that is a file-selector closure which I pass on to AntBuilder's 'copy' task:
class Foo {
    def ANT = { fileset(dir:'/tmp/tmp1') }
}
In my code, I pick up the ANT property as 'fAnt' and pass it to Ant:
ant.copy(todir:'/tmp/tmp2', fAnt)
This works - but, if the user passes in an empty closure (def ANT={}) or with a selector that doesn't select anything (maybe the fileset dir doesn't exist) then it blows up. I tried surrounding the ant copy with a try-catch to catch the InvokerInvocationException, but somehow the exception comes through anyway ... while I'm tracking that down, is there a way to read back a groovy Closure's contents as a string, or to test if it's empty?

In short: No. You can't decompile a closure in a meaningful way at runtime. If it's user supplied, the Closure could even be a Java class.
Long answer: If you want to do a lot of work, you might be able to, but it's probably not worth it. The Groovy parser is part of the API, so if you have access to the source, you can theoretically examine the AST and determine if the closure is empty. Look into the SourceUnit class.
It's almost certainly not worth the effort though. You're better off catching the exception and adding a helpful message like "You may have passed an empty closure or invalid fileset".

One mystery solved - the exception I need to catch is org.apache.tools.ant.BuildException - so I can just catch that to trap errors, but the original question remains - is there a way to examine a Closure's contents?
