xml-flow NPM package - Unexpected XML Parsing Behaviour - node.js

Background
I am using the xml-flow npm package to parse XML with streams. The issue is that the XML nodes are parsed in an unexpected way.
My intention is to parse a huge XML file by a repeating XML node. The XML file can come from any URL, and the name of the repeating node will be provided from the UI.
I tried every possible value for the options, but the parsing behaviour doesn't seem to change.
Sample Code
I used the following sample XML:
<list>
<item>
<details>
<id>1</id>
</details>
</item>
<item>
<details>
<id>2</id>
<description>description for item 2</description>
</details>
</item>
</list>
I tried to parse it using item as the repeating node, as follows:
const fs = require("fs");
const flow = require("xml-flow");
const xmlStream = flow(fs.createReadStream("./sample.xml"));
xmlStream.on('tag:item', function (person) {
console.log(JSON.stringify(person, null, 4));
});
I got the following output for the 2 parsed XML nodes:
// node 1
{
"$name": "item",
"details": "1"
}
// node 2
{
"$name": "item",
"details": {
"id": "2",
"description": "description for item 2"
}
}
Problem
As you can see in the response, I get a different JSON structure for parsed XML nodes.
In the first XML node, the <id> node does not appear in the JSON object (unlike in the second XML node) because its parent node, <details>, has only one child node, <id>.
This causes problems in my application because the parsed XML might have thousands of records, and the relative path in the JSON structure to the leaf nodes changes because of this behaviour.
For example, if there are 10000 records in the XML file and all records after the 5000th have the node 2 structure, the relative path item.details will point to a string for records 1 to 5000, whereas the same path will point to an object for the remaining records.
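The inconsistency described above can be detected before it reaches application code. The following is a minimal sketch and not part of xml-flow: the helper names shapeOf and findMixedKeys are hypothetical, and the two records are copied from the output shown earlier. It reports every top-level key whose value shape differs between records.

```javascript
// Records copied from the xml-flow output shown in the question.
const records = [
  { $name: "item", details: "1" },
  { $name: "item", details: { id: "2", description: "description for item 2" } },
];

// Hypothetical helper: classify a parsed value as "array", "object", or its typeof.
function shapeOf(value) {
  if (Array.isArray(value)) return "array";
  if (value !== null && typeof value === "object") return "object";
  return typeof value;
}

// Hypothetical helper: list top-level keys whose shape is not the same in every record.
function findMixedKeys(items) {
  const shapesByKey = new Map();
  for (const item of items) {
    for (const [key, value] of Object.entries(item)) {
      if (!shapesByKey.has(key)) shapesByKey.set(key, new Set());
      shapesByKey.get(key).add(shapeOf(value));
    }
  }
  return [...shapesByKey]
    .filter(([, shapes]) => shapes.size > 1)
    .map(([key, shapes]) => ({ key, shapes: [...shapes].sort() }));
}

console.log(findMixedKeys(records));
```

For the two records above, only details is reported, with both the object and string shapes. The check covers top-level keys only; nested paths would need a recursive variant.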
Alternative NPM Package
I did try xml-stream, which works on the same logic, but it comes with the problem of collecting sub-items, explained here, which is an even more complicated problem for me because the incoming XML structure will vary from file to file.
Let me know if I need to provide more information.
Cheers!

Well! After going through the implementation of these packages, it seems there is no workaround for this problem (I might have missed something) unless explicit support is provided.
I finally decided to write new logic and ended up writing a new npm package, xtreamer, which provides XML nodes instead of converting them into JSON objects.
This package exposes a transform stream that can be piped with any readable stream. It expects the XML node name in the request and emits a custom event, xmldata, to output the XML node.
The output can then be plugged into any XML-to-JSON npm package as required to get the final JSON. Check the npm package for further details.
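To illustrate the general idea of emitting raw XML nodes rather than JSON, here is a minimal sketch. It is not xtreamer's implementation or API: extractNodes and escapeRegExp are hypothetical names, and the approach assumes the repeating node is never nested inside itself, is not self-closing, and does not appear inside comments or CDATA. The function takes the text buffered so far and returns the complete nodes plus the remainder to prepend to the next chunk.

```javascript
// Hypothetical helper: escape characters that are special in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: pull complete <tagName>...</tagName> nodes out of buffered text.
function extractNodes(buffer, tagName) {
  const open = new RegExp(`<${escapeRegExp(tagName)}(\\s[^>]*)?>`);
  const close = `</${tagName}>`;
  const nodes = [];
  let rest = buffer;
  for (;;) {
    const match = open.exec(rest);
    if (!match) {
      // No opening tag left: keep only a possibly incomplete tag at the end.
      const lastLt = rest.lastIndexOf("<");
      rest = lastLt === -1 ? "" : rest.slice(lastLt);
      break;
    }
    const end = rest.indexOf(close, match.index);
    if (end === -1) {
      // The node is not complete yet: keep it for the next chunk.
      rest = rest.slice(match.index);
      break;
    }
    nodes.push(rest.slice(match.index, end + close.length));
    rest = rest.slice(end + close.length);
  }
  return { nodes, rest };
}

// Simulate two chunks that split the second <item> in the middle.
const first = extractNodes("<list><item><id>1</id></item><item><id>2</i", "item");
const second = extractNodes(first.rest + "d></item></list>", "item");
console.log(first.nodes, second.nodes);
```

In a real stream, the same function could be called from a Transform's transform callback, carrying rest over between chunks. Each emitted string can then be handed to whichever XML-to-JSON converter is preferred.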
Supporting module
I also created one more npm package, xtagger, which uses the sax npm package and provides the XML structure in the following format:
structure: { [name: string]: { [hierarchy: number]: number } };
This package can be used to find the repeating nodes in xml file by considering their hierarchy.

Related

How to parse and edit XML in TypeScript without converting to JSON

I need to repair an XML file in TypeScript and I cannot find any info on it, since everyone who posts something like this has different needs. I'd like to be pointed in the right direction here.
I have an XML request as shown below. It is autogenerated by node-soap when given JSON. Using the WSDL, node-soap attempts to fill in the namespace prefixes for each property. The problem is, it gets them wrong a lot.
In this example below, q106 should be replaced with hep3.
<soap:Envelope q15="some-good-url" q106="some-good-url-1" q98="some-good-url-2">
...
<q98:SalesOrderAuditInfo>
<q15:ConfirmedBy xsi:nil="true"></q15:ConfirmedBy>
<q15:ConfirmedDate>0001-01-01T00:00:00</q15:ConfirmedDate>
<q15:CreatedBy>
<q106:ID>103</q106:ID>
<q106:Value>System, System</q106:Value>
</q15:CreatedBy>
<q15:CreatedDate>2022-10-26T00:43:13.413</q15:CreatedDate>
<q15:SalesOrderType>Standard</q15:SalesOrderType>
</q98:SalesOrderAuditInfo>
I know which namespace prefixes are bad because I have a sample request that was supplied to me. It's just XML. It looks like this:
<soap:Envelope hep="some-good-url" hep1="some-good-url-1" hep2="some-good-url-2" hep3="some-good-url-3">
...
<hep2:SalesOrderAuditInfo>
<hep:ConfirmedBy xsi:nil="true"></hep:ConfirmedBy>
<hep:ConfirmedDate>0</hep:ConfirmedDate>
<hep:CreatedBy>
<hep3:ID>103</hep3:ID>
<hep3:Value>System, System</hep3:Value>
</hep:CreatedBy>
<hep:CreatedDate>0</hep:CreatedDate>
<hep:SalesOrderType>Standard</hep:SalesOrderType>
</hep2:SalesOrderAuditInfo>
Here is the part that really matters. The Sample Request is the entire possible request body. The supplied request (with incorrect namespaces) is a subset of the sample request. I need to loop through each of the elements in the supplied request, and check to make sure the URL of that element matches the URL of the respective element in the sample request.
So in this example, loop through each element of the supplied request. Start with SalesOrderAuditInfo. Its namespace URL is some-good-url-2. If we check the sample request, we can see that SalesOrderAuditInfo also corresponds to some-good-url-2.
Continue until we hit the ID tag. Its namespace is set to some-good-url-1. If we check the same ID (inside CreatedBy, inside SalesOrderAuditInfo) in the sample request, we can see that the namespace should actually be some-good-url-3. So we replace q106 with hep3.
I also need to take all of the namespaces defined in the sample request envelope and move them into the supplied request envelope so that this new hep3 will be defined.
At this point, I need to edit the namespace prefix. In this example, q106:ID would be replaced by the string hep3:ID. The same applies to all of the closing tags.
Which library can I use to accomplish this in XML? Is anyone familiar with node-soap getting these namespaces wrong, and does anyone know of a fix?
I am using node-soap v0.43
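The prefix replacement step described above (q106:ID becoming hep3:ID in both opening and closing tags) can be sketched with plain string handling. This is only an illustration under stated assumptions, not a node-soap feature: renamePrefix and escapeRegExp are hypothetical helpers, the function renames every element that uses the old prefix rather than only those at a specific path, and it does not touch prefixed attributes or namespace declarations. Comparing each element against the sample request would still require a namespace-aware XML parser.

```javascript
// Hypothetical helper: escape characters that are special in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: rename an element prefix in opening and closing tags.
function renamePrefix(xml, fromPrefix, toPrefix) {
  const pattern = new RegExp(`(</?)${escapeRegExp(fromPrefix)}:`, "g");
  // A replacer function avoids any special handling of "$" in the new prefix.
  return xml.replace(pattern, (match, lead) => `${lead}${toPrefix}:`);
}

const supplied = "<q15:CreatedBy><q106:ID>103</q106:ID><q106:Value>System, System</q106:Value></q15:CreatedBy>";
console.log(renamePrefix(supplied, "q106", "hep3"));
```

Because the rename is global, it only fits the case where a prefix is wrong everywhere it is used, as with q106 in the example.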

Node.js XML builder Error: Invalid character in name: during creation of XML document

I am trying to create an XML file using Node.js and the npm package xmlbuilder. Some of the tag names I am creating contain special characters such as : and /, which causes the following error:
Error: Invalid character in name: http://google.com.http://google.com
How can I resolve this issue? I could replace these characters with blanks, but I don't want to do that; I want my XML to retain these special characters.
var builder = require('xmlbuilder');
var root = builder.create('test:document');
var ObjectEvent = root.ele('ObjectEvent');
for (var ex = 0; ex < Extension.length; ex++) {
    // Attempt 1: the element name is NameSpace + LocalName
    ObjectEvent.ele(Extension[ex].NameSpace + Extension[ex].LocalName, Extension[ex].FreeText).up();
    // Attempt 2: the element name is NameSpace + '.' + LocalName (the name shown in the error above)
    // ObjectEvent.ele(Extension[ex].NameSpace + '.' + Extension[ex].LocalName, Extension[ex].FreeText).up();
}
My Extension elements look something like this:
[
{
NameSpace: 'http://google.com',
LocalName: 'http://google.com',
ExtensionVlaues: 0,
FreeText: 'Google Website',
'$$hashKey': 'object:290'
}
]
I wanted to know how can I retain all the special characters in my XML document
XML namespaces can be URIs, but XML element names cannot: / is not allowed in XML elements names.
I wanted to know how can I retain all the special characters in my XML document
Realize that your error is not about special characters in your XML document; it's about special characters in the names of your XML elements.
Regarding XML elements, you simply must abide by the rules specified in the standard regarding the allowed characters in XML element names. Otherwise, your data is not XML, and you and your callers will not be able to use XML tools and libraries with it.
See also How to include ? and / in XML tag
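To make the rule concrete, a candidate element name can be checked before it is passed to the builder. The sketch below is a deliberate simplification: isValidXmlName is a hypothetical helper, and its pattern covers only the ASCII subset of the XML 1.0 Name production (the full production also allows many non-ASCII characters). A namespace URI such as the one in the question belongs in an xmlns declaration, not in the element name.

```javascript
// Hypothetical helper: ASCII-only approximation of the XML 1.0 Name production.
// First character: letter, "_" or ":". Later characters may also be digits, "." or "-".
function isValidXmlName(name) {
  return /^[A-Za-z_:][A-Za-z0-9_:.-]*$/.test(name);
}

console.log(isValidXmlName("test:document"));
console.log(isValidXmlName("http://google.com.http://google.com"));
```

The first name passes and the second fails, because / is not a permitted name character.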

NodeJS Amazon AWS Submit Feed Generic Error

I am trying to submit a product feed to AWS and I keep getting a generic error related to the XML I am sending.
I have gone through all the .xsd files and come up with what I believe to be correct XML, but apparently it is not :(
The Error
{
"MessageID": "1",
"ResultCode": "Error",
"ResultMessageCode": "25",
"ResultDescription": "We are unable to process the XML feed because one or more items are invalid. Please re-submit the feed."
}
How I am creating the content
const getContent = (upc) => `<?xml version="1.0" encoding="iso-8859-1"?>
<AmazonEnvelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="amzn-envelope.xsd">
<Header>
<DocumentVersion>1.01</DocumentVersion>
<MerchantIdentifier>${process.env.MERCHANT_ID}</MerchantIdentifier>
</Header>
<MessageType>Product</MessageType>
<Message>
<MessageID>1</MessageID>
<OperationType>Update</OperationType>
<Product>
<SKU>${upc}</SKU>
</Product>
</Message>
</AmazonEnvelope>`
It turns out this just means that a value in the XML is malformed. It helped a lot to search the .xsd files for the specific element I was trying to set; the schema lists the restrictions on that value (for example, AmazonOrderId has a regex validation that restricts the value to the structure 123-1234567-1234567). After manually running through this process for each value, I was finally able to submit a feed. Now I need an associated OrderAcknowledgement, which I am working through.
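The kind of restriction mentioned above can also be checked locally before a feed is submitted. The following sketch assumes a pattern inferred from the 123-1234567-1234567 example (three digits, seven digits, seven digits); the authoritative pattern is whatever the .xsd file specifies, and looksLikeAmazonOrderId is a hypothetical helper name.

```javascript
// Assumed pattern, inferred from the example structure 123-1234567-1234567.
const ORDER_ID_PATTERN = /^\d{3}-\d{7}-\d{7}$/;

// Hypothetical helper: check a value against the assumed pattern before building the feed.
function looksLikeAmazonOrderId(value) {
  return ORDER_ID_PATTERN.test(value);
}

console.log(looksLikeAmazonOrderId("123-1234567-1234567"));
console.log(looksLikeAmazonOrderId("1234567"));
```

The same approach applies to any other element whose schema entry carries a pattern or length restriction.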

How to achieve security level 3 in FIWARE?

I am deploying the FIWARE security GEs (i.e., Wilma, AuthzForce, Keyrock) on my computer. Security level 2 (Basic Authorization) is working well, but now I need security level 3 (Advanced Authorization) using XACML.
Long story short, I want a tutorial on implementing security level 3. However, as far as I know, no tutorial or documentation about security level 3 exists.
For now, I create my policy with the PAP's API and change the 'custom_policy' option in config.js from 'undefined' to 'policy.js'. I then created a 'policy.js' file in 'PEP/policies', but did not change anything compared with its template file because I don't know exactly what this code does. I think I should build the XACML Request using the 'xml' variable. But in my case, the PEP gives me an error when I build the XACML Request using the 'xml' variable and return it. Here is the PEP error:
Error: Root - Error in AZF communication <?xml version="1.0" encoding="UTF-8" standalone="yes"?><error xmlns="http://authzforce.github.io/rest-api-model/xmlns/authz/S" xmlns:ns2="http://www.w3.org/2005/Atom" xmlns:ns3="http://authzforce.github.io/core/xmlns/pdp/5.0" xmlns:ns4="http://authzforce.github.io/pap-dao-flat-file/xmlns/properties/3.6"><message>Invalid parameters: cvc-elt.1: Cannot find the declaration of element 'Request'.</message></error>
And here is my 'getPolicy' code (XACML Request) in policy.js. I just made a very simple request to see whether the response is Permit or not, because I wasn't sure what I was doing at the time:
exports.getPolicy = function (roles, req, app_id) {
var xml = xmlBuilder.create('Request', {
'xmlns': 'urn:oasis:names:tc:xacml:3.0:core:schema:wd-17',
'CombinedDecision': 'false',
'ReturnPolicyIdList': 'false'})
.ele('Attributes', {
'Category': 'urn:oasis:names:tc:xacml:1.0:subject-category:access-subject'});
So, can anyone give me any information about implementing security level 3?
Upgrade to Wilma 6.2 (bug fixing).
Reuse the code from lib/azf.js which is known to work, and adapt the Request content to your needs. The variable is wrongly called XACMLPolicy there, but don't be mistaken, this is an actual XACML Request. This is using xml2json package to convert the JSON to XML, whereas in your code you seem to use a different one, xmlbuilder maybe? You didn't paste the full code - where does this xmlBuilder variable come from? - so I'm just guessing.
If you are indeed using xmlbuilder package and want to stick with it, I notice that in the example using namespaces, the xmlns attribute is put in a different way:
var xmlBuilder = require('xmlbuilder');
var xml = xmlBuilder.create('Request', { encoding: 'utf-8' })
    .att('xmlns', 'urn:oasis:names:tc:xacml:3.0:core:schema:wd-17')
    .att('CombinedDecision', 'false')
    .att('ReturnPolicyIdList', 'false')
    .ele('Attributes', {'Category': 'urn:oasis:names:tc:xacml:1.0:subject-category:access-subject'});
Maybe this makes a difference, I didn't check.
Also feel free to create an issue with your question on Wilma's github to get help from the dev team. (I am not one of them but we've worked together for AuthzForce integration.)
The error you are getting is really
Invalid parameters: cvc-elt.1: Cannot find the declaration of element
'Request'.
This is a simple XML validation issue. You need to make sure that the XACML request you send contains the right namespace declaration.
You'll see there is another question on this topic here.
Can you paste your XACML request so we can tell whether it is valid?
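A quick local check along these lines can catch a missing namespace declaration before the request reaches AuthzForce. This is a sketch only: declaresXacmlNamespace is a hypothetical helper, it inspects the serialized string with a regular expression rather than validating against the schema, and it only recognizes an unprefixed Request root with a default xmlns declaration in double quotes.

```javascript
// Namespace URI taken from the code in the question.
const XACML_NS = "urn:oasis:names:tc:xacml:3.0:core:schema:wd-17";

// Hypothetical helper: does the root <Request> declare the XACML 3.0 default namespace?
function declaresXacmlNamespace(xml) {
  // Skip an optional XML declaration, then capture the attributes of the root start tag.
  const root = /^\s*(?:<\?xml[^>]*\?>\s*)?<Request(\s[^>]*)?>/.exec(xml);
  if (!root) return false;
  return (root[1] || "").includes(`xmlns="${XACML_NS}"`);
}

const good = '<?xml version="1.0" encoding="UTF-8"?><Request xmlns="' + XACML_NS +
  '" CombinedDecision="false" ReturnPolicyIdList="false"></Request>';
const bad = '<Request CombinedDecision="false" ReturnPolicyIdList="false"></Request>';
console.log(declaresXacmlNamespace(good), declaresXacmlNamespace(bad));
```

Logging the serialized request and running it through a check like this makes it easier to see whether the builder actually emitted the xmlns attribute on the root element.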

Inserting artificially generated XML into SOAPUI request

I am trying to do the following in SOAPUI:
1. Read a response and extract a node from it
2. Insert the node into another request
3. Generate some xml in a Groovy script and store in a TestCase property
4. Insert the generated xml from the property as a child node to the node inserted in Step 2.
For 1 and 2: The structure of the response is something like
<A><B>bb</B><C>cc</C><D>dd</D></A>
I extract it via a Property Transfer step, using //A to identify the beginning of the node and passing the node with its children to the request in the next test step. Before the transfer, the node in the request has no content. This works.
For 3 I generate something like
<E>ee</E>
The goal after step 4 is a request structure looking like this:
<A><E>ee</E><B>bb</B><C>cc</C><D>dd</D></A>
A solution using
${#TestCase#new_xml}
to insert the node does not work because there is no way to place the property where the E node should be (as far as I know).
I tried inserting the E node via another Property Transfer test step - the value of the property gets inserted in the request as child to the A node (same way the A node was copied from the response to the next request in Step 2). The result is this:
<A><![CDATA[<E>ee</E>]]><B>bb</B><C>cc</C><D>dd</D></A>
I would like to know:
How to insert the E node as a child node to the A node while avoiding CDATA (or removing the CDATA subsequently).
Why the xml is passed without CDATA in Step 2 which also uses the SOAPUI Property Transfer Step, but not in Step 4.
Any tips appreciated!
For 1 & 2, you can use just a simple property expansion.
Let's say your Response looks like:
<AAA>
<BBB/>
<CCC/>
<BBB/>
<BBB/>
<DDD>
<BBB/>
</DDD>
<CCC/>
</AAA>
And let's say you want to transfer the entire node DDD, including its children. In your next request you would use ${<TestStep_name>#Response//*:DDD}. Note that *: means "any namespace", since in a real SOAP Response you will probably have some kind of namespace.
For 3:
// Generate some xml in a Groovy script
def xml = '<AAA><BBB/><CCC/><BBB/><BBB/><DDD><BBB/></DDD><CCC/></AAA>'
// store in a TestCase property
testRunner.testCase.setPropertyValue('my_property', xml)
If you want to get more fancy, you could use one of the many Java XML libraries, some of which are packaged with SoapUI. Here is one possibility.
For 4, you would again use property expansion: ${#TestCase#my_property}

Resources