NIFI - EvaluateXPath to use as attribute - values - text

I have an XML document from which I need to extract the values of nCT and serie.
<?xml version="1.0" encoding="UTF-8"?>
<cteProc xmlns="http://www.portalfiscal.inf.br/cte" versao="3.00">
<CTe>
<infCte Id="CTe41221100428307001240570010023982451023982450" versao="3.00">
<ide>
<serie>1</serie>
<nCT>2398245</nCT>
<dhEmi>2022-11-04T19:24:16-03:00</dhEmi>
I need to create attributes with the EvaluateXPath processor that hold the 'nCT' and 'serie' values as text.
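No answer is recorded for this question, so what follows is only a sketch of the likely stumbling block. Every element in the sample inherits the default namespace http://www.portalfiscal.inf.br/cte, so a bare path such as //nCT matches nothing. In EvaluateXPath, a namespace-agnostic expression such as //*[local-name()='nCT'] is one commonly used form (assuming Destination is set to flowfile-attribute and one dynamic property is added per value). The Python below illustrates the same namespace-aware lookup with the standard library. The closing tags are added only so the truncated sample parses, and extract_ide_fields is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Sample from the question; the closing tags are an assumption added so it parses.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<cteProc xmlns="http://www.portalfiscal.inf.br/cte" versao="3.00">
<CTe>
<infCte Id="CTe41221100428307001240570010023982451023982450" versao="3.00">
<ide>
<serie>1</serie>
<nCT>2398245</nCT>
<dhEmi>2022-11-04T19:24:16-03:00</dhEmi>
</ide>
</infCte>
</CTe>
</cteProc>"""

# Map a prefix of our choosing to the document's default namespace.
NS = {"cte": "http://www.portalfiscal.inf.br/cte"}


def extract_ide_fields(xml_bytes):
    """Return the text of nCT and serie from the first <ide> element."""
    root = ET.fromstring(xml_bytes)
    ide = root.find(".//cte:ide", NS)
    return {name: ide.findtext("cte:" + name, namespaces=NS)
            for name in ("nCT", "serie")}


print(extract_ide_fields(SAMPLE))
```

The key point is that the namespace has to be accounted for one way or another, either by binding a prefix (as here) or by matching on local-name() in the XPath itself.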

Related

How to create an XML file from an Excel file

Friends, I have not done this before and am seeking help from anyone who has. I have an Excel file with multiple rows and columns.
I want to export the data to XML as shown below:
Case Number  AER_No.    P_No.        PROD1    PROD2     EVENT1  EVENT2
004089652    202211-01  80000231204  TYLONEL  PREVNAR2  FEVER   RASH
Expected output:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ichicsr SYSTEM "http://eudravigilance.ema.europa.eu/dtd/icsr21xml.dtd">
<ichicsr lang="en">
<ichicsrmessageheader>
<CASENUMBER>004089652</CASENUMBER>
<AER_NO>202211-01</AER_NO>
<P_NO>80000231204</P_NO>
<PROD1>TYLONEL</PROD1>
<PROD2>PREVNAR2</PROD2>
<EVENT1>FEVER</EVENT1>
<EVENT2>RASH</EVENT2>
</ichicsrmessageheader>
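No answer is recorded for this question. As a minimal sketch of the row-to-element mapping, the Python below turns one row into the structure shown above. It assumes the row has already been read from the spreadsheet into plain lists (reading the Excel file itself is out of scope here), and that tag names are derived by upper-casing the column header and dropping everything except letters, digits and underscores. to_tag and row_to_ichicsr are hypothetical helper names, and the DOCTYPE line is not emitted.

```python
import re
import xml.etree.ElementTree as ET


def to_tag(header):
    """'Case Number' -> 'CASENUMBER', 'AER_No.' -> 'AER_NO'."""
    return re.sub(r"[^A-Za-z0-9_]", "", header).upper()


def row_to_ichicsr(headers, values):
    """Build <ichicsr><ichicsrmessageheader>...</...></ichicsr> for one row."""
    root = ET.Element("ichicsr", {"lang": "en"})
    header_el = ET.SubElement(root, "ichicsrmessageheader")
    for header, value in zip(headers, values):
        ET.SubElement(header_el, to_tag(header)).text = value
    return ET.tostring(root, encoding="unicode")


headers = ["Case Number", "AER_No.", "P_No.", "PROD1", "PROD2", "EVENT1", "EVENT2"]
values = ["004089652", "202211-01", "80000231204", "TYLONEL", "PREVNAR2", "FEVER", "RASH"]
xml_text = row_to_ichicsr(headers, values)
print(xml_text)
```

Values are kept as strings so that leading zeros, as in the case number, survive the export.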

To remove array string item from config file

How to remove a string array item from the config file?
<configuration>
<applicationSettings>
<Sample.Service.Properties.Settings>
<setting name="SampleAttribute" serializeAs="Xml">
<value>
<ArrayOfString xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<string>firstemail@domain.com</string>
<string>secondemail@domain.com</string>
</ArrayOfString>
</value>
I can access the first item (firstemail@domain.com) in the config file and replace its value with the following code.
My question is how to remove the second item (secondemail@domain.com) from the config file using similar code.
{
"configuration/applicationSettings/Sample.Service.Properties.Settings/setting[@name='SampleAttribute']/value/ArrayOfString/string[0]":"$(SampleAttribute)"
}
I am afraid the File Transform task cannot remove an array string item (I am not sure whether the Config Transform you mention is File Transform or Config Transformation).
To resolve this issue, you could use the Replace Tokens task to replace the first item and remove the second item.
The format of the variables in the config file is #{EmailOne}# and #{EmailTwo}#.
My test config file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<applicationSettings>
<Sample.Service.Properties.Settings>
<setting name="SampleAttribute" serializeAs="Xml">
<value>
<ArrayOfString xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<string>#{EmailOne}#</string>
#{EmailTwo}#
</ArrayOfString>
</value>
</setting>
</Sample.Service.Properties.Settings>
</applicationSettings>
</configuration>
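Then we just need to define the variables EmailOne and EmailTwo under Variables, giving EmailOne the replacement value and leaving EmailTwo empty. Because #{EmailTwo}# stands in place of the whole second <string> element, an empty value removes that item from the file.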
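To make the effect of the empty variable concrete, here is a small Python sketch of the substitution. It is not the Replace Tokens task's implementation, only an illustration of the #{Name}# pattern described above; replace_tokens is a hypothetical helper, and leaving unknown tokens untouched is an assumption of this sketch.

```python
import re

# Matches tokens of the form #{Name}#.
TOKEN = re.compile(r"#\{(\w+)\}#")


def replace_tokens(text, variables):
    """Substitute each #{Name}# with variables[Name]; unknown tokens stay as-is."""
    return TOKEN.sub(lambda m: variables.get(m.group(1), m.group(0)), text)


template = "<string>#{EmailOne}#</string>\n#{EmailTwo}#\n"
result = replace_tokens(
    template,
    {"EmailOne": "firstemail@domain.com", "EmailTwo": ""},
)
print(result)
```

With EmailTwo empty, the line that held the token becomes blank, so only the first <string> element remains.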
Alternatively, use an inline PowerShell script:
# Change the path to point at your config file.
$appConfigFile = "$(System.DefaultWorkingDirectory)\xxx\xxx\web.config"
$appConfig = New-Object XML
$appConfig.Load($appConfigFile)
# Visit each ArrayOfString node and remove its second child (the second <string> item).
foreach ($arrayNode in $appConfig.configuration.applicationSettings."Sample.Service.Properties.Settings".setting.value.ArrayOfString)
{
    # RemoveChild returns the removed node; discard it to keep the task log clean.
    $arrayNode.RemoveChild($arrayNode.FirstChild.NextSibling) | Out-Null
}
$appConfig.Save($appConfigFile)

Mapping excel to XML - Problem importing XML-fields

I seem to have a problem mapping XML parts to an existing Excel table.
I have a sample XML file provided by the Swedish tax authority as an XML schema:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Skatteverket xmlns="http://xmls.skatteverket.se/se/skatteverket/ai/instans/infoForBeskattning/4.0"
xmlns:gm="http://xmls.skatteverket.se/se/skatteverket/ai/gemensamt/infoForBeskattning/4.0"
xmlns:ku="http://xmls.skatteverket.se/se/skatteverket/ai/komponent/infoForBeskattning/4.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" omrade="Kontrolluppgifter"
xsi:schemaLocation="http://xmls.skatteverket.se/se/skatteverket/ai/instans/infoForBeskattning/4.0
http://xmls.skatteverket.se/se/skatteverket/ai/kontrolluppgift/instans/Kontrolluppgifter_4.0.xsd ">
<ku:Avsandare>
<ku:Programnamn>KUfilsprogrammet</ku:Programnamn>
<ku:Organisationsnummer>162234567895</ku:Organisationsnummer>
<ku:TekniskKontaktperson>
<ku:Namn>Bo Ek</ku:Namn>
<ku:Telefon>+46881234567</ku:Telefon>
<ku:Epostadress>bo.ek@elbolagetab.se</ku:Epostadress>
<ku:Utdelningsadress1>Strömgatan 11</ku:Utdelningsadress1>
<ku:Postnummer>62145</ku:Postnummer>
<ku:Postort>Strömby</ku:Postort>
</ku:TekniskKontaktperson>
<ku:Skapad>2015-06-07T21:32:52</ku:Skapad>
</ku:Avsandare>
<ku:Blankettgemensamt>
<ku:Uppgiftslamnare>
<ku:UppgiftslamnarePersOrgnr>165599990602</ku:UppgiftslamnarePersOrgnr>
<ku:Kontaktperson>
<ku:Namn>John Ström</ku:Namn>
<ku:Telefon>+46812345678</ku:Telefon>
<ku:Epostadress>siv.strom@elbolagetab.se</ku:Epostadress>
<ku:Sakomrade>Förnybar el</ku:Sakomrade>
</ku:Kontaktperson>
</ku:Uppgiftslamnare>
</ku:Blankettgemensamt>
<!-- Kontrolluppgift 1 -->
<ku:Blankett nummer="2350">
<ku:Arendeinformation>
<ku:Arendeagare>165599990602</ku:Arendeagare>
<ku:Period>2018</ku:Period>
</ku:Arendeinformation>
<ku:Blankettinnehall>
<ku:KU66>
<ku:UppgiftslamnareKU66>
<ku:UppgiftslamnarId faltkod="201">165599990602</ku:UppgiftslamnarId>
<ku:NamnUppgiftslamnare faltkod="202">Sonjas elhandel</ku:NamnUppgiftslamnare>
</ku:UppgiftslamnareKU66>
<ku:Inkomstar faltkod="203">2018</ku:Inkomstar>
<ku:KWhMatatsIn faltkod="270">3622</ku:KWhMatatsIn>
<ku:KWhTagitsUt faltkod="271">4822</ku:KWhTagitsUt>
<ku:AnlaggningsID faltkod="272">735999123456789012</ku:AnlaggningsID>
<ku:AndelIAnslPunkt faltkod="273">12.5</ku:AndelIAnslPunkt>
<ku:Specifikationsnummer faltkod="570">128</ku:Specifikationsnummer>
<ku:InkomsttagareKU66>
<ku:Inkomsttagare faltkod="215">193804139149</ku:Inkomsttagare>
</ku:InkomsttagareKU66>
</ku:KU66>
</ku:Blankettinnehall>
</ku:Blankett>
</Skatteverket>
When I add the file in Excel via Developer tab -> XML -> Source, I don't seem to get the XML parts inside the tag
<ku:Blankettinnehall>
Is there any reason why Excel would skip these XML parts?
Here is some sample Excel table data that I would like to map to those XML fields:
AnlaggningsID       Inkomsttagare  Inkomstar  KWhMatatsIn  KWhTagitsUt  AndelIAnslPunkt  Specifikationsnummer
526009875445385000  190101019999   2018       50078,0      88462,0                       1
258655985101244000  190201019999   2018       75,0         4615,0                        2
112855269388863000  190301019999   2018       16687,0      19870,0      42               3
364615095294089000  190401019999   2018       16687,0      19870,0      58               4
534980084130649000  190501019999   2018       174,0        7009,0                        5
It looks like you're missing the actual data itself. The top half of the file describes the sender and contact details; the data section (Blankettinnehall) comes later.
So in your Excel sheet I would expect rows with columns for each of the header/sender details. This may be what's missing.
You can see this if you take a sample file from them and view it in Excel.
I struggled with KU52 last year and ended up writing a C# application to generate the XML file.
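For anyone who also ends up generating the file in code, here is a rough Python sketch of building one ku:KU66 element from a table row. The namespace URI and the faltkod values are copied from the sample file above; everything else is an assumption. row_to_ku66 is a hypothetical helper, empty cells are simply skipped, values are written as they appear in the row, and the UppgiftslamnareKU66 block and the surrounding Blankett wrapper are left out, so the result is not a complete or schema-valid document.

```python
import xml.etree.ElementTree as ET

KU_NS = "http://xmls.skatteverket.se/se/skatteverket/ai/komponent/infoForBeskattning/4.0"
ET.register_namespace("ku", KU_NS)

# (element name, faltkod) pairs in the order used by the sample file.
FIELDS = [
    ("Inkomstar", "203"),
    ("KWhMatatsIn", "270"),
    ("KWhTagitsUt", "271"),
    ("AnlaggningsID", "272"),
    ("AndelIAnslPunkt", "273"),
    ("Specifikationsnummer", "570"),
]


def ku(tag):
    """Qualify a tag name with the ku namespace."""
    return "{%s}%s" % (KU_NS, tag)


def row_to_ku66(row):
    """Build a ku:KU66 element from a dict of column name -> cell text."""
    ku66 = ET.Element(ku("KU66"))
    for name, faltkod in FIELDS:
        value = row.get(name, "")
        if value == "":
            continue  # skip empty cells (assumption of this sketch)
        ET.SubElement(ku66, ku(name), {"faltkod": faltkod}).text = value
    wrapper = ET.SubElement(ku66, ku("InkomsttagareKU66"))
    ET.SubElement(wrapper, ku("Inkomsttagare"), {"faltkod": "215"}).text = row["Inkomsttagare"]
    return ET.tostring(ku66, encoding="unicode")


row = {
    "AnlaggningsID": "526009875445385000",
    "Inkomsttagare": "190101019999",
    "Inkomstar": "2018",
    "KWhMatatsIn": "50078,0",
    "KWhTagitsUt": "88462,0",
    "AndelIAnslPunkt": "",
    "Specifikationsnummer": "1",
}
ku66_xml = row_to_ku66(row)
print(ku66_xml)
```

Whether the decimal commas in the table need converting before submission depends on the schema and is not addressed here.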

How to extract CDATA without the GPath/node name

I'm trying to extract CDATA content from an XML document without using GPath or the node name. In short, I want to find and retrieve the inner text that is wrapped in a CDATA section.
My XML looks like this:
def xml = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<Test1>This node contains some innerText. Ignore This.</Test1>
<Test2><![CDATA[this is the CDATA section i want to retrieve]]></Test2>
</root>'''
From the above XML, I want to get the CDATA content alone, without referring to its node name 'Test2', because the node name is not always the same in my scenario.
Also note that the XML can contain inner text in a few other nodes (such as Test1). I don't want to retrieve that; I just need the CDATA content out of the whole XML.
I want something like the code below (which is incorrect as written):
def parsedXML = new xmlSlurper().parseText(xml)
def cdataContent = parsedXML.depthFirst().findAll { it.text().startsWith('<![CDATA')}
My output should be:
this is the CDATA section i want to retrieve
As @daggett says, you can't do this with the Groovy slurper or parser, but it's not too hard to drop down to the Java classes to get it.
Note that you have to set the property that makes CDATA visible, because by default it is treated as plain characters.
Here's the code:
import javax.xml.stream.*
def xml = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<Test1>This node contains some innerText. Ignore This.</Test1>
<Test2><![CDATA[this is the CDATA section i want to retrieve]]></Test2>
</root>'''
def factory = XMLInputFactory.newInstance()
factory.setProperty('http://java.sun.com/xml/stream/properties/report-cdata-event', true)
def reader = factory.createXMLStreamReader(new StringReader(xml))
while (reader.hasNext()) {
if (reader.eventType in [XMLStreamConstants.CDATA]) {
println reader.text
}
reader.next()
}
That will print this is the CDATA section i want to retrieve
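The same idea, asking the parser to report CDATA sections instead of matching on node names, can be sketched in Python for comparison. This assumes the standard library's xml.dom.minidom, which keeps CDATA sections as distinct nodes in the tree; cdata_sections is a hypothetical helper name.

```python
from xml.dom import Node, minidom

XML = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<Test1>This node contains some innerText. Ignore This.</Test1>
<Test2><![CDATA[this is the CDATA section i want to retrieve]]></Test2>
</root>'''


def cdata_sections(xml_text):
    """Return the text of every CDATA section, regardless of the parent tag."""
    doc = minidom.parseString(xml_text.encode("utf-8"))
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == Node.CDATA_SECTION_NODE:
                found.append(child.data)
            walk(child)

    walk(doc)
    return found


print(cdata_sections(XML))
```

Ordinary text such as the content of Test1 is a plain text node, so it is skipped without any reference to element names.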
If the XML contains only one CDATA section, split can help here:
def xml = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<Test1>This node contains some innerText. Ignore This.</Test1>
<Test2><![CDATA[this is the CDATA section i want to retrieve]]></Test2>
</root>'''
log.info xml.split("<!\\[CDATA\\[")[1].split("]]")[0]
The logic first splits the string on the CDATA start marker and keeps the part after it:
xml.split("<!\\[CDATA\\[")[1]
It then splits that part on the closing marker and keeps the part before it:
.split("]]")[0]
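The split approach only picks out the first CDATA section. If there may be several, a single regular expression can collect them all. The Python below is a minimal sketch of that variation; like split, it treats the document as plain text rather than parsing it, so it would also match a CDATA marker inside an XML comment. all_cdata is a hypothetical helper name.

```python
import re

# Non-greedy match between the CDATA start and end markers; DOTALL lets it span lines.
CDATA = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def all_cdata(xml_text):
    """Return the contents of every CDATA section found in the text."""
    return CDATA.findall(xml_text)


xml = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<Test1>This node contains some innerText. Ignore This.</Test1>
<Test2><![CDATA[this is the CDATA section i want to retrieve]]></Test2>
</root>'''

print(all_cdata(xml))
```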

Delete all chars before the xml Tag in a string - in groovy. soapui

How do I delete all the characters in a string up to a certain character? I have a log string that contains an XML request.
The string looks like this:
Mon Dec 19 09:50:50 EST 2016:INFO:
string = "test-testing ID:idm-zx-sawe.3CE65834D32AD741:370 <?xml version="1.0" encoding="UTF-8"?>"
string.replaceAll("([^,]*'<')", "").replaceAll("(?m)^\\s*ID.*","");
I need to remove all the characters before <?xml
and return the following string: "test-testing ID:idm-zx-sawe.3CE65834D32AD741:370
I'm trying with this regular expression:
/.*<\?/ - need this translated to groovy string.replaceAll(".*<\?","")
I would do it like this:
def string = 'test-testing ID:idm-zx-sawe.3CE65834D32AD741:370 <?xml version="1.0" encoding="UTF-8"?>'
// indexOf returns -1 when the marker is absent and 0 when it is at the very start
def start = string.indexOf('<?xml')
if (start >= 0) {
    string = string.substring(start)
}
string is:
<?xml version="1.0" encoding="UTF-8"?>
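For comparison, here is the same trimming sketched in Python, once with an index lookup and once with the kind of regular expression the question was reaching for. Both helpers are hypothetical names, and both leave the input unchanged when it contains no <?xml marker.

```python
import re


def strip_prefix_by_index(line):
    """Drop everything before the first '<?xml' using an index lookup."""
    start = line.find("<?xml")
    return line[start:] if start >= 0 else line


def strip_prefix_by_regex(line):
    """Drop everything before the first '<?xml' using a lazy match and a lookahead."""
    return re.sub(r"(?s)^.*?(?=<\?xml)", "", line, count=1)


log_line = 'test-testing ID:idm-zx-sawe.3CE65834D32AD741:370 <?xml version="1.0" encoding="UTF-8"?>'
print(strip_prefix_by_index(log_line))
print(strip_prefix_by_regex(log_line))
```

The lookahead keeps the <?xml marker itself in the result, which a pattern like .*<\? would consume.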
