I want to transform an XML document which I have parsed with XmlSlurper. The (identical) XML tag names should be replaced with the value of the id attribute; all other attributes should be dropped. Starting from this code:
def xml = """<tag id="root">
| <tag id="foo" other="blah" more="meh">
| <tag id="bar" other="huh"/>
| </tag>
|</tag>""".stripMargin()
def root = new XmlSlurper().parseText(xml)
// Some magic here.
println groovy.xml.XmlUtil.serialize(root)
I want to get the following:
<root>
<foo>
<bar/>
</foo>
</root>
(I write test assertions on the XML, and want to simplify the structure for them.) I've read Updating XML with XmlSlurper and searched around, but found no way with replaceNode() or replaceBody() to exchange a node while keeping its children.
Adding the 'magic' to the code in the question gives:
def xml = """<tag id="root">
| <tag id="foo" other="blah" more="meh">
| <tag id="bar" other="huh"/>
| </tag>
|</tag>""".stripMargin()
def root = new XmlSlurper().parseText(xml)
root.breadthFirst().each { n ->
n.replaceNode {
"${n.@id}"( n.children() )
}
}
println groovy.xml.XmlUtil.serialize(root)
Which prints:
<?xml version="1.0" encoding="UTF-8"?><root>
<foo>
<bar/>
</foo>
</root>
However, this drops any text content in the nodes. To keep the content, we probably need to use recursion and XmlParser to generate a new document from the existing one... I'll have a think.
More general solution
I think this is more generalised:
import groovy.xml.*
def xml = """<tag id="root">
| <tag id="foo" other="blah" more="meh">
| <tag id="bar" other="huh">
| something
| </tag>
| <tag id="bar" other="huh">
| something else
| </tag>
| <noid>woo</noid>
| </tag>
|</tag>""".stripMargin()
def root = new XmlParser().parseText( xml )
def munge( builder, node ) {
if( node instanceof Node && node.children() ) {
builder."${node.@id ?: node.name()}" {
node.children().each {
munge( builder, it )
}
}
}
else {
if( node instanceof Node ) {
// childless element: emit an empty tag via the builder
builder."${node.@id ?: node.name()}"()
}
else {
builder.mkp.yield node
}
}
}
def w = new StringWriter()
def builder = new MarkupBuilder( w )
munge( builder, root )
println XmlUtil.serialize( w.toString() )
And prints:
<?xml version="1.0" encoding="UTF-8"?><root>
<foo>
<bar>something</bar>
<bar>something else</bar>
<noid>woo</noid>
</foo>
</root>
Now passes through nodes with no (or empty) id attributes
Related
I want to retrieve the elements of <logs> as array of String and I am trying the following:
import groovy.util.XmlSlurper
def payload = '''<logs>
<log>
<text>LOG 1</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
<log>
<text>LOG 2</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
</logs>'''
def logs = new XmlSlurper().parseText(payload)
def result = []
logs.log.each{
result.add(it)
}
result
However, this only gives me the values; I would like to get each whole node as text, more or less like this:
[<log>
<text>LOG 1</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>,
<log>
<text>LOG 2</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>]
Is this at all possible with XmlSlurper or should I use some String operations?
You can use XmlUtil, but you have to remove the XML declaration:
import groovy.util.XmlSlurper
import groovy.xml.XmlUtil
def payload = '''<logs>
<log>
<text>LOG 1</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
<log>
<text>LOG 2</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
</logs>'''
def logs = new XmlSlurper().parseText(payload)
def result = logs.log.collect {
XmlUtil.serialize(it).replaceAll(/<.xml.*?>/,"")
}
println result
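As an alternative to stripping the declaration with a regex, here is a minimal sketch that serializes each node with StreamingMarkupBuilder, which does not write an XML declaration unless asked to. It assumes Groovy 3 or later for the groovy.xml.XmlSlurper import (on Groovy 2.x the class lives in groovy.util), and the payload is shortened for illustration. Whitespace between elements is not preserved, because XmlSlurper drops ignorable whitespace by default.

```groovy
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; use groovy.util.XmlSlurper on 2.x

def payload = '''<logs>
  <log><text>LOG 1</text><timestamp>2017-05-18T16:20:00.000</timestamp></log>
  <log><text>LOG 2</text><timestamp>2017-05-18T16:20:00.000</timestamp></log>
</logs>'''

def logs = new XmlSlurper().parseText(payload)

// Yield each <log> node into a fresh builder; the result is the node's markup as a String
def result = logs.log.collect { node ->
    new StreamingMarkupBuilder().bind { mkp.yield node }.toString()
}

result.each { println it }
```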
Try this:
def payload = '''<logs>
<log>
<text>LOG 1</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
<log>
<text>LOG 2</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
</logs>'''
def logs = new XmlSlurper().parseText(payload)
def result = []
logs.log.each{
result.add( "<log> <text>" + it?.'text'.text() + "</text> <timestamp> " + it?.'timestamp'.text() + "</timestamp> </log>")
}
return result
You can go with:
def payload = '''<logs>
<log>
<text>LOG 1</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
<log>
<text>LOG 2</text>
<timestamp>2017-05-18T16:20:00.000</timestamp>
</log>
</logs>'''
def logs = new XmlParser().parseText(payload)
def result = logs.log.collect {
def sw = new StringWriter()
def pw = new PrintWriter(sw)
new XmlNodePrinter(pw).print(it)
sw.toString().replaceAll('\\s', '')
}
I have the following groovy code:
def xml = '''<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
<foot>
<email>m@m.com</email>
<sig>hello world</sig>
</foot>
</note>'''
def records = new XmlSlurper().parseText(xml)
How do I get records to return a map that looks like the following:
["to":"Tove","from":"Jani","heading":"Reminder","body":"Don't forget me this weekend!","foot":["email":"m@m.com","sig":"hello world"]]
Thanks.
You can swing the recursion weapon. ;)
def xml = '''<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
<foot>
<email>m@m.com</email>
<sig>hello world</sig>
</foot>
</note>'''
def slurper = new XmlSlurper().parseText( xml )
def convertToMap(nodes) {
nodes.children().collectEntries {
[ it.name(), it.childNodes() ? convertToMap(it) : it.text() ]
}
}
assert convertToMap( slurper ) == [
'to':'Tove',
'from':'Jani',
'heading':'Reminder',
'body':"Don't forget me this weekend!",
'foot': ['email':'m@m.com', 'sig':'hello world']
]
I'm trying to add several chunks of XML code from one file into another. The problem is, some of these chunks have root tags that don't need to be copied into the destination XML file (that's the case if the root tags equal the pre-defined parent tags). Here's the code I'm currently using to insert the snippet (written in Groovy):
if (addCode.nodeName == parentTags) { //meaning the root tags shouldn't be included
for (org.w3c.dom.Node n : addCode.childNodes) {
//parent is a NodeList
parent.item(parent.length - 1).appendChild(document.importNode(n, true))
}
} else {
parent.item(parent.length - 1).appendChild(document.importNode(addCode, true))
}
And to parse the XML:
Document parseWithoutDTD(Reader r, boolean validating = false, boolean namespaceAware = true) {
FactorySupport.createDocumentBuilderFactory().with { f ->
f.namespaceAware = namespaceAware
f.validating = validating
f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
f.newDocumentBuilder().with { db ->
db.parse(new InputSource(r))
}
}
}
Here's an example XML file where the root tags shouldn't be included:
<catalogue> <!-- shouldn't be included -->
<message key='type_issuedate'>Date Issued</message>
<message key='type_accessioneddate'>Date Accesioned</message>
</catalogue>
You might have noticed the problem: if I leave out the root tags from the XML files to copy into the other XML file, they throw a parsing exception.
EDIT: here's a (shortened) example of the file to insert into:
<catalogue xml:lang="en" xmlns:i18n="http://apache.org/cocoon/i18n/2.1">
...
<message key="column4">Date</message>
<message key="column5">Summary</message>
<message key="column6">Actions</message>
<message key="restore">Restore</message>
<message key="update">Update</message>
<!-- INSERT XML HERE -->
...
</catalogue>
And an example of XML with root tags to be included (and the corresponding file to insert to):
XML to insert
<dependency>
<groupId>grID</groupId>
<artifactId>artID</artifactId>
<version>${version.number}</version>
</dependency>
XML file to insert into
<?xml version="1.0" encoding="UTF-8"?>
<project>
<dependencies>
<dependency>
<groupId>grID1</groupId>
<artifactId>artID1</artifactId>
<type>jar</type>
<classifier>classes</classifier>
</dependency>
<!-- INSERT XML HERE -->
</dependencies>
</project>
Currently, all of this code isn't working as I want it to work. Can someone help me out?
Much appreciated!
I think (if I understand right), you need something like this:
def insert( parent, data ) {
if( parent.name() == data.name() ) {
data.children().each {
parent.append it
}
}
else {
parent.append data
}
}
So, given
def newdoc = '''<dependency>
| <groupId>grID</groupId>
| <artifactId>artID</artifactId>
| <version>${version.number}</version>
|</dependency>'''.stripMargin()
def doc = '''<?xml version="1.0" encoding="UTF-8"?>
|<project>
| <dependencies>
| <dependency>
| <groupId>grID1</groupId>
| <artifactId>artID1</artifactId>
| <type>jar</type>
| <classifier>classes</classifier>
| </dependency>
| </dependencies>
|</project>'''.stripMargin()
def docnode = new XmlParser().parseText( doc )
def newnode = new XmlParser().parseText( newdoc )
// use head() as I want to add to the first dependencies node
insert( docnode.dependencies.head(), newnode )
println groovy.xml.XmlUtil.serialize( docnode )
You get the output:
<?xml version="1.0" encoding="UTF-8"?><project>
<dependencies>
<dependency>
<groupId>grID1</groupId>
<artifactId>artID1</artifactId>
<type>jar</type>
<classifier>classes</classifier>
</dependency>
<dependency>
<groupId>grID</groupId>
<artifactId>artID</artifactId>
<version>${version.number}</version>
</dependency>
</dependencies>
</project>
And given:
def newdoc = '''<catalogue>
| <message key='type_issuedate'>Date Issued</message>
| <message key='type_accessioneddate'>Date Accesioned</message>
|</catalogue>'''.stripMargin()
def doc = '''<catalogue xml:lang="en" xmlns:i18n="http://apache.org/cocoon/i18n/2.1">
| <message key="column4">Date</message>
| <message key="column5">Summary</message>
| <message key="column6">Actions</message>
| <message key="restore">Restore</message>
| <message key="update">Update</message>
|</catalogue>'''.stripMargin()
def docnode = new XmlParser().parseText( doc )
def newnode = new XmlParser().parseText( newdoc )
insert( docnode, newnode )
println groovy.xml.XmlUtil.serialize( docnode )
you get:
<?xml version="1.0" encoding="UTF-8"?><catalogue xml:lang="en" xmlns:xml="http://www.w3.org/XML/1998/namespace">
<message key="column4">Date</message>
<message key="column5">Summary</message>
<message key="column6">Actions</message>
<message key="restore">Restore</message>
<message key="update">Update</message>
<message key="type_issuedate">Date Issued</message>
<message key="type_accessioneddate">Date Accesioned</message>
</catalogue>
Edit
Ok, given the extra info, does this help? Given the same newdoc and doc strings as above, this script seems to do what you want...
import groovy.xml.*
import groovy.xml.dom.*
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
Document parseWithoutDTD(Reader r, boolean validating = false, boolean namespaceAware = true) {
FactorySupport.createDocumentBuilderFactory().with { f ->
f.namespaceAware = namespaceAware
f.validating = validating
f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
f.newDocumentBuilder().with { db ->
db.parse(new InputSource(r))
}
}
}
def addCode = parseWithoutDTD( new StringReader( newdoc ) ).documentElement
def document = parseWithoutDTD( new StringReader( doc ) )
def parent = document.documentElement
def parentTags = 'catalogue'
use( DOMCategory ) {
if( addCode.nodeName == parentTags ) {
addCode.childNodes.each { node ->
parent.appendChild( document.importNode( node, true ) )
}
}
else {
parent.appendChild( document.importNode( addCode, true ) )
}
}
May I know how to update the second element's attribute using LINQ to XML? I wrote some code, but it doesn't work: it only updates the User element's attributes. I'm sorry for asking such a simple question.
My XML:
<Settings>
<Setting>
<User id="1" username="Aplha"/>
<Location Nation="USA" State="Miami" />
<Email>user1@hotmail.com</Email>
</Setting>
</Settings>
My C# code:
public static void saveSetting(MainWindow main)
{
XDocument document = XDocument.Load("Setting.xml");
IEnumerable<XElement> query = from p in document.Descendants("User")
where p.Attribute("id").Value == "1"
select p;
foreach (XElement element in query)
{
string i = "New York";
element.SetAttributeValue("State", i);
}
document.Save("Setting.xml");
}
You want to select the Setting elements; you can still select on id=1, like this:
IEnumerable<XElement> query = from p in document.Descendants("Setting")
where p.Element("User").Attribute("id").Value == "1"
select p;
Then select the Location element before updating:
foreach (XElement element in query)
{
element.Element("Location").SetAttributeValue("State", "New York");
}
I've been trying to do some XML modifications with Groovy's XmlSlurper.
Basically, I'm going through the XML and looking for tags or attributes that have ? as the value and then replacing it with some value.
I've got it working for XML that doesn't have namespaces, but once I include them things get wonky. For example, this:
String foo = '''<xs:test xmlns:xs="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:foo="http://myschema/xmlschema" name='?'>
<foo:tag1>?</foo:tag1>
<foo:tag2>?</foo:tag2>
</xs:test>'''
produces:
<Envelope/>
Here's the groovy code I'm using. This does appear to work when I am not using a namespace:
public def populateRequest(xmlString, params) {
def slurper = new XmlSlurper().parseText(xmlString)
//replace all tags with ?
def tagsToReplace = slurper.depthFirst().findAll{ foundTag ->
foundTag.text() == "?"
}.each { foundTag ->
foundTag.text = {webServiceOperation.parameters[foundTag.name()]}
foundTag.replaceNode{
"${foundTag.name()}"(webServiceOperation.parameters[foundTag.name()])
}
}
//replace all attributes with ?
def attributesToReplace = slurper.list().each{
it.attributes().each{ attributes ->
if(attributes.value == '?')
{
attributes.value = webServiceOperation.parameters[attributes.key]
}
}
}
new StreamingMarkupBuilder().bind { mkp.yield slurper }.toString()
}
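From the Groovy documentation: declare the namespaces with declareNamespace() and then use the prefixes in GPath expressions: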
def wsdl = '''
<definitions name="AgencyManagementService"
xmlns:ns1="http://www.example.org/NS1"
xmlns:ns2="http://www.example.org/NS2">
<ns1:message name="SomeRequest">
<ns1:part name="parameters" element="SomeReq" />
</ns1:message>
<ns2:message name="SomeRequest">
<ns2:part name="parameters" element="SomeReq" />
</ns2:message>
</definitions>
'''
def xml = new XmlSlurper().parseText(wsdl).declareNamespace(ns1: 'http://www.example.org/NS1', ns2: 'http://www.example.org/NS2')
println xml.'ns1:message'.'ns1:part'.size()
println xml.'ns2:message'.'ns2:part'.size()
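Applied to the XML from the question, a minimal sketch might look like the following. It only shows how the elements can be addressed, not the full replacement logic. It assumes Groovy 3 or later for the groovy.xml.XmlSlurper import, and it relies on XmlSlurper being namespace-aware by default, so name() returns the local name without the prefix. That may be why rebuilding a node from foundTag.name() loses the namespace.

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; use groovy.util.XmlSlurper on 2.x

def xmlString = '''<xs:test xmlns:xs="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:foo="http://myschema/xmlschema" name="?">
  <foo:tag1>?</foo:tag1>
  <foo:tag2>?</foo:tag2>
</xs:test>'''

// Declare the prefix so it can be used in GPath expressions
def root = new XmlSlurper().parseText(xmlString)
        .declareNamespace(foo: 'http://myschema/xmlschema')

// name() yields the local name only; the prefix is not part of it
def rootName = root.name()
def childNames = root.children().collect { it.name() }

// Prefixed access works once the namespace has been declared
def tag1Text = root.'foo:tag1'.text()

println "root=${rootName}, children=${childNames}, tag1=${tag1Text}"
```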