I'm writing a Groovy script that parses a very large XML file, appends some elements, and slightly changes each element as it is appended. Each element has an ID number, and every time an element is appended I want its ID to be the highest ID number in the file plus 1. Here is a small piece of the XML to illustrate:
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="xyxy" version="1.1">
<file original="zzz.js" source-language="en" target-language="en" datatype="javascript">
<body>
<trans-unit id="20" resname="foo">
<source>foofoo</source>
<target>foofoo</target>
</trans-unit>
<trans-unit id="21" resname="blah">
<source>blahblah</source>
<target>blahblah</target>
</trans-unit>
</body>
</file>
</xliff>
In this case, if I added an element (trans-unit) to the list, the ID would need to be 22. I have an algorithm that parses and appends, but I'm not sure how to increment the ID each time. Again, I'm using Groovy to do this. Does anyone have an idea? Thanks in advance!!
Assuming you have parsed that XML with XmlSlurper or XmlParser, you can get the next id with the help of max():
def xml = '''<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="xyxy" version="1.1">
<file original="zzz.js" source-language="en" target-language="en" datatype="javascript">
<body>
<trans-unit id="20" resname="foo">
<source>foofoo</source>
<target>foofoo</target>
</trans-unit>
<trans-unit id="21" resname="blah">
<source>blahblah</source>
<target>blahblah</target>
</trans-unit>
</body>
</file>
</xliff>'''
def x = new XmlSlurper().parseText(xml)
def next = 1 + x.file.body.'trans-unit'*.@id*.text().collect { it as Integer }.max()
assert next == 22
To use XmlParser, you need to change the line to:
def next = 1 + x.file.body.'trans-unit'*.@id.collect { it as Integer }.max()
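The answer above computes the next id once. When several elements are appended in one run, it is cheaper to find the maximum once and then keep a counter than to rescan the file for every append. The sketch below shows that idea in Python's standard library purely as an illustration; the helper name append_units and the appended resname/text values are made up for the example, and the same counter approach carries over to Groovy.

```python
import xml.etree.ElementTree as ET

NS = "xyxy"  # namespace URI copied from the sample's xmlns attribute
ET.register_namespace("", NS)  # keep the default namespace unprefixed when serializing

xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="xyxy" version="1.1">
<file original="zzz.js" source-language="en" target-language="en" datatype="javascript">
<body>
<trans-unit id="20" resname="foo">
<source>foofoo</source>
<target>foofoo</target>
</trans-unit>
<trans-unit id="21" resname="blah">
<source>blahblah</source>
<target>blahblah</target>
</trans-unit>
</body>
</file>
</xliff>"""


def append_units(body, new_units):
    """Append (resname, text) pairs, numbering them from the current max id + 1.

    Hypothetical helper: returns the next unused id after all appends.
    """
    tag = f"{{{NS}}}trans-unit"
    # Scan the existing ids once; default=0 covers a body with no trans-units.
    next_id = max((int(u.get("id")) for u in body.iter(tag)), default=0) + 1
    for resname, text in new_units:
        unit = ET.SubElement(body, tag, {"id": str(next_id), "resname": resname})
        ET.SubElement(unit, f"{{{NS}}}source").text = text
        ET.SubElement(unit, f"{{{NS}}}target").text = text
        next_id += 1  # increment the counter instead of rescanning the document
    return next_id


root = ET.fromstring(xml_text.encode("utf-8"))
body = root.find(f"{{{NS}}}file/{{{NS}}}body")
next_free = append_units(body, [("bar", "barbar"), ("baz", "bazbaz")])
print(next_free)  # 24: ids 22 and 23 were assigned to the two new units
```

The counter only stays correct as long as nothing else adds trans-unit elements to the same document in between.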
I'm trying to iterate through an xml file with groovy to get some values.
I found many people with the same problem, but the solution they used doesn't work for me, or it's too complicated.
I'm not a groovy dev, so I need a bullet proof solution which I can implement.
Basically I have an xml response file that looks like this: ( it looks bad but that's what I get)
<Body>
<head>
<Details>
<items>
<item>
<AttrName>City</AttrName>
<AttrValue>Rome</AttrValue>
</item>
<item>
<AttrName>Street</AttrName>
<AttrValue>Via_del_Corso</AttrValue>
</item>
<item>
<AttrName>Number</AttrName>
<AttrValue>34</AttrValue>
</item>
</items>
</Details>
</head>
</Body>
I've already tried this solution I found here on StackOverflow to print the values:
def envelope = new XmlSlurper().parseText("the xml above")
envelope.Body.head.Details.items.item.each(item -> println( "${tag.name}") item.children().each {tag -> println( " ${tag.name()}: ${tag.text()}")} }
the best I get is
ConsoleScript11$_run_closure1$_closure2@2bfec433
ConsoleScript11$_run_closure1$_closure2@70eb8de3
ConsoleScript11$_run_closure1$_closure2@7c0da10
Result: CityRomeStreetVia_del_CorsoNumber34
I can also remove everything after the first println, and anything inside it, and the result is the same.
My main goal here is not to print the values but to extract them from the XML and save them as string variables.
I know that using strings is not the best practice, but for now I just need to understand how this works.
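Your code as written has 2 flaws:
1. With envelope.Body you would NOT find anything, because the object returned by parseText already represents the root <Body> element, so the path has to start at head.
2. If you fix No. 1, you would run into multiple compile errors for each(item -> println( "${tag.name}"). Here ( is used instead of {, and the tag variable is undefined at that point.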
The working code would look like:
import groovy.xml.*
def xmlBody = new XmlSlurper().parseText '''\
<Body>
<head>
<Details>
<items>
<item>
<AttrName>City</AttrName>
<AttrValue>Rome</AttrValue>
</item>
<item>
<AttrName>Street</AttrName>
<AttrValue>Via_del_Corso</AttrValue>
</item>
<item>
<AttrName>Number</AttrName>
<AttrValue>34</AttrValue>
</item>
</items>
</Details>
</head>
</Body>'''
xmlBody.head.Details.items.item.children().each {tag ->
println( " ${tag.name()}: ${tag.text()}")
}
and prints:
AttrName: City
AttrValue: Rome
AttrName: Street
AttrValue: Via_del_Corso
AttrName: Number
AttrValue: 34
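The stated goal was to keep the values in string variables rather than print them. Since every item is an AttrName/AttrValue pair, one option is to build a name-to-value map first and read the variables from it. The sketch below shows that pairing step with Python's standard library, only to illustrate the idea; the variable names city, street and number are chosen for the example.

```python
import xml.etree.ElementTree as ET

xml_text = """<Body>
<head>
<Details>
<items>
<item>
<AttrName>City</AttrName>
<AttrValue>Rome</AttrValue>
</item>
<item>
<AttrName>Street</AttrName>
<AttrValue>Via_del_Corso</AttrValue>
</item>
<item>
<AttrName>Number</AttrName>
<AttrValue>34</AttrValue>
</item>
</items>
</Details>
</head>
</Body>"""

# The parsed root is the <Body> element itself, so paths start at head.
root = ET.fromstring(xml_text)

# Pair each item's AttrName with its AttrValue.
attrs = {
    item.findtext("AttrName"): item.findtext("AttrValue")
    for item in root.iterfind("head/Details/items/item")
}

city = attrs["City"]
street = attrs["Street"]
number = attrs["Number"]  # still a string; convert with int() if needed
print(city, street, number)  # Rome Via_del_Corso 34
```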
I am trying to find any nodes in an XML document whose tags start with a certain pattern.
<data>
<general>
<va value="400" /> <!--looking for this "v-tag"-->
<vb value="42" /> <!-- and this one-->
<y value="43" />
</general>
<special>
<va value="100" />
</special>
</data>
I cannot put together the xpath expression. Something like this
xyz = lxml.etree.parse( ... )
vees = xyz.xpath("general/[tag='v*']")
I would like vees to be
vees
Out[64]: [<Element va at 0x....>, <Element vb at 0x...>]
Try changing:
vees = xyz.xpath("general/[tag='v*']")
to
vees = xyz.xpath('//general//*[starts-with(name(),"v")]')
and see if it works.
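If lxml is not available, note that the standard library's xml.etree.ElementTree supports only a limited XPath subset without functions such as starts-with(), so the prefix test has to happen in Python. A minimal sketch under that assumption, using the sample data from the question (comments omitted):

```python
import xml.etree.ElementTree as ET

xml_text = """<data>
<general>
<va value="400" />
<vb value="42" />
<y value="43" />
</general>
<special>
<va value="100" />
</special>
</data>"""

root = ET.fromstring(xml_text)

# Walk every descendant of each <general> element and keep those whose
# tag starts with "v"; the <general> element itself is skipped.
vees = [
    el
    for general in root.iter("general")
    for el in general.iter()
    if el is not general and el.tag.startswith("v")
]
print([el.tag for el in vees])  # ['va', 'vb']
```

The va element under special is not matched, because only descendants of general are scanned.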
Using XSLT 3, I need to take the values of all content elements and move them into the title elements (if a title element already exists in a record, the content needs to be appended to it with a separator like -). I have now included my real data, since the solution below does not solve the problem when applied to something like:
example input:
<data>
<RECORD ID="31365">
<no>25099</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>021999</access>
<col>GS</col>
<call>889</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<title>1 title</title>
<content>1 content</content>
<sj>1956</sj>
</RECORD>
<RECORD ID="31366">
<no>25100</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>022004</access>
<col>GS</col>
<call>8764</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<sj>1956</sj>
<content>1 title</content>
</RECORD>
</data>
expected output:
<data>
<RECORD ID="31365">
<no>25099</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>021999</access>
<col>GS</col>
<call>889</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<title>1 title - 1 content</title>
<sj>1956</sj>
</RECORD>
<RECORD ID="31366">
<no>25100</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>022004</access>
<col>ΓΣ</col>
<call>8764</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<sj>1956</sj>
<title>1 title</title>
</RECORD>
</data>
With my attempt, I did not manage to move the elements; I just got an empty line where the content element had been, so please include the removal of blank lines in the suggested solution.
I believe the removal of blank lines could be handled with
<xsl:template match="text()"/>
One way to achieve this is the following stylesheet. It uses XSLT 3.0 text value templates (enabled by expand-text="true").
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="true">
<xsl:output method="xml" indent="yes" />
<xsl:mode on-no-match="shallow-copy" />
<xsl:strip-space elements="*" /> <!-- Remove space between elements -->
<xsl:template match="RECORD">
<xsl:copy>
<xsl:copy-of select="@*" />
<title>{title[1]}{if (title[1]) then ' - ' else ''}<xsl:value-of select="content" separator=" " /></title>
<xsl:apply-templates select="node() except (title,content)" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Its output is as desired.
If you want to separate the <content> elements with a -, too, you can simplify the core <title> expression to
<xsl:value-of select="title|content" separator=" - " />
EDIT:
All I changed was replacing chapter with RECORD, and it's working fine with Saxon-HE 9.9.1.4J. The only difference in the output is that the title element is always at the first position, but that shouldn't matter. I also added a directive to remove space between elements.
1) I want to read the XML file below and access its values. I have already tried many ways but was not able to access them. For example, I want the 'NightRaidPerformanceCPUScore' value and the passIndex it belongs to.
<?xml version='1.0' encoding='utf8'?>
<benchmark>
<results>
<result>
<name />
<description />
<passIndex>-1</passIndex>
<sourceId>C:\Users\dgadhipx\Documents\3DMark\3dmark-autosave-20200401155825.3dmark-result</sourceId>
<NightRaidPerformance3DMarkScore>2066</NightRaidPerformance3DMarkScore>
<NightRaidPerformanceCPUScore>1454</NightRaidPerformanceCPUScore>
<NightRaidPerformanceGraphicsScore>2233</NightRaidPerformanceGraphicsScore>
<benchmarkRunId>8045dec5-e97c-452b-abeb-54af187fd50a</benchmarkRunId>
</result>
<result>
<name />
<description />
<passIndex>0</passIndex>
<sourceId>C:\Users\dgadhipx\Documents\3DMark\3dmark-autosave-20200401155825.3dmark-result</sourceId>
<NightRaidPerformanceCPUScoreForPass>1454</NightRaidPerformanceCPUScoreForPass>
<NightRaidPerformance3DMarkScoreForPass>2066</NightRaidPerformance3DMarkScoreForPass>
<NightRaidPerformanceGraphicsScoreForPass>2233</NightRaidPerformanceGraphicsScoreForPass>
<NightRaidPerformanceGraphicsTest1>9.57</NightRaidPerformanceGraphicsTest1>
<NightRaidPerformanceGraphicsTest2>12.18</NightRaidPerformanceGraphicsTest2>
<NightRaidCpuP>395.2</NightRaidCpuP>
<benchmarkRunId>8045dec5-e97c-452b-abeb-54af187fd50a</benchmarkRunId>
</result>
</results>
</benchmark>
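You can use BeautifulSoup as follows:
from bs4 import BeautifulSoup  # the 'xml' parser also requires lxml

# file_path is the path to your XML file
with open(file_path, "r") as f:
    content = f.read()

xml = BeautifulSoup(content, 'xml')
# find_all returns every tag with this exact name
elements = xml.find_all("NightRaidPerformanceCPUScore")
for i in elements:
    print(i.text)
That will print the values of all "NightRaidPerformanceCPUScore" tags.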
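The question also asked which passIndex a score belongs to. Because passIndex is a sibling of the score inside the same result element, one way to answer that is to iterate over the result elements and read both children together. A minimal sketch with the standard library, using a trimmed copy of the sample data; the helper name scores_by_pass is made up for the example, and it assumes each passIndex occurs at most once.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample: only the fields needed for the lookup.
xml_text = """<benchmark>
<results>
<result>
<passIndex>-1</passIndex>
<NightRaidPerformanceCPUScore>1454</NightRaidPerformanceCPUScore>
</result>
<result>
<passIndex>0</passIndex>
<NightRaidPerformanceCPUScoreForPass>1454</NightRaidPerformanceCPUScoreForPass>
</result>
</results>
</benchmark>"""


def scores_by_pass(root, tag):
    """Map passIndex -> value of `tag` for every <result> that contains it."""
    found = {}
    for result in root.iter("result"):
        value = result.findtext(tag)  # None when this result lacks the tag
        if value is not None:
            found[int(result.findtext("passIndex"))] = int(value)
    return found


root = ET.fromstring(xml_text)
print(scores_by_pass(root, "NightRaidPerformanceCPUScore"))  # {-1: 1454}
```

For a file on disk, ET.parse(path).getroot() would replace the ET.fromstring call.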
<Shape ID="1" NameU="Start/End" Name="Start/End" Type="Shape" Master="2">
....</Shape>
<Shape ID="2" NameU="Start/End" Name="Start/End" Type="Shape" Master="5">
....</Shape>
I have to return the Master value for every ID value.
How can I achieve this using LINQ to XML?
You didn't really show what your XML document looks like, so I assumed it is as follows:
<Shapes>
<Shape ID="1" NameU="Start/End" Name="Start/End" Type="Shape" Master="2">
</Shape>
<Shape ID="2" NameU="Start/End" Name="Start/End" Type="Shape" Master="5">
</Shape>
</Shapes>
You can get the Master attribute value for every ID like this:
using System.Linq;
using System.Xml.Linq;

var xDoc = XDocument.Load("Input.xml");
var masters = xDoc.Root
    .Elements("Shape")
    .ToDictionary(
        x => (int)x.Attribute("ID"),     // key: the ID attribute
        x => (int)x.Attribute("Master")  // value: the Master attribute
    );
masters will be a Dictionary<int, int> where each key is an ID and each value is the corresponding Master attribute value.