How to get the required values from the below mentioned xml file? - python-3.x

1) i want to read below mentioned XML file and access the values, i already tried in many ways but not able to access, for example i want 'NightRaidPerformanceCPUScore' value and that is from which passIndex.
<?xml version='1.0' encoding='utf8'?>
<benchmark>
<results>
<result>
<name />
<description />
<passIndex>-1</passIndex>
<sourceId>C:\Users\dgadhipx\Documents\3DMark\3dmark-autosave-20200401155825.3dmark-result</sourceId>
<NightRaidPerformance3DMarkScore>2066</NightRaidPerformance3DMarkScore>
<NightRaidPerformanceCPUScore>1454</NightRaidPerformanceCPUScore>
<NightRaidPerformanceGraphicsScore>2233</NightRaidPerformanceGraphicsScore>
<benchmarkRunId>8045dec5-e97c-452b-abeb-54af187fd50a</benchmarkRunId>
</result>
<result>
<name />
<description />
<passIndex>0</passIndex>
<sourceId>C:\Users\dgadhipx\Documents\3DMark\3dmark-autosave-20200401155825.3dmark-result</sourceId>
<NightRaidPerformanceCPUScoreForPass>1454</NightRaidPerformanceCPUScoreForPass>
<NightRaidPerformance3DMarkScoreForPass>2066</NightRaidPerformance3DMarkScoreForPass>
<NightRaidPerformanceGraphicsScoreForPass>2233</NightRaidPerformanceGraphicsScoreForPass>
<NightRaidPerformanceGraphicsTest1>9.57</NightRaidPerformanceGraphicsTest1>
<NightRaidPerformanceGraphicsTest2>12.18</NightRaidPerformanceGraphicsTest2>
<NightRaidCpuP>395.2</NightRaidCpuP>
<benchmarkRunId>8045dec5-e97c-452b-abeb-54af187fd50a</benchmarkRunId>
</result>
</results>
</benchmark>

You can use BeautifulSoup as fellow:
with open(file_path, "r") as f:
content = f.read()
xml = BeautifulSoup(content, 'xml')
elements = xml.find_all("NightRaidPerformanceCPUScore")
for i in elements:
print(i.text)
That will print you the values of all "NightRaidPerformanceCPUScore" tags.

Related

XSLT Can't Read an Excel XML File?

I'm using XSLT / XPath to browse some of the XML files you get when you unzip an Excel file. I found a "relationships" file workbook.xml.rels that I don't seem to be able to read, using code similar to that which successfully read the workbook.xml file.
Here's some of the workbook.xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
...
<sheets>
<sheet name="Sheet1"
sheetId="2"
r:id="rId1"/>
<sheet name="Test Sheet"
sheetId="1"
r:id="rId2"/>
</sheets>
...
</workbook>
Here's the workbook.xml.rels file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId3"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme"
Target="theme/theme1.xml"/>
<Relationship Id="rId2"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet"
Target="worksheets/sheet2.xml"/>
<Relationship Id="rId1"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet"
Target="worksheets/sheet1.xml"/>
<Relationship Id="rId5"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings"
Target="sharedStrings.xml"/>
<Relationship Id="rId4"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"
Target="styles.xml"/>
</Relationships>
Here's some of the XSLT:
<?xml version="1.0"?>
<!-- greeting.xsl -->
<xsl:stylesheet
...
<xsl:output method="text"/>
<xsl:variable name="baseDir" select="replace(document-uri(.), '(.*[\\/]xl).*', '$1/')"/>
<xsl:variable name="workbook" select="concat($baseDir, 'workbook.xml')"/>
<xsl:variable name="theSheetId" select="doc($workbook)/workbook/sheets/sheet[matches(#name, 'Test Sheet')]/#r:id"/>
<xsl:variable name="workbook_rels" select="concat($baseDir, '_rels/workbook.xml.rels')"/>
<!-- code to read workbook.xml.rels -->
<xsl:variable name="theSheet" select="doc($workbook_rels)/Relationships/Relationship[matches(#Id, $theSheetId)]/#Target"/>
<xsl:template match="/">
<xsl:text>
baseDir = </xsl:text><xsl:value-of select="$baseDir"/>
<xsl:text>
workbook = </xsl:text><xsl:value-of select="$workbook"/>
<xsl:text>
workbook_rels = </xsl:text><xsl:value-of select="$workbook_rels"/>
<xsl:text>
theSheetId = </xsl:text><xsl:value-of select="$theSheetId"/>
<xsl:text>
theSheet = </xsl:text><xsl:value-of select="$theSheet"/>
<xsl:text>
end</xsl:text>
</xsl:template>
</xsl:stylesheet>
And the output:
baseDir = file:/C:/Training/sandbox/conv_/xl/
workbook = file:/C:/Training/sandbox/conv_/xl/workbook.xml
workbook_rels = file:/C:/Training/sandbox/conv_/xl/_rels/workbook.xml.rels
theSheetId = rId2
theSheet = **<I get nothing here>**
end
You can see that 'theSheetID' variable is correctly set when reading workbook.xml. But when I use that variable to get the corresponding Target value into 'theSheet' variable from workbook.xml.rels, I get nothing. I tried replacing the matches expression with just a number but I still get nothing. Is there a problem from reading this type of file?
Suggestions? Thanks!
The use of matches and replace suggests you are using an XSLT 2 or 3 processor and that way XSLT 2 or 3 where you can certainly declare xpath-default-namespace, you just have to understand you have to change that in the sections that deal with elements from a different namespace e.g. <xsl:variable name="theSheet" select="doc($workbook_rels)/Relationships/Relationship[matches(#Id, $theSheetId)]/#Target" xpath-default-namespace="http://schemas.openxmlformats.org/package/2006/relationships"/>.
Given the samples I would rather use a key <xsl:key name="rel" match="Relationships/Relationship" use="#Id" xpath-default-namespace="http://schemas.openxmlformats.org/package/2006/relationships"/> and then use <xsl:variable name="theSheet" select="key('rel,$theSheetId, doc($workbook_rels))/#Target"/> but the use of xpath-default-namespace to declare the relevant namespace when selecting elements from a particular document is probably what is missing in your XSLT.

find xml-nodes by tag with wildcard

I am trying to find any nodes in a xml whos tags start with a certain pattern.
<data>
<general>
<va value="400" /> <!--looking for this "v-tag"-->
<vb value="42" /> <!-- and this one-->
<y value="43" />
</general>
<special>
<va value="100" />
</special>
</data>
I cannot put together the xpath expression. Something like this
xyz = lxml.etree.parse( ... )
vees = xyz.xpath("general/[tag='v*']")
I would like to have vees beeing
vees
Out[64]: [<Element va at 0x....>, <Element vb at 0x...>]
Try changing:
vees = xyz.xpath("general/[tag='v*']")
to
doc.xpath('//general//*[starts-with(name(),"v")]')
and see if it works.

How to replace xml element from looping list on python?

I have a xml tree and I would like to replace the sub elements in the xml structure.
This is actual xml tree read from the file
xml_data = ET.parse('file1.xml')
<?xml version='1.0' encoding='UTF-8'?>
<call method="xxxx" callerName="xxx">
<credentials login="" password=""/>
<filters>
<accounts>
<account code="" ass="" can=""/>
</accounts>
</filters>
</call>
I'm expecting this format from looping the list
a = [1,23453, 3543,4354,3455, 6345]
<?xml version='1.0' encoding='UTF-8'?>
<call method="xxxx" callerName="xxx">
<credentials login="" password=""/>
<filters>
<accounts>
<account code="1" ass="34" can="yes"/>
<account code="23453" ass="34" can="yes"/>
<account code="3543" ass="34" can="yes"/>
<account code="4354" ass="34" can="yes"/>
<account code="3455" ass="34" can="yes"/>
<account code="6345" ass="34" can="yes"/>
</accounts>
</filters>
</call>
New to xml-parsing. Any help would be appreciated. Thanks in advance
I'm new in python, but it's work. I find only 1 bug: len of a must be equal len . It would be great if i help you. Good luck.
from xml.etree import ElementTree as ET
a = [1, 23453, 3543, 4354, 3455, 6345]
code = 0
xmlfile = "./log/logs.xml"
tree = ET.parse(xmlfile)
root = tree.getroot()
for filters in root.findall("filters"):
for accounts in filters.findall("accounts"):
for account in accounts.findall("account"):
attributes = account.attrib
attributes["code"] = str(a[code])
attributes["ass"] = "34"
attributes["can"] = "yes"
code += 1
tree.write(xmlfile)

How to update all sub-sub tag of XML with different values using BeautifulSoup

lets say this is my XML
<sky class="new">
<list name="school">
<p>63</p>
<p>62</p>
<p>61</p>
</list>
</sky>
And this is my values in list.
value = [51,56,87]
Now I what I need is:
<sky class="new">
<list name="school">
<p>51</p>
<p>56</p>
<p>87</p>
</list>
</sky>
So far this is what I did:
for i in soup.find_all('sky', {'class':'new'}):
k = i.find('list',{'name':'school'})
After this I am not getting what to do, could you help here?
EDIT1:
<sky class="new">
<list name="alpha">
<item>
<p unit="kg">63</p>
<p weight="wg">54</p>
</item>
<item>
<p unit="kg">57</p>
<p weight="wg">32</p>
</item>
</list>
</sky>
Another version:
from bs4 import BeautifulSoup
txt = '''<sky class="new">
<list name="school">
<p>63</p>
<p>62</p>
<p>61</p>
</list>
</sky>'''
soup = BeautifulSoup(txt, 'xml')
values = [51, 56, 87]
for p, new_value in zip(soup.select('sky.new > list[name="school"] > p'), values):
p.string = str(new_value)
print(soup)
Prints:
<?xml version="1.0" encoding="utf-8"?>
<sky class="new">
<list name="school">
<p>51</p>
<p>56</p>
<p>87</p>
</list>
</sky>
Try something like this:
targets = soup.select('p')
for target in targets:
repl = str(value[targets.index(target)])
target.string.replace_with(repl)
soup
Output:
<html><body><sky class="new">
<list name="school">
<p>51</p>
<p>56</p>
<p>87</p>
</list>
</sky></body></html>

move an element to another element or create a new one if it does not exist using xslt-3

using xslt 3, i need to take all content elements' values, and move them to the title elements (if the title elements already exist in a record, they need to be appended with a separator like -) i now have inputted my real data, since the below solution does not solve the problem when implemented to something like:
example input:
<data>
<RECORD ID="31365">
<no>25099</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>021999</access>
<col>GS</col>
<call>889</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<title>1 title</title>
<content>1 content</content>
<sj>1956</sj>
</RECORD>
<RECORD ID="31366">
<no>25100</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>022004</access>
<col>GS</col>
<call>8764</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<sj>1956</sj>
<content>1 title</content>
</RECORD>
</data>
expected output:
<data>
<RECORD ID="31365">
<no>25099</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>021999</access>
<col>GS</col>
<call>889</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<title>1 title - 1 content</title>
<sj>1956</sj>
</RECORD>
<RECORD ID="31366">
<no>25100</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>022004</access>
<col>ΓΣ</col>
<call>8764</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<sj>1956</sj>
<title>1 title</title>
</RECORD>
<data>
with my attempt, i did not manage to move the elements, i just got an empty line where the content element existed, so please add the removal of blank lines in the suggested solution.
i believe the removal of blank lines could be fixed with the use of
<xsl:template match="text()"/>
One way to achieve this is the following template. It uses XSLT-3.0 content value templates.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="true">
<xsl:output method="xml" indent="yes" />
<xsl:mode on-no-match="shallow-copy" />
<xsl:strip-space elements="*" /> <!-- Remove space between elements -->
<xsl:template match="RECORD">
<xsl:copy>
<xsl:copy-of select="#*" />
<title>{title[1]}{if (title[1]) then ' - ' else ''}<xsl:value-of select="content" separator=" " /></title>
<xsl:apply-templates select="node() except (title,content)" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
It's output is as desired.
If you want to separate the <content> elements with a -, too, you can simplify the core <title> expression to
<xsl:value-of select="title|content" separator=" - " />
EDIT:
All I changed was replacing chapter with RECORD, and it's working fine with Saxon-HE 9.9.1.4J. The only difference in the output is that the title element is always at the first position, but that shouldn't matter. I also added a directive to remove space between elements.

Resources