How to access text in XML containing namespace using python ElementTree

I have a simple XML with namespaces. I am unable to access the text inside the namespace. The XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Created by IRIS Business Services Limited -->
<link:linkbase xmlns:xsi="http://www.ffff.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.uhtj.org/2006/ref http://www.frsfs.org/2006/ref-2006-02-27.xsd http://www.ghi.org/in-ghi-rep-par ../core/in-ghi-rep-par.xsd http://www.rl.org/2003/linkbase http://www.rl.org/2003/rl-linkbase-2003-12-31.xsd" xmlns:in-ghi-rep-par="http://www.ghi.org/in-ghi-rep-par" xmlns:link="http://www.rl.org/2003/linkbase" xmlns:ref="http://www.rl.org/2006/ref" xmlns:rli="http://www.rl.org/2003/instance" xmlns:xlink="http://www.ffff.org/1999/xlink">
<link:referenceLink xlink:type="extended" xlink:role="http://www.rl.org/2003/role/link">
<link:loc xlink:type="locator" xlink:href="../core/in-ghi-rep.xsd#in-ghi-rep_ReportingPeriodTable" xlink:label="in-ghi-rep_ReportingPeriodTable"/>
<link:reference xlink:type="resource" xlink:label="res_1" xlink:role="http://www.rl.org/2003/role/disclosureRef" id="res_1">
<in-ghi-rep-par:Circular>DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000</in-ghi-rep-par:Circular>
</link:reference>
</link:referenceLink>
</link:linkbase>
All I want to do is retrieve "DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000" which is the Circular value.
My current code looks like this. I have explored ElementTree but still not able to get the solution.
from lxml import etree
tree = etree.parse("s2.xml")
root = tree.getroot()
root.nsmap
for Circular in root.findall('{http://www.ghi.org/in-ghi-rep-par}'):
    print(Circular.text)
I am new to parsing XML. Please help.

Your expression for findall is not correct. findall searches with the expression you give it, and a plain tag expression only looks at the direct children of the element you call it on. The root element has no direct children in this namespace, so it correctly returns an empty list. Your expression could work if you ran it on the parent element of the Circular tag. Besides the namespace, though, you also need to pass a tag name: either a wildcard to get every tag in that namespace, or Circular if that is the tag you are interested in.
print(root[0][1].findall('{http://www.ghi.org/in-ghi-rep-par}*'))
print(root[0][1].findall('{http://www.ghi.org/in-ghi-rep-par}Circular'))
If you don't know where the tag might be in the XML, you can search from the root and use .// to tell your XPath expression to look recursively through all elements below this one. Again, you need to give either a wildcard for the tag name or the actual tag name.
print(root.findall('.//{http://www.ghi.org/in-ghi-rep-par}*'))
print(root.findall('.//{http://www.ghi.org/in-ghi-rep-par}Circular'))
For example:
print(root.findall('.//{http://www.ghi.org/in-ghi-rep-par}Circular')[0].text)
OUTPUT
DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000
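Since the question title mentions ElementTree, here is a minimal sketch of the same recursive lookup using only the standard library's xml.etree.ElementTree and a prefix-to-URI mapping instead of the {uri}tag form. The XML is a trimmed copy of the document from the question, and the prefix name par is an arbitrary alias chosen for this example; it does not have to match the prefix used in the file.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (only the parts needed here).
XML = """<link:linkbase xmlns:link="http://www.rl.org/2003/linkbase"
    xmlns:in-ghi-rep-par="http://www.ghi.org/in-ghi-rep-par">
  <link:referenceLink>
    <link:reference id="res_1">
      <in-ghi-rep-par:Circular>DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000</in-ghi-rep-par:Circular>
    </link:reference>
  </link:referenceLink>
</link:linkbase>"""

root = ET.fromstring(XML)

# "par" is a local alias for the namespace URI; any prefix name works.
ns = {"par": "http://www.ghi.org/in-ghi-rep-par"}

# ".//" searches all descendants of root, not just its direct children.
circulars = [el.text for el in root.findall(".//par:Circular", ns)]
print(circulars)
```

The mapping keeps the long URI in one place, which helps when several tags from the same namespace have to be looked up.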

Related

Reading and writing a text value using selenium and pandas when the html element has no definite id

I am working on creating a program that would read a list of aircraft registrations from an excel file and return the aircraft type codes.
My source of information is FlightRadar24. (example - https://www.flightradar24.com/data/aircraft/n502dn)
I tried inspecting the elements on the page to find the correct class id to use and found it listed as "details". When I run my code, it extracts the aircraft name with the class id/name details instead of the type code.
See here for the example data
I then changed my approach to using XPath to find the correct text, but it gives no output.
(For the XPath, I used a browser add-on to find the exact path for the element, so I am fairly confident that it is correct.)
What would you suggest in this particular instance for extracting values without a definite id?
for i in list_regs:
    driver.get('https://www.flightradar24.com/data/aircraft/'+i)
    driver.implicitly_wait(3)
    load = 0
    while load==0:
        try:
            element = driver.find_element_by_xpath("/html/body/div[5]/div/section/section[2]/div[1]/div[1]/div[2]/div[2]/span")
            print('element') #Printing to terminal to see if the right value is returned.

You should probably change your xpath expression to:
//label[.="TYPE CODE"]/following-sibling::span[@class="details"]
and
print('element')
to
print(element)
Edit:
This works for me:
element = driver.find_element_by_xpath('//label[.="TYPE CODE"]/following-sibling::span[@class="details"]')
print(element.text)
Output:
A359
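The find_element_by_xpath helper used above was removed in later Selenium 4 releases, so the sketch below shows the same lookup with find-by-locator and an explicit wait instead of the implicit wait and retry loop. The function names type_code_xpath and read_type_code are made up for this example, and the code assumes the page still has a TYPE CODE label followed by a span with class details.

```python
def type_code_xpath(label="TYPE CODE"):
    """Build an XPath for the <span class="details"> that follows a given <label>."""
    return f'//label[.="{label}"]/following-sibling::span[@class="details"]'


def read_type_code(driver, registration, timeout=10):
    """Open the aircraft page and return the text of the type code element.

    `driver` is an already created Selenium WebDriver (assumption of this sketch).
    """
    # Imported here so the XPath helper above can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get("https://www.flightradar24.com/data/aircraft/" + registration)
    # Wait until the element exists instead of polling in a while loop.
    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.XPATH, type_code_xpath()))
    )
    return element.text
```

Anchoring the XPath on the label text rather than on an absolute /html/body/... path keeps it working when the page layout shifts.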

Generate random numbers and update in XML file using Robot Framework

I have two XML files in which I manually change values before proceeding with further evaluation. I would like to know how to update the values in the XML files using Robot Framework.
I've used the faker library to generate random numbers, but I don't know how to write them into the XML. The first XML file looks something like this:
<dns:ManageRequest>
<SPResource>
<ID>ORD452257337191</ID>
<interactionDate>2016-09-20T02:35:30Z</interactionDate>
<orderType>Connect</orderType>
<SPResourceComprisedOf>
<DescribedBy>
<value>CLI0000000000191</value>
<Characteristic>
<ID>clientID</ID>
</Characteristic>
</DescribedBy>
<DescribedBy>
<value>TOW566105009191</value>
<Characteristic>
<ID>ticketOfWorkId</ID>
</Characteristic>
</DescribedBy>
</SPResourceComprisedOf>
</SPResource>
</dns:ManageRequest>
and the second xml file looks like this:
<dns:ManageOrder>
<FieldWork>
<ID>WOR140618136785</ID>
<Priority>
<priorityValues>45</priorityValues>
</Priority>
<baseRevisionNumber>-1</baseRevisionNumber>
<FieldWorkSpecifiedBy>
<ID>Activation</ID>
<version>1.0.5</version>
<type>WorkOrder Specification</type>
</FieldWorkSpecifiedBy>
<FieldWorkOverview>
<DescribedBy>
<value>WRQ140618136785</value>
<Characteristic>
<ID>Work Request ID</ID>
<type>Overview</type>
</Characteristic>
</DescribedBy>
<DescribedBy>
<value>ORD452257337191</value>
<Characteristic>
<ID>Reference ID</ID>
<type>Overview</type>
</Characteristic>
</DescribedBy>
</FieldWorkOverview>
</FieldWork>
</dns:ManageOrder>
In the first XML file the values of ORD, CLI and TOW need to be changed, and in the second XML file WOR and WRQ need to be changed, but the value of ORD in the second file needs to be the same as the value of ORD in the first file.
I really appreciate any help, because I am really lost in this now :( Thanks!

You can use Robot Framework's XML library, which can optionally use lxml as its parser.
Link: https://pypi.org/project/lxml/
This example changes the text of the ID element from ORD452257337191 to 123456.
Code:
${file}=    Get File    ${path_to_file}    encoding=UTF-8
${xml_file}=    Parse XML    ${file}
Set Element Text    ${xml_file}    123456    xpath=SPResource/ID
Save XML    ${xml_file}    ${path_to_file}    encoding=UTF-8
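The same update can also be sketched in plain Python, for instance as the basis of a custom keyword library. This is an illustration under assumptions, not a drop-in solution: it uses the standard random module rather than faker, the helper names random_id and set_described_value are made up, the XML is shortened from the question, and the namespace URI urn:example:dns is a placeholder because the fragments in the question do not show the real dns declaration.

```python
import random
import xml.etree.ElementTree as ET

# Shortened versions of the two files; "urn:example:dns" is a placeholder URI.
REQUEST = """<dns:ManageRequest xmlns:dns="urn:example:dns">
  <SPResource>
    <ID>ORD452257337191</ID>
    <SPResourceComprisedOf>
      <DescribedBy><value>CLI0000000000191</value><Characteristic><ID>clientID</ID></Characteristic></DescribedBy>
      <DescribedBy><value>TOW566105009191</value><Characteristic><ID>ticketOfWorkId</ID></Characteristic></DescribedBy>
    </SPResourceComprisedOf>
  </SPResource>
</dns:ManageRequest>"""

ORDER = """<dns:ManageOrder xmlns:dns="urn:example:dns">
  <FieldWork>
    <ID>WOR140618136785</ID>
    <FieldWorkOverview>
      <DescribedBy><value>WRQ140618136785</value><Characteristic><ID>Work Request ID</ID></Characteristic></DescribedBy>
      <DescribedBy><value>ORD452257337191</value><Characteristic><ID>Reference ID</ID></Characteristic></DescribedBy>
    </FieldWorkOverview>
  </FieldWork>
</dns:ManageOrder>"""


def random_id(prefix, digits=12):
    """Return the prefix followed by a run of random decimal digits."""
    return prefix + "".join(random.choice("0123456789") for _ in range(digits))


def set_described_value(root, characteristic_id, new_value):
    """Set <value> in the <DescribedBy> block whose Characteristic/ID matches."""
    for described in root.iter("DescribedBy"):
        if described.findtext("Characteristic/ID") == characteristic_id:
            described.find("value").text = new_value
            return
    raise KeyError(characteristic_id)


request = ET.fromstring(REQUEST)
order = ET.fromstring(ORDER)

# Generate the ORD value once so both documents share it.
ord_id = random_id("ORD")
request.find("SPResource/ID").text = ord_id
set_described_value(request, "clientID", random_id("CLI", 13))
set_described_value(request, "ticketOfWorkId", random_id("TOW"))

order.find("FieldWork/ID").text = random_id("WOR")
set_described_value(order, "Work Request ID", random_id("WRQ"))
set_described_value(order, "Reference ID", ord_id)

print(ET.tostring(request, encoding="unicode"))
print(ET.tostring(order, encoding="unicode"))
```

Generating the ORD value once and writing it into both trees is what keeps the two files consistent; the other identifiers are independent.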

How to add xmi:version="2.0" attribute to an element

I am creating an XML file. I have created the root element and I am able to define the XML declaration, but I need to create another tag, which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xmi:XMI xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:TalendProperties="http://www.talend.org/properties">
# i am unable to replicate the above
### some subelements..
</xmi:XMI>
I am done with adding the xmlns URIs, but I am unable to get the xmi:version="2.0" attribute.
I am not familiar with XML, so I am getting confused. I read about namespaces but am not quite getting it. Can somebody show me how to do that or share a related link? That would be a great help, because I found mostly XML parsing material on the internet but very few resources on XML generation.
xmlns_uris_dict = {'xmi':'http://..', 'subprocess':'http://xyz...'}
root = ET.Element("talendfile:ProcessType")
ET.register_namespace('xmi', 'version="2.0"') # This part gives a wrong presentation.
# i am able to add URIs here
for prefix, uri in xmlns_uris_dict.items():
    root.attrib['xmlns:' + prefix] = uri

A good way to create namespaced elements and attributes is to use QName.
import xml.etree.ElementTree as ET
NS = "http://www.omg.org/XMI"
ET.register_namespace("xmi", NS)
# Create xmi:XMI element
root = ET.Element(ET.QName(NS, "XMI"))
# Add xmi:version attribute
root.set(ET.QName(NS, "version"), "2.0")
print(ET.tostring(root).decode())
Result:
<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmi:version="2.0" />
register_namespace() ensures that the xmi prefix (not the default ns0) is used when serializing the XML document.
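To get closer to the document in the question, the following sketch extends the same idea with the second namespace and the XML declaration. The child element name Property and its label attribute are invented for illustration, and the xml_declaration argument of tostring requires Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"
TALEND_NS = "http://www.talend.org/properties"

# Register the preferred prefixes so the serializer does not emit ns0, ns1, ...
ET.register_namespace("xmi", XMI_NS)
ET.register_namespace("TalendProperties", TALEND_NS)

root = ET.Element(ET.QName(XMI_NS, "XMI"))
root.set(ET.QName(XMI_NS, "version"), "2.0")

# Hypothetical child element, only here to show a second namespace in use.
child = ET.SubElement(root, ET.QName(TALEND_NS, "Property"))
child.set("label", "example")

# xml_declaration=True prepends the <?xml ...?> line (Python 3.8+).
xml_bytes = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(xml_bytes.decode("UTF-8"))
```

ElementTree only writes an xmlns declaration for a namespace that is actually used by some element or attribute, which is why the sketch adds a child in the TalendProperties namespace rather than setting xmlns attributes by hand.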

Python: Universal XML parser

I'm trying to make a simple Python 3 program that reads weather information from an XML web source, converts it into a Python-readable object (maybe a dictionary) and processes it (for example, visualizing multiple observations in a graph).
The source of the data is the national weather service's (direct translation) XML file at the link provided in the code.
What's different from the typical XML parsing question on Stack Overflow is that there are repeated tags without an identifying attribute (<station> tags in my example) and some with one (1st line, <observations timestamp="14568.....">). Also, I would like to parse it straight from the website, not from a local file. Of course, I could create a local temporary file too.
What I have so far is a simple loading script that gives a string containing the XML for both the forecast and the latest weather observations.
from urllib.request import urlopen
#Read 4-day forecast
forecast= urlopen("http://www.ilmateenistus.ee/ilma_andmed/xml/forecast.php").read().decode("iso-8859-1")
#Get current weather
observ=urlopen("http://www.ilmateenistus.ee/ilma_andmed/xml/observations.php").read().decode("iso-8859-1")
In short, I'm looking for a way, as universal as possible, to parse XML into a Python-readable object (such as a dictionary/JSON or list) while preserving all of the information in the XML file.
P.S. I would prefer a standard Python 3 module such as xml, although I have not managed to understand it yet.

Try xmltodict package for simple conversion of XML structure to Python dict: https://github.com/martinblech/xmltodict
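For readers who want to stay with the standard library, as the question prefers, here is a rough sketch of a recursive ElementTree-to-dict converter. It is not xmltodict's implementation; the function name element_to_dict, the "@" and "#text" key conventions, and the sample data (station names, values and timestamp) are all made up for this example.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively convert an Element into nested dicts, lists and strings."""
    node = {}
    # Attributes are stored under "@name" keys.
    for name, value in element.attrib.items():
        node["@" + name] = value
    # Children that share a tag are collected into a list.
    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (element.text or "").strip()
    if not node:
        return text  # leaf element: just its text
    if text:
        node["#text"] = text
    return node


# Invented sample in the shape described in the question.
SAMPLE = """<observations timestamp="1456800000">
  <station><name>Tallinn</name><airtemperature>-1.5</airtemperature></station>
  <station><name>Tartu</name><airtemperature>-3.0</airtemperature></station>
</observations>"""

data = element_to_dict(ET.fromstring(SAMPLE))
print(data["@timestamp"])
print([station["name"] for station in data["station"]])
```

The string downloaded with urlopen in the question could be passed to ET.fromstring in the same way. One caveat of this simple approach is that a tag appearing only once becomes a single value rather than a one-item list, so code reading the result has to handle both shapes.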

How do i display the data contents of my xml file using pyqt4?

I am trying to build a tiny app that reads from an XML file and displays the contents in a widget. I don't know which widget to use exactly: QTextBrowser, QTextEdit or QWebView. I can't seem to find a good explanation. Please help as much as you can. I'm new to Python and PyQt, and my programming isn't good at all.

I suggest you first parse the XML content into a DOM object, and then show whatever you want from that object in your widget. For the first part (detailed info here):
from xml.dom import minidom
dom = minidom.parse('my_xml.xml')
print(dom.toxml()) # .toxml() creates a string from the dom object
def print_some_info(node):
    print('node representation: {0}'.format(node))
    print('.nodeName: ' + node.nodeName)
    print('.nodeValue: {0}'.format(node.nodeValue))
    # Recurse into every child node
    for child in node.childNodes:
        print_some_info(child)

print_some_info(dom)
(using e.g. an xml example in file 'my_xml.xml' from here)
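For the second part, one simple option is to turn the DOM into an indented plain-text outline and hand that string to a text widget. The sketch below only builds the text; the function name outline and the sample XML are invented for this example, and no Qt code is included.

```python
from xml.dom import minidom


def outline(node, depth=0):
    """Return a list of indented lines describing elements, attributes and text."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        attrs = " ".join('{0}="{1}"'.format(k, v) for k, v in node.attributes.items())
        lines.append("  " * depth + node.tagName + (" [" + attrs + "]" if attrs else ""))
        depth += 1  # children of an element are indented one level deeper
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + node.data.strip())
    for child in node.childNodes:
        lines.extend(outline(child, depth))
    return lines


# Invented sample document.
dom = minidom.parseString('<library><book id="1"><title>Dune</title></book></library>')
lines = outline(dom)
text = "\n".join(lines)
print(text)
```

The resulting string could then be shown with something like QTextEdit.setPlainText(text); a read-only QTextEdit or QTextBrowser is usually enough for plain text, while a tree widget may suit deeply nested documents better.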
