How to add xmi:version="2.0" attribute to an element - python-3.x

I am creating an XML file. I have created the root element and can write the XML declaration, but I need to create another tag, which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xmi:XMI xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:TalendProperties="http://www.talend.org/properties">
# i am unable to replicate the above
### some subelements..
</xmi:XMI>
I have added the xmlns URIs, but I cannot get xmi:version="2.0" onto the element.
I am not familiar with XML. I have read about namespaces but don't quite understand them. Can somebody show me how to do this, or share a relevant link? That would be a great help, because most of what I found online covers XML parsing and very little covers XML generation.
import xml.etree.ElementTree as ET

xmlns_uris_dict = {'xmi':'http://..', 'subprocess':'http://xyz...'}
root = ET.Element("talendfile:ProcessType")
ET.register_namespace('xmi', 'version="2.0"') # This part gives a wrong presentation.
# I am able to add the URIs here
for prefix, uri in xmlns_uris_dict.items():
    root.attrib['xmlns:' + prefix] = uri

A good way to create namespaced elements and attributes is to use QName.
import xml.etree.ElementTree as ET
NS = "http://www.omg.org/XMI"
ET.register_namespace("xmi", NS)
# Create xmi:XMI element
root = ET.Element(ET.QName(NS, "XMI"))
# Add xmi:version attribute
root.set(ET.QName(NS, "version"), "2.0")
print(ET.tostring(root).decode())
Result:
<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmi:version="2.0" />
register_namespace() ensures that the xmi prefix (not the default ns0) is used when serializing the XML document.
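To get closer to the full target document from the question, the same approach can be extended to a second namespace and an XML declaration. This is only a sketch under a few assumptions: the child element name Property and its label attribute are invented for illustration, and the xml_declaration argument of ET.tostring() requires Python 3.8 or later. Note that ElementTree only writes an xmlns declaration for a namespace that is actually used somewhere in the tree, which is why a child in the TalendProperties namespace is added here.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"
TALEND_NS = "http://www.talend.org/properties"

# Map each namespace URI to the prefix we want in the serialized output.
ET.register_namespace("xmi", XMI_NS)
ET.register_namespace("TalendProperties", TALEND_NS)

# Root element xmi:XMI with the namespaced attribute xmi:version="2.0".
root = ET.Element(ET.QName(XMI_NS, "XMI"))
root.set(ET.QName(XMI_NS, "version"), "2.0")

# Hypothetical child element; its name and attribute are assumptions.
child = ET.SubElement(root, ET.QName(TALEND_NS, "Property"))
child.set("label", "example")

# xml_declaration=True adds the <?xml ...?> line (Python 3.8+).
xml_bytes = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
xml_text = xml_bytes.decode("utf-8")
print(xml_text)
```

Setting root.attrib['xmlns:...'] by hand, as in the question, is then unnecessary: the serializer emits the declarations itself.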

Related

Groovy Script to remove the xml declaration and parent nodes and display only childnodes

I am trying to transform XML using a Groovy script, but it is failing. Any help will be highly appreciated.
My input XML is below:
<?xml version="1.0" encoding="UTF-8"?>
<rsp stat="ok" version="1.0">
  <result>
    <total_results>700</total_results>
    <emailClick>
      <id>993</id>
      <created_at>2023-02-14 00:00:10</created_at>
    </emailClick>
    <emailClick>
      <id>995</id>
      <created_at>2023-02-14 00:00:10</created_at>
    </emailClick>
  </result>
</rsp>
The output should contain only the child nodes, without the XML declaration and the parent tags:
<emailClick>
  <id>993</id>
  <created_at>2023-02-14 00:00:10</created_at>
</emailClick>
<emailClick>
  <id>995</id>
  <created_at>2023-02-14 00:00:10</created_at>
</emailClick>
I tried getting the node list using XmlParser and then converting it to a string, but that results in an error.

How to parse big XML file using beautiful soup?

I am trying to parse an XML file named document.xml, which contains around 400,000 characters (including tags, line breaks, and spaces). The code is below:
from bs4 import BeautifulSoup

# Read the whole file; the with-block closes it afterwards.
with open('document.xml', 'r') as document_xml_file_object:
    document_xml_file_content = document_xml_file_object.read()
xml_content = BeautifulSoup(document_xml_file_content, 'lxml-xml')
print("XML CONTENT: ", xml_content)
When I print xml_content, this is my output:
XML CONTENT: <?xml version="1.0" encoding="utf-8"?>
For smaller files it prints the complete XML. Can anyone help me understand why this is happening?
Edit : Click Here to see my XML Content.
Thanks in Advance
For large files it is better to use a streaming, event-based parser such as xml.sax. BeautifulSoup loads the whole file into memory and parses it there, whereas xml.sax processes the document incrementally and uses far less memory.
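As a rough illustration of the xml.sax approach, the sketch below counts how often each tag occurs without ever holding the whole document in memory. The TagCounter class and count_tags function are names made up for this example, and the in-memory sample stands in for document.xml; in practice you would pass the file name to count_tags instead.

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Counts start tags as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once for every opening tag the parser encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(source):
    """source may be a file name or a file-like object."""
    handler = TagCounter()
    xml.sax.parse(source, handler)
    return handler.counts


# Small stand-in for a large file such as document.xml.
sample = io.BytesIO(b"<doc><p>one</p><p>two</p></doc>")
counts = count_tags(sample)
print(counts)
```

The handler only keeps what it explicitly stores (here, a dictionary of counters), so memory use stays small regardless of file size.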

Generate random numbers and update in XML file using Robot Framework

I have two XML files whose values I currently change by hand before continuing with further evaluation. I would like to know how to update the values in these XML files using Robot Framework.
I have used the Faker library to generate random numbers, but I don't know how to write them into the XML. The first XML file looks something like this:
<dns:ManageRequest>
  <SPResource>
    <ID>ORD452257337191</ID>
    <interactionDate>2016-09-20T02:35:30Z</interactionDate>
    <orderType>Connect</orderType>
    <SPResourceComprisedOf>
      <DescribedBy>
        <value>CLI0000000000191</value>
        <Characteristic>
          <ID>clientID</ID>
        </Characteristic>
      </DescribedBy>
      <DescribedBy>
        <value>TOW566105009191</value>
        <Characteristic>
          <ID>ticketOfWorkId</ID>
        </Characteristic>
      </DescribedBy>
    </SPResourceComprisedOf>
  </SPResource>
</dns:ManageRequest>
The second XML file looks like this:
<dns:ManageOrder>
  <FieldWork>
    <ID>WOR140618136785</ID>
    <Priority>
      <priorityValues>45</priorityValues>
    </Priority>
    <baseRevisionNumber>-1</baseRevisionNumber>
    <FieldWorkSpecifiedBy>
      <ID>Activation</ID>
      <version>1.0.5</version>
      <type>WorkOrder Specification</type>
    </FieldWorkSpecifiedBy>
    <FieldWorkOverview>
      <DescribedBy>
        <value>WRQ140618136785</value>
        <Characteristic>
          <ID>Work Request ID</ID>
          <type>Overview</type>
        </Characteristic>
      </DescribedBy>
      <DescribedBy>
        <value>ORD452257337191</value>
        <Characteristic>
          <ID>Reference ID</ID>
          <type>Overview</type>
        </Characteristic>
      </DescribedBy>
    </FieldWorkOverview>
  </FieldWork>
</dns:ManageOrder>
In the first XML file the values of ORD, CLI and TOW need to be changed, and in the second XML file WOR and WRQ need to be changed, but the value of ORD in the second file needs to be the same as the value of ORD in the first file.
I really appreciate any help, because I am really lost on this now :( Thanks!
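You can use Robot Framework's XML library, which can use lxml if it is installed and enabled.
Link: https://pypi.org/project/lxml/
This example changes the ID element whose value is ORD452257337191 to '123456'. In the first file that element sits under SPResource, so the xpath is given relative to the root element.
Code:
${file}=    Get File    ${path_to_file}    encoding=UTF-8
${xml_file}=    Parse XML    ${file}
Set Element Text    ${xml_file}    123456    xpath=SPResource/ID
Save XML    ${xml_file}    ${path_to_file}    encoding=UTF-8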
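The requirement that both files share the same ORD value can also be sketched in plain Python, for instance as the basis of a custom keyword library. Everything here is an assumption made for illustration: the helper names random_id and replace_ids, the digit counts, the rule that an ID is a known prefix followed by digits, and the simplified in-memory XML (the dns: prefix is left out because the excerpts in the question do not show its namespace declaration).

```python
import random
import xml.etree.ElementTree as ET


def random_id(prefix, digits=12):
    """Return prefix followed by `digits` random decimal digits."""
    return prefix + "".join(random.choice("0123456789") for _ in range(digits))


def replace_ids(root, new_ids):
    """Replace element text of the form <prefix><digits> with the new value."""
    for element in root.iter():
        text = (element.text or "").strip()
        for prefix, value in new_ids.items():
            if text.startswith(prefix) and text[len(prefix):].isdigit():
                element.text = value
                break


# Simplified stand-ins for the two files from the question.
request = ET.fromstring(
    "<ManageRequest><SPResource><ID>ORD452257337191</ID>"
    "<DescribedBy><value>CLI0000000000191</value></DescribedBy>"
    "<DescribedBy><value>TOW566105009191</value></DescribedBy>"
    "</SPResource></ManageRequest>"
)
order = ET.fromstring(
    "<ManageOrder><FieldWork><ID>WOR140618136785</ID>"
    "<DescribedBy><value>WRQ140618136785</value></DescribedBy>"
    "<DescribedBy><value>ORD452257337191</value></DescribedBy>"
    "</FieldWork></ManageOrder>"
)

# Generate the ORD value once so both documents receive the same one.
shared_ord = random_id("ORD")
replace_ids(request, {"ORD": shared_ord,
                      "CLI": random_id("CLI", 13),
                      "TOW": random_id("TOW")})
replace_ids(order, {"ORD": shared_ord,
                    "WOR": random_id("WOR"),
                    "WRQ": random_id("WRQ")})

print(ET.tostring(request, encoding="unicode"))
print(ET.tostring(order, encoding="unicode"))
```

The key design point is that the shared value is generated once and passed to both updates; the same idea applies in Robot Framework by storing the generated ORD in a variable before editing either file.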

How to access text in XML containing namespace using python ElementTree

I have a simple XML with namespaces. I am unable to access the text inside the namespace. The XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Created by IRIS Business Services Limited -->
<link:linkbase xmlns:xsi="http://www.ffff.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.uhtj.org/2006/ref http://www.frsfs.org/2006/ref-2006-02-27.xsd http://www.ghi.org/in-ghi-rep-par ../core/in-ghi-rep-par.xsd http://www.rl.org/2003/linkbase http://www.rl.org/2003/rl-linkbase-2003-12-31.xsd" xmlns:in-ghi-rep-par="http://www.ghi.org/in-ghi-rep-par" xmlns:link="http://www.rl.org/2003/linkbase" xmlns:ref="http://www.rl.org/2006/ref" xmlns:rli="http://www.rl.org/2003/instance" xmlns:xlink="http://www.ffff.org/1999/xlink">
<link:referenceLink xlink:type="extended" xlink:role="http://www.rl.org/2003/role/link">
<link:loc xlink:type="locator" xlink:href="../core/in-ghi-rep.xsd#in-ghi-rep_ReportingPeriodTable" xlink:label="in-ghi-rep_ReportingPeriodTable"/>
<link:reference xlink:type="resource" xlink:label="res_1" xlink:role="http://www.rl.org/2003/role/disclosureRef" id="res_1">
<in-ghi-rep-par:Circular>DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000</in-ghi-rep-par:Circular>
</link:reference>
</link:referenceLink>
</link:linkbase>
All I want to do is retrieve "DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000" which is the Circular value.
My current code looks like this. I have explored ElementTree but still not able to get the solution.
from lxml import etree
tree = etree.parse("s2.xml")
root = tree.getroot()
root.nsmap
for Circular in root.findall('{http://www.ghi.org/in-ghi-rep-par}'):
    print (Circular.text)
I am new to parsing XML. Please help.
Your findall expression is not correct. findall searches relative to the element you call it on, and your expression tells it to look only at that element's direct children. The root element has no direct children in this namespace, so it correctly returns an empty list. Your expression could work if you ran it on the parent element of the Circular tag. Besides the namespace, though, you also need to pass either a wildcard to match every tag in that namespace or, if you are only interested in Circular, that tag name.
print(root[0][1].findall('{http://www.ghi.org/in-ghi-rep-par}*'))
print(root[0][1].findall('{http://www.ghi.org/in-ghi-rep-par}Circular'))
If you don't know where the tag might be in the XML, you can search from the root and prefix the expression with .// so that it matches descendants at any depth below this element. Again, you need to give either a wildcard or the actual tag name.
print(root.findall('.//{http://www.ghi.org/in-ghi-rep-par}*'))
print(root.findall('.//{http://www.ghi.org/in-ghi-rep-par}Circular'))
For example:
print(root.findall('.//{http://www.ghi.org/in-ghi-rep-par}Circular')[0].text)
OUTPUT
DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000
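An alternative to spelling out the {uri} form each time is to pass a prefix-to-URI mapping to find or findall. The sketch below uses the standard library's xml.etree.ElementTree on a trimmed copy of the document from the question; the prefix name par is an arbitrary choice for this example and does not have to match the prefix used in the file.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the XML from the question.
XML = """<link:linkbase xmlns:link="http://www.rl.org/2003/linkbase"
    xmlns:in-ghi-rep-par="http://www.ghi.org/in-ghi-rep-par">
  <link:referenceLink>
    <link:reference id="res_1">
      <in-ghi-rep-par:Circular>DBS.No.FBC.BC.34/13.12.001/99-2000 dt April 6, 2000</in-ghi-rep-par:Circular>
    </link:reference>
  </link:referenceLink>
</link:linkbase>"""

# Only the URI matters; "par" is a local alias chosen for this lookup.
namespaces = {"par": "http://www.ghi.org/in-ghi-rep-par"}

root = ET.fromstring(XML)
circular = root.find(".//par:Circular", namespaces)
circular_text = circular.text
print(circular_text)
```

lxml's find and findall accept the same kind of mapping through their namespaces argument.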

Loop and merge XML files in a folder to one HTML file using Groovy script?

I am using ReadyAPI and working on report generation. I am at the point where the XML files for each test case have been generated, and I need to merge them.
All I have is the path where the XML files are, say "C:\Path".
I have found parsers for single files, and ways to append information from one XML file to another, but I have not found a way to loop through all the XML files and dump their content into a new file.
Any help or pointers would be much appreciated.
Jackson.
There is a working example of this answer here.
Let's assume that we have XML files of this form:
<composer>
<name>Wolfgang Mozart</name>
<born>1756</born>
</composer>
Then, we could build a list of parsed XML documents from each .xml file in the current directory (or whichever you need):
def composers = []
new File(".").eachFile { def file ->
    if (file.name ==~ /.*\.xml/) {
        composers << new XmlSlurper().parse(file)
    }
}
Then, we could use a StreamingMarkupBuilder to create a unified XML document. Note this mixes markup with the composers list built above:
def xml = new StreamingMarkupBuilder().bind {
    root {
        composers.each { c ->
            mkp.yield c
        }
    }
}.toString()
That is, the document looks like:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <composer>
    <name>Wolfgang Mozart</name>
    <born>1756</born>
  </composer>
  <composer>
    <name>JS Bach</name>
    <born>1685</born>
  </composer>
  ...
</root>
Adapting the solution to your own needs should be straightforward.
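For comparison, the same loop-and-merge idea can be sketched in Python with the standard library. This is not the Groovy answer's code, only an equivalent under assumptions: the function name merge_xml_files and the wrapper tag root are chosen for this example, and a temporary directory with two small files stands in for the real report folder.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def merge_xml_files(folder, root_tag="root"):
    """Parse every .xml file in folder and append its root under one new root."""
    merged = ET.Element(root_tag)
    for path in sorted(Path(folder).glob("*.xml")):
        merged.append(ET.parse(path).getroot())
    return merged


# Demo: two small files in a temporary folder stand in for the real reports.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.xml").write_text(
        "<composer><name>Wolfgang Mozart</name><born>1756</born></composer>")
    Path(folder, "b.xml").write_text(
        "<composer><name>JS Bach</name><born>1685</born></composer>")
    merged = merge_xml_files(folder)
    merged_text = ET.tostring(merged, encoding="unicode")

print(merged_text)
```

Sorting the paths makes the order of the merged elements deterministic, which is convenient when the result is compared between runs.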
