Using Python 3, I am trying to read an XML file, recalculate values based on the attributes within each Item, and then write a copy of the entire XML file with the new values.
Example of xml file (about 10k rows in full file):
<?xml version="1.0" encoding="utf-8"?>
<Items>
<Item id="headscarf_d"
name="{=wW3iouiU}Hijab"
mesh="headscarf_d"
culture="Culture.aserai"
weight="0.5"
value="63"
appearance="1"
Type="HeadArmor">
<ItemComponent>
<Armor head_armor="3"
has_gender_variations="false"
beard_cover_type="type3"
hair_cover_type="all"
modifier_group="cloth_unarmoured"
material_type="Cloth"/>
</ItemComponent>
<Flags Civilian="true"
UseTeamColor="true" />
</Item>
<Item id="open_head_scarf"
name="{=qsVRoGUv}Open Head Scarf"
mesh="aserai_helmet_c"
culture="Culture.aserai"
weight="0.6"
value="174"
appearance="1"
Type="HeadArmor">
<ItemComponent>
<Armor head_armor="5"
has_gender_variations="false"
beard_cover_type="type3"
hair_cover_type="all"
modifier_group="cloth_unarmoured"
material_type="Cloth"/>
</ItemComponent>
<Flags Civilian="true"
UseTeamColor="true" />
</Item>
<Item id="woven_turban"
name="{=ArPvuBYK}Woven Turban"
subtype="head_armor"
mesh="aserai_helmet_h"
culture="Culture.aserai"
weight="0.8"
difficulty="0"
value="250"
appearance="1"
Type="HeadArmor">
<ItemComponent>
<Armor head_armor="6"
has_gender_variations="false"
beard_cover_type="type2"
hair_cover_type="all"
modifier_group="cloth_unarmoured"
material_type="Cloth"/>
</ItemComponent>
<Flags Civilian="true"
UseTeamColor="true" />
</Item>
</Items>
Taking a single item from the example xml,
<Item id="headscarf_d"
name="{=wW3iouiU}Hijab"
mesh="headscarf_d"
culture="Culture.aserai"
weight="0.5"
value="63"
appearance="1"
Type="HeadArmor">
<ItemComponent>
<Armor head_armor="3"
has_gender_variations="false"
beard_cover_type="type3"
hair_cover_type="all"
modifier_group="cloth_unarmoured"
material_type="Cloth"/>
</ItemComponent>
<Flags Civilian="true"
UseTeamColor="true" />
</Item>
For simplicity, say I want to take the Item value (63 in the example above) and divide it by 2 (63/2 = 31.5). Then, if the Armor element inside the Item's ItemComponent has material_type="Cloth", divide by 2 again (31.5/2 = 15.75). Finally, round to an integer, update the value, repeat for each Item, and write the updated XML file.
I tried to follow the question "Reading, modifying and writing xml" but could not get anything useful out of it.
You are probably looking for something along these lines:
from lxml import etree
import math

# Bytes rather than str: lxml may reject a str that carries an encoding declaration
inv = b"""[your xml above]"""
doc = etree.XML(inv)
for item in doc.xpath('//Item'):
    # Every Item value is divided by 2 ...
    val = float(item.attrib['value']) / 2
    # ... and Cloth items are divided by 2 again (by 4 in total)
    if item.xpath('ItemComponent/Armor[@material_type="Cloth"]'):
        val /= 2
    item.attrib['value'] = str(math.ceil(val))
print(etree.tostring(doc).decode())
The output is your XML with each Items/Item/@value attribute divided by 2 or 4, as necessary, and rounded up by math.ceil(). Since every Item in your example has material_type="Cloth", all values were divided by 4 and rounded up to:
16
44
63
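A minimal sketch of the same recalculation using only the standard library's xml.etree.ElementTree instead of lxml, looking up each Item's own Armor element rather than pairing by position. The function name rescale_values, the trimmed sample document, and the plate_helm item are assumptions made up for illustration; rounding uses math.ceil as in the answer.

```python
import math
import xml.etree.ElementTree as ET


def rescale_values(root):
    """Halve every Item value, halve it again for Cloth armor, round up."""
    for item in root.iter("Item"):
        raw = item.get("value")
        if raw is None:
            continue  # Items without a value attribute are left untouched
        value = float(raw) / 2
        armor = item.find("ItemComponent/Armor")
        if armor is not None and armor.get("material_type") == "Cloth":
            value /= 2
        item.set("value", str(math.ceil(value)))
    return root


# Trimmed, hypothetical sample: one Cloth item and one non-Cloth item.
sample = """<Items>
  <Item id="headscarf_d" value="63">
    <ItemComponent><Armor material_type="Cloth"/></ItemComponent>
  </Item>
  <Item id="plate_helm" value="63">
    <ItemComponent><Armor material_type="Plate"/></ItemComponent>
  </Item>
</Items>"""

root = rescale_values(ET.fromstring(sample))
new_values = [item.get("value") for item in root.iter("Item")]
print(new_values)
```

For a real file, ET.parse() returns a tree whose getroot() can be passed to the function, and tree.write(path, encoding="utf-8", xml_declaration=True) writes the copy. ElementTree does not preserve the original line breaks between attributes.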
Edit/Update: By removing the <GrpHdr> element completely, Excel was able to verify the XML Map as exportable. My original question still remains, how can I solve the "Denormalized Data" error, with the <GrpHdr> included.
I am new to XML, and have been trying to import a source file (XML below) into Excel, create a schema/XML Map (unsure of the difference there) which I can then drag and drop onto two different tables:
One table contains one row of data for the Group Header: <GrpHdr> (Occurs ONCE)
One table contains multiple rows of data for the various Payments: <PmtInf> (Occurs MULTIPLE times)
I am able to successfully load the below XML into Excel using the Source button, and also to create an XML map off of it (which then appears in a "XML Source" window, showing the parent and child elements).
The problem I am having is in Verifying the XML Map for export. Excel says that the map contains "Denormalized Data". I have looked at various Microsoft resources, as well as on Stack Overflow.
Such as:
https://support.microsoft.com/en-us/office/issue-verifying-an-xml-map-for-export-fbfcdb77-c2d6-4040-b256-e584a71151b0
excel: Cannot save or export xml data. The xml map in this workbook are not exportable
Export denormalized data from excel to xml
Based on my research, I tried the following:
I have tried setting the minOccurs and maxOccurs attributes to "0" and "unbounded" respectively, since I believe both default to "1" and that Excel's "Denormalized Data" error is caused by an element whose maxOccurs is "1".
I have also tried adding multiple <PmtInf> elements, so Excel knows (when creating a schema from the below sample file), that <PmtInf> is to occur multiple times.
Thanks!
<?xml version="1.0" encoding="utf-8"?>
<Document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
<CstmrCdtTrfInitn>
<GrpHdr>
<MsgId>UNIQUE MESSAGE ID 35 AN</MsgId>
<CreDtTm>2016-05-26T10:07:00</CreDtTm>
<NbOfTxs>1</NbOfTxs>
<CtrlSum>0.01</CtrlSum>
<InitgPty>
<Id>
<OrgId>
<Othr>
<Id>ABC12345678</Id>
</Othr>
</OrgId>
</Id>
</InitgPty>
</GrpHdr>
<PmtInf>
<PmtInfId>ORIGINATOR REFERENCE 35AN</PmtInfId>
<PmtMtd>TRF</PmtMtd>
<PmtTpInf>
<SvcLvl>
<Cd>SEPA</Cd>
</SvcLvl>
</PmtTpInf>
<ReqdExctnDt>2016-05-26</ReqdExctnDt>
<Dbtr>
<Nm>DEBTOR NAME 70AN</Nm>
<PstlAdr>
<StrtNm>Street Name</StrtNm>
<BldgNb>Building Number</BldgNb>
<PstCd>Post Code</PstCd>
<TwnNm>Town Name</TwnNm>
<CtrySubDvsn>County/State/Region</CtrySubDvsn>
<Ctry>LU</Ctry>
</PstlAdr>
</Dbtr>
<DbtrAcct>
<Id>
<IBAN>NL39HSBC0123456789</IBAN>
</Id>
</DbtrAcct>
<DbtrAgt>
<FinInstnId>
<BIC>HSBCNL2A</BIC>
<PstlAdr>
<Ctry>IE</Ctry>
</PstlAdr>
</FinInstnId>
</DbtrAgt>
<ChrgBr>SLEV</ChrgBr>
<CdtTrfTxInf>
<PmtId>
<InstrId>PAYMENT ID 35AN</InstrId>
<EndToEndId>UNIQUE BENEFICIARY REFERENCE 35AN</EndToEndId>
</PmtId>
<Amt>
<InstdAmt Ccy="EUR">0.01</InstdAmt>
</Amt>
<CdtrAgt>
<FinInstnId>
<BIC>MIDLGB22</BIC>
<PstlAdr>
<Ctry>GB</Ctry>
</PstlAdr>
</FinInstnId>
</CdtrAgt>
<Cdtr>
<Nm>CREDITOR NAME 70AN</Nm>
<PstlAdr>
<StrtNm>Street Name</StrtNm>
<BldgNb>Building Number</BldgNb>
<PstCd>Post Code</PstCd>
<TwnNm>Town Name</TwnNm>
<CtrySubDvsn>County/State/Region</CtrySubDvsn>
<Ctry>GB</Ctry>
</PstlAdr>
</Cdtr>
<CdtrAcct>
<Id>
<IBAN>GB94MIDL40123487654321</IBAN>
</Id>
</CdtrAcct>
<RmtInf>
<Ustrd>Remittance Info up to 140AN</Ustrd>
</RmtInf>
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>
I am trying to create an XML document whose first element is:
<speak version="1.0"
xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
</speak>
I am able to add the first attributes with...
from lxml.etree import Element, SubElement, QName, tostring
root = Element('speak', version="1.0",
xmlns="http://www.w3.org/2001/10/synthesis")
...but not the namespaced attribute xml:lang="en-US". Based on several tutorials and questions, I tried many solutions, but none worked.
For example, I tried this:
class XMLNamespaces:
    xml = 'http://www.w3.org/2001/10/synthesis'

root.attrib[QName(XMLNamespaces.xml, 'lang')] = "en-US"
But the output is
<speak xmlns:ns0="http://www.w3.org/2001/10/synthesis" version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ns0:lang="en-US">
How can I create the xml:lang="en-US" of my first xml element?
The special xml: prefix is associated with the http://www.w3.org/XML/1998/namespace URI.
The following code adds xml:lang="en-US" to the root element:
root.attrib[QName("http://www.w3.org/XML/1998/namespace", "lang")] = "en-US"
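For comparison, here is a minimal sketch of the same idea with the standard library's xml.etree.ElementTree rather than lxml. The constant names SSML_NS and XML_NS are chosen here for illustration. The xml prefix is reserved for the XML namespace, so the serializer writes xml:lang without emitting an extra declaration for it.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Ask the serializer to write the SSML namespace as the default namespace.
ET.register_namespace("", SSML_NS)

root = ET.Element(
    f"{{{SSML_NS}}}speak",
    {"version": "1.0", f"{{{XML_NS}}}lang": "en-US"},
)

serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

In lxml the default namespace is usually supplied through the nsmap argument of Element instead of a plain xmlns keyword argument.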
I am trying to get values from a web service response in ReadyAPI so that I can pass them to another web service request and build an automated test flow.
I have tried different pieces of code, most of them a single line, which I prefer if possible. I can get a value from a node by selecting its parent node by attribute value. I can also find a parent node by a child node's attribute value and use it to get another child's value.
Here are some examples:
First format, where I can get the child's value:
<webserviceResponse>
<documentslist>
<document id="1">
<payment currency="USD" >
<amount>1250.00</amount>
</payment>
</document>
<document id="2">
<payment currency="JPY" >
<amount>150.00</amount>
</payment>
</document>
<document id="3">
<payment currency="EUR" >
<amount>1170.00</amount>
</payment>
</document>
<!-- etc. -->
</documentslist>
-----> To get currency for a specific document
def webServiceResponse = "webservice#Response"
int index=2
def currency = context.expand('${'+webServiceResponse+'//*:document[@id="['+index+']"]//*:payment/@currency}')
-----> Result of this is "JPY"
<webserviceResponse>
<documentslist>
<document id="1">
<payment currency="USD" >
<amount>1250.00</amount>
</payment>
<refund>true</refund>
</document>
<document id="2">
<payment currency="JPY" >
<amount>150.00</amount>
</payment>
</document>
<document id="3">
<payment currency="EUR" >
<amount>1170.00</amount>
</payment>
<refund>false</refund>
</document>
<!-- etc. -->
</documentslist>
-------> To get a currency depending on the existence of a specific node
In this example we scan the file from top to bottom, find every refund node,
and take the currency value from the same block as the second refund node we encounter.
def webServiceResponse = "webservice#Response"
int index=2
def currency = context.expand('${'+webServiceResponse+'(//*:refund)['+index+']//parent::*//*:payment/@currency}')
--------> Result for this is "EUR"
This is the one where I can't get the child value in the same way.
<webserviceResponse>
<documentslist>
<document>
<key>D_Computer</key>
<currency>USD</currency>
<amount>1250.00</amount>
<refund>true</refund>
</document>
<document>
<key>D_Keyboard</key>
<currency>JPY</currency>
<amount>150.00</amount>
</document>
<document>
<key>D_Monitor</key>
<currency>EUR</currency>
<amount>1170.00</amount>
<refund>false</refund>
</document>
<!-- etc. -->
</documentslist>
My problem with this one is that it has no attributes, only node values. I know it doesn't have an integer index either, but maybe I am doing something wrong that I don't realize.
I want to get the amount value depending only on the value of the "key" node, which I will specify in the script.
The result should be: 150.00
Thank you for the very detailed and well written question.
You can use the code below. Your problem is easy because there are no namespaces in the XML.
The technique is the same one you have shown; you just don't need the @, because that is for attributes.
def groovyUtils=new com.eviware.soapui.support.GroovyUtils(context)
def xml=groovyUtils.getXmlHolder("NameOfRequest#Response");
// The key to look up; it must be defined before it is used in the XPath
def key="D_Keyboard"
def amount=xml.getNodeValue("//*:documentslist/*:document[key='${key}']/*:amount");
log.info "Value of $key is " + amount
// Look up a second key with the same expression
key="D_Monitor"
amount=xml.getNodeValue("//*:documentslist/*:document[key='${key}']/*:amount");
log.info "Value of $key is " + amount
Replace NameOfRequest with your request's name.
There is an alternative way too. I will post it as a separate answer so as not to cause confusion. This one is still better than the other one.
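The predicate on a child element's text is plain XPath and is not specific to ReadyAPI. As an illustration only, here is a small sketch in Python's standard library that uses the same [key='...'] predicate on the sample response from the question; the helper name amount_for is hypothetical.

```python
import xml.etree.ElementTree as ET

response = """<webserviceResponse>
  <documentslist>
    <document><key>D_Computer</key><currency>USD</currency><amount>1250.00</amount><refund>true</refund></document>
    <document><key>D_Keyboard</key><currency>JPY</currency><amount>150.00</amount></document>
    <document><key>D_Monitor</key><currency>EUR</currency><amount>1170.00</amount><refund>false</refund></document>
  </documentslist>
</webserviceResponse>"""

root = ET.fromstring(response)


def amount_for(key):
    """Return the amount text of the document whose <key> child equals key."""
    node = root.find(f".//document[key='{key}']/amount")
    return None if node is None else node.text


keyboard_amount = amount_for("D_Keyboard")
print(keyboard_amount)
```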
There is an alternative way of doing this with a HashMap, in case the other answer does not work because of namespaces in your XML.
Try this method.
We first get all the values with getNodeValues, and since they come in pairs, we put them into a HashMap.
Now you can retrieve any of them.
def groovyUtils=new com.eviware.soapui.support.GroovyUtils(context)
def xml=groovyUtils.getXmlHolder("Request1#Response");
def keys=xml.getNodeValues("//*:documentslist/*:document/*:key")
def amounts=xml.getNodeValues("//*:documentslist/*:document/*:amount")
log.info keys.toString()
log.info amounts.toString()
HashMap h1=[:]
// Add the pair into hashmap and then retrieve
for(int i=0;i<keys.size();i++)
{
    h1.put(keys[i],amounts[i])
}
def whichone="D_Computer"
log.info "Value for $whichone is " + h1.get(whichone)
Let's say you want to retrieve more than one value; then you can use arrays,
i.e. read key, currency, amount and refund into four arrays.
If you want the refund for key='Z', a for loop can tell you that Z sits at position 3 in the key array,
so your refund is refund[3], and similarly currency[3] and amount[3]. This assumes every document contains all four elements; otherwise the arrays go out of step.
Both answers have their own relevance.
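One way to avoid keeping several parallel arrays aligned is to build one record per document and index the records by key. The sketch below shows that idea in Python's standard library, purely as an illustration of the data structure; the variable names are made up here, and the sample is the response from the question, where D_Keyboard has no refund element.

```python
import xml.etree.ElementTree as ET

response = """<webserviceResponse>
  <documentslist>
    <document><key>D_Computer</key><currency>USD</currency><amount>1250.00</amount><refund>true</refund></document>
    <document><key>D_Keyboard</key><currency>JPY</currency><amount>150.00</amount></document>
    <document><key>D_Monitor</key><currency>EUR</currency><amount>1170.00</amount><refund>false</refund></document>
  </documentslist>
</webserviceResponse>"""

root = ET.fromstring(response)

# Map each key to a dict of that document's own child elements.
records = {}
for doc in root.iter("document"):
    fields = {child.tag: child.text for child in doc}
    records[fields["key"]] = fields

monitor = records["D_Monitor"]
keyboard_refund = records["D_Keyboard"].get("refund")  # element is absent
print(monitor["currency"], monitor["amount"], monitor["refund"])
print(keyboard_refund)
```

Because each record is built from a single document element, a missing refund shows up as None for that key instead of shifting the values of later documents.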
Python 3.6, Lxml, Windows 10
I am going crazy. I want to access the item field, but I always get this error:
AttributeError: 'cython_function_or_method' object has no attribute 'item'
Everything else (address fields etc...) I can access without problems. How can I access the item fields (sku, amount etc...)?
I've used this code:
import requests
from lxml import objectify
url = "URL_TO_XML_FILE"
xml_content = requests.get(url).text.encode('utf-8')
xml = objectify.fromstring(xml_content)
for sale in xml.response.sales.sale:
    for item in sale.items.item:
        print(item.sku)
Here is the beginning of the xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<getnewsalesresult xmlns="https://pmcdn.priceminister.com/res/schema/getnewsales">
<request>
<version>2017-08-07</version>
<user>SELLER</user>
</request>
<response>
<lastversion>2017-08-07</lastversion>
<sellerid>95029358</sellerid>
<sales>
<sale>
<purchaseid>297453287592813953</purchaseid>
<purchasedate>15/12/2018-19:10</purchasedate>
<deliveryinformation>
<shippingtype>Normal</shippingtype>
<isfullrsl>N</isfullrsl>
<purchasebuyerlogin><![CDATA[LOGIN]]></purchasebuyerlogin>
<purchasebuyeremail>EMAIL</purchasebuyeremail>
<deliveryaddress>
<civility>Mme</civility>
<lastname><![CDATA[Lastname]]></lastname>
<firstname><![CDATA[Firstname]]></firstname>
<address1><![CDATA[STREET]]></address1>
<address2><![CDATA[]]></address2>
<zipcode>13570</zipcode>
<city><![CDATA[Paris]]></city>
<country><![CDATA[France]]></country>
<countryalpha2>FX</countryalpha2>
<phonenumber1></phonenumber1>
<phonenumber2>PHONENUMBER</phonenumber2>
</deliveryaddress>
</deliveryinformation>
<items>
<item>
<sku><![CDATA[SKU1]]></sku>
<advertid>411812243030</advertid>
<advertpricelisted>
<amount>15.99</amount>
<currency>EUR</currency>
</advertpricelisted>
<itemid>551131040</itemid>
<headline><![CDATA[HEADLINE]]></headline>
<itemstatus><![CDATA[REQUESTED]]></itemstatus>
<ispreorder>N</ispreorder>
<isnego>N</isnego>
<negotiationcomment></negotiationcomment>
<price>
<amount>15.99</amount>
<currency>EUR</currency>
</price>
<isrsl>N</isrsl>
<isbn></isbn>
<ean>4363745894373857474; </ean>
<paymentstatus><![CDATA[INCOMING]]></paymentstatus>
<sellerscore></sellerscore>
</item>
</items>
</sale>
<sale>
The problem is that items is actually a method of ObjectifiedElement, so the expression sale.items actually returns the method, because it has precedence.
To get the 'items' child you want, you have to ask for it explicitly instead of letting Python find the method on the class first, which is the usual Python lookup order. The following is what Python calls behind the scenes when normal attribute lookup fails, and you can call it yourself:
sale.__getattr__('items')
This will also work (it's a dictionary-like interface to the attributes of an object):
sale.__dict__['items']
The revised code:
import requests
from lxml import objectify
url = "URL_TO_XML_FILE"
xml_content = requests.get(url).text.encode('utf-8')
xml = objectify.fromstring(xml_content)
for sale in xml.response.sales.sale:
    for item in sale.__dict__['items'].item:
        print(item.sku)
Another way to deal with this is to avoid using the flaky attribute interface:
for sale in xml['response']['sales']['sale']:
    for item in sale['items']['item']:
        print(item['sku'])
Using the dict-like indexing interface, you never have to worry about certain attribute names (which include such common words as items, index, keys, remove, replace, tag, set, text, and values) returning surprising results.
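The shadowing can be reproduced without lxml. The sketch below uses a toy class, Node, which is a hypothetical stand-in and not lxml's implementation: child lookup goes through __getattr__, which Python calls only when normal attribute lookup fails, so a real method named items always wins.

```python
class Node:
    """Toy stand-in for an objectified element (not lxml's actual class)."""

    def __init__(self, children):
        self._children = children

    def items(self):
        # A regular method, like the items() method every element has.
        return []

    def __getattr__(self, name):
        # Reached only when normal lookup (instance, then class) finds nothing.
        try:
            return self._children[name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, name):
        # Index access never consults methods, so no name can be shadowed.
        return self._children[name]


sale = Node({"items": "child <items>", "purchaseid": "child <purchaseid>"})

shadowed = callable(sale.items)   # the method, not the child
via_attr = sale.purchaseid        # no method of that name, so the child
via_index = sale["items"]         # the child, regardless of method names
print(shadowed, via_attr, via_index)
```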
I came across a strange behavior of the each() method when trying this code:
def xml = new XmlSlurper().parseText('''
<list>
    <item a="1">a</item>
    <item a="2">b</item>
    <item a="1">c</item>
</list>
''')
def i = 0
xml.'**'.findAll { it.@a=='1' }.each {
    println "hi" + i
}
The result is only hi0, however I would expect hi0hi1. Is this behavior a bug or per language design? The second result is only provided if I write println "hi" + i++ instead of the current closure body, so when the content is different for each item...
Your i variable is not being incremented because there's nothing that tells it to increment. The way your code is currently written, I would expect the output to be:
hi0
hi0
I think what you are looking for is eachWithIndex, which provides the closure with two arguments - the current item and the index of the item. Your code would then look like this:
def xml = new XmlSlurper().parseText('''
<list>
    <item a="1">a</item>
    <item a="2">b</item>
    <item a="1">c</item>
</list>
''')
xml.'**'.findAll { it.@a=='1' }.eachWithIndex { item, i ->
    println "hi" + i
}
This results in an output of:
hi0
hi1
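For readers coming from Python, a rough analogue of eachWithIndex is enumerate. The sketch below is only a comparison in a different language, using the standard library's ElementTree and the same sample list; it is not part of the Groovy solution.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<list><item a="1">a</item><item a="2">b</item><item a="1">c</item></list>'
)

# Select the items whose attribute a equals "1", then number the matches.
matches = root.findall(".//item[@a='1']")
lines = [f"hi{i}" for i, _item in enumerate(matches)]
print("\n".join(lines))
```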