I need to find the index (position) of an XML element with a certain attribute and namespace. My XML contains several elements with the same name, so the only way to identify the right one is by its attribute.
This is a sample of my XML document:
<mets:mets LABEL="Moderní pedagogika, 2002" TYPE="Monograph"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:mets="http://www.loc.gov/METS/"
xmlns:mods="http://www.loc.gov/mods/v3"
xmlns:ns3="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:ns5="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/XMLSchema-instance http://www.w3.org/2001/XMLSchema.xsd http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-4.xsd http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd http://www.w3.org/1999/xlink http://www.w3.org/1999/xlink.xsd">
<mets:metsHdr CREATEDATE="2012-12-05T07:42:22" LASTMODDATE="2012-12-05T07:42:22">
<mets:agent ROLE="CREATOR" TYPE="ORGANIZATION">
<mets:name>ABA001</mets:name>
</mets:agent>
<mets:agent ROLE="ARCHIVIST" TYPE="ORGANIZATION">
<mets:name>ABA001</mets:name>
</mets:agent>
</mets:metsHdr>
<mets:dmdSec ID="MODSMD_VOLUME_0001">
.....
</mets:dmdSec>
<mets:dmdSec ID="DCMD_VOLUME_0001">
.....
</mets:dmdSec>
</mets:mets>
The desired index in this case is the index of the tag <mets:dmdSec ID="MODSMD_VOLUME_0001">.
I have tried solutions based on list(root).index(dmdSec), but without success, since I do not know how to specify the attribute and the namespace there.
Could someone help me with this?
I'm assuming that you are using the lxml.etree library for XML parsing. If not, you may have to modify things a bit, but the principle is the same.
Simply use:
from lxml import etree
root = etree.parse(r'path\to\your\file.xml')
# Count the target's preceding siblings and add 1 to get its 1-based position
int(root.xpath('count(//*[@ID="MODSMD_VOLUME_0001"]/preceding-sibling::*)+1'))
Output:
2
Note that the position is 2 and not 1: XPath counts from 1 (unlike Python, which counts from 0). Your target is the second child element of the root, right after <mets:metsHdr>.
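For the list(root).index(...) approach mentioned in the question, a minimal sketch could look like the following. It assumes Python's standard-library xml.etree.ElementTree rather than lxml, uses a trimmed stand-in for the sample document, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the sample document in the question.
XML = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:metsHdr/>
  <mets:dmdSec ID="MODSMD_VOLUME_0001"/>
  <mets:dmdSec ID="DCMD_VOLUME_0001"/>
</mets:mets>"""

# Map the prefix used in the path expression to its namespace URI.
NS = {"mets": "http://www.loc.gov/METS/"}

root = ET.fromstring(XML)

# Select the child by qualified name and attribute value.
target = root.find('mets:dmdSec[@ID="MODSMD_VOLUME_0001"]', NS)

# 0-based position among the root's children (Python convention).
zero_based = list(root).index(target)

# 1-based position, comparable to the XPath count shown above.
one_based = zero_based + 1

print(zero_based, one_based)
```

The namespace dictionary is what lets the prefix mets: be used in the path; the attribute test goes into the [@ID="..."] predicate.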
I need to remove a particular string from a tag using XSLT 1.0. The string is random and can appear anywhere. The only way to identify it is that it is followed by "tt", "Tt", "tt." or "Tt."
Can anybody help me with a code snippet that I can use to achieve this?
For example
<Page>9 tt., 407-415</Page>
Expected output (remove "9 tt."):
<Page>, 407-415</Page>
<Page>425 Tt (approx.)</Page>
Expected output (remove "425 Tt"):
<Page> (approx.)</Page>
<Page>055302, 8 tt.</Page>
Expected output (remove "8 tt."):
<Page>055302, </Page>
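To pin down the rule implied by the three examples, here is a small Python sketch. It is not an XSLT 1.0 solution (XSLT 1.0 has no regular expressions); it only states the intended transformation precisely. It assumes the unwanted string is always a run of digits, which holds for the examples but is not stated in the question, and the function name is hypothetical.

```python
import re

# Assumption: the string to remove is a run of digits, optional whitespace,
# then "tt" or "Tt" as a whole word, with an optional trailing period.
PAGE_COUNT = re.compile(r"\d+\s*[Tt]t\b\.?")


def strip_page_count(text: str) -> str:
    """Remove the first '<digits> tt' style token from the text."""
    return PAGE_COUNT.sub("", text, count=1)


for sample in ("9 tt., 407-415", "425 Tt (approx.)", "055302, 8 tt."):
    print(repr(strip_page_count(sample)))
```

Any XSLT implementation would have to reproduce the same three results with string functions such as substring-before and substring-after.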
I am trying to check whether a key of a map in FreeMarker matches a particular string. How can I do that?
<#if (list_map.id)!?matches(("abc"))>
you matched ...
</#if>
But the above doesn't work in FreeMarker. It says that matches expects a string. How can I convert list_map.id to a string? Is there any toString() method available in FreeMarker?
If id is a number, then you probably want list_map.id?c, which renders it as a string in computer format.
Good day everyone!
I have an element
<tbody class="cp-ads-list__table-item _sas-offers-table__item cp-ads-list__table-item_state-deposit" data-card_id="16676514">
I'd like to access it by the data-card_id attribute, but when I try the following
@browser.tbody(:data_card_id => "16676514").hover
I get an error
unable to locate element, using {:data_card_id=>"16676514", :tag_name=>"tbody"} (Watir::Exception::UnknownObjectException)
I guess my code would have worked if the attribute were "data-card-id", but it's "data-card_id".
How do I access my element by this attribute?
Problem
You are right that the problem is the underscore in the data attribute. As seen in the ElementLocator, when building the XPath expression, all underscores are converted to dashes (in the else part of the statement):
def lhs_for(key)
  case key
  when :text, 'text'
    'normalize-space()'
  when :href
    # TODO: change this behaviour?
    'normalize-space(@href)'
  when :type
    # type attributes can be upper case - downcase them
    # https://github.com/watir/watir-webdriver/issues/72
    XpathSupport.downcase('@type')
  else
    "@#{key.to_s.gsub("_", "-")}"
  end
end
Solution - One-Off
If this is the only data attribute that is using underscores (rather than dashes), I would probably manually build the XPath or CSS expression.
@browser.tbody(:css => '[data-card_id="16676514"]').hover
Solution - Monkey Patch
If using underscores is a standard on the website, I would probably consider monkey patching the lhs_for method. You could monkey patch the method so that you only change the first underscore for data attributes:
module Watir
  class ElementLocator
    def lhs_for(key)
      case key
      when :text, 'text'
        'normalize-space()'
      when :href
        # TODO: change this behaviour?
        'normalize-space(@href)'
      when :type
        # type attributes can be upper case - downcase them
        # https://github.com/watir/watir-webdriver/issues/72
        XpathSupport.downcase('@type')
      else
        if key.to_s.start_with?('data')
          # data attributes: convert only the first underscore to a dash
          "@#{key.to_s.sub("_", "-")}"
        else
          "@#{key.to_s.gsub("_", "-")}"
        end
      end
    end
  end
end
This would then allow your original code to work:
@browser.tbody(:data_card_id => "16676514").hover
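The difference between the stock and the patched behaviour comes down to replacing every underscore versus only the first one. The following Python sketch mirrors that string logic only; it is a stand-in for the Ruby else branch, not Watir code, and both function names are hypothetical.

```python
def attribute_default(key: str) -> str:
    # Counterpart of Ruby's gsub("_", "-"): every underscore becomes a dash.
    return "@" + key.replace("_", "-")


def attribute_patched(key: str) -> str:
    # Counterpart of the patched branch: for keys starting with "data",
    # only the first underscore becomes a dash (Ruby's sub).
    if key.startswith("data"):
        return "@" + key.replace("_", "-", 1)
    return "@" + key.replace("_", "-")


print(attribute_default("data_card_id"))
print(attribute_patched("data_card_id"))
```

With the stock logic the locator looks for data-card-id, which does not exist on the page; with the patched logic it looks for data-card_id.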
I'm trying to deserialize XML data into an object with C#. I have always done this using the .NET deserialize method, and that has worked well for most of what I have needed.
Now, though, I have XML that is created by SharePoint, and the attribute names of the data I need to deserialize contain encoded characters, namely:
space, º, ç, ã, : and a hyphen as
x0020, x00ba, x00e7, x00e3, x003a and x002d respectively
I'm trying to figure out what I have to put in the AttributeName parameter of the XmlAttribute attribute on my properties.
x0020 converts to a space well, so, for instance, I can use
[XmlAttribute(AttributeName = "ows_Nome Completo")]
to read
ows_Nome_x0020_Completo="MARIA..."
On the other hand, neither
[XmlAttribute(AttributeName = "ows_Motiva_x00e7__x00e3_o_x003a_")]
nor
[XmlAttribute(AttributeName = "ows_Motivação_x003a_")]
nor
[XmlAttribute(AttributeName = "ows_Motivação:")]
allow me to read
ows_Motiva_x00e7__x00e3_o_x003a_="text to read..."
With the first two I get no value returned, and the third gives me a runtime error for invalid characters (the colon).
Is there any way to get this working with the .NET deserializer, or do I have to build a specific deserializer for this?
Thanks!
What you are looking at (the "cryptic" data) is an escaped form of the original characters: each _xHHHH_ sequence stands for the character with that hexadecimal code. SharePoint uses it to keep attribute names and similar identifiers valid.
There are a few ways of dealing with this. The most elegant is to extract the list schema and match the elements against it; the schema contains all the metadata about your list data. A polished example of a schema can be seen below or here http://www.bendsoft.com/documentation/camelot-php-tools/1_5/packets/schema-and-content-packets/schemas/example-list-view-schema/
If you don't want to go down that path, you could start here http://msdn.microsoft.com/en-us/library/35577sxd.aspx
<Field Name="ContentType">
<ID>c042a256-787d-4a6f-8a8a-cf6ab767f12d</ID>
<DisplayName>Content Type</DisplayName>
<Type>Text</Type>
<Required>False</Required>
<ReadOnly>True</ReadOnly>
<PrimaryKey>False</PrimaryKey>
<Percentage>False</Percentage>
<RichText>False</RichText>
<VisibleInView>True</VisibleInView>
<AppendOnly>False</AppendOnly>
<FillInChoice>False</FillInChoice>
<HTMLEncode>False</HTMLEncode>
<Mult>False</Mult>
<Filterable>True</Filterable>
<Sortable>True</Sortable>
<Group>_Hidden</Group>
</Field>
<Field Name="Title">
<ID>fa564e0f-0c70-4ab9-b863-0177e6ddd247</ID>
<DisplayName>Title</DisplayName>
<Type>Text</Type>
<Required>True</Required>
<ReadOnly>False</ReadOnly>
<PrimaryKey>False</PrimaryKey>
<Percentage>False</Percentage>
<RichText>False</RichText>
<VisibleInView>True</VisibleInView>
<AppendOnly>False</AppendOnly>
<FillInChoice>False</FillInChoice>
<HTMLEncode>False</HTMLEncode>
<Mult>False</Mult>
<Filterable>True</Filterable>
<Sortable>True</Sortable>
</Field>
<Field>
...
</Field>
Well... I guess I kind of hacked a way around it, which works for now. I just stripped the _x***_ sequences and corrected the XmlAttribute names accordingly. The replacement is done by first loading the XML as a string, then replacing, then loading the "clean" text as XML.
But I would still like to know whether it is possible to use some XmlAttribute name for a more direct approach...
Try XmlConvert.EncodeName and XmlConvert.DecodeName from the System.Xml namespace.
I use a simple function to get the encoded column name:
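private string getNameCol(string colName) {
    // Keep at most the first 20 characters before encoding
    if (colName.Length > 20) colName = colName.Substring(0, 20);
    // Escape characters that are not valid in an XML name as _xHHHH_
    return System.Xml.XmlConvert.EncodeName(colName);
}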
I was also looking for a way to replace characters like á, é, í, ó, ú, because EncodeName doesn't convert these characters.
You can use Replace for them:
.Replace("ó","_x00f3_").Replace("á","_x00e1_")
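As an illustration of the _xHHHH_ convention discussed in this thread, here is a minimal Python sketch that decodes and encodes such names. It is not the XmlConvert API and does not claim to match SharePoint's exact rules: the choice to escape everything except ASCII letters, digits and underscores, the lowercase hex digits, and both function names are assumptions made for the example.

```python
import re

# Matches one escape such as _x00e7_ (four hexadecimal digits).
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_name(name: str) -> str:
    """Turn every _xHHHH_ escape back into the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


def encode_name(name: str) -> str:
    """Escape every character that is not an ASCII letter, digit or underscore."""
    def escape(ch: str) -> str:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            return ch
        return "_x%04x_" % ord(ch)

    return "".join(escape(ch) for ch in name)


print(decode_name("ows_Nome_x0020_Completo"))
print(decode_name("ows_Motiva_x00e7__x00e3_o_x003a_"))
print(encode_name("Motivação:"))
```

A general rule like this avoids maintaining one Replace call per accented character, at the cost of also escaping characters that EncodeName itself would leave alone.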