In case I want to change the text or add an element in XML files, I can just directly convert the file to a string, replace or add elements as a string, then convert it back to XML.
In which use cases is that approach bad? Why do we need to manipulate XML using libraries such as XML DOM or XPath?
The disadvantage of manipulating XML via string operators is that achieving a parsing-dependent goal for even one particular XML document is already harder than using a proven XML parser. Achieving the goal for equivalent XML document variations will be nearly impossible, especially for anyone naive enough to be considering such an approach in the first place.
Not convinced?
Scan the table of contents of the Extensible Markup Language (XML) 1.0 (Fifth Edition), W3C Recommendation 26 November 2008. If you do not understand everything, your hand-written, poor imitation of an XML parser will fail, if not on your first test case, then on future variations that you're obligated to handle if you wish to claim your code works with XML. To mention just a few challenges, your program should:
Report if its input XML is not well-formed.
Handle character and entity references.
Handle comments and CDATA sections.
Tempted to parse XML via string operators, including regex? Don't do it.
Use a real XML parser.
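To make the point concrete, here is a minimal Python sketch (the sample documents and the helper names read_to, naive_read_to, and is_well_formed are made up for illustration). It reads the same value from three equivalent XML documents, once with the standard library parser and once with naive string slicing:

```python
import xml.etree.ElementTree as ET

# Three hypothetical documents that all carry the same <to> value.
doc_a = "<note><to>Tom &amp; Ann</to></note>"                       # entity reference
doc_b = "<note><to><![CDATA[Tom & Ann]]></to></note>"               # CDATA section
doc_c = "<note><!-- <to>old</to> --><to>Tom &#38; Ann</to></note>"  # comment + character reference

def read_to(xml_text):
    """Read <to> with a real parser: references, CDATA, and comments are handled."""
    return ET.fromstring(xml_text).find("to").text

def naive_read_to(xml_text):
    """Read <to> by string slicing: blind to references, CDATA, and comments."""
    start = xml_text.find("<to>") + len("<to>")
    end = xml_text.find("</to>", start)
    return xml_text[start:end]

def is_well_formed(xml_text):
    """A real parser also reports input that is not well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

for doc in (doc_a, doc_b, doc_c):
    print(repr(read_to(doc)), "vs", repr(naive_read_to(doc)))
```

The parser returns the same text for all three variations, while the string version returns raw markup for the first two and the commented-out value for the third.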
I am writing an HTML to Markdown converter in Rust, using Kuchiki to get access to the parsed tree from html5ever.
For unknown HTML tags, I want to provide the possibility to ignore them and pass them through to the output string, while still processing their children as normal. For that, I need the textual representation of the tag without its contents, but I can't figure out how best to do that.
The best I can come up with is:
Clone the node
Drop its children
Call node.to_string
"parse" the string with a regular expression to separate the opening and closing tags.
I feel there must be a better way. I don't think Kuchiki provides this functionality out of the box, but I also don't know how to get access to the html5ever API through Kuchiki, and I also don't get from the html5ever API documentation whether they would provide some functionality like this.
Given an excel column containing filepaths, what excel formula returns only the file extension?
src\main\java\com\something\proj\UI.java --> java
src\main\java\com\something\proj\Server.scala --> scala
src\main\java\com\something\proj\include.h\someinclude.hpp --> hpp
Note 1: this formula works great for filepaths with only a single period, but not for case 3: =IF(A1="","",RIGHT(A1,LEN(A1)-FIND(".",A1)))
Note 2: I understand that these filepaths are Windows-specific, I don't need a cross-platform solution.
Related: triming extension from filename in Excel and How to extract file name from path?
With data in A1, use:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1,".",REPT(".",999)),999),".","")
From:
Jim Cone's old post
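For anyone wondering why the padding trick works, here is a rough Python sketch of the same idea (the function name extension_by_padding is made up for illustration; it is not part of the original post). Every period is blown up into 999 periods, so the rightmost 999 characters can only contain periods plus whatever followed the last one:

```python
def extension_by_padding(path, pad=999):
    # SUBSTITUTE(A1, ".", REPT(".", 999)): replace every period with 999 periods
    padded = path.replace(".", "." * pad)
    # RIGHT(..., 999): keep only the last 999 characters
    tail = padded[-pad:]
    # SUBSTITUTE(..., ".", ""): strip the padding periods, leaving the extension
    return tail.replace(".", "")

print(extension_by_padding(r"src\main\java\com\something\proj\include.h\someinclude.hpp"))
```

Two things to keep in mind: a path with no period at all comes back unchanged, and the approach assumes the extension is shorter than 999 characters.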
This will find everything after the last .:
=MID(A1,FIND("{{{",SUBSTITUTE(A1,".","{{{",LEN(A1)-LEN(SUBSTITUTE(A1,".",""))))+1,LEN(A1))
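The logic of that formula can be sketched in Python roughly as follows (the function name extension_after_last_dot is hypothetical, and returning an empty string when there is no period is a choice made for this sketch, not something the formula does):

```python
def extension_after_last_dot(path, marker="{{{"):
    # LEN(A1) - LEN(SUBSTITUTE(A1, ".", "")): count the periods
    dot_count = path.count(".")
    if dot_count == 0:
        return ""  # sketch-only choice for paths without a period
    # SUBSTITUTE(A1, ".", "{{{", dot_count): replace only the last period
    head, _, tail = path.rpartition(".")
    marked = head + marker + tail
    # FIND("{{{", ...): the marker sits where the last period was
    pos = marked.find(marker)
    # MID(A1, pos + 1, LEN(A1)): everything after that position in the original
    return path[pos + 1:]

print(extension_after_last_dot(r"src\main\java\com\something\proj\include.h\someinclude.hpp"))
```

The marker only needs to be a string that does not already occur in the path before the last period.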
Here's a nice long answer. :-)
=SUBSTITUTE(A1,LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,".",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,".",""))))),"")
A neat trick I sometimes use for string parsing in general is to leverage the FILTERXML() function (Excel 2013 and later). The basic strategy is to use SUBSTITUTE() to format your string so that it is split across elements in an XML string, and then you can use XPath syntax to conveniently navigate the parsed elements. Using this strategy, getting an extension would look like this...
=FILTERXML("<A><p>" & SUBSTITUTE(A1,".","</p><p>.")&"</p></A>","//p[last()]")
If you're not familiar with XML, this can seem intimidating, but if you can grasp what's going on, I find it to be cleaner, more flexible, and easier to remember than the alternative approaches using LEN(), SUBSTITUTE(), etc. One reason it's nicer is that there's only one cell reference.
Illegal Characters
There are two characters that are allowed in paths but not in XML: & and '
The formula above will work if these characters are not present. Otherwise, they will need to be handled something like this...
=FILTERXML("<A><p>" & SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"'",""),"&",""),".","</p><p>.")&"</p></A>","//p[last()]")
Example
Suppose we have a nasty file path like this:
C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two.dots.LongExt
1.) The SUBSTITUTE() portion will convert it to an XML string like this...
<A>
<p>
C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two
</p>
<p>
.dots
</p>
<p>
.LongExt
</p>
</A>
2.) Once formatted like this, it's trivial to pick out the last p element using the XPath syntax //p[last()].
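If it helps to see the two steps outside Excel, here is a rough Python equivalent using the standard library (the function name extension_via_xml is made up, and escaping with xml.sax.saxutils.escape is this sketch's own way of dealing with &, not what the formula does):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def extension_via_xml(path):
    # Step 1: wrap the path in <A><p>...</p></A>, starting a new <p> at every period.
    # escape() turns & into &amp; so the result stays well-formed XML.
    xml_text = "<A><p>" + escape(path).replace(".", "</p><p>.") + "</p></A>"
    root = ET.fromstring(xml_text)
    # Step 2: pick the last <p> child; ElementTree supports the last() predicate.
    return root.find("./p[last()]").text

print(extension_via_xml(r"C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two.dots.LongExt"))
```

As with the formula, the result keeps its leading period (.LongExt for the example path).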
I am trying to replace the strings in my template.
To do this, I've done the following:
section = helper.Section("%course_name%", "Tekst");
mail.addSection(section);
section = helper.Section("%user%", "Textforasubstitutiontagofsection2");
mail.addSection(section);
However, when I receive the mail, the strings are not replaced and appear as in the picture above.
Can anyone tell me what I'm doing wrong?
You want to be using substitutions rather than sections in this case. Section tags are meant to encapsulate groups of substitution tags. See the docs for more info.
I am working with an SSRS report, and I'm trying to build links to SharePoint list items through an expression on the Action of a Placeholder. The problem that manifests whenever I put any query string values into the link is that SSRS is duplicating them. When SharePoint receives this URL, this causes the New Item page to come up instead of displaying the list item.
Here is the expression where I'm building the link:
="http://home.oursharepointsite.net" & Left(First(Fields!Url.Value,"List"),InStrRev(First(Fields!Url.Value, "List"),"/")) & "DispForm.aspx?ID=" & Fields!ListItemId.Value
And here is the resulting link:
http://home.oursharepointsite.net/communities/home/Sites/CORPFI/Wiki1/Forms/DispForm.aspx?ID=395&ID=395
Another developer I work with isn't using the Action of a Placeholder but rather just building <a> tags and gets the same behavior. In his case it doesn't affect the rendering of what he's linking to though.
Does anyone know a way to solve this conundrum?
I recently had this problem and was able to solve it by making the link relative instead of absolute.
Try dropping the "http://home.oursharepointsite.net" and replacing it with "/".