How to add an ampersand in KML? - kml

What is the proper format for including an ampersand in KML? I am using them in the name tag. If I include a regular '&' then it is invalid.
What other characters do I need to properly encode?
I'm using this format:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
I'm looking for a PHP solution. I'm creating the KML in PHP.

KML is an XML file so simply follow the XML rules to encode special characters.
In XML you can encode "&" and other special characters with "predefined entities" that represent the associated special characters or using a CDATA section.
1. Predefined entities
The XML specification defines five "predefined entities" representing special characters and requires that all XML processors honor them. When element content is parsed, each entity name is replaced with the character it represents.
name | character
----- | ------
&lt; | <
&gt; | >
&amp; | &
&quot; | "
&apos; | '
Example:
<description>
<a href="http://server.com/link">A &amp; B</a>
</description>
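The same substitution can be sketched outside PHP. The snippet below is only an illustration in Python, not the PHP solution the question asks for; it relies on the standard library's xml.sax.saxutils.escape, and the helper name kml_name is made up for the example.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quote characters are
# passed in as extra replacements so all five predefined entities are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}

def kml_name(text):
    """Return a <name> element with the five special characters escaped."""
    return "<name>" + escape(text, QUOTES) + "</name>"

print(kml_name("A & B"))  # <name>A &amp; B</name>
```

Escaping must happen exactly once, on the raw text, just before it is placed into the document; escaping an already escaped string would turn &amp; into &amp;amp;.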
2. CDATA
Another mechanism to escape markup inside XML elements is CDATA. A CDATA section is a part of element content that the parser is told to treat as character data only, not markup.
Example:
<description>
<![CDATA[
A & B
]]>
</description>
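One caveat with CDATA is that the text itself must not contain the terminator ]]>. A minimal Python sketch of a wrapper that copes with this, assuming a hypothetical helper named cdata and the standard xml.etree.ElementTree parser to check the result:

```python
import xml.etree.ElementTree as ET

def cdata(text):
    """Wrap text in a CDATA section, splitting any "]]>" it contains."""
    # "]]>" would end the section early, so it is split across two sections.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

doc = "<description>" + cdata("A & B <b>bold</b>") + "</description>"

# The parser hands back the raw characters, ampersand and tags included.
print(ET.fromstring(doc).text)  # A & B <b>bold</b>
```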

To escape special characters like an ampersand in markup languages such as XML, HTML/XHTML, or derived formats like KML, use entities.
In your case, & becomes &amp;
You can find more information about entities on this page: Character_references

I used the PHP htmlentities() function with the ENT_XML1 flag:
htmlentities($sString, ENT_XML1);

Related

DocBook 5.x: bibliography citation with extra text, or completely custom text?

I have a book with bibliography like
<bibliography>
<biblioentry>
<abbrev>A</abbrev>
<title>This is the book title</title>
</biblioentry>
<!-- ... -->
</bibliography>
and can cite individual works with <citation>A</citation>, which will output something like [A] in the resulting HTML.
Now, the citation often includes precise location within the work (such as volume/chapter/page/paragraph number, or their range). I currently have this part in the following text (like <citation>A</citation><phrase>, XX–XXI</phrase>) so I get [A], XX-XXI in the resulting text, but the XX–XXI part is semantically not related to the citation.
How can I make either a citation with text affixed to the reference abbreviation (something like <citation>A<loc>XX–XXI</loc></citation> → [A, XX–XXI]), or citation with completely custom text (but hyperlink resolved to the bibliography entry)?
I've been browsing DocBook 5.2: The Definitive Guide and grepping through the xslTNG stylesheets and unit tests, and I'm still unsure what to do. Perhaps <link ...> with some xref pointing to the bibliography, or something similar? Pointers appreciated.

NetSuite Advanced PDF - How to set <#ftl output_format = "HTML" />

I've built many many many Advanced PDFs in the past couple of years. There is one thing that always sticks...
This applies mainly to SuiteScript rendered PDF templates.
The PDFs error if the user fields include & or -- or any other unescaped string literal. The default output_format is undefined.
I'm looking at FTL documentation and can set <#ftl output_format = "HTML" /> but no matter where I put this in the PDF template, it fails.
Is there a particular place I need to declare this in the template?
It's not feasible to globally replace "&" with "&amp;" everywhere etc...
Not sure that this answers the exact question you're asking, but I don't think it's the output format that's your problem here. My understanding is that the output format refers to what's generated by the template - ie: the final render. The output format, in any case, should be XML, as that's what's consumed by the BFO tag library when you're creating PDFs.
I think the issue is that your XML itself is not valid when string literals contain XML control characters of "&", "<" or ">". To avoid this, when building your templates and adding strings with SuiteScript, you can use the N/xml module's xml.escape() method to wrap anything that could contain one of those characters.
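To illustrate the point about invalid XML, here is a small Python sketch. It is not SuiteScript and does not use the N/xml API: Python's xml.sax.saxutils.escape stands in for xml.escape(), and the field value and the is_well_formed helper are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

company = "Smith & Sons"  # hypothetical user-entered field value

raw = "<p>" + company + "</p>"              # bare "&" breaks the parser
escaped = "<p>" + escape(company) + "</p>"  # "&" becomes "&amp;"

print(is_well_formed(raw))      # False
print(is_well_formed(escaped))  # True
```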
Sorry if I'm off base with this, but hope it helps.

Why do we have two delimiters for query parameters?

We all know there are two delimiters for query strings: ? and &. Why wouldn't we use just ? for both cases? Why do we need &?
RFC 3986 describes the standard, but doesn't give the motivation for this:
The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.
If you read the various URL specifications, you will see that it doesn't set out any syntax for the <query> component. Indeed, the client and server could agree on any syntax for the string subject to the restrictions on allowed / reserved characters.
The ?<name>=<value>&<name>=<value> syntax that is most commonly used comes from the HTML specification. Look for the section of the HTML spec (pick any version) that specifies the "application/x-www-form-urlencoded" encoding scheme for form parameters.
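The two layers can be seen separately in code. This is a sketch using Python's standard urllib.parse module with a made-up URL: urlsplit applies only the generic URI rule (the query runs from the first "?" to the "#"), while parse_qsl applies the form-urlencoded convention of "&" and "=" on top of it.

```python
from urllib.parse import urlsplit, parse_qsl

url = "http://example.com/search?q=kml&lang=en#results"  # made-up URL

parts = urlsplit(url)
# Generic URI syntax: the query is an opaque string at this level.
print(parts.query)  # q=kml&lang=en

# Form-urlencoded convention: split on "&", then on "=".
print(parse_qsl(parts.query))  # [('q', 'kml'), ('lang', 'en')]
```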
Why does the HTML spec not use ? as the parameter separator? I think that is because the URL spec says that ? is reserved in the <query> part. (So if HTML used ? as a separator, it would need to be percent-encoded.)
Why is ? reserved in the <query> part? Well now we are getting into the history of http: hyperlinks before a unified URL specification existed. Basically, I don't know, but it could have been related to the way that early web servers or browsers parsed hyperlinks.

Excel VBA formula: extract file extension from filepath?

Given an excel column containing filepaths, what excel formula returns only the file extension?
src\main\java\com\something\proj\UI.java --> java
src\main\java\com\something\proj\Server.scala --> scala
src\main\java\com\something\proj\include.h\someinclude.hpp --> hpp
Note 1: this formula works great for filepaths with only a single period, but not for case 3: =IF(A1="","",RIGHT(A1,LEN(A1)-FIND(".",A1)))
Note 2: I understand that these filepaths are Windows-specific, I don't need a cross-platform solution.
Related: triming extension from filename in Excel and How to extract file name from path?
With data in A1, use:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1,".",REPT(".",999)),999),".","")
From:
Jim Cone's old post
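It may not be obvious why the padding formula works, so here is the same idea spelled out step by step in Python. This is only a sketch for explanation; the function name extension_by_padding is made up, and the approach assumes the extension is shorter than 999 characters.

```python
def extension_by_padding(path, width=999):
    """Mimic SUBSTITUTE(RIGHT(SUBSTITUTE(A1,".",REPT(".",999)),999),".","")."""
    padded = path.replace(".", "." * width)  # every dot becomes 999 dots
    tail = padded[-width:]                   # RIGHT(..., 999): padding dots plus the extension
    return tail.replace(".", "")             # drop the padding dots

print(extension_by_padding(r"src\main\include.h\someinclude.hpp"))  # hpp
```

If the path contains no dot at all, nothing is padded and the whole path comes back unchanged.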
This will find everything after the last .:
=MID(A1,FIND("{{{",SUBSTITUTE(A1,".","{{{",LEN(A1)-LEN(SUBSTITUTE(A1,".",""))))+1,LEN(A1))
Here's a nice long answer. :-)
=SUBSTITUTE(A1,LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,".",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,".",""))))),"")
A neat trick I sometimes use for string parsing in general is to leverage the FilterXML() function (Excel 2013 and later). The basic strategy is to use Substitute() to format your string in a way that it is parsed across elements in an xml string, and then you can use xpath syntax to conveniently navigate your parsed elements. Using this strategy, getting an extension would look like this...
=FILTERXML("<A><p>" & SUBSTITUTE(A1,".","</p><p>.")&"</p></A>","//p[last()]")
If you're not familiar with xml, this can seem intimidating, but if you can grasp what's going on, I find it to be cleaner, more flexible, and easier to remember than the alternative approaches using len(), substitute(), etc. One reason why it's nicer is because there's only one cell reference.
Illegal Characters
There are two characters that are allowed in paths but not in xml: & and '
The equation above will work if these characters are not present, otherwise, they will need to be handled something like this...
=FILTERXML("<A><p>" & SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"'",""),"&",""),".","</p><p>.")&"</p></A>","//p[last()]")
Example
Suppose we have a nasty file path like this:
C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two.dots.LongExt
1.) The SUBSTITUTE() portion will convert it to an xml string like this...
<A>
<p>
C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two
</p>
<p>
.dots
</p>
<p>
.LongExt
</p>
</A>
2.) Once formatted like this, it's trivial to pick out the last p element using the xpath syntax //p[last()].
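For readers who want to see the two steps outside Excel, the same strategy can be sketched in Python with the standard xml.etree.ElementTree module. The function name last_segment and the sample path are made up; unlike the formula above, this version escapes "&" rather than deleting it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def last_segment(path):
    """Split path on "." into <p> elements and return the last one."""
    # Step 1: escape XML-special characters, then start a new <p> at each dot.
    body = escape(path).replace(".", "</p><p>.")
    root = ET.fromstring("<A><p>" + body + "</p></A>")
    # Step 2: pick the last <p> child, as //p[last()] does in the formula.
    return root.find("p[last()]").text

print(last_segment(r"C:\Folder1\R&D\two.dots.LongExt"))  # .LongExt
```

As with the formula, the result keeps its leading dot.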

Gate- add annotation to entire document

I am trying to do document classification with GATE. For that I need to annotate the entire document with one type of annotation. Can anyone please tell me how to do that?
Usually I use XML for that purpose. Something like:
<document class="class-1">
The text of your document 1 is here..
</document>
<document class="class-2">
The text of your document 2 is here..
</document>
Then save this XML as separate files (or as one document).
In a GATE application you can use the Annotation Set Transfer PR to move the annotation from "Original markups" to the default annotation set. This is one of the options. Other options depend on the data format you have.
If your source documents are HTML or XML then there will already be an annotation in the Original markups set that spans all the content, otherwise the simplest option would be to load the Groovy plugin and use the scripting PR with a one-line script like
outputAS.add(doc.start(), doc.end(), "Document", Utils.featureMap())
