I have a book with bibliography like
<bibliography>
<biblioentry>
<abbrev>A</abbrev>
<title>This is the book title</title>
</biblioentry>
<!-- ... -->
</bibliography>
and can cite individual works with <citation>A</citation>, which will output something like [A] in the resulting HTML.
Now, the citation often includes precise location within the work (such as volume/chapter/page/paragraph number, or their range). I currently have this part in the following text (like <citation>A</citation><phrase>, XX–XXI</phrase>) so I get [A], XX-XXI in the resulting text, but the XX–XXI part is semantically not related to the citation.
How can I make either a citation with text affixed to the reference abbreviation (something like <citation>A<loc>XX–XXI</loc></citation> → [A, XX–XXI]), or citation with completely custom text (but hyperlink resolved to the bibliography entry)?
I've been browsing DocBook 5.2: The Definitive Guide and grepping through the xslTNG stylesheets and unit tests, and I'm still unsure what to do. Perhaps <link ...> with some xref pointing to the bibliography, or something similar? Pointers appreciated.
I've built many many many Advanced PDFs in the past couple of years. There is one thing that always sticks...
This applies mainly to SuiteScript rendered PDF templates.
The PDFs error if the user fields include & or -- or any other unescaped string literal. The default output_format is undefined.
I'm looking at FTL documentation and can set <#ftl output_format = "HTML" /> but no matter where I put this in the PDF template, it fails.
Is there a particular place I need to declare this in the template?
It's not feasible to globally replace "&" with "&amp;" everywhere etc...
Not sure that this answers the exact question you're asking, but I don't think it's the output format that's your problem here. My understanding is that the output format refers to what's generated by the template - ie: the final render. The output format, in any case, should be XML, as that's what's consumed by the BFO tag library when you're creating PDFs.
I think the issue is that your XML itself is not valid when string literals contain XML control characters of "&", "<" or ">". To avoid this, when building your templates and adding strings with SuiteScript, you can use the N/xml module's xml.escape() method to wrap anything that could contain one of those characters.
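To illustrate why escaping fixes the problem, here is a minimal sketch in Python rather than SuiteScript. It uses Python's standard `xml.sax.saxutils.escape` as a stand-in for what an escaping helper does; `build_fragment` and the sample value are hypothetical names made up for this example, and the `<p>` wrapper is only an assumption about what the template might contain.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def build_fragment(user_value: str) -> str:
    """Wrap a user-supplied string in a <p> element, escaping &, < and >."""
    return "<p>" + escape(user_value) + "</p>"


raw = "Smith & Sons <Wholesale>"  # hypothetical field value

# Escaped: the fragment is well-formed and round-trips to the original text.
fragment = build_fragment(raw)
parsed_text = ET.fromstring(fragment).text

# Unescaped: the bare "&" and "<" make the fragment ill-formed.
try:
    ET.fromstring("<p>" + raw + "</p>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False
```

The point is that the escaping happens on each value as it is inserted, so nothing has to be replaced globally in the template.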
Sorry if I'm off base with this, but hope it helps.
We all know there are two delimiters for query strings: ? and &. Why don't we use just ? for both cases? Why do we need &?
RFC 3986 describes the standard, but doesn't give us the motivation on that subject:
The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.
If you read the various URL specifications, you will see that they don't set out any syntax for the <query> component. Indeed, the client and server could agree on any syntax for the string, subject to the restrictions on allowed / reserved characters.
The ?<name>=<value>&<name>=<value> syntax that is most commonly used comes from the HTML specification. Look for the section of the HTML spec (pick any version) that specifies the "application/x-www-form-urlencoded" encoding scheme for form parameters.
Why does the HTML spec not use ? as the parameter separator? I think that is because the URL spec says that ? is reserved in the <query> part. (So if HTML used ? as a separator, it would need to be percent-encoded.)
Why is ? reserved in the <query> part? Well now we are getting into the history of http: hyperlinks before a unified URL specification existed. Basically, I don't know, but it could have been related to the way that early web servers or browsers parsed hyperlinks.
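The split described above, between the generic query component and the form-encoding convention layered on top of it, can be sketched with Python's standard `urllib.parse`. The URL is a made-up example; `urlsplit` only isolates the query string, while `parse_qsl` and `urlencode` apply the `application/x-www-form-urlencoded` rules.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

url = "https://example.com/search?q=a%26b&page=2#results"  # hypothetical URL

# The generic URL syntax only says: query = text between the first "?" and "#".
parts = urlsplit(url)
query = parts.query

# The name=value&name=value structure comes from the form-encoding convention.
pairs = parse_qsl(query)

# Encoding goes the other way: a literal "&" inside a value is percent-encoded
# so it cannot be mistaken for the pair separator.
encoded = urlencode([("q", "a&b"), ("page", "2")])
```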
Given an excel column containing filepaths, what excel formula returns only the file extension?
src\main\java\com\something\proj\UI.java --> java
src\main\java\com\something\proj\Server.scala --> scala
src\main\java\com\something\proj\include.h\someinclude.hpp --> hpp
Note 1: this formula works great for filepaths with only a single period, but not for case 3: =IF(A1="","",RIGHT(A1,LEN(A1)-FIND(".",A1)))
Note 2: I understand that these filepaths are Windows-specific, I don't need a cross-platform solution.
Related: triming extension from filename in Excel and How to extract file name from path?
With data in A1, use:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1,".",REPT(".",999)),999),".","")
From:
Jim Cone's old post
This will find everything after the last .:
=MID(A1,FIND("{{{",SUBSTITUTE(A1,".","{{{",LEN(A1)-LEN(SUBSTITUTE(A1,".",""))))+1,LEN(A1))
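For readers who find the nested formula hard to follow, here is a rough Python sketch of the same steps: count the dots, replace only the last one with a marker, then take everything after it. The function names are hypothetical, and returning an empty string when there is no dot is an assumption of this sketch (the formula itself would return an error in that case).

```python
def replace_nth(text: str, old: str, new: str, n: int) -> str:
    """Replace only the n-th (1-based) occurrence of old, like SUBSTITUTE's 4th argument."""
    pos = -1
    for _ in range(n):
        pos = text.find(old, pos + 1)
        if pos == -1:
            return text  # fewer than n occurrences: leave the text unchanged
    return text[:pos] + new + text[pos + len(old):]


def extension_after_last_dot(path: str, marker: str = "{{{") -> str:
    # LEN(A1)-LEN(SUBSTITUTE(A1,".","")) -> number of dots in the path
    dot_count = len(path) - len(path.replace(".", ""))
    if dot_count == 0:
        return ""  # assumption: no dot means no extension
    # SUBSTITUTE(A1,".","{{{",dot_count) -> mark the last dot only
    marked = replace_nth(path, ".", marker, dot_count)
    # The marker sits where the last dot was; take what follows it in the original.
    return path[marked.find(marker) + 1:]


ext_java = extension_after_last_dot(r"src\main\java\com\something\proj\UI.java")
ext_hpp = extension_after_last_dot(r"src\main\java\com\something\proj\include.h\someinclude.hpp")
```

Like the formula, this assumes the marker text does not already occur in the path.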
Here's a nice long answer. :-)
=SUBSTITUTE(A1,LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,".",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,".",""))))),"")
A neat trick I sometimes use for string parsing in general is to leverage the FilterXML() function (Excel 2013 and later). The basic strategy is to use Substitute() to format your string in a way that it is parsed across elements in an xml string, and then you can use xpath syntax to conveniently navigate your parsed elements. Using this strategy, getting an extension would look like this...
=FILTERXML("<A><p>" & SUBSTITUTE(A1,".","</p><p>.")&"</p></A>","//p[last()]")
If you're not familiar with xml, this can seem intimidating, but if you can grasp what's going on, I find it to be cleaner, more flexible, and easier to remember than the alternative approaches using len(), substitute(), etc. One reason why it's nicer is because there's only one cell reference.
Illegal Characters
There are two characters that are allowed in paths but not in xml: & and '
The equation above will work if these characters are not present, otherwise, they will need to be handled something like this...
=FILTERXML("<A><p>" & SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"'",""),"&",""),".","</p><p>.")&"</p></A>","//p[last()]")
Example
Suppose we have a nasty file path like this:
C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two.dots.LongExt
1.) The SUBSTITUTE() portion will convert it to an xml string like this...
<A>
<p>
C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two
</p>
<p>
.dots
</p>
<p>
.LongExt
</p>
</A>
2.) Once formatted like this, it's trivial to pick out the last p element using the xpath syntax //p[last()].
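The same two steps can be sketched outside Excel in Python, using the standard `xml.etree.ElementTree` module. The function name is hypothetical, and note one deliberate difference: instead of deleting `&` from the path as the formula above does, this sketch escapes it with `xml.sax.saxutils.escape`, so that is a swapped-in technique rather than a translation of the formula.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def last_dot_segment(path: str) -> str:
    """Split the path on '.' into <p> elements and return the last one."""
    # Step 1: build the xml string, one <p> per dot-separated piece.
    xml_text = "<A><p>" + escape(path).replace(".", "</p><p>.") + "</p></A>"
    # Step 2: pick the last <p> child with an XPath-style predicate.
    return ET.fromstring(xml_text).find("p[last()]").text


ugly = r"C:\Folder1\Folder2\(ugly characters !##$%^()_+={};;,`)\two.dots.LongExt"
result = last_dot_segment(ugly)
```

As with the formula, the returned piece keeps its leading dot.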
I am trying to do document classification with GATE. For that I need to annotate the entire document with one type of annotation. Can anyone please tell me how to do that?
Usually I use XML for that purpose. Something like:
<document class="class-1">
The text of your document 1 is here..
</document>
<document class="class-2">
The text of your document 2 is here..
</document>
Then save this XML as separate files (or as one document).
In your GATE application you can use the Annotation Set Transfer PR to move the annotation from "Original markups" to the default annotation set. This is one of the options. Other options depend on the data format you have.
If your source documents are HTML or XML then there will already be an annotation in the Original markups set that spans all the content, otherwise the simplest option would be to load the Groovy plugin and use the scripting PR with a one-line script like
outputAS.add(doc.start(), doc.end(), "Document", Utils.featureMap())