A field in docx is represented this way.
<w:r>
<w:fldChar w:fldCharType="begin"/>
</w:r>
AAA
<w:r>
<w:instrText xml:space="preserve"> NOTEREF _Ref111111 \h </w:instrText>
</w:r>
BBB
<w:r>
<w:fldChar w:fldCharType="separate"/>
</w:r>
CONTENT
<w:r>
<w:fldChar w:fldCharType="end"/>
</w:r>
The field content goes in the CONTENT placeholder. My question is: can anything go in AAA or BBB, or are they always empty? I suspect the creators of this format had a reason for using four separator elements instead of just two, but I haven't seen any examples of this.
It's better to think of it as only three separator elements and two slots for content, which can be complex thanks to the separators.
<w:r><w:fldChar w:fldCharType="begin"/></w:r>
LABEL
<w:r><w:fldChar w:fldCharType="separate"/></w:r>
VALUE
<w:r><w:fldChar w:fldCharType="end"/></w:r>
So your AAA and BBB are just extra content for the LABEL.
There's an example in the spec, where LABEL is:
<w:r><w:rPr><w:b/><w:color w:val="ED1C24"/><w:u w:val="single"/></w:rPr>
<w:instrText>D</w:instrText></w:r>
<w:r><w:instrText xml:space="preserve">ATE</w:instrText></w:r>
to make the D in DATE a different style.
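To make the LABEL/VALUE split concrete, here is a minimal Python sketch that walks the runs of a paragraph and collects the instruction text (everything between begin and separate) and the result text (everything between separate and end). The sample paragraph and the function name split_field are made up for illustration, and nested fields are not handled.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical paragraph: a DATE field whose instruction is split over two runs.
SAMPLE = f"""<w:p xmlns:w="{W}">
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:rPr><w:b/></w:rPr><w:instrText>D</w:instrText></w:r>
  <w:r><w:instrText xml:space="preserve">ATE</w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:t>CONTENT</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>"""


def split_field(paragraph):
    """Return (instruction, result) text of the first field in a paragraph."""
    instruction, result = [], []
    state = None  # None: outside the field, "label": before separate, "value": after
    for run in paragraph.iter(f"{{{W}}}r"):
        fld_char = run.find(f"{{{W}}}fldChar")
        if fld_char is not None:
            kind = fld_char.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                state = "label"
            elif kind == "separate":
                state = "value"
            elif kind == "end":
                break
            continue
        if state == "label":
            # Any number of runs may contribute pieces of the instruction.
            instruction += [t.text or "" for t in run.iter(f"{{{W}}}instrText")]
        elif state == "value":
            result += [t.text or "" for t in run.iter(f"{{{W}}}t")]
    return "".join(instruction), "".join(result)


instruction, result = split_field(ET.fromstring(SAMPLE))
print(repr(instruction), repr(result))
```

The two instrText runs are joined back into a single instruction, which is why the position of the run boundaries (your AAA and BBB) does not change the meaning of the field.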
I have a list of values (column 1) that contains multiple parent and child levels, and I need a second column that gives the parent of each cell.
Any ideas on how to do that easily in Excel?
Many thanks!
COLUMN-1         COLUMN-2
A
A.01             A
A.01.01          A.01
A.01.01.01       A.01.01
A.01.01.01.01    A.01.01.01
A.01.01.01.02    A.01.01.01
A.01.01.01.03    A.01.01.01
A.01.01.01.04    A.01.01.01
A.01.01.02       A.01.01
A.01.01.02.01    A.01.01.02
A.01.01.02.02    A.01.01.02
A.01.01.02.03    A.01.01.02
A.01.01.02.04    A.01.01.02
A.01.01.03       A.01.01
A.01.01.03.01    A.01.01.03
A.01.01.03.02    A.01.01.03
A.01.01.03.03    A.01.01.03
A.01.01.03.04    A.01.01.03
FINAL GOAL
Jos Woolley's approach looks likely.
For how to do it, see "How can I perform a reverse string search in Excel without using VBA?" (adapting it to search for . instead of spaces).
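The underlying logic is just "keep everything before the last dot". As a sketch of that rule only (in Python rather than as an Excel formula, with a hypothetical helper name parent_of), it looks like this:

```python
def parent_of(code, sep="."):
    """Return everything before the last separator, or "" for a top-level code."""
    # rpartition splits on the LAST occurrence; the head is "" if sep is absent.
    return code.rpartition(sep)[0]


codes = ["A", "A.01", "A.01.01", "A.01.01.01", "A.01.01.01.01", "A.01.01.02"]
parents = {code: parent_of(code) for code in codes}
for code, parent in parents.items():
    print(f"{code:<16}{parent}")
```

An Excel formula built on the reverse-search trick has to do the same two things: locate the position of the last "." and take the text to its left, returning an empty cell when there is no "." at all.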
I have a custom field with some HTML code in it:
<h1>A H1 Heading</h1>
<h2>A H2 Heading</h2>
<b>Rich Text</b><br>
fsdfafsdaf df fsda f asdfa f asdfsa fa sfd<br>
<ol><li>numbered list</li><li>fgdsfsd f sa</li></ol>Another List<br>
<ul><li>bulleted</li></ul>
I also have another non-stored field where I want to display the plain text version of the above using REGEXP_REPLACE, while preserving the carriage returns/line breaks, maybe even converting <br> and <br/> to \r\n
However the patterns etc... seem to be different in NetSuite fields compared to using ?replace(...) in freemarker... and I'm terrible with remembering regexp patterns :)
Assuming the HTML text is stored in custitem_htmltext, what expression could I use as the default value of the NetSuite Text Area custom field to display the HTML code above as:
A H1 Heading
A H2 Heading
Rich Text
fsdfafsdaf df fsda f asdfa f asdfsa fa sfd
etc...
I understand the bulleted or numbered lists will look crap.
My current non-working formula is:
REGEXP_REPLACE({custitem_htmltext},'<[^<>]*>','')
I've also tried:
REGEXP_REPLACE({custitem_htmltext},'<[^>]+>','') - didn't work
When you use a Text Area type of custom field and input HTML, NetSuite seems to change the control characters ('<' and '>') to HTML entities ('&lt;' and '&gt;'). You can see this if you input the HTML and then change the field type to Long Text.
If you change both fields to Long Text, and re-input the data and formula, the REGEXP_REPLACE() should work as expected.
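To illustrate why the pattern finds nothing when the stored value is entity-encoded, here is a small Python sketch. It is not NetSuite formula syntax; the stored string and the helper to_plain_text are made up, and it assumes the field really holds &lt; and &gt; instead of literal angle brackets.

```python
import html
import re

# Hypothetical stored value: what the field would hold if the markup were entity-encoded.
stored = "&lt;h1&gt;A H1 Heading&lt;/h1&gt;&lt;b&gt;Rich Text&lt;/b&gt;&lt;br&gt;more text&lt;br/&gt;"

TAG = re.compile(r"<[^<>]*>")                   # same pattern as in the question
BREAK = re.compile(r"<br\s*/?>", re.IGNORECASE)  # matches <br>, <br/> and <br />

# Applied to the encoded text, the tag pattern matches nothing.
untouched = TAG.sub("", stored)


def to_plain_text(encoded):
    """Decode entities, turn <br> variants into CRLF, then strip remaining tags."""
    decoded = html.unescape(encoded)
    with_breaks = BREAK.sub("\r\n", decoded)
    return TAG.sub("", with_breaks)


plain = to_plain_text(stored)
print(untouched == stored)
print(repr(plain))
```

The order matters: the line-break replacement has to run before the generic tag strip, otherwise the <br> tags are removed along with everything else.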
From what I have learned recently, NetSuite HTML-encodes data by default, so < becomes &lt; and > becomes &gt;.
Try using triple handlebars e.g. {{{custitem_htmltext}}}
https://docs.celigo.com/hc/en-us/articles/360038856752-Handlebars-syntax
This should stop the default behaviour and allow you to use in a formula/saved search.
I'm trying to obtain the text in only the title#lang=en-US elements in an XML file.
This code obtains all the title text for all languages.
entries = root.xpath('//prefix:new-item', namespaces={'prefix': 'http://mynamespace'})
for entry in entries:
    all_titles = entry.xpath('./prefix:title', namespaces={'prefix': 'http://mynamespace'})
    for title in all_titles:
        print(title.text)
I tried this code to get the title#lang=en-US text, but it does not work.
all_titles = entry.xpath('./prefix:title', namespaces={'prefix': 'http://mynamespace'})
for title in all_titles:
    test = title.xpath("#lang='en-US'")
    print(test)
How do I obtain the text for only the English-language items?
The expression
//prefix:title[lang('en')]
will select all the English-language titles. Specifically:
title elements that have an xml:lang attribute identifying the title as English, for example <title xml:lang="en-US"> or <title xml:lang="en-GB">
title elements within some container that identifies all the contents as English, for example <section xml:lang="en-US"><title/></section>.
If you specifically want only US English titles, excluding other forms of English, then you can use the predicate [lang('en-US')].
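With lxml, the expression can be passed straight to xpath() together with the namespaces mapping from the question. To show what lang() actually matches without depending on lxml, the sketch below emulates its semantics with the standard library (ElementTree's limited XPath has no lang() function). The sample document and both helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://mynamespace"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical input: explicit xml:lang on titles, plus one inherited from a container.
SAMPLE = f"""<root xmlns="{NS}">
  <new-item>
    <title xml:lang="en-US">Hello</title>
    <title xml:lang="en-GB">Hullo</title>
    <title xml:lang="fr-FR">Bonjour</title>
  </new-item>
  <new-item xml:lang="en-US">
    <title>Inherited</title>
  </new-item>
</root>"""


def matches_lang(actual, wanted):
    """Mimic XPath lang(): case-insensitive match on the tag or on a '-' subtag prefix."""
    if actual is None:
        return False
    actual, wanted = actual.lower(), wanted.lower()
    return actual == wanted or actual.startswith(wanted + "-")


def titles_in_lang(element, wanted, inherited=None):
    """Yield title text whose effective xml:lang matches, walking down the tree."""
    # An element without its own xml:lang inherits the nearest ancestor's value.
    current = element.get(XML_LANG, inherited)
    if element.tag == f"{{{NS}}}title" and matches_lang(current, wanted):
        yield element.text
    for child in element:
        yield from titles_in_lang(child, wanted, current)


root = ET.fromstring(SAMPLE)
english = list(titles_in_lang(root, "en"))
us_only = list(titles_in_lang(root, "en-US"))
print(english)
print(us_only)
```

Note that the attribute has to be xml:lang, in the XML namespace, for lang() to see it. A plain lang attribute would need an ordinary attribute predicate instead.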
I have created a *.docx file with a 2x2 table, each cell containing the text Cell x-y where x=row number and y=column number.
When I pass this document through a simple transformation process, docx4j's Differencer.diff() method reports no differences (i.e. no w:ins or w:del tags).
This is expected and handled cleanly, in spite of the fact that the original .docx has the text broken up like this inside the <w:tc> -> <w:p> tags:
<w:r>
<w:t>Cell</w:t>
</w:r>
<w:r>
<w:t xml:space="preserve"> 1-1</w:t>
</w:r>
and this in the transformed document:
<w:r>
<w:t xml:space="preserve">Cell 1-1</w:t>
</w:r>
However, if I add the text "Table Title" above the table in the document, the contents of each cell in the original document merge into one <w:r> (Word's handling, nothing I can do about it):
<w:r>
<w:t>Cell 1-1</w:t>
</w:r>
And the only difference in the transformed document is that xml:space="preserve" is inserted:
<w:r>
<w:t xml:space="preserve">Cell 1-1</w:t>
</w:r>
However, docx4j's Differencer.diff() method now reports that the content of each cell is inserted, and shows the following as the content of each w:tc's w:p in the generated diff document:
<w:ins xmlns:xalan="http://xml.apache.org/xalan" xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage" w:date="2009-03-11T17:57:00Z" w:author="someone" w:id="1">
<w:r>
<w:t xml:space="preserve">Cell 1-1</w:t>
</w:r>
</w:ins>
and shows the content of each cell as deleted, immediately following the closing <w:tbl> tag:
<!--Handling simple deleted w:p-->
<w:p xmlns:xalan="http://xml.apache.org/xalan" xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
<w:del w:date="2009-03-11T17:57:00Z" w:author="someone" w:id="5">
<w:r>
<w:delText>Cell 1-1</w:delText>
</w:r>
</w:del>
</w:p>
I know that the Differencer is capable of ignoring the xml:space="preserve" attributes because it does so with the inserted text before the table, so I doubt that's the cause.
Are these table scenarios outside the intended use case for the Differencer? Is it an error in usage / invocation? Bug?
Any guidance is appreciated.
Is it possible to define an element like HTML's "font" tag, which can contain all three kinds of content?
For example, I can write
<font size=3>This is <b>the</b> text</font>
How can I define in XSD that font can contain:
1) attribute size
2) nested element B
3) text around it?
Thanks
Define the element's complex type with mixed content, i.e. set mixed="true" on its xs:complexType.
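A sketch of what such a schema could look like is shown below, held as a string in Python so it can at least be checked for well-formedness. The element name b, the xs:positiveInteger type for size, and the occurrence limits are assumptions. The standard library has no XSD validator, so this only parses the schema and shows where the three kinds of content end up in a parsed instance.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumed schema: "font" allows text interleaved with any number of <b> children.
SCHEMA = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="font">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="b" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="size" type="xs:positiveInteger"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# XML, unlike HTML, requires the attribute value to be quoted.
INSTANCE = '<font size="3">This is <b>the</b> text</font>'

schema = ET.fromstring(SCHEMA)
font_type = schema.find(f"{{{XS}}}element/{{{XS}}}complexType")
print(font_type.get("mixed"))

font = ET.fromstring(INSTANCE)
bold = font.find("b")
# Text before the child is font.text; text after the child is the child's tail.
pieces = [font.text, bold.text, bold.tail]
print(font.get("size"), pieces)
```

Without mixed="true", a complex type with a child sequence permits only whitespace between the child elements, so "This is" and "text" would be rejected by a validator.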