Cheerio - indent newly inserted element under sibling? - node.js

I am using cheerio to modify an XML file in Node. I insert a <target> element after <source>, which works with the insertAfter() API: oldEl.translation.insertAfter(msgSourceEl);.
However, I lose my indentation:
<trans-unit id="title" datatype="html">
<source>Login</source><target>iniciar sesión</target>
Is there a way to indent the newly inserted <target>iniciar sesión</target> so that it sits on its own line underneath the <source> element?

Just fix the final XML indentation:
You can use xml-beautifier to produce human-readable, indented XML:
import beautify from 'xml-beautifier';
const HumanXML = beautify(XML);
console.log(HumanXML); // => will output correctly indented elements
(EXTRA) No need for Cheerio:
The following example uses xml2js to parse the XML into a JavaScript object, modify it, and then build it back into XML:
const xml2js = require('xml2js');
const xml = '<trans-unit id="title" datatype="html"><source>Login</source></trans-unit>';
xml2js.parseString(xml, function (err, result) {
  if (err) throw err;
  // Add <target> as a sibling of <source>
  result['trans-unit'].target = ['iniciar sesión'];
  const builder = new xml2js.Builder();
  const output = builder.buildObject(result);
  console.log(output);
});
Final Output:
<trans-unit id="title" datatype="html">
  <source>Login</source>
  <target>iniciar sesión</target>
</trans-unit>
I assume you are doing this as part of a loop, so it shouldn't be hard to extend the example to your case. I suggest using underscore for the usual helpers (each, map, reduce, filter...).

First of all, since source is a void (empty) element in HTML, cheerio in its default HTML mode does not keep <source>Login</source> intact. It converts it to <source>Login. (Loading the document with the xmlMode: true option avoids this.)
So, to demonstrate element indentation, I will use a <source2> element instead.
As shown below, including the newline and indentation whitespace in the markup you insert solves the issue.
let $ = require('cheerio').load(`
<trans-unit id="title" datatype="html">
<source2>Login</source2>
</trans-unit>`)
$(`
<target>iniciar sesión</target>`).insertAfter($('source2'));
console.log($.html('trans-unit'))
Output:
<trans-unit id="title" datatype="html">
<source2>Login</source2>
<target>iniciar sesión</target>
</trans-unit>
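Rather than hard-coding the whitespace, the indentation can be copied from the existing sibling. The following is only a minimal sketch in plain JavaScript, assuming the sibling starts on its own line in the raw markup; leadingIndent is a hypothetical helper written for this illustration and is not part of cheerio.

```javascript
// Hypothetical helper (not a cheerio API): returns the whitespace that
// precedes the first occurrence of `<tagName` on its line in the raw markup.
function leadingIndent(markup, tagName) {
  const pos = markup.indexOf('<' + tagName);
  if (pos === -1) return '';
  const lineStart = markup.lastIndexOf('\n', pos - 1) + 1;
  const prefix = markup.slice(lineStart, pos);
  // Only treat the prefix as indentation if it is purely whitespace.
  return /^\s*$/.test(prefix) ? prefix : '';
}

const xml = [
  '<trans-unit id="title" datatype="html">',
  '  <source2>Login</source2>',
  '</trans-unit>',
].join('\n');

const indent = leadingIndent(xml, 'source2');
// Newline + copied indentation + the new element.
const snippet = '\n' + indent + '<target>iniciar sesión</target>';

console.log(JSON.stringify(snippet));
// prints "\n  <target>iniciar sesión</target>"
```

The resulting snippet string is what would be passed to $(snippet).insertAfter($('source2')) in the cheerio code above.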


Why is JSON.stringify needed to retain an object's value when importing it from node.js into an EJS template?

Environment: Node.js, Express, EJS
When JSON.stringify() is used to process objects passed from node.js to an EJS template the objects retain their original values. Although it works I find this result unexpected. JSON.stringify turns objects into strings. Why does this appear to work in reverse in this instance?
In the Node.js file:
app.get('/', function(req, res) {
let myArray = [1, 5];
let myObject = {
cats: 2,
dogs: 0
}
res.render('index', { myArray, myObject });
})
EJS:
<script>
let importedArray = <%- JSON.stringify(myArray) %>;
let importedObject = <%- JSON.stringify(myObject) %>;
</script>
Rendered version in browser:
Although I find this result unexpected it works perfectly fine.
<script>
let importedArray = [1,5];
let importedObject = {"cats":2,"dogs":0};
</script>
Rendered after both JSON.stringify() are removed in EJS file:
The values are lost and the browser throws an error. I would have thought the unescaped output tag <%- would be enough but it's not.
<script>
let importedArray = 1,5;
let importedObject = [object Object];
</script>
Because when you specify the source code for a script that will live inside a <script> tag in a web page, you need to generate raw JavaScript source code that will recreate your object in the page.
So, you need some method of turning your live server-side JavaScript object back into JavaScript source code that describes the same object. JSON.stringify() is one such way to generate that source.
If you don't use something like JSON.stringify() and just pass your actual JavaScript object, EJS will see that it's not a string and will call obj.toString() on it to get a string representation. Unfortunately, the implementation of .toString() for a plain JavaScript object just generates "[object Object]", which is completely useless in an EJS template. So you can't do it that way: you have to generate the correct JavaScript source code string yourself, and JSON.stringify() is one way to do that.
Because EJS only renders strings, and calling toString() on an object yields '[object Object]' instead of its actual content.
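The difference can be reproduced without EJS at all. This is a minimal sketch, assuming the template engine coerces non-string values to strings as described above; the render function is a hypothetical stand-in for the unescaped <%- %> tag, not EJS itself.

```javascript
// Hypothetical stand-in for an unescaped output tag: coerce the value to a
// string and splice it into the generated script source.
function render(value) {
  return 'let imported = ' + String(value) + ';';
}

const myArray = [1, 5];
const myObject = { cats: 2, dogs: 0 };

// Without JSON.stringify: default string coercion loses the structure.
const plainArray = render(myArray);
const plainObject = render(myObject);
console.log(plainArray);  // let imported = 1,5;
console.log(plainObject); // let imported = [object Object];

// With JSON.stringify: the string is valid JavaScript source for the value.
const jsonArray = render(JSON.stringify(myArray));
const jsonObject = render(JSON.stringify(myObject));
console.log(jsonArray);   // let imported = [1,5];
console.log(jsonObject);  // let imported = {"cats":2,"dogs":0};
```

So JSON.stringify() is not working "in reverse": it still produces a string, but that string happens to be source code the browser parses back into an equivalent value.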

Showdown doesn't parse inside html block

I am using Showdown.
When I run this code:
const showdown = require("showdown")
const converter = new showdown.Converter()
const myMarkdownText = '## Some important text'
const myHtmlText = converter.makeHtml(myMarkdownText)
I get
<h2 id="someimportanttext">Some important text</h2>
which is the expected result.
But when I run this code:
const showdown = require("showdown")
const converter = new showdown.Converter()
const myMarkdownText = '<div markdown = "1"> ## Some important text </div>'
const myHtmlText = converter.makeHtml(myMarkdownText)
I get
<div markdown = "1"><p>## Some important text </p></div>
Which means that Showdown didn't parse the stuff inside the html div.
Any help on how to make it work?
After reading the Showdown documentation (https://github.com/showdownjs/showdown#valid-options) my conclusion is that you should probably enable the backslashEscapesHTMLTags option and backslash the html tags.
A bit late but for future reference:
To enable parsing of markdown inside HTML tags, you have to put markdown="1" as an attribute on the HTML tag, like:
<div markdown="1"># I will be parsed</div>
There is more information in the documentation

Cheerio how to ignore elements of a certain tag

I am scraping the body of the webpage:
axios.get(url)
.then(function(response){
var $ = cheerio.load(response.data);
var body = $('body').text();
});
The problem is, I want to exclude contents from the <footer> tag. How do I do that?
cheerio creates a pseudo-DOM when it parses the HTML. You can manipulate that DOM similar to how you would manipulate the DOM in a browser. In your specific case, you could remove items from the DOM using any number of methods such as
.remove()
.replaceWith()
.empty()
.html()
So, the basic idea is that you would use a selector to find the footer element and then remove it as in:
$('footer').remove();
Then, fetch the text after you've removed those elements:
var body = $('body').text();

Gathering document fragments at rendering time using `pug`

I use pug to generate HTML email messages from a template:
doctype html
html
  head
    title Hello #{name}
  body
    ...
The title is the subject of the email.
Currently, I extract the title text by parsing the HTML document rendered by pug, but that doesn't seem to be a very efficient way of doing it.
Is there some feature or hook available in pug to collect part of the document while rendering it? I considered pug filters, but as far as I understand, those are not suitable since they are applied at compile time, not while rendering the document.
I came to a solution using a mixin:
mixin collect(name)
  -
    // This is just an ugly hack to
    // capture the inner block rendered
    // text
    const savedHtml = pug_html;
    pug_html = "";
    if (block) block();
    const innerHtml = pug_html;
    self[name]=innerHtml;
    pug_html = savedHtml+innerHtml;
html
  head
    title
      +collect('title')
        | Hello #{self.name}
const pug = require("pug");
const compiledFunction = pug.compileFile('template.pug', {debug:true,self:true});
// The locals object is passed as `self`; the mixin stores collected text on it
const out = {
  name: 'Timothy',
};
console.log(compiledFunction(out));
console.log(JSON.stringify(out));
Displaying:
<html><head><title>Hello Timothy</title></head></html>
{"name":"Timothy","title":"Hello Timothy"}
The code of the collect() mixin is not particularly pretty because, as far as I know, there is no elegant way to capture the block() output. So I had to tap into the internal, undocumented pug_html variable.
Or is there a cleaner way to achieve that?

Multiline comment missing in heredoc

I use ejs and heredoc to generate XML files:
const ejs = require('ejs');
const heredoc = require('heredoc');
var tpl = heredoc(function(){/*
<xml>
<ToUserName><![CDATA[<%= toUserName %>]]></ToUserName>
<FromUserName><![CDATA[<%= fromUserName %>]]></FromUserName>
....
</xml>
*/});
However, after running jslint and compiling with babel, it becomes this:
var ejs=require("ejs"),heredoc=require("heredoc"),
tpl=heredoc(function(){});
The content inside heredoc is missing. It seems that the compiler treats the text inside as a comment and strips it.
Does anyone know how to solve this?
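The behaviour described above can be illustrated in plain JavaScript. This is a sketch under the assumption that heredoc recovers its text from the function's source via toString(); extractComment is a hypothetical re-implementation written for illustration, not the heredoc package itself. Once a build step strips the comment, there is nothing left to recover, whereas a template literal keeps the template as real string data.

```javascript
// Hypothetical re-implementation of the comment-in-function trick:
// read the function's source text and pull out the first block comment.
function extractComment(fn) {
  const match = /\/\*([\s\S]*?)\*\//.exec(fn.toString());
  return match ? match[1] : '';
}

const original = function () {/*
<xml>hello</xml>
*/};

// What is left after a build step has stripped the comment.
const stripped = function () {};

const recovered = extractComment(original);
const lost = extractComment(stripped);
console.log(JSON.stringify(recovered)); // the XML text is still there
console.log(JSON.stringify(lost));      // empty string: the text is gone

// Possible alternative: a template literal is ordinary string data, so
// comment stripping does not touch it. The EJS tags stay in the string.
const tpl = `<xml>
<ToUserName><![CDATA[<%= toUserName %>]]></ToUserName>
</xml>`;
console.log(tpl);
```

The tpl string can then be handed to ejs in the same way as the heredoc result was.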
