I have HTML stored in a variable:
var html = `<div class="RiP" style="text-align: left;"><div class="clr"></div><input name="extraMP" value="999" type="hidden"><div class="txta dropError">Slide to activate</div><div class="bgSlider"><div class="Slider ui-draggable"></div></div><div class="clr"></div><input name="randomValue" value="randomValue2" type="hidden"></div>`;
I want to extract "randomValue" and "randomValue2".
Maybe I should use cheerio? I tried it, but I had a hard time getting it to work.
If cheerio is hard for you, you could use a regular expression to get the values.
For easier access, you could add a class attribute to the <input>, like this:
<input class="className" name="randomValue" value="randomValue2" type="hidden">
Your regular expression would then be:
const match = html.match(/<input\s+class="className"\s+name="(.+?)"\s+value="(.+?)"/)
match[1] // randomValue
match[2] // randomValue2
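If you would rather not add a class or rely on the attributes appearing in a fixed order, here is a minimal sketch that first isolates each <input> tag and then reads name and value from it separately. The helper name extractInputs is hypothetical (it is not part of any library), and the sketch assumes double-quoted attribute values with no ">" inside them, so it carries the usual caveats of regex-based HTML handling.

```javascript
// Hypothetical helper: collects the name/value pair of every <input> tag.
// Assumes double-quoted attribute values and no ">" inside an attribute.
function extractInputs(html) {
  const inputs = [];
  for (const [tag] of html.matchAll(/<input\b[^>]*>/gi)) {
    // Look up each attribute on its own so their order does not matter.
    const name = /\sname="([^"]*)"/i.exec(tag);
    const value = /\svalue="([^"]*)"/i.exec(tag);
    inputs.push({
      name: name ? name[1] : null,
      value: value ? value[1] : null,
    });
  }
  return inputs;
}

// Shortened version of the markup from the question.
const html =
  '<div class="RiP"><input name="extraMP" value="999" type="hidden">' +
  '<input name="randomValue" value="randomValue2" type="hidden"></div>';

const inputs = extractInputs(html);
const last = inputs[inputs.length - 1];
console.log(last.name, last.value); // randomValue randomValue2
```

Taking the last entry mirrors the idea of targeting the final input in the container; if the markup is less regular than this, a real parser such as cheerio is the safer choice.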
With cheerio it will be:
const cheerio = require('cheerio');
const html = `<div class="RiP" style="text-align: left;"><div class="clr"></div><input name="extraMP" value="999" type="hidden"><div class="txta dropError">Slide to activate</div><div class="bgSlider"><div class="Slider ui-draggable"></div></div><div class="clr"></div><input class="myClass" name="randomValue" value="randomValue2" type="hidden"></div>`
const $ = cheerio.load(html);
$('.myClass').val(); // randomValue2
$('.myClass').attr('name'); // randomValue
With cheerio, you can find the last input inside the element with class RiP and read its name and value attributes:
var html = `<div class="RiP" style="text-align: left;"><div class="clr"></div><input name="extraMP" value="999" type="hidden"><div class="txta dropError">Slide to activate</div><div class="bgSlider"><div class="Slider ui-draggable"></div></div><div class="clr"></div><input name="randomValue" value="randomValue2" type="hidden"></div>`;
const cheerio = require('cheerio');
const $ = cheerio.load(html);
let input = $('.RiP input').last();
console.log(input.attr('name'));
console.log(input.val());
Result:
randomValue
randomValue2
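Note that parsing HTML with regular expressions is not advisable: attribute order, quoting, and whitespace can all vary, so a pattern tuned to one snippet breaks easily.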
Related
I have a HTML table that looks like this:
<tr class="row-class" role="row">
<td>Text1</td>
<td>
<form method='get' action='http://example.php'>
<input type='hidden' name='id_num' value='ABCD123'> <!-- < I NEED THIS VALUE -->
<button type='submit' class='btn' title='Check' ></button>
</form>
</td>
</tr>
I want to get the value of the hidden input type named id_num. (In this example the value I want is "ABCD123").
I tried to parse the code with cheerio like this:
var $ = cheerio.load(body);
$('tr').each(function(i, tr){
var children = $(this).children();
var x = children.eq(0);
var id_num = children.eq(1);
var row = {
"x": x.text().trim(), //this is correct, value is Text1
"id_num": id_num.text().trim() //This is empty, value is "", I want the value "ABCD123"
};
});
But I only get the first value correct.
How can I get the value from the hidden input element id_num?
Thanks.
That should be:
$(tr).find('[name="id_num"]').attr('value')
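Your eq(1) selects the second <td>, and .text() only returns text content, so it never sees the input's value attribute. Find the input and read its value instead: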
$('tr').each(function(i, tr){
var children = $(this).children('td');
var x = $(children[0]);
var id_num = $(children[1]).find("input[name='id_num']");
var row = {
"x": x.text(),
"id_num": id_num.val()
};
});
I am using Showdown.
When I run this code:
const showdown = require("showdown")
converter = new showdown.Converter()
const myMarkdownText = '## Some important text'
const myHtmlText = converter.makeHtml(myMarkdownText)
I get
<h2 id="someimportanttext">Some important text</h2>
which is the expected result.
But when I run this code:
const showdown = require("showdown")
converter = new showdown.Converter()
const myMarkdownText = '<div markdown = "1"> ## Some important text </div>'
const myHtmlText = converter.makeHtml(myMarkdownText)
I get
<div markdown = "1"><p>## Some important text </p></div>
Which means that Showdown didn't parse the stuff inside the html div.
Any help on how to make it work?
After reading the Showdown documentation (https://github.com/showdownjs/showdown#valid-options) my conclusion is that you should probably enable the backslashEscapesHTMLTags option and backslash the html tags.
A bit late, but for future reference:
To enable parsing of Markdown inside HTML tags, you have to put markdown="1" as an attribute on the HTML tag, like this:
<div markdown="1"># I will be parsed</div>
There is more information in the documentation
I am scraping a web site and using node and cheerio for that purpose.
I have the below structure
<li class="wrap-level-1">
<a class="level-2 link" href="https:mysite..." target="_blank"> Tropical Viking </a>
</li>
How do I get only the "Tropical Viking" text?
I am trying this
$('.wrap-level-1').map((i, el) => {
console.log('entering scrapper')
const count = resultCount++
console.log(count)
//This is what I need
const title = $(el).find('a').???
const metadata = {
title: title
}
parsedResults.push(metadata)
console.log(metadata)
})
Thanks for your help
It looks like you want this:
let parsedResults = $('.wrap-level-1').map((i, el) => {
console.log('entering scrapper')
const count = resultCount++
console.log(count)
// This is what I need
const title = $(el).find('a').text().trim() // trim() drops the spaces around the link text
const metadata = {
title: title
}
return metadata
}).get()
I am using Cheerio in nodejs to select text from a URL where an element contains the attribute itemprop="name".
At the moment I need to know the parent element in order to read the attribute and associated text. See below as an example.
However, I would like to use a wildcard in place of the element (e.g. h2), so that I can select any element with itemprop="name". Is this possible?
var $ = cheerio.load(body);
var domElem = $("h2[itemprop = 'name']").get(0);
var content = $(domElem).text().trim();
ogTitle = content;
console.log(content);
It looks like you can use * as a wildcard (the * is optional; the attribute selector [itemprop = 'name'] on its own also matches any element):
var $ = cheerio.load(body);
var domElem = $("*[itemprop = 'name']").get(0);
var content = $(domElem).text().trim();
ogTitle = content;
console.log(content);
The following also worked for me:
HTML code:
<a href="/someLine" itemscope="" itemprop="author" itemtype="http://schema.org/Person">
<span itemprop="name">Jane Author</span>
</a>
Used this to get Jane Author:
author = $("*[itemprop = 'author']").text();
// Jane Author
I'm trying to create a custom Handlebars helper, and I want to be able to pass it a "base template" and a "partial".
It should render the base template and then render whichever partial is passed as the second parameter.
I have the following right now:
module.exports.register = function(Handlebars, options) {
var assembleOpts = options || {};
Handlebars.registerHelper("sgComponent", function (template, partial, options) {
// Default options
var opts = {
cwd: '',
src: '',
glob: {}
};
options = _.defaults(options.hash, assembleOpts.sgComponent, opts);
var partialContent, partialContext;
// Join path to 'cwd' if defined in the helper's options
var cwd = path.join.bind(null, options.cwd, '');
var src = path.join.bind(null, options.src, '');
glob.find(src(partial), options.glob).map(function(path) {
partialContext = yfm.extract(path).context;
partialContent = yfm.extract(path).content;
});
return glob.find(cwd(template), options.glob).map(function(path) {
var context = yfm.extract(path).context;
var content = yfm.extract(path).content;
return {
path: path,
context: processContext(grunt, partialContext),
content: content
};
}).map(function (obj) {
var template = Handlebars.compile(obj.content);
return new Handlebars.SafeString(template({content: obj.context}));
});
});
var processContext = function(grunt, context) {
grunt.config.data = _.defaults(context || {}, _.cloneDeep(grunt.config.data));
return grunt.config.process(grunt.config.data);
};
};
And right now I'm using my helper like this:
{{{sgComponent 'path/to/basetemplate/basetemplate.hbs' 'path/to/partial/partial.hbs'}}}
I'm a little stuck right now. At the moment I can only figure out how to render either the base template or the partial, or how to render the base template with the context from the partial (its YAML front matter). What I would like to achieve is the base template being rendered with the partial's content rendered inside it, using whatever context is defined in the partial.
Like so (base template):
<div class="sg-component">
<!-- Markup -->
<div class="sg-component__markup">
{{partial}}
</div>
<!-- Documentation -->
<div class="sg-component__documentation">
{{#markdown}}
~~~markup
{{partial}}
~~~
{{/markdown}}
</div>
</div>
Partial:
---
context: context stuff here
---
<h1 class="title--huge">This is a very large header</h1>
<h2 class="title--xlarge">This is a large header</h2>
<h3 class="title--large">This is a medium header</h3>
<h4 class="title--medium">This is a moderate header</h4>
<h5 class="title--small">This is a small header</h5>
<h6 class="title--xsmall">This is a tiny header</h6>
Thanks in advance!
Dan
So, I fixed it myself! Hurray!
I sat down, thought it through, and came to the conclusion that I only needed to register the second argument as a partial.
So I added this after the Handlebars.compile(obj.content) line:
Handlebars.registerPartial("sgComponentContent", partial);
Then, within my base template, I can now use {{> sgComponentContent}}.
Awesome!