I am trying to get the HTML elements of an Angular app, but when I use the request module I only get the page source, which contains the module tag but not the actual rendered HTML.
Is there any way to get the rendered HTML page, with all elements rendered, in the request response?
var request = require("request");
request(
  { uri: "http://www.myangular-app.com" },
  function (error, response, body) {
    console.log(body); // ---> Receiving source only, not the rendered elements of module.
  }
);
I'm trying to scrape this page with cheerio: https://en.dict.naver.com/#/search?query=%EC%B6%94%EC%9B%8C%EC%9A%94&range=all
But I can't get anything. I tried to get the 'Word-Idiom' text, but the result is empty.
Here's my code:
app.get("/conjugation", (req, res) => {
  axios(
    "https://en.dict.naver.com/#/search?query=%EC%B6%94%EC%9B%8C%EC%9A%94&range=all"
  )
    .then((response) => {
      const htmlData = response.data;
      const $ = cheerio.load(htmlData);
      const element = $(
        "#searchPage_entry > h3 > span.title_text.myScrollNavQuick.my_searchPage"
      );
      console.log(element.text());
    })
    .catch((err) => console.log(err));
});
The server at that URL doesn't return any body DOM structure in the HTML response. The body DOM is rendered by linked JavaScript after the response is received. Cheerio doesn't execute the JavaScript in the HTML response, so it isn't possible to scrape that page using Cheerio alone. Instead, you'll need a tool that can execute the in-page JavaScript (e.g. Puppeteer).
This is a common issue in web scraping: the page loads its content dynamically, so the initial GET response from that website contains little more than script tags. Print htmlData to see this for yourself. Because the response has no rendered HTML elements, you'll need something like Selenium that can wait for the elements you need to be rendered.
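As a rough illustration of the headless-browser approach both answers describe, here is a minimal sketch using Puppeteer. It assumes the puppeteer package is installed; the function name fetchRenderedHtml and the optional selector argument are hypothetical, and the right wait condition depends on the page being scraped.

```javascript
// Minimal sketch: load a page in headless Chrome and return the HTML
// after the in-page JavaScript has run.
// Assumption: `npm install puppeteer` has been done; the package is
// required lazily so that this file loads even without it.
async function fetchRenderedHtml(url, selector) {
  const puppeteer = require("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so client-side rendering can finish.
    await page.goto(url, { waitUntil: "networkidle0" });
    // Optionally wait for a specific element (hypothetical parameter).
    if (selector) {
      await page.waitForSelector(selector);
    }
    // Serialized DOM as it stands after rendering.
    return await page.content();
  } finally {
    await browser.close();
  }
}
```

The returned string can then be passed to cheerio.load() as before, e.g. fetchRenderedHtml(url, "#searchPage_entry").then((html) => cheerio.load(html)).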
I am new to React as well as Node and Express. I want to know one simple thing: I use the Axios API to get a response from the backend (sent with res.send()), and when that response arrives I want to display a div saying "backend responds" below the submit button on the frontend.
I created a React component called Block that returns a div tag, but I can't work out how to render that component after the button, and only once a response has arrived.
Check out the screenshot of the code
You can use state to save the response received from Axios. Remember that the payload is in response.data, not in the response object itself.
const submitReview = () => {
  // Getting around CORS problem using cors-anywhere
  Axios.get(
    "your url here"
  ).then((response) => {
    // The payload is in response.data
    console.log(response.data);
    // Store the payload, not the whole response object
    setReview(response.data);
  });
};
Then, in your JSX, check whether the response has arrived. I have used the length of the state variable here; you could instead use a boolean such as 'loaded' to indicate that the response has already been received.
<button onClick={submitReview}> Submit </button>
{review.length > 0 ? review : ""}
Check out complete example here
https://codesandbox.io/s/get-response-via-axios-api-and-display-it-as-dom-element-on-frontend-st0ht?file=/src/App.js:349-798
Scenario
I am creating a nodejs server, which'll act as a middle server between the actual client and actual server. i.e. I send a request to a website, through my nodejs server, receive the response from actual (website) server, and forward the same to the client (the browser).
Here's part of the code for doing that
const cheerio = require('cheerio');
//#================================================================
// include other files and declare variables
//#================================================================
app.get('/*', (req, res) => {
    //#================================================================
    // some code...
    //#================================================================
    request(options, function(error, response, body){
        if (!error && response.statusCode == 200) {
            res.writeHead(200, headers);
            if (String(response.headers['content-type']).indexOf('text/html') !== -1){
                var $ = cheerio.load(body);
                //#================================================
                // perform html manipulations
                //#================================================
                //send the html content as response
                res.end($.html());
            }else{
                res.end(body);
            }
        }else{
            res.send({status: 500, error: error});
        }
    });
});
Everything works fine until I stumble upon this particular website: https://www.voonik.com/recommendations/bright-cotton-a-line-kurta-for-women-blue-printed-bcown-007b-38-1f2073ca.
If you look at its page source, it is more or less like this:
<!doctype html>
<html lang="en-in" data-reactid=".mc12nbyapk" data-react-checksum="-2121099716">
<!-- rest of the html code -->
...
<script type="text/javascript" charset="UTF-8" data-reactid=".mc12nbyapk.1.1">
window.NREUM||(NREUM={});NREUM.info = {"agent":"","beacon":"bam.nr-data.net","errorBeacon":"bam.nr-data.net"...
...
</script></body></html>
When I send this very HTML in my response object, the client receives incomplete HTML: it breaks off somewhere inside the last script tag.
I also logged the HTML to the console, and it prints the whole string. But sending the same string in the response object delivers only part of it.
I also tried res.write(), res.send(), and storing the HTML content in a variable before sending it, but the outcome is the same: incomplete HTML content.
I am looking for a solution that doesn't involve writing to and reading from a file, and just sends the response directly as it is received.
After you manipulate the target server's response, the content length changes, so you must either recalculate the length and rewrite the content-length header, or simply delete the content-length header.
To delete it, put delete headers['content-length']; before the res.writeHead(200, headers); line.
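For the other option, recalculating the header, a minimal sketch might look like the following. The helper name headersForModifiedBody is hypothetical, and it assumes the rewritten body is a UTF-8 string such as the output of $.html().

```javascript
// Hypothetical helper: copy the upstream headers, drop any stale
// Content-Length (header names are case-insensitive), and set a new
// one from the byte length of the modified body.
function headersForModifiedBody(upstreamHeaders, newBody) {
  const headers = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    if (name.toLowerCase() !== "content-length") {
      headers[name] = value;
    }
  }
  // Content-Length counts bytes, not characters, so use Buffer.byteLength
  // rather than newBody.length.
  headers["content-length"] = Buffer.byteLength(newBody, "utf8");
  return headers;
}
```

In the proxy above this would be used as res.writeHead(200, headersForModifiedBody(headers, html)) followed by res.end(html), where html holds the result of $.html().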
var Request = require("request");
Request.get("https://www.yonline.com", (error, response, body) => {
    if(error) {
        return console.dir(error);
    }
    document.getElementById('msg').innerHTML=JSON.parse(body)
    console.dir(JSON.parse(body));
});
I tried this code, but it only prints the data on the command line. I want the data to appear in the intended text field.
Node.js runs on the server, so it has no document object and can't access HTML elements like this.
You need to update the element from the page's own JavaScript, which runs in the browser.
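One way to split the work, sketched here with only Node's built-in http module: the server exposes the data as JSON, and a script inside the served page fills in the element. The /data route, the message text, and the names PAGE and handler are all hypothetical.

```javascript
const http = require("http");

// HTML sent to the browser. The inline script runs in the browser,
// where `document` exists, and fills in the #msg element.
const PAGE = `<!doctype html>
<p id="msg">loading...</p>
<script>
  fetch("/data")
    .then((r) => r.json())
    .then((data) => {
      document.getElementById("msg").textContent = data.message;
    });
</script>`;

// Server side: no DOM access here, only responses.
function handler(req, res) {
  if (req.url === "/data") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ message: "hello from the server" }));
  } else {
    res.writeHead(200, { "Content-Type": "text/html" });
    res.end(PAGE);
  }
}

// Created but not started; call server.listen(3000) to serve requests.
const server = http.createServer(handler);
```

In the /data branch, the server could call the remote API with the request module and forward its parsed body instead of the fixed message.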
I’m writing some proxy server code which intercepts a request (originated by a user clicking on a link in a browser window) and forwards the request to a third party fileserver. My code then gets the response and forwards it back to the browser. Based on the mime type of the file, I would like to handle the file server's response in one of two ways:
If the file is an image, I want to send the user to a new page that displays the image, or
For all other file types, I simply want the browser to handle receiving it (typically a download).
My node stack includes Express+bodyParser, Request.js, EJS, and Passport. Here’s the basic proxy code along with some pseudocode that needs a lot of help. (Mea culpa!)
app.get('/file', ensureLoggedIn('/login'), function(req,res) {
    var filePath = 'https://www.fileserver.com/file'+req.query.fileID,
        companyID = etc…,
        companyPW = etc…,
        fileServerResponse = request.get(filePath).auth(companyID,companyPW,false);
    if ( fileServerResponse.get('Content-type') == 'image/png') // I will also add other image types
    // Line above yields TypeError: Object #<Request> has no method 'get'
    // Is it because Express and Request.js aren't using compatible response object structures?
    {
        // render the image using an EJS template and insert image using base64-encoding
        res.render( 'imageTemplate',
            { imageData: new Buffer(fileServerResponse.body).toString('base64') }
        );
        // During render, EJS will insert data in the imageTemplate HTML using something like:
        // <img src='data:image/png;base64, <%= imageData %>' />
    }
    else // file is not an image, so let browser deal with receiving the data
    {
        fileServerResponse.pipe(res); // forward entire response transparently
        // line above works perfectly and would be fine if I only wanted to provide downloads.
    }
})
I have no control over the file server and the files won't necessarily have a file suffix so that's why I need to get their MIME type. If there's a better way to do this proxy task (say by temporarily storing the file server's response as a file and inspecting it) I'm all ears. Also, I have flexibility to add more modules or middleware if that helps. Thanks!
You need to pass a callback to the request function, as its interface requires. The call is asynchronous, so the file server's response is delivered to that callback rather than returned from request.get().
request.get({
    uri: filePath,
    'auth': {
        'user': companyID, // same variable name as in the question's code
        'pass': companyPW,
        'sendImmediately': false
    }
}, function (error, fileServerResponse, body) {
    //note that fileServerResponse uses the node core http.IncomingMessage API
    //so the content type is in fileServerResponse.headers['content-type']
});
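Inside that callback, the image check from the question could be written with a small helper like the one below. This is only a sketch: the name isImageContentType is hypothetical, and it treats any image/* type as an image.

```javascript
// Hypothetical helper: true when a Content-Type header value names an image.
function isImageContentType(headerValue) {
  // The header may be missing entirely.
  if (typeof headerValue !== "string") {
    return false;
  }
  // Drop parameters such as "; charset=binary" and normalise case.
  const mimeType = headerValue.split(";")[0].trim().toLowerCase();
  return mimeType.startsWith("image/");
}
```

It would be called as isImageContentType(fileServerResponse.headers['content-type']). If the body is then base64-encoded for the template, the request options should also include encoding: null, so that body arrives as a Buffer instead of a decoded string.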
You can use the mmmagic module. It is an asynchronous libmagic binding for Node.js that detects content types by inspecting the data itself.
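To show what detection by data inspection means, here is a hand-rolled sketch that checks the leading bytes of a buffer against a few well-known image signatures. It does not use mmmagic's API and covers only PNG, JPEG, and GIF; the names SIGNATURES and sniffImageType are hypothetical.

```javascript
// Leading "magic" bytes of a few common image formats.
const SIGNATURES = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // ASCII "GIF8"
];

// Return the MIME type whose signature matches the start of the buffer,
// or null when none of the known signatures match.
function sniffImageType(buffer) {
  for (const { mime, bytes } of SIGNATURES) {
    const matches =
      buffer.length >= bytes.length &&
      bytes.every((byte, i) => buffer[i] === byte);
    if (matches) {
      return mime;
    }
  }
  return null;
}
```

A library such as mmmagic recognises far more formats; this sketch is mainly useful as a fallback when the file server sends a missing or generic Content-Type header.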