How to get HTML source of HTTPS website in Node - node.js

I have the following code snippet, which works with Google, but I noticed that requests to websites like Amazon, which force HTTPS, come back with status 301 (Moved Permanently). I think the problem may be that I'm using the http package, but the https package confuses me. If anyone could help me out, that would be stupendous.
var http = require("http");
var vars = {
  host: "www.google.com",
  port: 80,
  path: "/index.html"
};
http.get(vars, function(res) {
  console.log(res.statusCode);
  res.setEncoding("utf8");
  res.on("data", function(data) {
    console.log(data);
  });
});

You can just use https.get(). HTTPS uses a different default port (443) than HTTP (80), so I prefer to pass in the full URL and let the library pick the default port for me:
const https = require('https');
https.get("https://www.google.com/index.html", function(res) {
  console.log(res.statusCode);
  res.setEncoding('utf8');
  res.on('data', function(data) {
    console.log(data);
  });
}).on('error', function(err) {
  console.log(err);
});
The response body may arrive across multiple data events, so if you want the whole page you have to concatenate the chunks yourself.
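A minimal sketch of combining the chunks, assuming a UTF-8 text response. The helper names collectBody and fetchPage are made up for illustration and are not part of the https module:

```javascript
const https = require('https');

// Hypothetical helper (not part of Node): gathers every 'data' chunk of a
// response stream and hands the joined text to the callback on 'end'.
function collectBody(res, callback) {
  const chunks = [];
  res.setEncoding('utf8'); // chunks arrive as strings instead of Buffers
  res.on('data', function (chunk) {
    chunks.push(chunk);
  });
  res.on('end', function () {
    callback(null, chunks.join(''));
  });
  res.on('error', callback);
}

// Hypothetical wrapper: performs a real network request when called.
function fetchPage(url, callback) {
  https.get(url, function (res) {
    collectBody(res, callback);
  }).on('error', callback);
}
```

Calling fetchPage("https://www.google.com/index.html", function (err, html) { ... }) would then hand the complete page to the callback in one piece.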
Personally, I prefer to use a higher level library that is promise-based and makes lots of things simpler:
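// require() works with got v11 and earlier; got v12+ is ESM-only and must be loaded with import
const got = require('got');
got("https://www.google.com/index.html").then(result => {
  console.log(result.body);
}).catch(err => {
  console.log(err);
});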
Among many other features, got() automatically collects the whole response for you, uses promises, follows redirects, can parse JSON results for you, checks the status and rejects on 4xx and 5xx responses, and supports several authentication mechanisms. It's just easier to use than the plain http/https libraries.
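If you would rather stay with the core modules, the 301 from the question can be handled by following the Location header yourself. This is only a rough sketch: the helper names resolveRedirect and getFollowingRedirects and the list of status codes treated as redirects are assumptions made for illustration, not anything got or Node prescribes:

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: resolves a Location header (which may be relative)
// against the URL that produced the redirect.
function resolveRedirect(currentUrl, location) {
  return new URL(location, currentUrl).toString();
}

// Hypothetical helper: GET a URL, following at most maxRedirects redirects.
function getFollowingRedirects(url, maxRedirects, callback) {
  const client = url.startsWith('https:') ? https : http;
  client.get(url, function (res) {
    const isRedirect = [301, 302, 303, 307, 308].includes(res.statusCode);
    if (isRedirect && res.headers.location) {
      res.resume(); // discard the redirect body so the socket is released
      if (maxRedirects <= 0) {
        return callback(new Error('Too many redirects'));
      }
      const next = resolveRedirect(url, res.headers.location);
      return getFollowingRedirects(next, maxRedirects - 1, callback);
    }
    callback(null, res); // final response; the caller reads the body
  }).on('error', callback);
}
```

Switching between http and https per hop matters here because the redirect in the question goes from a plain HTTP URL to an HTTPS one.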

Related

Test redirection using jest in express

I am using Jest to test my code.
What I want to achieve is to test the redirection from http to https (which only exists if process.env.IS_PRODUCTION is set).
I don't know how to test it, how to mock it, and so on.
I've tried a standard GET request, but I don't know how to mock the environment variable or test it in a different way:
it('should redirect from http to https', (done) => {
  request(server)
    .get('/')
    .expect(301)
    .end((err, res) => {
      if (err) return done(err);
      expect(res.text).toBe('...')
      return done();
    });
}, 5000);
I expect to be able to test this redirection :)
You could use the node-mocks-http library, which allows you to simulate a request and response object.
Example:
const httpMocks = require('node-mocks-http');
const request = httpMocks.createRequest({
  method: 'POST',
  url: '/',
});
const response = httpMocks.createResponse();
middlewareThatHandlesRedirect(request, response);
I have never worked with Jest, but I believe you can check the response.location parameter once the middleware has been called.
Preface: I'm not familiar with Jest, Express, or Node. But I have found it much easier to test explicit configuration (instantiating objects with explicit values) than implicit configuration (environment variables and implementation switches on them).
I'm not sure what request or server are, but an explicit approach might look like:
it('should redirect from http to https', (done) => {
  const server = new Server({
    redirect_http_to_https: true,
  });
  request(server)
    .get('/')
    .expect(301)
    .end((err, res) => {
      if (err) return done(err);
      expect(res.text).toBe('...')
      return done();
    });
}, 5000);
This allows the test to explicitly configure server to the state it needs instead of mucking with the environment.
This approach also helps to keep process configuration at the top level of your application:
const server = new Server({
  redirect_http_to_https: process.env.IS_PRODUCTION,
});
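As a rough illustration of the explicit-configuration idea, the redirect itself can live in a small middleware factory that receives the flag as an argument. Everything named here (createHttpsRedirect, fakeRequest, fakeResponse) is hypothetical, and the sketch assumes Express-style req.secure and res.redirect(status, url):

```javascript
// Hypothetical middleware factory: the redirect is switched on by an
// explicit option rather than by reading process.env inside the handler.
function createHttpsRedirect(options) {
  return function httpsRedirect(req, res, next) {
    if (options.redirect_http_to_https && !req.secure) {
      // Express-style redirect; 301 matches the status the test expects.
      return res.redirect(301, 'https://' + req.headers.host + req.url);
    }
    next();
  };
}

// Hand-rolled stand-ins for the Express request and response objects,
// so the middleware can be exercised without starting a server.
function fakeRequest(secure) {
  return { secure: secure, url: '/', headers: { host: 'example.com' } };
}

function fakeResponse() {
  return {
    redirect: function (status, location) {
      this.statusCode = status;
      this.location = location;
    }
  };
}
```

A Jest test can then call the middleware with the flag set to true or false and assert on the recorded status code and location, without touching the environment at all.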

Does Node run out of http handles if used thusly

In Express I have a route that makes an http call to Amazon S3 to see if a specific image exists or not (it's basically a proxy call to bypass my inability to make a cross-domain call directly to Amazon from the browser).
The problem is that it works successfully for a while and then suddenly just stops, and the only fix is restarting my Express app.
Just in case (though I believe the code is totally generic), here's the code:
Router.get('/testS3Image',
  function(request, response){
    try{
      var imagePath = request.param('imagePath');
      var https = require('https');
      var options = {
        method:'GET',
        host: 's3.amazonaws.com',
        path: '/' + imagePath,
        port: 443
      };
      console.log("MAKING S3 call with options " + JSON.stringify(options));
      var req = https.request(options,
        function(res) {
          try{
            if(!res || !res.headers || !res.headers["content-length"])
              throw ("image not found on S3");
            console.log("returned from S3 call with success");
            response.json({success:true,headers:res.headers});
          }catch(e){
            console.log("returned from S3 call with failure");
            response.json({success:false, error:e});
          }
        }
      );
      req.end();
    }catch(e){
    }
  });
After the https.request() call is made, nothing happens. It simply goes dark. I'm new enough to Node that I don't know how to follow along and see what's happening under the hood. We're running this behind nginx.
Since it works perfectly for a long time and then suddenly breaks with no obvious rhyme or reason, I suspect some sort of http call limitation, but I am unclear on what I can do differently.
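You're not reading the response data from S3, and I wonder if this is causing the issue. When a response handler is attached, Node expects the body to be consumed; until it is, the socket stays tied up, so responses that are never read may eventually exhaust the agent's connection pool. The request also has no 'error' handler, so a failed call never reaches response.json().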
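A hedged sketch of what consuming the response could look like. It swaps the original GET for a HEAD request (only the headers are needed) and treats a 200 status as "exists", both of which are assumptions; checkImageExists and its requestImpl parameter are hypothetical names, the latter added only so the function can be exercised without the network:

```javascript
const https = require('https');

// Hypothetical helper: asks S3 whether an object exists.
// requestImpl defaults to https.request and is injectable for offline use.
function checkImageExists(imagePath, callback, requestImpl) {
  const doRequest = requestImpl || https.request;
  const options = {
    method: 'HEAD', // headers only; assumption: swapped in for the original GET
    host: 's3.amazonaws.com',
    path: '/' + imagePath,
    port: 443
  };
  const req = doRequest(options, function (res) {
    res.resume(); // consume the response so the socket is released
    callback(null, res.statusCode === 200, res.headers);
  });
  // Report failures instead of leaving the 'error' event unhandled.
  req.on('error', function (err) {
    callback(err);
  });
  req.end();
}
```

Inside the Express route, the callback would send response.json({success: exists, headers: headers}) on success and an error payload otherwise, so the browser always gets an answer.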

How do I stream data to browsers with Hapi?

I'm trying to use streams to send data to the browser with Hapi, but can't figure out how. Specifically, I am using the request module. According to the docs, the reply object accepts a stream, so I tried:
reply(request.get('https://google.com'));
That throws an error. The docs say the stream object must be compatible with streams2, so then I tried:
reply(streams2(request.get('https://google.com')));
Now that does not throw a server side error, but in the browser the request never loads (using chrome).
I then tried this:
var stream = request.get('https://google.com');
stream.on('data', data => console.log(data));
reply(streams2(stream));
And in the console data was outputted, so I know the stream is not the issue, but rather Hapi. How can I get streaming in Hapi to work?
Try using Readable.wrap:
var Readable = require('stream').Readable;
...
function (request, reply) {
  var s = Request('http://www.google.com');
  reply(new Readable().wrap(s));
}
Tested using Node 0.10.x and hapi 8.x.x. In my code example Request is the node-request module and request is the incoming hapi request object.
UPDATE
Another possible solution would be to listen for the 'response' event from Request and then reply with the http.IncomingMessage which is a proper read stream.
function (request, reply) {
  Request('http://www.google.com')
    .on('response', function (response) {
      reply(response);
    });
}
This requires fewer steps and also allows the developer to attach user defined properties to the stream before transmission. This can be useful in setting status codes other than 200.
2020
I found it! The problem was gzip compression.
To disable it just for text/event-stream, you need to pass the following config to the Hapi server:
const server = Hapi.server({
  port: 3000,
  ...
  mime:{
    override:{
      'text/event-stream':{
        compressible: false
      }
    }
  }
});
In the handler I use axios because it supports the newer streams2 interface:
async function handler(req, h) {
  const response = await axios({
    url: `http://some/url`,
    headers: req.headers,
    responseType: 'stream'
  });
  return response.data.on('data',function (chunk) {
    console.log(chunk.toString());
  })
  /* Another option with h2o2, not fully checked */
  // return h.proxy({
  //   passThrough:true,
  //   localStatePassThrough:true,
  //   uri:`http://some/url`
  // });
};

Testing an API wrapper

I'm writing an API wrapper for an external API, to be used in our application.
I have adopted a test-driven approach for this project but since I have little to no experience with writing API wrappers, I'm not sure if I'm on the right track.
I understand that I should not be testing the external API, nor should I be hitting the network in my tests. I'm using Nock to mock my requests to the API.
However, I'm not sure I'm doing this correctly.
I made some requests to the API using curl and put the (XML) response in a file, for example: /test/fixtures/authentication/error.js:
module.exports = "<error>Authorization credentials failed.</error>"
Since I don't want to hit the network, but want to make sure my wrapper parses the XML to JSON, I figured I needed sample data.
My test looks like this:
describe("with an invalid application key", function() {
  var cl, api;
  before(function(done) {
    api = nock(baseApi)
      .get('/v1/auth/authenticate')
      .reply(200, fixtures.authentication.error);
    done();
  });
  after(function(done) {
    nock.cleanAll();
    done();
  });
  it("returns an error", function(done) {
    cl = new APIClient(auth.auth_user, auth.auth_pass, "abcd1234");
    cl.authenticate(function(err, res) {
      should.exist(err);
      err.should.match(/Authorization credentials failed./);
      should.not.exist(res);
      api.isDone().should.be.true;
      done();
    });
  });
});
With my tested code looking like this:
APIClient.prototype.authenticate = function(callback) {
  var self = this;
  request({
    uri: this.httpUri + '/auth/authenticate',
    method: 'GET',
    headers: {
      auth_user: this.user,
      auth_pass: this.pass,
      auth_appkey: this.appkey
    }
  }, function(err, res, body) {
    if (err) {
      return callback('Could not connect to the API endpoint.');
    }
    self.parser.parseXML(body, function(err, result) {
      if (err) { return callback(err); }
      if (result.error) { return callback(result.error); }
      self.token = result.auth.token[0];
      return callback(null, result);
    });
  });
};
Now, this seems to be working fine for the authentication side of things (I also have a 'success' fixture, which returns the 'success' XML, and I check that the returned JSON is actually correct).
However, the API I'm using also has endpoints like:
/data/topicdata/realtime/:reportxhours/:topics/:mediatypes/:pageIndex/:pageSize
I'm not sure how to test all (should I?) possible combinations with URLs like those. I feel like I can hardly put 30 XML responses in my fixtures directory. Also, when mocking responses, I'm afraid to miss out on possible errors, edge cases, etc. the external API might return. Are these valid concerns?
If anyone has any pointers, and/or knows of any open-source and well-tested API wrappers I could take a look at, I'd be very grateful.
I think your concerns are very valid, and I suggest you also build tests using Zombie or other similar request-based testing frameworks.

GeoJson from Geoserver using node.js

I am new to Node.js, learning with examples.
Here is what I am trying to do: I have a GeoServer instance running to serve GeoJSON, and I want to call the GeoServer WFS URL and get the JSON data using Node.js. Here is the code. When I run it, I get:
getaddrinfo ENOENT
var http = require('http');
var options = {
  host: "local:8080/geoserver/wfs?service=WFS&version=1.0.0&request=GetFeature&typeName=layername&outputFormat=JSON&cql_filter=id=1",
  path: '/'
};
var request = http.request(options, function (res) {
  var data = '';
  res.on('data', function (chunk) {
    data += chunk;
  });
  res.on('end', function () {
    console.log(data);
  });
});
request.on('error', function (e) {
  console.log(e.message);
});
request.end();
Please guide me in right direction. Thank you.
You need to pass in the correct options:
host - should only be the host name
port - should be the port number, given separately from the host (8080 in your case)
path - should be the path to the resource on the host (everything you have after the host name and port)
method - should be GET or POST (GET in your case).
var options = {
  host: "local",
  port: 8080,
  path: '/geoserver/wfs?service=WFS&version=1.0.0&request=GetFeature&typeName=layername&outputFormat=JSON&cql_filter=id=1',
  method: 'GET'
};
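If you start from a full URL, the built-in URL class can do the splitting for you. A small sketch, assuming a plain http URL; toRequestOptions is a hypothetical helper name, and the http:// scheme in front of the host from the question is added here as an assumption:

```javascript
// Hypothetical helper: splits a full URL into the separate fields that
// http.request() expects.
function toRequestOptions(urlString) {
  const url = new URL(urlString);
  return {
    hostname: url.hostname,                 // host name only, no port
    port: url.port ? Number(url.port) : 80, // plain http defaults to 80
    path: url.pathname + url.search,        // path plus query string
    method: 'GET'
  };
}

const options = toRequestOptions(
  'http://local:8080/geoserver/wfs?service=WFS&version=1.0.0&request=GetFeature&typeName=layername&outputFormat=JSON&cql_filter=id=1'
);
```

The resulting object can be passed to http.request() exactly like the hand-written options above.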
I was also trying to get a GeoJSON via Node.js and took a different approach; I used Express and sequelizejs. This allowed me to get objects directly from Postgres / PostGIS. I needed to do a little formatting client-side to form a valid GeoJSON from the express response.
