NodeJs HTTP Request Not working with proxy (407) - node.js

I've been using the NPM Request package for years now and despite the fact that it's deprecated, it has always worked perfectly for my needs and I haven't ever had any issues up until today...
I am trying to do a regular GET request with a user:pass authenticated proxy and some additional headers - the same thing I've done a thousand times in the past, however this time the url is HTTP instead of HTTPS.
Because the link is HTTP, something goes wrong with my proxy authentication: the response code is 407 and the response body shows the error Cache Access Denied (ERR_CACHE_ACCESS_DENIED). This is where I'm confused, because I know for a fact that there shouldn't be anything wrong with my proxy authentication, since it's the same thing I've done for years.
Request Code:
const request = require('request').defaults({
timeout: 30000,
gzip: true,
forever: true
});
const cookieJar = request.jar();
const proxyUrl = "http://proxyUsername:proxyPassword@proxyDomain:proxyPort";
request({
method: "GET",
url: "http://mylink.com",
proxy: proxyUrl,
jar: cookieJar,
headers: {
"Proxy-Authorization": new Buffer('proxyUsername:proxyPassword').toString('base64'),
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
"Connection": "keep-alive",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36",
},
followAllRedirects: true
}, (err, resp, body) => {
if (err || resp.statusCode != 200) {
if (err) {
console.log(err);
} else {
console.log(resp.statusCode);
console.log(resp.body);
}
return;
}
console.log(resp.body);
});
Snippet of Response Body (Status Code 407):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta type="copyright" content="Copyright (C) 1996-2019 The Squid Software Foundation and contributors">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>ERROR: Cache Access Denied</title>
<style type="text/css"><!--
body
:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }
:lang(he) { direction: rtl; }
--></style>
</head><body id=ERR_CACHE_ACCESS_DENIED>
<div id="titles">
<h1>ERROR</h1>
<h2>Cache Access Denied.</h2>
</div>
<hr>
...
<p>Sorry, you are not currently allowed to request http://mylink.com from this cache until you have authenticated yourself.</p>
...
Other things to note:
I have also connected to the exact same proxy on my browser and gone to the same site and it works perfectly so it's not an issue with the proxy itself.
If I remove the proxy from the request it works perfectly, so all I can think of is that I have configured the request wrong for HTTP.
Now as I said this exact code works flawlessly with any HTTPS link so that's where I'm stumped. Any help would be appreciated!

Changed the Proxy-Authorization header to Proxy-Authenticate and that seemed to work perfectly. I have no clue why it requires that for HTTP but there you go, not much documentation on the matter...
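A note on why this might have worked: Proxy-Authenticate is the header a proxy sends back with a 407, while Proxy-Authorization is the one a client sends, and its value has the form "Basic <base64 of user:pass>". The header in the question carries only the base64 part, without the "Basic " scheme. One plausible explanation, which I have not verified against the request source, is that renaming the header simply stopped the malformed value from being sent, so the credentials in the proxy URL took effect instead. A minimal sketch of building a well-formed value (the function name basicProxyAuth is hypothetical):

```javascript
// Builds the value for a Proxy-Authorization request header (Basic scheme).
// The function name is hypothetical, not part of the request package.
function basicProxyAuth(username, password) {
  // Buffer.from replaces the deprecated new Buffer(...) constructor.
  const encoded = Buffer.from(username + ':' + password, 'utf8').toString('base64');
  return 'Basic ' + encoded;
}

// Assumed usage with the options from the question:
//   headers: { 'Proxy-Authorization': basicProxyAuth('proxyUsername', 'proxyPassword') }
console.log(basicProxyAuth('user', 'pass')); // Basic dXNlcjpwYXNz
```

If the proxy URL already contains the credentials, dropping the manual header entirely is probably the simpler option.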

Related

Does "request" follow redirects from meta refresh tags?

In my nodejs program, I'm using require('request'). It doesn't seem to be following redirects even though it should by default.
I even explicitly set the redirect flag (even though this should be set by default)
var options = {
url:url
, followRedirect: true
, headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36'
}
}
request(options, function (err, res, body) {...
For example, check out the site http://www.fanniemae.com/
which redirects to http://www.fanniemae.com/portal/index.html
Inside the /index.html, the html contains this
<meta http-equiv="REFRESH" content="0;url=/portal/index.html">
request doesn't seem to be following this meta tag redirect.
Is this normal? And how can I make it follow that redirect?
Request doesn't render pages as browsers do; it's just a way to make simple HTTP calls. It follows server-side redirects (3xx responses), but a meta refresh tag lives in the HTML body, so request can't understand this kind of redirect.
As a solution you could try something like PhantomJS (http://phantomjs.org/) and make it work with one of the workarounds mentioned here.
Scripts written for a Selenium server might also help you solve your problem.
Came across this post while having the same problem. I extracted the refresh URL and made another request to get the page content like this:
var regex = /<meta http-equiv="Refresh" CONTENT="1; URL=([^"]+)[^>]+>/;
var match = regex.exec(response.body);
if (match && match[1]) { // regex.exec returns null when nothing matches
request.get({
url: host + match[1],
}, function(error, response, body) {
console.log(error, response, body);
});
} else {
console.log('no meta redirect found :(');
}
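The regex above only matches one exact spelling of the tag (capital CONTENT, a delay of 1, "Refresh" with that casing), so it would miss the fanniemae example from the question. A slightly more tolerant sketch, assuming the tag is reasonably well-formed HTML; the helper name findMetaRefresh is made up for illustration, and a real HTML parser would be more robust than regexes:

```javascript
// Hypothetical helper: returns the absolute URL a meta refresh tag points to,
// or null if the page has no such tag.
function findMetaRefresh(html, baseUrl) {
  // Find a <meta> tag whose http-equiv is "refresh", in any letter case.
  const tag = /<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i.exec(html);
  if (!tag) return null;
  // Pull the target out of content="<seconds>;url=<target>".
  const content = /content\s*=\s*["']\s*\d+\s*;\s*url\s*=\s*([^"']+)["']/i.exec(tag[0]);
  if (!content) return null;
  // Resolve relative targets such as /portal/index.html against the page URL.
  return new URL(content[1].trim(), baseUrl).href;
}

const sample = '<meta http-equiv="REFRESH" content="0;url=/portal/index.html">';
console.log(findMetaRefresh(sample, 'http://www.fanniemae.com/'));
// http://www.fanniemae.com/portal/index.html
```

The returned URL can then be passed to a second request call, as in the snippet above.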

phantomjs + web fonts + font loader

I'm running phantomjs within a node.js environment, and it's going well. At the moment I'm just using local fonts, but want to get google web fonts working with phantomjs.
There are various conflicting and confusing reports out there about whether and how web fonts can be made to work with phantomjs. There are articles like this that contain outdated information with dead links. And posts like this that suggest that phantomjs 2.0 will or can support web fonts, others saying that it doesn't but 2.0.1 will. In this post there is a suggestion that webfonts do work in 2.0.
I've tried lots of options, including with phantomjs 2.0 and 2.0.1 binaries, but can't get it working. It may be that I'm loading the web fonts in my js using the web font loader using something along the following:
WebFont.load({
google: {
families: ['Droid Sans', 'Droid Serif']
},
loading: function() { console.log('loading'); },
active: function() {
console.log('active');
// hooray! can do stuff...
},
inactive: function() { console.log('inactive'); },
fontloading: function(familyName, fvd) { console.log('fontloading', familyName, fvd); },
fontactive: function(familyName, fvd) { console.log('fontactive', familyName, fvd); },
fontinactive: function(familyName, fvd) { console.log('fontinactive', familyName, fvd); }
});
But I'm always reaching the inactive branch, so the font load is never successful, even though the same code works fine in a browser (reaching the active branch).
In the font loader docs, it says:
If Web Font Loader determines that the current browser does not support @font-face, the inactive event will be triggered.
My suspicion is that web font loader is indeed determining that the browser (phantomjs) does not support this, hence always reaching inactive.
Anyone got phantomjs + web fonts + web font loader working?
What is the UA you are using? I think Web Font Loader uses UA to detect the support. Try a UA of Chrome 46 and then see if it works.
var webPage = require('webpage');
var page = webPage.create();
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36';
Not to be marked as correct, just expanding on the above answer. Since all phantomjs wrappers (like phridge and phantomjs-node) basically spawn a new phantomjs process, the result should be the same when run from a nodejs context.
phantomjs-webfonts.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>PhantomJS WebFontsTest</title>
</head>
<body>
<script src="https://ajax.googleapis.com/ajax/libs/webfont/1.5.18/webfont.js"></script>
<script>
WebFont.load({
google: {
families: ['Droid Sans', 'Droid Serif']
},
loading: function(){ console.log('WebFonts loading'); },
active: function(){ console.log('WebFonts active'); },
inactive: function(){ console.log('WebFonts inactive'); }
});
</script>
</body>
</html>
phantomjs-webfonts.js:
var page = require('webpage').create();
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36';
page.onConsoleMessage = function(msg, lineNum, sourceId) {
console.log('Console: ' + msg);
};
page.open('http://<server-address>/phantomjs-webfonts.html', function(status) {
console.log("Loading status: " + status);
});
Command:
phantomjs phantomjs-webfonts.js
Output:
Console: WebFonts loading
Console: WebFonts active
Loading status: success

Making a Post request to Github API for creating issue is not working

I have been trying to make this post request to the github api for the last couple of days, but unfortunately the response is coming back as "bad message"
here is the piece of code we are sending in the post request using https request in node -
This is the post data
var issueData = JSON.stringify({
"title":title,
"body":comment
});
This is the options file
var options = {
host: 'api.github.com',
path: '/repos/sohilpandya/katasohil/issues?access_token='+sessions.token,
headers: {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:24.0) Gecko/20100101 Firefox/24.0',
},
method: 'POST'
};
This is the https request
var requestaddIssue = https.request(options, function(responseFromIssues){
responseFromIssues.setEncoding('utf8');
responseFromIssues.on('data', function(chunk){
console.log('>>>>chunk>>>>>',chunk);
issueBody += chunk;
});
responseFromIssues.on('end',function(issueBody){
console.log(issueBody);
});
});
requestaddIssue.write(issueData);
requestaddIssue.end();
I have tried another approach where the authentication token for the user is in the header as
'Authentication': 'OAuth '+ sessions.token (where we are storing token inside sessions)
But the chunk response always seems to come back with the following in the console log.
{
"message": "Not Found",
"documentation_url": "https://developer.github.com/v3/issues/#create-an-issue"
}
I have tried the same in apigee and it seems to work OK and returns the correct response. Hoping someone can find the minor error in the code above that is causing this bad message error.
Apart from the issueBody variable not being defined in the snippets you posted, the code is correct. I tried it using a personal access token.
The error you get appears because the token needs a scope that is allowed to open issues.
I tried the repo and public_repo scopes and they are both working. Note that repo has access to private repositories. Here you can see the list of scopes.
If you're using OAuth, then you should have a URL looking like this:
https://github.com/login/oauth/authorize?client_id=<client-id>&scope=public_repo&redirect_uri=<redirect-uri>
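For anyone reading this later: the standard header name is Authorization (not Authentication), and as far as I know GitHub no longer accepts the token as an access_token query parameter, so the header is the safer route. Below is a rough sketch of the same request with the token in the header and issueBody declared before use. The helper names, the User-Agent string and the example arguments are all made up for illustration.

```javascript
const https = require('https');

// Hypothetical helper: builds the arguments for https.request without sending anything.
function buildCreateIssueRequest(owner, repo, token, title, comment) {
  const payload = JSON.stringify({ title: title, body: comment });
  const options = {
    host: 'api.github.com',
    path: '/repos/' + owner + '/' + repo + '/issues',
    method: 'POST',
    headers: {
      'User-Agent': 'issue-bot-example', // GitHub rejects requests without a User-Agent
      'Accept': 'application/vnd.github+json',
      'Authorization': 'token ' + token, // the token needs the repo or public_repo scope
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(payload)
    }
  };
  return { options: options, payload: payload };
}

// Hypothetical sender: collects the response body and passes it to done(err, status, body).
function createIssue(owner, repo, token, title, comment, done) {
  const built = buildCreateIssueRequest(owner, repo, token, title, comment);
  const req = https.request(built.options, function (res) {
    let issueBody = ''; // declared here, so the 'data' handler can append to it
    res.setEncoding('utf8');
    res.on('data', function (chunk) { issueBody += chunk; });
    res.on('end', function () { done(null, res.statusCode, issueBody); });
  });
  req.on('error', done);
  req.write(built.payload);
  req.end();
}
```

Note that the 'end' event passes no arguments, so the handler reads the accumulated issueBody from the enclosing scope rather than from a parameter.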

Node.js request.js HPE_INVALID_HEADER_TOKEN

I got desperate about one problem and I need some help...
I'm using node.js to crawl a list of websites, and some of them give me this error, for example:
http://www.fz-juelich.de/portal/DE/Home/home_node.html, Parse Error, HPE_INVALID_HEADER_TOKEN
request.get({
url: uri,
timeout: timeout,
headers: {
referer: domain
}
}, (error, response, body) => {
if (error)
console.log(error);
console.log(body);
});
though, curl -i --raw http://www.fz-juelich.de/portal/DE/Home/home_node.html
works just perfect
HTTP/1.1 404 Not Found
Server: Apache-Coyote/1.1
Cache-Control: no-cache
JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/cae:
Content-Type: text/html;charset=utf-8
Content-Language: de
Set-Cookie: JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/
Content-Length: 19677
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
Also I'm able to see this website in my chrome browser
Any ideas where I should dig to get rid of these errors?
I use quotes around the property names and that resolved it for me:
request.post(url,{
headers: {
'Authorization': 'Basic onEnAGrosEncodedBase64',
'Content-Type': 'application/x-www-form-urlencoded'
},
form: {
'grant_type': 'client_credentials'
}
})
I hope that can help someone ;)
At the end of this journey I no longer use node.js for crawling and parsing.
A Go crawler fits much better here: more flexibility in the HTTP library, and it is easier to write really concurrent stuff.
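For what it's worth, the curl output in the question hints at the cause: the line starting with JSESSIONID= is not a valid header, because everything before its colon contains characters (=, ;, space, /) that are not allowed in a header name. Node's HTTP parser is strict about this, while curl and browsers tend to be lenient. A small diagnostic sketch, assuming you have the raw response head as text (for example from curl -i --raw); the function name is made up, and the check covers header names only:

```javascript
// Characters allowed in a header name (the "token" rule in RFC 7230).
const HEADER_NAME = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Hypothetical helper: returns the raw header lines whose name is not a valid token.
function findInvalidHeaderLines(rawHead) {
  const lines = rawHead.split(/\r?\n/);
  const bad = [];
  // Skip the status line (index 0) and stop at the blank line that ends the head.
  for (let i = 1; i < lines.length && lines[i] !== ''; i++) {
    const colon = lines[i].indexOf(':');
    const name = colon === -1 ? lines[i] : lines[i].slice(0, colon);
    if (!HEADER_NAME.test(name)) bad.push(lines[i]);
  }
  return bad;
}

const head = [
  'HTTP/1.1 404 Not Found',
  'Server: Apache-Coyote/1.1',
  'Cache-Control: no-cache',
  'JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/cae:',
  'Content-Type: text/html;charset=utf-8',
  ''
].join('\r\n');

console.log(findInvalidHeaderLines(head));
```

This only tells you which line is malformed; the response itself still has to be fetched with a more tolerant client or fixed on the server side.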

KrakenJS: perform POST request over a controller ends with error

I'm using KrakenJS to build a web app. Since it is MVC, I'm implementing a REST service with a controller; here's some sample code:
//users can get data
app.get('myRoute', function (req, res) {
readData();
});
//users can send data
app.post('myRoute', function (req, res) {
writeData();
});
I can read data with no problems. But when I try dummy data insertion with POST requests, it ends up with this error:
Error:Forbidden
127.0.0.1 - - [Thu, 06 Feb 2014 00:11:30 GMT] "POST /myRoute HTTP/1.1" 500 374 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/32.0.1700.102 Chrome/32.0.1700.102 Safari/537.36"
How can I overcome this?
One thing is to make sure you're sending the correct CSRF Headers (http://krakenjs.com/#Security). If I remember correctly, by default Kraken expects those headers to be specified.
You can disable CSRF too and see if that fixes your problem. Since Kraken uses the Lusca module for CSRF, you can get information on how to disable/configure from here: https://github.com/paypal/lusca
I used a trick earlier in which you don't have to turn off csrf...
In your "index.dust" ->
<input id="csrfid" type="hidden" name="_csrf" value="{_csrf}">
In your "script.js" ->
var csrf = document.getElementById('csrfid').value;
$http({ method: 'POST',
url: 'http://localhost:8000/myRoute/',
data: { '_csrf': csrf, 'object': myObject }
}).success(function(result) {
//success handler
}).error(function(result) {
//error handler
});
I was using AngularJS, by the way.
As Dan said you can turn csrf off, but you may also want to consider using it, for the added security it brings.
Check out the shopping cart example for more info: https://github.com/lmarkus/Kraken_Example_Shopping_Cart
If you do not need csrf:
Placing this under middleware in your config.json and setting the values to false disables the csrf middleware, and your app will function as expected.
"middleware": {
"appsec": {
"priority": 110,
"module": {
"name": "lusca",
"arguments": [
{
"csrf": false,
"xframe": "SAMEORIGIN",
"p3p": false,
"csp": false
}
]
}
},
