phantomjs + web fonts + font loader - node.js

I'm running phantomjs within a node.js environment, and it's going well. At the moment I'm just using local fonts, but want to get google web fonts working with phantomjs.
There are various conflicting and confusing reports out there about whether and how web fonts can be made to work with PhantomJS. There are articles (like this one) that contain outdated information with dead links, and posts (like this one) suggesting that PhantomJS 2.0 will or can support web fonts, with others saying that it doesn't but that 2.0.1 will. In this post there is a suggestion that web fonts do work in 2.0.
I've tried lots of options, including the phantomjs 2.0 and 2.0.1 binaries, but can't get it working. It may be relevant that I'm loading the web fonts in my JS with the Web Font Loader, using something along the following lines:
WebFont.load({
    google: {
        families: ['Droid Sans', 'Droid Serif']
    },
    loading: function() { console.log('loading'); },
    active: function() {
        console.log('active');
        // hooray! can do stuff...
    },
    inactive: function() { console.log('inactive'); },
    fontloading: function(familyName, fvd) { console.log('fontloading', familyName, fvd); },
    fontactive: function(familyName, fvd) { console.log('fontactive', familyName, fvd); },
    fontinactive: function(familyName, fvd) { console.log('fontinactive', familyName, fvd); }
});
But I'm always reaching the inactive branch, so the font load is never successful... even though the same code works fine in a browser (reaching the active branch).
In the font loader docs, it says:
If Web Font Loader determines that the current browser does not support @font-face, the inactive event will be triggered.
My suspicion is that web font loader is indeed determining that the browser (phantomjs) does not support this, hence always reaching inactive.
Anyone got phantomjs + web fonts + web font loader working?

What UA (user agent) are you using? I think Web Font Loader uses the UA string to detect support. Try the UA of Chrome 46 and see if it works:
var webPage = require('webpage');
var page = webPage.create();
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36';

Not to be marked as correct, just expanding on the above answer. Since all phantomjs wrappers (like phridge and phantomjs-node) basically spawn a new phantomjs process, the result should be the same when run from a nodejs context.
phantomjs-webfonts.html:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>PhantomJS WebFontsTest</title>
</head>
<body>
    <script src="https://ajax.googleapis.com/ajax/libs/webfont/1.5.18/webfont.js"></script>
    <script>
        WebFont.load({
            google: {
                families: ['Droid Sans', 'Droid Serif']
            },
            loading: function(){ console.log('WebFonts loading'); },
            active: function(){ console.log('WebFonts active'); },
            inactive: function(){ console.log('WebFonts inactive'); }
        });
    </script>
</body>
</html>
phantomjs-webfonts.js:
var page = require('webpage').create();
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36';

page.onConsoleMessage = function(msg, lineNum, sourceId) {
    console.log('Console: ' + msg);
};

page.open('http://<server-address>/phantomjs-webfonts.html', function(status) {
    console.log("Loading status: " + status);
});
Command:
phantomjs phantomjs-webfonts.js
Output:
Console: WebFonts loading
Console: WebFonts active
Loading status: success
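For completeness, here is a minimal sketch (not part of the original answer) of kicking off the same script from Node.js by spawning a phantomjs process, which is essentially what the wrappers mentioned above do. It assumes the phantomjs binary is on your PATH and that phantomjs-webfonts.js sits in the working directory.

const { execFile } = require('child_process');

// Spawn phantomjs as a child process and relay its console output.
execFile('phantomjs', ['phantomjs-webfonts.js'], (err, stdout, stderr) => {
    if (err) {
        console.error('phantomjs failed:', stderr || err);
        return;
    }
    // Expected output includes "WebFonts active" and "Loading status: success".
    console.log(stdout);
});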

Related

NodeJs HTTP Request Not working with proxy (407)

I've been using the NPM Request package for years now and, despite the fact that it's deprecated, it has always worked perfectly for my needs; I haven't ever had any issues up until today...
I am trying to do a regular GET request with a user:pass authenticated proxy and some additional headers - the same thing I've done a thousand times in the past; however, this time the URL is HTTP instead of HTTPS.
Because the link is HTTP, something is interfering with my proxy authentication: the request returns a response code of 407 and the response body shows an error stating Cache Access Denied (ERR_CACHE_ACCESS_DENIED). This is where I'm confused about what to do, because I know for a fact that there shouldn't be anything wrong with my proxy authentication since it's the same setup I've used for years and years.
Request Code:
const request = require('request').defaults({
    timeout: 30000,
    gzip: true,
    forever: true
});

const cookieJar = request.jar();
const proxyUrl = "http://proxyUsername:proxyPassword@proxyDomain:proxyPort";

request({
    method: "GET",
    url: "http://mylink.com",
    proxy: proxyUrl,
    jar: cookieJar,
    headers: {
        "Proxy-Authorization": new Buffer('proxyUsername:proxyPassword').toString('base64'),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
        "Connection": "keep-alive",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36",
    },
    followAllRedirects: true
}, (err, resp, body) => {
    if (err || resp.statusCode != 200) {
        if (err) {
            console.log(err);
        } else {
            console.log(resp.statusCode);
            console.log(resp.body);
        }
        return;
    }
    console.log(resp.body);
});
Snippet of Response Body (Status Code 407):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta type="copyright" content="Copyright (C) 1996-2019 The Squid Software Foundation and contributors">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>ERROR: Cache Access Denied</title>
<style type="text/css"><!--
body
:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }
:lang(he) { direction: rtl; }
--></style>
</head><body id=ERR_CACHE_ACCESS_DENIED>
<div id="titles">
<h1>ERROR</h1>
<h2>Cache Access Denied.</h2>
</div>
<hr>
...
<p>Sorry, you are not currently allowed to request http://mylink.com from this cache until you have authenticated yourself.</p>
...
Other things to note:
I have also connected to the exact same proxy in my browser and gone to the same site, and it works perfectly, so it's not an issue with the proxy itself.
If I remove the proxy from the request it works perfectly, so all I can think is that I have configured the request wrong for HTTP.
Now, as I said, this exact code works flawlessly with any HTTPS link, so that's where I'm stumped. Any help would be appreciated!
I changed the Proxy-Authorization header to Proxy-Authenticate and that seemed to work perfectly. I have no clue why HTTP requires that, but there you go; there isn't much documentation on the matter...
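For clarity, a sketch of the request with just that header renamed (everything else as in the question; the credentials, proxy URL and target URL are placeholders):

request({
    method: "GET",
    url: "http://mylink.com",
    proxy: proxyUrl,
    jar: cookieJar,
    headers: {
        // Renamed from "Proxy-Authorization"; this resolved the 407 in this case.
        // Buffer.from replaces the deprecated new Buffer.
        "Proxy-Authenticate": Buffer.from('proxyUsername:proxyPassword').toString('base64'),
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36",
    },
    followAllRedirects: true
}, (err, resp, body) => {
    if (err) return console.log(err);
    console.log(resp.statusCode);
    console.log(body);
});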

Is there a programmatic way to change user agent in Cypress.io?

I have some ad calls that are only made on mobile devices. In Chrome, I can use Device Mode and simulate a mobile device, and the resulting ad call from the server is correctly tailored to mobile. I'm not sure how Chrome does this, except possibly by sending a different user agent.
In the Cypress.io documentation, it says the user agent can be changed in the configuration file (cypress.json). But I need to run a test for a desktop viewport and then a mobile viewport with a mobile user agent. Is there a way to change the user agent programmatically?
Update: according to https://github.com/cypress-io/cypress/issues/3873, since Cypress 3.3.0 it is possible to use the user-agent property in cy.request() and cy.visit().
If you need, for example, to set the userAgent to Googlebot:
cy.visit(url, {
    headers: {
        'user-agent': 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    }
});
Original answer before Cypress 3.3.0
before(() => {
    cy.visit(url, {
        onBeforeLoad: win => {
            Object.defineProperty(win.navigator, 'userAgent', {
                value: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
            });
        },
    });
});
Now Cypress supports passing the user agent in the headers for cy.visit() as well as cy.request():
it('Verify Social Sharing Meta Tags', () => {
    cy.visit(portalURL + '/whats_new/140', {
        headers: {
            'user-agent': 'LinkedInBot/1.0 (compatible; Mozilla/5.0; Apache-HttpClient +http://www.linkedin.com)',
        }
    });
    cy.document().get('head meta[name="og:type"]')
        .should('have.attr', 'content', 'website');
});
https://on.cypress.io/changelog#3-3-0
Update as of Aug 12, 2021:
It seems you can't change the user agent this way anymore; see https://docs.cypress.io/api/cypress-api/config#Notes
The other answers do not set the User-Agent header of the underlying HTTP request, just the userAgent property of win.navigator. To set the User-Agent header to a custom value for all HTTP requests, you can set the userAgent configuration option:
{
    // rest of your cypress.json...
    "userAgent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
}

PhantomJS, generated PDF file size is different on local in compare with aws-lambda

I am using PhantomJS to convert a web page to PDF.
When I run the code in my development environment the output file size is around a few KB, but when I run the same code on AWS Lambda the output file size is around a few MB.
I want the PDF generated on Lambda to be the same size as in my dev environment.
Here is the code I am using for the conversion:
var system = require('system');
var page = require('webpage').create();

page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36';

var url = system.args[1];
var file_name = system.args[2];
var margin = system.args[3];

page.paperSize = {
    format: 'A2',
    orientation: 'portrait',
    margin: JSON.parse(margin)
};

page.evaluate(function () {
    var style = document.createElement('style');
    style.innerHTML = '.card-padding{padding-top: 0;padding-bottom: 0;}';
    document.body.appendChild(style);
});

page.open(url, function start(status) {
    window.setTimeout(function () {
        page.render(file_name, {format: 'pdf'});
        system.stdout.write(file_name);
        phantom.exit();
    }, 5000);
});
I also tried to set the page dpi
page.open(url, function start(status) {
    window.setTimeout(function () {
        page.dpi = 72;
        page.render(file_name, {format: 'pdf'});
        system.stdout.write(file_name);
        phantom.exit();
    }, 5000);
});
but it didn't help.
You can find the project here
The solution is:
1. Create a .fonts folder.
2. Put the fonts you are using in the PDF into the .fonts folder.
3. Add an environment variable to the Lambda function: HOME = /var/task
This significantly reduces the file size.
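A minimal sketch of how that might look from a Node.js Lambda handler (hedged: the file names, the handler shape, and setting HOME in code rather than in the Lambda console are illustrative assumptions, not taken from the project above):

const { execFile } = require('child_process');

exports.handler = (event, context, callback) => {
    // Point fontconfig's HOME at the deployment package root so the bundled
    // /var/task/.fonts folder is picked up; this has the same effect as setting
    // HOME = /var/task in the Lambda environment configuration.
    process.env.HOME = '/var/task';

    // Hypothetical invocation of a bundled phantomjs binary and conversion script.
    execFile('/var/task/phantomjs', ['/var/task/convert.js', event.url, '/tmp/out.pdf', '{}'],
        (err, stdout) => {
            if (err) return callback(err);
            callback(null, { pdf: '/tmp/out.pdf' });
        });
};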
What was the issue:
PhantomJS v2.1.x on Debian and Ubuntu (at least) handles local fonts correctly, i.e. they are embedded in the PDF and therefore rendered on screen in high quality, the text is searchable and selectable, and the file size is kept low. For remote web fonts, the text is converted to outlines, the fonts are not embedded, the text isn't searchable or selectable, and the file size becomes huge for longer documents.

Does "request" follow redirects from meta refresh tags?

In my nodejs program, I'm using require('request'). It doesn't seem to be following redirects even though it should be by default.
I even explicitly set the redirect flag (even though this should be set by default):
var options = {
    url: url,
    followRedirect: true,
    headers: {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36'
    }
};

request(options, function (err, res, body) {...
For example, check out the site http://www.fanniemae.com/
which redirects to http://www.fanniemae.com/portal/index.html
Inside the /index.html, the html contains this
<meta http-equiv="REFRESH" content="0;url=/portal/index.html">
request doesn't seem to be following this meta tag redirect.
Is this normal? And how can I make it follow that redirect?
Request doesn't render pages as browsers do; it's just a way to make simple HTTP calls (redirects would work if the external service performed server-side redirects). That's why it can't understand this kind of redirect.
As a solution, you could try to use something like PhantomJS (http://phantomjs.org/) to make it work, with some of the workarounds mentioned here.
Or scripts written for a Selenium server might help you solve your problem.
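For example, a minimal PhantomJS sketch (illustrative, not from the original answer) that lets the zero-second meta refresh run and reports where the page ended up:

var page = require('webpage').create();

page.open('http://www.fanniemae.com/', function (status) {
    // PhantomJS executes the page like a browser, so the <meta http-equiv="REFRESH">
    // navigation fires on its own; give it a moment and then read the final URL.
    window.setTimeout(function () {
        console.log('Final URL: ' + page.url); // e.g. the /portal/index.html page
        phantom.exit();
    }, 2000);
});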
Came across this post while having the same problem. I extracted the refresh URL and made another request to get the page content like this:
// Note: this pattern is tailored to the exact markup of the page in question;
// adjust the casing/spacing (or add the /i flag) for other pages.
var regex = /<meta http-equiv="Refresh" CONTENT="1; URL=([^"]+)[^>]+>/;
var match = regex.exec(response.body);

if (match && match[1] !== undefined) { // regex.exec returns null when there is no match
    request.get({
        url: host + match[1],
    }, function (error, response, body) {
        console.log(error, response, body);
    });
} else {
    console.log('no meta redirect found :(');
}

Another Cross-XHR related question

I know that there's a bunch of questions about the "not allowed by Access-Control-Allow-Origin." error.
But I've tried some of them without success. :(
Some notes:
I'm trying to build a dev-tools-tab extension
I can touch flickr API like the example shows
I can't reach localhost
Already tried several permission wildcards
http://localhost/
http://*/
*://*/
Already tried packed and unpacked extensions
currently, manifest.json has
"version": "0.0.1",
"manifest_version": 2,
"devtools_page": "components/devtools.html",
"permissions": [
"http://*/"
]
devtools.html
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title></title>
</head>
<body>
    <script src="../js/devtools.js"></script>
</body>
</html>
and, devtools.js
(function (window) {
    "use strict";

    var xhr1, xhr2, url;

    xhr1 = new window.XMLHttpRequest();
    xhr2 = new window.XMLHttpRequest();

    xhr1.onreadystatechange = function () {
        if (this.readyState === 4) {
            console.log('flickr ok');
        }
    };

    xhr2.onreadystatechange = function () {
        console.log(this.readyState);
        if (this.readyState === 4) {
            console.log(this.responseText);
        }
    };

    url = 'https://secure.flickr.com/services/rest/?' +
        'method=flickr.photos.search&' +
        'api_key=90485e931f687a9b9c2a66bf58a3861a&' +
        'text=' + encodeURIComponent('cats') + '&' +
        'safe_search=1&' +
        'content_type=1&' +
        'sort=interestingness-desc&' +
        'per_page=20';

    xhr1.open('get', url, true);
    xhr1.send();

    url = 'http://apache.local';

    xhr2.open('get', url, true);
    xhr2.setRequestHeader('Origin', url);
    xhr2.send();
}(window));
Chrome console output:
1 devtools.js:12
Refused to set unsafe header "Origin" devtools.html:1
XMLHttpRequest cannot load http://apache.local/. Origin chrome-extension://nafbpegjhkifjgmlkjpaaglhdpjchlhk is not allowed by Access-Control-Allow-Origin. devtools.html:1
4 devtools.js:12
flickr ok devtools.js:8
Chrome version:
28.0.1500.20 dev
Thanks for any advice.
I've got it!
Actually, the problem is that I'm trying to perform XHR requests from the devtools page, and it seems it has no permission to bypass cross-origin access policies the way a popup page does.
Attempts from a devtools tab are also unsuccessful.
Edit:
It's related to the kind of page the script runs in, not to the wildcard permissions. As I've said, I've managed to perform queries against some domains without having them explicitly in my permissions array.
The problem really lies in the type of script that is running.
The same script, if used in a popup, worked fine. I then tried it as a background script, with success too! The problem I was facing is that the devtools_page and related pages don't have such permissions...
The APIs available to extension pages within the Developer Tools window include all devtools modules listed above and chrome.extension API. Other extension APIs are not available to the Developer Tools pages, but you may invoke them by sending a request to the background page of your extension, similarly to how it's done in the content scripts.
http://developer.chrome.com/extensions/devtools.html
Scripts at that level are denied cross-origin XHRs that aren't explicitly allowed.
I solved the problem by putting the requests in a background script and using the messaging API.
Thank you!
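A rough sketch of that setup (illustrative only; the file names are assumptions and the background page must be declared in manifest.json):

// devtools.js (devtools page): delegate the XHR to the background page.
chrome.runtime.sendMessage({ action: 'fetch', url: 'http://apache.local' }, function (response) {
    console.log('background returned:', response && response.body);
});

// background.js (background page): perform the XHR and send the result back.
chrome.runtime.onMessage.addListener(function (message, sender, sendResponse) {
    if (message.action !== 'fetch') { return; }
    var xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function () {
        if (this.readyState === 4) {
            sendResponse({ body: this.responseText });
        }
    };
    xhr.open('get', message.url, true);
    xhr.send();
    return true; // keep the channel open for the asynchronous sendResponse
});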
