How to modify/fake document.referrer in any existing headless browser? - node.js

The idea is to modify the document.referrer variable just before any JavaScript runs on a website loaded by a headless browser. Which browser to use is not important; I have tried PhantomJS and Zombie without any luck.
My research indicates that the current situation is:
PhantomJS - no, since it is a constant apparently taken from the Referer header, but even when this header is provided, .referrer is still an empty string.
Zombie - unknown.

You may try this in PhantomJS:
var webPage = require('webpage');
var dpage = webPage.create();
// Sent with every request made by this page, so set it before calling dpage.open().
dpage.customHeaders = {
  "Referer": "https://www.facebook.com"
};
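If the header alone does not change document.referrer, another approach that is sometimes tried is to redefine the property from a script that runs before the page's own scripts (in PhantomJS, the page.onInitialized callback combined with page.evaluate is the usual hook for that). Below is a minimal sketch of the redefinition step only. overrideReferrer and fakeDocument are hypothetical names, the object is a stand-in for a real document, and whether a given engine allows the property to be redefined on a real document is not verified here.

```javascript
// Hypothetical helper: redefine `referrer` on a document-like object so that
// reads return a fixed value.
function overrideReferrer(doc, fakeUrl) {
  Object.defineProperty(doc, 'referrer', {
    get: function () { return fakeUrl; },
    configurable: true
  });
  return doc;
}

// Stand-in for a real `document`, used only for illustration.
var fakeDocument = { referrer: '' };
overrideReferrer(fakeDocument, 'https://www.facebook.com');
console.log(fakeDocument.referrer);
```

In a real headless session the same function body would have to run inside the page context before any site script reads the property.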

Related

How can I know if browser is Chrome vs Firefox from web extension popup JavaScript?

I am using the chrome namespace for both Chrome and Firefox, but would like to know which browser is running the web extension.
Links to extension resources have different schemes in Chrome and Firefox.
const isFirefox = chrome.runtime.getURL('').startsWith('moz-extension://');
const isChrome = chrome.runtime.getURL('').startsWith('chrome-extension://');
Check chrome.app which is absent in Firefox:
const isFirefox = !chrome.app;
Check for browser which is absent in Chrome:
const isFirefox = window.browser && browser.runtime;
(the additional check is to avoid false positives on pages that have an element with id="browser" that creates a named property on window object for this element)
Use the asynchronous browser.runtime.getBrowserInfo.
P.S. navigator.userAgent may be changed during debugging in devtools when switching to device mode or via about:config option in Firefox so it's an unreliable source.
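The URL-scheme check from the first option can be wrapped in a small function that is easy to exercise outside an extension. This is only a sketch: browserFromExtensionUrl is a hypothetical name, and the sample URLs in the usage lines are made up for illustration.

```javascript
// Hypothetical helper: classify the browser from an extension resource URL,
// e.g. the value returned by chrome.runtime.getURL('').
function browserFromExtensionUrl(url) {
  if (url.indexOf('moz-extension://') === 0) return 'firefox';
  if (url.indexOf('chrome-extension://') === 0) return 'chrome';
  return 'unknown';
}

// Illustrative inputs only; real extension URLs contain the extension's id.
console.log(browserFromExtensionUrl('moz-extension://some-id/'));
console.log(browserFromExtensionUrl('chrome-extension://some-id/'));
```

Inside an extension the call would look like browserFromExtensionUrl(chrome.runtime.getURL('')).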
This is what I do in my own extensions to check for Firefox (FF) vs Chrome:
const FF = typeof browser !== 'undefined';
Update: (1)
Here is an explanation .....
I am using the chrome namespace for both Chrome and Firefox, but would
like to know which browser is running the web extension.
As far as I understand, the question relates to extension code and not content code. I use the above code in the background script of a "firefox-webextensions" or "google-chrome-extension" extension.
From then on the code would be:
if (FF) {...}
else { .... }
Once established, the content script has no bearing on it.
In case a developer somehow decides to use id="browser", a further step could be added so that the result is a boolean true|false, e.g.
const FF = typeof browser !== 'undefined' && !!browser.runtime;
Worth noting that the following returns an object or undefined, not a boolean:
const isFirefox = window.browser && browser.runtime;
While it works fine in if() conditionals, it won't work in other situations where a strict boolean is required (e.g. switch).
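To illustrate the difference the !! makes, here is a small sketch that takes the global object as a parameter so it can be run outside an extension. isFirefoxLike and the three sample objects are hypothetical and only imitate the shapes discussed above.

```javascript
// Hypothetical helper: same expression as the FF constant above, but reading
// from a passed-in object instead of the real global scope.
function isFirefoxLike(globalObj) {
  return typeof globalObj.browser !== 'undefined' && !!globalObj.browser.runtime;
}

var firefoxLike = { browser: { runtime: {} } }; // WebExtension-style global
var pageWithIdBrowser = { browser: {} };        // e.g. an element with id="browser"
var chromeLike = {};                            // no `browser` global at all

console.log(isFirefoxLike(firefoxLike));       // true
console.log(isFirefoxLike(pageWithIdBrowser)); // false
console.log(isFirefoxLike(chromeLike));        // false
```

Every return value is a strict boolean, so it can be compared with === or used as a switch case.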
(1) Note: Marking down answers, discourages people from spending time and effort in answering questions in future.

Why is Chrome treating this file as document, while Firefox as Image?

I have a download GET endpoint in my express app. For now it simply reads a file from the file system and streams it after setting some headers.
When I open the endpoint in Chrome, I can see that the response is treated as a "document", while in Firefox it is treated as type png.
I can't seem to understand why it is being treated differently.
Chrome: title bar - "download"
Firefox: title bar - "image name"
In Chrome, this also leads to no caching of the image if I refresh the address bar.
In Firefox it is being cached just fine.
This is my express code:
app.get("/download", function(req, res) {
  let file = `${__dirname}/graph-colors.png`;
  var mimetype = "image/png";
  res.set("Content-Type", mimetype);
  res.set("Cache-Control", "public, max-age=1000");
  res.set("Content-Disposition", "inline");
  res.set("Vary", "Origin");
  var filestream = fs.createReadStream(file);
  filestream.pipe(res);
});
Also attaching images for Browser network tabs.
This all comes down to Chrome's behavior; you can test it on another site, for example with Example.png on Wikipedia.
Chrome always treats whatever you open from the address bar as a document, regardless of what it really is. You can even load a CSS file and it will still be reported as a document.
As for the title, it reads "download" because your path is /download; according to this SO thread, you cannot change it.
As for caching, Chrome apparently ignores the cache when you reload anything, page or image. If you try the Wikipedia Example.png, you will get a 304 instead of "(from cache)". (A 304 means the request was sent and the server validated it using ETag/If-None-Match or a similar mechanism.)
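The 304 mechanism mentioned above can be sketched as a tiny decision function. This is a simplified illustration, not what Express or Wikipedia's servers actually run: statusForConditionalGet is a hypothetical name, the ETag values are made up, and real servers also handle lists of tags, weak validators and "*".

```javascript
// Hypothetical helper: decide which status a server would return for a
// conditional GET, given the request headers and the resource's current ETag.
function statusForConditionalGet(requestHeaders, currentEtag) {
  var ifNoneMatch = requestHeaders['if-none-match'];
  if (ifNoneMatch && ifNoneMatch === currentEtag) {
    return 304; // Not Modified: the browser reuses its cached copy
  }
  return 200;   // full response with the body
}

// First visit: no validator is sent, so the full file is returned.
console.log(statusForConditionalGet({}, '"abc123"'));
// Reload: the browser revalidates with the ETag it stored earlier.
console.log(statusForConditionalGet({ 'if-none-match': '"abc123"' }, '"abc123"'));
```

With a streamed response like the one in the question, no ETag is set by the handler shown, so a reload has nothing to validate against unless the handler adds one.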

Selenium firefox webdriver: set download.dir for a pdf

I tried several solutions I found, but none of them really works or they are simply outdated.
Here is my webdriver profile:
let firefox = require('selenium-webdriver/firefox');
let profile = new firefox.Profile();
profile.setPreference("pdfjs.disabled", true);
profile.setPreference("browser.download.dir", 'C:\\MYDIR');
profile.setPreference("browser.helperApps.neverAsk.saveToDisk", "application/pdf","application/x-pdf", "application/acrobat", "applications/vnd.pdf", "text/pdf", "text/x-pdf", "application/vnd.cups-pdf");
I simply want to download a file and set the destination path. It looks like browser.download.dir is ignored.
That's the way I download the file:
function _getDoc(i){
  driver.sleep(1000)
    .then(function(){
      driver.get(`http://mysite/pdf_showcase/${i}`);
      driver.wait(until.titleIs('here my pdf'), 5000);
    })
}
for(let i=1;i<5;i++){
  _getDoc(i);
}
The page contains an iframe with a PDF. I can get its src attribute, but with the iframe and pdfjs.disabled=true, simply visiting the page with driver.get() triggers the download (so I'm OK with it).
The only problem is that the download directory is ignored and the file is saved in Firefox's default download directory.
Side question: if I wrap _getDoc() in a for loop over the parameter i, how can I be sure I won't flood the server? If I use the same driver instance (as most people do), are the requests sequential?
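Two things in the profile above may be worth checking. Firefox only honors browser.download.dir when browser.download.folderList is set to 2 (0 is the desktop, 1 is the default Downloads folder), and browser.helperApps.neverAsk.saveToDisk expects a single comma-separated string, whereas the call above passes each MIME type as a separate argument. The sketch below builds such a preference map as a plain object; buildPdfDownloadPrefs is a hypothetical helper, and this may not be the only reason the directory is ignored.

```javascript
// Hypothetical helper: build the preference map as a plain object so it can be
// inspected before being applied with profile.setPreference(key, value).
function buildPdfDownloadPrefs(downloadDir, mimeTypes) {
  return {
    'pdfjs.disabled': true,
    // 2 = use the custom directory given in browser.download.dir
    'browser.download.folderList': 2,
    'browser.download.dir': downloadDir,
    // One comma-separated string, not several arguments
    'browser.helperApps.neverAsk.saveToDisk': mimeTypes.join(',')
  };
}

var prefs = buildPdfDownloadPrefs('C:\\MYDIR', ['application/pdf', 'application/x-pdf']);
Object.keys(prefs).forEach(function (key) {
  // With a real profile this would be: profile.setPreference(key, prefs[key]);
  console.log(key + ' = ' + prefs[key]);
});
```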

Check if the browser is Firefox

I need to know if the browser running my page is Firefox. I came across the code below:
var isGecko = (navigator.product == 'Gecko');
but this is true for Firefox and Safari.
Only Firefox has the string "Firefox" in the user agent, so it is as easy as
var isFirefox = (navigator.userAgent.indexOf('Firefox') !== -1);
Edit: yes, Mozilla discourages it
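As a minimal sketch, the same check can be written as a function that takes the user agent string as an argument, which makes it easy to try against sample strings. isFirefoxUserAgent is a hypothetical name and the two strings below are illustrative samples only; as noted above, user agent sniffing is discouraged and the string can be changed by the user.

```javascript
// Hypothetical helper: true when the user agent string contains "Firefox".
function isFirefoxUserAgent(userAgent) {
  return userAgent.indexOf('Firefox') !== -1;
}

// Illustrative sample strings, not an exhaustive list.
var firefoxUA = 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0';
var safariUA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15';

console.log(isFirefoxUserAgent(firefoxUA)); // true
console.log(isFirefoxUserAgent(safariUA));  // false
```

In a page the call would be isFirefoxUserAgent(navigator.userAgent).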

Sammy.js with Knockout.js Not Running Route With Every URL Change

I have a single Sammy route that recognizes an arbitrary number of parameters. The route looks like this:
get(/^\/(?:\?[^#]*)?#page\/?((?:[^\:\/]+\:[^\:\/]+\/?)*)$/g, function() {
  var params = {};
  var splat = this.params.splat[0];
  var re = /([^\:\/]+)\:([^\:\/]+)/g;
  match = true
  while(match = re.exec(splat)) {
    params[match[1]] = match[2];
  }
  self.loadData(params);
});
This code works. What it does is it recognizes routes of the pattern #page/param1:value1/param2:value2/ for an arbitrary number of parameters. My loadData function has default values for many of these parameters. I'm confident there isn't a problem with the actual loading of the pages, since it works 100% on many computers in many browsers. However, it has weird behavior on my Android's browser and on my friend's Mac's Safari and Chrome (works on my PC's Chrome). I've noticed that these are Webkit browsers.
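For readers following along, the parameter-parsing loop can be tried on its own, outside Sammy. This is a sketch only: parseRouteParams is a hypothetical name, and the input string stands in for the captured part of the URL that the route passes as this.params.splat[0].

```javascript
// Hypothetical helper mirroring the loop in the route handler: turns
// "name:value/name:value/" into an object of name/value pairs.
function parseRouteParams(splat) {
  var params = {};
  var re = /([^\:\/]+)\:([^\:\/]+)/g;
  var match;
  while ((match = re.exec(splat)) !== null) {
    params[match[1]] = match[2];
  }
  return params;
}

var parsed = parseRouteParams('param1:value1/param2:value2/');
console.log(parsed.param1, parsed.param2); // value1 value2
```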
The behavior is that the route runs correctly for the first URL change, then won't for the next URL change (although the URL in the browser bar does indeed always change), then it'll work again for the third one, and won't for the fourth. That is, it works every other time. This seems like very strange behavior to me, and I'm at a loss as to how to debug this. For certain links, I was able to run a hack such that on click I set the window location to the URL and forcefully run the sammy code with runRoute('get', url);. It's impractical to have to add this for every click event on the page, and that doesn't really account for all URL changes anyway. Is there something I can do to debug why my route isn't being run every time the URL is changing?
For those of you who encounter similar behavior, on every other click in the above-mentioned browsers, this.params.splat was undefined. It's supposed to be set to the matched part of the URL (e.g. /#page/param1:value1/).
The hack I came up with to deal with this is to add this to the top of the get route:
if(this.params.splat === undefined) {
  app.unload().run();
  return;
}
This doesn't get to the root of the problem, it's just a hack that allows it to re-run the routes so that params.splat isn't undefined the next time through. If anyone has more information on what is going on, I'd be interested.
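One thing that may be worth checking, offered as an assumption and not as a confirmed diagnosis: the route pattern above carries the g flag. In JavaScript, a global regular expression keeps its position in lastIndex between calls to exec() or test(), so reusing the same RegExp object on the same string succeeds and fails alternately. If the router reuses the route's RegExp object for each URL change (not verified against Sammy's source here), that would produce an "every other time" pattern. The sketch below uses a deliberately small stand-in pattern, not the original route.

```javascript
// A deliberately small pattern carrying the same `g` flag as the route above.
var route = /^#page\/?$/g;

var results = [];
for (var i = 0; i < 4; i++) {
  // test() on a global regex starts at route.lastIndex, which a successful
  // match leaves at the end of the match and a failed match resets to 0.
  results.push(route.test('#page/'));
}
console.log(results); // [ true, false, true, false ]
```

If this turns out to be relevant, trying the route without the g flag would be a quick experiment.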
