chrome.tabs.executeScript is unreliable - google-chrome-extension

I'm trying to inject a content script dynamically, so I'm using the documented chrome.tabs.executeScript method for this.
However, unlike embedded content scripts (defined in the manifest), the dynamic script runs only intermittently, while I need to be sure it runs every time.
Basically, I listen for tab update events in the background script and execute the dynamic content script on every "loading" event.
What I've noticed is that the behavior seems to be connected with page/script loading timing: if page loading completes before the script execution, the script won't run; otherwise it seems to work as expected.
This is just a guess based on observation, so if you have any other ideas about what is going on here, feel free to share your thoughts.
Is there any way to ensure the dynamic script executes 100% of the time, regardless of circumstances?
Here is the log sequence in which script doesn't run:
loading: changeInfo = {status: "loading"}
loading: start. sender.tab.id = 1454
loading: start script loading
loading: code size = 4190306 bytes
loading: changeInfo = {status: "complete"} <- page loading completed
loading: time to execute script = 945 <- script execution completed
And here is the code snippet:
chrome.tabs.onUpdated.addListener(function (tabId, changeInfo, tab) {
console.log("loading: changeInfo = ", changeInfo)
if (changeInfo.status !== "loading") {
console.log("loading: skip");
return;
}
console.log("loading: start. sender.tab.id = ", tabId)
console.log("loading: start script loading")
var timerStart = Date.now();
console.log("loading: code size = ", byteCount(scriptCode), " bytes")
chrome.tabs.executeScript(tabId,
{
code: scriptCode,
runAt: "document_idle"
}, function (response) {
console.log("loading: time to execute script = ", Date.now() - timerStart)
var err = chrome.runtime.lastError;
console.log("loading: response = ", response, ", err = ", err)
});
});
There are also no errors in the log.
One more thing to add: if I wrap chrome.tabs.executeScript in setTimeout with something like a 2000 ms delay, the script never runs at all, which suggests a timing issue.

OK, that was dumb. The problem turned out to be in the dynamic script code itself: the main logic was executed in the onload callback, so it never ran if the script was injected after the load event had already fired:
window.addEventListener("load", mainLogic, false);
Removing the callback and running mainLogic() directly fixed the issue.
I'll leave it here in case someone makes the same mistake I did.
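A slightly more defensive variant, sketched here on the assumption that the injected script may arrive either before or after the load event, checks document.readyState first. The helper name runWhenLoaded is hypothetical, and the window object is passed in as a parameter only so the snippet can be exercised outside a browser.

```javascript
// Hypothetical helper: run `mainLogic` immediately if the page has already
// finished loading, otherwise wait for the "load" event.
function runWhenLoaded(win, mainLogic) {
  if (win.document.readyState === "complete") {
    // The load event has already fired, so a listener would never be called.
    mainLogic();
  } else {
    win.addEventListener("load", mainLogic, false);
  }
}

// In the injected script this would be called as:
// runWhenLoaded(window, mainLogic);
```

Either branch runs mainLogic exactly once, so the script behaves the same no matter when executeScript delivers it.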


Firefox doesn't wait for a page load Webdriverio

I am trying to run my tests using Selenium and have just encountered a problem. My tests were written for the Chrome browser. Now I have been trying to run the same tests in Firefox, but they fail.
I started to investigate the problem and found out that Firefox doesn't wait until the page is fully loaded. Chrome works perfectly.
I am running Selenium in docker containers.
Here is my script
storeSearch(info) {
let that = this;
return new Promise(function (resolve, reject) {
browserClient.init()
.url("http://somewhere.com")
.selectByVisibleText("#store","Tech")
// Redirect to a new page
.setValue("input[name='search']", info.searchCriteria)
.selectByValue(".featured", 'MacBook')
.click("button[name='info']")
.element('.popup')
.then(function (element) {
if (element.state === 'success') {
}
});
});
}
It doesn't even try to select a store type from the select (.selectByVisibleText("#store","Tech")) and just throws an exception:
"An element could not be located on the page using the given search
parameters (\"input[name='search']\").",
I have tried adding timeouts, but that doesn't work either and gives me an error.
browserClient.init()
.url("http://somewhere.com")
.timeouts('pageLoad', 100000)
.selectByVisibleText("#store","Tech")
The following error is thrown.
"Unknown wait type: pageLoad\nBuild info: version: '3.4.0', revision:
'unknown', time: 'unknown'\nSystem info: host: 'ef7581676ebb', ip:
'172.17.0.3', os.name: 'Linux', os.arch: 'amd64', os.version:
'4.9.27-moby', java.version: '1.8.0_121'\nDriver info: driver.version:
unknown
I have been trying to solve this problem for two days, but no luck so far.
Could someone help? Maybe you have some idea what could be causing the problem?
Thanks.
UPDATE
.url("http://somewhere.com")
.pause(2000)
.selectByVisibleText("#store","Tech")
If I add some pause statements it works, but this is really bad and not what I expect from this framework. Chrome works perfectly: it waits until the loading bar has finished and only then performs actions.
I guess the problem is in geckodriver; I have tested the same flow in Python and Java, and the behavior is exactly the same.
I am experiencing the exact behavior you detailed above: all-green/passing test cases in Chrome, but a different story on Firefox.
First off, never use timeouts or pause in your test cases unless you are debugging. In that case, chaining a .debug() before your failing step/command will actually do more good.
I wrapped all my WDIO commands in waitUntil() and afterwards I saw green in Firefox as well. See your code below:
storeSearch(info) {
let that = this;
return new Promise(function (resolve, reject) {
browserClient.init()
.url("http://somewhere.com")
.waitUntil(function() {
return browser
.isExisting("#store");
}, yourTimeout, "Your custom error msg for this step")
.selectByVisibleText("#store","Tech")
// Redirect to a new page
.waitUntil(function() {
return browser
.setValue("input[name='search']", info.searchCriteria);
}, yourTimeout, "Your custom error msg for this step")
.waitUntil(function() {
return browser
.selectByValue(".featured", 'MacBook');
}, yourTimeout, "Your custom error msg for this step")
.waitUntil(function() {
return browser
.click("button[name='info']");
}, yourTimeout, "Your custom error msg for this step")
.waitUntil(function() {
return browser
.isExisting(".popup");
}, yourTimeout, "Your custom error msg for this step")
.element('.popup')
.then(function (element) {
assert.equal(element.state,'success');
});
});
}
It's not pretty, but it did the job for me. Hopefully for you as well.
Enhancement: If you plan on actually building and maintaining a strong automation harness using WDIO, you should consider creating custom commands that package the waiting and make your test cases more readable. See below an example for .click():
commands.js:
module.exports = (function() {
browser.addCommand('cwClick', function(element) {
return browser
.waitUntil(function() {
return browser.isExisting(element);
}, timeout, "Oups! An error occured.\nReason: element(" + element + ") does not exist")
.waitUntil(function() {
return browser.isVisible(element);
}, timeout, "Oups! An error occured.\nReason: element(" + element + ") is not visible")
.waitUntil(function() {
return browser.click(element);
}, timeout, "Oups! An error occured.\nReason: element(" + element + ") could not be clicked")
});
})();
All that's left to do, is import your module via require() in your test case file: var commands = require('./<pathToCommandsFile>/commands.js');
You can use JavaScript to wait until the page reports that it has finished loading.
In C# it looks like this:
public void WaitForPage(IWebDriver driver, int timeout = 30)
{
IWait<IWebDriver> wait = new WebDriverWait(driver, TimeSpan.FromSeconds(timeout));
wait.Until(driver1 => ((IJavaScriptExecutor)driver).ExecuteScript("return document.readyState").Equals("complete"));
}
I've run into quite a few issues like this with Selenium in Python and C#, unfortunately both in the Chrome and Firefox webdrivers. The problem seems to be that the code gets ahead of itself and tries to reference elements before they even exist/are visible on the page. The solution I found in Python at least was to use the Wait functions like this: http://selenium-python.readthedocs.io/waits.html
If there isn't an equivalent in Node, you might have to write a custom method that checks the page for the presence of the element at regular intervals until a timeout expires.
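Such a custom polling method could be sketched as follows. This is a generic helper and not part of any Selenium or WebdriverIO API; the name waitFor and its parameters are assumptions made for illustration.

```javascript
// Hypothetical helper: poll `condition` every `intervalMs` until it returns a
// truthy value or `timeoutMs` has elapsed. `condition` may return a plain
// value or a Promise.
function waitFor(condition, timeoutMs, intervalMs) {
  const deadline = Date.now() + timeoutMs;
  return new Promise(function (resolve, reject) {
    function check() {
      Promise.resolve()
        .then(condition)
        .then(function (result) {
          if (result) {
            resolve(result);
          } else if (Date.now() >= deadline) {
            reject(new Error("Condition not met within " + timeoutMs + " ms"));
          } else {
            setTimeout(check, intervalMs);
          }
        })
        .catch(reject); // a throwing condition rejects the wait
    }
    check();
  });
}
```

The condition would typically wrap whatever existence check the driver offers, such as the isExisting call used earlier in this thread.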

How to manage a queue in nodejs?

I have written a script in Node.js that takes screenshots of websites (using SlimerJS). The script takes around 10-20 seconds to complete; the problem is that the server is stalled until the script has finished.
app.get('/screenshot', function (req, res, next) {
var url = req.query.url;
assert(url, "query param 'url' needed");
// actual saving happens here
var fileName = URL.parse(url).hostname + '_' + Date.now() + '.png';
var command = 'xvfb-run -a -n 5 node slimerScript.js '+ url + ' '+ fileName;
exec(command, function (err, stdout, stderror) {
if(err){ return next(err); }
if(stderror && (stderror.indexOf('error')!= -1) ){ return next(new Error('Error occurred!')); }
return res.send({
status: true,
data: {
fileName: fileName,
url: "http://"+path.join(req.headers.host,'screenshots', fileName)
}
});
})
});
Since the script spawns a Firefox browser in memory and loads the website, RAM usage can spike up to 600-700 MB, so I cannot run several of these commands in parallel, as RAM is expensive on servers.
Is it possible to queue the incoming requests and execute them in FIFO order?
I tried checking packages like kue, bull and bee-queue, but I think they all assume the job list is already known before the queue is started, whereas my job list depends on users using the site. I also want to tell people that they are in a queue and need to wait for their turn. Is this possible with the packages mentioned above?
If I were doing something similar, I would try these steps.
1. Keep an array (a queue) of request info. When a request comes in, store its info in the array and send a message back to the user telling them they are in the queue, or that the server is busy if there are already too many requests.
2. Do the screenshot jobs asynchronously, but not all at the same time. Start a job when a new request arrives and the queue is empty, and start the next one recursively when the previous one finishes.
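// `a` is the queue array from step 1; `execTheJob(job, done)` runs one job
function doScreenShot(){
if(a.length > 0){
execTheJob(a[0], () => {
// after finishing the job, remove it from the queue and start the next one
a.shift();
doScreenShot()
})
}
}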
3. Notify the user that the job has finished, via polling or some other mechanism.
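The three steps above could be combined into a small in-memory queue like the sketch below. The names createJobQueue and runJob are hypothetical; runJob stands in for whatever actually takes the screenshot and is assumed to call done() when it finishes.

```javascript
// Hypothetical in-memory FIFO queue that runs one job at a time.
// `runJob(job, done)` is an assumed worker that calls `done()` when finished.
function createJobQueue(runJob) {
  const pending = [];
  let busy = false;

  function next() {
    if (busy || pending.length === 0) {
      return;
    }
    busy = true;
    const job = pending.shift();
    runJob(job, function () {
      busy = false;
      next(); // start the following job, if any
    });
  }

  return {
    // Adds a job and returns how many jobs are ahead of it (0 = starts now).
    enqueue: function (job) {
      const ahead = pending.length + (busy ? 1 : 0);
      pending.push(job);
      next();
      return ahead;
    }
  };
}
```

The value returned by enqueue can be sent back in the HTTP response right away, so the user learns their position in the queue and can poll for the finished file later.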

XPages Standby control breaks Scrolling in Bootstrap

I have an XPages Bootstrap based application where some functions take a little time to process so I've added the OpenNTF Standby control (https://openntf.org/XSnippets.nsf/snippet.xsp?id=standby-dialog-custom-control) to the page to show an indicator during the Partial refresh.
Since dropping the control on the page after the partial refresh runs it seems to break the page's ability to scroll. I can scroll before the partial refresh but not after.
Anyone know how to fix this?
It sounds like you're getting a global JavaScript error (I'm curious about the stack trace in your browser's console), which prevents further execution. In development only (don't force error handling like this in production), you could load a script as early as possible that handles the error globally and returns true or, more importantly, dumps out more information about what is throwing the error.
For more on the global error event handler, here's a link to the corresponding page on MDN.
function ignoreError(msg, url, lineNo, columnNo, error) {
var string = msg.toLowerCase();
var substring = "script error";
if (string.indexOf(substring) > -1){
alert('Script Error: See Browser Console for Detail');
} else {
var message = [
'Message: ' + msg,
'URL: ' + url,
'Line: ' + lineNo,
'Column: ' + columnNo,
'Error object: ' + JSON.stringify(error)
].join(' - ');
alert(message);
}
return true; // normally, such a global exception ought to return false
}
window.onerror = ignoreError; // assign the handler itself; do not call it here

phantomJS scraping with breaks not working

I'm trying to scrape some URLs from a web service. It works perfectly, but I need to scrape something like 10,000 pages from the same web service.
I do this by creating multiple PhantomJS processes, each of which opens and evaluates a different URL (it's the same service; all I change is one parameter in the URL).
The problem is that I don't want to open 10,000 pages at once, since I don't want their service to crash, and I don't want my server to crash either.
I'm trying to write some logic that opens, evaluates and inserts into the DB about 10 pages, and then sleeps for 1 minute or so.
Let's say this is what I have now:
var numOfRequests = 10000; //Total requests
for (var dataIndex = 0; dataIndex < numOfRequests; dataIndex++) {
phantom.create({'port' : freeport}, function(ph) {
ph.createPage(function(page) {
page.open("http://..." + data[dataIncFirstPage], function(status) {
I want to insert somewhere in the middle something like:
if(dataIndex % 10 == 0){
sleep(60); //I can use the sleep module
}
Everywhere I try to place the sleep call, the program crashes, freezes or loops forever.
Any idea what I should try?
I've tried placing the above code as the first line inside the for loop, but this doesn't work (maybe because of the callback functions that are waiting to fire).
Placing it inside the phantom.create() callback doesn't work either.
Realize that Node.js runs asynchronously, and in your for-loop each method call is executed one after the other. The phantom.create call returns almost immediately, and then the next iteration of the for-loop kicks in.
To answer your question, you want the sleep command after the phantom.create block, still inside the for-loop. Like this:
var numOfRequests = 10000; // Total requests
for( var dataIndex = 0; dataIndex < numOfRequests; dataIndex++ ) {
phantom.create( { 'port' : freeport }, function( ph ) {
// ..whatever in here
} );
if(dataIndex % 10 == 0){
sleep(60); //I can use the sleep module
}
}
Also, consider using a package to help with these control flow issues. Async is a good one, and has a method, eachLimit that will concurrently run a number of processes, up to a limit. Handy! You will need to create an input object array for each iteration you wish to run, like this:
var dataInputs = [ { id: 0, data: "/abc"}, { id : 1, data : "/def"} ];
function processPhantom( dataItem, callback ){
console.log("Starting processing for " + JSON.stringify( dataItem ) );
phantom.create( { 'port' : freeport }, function( ph ) {
// ..whatever in here.
//When done, in inner-most callback, call:
//callback(null); //let the next parallel items into the queue
//or
//callback( new Error("Something went wrong") ); //break the processing
} );
}
async.eachLimit( dataInputs, 10, processPhantom, function( err ){
//Can check for err.
//It is here that everything is finished.
console.log("Finished with async.eachLimit");
});
Sleeping for a minute isn't a bad idea, but in groups of 10 that will take you 1000 minutes, which is over 16 hours! It would be more convenient to start a new request only when there is space in your queue. Be sure to log which requests are in progress and which have completed.
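If a pause between groups is still wanted, it can be done without blocking the event loop. The sketch below uses plain Promises and setTimeout instead of a sleep module; processInBatches and worker are hypothetical names, and worker is assumed to return a Promise that settles once a page has been opened, evaluated and stored.

```javascript
// Hypothetical helper: process `items` in groups of `batchSize`, waiting
// `pauseMs` between groups without blocking the event loop.
// `worker(item)` is assumed to return a Promise for one unit of work.
function processInBatches(items, batchSize, pauseMs, worker) {
  function pause() {
    return new Promise(function (resolve) {
      setTimeout(resolve, pauseMs);
    });
  }

  function runFrom(start) {
    if (start >= items.length) {
      return Promise.resolve();
    }
    const batch = items.slice(start, start + batchSize);
    const running = batch.map(function (item) {
      return worker(item);
    });
    return Promise.all(running).then(function () {
      const nextStart = start + batchSize;
      // Only pause when another batch is still to come.
      if (nextStart < items.length) {
        return pause().then(function () {
          return runFrom(nextStart);
        });
      }
    });
  }

  return runFrom(0);
}
```

Because the pause is a timer and not a blocking sleep, the callbacks of the current batch can still fire while the loop is waiting.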

Node.js Fibers and code scheduled with setTimeout leads to crash

I am using Fibers to solve a problem regarding how to yield control to the event loop in Node.js, pausing the execution of some synchronous code. This works well, mostly, but I encountered a strange crashing bug, and I am not able to find the reason for it.
Setup
There are three processes:
A main server process, which receives code to instrument and execute. When it receives new code to execute, it uses child_process.fork() to spawn
An execution process. This instruments the received code to call a specific callback from time to time to report what happened in the executed code. It then executes the code in a sandbox created by using Contextify. Sometimes these reports include incorrect location information about the line and column in the code where something happens. In that case a source map is needed to map locations in the instrumented code to locations in the original code. But calculating this source map takes a significant amount of time. Therefore, before starting the execution, the execution process spawns
A source map calculation process. This just takes the original code and the instrumented code and calculates a source map. When it's done it sends the finished source map to the execution process and exits.
If the execution process needs the source map in a callback before the execution is finished, it will use Fiber.yield() to yield control to the event loop and thus pause the execution. When the execution process then receives the data it continues the execution using pausedFiber.run().
This is implemented like so:
// server.js / main process
function executeCode(codeToExecute) {
var runtime = fork("./runtime");
runtime.on("uncaught exception", function (exception) {
console.log("An uncaught exception occured in process with id " + id + ": ", exception);
console.log(exception.stack);
});
runtime.on("exit", function (code, signal) {
console.log("Child process exited with code: " + code + " after receiving signal: " + signal);
});
runtime.send({ type: "code", code: codeToExecute});
}
and
// runtime.js / execution process
var pausedExecution, sourceMap, messagesToSend = [];
function getSourceMap() {
if (sourceMap === undefined) {
console.log("Waiting for source map.");
pausedExecution = Fiber.current;
Fiber.yield();
pausedExecution = undefined;
console.log("Wait is over.")
}
if (sourceMap === null) {
throw new Error("Source map could not be generated.");
} else {
// we should have a proper source map now
return sourceMap;
}
}
function callback(message) {
console.log("Message:", message.type);
if (message.type === "console log") {
// the location of the console log message will be the location in the instrumented code
/// we have to adjust it to get the position in the original code
message.loc = getSourceMap().originalPositionFor(message.loc);
}
messagesToSend.push(message); // gather messages in a buffer
// do not forward messages every time, instead gather a bunch and send them all at once
if (messagesToSend.length > 100) {
console.log("Sending messages.");
process.send({type: "message batch", messages: messagesToSend});
messagesToSend.splice(0); // empty the array
}
}
// function to send messages when we get a chance to prevent the client from waiting too long
function sendMessagesWithEventLoopTurnaround() {
if (messagesToSend.length > 0) {
process.send({type: "message batch", messages: messagesToSend});
messagesToSend.splice(0); // empty the array
}
setTimeout(sendMessagesWithEventLoopTurnaround, 10);
}
function executeCode(code) {
// setup child process to calculate the source map
importantDataCalculator = fork("./runtime");
importantDataCalculator.on("message", function (msg) {
if (msg.type === "result") {
sourceMap = msg.data; // getSourceMap() reads this variable
console.log("Finished source map generation!")
} else if (msg.type === "error") {
sourceMap = null; // signals that the source map could not be generated
} else {
throw new Error("Unknown message from dataGenerator!");
}
if (pausedExecution) {
// execution is waiting for the data
pausedExecution.run();
}
});
// setup automatic messages sending in the event loop
sendMessagesWithEventLoopTurnaround();
// instrument the code to call a function called "callback", which will be defined in the sandbox
instrumentCode(code);
// prepare the sandbox
var sandbox = Contextify(new utils.Sandbox(callback)); // the callback to be called from the instrumented code is defined in the sandbox
// wrap the execution of the code in a Fiber, so it can be paused
Fiber(function () {
sandbox.run(code);
// send messages because the execution finished
console.log("Sending messages.");
process.send({type: "message batch", messages: messagesToSend});
messagesToSend.splice(0); // empty the array
}).run();
}
process.on("message", function (msg) {
if (msg.type === "code") {
executeCode(msg.code, msg.options);
}
});
So to summarize:
When new code is received, a new process is created to execute it. This process first instruments the code and then executes it. Before doing so, it starts a third process to calculate a source map for the code. The instrumented code calls the function named callback in the code above, handing the runtime messages that report the progress of the executing code. These sometimes have to be adjusted; "console log" messages are one example. This adjustment requires the source map calculated by the third process. When the callback needs the source map, it calls getSourceMap(), which waits for the source map process to finish its calculation and yields control to the event loop while waiting, so that it can receive messages from the source map process (otherwise the event loop would be blocked and no message could be received).
Messages passed to the callback are first stored in an array and then sent as a batch to the main process for performance reasons. However, we do not want the main process to wait too long for messages, so in addition to sending a batch of messages when the threshold is reached, we schedule a function sendMessagesWithEventLoopTurnaround() to run in the event loop and check whether there are messages to send. This has two advantages:
When the execution process is waiting for the source map process, it can use the time to send the messages it already has. So if the source map process takes several seconds to finish, the main process does not have to wait that long for messages that have already been created and contain correct data.
When the executing code generates only very few messages in the event loop (e.g. from a function scheduled with setInterval(f, 2000) which creates only one message per execution), the main process does not have to wait a long time until the message buffer is full (in this example 200 s) but receives updates about the progress every 10 ms (if anything changed).
The Problem
What works
This setup works fine in the following cases
I do not use fibers and a separate process to calculate the source map. Instead I calculate the source map before the code is executed. In that case all the code to execute I tried works as expected.
I do use fibers and a separate process and execute code for which I do not need the source map. E.g.
var a = 2;
or
setTimeout(function () { var a = 2;}, 10)
In the first case the output looks like this.
Starting source map generation.
Message: 'variables init'
Message: 'program finished'
Sending messages.
Finished source map generation.
Source map generator process exited with code: 0 after receiving signal: null
I do use fibers and a separate process and code for which I need the source map but that doesn't use the event loop, e.g.
console.log("foo");
In that case the output looks like this:
Starting source map generation.
Message: 'console log'
Waiting for source map generation.
Finished source map generation.
Wait is over.
Message: 'program finished'
Sending messages.
Source map generator process exited with code: 0 after receiving signal: null
I do use fibers and a separate process and code for which I need the source map and which uses the event loop, but the source map is only needed when the source map calculation is already finished (so no waiting).
E.g.
setTimeout(function () {
console.log("foo!");
}, 100); // the source map generation takes around 100ms
In that case the output looks like this:
Starting source map generation.
Message: 'function declaration'
Message: 'program finished'
Sending messages.
Finished source map generation.
Source map generator process exited with code: 0 after receiving signal: null
Message: 'function enter'
Message: 'console log'
Message: 'function exit'
Sending messages in event loop.
What doesn't work
It only breaks if I use fibers and separate processes and code that uses the event loop but needs the source map before it is finished, e.g.
setTimeout(function () {
console.log("foo!");
}, 10); // the source map generation takes around 100ms
The output then looks like this:
Starting source map generation.
Message: 'function declaration'
Message: 'program finished'
Sending messages.
Message: 'function enter'
Message: 'console log'
Waiting for source map generation.
/path/to/code/runtime.js:113
Fiber.yield();
^
getSourceMap (/path/to/code/runtime.js:113:28),callback (/path/to/code/runtime.js:183:9),/path/to/code/utils.js:102:9,Object.console.log (/path/to/code/utils.js:190:13),null._onTimeout (<anonymous>:56:21),Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
Child process exited with code: 8 after receiving signal: null
The process that crashes here is the execution process. However, I can't find out why that happens or how to track down the problem. As you can see above, I already added several log statements to find out what is happening. I am also listening to the "uncaught exception" event on the execution process, but that does not seem to be fired.
Also, the log message we see at the end is not one of mine, since I prefix my log messages with some kind of description string, so it must be one created by Node.js itself. I understand neither why this occurs, nor what exit code 8 means, nor what else I could do to narrow down the cause.
Any help would be greatly appreciated.
As usual, once one finishes describing the problem completely a solution presents itself.
The problem, I think, is that code executed by setTimeout is not wrapped in a Fiber. So calling Fiber.yield() inside that code crashes, understandably.
Therefore, the solution is to overwrite setTimeout in the executed code. Since I am already providing a sandbox with some special functions (e.g. my own console object) I can also exchange the implementation of setTimeout by one that wraps the executed function in a fiber, like so:
// `this` is the sandbox object, which is the global object for the executing code
this.setTimeout = function (functionToExecute, delay) {
return setTimeout(function () {
fibers(functionToExecute).run();
}, delay);
};
This implementation does not support passing additional parameters to setTimeout but it could trivially be expanded to do so. It also does not support the version of setTimeout that is passed a string of code instead of a function, but who would use that anyway?
To make it work completely I would have to exchange the implementations of setTimeout, setInterval, setImmediate and process.nextTick. Anything else that is usually used to fulfill such a role?
This only leaves the question whether there is an easier way to do this than reimplementing each of these functions?
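One way to avoid hand-writing each replacement, sketched here as an assumption and not as an established fibers idiom, is a small factory that wraps the callback argument of any scheduler and forwards the remaining arguments unchanged. The names wrapScheduler and runInFiber are hypothetical.

```javascript
// Hypothetical helper: given an original scheduler such as setTimeout and a
// `wrap` function, return a replacement that wraps the scheduled callback
// and forwards every other argument (delay, extra parameters) unchanged.
function wrapScheduler(original, wrap) {
  return function (callback) {
    const rest = Array.prototype.slice.call(arguments, 1);
    const wrapped = function () {
      const args = arguments;
      const self = this;
      wrap(function () {
        callback.apply(self, args);
      });
    };
    return original.apply(null, [wrapped].concat(rest));
  };
}

// Assumed usage inside the sandbox setup, with `Fiber` from the fibers module:
// var runInFiber = function (fn) { Fiber(fn).run(); };
// sandbox.setTimeout = wrapScheduler(setTimeout, runInFiber);
// sandbox.setInterval = wrapScheduler(setInterval, runInFiber);
// sandbox.setImmediate = wrapScheduler(setImmediate, runInFiber);
```

This variant also passes additional parameters through to the callback, which the hand-written setTimeout replacement above does not support.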
