Spawned phantomjs process hanging - node.js

I'm trying to create a node server that spawns phantomjs processes to create screenshots. The grab.js script works fine when executed and I've confirmed that it writes to stdout. Problem is the node code that spawns the process simply hangs. I've confirmed that phantomjs is in the path. Anyone know what might be happening here or how I might troubleshoot this?
Here's the phantomjs code (grab.js) that renders the page and writes the data to stdout:
var page = require('webpage').create(),
    system = require('system'),
    fs = require('fs');
var url = system.args[1] || 'google.com';
page.viewportSize = {
  width: 1024,
  height: 1200
};
page.open(url, function() {
  var b64 = page.renderBase64('png');
  fs.write('/dev/stdout', b64, 'w');
  phantom.exit();
});
And here's the node code that spawns the phantom process and prints the result (hangs):
var http = require('http'),
    exec = require('child_process').exec,
    fs = require('fs');
exec('phantomjs grab.js google.com', function(error, stdout, stderr) {
  console.log(error, stdout, stderr);
});

I had similar issues with exec; switching to spawn fixed them.
According to this article, use spawn when you want the child process to return large amounts of binary data to Node, and use exec when you want it to return simple status messages.
hth
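For illustration, here is a minimal sketch of the spawn approach suggested above. It collects the child's stdout chunk by chunk instead of relying on exec's buffered callback, which is capped by its maxBuffer option. The helper name collectStdout and the output file shot.png are made up for this example, and whether this cures the hang in the question is an assumption, not something confirmed in the thread.

```javascript
const { spawn } = require('child_process');

// Run a command and resolve with everything it wrote to stdout.
// Chunks are collected as Buffers, so exec()'s maxBuffer cap does not apply.
function collectStdout(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    const out = [];
    const err = [];

    child.stdout.on('data', (chunk) => out.push(chunk));
    child.stderr.on('data', (chunk) => err.push(chunk));

    // 'error' fires when the process cannot be started (e.g. not on the PATH).
    child.on('error', reject);

    // 'close' fires once the process has exited and its stdio streams have ended.
    child.on('close', (code) => {
      if (code !== 0) {
        const detail = Buffer.concat(err).toString();
        reject(new Error(command + ' exited with code ' + code + ': ' + detail));
        return;
      }
      resolve(Buffer.concat(out).toString());
    });
  });
}

// Usage with the question's script (assumes phantomjs is on the PATH):
// collectStdout('phantomjs', ['grab.js', 'http://google.com'])
//   .then((b64) => require('fs').writeFileSync('shot.png', Buffer.from(b64, 'base64')));
```

Listening for 'close' rather than 'exit' means the promise settles only after all output has been read.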

I had the same problem; in my case the cause was not in Node.js but in PhantomJS (v2.1).
It's a known problem that PhantomJS's open method can hang.
I also found a second link (I guess written by the same person) in which the author points out that requestAnimationFrame does not work well with TweenJS, which causes freezing: PhantomJS passes a Unix timestamp, but TweenJS expects a DOMHighResTimeStamp.
The trick is to inject request-animation-frame.js (which is also provided in that article).

Related

Node PhantomJS script onResourceError path issue

Having some trouble using the webpage API in a phantomJS script I'm using for load testing.
I'm running the script in a child process, like so:
var path = require('path');
var childProcess = require('child_process');
var binPath = require('phantomjs').path;
var childArgs = [
  path.join(__dirname, 'phantom-script.js')
];
var spawn = childProcess.spawn;
var child = spawn(binPath, childArgs);
child.stdout.on('data', function(data) {
  const buf = Buffer.from(data);
  console.log('stdout:', buf.toString());
});
child.stderr.on('data', function(data) {
  const buf = Buffer.from(data);
  console.log('stderr:', buf.toString());
});
And my simple phantomJS script:
var webPage = require('webpage');
var page = webPage.create();
page.onConsoleMessage = function (msg) {
  console.log(msg);
};
page.onResourceError = function(resourceError) {
  console.log(resourceError.errorCode + ':', resourceError.errorString);
};
function runScript() {
  page.open('<webpage-url>', function(status) {
    console.log('Status:', status);
    if (status === 'success') {
      page.evaluate(function() {
        console.log('Title:', document.title);
      });
    }
  });
}
runScript();
So to start the phantomJS script, if both of these files are in the test/ directory, and my current directory is up one from that: node test/child-process.js, which then spawns the child process and runs my phantomJS script.
So, this gets the script to run, but it always fails in page.open because of a resource error. Replacing my url with Google's, or really any website, works fine.
The error logged in onResourceError is stdout: 202: Cannot open file:///Users/<user>/path/to/local/current/directory: Path is a directory.
This is always the path from which I'm running this script. If I move down a directory into test/ and run it with node child-process.js, the error instead logs that directory.
As a headless browser, I assumed phantomJS would interface with a webpage like any client would, just without rendering the template--what does the current directory from which the script was run have anything to do with opening the webpage? Why would it be trying to load resources from my local directory when the webpage URL points to a public website, hosted at the IP and PORT specified in the first argument of page.open (e.g. xx.xxx.xx.xx:PORT)?
I'm at a bit of a loss here. The phantomJS path and all that is correct, since it runs the script fine. I just don't understand why page.open would attempt to open the directory from which the script was called--what does that have to do with its function, which is to open the URL and load it to the page?
Not sure if this is even worthy of answering--as opposed to just deleting.
I figured it out when I manually typed in the argument www.google.com instead of copy/pasting from the browser, and I got this as the path in the error: file:///Users/<user>/path/to/local/current/directory/www.google.com.
Now I know why I couldn't find a SO question for it. A stupid error on my part at any rate, it would've been a quick debug if the error had appended the IP address and PORT (my "url") to the end of the file path like it did for www.google.com, a clear indicator that it's not pinging a URL.
TL;DR: It's a URL, you need http(s)://...
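To guard against the same mistake, a small helper can add a scheme when the address has none. This is only a sketch: the name ensureScheme and the 'http://' default are assumptions for illustration. It uses plain var-style JavaScript so it could sit in either the Node script or the PhantomJS script.

```javascript
// Prepend a scheme when the input has none, so the address is treated as a
// web URL rather than as a path relative to the current directory.
// `ensureScheme` and the 'http://' default are hypothetical choices.
function ensureScheme(address, defaultScheme) {
  var scheme = defaultScheme || 'http://';
  // Matches "http://", "https://", "file://", etc. at the start of the string.
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(address);
  return hasScheme ? address : scheme + address;
}

console.log(ensureScheme('www.google.com'));        // http://www.google.com
console.log(ensureScheme('https://example.com/'));  // https://example.com/
```

An address of the form host:port has no "://" in it, so it also gets the default scheme prepended.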

how to stop phantomjs instances properly with nightmarejs

I wanted to get screenshot of my web app using nightmare. When there is no error while running the code and exits are clean, no phantomjs instances are left behind.
var Nightmare = require('nightmare');
var Screenshot = require('nightmare-screenshot');
var nightmare = new Nightmare();
var url = process.argv[2];
var path = process.argv[3];
nightmare
  .goto(url)
  .wait('#selector')
  .use(Screenshot.screenshotSelector(path, '#selector'))
  .run(function (err, nightmare) {
    if (err)
      console.log('Error');
    else
      console.log('Done.');
    nightmare.teardownInstance();
    nightmare.end();
  });
However, when there is an error, such as the web app not running or the selector not being present, the phantomjs instance is left running.
$ ps -ax | grep phantom
6065 ttys004 0:00.00 grep phantom
6050 ttys005 0:02.87 phantomjs --load-images=true --ignore-ssl-errors=true --ssl-protocol=any --web-security=true /.../node_modules/phantom/shim.js 13201 127.0.0.1
How can I properly shut down the instance even when there is an error?
Currently, .wait(elem) keeps looping every 250ms to check whether the element is present. You could ask the developer to add an optional timeout parameter that stops the checks so the process can exit gracefully.
Alternatively, you can work around the issue by checking for the element yourself for a limited time and then exiting the process.
You can use wait(fn) to perform the check.
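The "check for a limited time, then give up" idea can be sketched independently of Nightmare. The helper below, pollUntil, is hypothetical and not part of Nightmare's API, and elementIsPresent in the usage comment stands in for whatever check you supply. The 250ms interval mirrors the polling period mentioned above.

```javascript
// Poll `check` every `intervalMs` until it returns a truthy value or
// `timeoutMs` elapses. Resolves with true on success and false on timeout.
// `pollUntil` is a hypothetical helper, not part of Nightmare.
function pollUntil(check, timeoutMs, intervalMs) {
  return new Promise((resolve) => {
    const deadline = Date.now() + timeoutMs;
    (function tick() {
      if (check()) return resolve(true);
      if (Date.now() >= deadline) return resolve(false);
      setTimeout(tick, intervalMs);
    })();
  });
}

// Usage sketch: give up after 5 seconds, then shut the browser down either way.
// pollUntil(elementIsPresent, 5000, 250).then((found) => {
//   if (!found) console.log('Timed out waiting for #selector');
//   // call the library's teardown here, e.g. the question's nightmare.end()
// });
```

Resolving with false on timeout, rather than rejecting, keeps the teardown call on a single code path.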

How to interact with multiple console windows?

How can I interact with multiple console windows from one Node.js script?
So far I have researched a bit and have not found anything that covers my case.
What I want to accomplish is to have one main console window that reads my input,
1. action#1
2. action#2
> do 1 // select action
and redirects its output to another console window named Logger, which shows the stdout of the action the user selected while keeping the main "select action" console window clean.
Well, I managed to find a way around it, since I wanted to stay with Node.js all the way.
start.js
var cp = require("child_process");
// Open each script in its own console window (Windows "start" command).
cp.exec('start "Logger" cmd /K node logger.js');
cp.exec("start cmd /K node startAdminInterface.js");
setTimeout(function(){ process.exit(0); }, 2000);
logger.js
var net = require('net');
net.createServer(function (socket) {
  socket.on('data', function(d){
    console.log(": " + d.toString("utf8"));
  });
  socket.on('error', function(err){
    console.log("- An error occurred : " + err.message);
  });
}).listen(9999);
startAdminInterface.js
var net = require("net");
var logger = net.connect(9999);
var readline = require('readline'),
    rl = readline.createInterface(process.stdin, process.stdout);
rl.setPrompt('> ');
rl.prompt();
rl.on('line', function(line) {
  logger.write(line);
  rl.prompt();
}).on('close', function() {
  process.exit(0);
});
Bottom line: it's a workaround and not exactly what I was after, but I saw potential. logger.js could listen to multiple sources, which is an enormous plus in the application that I'm building.

NodeJS not spawning child process except in tests

I have the following NodeJS code:
var fs = require('fs');
var spawn = require('child_process').spawn;
var Unzipper = {
  unzip: function(src, dest, callback) {
    var self = this;
    // Create the target directory synchronously so it exists before unzip runs.
    if (!fs.existsSync(dest)) {
      fs.mkdirSync(dest);
    }
    var unzip = spawn('unzip', [ src, '-d', dest ]);
    unzip.stdout.on('data', function (data) {
      self.stdout(data);
    });
    unzip.stderr.on('data', function (data) {
      self.stderr(data);
      callback({message: "There was an error executing an unzip process"});
    });
    unzip.on('close', function() {
      callback();
    });
  }
};
I have a NodeUnit test that executes successfully. When I debug the test in PhpStorm, the unzip variable is assigned correctly.
However if I run the same code as part of a web service, the spawn call doesn't return properly and the server crashes on trying to attach an on handler to the nonexistent stdout property of the unzip var.
I've tried running the program outside of phpStorm, however it crashes on the command line as well for the same reason. I'm suspecting it's a permissions issue that the tests don't have to deal with. A web server spawning processes could cause chaos in a production environment, therefore some extra permissions might be needed, but I haven't been able to find (or I've missed) documentation to support my hypothesis.
I'm running v0.10.3 on OSX Snow Leopard (via MacPorts).
Why can't I spawn the child process correctly?
UPDATES
For @jonathan-wiepert
I'm using Prototypical inheritance so when I create an "instance" of Unzipper I set stdout and stderr ie:
var unzipper = Unzipper.spawn({
  stdout: function(data) { util.puts(data); },
  stderr: function(data) { util.puts(data); }
});
This is similar to the concept of "constructor injection". As for your other points, thanks for the tips.
The error I'm getting is:
project/src/Unzipper.js:15
unzip.stdout.on('data', function (data) {
^
TypeError: Cannot call method 'on' of undefined
As per my debugging screenshots, the object that is returned from the spawn call is different under different circumstances. My test passes (it checks that a ZIP can be unzipped correctly) so the problem occurs when running this code as a web service.
The problem was that the spawn method created on the Object prototype (see this article on Prototypical inheritance) was causing the child_process.spawn function to be replaced, so the wrong function was being called.
I saved child_process.spawn into a property on the Unzipper "class" before it gets clobbered and use that property instead.
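One way to picture that fix: keep a reference to the real child_process.spawn under a name that a prototype-level spawn helper cannot shadow. This is a sketch, not the original code. The property name spawnProcess and the done guard are assumptions for illustration, and the stdout/stderr hooks from the question are left out for brevity.

```javascript
// Grab the real function once, at load time, before any code that extends
// Object.prototype has had a chance to run.
var realSpawn = require('child_process').spawn;

var Unzipper = {
  // Stored under a name that does not collide with a prototype-level `spawn`.
  // `spawnProcess` is a made-up property name for this sketch.
  spawnProcess: realSpawn,

  unzip: function (src, dest, callback) {
    var finished = false;
    // Make sure the callback runs once, even if both 'error' and 'close' fire.
    function done(err) {
      if (!finished) {
        finished = true;
        callback(err);
      }
    }

    var child = this.spawnProcess('unzip', [src, '-d', dest]);
    child.on('error', done); // e.g. the unzip binary is not installed
    child.on('close', function (code) {
      done(code === 0 ? null : new Error('unzip exited with code ' + code));
    });
    return child;
  }
};
```

Because the reference is captured when the module loads, later changes to what spawn resolves to do not affect unzip.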

How do I open a terminal application from node.js?

I would like to be able to open Vim from node.js program running in the terminal, create some content, save and exit Vim, and then grab the contents of the file.
I'm trying to do something like this:
filename = '/tmp/tmpfile-' + process.pid
editor = process.env['EDITOR'] ? 'vi'
spawn editor, [filename], (err, stdout, stderr) ->
  text = fs.readFileSync filename
  console.log text
However, when this runs, it just hangs the terminal.
I've also tried it with exec and got the same result.
Update:
This is complicated by the fact that this process is launched from a command typed at a prompt with readline running. I completely extracted the relevant parts of my latest version out to a file. Here it is in its entirety:
{spawn} = require 'child_process'
fs = require 'fs'
tty = require 'tty'
rl = require 'readline'
cli = rl.createInterface process.stdin, process.stdout, null
cli.prompt()
filename = '/tmp/tmpfile-' + process.pid
proc = spawn 'vim', [filename]
#cli.pause()
process.stdin.resume()
indata = (c) ->
  proc.stdin.write c
process.stdin.on 'data', indata
proc.stdout.on 'data', (c) ->
  process.stdout.write c
proc.on 'exit', () ->
  tty.setRawMode false
  process.stdin.removeListener 'data', indata
  # Grab content from the temporary file and display it
  text = fs.readFile filename, (err, data) ->
    throw err if err?
    console.log data.toString()
    # Try to resume readline prompt
    cli.prompt()
As shown above, it displays a prompt for a couple of seconds and then launches into Vim, but the TTY is messed up. I can edit and save the file, and the contents are printed correctly. There is also a bunch of junk printed to the terminal on exit, and Readline functionality is broken afterward (no Up/Down arrow, no Tab completion).
If I uncomment the cli.pause() line, then the TTY is OK in Vim, but I'm stuck in insert mode, and the Esc key doesn't work. If I hit Ctrl-C it quits the child and parent process.
You can inherit stdio from the main process.
const child_process = require('child_process')
var editor = process.env.EDITOR || 'vi';
var child = child_process.spawn(editor, ['/tmp/somefile.txt'], {
  stdio: 'inherit'
});
child.on('exit', function (e, code) {
  console.log("finished");
});
More options here: http://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options
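Building on the stdio: 'inherit' idea above, a synchronous variant can block until the editor exits and then return the saved text, which covers the "grab the contents of the file" part of the question. This is a sketch under assumptions: editFile and its extraArgs parameter are hypothetical names, and it has not been tried with a readline prompt active, as in the question's update.

```javascript
const { spawnSync } = require('child_process');
const fs = require('fs');

// Open `filename` in an editor that shares this process's terminal, block
// until the editor exits, then return what was saved.
// `editFile` and `extraArgs` are hypothetical names for this sketch.
function editFile(filename, editor, extraArgs) {
  const command = editor || process.env.EDITOR || 'vi';
  const args = (extraArgs || []).concat(filename);
  const result = spawnSync(command, args, { stdio: 'inherit' });

  if (result.error) throw result.error; // the editor could not be started
  if (result.status !== 0) {
    throw new Error(command + ' exited with status ' + result.status);
  }
  return fs.readFileSync(filename, 'utf8');
}

// Usage: const text = editFile('/tmp/somefile.txt');
```

Blocking is deliberate here: nothing else can usefully happen in the terminal while the editor owns it.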
Update: My answer applied at the time it was created, but for modern versions of Node, look at this other answer.
First off, your usage of spawn isn't correct. Here are the docs. http://nodejs.org/docs/latest/api/child_processes.html#child_process.spawn
Your sample code makes it seem like you expect vim to automatically pop up and take over the terminal, but it won't. The important thing to remember is that even though you may spawn a process, it is up to you to make sure that the data from the process makes it through to your terminal for display.
In this case, you need to take data from stdin and send it to vim, and you need to take the data output by vim and send it to your terminal; otherwise you won't see anything. You also need to set the tty into raw mode; otherwise node will intercept some of the key sequences and vim will not behave properly.
Next, don't do readFileSync. If you come upon a case where you think you need to use a sync method, then chances are, you are doing something wrong.
Here's a quick example I put together. I can't vouch for it working in every single case, but it should cover most cases.
var tty = require('tty');
var child_process = require('child_process');
var fs = require('fs');
function spawnVim(file, cb) {
  var vim = child_process.spawn('vim', [file]);
  function indata(c) {
    vim.stdin.write(c);
  }
  function outdata(c) {
    process.stdout.write(c);
  }
  process.stdin.resume();
  process.stdin.on('data', indata);
  vim.stdout.on('data', outdata);
  tty.setRawMode(true);
  vim.on('exit', function(code) {
    tty.setRawMode(false);
    process.stdin.pause();
    process.stdin.removeListener('data', indata);
    vim.stdout.removeListener('data', outdata);
    cb(code);
  });
}
var filename = '/tmp/somefile.txt';
spawnVim(filename, function(code) {
  if (code == 0) {
    fs.readFile(filename, function(err, data) {
      if (!err) {
        console.log(data.toString());
      }
    });
  }
});
Update
I seeee. I don't think readline is as compatible with all of this as you would like unfortunately. The issue is that when you createInterface, node kind of assumes that it will have full control over that stream from that point forward. When we redirect that data to vim, readline is still there processing keypresses, but vim is also doing the same thing.
The only way around this that I see is to manually disable everything from the cli interface before you start vim.
Just before you spawn the process, we need to close the interface, and unfortunately manually remove the keypress listener because, at least at the moment, node does not remove it automatically.
process.stdin.removeAllListeners 'keypress'
cli.close()
tty.setRawMode true
Then in the process 'exit' callback, you will need to call createInterface again.
I tried to do something like this using Node's repl library - https://nodejs.org/api/repl.html - but nothing worked. I tried launching vscode and TextEdit, but on the Mac there didn't seem to be a way to wait for those programs to close. Using execSync with vim, nano, and micro all acted strangely or hung the terminal.
Finally I switched to using the readline library using the example given here https://nodejs.org/api/readline.html#readline_example_tiny_cli - and it worked using micro, e.g.
import { execSync } from 'child_process'
...
case 'edit':
  const cmd = `micro foo.txt`
  const result = execSync(cmd).toString()
  console.log({ result })
  break
It switches to micro in a Scratch buffer - hit ctrl-q when done, and it returns the buffer contents in result.
