Limited require() inside VM - node.js

Is there a way to make a limited require() in which files require()'d by VM programs are also run inside the VM?
Is it safe to pass the script a function which reads a file and then returns a VM instance of that code?
My situation is that I need a sandboxed, kind-of-trusted file to start up the environment by requiring all the other files. This should not be done by passing them in the context, because those files have to be required whenever the startup file wants, since they might change.
The required files are not really trusted, so a normal require() is dangerous; it is about the only way the code could mess things up.
So: init.js calls files/core1.js, files/core2.js and files/whatever_it_wants.js
What I use:
var sandbox = {
  fs: vfs,
  write: console.log,
  files: f,
  exit: process.exit
};
var context = vm.createContext(sandbox);
var script = new vm.Script(fs.readFileSync("./e/master.js", "utf8"));
script.runInContext(context, { timeout: 10000 });
master.js needs to load files that are not very trustworthy. It gets files (which lists all available files), its virtual filesystem to do its FS operations on, and write/exit to do what it needs.
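What I have in mind is something like the rough sketch below. makeSandbox and requireInVM are just illustrative names, I assume the virtual filesystem exposes a readFileSync-style method, and I know the vm module on its own is not a hard security boundary against truly malicious code:

var vm = require('vm');
var path = require('path');

function makeSandbox(vfs, rootDir) {
  var root = path.resolve(rootDir);
  var sandbox = {
    fs: vfs,
    write: console.log,
    exit: process.exit
  };
  var context = vm.createContext(sandbox);

  // A require() replacement visible inside the sandbox: it reads the
  // requested file through the virtual filesystem and runs it in the
  // same context, so required files stay sandboxed too.
  sandbox.requireInVM = function (relativePath) {
    var fullPath = path.resolve(root, relativePath);
    if (fullPath.indexOf(root + path.sep) !== 0) {
      throw new Error('Path escapes the sandbox: ' + relativePath);
    }
    var code = vfs.readFileSync(fullPath, 'utf8'); // assumes vfs has readFileSync
    var module = { exports: {} };
    var previousModule = sandbox.module;
    var previousExports = sandbox.exports;
    sandbox.module = module;
    sandbox.exports = module.exports;
    try {
      new vm.Script(code, { filename: relativePath })
        .runInContext(context, { timeout: 10000 });
    } finally {
      sandbox.module = previousModule;
      sandbox.exports = previousExports;
    }
    return module.exports;
  };

  return context;
}

master.js (and anything it loads) would then call requireInVM('core1.js') instead of require().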

Related

Logrotation for a Nodejs Application

I am working on a very old Node.js application which creates a new child process using forever-monitor. The logs of this child process are handled by forever-monitor only. This is what the configuration looks like:
var child = new (forever.Monitor)(__dirname + '/../lib/childprocess.js', {
    max: 3,
    silent: true,
    options: [program.port],
    'errFile': __dirname + '/../childprocess_error.log',
    'outFile': __dirname + '/../childprocess_output.log'
  }
);
Everything is working fine in this setup. The new requirement is to rotate these logs every 12 hours: every 12 hours a new file will be created containing all the content of childprocess_output.log, and it should be stored in some other directory. The new log file will obviously have a timestamp appended at the end of its name (e.g. childprocess_output_1239484034.log).
The original file childprocess_output.log should then be reset, that is, all its content should be deleted and it should start logging afresh.
I am trying to understand which npm library I should use for this purpose. I googled a bit and found a few npm libraries that match my requirement, but the number of downloads for these libraries was really small, so I doubt their reliability.
Which library do Node.js developers use for log rotation?
Also, my last resort would be to use the Linux tool logrotate if I can't find an appropriate library in Node. I am avoiding logrotate because I want my application to handle the scenario and not depend on the instance configuration.
You can use:
fs (the built-in file system module), with methods like statSync and renameSync wrapped in try/catch blocks.
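For example, a rough sketch of a 12-hour rotation, assuming the log sits next to the script, rotated copies go to a hypothetical archive directory that already exists, and Node is 8.5+ so fs.copyFileSync is available. Copy-then-truncate is used instead of a plain rename so the file handle that forever-monitor keeps open stays valid:

var fs = require('fs');
var path = require('path');

var LOG_FILE = path.join(__dirname, 'childprocess_output.log');
var ARCHIVE_DIR = path.join(__dirname, 'archive'); // assumed to exist already
var TWELVE_HOURS = 12 * 60 * 60 * 1000;

function rotateLog() {
  try {
    fs.statSync(LOG_FILE); // skip rotation if the log does not exist yet
    var target = path.join(ARCHIVE_DIR, 'childprocess_output_' + Date.now() + '.log');
    fs.copyFileSync(LOG_FILE, target); // copy current contents to the timestamped file...
    fs.truncateSync(LOG_FILE, 0);      // ...then empty the original so logging starts fresh
  } catch (err) {
    console.error('Log rotation failed:', err.message);
  }
}

setInterval(rotateLog, TWELVE_HOURS);

If you do end up using the system logrotate instead, its copytruncate option does essentially the same thing.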

How to statically analyse that a file is fit for importing?

I have a CLI program that can be executed with a list of files that describe instructions, e.g.
node ./my-program.js ./instruction-1.js ./instruction-2.js ./instruction-3.js
This is how I am importing and validating that the target file is an instruction file:
const requireInstruction = (instructionFilePath) => {
  const instruction = require(instructionFilePath);
  if (!instruction.getInstruction) {
    throw new Error('Not instruction file.');
  }
  return instruction;
};
The problem with this approach is that it will execute the file regardless of whether it matches the expected signature, i.e. if the file contains a side effect such as connecting to a database:
const mysql = require('mysql');
mysql.createConnection(..);
module.exports = mysql;
the Not instruction file. error will fire and I will ignore the file, but the side effect will keep running in the background.
How to safely validate target file signature?
Worst case scenario, is there a conventional way to completely sandbox the require logic and kill the process if file is determined to be unsafe?
Move the check logic into a specific js file. Make it process.exit(0) when everything is fine, and process.exit(1) when it is wrong.
In your current program, instead of loading the file via require, use child_process.exec to invoke your new file, giving it the required parameter to know which file to test.
In your updated program, bind the close event to know if the return code was 0 or 1.
If you need more information than 0 or 1, have the new js file that loads the instruction print some JSON.stringified data to stdout (console.log), then retrieve it and JSON.parse it in the callback of the child_process.exec call.
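A rough sketch of that flow; the checker file name (check-instruction.js) and the JSON shape are assumptions for illustration:

// check-instruction.js -- runs in its own process and only ever loads one file
var path = require('path');
var instruction = require(path.resolve(process.argv[2]));
if (!instruction.getInstruction) {
  process.exit(1);
}
console.log(JSON.stringify({ name: instruction.name || null }));
process.exit(0);

// in my-program.js -- let a child process do the risky require()
var exec = require('child_process').exec;

var checkInstructionFile = function (instructionFilePath, callback) {
  exec('node ./check-instruction.js ' + instructionFilePath, function (err, stdout) {
    if (err) {
      return callback(false); // non-zero exit code: not an instruction file (or it crashed)
    }
    callback(true, JSON.parse(stdout));
  });
};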
Alternatively, have you looked into AST processing?
http://jointjs.com/demos/javascript-ast
It could help you identify pieces of code that are not embedded within an exported function.
(Note: I discussed this question with the author on IRC. There may be some context in my answer that isn't in the original question.)
Given that your scenario is purely about preventing against accidental inclusion of non-instruction files, rather than about preventing malicious behaviour, static analysis using something like Esprima will probably be sufficient.
One approach would be to require that every instruction file exports some kind of object with a name property, containing the name of the instruction file. As there's not really anything to put in there besides a string literal, you can be fairly certain that if you can't locate a name property through static analysis, the file is not an instruction file - even in a language like JavaScript that isn't fully statically analyzable.
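A minimal sketch of such a check using Esprima; the convention it looks for (a top-level module.exports = { name: '...' } assignment) is the one suggested above, and the helper name is illustrative:

const fs = require('fs');
const esprima = require('esprima');

const looksLikeInstructionFile = (filePath) => {
  const ast = esprima.parseScript(fs.readFileSync(filePath, 'utf8'));

  // Look for a top-level `module.exports = { ... }` whose object literal
  // has a `name` property with a literal value (identifier keys only).
  return ast.body.some((node) => {
    if (node.type !== 'ExpressionStatement') return false;
    const expr = node.expression;
    if (expr.type !== 'AssignmentExpression') return false;
    const isModuleExports =
      expr.left.type === 'MemberExpression' &&
      expr.left.object.name === 'module' &&
      expr.left.property.name === 'exports';
    if (!isModuleExports || expr.right.type !== 'ObjectExpression') return false;
    return expr.right.properties.some(
      (p) => p.key && p.key.name === 'name' && p.value.type === 'Literal'
    );
  });
};

Nothing in the target file is executed: its source is only read and parsed.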
For any readers of this thread that are trying to protect from malicious actors, rather than accidents - for example, when accepting untrusted code from users: you cannot sandbox or 'validate' JavaScript with Node.js alone (not with the vm module either), and the above solution will not work for you. You will need system-level containerization or virtualization to run this kind of code safely. There are no other options.

How to call require for more files in one line?

I have a protractor test and I made helper js files for each functionality (e.g. login, createObject, logout).
These separate js files are called from Test.js (the spec file).
I want to require all three helpers in config.js in just one call, because all my tests use a lot of helper/method files.
I've tried this:
in config.js,
onPrepare: function () {
  'use strict';
  global.Methods = require('./method1.js', './method2.js', './method3.js');
}
but it doesn't work.
Can anyone tell me if this is possible or if there is a better way to do it?
Thank you in advance.
Since the require function only takes one file as a parameter, you can put all your methods into one file called actions.js. This way you only have to make one require call to gain access to all your methods.
global.Methods = require('./actions.js');
or even better
global.Methods = new (require('./actions.js'));
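A minimal sketch of what actions.js could contain, assuming the three helper files are the login, createObject and logout helpers mentioned in the question (the file names are illustrative). It is exported as a constructor so it also works with the new form above:

// actions.js -- gathers all the helpers behind a single require()
var login = require('./login.js');
var createObject = require('./createObject.js');
var logout = require('./logout.js');

module.exports = function Actions() {
  this.login = login;
  this.createObject = createObject;
  this.logout = logout;
};

After that, Methods.login, Methods.createObject and Methods.logout are all available from the single global.Methods assignment.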
Or simply require one file at a time
global.Methods1 = require ('./method1.js');
global.Methods2 = require ('./method2.js');
global.Methods3 = require ('./method3.js');

How can you persist user data for command line tools?

I'm a front-end dev just venturing into the Node.js, particularly in using it to create small command line tools.
My question: how do you persist data with command line tools? For example, if I want to keep track of the size of certain files over time, I'd need to keep a running record of changes (additions and deletions) to those files, and relevant date/time information.
On the web, you store that sort of data in a database on a server, and then query the database when you need it again. But how do you do it when you're creating a Node module that's meant to be used as a command line tool?
Some generic direction is all I'm after. I don't even know what to Google at this point.
It really depends on what you're doing, but a simple approach is to just save the data that you want to persist to a file and, since we're talking node, store it in JSON format.
Let's say you have some data like:
var data = [ { file: 'foo.bar', size: 1234, date: '2014-07-31 00:00:00.000'}, ...]
(it actually doesn't matter what it is, as long as it can be JSON.stringify()'d)
You can just save it with:
fs.writeFile(filename, JSON.stringify(data), {encoding: 'utf8'}, function(err) { ... });
And load it again with:
fs.readFile(filename, {encoding: 'utf8'}, function(err, contents) {
  data = JSON.parse(contents);
});
You'll probably want to give the user the ability to specify the name of the file you're going to persist the data to via an argument like:
node myscript.js <data_file>
You can get that passed in parameter with process.argv:
var filename = process.argv[2]; // Be sure to check process.argv.length and have a default
Using something like minimist can be really helpful if you want to get more complex, like:
node myscript.js --output <data_file>
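For example, a small sketch with minimist (the default file name is just illustrative):

var minimist = require('minimist');
var os = require('os');
var path = require('path');

var argv = minimist(process.argv.slice(2));
// Fall back to a default file in the user's home directory when --output is not given.
var filename = argv.output || path.join(os.homedir(), '.myscript-data.json');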
You can also store files in a temporary directory, for example the /tmp directory on Linux, and give the user the option to change the directory.
To get the path to the temporary directory you can use the os module in Node.js:
const os = require('os');
const tmp = os.tmpdir();
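For example, a default data file in the temporary directory could be built like this (the file name is only illustrative):

const path = require('path');
const defaultFile = path.join(tmp, 'myscript-data.json');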

How do I support multiple server.pid files?

I am running play on multiple machines in our datacenter. We loadbalance the hell out of everything. On each play node/VM I'm using Apache and an init.d/play script to start and stop the play service.
The problem is that our play websites are hosted on shared network storage. This makes deployment really nice, you deploy to one place and the website is updated on all 100 machines. Each machine has a mapped folder "/z/www/PlayApp1" where the play app lives.
The issue is that when the service starts or stops, the server.pid file is written to that network location where the app's files live.
The problem is that as I bring up 100 nodes, the 100th node will overwrite the PID file with its pid, and then that pid file only represents the correct process ID for 1 out of 100 nodes.
So how do I get play to store the pid file locally and not with the app files on the network share? I'll need each server's PID file to reflect that machines actual process.
We are using CentOS (Linux)
Thanks in advance
Josh
According to https://github.com/playframework/play/pull/43 it looks like there is a --pid_file command line option; it might only work with paths under the application root so you might have to make directories for each distinct host (which could possibly be symlinks)
I have 0 experience with Play so hopefully this is helpful information.
I don't even think it should run a second copy, based on the current source code. The main function is:
public static void main(String[] args) throws Exception {
    File root = new File(System.getProperty("application.path"));
    if (System.getProperty("precompiled", "false").equals("true")) {
        Play.usePrecompiled = true;
    }
    if (System.getProperty("writepid", "false").equals("true")) {
        writePID(root);
    }
    :
    blah blah blah
}
and writePID is:
private static void writePID(File root) {
    String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
    File pidfile = new File(root, PID_FILE);
    if (pidfile.exists()) {
        throw new RuntimeException("The " + PID_FILE + " already exists. Is the server already running?");
    }
    IO.write(pid.getBytes(), pidfile);
}
meaning it should throw an exception when you try to run multiple copies using the same application.path.
So either you're not using the version I'm looking at or you're discussing something else.
It seems to me it would be a simple matter to change that one line above:
File root = new File(System.getProperty("application.path"));
to use a different property for the PID file storage, one that's not on the shared drive.
Although you'd need to be careful: root is also passed to Play.init, so you should investigate the impact of changing it.
This is, after all, one of the great advantages of open source software, inasmuch as you can fix the "bugs" yourself.
For what it's worth, I'm not a big fan of the method you've chosen for deployment. Yes, it simplifies deployment but upgrading your servers is an all-or-nothing thing which will cause you grief if you accidentally install some dodgy software.
I much prefer staged deployments so I can shut down non-performing nodes as needed.
Change your init script to write the pid to /tmp or somewhere else machine-local.
If that is hard, a symlink might work.

Resources