Changing how nodejs require() fetches files

I'm looking to monkey-patch require() to replace its file loading with my own function. I imagine that internally require(module_id) does something like:
1. Convert module_id into a file path
2. Load the file path as a string
3. Compile the string into a module object and set up the various globals correctly
I'm looking to replace step 2 without reimplementing steps 1 and 3. Looking at the public API, there's require(), which does steps 1-3, and require.resolve(), which does step 1. Is there a way to isolate step 2 from step 3?
I've looked at the source of require mocking tools such as mockery -- all they seem to be doing is replacing require() with a function that intercepts certain calls and returns a user-supplied object, and passes on other calls to the native require() function.
For context, I'm trying to write a function require_at_commit(module_id, git_commit_id), which loads a module and any of that module's requires as they were at the given commit.
I want this function because I want to be able to write certain functions that a) rely on various parts of my codebase, and b) are guaranteed to not change as I evolve my codebase. I want to "freeze" my code at various points in time, so thought this might be an easy way of avoiding having to package 20 copies of my codebase (an alternative would be to have "my_code_v1": "git://..." in my package.json, but I feel like that would be bloated and slow with 20 versions).
Update:
So the source code for module loading is here: https://github.com/joyent/node/blob/master/lib/module.js. Specifically, to do something like this you would need to reimplement Module._load, which is pretty straightforward. However, there's a bigger obstacle: step 1, converting module_id into a file path, is actually harder than I thought, because Module._resolveFilename needs to be able to call fs.exists() to know where to terminate its search. So I can't just substitute out individual files; I have to substitute entire directories. That means it's probably easier to export the entire git revision to a directory and point require() at that directory than to override require().
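For reference, a minimal sketch of what wrapping Module._load might look like (Module is an undocumented node internal, so this is fragile across versions, and it doesn't solve the resolution problem above):
var Module = require('module');
var originalLoad = Module._load;
Module._load = function (request, parent, isMain) {
  // inspect or rewrite the request here before delegating to the real loader
  console.log('loading', request, 'from', parent && parent.filename);
  return originalLoad.call(Module, request, parent, isMain);
};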
Update 2:
Ended up using a different approach altogether... see answer I added below

You can use the require.extensions mechanism. This is how the coffee-script coffee command can load .coffee files without ever writing .js files to disk.
Here's how it works:
https://github.com/jashkenas/coffee-script/blob/1.6.2/lib/coffee-script/coffee-script.js#L20
loadFile = function(module, filename) {
  var raw, stripped;
  raw = fs.readFileSync(filename, 'utf8');
  stripped = raw.charCodeAt(0) === 0xFEFF ? raw.substring(1) : raw;
  return module._compile(compile(stripped, {
    filename: filename,
    literate: helpers.isLiterate(filename)
  }), filename);
};

if (require.extensions) {
  _ref = ['.coffee', '.litcoffee', '.md', '.coffee.md'];
  for (_i = 0, _len = _ref.length; _i < _len; _i++) {
    ext = _ref[_i];
    require.extensions[ext] = loadFile;
  }
}
Basically, assuming your modules have a set of well-known extensions, you should be able to use this pattern: a function that takes the module and filename, does whatever loading/transforming you need, and then hands the resulting source to module._compile(), which sets up the module object.
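For instance, here is a minimal sketch of the same pattern in plain JavaScript; the readFileSync call is the piece you would swap out for your own fetching logic:
var fs = require('fs');
require.extensions['.js'] = function (module, filename) {
  // step 2: fetch the source yourself (from git, a cache, wherever)
  var source = fs.readFileSync(filename, 'utf8');
  // step 3: let node compile it with the usual module scoping
  module._compile(source, filename);
};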
This may or may not be sufficient to do what you are asking, but honestly from your question it sounds like you are off in the weeds somewhere far from the rest of the programming world (don't take that harshly, it's just my initial reaction).

So rather than mess with the node require() module, what I ended up doing is archiving the given commit I need to a folder. My code looks something like this:
# commit_id is the commit we want
# (note that if we don't need the whole repository,
# we can pass "commit_id path_to_folder_we_need")
#
# path is the path to the file you want to require, starting from the repository root
# (i.e. 'lib/module.coffee')
#
# cb is called with (err, loaded_module)
#
fs = require 'fs'
child_process = require 'child_process'

require_at_commit = (commit_id, path, cb) ->
  dir = 'old_versions' # make sure this is in .gitignore!
  dir += '/' + commit_id
  # the './' prefix makes require treat this as a path relative to this file,
  # not as a package name looked up in node_modules
  do_require = -> cb null, require './' + dir + '/' + path
  if not fs.existsSync(dir)
    # note: mkdirSync is not recursive, so 'old_versions' itself must already exist
    fs.mkdirSync(dir)
    # dump the tree at commit_id into dir
    cmd = 'git archive ' + commit_id + ' | tar -x -C ' + dir
    child_process.exec cmd, (error) ->
      if error
        cb error
      else
        do_require()
  else
    do_require()
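Usage then looks something like this, from plain JS or compiled CoffeeScript (the commit id and path here are placeholders):
require_at_commit('abc1234567', 'lib/module.coffee', function (err, mod) {
  // 'abc1234567' is a placeholder commit id
  if (err) throw err;
  // mod is the module as it existed at that commit
});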

Related

get location of script requiring current script

I need to do some file operations with paths relative to the script that required the current one.
Say we have the following in ~/somewhere/file2.js
const y = require('~/file1.js');
And in ~/file1.js we have:
const x = require('./other/script.js'); //relative to ~/file1.js
And we invoke it like this:
cd ~/somedir
node ~/somewhere/file2.js
then within ~/other/script.js we can do this:
console.log(__dirname); // -> ~/other
console.log(__filename); // -> ~/other/script.js
console.log(process.cwd()); // -> ~/somedir
console.log(process.argv[0]); // -> path/to/node
console.log(path.resolve('.')); // -> ~/somedir
console.log(process.argv[1]); // -> ~/somewhere/file2.js
None of these are the path I need.
How, from ~/other/script.js, can I determine the location of the script that required us - i.e. ~/file1.js?
To put it another way.
~/somewhere/file2.js requires ~/file1.js
and
~/file1.js requires ~/other/script.js
from within ~/other/script.js I need to do file operations relative to ~/file1.js - how can I get its location?
I actually only need the directory in which file1.js sits, so filename or directory will work for me.
You can use module.parent.filename inside of other/script.js, or you can pass __dirname as a parameter to your module, like require('other/script.js')(__dirname) (given your module exports a function).
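For illustration, a minimal sketch of both approaches (paths match the question's layout):
// ~/other/script.js
var path = require('path');
// approach 1: module.parent is the module that first required this file
var callerDir = path.dirname(module.parent.filename);
console.log(callerDir); // -> the directory containing file1.js
// approach 2: export a function and have the caller hand over its __dirname
module.exports = function (callerDirname) {
  // do file operations relative to callerDirname here
};
One caveat: modules are cached after the first load, so module.parent points at the first module that required this file; later requires from other files will not change it.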

NodeJS Usage require('../');

I am in the process of learning NodeJs and stumbled across this line of code:
var irsdk = require('../');
I cannot figure out what is being loaded. I can see where the variable is used and where its functions are called.
I understand how to use the require statement when loading a particular file.
If anyone could shed some light it would be appreciated.
Thanks.
From Node's documentation on Modules
require(X) from module at path Y
1. If X is a core module,
   a. return the core module
   b. STOP
2. If X begins with './' or '/' or '../'
   a. LOAD_AS_FILE(Y + X)
   b. LOAD_AS_DIRECTORY(Y + X)
3. LOAD_NODE_MODULES(X, dirname(Y))
4. THROW "not found"

LOAD_AS_FILE(X)
1. If X is a file, load X as JavaScript text. STOP
2. If X.js is a file, load X.js as JavaScript text. STOP
3. If X.json is a file, parse X.json to a JavaScript Object. STOP
4. If X.node is a file, load X.node as binary addon. STOP

LOAD_AS_DIRECTORY(X)
1. If X/package.json is a file,
   a. Parse X/package.json, and look for "main" field.
   b. let M = X + (json main field)
   c. LOAD_AS_FILE(M)
2. If X/index.js is a file, load X/index.js as JavaScript text. STOP
3. If X/index.json is a file, parse X/index.json to a JavaScript object. STOP
4. If X/index.node is a file, load X/index.node as binary addon. STOP

LOAD_NODE_MODULES(X, START)
1. let DIRS = NODE_MODULES_PATHS(START)
2. for each DIR in DIRS:
   a. LOAD_AS_FILE(DIR/X)
   b. LOAD_AS_DIRECTORY(DIR/X)

NODE_MODULES_PATHS(START)
1. let PARTS = path split(START)
2. let I = count of PARTS - 1
3. let DIRS = []
4. while I >= 0,
   a. if PARTS[I] = "node_modules" CONTINUE
   b. DIR = path join(PARTS[0 .. I] + "node_modules")
   c. DIRS = DIRS + DIR
   d. let I = I - 1
5. return DIRS
require('../') runs the LOAD_AS_DIRECTORY(X) section for the parent directory.
To make this simple, just keep in mind that if you specify a directory rather than a file, it is the directory's index.js that will be loaded (or whatever the "main" field in its package.json points to).
In the current case, we are requiring ../, which will load the index of the parent directory.
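As a hypothetical illustration, given a layout like this, require('../') from lib/app.js resolves to the package root's index.js:
// my-module/index.js
module.exports = { hello: function () { console.log('hello'); } };
// my-module/lib/app.js
var pkg = require('../'); // -> my-module/index.js (or package.json's "main")
pkg.hello();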
It could be the InRule SDK for JS (http://www.inrule.com/products/inrule-for-javascript/), which can be used to separate your business logic out from your application logic.
Or, it could be the npm 'node-irsdk' package, which appears to be a telemetry package of some sort that enhances the existing "utils" module (https://www.npmjs.com/package/node-irsdk).
Either way, you can log it out to the console to get more information about it by literally logging the variable.
console.log(irsdk);
//or
console.dir(irsdk);
//both must be called AFTER you instantiate the var irsdk = req.....

How can I add the build version to a scons build

At the moment I'm using some magic to get the current git revision into my scons builds. I just grab the version and stick it into CPPDEFINES.
It works quite nicely... until the version changes and scons wants to rebuild everything, rather than just the files that have changed, because the define that all files use has changed.
Ideally I'd use a custom builder to generate a file called git_version.cpp that just has a function in it that returns the right tag. That way only that one file would be rebuilt.
Now I'm sure I've seen a tutorial showing exactly how to do this .. but I can't seem to track it down. And I find the custom builder stuff a little odd in scons...
So any pointers would be appreciated...
Anyway just for reference this is what I'm currently doing:
import subprocess

# Lets get the version from git
# first get the base version
git_sha = subprocess.Popen(["git", "rev-parse", "--short=10", "HEAD"], stdout=subprocess.PIPE).communicate()[0].strip()
# then check whether the working tree has local modifications
p1 = subprocess.Popen(["git", "status"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["grep", "Changed but not updated\\|Changes to be committed"], stdin=p1.stdout, stdout=subprocess.PIPE)
result = p2.communicate()[0].strip()
if result != "":
    git_sha += "[MOD]"

print "Building version %s" % git_sha

env = Environment()
env.Append(CPPDEFINES={'GITSHAMOD': '"\\"%s\\""' % git_sha})
You don't need a custom Builder since this is just one file. You can use a function (attached to the target version file as an Action) to generate your version file. In the example code below, I've already computed the version and put it into an environment variable. You could do the same, or you could put your code that makes git calls in the version_action function.
version_build_template = """/*
 * This file is automatically generated by the build process
 * DO NOT EDIT!
 */
const char VERSION_STRING[] = "%s";
const char* getVersionString() { return VERSION_STRING; }
"""

def version_action(target, source, env):
    """
    Generate the version file with the current version in it
    """
    contents = version_build_template % env['VERSION']
    fd = open(target[0].path, 'w')
    fd.write(contents)
    fd.close()
    return 0

build_version = env.Command('version.build.cpp', [], Action(version_action))
env.AlwaysBuild(build_version)

Python 3 C-API IO and File Execution

I am having some serious trouble getting a Python 2 based C++ engine to work in Python 3. I know the whole IO stack has changed, but everything I seem to try just ends up in failure. Below is the pre-code (Python 2) and post-code (Python 3). I am hoping someone can help me figure out what I'm doing wrong. I am also using boost::python to control the references.
The program is supposed to load a Python object into memory via a map; then, upon using the run function, it finds the file loaded in memory and runs it. I based my code off an example from the delta3d python manager, where they load in a file and run it immediately. I have not seen anything equivalent in Python3.
Python2 Code Begins here:
// what this does is first calls the Python C-API to load the file, then pass the returned
// PyObject* into handle, which takes reference and sets it as a boost::python::object.
// this takes care of all future referencing and dereferencing.
try {
    bp::object file_object(bp::handle<>(PyFile_FromString(fullPath(filename), "r")));
    loaded_files_.insert(std::make_pair(std::string(fullPath(filename)), file_object));
}
catch (...) {
    getExceptionFromPy();
}
Next I load the file from the std::map and attempt to execute it:
bp::object loaded_file = getLoadedFile(filename);
try {
    PyRun_SimpleFile(PyFile_AsFile(loaded_file.ptr()), fullPath(filename));
}
catch (...) {
    getExceptionFromPy();
}
Python3 Code Begins here: This is what I have so far based off some suggestions here... SO Question
Load:
PyObject *ioMod, *opened_file, *fd_obj;
ioMod = PyImport_ImportModule("io");
opened_file = PyObject_CallMethod(ioMod, "open", "ss", fullPath(filename), "r");
bp::handle<> h_open(opened_file);
bp::object file_obj(h_open);
loaded_files_.insert(std::make_pair(std::string(fullPath(filename)), file_obj));
Run:
bp::object loaded_file = getLoadedFile(filename);
int fd = PyObject_AsFileDescriptor(loaded_file.ptr());
PyObject* fileObj = PyFile_FromFd(fd,fullPath(filename),"r",-1,"", "\n","", 0);
FILE* f_open = _fdopen(fd,"r");
PyRun_SimpleFile( f_open, fullPath(filename) );
Lastly, the general state of the program at this point is that the file gets loaded in as a TextIOWrapper, and in the Run: section the fd that is returned is always 3; for some reason _fdopen can never open the FILE, which means I can't do something like PyRun_SimpleFile. The error itself is a debug ASSERTION on _fdopen. Is there a better way to do all this? I really appreciate any help.
If you want to see the full program of the Python2 version it's on Github
So this question was pretty hard to understand, and I'm sorry, but I found out my old code wasn't quite working as I expected. Here's what I wanted the code to do: load the python file into memory, store it in a map, and then at a later date execute that code in memory. I accomplished this a bit differently than I expected, but it makes a lot of sense now.
1. Open the file using ifstream, see the code below
2. Convert the char into a boost::python::str
3. Execute the boost::python::str with boost::python::exec
4. Profit ???
Step 1)
vector<char> input;
ifstream file(fullPath(filename), ios::in);
if (!file.is_open())
{
    // set our error message here
    setCantFindFileError();
    input.push_back('\0');
    return input;
}
file >> std::noskipws;
copy(istream_iterator<char>(file), istream_iterator<char>(), back_inserter(input));
input.push_back('\n');
input.push_back('\0');
Step 2)
bp::str file_str(string(&input[0]));
loaded_files_.insert(std::make_pair(std::string(fullPath(filename)), file_str));
Step 3)
bp::str loaded_file = getLoadedFile(filename);
// Retrieve the main module
bp::object main = bp::import("__main__");
// Retrieve the main module's namespace
bp::object global(main.attr("__dict__"));
bp::exec(loaded_file, global, global);
Full Code is located on github:

Have R look for files in a library directory

I am using R, on linux.
I have a set of functions that I use often, and that I have saved in different .r script files. Those files are in ~/r_lib/.
I would like to include those files without having to use the fully qualified name, just "file.r". Basically I am looking for the same option as -I in the C++ compiler.
Is there a way to set the include path from R, in the .Rprofile or .Renviron file?
Thanks
You can use the sourceDir function in the Examples section of ?source:
sourceDir <- function(path, trace = TRUE, ...) {
    for (nm in list.files(path, pattern = "\\.[RrSsQq]$")) {
        if (trace) cat(nm, ":")
        source(file.path(path, nm), ...)
        if (trace) cat("\n")
    }
}
And you may want to use sys.source to avoid cluttering your global environment.
If you set the chdir parameter of source to TRUE, then the source calls within the included file will be relative to its path. Hence, you can call:
source("~/r_lib/file.R",chdir=T)
It would probably be better not to have source calls within your "library" and make your code into a package, but sometimes this is convenient.
Get all the files of your directory; in your case:
d <- list.files("~/r_lib/")
Then you can load them with a function from the plyr package:
library(plyr)
l_ply(d, function(x) source(paste("~/r_lib/", x, sep = "")))
If you like you can do it in a loop as well, or use a different function instead of l_ply. Conventional loop:
for (i in 1:length(d)) source(paste("~/r_lib/", d[[i]], sep = ""))
Write your own source() wrapper?
mySource <- function(script, path = "~/r_lib/", ...) {
    ## paste path + filename
    fname <- paste(path, script, sep = "")
    ## source the file
    source(fname, ...)
}
You could stick that in your .Rprofile so it will be loaded each time you start R.
If you want to load all the R files, you can extend the above easily to source all files at once
mySource <- function(path = "~/r_lib/", ...) {
    ## list of files
    fnames <- list.files(path, pattern = "\\.[RrSsQq]$")
    ## add path
    fnames <- paste(path, fnames, sep = "")
    ## source the files
    lapply(fnames, source, ...)
    invisible()
}
Actually, though, you'd be better off starting your own private package and loading that.
