Using child_process.exec in Google Cloud Functions is very slow and inconsistent - node.js

I am trying to execute arbitrary user provided code in a reasonably secure and controlled way. I have been doing that using child_process.exec within a Google Cloud Function.
However, I'm finding that the execution time can vary pretty dramatically.
Running a single console.log inside a Cloud Function directly, vs. within child_process.exec inside a Cloud Function, adds 500-4000 ms of overhead to the execution time.
It seems a little crazy that it both:
Can vary so widely.
Can take over 4 seconds extra to run in a separate process.
My guess is that this happens because essentially only one thread is allocated to the Cloud Function, and my child process has to wait around for another one on that machine to free up.
Is there something I can do to even this out?
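For reference, the comparison looks roughly like this, as an HTTP-triggered function (the function name and response shape are placeholders, not my exact code):

    const { exec } = require('child_process');

    exports.compareOverhead = (req, res) => {
      // Direct: console.log runs in the function's own process.
      const t0 = Date.now();
      console.log('direct');
      const directMs = Date.now() - t0;

      // Via exec: spawns a shell plus a fresh node process for the same call.
      const t1 = Date.now();
      exec('node -e "console.log(\'via exec\')"', (err) => {
        res.json({ directMs, execMs: Date.now() - t1, failed: Boolean(err) });
      });
    };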
UPDATE:
So I've been able to reproduce this consistently. Something is definitely causing require statements to take a while inside child_process.exec in Cloud Functions when the dependencies are medium/large.
I was originally able to reproduce it just by using Mocha to execute an empty unit test.
But I created a whole repo to reproduce it better here
And a blog post talking about my results here
I'd be interested if someone could explain this.

For the moment, the issue appears to be that require calls to medium/large dependencies, made inside a child_process.exec call, can sometimes take a while.
Not sure why.
Cloud Run does not have this issue.
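A rough way to see the effect in isolation is to time the same require both in-process and inside a child node process (a sketch; lodash is just a stand-in for any medium-sized installed dependency):

    const { execFile } = require('child_process');

    // Time the require in-process first.
    let t0 = process.hrtime.bigint();
    require('lodash');
    console.log(`in-process require: ${(process.hrtime.bigint() - t0) / 1000000n} ms`);

    // Then time the same require inside a child node process.
    t0 = process.hrtime.bigint();
    execFile(process.execPath, ['-e', "require('lodash')"], { cwd: __dirname }, (err) => {
      if (err) throw err;
      console.log(`child-process require: ${(process.hrtime.bigint() - t0) / 1000000n} ms`);
    });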

Related

Python multiprocessing pipe with process pool

I had a few manually started processes (with p.start()) for dealing with some background tasks, and I communicated with them via multiprocessing.Pipe(). So far, so good.
Now I have to scale my application, and following the same structure would mean starting too many processes.
So I'm trying to port my code from having some manually started multiprocessing.Process instances to a pool of processes. The problem is that multiprocessing.Pipe() does not seem to work with them; it seems I would have to use a queue instead.
Specifically, I was using the code suggested in this Stack Overflow answer to run some generators in the background, but the problem is that now I have many generators.
Many thanks.

Is keeping logs active in production mode a good idea or not?

In the context of a Node.js backend application, is storing logs (request/response) in production mode a good idea or not?
For example, using a library like good.
Thanks.
Kind of, it depends. Logs can be helpful, but sometimes it's hard work to manage them and eventually find something useful when you really need it, for example, an error stack. They can also use up space on your hosting service if you don't have a cleaning or backup routine.
Furthermore, don't forget that console.log is a blocking operation, which can decrease the application's performance when used in excess. Some libraries, such as winston, handle this better, but they don't avoid it entirely.
If you just want more helpful error stacks when errors happen, I suggest you take a look at Rollbar. It's a good crash-reporting service. All you need to do is catch your errors and send them to the service using its library.
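If you do keep logging on in production, a minimal winston setup along these lines can keep the overhead and disk usage manageable (file name and size limits here are placeholders):

    const winston = require('winston');

    // JSON logs to a size-capped, rotating file in production;
    // readable console output everywhere else.
    const logger = winston.createLogger({
      level: process.env.NODE_ENV === 'production' ? 'info' : 'debug',
      format: winston.format.json(),
      transports: [
        new winston.transports.File({
          filename: 'app.log',
          maxsize: 10 * 1024 * 1024, // rotate after ~10 MB
          maxFiles: 5,               // keep at most 5 rotated files
        }),
      ],
    });

    if (process.env.NODE_ENV !== 'production') {
      logger.add(new winston.transports.Console({ format: winston.format.simple() }));
    }

    logger.info('request handled', { path: '/api/users', status: 200 });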

Azure Function reaching timeout without doing anything

I have an Azure Function app in Node.js with a couple of Queue-triggered functions.
These were working great, until I saw a couple of timeouts in my function logs.
From that point, none of my triggered functions are actually doing anything. They just keep timing out before even executing the first line of code, which is a context.log() statement to show the execution time.
What could be the cause of this?
Check your function's storage account in the Azure portal; you'll likely see very high activity for file monitoring.
This is likely due to the interaction between Azure Files and requiring a large node_modules tree. Once the modules have been required once, functions will execute quickly because modules are cached, but these timeouts can throw the function app into a timeout -> restart loop.
There's a lot of discussion on this, along with one possible improvement (using webpack on server-side modules), here; a sketch of that approach follows the list below.
Other possibilities:
decrease number of node modules if possible
move to dedicated instead of consumption plan (it runs on a different file system which has better performance)
use C# or F#, which don't suffer from these limitations
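For the webpack suggestion, a minimal config for bundling a queue-triggered function's dependencies into a single file might look like this (entry and output paths are assumptions about your layout):

    // webpack.config.js
    const path = require('path');

    module.exports = {
      target: 'node',                    // keep node built-ins external
      mode: 'production',
      entry: './MyQueueTrigger/index.js',
      output: {
        path: path.resolve(__dirname, 'MyQueueTrigger'),
        filename: 'index.bundled.js',    // point function.json's scriptFile here
        libraryTarget: 'commonjs2',      // export the function the runtime expects
      },
    };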

Testing a Node library working with Docker containers

I'm currently writing a Node library to execute untrusted code within Docker containers. It basically maintains a pool of containers running, and provides an interface to run code in one of them. Once the execution is complete, the corresponding container is destroyed and replaced by a new one.
The four main classes of the library are:
Sandbox. Exposes a constructor with various options including the pool size, and two public methods: executeCode(code, callback) and cleanup(callback)
Job. A class with two attributes, code and callback (to be called when the execution is complete)
PoolManager, used by the Sandbox class to manage the pool of containers. Provides the public methods initialize(size, callback) and executeJob(job, callback). It has internal methods related to the management of the containers (_startContainer, _stopContainer, _registerContainer, etc.). It uses an instance of the dockerode library, passed in the constructor, to do all the docker related stuff per se.
Container. A class with the attributes tmpDir, dockerodeInstance, IP and a public method executeCode(code, callback), which basically sends an HTTP POST request to ContainerIP:3000/compile along with the code to compile (a minimalist API runs inside each Docker container).
In the end, the final users of the library will only be using the Sandbox class.
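Based on that, end-user code would presumably look something like this (the option name poolSize and the result shape are placeholders):

    const Sandbox = require('./lib/Sandbox'); // hypothetical path

    const sandbox = new Sandbox({ poolSize: 5 });

    sandbox.executeCode('console.log("hello from a container")', (err, result) => {
      if (err) return console.error('execution failed:', err);
      console.log('output:', result);

      // Destroy the remaining containers once we're done.
      sandbox.cleanup((cleanupErr) => {
        if (cleanupErr) console.error('cleanup failed:', cleanupErr);
      });
    });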
Now, my question is: how should I test this?
First, it seems pretty clear to me that I should begin by writing functional tests against my Sandbox class:
it should create X containers, where X is the required pool size
it should correctly execute code (including the security aspects: handling timeouts, fork bombs, etc. which are in the library's requirements)
it should correctly cleanup the resources it uses
But then I'm not sure what else it would make sense to test, how to do it, and if the architecture I'm using is suitable to be correctly tested.
Any idea or suggestion related to this is highly appreciated! :) And feel free to ask for a clarification if anything looks unclear.
Christophe
Try and separate your functional and unit testing as much as you can.
If you make a minor change to your constructor on Sandbox then I think testing will become easier. Sandbox should take a PoolManager directly. Then you can mock the PoolManager and test Sandbox in isolation, which it appears is just the creation of Jobs, calling PoolManager for Containers and cleanup. Ok, now Sandbox is unit tested.
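As a sketch of that change and the kind of unit test it enables (Mocha-style, with a hand-rolled stub; the Job shape follows the description above):

    const assert = require('assert');

    class Sandbox {
      constructor(poolManager) {
        this.poolManager = poolManager; // injected, so tests can pass a stub
      }
      executeCode(code, callback) {
        const job = { code, callback };
        this.poolManager.executeJob(job, callback);
      }
    }

    it('delegates jobs to the pool manager', (done) => {
      const jobs = [];
      const fakePool = {
        executeJob: (job, cb) => { jobs.push(job); cb(null, 'ok'); },
      };
      const sandbox = new Sandbox(fakePool);

      sandbox.executeCode('1 + 1', (err, result) => {
        assert.ifError(err);
        assert.strictEqual(result, 'ok');
        assert.strictEqual(jobs.length, 1);
        assert.strictEqual(jobs[0].code, '1 + 1');
        done();
      });
    });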
PoolManager may be harder to unit test, as the Dockerode client might be hard to mock (the API is fairly big). Whether you mock it or not, you'll want to test:
Growing/shrinking the pool size correctly
Testing sending more requests than available containers in the pool
How stuck containers are handled. Both starting and stopping
Handling of network failures (easier when you mock things)
Retries
Any other failure cases you can think of
The Container can be tested by firing up the API from within the tests (in a container or locally). If it's that minimal, recreating it should be straightforward. Once you have that, it sounds like it's really just testing an HTTP client.
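Testing against that API could be as small as this (the /compile route and port come from the question; the request/response shapes are assumptions):

    const http = require('http');

    // Minimal client mirroring what Container.executeCode is described as doing.
    function postCode(containerIP, code, callback) {
      const body = JSON.stringify({ code });
      const req = http.request({
        host: containerIP,
        port: 3000,
        path: '/compile',
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Content-Length': Buffer.byteLength(body),
        },
      }, (res) => {
        let data = '';
        res.on('data', (chunk) => { data += chunk; });
        res.on('end', () => callback(null, res.statusCode, data));
      });
      req.on('error', callback);
      req.end(body);
    }

    // In a test, fire up the API locally and assert on the response:
    postCode('127.0.0.1', 'console.log(42)', (err, status, responseBody) => {
      if (err) throw err;
      console.log(status, responseBody);
    });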
The source code for the actual API within the container can be tested however you like with standard unit tests. Because you're dealing with untrusted code there are a lot of possibilities:
Doesn't compile
Never completes execution
Never starts
All sorts of bombs
Uses all host's disk space
Is a bot and talks over the network
The code could do basically anything. You'll have to pick the things you care about. Try and restrict everything else.
Functional tests are going to be important too; there are a lot of pieces to deal with here, and mocking Docker isn't going to be easy.
Code isolation is a difficult problem; I wish Docker was around last time I had to deal with it. Just remember that your customers will always do things you didn't expect! Good luck!

Heroku node timeout because of enormous task

Our node app is getting quite big, and one job takes quite some time to execute. We run this job with a cron job that calls a URL. Heroku has a problem with this because the job takes more than 30 seconds to finish, so we receive a timeout, and after that it immediately tries to execute the job again, and again, until our memory quota is at about 300% and the app crashes.
Now I want to fix this. Locally we don't have any problems running this script at all. It takes about a minute to finish (for now; if we have more users in the future it may take longer), and memory stays stable.
Running this script in the background should fix the problem, according to https://devcenter.heroku.com/articles/request-timeout#debugging-request-timeouts
Over here https://devcenter.heroku.com/articles/asynchronous-web-worker-model-using-rabbitmq-in-node#getting-started I read about JackRabbit. But it seems like it's meant for systems using RabbitMQ: https://github.com/hunterloftis/jackrabbit
So my question: does anyone have experience with background tasks in Node? Can and should I use JackRabbit for my background tasks, or are there better solutions? My background task is just a very complex ExpressJS task which takes some time to execute.
I'm the Node.js platform owner at Heroku (and I actually wrote the web worker article you referenced).
Your use case sounds like it may fit the scheduler very well:
https://devcenter.heroku.com/articles/scheduler
It's a great replacement for cron-type jobs.
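Concretely, that means moving the work out of the Express request handler into a standalone script that Scheduler runs on a one-off dyno, along these lines (file name and job body are placeholders):

    // jobs/big-task.js, run by Heroku Scheduler as: node jobs/big-task.js
    async function runBigJob() {
      console.log('job started at', new Date().toISOString());
      // ... the work previously triggered by the cron URL goes here ...
    }

    runBigJob()
      .then(() => {
        console.log('job finished');
        process.exit(0); // exit so the one-off dyno shuts down
      })
      .catch((err) => {
        console.error('job failed:', err);
        process.exit(1);
      });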
