Is there a practical way (possibly by inserting some JS constructs into the code) to break or halt long-running JS code during execution? For example, can it be interrupted through some process.* construct or something similar? Or in some other way? A valid solution may even involve killing and/or restarting the Node.js process. Thank you!
EDIT:
I need to execute arbitrary user code on the server using the Function constructor (à la eval; let's leave security concerns aside). I cannot insert any extra code inside it, only wrap it. What I need is a way to break the user code after 5 minutes if it has not finished by then. For example:
usercode = 'Some code from the user';
pre_code = 'some controlling code for breaking the user code';
post_code = 'another controlling code';
fcode = pre_code + usercode + post_code;
<preparations for breaking usercode>
(new Function(fcode))(); // This MUST exit in 5 minutes
Edit:
Answering your edit: I see the intention now. If it is running in Node.js, you can use worker_threads for that: https://nodejs.org/api/worker_threads.html#worker_threads_worker_workerdata.
For example:
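// main.js
const { Worker } = require("worker_threads");
const runCode = (code) => {
  const worker = new Worker("./code-executor.js", { workerData: { code } });
  const promise = new Promise((resolve, reject) => {
    // Worker has no kill(); terminate() stops the thread even inside a busy loop
    const timer = setTimeout(() => {
      worker.terminate();
      reject(new Error("Timed out after 5 minutes"));
    }, 60000 * 5);
    worker.on("error", (error) => {
      clearTimeout(timer);
      return reject(error);
    });
    worker.on("message", (message) => {
      clearTimeout(timer);
      if(message.success) return resolve(message.result);
      return reject(new Error(message.error));
    });
  });
  // Clean up the worker once settled; swallow the rejection on this derived promise only
  promise.finally(() => { worker.terminate() }).catch(() => {});
  return promise;
}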
// code-executor.js
const { workerData, parentPort } = require("worker_threads");
const { code } = workerData;
Promise.resolve()
  .then(() => (new Function(code))())
  .then((result) => {
    parentPort.postMessage({
      success: true,
      result
    })
  })
  .catch((error) => {
    // report failures with success: false so the parent rejects
    parentPort.postMessage({
      success: false,
      error: error.message
    })
  });
If it's in the browser, see https://developer.mozilla.org/en-US/docs/Web/API/Worker
The Web API is not exactly the same, but the logic should be similar.
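For quick experiments in Node.js, the two files can be folded into one by passing the worker source as a string together with the eval: true option of the Worker constructor. This is only a minimal sketch: runWithTimeout, workerSource and the short timeouts are illustrative names and values, not part of the answer above.

```javascript
const { Worker } = require("worker_threads");

// Worker source kept in a string so the whole sketch fits in one file.
const workerSource = `
  const { workerData, parentPort } = require("worker_threads");
  try {
    const result = new Function(workerData.code)();
    parentPort.postMessage({ success: true, result });
  } catch (error) {
    parentPort.postMessage({ success: false, error: error.message });
  }
`;

// Hypothetical helper: runs `code` in a worker and gives up after timeoutMs.
function runWithTimeout(code, timeoutMs) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { code } });
    const timer = setTimeout(() => {
      worker.terminate(); // stops the thread even if it is stuck in a loop
      reject(new Error(`Timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    worker.once("message", (message) => {
      clearTimeout(timer);
      worker.terminate();
      if (message.success) resolve(message.result);
      else reject(new Error(message.error));
    });
    worker.once("error", (error) => {
      clearTimeout(timer);
      reject(error);
    });
  });
}
```

Calling runWithTimeout("while (true) {}", 300) should reject after roughly 300 ms, while the main thread stays free to do other work in the meantime.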
Original
Killing a process. Also read: https://nodejs.org/api/process.html#process_signal_events
process.kill(pid, "SIGINT")
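If killing a whole process is acceptable, the same idea can be tried with a child process that is killed from the parent after a deadline. A rough sketch, assuming the user code can be passed to node -e; runInChild is a made-up helper name, and SIGKILL is my choice here rather than the SIGINT shown above:

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: runs `code` in a separate Node.js process and kills
// that process if it is still running after timeoutMs.
function runInChild(code, timeoutMs) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", code], { stdio: "ignore" });
    const timer = setTimeout(() => child.kill("SIGKILL"), timeoutMs);
    child.on("error", (error) => {
      clearTimeout(timer);
      reject(error);
    });
    child.on("exit", (exitCode, signal) => {
      clearTimeout(timer);
      // signal is set when the child was killed, exitCode when it ended itself
      resolve({ exitCode, signal });
    });
  });
}
```

The caller can tell a timeout from a normal finish by checking whether signal is set in the resolved object.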
"Killing" a long running function, you gotta hack a bit. There's no elegant solution. Inject a controller which can be mutated outside of the long running function. To stop it from the outside, set controller.isStopped = true
export const STOP_EXECUTION = Symbol();
function longRunning(controller){
  // ... codes
  // add stopping point
  if(controller.isStopped) throw STOP_EXECUTION;
  // ... codes
  // add stopping point
  if(controller.isStopped) throw STOP_EXECUTION;
  // ... codes
}
// catch it by
const controller = { isStopped: false };
try{
  longRunning(controller);
}catch(e){
  switch(true){
    case e === STOP_EXECUTION: /* ... */ break; // the longRunning function is stopped from the outside
    default: /* ... */; // the longRunning function is throwing not because of being stopped
  }
}
Gist: https://gist.github.com/Kelerchian/3824ca4ce1be390d34c5147db671cc9b
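One caveat with the controller approach: in a single thread, nothing else can flip controller.isStopped while a purely synchronous function is running, so the long-running function has to give control back to the event loop between its stopping points. A small sketch of that, where yieldToEventLoop and demo are illustrative names and the loop body is a stand-in for real work:

```javascript
const STOP_EXECUTION = Symbol("STOP_EXECUTION");

// Gives timers and I/O callbacks a chance to run before continuing.
const yieldToEventLoop = () => new Promise((resolve) => setImmediate(resolve));

async function longRunning(controller) {
  let chunksDone = 0;
  while (true) {
    chunksDone += 1; // stand-in for one chunk of real work
    await yieldToEventLoop();
    if (controller.isStopped) throw STOP_EXECUTION; // stopping point
  }
}

async function demo() {
  const controller = { isStopped: false };
  // Something outside the function flips the flag a little later.
  setTimeout(() => { controller.isStopped = true; }, 50);
  try {
    await longRunning(controller);
    return "finished";
  } catch (e) {
    if (e === STOP_EXECUTION) return "stopped";
    throw e;
  }
}
```

Without the await inside the loop, the setTimeout callback would never get to run and the flag would stay false.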
Related
I am using a TCP server to receive and process packets in Node.js. It should receive 2 packets:
"create" for creating an object in a database. It first checks if the object already exists and then creates it. (-> takes some time to process)
"update" for updating the newly created object in the database
For the sake of simplicity, we'll just assume the first step always takes longer than the second. (which is always true in my original code)
This is a MWE:
const net = require("net");
const server = net.createServer((conn) => {
conn.on('data', async (data) => {
console.log(`Instruction ${data} recieved`);
await sleep(1000);
console.log(`Instruction ${data} done`);
});
});
server.listen(1234);
const client = net.createConnection(1234, 'localhost', async () => {
client.write("create");
await sleep(10); // just a cheap workaround to "force" sending 2 packets instead of one
client.write("update");
});
// Just to make it easier to read
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
If I run this code, I get:
Instruction create recieved
Instruction update recieved
Instruction create done
Instruction update done
But I want the "create" instruction to block the conn.on('data', func) until the last callback returns asynchronously. The current code tries to update an entry before it is created in the database, which is not ideal.
Is there an (elegant) way to achieve this? I suspect some kind of buffer that stores the data and a worker loop of some kind that processes it. But how do I avoid running an infinite loop which blocks the event loop? (Event loop is the correct term, isn't it?)
Note: I have a lot more logic to handle fragmentation, etc. But this explains the issue I'm having.
I managed to get it to work with the package async-fifo-queue.
It's not the cleanest solution, but it should do what I want and be as efficient as possible (using async/await instead of just looping infinitely).
Code:
const net = require("net");
const afq = require("async-fifo-queue");
const q = new afq.Queue();
const server = net.createServer((conn) => {
conn.on('data', q.put.bind(q));
});
server.listen(1234);
const client = net.createConnection(1234, 'localhost', async () => {
client.write("create");
await sleep(10);
client.write("update");
});
(async () => {
while(server.listening) {
const data = await q.get();
console.log(`Instruction ${data} recieved`);
await sleep(1000);
console.log(`Instruction ${data} done`);
}
})();
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
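A dependency-free variant of the same idea: sockets are readable streams, and readable streams can be consumed with for await, which does not pull the next chunk until the loop body has finished. The sketch below uses Readable.from() purely as a stand-in for the conn socket, and processSequentially is an illustrative name:

```javascript
const { Readable } = require("stream");

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Processes chunks strictly one after another. `source` can be any readable
// stream, for example the `conn` socket handed to net.createServer's callback.
async function processSequentially(source, log) {
  for await (const data of source) {
    log.push(`Instruction ${data} received`);
    await sleep(100); // stand-in for the database work
    log.push(`Instruction ${data} done`);
  }
  return log;
}

// Readable.from() only simulates two incoming packets here.
const done = processSequentially(Readable.from(["create", "update"]), []);
done.then((log) => console.log(log.join("\n")));
```

With a real socket, an error on the connection would surface as an exception thrown from the loop, so the call should be wrapped in try/catch or followed by .catch().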
You can pause the socket when you get the "create" event. After it finishes, you can resume the socket. Example:
const server = net.createServer((conn) => {
  conn.on('data', async (data) => {
    // data arrives as a Buffer, so compare its string form
    const isCreate = data.toString() === 'create';
    if (isCreate) {
      conn.pause()
    }
    console.log(`Instruction ${data} recieved`);
    await sleep(1000);
    console.log(`Instruction ${data} done`);
    if (isCreate) {
      conn.resume()
    }
  });
});
server.listen(1234);
const client = net.createConnection(1234, 'localhost', async () => {
client.write("create");
await sleep(10); // just a cheap workaround to "force" sending 2 packets instead of one
client.write("update");
});
// Just to make it easier to read
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
The code below is an example of what may take place during development.
With the current code, the outer function may throw an error, but in this case it won't. However, the nested function WILL throw an error (for example's sake). Once it throws the error, it cannot be caught because it is an asynchronous function.
Bungie.Get('/Platform/Destiny2/Manifest/').then((ResponseText)=>{
//Async function that WILL throw an error
Bungie.Get('/Platform/Destiny2/Mnifest/').then((ResponseText)=>{
console.log('Success')
})
}).catch((error)=>{
//Catch all errors from either the main function or the nested function
doSomethingWithError(error)
});
What I want is for the outermost function to catch all asynchronous function errors, but with this code I cannot. I have tried awaiting the nested function, but there may be certain circumstances where it will be quicker not to wait for the function. I also tried to include a .catch() with each nested function, but this would require a .catch() for each function, all of which would handle the error in the same way, e.g. doSomethingWithError().
You only need to return the inner promise from the outer function's callback.
See the example below:
const foo = new Promise((resolve,reject) =>{
setTimeout(() => resolve('foo'), 1000);
});
foo.then((res)=>{
console.log(res)
return new Promise((resolve,reject)=>{
setTimeout(() => reject("bar fail"), 1000);
})
}).catch((e)=>{
// your own logic
console.error(e)
});
This is called promise chaining. See this post for more info: https://javascript.info/promise-chaining
If you have multiple promises, you can do something like:
const foo1 = new Promise((resolve,reject) =>{
setTimeout(() => resolve('foo1'), 1000);
});
const foo2 = new Promise((resolve,reject) =>{
setTimeout(() => resolve('foo2'), 2000);
});
const foo3 = new Promise((resolve,reject) =>{
setTimeout(() => reject('foo3'), 3000);
});
const bar = new Promise((resolve,reject) =>{
setTimeout(() => resolve('bar'), 4000);
});
foo1
.then((res)=>{
console.log(res)
return foo2
})
.then((res)=>{
console.log(res)
return foo3 // rejects, so the chain jumps to the catch
})
.then((res)=>{
console.log(res)
return bar
})
.catch((e)=>{
// every error will be caught here
console.error(e)
});
I would aim to use async / await unless you have very particular reasons, since it avoids callback hell and makes your code simpler and more bug free.
try {
const response1 = await Bungie.Get('/Platform/Destiny2/Manifest/');
const response2 = await Bungie.Get('/Platform/Destiny2/Mnifest/');
console.log('Success');
} catch (error) {
doSomethingWithError(error);
}
Imagine each Bungie call takes 250 milliseconds. While this is occurring, NodeJS will continue to execute other code via its event loop - eg requests from other clients. Awaiting is not the same as hanging the app.
Similarly, this type of code is used in many browser or mobile apps, and they remain responsive to the end user during I/O. I use the async await programming model in all languages these days (Javascript, Java, C#, Swift etc).
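If the two requests do not depend on each other, they can also be started together and awaited as a pair, so nothing waits longer than necessary and one catch still covers both. A sketch under that assumption; fakeGet is a hypothetical stand-in for Bungie.Get that rejects for the misspelled path:

```javascript
// Hypothetical stand-in for Bungie.Get: rejects for the misspelled path.
function fakeGet(path) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (path.includes("Manifest")) resolve(`response for ${path}`);
      else reject(new Error(`404 for ${path}`));
    }, 50);
  });
}

async function loadBoth() {
  try {
    // Both requests start immediately; the single await covers the pair.
    const [first, second] = await Promise.all([
      fakeGet("/Platform/Destiny2/Manifest/"),
      fakeGet("/Platform/Destiny2/Mnifest/"),
    ]);
    return { ok: true, first, second };
  } catch (error) {
    // A rejection from either request ends up here.
    return { ok: false, message: error.message };
  }
}
```

Promise.all rejects as soon as any of its inputs rejects, which is what lets the one catch block handle either failure.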
Try this:
let getMultiple = function(callback, ... keys){
let result = [];
let ctr = keys.length;
for(let i=0;i<ctr;i++)
result.push(0);
let ctr2 = 0;
keys.forEach(function(key){
let ctr3=ctr2++;
try{
Bungie.Get(key, function(data){
result[ctr3] = data;
ctr--;
if(ctr==0)
{
callback(result);
}
});
} catch(err) {
result[ctr3]=err.message;
ctr--;
if(ctr==0)
{
callback(result);
}
}
});
};
This should get all your data requests and replace relevant data with error message if it happens.
getMultiple(function(results){
console.log(results);
}, string1, string2, string3);
If the error is caused by requesting the same thing twice asynchronously, you can add an asynchronous caching layer in front of this request.
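Since Bungie.Get returns a promise in the question, roughly the same result can be had with Promise.allSettled, which waits for every request and reports each outcome separately. A sketch only; fakeGet is a hypothetical stand-in for Bungie.Get, and this getMultiple returns a promise instead of taking a callback:

```javascript
// Hypothetical promise-returning stand-in for Bungie.Get.
function fakeGet(key) {
  return key.startsWith("/ok")
    ? Promise.resolve(`data for ${key}`)
    : Promise.reject(new Error(`failed: ${key}`));
}

// Resolves to an array in the same order as `keys`; a failed request is
// represented by its error message instead of data.
async function getMultiple(...keys) {
  const settled = await Promise.allSettled(keys.map((key) => fakeGet(key)));
  return settled.map((outcome) =>
    outcome.status === "fulfilled" ? outcome.value : outcome.reason.message
  );
}
```

Unlike a try/catch wrapped around a callback-style call, this also captures failures that happen asynchronously.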
I'm writing an HTTP API with expressjs in Node.js and here is what I'm trying to achieve:
I have a task that I would like to run regularly, approximately every minute. This task is implemented with an async function named task.
In reaction to a call in my API, I would like to have that task called immediately as well.
Two executions of the task function must not be concurrent. Each execution should run to completion before another execution is started.
The code looks like this:
// only a single execution of this function is allowed at a time
// which is not the case with the current code
async function task(reason: string) {
console.log("do thing because %s...", reason);
await sleep(1000);
console.log("done");
}
// call task regularly
setIntervalAsync(async () => {
await task("ticker");
}, 5000) // normally 1min
// call task immediately
app.get("/task", async (req, res) => {
await task("trigger");
res.send("ok");
});
I've put a full working sample project at https://github.com/piec/question.js
If I were in Go I would do it like this and it would be easy, but I don't know how to do that with Node.js.
Ideas I have considered or tried:
I could apparently put task in a critical section using a mutex from the async-mutex library. But I'm not too fond of adding mutexes in js code.
Many people seem to be using message queue libraries with worker processes (bee-queue, bullmq, ...) but this adds a dependency to an external service like redis usually. Also if I'm correct the code would be a bit more complex because I need a main entrypoint and an entrypoint for worker processes. Also you can't share objects with the workers as easily as in a "normal" single process situation.
I have tried RxJs subject in order to make a producer consumer channel. But I was not able to limit the execution of task to one at a time (task is async).
Thank you!
You can make your own serialized asynchronous queue and run the tasks through that.
This queue uses a flag to keep track of whether it's in the middle of running an asynchronous operation already. If so, it just adds the task to the queue and will run it when the current operation is done. If not, it runs it now. Adding it to the queue returns a promise so the caller can know when the task finally got to run.
If the tasks are asynchronous, they are required to return a promise that is linked to the asynchronous activity. You can mix in non-asynchronous tasks too and they will also be serialized.
class SerializedAsyncQueue {
constructor() {
this.tasks = [];
this.inProcess = false;
}
// adds a promise-returning function and its args to the queue
// returns a promise that resolves when the function finally gets to run
add(fn, ...args) {
let d = new Deferred();
this.tasks.push({ fn, args, deferred: d });
this.check();
return d.promise;
}
check() {
if (!this.inProcess && this.tasks.length) {
// run next task
this.inProcess = true;
const nextTask = this.tasks.shift();
Promise.resolve(nextTask.fn(...nextTask.args)).then(val => {
this.inProcess = false;
nextTask.deferred.resolve(val);
this.check();
}).catch(err => {
console.log(err);
this.inProcess = false;
nextTask.deferred.reject(err);
this.check();
});
}
}
}
const Deferred = function() {
if (!(this instanceof Deferred)) {
return new Deferred();
}
const p = this.promise = new Promise((resolve, reject) => {
this.resolve = resolve;
this.reject = reject;
});
this.then = p.then.bind(p);
this.catch = p.catch.bind(p);
if (p.finally) {
this.finally = p.finally.bind(p);
}
}
let queue = new SerializedAsyncQueue();
// utility function
const sleep = function(t) {
return new Promise(resolve => {
setTimeout(resolve, t);
});
}
// only a single execution of this function is allowed at a time
// so it is run only via the queue that makes sure it is serialized
async function task(reason: string) {
  // runIt must itself be async because it uses await
  async function runIt() {
    console.log("do thing because %s...", reason);
    await sleep(1000);
    console.log("done");
  }
  return queue.add(runIt);
}
// call task regularly
setIntervalAsync(async () => {
await task("ticker");
}, 5000) // normally 1min
// call task immediately
app.get("/task", async (req, res) => {
await task("trigger");
res.send("ok");
});
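If a full queue class feels heavy, the same serialization can be approximated by chaining every task onto the tail of a single promise. This is a compact sketch in plain JavaScript, not the class above; runSerialized, tail and events are illustrative names:

```javascript
// Tail of the chain: each new task starts only after this promise settles.
let tail = Promise.resolve();

function runSerialized(fn) {
  const result = tail.then(() => fn());
  // Keep the chain usable after a failure; the caller still sees the
  // rejection through `result`.
  tail = result.catch(() => {});
  return result;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const events = [];

function task(reason) {
  return runSerialized(async () => {
    events.push(`start ${reason}`);
    await sleep(50);
    events.push(`end ${reason}`);
    return reason;
  });
}

// Two overlapping requests: the second starts only after the first ends.
const finished = Promise.all([task("ticker"), task("trigger")]);
```

The trade-off compared with the queue class is that there is no way to inspect or clear pending tasks, since they only exist as links in the promise chain.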
Here's a version using RxJS#Subject that is almost working. How to finish it depends on your use-case.
async function task(reason: string) {
console.log("do thing because %s...", reason);
await sleep(1000);
console.log("done");
}
const run = new Subject<string>();
const effect$ = run.pipe(
// Limit one task at a time
concatMap(task),
share()
);
const effectSub = effect$.subscribe();
interval(5000).subscribe(_ =>
run.next("ticker")
);
// call task immediately
app.get("/task", async (req, res) => {
effect$.pipe(
take(1)
).subscribe(_ =>
res.send("ok")
);
run.next("trigger");
});
The issue here is that res.send("ok") is linked to the effect$ stream's next emission. This may not be the one generated by the run.next you're about to call.
There are many ways to fix this. For example, you can tag each emission with an ID and then wait for the corresponding emission before using res.send("ok").
There are better ways too if calls distinguish themselves naturally.
A Clunky ID Version
Generating an ID randomly is a bad idea, but it gets the general thrust across. You can generate unique IDs however you like. They can be integrated directly into the task somehow or can be kept 100% separate the way they are here (task itself has no knowledge that it's been assigned an ID before being run).
interface IdTask {
taskId: number,
reason: string
}
interface IdResponse {
taskId: number,
response: any
}
async function task(reason: string) {
console.log("do thing because %s...", reason);
await sleep(1000);
console.log("done");
}
const run = new Subject<IdTask>();
const effect$: Observable<IdResponse> = run.pipe(
// concatMap only allows one observable at a time to run
concatMap((eTask: IdTask) => from(task(eTask.reason)).pipe(
map((response:any) => ({
taskId: eTask.taskId,
response
})as IdResponse)
)),
share()
);
const effectSub = effect$.subscribe({
next: v => console.log("This is a shared task emission: ", v)
});
interval(5000).subscribe(num =>
run.next({
taskId: num,
reason: "ticker"
})
);
// call task immediately
app.get("/task", async (req, res) => {
const randomId = Math.random();
effect$.pipe(
filter(({taskId}) => taskId == randomId),
take(1)
).subscribe(_ =>
res.send("ok")
);
run.next({
taskId: randomId,
reason: "trigger"
});
});
I'm currently setting up a CI environment to automate the e2e tests our team runs in a test harness. I am setting this up on GitLab and currently using Puppeteer. Our test harness fires an event that signals when a test is complete. Now I am trying to "pool" the execution so I don't use up all resources or run out of listeners, and I decided to try "puppeteer-cluster" for this. I am close to having things working, but I can't get it to wait for the event on the page before closing the browser. Before using puppeteer-cluster, I passed a callback into my function and called it when the custom event (injected via exposeFunction) fired. That callback is now passed in through the task data, so the task no longer waits for it. I can't find a way to make the execution wait and was hoping someone might have an idea. Any recommendations are welcome.
test('Should launch the browser and run e2e tests', async (done) => {
try {
const cluster = await Cluster.launch({
concurrency: Cluster.CONCURRENCY_CONTEXT,
maxConcurrency: 10,
monitor: false,
timeout: 1200000,
puppeteerOptions: browserConfig
});
// Print errors to console
cluster.on("taskerror", (err, data) => {
console.log(`Error crawling ${data}: ${err.message}`);
});
//Setup our task to be run
await cluster.task( async ({page, data: {testUrl, isLastIndex, cb}, worker}) => {
console.log(`Test starting at url: ${testUrl} - isLastIndex: ${isLastIndex}`);
await page.goto(testUrl);
await page.waitForSelector('#testHarness');
await page.exposeFunction('onCustomEvent', async (e) => {
if (isLastIndex === true){
//Make a call to our callback, finalizing tests are complete
cb();
}
console.log(`Completed test at url: ${testUrl}`);
});
await page.evaluate(() => {
document.addEventListener('TEST_COMPLETE', (e) => {
window.onCustomEvent('TEST_COMPLETE');
console.log("TEST COMPLETE");
});
});
});
//Perform the assignment of all of our xml tests to an array
let arrOfTests = await buildTestArray();
const arrOfTestsLen = arrOfTests.length;
for( let i=0; i < arrOfTestsLen; ++i){
//push our tests on task queue
await cluster.queue( {testUrl: arrOfTests[i], isLastIndex: (i === arrOfTestsLen - 1), cb: done });
};
await cluster.idle();
await cluster.close();
} catch (error) {
console.log('ERROR:',error);
done();
throw error;
}
});
So I got something working, but it feels hacky and I'm not sure it is the right approach. If anyone knows the proper or a more recommended way of doing this, don't hesitate to respond. I am posting this here should anyone else deal with something similar. I was able to get this working with a bool and setInterval. I have pasted the working result below.
await cluster.task( async ({page, data: {testUrl, isLastIndex, cb}, worker}) => {
let complete = false;
console.log(`Test starting at url: ${testUrl} - isLastIndex: ${isLastIndex}`);
await page.goto(testUrl)
await page.waitForSelector('#testHarness');
await page.focus('#testHarness');
await page.exposeFunction('onCustomEvent', async (e) => {
console.log("Custom event fired");
if (isLastIndex === true){
//Make a call to our callback, finalizing tests are complete
cb();
complete = true;
//console.log(`VAL IS ${complete}`);
}
console.log(`Completed test at url: ${testUrl}`);
});
//This will run on the actual page itself. So setup an event listener for
//the TEST_COMPLETE event sent from the test harness itself
await page.evaluate(() => {
document.addEventListener('TEST_COMPLETE', (e) => {
window.onCustomEvent('TEST_COMPLETE');
});
});
await new Promise(resolve => {
try {
let timerId = setInterval(()=>{
if (complete === true){
resolve();
clearInterval(timerId);
}
}, 1000);
} catch (e) {
console.log('ERROR ', e);
}
});
});
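The polling loop can probably be avoided by creating the promise first and handing its resolver to the exposed function, so the task simply awaits the event. The sketch below is not tested against puppeteer-cluster: createFakePage is a hypothetical stand-in that only mimics page.exposeFunction, and runOneTest is an illustrative name.

```javascript
// Hypothetical stand-in for a Puppeteer page: exposeFunction() stores the
// callback, and fireTestComplete() plays the role of the in-page event.
function createFakePage() {
  const exposed = {};
  return {
    async exposeFunction(name, fn) { exposed[name] = fn; },
    fireTestComplete() { exposed.onCustomEvent("TEST_COMPLETE"); },
  };
}

async function runOneTest(page) {
  // Create the promise first and hand its resolver to the exposed function.
  let markComplete;
  const completed = new Promise((resolve) => { markComplete = resolve; });
  await page.exposeFunction("onCustomEvent", (eventName) => markComplete(eventName));

  // In the real task, page.evaluate() would register the TEST_COMPLETE
  // listener at this point, as in the code above.

  return completed; // stays pending until the event arrives
}

const page = createFakePage();
const result = runOneTest(page);
setTimeout(() => page.fireTestComplete(), 50);
```

Because the task function only returns once completed resolves, the cluster should treat the task as running until the event has fired.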
I am trying to rewrite a module I wrote that seeds a MongoDB database. It was originally working fine with callbacks, but I want to move to Promises. However, the execution and results don't seem to make any sense.
There are three general functions in a Seeder object:
// functions will be renamed
Seeder.prototype.connectPromise = function (url, opts) {
return new Promise((resolve,reject) => {
try {
mongoose.connect(url, opts).then(() => {
const connected = mongoose.connection.readyState == 1
this.connected = connected
resolve(connected)
})
} catch (error) {
reject(error)
}
})
}
[...]
Seeder.prototype.seedDataPromise = function (data) {
return new Promise((resolve,reject) => {
if (!this.connected) reject(new Error('Not connected to MongoDB'))
// Stores all promises to be resolved
var promises = []
// Fetch the model via its name string from mongoose
const Model = mongoose.model(data.model)
// For each object in the 'documents' field of the main object
data.documents.forEach((item) => {
// generates a Promise for a single item insertion.
promises.push(promise(Model, item))
})
// Fulfil each Promise in parallel
Promise.all(promises).then(resolve(true)).catch((e)=>{
reject(e)
})
})
}
[...]
Seeder.prototype.disconnect = function () {
mongoose.disconnect()
this.connected = false
this.listeners.forEach((l) => {
if (l.cause == 'onDisconnect') l.effect()
})
}
There is no issue with the main logic of the code; I can get it to seed the data correctly. However, when using Promises, the database is disconnected before anything else is ever done, despite the disconnect function being called in .finally().
I am running these functions like this:
Seeder.addListener('onConnect', function onConnect () { console.log('Connected') })
Seeder.addListener('onDisconnect', function onDisconnect () {console.log('Disconnected')})
Seeder.connectPromise(mongoURI, options).then(
Seeder.seedDataPromise(data)
).catch((error) => { // <-- I am catching the error, why is it saying it's unhandled?
console.error(error)
}).finally(Seeder.disconnect())
The output is this:
Disconnected
(node:14688) UnhandledPromiseRejectionWarning: Error: Not connected to MongoDB
at Promise (C:\Users\johnn\Documents\Code\node projects\mongoose-seeder\seed.js:83:37)
This frankly doesn't make sense to me, as on the line pointed out in the stack trace I call reject(), and this rejection is handled because I have a catch statement as shown above. Further, I can't understand why the database never even has a chance to connect, given that the finally() block should be called last.
The solution was to return the Promise.all call, in addition to other suggestions.
You are passing the wrong argument to then and finally. First here:
Seeder.connectPromise(mongoURI, options).then(
Seeder.seedDataPromise(data)
)
Instead of passing a callback function to then, you actually execute the function on the spot, without waiting for the promise to resolve. What then receives is the returned promise, which is not a callback, so it is ignored.
You should do:
Seeder.connectPromise(mongoURI, options).then(
() => Seeder.seedDataPromise(data)
)
A similar error is made here:
finally(Seeder.disconnect())
It should be:
finally(() => Seeder.disconnect())
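The difference in ordering can be seen in isolation with a small sketch; connect and seed here are simplified stand-ins for the Seeder methods, not the real implementations:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function run(passCallback) {
  const order = [];
  // Stand-ins: connect takes 50 ms, seed just records that it ran.
  const connect = () => sleep(50).then(() => order.push("connected"));
  const seed = () => { order.push("seeding"); };

  if (passCallback) {
    // seed is called only after connect() has resolved
    await connect().then(() => seed());
  } else {
    // seed() is called right away; its return value (undefined) goes to then
    await connect().then(seed());
  }
  return order;
}
```

Here run(true) yields the order connected, seeding, while run(false) yields seeding, connected, which mirrors the disconnect-before-connect behaviour in the question.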
Promise Constructor Anti-Pattern
Not related to your question, but you are implementing an antipattern by creating new promises with new Promise when you already get promises from the mongoose API.
For instance, you do this here:
Seeder.prototype.connectPromise = function (url, opts) {
return new Promise((resolve,reject) => {
try {
mongoose.connect(url, opts).then(() => {
const connected = mongoose.connection.readyState == 1
this.connected = connected
resolve(connected)
})
} catch (error) {
reject(error)
}
})
}
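But the wrapping promise created with new Promise adds nothing useful. Worse, the try/catch does not catch an asynchronous rejection from mongoose.connect, so in that case the outer promise would never settle. Just write: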
Seeder.prototype.connectPromise = function (url, opts) {
return mongoose.connect(url, opts).then(() => {
const connected = mongoose.connection.readyState == 1
this.connected = connected
return connected;
});
}
The same happens in your next prototype function. I'll leave it to you to apply a similar simplification there, so avoiding the promise constructor antipattern.
In the later edit to your question, you included this change, but you did not return a promise in that function. Add return here:
return Promise.all(promises).then(() => {
//^^^^^^
return true
}).catch(() => {
console.log(`Connected:\t${this.connected}`)
})