I wonder whether I should create one EventEmitter and pass it an object containing a key, so that the callback runs different code depending on that key (I would have 15 different situations), or whether I should create 15 EventEmitters, one for each of 15 different message names. I also wonder whether creating multiple EventEmitters will slow down my Node.js instance or consume its RAM or CPU resources.
something like this:
const EventEmitter = require('events');
class MyEmitter extends EventEmitter {}
const myEmitter = new MyEmitter();
myEmitter.on('event1', (data) => console.log(data)); // receiver code
myEmitter.on('event2', (data) => console.log(data)); //receiver code
myEmitter.emit('event1','event1'); //sender code
myEmitter.emit('event2','event2'); //sender code
//event3
//event4
//...
or something like that:
const EventEmitter = require('events');
class MyEmitter extends EventEmitter {}
const myEmitter = new MyEmitter();
let obj1 = { msgType: 'event1', data: 'one example' }; // sender code
let obj2 = { msgType: 'event2', data: 'another example' }; // sender code
myEmitter.on('event', (data) => { // receiver code
    if (data.msgType === 'event1') {
        console.log('event1');
    }
    if (data.msgType === 'event2') {
        console.log('event2');
    }
});
myEmitter.emit('event', obj1); // sender code
myEmitter.emit('event', obj2); // sender code
An event emitter is just a Javascript object that keeps a data structure containing listeners for various message names. An inactive listener in an emitter consumes no CPU and takes only as much RAM as storing a callback reference and the message name it's listening for (so, almost no RAM).
You can have as many emitters as make sense for your code. They are cheap and efficient. You can literally just think of them as a static array of listeners. When you do .addListener(), it adds an item to the array. When you do .removeListener(), it removes an item from the array. When you do .emit(), it finds the listeners that match that particular message and calls them (just function calls).
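To make the "array of listeners" picture concrete, here is a minimal sketch of the idea. TinyEmitter is a hypothetical class written only for illustration; it is not how Node's EventEmitter is actually implemented, and it leaves out things like once() and error handling.

```javascript
// Hypothetical TinyEmitter: an illustration of the idea, not Node's implementation.
class TinyEmitter {
  constructor() {
    this.listeners = new Map(); // message name -> array of callbacks
  }

  addListener(name, callback) {
    if (!this.listeners.has(name)) {
      this.listeners.set(name, []);
    }
    this.listeners.get(name).push(callback);
    return this;
  }

  removeListener(name, callback) {
    const callbacks = this.listeners.get(name) || [];
    const index = callbacks.indexOf(callback);
    if (index !== -1) {
      callbacks.splice(index, 1);
    }
    return this;
  }

  emit(name, ...args) {
    const callbacks = this.listeners.get(name) || [];
    // Iterate over a copy so a listener that removes itself does not disturb the loop.
    for (const callback of [...callbacks]) {
      callback(...args);
    }
    return callbacks.length > 0;
  }
}

const received = [];
const tiny = new TinyEmitter();
const onEvent1 = (data) => received.push(`event1:${data}`);

tiny.addListener('event1', onEvent1);
tiny.addListener('event2', (data) => received.push(`event2:${data}`));

tiny.emit('event1', 'a');
tiny.emit('event2', 'b');
tiny.removeListener('event1', onEvent1);
tiny.emit('event1', 'c'); // no listener left for event1, so nothing is called

console.log(received);
```

Until something calls emit(), the registered listeners are nothing more than entries in that map.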
or should I create 15 eventEmitters depending on 15 different names of message?
eventEmitters were built to handle many different message names. So, just because you have 15 different messages, that is no reason to make 15 unique eventEmitters. You can easily just use one eventEmitter and call .emit() on it with all your different messages.
The reason to make multiple eventEmitters has to do with the design and architecture of your code. If you have a component that you want to be modular and reusable and it uses an eventEmitter, then it may want to create its own emitter and make that available to its clients just so it doesn't have to have a dependency with other code that also wants to use an eventEmitter, but does not otherwise have anything to do with this particular module. So, it's an architectural and code organization question, not one of runtime efficiency. Create only as many eventEmitters as your architecture naturally desires and no more.
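As a rough sketch of that architectural case, a reusable component can create and expose its own emitter. The createJobQueue function and its 'added' and 'done' event names are hypothetical and only serve the illustration.

```javascript
const EventEmitter = require('events');

// Hypothetical reusable component: it creates and exposes its own emitter,
// so it does not depend on any emitter used elsewhere in the application.
function createJobQueue() {
  const events = new EventEmitter();
  const pending = [];

  return {
    events, // clients subscribe through this emitter
    add(job) {
      pending.push(job);
      events.emit('added', job);
    },
    runAll() {
      while (pending.length > 0) {
        const job = pending.shift();
        events.emit('done', job, job());
      }
    },
  };
}

const log = [];
const queue = createJobQueue();
queue.events.on('added', () => log.push('added'));
queue.events.on('done', (job, result) => log.push(`done:${result}`));

queue.add(() => 1 + 1);
queue.runAll();

console.log(log);
```

The component still uses several message names on its single emitter; the second emitter exists only because the module is meant to stand alone.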
I wonder if creating multiple eventEmitter will slow down, take RAM or CPU resources of my NodeJS instance
No, it will not. Each eventEmitter takes a very small amount of memory to just initialize its basic instance data, but this is so small that you could probably not even measure the difference between 1 of these and 15 of them.
I wonder if I should create one EventEmitter and pass an object to it in which there will be a key to run some code inside the callback function depending on the key inside the object.
You are free to design your code that way if you want, but you're making extra work for yourself and writing code that probably isn't as clean as it could be. A big advantage of eventEmitters is that they maintain a specific set of listeners for each separate message. If you use one generic message and then embed the actual message inside an object that you pass to the .emit() call, then you're just throwing away features the eventEmitter has and putting a burden on the listening code to figure out which sub-message is actually contained in this event.
In general, this would be an inefficient way to use an EventEmitter. Instead, put the actual event name in the .emit() and let code register a listener for the actual event names it wants to listen to.
So, of the two schemes you show, I much, much prefer the first one. That's how EventEmitters were designed to be used and how they were designed to help you. There could be situations where you have to have a generic message with your own sub-routing, but unless you are sure you require that, you should not be adding the extra level of complexity and you're throwing away functionality that the EventEmitter will do for you for free.
Also, you show this code:
class MyEmitter extends EventEmitter {}
const myEmitter = new MyEmitter();
Do you realize that there's no need to subclass EventEmitter just to use one? You would only subclass it if you were going to add or override methods on your subclass. This code shows no new methods or overrides, so there's no point in doing it that way.
If you just want to use an EventEmitter, you just create one:
const myEmitter = new EventEmitter();
I have a system which runs multiple service (long-lived) and worker (short-lived) threads. They all share a state which contains objects. Any thread can request an object at any time, through a singleton-of-sorts class called ObjectManager. If the object is not available, it needs to be loaded.
Here's some pseudo-code of how object loading looks now:
class ObjectManager {
    getLoadinData(path) {
        if (hasLoadingDataFor(path))
            return whatWeHave()
        else {
            loadingData = createNewLoadingData();
            loadingData.path = path;
            pushLoadingTaskToLoadingThread(loadingData);
            return loadingData;
        }
    }

    // loads object and blocks until it's loaded
    loadObjectSync(path) {
        loadingData = getLoadinData(path);
        waitFor(loadingData.conditionVar);
        return loadingData.loadedObject;
    }

    // initiates a load and calls a callback when done
    loadObjectAsync(path, callback) {
        loadingData = getLoadinData(path);
        loadingData.callbacks.add(callback);
    }

    // dedicated loading thread
    loadingThread() {
        while (running) {
            loadingData = waitForLoadingData();
            object = readObjectFromDisk(loadingData.path);
            object.onLoaded(); // !!!!
            loadingData.loadedObject = object;
            // unblock cv waiters
            loadingData.conditionVar.notifyAll();
            // call callbacks
            loadingData.callbacks.callAll(object);
        }
    }
}
The problem is the line object.onLoaded(). I have no control over this function. Some objects might decide that they need other objects to be valid, so in their onLoaded method they might call loadObjectSync. Uh-oh! This (naturally) deadlocks: it blocks the loading loop until the loading loop makes more iterations.
What I could do to solve this is leave the onLoaded call to the initiating threads. This will change loadObjectSync to something like:
loadObjectSync(path) {
    loadingData = getLoadinData(path);
    waitFor(loadingData.conditionVar);
    if (loadingData.wasCreatedInThisThread()) {
        loadingData.loadedObject.onLoaded();
        loadingData.onLoadedConditionVar.notifyAll();
        loadingData.callbacks.callAll(loadingData.loadedObject);
    }
    else {
        // wait more
        waitFor(loadingData.onLoadedConditionVar);
    }
    return loadingData.loadedObject;
}
... but then the problem is that if there are no calls to loadObjectSync and only calls to loadObjectAsync, or if a loadObjectAsync call was simply the first to create the loading data, there will be no one to finalize the object. So to make this work, I have to introduce another thread that finalizes objects whose loadingData was created by loadObjectAsync.
It seems that it would work. But I have a simpler idea! What if I change getLoadingData instead? What if it does this:
getLoadinData(path) {
    if (hasLoadingDataFor(path))
        return whatWeHave()
    else {
        loadingData = createNewLoadingData();
        loadingData.path = path;
        ///
        thread = spawnLoadingThread(loadingData);
        thread.detach();
        ///
        return loadingData;
    }
}
Spawn a new thread for every object load. Thus there is no deadlock. Every loading thread can safely block until it's done. The rest of the code remains exactly as it is.
This means potentially tens (or, in certain edge cases, even thousands) of active threads waiting on condition variables. I know that spawning threads has its overhead, but I think it would be negligible compared to the cost of the I/O in readObjectFromDisk.
So my question is: Is this terrible? Can this somehow backfire?
The target platform is conventional desktop machines. But this software is supposed to run for a long time without stopping: weeks, maybe months.
Alternatively... even though I have an idea how to solve this if the thread-per-load turns out to be terrible, can this be solved in another way?
Very interesting! This is a problem I have bumped into a couple of times, trying to add a synchronous interface to a fundamentally asynchronous operation (i.e. file load, or in my case, network write) that is performed by a service thread.
My own preference would be to not provide the synchronous interface. Why? Because it keeps the code simpler in design & implementation and easier to reason about -- always important for multi-threading.
The benefit of sticking to a single thread and an async-only interface is that you only have one service thread, so resource growth is not a concern. In addition, the user callbacks are always invoked on this same thread, which simplifies thread-safety concerns for users of ObjectManager (if you have multiple callback threads, every user callback must be thread safe, so it's an important choice to make). However, sticking to only an async interface does mean the user of ObjectManager has more work to do.
But if you do want to keep the synchronous interface, then another approach that I have taken could work for you. You stick to a single service thread but inside the implementation of loadObjectSync you check the thread-ID to determine if the invoker is the service thread or any-other thread. If it is any-other thread you queue the request and safely block. But if it is the service thread, you can immediately load the object, say by calling a new function loadObjectImpl. You will need to grab the thread-ID of the service thread during initialization and store it within the ObjectManager instance, and use that for thread identification. And you will need a new function which is basically just the internal scope of the loadingThread function -- i.e. a new function called something like loadObjectImpl.
Let's take the simple code snippet:
var express = require('express');
var app = express();
var counter = 0;
app.get('/', function (req, res) {
// LOCK
counter++;
// UNLOCK
res.send('hello world')
})
Let's say that app.get(...) is called a huge number of times, and as you can understand I don't want the line counter++ to be executed concurrently by the two different threads.
Therefore, I want to lock this line so that only one thread can access it at a time. My question is how to do it in node.js?
I know there is a lock package: https://www.npmjs.com/package/locks, but I'm wondering whether there is a "native" way of doing it without an external library.
I don't want the line counter++ to be executed concurrently by the two different threads
That cannot happen in node.js with just regular Javascript coding.
node.js is single threaded and event-driven, so there's only ever one piece of Javascript code running at a time that can access that variable. You do not have to worry about the typical pre-emptive concurrency issues of multi-threaded systems.
That said, you can still have concurrency issues in node.js if you are using asynchronous code because the node.js asynchronous model returns control back to the system to process the next event and the asynchronous callback gets called on some future event. But, the concurrency issues are non-pre-emptive so you fully control when they can occur.
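As an illustration of that non-pre-emptive kind of issue, the sketch below assumes a hypothetical sleep() helper standing in for any asynchronous operation (a database call, an HTTP request). The update is lost only when an await sits between reading the counter and writing it back.

```javascript
// Hypothetical helper standing in for any asynchronous operation.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

let counter = 0;

// Unsafe: reads the counter, yields to the event loop, then writes a stale value.
async function unsafeIncrement() {
  const current = counter;
  await sleep(10); // other handlers can run here and change counter
  counter = current + 1;
}

// Safe: the read-modify-write happens with no await in between.
async function safeIncrement() {
  await sleep(10);
  counter++;
}

async function demo() {
  counter = 0;
  await Promise.all([unsafeIncrement(), unsafeIncrement()]);
  const unsafeResult = counter; // both calls read 0, so one update is lost

  counter = 0;
  await Promise.all([safeIncrement(), safeIncrement()]);
  const safeResult = counter;

  return { unsafeResult, safeResult };
}

const result = demo();
result.then((r) => console.log(r));
```

The counter++ in the question is the safe kind: nothing can run between the read and the write.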
If you show us your actual code in your app.get() route handler, then we can advise more specifically about whether you do or don't have a concurrency issue there or not. And, if you do, we can advise on how to best deal with that.
Threads in the thread pool are all native code that runs behind the scenes. They only trigger actual Javascript to run by queuing events through the event queue. So, because all Javascript that runs is serialized through the event queue, you only get one piece of Javascript ever running at a time. The basic scheme of the event queue is that the interpreter runs a piece of Javascript until it returns control back to the system. At that point, the interpreter looks in the event queue and if there's an event waiting, it pulls that event out and calls the callback associated with that event. Meanwhile, if there is native code running in the background, when it completes, it adds an event to the event queue. That event is not processed until the current Javascript returns control back to the system and it can then grab the next event out of the event queue. So, it's this event-queue that serializes running only one piece of Javascript at a time.
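A small sketch of that serialization: the timer below is due almost immediately, but its callback cannot run until the currently executing Javascript returns control to the system. The 50 ms busy-wait is only there to make the effect visible.

```javascript
const order = [];

// The timer expires while the loop below is still running, but its callback
// has to wait in the event queue until this script returns control.
setTimeout(() => order.push('timer callback'), 0);

order.push('sync start');
const until = Date.now() + 50;
while (Date.now() < until) {
  // Busy-wait: blocks the only Javascript thread (for demonstration only).
}
order.push('sync end');

// At this point the timer callback has still not run.
console.log(order.slice());

// Once the script has returned to the event loop, the callback gets its turn.
setTimeout(() => console.log(order), 100);
```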
Edit: Nodejs does now have WorkerThreads which enable separate threads of Javascript, but each thread has its own heap and its own variables so a variable from one thread cannot be directly accessed from another thread. You can configure shared memory that both WorkerThreads can access, but that isn't straight variables, but blocks of memory and if you want to use shared memory, then you do indeed need to code your own synchronization methods to make sure you are atomically accessing the variable. The code you show in your question is not using any of this so the access to the counter variable is already atomic and cannot be simultaneously accessed by any other Javascript, even if you are using WorkerThreads.
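For the shared-memory case only, here is a hedged sketch using worker_threads with a SharedArrayBuffer and Atomics. The inline worker source and the count of 1000 increments are arbitrary choices for the demonstration; code like the route handler in the question needs none of this.

```javascript
const { Worker } = require('worker_threads');

// One 32-bit integer of memory that every worker can see.
const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

// Worker code kept inline (eval: true) so the sketch is a single file.
const workerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData);
  for (let i = 0; i < 1000; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write on shared memory
  }
`;

function runWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: shared });
    worker.on('error', reject);
    worker.on('exit', resolve);
  });
}

const total = Promise.all([runWorker(), runWorker()])
  .then(() => Atomics.load(counter, 0));

total.then((value) => console.log(value));
```

Replacing Atomics.add with a plain counter[0]++ would reintroduce exactly the kind of data race the question worries about, because two threads really are touching the same memory.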
If you block the thread, none of the requests will execute; they will all wait in the queue.
It's not good practice to block the thread in Node.js.
var express = require('express');
var app = express();
var counter = 0;

const getPromise = () => {
    return new Promise((resolve) => {
        setTimeout(() => {
            resolve('Done');
        }, 100);
    });
};

app.get('/', async (req, res) => {
    const localCounter = counter++;
    // Use local counter for rest of operation so value won't vary
    // LOCK: Use promise/callback
    await getPromise(); // Not locked but waiting for getPromise to finish
    console.log(localCounter); // Same value as before the await
    res.send('hello world');
});
Node.js is single-threaded, which means that any single process running your app will not have data races like you anticipate. In fact, a quick inspection of the locks library shows that they use a boolean flag and a system of Array objects to determine whether something is locked or not.
You should only really worry about this if you plan on sharing data with multiple processes. In that case, you could use Alan's lockfile approach from this stackoverflow thread here.
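If a multi-step asynchronous section within a single process does need to run one at a time, a lock can be built without an external library. The sketch below uses a promise chain rather than the flag-and-array approach mentioned above; createMutex and runExclusive are hypothetical names, and this is not the API of the locks package.

```javascript
// Hypothetical helper: a promise-chain mutex (not the API of the locks package).
function createMutex() {
  let tail = Promise.resolve();
  return function runExclusive(task) {
    // Start the task only after every previously queued task has settled.
    const result = tail.then(() => task());
    // Keep the chain usable even if this task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const runExclusive = createMutex();
let counter = 0;

// A read-await-write sequence that would lose updates if calls interleaved.
async function incrementSlowly() {
  const current = counter;
  await sleep(10);
  counter = current + 1;
}

const finished = Promise.all([
  runExclusive(incrementSlowly),
  runExclusive(incrementSlowly),
  runExclusive(incrementSlowly),
]).then(() => counter);

finished.then((value) => console.log(value));
```

This serializes tasks within one Node.js process only; it does nothing for data shared between processes.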
I was pretty shocked to find out that "require" in node creates a singleton by default. One might assume that many people have modules which they require which have state, but are created as a singleton, so break the app as soon as there are multiple concurrent users.
We have the opposite problem: require is creating a non-singleton, and we don't know how to fix this.
Because my brain is wired as a java developer, all our node files/modules are defined thusly:
file playerService.js
const Player = require("./player")
class PlayerService {
constructor(timeout) {
// some stuff
}
updatePlayer(player) {
// logic to lookup player in local array and change it for dev version.
// test version would lookup player in DB and update it.
}
}
module.exports = PlayerService
When we want to use it, we do this:
someHandler.js
const PlayerService = require("./playerService")
const SomeService = require("./someService")
const playerService = new PlayerService(3000)
// some code which gets a player
playerService.updatePlayer(somePlayer)
Although require() creates singletons by default, in the above case I am guessing it is not creating a singleton, as each websocket message (in our case) will instantiate new objects in every module which is called in the stack. That is a lot of overhead: to service a single message, the service might get instantiated 5 times, as there are 5 different sub-services/helper classes which call each other and all do a require(). Multiply this by the number of concurrent users and you get a lot of unnecessary object creation.
1) How do we modify the above class to work as a singleton, as services don't have state?
2) Is there any concept of a global import or creating a global object, such that we can import (aka require) and/or instantiate an object once for a particular websocket connection and/or for all connections? We have no index.js or similar. It seems crazy to have to re-require the dependent modules/files for every js file in a stack. Note, we looked at DI options, but found them too arcane to comprehend how to use them as we are not js gurus, despite years of trying.
You can simply create an instance inside the file and export it.
let playerService = new PlayerService();
module.exports = playerService;
In this case, you may want to add setters for the member variables you would take as constructor parameters to ensure encapsulation.
Also note that creating object instances with new in javascript is cheaper than in traditional OOP languages because of its prototype model.
So don't hesitate when you really need new instances (as seen in your code, do you really want to share the timeout constructor parameter?), since javascript objects are pretty memory efficient with prototype methods, and modern engines have excellent garbage collectors to prevent memory leaks.
I've got a separate thread which needs to request some data that may change in the meantime within the JavaFX thread. I'd like to execute a blocking invocation in this separate thread that makes sure that the request is enqueued on the JavaFX thread.
The Swing-GUI testing framework, AssertJ, provides an easy to use API for this purpose:
List list = GuiActionRunner.execute(new GuiQuery<...>...);
The invocation blocks the current thread, executes the passed code within event dispatching thread and returns the required data.
How can this be implemented in production code for JavaFX applications? What would be the recommended approach for this requirement?
Here's an alternative solution, using a FutureTask. This avoids the explicit latch and managing the synchronized data in an AtomicReference. The code here is probably simple enough that it would make including this functionality in Platform redundant.
FutureTask<List<?>> task = new FutureTask<>( () -> {
List<?> data = ... ; // access data
return data ;
});
Platform.runLater(task);
List<?> data = task.get();
This technique is very useful if you want to pause a background thread to await user input.
Ok I think I got it now. You need to implement something like this yourself:
AtomicReference<List<?>> r = new AtomicReference<>();
CountDownLatch l = new CountDownLatch(1);
Platform.runLater( () -> {
// access data
r.set(...);
l.countDown();
});
l.await();
System.err.println(r.get());
I'm using Redis to generate IDs for my in memory stored models. The Redis client requires a callback to the INCR command, which means the code looks like
client.incr('foo', function(err, id) {
... continue on here
});
The problem is that I have already written the other part of the app, which expects the incr call to be synchronous and just return the ID, so that I can use it like
var id = client.incr('foo');
The reason why I got to this problem is that up until now, I was generating the IDs just in memory with a simple closure counter function, like
var counter = (function() {
var count = 0;
return function() {
return ++count;
}
})();
to simplify the testing and just general setup.
Does this mean that my app is flawed by design and I need to rewrite it to expect callback on generating IDs? Or is there any simple way to just synchronize the call?
Node.js in its essence is an async I/O library (with plugins). So, by definition, there's no synchronous I/O there and you should rewrite your app.
It is a bit of a pain, but what you have to do is wrap the logic that you had after the counter was generated into a function, and call that from the Redis callback. If you had something like this:
var id = get_synchronous_id();
processIdSomehow(id);
you'll need to do something like this.
var runIdLogic = function(id){
processIdSomehow(id);
}
client.incr('foo', function(err, id) {
runIdLogic(id);
});
You'll need the appropriate error checking, but something like that should work for you.
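On Node versions that support async/await, the same restructuring can read almost like the synchronous version. This is only a sketch: fakeClient is a hypothetical in-memory stand-in for the Redis client so that it runs without a server, and it assumes a callback-style incr(key, callback) like the one in the question.

```javascript
const { promisify } = require('util');

// Hypothetical stand-in for the Redis client so the sketch runs without a server.
const fakeClient = {
  counters: {},
  incr(key, callback) {
    this.counters[key] = (this.counters[key] || 0) + 1;
    const value = this.counters[key];
    setImmediate(() => callback(null, value)); // reply asynchronously, like real I/O
  },
};

// Wrap the callback API in a promise-returning function.
const incrAsync = promisify(fakeClient.incr).bind(fakeClient);

async function createModel(name) {
  // Reads like "var id = client.incr('foo')", but does not block the thread.
  const id = await incrAsync('foo');
  return { id, name };
}

const models = (async () => [
  await createModel('a'),
  await createModel('b'),
])();

models.then((list) => console.log(list));
```

Every caller of createModel still has to be asynchronous, so the rewrite is not avoided, but errors passed to the callback now surface as rejected promises that a normal try/catch around the await can handle.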
There are a couple of sequential programming layers for Node (such as TameJS) that might help with what you want, but those generally do recompilation or things like that: you'll have to decide how comfortable you are with that if you want to use them.
@Sergio said this briefly in his answer, but I wanted to write a little more of an expanded answer. node.js is an asynchronous design. It runs in a single thread, which means that in order to remain fast and handle many concurrent operations, all blocking calls must have a callback for their return value to run them asynchronously.
That does not mean that synchronous calls are not possible. They are, and it's a concern for how much you trust 3rd party plugins. If someone decides to write a call in their plugin that does block, you are at the mercy of that call, and it might even be something internal that is not exposed in their API. Thus, it can block your entire app. Consider what might happen if Redis took a significant amount of time to return, and then multiply that by the number of clients that could potentially be accessing that same routine. The entire logic has been serialized and they all wait.
In answer to your last question, you should not work towards accommodating a blocking approach. It may seem like a simple solution now, but it runs counter to the benefits of node.js in the first place. If you are only comfortable in a synchronous design workflow, you may want to consider another framework that is designed that way (with threads). If you want to stick with node.js, rewrite your existing logic to conform to a callback style. From the code examples I have seen, it tends to look like a nested set of functions, as callback uses callback, etc., until it can return from that nested stack.
The application state in node.js is normally passed around as an object. What I would do is closer to:
var state = {}
client.incr('foo', function(err, id) {
state.id = id;
doSomethingWithId(state.id);
});
function doSomethingWithId(id) {
// reuse state if necessary
}
It's just a different way of doing things.