node, require, singleton or not singleton? - node.js

I was pretty shocked to find out that require in Node creates a singleton by default. One would assume that many people require modules which hold state; since those are created as singletons, they would break the app as soon as there are multiple concurrent users.
We have the opposite problem: require is creating a non-singleton, and we don't know how to fix this.
Because my brain is wired like a Java developer's, all our Node files/modules are defined like this:
file playerService.js

const Player = require("./player")

class PlayerService {
  constructor(timeout) {
    // some stuff
  }

  updatePlayer(player) {
    // logic to look up the player in a local array and change it (dev version);
    // the test version would look up the player in the DB and update it.
  }
}

module.exports = PlayerService
When we want to use it, we do this:
someHandler.js
const PlayerService = require("./playerService")
const SomeService = require("./someService")

const playerService = new PlayerService(3000)
// some code which gets a player
playerService.updatePlayer(somePlayer)
Although require() creates singletons by default, in the above case I am guessing it is not creating a singleton, as each websocket message (in our case) will instantiate new objects in every module that is called in the stack. That is a lot of overhead: to service a single message, the service might get instantiated 5 times, since there are 5 different sub-services/helper classes which call each other and each do a require(). Multiply this by the number of concurrent users and you get a lot of unnecessary object creation.
1) How do we modify the above class to work as a singleton, given that services don't have state?
2) Is there any concept of a global import, or of creating a global object, such that we can import (aka require) and/or instantiate an object once for a particular websocket connection and/or for all connections? We have no index.js or similar. It seems crazy to have to re-require the dependent modules/files in every js file in a stack. Note: we looked at DI options, but found them too arcane to comprehend how to use, as we are not js gurus, despite years of trying.

You can simply create an instance inside the file and export it.
const playerService = new PlayerService();
module.exports = playerService;
In this case, you may want to add setters for the member variables you would take as constructor parameters to ensure encapsulation.
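For illustration, here is a minimal sketch of that pattern (the default timeout value and the setter name are assumptions, not from the original code). Node caches a module after the first require(), so every file that requires this one receives the same instance:

// playerService.js
class PlayerService {
  constructor() {
    this.timeout = 3000; // illustrative default instead of a constructor parameter
  }

  // a setter stands in for the constructor parameter (name is hypothetical)
  setTimeout(ms) {
    this.timeout = ms;
  }

  updatePlayer(player) {
    // look the player up and update it
  }
}

// export one shared instance rather than the class
module.exports = new PlayerService();

// someHandler.js (and every other file) then gets the SAME instance:
// const playerService = require("./playerService");
// playerService.updatePlayer(somePlayer);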
Also note that creating object instances with new in JavaScript is cheaper than in traditional OOP languages because of its prototype model.
So don't hesitate when you really need new instances (as seen in your code, do you really want to share the timeout constructor parameter?), since JavaScript objects are pretty memory efficient thanks to prototype methods, and modern engines have excellent garbage collectors to prevent memory leaks.
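To see why new is cheap here: methods live on the class prototype, so each instance stores only its own fields, and every instance shares one copy of each method. A quick sketch:

class PlayerService {
  constructor(timeout) {
    this.timeout = timeout; // per-instance state
  }
  updatePlayer(player) { /* ... */ }
}

const a = new PlayerService(1000);
const b = new PlayerService(2000);

// both instances share a single copy of updatePlayer via the prototype
console.log(a.updatePlayer === b.updatePlayer); // true
console.log(Object.getPrototypeOf(a) === PlayerService.prototype); // true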

Related

Are EventEmitters resource consuming in NodeJS?

I wonder whether I should create one EventEmitter and pass it an object containing a key, so the callback runs different code depending on the key inside the object (I would have 15 different situations), or whether I should create 15 EventEmitters, one per message name. I also wonder whether creating multiple EventEmitters will slow down my Node.js instance or take up RAM or CPU resources.
something like this:
const EventEmitter = require('events');

class MyEmitter extends EventEmitter {}
const myEmitter = new MyEmitter();

myEmitter.on('event1', (data) => console.log(data)); // receiver code
myEmitter.on('event2', (data) => console.log(data)); // receiver code

myEmitter.emit('event1', 'event1'); // sender code
myEmitter.emit('event2', 'event2'); // sender code
// event3
// event4
// ...
or something like that:
const EventEmitter = require('events');

class MyEmitter extends EventEmitter {}
const myEmitter = new MyEmitter();

let obj1 = { msgType: 'event1', data: 'one example' }; // sender code
let obj2 = { msgType: 'event2', data: 'another example' }; // sender code

myEmitter.on('event', (data) => { // receiver code
  if (data.msgType == "event1") {
    console.log("event1");
  }
  if (data.msgType == "event2") {
    console.log("event2");
  }
});

myEmitter.emit('event', obj1); // sender code
myEmitter.emit('event', obj2); // sender code
An event emitter is just a JavaScript object that keeps a data structure containing listeners for the various messages. An inactive listener in an emitter consumes no CPU and takes only as much RAM as storing a callback reference and the message name it's listening for (so, almost no RAM).
You can have as many emitters as make sense for your code. They are cheap and efficient. You can literally just think of them as a static array of listeners. When you do .addListener(), it adds an item to the array. When you do .removeListener(), it removes an item from the array. When you do .emit(), it finds the listeners that match that particular message and calls them (just function calls).
or should I create 15 eventEmitters depending on 15 different names of message?
eventEmitters were built to handle many different message names. So, just because you have 15 different messages, that is no reason to make 15 unique eventEmitters. You can easily just use one eventEmitter and call .emit() on it with all your different messages.
The reason to make multiple eventEmitters has to do with the design and architecture of your code. If you have a component that you want to be modular and reusable and it uses an eventEmitter, then it may want to create its own emitter and make that available to its clients, just so it doesn't have to share a dependency with other code that also wants to use an eventEmitter but otherwise has nothing to do with this particular module; see the sketch below. So, it's an architectural and code-organization question, not one of runtime efficiency. Create only as many eventEmitters as your architecture naturally desires and no more.
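As a rough sketch of that kind of modular design (the module, class, and event names here are hypothetical):

// downloader.js - a reusable module that owns its own emitter
const EventEmitter = require('events');

class Downloader extends EventEmitter {
  download(url) {
    // ... do the actual work, then notify whoever is listening
    this.emit('done', { url });
  }
}

module.exports = Downloader;

// client code - no emitter shared with unrelated parts of the app
// const Downloader = require('./downloader');
// const d = new Downloader();
// d.on('done', ({ url }) => console.log('finished', url));
// d.download('http://example.com/file');

Here subclassing is justified, because Downloader adds its own method on top of EventEmitter.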
I wonder if creating multiple eventEmitter will slow down, take RAM or CPU resources of my NodeJS instance
No, it will not. Each eventEmitter takes a very small amount of memory just to initialize its basic instance data, but this is so small that you could probably not even measure the difference between 1 of these and 15 of them.
I wonder if I should create one EventEmitter and pass an object to it in which there will be a key to run some code inside the callback function depending on the key inside the object.
You are free to design your code that way if you want, but you're making extra work for yourself and writing code that probably isn't as clean as it could be. A big advantage of eventEmitters is that they maintain a specific set of listeners for each separate message. If you use one generic message and then embed the actual message inside an object that you pass to the .emit() call, then you're just throwing away features the eventEmitter has and putting a burden on the calling code to figure out which sub-message is actually contained in the event.
In general, this would be an inefficient way to use an EventEmitter. Instead, put the actual event name in the .emit() and let code register a listener for the actual event names it wants to listen to.
So, of the two schemes you show, I much, much prefer the first one. That's how EventEmitters were designed to be used and how they were designed to help you. There could be situations where you have to have a generic message with your own sub-routing, but unless you are sure you require that, you should not be adding the extra level of complexity and you're throwing away functionality that the EventEmitter will do for you for free.
Also, you show this code:
class MyEmitter extends EventEmitter {}
const myEmitter = new MyEmitter();
Do you realize that there's no need to subclass an EventEmitter just to use one? You would only subclass it if you're going to add or override methods on your subclass. But this code shows no actual new methods or overrides, so there's no point in doing it that way.
If you just want to use an EventEmitter, you just create one:
const myEmitter = new EventEmitter();

Is it safe to define my io as global.io

I am new to nodejs and socket. What I would like to know is how I should be able to access my io variable in different controllers or files. Is it safe to declare my io variable as:
global.io = require('socket.io').listen(server);
so that my io is now accessible in any of my controllers?
I got this idea from this link:
https://blog.sylo.space/use-global-variable-for-socket-io/
It can cause crashes if it overlaps with another variable, which could be hard to track down if your project is quite big and/or includes a lot of libraries. It also exists for the lifetime of the application, thus taking up resources. And it is overall considered bad practice to use global variables.
You can check out a great article about global variables in node.js here - http://stackabuse.com/using-global-variables-in-node-js/
EDIT: Since you mentioned safety: it is safe in theory, just not recommended.
In most cases, global variables are an indication of poor software design. A better approach would be containing all your IO code in a distinct class/module. Here's a short code piece to illustrate what I mean:
const socketIO = require('socket.io');

class myIO {
  constructor(server) {
    this.io = socketIO.listen(server);
  }

  doStuff() {
    // do stuff with your io socket
  }
}

module.exports = myIO;
Now you need to provide an instance of myIO to your controllers, but I'm not sure what constitutes a controller in your case. And do you need to reconnect the socket each time, or reuse the same socket?
Either way, you might want to look into the Dependency Injection pattern OR use a Singleton:
module.exports = new myIO();
in this case require("myio") would always return same socket. Or if your framework has a registry, you could use that.

Spring cache with Redis using Jackson serializer: How to deal with multiple type of domain object

There are many types of domain objects in my web application, such as MemberModel, PostModel, CreditsModel and so on. I find that the type of the object is needed when configuring JacksonJsonRedisSerializer, so I specified Object.class. But I got errors when deserializing objects.
To work around this, I've got 2 options:
1) Use JdkSerializationRedisSerializer instead. But the result of the serialization is too long, so it will consume lots of memory in Redis.
2) Configure a serializer for every domain object, which means that if I have 50 domain objects then I have to configure 50 serializers. But this is obviously pretty tedious.
Is there a graceful way to solve this problem? Thanks!
There's an open PR #145 available. Until that one is merged, one can pretty much just implement a RedisSerializer the way it is done in GenericJackson2JsonRedisSerializer, configuring the used ObjectMapper to include type information within the JSON.
ObjectMapper mapper = new ObjectMapper();
// embed type information as a property in the JSON so the correct
// domain class can be instantiated on deserialization
mapper.enableDefaultTyping(DefaultTyping.NON_FINAL, As.PROPERTY);
byte[] bytes = mapper.writeValueAsBytes(domainObject);

// using Object.class lets the mapper fall back to the default typing;
// one could also use a concrete domain type, if known, to avoid the cast.
DomainObject target = (DomainObject) mapper.readValue(bytes, Object.class);

are class level properties or variables thread safe

I always had this specific scenario worry me for eons. Let's say my class looks like this:
public class Person {
    public Address Address { get; set; }

    public string someMethod()
    {
        return string.Empty; // placeholder body so the example compiles
    }
}
My question is: I was told by my fellow developers that the Address property, of type Address, is not thread safe.
From a web request perspective, every request runs on a separate thread, and every time a thread processes the following line in my business object or code-behind, for example
var p = new Person();
it creates a new instance of the Person object on the heap, so the instance is accessed only by the requesting thread, unless I spawn multiple threads in my application.
If I am wrong, please explain to me why I am wrong and why the public property (Address) is not thread safe?
Any help will be much appreciated.
Thanks.
If the reference to your Person instance is shared among multiple threads, then multiple threads could potentially change Address, causing a race condition. However, unless you are holding that reference in a static field or in Session (some sort of globally accessible place), you don't have anything to be worried about.
If you are creating references to objects in your code as you have shown above (var p = new Person();), then you are perfectly thread safe, as other threads will not be able to access the reference to these objects without resorting to nasty and malicious tricks.
Your property is not thread safe, because you have no locking to prevent multiple writes to the property from stepping on each other's toes.
However, in your scenario where you are not sharing an instance of your class between multiple threads, the property doesn't need to be thread safe.
When objects are shared between multiple threads and each thread can change the state of the object, all state changes need to be protected so that only one thread at a time can modify the object.
You should be fine with this; however, there are a few things I'd worry about...
If your Person object were to be modified or held some disposable resources, you could potentially find that one of the threads is unable to read this variable. To prevent this, you would need to lock the object before reading/writing it to ensure it isn't trampled on by other threads. The easiest way is the lock {} construct.

Is this a safe version of double-checked locking?

Slightly modified version of canonical broken double-checked locking from Wikipedia:
class Foo {
    private Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            synchronized (this) {
                if (helper == null) {
                    // Create new Helper instance and store reference on
                    // stack so other threads can't see it.
                    Helper myHelper = new Helper();
                    // Atomically publish this instance.
                    atomicSet(helper, myHelper);
                }
            }
        }
        return helper;
    }
}
Does simply making the publishing of the newly created Helper instance atomic make this double checked locking idiom safe, assuming that the underlying atomic ops library works properly? I realize that in Java, one could just use volatile, but even though the example is in pseudo-Java, this is supposed to be a language-agnostic question.
See also:
Double checked locking Article
It entirely depends on the exact memory model of your platform/language.
My rule of thumb: just don't do it. Lock-free (or reduced lock, in this case) programming is hard and shouldn't be attempted unless you're a threading ninja. You should only even contemplate it when you've got profiling proof that you really need it, and in that case you get the absolute best and most recent book on threading for that particular platform and see if it can help you.
I don't think you can answer the question in a language-agnostic fashion without getting away from code completely. It all depends on how synchronized and atomicSet work in your pseudocode.
The answer is language dependent - it comes down to the guarantees provided by atomicSet().
If the construction of myHelper can be reordered to after the atomicSet(), then it doesn't matter how atomically the variable is assigned to the shared state.
i.e.
// Create new Helper instance and store reference on
// stack so other threads can't see it.
Helper myHelper = new Helper(); // ALLOCATE MEMORY HERE BUT DON'T INITIALISE

// Atomically publish this instance.
atomicSet(helper, myHelper); // ATOMICALLY POINT helper AT THE UNINITIALISED MEMORY

// another thread gets run at this time and tries to use the helper object

// AT THE PROGRAM'S LEISURE, INITIALISE the Helper object.
If this is allowed by the language then the double checking will not work.
Using volatile would not prevent multiple instantiations; however, using synchronized will prevent multiple instances being created. However, with your code it is possible that helper is returned before it has been set up (thread 'A' instantiates it, but before it is set up, thread 'B' comes along, sees that helper is non-null, and so returns it straight away). To fix that problem, remove the first if (helper == null).
Most likely it is broken, because the problem of a partially constructed object is not addressed.
To all the people worried about a partially constructed object:
As far as I understand, the problem of partially constructed objects is only a problem within constructors. In other words, within a constructor, if an object references itself (including its subclass) or its members, then there are possible issues with partial construction. Otherwise, when a constructor returns, the class is fully constructed.
I think you are confusing partial construction with the different problem of how the compiler optimizes the writes. The compiler can choose to A) allocate the memory for the new Helper object, B) write the address to myHelper (the local stack variable), and then C) invoke any constructor initialization. Anytime after point B and before point C, accessing myHelper would be a problem.
It is this compiler optimization of the writes, not partial construction that the cited papers are concerned with. In the original single-check lock solution, optimized writes can allow multiple threads to see the member variable between points B and C. This implementation avoids the write optimization issue by using a local stack variable.
The main scope of the cited papers is to describe the various problems with the double-check lock solution. However, unless the atomicSet method is also synchronizing against the Foo class, this solution is not a double-check lock solution. It is using multiple locks.
I would say this all comes down to the implementation of the atomic assignment function. The function needs to be truly atomic, it needs to guarantee that processor local memory caches are synchronized, and it needs to do all this at a lower cost than simply always synchronizing the getHelper method.
Based on the cited paper, in Java, it is unlikely to meet all these requirements. Also, something that should be very clear from the paper is that Java's memory model changes frequently. It adapts as better understanding of caching, garbage collection, etc. evolve, as well as adapting to changes in the underlying real processor architecture that the VM runs on.
As a rule of thumb, if you optimize your Java code in a way that depends on the underlying implementation, as opposed to the API, you run the risk of having broken code in the next release of the JVM. (Although, sometimes you will have no choice.)
dsimcha:
If your atomicSet method is real, then I would try sending your question to Doug Lea (along with your atomicSet implementation). I have a feeling he's the kind of guy who would answer. I'm guessing that for Java he will tell you that it's cheaper to always synchronize and to look to optimize somewhere else.
