Is there any particular reason to use memcached for fast access to cached data instead of just creating a global CACHE variable in the node program and using that?
Assume that the application will be running in one instance and not distributed across multiple machines.
The global variable option seems like it would be faster and more efficient but I wasn't sure if there was a good reason to not do this.
It depends on the size and number of items. If you're working with a few items of modest size and they don't need to be accessible to other node instances, then using an object as a key/value store is fine. The one trick is that when you go to delete/remove items from the cache/object, make sure you don't keep any other references to them; otherwise you will have a leak.
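For example, a minimal in-process cache along these lines (the helper names are just illustrative, not a library API) keeps a single reference per entry, so deleting a key really does free it for garbage collection as long as nothing else still points at the value:

// Minimal in-process cache sketch; names are illustrative.
const cache = Object.create(null); // no prototype, so keys can't collide with Object methods

function put(key, value) {
  cache[key] = value;
}

function get(key) {
  return cache[key];
}

function remove(key) {
  // After this, the value is eligible for GC only if no other code still references it.
  delete cache[key];
}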
Related
Is there any way to lock an object in a Node.js application?
If multiple instances of the application are running, some functions shouldn't run concurrently. When instance A has finished the function, it should unlock that object/key (or some other identifier), and instance B should check whether it is unlocked and, if so, run the function.
Any object or key can be used to identify what is being locked and unlocked.
How can this be done in a Node.js application that has multiple instances?
As mentioned above, Redis may be your answer; however, it really depends on the resources available to you. There are some other possibilities, less complicated and certainly less powerful, which may also do the trick.
node-cache may also do the trick if you set it up correctly. It is nowhere near as powerful as Redis, but on the bright side it does not require as much setup and interaction with your environment.
So there are Redis and node-cache for memory locks. I should mention there are quite a few npm packages which handle caching; it depends on what you need and how intricate your cache needs to be.
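As a rough sketch of the Redis route (assuming the ioredis client; the key prefix and 30-second TTL are arbitrary), a lock can be taken with SET ... NX EX and released by deleting the key only if you still hold it:

const Redis = require('ioredis');
const redis = new Redis(); // connects to localhost:6379 by default

// Try to take the lock: NX = only set if the key does not exist, EX = expire after 30s.
async function acquireLock(name, token) {
  const result = await redis.set('lock:' + name, token, 'EX', 30, 'NX');
  return result === 'OK'; // null means another instance holds the lock
}

// Release the lock only if we still own it, by comparing the stored token first.
async function releaseLock(name, token) {
  const script =
    'if redis.call("get", KEYS[1]) == ARGV[1] then return redis.call("del", KEYS[1]) else return 0 end';
  return redis.eval(script, 1, 'lock:' + name, token);
}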
However, there are less elegant ways to do what you want, though less elegant is not necessarily worse.
You could use a JSON file-based system and hold locks on the files for a TTL. lockfile or proper-lockfile will accomplish the task. You can read the information from the files when needed, delete it when required, and give the locks a TTL. Basically a cache system on disk.
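A rough sketch of the file route, assuming the proper-lockfile package and a hypothetical cache.json file (the stale option acts as a crude TTL in case a lock holder crashes):

const lockfile = require('proper-lockfile');
const fs = require('fs');

async function updateCache(mutate) {
  const release = await lockfile.lock('cache.json', { stale: 30000 }); // lock considered stale after 30s
  try {
    const data = JSON.parse(fs.readFileSync('cache.json', 'utf8'));
    mutate(data);
    fs.writeFileSync('cache.json', JSON.stringify(data));
  } finally {
    await release(); // always release, even if the update throws
  }
}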
The memory system is obviously faster. The file system requires just as much planning in your code as the memory system.
There is yet another way. This is possibly the most dangerous one, and you would have to think long and hard on the consequences in terms of security and need.
Node.js has its own process.env. As most know, this holds the environment variables available to the process: simply write process.env.foo, where foo has been declared as an environment variable. A package such as dotenv allows you to add to these variables by way of a .env text file. Thus if you put sam=mongoDB in that file, then wherever your code reads process.env.sam it will get mongoDB. Tons of variables can be set up here.
So what good does that do, you may ask? Well, these variables are visible across the whole process and can be changed in mid-flight, so if you need to lock a variable and then change it, it is a simple matter to do. Beware of the gotcha here, though: once the process goes down, or all processes stop and are started again, your environment variables will return to the defaults in the .env file.
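A minimal sketch of that pattern, assuming the dotenv package and a .env file containing sam=mongoDB (keep in mind the change only lives in the current process's environment and disappears on restart):

require('dotenv').config(); // loads .env into process.env

console.log(process.env.sam); // "mongoDB"

// Flip it mid-flight; the .env file on disk is untouched,
// so the original value comes back the next time the process starts.
process.env.sam = 'locked';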
Additionally, unless you are running in a reasonably locked-down environment on AWS, Azure, etc., I would not feel secure having my .env file open to the world. There is a way around this one too: you can encrypt the variable values and put the ciphertext in the file, then decrypt a value just before you actually use it.
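As a sketch of that last idea (strictly speaking this is encryption rather than hashing, since the value has to be recoverable): assume the ciphertext is stored in .env as iv:authTag:data in hex, and the 32-byte key comes from somewhere outside the file, e.g. injected by the hosting platform. The variable names here are hypothetical.

const crypto = require('crypto');

// Decrypt a value stored in process.env as "iv:authTag:ciphertext", all hex-encoded.
function decryptEnv(name, key) {
  const [ivHex, tagHex, dataHex] = process.env[name].split(':');
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, Buffer.from(ivHex, 'hex'));
  decipher.setAuthTag(Buffer.from(tagHex, 'hex'));
  return Buffer.concat([
    decipher.update(Buffer.from(dataHex, 'hex')),
    decipher.final()
  ]).toString('utf8');
}

const key = Buffer.from(process.env.MASTER_KEY, 'hex'); // 32 bytes, supplied by the host, not kept in .env
const dbPassword = decryptEnv('SECRET_DB_PASSWORD', key);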
There are probably many more ways to lock and unlock, not the least of which is to use the native Node.js structures, combining file system events with crypto. But this demands a much deeper understanding of the actual Node.js library and structures.
Hope some of this helped.
I strongly recommend Redis in your case.
There are several ways to create an application/process-shared object; using locks, as you mentioned, is one of them.
But they're just complicated. Unless you really need to do that yourself, Redis will be good enough: atomic operations across multiple processes, transactions, and so on.
Old thread but I didn't want to use redis so I made my own open source solution which utilizes websocket connections:
https://github.com/OneAndonlyFinbar/sync-cache
I am trying to find the best caching solution for a Node app. There are a few modules which can manage this, the most popular being: https://www.npmjs.com/package/node-cache
But I found that responses are faster if I just save some results into a variable like:
var cache = ["Kosonsoy","Pandean","Ḩadīdah","Chirilagua","Chattanooga","Hebi","Péruwelz","Pul-e Khumrī"];
I can then update that variable on a fixed interval. Is this also classed as caching? Are there any known issues/problems with this method? It definitely provides the fastest response times.
Of course it is faster as you use local memory to store the data.
For limited cache size this is good, but can quickly eat up all your process memory.
I would advise to use modules as there is a considerable amount of collaborative brainpower invested in them. Alternatively you can use a dedicated instance to run something like Redis to use as cache.
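For reference, using the node-cache module you mentioned is only a couple of lines more than the plain variable and gives you TTLs for free (the key name and TTL below are just examples):

const NodeCache = require('node-cache');
const myCache = new NodeCache({ stdTTL: 600, checkperiod: 120 }); // entries expire after 10 minutes

myCache.set('cities', ['Kosonsoy', 'Pandean', 'Chattanooga']);
const cities = myCache.get('cities'); // undefined once the TTL has passed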
Alternatives aside, if you stick with your solution, I recommend a small improvement.
Instead of
var cache = ["Kosonsoy","Pandean","Ḩadīdah","Chirilagua","Chattanooga","Hebi","Péruwelz","Pul-e Khumrī"];
try using an object as key/value storage.
This will make searching and updating entries faster.
var cache = {"Kosonsoy":data,"Pandean":moreData,...};
Searching an array requires iteration, while accessing the object is as simple as
var storedValue = cache["Kosonsoy"];
and saving is
cache[key] = value;
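And since you mentioned refreshing on a fixed interval, here is a sketch of that; fetchPlaceNames is a hypothetical function standing in for however you currently build the list:

let cache = {};

async function refreshCache() {
  const entries = await fetchPlaceNames(); // hypothetical: resolves to [{ name, data }, ...]
  const next = {};
  for (const { name, data } of entries) {
    next[name] = data;
  }
  cache = next; // swap in one go so readers never see a half-built cache
}

refreshCache();
setInterval(refreshCache, 60 * 1000); // refresh every minute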
If you use many workers you will have a duplicated cache for each one because they have no shared memory.
Of course, you can use it for small data. And keeping data in a variable for an amount of time can be called caching, in my opinion.
I'm working on a Chef implementation where, in the past, node.set has sometimes been used where node.default would have done. In order to untangle this I've become pretty comfortable with the Chef attribute precedence paradigm. I understand that "normal" attributes (assigned using node.set) persist between chef-client runs.
This has led me to wonder: what are the common and best ways to use node.set? I don't understand the value of having attribute assignments persist on a node between chef-client runs.
The places to use node.set are when you need some state but can't (easily) store it in the system. This is common with self-generating database passwords. You need to store it somewhere, usually because other nodes need the password but the database itself only stores it hashed so you can't retrieve it from there. Using the node object as stateful storage gives you a place to put the data in the interim.
Also because I have to say it, storing passwords like this is highly insecure, please don't.
Historically node.set and "normal" attributes were first introduced. They just were attributes and were how attributes worked.
They are useful for tags, the run_list, the chef_environment and other bits of 'desired' state -- things that you set with knife on the command line and expect the node to pick up and for the recipes to consume, but not set themselves. Since chef clients need a run_list to do anything this problem had to get solved first, and normal attributes are how it got implemented.
As Chef evolved, default and override precedence levels were created and those were cleared on the start of the chef run, which means that recipes can use them much more declaratively. This is how people generally want attributes in recipes to behave, which is why it was introduced that way.
The use of 'node.set' is now highly confusing, since it sounds like the thing to use if you want to set an attribute on the node, when in nearly all cases 'node.default' or 'node.override' is preferred. The use of 'node.set'/normal leads to hard-to-debug behavior: when the code that sets the attributes is removed from cookbooks, the attributes persist, leading to fun times debugging until it's recognized that the state is persisting in the node object.
While it can be used to store password information, as #coderanger points out this is completely insecure. Every node on the chef server can read the node information of every other node, so your password is essentially broadcast out globally.
Unless you're doing something akin to 'tagging' the server (in which case why not use the node.tags feature we already built over the top of normal attributes?) then you really don't want to be using the normal precedence level.
The Chef attributes system unfortunately grew organically and now we're left with the rule "do not use node.set to set node attributes".
For that reason, we're going to start deprecating the use of node.set in favor of node.normal in order to whittle away a bit of the confusion (https://github.com/chef/chef/pull/5029).
Managing Change of State
For example, you can store the role of a node in a persistent attribute, and if the current role changes from db-slave to db-master, you can tear down the one-way replication from the previous db-master in addition to promoting and announcing the current db-master.
Migrating State
For example, if your Redis server was on node A and is now on node B, you can move your data from server A to B.
As another example, an SSH key pair is generated for each server at first converge, and the backup server has the corresponding public keys in authorized_keys. If your backup server changes, you can move authorized_keys to the new server and delete it on the old one.
So I have a backend implementation in node.js which mainly contains a global array of JSON objects. The JSON objects are populated by user requests (POSTS). So the size of the global array increases proportionally with the number of users. The JSON objects inside the array are not identical. This is a really bad architecture to begin with. But I just went with what I knew and decided to learn on the fly.
I'm running this on an AWS micro instance with 6 GB of RAM.
How to purge this global array before it explodes?
Options that I have thought of:
At a periodic interval write the global array to a file and purge. Disadvantage here is that if there are any clients in the middle of a transaction, that transaction state is lost.
Restart the server every day and write the global array into a file at that time. Same disadvantage as above.
Follow 1 or 2, and for every incoming request - if the global array is empty look for the corresponding JSON object in the file. This seems absolutely absurd and stupid.
Somehow I can't think of any other solution without having to completely rewrite the Node.js application. Can you think of any? I will greatly appreciate any discussion on this.
I see that you are using memory as storage. If that is the case and your code is synchronous (you don't seem to use a database, so it might be), then solution 1 is actually correct. This is because JavaScript is single-threaded, which means that while one piece of code is running, no other code can run. There is no concurrency in JavaScript; it is only an illusion, because Node.js is so fast.
So your cleaning code won't fire until the transaction is over. This is of course assuming that your code is synchronous (and from what I see it might be).
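A sketch of option 1 along those lines, assuming the array is called globalArray and sticking with synchronous writes so the dump cannot interleave with a request handler:

const fs = require('fs');

let globalArray = []; // populated by the POST handlers

setInterval(() => {
  // Runs between event-loop turns, so no handler is mid-way through mutating the array.
  fs.writeFileSync('array-dump.json', JSON.stringify(globalArray));
  globalArray = []; // purge only after a successful write
}, 60 * 60 * 1000); // once an hour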
But still there are like 150 reasons for not doing that. The most important is that you are reinventing the wheel! Let the database do the hard work for you. Using a proper database will save you all the trouble in the future. There are many possibilities: MySQL, PostgreSQL, MongoDB (my favourite), CouchDB and many more. It shouldn't matter at this point which one. Just pick one.
I would suggest that you start saving your JSON to a non-relational DB like http://www.couchbase.com/.
Couchbase is extremely easy to set up and use, even in a cluster. It uses a simple key-value design, so saving data is as simple as:
couchbaseClient.set("someKey", "yourJSON")
then to retrieve your data:
data = couchbaseClient.get("someKey")
The system is also extremely fast and is used by OMGPOP for Draw Something. http://blog.couchbase.com/preparing-massive-growth-revisited
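Spelled out a little more with the Couchbase Node SDK, this sketch assumes the older 2.x callback-style API (newer SDK versions use collections and promises instead), and the bucket name is just an example:

const couchbase = require('couchbase');
const cluster = new couchbase.Cluster('couchbase://localhost');
const bucket = cluster.openBucket('default');

// Store the JSON document, then read it back.
bucket.upsert('someKey', { your: 'JSON' }, (err) => {
  if (err) throw err;
  bucket.get('someKey', (err, result) => {
    if (err) throw err;
    console.log(result.value); // { your: 'JSON' }
  });
});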
I have a small project that I was using node-dirty for, but it's not really for production use and I've had way too many surprises with it, so I would like to switch. I was looking at using SQLite, but compiling a client for it seems troublesome. Is there something like node-dirty (i.e. a pure Node.js implementation of a data store), but more suited for a small project that doesn't have more than a few hundred sets of data? I've faced the following problems with node-dirty that I would expect an alternative data store not to have:
Saving a Date object makes it come out as a string when reloading the data (but during execution it remains a Date object). I'm fine with having to serialize the Date object myself, as long as I get out the same thing it lets me put into it.
Iterating over data and deleting something in the same forEach loop makes the iteration stop.
My client is reporting deleted data re-appearing and I've intermittently seen this too, I have no idea why.
How much data do you have? For some projects it's reasonable to just have things in memory and persist them by dumping a JSON file with all the data.
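A sketch of that approach, which also works around the Date problem mentioned above by reviving ISO date strings on load (the file name and the strictness of the date check are up to you):

const fs = require('fs');

const FILE = 'data.json';
const isoDate = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

function save(store) {
  fs.writeFileSync(FILE, JSON.stringify(store)); // Date objects serialize to ISO strings
}

function load() {
  if (!fs.existsSync(FILE)) return {};
  // The reviver turns ISO-looking strings back into Date objects.
  return JSON.parse(fs.readFileSync(FILE, 'utf8'), (key, value) =>
    typeof value === 'string' && isoDate.test(value) ? new Date(value) : value
  );
}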
Just use npm to install a NoSQL module like redis or mongodb.