Question about Redis Implementation on NodeJS

Why should you use Redis to optimize your NodeJS application?
Why POST, PUT and DELETE methods should never be cached?
How does the caching process work?
Why do we cache?
What do you need to install to use Redis on NodeJS?
What is an example of an app that uses the Redis implementation?
Is there any alternative to Redis, or anything better?
Is it too hard to implement Redis in NodeJS?
What happens when we don’t use Redis?
Can we use Redis in any OS?

According to several sources that I have searched:
By using Redis we can use a cache database that gives clients faster data retrieval.
a. The POST method itself is semantically meant to post something to a resource. POST cannot be cached because if you do something once vs twice vs three times, then you are altering the server's resource each time. Each request matters and should be delivered to the server.
b. The PUT method itself is semantically meant to put or create a resource. It is an idempotent operation, but it won't be used for caching because a DELETE could have occurred in the meantime.
c. The DELETE method itself is semantically meant to delete a resource. It is an idempotent operation, but it won't be used for caching because a PUT could have occurred in the meantime.
We can simplify the process like this (a sketch follows the list):
a. a client requests data X with ID "id1".
b. the system checks for data X in the cache database in RAM.
c. if data X is available in the cache database, the client retrieves it from the cache database in RAM.
d. if the data is unavailable in the cache database, the system retrieves it from the API, delivers it to the client, and saves it to the cache database at the same time.
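A minimal sketch of this flow in Node.js, assuming the redis npm client (v4) and a hypothetical fetchFromApi() standing in for the real API:

```js
const { createClient } = require('redis');

const client = createClient(); // defaults to redis://localhost:6379

// Hypothetical stand-in for the real API call in step d.
async function fetchFromApi(id) {
  return { id, value: `payload for ${id}` };
}

async function getData(id) {
  const cached = await client.get(`data:${id}`); // step b: check the cache in RAM
  if (cached !== null) {
    return JSON.parse(cached); // step c: cache hit, serve straight from RAM
  }
  const fresh = await fetchFromApi(id); // step d: cache miss, fall back to the API
  await client.set(`data:${id}`, JSON.stringify(fresh), { EX: 60 }); // save with a TTL
  return fresh;
}

async function main() {
  await client.connect();
  console.log(await getData('id1')); // first call: miss, fetched and cached
  console.log(await getData('id1')); // second call: hit, served from Redis
  await client.quit();
}

main().catch(console.error);
```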
To shorten the data retrieval time.
The redis package in npm.
Twitter, GitHub, Weibo, Pinterest, Snapchat.
Memcached, MongoDB, RabbitMQ, Hazelcast, and Cassandra are the most popular alternatives and competitors to Redis.
The Redis community is quite large, and you can find lots of tutorials and manuals. You should be fine.
No cache means slower data queries and degraded performance.
Redis works in most POSIX systems like Linux, *BSD, and OS X, without external dependencies, but there is no official support for Windows builds.

According to many sources,
It gives clients faster retrieval of similar/repeated data; that's why it's called a cache database.
Because commands (POST, PUT, DELETE) may include many variables and thus differ for each client, they are not worth caching. You might want to read more about CQRS.
One of the many methods, in oversimplified terms (a sketch follows the list):
a. a client requests certain data A with request ID req-id-1.
b. the cache is stored in high-speed memory (RAM).
c. if another client requests data with ID req-id-1, instead of reading from slower drives, it is delivered from the cache in RAM.
d. if data A is updated, the cached entry for req-id-1 is deleted, and the process repeats from step a.
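A minimal sketch of step d's invalidation, again assuming the redis v4 client and a hypothetical saveToDatabase() for the persistent write:

```js
const { createClient } = require('redis');
const client = createClient(); // assumes client.connect() is awaited at startup

// Hypothetical stand-in for the write to the primary store.
async function saveToDatabase(requestId, value) { /* persist the new value */ }

// When data A changes, write it through and delete the stale cache entry,
// so the next request for req-id-1 misses the cache and repeats from step a.
async function updateData(requestId, newValue) {
  await saveToDatabase(requestId, newValue);
  await client.del(`cache:${requestId}`);
}
```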
same as answer 1.
redis or ioredis in npm, and a Redis process running.
Mostly apps/sites with a lot of GET requests, such as news portals. If a site is well known, there's a high probability it implements Redis.
This is an opinionated question; see the list of Redis-like DBs in the previous answer.
As long as you read the manual/tutorials, it should be fine.
No cache, thus saving RAM, but slower queries.
Works on most POSIX systems.

Related

Node.js: a variable in memory compared to redis (or other in-memory key/value store)

I'd like to store some info. in a node.js array variable (to be a local cache) that my middleware would check before making a database query.
I know that I can do this w/redis and it's generally the preferred method b/c redis offers snapshots for persistence and is quite performant, but I can't imagine anything being more performant than a variable stored in-memory.
Every time someone brings up this topic, however, folks say "memory leaks" make this a bad idea. But why? Why is node.js bad at managing server-side vars?
Is there a preferred method (outside of an external k/v db store) of managing a server-side array/cache through node.js?
The problem with using a node variable as storage is that by using it you have made your application unable to scale. Consider a large application which serves thousands of requests per second, and cannot be run on a single machine. If you spin up a second node process, it has different values for your node storage variable.
Let's say a user making an API call to your application hits machine 1, and stores a session variable. They make a second API call and this time are routed by your load balancer to machine 2. Their session variable is not found and you throw an error.
If you are writing a small application and have no expectations of scaling up in the near term, by all means use a node variable - I've done this myself before for auth tokens on small websites. You can always switch to redis later if you need to. Of course, you need to be aware that if your node process restarts, the contents of your variable will be lost.
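A minimal sketch of the scaling pitfall using Node's built-in cluster module: each worker process gets its own copy of the variable, so a value written on one worker is invisible to the other:

```js
const cluster = require('cluster');
const http = require('http');

let hits = 0; // per-process variable: every worker holds an independent copy

if (cluster.isPrimary) { // cluster.isMaster on Node < 16
  cluster.fork();
  cluster.fork();
} else {
  // Both workers share port 3000; connections are distributed between them.
  http.createServer((req, res) => {
    hits += 1; // only counts requests that landed on THIS worker
    res.end(`worker ${process.pid} has seen ${hits} hits\n`);
  }).listen(3000);
}
```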

Can I cache a single value in Azure Functions without any negative effects?

I have an Azure Function on a timer that activates every minute, which calls an API which returns an integer value. I store this value in SQL.
I then have another Azure Function that can be queried by the user to retrieve the integer value. This query could in theory be as high as hundreds or thousands of times per second.
Rather than have the second Azure Function query SQL every single time it gets a request, I would like it to cache the value in memory. If the cache were perfect there would be no need for SQL at all, but because Functions can scale up and also seem to lose their cache periodically there has to be some persistent storage.
Is it just a case of a static variable within the function to cache the value, and another with the last date retrieved? Or is there another type of caching that I can use within the function?
I understand there are solutions such as Redis but they seem pretty overkill to spin up just for a single integer value. I'm also not even sure if Azure SQL itself would cache the value when it's requested.
My question is, would a static variable work (if it's null/reset then we'd just do a quick SQL query to get the value) and actually persist? Or does an alternative like redis or similar exist that wouldn't be overkill for this application? And finally, is there actually any harm (performance problems) in hammering SQL over and over to retrieve a single value (i.e. is it clever enough to cache already so there's not a significant performance hit vs. querying a variable in memory)?
Really depends. If you understand the limitations of using in-memory cache in an azure function, and your business case is fine with those limitations, you should use it.
The main thing is you can't invalidate cache.
So for example, if your number changes, the cached copy may no longer be usable for you. You will have cases where a container for your Azure Function is spun up holding an old value. The same user could get different values on each request, because who knows which instance they will hit and what that instance is caching.
If your number is something that is set only once and doesn't change, you don't have this issue.
Another important thing is that you still make quite a few requests just to populate the cache. Every new container has to cache the value for itself, while a centralized cache would do it only once. This can be fine for something small, but if fetching the thing you're caching really takes a significant amount of time, or if the resources of the service are very limited, you would be a lot more efficient with a centralized cache.
No matter what, caching in Azure Function level still reduces the load, and there's no reason to make requests when you don't have to.
To answer your SQL question: yes, most likely the SQL server will cache it too, but your Azure Function still needs to establish a connection to the SQL server, make the request, and close the connection.
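For reference, a minimal sketch of the static-variable approach in a Node.js Azure Function, subject to the limitations above; querySqlForValue() and the 60-second freshness window are assumptions:

```js
// Module-level state survives across invocations on the SAME instance,
// but every scaled-out instance keeps its own copy, and it is lost on restart.
let cachedValue = null;
let cachedAt = 0;
const TTL_MS = 60 * 1000; // assumed freshness window

// Hypothetical stand-in for the real SQL query.
async function querySqlForValue() {
  return 42;
}

module.exports = async function (context, req) {
  if (cachedValue === null || Date.now() - cachedAt > TTL_MS) {
    cachedValue = await querySqlForValue(); // refresh on a miss or after expiry
    cachedAt = Date.now();
  }
  context.res = { body: { value: cachedValue } };
};
```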
Azure Functions best practices state that functions should be stateless and your state information should live with the data. I think Redis is still a better option than SQL.

Caching posts using redis

I have a forum which contains groups, new groups are created all the time by users, currently I'm using node-cache with ttl to cache groups and it's content (posts, likes and comments).
The server worked great at the beginning, but performance decreased as more people started using the app, so I decided to use the Node.js Cluster module as the next step to improve performance.
node-cache will cause a consistency problem: the same group could be cached in two workers, so if one of them changes it, the other will not know (unless you notify it).
The first solution that came to my mind is using Redis to store the whole group and its content with the help of Redis data types (sets and hash objects), but I don't know how efficient this would be.
The other solution is using Redis to map requests to the correct worker. In this case the cached data is distributed randomly across workers, so when a worker receives a request related to some group, it looks up the group owner (the worker that holds this group instance in memory) in Redis, asks it for the wanted data over node-ipc, and then returns it to the user.
Is there any problem with the first solution?
The second solution does not provide fairness (what if all the popular groups land in the same worker?); is there a solution for this?
Any suggestions?
Thanks in advance
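For what it's worth, a minimal sketch of the first solution with the redis v4 client, storing each group's posts in a hash so all workers read the same copy (key and helper names are hypothetical):

```js
const { createClient } = require('redis');
const client = createClient(); // assumes client.connect() is awaited at startup

// Store each post of a group as a field in one hash: group:{id}:posts.
async function cachePost(groupId, postId, post) {
  await client.hSet(`group:${groupId}:posts`, String(postId), JSON.stringify(post));
}

async function getGroupPosts(groupId) {
  const raw = await client.hGetAll(`group:${groupId}:posts`); // { postId: json, ... }
  return Object.fromEntries(
    Object.entries(raw).map(([id, json]) => [id, JSON.parse(json)])
  );
}

// Because every cluster worker reads the same Redis keys, an update made by one
// worker is immediately visible to the rest, avoiding node-cache's consistency problem.
```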

Where to store consistent JSON, Redis or global variable?

I've been using Node for my applications for a while, and I was wondering where a global or local variable is stored (in RAM, or maybe the CPU cache? I'm guessing RAM, right?), and whether it's a good idea to store some JSON that is static most of the time in a global variable and access it directly.
Would that be faster than reading from an in-memory database like Redis?
For example, say I am talking about something like a website categories list, which is a JSON document with some nodes in it.
Most of the time this JSON is constant, and even if it changes I can refresh the variable with the new value, because one server app handles all requests.
And when the Node app starts, I can have an initializer function that reads the JSON from an on-disk database.
Currently I am using Redis for this situation: when the app starts, I read this JSON from MySQL and keep it in Redis for faster request handling.
But I'm wondering: is it good practice to keep the JSON in a global variable, and how would that compare to Redis performance-wise?
P.S.: I know Redis has persistence and keeps values on disk too, but I am reading them from MySQL because Redis is a caching mechanism for a small part of the schema, and using an initializer gives me a manual sync if needed.
Thanks
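For comparison, a minimal sketch of the global-variable alternative the question describes, assuming mysql2 (the table name is hypothetical):

```js
const mysql = require('mysql2/promise');

let categories = null; // module-level "global" cache, lives in this process's RAM

// Initializer: read the JSON from MySQL once when the app starts.
async function initCategories() {
  const db = await mysql.createConnection({ host: 'localhost', user: 'app', database: 'site' });
  const [rows] = await db.execute('SELECT data FROM categories_json LIMIT 1'); // hypothetical table
  categories = JSON.parse(rows[0].data);
  await db.end();
}

// Request handlers read the variable directly: no network hop, no deserialization.
function getCategories() {
  return categories;
}

// Manual refresh when the data changes (works because one server app handles all requests).
const refreshCategories = initCategories;
```

Reading a local variable skips the network round trip and JSON parsing that every Redis GET pays, at the cost of the persistence and multi-process sharing discussed in the answers below.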
I would prefer Redis, because even if you restart the Node application the data will still be there. Putting global variables in memory has one disadvantage: if you want to change them at run time, you are left with only the choice of restarting the whole application.
Plus, while the application is running, you can always query Redis to get the data whenever you want. So if in the future you want these values to be dynamic, the change is reflected directly by just updating it in Redis.
You can keep it anywhere you want. You can store the JSON in files and require them when starting your app; I'd prefer this if the data does not change.
If you update it, then you can use any database or caching mechanism and read it from there. It's up to you.
Yes, the variables are stored in memory. They won't persist if the app crashes, so persistent storage is recommended.

Is this MEAN stack design-pattern suitable at the 1,000-10,000 user scale?

Let's say that when a user logs into a webapp, he sees a list of information.
Let's say that list of information is served by one of two dynos (via heroku), but that the list of information originates from a single mongo database (i.e., the nodejs dynos are just passing the mongo information to a user when he logs into the webapp).
Question: Suppose I want to make it possible for a user to both modify and add to that list of information.
At a scale of 1,000-10,000 users, is the following strategy suitable:
User modifies/adds to data; HTTP POST sent to one of the two nodejs dynos with the updated data.
Dyno (whichever one it may be) takes modification/addition of data and makes a direct query into the mongo database to update the data.
Dyno sends confirmation back to the client that the update was successful.
Is this OK? Would I likely have to add more dynos (Heroku)? I'm basically worried that if a bunch of users try to access a single database at once it will be slow, or that I'm somehow risking corrupting the entire database at the 1,000-10,000 person scale. Is this fear reasonable?
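For concreteness, a minimal sketch of that update path with Express and the official mongodb driver (the collection and field names are hypothetical):

```js
const express = require('express');
const { MongoClient, ObjectId } = require('mongodb');

const app = express();
app.use(express.json());

const mongo = new MongoClient('mongodb://localhost:27017');
const items = mongo.db('app').collection('items'); // hypothetical collection

// Steps 1-3: receive the POST on whichever dyno, write directly to Mongo, confirm.
app.post('/items/:id', async (req, res) => {
  const result = await items.updateOne(
    { _id: new ObjectId(req.params.id) },
    { $set: { data: req.body.data } },
    { upsert: true } // covers both "modify" and "add to" the list
  );
  res.json({ ok: true, matched: result.matchedCount });
});

mongo.connect().then(() => app.listen(3000));
```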
Short answer: Yes, it's a reasonable fear. Longer answer, depends.
MongoDB will queue requests and handle them in the order it receives them. Depending on how much of the data is being served from memory, it may or may not be fast enough.
NodeJS follows the same design pattern: it queues requests it can't yet process and executes them when resources become available.
The only way to tell if performance is being hindered is by monitoring it, and seeing if resources consistently hit a threshold you're uncomfortable with passing. On the upside, during your discovery phase your clients will probably only notice a few milliseconds of delay.
The proper way to implement that is to spin up a new instance as the resources get consumed to handle the traffic.
Your database likely won't corrupt, but if your data is important (and why would you collect it if it isn't?), you should be creating a replica set. I would probably go with a replica set of data before I go with a second instance of node.
