I am working on a Node.js API application that stores and retrieves data using a MongoDB database. For fast execution, I am using Redis to cache data, with a hash set to store and retrieve it.
When a request comes in with data, I check for that data in Redis; if it is present, I throw an error.
If it is not present, I push the data into Redis, do further processing, and afterwards update the previously pushed data.
But when I look at the behaviour under concurrency, it is not working correctly: duplicate data gets created in MongoDB. As concurrency increases, multiple requests arrive at the same time and the Redis check no longer protects against duplicates.
So how do I deal with such a case?
Redis is a single-threaded DB server. If you send multiple concurrent requests, then Redis will process them in the order that those requests are received at Redis' end. Therefore, you need to ensure the order of the requests sent from the application side.
If you still want to maintain the atomicity of a batch of commands, you can read more about Redis transactions and use a MULTI/EXEC block. When using the MULTI command, subsequent commands are queued in the same order and executed atomically when EXEC is received.
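For example, here is a minimal sketch of a MULTI/EXEC block using the ioredis client (the client choice, key and field names are assumptions, not from the question): both commands are queued and only run, in order, when exec() is called.

const Redis = require('ioredis');
const redis = new Redis();

async function markRequestInProgress(requestId, payload) {
  // hset and expire are queued on the MULTI block and executed atomically on exec()
  return redis
    .multi()
    .hset('requests:in-progress', requestId, JSON.stringify(payload))
    .expire('requests:in-progress', 60) // illustrative TTL so stale entries disappear
    .exec(); // resolves to an array of [error, result] pairs, one per queued command
}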
So I understand how some queries can take a while, and querying the same information many times can just eat up RAM.
I am wondering: is there a way to make the following query more friendly for real-time requests?
const LNowPlaying = require('mongoose').model('NowPlaying');
var query = LNowPlaying.findOne({"history":[y]}).sort({"_id":-1})
We have iOS and Android apps that request this information every second, which takes a toll on MongoDB Atlas.
We are wondering if there is a way in Node.js to cache the returned data for at least 30 seconds and then fetch the new now-playing data once it has changed.
(NOTE: We have a listener script that listens for song metadata changes and updates NowPlaying for every listener.)
MongoDB will try to do its own in-memory caching of queried data where possible, but the frequent queries mentioned may still put too much load on the database.
You could use Redis, Memcached, or even an in-memory cache on the Node.js side to hold the query results for a time. The listener script referenced could invalidate the cache each time an update occurs for a song's metadata, to ensure clients get the most up-to-date data. One example of an agnostic cache client for Node.js is catbox.
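As an illustration of the in-memory option, here is a minimal sketch (the 30-second TTL and the omission of the original history filter are my assumptions) that serves the cached document until it expires or the listener script invalidates it:

const mongoose = require('mongoose');
const LNowPlaying = mongoose.model('NowPlaying');

const TTL_MS = 30 * 1000; // keep results for 30 seconds
let cached = null;
let cachedAt = 0;

async function getNowPlaying() {
  if (cached && Date.now() - cachedAt < TTL_MS) {
    return cached; // served from memory, no MongoDB round trip
  }
  cached = await LNowPlaying.findOne().sort({ _id: -1 }).lean();
  cachedAt = Date.now();
  return cached;
}

// The listener script can call this when the song metadata changes,
// so the next request re-queries MongoDB immediately.
function invalidateNowPlaying() {
  cached = null;
}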
I'm considering Node.js for a backend application. Node will run a REST API (Express) and front a Postgres DB. I might have to make queries returning 1k records, and I would then need to do in-memory filtering/data manipulation to return JSON responses from my API.
I might be facing 10-100 TPS (transactions per second).
I know I can't block the event loop, but I'm not able to wrap my mind around what is considered blocking in terms of CPU processing load.
About once a minute, I need to cache all orderbooks from various cryptocurrency exchanges. There are hundreds of orderbooks, so this update function will likely never stop running.
My question is: If my server is constantly running this orderbook update function, will it block all other server functionality? Will users ever be able to interact with my server?
Do I need to create a separate service to perform the updating, or can Node somehow prioritize API requests and pause the caching function?
My question is: If my server is constantly running this orderbook update function, will it block all other server functionality? Will users ever be able to interact with my server?
If you are writing asynchronously, these actions will go into your event loop, and your Node server will pick the next event from the event loop while these actions are being performed. If you have too many events like this, your event queue will grow long, and users will face really slow responses or may even get a timeout.
Do I need to create a separate service to perform the updating, or can Node somehow prioritize API requests and pause the caching function?
Node only consumes events from the event queue. There are no priorities.
From a design perspective, you should look for options that reduce this write load, such as bulk create/edit operations, or, if you are using Redis for the cache, a Redis pipeline.
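As a rough sketch of the pipeline idea with the ioredis client (the client choice, key and field layout are assumptions): the writes are buffered and sent to Redis in a single round trip instead of one request per orderbook.

const Redis = require('ioredis');
const redis = new Redis();

async function cacheOrderbooks(orderbooks) {
  const pipeline = redis.pipeline();
  for (const ob of orderbooks) {
    // illustrative layout: one hash, one field per exchange/pair
    pipeline.hset('orderbooks', `${ob.exchange}:${ob.pair}`, JSON.stringify(ob));
  }
  await pipeline.exec(); // all queued commands are flushed to Redis at once
}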
This is a very open-ended question, much of which depends on your system. In general your server should be able to handle concurrent requests, but there are some things to watch out for.
Performance costs. If the operation to retrieve and store data requires too much computational power, then it will put a strain on all requests processed by the server.
Database connections. The server spends a lot of time waiting for database queries to complete. If you have one database connection for the entire application and this connection is busy, requests will have to wait until the database connection is free. You may want to look into database connection 'pooling'.
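A minimal pooling sketch with node-postgres (the pool size, idle timeout and table name are illustrative only):

const { Pool } = require('pg');

const pool = new Pool({
  max: 10,                  // at most 10 concurrent connections to Postgres
  idleTimeoutMillis: 30000, // close clients that sit idle for 30 seconds
});

async function getRecentOrders() {
  // pool.query checks out a client, runs the query and returns the client to the pool
  const { rows } = await pool.query('SELECT * FROM orders ORDER BY id DESC LIMIT 1000');
  return rows;
}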
I'm developing an API for sending SMS with an HTTP request. I use Node.js and Mongoose, so I have a problem like the one you get in a multi-threaded application.
When a user sends an SMS, I check in the database (using Mongoose) how many SMS he has already sent; if that number doesn't exceed a limit, his SMS is sent and the count of SMS he has sent is incremented in the database (the schema holds the number of SMS sent in the hour, day, week and month). The issue is that I use callbacks for reading the value, incrementing it and many other operations in my code.
So the problem (I think) is that when the user sends requests very quickly, the different callbacks on the server read the same SMS count, authorize the user to send an SMS, then increment and save the same value, so the SMS count ends up wrong.
In a multi-threaded application accessing a shared variable, the solution would be to prevent other threads from reading the variable before the current thread has finished all of its work.
With Node.js's event system and data access through MongoDB, I just don't know how to solve my problem.
Thank you in advance for the answers.
PS: I don't know the solution, but it would be good if it also works with clusters, which allow Node.js to use multiple cores.
I think you should try a cache-based approach.
I am facing the same situation as you right now.
I will try to use a cache to store the record_id that is in process.
When a new request comes in, it needs to check the cache first. If the record_id is in the cache, that means the record is being used by another request, so the new request needs to wait (or do something else) until the first one finishes. When processing finishes, the record_id is removed from the cache in the callback function.
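A minimal sketch of that idea for a single Node.js process (the helper name is mine): the record id acts as a lock that is always released once the work finishes.

const inProgress = new Set();

async function withRecordLock(recordId, work) {
  if (inProgress.has(recordId)) {
    // another request is already processing this record: wait or reject it
    throw new Error('Record is busy, try again later');
  }
  inProgress.add(recordId);
  try {
    // e.g. read the SMS count, check the limit, send the SMS, increment and save
    return await work();
  } finally {
    inProgress.delete(recordId); // remove the record id once processing finishes
  }
}

Note that an in-process Set like this only covers a single process; with the cluster module (as in the follow-up below) the lock would have to live in a store shared by all workers, such as Redis.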
Thanks Cristy, I have solved the main part of my problem using an async queue.
My application works well when I run it the default Node.js way.
But there is another problem. I intend to run my code on a server that has 4 cores, so I want to use the Node cluster module. But when I used it... because the code runs as 4 different processes (I used a server with 4 cores), they use different queues, and the error I mentioned earlier still occurs: they read and write to the database without waiting for the other processes to finish the verification + update.
So I would like to know what I should do to have an optimal and fast application.
Should I stop using the cluster module and give up the benefit of a multi-core server (I don't think that is the best answer)?
Should I store the queue in MongoDB (or maybe not persist the queue but keep it in memory to make it faster)?
Is there a way to share the queue in the code when I use cluster?
What is my best choice?
I'm using Node.js and PostgreSQL and trying to be as efficient as possible in the connection implementation.
I saw that pg-promise is built on top of node-postgres and node-postgres uses pg-pool to manage pooling.
I also read that "more than 100 clients at a time is a very bad thing" (node-postgres).
I'm using pg-promise and wanted to know:
What is the recommended poolSize for a very big load of data?
What happens if poolSize = 100 and the application gets 101 requests simultaneously (or even more)?
Does Postgres handle the order and make the 101st request wait until it can run it?
I'm the author of pg-promise.
I'm using Node.js and PostgreSQL and trying to be as efficient as possible in the connection implementation.
There are several levels of optimization for database communications. The most important of them is to minimize the number of queries per HTTP request, because IO is expensive, and so is the connection pool.
If you have to execute more than one query per HTTP request, always use tasks, via method task.
If your task requires a transaction, execute it as a transaction, via method tx.
If you need to do multiple inserts or updates, always use multi-row operations. See Multi-row insert with pg-promise and PostgreSQL multi-row updates in Node.js.
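The first and third points might look roughly like this (a sketch only; the connection string, table and column names are assumptions):

const pgp = require('pg-promise')();
const db = pgp('postgres://user:password@localhost:5432/mydb'); // illustrative connection

// Several queries per HTTP request: share one connection via a task.
function getUserWithMessages(userId) {
  return db.task(t =>
    t.one('SELECT * FROM users WHERE id = $1', [userId])
      .then(user =>
        t.any('SELECT * FROM messages WHERE user_id = $1', [userId])
          .then(messages => ({ user, messages }))));
}

// Multi-row insert: one INSERT statement instead of one round trip per row.
const cs = new pgp.helpers.ColumnSet(['user_id', 'body'], { table: 'messages' });

function insertMessages(rows) {
  const query = pgp.helpers.insert(rows, cs); // builds a single multi-row INSERT
  return db.none(query);
}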
I saw that pg-promise is built on top of node-postgres and node-postgres uses pg-pool to manage pooling.
node-postgres started using pg-pool from version 6.x, while pg-promise remains on version 5.x which uses the internal connection pool implementation. Here's the reason why.
I also read that "more than 100 clients at a time is a very bad thing"
My long practice in this area suggests: if you cannot fit your service into a pool of 20 connections, you will not be saved by going for more connections; you will need to fix your implementation instead. Also, by going over 20 you start putting additional strain on the CPU, which translates into further slow-down.
What is the recommended poolSize for a very big load of data?
The size of the data has nothing to do with the size of the pool. You typically use just one connection for a single download or upload, no matter how large. If your implementation is wrong and you end up using more than one connection, then you need to fix it if you want your app to be scalable.
What happens if poolSize = 100 and the application gets 101 requests simultaneously?
It will wait for the next available connection.
See also:
Chaining Queries
Performance Boost
What happens if poolSize = 100 and the application gets 101 requests simultaneously (or even more)? Does Postgres handle the order and make the 101st request wait until it can run it?
Right, the request will be queued. But it is not handled by Postgres itself; it is handled by your app (pg-pool). Whenever you run out of free connections, the app waits for a connection to be released, and then the next pending request is performed. That's what pools are for.
What is the recommended poolSize for a very big load of data?
It really depends on many factors, and no one will really tell you the exact number. Why not test your app under a huge load, see in practice how it performs, and find the bottlenecks?
Also, I find the node-postgres documentation quite confusing and misleading on the matter:
Once you get >100 simultaneous requests your web server will attempt to open 100 connections to the PostgreSQL backend and 💥 you'll run out of memory on the PostgreSQL server, your database will become unresponsive, your app will seem to hang, and everything will break. Boooo!
https://github.com/brianc/node-postgres
It's not quite true. If you reach the connection limit on the Postgres side, you simply won't be able to establish a new connection until a previous connection is closed. Nothing will break if you handle this situation in your Node app.