MongoDB Atlas server - slow return - node.js

So I understand that some queries can take a while, and that querying the same information many times can just eat up RAM.
I am wondering: is there a way to make the following query more friendly for real-time requests?
const LNowPlaying = require('mongoose').model('NowPlaying');
var query = LNowPlaying.findOne({"history":[y]}).sort({"_id":-1})
Our iOS and Android apps request this information every second, which takes a toll on MongoDB Atlas.
We are wondering if there is a way in Node.js to cache the returned data for at least 30 seconds and then fetch the new now-playing data only when it has changed.
(NOTE: We have a listener script that listens for song metadata changes and updates NowPlaying for every listener.)

MongoDB will do its own caching of queried data in memory where possible, but the frequent queries described here may still put too much load on the database.
You could use Redis, Memcached, or even an in-memory cache on the Node.js side to hold the query results for a time. The listener script you mention could invalidate the cache each time a song's metadata is updated, so clients still get the most up-to-date data. One example of a backend-agnostic cache client for Node.js is catbox.
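For illustration, here is a minimal sketch of the in-memory option with a 30-second TTL, reusing the mongoose model from the question. The function names getNowPlaying and invalidateNowPlaying, and the assumption of a single Node.js process, are mine, not part of the original setup.

const mongoose = require('mongoose');
const LNowPlaying = mongoose.model('NowPlaying');

const TTL_MS = 30 * 1000;   // serve cached data for up to 30 seconds
let cached = null;          // last query result
let cachedAt = 0;           // when the cached result was fetched

async function getNowPlaying(y) {
  const isFresh = cached && (Date.now() - cachedAt) < TTL_MS;
  if (isFresh) return cached;
  cached = await LNowPlaying.findOne({ history: [y] }).sort({ _id: -1 }).lean();
  cachedAt = Date.now();
  return cached;
}

// Call this from the listener script whenever the song metadata changes,
// so the next request re-reads the new NowPlaying document immediately.
function invalidateNowPlaying() {
  cached = null;
  cachedAt = 0;
}

module.exports = { getNowPlaying, invalidateNowPlaying };

If the app runs on more than one process or dyno, the same idea would move to Redis or Memcached so all processes share one cache.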

Related

Is writing multiple INSERTs versus UPDATEs faster for temporary Postgres databases?

I am redesigning a project I built a year ago when I was just starting to learn how to code. I used the MEAN stack back then and want to convert it to a PERN stack now. My AWS knowledge has also grown a bit and I'd like to expand on these new skills.
The application receives real-time data from an API, which I clean up, write to a database, and broadcast to connected clients.
To better conceptualize this question I will refer to the following items:
api-m1: receives the incoming data and maps it onto my schema; I then send it to my socket-server.
socket-server: handles the WSS connections to the application's front-end clients. It also writes the data it receives from the scraper and api-m1 to a Postgres database. I would like to turn this into clusters eventually, since I am using Node.js, and will incorporate Redis; then I will run it behind an ALB with sticky sessions etc. across multiple EC2 instances.
RDS: the Postgres table which socket-server writes incoming scraper and api-m1 data to. RDS is used to fetch the most recent stored data along with user profile config data. NOTE: the RDS main data table will have at most 120-150 UID records with 6-7 columns.
To help visualize this, see the image below.
From a database perspective, what would be the quickest way to write my data to RDS?
Assume that during peak times we have 20-40 records/s from api-m1 plus another 20-40 records/s from the scraper. At the end of each day I tear down the database with a Lambda function and start again (the data is only temporary and does not need to be kept for any prolonged period of time).
1. Should I INSERT each record with a SERIAL id, then from the front end fetch the most recent rows based on the UID?
2a. Should I UPDATE each UID, so I have a fixed N rows of data which I just search and update? (I can see this bottlenecking with my Postgres client.)
2b. Still use UPDATE, but do BATCHED updates? (What issues will I run into if I make multiple clusters, i.e. will I run into concurrency problems where table record XYZ has an older value overwrite a more recent value because I'm using BATCH UPDATE with Node clusters?)
My concern is that UPDATEs are slower than INSERTs, and I want to make this as fast as possible. This section of the application isn't CPU heavy, and the rt-data isn't that intensive.
To make my comments an answer:
You don't seem to need SQL semantics for anything here, so I'd just toss RDS and use e.g. Redis (or DynamoDB, I guess) for that data store.
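As a rough sketch of what that could look like, assuming the node-redis v4 client: each UID becomes a small Redis hash, so an "update" is just an O(1) overwrite of a few fields and the front end reads the latest snapshot directly. The key pattern rt:<uid> and the function names are made up for illustration.

const { createClient } = require('redis');

const redis = createClient({ url: process.env.REDIS_URL });

async function init() {
  await redis.connect();
}

// Called by socket-server for every incoming api-m1 / scraper record.
// Field values must be strings or numbers, e.g. { price: '12.3', ts: '1690000000' }.
async function writeRecord(uid, fields) {
  await redis.hSet(`rt:${uid}`, fields);
}

// Called when a client needs the most recent data for a UID.
async function readRecord(uid) {
  return redis.hGetAll(`rt:${uid}`);
}

module.exports = { init, writeRecord, readRecord };

The daily teardown could then be a FLUSHDB or a per-key TTL instead of a Lambda that drops and recreates a table.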

Handle Concurrent Requests in Redis

I am working on a Node.js API application that stores and retrieves data using a MongoDB database. For fast execution I am using Redis to cache data, with a hash set to store and retrieve it.
When the first request comes in with data, I check for that data in Redis; if it is already present, I throw an error.
If it is not present, I push it into Redis, do further processing, and after that update the previously pushed data.
But I have observed that under concurrency this does not work correctly: it creates duplicate data in MongoDB. As concurrency increases, multiple requests arrive at the same time and the Redis caching no longer works properly.
So how do I deal with such a case?
Redis is a single-threaded server: if you send multiple concurrent requests, Redis will process them in the order they are received at Redis' end. Therefore, you need to ensure the ordering of the requests on the application side.
If you want to maintain the atomicity of a batch of commands, read up on Redis transactions and use a MULTI/EXEC block. After a MULTI command, subsequent commands are queued in order and executed when the EXEC is received.
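For the specific duplicate-insert race in the question, a single atomic command is often simpler than a transaction. Below is a minimal sketch using SET with the NX flag, assuming the node-redis v4 client; the key names (processing:<id>, data:<id>) and the 60-second expiry are illustrative only.

const { createClient } = require('redis');

const redis = createClient({ url: process.env.REDIS_URL });
// call `await redis.connect()` once at startup, before handling requests

async function claimRequest(id) {
  // NX: set only if the key does not already exist; EX: expire the claim so a
  // crashed worker cannot hold it forever. This is one atomic round trip, so
  // two concurrent requests cannot both see "not present".
  const result = await redis.set(`processing:${id}`, '1', { NX: true, EX: 60 });
  return result === 'OK'; // null means another request already claimed this id
}

async function handleRequest(id, fields) {
  if (!(await claimRequest(id))) {
    throw new Error('Duplicate request'); // mirrors the behaviour in the question
  }
  // ...do the MongoDB write and any further processing here...
  await redis.hSet(`data:${id}`, fields); // then update the cached entry
}

If several commands really must be applied together, redis.multi()...exec() queues them in order exactly as described above.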

Should I cache data in my server or just rely on MongoDB?

I have a website which runs on Heroku, and I am using MongoDB Atlas as my database. I have tested the Mongo connection speeds and found they are around 5 ms to 20 ms, depending on the data I am retrieving.
Note: both the Heroku app and MongoDB Atlas are in the same AWS zone.
Now my question: I have a collection with around 10K records which my users query frequently. For this use case, should I cache those 10K records on the server, or should I leave it to MongoDB and live with the ~15 ms overhead? What are your thoughts?
If it's just one MongoDB call, then I would say do not cache and leave the caching to MongoDB. In a real-world scenario the average response time will be around 300 ms to 900 ms (based on my Pingdom results for my website), so compared with the total response time the query delay is relatively very low; you would be saving about 15 ms out of ~900 ms.
So it's better to stay with MongoDB for cleaner code and easier maintenance.

Redis cache layer for MongoDB queries for a performance boost

I have an e-commerce site. My product catalog is in MongoDB and all other transactions are in MySQL.
I am planning to use Express middleware which will use Redis as a cache layer for all outgoing MongoDB queries.
Can anybody help me out with how to design the architecture?
I will be very thankful.
Current technology stack: Node.js + MongoDB + MySQL.
Generally, Redis is very good for caching data: instead of hitting the main database for every request, it is better to go with a caching technique. Again, it depends on how reliably you update the cached data; serious issues will come up if you miss updating the cache whenever changes happen in the main databases. You have to listen for database changes and update the cache, and if change notifications are not available in your setup you can write a cron job to refresh the cache periodically. We would need more details about the schema and architecture to help further.
Suggestion:
Step 1: Whenever you insert or update data in MongoDB (or any main database), also add it to the Redis cache.
Step 2: That way, 99% of the data for a particular product will be available in the Redis cache.
Step 3: Whenever an API call is made for a product or any other detail, check the Redis cache first; if the data exists, return it, else fetch it from the main database and add it to the cache at the same time, so the next request is served from the cache (a sketch of this follows below).
Hopefully this gives some more insight.
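A minimal cache-aside sketch of steps 1-3, assuming node-redis v4, Express, and a mongoose Product model; the route path, key names, and the 60-second TTL are placeholders, not part of the question.

const express = require('express');
const mongoose = require('mongoose');
const { createClient } = require('redis');

const redis = createClient({ url: process.env.REDIS_URL });
const Product = mongoose.model('Product'); // assumes the catalog schema is registered elsewhere
const app = express();

// Step 3: check the Redis cache first and fall back to MongoDB on a miss.
function cacheProduct(ttlSeconds = 60) {
  return async (req, res, next) => {
    const key = `product:${req.params.id}`;
    const hit = await redis.get(key);
    if (hit) return res.json(JSON.parse(hit));
    res.locals.cacheKey = key;
    res.locals.ttl = ttlSeconds;
    next();
  };
}

app.get('/products/:id', cacheProduct(), async (req, res) => {
  const product = await Product.findById(req.params.id).lean();
  if (!product) return res.status(404).end();
  // Steps 1-2: write the result back to Redis so the next request is a cache hit.
  await redis.set(res.locals.cacheKey, JSON.stringify(product), { EX: res.locals.ttl });
  res.json(product);
});

redis.connect().then(() => app.listen(3000));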

Where / when to close a MongoDB connection in a forEach iteration context

I have a Node.js app running on Heroku, backed by a MongoDB database, which breaks down like this:
The Node app connects to the DB and stores the db and collection handles in "top-level" variables (not sure if global is the right word).
The app iterates through each document in the DB using the forEach() function in the Node MongoDB driver.
Each iteration passes the document id to another function that uses the id to access fields on that document and takes actions based on that data. In this case it makes requests against APIs from Amazon and Walmart to get updated pricing info. This function is also throttled so as not to make too many requests too quickly.
My question is this: how can I know it's safe to close the DB connection? My best idea is to get a count of the documents, multiply that by the number of external API hits per document, increment a variable by one each time an API transaction finishes, and close the connection once that counter reaches the expected total (sketched below). This sounds so hackish that there has to be a better way. Any ideas?
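For reference, here is a sketch of the counting idea described above, using the MongoDB Node.js driver; APIS_PER_DOC, updatePricing(), and the connection/collection names are hypothetical stand-ins for the throttled Amazon/Walmart calls.

const { MongoClient } = require('mongodb');

const APIS_PER_DOC = 2; // one Amazon call + one Walmart call per document

// Hypothetical stand-in for the throttled external pricing request.
async function updatePricing(docId, source) {
  // ...call the external API and write the updated price back to MongoDB...
}

async function run() {
  const client = new MongoClient(process.env.MONGO_URL);
  await client.connect();
  const coll = client.db('app').collection('products');

  const expected = (await coll.countDocuments()) * APIS_PER_DOC;
  let finished = 0;
  const maybeClose = () => {
    finished += 1;
    if (finished === expected) client.close(); // every API transaction has reported back
  };

  await coll.find().forEach((doc) => {
    // each call must invoke maybeClose() exactly once, on success or failure
    updatePricing(doc._id, 'amazon').finally(maybeClose);
    updatePricing(doc._id, 'walmart').finally(maybeClose);
  });
}

run().catch(console.error);

Collecting the per-document promises and awaiting Promise.all before client.close() would achieve the same result without the manual counter.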
