Execute a function after particular time in NodeJs and its lifetime - node.js

In my Node application with MongoDB I have a feature where users can post books for rent and other users can request them with a "whenDate". One post is mapped to only one book.
Suppose a user requests a book for 1 week, starting 5 days from now. In this case I want to lock the book for that week so that no one else can request it during that period.
1) How can I make a function execute after some time in Node.js, considering that I will have many of them? In the case above, this function would be executed after 5 days to lock the particular book document. Please consider question 2 as well.
2) I don't want these timers to be lost if I restart my application. How can I achieve this?
Thanks in advance.

You can use the TTL feature in MongoDB to discard records automatically once their time to live has elapsed.
Let's say you keep a collection of booking requests and set the TTL according to the booking duration. MongoDB can then remove each booking record after its TTL expires, so your Node.js application does not need to trigger any job.
Refer: https://docs.mongodb.com/manual/tutorial/expire-data/
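As a rough illustration of the TTL approach, here is a minimal sketch. The field and function names (bookId, expireAt, buildBooking, ensureTtlIndex) are made up for this example, and `collection` is assumed to be a Collection object from the official mongodb driver.

```javascript
// Sketch only: bookId, expireAt, buildBooking and ensureTtlIndex are
// hypothetical names, not taken from the original post.
const DAY_MS = 24 * 60 * 60 * 1000;

// Build a booking document whose expireAt marks the end of the rental period.
function buildBooking(bookId, startDate, durationDays) {
  return {
    bookId,
    startDate,
    expireAt: new Date(startDate.getTime() + durationDays * DAY_MS),
  };
}

// Create a TTL index. With expireAfterSeconds: 0, MongoDB removes each
// document once the clock passes the date stored in its expireAt field.
// `collection` is assumed to be a Collection from the official mongodb driver.
function ensureTtlIndex(collection) {
  return collection.createIndex({ expireAt: 1 }, { expireAfterSeconds: 0 });
}

// A request for one week, starting five days from now.
const start = new Date(Date.now() + 5 * DAY_MS);
const booking = buildBooking('book-123', start, 7);
console.log(booking.expireAt.toISOString());
```

Because the index and the expireAt values live in the database, nothing is lost when the application restarts. The TTL background task runs about once a minute, so a record may linger briefly after its expiry time.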

Related

Prisma times out with large datasets

I have an app (Node.js (Fastify), Postgres with Prisma) that writes sales from an external API into the Postgres db based on dates. Once the sales have been written, the timestamps are written to a table in order to check later whether that date has been queried (so if we request the sales for October 2019, it will check whether October 2019 has been queried before and, if so, return the sales from the db; otherwise it fetches them from the external API, writes them to the db and writes October 2019 to the date table for next time).
My issue is when trying to get all the sales, which can span several years. The way I do it right now is as follows (please note that the only endpoint I can use with the API is year/month, so I have no other choice but to iterate my requests month by month):
Get the amount of months between first and last sale (for example, 97)
Loop over each month and check whether or not this month has been queried before
If it has been queried before, do nothing
If it has not been queried before, fetch this year/month combination from external API and write it on db
Once the loop has finished, get all the sales from the db in between those 2 dates
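The steps above can be sketched roughly as follows. This is not the asker's code: `wasQueried` and `fetchAndStore` are hypothetical callbacks standing in for the date-table lookup and the fetch-and-upsert step. Awaiting each month in turn is one way to keep the number of concurrent queries low, which may help stay under the connection pool limit.

```javascript
// List every { year, month } pair from the first sale to the last, inclusive.
function monthsBetween(first, last) {
  const months = [];
  let year = first.getUTCFullYear();
  let month = first.getUTCMonth() + 1; // 1-12
  const endYear = last.getUTCFullYear();
  const endMonth = last.getUTCMonth() + 1;
  while (year < endYear || (year === endYear && month <= endMonth)) {
    months.push({ year, month });
    month += 1;
    if (month > 12) {
      month = 1;
      year += 1;
    }
  }
  return months;
}

// Process the months strictly one after another.
// wasQueried and fetchAndStore are hypothetical async callbacks.
async function syncSales(first, last, wasQueried, fetchAndStore) {
  for (const { year, month } of monthsBetween(first, last)) {
    if (!(await wasQueried(year, month))) {
      await fetchAndStore(year, month);
    }
  }
}
```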
The issue I have is that, although I paginated my endpoint, Prisma times out with some stores while upserting. Some months can have thousands of sales with relations for the products sold, and I suspect that is where the issue is.
Here is the error message
Timed out fetching a new connection from the connection pool. More info: http://pris.ly/d/connection-pool (Current connection pool timeout: 10, connection limit: 10)
My question is, is it my logic that is bad and should be redone, or should I not write that many objects in the database, is there a best practice I'm missing ?
I did not provide code as it is working and I feel the issue lies in the logic more than the code itself but I will happily provide code if needed.
Prisma has a connection pool, and you need to configure it to match Heroku's connection limit.
You'll need a ".profile" file in your root folder containing:
export DATABASE_URL="$DATABASE_URL?connection_limit=10&pool_timeout=0"
".profile" is like .bashrc or .zshrc. Its content will be executed on startup of your server. The line above will overwrite the standard env variable for databases on heroku.
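If you would rather set these parameters in code than in ".profile", a small helper along these lines could work. The function name `withPoolParams` and the example URL are invented for illustration. `connection_limit` and `pool_timeout` are the same query parameters used in the line above.

```javascript
// Sketch: add Prisma's pool parameters to a database URL.
// withPoolParams is a hypothetical helper, not part of Prisma.
function withPoolParams(databaseUrl, connectionLimit, poolTimeout) {
  const url = new URL(databaseUrl);
  // searchParams.set keeps any query parameters that are already present.
  url.searchParams.set('connection_limit', String(connectionLimit));
  url.searchParams.set('pool_timeout', String(poolTimeout));
  return url.toString();
}

// Example URL is made up; in practice it would come from process.env.DATABASE_URL.
const pooledUrl = withPoolParams(
  'postgres://user:secret@db.example.com:5432/sales?schema=public',
  10,
  0
);
console.log(pooledUrl);
```

Unlike plain string concatenation with "?", this also handles a URL that already carries a query string.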

How to copy managed database?

AFAIK there is no REST API providing this functionality directly. So, I am using restore for this (there are other ways but those don’t guarantee transactional consistency and are more complicated) via Create request.
Since it is not possible to turn off short time backup (retention has to be at least 1 day) it should be reliable. I am using current time for ‘properties.restorePointInTime’ property in request. This works fine for most databases. But one db returns me this error (from async operation request):
"error": {
"code": "BackupSetNotFound",
"message": "No backups were found to restore the database to the point in time 6/14/2021 8:20:00 PM (UTC). Please contact support to restore the database."
}
I know I am not out of range, because if the restore time is before ‘earliestRestorePoint’ (this can be found in a GET request on the managed database) or in the future, I get a ‘PitrPointInTimeInvalid’ error. Nevertheless, I found some information saying that I shouldn’t use the current time but rather the current time minus 6 minutes at the latest. This also applies when done via the Azure Portal (where it fails with the same error, by the way), which doesn’t allow entering a time newer than the current time minus 6 minutes. After a few tries, I found out that the current time minus roughly 40 minutes starts to work fine. But 40 minutes is a lot, and I didn’t find any way to determine which time works before I try and wait for the result of the async operation.
My question is: Is there a way to find what is the latest time possible for restore?
Or is there a better way to do ‘copy’ of managed database which guarantees transactional consistency and is reasonably quick?
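For reference, computing the "current time minus N minutes" value for the request could look like the sketch below. The helper name is hypothetical, and only the ‘properties.restorePointInTime’ field mentioned above is shown; the rest of the request body is omitted.

```javascript
// Sketch: build the restore point as an ISO 8601 UTC timestamp.
// restorePointMinutesAgo is a hypothetical helper name.
function restorePointMinutesAgo(now, minutes) {
  return new Date(now.getTime() - minutes * 60 * 1000).toISOString();
}

// Only the property named in the question is shown here.
const requestFragment = {
  properties: {
    restorePointInTime: restorePointMinutesAgo(new Date(), 10),
  },
};
console.log(requestFragment.properties.restorePointInTime);
```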
EDIT:
The issue I was describing was reported to MS. It was occurring when:
there is a custom time zone format e.g. UTC + 1 hour.
Backups are skipped for the source database at the desired point in time because the database is inactive (no active transactions).
This should be fixed as of now (25th of August 2021), and I was not able to reproduce it with current time - 10 minutes. I was also told there should be a new API that would allow making a copy without using PITR (no sooner than 1Q/22).
To answer your first question "Is there a way to find what is the latest time possible for restore?"
Yes. Via SQL. The only way to find this out is by using extended event (XEvent) sessions to monitor backup activity.
Process to start logging the backup_restore_progress_trace extended event and report on it is described here https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/backup-activity-monitor
Including the SQL here in case the link goes stale.
This is for storing in the ring buffer (max last 1000 records):
CREATE EVENT SESSION [Verbose backup trace] ON SERVER
ADD EVENT sqlserver.backup_restore_progress_trace(
WHERE (
[operation_type]=(0) AND (
[trace_message] like '%100 percent%' OR
[trace_message] like '%BACKUP DATABASE%' OR [trace_message] like '%BACKUP LOG%'))
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
ALTER EVENT SESSION [Verbose backup trace] ON SERVER
STATE = start;
Then to see output of all backup events:
WITH
a AS (SELECT xed = CAST(xet.target_data AS xml)
FROM sys.dm_xe_session_targets AS xet
JOIN sys.dm_xe_sessions AS xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = 'Verbose backup trace'),
b AS(SELECT
d.n.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
ISNULL(db.name, d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)')) AS database_name,
d.n.value('(data[@name="trace_message"]/value)[1]', 'varchar(4000)') AS trace_message
FROM a
CROSS APPLY xed.nodes('/RingBufferTarget/event') d(n)
LEFT JOIN master.sys.databases db
ON db.physical_database_name = d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)'))
SELECT * FROM b
NOTE: This tip came to me via Microsoft support when I had the same issue of point in time restores failing what seemed like randomly. They do not give any SLA for log backups. I found that on a busy database the log backups seemed to happen every 5-10 minutes but on a quiet database hourly. Recovery of a database this way can be slow depending on number of transaction logs and amount of activity to replay etc. (https://learn.microsoft.com/en-us/azure/azure-sql/database/recovery-using-backups)
To answer your second question: "Or is there a better way to do ‘copy’ of managed database which guarantees transactional consistency and is reasonably quick?"
I'd have to agree with Thomas - if you're after guaranteed transactional consistency and speed you need to look at creating a failover group https://learn.microsoft.com/en-us/azure/azure-sql/database/auto-failover-group-overview?tabs=azure-powershell#best-practices-for-sql-managed-instance and https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/failover-group-add-instance-tutorial?tabs=azure-portal
A failover group for a managed instance will have a primary server and failover server with the same user databases on each kept in synch.
But yes, whether this suits your needs depends on the question Thomas asked of what is the purpose of the copy.

Running a repetitive task in Node.js for each row in a postgres table on a different interval for each row

What would be a good approach to running a repetitive task for each row in a large Postgres db table, on a different interval per row, in Node.js?
To give you some more context, here's a quick description of the application:
It's a chat based customer support app.
It consists of teams, which can be either a client team or a support team. Teams have users, which can be either client users or support users.
Client users send messages to a support team and wait for one of that team's users to answer their question.
When there's an unanswered client message waiting for a response, every agent for the receiving support team will receive a notification every n seconds (n being set on a per-team basis by the team admin).
So this task needs to infinitely loop through the rows in the teams table and send notifications if:
The team has messages waiting to be answered.
N seconds have passed since the last notification was sent (N being the number of seconds set by the team admin).
There might be a better approach to this condition altogether.
So my questions are:
What is an efficient way to infinitely loop through a Postgres table with no upper limit on the number of rows?
Should I load 1 row at a time? Several at a time?
What would be a good way to do this in Node?
I'm using Knex. Does Knex provide a mechanism for lazy loading a table and iterating through the rows?
A) A repetitive task in Node can be run with the built-in JavaScript function 'setInterval'.
// run intervalFnc() every 5 seconds
const timerId = setInterval(intervalFnc, 5000);
function intervalFnc() { console.log("Hello"); }
// to stop running it:
clearInterval(timerId);
Then your interval function can do the actual work. An alternative would be to use cron (linux), or some OS process scheduler to trigger the function. I would use this method if you want to do it every minute, and a cron job if you want to do it every hour (in between these times becomes more debatable).
B) An efficient way...
B-1) Retrieving a block of records from a DB will be more efficient than one at a time. Knex has .offset and .limit clauses to choose a group of records to retrieve. A sample from the knex doc:
knex.select('*').from('users').limit(10).offset(30)
B-2) Indexed database access is important for performance if your tables are very large. I would recommend including a status flag field in your table to note which records are 'in-process', and also a "next-review-timestamp" field, with both fields indexed. Retrieve the records that have status_flag='in-process' AND next_review_timestamp <= now(). Sample:
knex('users').where('status_flag', 'in-process').whereRaw('next_review_timestamp <= now()')
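To show how the two fields could work together, here is a small in-memory sketch of the same check, plus how the next review time might be advanced after a notification is sent. `interval_seconds` is an assumed per-team column, and the function names are made up. In a real application the filtering would be done by the indexed query above rather than in JavaScript.

```javascript
// Mirrors the query condition: status_flag = 'in-process'
// AND next_review_timestamp <= now.
function isDue(row, now) {
  return row.status_flag === 'in-process' &&
    row.next_review_timestamp.getTime() <= now.getTime();
}

// After notifying a team, push its next review time forward by its own
// interval. interval_seconds is a hypothetical per-team column.
function scheduleNext(row, now) {
  return {
    ...row,
    next_review_timestamp: new Date(now.getTime() + row.interval_seconds * 1000),
  };
}

const now = new Date('2024-01-01T12:00:00Z');
const rows = [
  { id: 1, status_flag: 'in-process', interval_seconds: 30,
    next_review_timestamp: new Date('2024-01-01T11:59:50Z') },
  { id: 2, status_flag: 'in-process', interval_seconds: 60,
    next_review_timestamp: new Date('2024-01-01T12:00:20Z') },
  { id: 3, status_flag: 'idle', interval_seconds: 30,
    next_review_timestamp: new Date('2024-01-01T11:00:00Z') },
];
const dueRows = rows.filter((row) => isDue(row, now));
const rescheduled = dueRows.map((row) => scheduleNext(row, now));
console.log(dueRows.map((row) => row.id));
```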
Hope this helps!

Create expired time, or countdown timer

How do I implement an expiry time? Suppose I create an article; to be able to create another one, I have to wait 15 hours. I am using Node.js and node-datetime.
I think it should be the current time plus 15 hours, but how do I do that?
Thanks in advance.
I suppose you have a database with the articles, so just save the creation date with each article, and when a user asks to create a new article, check whether their last article is more than 15 hours old.
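A minimal sketch of that check with plain Date arithmetic (no node-datetime needed) might look like this. The function names are invented for the example, and `lastCreatedAt` is assumed to be the creation date loaded from the database, or null if the user has no articles yet.

```javascript
const COOLDOWN_MS = 15 * 60 * 60 * 1000; // 15 hours in milliseconds

// True if the user may create a new article at time `now`.
function canCreateArticle(lastCreatedAt, now) {
  if (lastCreatedAt === null) return true; // no previous article
  return now.getTime() - lastCreatedAt.getTime() >= COOLDOWN_MS;
}

// The moment at which the user may create the next article.
function nextAllowedAt(lastCreatedAt) {
  return new Date(lastCreatedAt.getTime() + COOLDOWN_MS);
}

const lastArticle = new Date('2024-03-01T08:00:00Z');
console.log(canCreateArticle(lastArticle, new Date('2024-03-01T20:00:00Z')));
console.log(nextAllowedAt(lastArticle).toISOString());
```

The value returned by nextAllowedAt can also be used to drive a countdown on the client side.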

Run a CRON job that depends on entries of a database in NodeJS using AWS

I want to make schedules that depend on entries of a database to schedule cron jobs. Like if there's an entry in database with a timestamp 2:00 PM, 3rd of Apr, I want to send a mail to users on 2nd of Apr. I also want to send notifications at 1:55 PM 3rd of Apr.
So, this means I have to look into the database, find the entries after the current timestamp, see whether they meet the criteria for a notification (such as 5 minutes or 1 day before the timestamp), and send the notification or mail. I'm only worried that doing this every minute seems like too much load. Are the AWS web workers built for this sort of thing?
Any suggestions on how this can be accomplished?
I don't think crontab will be the best choice, but if you're familiar with it, it's fine.
First, estimate how frequently entries are created. If it is, say, only a couple of hundred a day, my suggestion is to create the crontab job right after the entry is created. But if it is more than a hundred a minute, polling will be fine.
But there are also side effects to handle, like canceling or updating the cron job.
I think it's better to use a proper MQ.
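If you go with polling, the per-minute check could be sketched like this. All names here (dueReminders, LEAD_TIMES_MS, the entry shape) are hypothetical. The lead times follow the question: a mail one day before the timestamp and a notification five minutes before it.

```javascript
const MINUTE_MS = 60 * 1000;

// How long before the entry's timestamp each kind of reminder is sent.
const LEAD_TIMES_MS = {
  mail: 24 * 60 * MINUTE_MS,
  notification: 5 * MINUTE_MS,
};

// Return the reminders whose send time falls inside the polling window
// (now - pollIntervalMs, now].
function dueReminders(entries, now, pollIntervalMs) {
  const windowStart = now.getTime() - pollIntervalMs;
  const due = [];
  for (const entry of entries) {
    for (const [kind, leadMs] of Object.entries(LEAD_TIMES_MS)) {
      const sendAt = entry.timestamp.getTime() - leadMs;
      if (sendAt > windowStart && sendAt <= now.getTime()) {
        due.push({ id: entry.id, kind });
      }
    }
  }
  return due;
}

const entries = [{ id: 1, timestamp: new Date('2024-04-03T14:00:00Z') }];
console.log(dueReminders(entries, new Date('2024-04-03T13:55:00Z'), MINUTE_MS));
```

A window-based check like this can miss reminders if a polling run is skipped or delayed, so a more robust version would probably record which reminders have already been sent.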
