What would be a good approach to running a repetitive task for each row in a large Postgres table, on a different per-row interval, in Node.js?
To give you some more context, here's a quick description of the application:
It's a chat based customer support app.
It consists of teams, which can be either a client team or a support team. Teams have users, which can be either client users or support users.
Client users send messages to a support team and wait for one of that team's users to answer their question.
When there's an unanswered client message waiting for a response, every agent for the receiving support team will receive a notification every n seconds (n being set on a per-team basis by the team admin).
So this task needs to infinitely loop through the rows in the teams table and send notifications if:
The team has messages waiting to be answered.
N seconds have passed since the last notification was sent (N being the number of seconds set by the team admin).
There might be a better approach to this condition altogether.
So my questions are:
What is an efficient way to infinitely loop through a Postgres table with no upper limit on the number of rows?
Should I load 1 row at a time? Several at a time?
What would be a good way to do this in Node?
I'm using Knex. Does Knex provide a mechanism for lazy loading a table and iterating through the rows?
A) Running a repetitive task in Node can be done with the built-in function setInterval.
// run intervalFnc() every 5 seconds
const timerId = setInterval(intervalFnc, 5000);

function intervalFnc() { console.log("Hello"); }

// to stop running it:
clearInterval(timerId);
Then your interval function can do the actual work. An alternative would be to use cron (Linux), or some other OS process scheduler, to trigger the function. I would use setInterval if you want to run every minute, and a cron job if you want to run every hour (anywhere in between becomes more debatable).
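If you go the cron route, the crontab entry is a one-liner (the script path is illustrative):

# run the task every hour, on the hour
0 * * * * /usr/bin/node /path/to/notify-task.js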
B) An efficient way...
B-1) Retrieving a block of records from a DB will be more efficient than retrieving one at a time. Knex has .offset and .limit clauses to choose a group of records to retrieve. A sample from the Knex docs:
knex.select('*').from('users').limit(10).offset(30)
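To walk a table with no upper bound on rows, you can page until a query comes back empty. A minimal sketch, assuming a users table with an id primary key; the page size is arbitrary:

const PAGE_SIZE = 100;

async function forEachUser(handler) {
  for (let offset = 0; ; offset += PAGE_SIZE) {
    const rows = await knex.select('*').from('users')
      .orderBy('id') // stable ordering so pages don't overlap or skip rows
      .limit(PAGE_SIZE)
      .offset(offset);
    if (rows.length === 0) break;
    for (const row of rows) {
      await handler(row);
    }
  }
}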
B-2) Indexed database access is important for performance if your tables are very large. I would recommend including a status-flag field in your table to note which records are 'in-process', and also a "next-review-timestamp" field, with both fields indexed. Retrieve the records that have status_flag = 'in-process' AND next_review_timestamp <= now(). Sample:
knex('users').where('status_flag', 'in-process').whereRaw('next_review_timestamp <= now()')
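Putting A) and B) together for your teams use case, a minimal sketch; the table and column names (teams, has_pending_messages, next_notification_at, notify_interval_seconds) and the sendNotifications() helper are assumptions, not part of your actual schema:

async function notifyDueTeams() {
  for (;;) {
    // each processed team gets a new next_notification_at below, so it
    // drops out of this filter and no offset bookkeeping is needed
    const due = await knex('teams')
      .where('has_pending_messages', true)
      .where('next_notification_at', '<=', knex.fn.now())
      .limit(100);
    if (due.length === 0) break;
    for (const team of due) {
      await sendNotifications(team); // app-specific, assumed to exist
      await knex('teams').where('id', team.id).update({
        next_notification_at: knex.raw(
          'now() + make_interval(secs => ?)',
          [team.notify_interval_seconds]
        ),
      });
    }
  }
}

// poll more often than the smallest per-team interval
setInterval(notifyDueTeams, 1000);

With both columns indexed, each poll only touches the teams that are actually due, no matter how many rows the table grows to.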
Hope this helps!
In my Node application with MongoDB I have a feature where users can post books for rent, and other users can request them with a "whenDate". One post is mapped to exactly one book.
Consider a user who requests a book for one week, starting 5 days from now. In this case I want to lock the book for that week so that no one else can request it during that period.
1) How can I arrange in Node.js for a function to be executed after some time, considering that I will have many of them? In the above case, this function would execute after 5 days to lock the particular book document. Please consider question 2 as well.
2) I don't want these timers to be lost if I restart my application. How can I achieve this?
Thanks in advance.
You can use the TTL (time-to-live) feature in MongoDB to discard records automatically once their time-to-live expires.
Let's say you keep a collection of booking requests and set the TTL according to the booking duration. MongoDB can then remove each booking record once its TTL is reached, so your Node.js application does not need to trigger any job.
Refer: https://docs.mongodb.com/manual/tutorial/expire-data/
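If it helps, a minimal sketch with the official mongodb Node.js driver; the locks collection and field names are illustrative, not from your schema:

async function lockBook(db, bookId, expireAt) {
  // create the TTL index once at startup in a real app (createIndex is
  // idempotent); with expireAfterSeconds: 0, MongoDB's TTL monitor deletes
  // each document once its expireAt date has passed (the monitor runs
  // roughly every 60 seconds, so deletion is not instantaneous)
  await db.collection('locks').createIndex(
    { expireAt: 1 },
    { expireAfterSeconds: 0 }
  );
  await db.collection('locks').insertOne({ bookId, expireAt });
}

Because the lock lives in the database rather than in a Node timer, it also survives application restarts, which answers question 2.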
I'm using MongoDB and I'm quite new to it, so I'd like your help on how to model my data. What is the most efficient way?
Here is my use case.
Let's say I have three income sources, named Income1, Income2, Income3. Tomorrow there might be 4 or 20. Each new income source will require a new integration to be implemented.
Let's say I have ten users, named User1, User2... User10. Tomorrow they might be 1000. (I hope ;-)). Here, no integration is needed for a new user.
And let's say that I'm interested in storing, for each day, how much money User1 got from Income1, Income2, ..., User2 from Income1, Income2..., and so on. And some day I'll want to aggregate all of this.
Still following me?
How should I model this?
First idea: Separate collections and separate documents
3 collections: Income1, Income2, Income3. If an Income4 comes up, no problem: since I'll have to add some code anyway, I can also create a new collection. Not an issue.
In each collection, the data for a user, with one document per user and per date, like this:
Income 1
{name:'user1', date:'2014-12-07',money:'24.32'}
{name:'user1', date:'2014-12-08',money:'14.20'}
{name:'user2', date:'2014-12-07',money:'0.00'}
{name:'user2', date:'2014-12-08',money:'0.00'}
{name:'user2', date:'2014-12-09',money:'10.00'}
{name:'user3', date:'2014-12-09',money:'124.32'}
Income 2
{name:'user1', date:'2014-12-05',money:'4.00'}
{name:'user2', date:'2014-12-06',money:'0.20'}
Second idea: Separate collections, one document per user with embedded documents
3 collections as before. In each collection, the data for a user, with ONE document per user:
Income 1
{name:'user1', incomes:
[{date:'2014-12-07',money:'24.32'},{date:'2014-12-08',money:'14.20'}]}
{name:'user2', incomes:
[{date:'2014-12-07',money:'0.00'},{date:'2014-12-08',money:'0.00'},{date:'2014-12-09',money:'10.00'}]}
{name:'user3', incomes:
[{date:'2014-12-09',money:'124.32'}]}
Income 2
{name:'user1', incomes: [{date:'2014-12-05',money:'4.00'}]}
{name:'user2', incomes:[{date:'2014-12-06',money:'0.20'}]}
Third idea: Same collection, and separate documents for everything.
{income_type:1,name:'user1', date:'2014-12-07',money:'24.32'}
{income_type:1,name:'user1', date:'2014-12-08',money:'14.20'}
{income_type:1,name:'user2', date:'2014-12-07',money:'0.00'}
{income_type:1,name:'user2', date:'2014-12-08',money:'0.00'}
{income_type:1,name:'user2', date:'2014-12-09',money:'10.00'}
{income_type:1,name:'user3', date:'2014-12-09',money:'124.32'}
{income_type:2,name:'user1', date:'2014-12-05',money:'4.00'}
{income_type:2,name:'user2', date:'2014-12-06',money:'0.20'}
These are some ideas. I'm sure there are others.
I will often have to query per user, on the most recent documents (i.e. those with the most recent dates). From time to time I may need to aggregate information per week, month... And, finally, I think I'll update the collection from a cron running every night (to add the corresponding income for each income source and user).
Is this clear? I come from a relational database background (is it so obvious?) so maybe there is something I haven't even considered.
Thanks in advance.
At this point I would recommend the third idea. Rolling the data up per user and/or per income stream is quite simple using the aggregation pipeline. Working with sub-documents is more pain than it's worth, in my experience.
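For example, a minimal rollup sketch under the third schema, assuming an incomes collection queried through the Node.js driver and that money is stored as a number (the string values in the examples above would need converting first):

// totals per income stream for one user; run inside an async function
const totals = await db.collection('incomes').aggregate([
  { $match: { name: 'user1' } },
  { $group: { _id: '$income_type', total: { $sum: '$money' } } },
  { $sort: { _id: 1 } },
]).toArray();

Swapping the $match for a date range gives you the weekly or monthly aggregates, and an index on { name: 1, date: -1 } would serve the "most recent documents per user" queries.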
I am new to Apex. I want to write a before insert trigger in Apex. I have two standard objects (Contact, Opportunity).
SELECT sum(amount), Bussiness__c FROM opportunity
WHERE stagename='Closed Won' and id='006i000000Kt683AAB' GROUP BY Bussiness__c
When the trigger runs, I want it to get the sum(Amount) and the Bussiness__c value, and then update the Contact's Total_Business__c field with the sum(Amount) value. Here Bussiness__c is the Contact id on the Opportunity object.
Thanks in advance, and waiting for your positive response.
I'm assuming you don't have currencies enabled in your org (if you see "CurrencyIsoCode" somewhere on your objects, you'll have to modify this design a bit).
I am a lazy person, and you didn't write anything about the amount of data you expect. What I've written will work when there's a reasonable number of Opportunities per contact. If you start hitting the governor limit of 50K query rows, it'd have to be done differently (I'll write a bit about that at the end).
I am not going to give you a ready solution, because a "homemade rollup summary" is one of the assignments you might encounter during SF DEV 501 certification. I'll just outline some pointers and food for thought.
I wouldn't do it before insert; it's easier in after insert, after update (you didn't think about recalculation when the Amount changes, did you?). Something should also be said about after delete and after undelete, if your users are allowed to delete Opportunities.
The first thing is to build a set of "contacts we'll have to recalculate":
Set<Id> contactIds = new Set<Id>();
if (Trigger.old != null) { // null in insert and undelete context
    for (Opportunity o : Trigger.old) {
        contactIds.add(o.Business__c);
    }
}
if (Trigger.new != null) { // null in delete context
    for (Opportunity o : Trigger.new) {
        contactIds.add(o.Business__c);
    }
}
contactIds.remove(null);
This forces recalculation for all related contacts and ignores opportunities without a contact. It'll always fire... which is not the best thing: on insert, delete, and undelete you do want it to fire always, but on update you'd want it to fire only when the Amount or the Contact changes (trigger.old will hold a different contact than trigger.new). You can control these scenarios with variables like Trigger.isUpdate; read up about it.
Anyway - you've got a unique set of Contact Ids. I said I'd do it in an "after" trigger because at that point the new Amount is already saved to the database and you can query it back:
SELECT Business__c, SUM(Amount) sumAmount
FROM Opportunity
WHERE Business__c IN :contactIds
This type of query returns AggregateResult records that you'll have to parse like this:
List<Contact> contactsToUpdate = new List<Contact>();
for (AggregateResult ar : [SELECT Business__c, SUM(Amount) sumAmount
                           FROM Opportunity
                           WHERE Business__c IN :contactIds]) {
    System.debug(ar);
    contactsToUpdate.add(new Contact(
        Id = (Id) ar.get('Business__c'),
        Total_Business__c = (Decimal) ar.get('sumAmount')
    ));
}
update contactsToUpdate;
As I said - it's a basic outline, should get you started.
This thing queries all Opportunities for the given contacts. Your trigger can fire on at most 200 Opps. Imagine a situation where you change the contact on all 200 Opps -> that gives you up to 400 contacts you need to update, to clear/fix the old value and to set the new value. With the 50K-row limit, and assuming no other business logic is triggered (an update of Accounts? an action that started because some Opportunity Products were added?), you hit problems when on average one contact is involved in 125 Opps. It sounds like a ridiculous problem, but there are scenarios where you need to do it differently.
In such cases you can attack it from another angle. You don't really need to query all Opps for a given Contact; that's the lazy way. You could instead read the current value of total business (treating null as 0) and then add/subtract the changes to the Amount as needed, looking only at trigger.old and trigger.new. It means more code and more planning upfront, but performance will increase significantly, and this solution will scale as the number of Opps grows (it'll continue to look at only the current maximum of 200 Opps in the trigger's scope).
Another approach would be to accept some delay in this rollup summary and write a batch job for it.
My objective is to get the current timestamp using Syncsort when an existing OPC job has run fine in production. In my case I cannot add my new job after the existing OPC job. Is there any facility to check whether the existing job ran fine in production?
I mean, is there any reference table holding production job details with a status for each day?
Can anyone help me move forward?
There are commercial packages that track jobs and job status; CA (Computer Associates) is one such vendor.
However, these packages cost a lot. A simple, home-grown solution is to have a dataset known to both jobs: job1 writes a one-line record into that dataset when it completes, and job2 reads the dataset to "know" whether job1 ran. If this is what you are trying to do, it is not exactly clear from your question, but any solution along these lines works, until management wants to cough up $50K (or whatever) for a commercial package.