We need to rebill x customers on any given day.
Currently, we run a cron every 5 minutes that bills 20 people (sends the invoice, etc.).
However, as the number of customers grows, extending this to 100 people per 5 minutes may result in cron runs overlapping and billing customers twice.
I have two thoughts:
Running the cron once, but making it sleep for x amount of time after every 20 customers billed/invoiced so that we don't spam the API.
Using a message queue where customers are added to the queue and "workers" process it. The problem is I have no experience with this, so I am not sure which is the best route to take.
Does anyone have any experience in this?
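For what it's worth, the overlap risk in option 1 can be handled without a queue. Here is a minimal Python sketch (function names, the lock file path, and the billing placeholder are mine, not from any real system): a non-blocking file lock makes a new cron run exit immediately if the previous one is still going, and a pause between batches throttles the API calls.

```python
import fcntl
import time

def acquire_lock(path):
    """Take an exclusive, non-blocking lock on a lock file; return the
    open file on success, or None if a previous run still holds it."""
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except OSError:
        f.close()
        return None

def bill_in_batches(customers, batch_size=20, pause_seconds=1):
    """Bill customers in batches, pausing between batches so the
    payment API is not spammed."""
    billed = []
    for i in range(0, len(customers), batch_size):
        for customer in customers[i:i + batch_size]:
            billed.append(customer)  # placeholder for the real billing/invoice call
        if i + batch_size < len(customers):
            time.sleep(pause_seconds)
    return billed
```

The cron entry point would call `acquire_lock("/tmp/billing.lock")` first and exit if it returns None; the lock is released automatically when the process ends, so a crashed run cannot wedge the system the way a stale "is_running" flag in the database could.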
I apologize for the long question but I have to explain it.
I am doing change point detection in Python for almost 50 customers and for different processes.
I am getting minute-by-minute numeric data from Influx.
I am calculating the Z-score and saving it in MongoDB locally on the machine on which the cron job is running.
I am comparing the Z-score with the historic Z-score and then alerting the system.
Issues:
As I am doing this for 50 customers, which can scale to say 500 or 5,000, and each customer will have say 10 processes, it is not practical to have that many cron jobs.
Adding more cron jobs creates high CPU usage, and since my data sits locally, if I lose the Linux machine I will lose all my data and won't be able to compare it with historic data.
Solutions:
Create a clustered MongoDB server rather than saving the data locally.
Replace the cron jobs with multithreading and multiprocessing.
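The second solution can be sketched roughly like this: one long-running Python process that fans the per-stream checks out to a small worker pool, instead of one cron job per customer. Names, the data layout, and the threshold of 3 are illustrative assumptions, not from your setup.

```python
import statistics
from concurrent.futures import ProcessPoolExecutor

def zscore(latest, history):
    """Z-score of the latest reading against its own history."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return 0.0
    return (latest - mean) / stdev

def check_customer(args):
    """Score one (customer, process) stream and flag it if it crosses
    the alert threshold."""
    stream_id, latest, history, threshold = args
    z = zscore(latest, history)
    return (stream_id, z, abs(z) > threshold)

def run_checks(readings, threshold=3.0, workers=4):
    """One pass over every stream, fanned out to a fixed pool of worker
    processes; 5,000 customers x 10 processes is 50,000 cheap tasks,
    not 50,000 cron jobs."""
    jobs = [(sid, latest, hist, threshold)
            for sid, (latest, hist) in readings.items()]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check_customer, jobs))
```

The key point is that the pool size (not the customer count) bounds CPU usage, so scaling from 50 to 5,000 customers only lengthens the pass, it doesn't add load spikes.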
Suggestions:
What would be the best way to implement this to decrease the load on the CPU, considering it is going to run all the time in a loop?
After fixing the above, what would be the best way to decrease the number of false positives in the alerts?
Keep in mind:
It is time series data.
Every timestamp has 5 or 6 variables, and as of now I am doing the same operation for each variable separately.
Time      var1    var2    var3     ...  varN
09:00 PM  10,000  5,000   150,000  ...  10
09:01 PM  10,500  5,050   160,000  ...  25
There is a possibility that these numbers are correlated.
Thanks!
So I have a database of users with a reminderTime field, which currently is just a string that looks like 07:00 and is a UTC time.
In the future I'll have multiple strings inside reminderTime, corresponding to the times at which the user should receive a notification.
So imagine you log into the app, set multiple reminders like 07:00, 15:00, 23:30, and send them to the server. The server saves them in the database, runs a task, and sends a notification at 07:00, then at 15:00, and so on. Later the user decides he no longer wants to receive notifications at 15:00, or changes it to 15:30, and we should adapt to that.
And each user has a timezone, but I guess since reminderTime is already in UTC I can just create the task without looking at the timezone.
Currently I store reminderTime as a number: after the client sends 07:00 I convert it to seconds, but as I understand it I can change that and stick with the string.
All my tasks run with the Bull queue library and Redis. As I understand it, the most scalable approach is to take reminderTime, create a notification for each day at the given time, and run the task; the only problem is whether I should save them to my database or add a task to a queue in Bull. The same goes for multiple times.
I don't understand how I should change already-created tasks inside Bull so that the time is different, and so on.
Maybe I could just create, say, 1,000 records in my database for the times at which the user should receive a notification. Then I create a repeatable job which runs every 5 minutes, takes all of the notifications that should be sent in the next couple of hours, adds them to a Bull queue, and marks them as sent.
So basically you get the idea; maybe it could be done a little bit better.
Unless you have a really large number of users, you could simply create a schedule-like table in your DB, which is just a list of user_id | notify_at records. Then run a periodic task every 1-5 minutes which compares against the current time and selects all the records where notify_at is less than the current time.
Add a notified flag if you want to send notifications more than once a day, so you can ignore the ones that were already sent. There is no need to create thousands of records for every day; you can just reset that flag once a day, e.g. at 00:00.
It's OK that your users won't receive their notifications all at exactly the same time; there may be small delays.
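A minimal sketch of that table and the periodic query, using an in-memory SQLite database purely for illustration (the column names are the ones suggested above, and storing notify_at as an "HH:MM" UTC string matches the reminderTime format from the question; any real DB works the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE schedule (
    user_id   INTEGER,
    notify_at TEXT,               -- 'HH:MM' in UTC, like reminderTime
    notified  INTEGER DEFAULT 0)""")

def due_notifications(now_hhmm):
    """Pick up every reminder whose time has passed and which has not
    fired yet today, then mark it as sent."""
    rows = conn.execute(
        "SELECT rowid, user_id FROM schedule "
        "WHERE notify_at <= ? AND notified = 0", (now_hhmm,)).fetchall()
    conn.executemany("UPDATE schedule SET notified = 1 WHERE rowid = ?",
                     [(rowid,) for rowid, _ in rows])
    return [user_id for _, user_id in rows]

def reset_flags():
    """Run once a day at 00:00 UTC so tomorrow's reminders fire again."""
    conn.execute("UPDATE schedule SET notified = 0")
```

The periodic task calls due_notifications with the current UTC time and pushes the returned user IDs into Bull; editing or deleting a reminder is then just a row update, with no Bull job to hunt down and reschedule.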
The solution you suggested is pretty much fine :)
I want to schedule cron jobs based on entries in a database. For example, if there's an entry with a timestamp of 2:00 PM on the 3rd of April, I want to send a mail to users on the 2nd of April, and I also want to send a notification at 1:55 PM on the 3rd of April.
So this means I have to look into the database, find the entries after the current timestamp, see if they meet the criteria for notification (like 5 minutes before the timestamp, or 1 day before it), and send the notification or mail. I'm only worried that checking every minute seems like too much overhead. Are AWS web workers built for this sort of thing?
Any suggestions on how this can be accomplished?
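The per-minute check itself is cheap. A rough sketch of the windowed comparison (the one-minute poll interval and the two lead times, 5 minutes and 1 day, are the ones from the question; the entry layout is a simplifying assumption):

```python
from datetime import datetime, timedelta

POLL = timedelta(minutes=1)  # how often the checker runs

def due_for_alert(entries, now,
                  lead_times=(timedelta(minutes=5), timedelta(days=1))):
    """Given (entry_id, timestamp) pairs, return the entries whose
    timestamp is exactly one lead time away from `now`, within one
    polling interval, so each alert fires exactly once."""
    hits = []
    for entry_id, ts in entries:
        for lead in lead_times:
            target = ts - lead
            if target <= now < target + POLL:
                hits.append((entry_id, lead))
    return hits
```

In a real database the `target <= now < target + POLL` comparison becomes an indexed range query on the timestamp column, so the per-minute scan stays fast even with many entries.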
I don't think crontab is the best choice, but if you're familiar with it, it's fine.
First you should estimate how frequently your entries are created. If it's, let's say, only a couple of hundred a day, my suggestion is to create the crontab job right after the entry is created. But if it's more than a hundred a minute, polling will be fine.
There are also side effects to consider, like canceling or updating the cron job.
I think it's better to use a proper MQ.
I am using JMeter (I started using it a few days ago) to simulate a load of 30 threads, using a CSV data file that contains login credentials for 3 system users.
The objective I set out to achieve was to measure 30 users (threads) logging in and navigating to a page via the menu over a time span of 30 seconds.
I have set my thread group as:
Number of threads: 30
Ramp-up Period: 30
Loop Count: 10
I ran the test successfully. Now I'd like to understand what the results mean, what counts as a good or bad measurement, and what can be done to improve the results. Below is a table of the results collated in the Summary Report of JMeter.
I have done some research, only to find blogs/sites telling me the same information as what is defined on the jmeter.apache.org site. One blog (Nicolas Vahlas) that I came across gave me some very useful information, but still hasn't helped me understand what to do next with my results.
Can anyone help me understand these results and what I could do next after executing this test plan? Or point me toward an informative blog/site that will help me understand what to do next?
Many thanks.
In my opinion, the deviation is high.
You know your application better than all of us.
You should focus on whether the average response time, the maximum response time, and how often each occurs are acceptable to you and your users. The same applies to throughput.
Your table shows an average response time below 0.5 seconds and a maximum response time below 1 second, which is generally acceptable, but acceptability should be defined by you (is it acceptable to your users?). If the answer is yes, try with more load to check scaling.
Your requirement mentions 30 concurrent users performing different actions. The response time of your requests is low and you have a ramp-up of 30 seconds, so please check the total number of active threads during the test. I believe the period during which there are actually 30 concurrent users in the system is quite short, so the average response time you are seeing is likely misleading. I would suggest running the test for longer, so that there really are 30 concurrent users in the system; that would give a correct reading against your requirements.
You can use the Aggregate Report instead of the Summary Report. In performance testing, the following are typically used for analysis:
Throughput: requests/second
Response Time: 90th percentile
Target application resource utilization: CPU, processor queue length, and memory
Normally the SLA for websites is 3 seconds, but this requirement varies from application to application.
Your test results are good, assuming the users are actually logging into the system/portal.
Samples: the number of requests sent for a particular module.
Average: the average response time for the 300 samples.
Min: the minimum response time among the 300 samples (the fastest).
Max: the maximum response time among the 300 samples (the slowest).
Standard Deviation: a measure of the variation across the 300 samples.
Error: the failure percentage.
Throughput: the number of requests processed per second.
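As a sanity check, these Summary Report columns can be recomputed from raw response times. A small sketch (whether JMeter uses the population or sample standard deviation is an assumption on my part; the numbers here are made up):

```python
import statistics

def summarize(samples_ms, duration_s):
    """Recompute the Summary Report columns from raw response times
    (milliseconds) and the wall-clock duration of the test (seconds)."""
    return {
        "samples": len(samples_ms),
        "average": statistics.fmean(samples_ms),
        "min": min(samples_ms),
        "max": max(samples_ms),
        "std_dev": statistics.pstdev(samples_ms),
        "throughput": len(samples_ms) / duration_s,  # requests per second
    }
```

Recomputing the table like this makes the relationships explicit: a high standard deviation relative to the average (as noted above) means response times are inconsistent even when the average looks fine.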
Hope this will help.
The Background
Our clients use a service for which they set a daily budget. This is a prepaid service, and we allocate a particular amount from the user's budget every day.
Tables:
budgets - how much we are allowed to spend per day
money - the client's real balance
money_allocated - the amount deducted from money that can be spent today (based on budgets)
There is a cron job that runs every few minutes and checks:
whether the user has money_allocated for the given day
whether money_allocated >= budgets (the user may increase the budget during the day)
In the first case we allocate the full daily budget; in the latter, the difference between the budget and the amount already allocated for that day (in this case we create an additional record in money_allocated for the same day).
Allocation has two stages: in the first round we add a row with status "pending" (allocation requested); then another cron checks all "pending" allocations and moves money from money to money_allocated if the user has enough money. This changes the status to "completed".
The Problem
We have a cluster of application servers (behind an NLB), and the above cron job runs on each of them, which means money can accidentally be allocated multiple times (or not allocated at all, if we implement the "already allocated" triggers incorrectly).
Our options include:
Run the cron job on one server only: no redundancy, so a failure means client complaints and lost money
Add a unique index on money_allocated such as (client_id, date, amount): this won't allocate more money for a given day if the client doubles the budget or increases it multiple times by the same amount during the day
There is an option to record each movement in budgets and link every allocation to either "first allocation of the day" or "change of budget (id xxx)" (adding this to the unique index as well). This does not look sexy enough, however.
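For what it's worth, the third option does make the insert idempotent, which is exactly what protects against concurrent cron runs. A sketch using SQLite purely for illustration (the table is trimmed to the relevant columns, and budget_change_id = 0 standing for "first allocation of the day" is my convention):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE money_allocated (
    client_id        INTEGER,
    date             TEXT,
    budget_change_id INTEGER,   -- 0 = first allocation of the day
    amount           INTEGER,
    status           TEXT DEFAULT 'pending')""")
conn.execute("""CREATE UNIQUE INDEX one_allocation
    ON money_allocated (client_id, date, budget_change_id)""")

def allocate(client_id, date, budget_change_id, amount):
    """Idempotent allocation: a second run with the same key is a no-op,
    so concurrent cron runs cannot double-allocate."""
    cur = conn.execute(
        """INSERT OR IGNORE INTO money_allocated
           (client_id, date, budget_change_id, amount)
           VALUES (?, ?, ?, ?)""",
        (client_id, date, budget_change_id, amount))
    return cur.rowcount == 1  # True only for the run that actually inserted
```

Because the amount is not part of the key, two budget increases of the same size on the same day get distinct budget_change_id values and both go through, which is the case the (client_id, date, amount) index gets wrong.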
Any other options? Any advice would be highly appreciated!
OK, so I ended up running this on one of the cluster's instances. If you use Amazon AWS and are in a similar situation, below is one of the options.
On each machine, at the beginning of your cron job's code, do the following:
Call describe_load_balancers (AWS API) and parse the response to get a list/array of all instances
Fetch http://169.254.169.254/latest/meta-data/instance-id; this returns the instance ID of the machine making the request
If the received instance ID is #1 in the list/array of all instances, proceed; if not, exit
Also, be sure to automatically replace unhealthy instances under this load balancer quickly, as describe_load_balancers returns both healthy and unhealthy instances. You may end up with the job not being done for a while if instance #1 goes down.
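The leader check itself can be sketched like this. I sort the instance IDs rather than relying on the API's response ordering, which is an assumption on my part, since I don't believe describe_load_balancers guarantees a stable order:

```python
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/instance-id"

def current_instance_id(timeout=2):
    """Ask the EC2 instance metadata service which instance we are on."""
    with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
        return resp.read().decode()

def i_am_leader(all_instance_ids, my_instance_id):
    """Sort the IDs so every node derives the same ordering, then let
    only the first one run the job; everyone else exits."""
    return bool(all_instance_ids) and sorted(all_instance_ids)[0] == my_instance_id
```

Each cron run would compare `current_instance_id()` against the sorted list parsed from the describe_load_balancers response and exit unless it wins; if instance #1 is terminated, the next ID in sorted order automatically becomes the leader on the following run.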