This is not really a code-related question, but more of a solutions-oriented one.
To provide some context: I have a web application with a serverless Node.js backend that lets users define a recurring period at which an endpoint will execute and perform some action. For example, a user decides that the action should fire every 9 days.
I am well aware of solutions such as GCP Cloud Tasks (which is what I'm currently using) and AWS SQS, but they have limits. For instance, Cloud Tasks has a maximum schedule limit of 720 hours (30 days), which means my users can only schedule tasks up to 720 hours in the future, whereas I would prefer to give them the flexibility to schedule tasks up to one year out.
Is there currently a cloud solution that would let me implement such a feature?
I suspect this is definitely possible because of Stripe's subscriptions: they allow yearly billing periods, which are similar to what I'm building but with a much longer limit. I'm not exactly sure how Stripe's engineers accomplish this under the hood while keeping it scalable (cron jobs are quite expensive), and after googling to no avail, I'm left wondering whether there is a more appropriate solution or whether my current approach is fine as it is.
I'm also aware that there are workarounds to "extend" the Cloud Tasks 720-hour limit, but I want to explore other options before diving into those kinds of workarounds (e.g. a cron job per user would work but gets expensive at scale; a single daily cron job that schedules that day's tasks could work, but at scale the cloud function might time out since I'm on a serverless backend; etc.).
You should have a look at Cloud Workflows. For each user request, launch an execution with the parameters you need, for example a delay and a URL to call.
When an execution starts, it waits for the delay and then calls the URL. An execution can last up to one year, which fits your requirements.
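A minimal sketch in Node.js, assuming a workflow named delayed-callback has already been deployed that reads these arguments, sleeps for delaySeconds with sys.sleep, and then calls callbackUrl with http.post (the project, location and workflow name are placeholders):

```javascript
// Launch one Cloud Workflows execution per user request.
const { ExecutionsClient } = require("@google-cloud/workflows");

const client = new ExecutionsClient();

async function scheduleCallback(delaySeconds, callbackUrl) {
  const [execution] = await client.createExecution({
    // Placeholder project, location and workflow name
    parent: "projects/my-project/locations/us-central1/workflows/delayed-callback",
    execution: {
      // The deployed workflow is assumed to sys.sleep(delaySeconds)
      // and then http.post(callbackUrl)
      argument: JSON.stringify({ delaySeconds, callbackUrl }),
    },
  });
  console.log(`Started execution ${execution.name}`);
}

// e.g. fire the user's endpoint in 90 days
scheduleCallback(90 * 24 * 60 * 60, "https://example.com/tasks/run");
```

Each user-defined schedule becomes its own execution, so there is no central cron job to scale.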
I am writing a payroll management web application in Node.js for my organisation. In many cases the application will involve CPU-intensive mathematical calculations to produce the figures, and many users may trigger these calculations simultaneously.
If I write the logic plainly (setting aside the fact that I have already done my best, from an algorithm and data structure point of view, to contain the complexity), it will run synchronously, block the event loop, and slow down requests and responses.
How do I resolve this? What are the options for doing this asynchronously? I should also mention that the calculation work can run in the background, and I can later notify the user about its status. I have searched all over for solutions and found some, but only in theory; I haven't tested them all by implementing them. They are listed below:
Clustering the node server
Use worker threads
Use an alternate server and do some load balancing.
Use a message queue and couple it with worker threads to run background tasks.
Can someone offer some tried and battle-tested advice for this scenario, along with some related tutorial links?
You might want to try web workers; they are easy to use and well documented.
https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers
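In a Node.js backend, the closest built-in equivalent is the worker_threads module. A minimal sketch, where the payroll calculation itself is just a placeholder:

```javascript
// payroll-worker.js - runs the CPU-heavy calculation off the main event loop
const { parentPort, workerData } = require("worker_threads");

// Placeholder for the real payroll calculation
function calculatePayroll(employees) {
  return employees.map((e) => ({ id: e.id, net: e.gross * 0.7 }));
}

parentPort.postMessage(calculatePayroll(workerData));
```

```javascript
// main.js - offloads each calculation to a worker so HTTP requests stay responsive
const { Worker } = require("worker_threads");

function runPayrollJob(employees) {
  return new Promise((resolve, reject) => {
    const worker = new Worker("./payroll-worker.js", { workerData: employees });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

runPayrollJob([{ id: 1, gross: 1000 }]).then(console.log);
```

Combining this with a message queue (option 4 in the question) lets the web process acknowledge the request immediately and notify the user when the worker finishes.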
I have a use case where I am trying to estimate the monthly bill for a scenario that requires one of the Enterprise connectors and can yield up to 400 million messages a day. Assuming there will be 4-5 additional actions involving parsing, transforming, and persisting each message, how do I calculate the price of the service per month?
Well, your example broke the pricing calculator; it could not handle billions of executions under "Actions executed". As a rough estimate, though, you can take the "Actions executed" price and multiply it by 5 for your five actions per message. Pricing example calculator
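A back-of-the-envelope sketch of the volume involved (the per-action price below is a placeholder, not a real quote; take the current unit price from the pricing page):

```javascript
// Monthly action count, based on the figures in the question
const messagesPerDay = 400_000_000;
const actionsPerMessage = 5;        // parse, transform, persist, etc.
const daysPerMonth = 30;

const actionsPerMonth = messagesPerDay * actionsPerMessage * daysPerMonth;
console.log(actionsPerMonth.toLocaleString()); // 60,000,000,000 actions per month

const PRICE_PER_ACTION = 0.000025;  // placeholder unit price, NOT a real quote
console.log(`≈ $${(actionsPerMonth * PRICE_PER_ACTION).toLocaleString()} per month`);
```

At 60 billion actions a month, even a tiny per-action price adds up quickly, which is why it is worth questioning the approach as a whole.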
It sounds like it would be cheaper to do this another way, or at the very least without an Enterprise connector. Which connector are you planning to run, anyway?
Have you looked at the prerequisites and limitations of the connector? Will it be able to handle 400 million executions per day?
I'm aware of the many different ways of scheduling system-centric events in Azure. E.g. Azure Scheduler, Logic Apps, etc. These can be used for things like backups, sending batch emails, or other maintenance functions.
However, I'm less clear on what technology is available for events relating to a large volume of documents or records.
For example, imagine I have 100,000 documents in Cosmos and some of the datetime properties on those documents relate to events: e.g. expiry, reminders, escalations, timeouts, etc. Each record has a different set of dates and times.
What approaches are there to fire off code whenever one of those datetimes is reached?
Stuff I've thought of so far:
Have a scheduled task that runs once per minute and looks for anything relating to that particular minute in Cosmos then does "stuff".
Schedule tasks on Service Bus queues with a future date as-and-when the Cosmos records are created and then have something to receive those messages and do "stuff".
But are there better ways of doing this? Is there a ready-made Azure service that would take away much of the background infrastructure work and just let me schedule a single one-off event at a particular point in time and hit a webhook or something like that?
Am I mis-categorising Azure Scheduler as something that you'd use for a handful of regularly scheduled tasks rather than the mixed bag of dates and times you'd find in 100,000 Cosmos records?
FWIW, in my use-case there isn't really a precision issue - stuff scheduled for 10:05:00 happening at 10:05:32 would be perfectly acceptable, for example.
Appreciate your thoughts.
First of all, Azure Scheduler will be replaced by Azure Logic Apps:
Azure Logic Apps is replacing Azure Scheduler, which is being retired. To schedule jobs, follow this article for moving to Azure Logic Apps instead.
(source)
That said, Azure Logic Apps is one of your options, since you can define a logic app that starts a one-time job using a delay action. See the docs for details.
It scales very well, and you pay only for what you use (or you can use a fixed pricing model).
Another option is a durable Azure Function with a timer in it. Once the timer elapses, you can do your thing. You can use a Consumption plan, so you pay only for what you use, or a fixed pricing model. It also scales very well, so hundreds of those instances won't be a problem.
In both cases you have to trigger the function or logic app when the Cosmos records are created. Put the due time as context in the trigger and there you go.
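For the durable function route, a minimal orchestrator sketch using the durable-functions npm package (the activity name ProcessDocumentEvent and the input shape are placeholders; the orchestration is assumed to be started, e.g. from a Cosmos DB trigger, when the record is created):

```javascript
// Orchestrator: sleeps until the document's due time, then runs the work.
const df = require("durable-functions");

module.exports = df.orchestrator(function* (context) {
  const { documentId, dueTime } = context.df.getInput();

  // Durable timers are checkpointed, so the orchestration can wait for days
  // or months without holding a live instance.
  yield context.df.createTimer(new Date(dueTime));

  // Once the due time is reached, hand the real work to an activity function.
  yield context.df.callActivity("ProcessDocumentEvent", documentId);
});
```

One orchestration instance per document event keeps the "100,000 documents, each with their own dates" problem out of any central polling loop.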
Now, given your statement
What approaches are there to fire off code whenever one of those datetimes is reached?
That is up to you; you can do anything you want. You don't specify in your question what work needs to be done when the due time is reached, but I doubt it is something you can't do with those services.
I have a Node.js app with a small set of users that is currently architected with a single web process. I'm thinking about adding an after save trigger that will get called when a record is added to one of my tables. When that after save trigger is executed, I want to perform a large number of IO operations to external APIs. The number of IO operations depends on the number of elements in an array column on the record. Thus, I could be performing a large number of asynchronous operations after each record is saved in this particular table.
I thought about moving this work to a background job, as suggested in Worker Dynos, Background Jobs and Queueing. That article gives a rule of thumb that tasks taking longer than 500 ms should be moved to a background job. However, after working through the example using RabbitMQ (Asynchronous Web-Worker Model Using RabbitMQ in Node), I'm not convinced it's worth the time to set everything up.
So, my questions are:
For an app with a limited number of concurrent users, is it OK to leave a long-running function in a web process?
If I eventually decide to send this work to a background job it doesn't seem like it would be that hard to change my after save trigger. Am I missing something?
Is there a way to do this that is easier than implementing a message queue?
For an app with a limited number of concurrent users, is it OK to leave a long-running function in a web process?
This is more a question of preference than anything else.
In general I say no, it's not OK... but that's based on my experience building RabbitMQ services that run in Heroku workers, and on not finding that a difficult thing to do.
With a little practice, you may find that the background-job route is the simpler solution, as I have (it allows for simpler and more robust code, since it splits the web process away from the background processor, letting each run without knowing about the other directly).
If I eventually decide to send this work to a background job it doesn't seem like it would be that hard to change my after save trigger. Am I missing something?
Are you missing something? Not really.
As long as you write your current in-the-web-process code in a well-structured and modular fashion, moving it to a background process is usually not a big deal.
Most of the panic about moving code into the background comes from having that code tightly coupled to the HTTP request/response cycle (I know from personal experience how painful that can be).
Is there a way to do this that is easier than implementing a message queue?
There are many options for distributed computing and background processing. I personally like RabbitMQ and the messaging patterns it uses.
I would suggest giving it a try and seeing if it works well for you.
Other options include Redis with pub/sub libraries on top of it, direct HTTP API calls to another web server, or simply a timer in your background process that checks database tables at a given frequency and runs code based on the data it finds.
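As a minimal sketch of the queue approach with the amqplib package (queue name, connection URL and payload shape are all placeholders): the after-save trigger publishes a small job message and returns immediately, while a separate worker process consumes jobs and performs the slow external API calls.

```javascript
// Producer side (runs inside the after-save trigger)
const amqp = require("amqplib");

async function enqueueJob(record) {
  const conn = await amqp.connect(process.env.RABBITMQ_URL);
  const channel = await conn.createChannel();
  await channel.assertQueue("after-save-jobs", { durable: true });
  channel.sendToQueue(
    "after-save-jobs",
    Buffer.from(JSON.stringify({ recordId: record.id, items: record.items })),
    { persistent: true }
  );
  await channel.close();
  await conn.close();
}

// Worker side (separate process / dyno)
async function startWorker() {
  const conn = await amqp.connect(process.env.RABBITMQ_URL);
  const channel = await conn.createChannel();
  await channel.assertQueue("after-save-jobs", { durable: true });
  channel.prefetch(1); // one job at a time per worker
  await channel.consume("after-save-jobs", async (msg) => {
    const job = JSON.parse(msg.content.toString());
    // ... call the external APIs for each item in job.items ...
    channel.ack(msg);
  });
}
```

The key point is that the web process only ever does the cheap publish; everything slow lives in the worker.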
P.S. You may find my RabbitMQ For Developers course of interest if you want to dig deeper into RabbitMQ with Node: http://rabbitmq4devs.com
I am seeing erratic performance with an Azure Search Basic instance. Our index only has 1,544 documents and is 28MB in size, so I would expect searches to be very fast.
Azure Application Insights is reporting 4.7K calls to Azure Search from our app within the last 12 hours, with an average response time of 2.1s and a standard deviation of 35.8s(!).
I am personally seeing erratic performance during my manual testing. A query can take 20+ seconds at one moment, and then just a bit later the same query will take less than 100ms.
The queries are very simple. Here's an example query string:
api-version=2015-02-28&api-key=removed&search=&%24count=true&%24top=10&%24skip=0&searchMode=all&scoringProfile=FieldBoost&%24orderby=sortableTitle
What can I do to further troubleshoot this issue?
First off, I assume you have a fairly even distribution of queries, which based on your numbers means you are only running roughly one query per second. Does that sound correct? If not, and you are seeing large spikes of queries, it is very possible that you do not have enough replicas (copies of the index) to handle the query load. Please note that a single-replica Basic service is targeted at low single-digit QPS (although this can vary widely based on the complexity or simplicity of the queries). If you go beyond the limits of the service, latency can certainly become an issue. A good way to drill into this is to use Azure Search Traffic Analytics, which can expose search metrics such as the number of queries per second over various timeframes, as well as the latency metrics we see internally.
Also, most importantly, please try to reuse HTTP connections as much as possible and leverage HTTP connection pooling. In .NET, for example, you should reuse a single HttpClient instance, or a single SearchIndexClient instance if you are using our Azure Search SDK.
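In a Node.js client the same idea applies: share one keep-alive agent across requests instead of opening a new connection per query. A minimal sketch against the REST API (service name, index name and API key are placeholders; the api-version matches the query string above):

```javascript
const https = require("https");

// One shared agent = pooled, reused TCP connections across all queries
const agent = new https.Agent({ keepAlive: true, maxSockets: 10 });

function search(term) {
  const options = {
    hostname: "my-service.search.windows.net", // placeholder service name
    path: `/indexes/my-index/docs?api-version=2015-02-28&search=${encodeURIComponent(term)}`,
    headers: { "api-key": process.env.SEARCH_API_KEY },
    agent, // the shared agent is what enables connection reuse
  };
  return new Promise((resolve, reject) => {
    https
      .get(options, (res) => {
        let body = "";
        res.on("data", (chunk) => (body += chunk));
        res.on("end", () => resolve(JSON.parse(body)));
      })
      .on("error", reject);
  });
}
```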
I gathered more data and posted my results over at the Azure Search forum.
The slowdowns are due to the fact that we're running a single Basic instance, and code deployments by the Azure Search team cause a brief interruption / degradation in service (a few minutes in my experience).
I find running two basic instances too expensive. Our search traffic doesn't warrant two instances except for availability purposes.
It's my understanding from the forum that the free tier has generally higher availability than a single basic instance. As a result, I have submitted a feedback item suggesting a paid shared tier that would provide more storage than the free tier while retaining higher availability than a single dedicated instance.