I have an asp.net core Web Api application.
In my application I have a Web API method, and I want to prevent multiple requests from the same user from entering it simultaneously. I don't mind requests from different users running simultaneously.
I am not sure how to create the lock or where to put it. I thought about creating some kind of dictionary that contains the user ID and locking on the item, but I don't think I'm getting it right. Also, what will happen if there is more than one server behind a load balancer?
Example:
Let's assume each registered user can do 10 long tasks each month. I need to check for each user whether they have exceeded their monthly limit. If the user sends many simultaneous requests to the server, they might be allowed to perform more than 10 operations. I understand that I need to put a lock on the method, but I do want to allow other users to perform this action simultaneously.
What you're asking for is fundamentally not how the Internet works. The HTTP and underlying IP protocols are stateless, meaning each request is supposed to run independent of any knowledge of what has occurred previously (or concurrently, as the case may be). If you're worried about excessive load, your best bet is to implement rate limiting/throttling tied to authentication. That way, once a user burns through their allotted requests, they're cut off. This will then have a natural side-effect of making the developers programming against your API more cautious about sending excessive requests.
Just to be a bit more thorough here, the chief problem with the approach you're suggesting is that I know of no way it can be practically implemented. You can use something like SemaphoreSlim to create a lock, but that needs to be static so that the same instance is used for each request. Being static is going to limit your ability to use a dictionary of them, which is what you'll need for this. It can technically be done, I suppose, but you'd have to use a ConcurrentDictionary, and even then there's no guarantee of single-thread additions: with GetOrAdd, concurrent requests for the same user can each construct a semaphore, although only one of them is actually stored and handed back to every caller. You also have to decide when entries get removed, or front-load the dictionary with a semaphore for each user from the start, which could become a huge waste of resources, depending on your user base. And none of it helps once there is more than one server behind a load balancer, because each process holds its own dictionary. Long and short, it's one of those things where, when you're finding a solution this darn difficult, it's a good sign you're likely trying to do something you shouldn't be doing.
EDIT
After reading your example, I think this really just boils down to an issue of trying to handle the work within the request pipeline. When there's some long-running task to be completed or just some heavy work to be done, the first step should always be to pass it off to a background service. This allows you to return a response quickly. Web servers have a limited number of threads to handle requests with, and you want to service the request and return a response as quickly as possible to keep from exhausting your thread pool.
You can use a library like Hangfire to handle your background work, or you can implement an IHostedService to queue work on. Once you have your background service ready, you would then just immediately hand off to that any time you get a request to this endpoint, and return a 202 Accepted response with a URL the client can hit to check the status. That solves your immediate issue of not wanting to allow a ton of requests to this long-running job to bring your API down. It's now essentially doing nothing more than telling something else to do it and then returning immediately.
For the actual background work you'd be queuing, you can check the user's allowance, and if they have already used up their 10 requests (your rate limit), you fail the job immediately without doing anything. If not, then you can actually start the work.
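A minimal sketch of that check, assuming an in-memory Map in place of the real usage table. It is written in TypeScript only to keep it short; tryReserve, runLongTaskJob and MONTHLY_LIMIT are made-up names, not part of Hangfire or ASP.NET Core. The point is that the check and the increment happen as one step. With several worker processes you would presumably push that step into the database as a single conditional UPDATE instead.

```typescript
// Hypothetical names throughout; the Map stands in for a database table.
const MONTHLY_LIMIT = 10;
const usedThisMonth = new Map<string, number>();

type JobOutcome =
  | { ok: true; result: string }
  | { ok: false; reason: string };

// Check and increment in one synchronous step, so two jobs for the same
// user cannot both observe "9 used" within a single process.
function tryReserve(userId: string): boolean {
  const used = usedThisMonth.get(userId) ?? 0;
  if (used >= MONTHLY_LIMIT) return false;
  usedThisMonth.set(userId, used + 1);
  return true;
}

// The background job: fail fast when the allowance is spent,
// otherwise run the long task.
async function runLongTaskJob(
  userId: string,
  work: () => Promise<string>,
): Promise<JobOutcome> {
  if (!tryReserve(userId)) {
    return { ok: false, reason: "monthly limit reached" };
  }
  return { ok: true, result: await work() };
}
```

Reserving before the work starts means a failed job still counts against the allowance unless you hand the unit back; whether that is what you want is a policy decision.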
If you like, you can also enable webhook support to notify the client when the job completes. You simply allow the client to set a callback URL that you should notify on completion, and then when you've finished the work in the background task, you hit that callback. It's on the client to handle things on their end and decide what happens when the callback is hit. They might, for instance, decide to use SignalR to send out a message to their own users/clients.
EDIT #2
I actually got a little intrigued by this. While I still think it's better for you to offload the work to a background process, I was able to create a solution using SemaphoreSlim. Essentially you just gate every request through the semaphore, where you'll check the current user's remaining requests. This does mean that other users must wait for this check to complete, but then you can release the semaphore and actually do the work. That way, at least, you're not blocking other users during the actual long-running job.
First, add a field to whatever class you're doing this in:
private static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1, 1);
Then, in the method that's actually being called:
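bool allowed;
await _semaphore.WaitAsync();
try
{
    // get remaining requests for user
    allowed = remaining > 0;
    if (allowed)
    {
        // decrement remaining requests for user (this must happen while the semaphore is still held)
    }
}
finally
{
    // always release, even if the check above throws, so other requests are not blocked forever
    _semaphore.Release();
}

if (allowed)
{
    // now do the work
}
else
{
    // handle user out of requests (return error, etc.)
}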
This is essentially a bottle-neck. To do the appropriate check and decrementing, only one thread can go through the semaphore at a time. That means if your API gets slammed, requests will queue up and may take a while to complete. However, since this is probably just going to be something like a SELECT query followed by an UPDATE query, it shouldn't take that long for the semaphore to release. You should definitely do some load testing and watch it, though, if you're going to go this route.
Related
I just read this article from Node.js: Don't Block the Event Loop
The Ask
I'm hoping that someone can read over the use case I describe below and tell me whether or not I'm understanding how the event loop is blocked, and whether or not I'm doing it. Also, any tips on how I can find this information out for myself would be useful.
My use case
I think I have a use case in my application that could potentially cause problems. I have a functionality which enables a group to add members to their roster. Each member that doesn't represent an existing system user (the common case) gets an account created, including a dummy password.
The password is hashed with argon2 (using the default hash type), which means that even before I get to waiting on a DB promise to resolve (with a Prisma transaction), I have to wait for each member's password to be generated.
I'm using Prisma for the ORM and Sendgrid for the email service and no other external packages.
A take-away that I get from the article is that this is blocking the event loop. Since there could potentially be hundreds of records generated (such as importing contacts from a CSV or cloud contact service), this seems significant.
To sum up what the route in question does, including some details omitted before:
Remove duplicates (requires one DB request & then some synchronous checking)
Check the remaining entries for existing users
For non-existing users:
Synchronously create many records & push each to a separate array. One of these records requires async password generation for each non-existing user
Once the arrays are populated, send a DB transaction with all records
Once the transaction is cleared, create invitation records for each member
Once the invitation records are created, send emails in a MailData[] through SendGrid.
Clearly, there are quite a few tasks that must be done sequentially. If it matters, the asynchronous functions are also nested: createUsers calls createInvites calls sendEmails. In fact, from the controller, there is: updateRoster calls createUsers calls createInvites calls sendEmails.
There are architectural patterns that are aimed at avoiding issues brought by potentially long-running operations. Note here that while your example is specific, any long running process would possibly be harmful here.
The first obvious pattern is the cluster. If your app is handled by multiple concurrent, independent event loops in a cluster, blocking one, ten or even a thousand loops could be insignificant if your app is scaled to handle this.
Imagine an example scenario where you have 10 concurrent loops, one is blocked for a longer time but 9 remaining are still serving short requests. Chances are, users would not even notice the temporary bottleneck caused by the one long running request.
Another, more general pattern is a separate long-running process service, or Command-Query Responsibility Segregation (I'm bringing CQRS to your attention here because the pattern description could introduce more interesting ideas you may not be familiar with).
In this approach, some long-running operations are not handled directly by the backend servers. Instead, the backend servers use a Message Queue to send requests to yet another service layer of your app, one that is solely dedicated to running specific long-running requests. The Message Queue is configured with a specific throughput, so if multiple long-running requests arrive in a short time they are queued; some of them may be delayed, but your resources are always under control. The backend that sends requests to the Message Queue doesn't wait synchronously; instead, you need another form of return communication.
This auxiliary process service can be maintained and scaled independently. The important part here is that the service is never accessed directly from the frontend, it's always behind a message queue with controlled throughput.
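To make "controlled throughput" concrete, here is a small in-process stand-in for such a queue. It is only a sketch: a real deployment would use an actual message broker and a separate worker service, and ThrottledQueue, maxConcurrent and maxPending are names invented for this example. At most maxConcurrent tasks run at once, the rest wait their turn, and enqueue reports when the backlog is full so the caller can refuse the request.

```typescript
type LongTask = () => Promise<void>;

class ThrottledQueue {
  private readonly pending: LongTask[] = [];
  private running = 0;
  private readonly maxConcurrent: number;
  private readonly maxPending: number;

  constructor(maxConcurrent: number, maxPending: number) {
    this.maxConcurrent = maxConcurrent;
    this.maxPending = maxPending;
  }

  // Returns false when the backlog is full, so the caller can reject the request.
  enqueue(task: LongTask): boolean {
    if (this.pending.length >= this.maxPending) return false;
    this.pending.push(task);
    this.drain();
    return true;
  }

  stats(): { running: number; pending: number } {
    return { running: this.running, pending: this.pending.length };
  }

  // Start waiting tasks while there is a free slot.
  private drain(): void {
    while (this.running < this.maxConcurrent && this.pending.length > 0) {
      const task = this.pending.shift() as LongTask;
      this.running++;
      const done = (): void => {
        this.running--;
        this.drain();
      };
      // Success and failure both free the slot; a real worker would log failures.
      task().then(done, done);
    }
  }
}
```

The cap on the backlog is one simple way to keep resources bounded when requests arrive faster than they can be processed.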
Note that while the second approach is often implemented in real-life systems and solves most issues, it can still be incapable of handling some edge cases, e.g. when long-running requests come in faster than they are handled and the queue grows indefinitely.
Such cases require careful maintenance and you either scale your app to handle the traffic or you introduce other rules that prevent users from running long processes too often.
Context
In an ASP.NET Core application I would like to execute an operation which takes, say, 5 seconds (like sending an email). I do know async/await and its purpose in ASP.NET Core; however, I do not want to wait for the end of the operation. Instead, I would like to return to the client immediately.
Issue
So it is kind of Fire and Forget, either homebrew or Hangfire's BackgroundJob.Enqueue<IEmailSender>(x => x.Send("hangfire@example.com"));
Suppose I have a more complex method with an injected ILogger and other stuff, and I would like to Fire and Forget that method. The method contains error handling and logging. (Note: this is not specific to Hangfire; the issue is agnostic to how the background worker is implemented.) My problem is that the method will run completely out of context, so probably nothing will work inside it: no HttpContext (I mean HttpContextAccessor will give null, etc.), so no User, no Session, etc.
Question
How to correctly solve, say, this particular email sending problem? No one wants to wait 5 seconds for the response, and at the same time no one wants to throw an email over the wall without even logging whether the send operation returned an error...
How to correctly solve say this particular email sending problem?
This is a specific instance of the "run a background job from my web app" problem.
there is no universal solution
There is - or at least, a universal pattern; it's just that many developers try to avoid it because it's not easy.
I describe it pretty fully in my blog post series on the basic distributed architecture. I think one important thing to acknowledge is that since your background work (sending an email) is done outside of an HTTP request, it really should be done outside of your web app process. Once you accept that, the rest of the solution falls into place:
You need a durable storage queue for the work. Hangfire uses your database; I tend to prefer cloud queues like Azure Storage Queues.
This means you'll need to copy all the data over that you will need, since it needs to be serialized into that queue. The same restriction applies to Hangfire, it's just not obvious because Hangfire runs in the same web application process.
You need a background process to execute your work queue. I tend to prefer Azure Functions, but another common approach is to run an ASP.NET Core Worker Service as a Win32 service or Linux daemon. Hangfire has its own ad-hoc in-process thread. Running an ASP.NET Core hosted service in-process would also work, though that has some of the same drawbacks as Hangfire since it also runs in the web application process.
Finally, your work queue processor application has its own service injection, and you can code it to create a dependency scope per work queue item if desired.
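As an illustration of the "copy all the data over" point, the sketch below shows a queue message that carries everything the worker needs, since there is no HttpContext on the other side. It is written in TypeScript purely for brevity, and EmailJobV1, its fields, and both helper functions are hypothetical names. The schemaVersion field is one assumed way to approach the versioning concern: the worker can refuse messages it does not understand.

```typescript
// Everything the worker needs is captured while the HTTP request is still alive.
interface EmailJobV1 {
  schemaVersion: 1;
  to: string;
  subject: string;
  body: string;
  requestedByUserId: string; // copied from the authenticated user
  correlationId: string;     // lets worker logs be tied back to the request
}

// Serialize for the durable queue.
function toQueueMessage(job: EmailJobV1): string {
  return JSON.stringify(job);
}

// Deserialize in the worker, rejecting unknown schema versions.
function fromQueueMessage(raw: string): EmailJobV1 {
  const parsed = JSON.parse(raw);
  if (parsed === null || typeof parsed !== "object" || parsed.schemaVersion !== 1) {
    throw new Error("unsupported or malformed email job message");
  }
  return parsed as EmailJobV1;
}
```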
IMO, this is a normal threshold that's reached as your web application "grows up". It's more complex than a simple web app: now you have a web app, a durable queue, and a background processor. So your deployment becomes more complex, you need to think about things like versioning your worker queue schema so you can upgrade without downtime (something Hangfire can't handle well), etc. And some devs really balk at this because it's more complex when "all" they want to do is send an email without waiting for it, but the fact is that this is the necessary step upwards when a baby web app becomes distributed.
This is a Brain-Question for advice on which scenario is a smarter approach to tackle situations of heavy lifting on the server end but with a responsive UI for the User.
The setup;
My system consists of two services (written in Node): a Frontend Service that listens for requests from the user, and a Background Worker that does heavy lifting and won't be finished within 1-2 seconds (e.g. video conversion, image resizing, gzipping, spidering, etc.). The user is connected to the Frontend Service via WebSockets (and normal POST requests).
Scenario 1;
When a User e.g. uploads a video, the Frontend Service only does some simple checks, creates a job in the name of the User for the Background Worker to process, and directly responds with status 200. Later on the Worker sees it has work, does the work and finishes the job. It then finds the socket the user is connected to (if any) and sends a "hey, job finished" with the data related to the video conversion job (url, length, bitrate, etc.).
Pros I see: Quick User feedback of successful upload (e.g. ProgressBar can be hidden)
Cons I see: User will get a fake "success" response with no data to handle/display and needs to wait till the job finishes anyway.
Scenario 2;
Like Scenario 1 but that the Frontend Service doesn't respond with a status 200 but rather subscribes to the created job "onComplete" event and lets the Request dangle till the callback is fired and the data can be sent down the pipe to the user.
Pros I see: "onSuccess", all data is at the User
Cons I see: Depending on the job's weight and active job count, the Users request could Timeout
While writing this question things are getting clearer to me by the minute (Scenario 1, but with smart success and update events sent). Regardless, I'd like to hear about other Scenarios you use or further Pros/Cons towards my Scenarios!?
Thanks for helping me out!
Some unnecessary info; For websockets I'm using socket.io, for job creating kue and for pub/sub redis
I just wrote something like this and I use both approaches for different things. Scenario 1 makes most sense IMO because it matches the reality best, which can then be conveyed most accurately to the user. By first responding with a 200 "Yes I got the request and created the 'job' like you requested", you can accurately update the UI to reflect that the request is being dealt with. You can then use the push channel to notify the user of updates such as progress percentage, error, and success as needed, but without the UI 'hanging' (obviously you wouldn't hang the UI in scenario 2, but it's an awkward situation that things are happening and the UI just has to 'guess' that the job is being processed).
Scenario 1 -- but instead of responding with 200 OK, you should respond with 202 Accepted. From Wikipedia:
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
202 Accepted The request has been accepted for processing, but the
processing has not been completed. The request might or might not
eventually be acted upon, as it might be disallowed when processing
actually takes place.
This leaves the door open for the possibility of worker errors. You are just saying you accepted the request and are trying to do something with it.
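A framework-free sketch of what that could look like. The reply shape, the /jobs/{id} URL and every name here are assumptions made for the example, not something socket.io or kue prescribes. The accept handler answers 202 with a pointer to a status resource, and the final outcome, including failure, is reported there or over the socket.

```typescript
type JobState = "queued" | "running" | "done" | "failed";

interface HttpReply {
  status: number;
  headers: Record<string, string>;
  body: unknown;
}

// Hypothetical in-memory job registry; the real one would live in the job queue.
const jobStates = new Map<string, JobState>();
let nextJobId = 1;

// Called by the upload route once the simple checks have passed.
function acceptJob(): HttpReply {
  const id = String(nextJobId++);
  jobStates.set(id, "queued");
  return {
    status: 202, // accepted for processing, not yet processed
    headers: { Location: `/jobs/${id}` },
    body: { jobId: id, state: "queued" },
  };
}

// Polled by the client if it has no open socket.
function jobStatus(id: string): HttpReply {
  const state = jobStates.get(id);
  if (state === undefined) {
    return { status: 404, headers: {}, body: { error: "unknown job" } };
  }
  return { status: 200, headers: {}, body: { jobId: id, state } };
}
```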
I have a WCF web service hosted in IIS. This service has a method, let's call it DoSomething(). DoSomething() is called from a client-side application.
DoSomething performs some work and returns the answer to the user. Now I need to log how often DoSomething is being called. I can add it to the DoSomething function so that it will for every call write to an sql database and update a counter, but this will slow down the DoSomething method as the user needs to wait for this extra database call.
Is it a good option to let the DoSomething method spawn a new thread which will update the counter in the database, and then just return the answer from the DoSomething method to the user without waiting for the thread to finish? Then I will not know if the database update fails, but that is not critical.
Are there any problems with spawning a new background thread and not waiting for it to finish in WCF? Or is there a better way to solve this?
Update: To ask the question in a slightly different way: is it a bad idea to spawn new threads inside a WCF web service method?
The main issue is one of reliability. Is this a call you care about? If the IIS process crashes after you returned the response, but before your thread completes, does it matter? If no, then you can use client side C# tools. If it does matter, then you must use a reliable queuing technology.
If you use the client side, then spawning a new thread just to block on a DB call is never the correct answer. What you want is to make the call async, and for that you use one of the SqlCommand.BeginExecute... methods (e.g. BeginExecuteNonQuery) after you ensure that Asynchronous Processing is enabled in the connection string.
If you need reliable processing then you can use a pattern like Asynchronous procedure execution which relies on persisted queues.
As a side note things like logging, or hit counts, and the like are a huge performance bottleneck if done in the naive approach of writing to the database on every single HTTP request. You must batch and flush.
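To show what "batch and flush" means in practice, here is a small sketch, in TypeScript for brevity, with HitCounter and its callback being made-up names. Hits are counted in memory, and a periodic flush hands the whole batch to one database write. The trade-off matches the reliability point above: counts still in memory are lost if the process dies before the next flush.

```typescript
type FlushFn = (batch: Map<string, number>) => void;

class HitCounter {
  private counts = new Map<string, number>();
  private readonly flushFn: FlushFn;

  constructor(flushFn: FlushFn) {
    this.flushFn = flushFn;
  }

  // Cheap in-memory increment on the request path; no database call here.
  record(operation: string): void {
    this.counts.set(operation, (this.counts.get(operation) ?? 0) + 1);
  }

  // Hand the accumulated counts to the writer and start a fresh batch.
  flush(): void {
    if (this.counts.size === 0) return;
    const batch = this.counts;
    this.counts = new Map<string, number>();
    this.flushFn(batch);
  }
}
```

In use, flush would typically be driven by a timer (for example every few seconds) and once more on shutdown, with the callback issuing a single UPDATE per operation name.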
If you only want to track a single method like DoSomething() in the service, then you can create a custom operation behavior and apply it to the method.
The operation behavior will contain the code that logs the info to database. In that operation behavior you can use the .NET 4.0's new TPL library to create a task that will take care of database logging. If you use TPL you don't need to worry about directly creating threads.
The advantage of using an operation behavior is that if tomorrow you need to track another method, instead of duplicating the code there you just mark that method with the custom operation behavior. If you want to track all the methods, then you should go for a service behavior.
To know more about operation behaviors check http://msdn.microsoft.com/en-us/library/system.servicemodel.operationbehaviorattribute.aspx
To know more about TPL(Task Parallel Library) check http://msdn.microsoft.com/en-us/library/dd460717.aspx
So I'd like to know what is the general algorithm to implementing an instant search that is not load intensive. Not specifically on the web but even in a desktop/winforms application.
Correct me if I'm wrong, but one cannot send async calls on every keystroke, right? (Not sure how Google Instant manages this.) It would create an insane load on the database/store, etc.
I've been thinking of something like this:
Fire timer every xxx milliseconds
On fire, Disable input, Disable timer, and send an async call to search.
When the call returns, display results, enable input, enable timer
Is this how it is generally handled, or is there a better way?
Search queries are generally quite small, so the increased load on the server may not be as significant as you think. Sending a query on every keystroke should be fine as long as you keep a limit on the length of queries.
Anyway, it's the server that knows how loaded it is, so the place to put the load management is on the server side. For example, you could follow a strategy something like this:
On the client:
When the search text changes, send it to the server.
When the server sends some results, update the page.
On the server, when a query is received from a client:
If I am already handling a query from that client, cancel the old query.
If I have a queued query from that client, discard it.
Add the new query to a queue of pending search queries, unless the queue is full.
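The server-side rules above can be sketched roughly as follows. This is an illustration only: SearchScheduler, takeNext and finish are invented names, and "cancel" is modelled as a flag that the code running the search is assumed to check.

```typescript
interface PendingQuery {
  clientId: string;
  text: string;
  cancelled: boolean;
}

class SearchScheduler {
  private readonly active = new Map<string, PendingQuery>(); // in-flight query per client
  private queued: PendingQuery[] = [];
  private readonly maxQueued: number;

  constructor(maxQueued: number) {
    this.maxQueued = maxQueued;
  }

  // Apply the three rules; returns false if the queue is full.
  submit(clientId: string, text: string): boolean {
    const inFlight = this.active.get(clientId);
    if (inFlight) {
      inFlight.cancelled = true; // rule 1: cancel the old query
      this.active.delete(clientId);
    }
    // rule 2: discard any query this client still has waiting
    this.queued = this.queued.filter((q) => q.clientId !== clientId);
    // rule 3: enqueue unless the queue is full
    if (this.queued.length >= this.maxQueued) return false;
    this.queued.push({ clientId, text, cancelled: false });
    return true;
  }

  // A worker calls this to pick up the next query to run.
  takeNext(): PendingQuery | undefined {
    const next = this.queued.shift();
    if (next) this.active.set(next.clientId, next);
    return next;
  }

  // A worker calls this when a query completes or notices it was cancelled.
  finish(query: PendingQuery): void {
    if (this.active.get(query.clientId) === query) {
      this.active.delete(query.clientId);
    }
  }
}
```

With this shape each client occupies at most one queue slot and one worker at a time, so fast typing replaces work instead of piling it up.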