Assume I have a CRUD API gateway and a microservice, each running in a separate Docker instance.
I have a message queue acting as the bridge between them.
When the API gateway receives an HTTP request, it passes a task to the message queue.
The microservice listening on the queue then pulls the task and processes it. When the processing is done, how can I tell the API gateway that the task is complete and send the result back to the original request?
Because the microservice is essentially stateless, it doesn't need to know where the task came from, but the API gateway does.
Is it a bit dumb if I build another message queue for the task results and have the API gateway keep looping over it, matching results against something like a task ID?
Chain the requests Client -> Gateway -> Service -> Gateway -> Client
The gateway simply waits until it gets a response from the service and returns it. The problem is that the gateway is blocked while it waits, and as you said, if a queue is doing the processing there is a good chance it will take some time, especially if there are a lot of items waiting to be processed.
Websockets
WebSockets allow you to communicate both ways, so you can inform the client when the job is done. I'm not sure how this works with gateways, though.
Callback URL
The client sends the request and specifies a URL to be called when the job is done. The client gets back, e.g., 200 OK confirming the item was put into the queue, but without the status of the task. When the job is done, the service calls the URL defined by the client with the result, pushing the status to the client instead of making it wait.
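As a rough sketch of this callback-URL flow (Express; enqueueTask and onTaskFinished are hypothetical placeholders for the queue producer and the worker-side completion hook):

    const express = require('express');
    const app = express();
    app.use(express.json());

    // Gateway: accept the task, remember the client's callback URL, and enqueue it.
    app.post('/tasks', async (req, res) => {
      const { payload, callbackUrl } = req.body;
      const taskId = Date.now().toString();                 // illustrative ID only
      await enqueueTask({ taskId, payload, callbackUrl });  // hypothetical queue producer
      res.status(200).json({ taskId, status: 'queued' });   // 200 OK: queued, no task status yet
    });

    // Worker side (conceptually): when the job is done, push the result to the client's URL.
    async function onTaskFinished(task, result) {
      await fetch(task.callbackUrl, {                       // Node 18+ global fetch
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ taskId: task.taskId, result }),
      });
    }

    app.listen(3000);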
Polling for status
Client -> Gateway -> Service adds the task to the queue and responds with 200 OK. The client then asks for the status at predefined intervals until the task is done. This is not very efficient, but it is an option.
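And a minimal sketch of the polling variant (Express again; the in-memory map stands in for whatever store the worker writes results to, and enqueueTask is hypothetical):

    const express = require('express');
    const crypto = require('node:crypto');
    const app = express();
    app.use(express.json());

    const tasks = new Map(); // taskId -> { status, result }; use Redis/a DB in practice

    app.post('/tasks', async (req, res) => {
      const taskId = crypto.randomUUID();
      tasks.set(taskId, { status: 'queued', result: null });
      await enqueueTask({ taskId, payload: req.body }); // hypothetical queue producer
      res.status(200).json({ taskId });                 // 200 OK, task accepted
    });

    // The client polls this endpoint at predefined intervals until status is 'done'.
    app.get('/tasks/:id/status', (req, res) => {
      const task = tasks.get(req.params.id);
      if (!task) return res.status(404).end();
      res.json(task);
    });

    app.listen(3000);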
I have a namespace in Azure Service Bus (ASB) with a single topic and subscription. I can see the message count from when I was testing the consumer (a Ruby client), which is the first spike, but after I stopped testing, with nothing running in the client, there were 10 incoming requests from 'somewhere', the second spike in the graph. The machine was off at this point, so it must have come from Azure.
Within half an hour of turning my machine on there were 6 incoming requests, the third spike, but nothing was running, as it's a command-line client, so I assume it's Azure again.
I don't have anything else in Azure (functions apps etc). Literally nothing is running/polling/pulling/peeking etc.
Is it possible to identify where these requests are coming from? Graph is below.
Incoming requests are not incoming messages. Even when there is no messaging and you receive nothing, there are still operations taking place. This is because the client (assuming you're using a client with a message handler rather than retrieving messages yourself) performs long-polling, i.e. the client polls for new messages rather than the broker pushing them. This is unlike RabbitMQ, which pushes to a client when messages become available.
Incoming Requests is a request metric that shows the number of requests made to the Service Bus service over a specified period. Even without any message flow within the Service Bus namespace, there can be incoming requests.
It is probably not possible to identify the origin of these incoming requests.
I have a web service that accepts POST requests. A POST request specifies a job to be executed in the background that modifies a database used for later analysis. The sender of the request does not care about the result and only needs to receive a 202 acknowledgment from the web service.
How it was implemented so far:
The Flask web service gets the HTTP request, adds the necessary parameters to the task queue (RQ workers), and returns an acknowledgement. A separate RQ worker process listens on the queue and processes the job.
We have now switched to aiohttp and realized that the web service can schedule the actual job in its own event loop by using asyncio.ensure_future().
This, however, blurs the line between the web server and the task queue. On the positive side, it eliminates the need to manage the RQ workers.
Is this considered a good practice?
If your tasks are not CPU-heavy, then yes, it is good practice.
If they are, you need to move them to a separate service or use run_in_executor(). Otherwise your aiohttp event loop will be blocked by these tasks and the server will not be able to accept new requests.
We need to implement an async web service.
Behaviour of web service:
We send a request for an account to the server, and it sends back a synchronous response with an acknowledgement ID. After that we get multiple callback requests containing that acknowledgement ID. The last callback request for an acknowledgement ID contains the text completed:true in the response, which tells us it is the final callback request for that account and acknowledgement ID. This lets us know that the async call for a particular account is complete and we can mark its final status. We need to execute this web service for multiple accounts, so we will be getting callback requests for many accounts.
Question:
What is the optimal way to process these multiple callback requests coming in for multiple accounts?
Solutions that we thought of:
ExecutorService fixed thread pool: this will process our callback requests in parallel, but the concern is that it does not maintain ordering. It will therefore be difficult for us to determine that the last callback request for an acknowledgement ID (account) has arrived, so we will not be able to mark the final status of that account as completed with certainty.
ExecutorService single-thread executor: here there is only one thread in the pool, with an unbounded queue. If we use this, processing will be quite slow, as only one thread is actually doing the work.
Please suggest an optimal way to implement this requirement, both memory- and performance-wise.
Let's be clear about one thing: HTTP is a blocking, synchronous protocol. Request/response pairs aren't async. What you're doing is spawning async work and returning to the caller to let them know the request is being processed (HTTP 200) or not (HTTP 500).
I'm not sure I know what's optimal for this situation, but there are other considerations:
Use an ExecutorService thread pool that you can configure. Make sure you use a thread-name prefix that lets you distinguish these threads from others.
Add request tasks to a blocking deque and have a pool of consumer threads process them. You can tune the deque and the consumer thread pool sizes.
If processing is intensive, send request messages to a queue running on another server. Have a pool of queue listeners process the requests.
You cannot assume that the callbacks will return in a certain order. Don't depend on "last" being "true". You'll have to join all those threads together to know when they're finished.
It sounds like the web service should have a URL that lets users query for status.
Looking for some advice on how to do the following:
1. Receive a request from the website for a certain long-running process (~10-30 seconds).
2. The website backend schedules a job and puts it onto a distributed queue (could be SQS/Kue/Resque).
3. A worker takes the job off the queue, processes it, and stores the result somewhere.
4. The website backend subscribes to the job-complete event and gets the result of the processed job.
5. The website backend closes the request to the website with the result of the task.
1, 2 and 3 are fine. I am just finding it tricky to pass the result of a queued task back to the backend so that it can close the request.
Polling from the website isn't an option; the request has to stay open for however long the task takes to be processed. I'm using Node.js.
Steps 2-4 are all happening on the server side. There is nothing stopping you from polling the expected result location (on the server side) and returning the result when it finally appears (a sketch follows the steps below):
Client sends a request
Server starts job and begins polling for the result
The result comes back so the poll loop on the server side ends
Server sends result back to client
The client-server connection is finally severed
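A rough Express sketch of that server-side poll loop, assuming hypothetical scheduleJob (puts the job on the queue and returns its id) and getResult (looks the result up wherever the worker stores it) helpers:

    const express = require('express');
    const app = express();
    app.use(express.json());

    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

    app.post('/process', async (req, res) => {
      const jobId = await scheduleJob(req.body);   // hypothetical: enqueue the job, return its id
      // Poll the result location on the server side until the worker has stored a result.
      for (let attempt = 0; attempt < 60; attempt++) {
        const result = await getResult(jobId);     // hypothetical result-store lookup
        if (result !== undefined) {
          return res.json(result);                 // result appeared: close the client request
        }
        await sleep(500);                          // wait before polling again
      }
      res.status(504).json({ error: 'job did not finish in time' });
    });

    app.listen(3000);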
You could get even more efficient if the job can call a URL when it finishes. In this case your service would have two endpoints: one for the client to start the process, and another that your job queue can call (a fuller sketch of both follows this list).
Client sends a request
Server starts the job and saves the response object in a global structure so that it is not closed (I'm assuming something like Express here):
openJobs.push({ id: 12345, res: res });
jobQueue.execute({ id: 12345, data: {...}});
When the job finishes and saves the result, it calls the service URL with the id
You can check that the job has actually finished and remove the job from the openJobs list
Finish the original response
openJob.res.send(data);
This will send the data and close the original client-server connection.
The overall result is that you have no polling at all... which is cool.
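Putting those steps together, a minimal Express sketch of the two endpoints (jobQueue is a hypothetical queue client, as in the fragments above):

    const express = require('express');
    const app = express();
    app.use(express.json());

    const openJobs = new Map(); // jobId -> the pending Express response object

    // Endpoint the client calls to start the long-running process.
    app.post('/start', (req, res) => {
      const id = Date.now().toString();           // illustrative ID only
      openJobs.set(id, res);                      // keep the client's response open
      jobQueue.execute({ id, data: req.body });   // hypothetical queue client
    });

    // Endpoint the job queue calls once the job has finished and saved its result.
    app.post('/finished/:id', (req, res) => {
      const pending = openJobs.get(req.params.id);
      if (pending) {
        openJobs.delete(req.params.id);
        pending.json(req.body);  // finish the original client request with the result
      }
      res.status(204).end();     // acknowledge the callback from the job queue
    });

    app.listen(3000);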
Of course, in either of these scenarios you are screwed if your server shuts down in the middle of a batch. This is why I would recommend something like socket.io in this scenario: you would queue the results of jobs somewhere, and socket.io would poll/wait for callbacks on that list and push to the client when there are new items. This is better because if the server crashes it's no biggie; the client will reconnect once the server comes back up.
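A tiny socket.io sketch of that push model (onResultAvailable is hypothetical; the point is just that the client subscribes and gets the result pushed, and reconnects on its own if the server restarts):

    const { Server } = require('socket.io');
    const io = new Server(3001);

    io.on('connection', (socket) => {
      // The client says which job it is waiting for and joins a room for it.
      socket.on('watch', (jobId) => socket.join(jobId));
    });

    // Hypothetical: called whenever a finished job's result shows up in the result list.
    function onResultAvailable(jobId, result) {
      io.to(jobId).emit('job-done', result); // push to whoever is watching that job
    }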
When a Web Role places a message onto a Storage Queue, how can it poll for a specific, correlated response? I would like the back-end Worker Role to place a message onto a response queue, with the intent being that the caller would pick the response up and go from there.
Our intent is to use the queue to offload some heavy processing onto the back-end Worker Roles in order to keep performance high on the Web Roles. However, we do not wish to respond to the HTTP requests until the back-end workers are finished and have responded.
I am actually in the middle of making a similar decision. In my case I have a WCF service running in a web role which should offload calculations to worker roles. When the result has been computed, the web role will return the answer to the client.
My basic data-structure knowledge tells me that I should avoid using something designed as a queue in a non-queue way; a queue should always be serviced in a FIFO-like manner. So if you use queues for both requests and responses, the threads waiting to return data to the client have to wait until the calculation result is at the "top" of the response queue, which is not optimal. If you store the responses in Azure Tables instead, the threads have to poll for results, creating unnecessary overhead.
What I believe is a possible solution to this problem is using a queue for the requests. This enables the competing-consumers pattern and thereby load balancing. On messages sent to this queue you set the CorrelationId property. For the reply, the pub/sub part ("topics") of Azure Service Bus is used together with a correlation filter. When your back end has processed the request, it publishes a result to a "responseSubject" with the CorrelationId given in the original request. This response can then be retrieved by your client by calling CreateSubscription (sorry, I can't post more than two links apparently; google it) using that correlation filter, and it should get notified when the answer is published. Note that the CreateSubscription part should only be done once, in the OnStart method. Then you can do an async BeginReceive on that subscription, and the role will be notified in the given callback when a response for one of its requests is available. The CorrelationId will tell you which request the response is for. So your last challenge is getting this response back to the thread holding the client connection.
This could be achieved by creating a dictionary with the correlation IDs (probably GUIDs) as keys and the responses as values. When your web role gets a request it creates the GUID, sets it as the CorrelationId, adds it to the dictionary, fires the message off to the queue, and then calls Monitor.Wait() on the GUID object. Then have the receive method invoked by the topic subscription add the response to the dictionary and call Monitor.Pulse() on that same GUID object. This wakes your original request thread, and you can now return the answer to your client. (Or something along those lines; basically you just want your thread to sleep and not consume any resources while waiting.)
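Stripped of the .NET specifics, this is just a map of pending requests keyed by correlation ID that the subscription callback completes. A rough sketch of the idea, shown in Node.js purely for illustration (queueClient and the onReply wiring are hypothetical, not the Azure SDK):

    const crypto = require('node:crypto');

    const pending = new Map(); // correlationId -> resolve function of the waiting request

    // Called by the request handler: send the work item and wait for its correlated reply.
    async function sendAndWait(queueClient, payload, timeoutMs = 30000) {
      const correlationId = crypto.randomUUID();
      const reply = new Promise((resolve, reject) => {
        pending.set(correlationId, resolve);
        setTimeout(() => {
          pending.delete(correlationId);
          reject(new Error('timed out waiting for reply'));
        }, timeoutMs);
      });
      await queueClient.send({ correlationId, payload }); // hypothetical queue producer
      return reply;
    }

    // Wired up once as the response-subscription listener: complete the matching request.
    function onReply(message) {
      const resolve = pending.get(message.correlationId);
      if (resolve) {
        pending.delete(message.correlationId);
        resolve(message.body);
      }
    }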
The queues on Azure Service Bus have a lot more capabilities and paradigms, including pub/sub, which can address issues with queue servicing across multiple instances.
One approach with pub/sub is to have one queue for requests and one for the responses. Each requesting instance would also subscribe to the response queue with a filter on a header property, so that it only receives the responses targeted at it. The request message would, of course, contain the value to be placed in the response header to drive the filter.
For the Service Bus-based solution there are samples available for implementing the request/response pattern with queues and topics (pub/sub).
Let the worker role keep polling and processing messages. As soon as a message is processed, add an entry to Table storage with the required correlationId (RowKey) and the processing result, before deleting the processed message from the queue.
Then the Web Roles just need to do a lookup of the table with the desired correlationId (RowKey) and PartitionKey.
Have a look at using SignalR between the worker role and the browser client. Your web role puts a message on the queue, returns a result to the browser (something simple like 'waiting...'), and the browser is hooked up to the worker role with SignalR. That way your web role carries on doing other things and doesn't have to wait for a result from the asynchronous processing; only the browser needs to.
There is nothing intrinsic to Windows Azure queues that does what you are asking, but you could build this yourself fairly easily. Include a message ID (GUID) when you push to the queue and, when processing is complete, have the worker push a new message with that message ID onto a response queue. Your web app can poll this queue to determine when processing is completed for a given command.
We have done something similar and are looking to use something like SignalR to help reply back to the client when commands are completed.