Is it possible to use event emitters in Node.js / TypeScript for Azure Functions given the serverless nature of Azure Functions? For high-throughput scenarios, perhaps millions of requests per day to a single Azure Function endpoint, I want to make sure that I don't end up with orphaned events.
You could leverage Azure Event Grid, the distributed cousin of the in-memory EventEmitter, for your scenario.
Here is how the two compare:
Instead of an EventEmitter object, you will have an Event Grid Topic
Instead of a listener, you will have an Event Handler
Instead of .emit(), you POST to the custom topic's endpoint (see the sketch after this list)
Instead of .on(), you use Event Filtering
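To make the .emit() analog concrete, here is a minimal sketch using the @azure/eventgrid SDK; the topic endpoint, key setting, and event shape below are placeholders, not something from your setup:

```typescript
import { EventGridPublisherClient, AzureKeyCredential } from "@azure/eventgrid";

// Placeholder endpoint and key; in a Function App these would come from app settings.
const client = new EventGridPublisherClient(
  "https://<your-topic>.<region>-1.eventgrid.azure.net/api/events",
  "EventGrid", // publish using the Event Grid event schema
  new AzureKeyCredential(process.env.EVENT_GRID_TOPIC_KEY!)
);

// Rough equivalent of emitter.emit("orderCreated", payload).
async function emitOrderCreated(orderId: string, total: number): Promise<void> {
  await client.send([
    {
      eventType: "orderCreated", // handlers filter on this (the .on() analog)
      subject: `orders/${orderId}`,
      dataVersion: "1.0",
      data: { orderId, total },
    },
  ]);
}
```

Once send() succeeds, delivery is Event Grid's responsibility, so a function instance being recycled no longer orphans the event the way an in-process EventEmitter would.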
As for scale, Event Grid has you covered as it was designed for such use cases.
Also, Event Grid has built-in retry and supports dead-lettering.
I have a large message submitted to a REST service; it could be 100 KB or 50 MB. I need to process it asynchronously, and it looks like the transactional outbox pattern is suitable for my needs.
Effectively, my service would commit the data to a database, along with an event record, in a single transaction. A process would poll for events and push each event to a message queue. The event contains a reference to the data in the DB, usually through a unique identifier of some sort. The consumer of the queue would query the data from the database, do whatever it needs to do, and then remove the event record from the database.
This pattern is well documented in several places.
I have a reasonable understanding of how one could implement this on-premises, in the .NET/SQL Server environment we are familiar with. What would this look like in Azure? Are there other ways I can transactionally write to both the database and a queue that do not require the outbox pattern? Or, following the outbox pattern, what would be the mechanism that polls for events in the DB, and what would provide the queue service?
Usually, if you want to use the transactional outbox pattern in Azure, you can use a Logic App or an Azure Function to pick up events from the DB and send them to the queue.
This works especially well if you use the Cosmos DB change feed, so that your architecture is also reactive and performs well with less resource consumption.
To avoid this pattern entirely, you would need a queue in Azure that can take part in a transaction with your DB, and as far as I know that is not possible unless you use a third-party queue.
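As a sketch of the Function-based relay under these assumptions (v4 Node.js programming model; the database, container, and queue names are invented):

```typescript
import { app, output, InvocationContext } from "@azure/functions";

// Service Bus queue that downstream consumers read from (queue name is hypothetical).
const outboxQueue = output.serviceBusQueue({
  queueName: "outbox-events",
  connection: "ServiceBusConnection",
});

// The Cosmos DB trigger rides the change feed, so no explicit polling loop is needed:
// every document written to the "outbox" container shows up here shortly after commit.
app.cosmosDB("outboxRelay", {
  connection: "CosmosDbConnection",
  databaseName: "appdb",
  containerName: "outbox",
  createLeaseContainerIfNotExists: true,
  return: outboxQueue,
  handler: (documents: unknown[], context: InvocationContext) => {
    context.log(`Relaying ${documents.length} outbox event(s)`);
    // Each outbox document carries only a reference (e.g. an id) to the large payload.
    return documents; // returned array is bound to the queue output, one message per item
  },
});
```

The lease container the trigger creates is where the change feed checkpoint lives, so the relay resumes from where it stopped after a restart.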
Currently we have a legacy system (a .NET Framework library) where messages are queued in a Service Bus queue. There is a Windows service with multiple message handlers (custom classes for processing a message), and we have logic to split the messages from the queue across these handlers at a 90%/10% ratio.
We are planning to migrate this legacy system to Azure Functions / Durable Functions, where we could have a queue trigger to process each message. We could have multiple Azure Functions, one for each message handler in the legacy system. The challenge we face is how to handle splitting the messages across these Azure Functions.
For example: Azure-Function-1 should take 90% of the queue messages, and Azure-Function-2 should take 10%.
My question is: does Azure have an out-of-the-box solution for handling messages in such a scenario? Is there a better solution than Azure Functions / Durable Functions?
This is not how it works. The number of function instances is determined by the number of events, so if your Azure-Function-1 has more messages than Azure-Function-2, chances are it will have more copies of your function in execution.
If you have a very large number of messages, you could switch to Event Hubs, where you can set (only in advance) the number of partitions and Throughput Units.
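One way to realize the 90/10 split that fits this scaling model is one queue per handler: the dispatcher sends roughly 90% of messages to one queue and 10% to the other, and each function then scales independently with its own queue depth. A minimal sketch (v4 Node.js programming model; queue names are made up):

```typescript
import { app, InvocationContext } from "@azure/functions";

// Handler 1 reads its own queue; the platform adds instances as queue depth grows.
app.serviceBusQueue("handlerOne", {
  connection: "ServiceBusConnection",
  queueName: "handler-one-messages", // receives ~90% of traffic from the dispatcher
  handler: async (message: unknown, context: InvocationContext) => {
    context.log("Handler 1 processing", message);
  },
});

// Handler 2 is identical apart from its queue, so it scales on its own ~10% share.
app.serviceBusQueue("handlerTwo", {
  connection: "ServiceBusConnection",
  queueName: "handler-two-messages",
  handler: async (message: unknown, context: InvocationContext) => {
    context.log("Handler 2 processing", message);
  },
});
```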
I've started thinking through a prototype architecture for a system I want to build based on Azure Functions and Event Grid.
What I would like to achieve is to have a single point of entry (Function) which a variety of external vendors will send Webhook (GET) HTTP requests to. The purpose of the Function is to add some metadata to the payload, and publish the package (metadata + original payload from vendor) to an Event Grid. The Event Grid will then trigger another Function, whose purpose is to respond to the original Webhook HTTP request with e.g. a status 204 HTTP code.
The diagram below is a simplified version of the architecture, the Event Grid will of course publish events also to other Functions, but for the sake of simplicity…
The challenge I'm facing at the moment is that the context of the original Webhook HTTP request from external vendor is lost after the first Function is triggered. Trying to send the context as part of the event payload to Event Grid feels like an anti-pattern, and regardless I cannot get it working (the .done() function is lost somewhere in the event). Trying to just use context.res = {} and context.done() in the last Function won't respond to the vendor's original HTTP request.
Any ideas here? Is the whole architecture just one big anti-pattern -- will it even work? Or do I have to immediately send the HTTP response in the first Function triggered by the vendor's request?
Thank you!
You are mixing two different patterns: message-driven and event-driven.
Azure Event Grid is a distributed Pub/Sub push eventing model, where the subscriber registers an interest in the source in a loosely coupled manner.
In your scenario, you want to use an eventing model within a synchronous request-response message exchange. The request message exchange context cannot flow through the Pub/Sub eventing model and back to the anonymous endpoint, which is effectively the target of the response message.
However, there are several options for logically integrating these two different patterns; the following are some of them:
using a request-replyTo message exchange pattern, i.e. full-duplex communication with one channel for the request and another for the replyTo.
using a request-response message exchange pattern with polled state. Basically, your first function waits for a subscriber state and then returns to the caller. In a distributed internet architecture, we can use an Azure lease blob for sharing state between the sync part and the async eventing part.
In your scenario, the first Azure Function creates this lease blob, fires an event to the Azure Event Grid (AEG), and then periodically polls the state in the lease blob until the aggregation process (multiple subscribers, etc.) completes.
Also, for this kind of pattern, you can use Azure Durable Functions to simplify the integration with the event-driven AEG model.
The following sequence diagram shows how an Azure lease blob is used for sharing a "Request State" in the distributed model. Note that this pseudo sync/async pattern is suitable for cases where the request-response completes within a short time, less than 60 seconds.
For more details about using a Lease Blob within the Azure Function, see my answer here.
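As a rough sketch of the polling side only, using blob metadata as the shared state; the container name, metadata keys, and the subscriber's behavior are assumptions, and the lease-acquisition details from the linked answer are omitted:

```typescript
import { BlobServiceClient } from "@azure/storage-blob";

const blobService = BlobServiceClient.fromConnectionString(
  process.env.AzureWebJobsStorage!
);

// Called by the HTTP-triggered function after it has published the event to Event Grid.
// The Event Grid subscriber is assumed to set metadata { state: "done", result: ... }
// on the same blob when it finishes.
async function waitForRequestState(
  requestId: string,
  timeoutMs = 55_000 // stay under the ~60 second budget mentioned above
): Promise<string | undefined> {
  const container = blobService.getContainerClient("request-state");
  await container.createIfNotExists();

  // One state blob per request, keyed by a correlation id.
  const blob = container.getBlockBlobClient(requestId);
  await blob.upload("", 0, { metadata: { state: "pending" } });

  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const props = await blob.getProperties();
    if (props.metadata?.state === "done") {
      return props.metadata.result; // written by the subscriber at the end of aggregation
    }
    await new Promise((resolve) => setTimeout(resolve, 1000)); // poll interval
  }
  return undefined; // timed out; respond to the caller accordingly
}
```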
I'm experimenting with the event sourcing / CQRS pattern using a serverless architecture in Azure.
I've chosen the Cosmos DB document database for the Event Store and Azure Event Grid for dispatching events to denormalizers.
How do I ensure that events are reliably delivered to Event Grid exactly once when the event is stored in Cosmos DB? I mean, if delivery to Event Grid fails, the event shouldn't be stored in the Event Store, should it?
Look into the Cosmos DB change feed: a built-in event stream for each change in the DB. You can register one or many listeners/handlers, e.g. Azure Functions.
This might be exactly what you are asking for.
Some suggest you can go directly to Cosmos DB and attach Event Grid to the back side of the change feed.
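A sketch of that wiring under the v4 Node.js programming model (names and event shape are illustrative; note the caveat in the next answer that this is publish-after-write, not an atomic store-and-publish):

```typescript
import { app, InvocationContext } from "@azure/functions";
import { EventGridPublisherClient, AzureKeyCredential } from "@azure/eventgrid";

// Endpoint and key come from hypothetical app settings.
const publisher = new EventGridPublisherClient(
  process.env.EVENT_GRID_TOPIC_ENDPOINT!,
  "EventGrid",
  new AzureKeyCredential(process.env.EVENT_GRID_TOPIC_KEY!)
);

// The change feed trigger checkpoints its position in a lease container, so a
// restarted function resumes from where it left off.
app.cosmosDB("dispatchEvents", {
  connection: "CosmosDbConnection",
  databaseName: "eventstore",
  containerName: "events",
  createLeaseContainerIfNotExists: true,
  handler: async (documents: unknown[], context: InvocationContext) => {
    await publisher.send(
      documents.map((doc: any) => ({
        eventType: doc.type ?? "DomainEvent", // hypothetical field on the stored event
        subject: `events/${doc.id}`,
        dataVersion: "1.0",
        data: doc,
      }))
    );
    context.log(`Dispatched ${documents.length} event(s) to Event Grid`);
  },
});
```

Because the checkpoint advances only after the handler succeeds, a crash can re-deliver a batch: this gives at-least-once rather than exactly-once delivery, so denormalizers should be idempotent.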
You cannot, but you shouldn't do it anyway. Maybe there are some very complicated methods using distributed transactions, but they are not scalable. You cannot atomically store and publish events because you are writing to two different persistences with different transactional boundaries. You can have a synchronous CQRS monolith, but only if you use the same technology for the event persistence and the read-model persistence.
In CQRS the application is split into Write/Command and Read/Query sides (this long video may help). You are trying to unify the two parts into a single one, a downgrade if you will. Instead you should treat them separately, with different models (see Domain-Driven Design).
The Write side should not depend on the outcome of the Read side. This means that after the Event store persists the events, the Write side is done. Also, the Write side should contain all the data it needs to do its job: emitting events based on the business rules.
If you use different technologies for the Write and Read parts, then your Read side should be decoupled from the Write side; that is, it should run in a separate thread/process.
One way to do this is to have a thread/process that listens for appends to the Event store, fetches the new events, and publishes them to Event Grid. If this process fails or is restarted, it should resume from where it left off. I don't know if Cosmos DB supports this, but MongoDB (also a document database) has the oplog that you can tail to get new events within a few milliseconds.
I found the host.json file, which controls the behavior of my Function App, but it doesn't show any entries for the Event Grid trigger.
I was wondering: since the publisher (in my case, blob storage events) sends HTTP requests to my function, does that mean I can control the Event Grid trigger with the HTTP configuration? By the way, I'd prefer not to implement a custom HTTP trigger to handle the events, but if that's the only way, I may have to accept it.