Netflix Conductor query on the following topic - netflix-conductor

I want to know whether Netflix Conductor can read data from:
MongoDB
a queue
Meaning, a task or workflow should read data from one of the above endpoints while executing its logic.
A similar question was asked on the Netflix Conductor discussion forum.

Related

Implementing a transactional outbox event architecture in Azure

I have a large message submitted to a REST service; it could be 100 KB or 50 MB. I need to process it asynchronously, and it looks like the transactional outbox event pattern is suitable for my needs.
Effectively, my service would commit the data to a database, along with an event record, in a single transaction. A process would poll for events and push each event to a message queue. The event contains a reference to the data in the DB, usually through a unique identifier of some sort. The consumer of the queue would query the data from the database, do whatever it needs to do, and then remove the event record from the database.
This pattern is well documented. Here and here are two places.
I have a reasonable understanding of how one could implement this on-premises, in the .NET/SQL Server environment we are familiar with. What would this look like in Azure? Are there other ways I can transactionally write to the database and a queue that do not require the outbox pattern? Or, following the outbox pattern, what would be the mechanism that polls for events in the DB, and what would provide the queue service?
Usually, if you want to use the transactional outbox event pattern in Azure, you can use a Logic App or an Azure Function to fetch events from the DB and send them to the queue.
This works particularly well if you use the Cosmos DB change feed, so that your architecture is also reactive and performs well with less resource consumption.
To avoid this pattern entirely, you would need a queue in Azure that can take part in a transaction with your DB, and as far as I know that is not possible unless you use a third-party queue.
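As a rough sketch of what that polling relay might look like in C# (the Outbox table, its columns, and the Service Bus queue name are made up for illustration; a timer-triggered Azure Function or a worker service would call this periodically):

// Sketch: a timer-driven relay that moves outbox rows to a Service Bus queue.
// Table/column names ("Outbox", "Id", "Payload") and the queue name are assumptions.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Data.SqlClient;

public static class OutboxRelay
{
    public static async Task RelayPendingEventsAsync(string sqlConnectionString, string serviceBusConnectionString)
    {
        await using var serviceBus = new ServiceBusClient(serviceBusConnectionString);
        ServiceBusSender sender = serviceBus.CreateSender("outbox-events"); // hypothetical queue name

        await using var connection = new SqlConnection(sqlConnectionString);
        await connection.OpenAsync();

        // Read a batch of unpublished events; each row only references the large payload by id.
        var select = new SqlCommand("SELECT TOP (50) Id, Payload FROM Outbox ORDER BY Id", connection);
        var pending = new List<(long Id, string Payload)>();
        await using (var reader = await select.ExecuteReaderAsync())
        {
            while (await reader.ReadAsync())
                pending.Add((reader.GetInt64(0), reader.GetString(1)));
        }

        foreach (var (id, payload) in pending)
        {
            // Publish first, then delete the outbox row; a crash in between causes a
            // duplicate delivery, so consumers must be idempotent (at-least-once semantics).
            await sender.SendMessageAsync(new ServiceBusMessage(payload) { MessageId = id.ToString() });

            var delete = new SqlCommand("DELETE FROM Outbox WHERE Id = @id", connection);
            delete.Parameters.AddWithValue("@id", id);
            await delete.ExecuteNonQueryAsync();
        }
    }
}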

RabbitMQ and Node.js integration: call a function on the subscriber when 2 inter-related messages arrive in a queue

I have 2 microservices that process some input data (provided through queue ABC) asynchronously. After completing their processing, they each push a message into another queue (queue ASD) to signal that the work has been done.
A subscriber is listening to queue ASD. I want to run a function on the subscriber once both of the previous microservices have completed their processing and pushed their two related messages to the queue.
I am using RabbitMQ as the queue and Node.js as the client.
Please suggest an approach.
If I understand your question correctly, what you are trying to do is stream processing. Unfortunately, RabbitMQ (based on AMQP) is a pure messaging tool and does not provide this. Kafka lets you do stream processing out of the box. You can also refer to this for more details on what I am saying.
There are a couple of less efficient approaches to the problem, though.
You can read all the messages from queue ASD and store them in a DB, and have another service/API read them periodically from this DB (or, while writing to the DB, this consumer can check whether this is the second message and call the API accordingly). The final service can connect/associate the two records based on some attribute, and do its work once both records are available.
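The question uses Node.js, but the pairing logic is the same in any client. Here is a minimal sketch in C# with the RabbitMQ.Client library, where an in-memory dictionary stands in for the DB described above and the correlation id is assumed to be set by both producing services:

// Sketch: consume queue "ASD" and fire a callback once both related messages
// (matched by a shared correlation id) have arrived. All names are illustrative.
using System;
using System.Collections.Concurrent;
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

public static class PairingSubscriber
{
    private static readonly ConcurrentDictionary<string, string> Pending = new();

    public static void Start()
    {
        var factory = new ConnectionFactory { HostName = "localhost" };
        var connection = factory.CreateConnection();
        var channel = connection.CreateModel();
        channel.QueueDeclare("ASD", durable: true, exclusive: false, autoDelete: false);

        var consumer = new EventingBasicConsumer(channel);
        consumer.Received += (_, ea) =>
        {
            string body = Encoding.UTF8.GetString(ea.Body.ToArray());
            string correlationId = ea.BasicProperties.CorrelationId; // both services must set the same id

            if (Pending.TryRemove(correlationId, out string first))
            {
                // Second related message arrived: run the combined work.
                HandleBothCompleted(correlationId, first, body);
            }
            else
            {
                // First of the pair: remember it until its sibling shows up.
                Pending[correlationId] = body;
            }
            channel.BasicAck(ea.DeliveryTag, multiple: false);
        };
        channel.BasicConsume("ASD", autoAck: false, consumer);
    }

    private static void HandleBothCompleted(string correlationId, string first, string second) =>
        Console.WriteLine($"Both results for {correlationId} received: {first} / {second}");
}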
Ideally, if it's not too late, I would suggest using Kafka instead of RabbitMQ for this use case.

How to control idempotency of messages in an event-driven architecture?

I'm working on a project where DynamoDB is being used as database and every use case of the application is triggered by a message published after an item has been created/updated in DB. Currently the code follows this approach:
repository.save(entity);
messagePublisher.publish(event);
Udi Dahan has a video called Reliable Messaging Without Distributed Transactions where he talks about a solution to situations where a system can fail right after saving to the DB but before publishing the message, since messages are not part of a transaction. But his solution seems to assume a SQL database, since the process involves saving, as part of the transaction, the correlationId of the message being processed, the entity modification, and the messages that are to be published. With a NoSQL DB, I cannot think of a clean way to store the information about the messages.
A solution would be to use DynamoDB Streams and subscribe to the published change events, using either a Lambda or another service to transform them into domain-specific events. My problem with this is that I wouldn't be able to send the messages from the domain logic; the logic would be spread across the service processing the message and the Lambda/service reacting to changes, and the solution would be platform-specific.
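For illustration only, a handler along those lines might look roughly like this in C#, assuming the Amazon.Lambda.DynamoDBEvents package; the attribute names, event types, and publisher call are placeholders:

// Sketch: a Lambda triggered by DynamoDB Streams that turns item changes into
// domain events and publishes them. Event names and the publisher are illustrative.
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.DynamoDBEvents;

public class StreamToDomainEvents
{
    // In a real project this handler also needs the usual Lambda JSON serializer attribute.
    public Task FunctionHandler(DynamoDBEvent streamEvent, ILambdaContext context)
    {
        foreach (var record in streamEvent.Records)
        {
            var image = record.Dynamodb.NewImage;   // the item as it looks after the write
            if (image == null) continue;            // e.g. REMOVE events

            // Hypothetical mapping from the raw item to a domain-specific event.
            string entityId = image["pk"].S;
            string eventType = record.EventName?.ToString() == "INSERT" ? "EntityCreated" : "EntityUpdated";

            context.Logger.LogLine($"Publishing {eventType} for {entityId}");
            // messagePublisher.Publish(eventType, entityId);  // publish to SNS/SQS/etc.
        }
        return Task.CompletedTask;
    }
}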
Is there any other way to handle this?
I can't offer a solution specific to DynamoDB since I've never used that engine, but I've built an event-driven system on top of MongoDB, so I can share some learnings you might find useful for your case.
You can have different approaches:
1) With an event sourcing approach, you can simply save the events/messages your use case produces within a transaction. In MongoDB, when you are only inserting/appending new items to the same collection, you can ensure atomicity. Even if the engine does not provide that capability, the write operation is so centralized that you reduce the possibility of error to a minimum.
Once all the events are stored, you can then consume them, project them to a given state, and persist the updated state in another transaction.
Here you have to deal with eventual consistency as data will be stale in your read model until you have projected the events.
2) Another approach is to apply the UnitOfWork pattern, where you cache all the write operations (insert/update/delete) needed to save both the events and the state. Once your use case finishes, you execute all the cached operations against the database (flush). This way, although the operations are not atomic, you again centralize them enough to minimize errors.
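A storage-agnostic sketch of that UnitOfWork idea in C# (the cached operations are just deferred delegates here; in practice they would be MongoDB insert/update/delete calls):

// Sketch: a minimal unit of work that caches write operations and executes them
// together at the end of the use case ("flush"). Not atomic, but it keeps the
// failure window as small as possible.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class UnitOfWork
{
    private readonly List<Func<Task>> _pendingWrites = new();

    // Called throughout the use case instead of writing to the database immediately.
    public void Register(Func<Task> write) => _pendingWrites.Add(write);

    // Called once at the end of the use case: all writes happen back-to-back.
    public async Task FlushAsync()
    {
        foreach (var write in _pendingWrites)
            await write();
        _pendingWrites.Clear();
    }
}

// Usage sketch: both the new state and the events it produced are registered,
// then flushed together after the domain logic has finished.
// unitOfWork.Register(() => stateCollection.ReplaceOneAsync(filter, updatedEntity));
// unitOfWork.Register(() => eventCollection.InsertOneAsync(domainEvent));
// await unitOfWork.FlushAsync();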
Of course, the best option is to use an ACID database if you require that capability; any other approach is a workaround that only gets close to it.
Regarding publishing the events, I don't know if you mean publishing them to a message transport such as RabbitMQ, Kafka, etc., but that should be a background process that fetches the events from the DB and publishes them, so you avoid a two-phase commit within the same transaction.

Service Fabric actors that receive events from other actors

I'm trying to model a news post that contains information about the user who posted it. I believe the best way is to send user summary information along with the message that creates a news post, but I'm a little confused about how to update that summary information if the underlying user information changes. Right now I have the following NewsPostActor and UserActor interfaces:
public interface INewsPostActor : IActor
{
    Task SetInfoAndCommitAsync(NewsPostSummary summary, UserSummary postedBy);
    Task AddCommentAsync(string content, UserSummary postedBy);
}

public interface IUserActor : IActor, IActorEventPublisher<IUserActorEvents>
{
    Task UpdateAsync(UserSummary summary);
}

public interface IUserActorEvents : IActorEvents
{
    void UserInfoChanged();
}
Where I'm getting stuck is how to have the INewsPostActor implementation subscribe to events published by IUserActor. I've seen the SubscribeAsync method in the sample code at https://github.com/Azure/servicefabric-samples/blob/master/samples/Actors/VS2015/VoiceMailBoxAdvanced/VoicemailBoxAdvanced.Client/Program.cs#L45 but is it appropriate to use this inside the NewsPostActor implementation? Will that keep an actor alive for any reason?
Additionally, I have the ability to add comments to news posts, so should the NewsPostActor also keep a subscription to each IUserActor for each unique user who comments?
Events may not be what you want to be using for this. From the documentation on events (https://azure.microsoft.com/en-gb/documentation/articles/service-fabric-reliable-actors-events/)
Actor events provide a way to send best effort notifications from the Actor to the clients. Actor events are designed for Actor-Client communication and should NOT be used for Actor-to-Actor communication.
It's worth considering notifying the relevant actors directly, or having an actor/service that manages this communication.
Service Fabric Actors do not yet support a Publish/Subscribe architecture. (see Azure Feedback topic for current status.)
As already answered by charisk, Actor-Events are also not the way to go because they do not have any delivery guarantees.
This means the UserActor has to initiate a request when a name changes. I can think of multiple options:
From within IUserAccount.ChangeNameAsync() you can send requests directly to all NewsPostActors (assuming the UserAccount holds a list of his posts). However, this would introduce additional latency since the client has to wait until all posts have been updated.
You can send the requests asynchronously. An easy way to do this would be to set a "NameChanged" property on your actor state to true within ChangeNameAsync() and have a timer that regularly checks this property. If it is true, it sends requests to all NewsPostActors and sets the property to false afterwards (see the sketch after this list). This would be an improvement over the previous version; however, it still implies a very strong coupling between UserAccounts and NewsPosts.
A more scalable solution would be to introduce the "Message Router" pattern. You can read more about this pattern in Vaughn Vernon's excellent book "Reactive Messaging Patterns with the Actor Model". This way you can basically set up your own Pub/Sub model by sending a "NameChanged" message to your router. NewsPostActors can, depending on your scalability needs, subscribe to that message either directly or through some indirection (maybe a NewsPostCoordinator). And also depending on your scalability needs, the router can forward the messages either directly or asynchronously (by storing them in a queue first).
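As a very rough sketch of option 2 using the Reliable Actors API (UpdateAuthorAsync on INewsPostActor, the "PostIds" list, and the state keys used here are hypothetical; error handling is omitted):

// Sketch of option 2: a UserActor timer that pushes the new summary to its posts
// once a "NameChanged" flag has been set. UpdateAsync plays the role of the
// ChangeNameAsync() mentioned above.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Client;
using Microsoft.ServiceFabric.Actors.Runtime;

public class UserActor : Actor, IUserActor
{
    public UserActor(ActorService actorService, ActorId actorId) : base(actorService, actorId) { }

    protected override Task OnActivateAsync()
    {
        // Check the flag every 30 seconds while the actor is active.
        RegisterTimer(PushNameChangeAsync, null, TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(30));
        return base.OnActivateAsync();
    }

    public async Task UpdateAsync(UserSummary summary)
    {
        await StateManager.SetStateAsync("Summary", summary);
        await StateManager.SetStateAsync("NameChanged", true);
    }

    private async Task PushNameChangeAsync(object state)
    {
        var changed = await StateManager.TryGetStateAsync<bool>("NameChanged");
        if (!changed.HasValue || !changed.Value) return;

        var summary = await StateManager.GetStateAsync<UserSummary>("Summary");
        var postIds = await StateManager.GetOrAddStateAsync("PostIds", new List<string>());

        foreach (var postId in postIds)
        {
            var post = ActorProxy.Create<INewsPostActor>(new ActorId(postId));
            await post.UpdateAuthorAsync(summary); // hypothetical method on INewsPostActor
        }
        await StateManager.SetStateAsync("NameChanged", false);
    }
}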

How to implement something similar to Storm DRPC in Samza?

I have a Samza job with a number of tasks, each of which holds some state in its embedded store. I want to expose this store for reading to the outside world via some kind of RPC mechanism. What would be the best solution for this?
Here is one paragraph from the Samza documentation about it:
Samza does not currently have an equivalent API to DRPC, but you can build it yourself using Samza’s stream processing primitives.
The only solution that comes to my mind is to make my tasks, in addition to their normal processing, consume request messages carrying correlation IDs on a special request topic, and put response messages with the same correlation IDs onto a special response topic. It's essentially an RPC-over-Kafka solution, which seems suboptimal to me.
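For what it's worth, the calling side of that idea might look roughly like this in C# with Confluent.Kafka; the topic names and the correlation scheme are assumptions, and the Samza job itself would do the mirror image with its own stream primitives:

// Sketch: send a query keyed by a correlation id to "store-requests" and wait
// for the matching reply on "store-responses". Topic names are illustrative.
using System;
using Confluent.Kafka;

public static class StoreRpcClient
{
    public static string Query(string bootstrapServers, string payload)
    {
        string correlationId = Guid.NewGuid().ToString();

        var consumerConfig = new ConsumerConfig
        {
            BootstrapServers = bootstrapServers,
            GroupId = "store-rpc-" + correlationId, // one-off group so each caller sees every response
            AutoOffsetReset = AutoOffsetReset.Earliest
        };
        using var consumer = new ConsumerBuilder<string, string>(consumerConfig).Build();
        consumer.Subscribe("store-responses"); // subscribe before sending so the reply is not missed

        using var producer = new ProducerBuilder<string, string>(
            new ProducerConfig { BootstrapServers = bootstrapServers }).Build();
        // The Samza task consumes this topic, queries its local store and writes the
        // answer to "store-responses", keyed by the same correlation id.
        producer.Produce("store-requests", new Message<string, string> { Key = correlationId, Value = payload });
        producer.Flush(TimeSpan.FromSeconds(5));

        while (true)
        {
            var result = consumer.Consume(TimeSpan.FromSeconds(30));
            if (result == null) throw new TimeoutException("No response for " + correlationId);
            if (result.Message.Key == correlationId) return result.Message.Value;
        }
    }
}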
Any thoughts are welcome!
As far as I remember, the embedded store is backed up by a Kafka changelog topic: when you set something in the store, a message is produced to that topic. You can therefore consume this topic and "clone" the embedded store into a different database, and then query that database. Or you can simply use the database instead of the embedded store, but this approach could lead to performance issues in your Samza job...
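A minimal sketch of such a mirroring consumer in C# with Confluent.Kafka; the topic name is a placeholder for whatever changelog topic the store is configured with, and a dictionary stands in for the target database:

// Sketch: consume the store's changelog topic and mirror every key/value into a
// queryable store. "my-job-store-changelog" is a placeholder topic name.
using System;
using System.Collections.Generic;
using System.Threading;
using Confluent.Kafka;

public static class ChangelogMirror
{
    public static void Run(string bootstrapServers, CancellationToken token)
    {
        var mirror = new Dictionary<string, string>(); // replace with a real database

        var config = new ConsumerConfig
        {
            BootstrapServers = bootstrapServers,
            GroupId = "changelog-mirror",
            AutoOffsetReset = AutoOffsetReset.Earliest // rebuild the full store from the start
        };
        using var consumer = new ConsumerBuilder<string, string>(config).Build();
        consumer.Subscribe("my-job-store-changelog");

        try
        {
            while (true)
            {
                var result = consumer.Consume(token);
                if (result.Message.Value == null)
                    mirror.Remove(result.Message.Key);   // null value = deletion (tombstone)
                else
                    mirror[result.Message.Key] = result.Message.Value;
            }
        }
        catch (OperationCanceledException) { /* shutting down */ }
    }
}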
