I am trying to develop microservices for an order / transaction process using the event sourcing concept. Staff can place an order / transaction for a customer by phone. The system also records the number of orders grouped by customer. It uses AWS Kinesis to send the customer id in an orderCreated event to the customer data service so we can increment the number of created orders. We separate order processing and customer data based on DDD concepts. But we have to anticipate human error when a staff member selects the wrong customer id for an order, so there is a feature to change the customer of an existing order.
The problem is that the orderUpdated event only contains the latest data of the order, which means the event only has the new customer id. We can increment the order count for the new customer, but we also need to decrement the order count for the previous customer id.
How can I solve this problem? Could you give me some suggestions?
Thanks in advance
It sounds like an OrderUpdated event is not granular enough to meet your needs. Based on the information you provided, you likely want to have a more specific event like OrderCustomerChanged that contains both the new and the old customer Id.
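For example, a minimal sketch of what that event and its consumer might look like (the names and the in-memory counter are purely illustrative, not tied to Kinesis or any particular framework):

import java.util.HashMap;
import java.util.Map;

// Event that carries both the previous and the new customer id.
record OrderCustomerChanged(String orderId, String previousCustomerId, String newCustomerId) {}

// On the customer-data side, the consumer can then adjust both counters in one step.
class CustomerOrderCountProjection {
    private final Map<String, Integer> orderCountByCustomer = new HashMap<>();

    void on(OrderCustomerChanged event) {
        orderCountByCustomer.merge(event.previousCustomerId(), -1, Integer::sum); // undo the wrong customer
        orderCountByCustomer.merge(event.newCustomerId(), 1, Integer::sum);       // credit the correct customer
    }
}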
My goal is to create daily reports for users about chat messages they've missed / not read yet. Right now, all data is stored in ScyllaDB, and that is working out well for the most part. But when it comes to these reports, I have no idea whether there is a good way to achieve that without changing the database system.
The thing is, I don't want to query the unread messages for each user. (I could do that, because messages have a timeuuid I can compare with a last_read timestamp, but it's slow since it means multiple queries for every single user there is.) Therefore, I tried to create a dedicated table for the reporting:
CREATE TABLE missed_messages (  -- illustrative table name
    user uuid,
    channel uuid,
    count_start_time timestamp,
    missed_count int,
    PRIMARY KEY (channel, user)
);
Once a new message arrives in the channel, I can retrieve all users in that channel (from another table). My idea was to increment missed_count, or decrement it in case a message was deleted (and its creation timestamp is > count_start_time; I figure I could achieve that with an IF condition on the update). Once a user reads their messages, I reset count_start_time to the current date and missed_count to 0.
But several issues arise here:
Since I can't use a counter column, my updates aren't atomic. But I think I could live with that.
For the reasons below, it would be ideal if I could just delete a row once the messages get read, instead of resetting the timestamp and counter. But I've read that many deletions might cause performance issues (and I'm also not sure what would happen if the entry gets recreated after a short period because new messages arrive in the channel again).
The real bummer: since I did not want to iterate over all users on the system in the first place, I don't want to iterate over all entries here either. The naive idea would be to query with WHERE missed_count > 0. But missed_count isn't part of the clustering key, so as far as I understand that's not feasible.
Since I have to paginate, it could happen that I get the missed messages for a single user in different chunks. I mean, it could happen that I report to user1 that he has unread messages from channel1 first, and only later that he has unread messages from channel2. That means additional overhead in case I want to avoid multiple reports for the same user.
Is there a way I could structure my table to solve that problem, especially how to query only entries with missed_count > 0 or to utilize row deletion? Or is my goal beyond the design of Cassandra/ScyllaDB?
Thanks in advance!
I have a number of events that are based on values from devices. They are read at intervals, e.g. every hour. The events are delivered to an Event Hub, which is used as an input to a Stream Analytics (SA) job.
I want to aggregate and calculate an average value in SA. Currently, I aggregate and group the events in SA using an origin id and other properties to create the correct groups and averages. The problem is that the averages are not correct. I think the events are either incomplete and/or not correlated correctly.
Using a TumblingWindow will produce a number of static windows based on time, but the events I need to aggregate might fall across two or more windows.
Using a SlidingWindow, as I understand it, will trigger output upon a specific condition and then "look back" over a specified interval. Is this correct? If so, I could attach the same id, like a JobId, to each event that I need aggregated, plus a value indicating whether it is the last event. When the last event enters SA, the SlidingWindow is triggered and we can "look back" for all the events with the same id. Is this possible?
Are there other options in this case? Basically, I need to correlate a number of events based on characteristics other than time.
I hope you can help me.
When using CQRS and Event Sourcing, how does one query historical data? As an example, if I am building a timesheet system that has a report for revenue, I need to query against hours, pay rate, and bill rate for each employee. There is an EMPLOYEE_PAY_RATE table that has EmployeeID, PayRate, and EffectiveDate, as well as a BILL_RATE table which has ClientID, EmployeeID, Rate, and EffectiveDate. The effective date in those tables basically keeps the running history so we can report accurately.
If we were to take a DDD, CQRS, and Event Sourcing route, how would we generate such a report? It's not like we can query the event store in the same way. I've looked at frameworks like Axon, but I'm not sure whether they would allow us to do what we need from a reporting perspective.
When using CQRS and Event Sourcing, how does one query historical data?
Pretty much the same way you query live data: you build the views that you want from the event history, and then query the views for the data that you want.
To borrow your example - your view might be supported by an EMPLOYEE_PAY_RATE table and a BILL_RATE table. Replay your events, and as something interesting happens, update the appropriate table. Ta-da.
An important idea that may not be obvious - for something low latency like a history report, you'll probably want the historical aggregator to be pulling the events from the event store, rather than having a bus push events to the aggregator. The pull approach makes it a lot easier to keep track of where you are, so that you don't need to repeat a lot of work, worry about whether you've received all of the events you should, ordering, and so on.
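As a rough sketch of that pull model (the EventStore interface and event types here are assumptions for illustration, not any specific framework's API):

import java.util.List;

// Hypothetical event-store read API: the projector pulls events after a known position.
interface EventStore {
    List<RecordedEvent> readEventsAfter(long position, int batchSize);
}

record RecordedEvent(long position, Object payload) {}

class HistoryReportProjector {
    private long lastSeenPosition = 0; // checkpoint: how far this projection has read

    void catchUp(EventStore store) {
        List<RecordedEvent> batch;
        while (!(batch = store.readEventsAfter(lastSeenPosition, 100)).isEmpty()) {
            for (RecordedEvent event : batch) {
                apply(event.payload());                // e.g. update EMPLOYEE_PAY_RATE / BILL_RATE rows
                lastSeenPosition = event.position();   // advancing the checkpoint avoids repeating work
            }
        }
    }

    private void apply(Object payload) {
        // dispatch on the event type and update the report tables accordingly
    }
}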
Your report is just another read model/projection of the events - for example, a SQL table that is populated by listening to the relevant events.
If the table is big (i.e. a lot of employees), then in order to be fast you should avoid joins by keeping the data denormalized: for every employee and day (or whatever granularity you want) you would have a row containing the employee ID and name, the start date and end date of the day, and other columns with the relevant data, e.g. the pay rate. You put the employee name here as well in order to avoid joins, and you keep it up to date by listening to the relevant employee events (like EmployeeChangedName).
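A small sketch of such a listener keeping the duplicated name column fresh (the event, the row shape, and the in-memory map stand in for the real events and SQL table; all names are assumptions):

import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// One denormalized row per employee per day: no joins needed when building the report.
record RevenueRow(String employeeId, String employeeName, LocalDate day, BigDecimal payRate) {}

record EmployeeChangedName(String employeeId, String newName) {}

class RevenueReportProjection {
    private final Map<String, RevenueRow> rowsByKey = new HashMap<>(); // stands in for the SQL table

    void on(EmployeeChangedName event) {
        // the name is duplicated on every row, so update it everywhere it appears
        rowsByKey.replaceAll((key, row) ->
                row.employeeId().equals(event.employeeId())
                        ? new RevenueRow(row.employeeId(), event.newName(), row.day(), row.payRate())
                        : row);
    }
}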
I have a booking system in Core Data, and I have a Transaction entity which at the moment has a relationship with Appointment, along with some other things. An Appointment can be made by a Client, and an Appointment has a relationship with a Service type.
I want to store all transactions made on the computer; however, if a client is deleted I still want the client's past transactions to show. Likewise, if an appointment or service is deleted, I still want it to show up in the past transactions. Also, modifications made to the service name shouldn't change it in the transaction, although modifications to the client name should be changed within the transaction.
How can this be achieved? I know that it's possible to put a "deleted" attribute in every entity rather than actually deleting an entity, but then if modifications are made to a service, for example, the change will be reflected in the transaction.
If compliance is an issue, you could keep multiple copies of your entities. With a creation timestamp you would have a pretty good unique id to identify them (in combination with another property or with your own ID scheme).
Now, instead of modifying any transaction, create a new one with the same creation date and copy all the data (but with a different timestamp in a modifiedDate property). When displaying them, just display the latest version. When deleting, just mark it as deleted.
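The Core Data details aside, a framework-agnostic sketch of that versioning idea might look like this (all names are illustrative):

import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Every version of a transaction shares its original creationDate (its logical identity),
// gets its own modifiedDate, and is never overwritten or physically deleted.
record TransactionVersion(Instant creationDate, Instant modifiedDate,
                          String clientName, String serviceName, boolean deleted) {}

class TransactionHistory {
    private final List<TransactionVersion> versions = new ArrayList<>();

    void recordClientRename(TransactionVersion previous, String newClientName) {
        // copy everything, change only what changed, and stamp a new modifiedDate
        versions.add(new TransactionVersion(previous.creationDate(), Instant.now(),
                newClientName, previous.serviceName(), previous.deleted()));
    }

    // When displaying, show only the newest version for a given creationDate.
    Optional<TransactionVersion> latest(Instant creationDate) {
        return versions.stream()
                .filter(v -> v.creationDate().equals(creationDate))
                .max(Comparator.comparing(TransactionVersion::modifiedDate));
    }
}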
I am still trying to wrap my head around how to apply DDD and, most recently, CQRS to a real production business application. In my case, I am working on an inventory management system. It runs as a server-based application exposed via a REST API to several client applications. My focus has been on the domain layer with the API and clients to follow.
The command side of the domain is used to create a new Order and allows modifications, cancellation, and marking an Order as fulfilled and shipped/completed. I, of course, have a query that returns a list of orders in the system (as read-only, lightweight DTOs) from the repository. Another query returns a PickList used by warehouse employees to pull items from the shelves to fulfill specific orders. In order to create the PickList, there are calculations, rules, etc. that must be evaluated to determine which orders are ready to be fulfilled - for example, whether all order line items are in stock. I need to read the same list of orders, iterate over it, and apply those rules and calculations to determine which items should be included in the PickList.
This is not a simple query, so how does it fit into the model?
UPDATE
While I may be able to maintain (store) a set of PickLists, they really are dynamic until an employee retrieves the next PickList. Consider the following scenario:
The first Order of the day is received. I can raise a domain event that triggers an AssemblePickListCommand which applies all of the rules and logic to create one or more PickLists for that Order.
A second Order is received. The event handler should now REPLACE the original PickLists with one or more new PickLists optimized across both pending Orders.
Likewise after a third Order is received.
Let's assume we now have two PickLists in the 'queue', because the optimization rules split the lists when components are at opposite ends of the warehouse.
Warehouse employee #1 requests a PickList. The first PickList is pulled and printed.
A fourth Order is received. As before, the handler removes the second PickList from the queue (the only one remaining) and regenerates one or more PickLists based on the second PickList and the new Order.
The PickList 'assembler' will repeat this logic whenever a new Order is received.
My issue with this is that a request must either block while the PickList queue is being updated, or I have an eventual consistency issue that goes against the behavior the customer wants. Each time they request a PickList, they want it optimized based on all of the Orders received up to that point in time.
While I may be able to maintain (store) a set of PickLists, they really are dynamic until an employee retrieves the next PickList. Consider the following scenario:
The first Order of the day is received. I can raise a domain event that triggers an AssemblePickListCommand which applies all of the rules and logic to create one or more PickLists for that Order.
A second Order is received. The event handler should now REPLACE the original PickLists with one or more new PickLists optimized across both pending Orders.
This sounds to me like you are getting tangled trying to use a language that doesn't actually match the domain you are working in.
In particular, I don't believe that you would be having these modeling problems if the PickList "queue" were a real thing. I think instead there is an OrderItem collection that lives inside some aggregate, and you issue commands to that aggregate to generate a PickList.
That is, I would expect a flow that looks like
onOrderPlaced(List<OrderItems> items)
    warehouse.reserveItems(List<OrderItems> items)
        // At this point, the items are copied into an unassigned
        // items collection. In other words, the aggregate knows
        // that the items have been ordered, and are not currently
        // assigned to any picklist
        fire(ItemsReserved(items))

onPickListRequested(Id<Employee> employee)
    warehouse.assignPickList(Id<Employee> employee, PickListOptimizer optimizer)
        // PickListOptimizer is your calculation, rules, etc. that know how
        // to choose the right items to put into the next pick list from a
        // given collection of unassigned items. This is a stateless
        // *domain service* -- it provides the query that the warehouse aggregate needs
        // to figure out the right change to make, but it *doesn't* change
        // the state of the aggregate -- that's the aggregate's responsibility
        List<OrderItems> pickedItems = optimizer.chooseItems(this.unassignedItems);
        this.unassignedItems.removeAll(pickedItems);

        // This mockup assumes we can consider PickLists to be entities
        // within the warehouse aggregate. You'd need some additional
        // events if you wanted the PickList to have its own aggregate
        Id<PickList> id = PickList.createId(...);
        this.pickLists.put(id, new PickList(id, employee, pickedItems));
        fire(PickListAssigned(id, employee, pickedItems));

onPickListCompleted(Id<PickList> pickList)
    warehouse.closePicklist(Id<PickList> pickList)
        this.pickLists.remove(pickList);
        fire(PickListClosed(pickList));

onPickListAbandoned(Id<PickList> pickList)
    warehouse.reassign(Id<PickList> pickList)
        PickList list = this.pickLists.remove(pickList);
        this.unassignedItems.addAll(list.pickedItems);
        fire(ItemsReassigned(list.pickedItems));
Not great languaging -- I don't speak warehouse. But it covers most of your points: each time a new PickList is generated, it's being built from the latest state of pending items in the warehouse.
There's some contention - you can't assign items to a pick list AND change the unassigned items at the same time. Those are two different writes to the same aggregate, and I don't think you are going to get around that as long as the client insists upon a perfectly optimized picklist each time. It might be worthwhile to sit down with the domain experts and explore the real cost to the business if the second-best pick list is assigned from time to time. After all, there's already latency between placing the order and its arrival at the warehouse....
I don't really see what your specific question is. But the first thing that comes to mind is that pick list creation is not just a query but a full-blown business concept that should be explicitly modeled. It could then be created with an AssemblePicklist command, for instance.
You seem to have two roles/processes and possibly also two aggregate roots - a salesperson works with orders, a warehouse worker with picklists.
AssemblePicklistsCommand() is triggered from order processing and recreates all currently unassigned picklists.
A warehouse worker fires an AssignPicklistCommand(userid), which tries to choose the most appropriate unassigned picklist and assign it to him (or does nothing if he already has an active picklist). He could then use GetActivePicklistQuery(userid) to get the picklist, pick items with PickPicklistItemCommand(picklistid, item, quantity), and finally issue MarkPicklistCompleteCommand() to signal he's done.
AssemblePicklist and AssignPicklist should block each other (serial processing, optimistic concurrency?), but the relation between AssignPicklist and GetActivePicklist is clean - either you have a picklist assigned or you don't.
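For illustration only, those messages could be modeled as plain immutable types like this (no particular CQRS framework assumed):

import java.util.UUID;

// The commands and the query described above, as simple message types.
record AssemblePicklistsCommand() {}
record AssignPicklistCommand(UUID userId) {}
record GetActivePicklistQuery(UUID userId) {}
record PickPicklistItemCommand(UUID picklistId, String item, int quantity) {}
record MarkPicklistCompleteCommand() {}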