I am trying to reuse a value that was created during a previously run scenario. I am not sure whether there is a way in Cucumber to make a value available across scenarios.
For instance:
1st scenario:
Given inputOfA
When A is created
Then A is returned
And A contains an Id
2nd scenario:
Given IdOfA
When customer gets A by Id
Then A is returned.
For the 2nd scenario, it would be great if I could get the Id created in the first scenario without having to persist it anywhere.
Can any of you let me know whether this is possible using Cucumber, or do I have to persist the Id created in the first scenario?
This is not something you want.
It would require the scenarios to be executed in a specific order. The scenario execution order isn't specified. It may even be random.
What you want is independent scenarios: scenarios you can execute in any order.
If you want to use the result from one scenario in a later scenario, set up the second scenario to execute the same thing the previous scenario would have done. This may feel like duplication, and maybe it is. But the purpose of BDD is to drive the implementation. The first scenario was used to drive some behaviour. The second scenario should be used to drive another behaviour.
It is possible that the first scenario can be deleted when the second is implemented. If it is redundant, remove it.
But whatever you do, avoid the path of scenarios that depend on each other. It will only lead to a bad place with seemingly random errors occurring.
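For illustration, a minimal sketch of what that can look like in Java step definitions. ApiClient, InputOfA, and AResponse are hypothetical stand-ins, not from the question; the point is that the Given step of the second scenario creates A itself:

    import io.cucumber.java.en.Given;
    import io.cucumber.java.en.Then;
    import io.cucumber.java.en.When;

    public class GetAByIdSteps {
        private final ApiClient client = new ApiClient(); // hypothetical test client
        private String idOfA;
        private AResponse response;                       // hypothetical response type

        @Given("IdOfA")
        public void idOfA() {
            // Create A right here instead of depending on an earlier scenario:
            idOfA = client.createA(new InputOfA()).getId();
        }

        @When("customer gets A by Id")
        public void customerGetsAById() {
            response = client.getA(idOfA);
        }

        @Then("A is returned")
        public void aIsReturned() {
            if (response == null || !idOfA.equals(response.getId())) {
                throw new AssertionError("expected A with id " + idOfA);
            }
        }
    }

Each scenario now owns its full setup, so the suite passes no matter what order the scenarios run in.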
I've run into an odd case while thinking about a solution to my problem.
A quick recap: I'm using an event store with CQRS, and I have two aggregates called 'Group' and 'User'.
Basically a User defines some characteristics like his region, age, and a couple of interests.
He can then choose to 'match' with a Group that is in the same region, around the same age, and has the same interests.
Now here's the case: the 'matchmaking' part should happen completely on the backend, it can be a long running process, but for the client it's just 1 call to the endpoint and the end result should be him matching with a group.
So for this case, I have to query the groups which have the same region and the same age slice; the interests don't really matter in my query. I now have a list of groups, and the matchmaker is going to give each group a rating based on the common interests between the group and the user. The group with the best rating will be joined.
So again, using CQRS and ES, my problem is that this case seems to be a mix of queries and a command, and mixing queries into a match command seems to go against the purpose of CQRS.
Querying multiple groups and filtering them on my write side, the event store, is also a bad idea, as the aggregates would have to be rebuilt and loaded into memory before they could be filtered out.
So I'm kind of stuck here. Something is telling me that a long-running process / saga could be an answer to my problem, but I don't see how a saga would avoid mixing queries and commands, as a saga is basically a chain of commands/events.
How do I tackle this specific case? No real code is needed; a conceptual solution to get me going is perfect.
Hi, this is actually a case where CQRS can shine.
Creating a dedicated matching model seems ideal for this case; it allows answering what might otherwise be a rather non-trivial query.
So,
create a dedicated (possibly ephemeral, possibly checkpointed/persisted) query model as a derived store.
Upon request, run a query against it to get the top matches.
Based on the results of the query, send a command to update the event store with the new links.
The query model will not need to handle commands and can be updated on a push basis from the event store. This keeps it rather simple to build and keep up to date, and it can further be optimized to hold only the data needed for this particular query.
An in-memory graph might do well.
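To make that concrete, here is a minimal in-memory sketch in Java. All names (GroupInfo, MatchingModel, the fields used for filtering) are assumptions for illustration; the real model would be populated by subscribing to the event store:

    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;
    import java.util.Set;
    import java.util.UUID;

    class GroupInfo {
        final UUID id;
        final String region;
        final int ageBand;
        final Set<String> interests;

        GroupInfo(UUID id, String region, int ageBand, Set<String> interests) {
            this.id = id;
            this.region = region;
            this.ageBand = ageBand;
            this.interests = interests;
        }
    }

    class MatchingModel {
        private final Map<UUID, GroupInfo> groups = new HashMap<>();

        // Called by the event subscription whenever a group changes.
        void onGroupChanged(GroupInfo info) {
            groups.put(info.id, info);
        }

        // The query: filter on region and age band, rank by shared interests.
        Optional<UUID> bestMatch(String region, int ageBand, Set<String> interests) {
            return groups.values().stream()
                    .filter(g -> g.region.equals(region) && g.ageBand == ageBand)
                    .max(Comparator.comparingLong(
                            (GroupInfo g) -> g.interests.stream()
                                    .filter(interests::contains).count()))
                    .map(g -> g.id);
        }
    }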
-Chris
p.s.
On the command side: the commands here would each only update a single aggregate instance.
Further, using the write-ahead pattern would allow you to avoid needing any sort of process manager or "saga."
e.g.
For each new membership: one command to add the new membership to the user stream, then one command to the group to add the new member information. A simple audit process can then scan for incomplete membership assignments, both on startup/recovery and as a periodic data-quality check.
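A rough sketch of that flow; commandBus, the command types, and the incomplete-membership query are all illustrative assumptions, not a prescribed API:

    import java.util.UUID;

    interface CommandBus { void send(Object command); }          // hypothetical
    record AddMembershipToUser(UUID userId, UUID groupId) {}
    record AddMemberToGroup(UUID groupId, UUID userId) {}
    record Membership(UUID userId, UUID groupId) {}

    class MembershipService {
        private final CommandBus commandBus;

        MembershipService(CommandBus commandBus) { this.commandBus = commandBus; }

        // Write-ahead: record intent on the user stream first, then the group.
        void joinGroup(UUID userId, UUID groupId) {
            commandBus.send(new AddMembershipToUser(userId, groupId));
            commandBus.send(new AddMemberToGroup(groupId, userId));
        }
    }

    class MembershipAudit {
        private final CommandBus commandBus;

        MembershipAudit(CommandBus commandBus) { this.commandBus = commandBus; }

        // Repair pass: finish memberships whose second command never ran.
        // The caller supplies the incomplete set from some read model (hypothetical).
        void sweep(Iterable<Membership> incomplete) {
            for (Membership m : incomplete) {
                commandBus.send(new AddMemberToGroup(m.groupId(), m.userId()));
            }
        }
    }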
-Chris
I have two aggregates, 'notebook' and 'note'.
When I apply the rule 'aggregates reference each other only by their ids', I think I have two options:
Notebook(List<NoteId>, [other properties])
Note([other properties])
or
Notebook([other properties])
Note(NotebookId, [other properties])
With the first option, I need two DB calls to show all notes of a notebook (one to get the list and the second to load the notes).
So my current favorite is the second option. Now I have a few options in mind for saving the order of the notes, each of which has some disadvantages.
What is a good approach to solving my problem? Or is the first option better, and the two DB calls negligible?
Can anybody help?
Big THX
It looks like the order of the Notes is important, at least in relation to the Notebook, so maybe it should be part of the domain. If so, I would suggest storing it together with the Note, or using some other information already on the Note to order the list when it is loaded.
If not, why is the order relevant? I mean, the two entities have related but separate lifecycles, or at least so it seems: one aggregate, the Notebook, has a list that only references the other, the Note. Hence no direct interaction is planned. But, given that the domain is correctly modelled (there's not enough information to say anything about it), somewhere you need an ordered list of Notes. The only way to have it as you need it is to store that information (or use one already stored); otherwise the hypothesis (order is relevant) is not valid anymore.
Update after info about the number of Notes and their size
It looks like your domain is organized in this way:
a root entity, the Notebook, where the order of each Note, referenced only by its ID, is also stored: any change in the order is applied here, not on the Note
another root entity, the Note, with its own lifecycle and its own 'actions' (operations that trigger a change in the entity)
Whenever you load the Notebook, you must also load the Notes and their order to show them correctly ordered. On the other side, when you change the order, this structure allows you to have a single action (or operation) on the Notebook, for example changeOrder(NoteId), that updates the order of the given Note and, if needed, of all the others. The trick here is that when you persist the Notebook you work just with the IDs of the Notes, so you don't have to load the whole entity, only the part you update. How big the Note entity is doesn't matter, because you don't use all of it. Hence, on every change you can trigger an update of all the (NoteId, order) pairs for that Notebook; you can't do differently. To support this you need a single function in the repository that loads the IDs of the Notes with their order and saves them again; that should not be expensive.
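A small sketch of that single operation; the types and the repository interface are hypothetical, just to show the shape of it:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.NoSuchElementException;
    import java.util.UUID;

    class Notebook {
        private final List<UUID> noteIds = new ArrayList<>(); // ordered references only

        // Move a note to a new position; every (noteId, order) pair may shift.
        void changeOrder(UUID noteId, int newPosition) {
            if (!noteIds.remove(noteId)) {
                throw new NoSuchElementException("unknown note: " + noteId);
            }
            noteIds.add(newPosition, noteId);
        }

        List<UUID> orderedNoteIds() {
            return List.copyOf(noteIds);
        }
    }

    // The repository persists only the (noteId, order) pairs, never whole Notes.
    interface NotebookOrderRepository {
        void saveOrder(UUID notebookId, List<UUID> orderedNoteIds);
    }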
On the other side, all the actions that operate directly on the Note have to load it, hence you load the whole entity. But in that case loading and saving it all is required, because you are changing the Note itself.
Anyway, how you persist the order is entirely delegated to the persistence layer, which is built on top of the domain. I mean, the domain just has a Notebook and a set of Notes with order 1, 2, 3, etc.
Even if I don't think this needs such a complex solution, you could use a totally different way to store the order: for example, steps of 100 (so 100, 200, 300, etc.). Each new Note is put in the middle between its two neighbours, and is the only one that has to be saved each time. Every once in a while you run a job, or something similar, that normalizes all the values back to steps of 100 (or whatever step you use to persist the order). As I said, this looks like an overcomplicated solution to the problem, but it also shows that the entities of the domain can be totally different from the persistence ones.
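A minimal sketch of that gap-numbering idea; the names are illustrative:

    import java.util.ArrayList;
    import java.util.List;

    class OrderKeys {
        static final int STEP = 100;

        // Key for a note inserted between two neighbours; only the new note
        // has to be written, as long as a gap remains.
        static int between(int before, int after) {
            if (after - before < 2) {
                throw new IllegalStateException("no gap left, renormalize first");
            }
            return before + (after - before) / 2;
        }

        // The periodic job: restore even steps of 100 so the gaps reopen.
        static List<Integer> renormalize(int noteCount) {
            List<Integer> keys = new ArrayList<>();
            for (int i = 1; i <= noteCount; i++) {
                keys.add(i * STEP);
            }
            return keys;
        }
    }

Inserting between keys 100 and 200 yields 150, and only that one row needs to be written; only when two neighbours end up adjacent does the renormalization job rewrite the whole notebook.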
I am currently just trying to learn some new programming patterns and I decided to give event sourcing a shot.
I have decided to model a warehouse as my aggregate root, in the domain of shipping/inventory, where the number of warehouses is generally pretty constant (i.e. a company won't be adding warehouses too often).
I have run into the question of how to set my aggregateId, which should correspond to a warehouse, on my server. Most examples I have seen, including this one, show the aggregate ID being generated server side when a new aggregate is being created (in my case a warehouse), and then passed in the command request when referring to that aggregate for subsequent commands.
Would you say this is the correct approach? Can I expect the user to know and pass aggregate Ids when issuing commands? I realize this is probably domain dependent, and could also be a UI/UX choice; I'm just wondering what others have done. It would make more sense to me if my event-sourced aggregates were created more frequently, as with meal tabs or shopping carts.
Thanks!
Heuristic: aggregate id, in many cases, is analogous to the primary key used to distinguish entities in a database table. Many of the lessons of natural vs surrogate keys apply.
Can I expect the user to know and pass aggregate Ids when issuing commands?
You probably can't depend on the human to know the aggregate ids. But the client that the human operator is using can very well know them.
For instance, if an operator is going to be working in a single warehouse during a session, then we might look up the appropriate identifier, cache it, and use it when constructing messages on behalf of the user.
Analogy: when you fill in a web form and submit it, the browser does the work of looking at the form action and using that information to construct the correct URI, and likewise the correct HTTP request.
The client will normally know what the ID is, because it just got it during a previous query.
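As an illustration, a minimal sketch of a client that looks the id up once and caches it; all names (WarehouseDirectory, ReceiveStockCommand) are assumptions:

    import java.util.UUID;

    interface WarehouseDirectory { UUID idForCode(String code); } // hypothetical lookup
    record ReceiveStockCommand(UUID warehouseId, String sku, int quantity) {}

    class WarehouseSession {
        private final UUID warehouseId;

        WarehouseSession(WarehouseDirectory directory, String warehouseCode) {
            // One lookup at session start; the operator never sees the raw id.
            this.warehouseId = directory.idForCode(warehouseCode);
        }

        // Every command for this session is addressed with the cached id.
        ReceiveStockCommand receiveStock(String sku, int quantity) {
            return new ReceiveStockCommand(warehouseId, sku, quantity);
        }
    }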
Creation patterns are weird. It can, in some circumstances, make sense for the client to choose the identifier to be used when creating a new aggregate. In others, it makes sense for the client to provide an identifier for the command message, and the server decides for itself what the aggregate identifier should be.
It's messaging, so you want to be careful about coupling the client directly to your internal implementation details -- especially if that client is under a different development schedule. If you get the message contract right, then the server and client can evolve in any way consistent with the contract at any time.
You may want to review Greg Young's 10 year retrospective, which includes a discussion of warehouse systems. TL;DR - in many cases the messages coming from the human operators are events, not commands.
Would you say this is the correct approach?
You're asking if one of Greg Young's Event Sourcing samples represents the correct approach... Given that the combination of CQRS and Event Sourcing was essentially (re)invented by Greg, I'd say there's a pretty good chance of that.
In general, letting the code that implements the Command-side generate a GUID for every Command, Event, or other persistent object that it needs to write is by far the simplest implementation, since GUIDs are guaranteed to be unique. In a distributed system, uniqueness without coordination is a big thing.
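In Java, for example, that boils down to something like this (a sketch, not a prescribed API):

    import java.util.UUID;

    class IdGenerator {
        // A random UUID is unique for practical purposes without any
        // coordination between nodes.
        static UUID newAggregateId() {
            return UUID.randomUUID();
        }
    }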
Can I expect the user to know and pass aggregate Ids when issuing commands?
No, and you particularly can't expect a user to know the GUIDs of their assets. What you may be able to do is present the user with a list of his or her assets. Each item in the list will have the GUID associated with it, but it may not be necessary to surface that ID in the user interface; it's just data that the underlying UI object carries around internally.
In some cases, users do need to know the ID of some of their assets (e.g. if it involves phone support). In that case, you can add a lookup API to address that concern.
I assume that most implementations have a base set of known data that gets spun up fresh for each test run. I think there are a few basic schools of thought from here:
1. Have test code use application calls to produce the data.
2. Have test code spin up the data manually via direct datastore calls.
3. Have that base set of data encompass everything you need to run tests.
I think it's obvious that #3 is the least maintainable approach, but I'm still curious whether anyone has been successful with it. Perhaps you could have databases for various scenarios, and drop/add them from test code.
It depends on the type of data and your domain. I had one unsuccessful attempt when the schema wasn't stable yet. We kept running into problems adding data to new and changed columns, which broke the tests all the time.
Now we successfully use starting-state data where the dataset is largely fixed, the schema is stable, and the data is required in the same state for all tests (e.g. a postcode database).
For most other stuff, the tests are responsible for setting up the data themselves. That works for us!
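For example, a sketch of a test owning its own data through application calls (JUnit 5; the TestApi client and its methods are hypothetical):

    import org.junit.jupiter.api.Assertions;
    import org.junit.jupiter.api.Test;

    class OrderQueryTest {
        private final TestApi api = new TestApi(); // hypothetical application client

        @Test
        void returnsOrderById() {
            String id = api.createOrder("sku-1", 2); // the test creates what it needs
            Assertions.assertEquals(id, api.getOrder(id).id());
        }
    }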
I have certain objects in my domain which are not aggregate roots/entities, yet I still need to retrieve them from a database. I don't want to confuse things by creating repositories for them. So, what are the alternative data access patterns? Would you simply create a DAO for them, while still of course separating the interface?
Edit:
Some more detail on what I'm doing. I need to create a code. This code has certain rules as to its format. One of the rules is that the final character must be a unique number incremented by one from the last code generated. For example:
ABCD1
ABCD2
ABCD3
So, I'm keeping a table with one row and one column to store the number in question. Now, I don't want to consider this number an entity and create a repository for it - that's overkill. I just need a way of retrieving the number, adding 1 to it, and saving it. I know there are myriad ways I could do it, but I'm wondering if there's a customary way.
There are several data access patterns that could apply, in theory. You'd need to provide more detail though if you want us to suggest a specific pattern.
Without more detail, all I can suggest is to consider looking into Martin Fowler's Patterns of Enterprise Application Architecture book.
Edit: A customary way? No, not that I can think of - it really depends on where and how you're using this unique code in your domain. If I were doing this, I'd probably create a small service that speaks directly to the database to perform this function - not as heavyweight as a repository, and very focused on the problem at hand.
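For instance, a sketch of such a focused service using plain JDBC. The table and column names (code_sequence, next_value) are assumptions, and this assumes a database that supports SELECT ... FOR UPDATE:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    class CodeSequenceService {
        private final String jdbcUrl;

        CodeSequenceService(String jdbcUrl) {
            this.jdbcUrl = jdbcUrl;
        }

        // Atomically read, increment, and persist the single counter row.
        int nextNumber() throws SQLException {
            try (Connection c = DriverManager.getConnection(jdbcUrl)) {
                c.setAutoCommit(false);
                int value;
                try (Statement s = c.createStatement();
                     ResultSet rs = s.executeQuery(
                             "SELECT next_value FROM code_sequence FOR UPDATE")) {
                    rs.next();
                    value = rs.getInt(1);
                }
                try (Statement s = c.createStatement()) {
                    s.executeUpdate("UPDATE code_sequence SET next_value = next_value + 1");
                }
                c.commit();
                return value;
            }
        }
    }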
Based on the edit: I would look first at the context in which you need to create that code. Perhaps there are some related entities or something that you are missing.
btw, I find the question really interesting, as it comes up from time to time while coding specific features. I usually end up finding I was missing something in the scenario, and it ends up fitting well with the normal repository pattern.
After surveying the options I'm going with the Table Gateway pattern.