Sequence Diagram: Interactions with resources (DB, Network, Caches, etc)

I am currently assessing the behavior of different software modules with regard to DB access, network access, amount of memory allocation, etc.
The main goal is to pick a main use case (let's say system initialization) and identify the modules that are:
Unnecessarily accessing the DB.
Creating too many caches for the same data.
Making too many allocations (or allocations that are too big) at once.
Spawning many threads.
Accessing the network.
By assessing those, I could have an overview of the modules that need to be reworked in order to improve performance, delete redundant DB accesses, avoid CPU usage peaks, etc.
I found the sequence diagram a good candidate to represent the use cases' behavior, but I am not sure how to depict their interaction with the above-mentioned activities.
I could do something like shown in this picture, but that is an "invention" of tagging functions with colors. I am not sure if it is too simplistic or childish (too many colors?).
I wonder if there is any specific UML diagram to represent this kind of interaction.

Using SDs is probably the most appropriate approach here. You might consider timing diagrams in certain cases if you need to present timing constraints. However, SDs already have a way to show timing constraints which is quite powerful.
You should adorn your diagram with a comment stating that the length of the colored self-calls represents percentage of use or something like that (or just add a title saying this). Using colors is perfect, by the way.
As a side note: (the colored) self-calls are shown with a self-pointing arrow like this
but I'd guess your picture can be understood by anyone and you can see that as nitpicking. And most likely they are not real self-calls but just indicators. So that's fine too.
tl;dr Whatever transports the message is appropriate.
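If you want numbers behind the colored markers, one option is to count the accesses per module while the use case runs and put those counts on the diagram. A minimal sketch, assuming you can wrap the relevant calls yourself; UsageLog, the resource names and the module names are all hypothetical and not part of any tool mentioned above:

```python
from collections import Counter, defaultdict
from contextlib import contextmanager

# Hypothetical resource categories, mirroring the list in the question.
RESOURCES = {"db", "cache", "alloc", "thread", "network"}


class UsageLog:
    """Collects per-module resource access counts for one use case run."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    @contextmanager
    def access(self, module, resource):
        if resource not in RESOURCES:
            raise ValueError(f"unknown resource: {resource}")
        self.counts[module][resource] += 1
        yield

    def report(self):
        # One line per module, sorted so the output is stable.
        return [
            f"{module}: " + ", ".join(f"{r}={n}" for r, n in sorted(c.items()))
            for module, c in sorted(self.counts.items())
        ]


log = UsageLog()
with log.access("ConfigLoader", "db"):
    pass  # the real DB read would happen here
with log.access("ConfigLoader", "db"):
    pass
with log.access("ConfigLoader", "cache"):
    pass
with log.access("Updater", "network"):
    pass

lines = log.report()
print("\n".join(lines))
```

Each report line then maps to one lifeline in the SD, and each count to one colored marker.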

Related

How to represent a complex use case where every step of the main flow can have multiple scenarios (alternative or error path)?

Little background
I'm new to writing use cases and representing their scenarios.
I'm dealing with a complex system. In the first step of analyzing the system, I created a use case diagram where each use case represents a distinct goal or value for the system. I have tried my best to keep the use cases independent. All these use cases require the initialization and activation of the system, so I decided to take out this common part and link it to the main use cases using include relationship.
I understand that include and extend relationships need to be used only when necessary.
Now I'm looking into defining scenarios for each use case and then developing user stories and requirements based on those scenarios.
Main issue
The use cases are very complex, and the easiest way to analyze them seems to be mapping each one into a sequence of steps/activities where each activity contains several scenarios and each scenario is represented using a sequence diagram.
I understand that an activity cannot be a use case related to the main use case through an include relationship; but having sequence diagrams for activities seems wrong too.
What is the best way to represent a use case where each step of the main flow is complex and can have several interactions between actors and systems as well as having error scenarios which can result in termination of the sequence at that step or possibility of the user cancelling/aborting the sequence?
I have attached a simplified version of the activity diagram for "Initialize" use case.
As I mentioned, each activity can have many scenarios. For example:
"Perform Self check" has many steps and each step might result in a failure that can terminate the sequence and alert the user (via an HMI). The user can then either terminate the initialization or retry.
"Validate system configuration" includes steps for obtaining the reference config versions and comparing them to the system config, then downloading the new config files if necessary and updating the system configs. Each step might have a failure resulting in some sort of message to the user and termination of the sequence. In some cases the user should be able to skip the failed steps and proceed without doing that activity.
Same goes for every other activity in the diagram; many steps with exception or alternative paths.
Can I map these on one sequence diagram for the "Initialize" Use case?
My attempt to put all these on one sequence diagram failed.
I tried putting all these interactions on an activity diagram with swimlanes but things got so complex that stakeholders have a hard time understanding what is going on.
Maybe I'm trying to put too much detail at the system level. Should I leave all these interim steps and interactions for the lower level of design? Should I create a hierarchy of use cases and roll down the complexity? I'm confused. :(
What is the best way to deal with such a level of complexity? Could you provide some good examples?

The only way to represent a complex use case, where every step of the main flow can have multiple scenarios, is fortunately very simple:
The complexity of the scenarios does not change anything about the simplicity of the actor's goals. And if the goals are not sufficiently simple, you're probably looking at too much detail. Or things are not as clear as they should be.
The scenarios are often represented with a set of sequence diagrams. But if it gets really complex you'd better show the flow with an activity diagram.
By the way, you do not need to create an artificial extending or included use-case for the sake of modelling common steps. You may just create a separate activity diagram for the common part. Then, in each of your use-case activity diagrams, you'd insert a call action for the common activity. This also avoids misleadingly including the common part in the description of one UC and forgetting it for the others.
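To make the call-action idea concrete, here is a rough sketch in code rather than UML. It assumes the retry/skip/abort behavior described in the question; run_step, initialize, Decision and the ask_user callback (a stand-in for the HMI dialog) are hypothetical names:

```python
from enum import Enum


class Decision(Enum):
    RETRY = "retry"
    SKIP = "skip"
    ABORT = "abort"


class Aborted(Exception):
    """Raised when the user terminates the sequence at a failed step."""


def run_step(name, action, ask_user, skippable=False):
    """Run one step; on failure the user may retry, skip (if allowed) or abort."""
    while True:
        try:
            action()
            return "done"
        except RuntimeError as err:
            decision = ask_user(name, err)
            if decision is Decision.RETRY:
                continue
            if decision is Decision.SKIP and skippable:
                return "skipped"
            raise Aborted(name) from err


def initialize(steps, ask_user):
    """Common activity; each use-case flow invokes it like a call action."""
    return {name: run_step(name, action, ask_user, skippable)
            for name, action, skippable in steps}


attempts = {"n": 0}


def self_check():
    # Fails on the first attempt only, to exercise the retry path.
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("sensor not ready")


def validate_config():
    raise RuntimeError("reference config unreachable")


def ask_user(step, error):
    # Stand-in for the HMI: retry the self check, skip the config validation.
    return Decision.RETRY if step == "Perform self check" else Decision.SKIP


result = initialize(
    [("Perform self check", self_check, False),
     ("Validate system configuration", validate_config, True)],
    ask_user,
)
print(result)
```

The main flow stays a flat list of steps. The alternative and error paths live inside each step, which is the same split as one overview diagram plus one detailed diagram per complex step.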
Last but not least, you also want to develop user-stories based on the use-case scenario. This is a mixed approach that requires some more thoughts:
User-stories are generally used without use-cases. Complex requirements are described as an epic. The epic is then successively refined into user-stories that fit in an iteration.
It is possible to structure such user-stories according to stakeholder goals and tasks. This approach is called user-story mapping. It is closer to use-cases, but there is no term to describe the higher-level goals.
Use-case driven development is generally used without user-stories: the scenarios and activities lead directly to development without intermediate user-stories.
Fortunately, the Use-Case 2.0 approach allows you to combine both ways. Read the linked whitebook: it's short, it's free, and it's written by the inventor of use-cases together with leading authors of use-case methodology. It offers a reengineered approach that allows agile development, using use-cases for the big picture and use-case slices to break it down dynamically into units that can be developed in one iteration.

A complex use case can remain a single use case, but it may need multiple diagrams to specify its flows.
Your activity diagram (although not 100% UML compliant) gives a good overview of the flow of the use case. Keep this as the main diagram. I would decompose the complex steps in separate diagrams. To indicate that a step is decomposed in a separate diagram, you can display a rake symbol, as follows:
See UML 2.5.1 specification, section 16.3.4.1 for more information.

In UML Sequence diagrams, is it possible to model optional external inputs?

In UML Sequence Diagrams you have the combined fragment type Alt to branch based on different values for parameters. But let's say that in the middle of your sequence you are waiting for one of two different messages from two different external actors and you shall branch the code depending on which one arrives, what would be the best way to model this? And to make the question a little more challenging, let's throw in the possibility that neither message comes (triggering a timeout).
Without a better solution, I would divide the sequence diagram into multiple sequence diagrams, each new one starting with one of the two possible messages. Or possibly just go over to state machines. But is there a not too convoluted way that would allow me to show these different cases within one sequence diagram?

I would simply go for the two SDs, which you can name accordingly. One should always keep in mind that an SD should highlight a certain aspect of a complex chain of actions in a system. Trying to put more and more information into a single SD will mess it up and hinder more than it helps.
It is also possible to use diagram fragments, which allow navigation by zooming into the two fragments.
The timing diagram will not really help here. You would still need a large alt-fragment to show the sequences depending on which message arrived first.

In addition to the answer I referred in the comment, I made a little sample with a duration constraint for the timeout.
If you have a lot of conditional logic to show, Activity Diagrams are an alternative. They do not have object responsibilities or a time axis, but because of this they can freely use two dimensions to show flow control.
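For reference, the behavior being modelled (branch on whichever of two messages arrives first, or on a timeout) could look roughly like this in code. This is only a sketch using Python's asyncio; wait_for_first, message_after and the actor names are made up for illustration:

```python
import asyncio


async def wait_for_first(sources, timeout):
    """Return the name of the source whose message arrives first, or 'timeout'."""
    tasks = {asyncio.create_task(coro): name for name, coro in sources.items()}
    done, pending = await asyncio.wait(
        list(tasks), timeout=timeout, return_when=asyncio.FIRST_COMPLETED
    )
    # Stop waiting for whatever did not arrive, and let the cancellations settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    if not done:
        return "timeout"
    # If both messages arrived at the same moment, pick one deterministically.
    return min(tasks[task] for task in done)


async def message_after(delay):
    # Stand-in for an external actor sending its message after some delay.
    await asyncio.sleep(delay)


async def demo():
    first = await wait_for_first(
        {"actor_a": message_after(0.01), "actor_b": message_after(0.2)},
        timeout=0.1,
    )
    neither = await wait_for_first(
        {"actor_a": message_after(0.2), "actor_b": message_after(0.2)},
        timeout=0.01,
    )
    return first, neither


outcome = asyncio.run(demo())
print(outcome)
```

The three possible return values correspond to the three operands you would need in one alt fragment, or to three separate SDs.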

What's the difference between a centralized and a distributed sequence diagram?

I'm new to UML and have come across sequence diagrams, and realized that there are two types: distributed and centralized. Can anyone explain the differences to me?

Centralized control: one participant does most of the processing, and the other participants are there to supply data.
Example:
Distributed control: the processing is split among many participants, each one doing a little bit of the algorithm.
Example:
Both styles have their strengths and weaknesses. Most people, particularly those new to objects, are more used to centralized control. In many ways, it’s simpler, as all the processing is in one place; with distributed control, in contrast, you have the sensation of chasing around the objects, trying to find the program.
Despite this, object bigots like me strongly prefer distributed control. One of the main goals of good design is to localize the effects of change. Data and behavior that accesses that data often change together. So putting the data and the behavior that uses it together in one place is the first rule of object-oriented design.
Furthermore, by distributing control, you create more opportunities for using polymorphism rather than using conditional logic. If the algorithms for product pricing are different for different types of product, the distributed control mechanism allows us to use subclasses of product to handle these variations.
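The two styles can be sketched side by side in code. The classes below are an illustrative assumption (prices are in cents, and BulkProduct with its 10% discount from 10 units is invented for the example), not taken from the diagrams:

```python
from dataclasses import dataclass


@dataclass
class Product:
    unit_price: int  # cents

    def price(self, quantity):
        return self.unit_price * quantity


@dataclass
class BulkProduct(Product):
    """Hypothetical product type: 10% off when ordering 10 units or more."""

    def price(self, quantity):
        base = super().price(quantity)
        return base * 9 // 10 if quantity >= 10 else base


@dataclass
class OrderLine:
    product: Product
    quantity: int

    def price(self):
        # Distributed control: the line delegates pricing to its product.
        return self.product.price(self.quantity)


@dataclass
class Order:
    lines: list

    def price_distributed(self):
        return sum(line.price() for line in self.lines)

    def price_centralized(self):
        # Centralized control: the order pulls the data and branches on type itself.
        total = 0
        for line in self.lines:
            amount = line.product.unit_price * line.quantity
            if isinstance(line.product, BulkProduct) and line.quantity >= 10:
                amount = amount * 9 // 10
            total += amount
        return total


order = Order([OrderLine(Product(500), 2), OrderLine(BulkProduct(100), 10)])
print(order.price_distributed(), order.price_centralized())
```

Both methods give the same total, but adding another product type only touches a new subclass in the distributed version, whereas the centralized version needs another branch inside Order.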

Synchronizing Query-side Data in CQRS - won't there still be contention?

I have a general question about the CQRS paradigm.
I understand that a CommandBus and EventBus will decouple the domain model from our Query-side datastore, the merits of eventual consistency, and being able to denormalize the storage on the Query side to optimize reads, etc. That all sounds great.
But I wonder as I begin to expand the number of the components on the Query side responsible for updating the Query datastore, if they wouldn't start to contend with one another to perform their updates?
In other words, if we tried to use a pub/sub model for the EventBus, and there were a lot of different subscribers for a particular event type, couldn't they start to contend with one another over updating various bits of denormalized data? Wouldn't this put us in the same boat as we were before CQRS?
As I've heard it explained, it sounds like CQRS is supposed to do away with this contention altogether, but is this just an ideal, and in reality we're only really minimizing it? I feel like I could be missing something here, but can't put my finger on it.

It all depends on how you have designed the infrastructure. Strictly speaking, CQRS in itself doesn't say anything about how the Query models are updated. Using Events is just one of the options you have. CQRS doesn't say anything about dealing with contention either. It's just an architectural pattern that leaves you with more options and choices to deal with things like concurrency. In "regular" architectures, such as the layered architecture, you often don't have these options at all.
If you have scaled your command processing component out on multiple machines, you can assume that they can produce more events than a single event handling component can handle. That doesn't have to be a bad thing. It may just mean that the Query models will be updated with a slightly bigger delay during peak times. If it is a problem for you, then you should consider scaling out the query models too.
The Event Handler components themselves will not be contending with each other. They can safely process events in parallel. However, if you design the system to make them all update the same data store, your data store could be the bottleneck. Setting up a cluster or dividing the query model over different data sources altogether could be a solution to your problem.
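As a rough illustration of handlers that do not share a write target, here is a minimal in-memory sketch. The EventBus class, the OrderPlaced event and both views are hypothetical, and dispatch is synchronous only to keep the example short:

```python
from collections import defaultdict


class EventBus:
    """Tiny pub/sub bus: every subscriber of an event type gets each event."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


# Each handler owns one denormalized view, so no two handlers write the same data.
orders_by_customer = defaultdict(list)
revenue_by_day = defaultdict(int)


def update_customer_view(event):
    orders_by_customer[event["customer"]].append(event["order_id"])


def update_revenue_view(event):
    revenue_by_day[event["day"]] += event["amount"]


bus = EventBus()
bus.subscribe("OrderPlaced", update_customer_view)
bus.subscribe("OrderPlaced", update_revenue_view)

bus.publish("OrderPlaced",
            {"order_id": 1, "customer": "alice", "day": "2024-01-01", "amount": 40})
bus.publish("OrderPlaced",
            {"order_id": 2, "customer": "alice", "day": "2024-01-01", "amount": 60})

print(dict(orders_by_customer), dict(revenue_by_day))
```

If both views lived in the same database, the handlers would still be independent of each other, but that database is where any contention would show up.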
Be careful not to prematurely optimize, though. Don't scale out until you have the figures to prove that it will help in your specific case. CQRS based architectures allow you to make a lot of choices. All you need to do is make the right choice at the right time.
So far, in the applications I am involved with, I haven't come across situations where the Query model was a bottleneck. Some of these applications produce more than 100 million events per day.

DDD/CQRS for composite .NET app with multiple databases

I'll admit that I am still quite a newbie with DDD and even more so with CQRS. I also realize that DDD and/or CQRS might not be the right approach to every problem. Nevertheless, I like the principles but have some questions in the context of a current project.
The solution is a simulator that generates performance data based on the current configuration. Administrators can create and modify the specifications for simulations. Testers set some environmental conditions and run the simulator. The results are captured, aggregated and reported.
The solution consists of 3 component areas, each with its own use-cases, domain logic and supporting data structure. As a result, a modular design seems appealing as a way to segregate logic and separate concerns.
The first area would be the administrative aspect which allows users to create and modify the specifications. This would be a CRUD heavy 'module'.
The second area would be for executing the simulations. The domain model would be similar to the first area but optimized for executing the simulation as opposed to providing a convenient model for editing.
The third area is reporting.
From this I believe that I have three Bounded Contexts, yes? I have three clear entry points into the application, three sets of domain logic and three different data models to support the domain logic.
My first instinct is to follow these lines and create three modules (assemblies) that encapsulate the domain layer for each area. Should I also have three separate databases? Maybe more than three to support write versus read?
I gather this may be preferred for CQRS but am not sure how to go about it. It appears to me that CQRS suggests a set of back-end processes that move data around. But if that's the case, and data persistence is cross-cutting (as DDD suggests), then doesn't my data access code need awareness of all of the domain objects? If so, then is there a benefit to having separate modules?
Finally, something I failed to mention earlier is that specifications are considered 'drafts' until published, which makes them available for simulation. My PublishingService needs to have knowledge of the domain model for both the first and second areas so that when it responds to the SpecificationPublishedEvent, it can read the specification, translate the model and persist it for execution. This makes me think I don't have three bounded contexts after all. Or am I missing something in my analysis?

You may have a modular UI for this, but I don't see three separate domains in what you are describing necessarily.
First off, in CQRS reporting is not directly a domain model concern, it is a facet of the separated Read Model which takes on the responsibility of presenting the domain state optimized for reporting.
Second, just because you have different things happening in the domain is not necessarily a reason to bound them away from each other. I'd take a read through the blue DDD book to get a better feel for what BCs look like.
I don't really understand your domain well enough but I'll try to give some general suggestions.
Start with where you talked about your PublishingService. I see a Specification aggregate root which takes a few commands that probably look like CreateNewSpecification, UpdateSpecification and PublishSpecification.
The events look similar and probably feel redundant: SpecificationCreated, SpecificationUpdated, SpecificationPublished. Which kind of sucks, but a CRUD heavy model doesn't have very interesting behaviors. I'd also suggest finding an automated way to deal with model/schema changes on this aggregate, which will be tedious if you don't use code generation, or handling the changes in a dynamic way that doesn't require you to build new events each time.
Also you might just consider not using event sourcing for such an aggregate root since it is so CRUD heavy.
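To give an idea of what I mean, a Specification aggregate without event sourcing could be as small as the sketch below. The field names and the two rules (an update turns the spec back into a draft, publishing twice is an error) are my assumptions, since I don't know your domain:

```python
from dataclasses import dataclass, field


@dataclass
class Specification:
    spec_id: str
    settings: dict = field(default_factory=dict)
    published: bool = False
    events: list = field(default_factory=list)  # emitted event names, in order

    @classmethod
    def create(cls, spec_id, settings):
        # CreateNewSpecification command
        spec = cls(spec_id, dict(settings))
        spec.events.append("SpecificationCreated")
        return spec

    def update(self, changes):
        # UpdateSpecification command
        self.settings.update(changes)
        self.published = False  # assumption: an edited spec becomes a draft again
        self.events.append("SpecificationUpdated")

    def publish(self):
        # PublishSpecification command
        if self.published:
            raise ValueError("already published")  # assumption: publish only once
        self.published = True
        self.events.append("SpecificationPublished")


spec = Specification.create("spec-1", {"load": 10})
spec.update({"load": 20})
spec.publish()
print(spec.events)
```

The state is stored directly and the events are only emitted for other components to react to, for example the one that translates a published specification for execution.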
The second thing you describe seems to be about starting a simulation which will run based on a Specification and produce data during that simulation (I assume). An event driven architecture makes sense here to decouple updating the reporting data from the process that is producing the data. This has huge benefits if you are producing large amounts of data to process.
However, it doesn't sound like a Simulation is necessarily the kind of AR that would benefit from Event Sourcing either, for a couple of reasons:
Simulation really takes only one Command which is something like StartSimulation
Simulation then produces events over its lifetime which represent what is happening internally with the simulation
Simulation doesn't seem to ever receive any other Commands that could depend on the current state of the Simulation
Simulation is not interacted with by multiple clients/users simultaneously and as we pointed out it isn't really interacted with at all
In general, domain modeling is very specific to each individual project, so it's hard to give you all the information you need to build your domain model. It will come as a result of spending a great deal of time trying to understand your users' needs and the problem they are trying to solve with the software. It will likely go through multiple refinements as you develop insights into their process.
