I'm working on a project and was trying to create data models for it. We have a use case where a user can host an event and add members to it.
class Event {
    event_name: String
}
class User {
    username: String
}
I wanted to know which of the following ways to store the event members in the Event class is better.
//v1
class Event {
    event_name: String,
    event_members: Array<String> // List of usernames
}
//v2
class Event {
    event_name: String,
    event_members: Array<User> // List of user objects
}
By using v2, I feel I'll be able to move the logic for fetching user information from the DB out of the client and into my server.
Latency is also something that I'm considering. If I go with v1, then I need to make multiple calls to the server to fetch all the information about event members, resulting in increased wait time. Whereas in v2, the response payload is larger, which might impact our network calls.
I wanted to know which of the two will be the better way to store my model, and if there's a different, more efficient way, please let me know.
There is no singular "better" data model here. When modeling data in a NoSQL database, it always depends on your use-cases. As you add more use-cases to your app, you'll expand and modify the data model to fit your needs.
That said, I typically store both directions of a many-to-many relationship, so both v1 and v2. This allows fast lookup of the related items in both directions, at the cost of some extra storage - so a typical time-vs-space trade-off.
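As a minimal sketch of what storing both directions could look like (the field names here are illustrative, not prescriptive):

// Minimal sketch: persisting both directions of the many-to-many relation.
// Field names are illustrative assumptions, not a prescribed schema.
interface EventDoc {
    event_name: string;
    member_usernames: string[]; // event -> users (the v1 direction)
}
interface UserDoc {
    username: string;
    event_names: string[];      // user -> events (the reverse direction)
}

Whenever a member is added or removed you then update both documents, which is the storage cost of the fast two-way lookup.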
But as said: there is no singular best data model. When you're just getting started, I typically focus on getting a simple model working quickly, and on securing access to that data.
For a good introduction to the topic in general, I recommend reading NoSQL data modeling, and for Firestore specifically watch Todd's excellent Getting to know Cloud Firestore series.
I am completely new to NestJS. I have seen that in NestJS, a model is created to specify the details of the data. For example, when creating a simple task manager, if we want to specify what a single task will look like, we specify it in the model (example below):
export interface Task {
    id: string;
    title: string;
    description: string;
    status: TaskStatus;
}
export enum TaskStatus {
    OPEN = 'OPEN',
    IN_PROGRESS = 'IN_PROGRESS',
    DONE = 'DONE',
}
However, I later came across DTOs, where once again the shape of data is described. My understanding is that DTOs are used when transferring data, i.e. it describes the kind of data that you will post or get.
My question is that when I am already using DTOs to describe the shape of data, why use Models at all?
Also, I read that with DTOs we can have a single source of truth: if we realise that the structure of the data needs to change, we won't have to specify it separately in the controller and service files. However, doesn't this still mean we will have to update the Model?
Over a long period of time, your DTOs and your models can and will diverge from each other. What comes off the HTTP request and what gets sent back can be in a different format than what is kept in your database, so keeping them separate gives you more flexibility as time goes on. This basically comes down to an argument of DTO (Data Transfer Object) and DAO (Data Access Object) versus DTAO (Data Transfer/Access Object) (at least that's what I call them).
This also ties into the Single Responsibility Principle, as each class should deal with one thing and one thing only.
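As a sketch of how that divergence might look in the task manager example (the createdAt field and the CreateTaskDto name are illustrative assumptions, reusing the TaskStatus enum from the question):

// Model: the shape kept by the service/persistence side.
export interface Task {
    id: string;
    title: string;
    description: string;
    status: TaskStatus; // the TaskStatus enum defined in the question
    createdAt: Date;    // internal bookkeeping the client never sends
}

// DTO: the shape accepted over HTTP. There is no id, status or createdAt;
// the server generates those, so the two shapes have already diverged.
export class CreateTaskDto {
    title: string;
    description: string;
}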
There's also this SO post from a Java thread that talks about what you're thinking of
I am in the process of refactoring an application and am trying to figure out where certain logic should fit. For example, during the registration process I have to check if a user exists based upon their email address. As this requires checking whether the user exists in the database, it seems this logic should not be tied to the model, since its existence is dictated by its presence in the database.
However, I will have a method on the repository responsible for fetching the user by email, etc. This handles the part about retrieval of the user if they exist. From a use case perspective, registration seems to be a use case scenario and accordingly it seems there should be a UserService (application service) with a register method that would call the repository method and perform if then logic to determine if the user entity returned was null or not.
Am I on the right track with this approach, in terms of DDD? Am I viewing this scenario the wrong way and if so, how should I revise my thinking about this?
This link was provided as a possible solution: Where to check user email does not already exist?. It does help, but it does not seem to close the loop on the issue. The thing I seem to be missing from this article is who would be responsible for calling the CreateUserService: an application service, or a method on the aggregate root into which the CreateUserService object would be injected along with any other relevant parameters?
If the answer is the application service, it seems like you are losing some encapsulation by taking the domain service out of the domain layer. On the other hand, going the other way would mean having to inject the repository into the domain service. Which of those two options would be preferable and more in line with DDD?
I think the best fit for that behaviour is a Domain Service. A Domain Service can access persistence, so you can check for existence or uniqueness there.
Check this blog entry for more info.
E.g.:
public class TransferManager
{
    private readonly IEventStore _store;
    private readonly IDomainServices _svc;
    private readonly IDomainQueries _query;
    private readonly ICommandResultMediator _result;

    public TransferManager(IEventStore store, IDomainServices svc, IDomainQueries query, ICommandResultMediator result)
    {
        _store = store;
        _svc = svc;
        _query = query;
        _result = result;
    }

    public void Execute(TransferMoney cmd)
    {
        // interacting with the Infrastructure
        var accFrom = _query.GetAccountNumber(cmd.AccountFrom);

        // setup value objects
        var debit = new Debit(cmd.Amount, accFrom);

        // invoking Domain Services
        var balance = _svc.CalculateAccountBalance(accFrom);
        if (!_svc.CanAccountBeDebitted(balance, debit))
        {
            // return some error message using a mediator
            // this approach works well inside monoliths where everything happens in the same process
            _result.AddResult(cmd.Id, new CommandResult());
            return;
        }

        // using the Aggregate and getting the business state change expressed as an event
        var evnt = Transfer.Create(/* args */);

        // storing the event
        _store.Append(evnt);

        // publish event if you want
    }
}
from http://blog.sapiensworks.com/post/2016/08/19/DDD-Application-Services-Explained
The problem that you are facing is called set-based validation. There are a lot of articles describing the possible solutions. I will give an extract from one of them here (the context is CQRS, but it can be applied to some degree to any DDD architecture):
1. Locking, Transactions and Database Constraints
Locking, transactions and database constraints are tried and tested tools for maintaining data integrity, but they come at a cost. Often the code/system is difficult to scale and can be complex to write and maintain. But they have the advantage of being well understood with plenty of examples to learn from. By implication, this approach is generally done using CRUD based operations. If you want to maintain the use of event sourcing then you can try a hybrid approach.
2. Hybrid Locking Field
You can adopt a locking field approach. Create a registry or lookup table in a standard database with a unique constraint. If you are unable to insert the row, then you should abandon the command. Reserve the address before issuing the command. For these sorts of operations, it is best to use a data store that isn't eventually consistent and can guarantee the constraint (uniqueness in this case). Additional complexity is a clear downside of this approach, but less obvious is the problem of knowing when the operation is complete. Read-side updates are often carried out in a different thread or process or even machine to the command, and there could be many different operations happening.
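A minimal sketch of that reservation flow, assuming a store that enforces a unique constraint (the Db interface and table name here are hypothetical, not from any specific library):

// Hypothetical sketch: reserve the unique value in a lookup table with a
// UNIQUE constraint before issuing the command.
interface Db {
    // assumed to reject when the unique constraint is violated
    insert(table: string, row: Record<string, string>): Promise<void>;
}

async function reserveEmail(db: Db, email: string): Promise<boolean> {
    try {
        await db.insert('email_reservations', { email });
        return true;  // reservation won; safe to issue the command
    } catch {
        return false; // already reserved elsewhere; abandon the command
    }
}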
3. Rely on the Eventually Consistent Read Model
To some this sounds like an oxymoron; however, it is a rather neat idea. Inconsistent things happen in systems all the time. Event sourcing allows you to handle these inconsistencies: rather than throwing an exception and losing someone's work all in the name of data consistency, simply record the event and fix it later.
As an aside, how do you know a consistent database is consistent? It keeps no record of the failed operations users have tried to carry out. If I try to update a row in a table that has been updated since I read from it, then the chances are I’m going to lose that data. This gives the DBA an illusion of data consistency, but try to explain that to the exasperated user!
Accepting these things happen, and allowing the business to recover, can bring real competitive advantage. First, you can make the deliberate assumption these issues won’t occur, allowing you to deliver the system quicker/cheaper. Only if they do occur and only if it is of business value do you add features to compensate for the problem.
4. Re-examine the Domain Model
Let's take a simplistic example to illustrate how a change in perspective may be all you need to resolve the issue. Essentially we have a problem checking for uniqueness or cardinality across aggregate roots because consistency is only enforced within the aggregate. An example could be a goalkeeper in a football team. A goalkeeper is a player. You can only have 1 goalkeeper per team on the pitch at any one time. A data-driven approach may have an 'IsGoalKeeper' flag on the player. If the goalkeeper is sent off and an outfield player goes in the goal, then you would need to remove the goalkeeper flag from the goalkeeper and add it to one of the outfield players. You would need constraints in place to ensure that assistant managers didn't accidentally assign a different player, resulting in 2 goalkeepers. In this scenario, we could model the IsGoalKeeper property on the Team, OutFieldPlayers or Game aggregate. This way, maintaining the cardinality becomes trivial.
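A sketch of that re-modelled invariant, with the goalkeeper assignment owned by the Team aggregate (names are illustrative):

// Illustrative sketch: cardinality ("at most one goalkeeper") is trivial
// to enforce once the Team aggregate owns the assignment.
class Team {
    private goalkeeperId: string | null = null;

    constructor(private readonly playerIds: Set<string>) {}

    assignGoalkeeper(playerId: string): void {
        if (!this.playerIds.has(playerId)) {
            throw new Error('Player is not on this team');
        }
        // Reassignment replaces the previous goalkeeper, so two
        // goalkeepers can never coexist within the aggregate.
        this.goalkeeperId = playerId;
    }
}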
You seem to be on the right track; the only thing I didn't get is what your UserService.register does.
It should take all the values needed to register a user as input, validate them (using the repository to check the existence of the email) and, if the input is valid, store the new User.
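A minimal sketch of that register method (the repository interface and User shape are assumptions for illustration):

// Sketch only: UserRepository and User are assumed shapes, not prescribed.
interface User {
    email: string;
    name: string;
}

interface UserRepository {
    findByEmail(email: string): Promise<User | null>;
    save(user: User): Promise<void>;
}

class UserService {
    constructor(private readonly users: UserRepository) {}

    async register(email: string, name: string): Promise<User> {
        // validation: use the repository to check for an existing email
        if (await this.users.findByEmail(email)) {
            throw new Error('Email already registered');
        }
        const user: User = { email, name };
        await this.users.save(user);
        return user;
    }
}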
Problems can arise when the validation involves complex queries. In that case you may need to create a secondary store with special indexes, suited for queries that you can't run against your domain model, so you will have to manage two different stores that can be out of sync (a user exists in one but isn't replicated in the other one yet).
This kind of problem happens when you store your aggregates in something like a key-value store, where you can search just by the id of the aggregate. But if you are using something like a SQL database that permits searching by your entities' fields, you can do a lot of stuff with simple queries.
The only thing you need to take care of is to avoid mixing query logic and command logic. In your example the lookup you need is easy: it is just one field and the result is a boolean. Sometimes it can be harder, like time operations, or queries spanning multiple tables and aggregating results. In these cases it is better to make your (command) service use a (query) service that offers a simple API to do the calculation, like:
interface UserReportingService {
    ComplexResult aComplexQuery(AComplexInput input);
}
You can implement that with a class that uses your repositories, or with an implementation that executes the query directly on your database (SQL, or whatever).
The difference is that if you use the repositories you "think" in terms of your domain objects, while if you write the query directly you think in terms of your DB abstractions (tables/sets in the case of SQL, documents in the case of Mongo, etc.). Choosing one or the other depends on the query you need to do.
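A sketch of the two implementation styles behind the same interface, translated to TypeScript for illustration (all names here are hypothetical):

interface AComplexInput { from: Date; to: Date; }
interface ComplexResult { total: number; }

interface UserReportingService {
    aComplexQuery(input: AComplexInput): Promise<ComplexResult>;
}

// Style 1: think in domain objects, compose repositories.
class RepositoryBackedReporting implements UserReportingService {
    constructor(
        private readonly orders: { totalBetween(from: Date, to: Date): Promise<number> },
    ) {}

    async aComplexQuery(input: AComplexInput): Promise<ComplexResult> {
        return { total: await this.orders.totalBetween(input.from, input.to) };
    }
}

// Style 2: think in DB abstractions, run the query directly.
class SqlBackedReporting implements UserReportingService {
    constructor(
        private readonly runSql: (sql: string, params: unknown[]) => Promise<number>,
    ) {}

    async aComplexQuery(input: AComplexInput): Promise<ComplexResult> {
        const total = await this.runSql(
            'SELECT SUM(amount) FROM orders WHERE placed_at BETWEEN $1 AND $2',
            [input.from, input.to],
        );
        return { total };
    }
}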
It is fine to inject a repository into the domain.
A repository should have a simple interface, so that domain objects can use it as a simple collection or storage. The main idea of repositories is to hide data access behind a simple and clear interface.
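For instance, a deliberately collection-like interface could look as simple as this (a sketch; the method names are illustrative):

// Sketch: the domain sees the repository as little more than a collection.
interface Repository<T> {
    byId(id: string): Promise<T | null>;
    add(item: T): Promise<void>;
    remove(item: T): Promise<void>;
}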
I don't see any problems in calling domain services from a use case. The use case is supposed to be the orchestrator, and domain services are actions. It is fine (and even unavoidable) to trigger domain actions from a use case.
To decide, you should analyze where this restriction comes from.
Is it a business rule? Or maybe the user shouldn't be a part of the model at all?
Usually "User" means authorization and authentication, i.e. behaviour that, to my mind, should be placed in the use case. I prefer to create a separate entity for the domain (e.g. Buyer) and relate it to the use case's user. So when a new user is registered, it is possible to trigger the creation of a new Buyer.
I am redesigning my NodeJS application because I want to use the Rich Domain Model concept. Currently I am using an Anemic Domain Model and this is not scaling well; I just see 'ifs' everywhere.
I have read a bunch of blog posts and DDD-related blogs, but there is something that I simply cannot understand... how do we handle persistence properly?
To start, I would like to describe the layers that I have defined and their purpose:
Persistence Model
Defines the Table Models. Defines the Table name, Columns, Keys and Relations
I am using Sequelize as ORM, so the Models defined with Sequelize are considered my Persistence Model
Domain Model
Entities and Behaviors. Objects that correspond to the abstractions created as part of the Business Domain
I have created several classes and the best thing here is that I can benefit from hierarchy to solve all problems (without loads of ifs yay).
Data Access Object (DAO)
Responsible for the Data management and conversion of entries of the Persistence Model to entities of the Domain Model. All persistence related activities belong to this layer
In my case, DAOs work on top of the Sequelize models created in the Persistence Model. However, I am serializing the records returned from database interactions into different objects based on their properties. E.g.: if I have a table with a column called 'UserType' that contains two values [ADMIN, USER], when I select entries from this table I serialize the result according to the user type, so a user with type ADMIN would be an instance of the AdminUser class, whereas a user with type USER would simply be a DefaultUser...
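A sketch of that mapping step (the row shape is assumed; the class names follow the description above):

// Illustrative sketch of mapping persistence rows to domain classes by type.
interface UserRow {
    username: string;
    userType: 'ADMIN' | 'USER';
}

class DefaultUser {
    constructor(public readonly username: string) {}
}

class AdminUser extends DefaultUser {}

// The DAO picks the concrete domain class based on the persisted UserType.
function toDomainUser(row: UserRow): DefaultUser {
    return row.userType === 'ADMIN'
        ? new AdminUser(row.username)
        : new DefaultUser(row.username);
}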
Service Layer
Responsible for all Generic Business Logic, such as Utilities and other Services that are not part of the behavior of any of the Domain Objects
Client Layer
Any Consumer class that plays around with the Objects and is responsible for triggering the Persistence
Now the confusion starts when I implement the Client Layer...
Let's say I am implementing a new REST API:
POST: .../api/CreateOrderForUser/
{
    items: [{
        productId: 1,
        quantity: 4
    },{
        productId: 3,
        quantity: 2
    }]
}
On my handler function I would have something like:
function(oReq){
    var oRequestBody = oReq.body;
    var oCurrentUser = oReq.user; // This is already a Domain Object
    var aOrderItems = oRequestBody.items.map(function(mOrderData){
        return new OrderItem(mOrderData); // Constructor sets the properties internally
    });
    var oOrder = new Order({
        items: aOrderItems
    });
    oCurrentUser.addOrder(oOrder);
    // So far so good... But how do I persist whatever
    // happened above? Should I call each DAO for each entity
    // created? Like, first create the Order, then create the
    // Items, then update the User?
}
One way I found to make it work is to merge the Persistence Model and the Domain Model, which means that oCurrentUser.addOrder(...) would execute the required business logic and call the OrderDAO to persist the Order along with the Items at the end. The bad thing about this is that now addOrder also has to handle transactions, because I don't want to add the Order without the Items, or update the User without the Order.
So, what I am missing here?
Aggregates.
This is the missing piece of the story.
In your example, there would likely not be a separate table for the order items (and no relations, no foreign keys...). Items here seem to be values (describing an entity, i.e. "45 USD"), and not entities (things that change in time and that we track, i.e. a bank account). So you would not persist OrderItems directly but instead persist only the Order (with the items in it).
The piece of code I would expect to find in place of your comment could look like orderRepository.save(oOrder);. Additionally, I would expect the user to be a weak reference (by id only) in the order, and not orders contained in a user as your oCurrentUser.addOrder(oOrder); code suggests.
Moreover, the layers you describe make sense, but in your example you mix delivery concerns (concepts like request, response...) with domain concepts (adding items to a new order), I would suggest that you take a look at established patterns to keep these concerns decoupled, such as Hexagonal Architecture. This is especially important for unit testing, as your "client code" will likely be the test instead of the handler function. The retrieve/create - do something - save code would normally be a function in an Application Service describing your use case.
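A sketch of such an Application Service function under those assumptions (the repository interface and all names are illustrative):

// Sketch: the use case lives in an application service; delivery concerns
// (request/response) stay in the handler. Items are plain values inside
// the Order aggregate, and the user is referenced by id only.
interface OrderItem {
    productId: number;
    quantity: number;
}

class Order {
    constructor(
        public readonly userId: string, // weak reference to the user, by id
        public readonly items: OrderItem[],
    ) {}
}

interface OrderRepository {
    // persists the order together with its items in one operation
    save(order: Order): Promise<void>;
}

class CreateOrderService {
    constructor(private readonly orders: OrderRepository) {}

    async execute(userId: string, items: OrderItem[]): Promise<Order> {
        const order = new Order(userId, items);
        await this.orders.save(order); // one aggregate, one save, one transaction boundary
        return order;
    }
}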
Vaughn Vernon's "Implementing Domain-Driven Design" is a good book on DDD that would definitely shed more light on the topic.
Let's assume that we have two simple domain objects :
Topic (entity) -> Messages (value object)
These two domain objects could be included into one aggregate according to DDD principles.
But in some cases we need to retrieve topics without messages (if we just want to show a list of topics) and sometimes we need to retrieve topics with their messages.
What is the best way to design that simple case? Thanks in advance.
I would suggest separating domain logic from the data needed for presentation. Something like command-query separation (CQS) or command-query responsibility segregation (CQRS). For example, if someone adds a new message to a topic, you create an appropriate command and handle it as part of your domain logic. And if you need to display some data in the user interface, you select only the data that you really need through a DTO (data transfer object). This solution avoids unnecessary data retrieval and helps to keep things simple: you retrieve only the data you really need.
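A sketch of what that read side could look like (the DTO and method names are illustrative):

// Read-side sketch: a lightweight DTO for the list view, fetched without
// loading the Topic aggregate or its messages.
interface TopicSummaryDto {
    id: string;
    title: string;
    messageCount: number;
}

interface MessageDto {
    author: string;
    body: string;
}

interface TopicDetailsDto extends TopicSummaryDto {
    messages: MessageDto[];
}

interface TopicQueries {
    listTopics(): Promise<TopicSummaryDto[]>;           // list view: no messages
    topicDetails(id: string): Promise<TopicDetailsDto>; // detail view: with messages
}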
If this solution causes a lot of changes in your project, you can create an additional method in the repository that returns a lightweight version of your aggregate (with a default stub for the Messages collection). But this solution has one drawback: you will need to keep in mind that this method returns incomplete data.
So I like the concepts of CQRS in our application, mainly because we already support event sourcing (conceptually, not following any prescriptions that you see out there). However, it really seems like CQRS is geared toward Big Data, eventual consistency, that kind of thing. We are always going to be a relational-DB app, so I am not sure if it fits.
I also have concerns because I think I need to do some special things in my app layer. When doing a read, I need to enforce security and filter data, things that are traditionally implemented in the application layer.
My first question is, does my app fit (a traditional MVC / Relational DB app)? Or does it make more sense to have a traditional app layer and use a DTO Mapper?
My second question is, does it make sense to issue commands to your domain model out of a traditional application layer? I like the idea of commands / command handlers and eventing.
Let me clarify my question. I have concerns around data filtering that is tied to authorization. When a user requests data, there has to be a filter that restricts access to certain data elements by removing them altogether (so they are not returned to the caller), hiding the values, or applying masks to the data. In a contrived example, for a Social Security Number, the user making the request may only be able to see the last 4 digits, so the result would appear like ###-##-1234.
My assertion is that this responsibility belongs in the Application layer. I consider this an aspect, where all responses to queries or commands have to go through this filter mechanism. Here is where my CQRS naivety shines through; perhaps it is that commands never return data, just pointers to data that are looked up through the read model?
Thanks!
First and foremost: CQRS and Relational Databases don't exclude each other. In advanced scenarios it may make sense to replace the SQL-based DB with other means of storage, but CQRS as a concept doesn't care about the persistence mechanism.
In the case of multiple views that depend on roles and/or users, the Thin Read Layer should probably provide multiple result sets:
One containing the full SSN for users that are authorized to access that information.
A different one for users that are not authorized to see that information
...
These can be stored in a separate data store, but they can also be provided through SQL views if you work with a single SQL-based database.
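A sketch of a thin read layer choosing the result set by role (the role names and masking rule are illustrative, echoing the SSN example from the question):

// Illustrative sketch: the read layer shapes the result by the caller's
// role; masking happens before the data ever leaves it.
type Role = 'SUPPORT_STAFF' | 'SUPPORT_EXECUTIVE';

interface CustomerDetails {
    name: string;
    ssn: string;
}

// e.g. "123-45-6789" -> "###-##-6789", as in the question
function maskSsn(ssn: string): string {
    return '###-##-' + ssn.slice(-4);
}

function customerDetailsFor(role: Role, customer: CustomerDetails): CustomerDetails {
    return role === 'SUPPORT_EXECUTIVE'
        ? customer                                     // authorized: unmasked SSN
        : { ...customer, ssn: maskSsn(customer.ssn) }; // otherwise: masked
}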
In CQRS the Application Service still exists in the form of Command Handlers. These can be nested, i.e. handling authorization first, then posting the command to a contained command handler.
public class CrmAuthorizationService {
    private readonly CrmCommandHandler _next;

    public CrmAuthorizationService(CrmCommandHandler handler) {
        _next = handler;
    }

    public void Handle(SomeCommand c) {
        if (authorized) _next.Handle(c); // 'authorized' stands in for your actual auth check
    }
}

// Usage:
var handler = new CrmAuthorizationService(new CrmCommandHandler());
bus.Register<SomeCommand>(handler.Handle);
This way you can nest multiple handlers, e.g. as a REST envelope, for logging, transactions, etc.
To answer your questions:
First: Is CQRS a good fit for your app? No-one can tell without really digging into the specific requirements. Just because you use MVC and a relational DB doesn't mean anything when it comes to the pros and cons of CQRS.
Second: Yes, in some cases it can make sense to let your application layer interact with the client in a classical way and handle things like authentication, authorization, etc., and then issue commands internally. This can be useful when putting an MVC-based UI or a REST API on top of your application.
Update in response to comment:
In an ideal, puristic CQRS scenario Sally would have her own denormalized data for every view, e.g. a couple of documents in a NoSQL DB called CustomerListForSally, CustomerDetailsForSally, etc. These are populated with what she's allowed to see.
Once she gets promoted - which would be an important domain event - all her denormalized data would automatically be overwritten, and extended to contain what she's allowed to see now.
Of course we must stay reasonable and pragmatic, but this ideal should be the general direction that we're heading for.
In reality you probably have some kind of user/role or user/group based system. To be able to view sensitive information, you'd have to be a member of a particular role or group. Each of these could have their own defined set of views and commands. This doesn't need to be denormalized data. It can be as simple as a couple of SQL views:
CustomerDetailsForSupportStaff
CustomerDetailsForSupportExecutive with unmasked SSN
CustomerListForSupportStaff
CustomerListForSupportExecutive with customer revenue totals
etc.