gRPC microservice architecture implementation - Node.js

In a microservice architecture, is it advisable to have a centralized collection of proto files that both clients and servers depend on, or should each client and server have its own proto file?

If your organization uses a monolithic code base (i.e., all code is stored in one repository), I would strongly recommend using the same file. The only alternative is to copy the file, but then you have to keep all the copies in sync.
If you share the protocol buffer file between sender and receiver, you can statically check that both use the same schema, especially if some new microservices will be written in a statically typed language (e.g., Java).
On the other hand, if you do not have a monolithic code base but instead have multiple repositories (e.g., one per microservice), then sharing the protocol buffer files is more cumbersome. What you can do is put them in separate repositories that can be added as a dependency to the microservices that need them. That is what I have seen at my previous company: we had multiple small API repositories for the schema.
So, if it is easy to use the same file, I would recommend doing so instead of creating copies. There may be situations, however, where copying is more practical. The disadvantage is that you always have to apply a change to all copies. In the best case, you know which files to update, and it is merely tedious. In the worst case, you do not know which files to update, and your schemas will drift out of sync; you will only find out when the code is released.
Note that a monolithic code base does not imply a monolithic architecture. You can have microservices and still keep all the source code in one repository. The famous example is, of course, Google. Google also uses protocol buffers heavily for its internal communication. I have not seen their source code, but I would be surprised if they did not share their protocol buffer files between services.
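To make this concrete, here is a minimal sketch of what the shared-schema approach can look like from a Node.js service, assuming the proto files are published as a package named @myorg/protos containing a greeter.proto with a Greeter service (all of those names are illustrative). Both client and server resolve the same file from that one dependency instead of keeping local copies:

```typescript
// Minimal sketch: a Node.js/TypeScript service loading a shared .proto at runtime.
// "@myorg/protos", "greeter.proto" and the Greeter service are assumptions for the example.
import * as path from "path";
import * as grpc from "@grpc/grpc-js";
import * as protoLoader from "@grpc/proto-loader";

// Resolve the proto file from the shared package instead of a local copy.
const PROTO_PATH = path.join(
  path.dirname(require.resolve("@myorg/protos/package.json")),
  "greeter.proto"
);

const packageDefinition = protoLoader.loadSync(PROTO_PATH, {
  keepCase: true,
  longs: String,
  enums: String,
  defaults: true,
  oneofs: true,
});

// Client and server both build their stubs from the same definition,
// so a schema change only ever has to happen in one place.
const proto = grpc.loadPackageDefinition(packageDefinition) as any;
const client = new proto.greeter.Greeter(
  "localhost:50051",
  grpc.credentials.createInsecure()
);
```

The same package can hold the .proto files for every service, versioned and released like any other dependency, which is essentially the "separate schema repository" setup described above.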

Related

Why do we have a redundant repository in BLoC architecture?

In BLoC architecture we have a Data Provider and a Repository. In many examples I see that the Repository just calls the Data Provider, and it is really cumbersome to create the Repository. Why does the Repository exist? What is its purpose?
This is actually something that comes from adopting Clean Architecture, where a repository is an interface that provides data from a source to the UI.
The sources are usually Remote and Local, where Remote refers to fetching data from a remote source (other apps, a REST API, a WebSocket connection) and Local refers to data from a local source (something akin to a database). The idea behind having two separate classes for this is to provide adequate separation of concerns.
Imagine an app like Instagram, which manages both offline and online data. It makes sense to handle the logic for each separately, and then use the repository, which is what your viewmodel/bloc takes in, to access the data. The bloc does not need to know where the data came from; it only needs the data. The repository implementation does not need to know what is used to make an API call; it just consumes the fetched data. Similarly, the repository implementation does not need to know where the local data is fetched from; it just consumes it. This way every part is adequately abstracted, and changes in one class do not affect other classes, because everything is an interface.
All of this helps in testing the code, since mocking and stubbing become easier.
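As a rough illustration of that separation (in TypeScript rather than Dart, and with made-up names), the repository is the only thing the bloc/viewmodel sees, while each data provider only knows its own source:

```typescript
// Illustrative sketch of the Repository / Data Provider split. All names are invented.

interface Post {
  id: string;
  body: string;
}

// Data providers: one per source; each only knows how to fetch or store raw data.
interface RemotePostProvider {
  fetchPosts(): Promise<Post[]>;        // e.g. REST API or WebSocket
}

interface LocalPostProvider {
  readCachedPosts(): Promise<Post[]>;   // e.g. local database
  cachePosts(posts: Post[]): Promise<void>;
}

// The repository is what the bloc/viewmodel depends on; it decides which source to use.
interface PostRepository {
  getPosts(): Promise<Post[]>;
}

class DefaultPostRepository implements PostRepository {
  constructor(
    private remote: RemotePostProvider,
    private local: LocalPostProvider
  ) {}

  async getPosts(): Promise<Post[]> {
    try {
      const posts = await this.remote.fetchPosts();
      await this.local.cachePosts(posts);   // keep the offline copy up to date
      return posts;
    } catch {
      return this.local.readCachedPosts();  // fall back to offline data
    }
  }
}
```

Because the bloc only receives a PostRepository, either provider can be swapped or mocked in tests without touching the bloc at all.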

Common vs Core - difference

Assume we have a couple of libs. What is the difference between a Core and a Common library? How should they be distinguished, and how do we organize the responsibilities of both?
+Common
-Class1
+Core
-Class2
+Lib1 has: Common
+Lib2 has: Core, Common
Should Common be truly common (i.e., all libs use it)? Or is Common only for those who need it?
What is good practice when refactoring / creating a project?
I don't really understand the difference between Core and Common.
I think this depends a lot on your particular application. In a single centralized app, there might be a little overlap between the Core and Common folders. The most important thing is that the split makes sense for your app; don't feel that you need to have those folders just because you've seen them in other apps...
For me, having Core and Common folders makes a lot of sense in some scenarios - e.g., a web app with an API and a client. You may have your Core folder on the API side, where the core execution (the business logic) takes place, and then have a Common folder with things you need on both the API and client sides - e.g., HTTP request validation or a JSON converter.
It may also make sense to have Core and Common folders in other kinds of apps. For example, the Core folder would contain the classes that are central to your app - the vast majority of business model classes would be there.
In the Common folder, on the other hand, you can have other classes that are shared but not central - e.g., a Logger or a MessageSender could live there...
As for your little draft of a code structure, I think your Core package is the one to revise - why doesn't Lib1 use Core? If something is core, that generally means everything else needs it in order to run. If you do not have code that is conceptually central, maybe you can remove your Core package and keep only Common?
As for your other question - I do not think the Common stuff must be shared by all other packages; anything shared by two or more packages can be considered common.
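Using the question's own notation, the split described in this answer might look roughly like this (the class names are purely illustrative):
+Common
-Logger
-MessageSender
+Core
-Order
-Customer
+Lib1 has: Core, Common
+Lib2 has: Core, Common
Here both libs depend on Core because it holds the central business model, while Common holds the shared-but-not-central helpers such as the Logger.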

Is it common practice to have a shared and dedicated assembly or library for messages in a distributed system?

I'm taking a crack at some of the concepts behind distributed domain-driven design, and I'm building a proof of concept. I have three C# solutions, each with a specific responsibility within the overall system.
The solutions I have are:
The write model (receives commands from a client and creates and sends events)
The read model (receives events from write model, creates a database and exposes DTO services to the client, could potentially be 2 separate solutions)
The client (calls services to get needed data and sends commands to the write model)
All three solutions use messaging (commands, events) through a service bus. (MassTransit in my case).
My main question is: Is it common practice to create an assembly with the messages and have each solution reference that assembly?
Extra credit: Is there anything I'm doing that seems weird or problematic in this POC? Any additional info I should be aware of when creating this type of a system?
Is it common practice to create an assembly with the messages and have each solution reference that assembly?
Yes. This is a common practice with messaging systems in general. For example, many NServiceBus samples employ this approach. Think of this assembly as representing your contract. In systems built upon different platforms this representation would come in the form of an XSD schema or some other schema definition mechanism.
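As a conceptual sketch (written in TypeScript here for brevity; in the C#/MassTransit case this would be a class library referenced by all three solutions), the shared package is simply the one place where the commands and events are defined:

```typescript
// contracts/orders.ts - a shared "messages" package; all names are illustrative.

// Command sent from the client to the write model.
export interface PlaceOrder {
  orderId: string;
  customerId: string;
  amount: number;
}

// Event published by the write model and consumed by the read model.
export interface OrderPlaced {
  orderId: string;
  placedAt: string;   // ISO-8601 timestamp
}
```

The write model, read model, and client all import these types from the same package, so the contract can only ever change in one place.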
Is there anything I'm doing that seems weird or problematic in this POC? Any additional info I should be aware of when creating this type of a system?
Everything seems well suited to CQRS so far. To be fair, I should mention that it is easy to get carried away with CQRS as a silver bullet and structure systems around it; it is often a wise decision to forgo CQRS altogether. Keep the focus on the business domain and use CQRS as an architectural style to implement your system, not to guide its model.

Security Program - Splitting Files

How would you go about describing the architecture of a "system" that splits a sensitive file into smaller pieces on different servers in order to protect the file?
Would we translate the file into bytes and then distribute those bytes onto different servers? How would you even go about getting all the pieces back together to reconstruct the original file (if you have the correct permissions)?
This is a theoretical problem that I do not know how to approach. Any hints at where I should start?
This is not an authoritative answer, but you will get many replies here that provide partial answers to your question. It may just give you some ideas.
My guess is, you would be creating a custom file system.
Take a look at various filesystems like
GmailFS: http://richard.jones.name/google-hacks/gmail-filesystem/gmail-filesystem.html
pyfilesystem: http://code.google.com/p/pyfilesystem/
A distributed file system in python: http://pypi.python.org/pypi/koboldfs
Architecturally, it will be very similar to the way a typical distributed filesystem is implemented.
It should be a client/server architecture in master/slave mode, and you will have to create a custom protocol for their communication.
The master process is what you talk to for retrieving and writing your files.
The slave filesystems would be distributed across different servers, each keeping tagged files that contain partial pieces of a file.
The master filesystem will contain a per-file entry that locates the full sequence of tagged data distributed across the various slave servers.
You could add redundancy by storing each piece of tagged data on multiple servers.
The communication protocol will have to allow multiple servers to respond to a request for tagged data; in the simplest case, the master simply picks one response and ignores the others.
The usual security requirements need to be respected when storing and communicating this information across servers.
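Stripped of networking, encryption, and redundancy, the split/reassemble idea can be sketched like this (TypeScript/Node.js; in a real system each chunk would be sent to a different server and the manifest would be the master's per-file entry):

```typescript
// Minimal sketch: split a file into fixed-size chunks and reassemble it from a manifest.
import { promises as fs } from "fs";

interface Manifest {
  chunkIds: string[];   // ordered, so the file can be put back together in sequence
}

async function splitFile(path: string, chunkSize: number): Promise<Manifest> {
  const data = await fs.readFile(path);
  const chunkIds: string[] = [];
  for (let offset = 0, i = 0; offset < data.length; offset += chunkSize, i++) {
    const chunkId = `${path}.chunk${i}`;
    // Stand-in for "store this piece on some slave server".
    await fs.writeFile(chunkId, data.subarray(offset, offset + chunkSize));
    chunkIds.push(chunkId);
  }
  return { chunkIds };
}

async function reassembleFile(manifest: Manifest, outPath: string): Promise<void> {
  // Stand-in for "request each piece from the server that holds it", then concatenate in order.
  const parts = await Promise.all(manifest.chunkIds.map((id) => fs.readFile(id)));
  await fs.writeFile(outPath, Buffer.concat(parts));
}
```

Everything hard about the real problem lives in what the comments wave away: where the chunks go, how they are encrypted, and how the manifest itself is protected.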
You will be most interested in a secure distributed filesystem implemented in Python: Tahoe
http://tahoe-lafs.org/~warner/pycon-tahoe.html
http://tahoe-lafs.org/trac/tahoe-lafs

Should repositories be both loading and saving entities?

In my design, I have Repository classes that get Entities from the database (how they do that does not matter). But to save Entities back to the database, does it make sense to have the repository do this too? Or does it make more sense to create another class (such as a UnitOfWork) and give it the responsibility of saving, by having it accept Entities and calling save() on it to tell it to go ahead and do its magic?
In DDD, Repositories are definitely where ALL persistence-related stuff is expected to reside.
If saving to and loading from the database were encapsulated in more than one class, database-related code would be spread over too many places in your codebase, making maintenance significantly harder. Moreover, there is a high chance that later readers of the code will not understand it at first sight, because such a design does not adhere to the quasi-standards that most developers expect to find.
Of course, you can have separate Reader/Writer-helper classes, if that's appropriate in your project. But seen from the Business Layer, the only gateway to persistence should be the repository...
I would give the repository the overall responsibility for encapsulating all aspects of load and save. This ensures that tricky issues, such as managing contention between readers and writers, have one place to be managed.
The repository might well use your UnitOfWork class, and might need to expose BeginUow and Commit methods.
Fowler says that the repository API should mimic a collection:
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection.
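A small sketch of what that can look like, with the repository surface reading like a collection and a UnitOfWork handling the transactional part (all names and the UnitOfWork shape are illustrative):

```typescript
// Sketch only: a repository that both loads and saves, fronted by a collection-like API.

interface Order {
  id: string;
  total: number;
}

interface UnitOfWork {
  begin(): Promise<void>;
  commit(): Promise<void>;
  rollback(): Promise<void>;
}

// The business layer only sees this; it reads like an in-memory collection.
interface OrderRepository {
  get(id: string): Promise<Order | undefined>;
  add(order: Order): Promise<void>;
  remove(id: string): Promise<void>;
}

class SqlOrderRepository implements OrderRepository {
  constructor(private uow: UnitOfWork) {}

  async get(id: string): Promise<Order | undefined> {
    // Placeholder: a real implementation would run the SELECT here.
    return undefined;
  }

  async add(order: Order): Promise<void> {
    await this.uow.begin();
    // Placeholder: a real implementation would run the INSERT/UPDATE here.
    await this.uow.commit();
  }

  async remove(id: string): Promise<void> {
    await this.uow.begin();
    // Placeholder: a real implementation would run the DELETE here.
    await this.uow.commit();
  }
}
```

Callers never see the UnitOfWork; they just add, get, and remove Orders, and the repository decides how and when to hit the database.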
