How to access schema in a Kafka consumer when using schema registry? - node.js

I'm integrating Kafka into our microservices architecture. We're using Karapace as the schema registry and protobuf as the data format. In the producer microservice there's a .proto file defining the schema to be pushed, and I've created the corresponding TypeScript interfaces using ts-node.
On the consumer side, the schema registry will fetch the schema associated with the received data to validate and deserialise it. But how do I access the corresponding interfaces in the consuming microservice, so that I can implement type checking?
The direct way seems to be writing interfaces for the expected response data beforehand. But that would hamper schema evolution, and I'd be back to square one.

writing interfaces for the expected response data beforehand
Yes, but you can also download them rather than rewrite them. That is, your producer code (assuming it is also TypeScript) can be responsible for publishing the .d.ts types to a common NPM registry, which the consumer then adds as a dependency.
Or you can set up NPM/yarn pre-build hooks that download the schema from the registry and run the necessary protoc commands to compile it, similar to Confluent's own Registry Maven/Gradle plugins combined with the Avro/Protobuf plugins.
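The download step of such a pre-build hook could look roughly like the sketch below. It assumes the registry exposes the Confluent-compatible REST endpoint GET /subjects/{subject}/versions/latest, which Karapace is designed to be compatible with. The function names, the Fetcher type and the subject name "orders-value" are hypothetical, chosen only for illustration.

```typescript
// Shape of the registry response for GET /subjects/{subject}/versions/latest.
interface RegisteredSchema {
  subject: string;
  version: number;
  id: number;
  schemaType?: string; // "PROTOBUF" for protobuf subjects; may be omitted for Avro
  schema: string;      // the raw schema text, i.e. the .proto source
}

// Minimal subset of fetch() that the sketch needs. Passing it in as an
// argument keeps the function testable without a running registry.
type Fetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Build the URL of the latest registered version of a subject.
function latestSchemaUrl(registryUrl: string, subject: string): string {
  const base = registryUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/subjects/${encodeURIComponent(subject)}/versions/latest`;
}

// Download the latest schema for a subject, failing loudly on HTTP errors.
async function downloadLatestSchema(
  registryUrl: string,
  subject: string,
  fetcher: Fetcher,
): Promise<RegisteredSchema> {
  const response = await fetcher(latestSchemaUrl(registryUrl, subject));
  if (!response.ok) {
    throw new Error(
      `Registry returned HTTP ${response.status} for subject "${subject}"`,
    );
  }
  return (await response.json()) as RegisteredSchema;
}
```

On Node 18 or later the global fetch can be passed as the fetcher, for example downloadLatestSchema("http://localhost:8081", "orders-value", fetch). The hook would then write the returned schema text to a .proto file and run whatever protoc-based generator the project already uses, so the consumer's types are regenerated on every build instead of being written by hand.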

Related

How to use compact serialization in hazelcast in kubernetes cluster

I tried to follow this tutorial with compact serialization, but I don't know what the workflow is when I want to use it with a custom CompactSerializer.
I have an Employee object which I want to serialize and deserialize. When I install Hazelcast in Kubernetes, I need to add a jar containing this class. Now I want to add a field, which should be supported by schema evolution:
Compact serialization permits schemas and classes to evolve by adding
or removing fields, or by changing the types of fields. More than one
version of a class may live in the same cluster and different clients
or members might use different versions of the class.
But how can I add this class to a running Hazelcast in Kubernetes without reinstalling it?
When I register this serializer in my application
hzConfig.getSerializationConfig().getCompactSerializationConfig().setEnabled(true);
hzConfig.getSerializationConfig().getCompactSerializationConfig().register(Employee.class, "employee",
new EmployeeSerializer());
it is only used for serialization inside the Hazelcast cluster. But when I deserialize the object from a Hazelcast client that does not use this serializer, I get an exception saying that GenericRecord cannot be cast to Employee.
So I am curious whether there is a tutorial on the workflow for using compact serialization with a custom object.

How to share DB schema when there are two backend servers?

I am developing a system that consists of one front-end and two back-ends. So far I have completed one back-end, and I want to develop the other one. Both are Express servers and the database is MongoDB. I am using the mongoose module, and I want both servers to share a collection (i.e. its schema). But I have already created a model file on one server, so I am wondering whether I need to create the same model file on the server I am developing now. If I do, any later change to the model file has to be made in both places.
If there is a good way, please let me know with an example.
Thank you.
I have two answers for you: one is direct, and the other introduces the concept of microservices.
Answer 1 - Shared module (NPM or GIT)
You can create an additional project that will be an NPM lib (it can be installed via NPM or as a git submodule).
This lib will expose a factory method that accepts the mongoose options and returns the mongoose connection.
Using a single shared module makes it easier to update each backend after changing the DB schema (although it gets a bit cumbersome if you have many backends).
Answer 2 - The microservice approach
In the microservice approach, each service (backend) manages its own DB and only that DB. This means that each service needs to expose an internal API for other services to use.
So instead of sharing a lib, each service has a well-defined internal API that other services can call.
I would recommend looking into NestJS (a Node.js microservice framework) to get a better feel for how to approach microservices.
It goes without saying that I prefer Answer 2, but it's more complex and you may need to learn more before giving it a go. Still, I highly recommend it, because microservices (if implemented right) will make your code more future-proof.

Why we have redundant repository in BLoC Architecture?

In BLoC architecture we have a Data Provider and a Repository. In many examples I see that the Repository just calls the Data Provider, and it is really cumbersome to create the Repository. Why does the Repository exist? What is its purpose?
This is actually something that comes from adopting Clean Architecture, where a repository is an interface that provides data from a source to the UI.
The sources are usually Remote and Data, where Remote refers to fetching data from a remote source (this could be other apps, a REST API, or a WebSocket connection) and Data refers to a local source (something akin to a database). The idea behind having two separate classes for this is to provide adequate separation of concerns.
Imagine an app like Instagram, which manages both offline data and online data. It makes sense to handle the logic for each separately, and then use the repository, which is what your viewmodel/bloc takes in, to access the data. The bloc doesn't need to know where the data came from; it only needs the data. The repository implementation doesn't need to know what is used to make an API call; it just needs to consume the fetched data. Similarly, the repository implementation doesn't need to know where the local data is fetched from; it just needs to consume it. This way every part is adequately abstracted, and a change in one class doesn't affect the other classes, because everything is an interface.
All of this helps in testing the code better since mocking and stubbing becomes easier.
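The layering described above can be sketched as follows. The sketch is written in TypeScript rather than Dart purely for illustration, and every name in it (Post, RemotePostProvider, LocalPostProvider, CachingPostRepository, InMemoryPostProvider) is hypothetical. It assumes one particular policy, "try remote first, fall back to the local copy", which is only one of many a repository could implement.

```typescript
interface Post {
  id: number;
  caption: string;
}

// Data providers: each one knows about exactly one source.
interface RemotePostProvider {
  fetchPosts(): Promise<Post[]>;
}

interface LocalPostProvider {
  readPosts(): Promise<Post[]>;
  savePosts(posts: Post[]): Promise<void>;
}

// The repository is the only thing the bloc/viewmodel depends on.
interface PostRepository {
  getPosts(): Promise<Post[]>;
}

class CachingPostRepository implements PostRepository {
  private readonly remote: RemotePostProvider;
  private readonly local: LocalPostProvider;

  constructor(remote: RemotePostProvider, local: LocalPostProvider) {
    this.remote = remote;
    this.local = local;
  }

  async getPosts(): Promise<Post[]> {
    try {
      const posts = await this.remote.fetchPosts();
      await this.local.savePosts(posts); // refresh the offline copy
      return posts;
    } catch {
      // Remote call failed (e.g. offline): serve whatever was cached last.
      return this.local.readPosts();
    }
  }
}

// In-memory stand-in for a local database, handy for tests.
class InMemoryPostProvider implements LocalPostProvider {
  private posts: Post[] = [];

  async readPosts(): Promise<Post[]> {
    return this.posts;
  }

  async savePosts(posts: Post[]): Promise<void> {
    this.posts = posts;
  }
}
```

The bloc only ever receives a PostRepository, so in a test it can be given a repository built from fake providers, and swapping the REST client or the database later touches only the corresponding provider. When a repository merely forwards to a single provider it does look redundant, but it is the place where such a combining policy would go once a second source appears.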

Grpc microservice architecture implementation

In a microservice architecture, is it advisable to have a centralized collection of proto files and have them as dependency for clients and servers? or have only 1 proto file per client and server?
If your organization uses a monolithic code base (i.e., all code is stored in one repository), I would strongly recommend using the same file. The only alternative is to copy the file, but then you have to keep all the versions in sync.
If you share a protocol buffer file between the sender and the receiver, you can statically check that both use the same schema, especially if some new microservices will be written in a statically typed language (e.g., Java).
On the other hand, if you do not have a monolithic code base but instead have multiple repositories (e.g., one per microservice), then it is more cumbersome to share the protocol buffer files. What you can do is put them in separate repositories that can be added as a dependency to the microservices that need them. That is what I have seen at my previous company: we had multiple small API repositories for the schemas.
So, if it is easy to use the same file, I would recommend doing so instead of creating copies. There may be situations, however, where it is more practical to copy them. The disadvantage is that you always have to apply a change to all copies. In the best case, you know which files to update, and then it is just tedious. In the worst case, you do not know which files to update, and your schemas will get out of sync. You will only find out once the code is released.
Note that monolithic code base does not mean monolithic architecture. You can have microservices and still keep all the source code together in one repository. The famous example is, of course, Google. Google also heavily uses protocol buffers for their internal communication. I have not seen their source code, but I would be surprised if they do not share their protocol buffer files between services.

sharing code between microservices

I have a suite I'm working on that has a few microservices working together.
I'm using Docker to set up the environment and it works great.
My project components are as follows:
MongoDB
Node.js worker that does some processing on the DB
Node.js Rest API that serves the user
As you can probably guess, the two Node.js servers are supposed to work with the same DB.
I've defined my models in one of the projects, but I'm wondering what the best practice is when it comes to handling the second.
I would really love to avoid copy-pasting my code, because that means I have to keep both copies up to date whenever I change the schema.
Is there a good way to share the code between them?
My project looks like this:
rest-api                   // My first Node.js application
  models
    MyFirstModel.js        // This is identical to the one in the worker/models folder
    MySecondModel.js
  index.js
  package.json
  Dockerfile
worker                     // My second Node.js application
  models
    MyFirstModel.js
    MySecondModel.js
  index.js
  package.json
  Dockerfile
docker-compose.yml
Any input will be helpful.
Thanks.
Of course you can.
What you have to do is put your common files in a volume and share this volume with both Node containers.
You should set up a data volume in which you put all the files you want to share. More about this here or anywhere else by googling it.
Cheers.
The common opinion is the following: two microservices should not share the same data model. There are several articles about it and some questions related to this topic:
How to deal with shared models in micro service architectures
However, I think there are some cases where you need it and it is acceptable. Trust is a luxury even if everything is internal, so security and conformity must be considered. Any incoming object must be normalised, validated and checked before any processing is started with it, and the two services should handle the data in the same way.
The solution I used for an API service and an Admin service that shared their models:
I created three repositories: one for the API, one for the Admin, and a third for the models directory. The models have to be present in both service repositories, so I added the models repository to each of them as a git submodule. Whenever you change something in a schema you have to commit it separately, but I think it is the best way to manage changes without duplicating the code.
