Scalable architecture with Node.js

As part of my next assignment, I need to design a scalable Node.js architecture with full concurrency support. I am confused by the Kubernetes/containers concepts and could really use some help. And I cannot use any paid service, just plain DigitalOcean servers and load balancers!
What I need is a basic sketch/idea/explanation/pointers to an architecture that covers the API endpoints, data-service connectivity, and the data flows between database, server, and client.
Here is what I have in my mind:
Client <-> Nginx <-> Node.js <-> MongoDB
I believe the above is a standard setup for Node.js-based REST APIs. Now, how do I add scalability and concurrency to this?
Any help would be appreciated!

Let me give you a quick overview; if you need to know more afterwards, just ask in the comments on my answer.
You need a Docker image for each of your services:
You will need an nginx image which contains your frontend code. (https://serversforhackers.com/c/dckr-nginx-image)
You will need a Docker image which contains your backend code. (https://nodejs.org/en/docs/guides/nodejs-docker-webapp/)
You will need a simple MongoDB base image. (https://medium.com/@pablo_ezequiel/creating-a-docker-image-with-mongodb-4c8aa3f828f2)
For beginners I would go to Google Cloud Platform and set up a managed Kubernetes cluster. This is done in a minute and you will have a fully functional Kubernetes environment. (https://cloud.google.com/kubernetes-engine/docs/quickstart) - In the first year you get $300 of free usage, which is more than enough to play around and set up an environment for your assignment.
Now you will need an Ingress. The Ingress is the only access point to the Services you will later deploy on your cluster. Let's say your Ingress is listening on 203.0.113.10. When you request 203.0.113.10/customerBackend, it will route the request to the customerBackend Service (you need to define this, of course). More information here: https://kubernetes.io/docs/concepts/services-networking/ingress/#what-is-ingress
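A minimal Ingress manifest for this setup might look like the following sketch; the path, service name, and port are illustrative assumptions, not taken from the question:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  rules:
    - http:
        paths:
          - path: /customerBackend
            pathType: Prefix
            backend:
              service:
                name: customer-backend   # the Service in front of the Node.js pods
                port:
                  number: 3000
```

Requests hitting the Ingress IP under /customerBackend get forwarded to that Service, which then spreads them over its pods.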
Now you need to deploy the images you created. In Kubernetes you have the concept of Pods (see here: https://kubernetes.io/docs/concepts/workloads/pods/pod/). Normally only one container runs in each Pod. Each group of Pods (e.g. all Pods which have a Node container inside) has one so-called Service, which manages access to the Pods. Let's say you want three instances of your Node.js backend. Each of the three containers will run in an individual Pod. If you send a request to the backend, it goes through the Service, which then routes the request to one of the Pods. When you need to scale, you simply deploy more Pods; the Service automatically balances the load over the deployed Pods.
How many Pods you want deployed is defined in a so-called deployment.yaml (see: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). This is very similar to a docker-compose.yaml, but with some more attributes you can configure.
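As a rough sketch under the same assumptions (image name, labels, and port are placeholders), a deployment.yaml for three Node.js replicas plus its Service could look like this:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: customer-backend
spec:
  replicas: 3                  # scale by raising this number
  selector:
    matchLabels:
      app: customer-backend
  template:
    metadata:
      labels:
        app: customer-backend
    spec:
      containers:
        - name: node
          image: myregistry/customer-backend:1.0   # your Node.js backend image
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: customer-backend
spec:
  selector:
    app: customer-backend      # matches the pods above; traffic is balanced across them
  ports:
    - port: 3000
      targetPort: 3000
```

Changing `replicas` and re-applying the manifest is all the "scaling" step amounts to; the Service keeps balancing over whatever pods exist.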

Related

Issue with load balancing in OpenShift

I have deployed a Python API into OCP with 3 pod replicas. All the incoming requests seem to be going to only one pod while the other 2 stay idle all the time.
The configuration I have is:
haproxy.router.openshift.io/timeout: 1800s
haproxy.router.openshift.io/balance: roundrobin
haproxy.router.openshift.io/disable_cookies: "true"
I need help resolving this issue. I tried changing balance between leastconn and roundrobin; I don't see any difference.
I found the fix for my issue: I was actually making the API requests to these pods from another pod in the same namespace. The fix was to use the name of the Service in my URL instead of the OCP route URL in the API call, e.g. http://ocpservicename:port

Heroku - restart on failed health check

Heroku does not support health checks on its own. It will restart services that crashed, but there is nothing like health checks.
It sometimes happens that a service becomes unresponsive while the process is still running. In most modern cloud solutions, you can provide a health endpoint which is periodically called by the cloud hosting service; if that endpoint returns an error, or doesn't respond at all, the platform shuts that service down and starts a new one.
That seems like an industry standard these days, but I am unable to find any solution to this for Heroku. I could even use an external service with the Heroku CLI, but just calling some endpoint is not sufficient: if there are multiple instances, they all share the same URL and the load balancer calls one of them randomly, so it is possible to never hit the failed instance at all. And even when I do hit it, health checks usually work like "after 3 failed health checks in a row, restart that instance", which is highly improbable to trigger if there are 10 instances and only one of them becomes unhealthy.
Do you have any solution to this?
You are right that this is industry standard, and it's a shame that it's not provided out of the box.
I can think of 2 solutions (both involve running some extra code that does all of this):
a) use the Heroku API, which allows you to get the IP of individual dynos, and then you can call each dyno however you want
b) in each dyno instance, periodically send a request to a monitoring webserver like https://iamaalive.com/?dyno=${process.env.HEROKU_DYNO_ID}

How to paginate logs from a Kubernetes pod?

I have a service that displays logs from pods running in my Kubernetes cluster. I receive them via k8s /pods/{name}/log API. The logs tend to grow big so I'd like to be able to paginate the response to avoid loading them whole every time. Result similar to how Kubernetes dashboard displays logs would be perfect.
This dashboard, however, seems to solve the problem by running a separate backend service that loads the logs, chops them into pieces, and prepares them for the frontend to consume.
I'd like to avoid that and use only the API with its query parameters, like limitBytes and sinceSeconds, but those seem insufficient to make proper pagination work.
Does anyone have a good solution for that? Or maybe know if k8s plans to implement pagination in logs API?
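For reference, a request for one bounded chunk of logs through that API can be sketched as below; the namespace, pod name, and sizes are hypothetical, and as the question notes, byte- and time-based windows don't line up into clean pages:

```javascript
// Build the path for the pod log subresource with a bounded "chunk".
// limitBytes caps the response size; sinceSeconds bounds how far back it reads.
function logChunkPath(namespace, pod, { limitBytes, sinceSeconds } = {}) {
  const params = new URLSearchParams({ timestamps: 'true' });
  if (limitBytes) params.set('limitBytes', String(limitBytes));
  if (sinceSeconds) params.set('sinceSeconds', String(sinceSeconds));
  return `/api/v1/namespaces/${namespace}/pods/${pod}/log?${params}`;
}
```

One rough workaround is to request timestamps and advance sinceTime past the last line received on the next call, but lines sharing a timestamp can still be duplicated or lost; true pagination would need a stable cursor, which this API does not expose.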

Service Fabric - a web API in the cluster whose only job is to serve data from a reliable collection

I am new to Service Fabric and currently I am struggling to find out how to access data from a reliable collection (that is defined and initialized in a Stateful Service context) from a Web API (that is also living in the Service Fabric cluster, as a separate application). The problem is very basic and I am sure I am missing something very obvious, so apologies to the community if this sounds lame.
I have a large XML, portions of which I want to expose via Web API endpoints as the results of various queries. I searched for similar questions, but couldn't find a suitable answer.
I would be happy to see how an experienced SF developer would do such a task.
EDIT: I posted the solution I came up with below.
After reading around and observing others' issues and Azure's samples, I have implemented a solution. Posting here the gotchas I had, hoping that this will help other devs who are new to Azure Service Fabric (disclaimer: I am still a newbie in Service Fabric, so comments and suggestions are highly appreciated):
First, pretty simple - I ended up with a stateful service and a Web API stateless service in an Azure Service Fabric application:
DataStoreService - a stateful service that reads the large XMLs and stores them into a reliable dictionary (this happens in the RunAsync method).
The Web API provides an /api/query endpoint that filters the collection of XElements stored in the reliable dictionary and serializes the result back to the requester.
3 Gotchas
1) How to get your hands on the reliable dictionary data from the stateless service, i.e. how to get an instance of the stateful service from the stateless one:
// Resolve the stateful service by name and create a typed remoting proxy to it
ServiceUriBuilder builder = new ServiceUriBuilder("DataStoreService");
IDataStoreService DataStoreServiceClient = ServiceProxy.Create<IDataStoreService>(builder.ToUri(), new ServicePartitionKey("Your.Partition.Name"));
The above code already gives you the instance, i.e. you need to use a service proxy. For that purpose you need to:
define an interface that your stateful service will implement, and use it when invoking the Create method of ServiceProxy (IDataStoreService)
pass the correct partition key to the Create method. This article gives a very good intro to Azure Service Bus partitions
2) Registering replica listeners - in order to avoid errors saying
The primary or stateless instance for the partition 'a67f7afa-3370-4e6f-ae7c-15188004bfa1' has invalid address, this means that right address from the replica/instance is not registered in the system
you need to register replica listeners as stated in this post:
public DataStoreService(StatefulServiceContext context)
    : base(context)
{
    // Read the service's Config package (used when setting up the replica listeners)
    configurationPackage = Context.CodePackageActivationContext.GetConfigurationPackageObject("Config");
}
3) Service Fabric namespacing and referencing services - the ServiceUriBuilder class I took from the service-fabric-dotnet-web-reference-app. Basically you need something to generate a Uri of the form:
new Uri("fabric:/" + this.ApplicationInstance + "/" + this.ServiceInstance);
where ServiceInstance is the name of the service you want an instance of (DataStoreService in this case).
You can use Web API with OWIN to set up a communication listener and expose data from your reliable collections. See "Build a web front end for your app" for info on how to set that up. Take a look at the WordCount sample in the getting-started sample apps, which feeds a bunch of random words into a stateful service and keeps a count of the words processed. Hope that helps.

Zero downtime deploy with node.js and mongodb?

I'm looking at building a global app from the ground up that can be updated and scaled transparently to the user.
The architecture so far is very simple: each part of the application has its own process and talks to the others through sockets.
This way I can spawn as many instances as I want of each part of the application and distribute them across the globe according to my needs.
In front of the system I'll have a load balancer which will route users to their closest instance; when new code is deployed, my instances will spawn new processes running the new code, route new requests to them, and gracefully shut down.
Thank you very much for any advice.
Edit:
The question is: what is the best (and simplest) solution for achieving zero downtime when deploying Node to multiple instances?
About the app:
https://github.com/Raynos/boot for "socket" connections,
http for http requests,
mongo for database
Solutions i'm trying at the moment:
https://www.npmjs.org/package/thalassa (which manages haproxy configuration files and app instances); if you don't know it, watch this talk: https://www.youtube.com/watch?v=k6QkNt4hZWQ and be aware crowsnest is being replaced by https://github.com/PearsonEducation/thalassa-consul
Deployment with zero downtime is only possible if the data you share between old and new nodes is compatible.
So in case you change the structure, you have to build an intermediate release that can handle both the old and new data structures without using the new structure, until you have replaced all nodes with that intermediate version. Then roll out the new version.
Taking nodes in and out of production can be done with your load balancer (plus a grace period until all sessions on the node have expired; I don't know enough about your application).
