rethinkdb nodejs container in cluster environment


Do RethinkDB and a Node.js + Express app fit well in one container in a cluster environment?
The situation, in a Docker container, is as follows:
1. RethinkDB and the Node.js + Express app run in one container.
2. During boot, the Node.js app checks whether a specific database and table exist, and creates them if they don't.
Running in one Docker container works fine. The problem is that we need to cluster RethinkDB as well as maintain a specific number of replicas of the table.
Putting all of that clustering and replica logic in the Node.js app doesn't seem like a good idea. I'm stuck on how to proceed.
Help is very much appreciated.

Running rethinkdb and nodejs+express app in one container
You should typically not do this. Put RethinkDB in its own container and put your application in a separate container.
I'd recommend using docker-compose and setting up a docker-compose.yml file for your service. Make sure to use the depends_on property on the web application declaration so that Docker starts the RethinkDB container before the application container.
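One caveat worth sketching: in its plain form, depends_on only controls the order in which containers are started; it does not wait until RethinkDB is actually accepting connections. A small retry loop in the app can cover that gap. The helper below is a minimal sketch, not part of any library: the name retry and its options are made up for illustration, and connect stands for whatever function opens your database connection.

```javascript
// Hypothetical helper: retry an async operation until it succeeds
// or the attempts run out. `connect` is any function returning a promise,
// e.g. () => r.connect({ host: 'rethinkdb', port: 28015 }).
async function retry(connect, { attempts = 10, delayMs = 1000 } = {}) {
  let lastError;
  for (let i = 1; i <= attempts; i += 1) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (i < attempts) {
        // Wait before the next attempt; skip the wait after the final failure.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

With this in place the app can start before the database is ready and simply keep trying for a bounded amount of time.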
If you're spinning up your RethinkDB containers by hand you should be all set, but if you're using Swarm or some other scheduler, continue reading.
One problem RethinkDB currently has in automated, scheduled, or containerized environments is that containers are ephemeral: they may restart and come back with a different IP address. Handling this requires some additional tooling around RethinkDB to modify the config tables.
For a bit of reading I'd recommend checking out how this was achieved in Kubernetes.
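For the boot-time check described in the question, here is a minimal sketch of an idempotent "create if missing" step using the RethinkDB JavaScript driver's dbList, dbCreate, tableList and tableCreate calls. The function name ensureTable is made up for illustration, and r and conn are assumed to be the driver module and an already open connection, passed in by the caller. Note that asking for more replicas than there are servers in the cluster fails, and two app instances booting at the same moment can still race, so treat this as a starting point rather than a full solution.

```javascript
// Hypothetical helper: make sure a database and table exist.
// `r` is the RethinkDB driver module, `conn` an open connection.
async function ensureTable(r, conn, dbName, tableName, replicas = 1) {
  const dbs = await r.dbList().run(conn);
  if (!dbs.includes(dbName)) {
    await r.dbCreate(dbName).run(conn);
  }

  const tables = await r.db(dbName).tableList().run(conn);
  if (!tables.includes(tableName)) {
    // `replicas` is only applied when the table is first created here.
    await r.db(dbName).tableCreate(tableName, { replicas }).run(conn);
  }
}
```

Because the function only creates what is missing, it is safe to call on every start of the app container.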

Related

K8s: deployment patterns for node.js apps with dbs

Hi!
My problem concerns deploying node.js apps via k8s, architecture patterns, and connecting them to DBs.
I have the following node.js app services. Some of them are scaled to several app instances (like gamma), others run as a single instance. All of them are built into a single Docker image from one Dockerfile and run from it:
alpha | beta | gamma1 | gamma2
I also have non-cloud DBs, like elastic and mongo, running from their own containers with .env:
mongo | elastic
For now, my docker-compose.yml looks like a typical node.js example app, but with a common volume and a bridge network (except that I have more than one node.js app):
version: '3'
services:
  node:
    restart: always
    build: .
    ports:
      - 80:3000
    volumes:
      - ./:/code
  mongo:
    image: mongo
    ports:
      - 27017:27017
    volumes:
      - mongodb:/data/db
volumes:
  mongodb:
networks:
  test-network:
    driver: bridge
Current deployment:
All these things are running on a single heavy VPS (X CPU cores, Y RAM, Z SSD, everything loaded by 70%) from single docker-compose.yml file.
What I want to ask and achieve:
Since one VPS is already not enough, I'd like to start using k8s with rancher. So the question is about correct deployment:
For example, I have N VPSs connected within one private network, each VPS is a worker connected in one cluster, (with Rancher, of course, one of them is a master node) which gives me X cores, Y RAM, and other shared resources.
Do I need another, separate cluster (or a VPS machine in the private network that is not part of the cluster) with the DB running on it? Or could I deploy the DB in the same cluster? And what if each VPS (worker) in the cluster has only a 40GB volume and the DB grows beyond that? Do the shared resources from the workers include shared volume space?
Is it right to have one image from which I start all my apps, or, in the case of k8s, should I have a separate Docker image for each service? So if I have 5 node.js apps within one mono-repo, should I have 5 separate Docker images rather than one common image?
I understand that my question may have a complex answer, so I will be glad to see not just an answer but also links or anything else connected with the problem. It's much easier to find or google something if you know what to ask for.

A purist answer:
Each of your five services should have their own image, and their own database. It's okay for the databases to be in the same cluster so long as you have a way to back them up, run migrations, and do other database-y things. If your cloud provider offers managed versions of these databases then storing the data outside the cluster is fine too, and can help get around some of the disk-space issues you cite.
I tend to use Helm for actual deployment mechanics as a way to inject things like host names and other settings at deploy time. Each service would have its own Dockerfile, its own Helm chart, its own package.json, and so on. Your CI system would build and deploy each service separately.
A practical answer:
There's nothing technically wrong with running multiple containers off the same image doing different work. If you have a single repository and a single build system now, and you don't mind a change in one service causing all of them to redeploy, this approach will work fine.
Whatever build system your repository has now, if you go with this approach, I'd put a single Dockerfile in the repository root and probably have a single Helm chart to deploy it. In the Helm chart Deployment spec you can override the command to run with something like
# This fragment appears under containers: in a Deployment's Pod spec
# (this is Helm chart, Go text/template templated, YAML syntax)
image: {{ .Values.repository }}/{{ .Values.image }}:{{ .Values.tag }}
command: ["node", "service3/index.js"]
Kubernetes's terminology here is slightly off from Docker's, particularly if you use an entrypoint wrapper script. Kubernetes command: overrides a Dockerfile ENTRYPOINT, and Kubernetes args: overrides CMD.
In either case:
Many things in Kubernetes allocate infrastructure dynamically. For example, you can set up a horizontal pod autoscaler to set the replica count of a Deployment based on load, or a cluster autoscaler to set up more (cloud) instances to run Pods if needed. If you have a persistent volume provisioner then a Kubernetes PersistentVolumeClaim object can be backed by dynamically allocated storage (on AWS, for example, it creates an EBS volume), and you won't be limited to the storage space of a single node. You can often find prebuilt Helm charts for the databases; if not, use a StatefulSet to have Kubernetes create the PVCs for you.
Make sure your CI system produces images with a unique tag, maybe based on a timestamp or source control commit ID. Don't use ...:latest or another fixed string: Kubernetes won't redeploy on update unless the text of the image: string changes.
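As a rough illustration of the unique-tag advice, a CI step could build the image reference from a UTC timestamp plus the commit ID. This is only a sketch: the function name imageTag, the tag layout, and the registry name in the usage line are assumptions, not something your CI system provides.

```javascript
// Hypothetical helper: build a unique image reference such as
// registry.example.com/web:20240102T030405Z-abcdef123456
function imageTag(repository, image, commitId, now = new Date()) {
  // 2024-01-02T03:04:05.000Z -> 20240102T030405Z (only characters valid in a tag)
  const stamp = now
    .toISOString()
    .replace(/[-:]/g, '')
    .replace(/\.\d+Z$/, 'Z');
  // Shorten the commit ID so the tag stays readable.
  return `${repository}/${image}:${stamp}-${commitId.slice(0, 12)}`;
}
```

For example, imageTag('registry.example.com', 'web', commitSha) yields a different string on every build, so the image: line in the Deployment changes and Kubernetes rolls out the new version.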
Running multiple clusters is tricky in a lot of ways. In my day job we have separate clusters per environment (development, pre-production, production), but the application itself runs in a single cluster and there is no communication between clusters. If you can manage the storage then running the databases in the same cluster is fine.
Several Compose options don't translate well to Kubernetes. I'd especially recommend removing the volumes: that bind-mount your code into the container, and validating that your image runs correctly, before you do anything Kubernetes-specific. If you're replacing the entire source tree in the image then you're not really running the image, and it'll be much easier to debug locally. In Kubernetes you also have almost no control over networks:, but they're not really needed in Compose either.

I can't answer the part of your question about the VPS machine setup, but I can make some suggestions about the image setup.
While you have asked this question about a node app, it applies to more than just node.
Regarding the docker image and having a common image or separate ones: generally it's up to you and/or your company whether you have a common or separate image.
There are pros and cons to both methods:
You could bake the code into the image and have a different image per app, but if you run into any security vulnerabilities, you have to patch, rebuild, and redeploy all the images. If you had 5 apps all using the same library, but that library was not in the base image, then you would have to patch it 5 times, once in each image, then rebuild and redeploy each image.
Or you could use a single base image that includes the libraries needed and mount the codebase in (for example as a configmap). That base image would never need to change unless you had to patch something in the underlying operating system. The vulnerability mentioned in the paragraph above would only need to be patched in the base image, and the affected pods could simply be restarted (no need to redeploy).

Running NodeJS server in production

I have a React + Node app which I need to deploy. I am using nginx to serve my front end, but I am not sure what to use to keep my Node.js server running in production.
The project is hosted on a Windows VM. I cannot use pm2 due to license issues. I don't know whether running the server with nodemon in production is a good idea. I have never deployed an app to production, so I don't know the appropriate methods.

You may consider forever or supervisor.
Check this blog post on the same.

You can also use Docker. You can create multiple Docker containers that each run your Node server. At the nginx level on your host machine you can then configure load balancing, which routes traffic evenly across the Node containers. This improves availability and scalability: under heavy traffic you just increase the number of Node containers as required. I'd guess 2 containers will be enough to handle traffic initially (depending on your use case, though).
Note: You can also use forever or supervisor, as suggested by @Rajesh Gupta, inside your Docker containers to run the Node server. We use PM2 for that.
If you have a database then you can create a separate Docker container for the database and map it to a volume on your host machine.
You can learn about docker from here.
Also you can read about load balancing in nginx from here.
Furthermore, to improve availability you can add a caching layer between nginx and the Docker containers. Varnish is the best caching service I have used to date.
PS: We use a similar but more advanced architecture to run our e-commerce application, which generates 5-10k orders daily. So this is a tested approach with zero downtime.

Try to dockerize the whole app including the db, caching server (if any) etc.
Here are some examples why:
You can launch a fully capable development environment on any computer supporting Docker; you don't have to install libraries, dependencies, download packages, mess with config files etc.
The working environment of the application remains consistent across the whole workflow. This means the app runs exactly the same for developer, tester, and client, be it on development, staging or production server. In short, Docker is the counter-measure for the age-old response in the software development: "Strange, it works for me!"
Every application requires a specific working environment: pre-installed applications, dependencies, databases, everything in a specific version. Docker containers allow you to create such environments. Unlike a VM, however, a container doesn't hold the whole operating system, just applications, dependencies, and configuration. This makes Docker containers much lighter and faster than regular VMs.

What is best for node app with mongodb using docker container?

What is best for a node app with mongodb using docker containers?
Both node and mongodb in the same docker container, or separate, linked containers for the node app and mongodb?
I have tried both approaches and both of them worked for me. For the first case I took an ubuntu-based image, installed node and mongodb using a Dockerfile, and started that container with both environments in it. For the second case, I used the node and mongodb base images and ran them as separate containers. But I'm confused about which approach I should select.

Using separate containers gives you several advantages. The first is that you can scale them independently of each other.
It also lets you use more lightweight images, since each one only requires a very specific set of dependencies.
It also gives you a more flexible environment for the future. If you ever want to add more containers that depend on only one of these two, you reduce the number of interactions between the components. If both were in the same container, it would not be possible to give another container access to only MongoDB, for example. Likewise, if you expand your Node application, you can let it connect to another backend container without also having to couple that backend server with MongoDB.
TL;DR: use the approach with two separate containers. That is what Docker is meant for, and it provides the most flexibility.
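With separate containers, the Node app reaches MongoDB over the network by the other container's name instead of localhost. A minimal sketch of reading that location from the environment follows; the variable names MONGO_HOST, MONGO_PORT and MONGO_DB, the defaults, and the function name mongoUrl are all assumptions chosen for illustration.

```javascript
// Hypothetical helper: build the MongoDB connection string from
// environment variables, defaulting to a container/service named "mongo".
function mongoUrl(env = process.env) {
  const host = env.MONGO_HOST || 'mongo';
  const port = env.MONGO_PORT || '27017';
  const db = env.MONGO_DB || 'app';
  return `mongodb://${host}:${port}/${db}`;
}
```

Moving the database to another instance or server then only means changing an environment variable on the Node container, not rebuilding its image.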

Considering scalability, it would be ideal to use separate containers for Node and MongoDB.
That gives you flexibility, for example if you want to migrate only your MongoDB container to some other instance or server.

Can containers in Swarm Mode be automatically raised when the load is high?

So we are getting started with Docker containers and Swarm mode on Windows. Currently, we have installed Docker for Windows, enabled Swarm mode in single-node mode, scaled services, etc.
Now we are looking for a way to automatically create new containers on the node(s) when the load on the existing containers is high.
There is a way to monitor the load on the nodes, for example whether memory usage on a node is high. I'm aware that nodes can be created automatically to host new containers when the load on the nodes is high, referring to this example of doing this. But is there a way to monitor the containers on a container host/swarm?
We are aiming to host a web app running on a microsoft/iis image, which currently works fine. But we wanted to know if there is a way to handle possible incoming load without the system failing and without having to manually create new containers.
The current environment is a local test VM on our servers, and the goal is to do all of this in an MS Azure VM that also runs Windows Server 2016.
Also, what would you suggest as a tool/solution for generating traffic load on a website? We will have to test the whole concept somehow.
I also wanted to add that I am a newbie with Docker and Swarm mode, so there is a possibility that I'm wording this the wrong way.
Any suggestions will be appreciated! :)

Microservices on docker - architecture

I am building a microservices project using Docker.
One of my microservices is a listener that should get data from a varying number of sources.
What I'm trying to achieve is the ability to start and stop getting data from sources dynamically.
For example, in this drawing I have 3 sources connected to 3 Docker containers.
My problem starts when I need to create another container because a new source is available. In this example, let's say source #4 is now available and I need to get its data (I know when a new source becomes available), but I want the scaling to happen automatically (with source #4's information for listening).
I came up with two solutions, each with advantages and disadvantages:
1) Create a pool of a large number of containers running the listener service, and every time a new source is available send a message (using RabbitMQ, though I think that is less relevant) to an idle container to start getting data.
With this solution I'm a little worried about the memory consumption of containers running for no reason, but it is not a very complex solution.
2) Whenever a new source becomes available, create a new container (with different environment variables).
With this solution I have a problem creating the container.
At the moment I have this one working, but the service that starts the containers (let's call it the manager) is just a regular Node.js application that executes commands on the same server, and I need it to run inside a Docker container as well.
So the problem here is that I couldn't manage to create an SSH connection from the main container to create my new container.
I am not sure that either of my solutions is on track and would really appreciate any suggestions for my problem.

Your question is a bit unclear, but if you just want to scale a service horizontally you should look into a container orchestration technology that allows that, for example Kubernetes. I recommend reading the introduction.
All you would need to do to add more service containers is update the number of desired replicas in the Deployment configuration. For more information read this.
Using Kubernetes (k8s for short) you will benefit from deployment automation, self-healing, service discovery, and load balancing in addition to horizontal scalability.
There are other orchestration alternatives, too (e.g. Docker Swarm), but I would recommend looking into Kubernetes first.
Let me know if that solves your issue or if you have additional requirements that weren't so clear in your original question.
Links for your follow up questions:
1 - Run kubectl commands inside container
2 - Kubernetes autoscaling based on custom metrics
3 - Env variables in Pods
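To make the "replicas plus environment variables" idea concrete for a Node.js manager service, here is a minimal sketch that builds a Deployment manifest as a plain object. The function name listenerDeployment, the image name, and the SOURCE_ID variable are made up for illustration; the object follows the standard apps/v1 Deployment layout, and its JSON form can be fed to kubectl apply -f.

```javascript
// Hypothetical helper: build an apps/v1 Deployment manifest for one listener.
function listenerDeployment(name, image, replicas, env = {}) {
  const labels = { app: name };
  return {
    apiVersion: 'apps/v1',
    kind: 'Deployment',
    metadata: { name, labels },
    spec: {
      replicas, // desired number of identical Pods
      selector: { matchLabels: labels },
      template: {
        metadata: { labels },
        spec: {
          containers: [{
            name,
            image,
            // Kubernetes expects env as a list of { name, value } string pairs.
            env: Object.entries(env).map(([key, value]) => ({
              name: key,
              value: String(value),
            })),
          }],
        },
      },
    },
  };
}

// One Deployment per source, each with its own environment.
const manifest = listenerDeployment('listener-source-4', 'example/listener:1.0.0', 1, {
  SOURCE_ID: 4,
});
console.log(JSON.stringify(manifest, null, 2));
```

Creating one small Deployment per source keeps each listener's configuration separate, while scaling a single listener is just a change to replicas.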
