Kubernetes/Istio - Restart a service on error - node.js

I have an application deployed in Kubernetes. I am using the Istio service mesh. One of my services needs to be restarted when a particular error occurs. Is this something that can be achieved using Istio?
I don't want to use a cronjob. Also, making the application restart itself seems like an anti-pattern.
The application is a Node.js app built with Fastify.

Istio is a service mesh that manages network traffic between services; it doesn't manage container lifecycles. While I was writing this answer, David Maze made a very good point in a comment:
Istio is totally unrelated to this. Another approach could be to use a Kubernetes liveness probe if the cluster can detect the pod is unreachable; but if you're going to add a liveness hook to your code, the Kubernetes documentation also endorses just crashing on unrecoverable failure.
The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.
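Putting that together for a Fastify app, a minimal sketch could look like the following. The route path, port, and probe timings are illustrative assumptions, not values from the question: expose a lightweight health route, and exit the process on a truly unrecoverable error, so the kubelet restarts the container either when the probe fails or when the process has already died.

```js
// server.js — Fastify app with a simple liveness endpoint (illustrative sketch)
const fastify = require('fastify')({ logger: true });

let healthy = true; // flip to false when the app detects the unrecoverable condition

fastify.get('/healthz', async (request, reply) => {
  if (!healthy) {
    return reply.code(500).send({ status: 'error' });
  }
  return { status: 'ok' };
});

// Per the Kubernetes docs' advice quoted above: on a truly unrecoverable
// failure, just crash so the kubelet restarts the container.
process.on('uncaughtException', (err) => {
  fastify.log.error(err, 'unrecoverable error, exiting');
  process.exit(1);
});

fastify.listen({ port: 3000, host: '0.0.0.0' });
```

The matching probe on the Deployment's container spec might then look roughly like this (names, image, and timings are placeholders):

```yaml
# Deployment container spec snippet — kubelet restarts the container after 3 failed checks
containers:
  - name: my-fastify-app                                 # placeholder container name
    image: registry.example.com/my-fastify-app:1.0.0     # placeholder image
    ports:
      - containerPort: 3000
    livenessProbe:
      httpGet:
        path: /healthz
        port: 3000
      initialDelaySeconds: 10
      periodSeconds: 15
      failureThreshold: 3
```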
See also:
health checks on the cloud - GCP example
creating custom readiness/liveness probes
customizing liveness probes

Related

Kubernetes Pod - read ECONNRESET when requesting external web services

I have a bare-metal Kubernetes cluster running on separate LXC containers as nodes on Ubuntu 20.04. It has the Istio service mesh configured and roughly 20 application services running on it (ServiceEntries are created for the external services that need to be reached). I use MetalLB for the gateway's external IP provisioning.
I have an issue with pods making requests outside the cluster (egress), specifically reaching some external web services such as the Cloudflare API or the Sendgrid API for REST calls. DNS is working fine, as the hosts I try to reach are indeed resolvable from the pods (containers). Only the first request a pod makes to the internet succeeds; after that, random read ECONNRESET errors occur on the REST API calls, and occasionally connect ETIMEDOUT, though less frequently than the first error. Making the same requests from the nodes themselves to the internet works without any problems, and pods communicate with each other through Kubernetes Services just fine.
My guess is that something is not configured correctly and the packets are not properly delivered back to the pod, but I can't find any relevant help on the internet and I'm a bit lost on this one. I'm grateful for any help and will happily provide more details if needed.
Thank you all once again!
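For reference, the kind of ServiceEntry described above might look roughly like the sketch below; the host, port, and resource name are illustrative assumptions, not taken from the actual cluster.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: sendgrid-api         # placeholder name
spec:
  hosts:
    - api.sendgrid.com       # external host the pods call
  location: MESH_EXTERNAL    # traffic leaves the mesh
  ports:
    - number: 443
      name: tls
      protocol: TLS
  resolution: DNS
```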

Azure Container Instance - restarting once in a while for no apparent reason

I have a few different containerized web apps running on Azure Container Instances (ACI). I recently noticed that some of these containers restart for no apparent reason about once a month. Since the restarts hit different apps/containers each time, I have no reason to suspect that the apps themselves are crashing.
The restart policy on all of them is set to "Always".
Is it normal or expected for the containers to restart even when there is no app crash? Perhaps when Azure does maintenance on the host machines or maybe a noisy neighbor on the same host causing a pod movement to another host?
(I am in the process of adding a log analytics workspace so that I can view the logs before the restart. Since the restarts are so infrequent, I wouldn't have any logs to look at for quite some time.)
Same here
I contacted MS support and got the response that, by design, ACI maintenance can restart the hosts, so you can't expect ACI to run uninterrupted for weeks.
Their recommendation is to:
adapt your app to be resilient (so you don't care about restarts)
use AKS to gain full control over the lifecycle
use a VM as the host for your app with appropriate policies (no updates / restarts...)
For me this was a deal-breaker, since I couldn't find this information anywhere. I ended up on a VM.
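On the "adapt your app to be resilient" point, a minimal sketch of what that can mean for a Node.js container is below. The port, timeout, and the assumption that the platform sends SIGTERM before stopping the container are illustrative, not something ACI guarantees.

```js
// graceful-shutdown.js — finish in-flight requests when the platform stops the container
const http = require('http');

const server = http.createServer((req, res) => {
  res.end('ok');
});

server.listen(process.env.PORT || 8080);

// Many platforms send SIGTERM before stopping/restarting a container:
// stop accepting new connections, drain in-flight requests, then exit cleanly.
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');
  server.close(() => process.exit(0));
  // Safety net: force-exit if connections don't drain in time.
  setTimeout(() => process.exit(1), 10000).unref();
});
```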

How do I make my microservices accessible only through the API gateway

I would like to know how I can protect my Node.js microservices so that only the API gateway can access them. Currently each microservice is exposed on its own port on my machine and can be accessed directly without passing through the gateway. That defeats the purpose of the gateway as the single entry point into the system for secure and authorized information exchange.
The microservices and the gateway are currently built with Nodejs and express.
The plan is to eventually deploy it on the cloud (digital ocean). I'd appreciate any response. Thanks.
Kubernetes can solve this problem.
Kubernetes manages containers, where each container can be a microservice.
When wiring your microservices to your gateway, you can choose to allow outside connections only to the gateway. You would have a load balancer / nginx in your Kubernetes cluster that routes external requests to your gateway service.
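A hedged sketch of that setup: the gateway gets a Service that is reachable from outside the cluster, while each microservice gets a ClusterIP Service that is only reachable from inside it. The names and ports below are placeholders, not part of the original setup.

```yaml
# Only the gateway is exposed outside the cluster
apiVersion: v1
kind: Service
metadata:
  name: api-gateway
spec:
  type: LoadBalancer         # or keep it ClusterIP and expose it via an Ingress/nginx
  selector:
    app: api-gateway
  ports:
    - port: 80
      targetPort: 3000
---
# Microservices stay ClusterIP: reachable only from inside the cluster
apiVersion: v1
kind: Service
metadata:
  name: orders-service       # placeholder microservice
spec:
  type: ClusterIP
  selector:
    app: orders-service
  ports:
    - port: 80
      targetPort: 3000
```

On DigitalOcean's managed Kubernetes, a Service of type LoadBalancer provisions a DigitalOcean load balancer for you; for stricter isolation you could additionally add a NetworkPolicy that only accepts traffic from the gateway pods.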
Kubernetes has many other features such as:
service discovery: each of your microservices' IPs could change on restart/deployment unless you assign static IPs to all of them; service discovery solves this problem.
high availability & horizontal scaling & zero downtime: you can configure several replicas for each of your services, so when one replica goes down the others keep handling requests. This also helps with CI/CD: with something like GitHub Actions you can build a smooth pipeline, and when you deploy a new Docker image (update a microservice), Kubernetes launches a new container first and only then kills the old one, so you get zero downtime (see the Deployment sketch below).
If you are working with microservices, you should definitely take a deep dive into Kubernetes.
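As a rough illustration of the replicas / zero-downtime point above (the names, image, and numbers are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service
spec:
  replicas: 3                  # several copies for availability
  selector:
    matchLabels:
      app: orders-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # bring a new pod up before taking an old one down
      maxSurge: 1
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      containers:
        - name: orders-service
          image: registry.example.com/orders-service:1.2.3   # placeholder image
          ports:
            - containerPort: 3000
```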

AppDynamics - monitoring of services

I have a problem with my health rule configuration. All I want is a health rule that checks whether a service is running or not. I have two types of services:
IIS
Standalone services
The problem is that some services are flagged as critical due to a health rule violation. For example, I have two identical services on two hosts, and the only difference is that one of them is used less often. Because of the lack of activity on this service, AppDynamics marks it as critical.
Most probably I have done something wrong. Any ideas?
I'm struggling with this as an additional task. I tried the AppDynamics community website but found nothing that pointed me to a solution.
Here's my health rule configuration:
If you only want to monitor the worker processes of your IIS and standalone services, you can use the CLR Crash event in your policy configuration.
AppDynamics automatically creates CLR Crash events if your IIS or standalone services crash.
You can find the details of CLR Crash Events:
https://docs.appdynamics.com/display/PRO45/Monitor+CLR+Crashes
Also, Sample policy configuration:
Policy Configuration Screen

Link containers in Azure Container Service with Mesos & Marathon

I'm trying to deploy a simple WordPress example (WordPress & MySQL DB) on Microsoft's new Azure Container Service with Mesos & Marathon as the underlying orchestration platform. I already ran this on the services offered by Google (Kubernetes) and Amazon (ECS) and thought it would be an easy task on ACS as well.
I have my Mesos cluster deployed and everything is up and running. Deploying the MySQL container isn't a problem either, but when I deploy my WordPress container I can't get a connection to my MySQL container. I think this might be because MySQL runs on a different Mesos agent?
What I tried so far:
Using the Mesos DNS to get ahold of the MySQL container host (for now I don't really care which container I get ahold of). I set the WORDPRESS_DB_HOST environment var to mysql.marathon.mesos and specified the host of MySQL container as suggested here.
I created a new rule for the Agent Load Balancer and a probe for port 3306 in Azure itself; this worked, but it seems like a very complicated way to achieve something so simple. In Kubernetes and ECS, links can simply be defined by using the container name as the hostname.
Another question that came up: what is the difference in Marathon between setting the port in the Port Mappings section and in the Optional Settings section? (See screenshot attached.)
Update: If I SSH into the master node I can resolve mysql.marathon.mesos with dig; however, I can't get a connection to work from within another container (in my case the WordPress container).
So there are essentially two questions here: one around stateful services on Marathon, the other around port management. Let me first clarify that neither has anything to do with Azure or ACS in the first place; they are both Marathon-related.
Q1: Stateful services
Depending on your requirements (development/testing or prod) you can either use Marathon's persistent volumes feature (simple but no automatic failover/HA for the data) or, since you are on Azure, a robust solution like I showed here (essentially mounting a file share).
Q2: Ports
The port mapping you see in the Marathon UI screenshot is only relevant if you launch a Docker image and want to explicitly map container ports to host ports in BRIDGE mode; see the docs for details.
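For illustration, a BRIDGE-mode app definition (in the Marathon JSON format of that era) might look roughly like the sketch below; the image tag, ports, resources, and the WORDPRESS_DB_HOST value are assumptions based on the question, not a verified working config.

```json
{
  "id": "/wordpress",
  "cpus": 0.5,
  "mem": 512,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "wordpress:latest",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 80, "hostPort": 0, "servicePort": 10080, "protocol": "tcp" }
      ]
    }
  },
  "env": {
    "WORDPRESS_DB_HOST": "mysql.marathon.mesos:3306"
  }
}
```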
