Does anyone know why we are experiencing system load peaks on our Kubernetes master node? I thought the master node does nothing except monitor our agent nodes.
Each time we have a system load peak of 1.8-2 on our dual-core machine, I see in the kube-controller-manager log that the master tries to start 3 things:
controllermanager.go:373] Attempting to start disruption controller
controllermanager.go:385] Attempting to start petset
controllermanager.go:460] Attempting to start certificates
Our Kubernetes version is 1.4.6 and the cluster was created via the Azure portal. We can see the system load peaks via Datadog monitoring.
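From what I can tell, those "Attempting to start ..." lines appear when the controller manager (re)starts its controller loops, so I am trying to confirm whether the process is restarting each time the load spikes, roughly like this (the pod name is a placeholder, and this assumes the control plane components show up as pods in kube-system):

kubectl get pods -n kube-system -o wide
kubectl logs -n kube-system <kube-controller-manager-pod> --previous

The first command shows the RESTARTS count for the controller manager pod; the second shows its log from before the most recent restart.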
I have a few different containerized web apps running on Azure Container Instances (ACI). I recently noticed that some of these containers just restart for no apparent reason about once a month or so. Since the restarts are on different apps/containers each time, I have no reason to suspect that the apps are crashing.
The restart policy on all of them is set to "Always".
Is it normal or expected for the containers to restart even when there is no app crash? Perhaps when Azure does maintenance on the host machines, or maybe a noisy neighbor on the same host causes the container to be moved to another host?
(I am in the process of adding a log analytics workspace so that I can view the logs before the restart. Since the restarts are so infrequent, I wouldn't have any logs to look at for quite some time.)
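In the meantime I can at least check the restart count and the previous state of the container group with the Azure CLI (resource group and container group names below are placeholders):

az container show -g myResourceGroup -n myContainerGroup --query "containers[].instanceView.restartCount"
az container show -g myResourceGroup -n myContainerGroup --query "containers[].instanceView.previousState"
az container logs -g myResourceGroup -n myContainerGroup

The previous state includes the exit code of the last run, which should help distinguish an app crash from a platform-initiated restart.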
Same here
I've contacted MS support and got the response that, by design, ACI maintenance can restart the hosts, so you can't expect ACI containers to run uninterrupted for weeks.
Their recommendation is to:
adapt your app to be resilient (so you don't care about restarts)
use AKS to gain full control over lifecycle
use VM as host for your app with appropriate policies (no updates / restarts...)
For me this was a deal-breaker, since I couldn't find this info anywhere. I ended up with a VM.
We are currently running an AKS Kubernetes cluster. This cluster uses a virtual node, which is meant to handle bursting load when requests per second reach a specific limit.
We are also exploring APM tools like New Relic. New Relic provides integration with Kubernetes via a DaemonSet.
Query
As per my understanding, a DaemonSet runs a pod on each node.
What will be the case for the virtual node and the DaemonSet?
Will the DaemonSet run 24/7 on the virtual node? If yes, how can we reduce cost and monitor the pods running on the virtual node?
DaemonSets are currently not supported on AKS virtual nodes (as per the documentation).
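Worth adding: the virtual node is tainted so that only pods which explicitly tolerate that taint get scheduled there, which is also why an ordinary DaemonSet (such as the New Relic one) stays on the regular VM nodes. You can confirm this on your own cluster; the node name below is the usual default but may differ:

kubectl get nodes
kubectl describe node virtual-node-aci-linux | grep -i taints
# typically reports something like: virtual-kubelet.io/provider=azure:NoSchedule

So you should not end up paying for an always-running monitoring pod on the virtual node.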
The error I am getting after running kubectl cluster-info:
Rifats-MacBook-Pro:~ rifaterdemsahin$ kubectl cluster-info
Kubernetes master is running at https://xxx-xxx-aks-yyyy.hcp.westeurope.azmk8s.io:443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: net/http: TLS handshake timeout
There is some issue on the Azure side. Please refer to this similar issue.
I suggest you try again; if it still does not work, you could open a support ticket or give feedback to Azure.
The solution to this one for me was to scale the nodes in my cluster from the Azure Kubernetes Service blade in the web console (NOT your local terminal, since you can't connect with kubectl).
After scaling up by one node (and back down to save costs), I was back working again.
Workaround / Solution
Log into the Azure Console — Kubernetes Service blade.
Scale your cluster up by 1 node.
Wait for scale to complete and attempt to connect (you should be able to).
Scale your cluster back down to the normal size to avoid cost increases.
Total time it took me ~2 mins.
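If you prefer the CLI over the portal blade, the equivalent scale operation goes through Azure Resource Manager rather than the cluster's own API server, so it also works while kubectl cannot connect. Resource group, cluster name, and node counts below are placeholders:

az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 4
# wait for the operation to complete and check that kubectl connects again
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3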
More Background Info on the Issue
I added this to the full ticket description write-up that I posted over here (if you want more info, have a read):
'Unable to connect Net/http: TLS handshake timeout' — Why can't Kubectl connect to Azure AKS server?
I have a VM scale set for my Azure Service Fabric application deployed in Azure. I need to run a RabbitMQ server on each virtual machine in my VM scale set when it starts (this is especially relevant when I scale up my cluster and a new VM is created). In other words, I want the queue to run automatically. Is it possible to do the following steps after a VM has been launched:
Check if RabbitMQ is already installed.
If not, download and install it from a specified URL.
If it has already been installed, just run it.
I guess this can be solved with a virtual machine scale set automation script, but I am not sure. Any ideas or suggestions?
You could do this using a VM custom script extension. An extension runs on every new VM when a scale set is deployed or when it scales out.
Your extension could do the checks, install and run RabbitMQ, and perhaps register it as a service so RabbitMQ runs again if the VM is rebooted, etc.
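A rough sketch of what that could look like with the Azure CLI; the resource group, scale set name, and script URL are placeholders, and the Windows custom script extension is used here on the assumption that the Service Fabric scale set runs Windows VMs:

az vmss extension set \
  --resource-group myResourceGroup \
  --vmss-name myScaleSet \
  --publisher Microsoft.Compute \
  --name CustomScriptExtension \
  --settings '{"fileUris": ["https://mystorage.blob.core.windows.net/scripts/install-rabbitmq.ps1"], "commandToExecute": "powershell -ExecutionPolicy Unrestricted -File install-rabbitmq.ps1"}'

The install-rabbitmq.ps1 script would be your own and would implement the check/install/start logic you described. Depending on the scale set's upgrade policy, you may also need to apply the update to existing instances (for example with az vmss update-instances) so the extension runs on VMs created before it was added.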
The following articles provide more details on deploying apps with scale sets:
Deploy your application on virtual machine scale sets
How are Applications deployed on VM Scale Sets?
I am setting up a multi-container application on a Mesos cluster on Azure using Azure Container Service, and I am currently stuck on linking the containers.
Brief info about my setup:
- Mesos cluster is deployed on Azure using Azure container service
- It's a 3 container application - A, B and C
- B is dependent on A, and C is dependent on A & B
- A is deployed currently
How can I link the above containers?
Thanks,
Suraj
If by linking you mean Docker's --link, then that's a deprecated practice; inter-container communication should be done using Docker networks and port mappings.
For DC/OS you have a few different ways to achieve this (also called service discovery). I have written a blog post explaining these different tools with examples: http://blog.itaysk.com/2017/04/28/dcos-service-discovery-and-load-balancing-by-example
If you don't want to read through that long post and are just looking for a recommendation: try using VIPs.
When creating the application (either from Marathon or the DC/OS UI), look for the 'VIP' setting. Enter an IP there (it can be a made-up IP) and a port. Your service will be discoverable under this IP:Port.
More on VIPs: https://dcos.io/docs/1.9/networking/load-balancing-vips/virtual-ip-addresses/
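For example, if you give container A the VIP 10.1.2.3:8080 (a made-up address and port), then containers B and C simply connect to that address, and DC/OS load-balances to wherever A's instances are actually running:

curl http://10.1.2.3:8080/
# with a named VIP instead of an IP, the address has the form:
# curl http://<service-name>.marathon.l4lb.thisdcos.directory:8080/

This is what makes the dependency ordering manageable: B and C only need to know A's VIP, not which agent or port A happens to be running on.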