acs-engine with custom vnet dns: error server misbehaving - azure

With acs-engine I have created a k8s cluster with a custom vnet. The cluster was deployed and the pods are running.
When I run kubectl get nodes or kubectl get pods I get a reply, but when I use kubectl exec to get into a pod or run helm install I get the error:
Error from server: error dialing backend: dial tcp: lookup k8s-agentpool on 10.40.1.133:53: server misbehaving
I used the following json file to create the arm templates:
acs-engine.json
Without a custom vnet the default Azure DNS is used; with a custom vnet our own DNS servers are used. Is the only option to register all masters and agents with the DNS server?

Resolved it by adding all cluster nodes to our DNS servers.
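To confirm the records are in place, you can check that an agent hostname now resolves against the custom DNS server shown in the error (10.40.1.133); the node name below is illustrative, assuming the usual acs-engine k8s-agentpool-<id>-<n> naming:
nslookup k8s-agentpool-12345678-0 10.40.1.133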

Related

azure devops with self Hosted agent : can't deploy to aks cluster

I want to create an Azure DevOps release pipeline that builds a Docker image and deploys it to an AKS cluster.
The build and push to ACR work well, but the deployment to AKS doesn't.
These are the results after running the pipeline:
And these are the error logs:
2023-01-08T22:20:48.7666031Z ##[section]Starting: deploy
2023-01-08T22:20:48.7737773Z ==============================================================================
2023-01-08T22:20:48.7741356Z Task : Deploy to Kubernetes
2023-01-08T22:20:48.7745738Z Description : Use Kubernetes manifest files to deploy to clusters or even bake the manifest files to be used for deployments using Helm charts
2023-01-08T22:20:48.7750005Z Version : 0.212.0
2023-01-08T22:20:48.7752721Z Author : Microsoft Corporation
2023-01-08T22:20:48.7755489Z Help : https://aka.ms/azpipes-k8s-manifest-tsg
2023-01-08T22:20:48.7757618Z ==============================================================================
2023-01-08T22:20:49.2976400Z Downloading: https://storage.googleapis.com/kubernetes-release/release/stable.txt
2023-01-08T22:20:49.8627101Z Found tool in cache: kubectl 1.26.0 x64
2023-01-08T22:20:50.6940515Z ==============================================================================
2023-01-08T22:20:50.6942077Z Kubectl Client Version: v1.26.0
2023-01-08T22:20:50.6943172Z Kubectl Server Version: v1.23.12
2023-01-08T22:20:50.6944430Z ==============================================================================
2023-01-08T22:20:50.7161602Z [command]/azp/_work/_tool/kubectl/1.26.0/x64/kubectl apply -f /azp/_work/_temp/Deployment_acrdemo2ss-deployment_1673216450713,/azp/_work/_temp/Service_acrdemo2ss-loadbalancer-service_1673216450713 --namespace dev
2023-01-08T22:20:50.9679948Z Unable to connect to the server: dial tcp: lookup tfkcluster-dns-074e9373.hcp.canadacentral.azmk8s.io on 192.168.1.1:53: no such host
2023-01-08T22:20:50.9771688Z ##[error]Unable to connect to the server: dial tcp: lookup tfkcluster-dns-074e9373.hcp.canadacentral.azmk8s.io on 192.168.1.1:53: no such host
2023-01-08T22:20:50.9809463Z ##[section]Finishing: deploy
This is my service connection:
Unable to connect to the server: dial tcp: lookup xxxx on
192.168.1.1:53: no such host
It appears that you are using a private cluster (the Private Cluster option was enabled while creating the AKS cluster).
kubectl is the Kubernetes command-line client; it connects to the cluster from outside, and we can't reach a private cluster externally.
However, this option can't be disabled after the cluster has been created. We would need to delete the cluster and create a new one with the "Private Cluster" option disabled.
Alternatively, you can set up another self-hosted agent in the same VNet as the cluster, so that it has access to both AKS and Azure Pipelines.
See Options for connecting to the private cluster
The API server endpoint has no public IP address. To manage the API server, you'll need to use a VM that has access to the AKS cluster's Azure Virtual Network (VNet). There are several options for establishing network connectivity to the private cluster.
Create a VM in the same Azure Virtual Network (VNet) as the AKS cluster.
Use a VM in a separate network and set up Virtual network peering. See the section below for more information on this option.
Use an Express Route or VPN connection.
Use the AKS command invoke feature (see the example below).
Use a private endpoint connection.
Creating a VM in the same VNet as the AKS cluster is the easiest option. Express Route and VPNs add costs and require additional networking complexity. Virtual network peering requires you to plan your network CIDR ranges to ensure there are no overlapping ranges.
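If recreating the cluster or adding a jump-box VM isn't practical, the command invoke option above lets you run kubectl through the managed control plane without any network line of sight to the API server. A minimal sketch, assuming the cluster is named tfkcluster (taken from the DNS name in the log); the resource group and manifest name are placeholders:
# resource group and manifest are placeholders; "tfkcluster" is assumed from the log above
az aks command invoke \
    --resource-group myResourceGroup \
    --name tfkcluster \
    --command "kubectl apply -f deployment.yml -n dev" \
    --file deployment.yml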

Kubernetes pod failed to connect to external service

I have an Azure Kubernetes cluster running with Azure CNI (virtual network) as the network. The cluster runs in one subnet of the network.
In another subnet I have a Virtual Machine with a private IP of 10.1.0.4.
Now a pod in the K8s cluster is trying to connect to the Virtual Machine, but it's not able to do so.
Also, ping 10.1.0.4 from inside the pod times out.
Please help me figure out what I am doing wrong so that I can connect the pod with the VM.
• You cannot directly create communication between an AKS cluster pod and a Virtual Machine, because the IP assigned to a pod/node in an AKS cluster comes from a subset range of the larger CIDR address space assigned when the cluster was deployed. Communication between nodes within the cluster works readily, but communication with resources outside AKS is restricted, since the Azure CNI framework policy directs the Kubernetes cluster to send traffic out of the cluster in a regulated and conditional way.
• Thus, this can only be achieved by introducing an intermediate service, such as an internal load balancer, between AKS and the VM, as the CIDRs of the VM and the AKS cluster are different. Leveraging the Azure plugin to deploy an internal load balancer as a Service through AKS is the way to achieve communication between an AKS pod and a VM deployed in Azure.
To deploy the internal load balancer through YAML files in AKS for communication with VMs outside the cluster, refer to the link below for details:
https://fabriciosanchez-en.azurewebsites.net/implementing-virtual-machine-to-pod-communication-in-azure-kubernetes-service-aks/
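A minimal sketch of such an internal load balancer Service; the name, labels and ports are placeholders, and the annotation is the standard Azure one that requests a VNet-internal load balancer:
apiVersion: v1
kind: Service
metadata:
  name: myapp-internal                # placeholder name
  annotations:
    # ask the Azure cloud provider for an internal (VNet-only) load balancer
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: myapp                        # must match your pod labels
  ports:
  - port: 80                          # port the VM will call
    targetPort: 8080                  # port the container listens on
The VM at 10.1.0.4 can then reach the pods via the internal load balancer's private IP instead of the pod IPs directly.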

How to access API from Postman with containerized app deployed in Azure Kubernetes Service

I have created a sample API application with Node and Express to be containerized and deployed to Azure Kubernetes Service (AKS). However, I was unable to access the API endpoint through the external IP generated from the service.yml that was deployed.
I used the Deployment Center within AKS to deploy my application and generate the relevant deployment.yml and service.yml. The following shows the running services, including the external IP.
The following is the response from Postman. I have tried with and without the port number, and with the IP address from kubectl get endpoints, but to no avail. The request eventually times out and I'm unable to access the API.
The following is the Dockerfile config.
I have searched around for solutions but was not able to resolve it. I would greatly appreciate it if anyone who has encountered similar issues could share their experience, thank you.
From the client machine where kubectl is installed, run
kubectl get pods -o wide -n restapicluster5ca2
This will list all the pods along with their IPs.
kubectl describe svc restapicluster-bb91 -n restapicluster5ca2
This will give details about the service; check LoadBalancer Ingress: for the external IP address, Port: for the port to access, TargetPort: for the port the containers listen on (i.e. 5000 in your case), and Endpoints: to verify that all pod IPs with the correct port (5000) are listed.
Log into any of the machines in the AKS cluster and run
curl [CLUSTER-IP]:[PORT]/api/posts, i.e. curl 10.-.-.67:5000
and check whether you get a response.
For reference to use kubectl locally with AKS cluster check the links below
https://learn.microsoft.com/en-us/azure/aks/kubernetes-walkthrough
https://learn.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest
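For completeness, a minimal sketch of wiring kubectl up locally with the Azure CLI (resource group and cluster name are placeholders):
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
kubectl get svc -n restapicluster5ca2   # should now show the LoadBalancer service and its external IP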
As I see it, you need to put an ingress in front of your service. Folks use NGINX etc. for that.
If you want to stay "pure" Azure, you could use AGIC (Application Gateway Ingress Controller) and annotate your service to have it exposed over App Gateway. You could also spin up your own custom App Gateway and hook it up with the AKS Service/LoadBalancer IP.
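A minimal sketch of what the AGIC route could look like, assuming the add-on is installed and using the service name from the question; the ingress name, path and port are placeholders:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: restapi-ingress                            # placeholder name
  namespace: restapicluster5ca2
  annotations:
    # route this Ingress through the Application Gateway Ingress Controller
    kubernetes.io/ingress.class: azure/application-gateway
spec:
  rules:
  - http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: restapicluster-bb91
            port:
              number: 80                           # placeholder: use the service's actual port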

Kubectl not working when AKS API authorized ranges are in place

We're implementing security on our k8s cluster in Azure (managed Kubernetes - AKS).
The cluster is deployed via an ARM template with the following configuration:
1 node, availability set, Standard load balancer, NGINX-based ingress controller, and a set of applications deployed.
Following the documentation, we've updated the cluster to protect the API server from the whole internet:
az aks update --resource-group xxxxxxxx-xxx-xx-xx-xx-x -n xx-xx-xxx-aksCluster
--api-server-authorized-ip-ranges XX.XX.X.0/24,XX.XX.X.0/24,XX.XXX.XX.0/24,XX.XXX.XXX.XXX/32
--subscription xxxxx-xxx-xxx-xxx-xxxxxx
The operation completes successfully.
When trying to grab logs from a pod, the following error occurs:
kubectl get pods -n lims-dev
NAME READY STATUS RESTARTS AGE
XXXX-76df44bc6d-9wdxr 1/1 Running 0 14h
kubectl logs XXXXX-76df44bc6d-9wdxr -n lims-dev
Error from server: Get https://aks-agentpool-XXXXXX-1:10250/containerLogs/XXXX/XXXXX-
76df44bc6d-9wdxr/listener: dial tcp 10.22.0.35:10250: i/o timeout
When trying to deploy using Azure DevOps, the same error is raised:
2020-04-07T04:49:49.0409528Z ##[error]Error: error installing:
Post https://xxxxx-xxxx-xxxx-akscluster-dns-xxxxxxx.hcp.eastus2.azmk8s.io:443
/apis/extensions/v1beta1/namespaces/kube-system/deployments:
dial tcp XX.XX.XXX.142:443: i/o timeout
Of course, the subnet from which I'm running kubectl is added to the authorized range.
I'm trying to understand the source of the problem.
You also need to specify the --load-balancer-outbound-ips parameter when creating the AKS cluster. This IP is used by your pods to communicate with the external world, as well as with the AKS API server. See here
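A minimal sketch of attaching a known outbound IP to the cluster (resource names are placeholders; the same flag is also accepted by az aks create). Since the nodes reach the API server through this outbound IP, it must also be included in --api-server-authorized-ip-ranges:
# create a Standard-SKU public IP and attach it as the cluster's outbound IP (names are placeholders)
az network public-ip create -g myResourceGroup -n aks-outbound-ip --sku Standard
az aks update -g myResourceGroup -n myAKSCluster \
    --load-balancer-outbound-ips $(az network public-ip show -g myResourceGroup -n aks-outbound-ip --query id -o tsv)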

Kubernetes + Socket.io: Pod client -> LoadBalancer service SSL issues

I have a socket.io-based Node.js deployment on my Kubernetes cluster with a LoadBalancer-type service through DigitalOcean. The service uses SSL termination with a certificate uploaded to DO.
I've written a pod which acts as a health check to ensure that clients are still able to connect. This pod uses the socket.io-client package in Node.js and connects via the public domain name of the service. When I run the container locally, it connects just fine, but when I run it as a pod in the same cluster as the service, the health check can't connect. When I shell into that pod (or any pod, really) and try wget my-socket.domain.com, I get an SSL handshake error: "wrong version number".
Any idea why a client connection from outside the cluster works, a client connection from inside the cluster to a normal external server works, but a client connection from a pod in the cluster to the public domain name of the service doesn't?
You have to set up an Ingress controller to route traffic from a load balancer to a Service.
The flow of traffic looks like this:
INTERNET -> LoadBalancer -> [ Ingress Controller -> Service]
If you want to use SSL:
You can provision your own SSL certificate and create a Secret to hold it. You can then refer to the Secret in an Ingress specification to create an HTTP(S) load balancer that uses the certificate.
You can deploy an ingress controller like NGINX using the following instructions: ingress-controller.
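A minimal sketch of that approach, assuming an NGINX ingress controller is installed; the certificate paths, hostnames and service name are placeholders. First create the TLS secret:
kubectl create secret tls my-socket-tls --cert=tls.crt --key=tls.key
and then an Ingress that terminates TLS with that secret:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: socket-ingress                  # placeholder name
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - my-socket.domain.com
    secretName: my-socket-tls           # the secret created above
  rules:
  - host: my-socket.domain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: socket-service        # placeholder for your existing service
            port:
              number: 80                # placeholder port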
It turns out the issue is with how kube-proxy handles LoadBalancer-type services and requests to them from inside the cluster. When the service is created, iptables entries are added that cause requests from inside the cluster to skip the load balancer completely, which becomes a problem when the load balancer also handles SSL termination. There is a workaround: add a loadbalancer-hostname annotation, which forces all connections to go through the load balancer. AWS tends not to have this problem because they automatically apply the workaround in their service configurations, but DigitalOcean does not.
Here are some more details:
https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/annotations.md
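The annotation in question is documented at the link above; a minimal sketch of applying it to the Service (the name, selector and ports are placeholders, and any existing DO SSL/certificate annotations are omitted):
apiVersion: v1
kind: Service
metadata:
  name: socket-service                  # placeholder name
  annotations:
    # make in-cluster clients resolve the LB by hostname so traffic actually
    # goes through the DigitalOcean load balancer (and its SSL termination)
    service.beta.kubernetes.io/do-loadbalancer-hostname: "my-socket.domain.com"
spec:
  type: LoadBalancer
  selector:
    app: socket-server                  # placeholder selector
  ports:
  - port: 443
    targetPort: 3000                    # placeholder container port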
