Managed Azure Kubernetes connection error

The error I am getting after running kubectl cluster-info:
Rifats-MacBook-Pro:~ rifaterdemsahin$ kubectl cluster-info
Kubernetes master is running at https://xxx-xxx-aks-yyyy.hcp.westeurope.azmk8s.io:443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: net/http: TLS handshake timeout

There is an issue on the Azure side. Please refer to this similar issue.
I suggest you try again; if it still does not work, you can open a ticket or give feedback to Azure.

The solution to this one for me was to scale the nodes in my cluster from the Azure Kubernetes Service blade in the web console (NOT from your local terminal, since you can't connect with kubectl).
After scaling up by one node (and back down to save costs) I was back up and working again.
Workaround / Solution
Log into the Azure Console — Kubernetes Service blade.
Scale your cluster up by 1 node.
Wait for scale to complete and attempt to connect (you should be able to).
Scale your cluster back down to the normal size to avoid cost increases.
Total time it took me ~2 mins.
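If you prefer the CLI, the same workaround should be possible with az aks scale, which goes through the Azure management plane rather than the Kubernetes API, so it works even while kubectl cannot connect (resource group, cluster name, and node counts below are placeholders):

# Scale up by one node (here from 3 to 4):
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 4

# Confirm kubectl can reach the API server again:
kubectl cluster-info

# Scale back down to the original size to avoid extra cost:
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3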
More Background Info on the Issue
I added this to the full ticket description write-up that I posted over here (if you want more info, have a read):
'Unable to connect Net/http: TLS handshake timeout' — Why can't Kubectl connect to Azure AKS server?

Related

Kubernetes Pod - read ECONNRESET when requesting external web services

I have a bare-metal Kubernetes cluster running on separate LXC containers as nodes on Ubuntu 20.04. It has the Istio service mesh configured and roughly 20 application services running on it (ServiceEntries are created so Istio can reach external services). I use MetalLB to provision the gateway's external IP.
I have an issue with pods making requests outside the cluster (egress), specifically reaching external web services such as the Cloudflare API or the Sendgrid API for REST calls. DNS is working fine, as the hosts I try to reach are indeed reachable from the pods (containers). The first request a pod makes to the internet succeeds, but after that a random read ECONNRESET error occurs when it makes REST API calls, and sometimes even connect ETIMEDOUT, though less frequently than the first error. Network requests made from the nodes themselves to the internet have no problems at all. Pods communicate with each other through the cluster's Services fine, without any of these problems.
I would guess something is not configured correctly and that the packets are not properly delivered back to the pod, but I can't find any relevant help on the internet and I am a little lost on this one. I appreciate and am very grateful for any of your help! I will happily provide more details if needed.
Thank you all once again!
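One way to narrow this down from the pod side (a sketch only; pod and container names are placeholders, and it assumes curl is available in the image):

# Issue several requests in a row from inside an affected pod, since the
# first one tends to succeed and later ones hit read ECONNRESET:
kubectl exec -it <pod-name> -c <app-container> -- sh -c \
  'for i in 1 2 3 4 5; do curl -sS -o /dev/null -w "%{http_code}\n" https://api.sendgrid.com/v3/; done'

# With Istio injected, the sidecar logs usually show whether the reset came
# from the upstream or from the proxy itself:
kubectl logs <pod-name> -c istio-proxy --tail=100 | grep -i reset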

AKS service response is slow after idle time

I set up my Azure Kubernetes Service deployment exactly as it's described in the official tutorial with static IP, Ingress Controller and Let's Encrypt SSL certs. I use:
Standard_B2s node size
1-node pool
Single pod
AKS Network type: Basic (Kubenet)
Let's Encrypt certs for prod
Response latency is sometimes very high. One browser can reach the service quickly (opening Swagger, for instance), but then I go to my mobile phone and try to open it from there: it takes forever and often times out. Once the page has been opened, the next load is most likely fast and without any delays. Why does this happen?
I can guarantee it's not an issue with my service.
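One way to see where the time goes on the first request after idle (the host below is a placeholder):

# curl's timing variables split the delay into DNS, TLS, and first-byte time:
curl -o /dev/null -s -w 'dns=%{time_namelookup} tls=%{time_appconnect} ttfb=%{time_starttransfer} total=%{time_total}\n' \
  https://myservice.example.com/swagger

If ttfb dominates while dns and tls stay small, the delay is on the server side rather than in name resolution or the TLS handshake.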

AKS issues connecting to Azure Database for MySQL Server

This was previously working but stopped recently. I have a WordPress container running in an AKS pod that connects to an Azure Database for MySQL server on the Basic pricing tier. Recently the container has been unable to connect. To rule out firewall issues I tried whitelisting all IPs (0.0.0.0 - 255.255.255.255) in the MySQL Connection Security settings, but that did not seem to help.
When I exec into the pod, install a MySQL client, and try to connect to the MySQL server, I see this error:
ERROR 9009 (28000): Client connections to Basic tier servers through Virtual Network Service Endpoints are not supported. Virtual Network Service Endpoints are supported for General Purpose and Memory Optimized servers.
I don't understand why this was working in the past and has stopped now. Is this error message correct, and is it basically saying either upgrade (which I don't think you can do in the portal by scaling up, as you would for SQL Server) or you will not be able to access the DB? To upgrade, would I have to back up the DB, create a new server on the General Purpose pricing tier, and restore it, i.e. there is no smooth scale-up path?
I don't seem to have the VNet option in the Azure Portal.
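For reference, the in-pod check described above might look like this (pod name, server name, and user are placeholders; Azure Database for MySQL expects the user@servername login format):

# Open a shell in the WordPress pod and install a MySQL client
# (the official wordpress image is Debian-based):
kubectl exec -it <wordpress-pod> -- bash
apt-get update && apt-get install -y default-mysql-client

# Try to connect to the Azure Database for MySQL server:
mysql -h myserver.mysql.database.azure.com -u myadmin@myserver -p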
EDIT:
I have since found this post on the Microsoft forum, and what it says is that you have to upgrade to the GP pricing tier. So going from £19.805/month to £104.789/month. Just wow.
EDIT:
The way to get it working with the MySQL Basic tier was to disable service endpoints on the AKS VNet, as suggested in the accepted answer. The problem was that the MySQL server was configured to use service endpoints, so after removing them from the VNet I also had to disable service endpoints on the server itself. Not too happy with that, but I guess you can't have both a Basic tier MySQL server and decent security. If you want both, you will have to pay :(
It was never working unless you were not using Service Endpoints. If you switch those off, it should resume working.
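If you want to do the same from the CLI, removing the service endpoint from the AKS subnet might look roughly like this (a sketch; resource group, VNet, and subnet names are placeholders):

# Show which service endpoints are configured on the AKS subnet:
az network vnet subnet show --resource-group myResourceGroup \
  --vnet-name myVnet --name myAksSubnet --query serviceEndpoints

# Remove the serviceEndpoints property so Basic-tier MySQL connections
# no longer go through a Virtual Network Service Endpoint:
az network vnet subnet update --resource-group myResourceGroup \
  --vnet-name myVnet --name myAksSubnet --remove serviceEndpoints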

How can I diagnose a connection failure to my Load-balanced Service Fabric Cluster in Azure?

I'm taking my first foray into Azure Service Fabric, using a cluster hosted in Azure. I've successfully deployed my cluster via ARM template, which includes the cluster manager resource, VMs for hosting Service Fabric, a load balancer, an IP address, and several storage accounts. I've successfully configured the certificate for the management interface, and I've written and deployed an application to my cluster. However, when I try to connect to my API via Postman (or even via a browser, e.g. Chrome) the connection invariably times out without a response. I've double-checked all of my load balancer settings, and traffic should be getting through, since my load-balancing rules use the same port on the front end and back end as my API uses in Service Fabric. Can anyone provide me with some tips for troubleshooting this situation and finding out where exactly the connection problem lies?
To clarify, I've examined the documentation here, here, and here.
Have you tried logging in to one of your Service Fabric nodes via Remote Desktop and calling your API directly from the VM? I have found that if I can confirm it's working directly on a node, the issue likely lies with the LB or potentially an NSG.
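A minimal version of that node-local check (port and path are placeholders; recent Windows Server builds ship curl.exe, otherwise use Invoke-WebRequest):

# From an RDP session on the Service Fabric node, call the API without
# going through the load balancer:
curl -v http://localhost:8080/api/values

# If this works, next test the node's address from outside the VM, to
# isolate the failure to the LB rules or an NSG.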

kubectl: net/http: TLS handshake timeout

I followed the Quickstart docs (here) to deploy a k8s cluster in the Western Europe region. The cluster boots up fine, but I cannot connect to it using kubectl - kubectl times out while trying to perform a TLS handshake:
Unable to connect to the server: net/http: TLS handshake timeout
There is currently a github issue where others are reporting the same problem.
Following some advice on the thread, I attempted to perform an upgrade from 1.8.1 to 1.8.2, which failed:
bash-4.3# az aks upgrade --resource-group=k8s --name=phlo -k 1.8.2
Kubernetes may be unavailable during cluster upgrades.
Are you sure you want to perform this operation? (y/n): y
/ Running ..
Deployment failed. Correlation ID: <redacted>. Operation failed with status: 200. Details: Resource state Failed
According to others on the github thread, it seems to be a region-specific issue.
The solution to this one for me was, again, to scale the nodes in my cluster from the Azure Kubernetes Service blade in the web console. An interesting workaround (it worked for me!) to test: exactly the steps in the Workaround / Solution above, i.e. scale the cluster up by one node, wait for the scale to complete and connect, then scale back down to the normal size. Total time ~2 mins.
More background info is in the full ticket description write-up I posted over here (if you want more info, have a read):
'Unable to connect Net/http: TLS handshake timeout' — Why can't Kubectl connect to Azure AKS server?
I was able to get a working AKS setup by ignoring what the Azure CLI reported about when the k8s cluster was ready, and instead watching for the "creating..." bar in the AKS overview section of the Azure Console to disappear.
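A CLI alternative to watching the portal (resource group and cluster name taken from the az aks upgrade command above) is to poll the provisioning state until it reads Succeeded:

# provisioningState stays "Creating" until the cluster is actually ready:
az aks show --resource-group k8s --name phlo --query provisioningState -o tsv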
There's some good comments here if you are still stuck: https://github.com/Azure/AKS/issues/112
For me, the issue disappeared after freeing some disk space on my Mac and then starting the proxy again with kubectl proxy.
