Kubernetes in Azure - Unable to connect to the server: proxyconnect tcp: dial tcp: lookup http: no such host - azure

I'm new in Azure. I've install a Kubernetes Cluster v1.10.3 in Azure and followed the steps in order to operate kubernetes from console.
When I try to get any resource, for example the pods, by executing kubectl get pods, I'm getting back the following error:
Unable to connect to the server: proxyconnect tcp: dial tcp: lookup http: no such host
Equally, if I try to execute any other command by using AKS, for example next one az aks browse --resource-group POC_Service_Mesh --name ServiceMeshCluster, you get back the following error:
Unable to connect to the server: proxyconnect tcp: dial tcp: lookup http: no such host.
My region is WestEurope.
Finally, this issue occurs also with Kubernetes v.1.9.6. Because of that, I've updated to v.1.10.3.
Any suggestion? Thanks a lot! :-)

I've just solved the issue.
Finally it was a matter with the proxy, which was blocking all communications.
Thanks a lot anyway!

Related

rafthttp: dial tcp timeout on etcd 3-node cluster creation

I don't have an access to the etcd part of the project's source code, however I do have access to the /var/log/syslog.
The goal is to setup up 3-node cluster.
(1)The very first etcd error that comes up is:
rafthttp: failed to dial 76e7ffhh20007a98 on stream MsgApp v2 (dial tcp 10.0.0.134:2380: i/o timeout)
Before continuing, I would say that I can ping all three nodes from each of the nodes. As well as I have tried to open the 2380 TCP ports and still no success - same error.
(2)So, before that error I had following messages from the etcd, which in my opinion confirm that cluster is setup correctly:
etcdserver/membership: added member 76e7ffhh20007a98 [https://server2:2380]
etcdserver/membership: added member 222e88db3803e816 [https://server1:2380]
etcdserver/membership: added member 999115e00e17123d [https://server3:2380]
In /etc/hosts file these DNS names are resolved as:
server2 10.0.0.135
server1 10.0.0.134
server3 10.0.0.136
(3)The initial setup, however (on each nodes looks like this):
embed: listening for peers on https://127.0.0.1:2380
embed: listening for client requests on 127.0.0.1:2379
So, to sum up, each node have got this initial setup log (3) and then adds members (2) then once these steps are done it fails with (1). As I know the etcd cluster creation is following this pattern: https://etcd.io/docs/v3.5/tutorials/how-to-setup-cluster/
Without knowing the source code is really hard to debug, however maybe some ideas on the error and what could cause it?
UPD: etcdctl cluster-health output (ETCDCTL_ENDPOINT is exported):
cluster may be unhealthy: failed to list members Error: client: etcd
cluster is unavailable or misconfigured; error #0: client: endpoint
http://127.0.0.1:2379 exceeded header timeout ; error #1: dial tcp
127.0.0.1:4001: connect: connection refused
error #0: client: endpoint http://127.0.0.1:2379 exceeded header
timeout error #1: dial tcp 127.0.0.1:4001: connect: connection refused

Istio Multicluster primary-remote on Azure AKS

I am trying to create multi cluster istio primary-remote.
First created two clusters AZURE AKS. Used AzureCNI for Network Configuaration and following are the settings of the cluster.
First cluster
vnet istioclusterone - 10.10.0.0/20
subnet default - 10.10.0.0/20
k8s service address range 10.100.0.0/16
DNS service ip - 10.100.0.10
Docker Bridge address - 172.17.0.1/16
DNS-prefix - app-cluster-dns
Second cluster
vnet istioclusterone - 10.11.0.0/20
subnet default - 10.11.0.0/20
k8s service address range 10.101.0.0/16
DNS service ip - 10.101.0.10
Docker Bridge address - 172.18.0.1/16
DNS-prefix - processing-cluster-dns
Other than this gone with default settings.
Next Followed below articles to setup multi Istio cluster.
Before you begin
Primary-remote
last step in second article to setup cluster2 as remote is failed.
Found below errors in logs of istio-ingressgateway pod.
2022-04-11T07:51:00.352057Z warning envoy config StreamAggregatedResources gRPC config stream closed since 431s ago: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"cluster.local\")"
2022-04-11T07:51:08.514428Z warning envoy config StreamAggregatedResources gRPC config stream closed since 439s ago: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"cluster.local\")"
2022-04-11T07:51:12.462140Z warning envoy config StreamAggregatedResources gRPC config stream closed since 443s ago: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"cluster.local\")"
2022-04-11T07:51:39.950935Z warning envoy config StreamAggregatedResources gRPC config stream closed since 471s ago: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"cluster.local\")"
Has anyone tried this scenario please share your insights.
Thanks.
Update:
Have used custom certs for both the clusters previous error was solved.
then created a gateway in both the clusters.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: cluster-aware-gateway
namespace: istio-system
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 443
name: tls
protocol: TLS
tls:
mode: AUTO_PASSTHROUGH
hosts:
- "*.local"
Now getting new error. check below logs of pod istio-ingressgateway-575ccb4d79 of cluster2.
2022-04-13T09:14:04.650502Z warning envoy config StreamAggregatedResources gRPC config stream closed since 60s ago: 14, connection error: desc = "transport: Error while dialing dial tcp <publicIPofEastWestgateway>:15012: i/o timeout"
2022-04-13T09:14:27.026016Z warning envoy config StreamAggregatedResources gRPC config stream closed since 83s ago: 14, connection error: desc = "transport: Error while dialing dial tcp <publicIPofEastWestgateway:15012: i/o timeout"
what I undertood here, I have an eastwestgateway installed in cluster1 as in the documentation linkToDoc
cluster2 is trying to access cluster1. using publicIp of eastwest-gateway on port 15012 which is failing.
checked security groups port is opened. Tried telnet from a test pod from within the cluster to check. its failing.
can anyone help me here.
Thanks
It looks like a firewall issue. not sure if it'll help, but try opening the ports 15012 and 15443 on the remote cluster's outbound, to the eastwestgateway elb ip (primary cluster)

Fabic: Issue connection refused 7050

I am trying to create a network from the hyperledger fabic tutorial. I get the following error:
Error: failed to create deliver client for orderer: orderer client failed to connect to localhost:7050: failed to create new connection: connection error: desc = "transport: error while dialing: dial tcp [::1]:7050: connect: connection refused"
I opened up the port on the Centos 7 Virtual machine and still no luck. The docker container is exposing the port to the host.
I removed all docker containers, images and volumes. I even rebuilt the VM from scratch.
Any help would be great.
Thanks,
This situation is happened because you called a gRPC to orderer server but your call failed to hit the server. This situation may happen for many reasons, but for most of the cases the situation is happened due to server down(orderer server exit or down due to misconfiguration) or your call failed to hit the server due to misconfiguration.
I somehow encounter this problem before and the port was opened. Somehow it was a mistake where I forgot to put '-a' in command (launch cerificate authorities). Hope it help.
You might also refer this : https://hyperledger-fabric.readthedocs.io/en/release-2.0/build_network.html

How to fix etcd cluster misconfigured error

Have two servers : pg1: 10.80.80.195 and pg2: 10.80.80.196
Version of etcd :
etcd Version: 3.2.0
Git SHA: 66722b1
Go Version: go1.8.3
Go OS/Arch: linux/amd64
I'm trying to run like this :
pg1 server :
etcd --name infra0 --initial-advertise-peer-urls http://10.80.80.195:2380 --listen-peer-urls http://10.80.80.195:2380 --listen-client-urls http://10.80.80.195:2379,http://127.0.0.1:2379 --advertise-client-urls http://10.80.80.195:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster infra0=http://10.80.80.195:2380,infra1=http://10.80.80.196:2380 --initial-cluster-state new
pg2 server :
etcd --name infra1 --initial-advertise-peer-urls http://10.80.80.196:2380 --listen-peer-urls http://10.80.80.196:2380 --listen-client-urls http://10.80.80.196:2379,http://127.0.0.1:2379 --advertise-client-urls http://10.80.80.196:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster infra0=http://10.80.80.195:2380,infra1=http://10.80.80.196:2380 --initial-cluster-state new
When trying to cherck health state on pg1:
etcdctl cluster-health
have an error :
cluster may be unhealthy: failed to list members
Error: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
What I'm doing wrong and how to fix it ?
Both servers run on virtual machines with Bridged Adapter
I've got similar error when I set up etcd clusters using systemd according to the official tutorial from kubernetes.
It's three centos 7 of medium instances on AWS. I'm pretty sure the security groups are correct. And I've just:
$ systemctl restart network
and the
$ etcdctl cluster-health
just gives a healthy result.

Kubernetes on Azure : connectex

Followed steps from the link to create a K8s cluster using the Azure Portal. Tried using kubectl on a remote machine to check if it's working. Got this error.
Unable to connect to the server: dial tcp 13.90.35.157:443: connectex:
A connection attempt failed because the connected party did not
properly respond after a period of time, or established connection
failed because connected host has failed to respond.
I can SSH to the K8s master. Tried kubectl get nodes from the master and got similar error.
It is really hard to say from such a description what went wrong, but as this is a new cluster ( and I'm saying this because sometimes k8s cluster gets deployed but doesn't really work, so ), I would suggest deleting it and creating a new one and\or creating it using the Azure Cli\Azure Cloud Shell.
Basically its as simple as:
az acs create -n acs-cluster -g acsrg1 -d applink789 --generate-ssh-keys
if you have the resource group created, if not you can create it with:
az group create -n acsrg1 -l "westus"
According to your description, it seems you have not configured the Service Principal correctly. I use wrong service principal to deploy K8S in Azure, get the same error:
C:\Users>kubectl get nodes
Unable to connect to the server: dial tcp 13.90.27.73:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
You may need to check to ensure the credentials were provided accurately, and that the configured Service Principal has read and write permissions to the target Subscription.
If your Service Principal is misconfigured, none of the kubernetes components will come up in a healthy manner. We can check to see if this the problem:
root#k8s-master-6FEE48E1-0:~# journalctl -u kubelet | grep --text autorest
If you see output that looks like the following, it means you have not configured the service Principal correctly.
root#k8s-master-6FEE48E1-0:~# journalctl -u kubelet | grep --text autorest
Jun 01 01:58:47 k8s-master-6FEE48E1-0 docker[5522]: E0601 01:58:47.447321 6028 kubelet.go:1186] Cannot get Node info: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/1fcf418e-66ed-4c99-9449-d8e18bf8737a/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400
Jun 01 01:58:47 k8s-master-6FEE48E1-0 docker[5522]: E0601 01:58:47.627128 6028 kubelet_node_status.go:70] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/1fcf418e-66ed-4c99-9449-d8e18bf8737a/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400
Jun 01 01:58:47 k8s-master-6FEE48E1-0 docker[5522]: E0601 01:58:47.885092 6028 kubelet_node_status.go:70] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/1fcf418e-66ed-4c99-9449-d8e18bf8737a/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400
More information about how to create /configure a service principal for ACS-Engin Kubernetes cluster, please refer to this link.

Resources