Can't run 'helm install' on a cluster where Tiller was installed by GitLab

I created a cluster in GKE using GitLab and installed Helm & Tiller, along with a few other things such as ingress and GitLab Runner, through GitLab's interface. But when I try to install something using helm from gcloud, it fails with "Error: transport is closing".
I ran gcloud container clusters get-credentials ....
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default jaeger-deployment-59ffb979c8-lmjk5 1/1 Running 0 17h
gitlab-managed-apps certmanager-cert-manager-6c8cd9f9bf-67wnh 1/1 Running 0 17h
gitlab-managed-apps ingress-nginx-ingress-controller-75c4d99549-x66n4 1/1 Running 0 21h
gitlab-managed-apps ingress-nginx-ingress-default-backend-6f58fb5f56-pvv2f 1/1 Running 0 21h
gitlab-managed-apps prometheus-kube-state-metrics-6584885ccf-hr8fw 1/1 Running 0 22h
gitlab-managed-apps prometheus-prometheus-server-69b9f444df-htxsq 2/2 Running 0 22h
gitlab-managed-apps runner-gitlab-runner-56798d9d9d-nljqn 1/1 Running 0 22h
gitlab-managed-apps tiller-deploy-74f5d65d77-xk6cc 1/1 Running 0 22h
kube-system heapster-v1.6.0-beta.1-7bdb4fd8f9-t8bq9 2/2 Running 0 22h
kube-system kube-dns-7549f99fcc-bhg9t 4/4 Running 0 22h
kube-system kube-dns-autoscaler-67c97c87fb-4vz9t 1/1 Running 0 22h
kube-system kube-proxy-gke-cluster2-pool-1-05abcbc6-0s6j 1/1 Running 0 20h
kube-system kube-proxy-gke-cluster2-pool-2-67e57524-ht5p 1/1 Running 0 22h
kube-system metrics-server-v0.2.1-fd596d746-289nd 2/2 Running 0 22h
visual-react-10450736 production-847c7d879c-z4h5t 1/1 Running 0 22h
visual-react-10450736 production-postgres-64cfcf9464-jr74c 1/1 Running 0 22h
$ ./helm install stable/wordpress --tiller-namespace gitlab-managed-apps --name wordpress
E0127 10:27:29.790366 418 portforward.go:331] an error occurred forwarding 39113 -> 44134: error forwarding port 44134 to pod 86b33bdc7bc30c08d98fe44c0772517c344dd1bdfefa290b46e82bf84959cb6f, uid : exit status 1: 2019/01/27 04:57:29 socat[11124] E write(5, 0x14ed120, 186): Broken pipe
Error: transport is closing
Another attempt:
$ ./helm install incubator/jaeger --tiller-namespace gitlab-managed-apps --name jaeger --set elasticsearch.rbac.create=true --set provisionDataStore.cassandra=false --set provisionDataStore.elasticsearch=true --set storage.type=elasticsearch
E0127 10:30:24.591751 429 portforward.go:331] an error occurred forwarding 45597 -> 44134: error forwarding port 44134 to pod 86b33bdc7bc30c08d98fe44c0772517c344dd1bdfefa290b46e82bf84959cb6f, uid : exit status 1: 2019/01/27 05:00:24 socat[13937] E write(5, 0x233d120, 8192): Connection reset by peer
Error: transport is closing
I tried forwarding the port myself, but the command never returns to the prompt:
kubectl port-forward --namespace gitlab-managed-apps tiller-deploy 39113:44134
Apparently, installing anything from GitLab's UI uses Helm, and those installs do not fail, yet doing the same from the shell fails. Please help me out.
Thanks in advance.

I know it's late, but I'll share this in case someone else struggles with this issue. I found an answer in the GitLab forums.
The trick is to export and decode the certificates from the tiller service account and pass them as arguments to helm like this:
helm list --tiller-connection-timeout 30 --tls --tls-ca-cert tiller-ca.crt --tls-cert tiller.crt --tls-key tiller.key --all --tiller-namespace gitlab-managed-apps
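The export step itself is not shown above. A rough sketch of one way to do it follows. It assumes the certificates are stored in a secret named tiller-secret in the gitlab-managed-apps namespace under the keys ca.crt, tls.crt and tls.key; those names are assumptions, so check them first with kubectl get secrets --namespace gitlab-managed-apps. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumed (not stated in the answer above): the secret name
# "tiller-secret" and the keys ca.crt, tls.crt, tls.key.
# All function names are illustrative.

# Escape dots so the key can be used inside a kubectl jsonpath expression.
escape_key() {
  printf '%s' "$1" | sed 's/\./\\./g'
}

# Decode base64 data read from stdin.
decode_field() {
  base64 --decode
}

# Write one field of the secret to a local file.
# $1 = key inside the secret, $2 = output file name
export_tiller_file() {
  kubectl get secret tiller-secret \
    --namespace gitlab-managed-apps \
    -o "jsonpath={.data['$(escape_key "$1")']}" | decode_field > "$2"
}

# Usage (needs access to the cluster):
#   export_tiller_file ca.crt  tiller-ca.crt
#   export_tiller_file tls.crt tiller.crt
#   export_tiller_file tls.key tiller.key
```

The three output file names match the ones passed to helm list in the command above.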

Related

kubernetes networking: pod cannot reach nodes

I have a Kubernetes cluster with 3 masters and 7 workers. I use Calico as the CNI. When I deploy Calico, the calico-kube-controllers-xxx pod fails because it cannot reach 10.96.0.1:443.
2020-06-23 13:05:28.737 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0623 13:05:28.740128 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2020-06-23 13:05:28.742 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2020-06-23 13:05:38.742 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-06-23 13:05:38.742 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
This is the situation in the kube-system namespace:
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77d6cbc65f-6bmjg 0/1 CrashLoopBackOff 56 4h33m
calico-node-94pkr 1/1 Running 0 36m
calico-node-d8vc4 1/1 Running 0 36m
calico-node-fgpd4 1/1 Running 0 37m
calico-node-jqgkp 1/1 Running 0 37m
calico-node-m9lds 1/1 Running 0 37m
calico-node-n5qmb 1/1 Running 0 37m
calico-node-t46jb 1/1 Running 0 36m
calico-node-w6xch 1/1 Running 0 38m
calico-node-xpz8k 1/1 Running 0 37m
calico-node-zbw4x 1/1 Running 0 36m
coredns-5644d7b6d9-ms7gv 0/1 Running 0 4h33m
coredns-5644d7b6d9-thwlz 0/1 Running 0 4h33m
kube-apiserver-k8s01 1/1 Running 7 34d
kube-apiserver-k8s02 1/1 Running 9 34d
kube-apiserver-k8s03 1/1 Running 7 34d
kube-controller-manager-k8s01 1/1 Running 7 34d
kube-controller-manager-k8s02 1/1 Running 9 34d
kube-controller-manager-k8s03 1/1 Running 8 34d
kube-proxy-9dppr 1/1 Running 3 4d
kube-proxy-9hhm9 1/1 Running 3 4d
kube-proxy-9svfk 1/1 Running 1 4d
kube-proxy-jctxm 1/1 Running 3 4d
kube-proxy-lsg7m 1/1 Running 3 4d
kube-proxy-m257r 1/1 Running 1 4d
kube-proxy-qtbbz 1/1 Running 2 4d
kube-proxy-v958j 1/1 Running 2 4d
kube-proxy-x97qx 1/1 Running 2 4d
kube-proxy-xjkjl 1/1 Running 3 4d
kube-scheduler-k8s01 1/1 Running 7 34d
kube-scheduler-k8s02 1/1 Running 9 34d
kube-scheduler-k8s03 1/1 Running 8 34d
In addition, CoreDNS cannot reach the internal kubernetes service.
From a node, if I run wget -S 10.96.0.1:443, I receive a response.
wget -S 10.96.0.1:443
--2020-06-23 13:12:12-- http://10.96.0.1:443/
Connecting to 10.96.0.1:443... connected.
HTTP request sent, awaiting response...
HTTP/1.0 400 Bad Request
2020-06-23 13:12:12 ERROR 400: Bad Request.
But if I run wget -S 10.96.0.1:443 in a pod, I receive a timeout error.
Also, I cannot ping nodes from pods.
The cluster pod CIDR is 192.168.0.0/16.
I resolved this by recreating the cluster with a different pod CIDR.
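The post does not say why a different pod CIDR helped. One common cause of this symptom is a pod CIDR that overlaps the network the nodes themselves sit on, which is only a guess here. A minimal sketch for checking two IPv4 ranges for overlap follows; the node subnet 192.168.1.0/24 in the usage comment is hypothetical, and the function names are illustrative.

```shell
#!/bin/sh
# Sketch only: test whether two IPv4 CIDR blocks overlap.
# Function names are illustrative; no cluster access is needed.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  # Split on dots inside a subshell so the IFS change does not leak.
  (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  )
}

# Print the first address of a CIDR block as an integer.
cidr_first() {
  prefix=${1#*/}
  base=$(ip_to_int "${1%/*}")
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  echo $(( base & mask ))
}

# Print the last address of a CIDR block as an integer.
cidr_last() {
  prefix=${1#*/}
  first=$(cidr_first "$1")
  echo $(( first + (1 << (32 - prefix)) - 1 ))
}

# Succeed (exit status 0) when the two blocks share at least one address.
cidrs_overlap() {
  [ "$(cidr_first "$1")" -le "$(cidr_last "$2")" ] &&
    [ "$(cidr_first "$2")" -le "$(cidr_last "$1")" ]
}

# Usage (192.168.1.0/24 is a made-up node subnet):
#   cidrs_overlap 192.168.0.0/16 192.168.1.0/24 && echo "overlap"
```

If the ranges do overlap, picking a pod CIDR outside the node network, as the poster did, avoids the conflict.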

New AKS cluster unreachable via network (including dashboard)

Yesterday I spun up an Azure Kubernetes Service cluster running a few simple apps. Three of them have exposed public IPs that were reachable yesterday.
As of this morning I can't reach the dashboard tunnel or the LoadBalancer IPs themselves.
I was asked by the Azure twitter account to solicit help here.
I don't know how to troubleshoot this apparent network issue - only az seems to be able to touch my cluster.
dashboard error log
❯❯❯ make dashboard
az aks browse --resource-group=akc-rg-cf --name=akc-237
Merged "akc-237" as current context in /var/folders/9r/wx8xx8ls43l8w8b14f6fns8w0000gn/T/tmppst_atlw
Proxy running on http://127.0.0.1:8001/
Press CTRL+C to close the tunnel...
error: error upgrading connection: error dialing backend: dial tcp 10.240.0.4:10250: getsockopt: connection timed out
service+pod listing
❯❯❯ kubectl get services,pods
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
azure-vote-back ClusterIP 10.0.125.49 <none> 6379/TCP 16h
azure-vote-front LoadBalancer 10.0.185.4 40.71.248.106 80:31211/TCP 16h
hubot LoadBalancer 10.0.20.218 40.121.215.233 80:31445/TCP 26m
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 19h
mti411-web LoadBalancer 10.0.162.209 52.168.123.30 80:30874/TCP 26m
NAME READY STATUS RESTARTS AGE
azure-vote-back-7556ff9578-sjjn5 1/1 Running 0 2h
azure-vote-front-5b8878fdcd-9lpzx 1/1 Running 0 16h
hubot-74f659b6b8-wctdz 1/1 Running 0 9s
mti411-web-6cc87d46c-g255d 1/1 Running 0 26m
mti411-web-6cc87d46c-lhjzp 1/1 Running 0 26m
http failures
❯❯❯ curl --connect-timeout 2 -I http://40.121.215.233
curl: (28) Connection timed out after 2005 milliseconds
❯❯❯ curl --connect-timeout 2 -I http://52.168.123.30
curl: (28) Connection timed out after 2001 milliseconds
If you are getting "getsockopt: connection timed out" while trying to access your AKS dashboard, deleting the tunnelfront pod may help: once the pod is deleted, the master creates a new tunnelfront. I have tried this and it worked for me.
@daniel Did rebooting the agent VMs solve your issue, or are you still seeing problems?
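A minimal sketch of the tunnelfront deletion described here, assuming the pod runs in the kube-system namespace and that its name starts with "tunnelfront" (both are assumptions; check with kubectl get pods --all-namespaces). The function names are illustrative.

```shell
#!/bin/sh
# Sketch only. Assumed: the tunnelfront pod is in kube-system and its
# name starts with "tunnelfront". Function names are illustrative.

# Keep only tunnelfront entries from `kubectl get pods -o name` output on stdin.
tunnelfront_pods() {
  grep '^pod/tunnelfront'
}

# Delete every tunnelfront pod so that a replacement gets created.
restart_tunnelfront() {
  kubectl get pods --namespace kube-system -o name | tunnelfront_pods |
    while read -r pod; do
      kubectl delete --namespace kube-system "$pod"
    done
}

# Usage (needs access to the cluster):
#   restart_tunnelfront
```

Afterwards, kubectl get pods --namespace kube-system should show a fresh tunnelfront pod with a low AGE value.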

not able to access statefulset headless service from kubernetes

I have created a headless service for a StatefulSet in Kubernetes, and the Cassandra DB is running fine.
PS C:\> .\kubectl.exe get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cassandra None <none> 9042/TCP 50m
kubernetes 10.0.0.1 <none> 443/TCP 6d
PS C:\> .\kubectl.exe get pods
NAME READY STATUS RESTARTS AGE
cassandra-0 1/1 Running 0 49m
cassandra-1 1/1 Running 0 48m
cassandra-2 1/1 Running 0 48m
I am running all of this on minikube. From my laptop I am trying to connect to 192.168.99.100:9402 using a Java program, but it is not able to connect.
It looks like your service is not defined as NodePort. Can you change the service type to NodePort and test it?
When a service is defined as NodePort, the listing shows two port numbers for it (port:nodePort).
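A headless service (CLUSTER-IP None, as in the listing above) cannot itself be given the NodePort type, so the sketch below adds a second service next to it rather than changing the existing one. That is a variation on the suggestion, not something stated in the thread. The service name cassandra-external, the label app=cassandra and the node port 30042 are all assumptions for illustration. Note also that the listing shows port 9042, while the question connects to 9402.

```shell
#!/bin/sh
# Sketch only. Assumed: the Cassandra pods carry the label app=cassandra.
# The service name and the node port 30042 are made up for illustration
# (30042 lies in the default NodePort range 30000-32767).

# Print a manifest for an additional NodePort service in front of Cassandra.
cassandra_nodeport_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: cassandra-external
spec:
  type: NodePort
  selector:
    app: cassandra
  ports:
    - port: 9042
      targetPort: 9042
      nodePort: 30042
EOF
}

# Usage (needs access to the cluster):
#   cassandra_nodeport_manifest | kubectl apply -f -
#   then connect from the laptop to 192.168.99.100:30042
```

With this in place, kubectl get svc should list cassandra-external with 9042:30042/TCP in the PORT(S) column.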

Kubernetes on Raspberry Pi: kube-flannel CrashLoopBackOff and kube-dns rpc error code = 2

I used this tutorial to set up a Kubernetes cluster on my Raspberry Pi 3.
I followed the instructions until the setup of flannel by:
curl -sSL https://rawgit.com/coreos/flannel/v0.7.0/Documentation/kube-flannel.yml | sed "s/amd64/arm/g" | kubectl create -f -
I get the following error message on kubectl get po --all-namespaces:
kube-system etcd-node01 1/1 Running 0 34m
kube-system kube-apiserver-node01 1/1 Running 0 34m
kube-system kube-controller-manager-node01 1/1 Running 0 34m
kube-system kube-dns-279829092-x4dc4 0/3 rpc error: code = 2 desc = failed to start container "de9b2094dbada10a0b44df97d25bb629d6fbc96b8ddc0c060bed1d691a308b37": Error response from daemon: {"message":"cannot join network of a non running container: af8e15c6ad67a231b3637c66fab5d835a150da7385fc403efc0a32b8fb7aa165"} 15 39m
kube-system kube-flannel-ds-zk17g 1/2 CrashLoopBackOff 11 35m
kube-system kube-proxy-6zwtb 1/1 Running 0 37m
kube-system kube-proxy-wbmz2 1/1 Running 0 39m
kube-system kube-scheduler-node01 1/1 Running
Interestingly, I have the same issue when installing Kubernetes with flannel on my laptop using another tutorial.
Version details are here:
Client Version: version.Info{Major:"1", Minor:"6",
GitVersion:"v1.6.3",
GitCommit:"0480917b552be33e2dba47386e51decb1a211df6",
GitTreeState:"clean", BuildDate:"2017-05-10T15:48:59Z",
GoVersion:"go1.8rc2", Compiler:"gc", Platform:"linux/arm"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.3",
GitCommit:"0480917b552be33e2dba47386e51decb1a211df6",
GitTreeState:"clean", BuildDate:"2017-05-10T15:38:08Z",
GoVersion:"go1.8rc2", Compiler:"gc", Platform:"linux/arm"}
Any suggestions, that might help?
I solved this issue by generating cluster-roles before setting up the pod network driver:
curl -sSL https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml | sed "s/amd64/arm/g" | kubectl create -f -
Then setting up the pod network driver by:
curl -sSL https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml | sed "s/amd64/arm/g" | kubectl create -f -
Worked for me so far...

kube-dns stays in ContainerCreating status

I have 5 machines running Ubuntu 16.04.1 LTS. I want to set them up as a Kubernetes cluster. I'm trying to follow this getting started guide, which uses kubeadm.
It all worked fine until step 3/4, Installing a pod network. I looked at their addon page for a pod network and chose the flannel overlay network. I've copied the yaml file to the machine and executed:
root@up01:/home/up# kubectl apply -f flannel.yml
Which resulted in:
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
So I thought it went OK, but when I list all the pods:
root@up01:/etc/kubernetes/manifests# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system dummy-2088944543-d5f50 1/1 Running 0 50m
kube-system etcd-up01 1/1 Running 0 48m
kube-system kube-apiserver-up01 1/1 Running 0 50m
kube-system kube-controller-manager-up01 1/1 Running 0 49m
kube-system kube-discovery-1769846148-jvx53 1/1 Running 0 50m
kube-system kube-dns-2924299975-prlgf 0/4 ContainerCreating 0 49m
kube-system kube-flannel-ds-jb1df 2/2 Running 0 32m
kube-system kube-proxy-rtcht 1/1 Running 0 49m
kube-system kube-scheduler-up01 1/1 Running 0 49m
The problem is that kube-dns stays in the ContainerCreating state. I don't know what to do.
It is very likely that you missed this critical piece of information from the guide:
If you want to use flannel as the pod network, specify
--pod-network-cidr 10.244.0.0/16 if you’re using the daemonset manifest below.
If you omit this kube-dns will never leave the ContainerCreating STATUS.
Your kubeadm init command should be:
# kubeadm init --pod-network-cidr 10.244.0.0/16
and not
# kubeadm init
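The value given to --pod-network-cidr should agree with the Network field in the net-conf.json section of the flannel manifest. As a small sketch, the value can be read from the manifest instead of being typed by hand; flannel_network is a hypothetical helper name, and flannel.yml is the file name used in the question.

```shell
#!/bin/sh
# Sketch only: read the pod network range out of a flannel manifest.
# flannel_network is an illustrative helper name.

# Print the value of the "Network" field found on stdin.
flannel_network() {
  sed -n 's/.*"Network": *"\([^"]*\)".*/\1/p'
}

# Usage (run as root on the master, before the cluster is initialised):
#   kubeadm init --pod-network-cidr "$(flannel_network < flannel.yml)"
```

If the manifest has been edited to use a range other than 10.244.0.0/16, this keeps the kubeadm flag and the flannel configuration in step.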
Did you try restarting NetworkManager? It worked for me. It also worked when I disabled IPv6.

Resources