Increasing PVC size for statefull set in kubernetes - azure

I am want to increase the size of my pvc from 50 GB to 100GB, can you please help on this.
EFK is deployed on Azure kubernetes Cluster and storage type is azurefile-standard-zrs
i have deployed Elasticsearch as a statefullset using helm, tried updating values.yaml file however its not happening .
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
logging-es es 6 2021-02-23 16:01:17.013698 +0000 UTC deployed opendistro-es-1.13.0 1.13.0
[# .kube]$ kubectl get pvc -n es
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-logging-es-opendistro-es-data-0 Bound pvc-e926bbfc-a873-4543-9867-234bc508977c 50Gi RWO azurefile-standard-zrs 53d
data-logging-es-opendistro-es-data-1 Bound pvc-0f5e7e46-5138-45da-90e6-0dfbe0aadff3 50Gi RWO azurefile-standard-zrs 53d
data-logging-es-opendistro-es-master-0 Bound pvc-e8a57019-5eeb-4a93-ba02-3f1b5c2e8fc8 20Gi RWO azurefile-standard-zrs 53d
data-logging-es-opendistro-es-master-1 Bound pvc-2ea1845d-7d08-4fca-b3c4-5b067559af3c 20Gi RWO azurefile-standard-zrs 53d
[# .kube]$ kubectl get pvc data-logging-es-opendistro-es-data-0 -n es -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
pv.kubernetes.io/bind-completed: "yes"
pv.kubernetes.io/bound-by-controller: "yes"
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-file
volume.kubernetes.io/selected-node: azwe-wvm-0
creationTimestamp: "2021-01-15T07:25:20Z"
finalizers:
- kubernetes.io/pvc-protection
labels:
app: logging-es-opendistro-es
heritage: Helm
release: logging-es
role: data
name: data-logging-es-opendistro-es-data-0
namespace: es
resourceVersion: "33101689"
selfLink: /api/v1/namespaces/es/persistentvolumeclaims/data-logging-es-opendistro-es-data-0
uid: e926bbfc-a873-4543-9867-234bc508977c
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Gi
storageClassName: azurefile-standard-zrs
volumeMode: Filesystem
volumeName: pvc-e926bbfc-a873-234bc508977c
status:
accessModes:
- ReadWriteOnce
capacity:
storage: 50Gi
phase: Bound
Error on trying helm upgrade
$ helm upgrade -f values_runtime.yaml logging-es infrastructure/opendistro-es -n es --kubeconfig=/home/tsiadm/qa_fmo_config
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/tsiadm/qa_fmo_config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/tsiadm/qa_fmo_config
Error: UPGRADE FAILED: cannot patch "logging-es-opendistro-es-data" with kind StatefulSet: StatefulSet.apps "logging-es-opendistro-es-data" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden```

couple of things:
ensure storageclass used for this pvc is enabled for volume expansion
edit the pvc to request more space
scale down your stateful set to 0 replicas
wait for the pvc to scale up
scale your stateful set back to whatever replicas it requires

Related

azure kubespray cluster persistant volume claim failing to bound

Deployed kubernetes cluster on azure using kubespray. Configured Cloud Controller Manager and Cloud Node Manager components. The cluster is able to create load balancer for the service in the azure. To this point it was successful story.
I'm trying to setup storage class now, internet talks about only AKS when it comes to azure, but our case is custom k8s cluster on azure.
Deployed the storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
name: azurefile-sc
provisioner: kubernetes.io/azure-file
mountOptions:
- dir_mode=0755
- file_mode=0755
- uid=0
- gid=0
- mfsymlinks
- cache=strict
parameters:
skuName: Standard_LRS
$ kubectl get sc azurefile-sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
azurefile-sc (default) kubernetes.io/azure-file Delete Immediate false 16m
Deployed pvc:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-file
volume.kubernetes.io/storage-provisioner: kubernetes.io/azure-file
name: azurefile-sc-pvc
spec:
accessModes:
- ReadWriteMany
storageClassName: azurefile-sc
resources:
requests:
storage: 1Gi
$ kubectl describe pvc azurefile-sc-pvc
Name: azurefile-sc-pvc
Namespace: default
StorageClass: azurefile-sc
Status: Pending
Volume:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-file
volume.kubernetes.io/storage-provisioner: kubernetes.io/azure-file
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 15s (x16 over 19m) persistentvolume-controller Failed to create provisioner: failed to get Azure Cloud Provider. GetCloudProvider returned <nil> instead
In all the nodes, kubelet service is configured with: KUBELET_CLOUD_PROVIDER="--cloud-provider=extenal" according to kubernetes CCM DOC
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=2"
KUBELET_ADDRESS="--node-ip=10.0.0.135"
KUBELET_HOSTNAME="--hostname-override=minion-2"
KUBELET_ARGS="--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
--config=/etc/kubernetes/kubelet-config.yaml \
--kubeconfig=/etc/kubernetes/kubelet.conf \
--container-runtime=remote \
--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \
--runtime-cgroups=/systemd/system.slice \
"
KUBELET_NETWORK_PLUGIN="--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
KUBELET_CLOUDPROVIDER="--cloud-provider=external"
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Kubernetes nfs-subdir-external-provisioner stuck in ContainerCreating / Unable to attach or mount volumes: unmounted volumes=[nfs-client-root]

I'm trying to install a nfs-client-provisioner and run a mongdb with it.
Unfortunately, the nfs-client-provisioner hangs in ContainerCreating and says "Warning FailedMount 3m35s (x13 over 37m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[nfs-client-root kube-api-access-lr9tl]: timed out waiting for the condition
".
The nfs server is configured on the same VPS machine (Debian 10).
I am able to mount and write files on the nfs server from a second vps with debian 10.
The cluster is setup with K0s
I have an error with the helm chart and manual installation.
Any help is apechiatet1
For Some more info see below:
Helm version:
version.BuildInfo{Version:"v3.8.2", GitCommit:"6e3701edea09e5d55a8ca2aae03a68917630e91b", GitTreeState:"clean", GoVersion:"go1.17.5"}
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6", GitCommit:"ad3338546da947756e8a88aa6822e9c11e7eac22", GitTreeState:"clean", BuildDate:"2022-04-14T08:49:13Z", GoVersion:"go1.17.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5+k0s", GitCommit:"5ab78974affb1a76f1e5687aaa8b02aeac4380b8", GitTreeState:"clean", BuildDate:"2022-03-24T22:59:27Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
k0s version: v1.23.5+k0s.0
worker added with:
token=$(k0s token create --role=worker)
docker run -d --name k0s-worker1 --hostname k0s-worker1 --privileged -v /var/lib/k0s docker.io/k0sproject/k0s:latest k0s worker $token
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k0s-worker9 Ready <none> 42m v1.23.5+k0s
v2202204173709187201 Ready control-plane 43m v1.23.5+k0s
kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
managed-nfs-storage k8s-sigs.io/nfs-subdir-external-provisioner Delete Immediate false 40m
/ect/exports
/data/nfs-storage *(rw,sync,no_root_squash,no_subtree_check,insecure)
Output with
sudo k0s kubectl describe pod nfs-client-provisioner-6889579fdb-t7j74
Name: nfs-client-provisioner-6889579fdb-t7j74
Namespace: default
Priority: 0
Node: k0s-worker9/172.17.0.2
Start Time: Tue, 26 Apr 2022 08:45:49 +0200
Labels: app=nfs-client-provisioner
pod-template-hash=6889579fdb
Annotations: kubernetes.io/psp: 00-k0s-privileged
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/nfs-client-provisioner-6889579fdb
Containers:
nfs-client-provisioner:
Container ID:
Image: gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
PROVISIONER_NAME: k8s-sigs.io/nfs-subdir-external-provisioner
NFS_SERVER: 47.122.181.39
NFS_PATH: /data/nfs-storage
Mounts:
/persistentvolumes from nfs-client-root (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lr9tl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nfs-client-root:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: 47.122.181.39
Path: /data/nfs-storage
ReadOnly: false
kube-api-access-lr9tl:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18m default-scheduler Successfully assigned default/nfs-client-provisioner-6889579fdb-t7j74 to k0s-worker9
Warning FailedMount 2m42s (x6 over 16m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[nfs-client-root kube-api-access-lr9tl]: timed out waiting for the condition
Warning FailedMount 24s (x2 over 7m14s) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[kube-api-access-lr9tl nfs-client-root]: timed out waiting for the condition
command using helm:
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=47.122.181.39 \
--set nfs.path=/data/nfs-storage
without helm:
from https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/tree/v4.0.2/deploy
kubectl create -f rbac.yaml
kubectl create -f class.yaml
kubectl create -f deployment.yaml
class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: managed-nfs-storage
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner # or choose another name, must match deployment's env PROVISIONER_NAME'
parameters:
archiveOnDelete: "false"
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nfs-client-provisioner
labels:
app: nfs-client-provisioner
# replace with namespace where provisioner is deployed
namespace: default
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: nfs-client-provisioner
template:
metadata:
labels:
app: nfs-client-provisioner
spec:
serviceAccountName: nfs-client-provisioner
containers:
- name: nfs-client-provisioner
image: gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.1
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
value: k8s-sigs.io/nfs-subdir-external-provisioner
- name: NFS_SERVER
value: 47.122.181.39
- name: NFS_PATH
value: /data/nfs-storage
volumes:
- name: nfs-client-root
nfs:
server: 47.122.181.39
path: /data/nfs-storage
K0s cluster setup:
sudo curl -sSLf https://get.k0s.sh | sudo sh
sudo k0s install controller --enable-worker
sudo k0s start
sudo cp /var/lib/k0s/pki/admin.conf ~/admin.conf
export KUBECONFIG=~/admin.conf
token=$(k0s token create --role=worker)
docker run -d --name k0s-worker9 --hostname k0s-worker9 --privileged -v /var/lib/k0s docker.io/k0sproject/k0s:latest k0s worker $token
Have you tried:
nfs.mountOptions = {
nfsvers = 4
}
I also used this provisioner. It works only with nfs4. And also check mount point of your NFS server if it's root / or not. If your provisioner is configured and ready try to recreate pvc.

Cannot access Web API deployed in Azure ACS Kubernetes Cluster

Please help. I am trying to deploy a web API to Azure ACS Kubernetes cluster, it is a simple web API created in VSTS and the result should be like this: { "value1", "value2" }.
I plan to make the type as Cluster-IP but I want to test and access it first that is why this is LoadBalancer, the pods is running and no restart (I think it's good).
The guide I'm following is: Running Web API using Docker and Kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 3d
sampleapi-service LoadBalancer 10.0.238.155 102.51.223.6 80:31676/TCP 1h
When I tried to browse the IP 102.51.223.6/api/values it says:
"This site can’t be reached"
service.yaml
kind: Service
apiVersion: v1
metadata:
name: sampleapi-service
labels:
name: sampleapi
app: sampleapi
spec:
selector:
name: sampleapi
ports:
- protocol: "TCP"
# Port accessible inside the cluster
port: 80
# Port to forwards inside the pod
targetPort: 80
# Port accessible oustide the cluster
#nodePort: 80
type: LoadBalancer
deployment.yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: sampleapi-deployment
spec:
replicas: 3
template:
metadata:
labels:
app: sampleapi
spec:
containers:
- name: sampleapi
image: mycontainerregistry.azurecr.io/sampleapi:latest
ports:
- containerPort: 80
POD
Name: sampleapi-deployment-498305766-zzs2z
Namespace: default
Node: c103facs9001/10.240.0.4
Start Time: Fri, 27 Jul 2018 00:20:06 +0000
Labels: app=sampleapi
pod-template-hash=498305766
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"sampleapi-deployme
-498305766","uid":"d064a8e0-9132-11e8-b58d-0...
Status: Running
IP: 10.244.2.223
Controlled By: ReplicaSet/sampleapi-deployment-498305766
Containers:
sampleapi:
Container ID: docker://19d414c87ebafe1cc99d101ac60f1113533e44c24552c75af4ec197d3d3c9c53
Image: mycontainerregistry.azurecr.io/sampleapi:latest
Image ID: docker-pullable://mycontainerregistry.azurecr.io/sampleapi#sha256:9635a9df168ef76a6a27cd46cb15620d762657e9b57a5ac2514ba0b9a8f47a8d
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 27 Jul 2018 00:20:48 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mj5m1 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-mj5m1:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mj5m1
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 50m default-scheduler Successfully assigned sampleapi-deployment-498305766-zzs2z to c103facs9001
Normal SuccessfulMountVolume 50m kubelet, c103facs9001 MountVolume.SetUp succeeded for volume "default-token-mj5m1"
Normal Pulling 49m kubelet, c103facs9001 pulling image "mycontainerregistry.azurecr.io/sampleapi:latest"
Normal Pulled 49m kubelet, c103facs9001 Successfully pulled image "mycontainerregistry.azurecr.io/sampleapi:latest"
Normal Created 49m kubelet, c103facs9001 Created container
Normal Started 49m kubelet, c103facs9001 Started container
It seems like to me that your service isn't set to a port on the container. You have your targetPort commented out. So the service is reachable on port 80 but the service doesn't know to target the pod on that port.
You will need to start the service which exposes the internal port to some external Ip:port that can be used in your browser to access the service. try this after deploying your deployment and service yml files:
kubectl service sampleapi-service

Azure Container Services: trying and failing to pull image

I'm trying to deploy my k8s cluster. But when I do, it can't pull the image. Here's what I get when I run kubectl describe pods:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 47m kubelet, dc9ebacs9000 Back-off pulling image "tlk8s.azurecr.io/devicecloudwebapi:v1"
Warning FailedSync 9m (x3 over 47m) kubelet, dc9ebacs9000 Error syncing pod
Warning Failed 9m kubelet, dc9ebacs9000 Failed to pull image "tlk8s.azurecr.io/devicecloudwebapi:v1": [rpc error: code = 2 desc = failed to register layer: re-exec error: exit status 1: output: remove \\?\C:\ProgramData\docker\windowsfilter\930af9d006462c904d9114da95523cc441206db8bb568769f4f0612d3a96da5b\Files\Windows\System32\LogFiles\Scm\SCM.EVM: The system cannot find the file specified., rpc error: code = 2 desc = failed to register layer: re-exec error: exit status 1: output: remove \\?\C:\ProgramData\docker\windowsfilter\e30d44f97c53edf7e91c69f246fe753a84e4cb40899f472f75aae6e6d74b5c45\Files\Windows\System32\LogFiles\Scm\SCM.EVM: The system cannot find the file specified.]
Normal Pulling 9m (x3 over 2h) kubelet, dc9ebacs9000 pulling image "tlk8s.azurecr.io/devicecloudwebapi:v1"
Here's what I get when I look at the individual pod:
Error from server (BadRequest): container "tl-api" in pod "tl-api-3363368743-d7kjq" is waiting to start: image can't be pulled
Here's my YAML file:
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: tl-api
spec:
replicas: 1
template:
metadata:
labels:
app: tl-api
spec:
containers:
- name: tl-api
image: tlk8s.azurecr.io/devicecloudwebapi:v1
ports:
- containerPort: 80
imagePullSecrets:
- name: acr-secret
nodeSelector:
beta.kubernetes.io/os: windows
---
apiVersion: v1
kind: Service
metadata:
name: tl-api
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: tl-api
My docker images result:
REPOSITORY TAG IMAGE ID CREATED SIZE
devicecloudwebapi latest ee3d9c3e231d 8 days ago 7.85GB
tlk8s.azurecr.io/devicecloudwebapi v1 ee3d9c3e231d 8 days ago 7.85GB
devicecloudwebapi dev bb33ab221910 8 days ago 7.76GB
You must create a secret to your registry in kubectl:
kubectl create secret docker-registry <secret-name> \
--namespace <namespace> \
--docker-server=<container-registry-name>.azurecr.io \
--docker-username=<service-principal-ID> \
--docker-password=<service-principal-password>
More info: https://learn.microsoft.com/pt-br/azure/container-registry/container-registry-auth-kubernetes
Remember to set the "imagePullSecrets" into your spec.
apiVersion: v1
kind: Pod
metadata: #informaçoes internas do container
name: mongodb-pod
spec: #maneira com o pod tem que se comportar
containers: # informações sobre os containeres que irão rodar no pod
- name: mongodb
image: mongo
ports:
- containerPort: 27017
imagePullSecrets:
- name: <secret-name>
First, I would double check you are logged into docker at the right registry via cli.
something like docker login <REGISTRY_NAME> -u <CLIENT_ID>
You will want to make sure you have created a k8s secret and bound it to the registry. Maybe check out this post if you haven't already done so. I see your yaml specifies a secret, but is this configured on the registry as well?

Kubernetes DNS fails in Kubernetes 1.2

I'm attempting to set up DNS support in Kubernetes 1.2 on Centos 7. According to the documentation, there's two ways to do this. The first applies to a "supported kubernetes cluster setup" and involves setting environment variables:
ENABLE_CLUSTER_DNS="${KUBE_ENABLE_CLUSTER_DNS:-true}"
DNS_SERVER_IP="10.0.0.10"
DNS_DOMAIN="cluster.local"
DNS_REPLICAS=1
I added these settings to /etc/kubernetes/config and rebooted, with no effect, so either I don't have a supported kubernetes cluster setup (what's that?), or there's something else required to set its environment.
The second approach requires more manual setup. It adds two flags to kubelets, which I set by updating /etc/kubernetes/kubelet to include:
KUBELET_ARGS="--cluster-dns=10.0.0.10 --cluster-domain=cluster.local"
and restarting the kubelet with systemctl restart kubelet. Then it's necessary to start a replication controller and a service. The doc page cited above provides a couple of template files for this that require some editing, both for local changes (my Kubernetes API server listens to the actual IP address of the hostname rather than 127.0.0.1, making it necessary to add a --kube-master-url setting) and to remove some Salt dependencies. When I do this, the replication controller starts four containers successfully, but the kube2sky container gets terminated about a minute after completing initialization:
[david#centos dns]$ kubectl --server="http://centos:8080" --namespace="kube-system" logs -f kube-dns-v11-t7nlb -c kube2sky
I0325 20:58:18.516905 1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0325 20:58:19.518337 1 kube2sky.go:529] Using http://192.168.87.159:8080 for kubernetes master
I0325 20:58:19.518364 1 kube2sky.go:530] Using kubernetes API v1
I0325 20:58:19.518468 1 kube2sky.go:598] Waiting for service: default/kubernetes
I0325 20:58:19.533597 1 kube2sky.go:660] Successfully added DNS record for Kubernetes service.
F0325 20:59:25.698507 1 kube2sky.go:625] Received signal terminated
I've determined that the termination is done by the healthz container after reporting:
2016/03/25 21:00:35 Client ip 172.17.42.1:58939 requesting /healthz probe servicing cmd nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
2016/03/25 21:00:35 Healthz probe error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local', at 2016-03-25 21:00:35.608106622 +0000 UTC, error exit status 1
Aside from this, all other logs look normal. However, there is one anomaly: it was necessary to specify --validate=false when creating the replication controller, as the command otherwise gets the message:
error validating "skydns-rc.yaml": error validating data: [found invalid field successThreshold for v1.Probe, found invalid field failureThreshold for v1.Probe]; if you choose to ignore these errors, turn validation off with --validate=false
Could this be related? These arguments come directly Kubernetes documentation. if not, what's needed to get this running?
Here is the skydns-rc.yaml I used:
apiVersion: v1
kind: ReplicationController
metadata:
name: kube-dns-v11
namespace: kube-system
labels:
k8s-app: kube-dns
version: v11
kubernetes.io/cluster-service: "true"
spec:
replicas: 1
selector:
k8s-app: kube-dns
version: v11
template:
metadata:
labels:
k8s-app: kube-dns
version: v11
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: etcd
image: gcr.io/google_containers/etcd-amd64:2.2.1
resources:
# TODO: Set memory limits when we've profiled the container for large
# clusters, then set request = limit to keep this container in
# guaranteed class. Currently, this container falls into the
# "burstable" category so the kubelet doesn't backoff from restarting it.
limits:
cpu: 100m
memory: 500Mi
requests:
cpu: 100m
memory: 50Mi
command:
- /usr/local/bin/etcd
- -data-dir
- /var/etcd/data
- -listen-client-urls
- http://127.0.0.1:2379,http://127.0.0.1:4001
- -advertise-client-urls
- http://127.0.0.1:2379,http://127.0.0.1:4001
- -initial-cluster-token
- skydns-etcd
volumeMounts:
- name: etcd-storage
mountPath: /var/etcd/data
- name: kube2sky
image: gcr.io/google_containers/kube2sky:1.14
resources:
# TODO: Set memory limits when we've profiled the container for large
# clusters, then set request = limit to keep this container in
# guaranteed class. Currently, this container falls into the
# "burstable" category so the kubelet doesn't backoff from restarting it.
limits:
cpu: 100m
# Kube2sky watches all pods.
memory: 200Mi
requests:
cpu: 100m
memory: 50Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /readiness
port: 8081
scheme: HTTP
# we poll on pod startup for the Kubernetes master service and
# only setup the /readiness HTTP server once that's available.
initialDelaySeconds: 30
timeoutSeconds: 5
args:
# command = "/kube2sky"
- --domain="cluster.local"
- --kube-master-url=http://192.168.87.159:8080
- name: skydns
image: gcr.io/google_containers/skydns:2015-10-13-8c72f8c
resources:
# TODO: Set memory limits when we've profiled the container for large
# clusters, then set request = limit to keep this container in
# guaranteed class. Currently, this container falls into the
# "burstable" category so the kubelet doesn't backoff from restarting it.
limits:
cpu: 100m
memory: 200Mi
requests:
cpu: 100m
memory: 50Mi
args:
# command = "/skydns"
- -machines=http://127.0.0.1:4001
- -addr=0.0.0.0:53
- -ns-rotate=false
- -domain="cluster.local"
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- name: healthz
image: gcr.io/google_containers/exechealthz:1.0
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
args:
- -cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
- -port=8080
ports:
- containerPort: 8080
protocol: TCP
volumes:
- name: etcd-storage
emptyDir: {}
dnsPolicy: Default # Don't use cluster DNS.
and skydns-svc.yaml:
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "KubeDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: "10.0.0.10"
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
I just commented out the lines that contain the successThreshold and failureThreshold values in skydns-rc.yaml, then re-run the kubectl commands.
kubectl create -f skydns-rc.yaml
kubectl create -f skydns-svc.yaml

Resources