Kind (Kubernetes) cluster throwing ImagePullBackOff error - linux

I need to pull an image from the public Docker registry (hello-world:latest) and run it on Kubernetes. I created the cluster using Kind. I ran the image using the command below:
kubectl run test-pod --image=hello-world
Then I did
kubectl describe pods
to get the status of the pod. It shows an ImagePullBackOff error; please find the pod output below. It seems there is some network issue when pulling the image from within the kind cluster, although I am able to pull the image with Docker directly.
I have searched the whole internet regarding this issue but nothing has worked out. Following is my pod specification:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2022-05-16T15:01:17Z"
  labels:
    run: test-pod
  name: test-pod
  namespace: default
  resourceVersion: "4370"
  uid: 6ef121e2-805b-4022-9a13-c17c031aea88
spec:
  containers:
  - image: hello-world
    imagePullPolicy: Always
    name: test-pod
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-jjsmp
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: kind-control-plane
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-jjsmp
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-05-16T15:01:17Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-05-16T15:01:17Z"
    message: 'containers with unready status: [test-pod]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  containerStatuses:
  - image: hello-world
    imageID: ""
    lastState: {}
    name: test-pod
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: Back-off pulling image "hello-world"
        reason: ImagePullBackOff
  hostIP: 172.18.0.2
  phase: Pending
  podIP: 10.244.0.5
  podIPs:
  - ip: 10.244.0.5
  qosClass: BestEffort
  startTime: "2022-05-16T15:01:17Z"

The ImagePullBackOff error means that Kubernetes couldn't pull the image from the registry and will keep retrying, with an increasing back-off delay capped at a compiled-in limit of 300 seconds (5 minutes) between attempts. This usually happens because Kubernetes is facing one of the following conditions:
You have exceeded the rate or download limit on the registry.
The image registry requires authentication.
There is a typo in the image name or tag.
The image or tag does not exist.
You can start by checking whether you can pull the image locally with Docker, or by getting a shell on the node and pulling the image there directly (see the sketch below).
If you still can't pull the image, another option is to add 8.8.8.8 as a nameserver in /etc/resolv.conf on the node.
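For a kind cluster specifically, the "node" is itself a Docker container, so a rough way to test the pull from the node (a sketch, assuming the default node name kind-control-plane) is:
docker exec -it kind-control-plane crictl pull docker.io/library/hello-world:latest
# if the pull fails, check DNS resolution inside the node
docker exec -it kind-control-plane cat /etc/resolv.conf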
Update:
To avoid exposing your kind cluster to the internet, you can pull the image locally on your PC, manually specifying a path on a different registry.
Sample:
docker pull myregistry.local:5000/testing/test-image
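If you can already pull the image with Docker on your machine, another common option with kind (a sketch, assuming the default cluster name kind) is to load the local image into the cluster nodes so the kubelet does not need to reach the registry at all:
docker pull hello-world:latest
kind load docker-image hello-world:latest
Note that with imagePullPolicy: Always the kubelet still contacts the registry, so you would also set imagePullPolicy to IfNotPresent (or Never) for the pre-loaded image to be used.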

Related

Kubernetes - "Mount Volume Failed" when trying to deploy

I deployed my first container and got the info:
deployment.apps/frontarena-ads-deployment created
but then I saw that container creation was stuck in Waiting status.
I checked the events using kubectl describe pod frontarena-ads-deployment-5b475667dd-gzmlp and saw a MountVolume error which I cannot figure out why it is thrown:
Warning  FailedMount  9m24s  kubelet  MountVolume.SetUp failed for volume "ads-filesharevolume" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/85aa3bfa-341a-4da1-b3de-fb1979420028/volumes/kubernetes.io~azure-file/ads-filesharevolume --scope -- mount -t cifs -o username=frontarenastorage,password=mypassword,file_mode=0777,dir_mode=0777,vers=3.0 //frontarenastorage.file.core.windows.net/azurecontainershare /var/lib/kubelet/pods/85aa3bfa-341a-4da1-b3de-fb1979420028/volumes/kubernetes.io~azure-file/ads-filesharevolume
Output: Running scope as unit run-rf54d5b5f84854777956ae0e25810bb94.scope.
mount error(115): Operation now in progress
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
Before running the deployment I created a secret, using the already created Azure file share, which I reference within the YAML:
$AKS_PERS_STORAGE_ACCOUNT_NAME="frontarenastorage"
$STORAGE_KEY="mypassword"
kubectl create secret generic fa-fileshare-secret --from-literal=azurestorageaccountname=$AKS_PERS_STORAGE_ACCOUNT_NAME --from-literal=azurestorageaccountkey=$STORAGE_KEY
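As a quick sanity check (just a sketch, nothing specific to this setup), you can confirm that the secret exists and carries the two keys the azureFile volume type expects:
kubectl describe secret fa-fileshare-secret
# should list the keys azurestorageaccountname and azurestorageaccountkey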
In that file share I have folders and files which I need to mount, and I reference azurecontainershare in the YAML. My YAML looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontarena-ads-deployment
  labels:
    app: frontarena-ads-deployment
spec:
  replicas: 1
  template:
    metadata:
      name: frontarena-ads-aks-test
      labels:
        app: frontarena-ads-aks-test
    spec:
      containers:
      - name: frontarena-ads-aks-test
        image: faselect-docker.dev/frontarena/ads:test1
        imagePullPolicy: Always
        ports:
        - containerPort: 9000
        volumeMounts:
        - name: ads-filesharevolume
          mountPath: /opt/front/arena/host
      volumes:
      - name: ads-filesharevolume
        azureFile:
          secretName: fa-fileshare-secret
          shareName: azurecontainershare
          readOnly: false
      imagePullSecrets:
      - name: fa-repo-secret
  selector:
    matchLabels:
      app: frontarena-ads-aks-test
The issue was caused by the AKS cluster and the Azure file share being deployed in different Azure regions. If they are in the same region you will not have this issue.
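To compare the two regions, a sketch using the Azure CLI (the resource group and cluster names are placeholders for your own):
az aks show --resource-group <aks-rg> --name <aks-cluster> --query location -o tsv
az storage account show --resource-group <storage-rg> --name frontarenastorage --query location -o tsv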

AKS Azure DevOps Build Agent

I'm trying to build an Azure DevOps Linux build agent in Azure Kubernetes Service.
I created the YAML file and created the secrets to use inside of the file.
I applied the file and my pod is stuck in a "waiting" state with "CreateContainerConfigError".
I ran
kubectl get pod <pod name> -o yaml
and it states the secret "vsts" could not be found.
I find this weird because when I run "kubectl get secrets" I see the secrets "vsts-account" and "vsts-token" listed.
You may check your Kubernetes configuration, which should look like the one below:
apiVersion: v1
kind: ReplicationController
metadata:
  name: vsts-agent
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: vsts-agent
        version: "0.1"
    spec:
      containers:
      - name: vsts-agent
        image: microsoft/vsts-agent:ubuntu-16.04-docker-18.06.1-ce-standard
        env:
        - name: VSTS_ACCOUNT
          valueFrom:
            secretKeyRef:
              name: vsts
              key: VSTS_ACCOUNT
        - name: VSTS_TOKEN
          valueFrom:
            secretKeyRef:
              name: vsts
              key: VSTS_TOKEN
        - name: VSTS_POOL
          value: dockerized-vsts-agents
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-volume
      volumes:
      - name: docker-volume
        hostPath:
          path: /var/run/docker.sock
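Note that the manifest above reads both values from a single secret named vsts with the keys VSTS_ACCOUNT and VSTS_TOKEN, whereas your cluster has two separate secrets, vsts-account and vsts-token. One way to reconcile this (a sketch with placeholder values) is to create a single secret holding both keys:
kubectl create secret generic vsts \
  --from-literal=VSTS_ACCOUNT=<your-azure-devops-organization> \
  --from-literal=VSTS_TOKEN=<your-personal-access-token>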
You may follow the blog below to see whether it helps you:
https://mohitgoyal.co/2019/01/10/run-azure-devops-private-agents-in-kubernetes-clusters/

Unable to access local certificate on kubernetes cluster

I have a Node application running in a container that works well when I run it locally on Docker.
When I try to run it in my k8s cluster, I get the following error.
kubectl -n some-namespace logs --follow my-container-5d7dfbf876-86kv7
> code#1.0.0 my-container /src
> node src/app.js
Error: unable to get local issuer certificate
at TLSSocket.onConnectSecure (_tls_wrap.js:1486:34)
at TLSSocket.emit (events.js:315:20)
at TLSSocket._finishInit (_tls_wrap.js:921:8)
at TLSWrap.ssl.onhandshakedone (_tls_wrap.js:695:12) {
code: 'UNABLE_TO_GET_ISSUER_CERT_LOCALLY'
}
This is strange, as the only difference is that I run the container with
command: ["npm", "run", "consumer"]
I have also tried adding to my Dockerfile
npm config set strict-ssl false
as per the recommendation here: npm install error - unable to get local issuer certificate, but it doesn't seem to help.
So it should be trying to authenticate this way.
I would appreciate any pointers on this.
Here is a copy of my .yaml file for completeness.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: label
  name: label
  namespace: some-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      name: label
  template:
    metadata:
      labels:
        name: label
    spec:
      containers:
      - name: label
        image: some-registry:latest
        resources:
          limits:
            memory: 7000Mi
            cpu: '3'
        ports:
        - containerPort: 80
        command: ["npm", "run", "application"]
        env:
        - name: "DATABASE_URL"
          valueFrom:
            secretKeyRef:
              name: postgres
              key: DBUri
        - name: "DEBUG"
          value: "*,-babel,-mongo:*,mongo:queries,-http-proxy-agent,-https-proxy-agent,-proxy-agent,-superagent,-superagent-proxy,-sinek*,-kafka*"
        - name: "ENV"
          value: "production"
        - name: "NODE_ENV"
          value: "production"
        - name: "SERVICE"
          value: "consumer"
        volumeMounts:
        - name: certs
          mountPath: /etc/secrets
          readOnly: true
      volumes:
      - name: certs
        secret:
          secretName: certs
          items:
          - key: certificate
            path: certificate
          - key: key
            path: key
It looks like the pod is not mounting the secrets in the right place. Make sure that the volumeMounts.mountPath in the container spec points to the path where the application expects to find the certificate.
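If the certificate is mounted but Node still cannot build the trust chain, one common approach (a sketch, assuming the CA certificate is the file mounted at /etc/secrets/certificate in your manifest) is to point Node at it explicitly with the NODE_EXTRA_CA_CERTS environment variable, added to the existing env list:
        - name: "NODE_EXTRA_CA_CERTS"
          value: "/etc/secrets/certificate"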

Not able to create Azure Container Instance with CLI using private image

I can't deploy a pod using a private image (ACR) with the CLI and a YAML file.
Deploying directly from the registry using either az container or kubectl run does work, however.
Pod status:
"containers": [
{
"count": 3,
"firstTimestamp": "2017-08-26T07:31:36+00:00",
"lastTimestamp": "2017-08-26T07:32:20+00:00",
"message": "Failed: Failed to pull image \"ucont01.azurecr.io/unreal-deb\": rpc error: code 2 desc Error: im age unreal-deb:latest not found",
"type": "Warning"
},
],
},
Yaml file:
apiVersion: v1
kind: Pod
metadata:
  generateName: "game-"
  namespace: default
spec:
  nodeName: aci-connector
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  containers:
  - name: unreal-dev-server
    image: ucont01.azurecr.io/unreal-deb
    imagePullPolicy: Always
    ports:
    - containerPort: 7777
      protocol: UDP
  imagePullSecrets:
  - name: registrykey
Unfortunately the aci-connector-k8s doesn't currently support images from private repositories. There is an issue open to add support but it's not currently implemented.
https://github.com/Azure/aci-connector-k8s/issues/35
According to your description, could you please check your repositories via the Azure portal. Using your YAML, it works for me:
apiVersion: v1
kind: Pod
metadata:
  generateName: "game-"
  namespace: default
spec:
  nodeName: k8s-agent-379980cb-0
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  containers:
  - name: unreal-dev-server
    image: jasontest.azurecr.io/samples/nginx
    imagePullPolicy: Always
    ports:
    - containerPort: 7777
      protocol: TCP
  imagePullSecrets:
  - name: secret1
Here is my secret:
jason@k8s-master-379980CB-0:~$ kubectl get secret
NAME                  TYPE                                  DATA      AGE
default-token-865dj   kubernetes.io/service-account-token   3         1h
secret1               kubernetes.io/dockercfg               1         47m
If the credentials in the registrykey secret are incorrect, you may get an 'image not found' error even though the image exists. You may want to verify the registrykey credentials again.
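If you need to recreate the pull secret, here is a sketch with the ACR credentials (username, password, and email are placeholders to fill in from the registry's access keys):
kubectl create secret docker-registry registrykey \
  --docker-server=ucont01.azurecr.io \
  --docker-username=<acr-username> \
  --docker-password=<acr-password> \
  --docker-email=<any-email>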

Kubernetes Unable to mount volumes for pod with timeout

I am trying to mount an NFS volume to my pods but with no success.
I have a server exporting the NFS mount point. When I try to connect to it from some other running server,
sudo mount -t nfs -o proto=tcp,port=2049 10.0.0.4:/export /mnt
works fine.
Another thing worth mentioning: when I remove the volume from the deployment and the pod is running, I can log into it and telnet to 10.0.0.4 on ports 111 and 2049 successfully, so there really doesn't seem to be any communication problem. The NFS exports also look fine:
showmount -e 10.0.0.4
Export list for 10.0.0.4:
/export/drive 10.0.0.0/16
/export 10.0.0.0/16
So I can assume there are no network or configuration problems between the server and the client (I am using Amazon, and the server I tested on is in the same security group as the k8s minions).
P.S.:
The NFS server is a simple Ubuntu machine with a 50 GB disk.
Kubernetes v1.3.4
So I start by creating my PV:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.4
    path: "/export"
And my PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
Here is how kubectl describes them:
Name: nfs
Labels: <none>
Status: Bound
Claim: default/nfs-claim
Reclaim Policy: Retain
Access Modes: RWX
Capacity: 50Gi
Message:
Source:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: 10.0.0.4
Path: /export
ReadOnly: false
No events.
AND
Name: nfs-claim
Namespace: default
Status: Bound
Volume: nfs
Labels: <none>
Capacity: 0
Access Modes:
No events.
pod deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mypod
  labels:
    name: mypod
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      name: mypod
      labels:
        # Important: these labels need to match the selector above, the api server enforces this constraint
        name: mypod
    spec:
      containers:
      - name: abcd
        image: irrelevant to the question
        ports:
        - containerPort: 80
        env:
        - name: hello
          value: world
        volumeMounts:
        - mountPath: "/mnt"
          name: nfs
      volumes:
      - name: nfs
        persistentVolumeClaim:
          claimName: nfs-claim
When I deploy my pod I get the following:
Volumes:
  nfs:
    Type:      PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: nfs-claim
    ReadOnly:  false
  default-token-6pd57:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-6pd57
QoS Tier: BestEffort
Events:
  FirstSeen  LastSeen  Count  From                                                 SubobjectPath  Type     Reason       Message
  ---------  --------  -----  ----                                                 -------------  ----     ------       -------
  13m        13m       1      {default-scheduler }                                                Normal   Scheduled    Successfully assigned xxx-2140451452-hjeki to ip-10-0-0-157.us-west-2.compute.internal
  11m        7s        6      {kubelet ip-10-0-0-157.us-west-2.compute.internal}                  Warning  FailedMount  Unable to mount volumes for pod "xxx-2140451452-hjeki_default(93ca148d-6475-11e6-9c49-065c8a90faf1)": timeout expired waiting for volumes to attach/mount for pod "xxx-2140451452-hjeki"/"default". list of unattached/unmounted volumes=[nfs]
  11m        7s        6      {kubelet ip-10-0-0-157.us-west-2.compute.internal}                  Warning  FailedSync   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "xxx-2140451452-hjeki"/"default". list of unattached/unmounted volumes=[nfs]
I have tried everything I know and everything I can think of. What am I missing or doing wrong here?
I tested versions 1.3.4 and 1.3.5 of Kubernetes and the NFS mount didn't work for me. Later I switched to 1.2.5 and that version gave me more detailed info (kubectl describe pod ...). It turned out that nfs-common is missing from the hyperkube image. After I added nfs-common to all container instances based on the hyperkube image on the master and worker nodes, the NFS share started to work normally (the mount was successful). I tested it in practice and it solved my problem.
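A sketch of that workaround, assuming Debian/Ubuntu-based hyperkube containers (the container name is a placeholder and depends on how the cluster was deployed):
# run on every master and worker node, for each hyperkube-based container
docker exec <hyperkube-container> apt-get update
docker exec <hyperkube-container> apt-get install -y nfs-common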
