At first, I set a LimitRange for the namespace kube-system as below:
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-limit-range
  namespace: kube-system
spec:
  limits:
  - default:
      cpu: 500m
      memory: 500Mi
    defaultRequest:
      cpu: 100m
      memory: 100Mi
    type: Container
However, I later found that there is insufficient CPU and memory to start up my pod, because the limits in namespace kube-system already add up to more than 100%.
How can I reset reasonable limits for the pods in kube-system? Ideally I would set their limits to unlimited, but I don't know how to do that.
Supplement information for namespace kube-system:
Not sure if your kube-system namespace has a limit set. You can confirm it by checking the namespace itself:
kubectl describe namespace kube-system
If you have a limit range or a resource quota set, it will appear in the description. Something like the following:
Name: default-cpu-example
Labels: <none>
Annotations: <none>
Status: Active
No resource quota.
Resource Limits
Type Resource Min Max Default Request Default Limit Max Limit/Request Ratio
---- -------- --- --- --------------- ------------- -----------------------
Container cpu - - 500m 1 -
In this case I have set resource limits for my namespace.
Now I can list all the ResourceQuotas and LimitRanges using:
kubectl get resourcequotas -n kube-system
kubectl get limitranges -n kube-system
If something is returned, you can simply remove it:
kubectl delete resourcequotas NAME_OF_YOUR_RESOURCE_QUOTA -n kube-system
kubectl delete limitranges NAME_OF_YOUR_LIMIT_RANGE -n kube-system
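If you would rather keep defaults than remove them entirely, you can also edit the existing LimitRange in place and raise the values (a sketch, assuming the LimitRange named cpu-limit-range from the question):
kubectl edit limitrange cpu-limit-range -n kube-system
Note that deleting the LimitRange only stops defaults from being injected into new containers; containers that don't declare their own requests and limits then run effectively unlimited, which is what the question asks for.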
I'm still not sure if that's your true problem, but that answers your question.
You can find more info here:
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/
I'd like to host Windows containers, which act as build agents, on an Azure Kubernetes Service instance - unfortunately I can't increase the default 20 GB pod disk space. I'd need more disk space for running build jobs on the pods.
The pod is getting deployed using an ADO pipeline by applying YAML which describes the workload.
Attaching to the pod and probing the disk space results in the following:
PS C:\> Get-PSDrive C
Name Used (GB) Free (GB) Provider Root
---- --------- --------- -------- ----
C 0.31 19.57 FileSystem C:\
Does anybody know how to increase the disk space?
On our on-premises cluster this is possible by adding
--storage-opt 50G
as a parameter to the modified Docker service configuration.
But how does it work for AKS?
Thank you a lot in advance!
We can increase the pod disk size in AKS by creating the disk manually using a persistent volume.
By default the disk size will be 4 GiB; for me it was 30 GiB, and I increased it to 50 GiB.
To increase the disk size, please follow the steps below.
I have created a storage class for the disk:
vi sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azuredisk-premium-retain
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed
To deploy the storage class, use the below command:
kubectl apply -f sc.yaml
Use the below command to check whether the storage class was created:
kubectl get sc
I have created a persistent volume claim to provision the disk manually:
vi pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: azuredisk-premium-retain
  resources:
    requests:
      storage: 50Gi
In the PVC file I am increasing the storage request to 50Gi (Kubernetes uses the Gi suffix, not GiB).
To deploy the PVC, use the below commands:
kubectl apply -f pvc.yaml
kubectl get pvc
I have created a pod that mounts the volume:
vi pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: newpod # pod name
spec:
  containers:
  - name: newpod
    image: nginx:latest
    volumeMounts:
    - mountPath: "/mnt/azure" # mounting the volume
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: azure-managed-disk-pvc
To deploy the pod
kubectl apply -f pod.yaml
kubectl get pods
After deploying the PVC, go to the portal > Disks and search for the PVC name you created; the disk will have been created with 50 GiB.
Previously it was 30 GiB; now it has increased to 50 GiB.
NOTE: we cannot decrease the disk size once it has been increased.
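Since the storage class above sets allowVolumeExpansion: true, you should also be able to grow an existing claim without recreating it. A sketch, assuming the PVC from the steps above and a new size of 100Gi:
kubectl patch pvc azure-managed-disk-pvc -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
Depending on the disk driver version, the pod may need to be deleted so the disk detaches before the resize completes.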
Reference:
MS-DOC
Here is the output I am getting:
[root@ip-10-0-3-103 ec2-user]# kubectl get pod --namespace=migration
NAME READY STATUS RESTARTS AGE
clear-nginx-deployment-cc77649fb-j8mzj 0/1 Pending 0 118m
clear-nginx-deployment-temp-cc77649fb-hxst2 0/1 Pending 0 41s
I could not understand the message shown in the JSON:
*"status":
{
"conditions": [
{
"message": "0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.",
"reason": "Unschedulable",
"status": "False",
"type": "PodScheduled"
}
],
"phase": "Pending",
"qosClass": "BestEffort"
}*
Could you please help me get through this?
The earlier question on Stack Overflow doesn't answer my query, as my message output is different.
This is because your Pods have been instructed to claim storage; however, in your case there is no storage available to satisfy the claim.
Check your Pods with kubectl get pods <pod-name> -o yaml and look at the exact yaml that has been applied to the cluster. In there you should be able to see that the Pod is trying to claim a PersistentVolume (PV).
To quickly create a PV backed by a hostPath, apply the following YAML:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: stackoverflow-hostpath
  # PersistentVolumes are cluster-scoped, so no namespace is needed
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Kubernetes retries scheduling with exponential backoff, so rebinding can take a while; to speed things up, delete one of your pods (kubectl delete pods <pod-name>) to reschedule it immediately.
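To confirm the claim actually binds after creating the PV, a quick check (using the migration namespace from the question):
kubectl get pv
kubectl get pvc -n migration
The PVC STATUS should switch from Pending to Bound.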
I have a simple deployment:
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
And here is what my cluster looks like. Pretty simple.
$kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-shell-95cb5df57-cdj4z 1/1 Running 0 23m 10.60.1.32 aks-nodepool-19248108-0 <none> <none>
nginx-deployment-76bf4969df-58d66 1/1 Running 0 36m 10.60.1.10 aks-nodepool-19248108-0 <none> <none>
nginx-deployment-76bf4969df-jfkq7 1/1 Running 0 36m 10.60.1.21 aks-nodepool-19248108-0 <none> <none>
$kubectl get services -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
internal-ingress LoadBalancer 10.0.0.194 10.60.1.35 80:30157/TCP 5m28s app=nginx-deployment
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 147m <none>
$kubectl get rs -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
my-shell-95cb5df57 1 1 1 23m my-shell ubuntu pod-template-hash=95cb5df57,run=my-shell
nginx-deployment-76bf4969df 2 2 2 37m nginx nginx:1.7.9 app=nginx,pod-template-hash=76bf4969df
I see I have 2 pods with my nginx app. I want to be able to send a request from any other new pod to either one of them. If one crashes, I want to still be able to send this request.
In the past I used a load balancer for this. The problem with load balancers is that they open up a public IP, and in this specific scenario I don't want a public IP anymore. I want this service to be invoked by other pods directly, without a public IP.
I tried to use an internal load balancer.
apiVersion: v1
kind: Service
metadata:
  name: internal-ingress
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "my-subnet"
spec:
  type: LoadBalancer
  loadBalancerIP: 10.60.1.45
  ports:
  - port: 80
  selector:
    app: nginx-deployment
The problem is that it does not get an IP in my 10.60.0.0/16 network like it is described here: https://learn.microsoft.com/en-us/azure/aks/internal-lb#specify-a-different-subnet
I get this never-ending <pending>:
kubectl get services -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
internal-ingress LoadBalancer 10.0.0.230 <pending> 80:30638/TCP 15s app=nginx-deployment
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 136m <none>
What am I missing? How to troubleshoot? Is it even possible to have pod to service communication?
From the message you provide, it seems you want to use a specific private IP address from the same subnet that the AKS cluster uses. I think the likely reason is that the IP address you want to use is already assigned by AKS, which means you cannot use it.
Troubleshooting
So you need to go to the VNet your AKS cluster uses and check whether the IP address is already in use.
Solution
Choose an IP address from the subnet the AKS cluster uses that is not already assigned, or do not specify one at all and let AKS assign the load balancer IP dynamically. Then change your YAML file like below:
apiVersion: v1
kind: Service
metadata:
  name: internal-ingress
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: nginx-deployment
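You can then watch the service until AKS allocates the internal IP (a quick check, not part of the original answer):
kubectl get service internal-ingress --watch
The EXTERNAL-IP column should change from <pending> to a private address from the cluster's subnet.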
Use a ClusterIP Service (the default type) which creates only a cluster-internal IP and no public IP:
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
Then you can access the Service (and thus the Pods behind it) from any other Pod in the same namespace by using the Service name as the DNS name:
curl nginx-service
If the Pod from which you want to access the Service is in a different namespace, you have to use the fully qualified domain name of the Service:
curl nginx-service.my-namespace.svc.cluster.local
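A quick way to try this from a throwaway pod (a sketch; busybox is just an example image that ships with wget):
kubectl run -it --rm test-client --image=busybox --restart=Never -- wget -qO- http://nginx-service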
First off a disclaimer: I have only been using Azure's Kubernetes framework for a short while so my apologies for asking what might be an easy problem.
I have two Kubernetes services running in AKS. I want these services to be able to discover each other by service name. The pods associated with these services are each given an IP from the subnet I've assigned to my cluster:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP ...
tom 1/1 Running 0 69m 10.0.2.10 ...
jerry 1/1 Running 5 67m 10.0.2.21 ...
If I make REST calls between these services using their pod IPs directly, the calls work as expected. Of course I don't want to use hard-coded IPs. In reading up on kube-dns, my understanding is that entries for registered services are created in the DNS. The tests I've done confirm this, but the IP addresses assigned to the DNS entries are not the IP addresses of the pods. For example:
$ kubectl exec jerry -- ping -c 1 tom.default
PING tom.default (10.1.0.246): 56 data bytes
The IP address that is associated with the service tom is the so-called "cluster ip":
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
tom ClusterIP 10.1.0.246 <none> 6010/TCP 21m
jerry ClusterIP 10.1.0.247 <none> 6040/TCP 20m
The same is true with the service jerry. The problem with these IP addresses is that REST calls using these addresses do not work. Even a simple ping times out. So my question is how can I associate the kube-dns entry that's created for a service with the pod IP instead of the cluster IP?
Based on the posted answer, I updated my yml file for "tom" as follows:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: tom
spec:
  template:
    metadata:
      labels:
        app: tom
    spec:
      containers:
      - name: tom
        image: myregistry.azurecr.io/tom:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 6010
---
apiVersion: v1
kind: Service
metadata:
  name: tom
spec:
  ports:
  - port: 6010
    name: "6010"
  selector:
    app: tom
and then re-applied the update. I still get the cluster IP though when I try to resolve tom.default, not the pod IP. I'm still missing part of the puzzle.
Update: As requested, here's the describe output for tom:
$ kubectl describe service tom
Name: tom
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"tom","namespace":"default"},"spec":{"ports":[{"name":"6010","po...
Selector: app=tom
Type: ClusterIP
IP: 10.1.0.139
Port: 6010 6010/TCP
TargetPort: 6010/TCP
Endpoints: 10.0.2.10:6010
The output is similar for the service jerry. As you can see, the endpoint is what I'd expect--10.0.2.10 is the IP assigned to the pod associated with the service tom. Kube DNS, though, resolves the name "tom" to the cluster IP, not the pod IP:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE IP ...
tom-b4ccbfb97-wfmjp 1/1 Running 0 15h 10.0.2.10
jerry-dd8fbf98f-8jgw7 1/1 Running 0 14h 10.0.2.20
$ kubectl exec jerry-dd8fbf98f-8jgw7 nslookup tom
Name: tom
Address 1: 10.1.0.139 tom.default.svc.cluster.local
This doesn't really matter of course as long as REST calls are routed to the expected pod IP. I've had some success with this today:
$ kubectl exec jerry-5554b956b-9kpj7 -- wget -O - http://tom:6010/actuator/health
{"status":"UP"}
This shows that even though the name "tom" resolves to the cluster IP there is routing in place that makes sure the call gets to the pod. I've tried the same call from service tom to service jerry and that also works. Curiously, a loopback, from tom to tom, times out:
$ kubectl exec tom-5c68d66cf9-dxlmf -- wget -O - http://tom:6010/actuator/health
Connecting to tom:6010 (10.1.0.139:6010)
wget: can't connect to remote host (10.1.0.139): Operation timed out
command terminated with exit code 1
If I use the pod IP explicitly, the call works:
$ kubectl exec tom-5c68d66cf9-dxlmf -- wget -O - http://10.0.2.10:6010/actuator/health
{"status":"UP"}
So for some reason the routing doesn't work in the loopback case. I can probably get by with that since I don't think we'll need to make calls back to the same service. It is puzzling though.
Peter
This means you didn't publish ports through your service (or used the wrong labels). What you are trying to achieve is exactly what services are for; what you need to do is fix your service definition so that it works properly.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: xxx-name
spec:
  template:
    metadata:
      labels:
        app: xxx-label
    spec:
      containers:
      - name: xxx-container
        image: kmrcr.azurecr.io/image:0.7
        imagePullPolicy: Always
        ports:
        - containerPort: 7003
        - containerPort: 443
---
apiVersion: v1
kind: Service
metadata:
  name: xxx-service
spec:
  ports:
  - port: 7003
    name: "7003"
  - port: 443
    name: "443"
  selector:
    app: xxx-label # must match your pod label
  type: LoadBalancer
Notice how this exposes the same ports the container is listening on and uses the same label as the selector to determine which pods the traffic must go to.
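To verify that the selector really matches your pods, check that the service has endpoints (an empty ENDPOINTS column means the labels don't match):
kubectl get endpoints xxx-service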
I have a Tectonic Kubernetes cluster installed on Azure. It's built from the tectonic-installer GitHub repo, from master (commit 0a7a1edb0a2eec8f3fb9e1e612a8ef1fd890c332).
> kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"922a86cfcd65915a9b2f69f3f193b8907d741d9c", GitTreeState:"clean", BuildDate:"2017-07-21T08:23:22Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3+coreos.0", GitCommit:"42de91f04e456f7625941a6c4aaedaa69708be1b", GitTreeState:"clean", BuildDate:"2017-08-07T19:44:31Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
On the cluster I created storage class, PVC and pod as in: https://gist.github.com/mwieczorek/28b7c779555d236a9756cb94109d6695
But the pod cannot start. When I run:
kubectl describe pod mypod
I get in events:
FailedMount Unable to mount volumes for pod "mypod_default(afc68bee-88cb-11e7-a44f-000d3a28f26a)":
timeout expired waiting for volumes to attach/mount for pod "default"/"mypod". list of unattached/unmounted volumes=[mypd]
In kubelet logs (https://gist.github.com/mwieczorek/900db1e10971a39942cba07e202f3c50) I see:
Error: Volume not attached according to node status for volume "pvc-61a8dc6a-88cb-11e7-ad19-000d3a28f2d3"
(UniqueName: "kubernetes.io/azure-disk//subscriptions/abc/resourceGroups/tectonic-cluster-mwtest/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-61a8dc6a-88cb-11e7-ad19-000d3a28f2d3") pod "mypod" (UID: "afc68bee-88cb-11e7-a44f-000d3a28f26a")
When I create PVC - new disc on Azure is created.
And after creating pod - I see on the azure portal that the disc is attached to worker VM where the pod is scheduled.
> fdisk -l
shows:
Disk /dev/sdc: 2 GiB, 2147483648 bytes, 4194304 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
I found a similar issue on GitHub (kubernetes/kubernetes#50150), but my cluster is built from master, so it's not the udev rules (I checked - the file /etc/udev/rules.d/66-azure-storage.rules exists).
Does anybody know if it's a bug (maybe a known issue)?
Or am I doing something wrong?
Also: how can I troubleshoot this further?
I tested this in a lab, using your YAML file to create the pod; after one hour, it still showed Pending.
root@k8s-master-ED3DFF55-0:~# kubectl get pod
NAME READY STATUS RESTARTS AGE
mypod 0/1 Pending 0 1h
task-pv-pod 1/1 Running 0 2h
We can use the following YAML files to create a working pod:
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
  namespace: kube-public
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
Output:
root@k8s-master-ED3DFF55-0:~# kubectl get pvc --namespace=kube-public
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
mypvc Bound pvc-1b097337-8960-11e7-82fc-000d3a191e6a 100Gi RWO default 3h
Pod:
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
  - name: task-pv-storage
    persistentVolumeClaim:
      claimName: task-pv-claim
  containers:
  - name: task-pv-container
    image: nginx
    ports:
    - containerPort: 80
      name: "http-server"
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: task-pv-storage
Output:
root@k8s-master-ED3DFF55-0:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
task-pv-pod 1/1 Running 0 3h
As a workaround, we can use default as the storage class.
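For example, a claim that explicitly requests the default storage class could look like this (a sketch; the claim name mypvc-default is illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc-default # illustrative name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 100Gi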
In Azure, there are managed disks and unmanaged disks. If your nodes use managed disks, two storage classes will be created to provide access for creating Kubernetes persistent volumes using Azure managed disks.
They are managed-premium and managed-standard, and they map to the Premium_LRS and Standard_LRS managed disk types respectively.
If your nodes use unmanaged disks, the default storage class will be used if persistent volume resources don't specify a storage class as part of the resource definition.
The default storage class uses unmanaged blob storage and will provision the blob within an existing storage account present in the resource group, or provision a new storage account.
Unmanaged persistent volume types are available on all VM sizes.
For more information about managed and unmanaged disks, please refer to this link.
Here is the test result:
root@k8s-master-ED3DFF55-0:~# kubectl get pvc --namespace=default
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
shared Pending standard-managed 2h
shared1 Pending managed-standard 15m
shared12 Pending standard-managed 14m
shared123 Bound pvc-a379ced4-897c-11e7-82fc-000d3a191e6a 2Gi RWO default 12m
task-pv-claim Bound pvc-3cefd456-8961-11e7-82fc-000d3a191e6a 3Gi RWO default 3h
Update:
My K8s agent uses an unmanaged disk (portal screenshot omitted).
In your case, "kubectl describe pod <pod-name>" does not provide sufficient info; you need to provide the k8s controller manager logs for troubleshooting.
Get the controller manager logs on master:
#get the "CONTAINER ID" of "/hyperkube controlle" (the command name is truncated in docker ps output)
docker ps -a | grep "hyperkube controlle" | awk -F ' ' '{print $1}'
#get the controller manager logs
docker logs "CONTAINER ID" > "CONTAINER ID".log 2>&1 &
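On clusters where the controller manager runs as a pod in kube-system rather than a raw Docker container, a similar check should work through kubectl (a sketch; pod names vary by distribution, so list them first):
kubectl get pods -n kube-system | grep controller-manager
kubectl logs -n kube-system <controller-manager-pod-name>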
Provisioning should be very quick. Check your controller logs to make sure the PV required by the PVC is provisioned correctly:
Navigate to Azure portal > cluster > Activity Log
Remove filter for namespaces and look for "Update Storage Account Create" entries.
In our case we needed to register our cluster subscription for the 'Microsoft.Storage' namespace so that the controller can provision the required PV. You can do this with the azure cli:
az provider register --namespace Microsoft.Storage
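To confirm the registration went through (it can take a few minutes):
az provider show --namespace Microsoft.Storage --query registrationState --output tsv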
I had a similar issue; this command worked for me:
az resource update --ids /subscriptions/<SUBSCRIPTION-ID>/resourcegroups/<RESOURCE-GROUP>/providers/Microsoft.ContainerService/managedClusters/<AKS-CLUSTER-NAME>/agentpools/<NODE-GROUP-NAME>