I am submitting a simple PySpark wordcount job to a freshly built YARN cluster via Airflow and the SparkSubmitOperator. The job reaches YARN (I can see it in the ResourceManager UI) but fails with this error:
"Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default"
User: root
Name: PySpark Wordcount
Application Type: SPARK
Application Tags:
YarnApplicationState: FAILED
Queue: root.default
FinalStatus Reported by AM: FAILED
Started: Fri Feb 21 08:01:25 +1100 2020
Elapsed: 0sec
Tracking URL: History
Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default
The root.default queue certainly seems to be there:
Application Queues
Legend: Capacity / Used / Used (over capacity) / Max Capacity
.root 0.0% used
..Queue: default 0.0% used
'default' Queue Status
Queue State: RUNNING
Used Capacity: 0.0%
Configured Capacity: 100.0%
Configured Max Capacity: 100.0%
Absolute Used Capacity: 0.0%
Absolute Configured Capacity: 100.0%
Absolute Configured Max Capacity: 100.0%
Used Resources: <memory:0, vCores:0>
Num Schedulable Applications: 0
Num Non-Schedulable Applications: 0
Num Containers: 0
Max Applications: 10000
Max Applications Per User: 10000
Max Application Master Resources: <memory:3072, vCores:1>
Used Application Master Resources: <memory:0, vCores:0>
Max Application Master Resources Per User: <memory:3072, vCores:1>
Configured Minimum User Limit Percent: 100%
Configured User Limit Factor: 1.0
Accessible Node Labels: *
Preemption: disabled
What am I missing here? Thanks.
Submit with the queue name default, not root.default. The root queue in the ResourceManager is used only to group the queues hierarchically.
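For example, with spark-submit directly it would be (a minimal sketch; the application path is a placeholder):
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --queue default \
  wordcount.py   # placeholder application path
If you are setting the queue through Airflow's SparkSubmitOperator, passing conf={"spark.yarn.queue": "default"} (or leaving the queue unset, since default is Spark's default queue) should have the same effect.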
We have deployed Jenkins in a Docker container, and recently our Jenkins server has stopped coming up due to a disk space issue. Below is the error we see in the logs.
2022-09-17 21:41:32.567+0000 [id=32] INFO hudson.slaves.SlaveComputer#tryReconnect: Attempting to reconnect V3LOCITY-SLAVE-02
/usr/local/bin/jenkins.sh: line 38: cannot create temp file for here-document: No space left on device
Running from: /usr/share/jenkins/jenkins.war
webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
Exception in thread "main" java.io.IOException: Jenkins has failed to create a temporary file in /tmp
at Main.extractFromJar(Main.java:498)
at Main._main(Main.java:310)
at Main.main(Main.java:151)
Caused by: java.io.IOException: No space left on device
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at Main.extractFromJar(Main.java:495)
... 2 more
We assume the issue is the Docker container running out of space; see the info below for reference.
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 1 1 572.5MB 0B (0%)
Containers 1 0 9.467GB 9.467GB (100%)
Local Volumes 0 0 0B 0B
Build Cache 0 0 0B 0B
Assuming the container was running out of space, we increased the base size to 40 GB by adding the content below to the /etc/docker/daemon.json file and recreated the container, but we still see the same issue after restarting it.
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.basesize=40G"
  ]
}
See the docker info output below for reference.
Client:
Debug Mode: false
Server:
Containers: 1
Running: 0
Paused: 0
Stopped: 1
Images: 1
Server Version: 19.03.11-ol
Storage Driver: devicemapper
Pool Name: docker-249:0-1140851221-pool
Pool Blocksize: 65.54kB
Base Device Size: 42.95GB
Backing Filesystem: xfs
Udev Sync Supported: true
Data file: /dev/loop0
Metadata file: /dev/loop1
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Data Space Used: 10.82GB
Data Space Total: 107.4GB
Data Space Available: 96.56GB
Metadata Space Used: 6.877MB
Metadata Space Total: 2.147GB
Metadata Space Available: 2.141GB
Thin Pool Minimum Free Space: 10.74GB
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.170-RHEL7 (2020-03-24)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7eba5930496d9bbe375fdf71603e610ad737d2b2
runc version: 52de29d
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.1.12-124.65.1.2.el7uek.x86_64
Operating System: Oracle Linux Server 7.9
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.56GiB
Name: vm-app-docker-jenkinsqa
ID: TAII:OWLM:Y3BU:65DC:A3SK:SSJQ:H6H2:BLA2:HQA5:ODCP:Y7S5:KCJ2
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Registries:
You need to map the Jenkins home directory to an external folder (volume) and make sure the host has enough space. See the Jenkins docs for more details.
For example:
docker run --name jenkins -v /var/jenkins_home:/var/jenkins_home ...
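A fuller sketch, assuming the official jenkins/jenkins:lts image and a host directory with plenty of free space (both are illustrative choices, adjust to your setup):
docker run -d --name jenkins \
  -p 8080:8080 -p 50000:50000 \
  -v /data/jenkins_home:/var/jenkins_home \
  jenkins/jenkins:lts
With JENKINS_HOME on a host volume, job history and workspace data no longer fill up the container's writable layer, so growing dm.basesize is not needed.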
So on GKE I have a Node.js app which uses about CPU(cores): 5m, MEMORY: 100Mi per pod.
However, I am only able to deploy 1 pod of it per node. I am using a GKE n1-standard-1 cluster, which has 1 vCPU and 3.75 GB per node.
So in order to get 2 pods of app up total = CPU(cores): 10m, MEMORY: 200Mi, it requires another entire +1 node = 2 nodes = 2 vCPU, 7.5 GB to make it work. If I try to deploy those 2 pods on the same single node, I get insufficient CPU error.
I have a feeling I should actually be able to run a handful of pod replicas (3 or more) on one node of f1-micro (1 vCPU, 0.6 GB) or g1-small (1 vCPU, 1.7 GB), and that I am way overprovisioned here and wasting my money.
But I am not sure why I seem so restricted by insufficient CPU. Is there some config I need to change? Any guidance would be appreciated.
Allocatable:
cpu: 940m
ephemeral-storage: 47093746742
hugepages-2Mi: 0
memory: 2702216Ki
pods: 110
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default mission-worker-5cf6654687-fwmk4 100m (10%) 0 (0%) 0 (0%) 0 (0%)
default mission-worker-5cf6654687-lnwkt 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-gcp-v3.1.1-5b6km 100m (10%) 1 (106%) 200Mi (7%) 500Mi (18%)
kube-system kube-dns-76dbb796c5-jgljr 260m (27%) 0 (0%) 110Mi (4%) 170Mi (6%)
kube-system kube-proxy-gke-test-cluster-pool-1-96c6d8b2-m15p 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system metadata-agent-nb4dp 40m (4%) 0 (0%) 50Mi (1%) 0 (0%)
kube-system prometheus-to-sd-gwlkv 1m (0%) 3m (0%) 20Mi (0%) 20Mi (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 701m (74%) 1003m (106%)
memory 380Mi (14%) 690Mi (26%)
Events: <none>
After the deployment, check the node capacities with kubectl describe nodes. For example, in the output at the bottom of this answer:
Allocatable cpu: 1800m
Already used by pods in the kube-system namespace: 100m + 260m + 100m + 200m + 20m = 680m
Which means 1800m - 680m = 1120m is left for you to use
So, if your pod or pods request more than 1120m CPU, they will not fit on this node
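A quick way to pull those numbers for your own node (a sketch; substitute the node name shown by kubectl get nodes):
# list node names
kubectl get nodes
# show what is allocatable and what is already requested on one node
kubectl describe node <node-name> | grep -A 6 -E 'Allocatable:|Allocated resources:'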
So in order to get 2 pods of app up total = CPU(cores): 10m, MEMORY: 200Mi, it requires another entire +1 node = 2 nodes = 2 vCPU, 7.5 GB to make it work. If I try to deploy those 2 pods on the same single node, I get insufficient CPU error.
If you do the exercise described above, you will find your answer. If there is enough CPU for your pods and you are still getting the insufficient CPU error, check whether you are setting the CPU request and limit parameters correctly. See here.
If you do all of the above and it is still an issue, then what could be happening in your case is that you are allocating 5-10m CPU for your Node app, which is too little. Try increasing it, perhaps to 50m CPU.
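For reference, a minimal sketch of what setting the request and limit on the app container looks like (the names, image, and values here are illustrative, not taken from your deployment):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app                 # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
      - name: node-app
        image: node-app:latest   # placeholder image
        resources:
          requests:
            cpu: 50m             # what the scheduler reserves on the node
            memory: 100Mi
          limits:
            cpu: 100m            # hard cap at runtime
            memory: 200Mi
The scheduler packs pods by their requests, not actual usage, so whether two replicas fit on one node depends on these values plus what kube-system already requests.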
I have a feeling I should actually be able to run a handful of pod replicas (3 or more) on one node of f1-micro (1 vCPU, 0.6 GB) or g1-small (1 vCPU, 1.7 GB), and that I am way overprovisioned here and wasting my money.
Again, do the exercise described above to reach that conclusion. Here is the example output referred to earlier:
Name: e2e-test-minion-group-4lw4
[ ... lines removed for clarity ...]
Capacity:
cpu: 2
memory: 7679792Ki
pods: 110
Allocatable:
cpu: 1800m
memory: 7474992Ki
pods: 110
[ ... lines removed for clarity ...]
Non-terminated Pods: (5 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system fluentd-gcp-v1.38-28bv1 100m (5%) 0 (0%) 200Mi (2%) 200Mi (2%)
kube-system kube-dns-3297075139-61lj3 260m (13%) 0 (0%) 100Mi (1%) 170Mi (2%)
kube-system kube-proxy-e2e-test-... 100m (5%) 0 (0%) 0 (0%) 0 (0%)
kube-system monitoring-influxdb-grafana-v4-z1m12 200m (10%) 200m (10%) 600Mi (8%) 600Mi (8%)
kube-system node-problem-detector-v0.1-fj7m3 20m (1%) 200m (10%) 20Mi (0%) 100Mi (1%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
680m (34%) 400m (20%) 920Mi (12%) 1070Mi (14%)
I'd like to configure the cluster autoscaler on AKS. When scaling down, it fails due to a PDB:
I1207 14:24:09.523313 1 cluster.go:95] Fast evaluation: node aks-nodepool1-32797235-0 cannot be removed: no enough pod disruption budget to move kube-system/metrics-server-5cbc77f79f-44f9w
I1207 14:24:09.523413 1 cluster.go:95] Fast evaluation: node aks-nodepool1-32797235-3 cannot be removed: non-daemonset, non-mirrored, non-pdb-assignedkube-system pod present: cluster-autoscaler-84984799fd-22j42
I1207 14:24:09.523438 1 scale_down.go:490] 2 nodes found to be unremovable in simulation, will re-check them at 2018-12-07 14:29:09.231201368 +0000 UTC m=+8976.856144807
All system pods have a PDB with minAvailable: 1 assigned manually. I can imagine that this does not work for pods with only a single replica, like the metrics-server:
❯ k get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-nodepool1-32797235-0 Ready agent 4h v1.11.4 10.240.0.4 <none> Ubuntu 16.04.5 LTS 4.15.0-1030-azure docker://3.0.1
aks-nodepool1-32797235-3 Ready agent 4h v1.11.4 10.240.0.6 <none> Ubuntu 16.04.5 LTS 4.15.0-1030-azure docker://3.0.1
❯ ks get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
cluster-autoscaler-84984799fd-22j42 1/1 Running 0 2h 10.244.1.5 aks-nodepool1-32797235-3 <none>
heapster-5d6f9b846c-g7qb8 2/2 Running 0 1h 10.244.0.16 aks-nodepool1-32797235-0 <none>
kube-dns-v20-598f8b78ff-8pshc 4/4 Running 0 3h 10.244.1.4 aks-nodepool1-32797235-3 <none>
kube-dns-v20-598f8b78ff-plfv8 4/4 Running 0 1h 10.244.0.15 aks-nodepool1-32797235-0 <none>
kube-proxy-fjvjv 1/1 Running 0 1h 10.240.0.6 aks-nodepool1-32797235-3 <none>
kube-proxy-szr8z 1/1 Running 0 1h 10.240.0.4 aks-nodepool1-32797235-0 <none>
kube-svc-redirect-2rhvg 2/2 Running 0 4h 10.240.0.4 aks-nodepool1-32797235-0 <none>
kube-svc-redirect-r2m4r 2/2 Running 0 4h 10.240.0.6 aks-nodepool1-32797235-3 <none>
kubernetes-dashboard-68f468887f-c8p78 1/1 Running 0 4h 10.244.0.7 aks-nodepool1-32797235-0 <none>
metrics-server-5cbc77f79f-44f9w 1/1 Running 0 4h 10.244.0.3 aks-nodepool1-32797235-0 <none>
tiller-deploy-57f988f854-z9qln 1/1 Running 0 4h 10.244.0.8 aks-nodepool1-32797235-0 <none>
tunnelfront-7cf9d447f9-56g7k 1/1 Running 0 4h 10.244.0.2 aks-nodepool1-32797235-0 <none>
What needs to be changed (number of replicas? PDB configuration?) for down-scaling to work?
Basically, this is an administration issue when draining nodes that are covered by a PDB (Pod Disruption Budget). This happens because evictions are forced to respect the PDB you specify.
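For context, a PDB like the one you describe for metrics-server would look roughly like this (a sketch; the selector label and the policy/v1beta1 API version are assumptions for a cluster of that vintage):
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: metrics-server-pdb        # hypothetical name
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: metrics-server     # assumed label on the metrics-server pods
With only one replica behind it, minAvailable: 1 means the pod can never be evicted, which is exactly what the autoscaler log is reporting.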
You have two options.
Either force the drain:
kubectl drain foo --force --grace-period=0
You can check the other options in the docs: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#drain
Or use the Eviction API:
{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
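Assuming that JSON is saved as eviction.json, one way to POST it is through kubectl proxy (the pod name quux is the placeholder from the example above):
kubectl proxy &
curl -v -H 'Content-Type: application/json' \
  http://127.0.0.1:8001/api/v1/namespaces/default/pods/quux/eviction \
  -d @eviction.json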
Either way, the drain or the Eviction API attempts to delete the pods so that they can be scheduled elsewhere before the node is completely drained.
As mentioned in the docs:
the API can respond in one of three ways:
If the eviction is granted, then the pod is deleted just as if you had sent a DELETE request to the pod’s URL and you get back 200 OK.
If the current state of affairs wouldn’t allow an eviction by the rules set forth in the budget, you get back 429 Too Many Requests. This is typically used for generic rate limiting of any requests
If there is some kind of misconfiguration, like multiple budgets pointing at the same pod, you will get 500 Internal Server Error.
For a given eviction request, there are two cases:
There is no budget that matches this pod. In this case, the server always returns 200 OK.
There is at least one budget. In this case, any of the three above responses may apply.
If it gets stuck, then you might need to do it manually.
You can read more here or here.
Update 2: I was able to get the statistics by using Grafana and InfluxDB. However, I find this overkill. I want to see the current status of my cluster, not the historical trends per se. Based on the linked image, it should be possible using the pre-deployed Heapster and the Kubernetes dashboard.
Update 1:
With the command below, I do see resource information. I guess the remaining part of the question is why it is not showing up (or how I should configure it to show up) in the kubernetes dashboard, as shown in this image: https://docs.giantswarm.io/img/dashboard-ui.png
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-agentpool0-41204139-0 36m 1% 682Mi 9%
k8s-agentpool0-41204139-1 33m 1% 732Mi 10%
k8s-agentpool0-41204139-10 36m 1% 690Mi 10%
[truncated]
I am trying to monitor performance in my Azure Kubernetes deployment. I noticed it has Heapster running by default. I did not launch this one, but do want to leverage it if it is there. My question is: how can I access it, or is there something wrong with it? Here are the details I can think of, let me know if you need more.
$ kubectl cluster-info
Kubernetes master is running at https://[hidden].uksouth.cloudapp.azure.com
Heapster is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy
tiller-deploy is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/tiller-deploy:tiller/proxy
I set up a proxy:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
Point my browser to
localhost:8001/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/#!/workload?namespace=default
I see the kubernetes dashboard, but do notice that I do not see the performance graphs that are displayed at https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/. I also do not see the admin section.
I then point my browser to localhost:8001/api/v1/namespaces/kube-system/services/heapster/proxy and get
404 page not found
Inspecting the pods:
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
heapster-2205950583-43w4b 2/2 Running 0 1d
kube-addon-manager-k8s-master-41204139-0 1/1 Running 0 1d
kube-apiserver-k8s-master-41204139-0 1/1 Running 0 1d
kube-controller-manager-k8s-master-41204139-0 1/1 Running 0 1d
kube-dns-v20-2000462293-1j20h 3/3 Running 0 16h
kube-dns-v20-2000462293-hqwfn 3/3 Running 0 16h
kube-proxy-0kwkf 1/1 Running 0 1d
kube-proxy-13bh5 1/1 Running 0 1d
[truncated]
kube-proxy-zfbb1 1/1 Running 0 1d
kube-scheduler-k8s-master-41204139-0 1/1 Running 0 1d
kubernetes-dashboard-732940207-w7pt2 1/1 Running 0 1d
tiller-deploy-3007245560-4tk78 1/1 Running 0 1d
Checking the log:
$ kubectl logs heapster-2205950583-43w4b heapster --namespace=kube-system
I0309 06:11:21.241752 19 heapster.go:72] /heapster --source=kubernetes.summary_api:""
I0309 06:11:21.241813 19 heapster.go:73] Heapster version v1.4.2
I0309 06:11:21.242310 19 configs.go:61] Using Kubernetes client with master "https://10.0.0.1:443" and version v1
I0309 06:11:21.242331 19 configs.go:62] Using kubelet port 10255
I0309 06:11:21.243557 19 heapster.go:196] Starting with Metric Sink
I0309 06:11:21.344547 19 heapster.go:106] Starting heapster on port 8082
E0309 14:14:05.000293 19 summary.go:389] Node k8s-agentpool0-41204139-32 is not ready
E0309 14:14:05.000331 19 summary.go:389] Node k8s-agentpool0-41204139-56 is not ready
[truncated the other agent pool messages saying not ready]
E0309 14:24:05.000645 19 summary.go:389] Node k8s-master-41204139-0 is not ready
$ kubectl describe pod heapster-2205950583-43w4b --namespace=kube-system
Name: heapster-2205950583-43w4b
Namespace: kube-system
Node: k8s-agentpool0-41204139-54/10.240.0.11
Start Time: Fri, 09 Mar 2018 07:11:15 +0100
Labels: k8s-app=heapster
pod-template-hash=2205950583
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"heapster-2205950583","uid":"ac75e772-2360-11e8-9e1c-00224807...
scheduler.alpha.kubernetes.io/critical-pod=
Status: Running
IP: 10.244.58.2
Controlled By: ReplicaSet/heapster-2205950583
Containers:
heapster:
Container ID: docker://a9205e7ab9070a1d1bdee4a1b93eb47339972ad979c4d35e7d6b59ac15a91817
Image: k8s-gcrio.azureedge.net/heapster-amd64:v1.4.2
Image ID: docker-pullable://k8s-gcrio.azureedge.net/heapster-amd64#sha256:f58ded16b56884eeb73b1ba256bcc489714570bacdeca43d4ba3b91ef9897b20
Port: <none>
Command:
/heapster
--source=kubernetes.summary_api:""
State: Running
Started: Fri, 09 Mar 2018 07:11:20 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 121m
memory: 464Mi
Requests:
cpu: 121m
memory: 464Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from heapster-token-txk8b (ro)
heapster-nanny:
Container ID: docker://68e021532a482f32abec844d6f9ea00a4a8232b8d1004b7df4199d2c7d3a3b4c
Image: k8s-gcrio.azureedge.net/addon-resizer:1.7
Image ID: docker-pullable://k8s-gcrio.azureedge.net/addon-resizer#sha256:dcec9a5c2e20b8df19f3e9eeb87d9054a9e94e71479b935d5cfdbede9ce15895
Port: <none>
Command:
/pod_nanny
--cpu=80m
--extra-cpu=0.5m
--memory=140Mi
--extra-memory=4Mi
--threshold=5
--deployment=heapster
--container=heapster
--poll-period=300000
--estimator=exponential
State: Running
Started: Fri, 09 Mar 2018 07:11:18 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 50m
memory: 90Mi
Requests:
cpu: 50m
memory: 90Mi
Environment:
MY_POD_NAME: heapster-2205950583-43w4b (v1:metadata.name)
MY_POD_NAMESPACE: kube-system (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from heapster-token-txk8b (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
heapster-token-txk8b:
Type: Secret (a volume populated by a Secret)
SecretName: heapster-token-txk8b
Optional: false
QoS Class: Guaranteed
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: <none>
Events: <none>
I have seen in the past that if you restart the dashboard pod it starts working. Can you try that real fast and let me know?
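For example, using the pod name from your kubectl get pods output (assuming the dashboard is managed by a Deployment/ReplicaSet, as the generated name suggests, it will be recreated automatically):
kubectl delete pod kubernetes-dashboard-732940207-w7pt2 --namespace=kube-system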
I'm trying to run Cassandra on Kubernetes.
One thing I understood is that Cassandra is trying to access the Kubernetes API server on port 443 (secure connection), but I'm running the API server on the insecure port 8080.
There is also a "Has no permission to create /cassandra_data/data directory" error.
Error (Cassandra pod log):
INFO 13:04:02 Getting endpoints from https://10.100.0.1:443/api/v1/namespaces/default/endpoints/cassandra
WARN 13:04:11 Request to kubernetes apiserver failed
java.io.IOException: Server returned HTTP response code: 401 for URL: https://10.100.0.1:443/api/v1/namespaces/default/endpoints/cassandra
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1628) ~[na:1.7.0_95]
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254) ~[na:1.7.0_95]
at io.k8s.cassandra.KubernetesSeedProvider.getSeeds(KubernetesSeedProvider.java:143) ~[kubernetes-cassandra.jar:na]
at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:659) [apache-cassandra-2.1.13.jar:2.1.13]
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:136) [apache-cassandra-2.1.13.jar:2.1.13]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168) [apache-cassandra-2.1.13.jar:2.1.13]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) [apache-cassandra-2.1.13.jar:2.1.13]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) [apache-cassandra-2.1.13.jar:2.1.13]
INFO 13:04:11 JVM vendor/version: OpenJDK 64-Bit Server VM/1.7.0_95
WARN 13:04:11 OpenJDK is not recommended. Please upgrade to the newest Oracle Java release
INFO 13:04:11 Heap size: 526385152/526385152
INFO 13:04:11 Code Cache Non-heap memory: init = 2555904(2496K) used = 795136(776K) committed = 2555904(2496K) max = 50331648(49152K)
INFO 13:04:11 Eden Space Heap memory: init = 83886080(81920K) used = 79390336(77529K) committed = 83886080(81920K) max = 83886080(81920K)
INFO 13:04:11 Survivor Space Heap memory: init = 10485760(10240K) used = 0(0K) committed = 10485760(10240K) max = 10485760(10240K)
INFO 13:04:11 CMS Old Gen Heap memory: init = 432013312(421888K) used = 0(0K) committed = 432013312(421888K) max = 432013312(421888K)
INFO 13:04:11 CMS Perm Gen Non-heap memory: init = 21757952(21248K) used = 19488984(19032K) committed = 21757952(21248K) max = 174063616(169984K)
INFO 13:04:11 Classpath: /kubernetes-cassandra.jar:/etc/cassandra:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.3.0.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.2.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.13.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.13.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.9.2.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar::/usr/share/cassandra/lib/jamm-0.3.0.jar
INFO 13:04:11 JVM Arguments: [-ea, -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar, -XX:+CMSClassUnloadingEnabled, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -Xms512M, -Xmx512M, -Xmn100M, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+UseTLAB, -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:CMSWaitDuration=10000, -XX:+UseCondCardMark, -Djava.net.preferIPv4Stack=true, -Dcassandra.jmx.local.port=7199, -XX:+DisableExplicitGC, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/var/log/cassandra, -Dcassandra.storagedir=, -Dcassandra-foreground=yes]
WARN 13:04:13 Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
WARN 13:04:13 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
ERROR 13:04:20 Directory /cassandra_data/data doesn't exist
ERROR 13:04:20 Has no permission to create /cassandra_data/data directory
ReplicationController:
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  replicas: 2
  selector:
    app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - command:
        - /run.sh
        resources:
          limits:
            cpu: 0.1
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        image: gcr.io/google-samples/cassandra:v8
        name: cassandra
        ports:
        - containerPort: 9042
          name: cql
        - containerPort: 9160
          name: thrift
        volumeMounts:
        - mountPath: /cassandra_data
          name: data
      volumes:
      - name: data
        emptyDir: {}
You changed Cassandra's default data directory, so first create the data directory. Give read and write permission on the newly created data directory, or run Cassandra as root. If you have set credentials for Cassandra, then provide the username and password as well.
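One way to do that from within the pod spec is an init container that creates the directory and opens up its permissions before Cassandra starts. A sketch against the ReplicationController above (the busybox image and chmod 777 are illustrative choices, and the initContainers field requires a cluster version that supports it):
spec:
  template:
    spec:
      initContainers:
      - name: init-cassandra-data        # hypothetical name
        image: busybox
        # create the data directory on the shared volume and make it world-writable
        command: ["sh", "-c", "mkdir -p /cassandra_data/data && chmod -R 777 /cassandra_data"]
        volumeMounts:
        - name: data
          mountPath: /cassandra_data
Because the init container mounts the same data volume as the Cassandra container, the directory and permissions are already in place when /run.sh starts.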