K8s did not kill my airflow webserver pod

K8s did not kill my airflow webserver pod - linux

I have airflow running in k8s containers.
The webserver encountered a DNS error (could not translate the url for my db to an ip) and the webserver workers were killed.
What is troubling me is that the k8s did not attempt to kill the pod and start a new one its place.
Pod log output:
OperationalError: (psycopg2.OperationalError) could not translate host name "my.dbs.url" to address: Temporary failure in name resolution
[2017-12-01 06:06:05 +0000] [2202] [INFO] Worker exiting (pid: 2202)
[2017-12-01 06:06:05 +0000] [2186] [INFO] Worker exiting (pid: 2186)
[2017-12-01 06:06:05 +0000] [2190] [INFO] Worker exiting (pid: 2190)
[2017-12-01 06:06:05 +0000] [2194] [INFO] Worker exiting (pid: 2194)
[2017-12-01 06:06:05 +0000] [2198] [INFO] Worker exiting (pid: 2198)
[2017-12-01 06:06:06 +0000] [13] [INFO] Shutting down: Master
[2017-12-01 06:06:06 +0000] [13] [INFO] Reason: Worker failed to boot.
The k8s status is RUNNING but when I open an exec shell in the k8s UI i get the following output (gunicorn appears to realize it's dead):
root#webserver-373771664-3h4v9:/# ps -Al
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 0 1 0 0 80 0 - 107153 - ? 00:06:42 /usr/local/bin/
4 Z 0 13 1 0 80 0 - 0 - ? 00:01:24 gunicorn: maste <defunct>
4 S 0 2206 0 0 80 0 - 4987 - ? 00:00:00 bash
0 R 0 2224 2206 0 80 0 - 7486 - ? 00:00:00 ps
The following is the YAML for my deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: webserver
namespace: airflow
spec:
replicas: 1
template:
metadata:
labels:
app: airflow-webserver
spec:
volumes:
- name: webserver-dags
emptyDir: {}
containers:
- name: airflow-webserver
image: my.custom.image :latest
imagePullPolicy: Always
resources:
requests:
cpu: 100m
limits:
cpu: 500m
ports:
- containerPort: 80
protocol: TCP
env:
- name: AIRFLOW_HOME
value: /var/lib/airflow
- name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
valueFrom:
secretKeyRef:
name: db1
key: sqlalchemy_conn
volumeMounts:
- mountPath: /var/lib/airflow/dags/
name: webserver-dags
command: ["airflow"]
args: ["webserver"]
- name: docker-s3-to-backup
image: my.custom.image:latest
imagePullPolicy: Always
resources:
requests:
cpu: 50m
limits:
cpu: 500m
env:
- name: ACCESS_KEY
valueFrom:
secretKeyRef:
name: aws
key: access_key_id
- name: SECRET_KEY
valueFrom:
secretKeyRef:
name: aws
key: secret_access_key
- name: S3_PATH
value: s3://my-s3-bucket/dags/
- name: DATA_PATH
value: /dags/
- name: CRON_SCHEDULE
value: "*/5 * * * *"
volumeMounts:
- mountPath: /dags/
name: webserver-dags
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: webserver
namespace: airflow
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: webserver
minReplicas: 2
maxReplicas: 20
targetCPUUtilizationPercentage: 75
---
apiVersion: v1
kind: Service
metadata:
labels:
name: webserver
namespace: airflow
spec:
type: NodePort
ports:
- port: 80
selector:
app: airflow-webserver

you need to define the readiness and liveness probe Kubernetes to detect the POD status.
like documented on this page. https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-tcp-liveness-probe
- containerPort: 8080
readinessProbe:
tcpSocket:
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
livenessProbe:
tcpSocket:
port: 8080
initialDelaySeconds: 15
periodSeconds: 20

Well, when process dies in a container, this container will exit and kubelet will restart the container on the same node / within the same pod. What happened here is by no means a fault of kubernetes, but in fact a problem of your container. The main process that you launch in the container (be it just from CMD or via ENTRYPOINT) needs to die, for the above to happen, and the ones you launch did not (one went zombie mode, but was not reaped, which is an example of another issue all together - zombie reaping. Liveness probe will help in this case (as mentioned by #sfgroups) as it will terminate the pod if it fails, but this is treating symptoms rather then root cause (not that you shouldn't have probes defined in general as a good practice).

Related

Azure Function ScaledJob in AKS should complete after complete

I have an Azure Function in NodeJS running in an AKS cluster. Now it will be deployed based on queue storage messages. After the function/job is done processing it should turn to complete and be removed. For some reason it doesn't.
Below is the KEDA job I've deployed to AKS:
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
name: keda-queue-storage-job
namespace: orchestrator
spec:
jobTargetRef:
completionMode: Indexed
parallelism: 10
completions: 1
backoffLimit: 3
template:
metadata:
namespace: orchestrator
labels:
app: orchestrator
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- orchestrator
topologyKey: kubernetes.io/hostname
containers:
- name: orchestrator
image: {acr}:latest
env:
- name: AzureFunctionsJobHost__functions__0
value: orchestrator
envFrom:
- secretRef:
name: orchestrator
readinessProbe:
failureThreshold: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 240
httpGet:
path: /
port: 80
scheme: HTTP
startupProbe:
failureThreshold: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 240
httpGet:
path: /
port: 80
scheme: HTTP
resources:
limits:
cpu: 4
memory: 10G
restartPolicy: Never
pollingInterval: 10 # Optional. Default: 30 seconds
successfulJobsHistoryLimit: 1 # Optional. Default: 100. How many completed jobs should be kept.
failedJobsHistoryLimit: 1 # Optional. Default: 100. How many failed jobs should be kept.
triggers:
- type: azure-queue
metadata:
queueName: searches
queueLength: "1"
activationQueueLength: "0"
connectionFromEnv: AzureWebJobsStorage
accountName:{account-name}
It seems like the Azure Function isn't signalling to kubernetes that it's done. When I run a process.exit(0) then the entire process stops too early it seems since any outputBindings aren't being done.
Does anyone know how to make the job complete?

Unable to connect to MongoDB: MongoNetworkError & MongoNetworkError connecting to kubernetis MongoDB pod with mongoose

I am trying to connect to MongoDB in a microservice-based project using NodeJs, Kubernetes, Ingress, and skaffold.
I got two errors on doing skaffold dev:
MongoNetworkError: failed to connect to server [auth-mongo-srv:21017] on first connect [MongoNetworkTimeoutError: connection timed out.
Mongoose default connection error: MongoNetworkError: MongoNetworkError: failed to connect to server [auth-mongo-srv:21017] on first connect [MongoNetworkTimeoutError: connection timed out at connectionFailureError.
My auth-mongo-deploy.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: auth-mongo-deploy
spec:
replicas: 1
selector:
matchLabels:
app: auth-mongo
template:
metadata:
labels:
app: auth-mongo
spec:
containers:
- name: auth-mongo
image: mongo
---
apiVersion: v1
kind: Service
metadata:
name: auth-mongo-srv
spec:
selector:
app: auth-mongo
ports:
- name: db
protocol: TCP
port: 27017
targetPort: 27017
My server.ts
const dbURI: string = "mongodb://auth-mongo-srv:21017/auth"
logger.debug(dbURI)
logger.info('connecting to database...')
// changing {} --> options change nothing!
mongoose.connect(dbURI, {}).then(() => {
logger.info('Mongoose connection done')
app.listen(APP_PORT, () => {
logger.info(`server listening on ${APP_PORT}`)
})
console.clear();
}).catch((e) => {
logger.info('Mongoose connection error')
logger.error(e)
})
Additional information:
1. pod is created:
rhythm#vivobook:~/Documents/TicketResale/server$ kubectl get pods
NAME STATUS RESTARTS AGE
auth-deploy-595c6cbf6d-9wzt9 1/1 Running 0 5m53s
auth-mongo-deploy-6b96b7798c-9726w 1/1 Running 0 5m53s
tickets-deploy-675b7b9b58-f5bzs 1/1 Running 0 5m53s
2. pod description:
kubectl describe pod auth-mongo-deploy-6b96b7798c-9726w
Name: auth-mongo-deploy-694b67f76d-ksw82
Namespace: default
Priority: 0
Node: minikube/192.168.49.2
Start Time: Tue, 21 Jun 2022 14:11:47 +0530
Labels: app=auth-mongo
pod-template-hash=694b67f76d
skaffold.dev/run-id=2f5d2142-0f1a-4fa4-b641-3f301f10e65a
Annotations: <none>
Status: Running
IP: 172.17.0.2
IPs:
IP: 172.17.0.2
Controlled By: ReplicaSet/auth-mongo-deploy-694b67f76d
Containers:
auth-mongo:
Container ID: docker://fa43cd7e03ac32ed63c82419e5f9722deffd2f93206b6a0f2b25ae9be8f6cedf
Image: mongo
Image ID: docker-pullable://mongo#sha256:37e84d3dd30cdfb5472ec42b8a6b4dc6ca7cacd91ebcfa0410a54528bbc5fa6d
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 21 Jun 2022 14:11:52 +0530
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zw7s9 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-zw7s9:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 79s default-scheduler Successfully assigned default/auth-mongo-deploy-694b67f76d-ksw82 to minikube
Normal Pulling 79s kubelet Pulling image "mongo"
Normal Pulled 75s kubelet Successfully pulled image "mongo" in 4.429126953s
Normal Created 75s kubelet Created container auth-mongo
Normal Started 75s kubelet Started container auth-mongo
I have also tried:
kubectl describe service auth-mongo-srv
Name: auth-mongo-srv
Namespace: default
Labels: skaffold.dev/run-id=2f5d2142-0f1a-4fa4-b641-3f301f10e65a
Annotations: <none>
Selector: app=auth-mongo
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.100.42.183
IPs: 10.100.42.183
Port: db 27017/TCP
TargetPort: 27017/TCP
Endpoints: 172.17.0.2:27017
Session Affinity: None
Events: <none>
And then changed:
const dbURI: string = "mongodb://auth-mongo-srv:21017/auth" to
const dbURI: string = "mongodb://172.17.0.2:27017:21017/auth"
generated a different error of MongooseServerSelectionError.

const dbURI: string = "mongodb://auth-mongo-srv:27017/auth"

Kubernetes Ingress returns "Cannot get /"

I'm trying to deploy a NodeRED pod on my cluster, and have created a service and ingress for it so it can be accessible as I access the rest of my cluster, under the same domain. However when i try to access it via host-name.com/nodered I receive Cannot GET /nodered.
Following are the templates used and describes of all the involved components.
apiVersion: v1
kind: Service
metadata:
name: nodered-app-service
namespace: {{ kubernetes_namespace_name }}
spec:
ports:
- port: 1880
targetPort: 1880
selector:
app: nodered-service-pod
I have also tried with port:80 for the service, to no avail.
apiVersion: apps/v1
kind: Deployment
metadata:
name: nodered-service-deployment
namespace: {{ kubernetes_namespace_name }}
labels:
app: nodered-service-deployment
name: nodered-service-deployment
spec:
replicas: 1
selector:
matchLabels:
app: nodered-service-pod
template:
metadata:
labels:
app: nodered-service-pod
target: gateway
buildVersion: "{{ kubernetes_build_number }}"
spec:
terminationGracePeriodSeconds: 10
serviceAccountName: nodered-service-account
automountServiceAccountToken: false
securityContext:
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
containers:
- name: nodered-service-statefulset
image: nodered/node-red:{{ nodered_service_version }}
imagePullPolicy: {{ kubernetes_image_pull_policy }}
readinessProbe:
httpGet:
path: /
port: 1880
initialDelaySeconds: 30
timeoutSeconds: 1
periodSeconds: 10
failureThreshold: 3
livenessProbe:
httpGet:
path: /
port: 1880
initialDelaySeconds: 30
timeoutSeconds: 1
periodSeconds: 10
failureThreshold: 3
securityContext:
allowPrivilegeEscalation: false
resources:
limits:
memory: "2048M"
cpu: "1000m"
requests:
memory: "500M"
cpu: "100m"
ports:
- containerPort: 1880
name: port-name
envFrom:
- configMapRef:
name: nodered-service-configmap
env:
- name: BUILD_TIME
value: "{{ kubernetes_build_time }}"
The target: gateway refers to the ingress controller
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nodered-ingress
namespace: {{ kubernetes_namespace_name }}
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
rules:
- host: host-name.com
http:
paths:
- path: /nodered(/|$)(.*)
backend:
serviceName: nodered-app-service
servicePort: 1880
The following is what my Describes show
Name: nodered-app-service
Namespace: nodered
Labels: <none>
Annotations: <none>
Selector: app=nodered-service-pod
Type: ClusterIP
IP: 55.3.145.249
Port: <unset> 1880/TCP
TargetPort: port-name/TCP
Endpoints: 10.7.0.79:1880
Session Affinity: None
Events: <none>
Name: nodered-service-statefulset-6c678b7774-clx48
Namespace: nodered
Priority: 0
Node: aks-default-40441371-vmss000007/10.7.0.66
Start Time: Thu, 26 Aug 2021 14:23:33 +0200
Labels: app=nodered-service-pod
buildVersion=latest
pod-template-hash=6c678b7774
target=gateway
Annotations: <none>
Status: Running
IP: 10.7.0.79
IPs:
IP: 10.7.0.79
Controlled By: ReplicaSet/nodered-service-statefulset-6c678b7774
Containers:
nodered-service-statefulset:
Container ID: docker://a6f8c9d010feaee352bf219f85205222fa7070c72440c885b9cd52215c4c1042
Image: nodered/node-red:latest-12
Image ID: docker-pullable://nodered/node-red#sha256:f02ccb26aaca2b3ee9c8a452d9516c9546509690523627a33909af9cf1e93d1e
Port: 1880/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 26 Aug 2021 14:23:36 +0200
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 2048M
Requests:
cpu: 100m
memory: 500M
Liveness: http-get http://:1880/ delay=30s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:1880/ delay=30s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
nodered-service-configmap ConfigMap Optional: false
Environment:
BUILD_TIME: 2021-08-26T12:23:06.219818+0000
Mounts: <none>
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes: <none>
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
Name: nodered-app-service
Namespace: nodered
Labels: <none>
Annotations: <none>
Selector: app=nodered-service-pod
Type: ClusterIP
IP: 55.3.145.249
Port: <unset> 1880/TCP
TargetPort: port-name/TCP
Endpoints: 10.7.0.79:1880
Session Affinity: None
Events: <none>
PS C:\Users\hid5tim> kubectl describe ingress -n nodered
Name: nodered-ingress
Namespace: nodered
Address: 10.7.31.254
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
host-name.com
/nodered(/|$)(.*) nodered-app-service:1880 (10.7.0.79:1880)
Annotations: kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/ssl-redirect: false
Events: <none>
The logs from the ingress controller are below. I've been on this issue for the last 24 hours or so and its tearing me apart, the setup looks identical to other deployments I have that are functional. Could this be something wrong with the nodered image? I have checked and it does expose 1880.
194.xx.xxx.x - [194.xx.xxx.x] - - [26/Aug/2021:10:40:12 +0000] "GET /nodered HTTP/1.1" 404 146 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" 871 0.008 [nodered-nodered-app-service-80] 10.7.0.68:1880 146 0.008 404
74887808fa2eb09fd4ed64061639991e ```

as the comment by Andrew points out, I was using rewrite annotation wrong, once I removed the (/|$)(.*) and specified the path type as prefix it worked.

How does the master bootstrap process work and how can I debug it?

I am working to stand up 3 instances of the yugabyte master and tserver in separate k8s clusters connected over LoadBalancer services on bare metal. However, on all three master instances it looks like the bootstrap process is failing:
I0531 19:50:28.081645 1 master_main.cc:94] NumCPUs determined to be: 2
I0531 19:50:28.082594 1 server_base_options.cc:124] Updating master addrs to {yb-master-black.example.com:7100},{yb-master-blue.example.com:7100},{yb-master-white.example.com:7100},{:7100}
I0531 19:50:28.082682 1 server_base_options.cc:124] Updating master addrs to {yb-master-black.example.com:7100},{yb-master-blue.example.com:7100},{yb-master-white.example.com:7100},{:7100}
I0531 19:50:28.082937 1 mem_tracker.cc:249] MemTracker: hard memory limit is 1.699219 GB
I0531 19:50:28.082963 1 mem_tracker.cc:251] MemTracker: soft memory limit is 1.444336 GB
I0531 19:50:28.083189 1 server_base_options.cc:124] Updating master addrs to {yb-master-black.example.com:7100},{yb-master-blue.example.com:7100},{yb-master-white.example.com:7100},{:7100}
I0531 19:50:28.090148 1 server_base_options.cc:124] Updating master addrs to {yb-master-black.example.com:7100},{yb-master-blue.example.com:7100},{yb-master-white.example.com:7100},{:7100}
I0531 19:50:28.090863 1 rpc_server.cc:86] yb::server::RpcServer created at 0x1a7e210
I0531 19:50:28.090924 1 master.cc:146] yb::master::Master created at 0x7ffe2d4bd140
I0531 19:50:28.090958 1 master.cc:147] yb::master::TSManager created at 0x1a90850
I0531 19:50:28.090975 1 master.cc:148] yb::master::CatalogManager created at 0x1dea000
I0531 19:50:28.091152 1 master_main.cc:115] Initializing master server...
I0531 19:50:28.093097 1 server_base.cc:462] Could not load existing FS layout: Not found (yb/util/env_posix.cc:1482): /mnt/disk0/yb-data/master/instance: No such file or directory (system error 2)
I0531 19:50:28.093150 1 server_base.cc:463] Creating new FS layout
I0531 19:50:28.193439 1 fs_manager.cc:463] Generated new instance metadata in path /mnt/disk0/yb-data/master/instance:
uuid: "5f2f6ad78d27450b8cde9c8bcf40fefa"
format_stamp: "Formatted at 2020-05-31 19:50:28 on yb-master-0"
I0531 19:50:28.238484 1 fs_manager.cc:463] Generated new instance metadata in path /mnt/disk1/yb-data/master/instance:
uuid: "5f2f6ad78d27450b8cde9c8bcf40fefa"
format_stamp: "Formatted at 2020-05-31 19:50:28 on yb-master-0"
I0531 19:50:28.377483 1 fs_manager.cc:251] Opened local filesystem: /mnt/disk0,/mnt/disk1
uuid: "5f2f6ad78d27450b8cde9c8bcf40fefa"
format_stamp: "Formatted at 2020-05-31 19:50:28 on yb-master-0"
I0531 19:50:28.378015 1 server_base.cc:245] Auto setting FLAGS_num_reactor_threads to 2
I0531 19:50:28.380707 1 thread_pool.cc:166] Starting thread pool { name: Master queue_limit: 10000 max_workers: 1024 }
I0531 19:50:28.382266 1 master_main.cc:118] Starting Master server...
I0531 19:50:28.382313 24 async_initializer.cc:74] Starting to init ybclient
I0531 19:50:28.382365 1 master_main.cc:119] ulimit cur(max)...
ulimit: core file size unlimited(unlimited) blks
ulimit: data seg size unlimited(unlimited) kb
ulimit: open files 1048576(1048576)
ulimit: file size unlimited(unlimited) blks
ulimit: pending signals 22470(22470)
ulimit: file locks unlimited(unlimited)
ulimit: max locked memory 64(64) kb
ulimit: max memory size unlimited(unlimited) kb
ulimit: stack size 8192(unlimited) kb
ulimit: cpu time unlimited(unlimited) secs
ulimit: max user processes unlimited(unlimited)
W0531 19:50:28.383322 24 master.cc:186] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:6854): Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
I0531 19:50:28.383525 24 client-internal.cc:1847] New master addresses: [yb-master-black.example.com:7100,yb-master-blue.example.com:7100,yb-master-white.example.com:7100,:7100]
I0531 19:50:28.383685 1 service_pool.cc:148] yb.master.MasterBackupService: yb::rpc::ServicePoolImpl created at 0x1a82b40
I0531 19:50:28.384888 1 service_pool.cc:148] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x1a83680
I0531 19:50:28.385342 1 service_pool.cc:148] yb.tserver.TabletServerService: yb::rpc::ServicePoolImpl created at 0x1a838c0
I0531 19:50:28.388526 1 thread_pool.cc:166] Starting thread pool { name: Master-high-pri queue_limit: 10000 max_workers: 1024 }
I0531 19:50:28.388588 1 service_pool.cc:148] yb.consensus.ConsensusService: yb::rpc::ServicePoolImpl created at 0x201eb40
I0531 19:50:28.393231 1 service_pool.cc:148] yb.tserver.RemoteBootstrapService: yb::rpc::ServicePoolImpl created at 0x201ed80
I0531 19:50:28.393501 1 webserver.cc:148] Starting webserver on 0.0.0.0:7000
I0531 19:50:28.393544 1 webserver.cc:153] Document root: /home/yugabyte/www
I0531 19:50:28.394471 1 webserver.cc:240] Webserver started. Bound to: http://0.0.0.0:7000/
I0531 19:50:28.394668 1 service_pool.cc:148] yb.server.GenericService: yb::rpc::ServicePoolImpl created at 0x201efc0
I0531 19:50:28.395015 1 rpc_server.cc:169] RPC server started. Bound to: 0.0.0.0:7100
I0531 19:50:28.420223 23 tcp_stream.cc:308] { local: 10.233.80.35:55710 remote: 172.16.0.34:7100 }: Recv failed: Network error (yb/util/net/socket.cc:537): recvmsg error: Connection refused (system error 111)
E0531 19:51:28.523921 24 async_initializer.cc:84] Failed to initialize client: Timed out (yb/rpc/rpc.cc:213): Could not locate the leader master: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 293) passed its deadline 2074493.105s (passed: 60.140s): Not found (yb/master/master_rpc.cc:284): no leader found: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 1)
W0531 19:51:29.524827 24 master.cc:186] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:6854): Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
I0531 19:51:29.524914 24 client-internal.cc:1847] New master addresses: [yb-master-black.example.com:7100,yb-master-blue.example.com:7100,yb-master-white.example.com:7100,:7100]
E0531 19:52:29.524785 24 async_initializer.cc:84] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:512): Could not locate the leader master: GetMasterRegistration RPC (request call id 2359) to 172.29.1.1:7100 timed out after 0.033s
W0531 19:52:30.525079 24 master.cc:186] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:6854): Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
I0531 19:52:30.525205 24 client-internal.cc:1847] New master addresses: [yb-master-black.example.com:7100,yb-master-blue.example.com:7100,yb-master-white.example.com:7100,:7100]
W0531 19:53:28.114395 36 master-path-handlers.cc:150] Illegal state (yb/master/catalog_manager.cc:6854): Unable to list Masters: Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
W0531 19:53:29.133951 36 master-path-handlers.cc:1002] Illegal state (yb/master/catalog_manager.cc:6854): Unable to list Masters: Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
E0531 19:53:30.625366 24 async_initializer.cc:84] Failed to initialize client: Timed out (yb/rpc/rpc.cc:213): Could not locate the leader master: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 299) passed its deadline 2074615.247s (passed: 60.099s): Not found (yb/master/master_rpc.cc:284): no leader found: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 1)
W0531 19:53:31.625660 24 master.cc:186] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:6854): Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
I0531 19:53:31.625742 24 client-internal.cc:1847] New master addresses: [yb-master-black.example.com:7100,yb-master-blue.example.com:7100,yb-master-white.example.com:7100,:7100]
W0531 19:53:34.024369 37 master-path-handlers.cc:150] Illegal state (yb/master/catalog_manager.cc:6854): Unable to list Masters: Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
E0531 19:54:31.870801 24 async_initializer.cc:84] Failed to initialize client: Timed out (yb/rpc/rpc.cc:213): Could not locate the leader master: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 300) passed its deadline 2074676.348s (passed: 60.244s): Not found (yb/master/master_rpc.cc:284): no leader found: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 1)
W0531 19:54:32.871065 24 master.cc:186] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:6854): Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
I0531 19:54:32.871222 24 client-internal.cc:1847] New master addresses: [yb-master-black.example.com:7100,yb-master-blue.example.com:7100,yb-master-white.example.com:7100,:7100]
W0531 19:55:28.190217 41 master-path-handlers.cc:1002] Illegal state (yb/master/catalog_manager.cc:6854): Unable to list Masters: Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
W0531 19:55:31.745038 42 master-path-handlers.cc:1002] Illegal state (yb/master/catalog_manager.cc:6854): Unable to list Masters: Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
E0531 19:55:33.164300 24 async_initializer.cc:84] Failed to initialize client: Timed out (yb/rpc/rpc.cc:213): Could not locate the leader master: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 299) passed its deadline 2074737.593s (passed: 60.292s): Not found (yb/master/master_rpc.cc:284): no leader found: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 1)
W0531 19:55:34.164574 24 master.cc:186] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:6854): Node 5f2f6ad78d27450b8cde9c8bcf40fefa peer not initialized.
I0531 19:55:34.164667 24 client-internal.cc:1847] New master addresses: [yb-master-black.example.com:7100,yb-master-blue.example.com:7100,yb-master-white.example.com:7100,:7100]
E0531 19:56:34.315380 24 async_initializer.cc:84] Failed to initialize client: Timed out (yb/rpc/rpc.cc:213): Could not locate the leader master: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 299) passed its deadline 2074798.886s (passed: 60.150s): Not found (yb/master/master_rpc.cc:284): no leader found: GetLeaderMasterRpc(addrs: [yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, :7100], num_attempts: 1)
As far as connectivity goes, I am able to verify the LoadBalancer endpoints are responding across the different network boundaries by curling the same service endpoint but on the UI port:
[root#yb-master-0 yugabyte]# curl -I http://yb-master-blue.example.com:7000
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1975
Access-Control-Allow-Origin: *
[root#yb-master-0 yugabyte]# curl -I http://yb-master-white.example.com:7000
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1975
Access-Control-Allow-Origin: *
[root#yb-master-0 yugabyte]# curl -I http://yb-master-black.example.com:7000
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1975
Access-Control-Allow-Origin: *
What strategies are there to debug the bootstrap process?
EDIT:
Here are the startup flags for the master:
/home/yugabyte/bin/yb-master --fs_data_dirs=/mnt/disk0,/mnt/disk1 --server_broadcast_addresses=yb-master-white.example.com:7100 --master_addresses=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, --replication_factor=3 --enable_ysql=true --rpc_bind_addresses=0.0.0.0:7100 --metric_node_name=yb-master-0 --memory_limit_hard_bytes=1824522240 --stderrthreshold=0 --num_cpus=2 --undefok=num_cpus,enable_ysql --default_memory_limit_to_ram_ratio=0.85 --leader_failure_max_missed_heartbeat_periods=10 --placement_cloud=AAAA --placement_region=XXXX --placement_zone=XXXX
/home/yugabyte/bin/yb-master --fs_data_dirs=/mnt/disk0,/mnt/disk1 --server_broadcast_addresses=yb-master-blue.example.com:7100 --master_addresses=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, --replication_factor=3 --enable_ysql=true --rpc_bind_addresses=0.0.0.0:7100 --metric_node_name=yb-master-0 --memory_limit_hard_bytes=1824522240 --stderrthreshold=0 --num_cpus=2 --undefok=num_cpus,enable_ysql --default_memory_limit_to_ram_ratio=0.85 --leader_failure_max_missed_heartbeat_periods=10 --placement_cloud=AAAA --placement_region=YYYY --placement_zone=YYYY
/home/yugabyte/bin/yb-master --fs_data_dirs=/mnt/disk0,/mnt/disk1 --server_broadcast_addresses=yb-master-black.example.com:7100 --master_addresses=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, --replication_factor=3 --enable_ysql=true --rpc_bind_addresses=0.0.0.0:7100 --metric_node_name=yb-master-0 --memory_limit_hard_bytes=1824522240 --stderrthreshold=0 --num_cpus=2 --undefok=num_cpus,enable_ysql --default_memory_limit_to_ram_ratio=0.85 --leader_failure_max_missed_heartbeat_periods=10 --placement_cloud=AAAA --placement_region=ZZZZ --placement_zone=ZZZZ
For the sake of completeness here is one of the k8s manifest that I've modified from one of the helm examples. It is modified to utilize LoadBalancer for the master service:
---
# Source: yugabyte/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
name: "yb-masters"
labels:
app: "yb-master"
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
type: LoadBalancer
loadBalancerIP: 172.16.0.34
ports:
- name: "rpc-port"
port: 7100
- name: "ui"
port: 7000
selector:
app: "yb-master"
---
# Source: yugabyte/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
name: "yb-tservers"
labels:
app: "yb-tserver"
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
clusterIP: None
ports:
- name: "rpc-port"
port: 7100
- name: "ui"
port: 9000
- name: "yedis-port"
port: 6379
- name: "yql-port"
port: 9042
- name: "ysql-port"
port: 5433
selector:
app: "yb-tserver"
---
# Source: yugabyte/templates/service.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: "yb-master"
namespace: "yugabytedb"
labels:
app: "yb-master"
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
serviceName: "yb-masters"
podManagementPolicy: Parallel
replicas: 1
volumeClaimTemplates:
- metadata:
name: datadir0
annotations:
volume.beta.kubernetes.io/storage-class: rook-ceph-block
labels:
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
accessModes:
- "ReadWriteOnce"
storageClassName: rook-ceph-block
resources:
requests:
storage: 10Gi
- metadata:
name: datadir1
annotations:
volume.beta.kubernetes.io/storage-class: rook-ceph-block
labels:
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
accessModes:
- "ReadWriteOnce"
storageClassName: rook-ceph-block
resources:
requests:
storage: 10Gi
updateStrategy:
type: RollingUpdate
rollingUpdate:
partition: 0
selector:
matchLabels:
app: "yb-master"
template:
metadata:
labels:
app: "yb-master"
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
affinity:
# Set the anti-affinity selector scope to YB masters.
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- "yb-master"
topologyKey: kubernetes.io/hostname
containers:
- name: "yb-master"
image: "yugabytedb/yugabyte:2.1.6.0-b17"
imagePullPolicy: IfNotPresent
lifecycle:
postStart:
exec:
command:
- "sh"
- "-c"
- >
mkdir -p /mnt/disk0/cores;
mkdir -p /mnt/disk0/yb-data/scripts;
if [ ! -f /mnt/disk0/yb-data/scripts/log_cleanup.sh ]; then
if [ -f /home/yugabyte/bin/log_cleanup.sh ]; then
cp /home/yugabyte/bin/log_cleanup.sh /mnt/disk0/yb-data/scripts;
fi;
fi
env:
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: HOSTNAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
resources:
limits:
cpu: 2
memory: 2Gi
requests:
cpu: 500m
memory: 1Gi
command:
- "/home/yugabyte/bin/yb-master"
- "--fs_data_dirs=/mnt/disk0,/mnt/disk1"
- "--server_broadcast_addresses=yb-master-blue.example.com:7100"
- "--master_addresses=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, "
- "--replication_factor=3"
- "--enable_ysql=true"
- "--rpc_bind_addresses=0.0.0.0:7100"
- "--metric_node_name=$(HOSTNAME)"
- "--memory_limit_hard_bytes=1824522240"
- "--stderrthreshold=0"
- "--num_cpus=2"
- "--undefok=num_cpus,enable_ysql"
- "--default_memory_limit_to_ram_ratio=0.85"
- "--leader_failure_max_missed_heartbeat_periods=10"
- "--placement_cloud=AAAA"
- "--placement_region=YYYY"
- "--placement_zone=YYYY"
ports:
- containerPort: 7100
name: "rpc-port"
- containerPort: 7000
name: "ui"
volumeMounts:
- name: datadir0
mountPath: /mnt/disk0
- name: datadir1
mountPath: /mnt/disk1
- name: yb-cleanup
image: busybox:1.31
env:
- name: USER
value: "yugabyte"
command:
- "/bin/sh"
- "-c"
- >
mkdir /var/spool/cron;
mkdir /var/spool/cron/crontabs;
echo "0 * * * * /home/yugabyte/scripts/log_cleanup.sh" | tee -a /var/spool/cron/crontabs/root;
crond;
while true; do
sleep 86400;
done
volumeMounts:
- name: datadir0
mountPath: /home/yugabyte/
subPath: yb-data
volumes:
- name: datadir0
hostPath:
path: /mnt/disks/ssd0
- name: datadir1
hostPath:
path: /mnt/disks/ssd1
---
# Source: yugabyte/templates/service.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: "yb-tserver"
namespace: "yugabytedb"
labels:
app: "yb-tserver"
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
serviceName: "yb-tservers"
podManagementPolicy: Parallel
replicas: 1
volumeClaimTemplates:
- metadata:
name: datadir0
annotations:
volume.beta.kubernetes.io/storage-class: rook-ceph-block
labels:
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
accessModes:
- "ReadWriteOnce"
storageClassName: rook-ceph-block
resources:
requests:
storage: 10Gi
- metadata:
name: datadir1
annotations:
volume.beta.kubernetes.io/storage-class: rook-ceph-block
labels:
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
accessModes:
- "ReadWriteOnce"
storageClassName: rook-ceph-block
resources:
requests:
storage: 10Gi
updateStrategy:
type: RollingUpdate
rollingUpdate:
partition: 0
selector:
matchLabels:
app: "yb-tserver"
template:
metadata:
labels:
app: "yb-tserver"
heritage: "Helm"
release: "blue"
chart: "yugabyte"
component: "yugabytedb"
spec:
affinity:
# Set the anti-affinity selector scope to YB masters.
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- "yb-tserver"
topologyKey: kubernetes.io/hostname
containers:
- name: "yb-tserver"
image: "yugabytedb/yugabyte:2.1.6.0-b17"
imagePullPolicy: IfNotPresent
lifecycle:
postStart:
exec:
command:
- "sh"
- "-c"
- >
mkdir -p /mnt/disk0/cores;
mkdir -p /mnt/disk0/yb-data/scripts;
if [ ! -f /mnt/disk0/yb-data/scripts/log_cleanup.sh ]; then
if [ -f /home/yugabyte/bin/log_cleanup.sh ]; then
cp /home/yugabyte/bin/log_cleanup.sh /mnt/disk0/yb-data/scripts;
fi;
fi
env:
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: HOSTNAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
resources:
limits:
cpu: 2
memory: 4Gi
requests:
cpu: 500m
memory: 2Gi
command:
- "/home/yugabyte/bin/yb-tserver"
- "--fs_data_dirs=/mnt/disk0,/mnt/disk1"
- "--server_broadcast_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:9100"
- "--rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local"
- "--cql_proxy_bind_address=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local"
- "--enable_ysql=true"
- "--pgsql_proxy_bind_address=$(POD_IP):5433"
- "--tserver_master_addrs=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100, "
- "--metric_node_name=$(HOSTNAME)"
- "--memory_limit_hard_bytes=3649044480"
- "--stderrthreshold=0"
- "--num_cpus=2"
- "--undefok=num_cpus,enable_ysql"
- "--leader_failure_max_missed_heartbeat_periods=10"
- "--placement_cloud=AAAA"
- "--placement_region=YYYY"
- "--placement_zone=YYYY"
- "--use_cassandra_authentication=false"
ports:
- containerPort: 7100
name: "rpc-port"
- containerPort: 9000
name: "ui"
- containerPort: 6379
name: "yedis-port"
- containerPort: 9042
name: "yql-port"
- containerPort: 5433
name: "ysql-port"
volumeMounts:
- name: datadir0
mountPath: /mnt/disk0
- name: datadir1
mountPath: /mnt/disk1
- name: yb-cleanup
image: busybox:1.31
env:
- name: USER
value: "yugabyte"
command:
- "/bin/sh"
- "-c"
- >
mkdir /var/spool/cron;
mkdir /var/spool/cron/crontabs;
echo "0 * * * * /home/yugabyte/scripts/log_cleanup.sh" | tee -a /var/spool/cron/crontabs/root;
crond;
while true; do
sleep 86400;
done
volumeMounts:
- name: datadir0
mountPath: /home/yugabyte/
subPath: yb-data
volumes:
- name: datadir0
hostPath:
path: /mnt/disks/ssd0
- name: datadir1
hostPath:
path: /mnt/disks/ssd1

This was mostly resolved (looks like I've now run into an unrelated issue), by dropping the extraneous comma on the master addresses list:
--master_addresses=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100,
vs
--master_addresses=yb-master-black.example.com:7100, yb-master-blue.example.com:7100, yb-master-white.example.com:7100

OpenShift Access Mongodb Pod from another Pod

I'm currentrly trying to deploy a mongodb pod on OpenShift and accessing this pod from another node.js application via mongoose. Now at first everything seems fine. I have created a route to the mongodb and when i open it in my browser I get
It looks like you are trying to access MongoDB over HTTP on the
native driver port.
So far so good. But when I try opening a connection to the database from another pod it refuses the connection. I'm using the username and password provided by OpenShift and connect to
mongodb://[username]:[password]#[host]:[port]/[dbname]
unfortunately without luck. It seems that the database is just accepting connections from the localhost. However I could not find out how to change that. Would be great if someone had an idea.
Heres the Deployment Config
apiVersion: v1
kind: DeploymentConfig
metadata:
annotations:
template.alpha.openshift.io/wait-for-ready: "true"
creationTimestamp: null
generation: 1
labels:
app: mongodb-persistent
template: mongodb-persistent-template
name: mongodb
spec:
replicas: 1
selector:
name: mongodb
strategy:
activeDeadlineSeconds: 21600
recreateParams:
timeoutSeconds: 600
resources: {}
type: Recreate
template:
metadata:
creationTimestamp: null
labels:
name: mongodb
spec:
containers:
- env:
- name: MONGODB_USER
valueFrom:
secretKeyRef:
key: database-user
name: mongodb
- name: MONGODB_PASSWORD
valueFrom:
secretKeyRef:
key: database-password
name: mongodb
- name: MONGODB_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
key: database-admin-password
name: mongodb
- name: MONGODB_DATABASE
valueFrom:
secretKeyRef:
key: database-name
name: mongodb
image: registry.access.redhat.com/rhscl/mongodb-32-rhel7#sha256:82c79f0e54d5a23f96671373510159e4fac478e2aeef4181e61f25ac38c1ae1f
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
tcpSocket:
port: 27017
timeoutSeconds: 1
name: mongodb
ports:
- containerPort: 27017
protocol: TCP
readinessProbe:
exec:
command:
- /bin/sh
- -i
- -c
- mongo 127.0.1:27017/$MONGODB_DATABASE -u $MONGODB_USER -p $MONGODB_PASSWORD
--eval="quit()"
failureThreshold: 3
initialDelaySeconds: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
memory: 512Mi
securityContext:
capabilities: {}
privileged: false
terminationMessagePath: /dev/termination-log
volumeMounts:
- mountPath: /var/lib/mongodb/data
name: mongodb-data
dnsPolicy: ClusterFirst
restartPolicy: Always
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- name: mongodb-data
persistentVolumeClaim:
claimName: mongodb
test: false
triggers:
- imageChangeParams:
automatic: true
containerNames:
- mongodb
from:
kind: ImageStreamTag
name: mongodb:3.2
namespace: openshift
type: ImageChange
- type: ConfigChange
status:
availableReplicas: 0
latestVersion: 0
observedGeneration: 0
replicas: 0
unavailableReplicas: 0
updatedReplicas: 0
The Service Config
apiVersion: v1
kind: Service
metadata:
annotations:
template.openshift.io/expose-uri: mongodb://{.spec.clusterIP}:{.spec.ports[?(.name=="mongo")].port}
creationTimestamp: null
labels:
app: mongodb-persistent
template: mongodb-persistent-template
name: mongodb
spec:
ports:
- name: mongo
port: 27017
protocol: TCP
targetPort: 27017
selector:
name: mongodb
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
and the pod
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/created-by: |
{"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"some-name-space","name":"mongodb-3","uid":"xxxx-xxx-xxx-xxxxxx","apiVersion":"v1","resourceVersion":"244413593"}}
kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu request for container
mongodb'
openshift.io/deployment-config.latest-version: "3"
openshift.io/deployment-config.name: mongodb
openshift.io/deployment.name: mongodb-3
openshift.io/scc: nfs-scc
creationTimestamp: null
generateName: mongodb-3-
labels:
deployment: mongodb-3
deploymentconfig: mongodb
name: mongodb
ownerReferences:
- apiVersion: v1
controller: true
kind: ReplicationController
name: mongodb-3
uid: a694b832-5dd2-11e8-b2fc-40f2e91e2433
spec:
containers:
- env:
- name: MONGODB_USER
valueFrom:
secretKeyRef:
key: database-user
name: mongodb
- name: MONGODB_PASSWORD
valueFrom:
secretKeyRef:
key: database-password
name: mongodb
- name: MONGODB_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
key: database-admin-password
name: mongodb
- name: MONGODB_DATABASE
valueFrom:
secretKeyRef:
key: database-name
name: mongodb
image: registry.access.redhat.com/rhscl/mongodb-32-rhel7#sha256:82c79f0e54d5a23f96671373510159e4fac478e2aeef4181e61f25ac38c1ae1f
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
tcpSocket:
port: 27017
timeoutSeconds: 1
name: mongodb
ports:
- containerPort: 27017
protocol: TCP
readinessProbe:
exec:
command:
- /bin/sh
- -i
- -c
- mongo 127.0.1:27017/$MONGODB_DATABASE -u $MONGODB_USER -p $MONGODB_PASSWORD
--eval="quit()"
failureThreshold: 3
initialDelaySeconds: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
memory: 512Mi
requests:
cpu: 250m
memory: 512Mi
securityContext:
capabilities:
drop:
- KILL
- MKNOD
- SETGID
- SETUID
- SYS_CHROOT
privileged: false
runAsUser: 1049930000
seLinuxOptions:
level: s0:c223,c212
terminationMessagePath: /dev/termination-log
volumeMounts:
- mountPath: /var/lib/mongodb/data
name: mongodb-data
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-rfvr5
readOnly: true
dnsPolicy: ClusterFirst
imagePullSecrets:
- name: default-dockercfg-3mpps
nodeName: thenode.name.net
nodeSelector:
region: primary
restartPolicy: Always
securityContext:
fsGroup: 1049930000
seLinuxOptions:
level: s0:c223,c212
supplementalGroups:
- 5555
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
volumes:
- name: mongodb-data
persistentVolumeClaim:
claimName: mongodb
- name: default-token-rfvr5
secret:
defaultMode: 420
secretName: default-token-rfvr5
status:
phase: Pending

Ok that was a long search and finally I was able to solve it. My first mistake was, that routes are not suited to make a connection to a database as they only use the http-protocol.
Now there were 2 usecases left for me
You're working on your local machine and want to test code that you later upload to OpenShift
You deploy that code to OpenShift (has to be in the same project but is a different app than the database)
1. Local Machine
Since the route doesn't work port forwarding is used. I've read that before but didn't really understand what it meant (i thought the service itsself is forwading ports already).
When you are on your local machine you will do the following with the oc
oc port-forward <pod-name> <local-port>:<remote-port>
You'll get the info that the port is forwarded. Now the thing is, that in your app you will now connect to localhost (even on your local machine)
2. App running on OpenShift
After you will upload your code to OpenShift(In my case, just Add to project --> Node.js --> Add your repo), localhost will not be working any longer.
What took a while for me to understand is that as long as you are in the same project you will have a lot of information in your environment variables.
So just check the name of the service of your database (in my case mongodb) and you will find the host and port to use
Summary
Here's a little code example that works now, as well on the local machine as on OpenShift. I have already set up a persistand MongoDB on OpenShift called mongodb.
The code doesn't do much, but It will make a connection and tell you that it did, so you know it's working.
var mongoose = require('mongoose');
// Connect to Mongodb
var username = process.env.MONGO_DB_USERNAME || 'someUserName';
var password = process.env.MONGO_DB_PASSWORD || 'somePassword';
var host = process.env.MONGODB_SERVICE_HOST || '127.0.0.1';
var port = process.env.MONGODB_SERVICE_PORT || '27017';
var database = process.env.MONGO_DB_DATABASE || 'sampledb';
console.log('---DATABASE PARAMETERS---');
console.log('Host: ' + host);
console.log('Port: ' + port);
console.log('Username: ' + username);
console.log('Password: ' + password);
console.log('Database: ' + database);
var connectionString = 'mongodb://' + username + ':' + password +'#' + host + ':' + port + '/' + database;
console.log('---CONNECTING TO---');
console.log(connectionString);
mongoose.connect(connectionString);
mongoose.connection.once('open', (data) => {
console.log('Connection has been made');
console.log(data);
});

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

K8s did not kill my airflow webserver pod - linux

Related

Azure Function ScaledJob in AKS should complete after complete

Unable to connect to MongoDB: MongoNetworkError & MongoNetworkError connecting to kubernetis MongoDB pod with mongoose

Kubernetes Ingress returns "Cannot get /"

How does the master bootstrap process work and how can I debug it?

OpenShift Access Mongodb Pod from another Pod

Categories

Resources