Kubernetes helm for creating DB scripts for Database Container - python-3.x

I am developing a Kubernetes Helm chart for deploying a Python application. The Python application has a database it needs to connect to.
I want to run database scripts that create the database, create users, create tables, alter columns, or run any other SQL. I was thinking this could run as an initContainer, but that is not the recommended way, since the initContainer would run on every startup even when there are no DB scripts to apply.
Below is the solution I am looking for:
Create a Kubernetes Job to run the scripts, which will connect to the Postgres DB and run the scripts from files. Is there a way for a Kubernetes Job to connect to the Postgres service and run the SQL scripts?
Please suggest a good approach for running SQL scripts in Kubernetes that can also be monitored via the pod.
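A minimal sketch of such a Job, assuming the SQL files live in a ConfigMap named db-scripts, the password in a Secret named db-credentials, and a Postgres service reachable as postgres (all names are hypothetical):

```yaml
# Job that applies every *.sql file from the ConfigMap against the Postgres service
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrations
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: psql
          image: postgres:13            # provides the psql client
          command: ["/bin/sh", "-c"]
          args:
            - for f in /scripts/*.sql; do psql -h postgres -U myuser -d mydb -f "$f" || exit 1; done
          env:
            - name: PGPASSWORD          # read by psql
              valueFrom:
                secretKeyRef:
                  name: db-credentials  # hypothetical Secret
                  key: password
          volumeMounts:
            - name: scripts
              mountPath: /scripts
      volumes:
        - name: scripts
          configMap:
            name: db-scripts            # hypothetical ConfigMap holding the *.sql files
```

Because it is a Job, `kubectl logs job/db-migrations` and the pod's completion status give you the monitoring you asked for, and you only create the Job when there are actually scripts to run.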

I would recommend simply using the 'postgresql' sub-chart alongside your newly developed app Helm chart (check here how to use it, within the section called "Use of global variables").
It uses the concept of 'initContainers' instead of a Job, letting you initialize a user-defined schema/configuration of the database on startup from custom *.sql scripts.
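A minimal sketch of wiring the sub-chart in, assuming the Bitnami postgresql chart (version, database name, and script contents are illustrative):

```yaml
# Chart.yaml of your app chart: declare the sub-chart dependency
dependencies:
  - name: postgresql
    version: "10.x.x"                       # illustrative version
    repository: https://charts.bitnami.com/bitnami

---
# values.yaml of your app chart: sub-chart values are nested under "postgresql".
# initdbScripts run once, when the database volume is first initialized.
postgresql:
  postgresqlDatabase: mydb                  # hypothetical database name
  initdbScripts:
    01-schema.sql: |
      CREATE TABLE IF NOT EXISTS users (
        id SERIAL PRIMARY KEY,
        name TEXT NOT NULL
      );
```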

Related

Databricks Lakehouse JDBC and Docker

Pretty new to Databricks.
I've got a requirement to access data in the Lakehouse using a JDBC driver. This works fine.
I now want to stub the Lakehouse using a docker image for some tests I want to write. Is it possible to get a Databricks / spark docker image with a database in it? I would also want to bootstrap the database on startup to create a bunch of tables.
No - Databricks is not a database but a hosted service (PaaS). Theoretically you could use OSS Spark with a Thrift server started on it, but the connection strings and other functionality would be very different, so it makes no sense to spend time on it (imho). The real solution depends on the type of tests you want to do.
Regarding bootstrapping the database and creating a bunch of tables: just issue these commands, like create database if not exists or create table if not exists, when your application starts up (see the documentation for the exact syntax).

Does Kubernetes restart a failed container or create a new container when the running container fails for any reason?

I ran a Docker container locally, and it stores data in a file (currently no volume is mounted). I stored some data using the API. After that I failed the container using process.exit(1) and started the container again. The previously stored data in the container survives (as expected). But when I do the same thing in Kubernetes (minikube), the data is lost.
Posting this as a community wiki for better visibility, feel free to edit and expand it.
As described in the comments, Kubernetes replaces failed containers with new (identical) ones, which explains why the container's filesystem is clean.
Also, as said, containers should be stateless. There are different options for running different kinds of applications and taking care of their data (see the sketch after the links below):
Run a stateless application using a Deployment
Run a stateful application either as a single instance or as a replicated set
Run automated tasks with a CronJob
Useful links:
Kubernetes workloads
Pod lifecycle
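If the file must survive container restarts, the usual fix is to mount a PersistentVolumeClaim into the pod. A minimal sketch (names, image, and size are illustrative):

```yaml
# Claim durable storage that outlives individual containers
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
# Mount the claim so the app writes its file to /data instead of the container filesystem
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: myorg/myapp:latest     # hypothetical image
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data
```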

Random error on external Oracle database connection with Kubernetes

After a month of research, we are here, hoping someone has insight into this issue:
On a GKE cluster, our pods (Node.js) are having trouble connecting to our external Oracle business database.
To be more precise, ~70% of our connection attempts end in this error:
ORA-12545: Connect failed because target host or object does not exist
The remaining 30% work fine and don't reset or end prematurely. Once a connection is established, all is good from there.
Our stack:
Our flows are handled by containers based on a node:12.15.0-slim image, to which we add libaio1 and an Oracle Instant Client (v12.2). We use oracledb v5.0.0 as the Node module.
We use a cron job pod to run our Node container, behind a ClusterIP service on a GKE cluster (1.16.15-gke.4300).
Our external Oracle database is on a private network (to which our cluster has access), running Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit, behind a load balancer.
I can give more detail if needed.
What we have already tried:
We tried connecting directly to the database, bypassing the load balancer: no effect
We had a cron job pod ping the database server every minute for a day: no errors, although the flow pods still encountered the ORA-12545 error
We rewrote all our code, connecting to the database differently, and updated our oracledb Node module (v4 to v5): no effect
We monitored the load on the Oracle database and spread our flows over the whole night instead of a one-hour window: no effect
We had our own Kubernetes cluster before GKE, directly on our private network, with exactly the same error
We had an audit by Kubernetes experts, and they found neither the issue nor any critical problem with our cluster/k8s configuration
What works:
All our other pods, some querying a MySQL database, microservices, web frontends, are working fine.
All our business tools (dozens of them, including Talend and some custom software) use the Oracle database without issue.
Our own flow-handling Node containers work fine with the Oracle database as long as they run in a Docker environment, and not a Kubernetes one.
To summarize: we have a mysterious issue when trying to connect to an Oracle database from a Kubernetes environment, where pods are randomly unable to reach the database.
We are looking for any hint we can get.

Deploy Mongodb and NodeJS with Kubernetes

I am learning Kubernetes and looking for suggestions about deploying my application.
My application background:
Backend: NodeJS
Frontend: ReactJS
Database: MongoDB (Just run mongod to start instead of using MongoDB cloud services)
I already know how to use Docker Compose to deploy the application on a single node.
And now I want to deploy the application with Kubernetes (3 nodes).
So how do I deploy MongoDB and make sure the MongoDB data is synchronized across the 3 nodes?
I have researched some information about this, and I am confused by some keywords.
E.g. Deploy a Standalone MongoDB Instance,
StatefulSet, ...
Is this information / are these articles suitable for my situation? Or do you know of any other information about this? Thanks!
You can install mongodb using this helm chart.
You can start the MongoDB chart in replica set mode with the following parameter: replicaSet.enabled=true
Some characteristics of this chart are:
Each participant in the replication has a fixed StatefulSet identity, so you always know where to find the primary, secondary, or arbiter nodes.
The number of secondary and arbiter nodes can be scaled out independently.
Easy to move an application from using a standalone MongoDB server to using a replica set.
See here for configuration and installation details.
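A minimal sketch of the relevant values, assuming the chart's replica-set parameters (replica counts and set name are illustrative):

```yaml
# values.yaml passed to the mongodb chart
replicaSet:
  enabled: true        # run as a replica set instead of standalone
  name: rs0            # illustrative replica-set name
  replicas:
    secondary: 2       # secondaries scale independently...
    arbiter: 1         # ...of arbiters
```

Your Node.js backend then connects through the chart's service; the data is replicated across nodes by MongoDB itself rather than by Kubernetes.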
You can create helm charts for your apps for deployment -
Create a Dockerfile for your app; make sure you copy the build that was created using npm build
Push the image to Docker Hub or any other registry like ACR or ECR
Add the image tag in the Helm deployment templates and pass the value from values.yaml (as sketched below)
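A minimal sketch of that last step, assuming a conventional chart layout (repository and tag values are hypothetical):

```yaml
# values.yaml
image:
  repository: myregistry/my-node-app   # hypothetical image
  tag: "1.0.0"
```

```yaml
# templates/deployment.yaml (container spec excerpt)
containers:
  - name: backend
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```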
For MongoDb deployment, use this chart https://github.com/bitnami/charts/tree/master/bitnami/mongodb

Creating catalog/schema/table in prestosql/presto container

I would like to use the prestosql/presto container for automated tests. For this purpose I want the ability to programmatically create a catalog/schema/table. Unfortunately, I didn't find an option to do this via Docker environment variables. If I try to do it via the JDBC connector, I receive the following error: "This connector does not support creating tables"
How can I create schemas or tables using prestosql/presto container?
If you are writing tests in Java (as suggested by the JDBC tag), you can use the Testcontainers library. It comes with a Presto module, which:
uses the prestosql/presto container under the hood
comes with the Presto memory connector pre-installed, so you can create schemas & tables there
