Random error on external Oracle database connection with Kubernetes - node.js

After a month of research, we are here, hoping someone has an insight into this issue:
On a GKE cluster, our pods (Node.js) are having trouble connecting to our external Oracle business database.
To be more precise, ~70% of our connection attempts end in error:
ORA-12545: Connect failed because target host or object does not exist
The remaining 30% work well and don't reset or end prematurely. Once a connection is established, everything is fine from there.
Our stack:
Our data flows are handled by containers based on a node:12.15.0-slim image, to which we add libaio1 and the Oracle Instant Client (v12.2). We use the oracledb v5.0.0 node module.
The node containers run as CronJob pods behind a ClusterIP service on a GKE cluster (1.16.15-gke.4300).
Our external Oracle database sits on a private network (to which our cluster has access), running Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit, behind a load balancer.
I can give more detail if needed.
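For reference, our connection code is essentially the standard node-oracledb pattern; a minimal sketch (user, password handling and connect string below are placeholders, not our real values):

// Minimal sketch of our connection pattern; user, password source and
// connect string are placeholders, not our real values.
const oracledb = require('oracledb');

async function run() {
  let connection;
  try {
    connection = await oracledb.getConnection({
      user: 'app_user',
      password: process.env.DB_PASSWORD,
      // EZConnect string pointing at the load balancer
      connectString: 'oracle-lb.internal.example:1521/ORCL'
    });
    const result = await connection.execute('SELECT 1 FROM DUAL');
    console.log(result.rows);
  } finally {
    if (connection) {
      await connection.close();
    }
  }
}

// ~70% of runs fail inside getConnection() with ORA-12545
run().catch((err) => console.error(err));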
What we have already tried:
We tried connecting directly to the database, bypassing the load balancer: no effect.
We had a CronJob pod ping the database server every minute for a day: no errors, even though the data-flow pods were still hitting the ORA-12545 error (see the lookup sketch after this list).
We rewrote all our code, connecting to the database differently and upgrading our oracledb node module (v4 to v5): no effect.
We monitored the load on the Oracle database and spread our flows over the whole night instead of a 1-hour window: no effect.
Before GKE we ran our own Kubernetes cluster directly on our private network, and it produced exactly the same error.
We had an audit by Kubernetes experts; they neither found the cause nor saw any critical issue in our cluster/k8s configuration.
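Since ORA-12545 usually indicates that the target host name could not be resolved, one check worth running from inside a pod is a repeated DNS lookup against the database host; a minimal sketch (the host name is a placeholder):

// Repeated DNS lookup from inside a pod, to see whether the database
// host name (placeholder below) intermittently fails to resolve.
const dns = require('dns');

const DB_HOST = 'oracle-lb.internal.example'; // placeholder

setInterval(() => {
  dns.lookup(DB_HOST, (err, address) => {
    const ts = new Date().toISOString();
    if (err) {
      console.error(ts, 'lookup failed:', err.code);
    } else {
      console.log(ts, 'resolved to', address);
    }
  });
}, 1000);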
What works:
All our other pods (some querying a MySQL database, microservices, web frontends) are working fine.
All our business tools (dozens of them, including Talend and some custom software) use the Oracle database without issue.
Our own flow-handling node containers work fine with the Oracle database as long as they run in a plain Docker environment rather than a Kubernetes one.
To summarize: we have a mysterious issue when connecting to an Oracle database from a Kubernetes environment, where pods are randomly unable to reach the database.
We are looking for any hint we can get.

Related

Does "spring data cassandra" have client side loadbalancing?

I'm running a project that uses spring-boot and spring-data-cassandra.
When I set up the project, I configured the Cassandra properties with an IP and port
(following https://www.baeldung.com/spring-data-cassandra-tutorial).
Set up like this, with 3 Cassandra nodes and 1 of them dead, I would expect the project to fail to connect to Cassandra about 33% of the time.
But my project was fine even though 1 Cassandra node was dead (apart from some errors at the moment the node died).
Does spring-data-cassandra happen to have something like client-side load balancing?
If it has that function, where can I see the code?
I tried to find it but failed.
Please give me a little clue.
Spring Data Cassandra relies on the functionality of the DataStax Java driver, which is responsible for making everything work. This includes:
establishing the initial connection to the cluster. This is where the contact points play their role. Once the driver is connected to any of the contact points, it reads information about the whole cluster and establishes connections to all nodes (by default)
establishing the control connection that is used to receive notifications about changes in the cluster - nodes going up & down, schema changes, etc. If a node goes down or comes up, this information is used to update the list of active nodes
providing load balancing of requests based on replication and node availability - if a node is down, it is excluded from the list of candidates, so queries are never sent to a node that is known to be down (see the sketch below)
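This question is about the Java driver, but the same contact-point mechanism is easy to see with the DataStax Node.js driver (cassandra-driver), which follows the same model; a minimal sketch, with placeholder addresses, datacenter and keyspace:

// Sketch with the DataStax Node.js driver (cassandra-driver); the Java
// driver used by Spring Data Cassandra behaves the same way.
// Addresses, datacenter and keyspace are placeholders.
const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  // Only one of these needs to be a live node; the driver then
  // discovers the rest of the cluster on its own.
  contactPoints: ['10.0.0.1', '10.0.0.2'],
  localDataCenter: 'datacenter1',
  keyspace: 'my_ks'
});

async function main() {
  await client.connect();
  // client.hosts holds every node the driver discovered, not just
  // the contact points, and tracks their up/down state.
  client.hosts.forEach((host) => {
    console.log(host.address, 'up:', host.isUp());
  });
  await client.shutdown();
}

main().catch(console.error);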

Is it possible to fake Cassandra connection?

I have been given the task of configuring a Cassandra DB for the project. We are facing a problem: every environment has a dedicated server for Cassandra, but for the DEV environment the client does not want to provide a separate server, and the current DEV servers are already fully loaded, so we can't afford to install Cassandra on them.
My question is: is there any way to fake the connection to Cassandra in that environment? I've created a CassandraConfiguration.java class and configured the session, cluster, etc.; it all works smoothly in the other environments, but on DEV it fails, as it cannot connect, because there's no Cassandra... Committing the CassandraConfiguration file as-is would break DEV.
You can use scassandra (simulated Cassandra) or Simulacron, which emulate Cassandra, or you can use cassandra-unit, which runs Cassandra in the same JVM as your tests.

Issue starting the coordinator

I have installed 3 ArangoDB servers, but they all listen only on port 8529; nothing listens on port 8530 for the coordinator, so I cannot create a cluster.
tcp 0 0 0.0.0.0:8529 0.0.0.0:* LISTEN 13142/arangod
So when I try to create a cluster via the web interface, I get the following error:
ERROR bootstrapping DB servers failed: Could not connect to 'tcp://10.0.0.18:8530' 'connect() failed with #111 - Connection refused'
How can I start and/or configure the coordinator so that it listens on my servers?
Regards
Dispatcher based clusters
Please note that dispatcher-based setups like the one you're asking about are intended for evaluation purposes only.
To start a cluster from the dispatcher web frontend, you need to configure all nodes to start the arangod daemon in dispatcher mode:
[cluster]
disable-dispatcher-kickstarter = no
disable-dispatcher-frontend = no
To start a cluster on a single machine you only need to install ArangoDB and reconfigure it once; it will then use the same installation to start the dispatcher and dbserver nodes.
One should know that the initial cluster startup may take a while.
Another side note is that authentication is not supported in this scenario, so you may need to turn it off.
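On the 2.x-era installs this setup targets, disabling authentication in arangod.conf looked roughly like the following (the option name here is from memory, so verify it against your version's documentation):
[server]
disable-authentication = true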
You should now find the log output of the dbserver and coordinator instances under /var/log/arangodb/cluster/, so you can see what actually went wrong.
Script based cloud install clusters
A better way to get a cluster running in the cloud may be to use one of the scripts we prepared for Digital Ocean, Google Compute Engine, AWS or Azure.
ArangoDB Clusters based on Mesosphere DCOS
The currently recommended way of running an ArangoDB cluster is to use Mesosphere DCOS, as Max describes in these slides using some example configurations.
ArangoDB is an official Mesosphere partner and we offer an official DCOS subcommand to manage an ArangoDB Cluster on Mesosphere DCOS.
Mesosphere adds additional services on top of Mesos and eases management of the Mesos cluster via the dcos-cli.
If you want to use a raw Apache Mesos cluster, you can use the Mesos framework directly to schedule the necessary tasks to create an ArangoDB cluster.
Meanwhile there is a better article about Running ArangoDB on DC/OS.

Migrating existing Marklogic Application Server from Linux to AWS

I'd like to migrate a MarkLogic 7 application server from a Linux environment to AWS.
I've seen pdfs/tutorials on creating a new server on AWS but I'm not sure how to migrate existing data and configurations.
There is more than one cluster.
Thanks
NGala
This question has nothing to do with AWS (AWS servers are just standard Linux servers). Consult your MarkLogic documentation on how to migrate between servers.
It makes a big difference whether you need to keep the server online the whole time or not. If you can shut it down, just install MarkLogic on an AWS Linux image and copy /var/opt/MarkLogic and any external data directories.
If you need to keep the system online, export a configuration package for your database and app server(s) from the MarkLogic Configuration Manager on port 8000, then import it on the new host. Then set up database replication as described at http://docs.marklogic.com/guide/database-replication/dbrep_intro - once replication has synchronized, fail over to the new system.
Specific to AWS, you could back up a database to S3 from one cluster and then restore it on another cluster. This works even outside AWS, as long as the system can access S3.

Cassandra native transport port 9042 slow on EC2 Machine

I have a 5 node Cassandra cluster set up on EC2, all in the same region.
If I connect over cqlsh (9160), queries respond in under a second.
When I connect via Dev Center, or using the native Java Driver, both of which use port 9042, the queries take over 20 seconds to respond.
Responses consistently land in the same 21-second region; it's never fast at first and slow later.
I have set up a few Cassandra Clusters on EC2 and have seen this before but do not know how to fix the problem. The last time, I scrapped the cluster and built a new one and the response time on port 9042 was fine.
Any help in how to debug or fix this problem would be appreciated, thanks.
The current version of DevCenter was designed mainly for running (longish) CQL scripts, rather than as an interactive console where queries are executed one after another. Under the hood, DevCenter uses the DataStax Java driver for Cassandra as its connector.
For that scenario, in order to ensure there are no "conflicts", a new Session is created for each execution. When a Session is initialized, the driver performs auto node discovery, creates connection pools, etc. Basically it does a lot of preparation work. Depending on the latency from your client machine to the EC2 nodes, the size of the cluster, and the configuration of those nodes (see the connection requirements), this initialization phase can be quite expensive.
As you can imagine, the time spent preparing wouldn't represent a large percentage of running a DDL script or a decent batch of inserts/updates. But for an interactive scenario it results in suboptimal behavior (the one you are describing).
The next version(s) of DevCenter will address the interactive scenario and optimize for it, so the user experience will be what you'd expect. Supporting this scenario is pretty high on our list of priorities.
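DevCenter itself uses the Java driver, but the fix on the application side is the same in any DataStax driver: create the Session once and reuse it. A minimal sketch of that pattern with the DataStax Node.js driver (cassandra-driver), with placeholder contact points:

// Pay the session initialization cost once, then reuse the client.
// Contact points and datacenter are placeholders.
const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['10.0.0.1'],
  localDataCenter: 'datacenter1'
});

// Slow part: node discovery, connection pools, etc. Done once.
const ready = client.connect();

async function query(cql, params) {
  await ready;
  // Fast part: executing on an already-initialized session.
  return client.execute(cql, params, { prepare: true });
}

// Every call shares the same session, so only the first one is slow.
query('SELECT release_version FROM system.local')
  .then((rs) => console.log(rs.rows[0]))
  .catch(console.error);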
The underlying Java driver obtains the whole cluster topology when it initially connects, which enables it to automatically connect to any node in the cluster. On EC2, however, it may only obtain the nodes' private addresses; it tries each one, times out, and only then sends the request over the initial connection.
