How to reconcile data automatically between Gitaly nodes - GitLab

I have deployed GitLab with a separate Praefect server and 3 Gitaly nodes.
This setup works fine, but I run into issues when I replace a Gitaly node with a new server: data is not replicated from the other Gitaly nodes.
I tried the command below but got an error, as I am using a TLS connection between Praefect and the Gitaly nodes.
sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml reconcile -virtual default -reference gitaly-2 -target gitaly-1 -f
I get the error below:
unable to reconcile: failed to dial "localhost:2305" connection: context deadline exceeded
I am looking for pointers on how to reconcile data automatically between Gitaly nodes.
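As a first diagnostic for the "failed to dial" error above, it can help to confirm whether anything accepts TCP connections at the address from the error message. The sketch below is only a reachability check: it does not speak the Praefect protocol or perform a TLS handshake, and the helper name can_connect is made up for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds in time."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling can_connect("localhost", 2305) on the Praefect server shows whether the address from the error is listening at all; a False result suggests checking which listen addresses are configured in the Praefect config.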

Related

Cannot connect to cluster with cqlsh using secure connect bundle

I am getting an error when I try to connect to a DataStax Cassandra instance.
bin/cqlsh -u admin -p PASSWORD -b BUNDLE_ZIP_PATH
Connection error: ('Unable to connect to any servers', \
{'xxx:xxx:xxx': ValueError('No host_id to create the SniEndPoint',)} \
)
Has anyone seen this error? This is a connection to a cloud-managed DataStax instance on IBM Cloud, and the connection used to work before.
The error is generated by the embedded Python driver that cqlsh uses to connect to clusters. It indicates that it couldn't get the host from the secure bundle.
The most likely cause is that the secure bundle you're using is corrupted so I'd suggest downloading it from the source again. Cheers!
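Before downloading the bundle again, a quick integrity check can confirm whether the ZIP file itself is damaged. This is a minimal sketch using Python's standard zipfile module; the function name check_bundle is hypothetical, and passing the CRC check does not prove the bundle's contents are valid for your cluster.

```python
import zipfile


def check_bundle(path):
    """Return (ok, detail) describing the integrity of a bundle ZIP file."""
    if not zipfile.is_zipfile(path):
        return False, "not a valid ZIP archive"
    with zipfile.ZipFile(path) as bundle:
        # testzip() reads every member and returns the first one whose
        # CRC or header is bad, or None if all members are readable.
        bad_member = bundle.testzip()
        if bad_member is not None:
            return False, f"corrupted member: {bad_member}"
        return True, f"{len(bundle.namelist())} members passed the CRC check"
```

For example, check_bundle("BUNDLE_ZIP_PATH") returns a tuple whose first element is False when the file is truncated or is not a ZIP archive at all.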

Databricks DBT Runtime Error, cannot connect to Database. Maybe an SSL error?

I have a custom Databricks instance with a domain name that points to an AWS load balancer. When I enter that information using either the HTTP instructions here or the Databricks cluster instructions here, I get this response in the dbt CLI:
Connection:
host: https://subdomain.domain.com
port: 443
cluster: 123456-stuff00003
endpoint: None
schema: default
organization: 0
16:40:39.470091 [debug] [MainThread]: Acquiring new spark connection "debug"
16:40:39.471632 [debug] [MainThread]: Using spark connection "debug"
16:40:39.472524 [debug] [MainThread]: On debug: select 1 as id
16:40:39.472953 [debug] [MainThread]: Opening a new connection, currently in state init
Connection test: [ERROR]
1 check failed:
dbt was unable to connect to the specified database.
The database returned the following error:
>Runtime Error
Database Error
failed to connect
Unfortunately, dbt's debug logs are not helpful here and I am not entirely sure why it is failing. I do know that when I connect to the cluster via IntelliJ I have to provide the CA file, the client certificate file, and the client key file, because I am using a self-signed SSL cert (unfortunately, the self-signed cert is required). Also, when defining my ~/.databrickscfg file I have to provide the argument insecure = true.
I encountered this issue recently and fixed it by installing the root certificates: I ran the "Install Certificates.command" script in the Python home directory used to run dbt.
Laurent
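To see which root certificates the Python interpreter running dbt trusts by default, the standard ssl module can report its trust store locations. This is a rough sketch, with hypothetical function names; it shows Python's defaults only and does not reflect any certificate options a dbt adapter may set itself.

```python
import ssl


def describe_trust_store():
    """Return the CA file and directory locations Python uses by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # resolved CA bundle, or None
        "capath": paths.capath,                  # resolved CA directory, or None
        "openssl_cafile": paths.openssl_cafile,  # OpenSSL's compiled-in default
        "openssl_capath": paths.openssl_capath,
    }


def count_loaded_ca_certs():
    """Count CA certificates loaded into a default SSL context.

    Certificates that live only in a capath directory are loaded lazily,
    so they may not be counted here.
    """
    context = ssl.create_default_context()
    return len(context.get_ca_certs())
```

If cafile is None and no certificates are loaded, the interpreter has no root certificates to verify against, which matches the situation the certificate install script is meant to fix.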

Cassandra service (3.11.5) stops automatically after it starts/restarts on AWS Linux

Cassandra service (3.11.5) stops automatically after it starts or restarts on AWS Linux.
I have a fresh installation of Cassandra on a new AWS Linux instance (t3.xlarge). When I run
sudo service cassandra start
or
sudo service cassandra restart
the service stops automatically after 1 or 2 seconds. I looked into the logs and found the error below.
I am not sure what causes it: I haven't changed any snitch-related configs, and the snitch has always been SimpleSnitch. I don't run multiple Cassandra nodes, just a single one on one EC2 instance.
Logs
INFO [main] 2020-02-12 17:40:50,833 ColumnFamilyStore.java:426 - Initializing system.schema_aggregates
INFO [main] 2020-02-12 17:40:50,836 ViewManager.java:137 - Not submitting build tasks for views in keyspace system as storage service is not initialized
INFO [main] 2020-02-12 17:40:51,094 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds
ERROR [main] 2020-02-12 17:40:51,137 CassandraDaemon.java:759 - Cannot start node if snitch's data center (datacenter1) differs from previous data center (dc1). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
Installation steps
sudo curl -OL https://www.apache.org/dist/cassandra/redhat/311x/cassandra-3.11.5-1.noarch.rpm
sudo rpm -i cassandra-3.11.5-1.noarch.rpm
sudo pip install cassandra-driver
export CQLSH_NO_BUNDLED=true
sudo chkconfig --levels 3 cassandra on
The issue is shown in your log file:
ERROR [main] 2020-02-12 17:40:51,137 CassandraDaemon.java:759 - Cannot start node if snitch's data center (datacenter1) differs from previous data center (dc1). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
It seems that you started the cluster, stopped it, and then renamed the datacenter from dc1 to datacenter1.
To fix it:
If no data is stored, delete the data directories.
If data is stored, rename the datacenter back to dc1 in the config.
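To confirm which name the node is configured with and which name it stored earlier, the two values can be pulled straight out of the error line. A minimal sketch, assuming the message format shown in the log above; the function name find_dc_mismatch is made up for illustration.

```python
import re

# Matches the startup error quoted above and captures both datacenter names.
DC_MISMATCH = re.compile(
    r"snitch's data center \((?P<configured>[^)]+)\) "
    r"differs from previous data center \((?P<previous>[^)]+)\)"
)


def find_dc_mismatch(log_text):
    """Return (configured, previous) datacenter names, or None if not found."""
    match = DC_MISMATCH.search(log_text)
    if match is None:
        return None
    return match.group("configured"), match.group("previous")
```

Run against the log in the question, this yields datacenter1 as the currently configured name and dc1 as the name saved from the earlier start.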
I had the same problem, where the Cassandra service stopped immediately after it was started.
In the Cassandra configuration file located at /etc/cassandra/cassandra.yaml, change the cluster_name back to the previous one, like this:
...
# The name of the cluster. This is mainly used to prevent machines in
# one logical cluster from joining another.
cluster_name: 'dc1'
# This defines the number of tokens randomly assigned to this node on the ring
# The more tokens, relative to other nodes, the larger the proportion of data
...

Error reading from 192.168.1.164:44214: rpc error: code = Canceled desc = context canceled

I got this error when trying to connect peers running on different machines. I found it in the Docker logs of the orderer. There is also an error in the Docker logs of peer2, which runs on a different machine:
Failed obtaining connection: Could not connect to any of the endpoints: [orderer.example.com:7050]
You can find the orderer.yaml file in the fabric-samples/config folder.
Going through the fields and their comments in orderer.yaml and core.yaml can help you understand how to configure the network (orderer/peer).
And here you can get the info related to TLS.

SPARK YARN: cannot send job from client (org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032)

I'm trying to submit a Spark job to YARN (without HDFS) in HA mode.
To submit it I'm using org.apache.spark.deploy.SparkSubmit.
When I send the request from the machine with the active Resource Manager, it works well. But if I send it from the machine with the standby Resource Manager, the job fails with this error:
DEBUG org.apache.hadoop.ipc.Client - Connecting to spark2-node-dev/10.10.10.167:8032
DEBUG org.apache.hadoop.ipc.Client - Connecting to /0.0.0.0:8032
org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep
However, when I send the request via the command line (spark-submit), it works well from both the active and the standby machine.
What can cause the problem?
P.S. I use the same parameters for both ways of submitting the job: org.apache.spark.deploy.SparkSubmit and the spark-submit command line. The yarn.resourcemanager.hostname.rm_id properties are defined for all RM hosts.
The problem was that yarn-site.xml was missing from the classpath of the spark-submitter jar. The spark-submitter jar does not take the YARN_CONF_DIR or HADOOP_CONF_DIR environment variables into account, so it cannot see yarn-site.xml.
One solution that I found was to put yarn-site.xml on the classpath of the jar.
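A quick way to verify the fix is to check which classpath entries actually provide yarn-site.xml. The sketch below is an illustrative helper, not part of Spark or Hadoop: the name find_yarn_site is hypothetical, it only looks at directories and at the root of jar files, and it does not expand Java wildcard entries such as lib/*.

```python
import os
import zipfile


def find_yarn_site(classpath, filename="yarn-site.xml"):
    """Return the classpath entries that contain filename at their root."""
    hits = []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if os.path.isdir(entry):
            # Directory entry: the file must sit directly inside it.
            if os.path.isfile(os.path.join(entry, filename)):
                hits.append(entry)
        elif zipfile.is_zipfile(entry):
            # Jar entry: jars are ZIP archives, so look at the member names.
            with zipfile.ZipFile(entry) as jar:
                if filename in jar.namelist():
                    hits.append(entry)
    return hits
```

An empty result for the classpath passed to the submitting JVM would be consistent with the client falling back to 0.0.0.0:8032 as seen in the log.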
