Vitess on Kubernetes: Error starting vreplication engine: error in connecting to mysql db with connection <nil>

Kubernetes version: v1.16.3
Linux version: 7.3.1611
When starting a Vitess cluster on Kubernetes using the default operator.yaml and 101_initial_cluster.yaml, one of the example-vttablet-zone1-xxx pods keeps restarting forever.
Using kubectl logs -f example-vttablet-zone1-2548885007-46a852d0 -c vttablet to see the logs, I got:
W0706 07:42:02.200507 1 tm_init.go:531] Cannot get current mysql port, will keep retrying every 1s: net.Dial(/vt/socket/mysql.sock) to local server failed: dial unix /vt/socket/mysql.sock: connect: no such file or directory (errno 2002) (sqlstate HY000)
E0706 07:42:02.285406 1 engine.go:213] Error starting vreplication engine: error in connecting to mysql db with connection <nil>, err net.Dial(/vt/socket/mysql.sock) to local server failed: dial unix /vt/socket/mysql.sock: connect: no such file or directory (errno 2002) (sqlstate HY000), will keep retrying.
E0706 07:42:02.285504 1 state_manager.go:276] Error transitioning to the desired state: MASTER, Serving, will keep retrying: net.Dial(/vt/socket/mysql.sock) to local server failed: dial unix /vt/socket/mysql.sock: connect: no such file or directory (errno 2002) (sqlstate HY000)
I0706 07:42:02.285527 1 state_manager.go:661] State: exiting lameduck
E0706 07:42:02.285539 1 tm_state.go:258] Cannot start query service: net.Dial(/vt/socket/mysql.sock) to local server failed: dial unix /vt/socket/mysql.sock: connect: no such file or directory (errno 2002) (sqlstate HY000)
I0706 07:42:02.285553 1 tm_state.go:305] Publishing state: alias:<cell:"zone1" uid:2548885007 > hostname:"10.233.107.217" port_map:<key:"grpc" value:15999 > port_map:<key:"vt" value:15000 > keyspace:"commerce" shard:"-" key_range:<> type:MASTER db_name_override:"vt_commerce" mysql_hostname:"10.233.107.217" master_term_start_time:<seconds:1625527268 nanoseconds:196807555 >
I didn't change any YAML in the operator directory. Does anyone know why this happens?
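All of the errors above say that /vt/socket/mysql.sock does not exist, i.e. mysqld never came up inside the pod, so the mysqld side of the same pod is worth checking first. A minimal diagnostic sketch, assuming the default operator layout where the vttablet pod also runs a mysqld sidecar container (that container name is an assumption, not taken from the logs):
# inspect the mysqld container of the same pod
kubectl logs -f example-vttablet-zone1-2548885007-46a852d0 -c mysqld
# check recent pod events (failed mounts, OOMKilled, probe failures, etc.)
kubectl describe pod example-vttablet-zone1-2548885007-46a852d0
If mysqld is crash-looping there, vttablet will keep retrying the socket exactly as shown in the log above.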

Related

rafthttp: dial tcp timeout on etcd 3-node cluster creation

I don't have access to the etcd part of the project's source code, but I do have access to /var/log/syslog.
The goal is to set up a 3-node cluster.
(1) The very first etcd error that comes up is:
rafthttp: failed to dial 76e7ffhh20007a98 on stream MsgApp v2 (dial tcp 10.0.0.134:2380: i/o timeout)
Before continuing, I should say that I can ping all three nodes from each of the nodes. I have also tried opening TCP port 2380, with no success: same error.
(2) Before that error I had the following messages from etcd, which in my opinion confirm that the cluster is set up correctly:
etcdserver/membership: added member 76e7ffhh20007a98 [https://server2:2380]
etcdserver/membership: added member 222e88db3803e816 [https://server1:2380]
etcdserver/membership: added member 999115e00e17123d [https://server3:2380]
In /etc/hosts file these DNS names are resolved as:
server2 10.0.0.135
server1 10.0.0.134
server3 10.0.0.136
(3) The initial setup on each node, however, looks like this:
embed: listening for peers on https://127.0.0.1:2380
embed: listening for client requests on 127.0.0.1:2379
So, to sum up, each node has this initial setup log (3), then adds members (2), and once these steps are done it fails with (1). As far as I know, etcd cluster creation follows this pattern: https://etcd.io/docs/v3.5/tutorials/how-to-setup-cluster/
Without knowing the source code it is really hard to debug, but maybe someone has ideas on the error and what could cause it?
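For reference, the listeners in (3) are bound to 127.0.0.1, while the members added in (2) advertise https://serverN:2380, which would explain peers timing out when dialing 10.0.0.134:2380. A minimal sketch of peer/client URL flags that would make each node reachable from the others (values are illustrative for server1 / 10.0.0.134, not taken from the post; TLS cert/key flags omitted for brevity):
etcd --name server1 \
  --listen-peer-urls https://10.0.0.134:2380 \
  --initial-advertise-peer-urls https://10.0.0.134:2380 \
  --listen-client-urls https://10.0.0.134:2379,https://127.0.0.1:2379 \
  --advertise-client-urls https://10.0.0.134:2379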
UPD: etcdctl cluster-health output (ETCDCTL_ENDPOINT is exported):
cluster may be unhealthy: failed to list members
Error: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout; error #1: dial tcp 127.0.0.1:4001: connect: connection refused
error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
error #1: dial tcp 127.0.0.1:4001: connect: connection refused
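Despite ETCDCTL_ENDPOINT being exported, the output above shows etcdctl falling back to the default local endpoints (127.0.0.1:2379 and the legacy 127.0.0.1:4001), so it may be worth passing the endpoint explicitly. A hedged example, assuming the v2 etcdctl that prints this output (adjust the scheme and port to the real client URLs):
etcdctl --endpoints http://10.0.0.134:2379 cluster-health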

502 Bad Gateway issue while starting JFrog

I am trying to bring JFrog Artifactory up. Locally, Tomcat is running and the Artifactory service also looks fine, but the UI is not coming up; I am getting a 502 Bad Gateway error. Below is the console log:
[TRACE] [Service registry ping] operation attempt #94 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] running retry attempt #95
[INFO ] Cluster join: Retry 95: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] operation attempt #95 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
2022-09-10T06:14:20.271Z [jffe ] [INFO ] [ ] [ ] [main ] - pinging artifactory, attempt number 90
2022-09-10T06:14:20.274Z [jffe ] [INFO ] [ ] [ ] [main ] - pinging artifactory attempt number 90 failed with code : ECONNREFUSED
[TRACE] [Service registry ping] running retry attempt #96
[DEBUG] Cluster join: Retry 96: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] operation attempt #96 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
2022-09-10T06:14:21.188Z [jfrou] [INFO ] [2b4bfed554e45cf6] [join_executor.go:169 ] [main ] [] - Cluster join: Retry 100: Service registry ping failed, will retry. Error: could not parse error from service registry, status code: 404, raw body: <!doctype html><html lang="en"><head><title>HTTP Status 404 – Not Found</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 – Not Found</h1></body></html>
[TRACE] [Service registry ping] running retry attempt #97
[DEBUG] Cluster join: Retry 97: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
[TRACE] [Service registry ping] operation attempt #97 failed. retrying in 1s. current error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
2022-09-10T06:14:22.016Z [jfmd ] [INFO ] [ ] [accessclient.go:60 ] [main ] - Cluster join: Retry 100: Service registry ping failed, will retry. Error: Error while trying to connect to local router at address 'http://localhost:8046/access': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused [access_client]
[TRACE] [Service registry ping] running retry attempt #98
[DEBUG] Cluster join: Retry 98: Service registry ping failed, will retry. Error: error while trying to connect to local router at address 'http://localhost:8046/access/api/v1/system/ping': Get "http://localhost:8046/access/api/v1/system/ping": dial tcp 127.0.0.1:8046: connect: connection refused
and this is the error I am getting in the UI:
502 Bad Gateway error
Is it a newly installed Artifactory instance? If yes, verify whether the required ports are in place (open at the firewall level). If the ports are already available, disable the IPv6 address on the VM where Artifactory is installed and restart Artifactory. This error can occur if the application tries to pick up the IPv6 address for initialisation instead of IPv4.
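A rough sketch of those two checks on the Artifactory host (8046 is the router port from the log; 8081/8082 are assumed Artifactory defaults, and the sysctl lines are a generic Linux way to disable IPv6, not JFrog-specific):
# which local addresses are the JFrog services bound to?
sudo ss -lntp | grep -E ':8046|:8081|:8082'
# temporarily disable IPv6, then restart Artifactory
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1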

How to install MongoDB Enterprise 4.4 on remote redhat server?

I followed the instructions listed here, https://docs.mongodb.com/manual/tutorial/install-mongodb-enterprise-on-red-hat/, and tried to install on a remote server from my local machine. I SSH'd from my local machine into the server and performed the installation steps.
I'm not sure whether there are additional steps that need to be completed, or whether you have to set directory paths other than the defaults since this is a server rather than a local machine. My current error occurs when I run mongo from my terminal:
Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused :
connect#src/mongo/shell/mongo.js:374:17
#(connect):2:6
exception: connect failed
exiting with code 1
[h699972#csc2cxp00020938 ~]$ mongo --host
Running sudo vim /etc/mongod.conf and setting bindIp: 0.0.0.0 did not work. Any help would be appreciated.
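Connection refused on 127.0.0.1:27017 usually means mongod itself is not running (or not listening on that port) rather than a bindIp problem, so the service is worth checking first. A small sketch, assuming the mongod systemd unit and default log path that the Enterprise RPM installs (not confirmed by the post):
sudo systemctl status mongod                    # is the service running at all?
sudo tail -n 50 /var/log/mongodb/mongod.log     # if it stopped, the reason is usually here
sudo ss -lntp | grep 27017                      # is anything listening on 27017?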

How to fix etcd cluster misconfigured error

I have two servers: pg1 (10.80.80.195) and pg2 (10.80.80.196).
Version of etcd:
etcd Version: 3.2.0
Git SHA: 66722b1
Go Version: go1.8.3
Go OS/Arch: linux/amd64
I'm trying to run it like this:
pg1 server:
etcd --name infra0 --initial-advertise-peer-urls http://10.80.80.195:2380 --listen-peer-urls http://10.80.80.195:2380 --listen-client-urls http://10.80.80.195:2379,http://127.0.0.1:2379 --advertise-client-urls http://10.80.80.195:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster infra0=http://10.80.80.195:2380,infra1=http://10.80.80.196:2380 --initial-cluster-state new
pg2 server:
etcd --name infra1 --initial-advertise-peer-urls http://10.80.80.196:2380 --listen-peer-urls http://10.80.80.196:2380 --listen-client-urls http://10.80.80.196:2379,http://127.0.0.1:2379 --advertise-client-urls http://10.80.80.196:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster infra0=http://10.80.80.195:2380,infra1=http://10.80.80.196:2380 --initial-cluster-state new
When trying to check the health state on pg1:
etcdctl cluster-health
I get this error:
cluster may be unhealthy: failed to list members
Error: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
What am I doing wrong, and how do I fix it?
Both servers run on virtual machines with a Bridged Adapter.
I got a similar error when I set up etcd clusters using systemd, following the official Kubernetes tutorial.
It's three CentOS 7 medium instances on AWS. I'm pretty sure the security groups are correct. I just ran:
$ systemctl restart network
and then
$ etcdctl cluster-health
just gives a healthy result.
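Before (or after) restarting the network, it can also help to confirm that each member answers on its advertised client URL instead of relying on etcdctl's 127.0.0.1 defaults; a small check using the addresses from the question (the /health endpoint is standard etcd, not part of the original answer):
curl http://10.80.80.195:2379/health
curl http://10.80.80.196:2379/health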

PuppetDB configuration not working

I'm trying to configure PuppetDB on the same Puppet master server. I followed the Puppet documentation, installed the database, and configured Puppet to use it.
When I run the puppet agent --test command, it gives the error message below.
I didn't see any process running on port 8081, but I do see the Puppet Java process running on port 8140.
How can I resolve this error?
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: Error 500 on SERVER: Server Error: Could not retrieve facts for webserver: Failed to find facts from PuppetDB at puppet:8140: Failed to execute '/pdb/query/v4/nodes/webserver/facts' on at least 1 of the following 'server_urls': https://puppetdb:8081
Info: Retrieving pluginfacts
Info: Retrieving plugin
Warning: Error connecting to puppetdb on 8081 at route /pdb/query/v4/nodes/webserver/facts, error message received was 'Connection refused - connect(2) for "puppetdb" port 8081'. Failing over to the next PuppetDB server_url in the 'server_urls' list
Error: Cached facts for webserver failed: Failed to find facts from PuppetDB at puppet:8140: Failed to execute '/pdb/query/v4/nodes/webserver/facts' on at least 1 of the following 'server_urls': https://puppetdb:8081
Info: Loading facts
Info: Caching facts for webserver
Warning: Error connecting to puppetdb on 8081 at route /pdb/cmd/v1?checksum=039e22c7bf98e9cbf2f08169047d288c9b451c73&version=5&certname=webserver&command=replace_facts, error message received was 'Connection refused - connect(2) for "puppetdb" port 8081'. Failing over to the next PuppetDB server_url in the 'server_urls' list
Error: Failed to execute '/pdb/cmd/v1?checksum=039e22c7bf98e9cbf2f08169047d288c9b451c73&version=5&certname=webserver&command=replace_facts' on at least 1 of the following 'server_urls': https://puppetdb:8081
Error: Could not retrieve local facts: Failed to execute '/pdb/cmd/v1?checksum=039e22c7bf98e9cbf2f08169047d288c9b451c73&version=5&certname=webserver&command=replace_facts' on at least 1 of the following 'server_urls': https://puppetdb:8081
Error: Failed to apply catalog: Could not retrieve local facts: Failed to execute '/pdb/cmd/v1?checksum=039e22c7bf98e9cbf2f08169047d288c9b451c73&version=5&certname=webserver&command=replace_facts' on at least 1 of the following 'server_urls': https://puppetdb:8081
I hope you checked that the SSL certs stored in /etc/puppetlabs/puppetdb/ssl match /etc/puppetlabs/puppet/ssl/certs/<certname of your puppetserver FQDN>.
This can be verified with:
puppetdb ssl-setup
Sample output:
puppetdb ssl-setup
PEM files in /etc/puppetlabs/puppetdb/ssl already exists, checking integrity.
Setting ssl-host in /etc/puppetlabs/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-port in /etc/puppetlabs/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-key in /etc/puppetlabs/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-cert in /etc/puppetlabs/puppetdb/conf.d/jetty.ini already correct.
Let me know if you have further issues. I had the same issue and rectified it by removing the /etc/puppetlabs/puppetdb/ssl directory and rerunning the "puppetdb ssl-setup" command.
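A sketch of that regenerate-and-rerun sequence (the backup and the service restart are assumptions on top of what is described above):
mv /etc/puppetlabs/puppetdb/ssl /etc/puppetlabs/puppetdb/ssl.bak   # keep a backup instead of deleting outright
puppetdb ssl-setup                                                 # regenerate the PEM files and jetty.ini settings
systemctl restart puppetdb                                         # pick up the new certs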
For some reason the puppetdb process went down, which is why nothing was running on port 8081. I restarted the puppetdb process, and then the puppet agent --test command started connecting to the webserver.
Here is the output of the puppetdb service on CentOS 7.
# systemctl status puppetdb
● puppetdb.service - puppetdb Service
Loaded: loaded (/usr/lib/systemd/system/puppetdb.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2017-03-28 18:26:58 EDT; 1h 20min ago
Main PID: 5503 (java)
CGroup: /system.slice/puppetdb.service
└─5503 /usr/bin/java -Xmx192m -Djava.security.egd=/dev/urandom -XX:OnOutOfMemoryError=kill -9 %p -cp /opt/puppetlabs/...
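For completeness, a hedged sketch of the restart-and-verify steps described above (the port check is an assumption, not from the original answer):
systemctl restart puppetdb
ss -lntp | grep 8081      # PuppetDB should be listening on 8081 again
puppet agent --test       # re-run the agent against PuppetDB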
