rafthttp: dial tcp timeout on etcd 3-node cluster creation - linux

I don't have access to the etcd part of the project's source code; however, I do have access to /var/log/syslog.
The goal is to set up a 3-node cluster.
(1) The very first etcd error that comes up is:
rafthttp: failed to dial 76e7ffhh20007a98 on stream MsgApp v2 (dial tcp 10.0.0.134:2380: i/o timeout)
Before continuing, I should say that I can ping all three nodes from each of the nodes. I have also tried opening TCP port 2380, still with no success: same error.
(2) Before that error I had the following messages from etcd, which in my opinion confirm that the cluster is set up correctly:
etcdserver/membership: added member 76e7ffhh20007a98 [https://server2:2380]
etcdserver/membership: added member 222e88db3803e816 [https://server1:2380]
etcdserver/membership: added member 999115e00e17123d [https://server3:2380]
In the /etc/hosts file these names are resolved as:
server2 10.0.0.135
server1 10.0.0.134
server3 10.0.0.136
(3) The initial setup, however, looks like this on each node:
embed: listening for peers on https://127.0.0.1:2380
embed: listening for client requests on 127.0.0.1:2379
So, to sum up, each node logs this initial setup (3), then adds members (2), and once these steps are done it fails with (1). As far as I know, etcd cluster creation follows this pattern: https://etcd.io/docs/v3.5/tutorials/how-to-setup-cluster/
Without the source code it is really hard to debug. Any ideas on the error and what could cause it?
UPD: etcdctl cluster-health output (ETCDCTL_ENDPOINT is exported):
cluster may be unhealthy: failed to list members
Error: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout; error #1: dial tcp 127.0.0.1:4001: connect: connection refused
error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
error #1: dial tcp 127.0.0.1:4001: connect: connection refused
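A successful ping only shows that ICMP gets through; it says nothing about whether TCP port 2380 is reachable. The following is a minimal sketch, assuming Python 3 is available on the nodes, that tries a plain TCP connect and reports whether it succeeded, was refused, or timed out. The helper name probe_tcp is hypothetical. A timeout typically points at dropped packets (firewall, wrong address), while a refusal typically means the host answered but nothing is listening on that address and port, which is what one would expect from another node if etcd is bound only to 127.0.0.1 as in (3).

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <reason>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer within the timeout: packets are probably being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered with a reset: nothing listens on this address/port.
        return "refused"
    except OSError as exc:
        # Anything else (no route to host, name resolution failure, ...).
        return f"error: {exc}"
```

Running probe_tcp("10.0.0.134", 2380), probe_tcp("10.0.0.135", 2380) and probe_tcp("10.0.0.136", 2380) from each node (the addresses come from the /etc/hosts entries above) would show which peer ports are actually reachable.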

Kubernetes error [::1]:6443: connect: cannot assign requested address

I got the following error:
controller.go:228] unable to sync kubernetes service: Post "https://[::1]:6443/api/v1/namespaces": dial tcp [::1]:6443: connect: cannot assign requested address
I have the following warnings in my Kubernetes cluster (3x3 masters/workers, on-prem on KVM), with 3 etcd members on the masters.
kubectl get events --field-selector type!=Normal -n kube-system
LAST SEEN   TYPE      REASON      OBJECT                             MESSAGE
3m25s       Warning   Unhealthy   pod/kube-apiserver-kube-master-1   Readiness probe failed: HTTP probe failed with statuscode: 500
3m24s       Warning   Unhealthy   pod/kube-apiserver-kube-master-2   Readiness probe failed: HTTP probe failed with statuscode: 500
3m25s       Warning   Unhealthy   pod/kube-apiserver-kube-master-2   Liveness probe failed: HTTP probe failed with statuscode: 500
3m27s       Warning   Unhealthy   pod/kube-apiserver-kube-master-3   Readiness probe failed: HTTP probe failed with statuscode: 500
17m         Warning   Unhealthy   pod/kube-apiserver-kube-master-3   Liveness probe failed: HTTP probe failed with statuscode: 500
This error does not affect my cluster or my services in any way. It has appeared from the beginning. How do I solve it? :D
Somewhere you are assigning the [::1] address to endpoints.
The endpoint IPs must not be: loopback (127.0.0.0/8 for IPv4, ::1/128
for IPv6), or link-local (169.254.0.0/16 and 224.0.0.0/24 for IPv4,
fe80::/64 for IPv6).
[::1] is the IPv6 loopback address, the equivalent of 127.0.0.1 in IPv4.
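The rule quoted above can be checked mechanically. This is a minimal sketch using Python's standard ipaddress module; the list of ranges is copied from the quoted text rather than from the Kubernetes source, and the function name is_forbidden_endpoint_ip is hypothetical.

```python
import ipaddress

# Ranges as quoted in the answer above (an assumption that the quote is
# complete; not taken from Kubernetes source code).
FORBIDDEN_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),     # IPv4 loopback
    ipaddress.ip_network("::1/128"),         # IPv6 loopback
    ipaddress.ip_network("169.254.0.0/16"),  # IPv4 link-local
    ipaddress.ip_network("224.0.0.0/24"),    # IPv4 link-local multicast
    ipaddress.ip_network("fe80::/64"),       # IPv6 link-local
]


def is_forbidden_endpoint_ip(address: str) -> bool:
    """Return True if the address falls in one of the quoted ranges."""
    ip = ipaddress.ip_address(address)
    return any(
        ip.version == net.version and ip in net for net in FORBIDDEN_NETWORKS
    )
```

With this, is_forbidden_endpoint_ip("::1") is true, which matches the address in the error, while an ordinary node address such as 10.0.0.134 is not flagged.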
I had the same error.
My coworker had deactivated IPv6 (to try something), and Kubernetes tried to use IPv6.
After rebooting my master, IPv6 came back and it worked again.
I searched for a bit and found this article: https://kubernetes.io/blog/2021/12/08/dual-stack-networking-ga/ which basically says you can set ipFamilyPolicy to one of three options:
SingleStack
PreferDualStack
RequireDualStack

Fabric: Issue connection refused 7050

I am trying to create a network from the Hyperledger Fabric tutorial. I get the following error:
Error: failed to create deliver client for orderer: orderer client failed to connect to localhost:7050: failed to create new connection: connection error: desc = "transport: error while dialing: dial tcp [::1]:7050: connect: connection refused"
I opened up the port on the CentOS 7 virtual machine and still no luck. The docker container is exposing the port to the host.
I removed all docker containers, images and volumes. I even rebuilt the VM from scratch.
Any help would be great.
Thanks,
This happens because you made a gRPC call to the orderer server but the call failed to reach it. There can be many reasons, but in most cases either the server is down (the orderer exited or went down because of misconfiguration) or the call could not reach the server because of misconfiguration.
I encountered this problem before even though the port was open. It turned out I had forgotten to put '-a' in the command (launch certificate authorities). Hope it helps.
You might also refer to this: https://hyperledger-fabric.readthedocs.io/en/release-2.0/build_network.html
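One detail the error message does show is that localhost was dialed as [::1], the IPv6 loopback. Whether that matters here is an assumption, but if the orderer's port is only reachable over IPv4, a connect to [::1]:7050 would be refused while 127.0.0.1:7050 works. The following minimal Python sketch lists what a name resolves to and tries each address in turn; resolve_candidates and try_each are hypothetical helper names.

```python
import socket
from typing import Dict, List, Tuple


def resolve_candidates(host: str, port: int) -> List[Tuple[str, str]]:
    """Return (family, address) pairs the resolver offers, in resolver order."""
    seen: List[Tuple[str, str]] = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        entry = (label, sockaddr[0])
        if entry not in seen:
            seen.append(entry)
    return seen


def try_each(host: str, port: int, timeout: float = 2.0) -> Dict[str, str]:
    """Attempt a TCP connect to every resolved address and record the outcome."""
    results: Dict[str, str] = {}
    for _label, address in resolve_candidates(host, port):
        try:
            with socket.create_connection((address, port), timeout=timeout):
                results[address] = "connected"
        except OSError as exc:
            results[address] = f"failed: {exc}"
    return results
```

Calling try_each("localhost", 7050) on the host would show whether only the IPv6 address fails; if so, pointing the client at 127.0.0.1:7050 instead of localhost:7050 is one thing to try.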

TCP connect error during parallel execution

Parallel execution of a program gives me an error:
TCP connect error: ECONNREFUSED.
DDI Process 41: error code 911
TCP: Connect failed. node2 -> master:36883.
TCP connect error: ECONNREFUSED.
TCP connect error: ECONNREFUSED.
TCP connect error: ECONNREFUSED.
TCP connect error: ECONNREFUSED.
TCP connect error: ECONNREFUSED.
I guess that this is due to some internal firewall of Ubuntu, because I am using a separate switch with no Internet connection.
I have followed various instructions to open the port and allow ssh via ufw, but I am still getting the same error.
Could you please let me know how to overcome this problem?
Which iptables commands should I run on the master and the other nodes?

How to fix etcd cluster misconfigured error

I have two servers: pg1 (10.80.80.195) and pg2 (10.80.80.196).
Version of etcd:
etcd Version: 3.2.0
Git SHA: 66722b1
Go Version: go1.8.3
Go OS/Arch: linux/amd64
I'm trying to run it like this:
pg1 server:
etcd --name infra0 --initial-advertise-peer-urls http://10.80.80.195:2380 --listen-peer-urls http://10.80.80.195:2380 --listen-client-urls http://10.80.80.195:2379,http://127.0.0.1:2379 --advertise-client-urls http://10.80.80.195:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster infra0=http://10.80.80.195:2380,infra1=http://10.80.80.196:2380 --initial-cluster-state new
pg2 server:
etcd --name infra1 --initial-advertise-peer-urls http://10.80.80.196:2380 --listen-peer-urls http://10.80.80.196:2380 --listen-client-urls http://10.80.80.196:2379,http://127.0.0.1:2379 --advertise-client-urls http://10.80.80.196:2379 --initial-cluster-token etcd-cluster-1 --initial-cluster infra0=http://10.80.80.195:2380,infra1=http://10.80.80.196:2380 --initial-cluster-state new
When I try to check the health state on pg1:
etcdctl cluster-health
I get an error:
cluster may be unhealthy: failed to list members
Error: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
What am I doing wrong and how do I fix it?
Both servers run on virtual machines with a bridged adapter.
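The two etcd invocations in the question differ only in the member name and the node's own IP; the --initial-cluster value has to be identical on every member, and each --name has to match its key in that list. A minimal Python sketch that generates both commands from one table makes that easier to keep consistent. The function name etcd_args is hypothetical, and the ports and http scheme are simply those used in the question. It reproduces the commands as posted and is not by itself a fix; with only two members, both must be up and able to reach each other on port 2380 before either will answer on 2379.

```python
from typing import Dict, List


def etcd_args(name: str, members: Dict[str, str],
              token: str = "etcd-cluster-1") -> List[str]:
    """Build the static-bootstrap command line for one member.

    members maps each member name to its IP; peer port 2380 and client
    port 2379 over plain http are assumed, as in the question.
    """
    ip = members[name]
    # Same string on every member, built once from the shared table.
    initial_cluster = ",".join(
        f"{n}=http://{addr}:2380" for n, addr in members.items()
    )
    return [
        "etcd",
        "--name", name,
        "--initial-advertise-peer-urls", f"http://{ip}:2380",
        "--listen-peer-urls", f"http://{ip}:2380",
        "--listen-client-urls", f"http://{ip}:2379,http://127.0.0.1:2379",
        "--advertise-client-urls", f"http://{ip}:2379",
        "--initial-cluster-token", token,
        "--initial-cluster", initial_cluster,
        "--initial-cluster-state", "new",
    ]
```

For example, " ".join(etcd_args("infra1", {"infra0": "10.80.80.195", "infra1": "10.80.80.196"})) yields the pg2 command from the question.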
I got a similar error when I set up etcd clusters using systemd, following the official tutorial from Kubernetes.
It's three medium CentOS 7 instances on AWS. I'm pretty sure the security groups are correct. I just ran:
$ systemctl restart network
and then
$ etcdctl cluster-health
gave a healthy result.

Cassandra nodetool in standalone mode

I've got Cassandra 0.7 running in standalone mode and I'm trying to run nodetool, but I'm getting JMX exceptions. Isn't the JMX configuration only required when accessing a remote server? I'm accessing my local machine.
Also why is nodetool looking for 63.251.179.13?
[rav@ubix bin]$ ./nodetool -h 127.0.0.1 flush
Error connection to remote JMX agent!
java.rmi.ConnectException: Connection refused to host: 63.251.179.13; nested exception is:
java.net.ConnectException: Connection refused
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128)
at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2343)
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:296)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267)
at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:144)
at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:114)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:621)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at java.net.Socket.connect(Socket.java:495)
at java.net.Socket.<init>(Socket.java:392)
at java.net.Socket.<init>(Socket.java:206)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
... 10 more
Thanks,
Try nodetool with -h or --host and -p or --port as per the instructions:
-h,--host <arg> node hostname or ip address
-p,--port <arg> remote jmx agent port number
When Cassandra is offline, check the ports in use to see if another process is using the default port that Cassandra binds to. You can find the default in conf/cassandra-env.sh
Once you know the port, you can see if another process is bound to it with netstat -an
If nothing is running on the port and you start up Cassandra, verify that it is running on the correct port and try to connect again with the -p or --port arguments. More information can be found here: http://wiki.apache.org/cassandra/GettingStarted
Is the machine Unix or Windows? Do you have a bad entry in /etc/hosts indicating that 127.0.0.1 maps to another hostname or IP address, namely 63.251.179.13?
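The /etc/hosts check suggested above can be done by eye, but a small script helps when the file is long. This is a minimal Python sketch that maps every name in hosts-file text to the addresses it is listed under; parse_hosts is a hypothetical helper, and whether nodetool's JMX connection actually picks up the address from this file is the hypothesis being tested, not an established fact.

```python
from typing import Dict, List


def parse_hosts(text: str) -> Dict[str, List[str]]:
    """Map every name in hosts-file text to the list of IPs it appears under."""
    mapping: Dict[str, List[str]] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments, then whitespace
        if not line:
            continue  # blank or comment-only line
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(ip)
    return mapping
```

Reading the real file with parse_hosts(open("/etc/hosts").read()) and looking up the machine's own hostname (socket.gethostname()) would show whether it is mapped to 63.251.179.13.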
I had a similar issue running nodetool on an instance of Cassandra running locally on my machine. When I ran nodetool -h 127.0.0.1, it threw a JMX-related exception that looked like this (with an IP address unknown to me):
Error connecting to remote JMX agent!
java.rmi.ConnectIOException: Exception creating connection to: ; nested exception is:
java.net.SocketException: Host is down
Douglas Muth posted a similar issue here, and from this, I found out that Cassandra seems to be recording the hostname at startup. Unfortunately, by the time I ran nodetool the hostname had become stale (my IP address is allocated dynamically).
My solution, then, was to restart Cassandra, which updated the IP, and rerun nodetool. No more JMX errors, no more strange IP address. This worked a treat for me, as I'm running a local instance of Cassandra on localhost and don't mind the restart, but it's not a very satisfactory solution.
