I have two different machines in the cloud.
Containers on the first machine:
orderer.mydomain.com
peer0.org1.mydomain.com
db.peer0.org1.mydomain.com
ca.org1.mydomain.com
Containers on the second machine:
peer0.org2.mydomain.com
db.peer0.org2.mydomain.com
ca.org2.mydomain.com
I start them both. I can make them both join the same channel. I deploy a BNA exported from Hyperledger Composer to both peers. I send transactions to peer0.org1.mydomain.com, then query peer0.org2.mydomain.com and get the same results.
Everything works perfectly so far.
However, after 5-10 minutes the peer on the second machine (peer0.org2) gets disconnected from the orderer. When I send transactions to org1, I can query them from org1 and see the results. But org2 gets detached: it doesn't accept new transactions (the orderer connection is gone). I can still query org2 and see the old results.
I added CORE_CHAINCODE_KEEPALIVE=30 to my peer environment variables. I see keep-alive actions in the org2 peer's logs, but it didn't solve my problem.
I should note that the containers are in a Docker network called "basic". This network was used on my local computer, but it still works in the cloud.
In the orderer logs:
Error sending to stream: rpc error: code = Internal desc = transport is closing
This happens every time I try. But when I run these containers on my local machine, they stay connected without problems.
EDIT1: After checking the logs: peer0.org2 receives all transactions and sends them to the orderer. The orderer receives the requests from the peer but can't update the peers. I can connect to both the requestUrl and the eventUrl on the problematic peer, so there is no network problem.
I guess I found the problem. It is about MS Azure networking. After 4 minutes, Azure cuts idle connections:
https://discuss.pivotal.io/hc/en-us/articles/115005583008-Azure-Networking-Connection-idle-for-more-than-4-minutes
EDIT2:
Yes, the problem was MS Azure. If anyone out there is trying to run Hyperledger on Azure, keep in mind that if a peer stays idle for more than 4 minutes, Azure times out the TCP connection. You can configure the idle timeout up to 30 minutes. It is not a bug, but it was annoying for us not being able to understand why it stopped working after 4 minutes.
So you can use your own server or another cloud provider, or use Azure by adapting to its rules.
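For reference, a rough sketch of the keep-alive overrides that can keep the peer-orderer gRPC connection from looking idle to Azure. The variable names below map to peer.keepalive.* in core.yaml and General.Keepalive.* in orderer.yaml; treat them as assumptions and verify them against your Fabric version:
# peer environment
CORE_PEER_KEEPALIVE_CLIENT_INTERVAL=60s
CORE_PEER_KEEPALIVE_DELIVERYCLIENT_INTERVAL=60s
# orderer environment
ORDERER_GENERAL_KEEPALIVE_SERVERMINTIME=60s
ORDERER_GENERAL_KEEPALIVE_SERVERINTERVAL=120s
Raising the idle timeout on the Azure load balancer or public IP (up to the 30-minute maximum mentioned above) works as well, and the two approaches can be combined.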
Related
I'm developing an insurance application project that uses a Hyperledger Fabric network. I currently have a problem where my orderer nodes do not stay online for more than about 10 seconds before they crash. Inspecting the logs, there are a multitude of error messages suggesting that the TLS certificates do not agree. The error messages do not specify exactly which certificates are at fault, but further up the logs is an error saying it could not match an expected certificate with the certificate it found instead (shown in this screenshot). While this error was also vague, I deduced that it was comparing the orderer's public key certificate with the certificate inside the locally stored genesis block. Upon inspecting the genesis block on the orderer node, it is indeed a completely different certificate. I have noticed that even after destroying the whole network in Docker and rebuilding it, the certificate inside the genesis block stored on the orderer nodes always remains exactly the same.
In terms of my network layout, I have 2 organizations: one for my orderers and one for my peers. I also have 4 CA servers (a CA and a TLS CA server for each of the orderer and peer organizations).
I have 3 peer nodes and 3 orderer nodes.
Included below are a pastebin of the logs from the orderer1 node before it crashes, and the GitHub repo:
orderer1 logs: https://pastebin.com/AYcpAKHn
repo: https://github.com/Ewan99/InsurApp/tree/remake
NOTE: When running the app, first run ./destroy.sh, then ./build.sh
"iaorderer" is the org for my orderers
"iapeer" is the org for my peers
I've tried renaming the certificates in case they were being overwritten by each other on creation.
I tried reducing from 3 orderers down to 1 to see if it made any difference.
Of course, in going from 3 orderers to 1 I changed from Raft to solo, and I still encountered the same problems.
As per david_k's comment suggestion:
Looks like you are using persistent volumes in Docker, so do you prune these volumes when you start again from scratch? If you don't, then you can pick up data that doesn't match the newly created crypto material.
I had to prune my old Docker volumes, as they were conflicting and providing newly built containers with older certificates.
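For reference, a typical cleanup sequence looks roughly like this (a sketch assuming docker-compose and default volume naming; ./destroy.sh would be the natural place for it):
docker-compose down --volumes    # stop the containers and remove their named volumes
docker volume prune -f           # drop any remaining dangling volumes
With the volumes gone, the next ./build.sh run starts from freshly generated crypto material and a new genesis block instead of the stale ones.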
I looked at the docker-compose.yaml file and there is something there that by all accounts should cause a failure. Each peer definition uses the same port, e.g.:
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer1.iapeer.com:7051
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer2.iapeer.com:7051
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer3.iapeer.com:7051
To my mind, this cannot possibly work while running on a single server, but perhaps I am missing something.
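For illustration only (the port numbers are an assumption, following the usual fabric-samples convention), giving each peer its own endpoint port would look like:
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer1.iapeer.com:7051
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer2.iapeer.com:8051
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer3.iapeer.com:9051
with each peer's CORE_PEER_ADDRESS, listen address and published port changed to match.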
I was trying to migrate my Hyperledger Fabric network (running a Raft ordering service) from one host to another.
In this process, I made sure that TLS communication is respected, which means I made the required changes in the system channel before the migration. I used the backup and the genesis block (of the old ordering service) to restore the network on the target host. One new thing I found was that when the orderer nodes started on the new host, it took 10 minutes for them to sync blocks and start the Raft election.
The question is: is this a default time configured in the orderer code-base, or is it some other functionality?
NOTE: I know that the addition of an existing orderer node to an application channel takes 5 minutes by default for that orderer to detect the change. So, is the above situation something similar to this, or is it a different capability?
The complete logs of the orderer node (the one that was started first on the new host) can be found here.
Eviction suspicion is a mechanism which triggers after a default timeout of 10 minutes.
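If you need to change it, the timeout is, to the best of my knowledge, the EvictionSuspicion setting in the orderer's local etcdraft configuration, shown below as the usual environment-variable override; treat the exact name as an assumption and verify it against your orderer.yaml and Fabric version:
ORDERER_CONSENSUS_EVICTIONSUSPICION=10m    # default; lower it if the orderer should start checking for eviction and pulling blocks sooner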
I have a 3-org, 6-peer setup with the Node SDK and 5 Raft orderers. Raft is working fine: I tried killing leaders and a new election takes place. The SDK is also working well and can invoke transactions. But the problem bothering me is that when the network starts, the ordering side defaults to the first orderer, for example orderer1.example.com, and if I kill this first orderer the network fails: invoking a transaction fails while Raft elects a new leader. When I try to invoke a transaction it shows "connection failed", "cannot connect to all addresses" and "service unavailable".
I see in the TypeScript section of the SDK a way of passing the orderers, and there I wrote a loop to pass in all the orderers, which solves the above problem.
Is there any way to resolve this in the JS implementation?
Hey @Anantha Padmanabhan,
This has nothing to do with the ordering system; Raft is a perfectly good distributed consensus algorithm.
In your case, with 5 orderers present, you tried to kill one: the remaining 4 will start a leader election if the 5th one was the leader, and your network stays stable, so no worries there.
The problem is on your SDK side, in the connection profile.
For example:
"channels": {
"samchannel": {
"orderers": [
"sam-orderer1",
"sam-orderer2",
"sam-orderer3",
"sam-orderer4",
"sam-orderer5"
],
...
If you remove the sam-orderer1 orderer, your SDK still tries to send the transaction to sam-orderer1, since it is at the 0th index of the array.
Test: try removing an orderer other than sam-orderer1, for example sam-orderer3, and now invoke a transaction; it will still work.
Do this test and let me know the result.
This is coming from the SDK side: as soon as it detects that an orderer is down, it stops the execution, whereas it should redirect to another available orderer. I think the only way is that, instead of letting the SDK resolve the orderers automatically from the connection profile, you take over this step and pass only the available orderers; the available orderers can be obtained using the discovery service.
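For the JS side, here is a minimal sketch of such a manual failover loop, assuming the low-level fabric-client 1.4 API and that `channel`, `proposalResponses` and `proposal` already come from your endorsement step; the `orderer` property of the sendTransaction request should be verified against your SDK version:
// Try each orderer from the connection profile until one accepts the transaction.
async function sendWithFailover(channel, proposalResponses, proposal) {
  for (const orderer of channel.getOrderers()) {
    try {
      const result = await channel.sendTransaction({ proposalResponses, proposal, orderer });
      if (result && result.status === 'SUCCESS') {
        return result; // this orderer accepted the envelope
      }
    } catch (err) {
      // Orderer unreachable or returned an error: log it and try the next one.
      console.warn(`Orderer ${orderer.getName()} failed: ${err.message}`);
    }
  }
  throw new Error('No available orderer accepted the transaction');
}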
I have created a Hyperledger Fabric network with 2 orgs and 1 solo orderer. On the peer I configured CouchDB as the state database and launched the peer (after creating the channel and joining it). I can see CouchDB creating these databases:
mychannel_
mychannel_mycc
mychannel_lscc
I installed and instantiated the "chaincode_example02" Go chaincode on mychannel. I can successfully run query and invoke commands on the peer. CouchDB gets updated on an invoke command, and mychannel_mycc updates the "revpos" field, but I can't see transaction logs anywhere like I saw in many tutorials. Where can I see the history of transactions with their IDs? The mychannel_mycc database only has the data for the A and B keys, but not the values I transferred from A to B, with transaction details like how much I transferred.
CouchDB only stores the current state, not the transactions.
Transactions (and events...) are ordered into blocks and appended to the chain, which is saved in files under /var/hyperledger/production on the peers that have joined the channel.
You can see the logs in the peer container...
docker logs -f --tail 100 mypeercontainer
...or use the client SDK to inspect your channel's chain elements: https://hyperledger.github.io/fabric-sdk-node/release-1.4/Channel.html.
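As a rough illustration, here is a sketch assuming fabric-client 1.4 and an already-initialized `channel` object (the decoded block layout should be checked against the BlockDecoder docs) that walks the chain block by block and prints the transaction IDs that CouchDB does not store:
// Print every transaction ID stored in the channel's chain.
async function dumpTransactionIds(channel) {
  const info = await channel.queryInfo();           // current blockchain info (height, hashes)
  const height = info.height.low;                   // height is a protobuf Long
  for (let i = 0; i < height; i++) {
    const block = await channel.queryBlock(i);      // decoded block number i
    for (const envelope of block.data.data) {
      console.log(`block ${i}: tx ${envelope.payload.header.channel_header.tx_id}`);
    }
  }
}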
I have my network running on 3 machines, each one with:
1 orderer
1 peer
1 ca
1 Node.js client
They are deployed on AWS, and a load balancer correctly distributes the requests to the 3 different clients.
The clients are not using the discovery service; it is disabled.
Client1 only contacts orderer1, peer1 and ca1, and so on for the others.
I want to test the high availability of Hyperledger, so while I am inserting data I shut down a machine, let's say machine1, and the others should continue the execution.
What happens is that while the machine is down, the network stops executing. The clients do not progress at all (they do not crash, they just stop).
When I bring the machine up again, I see errors coming in, but then the execution continues.
It seems like the calls to machine1 are suspended, and they recover as soon as the machine is up again.
What I want is that if machine1 goes down, the requests to it are rejected and machines 2 and 3 continue the execution.
How can I obtain this?
[EDIT] Additional information: I have inserted some logs in the client, especially in my endpoint for the creation of transactions, like this:
console.log('Starting Creation')
await contract.submitTransaction(example)
console.log('Creation done')
res.send(200)
Let me also say that these rows are encapsulated in an error handler, so that if any error occurs, I capture it.
But I get no error; I just get the first print, and then submitTransaction keeps running for a long time, never receiving an answer.
It seems like it tries to deliver the request to the orderer, but the orderer is not online.
When I bring down an orderer with docker service scale orderer1=0 (since I am using services with Docker Swarm), the orderer leader notes in its logs that the node went offline. Also, if I bring the orderer up again, a new election starts.
This seems correct; in fact, the problem only happens when I shut down the machine, closing the connection in a non-graceful way.
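One way to make the stuck call fail fast instead of blocking is sketched below as a plain JavaScript timeout around the same contract.submitTransaction call shown above; the 30-second value is just an example, and the SDK's own timeout/event-handler options are another place to look:
// Reject the submission if it has not completed within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage inside the endpoint:
// await withTimeout(contract.submitTransaction(example), 30000);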