I have a Hyperledger Fabric network with 2 organizations and 2 peers per organization. I restored the network from a backup, and the peers started throwing the following error:
2021-01-28 09:49:55.374 UTC [gossip.state] commitBlock -> ERRO 26b Got error while committing(unexpected Previous block hash. Expected PreviousHash = [91e8aafc47e2f521afc8a52d44b80c60fe781084dec9e1b92ab40d8b6138e7d0], PreviousHash referred in the latest block= [042c3059d75a1622fea88f7f2a5f268363206004ebca635118039e0496b83197]
github.com/hyperledger/fabric/common/ledger/blkstorage/fsblkstorage.(*blockfileMgr).addBlock
/opt/gopath/src/github.com/hyperledger/fabric/common/ledger/blkstorage/fsblkstorage/blockfile_mgr.go:254
github.com/hyperledger/fabric/common/ledger/blkstorage/fsblkstorage.(*fsBlockStore).AddBlock
/opt/gopath/src/github.com/hyperledger/fabric/common/ledger/blkstorage/fsblkstorage/fs_blockstore.go:42
github.com/hyperledger/fabric/core/ledger/ledgerstorage.(*Store).CommitWithPvtData
/opt/gopath/src/github.com/hyperledger/fabric/core/ledger/ledgerstorage/store.go:132
github.com/hyperledger/fabric/core/ledger/kvledger.(*kvLedger).CommitWithPvtData
/opt/gopath/src/github.com/hyperledger/fabric/core/ledger/kvledger/kv_ledger.go:312
github.com/hyperledger/fabric/core/ledger/ledgermgmt.(*closableLedger).CommitWithPvtData
<autogenerated>:1
github.com/hyperledger/fabric/core/committer.(*LedgerCommitter).CommitWithPvtData
/opt/gopath/src/github.com/hyperledger/fabric/core/committer/committer_impl.go:93
github.com/hyperledger/fabric/gossip/privdata.(*coordinator).StoreBlock
/opt/gopath/src/github.com/hyperledger/fabric/gossip/privdata/coordinator.go:243
github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).commitBlock
/opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:810
github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).deliverPayloads
/opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:598
runtime.goexit
/opt/go/src/runtime/asm_amd64.s:1333
commit failed
github.com/hyperledger/fabric/gossip/privdata.(*coordinator).StoreBlock
/opt/gopath/src/github.com/hyperledger/fabric/gossip/privdata/coordinator.go:246
github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).commitBlock
/opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:810
github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).deliverPayloads
/opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:598
runtime.goexit
/opt/go/src/runtime/asm_amd64.s:1333
github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).commitBlock
/opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:811
github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).deliverPayloads
/opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:598
runtime.goexit
/opt/go/src/runtime/asm_amd64.s:1333)
I am not sure what went wrong, but all 4 peers started throwing the error. Any suggestions on what should be done next?
Thanks
You probably backed up the network by first backing up the orderers and then backing up the peers.
This is dangerous because the resulting backup leaves the peers at a higher block height than the orderers.
As an example, imagine the orderers' latest backed-up block is 100 and the peers' latest backed-up block is 101, with hash bar. After the restore, the orderers will create a new block 101 with hash foo, which the peers never pull because they already have a block 101. Then the orderers will create block 102 with previous hash foo, but the peers expect a previous hash of bar.
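For context on why the peers reject block 102: a committing peer only accepts a block whose header PreviousHash equals the hash of the block the peer already holds at the previous height. The Go sketch below illustrates how the two chains diverge after such a restore; the struct and helper names are illustrative, not Fabric's own, although the ASN.1-over-SHA-256 header hashing mirrors what Fabric does.

package main

import (
    "bytes"
    "crypto/sha256"
    "encoding/asn1"
    "fmt"
    "math/big"
)

// header holds the three fields a block is chained by; the type is illustrative.
type header struct {
    Number       int64
    PreviousHash []byte
    DataHash     []byte
}

// hashHeader mimics the block-header hash: SHA-256 over the ASN.1 DER
// encoding of (number, previousHash, dataHash).
func hashHeader(h header) []byte {
    der, err := asn1.Marshal(struct {
        Number       *big.Int
        PreviousHash []byte
        DataHash     []byte
    }{big.NewInt(h.Number), h.PreviousHash, h.DataHash})
    if err != nil {
        panic(err)
    }
    sum := sha256.Sum256(der)
    return sum[:]
}

func main() {
    // The peers' backed-up block 101 ("hash bar" in the example above).
    peers101 := header{Number: 101, PreviousHash: []byte("hash-of-100"), DataHash: []byte("peer-data")}
    // The orderers re-cut block 101 with different content after the restore ("hash foo").
    orderers101 := header{Number: 101, PreviousHash: []byte("hash-of-100"), DataHash: []byte("orderer-data")}

    // Block 102 from the orderers chains to the orderers' block 101 ...
    prevHashInBlock102 := hashHeader(orderers101)
    // ... but every peer expects it to chain to its own block 101.
    expectedByPeers := hashHeader(peers101)

    // Prints false, which is exactly the "unexpected Previous block hash" error above.
    fmt.Println("peers accept block 102:", bytes.Equal(prevHashInBlock102, expectedByPeers))
}

Because the two versions of block 101 differ, every later block from the orderers chains to a hash the peers have never produced, so all 4 peers fail in the same way.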
We are running Apache Pulsar 2.7.2 in prod, which uses a 5-node (AWS r5ad.2xlarge) bookie cluster (4.12.0). One of the nodes was terminated. As per our ASG, a new node came up and joined the cluster.
The bookies are configured with:
autoRecoveryDaemonEnabled=true
lostBookieRecoveryDelay=0
bookkeeperClientMinNumRacksPerWriteQuorum=2
managedLedgerDefaultEnsembleSize=3
managedLedgerDefaultWriteQuorum=3
However, the ledger re-replication wasn't taking place. I tried decommissioning the terminated node using sudo /opt/apache-pulsar/apache-pulsar-2.7.2/bin/bookkeeper shell decommissionbookie -bookieid bookieIP:port, but it was stuck at
23:53:36.465 [main] INFO org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 793
00:03:37.293 [main] INFO org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 793
00:13:38.119 [main] INFO org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 793
00:23:39.194 [main] INFO org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 793
00:33:39.995 [main] INFO org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 793
for more than 30 minutes. We also tried listing the underreplicated ledgers using sh bookkeeper shell listunderreplicated and reading some of the returned ledgers using sh bookkeeper shell ledger -m, but that failed with an exception complaining that it was unable to access the terminated bookie. We ended up deleting the underreplicated ledgers.
I am looking for a suggestion on how to best recover from a terminated bookie without having to delete ledgers.
Now that Apache Pulsar 2.8.1 is out, can you upgrade and try again? The behaviour you describe seems unusual.
To get access to all the Pulsar people in one location, sign up for the summit
https://streamnative.io/en/blog/community/2021-09-07-speakers-announced-for-pulsar-virtual-summit-europe-2021/
I am following the ERC-1155 chaincode example for Fabric. When I run the BatchTransferFrom part, it sometimes gives an error and sometimes runs successfully. I cannot understand why it only fails sometimes. Is this error normal when invoking chaincode functions on Fabric?
The error is:
Error: could not assemble transaction: ProposalResponsePayloads do not match - proposal response: version:1 response:<status:200 > payload: ...
When I call the command using the Fabric Node SDK, it gives the following error:
2021-08-30T09:59:41.794Z - error: [DiscoveryHandler]: compareProposalResponseResults[undefined] - read/writes result sets do not match index=1
2021-08-30T09:59:41.794Z - error: [Transaction]: Error: No valid responses from any peers. Errors:
peer=undefined, status=grpc, message=Peer endorsements do not match
When you submit a transaction, the responses from the different endorsements of that transaction must match.
For some reason this is not happening with your proposals: different peers return different responses.
I don't know about that specific chaincode, but common causes are:
Using pseudo-random values.
Using the current timestamp instead of the one from the transaction or block (see the sketch after this list).
Serializing JSON in a non-deterministic way, so that it produces different strings because the elements are serialized in a different order.
Etc.
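For the timestamp case, a minimal sketch of reading the time from the signed proposal instead of the peer's local clock, assuming the fabric-contract-api-go style of chaincode (the helper name txTime is illustrative):

package chaincode

import (
    "time"

    "github.com/hyperledger/fabric-contract-api-go/contractapi"
)

// txTime returns the same value on every endorsing peer, because the
// timestamp comes from the signed transaction proposal rather than from
// time.Now(), which would differ from peer to peer and make the
// endorsement responses diverge.
func txTime(ctx contractapi.TransactionContextInterface) (time.Time, error) {
    ts, err := ctx.GetStub().GetTxTimestamp()
    if err != nil {
        return time.Time{}, err
    }
    return time.Unix(ts.GetSeconds(), int64(ts.GetNanos())).UTC(), nil
}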
I found the cause of the problem. Map iteration order in Go is not deterministic, and the BatchTransferFrom function iterates over maps. The map is iterated in a different order on different peers, which causes the proposals to differ.
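For anyone hitting the same issue, a minimal sketch of the fix: collect the map keys, sort them, and iterate in that fixed order so every endorsing peer produces the same write-set (the token IDs below are made up, not taken from the ERC-1155 sample):

package main

import (
    "fmt"
    "sort"
)

func main() {
    // Example map of token ID -> amount, the kind of structure a batch transfer builds.
    amounts := map[string]uint64{"token3": 5, "token1": 10, "token2": 7}

    // Ranging over the map directly is non-deterministic: Go deliberately
    // randomizes iteration order, so two peers can emit writes in different orders.

    // Deterministic version: sort the keys first, then iterate.
    ids := make([]string, 0, len(amounts))
    for id := range amounts {
        ids = append(ids, id)
    }
    sort.Strings(ids)

    for _, id := range ids {
        fmt.Printf("transfer %d of %s\n", amounts[id], id)
    }
}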
My questions are:
Q1. What do we mean by labels 1 and 2, and how do the 4 peers contribute to them?
Q2. What does label 3 mean when we compare it with the send rate?
Q3. What is the difference between label 3 and label 5, and why is there such a large gap in memory utilization between them?
Q1: Label 1 is the number of transactions that were successfully processed and written to the ledger. Label 2 is the number of transactions being submitted every second. Label 2 itself has nothing to do with the number of peers, but the number of peers (and their processing power) does contribute to the result: if a peer fails to do its job (endorsement, verification, etc.), the transaction fails, and that number will differ.
Q2: Label 3 represents the number of transactions per second that were processed, versus the send rate, which is the per-second rate at which transactions were submitted to the blockchain. E.g., in your Test1, 49 transactions per second were submitted but only 47 transactions per second were processed, hence the 2.4-second max latency (it is more complex than what I said).
Q3: Label 5 represents a peer, which is in charge of running and verifying the smart contracts and probably endorsement as well (depending on your endorsement policy), whereas label 3 is a world-state database node (more here: https://vitalflux.com/hyperledger-fabric-difference-world-state-db-transaction-logs/ ), and running smart contracts uses more resources.
We're bulk-inserting records into Hyperledger Fabric, but we are hitting a timeout issue. Even if we keep increasing the timeout, the same error simply happens again at a later point.
Each transaction inserts 1000 records using PutState in a loop (blind inserts, nothing in the read-set). We have also increased BatchTimeout to 3s and MaxMessageCount to 100 so that we get larger blocks (we see 4 transactions per block, i.e. 4000 records [4 x 1000 records per transaction] being inserted into the ledger with every block).
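For reference, a stripped-down sketch of the insert pattern described above, written against the v1.2 shim interface; the record layout and function name are hypothetical:

package chaincode

import (
    "encoding/json"
    "fmt"

    "github.com/hyperledger/fabric/core/chaincode/shim"
)

// record stands in for whatever is actually being inserted.
type record struct {
    ID   string `json:"id"`
    Data string `json:"data"`
}

// insertBatch writes every record with a blind PutState: nothing is read
// back, so the transaction has a large write-set and an empty read-set.
// With MaxMessageCount=100 and BatchTimeout=3s, several such transactions
// are committed together in one block.
func insertBatch(stub shim.ChaincodeStubInterface, records []record) error {
    for _, r := range records {
        b, err := json.Marshal(r)
        if err != nil {
            return fmt.Errorf("marshal %s: %v", r.ID, err)
        }
        if err := stub.PutState(r.ID, b); err != nil {
            return fmt.Errorf("put %s: %v", r.ID, err)
        }
    }
    return nil
}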
Our assumption is that when the bulk update fails in CouchDB and the peer has to retry each of the 1000 records of a transaction individually, the requests take too long and overshoot the timeout. We also found https://jira.hyperledger.org/browse/FAB-10558, but it says the issue was already fixed in v1.2.0, which is the version we are using.
The error we get is net/http: request canceled (Client.Timeout exceeded while reading body), as shown in the logs further below.
We tried setting the following environment variable in the peer container:
CORE_CHAINCODE_EXECUTETIMEOUT=120s
And also req.setProposalWaitTime(120 * 1000) when using the Java SDK.
But then we just get the same timeout error at a later point. We can keep increasing the timeout, but we believe the error will keep recurring. Is the time required to insert into CouchDB proportional to the number of records already in CouchDB? Maybe updating the index takes more time as the number of documents increases?
The runtime error log that we get (after 2-4 million or so records have been inserted) is as below:
October 5th 2018, 04:36:38.646 panic: Error during commit to txmgr:net/http: request canceled (Client.Timeout exceeded while reading body)
October 5th 2018, 04:36:38.646
October 5th 2018, 04:36:38.646 goroutine 283 [running]:
October 5th 2018, 04:36:38.646 github.com/hyperledger/fabric/core/ledger/kvledger.(*kvLedger).CommitWithPvtData(0xc421fb1860, 0xc451e4f470, 0x0, 0x0)
October 5th 2018, 04:36:38.646 /opt/gopath/src/github.com/hyperledger/fabric/core/ledger/kvledger/kv_ledger.go:273 +0x870
October 5th 2018, 04:36:38.646 github.com/hyperledger/fabric/core/committer.(*LedgerCommitter).CommitWithPvtData(0xc4222db8c0, 0xc451e4f470, 0xc4312ddd40, 0xdf8475800)
October 5th 2018, 04:36:38.646 /opt/gopath/src/github.com/hyperledger/fabric/core/committer/committer_impl.go:105 +0x6b
October 5th 2018, 04:36:38.646 github.com/hyperledger/fabric/gossip/privdata.(*coordinator).StoreBlock(0xc422286e60, 0xc42462cd80, 0x0, 0x0, 0x0, 0xc4312dde78, 0x7329db)
October 5th 2018, 04:36:38.646 /opt/gopath/src/github.com/hyperledger/fabric/gossip/privdata/coordinator.go:236 +0xc3b
October 5th 2018, 04:36:38.646 github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).commitBlock(0xc4220c5a00, 0xc42462cd80, 0x0, 0x0, 0x0, 0x0, 0x0)
October 5th 2018, 04:36:38.646 /opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:771 +0x6c
October 5th 2018, 04:36:38.646 github.com/hyperledger/fabric/gossip/state.(*GossipStateProviderImpl).deliverPayloads(0xc4220c5a00)
October 5th 2018, 04:36:38.646 /opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:558 +0x3c5
October 5th 2018, 04:36:38.646 created by github.com/hyperledger/fabric/gossip/state.NewGossipStateProvider
October 5th 2018, 04:36:38.646 /opt/gopath/src/github.com/hyperledger/fabric/gossip/state/state.go:239 +0x681
October 5th 2018, 04:36:03.645 2018-10-04 20:36:00.783 UTC [kvledger] CommitWithPvtData -> INFO 466e Channel [mychannel]: Committed block [1719] with 4 transaction(s)
October 5th 2018, 04:35:56.644 2018-10-04 20:35:55.807 UTC [statecouchdb] commitUpdates -> WARN 465c CouchDB batch document update encountered an problem. Retrying update for document ID:32216027-da66-4ecd-91a1-a37bdf47f07d
October 5th 2018, 04:35:56.644 2018-10-04 20:35:55.866 UTC [statecouchdb] commitUpdates -> WARN 4663 CouchDB batch document update encountered an problem. Retrying update for document ID:6eaed2ae-e5c4-48b1-b063-20eb3009969b
October 5th 2018, 04:35:56.644 2018-10-04 20:35:55.870 UTC [statecouchdb] commitUpdates -> WARN 4664 CouchDB batch document update encountered an problem. Retrying update for document ID:2ca2fbcc-e78f-4ed0-be70-2c4d7ecbee69
October 5th 2018, 04:35:56.644 2018-10-04 20:35:55.904 UTC [statecouchdb] commitUpdates -> WARN 4667 CouchDB batch document update encountered an problem. ... and so on
2018-10-04 20:35:55.870 UTC [statecouchdb] commitUpdates -> WARN 4664 CouchDB batch document update encountered an problem. Retrying update for document ID:2ca2fbcc-e78f-4ed0-be70-2c4d7ecbee69
The above suggests that POST http://localhost:5984/db/_bulk_docs failed and so the individual documents were retried separately.
Looking at the different parameters available to configure, increasing requestTimeout under the ledger.state.couchDBConfig section of core.yaml might be worth a shot.
This can be done by setting the following environment variable in the docker-compose file for your peer container:
CORE_LEDGER_STATE_COUCHDBCONFIG_REQUESTTIMEOUT=100s
The name of the environment variable associated with a configuration parameter can be derived by looking at this answer: Fabric builds it by prefixing CORE_, upper-casing the core.yaml path, and joining its segments with underscores, so ledger.state.couchDBConfig.requestTimeout becomes CORE_LEDGER_STATE_COUCHDBCONFIG_REQUESTTIMEOUT.
Configuring CORE_CHAINCODE_EXECUTETIMEOUT and proposalWaitTime might not have had an effect because a different connection downstream (here, the HTTP connection between the peer and CouchDB) was timing out, and that timeout was then propagated up.
I am a little confused about channels in Hyperledger.
Is the ledger within a channel one and the same for all of its members?
Consider a network with n parties that contains a channel C with members 1, 2, 3.
If a transaction is sent from party 1 to 2, it would appear on the ledger of 3, as they are all part of the same channel, but not on the ledgers of members 4...n, as they aren't part of C!
Use case: member 1 initiates a transaction with 3, in which case the ledger record of 2 shouldn't reflect the record. Does this mean I have to create a new channel with just 1 and 3 in it, or can I use C with some policies of sorts?
If it's the former, that would mean creating a new channel for every possible private ledger update!
Is the ledger within a channel one and the same for all of its members?
The quick answer is yes: participants which agreed to create and join a channel agreed to share information and accepted the channel's rules.
Consider a network with n parties that contains a channel C with members 1, 2, 3.
If a transaction is sent from party 1 to 2, it would appear on the ledger of 3, as they are all part of the same channel, but not on the ledgers of members 4...n, as they aren't part of C!
Correct. If 1, 2 and 3 joined the same channel, they share the same ledger, and therefore a transaction between 1 and 2 will also be reflected on the ledger of 3.
Use case: member 1 initiates a transaction with 3, in which case the ledger record of 2 shouldn't reflect the record. Does this mean I have to create a new channel with just 1 and 3 in it, or can I use C with some policies of sorts?
If you would like to keep communication between 1 and 3 private, such that it won't be accessible by org 2, you have a few options:
As you explained, you can have a separate channel whose only participants are the peers of orgs 1 and 3.
You might consider using encryption to prevent org 2 from reading the content of transactions between 1 and 3 (a sketch of this approach follows below). However, the fact that 2 can see the transaction, even encrypted, reveals some business relationship between 1 and 3, so option #1 (a separate channel) may be preferable.
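A minimal sketch of option 2, using standard-library AES-GCM and assuming a symmetric key shared out of band between orgs 1 and 3 (the function name is illustrative and key management is out of scope):

package main

import (
    "crypto/aes"
    "crypto/cipher"
    "crypto/rand"
    "fmt"
)

// encryptValue encrypts a ledger value before it is written to the shared
// channel, so org 2 only ever stores and validates ciphertext.
func encryptValue(key, plaintext []byte) ([]byte, error) {
    block, err := aes.NewCipher(key)
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err := rand.Read(nonce); err != nil {
        return nil, err
    }
    // Prepend the nonce so the counterpart org can strip it off and decrypt.
    return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
    key := make([]byte, 32) // demo key only; in practice use the key shared between orgs 1 and 3
    if _, err := rand.Read(key); err != nil {
        panic(err)
    }
    ciphertext, err := encryptValue(key, []byte(`{"from":"org1","to":"org3","amount":100}`))
    if err != nil {
        panic(err)
    }
    fmt.Printf("%d bytes of ciphertext is what would be passed to PutState\n", len(ciphertext))
}

Typically the client encrypts before invoking the chaincode, or the key is passed in the proposal's transient field so it never appears in the transaction itself.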
If it's the former, that would mean creating a new channel for every possible private ledger update!
If you would like mutually exclusive, bilateral privacy between each pair of organizations, then yes, you need to create a channel for each pair, and thus a separate ledger for each pair.