How to estimate how many storage will fabric peer occupy?
If I have a main struct (the biggest size and used in high frequency ) in chaincode.
I found the storage of a peer's ledger is almost equal to the size of the data of main struct multiplied by 5~10.
Also the world state storage is almost the same with the ledger.
Related
This is a question about StateDB automatic fix after tampering.
We're wondering if manipulation to StateDB (CouchDB) could be detected and fixed automatically by peer.
The following documents states there is a a state reconciliation process synchronizing world state in Peer.
https://hyperledger-fabric.readthedocs.io/ja/latest/gossip.html#gossip-messaging
In addition to the automatic forwarding of received messages, a state
reconciliation process synchronizes world state across peers on each
channel. Each peer continually pulls blocks from other peers on the
channel, in order to repair its own state if discrepancies are
identified.
But when we test it as follows:
Step 4: modify value of key a in StateDB
Step 10: Wait for 15 minutes
Step 11: Create a new block
Step 12: Check value of a through chaincode and confirm it in StateDB directly
The tampered value is not fixed automatically by peer.
Can you help clarify the meaning of "state reconciliation process" in the above document and if peer would fix the tampering to StateDB.
Thank you.
Gossip protocol is to sync up legit data among peers, not tampered data in my view. What is legit data? Data whose computed hash at anypoint of time matches with the orignally computed hash, which will not be the case for the tampered data, and so I'd not expect Gossip protocol to sync such 'dirty data'. This defeats the purpose of Blockchain altogether as a technology, and hence this is a wrong test to be performed in my view.
Now, then what is Gossip protocol? Refer https://hyperledger-fabric.readthedocs.io/ja/latest/gossip.html#gossip-protocol
"Peers affected by delays, network partitions, or other causes
resulting in missed blocks will eventually be synced up to the current
ledger state by contacting peers in possession of these missing
blocks."
So, in cases where the peer should have comitted a block to ledger and would have missed due to some reasons like the ones said above, 'Gossip' is only a fall back strategy for the HLF to reconcile the ledger among the peers.
Now in your test case, I see you are using 'query', now query does not go via the orderer to all the peers, it just goes to one peer and returns the value, you need to do a 'getStringState' as a 'transaction', for the 'enderosement' to run, and that is when would the endorsement fail citing the mismatch between the values for the same key among the peers is what I'd expect.
# Gossip state transfer related configuration
state:
# indicates whenever state transfer is enabled or not
# default value is true, i.e. state transfer is active
# and takes care to sync up missing blocks allowing
# lagging peer to catch up to speed with rest network
enabled: false
.......................................................................................................
# the process of reconciliation is done in an endless loop, while in each iteration reconciler tries to
# pull from the other peers the most recent missing blocks with a maximum batch size limitation.
# reconcileBatchSize determines the maximum batch size of missing private data that will be reconciled in a
# single iteration.
reconcileBatchSize: 10
# reconcileSleepInterval determines the time reconciler sleeps from end of an iteration until the beginning
# of the next reconciliation iteration.
reconcileSleepInterval: 1m
# reconciliationEnabled is a flag that indicates whether private data reconciliation is enable or not.
reconciliationEnabled: true
Link: https://github.com/hyperledger/fabric/blob/master/sampleconfig/core.yaml
In addition to the automatic forwarding of received messages, a state reconciliation process synchronizes world state across peers on each channel. Each peer continually pulls blocks from other peers on the channel, in order to repair its own state if discrepancies are identified. Because fixed connectivity is not required to maintain gossip-based data dissemination, the process reliably provides data consistency and integrity to the shared ledger, including tolerance for node crashes.
I am currently developing a fabric chaincode.
I created a function in the chaincode that was I/O-bound (reading many values from the ledger).
Experiment with this on two nodes. One node uses HDD and the other node uses SSD.
In the ledger, 10,000 objects with 4k size keys were stored. (It seems to be too small.. When I put 100,000, an error occurred, so I tested it with 10,000.)
If I READ(GetState) a lot of values in the ledger, I expected that the READ speed of the node using SSD would be faster, but there was no difference.
I understood that LevelDB is a key-value storage, so there is no difference because it is fast. (Sequential and random reads have similar execution times)
Wondering how to experiment so that the difference in performance of HDD/SDD appears using LevelDB.
And if there is a way, I would like to ask for advice.
If you are querying data via chain code, it is unlikely that you will be I/O bound at the disk level. The path length for querying data via chaincode has a few hops so that is likely going to be your constraint.
I'm digging into the new consensus of Hyperledger Fabric 1.4. The concept of Raft and the way it works is quite clear in the document.
But there is a concept called snapshot, which is
While it’s possible to keep all logs indefinitely, in order to save disk space, Raft uses a process called “snapshotting”
I've read through the documents about Raft too (Ex: this one), and understand the reason Raft need snapshot where it let the leaders and followers store less data (just snapshot and some new logs). But in the case of Fabric, the example about snapshot describe the way a lagging orderer get the missed blocks
For example, let’s say lagging replica R1 was just reconnected to the network. Its latest block is 100. Leader L is at block 196, and is configured to snapshot at amount of data that in this case represents 20 blocks. R1 would therefore receive block 180 from L and then make a Deliver request for blocks 101 to 180. Blocks 180 to 196 would then be replicated to R1 through the normal Raft protocol
It confuses to me while the orderers store the full ledger, so why it still needs snapshot and how can it save disk space here?
In a Hyperledger Fabric network, ledgers which all peers(endorsing peers and committing peers) have are replicated ledgers.
It seems to imply there is a unique 'real/original/genuine' ledger per channel.
I'd like to ask these:
Is there a real ledger? If so, where is it(or where is it defined?) and who owns it?
Those replicated ledgers are updated by each peer, after VSCC, MVCC validation. Then who updates the 'real' ledger?
Does 'World State' only refers to the 'real' ledger?
I'd really appreciate if you answer my questions.
Please tell me if these questions are clarified to you. Thank you!
I don't understand what exactly you mean by 'real' ledger. There is one & only ledger per channel, replicated across all participants per channel. When I say participants, I mean all peers (both endorsing & committing) of an organization's MSP belonging to a given channel.
State DB (a.k.a World state) refers to the database that maintains the current value of a given key. Let me give you an example. You know that a blockchain is a liked list on steroids (with added security, immutability, etc). Say, you have a key A with value 100 in Block 1. You transact in the following manner.
Block 2 -- A := A-10
Block 15 -- A := A-12
.
.
.
Block 10,000 -- A := A-3
So, after Block 10,000, if you need the current value of key A, you have to calculate the value from Block 1. So to manage this efficiently, Fabric folks implemented a state database that updates the value of a key in the state after every transaction. It's sole responsibility is to improve efficiency. If your state gets corrupted, Fabric will automatically rebuild it from Block 0.
Is there a limit on number of channels in a Hyperledger Fabric network?
What are the implications of larger number of channels?
Thanks,
Naveen
There is an upper bound that you can define for the ordering service:
# Max Channels is the maximum number of channels to allow on the ordering
# network. When set to 0, this implies no maximum number of channels. MaxChannels: 0
In the peer, every channel logic is maintained by its own goroutines and data structures.
I'm pretty confident that unless for very extreme use cases, you shouldn't be too worried about the number of channels a peer has joined to.