State Sync in for Hyperledger Fabric using LEVELDB? - hyperledger-fabric

Reading the docs ,I understand that there are two options to maintain the world state LEVELDB and COUCHDB .
In case of LevelDB ,"LevelDB is the default state database embedded in the peer node" I am assuming it is local to a peer . As in every peer has a copy of LevelDB running .
In case of CouchDB there is a separate container to run it ,and all the peers can use it to execute transactions (all the peers see the same data )
In the first case for LevelDB ,how is the version of the data synced across all the peers?
Is this a plug and play feature , Can I for
example use an ETCD cluster ,instead of CouchDB ?

It's actually the same for both LevelDB and CouchDB. The database is basically used as a persistent last know value store. It can be rebuilt at any time for the ledger (the block chain) which is actually what is replicated across all peers via the ordering service
EDIT To clarify, peers receive blocks from the orderer. Each peer than validates the transactions in each block and for each valid transaction it will update the state in the database. In the case of LevelDB, it's an embedded call and in the case of CouchDB the peer communicates with CouchDB via the CouchDB HTTP API. Of course it also writes the blocks to the ledger files on disk as well.
It's not plug and play ... only LevelDB and CouchDB are currently supported. The codebase itself supports adding additional databases, but someone would need to implement the support in the actual codebase.

Related

Hyperledger fabric couchdb persistent

background:
I am developing Hyperledger Fabric Network(v1.1) and are using couchdb.
For continuing operation. It is necessary to persist data go each component(peer, orderer etc..).
Issue:
I don't know what I should persist couchdb's data for continuing operation in production environment.
Question:
(1)Should I persist these data in the below? And, if there is insufficient, please tell me that what I should persist data.
/opt/couchdb/data
/opt/couchdb/etc
(2)If I don't persist these couchdb's data. what will happen? (For example,
querying error,clush the data and difference from block's data).Please tell me.
sincerely.
If you don't persist the CouchDB data, then if you delete or upgrade the container you will lose the data. The good news is that if you have persisted the ledger from your peer, it will rebuild the data in CouchDB on startup but of course this will delay how long it takes for the peer to be able to serve any type of request.
The CouchDB image uses /opt/couchdb/data as the volume where it stores the data so you'll want to mount an external volume there.

consensus among peers in hyperledger fabric after transaction is committed

Suppose I have started fabric with two peers in a single organization. After running my application/rest-server through composer and submitting transactions. I was able to make changes in the values of Couchdb instance of peer1 by going on the address http://localhost:6984/_utils/#/_all_dbs. Now, the two peers are not in sync with each other - application should throw some error but it isn't. Mostly, because it is getting data just from the first peer i.e. peer1.
So, firstly how can I get data from multiple peers - if I want to get data from peer2 aswell?
Secondly, why it is getting data from state database not from ledger?
Thirdly, data should remain in sync even after committed how can I configure this? if some peer tampered its database it should be notified. I have read consensus part and got that it is for the correct order of transactions and blocks but what if someone tampered with the state database?
If you're able to change the entries in state database for 1 peer, with a strong endorsement policy such as AND, your transactions will fail validation because of difference in the data for the two peers. This is one of the most important pros of decentralized network.
State database and the Ledger are not same thing. this should help you in understanding the differences between the both.
Every participating member of a Hyperledger Fabric network is a known entity per se (since Fabric being a permissioned blockchain). Said that, a change in state database of a single peer will again lead to scenario #1 above, where the Read/Write sets in a transaction won't match for multiple peers (as their state databases contain different values of an asset). This will lead to invalidation of the transactions. Now it just becomes a question of how can the network know about the corrupted peer(and subsequent state db). There can be multiple solutions for the same.
But most importantly, Fabric being a permissioned blockchain network, the state databases must be very strictly access protected and authorized outside of the network too.
the fact that you modified the world db doesn't mean anything. any changes you make to that database are not a representation of the ledger.
The ledger itself, the blocks and the transactions they contain are stored in a physical file. The world state db is simply a collection of the current state for each asset. This is a good design because an application will not care about every state change an item went through, it will only care about the current state. The world state db can easily be recreated whenever there is a need for it.
Now, you should not make any changes directly to to the world state db, because that's useless. Any change needs to go through the proper process, via a proposal submitted by a peer which then goes through the orderer. Only when everything is followed, does a change go onto the ledger and get synced with every peer and the world state db will reflect that.
In terms of where you should get the data, the answer is that it doesn't matter. Each peer will have an exact copy of the ledger so if you get data from peer 1 or 2 is irrelevant, it will be the same thing.
Again, just because you changed the world state, that doesn't mean anything, the ledger is untouched, but your application reports the current state from the world state db, which is now incorrect because of your change.

How your data is safe in Hyperledger Fabric when one can make changes to couchdb data directly

I am wondering that how your data is safe when an admin can change the latest state in Couchdb using Fauxton or cURL provided by Couchdb directly.
According to my understanding Hyperledger Fabric provides immutable data feature and is best for fraud prevention(Blockchain feature).
The issue is :- I can easily change the data in couchdb and when I query from my chaincode it shows the changed data. But when I query ledger by using GetHistoryForKey() it does not shows that change I made to couchdb. Is there any way I can prevent such fraud? Because user will see the latest state always i.e data from couchdb not from ledger
Any answer would be appreciated.
Thanks
You should not expose the CouchDB port beyond the peer's network to avoid the data getting tampered. Only the peer's administrator should be able to access CouchDB, and the administrator has no incentive to tamper their own data. Let me explain further...
The Hyperledger Fabric state database is similar to the bitcoin unspent transaction database, in that if a peer administrator tampers with their own peer’s database, the peer will not be able to convince other peers that transactions coming from it are valid. In both cases, the database can be viewed as a cache of current blockchain state. And in both cases, if the database becomes corrupt or tampered, it can be rebuilt on the peer from the blockchain. In the case of bitcoin, this is done with the -reindex flag. In the case of Fabric, this is done by dropping the state database and restarting the peer.
In Fabric, peers from different orgs as specified in the endorsement policy must return the same chaincode execution results for transactions to be validated. If ledger state data had been altered or corrupted (in CouchDB or LevelDB file system) on a peer, then the chaincode execution results would be inconsistent across endorsing peers, the 'bad’ peer/org will be found out, and the application client should throw out the results from the bad peer/org before submitting the transaction for ordering/commit. If a client application tries to submit a transaction with inconsistent endorsement results regardless, this will be detected on all the peers at validation time and the transaction will be invalidated.
You must secure your couchdb from modification by processes other than the peer, just as you must generally protect your filesystem or memory.
If you make your filesystem world writable, other users could overwrite ledger contents. Similarly, if you do not put access control on couchdb writes, then you lose the immutability properties.
In Hyperledger Fabric v1.2, each peer has its own CouchDB. So even if you change the data directly from CouchDB of one peer. The endorsement would fail. If the endorsement fails, your data will not be written neither in world state nor in the current state.
That's the beauty of a decentralized distributed system. Even if you or someone else changes the state of your database/ledger it will not match with the state of others in the network, neither will it match the transaction block hash rendering any transactions invalid by the endorsers unless you can restore the actual agreed upon state of the ledger from the network participants or the orderer.
To take advantage of the immutability of ledger you must query the ledger. Querying the database does not utilize the power of blockchain and hence must be protected in fashion similar to how access to any other database is protected.
You need to understand 2 things here
Although the data of a couchdb of a peer maybe tampered, you should setup your endorsement policy in such a way that it must be endorsed by all the peers.
You cannot expose your couchdb to be altered, I recommend to see Cilium
As explained by others - endorsements/consensus is the key. Despite the fact that ledger state of an endorsing peer can be modified externally - in that event all transactions endorsed by that peer would get discarded, because other endorsing peers would be sending correct transactions (assuming other's world state was also not tampered with) and consensus would play the key role here to help select the correct transaction.
Worst case scenario all transactions would fail.
Hyperledger fabric's world state (Ledger state) can be regenerated from the blockchain (Transactions Log) anytime. And, in the event of peer failure this regeneration happens automatically. With some careful configuration, one can build a self-healing network where a peer at fault would automatically rise from ashes (pun intended).
The key point to consider here is the Gossip Data dissemination protocol which can be considered as the mystical healer. All peers within the network continuously connect, and exchange data with other peers within the network.
To quote the documentation -
Peers affected by delays, network partitions, or other causes resulting in missed blocks will eventually be synced up to the current ledger state by contacting peers in possession of these missing blocks.
and ...
Any peer with data that is out of sync with the rest of the channel identifies the missing blocks and syncs itself by copying the correct data.
That's why it is always recommended to have more and more of endorsing peers within the network and organizations. The bigger the network - harder it would be beat with malicious intent.
I hope, I could be of some help. Ref Link: https://hyperledger-fabric.readthedocs.io/en/release-1.4/gossip.html
Even though this is plausible, the endorsement policy is the general means by which you protect yourself (the system) from the effects of such an act.
"a state reconciliation process synchronizes world state across peers on each channel. Each peer continually pulls blocks from other peers on the channel, in order to repair its own state if discrepancies are identified."

hyperledger fabric: read query and communicating amoung peers within a channel

I'm aware that read query is a chaincode invocation which reads the ledger current state but does not write to the ledger.
The client app can choose to submit the read-only transaction for ordering, validation, and commit (for auditing purpose)
Question:
Is it possible to read the ledger db without going via the chaincode ?
What prevents someone on channel from directly reading CouchDB (assuming that is the underlying DB used) without using contracts and avoiding recording of reads
Fundamentally, we use an async communication in fabric. The node initiating transaction talks only with the peers synchronously and after that ordering service is initiated which will commit the txn to ledger after all verification check..
Is the following possible ?
node A submits a data to the ledger request for an operation -> which needs to be done by node B.
Is this possible for node A to notify B to check ledger and perform operation and B upon completion notify A to check the updated ledger ?
Is it possible to read the ledger db without going via the chaincode ?
What prevents someone on channel from directly reading CouchDB
(assuming that is the underlying DB used) without using contracts and
avoiding recording of reads
Nothing prevents you from doing it, if you have access to the database itself.
However, the clients usually don't have access to the database and thus the peer (through the chaincode) may enforce selective logic on which clients and what data can be read.
Think of an application server, a servlet container, etc.
The user that browses from the browser doesn't have access to the database the application server uses, right?
A similar thing takes place here.
Fundamentally, we use an async communication in fabric. The node
initiating transaction talks only with the peers synchronously and
after that ordering service is initiated which will commit the txn to
ledger after all verification check..
Strictly speaking, the ordering service does not commit the txn to the ledger, as the ordering service has no ledger. It packages transactions into blocks, then sends the blocks to peers. It is the peers that then validate and commit the block to their own ledgers.
Is this possible for node A to notify B to check ledger and perform
operation and B upon completion notify A to check the updated ledger ?
You need the client SDK to orchestrate all of these. User Chaincodes only run in the context of a client's request.
So, you can have a client perform an operation that submits a transaction, and then the client sends another transaction in which the chaincode checks the current ledger, etc.

Where is the blockchain physically

just playing with Hyperledger Composer, and I'm wondering ,Where is the blockchain physically
I'mean is it binary file , text file ....?
is it portable ?
thank you all
There are two place which "store" data in Hyperledger Fabric (the underlying blockchain infrastructure used by Composer which is a runtime abstraction layer above it):
the ledger
the state database ('World state')
The ledger is the actual "blockchain". It is a file-based ledger which stores serialized blocks. Each block has one or more transactions. Each transaction contains a 'read-write set' which modifies one or more key/value pairs. The ledger is the definitive source of data and is immutable.
The state database (or 'World State') holds the last known committed value for any given key - an indexed view into the chain’s transaction log. It is populated when each peer validates and commits a transaction. The state database can always be rebuilt from re-processing the ledger (ie replaying the transactions that led to that state). There are currently two options for the state database: an embedded LevelDB or an external CouchDB.
As an aside, if you are familiar with Hyperledger Fabric 'channels', there is a separate ledger for each channel.
The chain is a transaction log, structured as hash-linked blocks, where each block contains a sequence of N transactions. The block header includes a hash of the block’s transactions, as well as a hash of the prior block’s header. In this way, all transactions on the ledger are sequenced and cryptographically linked together.
The state database is simply an indexed view into the chain’s transaction log, it can therefore be regenerated from the chain at any time.
Source: http://hyperledger-fabric.readthedocs.io/en/release/ledger.html
Suffice to say, Hyperledger Composer uses Fabric (as the blockchain infrastructure) to endorse/order/commit transactions to blocks on the ledger.
To see the physical location of these data you can go to /var/hyperledger/production in each peer container in your fabric network
Actually, the blockchain is a shared database, which is read-only and append-only. And it is distributed in every peers. So every peer has a copy of the shared database. Normally, LevelDB or CouchDB is used in HyperLedger Fabric.
The ledger is comprised of a blockchain (‘chain’) to store the immutable, sequenced record in blocks, as well as a state database to maintain current fabric state.
Here is more information about HyperLedger Fabric Ledger (the blockchain)
Blockchain is shared among all the peer to peer networks hosting it. So basically blockchain is stored in many HDD all around the world.

Resources