How to deal with external DB in a Hyperledger fabric chaincode


I have an operation with X meters; the number of meters may vary.
For each meter, I have to set a percentage of allocation.
So let's say that in my Operation 1 I have 3 meters, m1, m2 and m3: I will assign 10% to m1, 50% to m2, and 40% to m3.
In this case, when I receive data from m1, I want to check that operation 1 and meter 1 exist, that meter 1 belongs to operation 1, and get the allocation for my meter.
All those settings are stored in an external DB (Postgres). I can read them easily in Go. The thing is, I have heard that chaincode must be deterministic, and that it is good practice not to have any external dependency. I understand that if the result of your chaincode depends on an external DB, you cannot audit it, so the blockchain loses some of its value.
Should I hardcode the settings in an array or in a config file? Would I then have to publish my chaincode again each time the config changes? I am not happy about having 2 configs to keep in sync (the DB plus a config file); it could quickly lead to mistakes.
What is the recommended way of managing an external DB connection in chaincode?

You could place the meter information in the blockchain data store and query it from there.
For instance, the application could:
maintain the state of all meters, with whatever information they require. This data is written to the Fabric state store, where it may be queried.
perform an additional transaction that contains the logic required to query the meter information and act accordingly.
In this case the chaincode can both update the meter information and act on the stored information through queries.
Everything is then on-chain, and therefore accessible, updatable and auditable.
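As a rough illustration of that approach, here is a minimal sketch in JavaScript. It assumes a `stub` object exposing `putState` and `getState` in the style of the fabric-shim chaincode stub; the key layout (`operation:<id>`), the function names and the in-memory `makeFakeStub` helper are hypothetical and only there so the sketch can run outside a peer.

```javascript
// Sketch only: `stub` is expected to expose putState/getState like the
// fabric-shim ChaincodeStub. Key layout and function names are hypothetical.

// Store the allocation table of one operation, e.g. { m1: 10, m2: 50, m3: 40 }.
async function setAllocations(stub, operationId, allocations) {
  const total = Object.values(allocations).reduce((sum, pct) => sum + pct, 0);
  if (total !== 100) {
    throw new Error(`allocations for ${operationId} sum to ${total}, expected 100`);
  }
  await stub.putState(`operation:${operationId}`, Buffer.from(JSON.stringify(allocations)));
}

// Validate an incoming reading: the operation must exist and the meter must
// belong to it. Returns the meter's percentage share.
async function getAllocation(stub, operationId, meterId) {
  const raw = await stub.getState(`operation:${operationId}`);
  if (!raw || raw.length === 0) {
    throw new Error(`operation ${operationId} does not exist`);
  }
  const allocations = JSON.parse(Buffer.from(raw).toString('utf8'));
  if (!Object.prototype.hasOwnProperty.call(allocations, meterId)) {
    throw new Error(`meter ${meterId} does not belong to operation ${operationId}`);
  }
  return allocations[meterId];
}

// In-memory stand-in for the stub (assumption, for local experimentation only).
function makeFakeStub() {
  const state = new Map();
  return {
    putState: async (key, value) => { state.set(key, value); },
    getState: async (key) => state.get(key) || Buffer.alloc(0),
  };
}
```

With this layout a configuration change becomes an ordinary, endorsed transaction, so the chaincode itself does not need to be republished.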

Related

Can I check the changed contents through a Hyperledger Fabric transaction?

I am currently participating in a Hyperledger Fabric project.
My understanding of Hyperledger Fabric is still limited. I understand that transactions are stored in blocks, and that a transaction records which chaincode was executed.
For example, suppose I executed a "send" chaincode function: A sends 500 to B, starting with A = 1000 and B = 1000. The result is A = 500 and B = 1500. Suppose this is TxId = AAA.
I want to see, for AAA, the history entry "A sent 500 to B". I tried decoding mychannel.block and the mychannel .tx file, which are in the channel-artifact directory created when the network was started, to JSON.
However, I found no content related to this transaction. Is there any way I can see the contents of TxId = AAA?
I decoded the .tx and .block files, but I didn't get what I wanted.

If you want to see the history of a record, you can use the ctx.stub.getHistoryForKey(id) function, where the id parameter is the record key. This is the method name in the Node.js chaincode API; the Go and Java chaincode APIs have equivalents (GetHistoryForKey and getHistoryForKey). I think the information you need should be held in the records that the contract writes, as the history only returns the different versions of a record over time. If you want to see that A transacted with B, the contract code needs to record that funds came from A and landed with B during the transfer. Depending on the implementation, this might require a cross-contract call to a different contract (one containing clients A and B) so that 500 can be taken from Account A's fund and added to Account B's fund. In this scenario (if we are talking about a sale), the AssetTransfer contract could show the change of ownership, whereas the client contract would show 2 updates: one where A's fund decreases by 500 and another where B's fund increases by 500.
In the scenario above, there are three updates that have history: an asset sale (which you don't mention, but which I am using as an example) with its change-of-ownership history; client A, whose fund record has decreased; and client B, whose fund record has increased by the same amount. Therefore, it is not the block history that you need, but the history of the client records for A and B. Even if you only had a single contract (e.g. Client) and you exchanged funds, you would still have two updates, one for A and one for B. It is the records managed by the contract code that change. The block is the manifestation of the entire transaction, i.e. all rules have been satisfied by the different peers and by whatever consensus policy is in place.
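A minimal sketch of reading such a record history is shown below. It assumes that `stub.getHistoryForKey` resolves to an iterator with `next()` and `close()`, as in fabric-shim; the result field names used here (`txId`, `value`, `isDelete`) are an assumption and differ between fabric-shim versions. The `makeFakeHistoryStub` helper is hypothetical and exists only so the sketch can run outside a peer.

```javascript
// Sketch: collect every recorded version of one key together with the
// transaction that wrote it. Field names (txId, value, isDelete) are an
// assumption; check them against the fabric-shim version in use.
async function getKeyHistory(stub, key) {
  const iterator = await stub.getHistoryForKey(key);
  const history = [];
  try {
    while (true) {
      const result = await iterator.next();
      // Some versions return the last element together with done === true,
      // so handle the value before checking the done flag.
      if (result.value) {
        const { txId, value, isDelete } = result.value;
        history.push({
          txId,
          isDelete: Boolean(isDelete),
          value: isDelete ? null : JSON.parse(Buffer.from(value).toString('utf8')),
        });
      }
      if (result.done) break;
    }
  } finally {
    await iterator.close();
  }
  return history;
}

// Hypothetical in-memory stand-in that replays a fixed list of versions.
function makeFakeHistoryStub(entries) {
  return {
    getHistoryForKey: async () => {
      let i = 0;
      return {
        next: async () => (i < entries.length ? { value: entries[i++], done: false } : { done: true }),
        close: async () => {},
      };
    },
  };
}
```

Calling `getKeyHistory(ctx.stub, 'A')` from a contract method would then list each version of A's record next to the transaction ID that produced it, which is where a transaction such as AAA would show up.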

Aggregate Design for Ledger

I'm trying to design a double-entry ledger with DDD and running into some trouble with defining aggregate roots. There are three domain models:
LedgerLine: individual line items that have data such as amount, timestamp they are created at, etc.
LedgerEntry: entries into the ledger. Each entry contains multiple LedgerLines where the debit and credit lines must balance.
LedgerAccount: accounts in the ledger. There are two types of accounts: (1) internal accounts (e.g. cash) (2) external accounts (e.g. linked bank accounts). External accounts can be added/removed.
After reading some articles online (e.g. this one: https://lorenzo-dee.blogspot.com/2013/06/domain-driven-design-accounting-domain.html?m=0), it seems like LedgerEntry should be one aggregate root, holding references to LedgerLines. LedgerAccount should be the other aggregate root. LedgerLines would hold the corresponding LedgerAccount's ID.
While this makes a lot of sense, I'm having trouble figuring out how to update the balance of ledger accounts when ledger lines are added. The above article suggests that the balance should be calculated on the fly, which means it wouldn't need to be updated when LedgerEntrys are added. However, I'm using Amazon QLDB for the ledger, and their solutions engineer specifically recommended that the balance be computed and stored on the LedgerAccount, since QLDB is not optimized for this kind of "scanning through lots of documents" operation.
Now the dilemma ensues:
If I update the balance field synchronously when adding LedgerEntrys, then I would be updating two aggregates in one operation, which violates the consistency boundary.
If I update the balance field asynchronously after receiving the event emitted by the "Add LedgerEntry" operation, then I could be reading a stale balance on the account if I add another LedgerEntry that spends the balance on the account, which could lead to overdrafts.
If I subsume the LedgerAccount model into the same aggregate of LedgerEntry, then I lose the ability to add/remove individual LedgerAccount since I can't query them directly.
If I get rid of the balance field and compute it on the fly, then there could be performance problems given (1) QLDB limitation (2) the fact that the number of ledger lines is unbounded.
So what's the proper design here? Any help is greatly appreciated!

You could use the Saga pattern to ensure that the whole process either completes or fails.
Here's a primer ... https://medium.com/#lfgcampos/saga-pattern-8394e29bbb85
I'd add an owned 'reserved funds' collection to the Ledger Account.
A Ledger Account will then have an 'Actual' balance and an 'Available' balance.
The 'Available' balance is the 'Actual' balance less the total value of the 'reserved funds'.
Using a Saga to manage the flow:
Try to reserve funds on the Account aggregate. The Ledger Account will check its available balance (actual minus total of reserved funds) and if sufficient, add another reserved funds to its collection. If reservation succeeds, the account aggregate will return a reservation unique id. If reservation fails, then the entry cannot be posted.
Try to complete the double entry bookkeeping. If it fails, send a 'release reservation' command to the Account aggregate quoting the reservation unique id, which will remove the reservation and we're back to where we started.
After double entry bookkeeping is complete, send a command to Account to 'complete' reservation with reservation unique id. The Account aggregate will then remove the reservation and adjust its actual balance.
In this way, you can manage a distributed transaction without the possibility of an account going overdrawn.
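The reserve / release / complete flow above can be sketched as follows. This is a simplified, single-process illustration: the class and method names are hypothetical, the steps run synchronously, and a real saga would send these commands to separate aggregates and persist each step.

```javascript
// Hypothetical LedgerAccount aggregate with an owned 'reserved funds' collection.
class LedgerAccount {
  constructor(actualBalance) {
    this.actualBalance = actualBalance;
    this.reservations = new Map(); // reservation id -> reserved amount
    this.nextId = 1;
  }

  // Available balance = actual balance minus the total of reserved funds.
  get availableBalance() {
    let reserved = 0;
    for (const amount of this.reservations.values()) reserved += amount;
    return this.actualBalance - reserved;
  }

  // Step 1: try to reserve funds. Returns a reservation id, or null if the
  // available balance is insufficient.
  reserve(amount) {
    if (amount > this.availableBalance) return null;
    const id = `res-${this.nextId++}`;
    this.reservations.set(id, amount);
    return id;
  }

  // Compensation: drop the reservation, leaving the actual balance untouched.
  release(id) {
    this.reservations.delete(id);
  }

  // Step 3: remove the reservation and adjust the actual balance.
  complete(id) {
    const amount = this.reservations.get(id);
    if (amount === undefined) throw new Error(`unknown reservation ${id}`);
    this.reservations.delete(id);
    this.actualBalance -= amount;
  }
}

// Saga coordinator: writeLedgerEntry stands in for posting the double entry
// (step 2) and is expected to throw if the entry cannot be posted.
function postEntry(account, amount, writeLedgerEntry) {
  const reservationId = account.reserve(amount);
  if (reservationId === null) return false; // insufficient funds, nothing to undo
  try {
    writeLedgerEntry();
  } catch (err) {
    account.release(reservationId); // back to where we started
    return false;
  }
  account.complete(reservationId);
  return true;
}
```

Because a second entry sees the reduced available balance as soon as the first reservation exists, it cannot spend the same funds while the first entry is still in flight.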

An aggregate root should serve as a transaction boundary. A multi-legged transaction spans multiple accounts, hence an account cannot be the aggregate root.
So the ledger itself is the aggregate root. An accounting transaction should correspond to a database transaction.
Actually, "the ledger itself" doesn't mean a singleton. It can be one ledger per organizational branch and time period, as it usually is in non-computer event-sourcing systems.
Update.
A ledger account balance is merely a view into the ledger, and as a view it has a state as of some known event. When deciding whether to accept an operation or not, you should make sure that the balances reflect the latest state of the ledger. If they do not, the newer events should be processed first, and then the account operation should be tried again.
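One way to read the "balance as a view" idea is sketched below. All names are hypothetical, and the ledger is reduced to an in-memory, append-only list of events; the point is only that the view records how many events it has processed and catches up before a decision is made.

```javascript
// Hypothetical append-only ledger: the aggregate root and source of truth.
class Ledger {
  constructor() {
    this.events = []; // each event: { account, amount } (negative = debit)
  }
  append(event) {
    this.events.push(event);
  }
}

// Balances are a view that knows up to which event it has been built.
class BalanceView {
  constructor() {
    this.balances = new Map();
    this.processed = 0; // number of ledger events already applied
  }

  // Apply any ledger events the view has not seen yet.
  catchUp(ledger) {
    while (this.processed < ledger.events.length) {
      const { account, amount } = ledger.events[this.processed];
      this.balances.set(account, (this.balances.get(account) || 0) + amount);
      this.processed += 1;
    }
  }

  balanceOf(account) {
    return this.balances.get(account) || 0;
  }
}

// Decide on an operation only after the view reflects the latest ledger state.
function tryWithdraw(ledger, view, account, amount) {
  view.catchUp(ledger);
  if (view.balanceOf(account) < amount) return false;
  ledger.append({ account, amount: -amount });
  return true;
}
```

In a concurrent system the catch-up, the check and the append would need to happen inside one database transaction on the ledger; this sketch leaves that out.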

Hyperledger Fabric 1.4 Private data collection

Hyperledger Fabric provides built-in support for storing off-chain data with the help of private data collections. For this we need to specify the collection config, which contains the collection names along with the participants that have access to the data in those collections.
There is a setting called "blockToLive" with which we can specify for how many blocks the peers should store the private data they have access to. The peers automatically purge the private data once the ledger block height reaches that threshold.
We have a requirement to use private data collections, but the data should be removed (automatically or manually) after exactly 30 days. Is there any way to achieve this?
timeToLive: Is there any implementation for specifying a timeToLive or similar configuration, with which the peers would automatically purge the data after the given duration?
If there is currently no automatic way, how can the data in a private collection be removed manually? Is there any way to remove it directly using external scripts/code? We don't want to create chaincode methods that are invoked as transactions to delete private data, because even the deletion would need to be endorsed, sent to the orderer and added to the ledger. How can the private data be removed directly?

First, everything that you put on the blockchain is permanent and supposed to be decentralized. So, having unilateral control over when to delete the private data goes against the principle of decentralization, and you should avoid it (this answers your second point). Endorsers endorse every change or transaction (including the blockToLive setting), so it does not make sense to deviate from the agreed period.
Second, time in distributed systems is subjective and it is impossible to have a global clock ⏰ (30 days for one node can be 29.99 days for another, or 29.80 days for yet another). Hence, time is measured in blocks, which is objective for all nodes. So I recommend using blockToLive. It can be difficult at first, but you can calculate backwards.
Say your block size is 10 (the number of transactions in a block) and you expect around 100 transactions per day: that is about 10 blocks per day, so for 30 days you can set blockToLive = 300. (Of course, this is a ballpark number.)
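The back-of-the-envelope calculation above can be written as a small helper. The function and parameter names are hypothetical, and the result is only an estimate: the orderer can also cut a block on a timeout before it is full, so the real number of blocks per day may be higher.

```javascript
// Estimate blockToLive for a desired retention period, assuming every block
// is full. Treat the result as a ballpark figure, not a guarantee.
function estimateBlockToLive(txPerDay, txPerBlock, retentionDays) {
  const blocksPerDay = Math.ceil(txPerDay / txPerBlock);
  return blocksPerDay * retentionDays;
}

// The example from the text: 100 transactions/day, 10 per block, 30 days.
console.log(estimateBlockToLive(100, 10, 30)); // 300
```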
Finally, if you still want to delete private data at will, I would recommend manual off-chain storage mechanisms.

Can old block data be deleted from a blockchain?

Just a general question: if I'm building a blockchain for a business, I want to store 3 years of transactions, but anything older than that I won't need and don't want in the active working database. Is there a way to back up and purge a blockchain, or delete items older than some time frame? I'm more interested in the event logic than in the forever-memory aspect.

I'm not aware of any blockchain technology capable of this yet, but Hyperledger Fabric in particular is planning to support data archiving (checkpointing). Simply put, participants need to agree on a block height so that older blocks can be discarded. The block at that height then becomes the source of trust, similar to the original genesis block. A snapshot that captures the current state also needs to be taken and agreed on.
From a serviceability point of view it is slightly more complicated, e.g. some nodes may be down while the snapshot is taken.

If you just want to purge the data after a while, Fabric Private Data has an option that could meet your needs.
blockToLive: "Represents how long the data should live on the private database in terms of blocks. The data will live for this specified number of blocks on the private database and after that it will get purged, making this data obsolete from the network so that it cannot be queried from chaincode, and cannot be made available to requesting peers."
You can read more here.
Personally, I don't think there is a way to remove a block from the chain. It would break the immutability property of the blockchain.

There are 2 concepts that help you achieve your goals.
The first has already been mentioned: Private Data. Private data lets you 'label' data with a time to live. Only the private data hashes are stored on the chain (so that the transaction can be verified), while the data itself is stored in so-called SideDBs and gets fully purged (except for the hashes on the chain, of course). This is the basis for using Fabric without workarounds and achieving GDPR compliance.
The other concept, which has not been mentioned yet, is very helpful for this part of the question:
"Is there a way to backup and purge a blockchain or delete items older than some time frame?"
Every peer stores only the 'current state' of the ledger in its StateDB. The current state could be described as the data that is labeled 'active' and will probably be used again soon. You can think of the StateDB as a cache. Data enters this cache when a key is created or updated (by invoking chaincode). To remove a key from the cache you can use 'DelState'. The key is then labeled 'deleted' and is no longer in the cache, BUT it is still on the ledger, and you can still retrieve the history and data for that key.
Conclusion: for 'real' deletion of data you have to use Private Data, and for managing data in your StateDB (think of the 'cache' analogy) you can simply use the built-in functions.

Hyperledger fabric - Concurrent transactions

I'm wondering how it is possible to execute concurrent transactions in Hyperledger Fabric using Hyperledger Composer.
When I try to submit two transactions at the same time against the same resource, I get this error:
Error trying invoke business network. Error: Peer has rejected transaction 'transaction-number' with code MVCC_READ_CONFLICT
Does anyone know if there is a workaround or a design pattern to avoid this?

Though I may not be providing the best solution, I hope to share some ideas and possible workarounds to this question.
First, let's briefly explain why you are getting this error. The underlying DB of Hyperledger Fabric employs an MVCC-like (Multi-Version Concurrency Control) model. An example of this would be two clients trying to update an asset at version 0 to a certain value. One would succeed (updating the value and incrementing the version number in the stateDB to 1), while the other would fail with this error (MVCC_READ_CONFLICT) because of the version mismatch.
One possible solution discussed here (https://medium.com/wearetheledger/hyperledger-fabric-concurrency-really-eccd901e4040) would be to implement a FIFO queue on your own between the business logic and Fabric SDK. Retry logic could also be added in this case.
Another way would be to use the delta concept. Suppose there is an asset A with value 10 (maybe it represents an account balance). This asset is updated frequently by multiple concurrent transactions (say through the values 12 -> 19 -> 16), so the above-mentioned error would easily be triggered. Instead, we store the changes as deltas (+2 -> +7 -> -3), and the final aggregated value in the ledger is the same. But keep in mind that this trick MAY NOT suit every case: in this example, you may also need to closely monitor the running total to avoid paying out money when the account is empty. So it depends heavily on the data type and use case.
For more information, you can take a look at this: https://github.com/hyperledger/fabric-samples/tree/release-1.1/high-throughput
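A minimal sketch of the delta idea is shown below. It assumes a `stub` exposing `createCompositeKey`, `getTxID`, `putState` and `getStateByPartialCompositeKey` in the style of the fabric-shim chaincode stub; the `'delta'` object type, the function names and the in-memory `makeFakeStub` helper are hypothetical and only there so the sketch can run outside a peer.

```javascript
// Each update is written under its own key (variable name + transaction id),
// so concurrent transactions do not read or write a shared key.
async function recordDelta(stub, name, delta) {
  const key = stub.createCompositeKey('delta', [name, stub.getTxID()]);
  await stub.putState(key, Buffer.from(String(delta)));
}

// Sum all deltas recorded for one variable to obtain its current value.
async function aggregate(stub, name) {
  const iterator = await stub.getStateByPartialCompositeKey('delta', [name]);
  let total = 0;
  try {
    while (true) {
      const result = await iterator.next();
      if (result.value) {
        total += Number(Buffer.from(result.value.value).toString('utf8'));
      }
      if (result.done) break;
    }
  } finally {
    await iterator.close();
  }
  return total;
}

// In-memory stand-in for the stub (assumption, for local experimentation only).
function makeFakeStub() {
  const state = new Map();
  let txCounter = 0;
  const join = (type, attrs) => [type, ...attrs].join('\u0000');
  return {
    nextTx: () => { txCounter += 1; }, // simulate starting a new transaction
    getTxID: () => `tx${txCounter}`,
    createCompositeKey: join,
    putState: async (key, value) => { state.set(key, value); },
    getStateByPartialCompositeKey: async (type, attrs) => {
      const prefix = join(type, attrs) + '\u0000';
      const rows = [...state]
        .filter(([key]) => key.startsWith(prefix))
        .map(([key, value]) => ({ key, value }));
      let i = 0;
      return {
        next: async () => (i < rows.length ? { value: rows[i++], done: false } : { done: true }),
        close: async () => {},
      };
    },
  };
}
```

The trade-off is that reading the value now costs a scan over all recorded deltas, and a check such as "never go below zero" cannot be enforced inside the individual delta transactions.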

I recently ran into this problem and solved it by creating an array of functions that each call an async transaction function and return a promise, then running them one at a time.
My transactions add items from the arrays asset2Ids and asset3Ids to an array field on asset1. The transactions all act on the same asset, so I was getting an MVCC_READ_CONFLICT error because the read/write set changes before each transaction is committed. Forcing the transactions to run sequentially fixes this conflict:
// Build an array of functions, each of which starts one transaction and
// returns a promise. asset2Ids, asset3Ids and the two transaction functions
// come from the surrounding application code.
const funcArray = [];
for (const id of asset2Ids) {
  // Queue the transaction; it is not started until the function is called
  funcArray.push(() => transactionFunctionThatAddsAsset2IdToAsset1(id).toPromise());
}
for (const id of asset3Ids) {
  // Queue the transaction; it is not started until the function is called
  funcArray.push(() => transactionFunctionThatAddsAsset3IdToAsset1(id).toPromise());
}
// Run the transactions against the asset one after another: each one starts
// only after the previous promise has resolved
funcArray.reduce((p, fn) => p.then(fn), Promise.resolve());
