How can I identify an MVCC abort? - hyperledger-fabric

I am trying to reproduce the MVCC problem with Hyperledger Fabric, but when I invoke a transaction that reads and modifies the value under the same key, it works.
What changes should I make to trigger an MVCC failure?

MVCC stands for Multi-Version Concurrency Control. It is a well-known approach to optimistic concurrency control that prevents conflicting concurrent modifications of the same key. In the Fabric context, a concurrent modification means transactions that are grouped into the same block and modify the same key. Therefore, to experience an MVCC failure it is not enough to modify the same key several times; you also need to make sure those transactions end up batched into the same block.
The easiest way to achieve this is to submit as many update transactions as possible at once, to increase the probability that several of them are placed into the same block.
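For illustration, here is a minimal Go chaincode sketch (using the classic Fabric Go shim API; the key name "counter" and the struct name are my own choices, not from the question) whose Invoke both reads and rewrites the same key. Submitting many invocations of it concurrently, so that several land in the same block, should get all but one of them invalidated with MVCC_READ_CONFLICT:

    package main

    import (
        "fmt"
        "strconv"

        "github.com/hyperledger/fabric/core/chaincode/shim"
        pb "github.com/hyperledger/fabric/protos/peer"
    )

    // CounterChaincode reads and rewrites a single key, so concurrent
    // invocations that end up in the same block conflict on that key.
    type CounterChaincode struct{}

    func (c *CounterChaincode) Init(stub shim.ChaincodeStubInterface) pb.Response {
        // Create the shared key; "counter" is just an example name.
        if err := stub.PutState("counter", []byte("0")); err != nil {
            return shim.Error(err.Error())
        }
        return shim.Success(nil)
    }

    func (c *CounterChaincode) Invoke(stub shim.ChaincodeStubInterface) pb.Response {
        // Reading the key records its current version in the transaction's read set.
        cur, err := stub.GetState("counter")
        if err != nil {
            return shim.Error(err.Error())
        }
        n, _ := strconv.Atoi(string(cur))

        // Writing the same key puts it into the write set. If another transaction
        // in the same block already updated "counter", this transaction is marked
        // invalid with MVCC_READ_CONFLICT at validation time.
        if err := stub.PutState("counter", []byte(strconv.Itoa(n+1))); err != nil {
            return shim.Error(err.Error())
        }
        return shim.Success(nil)
    }

    func main() {
        if err := shim.Start(new(CounterChaincode)); err != nil {
            fmt.Printf("error starting chaincode: %s\n", err)
        }
    }

With this in place, firing the invoke from several clients in a tight loop, without waiting for each transaction to commit, is usually enough to get multiple updates of "counter" batched into one block.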

Related

How to build a timed chaincode?

I want to write a timed chaincode: the chaincode should stop responding to any invoke after a certain period. In Ethereum, I can do this by counting blocks; for example, I can make a smart contract valid only before block 100000. But in Hyperledger Fabric, I don't know how to do it.
I had a similar problem. Even though block information is available via a system chaincode, I was not able to query it elegantly from my own chaincode. For my purpose, however, the following implementation was enough:
Introduce a counter state and initialize it with '0'.
Check this state in all your important chaincode functions. If it is lower than a certain limit, increase it by 1 and proceed with the function logic. Otherwise, throw an error/print a message/do nothing.
Based on the expected frequency of your transactions you can set a more or less appropriate limit (a sketch of this check follows below).
Maybe this solution will work for you as well.
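A minimal sketch of that guard in Go chaincode; the helper name, the key "invocationCount" and the limit are example values I chose, and the function is meant to be called at the top of every important chaincode function:

    package chaincode

    import (
        "fmt"
        "strconv"

        "github.com/hyperledger/fabric/core/chaincode/shim"
    )

    // checkAndBumpCounter refuses to proceed once the counter state has reached
    // the limit; otherwise it increments the counter and lets the caller continue.
    func checkAndBumpCounter(stub shim.ChaincodeStubInterface, limit int) error {
        raw, err := stub.GetState("invocationCount")
        if err != nil {
            return err
        }
        count, _ := strconv.Atoi(string(raw)) // a missing or empty state parses as 0
        if count >= limit {
            return fmt.Errorf("chaincode no longer accepts invokes: limit of %d reached", limit)
        }
        return stub.PutState("invocationCount", []byte(strconv.Itoa(count+1)))
    }

Each guarded function would call checkAndBumpCounter(stub, 100) (or whatever limit fits the expected transaction frequency) before running its own logic.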

Why do we need total order across view changes in consensus protocols?

In their famous article, Miguel Castro and Barbara Liskov justify the commit phase of the PBFT consensus protocol like this:
This ensures that replicas agree on a total order for requests in the same view, but it is not sufficient to ensure a total order for requests across view changes. Replicas may collect prepared certificates in different views with the same sequence number and different requests. The commit phase solves this problem as follows. Each replica i multicasts <COMMIT, v, n, i>_{α_i} saying it has the prepared certificate and adds this message to its log. Then each replica collects messages until it has a quorum certificate with 2f + 1 COMMIT messages for the same sequence number n and view v from different replicas (including itself). We call this certificate the committed certificate and say that the request is committed by the replica when it has both the prepared and committed certificates.
But why exactly do we need to guarantee total order across view changes?
If a leader/primary replica fails and triggers a view change, wouldn't it suffice to discard everything from the previous view? What situation does the commit phase prevent that this solution does not?
Apologies if this is too obvious. I'm new to distributed systems and I haven't found any source which directly answers this question.
There is a conceptual reason for this. The system appears to a client as a black box. The whole idea of this box is to provide reliable access to some service, so it should mask the failures of individual replicas. Otherwise, if you discard everything at each view change, clients will constantly lose their data. So basically, your solution simply contradicts the specification. The commit phase is needed exactly to prevent this kind of situation. If a request is "accepted" only when there are 2f + 1 COMMIT messages for it, then even if f replicas are faulty, the remaining nodes can recover all committed requests; this provides durable access to the system.
There is also a technical reason. In theory the system is asynchronous, which means you cannot even guarantee that a view change will occur only as a result of a failure. Some replicas may merely suspect that the leader is faulty and change the view. With your solution it is possible that the system discards everything it has accepted even though none of the replicas is faulty.
If you're new to distributed systems, I suggest you have a look at the classic protocols tolerating non-Byzantine failures (e.g., Paxos); they are simpler but solve the problem in a similar way.
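To make the commit phase a bit more concrete, here is a toy Go sketch of the quorum check a replica performs; the message shape and names are my own simplification, not code from the paper. A request at sequence number n in view v is only considered committed once 2f + 1 matching COMMIT messages from distinct replicas have been collected:

    package main

    import "fmt"

    // Commit is a simplified COMMIT message: view, sequence number,
    // request digest and the id of the sending replica.
    type Commit struct {
        View    int
        Seq     int
        Digest  string
        Replica int
    }

    // committed reports whether msgs contains a committed certificate for
    // (view, seq, digest): at least 2f+1 COMMITs from distinct replicas.
    func committed(msgs []Commit, view, seq int, digest string, f int) bool {
        senders := map[int]bool{}
        for _, m := range msgs {
            if m.View == view && m.Seq == seq && m.Digest == digest {
                senders[m.Replica] = true
            }
        }
        return len(senders) >= 2*f+1
    }

    func main() {
        msgs := []Commit{
            {View: 1, Seq: 7, Digest: "d1", Replica: 0},
            {View: 1, Seq: 7, Digest: "d1", Replica: 1},
            {View: 1, Seq: 7, Digest: "d1", Replica: 2},
        }
        fmt.Println(committed(msgs, 1, 7, "d1", 1)) // true: 3 distinct senders, and 2f+1 = 3 for f = 1
    }

Because any two quorums of 2f + 1 replicas in a system of 3f + 1 replicas intersect in at least f + 1 replicas, at least one correct replica in the intersection remembers every committed request, which is what lets the view-change protocol carry it into the next view.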
Edit
When I say "clients constantly lose their data" it is a bit more than it sounds. I'm talking about the impact of a particular client request to the system. Let's take a key-value store. A clinet A associates some value to some key via our "black box". The "black box" now orders this request with respect to any other concurrent (or simply parallel) requests. It then replicates it across all replicas and finally notifies A. Without commit phase there is no ordering and at two different views our "black box" can chose two different order of execution of client requests. That being said, the following is possible:
1. at time t, A associates value to key and the "box" approves this,
2. at time t+1, B associates value_2 to key and the "box" approves this,
3. at time t+2, C reads value_2 from key,
4. view change (invisible to clients),
5. at time t+3, D reads value from key.
Note that (5) is possible not because the "box" is not aware of value_2 (as you mentioned, the value itself can be resubmitted) but because it is not aware that it previously first wrote value and then overwrote it with value_2. At the new view, the system needs to somehow order those two requests, but it has no luck: the decision is not coherent with the past.
Eventual synchrony is a way to guarantee liveness of these protocols; however, it cannot prevent the situations described above. Eventual synchrony states that eventually your system will behave much like a synchronous one, but you don't know when; before that time all kinds of weird things can happen. If during the asynchronous period a safety property is violated, then obviously the whole system is not safe.
The output of PBFT should not be one log per view, but rather an ever-growing global log to which every view tries to contribute new 'blocks'.
The equivalent notion in a blockchain is that each block proposer, or block miner, must append to the current blockchain, instead of starting its new blockchain from scratch. I.e. new blocks must respect previous transactions, the same way new views must respect previous views.
If the total ordering is not consistent across views, then we lose the property above.
In fact, if we force a view change after every sequence number in PBFT, it looks a lot like a blockchain, but with a much more complicated recovery/safety mechanism (in part because PBFT blocks don't commit to the previous block, so we need to agree on each of them individually).

Can LMDB be made concurrent for writes as well under specific circumstances?

MDB_NOLOCK, as described in the mdb_env_open() API doc:
MDB_NOLOCK Don't do any locking. If concurrent access is anticipated, the caller must manage all concurrency itself. For proper operation the caller must enforce single-writer semantics, and must ensure that no readers are using old transactions while a writer is active. The simplest approach is to use an exclusive lock so that no readers may be active at all when a writer begins.
1. What if an RW txnA intends to modify a set of keys which has no key in common with another set of keys which another RW txnB intends to modify? Couldn't they be sent concurrently?
2. Isn't the single-writer semantic wasteful for such situations? One txn is waiting for the previous one to finish, even though they intend to operate on entirely separate regions of an LMDB env.
3. In an environment opened with MDB_NOLOCK, what if the client app determines, at the domain level, that two write transactions intend to read/write mutually exclusive sets of keys anywhere in the LMDB environment, and sends such transactions concurrently anyway? What could go wrong?
4. Could such concurrent writes scale linearly with cores, like RO txns do, given the app is able to manage these concurrent writes in the manner described in 3?
1. No, because modifying key/value pairs also requires modifying the B-tree structure, so the two transactions would conflict with each other.
2. You should avoid doing long-running computations in the middle of a write transaction. Try to do as much as possible beforehand. If you can't do this, then LMDB might not be a great fit for your application. Usually you can, though.
3. Very bad stuff: application crashes and DB corruption.
4. Writes are generally IO-bound and will not scale with many cores anyway. There are some very hacky things you can do with LMDB's writemap and/or pwrite(2), but you are very much on your own there.
I'm going to assume that writing to the value part of a pre-existing key does not modify the B-tree, because you are not modifying the keys. So what Doug Hoyte said stands, except possibly point 3:
The key phrase here is "are intending to RW to mutually exclusive sets of keys". Assuming that the keys are pre-allocated and already in the DB, changing the values should not matter. I don't even know if LMDB can store variable-sized values, in which case it could matter if the values are different sizes.
So, it should be possible to write with MDB_NOLOCK concurrently as long as you can guarantee to never modify, add, or delete any keys during the concurrent writes.
Empirically I can state that working with LMDB opened with MDB_NOLOCK (or lock=False in Python) and simply modifying values of pre-existing keys, or even only adding new key/value pairs, seems to work well, even if LMDB itself is mounted across an NFS-like medium and queried from different machines.
@Doug Hoyte - I would appreciate more context as to what specific circumstances might lead to a crash or corruption. In my case there are many small, short-lived writes to the same DB.

DDD - How to modify several ARs (from different bounded contexts) in a single request?

I would like to present a small scenario which is still at the design stage and which, with regard to DDD principles, seems a bit tedious to accomplish.
Let's say I have an application for hosting account management. Basically, the application is composed of several bounded contexts such as web account management, FTP account management, and mail account management, each of them represented by its own AR (they can live standalone).
Now, let's imagine I want to provide a UI with an HTML form that contains one fieldset for each bounded context, for instance to update limits and/or features. How exactly should I proceed to update all the ARs without breaking the single-transaction-per-request principle? Can I create a kind of "outer" AR, let's say a ClientHostingProperties AR, which would hold references to the other ARs and update them as part of a single transaction, using its own repository? Or should I instead create an AR that emits messages for listeners provided by the bounded contexts to react on, in which case I should probably think about event sourcing (ES)?
Thanks.
How exactly should I proceed to update all the ARs without breaking the single-transaction-per-request principle?
You are probably looking for a process manager.
Basic sketch: persisting the details from the submitted form is a transaction unto itself (you are offered an opportunity to accrue business value; step 1 is to capture that opportunity).
That gives you a way to keep track of whether or not this task is "done": you compare the changes in the task to the state of the system, and fire off commands (to run in isolated transactions) to make changes.
Processes, in my mind, end up looking a lot like state machines: these commands are done, these commands are not done, these commands have failed: now what? Eventually the process reaches a state where there are no additional changes to be made, and this instance of the process is "done".
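A rough Go sketch of that idea (all names here are invented for illustration, not from any framework): the process manager tracks which commands are done, pending or failed, keeps handing out the ones that still need to run (each in its own isolated transaction), and is finished once nothing is left:

    package main

    import "fmt"

    // Status of one command tracked by the process manager.
    type Status int

    const (
        Pending Status = iota
        Done
        Failed
    )

    // Process tracks the commands needed to apply one submitted form,
    // e.g. one command per bounded context touched by the form.
    type Process struct {
        Commands map[string]Status
    }

    // Next returns the commands that still have to be dispatched
    // (each one runs in its own isolated transaction).
    func (p *Process) Next() []string {
        var out []string
        for name, st := range p.Commands {
            if st == Pending || st == Failed {
                out = append(out, name)
            }
        }
        return out
    }

    // Complete records the outcome of a dispatched command.
    func (p *Process) Complete(name string, ok bool) {
        if ok {
            p.Commands[name] = Done
        } else {
            p.Commands[name] = Failed
        }
    }

    // IsDone reports whether the process has reached its final state.
    func (p *Process) IsDone() bool {
        for _, st := range p.Commands {
            if st != Done {
                return false
            }
        }
        return true
    }

    func main() {
        p := &Process{Commands: map[string]Status{
            "UpdateWebLimits":  Pending,
            "UpdateFtpLimits":  Pending,
            "UpdateMailLimits": Pending,
        }}
        for _, cmd := range p.Next() {
            p.Complete(cmd, true) // pretend each command succeeded
        }
        fmt.Println("process done:", p.IsDone())
    }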
Short answer: You don't.
An aggregate is a transactional boundary, which means that if you want to update multiple aggregates in one "action", you have to use multiple transactions. The reason an aggregate is equivalent to one transaction is that this allows you to guarantee consistency.
This means that you have two options:
You can make your aggregate larger. Then you can actually guarantee consistency, but your ability to handle concurrent requests gets worse. So this is usually what you want to avoid.
You can live with the fact that it's two transactions, which means you are eventually consistent. If so, you usually use something such as a process manager or a flow to handle updating multiple aggregates. In its simplest form, a flow is nothing but a simple "if this event happens, run that command" rule; in its more complex form, it has its own state (a minimal sketch of the simple form follows below).
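A minimal Go sketch of the simple form; the event and command names are invented for illustration:

    package main

    import "fmt"

    // nextCommand is a flow in its simplest form:
    // "if this event happens, run that command".
    func nextCommand(event string) (string, bool) {
        switch event {
        case "WebAccountLimitsChanged":
            // The second aggregate is updated by its own command,
            // in its own transaction, so the system is eventually consistent.
            return "UpdateHostingSummary", true
        default:
            return "", false // no follow-up command for this event
        }
    }

    func main() {
        if cmd, ok := nextCommand("WebAccountLimitsChanged"); ok {
            fmt.Println("dispatch:", cmd)
        }
    }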
Hope this helps 😊

Locks on postgres transactions

I am load testing my node.js application. At some point I reach a state where requests are pending, and my best guess is that it's because of a locked transaction. This is the last log statement:
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
And in pg_locks I've got 4 rows with the above query which are GRANTED = true, with mode ExclusiveLock.
Where should I start looking for a bug?
If the request that takes this lock performs a lot of insert and update operations, should the isolation level be REPEATABLE READ?
Is there any way to debug/handle this kind of situation?
Is there any mechanism to time out those locks so the app can be released automatically and not block further requests?
Side question (since I'm not looking for a tool directly): are there any tools to monitor and spot this kind of situation? (I was hoping to use Munin.)
I am using node.js 4.2.1 with express 4.13.3 and sequelize 3.19.3 as the ORM for Postgres 9.4.1.
Welcome to PostgreSQL transaction locks hell :)
You can spend a lot of time trying to figure out where exactly the lock happens and why, but there is very little chance that it will help you resolve the situation.
The general recipe for solving this kind of situation is as follows:
Keep your transaction size to the bare minimum required by the business logic of your application. For example, replace repeated same-type inserts or updates with their multi-row analogues, because query IO is expensive.
Do not use transactions while executing only a single query that modifies data, i.e. avoid unnecessary transactions.
Implement error handling that can detect a transaction lock and retry the transaction (a sketch of this follows after the list). Logging such retries will help you understand the weak spots of your system and how to redesign it better.
Even in a well-engineered system the last step often becomes a necessity; don't let it scare you ;)
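As an illustration of the third point: the question uses node.js/sequelize, but the retry pattern itself is language-agnostic. Here is a sketch in Go with database/sql; the connection string, table name and the string matching on error messages are placeholders (a real implementation would check the SQLSTATE codes 40001, 40P01 and 55P03 instead):

    package main

    import (
        "database/sql"
        "log"
        "strings"

        _ "github.com/lib/pq" // any Postgres driver; the DSN below is a placeholder
    )

    // withRetry runs fn inside a transaction and retries a few times when the
    // failure looks like a deadlock, lock timeout or serialization conflict.
    func withRetry(db *sql.DB, attempts int, fn func(*sql.Tx) error) error {
        var err error
        for i := 0; i < attempts; i++ {
            tx, beginErr := db.Begin()
            if beginErr != nil {
                return beginErr
            }
            err = fn(tx)
            if err == nil {
                err = tx.Commit()
                if err == nil {
                    return nil
                }
            } else {
                tx.Rollback()
            }
            msg := err.Error()
            if !strings.Contains(msg, "deadlock") &&
                !strings.Contains(msg, "could not serialize") &&
                !strings.Contains(msg, "lock timeout") {
                return err // not a locking problem, do not retry
            }
            log.Printf("retrying transaction after locking error: %v", err)
        }
        return err
    }

    func main() {
        db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        err = withRetry(db, 3, func(tx *sql.Tx) error {
            // One multi-row insert instead of several single-row ones (first point above).
            _, execErr := tx.Exec(`INSERT INTO events(name) VALUES ($1), ($2)`, "a", "b")
            return execErr
        })
        if err != nil {
            log.Fatal(err)
        }
    }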
I encountered a similar situation where I started 5 parallel transactions requesting the same update lock, and the first one also continued with work that required more postgres calls. The entire system deadlocks, and the first transaction is listed as idle in transaction in pg_stat_activity and granted access to all the locks it has requested in pg_locks.
What I think is happening:
The first transaction gets the lock granted and then finishes the query. After this it drops its connection to postgres.
The following 4 transactions open a connection each and block on the lock that is held by the first transaction.
Since they are blocked, when the first transaction gets to execute and tries to connect to postgres to make another query, it gets deadlocked, because sequelize has run out of connections.
When I changed my sequelize initialisation and added more connections to the pool (the default being 5), the deadlock disappeared.
I am not sure what is using the fifth connection, or whether the default happens to be 4 and not 5 for some reason, but this still seems to tick all the boxes.
Another solution, depending on your use case, is to use the NOWAIT option in postgres, so that a transaction aborts when it asks for a lock and cannot get it immediately.
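For reference, that approach, combined with a lock_timeout (which also addresses the question about timing out locks), looks roughly like this from Go; the table name, timeout value and connection string are placeholders:

    package main

    import (
        "database/sql"
        "log"

        _ "github.com/lib/pq" // any Postgres driver; the DSN below is a placeholder
    )

    func main() {
        db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        tx, err := db.Begin()
        if err != nil {
            log.Fatal(err)
        }
        defer tx.Rollback()

        // Bound how long any statement in this transaction may wait for a lock.
        if _, err := tx.Exec(`SET LOCAL lock_timeout = '2s'`); err != nil {
            log.Fatal(err)
        }
        // NOWAIT makes the row lock fail immediately instead of queueing;
        // "accounts" is just an example table.
        if _, err := tx.Exec(`SELECT id FROM accounts WHERE id = $1 FOR UPDATE NOWAIT`, 1); err != nil {
            log.Printf("could not take the row lock right away: %v", err)
            return
        }
        // ... perform the update and tx.Commit() here ...
    }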
Hope it helps if someone else encounters the same issue.
