What's the difference between a data flow graph and a dependence graph in TBB?

I have read about data flow graphs and dependence graphs in the
Intel TBB Tutorial, and I am a bit confused about these two concepts.
Can I say that the key difference between data flow graph and dependence graph is whether there are explicitly shared resources or not?
But it seems that we can implement a dependence graph using function_node with pseudo messages, or implement a data flow graph using continue_node with shared global variables.

The difference between a function_node accepting a continue_msg input and a continue_node is the behavior when receiving a message. This is a consequence of the concept of "dependence graph."
The idea of dependence graphs is that the only information being passed by the graph is the completion of a task. If you have four tasks (A,B,C,D) all operating on the same shared data, and tasks A and B must be complete before either C or D can be started, you define four continue_nodes, and attach the output of node A to C and D, and the same for B. You may also create a broadcast_node<continue_msg> and attach A and B as successors to it. (The data being used in the computation must be accessible by some other means.)
To start the graph you do a try_put of a continue_msg to the broadcast_node. The broadcast_node sends a continue_msg to each successor (A & B).
continue_nodes A and B each have 1 predecessor (the broadcast_node.) On receiving a number of continue_msgs equal to their predecessor count (1), they are queued to execute, using and updating the data representing the state of the computation.
When continue_node A completes, it sends a continue_msg to each successor, C & D. Those nodes each have two predecessors, so they do not execute on receiving this message. They only remember they have received one message.
When continue_node B completes, it also sends a continue_msg to C and D. This will be the second continue_msg each node receives, so tasks will be queued to execute their function_bodies.
continue_nodes use the graph only to express this order. No data is transferred from node to node (beyond the signal that a predecessor is complete.)
If the nodes in the graph were function_nodes accepting continue_msgs rather than continue_nodes, the reaction to the broadcast_node getting a continue_msg would be:
1. The broadcast_node would forward a continue_msg to A and B, and they would each execute their function_bodies.
2. Node A would complete, and pass continue_msgs to C and D.
3. On receiving the continue_msg, tasks would be queued to execute the function_bodies of C and D.
4. Node B would complete execution, and forward a continue_msg to C and D.
5. C and D, on receiving this second continue_msg, would queue a task to execute their function_bodies a second time.
Notice step 3 above. The function_node reacts each time it receives a continue_msg. The continue_node knows how many predecessors it has, and only reacts when it has received a number of continue_msgs equal to its predecessor count.
The dependence graph is convenient if there is a lot of state being used in the computation, and if the sequence of tasks is well understood. The idea of "shared state" does not necessarily dictate the use of a dependence graph, but a dependence graph cannot pass anything but completion state of the work involved, and so must use shared state to communicate other data.
(Note that the completion order I am describing above is only one possible ordering. Node B could complete before node A, but the sequence of actions would be similar.)

Related

How to handle in Event Driven Microservices if the messaging queue is down?

Assume there are two services A and B, in a microservice environment.
In between A and B sits a messaging queue M that is a broker.
A<---->'M'<----->B
The problem is what if the broker M is down?
A possible solution I can think of:
Service A pings messaging queue M at regular intervals while it is down. In the meantime, service A
stores the data in a local DB and dumps it into the queue once broker M is back up.
Considering the above problem, if someone can suggest whether threads or reactive programming is best suited for this scenario, and how it could be handled in code, I would be grateful.
The problem is what if the broker M is down?
If the broker is down, then A and B can't use it to communicate.
What A and B should do in that scenario is going to depend very much on the details of your particular application/use-case.
Is there useful work they can do in that scenario?
If not, then they might as well just stop trying to handle any work/transactions for the time being, and instead just sit and wait for M to come back up. Having them do periodic pings/queries of M (to see if it's back yet) while in this state is a good idea.
If they can do something useful in this scenario, then you can have them continue to work in some sort of "offline mode", caching their results locally in anticipation of M's re-appearance at some point in the future. Of course this can become problematic, especially if M doesn't come back up for a long time -- e.g.
what if the set of cached local results becomes unreasonably large, such that A/B runs out of space to store it?
Or what if A and B cache local results that will both apply to the same data structure(s) within M, such that when M comes back online, some of A's results will overwrite B's (or vice-versa, depending on the order in which they reconnect)? (This is analogous to the sort of thing that source-code-control servers have to deal with after several developers have been working offline, both making changes to the same lines in the same file, and then they both come back online and want to commit their changes to that file. It can get a bit complex and there's not always an obvious "correct" way to resolve conflicts)
Finally what if it was something A or B "said" that caused M to crash in the first place? In that case, re-uploading the same requests to M after it comes back online might only cause it to crash again, and so on in an infinite loop, making the service perpetually unusable. (In this case, of course, the proper fix would be to debug M)
Another approach might be to try to avoid the problem by having multiple redundant brokers (e.g. M1, M2, M3, ...) such that as long as at least one of them is still available, productive work can continue. Or perhaps allow A and B to communicate with each other directly rather than through an intermediary.
As for whether this sort of thing is best handled by threads or reactive programming, that's a matter of personal preference. Personally I prefer reactive programming, because the multiple-threads style usually means blocking RPC operations, and a thread that is blocked inside a blocking operation is frozen and helpless until the remote party responds (e.g. if M takes 2 minutes to respond to an RPC request, then A's RPC call to M cannot return for 2 minutes, which means the calling thread can do nothing at all during that time). In a reactive approach, A's thread could be doing other things during that 2-minute period (such as pinging M to make sure it's okay, or contacting a backup broker) if it wanted to.

Design choice for a microservice event-driven architecture

Let's suppose we have the following:
DDD aggregates A and B, A can reference B.
A microservice managing A that exposes the following commands:
create A
delete A
link A to B
unlink A from B
A microservice managing B that exposes the following commands:
create B
delete B
A successful creation, deletion, link or unlink always results in the emission of a corresponding event by the microservice that performed the action.
What is the best way to design an event-driven architecture for these two microservices so that:
A and B will always eventually be consistent with each other. By consistency, I mean A should not reference B if B doesn't exist.
The events from both microservices can easily be projected in a separate read model on which queries spanning both A and B can be made
Specifically, the following examples could lead to transient inconsistent states, but consistency must in all cases eventually be restored:
Example 1
Initial consistent state: A exists, B doesn't, A is not linked to B
Command: link A to B
Example 2
Initial consistent state: A exists, B exists, A is linked to B
Command: delete B
Example 3
Initial consistent state: A exists, B exists, A is not linked to B
Two simultaneous commands: link A to B and delete B
I have two solutions in mind.
Solution 1
Microservice A only allows linking A to B if it has previously received a "B created" event and no "B deleted" event.
Microservice B only allows deleting B if it has not previously received a "A linked to B" event, or if that event was followed by a "A unlinked from B" event.
Microservice A listens to "B deleted" events and, upon receiving such an event, unlinks A from B (for the race condition in which B is deleted before it has received the A linked to B event).
Solution 2:
Microservice A always allows linking A to B.
Microservice B listens for "A linked to B" events and, upon receiving such an event, verifies that B exists. If it doesn't, it emits a "link to B refused" event.
Microservice A listens for "B deleted" and "link to B refused" events and, upon receiving such an event, unlinks A from B.
EDIT: Solution 3, proposed by Guillaume:
Microservice A only allows linking A to B if it has not previously received a "B deleted" event.
Microservice B always allows deleting B.
Microservice A listens to "B deleted" events and, upon receiving such an event, unlinks A from B.
The advantage I see for solution 2 is that the microservices don't need to keep track of past events emitted by the other service. In solution 1, each microservice basically has to maintain a read model of the other one.
A potential disadvantage for solution 2 could maybe be the added complexity of projecting these events in the read model, especially if more microservices and aggregates following the same pattern are added to the system.
Are there other (dis)advantages to one or the other solution, or even an anti-pattern I'm not aware of that should be avoided at all costs?
Is there a better solution than the two I propose?
Any advice would be appreciated.
Microservice A only allows linking A to B if it has previously received a "B created" event and no "B deleted" event.
There's a potential problem here; consider a race between two messages, link A to B and B Created. If the B Created message happens to arrive first, then everything links up as expected. If B Created happens to arrive second, then the link doesn't happen. In short, you have a business behavior that depends on your message plumbing.
Udi Dahan, 2010
A microsecond difference in timing shouldn’t make a difference to core business behaviors.
A potential disadvantage for solution 2 could maybe be the added complexity of projecting these events in the read model, especially if more microservices and aggregates following the same pattern are added to the system.
I don't like that complexity at all; it sounds like a lot of work for not very much business value.
Exception Reports might be a viable alternative. Greg Young talked about this in 2016. In short: having a monitor that detects inconsistent states, plus a process for remediating those states, may be enough.
Adding automated remediation comes later. Rinat Abdullin described this progression really well.
The automated version ends up looking something like solution 2; but with separation of the responsibilities -- the remediation logic lives outside of microservice A and B.
Your solutions seem OK but there are some things that need to be clarified:
In DDD, aggregates are consistency boundaries. An Aggregate is always in a consistent state, no matter what command it receives and whether that command succeeds or not. But this does not mean that the whole system is in a permitted state from the business point of view. There are moments when the system as a whole is in a non-permitted state. This is OK as long as it eventually transitions into a permitted state. Here come the Saga/Process managers. This is exactly their role: to bring the system into a valid state. They can be deployed as separate microservices.
One other type of component/pattern that I have used in my CQRS projects is the eventually-consistent command validator. It validates a command (and rejects it if it is not valid) before the command reaches the Aggregate, using a private read model. These components minimize the situations in which the system enters an invalid state, and they complement the Sagas. They should be deployed inside the microservice that contains the Aggregate, as a layer on top of the domain layer (the aggregate).
Now, back to Earth. Your solutions are a combination of Aggregates, Sagas and Eventually-consistent command validations.
Solution 1
Microservice A only allows linking A to B if it has previously received a "B created" event and no "B deleted" event.
Microservice A listens to "B deleted" events and, upon receiving such an event, unlinks A from B.
In this architecture, Microservice A contains Aggregate A and a command validator, and Microservice B contains Aggregate B and a Saga. It is important to understand here that the validator would not prevent the system from entering an invalid state; it would only reduce the probability.
Solution 2:
Microservice A always allows linking A to B.
Microservice B listens for "A linked to B" events and, upon receiving such an event, verifies that B exists. If it doesn't, it emits a "link to B refused" event.
Microservice A listens for "B deleted" and "link to B refused" events and, upon receiving such an event, unlinks A from B.
In this architecture, Microservice A contains Aggregate A and a Saga, and Microservice B contains Aggregate B and also a Saga. This solution could be simplified if the Saga on B verified the existence of B and sent an Unlink B from A command to A instead of emitting an event.
In any case, in order to apply the SRP, you could extract the Sagas into their own microservices. In that case you would have one microservice per Aggregate and per Saga.
I will start with the same premise as #ConstantinGalbenu but follow with a different proposition ;)
Eventual consistency means that the whole system will eventually converge to a consistent state.
If you add to that "no matter the order in which messages are received", you've got a very strong statement by which your system will naturally tend to an ultimate coherent state without the help of an external process manager/saga.
If you make a maximum number of operations commutative from the receiver's perspective, e.g. it doesn't matter if link A to B arrives before or after create A (they both lead to the same resulting state), you're pretty much there. That's basically the first bullet point of Solution 2 generalized to a maximum of events, but not the second bullet point.
Microservice B listens for "A linked to B" events and, upon receiving such an event, verifies that B exists. If it doesn't, it emits a "link to B refused" event.
You don't need to do this in the nominal case. You'd do it when you know that A didn't receive a "B deleted" message, but then it shouldn't be part of your normal business process; that's delivery-failure management at the messaging-platform level. I wouldn't have the microservice that owns the original data systematically double-check everything, because things get way too complex. It looks as if you're trying to put some immediate consistency back into an eventually consistent setup.
That solution might not always be feasible, but at least from the point of view of a passive read model that doesn't emit events in response to other events, I can't think of a case where you couldn't manage to handle all events in a commutative way.

Is there some algorithm for R/W lock graphs?

Suppose we have resources A, B, C and their dependencies, which are not cyclic:
B->A
C->A
This means B strongly depends on A and C strongly depends on A. For example, B and C are resources precomputed from A. So if A updates, B and C should be updated too. But if B is updated, nothing changes except B.
And now the problem: given that each node of the graph can be accessed for Read, Write, or Read/Upgrade-to-Write in a multi-threaded manner, how is one supposed to manage locks in such a graph? Is there a generalization of this problem?
Update
Sorry for the unclear question. Here is one more very important thing:
If, for example, A changes and forces B and C to be updated, then the moment B and its dependencies finish updating, the write lock should be released.
Your question is a blend of transaction - locking - concurrency - conflict resolution. Therefore models used in relational databases might serve your purpose.
There are many methods defined for concurrency control.
In your case some might apply, depending on how optimistic or pessimistic your algorithm needs to be, how many reads or writes there are, and how much data is involved per transaction.
I can think of the two methods that can help in your case:
1. Strict Two-Phase Locking (S2PL or SS2PL)
A transaction begins, and locks on A, B, C are obtained and kept until the end of the transaction. Because multiple locks are held until the transaction ends, a deadlock condition might be encountered while acquiring them. Locks can change (for example, be upgraded) during the transaction.
This approach is serializable, meaning that all events come in order and no other party can make any changes while the transaction holds its locks.
It is also pessimistic: locks might be held for a good amount of time, so resources and time will be spent.
2. Multiversion
Instead of placing locks on A, B, C, maintain version numbers and create a snapshot of each. All changes are made to the snapshots. At the end, the snapshots replace the previous versions. If any version of A, B, or C has changed in the meantime, an error condition occurs and the changes are discarded.
This approach places no read or write locks, meaning it will be fast. But in case of a conflict, i.e. if any version has changed in the interim, the work is discarded.
It is optimistic, but may spend many more resources in favor of speed.
Transaction log
In database systems there is also the concept of a "transaction log". Every transaction, whether completed or pending, is present in the transaction log. So every operation performed by any of the above methods is first written to the transaction log. Operations from the log are materialized in the main store at the right moment. In case of failure, the log is analyzed, completed transactions are materialized to the main store, and pending ones are simply discarded.
This is also used in "log shipping", to ship the log to other servers for replication.
Known Implementations
There are multiple in-memory databases that might save you the hassle of implementing your own solution.
H2 also provides a serializable isolation level that can match your use case.
go-memdb provides multiversion concurrency. It uses an immutable radix tree algorithm, so you can also look into that if you want to build your own solution.
Many more are defined here.
I am not aware of a specific pattern here; so my solution would go like this:
First of all, I would reverse the edges in your graph. You don't care that A is a dependency of B; it's the other direction that tells you what you have to lock:
A->B
A->C
Because now you can say: if I want to do X on A, I need the X lock on A and on every object depending on A.
And now you can go: inspect A and the objects depending on A, and so forth, to determine the set of objects you need an X lock on.
Regarding your comment "Because X in this case is either Read or UpgradedWrite, and if A needs Write it doesn't necessarily mean that B needs it too": for me that translates to "the whole graph idea doesn't help". Such a graph is only useful to express direct relations, such as "if A, then B". If there is an edge between A and B, that means you want to treat them the same way. If your objects might or might not need to be write-locked together, what would be the point of the graph? You would end up with a lot of effectively independent objects, where a write to A sometimes needs a write lock on something else and sometimes not.

Concurrent access to msg/payload on split flows in node-red?

I'm working on a node-red flow and came upon a (hopefully not real) concurrency problem.
I have a node outputting a msg.payload that, on one connection, is written to a database. The database insert node is a dead end, so another connection from the first node goes to a function node that overwrites msg.payload again, as needed for the HTTP reply.
I'm wondering about the order of execution in this case, or rather the protection against the database node accessing the modified msg.payload if it runs after the function node.
This obviously seems to work, but I would like to know whether that is just chance, or whether the msg object is cloned before each function, or on multiple outputs?
There is no concurrency issue, as Node-RED is single threaded (as are all NodeJS apps), so only one leg of a branching flow can actually execute at any given time.
Branch execution order is the order in which the nodes were added to the flow, so assuming the nodes were added in the order A, B, C, D, E:
A ----> B ---> D ---> E
|
--> C
The message will be delivered from A to B to D to E, then to C (assuming that none of B, D, E block for I/O).
Also, messages are cloned when multiple nodes are hooked up to a single output; you can easily test this with the following flow:
[{"id":"9fd37544.36664","type":"inject","z":"8b231c78.b8edc8","name":"","topic":"","payload":"foo","payloadType":"str","repeat":"","crontab":"","once":false,"x":269.5,"y":284.25,"wires":[["48eda9a0.b455e8","e1f3c665.9af04"]]},{"id":"48eda9a0.b455e8","type":"function","z":"8b231c78.b8edc8","name":"","func":"msg.payload = \"bar\";\nreturn msg;","outputs":1,"noerr":0,"x":454.5,"y":283.75,"wires":[["5f27ffc7.a54ce"]]},{"id":"5f27ffc7.a54ce","type":"debug","z":"8b231c78.b8edc8","name":"","active":true,"console":"false","complete":"false","x":635.5,"y":284.5,"wires":[]},{"id":"e1f3c665.9af04","type":"debug","z":"8b231c78.b8edc8","name":"","active":true,"console":"false","complete":"false","x":475.5,"y":362.5,"wires":[]}]
This has a single input that flows to two debug outputs; the first branch includes a function node which modifies the payload before it is output.

Managing dynamic conditional dependencies with generated state machines?

Greetings SO denizens!
I'm trying to architect an overhaul of an existing NodeJS application that has outgrown its original design. The solutions I'm working towards are well beyond my experience.
The system has ~50 unique async tasks defined as various finite state machines which it knows how to perform. Each task has a required set of parameters to begin execution which may be supplied by interactive prompts, a database or from the results of a previously completed async task.
I have a UI where the user may define a directed graph ("the flow"), specifying which tasks they want to run and the order they want to execute them in with additional properties associated with both the vertices and edges such as extra conditionals to evaluate before calling a child task(s). This information is stored in a third normal form PostgreSQL database as a "parent + child + property value" configuration which seems to work fairly well.
Because of the sheer number of permutations, conditionals, and an absurd number of possible points of failure, I'm leaning towards expressing "the flow" as a state machine. I have just enough knowledge of graph theory and state machines to implement them, but practically zero background beyond that.
What I'm trying to accomplish, at flow run time after user input for the root services has been received, is to somehow compile the database representation of the graph + properties into a state machine of some variety.
To further complicate the matter in the near future I would like to be able to "pause" a flow, save its state to memory, load it on another worker some time in the future and resume execution.
I think I'm close to a viable solution but if one of you kind souls would take mercy on a blind fool and point me in the right direction I'd be forever in your debt.
I solved a similar problem a few years ago in my bachelor and diploma theses. I designed the Cascade, an executable structure which forms a growing acyclic oriented graph. You can read about it in my paper "Self-generating Programs – Cascade of the Blocks".
The basic idea is that each block has inputs and outputs. Initially, some blocks are inserted into the cascade and their inputs are connected to the outputs of other blocks to form an acyclic graph. When a block is executed, it reads its inputs (the cascade passes values from the connected outputs) and then sets its outputs. It can also insert additional blocks into the cascade and connect their inputs to the outputs of blocks already present. This corresponds to one of your tasks starting another task and passing parameters to it. An alternative to setting an output to a value is forwarding a value from another output (in your case, waiting for the result of some other task, which makes it possible to launch helper sub-tasks).
