TON supports multiple workchains - which one should I use in my code?

According to the TON whitepaper, the TON blockchain network supports multiple chains.
What are the differences between them, and which workchain should I use when deploying contracts or reading data from contracts?
When deploying my contract I must specify which workchain I'm working on, and I'm not sure which value to put there:
import { contractAddress } from "ton";
const workchain = ?;
const newContractAddress = contractAddress({ workchain, initialData: initDataCell, initialCode: initCodeCell });

TLDR
For regular user work, always use workchain 0 - the workchain with workchain_id = 0. In the snippet above, that means const workchain = 0;
What are the different chains in TON?
One masterchain - the special unique chain with workchain_id = -1
It is mostly used by network validators for running the PoS election contracts; regular users don't normally send transactions on this chain.
Up to 2^32 workchains - today there's only one, with workchain_id = 0, but possibly more in the future
99.9% of user transactions on TON take place on workchain 0; this is where you should work unless you know exactly what you're doing.
Up to 2^60 shardchains per workchain (they all share that workchain's workchain_id)
This is an internal implementation detail of TON's infinite sharding (autoscaling). If a workchain is under heavy load, it is automatically split into two shardchains, and when the load drops they are merged back. You would normally not care about this; it happens under the hood. When you deploy contracts or send transactions, you don't need to specify the shardchain you're working on - it's calculated by the system automatically.

Related

DDD - How to modify several ARs (from different bounded contexts) in a single request?

I would like to present a small scenario, still at the paper stage, which seems a bit tedious to accomplish under DDD principles.
Let's say I have an application for hosting accounts management. Basically, the application comprises several bounded contexts such as web accounts management, FTP accounts management, mail accounts management... each of them represented by its own AR (they can live standalone).
Now, let's imagine I want to provide a UI with an HTML form that composes one fieldset for each bounded context, for instance to update limits and/or features. How exactly should I proceed to update all the ARs without breaking the single-transaction-per-request principle? Can I create a kind of "outer" AR, say a ClientHostingProperties AR, which would hold references to the other ARs and update them as part of a single transaction, using its own repository? Or should I instead create an AR that emits messages that listeners provided by the bounded contexts can react to, in which case I should probably think about ES?
Thanks.
How exactly should I proceed to update all the ARs without breaking the single-transaction-per-request principle?
You are probably looking for a process manager.
Basic sketch: persisting the details from the submitted form is a transaction unto itself (you are offered an opportunity to accrue business value; step 1 is to capture that opportunity).
That gives you a way to keep track of whether or not this task is "done": you compare the changes in the task to the state of the system, and fire off commands (to run in isolated transactions) to make changes.
Processes, in my mind, end up looking a lot like state machines: these commands are done, these commands are not done, these commands have failed - now what? Eventually the process reaches a state where there are no additional changes to be made, and this instance of the process is "done".
Short answer: You don't.
An aggregate is a transactional boundary, which means that if you want to update multiple aggregates in one "action", you have to use multiple transactions. The reason an aggregate is equivalent to one transaction is that this allows you to guarantee consistency.
This means that you have two options:
You can make your aggregate larger. Then you can actually guarantee consistency, but your ability to handle concurrent requests gets worse. So this is usually what you want to avoid.
You can live with the fact that it's two transactions, which means you are eventually consistent. If so, you usually use something such as a process manager or a flow to handle updating multiple aggregates (see the sketch below). In its simplest form, a flow is nothing but a simple "if this event happens, run that command" rule. In its more complex form, it has its own state.
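For illustration, a minimal C# sketch of such a flow rule for the hosting-accounts scenario from the question; the event, command, and bus abstractions are all hypothetical:

using System;
using System.Threading.Tasks;

// A flow in its simplest form: when this event happens, send that command.
public record WebAccountLimitsChanged(Guid ClientId, int NewLimit);
public record UpdateFtpAccountLimits(Guid ClientId, int NewLimit);

public interface ICommandBus
{
    Task SendAsync(object command);
}

public class AccountLimitsFlow
{
    private readonly ICommandBus bus;

    public AccountLimitsFlow(ICommandBus bus) => this.bus = bus;

    // Each command is handled in its own transaction inside its own bounded
    // context, so the system as a whole is eventually consistent.
    public Task When(WebAccountLimitsChanged e) =>
        bus.SendAsync(new UpdateFtpAccountLimits(e.ClientId, e.NewLimit));
}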
Hope this helps 😊

Managing Azure Search 503

Is there a well-known pattern to manage ServiceUnavailable (503) responses?
I'm NOT so happy with something like this:
catch (CloudException e) when (System.Net.HttpStatusCode.ServiceUnavailable.Equals(e.Response?.StatusCode))
{
    var howMuchWait = TimeSpan.FromMinutes(1);
    if (e.Response.Headers.TryGetValue("Retry-After", out var hValue))
    {
        if (RetryConditionHeaderValue.TryParse(hValue.FirstOrDefault(), out var time) && time.Delta.HasValue)
        {
            howMuchWait = time.Delta.Value;
        }
    }
    logger.LogWarning(() => $"Service Unavailable... Let him rest a bit, I will wait for {howMuchWait}.");
    await Task.Delay(howMuchWait);
    return indexEntities.Select(x => (string)x[indexKeyFieldName]).ToList();
}
This code simply delays the current call but does not prevent calls from other threads.
Now I'm implementing something different using a Stopwatch, and I would like to know if there is a well-known pattern.
NOTE: everything is done using the SDK.
When managing 503s, it is best to implement an incremental back-off mechanism rather than a fixed time delay - for example, start with 1 second, then 2, then 4, then 8, and so on. The key reason for this is that if you are sending a lot of work to Azure Search, it is best to let the service catch up rather than continually trying to send more work to it.
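For example, a minimal sketch of that back-off idea as a hypothetical helper (the CloudException handling mirrors the asker's snippet; the attempt count and delays are illustrative):

using System;
using System.Threading.Tasks;
using Microsoft.Rest.Azure;

public static class SearchRetry
{
    // Retry an indexing call with exponential back-off: 1s, 2s, 4s, 8s...
    public static async Task<T> WithBackoffAsync<T>(Func<Task<T>> action, int maxAttempts = 5)
    {
        var delay = TimeSpan.FromSeconds(1);
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (CloudException e) when (
                e.Response?.StatusCode == System.Net.HttpStatusCode.ServiceUnavailable
                && attempt < maxAttempts)
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2); // let the service catch up
            }
        }
    }
}

A caller would wrap each batch, e.g. var keys = await SearchRetry.WithBackoffAsync(() => UploadBatchAsync(batch)); where UploadBatchAsync is whatever method performs the indexing call.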
By the way, for others reading this: sometimes you might think it is a good idea to add partitions while you are seeing these 503s (typically during mass uploads) to provide more resources, when in fact this can cause more 503s, because configuring the partitions is more work the service needs to do. If you do think you need more partitions, it is best to add them before you do the work and then scale down afterwards, if needed.
Also, another side note: if you use the Azure Search .NET SDK, retry is already integrated. Bruce has some good information on this here: Azure Search RetryPolicy

Difference between passing control to a different program using return() and calling a program using xctl

If I have, say, 2 screens: the first is the prompt screen which asks for, say, some record key, and the next screen displays the information about the record.
Now when I want to transfer control to the second screen (after doing the job of the 1st screen), I can do that by:
exec cics
    return(trans-id)
    commarea(ws-commarea)
end exec.
where trans-id is that of the 2nd screen.
Then what is the need for a calling function such as xctl when we already have return() available in CICS?
Using XCTL or LINK or dynamic CALLs confines your processing to one CICS transaction.
If you so desire, you can design your application to spread different business functions across multiple transactions, passing data with a commarea.
Historically this wasn't done for a number of reasons. Thirty years ago, some CICS Systems Programmers felt transaction IDs were a limited resource and encouraged application designers to keep processing to the minimum number of transactions possible.
Security in CICS is handled at the transaction level, so your user must have authority to execute all transactions that comprise the business function they must perform.
Resources such as temporary storage queues are often named in part using the transaction ID to differentiate and keep them separate.
Prior to CICS TS version 2 (I think) the data to be shared between those transactions was limited to the size of a commarea (32K). All supported versions of CICS now have channels and containers, allowing you to pass significantly larger amounts of data.
My experience is that it is simpler to code and easier to maintain pseudo-conversational transactions with screen interactions if the code is all in one transaction. You really want your transactions to be pseudo-conversational or non-conversational. I believe this to be the overriding reason you see transactions designed to use XCTL, LINK, or dynamic CALLs.
XCTL also doesn't allow dynamic routing (you always stay in the same CICS region), and it is one-way only. A pseudo-conversational RETURN as above will let the user update the screen, and only when they press an Attention Identifier (such as Enter) will the next program run; XCTL will run immediately.

How to avoid concurrency issues when scaling writes horizontally?

Assume there is a worker service that receives messages from a queue, reads the product with the specified ID from a document database, applies some manipulation logic based on the message, and finally writes the updated product back to the database.
This work can safely be done in parallel when dealing with different products, so we can scale horizontally. However, if more than one service instance works on the same product, we might end up with concurrency issues, or concurrency exceptions from the database, in which case we should apply some retry logic (and still the retry might fail again, and so on).
Question: How do we avoid this? Is there a way I can ensure two instances are not working on the same product?
Example/Use case: An online store has a great sale on productA, productB and productC that ends in an hour, and hundreds of customers are buying. For each purchase, a message is enqueued (productId, numberOfItems, price). Goal: how can we run three instances of our worker service and make sure that all messages for productA end up in instanceA, productB in instanceB, and productC in instanceC (resulting in no concurrency issues)?
Notes: My service is written in C#, hosted on Azure as a Worker Role; I use Azure Queues for messaging, and I'm thinking of using Mongo for storage. Also, the entity IDs are GUIDs.
It's more about the technique/design, so if you use different tools to solve the problem I'm still interested.
Any solution attempting to divide the load upon different items in the same collection (like orders) is doomed to fail. The reason is that if you have a high rate of transactions flowing, you'll have to start doing one of the following things:
let the nodes talk to each other ("hey guys, is anyone working on this?")
divide the ID generation into segments (node A creates IDs 1-1000, node B 1001-1999, etc.) and then just let each node deal with its own segment
dynamically divide the collection into segments (and let each node handle a segment)
So what's wrong with those approaches?
The first approach is simply replicating transactions in a database. Unless you can spend a large amount of time optimizing the strategy, it's better to rely on transactions.
The other two options will decrease performance, as you have to dynamically route messages based on IDs and also change the strategy at run-time to include newly inserted messages. It will fail eventually.
Solutions
Here are two solutions that you can also combine.
Retry automatically
Say you have an entry point somewhere that reads from the message queue.
In it you have something like this:
while (true)
{
    var message = queue.Read();
    Process(message);
}
What you could do instead to get very simple fault tolerance is to retry upon failure:
while (true)
{
    for (var i = 0; i < 3; i++)
    {
        try
        {
            var message = queue.Read();
            Process(message);
            break; // exit for loop
        }
        catch (Exception ex)
        {
            // log
            // no throw = the for loop runs the next attempt
        }
    }
}
You could of course just catch DB exceptions (or rather transaction failures) to replay just those messages.
Microservices
I know, microservices is a buzzword. But in this case it's a great solution. Instead of having a monolithic core which processes all messages, divide the application into smaller parts. Or, in your case, just deactivate the processing of certain types of messages.
If you have five nodes running your application, you can make sure that node A receives messages related to orders, node B receives messages related to shipping, etc.
By doing so you can still scale your application horizontally, you get no conflicts, and it requires little effort (a few more message queues and reconfiguring each node).
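A minimal sketch of that setup with Azure Storage queues (the queue names, the STORAGE_CONNECTION_STRING variable, and the Process handler are hypothetical; each node is started with the name of the one queue it owns):

using System;
using System.Threading.Tasks;
using Azure.Storage.Queues;

public class Worker
{
    public static async Task Main(string[] args)
    {
        // Node A is started with "orders", node B with "shipping", and so on.
        var queueName = args[0];
        var queue = new QueueClient(
            Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING"), queueName);

        while (true)
        {
            var messages = (await queue.ReceiveMessagesAsync(maxMessages: 16)).Value;
            if (messages.Length == 0)
            {
                await Task.Delay(TimeSpan.FromSeconds(1));
                continue;
            }
            foreach (var message in messages)
            {
                Process(message.Body.ToString()); // only this node handles this message type
                await queue.DeleteMessageAsync(message.MessageId, message.PopReceipt);
            }
        }
    }

    private static void Process(string body)
    {
        // handle one type of message here
    }
}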
For this kind of thing I use blob leases. Basically, I create a blob named after the ID of the entity in some known storage account. When worker 1 picks up the entity, it tries to acquire a lease on the blob (and to create the blob itself, if it doesn't exist). If it is successful in doing both, then I allow the processing of the message to occur. Always release the lease afterwards.
If I am not successful, I dump the message back onto the queue.
I follow the approach originally described by Steve Marx here http://blog.smarx.com/posts/managing-concurrency-in-windows-azure-with-leases, although tweaked to use the new storage libraries.
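A minimal sketch of that lease pattern with the current Azure.Storage.Blobs library (the locks/ blob layout and the 30-second lease duration are illustrative assumptions):

using System;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

public static class EntityLock
{
    // Try to take an exclusive lease on a per-entity blob. Returns the lease
    // id on success, or null if another worker is already processing the entity.
    public static async Task<string> TryAcquireAsync(BlobContainerClient container, Guid entityId)
    {
        var blob = container.GetBlobClient($"locks/{entityId}");
        try
        {
            // Create the blob if it doesn't exist yet (409 = it already does).
            await blob.UploadAsync(BinaryData.FromString(""), overwrite: false);
        }
        catch (RequestFailedException e) when (e.Status == 409)
        {
        }

        try
        {
            var lease = await blob.GetBlobLeaseClient().AcquireAsync(TimeSpan.FromSeconds(30));
            return lease.Value.LeaseId; // release via GetBlobLeaseClient(leaseId).ReleaseAsync() when done
        }
        catch (RequestFailedException e) when (e.Status == 409)
        {
            return null; // lease already held: dump the message back onto the queue
        }
    }
}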
Edit after comments:
If you have a potentially high rate of messages all talking to the same entity (as your comment implies), I would redesign your approach somewhere: either the entity structure or the messaging structure.
For example: consider the CQRS design pattern and store the changes from the processing of every message independently. The product entity then becomes an aggregate of all the changes made to it by the various workers, sequentially re-applied and rehydrated into a single object.
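A minimal sketch of that idea - the ProductChange event shape and the stock arithmetic are hypothetical:

using System;
using System.Collections.Generic;

// Each worker appends an immutable change instead of mutating the product row.
public record ProductChange(Guid ProductId, int QuantityDelta, DateTime At);

public class Product
{
    public Guid Id { get; private set; }
    public int Stock { get; private set; }

    // Rehydrate the current state by re-applying all changes in order.
    public static Product FromChanges(Guid id, IEnumerable<ProductChange> changes)
    {
        var product = new Product { Id = id };
        foreach (var change in changes)
            product.Stock += change.QuantityDelta;
        return product;
    }
}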
If you want the database to always be up to date and always consistent with the already-processed units, then you have several updates on the same mutable entity.
In order to comply with this, you need to serialize the updates for the same entity: either you partition your data at the producers, or you accumulate the events for an entity on the same queue, or you lock the entity in the worker using a distributed lock, or you take a lock at the database level.
You could use an actor model (in the Java/Scala world, using Akka) that creates a message queue for each entity or group of entities and processes them serially.
UPDATED
You can try Akka.NET, a port of Akka to .NET.
You can also find nice tutorials with samples about using Akka in Scala.
But for the general principles you should search more about the actor model. It has drawbacks nevertheless.
In the end it comes down to partitioning your data and the ability to create a unique, specialized worker (one that can be reused and/or restarted in case of failure) for a specific entity.
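For a concrete picture, a minimal Akka.NET sketch of one actor per entity (the PurchaseMessage shape, the initial stock, and the actor naming are hypothetical); an actor's mailbox processes messages one at a time, which is exactly the per-entity serialization described above:

using System;
using Akka.Actor;

public record PurchaseMessage(Guid ProductId, int NumberOfItems, decimal Price);

// One actor per product: its mailbox serializes all updates to that product.
public class ProductActor : ReceiveActor
{
    private int stock = 100;

    public ProductActor()
    {
        Receive<PurchaseMessage>(msg =>
        {
            stock -= msg.NumberOfItems; // no lock needed: messages arrive one at a time
        });
    }
}

public static class Program
{
    public static void Main()
    {
        var system = ActorSystem.Create("store");
        var productId = Guid.NewGuid();

        // Route every message for the same product to the same actor instance.
        var product = system.ActorOf(Props.Create(() => new ProductActor()), $"product-{productId}");
        product.Tell(new PurchaseMessage(productId, 2, 9.99m));
    }
}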
I assume you have a means to safely access the product queue across all worker services. Given that, one simple way to avoid conflicts could be to use a global queue per product next to the main queue:
// Queue[X] is the queue for product X
// QueueMain is the main queue
DoWork(ProductType X)
{
    if (Queue[X].empty())
    {
        product = QueueMain.pop()
        if (product.type != X)
        {
            Queue[product.type].push(product)
            return;
        }
    }
    else
    {
        product = Queue[X].pop()
    }
    // process product...
}
Access to the queues needs to be atomic.
You should use a session-enabled Service Bus queue for ordering and concurrency.
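A minimal sketch with the Azure.Messaging.ServiceBus library, assuming a queue named purchases that was created with sessions enabled; using the product ID as the session ID means all messages for one product are delivered to one consumer, in order:

using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class SessionExample
{
    public static async Task Main()
    {
        await using var client = new ServiceBusClient(
            Environment.GetEnvironmentVariable("SERVICEBUS_CONNECTION_STRING"));

        // Producer: the session id groups all messages for one product.
        var sender = client.CreateSender("purchases");
        var productId = Guid.NewGuid();
        await sender.SendMessageAsync(new ServiceBusMessage("2 items @ 9.99")
        {
            SessionId = productId.ToString()
        });

        // Consumer: a session is locked to one processor at a time, so messages
        // for the same product are handled serially.
        var processor = client.CreateSessionProcessor("purchases");
        processor.ProcessMessageAsync += async args =>
        {
            Console.WriteLine($"{args.Message.SessionId}: {args.Message.Body}");
            await args.CompleteMessageAsync(args.Message);
        };
        processor.ProcessErrorAsync += _ => Task.CompletedTask;

        await processor.StartProcessingAsync();
        await Task.Delay(TimeSpan.FromSeconds(10)); // let it drain for a bit
        await processor.StopProcessingAsync();
    }
}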
1) Every high-scale data solution that I can think of has something built in to handle precisely this sort of conflict. The details will depend on your final choice of data storage. In the case of a traditional relational database, this comes baked in without any additional work on your part. Refer to your chosen technology's documentation for the appropriate details.
2) Understand your data model and usage patterns. Design your datastore appropriately. Don't design for scale that you won't have. Optimize for your most common usage patterns.
3) Challenge your assumptions. Do you actually have to mutate the same entity very frequently from multiple roles? Sometimes the answer is yes, but often you can simply create a new, similar entity to reflect the update, i.e. take a journaling/logging approach instead of a single-entity approach. Ultimately, high volumes of updates on a single entity will never scale.

Strategy to handle race conditions with regards to web application backend?

I have often been asked questions in interviews regarding race conditions in web applications like movie ticket or travel websites.
The question is something like this:
Say for a bus or plane ticket website, there is only one seat left. Two (or, in an extreme scenario, many) users on different computers log into the website at the same time and see that one seat is left. They both go ahead, select that seat, and place the order.
Now there are two requests we have to handle. For the first request, we will book the ticket, but for the second request, we have to throw an error of sorts and show an error message to the end user saying the seat is not available.
Say the database schema is something like this:
bus_id, seat_id, is_taken
So for the first request, we set is_taken = 1 for the corresponding bus_id, seat_id. Then for the second request, there won't be any seat_id with is_taken = 0, so we won't book the ticket.
But here, in my opinion, we have put a restriction that only one request can be handled at a time; the second request can be handled only after the first request has been completed.
However, that is not practical, since we might have a huge website with loads of traffic and the application running on several servers in parallel. We have to process requests in parallel.
Since I don't have much experience with handling race conditions in these sorts of multi-threaded web applications, I can't quite figure out the right way to solve this.
What is the right (even if basic) approach/design pattern to tackle these scenarios?
Web applications are necessarily multithreaded. There are two ways of solving this.
Application level (Not preferred)
I am not sure which programming language you are using to build the application, but every programming language used for building websites has something like "synchronize", which allows you to prevent two threads from accessing the same block of code simultaneously.
This is not preferred because the solution is not horizontally scalable: when you decide to increase capacity by running one more instance of your web application, this solution fails terribly.
Database level
This is the preferred solution. You obtain a lock on the record in the database before you update it.
SQL provides an option to select a record for update:
SELECT * FROM BUS_SEATS WHERE BUS_ID = 1 FOR UPDATE;
The SQL above is one way to obtain a lock. Nearly all databases provide this kind of feature. With it you can lock the required row, do the update, and ensure consistency in the database.
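A minimal sketch of that pessimistic-locking flow from application code, assuming PostgreSQL with the Npgsql driver and the bus_seats schema from the question:

using System.Threading.Tasks;
using Npgsql;

public static class SeatBooking
{
    // SELECT ... FOR UPDATE locks the row until the transaction ends, so two
    // parallel requests for the same seat are serialized by the database.
    public static async Task<bool> TryBookSeatAsync(string connectionString, int busId, int seatId)
    {
        await using var conn = new NpgsqlConnection(connectionString);
        await conn.OpenAsync();
        await using var tx = await conn.BeginTransactionAsync();

        await using var select = new NpgsqlCommand(
            "SELECT is_taken FROM bus_seats WHERE bus_id = @bus AND seat_id = @seat FOR UPDATE",
            conn, tx);
        select.Parameters.AddWithValue("bus", busId);
        select.Parameters.AddWithValue("seat", seatId);
        var isTaken = (int?)await select.ExecuteScalarAsync();

        if (isTaken != 0) // no such seat, or it is already taken
        {
            await tx.RollbackAsync();
            return false;
        }

        await using var update = new NpgsqlCommand(
            "UPDATE bus_seats SET is_taken = 1 WHERE bus_id = @bus AND seat_id = @seat",
            conn, tx);
        update.Parameters.AddWithValue("bus", busId);
        update.Parameters.AddWithValue("seat", seatId);
        await update.ExecuteNonQueryAsync();

        await tx.CommitAsync();
        return true;
    }
}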
At some point, there has to be some sort of synchronization.
Since you're using a database, which is usually the bottleneck anyway, you might as well let it handle the race condition.
All you have to do is update the row atomically. The requests can still be handled in parallel by the application.
SQL pseudocode:
DECLARE @success bit = 0;
UPDATE bus_seats
SET is_taken = 1, @success = 1
WHERE seat_id = @seat_id AND is_taken = 0;
RETURN @success;
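In application code, the same idea is usually expressed by checking the number of affected rows - a minimal sketch, again assuming PostgreSQL with the Npgsql driver and the bus_seats schema above:

using System.Threading.Tasks;
using Npgsql;

public static class AtomicBooking
{
    // The UPDATE matches only while the seat is still free, so the
    // affected-row count tells us whether we won the race.
    public static async Task<bool> TryBookSeatAsync(string connectionString, int busId, int seatId)
    {
        await using var conn = new NpgsqlConnection(connectionString);
        await conn.OpenAsync();

        await using var cmd = new NpgsqlCommand(
            "UPDATE bus_seats SET is_taken = 1 " +
            "WHERE bus_id = @bus AND seat_id = @seat AND is_taken = 0", conn);
        cmd.Parameters.AddWithValue("bus", busId);
        cmd.Parameters.AddWithValue("seat", seatId);

        var rowsAffected = await cmd.ExecuteNonQueryAsync();
        return rowsAffected == 1; // 0 means someone else took the seat first
    }
}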
