Is this possible using Reactive Framework?

Is this possible using Reactive Framework? - c#-4.0

I have a list of objects in my C# 4.0 app. Suppose this list contains 100 objects of student class. Is there any way in Reactive Framework to parallel execute 10 objects each at a time?
Each student object runs a method which is some what time consuming for about 10 to 15 seconds. So the first time through, take the first 10 student objects from the list and wait for all the 10 student objects to finish its work and then take next 10 student objects and so on until it completes the full items in the lists?
I have a List<Student> with 100 count.
First take 10 items from the lists and calls each object's long run method in parallel.
Receives each objects return value and update the UI [subscription part].
Next round starts only if the first 10 rounds completes and releases all the memory.
Repeat the same process for all the items in the lists.
How to catch the errors in each process ??
How to release each student object's resources and other resources from memory ?
Which is the best way to do all these things in Reactive Framework ?

This version will always have 10 students running at a time. As a student finishes, another will start. And as each student finishes, you can handle any error it had and then clean it up (this will happen before the next student starts running).
students
.ToObservable()
.Select(student => Observable.Defer(() => Observable.Start(() =>
{
// do the work for this student, then return a Tuple of the student plus any error
try
{
student.DoWork();
return { Student = student, Error = (Exception)null };
}
catch (Exception e)
{
return { Student = student, Error = e };
}
})))
.Merge(10) // let 10 students be executing in parallel at all times
.Subscribe(studentResult =>
{
if (studentResult.Error != null)
{
// handle error
}
studentResult.Student.Dispose(); // if your Student is IDisposable and you need to free it up.
});
This is not exactly what you asked since it does not finish the first batch of 10 before starting the next batch. This always keeps 10 running in parallel. If you really want batches of 10 I'll adjust the code for that.

My attempt....
var students = new List<Student>();
{....}
var cancel = students
.ToObservable(Scheduler.Default)
.Window(10)
.Merge(1)
.Subscribe(tenStudents =>
{
tenStudents.ObserveOn(Scheduler.Default)
.Do(x => DoSomeWork(x))
.ObserverOnDispatcher()
.Do(tenStudents => UpdateUI(tenStudents))
.Subscribe();
});

This to me sounds very much like a problem for TPL. You have a known set of data at rest. You want to partition up some heavy processing to run in parallel and you want to be able to batch process the load.
I don't see anywhere in your problem a source that is async, a source that is data in motion, or a consumer that needs to be reactive. This is my rationale for suggesting that you use TPL instead.
On a separate note, why the magic number of 10 to be processed in parallel? Is this a business requirement, or potentially an attempt to optimize performance? Normally it is best practice to allow the TaskPool to work out what is best for the client CPU based in the number of cores and current load. I imagine this becomes ever more important with the large variations in Devices and their CPU structures (Single Core, Multi Core, Many Core, low power/disabled cores etc).
Here is one way you could do it in LinqPad (but note the lack of Rx)
void Main()
{
var source = new List<Item>();
for (int i = 0; i < 100; i++){source.Add(new Item(i));}
//Put into batches of ten, but only then pass on the item, not the temporary tuple construct.
var batches = source.Select((item, idx) =>new {item, idx} )
.GroupBy(tuple=>tuple.idx/10, tuple=>tuple.item);
//Process one batch at a time (serially), but process the items of the batch in parallel (concurrently).
foreach (var batch in batches)
{
"Processing batch...".Dump();
var results = batch.AsParallel().Select (item => item.Process());
foreach (var result in results)
{
result.Dump();
}
"Processed batch.".Dump();
}
}
public class Item
{
private static readonly Random _rnd = new Random();
private readonly int _id;
public Item(int id)
{
_id = id;
}
public int Id { get {return _id;} }
public double Process()
{
var threadId = Thread.CurrentThread.ManagedThreadId;
string.Format("Processing on thread:{0}", threadId).Dump(Id);
var loopCount = _rnd.Next(10000,1000000);
Thread.SpinWait(loopCount);
return _rnd.NextDouble();
}
public override string ToString()
{
return string.Format("Item:{0}", _id);
}
}
I would be interested to find out if you do have a data-in-motion problem or a reactive consumer problem, but have just "dumbed down" the question to make it easier to explain.

Related

Acumatica return to function with PXLongOperation

I'm creating an integration for Acumatica that loads data from another application to synchronize inventory items. It uses an API call to get a list (of up to 5000 items) and then I'm using PXLongOperation to insert or update these items. I can't run it without this method as the large batches (aka inserting 5000 stock items) will timeout and crash.
The processing form is a custom table/form that retrieves this information then parses the JSON list of items and calls a custom function on the InventoryItemMaint graph. All that works perfectly, but it never returns to the calling function. I'd love to be able to write information to record to record that it was a success or failure. I've tried PXLongOperation.WaitCompletion but that doesn't seem to change anything. I'm sure I'm not using the asynchronous nature of this correctly but am wondering if there is a reasonable work around.
// This is the lsit of items from SI
List<TEKDTools.TEKdtoolModels.Product> theItems;
if (Guid.TryParse(Convert.ToString(theRow.DtoolsID), out theCatID))
{
// Get the list of items from dtools.
theItems = TEKDTools.TEKdtoolsCommon.ReadOneCatalog(theCatID);
// Start the long operation
PXLongOperation.StartOperation(this, delegate () {
// Create the graph to make a new Stock Item
InventoryItemMaint itemMaint = PXGraph.CreateInstance<InventoryItemMaint>();
var itemMaintExt = itemMaint.GetExtension<InventoryItemMaintTEKExt>();
foreach (TEKDTools.TEKdtoolModels.Product theItem in theItems)
{
itemMaint.Clear();
itemMaintExt.CreateUpdateDToolsItem(theItem, true);
PXLongOperation.WaitCompletion(itemMaint.UID);
}
}
);
}
stopWatch.Stop(); // Just using this to figure out how long things were taking.
// For fun I tried the Wait Completion here too
PXLongOperation.WaitCompletion(this.UID);
theRow = MasterView.Current;
// Tried some random static values to see if it was writing
theRow.RowsCreated = 10;
theRow.RowsUpdated = 11;
theRow.Data2 = "Elasped Milliseconds: " + stopWatch.ElapsedMilliseconds.ToString();
theRow.RunStart = startTime;
theRow.RunEnd = DateTime.Now;
// This never gets the record udpated.
Caches[typeof(TCDtoolsBatch)].Update(theRow);

One possible solution would be to use the PXLongOperation.SetCustomInfo method. Usually this is used to update the UI thread after the long operation has been finished. In this "class" you can subscribe to events which you can use to update rows. The definition of the class is as follows:
public class UpdateUICustomInfo : IPXCustomInfo
{
public void Complete(PXLongRunStatus status, PXGraph graph)
{
// Set Code Here
}
}
The wait completion method you are using, generally is used to wait for another long operation to finish by passing the key of that long operation.

Cosmos DB .NET SDK V3 Query With Paging example needed

I'm struggling to find a code example from MS for the v3 SDK for queries with paging, they provide examples for V2 but that SDK is a completely different code base using the "CreateDocumentQuery" method.
I've tried searching through GitHub here: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos.Samples/Usage/Queries/Program.cs
I believe I'm looking for a method example using continuation tokens, with the assumption that if I cache the previously used continuation tokens in my web app then I can page backwards as well as forwards?
I'm also not quite understanding MS explanation in that MaxItemCount doesn't actually mean it will only try to return X items, but simply limits the No. of items in each search across each partition, confused!
Can anyone point me to the right place for a code example please? I also tried searching through https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-pagination but appears to lead us to the older SDK (V2 I believe)
UPDATE (following comments from Gaurav below)
public async Task<(List<T>, string)> QueryWithPagingAsync(string query, int pageSize, string continuationToken)
{
try
{
Container container = GetContainer();
List<T> entities = new(); // Create a local list of type <T> objects.
QueryDefinition queryDefinition = new QueryDefinition(query);
using FeedIterator<T> resultSetIterator = container.GetItemQueryIterator<T>(
query, // SQL Query passed to this method.
continuationToken, // Value is always null for the first run.
requestOptions: new QueryRequestOptions()
{
// Optional if we already know the partition key value.
// Not relevant here becuase we're passing <T> which could
// be any model class passed to the generic method.
//PartitionKey = new PartitionKey("MyParitionKeyValue"),
// This does not actually limit how many documents are returned if
// what we're querying resides across multiple partitions.
// If we set the value to 1, then control the number of times
// the loop below performs the ReadNextAsync, then we can control
// the number of items we return from this method. I'm not sure
// whether this is best way to go, it seems we'd be calling
// the API X no. times by the number of items to return?
MaxItemCount = 1
});
// Set var i to zero, we'll use this to control the number of iterations in
// the loop, then once i is equal to the pageSize then we exit the loop.
// This allows us to limit the number of documents to return (hope this is the best way to do it)
var i = 0;
while (resultSetIterator.HasMoreResults & i < pageSize)
{
FeedResponse<T> response = await resultSetIterator.ReadNextAsync();
entities.AddRange(response);
continuationToken = response.ContinuationToken;
i++; // Add 1 to var i in each iteration.
}
return (entities, continuationToken);
}
catch (CosmosException ex)
{
//Log.Error($"Entities was not retrieved successfully - error details: {ex.Message}");
if (ex.StatusCode == HttpStatusCode.NotFound)
{
return (null, null);
}
else { throw; }
}
}
The above method is my latest attempt, and whilst I'm able to use and return continuation tokens, the next challenge is how to control the number of items returned from Cosmos. In my environment, you may notice the above method is used in a repo with where we're passing in model classes from different calling methods, therefore hard coding the partition key is not practical and I'm struggling with configuring the number of items returned. The above method is in fact controlling the number of items I am returning to the calling method further up the chain, but I'm worried that my methodology is resulting in multiple calls to Cosmos i.e. if I set the page size to 1000 items, am I making an HTTP call to Cosmos 1000 times?
I was looking at a thread here https://stackoverflow.com/questions/54140814/maxitemcount-feed-options-property-in-cosmos-db-doesnt-work but not sure the answer in that thread is a solution, and given I'm using the V3 SDK, there does not seem to be the "PageSize" parameter available to use in the request options.
However I also found an official Cosmos code sample here: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos.Samples/Usage/Queries/Program.cs#L154-L186 (see example method "QueryItemsInPartitionAsStreams" line 171) and it looks like they have used a similar pattern i.e. setting the MaxItemCount variable to 1 and then controlling the no. of items returned in the loop before exiting. I guess I'd just like to understand better what, if any impact this might have on the RUs and API calls to Cosmos?

Please try the following code. It fetches all documents from a container with a maximum of 100 documents in a single request.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
namespace CosmosDbSQLAPISamples
{
class Program
{
private static string connectionString =
"AccountEndpoint=https://account-name.documents.azure.com:443/;AccountKey=account-key==;";
private static string databaseName = "database-name";
private static string containerName = "container-name";
static async Task Main(string[] args)
{
CosmosClient client = new CosmosClient(connectionString);
Container container = client.GetContainer(databaseName, containerName);
string query = "Select * From Root r";
string continuationToken = null;
int pageSize = 100;
do
{
var (entities, item2) = await GetDataPage(container, query, continuationToken, pageSize);
continuationToken = item2;
Console.WriteLine($"Total entities fetched: {entities.Count}; More entities available: {!string.IsNullOrWhiteSpace(continuationToken)}");
} while (continuationToken != null);
}
private static async Task<(List<dynamic>, string)> GetDataPage(Container container, string query, string continuationToken, int pageSize)
{
List<dynamic> entities = new(); // Create a local list of type <T> objects.
QueryDefinition queryDefinition = new QueryDefinition(query);
QueryRequestOptions requestOptions = new QueryRequestOptions()
{
MaxItemCount = pageSize
};
FeedIterator<dynamic> resultSetIterator = container.GetItemQueryIterator<dynamic>(query, continuationToken, requestOptions);
FeedResponse<dynamic> response = await resultSetIterator.ReadNextAsync();
entities.AddRange(response);
continuationToken = response.ContinuationToken;
return (entities, continuationToken);
}
}
}
UPDATE
I think I understand your concerns now. Essentially there are two things you would need to consider:
MaxItemCount - This is the maximum number of documents that will be returned by Cosmos DB in a single request. Please note that you can get anywhere from 0 to the value specified for this parameter. For example, if you specify 100 as MaxItemCount you can get anywhere from 0 to 100 documents in a single request.
FeedIterator - It keeps track of continuation token internally. Based on the response received, it sets HasMoreResults to true or false if a continuation token is found. Default value for HasMoreResults is true.
Now coming to your code, when you do something like:
while (resultSetIterator.HasMoreResults)
{
//some code here...
}
Because FeedIterator keeps track of the continuation token, this loop will return all the documents that match the query. If you notice, in my code I am not using this logic. I simply send the request once and then return the result.
I think setting MaxItemCount to 1 is a bad idea. If you want to fetch say 100 then you're making a minimum of 100 requests to your Cosmos DB account. If you have a hard need to get exactly 100 (or any fixed number) documents from your API, you can implement your own pagination logic. For example, please see code below. It fetches a total of 1000 documents with a maximum of 100 documents in a single request.
static async Task Main(string[] args)
{
CosmosClient client = new CosmosClient(connectionString);
Container container = client.GetContainer(databaseName, containerName);
string query = "Select * From Root r";
string continuationToken = null;
int pageSize = 100;
int maxDocumentsToFetch = 1000;
List<dynamic> documents = new List<dynamic>();
do
{
var numberOfDocumentsToFetch = Math.Min(pageSize, maxDocumentsToFetch);
var (entities, item2) = await GetDataPage(container, query, continuationToken, numberOfDocumentsToFetch);
continuationToken = item2;
Console.WriteLine($"Total entities fetched: {entities.Count}; More entities available: {!string.IsNullOrWhiteSpace(continuationToken)}");
maxDocumentsToFetch -= entities.Count;
documents.AddRange(entities);
} while (maxDocumentsToFetch > 0 && continuationToken != null);
}

The solution:
Summary:
From the concerns raised in my question and taking note from Gaurav Mantri's comments, if we are fetching the items from Cosmos in a loop then the MaxItemCount does not actually limit the total number of results returned but simply limits the number of results per request. If we continue to fetch more items in the loop then we end up with more results returned than what the user may want to retrieve.
In my case, the reason for paging is to present the items back to the web App using a razor list view, but we want to be able to set the maximum number of results returned per page.
The solution below is based on capturing information on the count of items in each iteration of the loop, therefore if we check the Count of the items returned on each iteration of the loop and if we have achieved less than or equal to the MaxItemCount value then we break from the loop with our set maximum number of items and the continuationToken that we can use on the next method run.
I have tested the method with continuation tokens and am able to affectively page backwards and forwards, but the key difference from the code example in my original question is that we're only calling Cosmos DB once to get the desired number of results back, as opposed to limiting the request to one item per run and having to run multiple requests.
public async Task<(List<T>, string)> QueryWithPagingAsync(string query, int pageSize, string continuationToken)
{
string unescapedContinuationToken = null;
if (!String.IsNullOrEmpty(continuationToken)) // Check if null before unescaping.
{
unescapedContinuationToken = Regex.Unescape(continuationToken); // Needed in my case...
}
try
{
Container container = GetContainer();
List<T> entities = new(); // Create a local list of type <T> objects.
QueryDefinition queryDefinition = new(query); // Create the query definition.
using FeedIterator<T> resultSetIterator = container.GetItemQueryIterator<T>(
query, // SQL Query passed to this method.
unescapedContinuationToken, // Value is always null for the first run.
requestOptions: new QueryRequestOptions()
{
// MaxItemCount does not actually limit how many documents are returned
// from Cosmos, if what we're querying resides across multiple partitions.
// However this parameter will control the max number of items
// returned on 'each request' to Cosmos.
// In the loop below, we check the Count of the items returned
// on each iteration of the loop and if we have achieved less than or
// equal to the MaxItemCount value then we break from the loop with
// our set maximum number of items and the continuationToken
// that we can use on the next method run.
// 'pageSize' is the max no. items we want to return for each page in our list view.
MaxItemCount = pageSize,
});
while (resultSetIterator.HasMoreResults)
{
FeedResponse<T> response = await resultSetIterator.ReadNextAsync();
entities.AddRange(response);
continuationToken = response.ContinuationToken;
// After the first iteration, we get the count of items returned.
// Now we'll either return the exact number of items that was set
// by the MaxItemCount, OR we may find there were less results than
// the MaxItemCount, but either way after the first run, we should
// have the number of items returned that we want, or at least
// the maximum number of items we want to return, so we break from the loop.
if (response.Count <= pageSize) { break; }
}
return (entities, continuationToken);
}
catch (CosmosException ex)
{
//Log.Error($"Entities was not retrieved successfully - error details: {ex.Message}");
if (ex.StatusCode == HttpStatusCode.NotFound)
{
return (null, null);
}
else { throw; }
}
}

In Code:
var sqlQueryText = $"SELECT * FROM c WHERE OFFSET {offset} LIMIT {limit}";
but this is more expensive (more RU/s) then using continuationToken.
When using Offset/Limit continuationToken will be used in background by Azure Cosmos SDK to get all the results.

CQRS in data-centric processes

I have got a question related to CQRS in data centric processes. Let me explain it better.
Consider we have a SOAP/JSON/whatever service, which transfers some data to our system during an integration process. It is said that in CQRS every state change must be achieved by the means of commands (or events if Event Sourcing is used).
When it comes to our integrating process we have got a great deal of structured DATA instead of a set of commands/events and I am wondering how to actually process those data.
// Some Façade service
class SomeService
{
$_someService;
public function __construct(SomeService $someService)
{
$this->_someService = $someService;
}
// Magic function to make it all good and
public function process($dto)
{
// if I get it correctly here I need somehow
// convert incoming dto (xml/json/array/etc)
// to a set of commands, i. e
$this->someService->doSomeStuff($dto->someStuffData);
// SomeStuffChangedEvent raised here
$this->someService->doSomeMoreStuff($dtom->someMoreStuffData);
// SomeMoreStuffChangedEvent raised here
}
}
My question is whether my suggestion is suitable in the given case or there may be some better methods to do what I need. Thank you in advance.

Agreed, a service may have a different interface. If you create a rest-api to update employees, you may want to provide an UpdateEmployeeMessage which contains everything that can change. In a CRUD-kind of service, this message would probably mirror the database.
Inside of the service, you can split the message into commands:
public void Update(UpdateEmployeeMessage message)
{
bus.Send(new UpdateName
{
EmployeeId = message.EmployeeId,
First = message.FirstName,
Last = message.LastName,
});
bus.Send(new UpdateAddress
{
EmployeeId = message.EmployeeId,
Street = message.Street,
ZipCode = message.ZipCode,
City = message.City
});
bus.Send(new UpdateContactInfo
{
EmployeeId = message.EmployeeId,
Phone = message.Phone,
Email = message.Email
});
}
Or you could call the aggregate directly:
public void Update(UpdateEmployeeMessage message)
{
var employee = repository.Get<Employee>(message.EmployeeId);
employee.UpdateName(message.FirstName, message.LastName);
employee.UpdateAddress(message.Street, message.ZipCode, message.City);
employee.UpdatePhone(message.Phone);
employee.UpdateEmail(message.Email);
repository.Save(employee);
}

Processing an emaillist async in MVC4

I'm trying to make my MVC4-website check to see if people should be alerted with an email because they haven't done something.
I'm having a hard time figuring out how to approach this. I checked if the shared hosting platform would allow me to activate some sort of cronjob, but this is not available.
So now my idea is to perform this check on each page-request, which already seems suboptimal (because of the overhead). But I thought that with using an async it would not be in the way of people just visiting the site.
I first tried to do this in the Application_BeginRequest method in Global.asax, but then it gets called multiple times per page-request, so that didn't work.
Next I found that I can make a Global Filter which executes on OnResultExecuted, which would seemed promising, but still it's no go.
The problem I get there is that I'm using MVCMailer to send the mails, and when I execute it I get the error: {"Value cannot be null.\r\nParameter name: httpContext"}
This probably means that mailer needs the context.
The code I now have in my global filter is the following:
public override void OnResultExecuted(ResultExecutedContext filterContext)
{
base.OnResultExecuted(filterContext);
HandleEmptyProfileAlerts();
}
private void HandleEmptyProfileAlerts()
{
new Thread(() =>
{
bool active = false;
new UserMailer().AlertFirst("bla#bla.com").Send();
DB db = new DB();
DateTime CutoffDate = DateTime.Now.AddDays(-5);
var ProfilesToAlert = db.UserProfiles.Where(x => x.CreatedOn < CutoffDate && !x.ProfileActive && x.AlertsSent.Where(y => y.AlertType == "First").Count() == 0).ToList();
foreach (UserProfile up in ProfilesToAlert)
{
if (active)
{
new UserMailer().AlertFirst(up.UserName).Send();
up.AlertsSent.Add(new UserAlert { AlertType = "First", DateSent = DateTime.Now, UserProfileID = up.UserId });
}
else
System.Diagnostics.Debug.WriteLine(up.UserName);
}
db.SaveChanges();
}).Start();
}
So my question is, am I going about this the right way, and if so, how can I make sure that MVCMailer gets the right context?

The usual way to do this kind of thing is to have a single background thread that periodically does the checks you're interested in.
You would start the thread from Application_Start(). It's common to use a database to queue and store work items, although it can also be done in memory if it's better for your app.

Using thread.join to make sure all threads in a collection execute before moving on

Is there a problem with this type of implementation to wait for a batch of threads to complete before moving on, given the following circumstances?:
CCR or PFX cannot be used.
Customer.Prices collection and newCustomer are NOT being mutated.
CloneCustomerPrices performs a deep copy on each of the prices in Customer.Prices collection into a new price Collection.
public List[Customer] ProcessCustomersPrices(List [Customer] Customers)
{
[Code to check Customers and deep copy Cust data into newCustomers]
List[Thread] ThreadList = new List[Thread]();
foreach(Customer cust in Customers)
{
ThreadList.Add(new Thread(() => CloneCustomerPrices(cust.Prices, newCustomer)));
}
Action runThreadBatch = () =>
{
ThreadList.ForEach(t => t.Start());
ThreadList.All (t => t.Join([TimeOutNumber]));
};
runThreadBatch(CopyPriceModelsCallback, null);
[More Processing]
return newCustomers;
}

The waiting implementation seems fine, just be sure that CloneCustomerPrices is thread safe.

Makes sense to me, so long as the threads finish by the timeout. Not sure what newCustomer is (same as cust?). If that's the case I also don't know how to plan to return just one of them.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Is this possible using Reactive Framework? - c#-4.0

Related

Acumatica return to function with PXLongOperation

Cosmos DB .NET SDK V3 Query With Paging example needed

CQRS in data-centric processes

Processing an emaillist async in MVC4

Using thread.join to make sure all threads in a collection execute before moving on

Categories

Resources