Is adding tables to a DataSet from multiple threads thread safe? - multithreading

Is adding tables to a DataSet from multiple threads thread safe?
List<Action> TestActions = new List<Action>();
Action action1 = new Action(() => Method1(dsDataset));
TestActions.Add(action1);
Action action2 = new Action(() => Method2(dsDataset));
TestActions.Add(action2);
Action action3 = new Action(() => Method3(dsDataset));
TestActions.Add(action3);
try
{
    Parallel.ForEach(TestActions, (a) => a());
}
catch (AggregateException ex)
{
    // handle exceptions thrown by the parallel actions
}
void Method1(DataSet ds)
{
    // execute a db call that returns a DataTable
    DataTable dt = /* database query */;
    ds.Tables.Add(dt);
}
void Method2(DataSet ds)
{
    // execute a db call that returns a DataTable
    DataTable dt = /* database query */;
    ds.Tables.Add(dt);
}
void Method3(DataSet ds)
{
    // execute a db call that returns a DataTable
    DataTable dt = /* database query */;
    ds.Tables.Add(dt);
}
Does the above code work, or do I need to put a lock on the DataSet when each thread tries to add a table?
Is there a better way to do this?

From the official documentation:
This type is safe for multithreaded read operations. You must synchronize any write operations.
So the answer is: no, it is not safe. Use synchronization, or add the tables from a single thread.
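For example, here is a minimal lock-based sketch (the lock object and the RunQuery1 helper are illustrative, not from your code). The queries can still run in parallel; only the write into the DataSet is guarded:

private static readonly object _dataSetLock = new object();

void Method1(DataSet ds)
{
    DataTable dt = RunQuery1(); // hypothetical helper standing in for your database call
    lock (_dataSetLock)
    {
        ds.Tables.Add(dt); // only this write is serialized
    }
}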
Just curious, why do you need to do it in parallel?
If it is because some work has to be done in parallel, and as a result of that work you need to add a table to the DataSet, I can recommend using a ConcurrentQueue to store the tables and a single worker thread that picks items from that queue and adds them to the DataSet.
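A rough sketch of that idea, assuming hypothetical RunQuery1..RunQuery3 helpers in place of your three database calls; BlockingCollection is used here because it wraps a ConcurrentQueue by default and makes the single-consumer loop simple:

using System.Collections.Concurrent;
using System.Data;
using System.Threading.Tasks;

var tables = new BlockingCollection<DataTable>();

// Producers: run the queries in parallel and only enqueue the results.
var producers = Task.Run(() =>
{
    try
    {
        Parallel.Invoke(
            () => tables.Add(RunQuery1()),
            () => tables.Add(RunQuery2()),
            () => tables.Add(RunQuery3()));
    }
    finally
    {
        tables.CompleteAdding(); // signal the consumer that no more tables are coming
    }
});

// Single consumer: the only thread that ever writes to the DataSet.
foreach (DataTable table in tables.GetConsumingEnumerable())
{
    dsDataset.Tables.Add(table);
}

producers.Wait(); // surfaces any exception thrown by the queries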

Related

Acumatica return to function with PXLongOperation

I'm creating an integration for Acumatica that loads data from another application to synchronize inventory items. It uses an API call to get a list (of up to 5000 items), and then I'm using PXLongOperation to insert or update these items. I can't run it without this method, as the large batches (i.e. inserting 5000 stock items) will time out and crash.
The processing form is a custom table/form that retrieves this information, parses the JSON list of items, and calls a custom function on the InventoryItemMaint graph. All of that works perfectly, but it never returns to the calling function. I'd love to be able to write back to the record whether it was a success or failure. I've tried PXLongOperation.WaitCompletion, but that doesn't seem to change anything. I'm sure I'm not using the asynchronous nature of this correctly, but I'm wondering if there is a reasonable workaround.
// This is the list of items from SI
List<TEKDTools.TEKdtoolModels.Product> theItems;
if (Guid.TryParse(Convert.ToString(theRow.DtoolsID), out theCatID))
{
// Get the list of items from dtools.
theItems = TEKDTools.TEKdtoolsCommon.ReadOneCatalog(theCatID);
// Start the long operation
PXLongOperation.StartOperation(this, delegate () {
// Create the graph to make a new Stock Item
InventoryItemMaint itemMaint = PXGraph.CreateInstance<InventoryItemMaint>();
var itemMaintExt = itemMaint.GetExtension<InventoryItemMaintTEKExt>();
foreach (TEKDTools.TEKdtoolModels.Product theItem in theItems)
{
itemMaint.Clear();
itemMaintExt.CreateUpdateDToolsItem(theItem, true);
PXLongOperation.WaitCompletion(itemMaint.UID);
}
}
);
}
stopWatch.Stop(); // Just using this to figure out how long things were taking.
// For fun I tried the Wait Completion here too
PXLongOperation.WaitCompletion(this.UID);
theRow = MasterView.Current;
// Tried some random static values to see if it was writing
theRow.RowsCreated = 10;
theRow.RowsUpdated = 11;
theRow.Data2 = "Elasped Milliseconds: " + stopWatch.ElapsedMilliseconds.ToString();
theRow.RunStart = startTime;
theRow.RunEnd = DateTime.Now;
// This never gets the record updated.
Caches[typeof(TCDtoolsBatch)].Update(theRow);
One possible solution would be to use the PXLongOperation.SetCustomInfo method. Usually this is used to update the UI thread after the long operation has finished. In this class you can hook the completion callback and use it to update rows. The definition of the class is as follows:
public class UpdateUICustomInfo : IPXCustomInfo
{
public void Complete(PXLongRunStatus status, PXGraph graph)
{
// Set Code Here
}
}
The WaitCompletion method you are using is generally meant for waiting for another long operation to finish, by passing the key of that long operation.
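A rough, untested sketch of how the custom info class above might be wired up and filled in (the counters and the MyDtoolsProcessing graph type are hypothetical, not from your code): call SetCustomInfo from inside the StartOperation delegate, and do the row update in Complete, which runs back on the calling graph once the operation finishes:

// Inside the StartOperation delegate, after the import loop
// (rowsCreated / rowsUpdated are hypothetical counters kept by the loop):
PXLongOperation.SetCustomInfo(new UpdateUICustomInfo(rowsCreated, rowsUpdated));

public class UpdateUICustomInfo : IPXCustomInfo
{
    private readonly int _created;
    private readonly int _updated;

    public UpdateUICustomInfo(int created, int updated)
    {
        _created = created;
        _updated = updated;
    }

    public void Complete(PXLongRunStatus status, PXGraph graph)
    {
        if (status != PXLongRunStatus.Completed) return;

        // 'graph' is the graph instance handed back when the operation completes;
        // cast it to the custom processing graph (hypothetical type name here).
        var processingGraph = graph as MyDtoolsProcessing;
        if (processingGraph == null) return;

        var row = processingGraph.MasterView.Current;
        row.RowsCreated = _created;
        row.RowsUpdated = _updated;
        processingGraph.Caches[typeof(TCDtoolsBatch)].Update(row);
        processingGraph.Actions.PressSave();
    }
}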

Intercepting ADD to detect if it is an ADD through a collection in Entity Framework 5

I have two questions that can be answered separately.
Q#1
I am trying to save round trips to the database server.
Here's my algo:
Insert 2 entities (to get their IDs generated by the database)
Use the IDs returned to call a stored procedure passing it the IDs
The stored procedure takes the IDs and populates an adjacency list table which I am using to store a directed acyclic graph.
Currently I have a round-trip to the RDBMS for each parent-child relationship, plus one for the Insert of the entities.
I am known to do stuff like this:
public override int SaveChanges()
{
foreach (var entry in this.ChangeTracker.Entries().Where(e => e.State == System.Data.EntityState.Added).ToList())
{
if (entry.Entity is IRobot)
{
entry.Reference("Owner").CurrentValue = skyNet;
}
}
return base.SaveChanges();
}
So I was wondering if there is a way to detect EntityState.Added for an "ADD" that was done similarly to the following code:
var robot = new Robot();
skyNet.Robots.Add(robot);
db.Add(skyNet);
db.SaveChanges();
So that I can do something like this: (Note that this is pseudocode)
public override int SaveChanges()
{
foreach (var entry in this.ChangeTracker.Entries().Where(e => e.State == EntityState.**AddedToCollection**).ToList())
{
db.Relate(parent: skyNet, child: entry.Entity);
}
return base.SaveChanges();
}
Q#2
Is there any way to call a stored procedure as part of the same "trip" to the database after calling SaveChanges()?
Question 1
You can detect the state of an entity by
db.Entry(robot).State
After the line
skyNet.Robots.Add(robot);
the EntityState of robot will be Added. However, in your pseudocode it is not clear where the skyNet variable comes from. If you add skyNet as you do in your code snippet, you could do:
foreach( var skyNet in ChangeTracker.Entries()
.Where(e => e.State == EntityState.Added)
.Select (e => e.Entity)
.OfType<SkyNet>())
{
foreach(var robot in skyNet.Robots
.Where(r => db.Entry(r).State == EntityState.Added))
{
db.Relate(parent: skyNet, child: robot);
}
}
Question 2
You can't call a stored procedure in the same round trip; that would require something like NHibernate's multi-query. But you can wrap SaveChanges and a stored procedure call in one transaction (which I think is what you mean) by using TransactionScope:
using (TransactionScope scope = new TransactionScope())
{
// stored procedure call here.
db.SaveChanges();
scope.Complete();
}
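To make that concrete, here is a hedged sketch using EF5's Database.ExecuteSqlCommand for the adjacency-list procedure; the procedure name, parameters, and the way the generated IDs are read back are all illustrative, and it follows the question's ordering (save first so the database generates the IDs, then call the procedure inside the same transaction). Requires System.Transactions and System.Data.SqlClient.

using (var scope = new TransactionScope())
{
    // First save, so the database generates the IDs for the two new entities.
    db.SaveChanges();

    // Then call the stored procedure with those IDs, still inside the same transaction.
    db.Database.ExecuteSqlCommand(
        "EXEC dbo.AddDagEdge @ParentId, @ChildId",   // hypothetical procedure
        new SqlParameter("@ParentId", parentEntity.Id),
        new SqlParameter("@ChildId", childEntity.Id));

    scope.Complete();
}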

Is this possible using Reactive Framework?

I have a list of objects in my C# 4.0 app. Suppose the list contains 100 objects of a Student class. Is there any way in the Reactive Framework to execute 10 objects at a time in parallel?
Each Student object runs a method that is somewhat time consuming, taking about 10 to 15 seconds. So: take the first 10 Student objects from the list, wait for all 10 of them to finish their work, then take the next 10, and so on until all the items in the list are processed.
I have a List<Student> with 100 items.
First, take 10 items from the list and call each object's long-running method in parallel.
Receive each object's return value and update the UI [subscription part].
The next round starts only when the first 10 have completed and released all their memory.
Repeat the same process for all the items in the list.
How do I catch the errors in each process?
How do I release each Student object's resources and other resources from memory?
What is the best way to do all of this in the Reactive Framework?
This version will always have 10 students running at a time. As a student finishes, another will start. And as each student finishes, you can handle any error it had and then clean it up (this will happen before the next student starts running).
students
    .ToObservable()
    .Select(student => Observable.Defer(() => Observable.Start(() =>
    {
        // do the work for this student, then return an anonymous object
        // holding the student plus any error
        try
        {
            student.DoWork();
            return new { Student = student, Error = (Exception)null };
        }
        catch (Exception e)
        {
            return new { Student = student, Error = e };
        }
    })))
    .Merge(10) // let 10 students be executing in parallel at all times
    .Subscribe(studentResult =>
    {
        if (studentResult.Error != null)
        {
            // handle error
        }
        studentResult.Student.Dispose(); // if your Student is IDisposable and you need to free it up
    });
This is not exactly what you asked since it does not finish the first batch of 10 before starting the next batch. This always keeps 10 running in parallel. If you really want batches of 10 I'll adjust the code for that.
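For reference, a hedged, untested sketch of the batch-of-10 variant using Buffer (same anonymous result shape as above): each batch runs in parallel, ToList waits for the whole batch to finish, and Concat only then starts the next batch.

students
    .ToObservable()
    .Buffer(10) // cut the sequence into batches of 10 students
    .Select(batch => batch
        .Select(student => Observable.Defer(() => Observable.Start(() =>
        {
            try
            {
                student.DoWork();
                return new { Student = student, Error = (Exception)null };
            }
            catch (Exception e)
            {
                return new { Student = student, Error = e };
            }
        })))
        .Merge()   // run the whole batch in parallel
        .ToList()) // and wait until every student in the batch has finished
    .Concat()      // only then subscribe to (i.e. start) the next batch
    .Subscribe(batchResults =>
    {
        foreach (var result in batchResults)
        {
            if (result.Error != null)
            {
                // handle the error for this student
            }
            result.Student.Dispose(); // if Student is IDisposable
        }
    });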
My attempt....
var students = new List<Student>();
// {....} populate the list
var cancel = students
    .ToObservable(Scheduler.Default)
    .Window(10)
    .Merge(1)
    .Subscribe(tenStudents =>
    {
        tenStudents.ObserveOn(Scheduler.Default)
            .Do(x => DoSomeWork(x))
            .ObserveOnDispatcher()
            .Do(x => UpdateUI(x))
            .Subscribe();
    });
This to me sounds very much like a problem for TPL. You have a known set of data at rest. You want to partition up some heavy processing to run in parallel and you want to be able to batch process the load.
I don't see anywhere in your problem a source that is async, a source that is data in motion, or a consumer that needs to be reactive. This is my rationale for suggesting that you use TPL instead.
On a separate note, why the magic number of 10 to be processed in parallel? Is this a business requirement, or potentially an attempt to optimize performance? Normally it is best practice to allow the TaskPool to work out what is best for the client CPU based on the number of cores and current load. I imagine this becomes ever more important with the large variation in devices and their CPU structures (single core, multi core, many core, low-power/disabled cores, etc.).
Here is one way you could do it in LinqPad (but note the lack of Rx)
void Main()
{
var source = new List<Item>();
for (int i = 0; i < 100; i++){source.Add(new Item(i));}
//Put into batches of ten, but only then pass on the item, not the temporary tuple construct.
var batches = source.Select((item, idx) =>new {item, idx} )
.GroupBy(tuple=>tuple.idx/10, tuple=>tuple.item);
//Process one batch at a time (serially), but process the items of the batch in parallel (concurrently).
foreach (var batch in batches)
{
"Processing batch...".Dump();
var results = batch.AsParallel().Select (item => item.Process());
foreach (var result in results)
{
result.Dump();
}
"Processed batch.".Dump();
}
}
public class Item
{
private static readonly Random _rnd = new Random();
private readonly int _id;
public Item(int id)
{
_id = id;
}
public int Id { get {return _id;} }
public double Process()
{
var threadId = Thread.CurrentThread.ManagedThreadId;
string.Format("Processing on thread:{0}", threadId).Dump(Id);
var loopCount = _rnd.Next(10000,1000000);
Thread.SpinWait(loopCount);
return _rnd.NextDouble();
}
public override string ToString()
{
return string.Format("Item:{0}", _id);
}
}
I would be interested to find out if you do have a data-in-motion problem or a reactive consumer problem, but have just "dumbed down" the question to make it easier to explain.

Parallel.ForEach - Add object to Db If not exists

Hi guys and girls :) I have a question about parallelism and MS SQL in C#.
I have a method that looks in the DB for a specific object and, if it does not exist, adds it to the DB. Unfortunately this is done with Parallel.ForEach, so I have run into situations like the following, with threads A and B:
A: look for entity with code 'xxx' - result: Not Exist
B: look for entity with code 'xxx' - result: Not Exist
A: Add entity to Db - result OK
B: Add entity to Db - result: "Violation of UNIQUE KEY constraint (...) The duplicate key value is 'xxx' "
What should I do to avoid that situation?
To avoid the duplicate-key error, you can catch the exception:
try
{
    // execute your insert into the database
}
catch (Exception ex)
{
    // If your constraint is not respected, an error is thrown.
    Console.WriteLine("db error: " + ex.Message);
}
But this is only a workaround: it works, but it is not clean...
For cleaner code, you can create a spooler:
class Spooler
{
    public System.Collections.Generic.List<String> RequestList = new System.Collections.Generic.List<String>();
    private Thread SpoolThread;

    public Spooler()
    {
        // Open your database connection here.
        // Start the thread that will check whether requests have been added to the collection.
        SpoolThread = new Thread(new ThreadStart(SpoolerRunner));
        SpoolThread.Start();
    }

    public void createRequestDb(String DbRequest)
    {
        lock (RequestList)
        {
            RequestList.Add(DbRequest);
        }
    }

    private void SpoolerRunner()
    {
        while (true)
        {
            lock (RequestList)
            {
                if (RequestList.Count >= 1)
                {
                    foreach (String request in RequestList)
                    {
                        // Here, verify the request: check whether it already exists,
                        // and if not, add it to the database.
                    }
                    RequestList.Clear(); // drop the requests that have been processed
                }
            }
            // Check the collection for new requests every 30 seconds.
            Thread.Sleep(30000);
        }
    }
}
Why use a spooler?
Because you initialize the spooler before starting your threads, every thread calls the spooler and adds its request to the collection, and the spooler processes the requests one after another, instead of all the threads hitting the database at the same time.
EDIT:
This spooler is just a sample that inserts string requests into the database one by one. You can create a spooler whose collection holds whatever object type you want and that inserts into the database only if the row does not already exist. It is simply one sample solution for getting the work processed one item after another when you have many threads.
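As a hedged usage sketch (the item list and the request format are made up, not from the question), the Parallel.ForEach body would only hand the work to the spooler, so the existence check and the insert both happen on the spooler's single thread:

var spooler = new Spooler(); // create it once, before the parallel loop starts

Parallel.ForEach(itemsToImport, item =>
{
    // Just enqueue the request; the spooler thread will check for duplicates
    // and insert only if the code does not already exist.
    spooler.createRequestDb("check-and-insert:" + item.Code); // illustrative request format
});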

Using thread.join to make sure all threads in a collection execute before moving on

Is there a problem with this type of implementation for waiting for a batch of threads to complete before moving on, given the following circumstances:
CCR or PFX cannot be used.
Customer.Prices collection and newCustomer are NOT being mutated.
CloneCustomerPrices performs a deep copy of each of the prices in the Customer.Prices collection into a new price collection.
public List<Customer> ProcessCustomersPrices(List<Customer> Customers)
{
    // [Code to check Customers and deep copy Cust data into newCustomers]
    List<Thread> ThreadList = new List<Thread>();
    foreach (Customer cust in Customers)
    {
        ThreadList.Add(new Thread(() => CloneCustomerPrices(cust.Prices, newCustomer)));
    }
    Action runThreadBatch = () =>
    {
        ThreadList.ForEach(t => t.Start());
        ThreadList.All(t => t.Join([TimeOutNumber]));
    };
    runThreadBatch(CopyPriceModelsCallback, null);
    // [More Processing]
    return newCustomers;
}
The waiting implementation seems fine; just be sure that CloneCustomerPrices is thread safe.
Makes sense to me, as long as the threads finish before the timeout. I'm not sure what newCustomer is (the same as cust?). If that's the case, I also don't know how you plan to return just one of them.
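One small caveat worth noting: Enumerable.All short-circuits on the first false, so if one thread times out, the remaining threads are never joined at all. A hedged tweak of the join step (timeoutMilliseconds is a placeholder for your timeout value) that joins every thread and records whether all of them finished in time:

bool allCompleted = true;
foreach (Thread t in ThreadList)
{
    // Join every thread, even after one of them has already timed out.
    if (!t.Join(timeoutMilliseconds))
    {
        allCompleted = false;
    }
}

if (!allCompleted)
{
    // Decide how to handle partially cloned data before returning newCustomers.
}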
