Cache coherency at the application level - multithreading

Cache coherency protocols are well known in the multi-core context, which is at the low hardware level; however, we meet a similar case in the application domain. Recently, I have been working on a project in which two threads update shared objects.
The UI thread is responsible for displaying and updating (via user actions) objects.
The background replication thread periodically updates the shared objects if something is changed by other users.
Since we have lots of objects (50,000 to 100,000), each thread has to copy part of the objects into its own buffer; updates to the shared objects are serialized.
The UI thread does not update the shared object every time the user makes a change.
The background replication thread updates objects immediately once changes are found and then notifies the UI thread to refresh.
So this raises the question: if an object is updated by two threads, how is the conflict solved? Is there a common idiom for handling this case?

The simplest way to handle this is to use a mutex. The UI locks the mutex before it reads the value, then unlocks it afterwards. The background thread locks the mutex before it updates the value, then unlocks it afterwards.
You can have the update thread send a notification message to the UI telling it to reread the shared object, as you suggested.
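
As a rough illustration of the mutex approach above, here is a minimal C++ sketch. The names SharedObject, GuardedObject and run_demo are hypothetical and not taken from the question; the sketch assumes the UI side is happy to work on a copy (snapshot) of the object taken while the lock is held.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical shared object; the fields are illustrative only.
struct SharedObject {
    std::string value;
    int version = 0;
};

class GuardedObject {
public:
    // Called by the background replication thread.
    void update(const std::string& v) {
        std::lock_guard<std::mutex> lock(mutex_);
        obj_.value = v;
        ++obj_.version;
    }

    // Called by the UI thread: copy under the lock so the caller
    // always sees a consistent value/version pair.
    SharedObject snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return obj_;
    }

private:
    mutable std::mutex mutex_;
    SharedObject obj_;
};

// Runs one writer and one reader concurrently; returns the final version.
int run_demo() {
    GuardedObject shared;
    std::thread writer([&] {
        for (int i = 0; i < 1000; ++i) {
            shared.update("rev " + std::to_string(i));
        }
    });
    std::thread reader([&] {
        for (int i = 0; i < 1000; ++i) {
            SharedObject s = shared.snapshot();
            // A snapshot never mixes the value of one update with
            // the version of another.
            if (s.version > 0) {
                assert(s.value == "rev " + std::to_string(s.version - 1));
            }
        }
    });
    writer.join();
    reader.join();
    return shared.snapshot().version;
}
```

The lock only makes each read and each write atomic. Deciding which edit wins when both threads change the same object (last writer wins, version check, merge) is a separate policy choice that the mutex does not make for you.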

Related

How do worker threads work in Node.js?

Nodejs can not have a built-in thread API like java and .net
do. If threads are added, the nature of the language itself will
change. It’s not possible to add threads as a new set of available
classes or functions.
Node.js 10.x added worker threads as an experimental feature, and they have been stable since 12.x. I have gone through a few blogs but did not understand much, maybe due to lack of knowledge. How are they different from threads?
Worker threads in node.js are somewhat analogous to Web Workers in the browser. They do not share direct access to any ordinary variables with the main thread or with each other, and the normal way they communicate with the main thread is via messaging (memory can only be shared by explicitly passing something like a SharedArrayBuffer). This messaging is synchronized through the event loop. This avoids the classic race conditions that multiple threads have when trying to access the same variables, because two separate threads can't access the same variables in node.js. Each thread has its own set of variables, and the only way to influence another thread's variables is to send it a message and ask it to modify its own variables. Since that message is synchronized through that thread's event queue, there's no risk of classic race conditions in accessing variables.
Java threads, on the other hand, are similar to C++ or native threads in that they share access to the same variables and the threads are freely timesliced so right in the middle of functionA running in threadA, execution could be interrupted and functionB running in threadB could run. Since both can freely access the same variables, there are all sorts of race conditions possible unless one manually uses thread synchronization tools (such as mutexes) to coordinate and protect all access to shared variables. This type of programming is often the source of very hard to find and next-to-impossible to reliably reproduce concurrency bugs. While powerful and useful for some system-level things or more real-time-ish code, it's very easy for anyone but a very senior and experienced developer to make costly concurrency mistakes. And, it's very hard to devise a test that will tell you if it's really stable under all types of load or not.
node.js attempts to avoid the classic concurrency bugs by separating the threads into their own variable space and forcing all communication between them to be synchronized via the event queue. This means that threadA/functionA is never arbitrarily interrupted while some other code in your process changes shared variables it was accessing.
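
To make the "own variables, messages only" model concrete, here is a small sketch of the same idea written in C++ rather than JavaScript. It is only an analogy and does not use or mirror the node.js API: Mailbox and run_worker_demo are hypothetical names, and the mailbox stands in for the per-thread event queue described above.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical message queue; the only state the two threads share.
template <typename T>
class Mailbox {
public:
    void post(T msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        cv_.notify_one();
    }

    // Signals that no more messages will be posted.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a message arrives; returns nullopt once the
    // mailbox is closed and drained.
    std::optional<T> receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return std::nullopt;
        T msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
    bool closed_ = false;
};

// The worker keeps its own private total; the main thread can only
// influence it by posting messages.
int run_worker_demo() {
    Mailbox<int> to_worker;
    Mailbox<int> to_main;
    std::thread worker([&] {
        int total = 0;  // private to the worker thread
        while (auto msg = to_worker.receive()) {
            total += *msg;
        }
        to_main.post(total);
    });
    for (int i = 1; i <= 10; ++i) {
        to_worker.post(i);
    }
    to_worker.close();
    int result = *to_main.receive();
    worker.join();
    return result;
}
```

Here the locking lives inside the mailbox only, so application code never has to guard its own variables, which is roughly the guarantee the event queue gives you in node.js.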
node.js also has a backstop that it can run a child_process that can be written in any language and can use native threads if needed or one can actually hook native code and real system level threads right into node.js using the add-on SDK (and it communicates with node.js Javascript through the SDK interface). And, in fact, a number of node.js built-in libraries do exactly this to surface functionality that requires that level of access to the nodejs environment. For example, the implementation of file access uses a pool of native threads to carry out file operations.
So, with all that said, there are still some types of race conditions that can occur and this has to do with access to outside resources. For example if two threads or processes are both trying to do their own thing and write to the same file, they can clearly conflict with each other and create problems.
So, using Workers in node.js still has to be aware of concurrency issues when accessing outside resources. node.js protects the local variable environment for each Worker, but can't do anything about contention among outside resources. In that regard, node.js Workers have the same issues as Java threads and the programmer has to code for that (exclusive file access, file locks, separate files for each Worker, using a database to manage the concurrency for storage, etc...).
This follows from the Node.js architecture. Whenever a request reaches Node.js, it is placed on the event queue and then picked up by the event loop. The event loop checks whether the request involves blocking I/O or non-blocking I/O (blocking I/O being an operation that takes time to complete, e.g. fetching data from some other place). The event loop hands blocking work to the thread pool, which is a collection of worker threads. The work is assigned to one of those threads, which carries out the operation (e.g. fetching data from a database); on completion, the result is sent back to the event loop and then on to execution. Note that this internal thread pool is separate from the worker_threads module asked about in the question.

Access a specific thread from Grand Central Dispatch

For a Mac application I'm using an external (C++) library that has built-in memory management. A drawback of that memory manager is that memory needs to be deleted on the same thread that called new.
Currently I'm using GCD to run code concurrently, but I run into the problem that objects of that library get allocated on various threads and I can't correctly delete them.
Is there a way to call the delete operator on the original thread that called new? I realise that GCD wants to abstract the underlying threads away from me, but otherwise I would have to write a custom GCD-like implementation where I have full control over the threads.
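
One possible shape for the "full control over the threads" idea mentioned above is a single dedicated thread that performs every new and delete for the library, with other threads submitting work to it. The following is only a sketch in portable C++ under that assumption; OwnerThread, Widget and run_owner_demo are hypothetical names, and Widget merely stands in for an object from the external library.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper: one long-lived thread that runs submitted jobs.
class OwnerThread {
public:
    OwnerThread() : worker_([this] { loop(); }) {}

    ~OwnerThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Runs fn on the owner thread and blocks until it has finished.
    // Must not be called from the owner thread itself (it would deadlock).
    template <typename F>
    auto run(F fn) -> decltype(fn()) {
        using R = decltype(fn());
        auto task = std::make_shared<std::packaged_task<R()>>(std::move(fn));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return result.get();
    }

private:
    void loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // shut down once drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool done_ = false;
    std::thread worker_;  // declared last so the other members exist first
};

// Stand-in for a library object; records the thread it was created on.
struct Widget {
    std::thread::id born_on = std::this_thread::get_id();
};

// Allocates and frees a Widget via the owner thread; returns true if
// both happened on the same thread, which is not the calling thread.
bool run_owner_demo() {
    OwnerThread owner;
    Widget* w = owner.run([] { return new Widget; });
    std::thread::id born = w->born_on;
    std::thread::id deleted_on = owner.run([w] {
        std::thread::id id = std::this_thread::get_id();
        delete w;
        return id;
    });
    return born == deleted_on && born != std::this_thread::get_id();
}
```

Code running concurrently elsewhere would then call owner.run(...) whenever it needs to allocate or free a library object, at the cost of serializing those calls on one thread.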

Identify threads in a Delphi application outside debugging environment

I have found an application which requests process information using wmi queries (all threads and more info on each thread). I modified this application to determine the CPU usage per thread.
(if my application is called 'appy', then the threads are named 'appy/0', 'appy/1', ...)
My question: is there a way to easily identify these threads outside of an IDE or another debugging environment?
I know there is the NameThreadForDebugging method, but this isn't accessible outside the debugging environment.
Is there a way to assign your own thread id upon creating that thread?
Or is the only way to know which thread is which to create a dictionary and write it to a file so that it is externally accessible?
Thanks in advance!
No, you cannot assign your own thread ID; the thread ID is assigned to a thread by the CreateThread function and cannot be changed during its lifetime. And as you said, the only way to identify a thread from an external application (not a debugger) is to share the thread identification with that application somehow.
However it's not necessary to share the information through a file, you can use a shared memory block for instance. It will be much more efficient than using files.
For reference on thread IDs, see the remark in the documentation for the GetCurrentThreadId function:
Until the thread terminates, the thread identifier uniquely identifies
the thread throughout the system.

Delphi - Creating a control that runs in its own process

Hi
I have a control that accesses a database using proprietary datasets. The database is an old ISAM-based database.
The control uses a background thread to query the database using the proprietary datasets.
A form will have several of these controls on it, each using their own thread to access the data as they all need to load simultaneously.
The proprietary datasets handle concurrency by displaying a VCL TForm notifying the user that the table being opened is locked by another user and that the dataset is waiting for the lock to be released.
The form has a cancel button on it which lets the user cancel the lock wait.
The problem:
When using the proprietary datasets from within a thread, the application will crash, hang or give some error if the lock wait form is displayed. I suspect this is to do with the VCL not being thread safe.
I have solved the issue by synchronizing Dataset.Open; however, this holds up the main thread until Dataset.Open returns, which can take a considerable amount of time depending on the complexity of the query.
I have displayed a modal progress bar which lets the user know that something is happening, but I don't like this idea as the user will be sitting waiting for the progress bar to complete.
The proprietary dataset code is compiled into the main application, i.e. it's not stored in a separate DLL. We are not allowed to change how the locking works or whether a form is displayed at this stage of the development process, as we are too close to release.
Ideally I would like to have Dataset.Open run in the control's thread as well instead of having to use the main thread; however, this doesn't seem likely to work.
Can anyone suggest a workaround, please?
Fibers won't help you one bit, because they are in the Windows API solely to help ease porting old code that was written with cooperative multitasking in mind. Fibers are basically a form of co-routines, they all execute in the same process, have their own stack space, and the switching between them is controlled by the user code, not by the OS. That means that the switching between them can be made to occur only at times that are safe, so no synchronization issues. OTOH that means that only one fiber can be running within one thread at the same time, so using fibers with blocking code has the same characteristics as calling blocking code from within one thread - the application becomes unresponsive.
You could use fibers together with multiple threads, but that can be dangerous and doesn't bring any benefit over using threads alone.
I have used fibers successfully within VCL applications, but only for specific purposes. Forget about them if you want to deal with potentially blocking code.
As for your problem - you should make a control that is used for display purposes only, and which uses the standard inter-process communication mechanisms to exchange data with another process that accesses your database.
COM objects can run in out-of-process mode. In Delphi they may be a bit easier to use than other IPC mechanisms.

What does “Autoreleased with no pool in place” mean?

My application structure is as follows:
the core part is written in C++ and uses threads heavily, and I am developing the UI in Objective-C on top of it.
If I don't run the threads it works fine, but I can't disable or stop them. The UI crashes randomly, and in the log I can see the following message:
__NSAutoreleaseNoPool(): Object 0x350270 of class NSCFString autoreleased with no pool in place - just leaking
Similar messages appear more than once.
From googling I learned that I need to set up an NSAutoreleasePool to get rid of it, but how can that be integrated with the C++ code?
Edit: The core lib is activated from the UI, so I suppose it's safe to say the UI is running on the main thread; the lib creates and terminates threads without notifying the UI.
In this case, can I create the autorelease pool in the UI?
Can anyone guide me?
See these docs for what you should know about multithreading with Cocoa: http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/Multithreading/ThreadSafetySummary/ThreadSafetySummary.html
It's OK to design your app like you have, but two things should be kept in mind:
Life is simplest (and sometimes necessary) when UI controls like views (AppKit or UIKit) are manipulated on the main thread. You can use Foundation objects and some AppKit/UIKit objects on background threads, and some Foundation objects can be used from multiple threads.
If you're using any Cocoa objects at all in background threads, you'll need to set up autorelease pools on those threads.
Like so:
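- (void)backgroundThreadStart
{
    // Create a pool at the start of the thread's entry point.
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    // do stuff
    // Release the pool (and the objects autoreleased into it) before returning.
    [pool release];
}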
That will fix your console errors, but you might have other issues that led to the actual crashing you were seeing.
It means you autoreleased something without an autorelease pool in place.
Every thread has a stack of autorelease pools. On the main thread, an autorelease pool is created for you before Cocoa calls out to your code, and drained after your code returns. Every object you autorelease (whether explicitly or implicitly) goes into the pool, so that the pool will release it when the pool gets drained. When you create a thread, you have to create and drain an autorelease pool on that thread yourself. (Or just not autorelease anything, but that's practically impossible for any meaningful amount of code.)
If you ever decide to run your code under garbage-collection, you'll need to send the pool drain, not release, when you're done with it, for the pool to be useful. When GC is enabled, release and autorelease messages do nothing—they don't even go through. Your autorelease pool will respond to drain by poking the garbage collector, which is the nearest equivalent to releasing the objects that would have been in the pool.
The Memory Management Programming Guide for Cocoa has more information about autorelease pools, among other things.

Resources