How to program a critical section for reader-writer systems?

Let's say I have a reader-writer system where a reader and a writer run concurrently. 'a' and 'b' are two shared variables that are related to each other, so modifying them needs to be an atomic operation.
A reader-writer system can be of the following types:
rr
ww
r-w
r-ww
rr-w
rr-ww
where
[ r : single reader
rr : multiple readers
w : single writer
ww : multiple writers ]
Now, we can have a read method for a reader and a write method for a writer as follows. I have written them by system type.
rr
read_method
{ read a; read b; }
ww
write_method
{ lock(m); write a; write b; unlock(m); }
r-w
r-ww
rr-w
rr-ww
read_method
{ lock(m); read a; read b; unlock(m); }
write_method
{ lock(m); write a; write b; unlock(m); }
For a multiple-reader system, shared variable access doesn't need to be atomic.
For a multiple-writer system, shared variable access needs to be atomic, so it is locked with 'm'.
But for system types 3 to 6, are my read_method and write_method correct? How can I improve them?
Sincerely,
Srinivas Nayak

If you want to use Java you can try ReentrantReadWriteLock.
You can find several tutorials on its usage.

If you want to use .NET you can try ReaderWriterLockSlim. I believe it gives you the exact functionality you need. You can also read about the way they implemented it to learn how to implement such locks yourself.
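For comparison, C++17 offers the same reader-writer semantics via std::shared_mutex. A minimal sketch of the question's read_method and write_method for system types 3 to 6 (names taken from the question; this is an illustration, not the only correct form):

#include <mutex>        // std::unique_lock
#include <shared_mutex> // std::shared_mutex, std::shared_lock

int a = 0, b = 0;    // the two related shared variables
std::shared_mutex m; // readers share the lock; a writer holds it exclusively

void read_method() {
    std::shared_lock<std::shared_mutex> lock(m); // many readers may hold this at once
    int localA = a; // read a
    int localB = b; // read b
    (void)localA; (void)localB;
}

void write_method(int newA, int newB) {
    std::unique_lock<std::shared_mutex> lock(m); // exclusive: a and b change as one atomic pair
    a = newA; // write a
    b = newB; // write b
}

With a plain mutex, as in the question, the methods are still correct, but readers needlessly exclude each other; a shared (reader-writer) lock lets concurrent readers proceed while writers remain exclusive.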

Is assigning a pointer atomic in Go?
Do I need to assign a pointer in a lock? Suppose I just want to assign the pointer to nil, and would like other threads to be able to see it. I know in Java we can use volatile for this, but there is no volatile in Go.
The only things which are guaranteed to be atomic in Go are the operations in sync/atomic.
So if you want to be certain, you'll either need to take a lock, e.g. sync.Mutex, or use one of the atomic primitives. I don't recommend using the atomic primitives, though, as you'll have to use them everywhere you use the pointer, and they are difficult to get right.
Using the mutex is OK Go style - you could define a function to return the current pointer with locking very easily, e.g. something like
import "sync"
var secretPointer *int
var pointerLock sync.Mutex
func CurrentPointer() *int {
pointerLock.Lock()
defer pointerLock.Unlock()
return secretPointer
}
func SetPointer(p *int) {
pointerLock.Lock()
secretPointer = p
pointerLock.Unlock()
}
These functions return a copy of the pointer to their clients, which will stay constant even if the master pointer is changed. This may or may not be acceptable depending on how time-critical your requirement is. It should be enough to avoid any undefined behaviour - the garbage collector will ensure that the pointers remain valid at all times, even if the memory pointed to is no longer used by your program.
An alternative approach would be to do the pointer access from only one goroutine and use channels to command that goroutine into doing things. That would be considered more idiomatic Go, but may not suit your application exactly.
Update
Here is an example showing how to use atomic.StorePointer. It is rather ugly due to the use of unsafe.Pointer. However, unsafe.Pointer casts compile to nothing, so the runtime cost is small.
package main

import (
    "fmt"
    "sync/atomic"
    "unsafe"
)

type Struct struct {
    p unsafe.Pointer // some pointer
}

func main() {
    data := 1
    info := Struct{p: unsafe.Pointer(&data)}
    fmt.Printf("info is %d\n", *(*int)(info.p))

    otherData := 2
    atomic.StorePointer(&info.p, unsafe.Pointer(&otherData))
    fmt.Printf("info is %d\n", *(*int)(info.p))
}
Since the spec doesn't specify, you should assume it is not. Even if it is currently atomic, it's possible that this could change without ever violating the spec.
In addition to Nick's answer, since Go 1.4 there is the atomic.Value type. Its Store(interface{}) and Load() interface{} methods take care of the unsafe.Pointer conversion.
Simple example:
package main

import (
    "sync/atomic"
)

type stats struct{}

type myType struct {
    stats atomic.Value
}

func main() {
    var t myType
    s := new(stats)
    t.stats.Store(s)
    s = t.stats.Load().(*stats)
}
Or a more extended example from the documentation on the Go playground.
Since Go 1.19, atomic.Pointer has been added to sync/atomic:
The sync/atomic package defines new atomic types Bool, Int32, Int64, Uint32, Uint64, Uintptr, and Pointer. These types hide the underlying values so that all accesses are forced to use the atomic APIs. Pointer also avoids the need to convert to unsafe.Pointer at call sites. Int64 and Uint64 are automatically aligned to 64-bit boundaries in structs and allocated data, even on 32-bit systems.
Sample
package main

import (
    "fmt"
    "net"
    "sync/atomic"
    "time"
)

type ServerConn struct {
    Connection net.Conn
    ID         string
}

func ShowConnection(p *atomic.Pointer[ServerConn]) {
    for {
        time.Sleep(10 * time.Second)
        fmt.Println(p, p.Load())
    }
}

func main() {
    c := make(chan bool)
    p := atomic.Pointer[ServerConn]{}
    s := ServerConn{ID: "first_conn"}
    p.Store(&s)
    go ShowConnection(&p)
    go func() {
        for {
            time.Sleep(13 * time.Second)
            newConn := ServerConn{ID: "new_conn"}
            p.Swap(&newConn)
        }
    }()
    <-c // block forever so the goroutines keep running
}
Please note that atomicity has nothing to do with "I just want to assign the pointer to nil, and would like other threads to be able to see it". The latter property is called visibility.
The answer to the former, as of right now, is yes: assigning (loading/storing) a pointer is atomic in Go. This follows from the updated Go memory model:
Otherwise, a read r of a memory location x that is not larger than a machine word must observe some write w such that r does not happen before w and there is no write w' such that w happens before w' and w' happens before r. That is, each read must observe a value written by a preceding or concurrent write.
Regarding visibility, the question does not have enough information to be answered concretely. If you merely want to know whether you can dereference the pointer safely, then a plain load/store would be enough. However, the most likely case is that you want to communicate some information based on the nullness of the pointer. That requires using sync/atomic, which provides synchronisation capabilities.

Debugging in Threading Building Blocks

I would like to program with tasks in Threading Building Blocks (TBB). But how does one do the debugging in practice?
In general the print method is a solid technique for debugging programs.
In my experience with MPI parallelization, the right way to do logging is for each thread to print its debugging information to its own file (say "debug_irank", with irank the rank in MPI_COMM_WORLD) so that logical errors can be found.
How can something similar be achieved with TBB? It is not clear how to access the thread number in the thread pool, as this is obviously something internal to TBB.
Alternatively, one could add an additional index specifying the rank when a task is generated, but this makes the code rather complicated, since the whole program has to take care of that.
First, get the program working with 1 thread. To do this, construct a task_scheduler_init as the first thing in main, like this:
#include "tbb/tbb.h"
int main() {
tbb::task_scheduler_init init(1);
...
}
Be sure to compile with the macro TBB_USE_DEBUG set to 1 so that TBB's checking will be enabled.
If the single-threaded version works, but the multi-threaded version does not, consider using Intel Inspector to spot race conditions. Be sure to compile with TBB_USE_THREADING_TOOLS so that Inspector gets enough information.
Otherwise, I usually start by adding assertions, because the machine can check assertions much faster than I can read log messages. If I am really puzzled about why an assertion is failing, I use printfs and task ids (not thread ids). The easiest way to create a task id is to allocate one by post-incrementing a tbb::atomic<size_t> and storing the result in the task.
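A minimal sketch of that task-id idea (MyTask is a hypothetical task type; std::atomic works the same way as tbb::atomic here):

#include <atomic>
#include <cstddef>
#include <cstdio>

std::atomic<std::size_t> nextTaskId{0}; // shared counter for handing out ids

struct MyTask {
    std::size_t id;
    MyTask() : id(nextTaskId++) {} // post-increment allocates a unique id per task
    void run() {
        std::printf("task %zu running\n", id); // tag output with the task id, not the thread id
        // ... actual work ...
    }
};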
If I'm having a really bad day and the printfs are changing program behavior so that the error does not show up, I use "delayed printfs". Stuff the printf arguments in a circular buffer, and run printf on the records later after the failure is detected. Typically for the buffer, I use an array of structs containing the format string and a few word-size values, and make the array size a power of two. Then an atomic increment and mask suffices to allocate slots. E.g., something like this:
const size_t bufSize = 1024;

struct record {
    const char* format;
    void *arg0, *arg1;
};

tbb::atomic<size_t> head;
record buf[bufSize];

void recf(const char* fmt, void* a, void* b) {
    record* r = &buf[head++ & (bufSize - 1)];
    r->format = fmt;
    r->arg0 = a;
    r->arg1 = b;
}

void recf(const char* fmt, int a, int b) {
    record* r = &buf[head++ & (bufSize - 1)];
    r->format = fmt;
    r->arg0 = (void*)a;
    r->arg1 = (void*)b;
}
The two recf routines record the format and the values. The casting is somewhat abusive, but on most architectures you can print the record correctly in practice with printf(r->format, r->arg0, r->arg1), even if the 2nd overload of recf created the record.

Memory coherence with respect to c++ initializers

If I set the value of a variable in one thread and read it in another, I protect it with a lock to ensure that the second thread reads the value most recently set by the first:
Thread 1:
lock();
x=3;
unlock();
Thread 2:
lock();
<use the value of x>
unlock();
So far, so good. However, suppose I have a C++ object that sets the value of x in an initializer:
theClass::theClass() : x(3) ...
theClass theInstance;
Then I spawn a thread that uses theInstance. Is there any guarantee that the newly spawned thread will see the proper value of x? Or is it necessary to place a lock around the declaration of theInstance? I am interested primarily in C++ on Linux.
Prior to C++11, the C++ standard had nothing to say about multiple threads of execution and so made no guarantees of anything.
C++11 introduced a memory model that defines under what circumstances memory written on one thread is guaranteed to become visible to another thread.
Construction of an object is not inherently synchronized across threads. In your particular case though, you say you first construct the object and then 'spawn a thread'. If you 'spawn a thread' by constructing an std::thread object and you do it after constructing some object x on the same thread then you are guaranteed to see the proper value of x on the newly spawned thread. This is because the completion of the thread constructor synchronizes-with the beginning of your thread function.
The term synchronizes-with is a specific term used in defining the C++ memory model and it's worth understanding exactly what it means to understand more complex synchronization but for the case you outline things 'just work' without needing any additional synchronization.
This is all assuming you're using std::thread. If you're using platform threading APIs directly then the C++ standard has nothing to say about what happens but in practice you can assume it will work without needing a lock on any platform I know of.
You seem to have a misconception on locks:
If I set the value of a variable in one thread and read it in another,
I protect it with a lock to ensure that the second thread reads the
value most recently set by the first.
This is incorrect. Locks are used to prevent data races. Locks do not schedule the instructions of Thread 1 to happen before the instructions of Thread 2. With your lock in place, Thread 2 can still run before Thread 1 and read the value of x before Thread 1 changes the value of x.
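A minimal sketch of that point (illustration only): both interleavings below are possible, because the lock only prevents a data race, it does not pick which thread acquires it first.

#include <mutex>
#include <thread>

std::mutex mtx;
int x = 0;

int main() {
    std::thread t1([] { std::lock_guard<std::mutex> g(mtx); x = 3; });
    std::thread t2([] {
        std::lock_guard<std::mutex> g(mtx);
        int v = x; // may be 0 or 3, depending on which thread locked first
        (void)v;
    });
    t1.join();
    t2.join();
}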
As for your question:
If your initialization of theInstance happens-before the initialization/start of a certain thread A, then thread A is guaranteed to see the proper value of x.
Example
#include <thread>
#include <functional> // for std::ref
#include <assert.h>

struct C
{
    C(int x) : x_{ x } {}
    int x_;
};

void f(C const& c)
{
    assert(c.x_ == 42);
}

int main()
{
    C c{ 42 };                       // A
    std::thread t{ f, std::ref(c) }; // B
    t.join();
}
In the same thread: A is sequenced-before B, therefore A happens-before B. The assert in thread t will thus never fire.
If your initialization of 'theInstance' inter-thread happens-before its usage by a certain thread A, then thread A is guaranteed to see the proper value of x.
Example
#include <thread>
#include <atomic>
#include <functional> // for std::ref
#include <assert.h>

struct C
{
    int x_;
};

std::atomic<bool> is_init;

void f0(C& c)
{
    c.x_ = 37;           // B
    is_init.store(true); // C
}

void f1(C const& c)
{
    while (!is_init.load()); // D
    assert(c.x_ == 37);      // E
}

int main()
{
    is_init.store(false); // A
    C c;
    std::thread t0{ f0, std::ref(c) };
    std::thread t1{ f1, std::ref(c) };
    t0.join();
    t1.join();
}
The inter-thread happens-before relationship occurs between t0 and t1. As before, A happens-before the creation of threads t0 and t1.
The assignment c.x_ = 37 (B) happens-before the store to the is_init flag (C). The loop in f1 is the source of the inter-thread happens-before relationship: f1 only proceeds once is_init is set, so the store (C) synchronizes-with the load (D). Since these relationships are transitive, B inter-thread happens-before E. Thus, the assert in f1 will never fire.
First of all, your example above doesn't warrant any locks. All you need to do is declare your variable atomic. No locks, no worries.
Second, your question does not really make a lot of sense. Since you cannot use your object (an instance of the class) before it is constructed, and construction happens within a single thread, there is no need to lock anything done in the class constructor. You simply cannot access a not-yet-constructed object from multiple threads; it is impossible.
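To illustrate the first point, a minimal sketch (reusing the x from the original example; illustration only) of declaring the variable atomic instead of locking:

#include <atomic>
#include <thread>

std::atomic<int> x{0};

int main() {
    std::thread t1([] { x.store(3); });                 // Thread 1: atomic write, no lock
    std::thread t2([] { int v = x.load(); (void)v; });  // Thread 2: atomic read, no lock, no data race
    t1.join();
    t2.join();
}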

Is there a way in C++11 to prevent "normal" operations from slipping before or after an atomic operation?

I'm interested in doing something like this (a single thread updates, multiple threads read bannedURLs):
atomic<bannedURLList*> bannedURLs; // global variable pointing to the currently used instance of the struct

void updateList()
{
    // no need for mutex because only 1 thread updates
    bannedURLList* newList = new bannedURLList();
    bannedURLList* oldList = bannedURLs;
    newList->initialize();
    bannedURLs = newList; // line must be after previous line, because list must be
                          // initialized before it is ready to be used
    // while refcnt on the oldList > 0 wait, then delete oldList;
}

reader threads do something like this:

{
    bannedURLs->refCnt++;
    // use bannedURLs
    bannedURLs->refCnt--;
}
The struct member refCnt is also an atomic integer.
My question is how to prevent reordering of these two lines:
newList->initialize();
bannedURLs=newList;
Can it be done in a std:: way?
Use bannedURLs.store(newList); instead of bannedURLs=newList;. Note that on a std::atomic both forms are sequentially consistent stores (assignment is equivalent to store() with the default memory_order_seq_cst), so neither the compiler nor the CPU may move newList->initialize() past the store; calling store() explicitly just makes the ordering visible in the code and lets you pass a weaker ordering specifier if you ever want one.
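As a sketch of how that publication pattern looks with explicit memory orders on std::atomic (the refcount-based reclamation of oldList is elided, as in the question):

#include <atomic>

struct bannedURLList {
    void initialize() { /* fill in the list */ }
    // ...
};

std::atomic<bannedURLList*> bannedURLs{nullptr};

void updateList() {
    bannedURLList* newList = new bannedURLList();
    newList->initialize();
    // Release store: everything written to *newList above becomes visible
    // to any thread that observes the new pointer with an acquire load.
    bannedURLs.store(newList, std::memory_order_release);
}

bannedURLList* currentList() {
    // Acquire load: pairs with the release store, so the list is fully
    // initialized by the time a reader dereferences it.
    return bannedURLs.load(std::memory_order_acquire);
}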

Looking for a lock-free RT-safe single-reader single-writer structure

I'm looking for a lock-free design conforming to these requisites:
a single writer writes into a structure and a single reader reads from this structure (this structure exists already and is safe for simultaneous read/write)
but at some time, the structure needs to be changed by the writer, which then initialises, switches and writes into a new structure (of the same type but with new content)
and at the next read, the reader switches to this new structure (if the writer has switched to a new structure multiple times in between, the reader discards the intermediate structures, ignoring their data).
The structures must be reused, i.e. no heap memory allocation/free is allowed during write/read/switch operation, for RT purposes.
I have currently implemented a ring buffer containing multiple instances of these structures; but this implementation suffers from the fact that when the writer has used all the structures present in the ring buffer, there is no free structure left to switch to... Yet the rest of the ring buffer contains data which doesn't have to be read by the reader but can't be reused by the writer. As a consequence, the ring buffer does not fit this purpose.
Any idea (name or pseudo-implementation) of a lock-free design? Thanks for having considered this problem.
Here's one. The keys are that there are three buffers and that the reader reserves the buffer it is reading from. The writer writes to one of the other two buffers, so the risk of collision is minimal. Plus, this design expands: just make your member arrays one element longer than the number of readers plus the number of writers.
class RingBuffer
{
public:
    RingBuffer() : lastFullWrite(0)
    {
        // Initialize the elements of dataBeingRead to false
        for (unsigned int i = 0; i < DATA_COUNT; i++)
        {
            dataBeingRead[i] = false;
        }
    }

    Data read()
    {
        // You may want to check to make sure write has been called once here
        // to prevent read from grabbing junk data. Else, initialize the elements
        // of dataArray to something valid.
        unsigned int indexToRead = lastFullWrite;
        Data dataCopy;
        dataBeingRead[indexToRead] = true;
        dataCopy = dataArray[indexToRead];
        dataBeingRead[indexToRead] = false;
        return dataCopy;
    }

    void write(const Data& dataArg)
    {
        unsigned int writeIndex(0);
        // Search for an unused piece of data.
        // It's O(n), but plenty fast enough for small arrays.
        while (writeIndex < DATA_COUNT && dataBeingRead[writeIndex])
        {
            writeIndex++;
        }
        dataArray[writeIndex] = dataArg;
        lastFullWrite = writeIndex;
    }

private:
    static const unsigned int DATA_COUNT = 3; // one buffer being read, two for writing
    unsigned int lastFullWrite;
    Data dataArray[DATA_COUNT];
    bool dataBeingRead[DATA_COUNT];
};
Note: as written here, two copies are made when reading your data. If you pass your data out of the read function through a reference argument, you can cut that down to one copy.
You're on the right track.
Lock free communication of fixed messages between threads/processes/processors
Fixed-size ring buffers can be used for lock-free communication between threads, processes or processors if there is one producer and one consumer. Some checks to perform:
head variable is written only by producer (as an atomic action after writing)
tail variable is written only by consumer (as an atomic action after reading)
Pitfall: introduction of a size variable or buffer full/empty flag; these are typically written by both producer and consumer and hence will give you an issue.
I generally use ring buffers for this purpose. The most important lesson I've learned is that a ring buffer of size n can never contain more than n-1 elements (otherwise the full and empty conditions become indistinguishable). This way the head and tail variables are written only by the producer and the consumer, respectively. See the sketch below.
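A minimal C++ sketch of such a single-producer single-consumer ring buffer (names are illustrative; capacity N holds at most N-1 elements):

#include <atomic>
#include <cstddef>

template <typename T, std::size_t N>
class SpscQueue {
public:
    bool push(const T& item) { // called only by the producer
        std::size_t h = head.load(std::memory_order_relaxed);
        std::size_t next = (h + 1) % N;
        if (next == tail.load(std::memory_order_acquire))
            return false;      // full: at most N-1 elements are stored
        buf[h] = item;
        head.store(next, std::memory_order_release); // publish only after the write
        return true;
    }

    bool pop(T& item) {        // called only by the consumer
        std::size_t t = tail.load(std::memory_order_relaxed);
        if (t == head.load(std::memory_order_acquire))
            return false;      // empty
        item = buf[t];
        tail.store((t + 1) % N, std::memory_order_release); // free the slot only after the read
        return true;
    }

private:
    T buf[N];
    std::atomic<std::size_t> head{0}; // written only by the producer
    std::atomic<std::size_t> tail{0}; // written only by the consumer
};

Note how head is stored only after the element is written, and tail only after it is read, matching the checks listed above.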
Extension for large/variable size blocks
To use buffers in a real-time environment, you can either use memory pools (often available in optimized form in real-time operating systems) or decouple allocation from usage. The latter fits the question, I believe.
If you need to exchange large blocks, I suggest using a pool of buffer blocks and communicating pointers to buffers using a queue. So use a third queue with buffer pointers. This way the allocations can be done in the application (background) and your real-time portion has access to a variable amount of memory.
Application:

    while (blockQueue.full != true)
    {
        buf = allocate block of memory from heap or buffer pool
        msg = { .... , buf };
        blockQueue.Put(msg)
    }

Producer:

    pBuf = blockQueue.Get()
    pQueue.Put()

Consumer:

    if (pQueue.Empty == false)
    {
        msg = pQueue.Get()
        // use info in msg, with buf pointer
        // optionally indicate that buf is no longer used
    }
