Delete framebuffer object from a different but shared context - multithreading

I have a few OpenGL contexts, which are created from the main thread directly at the beginning of the program. At that time they are also shared with the wglShareLists(contextItem.hglrc, hglrc); call. I also have a lot of threads, and each thread gets one context with the wglMakeCurrent(hdc, m_vUsingContexts[i].hglrc); call.
Now I only want to know: if I have thread1 linked to context1 and thread2 linked to context2, and both contexts are shared, is it possible for thread1 to create a framebuffer object and thread2 to delete that framebuffer object? (Yes or no is quite enough.)
I know that this is an absurd thing to do. Usually the thread that creates something should also be the one to delete its own stuff. But I'm not able to change that, because it is a DirectX 11 program and I'm only writing the OpenGL driver for it. In DirectX 11 it does not matter which thread creates or deletes.
Can I also do the same with vertex buffer objects and textures?

Framebuffer objects are container objects and therefore are not shared across contexts. So no, you may not delete FBOs themselves. Indeed, you cannot access them in any way across contexts, since they are not shared.
However, Texture and Renderbuffer objects can be shared across contexts. So you can delete them in another context. Not that this will necessarily free up memory, of course. By the rules of OpenGL's context model, the objects will continue to exist so long as they are attached or bound to something else.
Object destruction needs to be managed very carefully when using multiple contexts.
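A minimal sketch of how that split plays out, assuming each thread has already made its own shared context current (the GLEW header and the texture size are placeholders, not from the question): the texture name is shared state, while each context creates and deletes its own FBO around it.
```cpp
// Texture objects are shared across contexts; FBOs (container objects) are not.
#include <GL/glew.h>

GLuint gSharedTexture = 0;  // visible to both contexts once created

// Thread 1, context 1: create the shared texture and a context-local FBO.
void createOnThread1()
{
    glGenTextures(1, &gSharedTexture);
    glBindTexture(GL_TEXTURE_2D, gSharedTexture);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 256, 256, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, nullptr);

    GLuint fbo = 0;                 // container object: exists only in context 1
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, gSharedTexture, 0);
    // ... render ...
    glDeleteFramebuffers(1, &fbo);  // must be deleted by context 1 itself
}

// Thread 2, context 2: legal, because texture objects are shared.
void deleteOnThread2()
{
    glDeleteTextures(1, &gSharedTexture);
    // The name is marked for deletion; the storage lives on while context 1
    // still has it attached to its FBO.
    glFlush();  // help propagate the change to the other context
}
```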

Related

Can a singleton object have a (thread-safe) method executing in different threads simultaneously?

If so, how will different threads share the same instance or chunk of memory that represents the object? Will different threads somehow "copy" the single-instance object method code to be run on its own CPU resources?
EDIT - to clarify this question further:
I understand that different threads can be in the "process" of executing a singleton object's method at the same time, while they might not all be actively executing - they may be waiting to be scheduled for execution by the OS, and in various states of execution within the method. This question is specifically about multiple active threads, each executing on a different processor. Can multiple threads be actively executing the same exact code path (the same region of memory) at the same time?
Can a singleton object have a (thread-safe) method executing in different threads simultaneously?
Yes, of course. Note that this is no different from multiple threads running a method on the same non-singleton object instance.
how will different threads share the same instance or chunk of memory that represents the object?
(Ignoring NUMA and processor caches), the object exists in only one place in memory (hence, it's a singleton) so the different threads are all reading from the same memory addresses.
Now, if the object is immutable (and doesn't have external side-effects, like IO) then that's okay: multiple threads reading from the same memory and never changing it doesn't introduce any problems.
By analogy, this is like having a single (single-sided) piece of paper on a desk and 10 people reading it simultaneously: no problem.
If the object is mutable (or does have external side-effects like IO) then that's a problem :) You need to synchronize each thread's changes, otherwise they'll overwrite each other - and that is bad.
Note that in many languages and platforms, like C#/.NET and C++, documentation may assert that a method is "thread-safe", but this is not necessarily absolutely guaranteed by the runtime (except where a function is provably "const-correct" and does no IO, which is something C++ can express but C# currently cannot).
Note that just because a single method is thread-safe doesn't mean an entire logical operation (e.g. calling multiple object methods in sequence) is thread-safe. For example, consider a Dictionary<K,V>: many threads can simultaneously call TryGetValue without issues - but as soon as one thread calls .Add, that breaks all of the other threads calling TryGetValue (because Dictionary<K,V> does not guarantee that .Add is atomic and blocks TryGetValue; this is why we use ConcurrentDictionary, which has a different API - or we wrap our logical business/domain operations in lock (Monitor) statements).
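The same pitfall sketched in C++ terms (SafeCache is a hypothetical name, not from the question): many concurrent readers are fine, but a writer must exclude all readers, so reads and writes have to go through one lock.
```cpp
// C++ analogue of the Dictionary<K,V> pitfall: concurrent reads are safe,
// but an unsynchronized writer corrupts them, so both share one lock.
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <unordered_map>

class SafeCache {                       // hypothetical name, for illustration
    std::unordered_map<std::string, int> map_;
    mutable std::shared_mutex mutex_;
public:
    // Many readers may hold the shared lock at the same time.
    std::optional<int> tryGet(const std::string& key) const {
        std::shared_lock lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }
    // A writer takes the exclusive lock, blocking every reader.
    void add(const std::string& key, int value) {
        std::unique_lock lock(mutex_);
        map_[key] = value;
    }
};
```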
Will different threads somehow "copy" the single-instance object method code to be run on its own CPU resources?
No.
You can make that happen with ThreadLocal&lt;T&gt; and [ThreadStatic] (and AsyncLocal&lt;T&gt;), but proponents of functional programming techniques (like myself) will strongly recommend against this for many reasons. If you want a thread to "own" something, then pass the object on the stack as a parameter instead of hiding it in static state.
(Also note this does not apply to value types (e.g. struct), which are always copied between uses - but I assume you're referring to class object instances, fwiw, as .NET discourages mutable structs anyway.)
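A small sketch of that thread-ownership idea in C++ terms (using the standard library's thread_local, since C++ is used for all sketches here): the method's code is shared, but each thread gets its own private copy of the variable.
```cpp
// thread_local: shared code, per-thread data, so there is no data race.
#include <iostream>
#include <thread>

thread_local int counter = 0;   // one independent instance per thread

void work() {
    for (int i = 0; i < 1000; ++i) ++counter;  // touches only this thread's copy
    std::cout << "this thread counted to " << counter << '\n';  // prints 1000
}

int main() {
    std::thread a(work), b(work);
    a.join();
    b.join();
}
```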

How are threads made?

I was reading up on threads and as I understand it, they are a set of values for an execution context. From what I understand, a thread is comprised of values (registers, PC, stack, etc.) that allow a CPU to continue running a set of instructions.
However, my question is: how are these threads made? I hear some of my professors throw around the word thread as a way to break up a process into multiple (mostly) independent parts of code (i.e. multithreading). How does this work? Is there another section of memory that stores specifically what a thread can run, as well as its state?
First of all, you have to understand that operating systems vary greatly in their general working as well as in their implementations of seemingly identical functions.
So don't go into this kind of question thinking that if one operating system does something in some way, then other operating systems do it in a similar manner.
Now, to your question:
how are these threads made?
I will answer it using Linux as an example. When creating a new process, Linux lets you specify which data structures (file descriptors, IO context, etc.) the new process shares with its parent process. You can do this using the clone system call.
You can see in the documentation of clone that it takes flags specifying the sharing properties.
Now, you can call a task_struct a thread if it shares all sharable data structures with its parent (because this property is consistent with the conventional definition of a thread), and if it shares none then you would call it a process.
But as far as Linux is concerned there is no notion of a thread or a process; all you have is a task_struct, which may share certain resources with its parent.
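A minimal sketch of that flags-based sharing, assuming Linux with glibc (the particular flag combination is just one illustrative choice):
```cpp
// The flags passed to clone(2) decide how "thread-like" the new task is.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

static int childFn(void*) {
    std::printf("child task, pid %d\n", getpid());
    return 0;
}

int main() {
    const size_t kStackSize = 1024 * 1024;
    char* stack = static_cast<char*>(std::malloc(kStackSize));

    // Share the address space, open files and filesystem info with the
    // parent: this task_struct behaves like a conventional thread.
    int flags = CLONE_VM | CLONE_FILES | CLONE_FS | SIGCHLD;
    // The stack grows downward, so pass a pointer to its top.
    pid_t pid = clone(childFn, stack + kStackSize, flags, nullptr);

    waitpid(pid, nullptr, 0);  // drop the CLONE_* flags above and the child
    std::free(stack);          // behaves like a fork()ed process instead
    return 0;
}
```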

Monitor vs Mutex

I read that a mutex is a semaphore with value 1 (a binary semaphore) used to enforce mutual exclusion.
I read this link
Semaphore vs. Monitors - what's the difference?
which says that monitor helps in achieving mutual exclusion.
Can someone tell me the difference between mutex and monitor as both help achieve the same thing (Mutual Exclusion)?
Since you haven't specified which OS or language/library you are talking about, let me answer in a generic way.
Conceptually they are the same, but usually they are implemented slightly differently.
Monitor
Usually, the implementation of monitors is faster/lighter-weight, since they are designed for multi-threaded synchronization within the same process. Also, they are usually provided by a framework/library itself (as opposed to requesting the OS).
Mutex
Usually, mutexes are provided by the OS kernel, and libraries/frameworks simply provide an interface to invoke them. This makes them heavier-weight/slower, but they work across threads in different processes. The OS might also provide features to access a mutex by name for easy sharing between instances of separate executables (as opposed to a handle that can only be shared via fork).
Monitors are different from mutexes, but they can be considered similar in the sense that monitors are built on top of mutexes.
A monitor is a synchronization construct that allows threads to have both mutual exclusion (using locks) and cooperation, i.e. the ability to make threads wait for a certain condition to become true (using a wait-set).
In other words, along with the data that implements a lock, every Java object is logically associated with data that implements a wait-set. Whereas locks help threads work independently on shared data without interfering with one another, wait-sets help threads cooperate with one another towards a common goal: waiting threads are moved to the wait-set and are notified once the lock is released. The wait-set, with the additional help of the lock (mutex), is what builds a monitor.
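A monitor in miniature, sketched in C++ (MonitorQueue is a hypothetical name): the mutex provides the lock and the condition variable plays the role of the wait-set.
```cpp
// A monitor = a lock (mutex) + a wait-set (condition variable), wrapped
// around the data they protect.
#include <condition_variable>
#include <mutex>
#include <queue>

template <typename T>
class MonitorQueue {                  // hypothetical name, for illustration
    std::queue<T> items_;
    std::mutex lock_;                 // mutual exclusion
    std::condition_variable waitSet_; // cooperation: wait/notify
public:
    void put(T value) {
        std::lock_guard<std::mutex> guard(lock_);
        items_.push(std::move(value));
        waitSet_.notify_one();        // wake one waiting consumer
    }
    T take() {
        std::unique_lock<std::mutex> guard(lock_);
        // Waiting releases the lock and re-acquires it on wake-up,
        // much like Object.wait() inside a synchronized block.
        waitSet_.wait(guard, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }
};
```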
You can find another relevant discussion here
Semaphore vs. Monitors - what's the difference?
Unfortunately the textbook definitions do not always correspond to how different platforms and languages use the terms, so to get precise answers you have to specify the platform and context. But in general:
A mutex is a lock which can only be owned by a single thread at a time. The lock doesn't in itself protect anything, but code can check for ownership of the mutex to ensure that some section of code is only executed by a single thread at a time. If a thread wants to acquire a mutex that is already held, the thread is blocked until the lock becomes available.
In Java terminology a monitor is a mutex lock which is implicitly associated with an object. When the synchronized keyword is applied to classes or methods an implicit mutex lock is created around the code, which ensures that only one thread at a time can execute it. This is called a monitor lock or just a monitor.
So in Java a monitor is not a specific object, rather any object has a monitor lock available which is invoked with the synchronized keyword.
The synchronized keyword can also be used on a block of code, in which case the object to lock on is explicitly specified. Here it gets a bit weird, because you can use the monitor of one object to lock access to another object.
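A rough C++ analogue of that per-object monitor lock (Account is a made-up example class, not from the question): each object carries its own mutex, and every "synchronized" method locks it on entry.
```cpp
// Mimicking Java's implicit per-object monitor lock with an explicit mutex.
#include <mutex>

class Account {                      // hypothetical example class
    long balance_ = 0;
    std::mutex monitor_;             // plays the role of the object's monitor
public:
    void deposit(long amount) {      // like a synchronized method
        std::lock_guard<std::mutex> guard(monitor_);
        balance_ += amount;
    }
    long balance() {                 // like synchronized(this) { return balance; }
        std::lock_guard<std::mutex> guard(monitor_);
        return balance_;
    }
};
```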
In computer science textbooks you may meet a different kind of monitor, the Brinch Hansen or Hoare monitor, which is a class or module that is implicitly thread-safe (like a synchronized class in Java) and which has multiple conditions threads can wait/signal on. This is a higher-level concept than the Java monitor.
C#/.NET has monitors similar to Java's, but also has a Mutex class in the standard library - which is different from the mutex lock used in the monitor. The monitor lock only exists inside a single process, while the Mutex lock is machine-wide. So a monitor lock is appropriate for making objects and data structures thread-safe, but not for providing system-wide exclusive access to, say, a file or device.
So bottom line: These terms can mean different things, so if you want a more specific answer you should specify a specific platform.

Am I allowed to simultaneously render from the same buffer object on multiple shared contexts in OpenGL 2.1?

In Apple's documentation, I read this:
1 — "Shared contexts share all texture objects, display lists, vertex programs, fragment programs, and buffer objects created before and after sharing is initiated."
2 — "Contexts that are on different threads can share object resources. For example, it is acceptable for one context in one thread to modify a texture, and a second context in a second thread to modify the same texture. The shared object handling provided by the Apple APIs automatically protects against thread errors."
So I expected to be able to create my buffer objects once, then use them to render simultaneously on multiple contexts. However if I do that, I get crashes on my NVIDIA GeForce GT 650M with backtraces like this:
Crashed Thread: 10 Dispatch queue: com.apple.root.default-qos
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: EXC_I386_GPFLT
…
Thread 10 Crashed:: Dispatch queue: com.apple.root.default-qos
0 GLEngine 0x00007fff924111d7 gleLookupHashObject + 51
1 GLEngine 0x00007fff925019a9 gleBindBufferObject + 52
2 GLEngine 0x00007fff9243c035 glBindBuffer_Exec + 127
I've posted my complete code at https://gist.github.com/jlstrecker/9df10ef177c2a49bae3e. At the top, there's #define SHARE_BUFFERS — when commented out it works just fine, but uncommented it crashes.
I'm not looking to debate whether I should be using OpenGL 2.1 — it's a requirement of other software I'm interfacing with. Nor am I looking to debate whether I should use GLUT — my example code just uses that since it's included on Mac and doesn't have any external dependencies. Nor am I looking for feedback on performance/optimization.
I'd just like to know if I can expect to be able to simultaneously render from a single shared buffer object on multiple contexts — and if so, why my code is crashing.
We also ran into the 'gleLookupHashObject' crash and made a small repro-case (very similar to yours) which was posted in an 'incident' to Apple support. After investigation, an Apple DTS engineer came back with the following info, quoting:
"It came to my attention that glFlush() is being called on both the main thread and also a secondary thread that binds position data. This would indeed introduce issues and, while subtle, actually does indicate that the constraints we place on threads and GL contexts aren’t being fully respected.
At this point it behoves you to either further investigate your implementation to ensure that such situations are avoided or, better yet, extend your implementation with explicit synchronization mechanisms (such as what we offer with GCD). "
So if you run into this crash you will need to do explicit synchronization on the application side (pending a fix on the driver-side).
Summary of relevant snippets related to "OpenGL, Contexts and Threading" from the official Apple Documentation:
[0] Section: "Use Multiple OpenGL Contexts"
If your application has multiple scenes that can be rendered in parallel, you can use a context for each scene you need to render. Create one context for each scene and assign each context to an operation or task. Because each task has its own context, all can submit rendering commands in parallel.
https://developer.apple.com/library/mac/documentation/GraphicsImaging/Conceptual/OpenGL-MacProgGuide/opengl_threading/opengl_threading.html#//apple_ref/doc/uid/TP40001987-CH409-SW6
[1] Section: Guidelines for Threading OpenGL Applications
(a) Use only one thread per context. OpenGL commands for a specific context are not thread safe. You should never have more than one thread accessing a single context simultaneously.
(b) Contexts that are on different threads can share object resources. For example, it is acceptable for one context in one thread to modify a texture, and a second context in a second thread to modify the same texture. The shared object handling provided by the Apple APIs automatically protects against thread errors. And, your application is following the "one thread per context" guideline.
https://developer.apple.com/library/mac/documentation/GraphicsImaging/Conceptual/OpenGL-MacProgGuide/opengl_threading/opengl_threading.html
[2] OpenGL Restricts Each Context to a Single Thread
Each thread in an OS X process has a single current OpenGL rendering context. Every time your application calls an OpenGL function, OpenGL implicitly looks up the context associated with the current thread and modifies the state or objects associated with that context.
OpenGL is not reentrant. If you modify the same context from multiple threads simultaneously, the results are unpredictable. Your application might crash or it might render improperly. If for some reason you decide to set more than one thread to target the same context, then you must synchronize threads by placing a mutex around all OpenGL calls to the context, such as gl* and CGL*. OpenGL commands that block (such as fence commands) do not synchronize threads.
https://developer.apple.com/library/mac/documentation/GraphicsImaging/Conceptual/OpenGL-MacProgGuide/opengl_threading/opengl_threading.html
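If you do end up with more than one thread targeting the same context, the guideline in [2] boils down to a single mutex around every GL call. A minimal sketch, assuming a shared buffer name is passed in (the function and variable names are illustrative):
```cpp
// Serialize all GL calls to a context shared between threads.
#include <mutex>
#include <OpenGL/gl.h>

std::mutex gGLMutex;  // guards every gl*/CGL* call on the shared context

void drawFromAnyThread(GLuint buffer)
{
    std::lock_guard<std::mutex> guard(gGLMutex);
    glBindBuffer(GL_ARRAY_BUFFER, buffer);
    // ... issue the rest of the frame's commands under the same lock ...
    glFlush();
}
```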

guarding a critical section with a mutex

Let's say that I have a shared object that has a piece of code protected by a critical section, and more than one thread accesses the object for read/write. When a thread is inside the critical section, the other threads wait. Once the thread gets out of the CS, the OS gives access to one of the waiting threads.
If I am confined to only one process, is the CS alone good protection for the shared object?
I ask because I have seen on the web that the right way to do it is to use a kernel object (e.g. mutex, semaphore) to guard the CS. A thread wishing to use the shared resource first needs to obtain the mutex/semaphore with a WaitForSingleObject type of function. If a mutex is used then only one thread can access the resource. Once the mutex is obtained, the thread enters the CS, does what it is supposed to do, then leaves the CS and releases the mutex. Then the OS allows any other waiting thread to obtain the mutex, and so on and so forth.
But isn't that the same as using only the CS?
Also, using a mutex is supposed to be significantly slower than using a CS alone. The only problem I see with using only a CS is that if a thread crashes inside the CS then the other threads may never access the shared resource.
Is there any other reason why this approach is better?
Thanks in advance
It sounds like you're discussing some Windows-specific terminology in a way that's getting it mixed up with some general computer science terminology.
In computer science the term "critical section" is used for areas of code that must run exclusively (usually due to data sharing). In Windows, there's a synchronization object called CRITICAL_SECTION that can be used to provide exclusive access to areas of execution. Another attribute of a CRITICAL_SECTION object on Windows is that it is limited to being used within a single process.
In computer science, the term 'mutex' is often used to describe an object that can be used to provide synchronization among parallel or concurrent threads of execution. In Windows, there is also a mutex object, which can be created with the CreateMutex() function (which returns a HANDLE representing the mutex). That object can be used to synchronize access among threads in the same or different processes, so in many ways it can be used similarly to a CRITICAL_SECTION (but with different APIs). If you want to synchronize threads of execution that are in different processes, a mutex object can be used, while a CRITICAL_SECTION object cannot.
So to answer your question (I think): if you are only concerned with protecting a critical section among threads that are part of the same process, a CRITICAL_SECTION object should be adequate. A mutex object can be used instead, but it may be somewhat less performant. There should be no need to use both types of objects.
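A minimal sketch of both options on Windows (the function names and the Global\ mutex name are illustrative, not from the question):
```cpp
// In-process CRITICAL_SECTION vs. cross-process named mutex.
#include <windows.h>

CRITICAL_SECTION gCs;  // process-local; stays in user mode when uncontended

void initOnce() { InitializeCriticalSection(&gCs); }

void touchSharedData()
{
    EnterCriticalSection(&gCs);
    // ... read/write the shared object ...
    LeaveCriticalSection(&gCs);
}

void touchAcrossProcesses()
{
    // Kernel object, visible to other processes by name.
    HANDLE mutex = CreateMutexW(nullptr, FALSE, L"Global\\MySharedResource");
    if (mutex && WaitForSingleObject(mutex, INFINITE) == WAIT_OBJECT_0) {
        // ... use the resource shared with other processes ...
        ReleaseMutex(mutex);
    }
    if (mutex) CloseHandle(mutex);
}
```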
