Draw from a separate thread with NSOpenGLLayer - multithreading

I'm working on an app which needs to draw with OpengGL at a refresh rate at least equal to the refresh rate of the monitor. And I need to perform the drawing in a separate thread so that drawing is never locked by intense UI actions.
Actually I'm using a NSOpenGLView in combination with CVDisplayLink and I'm able to achive 60-80FPS without any problem.
Since I need also to display some cocoa controls on top of this view I tried to subclass NSOpenGLView and make it layer-backed, following LayerBackedOpenGLView Apple example.
The result isn't satisfactory and I get a lot of artifacts.
Therefore I've solved the problem using a separate NSWindow to host the cocoa controls and adding this window as a child window of the main window containing the NSOpenGLView.
It works fine and I'm able to get quite the same FPS as the initial implementation.
Since I consider this solution quite like a dirty hack, I'm looking for an alternative and more clean way of achiving what I need.
Few days ago I came across NSOpenGLLayer and I thought that it could be used as a viable solution for my problem.
So finally, after all this preamble, here comes my question:
is it possible to draw to a NSOpenGLLayer from a separate thread using CVDisplayLink callback?.
So far I've tried to implement this but I'm not able to draw from the CVDisplayLink callback. I can only -setNeedsDisplay:TRUE on the NSOpenGLLayer from the CVDisplayLink callback and then perform the drawing in -drawInOpenGLContext:pixelFormat:forLayerTime:displayTime: when it gets automatically called by cocoa. But I suppose that this way I'm drawing from the main thread, isn't it?
After googling for this I've even found this post in which the user claims that under Lion drawing can occur only inside -drawInOpenGLContext:pixelFormat:forLayerTime:displayTime:.
I'm on Snow Leopard at the moment but the app should run flawlessly even on Lion.
Am I missing something?

Yes, it is possible, though not recommend. Call display on the layer from within your CVDisplayLink. This will cause canDrawInContext:... to be called and if it returns YES, drawInContext:... will be called and all this on whatever thread called display. To make the rendered image visible on screen, you have to call [CATransaction flush]. This method has been suggested on the Apple mailing list, though it is not completely problem free (the display method of other view may get called on your background thread as well and not all views support rendering from a background thread).
The recommend way is to make the layer asynchronous and render the OpenGL context on main thread. If you cannot achieve a good framerate that way, since your main thread is busy elsewhere, it is recommend to rather move everything else (pretty much your whole application logic) to other threads (e.g. using Grand Central Dispatch) and only keep user input and drawing code on the main thread. If your window is very big, you may still not get anything better than 30 FPS (one frame ever two screen refreshes), yet that comes from the fact, that CALayer composition seems a rather expensive process and it has been optimized for more or less static layers (e.g. layers containing a picture) and not for layers updating themselves 60 FPS.
E.g. if you are writing a 3D game, you are advised not to mix CALayers with OpenGL content at all. If you need Cocoa UI elements, either keep them separated from your OpenGL content (e.g. split the window horizontally into a part that displays only OpenGL and a part that only displays controls) or draw all controls yourself (which is pretty common for games).
Last but not least, the two window approach is not as exotic as you may think, that's how VLC (the video player) draws its controls over the video image (which is also rendered by OpenGL on Mac).

Related

SDL2 + openGL ES 2.0 frame rate performance boost with less CPU load

I am developing on a linux system using latest (at the moment) SDL2 (2.0.8) + openGL ES 2.0 (GLSL 1.0) eventually targeting a raspberry pi 3 board. I have so far done a few things like drawing text with freetype, drawing lines, text boxes (editable), text lists, waveform boxes (all i need to pass to a function is an array of vertices) and other shapes with glDrawArrays(). Now, there are things that need to be refreshed at, let's say, 10 times per sec and others that need 1 time per second. What would be the best approach to skip re-rendering everything at the rate of 10 times per sec? Because obviously openGL works by drawing everything from scratch on every 'frame'. However i know and you know that other approaches exist that include: rendering on top of the screen you already have or taking a screenshot and rendering on top of it only the fast changing things as well as other solutions. What do you thing would be the best approach to skip re-doing everything before calling SDL_GL_SwapWindow() ? How can i take a screen shot and render it on the invisible buffer then render only the fast changing objects and then call SDL_GL_SwapWindow() ?
This is a screen shot of the app so far drawing basic things
Thanks in advance.
i eventually had to realize that i should not have posted the question in the first place but since this is a place where people learn from others i now feel somewhat nicer :) . So, the thing i had to do was to simply stop clearing the invisible buffer (i will call it that for simplicity) and render on top of it only controls that change. Those that change are updated by covering the area that they take by a rectangle and then draw new stuff on that area. I have already done it and the frame rate just 'exploded'. I do not really think that there is a better approach since the way i do it requires no action at all. All i had to do was to add a few if conditions that selectively rendered or skipped every time the execution reached the point where functions iterate through the controls that have to be drawn on screen and therefore decide what to render and what not. However a well thought set of structures is required for every control instead of declaring and defining endlessly global variables which will only makes things confusing and difficult to maintain.
Regards to all.

Render loop vs. explicitely calling update method

I am working on a 3D simulation program using openGL which uses a render loop with a fixed framerate to keep the screen updated as the world changes. Standard procedure really, and for a typical video game this is certainly the best approach (I originally took this code from an openGL game tutorial). But for me, the 3D scene will not be changing as rapidly and unpredictably as in a computer game. It will be possible for the 3D scene itself to change from time to time but in general it won't change between render calls (it's more of a visualisation tool for geometric problems). The user will be able to control the position/orientation of the camera but in general there will be times when the camera won't move for several seconds/minutes (potentially hundreds of render calls) and since the 3D scene is likely be static for the majority of the time, I wonder if I really need a continuous render loop...?
My thinking is that I will remove the automatic render loop and instead I will explicitly call my update method when either,
The 3D scene changes (very rare)
The camera moves (somewhat rare)
As I will be using this largely for research purposes, the scene/camera is likely to stay in one state for several minutes at a time and it seems silly to be continuously updating the frame buffer when it's not changing.
My question then is, is this a good approach? All the online tutorials for 3D graphics rendering seem to deal with game design but that's not really my requirement. In other words, what are the pros and cons of using a render loop vs. manually calling "update()" whenever something changes?
Thanks
There's no problem with this approach, in fact many 3D apps, like 3DS MAX use explicit rendering. You just pick what is better for your needs, in most games scene changes each frame so it's better to have update loop, but if you were doing some chess game, without animated UI you could also use explicit rendering only when the scene changes.
For apps with rare changes, like 3DS or Blender it would be better to call rendering only on change. This way you save the CPU/GPU but also power and your PC don't heat up so much.
With explicit rendering you can also have some performance tricks, like drawing simplified scene when camera moves, for better performance. Then when camera stops you render the full scene in background once again, and replace the low-quality rendering with the new one.

mouse movements for background thread created windows forms

I am wondering if it is possible to
1) Hosting a background thread created windows form inside application main thread created windows form ?
or 2) Separate the thread which processes mouse movements from the thread which is painting a windows form application ?
I know these questions sounds very crazy, but I find myself encountering a peculiar problem and I welcome advice of being able to side-step this problem, though I do not think we can part from windows forms and use different ui technology easily.
Our software team writes some plugins for a 3-rd party windows form application. SO we provide a root main form which hosts a bunch of user controls to them, using their api and they host them in their own windows form. Currently all user interfaces are created on the application's main thread, so they play nicely with each other. Many of our user controls we provide have system.windows.forms.datavisualisation charts, which are painted live by data the application is sourcing. One of the problem we have is when the user moves the mouse around erratically the display stop updating as all the painting of the graphs (GDI+) is done on the main thread ( we use background threads and TPL for sourcing and computing data, but painting is done on the main thread). So I am wondering if it is possible to make all the gdi+ painting of these charts occur on a different thread than the application main thread, so the painting would continue and we can still receive mouse movement inputs and clicks for user interactions, but erratic mouse movements can not flood the message queue and stop the gdi+ painting of the user controls.
Any help specially, specially pointers to relevant apis or articles demonstrating techniques would be much appreciated.
Thank you.
It depends on the cause of the slow down when the draw is done.
First, you should optimize your drawing routines maybe you are painting the entire form/control on each call to OnPaint? If that's the case then you should only re draw the invalidated area, that will speed up things a lot.
Else, if you still need to do the drawing on another thread, then you cant do it directly, ui can only be modified on main thread, but you can use a bitmap as an off screen buffer and then re draw from it.
To do that, when your control is created you create also a bitmap of your controls size (also you should take care of resizes). So, when any change must be done to your control's appearance, draw it to the bitmap, you can do it in another thread, and then invalidate your control's updated area.
And finally, in your control's OnPaint just blit from that bitmap to your control, and only the invalidated area, this call will still be done on main thread, but i assure you any machine today is able of blitting very large images in milliseconds.
As an example, let's suppose you have a control where you draw a gradient background and a circle.
Usually, you will paint directly to the Control surface through a Graphics object each time you want to update something, let's say the background, so, to change it you will draw the full background and your circle over it.
To do that in background, if you created a bitmap as off screen buffer, you don't draw to the surface, you draw to the bitmap in another thread (obtaining a graphics object from it) and then invalidate the updated área of the control. When you invalidate the área OnPaint will be called and then you can blit from that bitmap to the controls surface.
In this step you gained little or none speed, but let's expand our example. Suppose you are drawing a very complex control,with lots of calls to DrawImage, DrawCircle, etc.
When the mouse moves over the control, small áreas get invalidated, and on each OnPaint call you will draw all the "layers" composing the invalidated área, it can be lots of draws to do if as we said the control is very complex.
But if you did draw to a bitmap your controls appearance, on each OnPaint call you will just blit the corresponding área from the bitmap to the control, as you see you are reducing calls from lots of draws to just a blit.
Hope it clarifies the idea.

Is a drag-and-drop interface simply infeasible in Pygame?

I'm building a simple RPG using Pygame and would like to implement a drag-and-drop inventory. However, even with the consideration of blitting a separate surface, it seems that the entire screen will need to be recalculated every single time the user drags an item around. Would it be best to allow a limited range of motion, or is it simply not feasible to implement such an interface?
redrawing most or all of the screen is a very normal thing, across all windowing systems. this is rarely an issue, since most objects on screen can be drawn quickly.
To make this practical, it's necessary to organize all of the game objects that have to be drawn in such a way that they can be quickly found and drawn in the right order. This often means that objects of a particular type are grouped into some sort of layer. The drawing code can go through each layer, and for each object in each layer, ask the object to draw itself. If a particular layer is costly to draw, because it's got a lot of objects, can store a prerendered surface and blit that instead.
A really simple hack to get a similar effect is to capture the screen at the start of a drag to a surface, and then blit that every frame instead of the whole game. This obviously only makes sense in a game where dragging also means that the rest of the game is effectively paused.
There are many GUI examples on pygame.org, as well as libraries for GUIs.

How to present to a different window using IDXGISwapChain and ID3D11Device/ID3D11DeviceContext?

Previously, when I've built tools, I've used D3D version 9, where the call to Present() can take a target window and rectangle, and you can thus draw from a single device into many different windows. This is great when using D3D to accelerate desktop applications, and/or building tools rather than games!
I've also built a game renderer with D3D11 before, which is also great, because the state management and threading interfaces are well designed, and you can even target D3D 9 level hardware that's still pretty common in the wild (as opposed to D3D 10, which can only target 10-and-better).
However, now I want to build a tool with D3D11. Unfortunately, the IDXGISwapChain that comes back from D3D11CreateDeviceAndSwapChain() seems to "remember" its HWND, and only wants to present to that window. This is highly inconvenient, because I may have a large number of windows that each need fairly simple graphics drawn to them, and only in response to a WM_PAINT (again, this is for a tool, not a game).
What I want to do is to save back buffer RAM. Specifically, I used to be able to create a single back buffer, the size of the desktop, that I knew could cover all rendering needs, and then that would be the single copy allocated. Even if there are 10 overlapping windows, they all render through the same back buffer, so there's no waste of memory beyond the initial allocation. I can create textures that are not swap chains, and use them as "render targets," but I can't find a good way of presenting to an arbitrary rectangle of an arbitrary client window, without reading back the bitmap and copying it into a DIBSection, which would be really inefficient. Also, there is no way to create many swap chains, and having them share the same back buffer.
The best I can do is to create one swap chain per window, and resize the back buffer of each swap chain to be really small, except when I render to the swap chain, at which point I resize it to match the window. However, this seems inefficient, because resizing the targets is not a "free" operation AFAICT. So, is there a better way?
The answer I ended up with was to create one back buffer per separate display area, and not size it to the back buffer. I imagine that, in a world where desktop composition and transparency can happy to "anything" behind my back, that's probably helpful to the system.
Learn to love the VVM system, I guess :-) (VVM for Virtual Video Memory)

Resources