I am noticing some tearing with some QML 2 animations with Qt 5.4.2 on my Tegra 3 based embedded Linux board. I doubt if this is a complete vsync issue because most of the animations are smooth but there are some animations that involve a lot of parallel motion and clipping that tear consistently. These animation come out torn as opposed to simply stuttering so I don't think it is completely a performance issue either. Though it might be caused by the system not being able to put out the necessary FPS to sync properly? The exact same application has no such trouble on my Haswell i7 PC.
I have enabled QT_QPA_EGLFS_FORCEVSYNC to no effect and have not yet managed to find anything else that I can try. I should mention though that I am running EGLFS with an X11 backend (http://code.qt.io/cgit/qt/qtbase.git/tree/src/plugins/platforms/eglfs/qeglfshooks_x11.cpp?h=5.4) as a result of the Nvidia drivers dictating the use of X11. I would assume that this means that I can't really use the FB related settings normally available with EGLFS. Is there anything else that I can try to fix this?
PS. By setting QT_QPA_EGLFS_SWAPINTERVAL to 0 I can get the tearing to become a whole lot worse. This again suggests that I most likely do not have a whole system vsync issue.
PPS. I am getting a "QSGContext::initialize: stencil buffer support missing, expect rendering errors" warning at the start of my application.
On a Freescale/NXP imx6 with Vivante GC2000 I see a similar problem even when not using x11.
Setting "export QT_QPA_EGLFS_SWAPINTERVAL=2" seems to reduce the tearing on 3.14.38 kernel.
On 3.14.52 kernel that did not work but "export FB_MULTI_BUFFER=3" does help on both Qt 5.5.1 and 5.6 with imx6.
Related
I am implementing a simple Vulkan renderer according to a popular Vulkan tutorial (https://vulkan-tutorial.com/Introduction), and I've run into an interesting issue with the presentation mode and the desktop environment performance.
I wrote the triangle demo on Windows, and it performed well; however, I ported it to my Ubuntu installation (running MATE 1.20.1) and discovered a curious problem with the performance of the entire desktop environment while running it; certain swapchain presentation modes seem to wreak utter havoc with the desktop environment.
When setting up a Vulkan swapchain with presentMode set to VK_PRESENT_MODE_FIFO_KHR and subsequently running the application, the entire desktop environment grinds to a halt whenever any window is dragged. When literally any window on the entire desktop is dragged, the entire desktop environment slows to a crawl, appearing to run at roughly 4-5 fps. However, when I replace the presentMode with VK_PRESENT_MODE_IMMEDIATE_KHR, the desktop environment is immune to this issue and does not suffer the performance issues when dragging windows.
When I researched this before asking here, I saw that several people discovered that they experienced this behavior when their application was delivering frames as fast as possible (not vsync'd), and that properly synchronizing with vsync resolved this stuttering. However, in my case, it's the opposite; when I use VK_PRESENT_MODE_IMMEDIATE_KHR, i.e., not waiting for vsync, the dragging performance is smooth, and when I synchronize with vsync with VK_PRESENT_MODE_FIFO_KHR, it stutters.
VK_PRESENT_MODE_FIFO_RELAXED_KHR produces identical (catastrophic) results as the standard FIFO mode.
I tried using the Compton GPU compositor instead of Compiz; the effect was still there (regardless of what window was being dragged, the desktop still became extremely slow) but was slightly less pronounced than when using Compiz.
I have fully implemented the VkSemaphore-based frame/image/swapchain synchronization scheme as defined in the tutorial, and I verified that while using VK_PRESENT_MODE_FIFO_KHR the application is only rendering frames at the target 60 frames per second. (When using IMMEDIATE, it runs at 7,700 fps.)
Most interestingly, when I measured the frametimes (using glfwGetTime()), during the periods when the window is being dragged, the frametime is extremely short. The screenshot shows this; you can see the extremely short/abnormal frame time when a window is being dragged, and then the "typical" frametime (locked to 60fps) while the window is still.
In addition, only while using VK_PRESENT_MODE_FIFO_KHR, while this extreme performance degradation is being observed, Xorg pegs the CPU to 100% on one core, while the running Vulkan application uses a significant amount of CPU time as well (73%) as shown in the screenshot below. This spike is only observed while dragging windows in the desktop environment, and is not observed at all if VK_PRESENT_MODE_IMMEDIATE_KHR is used.
I am curious if anyone else has experienced this and if there is a known fix for this window behavior.
System info: Ubuntu 18.04, Mate 1.20.1 w/ Compiz, Nvidia proprietary drivers.
Edit: This Reddit thread seems to have a similar description of an issue; the VK_PRESENT_MODE_FIFO_KHR causing extreme desktop performance issues under Nvidia proprietary drivers.
Edit 2: This bug can be easily reproduced using vkcube from vulkan-tools. Compare the desktop performance of vkcube using --present-mode 0 vs --present-mode 2.
How to get screenshot of graphical application programmatically? Application draw its window using EGL API via DRM/KMS.
I use Ubuntu Server 16.04.3 and graphical application written using Qt 5.9.2 with EGLFS QPA backend. It started from first virtual terminal (if matters), then it switch display to output in full HD graphical mode.
When I use utilities (e.g. fb2png) which operates on /dev/fb?, then only textmode contents of first virtual terminal (Ctrl+Alt+F1) are saved as screenshot.
It is hardly, that there are EGL API to get contents of any buffer from context of another process (it would be insecure), but maybe there are some mechanism (and library) to get access to final output of GPU?
One way would be to get a screenshot from within your application, reading the contents of the back buffer with glReadPixels(). Or use QQuickWindow::grabWindow(), which internally uses glReadPixels() in the correct way. This seems to be not an option for you, as you need to take a screenshot when the Qt app is frozen.
The other way would be to use the DRM API to map the framebuffer and then memcpy the mapped pixels. This is implemented in Chromium OS with Python and can be translated to C easily, see https://chromium-review.googlesource.com/c/chromiumos/platform/factory/+/367611. The DRM API can also be used by another process than the Qt UI process that does the rendering.
This is a very interesting question, and I have fought this problem from several angles.
The problem is quite complex and dependant on platform, you seem to be running on EGL, which means embedded, and there you have few options unless your platform offers them.
The options you have are:
glTexSubImage2D
glTexSubImage2D can copy several kinds of buffers from OpenGL textures to CPU memory. Unfortunatly it is not supported in GLES 2/3, but your embedded provider might support it via an extension. This is nice because you can either render to FBO or get the pixels from the specific texture you need. It also needs minimal code intervertion.
glReadPixels
glReadPixels is the most common way to download all or part of the GPU pixels which are already rendered. Albeit slow, it works on GLES and Desktop. On Desktop with a decent GPU is bearable up to interactive framerates, but beware on embedded it might be really slow as it stops your render thread to get the data (horrible framedrops ensured). You can save code as it can be made to work with minimal code modifications.
Pixel Buffer Objects (PBO's)
Once you start doing real research PBO's appear here and there because they can be made to work asynchronously. They are also generally not supported in embedded but can work really well on desktop even on mediocre GPU's. Also a bit tricky to setup and require specific render modifications.
Framebuffer
On embedded, sometimes you already render to the framebuffer, so go there and fetch the pixels. Also works on desktop. You can enven mmap() the buffer to a file and get partial contents easily. But beware in many embedded systems EGL does not work on the framebuffer but on a different 'overlay' so you might be snapshotting the background of it. Also to note some multimedia applications are run with UI's on the EGL and media players on the framebuffer. So if you only need to capture the video players this might work for you. In other cases there is EGL targeting a texture which is copied to the framebuffer, and it will also work just fine.
As far as I know render to texture and stream to a framebuffer is the way they made the sweet Qt UI you see on the Ableton Push 2
More exotic Dispmanx/OpenWF
On some embedded systems (notably the Raspberry Pi and most Broadcom Videocore's) you have DispmanX. Whichs is really interesting:
This is fun:
The lowest level of accessing the GPU seems to be by an API called Dispmanx[...]
It continues...
Just to give you total lack of encouragement from using Dispmanx there are hardly any examples and no serious documentation.
Basically DispmanX is very near to baremetal. So it is even deeper down than the framebuffer or EGL. Really interesting stuff because you can use vc_dispmanx_snapshot() and really get a snapshot of everything really fast. And by fast I mean I got 30FPS RGBA32 screen capture with no noticeable stutter on screen and about 4~6% of extra CPU overhead on a Rasberry Pi. Night and day because glReadPixels got was producing very noticeable framedrops even for 1x1 pixel capture.
That's pretty much what I've found.
Some days ago I set up my computer and installed a new copy of Windows 8 because of some hardware changes. Among others I changed the video card from Radeon HD 7870 to Nvidia GTX 660.
After setting up Visual Studio 11 again, I downloaded my last OpenGL project from Github and rebuilt the whole project. I ran the application out of Visual Studio and it crashed because of nvoglv32.dll.
Unhandled exception at 0x5D9F74E3 (nvoglv32.dll) in Application.exe: 0xC0000005: Access violation reading location 0x00000000.
In the old environment the application worked as expected. I didn't changed anything of the project or source code. The only difference was the language of the Visual Studio installtion which is English now and was German before. Therefore I created a new project and adopted all settings, but the error remains.
In order to locate the crash, I noticed that all initialization (window, shaders, ...) succeeded and the error is at the draw call glDrawElements() which referrs to the gemoetry pass of my deferred renderer.
After some reseach I found out that nvoglv32.dll is from Nvidia and is about a services called Compatible OpenGL ICD. Does that somehow mean that my application runs in a compatible mode? That sounds like a mode to support older applications and I want mine to run in a regular mode! By the way I installed the lastest stable drivers for my video card.
To be honest, I have no clue how to approach fixing this crash. What could cause it and how to fix it?
Update: I found a post on Geforce Forums about my issue. Although there was no reply, the autor could fix the problem by changing the order of two OpenGL calls.
Hi all,
After poking around with my application source code for a few hours, I found that calling the functions...
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, #)
glBindVertexArray(#)
...in that order causes the crash in nvoglv64.dll.
Reversing the order of these calls to...
glBindVertexArray(#)
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, #)
...prevents the crash and appears to be well-behaved.
Cheers,
Robert Graf
Since I do not use vertex arrays I cannot simple do this fix but there might be a similar issue. I will report my progress.
Update: I have absolutely no clue how to solve my problem. I tried different video driver versions but it makes no difference. I completely rewrote the renderer using minimal shaders and simple forward rendering. But the crash sill occurs at the first draw call.
In order to locate the crash, I noticed that all initialization (window, shaders, ...) succeeded and the error is at the draw call glDrawElements().
Most likely you had a out-of-bounds access in your code all the time, but the AMD Radeon Catalyst drivers did reserve more address space, or caught them beforehand. And now your NVidia GeForce driver's don't.
Either you're passing glDrawElements a too large number for count elements to draw, or your index buffer contains values that index beyond the range of your vertex arrays. If it's the later, then you're probably using client side vertex arrays, as VBOs usually catch out-of-bounds accesses; also those wouldn't crash your client side program, but just render garbage.
Finally I came up with a solution to fix the crash.
The SFML framework I use to create the window and more provides a function to reset the OpenGL state of the context. I called it right after window creation.
Even though I can't explain why, removing that function call solved the crash. Maybe it is because GLEW or something else isn't yet initialized that moment.
sf::RenderWindow window;
window.create(VideoMode(1024, 768), "Window Title");
window.resetGLStates(); // removing this line fixed the crash
window.setVerticalSyncEnabled(true);
I notice that the IsDirect() method which should say whether hardware rendering is on returns false on my vtkRenderWindow*, link to class doc:
http://www.vtk.org/doc/nightly/html/classvtkRenderWindow.html
How do I enable it? Seems like a build option in cmake should do it, but what? My OpenGL libs are found OK. I notice there's a thread from a while back here: http://www.vtk.org/pipermail/vtkusers/2005-July/081034.html but I don't see any more recent info so far, is it just not supported on Linux?
First of all, I doubt it is a cmake option to enable direct HW rendering.
Applications are falling back to the software rending when the hardware rendering is not possible (GPU doesn't support some of the picked options).
Without seeing some code and getting details (which GPU), it is not possible to give a better answer.
On most of computers my program performs fine. But on one computer it failed to generate mipmap.
I created a texture with D3DUSAGE_AUTOGENERATEMIPMAP,
D3DCAPS2_CANAUTOGENMIPMAP says yes and CheckDeviceFormat says D3D_OK(not D3DOK_NOAUTOGEN) too.
then I use LoadSurfaceFromMemory to fill the texture.
But on that particular computer, no mipmap is generated. What's worse, the computer is my leader's!!
I'd bet its an intel (or other lesser manufacturer) integrated or that its drivers are not up-to-date. If it works using the "reference" (REF) driver then its a driver bug.
If you are loading a surface from memory anyway is it really that much extra to call D3DXFilterTexture afterwards? Its far more likely to work.
Alas you discover the joys of caps that outright lie :(