AAssetManager_open deadlocks on Pixel/Pixel2 XL - android-ndk

I have in beta tests that is getting ANRs on a Pixel, and a Pixel 2 XL both running 8.1. From the logs I am getting back it appears that the call to AAssetManager_open is blocked waiting for a mutex to unlock.
From the log there is not discernible threading conflicts. On the one device it is happen on the third asset read as the app is loading. All of which are separate (and clean up) but sequential. The other device the deadlock is later. No threads are touching related code.
I have yet to encounter this issue on another device so I can't even play with it locally to understand further (I don't have either device). From what Android source I could find the mutex locked doesn't have anything complex about its usage.
The calls are happening on the main thread (hence the ANRs), so I can patch the issue by moving them off there. Ideally, though, I want to fix (or at least understand the cause of) the underlying problem of why the deadlock is happening on these devices in the first place.
So, what I am wondering is if there are any known ways to create a deadlock with AAssetManager_open?
On a side note, the closet I have found is a single article that mentions in passing people getting ANRs on AAssetManager_open in the Oreo pre-release builds but I can't find anything else on that.
Edit: I know have a crash on a different 8.1 device (OnePlus5), so it appears to not be specific to Pixel but 8.1 in general.
I also moved what AssetManager reads were on the main thread just incase and as expected the issue still exists (just don't get an ANR).
Edit #2: It is specific to 8.1 with AdMob mediation.

The reason of deadlock can be invalid pointer to AAssetManager. Please make sure, that pointer which you get from AAssetManager_fromJava is still valid (was not destroyed by GC).
link to AAssetManager_fromJava description

Opening an asset may be a blocking operation. Some assets are stored uncompressed, and AAssetManager_open() can return handle to work with the 'file' in-place. Other assets must be uncompressed to a local cache before they can be used. Some of these files will already be unzipped to your disk, and AAssetManager_open() will return almost instantaneously. Others must be unzipped, and this will compete for CPU with other threads, producing ANR if you are unlucky.
The bottom line, do your best to open assets not from the UI thread.

Related

Loading/removing dynamically buffers with Vulkan

I switched to Vulkan from OpenGL to use multi-threading improvements.
In OpenGL, I was able to load dynamically object to the scene (buffer, textures, etc) while rendering by using a waiting system. I was loading all app-side stuffs in a thread, then when it was ready, just before a frame render in the main thread, I was sending everything into the video memory. That was fine.
With Vulkan, I know I can call some functions between threads without provoking the well known segfault from OpenGL. But, this doesn't works with vkQueueSubmit(). I already know, I tried the naive way. To me, it seems logical you can't bother a queue from multiple threads.
I came with some ideas, but I don't know which one is good or bad.
First, I would go the OpenGL way, I will prepare everything I can from the CPU/App side, then just before render a frame, I will submit buffers (with transfer queue) to the video memory. But I feel there is no a real improvement from OpenGL way...
Second, I will try to use the synchronization mechanism to be able to send buffers in a thread and render from an other. But I keep reading there is a lot of way to slow down everything by causing irrelevant locks or by using incorrectly semaphores and fences.
So my question, is basically what path to pick to solve this problem ? How can I load a buffer dynamically from an other thread while the main thread is rendering without making too much pain to performances ? How Vulkan can help ?
If you want to stream resources for immediate use (i.e. the main render cannot proceed without them), then you're pretty much going to either block the main thread waiting, or have it spin doing something visually interesting (e.g. an animated loading screen) waiting for the resources to load.
If you want to stream resources while the app is doing real rendering then the main trick here is to load resources asynchronously in the background and only switch to using those resources in the main thread once they are already loaded. If the main thread ever ends up actually blocked on a semaphore then you've probably already started dropping frames, so your "engine" design needs to ensure that never happens. A lot of game use simple low-detail proxy objects as stand-in versions while the high-detail version is loading in the background.
None of this is particularly related to the graphics API - both GL and Vulkan need the same macro-scale behavior. Vulkan API features don't particularly help because the major bottlenecks which cause problems here are storage/network/CPU which have nothing to do with the graphics part of the problem.
I decided to trust threads !
In the first place it seems to work, I get a lot of :
[MESSAGE:Validation Error: [ UNASSIGNED-Threading-MultipleThreads ] Object 0: handle = 0x56414228bad8, type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x141cb623 | THREADING ERROR : vkQueueSubmit(): object of type VkQueue is simultaneously used in thread 0x7f6b977fe640 and thread 0x7f6bc2bcb740]
But it works !
So, the basic idea is to have a thread for loading objects while the engine is drawing. This thread takes care of creating the UBO for the location of the object, then when the geometry is loaded from RAM, it creates the VBO and IBO (I left material with image/UBO on hold for now), then creates the graphics pipeline (with layout, descriptor layout, shaders compiled with GLSLang on the fly) (The next idea is to reuse pipeline for similar needs) and finallly sets a flag to say the object is ready to use. In the other hand, I have my main thread rendering and takes new objects when they shows up ready.
I think it works because I have a gentle video card (GTX 1070) with multiple queues setup, I had one for graphics and an other one for transfer setup.
I'm pretty sure, this will crash or goes poorly with a GPU with a single queue, and this should be why the validation layers tolds me these messages.

replace a process bin file when it is running

I have a server program(compile by g++) which is running. And I change some code and compile a new bin file. Without kill the running process, I mv the new created bin to overwrite the old one.
After a while, the server process crashed. Dose it relate to my replace action?
My server is an multi-thread high concurrent server. One crash is segfault, other one is deadlock.
I print all parameters in the core dump file and pass them exactly same to the function which was crashed. But it is OK.
And I carefully watch all thread info in the deadlock core dump, I can not find it is an possibility to cause deadlock.
So I doubt the replacement will cause strange things
According to this question, if swap action is happen, it indeed will generate strange things
For a simple standard program, even if it is currently opened by the running process, moving a new file will first unlink the original file which will remain untouched apart from that.
But for long running servers, many things can happen: some fork new processes and occasionally some can even exec a new fresh version. In that case, you could have different versions running side by side which could or not be supported depending on the change.
Said differently, without more info on what is the server program, how it is designed to run and what was the change, the only answer I can give is maybe.
If you can make sure that you remove ONLY the bin file, and the bin file isn't used by any other process (such as some daemon). Then it doesn't relate to your replace action.

multithreaded IDirect3DDevice9::CreateDevice freeze

I'm moving my renderer to a different thread.
During this process I'm making two calls to IDirect3D9::CreateDevice:
1. from the 'rendering thread' - in order to create a rendering device and resize it properly
2. from the 'main thread' - here I'm creating a Null device in order to compile shaders etc.
These calls of course can overlap (be made simultaneously ), so I'm synchronizing them with a CriticalSection.
The problem is that one of these calls sometime freezes. DirectX doesn't throw any warnings prior to that happening, so I suspect an internal deadlock.
I studied the documentation and it's mentioned that all calls that operate on a single device, especially IDirect3D9::CreateDevice, IDirect3DDevice9::TestCooperativeLevel and IDirect3DDevice9::Reset, need to be called from the same thread - but I have that covered.
So what am I missing? Can anyone please tell me?
Thanks,
Paksas
I only have a vague memory of this but:
The docs state "Any call to create, release, or reset the device must be done using the same thread as the window procedure of the focus window."
As I remember things, even if you try and create a device using a NULL HWND, internally Direct3D goes and digs one up for your app anyway.
Therefore one of your threads is surely violating the first point.

How do you safely read memory in Unix (or at least Linux)?

I want to read a byte of memory but don't know if the memory is truly readable or not. You can do it under OS X with the vm_read function and under Windows with ReadProcessMemory or with _try/_catch. Under Linux I believe I can use ptrace, but only if not already being debugged.
FYI the reason I want to do this is I'm writing exception handler program state-dumping code, and it helps the user a lot if the user can see what various memory values are, or for that matter know if they were invalid.
If you don't have to be fast, which is typical in code handling state dump/exception handlers, then you can put your own signal handler in place before the access attempt, and restore it after. Incredibly painful and slow, but it is done.
Another approach is to parse the contents of /dev/proc//maps to build a map of the memory once, then on each access decide whether the address in inside the process or not.
If it were me, I'd try to find something that already does this and re-implement or copy the code directly if licence met my needs. It's painful to write from scratch, and nice to have something that can give support traces with symbol resolution.

Draw on top of suspended full-screen Direct3D app

Currently, I am able to hook onto Direct3D application and draw custom stuff onto its surface. However, I would like to suspend this application and then draw something else.
Is this even remotely possible to do so? Like creating another my own Direct3D window on top of that application?
I'm targetting only Windows 7, but the application I want to draw on is using only DirectX 9.
The problem is that I have very little experience with DirectX in general.
Sort of.
You're working with two different elements here, one quite large and but not particularly complex: hooking D3D. The other ("suspending" the app) is simple within that, but you don't quite want what you think you want.
To hook D3D, by the simplest method, you need to intercept the call to CreateDirect3D9 and return your own IDirect3D9, which later creates and returns your own IDirect3DDevice9. This will give you full control over the app's render process.
In order to "suspend" it, you need to wait for the desired trigger, then in your IDirect3DDevice9::Present, call your own event loop. This will, for all intents and purposes, suspend execution of the original app's code, but not the process itself (allowing your code and event loop to process). There will be some limitations of this, and you may not be able to consume window/Windows events (simply), but it will give you full control and effectively pause the original app.
Note, however, that you must intercept and reroute execution in every thread you want to "suspend," it's only specific to a single thread and you don't want physics or AI crunching on while render and UI are paused.
You need to perform your overlay drawing, whatever that may be, during your loop or your IDirect3DDevice9::Present hook, then call the real device's Present method as needed. If you want to run multiple frames of your overlay, then call the real Present repeatedly before returning from your Present. Tweak as necessary. Rendering here is done pretty much normally (check out general D3D tutorials for that), but there is one major catch: the device's state is unknown and may be incompatible, but must be "untouched" on return. This is handled simply by caching an IDirect3DStateBlock9 created from the device immediately after creating it. In your Present hook, create another state block with the state on entrance, restore the clean state block, run your code, then restore the entrance state block. You can work with any states, off a fresh slate, without damaging the device's state (I use this in practice, in works great).
If you want some rather extensive examples of how this works, I'd suggest checking out the Voodoo Shader project, which has full D3D8 and 9 hooks, including everything needed for overlays [/shameless own-project promotion]. Feel free to reuse any of the concepts, or comment with further questions; this certainly isn't all the details that may be useful to you.
This is a very complex thing to accomplish, as it is very much a hack to do so. The only people you see doing such things are steam, teamspeak, xfire, fraps, and a few hard-core devs.
There are kits out on the internet that show you have to inject a DLL into the memory space of the target application to achieve such a feat, and methods such as proxy DLLs.
Proxy DLL:
http://www.codeguru.com/cpp/g-m/directx/directx8/article.php/c11453
Injection:
http://www.progamercity.net/d3d/372-c-directx9-0-hooking-via-detours.html
Good luck, this will take you a while.

Resources