What can I do to find the cause of pread64/pwrite64 hangs? - linux

My application is doing some heavy IO on raw /dev/sdb block device using pread64/pwrite64. Sometimes it doing just fine. Call to pread64/pwrite64 usually takes as little as 50-100us. But sometimes it takes a whole lot more, up to several seconds.
What can you recommend to find the cause of such problem?

I have not used it but I have heard about a tool called latencytop.

When it's hung like that, grab a stackshot. Another option is pstack or lsstack.
And as #Zan pointed out, latencytop could also give you that information.
That might not fully answer your question, but at least you'll know with certainty what it was trying to do when it was hung.

Related

How to know what blocks NodeJS from closing

I have code that runs on NodeJS which involves intervals, sockets, and other async things.
Sometimes when it should close, it hangs forever, presumably since somewhere under some circumstances, I forget to clear an interval, close a socket, or something else.
Is there a way to get the currently active timers, and other such runtime information? Or inspect in any kind of way what blocks the exit?
Found this package https://www.npmjs.com/package/wtfnode from a related question to this one (https://stackoverflow.com/a/38471228/2503048). Oddly enough I couldn't find this information when googling. It should probably answer my question. Mostly the part about process._getActiveHandles().

How to see what started a thread in Xcode?

I have been asked to debug, and improve, a complex multithreaded app, written by someone I don't have access to, that uses concurrent queues (both GCD and NSOperationQueue). I don't have access to a plan of the multithreaded architecture, that's to say a high-level design document of what is supposed to happen when. I need to create such a plan in order to understand how the app works and what it's doing.
When running the code and debugging, I can see in Xcode's Debug Navigator the various threads that are running. Is there a way of identifying where in the source-code a particular thread was spawned? And is there a way of determining to which NSOperationQueue an NSOperation belongs?
For example, I can see in the Debug Navigator (or by using LLDB's "thread backtrace" command) a thread's stacktrace, but the 'earliest' user code I can view is the overridden (NSOperation*) start method - stepping back earlier in the stack than that just shows the assembly instructions for the framework that invokes that method (e.g. __block_global_6, _dispatch_call_block_and_release and so on).
I've investigated and sought various debugging methods but without success. The nearest I got was the idea of method swizzling, but I don't think that's going to work for, say, queued NSOperation threads. Forgive my vagueness please: I'm aware that having looked as hard as I have, I'm probably asking the wrong question, and probably therefore haven't formed the question quite clearly in my own mind, but I'm asking the community for help!
Thanks
The best I can think of is to put breakpoints on dispatch_async, -[NSOperation init], -[NSOperationQueue addOperation:] and so on. You could configure those breakpoints to log their stacktrace, possibly some other info (like the block's address for dispatch_async, or the address of the queue and operation for addOperation:), and then continue running. You could then look though the logs when you're curious where a particular block came from and see what was invoked and from where. (It would still take some detective work.)
You could also accomplish something similar with dtrace if the breakpoints method is too slow.

How do you safely read memory in Unix (or at least Linux)?

I want to read a byte of memory but don't know if the memory is truly readable or not. You can do it under OS X with the vm_read function and under Windows with ReadProcessMemory or with _try/_catch. Under Linux I believe I can use ptrace, but only if not already being debugged.
FYI the reason I want to do this is I'm writing exception handler program state-dumping code, and it helps the user a lot if the user can see what various memory values are, or for that matter know if they were invalid.
If you don't have to be fast, which is typical in code handling state dump/exception handlers, then you can put your own signal handler in place before the access attempt, and restore it after. Incredibly painful and slow, but it is done.
Another approach is to parse the contents of /dev/proc//maps to build a map of the memory once, then on each access decide whether the address in inside the process or not.
If it were me, I'd try to find something that already does this and re-implement or copy the code directly if licence met my needs. It's painful to write from scratch, and nice to have something that can give support traces with symbol resolution.

How to detect that no one is writing to a file in Linux?

I am wondering, is there a simple way to tell whether another entity has a certain file open for writing? I don't have time to use iNotify continuously to wait for any current writer to finish writing. I need to do an intermittent check.
Thanks.
What exactly are you doing where you "don't have time to use iNotify continuously"? First, you should be using the IN_CLOSE_WRITE flag so that iNotify just make one notification when the file gets closed after being written. Using it continuously makes no sense. Second, if your timing is that critical, I'm thinking writing to a file isn't your ideal solution. Do you control the first writer? Do you have to worry about anything else writing to the file after the first writer closes it?
lsof LiSts Open Files. fuser also works similarly (File USER), by telling you which user is using the file.
See: http://www.refining-linux.org/archives/23/16-Introduction-to-lsof-and-fuser/
Since you seem to be wanting to use a library-style interface, and not system, see ofl-lib.c. (It's really just having removed everything but the main function from the ofl program itself.)
You can't do so easily in the general case, and even if you could, you cannot use the information in a non-racy manner (see caf's comment).
So I'd say, redesign your application so you do not need to know.

Using callgrind/kcachegrind to get per-thread statistics

I'd like to be able to see how "expensive" each thread in my application is using callgrind. I profiled with the --separate-thread=yes option which gives you a callgrind file for the whole app and then one per-thread.
This is useful for viewing the profile of any given thread, but what I really want is just a sorted list of CPU time from each thread so I can see which threads are the the biggest hogs.
Valgrind/Callgrind doesn't allow this behaviour. Neither kcachegrind does, but I think it will be a good improvement. Maybe some answers could be found on their mailing-list.
A working but really boring way could be to use option --separate-thread=no, and update your code to use for each thread a different function name or class name. Depending your code complexity, it could be the answer (using 1computeData(), 2computeData(), ..)
Just open multiple profiles in kcachegrind at the same time.

Resources