How can I find a reason that a program stalls? - linux

I'm handling a program which controls a car.
The program has a pretty large scale and it was made by other people.
So I don't understand completely how it works.
But I have to apply it and make a car moving.
The problem I'm facing is that the program often stalls with no error, no segmentation.
If it crashes, I can trace the cause with gdb or something like that.
But it does not crash, it silently stops.
How can I find the cause?

From your description - program silently stops - I understand that your program simply and gracefully exited, but not from your expected flow. This can happen for many reasons - for example, maybe your program enters illegal flow and some sub-component, such as standard library or other library decide that program should exit, and thus calls c-runtime exit() or directly calls Kernel32!ExitProcess().The best way to debug this flow is to attach a debugger and set a breakpoint on these two methods and find out who is calling them.If you mean your program enters a deadlock and halts then also you will need to attach a debugger and find out who is stuck.

Related

Python multiprocessing deadlock when calling logger-issue6721

I have a code running in Python 3.7.4 which forks off multiple processes. I believe I'm hitting a known issue (issue6721: https://github.com/python/cpython/issues/50970). I setup the child process to send "progress report" through a pipe to the parent process and noticed that sometimes a log statement doesn't get printed and that the code gets stuck in a deadlock situation.
After reading issue6721, I'm not sure I'm still understanding why parent might hold logger Handler lock after a log statement is done execution (i.e the line that logs is executed and the execution has moved to the next line of code). I totally get it that in the context of C++, the compiler might re-arrange instructions. Not fully understand it in context of Python. In C++ I can have barrier instructions to stop the compiler moving instructions beyond a point. Is there something similar that can be done in Python to avoid having a lock getting copied to child process?
I have seen solutions using "atfork" which is a library that seems not supported (so I can't really use it).
Does anyone know a reliable and standard solution to this problem?

How can I handle an access violation in Visual Studio C++?

Usually an access violation terminates the program and I cannot catch a Win32 exception using try and catch. Is there a way I can keep my program running, even in case of an access violation? Preferably I would like to handle the exception and show to the user an access violation occurred.
EDIT: I want my program to be really robust, even against programming errors. The thing I really want to avoid is a program termination even at the cost of some corrupted state.
In Windows, this is called Structured Exception Handling (SEH). For details, see here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms680657%28v=vs.85%29.aspx
In effect, you can register to get a callback when an exception happens. You can't do this to every exception for obvious reasons.
Using SEH, you can detect a lot of exceptions, access violations included, but not all (e.g. double stack fault). Even with the exceptions that are detectable, there is no way to ensure 100% stability after the exception. However, it may be enough to inform the user, log the error, send a message back to the server, and gracefully exit.
I will start by saying that your question contains a contradiction:
EDIT: I want my program to be really robust, ... The thing I really want to avoid is a program termination even at the cost of some corrupted state.
A program that keeps on limpin' in case of corrupted state isn't robust, it's a liability.
Second, an opinion of sorts. Regarding:
EDIT: I want my program to be really robust, even against programming errors. ...
When, by programming errors you mean all bugs, then this is impossible.
If by programming errors you mean: "programmer misused some API and I want error messages instead of a crash, then write all code with double checks built in: For example, always check all pointers for NULL before usage, even if "they cannot be NULL if the programmer didn't make a mistake", etc. (Oh, you might also consider not using C++ ;-)
But IMHO, some amount of program-crashing-no-matter-what bugs will have to be accepted in any C++ application. (Unless it's trivial or you test the hell out of it for military or medical use (even then ...).)
Others already mentioned SEH -- it's a "simple" matter of __try / __catch.
Maybe instead of trying to catch bugs inside the program, you could try to become friends with Windows Error Reporting (WER) -- I never pulled this, but as far as I understand, you can completely customize it via the OutOfProcessException... callback functions.

Why the window of my vb6 application stalls when calling a function written in C?

I'm using 3.9.7 cURL library to download files from the internet, so I created a dynamic bibioteca of viculo. dll written in C using VC + + 6.0 the problem is that when either I call my function from within my vb6 application window locks and unlocks only after you have downloaded the file how do I solve this problem?
The problem is that when you call the function from your DLL, it "blocks" your app's execution until it gets finished. Basically, execution goes from the piece of code that makes the function call, to the code inside of the function call, and then only comes back to the next line after the function call after the code inside of the function has finished running. In fact, that's how all function calls work. You can see this for yourself by single-stepping through your code in the VB 6 development environment.
You don't normally notice this because the code inside of a function being called doesn't take very long to execute before control is returned to the caller. But in this case, since the function you're calling from the DLL is doing a lot of processing, it takes a while to execute, so it "blocks" the execution of your application's code for quite a while.
This is a good general explanation for the reason why your application window appears to be frozen. A bit more technically, it's because the message pump that is responsible for processing user interaction with on-screen elements is not running (it's part of your code that has been temporarily suspended until the function that you called finishes processing). This is a bit more difficult for a VB programmer to appreciate, since none of this nitty-gritty stuff is exposed in the world of VB. It's all happening behind the scenes, just like it is in a C program, but you don't normally have to deal with any of it. Occasionally, though, the abstraction leaks, and the nitty-gritty rears its ugly head. This is one of those cases.
The correct solution to this general problem, as others have hinted at, is to run lengthy operations on a background thread. This leaves your main thread (right now, the only one you have, the one your application is running on) free to continue processing user input, while the other thread can process the data and return that processed data to the main thread when it is finished. Of course, computers can't actually do more than one thing at a time, but the magic of the operating system rapidly switching between one task and another means that you can simulate this. The mechanism for doing so involves threads.
The catch comes in the fact that the VB 6 environment does not have any type of support for creating multiple threads. You only get one thread, and that's the main thread that your application runs on. If you freeze execution of that one, even temporarily, your application freezes—as you've already found out.
However, if you're already writing a C++ DLL, there's no reason you can't create multiple threads in a VB 6 app. You just have to handle everything yourself as if you were using another lower-level language like C++. Run the C++ code on a background thread, and only return its results to the main thread when it is completely finished. In the mean time, your main thread is free.
This is still quite a bit of work, though, especially if you're inexperienced when it comes to Win32 programming and the issues surrounding multiple threads. It might be easier to find a different library that supports asynchronous function calls out-of-the-box. Antagony suggests using VB's AsyncRead method. That is probably a good option; as Karl Peterson says in the linked article, it keeps everything in pure VB 6 code, which can be a real time saver as well as a boon to future maintenance programmers. The only problem is that you'll still have to process the data somehow once you obtain it. And if that's slow, you're right back where you started from…
Check out this article, which demonstrates how to asynchronously transfer large files using a little-known method in user controls.

What do you do with badly behaving 3rd party processes in Linux?

Anytime I have a badly behaving process (pegging CPU, or frozen, or otherwise acting strangely) I generally kill it, restart it and hope it doesn't happen again.
If I wanted to explore/understand the problem (i.e. debug someone else's broken program as it's running) what are my options?
I know (generally) of things like strace, lsof, dmesg, etc. But I'm not really sure of the best way start poking around productively.
Does anyone have a systematic approach for getting to the bottom of these issues? Or general suggestions? Or is killing & restarting really the best one can do?
Thanks.
If you have debugging symbols of the program in question installed, you can attach to it with gdb and look what is wrong there. Start gdb, type attach pid where pid is the process id of the program in question (you can find it via top or ps). Then type Ctrl-C to stop it. Saying backtrace gives you the call stack, that means it tells which line of code is currently running and which functions called the currently running function.

Can I instruct gdb to run commands in response to SIGTRAP?

I'm debugging a reference leak in a GObject-based application. GObject has a simple built-in mechanism to help with such matters: you can set the g_trap_object_ref variable in gobject.c to the object that you care about, and then every ref or unref of that object will hit a breakpoint instruction (via G_BREAKPOINT()).
So sure enough, the program does get stopped, with gdb reporting:
Program received signal SIGTRAP, Trace/breakpoint trap.
g_object_ref (_object=0x65f090) at gobject.c:2606
2606 old_val = g_atomic_int_exchange_and_add ((int *)&object->ref_count, 1);
(gdb) _
which is a great start. Now, normally I'd script some commands to be run at a breakpoint I manually set using commands 3 (for breakpoint 3, say). But the equivalent for SIGTRAP, namely handle SIGTRAP, doesn't give me the option of doing anything particularly interesting. Is there a good way to do this?
(I'm aware that there are other ways to debug reference leaks, such as setting watchpoints on the object's ref_count field, refdbg, scripting regular breakpoints on g_object_ref() and g_object_unref(). I'm about to go try of those now. I'm looking specifically for a way to script a response to SIGTRAP. It might come in useful in other situations, too, and I'd be surprised if gdb doesn't support this.)
Do you want to show some values and continue execution of the program? In that case, just define a macro that displays the values you're interested in, continues execution and calls itself recursively:
define c
echo do stuff\n
continue
c
end
GDB doesn't support it.
In general, attaching a command script to signal makes little sense -- your program could be receiving SIGTRAP in any number of places, and the command will not know whether a particular SIGTRAP came in in expected context or not.

Resources