What is the difference between panic!() and std::process::exit(), and when should I consider using one over the other?
There may or may not be a difference, depending on the panic strategy you have configured in Cargo.toml. Depending on whether it is set to unwind or abort, different things will happen:
With unwind, the panic will (as the name suggests) unwind the stack. In particular, this makes it possible to get a full stack trace.
With abort, you will only get the last callee.
process::exit(), on the other hand, is a "clean" exit: you will not even get the last callee, just a regular process exit status.
Due to this, you'll ideally want to keep to the following:
For planned shutdowns, use exit(). Do note that a known error is considered a planned shutdown.
For unplanned shutdowns (i.e. exceptional failures), consider panic!(): you benefit from getting a stack trace when it happens, and the failure should be exceptional enough that it is effectively unaccounted for and stems from an unplanned scenario.
As far as I know, a panic is never supposed to happen in a released program. It gives information to developers, but nothing user-friendly. I'd say: use it for errors that should not happen in production. Under the hood there is probably something like an exit(101) behind it.
exit just terminates your process with the code you give to it. An exit(0) should mean "Everything is okay".
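For comparison, the same distinction between an abnormal termination and a clean exit exists at the C level: abort() versus exit(). This is only an analogy sketch in C, not Rust (with panic = "abort" a Rust panic ultimately ends in an abort; with unwind it ends in something like the exit(101) mentioned above):

#include <stdio.h>
#include <stdlib.h>

static void cleanup(void)
{
    /* Runs for exit(), but is skipped by abort(). */
    puts("cleanup ran");
}

int main(int argc, char **argv)
{
    atexit(cleanup);

    if (argc > 1) {
        /* Planned shutdown: stdio is flushed, atexit() handlers run,
         * and the parent sees the status code we chose. */
        puts("shutting down cleanly");
        exit(0);
    }

    /* Unplanned failure: abort() raises SIGABRT, skips the cleanup
     * above, and typically leaves a core dump or a debugger break
     * behind, which is what you want for a bug you never expected. */
    fprintf(stderr, "unexpected state, aborting\n");
    abort();
}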
I am surprised that the Linux kernel has an infinite loop in its 'do_select' function implementation. Is that normal practice?
Also, I am interested in how file-change monitoring is implemented in the Linux kernel. Is it an infinite loop again?
select.c source code
This is not an infinite loop; that term is reserved for loops with no exit condition at all. This loop has its exit condition in the middle: http://lxr.linux.no/#linux+v3.9/fs/select.c#L482 This is a very common idiom in C. It's called "loop and a half" and there's a simple pseudocode example here: https://stackoverflow.com/a/10767975/388520 which clearly illustrates why you would want to do this. (That question talks about Java but that's not important; this is a general structured-programming idiom.)
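Stripped of the kernel specifics, the loop-and-a-half shape looks something like this (a hypothetical, self-contained C sketch; scan_for_events and block_until_woken are made-up stand-ins for the inner fd-scanning loop and poll_schedule_timeout):

#include <stdbool.h>
#include <stdio.h>

static int attempts = 0;

/* Stand-in for the inner loop that scans the fd sets for ready events. */
static bool scan_for_events(void)
{
    /* Pretend an event shows up on the third scan. */
    return ++attempts >= 3;
}

/* Stand-in for poll_schedule_timeout(): block until woken or timed out. */
static void block_until_woken(void)
{
    printf("nothing ready, blocking...\n");
}

int main(void)
{
    for (;;) {                     /* the "infinite" loop */
        if (scan_for_events())     /* first half: check for events */
            break;                 /* exit condition, in the middle */
        block_until_woken();       /* second half: sleep until it's worth rechecking */
    }
    printf("event found after %d scans\n", attempts);
    return 0;
}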
I'm not a kernel expert, but this particular loop appears to have been written this way because the logic of the inner loop needs to run both before and after the call to poll_schedule_timeout at the very bottom of the outer loop. That code is checking whether there are any events to return; if there are already events to return when select is invoked, it's supposed to return immediately; if there aren't any initially, there will be when poll_schedule_timeout returns. So in normal operation the outer loop should cycle either 0.5 or 1.5 times. (There may be edge-case circumstances where the outer loop cycles more times than that.) I might have chosen to pull the inner loop out to its own function, but that might involve passing pointers to too many local variables around.
This is also not a spin loop, by which I mean, the CPU is not wasting electricity checking for events over and over again until one happens. If there are no events to report when control reaches the call to poll_schedule_timeout, that function (by, ultimately, calling __schedule) will cause the calling thread to block -- the CPU is taken away from that thread and assigned to another process that can do something useful with it. (If there are no processes that need the CPU, it'll be put into a low-power "halt" until the next interrupt fires.) When one of the events happens, or the timeout, the thread that called select will get "woken up" and poll_schedule_timeout will return.
On a larger note, operating system kernels often do things that would be considered strange, poor style, or even flat-out wrong, in the service of other engineering goals (efficiency, code reuse, avoidance of race conditions that can only occur on some CPUs, ...). They are written by people who know exactly what they are doing and exactly how far they can get away with bending the rules. You can learn a lot from reading through OS code, but you probably shouldn't try to imitate it until you have a bit more experience. You wouldn't try to pastiche the style of James Joyce as your first exercise in creative writing, ne? Same deal.
I am using shared variables in Perl with use threads::shared.
These variables can be modified only from a single thread; all other threads only 'read' them.
Is it required for the 'reading' threads to lock, as in
{
    lock $shared_var;
    if ($shared_var > 0) .... ;
}
?
Or is it safe to do a simple check without locking (in the 'reading' thread!), like
if ($shared_var > 0) ....
?
Locking is not required to maintain internal integrity when setting or fetching a scalar.
Whether it's needed or not in your particular case depends on the needs of the reader, the other readers and the writers. It rarely makes sense not to lock, but you haven't provided enough details for us to determine what your needs are.
For example, it might not be acceptable to use an old value after the writer has updated the shared variable. For starters, this can lead to a situation where one thread is still using the old value while another thread is using the new value, which can be undesirable if those two threads interact.
It depends on whether it's meaningful to test the condition at just some arbitrary point in time. The problem, however, is that in the vast majority of cases that Boolean test stands in for other state, which might already have changed by the time you act on it; at best the condition tells you about a previous state.
Think about it. If it's an insignificant test, then it means little, and you have to question why you are making it. If it's a significant test, then it signals a coherent state that may or may not exist anymore; you won't know for sure unless you lock.
A lot of the time, say in real-time reporting, you don't really care which snapshot the database hands you, you just want a relatively current one. But a database, as part of its transaction logic, keeps a complete picture of how things were prior to a commit. You're unlikely to have that in your own code, where the current state is the current state, and even being in a provisional state is itself a definite state.
One case where this can be different is cyclic access to a queue. If one consumer doesn't get the head record this time around, then one of them will the next time around, so you can probably save some processing time by checking the queue counter without locking. But even here the unlocked result means little in the context of a single iteration.
In that case, the locked code that follows must still expect that the queue may actually be empty even though the unlocked test suggested it had data. In other words, if it is just a preliminary test, the surrounding logic has to treat the test as exactly as unreliable as it actually is.
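To make the check-then-act point concrete outside of Perl, here is a hypothetical C/pthreads sketch of a reader; the mutex plays the role that lock() plays for a Perl shared variable, and the names are made up:

#include <pthread.h>
#include <stdio.h>

/* Hypothetical shared counter: one writer increments it, readers consume. */
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static int items = 0;

/* The safe version of "if ($shared_var > 0) ...": the test and the action
 * that depends on it happen under one lock, so the state cannot change in
 * between.  Testing 'items' outside the lock would only give a hint that
 * may already be stale (and, unlike a Perl shared scalar, a plain C int
 * isn't even guaranteed to be safely readable without the lock). */
static int take_item(void)
{
    int got = 0;
    pthread_mutex_lock(&mtx);
    if (items > 0) {
        items--;
        got = 1;
    }
    pthread_mutex_unlock(&mtx);
    return got;
}

static void put_item(void)
{
    pthread_mutex_lock(&mtx);
    items++;
    pthread_mutex_unlock(&mtx);
}

int main(void)
{
    put_item();                      /* pretend the writer produced one item */
    if (take_item())
        puts("took an item");
    else
        puts("queue was empty after all");
    return 0;
}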
Usually an access violation terminates the program and I cannot catch a Win32 exception using try and catch. Is there a way I can keep my program running, even in case of an access violation? Preferably I would like to handle the exception and show to the user an access violation occurred.
EDIT: I want my program to be really robust, even against programming errors. The thing I really want to avoid is a program termination even at the cost of some corrupted state.
In Windows, this is called Structured Exception Handling (SEH). For details, see here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms680657%28v=vs.85%29.aspx
In effect, you can register to get a callback when an exception happens. For obvious reasons, you can't do this for every kind of exception.
Using SEH, you can detect a lot of exceptions, access violations included, but not all (e.g. double stack fault). Even with the exceptions that are detectable, there is no way to ensure 100% stability after the exception. However, it may be enough to inform the user, log the error, send a message back to the server, and gracefully exit.
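To illustrate, here is a minimal sketch of catching an access violation with the MSVC-specific __try / __except keywords (a deliberately faulting toy program, not code from the question):

#include <windows.h>
#include <stdio.h>

/* Filter: only handle access violations; let everything else propagate. */
static int filter_av(DWORD code)
{
    return code == EXCEPTION_ACCESS_VIOLATION
               ? EXCEPTION_EXECUTE_HANDLER
               : EXCEPTION_CONTINUE_SEARCH;
}

int main(void)
{
    __try {
        int *volatile p = NULL;
        *p = 42;                      /* deliberately fault */
    }
    __except (filter_av(GetExceptionCode())) {
        /* We get here instead of the process being torn down, but program
         * state after an access violation is suspect: report, log, and
         * shut down gracefully rather than carrying on. */
        fprintf(stderr, "Access violation caught via SEH\n");
        return 1;
    }
    return 0;
}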
I will start by saying that your question contains a contradiction:
EDIT: I want my program to be really robust, ... The thing I really want to avoid is a program termination even at the cost of some corrupted state.
A program that keeps on limpin' in case of corrupted state isn't robust, it's a liability.
Second, an opinion of sorts. Regarding:
EDIT: I want my program to be really robust, even against programming errors. ...
If by programming errors you mean all bugs, then this is impossible.
If by programming errors you mean "the programmer misused some API and I want error messages instead of a crash", then write all code with double checks built in: for example, always check all pointers for NULL before use, even if "they cannot be NULL if the programmer didn't make a mistake", etc. (Oh, you might also consider not using C++ ;-)
But IMHO, some amount of program-crashing-no-matter-what bugs will have to be accepted in any C++ application. (Unless it's trivial or you test the hell out of it for military or medical use (even then ...).)
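For what it's worth, a tiny sketch of what those built-in double checks can look like (C-style; the function and its arguments are hypothetical):

#include <errno.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical API entry point written defensively: validate every
 * argument before touching it, even the ones that "cannot" be wrong if
 * the caller is bug-free, and turn misuse into an error code instead of
 * an access violation. */
static int copy_name(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL)
        return EINVAL;
    if (dst_size == 0 || strlen(src) >= dst_size)
        return ERANGE;

    memcpy(dst, src, strlen(src) + 1);
    return 0;
}

int main(void)
{
    char buf[8];
    printf("ok call:  %d\n", copy_name(buf, sizeof buf, "hello"));
    printf("bad call: %d\n", copy_name(NULL, sizeof buf, "hello"));  /* caught, no crash */
    return 0;
}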
Others already mentioned SEH -- it's a "simple" matter of __try / __except.
Maybe instead of trying to catch bugs inside the program, you could try to become friends with Windows Error Reporting (WER). I have never done this myself, but as far as I understand, you can completely customize it via the OutOfProcessException... callback functions.
I have never seen a real use for checking whether a file was closed correctly. I mean, if it didn't close, then what? There's nothing smart you can do about it. Besides, I'm not sure there's a real-world case where none of the writes/reads/flushes fail and only the close does.
Does anyone actually uses the return value of close?
From the close(2) man page:
Not checking the return value of close() is a common but nevertheless serious
programming error. It is quite possible that errors on a previous write(2)
operation are first reported at the final close(). Not checking the return
value when closing the file may lead to silent loss of data. This can
especially be observed with NFS and with disk quota.
And if you use signals in your application, close() may be interrupted (EINTR).
EDIT: That said, I seldom bother unless I'm prepared to handle such cases and write code that has to be 100% fool-proof.
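A minimal sketch of actually checking the result (hypothetical file name; error handling kept to report-and-fail):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *path = "output.dat";   /* hypothetical output file */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        fprintf(stderr, "open %s: %s\n", path, strerror(errno));
        return 1;
    }

    const char buf[] = "some data\n";
    if (write(fd, buf, sizeof buf - 1) == -1) {   /* a fully robust version would also loop on short writes */
        fprintf(stderr, "write %s: %s\n", path, strerror(errno));
        close(fd);
        return 1;
    }

    /* This is where a deferred write error (e.g. on NFS, or a blown disk
     * quota) can first show up, so report it like any other I/O failure. */
    if (close(fd) == -1) {
        fprintf(stderr, "close %s: %s\n", path, strerror(errno));
        return 1;
    }
    return 0;
}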
I made changes in sched.c in Linux kernel 2.4 (homework), and now the system goes into a kernel panic. The strange thing is: it seems to pass A LOT of boot checks and initializations, and panics only at the very end, showing the following stack trace:
update_process_times
do_timer
timer_interrupt
handle_IRQ_event
do_IRQ
call_do_IRQ
do_wp_page
handle_mm_fault
do_page_fault
do_sigaction
sys_rt_sigaction
do_page_fault
error_code
And the error is: "In interrupt handler - not synching"
I know it's hard to tell without any code, but can anybody make an educated guess to point me in the right direction?
I can give you my own personal mantra when debugging kernel problems: "it's always your fault."
I often see issues caused by overwriting memory outside the area I'm working in -- for example, if I feed hardware an incorrect address for DMA. You may also be screwing up a lock somehow; that seems possible in this case if you are seeing a timeout: a lock that is taken and never released can cause a hang, and the hang then triggers a timeout.
To me, a panic in update_process_times might suggest a problem with the task struct pointer... but I really have no idea.
Keep in mind that things in the kernel often go wrong long before a failure occurs, so a wrong bit anywhere in your code may be to blame, even if it doesn't seem like it should have an effect. If you can, I recommend incrementally adding or removing your code and checking for the problem to see if you can isolate it.