if and when try/except statements are overkill - python-3.x

Sorry if this isn't the right place to ask this. I'm still learning a lot about good design. I was just wondering, say I process raw data through 20 functions. Is it idiotic or extremely slow to think of wrapping the contents of each function with a try/except statement, so if I ever run into issues I can see exactly where and why the data wasn't properly processed? Surely there's another more efficient way of facilitating the debugging process.
I've tried searching through articles on if and when to use try/except statements. But I think the experience of some of the folks on Stack Overflow will provide a much better answer :)

I can only give my personal opinion, but I think you shouldn't wrap your entire code inside try/except blocks. To me, these are meant for specific cases, such as manipulating streams or sending HTTP requests, to make sure we don't crash on a part of the code that may fail (or to adopt specific behaviour depending on the error).
The risk is that you catch an error raised by a different line of your program without realizing it (for example, if you wrap an entire function).
It is important to cover your code, but without completely hiding every error that you could encounter.
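A minimal sketch of what such a "specific case" can look like in Python (the filename and the empty-string fallback are made up for illustration):

```python
# Catch only the narrow, expected exception at the point where it can occur,
# instead of wrapping a whole function in a broad try/except.
def read_config(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        # Expected, recoverable case: fall back to a default.
        return ""

print(repr(read_config("no_such_file.cfg")))  # -> ''
```

Any other exception (say, a PermissionError) still propagates, so unexpected problems stay visible.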
You have probably already checked it, but here is a little reminder of good practices:
Try / Except good practices
I hope this will be helpful!

When exceptions are raised (and recorded somewhere), they have a stack trace showing the calls that led to the error. That should be enough to trace where the problem was.
If you catch an exception at the lowest level, how will the subsequent functions continue? They won't get the return values they were expecting. Better to let the exception propagate up the call stack to somewhere it makes sense to handle it. If you do manual checks, you can raise specific exceptions with messages that help debugging, e.g.:
def foo(bar):
    if bar < 0:
        raise ValueError(f"Can't foo a value less than 0, got {bar}")
    # foo bar here
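To illustrate letting the exception propagate, here is a hypothetical three-stage pipeline (the function names are made up): neither intermediate stage needs a try/except, and the traceback from the one handler at the top still names foo as the origin:

```python
import traceback

def foo(bar):
    if bar < 0:
        raise ValueError(f"Can't foo a value less than 0, got {bar}")
    return bar * 2

def middle(value):
    return foo(value) + 1   # no try/except needed at this level

def pipeline(raw):
    return middle(raw)      # ...or at this one

try:
    pipeline(-5)
except ValueError:
    traceback.print_exc()   # the trace shows pipeline -> middle -> foo
```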

Related

MS Access writing to Excel over multiple minutes will sometimes throw a false error

I'm someone who solves problems by looking, not asking. So this is new to me. This has been an issue for years, and it crops up with different computers, networks, versions and completely different code. There is a lot here, so, thank you in advance if you are willing to read the whole thing.
Generally speaking, I write MS Access programs that will open Excel and then create multiple worksheets inside a workbook using data from Access tables and/or Excel sheets. The process can take a couple of minutes to run and occasionally it will get an error. I could tell you the error message, but it doesn't matter, because it will be different depending on where the error occurs. When it occurs I simply click debug, click continue, and it... continues. If it errors out again (many loops later), it will happen in the exact same spot.
So, what I start with is to make minor changes to the code. In the current program I'm working on, the error happens when I write to a cell and the value is a value directly from a table. I created a variable, copied the value to the variable and then wrote to the cell. The error moved to a completely different part of the program and it became a "paste" error. Generally what fixes it is to put a wait function at the spot where the error occurs. One second is usually good enough. Sometimes it takes a couple of these, but that usually solves it. It only took one delay per loop this time, so it is working. I just hate causing delays in my program. So... Has anyone seen anything like this before, or is it just me. It feels like a timing issue between Access and Excel since the delays are usually helpful. Thanks in advance.
I dug up my last major Access project that interacted with Word (ca. 2016) where I struggled with similar issues. I see many, many Debug.Print statements (some commented, some still active), but unlike what I recalled earlier in my comments, I don't see any "wait" statements anymore! From what I now recall and after re-inspecting the code, most problems were resolved by
implementing robust error handling and best practices for always closing automation objects (and/or releasing the objects if I wanted the instances to persist)
subscribing to and utilizing appropriate automation object events to detect and handle interaction rather than trying to force everything into serialized work-then-wait code. To do this, I placed all automation code in well-structured classes that declared automation objects WithEvents (in VBA of course) and then defined relevant event handlers for actions I was effecting. I now recall finally being able to avoid weird errors and application hangs, etc.
You also may never get a good answer to a question like this, so even though I am not an absolute expert on Office development, I have had my own experience with frustrating bugs like this, so I'll share my 2 cents. This may not be satisfying, but after experiencing similar behavior using Office automation objects, my general understanding is that interaction between OS processes is not deterministic. Especially since VBA generally has no threading or parallelism support, it can be strange to deal with objects that behave in unpredictable ways. The time slices given to each process are at the mercy of the OS and will vary greatly with the number of processors/cores, other running processes, memory management, etc. Despite the purpose of the automation objects--to control instances of Office apps--the APIs are not designed well for inter-application processes.
It would be great if old automation code produced more useful errors, perhaps nested exceptions (as in .NET and other modern environments) or something that indicates delays and timeouts within callbacks between automation objects. Instead, you get a hodgepodge of various context errors.
My hardware is old, but still ticking. I often get delays, even if only for a second, when switching between apps, etc. Instead of thinking of it as an error, I just perceive it as a slow machine, wait, and continue. It may be useful to consider these types of random errors as similar delays. If a wait call here or there resolves the issue, however annoying, that may just be the best solution... wait and continue.
Every now and then after debugging these types of issues I would actually discover the underlying problem and be able to fix it. At the least I would be able to avoid actual problems with the data, despite errors being raised, just like you describe. But even when I felt that I understood the problem, the answer was still often to do exactly as you have done and just add a short wait.
I do believe now this is a timing issue. After thinking things through, I realized that I could easily (well 3 hours later) separate the database info from the spreadsheet info and then move the updated code that is causing problems into an Excel Macro. I then called that macro from Access. Not only do the errors go away, but it runs about 4 times faster. It's not surprising, I just hadn't thought of that direction before.

Is there a way to tell Stata to execute the whole do-file ignoring the lines producing exceptions (or even syntax errors)?

I run time-consuming code in the background, and oftentimes not even 10% of the code gets executed because of a small syntax mistake at the beginning of the do-file.
I would prefer the rest of the do-file to be executed as well, since sometimes that mistake at the beginning has no effect on the computations at the end.
Wanting Stata to ignore mistakes can itself be mistaken.
If there is a mistake early in a do-file, it usually has implications for what follows.
Suppose you got Stata to work as you wish. How do you know whether Stata ignored something important or something trivial? If it ignored something trivial, that should be easy to fix. If it ignored something important, that was the wrong decision.
Let's now be more constructive. The help for do tells you that there is a nostop option.
You need to be very careful about how you use it, but it can help here.
The context of do, nostop is precisely that of the OP. People had do-files, often expected to take a long time because of big datasets or lots of heavy computation, and set them going, historically "overnight" or "while you went to lunch". Then they would be irritated to find that the do-file had quickly terminated on the first error, and especially irritated if the error was trivial. So, the idea of do, nostop is to do as much as you can, but as an aid to debugging. For example, suppose you got a variable name wrong in various places; you generate Y but refer to y later, which doesn't exist. You might expect to find corresponding error messages scattered through the file, which you can fix. The error messages are the key here.
The point about do files is that once they are right, you can save yourself lots of time, but no-one ever promised that getting the do-file right in the first place would always be easy.
My firm advice is: Fix the bugs; don't try to ignore them.
P.S. capture was mentioned in another answer. capture may seem similar in spirit, but used in similar style it can be a bad idea.
capture eats error messages, so the user doesn't see them. For debugging, that is the opposite of what is needed.
capture is really a programmer's command, and its uses are where the programmer is being smart on behalf of the user and keeping quiet about it.
Suppose, for example, a variable supplied could be numeric or it could be string. If it's numeric, we need to do A; if it's string we need to do B. (Either A or B could be "nothing".) There could be branches like this.
capture confirm str variable myvar
if _rc {    // it's numeric
    <do A>
}
else {
    <do B>
}
Otherwise put, capture is for handling predictable problems if and as they arise. It is not for ignoring bugs.
If there are only a couple of commands that are hanging up and are otherwise inconsequential to your later calculations, you can always use capture (either as an inline prefix or as a block command; see help capture for usage) to force the program to run through commands that would otherwise stop it.
But--echoing Nick's general comments about this way of writing and executing do-files--be similarly careful where you apply capture: generally, you should only apply it to commands you are sure would not affect later code or calculations. Or, even better, just remove from the program those lines that are giving problems and you apparently don't need anyway.

When exactly am I required to set objects to nothing in classic asp?

On one hand the advice to always close objects is so common that I would feel foolish to ignore it (e.g. VBScript Out Of Memory Error).
However it would be equally foolish to ignore the wisdom of Eric Lippert, who appears to disagree: http://blogs.msdn.com/b/ericlippert/archive/2004/04/28/when-are-you-required-to-set-objects-to-nothing.aspx
I've worked to fix a number of web apps with OOM errors in classic asp. My first (time consuming) task is always to search the code for unclosed objects, and objects not set to nothing.
But I've never been 100% convinced that this has helped. (That said, I have found it hard to pinpoint exactly what DOES help...)
This post by Eric is talking about standalone VBScript files, not classic ASP written in VBScript. See the comments, then Eric's own comment:
Re: ASP -- excellent point, and one that I had not considered. In ASP it is sometimes very difficult to know where you are and what scope you're in.
So from this I can say that what he wrote isn't relevant for classic ASP; i.e., you should always Set everything to Nothing.
As for memory issues, I think that assigning objects (or arrays) to a global scope like Session or Application is the main cause of such problems. That's the first thing I would look for, and I would rewrite the code to hold only a single identifier in Session and use the database to manage the data.
Basically by setting a COM object to Nothing, you are forcing its terminator to run deterministically, which gives you the opportunity to handle any errors it may raise.
If you don't do it, you can get into a situation like the following:
Your code raises an error
The error isn't handled in your code and therefore ...
other objects instantiated in your code go out of scope, and their terminators run
one of the terminators raises an error
and the error that is propagated is the one from the terminator, masking the original error.
I do remember from the dark and distant past that it was specifically recommended to close ADO objects. I'm not sure if this was because of a bug in ADO objects, or simply for the above reason (which applies more generally to any objects that can raise errors in their terminators).
And this recommendation is often repeated, though often without any credible reason. ("While ASP should automatically close and free up all object instantiations, it is always a good idea to explicitly close and free up object references yourself").
It's worth noting that in the article, he's not saying you should never worry about setting objects to nothing - just that it should not be the default behaviour for every object in every script.
Though I do suspect he's a little too quick to dismiss the "I saw this elsewhere" way of picking up coding habits, I'm willing to bet that the reason this has been passed along as a hard 'n' fast rule is one Eric didn't consider: dealing with junior programmers.
When you start looking more closely at the Dreyfus model of skill acquisition, you see that at the beginning levels of acquiring a new skill, learners need simple to follow recipes. They do not yet have the knowledge or ability to make the judgement calls Eric qualifies the recommendation with later on.
Think back to when you first started programming. Could you readily judge whether you were "set[ting] expensive objects to Nothing when you are done with them if you are done with them well before they go out of scope"? Did you really know which objects were expensive, or when they truly went out of scope?
Thus, most entry level programmers are simply told "always set every object to Nothing when you are done with it" because it is within their grasp to understand and follow. Unfortunately, not many programmers take the time to self-educate, learn, and grow into the higher-level Dreyfus stages where you can use the more nuanced situational approach.
And then we come back to my earlier statement: even the best of us started out at that earlier stage, where we reflexively closed all objects because that was the best we were capable of. We left behind large bodies of code that people look at now, projecting our current competence backwards onto that earlier work and assuming we wrote it that way for reasons they don't yet understand.
I've got to get going, but I hope to expand this a little further...

How to keep code layout visually 'attractive' when introducing error handling?

When writing code, I find it important that my code looks well (apart from the fact that it has to work well). It is well described in the book Code Complete (p729): 'The visual and intellectual enjoyment of well-formatted code is a pleasure that few nonprogrammers can appreciate'.
The problem is, as soon as I get my code functionally working and start to introduce error handling (try-except clauses etc.) to make it robust, I find that this usually messes up my well-laid-out code and turns it into something that is definitely not visually pleasing. The try-except statements and additional ifs make the code less readable and structured.
I wonder if this is because I misuse or overuse error handling, or is this unavoidable?
Any tips or tricks to keep it good-looking?
It is hard to give you a general answer for this, since there are a lot of different cases of error handling and therefore a lot of different approaches to this problem. If you posted some real-world examples, you would probably get a lot of suggestions here on SO for how to improve your code.
In general, adding error handling to existing functions makes them bigger, so refactoring them into smaller methods is always a good idea. If you are looking for a more general approach, you should make yourself comfortable with aspect-oriented programming. That is an approach for keeping the code for so-called cross-cutting concerns (like error handling) completely out of your business-logic code.
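One lightweight way to approximate this idea in Python (a sketch, not a full AOP framework; the function and record format are hypothetical) is a decorator that pulls the error handling out of the business logic:

```python
import functools
import logging

def logged_errors(func):
    """Cross-cutting concern (error logging) kept out of the business logic."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logging.exception("error in %s", func.__name__)
            raise  # re-raise so callers can still decide what to do
    return wrapper

@logged_errors
def parse_record(text):
    # The business logic stays visually clean; no try/except clutter here.
    key, value = text.split("=")
    return key.strip(), value.strip()
```

The same decorator can then be applied to all twenty processing functions without touching their bodies.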
EDIT: Just one simple trick:
I avoid writing error-checks like this:
int MyFunction()
{
    if (ErrorCheck1Passes())
    {
        if (ErrorCheck2Passes())
        {
            if (ErrorCheck3Passes())
            {
                callSomeFunction(...);
            }
            else
                return failureCode3;
        }
        else
            return failureCode2;
    }
    else
        return failureCode1;
    return 0;
}
I prefer
int MyFunction()
{
    if (!ErrorCheck1Passes())
        return failureCode1;
    if (!ErrorCheck2Passes())
        return failureCode2;
    if (!ErrorCheck3Passes())
        return failureCode3;

    callSomeFunction(...);
    return 0;
}
I often wrap up chunks of code that require error handling into their own functions that will handle all possible exceptions internally, and so in a sense always succeed. The code that calls them looks cleaner, and if those functions are called more than once, then your code also becomes smaller.
This can make more sense if you are working more at the front end of an application, where from the user's point of view not every single possible exception needs to bubble up to their level. It's fine in the context of some classes for there to be a robust internal way of handling errors and moving on.
So for example, I might have functions like
SaveSettingsSafe();
// 'Safe' in the sense that all errors are handled internally before returning
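In Python, such a "handle everything internally" function might look like this sketch (save_settings_safe and the key=value file format are hypothetical):

```python
def save_settings_safe(settings, path):
    """'Safe' in the sense that all errors are handled internally.

    Returns True on success and False on failure, so the caller
    stays visually clean: no try/except at the call site.
    """
    try:
        with open(path, "w") as f:
            for key, value in settings.items():
                f.write(f"{key}={value}\n")
        return True
    except OSError:
        # In a real application, log or report the error here;
        # the caller only sees the boolean result.
        return False

# Caller stays clean:
# if not save_settings_safe(settings, "app.cfg"):
#     show_warning("Settings could not be saved")
```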
You have rediscovered a major reason that Don Knuth invented literate programming: it is all too common for error handling to obscure the main algorithm. If you're lucky, you'll have some language constructs that offer you some flexibility. For example, exceptions may let you move the error handling elsewhere, or first-class functions may make it possible to move error handling around and reduce if-then-else checks to a function call.
If you're stuck in a language without these features, like C, you might want to look into a literate-programming tool. Literate-programming tools are preprocessors, but their whole mission is to help you make your code more readable. They have a small but rabid following, and you'll be able to find some guidance on the web and in published papers.
It all depends on how you do your programming. You can avoid a lot of these try-catch (or, as you put it, try-except) statements if you just do proper input validation. Sure, it's a lot of work to cover most of the junk users tend to put into forms etc., but it will keep your data-handling routines clean (or at least cleaner).
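A small sketch of validating up front in Python (the field name and the accepted range are made up): junk input is rejected at the boundary, so the downstream routines need no try/except at all.

```python
def clean_age(raw):
    """Validate form input up front so downstream code needs no try/except."""
    text = raw.strip()
    if not text.isdigit():
        return None            # reject junk instead of catching it later
    age = int(text)
    if not 0 < age < 130:
        return None            # reject implausible values too
    return age

print(clean_age(" 42 "))   # 42
print(clean_age("abc"))    # None
```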

Using break statement even though previous line results in exit

I was reading through some code (C, if it makes a difference to anyone) today and got curious about a switch-block.
switch (c_type) {
case -1:
    some_function(some_var);
    break;
[...]
default:
    abort();
}
Now, this is a perfectly simple switch block. It's the some_function(some_var) call I'm curious about: if you, the programmer, are absolutely, positively, super duper sure that the call will result in the process exiting, do you still put the break statement underneath even though it is completely unnecessary? Would you say that it is best practice?
I would say best practice would be to have a bomb-out assert() below the function call. This serves a dual purpose: it documents the fact that this point in the runtime should be unreachable, and it produces an error message if the code somehow does reach that spot.
Leave the break in there. It doesn't matter what you are sure about: you write your programs for other humans to read, and the break makes it clear that the given case is completely separate from the case that follows.
Even though you could be absolutely certain about the code today, the specification may change tomorrow and some_function won't exit anymore. Nothing in specifications is certain (in my experience anyway).
If I'm super-duper-super-sure that the call would result in the process exiting, I'd stick an assert in just for that.
That way, if someone modifies the function and it doesn't always-terminate, the bug will be caught pretty much the first time it occurs.
EDIT: Beaten, with pretty much the same answer too :/
It's always best practice to end every case statement with a break or return, unless you want it to fall through to the next.
