Python 3 try-except: which solution is better and why? - python-3.x

Folks,
I'm trying to configure logging from an external YAML configuration file which may or may not have the necessary options, forcing me to check and fail over in several different ways. I wrote two solutions that do the same thing, but in different styles.
More traditional "C-like":
try:
    if config['log']['stream'].lower() == 'console':
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(fmt='scheduler: (%(levelname).1s) %(message)s'))
    elif config['log']['stream'].lower() == 'syslog':
        raise ValueError
    else:
        print('scheduler: (E) Failed to set log stream: Unknown stream: \'' + config['log']['stream'] + '\'. Failing over to syslog.', file=sys.stderr)
        raise ValueError
except (KeyError, ValueError) as e:
    if type(e) == KeyError:
        print('scheduler: (E) Failed to set log stream: Stream is undefined. Failing over to syslog.', file=sys.stderr)
    handler = logging.handlers.SysLogHandler(facility=logging.handlers.SysLogHandler.LOG_DAEMON, address='/dev/log')
    handler.setFormatter(logging.Formatter(fmt='scheduler[%(process)d]: (%(levelname).1s) %(message)s'))
finally:
    log.addHandler(handler)
And the "pythonic" one, with an internal procedure:
def setlogstream(stream):
    if stream == 'console':
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(fmt='scheduler: (%(levelname).1s) %(message)s'))
    elif stream == 'syslog':
        handler = logging.handlers.SysLogHandler(facility=logging.handlers.SysLogHandler.LOG_DAEMON, address='/dev/log')
        handler.setFormatter(logging.Formatter(fmt='scheduler[%(process)d]: (%(levelname).1s) %(message)s'))
    else:
        raise ValueError
    log.addHandler(handler)

try:
    setlogstream(config['log']['stream'].lower())
except KeyError:
    print('scheduler: (E) Failed to set log stream: Stream is undefined. Failing over to syslog.', file=sys.stderr)
    setlogstream('syslog')
except ValueError:
    print('scheduler: (E) Failed to set log stream: Unknown stream: \'' + config['log']['stream'] + '\'. Failing over to syslog.', file=sys.stderr)
    setlogstream('syslog')
They both do what I need, both are short, and both are extensible in case I need more streams. But now I wonder: which one is better, and why?

Saying one is "better" is mostly a matter of personal preference; if it accomplishes the task it needs to, then pick whichever way you prefer. That said, I think the second one should be used, and here's why:
Defining setlogstream() both makes it clear what that section of your code does, and allows you to reuse it later if you need to.
Using separate except clauses makes your code more readable and easier to follow. This could be especially useful if another error occurred while handling the first.
Overall, the second one is far more readable, and your future self will thank you for writing it that way.
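To illustrate the point about separate except clauses, here is a minimal, self-contained sketch. The describe_stream helper and the stub config dicts are made up for illustration; it only mirrors the shape of the failover logic, not your full handler setup:

```python
import sys

def describe_stream(config):
    # Return the chosen stream name, falling back to 'syslog' when the
    # value is missing (KeyError) or unknown (ValueError). Each failure
    # mode gets its own except clause, as recommended above.
    try:
        stream = config['log']['stream'].lower()
        if stream not in ('console', 'syslog'):
            raise ValueError(stream)
        return stream
    except KeyError:
        print('stream is undefined, failing over to syslog', file=sys.stderr)
        return 'syslog'
    except ValueError:
        print('unknown stream, failing over to syslog', file=sys.stderr)
        return 'syslog'

print(describe_stream({'log': {'stream': 'Console'}}))  # console
print(describe_stream({'log': {}}))                     # syslog
print(describe_stream({'log': {'stream': 'file'}}))     # syslog
```

Because each except clause handles exactly one failure mode, there is no need for the `type(e) == KeyError` check from the first version.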

Related

cannot upload data to s3 through lambda

I'm trying to extract data from Trusted Advisor through a Lambda function and upload it to S3. Part of the function appends the data to a list, but that block throws an error. The specific block is:
try:
    check_summary = support_client.describe_trusted_advisor_check_summaries(
        checkIds=[checks['id']])['summaries'][0]
    if check_summary['status'] != 'not_available':
        checks_list[checks['category']].append(
            [checks['name'], check_summary['status'],
             str(check_summary['resourcesSummary']['resourcesProcessed']),
             str(check_summary['resourcesSummary']['resourcesFlagged']),
             str(check_summary['resourcesSummary']['resourcesSuppressed']),
             str(check_summary['resourcesSummary']['resourcesIgnored'])
             ])
    else:
        print("unable to append checks")
except:
    print('Failed to get check: ' + checks['name'])
    traceback.print_exc()
The error log shows:
unable to append checks
I'm new to Python, so I'm unsure how to check the traceback under the else: branch. Also, am I doing anything wrong in the above? Please help.
You are not calling the s3_upload function anywhere; the code is also invalid because it uses a file_name variable that is never initialized.
I've observed a few things in your script:
traceback.print_exc() should be executed before the return statement, so that the traceback is actually printed before the function returns.
if __name__ == '__main__':
    lambda_handler
This guard is used to execute some code only when the file is run directly, not when it is imported. Also note that lambda_handler on its own only references the function; it never calls it.
According to the documentation, the first parameters of the put_object method are:
def put_object(self, bucket_name, object_name, data, length,
Fix your parameters of put_object accordingly.
you're not using s3_upload in your lambda.
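To make the `if __name__ == '__main__':` point concrete, here is a sketch of invoking the handler locally. The handler body, return value, and empty event are stand-ins, not your real code:

```python
# Hypothetical stub of the real handler; a Lambda handler always
# receives an event and a context argument.
def lambda_handler(event, context):
    # ... gather Trusted Advisor data and upload it to S3 ...
    return {'status': 'ok'}

if __name__ == '__main__':
    # Note the parentheses: `lambda_handler` on its own only references
    # the function object, so nothing runs without them.
    result = lambda_handler({}, None)
    print(result)
```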

how to throw an error if certain condition evaluates to true

I have below code block:
try:
    if str(symbol.args[0]) != str(expr.args[0]):
        print('true')
        raise SyntaxError('====error')
except:
    pass
Here I am trying to raise a SyntaxError if a certain condition is true. When I test this code block, I can see 'true' printed, which means the condition is met, but even then the code does not throw a SyntaxError.
I am trying to understand what is wrong with the above code.
You're putting pass in the except: block, which swallows the exception. Either remove the try-except block or change pass to raise.
The answer above points out the issue; I just want to give some examples to help you better understand how try/except works:
# Just raise an exception (no try/except is needed)
if 1 != 2:
    raise ValueError("Values do not match")

# Catch an exception and handle it
a = "1"
b = 2
try:
    a += b
except TypeError:
    print("Cannot add an int to a str")

# Catch an exception, do something about it and re-raise it
a = "1"
b = 2
try:
    a += b
except TypeError:
    print("Got to add an int to a str. I'm re-raising the exception")
    raise
try/except can also be followed by else and finally, you can check more about these here: try-except-else-finally
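As a quick illustration of else and finally (the parse_int helper is made up for this example): else runs only when the try block raised nothing, and finally runs on every path out of the statement.

```python
def parse_int(text):
    try:
        value = int(text)
    except ValueError:
        print("not an integer:", text)
        return None
    else:
        # Runs only when the try block raised nothing.
        print("parsed:", value)
        return value
    finally:
        # Runs on every path: success, handled error, or re-raise.
        print("done parsing", text)

parse_int("42")
parse_int("abc")
```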

How can I keep sensitive data out of logs?

Currently I have an exception which is raised in method A. In method B this exception is re-raised, adding further information. In method C I want to log the exception with yet more information.
My first attempt was to apply a string replace before logging the exception, but this does not affect the whole traceback, especially because I call the Python library requests in methodA. The first exception takes place in that library.
First exception in requests: urllib3.exceptions.MaxRetryError
Second exception in requests: requests.exceptions.ProxyError
Both exceptions within the requests library already contain the sensitive data in the traceback.
def methodA():
    try:
        connect_to_http_with_request_lib()
    except requests.exceptions.ProxyError as err:
        raise MyExceptionA(f"this log contains sensitive info in err: {err}")

def methodB():
    try:
        methodA()
    except MyExceptionA as err:
        raise MyExceptionB(f"add some more info to: {err}")

def methodC():
    try:
        methodB()
        return True
    except MyExceptionB as err:
        err = re.sub(r"(?is)password=.+", "password=xxxx", str(err))
        logger.exception(f"methodB failed exception {err}")
        return False
How can I parse the whole traceback before logging the exception, in order to mask out sensitive data?
I use loguru as my logging library.
The Django framework seems to address the same problem with its own methods. See here
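One generic approach (a sketch using only the standard library, not loguru-specific) is to format the full chained traceback yourself and scrub the resulting string before handing it to the logger. The `password=` pattern is an assumption about how the secret appears:

```python
import re
import traceback

def scrub(text):
    # Mask anything that looks like password=...; adjust the pattern to
    # match how your secrets actually appear in the traceback.
    return re.sub(r"password=\S+", "password=xxxx", text)

def log_scrubbed(exc):
    # format_exception renders the whole traceback, including chained
    # causes, as a list of strings; scrub the joined string.
    full = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    print(scrub(full))  # pass this string to your logger instead of printing

try:
    raise RuntimeError("connect failed: proxy=10.0.0.1 password=hunter2")
except RuntimeError as err:
    log_scrubbed(err)
```

Because the scrubbing happens on the already-formatted string, it covers frames from third-party libraries like requests as well.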

Can You Retry/Loop inside a Try/Except?

I'm trying to understand if it's possible to set a loop inside of a try/except call, or if I'd need to restructure it to use functions. Long story short, after spending a few hours learning Python and BeautifulSoup, I managed to frankenstein some code together to scrape a list of URLs, pull that data out to CSV (and now update it to a MySQL db). The code is now working as planned, except that I occasionally run into a 10054 error, either because my VPN hiccups or possibly because the source host server occasionally bounces me (I have a 30-second delay in my loop, but it still kicks me out on occasion).
I get the general idea of Try/Except structure, but I'm not quite sure how I would (or if I could) loop inside it to try again. My base code to grab the URL, clean it and parse the table I need looks like this:
for url in contents:
    print('Processing record', (num+1), 'of', len(contents))
    if url:
        print('Retrieving data from ', url[0])
        html = requests.get(url[0]).text
        soup = BeautifulSoup(html, 'html.parser')
        for span in soup('span'):
            span.decompose()
        trs = soup.select('div#collapseOne tr')
        if trs:
            print('Processing')
            for t in trs:
                for header, value in zip(t.select('td')[0], t.select('td:nth-child(2)')):
                    if num == 0:
                        headers.append(' '.join(header.split()))
                    values.append(re.sub(' +', ' ', value.get_text(' ', strip=True)))
After that is just processing the data to CSV and running an update sql statement.
What I'd like to do is: if the HTML request fails, wait 30 seconds and try the request again, then process; or, if the retry fails X number of times, exit the script (assuming at that point I have a full connection failure).
Is it possible to do something like that in line, or would I need to make the request statement into a function and set up a loop to call it? Have to admit I'm not familiar with how Python works with function returns yet.
You can add an inner loop for the retries and put your try/except block in that. Here is a sketch of what it would look like. You could put all of this into a function and put that function call in its own try/except block to catch other errors that cause the loop to exit.
Looking at the requests exception hierarchy, Timeout covers multiple recoverable exceptions and is a good starting point for everything you may want to catch. Other things like SSLError aren't going to get better just because you retry, so skip them. You can go through the list to see what is reasonable for you.
import itertools

# requests exceptions at
# https://requests.readthedocs.io/en/master/_modules/requests/exceptions/
for url in contents:
    print('Processing record', (num+1), 'of', len(contents))
    if url:
        print('Retrieving data from ', url[0])
        retry_count = itertools.count()
        # loop for retries
        while True:
            try:
                # get with timeout and convert http errors to exceptions
                resp = requests.get(url[0], timeout=10)
                resp.raise_for_status()
            # the things you want to recover from
            except requests.Timeout as e:
                if next(retry_count) <= 5:
                    print("timeout, wait and retry:", e)
                    time.sleep(30)
                    continue
                else:
                    print("timeout, exiting")
                    raise  # reraise exception to exit
            except Exception as e:
                print("unrecoverable error", e)
                raise
            break
        html = resp.text
        # etc…
I've made a little example myself to illustrate this, and yes, you can put loops inside try/except blocks.
from sys import exit

def example_func():
    try:
        while True:
            num = input("> ")
            try:
                int(num)
                if num == "10":
                    print("Let's go!")
                else:
                    print("Not 10")
            except ValueError:
                exit(0)
    except:
        exit(0)

example_func()
example_func()
This is a fairly simple program that takes input; if it's 10, it says "Let's go!", otherwise it tells you it's not 10 (and if the input isn't a valid number, it just kicks you out).
Notice that inside the while loop I put a try/except block, taking into account the necessary indentation. You can take this program as a model and adapt it to your needs.

Effective method of error handling a function

Example error handling function:
def read_file(filename):
    try:
        with open(filename, 'rb') as fd:
            x = fd.read()
    except FileNotFoundError as e:
        return e
    return x
I would call the function like so:
file = read_file("test.txt")
if file:
    # do something
Is there a more efficient/effective way to handle errors than using return multiple times?
It's very strange to catch e and then return it; why would a user of your function want the error to be returned instead of raised? Returning an error doesn't handle it, it just passes responsibility to the caller to handle the error. Letting the error be raised is a more natural way to make the caller responsible for handling it. So it makes sense not to catch the error at all:
def read_file(filename):
    with open(filename, 'rb') as fd:
        return fd.read()
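With that version, the caller handles the missing-file case itself, for example (the file name here is just for illustration):

```python
def read_file(filename):
    # Let FileNotFoundError propagate to the caller.
    with open(filename, 'rb') as fd:
        return fd.read()

try:
    data = read_file('test.txt')
except FileNotFoundError:
    print('test.txt does not exist')
else:
    print(len(data), 'bytes read')
```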
For your desired use-case where you want to write if file: to test whether the file existed, your read_file function could catch the error and return None, so that your if condition will be falsy:
def read_file(filename):
    try:
        with open(filename, 'rb') as fd:
            return fd.read()
    except FileNotFoundError:
        return None
However, this means that if the caller isn't aware that the function might return None, you'll get an error from using None where the file data was expected, instead of a FileNotFoundError, and it will be harder to identify the problem in your code.
If you do intend for your function to be called with a filename that might not exist, naming the function something like read_file_if_exists might be a better way to make clear that calling this function with a non-existent filename isn't considered an error.
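A sketch of that idea, with the more explicit name making the None return obvious at the call site:

```python
def read_file_if_exists(filename):
    # Returns the file's bytes, or None when the file does not exist;
    # the name signals that a missing file is expected, not an error.
    try:
        with open(filename, 'rb') as fd:
            return fd.read()
    except FileNotFoundError:
        return None

data = read_file_if_exists('maybe-missing.txt')
if data is None:
    print('no such file, using defaults')
```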
