python3/logging: Optionally write to more than one stream

I'm successfully using the logging module in my python3 program to send log messages to a log file, for example, /var/log/myprogram.log. In certain cases, I want a subset of those messages to also go to stdout, with them formatted through my logging.Logger instance in the same way that they are formatted when they go to the log file.
Assuming that my logger instance is called loginstance, I'd like to put some sort of wrapper around loginstance.log(level, msg) to let me choose whether the message only goes to /var/log/myprogram.log, or whether it goes there and also to stdout, as follows:
# Assume `loginstance` has already been instantiated
# as a global, and that it knows to send logging info
# to `/var/log/myprogram.log` by default.
def mylogger(level, msg, with_stdout=False):
    if with_stdout:
        # Somehow send `msg` through `loginstance` so
        # that it goes BOTH to `/var/log/myprogram.log`
        # AND to `stdout`, with identical formatting.
        ...
    else:
        # Send only to `/var/log/myprogram.log` by default.
        loginstance.log(level, msg)
I'd like to manage this with one, single logging.Logger instance, so that if I want to change the format or other logging behavior, I only have to do this in one place.
I'm guessing that this involves subclassing logging.Logger and/or logging.Formatter, but I haven't figured out how to do this.
Thank you in advance for any suggestions.

I figured out how to do it. It simply requires that I use a FileHandler subclass and pass an extra argument to log() ...
import logging
import sys

class MyFileHandler(logging.FileHandler):
    def emit(self, record):
        # Always write the record to the log file first.
        super().emit(record)
        also_use_stdout = getattr(record, 'also_use_stdout', False)
        if also_use_stdout:
            # Temporarily point the handler's stream at stdout, emit the
            # same record again, then restore the file stream.
            savestream = self.stream
            self.stream = sys.stdout
            try:
                super().emit(record)
            finally:
                self.stream = savestream
When instantiating my logger, I do this ...
logger = logging.getLogger('myprogram')
logger.addHandler(MyFileHandler('/var/log/myprogram.log'))
Then, the mylogger function that I described above will look like this:
def mylogger(level, msg, with_stdout=False):
    loginstance.log(level, msg, extra={'also_use_stdout': with_stdout})
This works because anything passed to the log function within the optional extra dictionary becomes an attribute of the record object that ultimately gets passed to emit.
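To see the identical formatting end to end, attach a formatter to the single handler; here is a minimal sketch (the format string is illustrative, not from the original post):
import logging

# Assumes MyFileHandler and mylogger from above are already defined.
handler = MyFileHandler('/var/log/myprogram.log')
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))

loginstance = logging.getLogger('myprogram')
loginstance.setLevel(logging.DEBUG)
loginstance.addHandler(handler)

mylogger(logging.INFO, 'file only')                 # log file only
mylogger(logging.WARNING, 'file and stdout', True)  # log file and stdout, same format
Because there is only one handler and one formatter, any formatting change automatically applies to both destinations.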

Related

How to set log level with structlog when not using event parameter?

The idiomatic way (I think) to create a logger in structlog that only prints messages at or above a certain log level is to use the following:
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
This works fine, but it breaks with the following pattern:
l = logger.bind(event="get_tar", key=value)
l.info(status="download_start")
buf = f.read()
l.info(status="download_finish")
By default, when using the logfmt format, structlog will print the "message" as the event key, so I'd just like to set it directly.
Anyway, this breaks because under the hood make_filtering_bound_logger calls this:
def make_method(level: int) -> Callable[..., Any]:
    if level < min_level:
        return _nop

    name = _LEVEL_TO_NAME[level]

    def meth(self: Any, event: str, **kw: Any) -> Any:
        return self._proxy_to_logger(name, event, **kw)

    meth.__name__ = name

    return meth
which requires an event kwarg to exist. Is there a workaround?
event is the only (reasonable :)) key that cannot be bound – it’s always the log message. That’s not a matter of make_… but of structlog’s internals in general.
You can get something similar-ish by renaming a key-value pair using the EventRenamer processor.
See also https://github.com/hynek/structlog/issues/35
It’s good you brought this up, I’m currently rewriting the docs and am looking for common recipes.
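For illustration, a minimal sketch of the EventRenamer workaround; the key names msg and status are illustrative choices, not part of structlog's API:
import logging
import structlog

structlog.configure(
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    processors=[
        structlog.processors.add_log_level,
        # Move the positional log message into "status" and promote the
        # bound "msg" key to "event".
        structlog.processors.EventRenamer(to="status", replace_by="msg"),
        structlog.processors.KeyValueRenderer(key_order=["event", "status"]),
    ],
)

log = structlog.get_logger().bind(msg="get_tar", key="value")
log.info("download_start")    # roughly: event='get_tar' status='download_start' key='value' ...
log.info("download_finish")
This keeps the positional event argument free for the filtering bound logger while still letting the rendered "event" carry the bound value.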

How to log the return value of a POST method after returning the response?

I'm working on my first ever REST API, so apologies in advance if I've missed something basic. I have a function that takes a JSON request from another server, processes it (makes a prediction based on the data), and returns another JSON with the results. I'd like to keep a log on the server's local disk of all requests to this endpoint along with their results, for evaluation purposes and for retraining the model. However, for the purposes of minimising the latency of returning the result to the user, I'd like to return the response data first, and then write it to the local disk. It's not obvious to me how to do this properly, as the FastAPI paradigm necessitates that the result of a POST method is the return value of the decorated function, so anything I want to do with the data has to be done before it is returned.
Below is a minimal working example of what I think is my closest attempt at getting it right so far, using a custom object with a log decorator - my idea was just to assign the result to the log object as a class attribute, then use another method to write it to disk, but I can't figure out how to make sure that that function gets called after get_data every time.
import json
import uvicorn
from fastapi import FastAPI, Request
from functools import wraps
from pydantic import BaseModel

class Blob(BaseModel):
    id: int
    x: float

def crunch_numbers(data: Blob) -> dict:
    # does some stuff
    return {'foo': 'bar'}

class PostResponseLogger:
    def __init__(self) -> None:
        self.post_result = None

    def log(self, func, *args, **kwargs):
        @wraps(func)
        def func_to_log(*args, **kwargs):
            post_result = func(*args, **kwargs)
            self.post_result = post_result
            # how can this be done outside of this function ???
            self.write_data()
            return post_result
        return func_to_log

    def write_data(self):
        if self.post_result:
            with open('output.json', 'w') as f:
                json.dump(self.post_result, f)

def main():
    app = FastAPI()
    logger = PostResponseLogger()

    @app.post('/get_data/')
    @logger.log
    def get_data(input_json: dict, request: Request):
        result = crunch_numbers(input_json)
        return result

    uvicorn.run(app=app)

if __name__ == '__main__':
    main()
Basically, my question boils down to: "is there a way, in the PostResponseLogger class, to automatically call self.write_data after every call to self.log?", but if I'm using the wrong approach altogether, any other suggestions are also welcome.
You could have a Background Task for that purpose. A background task "will run only once the response has been sent" (see Starlette documentation). "This is useful for operations that need to happen after a request, but that the client doesn't really have to be waiting for the operation to complete before receiving the response" (see FastAPI documentation).
You can define a task function to run in the background for writing the log data, as shown below:
def write_log_data():
    logger.write_data()
Then, import BackgroundTasks and define a parameter in your endpoint with a type declaration of BackgroundTasks. Inside of your endpoint, pass your task function (i.e., write_log_data, as defined above) to the background_tasks object with the method .add_task():
from fastapi import BackgroundTasks

@app.post('/get_data/')
@logger.log
def get_data(input_json: dict, request: Request, background_tasks: BackgroundTasks):
    result = crunch_numbers(input_json)
    background_tasks.add_task(write_log_data)
    return result
The same principle could be applied if a middleware was used to capture and log the response data, as described in this answer, or a custom APIRoute class, as demonstrated in this answer.
For future reference, if you (or anyone) ever need to use async/await syntax, and run into concurrency issues (such as the event loop getting blocked) while performing some heavy background computation, please have a look at this answer, which explains the difference between defining an endpoint or a background task function with async def and def (briefly, async def endpoints/background tasks will run in the event loop, whereas def functions will run in an external threadpool that is then awaited), as well as provides solutions when it comes to running blocking I/O-bound or CPU-bound operations in such functions.
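Putting the pieces together, here is a consolidated sketch of the question's example using only a background task and dropping the PostResponseLogger wrapper entirely (names are carried over from the question; treat it as one possible arrangement, not the only one):
import json
import uvicorn
from fastapi import BackgroundTasks, FastAPI
from pydantic import BaseModel

app = FastAPI()

class Blob(BaseModel):
    id: int
    x: float

def crunch_numbers(data: Blob) -> dict:
    # does some stuff
    return {'foo': 'bar'}

def write_log_data(result: dict):
    # Runs only after the response has been sent to the client.
    with open('output.json', 'w') as f:
        json.dump(result, f)

@app.post('/get_data/')
def get_data(blob: Blob, background_tasks: BackgroundTasks):
    result = crunch_numbers(blob)
    background_tasks.add_task(write_log_data, result)
    return result

if __name__ == '__main__':
    uvicorn.run(app)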

Get a user's keyboard input that was requested by another function

I am using a python package for database management. The provided class has a method delete() that deletes a record from the database. Before deleting, it asks the user to confirm the operation on the console, e.g. Proceed? [yes, No]:
My function needs to perform other actions depending on whether the user chose to delete the record. Can I get the user's input that was requested by the package's function?
Toy example:
def ModuleFunc():
    while True:
        a = input('Proceed? [yes, No]:')
        if a in ['yes', 'No']:
            # Perform some actions behind the hood
            return
This function will wait for one of the two responses and return None once it gets either. After calling this function, can I determine the user's response (without modifying this function)? I think modifying the package's source code is not a good idea in general.
Why not just patch the class at runtime? Say you had a file ./lib/db.py defining a class DB like this:
class DB:
    def __init__(self):
        pass

    def confirm(self, msg):
        a = input(msg + ' [Y, N]:')
        if a == 'Y':
            return True
        return False

    def delete(self):
        if self.confirm('Delete?'):
            print('Deleted!')
        return
Then in main.py you could do:
from lib.db import DB

def newDelete(self):
    if self.confirm('Delete?'):
        print('Do some more stuff!')
        print('Deleted!')
    return

DB.delete = newDelete
test = DB()
test.delete()
I would save key events somewhere (a file or memory) with something like a keylogger. Then you will be able to reuse the last one.
However, if you can modify the module package 📦 and redistribute it, that would be easier.
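A lighter-weight variant of that idea, assuming the package ultimately calls the built-in input(), is to wrap input() so the answer is recorded without touching the package's source. A minimal sketch (the _last_answer name is illustrative):
import builtins

_last_answer = None
_original_input = builtins.input

def _recording_input(prompt=""):
    # Delegate to the real input() but remember what the user typed.
    global _last_answer
    _last_answer = _original_input(prompt)
    return _last_answer

builtins.input = _recording_input

ModuleFunc()                      # the package prompts the user as usual
if _last_answer == 'yes':
    print('User confirmed the operation')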

Unit Testing a Method That Uses a Context Manager

I have a method I would like to unit test. The method expects a file path, which is then opened - using a context manager - to parse a value, which is returned should it be present. Simple enough.
@staticmethod
def read_in_target_language(file_path):
    """
    .. note:: Language code attributes/values can occur
    on either the first or the second line of bilingual.
    """
    with codecs.open(file_path, 'r', encoding='utf-8') as source:
        line_1, line_2 = next(source), next(source)
        get_line_1 = re.search(
            '(target-language=")(.+?)(")', line_1, re.IGNORECASE)
        get_line_2 = re.search(
            '(target-language=")(.+?)(")', line_2, re.IGNORECASE)
        if get_line_1 is not None:
            return get_line_1.group(2)
        else:
            return get_line_2.group(2)
I want to avoid testing against external files - for obvious reasons - and do not wish to create temp files. In addition, I cannot use StringIO in this case.
How can I mock the file_path object in my unit test case? Ultimately I would need to create a mock path that contains differing values. Any help is gratefully received.
(Disclaimer: I don't speak Python, so I'm likely to err in details)
I suggest that you instead mock codecs. Make the mock's open method return an object with test data to be returned from the read calls. That might involve creating another mock object for the return value; I don't know if there are some stock classes in Python that you could use for that purpose instead.
Then, in order to actually enable testing the logic, add a parameter to read_in_target_language that represents an object that can assume the role of the original codecs object, i.e. dependency injection by argument. For convenience I guess you could default it to codecs.
I'm not sure how far Python's duck typing goes with regards to static vs instance methods, but something like this should give you the general idea:
def read_in_target_language(file_path, opener=codecs):
    ...
    with opener.open(file_path, 'r', encoding='utf-8') as source:
If the above isn't possible you could just add a layer of indirection:
class CodecsOpener:
    ...
    def open(self, file_path, access, encoding):
        return codecs.open(file_path, access, encoding)

class MockOpener:
    ...
    def __init__(self, open_result):
        self.open_result = open_result
    def open(self, file_path, access, encoding):
        return self.open_result

...

def read_in_target_language(file_path, opener=CodecsOpener()):
    ...
    with opener.open(file_path, 'r', encoding='utf-8') as source:
        ...

...

def test():
    readable_data = ...
    opener = MockOpener(readable_data)
    result = <class>.read_in_target_language('whatever', opener)
    <check result>
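Alternatively, on Python 3.7+ unittest.mock.mock_open can fake the file without changing the method's signature. A minimal sketch, where the module path mymodule and the class name Parser are assumptions standing in for your own code:
import unittest
from unittest import mock

import mymodule  # assumed module that defines the class and imports codecs

class ReadInTargetLanguageTest(unittest.TestCase):
    def test_reads_code_from_first_line(self):
        fake_file = 'header target-language="de-DE" more\nsecond line\n'
        # Patch codecs.open as seen by the module under test; mock_open
        # handles support next()/iteration on Python 3.7+.
        with mock.patch('mymodule.codecs.open',
                        mock.mock_open(read_data=fake_file)):
            result = mymodule.Parser.read_in_target_language('ignored.xlf')
        self.assertEqual(result, 'de-DE')

if __name__ == '__main__':
    unittest.main()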

Python 3 logging - MemoryHandler and flushOnClose behaviour

If I forget to close the MemoryHandler before the end of the script the log message 'debug' is displayed even though flushOnClose=False (Python 3.6).
Am I doing something wrong, or is this the expected behaviour? I would have thought flushOnClose would be obeyed regardless of how the handler is closed (i.e. when the script ends).
import logging.config
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
# file handler, triggered by the memory handler
fh = logging.FileHandler('log.txt')
# set the logging level
fh.setLevel(logging.DEBUG)
# capacity is the number of records
mh = logging.handlers.MemoryHandler(5, flushLevel=logging.ERROR, target=fh, flushOnClose=False)
logger.addHandler(mh)
logger.debug('debug')
# mh.close()
For the arguments 5, flushLevel=logging.ERROR, target=fh, flushOnClose=False, the 'debug' message should not be displayed, because:
I have not added 5 messages to the buffer
flushOnClose=False, therefore when the script ends there should not be a flush
a debug record does not reach flushLevel, so it does not trigger a flush
I find that when I use mh.close() the message does not flush, as expected. However, when the script ends without mh.close() (commented out), the single debug message seems to get flushed anyway, despite the settings suggesting that it shouldn't.
I faced this issue too, where the logger was not supposed to print anything unless an 'error' event was encountered.
I had to manually call close() on all MemoryHandler instances attached to my logger, via atexit:
def _close_all_memory_handlers():
    for handler in Logger.handlers:
        if isinstance(handler, logging.handlers.MemoryHandler):
            handler.close()

import atexit
atexit.register(_close_all_memory_handlers)
This should work as long as you register this atexit handler after logging module is initialized.
I think this is the correct behaviour:
logger.debug('debug') will print 'debug' to your file without waiting for any flush.
Sorry... yes, the default is True. I saw the addition above, and in my opinion the behaviour is normal, in the sense that if you do NOT terminate the handler, everything is flushed at the end of execution (this is typical, so you can debug what went wrong). If you do terminate, the message was appended to the buffer and the False causes it to be discarded with the buffer. Isn't that the right behaviour?
In addition, the flushOnClose parameter does not exist in the version of the handler class shown below:
class MemoryHandler(BufferingHandler):
    """
    A handler class which buffers logging records in memory, periodically
    flushing them to a target handler. Flushing occurs whenever the buffer
    is full, or when an event of a certain severity or greater is seen.
    """
    def __init__(self, capacity, flushLevel=logging.ERROR, target=None):
        """
        Initialize the handler with the buffer size, the level at which
        flushing should occur and an optional target.

        Note that without a target being set either here or via setTarget(),
        a MemoryHandler is no use to anyone!
        """
        BufferingHandler.__init__(self, capacity)
        self.flushLevel = flushLevel
        self.target = target

    def shouldFlush(self, record):
        """
        Check for buffer full or a record at the flushLevel or higher.
        """
        return (len(self.buffer) >= self.capacity) or \
                (record.levelno >= self.flushLevel)

    def setTarget(self, target):
        """
        Set the target handler for this handler.
        """
        self.target = target

    def flush(self):
        """
        For a MemoryHandler, flushing means just sending the buffered
        records to the target, if there is one. Override if you want
        different behaviour.

        The record buffer is also cleared by this operation.
        """
        self.acquire()
        try:
            if self.target:
                for record in self.buffer:
                    self.target.handle(record)
                self.buffer = []
        finally:
            self.release()

    def close(self):
        """
        Flush, set the target to None and lose the buffer.
        """
        try:
            self.flush()
        finally:
            self.acquire()
            try:
                self.target = None
                BufferingHandler.close(self)
            finally:
                self.release()
Anyway, the behaviour is normal, in the sense that even when you open a file you can decide whether or not to close it at the end; in the end the file gets closed anyway so that no information is lost :-)
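A likely explanation for the original observation, based on my reading of CPython's logging source (treat it as an assumption to verify against your Python version): logging.shutdown(), which the logging module registers with atexit, calls flush() on every live handler before calling close(). flushOnClose only controls what close() itself does, so the buffer is still flushed at interpreter exit unless the handler was already closed (and its buffer emptied) beforehand. A minimal sketch of the two paths:
import logging
import logging.handlers

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

fh = logging.FileHandler('log.txt')
mh = logging.handlers.MemoryHandler(5, flushLevel=logging.ERROR, target=fh,
                                    flushOnClose=False)
logger.addHandler(mh)
logger.debug('debug')

# Path 1: close explicitly. flushOnClose=False is honoured, the buffered
# record is discarded and never reaches log.txt.
mh.close()

# Path 2: do nothing. At interpreter exit, logging.shutdown() (registered
# via atexit) calls flush() before close(), so the record is written anyway.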
