In a library, I have declared a custom logger like this in the file log.py:
import os
import sys
import logging
LOG_LEVEL = os.getenv('LOG_LEVEL', 'INFO')
# disable root logger
root_logger = logging.getLogger()
root_logger.disabled = True
# create custom logger
logger = logging.getLogger('my-handle')
logger.removeHandler(sys.stdout)  # note: sys.stdout is not a Handler, so this call is a no-op
logger.setLevel(logging.getLevelName(LOG_LEVEL))
formatter = logging.Formatter(
    '{"timestamp": "%(asctime)s", "level": "%(levelname)s", "logger": "%(name)s", '
    '"filename": "%(filename)s", "message": "%(message)s"}',
    '%Y-%m-%dT%H:%M:%S%z')
handler = logging.FileHandler('file.log', encoding='utf-8')
handler.setFormatter(formatter)
handler.setLevel(logging.getLevelName(LOG_LEVEL))
logger.addHandler(handler)
I then call this logger from other files by doing:
from log import logger
logger.debug('something')
When running the code above on my computer, it only sends the logs to the file.log file. But when running the exact same code on a Lambda function, all the logs appear in CloudWatch. What is needed to disable the CloudWatch logs? They cost a lot for stuff I don't want and already log somewhere else (in S3, for that matter, which is much cheaper).
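A likely explanation: the Lambda Python runtime pre-installs its own handler on the root logger, and records logged to 'my-handle' propagate up to that handler (root_logger.disabled does not stop this, since disabled is only checked on the logger you call). A minimal sketch of the usual mitigation, assuming that is indeed the cause:

# In log.py: keep records from bubbling up to the root logger,
# whose Lambda-installed handler is what ships them to CloudWatch
logger = logging.getLogger('my-handle')
logger.propagate = False

With propagation off, only the FileHandler attached to 'my-handle' sees the records.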
I am able to store error logs to a file... but not info() or any other logging levels.
What am I doing wrong?
How can I store logs of any level with the FileHandler?
code.py
import sys
import logging
def setup_logging():
    global logger
    logger = logging.getLogger()
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    open('data_module.log', 'w').close()  # empty logs
    global fileHandler
    fileHandler = logging.FileHandler('data_module.log')
    fileHandler.setFormatter(formatter)
    fileHandler.setLevel(logging.DEBUG)
    logger.addHandler(fileHandler)
    logger.error('Started')  # info
    logger.info('information')  # info
test.py:
import code as c
c.setup_logging()
with open('data_module.log', 'r') as fileHandler:
    logs = [l.rstrip() for l in fileHandler.readlines()]
open('data_module.log', 'w').close() # empty logs
assert len(logs) == 2
Error:
AssertionError: assert 1 == 2
Please let me know if there's anything else I should add to the post.
You need to set the level for the logger itself:
logger.setLevel(logging.DEBUG)
The default log level is WARNING: when you write a DEBUG- or INFO-level message, the logger does not handle it (i.e. send it to its handlers), so the handler you added is never invoked.
A handler can have its own level, but that level is consulted only after the handler is invoked: if a logger does send a DEBUG message to a handler that is only interested in INFO+ messages, the handler does nothing with it.
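Applied to the question's setup_logging, the fix is one extra line; a minimal sketch:

def setup_logging():
    global logger
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)  # let DEBUG and above reach the handlers
    ...  # create and attach the FileHandler exactly as before

With the logger's level at DEBUG, both the error and the info records reach the FileHandler, and the test's assertion of two lines passes.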
I am trying to set up a logging mechanism for a Python module.
The following is the example code that I have written to set up logging:
import logging
def init_logger(logger):
    formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(filename)s - %(funcName)s - %(message)s')
    ch = logging.StreamHandler()
    ch.setFormatter(formatter)
    ch.setLevel(logging.INFO)
    logger.addHandler(ch)
    file_handler = logging.FileHandler('test_logging.log')
    file_handler.setFormatter(formatter)
    file_handler.setLevel(logging.DEBUG)
    logger.addHandler(file_handler)

def foo1():
    logger = logging.getLogger(__name__)
    logger.info('Test Info')
    logger.debug('Test Debug')
    logger.error('Test Error')

def foo2():
    logger = logging.getLogger(__name__)
    logger.info('Test Info')
    logger.debug('Test Debug')
    logger.error('Test Error')

if __name__ == '__main__':
    logger = logging.getLogger(__name__)
    init_logger(logger)
    foo1()
    foo2()
I expect INFO level and above to be printed to stdout and DEBUG level and above to be written to the log file. But what I see is that only ERROR level is output to both stdout and the log file.
2019-08-13 11:20:07,775 - ERROR - test_logger.py - foo1 - Test Error
2019-08-13 11:20:07,776 - ERROR - test_logger.py - foo2 - Test Error
As per the documentation, getLogger should return the same logger instance each time. I even tried to create a new instance the first time, like logger = logging.Logger(__name__), but no luck with that. I don't understand what I am missing here.
Short answer: you must use logging.basicConfig(level=...) or logger.setLevel in your code.
When you use logging.getLogger('some_name') for the first time, you create a new logger with level = NOTSET = 0.
# logging module source code
class Logger(Filterer):
    def __init__(self, name, level=NOTSET):
        ...
logging.NOTSET seems to be a valid level value, but it is not a real threshold: it means the logger has no level of its own and forces the logger to use the level of its parent (ultimately the root logger). This logic is defined in the Logger.getEffectiveLevel method:
# logging module source code
def getEffectiveLevel(self):
    logger = self
    while logger:
        if logger.level:  # 0 gives False here
            return logger.level
        logger = logger.parent  # 0 makes this line reachable
    return NOTSET
The root logger has level=WARNING, so newly created loggers inherit this level:
# logging module source code
root = RootLogger(WARNING)
logging.getLogger does not allow you to specify a logging level, so you have to use logging.basicConfig to modify the root logger, or logger.setLevel to modify the newly created logger, somewhere at the very beginning of the script.
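For the question's code, either option works; a minimal sketch:

if __name__ == '__main__':
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)  # or: logging.basicConfig(level=logging.DEBUG)
    init_logger(logger)
    foo1()
    foo2()

With the logger's own level at DEBUG, each handler then applies its own threshold: INFO and above go to stdout, DEBUG and above go to the file.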
I guess this behaviour should be documented in the logging module guides/documentation.
I'm using Google App Engine Python 3.7 standard and I'm trying to group related request log entries.
According to the Writing Application Logs documentation, I should:
Set the trace identifier in the LogEntry trace field of your app log
entries. The expected format is
projects/[PROJECT_ID]/traces/[TRACE_ID]
Where/how should I use LogEntry?
The Stackdriver Logging documentation doesn't show how it's possible. Am I missing something?
Code examples would be much appreciated.
[UPDATE]
Following Duck Hunt Duo's advice, I tried the following, without any success:
trace_id = request.headers.get('X-Cloud-Trace-Context', 'no_trace_id').split('/')[0]
client = logging.Client()
logger = client.logger('appengine.googleapis.com%2Fstdout') # Not shown
# logger = client.logger('projects/{}/logs/stdout'.format(GOOGLE_CLOUD_PROJECT)) # error
# logger = client.logger('projects/{}/logs/appengine.googleapis.com%2Fstdout'.format(GOOGLE_CLOUD_PROJECT)) # error
logger.log_text('log_message', trace=trace_id)
The log doesn't appear in the GAE service log web console.
This is my basic solution:
trace_id = request.headers.get('X-Cloud-Trace-Context', 'no_trace_id').split('/')[0]
trace_str = "projects/{}/traces/{}".format(os.getenv('GOOGLE_CLOUD_PROJECT'), trace_id)
log_client = logging.Client()
# This is the resource type of the log
log_name = 'stdout'
# Inside the resource, nest the required labels specific to the resource type
labels = {
    'module_id': os.getenv('GAE_SERVICE'),
    'project_id': os.getenv('GOOGLE_CLOUD_PROJECT'),
    'version_id': os.getenv('GAE_VERSION')
}
res = Resource(type="gae_app",
               labels=labels)
logger = log_client.logger(log_name)
logger.log_text("MESSAGE_STRING_TO_LOG", resource=res, severity='ERROR', trace=trace_str)
After it was working, I wrapped it in a file so it would work similarly to Google's logger for Python 2.7.
Here is my_gae_logging.py:
import logging as python_logging
import os
from flask import request
from google.cloud import logging as gcp_logging
from google.cloud.logging.resource import Resource
# From GCP logging lib for Python2.7
CRITICAL = 50
FATAL = CRITICAL
ERROR = 40
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0
_levelNames = {
    CRITICAL: 'CRITICAL',
    ERROR: 'ERROR',
    WARNING: 'WARNING',
    INFO: 'INFO',
    DEBUG: 'DEBUG',
    NOTSET: 'NOTSET',
    'CRITICAL': CRITICAL,
    'ERROR': ERROR,
    'WARN': WARNING,
    'WARNING': WARNING,
    'INFO': INFO,
    'DEBUG': DEBUG,
    'NOTSET': NOTSET,
}

def get_trace_id():
    trace_str = None
    try:
        trace_id = request.headers.get('X-Cloud-Trace-Context', 'no_trace_id').split('/')[0]
        trace_str = "projects/{project_id}/traces/{trace_id}".format(
            project_id=os.getenv('GOOGLE_CLOUD_PROJECT'),
            trace_id=trace_id)
    except Exception:
        pass
    return trace_str

class Logging:
    def __init__(self):
        self._logger = None

    @property
    def logger(self):
        if self._logger is not None:
            return self._logger
        log_client = gcp_logging.Client()
        # This is the resource type of the log
        log_name = 'appengine.googleapis.com%2Fstdout'
        # Inside the resource, nest the required labels specific to the resource type
        self._logger = log_client.logger(log_name)
        return self._logger

    @property
    def resource(self):
        resource = Resource(
            type="gae_app",
            labels={
                'module_id': os.getenv('GAE_SERVICE'),
                'project_id': os.getenv('GOOGLE_CLOUD_PROJECT'),
                'version_id': os.getenv('GAE_VERSION')
            }
        )
        return resource

    def log(self, text):
        text = str(text)
        self.logger.log_text(text, resource=self.resource, trace=get_trace_id())

    def debug(self, text):
        text = str(text)
        self.logger.log_text(text, resource=self.resource, severity=_levelNames.get(DEBUG), trace=get_trace_id())

    def info(self, text):
        text = str(text)
        self.logger.log_text(text, resource=self.resource, severity=_levelNames.get(INFO), trace=get_trace_id())

    def warning(self, text):
        text = str(text)
        self.logger.log_text(text, resource=self.resource, severity=_levelNames.get(WARNING), trace=get_trace_id())

    def warn(self, text):
        return self.warning(text)

    def error(self, text):
        text = str(text)
        self.logger.log_text(text, resource=self.resource, severity=_levelNames.get(ERROR), trace=get_trace_id())

    def critical(self, text):
        text = str(text)
        self.logger.log_text(text, resource=self.resource, severity=_levelNames.get(CRITICAL), trace=get_trace_id())

if os.getenv('GAE_VERSION'):  # check if running under the GCP env
    logging = Logging()
else:
    # when not running under the GCP env, use standard python_logging
    logging = python_logging
Usage:
from my_gae_logging import logging
logging.warn('this is my warning')
You might want to take a look at an answer I provided here.
(This answer addresses how to add logging severity to Cloud Functions logs written into Stackdriver, but the basic workflow is the same)
Quoting it:
[...], you can still create logs with certain severity by using the Stackdriver Logging Client Libraries. Check this documentation in reference to the Python libraries, and this one for some usage-case examples.
Notice that in order to let the logs be under the correct resource, you will have to configure them manually; see this list for the supported resource types. As well, each resource type has some required labels that need to be present in the log structure.
Edit:
Updating the previous answer with an example for App Engine:
from google.cloud import logging
from google.cloud.logging.resource import Resource
from flask import Flask

app = Flask(__name__)

@app.route('/')
def logger():
    log_client = logging.Client()
    log_name = 'appengine.googleapis.com%2Fstdout'
    res = Resource(type='gae_app',
                   labels={
                       "project_id": "MY-PROJECT-ID",
                       "module_id": "MY-SERVICE-NAME"
                   })
    logger = log_client.logger(log_name)
    # As an example, log a message with ERROR severity
    logger.log_struct({"message": "message string to log"}, resource=res, severity='ERROR')
    return 'Wrote logs to {}.'.format(logger.name)
Using this code as an example, changing the resource type of the log to appengine.googleapis.com%2Fstdout should work; also change the Resource fields to be the same as in the gae_app labels described here.
Using the AppEngineHandler from Google Cloud Logging provides much of the infrastructure. It attaches to the standard Python logging module, so a plain import logging works.
Setting this up is straightforward enough:
# Setup google cloud logging.
import logging
import google.cloud.logging # Don't conflict with standard logging
from google.cloud.logging.handlers import AppEngineHandler, setup_logging
client = google.cloud.logging.Client()
handler = AppEngineHandler(client, name='stdout')
logging.getLogger().setLevel(logging.INFO)
setup_logging(handler)
The documentation at https://googleapis.dev/python/logging/latest/usage.html#cloud-logging-handler suggests something very similar, but uses the CloudLoggingHandler instead of the AppEngineHandler. It also states that the AppEngineHandler is for the flexible environment, but this works in the standard Python 3 environment.
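With that setup in place, ordinary stdlib logging calls are forwarded through the attached handler; a minimal usage sketch:

import logging

# Both records are routed through the handler installed by setup_logging() above
logging.info('Handled request')
logging.getLogger('my.module').warning('Something looks off')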
The Stackdriver Logging Client Library can be used to achieve this. The logger.log_text function sends a LogEntry object to the API. Example:
from google.cloud import logging
client = logging.Client()
logger = client.logger('appengine.googleapis.com%2Fstdout')
logger.log_text('log_message', trace=trace_id)
The trace_id should be retrieved from the request headers, as the docs mention. The method of doing this will depend on how you're serving requests, but in Flask, for example, it would be as simple as trace_id = request.headers['X-Cloud-Trace-Context'].split('/')[0]
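Note that the trace field expects the full projects/[PROJECT_ID]/traces/[TRACE_ID] string quoted in the question, so the raw header value usually needs to be wrapped first. A minimal sketch, assuming Flask, the GOOGLE_CLOUD_PROJECT environment variable, and a hypothetical helper name log_with_trace:

import os
from flask import request
from google.cloud import logging

def log_with_trace(text):  # hypothetical helper
    # Build the full trace name the LogEntry expects
    trace_id = request.headers.get('X-Cloud-Trace-Context', '').split('/')[0]
    trace = 'projects/{}/traces/{}'.format(os.getenv('GOOGLE_CLOUD_PROJECT'), trace_id)
    client = logging.Client()
    logger = client.logger('appengine.googleapis.com%2Fstdout')
    logger.log_text(text, trace=trace)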
I have a Node.js application that runs a client interface which exposes an action that triggers machine-learning tasks. Since Python is a better choice for implementing machine-learning-related stuff, I've implemented a Python application that runs those tasks on demand.
Now I need to integrate both applications. It has been decided that we need to use a single (AWS) instance to integrate both applications.
One way I found to do such an integration was using the python-shell Node module, where communication between Python and Node is done over stdin and stdout.
On the Node side I have something like this:
'use strict';
const express = require('express');
const PythonShell = require('python-shell');
var app = express();
app.listen(8000, function () {
    console.log('Example app listening on port 8000!');
});

var options = {
    mode: 'text',
    pythonPath: '../pythonapplication/env/Scripts/python.exe',
    scriptPath: '../pythonapplication/',
    pythonOptions: ['-u'], // Unbuffered
};

var pyshell = new PythonShell('start.py', options);

pyshell.on('message', function (message) {
    console.log(message);
});

app.get('/task', function (req, res) {
    pyshell.send('extract-job');
});

app.get('/terminate', function (req, res) {
    pyshell.send('terminate');
    pyshell.end(function (err, code, signal) {
        console.log(err);
        console.log(code);
        console.log(signal);
    });
});
On the Python side, I have a main script which loads some stuff and then calls a server script that runs forever, reading lines with sys.stdin.readline() and executing the corresponding task.
start.py is:
if __name__ == '__main__':
    # data = json.loads(sys.argv[1])
    from multiprocessing import Manager, Pool
    import logging
    import provider, server

    # Get logging setup objects
    debug_queue, debug_listener = provider.shared_logging(logging.DEBUG, 'python-server-debug.log')
    info_queue, info_listener = provider.shared_logging(logging.INFO, 'python-server.log')
    logger = logging.getLogger(__name__)

    # Start logger listeners
    debug_listener.start()
    info_listener.start()

    logger.info('Initializing pool of workers...')
    pool = Pool(initializer=provider.worker, initargs=[info_queue, debug_queue])

    logger.info('Initializing server...')
    try:
        server.run(pool)
    except (SystemError, KeyboardInterrupt) as e:
        logger.info('Execution terminated without errors.')
    except Exception as e:
        logger.error('Error on main process:', exc_info=True)
    finally:
        pool.close()
        pool.join()
        debug_listener.stop()
        info_listener.stop()
    print('Done.')
Both info_queue and debug_queue are multiprocessing.Queues used to handle multiprocessing logging. If I run my Python application standalone, everything works fine, even when using the pool of workers (logs get properly logged, prints get properly printed...).
But if I try to run it using python-shell, only my main process's prints and logs get printed and logged correctly... Every message (print or log) from my pool of workers is held until I terminate the Python script.
In other words, every message is held until the finally step in start.py runs...
Does anyone have any insights on this issue? Have you heard about the python-bridge module? Is it a better solution? Can you suggest a better approach for such an integration that does not use two separate servers?
Here I post my real provider script, and a quick mock I did for the server script (the real one has too much stuff):
mock server.py:
import json
import logging
import multiprocessing
import sys
import time
from json.decoder import JSONDecodeError
from threading import Thread
def task(some_args):
    logger = logging.getLogger(__name__)
    results = 'results of machine learn task goes here, as a string'
    logger.info('log whatever im doing')
    # Some machine-learn task...
    logger.info('Returning results.')
    return results

def answer_node(message):
    print(message)
    # sys.stdout.write(message)
    # sys.stdout.flush()

def run(pool, recrutai, job_pool, candidate_queue):
    logger = logging.getLogger(__name__)
    workers = []
    logger.info('Server is ready and waiting for commands')
    while True:
        # Read input stream
        command = sys.stdin.readline()
        command = command.split('\n')[0]
        logger.debug('Received command: %s', command)
        if command == 'extract-job':
            logger.info('Creating task.')
            # TODO: Check data attributes
            p = pool.apply_async(
                func=task,
                args=('args',),  # note: args must be a tuple; ('args') is just a string
                callback=answer_node
            )
            # What to do with workers array?!
            workers.append(p)
        elif command == 'other-commands':
            pass
            # Other tasks here
        elif command == 'terminate':
            raise SystemError
        else:
            logger.warning('Received an invalid command %s.', command)
my provider.py:
import logging
import os
from logging.handlers import QueueHandler, QueueListener
from multiprocessing import Queue
def shared_logging(level, file_name):
    # Create main logging file handler
    handler = logging.FileHandler(file_name)
    handler.setLevel(level)

    # Create logging format
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)

    # Create queue shared between all processes to centralize logging features
    logger_queue = Queue()  # multiprocessing.Queue

    # Create logger queue listener to send records from logger_queue to handler
    logger_listener = QueueListener(logger_queue, handler)
    return logger_queue, logger_listener

def process_logging(info_queue, debug_queue, logger_name=None):
    # Create logging queue handlers
    debug_queue_handler = QueueHandler(debug_queue)
    debug_queue_handler.setLevel(logging.DEBUG)
    info_queue_handler = QueueHandler(info_queue)
    info_queue_handler.setLevel(logging.INFO)

    # Set up the level of the process logger
    logger = logging.getLogger()
    if logger_name:
        logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # Add handlers to the logger
    logger.addHandler(debug_queue_handler)
    logger.addHandler(info_queue_handler)

def worker(info_queue, debug_queue):
    # Set up worker process logging
    process_logging(info_queue, debug_queue)
    logging.debug('Process %s initialized.', os.getpid())