I have a question about how to send the celery log, in its default format, to both the console (stdout) and a logfile.
log default format: "[%(asctime)s: %(levelname)s/%(processName)s] %(message)s"
task log default format: "[%(asctime)s: %(levelname)s/%(processName)s][%(task_name)s(%(task_id)s)] %(message)s"
I start up the celery app using:
celery_app = Celery()
celery_app.start(argv=["celery", "worker", "-l", "info"])
This only sends the full celery log to the console (stdout).
When I do:
celery_app = Celery()
celery_app.start(argv=["celery", "worker", "-l", "info", "--logfile=./tasks.log"])
This only sends the full celery log to the logfile (tasks.log).
How can I send the full celery log to both the console and the logfile at the same time?
I tried using logging.config.dictConfig(config) to set up both a StreamHandler and a FileHandler to output to the console and the logfile. That approach doesn't let me include the task_id and task_name in the default celery log format, because task_id and task_name require the TaskFormatter class.
The standard way of doing this is to hook into either celery.signals.setup_logging or celery.signals.after_setup_task_logger. You can add an additional logging handler in your signal callback function. As an aside, even if you use dictConfig, you can still add the TaskFormatter to your logging config.
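For example, here is a minimal sketch of the signal approach, assuming the log file from the question (./tasks.log) and the default format strings quoted above; the handler and function names are illustrative:

import logging
from celery import Celery
from celery.app.log import TaskFormatter
from celery.signals import after_setup_logger, after_setup_task_logger

celery_app = Celery()

@after_setup_logger.connect
def add_worker_file_handler(logger, *args, **kwargs):
    # mirror the worker log (already going to the console) into ./tasks.log
    fh = logging.FileHandler('./tasks.log')
    fh.setFormatter(logging.Formatter(
        "[%(asctime)s: %(levelname)s/%(processName)s] %(message)s"))
    logger.addHandler(fh)

@after_setup_task_logger.connect
def add_task_file_handler(logger, *args, **kwargs):
    # mirror task logs into the same file, using TaskFormatter so that
    # %(task_name)s and %(task_id)s are filled in
    fh = logging.FileHandler('./tasks.log')
    fh.setFormatter(TaskFormatter(
        "[%(asctime)s: %(levelname)s/%(processName)s]"
        "[%(task_name)s(%(task_id)s)] %(message)s"))
    logger.addHandler(fh)

celery_app.start(argv=["celery", "worker", "-l", "info"])

With only -l info on the command line, Celery keeps its console output, and the handlers added in the callbacks give you the file copy as well.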
I'm facing logging issues with DockerOperator.
I'm running a Python script inside a docker container using DockerOperator and I need Airflow to spit out the logs from the Python script running inside the container. Airflow is marking the job as a success, but the script inside the container is failing and I have no clue what is going on, as I cannot see the logs properly. Is there a way to set up logging for DockerOperator apart from setting the tty option to True as suggested in the docs?
It looks like you can have logs pushed to XComs, but it's off by default. First, you need to pass xcom_push=True for it to at least start sending the last line of output to XCom. Then additionally, you can pass xcom_all=True to send all output to XCom, not just the last line.
Perhaps not the most convenient place to put debug information, but it's fairly accessible in the UI: either in the XCom tab when you click into a task, or on the page where you can list and filter XComs (under Browse).
Source: https://github.com/apache/airflow/blob/1.10.10/airflow/operators/docker_operator.py#L112-L117 and https://github.com/apache/airflow/blob/1.10.10/airflow/operators/docker_operator.py#L248-L250
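For reference, a minimal sketch of how those flags are passed, assuming the Airflow 1.10.x DockerOperator from the linked source; the task id, image, command, and dag object are placeholders:

from airflow.operators.docker_operator import DockerOperator

run_script = DockerOperator(
    task_id='run_script',              # placeholder
    image='__container__:latest',      # placeholder
    command='python test.py',
    xcom_push=True,   # push the last line of stdout to XCom
    xcom_all=True,    # push all output lines, not just the last one
    tty=True,
    dag=dag,
)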
Instead of DockerOperator you can use client.containers.run and then do the following:
import docker
from airflow import DAG
from airflow.decorators import task  # Airflow 2.x TaskFlow API

with DAG(dag_id='dag_1',
         default_args=default_args,
         schedule_interval=None,
         tags=['my_dags']) as dag:

    @task(task_id='task_1')
    def start_task(**kwargs):
        # get the docker params from the environment
        client = docker.from_env(version='auto')
        # run the container
        response = client.containers.run(
            # The container you wish to call
            image='__container__:latest',
            # The command to run inside the container
            command="python test.py",
            auto_remove=True,
            stdout=True,
            stderr=True,
            tty=True,
            detach=True,
            remove=True,
            ipc_mode='host',
            network_mode='bridge',
            # Passing the GPU access
            device_requests=[
                docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
            ],
            # Give the proper system volume mount point
            volumes=[
                'src:/src',
            ],
            working_dir='/src'
        )
        # stream the container's output back into the Airflow task log
        output = response.attach(stdout=True, stream=True, logs=True)
        for line in output:
            print(line.decode())
        return str(response)

    test = start_task()
Then in your test.py script (in the docker container) you have to do the logging using the standard Python logging module:
import logging
logger = logging.getLogger("airflow.task")
logger.info("Log something.")
Reference: here
I have a python script with a cli argument parser (based on argparse)
I am calling it from a batch file:
set VAR1=arg_1
set VAR2=arg_2
python script.py --arg1 %VAR1% --arg2 %VAR2%
within the script.py I call a logger:
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
This script utilizes chromedriver, selenium and requests to automate some clicking and moving between web pages.
When running from within PyCharm (configured so that the script has arg_1 and arg_2 passed to it) everything is great - I get log messages from my logger only.
When I run the batch file - I get a bunch of logging messages from chromedriver or requests (I think).
I have tried:
@echo off at the start of the batch file.
Setting the level on the root logger.
Getting the logging logger dictionary and setting each logger to WARNING - based on this question.
None of these work and I keep getting logging messages from submodules - ONLY when run from a batch file.
Anybody know how to fix this?
You can use the following configuration options to do this:
import logging.config
logging.config.dictConfig({
'version': 1,
'disable_existing_loggers': True,
})
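As a slightly fuller sketch (the formatter, handler, and logger names here are illustrative, not required), you can disable the pre-existing third-party loggers while explicitly keeping your own logger enabled at DEBUG. Note that disable_existing_loggers only affects loggers that already exist when dictConfig runs, so call it after importing the noisy modules:

import logging.config

logging.config.dictConfig({
    'version': 1,
    # mute loggers created by modules imported before this call
    'disable_existing_loggers': True,
    'formatters': {
        'default': {'format': '%(asctime)s %(name)s %(levelname)s %(message)s'},
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'formatter': 'default'},
    },
    'root': {'level': 'WARNING', 'handlers': ['console']},
    'loggers': {
        # script.py run directly is named '__main__'; listing it here
        # keeps it enabled and at DEBUG
        '__main__': {'level': 'DEBUG'},
    },
})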
Can log4j be configured to run a script after RollingFile has finished rolling over, so that I can send out an email that the rollover occurred, grep the rolled log to see if a text pattern occurred, etc.?
Thank you.
I have a Splunk forwarder managing logs in my production servers, so I really just need to get the output of my node app into a file that Splunk is watching. What is the downside of simply doing the following in production:
node server.js &> output.log
As opposed to handling the log output inside the node process with some sort of logging module...
Check out supervisord, a logging and babysitting tool that becomes the parent of processes such as a node server and can redirect both standard out and standard error to files of your choosing. Besides that, it will sniff for abends (abnormal ends) and relaunch the child process when needed.
Here is a typical config file: /etc/supervisor/conf.d/supervisord.conf
[supervisord]
nodaemon=true
logfile=GKE_MASTER_LOGDIR/supervisord_nodejs_GKE_FLAVOR_USER.log
pidfile=GKE_MASTER_LOGDIR/supervisord_nodejs_GKE_FLAVOR_USER.pid
stdout_logfile_maxbytes = 1MB
stderr_logfile_maxbytes = 1MB
logfile_backups = 50
# loglevel = debug
[program:nodejs]
command=/tmp/boot_nodejs.sh %(ENV_MONGO_SERVICE_HOST)s %(ENV_MONGO_SERVICE_PORT)s
stdout_logfile = GKE_MASTER_LOGDIR/nodejs_GKE_FLAVOR_USER_stdout.log
stderr_logfile = GKE_MASTER_LOGDIR/nodejs_GKE_FLAVOR_USER_stderr.log
stdout_logfile_maxbytes = 1MB
stderr_logfile_maxbytes = 1MB
logfile_backups = 50
autostart = True
autorestart = True
# user = GKE_NON_ROOT_USER
In my case this all happens inside a Docker container, so here is a snippet of my Dockerfile which launches supervisord, which in turn launches nodejs and in doing so redirects stdout/stderr to log files that supervisord rotates based on space and/or time. The use of Docker is orthogonal to using supervisord, so YMMV.
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf" ]
For completeness, below is the boot_nodejs.sh referenced above:
#!/bin/bash
given_mongo_service_host=$1
given_mongo_service_port=$2
current_dir=$(dirname "${BASH_SOURCE}")
current_timestamp="timestamp "$(date '+%Y%m%d_%H%M%S_%Z')
echo
echo "______________ fresh nodejs server bounce ______________ $current_timestamp"
echo
# ............... now output same to standard error so its log gets the hat tip
(>&2 echo )
(>&2 echo "______________ fresh nodejs server bounce ______________ $current_timestamp" )
(>&2 echo )
# ................
export MONGO_URL=mongodb://$given_mongo_service_host:$given_mongo_service_port
type node
node main.js
There's no problem with redirecting your output to a log file. In a lot of ways, this is preferable.
Having your application write logs directly is more useful when your application is complicated and needs a lot of log configuration, possibly writing to several log files. What I do is use Winston for logging. Normally the only log transport enabled is the console, and I can redirect that to a file if I want. But, I also have in my app config a way to specify other transports and config. I use that for writing directly to Logstash and such.
I use PM2 to execute my Node.js app.
In order to do that, I have defined the following ecosystem config file:
apps:
  - script: app.js
    name: "myApp"
    exec_mode: cluster
    cwd: "/etc/myService/myApp"
Everything is working. Now I want to specify a custom location for PM2's logs, so I added this to the ecosystem config file:
log: "/etc/myService/myApp/logs/myApp.log"
It works, but I noticed that after running pm2 start ecosystem, PM2 writes the logs to both locations at the same time:
/etc/myService/myApp/logs/myApp.log (as expected)
/home/%$user%/.pm2/logs/ (default logs destination)
How can I specify a single location for PM2's logs and avoid the duplicated log output?
Based on robertklep's comment, to solve the issue we have to use the out_file and err_file fields for the output and error log paths respectively.
Syntax sample in YAML format:
out_file: "/etc/myService/myApp/logs/myApp_L.log"
err_file: "/etc/myService/myApp/logs/myApp_E.log"
P.S. The field log can be removed from the config file:
log: "/etc/myService/myApp/logs/myApp.log"