I have a project hosted on a DigitalOcean Basic Droplet with 2 GB of RAM. On my local machine, the long-running task takes 8-10 minutes and completes successfully. On the droplet, however, Celery often fails to complete the long-running task.
Celery version: 5.2.6
I have two configurations in supervisor:
Running the Celery worker: celery -A myproject worker -l info
Running Celery beat: celery -A myproject beat -l info
This is the message from celeryd.log
CPendingDeprecationWarning:
In Celery 5.1 we introduced an optional breaking change which
on connection loss cancels all currently executed tasks with late acknowledgment enabled.
These tasks cannot be acknowledged as the connection is gone, and the tasks are automatically redelivered back to the queue.
You can enable this behavior using the worker_cancel_long_running_tasks_on_connection_loss setting.
In Celery 5.1 it is set to False by default. The setting will be set to True by default in Celery 6.0.
warnings.warn(CANCEL_TASKS_BY_DEFAULT, CPendingDeprecationWarning)
[2022-07-07 04:25:36,998: ERROR/MainProcess] consumer: Cannot connect to redis://localhost:6379//: Error 111 connecting to localhost:6379. Connection refused..
Trying again in 2.00 seconds... (1/100)
[2022-07-07 04:25:39,066: ERROR/MainProcess] consumer: Cannot connect to redis://localhost:6379//: Error 111 connecting to localhost:6379. Connection refused..
Trying again in 4.00 seconds... (2/100)
As a temporary workaround, I restart the server and re-run new tasks, but this does not guarantee that the long-running task will succeed. The other problem is that the previously failed task is not restarted.
My goals are:
Prevent long-running tasks from being canceled
If the long-running task is already canceled and cancellation can't be avoided, I need it to rerun and continue instead of starting a new task.
Is this possible? Any ideas on how?
As stated in the warning message, you can control this behavior with worker_cancel_long_running_tasks_on_connection_loss to prevent the task from being cancelled on connection loss. On your celery version it is off by default, so your tasks should not be cancelled. However, even if a late-acknowledging task completes successfully in this scenario, the task is still redelivered to the queue and will be run again -- this happens irrespective of this setting and is unavoidable for tasks with late acknowledgment.
This is why it is vital that you design your tasks to be idempotent.
If your job is not idempotent, an alternative is to have your tasks ack early (the default), but this risks dropping a task before it has actually completed.
If you must avoid dropping tasks, you must set acks_late=True on your task, and the task must be designed to be idempotent. This is necessary irrespective of the specific connection loss issue, as many other things can interrupt your tasks and produce the same scenario.
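As a minimal sketch of the two settings discussed above, assuming a standalone configuration module (the file name celeryconfig.py and loading it with app.config_from_object are assumptions, not something the question shows), they might be set like this:

```python
# celeryconfig.py -- hypothetical module name, assumed to be loaded with
# app.config_from_object("celeryconfig").

# Acknowledge a message only after the task has finished, so an interrupted
# task is redelivered instead of dropped. Tasks must then be idempotent.
task_acks_late = True

# Keep the Celery 5.x default explicit: do not cancel running late-ack tasks
# when the broker connection is lost.
worker_cancel_long_running_tasks_on_connection_loss = False
```

Late acknowledgment can also be enabled per task with acks_late=True in the task decorator instead of globally.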
I need it to rerun and continue instead of starting a new task.
This comes down to how you design your task for idempotency. For example, you might want to have your job keep track of its progress in persistent storage, so when the task fails and is run again, it can determine how best to recover.
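One way to picture "keeping track of progress in persistent storage" is a small checkpoint file. The sketch below is plain Python rather than Celery code, and every name in it (load_checkpoint, save_checkpoint, process_items, the JSON layout) is hypothetical; in a real task the checkpoint would more likely live in a database or cache keyed by a job identifier.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next unprocessed item (0 if no checkpoint)."""
    try:
        with open(path) as fh:
            return json.load(fh)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint to a temp file, then atomically replace the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump({"next_index": next_index}, fh)
    os.replace(tmp_path, path)


def process_items(items, handle, checkpoint_path):
    """Process items in order, resuming after the last recorded checkpoint.

    Returns the number of items handled in this run. If the process dies
    between handle() and save_checkpoint(), that one item is handled again
    on the next run, so handle() itself should still be idempotent.
    """
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(items)):
        handle(items[index])
        save_checkpoint(checkpoint_path, index + 1)
    return len(items) - start
```

When the task is redelivered and runs again, calling process_items with the same checkpoint path skips the items already recorded as done and continues from there.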
Related
I have a system that has important long-running tasks which are executed by Celery workers. Assume that we have deployed our application using k8s or docker-compose.
How can I change the celery workers' code in production without losing the tasks that they are currently executing?
In other words, I want an elegant, automated way to execute all unfinished tasks with the new workers.
I’m using Redis 4.3.3 as the broker and my Celery version is 5.2.7.
I have added result_backend and tried the following settings, but Celery didn't reschedule the running tasks after I ran "docker-compose restart worker_service_name".
CELERY_ACKS_LATE = True
CELERY_TASK_REJECT_ON_WORKER_LOST = True
This answer should provide some information about running on Kubernetes.
In addition, I would recommend adding (doc):
CELERYD_PREFETCH_MULTIPLIER = 1
How many messages to prefetch at a time multiplied by the number of
concurrent processes. The default is 4 (four messages for each
process). The default setting is usually a good choice, however – if
you have very long running tasks waiting in the queue and you have to
start the workers, note that the first worker to start will receive
four times the number of messages initially. Thus the tasks may not be
fairly distributed to the workers.
To disable prefetching, set worker_prefetch_multiplier to 1. Changing
that setting to 0 will allow the worker to keep consuming as many
messages as it wants.
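To make the arithmetic in the quoted documentation concrete, here is a small illustration. The helper reserved_messages is hypothetical (it is not part of Celery) and only mirrors the "multiplier times concurrent processes" rule described above; worker_prefetch_multiplier is the current lowercase name of the setting.

```python
def reserved_messages(concurrency, prefetch_multiplier=4):
    """Rough upper bound on messages a worker reserves at once.

    Returns None when the multiplier is 0, which the quoted documentation
    describes as letting the worker consume as many messages as it wants.
    """
    if prefetch_multiplier == 0:
        return None
    return concurrency * prefetch_multiplier


print(reserved_messages(4))     # 16: default multiplier of 4, four processes
print(reserved_messages(4, 1))  # 4: one message per process
```

With long-running tasks, a multiplier of 1 means a worker that is being replaced holds at most one message per process, so fewer tasks are stuck behind it.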
We have a Kubernetes cluster running some pods with Celery workers. We are using Python 3.6 to run those workers, and the Celery version is 3.1.2 (I know, really old; we are working on upgrading it). We have also set up an autoscaling mechanism to add more Celery workers on the fly.
The problem is the following. Let's say we have 5 workers at any given time. Then a lot of tasks arrive, increasing the CPU/RAM usage of the pods. That triggers an autoscaling event, adding, let's say, two more Celery worker pods. Those two new workers then take some long-running tasks. Before they finish running those tasks, Kubernetes triggers a downscaling event, killing those two workers and the long-running tasks with them.
Also, for legacy reasons, we do not have a retry mechanism if a task is not completed (and we cannot implement one right now).
So my question is: is there a way to tell Kubernetes to wait until the Celery worker has run all of its pending tasks? I suppose the solution must also include some way to notify the Celery worker so that it stops receiving new tasks. I know that Kubernetes has some scripts to handle this kind of situation, but I do not know what to write in those scripts because I do not know how to make the Celery worker stop receiving tasks.
Any idea?
I wrote a blog post exactly on that topic - check it out.
When Kubernetes decides to kill a pod, it first sends a SIGTERM signal so that your application has time to shut down gracefully. If the application has not exited after that, Kubernetes kills it by sending a SIGKILL signal.
The period between SIGTERM and SIGKILL can be tuned with terminationGracePeriodSeconds (more about it here).
In other words, if your longest task takes 5 minutes, make sure to set this value to something higher than 300 seconds.
Celery handles those signals for you, as you can see here (I guess it is relevant for your version as well):
Shutdown should be accomplished using the TERM signal.
When shutdown is initiated the worker will finish all currently
executing tasks before it actually terminates. If these tasks are
important, you should wait for it to finish before doing anything
drastic, like sending the KILL signal.
As explained in the docs, you can set the acks_late=True configuration so the task will run again if it stopped accidentally.
Another thing that I couldn't find documentation for (I am almost sure I saw it somewhere): a Celery worker won't receive new tasks after getting a SIGTERM, so you should be safe to terminate the worker (you might need to set worker_prefetch_multiplier = 1 as well).
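The behaviour described above (finish the current task, pick up nothing new after SIGTERM) can be illustrated with a toy worker loop. This is not Celery's implementation, only a sketch of the pattern using the standard library; all names (stop_requested, request_stop, install_sigterm_handler, run_worker) are hypothetical.

```python
import queue
import signal
import threading

# Set once a shutdown has been requested.
stop_requested = threading.Event()


def request_stop(signum=None, frame=None):
    """Signal handler: remember that shutdown was requested."""
    stop_requested.set()


def install_sigterm_handler():
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, request_stop)


def run_worker(jobs, handle):
    """Process jobs until the queue is empty or a stop is requested.

    The job in progress always runs to completion; the stop flag is only
    checked between jobs, so nothing new is picked up after SIGTERM.
    """
    finished = []
    while not stop_requested.is_set():
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        finished.append(handle(job))
    return finished
```

In this sketch the process still has to exit before the grace period ends, which is why terminationGracePeriodSeconds needs to be longer than the longest job.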
I am seeing about 3018 tasks failed for the job as about 4 executors died.
The Executors summary (as below in the Spark UI) shows completely different statistics. Out of 3018, about 2994 completed properly. My questions are:
Will they be re-tried again?
Is there a config to override/limit this?
After monitoring the job and manually validating the attempt counts, even for successful tasks, I realised:
Will they be re-tried again?
- Yes, even the successful tasks are retried.
Is there a config to override/limit this?
- Did not find any config to override this behaviour.
If an executor (Kubernetes pod) dies (for example from an OOM or a timeout), all of its tasks are re-executed, even those that completed successfully. One of the main reasons is that the shuffle writes from the executor are lost along with the executor itself!
Sorry, but I can't find the configuration option I need. I schedule Spark applications, and sometimes they have not succeeded after 1 hour. In that case I want to kill the job automatically (because I am sure it will never succeed, and another scheduled run may start).
I found a timeout configuration, but as I understand it, this is used to delay the start of a workflow.
So is there a kind of 'living' timeout?
Oozie cannot kill a workflow that it triggered. However, you can ensure that only a single workflow runs at a time by setting Concurrency = 1 in the Coordinator.
You can also have a second Oozie workflow monitor the status of the Spark job.
Anyway, you should investigate the root cause of the Spark job failing or being blocked.
Scenario: How do I gracefully tell a worker to stop accepting new jobs, and identify when it has finished processing its current jobs, so that I can shut workers down as new ones come online?
Details (Feel free to correct any of my assumptions):
Here is a snippet of my current queue.
As you can see I have 2 exchange queues for the workers (I believe this is the *.pidbox), 2 queues representing celeryev on each host (yes I know I only need one), and one default celery queue. Clearly I have 90+ jobs in this queue.
(Side Question) Where do you go to find the worker consuming the job from the Management console? I know I can look at djcelery and figure that out.
So.. I know there are jobs running on each host - I can't shut celery off those machines as it will kill the jobs running (and any pending?).
How do I stop any further processing of new jobs while allowing the jobs that are still running to complete? I know that I can stop Celery on each host, but that will kill any currently running jobs as well. I want to tell the 22 jobs in the hopper to halt.
Thanks!!