I have noticed that some of my Sidekiq workers appear to be running multiple processes (working multiple jobs concurrently) in a single dyno; the logs suggest this.
How many processes can run separate jobs concurrently within a single dyno without using swarm (the Enterprise feature)?
I have everything set to the defaults without swarm, so each Sidekiq worker is using 25 threads. I have no idea what all these threads are used for, however. Can anyone help me understand how this translates into concurrent workers working jobs inside a single Heroku dyno?
You are seeing a single Sidekiq process with 25 threads running jobs concurrently. Each thread will execute a job so you can have up to 25 jobs running at once.
Without swarm, you can only run one process per dyno.
You can run multiple processes in a dyno using swarm, but how many depends on the memory requirements of your app and how many cores the dyno has.
This will get you 100 worker threads: 4*25.
SIDEKIQ_COUNT=4 bundle exec sidekiqswarm -e production -c 25
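As a rough analogy only (this is Python, not Sidekiq's own code), the sketch below shows one process running jobs on a pool of 25 threads, and the arithmetic behind the 4*25 figure. The names job and total_threads are made up for illustration.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor

THREADS_PER_PROCESS = 25  # the per-process thread count discussed above


def job(job_id):
    """Simulated job: waits briefly, then reports where it ran."""
    time.sleep(0.05)  # stand-in for I/O such as a database or HTTP call
    return os.getpid(), threading.current_thread().name


def total_threads(processes, threads_per_process):
    """Upper bound on jobs running at once across all processes."""
    return processes * threads_per_process


# One process, many threads: every job reports the same PID.
with ThreadPoolExecutor(max_workers=THREADS_PER_PROCESS) as pool:
    results = list(pool.map(job, range(100)))

pids = {pid for pid, _ in results}
thread_names = {name for _, name in results}
print("distinct processes:", len(pids))
print("distinct threads used:", len(thread_names))
print("swarm example capacity:", total_threads(4, 25))
```

All jobs share one process ID, while the work is spread over at most 25 threads; with four such processes the ceiling becomes 100.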
I have a system that has important long-running tasks which are executed by Celery workers. Assume that we have deployed our application using k8s or docker-compose.
How can I change the celery workers' code in production without losing the tasks that they are currently executing?
In other words, I want an elegant, automated way to re-execute all unfinished tasks on the new workers.
I’m using Redis 4.3.3 as the broker and my Celery version is 5.2.7.
I have added result_backend and tried following settings but Celery didn't reschedule the running tasks after I ran "docker-compose restart worker_service_name".
CELERY_ACKS_LATE = True
CELERY_TASK_REJECT_ON_WORKER_LOST = True
This answer should provide some information about running on Kubernetes.
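In addition, I would recommend adding the following (CELERYD_PREFETCH_MULTIPLIER is the older name of the worker_prefetch_multiplier setting described in the docs):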
CELERYD_PREFETCH_MULTIPLIER = 1
How many messages to prefetch at a time multiplied by the number of
concurrent processes. The default is 4 (four messages for each
process). The default setting is usually a good choice, however – if
you have very long running tasks waiting in the queue and you have to
start the workers, note that the first worker to start will receive
four times the number of messages initially. Thus the tasks may not be
fairly distributed to the workers.
To disable prefetching, set worker_prefetch_multiplier to 1. Changing
that setting to 0 will allow the worker to keep consuming as many
messages as it wants.
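For reference, here is a minimal sketch of the three settings from this thread written with Celery's current lowercase setting names, as they might appear in a plain config module. The module name celeryconfig.py and the helper prefetch_limit are assumptions for illustration, and whether these settings alone make tasks survive a worker restart is not confirmed here.

```python
# celeryconfig.py (hypothetical module name, loaded by the Celery app)

# Acknowledge a message only after the task finishes, not when it starts.
task_acks_late = True

# Requeue a late-acknowledged task if the worker process running it dies.
task_reject_on_worker_lost = True

# Reserve one message per worker process instead of the default four.
worker_prefetch_multiplier = 1


def prefetch_limit(multiplier, concurrency):
    """Messages a worker reserves up front: multiplier times process count."""
    return multiplier * concurrency


# Illustration with an assumed concurrency of 8 processes.
print("default multiplier:", prefetch_limit(4, 8))
print("multiplier of 1:", prefetch_limit(worker_prefetch_multiplier, 8))
```

With long-running tasks, the smaller reservation means fewer queued tasks are held by a worker that is about to be restarted.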
Let's say I have a 4-core CPU, which basically means my computer can run 4 processes at the same time, no more, no less.
Now let's look at the cluster module of Node.js; there are:
1 master process
4 worker processes
Can I say that each worker process is assigned to its own CPU core?
And if so, where does the master process run?
The master process starts as a standard single-threaded process (or service), so at any given moment it runs on one of your CPU cores (or hardware threads, depending on the case). It is not permanently assigned to a core; the operating system's scheduler decides where each process runs.
Each worker process is then spawned (forked) as described in the docs:
https://nodejs.org/api/cluster.html
Just so you know, Node now supports the use of worker_threads to execute JavaScript in parallel on multiple threads, so it's technically not true anymore that Node is limited by single-thread execution:
https://nodejs.org/api/worker_threads.html
I am currently trying Apache Airflow on my system (Ubuntu 18), and I set it up with PostgreSQL and RabbitMQ to use the CeleryExecutor.
I run airflow webserver and airflow scheduler on separate consoles, but the scheduler only marks tasks as queued and no worker actually runs them.
I tried opening a different terminal and running airflow worker on its own, and that seemed to do the trick.
Now the scheduler puts tasks on a queue and the worker I ran manually actually executes them.
As I have read, that should not be the case. The scheduler should run the workers on its own, right? What could I do to make this work?
I have checked the logs from the consoles and I don't see any errors.
This is expected. If you look at the docs for airflow worker, it is specifically to bring up a Celery worker when you're using the CeleryExecutor, while the other executors do not require a separate process for tasks to run.
LocalExecutor: uses multiprocessing to run tasks within the scheduler.
SequentialExecutor: just runs one task at a time so that happens within the scheduler as well.
CeleryExecutor: scales out by having N workers, so having it as a separate command lets you run a worker on as many machines as you'd like.
KubernetesExecutor: I imagine it talks to your Kubernetes cluster to tell it to run tasks.
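The division of labour with the CeleryExecutor can be pictured with a toy model. This is not Airflow code: the in-memory queue stands in for RabbitMQ, the states dict for the metadata database, and the names schedule and worker are made up for illustration.

```python
import queue
import threading

broker = queue.Queue()  # stand-in for the RabbitMQ queue
states = {}             # task id -> state, stand-in for the metadata DB


def schedule(task_ids):
    """Scheduler role: mark tasks as queued and publish them; never run them."""
    for task_id in task_ids:
        states[task_id] = "queued"
        broker.put(task_id)


def worker():
    """Worker role: a separate consumer that executes whatever is queued."""
    while True:
        task_id = broker.get()
        if task_id is None:  # sentinel value: shut the worker down
            break
        states[task_id] = "success"


schedule(["extract", "transform", "load"])
states_without_worker = dict(states)  # nothing has consumed the queue yet

consumer = threading.Thread(target=worker)
consumer.start()
broker.put(None)  # ask the worker to stop once the queue is drained
consumer.join()
states_with_worker = dict(states)

print("before a worker starts:", states_without_worker)
print("after a worker ran:", states_with_worker)
```

Until something consumes the queue, every task stays queued, which matches what the question describes before airflow worker was started.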
Trying to figure out a way to start / run an external process in all workers before starting the jobs / tasks.
Specific use case - my job hits a service running on the node (localhost). The service itself is run via a docker container. I want to start the docker container before starting the tasks on a worker and then stop the container after all the jobs are done.
One approach could be to do rdd.mapPartitions, but that is at an executor level and I cannot cleanly do a stop, as another partition might be executing on the same node. Any suggestions?
As a workaround, I currently start the docker containers while starting up the cluster itself, but that does not allow me to work with multiple different containers that may be required for different jobs (in that case all containers would be running all the time, taking up node resources).
What happens if I try to run a multithreaded job in 1 SGE slot? Would it fail to start multiple threads? Or would it still start these threads and potentially overload the SGE cluster node, because it would be running more threads than there are slots?
I know I should use the -pe threaded nrThreads parameter. But I am running a program for which I am not sure how many threads it uses at each step.
It's been a while since I've used SGE, but at least back then, a job which launched more computational threads than it had been allocated would not be prevented from launching those threads, and would usually end up stealing CPU time from other jobs.
Perhaps current SGE versions are capable of using cpusets, which allow the administrator to limit the CPUs used by a job. At least the Slurm scheduler can do this.
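Since the scheduler may not enforce the limit, one option is for the job itself to cap its thread count at the number of granted slots. The sketch below assumes the job runs under SGE with the NSLOTS environment variable set, and that the program honors OMP_NUM_THREADS, which is only true for OpenMP-based code. The helper name thread_env is made up for illustration.

```python
import os


def thread_env(environ=None):
    """Return a copy of the environment with the thread count capped to the slots.

    Assumes SGE exports NSLOTS for the job; falls back to 1 slot otherwise.
    """
    env = dict(os.environ if environ is None else environ)
    slots = int(env.get("NSLOTS", "1"))
    # Only effective for programs that read OMP_NUM_THREADS (an assumption).
    env["OMP_NUM_THREADS"] = str(slots)
    return env


# Example: a job that was granted 4 slots.
print(thread_env({"NSLOTS": "4"})["OMP_NUM_THREADS"])
```

The returned mapping can be passed as the env argument of subprocess.run when launching the actual program.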