Gunicorn maximum threads for 1 worker - multithreading

In terms of Gunicorn workers and threads, I need to know whether there is a maximum number of threads that can be assigned to a single worker. I have a Flask application for the UI part of a trading bot. It allows the user to create new instances of a trading bot, and each new instance is started as a new thread. A bot manager lets the user stop and start bot instances as threads and also keeps track of running threads.
The app is dockerized. Gunicorn runs with a single worker (since workers don't share memory between them, or is there a way for the bot manager to talk to other workers?). How many threads can I start on a single worker? Should I specify the number of threads for gunicorn?
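For context, a minimal sketch of the bot-manager pattern described above (the names BotManager and run_bot are hypothetical, not from the actual application):

import threading

class BotManager:
    def __init__(self):
        self._bots = {}  # bot_id -> (thread, stop_event)

    def start(self, bot_id, run_bot):
        stop = threading.Event()
        t = threading.Thread(target=run_bot, args=(stop,), daemon=True)
        self._bots[bot_id] = (t, stop)
        t.start()

    def stop(self, bot_id):
        t, stop = self._bots.pop(bot_id)
        stop.set()  # run_bot is expected to poll this event and exit cleanly
        t.join()

    def running(self):
        return [bot_id for bot_id, (t, _) in self._bots.items() if t.is_alive()]

Note that this pattern only works with a single worker process; each additional gunicorn worker would get its own private BotManager.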
Currently gunicorn is fired up with the following command:
gunicorn app:application --worker-tmp-dir /dev/shm --bind 0.0.0.0:8000 --timeout 600 --workers 1
Can I start, let's say, 8 threads on a single worker? What benefits would that bring?
gunicorn app:application --worker-tmp-dir /dev/shm --bind 0.0.0.0:8000 --timeout 600 --workers=1 --threads=8

Related

How to assign priority to queues in RabbitMQ via celery configuration?

I am trying to assign a priority level to both of my queues in RabbitMQ so that my workers will always consume and clear out all messages from Queue1 before consuming from Queue2. I use a celery configuration file, called celeryconfig.py, that looks like this:
import ssl

broker_url = "amqps://USR:PWD@URL//"
result_backend = "db+postgresql://USR:PWD@BURL?sslmode=verify-full&sslrootcert=/usr/local/share/ca-certificates/MY_CACERT.crt"
include = ["my_tasks"]
task_acks_late = True
task_default_rate_limit = "150/m"
task_time_limit = 300
worker_prefetch_multiplier = 1
worker_max_tasks_per_child = 2
timezone = "UTC"
broker_use_ssl = {
    "keyfile": "/usr/local/share/private/MY_KEY.key",
    "certfile": "/usr/local/share/ca-certificates/MY_CERT.crt",
    "ca_certs": "/usr/local/share/ca-certificates/MY_CACERT.crt",
    "cert_reqs": ssl.CERT_REQUIRED,
    "ssl_version": ssl.PROTOCOL_TLSv1_2,
}
Currently I only have 1 queue, and this is how I am starting the celery workers:
celery -A celery_app worker -l info --config celeryconfig --concurrency=16 -n "%h:celery-worker" -O fair
I have read the short doc at https://docs.celeryproject.org/en/v4.3.0/userguide/routing.html#routing-options-rabbitmq-priorities, but it only mentions setting the max priority level and does not tell me how to set priority levels for each individual queue in RabbitMQ.
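(For reference, what that doc describes is declared per queue via queue_arguments; a sketch using the Queue1/Queue2 names from above. Note that x-max-priority only sets the maximum per-message priority level inside one queue; it does not rank one queue above another.)

from kombu import Exchange, Queue

# sketch: enable per-message priorities (levels 0-10) inside each queue
task_queues = (
    Queue("Queue1", Exchange("Queue1"), routing_key="Queue1",
          queue_arguments={"x-max-priority": 10}),
    Queue("Queue2", Exchange("Queue2"), routing_key="Queue2",
          queue_arguments={"x-max-priority": 10}),
)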
RabbitMQ: 3.7.17
Celery: 4.3.0
Python: 3.6.7
OS: Ubuntu 18.04.3 LTS bionic
Can someone shed some light on this? Thank you
I am not familiar with celery at all, but other systems can run separate workers depending on queue or other filters, and each worker can have its own config for messages consumed per second, concurrency, etc.
You can create two celery configs, one with e.g. priority 10 and the other with priority 5, and run two "instances" of celery.
This will work much better; per-message priority within the same worker does not work so well.
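A minimal sketch of that approach (the queue names queue1/queue2 and the concurrency split are assumptions): run one dedicated worker per queue with celery's -Q option, giving the higher-priority queue more concurrency so it drains first:

celery -A celery_app worker -l info --config celeryconfig -Q queue1 --concurrency=12 -n "%h:worker-q1" -O fair
celery -A celery_app worker -l info --config celeryconfig -Q queue2 --concurrency=4 -n "%h:worker-q2" -O fair

Note this only approximates strict "Queue1 first" ordering; the queue2 worker will still consume while queue1 has messages.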

Cloud Run Qs :: max-instances + concurrency + threads (gunicorn thread)

(I'm learning Cloud Run and acknowledge this is not development or code related, but I'm hoping some GCP engineer can clarify this.)
I have a Python application running with gunicorn + Flask... just a PoC for now, hence the minimal configuration.
The Cloud Run deployment has the following flags:
--max-instances 1
--concurrency 5
--memory 128Mi
--platform managed
gunicorn_cfg.py has the following configuration:
workers=1
worker_class="gthread"
threads=3
I'd like to know:
1) max-instances :: if I were to adjust this, does that mean a new physical server machine is provisioned whenever needed? Or does the service achieve that by pulling the container image and simply starting a new container instance (docker run ...) on the same physical machine, effectively sharing it with other container instances?
2) concurrency :: does one running container instance receive multiple concurrent requests (e.g. 5 concurrent requests processed by 3 running container instances)? Or does each concurrent request trigger a new container instance (docker run ...)?
3) lastly, can I effectively reach concurrency > 5 by adjusting the gunicorn thread setting? For example, 5x3=15 in this case, i.e. 15 concurrent requests being served by 3 running container instances? If so, are there any pros/cons of adjusting threads vs adjusting Cloud Run concurrency?
additional info:
- It's an I/O-intensive application (not CPU-intensive): it simply grabs the HTTP request and publishes to Pub/Sub.
thanks a lot
First of all, it's not appropriate on Stack Overflow to ask "cocktail questions" where you ask 5 things at a time. Please limit yourself to 1 question at a time in the future.
You're not supposed to worry about where containers run (physical machines, VMs, ...). --max-instances limits the number of container instances that your app is allowed to scale to. This is to prevent ending up with a huge bill if someone were maliciously sending too many requests to your app.
This is documented at https://cloud.google.com/run/docs/about-concurrency. If you specify --concurrency=10, your container can be routed up to 10 in-flight requests at a time, so make sure your app can handle 10 requests at a time.
Yes, read the Gunicorn documentation. Test locally whether your settings let gunicorn handle 5 requests at the same time; Cloud Run's --concurrency setting just ensures you don't get more than 5 requests to 1 container instance at any moment.
I also recommend reading the official docs more thoroughly before asking, and perhaps also the cloud-run-faq, which pretty much answers all of these.
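On point 3, a hedged sketch (the numbers are taken from the question): with --concurrency 5 but threads=3, Cloud Run may route 5 requests to an instance that gunicorn can only serve 3 at a time, so 2 of them queue inside gunicorn. Matching the two values avoids that:

# gunicorn_cfg.py
workers = 1
worker_class = "gthread"
threads = 5  # >= Cloud Run --concurrency, so requests don't queue inside gunicorn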

Sharing single file between gunicorn workers

I'm trying to create a Flask application with gunicorn. I set the number of gunicorn workers to (multiprocessing.cpu_count() * 2) + 1, as the documentation recommends.
The problem is that my Flask application needs to write something to a single file when an HTTP request comes in. If more than one worker does this at the same time, the application throws errors.
Is it possible to define some lock between gunicorn workers?
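Since gunicorn workers are separate processes, an in-process threading.Lock won't help. A minimal sketch using an OS-level advisory file lock (fcntl is POSIX-only; the helper name append_line is hypothetical):

import fcntl

def append_line(path, line):
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other worker holds the lock
        try:
            f.write(line + "\n")
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)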

parallel requests with gunicorn on heroku

I've pushed an I/O-bound gunicorn/Flask service to Heroku. The Heroku docs advise either increasing the number of gunicorn workers or using an async worker class such as gevent. I tried the following Procfiles, but the service still handles file upload requests serially. I've added no application-level locks.
Multiple processes Procfile:
web: gunicorn service:app --log-file=- --workers 4
Multiple threads Procfile:
web: gunicorn service:app --log-file=- --threads 4 --worker-class gevent
All the service does is receive a JSON request, de-serialize it, and upload the binary to S3. The logs suggest the limiting factor is that each request is handled only after the last one has completed.
Is there something inherent to Heroku or to Flask that prevents multiple requests from being handled in parallel?
AFAIK the code is agnostic to the number of workers, but is it also agnostic to the number of threads? Or should I add some kind of support in the code?
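One way to rule out Heroku itself is to verify the concurrency setup locally; a minimal sketch (the /slow route is hypothetical) where two simultaneous requests should overlap if threading is working:

import time
from flask import Flask

app = Flask(__name__)

@app.route("/slow")
def slow():
    time.sleep(5)  # stands in for the I/O-bound S3 upload
    return "done\n"

Run it with gunicorn service:app --threads 4 and hit /slow from two terminals; the second response should not wait for the first. Note also that --threads only affects the gthread worker type; with --worker-class gevent, concurrency is controlled by --worker-connections instead, so the second Procfile above mixes two models.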

Does a Python Gearman worker accept multiple tasks?

For example:
I have a task named "URLDownload"; the task's function is to download a large file from the internet.
Now I have one Worker Process running, but about 1000 files to download.
It is easy for a Client Process to create 1000 tasks and send them to the Gearman server.
My question is: will the Worker Process do the tasks one by one, or will it accept multiple tasks at once?
If the Worker Process can accept multiple tasks, how can I limit the task pool size in the Worker Process?
Workers process one request at a time. You have a few options:
1) You can run multiple workers (this is the most common method). Workers sit in poll() when they aren't processing so this model works pretty well.
2) Write a fork() implementation around the worker. That way you can fire up a set number of worker processes from one parent and don't have to monitor multiple processes yourself.
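A minimal sketch of option 2, using multiprocessing rather than a raw fork() loop (the server address localhost:4730 and pool size are assumptions; "URLDownload" is the task name from the question):

import multiprocessing

import gearman  # the python-gearman client library

def url_download(worker, job):
    # download the file referenced by job.data here and return a result
    return "ok"

def run_worker():
    w = gearman.GearmanWorker(["localhost:4730"])
    w.register_task("URLDownload", url_download)
    w.work()  # blocks; each process handles one job at a time

if __name__ == "__main__":
    POOL_SIZE = 8  # the effective task pool size: jobs processed concurrently
    procs = [multiprocessing.Process(target=run_worker) for _ in range(POOL_SIZE)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()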
