Tasks for Fargate service terminating within seconds - python-3.x

I have an ECS cluster on AWS with four services running under it. One of the services is of the replica type and uses the Fargate launch type, with a load balancer attached. The OS is Linux 1.4 and the number of running tasks is 2, without any auto scaling. The Docker image that runs on it is a gunicorn application that serves an API built on Falcon, and it is started with the command below.
["gunicorn","-b","0.0.0.0:80","src.app:run()","-k","gevent","--workers=5"]
For some reason the tasks are being stopped every few seconds. The logs show Exit Code 1, and some errors are also logged in CloudWatch.
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>
raise HaltServer(reason, self.WORKER_BOOT_ERROR)
[10] [ERROR] Exception in worker process
This service has been running for the past year and never had this error, and then it suddenly stopped working. No new code was deployed and no development was done, so it is very unusual to get these errors.
The service is configured to start two tasks, so it starts two, they stop within 2 seconds, and another two start once the previous ones have stopped. This cycle continues.
I have tried redeploying the existing code base but still get the same error. I have also updated the service with new task definitions, but that did not fix it either.
Some additional errors from CloudWatch, though they do not help much:
[INFO] Starting gunicorn 19.9.0
[INFO] Listening at: http://0.0.0.0:80 (1)
[INFO] Using worker: gevent
/usr/local/lib/python3.10/os.py:1029: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, mode, buffering, encoding, *args, **kwargs)
[7] [INFO] Booting worker with pid: 7
[8] [INFO] Booting worker with pid: 8
[9] [INFO] Booting worker with pid: 9
[10] [INFO] Booting worker with pid: 10
[11] [INFO] Booting worker with pid: 11
[7] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/ggevent.py", line 203, in init_process
super(GeventWorker, self).init_process()
I have tried running the same Docker image locally and I get almost the same error. Now at least the issue is narrowed down to the code itself, but I still do not understand why it ran for years and only failed now. The detailed error is below.
[2022-09-04 08:16:27 +0000] [1] [INFO] Starting gunicorn 19.9.0
[2022-09-04 08:16:27 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2022-09-04 08:16:27 +0000] [1] [INFO] Using worker: gevent
/usr/local/lib/python3.10/os.py:1029: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, mode, buffering, encoding, *args, **kwargs)
[2022-09-04 08:16:27 +0000] [7] [INFO] Booting worker with pid: 7
[2022-09-04 08:16:27 +0000] [8] [INFO] Booting worker with pid: 8
[2022-09-04 08:16:27 +0000] [9] [INFO] Booting worker with pid: 9
[2022-09-04 08:16:27 +0000] [7] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/ggevent.py", line 203, in init_process
super(GeventWorker, self).init_process()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/base.py", line 129, in init_process
self.load_wsgi()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
self.wsgi = self.app.wsgi()
File "/usr/local/lib/python3.10/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/usr/local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
return self.load_wsgiapp()
File "/usr/local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
return util.import_app(self.app_uri)
File "/usr/local/lib/python3.10/site-packages/gunicorn/util.py", line 350, in import_app
__import__(module)
File "/src/app.py", line 1, in <module>
import falcon
File "/usr/local/lib/python3.10/site-packages/falcon/__init__.py", line 30, in <module>
from falcon.api import API # NOQA
File "/usr/local/lib/python3.10/site-packages/falcon/api.py", line 21, in <module>
from falcon import api_helpers as helpers, DEFAULT_MEDIA_TYPE, routing
File "/usr/local/lib/python3.10/site-packages/falcon/api_helpers.py", line 21, in <module>
from falcon import util
File "/usr/local/lib/python3.10/site-packages/falcon/util/__init__.py", line 29, in <module>
from falcon.util import structures
File "/usr/local/lib/python3.10/site-packages/falcon/util/structures.py", line 35, in <module>
class CaseInsensitiveDict(collections.MutableMapping): # pragma: no cover
AttributeError: module 'collections' has no attribute 'MutableMapping'
[2022-09-04 08:16:27 +0000] [7] [INFO] Worker exiting (pid: 7)
[2022-09-04 08:16:27 +0000] [10] [INFO] Booting worker with pid: 10
[2022-09-04 08:16:27 +0000] [8] [ERROR] Exception in worker process
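The last frame of the traceback is the informative one: the installed falcon release subclasses collections.MutableMapping, and Python 3.10 removed the abstract base class aliases from the top-level collections module, leaving them only in collections.abc. One plausible reading (an assumption, since the Dockerfile is not shown) is that an unpinned base image moved to Python 3.10 when the image was rebuilt, while the installed falcon version stayed the same. Below is a minimal sketch of the portable spelling, using a hypothetical stand-in class rather than falcon's actual code:

```python
import collections
import collections.abc


class CaseInsensitiveDict(collections.abc.MutableMapping):
    """Hypothetical stand-in for the class named in the traceback.

    Subclassing collections.abc.MutableMapping works on every Python 3
    release; collections.MutableMapping is gone from Python 3.10 onwards.
    """

    def __init__(self):
        self._store = {}

    def __setitem__(self, key, value):
        # Index by the lowercased key, but remember the original spelling.
        self._store[key.lower()] = (key, value)

    def __getitem__(self, key):
        return self._store[key.lower()][1]

    def __delitem__(self, key):
        del self._store[key.lower()]

    def __iter__(self):
        return (original_key for original_key, _ in self._store.values())

    def __len__(self):
        return len(self._store)


# False on Python 3.10 and later, where the legacy alias no longer exists.
legacy_alias_available = hasattr(collections, "MutableMapping")

headers = CaseInsensitiveDict()
headers["Content-Type"] = "application/json"
print(headers["content-type"])
print("legacy alias available:", legacy_alias_available)
```

If that reading holds, the two directions worth checking are pinning the base image to the Python version the service was originally built against, or moving to a falcon release that supports Python 3.10.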

Related

Open a connexion-based REST service with gunicorn

I have a Flask service built with connexion. The service is initialized in a function create_app() that is defined in the script src/group/application/my_service/api/app.py :
# app.py
def create_app():
    arguments = {"url": "0.0.0.0"}
    app = connexion.App(__name__, options={"swagger_ui": True})
    app.add_api("openapi-spec.yml", arguments=arguments, strict_validation=True)
    app.run(port=8080, debug=True)
In src/group/application/my_service/__main__.py, I import create_app and execute it:
# __main__.py
from group.application.my_service.api.app import create_app
create_app()
With this in place, I can successfully start the service with python:
python -m src.group.application.my_service
I would now like to use gunicorn instead. I am trying the following command
gunicorn -w 1 -b 0.0.0.0:8080 'src.group.application.my_service.api.app:create_app()'
but I am getting the following error message:
[2021-05-19 11:55:32 +0200] [13275] [INFO] Starting gunicorn 20.1.0
[2021-05-19 11:55:32 +0200] [13275] [INFO] Listening at: http://0.0.0.0:8080 (13275)
[2021-05-19 11:55:32 +0200] [13275] [INFO] Using worker: sync
[2021-05-19 11:55:32 +0200] [13276] [INFO] Booting worker with pid: 13276
* Serving Flask app "src.group.application.my_service.api.app" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
2021-05-19 11:55:33,672 [CRITICAL] Traceback (most recent call last):
File "/Users/user/repo_name/src/group/application/my_service/api/app.py", line 39, in create_app
app.run(port=8080, debug=True)
File "/Users/user/venvs/venv/lib/python3.9/site-packages/connexion/apps/flask_app.py", line 96, in run
self.app.run(self.host, port=self.port, debug=self.debug, **options)
File "/Users/user/venvs/venv/lib/python3.9/site-packages/flask/app.py", line 990, in run
run_simple(host, port, self, **options)
File "/Users/user/venvs/venv/lib/python3.9/site-packages/werkzeug/serving.py", line 1030, in run_simple
s.bind(server_address)
OSError: [Errno 48] Address already in use
Failed to find application object: 'create_app()'
[2021-05-19 11:55:33 +0200] [13276] [INFO] Worker exiting (pid: 13276)
[2021-05-19 11:55:33 +0200] [13275] [INFO] Shutting down: Master
[2021-05-19 11:55:33 +0200] [13275] [INFO] Reason: App failed to load.
How can I successfully start the service with gunicorn, without the warning message that I am running a development server (which is the reason I want to use gunicorn in the first place)?
It turns out that the function create_app() should return app instead of calling app.run().
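The log explains why: gunicorn has already bound 0.0.0.0:8080 by the time it evaluates create_app(), so app.run() tries to bind the same port again with the development server and fails with "Address already in use", and gunicorn is left without an application object. A minimal sketch of the factory pattern follows. A plain WSGI callable stands in for the connexion app (an assumption, so that the snippet runs without third-party packages), and call_once is a hypothetical helper for exercising it:

```python
from wsgiref.util import setup_testing_defaults


def create_app():
    """Application factory: build the app and return it without serving it.

    A plain WSGI callable stands in for the connexion app here (an
    assumption made so the sketch needs no third-party packages).
    """

    def app(environ, start_response):
        body = b"service is up\n"
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    # No app.run() here: gunicorn owns the socket and the serving loop.
    return app


def call_once(wsgi_app):
    """Invoke a WSGI app with a synthetic request; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_once(create_app()))
```

With a factory shaped like this, the gunicorn command from the question can keep its 'module:create_app()' form, because the call now yields something gunicorn can serve.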

Deploy Python-Flask api in Azure

I have deployed a Python Flask API in Azure. It works fine in the development environment. It has the following dependencies, which are listed in a .txt file.
click==6.7
Flask==1.0.2
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
Werkzeug==0.14.1
jsonpickle==1.0
pyodbc==4.0.25
I have an app.py module with some functions that perform DB CRUD operations. There is also a db.py that contains the code below:
import pyodbc
cnxn = pyodbc.connect(cs)
But when I navigate to https://kmsazapi.azurewebsites.net/ it gives the error below:
:( Application Error. If you are the application administrator, you can access the diagnostic resources.
Please find the application logs from Azure:
2019-01-19T16:30:46.743756546Z
2019-01-19T16:30:46.893500456Z Starting OpenBSD Secure Shell server: sshd.
2019-01-19T16:30:46.921319668Z Running python /usr/local/bin/entrypoint.py
2019-01-19T16:30:47.042444539Z executing:
2019-01-19T16:30:47.042628845Z python --version
2019-01-19T16:30:47.060630336Z Python 3.7.1
2019-01-19T16:30:47.060830442Z executing:
2019-01-19T16:30:47.060993448Z pip --version
2019-01-19T16:30:49.209547693Z pip 10.0.1 from /home/site/wwwroot/antenv/lib/python3.7/site-packages/pip (python 3.7)
2019-01-19T16:30:49.214266747Z found flask app
2019-01-19T16:30:49.219978635Z executing:
2019-01-19T16:30:49.219990835Z . antenv/bin/activate
2019-01-19T16:30:49.224706090Z
2019-01-19T16:30:49.224798193Z executing:
2019-01-19T16:30:49.224971698Z GUNICORN_CMD_ARGS="--bind=0.0.0.0 --timeout 600" gunicorn application:app
2019-01-19T16:30:50.183264018Z [2019-01-19 16:30:50 +0000] [36] [INFO] Starting gunicorn 19.9.0
2019-01-19T16:30:50.183984042Z [2019-01-19 16:30:50 +0000] [36] [INFO] Listening at: http://0.0.0.0:8000 (36)
2019-01-19T16:30:50.184216749Z [2019-01-19 16:30:50 +0000] [36] [INFO] Using worker: sync
2019-01-19T16:30:50.194083973Z [2019-01-19 16:30:50 +0000] [39] [INFO] Booting worker with pid: 39
2019-01-19T16:30:50.967282324Z [2019-01-19 16:30:50 +0000] [39] [ERROR] Exception in worker process
2019-01-19T16:30:50.967302024Z Traceback (most recent call last):
2019-01-19T16:30:50.967306124Z File "/usr/local/lib/python3.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
2019-01-19T16:30:50.967311525Z worker.init_process()
2019-01-19T16:30:50.967325625Z File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base.py", line 129, in init_process
2019-01-19T16:30:50.967329625Z self.load_wsgi()
2019-01-19T16:30:50.967332825Z File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
2019-01-19T16:30:50.967336425Z self.wsgi = self.app.wsgi()
2019-01-19T16:30:50.967347026Z File "/usr/local/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
2019-01-19T16:30:50.967350926Z self.callable = self.load()
2019-01-19T16:30:50.967354226Z File "/usr/local/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
2019-01-19T16:30:50.967357626Z return self.load_wsgiapp()
2019-01-19T16:30:50.967361026Z File "/usr/local/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
2019-01-19T16:30:50.967364426Z return util.import_app(self.app_uri)
2019-01-19T16:30:50.967367726Z File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 350, in import_app
2019-01-19T16:30:50.967371427Z __import__(module)
2019-01-19T16:30:50.967374727Z File "/home/site/wwwroot/application.py", line 7, in <module>
2019-01-19T16:30:50.967378427Z import db
2019-01-19T16:30:50.967381627Z File "/home/site/wwwroot/db.py", line 1, in <module>
2019-01-19T16:30:50.967385027Z import pyodbc
2019-01-19T16:30:50.967388327Z ImportError: libodbc.so.2: cannot open shared object file: No such file or directory
2019-01-19T16:30:50.967653236Z [2019-01-19 16:30:50 +0000] [39] [INFO] Worker exiting (pid: 39)
2019-01-19T16:30:51.050986468Z [2019-01-19 16:30:51 +0000] [36] [INFO] Shutting down: Master
2019-01-19T16:30:51.051229076Z [2019-01-19 16:30:51 +0000] [36] [INFO] Reason: Worker failed to boot.
2019-01-19T16:30:51.102156846Z
What am I missing?
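The ImportError at the end of the log says that the pyodbc extension module itself was found, but the native unixODBC library it links against (libodbc.so.2) could not be loaded in the container. A small diagnostic sketch that could be run inside the same environment to confirm whether the shared library is visible is shown here. The helper name is hypothetical, and ctypes.util.find_library may return None on minimal images even for libraries that exist, so treat the result as a hint only:

```python
import ctypes.util


def find_shared_library(name):
    """Return the file name resolved for the shared library `name`, or None.

    For example, "odbc" corresponds to libodbc on Linux.
    """
    return ctypes.util.find_library(name)


if __name__ == "__main__":
    for lib in ("odbc", "c"):
        print(lib, "->", find_shared_library(lib) or "not found")
```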
Update: 0115:
If you deploy the Python app to a Web App for Windows, you can install the Python extension as follows: go to the Azure portal -> your app service -> Extensions -> Add -> choose the extension.
How do you deploy your Flask app?
You can refer to the official doc for the deployment. I followed the doc, and the app works well in Azure at the site https://xxx.azurewebsites.net/home .
My code:
from flask import Flask
app = Flask(__name__)
@app.route("/home")
def home():
    return "Hello World a nice day!"
After deploying to Azure, the site works well.

Error: Bad Gateway 502 when opening Google App Engine Python Domain

When I'm visiting my website (https://osm-messaging-platform.appspot.com), I get this error on the main webpage:
502 Bad Gateway. nginx/1.14.0 (Ubuntu).
It's really weird, since when I run it locally
python app.py
I get no errors, and my app and the website load fine.
I've already tried looking it up, but most of the answers I've found on stack overflow either have no errors or don't relate to me. Here is the error when I look at my GCloud logs:
2019-02-07 02:07:05 default[20190206t175104] Traceback (most recent call last):
File "/env/lib/python3.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/env/lib/python3.7/site-packages/gunicorn/workers/gthread.py", line 104, in init_process
super(ThreadWorker, self).init_process()
File "/env/lib/python3.7/site-packages/gunicorn/workers/base.py", line 129, in init_process
self.load_wsgi()
File "/env/lib/python3.7/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
self.wsgi = self.app.wsgi()
File "/env/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/env/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
return self.load_wsgiapp()
File "/env/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
return util.import_app(self.app_uri)
File "/env/lib/python3.7/site-packages/gunicorn/util.py", line 350, in import_app
__import__(module)
ModuleNotFoundError: No module named 'main'
2019-02-07 02:07:05 default[20190206t175104] [2019-02-07 02:07:05 +0000] [25] [INFO] Worker exiting (pid: 25)
2019-02-07 02:07:05 default[20190206t175104] [2019-02-07 02:07:05 +0000] [8] [INFO] Shutting down: Master
2019-02-07 02:07:05 default[20190206t175104] [2019-02-07 02:07:05 +0000] [8] [INFO] Reason: Worker failed to boot.
And here are the contents of my app.yaml file:
runtime: python37
handlers:
  # This configures Google App Engine to serve the files in the app's static
  # directory.
- url: /static
  static_dir: static
- url: /.*
  script: auto
I expected it to show my website, but it didn't. Can anyone help?
The error occurs because the App Engine Standard Python37 runtime handles requests in the main.py file by default. I guess that you don't have this file and are handling requests in the app.py file.
The traceback in the logs points to this as well: ModuleNotFoundError: No module named 'main'
Rename the app.py file to main.py and try again.
As a general rule, it is recommended to follow the file structure shown in the App Engine Standard documentation:
your-app/
    app.yaml
    main.py
    requirements.txt
    static/
        script.js
        style.css
    templates/
        index.html
I believe this would be overkill for your situation, but if you need a custom entrypoint, read the Python 3 runtime documentation to learn how to configure it.
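To see why a missing main.py surfaces as ModuleNotFoundError: gunicorn is handed a string of the form module:callable and imports the module part, and the traceback shows that the module being imported here is main. The following is a simplified illustration of that resolution step. It is not gunicorn's actual loader, and resolve_app is a hypothetical name:

```python
import importlib


def resolve_app(app_uri):
    """Resolve a "module:attribute" string to the object it names.

    Simplified illustration only; gunicorn's real loader handles more
    cases (factory calls, error reporting, and so on).
    """
    module_name, _, attribute = app_uri.partition(":")
    # Raises ModuleNotFoundError when the module cannot be imported.
    module = importlib.import_module(module_name)
    return getattr(module, attribute or "application")


if __name__ == "__main__":
    # A WSGI callable that ships with the standard library.
    print(resolve_app("wsgiref.simple_server:demo_app"))
    try:
        resolve_app("module_that_does_not_exist:app")
    except ModuleNotFoundError as exc:
        print(exc)
```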
My mistake was naming the main app "main", which conflicted with main.py. It worked fine locally because it did not use main.py. I changed it to root and everything worked fine. It took me a whole day to figure it out.
I resolved the issue in main.py by changing the host from:
app.run(host="127.0.0.1", port=8080, debug=True)
to
app.run(host="0.0.0.0", port=8080, debug=True)

TypeError: must be str, not NoneType when running distributed locust with taurus

I am trying to create a configuration for a distributed locust run. I have a .py script with defined tasks, and a simple Taurus configuration just to get it working:
execution:
  executor: locust
  master: true
  slaves: 1
  scenario: tns
  concurrency: 10
  ramp-up: 10s
  iterations: 100
  hold-for: 10s
scenarios:
  tns:
    script: /usr/src/app/scenarios/locust_scenarios/sample.py
reporting:
- module: final-stats
  dump-csv: test_result.csv
- module: console
- module: passfail
  criteria:
  - avg-rt>250ms for 30s, continue as failed
  - failures>5% for 5s, continue as failed
  - failures>50% for 10s, stop as failed
Then I start the locust slave node:
python -m locust.main -f scenarios/locust_scenarios/sample.py --slave --master-host=localhost
and execute the test. Here is the log:
$ bzt -o modules.console.screen=gui locust_tests_execution_config.yaml
12:38:54 INFO: Taurus CLI Tool v1.12.0
12:38:54 INFO: Starting with configs: ['locust_tests_execution_config.yaml']
12:38:54 INFO: Configuring...
12:38:54 INFO: Artifacts dir: /Users/usr/Projects/load/2018-06-20_12-38-54.391229
12:38:54 WARNING: at path 'execution': 'execution' should be a list
12:38:54 INFO: Preparing...
12:38:54 WARNING: Module 'console' can be only used once, will merge all new instances into single
12:38:54 INFO: Starting...
12:38:54 INFO: Waiting for results...
12:38:55 WARNING: Please wait for graceful shutdown...
12:38:55 INFO: Shutting down...
12:38:56 INFO: Terminating process PID 54419 with signal Signals.SIGTERM (59 tries left)
12:38:57 INFO: Terminating process PID 54419 with signal Signals.SIGTERM (58 tries left)
12:38:57 ERROR: TypeError: must be str, not NoneType
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/cli.py", line 250, in perform
self.engine.run()
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/engine.py", line 222, in run
reraise(exc_info)
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/six/py3.py", line 84, in reraise
raise exc
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/engine.py", line 204, in run
self._wait()
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/engine.py", line 243, in _wait
while not self._check_modules_list():
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/engine.py", line 230, in _check_modules_list
finished = bool(module.check())
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/modules/aggregator.py", line 635, in check
for point in self.datapoints():
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/modules/aggregator.py", line 401, in datapoints
for datapoint in self._calculate_datapoints(final_pass):
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/modules/aggregator.py", line 664, in _calculate_datapoints
self._process_underlings(final_pass)
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/modules/aggregator.py", line 649, in _process_underlings
for data in underling.datapoints(final_pass):
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/modules/aggregator.py", line 401, in datapoints
for datapoint in self._calculate_datapoints(final_pass):
File "/Users/usr/.virtualenvs/stfw/lib/python3.6/site-packages/bzt/modules/locustio.py", line 221, in _calculate_datapoints
self.read_buffer += self.file.get_bytes(size=1024 * 1024, last_pass=final_pass)
12:38:57 INFO: Post-processing...
12:38:57 INFO: Test duration: 0:00:03
12:38:57 INFO: Test duration: 0:00:03
12:38:57 INFO: Artifacts dir: /Users/usr/Projects/load/2018-06-20_12-38-54.391229
12:38:57 WARNING: Done performing with code: 1
The locust log shows that the locust slave was connected and ready to swarm.
What should I do to get it running?
Thanks
It seems that there is a defect in the bzt library, based on this thread:
https://groups.google.com/forum/#!searchin/codename-taurus/locust%7Csort:date/codename-taurus/woBeH1JeBFo/pHhoGUSoAwAJ
There will be a fix in a new release:
https://github.com/Blazemeter/taurus/pull/871
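The traceback ends at self.read_buffer += self.file.get_bytes(...), which suggests that get_bytes returned None before any results were available, and adding None to a string raises the TypeError. As an illustration of that failure mode only (this is not the actual change in the linked pull request, and append_chunk is a hypothetical name):

```python
def append_chunk(buffer, chunk):
    """Append chunk to buffer, treating None (nothing read yet) as empty."""
    if chunk is None:
        return buffer
    return buffer + chunk


if __name__ == "__main__":
    read_buffer = ""
    try:
        read_buffer += None  # the pattern from the traceback
    except TypeError as exc:
        print("unguarded:", exc)
    read_buffer = append_chunk(read_buffer, None)
    read_buffer = append_chunk(read_buffer, "line 1\n")
    print(repr(read_buffer))
```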

gunicorn ERROR (abnormal termination)

I'm running a Fabric script that, amongst other things, is supposed to restart gunicorn on an Ubuntu server. The command is below:
supervisorctl status projectname:gunicorn | sed "s/.*[pid ]\([0-9]\+\)\,.*/\1/" | xargs kill -HUP
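The pipeline above pulls the PID out of the supervisorctl status line with sed and passes it to kill -HUP. When the process is not running there is no PID in that line, so sed passes the text through unchanged and kill receives something that is not a PID. Here is a minimal Python sketch of the same extraction that makes the "no PID" case explicit. The function name is hypothetical and the sample status lines are assumptions about the output format, not taken from the question:

```python
import re

# Matches the "pid 1234," fragment printed for a running process.
PID_PATTERN = re.compile(r"\bpid (\d+),")


def extract_pid(status_line):
    """Return the PID from one `supervisorctl status` line, or None if absent."""
    match = PID_PATTERN.search(status_line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Illustrative sample lines (assumed format).
    running = "projectname:gunicorn   RUNNING   pid 12260, uptime 0:05:12"
    stopped = "projectname:gunicorn   FATAL     Exited too quickly (process log may have details)"
    print(extract_pid(running))
    print(extract_pid(stopped))
```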
The problem is that gunicorn doesn't appear to be running in the first place, so the process cannot be killed. I've ssh'd into the Amazon EC2 instance and ran
sudo supervisorctl restart projectname:gunicorn
and I get an error response that says:
projectname:gunicorn: ERROR (not running)
projectname:gunicorn ERROR (abnormal termination)
so I attempted to start gunicorn by running
sudo supervisorctl start projectname:gunicorn
and the error says
'projectname:gunicorn: Error (abnormal termination)'
So I need gunicorn to run, and I'm having trouble achieving this.
I've also checked the gunicorn log; the relevant output is below:
2014-01-17 14:58:14 [12260] [INFO] Starting gunicorn 0.14.3
2014-01-17 14:58:14 [12260] [INFO] Listening at: http://127.0.0.1:9000 (12260)
2014-01-17 14:58:14 [12260] [INFO] Using worker: sync
2014-01-17 14:58:14 [12263] [INFO] Booting worker with pid: 12263
2014-01-17 14:58:14 [12264] [INFO] Booting worker with pid: 12264
2014-01-17 14:58:14 [12265] [INFO] Booting worker with pid: 12265
2014-01-17 14:58:14 [12266] [INFO] Booting worker with pid: 12266
2014-01-17 14:58:14 [12263] [INFO] Worker exiting (pid: 12263)
2014-01-17 14:58:14 [12264] [INFO] Worker exiting (pid: 12264)
2014-01-17 14:58:14 [12265] [INFO] Worker exiting (pid: 12265)
2014-01-17 14:58:14 [12266] [INFO] Worker exiting (pid: 12266)
Traceback (most recent call last):
File "/opt/screening/env/bin/gunicorn_django", line 9, in <module>
load_entry_point('gunicorn==0.14.3', 'console_scripts', 'gunicorn_django')()
File "/opt/compliance_engine/env/local/lib/python2.7/site-packages/gunicorn/app/djangoapp.py", line 129, in run
DjangoApplication("%prog [OPTIONS] [SETTINGS_PATH]").run()
File "/opt/compliance_engine/env/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 129, in run
Arbiter(self).run()
File "/opt/compliance_engine/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 184, in run
self.halt(reason=inst.reason, exit_status=inst.exit_status)
File "/opt/compliance_engine/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 279, in halt
self.stop()
File "/opt/compliance_engine/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 327, in stop
self.reap_workers()
File "/opt/compliance_engine/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 413, in reap_workers
raise HaltServer(reason, self.WORKER_BOOT_ERROR)
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>
also, here is the conf file
[program:gunicorn]
command=/opt/screening/env/bin/gunicorn_django --pythonpath . ce.settings -w 4 --bind 127.0.0.1:9000
directory=/opt/screening/repository
user=www-data
autostart=true
autorestart=true
stdout_logfile=/opt/screening/logs/gunicorn.log
redirect_stderr=true
[program:celeryd]
command=/opt/screening/env/bin/python manage.py celeryd --autoscale=16,2 -E -l INFO --pidfile=/opt/screening/tmp/pids/celeryd.pid
directory=/opt/screening/repository
user=www-data
autostart=true
autorestart=true
stdout_logfile=/opt/screening/logs/celeryd.log
redirect_stderr=true
[program:celerybeat]
command=/opt/screening/env/bin/python manage.py celerybeat -l INFO --schedule=/opt/screening/tmp/celerybeat-schedule --pidfile=/opt/screening/tmp/pids/celerybeat.pid
directory=/opt/screening/repository
user=www-data
autostart=true
autorestart=true
stdout_logfile=/opt/screening/logs/celerybeat.log
redirect_stderr=true
[program:celerycam]
command=/opt/screening/env/bin/python manage.py celerycam --pidfile=/opt/screening/tmp/pids/celerycam.pid
directory=/opt/screening/repository
user=www-data
autostart=true
autorestart=true
stdout_logfile=/opt/screening/logs/celerycam.log
redirect_stderr=true
[group:screening]
programs=gunicorn,celeryd,celerybeat,celerycam
Any ideas? I understand that this is a lot of text; any hints or pointers would be much appreciated.
Thanks for reading,
edit:
I ran gunicorn on its own: I activated the virtualenv and ran
python manage.py run_gunicorn
the terminal printed the below output
2014-01-19 22:02:35 [14735] [INFO] Starting gunicorn 0.14.3
2014-01-19 22:02:35 [14735] [INFO] Listening at: http://127.0.0.1:8000 (14735)
2014-01-19 22:02:35 [14735] [INFO] Using worker: sync
2014-01-19 22:02:35 [14742] [INFO] Booting worker with pid: 14742
I also ran runserver in the virtualenv:
python manage.py runserver 7000
Validating models...
0 errors found
Django version 1.3, using settings 'ce.settings'
Development server is running at http://127.0.0.1:7000/
Quit the server with CONTROL-C.
so no apparent errors there
edit 2:
I have spoken to a couple of other people about this, and was advised to look at the permissions on the gunicorn log. Here they are:
-rw-rw-r-- 1 www-data ubuntu 3270504 2014-01-19 23:23 gunicorn.log
the www-data user matches the one set in the supervisor config
edit 3: I ran the gunicorn command again, but this time added logging info:
gunicorn_django --pythonpath . ce.settings -w 4 --bind 127.0.0.1:9000 --debug --log-level debug
and received the following error message:
Traceback (most recent call last):
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 453, in spawn_worker
worker.init_process()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 99, in init_process
self.wsgi = self.app.wsgi()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 101, in wsgi
self.callable = self.load()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/app/djangoapp.py", line 87, in load
mod = util.import_module("gunicorn.app.django_wsgi")
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/app/django_wsgi.py", line 18, in <module>
from django.core.management.validation import get_validation_errors
File "/opt/screening/env/local/lib/python2.7/site-packages/django/core/management/validation.py", line 3, in <module>
from django.contrib.contenttypes.generic import GenericForeignKey, GenericRelation
File "/opt/screening/env/local/lib/python2.7/site-packages/django/contrib/contenttypes/generic.py", line 6, in <module>
from django.db import connection
File "/opt/screening/env/local/lib/python2.7/site-packages/django/db/__init__.py", line 14, in <module>
if not settings.DATABASES:
File "/opt/screening/env/local/lib/python2.7/site-packages/django/utils/functional.py", line 276, in __getattr__
self._setup()
File "/opt/screening/env/local/lib/python2.7/site-packages/django/conf/__init__.py", line 42, in _setup
self._wrapped = Settings(settings_module)
File "/opt/screening/env/local/lib/python2.7/site-packages/django/conf/__init__.py", line 89, in __init__
raise ImportError("Could not import settings '%s' (Is it on sys.path?): %s" % (self.SETTINGS_MODULE, e))
ImportError: Could not import settings 'ce.settings' (Is it on sys.path?): No module named ce.settings
2014-01-20 09:14:22 [31830] [INFO] Worker exiting (pid: 31830)
Traceback (most recent call last):
File "/opt/screening/env/bin/gunicorn_django", line 9, in <module>
load_entry_point('gunicorn==0.14.3', 'console_scripts', 'gunicorn_django')()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/app/djangoapp.py", line 129, in run
DjangoApplication("%prog [OPTIONS] [SETTINGS_PATH]").run()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 129, in run
Arbiter(self).run()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 184, in run
self.halt(reason=inst.reason, exit_status=inst.exit_status)
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 279, in halt
self.stop()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 327, in stop
self.reap_workers()
File "/opt/screening/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 413, in reap_workers
raise HaltServer(reason, self.WORKER_BOOT_ERROR)
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>
so it appears that the salient info is this:
ImportError: Could not import settings 'ce.settings' (Is it on sys.path?): No module named ce.settings
My settings are in a settings directory, and the __init__.py file is present, so the issue isn't that.
Also, the application starts under runserver, so the settings file must be importable.
(The question was answered by the OP in a question edit. Converted to a community wiki answer. See Question with no answers, but issue solved in the comments (or extended in chat) )
The OP wrote:
Solved the issue (I think).
As per the info in this link https://stackoverflow.com/a/19256794/2049067 , I added the project to the Python path:
export PYTHONPATH=:/my/path
then ran the gunicorn command again:
gunicorn_django --pythonpath . ce.settings -w 4 --bind 127.0.0.1:9000 --debug --log-level debug
and gunicorn is up and running, and the site is accessible. I exited the ssh session and everything is (seemingly) still working. I should also add that before I set the PYTHONPATH I changed the ownership of the gunicorn log:
sudo chown -R www-data:www-data gunicorn.log
Though I don't know if that helped, and seeing how the application has been running for years, I don't know how the project was removed from the Python path.
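The "Could not import settings ... (Is it on sys.path?)" message comes down to whether the directory containing the settings package is on the interpreter's search path, which is what setting PYTHONPATH changes. A minimal sketch of that effect is shown below, using a temporary directory and a hypothetical package name (demo_ce_pkg) instead of the real project:

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_importable(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package cannot be found.
        return False


def make_demo_project(root, package="demo_ce_pkg"):
    """Create <root>/<package>/settings.py and return the dotted module name."""
    pkg_dir = Path(root) / package
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "settings.py").write_text("DEBUG = False\n")
    return f"{package}.settings"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        name = make_demo_project(tmp)
        print("before:", is_importable(name))  # project root not on sys.path yet
        sys.path.insert(0, tmp)                # same effect as adding it to PYTHONPATH
        importlib.invalidate_caches()
        print("after:", is_importable(name))
        sys.path.remove(tmp)
```

Running the same kind of check as the www-data user, from the directory supervisor uses, would show whether the process sees the same search path as an interactive shell.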
