Python nose unit tests generating too many clients already - python-3.x

I'm using Python 3.3, Pyramid, SQLAlchemy, and psycopg2, with a test Postgres DB for the unit tests. I have 101 unit tests set up for nose to run. On test 101 I get:
nose.proxy.OperationalError: (OperationalError) FATAL: sorry, too many clients already
It seems from the traceback that the exception is being thrown in
......./venv/lib/python3.3/site-packages/SQLAlchemy-0.8.2-py3.3.egg/sqlalchemy/pool.py", line 368, in __connect
connection = self.__pool._creator()
Perhaps tearDown() is not running after each test? Isn't PostgreSQL's default connection limit 100 at a time?
Here's my BaseTest class:
class BaseTest(object):

    def setup(self):
        self.request = testing.DummyRequest()
        self.config = testing.setUp(request=self.request)
        self.config.scan('../models')
        sqlalchemy_url = 'postgresql://<user>:<pass>@localhost:5432/<db>'
        engine = create_engine(sqlalchemy_url)
        DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
        DBSession.configure(bind=engine)
        Base.metadata.bind = engine
        Base.metadata.create_all(engine)
        self.dbsession = DBSession

    def tearDown(self):
        testing.tearDown()
My test classes inherit from BaseTest:
class TestUser(BaseTest):

    def __init__(self, dbsession=None):
        if dbsession:
            self.dbsession = dbsession

    def test_create_user(self):
        ......
        ......
One of the test classes tests a many-to-many relationship, so in that test class I first create the records needed to satisfy the foreign key relationships:
from tests.test_user import TestUser
from tests.test_app import TestApp

class TestAppUser(BaseTest):

    def __init__(self, dbsession=None):
        if dbsession:
            self.dbsession = dbsession

    def create_app_user(self):
        test_app = TestApp(self.dbsession)
        test_user = TestUser(self.dbsession)
        test_app.request = testing.DummyRequest()
        test_user.request = testing.DummyRequest()
        app = test_app.create_app()
        user = test_user.create_user()
        ......
I'm passing the dbsession into the TestApp and TestUser classes...I'm thinking that is the source of the problem, but I'm not sure.
Any help is greatly appreciated. Thanks.

Pyramid has nothing to do with SQLAlchemy. There is nowhere in Pyramid's API where you would link any of your SQLAlchemy configuration in a way that Pyramid would actually care. Therefore, Pyramid's testing.tearDown() does not do anything with connections. How could it? It doesn't know they exist.
You're using scoped sessions with a unit test, which really doesn't make a lot of sense because your unit tests are probably not threaded. So now you're creating threadlocal sessions and not cleaning them up. They aren't garbage collected because they're threadlocal. You also aren't manually closing those connections so the connection pool thinks they're still being used.
Is there a reason you need the ZopeTransactionExtension in your tests? Are you using the transaction package in your tests, or pyramid_tm? If you don't know what something does, it shouldn't be in a test. You're calling create_all() from your setUp() method? That's going to be slow as hell, introspecting the database and creating tables on every test. Ouch.
class BaseTest(object):

    def setUp(self):
        self.request = testing.DummyRequest()
        self.config = testing.setUp(request=self.request)
        self.config.scan('../models')
        sqlalchemy_url = 'postgresql://<user>:<pass>@localhost:5432/<db>'
        self.engine = create_engine(sqlalchemy_url)
        Base.metadata.create_all(bind=self.engine)
        self.sessionmaker = sessionmaker(bind=self.engine)
        self.sessions = []

    def makeSession(self, autoclose=True):
        session = self.sessionmaker()
        if autoclose:
            self.sessions.append(session)
        return session

    def tearDown(self):
        for session in self.sessions:
            session.close()
        self.engine.dispose()
        testing.tearDown()
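To see why unclosed sessions exhaust the pool, here is a standalone sketch (my illustration, not part of the answer above). It uses an in-memory SQLite engine with an explicit QueuePool purely to expose the pool counters:

```python
# Illustration: each session that executes a query checks a connection out of
# the pool, and it stays checked out until the session is closed.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

engine = create_engine('sqlite://', poolclass=QueuePool)
Session = sessionmaker(bind=engine)

leaked = []
for _ in range(3):
    s = Session()
    s.execute(text('select 1'))  # forces a connection checkout
    leaked.append(s)             # never closed: connection stays checked out

print(engine.pool.checkedout())  # 3 connections still in use

for s in leaked:
    s.close()                    # close() returns the connection to the pool

print(engine.pool.checkedout())  # back to 0
```

With a Postgres server capped at 100 connections, repeating this leak once per test is exactly what produces "too many clients already" around test 101.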

Related

Why would my pytest tests hang before dropping my SQLAlchemy DB?

Here's my conftest.py (some code deleted for brevity)
from trip_planner import create_app, db as _db
from trip_planner.models import User
from test import TestConfig, test_instance_dir

@pytest.fixture(scope='session', autouse=True)
def app(session_mocker: pytest_mock.MockerFixture):
    static_folder = mkdtemp(prefix='static')
    _app = create_app(TestConfig(), instance_path=test_instance_dir,
                      static_folder=static_folder)
    ctx = _app.app_context()
    ctx.push()
    session_mocker.patch('trip_planner.assets.manifest',
                         new=defaultdict(str))
    yield _app
    ctx.pop()
    os.rmdir(static_folder)

@pytest.fixture(scope='session')
def db(app):
    _db.create_all()
    seed_db(_db)
    yield _db
    _db.drop_all()

def seed_db(db) -> User:
    sessionmaker = db.create_session({'autocommit': False})
    session = sessionmaker()
    user = User(username='username',
                password_digest=bcrypt.hash('password'))
    session.add(user)
    session.commit()
    session.close()
    return user

@pytest.fixture(scope='function')
def db_session(db):
    session = db.create_scoped_session(options=dict(
        autocommit=False, autoflush=False
    ))
    db.session = session
    with session.begin_nested():
        yield session
    session.rollback()
    session.remove()

@pytest.fixture(scope='function')
def app_client(app):
    with app.test_client() as c:
        yield c

@pytest.fixture(scope='function')
def session_user(db_session, app_client) -> int:
    user_id, = db_session.query(User.id).filter_by(username='username').one()
    with app_client.session_transaction() as sess:
        sess['user_id'] = user_id
    return user_id
When my tests pass, pytest hangs, and I'm only able to stop it with killall. Inspection of the test database reveals that the relations were not, in fact, dropped.
How do I remedy this?
Apparently it's a well-known problem with PostgreSQL specifically; here's the discussion.
The way I solved it was adding _db.close_all_sessions() before dropping all tables:
@pytest.fixture(scope='session')
def db(app):
    _db.create_all()
    seed_db(_db)
    yield _db
    _db.close_all_sessions()
    _db.drop_all()
Another possible cause (it may have applied here before, I'm not sure), worth checking: look at pg_stat_activity and see whether your queries hang on obtaining advisory locks.
Advisory locks in PostgreSQL can be session-level or transaction-level. A session-level advisory lock is not released at the end of a transaction, only on disconnect. That can cause overlapping sessions to hang, with one trying to roll everything back and the other trying to take an advisory lock.
A session-level advisory lock is obtained via pg_advisory_lock functions and a transaction-level advisory lock is obtained via pg_advisory_xact_lock functions.

Django initialising AppConfig multiple times

I wanted to use the ready() hook in my AppConfig to start a django-rq scheduler job. However, it does so multiple times, every time I start the server. I imagine that's due to threading, but I can't seem to find a suitable workaround. This is my AppConfig:
class AnalyticsConfig(AppConfig):
    name = 'analytics'

    def ready(self):
        print("Init scheduler")
        from analytics.services import save_hits
        scheduler = django_rq.get_scheduler('analytics')
        scheduler.schedule(datetime.utcnow(), save_hits, interval=5)
Now when I do runserver, Init scheduler is displayed 3 times. I've done some digging and according to this question I started the server with --noreload which didn't help (I still got Init scheduler x3). I also tried putting
import os

if os.environ.get('RUN_MAIN', None) != 'true':
    default_app_config = 'analytics.apps.AnalyticsConfig'
in my __init__.py however RUN_MAIN appears to be None every time.
Afterwards I created a FileLock class, to skip configuration after the first initialization, which looks like this:
class FileLock:

    def __get__(self, instance, owner):
        return os.access(f"{instance.__class__.__name__}.lock", os.F_OK)

    def __set__(self, instance, value):
        if not isinstance(value, bool):
            raise AttributeError
        if value:
            f = open(f"{instance.__class__.__name__}.lock", 'w+')
            f.close()
        else:
            os.remove(f"{instance.__class__.__name__}.lock")

    def __delete__(self, obj):
        raise AttributeError


class AnalyticsConfig(AppConfig):
    name = 'analytics'
    locked = FileLock()

    def ready(self):
        from analytics.services import save_hits
        if not self.locked:
            print("Init scheduler")
            scheduler = django_rq.get_scheduler('analytics')
            scheduler.schedule(datetime.utcnow(), save_hits, interval=5)
            self.locked = True
This does work, however the lock is not destroyed after the app quits. I tried removing the .lock files in settings.py, but that also runs multiple times, making it pointless.
My question is: how can I prevent Django from calling ready() multiple times, or otherwise tear down the .lock files after Django exits or right after it boots?
I'm using Python 3.8 and Django 3.1.5.
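For the clean-up half of the question, one hedged option (my sketch, not Django-specific; the names are illustrative) is to register an atexit hook the first time the lock is taken, so the .lock file is removed when the process exits. Note this only fires on a clean interpreter exit, and under the autoreloader each process runs its own hooks:

```python
# Sketch: a file lock that cleans up after itself on interpreter exit.
import atexit
import os

class FileLock:
    def __init__(self, name):
        self._path = f"{name}.lock"

    def acquire(self):
        if os.access(self._path, os.F_OK):
            return False                  # someone already holds the lock
        open(self._path, 'w').close()     # create the lock file
        atexit.register(self.release)     # remove it on clean process exit
        return True

    def release(self):
        try:
            os.remove(self._path)
        except FileNotFoundError:
            pass                          # already released (or atexit re-ran)

lock = FileLock("AnalyticsConfig")
print(lock.acquire())   # True: first acquisition creates the file
print(lock.acquire())   # False: the lock file already exists
lock.release()
```

In an AppConfig.ready() this would mean scheduling the job only when acquire() returns True.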

Trying to understand dependency injection

I am trying to learn the concept of "dependency injection" in Python. First, if anyone has a good reference, please point me at it.
As a project, I took the use case of changing logic and formatting based on options passed to the Linux command mtr.
The dependency client class is MtrRun. The initial dependency injection service is DefaultRgx (I plan to add a couple more). The injection interface is MtrOptions. And the injector class is just called Injector.
class MtrRun(MtrOptions):  # Dependency injection client
    def __init__(self, MtrOptions, options, out):
        self.MtrOptions = MtrOptions
        self.options = options
        self.out = out

    def mtr_options(self, options, out):
        return self.MtrOptions.mtr_options(options, out)


class DefaultRgx(MtrOptions):  # Dependency injection service
    def __init__(self, options):
        self.options = None

    def mtr_options(self, options, out):
        pass  # code abbreviated for clarity


class MtrOptions():  # Dependency injection interface
    def __init__(self, svc):
        self.svc = svc

    def mtr_options(self, options, out):
        return self.svc.mtr_options(options, out)


class Injector():  # Dependency injection injector
    def inject(self):
        MtrOptions = MtrOptions(DefaultRgx())
        mtr_result = MtrRun(MtrOptions)
This snippet will not pass linting. My IDE claims that the MtrOptions class passed into the injection client and service is not defined. When I try to resolve it, a new MtrOptions class is created, but the error persists. I am certain I just don't know what I am doing; conceptually I admit a weak grasp. Help is appreciated.
So I messed up in several ways. First, I did not understand the declarative way to establish inheritance; nothing in my example actually derived from object. Second, order matters: base classes need to appear before their children. Third, the injector class needs to include both sides of the classes to be injected.
class DefaultRgx(object):  # Dependency injection service
    def __init__(self, options):
        self.options = None

    def mtr_options(self, options, out):
        mtr_result = ['Do stuff the old way']
        return mtr_result


class MtrRun(DefaultRgx):  # Dependency injection client
    def __init__(self, host, count, options):
        self.count = count
        self.host = host
        self.options = options

    def mtr_module(self, host, count, options):
        mtr_result = super().mtr_options(options, out)
        return mtr_result


class MtrOptions(DefaultRgx):  # Dependency injection interface
    def mtr_options(self, options, out):
        mtr_result = ['I was injected instead of DefaultRgx']
        return mtr_result


class Injector(MtrOptions, MtrRun):  # Dependency injection injector
    pass


def main():
    mtr = Injector(os.getenv('HOST'), os.getenv('COUNT'), None)
    mtr_result = mtr.mtr_module()
This linted correctly. I have not run it yet, but conceptually that YouTube video really helped things click. Thank you so much.
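For contrast, here is what constructor-based dependency injection usually looks like in Python: the client receives the service as an argument instead of inheriting from it. This is my own minimal sketch; the class names echo the post, but the method bodies are illustrative:

```python
# Constructor injection: the client depends only on an interface, and any
# object implementing that interface can be swapped in at construction time.
from typing import Protocol

class MtrOptions(Protocol):          # the "interface": anything with mtr_options()
    def mtr_options(self, options, out): ...

class DefaultRgx:                    # one injectable service
    def mtr_options(self, options, out):
        return ['Do stuff the default way']

class VerboseRgx:                    # a second service, swapped in freely
    def mtr_options(self, options, out):
        return ['Do stuff verbosely']

class MtrRun:                        # the client: never names a concrete service
    def __init__(self, options_svc: MtrOptions):
        self.options_svc = options_svc

    def run(self, options, out):
        return self.options_svc.mtr_options(options, out)

print(MtrRun(DefaultRgx()).run(None, None))   # ['Do stuff the default way']
print(MtrRun(VerboseRgx()).run(None, None))   # ['Do stuff verbosely']
```

The key difference from the inheritance-based version above is that choosing a service is a runtime decision (the "injection"), not something baked into the class hierarchy.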

flask-sqlalchemy - how to obtain a request-independent db session

I am looking for the best (and correct) way to obtain a request-independent db session.
The problem is the following: I am building a web application that has to access the database. The exposed endpoint accepts a request, performs some initial work, then creates a thread (that will perform the hard work), starts it, and replies to the client with a unique id for the "job". Meanwhile the thread goes on with its work (and it has to access the database) and the client can poll to check the status. I am not using a dedicated framework for this background job, just a simple thread. I can only have one background thread going at any time, so I maintain the state in a singleton.
The application is created with the application factory design https://flask.palletsprojects.com/en/1.1.x/patterns/appfactories/
I am using Gunicorn as WSGI server and sqlite as database.
The basic structure of the code is the following (I have removed the business logic and imports, but the concept remains):
api_jobs.py
@bp.route('/jobs', methods=['POST'])
def create_job():
    data = request.get_json(force=True) or {}
    name = data['name']
    job_controller = JobController()  # This is a singleton
    job_process = job_controller.start_job(name)
    job_process_dict = job_process.to_dict()
    return jsonify(job_process_dict)
controller.py
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]


class JobController(object):
    __metaclass__ = Singleton

    def __init__(self):
        self.job_thread = None

    def start_job(self, name):
        if self.job_thread is not None:
            job_id = self.job_thread.job_id
            job_process = JobProcess.query.get(job_id)
            if job_process.status != 'end':
                raise ValueError('A job process is already ongoing!')
            else:
                self.job_thread = None
        job_process = JobProcess(name)
        db.session.add(job_process)
        db.session.commit()  # At this step I create the ID
        self.job_thread = JobThread(db.session, job_process.id)
        self.job_thread.start()
        return job_process


class JobThread(threading.Thread):
    def __init__(self, db_session, job_id):
        self.job_id = job_id
        self.db_session = db_session
        self.session = self.db_session()

    def run(self):
        self.job_process = self.session.query(JobProcess).get(self.job_id)
        self.job_process.status = 'working'
        self.session.commit()
        i = 0
        while True:
            sleep(1)
            print('working hard')
            i = i + 1
            if i > 10:
                break
        self.job_process.status = 'end'
        self.session.commit()
        self.db_session.remove()
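One side note on the controller (my observation, not part of the question or its answer): `__metaclass__ = Singleton` is Python 2 syntax and is silently ignored by Python 3, so JobController above is not actually a singleton. Python 3 uses the metaclass keyword argument instead:

```python
# Python 3 spelling of the singleton metaclass pattern.
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class JobController(metaclass=Singleton):  # keyword, not __metaclass__
    def __init__(self):
        self.job_thread = None

print(JobController() is JobController())  # True: same instance every time
```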
models.py
class JobProcess(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    status = db.Column(db.String(64))
    name = db.Column(db.String(64))

    def to_dict(self):
        data = {
            'id': self.id,
            'status': self.status,
            'name': self.name,
        }
        return data
From my understanding, calling self.session = self.db_session() actually does nothing (because SQLAlchemy's scoped session is a registry that also acts as a proxy, if I am not wrong); however, that was the best attempt I found to create a "new/detached/useful" session.
I checked out https://docs.sqlalchemy.org/en/13/orm/contextual.html#using-thread-local-scope-with-web-applications in order to obtain a request-independent db-session, however even using the suggested method of creating a new session factory (sessionmaker + scoped_session), does not work.
The errors that I obtain, with slight changes to the code, are multiple, in this configuration the error is
DetachedInstanceError: Instance <JobProcess at 0x7f875f81c350> is not bound to a Session; attribute refresh operation cannot proceed (Background on this error at: http://sqlalche.me/e/bhk3)
The basic question remains: Is it possible to create a session that will live inside the thread and that I will take care of creating/tearing down?
The reason you are encountering the DetachedInstanceError is that you are attempting to pass the session from your main thread to your job thread. SQLAlchemy's scoped sessions use thread-local storage to manage sessions, so a single session cannot be shared between two threads. You just need to create a new session in the run method of your job thread.
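A standalone sketch of that advice (mine, not the answerer's code; it uses an in-memory SQLite engine with a StaticPool so both threads see the same database): hand the worker only the job id, and let it build its own session from a shared sessionmaker.

```python
# Each thread builds its own session; only plain ids cross the thread boundary.
import threading
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.pool import StaticPool

Base = declarative_base()

class JobProcess(Base):
    __tablename__ = 'job_process'
    id = Column(Integer, primary_key=True)
    status = Column(String(64))

# StaticPool + check_same_thread=False lets both threads share the single
# in-memory SQLite connection (only needed for this demo).
engine = create_engine('sqlite://',
                       connect_args={'check_same_thread': False},
                       poolclass=StaticPool)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

class JobThread(threading.Thread):
    def __init__(self, job_id):
        super().__init__()
        self.job_id = job_id             # pass the id, never the ORM instance

    def run(self):
        session = Session()              # session created inside the worker
        try:
            job = session.get(JobProcess, self.job_id)
            job.status = 'end'
            session.commit()
        finally:
            session.close()              # always release the connection

# Main thread: create the row, hand only the id to the worker.
with Session() as s:
    job = JobProcess(status='new')
    s.add(job)
    s.commit()
    job_id = job.id

t = JobThread(job_id)
t.start()
t.join()

with Session() as s:
    print(s.get(JobProcess, job_id).status)  # 'end'
```

Because the worker re-queries the row by id, no instance is ever detached from its owning session.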

Proper way to share sqlalchemy session and engine in context of gevent

My code tried to create an engine first:
def createEngine(connectionstring):
    engine = create_engine(connectionstring,
                           # pool_size = DEFAULT_POOL_SIZE,
                           # max_overflow = DEFAULT_MAX_OVERFLOW,
                           echo=False)
    return engine
Then get a session from the engine:
@contextmanager
def getOrmSession(engine):
    try:
        Session.configure(bind=engine)
        session = Session()
        yield session
    finally:
        pass
The client code is as follows:
def composeItems(keyword, itemList):
    with getOrmSession(engine) as session:
        for i in itemList:
            item = QueryItem(query=keyword,
                             ......
                             active=0)
            session.add(item)
        session.commit()
Then I call composeItems within gevent.spawn, and MySQL deadlocks. What happened? What is wrong with the above usage?
Found the answer myself.
I need to monkey-patch threading when importing gevent, so that scoped_session uses greenlet-local storage instead of thread-local storage. After changing the patching, everything works fine now.
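A minimal sketch of the fix (my illustration, not the original poster's code; assumes gevent is installed): the patch must run before anything imports threading, so that thread-locals, and hence scoped_session's registry, become greenlet-local.

```python
# Patch first: this must execute before sqlalchemy or any app module is imported.
from gevent import monkey
monkey.patch_all()

import threading
import gevent

local = threading.local()     # after patch_all, this is greenlet-local
seen = []

def worker(name):
    local.value = name        # each greenlet writes its own slot
    gevent.sleep(0)           # yield to the other greenlet
    seen.append(local.value)  # still our own value: no cross-greenlet leakage

gevent.joinall([gevent.spawn(worker, 'a'), gevent.spawn(worker, 'b')])
print(sorted(seen))
```

With the patch in place, each greenlet calling Session() gets its own session from the scoped_session registry, which is why the deadlock disappears.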
