How to test MySQL queries using SQLAlchemy and SQLite?

I have the following code structure written in Python3.6, which I need to test using sqlite3 (because of standards defined in my project):
class BigSecretService:
    """ Class designed to make calculations based on data stored in MySQL. """
    def load_data(self):
        # load some data using sqlalchemy ORM
        ...
    def get_values_from_fields(self, fields):
        # here's getting values via sqlalchemy execute with raw query:
        self.sql_service.execute(SOME_QUERY)
    def process_data(self, data, values):
        # again execute some raw query
        # process data and put into result list
        return result_list
    def make_calculations(self, params):
        data = self.load_data()
        values = self.get_values_from_fields(fields)
        result_vector = self.process_data(data, values)
SOME_QUERY is in a separate module and its format looks like this:
"SELECT SUM(some_field) FROM some_table WHERE col1 = :col1 AND col2 = :col2"
To cover make_calculations in my component test, I designed some awful patches:
class PatchedConnection:
    """ Class is used to transform queries to sqlite format before executing. """
    def __init__(self, connection, engine):
        self.connection = connection
        self.engine = engine
    def __call__(self):
        conn = self.connection()
        conn.execute = self.patched_execute(conn.execute)
        return conn
    def transform_date(self, date):
        try:
            # quick check just for testing
            if '+00:00' in date:
                date = date.replace('T', ' ').replace('+00:00', '.000000')
        finally:
            return date
    def patched_execute(self, f_execute):
        def prepare_args_for_sqlite(query, *args):
            # check if query is in sqlite format
            if args:
                if '?' in str(query):
                    args = list(map(self.transform_date, list(args[0].values())))
                    return self.engine.execute(str(query), args)
                return f_execute(query, args[0])
            else:
                return f_execute(query)
        return prepare_args_for_sqlite
Then in the test it looks like this:
QUERY_TEMPLATE_SQLITE = 'SELECT SUM(some_field) FROM some_table WHERE col1 = ? AND col2 = ?'
with mock.patch('path_to_my_service.SOME_QUERY', QUERY_TEMPLATE_SQLITE):
    self.sql_service.get_connection = PatchedConnection(self.sql_service.get_connection, self.engine)
    response = self.client.simulate_post("/v1/secret_service/make_calculations",
                                         headers=self.auth_header,
                                         body=json.dumps(payload))
    self.assertEqual(response.status_code, 200)
    # then check response.text
It works so far, but I believe there must be a much better solution. Moreover, in patched_execute the args dict is converted to a list, and there is no guarantee that the order of the dict values will always match the order of the placeholders.
So, my question is: how do I perform such testing correctly with the given tools?

If you need to intercept and manipulate the SQL being sent to the database, then using core events (https://docs.sqlalchemy.org/en/13/core/events.html) would be the most straightforward way of doing this. The before_cursor_execute event would suit your purposes, as outlined in the following example from the SQLAlchemy documentation.
from sqlalchemy import event
@event.listens_for(engine, "before_cursor_execute", retval=True)
def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    # do something with statement, parameters
    return statement, parameters
From the example you have given, however, I'm not sure that this is necessary. The MySQL query you have listed is also a valid SQLite query and needs no manipulation. Also, if you pass your parameters as Python objects rather than as strings, then again no manipulation should be needed, as SQLAlchemy will map these correctly to the backend.
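To illustrate the point that the query from the question already runs on SQLite, here is a minimal sketch using only the standard-library sqlite3 module rather than SQLAlchemy. The table schema and the sample rows are invented for the demonstration. Because the values are bound by name, the order of the keys in the parameter dict does not matter, which is the concern raised about converting the dict to a list.

```python
import sqlite3

# Same query text as in the question, with named :col1 / :col2 placeholders.
SOME_QUERY = "SELECT SUM(some_field) FROM some_table WHERE col1 = :col1 AND col2 = :col2"

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, invented only so the query has data to run on.
conn.execute("CREATE TABLE some_table (some_field INTEGER, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (5, "b", "x")],
)

# Values are matched to placeholders by name, so dict ordering is irrelevant.
total = conn.execute(SOME_QUERY, {"col2": "x", "col1": "a"}).fetchone()[0]
print(total)
```

With SQLAlchemy itself, the usual route would be to wrap the string in text() and pass the same dict, so that the named placeholders are handled for whichever backend the engine points at; treat that as a suggestion to verify against your own setup rather than a guarantee.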

Related

How to implement "relationship" caching system in a similar query?

I noticed that when having a Model such as :
class User(Model):
    id = ...
    books = relationship('Book')
When calling user.books for the first time, SQLAlchemy queries the database (with lazy='select', which is the default), but subsequent calls to user.books don't hit the database. The results seem to have been cached.
I'd like to have the same behavior from SQLAlchemy when using a method that runs a query, for instance:
class User:
    def get_books(self):
        return Book.query.filter(Book.user_id == self.id).all()
But when doing that, if I call get_books() three times, SQLAlchemy queries the database three times (visible when the ECHO property is set to True).
How can I change get_books() to use the caching system from SQLAlchemy?
I insist on "from SQLAlchemy" because I believe it handles the refresh/expunge/flush system, so data is re-queried from the DB after one of these happens. That is in contrast to simply creating a caching property in the model, such as:
def get_books(self):
    if self._books is None:
        self._books = Book.query.filter(Book.user_id == self.id).all()
    return self._books
This does not work well with flush/refresh/expunge from SQLAlchemy.
So, how can I change get_books() to use the caching system from SQLAlchemy?
Edit 1:
I realized that the solution provided below is not perfect, because it caches per object. If you have two instances of the same user and call get_books on both, two queries will be made, because the caching applies only to the instance, not globally, contrary to SQLAlchemy.
The reason is simple, I believe, but it is still unclear to me how to apply it in my case: the relationship is defined at the class level, not on the instance (books = relationship()), and SQLAlchemy builds its own query internally, so it can cache based on the query.
In the solution I gave, memoize_getter is unaware of the query made and so cannot cache the same value across multiple instances; any identical call made on another instance will query the database.
Original answer:
I've been trying to wrap my head around SQLAlchemy's code (wow that's dense!), and I think I figured it out!
A relationship, at least when set to lazy='select' (the default), is an InstrumentedAttribute, whose __get__ method does the following:
def __get__(self, instance, owner):
    if instance is None:
        return self
    dict_ = instance_dict(instance)
    if self._supports_population and self.key in dict_:
        return dict_[self.key]
    else:
        try:
            state = instance_state(instance)
        except AttributeError as err:
            util.raise_(
                orm_exc.UnmappedInstanceError(instance),
                replace_context=err,
            )
        return self.impl.get(state, dict_)
So, a basic caching system, respecting SQLAlchemy, would be something like:
from sqlalchemy.orm.base import instance_dict
def get_books(self):
    dict_ = instance_dict(self)
    if 'books' not in dict_:
        dict_['books'] = Book.query.filter(Book.user_id == self.id).all()
    return dict_['books']
Now, we can take this a bit further and write a decorator (oh sweet):
import functools
def memoize_getter(f):
    @functools.wraps(f)
    def decorator(instance, *args, **kwargs):
        # 'get_books' is cached under the key 'books'
        property_name = f.__name__.replace('get_', '')
        dict_ = instance_dict(instance)
        if property_name not in dict_:
            dict_[property_name] = f(instance, *args, **kwargs)
        return dict_[property_name]
    return decorator
Thus transforming the original method to:
class User:
    @memoize_getter
    def get_books(self):
        return Book.query.filter(Book.user_id == self.id).all()
If someone has a better solution, I'm eagerly interested!
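To see the decorator's behavior without a database, here is a small stand-in sketch. It assumes that instance_dict(instance) is essentially the instance's __dict__, so vars(instance) is used in its place, and a call counter replaces the real query. The User class, the query_count attribute and the returned book names are all hypothetical.

```python
import functools

def memoize_getter(f):
    @functools.wraps(f)
    def wrapper(instance, *args, **kwargs):
        # 'get_books' is cached under the key 'books'
        property_name = f.__name__.replace('get_', '')
        # Assumption: vars() stands in for sqlalchemy's instance_dict().
        dict_ = vars(instance)
        if property_name not in dict_:
            dict_[property_name] = f(instance, *args, **kwargs)
        return dict_[property_name]
    return wrapper

class User:
    def __init__(self):
        self.query_count = 0  # counts simulated database hits

    @memoize_getter
    def get_books(self):
        self.query_count += 1  # stands in for the real query
        return ['book-a', 'book-b']

user = User()
first = user.get_books()
second = user.get_books()  # served from the instance dict, no new "query"

other = User()
other.get_books()  # a different instance triggers its own "query"
print(user.query_count, other.query_count)
```

The second instance running its own query is the limitation described in Edit 1: the cache lives on each object, not on the query.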

How to convert python dict to DictRow object

Hi, I am writing unit tests using pytest, but I am not able to mock a few DB functions. We are using psycopg2 for DB connections and query execution. The response returned from a psycopg2 query is of type DictRow, which can be accessed either by key or by index.
Ex:
response = ['prajwal', '23', 'engineer'] #Response of a query "select name, age , job from users"
>>>response[0]
'prajwal'
>>>response['name']
'prajwal'
I want to know whether there is any way to convert a dict/list to the above-mentioned type.
Looking at the source for psycopg2, creating a DictRow requires passing in a DictCursor object. However the only thing it uses from DictCursor appears to be an index and description attribute.
# found in lib\site-packages\psycopg2\extras.py
class DictRow(list):
    """A row object that allow by-column-name access to data."""
    __slots__ = ('_index',)
    def __init__(self, cursor):
        self._index = cursor.index
        self[:] = [None] * len(cursor.description)
The index looks like a dict mapping each key to a position, e.g. index['name'] = 0.
The description can be the dict that you want to convert, since only its length is used.
If you're feeling hacky you could take advantage of duck typing and pretend you're passing in a cursor when you're just satisfying the requirements.
The only caveat is after we instantiate the DictRow, we need to populate it. Our fake cursor hack will take care of the rest.
from psycopg2.extras import DictRow
class DictRowHack:
    def __init__(self, my_dict):
        # we need to set these 2 attributes so that
        # it auto populates our indexes
        self.index = {key: i for i, key in enumerate(my_dict)}
        self.description = my_dict
def dictrow_from_dict(my_dict):
    # this is just a little helper function
    # so you don't always need to go through
    # the steps to recreate a DictRow
    fake_cursor = DictRowHack(my_dict)
    my_dictrow = DictRow(fake_cursor)
    for k, v in my_dict.items():
        my_dictrow[k] = v
    return my_dictrow
response = {'name': 'prajwal', 'age': '23', 'job': 'engineer'}
my_dictrow = dictrow_from_dict(response)
print(my_dictrow[1])
print(my_dictrow['name'])
print(type(my_dictrow))
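If the tests only need the dual access by key and by position, and not a genuine DictRow instance, a pure-Python stand-in may be enough. The sketch below is not psycopg2 code: KeyedRow is a hypothetical class that mimics the idea of a list plus a key-to-position index described above.

```python
class KeyedRow(list):
    """Hypothetical stand-in for DictRow: a list that also accepts column names."""

    def __init__(self, mapping):
        super().__init__(mapping.values())
        # Map each column name to its position, like the cursor's index dict.
        self._index = {key: i for i, key in enumerate(mapping)}

    def __getitem__(self, x):
        # Translate a column name into a position; ints and slices pass through.
        if not isinstance(x, (int, slice)):
            x = self._index[x]
        return super().__getitem__(x)

row = KeyedRow({'name': 'prajwal', 'age': '23', 'job': 'engineer'})
print(row[0], row['name'], row['job'])
```

Note that isinstance checks against DictRow would fail for this class, so it only suits code that relies on duck typing.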

Testing flask_wtf/wtforms with pytest

I'd like to test a POST route that processes a non-trivial form (by working with flask.request.form). I didn't really find a good tutorial for this somehow as most pass json data rather than form (or is it the same?).
I tried to write the code in the following way:
import pytest
import app  # app.app is the Flask app
@pytest.fixture
def client():
    app.app.config['TESTING'] = True
    with app.app.test_client() as client:
        with app.app.app_context():
            yield client
def test_route_webapp_post(client):
    form = app.forms.ImputeForm.make_form(data_dict=app.data_dictionary.data_dict,
                                          numeric_fields=app.binaries_dict['numeric_mappers'].keys(),
                                          recordname2description=app.binaries_dict['recordname2description'])
    rv = client.post('/web_app', form=form)
    assert rv.status_code == 200
The form is generated dynamically and I don't always know ahead of time what are the fields:
from flask_wtf import FlaskForm
from wtforms import SelectField, DecimalField, BooleanField
class ImputeForm(FlaskForm):
    @classmethod
    def make_form(cls, data_dict, numeric_fields, recordname2description, request_form=None):
        for key in numeric_fields:
            setattr(cls, key, DecimalField(id=key, label=recordname2description[key].split('(')[0]))
            setattr(cls, 'mask_' + key, BooleanField(label='mask_' + key))
        for key in data_dict:
            setattr(cls, key, SelectField(id=key, label=recordname2description[key],
                                          choices=[(-1, 'None selected')] + list(data_dict[key].items())))
            setattr(cls, 'mask_' + key, BooleanField(label='mask_' + key))
        instance = cls(request_form)
        return instance
But this doesn't really work, as I can't create a form inside the test case; I get:
E RuntimeError: Working outside of request context.
E
E This typically means that you attempted to use functionality that needed
E an active HTTP request. Consult the documentation on testing for
E information about how to avoid this problem.
So what is the proper approach to testing my form (in particular I am ok with sending an empty one)?
The correct way is to create a Python dictionary and pass it as "data", rather than trying to create a form.
In this particular case, that meant writing a new function:
def make_from_data(data_dict, numeric_fields):
    data = dict()
    for key in numeric_fields:
        data[key] = '234'
        data['mask_' + key] = 'y'
    for key in data_dict:
        data[key] = -1
        data['mask_' + key] = 'y'
    return data
and passing it as follows:
def test_route_webapp_post(client):
    data = make_from_data(data_dict=app.data_dictionary.data_dict,
                          numeric_fields=app.binaries_dict['numeric_mappers'].keys())
    rv = client.post('/web_app', data=data)
    assert rv.status_code == 200
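On the side question of whether posting JSON and posting a form are the same thing: they are not. As far as I understand, a plain dict passed as data to the Flask test client is sent as a form-encoded body, which is what flask.request.form reads, whereas JSON travels as a different body format. The standard-library sketch below only shows the two encodings side by side; the field names are hypothetical.

```python
import json
from urllib.parse import urlencode, parse_qs

# Hypothetical field names, in the spirit of make_from_data above.
data = {'age': '234', 'mask_age': 'y'}

form_body = urlencode(data)   # shape of a form-encoded POST body
json_body = json.dumps(data)  # shape of a JSON POST body

# A form body decodes back into field names with lists of string values.
decoded = parse_qs(form_body)
print(form_body)
print(json_body)
print(decoded)
```

This is also why the values in the test dictionary can be simple strings such as '234' and 'y': form fields arrive as strings and the form classes do the conversion.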

Only get one data from collections?

I have some data in a MongoDB collection and I want to see all the documents in it. Say I have simple code like this:
from pymongo import MongoClient
url = 'my url'
client = MongoClient(url, ssl=True, retryWrites=True)
class DB(object):
    def __init__(self):
        self.db = client.mydb
        self.col = self.db.mycol
    def see_listed(self):
        for i in self.col.find():
            return i
db = DB()
print(db.see_listed())
That only returned one document from my collection,
but if I change the code in see_listed to
for i in self.col.find():
    print(i)
it prints all of the data from my collection. I don't know what I'm doing wrong; I just read some documentation and tried this.
I'd be thankful for any help.
You only get one document since you use return in your see_listed function.
If you change the return to yield instead it should return a generator you can iterate through.
def see_listed(self):
    for i in self.col.find():
        yield i
But if you only want the data in a list you could do:
def see_listed(self):
    return list(self.col.find())
That may not be the best choice if the size of the data is unknown, since list() loads every document into memory at once.
For more on the yield keyword, see: What does the "yield" keyword do?
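The difference between return and yield inside the loop can be seen without MongoDB. In this sketch a plain list of dicts stands in for the cursor returned by find(); the function names and documents are made up for the illustration.

```python
def first_only(cursor):
    for doc in cursor:
        return doc  # leaves the function on the very first iteration

def all_docs(cursor):
    for doc in cursor:
        yield doc  # hands back one document, then resumes on the next request

# Stand-in for self.col.find(): any iterable of documents behaves the same way.
docs = [{'_id': 1}, {'_id': 2}, {'_id': 3}]

first = first_only(docs)
everything = list(all_docs(docs))
print(first)
print(everything)
```

Calling all_docs(docs) on its own only creates the generator; nothing is read until it is iterated, for example with a for loop or list().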

results of sqlite query not displayed in flask web app

I'm attempting to learn flask, so decided to follow this tutorial:
https://www.blog.pythonlibrary.org/2017/12/14/flask-101-adding-editing-and-displaying-data/
I just updated my main function with the below:
@app.route('/results')
def search_results(search):
    results = []
    search_string = search.data['search']
    if search.data['search'] == '':
        qry = db_session.query(Album)
        results = qry.all()
    if not results:
        flash('No results found!')
        return redirect('/')
    else:
        # display results
        table = Results(results)
        table.border = True
        return render_template('results.html', table=table)
but when I add an album to the DB and try to query it back using search option it says no results. The DB file was created correctly and I have exactly the same code as in the tutorial up to this point.
The only change I made was adding from tables import Results. Full main.py below. Can you please give me some guidance about where to look for the culprit? Like I said, just learning, so any suggestions re resources in a friendly laid out way would be much appreciated (beginner programmer).
from app import app
from db_setup import init_db, db_session
from forms import MusicSearchForm, AlbumForm
from flask import flash, render_template, request, redirect
from models import Album, Artist
from tables import Results
init_db()
def save_changes(album, form, new=False):
    """
    Save the changes to the database
    """
    # Get data from form and assign it to the correct attributes
    # of the SQLAlchemy table object
    artist = Artist()
    artist.name = form.artist.data
    album.artist = artist
    album.title = form.title.data
    album.release_date = form.release_date.data
    album.publisher = form.publisher.data
    album.media_type = form.media_type.data
    if new:
        # Add the new album to the database
        db_session.add(album)
    # commit the data to the database
    db_session.commit()
@app.route('/', methods=['GET', 'POST'])
def index():
    search = MusicSearchForm(request.form)
    if request.method == 'POST':
        return search_results(search)
    return render_template('index.html', form=search)
@app.route('/results')
def search_results(search):
    results = []
    search_string = search.data['search']
    if search.data['search'] == '':
        qry = db_session.query(Album)
        results = qry.all()
    if not results:
        flash('No results found!')
        return redirect('/')
    else:
        # display results
        table = Results(results)
        table.border = True
        return render_template('results.html', table=table)
@app.route('/new_album', methods=['GET', 'POST'])
def new_album():
    """
    Add a new album
    """
    form = AlbumForm(request.form)
    if request.method == 'POST' and form.validate():
        # save the album
        album = Album()
        save_changes(album, form, new=True)
        flash('Album created successfully!')
        return redirect('/')
    return render_template('new_album.html', form=form)
if __name__ == '__main__':
    app.run()
No doubt you have already peppered your source code with print() statements and found nothing illuminating. Cached rows in the DB model might be the aspect that is hard to understand, here, and logging sqlite calls would shed light on that.
Use this:
import logging
logging.basicConfig()  # attach a handler so INFO-level messages are actually printed
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
It's noisy, but it will show when rows hit the backend DB and when they are retrieved.
Get in the habit of repeatedly issuing debug queries like this, so you know for sure what has been persisted:
$ echo 'select * from album;' | sqlite3 music.db
For repeatable testing, it can be convenient to copy the database file to a backup location, and then cp that frozen snapshot on top of the active file before each test run. It's important that the running flask app be restarted after such copying. Setting FLASK_DEBUG=1 can help with that.
Also, jeverling suggests using SQLAlchemyDebugPanel.
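The same "what has actually been persisted?" check can be run from Python with the standard-library sqlite3 module. This is only a sketch: it uses an in-memory database with an invented two-column album table so that it runs on its own, and dump_table is a hypothetical helper. For the real check you would connect to the music.db file instead.

```python
import sqlite3

def dump_table(conn, table):
    """Return every row of `table` as a list of tuples."""
    # Table names cannot be bound as parameters, so only pass trusted names here.
    return conn.execute(f"SELECT * FROM {table}").fetchall()

# In-memory database so the sketch is self-contained; use "music.db" for the real file.
conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema; the real album table has more columns.
conn.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO album (title) VALUES ('Example Title')")
conn.commit()

rows = dump_table(conn, "album")
print(rows)
```

If the album shows up here but not in the web app, the problem is on the query side of the view rather than in how the row was saved.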

Resources