Column-name mapping in SQLAlchemy - python-3.x

I'm just two days into learning SQLAlchemy, and have a question regarding mapping of column names. First off, the following statement in the official docs threw me off balance:
Matching columns on name works for simple cases but can become
unwieldy when dealing with complex statements that contain duplicate
column names or when using anonymized ORM constructs that don’t easily
match to specific names. Additionally, there is typing behavior
present in our mapped columns that we might find necessary when
handling result rows.
Ugh, what?! An example or two of what this paragraph is trying to say would have been great, but never mind; it's a task for another day.
I kind of understood that sometimes the field list generated by our query will not map perfectly with the model, so we can specify expected columns positionally. This got me trying to think up an example where this mapping might be useful. So I thought, how about selecting a column from the table that is calculated on the fly? I simulated this in a sqlite database using the random() function like this:
from sqlalchemy import create_engine, text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import sessionmaker
engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()
Session = sessionmaker(bind=engine)
session = Session()
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    password = Column(String)
    def __repr__(self):
        return "User<name={}, fullname={}, password={}>".format(self.name, self.fullname, self.password)
# Create all tables
Base.metadata.create_all(engine)
ed_user = User(name='ed', fullname='Ed Jones', password='edspassword')
session.add(ed_user)
session.add_all([
    User(name='wendy', fullname='Wendy Williams', password='foobar'),
    User(name='mary', fullname='Mary Contrary', password='xxg527'),
    User(name='fred', fullname='Fred Flinstone', password='blah')
])
session.commit()
stmt = text("SELECT name, id, fullname, random() FROM users WHERE name=:name")
stmt = stmt.columns(User.name, User.id, User.fullname ...
Now what? I can't write a column name because none exists, and I can't add it to the model definition because then the column would be created in the table unnecessarily. All I wanted was to get some extra information from the database (like NOW()) that I can use in code.
How can this be done?

You can use column labels
from sqlalchemy import func
users = session.query(User.id, User.name, User.fullname, func.random().label('random_value'))
for user in users:
    print(user.id, user.name, user.fullname, user.random_value)

Try explicitly listing column objects. This way "random()" can be any arbitrary SQL calculated column...
# You may have already imported column, but for completeness:
from sqlalchemy import column
# As an example:
find_name = 'wendy'
stmt = text("SELECT name, id, fullname, (random()) as random_value FROM users WHERE name=:name")
stmt = stmt.columns(column('name'), column('id'), column('fullname'), column('random_value'))
users = session.query(column('name'), column('id'), column('fullname'), column('random_value')).from_statement(stmt).params(name=find_name)
for user in users:
    print(user.id, user.name, user.fullname, user.random_value)

All right, so I seem to have figured this out myself (damn, that's twice in a day! The SQLAlchemy community here seems dead, which is forcing me to search for answers myself ... not a very bad thing, though!). Here's how I did it:
from sqlalchemy import func
session.query(User.name, User.id, User.fullname, func.current_timestamp()).filter(User.name=='ed').all()
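For anyone who wants to try this end to end, here is a minimal self-contained sketch of the labelled-column approach. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the trimmed-down User model and the label name random_value are only for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class User(Base):
    # Trimmed-down version of the model from the question.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(User(name="ed", fullname="Ed Jones"))
session.commit()

# The computed column is not part of the model; label() gives it a name
# that can be used as an attribute on each result row.
rows = (
    session.query(User.name, User.fullname, func.random().label("random_value"))
    .filter(User.name == "ed")
    .all()
)
for row in rows:
    print(row.name, row.fullname, row.random_value)
```

Nothing is added to the users table here: the extra value exists only in the result rows.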

Related

Questions about SQLAlchemy relationships

I'm trying to understand the nature and usage of .relationship and .ForeignKey in SQLalchemy.
In every example I see, they seem to be tied to a variable that is often not referenced anywhere, making it a dead variable, and the .relationship backref value is also often not referenced, so the whole thing seems arbitrary and difficult to understand.
Here is a random example I pulled from online.
The best that I can see is that
excuses = db.relationship('Excuse', backref='student',
                          lazy='dynamic')
provides a one-to-many link between the Student model and the Excuse model, with 'Excuse' being the 'many' and backref='student' being the 'one'. However, the excuses variable that it is attached to is not referenced in the foreign key or anywhere else, so I don't know how it comes into play. I would be able to understand better with a visual diagram of how they interact, but I haven't been able to find such a thing.
student_id = db.Column(db.Integer, db.ForeignKey('students.id'))
Creates a variable with the student ID from the Student model but it seems to do so without the need for the .relationship statement in the Student model.
There is no reference here to the .relationship syntax in the Student model. It does provide a link from Excuse to Student but I don't understand the point of the
excuses = db.relationship('Excuse', backref='student',
                          lazy='dynamic')
clause as it doesn't seem to do anything.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_migrate import Migrate
class Student(db.Model):
    __tablename__ = "students" # table name will default to name of the model
    # Create the three columns for our table
    id = db.Column(db.Integer, primary_key=True)
    first_name = db.Column(db.Text)
    last_name = db.Column(db.Text)
    excuses = db.relationship('Excuse', backref='student',
                              lazy='dynamic')
    # define what each instance or row in the DB will have (id is taken care of for you)
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
    # this is not essential, but a valuable method to overwrite as this is what we will see when we print out an instance in a REPL.
    def __repr__(self):
        return f"The student's name is {self.first_name} {self.last_name}"
class Excuse(db.Model):
    __tablename__ = "excuses"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Text)
    is_believable = db.Column(db.Boolean)
    # remember - the name of our table is "students"
    student_id = db.Column(db.Integer, db.ForeignKey('students.id'))
    def __init__(self, name, is_believable, student_id):
        self.name = name
        self.is_believable = is_believable
        self.student_id = student_id
elie = Student('Elie', 'Schoppik')
matt = Student('Matt', 'Lane')
michael = Student('Michael', 'Hueter')
db.session.add_all([elie, matt, michael])
db.session.commit()
len(Student.query.all()) # 3
elie = Student.query.get(1)
excuse1 = Excuse('My homework ate my dog', False, 1)
db.session.add(excuse1)
db.session.commit()
elie.excuses.all() # list of excuses
elie.excuses.first().is_believable # False
Excuse.query.get(1).student # The student's name is Elie Schoppik
excuse2 = Excuse('I overslept', True, 1)
db.session.add(excuse2)
db.session.commit()
len(elie.excuses.all()) # 2
Does anyone mind explaining the dynamic between .relationship and .ForeignKey and how they interact? As well as the point of the variables they are attached to and how the backref variable is used?
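As a rough illustration of how the two pieces divide the work, here is a minimal sketch. It assumes plain SQLAlchemy 1.4 or later instead of Flask-SQLAlchemy, an in-memory SQLite database, and trimmed-down versions of the models above. The ForeignKey creates a real column and constraint in the excuses table, while relationship() creates no column at all: it only adds Python attributes (excuses on Student, and student on Excuse via the backref) and relies on the foreign key to work out how to join the two tables.

```python
from sqlalchemy import Column, ForeignKey, Integer, Text, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class Student(Base):
    __tablename__ = "students"
    id = Column(Integer, primary_key=True)
    first_name = Column(Text)
    # ORM-side attribute only: no column is created in the database for it.
    # backref="student" also adds an Excuse.student attribute.
    excuses = relationship("Excuse", backref="student")


class Excuse(Base):
    __tablename__ = "excuses"
    id = Column(Integer, primary_key=True)
    name = Column(Text)
    # Database-side link: a real column with a FOREIGN KEY constraint.
    student_id = Column(Integer, ForeignKey("students.id"))


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

elie = Student(first_name="Elie")
# Appending through the relationship: student_id is never set by hand,
# the ORM fills it in when the objects are flushed.
elie.excuses.append(Excuse(name="My homework ate my dog"))
session.add(elie)
session.commit()

excuse = session.query(Excuse).one()
print(excuse.student_id == elie.id)  # foreign key column populated by the ORM
print(excuse.student is elie)        # attribute created by the backref
```

So the excuses variable is not dead: it is what makes elie.excuses work in the session above, and the backref name is what makes Excuse.query.get(1).student work.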

Best practice for relating a SQLAlchemy model to additional runtime data?

I'm going to use SQLAlchemy ORM for my model of user in Python.
class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user_id'))
Additionally, I want to attach to each User some data that shouldn't be saved in the database, because it only has meaning at runtime, such as references to some objects (asyncio.Lock(), asyncio.Task()).
Of course, when I request data from the database, a new object will be returned. So my question is: what is the best practice for relating such User(Base) objects to additional data that exists only while my app is running?
I want something like this, but I wonder whether there is a proper pattern or a more elegant solution. Here I need to store some key, such as user_id, to relate a User extracted from the DB to an instance of UserRuntimeData.
class UserRuntimeData():
    def __init__(self, ...):
        self.user_id = ...
        self._lock = ...
        self._current_task = ...
    def compare(self, user: User) -> bool:
        if user.user_id == self.user_id:
            return True
        return False
UPD. I also thought about serializing the runtime data to bytes, saving it in the DB, and then just extracting the whole User info from there, but I think that's kind of a mess.

How to save all model changes in one query in Django

I'm trying to modify many instances of a model (such as the User model), and the changes differ from instance to instance, so the QuerySet update() method doesn't work for my scenario.
For example, some users need their first_name changed and others their last_name, and I fetch the users like this: all_user = User.objects.all()
I think that if I call save() on each instance after changing it, Django sends one query per save!
How can I save all the changes to the database in one query, instead of looping over the models and saving them one by one?
Given the comment from @iklinac, I would thoroughly recommend using Django's own approach to bulk updates, QuerySet.bulk_update(), as detailed in the Django documentation.
It's quite similar to my original answer, below, but it looks like the functionality is now built in.
# bulk_update(objs, fields, batch_size=None)
>>> objs = [
... Entry.objects.create(headline='Entry 1'),
... Entry.objects.create(headline='Entry 2'),
... ]
>>> objs[0].headline = 'This is entry 1'
>>> objs[1].headline = 'This is entry 2'
>>> Entry.objects.bulk_update(objs, ['headline'])
Original answer
There's a package called django-bulk-update, which is similar to the bulk create that is built into Django.
An example of where I use this is part of an action in an admin class:
@admin.register(Token)
class TokenAdmin(admin.ModelAdmin):
    list_display = (
        'id',
        'type'
    )
    actions = (
        'set_type_charity',
    )
    def set_type_charity(self, request, queryset):
        for token in queryset:
            token.type = Token.Type.CHARITY
        bulk_update(
            queryset,
            update_fields=['type', 'modified'],
            batch_size=1000
        )
Usage, taken from their readme;
With manager:
import random
from django_bulk_update.manager import BulkUpdateManager
from tests.models import Person
class Person(models.Model):
    ...
    objects = BulkUpdateManager()
random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()
for person in people:
    person.name = random.choice(random_names)
Person.objects.bulk_update(people, update_fields=['name']) # updates only name column
Person.objects.bulk_update(people, exclude_fields=['username']) # updates all columns except username
Person.objects.bulk_update(people) # updates all columns
Person.objects.bulk_update(people, batch_size=50000) # updates all columns by 50000 sized chunks
With helper:
import random
from django_bulk_update.helper import bulk_update
from tests.models import Person
random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()
for person in people:
    person.name = random.choice(random_names)
bulk_update(people, update_fields=['name']) # updates only name column
bulk_update(people, exclude_fields=['username']) # updates all columns except username
bulk_update(people, using='someotherdb') # updates all columns using the given db
bulk_update(people) # updates all columns using the default db
bulk_update(people, batch_size=50000) # updates all columns by 50000 sized chunks using the default db

Insert a nested schema into a database with fastAPI?

I have recently come to know about fastAPI and worked my way through the tutorial and other docs. Although fastAPI is pretty well documented, I couldn't find information about how to process a nested input when working with a database.
For testing, I wrote a very small family API with two models:
class Member(Base):
    __tablename__ = 'members'
    id = Column(Integer, primary_key=True, server_default=text("nextval('members_id_seq'::regclass)"))
    name = Column(String(128), nullable=False)
    age = Column(Integer, nullable=True)
    family_id = Column(Integer, ForeignKey('families.id', deferrable=True, initially='DEFERRED'), nullable=False, index=True)
    family = relationship("Family", back_populates="members")
class Family(Base):
    __tablename__ = 'families'
    id = Column(Integer, primary_key=True, server_default=text("nextval('families_id_seq'::regclass)"))
    family_name = Column(String(128), nullable=False)
    members = relationship("Member", back_populates="family")
and I created a Postgres database with two tables and the relations described here. With schema definitions and a crud file as in the fastAPI tutorial, I can create individual families and members and view them in a nested fashion with a get request. Here is the nested schema:
class Family(FamilyBase):
    id: int
    members: List[Member]
    class Config:
        orm_mode = True
So far, so good. Now, I would like to add a post view which accepts the nested structure as input and populates the database accordingly. The documentation at https://fastapi.tiangolo.com/tutorial/body-nested-models/ shows how to do this in principle, but it misses the database (i.e. crud) part.
As the input will not have id fields and obviously doesn't need to specify family_id, I have a MemberStub schema and the NestedFamilyCreate schema as follows:
class MemberStub(BaseModel):
    name: str
    age: int
class NestedFamilyCreate(BaseModel):
    family_name: str
    members: List[MemberStub]
In my routing routine families.py I have:
@app.post('/nested-families/', response_model=schemas.Family)
def create_family(family: schemas.NestedFamilyCreate, db: Session = Depends(get_db)):
    # no check for previous existence as names can be duplicates
    return crud.create_nested_family(db=db, family=family)
(the response_model points to the nested view of a family with all members including all ids; see above).
What I cannot figure out is how to write the crud.create_nested_family routine. Based on the simple create as in the tutorial, this looks like:
def create_nested_family(db: Session, family: schemas.NestedFamilyCreate):
    # split information in family and members
    members = family.members
    core_family = None # ??? This is where I get stuck
    db_family = models.Family(**family.dict()) # This fails
    db.add(db_family)
    db.commit()
    db.refresh(db_family)
    return db_family
So, I can extract the members and loop through them, but I would first need to create a new db_family record, which must not contain the members. Then, with db.refresh, I would get the new family_id back, which I could add to each member record. If I understand what is required here, I would need to map my nested schema onto a plain schema for FamilyCreate (which works by itself) and a plain schema for MemberCreate (which also works by itself). But how can I do this?
I found a solution after re-reading about Pydantic models and their mapping to dict.
In crud.py:
def create_nested_family(db: Session, family: schemas.NestedFamilyCreate):
    # split information in family and members
    family_data = family.dict()
    member_data = family_data.pop('members', None) # ToDo: handle error if no members
    db_family = models.Family(**family_data)
    db.add(db_family)
    db.commit()
    db.refresh(db_family)
    # get family_id
    family_id = db_family.id
    # add members
    for m in member_data:
        m['family_id'] = family_id
        db_member = models.Member(**m)
        db.add(db_member)
        db.commit()
        db.refresh(db_member)
    return db_family
I hope this is useful to someone else.
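A possible variation, sketched here under assumptions and not taken from the FastAPI tutorial, is to let the members relationship assign family_id, so that the family and all of its members are written in a single commit. The sketch assumes SQLAlchemy 1.4 or later, uses simplified models on an in-memory SQLite database (no Postgres sequences or deferrable constraints), and passes a plain dict standing in for NestedFamilyCreate(...).dict().

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class Family(Base):
    __tablename__ = "families"
    id = Column(Integer, primary_key=True)
    family_name = Column(String(128), nullable=False)
    members = relationship("Member", back_populates="family")


class Member(Base):
    __tablename__ = "members"
    id = Column(Integer, primary_key=True)
    name = Column(String(128), nullable=False)
    age = Column(Integer, nullable=True)
    family_id = Column(Integer, ForeignKey("families.id"), nullable=False)
    family = relationship("Family", back_populates="members")


def create_nested_family(db, family_data):
    # Work on a copy so the caller's dict is left untouched.
    data = dict(family_data)
    member_data = data.pop("members", [])
    db_family = Family(**data)
    # Putting Member objects into the relationship lets the ORM insert the
    # family first and then fill in family_id on every member.
    db_family.members = [Member(**m) for m in member_data]
    db.add(db_family)
    db.commit()
    db.refresh(db_family)
    return db_family


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
db = sessionmaker(bind=engine)()

payload = {  # stands in for NestedFamilyCreate(...).dict()
    "family_name": "Smith",
    "members": [{"name": "Ann", "age": 40}, {"name": "Bob", "age": 8}],
}
family = create_nested_family(db, payload)
print(family.id, [(m.name, m.family_id) for m in family.members])
```

With this shape there is no need to read family_id back before creating the members, and a failure partway through leaves nothing half-written because everything belongs to one transaction.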

Why doesn't Peewee fill in my object's id?

I am trying to build a database driver for Peewee and I'm having trouble getting the save() method to fill in the primary key/id for objects. Here's some sample code:
from datetime import date
from peewee import BooleanField
from peewee import CharField
from peewee import DateField
from peewee import ForeignKeyField
from peewee import IntegerField
from peewee import Model
from SQLRelay import PySQLRDB
from sqlrelay_ext import SQLRelayDatabase
DB = SQLRelayDatabase('test2', host='<host>', user='<un>', password='<pwd>')
class BaseModel(Model):
    class Meta:
        database = DB
class Person(BaseModel):
    name = CharField()
    birthday = DateField()
    is_relative = BooleanField()
class Pet(BaseModel):
    owner = ForeignKeyField(Person, backref='pets')
    name = CharField()
    animal_type = CharField()
DB.connect()
Person.create_table(safe=False)
Pet.create_table(safe=False)
uncle_bob = Person(name='Bob', birthday=date(1960, 1, 15), is_relative=True)
uncle_bob.save() # bob is now stored in the database
print('Uncle Bob id: {}'.format(uncle_bob.id))
print('Uncle Bob _pk: {}'.format(uncle_bob._pk))
Both uncle_bob.id and uncle_bob._pk are None after .save(). From digging into the peewee.py code, it seems that the _WriteQuery.execute() method is supposed to set the _pk attribute, but that isn't happening. My best guess is that the cursor implementation isn't acting properly. Does anyone have more insight than this that can maybe help me track down this problem?
Thanks!
Edit to answer:
For SQL Server, the following code allows you to return the last inserted id:
def last_insert_id(self, cursor, query_type=None):
    try:
        cursor.execute('SELECT SCOPE_IDENTITY()')
        result = cursor.fetchone()
        return result[0]
    except (IndexError, KeyError, TypeError):
        pass
In your SQLRelayDatabase implementation, you will probably need to correctly implement the last_insert_id() method. For python db-api 2.0 drivers, this typically looks like cursor.lastrowid.
The default implementation is:
def last_insert_id(self, cursor, query_type=None):
    return cursor.lastrowid
Where cursor is the cursor object used to execute the insert query.
Databases like PostgreSQL do not implement this. Instead, you execute an INSERT...RETURNING query, so the Postgres implementation is a bit different: it ensures that your insert query includes a RETURNING clause, and then grabs the id returned.
Depending on your DB and the underlying DB-driver, you'll need to pull that last insert id out somehow. Peewee should handle the rest assuming last_insert_id() is implemented.
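To check what a given driver reports, it can help to look at cursor.lastrowid directly, outside Peewee. The sketch below uses the standard-library sqlite3 module purely as a stand-in for a DB-API 2.0 driver; whether the SQLRelay cursor exposes lastrowid in the same way is an assumption that would need to be verified against that driver.

```python
import sqlite3


def last_insert_id(cursor):
    # Same idea as the default implementation above: rely on the
    # driver's lastrowid attribute after an INSERT.
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
cursor.execute("INSERT INTO person (name) VALUES (?)", ("Bob",))
new_id = last_insert_id(cursor)
conn.commit()
print(new_id)
```

If the same experiment against the SQLRelay cursor yields None or raises AttributeError, that points to the driver-specific query (such as SELECT SCOPE_IDENTITY() on SQL Server) being needed inside last_insert_id().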
