Modifying ForeignKeyConstraint schema in Alembic post-processing - python-3.x

I'm using process_revision_directives to apply some post-processing to the operations generated against a reference schema. The step I'm stuck on is removing the Postgres schema from the generated operations, so it can be set generically at runtime using the answer from another question.
The code below correctly removes the schema from all operations except for ForeignKeyConstraint objects inside a CreateTableOp.
from itertools import chain

import sqlalchemy as sa
from alembic.operations import ops

def process_foreign_key(col: sa.ForeignKeyConstraint):
    col.referred_table.schema = None  # Doesn't work
    return col

def process_revision_directives(context, revision, directives):
    # Remove the schema from the generated operations
    for op in chain(directives[0].upgrade_ops.ops, directives[0].downgrade_ops.ops):
        if isinstance(op, ops.CreateTableOp):
            op.columns = [
                process_foreign_key(col) if isinstance(col, sa.ForeignKeyConstraint) else col
                for col in op.columns
            ]
            op.schema = None
This currently generates output like:
op.create_table('user',
    sa.Column('id', sa.Integer, nullable=False),
    sa.ForeignKeyConstraint(['id'], ['reference_schema.group.id'], name='group_group_id', onupdate='CASCADE', ondelete='CASCADE'),
)
Any ideas on how I should modify these constraint objects so the target table doesn't carry the reference_schema. prefix?

If you look into the rendering chain you can find where the last schema reference lives: it's on op._orig_table, and the important detail is that it appears on that table twice.
Put the following in your for loop.
op._orig_table.schema = None
op._orig_table = op._orig_table.tometadata(clear_meta)
where clear_meta is a MetaData object with no schema, such as
clear_meta = sa.MetaData(bind=session.connection(), schema=None)
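Putting both pieces together, a minimal sketch of the full loop (synthesized from the snippets above; tometadata() is the SQLAlchemy 1.3 name, renamed to to_metadata() in 1.4):
clear_meta = sa.MetaData()  # a MetaData with no schema set

def process_revision_directives(context, revision, directives):
    for op in chain(directives[0].upgrade_ops.ops, directives[0].downgrade_ops.ops):
        if isinstance(op, ops.CreateTableOp):
            op.schema = None
            # The schema lives on _orig_table twice: once directly...
            op._orig_table.schema = None
            # ...and once more internally, which copying the table onto
            # a schema-less MetaData clears
            op._orig_table = op._orig_table.tometadata(clear_meta)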


Can you make sure only one object related to another object has a certain field set?

I have a model called Video, and it has related objects on another model called Label. Example here:
class Video(models.Model):
    pass

class Label(models.Model):
    video = models.ForeignKey(Video, related_name="labels", on_delete=models.CASCADE)
    current = models.NullBooleanField()
I need to be able to find the current label on a video by doing something like my_video.labels.filter(current=True), and this query should only ever return one label, so only one label on the video should have that field set to True.
Is there a way of ensuring this on the model/db?
Thanks
EDIT: The answer given below has achieved exactly this. Adding some Django tests below for anyone else reading, as proof:
from django.db import IntegrityError
from django.test import TestCase

class TestLabelIntegrity(TestCase):
    def test_a_video_can_have_only_one_current_label(self):
        video = Video.objects.create()
        label_1 = Label.objects.create(
            video=video,
            current=True
        )
        with self.assertRaises(IntegrityError):
            label_2 = Label.objects.create(
                video=video,
                current=True
            )

    def test_two_different_videos_can_each_have_current_labels(self):
        """No assertions needed, just need to make sure no integrity errors are raised"""
        video_1 = Video.objects.create()
        label_1 = Label.objects.create(
            video=video_1,
            current=True
        )
        video_2 = Video.objects.create()
        label_2 = Label.objects.create(
            video=video_2,
            current=True
        )
I believe you can solve this using UniqueConstraint.
Using this, you can ensure that a Video only ever has a single Label where current == True.
You define the UniqueConstraint in the model's Meta.
You'll get a database IntegrityError on save() if the condition fails.
See the documentation for this here:
https://docs.djangoproject.com/en/4.0/ref/models/constraints/
from django.db.models import Q

class Label(models.Model):
    ...

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["current", "video"],
                condition=Q(current=True),
                name="unique_current_label",
            ),
        ]
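One usage note: with a partial unique constraint like this, switching a video's current label means clearing the old flag before setting the new one. A minimal sketch (set_current_label is an illustrative helper, not part of the answer; the writes are wrapped in a transaction so the pair stays atomic):
from django.db import transaction

def set_current_label(video, new_label):
    with transaction.atomic():
        # Clear any existing current label first, so the partial unique
        # constraint isn't violated when the new one is saved
        video.labels.filter(current=True).update(current=False)
        new_label.current = True
        new_label.save()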

How to Save all models changes in one query on Django

I'm trying to modify many instances of a model (like the User model), and the change is different for each instance (so the update() QuerySet method doesn't work for my scenario).
For example, some users need their first_name changed and some need their last_name changed, and I fetch the users like: all_user = User.objects.all()
If I call the save method on each instance after changing it, Django sends one query per save.
How can I save all the changes to the database in one query, instead of looping over the models and saving them one by one?
Given the comment from @iklinac, I would thoroughly recommend using Django's own approach to bulk updates, detailed here.
It's quite similar to my original answer, below, but it looks like the functionality is now built in.
# bulk_update(objs, fields, batch_size=None)
>>> objs = [
... Entry.objects.create(headline='Entry 1'),
... Entry.objects.create(headline='Entry 2'),
... ]
>>> objs[0].headline = 'This is entry 1'
>>> objs[1].headline = 'This is entry 2'
>>> Entry.objects.bulk_update(objs, ['headline'])
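Applied to the question's scenario, where different users need different fields changed, a minimal sketch might look like this (needs_first_name_fix is a hypothetical predicate standing in for whatever decides which field to change):
users = list(User.objects.all())
for user in users:
    if needs_first_name_fix(user):  # hypothetical predicate
        user.first_name = 'New first name'
    else:
        user.last_name = 'New last name'

# One batched UPDATE per chunk instead of one query per instance
User.objects.bulk_update(users, ['first_name', 'last_name'])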
Original answer
There's a package called django-bulk-update, which is similar to bulk_create, which is built into Django.
An example of where I use this, is part of an action in an admin class;
from django_bulk_update.helper import bulk_update

@admin.register(Token)
class TokenAdmin(admin.ModelAdmin):
    list_display = (
        'id',
        'type',
    )
    actions = (
        'set_type_charity',
    )

    def set_type_charity(self, request, queryset):
        for token in queryset:
            token.type = Token.Type.CHARITY
        bulk_update(
            queryset,
            update_fields=['type', 'modified'],
            batch_size=1000
        )
Usage, taken from their README:
With manager:
import random

from django_bulk_update.manager import BulkUpdateManager
from tests.models import Person

class Person(models.Model):
    ...
    objects = BulkUpdateManager()

random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()
for person in people:
    person.name = random.choice(random_names)

Person.objects.bulk_update(people, update_fields=['name'])  # updates only name column
Person.objects.bulk_update(people, exclude_fields=['username'])  # updates all columns except username
Person.objects.bulk_update(people)  # updates all columns
Person.objects.bulk_update(people, batch_size=50000)  # updates all columns in 50000-sized chunks
With helper:
import random

from django_bulk_update.helper import bulk_update
from tests.models import Person

random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()
for person in people:
    person.name = random.choice(random_names)

bulk_update(people, update_fields=['name'])  # updates only name column
bulk_update(people, exclude_fields=['username'])  # updates all columns except username
bulk_update(people, using='someotherdb')  # updates all columns using the given db
bulk_update(people)  # updates all columns using the default db
bulk_update(people, batch_size=50000)  # updates all columns in 50000-sized chunks using the default db

Can Marshmallow auto-convert dot-delimited fields to nested JSON/dict in combination with unknown=EXCLUDE?

When load()-ing data whose field names are dot-delimited, using unknown=INCLUDE auto-converts them to nested dicts (which is what I want); however, I'd like to do this with unknown=EXCLUDE, as my data has a lot of properties I don't want to deal with.
It appears that with unknown=EXCLUDE this auto-conversion does not happen, and the dot-delimited field itself is passed to the schema, which of course is not recognized. This is confirmed by omitting the unknown= param entirely, which raises a ValidationError.
Is it possible to combine unknown=EXCLUDE and still get nested data? Or is there a better way to deal with this situation?
Thanks in advance!
# using marshmallow v3.7.1
from marshmallow import Schema, fields, INCLUDE, EXCLUDE

data = {'LEVEL1.LEVEL2.LEVEL3': 'FooBar'}

class Level3Schema(Schema):
    LEVEL3 = fields.String()

class Level2Schema(Schema):
    LEVEL2 = fields.Nested(Level3Schema)

class Level1Schema(Schema):
    LEVEL1 = fields.Nested(Level2Schema)

schema = Level1Schema()

print(schema.load(data, unknown=INCLUDE))
# prints: {'LEVEL1': {'LEVEL2': {'LEVEL3': 'FooBar'}}}
print(schema.load(data, unknown=EXCLUDE))
# prints: {}
print(schema.load(data))
# raises: marshmallow.exceptions.ValidationError: {'LEVEL1.LEVEL2.LEVEL3': ['Unknown field.']}
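One possible approach, sketched under the assumption that it's acceptable to expand the dotted keys yourself before validation: a @pre_load hook (a standard marshmallow v3 decorator) can split the dot-delimited names into nested dicts, after which unknown=EXCLUDE behaves normally. The expand_dotted_keys name is illustrative, not a marshmallow API:
from marshmallow import Schema, fields, EXCLUDE, pre_load

class Level1Schema(Schema):
    LEVEL1 = fields.Nested(Level2Schema)

    @pre_load
    def expand_dotted_keys(self, data, **kwargs):
        # Turn {'A.B.C': v} into {'A': {'B': {'C': v}}} before validation
        result = {}
        for key, value in data.items():
            *parents, leaf = key.split('.')
            target = result
            for part in parents:
                target = target.setdefault(part, {})
            target[leaf] = value
        return result

schema = Level1Schema()
print(schema.load(data, unknown=EXCLUDE))
# expected: {'LEVEL1': {'LEVEL2': {'LEVEL3': 'FooBar'}}}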

SQLAthanor: serialize to json only specific fields

Is there a way to serialize a SQLAlchemy model including only specific fields using SQLAthanor? The documentation doesn't mention it, so the only way I've figured out is to filter the result manually.
So, this line with sqlathanor
return jsonify([{k: v for k, v in user.to_dict().items()
                 if k in ['username', 'name', 'surname', 'email']} for user in users])
is equivalent to this one using Marshmallow
return jsonify(SchemaUser(only=('username', 'name', 'surname', 'email')).dump(users, many=True))
Once again, is there a built-in method in SQLAthanor to do this?
Adapting my answer from the related Github issue:
The only way that you can change the list of serialized fields without adjusting the instance’s configuration is to manually adjust the results of to_<FORMAT>(). Your code snippet is one way to do that, although for JSON and YAML you can also supply a custom serialize_function which accepts the dict, processes it, and serializes to JSON or YAML as appropriate:
import simplejson as json

def my_custom_serializer(value, **kwargs):
    filtered_dict = {}
    filtered_dict['username'] = value['username']
    # repeat pattern for other fields
    return json.dumps(filtered_dict)

json_result = user.to_json(serialize_function=my_custom_serializer)
Both approaches are effectively the same, but the serialize_function approach gives you more flexibility for more complex adjustments to your serialized output and (I think) easier-to-read/maintain code (though if all you're doing is adjusting the fields included, your snippet is already quite readable).
You can generalize the serialize_function as well. So if you want to give it a list of fields to include, just include them as a keyword argument in to_json():
def my_custom_serializer(value, **kwargs):
    filter_fields = kwargs.pop("filter_fields", None)
    result = {}
    for field in filter_fields:
        result[field] = value.get(field, None)
    return json.dumps(result)

result = [x.to_json(serialize_function=my_custom_serializer, filter_fields=['username', 'name', 'surname', 'email']) for x in users]

SQLAlchemy scoped_session is not getting latest data from DB

I'm rather new to the whole ORM topic, and I've already searched forums and docs.
The question is about a flask application with SQLAlchemy as ORM for the PostgreSQL.
The __init__.py contains the following line:
db = SQLAlchemy()
The created object is referenced in the other files to access the DB.
There is a save function for the model:
def save(self):
    db.session.add(self)
    db.session.commit()
and also an update function:
from sqlalchemy.orm.attributes import flag_modified

def update(self):
    for var_name in self.__dict__.keys():
        if var_name not in ('_sa_instance_state', 'id', 'foreign_id'):
            # Workaround for JSON update problem
            flag_modified(self, var_name)
    db.session.merge(self)
    db.session.commit()
The problem occurs when I'm trying to save a new object. The save function writes it to DB, it's visible when querying the DB directly (psql, etc.), but a following ORM query like:
model_list = db.session.query(MyModel).filter(MyModel.foreign_id == this_id).all()
gives an empty response.
A call to the update function works as expected; new data is then visible when requesting it with the ORM.
I'm always using the same session object, for example:
<sqlalchemy.orm.scoping.scoped_session object at 0x7f0cff68fda0>
If the application is restarted, everything works fine until a new object is created and then requested with the ORM.
An inelegant workaround is using raw SQL like:
model_list = db.session.execute('SELECT * FROM models_table WHERE foreign_id = ' + str(this_id))
which gives a ResultProxy with the latest data, like this:
<sqlalchemy.engine.result.ResultProxy object at 0x7f0cf74d0390>
I think my problem is a misunderstanding of the session. Can anyone help me?
It turned out that the problem had nothing to do with the session, but with the filter() method:
# Necessary import for passing a string to the filter() function
from sqlalchemy import text

# Solution or workaround
model_list = db.session.query(MyModel).filter(text('foreign_id = ' + str(this_id))).all()
I could not figure out the problem with filter(MyModel.foreign_id == this_id), but that's another problem.
I think this way is better than executing raw SQL.
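As a side note, the text() workaround can use a bound parameter instead of string concatenation, which is safer against SQL injection. A minimal sketch, using the standard bindparams() method of SQLAlchemy's TextClause:
from sqlalchemy import text

# Same query, but the driver handles quoting of the value
stmt = text('foreign_id = :fid').bindparams(fid=this_id)
model_list = db.session.query(MyModel).filter(stmt).all()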
