Is there a python-alembic way to convert data between dropping and adding a column? - python-3.x

I have a sqlite3 database that I access with SQLAlchemy in Python 3.
I want to add a new column and drop an old one with the database migration tool Alembic. A simple example:
class Model(_Base):
    __tablename__ = 'Model'
    _oid = sa.Column('oid', sa.Integer, primary_key=True)
    _number_int = sa.Column('number_int', sa.Integer)
After the migration it should look like this:
class Model(_Base):
    __tablename__ = 'Model'
    _oid = sa.Column('oid', sa.Integer, primary_key=True)
    _number_str = sa.Column('number_str', sa.String(length=30))
The relevant point here is that there is data in _number_int that should be converted into _number_str like this:
number_conv = {1: 'one', 2: 'two', 3: 'three'}
_number_str = number_conv[_number_int]
Is there an Alembic way to take care of that? In other words, does Alembic itself handle cases like this as part of its concept/design?
I want to know whether I can use Alembic tools for this or whether I have to write my own extra code.
Of course the original data is a little bit more complex to convert. This is just an example here.

See the Alembic operation reference. It has a bulk_insert() method for bulk-inserting content, but nothing for migrating existing content, so Alembic does not appear to have this built in. You can implement the data migration yourself, though.
One possible approach is described in the article "Migrating content with alembic". You define an intermediate table inside your migration file that contains both columns (number_int and number_str):
import sqlalchemy as sa
model_helper = sa.Table(
    'Model',
    sa.MetaData(),
    sa.Column('oid', sa.Integer, primary_key=True),
    sa.Column('number_int', sa.Integer),
    sa.Column('number_str', sa.String(length=30)),
)
Then use this intermediate table to migrate the data from the old column to the new one:
from alembic import op
def upgrade():
    # add the new column first
    op.add_column(
        'Model',
        sa.Column(
            'number_str',
            sa.String(length=30),
            nullable=True
        )
    )
    # get the connection Alembic is currently using
    connection = op.get_bind()
    # At this point the old column has not been dropped yet and the
    # new column is already present, so now is the time to run the
    # content migration. We use the connection to fetch every row of
    # the table, convert each number and update the row, which is
    # identified by its oid.
    number_conv = {1: 'one', 2: 'two', 3: 'three'}
    for item in connection.execute(model_helper.select()):
        connection.execute(
            model_helper.update().where(
                model_helper.c.oid == item.oid
            ).values(
                number_str=number_conv[item.number_int]
            )
        )
    # now that all data is migrated we can drop the old column
    # without losing any data
    op.drop_column('Model', 'number_int')
This approach is a bit noisy (you need to define the table manually), but it works.
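To see the add/convert/drop sequence in isolation, here is a minimal sketch written against the standard-library sqlite3 module instead of Alembic. The function name migrate and the in-memory database are assumptions for illustration only. Note that SQLite only supports ALTER TABLE ... DROP COLUMN from version 3.35.0 onwards, so the sketch skips the drop on older versions; inside an Alembic migration, batch mode (op.batch_alter_table) is the usual way around that limitation.

```python
import sqlite3

NUMBER_CONV = {1: "one", 2: "two", 3: "three"}


def migrate(conn):
    """Hypothetical helper: add number_str, convert the data, drop number_int."""
    # 1. add the new column first
    conn.execute('ALTER TABLE "Model" ADD COLUMN number_str VARCHAR(30)')
    # 2. convert the existing data row by row, keyed on the primary key
    rows = conn.execute('SELECT oid, number_int FROM "Model"').fetchall()
    conn.executemany(
        'UPDATE "Model" SET number_str = ? WHERE oid = ?',
        [(NUMBER_CONV.get(number_int), oid) for oid, number_int in rows],
    )
    # 3. drop the old column (only possible on SQLite >= 3.35.0)
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        conn.execute('ALTER TABLE "Model" DROP COLUMN number_int')
    conn.commit()


# Demo on an in-memory database (an assumption, not the asker's real data).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Model" (oid INTEGER PRIMARY KEY, number_int INTEGER)')
conn.executemany('INSERT INTO "Model" (number_int) VALUES (?)', [(1,), (2,), (3,)])
migrate(conn)

result = conn.execute('SELECT oid, number_str FROM "Model" ORDER BY oid').fetchall()
columns = [row[1] for row in conn.execute('PRAGMA table_info("Model")')]
```

The order matters: the conversion has to run while both columns exist, which is exactly why the migration above adds the new column before touching the old one.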

Related

Delete multiple rows from Association Table in SQLAlchemy using db.session.execute syntax?

I have an association table that contains relationships between two other SQLAlchemy models that I would like to delete:
class ItemCategories(db.Model):
id = Column(Integer, primary_key=True)
item_id = Column(Integer, ForeignKey("item.id"))
category_id = Column(Integer, ForeignKey("category.id"))
# ... other fields
The old syntax was to use something like:
db.session.query(ItemCategories).filter_by(category_id=5).filter(ItemCategories.name == "Shelved").delete()
But with the newer syntax, I tried:
db.session.execute(db.select(ItemCategories).filter_by(category_id=5).filter(ItemCategories.name == "Shelved").delete())
But this errored with:
AttributeError: 'Select' object has no attribute 'delete'
Flask-SQLAlchemy suggests doing:
db.session.delete(model_object)
But this only deletes a single row, and I would like to delete multiple rows at once. I know I can loop through all the rows and call session.delete() on them one by one, but I would prefer a bulk delete like the session.query line above.
Is there a way to do multiple deletes with db.session.execute()?
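One possible direction, sketched under assumptions: SQLAlchemy 1.4+ provides a delete() construct that can be passed to session.execute(), in the same way select() is. The sketch below uses a plain SQLAlchemy Session and an in-memory SQLite database rather than Flask-SQLAlchemy, and it assumes the model has a name column (the question hides it behind "other fields"). With Flask-SQLAlchemy the same statement should be usable with db.session.execute().

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class ItemCategories(Base):
    # Simplified stand-in for the model in the question; "name" is assumed.
    __tablename__ = "item_categories"
    id = sa.Column(sa.Integer, primary_key=True)
    category_id = sa.Column(sa.Integer)
    name = sa.Column(sa.String)


engine = sa.create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        ItemCategories(category_id=5, name="Shelved"),
        ItemCategories(category_id=5, name="Shelved"),
        ItemCategories(category_id=5, name="Active"),
        ItemCategories(category_id=6, name="Shelved"),
    ])
    session.commit()

    # Build a DELETE statement (not a SELECT) and execute it in one go.
    stmt = sa.delete(ItemCategories).where(
        ItemCategories.category_id == 5,
        ItemCategories.name == "Shelved",
    )
    deleted = session.execute(stmt).rowcount
    session.commit()

    remaining = session.execute(
        sa.select(sa.func.count()).select_from(ItemCategories)
    ).scalar_one()
```

The AttributeError in the question arises because select() builds a SELECT statement, which has no delete() method; the DELETE has to be constructed as its own statement.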

How do I create a Django migration for my ManyToMany relation that includes an on-delete cascade?

I'm using PostgreSQL 10, Python 3.9, and Django 3.2. I have set up this model with the accompanying many-to-many relationship ...
class Account(models.Model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
...
crypto_currencies = models.ManyToManyField(CryptoCurrency)
After generating and running Django migrations, the following table was created ...
\d cbapp_account_crypto_currencies;
            Table "public.cbapp_account_crypto_currencies"
      Column       |  Type   |                                  Modifiers
-------------------+---------+------------------------------------------------------------------------------
 id                | integer | not null default nextval('cbapp_account_crypto_currencies_id_seq'::regclass)
 account_id        | uuid    | not null
 cryptocurrency_id | uuid    | not null
Indexes:
    "cbapp_account_crypto_currencies_pkey" PRIMARY KEY, btree (id)
    "cbapp_account_crypto_cur_account_id_cryptocurrenc_38c41c43_uniq" UNIQUE CONSTRAINT, btree (account_id, cryptocurrency_id)
    "cbapp_account_crypto_currencies_account_id_611c9b45" btree (account_id)
    "cbapp_account_crypto_currencies_cryptocurrency_id_685fb811" btree (cryptocurrency_id)
Foreign-key constraints:
    "cbapp_account_crypto_account_id_611c9b45_fk_cbapp_acc" FOREIGN KEY (account_id) REFERENCES cbapp_account(id) DEFERRABLE INITIALLY DEFERRED
    "cbapp_account_crypto_cryptocurrency_id_685fb811_fk_cbapp_cry" FOREIGN KEY (cryptocurrency_id) REFERENCES cbapp_cryptocurrency(id) DEFERRABLE INITIALLY DEFERRED
How do I alter my field relation, or generate a migration, so that the relationship uses ON DELETE CASCADE? That is, when I delete an account, I would like the accompanying records in this table to be deleted as well.
I had a closer look at this. I tried to replicate your models and I also see that the intermediary table has no cascade. I have no answer to your main question of how to add the cascade, but it seems that Django already emulates the cascade behavior, which covers this requirement:
When I delete an account, I would like accompanying records in this table to also be deleted.
To demonstrate:
a = Account.objects.create(name='test')
c1 = CryptoCurrency.objects.create(name='c1')
c2 = CryptoCurrency.objects.create(name='c2')
c3 = CryptoCurrency.objects.create(name='c3')
a.crypto_currencies.set([c1, c2, c3])
If you do:
a.delete()
Django runs the following SQL which simulates the cascade on the intermediary table:
[
    {
        'sql': 'DELETE FROM "myapp_account_crypto_currencies" WHERE "myapp_account_crypto_currencies"."account_id" IN (3)', 'time': '0.002'
    },
    {
        'sql': 'DELETE FROM "myapp_account" WHERE "myapp_account"."id" IN (3)', 'time': '0.001'
    }
]
I can't find in the documentation why it is done this way, though. Even adding a custom intermediary model like this results in the same behavior:
class Account(models.Model):
    name = models.CharField(max_length=100)
    crypto_currencies = models.ManyToManyField(CryptoCurrency, through='myapp.AccountCryptocurrencies')
class AccountCryptocurrencies(models.Model):
    account = models.ForeignKey(Account, on_delete=models.CASCADE)
    cryptocurrency = models.ForeignKey(CryptoCurrency, on_delete=models.CASCADE)
When you use a ManyToManyField, Django creates an intermediary table for you, in this case named cbapp_account_crypto_currencies. In the future, always create the intermediary model (AccountCryptoCurrencies) explicitly and set the through attribute of the ManyToManyField. This lets you add more fields to the intermediary model later. See more here: https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.ManyToManyField.through.
What you need to do now is create this intermediary model:
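class AccountCryptoCurrencies(models.Model):
    # on_delete is a required argument since Django 2.0;
    # 'Account' is referenced by name because the class is defined below
    account = models.ForeignKey('Account', on_delete=models.CASCADE)
    cryptocurrency = models.ForeignKey(CryptoCurrency, on_delete=models.CASCADE)
    class Meta:
        # reuse the table Django already created for the implicit relation
        db_table = 'cbapp_account_crypto_currencies'
class Account(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    ...
    crypto_currencies = models.ManyToManyField(CryptoCurrency, through=AccountCryptoCurrencies)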
You now need to generate a migration, but do not apply it yet! Modify the migration by wrapping its operations in SeparateDatabaseAndState. I haven't written your migration file because I don't have the full model, but you can see how to do it here: How to add through option to existing ManyToManyField with migrations and data in django
Now you can apply the migration, and you should have an explicit intermediary table without losing data. You can also add further fields to the intermediary model and change the existing ones, for example by making sure the account field has on_delete=models.CASCADE and migrating the change. Keep in mind that in Django 3.2 on_delete is enforced by Django in Python when it deletes objects; it does not add ON DELETE CASCADE to the database constraint.

Deletion of a row from an association table

I am working on an app using Python 3 and SQLAlchemy for SQLite3 database management. I have some tables that have a many-to-many relationship. I've created an association table to handle this relationship.
class Machine(Base):
    __tablename__ = 'machine'
    machine_ID = Column(Integer, primary_key=True)
    # etc...
class Options(Base):
    __tablename__ = 'options'
    options_ID = Column(Integer, primary_key=True)
    # etc...
The association table
Machine_Options = Table('machine_options', Base.metadata,
    Column('machine_FK', Integer, ForeignKey('machine.machine_ID'),
           primary_key=True),
    Column('options_FK', Integer, ForeignKey('options.options_ID'),
           primary_key=True))
All the items for the Machine and Options are inserted independently. When I want to associate a machine with an option I use an append query which works very well.
My problem arises when I want to break this association between a machine and an option. I have tried a direct row deletion from the association table using a filter() clause on machine_FK and options_FK, but SQLAlchemy gives me an error saying that the 'Machine_Options' table has no field 'machine_FK'.
I have also tried to remove the row from 'Machine_Options' indirectly using joins with the machine and options tables, but received another error saying that I cannot delete or update using joins.
I am looking for the code to only delete a row from the association table without affecting the original machine or options table.
So far my internet search has been fruitless.
The answer to my problem is to use myparent.children.remove(somechild).
The association is made using machine.children.append(option).
Using the same code as for the append and substituting remove undoes the association.
The code:
def removeOption(machineKey, OptionKey):
    session = connectToDatabase()
    machineData = session.query(Machine).filter(Machine.machine_ID == machineKey).one()
    optionData = session.query(Options).filter(Options.options_ID == OptionKey).one()
    machineData.children.remove(optionData)
    session.add(machineData)
    session.commit()
    session.close()
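As an alternative, the row can also be deleted from the association table directly. The columns of a Table object are reached through its .c attribute (for example Machine_Options.c.machine_FK) rather than as plain attributes, which may be what the "no field" error in the question was about. The following is a minimal sketch assuming SQLAlchemy 1.4+ and an in-memory SQLite database; the foreign keys are left out so the example stays self-contained.

```python
import sqlalchemy as sa

metadata = sa.MetaData()

# Simplified stand-in for the association table in the question
# (foreign keys omitted so the sketch runs on its own).
machine_options = sa.Table(
    "machine_options",
    metadata,
    sa.Column("machine_FK", sa.Integer, primary_key=True),
    sa.Column("options_FK", sa.Integer, primary_key=True),
)

engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(
        machine_options.insert(),
        [
            {"machine_FK": 1, "options_FK": 10},
            {"machine_FK": 1, "options_FK": 20},
            {"machine_FK": 2, "options_FK": 10},
        ],
    )
    # Delete exactly one association row; columns are accessed via .c
    conn.execute(
        machine_options.delete().where(
            sa.and_(
                machine_options.c.machine_FK == 1,
                machine_options.c.options_FK == 10,
            )
        )
    )
    remaining = [
        tuple(row)
        for row in conn.execute(
            sa.select(machine_options.c.machine_FK, machine_options.c.options_FK)
            .order_by(machine_options.c.machine_FK, machine_options.c.options_FK)
        )
    ]
```

Only the association row is removed; the machine and options tables are never touched by this statement.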

Reading a database table without Pandas

I am trying to read a table from a HANA database in Python using SQLAlchemy library. Typically, I would use the Pandas package and use the pd.read_sql() method for this operation. However, for some reason, the environment I am using does not support the Pandas package. Therefore, I need to read the table without the Pandas library. So far, the following is what I have been able to do:
query = ('''SELECT * FROM "<schema_name>"."<table_name>"'''
         ''' WHERE <conditional_clauses>'''
         )
with engine.connect() as con:
    table = con.execute(query)
    row = table.fetchone()
However, while this technique lets me read the table row by row, I do not get the column names of the table.
How can I fix this?
Thanks
I do not get the column names of the table
You won't get the column names of the table but you can get the column names (or aliases) of the result set:
with engine.begin() as conn:
    row = conn.execute(sa.text("SELECT 1 AS foo, 2 AS bar")).fetchone()
    print(row.items())  # [('foo', 1), ('bar', 2)]
    #
    # or, for just the column names
    #
    print(row.keys())  # ['foo', 'bar']
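On SQLAlchemy 1.4 and later, the column names are also available from the result object itself via result.keys(), and each row exposes a dictionary-like view through row._mapping. The sketch below assumes that version range and uses an in-memory SQLite database as a stand-in for the HANA connection; it collects the whole result as a list of dictionaries, roughly what pd.read_sql() would otherwise provide.

```python
import sqlalchemy as sa

# In-memory SQLite stands in for the real HANA engine (an assumption).
engine = sa.create_engine("sqlite://")

with engine.connect() as conn:
    result = conn.execute(sa.text("SELECT 1 AS foo, 2 AS bar"))
    # Column names (or aliases) of the result set
    columns = list(result.keys())
    # One dict per row, keyed by column name
    rows = [dict(row._mapping) for row in result]
```

Here columns holds the names in select order and rows holds the data, so the pair can be written to CSV or processed without Pandas.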

Querying with cqlengine

I am trying to hook up the cqlengine CQL 3 object mapper with my web application running on CherryPy. Although the documentation is very clear about querying, I still do not know how to query an existing table (in an existing keyspace) in my Cassandra database. For instance, I already have a table Movies containing the fields Title, rating, Year. I want to make the CQL query
SELECT * FROM Movies
How do I go ahead with the query after establishing the connection with
from cqlengine import connection
connection.setup(['127.0.0.1:9160'])
The KEYSPACE is called "TEST1".
Abhiroop Sarkar,
I highly suggest that you read through all of the documentation at:
Current Object Mapper Documentation
Legacy CQLEngine Documentation
Installation: pip install cassandra-driver
And take a look at this example project by the creator of CQLEngine, rustyrazorblade:
Example Project - Meat bot
Keep in mind, CQLEngine has been merged into the DataStax Cassandra-driver:
Official Python Cassandra Driver Documentation
You'll want to do something like this:
CQLEngine <= 0.21.0:
from cqlengine.connection import setup
setup(['127.0.0.1'], 'keyspace_name', retry_connect=True)
If you need to create the keyspace still:
from cqlengine.management import create_keyspace
create_keyspace(
    'keyspace_name',
    replication_factor=1,
    strategy_class='SimpleStrategy'
)
Set up your Cassandra data model
You can do this in the same .py file or in your models.py:
import datetime
import uuid
from cqlengine import columns, Model
class YourModel(Model):
    __keyspace__ = 'keyspace_name'  # optional
    __table_name__ = 'columnfamily_name'  # optional
    some_int = columns.Integer(
        primary_key=True,
        partition_key=True
    )
    time = columns.TimeUUID(
        primary_key=True,
        clustering_order='DESC',
        default=uuid.uuid1,
    )
    some_uuid = columns.UUID(primary_key=True, default=uuid.uuid4)
    created = columns.DateTime(default=datetime.datetime.utcnow)
    some_text = columns.Text(required=True)
    def __str__(self):
        return self.some_text
    def to_dict(self):
        data = {
            'text': self.some_text,
            'created': self.created,
            'some_int': self.some_int,
        }
        return data
Sync your Cassandra ColumnFamilies
from cqlengine.management import sync_table
from .models import YourModel
sync_table(YourModel)
Putting everything above together, you can combine the connection setup and the table sync, as many examples do. Say this is connection.py in our project:
from cqlengine.connection import setup
from cqlengine.management import sync_table
from .models import YourModel
def cass_connect():
    setup(['127.0.0.1'], 'keyspace_name', retry_connect=True)
    sync_table(YourModel)
Actually using the model and data
from __future__ import print_function
from .connection import cass_connect
from .models import YourModel
def add_data():
    cass_connect()
    YourModel.create(
        some_int=5,
        some_text='Test0'
    )
    YourModel.create(
        some_int=6,
        some_text='Test1'
    )
    YourModel.create(
        some_int=5,
        some_text='Test2'
    )
def query_data():
    cass_connect()
    query = YourModel.objects.filter(some_int=5)
    # This will output each YourModel entry where some_int = 5
    for item in query:
        print(item)
Feel free to ask for further clarification, if necessary.
The most straightforward way to achieve this is to create model classes that mirror the schema of your existing CQL tables, then run queries on them.
cqlengine is primarily an object mapper for Cassandra. It does not inspect an existing database in order to create objects for existing tables. Rather, it is usually intended to be used in the opposite direction (i.e. creating tables from Python classes). If you want to query an existing table using cqlengine, you will need to create Python models that exactly correspond to your existing tables.
For example, if your current Movies table had three columns, id, title, and release_date, you would need to create a cqlengine model with those three columns. Additionally, you would need to ensure that the __table_name__ attribute on the class is exactly the same as the table name in the database.
from cqlengine import columns, Model
class Movie(Model):
    __table_name__ = "movies"
    id = columns.UUID(primary_key=True)
    title = columns.Text()
    release_date = columns.Date()
The key thing is to make sure that model exactly mirrors the existing table. If there are small differences you may be able to use sync_table(MyModel) to update the table to match your model.

Resources