Commit error after uploading data - python-3.x

I have a simple program that stores some inputs in a database. I use flask-sqlalchemy as an ORM and didn't have any issues until now. Due to some issues, I had to save my data to CSV files and erase everything. After that, I uploaded the data back again using the df.to_sql method from pandas.
NOTE: I'm using df.to_sql to load the previously saved CSV back into the database. The idea is to recover the data that I had stored.
Now, with everything back to normal (or so I thought), when I try to upload data using my usual method (filling in a form) and commit the changes to the database, I get the following error:
sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) duplicate key violates \
uniqueness restriction
Detail: The key already exists (id) = (#).
Every time I repeat the process, the error stays the same, except that # changes to #+1 (e.g. from 2 it goes to 3, and so on).
Sorry for my English; if you need any clarification please ask, and I'll edit this post as best I can.
Thanks for your time!
EDIT 1:
The process is adding a new line to the database and committing:
new_observation = Observations(var1 = new_var1, var2 = new_var2)
db.session.add(new_observation)
db.session.commit()
EDIT 2:
The model of the database is:
class Observations(db.Model):
    __tablename__ = 'observations'
    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
    timestamp = db.Column(db.DateTime, index=True, default=datetime.today())
    var1 = db.Column(db.Numeric)
    var2 = db.Column(db.Numeric)
EDIT 3:
As suggested by mad_, I tried filling in the primary key directly:
new_observation = Observations(primary_key = some_number, var1 = new_var1, var2 = new_var2)
db.session.add(new_observation)
db.session.commit()
The problem now is that I get this new error:
sqlalchemy.orm.exc.FlushError: New instance <observations at 0x47deb50> \
with identity key (<class 'app.models.observations '>, (368,), None) \
conflicts with persistent instance <observations at 0x4ab8a90>

Thanks to the comments from @mad_ I was able to solve my problem. The issue appeared after I uploaded a table back to my database: when I then tried to commit a new observation to the DB, I got an error.
A workaround is to set the primary key explicitly. With this I got a new error, which was solved by disabling the autoincrement property of the primary key (autoincrement=False).
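A minimal sketch of that workaround, assuming the application picks the next id itself instead of relying on a database-side counter. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL and flask-sqlalchemy so that it runs anywhere; the table layout and the helper name add_observation are illustrative assumptions, not the original application's code.

```python
import sqlite3


def add_observation(conn, var1, var2):
    """Insert a row with an explicitly chosen primary key (hypothetical helper)."""
    # Choose the next id ourselves: one more than the largest id already stored.
    (next_id,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) + 1 FROM observations"
    ).fetchone()
    conn.execute(
        "INSERT INTO observations (id, var1, var2) VALUES (?, ?, ?)",
        (next_id, var1, var2),
    )
    conn.commit()
    return next_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (id INTEGER PRIMARY KEY, var1 NUMERIC, var2 NUMERIC)"
)
# Simulate rows restored from CSV together with their original ids.
conn.executemany(
    "INSERT INTO observations (id, var1, var2) VALUES (?, ?, ?)",
    [(1, 1.0, 2.0), (2, 3.0, 4.0), (3, 5.0, 6.0)],
)
conn.commit()

new_id = add_observation(conn, 7.0, 8.0)
print(new_id)
```

Computing MAX(id) + 1 in the application is not safe when several writers insert at the same time. On PostgreSQL, the pattern described in the question (the conflicting id growing by one on every attempt) usually suggests that the id sequence was not advanced by the bulk load; in that case, resetting the sequence to the current maximum id with setval(pg_get_serial_sequence('observations', 'id'), ...) may be a cleaner fix. This was not verified against the original database.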

Related

When to use SQL Foreign key using peewee?

I'm currently using PeeWee together with Python and I have managed to create a decent beginner
CREATE TABLE stores (
id SERIAL PRIMARY KEY,
store_name TEXT
);
CREATE TABLE products (
id SERIAL,
store_id INTEGER NOT NULL,
title TEXT,
image TEXT,
url TEXT UNIQUE,
added_date timestamp without time zone NOT NULL DEFAULT NOW(),
PRIMARY KEY(id, store_id)
);
ALTER TABLE products
ADD CONSTRAINT "FK_products_stores" FOREIGN KEY ("store_id")
REFERENCES stores (id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE RESTRICT;
which has been converted to peewee by following code:
# ------------------------------------------------------------------------------- #
class Stores(Model):
    id = IntegerField(column_name='id')
    store_name = TextField(column_name='store_name')
    class Meta:
        database = postgres_pool
        db_table = "stores"
    @classmethod
    def get_all(cls):
        try:
            return cls.select(cls.id, cls.store_name).order_by(cls.store_name)
        except peewee.IntegrityError:
            return None
# ------------------------------------------------------------------------------- #
class Products(Model):
    id = IntegerField(column_name='id')
    store_id = TextField(column_name='store_id')
    title = TextField(column_name='title')
    url = TextField(column_name='url')
    image = TextField(column_name='image')
    store = ForeignKeyField(Stores, backref='products')
    class Meta:
        database = postgres_pool
        db_table = "products"
    @classmethod
    def get_all_products(cls, given_id):
        try:
            return cls.select().where(cls.store_id == given_id)
        except peewee.IntegrityError:
            return None
    @classmethod
    def add_product(cls, pageData, store_id):
        """
        INSERT
        INTO
        public.products(store_id, title, image, url)
        VALUES((SELECT id FROM stores WHERE store_name = 'footish'), 'Teva Flatform Universal Pride',
        'https://www.footish.se/sneakers/teva-flatform-universal-pride-t51116376',
        'https://www.footish.se/pub_images/large/teva-flatform-universal-pride-t1116376-p77148.jpg?timestamp=1623417840')
        """
        try:
            return cls.insert(
                store_id=store_id,
                title=pageData.title,
                url=pageData.url,
                image=pageData.image,
            ).execute()
        except Products.DoesNotExist:
            return None
        except peewee.IntegrityError as err:
            print(f"error: {err}")
            return None
My idea is that when I start my application, I would have a constant holding an already-known store_id, e.g. 1. That would make the queries faster, as I would not need another SELECT to get the store_id from a store_name. However, looking at my code, I have the field store = ForeignKeyField(Stores, backref='products'), and I am starting to wonder what I need it for in my application.
I am aware that I have an FK from my ALTER query, but in the application I have written I cannot see a reason why I would need to declare the foreign key at all. I would like some help understanding why and how I could use the "store" field in my application. Could it be that I do not need it at all?
Hello! Reading your initial idea about making "the execution of queries faster" by having a constant variable, the first thing that came to mind was the hassle of always having to edit the variable manually. This is poor practice and not something you'd want to do in a professional application. To obtain the value you should use, I suggest running a query programmatically and fetching the highest id value using SQL's MAX() function.
As for the foreign key, you don't have to use it, but it can be good practice when it matters. In this case, look at your FK constraint: it has an ON DELETE RESTRICT clause, which rejects a delete on the parent table when the row is still referenced through a foreign key in another table. You would first have to go to the table holding the foreign key and delete every row that references the parent row before you could delete it.
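The effect of ON DELETE RESTRICT can be seen in a small sketch. It assumes the standard-library sqlite3 module as a stand-in for PostgreSQL and a simplified version of the two tables from the question (single-column primary key on products); the helper try_delete_store is a hypothetical name introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE stores (id INTEGER PRIMARY KEY, store_name TEXT)")
conn.execute(
    """CREATE TABLE products (
           id INTEGER PRIMARY KEY,
           store_id INTEGER NOT NULL REFERENCES stores (id) ON DELETE RESTRICT,
           title TEXT
       )"""
)
conn.execute("INSERT INTO stores (id, store_name) VALUES (1, 'footish')")
conn.execute(
    "INSERT INTO products (store_id, title) VALUES (1, 'Teva Flatform Universal Pride')"
)
conn.commit()


def try_delete_store(conn, store_id):
    """Return True if the store was deleted, False if the constraint blocked it."""
    try:
        conn.execute("DELETE FROM stores WHERE id = ?", (store_id,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


# A product still references store 1, so the delete is rejected.
blocked = not try_delete_store(conn, 1)
# After removing the referencing rows, the same delete goes through.
conn.execute("DELETE FROM products WHERE store_id = 1")
conn.commit()
deleted = try_delete_store(conn, 1)
print(blocked, deleted)
```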
In general, if you have two tables with information linked in any way, I'd highly suggest using keys. They improve organization and, if proper constraints are added, they improve readability for external users and reduce errors.
As for using the store field you mentioned, you might want an API that returns all products related to a single store, or all products except those from a specific one.
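Both lookups can be sketched as follows, assuming plain SQL through the standard-library sqlite3 module rather than peewee. The second store, the two example products, and the function names products_for_store and products_except_store are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stores (id INTEGER PRIMARY KEY, store_name TEXT);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        store_id INTEGER NOT NULL REFERENCES stores (id),
        title TEXT
    );
    INSERT INTO stores (id, store_name) VALUES (1, 'footish'), (2, 'other_store');
    INSERT INTO products (store_id, title) VALUES
        (1, 'Teva Flatform Universal Pride'),
        (2, 'Example product A'),
        (2, 'Example product B');
    """
)


def products_for_store(conn, store_name):
    """Titles of all products that belong to the named store."""
    rows = conn.execute(
        "SELECT p.title FROM products AS p "
        "JOIN stores AS s ON s.id = p.store_id "
        "WHERE s.store_name = ? ORDER BY p.id",
        (store_name,),
    )
    return [title for (title,) in rows]


def products_except_store(conn, store_name):
    """Titles of all products that belong to any other store."""
    rows = conn.execute(
        "SELECT p.title FROM products AS p "
        "JOIN stores AS s ON s.id = p.store_id "
        "WHERE s.store_name <> ? ORDER BY p.id",
        (store_name,),
    )
    return [title for (title,) in rows]


print(products_for_store(conn, "footish"))
print(products_except_store(conn, "footish"))
```

With the peewee models from the question, the backref='products' argument on the ForeignKeyField should give a comparable traversal from a Stores instance to its products.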
I tried to keep things simple because I'm not fully confident I understood the question. I hope this was helpful.

How to INSERT into a database using JOIN

I'm currently using PeeWee together with Python and I have managed to create a cool application
CREATE TABLE stores (
id SERIAL PRIMARY KEY,
store_name TEXT
);
CREATE TABLE products (
id SERIAL,
store_id INTEGER NOT NULL,
title TEXT,
image TEXT,
url TEXT UNIQUE,
added_date timestamp without time zone NOT NULL DEFAULT NOW(),
PRIMARY KEY(id, store_id)
);
ALTER TABLE products
ADD CONSTRAINT "FK_products_stores" FOREIGN KEY ("store_id")
REFERENCES stores (id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE RESTRICT;
which has been converted to peewee by following code:
# ------------------------------------------------------------------------------- #
class Stores(Model):
    id = IntegerField(column_name='id')
    store_name = TextField(column_name='store_name')
    class Meta:
        database = postgres_pool
        db_table = "stores"
    @classmethod
    def get_all(cls):
        try:
            return cls.select(cls.id, cls.store_name).order_by(cls.store_name)
        except peewee.IntegrityError:
            return None
# ------------------------------------------------------------------------------- #
class Products(Model):
    id = IntegerField(column_name='id')
    title = TextField(column_name='title')
    url = TextField(column_name='url')
    image = TextField(column_name='image')
    store = ForeignKeyField(Stores, backref='products')
    class Meta:
        database = postgres_pool
        db_table = "products"
    @classmethod
    def add_product(cls, pageData, store_name):
        """
        INSERT
        INTO
        public.products(store_id, title, image, url)
        VALUES((SELECT id FROM stores WHERE store_name = 'footish'), 'Teva Flatform Universal Pride',
        'https://www.footish.se/sneakers/teva-flatform-universal-pride-t1116376',
        'https://www.footish.se/pub_images/large/teva-flatform-universal-pride-t1116376-p77148.jpg?timestamp=1623417840')
        """
        try:
            return cls.insert(
                store_id=cls.select(cls.store.id).join(Stores).where(cls.store.store_name == store_name).get().store.id,
                title=pageData.title,
                url=pageData.url,
                image=pageData.image,
            ).execute()
        except Products.DoesNotExist:
            return None
However, I have realized that working with ids is considerably faster than working with text, and I am trying to figure out the best way to insert the id. I got this comment on my code as it is today:
your insert isn't referencing "stores" at all so not sure what you're hoping to get from that since you have a sub query there
I am a bit confused about what that means. My question is which approach is the correct way to insert:
1. Is it better, on start of the application, to store the id in a variable and pass it to the insert function as an argument?
2. Or to call store_id=cls.select(cls.store.id).join(Stores).where(cls.store.store_name == store_name).get().store.id, where I pass the store_name instead and it returns the correct id?
My first thought is that option 2 is like doing two queries instead of one, but I might be wrong. Looking forward to finding out!
This is quite incorrect:
# Wrong
store_id=cls.select(cls.store.id).join(Stores).where(cls.store.store_name == store_name).get().store.id,
Correct:
try:
    store = Stores.select().where(Stores.store_name == store_name).get()
except Stores.DoesNotExist:
    # The store name does not exist. Do whatever?
    return
Products.insert(store=store, ...rest-of-fields...).execute()
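On the "two queries instead of one" concern: the SQL in the question's docstring already resolves the store name inside the INSERT itself. A minimal sketch of that single-statement form, assuming plain SQL through the standard-library sqlite3 module rather than peewee. The function name add_product_by_store_name, the store id 7, and the example.com URLs are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stores (id INTEGER PRIMARY KEY, store_name TEXT UNIQUE);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        store_id INTEGER NOT NULL REFERENCES stores (id),
        title TEXT,
        image TEXT,
        url TEXT UNIQUE
    );
    INSERT INTO stores (id, store_name) VALUES (7, 'footish');
    """
)


def add_product_by_store_name(conn, store_name, title, image, url):
    """Insert a product, looking up the store id inside the same statement.

    Returns the number of rows inserted: 1 on success, 0 if no store matches.
    """
    cur = conn.execute(
        "INSERT INTO products (store_id, title, image, url) "
        "SELECT id, ?, ?, ? FROM stores WHERE store_name = ?",
        (title, image, url, store_name),
    )
    conn.commit()
    return cur.rowcount


inserted = add_product_by_store_name(
    conn, "footish", "Teva Flatform Universal Pride",
    "https://example.com/image.jpg", "https://example.com/product",
)
skipped = add_product_by_store_name(
    conn, "unknown_store", "Some title",
    "https://example.com/other.jpg", "https://example.com/other",
)
print(inserted, skipped)
```

The INSERT ... SELECT form inserts nothing when the store name is unknown, whereas the VALUES((SELECT ...)) form from the docstring would try to insert a NULL store_id and fail the NOT NULL constraint. Which behaviour is preferable depends on how the application wants to handle a missing store.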

Modifying ForeignKeyConstraint schema in Alembic post process

I'm using process_revision_directives to apply some post-processing of the operations generated against a reference schema. The one I'm stuck on is removing the postgres schema from the instructions, so it can be generically changed at runtime using the answer from another question.
The below code correctly removes the schema from operations except for ForeignKeyConstraints in a CreateTableOp.
def process_foreign_key(col: sa.ForeignKeyConstraint):
    col.referred_table.schema = None  # Doesn't work
def process_revision_directives(context, revision, directives):
    # Remove the schema from the generated operations
    for op in chain(directives[0].upgrade_ops.ops, directives[0].downgrade_ops.ops):
        if isinstance(op, ops.CreateTableOp):
            op.columns = [
                process_foreign_key(col) if isinstance(col, sa.ForeignKeyConstraint) else col
                for col in op.columns
            ]
        op.schema = None
This currently generates output like
op.create_table('user',
    sa.Column('id', sa.Integer, nullable=False),
    sa.ForeignKeyConstraint(['id'], ['reference_schema.group.id'], name='group_group_id', onupdate='CASCADE', ondelete='CASCADE'),
)
Any ideas on how I should modify these constraint objects to not have reference_schema. in the target table?
If you look into the rendering chain you can find where the last schema reference is. It's on op._orig_table, but the important thing is that it is on this table twice.
Put the following in your for loop.
op._orig_table.schema = None
op._orig_table = op._orig_table.tometadata(clear_meta)
where clear_meta is a MetaData object with no schema, such as
clear_meta = sa.MetaData(bind=session.connection(), schema=None)

SQLAlchemy scoped_session is not getting latest data from DB

I'm rather new to the whole ORM topic, and I've already searched forums and docs.
The question is about a Flask application with SQLAlchemy as the ORM for PostgreSQL.
The __init__.py contains the following line:
db = SQLAlchemy()
The created object is referenced in the other files to access the DB.
There is a save function for the model:
def save(self):
    db.session.add(self)
    db.session.commit()
and also an update function:
def update(self):
    for var_name in self.__dict__.keys():
        # Membership test: "is not ('a' or 'b' or 'c')" would only compare against 'a'
        if var_name not in ('_sa_instance_state', 'id', 'foreign_id'):
            # Workaround for JSON update problem
            flag_modified(self, var_name)
    db.session.merge(self)
    db.session.commit()
The problem occurs when I try to save a new object. The save function writes it to the DB, and it is visible when querying the DB directly (psql, etc.), but a subsequent ORM query like:
model_list = db.session.query(MyModel).filter(MyModel.foreign_id == this_id).all()
gives an empty response.
A call of the update function does work as expected, new data is visible when requesting with the ORM.
I'm always using the same session object, for example:
<sqlalchemy.orm.scoping.scoped_session object at 0x7f0cff68fda0>
If the application is restarted, everything works fine until a new object is created and then fetched through the ORM.
An ugly workaround is to use raw SQL like:
model_list = db.session.execute('SELECT * FROM models_table WHERE foreign_id = ' + str(this_id))
which gives a ResultProxy with latest data like this:
<sqlalchemy.engine.result.ResultProxy object at 0x7f0cf74d0390>
I think my problem is a misunderstanding of the session. Can anyone help me?
It turned out that the problem has nothing to do with the session, but with the filter() method:
# Necessary import for string input into the filter() function
from sqlalchemy import text
# Solution or workaround
model_list = db.session.query(MyModel).filter(text('foreign_key = ' + str(this_id))).all()
I could not figure out the problem with filter(MyModel.foreign_id == this_id), but that's another problem.
I think this way is better than executing raw SQL.
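Both the raw SQL workaround and the text() workaround build the condition by string concatenation. A bound parameter avoids splicing the value into the SQL text. The sketch below shows the idea with the standard-library sqlite3 module as a stand-in for the Flask-SQLAlchemy session; the table contents and the helper name rows_for_foreign_id are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models_table (id INTEGER PRIMARY KEY, foreign_id INTEGER)")
conn.executemany(
    "INSERT INTO models_table (foreign_id) VALUES (?)", [(1,), (2,), (2,)]
)
conn.commit()


def rows_for_foreign_id(conn, this_id):
    """Fetch rows by foreign_id using a bound parameter (hypothetical helper)."""
    # The driver binds this_id; the value never becomes part of the SQL string.
    return conn.execute(
        "SELECT id, foreign_id FROM models_table WHERE foreign_id = ? ORDER BY id",
        (this_id,),
    ).fetchall()


model_list = rows_for_foreign_id(conn, 2)
print(model_list)
```

In SQLAlchemy the same idea is available through named bind parameters on text(), for example text('foreign_id = :fid').bindparams(fid=this_id).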

Using hash to encode a PIN and save it to a database? PYTHON3

I'm currently getting my head around the hashlib module in Python in order to hash a PIN. After the user sets their PIN, I store it in the variable 'actualPIN'. My code is as follows:
def returnCard(name, ID, rollingBalance, actualPIN):
    PIN = hashlib.sha256()
    PIN.update(b"actualPIN")
    data = (rollingBalance, actualPIN, ID)
    print(rollingBalance)
    with sqlite3.connect("ATM.db") as db:
        cursor = db.cursor()
        sql = 'update Atm set Balance=?, PIN=? where CustomerID=?'
        cursor.execute(sql, data)
        db.commit()
    print("Thank you for using Norther Frock")
    print("Returning card...")
    time.sleep(1)
    print("Have a nice day")
    entryID()
Everything works; however, the PIN that the user enters is saved in the database as-is. What I want to save to the database is the hashed PIN (obviously?). Could anyone explain how I could do this?
You are writing the actualPIN variable to the database. Instead, you meant to write the digest:
data = (rollingBalance, PIN.digest(), ID)
# or data = (rollingBalance, PIN.hexdigest(), ID)
And you probably want to use the actualPIN variable, not the "actualPIN" string, here:
PIN.update(repr(actualPIN).encode('utf-8'))
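Putting the two corrections together, a minimal sketch might look like this. It assumes an in-memory SQLite database in place of ATM.db, a simplified Atm table, and the hypothetical helpers hash_pin and check_pin. It hashes str(actualPIN) rather than repr(actualPIN); either works as long as storing and checking use the same form.

```python
import hashlib
import sqlite3


def hash_pin(actual_pin):
    """Return the SHA-256 hex digest of the PIN's text form."""
    return hashlib.sha256(str(actual_pin).encode("utf-8")).hexdigest()


def check_pin(entered_pin, stored_digest):
    """Hash the entered PIN the same way and compare it with the stored digest."""
    return hash_pin(entered_pin) == stored_digest


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Atm (CustomerID INTEGER PRIMARY KEY, Balance REAL, PIN TEXT)"
)
conn.execute("INSERT INTO Atm (CustomerID, Balance, PIN) VALUES (1, 0.0, '')")

# Store the digest, never the PIN itself.
conn.execute(
    "UPDATE Atm SET Balance = ?, PIN = ? WHERE CustomerID = ?",
    (100.0, hash_pin("1234"), 1),
)
conn.commit()

(stored,) = conn.execute("SELECT PIN FROM Atm WHERE CustomerID = 1").fetchone()
print(check_pin("1234", stored), check_pin("0000", stored))
```

An unsalted SHA-256 of a short numeric PIN can be brute-forced quickly, so for anything beyond an exercise a salted key-derivation function such as hashlib.pbkdf2_hmac is likely a better fit.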

Resources