Inserting/Updating sqlite table from python program - pysqlite

I have a sqlite3 table as shown below:
Record(WordID INTEGER PRIMARY KEY, Word TEXT, Wordcount INTEGER, Docfrequency REAL).
I want to create this table and insert data into it if it does not exist; otherwise I want to update it so that only the 'Wordcount' column changes, using the value in the 'Word' column as the lookup key. I am trying to do this from a Python program like this:
import sqlite3
conn = sqlite3.connect("mydatabase")
c = conn.cursor()
#Create table
c.execute("CREATE TABLE IF NOT EXISTS Record(WordID INTEGER PRIMARY KEY, Words TEXT, Wordcount INTEGER, Docfrequency REAL)")
#Update table
c.execute("UPDATE TABLE IF EXISTS Record")
#Insert a row of data
c.execute("INSERT INTO Record values (1,'wait', 9, 10.0)")
c.execute("INSERT INTO Record values (2,'Hai', 5, 6.0)")
#Updating data
c.execute("UPDATE Record SET Wordcount='%d' WHERE Words='%s'" %(11,'wait') )
But I can't update the table. When I run the program I get this error message:
c.execute("UPDATE TABLE IF EXISTS Record")
sqlite3.OperationalError: near "TABLE": syntax error
How should I write the code to update the table?

Your SQL query for UPDATE is invalid - see the documentation.
Also, I don't understand why you'd want to check for the table's existence when updating, given that just before that you're creating it if it doesn't exist.
If your goal is to update an entry if it exists or insert it if it doesn't, you might do it either by:
First doing an UPDATE and checking the number of rows updated. If 0, you know the record didn't exist and you should INSERT instead.
First doing an INSERT - if there's an error related to constraint violation, you know the entry already existed and you should do an UPDATE instead.
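The first option above (UPDATE, then INSERT if nothing was updated) could be sketched roughly as follows. This is only an illustration: the helper name upsert_wordcount, the in-memory database and the default Docfrequency of 0.0 are assumptions, not part of the original question.

```python
import sqlite3

def upsert_wordcount(conn, word, count):
    """Update Wordcount for `word`; insert a new row if no row was updated."""
    cur = conn.cursor()
    # Parameter placeholders (?) avoid building SQL with string formatting.
    cur.execute("UPDATE Record SET Wordcount = ? WHERE Words = ?", (count, word))
    if cur.rowcount == 0:
        # No existing row matched, so the word is new.
        # Docfrequency of 0.0 is an assumed default for this sketch.
        cur.execute(
            "INSERT INTO Record (Words, Wordcount, Docfrequency) VALUES (?, ?, ?)",
            (word, count, 0.0),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")  # assumption: throwaway in-memory database
conn.execute(
    "CREATE TABLE IF NOT EXISTS Record("
    "WordID INTEGER PRIMARY KEY, Words TEXT, Wordcount INTEGER, Docfrequency REAL)"
)
upsert_wordcount(conn, "wait", 9)   # inserts a new row
upsert_wordcount(conn, "wait", 11)  # updates the existing row
rows = conn.execute("SELECT Words, Wordcount FROM Record").fetchall()
```

Leaving WordID out of the INSERT lets SQLite assign the primary key itself.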

Related

mariadb python - executemany using SELECT

I'm trying to insert many rows into a table in MariaDB.
To do this I want to use executemany() to increase speed.
Each inserted row depends on a value from another table, which is looked up with SELECT.
I have read that SELECT doesn't work inside executemany().
Are there other ways to solve this problem?
import mariadb
connection = mariadb.connect(host=HOST,port=PORT,user=USER,password=PASSWORD,database=DATABASE)
cursor = connection.cursor()
query="""INSERT INTO [db].[table1] ([col1], [col2] ,[col3])
VALUES ((SELECT [colX] from [db].[table2] WHERE [colY]=? and
[colZ]=(SELECT [colM] from [db].[table3] WHERE [colN]=?)),?,?)
ON DUPLICATE KEY UPDATE
[col2]= ?,
[col3] =?;"""
values=[input_tuplets]
When running the code, I get the same value for [col1] (the SELECT statement) in every row; it corresponds to the values from the first tuple.
If SELECT doesn't work in executemany(), is there another workaround for what I'm trying to do?
Thanks a lot!
I think the way to go is to read out the tables needed,
do the lookup in Python,
and then use executemany() to insert all the data.
It will require two more queries (to read the tables), but that should be OK in terms of run time.
Thanks for your first question on Stack Overflow, which identified a bug in MariaDB Server.
Here is a simple script to reproduce the problem:
CREATE TABLE t1 (a int);
CREATE TABLE t2 LIKE t1;
INSERT INTO t2 VALUES (1),(2);
Python:
>>> cursor.executemany("INSERT INTO t1 VALUES \
((SELECT a FROM t2 WHERE a=?))", [(1,),(2,)])
>>> cursor.execute("SELECT a FROM t1")
>>> cursor.fetchall()
[(1,), (1,)]
I have filed an issue in the MariaDB bug tracking system.
As a workaround, I would suggest reading the country table once into a dictionary (according to Wikipedia there are 195 different countries) and using these values instead of a subquery,
e.g.
countries = {}
cursor.execute("SELECT country, id FROM countries")
for row in cursor:
    countries[row[0]] = row[1]
and then in executemany:
cursor.executemany("INSERT INTO region (region,id_country) values ('south', ?)", [(countries["fra"],), (countries["ger"],)])
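The workaround can be tried end to end without a MariaDB server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in (an assumption made so the example runs anywhere); the table layouts and sample rows are hypothetical, and with the mariadb connector only the connection setup would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MariaDB connection
cur = conn.cursor()
cur.execute("CREATE TABLE countries (id INTEGER PRIMARY KEY, country TEXT)")
cur.execute("CREATE TABLE region (region TEXT, id_country INTEGER)")
cur.executemany(
    "INSERT INTO countries (id, country) VALUES (?, ?)",
    [(1, "fra"), (2, "ger")],
)

# Read the lookup table once instead of running a subquery per inserted row.
cur.execute("SELECT country, id FROM countries")
countries = {country: country_id for country, country_id in cur.fetchall()}

# Hypothetical input: (region name, country code) pairs.
pending = [("south", "fra"), ("north", "ger")]

# Resolve the foreign keys in Python, then insert everything in one call.
params = [(region, countries[code]) for region, code in pending]
cur.executemany("INSERT INTO region (region, id_country) VALUES (?, ?)", params)
conn.commit()

inserted = cur.execute(
    "SELECT region, id_country FROM region ORDER BY id_country"
).fetchall()
```

A missing country code raises KeyError before anything is written, which is usually easier to debug than a NULL foreign key.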

Cannot update existing row on conflict in PostgreSQL with Psycopg2

I have the following function defined to insert several rows with iteration in Python using Psycopg2 and PostgreSQL 11.
When I receive the same obj (with same id), I want to update its date.
def insert_execute_values_iterator(
    connection,
    objs: Iterator[Dict[str, Any]],
    page_size: int = 1000,
) -> None:
    with connection.cursor() as cursor:
        try:
            psycopg2.extras.execute_values(cursor, """
                INSERT INTO objs(\
                    id,\
                    date\
                ) VALUES %s \
                ON CONFLICT (id) \
                DO UPDATE SET (date) = (EXCLUDED.date) \
            """, ((
                obj['id'],
                obj['date'],
            ) for obj in objs), page_size=page_size)
        except (Exception, Error) as error:
            print("Error while inserting as in database", error)
When a conflict happens on the unique primary key of the table while inserting an element, I get the error:
Error while inserting as in database ON CONFLICT DO UPDATE command
cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
FYI, the clause works on PostgreSQL directly but not from the Python code.
Use unique VALUE-combinations in your INSERT statement:
create table foo(id int primary key, date date);
This should work:
INSERT INTO foo(id, date)
VALUES(1,'2021-02-17')
ON CONFLICT(id)
DO UPDATE SET date = excluded.date;
This one won't:
INSERT INTO foo(id, date)
VALUES(1,'2021-02-17') , (1, '2021-02-16') -- 2 conflicting rows
ON CONFLICT(id)
DO UPDATE SET date = excluded.date;
You can fix this by using DISTINCT ON() in a SELECT statement:
INSERT INTO foo(id, date)
SELECT DISTINCT ON(id) id, date
FROM (VALUES(1,CAST('2021-02-17' AS date)) , (1, '2021-02-16')) s(id, date)
ORDER BY id, date ASC
ON CONFLICT(id)
DO UPDATE SET date = excluded.date;
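The same deduplication can also be done on the Python side before the rows reach execute_values(). The helper below is a hypothetical sketch, not part of psycopg2; it assumes that the last row seen for a given id should win, which may or may not match the ordering you want.

```python
from typing import Any, Dict, Iterable, List

def dedupe_by_id(objs: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep one row per id; a later row replaces an earlier one (assumed policy)."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for obj in objs:
        # Assigning to an existing key keeps its position but replaces the value.
        latest[obj["id"]] = obj
    return list(latest.values())

rows = [
    {"id": 1, "date": "2021-02-17"},
    {"id": 1, "date": "2021-02-16"},  # same id as above: would trigger the error
    {"id": 2, "date": "2021-02-18"},
]
unique_rows = dedupe_by_id(rows)
```

Note that this materializes the whole iterator in memory, which gives up the streaming behaviour of the original function; for very large inputs the DISTINCT ON approach on the database side may be preferable.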

How to insert value in already created Database table through pandas `df.to_sql()`

I'm creating a new table and then inserting values into it: the TSV file doesn't have headers, so I need to create the table structure first and then insert the values. I'm using df.to_sql() to insert the TSV values into the table that has already been created. The table gets created, but no values are inserted into it, and no error is raised either.
I have also tried creating a new table through SQLAlchemy and inserting the values, and that worked, but it didn't work for the already created table.
conn, cur = create_conn()
engine = create_engine('postgresql://postgres:Shubham#123@localhost:5432/walmart')
create_query = '''create table if not exists new_table(
"item_id" TEXT, "product_id" TEXT, "abstract_product_id" TEXT,
"product_name" TEXT, "product_type" TEXT, "ironbank_category" TEXT,
"primary_shelf" TEXT, "apparel_category" TEXT, "brand" TEXT)'''
cur.execute(create_query)
conn.commit()
file_name = 'new_table'
new_file = "C:\\Users\\shubham.shinde\\Desktop\\wallll\\new_file.txt"
data = pd.read_csv(new_file, delimiter="\t", chunksize=500000, error_bad_lines=False, quoting=csv.QUOTE_NONE, dtype="unicode", iterator=True)
with open(file_name + '_bad_rows.txt', 'w') as f1:
    sys.stderr = f1
    for df in data:
        df.to_sql('new_table', engine, if_exists='append')
    data.close()
I want df.to_sql() to insert the values into the existing database table.
I'm not 100% certain this applies to PostgreSQL, but I had a similar issue when doing it on MSSQL. .to_sql() already creates the table named in its first argument (new_table here) if it doesn't exist. Note that if_exists='append' doesn't check for duplicate values: if the data in new_file is overwritten, or run through your function again, the rows will simply be added to the table again. As to why you're seeing the table but no data in it, it might be due to the size of the df. Try setting fast_executemany=True as the second argument of create_engine (that option belongs to the mssql+pyodbc dialect, so a PostgreSQL engine may not accept it).
My suggestion: get rid of create_query and handle the data types after to_sql(). Once the staging table has been created by to_sql(), you can join your actual SQL table against this staging table to test for duplicates. The non-duplicates can then be written to the actual table, converting data types during the update to match the actual table's column types.
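The staging-table idea might look roughly like this. The sketch uses the built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and the table names items and staging, their columns and the sample rows are all hypothetical. In practice the staging table would be the one written by to_sql().

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL connection
conn.executescript("""
    -- "Actual" table with proper types.
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, product_name TEXT);
    -- Staging table with everything as text, as loaded from the TSV file.
    CREATE TABLE staging (item_id TEXT, product_name TEXT);
    INSERT INTO items VALUES (1, 'kettle');
    INSERT INTO staging VALUES ('1', 'kettle'), ('2', 'toaster');
""")

# Copy only rows whose key is not already present, casting types on the way.
conn.execute("""
    INSERT INTO items (item_id, product_name)
    SELECT CAST(s.item_id AS INTEGER), s.product_name
    FROM staging AS s
    WHERE NOT EXISTS (
        SELECT 1 FROM items AS i
        WHERE i.item_id = CAST(s.item_id AS INTEGER)
    )
""")
conn.commit()

items = conn.execute(
    "SELECT item_id, product_name FROM items ORDER BY item_id"
).fetchall()
```

Only the row with item_id 2 is copied, because item_id 1 already exists in the actual table.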

How to use INSERT query to avoid duplicate entries in postgresql database tables

Hi, while using the following code I am getting duplicate entries in my table.
Please suggest a way to avoid such duplicate entries.
Is there another form of the INSERT query that keeps the table free of duplicates?
import psycopg2
def connect():
    con=psycopg2.connect("dbname='book_store' user='postgres' password='5283' host='localhost' port='5432' ")
    cur=con.cursor()
    cur.execute("CREATE TABLE if not exists books(id SERIAL PRIMARY KEY,title TEXT NOT NULL,author TEXT NOT NULL,year integer NOT NULL,isbn integer NOT NULL)")
    con.commit()
    con.close()
def insert(title,author,year,isbn):
    con=psycopg2.connect("dbname='book_store' user='postgres' password='5283' host='localhost' port='5432'")
    cur=con.cursor()
    cur.execute("INSERT INTO books(title,author,year,isbn) VALUES(%s,%s,%s,%s)",(title,author,year,isbn))
    con.commit()
    con.close()
connect()
insert("the sun","helen",1997,23456777)
insert("the sun","helen",1997,23456777)
Here the same entry gets added again, whereas I want my code to skip such duplicates.
Ideally there should be a primary key or unique key constraint defined on the table to avoid duplicates. If you want to insert a record only when it doesn't already exist, you can use the INSERT statement below with a SELECT and a WHERE NOT EXISTS clause (written here with psycopg2 named placeholders):
INSERT INTO books(title,author,year,isbn) select %(title)s,%(author)s,%(year)s,%(isbn)s where
not exists (select 1 from books where title=%(title)s and author=%(author)s and year=%(year)s and isbn=%(isbn)s);
The WHERE condition should ideally check the primary or unique key columns instead of all the columns.
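The constraint-based approach can be sketched as follows. To keep the example runnable without a server it uses the built-in sqlite3 module and SQLite's INSERT OR IGNORE; treating isbn as the unique key is an assumption for illustration. On PostgreSQL (9.5 or later) the corresponding statement is INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute(
    "CREATE TABLE IF NOT EXISTS books("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, author TEXT NOT NULL, "
    "year INTEGER NOT NULL, "
    "isbn INTEGER NOT NULL UNIQUE)"  # assumption: isbn identifies a book
)

def insert(title, author, year, isbn):
    # OR IGNORE skips the row silently when the UNIQUE constraint would fail.
    conn.execute(
        "INSERT OR IGNORE INTO books(title, author, year, isbn) VALUES (?, ?, ?, ?)",
        (title, author, year, isbn),
    )
    conn.commit()

insert("the sun", "helen", 1997, 23456777)
insert("the sun", "helen", 1997, 23456777)  # duplicate isbn: ignored

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```

With the constraint in place the database enforces uniqueness itself, so two concurrent inserts cannot both slip past a NOT EXISTS check.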

sqlite3.OperationalError: no such column: year

Using SQLite3 and got this error:
sqlite3.OperationalError: no such column: year
SQLite3 newbie over here.
Really confused right now as to what part of the code went wrong...
import sqlite3
def connect():
    conn=sqlite3.connect("books.db")
    cur=conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS book (id INTEGER PRIMARY KEY, title text, author text, year integer, isbn integer)")
    conn.commit()
    conn.close()
def search(title="",author="",year="",isbn=""):
    conn=sqlite3.connect("books.db")
    cur=conn.cursor()
    cur.execute("SELECT * FROM book WHERE title=? OR author=? OR year=? OR isbn=?",(title,author,year,isbn))
    rows=cur.fetchall()
    conn.close()
    return rows
connect()
print(search(year=1918))
Any help would be appreciated, thanks!!!
Make sure that you have that column.
To list all the columns of the table book:
sqlite3 books.db
and after that:
.schema book
If you don't have a column with the name year you can add it by altering the table, or you can delete your old table and create it again.
One possibility is that no such column exists (the message is correct) because you already created the table, in an earlier version of your code which didn't have that column, so the CREATE TABLE IF NOT EXISTS silently returns.
You could verify this manually by examining .schema in interactive sqlite3.
And/or you could cover the possibility in your code by checking the table structure with e.g.
SELECT * FROM sqlite_master;
If it's not correct, you could use ALTER TABLE book ADD COLUMN ...; renaming a column is more complicated (see SQLite Query Language: ALTER TABLE).
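A check of this kind could be automated along the following lines. This sketch uses PRAGMA table_info rather than sqlite_master because it returns one row per column; the helper name ensure_column and the in-memory database are assumptions for illustration.

```python
import sqlite3

def ensure_column(conn, table, column, decl_type):
    """Add `column` to `table` if an older schema version lacks it."""
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    existing = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    if column not in existing:
        # Identifiers cannot be bound as parameters, so only pass trusted names.
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl_type}")
        conn.commit()

conn = sqlite3.connect(":memory:")
# Simulate a table created by an earlier version of the code, without `year`.
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title text, author text, isbn integer)"
)
ensure_column(conn, "book", "year", "integer")
columns = [row[1] for row in conn.execute("PRAGMA table_info(book)")]
```

Calling ensure_column() again is harmless, since the column is only added when it is missing.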

Resources