Same code different Table name gives Error! in Cassandra

I am working with Apache Cassandra in a Jupyter Notebook. Creating a table and inserting data both work fine, but after I change only the table name, the same code gives an error.
The first version, which works:
session.execute("""CREATE TABLE IF NOT EXISTS table2
    (artist text , song text, firstname text , lastname text, userId int ,sessionid int, iteminsession int ,
    PRIMARY KEY ((userId, sessionId), itemInSession)) """)
file = 'event_datafile_new.csv'
with open(file, encoding = 'utf8') as f:
    csvreader = csv.reader(f)
    next(csvreader) # skip header
    for line in csvreader:
        query = "INSERT INTO table2 (artist, song, firstname, lastname , userId, sessionId, itemInSession)"
        query = query + "VALUES (%s, %s, %s, %s ,%s ,%s ,%s)"
        # building the insert
        # executing the insertion
        session.execute(query , (line[0], line[9], line[1], line[4], int(line[10]), int(line[8]), int(line[3])) )
query = "select artist, song, firstname, lastname from table2 WHERE userId= 10 and sessionId = 182"
try:
    rows = session.execute(query)
except Exception as e:
    print(e)
for row in rows:
    print(f'artist: {row.artist}, song: {row.song}, user first name: {row.firstname},user last name: {row.lastname}')
After changing the table name from 'table2' to 'artist_song', I get this error: InvalidRequest: Error from server: code=2200 [Invalid query] message="Undefined column name firstname"
session.execute("""CREATE TABLE IF NOT EXISTS artist_song
    (artist text , song text, firstname text , lastname text, userId int ,sessionid int, iteminsession int ,
    PRIMARY KEY ((userId, sessionId), itemInSession)) """)
file = 'event_datafile_new.csv'
with open(file, encoding = 'utf8') as f:
    csvreader = csv.reader(f)
    next(csvreader) # skip header
    for line in csvreader:
        query = "INSERT INTO artist_song (artist, song, firstname, lastname , userId, sessionId, itemInSession)"
        query = query + "VALUES (%s, %s, %s, %s ,%s ,%s ,%s)"
        # building the insert
        # executing the insertion
        session.execute(query , (line[0], line[9], line[1], line[4], int(line[10]), int(line[8]), int(line[3])) )
query = "select artist, song, firstname, lastname from artist_song WHERE userId= 10 and sessionId = 182"
try:
    rows = session.execute(query)
except Exception as e:
    print(e)
for row in rows:
    print(f'artist: {row.artist}, song: {row.song}, user first name: {row.firstname},user last name: {row.lastname}')
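The post does not say why the renamed version fails. One possible explanation, offered only as an assumption, is that a table called artist_song already existed in the keyspace with different columns: CREATE TABLE IF NOT EXISTS leaves an existing table untouched, so firstname would be missing from it. A minimal sketch for checking this, assuming Cassandra 3.0 or later (where the system_schema.columns table exists). The helper name table_columns and its keyspace argument are hypothetical:

```python
def table_columns(session, keyspace, table):
    """Return the set of column names Cassandra currently stores for a table.

    `session` is assumed to be a connected cassandra-driver Session.
    """
    rows = session.execute(
        "SELECT column_name FROM system_schema.columns "
        "WHERE keyspace_name = %s AND table_name = %s",
        (keyspace, table),
    )
    return {row.column_name for row in rows}
```

If the returned set for artist_song lacks firstname, running DROP TABLE artist_song before the CREATE statement would recreate the table with the intended columns, at the cost of deleting whatever data it holds.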

Related

How to copy a csv file to postgresql using the copy command?

I wrote the following script for uploading a CSV file to a PostgreSQL database.
import psycopg2
import keys
con = psycopg2.connect(
    host = keys.keys['host'],
    database = keys.keys['database'],
    user = keys.keys['user'],
    password = keys.keys['password'])
#cursor
cur = con.cursor()
#execute query
#Already created ___#cur.execute("CREATE TABLE accounts (user_id serial PRIMARY KEY, username VARCHAR ( 50 ) UNIQUE NOT NULL, password VARCHAR ( 50 ) NOT NULL, email VARCHAR ( 255 ) UNIQUE NOT NULL, created_on TIMESTAMP NOT NULL, last_login TIMESTAMP)")
cur.execute("""\COPY "MyData" FROM 'C:\FILES\TestData.csv' DELIMITER ',' CSV HEADER;""")
#commit the transaction
con.commit()
#close the cursor
cur.close()
#close the connection
con.close()
But it returned the following error:
SyntaxError: syntax error at or near "\"
LINE 1: \COPY "MyData" FROM 'C:\FILES\TestData.csv' DELIMITER ',' C...
I'm not a root user, so I could not directly use the COPY command.

Well.
You can use psycopg2's copy_from -> https://www.psycopg.org/docs/cursor.html#cursor.copy_from
So your code would look something like:
import psycopg2
import keys
con = psycopg2.connect(
    host = keys.keys['host'],
    database = keys.keys['database'],
    user = keys.keys['user'],
    password = keys.keys['password'])
#cursor
cur = con.cursor()
#execute query
#Already created ___#cur.execute("CREATE TABLE accounts (user_id serial PRIMARY KEY, username VARCHAR ( 50 ) UNIQUE NOT NULL, password VARCHAR ( 50 ) NOT NULL, email VARCHAR ( 255 ) UNIQUE NOT NULL, created_on TIMESTAMP NOT NULL, last_login TIMESTAMP)")
with open('C:\\Files\\TestData.csv', 'r') as acc:
    next(acc) # This will skip the header
    cur.copy_from(acc, 'accounts', sep=',')
#commit the transaction
con.commit()
#close the cursor
cur.close()
#close the connection
con.close()
Hope this answers your question.
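If the CSV contains quoted fields or commas inside values, copy_from may not be enough, because it sends the data in PostgreSQL's text format rather than CSV format. A hedged alternative is psycopg2's cursor.copy_expert, which streams a client-side file through COPY ... FROM STDIN, so the server does not need to read the file itself. The function name load_csv is hypothetical, and the default table name is simply the one from the question:

```python
def load_csv(cur, csv_path, table="MyData"):
    """Stream a local CSV file into a table through COPY ... FROM STDIN.

    `cur` is assumed to be an open psycopg2 cursor.
    """
    # The table name is interpolated directly into the SQL text,
    # so it must come from trusted code, never from user input.
    sql = f'COPY "{table}" FROM STDIN WITH (FORMAT csv, HEADER true)'
    with open(csv_path, "r", encoding="utf-8") as f:
        cur.copy_expert(sql, f)
```

It would be called as load_csv(cur, r'C:\FILES\TestData.csv'), followed by con.commit() as in the script above.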

Convert psycopg2 into asyncpg format. "syntax error at or near "%""

I'm converting a Postgres script from psycopg2 to asyncpg.
I'm getting "asyncpg.exceptions.PostgresSyntaxError: syntax error at or near "%""
I assume my placeholder format is incorrect, but I can't find an example of the correct format.
Original working psycopg2 code:
async def commit_trade_postgres(response_data_input):
    conn = await psycopg2.connect(
        "dbname='postgres' user='postgres' password = 'postgres123' host='localhost' port= '5432'")
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS trade_{symbol} (time timestamptz NOT NULL ,side text, size float, price float, tick_direction text)".format(**response_data_input))
    conn.commit()
    cur.execute(
        "SELECT create_hypertable('trade_{symbol}', 'time', if_not_exists => TRUE)".format(**response_data_input))
    conn.commit()
    cur.execute("INSERT INTO trade_{symbol} (time, side, size, price, tick_direction) VALUES (now(), %(side)s, %(size)s, %(price)s, %(tick_direction)s)".format(
        **response_data_input), (response_data_input))
    conn.commit()
    print("commited trade")
My attempt, following the example code supplied in the docs:
async def commit_trade_postgres(response_data_input):
    conn = await asyncpg.connect(database='postgres', user='postgres', password='postgres123', host='localhost', port='5432')
    await conn.execute(
        "CREATE TABLE IF NOT EXISTS trade_{symbol} (time timestamptz NOT NULL ,side text, size float, price float, tick_direction text)".format(**response_data_input))
    await conn.execute(
        "SELECT create_hypertable('trade_{symbol}', 'time', if_not_exists => TRUE)".format(**response_data_input))
    await conn.execute("INSERT INTO trade_{symbol} (time, side, size, price, tick_direction) VALUES (now(), %(side)s, %(size)s, %(price)s, %(tick_direction)s)".format(
        **response_data_input), (response_data_input))
    print("commited trade")
EDIT: Sample response, from which I'm extracting 'data' as a dict.
response_dict_instrument = {'topic': 'instrument.BTCUSD', 'data': [{'symbol': 'BTCUSD', 'mark_price': 12367.29, 'index_price': 12360.1}]}

You're formatting the query yourself, which you should never do. I would also suggest creating the table for every incoming symbol beforehand rather than doing it dynamically.
asyncpg uses a $ sign followed by a number ($1, $2, ...) as the placeholder for values it substitutes into the query for you (see the asyncpg docs).
So the syntax should look like this if the input is a dictionary:
async def save_input(input):
    # create connection
    conn = ...
    trade_symbol = input['symbol']
    query = "create table if not exists trade_{trade_symbol} ... ".format(trade_symbol=trade_symbol) # your column names go here
    await conn.execute(query)
    query = "SELECT create_hypertable('trade_{trade_symbol} ...".format(trade_symbol=trade_symbol)
    await conn.execute(query)
    # I'm not copying your exact keys, you should do it yourself
    values = (input['key1'], input['key2'], input['key3'])
    query = "insert into trade_{trade_symbol} (key1, key2, key3) values ($1, $2, $3);".format(trade_symbol=trade_symbol)
    await conn.execute(query, *values)
    await conn.close()
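When many psycopg2-style queries need converting, the %(name)s placeholders can be rewritten mechanically. The following is a minimal sketch, not part of asyncpg itself: the helper name to_positional is hypothetical, and it assumes the query contains no literal text that merely looks like a %(name)s placeholder.

```python
import re

# Matches psycopg2-style named placeholders such as %(side)s
_NAMED_PLACEHOLDER = re.compile(r"%\((\w+)\)s")


def to_positional(query, params):
    """Rewrite %(name)s placeholders as $1, $2, ... for asyncpg.

    Returns the rewritten query and the list of argument values in
    placeholder order. A name used twice reuses the same $n.
    """
    args = []
    positions = {}

    def replace(match):
        name = match.group(1)
        if name not in positions:
            args.append(params[name])
            positions[name] = len(args)
        return f"${positions[name]}"

    return _NAMED_PLACEHOLDER.sub(replace, query), args
```

The result would then be passed on as query, args = to_positional(sql, response_data_input) followed by await conn.execute(query, *args). Keys in the dictionary that the query does not mention, such as symbol, are simply ignored.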

sqlite3 python3, user input for database

I have been trying to enter data directly into an SQLite database from user input, but it only captures the first input and leaves out the rest. Where could I be going wrong?
Here is the code:
import sqlite3 as lite
class DataInput:
    def __init__(self):
        self.id = input("Enter ID: ")
        self.name = input("Enter name: ")
        self.price = input("Enter price: ")
running = True
a = DataInput()
con = lite.connect('kev.db')
with con:
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS cars")
    cur.execute("CREATE TABLE cars(id INT, name TEXT, price INT)")
    cur.execute("INSERT INTO cars VALUES(?, ?, ?)", (a.id, a.name, a.price))
    while running:
        DataInput()
        continue

The continue is not helping you.
A constructor that has the side effect of offering three user prompts is, ummm, a bit unusual, but we'll let that one go.
You want to DROP/CREATE once, and then INSERT many times:
with lite.connect('kev.db') as con:
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS cars")
    cur.execute("CREATE TABLE cars(id INT, name TEXT, price INT)")
    running = True
    while running:
        a = DataInput()
        cur.execute("INSERT INTO cars VALUES(?, ?, ?)", (a.id, a.name, a.price))
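Since running is never set to False, that loop has no way to end and the with block never reaches its commit. A minimal sketch of one way to finish cleanly, assuming a blank ID means "stop"; the function name collect_cars and the read parameter are hypothetical, the latter added only so the prompts can be replaced in tests:

```python
import sqlite3


def collect_cars(con, read=input):
    """Create the cars table once, then insert one row per set of answers.

    Stops when the user enters a blank ID, then commits all rows.
    """
    con.execute("DROP TABLE IF EXISTS cars")
    con.execute("CREATE TABLE cars(id INT, name TEXT, price INT)")
    while True:
        car_id = read("Enter ID (blank to stop): ").strip()
        if not car_id:
            break
        name = read("Enter name: ")
        price = read("Enter price: ")
        # int() raises ValueError on non-numeric input; handle as needed
        con.execute(
            "INSERT INTO cars VALUES(?, ?, ?)",
            (int(car_id), name, int(price)),
        )
    con.commit()
```

With the file from the question this would be called as collect_cars(sqlite3.connect('kev.db')).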

Python code and SQLite3 won't INSERT data in table Pycharm?

What am I doing wrong here? It runs without error and it has created the table, but the rows are empty. Why?
import sqlite3
sqlite_file = (r"C:\Users\Dragan\PycharmProjects\MyProject\ArchLib2.db")
conn = sqlite3.connect(sqlite_file)
cursor = conn.cursor()
table_name = 'Archive'
sql = 'CREATE TABLE IF NOT EXISTS ' + table_name + '("first_name" varchar NOT NULL, "second_name" varchar NOT NULL)'
cursor.execute(sql)
sql = 'INSERT INTO ' + table_name + '(first_name,second_name) VALUES ("value1","value2");'
cursor.execute(sql)
cursor.close()

OK, I found out why it didn't INSERT data into the table.
First, I changed the quotes around the SQL string from ' to ".
Second, because the SQL string is now delimited by double quotes, a double-quoted value inside it such as "value1" has to be escaped with backslashes: \"value1\".
Third, and most important, after the line that executes the INSERT you have to add conn.commit(). Without it the inserted row is never saved to the database file.
The final code looks like this:
import sqlite3
sqlite_file = (r"C:\Users\Dragan\PycharmProjects\MyProject\ArchLib2.db")
conn = sqlite3.connect(sqlite_file)
cursor = conn.cursor()
table_name = 'Archive'
sql = "CREATE TABLE IF NOT EXISTS " + table_name + "(first_name varchar NOT NULL, datetime)"
cursor.execute(sql)
sql = "INSERT INTO " + table_name + "(first_name,datetime) VALUES (\"value1\",CURRENT_TIMESTAMP)"
cursor.execute(sql)
conn.commit()
cursor.close()
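The quoting and escaping can be avoided altogether by passing values as ? parameters, and the commit can be left to the connection's context manager. A minimal sketch under those assumptions, using the table and columns from the question; the function name add_person is hypothetical:

```python
import sqlite3


def add_person(conn, first_name, second_name):
    """Insert one row into Archive and commit it."""
    # "with conn" commits on success and rolls back if an exception is raised
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS Archive "
            "(first_name varchar NOT NULL, second_name varchar NOT NULL)"
        )
        conn.execute(
            "INSERT INTO Archive (first_name, second_name) VALUES (?, ?)",
            (first_name, second_name),
        )
```

Note that the context manager only ends the transaction; the connection itself still has to be closed with conn.close() when it is no longer needed.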

Why is my data insertion in my cassandra database so slow?

This is the query I use to check whether the current data ID is already present in the Cassandra database:
row = session.execute("SELECT * FROM articles where id = %s", [id])
I parse messages from Kafka and then check whether each message already exists in the Cassandra database. If it does not exist, an insert should be performed; if it does exist, the data should not be inserted.
messages = consumer.get_messages(count=25)
if len(messages) == 0:
    print 'IDLE'
    sleep(1)
    continue
for message in messages:
    try:
        message = json.loads(message.message.value)
        data = message['data']
        if data:
            for article in data:
                source = article['source']
                id = article['id']
                title = article['title']
                thumbnail = article['thumbnail']
                #url = article['url']
                text = article['text']
                print article['created_at'],type(article['created_at'])
                created_at = parse(article['created_at'])
                last_crawled = article['last_crawled']
                channel = article['channel']#userid
                category = article['category']
                #scheduled_for = created_at.replace(minute=created_at.minute + 5, second=0, microsecond=0)
                scheduled_for=(datetime.utcnow() + timedelta(minutes=5)).replace(second=0, microsecond=0)
                row = session.execute("SELECT * FROM articles where id = %s", [id])
                if len(list(row))==0:
                    #id parse base62
                    ids = [id[0:2],id[2:9],id[9:16]]
                    idstr=''
                    for argv in ids:
                        num = int(argv)
                        idstr=idstr+encode(num)
                    url='http://weibo.com/%s/%s?type=comment' % (channel,idstr)
                    session.execute("INSERT INTO articles(source, id, title,thumbnail, url, text, created_at, last_crawled,channel,category) VALUES (%s,%s, %s, %s, %s, %s, %s, %s, %s, %s)", (source, id, title,thumbnail, url, text, created_at, scheduled_for,channel,category))
                    session.execute("INSERT INTO schedules(source,type,scheduled_for,id) VALUES (%s, %s, %s,%s) USING TTL 86400", (source,'article', scheduled_for, id))
                    log.info('%s %s %s %s %s %s %s %s %s %s' % (source, id, title,thumbnail, url, text, created_at, scheduled_for,channel,category))
    except Exception, e:
        log.exception(e)
        #log.info('error %s %s' % (message['url'],body))
        print e
        continue
Edit:
Each ID should have exactly one row in the table, which is what I want. As soon as I add different scheduled_for times for the same ID, my system crashes. Adding the check if len(list(row))==0: seems like the right idea, but my system is very slow after that.
This is my table description:
DROP TABLE IF EXISTS schedules;
CREATE TABLE schedules (
    source text,
    type text,
    scheduled_for timestamp,
    id text,
    PRIMARY KEY (source, type, scheduled_for, id)
);
The scheduled_for value can change. Here is a concrete example:
Hao article 2016-01-12 02:09:00+0800 3930462206848285
Hao article 2016-01-12 03:09:00+0801 3930462206848285
Hao article 2016-01-12 04:09:00+0802 3930462206848285
Hao article 2016-01-12 05:09:00+0803 3930462206848285
Thanks for your replies!

Why don't you use INSERT ... IF NOT EXISTS?
https://docs.datastax.com/en/cql/3.1/cql/cql_reference/insert_r.html
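A sketch of that suggestion with the Python driver, assuming the articles table from the question has id as its primary key (the post does not show that schema). IF NOT EXISTS turns the insert into a lightweight transaction, and the driver's result set exposes was_applied to report whether the row was actually written, which replaces the separate SELECT. The function name insert_article_once is hypothetical:

```python
INSERT_ARTICLE = (
    "INSERT INTO articles (source, id, title, thumbnail, url, text, "
    "created_at, last_crawled, channel, category) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s) IF NOT EXISTS"
)


def insert_article_once(session, values):
    """Insert an article only if no row with the same primary key exists.

    `session` is assumed to be a connected cassandra-driver Session and
    `values` a 10-tuple in the column order of INSERT_ARTICLE.
    Returns True if the row was written, False if it already existed.
    """
    result = session.execute(INSERT_ARTICLE, values)
    return result.was_applied
```

The schedules insert would then run only when insert_article_once returns True. Lightweight transactions carry their own coordination cost, so whether this is faster than the SELECT-then-INSERT pattern is something to measure rather than assume.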
