Cassandra - SyntaxException: line ... no viable alternative at input in select statement - cassandra

I am trying to select data from a Cassandra database using the query below, but it fails:
SELECT id from keyspace.table where code=123 and toTimestamp(now()) >= some_date;
Error: SyntaxException: line 1:103 no viable alternative at input '(' (...table where code=123 and [toTimestamp](...)
It looks like toTimestamp(now()) is causing the issue.
Can someone please suggest what the issue is and how to solve it?
Thanks.

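CQL expects a column name on the left-hand side of each relation in the WHERE clause, so a function call such as toTimestamp(now()) in that position is rejected by the parser. A reliable workaround is to get the current time inside your application and pass it to the query. The request to allow this is tracked as CASSANDRA-8488.
In practice, though, your query should put its condition on a column, not on a calculated value.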
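The workaround above can be sketched in Python. This is only an illustration: build_select is a hypothetical helper, the %s placeholders follow the DataStax Python driver's convention for simple statements, and depending on the table's schema the query may still need ALLOW FILTERING or a different key design.

```python
from datetime import datetime, timezone

def build_select(code, now=None):
    """Return a CQL string and its parameters, with the time computed client-side."""
    if now is None:
        now = datetime.now(timezone.utc)
    # The column stays on the left-hand side; the current time is a bound value.
    cql = "SELECT id FROM keyspace.table WHERE code = %s AND some_date <= %s"
    return cql, (code, now)

cql, params = build_select(123)
# With the DataStax Python driver this would be run as: session.execute(cql, params)
```

Passing the time as a parameter also keeps the query text constant, so the condition is expressed on the some_date column instead of on a computed value.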

Related

Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead

I am trying to merge to a SQL Database using the following code in Databricks with pyspark
query = """
MERGE INTO deltadf t
USING df s
ON s.SLAId_Id = t.SLAId_Id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
con = driver_manager.getConnection(url)
stmt = con.createStatement()
stmt.executeUpdate(query)
stmt.close()
But I'm getting the following error:
SQLException: Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead at position 25.
Any thoughts on where I might be going wrong?
I don't know why you're getting this exact error. However, I believe there are a number of issues with what you are trying to do.
Running the query via JDBC makes it run in SQL Server only. Constructs like WHEN MATCHED THEN UPDATE SET * / WHEN NOT MATCHED THEN INSERT * will not work there. Databricks accepts them, but for SQL Server you need to explicitly provide the columns to update and the values to insert (reference).
Also, do you actually have tables named deltadf and df in SQL Server? I suppose you have a DataFrame or temporary view named df, and that will not work. As said, this query executes in SQL Server only. If you want to upload data from a DataFrame, use df.write.format("jdbc").save (reference).
See this Fiddle: if deltadf and df are tables, running this query in SQL Server (any version) will only complain about Incorrect syntax near '*'.
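To illustrate what "explicitly provide columns" means, here is a minimal sketch that generates a SQL Server style MERGE with spelled-out column lists. build_merge_sql is a hypothetical helper, Status and DueDate are made-up column names, and identifiers are interpolated without escaping, so it assumes trusted names only.

```python
def build_merge_sql(target, source, key, columns):
    """Build a T-SQL MERGE with explicit column lists instead of SET * / INSERT *."""
    # Update every listed column except the join key.
    updates = ", ".join(f"t.{c} = s.{c}" for c in columns if c != key)
    col_list = ", ".join(columns)
    val_list = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON s.{key} = t.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({val_list});"
    )

# Status and DueDate are hypothetical column names used for illustration.
query = build_merge_sql("deltadf", "df", "SLAId_Id", ["SLAId_Id", "Status", "DueDate"])
print(query)
```

The generated statement still requires deltadf and df to exist as tables in SQL Server; it only removes the * shorthand that SQL Server rejects.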
SQLException: Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead at position 25.
You will get this error if the statement leaves out a required field or uses syntax that is not accepted.
I performed the merge operation and it works fine for me without error. Please see the references below.
Reference:
https://www.youtube.com/watch?v=i5oM2bUyH0o
https://docs.databricks.com/delta/delta-update.html#upsert-into-a-table-using-merge
https://www.sqlshack.com/sql-server-merge-statement-overview-and-examples/

Access "table$partitions" through Spark Sql

I figured out that running the following code does a full scan of the table:
select max(run_id) from database.table
So I switched my code to work with the following syntax:
select max(run_id) from "database"."table$partitions"
This query works great on Athena but when I try to execute it with Spark Sql I get the following error:
mismatched input '"database"' expecting <EOF>(line 1, pos 24)
It seems like Spark SQL treats the quotes as the end of the query.
Any ideas how to make this query work on spark sql?
Thanks
My solution for this problem was:
from pyspark.sql import functions as f
# List the partitions, extract the value of partition_name from each one, and take the max.
sql_context.sql(f'show partitions {table_name}').agg(
    f.max(f.regexp_extract('partition', rf'{partition_name}=([^/]+)', 1))).collect()[0][0]
Advantage: it does not do a full scan of the table.
Disadvantage: it scans all partition levels, and the code isn't elegant.
Anyway, that's the best I found.
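The extraction step can be tried without a cluster. The sketch below applies the same kind of pattern to plain strings shaped like SHOW PARTITIONS output; the sample partition strings and the max_partition_value helper are hypothetical. One thing it makes visible: extracted values are strings, so taking the max compares them lexicographically unless they are cast first.

```python
import re

def max_partition_value(partitions, partition_name, cast=str):
    """Return the largest value of partition_name found in the partition strings."""
    pattern = re.compile(rf"(?:^|/){re.escape(partition_name)}=([^/]+)")
    values = []
    for p in partitions:
        m = pattern.search(p)
        if m:
            values.append(cast(m.group(1)))
    return max(values) if values else None

# Hypothetical partition strings, one per row of SHOW PARTITIONS.
sample = ["run_id=9/day=2020-01-01", "run_id=10/day=2020-01-02"]
as_text = max_partition_value(sample, "run_id")         # compared as strings
as_number = max_partition_value(sample, "run_id", int)  # compared as integers
```

If run_id is numeric, a cast before the max is probably what you want in the Spark version as well.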

How to use tuple <timestamp,text> in cql where clause?

I am using tuple<timestamp, text> to store timestamp and zone information in the Cassandra database.
I want to filter data based on timestamps.
Is there any way I can use this tuple in a WHERE clause for comparison in CQL?
I have tried the following CQL query, but it is not giving me the proper results:
SELECT extid,time_created_ from d_account where time_created_>=('2021-04-06 7:09:06', '+05:30') allow filtering;
Thanks in advance.
Posting this answer so that someone might get help in the future.
The following query worked for me and returned the expected results.
SELECT extid,time_created_ from d_account where time_created_ < ('2021-04-06 07:24:10.347+0000', '+05:30') allow filtering;
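The working query writes the timestamp with a zero-padded hour, milliseconds, and an explicit +0000 offset. Assuming the timestamps start out as timezone-aware Python datetime objects, a small helper (to_cql_timestamp is a hypothetical name) could produce literals in that shape:

```python
from datetime import datetime, timedelta, timezone

def to_cql_timestamp(dt):
    """Format an aware datetime as 'YYYY-MM-DD HH:MM:SS.mmm+0000' in UTC."""
    utc = dt.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return f"{utc:%Y-%m-%d %H:%M:%S}.{millis:03d}+0000"

# A local time in the +05:30 zone, converted to the UTC literal used in the query.
ist = timezone(timedelta(hours=5, minutes=30))
local = datetime(2021, 4, 6, 12, 54, 10, 347000, tzinfo=ist)
tuple_literal = f"('{to_cql_timestamp(local)}', '+05:30')"
```

The resulting tuple_literal can be placed on the right-hand side of the comparison in the WHERE clause, as in the query above.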

Why does sqlite throw a syntax error in the python program?

My table in sqlite3 is created with the following:-
'CREATE TABLE IF NOT EXISTS gig_program ( gig_program_id VARCHAR(20) PRIMARY KEY );'
When I try to insert data into the table using python 3.8 with the following:-
sql = 'INSERT INTO gig_program ( gig_program_id ) VALUES ( "20200524120727" );'
cur.execute(sql)
the following exception was thrown:-
near "gig_program": syntax error
When I cut and paste the insert command into the sqlite3 console, it works.
I have also tried using another editor for the program (thinking that there may be hidden characters) but the result is the same.
I would appreciate help. I have used similar methods in other parts of the program to insert data and they work without issue.
Thank you for looking into my questions.
I found that it was actually my mistake. The exception was raised by a second SQL statement, in which I had left out the "FROM" keyword.
Thank you everyone for your time.
Hope everyone is doing well.
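Since the error came from a different statement than the one being inspected, it can help to report the failing SQL together with the exception. The sketch below does that with an in-memory database; run_all is a hypothetical helper and the broken SELECT is an invented stand-in for the second statement. It also passes the value as a parameter, which avoids double-quoted string literals (SQLite treats double quotes as identifier quoting first).

```python
import sqlite3

def run_all(conn, statements):
    """Execute (sql, params) pairs in order; on failure, report which statement broke."""
    for sql, params in statements:
        try:
            conn.execute(sql, params)
        except sqlite3.Error as exc:
            raise RuntimeError(f"{exc} in statement: {sql}") from exc

conn = sqlite3.connect(":memory:")
statements = [
    ("CREATE TABLE IF NOT EXISTS gig_program ( gig_program_id VARCHAR(20) PRIMARY KEY );", ()),
    ("INSERT INTO gig_program ( gig_program_id ) VALUES ( ? );", ("20200524120727",)),
    ("SELECT * gig_program;", ()),  # hypothetical second statement with FROM missing
]
message = ""
try:
    run_all(conn, statements)
except RuntimeError as err:
    message = str(err)
    print(message)
```

The printed message names the SELECT, not the INSERT, as the statement with the syntax error.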

Can we run complex multi line SQL queries using Blueprism?

I am new to SQL in Blue Prism. I am able to configure the SQL object and execute simple queries, but I am having trouble running complex multi-line SQL queries.
When I tried to execute the query below in Blue Prism, I got an error message saying "Incorrect Syntax near Database2":
"select top 10 * from [Database1].[dbo].[Table1]
join [Database1].[dbo].[Table2] on [Database1].[dbo].[Table2].Fieldname1=[Database1].[dbo].[Table1].Fieldname2
join [Database2].[dbo].[Table1] on [Database2].[dbo].[Table1].Fieldname1=[Database1].[dbo].[Table2].Fieldname2"
Can somebody please help me understand what is wrong with the above query?
I found the answer myself: there should not be any additional whitespace characters in the query, and the entire query should be on one continuous line. The beauty of Blue Prism is that it can execute queries of any complexity without constraints, but the syntax needs to be adjusted accordingly. We should always reference field and table names in the following format: [databasename].[dbo].[tablename].[fieldname]
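If the query is authored elsewhere in multi-line form, the line breaks can be collapsed before pasting it into the SQL object. A minimal sketch in Python, assuming the query contains no string literals whose internal whitespace matters (to_single_line is a hypothetical helper):

```python
def to_single_line(sql):
    """Collapse every run of whitespace, including newlines, into one space."""
    return " ".join(sql.split())

multi_line = """select top 10 * from [Database1].[dbo].[Table1]
join [Database1].[dbo].[Table2] on [Database1].[dbo].[Table2].Fieldname1=[Database1].[dbo].[Table1].Fieldname2
join [Database2].[dbo].[Table1] on [Database2].[dbo].[Table1].Fieldname1=[Database1].[dbo].[Table2].Fieldname2"""

single_line = to_single_line(multi_line)
print(single_line)
```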
