Same query doesn't run on restored Cassandra database

I am working on Cassandra migration.
I built a new Cassandra cluster - Cassandra 2.1.8 on Ubuntu 14.04. Database was restored from snapshots.
Source Cassandra cluster is also version 2.1.8.
I am facing a weird issue.
On the original cluster I can run the following query using cqlsh without any errors. cqlsh is version 5.0.1.
SELECT * FROM "featureitems" WHERE "categoryId" = 2 LIMIT 100;
On the new cluster the same query throws an error:
InvalidRequest: code=2200 [Invalid query] message="Undefined name categoryId in where clause ('categoryId = 2')"
However, it runs perfectly fine when I remove the double quotes:
SELECT * FROM featureitems WHERE categoryId = 2 LIMIT 100;
It looks like some configuration issue, but I don't know where to look. Any suggestion in that sense is appreciated.

Cassandra converts all column, table, and keyspace names to lowercase unless they are enclosed in double quotes.
So if you need uppercase characters in a column, table, or keyspace name, use double quotes.
You can use the DESC TABLE featureitems command to describe the table and see how the column names are actually stored.
In your first query you have enclosed categoryId in double quotes, so Cassandra looks for a column with a capital I.
In your second query categoryId is not enclosed in double quotes, so it is converted to categoryid, which is present in the table, and the query works.
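The rule above can be sketched in a few lines of Python. This is only an illustration of the case folding, not Cassandra's actual parser; the function name normalize_cql_identifier and the stored_columns set are made up for the example, and the set assumes the restored table stores the column in lowercase.

```python
def normalize_cql_identifier(name: str) -> str:
    """Return the column name a CQL identifier resolves to (illustrative only)."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted identifier: the case is preserved exactly as written.
        return name[1:-1]
    # Unquoted identifier: folded to lowercase.
    return name.lower()


# Assumption for illustration: the restored table stores the column in lowercase.
stored_columns = {"categoryid"}

for written in ('"categoryId"', "categoryId"):
    resolved = normalize_cql_identifier(written)
    status = "found" if resolved in stored_columns else "undefined name"
    print(f"{written} -> {resolved}: {status}")
```

The quoted form resolves to categoryId, which does not match the stored lowercase name, while the unquoted form resolves to categoryid and matches.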


Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead

I am trying to merge into a SQL database using the following code in Databricks with PySpark:
query = """
MERGE INTO deltadf t
USING df s
ON s.SLAId_Id = t.SLAId_Id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
con = driver_manager.getConnection(url)
stmt = con.createStatement()
stmt.executeUpdate(query)
stmt.close()
But I'm getting the following error:
SQLException: Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead at position 25.
Any thoughts on where I might be going wrong?
I don't know why you're getting this exact error. However, I believe there are a number of issues with what you are trying to do.
Running the query via JDBC makes it run in SQL Server only. Constructs like WHEN MATCHED THEN UPDATE SET * / WHEN NOT MATCHED THEN INSERT * will not work there. Databricks accepts them, but for SQL Server you need to explicitly provide the columns to update and the values to insert (reference).
Also, do you actually have tables named deltadf and df in SQL Server? I suppose you have a DataFrame or temporary view named df, which will not work. As said, this query executes in SQL Server only. If you want to upload data from a DataFrame, use df.write.format("jdbc").save (reference).
See this Fiddle - if deltadf and df are tables, running this query in SQL Server (any version) will only complain about Incorrect syntax near '*'.
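To illustrate the point about explicit columns, here is a minimal sketch that builds a MERGE statement SQL Server should accept. The helper name build_sqlserver_merge and the columns Status and UpdatedAt are hypothetical (only SLAId_Id comes from the question), and the sketch assumes both deltadf and df exist as real tables in SQL Server. Identifiers are interpolated directly, so it is only suitable for trusted names.

```python
def build_sqlserver_merge(target, source, key, columns):
    """Build a T-SQL MERGE with explicit column lists instead of '*'."""
    non_key = [c for c in columns if c != key]
    if not non_key:
        raise ValueError("need at least one non-key column to update")
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in non_key)
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)
    # T-SQL requires MERGE to be terminated with a semicolon.
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON s.{key} = t.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals});"
    )


# Status and UpdatedAt are made-up column names for illustration.
query = build_sqlserver_merge(
    "deltadf", "df", "SLAId_Id", ["SLAId_Id", "Status", "UpdatedAt"]
)
print(query)
```

The resulting string could then be passed to stmt.executeUpdate(query) as in the question.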
SQLException: Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead at position 25.
You will get this error if you leave out a required field or use incorrect syntax in the statement.
I performed the merge operation and it works fine for me without errors. Please see the references below.
References:
https://www.youtube.com/watch?v=i5oM2bUyH0o
https://docs.databricks.com/delta/delta-update.html#upsert-into-a-table-using-merge
https://www.sqlshack.com/sql-server-merge-statement-overview-and-examples/

ALTER TABLE returns "ConfigurationException: Column family ID"

I am getting an error while executing an ALTER TABLE script:
ALTER TABLE user.employee ADD salary text;
ServerError: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found e5da3980-83eb-11ec-8c56-1b3845d1a791; expected c8ac48d0-83eb-11ec-8c56-1b3845d1a791)
When I describe the table, I can see the newly created column, but I am not able to access it. It throws the error below:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Undefined name xxxxxxxxx in selection clause"
We have close to 100GB of data.
This looks like the same question asked on https://community.datastax.com/questions/13220/ so I'm re-posting my answer here.
This exception indicates that you have a schema disagreement in your cluster:
ConfigurationException: Column family ID mismatch (\
found e5da3980-83eb-11ec-8c56-1b3845d1a791; \
expected c8ac48d0-83eb-11ec-8c56-1b3845d1a791 \
)
In my experience, the most common cause of this problem is that you dropped and re-created the table without waiting for the schema to propagate to all nodes in the cluster in between the DROP and CREATE. Alternatively, it's possible that you've tried to create the table and assumed it didn't work then tried to create it again.
In any case, Cassandra thinks the table was created at 05:48 GMT but found a version created at 05:49 GMT. For what it's worth:
e5da3980-83eb-11ec-8c56-1b3845d1a791 = February 2, 2022 at 5:49:33 AM GMT
c8ac48d0-83eb-11ec-8c56-1b3845d1a791 = February 2, 2022 at 5:48:44 AM GMT
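The timestamps can be recovered because the table IDs are time-based (version 1) UUIDs. A minimal sketch of the decoding using only the Python standard library; the helper name uuid1_to_datetime is made up for this example.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def uuid1_to_datetime(value: str) -> datetime:
    """Decode the creation time embedded in a version 1 UUID."""
    u = uuid.UUID(value)
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # u.time counts 100 ns intervals; convert to whole microseconds.
    microseconds = (u.time - UUID_EPOCH_OFFSET) // 10
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=microseconds)


for table_id in (
    "e5da3980-83eb-11ec-8c56-1b3845d1a791",
    "c8ac48d0-83eb-11ec-8c56-1b3845d1a791",
):
    print(table_id, "=", uuid1_to_datetime(table_id).isoformat())
```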
You'll need to resolve the schema disagreement. Depending on the Cassandra version you can either (a) run nodetool resetlocalschema on nodes which have a different schema version based on the output of nodetool describecluster, or (b) perform a rolling restart of all nodes. Cheers!
ExecutionException: org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found e5da3980-83eb-11ec-8c56-1b3845d1a791; expected c8ac48d0-83eb-11ec-8c56-1b3845d1a791)
Has that column been deleted or added more than once? Cassandra (especially the pre-3.0 versions) is notorious for problems with that.
Check the output of nodetool describecluster. Are there multiple schema versions being reported?
If there are multiple schema versions, then run a rolling restart of the cluster. That's a sure-fire way to force schema agreement. Check the table, and see if that column is there. If not, try to add it.
The other solution would be to try adding it with a different name (e.g. "salary2").

select string literal in cassandra cql

I am new to Cassandra and I am trying to run a simple query in CQL:
select aggregate_name as name, 'test' as test from aggregates;
and I get an error: Line 1: no viable alternative at input ''test''
The question is: how can I select a string literal in Apache Cassandra?
I found an ugly workaround, if you really want to print a text value as a column:
cqlsh> select aggregate_name as name, blobAsText(textAsBlob('test')) as test from aggregates;
name | test
------+------
dude | test
CQL supports native Cassandra functions as a select_expression, so you can convert your string literal to a blob and back again as shown above. (source)
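As a rough picture of why the round trip returns the original text: textAsBlob yields the UTF-8 bytes of the string and blobAsText decodes them again. The Python functions below only mimic that behaviour for illustration; they are not part of any Cassandra driver.

```python
def text_as_blob(text: str) -> bytes:
    # Mimics textAsBlob: a text value becomes its UTF-8 bytes.
    return text.encode("utf-8")


def blob_as_text(blob: bytes) -> str:
    # Mimics blobAsText: the bytes are decoded back as UTF-8 text.
    return blob.decode("utf-8")


blob = text_as_blob("test")
print("0x" + blob.hex())
print(blob_as_text(blob))
```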

How can I describe a table in a Cassandra database?

$describe = new Cassandra\SimpleStatement(<<<EOD
describe keyspace.tablename
EOD
);
$session->execute($describe);
I used the above code but it is not working.
How can I fetch the field names and their data types from a Cassandra table?
Refer to the CQL documentation. DESCRIBE expects a keyword such as TABLE, SCHEMA, or KEYSPACE before the name:
describe table keyspace.tablename
It's also a cqlsh command, not an actual CQL command, so it cannot be executed through a driver. To get this information, query the system tables. Try:
select * from system.schema_columns;
- or for more recent versions -
select * from system_schema.columns ;
If you are using the PHP driver, you may want to check out http://datastax.github.io/php-driver/features/#schema-metadata
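To narrow either system table query down to one table, you can filter on the keyspace and table name. The sketch below only builds the CQL text, with ? as bind markers for a prepared statement; the helper name column_metadata_query is hypothetical. It assumes the pre-3.0 table stores the type in the validator column and the newer one in the type column.

```python
def column_metadata_query(major_version: int):
    """Return (cql, name of the column holding the data type)."""
    if major_version >= 3:
        # Cassandra 3.0 and later keep schema metadata in system_schema.
        return (
            "SELECT column_name, type FROM system_schema.columns "
            "WHERE keyspace_name = ? AND table_name = ?",
            "type",
        )
    # Earlier versions: the type is stored in 'validator' (assumed here to be
    # a marshal class name rather than a CQL type name).
    return (
        "SELECT column_name, validator FROM system.schema_columns "
        "WHERE keyspace_name = ? AND columnfamily_name = ?",
        "validator",
    )


for version in (2, 3):
    cql, type_column = column_metadata_query(version)
    print(version, type_column, cql)
```

The keyspace and table names are then bound as the two parameters when the statement is executed through your driver.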
Try desc table keyspace.tablename;

Syntax error at position 7: unexpected "*" for `Select * FROM mytable;`

I am writing because I have a problem with Cassandra. After importing the data from Pentaho as shown here
http://wiki.pentaho.com/display/BAD/Write+Data+To+Cassandra
when I try to execute the query
Select * FROM mytable;
Cassandra gives me an error message
Syntax error at position 7: unexpected "*" for Select * FROM mytable;.
and doesn't show the query results. Why? What does this error mean?
The steps I follow are:
start the Cassandra CLI utility;
use the keyspace added from Pentaho (use tpc_h);
run a select to show the data added (Select * FROM mytable;)
The cassandra-cli does not support any CQL version. It has its own syntax, which you can find on DataStax's website.
Just for clarity, in CQL, to select everything from a table (aka column family) called mytable stored in a keyspace called myks, you would use:
SELECT * FROM myks.mytable;
The rough equivalent in cassandra-cli would be:
USE myks;
LIST mytable;
Note that in the CLI you are limited to the first 100 rows by default. If this is a problem, you can use the limit clause to specify how many rows you want:
LIST mytable limit 10000;
As for this:
I have read that in Cassandra it isn't possible to do joins as in SQL. Isn't there a shortcut to get around this disadvantage?
There is a reason why joins don't exist in Cassandra. It's the same reason that C* isn't ACID compliant: it sacrifices that functionality for its performance and scalability. So it's not a disadvantage; you just need to rethink your data model if you need joins. Also take a look at this question / answer.
