TL;DR: a table can still be inaccessible even though system_schema.tables already contains a record for it.
I'm trying to use Cassandra concurrently.
Cassandra version: [cqlsh 5.0.1 | Cassandra 3.11.3 | CQL spec 3.4.4 | Native protocol v4]
I have two Python scripts, a Producer and a Consumer, using cassandra-driver==3.16.0 and running in different processes.
While the Producer creates and fills the table, the Consumer waits until the table is created by polling with this CQL statement:
from cassandra.query import SimpleStatement

table_exists = False
while not table_exists:
    cql = SimpleStatement(
        "SELECT table_name FROM system_schema.tables WHERE keyspace_name = 'test_keyspace' AND table_name = 'test_table'"
    )
    results = cassandra_session.execute(cql)
    table_exists = bool(results.current_rows)
Once the statement returns at least one record, I conclude that the table has been created and try to read it with a SELECT:
SELECT * FROM test_keyspace.test_table WHERE ...
But sometimes I get this really annoying error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/stress.py", line 128, in runner
for r in select(TEST_KEYSPACE, table_name):
File "/stress.py", line 63, in select
results = cassandra_session.execute(statement)
File "cassandra/cluster.py", line 2171, in cassandra.cluster.Session.execute
File "cassandra/cluster.py", line 4062, in cassandra.cluster.ResponseFuture.result
cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] message="unconfigured table test_table"
From what I have found, this error happens when a SELECT statement runs against a table that has not been fully created yet.
So even though system_schema.tables already contains a record for the table, the table is not yet accessible.
Maybe there is a more reliable way to check table accessibility? Or common workaround?
With single-node Cassandra setups, I have witnessed structural changes failing to propagate immediately: I create a table, then insert into it, and the insert fails because the table does not exist. Then I check whether the table exists, and it is there. And then, since some time has passed, inserts work.
The only way I managed to make single-node Cassandra behave consistently was to introduce a one-second wait after every structural change. This was fine by me, since single-node Cassandra is only used in local development scenarios. In production environments, I simply disable the wait.
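A minimal sketch of that workaround with the Python driver, reusing the cassandra_session from the question (the one-second figure is my empirical value, not something Cassandra guarantees):

import time

SINGLE_NODE = True  # enable the wait only on single-node/local setups

def execute_ddl(session, ddl_statement):
    """Run a schema-changing statement, then pause so the change settles."""
    result = session.execute(ddl_statement)
    if SINGLE_NODE:
        time.sleep(1.0)  # empirical wait after every structural change
    return result

execute_ddl(cassandra_session,
            "CREATE TABLE IF NOT EXISTS test_keyspace.test_table (id int PRIMARY KEY, value text)")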
In my case, the query was looking for the table in another keyspace.
ERROR:
cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] message="unconfigured table <table_name>"
I made sure the keyspace it points to was the right one, and then it worked.
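With the Python driver from the question, pinning the keyspace explicitly (or fully qualifying the table name) rules this out; a minimal sketch, with the contact point and names assumed from the question above:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])            # placeholder contact point
session = cluster.connect('test_keyspace')  # pins the session keyspace

# A fully qualified name works regardless of the session keyspace:
rows = session.execute("SELECT * FROM test_keyspace.test_table")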
Related
I am getting an error while executing an ALTER TABLE script.
ALTER TABLE user.employee ADD salary text;
ServerError: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found e5da3980-83eb-11ec-8c56-1b3845d1a791; expected c8ac48d0-83eb-11ec-8c56-1b3845d1a791)
When I describe the table, I can see the newly created column, but I am not able to access it. It throws the error below:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Undefined name xxxxxxxxx in selection clause"
We have close to 100GB of data.
This looks like the same question asked on https://community.datastax.com/questions/13220/ so I'm re-posting my answer here.
This exception indicates that you have a schema disagreement in your cluster:
ConfigurationException: Column family ID mismatch (\
found e5da3980-83eb-11ec-8c56-1b3845d1a791; \
expected c8ac48d0-83eb-11ec-8c56-1b3845d1a791 \
)
In my experience, the most common cause of this problem is that you dropped and re-created the table without waiting for the schema to propagate to all nodes in the cluster between the DROP and the CREATE. Alternatively, it's possible that you tried to create the table, assumed it didn't work, and then tried to create it again.
In any case, Cassandra thinks the table was created at 05:48 GMT but found a version created at 05:49 GMT. For what it's worth:
e5da3980-83eb-11ec-8c56-1b3845d1a791 = February 2, 2022 at 5:49:33 AM GMT
c8ac48d0-83eb-11ec-8c56-1b3845d1a791 = February 2, 2022 at 5:48:44 AM GMT
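If you're wondering where those timestamps come from: version-1 UUIDs embed one, and you can decode it yourself. A small Python sketch:

import datetime
import uuid

# Version-1 UUIDs carry a timestamp counted in 100-nanosecond intervals
# since the Gregorian calendar reform, 1582-10-15 00:00 UTC.
GREGORIAN_EPOCH = datetime.datetime(1582, 10, 15, tzinfo=datetime.timezone.utc)

def timeuuid_to_datetime(text):
    ticks = uuid.UUID(text).time
    return GREGORIAN_EPOCH + datetime.timedelta(microseconds=ticks // 10)

print(timeuuid_to_datetime('e5da3980-83eb-11ec-8c56-1b3845d1a791'))
print(timeuuid_to_datetime('c8ac48d0-83eb-11ec-8c56-1b3845d1a791'))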
You'll need to resolve the schema disagreement. Depending on the Cassandra version you can either (a) run nodetool resetlocalschema on nodes which have a different schema version based on the output of nodetool describecluster, or (b) perform a rolling restart of all nodes. Cheers!
ExecutionException: org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found e5da3980-83eb-11ec-8c56-1b3845d1a791; expected c8ac48d0-83eb-11ec-8c56-1b3845d1a791)
Has that column been deleted and re-added more than once? Cassandra (especially the pre-3.0 versions) is notorious for problems with that.
Check the output of nodetool describecluster. Are there multiple schema versions being reported?
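If you would rather check that programmatically than read the nodetool output, every node also reports schema versions in its system tables. A rough sketch with the Python driver (the contact point is a placeholder):

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])  # placeholder contact point
session = cluster.connect()

# A node reports its own schema version in system.local and its view
# of every other node's version in system.peers.
versions = {session.execute("SELECT schema_version FROM system.local").one().schema_version}
for peer in session.execute("SELECT peer, schema_version FROM system.peers"):
    versions.add(peer.schema_version)

print("schema agreement" if len(versions) == 1 else "disagreement: %s" % versions)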
If there are multiple schema versions, then run a rolling restart of the cluster. That's a sure-fire way to force schema agreement. Check the table, and see if that column is there. If not, try to add it.
The other solution would be to try adding it with a different name, for example:
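ALTER TABLE user.employee ADD salary2 text;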
A Google Spanner DDL script runs successfully when submitted in the Spanner Console, but when executed via the "gcloud spanner databases ddl update" command with the "--ddl-file" argument it consistently fails with the error:
(gcloud.spanner.databases.ddl.update) INVALID_ARGUMENT: Error parsing Spanner DDL
statement: \n : Syntax error on line 1, column 1: Encountered 'EOF' while parsing:
ddl_statement
'#type': type.googleapis.com/google.rpc.LocalizedMessage
locale: en-US
message: |-
Error parsing Spanner DDL statement:
: Syntax error on line 1, column 1: Encountered 'EOF' while parsing: ddl_statement
Example of the command:
$ gcloud spanner databases ddl update test-db \
    --instance=test-instance \
    --ddl-file=table.ddl
$ cat table.ddl
CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id);
There is only one other reference to this exact situation on the internet. Has anyone gotten the "--ddl-file" argument to work successfully?
The problem is (most probably) caused by the last semicolon in your DDL script. It seems that the --ddl-file option accepts scripts with multiple DDL statements separated by semicolons (;), but the last statement must not be terminated by a semicolon. If it is, gcloud tries to parse another DDL statement after the last one, finds none, and throws the unexpected end-of-file error.
So TL;DR: remove the last semicolon in your script and it should work.
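Applied to the table.ddl above, the file becomes (same statement, only the trailing semicolon removed):

CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id)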
I need to run multiple statements in Presto together. Here is an example:
drop table if exists table_a
drop table if exists table_b
The above gives me an error:
SQL Error [1]: Query failed (#20190820_190638_03672_kzuv6): line 2:1: mismatched input 'drop'. Expecting: '.', <EOF>
I already tried adding ";", but no luck.
Is it possible to stack multiple statements, or do I need to execute them one by one? My actual example involves many other commands, such as CREATE TABLE, etc.
You can use the Presto command-line client to submit an SQL file that can contain many SQL commands:
/presto/executable/path/presto --file $filename
Example:
/usr/lib/presto/bin/presto --file /my/presto/sql/file.sql
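The statements inside the file should each be terminated with a semicolon; a hypothetical file.sql matching the question above:

drop table if exists table_a;
drop table if exists table_b;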
I am bulk loading data into Cassandra using SSTables. I am following https://github.com/SPBTV/csv-to-sstable.
I created the SSTables with:
$ java -jar csv-to-sstable.jar quote /home/arque/table_big.cql /home/arque/Documents/data.csv /home/arque
I get an error when I try to run the following command:
$ sstableloader -d 192.168.0.7 /home/arque/quote/table_big
Error:
Error: Established connection to initial hosts
Opening sstables and calculating sections to stream
Failed to list files in /home/arque/quote/table_big
java.lang.AssertionError
java.lang.RuntimeException: Failed to list files in /home/arque/quote/table_big
at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:77)
The error is in the csv-to-sstable tool. Look at this file: https://github.com/SPBTV/csv-to-sstable/blob/master/src/main/java/com/spbtv/cassandra/bulkload/Bulkload.java
You say you only have an issue when the primary key is a composite key. That's because the tool expects the primary key to be defined on the same line as the column.
Line 66:
// Primary key defined on the same line as the corresponding column
Pattern pattern = Pattern.compile(".*?(\\w+)\\s+\\w+\\s+PRIMARY KEY.*");
If you change this to suit your needs, it should work.
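For example, a hedged sketch of such a change (untested against the tool itself): additionally match a composite key declared in its own PRIMARY KEY clause:

// Matches a standalone declaration such as "PRIMARY KEY (region_id, name)"
// and captures the comma-separated column list in group 1.
Pattern compositePattern = Pattern.compile(".*?PRIMARY KEY\\s*\\(([^)]+)\\).*");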
I am writing because I have a problem with Cassandra. I imported the data from Pentaho as shown here:
http://wiki.pentaho.com/display/BAD/Write+Data+To+Cassandra
When I try to execute the query
Select * FROM mytable;
Cassandra gives me the error message
Syntax error at position 7: unexpected "*" for Select * FROM mytable;.
and doesn't show the results of the query. Why? What does that error mean?
The steps that I take are the following:
start the cassandra-cli utility;
use the keyspace added from Pentaho (use tpc_h);
run a select to show the added data (Select * FROM mytable;).
The cassandra-cli does not support any CQL version. It has its own syntax, which you can find on DataStax's website.
Just for clarity: in CQL, to select everything from a table (aka column family) called mytable stored in a keyspace called myks, you would use:
SELECT * FROM myks.mytable;
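If you would rather keep using CQL instead of the cli's own syntax, you can run the same statement through cqlsh (assuming a local node):

cqlsh -e "SELECT * FROM myks.mytable;"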
The equivalent in cassandra-cli would roughly* be:
USE myks;
LIST mytable;
* In the cli you are limited to selecting the first 100 rows. If this is a problem, you can use the limit clause to specify how many rows you want:
LIST mytable limit 10000;
As for this:
in Cassandra I have read that it isn't possible to do joins as in SQL; isn't there a shortcut to get around this disadvantage?
There is a reason why joins don't exist in Cassandra, and it's the same reason C* isn't ACID compliant: it sacrifices that functionality for its amazing performance and scalability. So it's not a disadvantage; you just need to rethink your model if you need joins. Also take a look at this question / answer.