Schema disagreements with Cassandra 4.0 using the Java driver - cassandra

We have a 3-node dev Cassandra cluster that we upgraded from 3.11.13 to 4.0.7. Our Java applications send DDL statements through spring-data-cassandra:3.4.6, which uses the DataStax Java Driver 4.14.1. We never had any issues with this setup until the upgrade to 4.0.7.
The main issue we’re facing with 4.0.7 is schema disagreements caused by tables created programmatically, which was never a problem on 3.11.x. DDL statements issued through cqlsh work as expected; we only see the schema disagreements with programmatic creation.
We’ve tried different cluster setups, C* versions, and Ubuntu versions, but we still face the same issue:
3-node, single-rack DC (Ubuntu 18.04, 20.04, 22.04) (4.0.x, 4.1.x)
3-node, 3-rack DC (Ubuntu 18.04, 20.04, 22.04) (4.0.x, 4.1.x) — This is the setup we’ve been using since 3.11.x
We’ve also tried adjusting the driver configuration, such as changing the timeouts and disabling debouncing, but we still face the same issue:
advanced.control-connection {
    schema-agreement {
        interval = 500 milliseconds
        timeout = 10 seconds
        warn-on-failure = true
    }
},
advanced.metadata {
    topology-event-debouncer {
        window = 1 milliseconds
        max-events = 1
    }
    schema {
        request-timeout = 5 seconds
        debouncer {
            window = 1 milliseconds
            max-events = 1
        }
    }
}
We’re creating tables programmatically through the following snippets:
@Override
protected abstract List<String> getStartupScripts();

@Bean
SessionFactoryInitializer sessionFactoryInitializer(SessionFactory sessionFactory) {
    SessionFactoryInitializer initializer = new SessionFactoryInitializer();
    initializer.setSessionFactory(sessionFactory);
    final ResourceKeyspacePopulator resourceKeyspacePopulator = new ResourceKeyspacePopulator();
    getStartupScripts().forEach(script -> {
        resourceKeyspacePopulator.addScript(scriptOf(script));
    });
    initializer.setKeyspacePopulator(resourceKeyspacePopulator);
    return initializer;
}
And create one like:
@Override
protected List<String> getStartupScripts() {
    return Arrays.asList(testTable());
}

private String testTable() {
    return "CREATE TABLE IF NOT EXISTS test_table ("
        + "test text, "
        + "test2 text, "
        + "createdat bigint, "
        + "PRIMARY KEY(test, test2))";
}
But we end up in a loop until it times out due to the schema disagreement, with the following errors:
DEBUG com.datastax.oss.driver.internal.core.metadata.SchemaAgreementChecker - [s1] Schema agreement not reached yet ([09989a2c-7348-3117-8b4a-d5cad549bc09, f4c8755d-6fec-38fe-984f-4083f4a0a0a0]), rescheduling in 500 ms
WARN org.springframework.context.support.GenericApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'sessionFactoryInitializer' defined in com.bitcoin.wallet.config.CassandraConfig: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springframework.data.cassandra.core.cql.session.init.SessionFactoryInitializer]: Factory method 'sessionFactoryInitializer' threw exception; nested exception is org.springframework.data.cassandra.core.cql.session.init.ScriptStatementFailedException: Failed to execute CQL script statement #1 of Byte array resource [resource loaded from byte array]: CREATE TABLE IF NOT EXISTS test_table (test text,test2 text,createdat bigint,PRIMARY KEY(test, test2)); nested exception is com.datastax.oss.driver.api.core.DriverTimeoutException: Query timed out after PT10S

So two things come to mind when reading through this:
Schema disagreements are often a symptom of some larger issue.
Does the node have its CPU pegged at 100%? Schema disagreement. Inefficient network routing? Schema disagreement. Disk IOPS maxed-out causing write back-pressure? Schema disagreement.
I'd have a look at the activity on the nodes and see if any of the above stand out.
Programmatic schema changes are often problematic.
Each node needs to store the complete schema, so each schema change gets sent to all nodes, which essentially makes schema changes run at an asynchronous ALL level of consistency. Because of that, there's no margin for error. And programmatic schema changes are often sent from within an application much faster than Cassandra can reconcile them.
My recommendations for making any schema changes:
Execute during off-peak times.
Only run when all nodes are UN (Up/Normal in nodetool status).
Run them using cqlsh (not from application code).
Verify each individual change using nodetool describecluster.
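If you do have to keep sending DDL from the application, the idea behind the last two recommendations can be sketched roughly as follows: apply statements one at a time and wait for schema agreement after each one before sending the next. This is a minimal sketch, not the driver's or Spring's own code. The class and method names are hypothetical, and the executor and the agreement check are passed in as plain functional interfaces so the sketch runs without a cluster. With the Java driver 4.x you would presumably wire in something like session::execute and session::checkSchemaAgreement, but treat that wiring as an assumption to verify against your driver version.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical helper: applies DDL statements serially, checking agreement after each. */
final class SequentialSchemaChanges {

    /** Polls the check until it returns true or the timeout elapses. */
    static boolean waitForAgreement(BooleanSupplier inAgreement, Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!inAgreement.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out while nodes still disagree
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return true;
    }

    /** Runs each statement, then refuses to continue until all nodes agree on the schema. */
    static void apply(List<String> statements, Consumer<String> executor,
                      BooleanSupplier inAgreement, Duration interval, Duration timeout) {
        for (String ddl : statements) {
            executor.accept(ddl);
            if (!waitForAgreement(inAgreement, interval, timeout)) {
                throw new IllegalStateException("No schema agreement after: " + ddl);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a real session: print the statement, report agreement immediately.
        apply(List.of("CREATE TABLE IF NOT EXISTS test_table (test text PRIMARY KEY)"),
              System.out::println, () -> true,
              Duration.ofMillis(500), Duration.ofSeconds(10));
    }
}
```

Failing fast on the first statement that does not reach agreement keeps later statements from piling further schema versions onto a cluster that is already out of sync.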

Related

Kundera - Cassandra Replication Factor using EntityManagerFactory

I have an application which uses Kundera to generate tables from objects. I want to change the Cassandra replication factor. I use EntityManagerFactory separately to interact with the database for initializing, persisting records, etc.
I know we can create a separate kundera-cassandra.xml file and specify the replication factor there. However, this throws an error for me saying the keyspace doesn't exist. Also, I don't want to do it this way.
I want to change the replication factor using EntityManagerFactory instead, but it doesn't work.
Here is my initialize function:
props.put(KUNDERA_NODES_KEY, host);
props.put(KUNDERA_PORT_KEY, String.valueOf(port));
props.put(KUNDERA_KEYSPACE_KEY, databaseName);
props.put(CassandraConstants.CQL_VERSION,CassandraConstants.CQL_VERSION_3_);
props.put("replication_factor", 2);
entityManagerFactory = Persistence.createEntityManagerFactory(
DataServiceConfiguration.KUNDERA_PERSISTENCE_UNIT, props);
LOG.info("DataServiceImpl initialized with Properties: " + props);
Note: I have tried setting the replication factor value as a String as well, and have also tried using CassandraConstants. What am I doing incorrectly?
This is not possible with the current code. I have added a fix for this on GitHub (track issue #1005).
This fix will be available from the next Kundera release, or you can build the code from source to use it sooner.
-Karthik

Nodejs + Cassandra driver --- getting error 'unconfigured table' when trying to create materialized view

I'm running on Nodejs 8.9 & the latest Datastax Cassandra driver.
Upon service startup I'm executing 2 queries: one creates a table (in case it does not exist) and the other creates a materialized view.
The table creation query passes without any issues, but when I execute the query for the materialized view, I get an 'unconfigured table' error.
I tried to debug it and saw (via the terminal) that the table indeed does not appear in Cassandra after the query executes; it appears only after I stop the service entirely. I also tried closing the connection after creating the table and re-creating it, but I still get the same error.
This is how I execute the query:
try {
    let response = await client.execute(query, null, queryOptions);
} catch (error) {
    throw error;
}
Changing the CONSISTENCY_POLICY did not help either.
Please advise.
This usually happens when the schema isn't in agreement between all nodes. By default the driver waits up to 10 seconds for agreement to be reached. This time is controlled by the protocolOptions.maxSchemaAgreementWaitSeconds parameter of the Client; try increasing it and running again.
Also, check that your cluster is in agreement by running nodetool describecluster, as described in the documentation.
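To make the describecluster check concrete: that command lists each schema version together with the nodes reporting it, and more than one version means the cluster is in disagreement. The sketch below mirrors that grouping in plain Java (used here to match the other examples on this page rather than Node.js). The input map from node address to schema version is assumed to have been collected elsewhere, for example from the schema_version column of system.local and system.peers. The class name and node addresses are made up for illustration; the two UUIDs are the ones from the driver log in the first question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical helper that groups nodes by the schema version they report. */
final class SchemaVersionReport {

    /** Returns schema version -> nodes reporting it; nodes with no version are skipped. */
    static Map<UUID, List<String>> groupByVersion(Map<String, UUID> versionByNode) {
        Map<UUID, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, UUID> entry : versionByNode.entrySet()) {
            if (entry.getValue() == null) {
                continue; // assumption: a node without a version (e.g. down) is ignored
            }
            groups.computeIfAbsent(entry.getValue(), v -> new ArrayList<>()).add(entry.getKey());
        }
        return groups;
    }

    /** The cluster agrees when at most one distinct schema version is reported. */
    static boolean inAgreement(Map<String, UUID> versionByNode) {
        return groupByVersion(versionByNode).size() <= 1;
    }

    public static void main(String[] args) {
        Map<String, UUID> versions = new LinkedHashMap<>();
        versions.put("10.0.0.1", UUID.fromString("09989a2c-7348-3117-8b4a-d5cad549bc09"));
        versions.put("10.0.0.2", UUID.fromString("09989a2c-7348-3117-8b4a-d5cad549bc09"));
        versions.put("10.0.0.3", UUID.fromString("f4c8755d-6fec-38fe-984f-4083f4a0a0a0"));
        groupByVersion(versions).forEach((version, nodes) ->
                System.out.println(version + ": " + nodes));
        System.out.println("in agreement: " + inAgreement(versions));
    }
}
```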

Web Api Returning Json - [System.NotSupportedException] Specified method is not supported. (Sybase Ase)

I'm using Web api with Entity Framework 4.2 and the Sybase Ase connector.
This was working without issues returning JSon, until I tried to add a new table.
return db.car
.Include("tires")
.Include("tires.hub_caps")
.Include("tires.hub_caps.colors")
.Include("tires.hub_caps.sizes")
.Include("tires.hub_caps.sizes.units")
.Where(c => c.tires == 13);
The above works without issues if the following line is removed:
.Include("tires.hub_caps.colors")
However, when that line is included, I am given the error:
"An error occurred while preparing the command definition. See the inner exception for details."
The inner exception reads:
"InnerException = {"Specified method is not supported."}"
"source = Sybase.AdoNet4.AseClient"
The following also results in an error:
List<car> cars = db.car.AsNoTracking()
.Include("tires")
.Include("tires.hub_caps")
.Include("tires.hub_caps.colors")
.Include("tires.hub_caps.sizes")
.Include("tires.hub_caps.sizes.units")
.Where(c => c.tires == 13).ToList();
The error is as follows:
An exception of type 'System.Data.EntityCommandCompilationException' occurred in System.Data.Entity.dll but was not handled in user code
Additional information: An error occurred while preparing the command definition. See the inner exception for details.
Inner exception: "Specified method is not supported."
This points to a fault with the Sybase Ase Data Connector.
I am using data annotations on all tables to control which fields are returned. On the colors table, I have tried the following annotations to limit the properties returned to just the key:
[JsonIgnore]
[IgnoreDataMember]
Any ideas what might be causing this issue?
Alternatively, if I keep colors in and remove,
.Include("tires.hub_caps.sizes")
.Include("tires.hub_caps.sizes.units")
then this works also. It seems that the Sybase Ase connector does not support cases when an include statement forks from one object in two directions. Is there a way round this? The same issue occurs with Sybase Ase and the progress data connector.
The issue does not occur in a standard ASP.NET MVC controller class; the problem is with serializing two one-to-many relationships on a single table to JSON.
This issue still occurs if lazy loading is turned on.
It seems to me that this is a bug with Sybase ASE, that none of the connectors are able to solve.

TransactionScope with Typed Dataset

Is it possible to use TransactionScope with a Typed Dataset?
as in:
using (var transaction = new TransactionScope())
{
    typedDataSet.DeleteStuff(id);
    typedDataSet2.DeleteSomeOtherStuff(id2);
    transaction.Complete();
}
Will the SQL queries related to DeleteStuff(id) and DeleteSomeOtherStuff(id2) actually be transactional if an error is thrown?
I have read this article by Bogdan Chernyachuk on Using Transactions with Strongly Typed datasets and I am hoping that I do not have to do it this way.
Short answer: Yes this is transactional.
Wasn't too hard to test either. I threw an exception just before the transaction.Complete() and the data wasn't deleted from the database.
using (var transaction = new TransactionScope())
{
    typedDataSet.DeleteStuff(id);
    typedDataSet2.DeleteSomeOtherStuff(id2);
    throw new NullReferenceException();
    transaction.Complete();
}
Weirdly, though, when I profiled what was going on in the database with SQL Server Profiler, the stored procedures referenced through the typed dataset were executed on the server. However, the data was somehow rolled back.

Simplest way to insert data into a fresh Cassandra database using the Hector API?

I've followed numerous examples on inserting data into a Cassandra database and every time I get an exception about unconfigured column families.
Exception in thread "main" me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:unconfigured columnfamily TestColumnFamily)
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:45)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:252)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
at CassandraInterface.main(CassandraInterface.java:101)
Caused by: InvalidRequestException(why:unconfigured columnfamily TestColumnFamily)
at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19477)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:246)
... 4 more
So I looked up how to configure them and found
BasicColumnFamilyDefinition cfdef = new BasicColumnFamilyDefinition();
cfdef.setKeyspaceName(keyspaceName);
cfdef.setName(columnFamilyName);
cfdef.setKeyValidationClass(ComparatorType.UTF8TYPE.getClassName());
cfdef.setComparatorType(ComparatorType.UTF8TYPE);
That didn't configure the column family.
All of the examples I have found are fragments without any context, so I don't know what to import or set up. In addition, some examples appear to mix the Hector API v2 and the original Hector API, so when I use them, I get "class not found" or "function not found" compiler errors.
Hector CassandraClusterTest.java
@Test
public void testAddDropColumnFamily() throws Exception {
    ColumnFamilyDefinition cfDef = HFactory.createColumnFamilyDefinition("Keyspace1", "DynCf");
    cassandraCluster.addColumnFamily(cfDef);
    String cfid2 = cassandraCluster.dropColumnFamily("Keyspace1", "DynCf");
    assertNotNull(cfid2);
    // Let's wait for agreement
    cassandraCluster.addColumnFamily(cfDef, true);
    cfid2 = cassandraCluster.dropColumnFamily("Keyspace1", "DynCf", true);
    assertNotNull(cfid2);
}
Long story short, the keyspace and column family need to exist before you try to insert data into them. You can either manage this in your code by checking whether they exist, using the example above as a reference, or create them via the command-line interface (cassandra-cli).
Hector Unit Tests
Hopefully you've been able to do this by now but this is how I've done it.
I have a cassandra install (using 1.1.4) and assuming you have all the necessary directories created:
/var/lib/cassandra
/var/lib/cassandra/data
/var/lib/cassandra/commitlogs
/var/lib/cassandra/saved_caches
I start it using:
bin/cassandra -f
I create a simple script called schema_create.txt:
CREATE KEYSPACE TEST
WITH strategy_class = 'org.apache.cassandra.locator.SimpleStrategy'
AND strategy_options:replication_factor='1';
use TEST;
CREATE COLUMNFAMILY TestColumnFamily(
userid varchar,
firstname varchar,
lastname varchar,
PRIMARY KEY (userid));
Then from the command line you can run this script using the new CQL tool that comes with cassandra as follows:
bin/cqlsh --cql3 < schema_create.txt
This will install a keyspace named test with a column family named testcolumnfamily into cassandra.
Now from within your java application you can simply create a test class that has a main method (i will assume your development environment has all necessary dependencies if using maven):
try {
    // keyspace and cluster are assumed to have been created earlier via HFactory
    Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
    mutator.addInsertion("iamauser", "testcolumnfamily", HFactory.createStringColumn("firstname", "John"));
    mutator.addInsertion("iamauser", "testcolumnfamily", HFactory.createStringColumn("lastname", "Smith"));
    mutator.execute();
} catch (HectorException hex) {
    hex.printStackTrace();
} finally {
    cluster.getConnectionManager().shutdown();
}
Now go back to the command line and enter into cassandra using:
$ bin/cqlsh --cql3
use test;
select * from testcolumnfamily;
The Java code above inserts a row into your Cassandra database with the key iamauser and the name John Smith, which you can verify with the cqlsh query shown above.
Hope this helps.
