//Read data from Cassandra
String query = "SELECT * FROM LABA_2.DATA LIMIT 5;";
//Creating Cluster object
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
//Creating Session object
Session session = cluster.connect();
//Getting the ResultSet
ResultSet result = session.execute(query);
System.out.println(result.all().size());
System.out.println("AJFGJKABGSDKGJS");
System.out.println(result.all().size());
System.out.println("AJFGJKABGSDKGJS");
Output:
22/04/02 23:11:56 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
5
AJFGJKABGSDKGJS
0
AJFGJKABGSDKGJS
In the first case size = 5, in the second = 0.
cassandra-driver - 3.11.0
Why are the results different?
Your test is invalid. Calling ResultSet.all() forces the driver to retrieve the entire result set in one go. The important thing to note is that after calling all(), there is nothing left to retrieve.
When you call all() a second time, it returns an empty list because you have already consumed every row -- the result set is exhausted. Cheers!
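If you need the row count (or the rows) more than once, a minimal sketch, reusing the query and session from the question, is to call all() a single time and keep the returned List:
ResultSet result = session.execute(query);
List<Row> rows = result.all();   // drains the result set exactly once
System.out.println(rows.size()); // 5
System.out.println(rows.size()); // still 5 -- the List can be re-read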
When my application runs for a long time, everything works fine. But when I change a column's type from int to text (by dropping and recreating the table), I catch an exception:
com.datastax.oss.driver.api.core.type.codec.CodecNotFoundException: Codec not found for requested operation: [INT <-> java.lang.String]
at com.datastax.oss.driver.internal.core.type.codec.registry.CachingCodecRegistry.createCodec(CachingCodecRegistry.java:609)
at com.datastax.oss.driver.internal.core.type.codec.registry.DefaultCodecRegistry$1.load(DefaultCodecRegistry.java:95)
at com.datastax.oss.driver.internal.core.type.codec.registry.DefaultCodecRegistry$1.load(DefaultCodecRegistry.java:92)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2276)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache.get(LocalCache.java:3951)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache.getOrLoad(LocalCache.java:3973)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4957)
at com.datastax.oss.driver.shaded.guava.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4963)
at com.datastax.oss.driver.internal.core.type.codec.registry.DefaultCodecRegistry.getCachedCodec(DefaultCodecRegistry.java:117)
at com.datastax.oss.driver.internal.core.type.codec.registry.CachingCodecRegistry.codecFor(CachingCodecRegistry.java:215)
at com.datastax.oss.driver.api.core.data.SettableByIndex.set(SettableByIndex.java:132)
at com.datastax.oss.driver.api.core.data.SettableByIndex.setString(SettableByIndex.java:338)
This exception appears occasionally. I'm using a PreparedStatement to execute the query; I think it is cached by the DataStax driver.
I'm using AWS Keyspaces (Cassandra version 3.11.2) and DataStax driver 4.6.
Here is my application.conf:
datastax-java-driver {
  basic.request {
    timeout = 5 seconds
    consistency = LOCAL_ONE
  }
  advanced.connection {
    max-requests-per-connection = 1024
    pool {
      local.size = 1
      remote.size = 1
    }
  }
  advanced.reconnect-on-init = true
  advanced.reconnection-policy {
    class = ExponentialReconnectionPolicy
    base-delay = 1 second
    max-delay = 60 seconds
  }
  advanced.retry-policy {
    class = DefaultRetryPolicy
  }
  advanced.protocol {
    version = V4
  }
  advanced.heartbeat {
    interval = 30 seconds
    timeout = 1 second
  }
  advanced.session-leak.threshold = 8
  advanced.metadata.token-map.enabled = false
}
Yes, Java driver 4.x caches prepared statements - that's a difference from driver 3.x. From the documentation:
the session has a built-in cache, it’s OK to prepare the same string twice.
...
Note that caching is based on: the query string exactly as you provided it: the driver does not perform any kind of trimming or sanitizing.
I'm not 100% sure about the source code, but the relevant cache entries may not be cleared when the table is dropped. I suggest opening a JIRA against the Java driver, although such type changes are often not really recommended anyway - it's better to introduce a new field with the new type, even where re-creating the table is possible.
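As a small illustration of the caching note quoted above (the keyspace and table names here are made up), preparing the identical string twice is answered from the session's built-in cache, while a textually different string is treated as a new entry:
// Second call with the identical string is served from the cache.
PreparedStatement a = session.prepare("SELECT id FROM ks.tbl WHERE id = ?");
PreparedStatement b = session.prepare("SELECT id FROM ks.tbl WHERE id = ?");
// An extra space makes this a different cache key, so it is prepared again.
PreparedStatement c = session.prepare("SELECT id  FROM ks.tbl WHERE id = ?");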
That's correct. Prepared statements are cached -- it's the optimisation that makes prepared statements more efficient when they are reused, since they only need to be prepared once (the query doesn't need to be parsed again).
But I suspect that the underlying issue in your case is that your queries involve SELECT *. The best-practice recommendation (regardless of the database you're using) is to explicitly enumerate the columns you are retrieving from the table.
In the prepared statement, each of the columns is bound to a data type. When you alter the schema by adding or dropping columns, the order of the columns (and their data types) no longer matches the data types of the result set, so you end up in situations where the driver gets an int when it's expecting a text, or vice versa. Cheers!
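A minimal sketch of that recommendation against the 4.x API (the keyspace, table, and column names are hypothetical):
// Enumerating the columns ties the prepared statement to a known set of
// columns and types, instead of whatever SELECT * happens to resolve to.
PreparedStatement ps = session.prepare(
    "SELECT id, name, created_at FROM my_keyspace.my_table WHERE id = ?");
ResultSet rs = session.execute(ps.bind("some-id"));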
I am trying to insert data into a local Cassandra cluster using async execution and version 4 of the driver (the same version as my Cassandra instance).
I have instantiated the cql session in this way:
CqlSession cqlSession = CqlSession.builder()
    .addContactEndPoint(new DefaultEndPoint(
        InetSocketAddress.createUnresolved("localhost", 9042)))
    .build();
Then I prepare the statement and execute the inserts asynchronously:
return session.prepareAsync(
        "insert into table (p1, p2, p3, p4) values (?, ?, ?, ?)")
    .thenComposeAsync(
        (ps) -> {
            CompletableFuture<?>[] result = data.stream()
                .map((d) -> session.executeAsync(ps.bind(d.p1, d.p2, d.p3, d.p4))
                    .toCompletableFuture())
                .toArray(CompletableFuture[]::new);
            return CompletableFuture.allOf(result);
        }
    );
data is a dynamic list filled with user data.
When I execute the code, I get the following exception:
Caused by: com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query
at com.datastax.oss.driver.api.core.AllNodesFailedException.fromErrors(AllNodesFailedException.java:53)
at com.datastax.oss.driver.internal.core.cql.CqlPrepareHandler.sendRequest(CqlPrepareHandler.java:210)
at com.datastax.oss.driver.internal.core.cql.CqlPrepareHandler.onThrottleReady(CqlPrepareHandler.java:167)
at com.datastax.oss.driver.internal.core.session.throttling.PassThroughRequestThrottler.register(PassThroughRequestThrottler.java:52)
at com.datastax.oss.driver.internal.core.cql.CqlPrepareHandler.<init>(CqlPrepareHandler.java:153)
at com.datastax.oss.driver.internal.core.cql.CqlPrepareAsyncProcessor.process(CqlPrepareAsyncProcessor.java:66)
at com.datastax.oss.driver.internal.core.cql.CqlPrepareAsyncProcessor.process(CqlPrepareAsyncProcessor.java:33)
at com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:210)
at com.datastax.oss.driver.api.core.cql.AsyncCqlSession.prepareAsync(AsyncCqlSession.java:90)
The node is active and some data is inserted before the exception is raised. I have also tried setting a data center name on the session builder, without any result.
Why is this exception raised if the node is up and running? I only have a single local node - could that be the problem?
The biggest thing that I don't see is a way to limit the current number of in-flight async requests.
Basically, if that (mapped) data stream gets hit hard enough, it will fire off all of those async writes at once. If the writes coming in from them create more back-pressure than the node can keep up with (or catch up to), the node becomes overwhelmed and stops accepting requests.
Take a look at this post by Ryan Svihla of DataStax:
Cassandra: Batch Loading Without the Batch — The Nuanced Edition
Its code is from the 3.x version of the driver, but the concepts are the same. Basically, provide some way to throttle down the writes, or to limit the number of "in-flight" requests running at any given time, and that should help greatly.
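One minimal way to do that with the 4.x async API (a sketch, not the post's code; the limit of 256 and the Data type are assumptions) is to gate the writes with a Semaphore:
Semaphore inFlight = new Semaphore(256); // assumed cap on concurrent requests

for (Data d : data) {
    inFlight.acquire(); // blocks once 256 writes are outstanding (may throw InterruptedException)
    session.executeAsync(ps.bind(d.p1, d.p2, d.p3, d.p4))
        .whenComplete((rs, err) -> inFlight.release());
}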
Finally, I found a solution using BatchStatement and a little custom code to create a chunked list.
int chunks = 0;
if (data.size() % 100 == 0) {
    chunks = data.size() / 100;
} else {
    chunks = (data.size() / 100) + 1;
}
final int finalChunks = chunks;

return session.prepareAsync(
        "insert into table (p1, p2, p3, p4) values (?, ?, ?, ?)")
    .thenComposeAsync(
        (ps) -> {
            AtomicInteger counter = new AtomicInteger();
            final List<CompletionStage<AsyncResultSet>> batchInsert = data.stream()
                .map((d) -> ps.bind(d.p1, d.p2, d.p3, d.p4))
                .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / finalChunks))
                .values().stream()
                .map(boundedStatements -> BatchStatement.newInstance(
                    BatchType.LOGGED, boundedStatements.toArray(new BatchableStatement[0])))
                .map(session::executeAsync)
                .collect(Collectors.toList());
            return CompletableFutures.allSuccessful(batchInsert);
        }
    );
I am new to Cosmos DB and am facing issues querying a collection. I have a partitioned collection with 100,000 RU/s (unlimited storage capacity). The partition key is '/Bid', which is a GUID. I am querying the collection by partition key value, which matches 10,000 records (the collection has more than 28,942,445 documents across all partitions). I am using the following query to get the documents, but it takes around 50 seconds to execute, which is not feasible.
object partitionkey = new object();
partitionkey = "2359c59a-f730-40df-865c-d4e161189f5b";
// Now execute the same query via direct SQL
var DistinctBColumn = this.client.CreateDocumentQuery<BColumn>(
    BordereauxColumnCollection.SelfLink,
    "SELECT * FROM BColumn_UL c WHERE c.BId = '2359c59a-f730-40df-865c-d4e161189f5b'",
    new FeedOptions { EnableCrossPartitionQuery = true, PartitionKey = new PartitionKey("2359c59a-f730-40df-865c-d4e161189f5b") },
    partitionkey).ToList();
I also tried other querying options, which likewise took around 50 seconds.
But the same query takes less than a second in the Azure portal.
Kindly help me optimize the query, and correct me if I am wrong. Many thanks.
If I use this command in a CQL shell, I get all the names of the tables in that keyspace.
DESCRIBE TABLES;
I want to retrieve the same data using a ResultSet. Below is my code in Java.
String query = "DESCRIBE TABLES;";
ResultSet rs = session.execute(query);
for(Row row : rs) {
System.out.println(row);
}
The session and cluster have been initialized earlier as:
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect("keyspace_name");
Alternatively, I would like to know the Java code to retrieve the table names in a keyspace.
The schema for the system tables changes quite a bit between versions. It is best to rely on the driver's Metadata, which has version-specific parsing built in. From the Java driver, use:
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Collection<TableMetadata> tables = cluster.getMetadata()
.getKeyspace("keyspace_name")
.getTables(); // TableMetadata has name in getName(), along with lots of other info
// to convert to list of the names
List<String> tableNames = tables.stream()
.map(tm -> tm.getName())
.collect(Collectors.toList());
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Metadata metadata = cluster.getMetadata();
Iterator<TableMetadata> tm = metadata.getKeyspace("Your Keyspace").getTables().iterator();
while(tm.hasNext()){
TableMetadata t = tm.next();
System.out.println(t.getName());
}
The above code will give you the table names in the passed keyspace, irrespective of the Cassandra version used.
I'm having difficulty combining BatchStatement and lightweight transactions using the DataStax Java driver.
Consider the following:
String batch =
    "BEGIN BATCH "
    + "Update mykeyspace.mytable set record_version = 102 where id = '" + id + "' if record_version = 101; "
    // <additional batched statements>
    + "APPLY BATCH";
Row row = session.execute(batch).one();
if (! row.getBool("[applied]")) {
throw new RuntimeException("Optimistic Lock Failure!");
}
This functions as expected and indicates whether my lightweight transaction succeeded and my batch was applied. All is good. However, if I try the same thing using a BatchStatement, I run into a couple of problems:
-- My lightweight transaction "if" clause is ignored and the update is always applied
-- The "Row" result is null making it impossible to execute the final row.getBool("[applied]") check.
String update = "Update mykeyspace.mytable set record_version = ? where id = ? if record_version = ?";
PreparedStatement pStmt = getSession().prepare(update);
BatchStatement batch = new BatchStatement();
batch.add(new BoundStatement(pStmt).bind(newVersion, id, oldVersion));
Row row = session.execute(batch).one(); // <------ Row is null!
if (! row.getBool("[applied]")) {
throw new RuntimeException("Optimistic Lock Failure!");
}
Am I doing this wrong? Or is this a limitation of the DataStax BatchStatement?
I am encountering this same issue. I opened a ticket with DataStax support yesterday and received the following answer:
Currently Lightweight Transactions as PreparedStatements within a BATCH are not supported. This is why you are encountering this issue.
There is nothing on the immediate roadmap to include this feature in Cassandra.
That suggests eliminating the PreparedStatement will work around the issue. I'm going to try that myself, but haven't yet.
[Update]
I've been trying to work around this issue. Based on the earlier feedback, I assumed the restriction was on using a PreparedStatement for the conditional update.
I tried changing my code to not use a PreparedStatement, but that still didn't work when using a BatchStatement that contained a RegularStatement instead of a PreparedStatement.
BatchStatement batchStatement = new BatchStatement();
batchStatement.add(conditionalRegularStatement);
session.executeQuery(batchStatement);
The only thing that seems to work is to do an executeQuery with a raw query string that includes the batch.
session.executeQuery("BEGIN BATCH " + conditionalRegularStatement.getQueryString() + " APPLY BATCH");
Update:
This issue is resolved in CASSANDRA-7337, which was fixed in C* 2.0.9.
The latest DataStax Enterprise is now on 2.0.11. If you are seeing this issue ==> Upgrade!
The 2nd code snippet doesn't look right to me (there's no constructor taking a string or you've missed the prepared statement).
Can you try instead the following:
String update = "Update mykeyspace.mytable set record_version = ? where id = ? if record_version = ?";
PreparedStatement pStmt = session.prepare(update);
BatchStatement batch = new BatchStatement();
batch.add(pStmt.bind(newVersion, id, recordVersion));
Row row = session.execute(batch).one();
if (! row.getBool("[applied]")) {
throw new RuntimeException("Optimistic Lock Failure!");
}