Clearing prepared statement cache in cassandra 3.0.10 - cassandra

We have Cassandra 3.0.10 installed on CentOS. The developers made some coding mistakes when preparing statements. The result is that the prepared statement cache keeps overflowing and we constantly see messages about evicted statements. The message is shown below:
INFO [ScheduledTasks:1] 2017-12-07 10:38:28,216 QueryProcessor.java:134 - 7 prepared statements discarded in the last minute because cache limit reached (8178944 bytes)
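For context, the kind of coding mistake that typically fills the cache is preparing statements on the hot path with values spliced into the CQL text, so every request produces a distinct cache entry. A minimal, hedged sketch with the DataStax Java driver is below (keyspace, table and column names are hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class PreparedStatementUsage {
    public static void main(String[] args) {
        // Hypothetical contact point and keyspace, for illustration only.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("my_keyspace")) {

            // Anti-pattern: the value is concatenated into the CQL text, so each
            // distinct user id becomes a brand-new entry in the server-side cache.
            for (String userId : new String[] {"a", "b", "c"}) {
                PreparedStatement perRequest =
                        session.prepare("SELECT * FROM users WHERE id = '" + userId + "'");
                session.execute(perRequest.bind());
            }

            // Correct pattern: prepare once with a bind marker, reuse the
            // PreparedStatement, and bind the value per request.
            PreparedStatement selectById =
                    session.prepare("SELECT * FROM users WHERE id = ?");
            for (String userId : new String[] {"a", "b", "c"}) {
                session.execute(selectById.bind(userId));
            }
        }
    }
}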
We have corrected the prepared statements and would like to flush the prepared statement cache so we can start from scratch. We have stopped and restarted the Cassandra instance, but the prepared statement count was not reset.
Cassandra 3.0.10 is installed on CentOS and we are using svcadm disable/enable cassandra to stop/start it.
I noticed that in later versions of Cassandra, e.g. 3.11.1, there is a prepared_statements table under the system keyspace. Shutting down Cassandra, deleting the files ${CASSANDRA_HOME}/data/data/system/prepared_statements-*, and then restarting Cassandra actually resets the prepared statement cache.
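For reference, on 3.11.x the persisted cache entries can also be inspected before deciding to delete anything; a hedged sketch with the DataStax Java driver (the contact point is a placeholder, and it assumes a version new enough to have system.prepared_statements):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class ListPreparedStatements {
    public static void main(String[] args) {
        // Placeholder contact point; requires Cassandra 3.10+ where the
        // system.prepared_statements table exists.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            for (Row row : session.execute("SELECT * FROM system.prepared_statements")) {
                System.out.println(row); // print each persisted cache entry
            }
        }
    }
}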
Appreciate any help on this.
Thanks.
Update: 2018-06-01
We are currently using a workaround to clear prepared statements associated with certain tables: we drop an index on the table and then recreate it. This discards the prepared statements that depend on the defined index. For now, this is the most we can do. The problem is that it won't work for tables that have no index defined on them.
We still need a better way of doing this, e.g. some admin command to clear the cache.
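For illustration, the workaround amounts to re-issuing the index DDL; a hedged sketch with the DataStax Java driver (keyspace, table, index and column names are hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class RecreateIndexWorkaround {
    public static void main(String[] args) {
        // Dropping and recreating an index invalidates prepared statements
        // that depend on it. All names below are hypothetical.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("my_keyspace")) {
            session.execute("DROP INDEX IF EXISTS posts_author_idx");
            session.execute("CREATE INDEX posts_author_idx ON posts (author)");
        }
    }
}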

Related

Cassandra clean up data and file handlers for deleted tables

After I truncated and dropped a table in Cassandra, I still see the SSTables on disk, plus a lot of open file handles pointing to them.
What is the proper way to get rid of them?
Is there a possibility without restarting the Cassandra nodes?
We're using Cassandra 3.7.
In Cassandra, data does not get removed immediately; instead it is marked with a tombstone. You can run nodetool repair to get rid of deleted data.

How to migrate data from Cassandra 2.1.9 to a fresh 3.5 installation

I tried to use sstableloader to load data into Cassandra 3.5. The data was captured using nodetool snapshot under Cassandra 2.1.9. All the tables loaded fine except one. It's small, only 2 columns and 20 rows. So, I entered this bug: https://issues.apache.org/jira/browse/CASSANDRA-11806. The bug was quickly closed as a duplicate. It doesn't seem to be a duplicate, since the original case is upgrading a node in-place, not loading data with sstableloader.
Even so, I tried to apply the advice given to run upgradesstable [sic].
The directions given to upgrade from one version of Cassandra to another seem sketchy at best. Here's what I did based on my working backup/restore and info garnered from various Cassandra docs on how to upgrade:
1. Snapshot the data from prod (Cassandra 2.1.9), as usual
2. Restore data to Cassandra 2.1.14 running on my workstation
3. Verify the restore to 2.1.14 (it worked)
4. Copy the data/data/makeyourcase into a Cassandra 3.5 install
5. Fire up Cassandra 3.5
6. Run nodetool upgradesstables to upgrade the sstables to 3.5
nodetool upgradesstables fails:
>./bin/nodetool upgradesstables
error: Unknown column role in table makeyourcase.roles
-- StackTrace --
java.lang.AssertionError: Unknown column role in table makeyourcase.roles
So, the questions: Is it possible to upgrade directly from 2.1.x to 3.5? What's the actual upgrade process? The process at http://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgradeCassandraDetails.html is seemingly missing important details.
This turned out to be a problem with the changing state of the table over time.
Since the table was small, I was able to migrate the data by using COPY to export the data to CSV and then importing it into the new version.
Have a look at https://issues.apache.org/jira/browse/CASSANDRA-11806 for discussion of another workaround and a coming bug fix.
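As an alternative to cqlsh COPY for a table this small, a row-by-row copy through the driver is also possible; a hedged sketch (the contact points and the assumed two-column layout of makeyourcase.roles are illustrative guesses, not the actual schema):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SmallTableCopy {
    public static void main(String[] args) {
        // Read every row from the old cluster and re-insert it into the new
        // one. Only sensible for tiny tables; the addresses and the assumed
        // (name text, role text) layout are placeholders for illustration.
        try (Cluster oldCluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session oldSession = oldCluster.connect("makeyourcase");
             Cluster newCluster = Cluster.builder().addContactPoint("10.0.0.2").build();
             Session newSession = newCluster.connect("makeyourcase")) {

            PreparedStatement insert = newSession.prepare(
                    "INSERT INTO roles (name, role) VALUES (?, ?)");
            for (Row row : oldSession.execute("SELECT name, role FROM roles")) {
                newSession.execute(insert.bind(row.getString("name"), row.getString("role")));
            }
        }
    }
}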

Altering a column family in cassandra in a multiple node topology

I'm having the following issue when trying to run an ALTER in Cassandra:
I'm altering the table in a straightforward way:
ALTER TABLE posts ADD is_black BOOLEAN;
In a single-node environment, both on an EC2 server and on localhost, everything works perfectly - select, delete and so on.
When I alter the table on a cluster with 3 nodes, things get messy.
When I perform
select().all().from(tableName).where..
I'm getting the following exception:
java.lang.IllegalArgumentException: is_black is not a column defined in this metadata
at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:69)
at com.datastax.driver.core.AbstractGettableData.getString(AbstractGettableData.java:137)
Apparently I'm not the only one seeing this behaviour:
reference
P.S. - dropping and recreating the keyspace is not an option for me since I cannot delete the data contained in the table.
The bug was resolved :-)
The issue was that the DataStax driver maintains an in-memory cache containing the configuration of each node; this cache wasn't updated when I altered the table because I used cqlsh instead of their SDK.
After restarting all the nodes, the in-memory cache was dropped and the bug was resolved.
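If restarting every node is inconvenient, it is at least possible to verify what the driver currently sees after an out-of-band ALTER; a hedged sketch with the DataStax Java driver (keyspace, table and column names are hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.TableMetadata;

public class CheckNewColumn {
    public static void main(String[] args) {
        // Hypothetical names; this only illustrates inspecting the driver-side
        // schema metadata after a table was altered outside the application.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {

            // True once all reachable nodes report the same schema version.
            boolean inAgreement = cluster.getMetadata().checkSchemaAgreement();
            System.out.println("schema agreement: " + inAgreement);

            TableMetadata posts = cluster.getMetadata()
                    .getKeyspace("my_keyspace")
                    .getTable("posts");
            // Null here means the driver's cached metadata has not picked up
            // the new column yet.
            System.out.println("is_black column: " + posts.getColumn("is_black"));
        }
    }
}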

Modifying columnfamily metadata in Cassandra produces errors in datastax driver on server restart

I'm seeing some very strange effects after modifying column metadata in a column family by executing the following CQL query: ALTER TABLE keyspace_name.table_name ADD column_name cql_type;
I have a cluster of 4 nodes on two data centers (Cassandra version 2.0.9). I also have two application servers talking to the Cassandra cluster via the datastax java driver (version 2.0.4).
After executing this kind of query I see no abnormal behaviour whatsoever (no exceptions detected at all), however long I wait. But once I restart my application on one of the servers I immediately start seeing errors on the other server. What I mean by errors is that after getting my data into a ResultSet, I try to deserialize it row by row and get 'null' values or values from other columns instead of the ones I expect. After restarting the second server (the one that is getting the errors) everything gets back to normal.
I've tried investigating the logs of both the datastax-agent and Cassandra on both servers, but there is nothing to be found.
Is there a 'proper procedure' to altering the columnfamily? Does anyone have any idea as to what may be the problem?
Thanks!
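A defensive pattern that at least makes this situation detectable on the reading side is to read columns by name and check the row's metadata before deserializing; a minimal sketch with the DataStax Java driver (keyspace, table and column names are hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class DefensiveRead {
    public static void main(String[] args) {
        // Hypothetical keyspace/table/column names, for illustration only.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("keyspace_name")) {
            for (Row row : session.execute("SELECT * FROM table_name")) {
                // Reading by name (not by position) and checking the row's
                // metadata avoids silently picking up a neighbouring column
                // when the driver's view of the schema is stale.
                if (row.getColumnDefinitions().contains("column_name")) {
                    String value = row.getString("column_name");
                    System.out.println("column_name = " + value);
                } else {
                    System.out.println("column_name not present in result metadata yet");
                }
            }
        }
    }
}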

Cassandra not removing deleted rows despite running nodetool compact

Very often I have ghost rows that stay on the server and won't disappear after deleting a row in Cassandra.
I have tried all possible administration options with nodetool (compact, flush, etc.) and have also connected to the cluster with jconsole and forced a GC through it, but the rows remain on the cluster.
For testing purposes I updated some rows with a TTL of 0 before doing the DELETE, and those rows disappeared completely.
Do I need to live with that or can I somehow trigger a final removal of these deleted rows?
My test cluster uses Cassandra 1.0.7 and has only a single node.
This phenomenon that you are observing is the result of how distributed deletes work in Cassandra. See the Cassandra FAQ and the DistributedDeletes wiki page.
Basically the row will be completely deleted after GCGraceSeconds has passed and a compaction has run.
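On a single-node test cluster the effect can be made visible by lowering gc_grace_seconds before the delete and then compacting. Cassandra 1.0.7 predates the native protocol and the DataStax Java driver, so the hedged sketch below only applies to newer versions (keyspace, table and column names are hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class TombstoneExpiryDemo {
    public static void main(String[] args) {
        // Illustration only, for a single-node TEST cluster on a Cassandra
        // version that speaks the native protocol; all names are hypothetical.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("test_ks")) {

            // Make tombstones eligible for purge immediately.
            session.execute("ALTER TABLE events WITH gc_grace_seconds = 0");
            session.execute("DELETE FROM events WHERE id = 42");

            // After a flush and a major compaction (nodetool flush test_ks,
            // then nodetool compact test_ks), the deleted row and its
            // tombstone should be gone from the resulting SSTables.
        }
    }
}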
