Sometimes, when I perform a DELETE, it doesn't work.
My config: [cqlsh 5.0.1 | Cassandra 3.0.3 | CQL spec 3.4.0 | Native protocol v4]
cqlsh:my_db> SELECT * FROM conversations WHERE user_id=120 AND conversation_id=2 AND peer_type=1;
user_id | conversation_id | peer_type | message_map
---------+-----------------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
120 | 2 | 1 | {0: {real_id: 68438, date: 1455453523, sent: True}, 1: {real_id: 68437, date: 1455453520, sent: True}, 2: {real_id: 68436, date: 1455453517, sent: True}, 3: {real_id: 68435, date: 1455453501, sent: True}, 4: {real_id: 68434, date: 1455453500, sent: True}, 5: {real_id: 68433, date: 1455453499, sent: True}, 6: {real_id: 68432, date: 1455453498, sent: True}, 7: {real_id: 68431, date: 1455453494, sent: True}, 8: {real_id: 68430, date: 1455453480, sent: True}}
(1 rows)
cqlsh:my_db> DELETE message_map FROM conversations WHERE user_id=120 AND conversation_id=2 AND peer_type=1;
cqlsh:my_db> SELECT * FROM conversations WHERE user_id=120 AND conversation_id=2 AND peer_type=1;
user_id | conversation_id | peer_type | message_map
---------+-----------------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
120 | 2 | 1 | {0: {real_id: 68438, date: 1455453523, sent: True}, 1: {real_id: 68437, date: 1455453520, sent: True}, 2: {real_id: 68436, date: 1455453517, sent: True}, 3: {real_id: 68435, date: 1455453501, sent: True}, 4: {real_id: 68434, date: 1455453500, sent: True}, 5: {real_id: 68433, date: 1455453499, sent: True}, 6: {real_id: 68432, date: 1455453498, sent: True}, 7: {real_id: 68431, date: 1455453494, sent: True}, 8: {real_id: 68430, date: 1455453480, sent: True}}
(1 rows)
cqlsh doesn't return any error on the DELETE statement, but it's as if it wasn't taken into account.
Do you know why?
NB: This is my table definition:
CREATE TABLE be_telegram.conversations (
user_id bigint,
conversation_id int,
peer_type int,
message_map map<int, frozen<message>>,
PRIMARY KEY (user_id, conversation_id, peer_type)
) WITH CLUSTERING ORDER BY (conversation_id ASC, peer_type ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
A DELETE statement removes one or more columns from one or more rows in a table, or it removes the entire row if no columns are specified. Cassandra applies selections within the same partition key atomically and in isolation.
When a column is deleted, it is not removed from disk immediately. The deleted column is marked with a tombstone and then removed after the configured grace period has expired. The optional timestamp defines the new tombstone record.
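To illustrate why the tombstone's timestamp matters, here is a minimal sketch of last-write-wins reconciliation. It is a simplified model, not Cassandra's actual code: the Cell class, reconcile and visible_value are hypothetical names, and the tie-break rule (tombstone wins on equal timestamps) is stated as an assumption. It shows one possible reason a DELETE can appear to do nothing: the existing data carries a larger write timestamp than the delete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One version of a column; value None stands for a tombstone."""
    value: Optional[str]
    timestamp: int  # write timestamp in microseconds


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning version: higher timestamp; on a tie, the tombstone (assumed rule)."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value is None else b


def visible_value(versions):
    """Fold all stored versions of a column into the value a read would return."""
    winner = versions[0]
    for version in versions[1:]:
        winner = reconcile(winner, version)
    return winner.value


# A write stamped "in the future" (for example by a client with a fast clock)...
insert = Cell("message_map contents", timestamp=2_000_000)
# ...shadows a later DELETE whose tombstone carries a smaller timestamp.
delete = Cell(None, timestamp=1_000_000)
print(visible_value([insert, delete]))

# A tombstone with a larger timestamp wins.
delete_newer = Cell(None, timestamp=3_000_000)
print(visible_value([insert, delete, delete_newer]))
```

In CQL, the optional timestamp mentioned above is supplied with DELETE ... USING TIMESTAMP; whether a timestamp mismatch is what happened in the question is not confirmed by the output shown.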
About deletes in Cassandra
The way Cassandra deletes data differs from the way a relational database deletes data. A relational database might spend time scanning through data looking for expired data and throwing it away or an administrator might have to partition expired data by month, for example, to clear it out faster. Data in a Cassandra column can have an optional expiration date called TTL (time to live).
Facts about deleted data to keep in mind are:
Cassandra does not immediately remove data marked for deletion from
disk. The deletion occurs during compaction.
If you use the size-tiered or date-tiered compaction strategy, you
can drop data immediately by manually starting the compaction
process. Before doing so, understand the documented disadvantages of
the process.
A deleted column can reappear if you do not run node repair
routinely.
Why deleted data can reappear
Marking data with a tombstone signals Cassandra to retry sending a
delete request to a replica that was down at the time of delete. If
the replica comes back up within the grace period of time, it
eventually receives the delete request. However, if a node is down
longer than the grace period, the node can miss the delete because the
tombstone disappears after gc_grace_seconds. Cassandra always attempts
to replay missed updates when the node comes back up again. After a
failure, it is a best practice to run node repair to repair
inconsistencies across all of the replicas when bringing a node back
into the cluster. If the node doesn't come back within
gc_grace_seconds, remove the node, wipe it, and bootstrap it again.
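The reappearance scenario described above can be sketched as a tiny two-replica simulation. This is a toy model under stated assumptions: timestamps are plain seconds, a tombstone is purged purely by age (real compaction has further conditions), and merge, purge_tombstone and simulate are hypothetical names.

```python
GC_GRACE_SECONDS = 864000  # 10 days, the value in the table definition above


def merge(local, incoming):
    """Keep the (timestamp, value) pair with the higher timestamp; value None is a tombstone."""
    if local is None:
        return incoming
    if incoming is None:
        return local
    return max(local, incoming, key=lambda cell: cell[0])


def purge_tombstone(cell, now):
    """Model compaction dropping a tombstone once it is older than gc_grace_seconds."""
    if cell is not None and cell[1] is None and now - cell[0] > GC_GRACE_SECONDS:
        return None
    return cell


def simulate(downtime_seconds):
    """Return what a read sees after replica B rejoins and the replicas are repaired."""
    # Both replicas hold the row, written at t=0.
    replica_a = (0, "row")
    replica_b = (0, "row")
    # t=100: replica B is down, so only A records the delete as a tombstone.
    replica_a = merge(replica_a, (100, None))
    # B comes back after downtime_seconds; A has compacted in the meantime.
    now = 100 + downtime_seconds
    replica_a = purge_tombstone(replica_a, now)
    # Repair exchanges whatever each replica still holds.
    repaired = merge(replica_a, replica_b)
    return None if repaired is None else repaired[1]


print(simulate(3600))                  # B was down for an hour
print(simulate(GC_GRACE_SECONDS + 1))  # B was down longer than gc_grace_seconds
```

In the second call the tombstone is already gone when B returns, so B's stale copy of the row is the only version left and the deleted data comes back.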
In your case, the table uses size-tiered compaction, so please try running the compaction process manually.
Compaction
Periodic compaction is essential to a healthy Cassandra database
because Cassandra does not insert/update in place. As inserts/updates
occur, instead of overwriting the rows, Cassandra writes a new
timestamped version of the inserted or updated data in another
SSTable. Cassandra manages the accumulation of SSTables on disk using
compaction.
Cassandra also does not delete in place because the SSTable is
immutable. Instead, Cassandra marks data to be deleted using a
tombstone. Tombstones exist for a configured time period defined by
the gc_grace_seconds value set on the table. During compaction, there
is a temporary spike in disk space usage and disk I/O because the old
and new SSTables co-exist.
Compaction merges the data in each SSTable by partition key,
selecting the latest data for storage based on its timestamp.
Cassandra can merge the data performantly, without random IO, because
rows are sorted by partition key within each SSTable. After evicting
tombstones and removing deleted data, columns, and rows, the
compaction process consolidates SSTables into a single file. The old
SSTable files are deleted as soon as any pending reads finish using
the files. Disk space occupied by old SSTables becomes available for
reuse.
Data input to SSTables is sorted to prevent random I/O during SSTable
consolidation. After compaction, Cassandra uses the new consolidated
SSTable instead of multiple old SSTables, fulfilling read requests
more efficiently than before compaction.
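As a rough illustration of the merge step described above, here is a minimal sketch that models each SSTable as a dict of partition key to (timestamp, value). It is an assumption-laden toy: the compact function is hypothetical, timestamps are plain integers, and tombstones are purged by age alone.

```python
def compact(sstables, now, gc_grace_seconds):
    """Merge SSTables (dicts of key -> (timestamp, value)) into one.

    For each partition key the newest version wins. A value of None is a
    tombstone; it is dropped, together with the data it shadows, once it is
    older than gc_grace_seconds.
    """
    merged = {}
    for sstable in sstables:
        for key, (timestamp, value) in sstable.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)

    result = {}
    for key in sorted(merged):  # keep the output in sorted key order
        timestamp, value = merged[key]
        expired_tombstone = value is None and now - timestamp > gc_grace_seconds
        if not expired_tombstone:
            result[key] = (timestamp, value)
    return result


old = {"k1": (10, "a"), "k2": (10, "b")}
new = {"k1": (20, "a2"), "k2": (30, None), "k3": (25, "c")}

# Within the grace period the tombstone for k2 is kept.
print(compact([old, new], now=50, gc_grace_seconds=100))
# After the grace period k2 disappears entirely.
print(compact([old, new], now=500, gc_grace_seconds=100))
```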
So try this:
nodetool <options> compact <keyspace> <table>
The options are:
( -h | --host ) <host name> | <ip address>
( -p | --port ) <port number>
( -pw | --password ) <password >
( -u | --username ) <user name>
-- Separates an option and argument that could be mistaken for an option.
keyspace is the name of a keyspace.
table is one or more table names, separated by a space.
This command starts the compaction process on tables that use the SizeTieredCompactionStrategy or DateTieredCompactionStrategy. You can specify a keyspace for compaction. If you do not specify a keyspace, the nodetool command uses the current keyspace. You can specify one or more tables for compaction. If you do not specify a table(s), compaction of all tables in the keyspace occurs. This is called a major compaction. If you do specify a table(s), compaction of the specified table(s) occurs. This is called a minor compaction. A major compaction consolidates all existing SSTables into a single SSTable. During compaction, there is a temporary spike in disk space usage and disk I/O because the old and new SSTables co-exist. A major compaction can cause considerable disk I/O.
I was testing node repair on my Cassandra cluster (v3.11.5) while simultaneously stress-testing it with cassandra-stress (v3.11.4). The disk space ran out and the repair failed. As a result, gossip got disabled on the nodes. SSTables that were being anticompacted got cleaned up (effectively deleted), which dropped the disk usage by about half (to ~1.5TB per node) within a minute. This part I understand.
What I do not understand is what happened next. The SSTables started getting continuously compacted into smaller ones and eventually deleted. As a result, the disk usage continued to drop (this time slowly); after a day or so it went from ~1.5TB per node to ~50GB per node. The data residing in the cluster was randomly generated by cassandra-stress, so I see no way to confirm whether it's intact; however, I find it highly unlikely that it is, given how much the disk usage dropped. Also, I have no TTL set up (at least not that I know of; I might be missing something), so I would not expect the data to be deleted. But I believe this is the case.
Anyway, can anyone point me to what is happening?
Table schema:
> desc test-table1;
CREATE TABLE test-keyspace1.test-table1 (
event_uuid uuid,
create_date timestamp,
action text,
business_profile_id int,
client_uuid uuid,
label text,
params text,
unique_id int,
PRIMARY KEY (event_uuid, create_date)
) WITH CLUSTERING ORDER BY (create_date DESC)
AND bloom_filter_fp_chance = 0.1
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.DeflateCompressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
Logs:
DEBUG [CompactionExecutor:7] 2019-11-23 20:17:19,828 CompactionTask.java:255 - Compacted (59ddec80-0e20-11ea-9612-67e94033cb24) 4 sstables to [/data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3259-big,] to level=0. 93.264GiB to 25.190GiB (~27% of original) in 5,970,059ms. Read Throughput = 15.997MiB/s, Write Throughput = 4.321MiB/s, Row Throughput = ~909/s. 1,256,595 total partitions merged to 339,390. Partition merge counts were {2:27340, 3:46285, 4:265765, }
(...)
DEBUG [CompactionExecutor:7] 2019-11-24 03:50:14,820 CompactionTask.java:255 - Compacted (e1bd7f50-0e4b-11ea-9612-67e94033cb24) 32 sstables to [/data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3301-big,] to level=0. 114.787GiB to 25.150GiB (~21% of original) in 14,448,734ms. Read Throughput = 8.135MiB/s, Write Throughput = 1.782MiB/s, Row Throughput = ~375/s. 1,546,722 total partitions merged to 338,859. Partition merge counts were {1:12732, 2:42441, 3:78598, 4:50454, 5:36032, 6:52989, 7:21216, 8:34681, 9:9716, }
DEBUG [CompactionExecutor:15] 2019-11-24 03:50:14,852 LeveledManifest.java:423 - L0 is too far behind, performing size-tiering there first
DEBUG [CompactionExecutor:15] 2019-11-24 03:50:14,852 CompactionTask.java:155 - Compacting (85e06040-0e6d-11ea-9612-67e94033cb24) [/data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3259-big-Data.db:level=0, /data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3299-big-Data.db:level=0, /data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3298-big-Data.db:level=0, /data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3300-big-Data.db:level=0, /data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3301-big-Data.db:level=0,]
(...)
DEBUG [NonPeriodicTasks:1] 2019-11-24 06:02:50,117 SSTable.java:105 - Deleting sstable: /data/cassandra/data/test-keyspace1/test-table1-f592e9600b9511eab562b36ee84fdea9/md-3259-big
edit:
I performed some additional testing. To the best of my knowledge there is no TTL set up; see the query result taken straight after cassandra-stress started inserting data:
> SELECT event_uuid, create_date, ttl(action), ttl(business_profile_id), ttl(client_uuid), ttl(label), ttl(params), ttl(unique_id) FROM test-table1 LIMIT 1;
event_uuid | create_date | ttl(action) | ttl(business_profile_id) | ttl(client_uuid) | ttl(label) | ttl(params) | ttl(unique_id)
--------------------------------------+---------------------------------+-------------+--------------------------+------------------+------------+-------------+----------------
00000000-001b-adf7-0000-0000001badf7 | 2018-01-10 10:08:45.476000+0000 | null | null | null | null | null | null
So neither TTL nor tombstone deletion should be related to the issue. It's likely that there are no duplicates, as the data is highly randomized. No replication factor changes were made either.
What I found out is that the decrease in data volume starts every time cassandra-stress is stopped. Sadly, I still don't know the exact reason.
I guess, when you think of it from a Cassandra perspective, there really are only a few reasons why your data would shrink:
1) TTL expired past GC grace
2) Deletes past GC grace
3) The same records exist in multiple SSTables (i.e. "updates")
4) Change in RF to a lower number (essentially a "cleanup" - token reassignment)
In any of the above cases, when compaction runs it will either remove or reconcile records, which can shrink the space consumption. Without having the SSTables around any more, it's hard to determine which of the above, or which combination, occurred.
-Jim
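One clue that is still available is the "Partition merge counts" field in the compaction log lines quoted in the question. Reading each entry as "number of input SSTables a partition appeared in: number of such partitions" reproduces the totals on the same log lines, so that reading looks right, though it is an interpretation of the log rather than something stated in the thread. A small sketch (summarize_merge_counts is a hypothetical helper):

```python
def summarize_merge_counts(merge_counts):
    """Summarize a 'Partition merge counts' map from a compaction log line.

    merge_counts maps "number of input SSTables a partition appeared in"
    to "number of partitions". Returns (input partitions, output partitions,
    partitions that existed in more than one input SSTable).
    """
    input_partitions = sum(n * count for n, count in merge_counts.items())
    output_partitions = sum(merge_counts.values())
    duplicated = sum(count for n, count in merge_counts.items() if n > 1)
    return input_partitions, output_partitions, duplicated


# Values copied from the two "Compacted" log lines in the question.
first = {2: 27340, 3: 46285, 4: 265765}
second = {1: 12732, 2: 42441, 3: 78598, 4: 50454, 5: 36032,
          6: 52989, 7: 21216, 8: 34681, 9: 9716}

print(summarize_merge_counts(first))
print(summarize_merge_counts(second))
```

For the first line this gives 1,256,595 input partitions merged to 339,390, matching the log, and every output partition was present in at least two input SSTables. That is consistent with option 3 above (the same records existing in several SSTables), but it does not say how the copies got there.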
I have a 3-node Cassandra setup, and it seems some nodes had time sync issues; that is, some nodes were 10 minutes ahead of the others.
CT-Cass2:/root>nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 172.94.1.22 14.15 GB 256 ? db37ca57-c7c9-4c36-bac3-f0cbd8516143 RAC1
UN 172.94.1.23 14.64 GB 256 ? b6927b2b-37b2-4a7d-af44-21c9f548c533 RAC1
UN 172.94.1.21 14.42 GB 256 ? e482b781-7e9f-43e2-82f8-92901be48eed RAC1
I have created the table below.
CREATE TABLE test_users (
userid text PRIMARY KEY,
omavvmon int,
vvmon int
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 48000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
In a customer setup I can see that some deleted records come back, and writetime(omavvmon) shows a write time 10 minutes later than the row delete time. I am almost certain that the records are coming back due to the time sync issue (because after correcting the time it no longer happens). But when I tried to reproduce this issue locally, it never happened.
I set the Cassandra system time 10 minutes ahead and created a row; writetime shows a time 10 minutes ahead:
update test_users set omavvmon=1 where userid='4444';
I set the system time back to normal, that is, 10 minutes earlier. Then I deleted userid 4444.
As I understand it, this delete has a write time 10 minutes earlier than the first creation, so I should see the record come back. But that's not happening. Can anyone help me explain why deleted records come back in the production setup but not in my local setup? Also, why is Cassandra not showing the record locally even though the delete has an earlier timestamp than the insert? Isn't it similar to a delete followed by an insert?
In production I check after a few hours, but in the local setup I check immediately after the delete.
Did some extended maintenance on node d1r1n3 of a 14-node dsc 2.1.15 cluster today, but finished well within the cluster's max hint window.
After bringing the node back up, most other nodes' hints disappeared again within minutes, except on two nodes (d1r1n4 and d1r1n7), where only part of the hints went away.
After a few hours of still seeing 1 active hinted handoff task, I restarted node d1r1n7, and then d1r1n4 quickly emptied its hint table.
How can I see which node the hints stored on d1r1n7 are destined for?
And, if possible, how can I get the hints processed?
Update:
Later, at a time corresponding to the end of the max hint window after taking node d1r1n3 offline for maintenance, I found that d1r1n7's hints had vanished. This left us confused about whether this was okay or not. Had the hints been processed properly, or had they somehow just expired at the end of the max hint window?
If the latter, would we need to run a repair on node d1r1n3 after its maintenance (this takes quite some time and I/O... :/)? What if we now read at [LOCAL_]QUORUM instead of at ONE as we currently do (one DC, RF=3)? Could this trigger read-path repairs on an as-needed basis and maybe spare us a full repair in this case?
Answer: it turned out hinted_handoff_throttle_in_kb was at the default of 1024 on these two nodes, while the rest of the cluster was at 65536 :)
In Cassandra 2.1.15, hints are stored in the system.hints table:
cqlsh> describe table system.hints;
CREATE TABLE system.hints (
target_id uuid,
hint_id timeuuid,
message_version int,
mutation blob,
PRIMARY KEY (target_id, hint_id, message_version)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = 'hints awaiting delivery'
AND compaction = {'enabled': 'false', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
The target_id corresponds to the node's host ID.
For example, in my sample 2-node cluster with RF=2:
nodetool status
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 71.47 KB 256 100.0% d00c4b10-2997-4411-9fc9-f6d9f6077916 rack1
DN 127.0.0.2 75.4 KB 256 100.0% 1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa rack1
I executed the following while node2 was down:
cqlsh> insert into ks.cf (key,val) values (1,1);
cqlsh> select * from system.hints;
target_id | hint_id | message_version | mutation
--------------------------------------+--------------------------------------+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa | e80a6230-ec8c-11e6-a1fd-d743d945c76e | 8 | 0x0004000000010000000101cfb4fba0ec8c11e6a1fdd743d945c76e7fffffff80000000000000000000000000000002000300000000000547df7ba68692000000000006000376616c0000000547df7ba686920000000400000001
(1 rows)
As can be seen, system.hints.target_id matches the Host ID in nodetool status (1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa).
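If there are many nodes, the lookup can be scripted. The following is a minimal sketch that parses nodetool status text into a Host ID map; host_ids is a hypothetical helper, and it assumes each node line starts with the status code and address and contains the Host ID as a lowercase UUID, as in the output above.

```python
import re

# Matches a lowercase UUID such as the Host ID column in nodetool status.
UUID = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def host_ids(nodetool_status_output):
    """Map Host ID -> (status/state code, address) from `nodetool status` text."""
    nodes = {}
    for line in nodetool_status_output.splitlines():
        match = UUID.search(line)
        fields = line.split()
        if match and len(fields) >= 2:
            nodes[match.group(0)] = (fields[0], fields[1])
    return nodes


# Node lines copied from the sample cluster above.
STATUS = """\
Datacenter: datacenter1
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 71.47 KB 256 100.0% d00c4b10-2997-4411-9fc9-f6d9f6077916 rack1
DN 127.0.0.2 75.4 KB 256 100.0% 1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa rack1
"""

target_id = "1ca6779d-fb41-4a26-8fa8-89c6b51d0bfa"  # value seen in system.hints
print(host_ids(STATUS)[target_id])
```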
I have a problem with Cassandra and hope somebody can help me. I have a table “log”, into which I have inserted about 10000 rows. Everything works fine. I can do a
select * from
select count(*) from
As soon as I insert 100000 rows with TTL 50, I receive an error with
select count(*) from
Version: cassandra 2.1.8, 2 nodes
Cassandra timeout during read query at consistency ONE (1 responses
were required but only 0 replica responded)
Does anyone have an idea what I am doing wrong?
CREATE TABLE test.log (
day text,
date timestamp,
ip text,
iid int,
request text,
src text,
tid int,
txt text,
PRIMARY KEY (day, date, ip)
) WITH read_repair_chance = 0.0
AND dclocal_read_repair_chance = 0.1
AND gc_grace_seconds = 864000
AND bloom_filter_fp_chance = 0.01
AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
AND comment = ''
AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
AND compression = { 'sstable_compression' : 'org.apache.cassandra.io.compress.LZ4Compressor' }
AND default_time_to_live = 0
AND speculative_retry = '99.0PERCENTILE'
AND min_index_interval = 128
AND max_index_interval = 2048;
That error message indicates a problem with the read operation. Most likely it is a read timeout. You may need to update your cassandra.yaml with a larger read timeout, as described in this SO answer.
Example for 200 seconds:
read_request_timeout_in_ms: 200000
If updating that does not work, you may need to tweak the JVM settings for Cassandra. See DataStax's "Tuning Java Ops" for more information.
count(*) is a very costly operation: Cassandra needs to scan all the rows on all the nodes just to give you the count. With a small number of rows it works, but on bigger data you should use another approach to avoid the timeout.
First of all, forget about count(*) and retrieve the rows themselves in order to count them.
Make several (dozens, hundreds?) queries that filter by partition key and clustering key, and sum the number of rows retrieved by each query.
Here is a good explanation of what clustering and partition keys are. In your case, day is the partition key, and the clustering key consists of two columns: date and ip.
It is most likely impossible to do this with the cqlsh command-line client, so you should write a script yourself. Official drivers for popular programming languages: http://docs.datastax.com/en/developer/driver-matrix/doc/common/driverMatrix.html
Example of one such query:
select day, date, ip, iid, request, src, tid, txt from test.log where day='Saturday' and date='2017-08-12 00:00:00' and ip='127.0.0.1'
Remarks:
If you just need to calculate the count and nothing more, it probably makes sense to look for a tool like https://github.com/brianmhess/cassandra-count
If Cassandra refuses to run your query without ALLOW FILTERING, that means the query is not efficient: https://stackoverflow.com/a/38350839/2900229
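The per-partition counting idea can be sketched roughly as follows. This is only an outline under assumptions: count_rows and fetch_rows are hypothetical names, and the in-memory FAKE_TABLE stands in for the cluster so the sketch runs on its own. With the DataStax Python driver, fetch_rows could wrap something like session.execute("SELECT date FROM test.log WHERE day = %s", (day,)).

```python
def count_rows(partition_keys, fetch_rows):
    """Sum the rows returned by one query per partition key.

    fetch_rows is a hypothetical callable: given a partition key (here a
    value of the day column), it returns an iterable of that partition's rows.
    """
    total = 0
    for key in partition_keys:
        total += sum(1 for _ in fetch_rows(key))
    return total


# Stand-in data so the sketch runs without a cluster (not real query results).
FAKE_TABLE = {
    "Saturday": [("2017-08-12 00:00:00", "127.0.0.1"),
                 ("2017-08-12 00:00:01", "127.0.0.1")],
    "Sunday": [("2017-08-13 00:00:00", "127.0.0.1")],
}

days = ["Saturday", "Sunday", "Monday"]
print(count_rows(days, lambda day: FAKE_TABLE.get(day, [])))
```

Each query touches a single partition, so no single request has to scan the whole table the way count(*) does.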
I have all keyspaces and tables copied from another Cassandra data folder. How can I restore them on my Cassandra node?
I don't have the snapshots that are normally required for a restore.
You might be able to do this with the Cassandra Bulk Loader.
Assuming a packaged install (with default data and bin locations), try this from one of your nodes:
$ sstableloader -d hostname1,hostname2 /var/lib/cassandra/data/yourKeyspaceName/tableName/
Check out the documentation on the Bulk Loader for more details.
You can do this, but:
1) You need to know the schema for all the tables you are restoring. If you don't know it, use sstable2json (example below, but this can be tricky and requires understanding how sstable2json formats things).
2) You will have to start a new node, create the keyspace and its tables using the schema derived in step 1, and then use the BulkLoader as described in the docs by Aaron (BryceAtNetwork23).
Example of retrieving a schema (an offline process) using sstable2json; this example assumes your keyspace is named test and the table is named example1:
sstable2json /var/lib/cassandra/data/test/example1-55639910d46a11e4b4335dbb0aaeeb24/test-example1-ka-1-Data.db
// output:
WARN 10:25:34 JNA link failure, one or more native method will be unavailable.
[
{"key": "7d700500-d46b-11e4-b433-5dbb0aaeeb24", <-- key = bytes of what is in the PRIMARY KEY()
"cells": [["coolguy:","",1427451885901681], <-- cql3 row marker (empty cell that tells us table was created using cql3)
["coolguy:age","29",1427451885901681], <-- age
["coolguy:email:_","coolguy:email:!",1427451885901680,"t",1427451885], <-- collection cell marker
["coolguy:email:6367406d61696c2e6e6574","",1427451885901681], <-- first entry in collection
["coolguy:email:636f6f6c677579383540676d61696c2e636f6d","",1427451885901681], <-- second entry in collection
["coolguy:password","xQajKe2fa?af",1427451885901681]]}, <-- another text field for password
{"key": "52641f40-d46b-11e4-b433-5dbb0aaeeb24",
"cells": [["lyubent:","",1427451813663728],
["lyubent:age","109",1427451813663728],
["lyubent:email:_","lyubent:email:!",1427451813663727,"t",1427451813],
["lyubent:email:66616b65406162762e6267","",1427451813663728],
["lyubent:email:66616b6540676d61696c2e636f6d","",1427451813663728],
["lyubent:password","password",1427451813663728]]}
]
The above equates to:
CREATE TABLE test.example1 (
id timeuuid,
username text,
age int,
email set<text>,
password text,
PRIMARY KEY (id, username)
) WITH CLUSTERING ORDER BY (username ASC)
// the below are settings that you have no way of knowing,
// unless you are hardcore enough to start digging through
// system tables with the debug tool, but this is beyond
// the scope of the question.
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
You can see clearly that the names of id and username get lost in the translation, as they are the key, but you can tell that there is a compound key from the fact that all cells have a section ending in : prepended; in the above two entries these are coolguy: and lyubent:. From this you know that the key has the form PRIMARY KEY(something ?, username text). If you're lucky, your primary key will be simple and deducing the schema from it will be straightforward; if not, post it here and we'll see how far we can get.
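To make the reading of the sstable2json output less manual, here is a minimal sketch that pulls the clustering value, the column names and the collection elements out of one row's cells. describe_row is a hypothetical helper, and it assumes cell names can be split naively on ":" (so values containing a colon would break it) and that collection elements are hex-encoded UTF-8 text, as in the example above.

```python
def describe_row(cells):
    """Extract clustering values, column names and collection elements from
    the "cells" list of one sstable2json row."""
    clustering_values = set()
    columns = set()
    collections = {}
    for cell in cells:
        parts = cell[0].split(":")
        clustering_values.add(parts[0])
        column = parts[1] if len(parts) > 1 else ""
        if not column:
            continue  # CQL3 row marker, e.g. "coolguy:"
        columns.add(column)
        if len(parts) > 2 and parts[2] != "_":  # "_" is the collection marker cell
            element = bytes.fromhex(parts[2]).decode("utf-8")
            collections.setdefault(column, []).append(element)
    return clustering_values, columns, collections


# Cells copied from the first row of the output above.
cells = [
    ["coolguy:", "", 1427451885901681],
    ["coolguy:age", "29", 1427451885901681],
    ["coolguy:email:_", "coolguy:email:!", 1427451885901680, "t", 1427451885],
    ["coolguy:email:6367406d61696c2e6e6574", "", 1427451885901681],
    ["coolguy:email:636f6f6c677579383540676d61696c2e636f6d", "", 1427451885901681],
    ["coolguy:password", "xQajKe2fa?af", 1427451885901681],
]

clustering, columns, collections = describe_row(cells)
print(clustering, sorted(columns), collections)
```

The column types still have to be guessed from the values; the sketch only recovers names and collection contents.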