Cassandra 2.1 writes slow in a 1TB data table

I am doing some tests in a Cassandra cluster, and now I have a table with 1TB of data per node. When I used YCSB to do more insert operations, I found the throughput was really low (about 10,000 ops/sec) compared to a new table with the same schema in the same cluster (about 80,000 ops/sec). While inserting, CPU usage was about 40% and there was almost no disk usage.
I used nodetool tpstats to get task details; it showed:
Pool Name Active Pending Completed Blocked All time blocked
CounterMutationStage 0 0 0 0 0
ReadStage 0 0 102 0 0
RequestResponseStage 0 0 41571733 0 0
MutationStage 384 21949 82375487 0 0
ReadRepairStage 0 0 0 0 0
GossipStage 0 0 247100 0 0
CacheCleanupExecutor 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
MigrationStage 0 0 6 0 0
Sampler 0 0 0 0 0
ValidationExecutor 0 0 0 0 0
CommitLogArchiver 0 0 0 0 0
MiscStage 0 0 0 0 0
MemtableFlushWriter 16 16 4745 0 0
MemtableReclaimMemory 0 0 4745 0 0
PendingRangeCalculator 0 0 4 0 0
MemtablePostFlush 1 163 9394 0 0
CompactionExecutor 8 29 13713 0 0
InternalResponseStage 0 0 0 0 0
HintedHandoff 2 2 5 0 0
I found there was a large number of pending tasks in MutationStage and MemtablePostFlush.
I have read some related articles about Cassandra's write limitations, but found no useful information. I want to know why there is such a huge difference in Cassandra throughput between two identical tables when the only difference is the amount of data already stored.
In addition, I use SSDs on my servers. However, this phenomenon also occurs in another cluster using HDDs.
While Cassandra was running, I found that both %user and %nice CPU utilization were about 10% while only compaction tasks were running, with a compaction throughput of about 80 MB/s, even though I had set the nice value of my Cassandra process to 0.

Wild guess: your system is busy compacting the SSTables.
Check it out with nodetool compactionstats.
BTW, YCSB does not use prepared statements, which makes it a bad estimator for actual application load.
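To confirm that guess, here is a minimal sketch of what to run on the affected node while the YCSB load is going (standard nodetool commands; the 64 MB/s figure is only an illustrative value to experiment with, not a recommendation, and the keyspace/table placeholders are whatever YCSB is writing to):

# Active compactions and the pending compaction backlog
nodetool compactionstats

# Current compaction throughput cap, and an example of changing it (MB/s; 0 disables throttling)
nodetool getcompactionthroughput
nodetool setcompactionthroughput 64

# Per-table SSTable counts and pending flushes for the table under load
nodetool cfstats <keyspace>.<table>

If compactionstats shows a large and growing number of pending tasks while the write throughput is low, compaction (and the flush backlog feeding it) is the likely bottleneck.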

Related

High disk I/O (read) on Cassandra nodes

We have a 3-node Cassandra cluster.
We have an application that uses a keyspace that creates a high read load on the disks. The problem has a cumulative effect: the more days we interact with the keyspace, the more the disk reading grows:
(screenshot: high-load read graph)
Reading goes up to > 700 MB/s. Then the storage (SAN) begins to degrade, and then the Cassandra cluster also degrades.
UPD 25.10.2021: "I wrote it a little wrong; the SAN space is allocated to the virtual machine like a normal drive."
The only thing that helps is clearing the keyspace.
Output of the "tpstats" and "cfstats" commands:
[cassandra-01 ~]$ nodetool tpstats
Pool Name Active Pending Completed Blocked All time blocked
ReadStage 1 1 1837888055 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 6789640 0 0
MutationStage 0 0 870873552 0 0
MemtableReclaimMemory 0 0 7402 0 0
PendingRangeCalculator 0 0 9 0 0
GossipStage 0 0 18939072 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 3 0 0
RequestResponseStage 0 0 1307861786 0 0
Native-Transport-Requests 0 0 2981687196 0 0
ReadRepairStage 0 0 346448 0 0
CounterMutationStage 0 0 0 0 0
MigrationStage 0 0 168 0 0
MemtablePostFlush 0 0 8193 0 0
PerDiskMemtableFlushWriter_0 0 0 7402 0 0
ValidationExecutor 0 0 21 0 0
Sampler 0 0 10988 0 0
MemtableFlushWriter 0 0 7402 0 0
InternalResponseStage 0 0 3404 0 0
ViewMutationStage 0 0 0 0 0
AntiEntropyStage 0 0 71 0 0
CacheCleanupExecutor 0 0 0 0 0
Message type Dropped
READ 7
RANGE_SLICE 0
_TRACE 0
HINT 0
MUTATION 5
COUNTER_MUTATION 0
BATCH_STORE 0
BATCH_REMOVE 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0
[cassandra-01 ~]$ nodetool cfstats box_messages -H
Total number of tables: 73
----------------
Keyspace : box_messages
Read Count: 48847567
Read Latency: 0.055540737801741485 ms
Write Count: 69461300
Write Latency: 0.010656743870327794 ms
Pending Flushes: 0
Table: messages
SSTable count: 6
Space used (live): 3.84 GiB
Space used (total): 3.84 GiB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 10.3 MiB
SSTable Compression Ratio: 0.23265712113582082
Number of partitions (estimate): 4156030
Memtable cell count: 929912
Memtable data size: 245.04 MiB
Memtable off heap memory used: 0 bytes
Memtable switch count: 92
Local read count: 20511450
Local read latency: 0.106 ms
Local write count: 52111294
Local write latency: 0.013 ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 57318
Bloom filter false ratio: 0.00841
Bloom filter space used: 6.56 MiB
Bloom filter off heap memory used: 6.56 MiB
Index summary off heap memory used: 1.78 MiB
Compression metadata off heap memory used: 1.95 MiB
Compacted partition minimum bytes: 73
Compacted partition maximum bytes: 17084
Compacted partition mean bytes: 3287
Average live cells per slice (last five minutes): 2.0796939751354797
Maximum live cells per slice (last five minutes): 10
Average tombstones per slice (last five minutes): 1.1939751354797576
Maximum tombstones per slice (last five minutes): 2
Dropped Mutations: 5 bytes
(I'm unable to comment and hence posting it as an answer)
As folks mentioned, SAN is not going to be the best fit here, and one could read through the list of anti-patterns documented here, which also applies to OSS C*.
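If you want to confirm where the read amplification is coming from before changing storage, a minimal sketch of checks to run on one node (assumes sysstat is installed; on pre-3.x releases tablehistograms is called cfhistograms, and the keyspace/table names are the ones from the cfstats output above):

# OS view: per-device utilization, await and read throughput, sampled every 5 seconds
iostat -x 5 3

# Cassandra view: read/write latency and SSTables-per-read distribution for the hot table
nodetool tablehistograms box_messages messages

# Coordinator-level latency percentiles on this node
nodetool proxyhistograms

A rising SSTables-per-read count over the days would point at the data/compaction side, while high device await with low Cassandra-side latency would point at the SAN.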

Deleted data in Cassandra comes back, like a ghost

I have a 3-node Cassandra cluster (3.7), a keyspace
CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes = true;
a table
CREATE TABLE tradingdate (key text, tradingdate date, PRIMARY KEY (key, tradingdate));
One day, when deleting one row like
delete from tradingdate
where key='tradingDay' and tradingdate='2018-12-31';
the deleted row became a ghost. When running the query
select * from tradingdate
where key='tradingDay' and tradingdate>'2018-12-27' limit 2;
key | tradingdate
------------+-------------
tradingDay | 2018-12-28
tradingDay | 2019-01-02
select * from tradingdate
where key='tradingDay' and tradingdate<'2019-01-03'
order by tradingdate desc limit 2;
key | tradingdate
------------+-------------
tradingDay | 2019-01-02
tradingDay | 2018-12-31
So when using ORDER BY, the deleted row (tradingDay, 2018-12-31) comes back.
I guess I only deleted the row on one node, but it still exists on another node. So I executed:
nodetool repair demo tradingdate
on all 3 nodes, and then the deleted row totally disappeared.
So I want to know why, when using ORDER BY, I can see the ghost row.
This is some good reading about deletes in Cassandra (and other distributed systems as well):
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
As well as:
https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html
You will need to run/schedule a routine repair at least once within gc_grace_seconds (which defaults to ten days) to prevent deleted data from reappearing in your cluster; a sketch of such a schedule is shown below.
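As a sketch (the file path and the cassandra user are assumptions; adjust the keyspace, flags and timing to your own repair strategy, as long as every node is covered within gc_grace_seconds), a cron entry run on each node could look like this:

# /etc/cron.d/cassandra-repair  (hypothetical location)
# Primary-range repair of the demo keyspace every Sunday at 02:00
0 2 * * 0  cassandra  nodetool repair -pr demo >> /var/log/cassandra/repair.log 2>&1

Using -pr (primary range) on every node avoids repairing the same ranges multiple times across the cluster.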
Also, you should look for dropped messages in case one of your nodes is missing deletes (and other messages):
# nodetool tpstats
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 0 0 787032744 0 0
ReadStage 0 0 1627843193 0 0
RequestResponseStage 0 0 2257452312 0 0
ReadRepairStage 0 0 99910415 0 0
CounterMutationStage 0 0 0 0 0
HintedHandoff 0 0 1582 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 6649458 0 0
MemtableReclaimMemory 0 0 17987 0 0
PendingRangeCalculator 0 0 46 0 0
GossipStage 0 0 22766295 0 0
MigrationStage 0 0 8 0 0
MemtablePostFlush 0 0 127844 0 0
ValidationExecutor 0 0 0 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 17851 0 0
InternalResponseStage 0 0 8669 0 0
AntiEntropyStage 0 0 0 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 0 0 631966060 0 19
Message type Dropped
READ 0
RANGE_SLICE 0
_TRACE 0
MUTATION 0
COUNTER_MUTATION 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0
Dropped messages indicate that there is something wrong.
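If you want to see whether drops keep happening, one option (a sketch; the log path is only an example) is to snapshot the dropped-message section of tpstats periodically and compare the counters over time:

# Append a timestamped snapshot of the dropped-message counters every minute
while true; do
  date
  nodetool tpstats | awk '/Message type/,0'
  sleep 60
done >> /var/log/cassandra/dropped_messages.log

Counters that keep climbing on one node point at that node being overloaded or losing mutations, which is exactly the situation that lets deletes go missing.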

Native Transport Requests in Cassandra

I got some points about Native Transport Requests in Cassandra using this link: What are native transport requests in Cassandra?
As per my understanding, any query I execute in Cassandra is a Native Transport Request.
I frequently get Request Timed Out errors in Cassandra, and I observed the following in the Cassandra debug log as well as in nodetool tpstats:
/var/log/cassandra# nodetool tpstats
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 0 0 186933949 0 0
ViewMutationStage 0 0 0 0 0
ReadStage 0 0 781880580 0 0
RequestResponseStage 0 0 5783147 0 0
ReadRepairStage 0 0 0 0 0
CounterMutationStage 0 0 14430168 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 366708 0 0
MemtableReclaimMemory 0 0 788 0 0
PendingRangeCalculator 0 0 1 0 0
GossipStage 0 0 0 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 0 0 0
MigrationStage 0 0 0 0 0
MemtablePostFlush 0 0 799 0 0
ValidationExecutor 0 0 0 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 788 0 0
InternalResponseStage 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 0 0 477629331 0 1063468
Message type Dropped
READ 0
RANGE_SLICE 0
_TRACE 0
HINT 0
MUTATION 0
COUNTER_MUTATION 0
BATCH_STORE 0
BATCH_REMOVE 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0
1) What is the "All time blocked" state?
2) What does this value, 1063468, denote? How harmful is it?
3) How can this be tuned?
Each request is processed by the NTR stage before being handed off to the read/mutation stage, but it still blocks while waiting for completion. To prevent being overloaded, the stage starts to block tasks being added to its queue in order to apply back pressure to the client. Every time a request is blocked, the "All time blocked" counter is incremented. So 1063468 requests have at some point been blocked for a period of time because too many requests were backed up.
In situations where the app has spikes of queries, this blocking is unnecessary and can cause issues, so you can increase the queue limit with something like -Dcassandra.max_queued_native_transport_requests=4096 (default 128). You can also throttle requests on the client side, but I'd try increasing the queue size first.
There may also be some requests that are exceptionally slow and are clogging up your system. If you have monitoring set up, look at high-percentile read/write coordinator latencies. You can also use nodetool proxyhistograms. There may be something in your data model or queries that is causing issues.
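For reference, a sketch of where that flag would typically be set (cassandra-env.sh on 2.x/3.0-era installs, jvm.options on newer ones); 4096 is just the example value from above, not a tuned recommendation, and a node restart is required:

# cassandra-env.sh
JVM_OPTS="$JVM_OPTS -Dcassandra.max_queued_native_transport_requests=4096"

# or, on newer versions, as a line in jvm.options:
# -Dcassandra.max_queued_native_transport_requests=4096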

Error in cqlsh command line while querying

I have a three-node Cassandra cluster running perfectly fine. When I run the query select count(*) from usertracking; on one of the nodes of my cluster, I get the following error:
errors={}, last_host=localhost
Statement trace did not complete within 10 seconds
However, it works fine on the other two nodes of the cluster. Can anyone tell me why I am getting this error only on one node, and what the reason for the error is?
As suggested in https://stackoverflow.com/questions/27766976/cassandra-cqlsh-query-fails-with-no-error, I have also increased the timeout parameters read_request_timeout_in_ms and range_request_timeout_in_ms in cassandra.yaml, but that didn't help.
Keyspace definition:
CREATE KEYSPACE cw WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 };
Table definition :
CREATE TABLE usertracking (
cwc text,
cur_visit_id text,
cur_visit_datetime timestamp,
cur_visit_last_ts bigint,
prev_visit_datetime timestamp,
prev_visit_last_ts bigint,
tot_page_view bigint,
tot_time_spent bigint,
tot_visit_count bigint,
PRIMARY KEY (cwc)
);
Output of nodetool status:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 192.168.1.200 146.06 MB 1 ? 92c5bd4a-8f2b-4d7b-b420-6261a1bb8648 rack1
UN 192.168.1.201 138.53 MB 1 ? 817d331b-4cc0-4770-be6d-7896fc00e82f rack1
UN 192.168.1.202 155.04 MB 1 ? 351731fb-c3ad-45e0-b2c8-bc1f3b1bf25d rack1
Output of nodetool tpstats :
Pool Name Active Pending Completed Blocked All time blocked
CounterMutationStage 0 0 0 0 0
ReadStage 0 0 25 0 0
RequestResponseStage 0 0 257103 0 0
MutationStage 0 0 593226 0 0
ReadRepairStage 0 0 0 0 0
GossipStage 0 0 612335 0 0
CacheCleanupExecutor 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
MigrationStage 0 0 0 0 0
ValidationExecutor 0 0 0 0 0
CommitLogArchiver 0 0 0 0 0
MiscStage 0 0 0 0 0
MemtableFlushWriter 0 0 87 0 0
MemtableReclaimMemory 0 0 87 0 0
PendingRangeCalculator 0 0 3 0 0
MemtablePostFlush 0 0 2829 0 0
CompactionExecutor 0 0 216 0 0
InternalResponseStage 0 0 0 0 0
HintedHandoff 0 0 2 0 0
Message type Dropped
RANGE_SLICE 0
READ_REPAIR 0
PAGED_RANGE 0
BINARY 0
READ 0
MUTATION 0
_TRACE 0
REQUEST_RESPONSE 0
COUNTER_MUTATION 0
Not sure if this helps. I have a very similar configuration in my development environment and was getting OperationTimedOut errors when running count operations.
Like yourself, I originally tried working with the various TIMEOUT variables in cassandra.yaml, but these appeared to make no difference.
In the end, the timeout that was being exceeded was actually in the cqlsh client itself. When I updated/created the ~/.cassandra/cqlshrc file with the following, I was able to run the count without failure.
[connection]
client_timeout = 20
This example sets the client time out to 20 seconds.
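As far as I recall, on newer Cassandra/cqlsh releases the cqlshrc key was renamed to request_timeout, and the same value can also be passed on the command line; a sketch (check cqlsh --help on your version to confirm the flag is available; the node address, keyspace and table are taken from the question):

# One-off override of the client-side timeout, in seconds
cqlsh --request-timeout=20 192.168.1.200 -e "select count(*) from cw.usertracking;"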
There is some information in the following article about the cqlshrc file: CQL Configuration File
Hopefully this helps, sorry if I'm barking up the wrong tree.

How can I determine the current CPU utilization from the shell? [closed]

How can I determine the current CPU utilization from the shell in Linux?
For example, I get the load average like so:
cat /proc/loadavg
Outputs:
0.18 0.48 0.46 4/234 30719
Linux does not have any system variables that give the current CPU utilization. Instead, you have to read /proc/stat several times: each column in the cpu(n) lines gives the total CPU time, and you have to take subsequent readings of it to get percentages. See this document to find out what the various columns mean.
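A minimal sketch of that approach in shell: read the aggregate cpu line twice, one second apart, and compute the busy percentage from the deltas (idle and iowait are both counted as idle time here):

#!/bin/sh
# Print overall CPU utilization (%) from two samples of /proc/stat.
read_cpu() {
    # cpu line fields after "cpu": user nice system idle iowait irq softirq steal ...
    awk '/^cpu /{ idle=$5+$6; total=0; for(i=2;i<=NF;i++) total+=$i; print idle, total }' /proc/stat
}

set -- $(read_cpu); idle1=$1 total1=$2
sleep 1
set -- $(read_cpu); idle2=$1 total2=$2

awk -v i1="$idle1" -v t1="$total1" -v i2="$idle2" -v t2="$total2" \
    'BEGIN { printf "CPU utilization: %.1f%%\n", (1 - (i2 - i1) / (t2 - t1)) * 100 }'

Lengthening the sleep gives a smoother average; a one-second window reflects the instantaneous load.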
You can use top or ps commands to check the CPU usage.
Using top: this will show you the CPU stats.
top -b -n 1 |grep ^Cpu
Using ps: this will show you the % CPU usage for each process.
ps -eo pcpu,pid,user,args | sort -r -k1 | less
Also, you can write a small script in bash or perl to read /proc/stat and calculate the CPU usage.
The command uptime gives you load averages for the past 1, 5, and 15 minutes.
Try this command:
$ top
http://www.cyberciti.biz/tips/how-do-i-find-out-linux-cpu-utilization.html
Try this command:
cat /proc/stat
The output will look something like this:
cpu 55366 271 17283 75381807 22953 13468 94542 0
cpu0 3374 0 2187 9462432 1393 2 665 0
cpu1 2074 12 1314 9459589 841 2 43 0
cpu2 1664 0 1109 9447191 666 1 571 0
cpu3 864 0 716 9429250 387 2 118 0
cpu4 27667 110 5553 9358851 13900 2598 21784 0
cpu5 16625 146 2861 9388654 4556 4026 24979 0
cpu6 1790 0 1836 9436782 480 3307 19623 0
cpu7 1306 0 1702 9399053 726 3529 26756 0
intr 4421041070 559 10 0 4 5 0 0 0 26 0 0 0 111 0 129692 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 369 91027 1580921706 1277926101 570026630 991666971 0 277768 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 8097121
btime 1251365089
processes 63692
procs_running 2
procs_blocked 0
More details:
http://www.mail-archive.com/linuxkernelnewbies#googlegroups.com/msg01690.html
http://www.linuxhowtos.org/System/procstat.htm
Maybe something like this
ps -eo pid,pcpu,comm
And if you'd like to parse it and maybe only look at some processes:
#!/bin/sh
ps -eo pid,pcpu,comm | awk '{if ($2 > 4) print }' >> ~/ps_eo_test.txt
You need to sample the CPU counters over several seconds and calculate the utilization from the difference. If unsure what to do, get the sources of "top" and read them.
