Not marking nodes down due to local pause of 8478595263 > 5000000000 - cassandra

I have a 3-node Cassandra cluster in Kubernetes, deployed using the bitnami/cassandra Helm chart.
After some time, once the number of requests grows, I start getting warnings like these:
WARN [GossipTasks:1] 2020-01-09 11:39:33,070 FailureDetector.java:278 - Not marking nodes down due to local pause of 8206335128 > 5000000000
WARN [GossipTasks:1] 2020-01-09 11:39:42,238 FailureDetector.java:278 - Not marking nodes down due to local pause of 6668041401 > 5000000000
WARN [GossipTasks:1] 2020-01-09 11:40:03,341 FailureDetector.java:278 - Not marking nodes down due to local pause of 15041441083 > 5000000000
WARN [PERIODIC-COMMIT-LOG-SYNCER] 2020-01-09 11:41:55,606 NoSpamLogger.java:94 - Out of 1 commit log syncs over the past 0.00s with average duration of 11850.79ms, 1 have exceeded the configured commit interval by an average of 1850.79ms
WARN [GossipTasks:1] 2020-01-09 11:42:20,019 Gossiper.java:783 - Gossip stage has 1 pending tasks; skipping status check (no nodes will be marked down)
INFO [RequestResponseStage-1] 2020-01-09 11:45:36,329 Gossiper.java:1011 - InetAddress /100.96.7.7 is now UP
INFO [RequestResponseStage-1] 2020-01-09 11:45:36,330 Gossiper.java:1011 - InetAddress /100.96.7.7 is now UP
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,931 MessagingService.java:1236 - MUTATION messages were dropped in last 5000 ms: 0 internal and 45 cross node. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 2874 ms
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,933 StatusLogger.java:47 - Pool Name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,949 StatusLogger.java:51 - MutationStage 0 0 226236 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,950 StatusLogger.java:51 - ViewMutationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,950 StatusLogger.java:51 - ReadStage 0 0 244468 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,951 StatusLogger.java:51 - RequestResponseStage 0 0 341270 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,952 StatusLogger.java:51 - ReadRepairStage 0 0 5395 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,953 StatusLogger.java:51 - CounterMutationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,958 StatusLogger.java:51 - MiscStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,959 StatusLogger.java:51 - CompactionExecutor 0 0 686641 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,960 StatusLogger.java:51 - MemtableReclaimMemory 0 0 689 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,962 StatusLogger.java:51 - PendingRangeCalculator 0 0 9 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,964 StatusLogger.java:51 - GossipStage 0 0 3093860 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,966 StatusLogger.java:51 - SecondaryIndexManagement 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,970 StatusLogger.java:51 - HintsDispatcher 0 0 10 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,973 StatusLogger.java:51 - MigrationStage 0 0 6 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,973 StatusLogger.java:51 - MemtablePostFlush 0 0 717 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,974 StatusLogger.java:51 - PerDiskMemtableFlushWriter_0 0 0 689 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,974 StatusLogger.java:51 - ValidationExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,975 StatusLogger.java:51 - Sampler 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,975 StatusLogger.java:51 - MemtableFlushWriter 0 0 689 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,976 StatusLogger.java:51 - InternalResponseStage 0 0 869 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,977 StatusLogger.java:51 - AntiEntropyStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2020-01-09 11:45:55,978 StatusLogger.java:51 - CacheCleanupExecutor 0 0 0 0 0
INFO [Service Thread] 2020-01-09 12:11:49,292 GCInspector.java:284 - ParNew GC in 659ms. CMS Old Gen: 2056877512 -> 2057740336; Par Eden Space: 671088640 -> 0; Par Survivor Space: 2636992 -> 6187520
I tried to solve this based on some referenced issues, e.g. "Cassandra Error message: Not marking nodes down due to local pause. Why?", but none of them cover Kubernetes.
Here is my nodetool tpstats output:
Pool Name Active Pending Completed Blocked All time blocked
ReadStage 0 0 245904 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 696906 0 0
MutationStage 0 0 244820 0 0
MemtableReclaimMemory 0 0 697 0 0
PendingRangeCalculator 0 0 9 0 0
GossipStage 0 0 3138625 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 10 0 0
RequestResponseStage 0 0 364305 0 0
Native-Transport-Requests 0 0 11089339 0 241
ReadRepairStage 0 0 5395 0 0
CounterMutationStage 0 0 0 0 0
MigrationStage 0 0 6 0 0
MemtablePostFlush 0 0 725 0 0
PerDiskMemtableFlushWriter_0 0 0 697 0 0
ValidationExecutor 0 0 0 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 697 0 0
InternalResponseStage 0 0 869 0 0
ViewMutationStage 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
CacheCleanupExecutor 0 0 0 0 0
Message type Dropped
READ 0
RANGE_SLICE 0
_TRACE 0
HINT 0
MUTATION 45
COUNTER_MUTATION 0
BATCH_STORE 0
BATCH_REMOVE 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0

From the "tpstats" output above, the metrics mostly look okay, but the dropped MUTATION messages indicate that your cluster is becoming overloaded. Some Native-Transport-Requests have been blocked as well, and the commit log does not seem to be keeping up with the writes (note the commit log sync warning). You should plan a cluster expansion or start debugging why the nodes are overloaded.

Based on the log entries you posted above, the nodes are overloaded, which makes them unresponsive. Mutations get dropped because the commit log disks cannot keep up with the writes.
You will need to review the size of your cluster, as you might need to add more nodes to increase capacity. Cheers!
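To narrow down where the pause comes from, here is a minimal diagnostic sketch (the pod name cassandra-0 and the way extra JVM options are passed to the Bitnami chart are assumptions; adjust for your deployment):

# GC pause statistics and dropped/blocked work inside the Cassandra pod
kubectl exec -it cassandra-0 -- nodetool gcstats
kubectl exec -it cassandra-0 -- nodetool tpstats | grep -i -E 'dropped|blocked'
# Compare actual pod CPU/memory usage against the requests/limits from the chart values
kubectl top pod cassandra-0
# If long GC pauses on an undersized heap are the cause, fix heap/resources first.
# As a last resort, the 5 s threshold behind this warning can be raised with a JVM
# system property (assumption: verify the property name for your Cassandra version):
#   -Dcassandra.max_local_pause_in_ms=10000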

Related

Deleted data in Cassandra comes back, like a ghost

I have a 3-node Cassandra cluster (3.7), a keyspace:
CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes = true;
and a table:
CREATE TABLE tradingdate (key text,tradingdate date,PRIMARY KEY (key, tradingdate));
One day, when deleting one row like:
delete from tradingdate
where key='tradingDay' and tradingdate='2018-12-31';
then the deleted row became a ghost. When running the query:
select * from tradingdate
where key='tradingDay' and tradingdate>'2018-12-27' limit 2;
key | tradingdate
------------+-------------
tradingDay | 2018-12-28
tradingDay | 2019-01-02
select * from tradingdate
where key='tradingDay' and tradingdate<'2019-01-03'
order by tradingdate desc limit 2;
key | tradingdate
------------+-------------
tradingDay | 2019-01-02
tradingDay | 2018-12-31
So when using ORDER BY, the deleted row (tradingDay, 2018-12-31) comes back.
I guess I only deleted the row on one node, but it still exists on another node. So I executed:
nodetool repair demo tradingdate
on all 3 nodes, and then the deleted row totally disappeared.
So I want to know: why can I see the ghost row when using ORDER BY?
This is some good reading about deletes in Cassandra (and other distributed systems as well):
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
As well as:
https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html
You will need to run/schedule a routine repair at least once within gc_grace_seconds (which defaults to ten days) to prevent deleted data from reappearing in your cluster.
Also, you should look for dropped messages, in case one of your nodes is missing deletes (and other messages):
# nodetool tpstats
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 0 0 787032744 0 0
ReadStage 0 0 1627843193 0 0
RequestResponseStage 0 0 2257452312 0 0
ReadRepairStage 0 0 99910415 0 0
CounterMutationStage 0 0 0 0 0
HintedHandoff 0 0 1582 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 6649458 0 0
MemtableReclaimMemory 0 0 17987 0 0
PendingRangeCalculator 0 0 46 0 0
GossipStage 0 0 22766295 0 0
MigrationStage 0 0 8 0 0
MemtablePostFlush 0 0 127844 0 0
ValidationExecutor 0 0 0 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 17851 0 0
InternalResponseStage 0 0 8669 0 0
AntiEntropyStage 0 0 0 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 0 0 631966060 0 19
Message type Dropped
READ 0
RANGE_SLICE 0
_TRACE 0
MUTATION 0
COUNTER_MUTATION 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0
Dropped messages indicate that there is something wrong.
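As a minimal sketch of the routine repair suggested above (keyspace and table names taken from the question; run it on every node and adapt the schedule so it stays within gc_grace_seconds):

# Primary-range repair of the affected table on this node
nodetool repair -pr demo tradingdate
# Check the table's current gc_grace_seconds (default 864000 s = 10 days)
cqlsh -e "SELECT gc_grace_seconds FROM system_schema.tables WHERE keyspace_name='demo' AND table_name='tradingdate';"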

Native Transport Requests in Cassandra

I got some points about Native Transport Requests in Cassandra from this link: What are native transport requests in Cassandra?
As per my understanding, any query I execute in Cassandra is a Native Transport Request.
I frequently get a "Request Timed Out" error in Cassandra, and I observed the following in the Cassandra debug log as well as with nodetool tpstats:
/var/log/cassandra# nodetool tpstats
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 0 0 186933949 0 0
ViewMutationStage 0 0 0 0 0
ReadStage 0 0 781880580 0 0
RequestResponseStage 0 0 5783147 0 0
ReadRepairStage 0 0 0 0 0
CounterMutationStage 0 0 14430168 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 366708 0 0
MemtableReclaimMemory 0 0 788 0 0
PendingRangeCalculator 0 0 1 0 0
GossipStage 0 0 0 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 0 0 0
MigrationStage 0 0 0 0 0
MemtablePostFlush 0 0 799 0 0
ValidationExecutor 0 0 0 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 788 0 0
InternalResponseStage 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 0 0 477629331 0 1063468
Message type Dropped
READ 0
RANGE_SLICE 0
_TRACE 0
HINT 0
MUTATION 0
COUNTER_MUTATION 0
BATCH_STORE 0
BATCH_REMOVE 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0
1) What is the "All time blocked" state?
2) What does the value 1063468 denote? How harmful is it?
3) How do I tune this?
Each request is processed by the NTR stage before being handed off to the read/mutation stages, but it still blocks while waiting for completion. To prevent being overloaded, the stage starts blocking tasks from being added to its queue in order to apply back pressure to the client. Every time a request is blocked, the "All time blocked" counter is incremented. So 1063468 requests have, at some point, been blocked for a period of time because too many requests were backed up.
In situations where the app has spikes of queries, this blocking is unnecessary and can cause issues, so you can increase the queue limit with something like -Dcassandra.max_queued_native_transport_requests=4096 (the default is 128). You can also throttle requests on the client side, but I'd try increasing the queue size first.
There may also be some exceptionally slow request that is clogging up your system. If you have monitoring set up, look at high-percentile read/write coordinator latencies. You can also use nodetool proxyhistograms. There may be something in your data model or queries that is causing issues.
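A minimal sketch of both suggestions (file locations vary by install; 4096 is just an example value):

# Add to jvm.options (Cassandra 3.x) or append to JVM_OPTS in cassandra-env.sh:
-Dcassandra.max_queued_native_transport_requests=4096
# Then look for exceptionally slow requests via coordinator latency percentiles:
nodetool proxyhistograms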

Failed to add a new node to a Cassandra cluster

I have a cluster with four nodes, each holding 70 GB of data.
When I add a new node to the cluster, it always warns me about a tombstone problem like this:
WARN 09:38:03 Read 2578 live and 1114 tombstoned cells in xxxtable (see tombstone_warn_threshold).
10000 columns was requested, slices=[-], delInfo={deletedAt=-9223372036854775808,
localDeletion=2147483647, ranges=[FAE69193423616A400258D99B9C0CCCFEC4A9547C1A1FC17BF569D2405705B8E:_-FAE69193423616A400258D99B9C0CCCFEC4A9547C1A1FC17BF569D2405705B8E:!,
deletedAt=1456243983944000, localDeletion=1456243983][FAE69193423616A40EC252766DDF513FBCA55ECDFAF452052E6C95D4BD641201:_-FAE69193423616A40EC252766DDF513FBCA55ECDFAF452052E6C95D4BD641201:!,
deletedAt=1460026357100000, localDeletion=1460026357][FAE69193423616A41BED8E613CD24BF3583FB6C6ABBA13F19C3E2D1824D01EF6:_-FAE69193423616A41BED8E613CD24BF3583FB6C6ABBA13F19C3E2D1824D01EF6:!, deletedAt=1458176745950000, localDeletion=1458176745][FAE69193423616A41BED8E613CD24BF3B06C1306E35B0ACA719D800D254E5930:_-FAE69193423616A41BED8E613CD24BF3B06C1306E35B0ACA719D800D254E5930:!, deletedAt=1458176745556000, localDeletion=1458176745][FAE69193423616A41BED8E613CD24BF3BA2AE7FC8340F96CC440BDDFFBCBE7D0:_-FAE69193423616A41BED8E613CD24BF3BA2AE7FC8340F96CC440BDDFFBCBE7D0:!,
deletedAt=1458176745740000, localDeletion=1458176745][FAE69193423616A41BED8E613CD24BF3E5A681C7ECC09A93429CEE59A76DA131:_-FAE69193423616A41BED8E613CD24BF3E5A681C7ECC09A93429CEE59A76DA131:!,
deletedAt=1458792793219000, localDeletion=
and finally it takes a long time to start and throws:
java.lang.OutOfMemoryError: Java heap space
Following is the error log:
INFO 20:39:20 ConcurrentMarkSweep GC in 5859ms. CMS Old Gen: 6491794984 -> 6492437040; Par Eden Space: 1398145024 -> 1397906216; Par Survivor Space: 349072992 -> 336156096
INFO 20:39:20 Enqueuing flush of refresh_token: 693 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 Pool Name Active Pending Completed Blocked All Time Blocked
INFO 20:39:20 Enqueuing flush of log_user_track: 7047 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 CounterMutationStage 0 0 0 0 0
INFO 20:39:20 Enqueuing flush of userinbox: 42819 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 Enqueuing flush of messages: 7954 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 ReadStage 0 0 0 0 0
INFO 20:39:20 RequestResponseStage 0 0 6 0 0
INFO 20:39:20 Enqueuing flush of sstable_activity: 6567 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 ReadRepairStage 0 0 0 0 0
INFO 20:39:20 Enqueuing flush of convmsgs: 2132 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 MutationStage 0 0 72300 0 0
INFO 20:39:20 Enqueuing flush of sstable_activity: 1791 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 GossipStage 0 0 23655 0 0
INFO 20:39:20 Enqueuing flush of log_user_track: 1165 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 AntiEntropyStage 0 0 0 0 0
INFO 20:39:20 Enqueuing flush of sstable_activity: 2388 (0%) on-heap, 0 (0%) off-heap
INFO 20:39:20 CacheCleanupExecutor 0 0 0 0 0
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid17155.hprof ...
When I run nodetool tpstats, I see that the MemtableFlushWriter and MemtablePostFlush pools have a lot of pending tasks.
Pool Name Active Pending Completed Blocked All time blocked
CounterMutationStage 0 0 0 0 0
ReadStage 0 0 0 0 0
RequestResponseStage 0 0 8 0 0
MutationStage 0 0 1382245 0 0
ReadRepairStage 0 0 0 0 0
GossipStage 0 0 23553 0 0
CacheCleanupExecutor 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
MigrationStage 0 0 0 0 0
ValidationExecutor 0 0 0 0 0
CommitLogArchiver 0 0 0 0 0
MiscStage 0 0 0 0 0
MemtableFlushWriter 4 7459 220 0 0
MemtableReclaimMemory 0 0 231 0 0
PendingRangeCalculator 0 0 3 0 0
MemtablePostFlush 1 7464 331 0 0
CompactionExecutor 3 3 269 0 0
InternalResponseStage 0 0 0 0 0
HintedHandoff 0 0 4 0 0

Cassandra 2.1 writes are slow on a table with 1 TB of data

I am doing some tests in a Cassandra cluster, and I now have a table with 1 TB of data per node. When I used YCSB to run more insert operations, I found the throughput was really low (about 10,000 ops/sec) compared to an identical, new table in the same cluster (about 80,000 ops/sec). While inserting, CPU usage was about 40%, with almost no disk usage.
I used nodetool tpstats to get task details, and it showed:
Pool Name Active Pending Completed Blocked All time blocked
CounterMutationStage 0 0 0 0 0
ReadStage 0 0 102 0 0
RequestResponseStage 0 0 41571733 0 0
MutationStage 384 21949 82375487 0 0
ReadRepairStage 0 0 0 0 0
GossipStage 0 0 247100 0 0
CacheCleanupExecutor 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
MigrationStage 0 0 6 0 0
Sampler 0 0 0 0 0
ValidationExecutor 0 0 0 0 0
CommitLogArchiver 0 0 0 0 0
MiscStage 0 0 0 0 0
MemtableFlushWriter 16 16 4745 0 0
MemtableReclaimMemory 0 0 4745 0 0
PendingRangeCalculator 0 0 4 0 0
MemtablePostFlush 1 163 9394 0 0
CompactionExecutor 8 29 13713 0 0
InternalResponseStage 0 0 0 0 0
HintedHandoff 2 2 5 0 0
I found there was a large number of pending MutationStage and MemtablePostFlush tasks.
I have read some related articles about Cassandra write limitations, but found no useful information. I want to know why there is such a huge difference in throughput between two identical tables whose only difference is the amount of data.
In addition, I use SSDs on my servers. However, this phenomenon also occurs in another cluster using HDDs.
While Cassandra was running, I found that both %user and %nice CPU utilization were around 10%, with only compaction tasks running at a compaction throughput of about 80 MB/s. But I had set the nice value to 0 for my Cassandra process.
Wild guess: your system is busy compacting SSTables.
Check it with nodetool compactionstats.
BTW, YCSB does not use prepared statements, which makes it a poor estimator of actual application load.
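A minimal sketch of that check (128 MB/s is just an example value to adapt):

# Are compactions backed up on this node?
nodetool compactionstats
# Optionally raise the compaction throughput cap (MB/s, 0 = unthrottled) so compactions can catch up:
nodetool setcompactionthroughput 128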

Error while installing Cassandra on Windows

INFO 19:07:42,273 GC for ParNew: 2182 ms, 27013384 reclaimed leaving 215461536
used; max is 1171062784
INFO 19:07:44,382 Pool Name Active Pending
INFO 19:07:44,960 ReadStage 0 0
INFO 19:07:44,976 RequestResponseStage 0 0
INFO 19:07:44,976 ReadRepairStage 0 0
INFO 19:07:45,007 MutationStage 0 0
INFO 19:07:45,007 GossipStage 0 0
INFO 19:07:45,007 AntiEntropyStage 0 0
INFO 19:07:45,007 MigrationStage 0 0
INFO 19:07:45,007 StreamStage 0 0
INFO 19:07:45,022 MemtablePostFlusher 0 0
INFO 19:07:45,022 FlushWriter 0 0
INFO 19:07:45,022 MiscStage 0 0
INFO 19:07:45,022 FlushSorter 0 0
INFO 19:07:45,038 InternalResponseStage 0 0
INFO 19:07:45,038 HintedHandoff 0 0
INFO 19:07:45,085 CompactionManager n/a 0
INFO 19:07:45,101 MessagingService n/a 0,0
INFO 19:07:45,116 ColumnFamily Memtable ops,data Row cache siz
/cap Key cache size/cap
INFO 19:07:45,288 system.LocationInfo 0,0
0/0 1/1
INFO 19:07:45,304 system.HintsColumnFamily 0,0
0/0 0/1
INFO 19:07:45,319 system.Migrations 0,0
0/0 0/1
INFO 19:07:45,319 system.Schema 0,0
0/0 0/1
INFO 19:07:45,319 system.IndexInfo 0,0
0/0 0/1
After this, my installation process does not proceed. It just hangs, showing:
Listening for thrift clients....
That's exactly what you want to see if your Cassandra node/cluster is up and running.
Schildmeijer is correct - this is a normal log output from running Cassandra after a successful installation.
If you are unsure, then try running the Cassandra CLI (see http://wiki.apache.org/cassandra/CassandraCli) and execute some commands to check that the server node responds.
Cassandra runs as a server process - you don't interact directly with it, only via the CLI or another client tool.
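For example, a quick sanity check with the Thrift-era CLI might look like this (default host/port shown; on Windows the script is cassandra-cli.bat):

cassandra-cli -h localhost -p 9160
# at the CLI prompt:
#   show cluster name;
#   show keyspaces;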
