Spark SQL partition pruning for Hive partitioned table - apache-spark

When I run the query below, Spark does not seem to push the predicate down so that only the relevant partition of the Hive table is read.
Setting "spark.sql.orc.filterPushdown" to "true" didn't help.
The Spark version is 1.6 and the Hive version is 1.2. The Hive table is partitioned by date and stored as ORC.
val sc = new SparkContext(new SparkConf())
var hql = new org.apache.spark.sql.hive.HiveContext(sc)
hql.setConf("spark.sql.orc.filterPushdown", "true")
hql.sql("""
SELECT i.*,
from_unixtime(unix_timestamp('20170220','yyyyMMdd'),"yyyy-MM-dd'T'HH:mm:ssZ") bounce_date
FROM
(SELECT country,
device_id,
os_name,
app_ver
FROM jpl_band_orc
WHERE yyyymmdd='20170220'
AND scene_id='app_intro'
AND action_id='scene_enter'
AND classifier='app_intro'
GROUP BY country, device_id, os_name, app_ver ) i
LEFT JOIN
(SELECT device_id
FROM jpl_band_orc
WHERE yyyymmdd='20170220'
AND scene_id='band_list'
AND action_id='scene_enter') s
ON i.device_id = s.device_id
WHERE s.device_id is null
""")
This is the output of SHOW CREATE TABLE:
CREATE TABLE `jpl_band_orc`(
... many fields ...
)
PARTITIONED BY (
`yyyymmdd` string)
CLUSTERED BY (
ac_hash)
SORTED BY (
ac_hash ASC)
INTO 256 BUCKETS
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'BLAH~BLAH~/jpl_band_orc'
TBLPROPERTIES (
'orc.bloom.filter.columns'='ac_hash,action_id,classifier',
'orc.bloom.filter.fpp'='0.05',
'orc.compress'='SNAPPY',
'orc.row.index.stride'='30000',
'orc.stripe.size'='268435456',
'transient_lastDdlTime'='1464922691')
Spark job output
17/02/22 17:05:32 INFO HadoopFsRelation: Listing leaf files and directories in parallel under:
hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160604, hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160608,
...
hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160620, hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160621,
It finally ended with an OOM:
Exception in thread "qtp1779914089-88" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.HashMap$KeySet.iterator(HashMap.java:912)
at java.util.HashSet.iterator(HashSet.java:172)
at sun.nio.ch.Util$2.iterator(Util.java:243)
at org.spark-project.jetty.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:600)
at org.spark-project.jetty.io.nio.SelectorManager$1.run(SelectorManager.java:290)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.String.substring(String.java:1969)
at java.net.URI$Parser.substring(URI.java:2869)
It seems Spark lists and reads all of the partitions, and then the OOM occurs.
How can I check exactly whether partition pruning happened or not?
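One way to check (a minimal sketch, not from the original post, using a shortened version of the inner query for illustration) is to print the extended plan of the DataFrame before running it and inspect the scan. If pruning works, only the yyyymmdd=20170220 partition should show up there, and the parallel leaf-file listing in the log should be limited to that directory:
// Build the DataFrame without executing it yet.
val df = hql.sql("""
  SELECT country, device_id, os_name, app_ver
  FROM jpl_band_orc
  WHERE yyyymmdd = '20170220'
    AND scene_id = 'app_intro'
""")

// Prints the parsed, analyzed, optimized and physical plans.
// Look for the filter on yyyymmdd and the partitions/paths listed by the
// table scan to confirm whether partition pruning actually happened.
df.explain(true)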

Related

How to force the arrival of new files to map to a specific schema using Auto Loader in Databricks

The Main Problem
Is there a way to force the ingestion of a file in Databricks to follow the schema of a table when using the Auto Loader feature?
Detailed explanation
Considering the arrival of a file with a schema like the one below from the year 2022:
Key | Jan/22 | Fev/22 | Mar/22 | ... | Dec/22
123 | value 123 | value 123 | value 123 | value 123 | value 123
124 | value 124 | value 124 | value 124 | value 124 | value 124
125 | value 125 | value 125 | value 125 | value 125 | value 125
We were able to ingest the information using the Auto Loader feature from Databricks with the following code:
# Import functions
from pyspark.sql.functions import input_file_name, current_timestamp, col
# Define variables used in the code below
catalog = 'catalog_test'
table_name = 'table_test'
schema_name = 'bronze_data'
data_source = 'abfss://tests@test.dfs.core.windows.net/test/bronze_data'
checkpoint_directory = f"{data_source}/_checkpoint/{table_name}"
file_filter = 'data_name*2023.csv'
source_format = "csv"
table_location = f"{catalog}.{schema_name}.{table_name}"
# Configure Auto Loader to ingest csv data to a Delta table
query = (
    spark.readStream
    .format("cloudFiles")
    # Common Auto Loader options
    .option("cloudFiles.format", source_format)
    .option("cloudFiles.schemaLocation", checkpoint_directory)
    .option("cloudFiles.schemaEvolutionMode", "rescue")
    # File format options
    .option("encoding", "UTF-8")
    .option("enforceSchema", "true")
    .option("header", "true")
    .option("inferSchema", "false")
    .option("delimiter", ",")
    .option("useStrictGlobber", "true")
    # Generic options
    .option("pathGlobFilter", file_filter)
    .load(data_source)
    .writeStream
    .outputMode("append")
    .option("checkpointLocation", checkpoint_directory)
    .option("mergeSchema", "true")
    .trigger(availableNow=True)
    .toTable(table_location)
    .awaitTermination()
)
The problem is that, now that we are in a new year (2023), when data from the new year started to arrive in the lake we had problems appending it to the bronze table. An example of the new data can be seen below:
Key | Jan/23 | Fev/23 | Mar/23 | ... | Dec/23
133 | value 133 | value 133 | value 133 | value 133 | value 133
134 | value 134 | value 134 | value 134 | value 134 | value 134
135 | value 135 | value 135 | value 135 | value 135 | value 135
To work around this, we renamed the columns of the Bronze table so they no longer include the year. It's important to highlight, though, that the files keep arriving with the year in the column names.
Key | Jan | Fev | Mar | ... | Dec
133 | value 133 | value 133 | value 133 | value 133 | value 133
134 | value 134 | value 134 | value 134 | value 134 | value 134
135 | value 135 | value 135 | value 135 | value 135 | value 135
The Auto Loader hasn't ingested the new files from 2023. It seems that even though we renamed the columns of the Bronze table, the Auto Loader checkpoint still holds the old column names.
So, is there a way to do some sort of column mapping, to "tell" Auto Loader that column_1 from a file should be ingested into column_y, column_2 into column_z, and so forth?
Or is there some alternative solution to this problem?

Cassandra READ Where In performance

I have a Cassandra cluster of 6 nodes; each one has 96 CPUs / 800 GB of RAM.
My table for performance tests is:
create table if not exists space.table
(
id bigint primary key,
data frozen<list<float>>,
updated_at timestamp
);
The table contains 150,000,000 rows.
When I was testing it with the query:
SELECT * FROM space.table WHERE id = X
I wasn't even able to overload the cluster; the client overloaded itself first, at about 350,000 RPS to the cluster.
Now I'm testing a second test case:
SELECT * FROM space.table WHERE id in (X1, X2 ... X3000)
I want to get 3000 random rows from Cassandra per request.
The max RPS in this case is 15; above that, a lot of pending tasks build up in the Cassandra thread pool of type Native-Transport-Requests.
Is it simply a bad idea to fetch big result sets from Cassandra? What is the best practice? I can of course split the 3000 ids into separate requests, for example 30 requests with 100 ids each.
Where can I find information about this? Maybe the WHERE IN operation is just not good from a performance perspective?
Update:
I want to share my measurements for fetching 3000 rows from Cassandra with different chunk sizes:
Test with 3000 ids per request
Latency: 5 seconds
Max RPS to cassandra: 20
Test with 100 ids per request (total 30 requests, 100 ids each)
Latency at 350 rps to service (350 * 30 = 10500 requests to cassandra): 170 ms (q99), 95 ms (q90), 75 ms(q50)
Max RPS to cassandra: 350 * 30 = 10500
Test with 20 ids per request (total 150 requests, 20 ids each)
Latency at 250 rps to service(250 * 150 = 37500 requests to cassandra): 49 ms (q99), 46 ms (q90), 32 ms(q50)
Latency at 600 rps to service(600 * 150 = 90000 requests to cassandra): 190 ms (q99), 180 ms (q90), 148 ms(q50)
Max RPS to cassandra: 650 * 150 = 97500
Test with 10 ids per request (total 300 requests, 10 ids each)
Latency at 250 rps to service(250 * 300 = 75000 requests to cassandra): 48 ms (q99), 31 ms (q90), 11 ms(q50)
Latency at 600 rps to service(600 * 300 = 180000 requests to cassandra): 159 ms (q99), 95 ms (q90), 75 ms(q50)
Max RPS to cassandra: 650 * 300 = 195000
Test with 5 ids per request (total 600 requests, 5 ids each)
Latency at 550 rps to service(550 * 600 = 330000 requests to cassandra): 97 ms (q99), 92 ms (q90), 60 ms(q50)
Max RPS to cassandra: 550 * 660 = 363000
Test with 1 id per request (total 3000 requests, 1 id each)
Latency at 190 rps to service(250 * 3000 = 750000 requests to cassandra): 49 ms (q99), 43 ms (q90), 30 ms(q50)
Max RPS to cassandra: 190 * 3000 = 570000
IN is really not recommended, especially with so many individual partition keys. The problem is that when you send a query with IN:
the query is sent to any node (the coordinator node), not necessarily a node that owns the data
the coordinator then identifies which nodes own the data for the specific partition keys
queries are sent to the identified nodes
the coordinator collects the results from all of those nodes
the consolidated result is sent back to the client
This puts a lot of load onto the coordinator node and makes the whole query as slow as the slowest node in the cluster.
The better solution would be to use prepared queries and send individual async requests for each of the partition keys, then collect the data in your application. Just take into account that there are limits on how many in-flight queries a single connection can have.
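For illustration (not from the original answer), here is a minimal sketch of that approach in Scala with the DataStax Java driver 3.x. The contact point and the id list are placeholders; the keyspace and table names come from the question:
import com.datastax.driver.core.{Cluster, ResultSetFuture}
import scala.collection.JavaConverters._

// Placeholder contact point; adjust to your cluster.
val cluster = Cluster.builder().addContactPoint("10.0.0.1").build()
val session = cluster.connect()

// Prepare once, bind per partition key.
val stmt = session.prepare("SELECT id, data, updated_at FROM space.table WHERE id = ?")

// Placeholder ids; in the question this would be the ~3000 random ids.
val ids: Seq[Long] = Seq(1L, 2L, 3L)

// Fire the requests asynchronously (in real code, throttle them: there are
// limits on in-flight requests per connection), then collect in the application.
val futures: Seq[ResultSetFuture] =
  ids.map(id => session.executeAsync(stmt.bind(java.lang.Long.valueOf(id))))

val rows = futures.flatMap(f => f.getUninterruptibly().all().asScala)

cluster.close()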
P.S. It should be possible to optimize this further by looking at your values, finding which partition keys fall into the same token range, generating an IN query for all keys in the same token range, and sending that query with the routing key set explicitly. But this requires more advanced coding.

presto with hudi - select * from table

I have a Parquet record created with Hudi from a Spark Kinesis stream and stored in S3.
An AWS Glue table is generated from this record. I updated the input format to org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat as per the instructions at https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi
From the presto-cli I run:
presto-cli --catalog hive --schema my-schema --server my-server:8889
presto:my-schema> select * from table
this returns
Query 20200211_185222_00050_hej8h, FAILED, 1 node
Splits: 17 total, 0 done (0.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20200211_185222_00050_hej8h failed: No value present
However, when I run
select id from table
it returns
id
----------
34551832
(1 row)
Query 20200211_185250_00051_hej8h, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [1 rows, 93B] [2 rows/s, 213B/s]
Is this expected behaviour, or is there an underlying issue with the setup between Hudi / AWS Glue / Presto?
Update 12-Feb-2020
Stack trace using the --debug option:
presto:schema> select * from table;
Query 20200212_092259_00006_hej8h, FAILED, 1 node
http://xx-xxx-xxx-xxx.xx-xxxxx-xxx.compute.amazonaws.com:8889/ui/query.html?20200212_092259_00006_hej8h
Splits: 17 total, 0 done (0.00%)
CPU Time: 0.0s total, 0 rows/s, 0B/s, 23% active
Per Node: 0.1 parallelism, 0 rows/s, 0B/s
Parallelism: 0.1
Peak Memory: 0B
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20200212_092259_00006_hej8h failed: No value present
java.util.NoSuchElementException: No value present
at java.util.Optional.get(Optional.java:135)
at com.facebook.presto.parquet.reader.ParquetReader.readArray(ParquetReader.java:156)
at com.facebook.presto.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:282)
at com.facebook.presto.parquet.reader.ParquetReader.readStruct(ParquetReader.java:193)
at com.facebook.presto.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:276)
at com.facebook.presto.parquet.reader.ParquetReader.readStruct(ParquetReader.java:193)
at com.facebook.presto.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:276)
at com.facebook.presto.parquet.reader.ParquetReader.readBlock(ParquetReader.java:268)
at com.facebook.presto.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:247)
at com.facebook.presto.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:225)
at com.facebook.presto.spi.block.LazyBlock.assureLoaded(LazyBlock.java:283)
at com.facebook.presto.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:274)
at com.facebook.presto.spi.Page.getLoadedPage(Page.java:261)
at com.facebook.presto.operator.TableScanOperator.getOutput(TableScanOperator.java:254)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:379)
at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:283)
at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:675)
at com.facebook.presto.operator.Driver.processFor(Driver.java:276)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1077)
at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:483)
at com.facebook.presto.$gen.Presto_0_227____20200211_134743_1.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
It appears the problem may be elsewhere; an issue has been raised with the Hudi team here --> https://github.com/apache/incubator-hudi/issues/1325

Hive can't find partitioned data written by Spark Structured Streaming

I have a Spark Structured Streaming job writing data to IBM Cloud Object Storage (S3):
dataDf.
writeStream.
format("parquet").
trigger(Trigger.ProcessingTime(trigger_time_ms)).
option("checkpointLocation", s"${s3Url}/checkpoint").
option("path", s"${s3Url}/data").
option("spark.sql.hive.convertMetastoreParquet", false).
partitionBy("InvoiceYear", "InvoiceMonth", "InvoiceDay", "InvoiceHour").
start()
I can see the data using the hdfs CLI:
[clsadmin@xxxxx ~]$ hdfs dfs -ls s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0 | head
Found 616 items
-rw-rw-rw- 1 clsadmin clsadmin 38085 2018-09-25 01:01 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-1e1dda99-bec2-447c-9bd7-bedb1944f4a9.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 45874 2018-09-25 00:31 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-28ff873e-8a9c-4128-9188-c7b763c5b4ae.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 5124 2018-09-25 01:10 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-5f768960-4b29-4bce-8f31-2ca9f0d42cb5.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 40154 2018-09-25 00:20 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-70abc027-1f88-4259-a223-21c4153e2a85.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 41282 2018-09-25 00:50 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-873a1caa-3ecc-424a-8b7c-0b2dc1885de4.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 41241 2018-09-25 00:40 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-88b617bf-e35c-4f24-acec-274497b1fd31.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 3114 2018-09-25 00:01 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-deae2a19-1719-4dfa-afb6-33b57f2d73bb.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 38877 2018-09-25 00:10 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-e07429a2-43dc-4e5b-8fe7-c55ec68783b3.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 39060 2018-09-25 00:20 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00001-1553da20-14d0-4c06-ae87-45d22914edba.c000.snappy.parquet
However, when I try to query the data:
hive> select * from invoiceitems limit 5;
OK
Time taken: 2.392 seconds
My table DDL looks like this:
CREATE EXTERNAL TABLE `invoiceitems`(
`invoiceno` int,
`stockcode` int,
`description` string,
`quantity` int,
`invoicedate` bigint,
`unitprice` double,
`customerid` int,
`country` string,
`lineno` int,
`invoicetime` string,
`storeid` int,
`transactionid` string,
`invoicedatestring` string)
PARTITIONED BY (
`invoiceyear` int,
`invoicemonth` int,
`invoiceday` int,
`invoicehour` int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3a://streaming-data-landing-zone-partitioned/data'
I've also tried with the correct case for column/partition names - this doesn't work either.
Any ideas why my query isn't finding the data?
UPDATE 1:
I have tried setting the location to a directory containing the data without partitions and this still doesn't work, so I'm wondering if it is a data formatting issue?
CREATE EXTERNAL TABLE `invoiceitems`(
`InvoiceNo` int,
`StockCode` int,
`Description` string,
`Quantity` int,
`InvoiceDate` bigint,
`UnitPrice` double,
`CustomerID` int,
`Country` string,
`LineNo` int,
`InvoiceTime` string,
`StoreID` int,
`TransactionID` string,
`InvoiceDateString` string)
PARTITIONED BY (
`InvoiceYear` int,
`InvoiceMonth` int,
`InvoiceDay` int,
`InvoiceHour` int)
STORED AS PARQUET
LOCATION
's3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/';
hive> Select * from invoiceitems limit 5;
OK
Time taken: 2.066 seconds
Read from Snappy-compressed Parquet files
The data is in Snappy-compressed Parquet format:
s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-1e1dda99-bec2-447c-9bd7-bedb1944f4a9.c000.snappy.parquet
So set the 'PARQUET.COMPRESS'='SNAPPY' table property in the CREATE TABLE DDL statement. Alternatively, you can set parquet.compression=SNAPPY in the "Custom hive-site settings" section in Ambari for either IOP or HDP.
Here is an example of using the table property during a table creation statement in Hive:
hive> CREATE TABLE inv_hive_parquet(
trans_id int, product varchar(50), trans_dt date
)
PARTITIONED BY (
year int)
STORED AS PARQUET
TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY');
Update partition metadata in the external table
Also, for an external partitioned table, we need to update the partition metadata whenever an external job (the Spark job in this case) writes partitions to the data folder directly, because Hive will not be aware of these partitions unless they are explicitly registered.
That can be done by either:
ALTER TABLE inv_hive_parquet RECOVER PARTITIONS;
-- or
MSCK REPAIR TABLE inv_hive_parquet;
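If the writing job shares the same Hive metastore, the repair can also be triggered from the Spark side. A minimal sketch (not part of the original answer), assuming spark is the job's Hive-enabled SparkSession and using the invoiceitems table from the question:
// Run after a micro-batch, or on a schedule, so Hive sees newly written partitions.
spark.sql("MSCK REPAIR TABLE invoiceitems")

// Or register a specific partition explicitly instead of scanning the whole location:
spark.sql(
  "ALTER TABLE invoiceitems ADD IF NOT EXISTS PARTITION " +
  "(InvoiceYear=2018, InvoiceMonth=9, InvoiceDay=25, InvoiceHour=0)")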

Coordinator gets a response from one node notably later than from the other nodes

Please help me understand what I missed.
I see strange behavior of one cluster node on SELECT with LIMIT and ORDER BY DESC clauses:
SELECT cid FROM test_cf WHERE uid = 0x50236b6de695baa1140004bf ORDER BY tuuid DESC LIMIT 1000;
TRACING (only part):
…
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:25.117000 | 10.0.23.15 | 7862
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:25.136000 | 10.0.25.57 | 6283
Sending REQUEST_RESPONSE message to /10.0.25.56 [MessagingService-Outgoing-/10.0.25.56] | 2016-02-29 22:17:38.568000 | 10.0.24.51 | 457931
…
10.0.25.56 - coordinator node
10.0.23.15, 10.0.24.51, 10.0.25.57 - nodes with data
The coordinator gets the response from 10.0.24.51 about 13 seconds later than from the other nodes! Why is that? How can I fix it?
The number of rows for this partition key (uid = 0x50236b6de695baa1140004bf) is about 300.
Everything is fine if we use ORDER BY ASC (our clustering order) or a LIMIT value smaller than the number of rows for this partition key.
Cassandra (v2.2.5) cluster contains 25 nodes.
Every node holds about 400Gb of data.
The cluster is placed in AWS. Nodes are evenly distributed over 3 subnets in a VPC. The instance type is c3.4xlarge (16 CPU cores, 30 GB RAM). We use EBS-backed storage (1 TB GP SSD).
Keyspace RF equals 3.
Column family:
CREATE TABLE test_cf (
uid blob,
tuuid timeuuid,
cid text,
cuid blob,
PRIMARY KEY (uid, tuuid)
) WITH CLUSTERING ORDER BY (tuuid ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction ={'class':'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression ={'sstable_compression':'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 86400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
nodetool gcstats (10.0.25.57):
Interval (ms) | Max GC Elapsed (ms) | Total GC Elapsed (ms) | Stdev GC Elapsed (ms) | GC Reclaimed (MB) | Collections | Direct Memory Bytes
1208504 | 368 | 4559 | 73 | 553798792712 | 58 | 305691840
nodetool gcstats (10.0.23.15):
Interval (ms) | Max GC Elapsed (ms) | Total GC Elapsed (ms) | Stdev GC Elapsed (ms) | GC Reclaimed (MB) | Collections | Direct Memory Bytes
1445602 | 369 | 3120 | 57 | 381929718000 | 38 | 277907601
nodetool gcstats (10.0.24.51):
Interval (ms) | Max GC Elapsed (ms) | Total GC Elapsed (ms) | Stdev GC Elapsed (ms) | GC Reclaimed (MB) | Collections | Direct Memory Bytes
1174966 | 397 | 4137 | 69 | 1900387479552 | 45 | 304448986
This could be due to a number of factors both related and not related to Cassandra.
Non-Cassandra Specific
How does the hardware (CPU, RAM, disk type (SSD vs. rotational)) on this node compare to the other nodes?
How is the network configured? Is traffic to this node slower than other nodes? Do you have a routing issue between the nodes?
How does the load on this server compare to other nodes?
Cassandra Specific
Is the JVM properly configured? Is GC running significantly more frequently than the other nodes? Check nodetool gcstats on this and other nodes to compare.
Has compaction been run on this node recently? Check nodetool compactionhistory
Are there any issues with corrupted files on disk?
Have you checked the system.log to see if it contains any information?
Besides general Linux troubleshooting I would suggest you compare some of the specific C* functionality using nodetool and look for differences:
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsNodetool_r.html
