Cassandra batch prepared statement size warning

I see this warning continuously in the debug.log in Cassandra:
WARN [SharedPool-Worker-2] 2018-05-16 08:33:48,585 BatchStatement.java:287 - Batch of prepared statements for [test, test1] is of size 6419, exceeding specified threshold of 5120 by 1299.
where:
6419 - input payload size (batch), in bytes
5120 - threshold size, in bytes
1299 - number of bytes above the threshold
According to this ticket, https://github.com/krasserm/akka-persistence-cassandra/issues/33, the warning is caused by an increase in the input payload size, so I increased commitlog_segment_size_in_mb in cassandra.yaml to 60 MB and we no longer see the warning.
Is this warning harmful? Will increasing commitlog_segment_size_in_mb affect performance?

This is not directly related to the commit log size, and I wonder why changing it made the warning disappear.
The batch size threshold is controlled by the batch_size_warn_threshold_in_kb parameter, which defaults to 5 KB (5120 bytes).
You can increase this parameter, but you really need a good reason to use batches. It would help to understand the context in which you are using them.
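As a minimal sketch of how the numbers in the WARN line relate to batch_size_warn_threshold_in_kb (the helper name is hypothetical and only illustrates the arithmetic, it is not Cassandra code):

```python
def batch_excess_bytes(batch_size_bytes: int, warn_threshold_kb: int = 5) -> int:
    """Return how many bytes a batch exceeds the warn threshold by (0 if under it)."""
    # batch_size_warn_threshold_in_kb is expressed in KB, the log line in bytes.
    threshold_bytes = warn_threshold_kb * 1024
    return max(0, batch_size_bytes - threshold_bytes)


# Values taken from the log line in the question.
print(batch_excess_bytes(6419))     # 1299
# With a hypothetical threshold of 8 KB the same batch would not warn.
print(batch_excess_bytes(6419, 8))  # 0
```

This only shows why 6419 bytes trips a 5120-byte threshold; it says nothing about whether raising the threshold is a good idea for a given workload.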

commitlog_segment_size_in_mb represents your block size for commit log archiving or point-in-time backup. These are only active if you have configured archive_command or restore_command in your commitlog_archiving.properties file.
The default size is 32 MB.
As per the Expert Apache Cassandra Administration book:
you must ensure that the value of commitlog_segment_size_in_mb is at least twice the value of max_mutation_size_in_kb (after converting both to the same unit).
For reference, see this question:
Mutation of 17076203 bytes is too large for the maxiumum size of 16777216

Related

Why is the default value of spark.memory.fraction so low?

From the Spark configuration docs, we understand the following about the spark.memory.fraction configuration parameter:
Fraction of (heap space - 300MB) used for execution and storage. The lower this is, the more frequently spills and cached data eviction occur. The purpose of this config is to set aside memory for internal metadata, user data structures, and imprecise size estimation in the case of sparse, unusually large records. Leaving this at the default value is recommended.
The default value for this configuration parameter is 0.6 at the time of writing this question. This means that for an executor with, for example, 32GB of heap space and the default configurations we have:
300MB of reserved space (a hardcoded value on this line)
(32GB - 300MB) * 0.6 = 19481MB of shared memory for execution + storage
(32GB - 300MB) * 0.4 = 12987MB of user memory
This "user memory" is (according to the docs) used for the following:
The rest of the space (40%) is reserved for user data structures, internal metadata in Spark, and safeguarding against OOM errors in the case of sparse and unusually large records.
On an executor with 32GB of heap space, we're allocating 12.7GB of memory for this, which feels rather large!
Do these user data structures/internal metadata/safeguards against OOM errors really need that much space? Are there any striking examples of user memory usage that illustrate the need for such a large user memory region?
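The arithmetic in the question can be reproduced with a short sketch. The function name is made up for illustration; the 300MB reserve and the 0.6 default come from the question itself:

```python
RESERVED_MB = 300  # reserved memory, as stated in the question


def spark_memory_regions(heap_mb: float, memory_fraction: float = 0.6):
    """Split the heap into (execution + storage, user memory), both in MB."""
    usable_mb = heap_mb - RESERVED_MB
    unified_mb = usable_mb * memory_fraction
    user_mb = usable_mb * (1 - memory_fraction)
    return unified_mb, user_mb


unified_mb, user_mb = spark_memory_regions(32 * 1024)
print(round(unified_mb))         # 19481
print(round(user_mb))            # 12987
print(round(user_mb / 1024, 1))  # 12.7 (GB)
```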
I did some research, and in my opinion the value is 0.6 not to guarantee enough user memory, but to ensure that execution + storage can fit into the old generation region of the JVM.
I found something interesting in the Spark tuning guide:
The tenured generation size is controlled by the JVM’s NewRatio
parameter, which defaults to 2, meaning that the tenured generation is
2 times the size of the new generation (the rest of the heap). So, by
default, the tenured generation occupies 2/3 or about 0.66 of the
heap. A value of 0.6 for spark.memory.fraction keeps storage and
execution memory within the old generation with room to spare. If
spark.memory.fraction is increased to, say, 0.8, then NewRatio may
have to increase to 6 or more.
So by default in OpenJDK this ratio is set to 2, which gives the old generation about 0.66 (two thirds) of the heap; they chose 0.6 to leave a small margin.
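A rough sketch of that relationship, assuming the old generation takes NewRatio / (NewRatio + 1) of the heap as the quoted passage describes. The helper names are hypothetical, and the comparison ignores the 300MB reserve, so treat it as a rule of thumb rather than an exact check:

```python
def old_gen_fraction(new_ratio: int) -> float:
    """Share of the heap occupied by the tenured generation for -XX:NewRatio=new_ratio."""
    return new_ratio / (new_ratio + 1)


def fits_in_old_gen(memory_fraction: float, new_ratio: int) -> bool:
    """True if execution + storage memory stays within the old generation."""
    return memory_fraction <= old_gen_fraction(new_ratio)


print(round(old_gen_fraction(2), 2))  # 0.67
print(fits_in_old_gen(0.6, 2))        # True
print(fits_in_old_gen(0.8, 2))        # False
print(fits_in_old_gen(0.8, 6))        # True
```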
I found that in version 1.6 this was changed to 0.75 and it caused some issues; here is the Jira ticket.
In its description you will find sample code that adds records to the cache just to use up all the memory reserved for execution + storage. With storage + execution set higher than the old generation, the GC overhead was really high, and the same code on the older version (with this setting equal to 0.6) ran about 6 times faster (40-50 sec vs 6 min).
After discussion, the community decided to roll it back to 0.6 in Spark 2.0; here is the PR with the changes.
I think that if you want to increase performance a little, you can try raising it to 0.66, but if you want more memory for execution + storage you also need to adjust your JVM and change the old/new ratio, otherwise you may face performance issues.

Why is Redis Sorted Set using so much memory overhead?

I am designing a Redis datastore with ~3000 sorted set keys, each holding 60-300 items of around 250 bytes each.
used_memory_overhead = 1055498028 bytes and used_memory_dataset = 9681332 bytes. This ratio seems way too high: used_memory_dataset_perc is less than 1%. Memory usage is exceeding the maximum of 1.16G and causing keys to be evicted.
Do sorted sets really have 99% memory overhead? Will I just have to find another solution? I just want a list of values sorted by a field in the value.
Here's the output of INFO memory. used_memory_dataset_perc keeps decreasing until it is below 1%, and eventually the max memory is exceeded:
# Memory
used_memory:399243696
used_memory_human:380.75M
used_memory_rss:493936640
used_memory_rss_human:471.05M
used_memory_peak:1249248448
used_memory_peak_human:1.16G
used_memory_peak_perc:31.96%
used_memory_overhead:390394038
used_memory_startup:4263448
used_memory_dataset:8849658
used_memory_dataset_perc:2.24%
allocator_allocated:399390096
allocator_active:477728768
allocator_resident:499613696
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:1248854016
maxmemory_human:1.16G
maxmemory_policy:volatile-lru
allocator_frag_ratio:1.20
allocator_frag_bytes:78338672
allocator_rss_ratio:1.05
allocator_rss_bytes:21884928
rss_overhead_ratio:0.99
rss_overhead_bytes:-5677056
mem_fragmentation_ratio:1.24
mem_fragmentation_bytes:94804256
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:385555150
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
In case it is relevant, I am using AWS Elasticache
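One way to read the output above is to check how much of used_memory_overhead each reported component accounts for. The sketch below copies a few fields from that output into a string and compares them; the parsing helper is hypothetical, and this is only a reading of the posted numbers, not a confirmed diagnosis:

```python
# A subset of the fields from the INFO output posted above.
info_text = """
used_memory:399243696
used_memory_startup:4263448
used_memory_overhead:390394038
used_memory_dataset:8849658
mem_clients_normal:385555150
"""


def parse_info(text: str) -> dict:
    """Parse 'key:value' lines, skipping blank lines and '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        if ":" in line and not line.startswith("#"):
            key, value = line.split(":", 1)
            fields[key.strip()] = int(value)
    return fields


info = parse_info(info_text)
client_share = info["mem_clients_normal"] / info["used_memory_overhead"]
dataset_perc = info["used_memory_dataset"] / (
    info["used_memory"] - info["used_memory_startup"]
)
print(f"{client_share:.1%}")  # 98.8%
print(f"{dataset_perc:.2%}")  # 2.24%
```

In this snapshot almost all of the reported overhead matches mem_clients_normal, which suggests looking at client buffers before concluding that the sorted sets themselves carry the overhead.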

Has anyone faced Cassandra issue "Maximum memory usage reached"?

I'm using Apache Cassandra 3.0.6 on a 4-node cluster with RF=3, consistency level ONE, and a 16GB heap.
I'm getting this INFO message in system.log:
INFO [SharedPool-Worker-1] 2017-03-14 20:47:14,929 NoSpamLogger.java:91 - Maximum memory usage reached (536870912 bytes), cannot allocate chunk of 1048576 bytes
I don't know exactly which memory it means. I tried increasing file_cache_size_in_mb from 512 to 1024 in cassandra.yaml, but the additional 512MB filled up immediately and the application stopped recording, showing the same INFO message:
INFO [SharedPool-Worker-5] 2017-03-16 06:01:27,034 NoSpamLogger.java:91 - Maximum memory usage reached (1073741824 bytes), cannot allocate chunk of 1048576 bytes
Please suggest a fix if anyone has faced the same issue. Thanks!
Bhargav
As far as I can tell with Cassandra 3.11, no matter how large you set file_cache_size_in_mb, you will still get this message. The cache fills up, and writes this useless message. It happens in my case whether I set it to 2GB or 20GB. This may be a bug in the cache eviction strategy, but I can't tell.
The log message indicates that the node's off-heap cache is full because the node is busy servicing reads.
The 536870912 bytes in the log message is equivalent to 512 MB which is the default file_cache_size_in_mb.
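To make the unit conversion explicit, here is a small sketch that pulls the two byte counts out of the log message and converts them to MB. The regular expression and function name are assumptions written for this message format only:

```python
import re

# Matches the message text quoted in the question.
PATTERN = re.compile(
    r"Maximum memory usage reached \((\d+) bytes\), "
    r"cannot allocate chunk of (\d+) bytes"
)


def parse_limit_mb(log_line: str):
    """Return (limit, chunk) in MB (1 MB = 1024 * 1024 bytes), or None if no match."""
    match = PATTERN.search(log_line)
    if match is None:
        return None
    limit_bytes, chunk_bytes = (int(group) for group in match.groups())
    return limit_bytes / 2**20, chunk_bytes / 2**20


first = "Maximum memory usage reached (536870912 bytes), cannot allocate chunk of 1048576 bytes"
second = "Maximum memory usage reached (1073741824 bytes), cannot allocate chunk of 1048576 bytes"
print(parse_limit_mb(first))   # (512.0, 1.0)
print(parse_limit_mb(second))  # (1024.0, 1.0)
```

The two limits line up with the file_cache_size_in_mb values of 512 and 1024 mentioned in the question.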
It is fine to see occasional occurrences of the message in the logs, which is why it is logged at INFO level. If it gets logged repeatedly, however, it is an indicator that the node is getting overloaded and you should consider increasing the capacity of your cluster by adding more nodes.
For more info, see my post on DBA Stack Exchange -- What does "Maximum memory usage reached" mean in the Cassandra logs?. Cheers!

YCSB low read throughput cassandra

The YCSB Endpoint benchmark would have you believe that Cassandra is the golden child of NoSQL databases. However, when recreating the results on our own boxes (8 cores with hyperthreading, 60 GB memory, two 500 GB SSDs), we are getting dismal read throughput for workload b (read mostly, i.e. 95% read, 5% update).
The cassandra.yaml settings are exactly the same as the Endpoint settings, apart from the different IP addresses and our disk configuration (one SSD for data, one for the commit log). While their throughput is ~38,000 operations per second, ours is ~16,000, (relatively) regardless of the number of threads or client nodes. I.e. one worker node with 256 threads will report ~16,000 ops/sec, while 4 nodes will each report ~4,000 ops/sec.
I've set the readahead value to 8KB for the SSD data drive. I'll put the custom workload file below.
When analyzing disk I/O and CPU usage with iostat, the read throughput is consistently ~200,000 KB/s, which seems to suggest that the YCSB cluster throughput should be higher (records are 100 bytes). ~25-30% of CPU time is spent in %iowait, and 10-25% is in use by the user.
top and nload show no obvious bottleneck (<50% memory usage, and 10-50 Mbits/sec on a 10 Gb/s link).
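To put the iostat figure next to the YCSB figure, here is a back-of-the-envelope sketch. It assumes the ~200,000 KB/s and the ~16,000 ops/sec describe the same set of machines, counts all operations rather than only the 95% reads, and treats 1 KB as 1024 bytes; none of that is confirmed by the post:

```python
def bytes_read_per_op(disk_read_kb_per_s: float, ops_per_s: float) -> float:
    """Average bytes read from disk for each benchmark operation."""
    return disk_read_kb_per_s * 1024 / ops_per_s


record_bytes = 10 * 10  # fieldcount * fieldlength from the workload file
per_op = bytes_read_per_op(200_000, 16_000)
print(per_op)                 # 12800.0
print(per_op / record_bytes)  # 128.0
```

Under those assumptions each operation reads roughly 12.5 KB from disk for a 100-byte record, which is the gap the question is pointing at.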
# The name of the workload class to use
workload=com.yahoo.ycsb.workloads.CoreWorkload
# There is no default setting for recordcount but it is
# required to be set.
# The number of records in the table to be inserted in
# the load phase or the number of records already in the
# table before the run phase.
recordcount=2000000000
# There is no default setting for operationcount but it is
# required to be set.
# The number of operations to use during the run phase.
operationcount=9000000
# The offset of the first insertion
insertstart=0
insertcount=500000000
core_workload_insertion_retry_limit = 10
core_workload_insertion_retry_interval = 1
# The number of fields in a record
fieldcount=10
# The size of each field (in bytes)
fieldlength=10
# Should read all fields
readallfields=true
# Should write all fields on update
writeallfields=false
fieldlengthdistribution=constant
readproportion=0.95
updateproportion=0.05
insertproportion=0
readmodifywriteproportion=0
scanproportion=0
maxscanlength=1000
scanlengthdistribution=uniform
insertorder=hashed
requestdistribution=zipfian
hotspotdatafraction=0.2
hotspotopnfraction=0.8
table=usertable
measurementtype=histogram
histogram.buckets=1000
timeseries.granularity=1000
The key was increasing native_transport_max_threads in the cassandra.yaml file.
Along with the increased settings in the comment (increasing connections in the YCSB client as well as concurrent reads/writes in Cassandra), Cassandra jumped to ~80,000 ops/sec.

Mutation of 17076203 bytes is too large for the maxiumum size of 16777216

I have "commitlog_segment_size_in_mb: 32" in the cassandra settings but the error below indicates maximum size is 16777216, which is about 16mb. Am I looking at the correct setting for fixing the error below?
I am referring to this setting based on the suggestion provided at http://mail-archives.apache.org/mod_mbox/cassandra-user/201406.mbox/%3C53A40144.2020808#gmail.com%3E
I am using 2.1.0-2 for Cassandra.
I am using Kairosdb, and the write buffer max size is 0.5Mb.
WARN [SharedPool-Worker-1] 2014-10-22 17:31:03,163 AbstractTracingAwareExecutorService.java:167 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
java.lang.IllegalArgumentException: Mutation of 17076203 bytes is too large for the maxiumum size of 16777216
at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:216) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:203) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:371) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:351) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:54) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[apache-cassandra-2.1.0.jar:2.1.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_67]
at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:163) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:103) [apache-cassandra-2.1.0.jar:2.1.0]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
You are looking at the correct parameter in your .yaml. The maximum write size C* will allow is half of commitlog_segment_size_in_mb; the default is 32 MB, so the default maximum size is 16 MB.
Background
commitlog_segment_size_in_mb represents your block size for commit log archiving or point-in-time backup. These are only active if you have configured archive_command or restore_command in your commitlog_archiving.properties file.
It's the correct setting. The error means Cassandra will discard this write because it exceeds 50% of the configured commit log segment size.
So set commitlog_segment_size_in_mb: 64 in cassandra.yaml on each node in the cluster, and restart each node for the change to take effect.
Cause:
By design, the maximum allowed mutation size is 50% of the configured commitlog_segment_size_in_mb. This is so Cassandra avoids writing segments with large amounts of empty space.
To elaborate: up to two 32 MB mutations will fit into a 64 MB segment, whereas a 40 MB mutation will fit only once, leaving a larger amount of unused space.
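The 50% rule can be checked against the numbers in the error message with a small sketch (the function name is hypothetical; only the halving rule comes from the answers above):

```python
def max_mutation_bytes(commitlog_segment_size_mb: int) -> int:
    """Largest mutation accepted: half of the commit log segment size, in bytes."""
    return commitlog_segment_size_mb * 1024 * 1024 // 2


mutation_bytes = 17076203  # size reported in the error message

print(max_mutation_bytes(32))                    # 16777216
print(mutation_bytes <= max_mutation_bytes(32))  # False
print(mutation_bytes <= max_mutation_bytes(64))  # True
```

With the default of 32 the limit is exactly the 16777216 bytes shown in the error, and with 64 the same mutation fits.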
Reference link from DataStax:
https://support.datastax.com/hc/en-us/articles/207267063-Mutation-of-x-bytes-is-too-large-for-the-maxiumum-size-of-y-

Resources