i've started working with cassandra. Therefore I’ve download cassandra (1.1.1) to my windows pc and started it. Everything works fine.
Thus I began to reimplement a old application (in java using hector 1.1) which imports about 200.000.000 for 4 tables, which should insertet into 4 columnfamilies. After importing about 2.000.000 records I get an timeout exception and cassandra doesn't response on requests:
2012-07-03 15:35:43,299 WARN - Could not fullfill request on this host CassandraClient<localhost:9160-16>
2012-07-03 15:35:43,300 WARN - Exception: me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
....
Caused by: TimedOutException()
at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20269)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:922)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:908)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
The last entries inside the logfile are:
INFO 15:35:31,678 Writing Memtable-cf2#678837311(7447722/53551072 serialized/live bytes, 262236 ops)
INFO 15:35:32,810 Completed flushing \var\lib\cassandra\data\keySpaceName\cf2\keySpaceName-cf2-hd-205-Data.db (3292685 bytes) for commitlog position ReplayPosition(segmentId=109596147695328, position=131717208)
INFO 15:35:33,282 Compacted to [\var\lib\cassandra\data\keySpaceName\cf3\keySpaceName-cf3-hd-29-Data.db,]. 33.992.615 to 30.224.481 (~88% of original) bytes for 282.032 keys at 1,378099MB/s. Time: 20.916ms.
INFO 15:35:33,286 Compacting [SSTableReader(path='\var\lib\cassandra\data\keySpaceName\cf4\keySpaceName-cf4-hd-8-Data.db'), SSTableReader(path='\var\lib\cassandra\data\keySpaceName\cf4\keySpaceName-cf4-hd-6-Data.db'), SSTableReader(path='\var\lib\cassandra\data\keySpaceName\cf4\keySpaceName-cf4-hd-7-Data.db'), SSTableReader(path='\var\lib\cassandra\data\keySpaceName\cf4\keySpaceName-cf4-hd-5-Data.db')]
INFO 15:35:34,871 Compacted to [\var\lib\cassandra\data\keySpaceName\cf4\keySpaceName-cf4-hd-9-Data.db,]. 4.249.270 to 2.471.543 (~58% of original) bytes for 30.270 keys at 1,489916MB/s. Time: 1.582ms.
INFO 15:35:41,858 Compacted to [\var\lib\cassandra\data\keySpaceName\cf2\keySpaceName-cf2-hd-204-Data.db,]. 48.868.818 to 24.033.164 (~49% of original) bytes for 135.367 keys at 2,019011MB/s. Time: 11.352ms.
I created 4 column families like following:
ColumnFamilyDefinition cf1 = HFactory.createColumnFamilyDefinition(
“keyspacename”,
“cf1”,
ComparatorType.ASCIITYPE);
The column families have following column count:
16 columns
14 columns
7 colmuns
5 columns
The keyspace is created with replication factor 1 and default strategy (simple)
I insert the records (rows) with 'Mutator#AddInsertion'
Any advice avoiding this exception?
Regards
WM
That exception is basically Cassandra saying that it's far enough behind on mutations that it won't complete your requests before they time out. Assuming your PC isn't a beast, you should probably throttle your requests. I suggest sleeping for a while after catching that exception and then retrying; there's no harm in accidentally writing the same row twice, and Cassandra should catch up on write pretty quickly.
If you were in a production environment, I would look more closely at other reasons why the node might be performing poorly.
Related
We have a 6 node each 2 datacenter Cassandra cluster production environment setup. We encounter large partition warning. We ran 2 successful repairs, still this is not getting resolved. How can I analyze and fix this?
BigTableWriter.java:184 - Writing large partition system_distributed/repair_history:rf_key_space:my_table (108140638 bytes)
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 1171896
Mismatch (Blocking): 808
Mismatch (Background): 131
Pool Name Active Pending Completed
Large messages n/a 11 0
Small messages n/a 0 48881938
Gossip messages n/a 0 113659
The system_distributed.repair_history table is not one that you really need to concern yourself with. Unfortunately, this can happen when a lot of repairs have been run. With 2.2, the only real solution is to TRUNCATE that table every now and then.
I'm trying to add new node to our cluster (cassandra 2.1.11, 16 nodes, 32Gb ram, 2x3Tb hdd, 8core cpu, 1 datacenter, 2 racks, about 700Gb of data on each node). After start of new node, data (approx 600Gb total) from 16 existing nodes successfully transfered to new node and building of secondary indexes starts. The process of secondary indexes building looks normal, i see info about successfull completition of some secondary indexes building and some stream tasks:
INFO [StreamReceiveTask:9] 2015-11-22 02:15:23,153 StreamResultFuture.java:180 - [Stream #856adc90-8ddd-11e5-a4be-69bddd44a709] Session with /192.168.21.66 is complete
INFO [StreamReceiveTask:9] 2015-11-22 02:15:23,152 SecondaryIndexManager.java:174 - Index build of [docs.docs_ex_pl_ph_idx, docs.docs_lo_pl_ph_idx, docs.docs_author_login_idx, docs.docs_author_extid_idx, docs.docs_url_idx] complete
Curently 9 out of 16 streams successfully finished, according to logs. Everything looks fine, except one issue: this process already lasts 5 full days. There is no errors in logs, no anything suspicious, except extremely slow progress.
nodetool compactionstats -H
shows
Secondary index build ... docs 882,4 MB 1,69 GB bytes 51,14%
So there is some process of index building and it has some progress, but very slow, 1% in half a hour or so.
The only significant difference between the new node and any of existing nodes is the fact that cassandra java process has 21k open files, in contrast of 300 open files on any existing node, and 80k files in the data dir on new node in contrast of 300-500 files in the data dir on any existing node.
Is it normal? At this speed it looks i'll spend 16 weeks or so to add 16 more nodes.
I know this is an old question, but we ran into this exact issue with 2.1.13 using DTCS. We were able to fix it in our test environment by increasing memtable flush thresholds to 0.7 - which didn't make any sense to us, but may be worth trying.
I'm using datastax cassandra 2.1 driver and performing read/write operations at the rate of ~8000 IOPS. I've used pooling options to configure my session and am using separate session for read and write each of which connect to a different node in the cluster as contact point.
This works fine for say 5 mins but after that I get a lot of exceptions like :
Failed with: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.0.1.123:9042 (com.datastax.driver.core.TransportException: [/10.0.1.123:9042] Connection has been closed), /10.0.1.56:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)))
Can anyone help me out here on what could be the problem?
The exception asks me to increase number of connections per host but how high a value can I set for this parameter ?
Also I'm not able to set CoreConnectionsPerHost beyond 2 as it throws me exception saying 2 is the max.
This is how I'm creating each read / write session.
PoolingOptions poolingOpts = new PoolingOptions();
poolingOpts.setCoreConnectionsPerHost(HostDistance.REMOTE, 2);
poolingOpts.setMaxConnectionsPerHost(HostDistance.REMOTE, 200);
poolingOpts.setMaxSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE, 128);
poolingOpts.setMinSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE, 2);
cluster = Cluster
.builder()
.withPoolingOptions( poolingOpts )
.addContactPoint(ip)
.withRetryPolicy( DowngradingConsistencyRetryPolicy.INSTANCE )
.withReconnectionPolicy( new ConstantReconnectionPolicy( 100L ) ).build();
Session s = cluster.connect(keySpace);
Your problem might not actually be in your code or the way you are connecting. If you say the problem is happening after a few minutes then it could simply be that your cluster is becoming overloaded trying to process the ingestion of data and cannot keep up. The typical sign of this is when you start seeing JVM garbage collection "GC" messages in the cassandra system.log file, too many small ones batched together of large ones on their own can mean that incoming clients are not responded to causing this kind of scenario. Verify that you do not have too many of these event showing up in your logs first before you start to look at your code. Here's a good example of a large GC event:
INFO [ScheduledTasks:1] 2014-05-15 23:19:49,678 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 2896 ms for 2 collections, 310563800 used; max is 8375238656
When connecting to a cluster there are some recommendations, one of which is only have one Cluster object per real cluster. As per the article I've linked below (apologies if you already studied this):
Use one cluster instance per (physical) cluster (per application lifetime)
Use at most one session instance per keyspace, or use a single Session and explicitly specify the keyspace in your queries
If you execute a statement more than once, consider using a prepared statement
You can reduce the number of network roundtrips and also have atomic operations by using batches
http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/fourSimpleRules.html
As you are doing a high number of reads I'd most definitely recommend using setFetchSize also if its applicable to your code
http://www.datastax.com/documentation/developer/java-driver/2.1/common/drivers/reference/cqlStatements.html
http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/reference/queryBuilderOverview.html
For reference heres the connection options in case you find it useful
http://www.datastax.com/documentation/developer/java-driver/2.1/common/drivers/reference/connectionsOptions_c.html
Hope this helps.
I'm trying to simultaneously add 4 nodes to my current 2-node DC. I have Vnodes turned off as per Datastax suggestion. Right after the major index build in each node, the following warning is printed several times in the logs:
WARN [SolrSecondaryIndex ks.cf index initializer.] 2014-06-20
09:39:59,904 CassandraUtil.java (line 108) Error Operation timed out -
received only 3 responses. on attempt 1 out of 4 with CL QUORUM...
I understand what it means. But why is Cassandra expecting the nodes to fulfill the CL when these nodes are still bootstrapping? More importantly, how does the warning affect the bootstrap? I noticed that the nodes are not doing any index build or streaming anymore; but they also remained in "Active - Joining" state. Is there any chance that they will finish? What should I do?
I'm using DSE 4.0.3. All existing and new nodes in the DC are Search nodes. I pre-computed the tokens using the python program for MurMur3Partitioner.
EDIT:
Although nodetool compactionstats does not show any on-going index build in the nodes, for some reason, I still see a lot of these lines in the logs:
INFO [IndexPool backpressure thread-0] 2014-06-20 12:30:31,346 IndexPool.java (line 472) Throttling at 26 index requests per second with target total queue size at 40
INFO [IndexPool backpressure thread-0] 2014-06-20 12:30:34,169 IndexPool.java (line 428) Back pressure is active with total index queue size 18586 and average processing time 2770
EDIT:
Interestingly, I found the following lines in each node after digging through the log files:
INFO [main] 2014-06-20 09:39:48,588 StorageService.java (line 1036) Bootstrap completed! for the tokens [node token]
INFO [SolrSecondaryIndex ks.cf index initializer.] 2014-06-20 11:32:07,833 AbstractSolrSecondaryIndex.java (line 411) Reindexing 1417116631 commit log updates for core ks.cf
Based from these lines, I feel a lot safer that the bootstrap actually completed and that the nodes are simply re-indexing their data. I don't know, though, why the re-indexing process is not being shown in nodetool compactionstats.
It appears the bootstrap completed, and the DSE Search system is running normally.
why the re-indexing process is not being shown in nodetool compactionstat
DSE Search is not generally exposed via Cassandra command line tools. The log output should show the indexing as having completed, were you able to verify that?
Firstly forgive me for what might be a very naive question.
I am on a mission to identify the right nosql database for my project.
I was inserting and updating records in the table (column family) in highly concurrent fashion.
Then i encountered this.
INFO 11:55:20,924 Writing Memtable-scan_request#314832703(496750/1048576 serialized/live bytes, 8204 ops)
INFO 11:55:21,084 Completed flushing /var/lib/cassandra/data/mykey/scan_request/mykey-scan_request-ic-14-Data.db (115527 bytes) for commitlog position ReplayPosition(segmentId=1372313109304, position=24665321)
INFO 11:55:21,085 Writing Memtable-scan_request#721424982(1300975/2097152 serialized/live bytes, 21494 ops)
INFO 11:55:21,191 Completed flushing /var/lib/cassandra/data/mykey/scan_request/mykey-scan_request-ic-15-Data.db (304269 bytes) for commitlog position ReplayPosition(segmentId=1372313109304, position=26554523)
WARN 11:55:21,268 Heap is 0.829968311377531 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
WARN 11:55:21,268 Flushing CFS(Keyspace='mykey', ColumnFamily='scan_request') to relieve memory pressure
INFO 11:55:25,451 Enqueuing flush of Memtable-scan_request#714386902(324895/843149 serialized/live bytes, 5362 ops)
INFO 11:55:25,452 Writing Memtable-scan_request#714386902(324895/843149 serialized/live bytes, 5362 ops)
INFO 11:55:25,490 Completed flushing /var/lib/cassandra/data/mykey/scan_request/mykey-scan_request-ic-16-Data.db (76213 bytes) for commitlog position ReplayPosition(segmentId=1372313109304, position=27025950)
WARN 11:55:30,109 Heap is 0.9017950505664833 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid8849.hprof ...
Heap dump file created [1359702396 bytes in 105.277 secs]
WARN 12:25:26,656 Flushing CFS(Keyspace='mykey', ColumnFamily='scan_request') to relieve memory pressure
INFO 12:25:26,657 Enqueuing flush of Memtable-scan_request#728952244(419985/1048576 serialized/live bytes, 6934 ops)
Its to be noticed that i was able to insert & update around 6 million records before i got this. I am using cassandra on a single node. In-spite of the hint in the logs, i am not able to decide what config to change. I did check into the bin/cassandra shell script and i see they have done lots of manipulation before they came up with the -Xms & -Xmx values.
Kindly advice.
First, you can run
ps -ef|grep cassandra
to see what -Xmx is set to in your Cassandra. The default values of -Xms and -Xmx are based on the amount of your system's memory.
Check this for details:
http://www.datastax.com/documentation/cassandra/1.2/index.html?pagename=docs&version=1.2&file=index#cassandra/operations/ops_tune_jvm_c.html
You can try to increase MAX_HEAP_SIZE (in conf/cassandra-env.sh) to see if the problem would go away.
For example, you can replace
MAX_HEAP_SIZE="${max_heap_size_in_mb}M"
with
MAX_HEAP_SIZE="2048M"
I assume that tuning the Garbage Collector for Cassandra might solve the OOM error. Cassandra use Concurrent mark-and-sweep(CMS) JVM implementation for Garbage Collector when we use default settings.Most oftenly the CMS garbage collector would only start after the heap is almost fully populated. But the CMS process itself takes some time to finish and the problem is that the JVM runs out of space before the CMS process finished.We can set the percentage of used old generation space which triggers the CMS with following options in bin/cassandra.in.sh file under JAVA_OPTS variable
-XX:CMSInitiatingOccupancyFraction={percentage} - This controls the percentage of the old generation when the CMS is triggered and we can set this bit lower value to hold until CMS process finished.
-XX:+UseCMSInitiatingOccupancyOnly - This parameter ensures the percentage is kept fixed
Also with the following options we can achieve incremental CMS
-XX:+UseConcMarkSweepGC \
-XX:+CMSIncrementalMode \
-XX:+CMSIncrementalPacing \
-XX:CMSIncrementalDutyCycleMin=0 \
-XX:+CMSIncrementalDutyCycle=10
We can increase the parallel CMS threads considering the number of cores of the CPU
-XX:ParallelCMSThreads={numberOfTreads}
Further we can tune the garbage collection for young generation for make the process optimum. Here we have to control the amount of re-used objects
Increasing the size of the young generation
Delay the young generation objects promotion for old generation
To achieve this we can set following parameters
-XX:NewSize={size} - Determine the size of the young generation
-XX:NewMaxSize={size} - This is the maximum size of the young generation
-Xmn{size} - Fix the maximum size
-XX:NewRatio={n} - Set the ratio of young generation to old generation
Before the objects are migrate to the old generation from young generation they are put in to phase called "young survior". So we can controll the migration of the objects to old generation using following parameters
-XX:SurvivorRatio={n} - ration of "young eden" to "young survivors"
-XX:MaxTenuringThreshold={age} Number of objects to be migrated to old generation