I want to add or remove nodes in the cluster.
When I try to add or remove a node, I get LEAK DETECTED and STREAM FAILED error messages.
If I drop the indexes (pushcapabilityindx, presetsearchval) before adding/removing nodes, the node add/remove succeeds.
If there are no updates, the data in abc_db.sub is automatically deleted after 24 hours (TTL 86400 seconds).
Also, even if I do nothing at all (including not dropping the indexes), adding/removing nodes succeeds normally after 14 days.
Where should I start troubleshooting this error?
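For reference, the workaround I currently use is to drop the two indexes and recreate them after the node operation (a sketch using cqlsh; the index definitions match the schema below):
cqlsh -e "DROP INDEX abc_db.pushcapabilityindx;"
cqlsh -e "DROP INDEX abc_db.presetsearchval;"
# after the add/remove completes, recreate them:
cqlsh -e "CREATE INDEX pushcapabilityindx ON abc_db.sub (values(pushcapability));"
cqlsh -e "CREATE INDEX presetsearchval ON abc_db.sub (presetsearchvalue);"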
Ubuntu 16.04.3 LTS
[cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
CREATE TABLE abc_db.sub (
phoneno text,
deviceid text,
subid text,
callbackdata text,
corelator text,
duration int,
local_index bigint,
phase int,
presetsearchvalue text,
pushcapability list<text>,
pushtoken text,
pushtype text,
searchcriteria frozen<typesearchcriteria>,
PRIMARY KEY (phoneno, deviceid, subid)
) WITH CLUSTERING ORDER BY (deviceid ASC, subid ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE INDEX pushcapabilityindx ON abc_db.sub (values(pushcapability));
CREATE INDEX presetsearchval ON abc_db.sub (presetsearchvalue);
INFO [main] 2019-05-09 07:57:15,741 StorageService.java:1435 - JOINING: waiting for schema information to complete
INFO [main] 2019-05-09 07:57:16,497 StorageService.java:1435 - JOINING: schema complete, ready to bootstrap
INFO [main] 2019-05-09 07:57:16,497 StorageService.java:1435 - JOINING: waiting for pending range calculation
INFO [main] 2019-05-09 07:57:16,497 StorageService.java:1435 - JOINING: calculation complete, ready to bootstrap
INFO [main] 2019-05-09 07:57:16,498 StorageService.java:1435 - JOINING: getting bootstrap token
INFO [main] 2019-05-09 07:57:16,531 StorageService.java:1435 - JOINING: sleeping 30000 ms for pending range setup
INFO [main] 2019-05-09 07:57:46,532 StorageService.java:1435 - JOINING: Starting to bootstrap...
INFO [main] 2019-05-09 07:57:47,775 StreamResultFuture.java:90 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Executing streaming plan for Bootstrap
INFO [StreamConnectionEstablisher:1] 2019-05-09 07:57:47,783 StreamSession.java:266 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Starting streaming to /172.50.20.10
INFO [StreamConnectionEstablisher:1] 2019-05-09 07:57:47,786 StreamCoordinator.java:264 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488, ID#0] Beginning stream session with /172.50.20.10
INFO [STREAM-IN-/172.50.20.10:5000] 2019-05-09 07:57:48,887 StreamResultFuture.java:173 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488 ID#0] Prepare completed. Receiving 261 files(1012.328MiB), sending 0 files(0.000KiB)
INFO [StreamConnectionEstablisher:2] 2019-05-09 07:57:48,891 StreamSession.java:266 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Starting streaming to /172.50.22.10
INFO [StreamConnectionEstablisher:2] 2019-05-09 07:57:48,893 StreamCoordinator.java:264 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488, ID#0] Beginning stream session with /172.50.22.10
INFO [STREAM-IN-/172.50.22.10:5000] 2019-05-09 07:57:50,020 StreamResultFuture.java:173 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488 ID#0] Prepare completed. Receiving 254 files(1.286GiB), sending 0 files(0.000KiB)
INFO [StreamConnectionEstablisher:3] 2019-05-09 07:57:50,022 StreamSession.java:266 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Starting streaming to /172.50.21.10
INFO [StreamConnectionEstablisher:3] 2019-05-09 07:57:50,025 StreamCoordinator.java:264 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488, ID#0] Beginning stream session with /172.50.21.10
INFO [STREAM-IN-/172.50.21.10:5000] 2019-05-09 07:57:50,998 StreamResultFuture.java:173 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488 ID#0] Prepare completed. Receiving 114 files(1.085GiB), sending 0 files(0.000KiB)
INFO [StreamReceiveTask:1] 2019-05-09 07:58:02,509 SecondaryIndexManager.java:365 - Submitting index build of pushcapabilityindx,presetsearchval for data in BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-1-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-2-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-3-big-Data.db')
INFO [StreamReceiveTask:1] 2019-05-09 07:58:02,519 SecondaryIndexManager.java:385 - Index build of pushcapabilityindx,presetsearchval complete
INFO [StreamReceiveTask:1] 2019-05-09 07:58:11,213 SecondaryIndexManager.java:365 - Submitting index build of pushcapabilityindx,presetsearchval for data in BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-4-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-5-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-6-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-7-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-8-big-Data.db'),BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-9-big-Data.db')
ERROR [StreamReceiveTask:1] 2019-05-09 07:58:11,295 StreamSession.java:593 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Streaming error occurred on session with peer 172.50.22.10
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.NoSuchElementException
at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.index.SecondaryIndexManager.buildIndexesBlocking(SecondaryIndexManager.java:382) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.index.SecondaryIndexManager.buildAllIndexesBlocking(SecondaryIndexManager.java:269) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:215) ~[apache-cassandra-3.10.jar:3.10]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_191]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191]
Caused by: java.util.concurrent.ExecutionException: java.util.NoSuchElementException
at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.8.0_191]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) [na:1.8.0_191]
at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:386) ~[apache-cassandra-3.10.jar:3.10]
... 9 common frames omitted
Caused by: java.util.NoSuchElementException: null
at org.apache.cassandra.utils.AbstractIterator.next(AbstractIterator.java:64) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.index.SecondaryIndexManager.lambda$indexPartition$20(SecondaryIndexManager.java:618) ~[apache-cassandra-3.10.jar:3.10]
at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_191]
at org.apache.cassandra.index.SecondaryIndexManager.indexPartition(SecondaryIndexManager.java:618) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.index.internal.CollatedViewIndexBuilder.build(CollatedViewIndexBuilder.java:71) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.db.compaction.CompactionManager$14.run(CompactionManager.java:1587) ~[apache-cassandra-3.10.jar:3.10]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_191]
... 6 common frames omitted
ERROR [STREAM-IN-/172.50.22.10:5000] 2019-05-09 07:58:11,305 StreamSession.java:593 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Streaming error occurred on session with peer 172.50.22.10
java.lang.RuntimeException: Outgoing stream handler has been closed
at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:143) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:655) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:523) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:317) ~[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
INFO [StreamReceiveTask:1] 2019-05-09 07:58:11,310 StreamResultFuture.java:187 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Session with /172.50.22.10 is complete
ERROR [Reference-Reaper:1] 2019-05-09 07:58:19,115 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@441b19c1) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1255997234:[[OffHeapBitSet]] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2019-05-09 07:58:19,115 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@4ee372b1) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@199357705:Memory@[7f12a4ad4970..7f12a4ad4a10) was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2019-05-09 07:58:19,115 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@172830a5) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@983595037:Memory@[7f12a4b3f670..7f12a4b3f990) was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2019-05-09 07:58:19,116 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6f83e302) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1808665158:Memory@[7f12a41a56d0..7f12a41a56d4) was not released before the reference was garbage collected
INFO [StreamReceiveTask:1] 2019-05-09 07:58:40,657 SecondaryIndexManager.java:365 - Submitting index build of groupchatid_idx_giinfo for data in BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-1-big-Data.db'),BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-2-big-Data.db')
INFO [StreamReceiveTask:1] 2019-05-09 07:58:40,714 SecondaryIndexManager.java:385 - Index build of groupchatid_idx_giinfo complete
INFO [StreamReceiveTask:1] 2019-05-09 07:58:41,494 SecondaryIndexManager.java:365 - Submitting index build of groupchatid_idx_giinfo for data in BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-3-big-Data.db'),BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-4-big-Data.db'),BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-5-big-Data.db'),BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-6-big-Data.db'),BigTableReader(path='/cassandra/data/srib_storeserver_objectstore_db/groupinfoobjects-28134170627f11e9b62cc50700a5eaee/mc-7-big-Data.db')
INFO [StreamReceiveTask:1] 2019-05-09 07:58:41,537 SecondaryIndexManager.java:385 - Index build of groupchatid_idx_giinfo complete
INFO [StreamReceiveTask:1] 2019-05-09 07:58:43,175 SecondaryIndexManager.java:365 - Submitting index build of pushcapabilityindx,presetsearchval for data in BigTableReader(path='/cassandra/data/abc_db/sub-28f738d0627f11e9b62cc50700a5eaee/mc-11-big-Data.db')
INFO [StreamReceiveTask:1] 2019-05-09 07:58:43,209 SecondaryIndexManager.java:385 - Index build of pushcapabilityindx,presetsearchval complete
INFO [StreamReceiveTask:1] 2019-05-09 07:58:43,972 StreamResultFuture.java:187 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Session with /172.50.20.10 is complete
INFO [StreamReceiveTask:1] 2019-05-09 07:58:45,643 StreamResultFuture.java:187 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Session with /172.50.21.10 is complete
WARN [StreamReceiveTask:1] 2019-05-09 07:58:45,664 StreamResultFuture.java:214 - [Stream #21f71b70-7230-11e9-9495-ab5d6d9b9488] Stream failed
WARN [StreamReceiveTask:1] 2019-05-09 07:58:45,665 StorageService.java:1497 - Error during bootstrap.
org.apache.cassandra.streaming.StreamException: Stream failed
at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) ~[apache-cassandra-3.10.jar:3.10]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) [guava-18.0.jar:na]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) [guava-18.0.jar:na]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) [guava-18.0.jar:na]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) [guava-18.0.jar:na]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) [guava-18.0.jar:na]
at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:481) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.maybeCompleted(StreamSession.java:766) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.taskCompleted(StreamSession.java:727) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:244) [apache-cassandra-3.10.jar:3.10]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_191]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191]
ERROR [main] 2019-05-09 07:58:45,665 StorageService.java:1507 - Error while waiting on bootstrap to complete. Bootstrap will have to be restarted.
java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-18.0.jar:na]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-18.0.jar:na]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-18.0.jar:na]
at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1502) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:962) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:681) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:394) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601) [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:735) [apache-cassandra-3.10.jar:3.10]
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) ~[apache-cassandra-3.10.jar:3.10]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) ~[guava-18.0.jar:na]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) ~[guava-18.0.jar:na]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-18.0.jar:na]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-18.0.jar:na]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-18.0.jar:na]
at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:215) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:481) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.maybeCompleted(StreamSession.java:766) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamSession.taskCompleted(StreamSession.java:727) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:244) ~[apache-cassandra-3.10.jar:3.10]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_191]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) ~[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191]
WARN [main] 2019-05-09 07:58:45,676 StorageService.java:1013 - Some data streaming failed. Use nodetool to check bootstrap state and resume. For more, see `nodetool help bootstrap`. IN_PROGRESS
INFO [main] 2019-05-09 07:58:45,677 CassandraDaemon.java:694 - Waiting for gossip to settle before accepting client requests...
INFO [main] 2019-05-09 07:58:53,678 CassandraDaemon.java:725 - No gossip backlog; proceeding
INFO [main] 2019-05-09 07:58:53,737 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO [main] 2019-05-09 07:58:53,781 Server.java:155 - Using Netty Version: [netty-buffer=netty-buffer-4.0.39.Final.38bdf86, netty-codec=netty-codec-4.0.39.Final.38bdf86, netty-codec-haproxy=netty-codec-haproxy-4.0.39.Final.38bdf86, netty-codec-http=netty-codec-http-4.0.39.Final.38bdf86, netty-codec-socks=netty-codec-socks-4.0.39.Final.38bdf86, netty-common=netty-common-4.0.39.Final.38bdf86, netty-handler=netty-handler-4.0.39.Final.38bdf86, netty-tcnative=netty-tcnative-1.1.33.Fork19.fe4816e, netty-transport=netty-transport-4.0.39.Final.38bdf86, netty-transport-native-epoll=netty-transport-native-epoll-4.0.39.Final.38bdf86, netty-transport-rxtx=netty-transport-rxtx-4.0.39.Final.38bdf86, netty-transport-sctp=netty-transport-sctp-4.0.39.Final.38bdf86, netty-transport-udt=netty-transport-udt-4.0.39.Final.38bdf86]
INFO [main] 2019-05-09 07:58:53,781 Server.java:156 - Starting listening for CQL clients on /172.50.20.11:7042 (unencrypted)...
INFO [main] 2019-05-09 07:58:53,809 CassandraDaemon.java:528 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it
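When the bootstrap fails like this, the node can be inspected and the streaming retried as the WARN line above suggests (a sketch of the commands, run on the joining node):
nodetool netstats          # shows the current bootstrap/streaming state
nodetool bootstrap resume  # retries the failed bootstrap streaming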
I solved this issue myself. Here are the results of my tests:
No. | Linux OS                       | Package | Version | Result
1   | Ubuntu 16.04.3 LTS             | deb     | 3.10    | Stream Failed
2   | Ubuntu 16.04.3 LTS             | deb     | 3.11.4  | Stream Failed
3   | Ubuntu 16.04.3 LTS             | tgz     | 3.11.4  | Successful
4   | Ubuntu 18.04.2 LTS             | deb     | 3.10    | Stream Failed
5   | Ubuntu 18.04.2 LTS             | deb     | 3.11.4  | Stream Failed
6   | Debian GNU/Linux 9.9 (stretch) | deb     | 3.10    | Stream Failed
7   | Debian GNU/Linux 9.9 (stretch) | deb     | 3.11.4  | Stream Failed
8   | Amazon Linux 2                 | rpm     | 3.11.4  | Successful
9   | CentOS 7.6                     | rpm     | 3.11.4  | Successful
In the end, I am going to install from the binary tarball.
I am not a developer, so I cannot find the root cause.
I hope someone finds the cause and fixes it in the Ubuntu and Debian packages.
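For anyone taking the same route, the tarball install is roughly as follows (the mirror URL and paths are assumptions; adjust to your environment):
wget https://archive.apache.org/dist/cassandra/3.11.4/apache-cassandra-3.11.4-bin.tar.gz
tar xzf apache-cassandra-3.11.4-bin.tar.gz -C /opt
cd /opt/apache-cassandra-3.11.4
# carry over your existing cassandra.yaml settings, then:
bin/cassandra    # add -f to run in the foreground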
Regards.
Sungjae Yun.
I'm having trouble with Spark. I have a Spark standalone cluster with 2 nodes:
master: 121.*.*.22(hostname is iZ28i1niuigZ)
worker: 123.*.*.125(hostname is VM-120-50-ubuntu).
I have edited the slaves file and added 123.*.*.125.
There is no worker info on the WebUI:
[Image: WebUI of the Spark master]
When executing the start script, I see the following messages:
spark#iZ28i1niuigZ:~/spark-2.0.1-bin-hadoop2.7$ sh sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/spark/spark-2.0.1-bin-hadoop2.7/logs/spark-spark-org.apache.spark.deploy.master.Master-1-iZ28i1niuigZ.out
123.*.*.125: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/spark-2.0.1-bin-hadoop2.7/logs/spark-spark-org.apache.spark.deploy.worker.Worker-1-VM-120-50-ubuntu.out
The spark-env.sh file contents are:
export SPARK_MASTER_IP=121.*.*.22
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=1
export SPARK_WORDER_INSTANCES=1
export SPARK_WORKER_MEMORY=1g
export JAVA_HOME=/home/spark/jdk1.8.0_101
On the worker I can see the following output:
Spark Command: /home/spark/jdk1.8.0_101/bin/java -cp /home/spark/spark-2.0.1-bin-hadoop2.7/conf/:/home/spark/spark-2.0.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://iZ28i1niuigZ:7077
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/11/30 20:04:56 INFO Worker: Started daemon with process name: 28287@VM-120-50-ubuntu
16/11/30 20:04:56 INFO SignalUtils: Registered signal handler for TERM
16/11/30 20:04:56 INFO SignalUtils: Registered signal handler for HUP
16/11/30 20:04:56 INFO SignalUtils: Registered signal handler for INT
16/11/30 20:04:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/30 20:04:56 INFO SecurityManager: Changing view acls to: spark
16/11/30 20:04:56 INFO SecurityManager: Changing modify acls to: spark
16/11/30 20:04:56 INFO SecurityManager: Changing view acls groups to:
16/11/30 20:04:56 INFO SecurityManager: Changing modify acls groups to:
16/11/30 20:04:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); groups with view permissions: Set(); users with modify permissions: Set(spark); groups with modify permissions: Set()
16/11/30 20:04:57 INFO Utils: Successfully started service 'sparkWorker' on port 41544.
16/11/30 20:04:57 INFO Worker: Starting Spark worker 10.141.120.50:41544 with 1 cores, 1024.0 MB RAM
16/11/30 20:04:57 INFO Worker: Running Spark version 2.0.1
16/11/30 20:04:57 INFO Worker: Spark home: /home/spark/spark-2.0.1-bin-hadoop2.7
16/11/30 20:04:57 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
16/11/30 20:04:57 INFO WorkerWebUI: Bound WorkerWebUI to 0.0.0.0, and started at http://10.141.120.50:8081
16/11/30 20:04:57 INFO Worker: Connecting to master iZ28i1niuigZ:7077...
16/11/30 20:04:58 WARN Worker: Failed to connect to master iZ28i1niuigZ:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96)
at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to connect to iZ28i1niuigZ/121.*.*.22:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
... 4 more
Caused by: java.net.ConnectException: Connection refused: iZ28i1niuigZ/121.*.*.22:7077
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
16/11/30 20:05:08 INFO Worker: Retrying connection to master (attempt # 1)
16/11/30 20:05:08 INFO Worker: Connecting to master iZ28i1niuigZ:7077...
16/11/30 20:05:08 WARN Worker: Failed to connect to master iZ28i1niuigZ:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96)
at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to connect to iZ28i1niuigZ/121.*.*.22:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
... 4 more
Caused by: java.net.ConnectException: Connection refused: iZ28i1niuigZ/121.*.*.22:7077
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
The /etc/hosts on the master node looks like:
127.0.0.1 localhost
127.0.1.1 localhost.localdomain localhost
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.251.33.226 iZ28i1niuigZ
123.*.*.125 VM-120-50-ubuntu
And the /etc/hosts on the worker node contains the following:
10.141.120.50 VM-120-50-ubuntu
127.0.0.1 localhost localhost.localdomain
121.*.*.22 iZ28i1niuigZ
I cannot understand why the worker is unable to connect to the master.
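A basic sanity check from the worker, to see whether the master host resolves correctly and the port is reachable (a diagnostic sketch):
getent hosts iZ28i1niuigZ   # should print 121.*.*.22 per the worker's /etc/hosts
telnet iZ28i1niuigZ 7077    # tests whether the master port is reachable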
========================================================================
Update:
I cannot telnet to 123.*.*.125 7077, while I can telnet to 123.*.*.125.
When I execute the command iptables -L -n, I see the following:
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
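Since all the firewall chain policies are ACCEPT, another thing worth checking is which address the master actually bound port 7077 to (a diagnostic sketch; run on the master):
netstat -tlnp | grep 7077   # or: ss -tlnp | grep 7077
If it shows 10.251.33.226:7077 (the internal address that iZ28i1niuigZ resolves to in the master's /etc/hosts) rather than 0.0.0.0:7077, connections to the public address 121.*.*.22 will be refused.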
I am trying to stream some SSTables to a Cassandra cluster using the sstableloader utility, and I am getting a streaming error. Here is the stack trace.
Established connection to initial hosts
Opening sstables and calculating sections to stream
18:05:04.058 [main] DEBUG o.a.c.i.s.m.MetadataSerializer - Load metadata for /path/new/xyz/search/xyz-search-ka-1
18:05:04.073 [main] INFO o.a.c.io.sstable.SSTableReader - Opening /path/new/xyz/new/xyz_search/search/xyz_search-search-ka-1 (330768 bytes)
Streaming relevant part of /path/new/xyz/xyz_search/search/xyz_search-search-ka-1-Data.db to [/10.XXX.XXX.XXX, /10.XXX.XXX.XXX, /10.XXX.XXX.XXX, /10.XXX.XXX.XXX, /10.XXX.XXX.XXX]
18:05:04.411 [main] INFO o.a.c.streaming.StreamResultFuture - [Stream #ed3a0cd0-fd25-11e5-8509-63e9961cf787] Executing streaming plan for Bulk Load
Streaming relevant part of /path/xyz-search-ka-1-Data.db to [/10.XXX.XXX.XXX, /10.XXX.XXX.XXX, /10.XXX.XXX.XXX, /10.XXX.XXX.XXX, /10.XXX.XXX.XXX]
17:22:44.175 [main] INFO o.a.c.streaming.StreamResultFuture - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Executing streaming plan for Bulk Load
17:22:44.177 [StreamConnectionEstablisher:1] INFO o.a.c.streaming.StreamSession - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Starting streaming to /10.XX.XX.XX
17:22:44.177 [StreamConnectionEstablisher:1] DEBUG o.a.c.streaming.ConnectionHandler - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Sending stream init for incoming stream
17:22:44.183 [StreamConnectionEstablisher:2] INFO o.a.c.streaming.StreamSession - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Starting streaming to /10.XX.XX.XX
17:22:44.183 [StreamConnectionEstablisher:2] DEBUG o.a.c.streaming.ConnectionHandler - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Sending stream init for incoming stream
17:23:47.191 [StreamConnectionEstablisher:2] ERROR o.a.c.streaming.StreamSession - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Streaming error occurred
java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_45]
at sun.nio.ch.Net.connect(Net.java:458) ~[na:1.8.0_45]
at sun.nio.ch.Net.connect(Net.java:450) ~[na:1.8.0_45]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) ~[na:1.8.0_45]
at java.nio.channels.SocketChannel.open(SocketChannel.java:189) ~[na:1.8.0_45]
at org.apache.cassandra.tools.BulkLoadConnectionFactory.createConnection(BulkLoadConnectionFactory.java:62) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.StreamSession.createConnection(StreamSession.java:236) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:79) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:223) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:208) [cassandra-all-2.1.6.jar:2.1.6]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
17:23:47.202 [StreamConnectionEstablisher:2] DEBUG o.a.c.streaming.ConnectionHandler - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Closing stream connection handler on /10.XXX.XXX.XXX
17:23:47.205 [StreamConnectionEstablisher:1] ERROR o.a.c.streaming.StreamSession - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Streaming error occurred
java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_45]
at sun.nio.ch.Net.connect(Net.java:458) ~[na:1.8.0_45]
at sun.nio.ch.Net.connect(Net.java:450) ~[na:1.8.0_45]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) ~[na:1.8.0_45]
at java.nio.channels.SocketChannel.open(SocketChannel.java:189) ~[na:1.8.0_45]
at org.apache.cassandra.tools.BulkLoadConnectionFactory.createConnection(BulkLoadConnectionFactory.java:62) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.StreamSession.createConnection(StreamSession.java:236) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:79) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:223) ~[cassandra-all-2.1.6.jar:2.1.6]
at org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:208) [cassandra-all-2.1.6.jar:2.1.6]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
Also, the machine where I am running sstableloader is outside the Cassandra cluster.
Thanks
After debugging a little more, I found that sstableloader also uses port 7000 while streaming SSTables to the Cassandra cluster. My local machine did not have access to port 7000 on the cluster machines, which is why I was getting the connection-timed-out exception.
If you run into this, make sure that the machine you are running sstableloader from has access to ports 9160, 7000, and 9042 on all the Cassandra nodes you are trying to stream to.
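A quick way to verify that from the loader machine (a sketch; <node_ip> is a placeholder for each node's address):
for port in 7000 9042 9160; do nc -zv -w 5 <node_ip> $port; done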
DEBUG o.a.c.streaming.ConnectionHandler - [Stream #0327a9e0-fd20-11e5-b350-63e9961cf787] Closing stream connection handler on /10.XXX.XXX.XXX
Hint: I suspect the machine 10.xxx.xxx.xxx is under heavy load. It is worth checking /var/log/cassandra/system.log on that machine to narrow down the root cause.
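A quick way to do that on the suspect node (a sketch):
grep -E "ERROR|GCInspector" /var/log/cassandra/system.log | tail -n 50   # recent errors and long GC pauses
nodetool tpstats   # blocked or pending thread-pool stages point to overload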
Here are my versions:
Hive: 1.2
Hadoop: CDH5.3
Spark: 1.4.1
Hive on Spark works for me with the Hive client, but after I started HiveServer2 and tried a SQL statement using Beeline, it failed.
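The Beeline connection was along these lines (the host and port here are assumptions for illustration; 10000 is the default HiveServer2 port):
beeline -u jdbc:hive2://hd-master-001:10000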
The error is:
2015-11-29 21:49:42,786 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:42 INFO spark.SparkContext: Added JAR file:/root/cdh/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar at http://10.96.30.51:10318/jars/hive-exec-1.2.1.jar with timestamp 1448804982784
2015-11-29 21:49:43,336 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm297
2015-11-29 21:49:43,356 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm297 after 1 fail over attempts. Trying to fail over immediately.
2015-11-29 21:49:43,357 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm280
2015-11-29 21:49:43,359 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm280 after 2 fail over attempts. Trying to fail over after sleeping for 477ms.
2015-11-29 21:49:43,359 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - java.net.ConnectException: Call From hd-master-001/10.96.30.51 to hd-master-001:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2015-11-29 21:49:43,359 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
My YARN status is that hd-master-002 is the active ResourceManager and hd-master-001 is the standby. Port 8032 on hd-master-001 is not open, so of course a connection error occurs when trying to connect to port 8032 on hd-master-001.
But why does it try to connect to the standby ResourceManager?
If I use the Hive client command shell with Spark on YARN, everything is OK.
PS: I didn't rebuild the Spark assembly jar without Hive; I only removed 'org.apache.hive' and 'org.apache.hadoop.hive' from the built assembly jar. But I do not think that is the problem, because I succeeded with the Hive client on Spark on YARN.
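One way to confirm which ResourceManager is actually active (a diagnostic sketch; rm280 and rm297 are the RM IDs from the log above):
yarn rmadmin -getServiceState rm280
yarn rmadmin -getServiceState rm297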
I've been struggling all day and haven't found a solution.
I'm trying to connect to a remote Cassandra node from a Spark Streaming application using the spark-cassandra-connector, and the application exits with an exception. Any help would be much appreciated.
2015-02-17 19:13:58 DEBUG Connection:114 - Connection[/<MY_PUBLIC_IP>:9042-2, inFlight=0, closed=false] Transport initialized and ready
2015-02-17 19:13:58 DEBUG ControlConnection:492 - [Control connection] Refreshing node list and token map
2015-02-17 19:13:59 DEBUG ControlConnection:262 - [Control connection] Refreshing schema
2015-02-17 19:14:00 DEBUG ControlConnection:492 - [Control connection] Refreshing node list and token map
2015-02-17 19:14:00 DEBUG ControlConnection:172 - [Control connection] Successfully connected to /<MY_PUBLIC_IP>:9042
2015-02-17 19:14:00 INFO Cluster:1267 - New Cassandra host /<MY_PUBLIC_IP>:9042 added
2015-02-17 19:14:00 INFO CassandraConnector:51 - Connected to Cassandra cluster: Test Cluster
2015-02-17 19:14:00 INFO LocalNodeFirstLoadBalancingPolicy:59 - Adding host <MY_PUBLIC_IP> (datacenter1)
2015-02-17 19:14:01 DEBUG Connection:114 - Connection[/<MY_PUBLIC_IP>:9042-3, inFlight=0, closed=false] Transport initialized and ready
2015-02-17 19:14:01 DEBUG Session:304 - Added connection pool for /<MY_PUBLIC_IP>:9042
2015-02-17 19:14:01 INFO LocalNodeFirstLoadBalancingPolicy:59 - Adding host <MY_PUBLIC_IP> (datacenter1)
2015-02-17 19:14:01 DEBUG Schema:55 - Retrieving database schema from cluster Test Cluster...
2015-02-17 19:14:01 DEBUG Schema:55 - 1 keyspaces fetched from cluster Test Cluster: {vehicles}
2015-02-17 19:14:02 DEBUG CassandraConnector:55 - Attempting to open thrift connection to Cassandra at <MY_PUBLIC_IP>:9160
2015-02-17 19:14:02 DEBUG Connection:428 - Connection[/<MY_PUBLIC_IP>:9042-3, inFlight=0, closed=true] closing connection
2015-02-17 19:14:02 DEBUG Cluster:1340 - Shutting down
2015-02-17 19:14:02 DEBUG Connection:428 - Connection[/<MY_PUBLIC_IP>:9042-2, inFlight=0, closed=true] closing connection
2015-02-17 19:14:02 INFO CassandraConnector:51 - Disconnected from Cassandra cluster: Test Cluster
2015-02-17 19:14:03 DEBUG CassandraConnector:55 - Attempting to open thrift connection to Cassandra at <AWS_LOCAL_IP>:9160
2015-02-17 19:14:10 DEBUG HeartbeatReceiver:50 - [actor] received message Heartbeat(localhost,[Lscala.Tuple2;@77008370,BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$b]
2015-02-17 19:14:10 DEBUG BlockManagerMasterActor:50 - [actor] received message BlockManagerHeartbeat(BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$c]
2015-02-17 19:14:10 DEBUG BlockManagerMasterActor:56 - [actor] handled message (0.491517 ms) BlockManagerHeartbeat(BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$c]
2015-02-17 19:14:10 DEBUG HeartbeatReceiver:56 - [actor] handled message (69.725123 ms) Heartbeat(localhost,[Lscala.Tuple2;@77008370,BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$b]
2015-02-17 19:14:20 DEBUG HeartbeatReceiver:50 - [actor] received message Heartbeat(localhost,[Lscala.Tuple2;@70a7cd6e,BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$d]
2015-02-17 19:14:20 DEBUG BlockManagerMasterActor:50 - [actor] received message BlockManagerHeartbeat(BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$e]
2015-02-17 19:14:20 DEBUG BlockManagerMasterActor:56 - [actor] handled message (0.348586 ms) BlockManagerHeartbeat(BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$e]
2015-02-17 19:14:20 DEBUG HeartbeatReceiver:56 - [actor] handled message (2.020429 ms) Heartbeat(localhost,[Lscala.Tuple2;@70a7cd6e,BlockManagerId(<driver>, Alon-PC, 62343, 0)) from Actor[akka://sparkDriver/temp/$d]
2015-02-17 19:14:24 ERROR ServerSideTokenRangeSplitter:88 - Failure while fetching splits from Cassandra
java.io.IOException: Failed to open thrift connection to Cassandra at <AWS_LOCAL_IP>:9160
at com.datastax.spark.connector.cql.CassandraConnector.createThriftClient(CassandraConnector.scala:132)
at com.datastax.spark.connector.cql.CassandraConnector.withCassandraClientDo(CassandraConnector.scala:141)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter.com$datastax$spark$connector$rdd$partitioner$ServerSideTokenRangeSplitter$$fetchSplits(ServerSideTokenRangeSplitter.scala:33)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$1$$anonfun$apply$2.apply(ServerSideTokenRangeSplitter.scala:45)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$1$$anonfun$apply$2.apply(ServerSideTokenRangeSplitter.scala:45)
at scala.util.Try$.apply(Try.scala:161)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$1.apply(ServerSideTokenRangeSplitter.scala:45)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$1.apply(ServerSideTokenRangeSplitter.scala:44)
at scala.collection.immutable.Stream.map(Stream.scala:376)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter.split(ServerSideTokenRangeSplitter.scala:44)
at com.datastax.spark.connector.rdd.partitioner.CassandraRDDPartitioner$$anonfun$com$datastax$spark$connector$rdd$partitioner$CassandraRDDPartitioner$$splitsOf$1.apply(CassandraRDDPartitioner.scala:77)
at com.datastax.spark.connector.rdd.partitioner.CassandraRDDPartitioner$$anonfun$com$datastax$spark$connector$rdd$partitioner$CassandraRDDPartitioner$$splitsOf$1.apply(CassandraRDDPartitioner.scala:76)
at scala.collection.parallel.mutable.ParArray$ParArrayIterator.flatmap2combiner(ParArray.scala:418)
at scala.collection.parallel.ParIterableLike$FlatMap.leaf(ParIterableLike.scala:1075)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56)
at scala.collection.parallel.ParIterableLike$FlatMap.tryLeaf(ParIterableLike.scala:1071)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinTask.doJoin(ForkJoinTask.java:341)
at scala.concurrent.forkjoin.ForkJoinTask.join(ForkJoinTask.java:673)
at scala.collection.parallel.ForkJoinTasks$WrappedTask$class.sync(Tasks.scala:444)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.sync(Tasks.scala:514)
at scala.collection.parallel.ForkJoinTasks$class.executeAndWaitResult(Tasks.scala:492)
at scala.collection.parallel.ForkJoinTaskSupport.executeAndWaitResult(TaskSupport.scala:64)
at scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:961)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56)
at scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:956)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out: connect
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
at com.datastax.spark.connector.cql.DefaultConnectionFactory$.createThriftClient(CassandraConnectionFactory.scala:47)
at com.datastax.spark.connector.cql.CassandraConnector.createThriftClient(CassandraConnector.scala:127)
... 41 more
Caused by: java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 45 more
Exception in thread "main" java.io.IOException: Failed to fetch splits of TokenRange(0,0,Set(CassandraNode(/<AWS_LOCAL_IP>,/<MY_PUBLIC_IP>)),None) from all endpoints: CassandraNode(/<AWS_LOCAL_IP>,/<MY_PUBLIC_IP>)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$split$2.apply(ServerSideTokenRangeSplitter.scala:55)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter$$anonfun$split$2.apply(ServerSideTokenRangeSplitter.scala:49)
at scala.Option.getOrElse(Option.scala:120)
at com.datastax.spark.connector.rdd.partitioner.ServerSideTokenRangeSplitter.split(ServerSideTokenRangeSplitter.scala:49)
at com.datastax.spark.connector.rdd.partitioner.CassandraRDDPartitioner$$anonfun$com$datastax$spark$connector$rdd$partitioner$CassandraRDDPartitioner$$splitsOf$1.apply(CassandraRDDPartitioner.scala:77)
at com.datastax.spark.connector.rdd.partitioner.CassandraRDDPartitioner$$anonfun$com$datastax$spark$connector$rdd$partitioner$CassandraRDDPartitioner$$splitsOf$1.apply(CassandraRDDPartitioner.scala:76)
at scala.collection.parallel.mutable.ParArray$ParArrayIterator.flatmap2combiner(ParArray.scala:418)
at scala.collection.parallel.ParIterableLike$FlatMap.leaf(ParIterableLike.scala:1075)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56)
at scala.collection.parallel.ParIterableLike$FlatMap.tryLeaf(ParIterableLike.scala:1071)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinTask.doJoin(ForkJoinTask.java:341)
at scala.concurrent.forkjoin.ForkJoinTask.join(ForkJoinTask.java:673)
at scala.collection.parallel.ForkJoinTasks$WrappedTask$class.sync(Tasks.scala:444)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.sync(Tasks.scala:514)
at scala.collection.parallel.ForkJoinTasks$class.executeAndWaitResult(Tasks.scala:492)
at scala.collection.parallel.ForkJoinTaskSupport.executeAndWaitResult(TaskSupport.scala:64)
at scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:961)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56)
at scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:956)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-02-17 19:14:24 DEBUG DiskBlockManager:63 - Shutdown hook called
At the beginning it looks fine (the connection succeeds, the keyspace is fetched...), but then, when it attempts to open a Thrift connection, it fails, disconnects, and shuts down.
I've opened ports 9160, 9042 and 7000.
And in cassandra.yaml I set:
listen_address: <AWS_LOCAL_IP>
broadcast_address: <MY_PUBLIC_IP>
What am I missing?
OK, I was about to post this question, but then I finally worked it out:
In cassandra.yaml, I had to set:
rpc_address: 0.0.0.0
Other Stack Overflow questions helped me, but I'm posting this because the stack trace may help others find it.
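For completeness, the relevant cassandra.yaml settings ended up as follows (a sketch; the placeholders match the addresses above, and note that on Cassandra 2.1+ setting rpc_address to 0.0.0.0 also requires broadcast_rpc_address to be set):
listen_address: <AWS_LOCAL_IP>
broadcast_address: <MY_PUBLIC_IP>
rpc_address: 0.0.0.0
broadcast_rpc_address: <MY_PUBLIC_IP>   # required when rpc_address is 0.0.0.0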