How do I reconnect to Cassandra using Hector? - cassandra

I have the following code:
StringSerializer ss = StringSerializer.get();
String cf = "TEST";
CassandraHostConfigurator conf = new CassandraHostConfigurator("localhost:9160");
conf.setCassandraThriftSocketTimeout(40000);
conf.setExhaustedPolicy(ExhaustedPolicy.WHEN_EXHAUSTED_BLOCK);
conf.setRetryDownedHostsDelayInSeconds(5);
conf.setRetryDownedHostsQueueSize(128);
conf.setRetryDownedHosts(true);
conf.setLoadBalancingPolicy(new LeastActiveBalancingPolicy());
String key = Long.toString(System.currentTimeMillis());
Cluster cluster = HFactory.getOrCreateCluster("TestCluster", conf);
Keyspace keyspace = HFactory.createKeyspace("TestCluster", cluster);
Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get()); int count = 0;
while (!"q".equals(new Scanner( System.in).next())) {
try{
mutator.insert(key, cf, HFactory.createColumn("column_" + count, "v_" + count, ss, ss));
count++;
} catch (Exception e) {
e.printStackTrace();
}
}
and I can write some values using it, but when I restart cassandra, it fails. Here is the log:
[15:11:07] INFO [CassandraHostRetryService ] Downed Host Retry service started with >queue size 128 and retry delay 5s
[15:11:07] INFO [JmxMonitor ] Registering JMX >me.prettyprint.cassandra.service_ASG:ServiceType=hector,MonitorType=hector
[15:11:17] ERROR [HThriftClient ] Could not flush transport (to be expected >if the pool is shutting down) in close for client: CassandraClient
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at >org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
at >me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:98)
at >me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:26)
at >me.prettyprint.cassandra.connection.HConnectionManager.closeClient(HConnectionManager.java:308)
at >me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:257)
at >me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
at com.app.App.main(App.java:40)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at >org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 9 more
[15:11:17] ERROR [HConnectionManager ] MARK HOST AS DOWN TRIGGERED for host >localhost(127.0.0.1):9160
[15:11:17] ERROR [HConnectionManager ] Pool state on shutdown: >:{localhost(127.0.0.1):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
[15:11:17] INFO [ConcurrentHClientPool ] Shutdown triggered on :{localhost(127.0.0.1):9160}
[15:11:17] INFO [ConcurrentHClientPool ] Shutdown complete on :{localhost(127.0.0.1):9160}
[15:11:17] INFO [CassandraHostRetryService ] Host detected as down was added to retry queue: localhost(127.0.0.1):9160
[15:11:17] WARN [HConnectionManager ] Could not fullfill request on this host CassandraClient
[15:11:17] WARN [HConnectionManager ] Exception:
me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at >me.prettyprint.cassandra.connection.client.HThriftClient.getCassandra(HThriftClient.java:82)
at >me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:236)
at >me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
at com.app.App.main(App.java:40)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:157)
at org.apache.cassandra.thrift.Cassandra$Client.send_set_keyspace(Cassandra.java:466)
at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:455)
at >me.prettyprint.cassandra.connection.client.HThriftClient.getCassandra(HThriftClient.java:78)
... 5 more
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at >org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 9 more
[15:11:17] INFO [HConnectionManager ] Client CassandraClient released to inactive or dead pool. Closing.
[15:11:17] INFO [HConnectionManager ] Client CassandraClient released to inactive or dead pool. Closing.
[15:11:17] INFO [HConnectionManager ] Added host localhost(127.0.0.1):9160 to pool

You have set -
conf.setRetryDownedHostsDelayInSeconds(5);
Try to to wait after the restart for more than 5 seconds.
Also, you may need to upgrade.
What is the size thrift_max_message_length_in_mb you have set?
Kind regards.

Related

cassandra benchmark ReadWriteTest can not run successfully

I run cassandra benchmarks ReadWriteTest with idea on centos7
`
package org.apache.cassandra.test.microbench;
#BenchmarkMode(Mode.Throughput)
#OutputTimeUnit(TimeUnit.MILLISECONDS)
#Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
#Measurement(iterations = 5, time = 2, timeUnit = TimeUnit.SECONDS)
#Fork(value = 1)
#Threads(1)
#State(Scope.Benchmark)
public class ReadWriteTest extends CQLTester
{
static String keyspace;
String table;
String writeStatement;
String readStatement;
long numRows = 0;
ColumnFamilyStore cfs;
#Setup(Level.Trial)
public void setup() throws Throwable
{
CQLTester.setUpClass();
keyspace = createKeyspace("CREATE KEYSPACE %s with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 } and durable_writes = false");
table = createTable(keyspace, "CREATE TABLE %s ( userid bigint, picid bigint, commentid bigint, PRIMARY KEY(userid, picid))");
execute("use "+keyspace+";");
writeStatement = "INSERT INTO "+table+"(userid,picid,commentid)VALUES(?,?,?)";
readStatement = "SELECT * from "+table+" limit 100";
cfs = Keyspace.open(keyspace).getColumnFamilyStore(table);
cfs.disableAutoCompaction();
//Warm up
System.err.println("Writing 50k");
for (long i = 0; i < 5000; i++)
execute(writeStatement, i, i, i );
}
#TearDown(Level.Trial)
public void teardown() throws IOException, ExecutionException, InterruptedException
{
CQLTester.cleanup();
}
#Benchmark
public Object write() throws Throwable
{
numRows++;
return execute(writeStatement, numRows, numRows, numRows );
}
#Benchmark
public Object read() throws Throwable
{
return execute(readStatement);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(ReadWriteTest.class.getSimpleName())
.build();
new Runner(opt).run();
}
}
it seems to be successful at start, "java.lang.IndexOutOfBoundsException: Index: 0, Size: 0" occurs like this
01:37:25.884 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG org.apache.cassandra.db.ReadCommand - --in queryMemtableAndDiskInternal, partitionKey token:-2532556674411782010, partitionKey:DecoratedKey(-2532556674411782010, 6b657973706163655f30)
01:37:25.888 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] INFO o.a.cassandra.db.ColumnFamilyStore - Initializing keyspace_0.globalReplicaTable
01:37:25.889 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG o.a.cassandra.db.DiskBoundaryManager - Refreshing disk boundary cache for keyspace_0.globalReplicaTable
01:37:25.890 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG o.a.cassandra.db.DiskBoundaryManager - Got local ranges [] (ringVersion = 0)
01:37:25.890 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG o.a.cassandra.db.DiskBoundaryManager - Updating boundaries from null to DiskBoundaries{directories=[DataDirectory{location=/home/cjx/Downloads/depart0/depart/data/data}], positions=null, ringVersion=0, directoriesVersion=0} for keyspace_0.globalReplicaTable
01:37:25.890 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG org.apache.cassandra.config.Schema - Adding org.apache.cassandra.config.CFMetaData#44380385[cfId=bfe68a70-569b-11ed-974a-eb765e7c415c,ksName=keyspace_0,cfName=globalReplicaTable,flags=[COMPOUND],params=TableParams{comment=, read_repair_chance=0.0, dclocal_read_repair_chance=0.1, bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=864000, default_time_to_live=0, memtable_flush_period_in_ms=0, min_index_interval=128, max_index_interval=2048, speculative_retry=99PERCENTILE, caching={'keys' : 'ALL', 'rows_per_partition' : 'NONE'}, compaction=CompactionParams{class=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, options={max_threshold=32, min_threshold=4}}, compression=org.apache.cassandra.schema.CompressionParams#920394f7, extensions={}, cdc=false},comparator=comparator(org.apache.cassandra.db.marshal.LongType),partitionColumns=[[] | [commentid]],partitionKeyColumns=[userid],clusteringColumns=[picid],keyValidator=org.apache.cassandra.db.marshal.LongType,columnMetadata=[userid, picid, commentid],droppedColumns={},triggers=[],indexes=[]] to cfIdMap
Writing 50k
<failure>
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:659)
at java.util.ArrayList.get(ArrayList.java:435)
at org.apache.cassandra.locator.TokenMetadata.firstToken(TokenMetadata.java:1079)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:107)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:4066)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:619)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:227)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:232)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:241)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:587)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:581)
at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:315)
at org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:792)
at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:780)
at org.apache.cassandra.test.microbench.ReadWriteTest.setup(ReadWriteTest.java:94)
at org.apache.cassandra.test.microbench.generated.ReadWriteTest_read_jmhTest._jmh_tryInit_f_readwritetest0_G(ReadWriteTest_read_jmhTest.java:438)
at org.apache.cassandra.test.microbench.generated.ReadWriteTest_read_jmhTest.read_Throughput(ReadWriteTest_read_jmhTest.java:71)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453)
at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
`
then I use ant microbench -Dbenchmark.name=ReadWriteTest to run it, also it leads to the same error.
Any and all help would be appreciated.
Thanks.
of course I run the microbenchs with cassandra running, I also meet mistake when I shut down cassandra
`
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:28,491 Server.java:176 - Stop listening for CQL clients
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:28,491 Gossiper.java:1551 - Announcing shutdown
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:28,492 StorageService.java:2454 - Node /192.168.199.135 state jump to shutdown
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:30,495 MessagingService.java:981 - Waiting for messaging service to quiesce
INFO [ACCEPT-/192.168.199.135] 2022-10-28 00:58:30,495 MessagingService.java:1336 - MessagingService has terminated the accept() thread
WARN [MemtableFlushWriter:3] 2022-10-28 00:58:30,498 NativeLibrary.java:304 - open(/home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090, O_RDONLY) failed, errno (2).
ERROR [MemtableFlushWriter:3] 2022-10-28 00:58:30,500 LogTransaction.java:272 - Transaction log [md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log in /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090] indicates txn was not completed, trying to abort it now
ERROR [MemtableFlushWriter:3] 2022-10-28 00:58:30,505 LogTransaction.java:275 - Failed to abort transaction log [md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log in /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090]
java.lang.RuntimeException: java.nio.file.NoSuchFileException: /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090/md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:590) ~[main/:na]
at org.apache.cassandra.io.util.FileUtils.appendAndSync(FileUtils.java:571) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogReplica.append(LogReplica.java:85) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogReplicaSet.lambda$null$5(LogReplicaSet.java:210) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:113) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:103) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogReplicaSet.append(LogReplicaSet.java:210) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogFile.addRecord(LogFile.java:338) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:255) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:113) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:103) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:98) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction$TransactionTidier.run(LogTransaction.java:273) [main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction$TransactionTidier.tidy(LogTransaction.java:257) [main/:na]
at org.apache.cassandra.utils.concurrent.Ref$GlobalState.release(Ref.java:322) [main/:na]
at org.apache.cassandra.utils.concurrent.Ref$State.ensureReleased(Ref.java:200) [main/:na]
at org.apache.cassandra.utils.concurrent.Ref.ensureReleased(Ref.java:120) [main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction.complete(LogTransaction.java:392) [main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction.doAbort(LogTransaction.java:409) [main/:na]
at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) [main/:na]
at org.apache.cassandra.db.lifecycle.LifecycleTransaction.doAbort(LifecycleTransaction.java:244) [main/:na]
at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) [main/:na]
at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1234) [main/:na]
at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1180) [main/:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_332]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_332]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [main/:na]
at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_332]
Caused by: java.nio.file.NoSuchFileException: /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090/md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[na:1.8.0_332]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[na:1.8.0_332]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[na:1.8.0_332]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[na:1.8.0_332]
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434) ~[na:1.8.0_332]
at java.nio.file.Files.newOutputStream(Files.java:216) ~[na:1.8.0_332]
at java.nio.file.Files.write(Files.java:3351) ~[na:1.8.0_332]
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:583) ~[main/:na]
... 27 common frames omitted
`
it's so wired and I could not know how to fix it

How to setup a distributed eventbus in a vertx Hazelcast Cluster?

Here is the sender verticle
I have set multicast enabled and set the public host to my machines ip address
VertxOptions options = new VertxOptions()
.setClusterManager(ClusterManagerConfig.getClusterManager());
EventBusOptions eventBusOptions = new EventBusOptions()
.setClustered(true)
.setClusterPublicHost("10.10.1.160");
options.setEventBusOptions(eventBusOptions);
Vertx.clusteredVertx(options, res -> {
if (res.succeeded()) {
Vertx vertx = res.result();
vertx.deployVerticle(new requestHandler());
vertx.deployVerticle(new requestSender());
EventBus eventBus = vertx.eventBus();
eventBus.send("some.address","hello",reply -> {
System.out.println(reply.toString());
});
} else {
LOGGER.info("Failed: " + res.cause());
}
});
}
here's the reciever verticle
VertxOptions options = new VertxOptions().setClusterManager(mgr);
options.setEventBusOptions(new EventBusOptions()
.setClustered(true)
.setClusterPublicHost("10.10.1.174") );
Vertx.clusteredVertx(options, res -> {
if (res.succeeded()) {
Vertx vertx1 = res.result();
System.out.println("Success");
EventBus eb = vertx1.eventBus();
System.out.println("ready");
eb.consumer("some.address", message -> {
message.reply("hello hello");
});
} else {
System.out.println("Failed");
}
});
I get this result when i run both main verticles , so the verticles are detected by hazelcast and a connection is established
INFO: [10.10.1.160]:33001 [dev] [3.10.5] Established socket connection between /10.10.1.160:33001 and /10.10.1.174:35725
Jan 11, 2021 11:45:10 AM com.hazelcast.internal.cluster.ClusterService
INFO: [10.10.1.160]:33001 [dev] [3.10.5]
Members {size:2, ver:2} [
Member [10.10.1.160]:33001 - 51b8c249-6b3c-4ca8-a238-c651845629d8 this
Member [10.10.1.174]:33001 - 1cba1680-025e-469f-bad6-884111313672
]
Jan 11, 2021 11:45:10 AM com.hazelcast.internal.partition.impl.MigrationManager
INFO: [10.10.1.160]:33001 [dev] [3.10.5] Re-partitioning cluster data... Migration queue size: 271
Jan 11, 2021 11:45:11 AM com.hazelcast.nio.tcp.TcpIpAcceptor
But when the event-bus tries to send a message to given address i encounter this error is this a problem with event-bus configuration?
Jan 11, 2021 11:59:57 AM io.vertx.core.eventbus.impl.clustered.ConnectionHolder
WARNING: Connecting to server 10.10.1.174:39561 failed
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /10.10.1.174:39561
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:665)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:612)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:529)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:491)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
... 11 more
In Vert.x 3, the cluster host and cluster public host default to localhost.
If you only change the cluster public host in VertxOptions, Vert.x will bind EventBus transport servers to localhost while telling other nodes to connect to the public host.
This kind of configuration is needed when running Vert.x on some cloud providers, but in most cases you only need to set the cluster host (and then the public host will default to its value):
EventBusOptions eventBusOptions = new EventBusOptions()
.setClustered(true)
.setHost("10.10.1.160");

failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries = 6

I am trying to connect spark application with hbase. Below is the configuration I am giving
val conf = HBaseConfiguration.create()
conf.set("hbase.master", "localhost:16010")
conf.setInt("timeout", 120000)
conf.set("hbase.zookeeper.quorum", "2181")
val connection = ConnectionFactory.createConnection(conf)
and below are the 'jps' details:
5808 ResourceManager
8150 HMaster
8280 HRegionServer
5131 NameNode
8076 HQuorumPeer
5582 SecondaryNameNode
2798 org.eclipse.equinox.launcher_1.4.0.v20161219-1356.jar
8623 Jps
5951 NodeManager
5279 DataNode
I have alsotried with hbase master 16010
I am getting below error:
19/09/12 21:49:00 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1024)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
19/09/12 21:49:00 WARN ReadOnlyZKClient: 0x1e3ff233 to 2181:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries = 4
19/09/12 21:49:01 INFO ClientCnxn: Opening socket connection to server 2181/0.0.8.133:2181. Will not attempt to authenticate using SASL (unknown error)
19/09/12 21:49:01 ERROR ClientCnxnSocketNIO: Unable to open socket to 2181/0.0.8.133:2181
Looks like there is a problem to join zookeeper.
Check first that zookeeper is started on your local host on port 2181.
netstat -tunelp | grep 2181 | grep -i LISTEN
tcp6 0 0 :::2181 :::* LISTEN
In your conf, in hbase.zookeeper.quorum property you have to pass the ip of your zookeeper and not the port (hbase.zookeeper.property.clientPort)
My hbase connector is build with :
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "10.80.188.65")
conf.set("hbase.master", "10.80.188.64:60000")
conf.set("hbase.zookeeper.property.clientPort", "2181")
conf.set("zookeeper.znode.parent", "/hbase-unsecure")
val connection = ConnectionFactory.createConnection(conf)

java.io.IOException: Filesystem closed

my spark job;
val df = spark.sql(s"""""")
df.write.mode("append").json("hdfs://xxx-nn-ha/user/b_me/df")
Error:
2019-01-31 19:56:36 INFO CoarseGrainedExecutorBackend:54 - Driver commanded a shutdown
2019-01-31 19:56:36 INFO MemoryStore:54 - MemoryStore cleared
2019-01-31 19:56:36 INFO BlockManager:54 - BlockManager stopped
2019-01-31 19:56:36 INFO ShutdownHookManager:54 - Shutdown hook called
2019-01-31 19:56:36 ERROR Executor:91 - Exception in task 4120.0 in stage 5.0 (TID 5402)
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:868)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:735)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2178)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2585)
at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:277)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:214)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage12.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2019-01-31 19:56:36 INFO Executor:54 - Not reporting error to driver during JVM shutdown.
End of LogType:stdout
I got the java.io.IOException: Filesystem closed error. Why? I have no hints. Any ideas welcomed. Thanks
UPDATE
There is a WARN:
2019-02-01 12:01:39 INFO YarnAllocator:54 - Driver requested a total number of 2007 executor(s).
2019-02-01 12:01:39 INFO ExecutorAllocationManager:54 - Requesting 968 new executors because tasks are backlogged (new desired total will be 2007)
2019-02-01 12:01:39 INFO ExecutorAllocationManager:54 - New executor 25 has registered (new total is 26)
2019-02-01 12:01:39 WARN ApplicationMaster:87 - Reporter thread fails 1 time(s) in a row.
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Too many containers asked, 1365198
at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:128)
at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:511)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1714)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2202)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy22.allocate(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:277)
at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:268)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:556)
If you look closer at DFSClient.checkOpen code you will see the following:
void checkOpen() throws IOException {
if (!clientRunning) {
IOException result = new IOException("Filesystem closed");
throw result;
}
}
Let's find all accessors of clientRunning field.
Only close method really changes it. Lets take a look at it:
#Override
public synchronized void close() throws IOException {
try {
if(clientRunning) {
closeAllFilesBeingWritten(false);
clientRunning = false;
getLeaseRenewer().closeClient(this);
// close connections to the namenode
closeConnectionToNamenode();
}
} finally {
if (provider != null) {
provider.close();
}
}
}
So the main problem in your job is that it tries to write the value, although the FS is already closed.
Make sure you don't close your FS before you do any job. You can also increase the logging level to find the cause.

Spark: Frequent Pattern Mining: issues in saving the results

I am using Spark's FP-growth algorithm. I was getting OOM errors when I was doing a collect, I then changed the code so that I can save the results in a text file on HDFS rather than collecting them on the driver node. Here is the related code:
// Model building:
val fpg = new FPGrowth()
.setMinSupport(0.01)
.setNumPartitions(10)
val model = fpg.run(transaction_distinct)
Here is a transformation that should give me RDD[Strings].
val mymodel = model.freqItemsets.map { itemset =>
val model_res = itemset.items.mkString("[", ",", "]") + ", " + itemset.freq
model_res
}
I then save the model results as. Unfortunately, this is really SLOW!!!
mymodel.saveAsTextFile("fpm_model")
I get these errors:
16/02/04 14:47:28 ERROR ErrorMonitor: AssociationError[akka.tcp://sparkDriver#ipaddress:46811] -> [akka.tcp://sparkExecutor#hostname:39720]: Error [Association failed with [akka.tcp://sparkExecutor#hostname:39720]][akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor#hostname:39720]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: hostname/ipaddress:39720] akka.event.Logging$Error$NoCause$
16/02/04 14:47:28 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(3, hostname, 58683)
16/02/04 14:47:28 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
16/02/04 14:47:28 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver#ipaddress:46811] ->[akka.tcp://sparkExecutor#hostname:39720]: Error [Association failed with [akka.tcp://sparkExecutor#hostname:39720]][akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor#hostname:39720]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: hostname/ipaddress:39720

Resources