I downloaded WSO2 CEP 4.0.0-SNAPSHOT from Jenkins around two weeks ago.
When I configure the Cassandra output publisher in CEP, I tie it to an event stream. When I test the event stream, the Cassandra output publisher is invoked and I get an exception. Below is the entire log with the exception:
log4j: reset attribute= "false".
log4j: Threshold ="null".
log4j: Level value for root is [DEBUG].
log4j: root level set to DEBUG
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%d{ABSOLUTE} %-5p [%c{1}] %m%n].
log4j: Adding appender named [myAppender] to category [root].
17:12:16,449 INFO [CassandraHostRetryService] Downed Host Retry service started with queue size -1 and retry delay 10s
17:12:16,517 INFO [JmxMonitor] Registering JMX me.prettyprint.cassandra.service_EventPublisher_risultato_cassandra:ServiceType=hector,MonitorType=hector
17:12:16,543 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,548 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,549 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,550 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,551 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,552 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,553 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,554 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,558 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,563 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,569 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,576 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,579 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,584 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,589 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,591 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,592 DEBUG [ConcurrentHClientPool] Concurrent Host pool started with 16 active clients; max: 50 exhausted wait: 0
17:12:16,641 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-1>
17:12:16,643 ERROR [HConnectionManager] MARK HOST AS DOWN TRIGGERED for host localhost(127.0.0.1):9042
17:12:16,645 ERROR [HConnectionManager] Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{localhost(127.0.0.1):9042}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
17:12:16,646 INFO [ConcurrentHClientPool] Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{localhost(127.0.0.1):9042}
17:12:16,647 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-5>
17:12:16,650 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-15>
17:12:16,650 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-4>
17:12:16,652 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-11>
17:12:16,653 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-12>
17:12:16,655 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-14>
17:12:16,655 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-7>
17:12:16,656 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-13>
17:12:16,658 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-9>
17:12:16,659 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-6>
17:12:16,659 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-16>
17:12:16,661 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-2>
17:12:16,663 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-10>
17:12:16,664 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-3>
17:12:16,667 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-8>
17:12:16,669 INFO [ConcurrentHClientPool] Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{localhost(127.0.0.1):9042}
17:12:16,669 INFO [CassandraHostRetryService] Host detected as down was added to retry queue: localhost(127.0.0.1):9042
17:12:16,670 DEBUG [HThriftClient] Creating a new thrift connection to localhost(127.0.0.1):9042
17:12:16,670 WARN [HConnectionManager] Could not fullfill request on this host CassandraClient<localhost:9042-1>
17:12:16,671 WARN [HConnectionManager] Exception:
me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2080374784)!
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:39)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:195)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:185)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:253)
at me.prettyprint.cassandra.service.AbstractCluster.describeKeyspace(AbstractCluster.java:199)
at it.vige.test.cassandra.CassandraWso2Test.cassandraConnection(CassandraWso2Test.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Caused by: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2080374784)!
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:133)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_keyspace(Cassandra.java:1241)
at org.apache.cassandra.thrift.Cassandra$Client.describe_keyspace(Cassandra.java:1228)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:190)
... 28 more
17:12:16,675 ERROR [CassandraHostRetryService] Downed Host retry failed attempt to verify CassandraHost
org.apache.thrift.transport.TTransportException: Read a negative frame size (-2080374784)!
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:133)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_cluster_name(Cassandra.java:1101)
at org.apache.cassandra.thrift.Cassandra$Client.describe_cluster_name(Cassandra.java:1089)
at me.prettyprint.cassandra.connection.CassandraHostRetryService.verifyConnection(CassandraHostRetryService.java:214)
at me.prettyprint.cassandra.connection.CassandraHostRetryService.access$100(CassandraHostRetryService.java:24)
at me.prettyprint.cassandra.connection.CassandraHostRetryService$1.run(CassandraHostRetryService.java:75)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17:12:16,683 INFO [HConnectionManager] Client CassandraClient<localhost:9042-1> released to inactive or dead pool. Closing.
17:12:16,683 DEBUG [HThriftClient] Closing client CassandraClient<localhost:9042-1>
17:12:16,684 ERROR [CassandraWso2Test] Test fallito
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:390)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:244)
at me.prettyprint.cassandra.service.AbstractCluster.describeKeyspace(AbstractCluster.java:199)
at it.vige.test.cassandra.CassandraWso2Test.cassandraConnection(CassandraWso2Test.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Below is how I configured Cassandra 2.2.1 in conf/cassandra.yaml:
#cluster_name: 'Test Cluster'
cluster_name: 'EventPublisher_risultato_cassandra'
and how I start it:
.../bin/cassandra
Below is how the output publisher is configured in the CEP:
<?xml version="1.0" encoding="UTF-8"?>
<eventPublisher name="risultato_cassandra" statistics="disable"
    trace="disable" xmlns="http://wso2.org/carbon/eventpublisher">
  <from streamName="gpsspace_entrati" version="1.0.0"/>
  <mapping customMapping="disable" type="map"/>
  <to eventAdapterType="cassandra">
    <property name="key.space.name">seme</property>
    <property name="port">9042</property>
    <property name="hosts">localhost</property>
    <property name="column.family.name">seme</property>
  </to>
</eventPublisher>
Here is test code that emulates the error, using:
<dependency>
  <groupId>org.hectorclient.wso2</groupId>
  <artifactId>hector-core</artifactId>
  <version>1.1.4.wso2v1</version>
</dependency>
as a dependency:
package it.vige.test.cassandra;

import static org.junit.Assert.fail;
import static org.slf4j.LoggerFactory.getLogger;

import java.util.HashMap;
import java.util.Map;

import org.junit.Test;
import org.slf4j.Logger;

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

public class CassandraWso2Test {

    private Logger logger = getLogger(getClass());

    @Test
    public void cassandraConnection() {
        try {
            Cluster cluster;
            // Connect to the cluster and keyspace "seme"
            // (same properties the CEP Cassandra output adapter receives)
            Map<String, String> staticProperties = new HashMap<String, String>();
            staticProperties.put("key.space.name", "seme");
            staticProperties.put("replication.factor", null);
            staticProperties.put("port", "9042");
            staticProperties.put("hosts", "localhost");
            staticProperties.put("strategy.class", null);
            staticProperties.put("user.name", null);
            staticProperties.put("indexed.columns", null);
            staticProperties.put("column.family.name", "seme");

            CassandraHostConfigurator chc = new CassandraHostConfigurator();
            chc.setHosts(staticProperties.get("hosts"));
            if (staticProperties.get("port") != null) {
                chc.setPort(Integer.parseInt(staticProperties.get("port")));
            }
            cluster = HFactory.createCluster("EventPublisher_risultato_cassandra", chc, null);
            String keySpaceName = staticProperties.get("key.space.name");
            KeyspaceDefinition existingKeyspaceDefinition = cluster.describeKeyspace(keySpaceName);
            logger.info("existingKeyspaceDefinition = " + existingKeyspaceDefinition);
        } catch (Exception ex) {
            logger.error("Test fallito", ex);
            fail();
        }
    }
}
The publisher (and the Hector test above) speaks Thrift, but port 9042 is Cassandra's CQL native-transport port, which is why the client reads a nonsense frame size. To solve the problem, enable Thrift in Cassandra:
nodetool enablethrift
and configure the event publisher to connect on port 9160 (Cassandra's default Thrift RPC port).
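For example, the earlier Hector test connects once it targets the Thrift port instead of the native-transport port. A minimal sketch of the changed connection (it only swaps the port value, assuming Cassandra's default rpc_port of 9160):

// Point Hector at Cassandra's Thrift port (default rpc_port 9160),
// not at the CQL native-transport port 9042.
CassandraHostConfigurator chc = new CassandraHostConfigurator();
chc.setHosts("localhost");
chc.setPort(9160);
Cluster cluster = HFactory.createCluster("EventPublisher_risultato_cassandra", chc, null);
KeyspaceDefinition keyspace = cluster.describeKeyspace("seme");

The same change applies to the publisher configuration: set the port property to 9160 instead of 9042.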
Related
I am getting the following errors in my WSO2 API Manager and Gateway node logs after starting them up:
TID: [-1] [] [2021-07-06 07:50:51,388] ERROR {org.wso2.carbon.databridge.receiver.binary.internal.BinaryDataReceiver} - Error while reading from the socket. java.io.EOFException: Connection closed from remote end.
at org.wso2.carbon.databridge.commons.binary.BinaryMessageConverterUtil.loadData(BinaryMessageConverterUtil.java:39)
at org.wso2.carbon.databridge.receiver.binary.internal.BinaryDataReceiver$BinaryTransportReceiver.run(BinaryDataReceiver.java:258)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
TID: [-1] [] [2021-07-06 07:50:52,753] ERROR {org.wso2.andes.transport.network.mina.MinaNetworkHandler} - Exception caught by Mina java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:218)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.wso2.org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:748)
TID: [-1] [] [2021-07-06 07:50:52,757] ERROR {org.wso2.andes.server.protocol.MultiVersionProtocolEngine} - Error establishing session java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:218)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.wso2.org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.wso2.org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:748)
I get this error when starting with the default deployment.toml as well as after configuring it for my deployment needs. It does not seem to affect the functionality of creating/publishing APIs and applications, subscribing to them, or generating keys, so I am not sure what the issue is.
I am currently running this API Manager on an EC2 instance on AWS. If there's any other info needed to help find out why this is happening, please let me know.
Thanks
I have searched a lot, but all in vain. This is a 3-node EC2 cluster in AWS; I checked the disk space, resources, and running services, and all seems to be fine, but I still get this error. Please help me resolve it.
10.0.1.5 and 10.0.1.6 are datanodes; I just ran spark-shell from the namenode.
Only minimal configurations were edited; if needed, I can post those here too.
$ spark-shell
19/08/05 10:40:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/08/05 10:40:31 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
19/08/05 10:40:42 WARN server.TransportChannelHandler: Exception in connection from /10.0.1.6:55202
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1106)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:343)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
19/08/05 10:40:42 ERROR client.TransportClient: Failed to send RPC RPC 6351187948645779511 to /10.0.1.5:44418: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
19/08/05 10:40:42 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to get executor loss reason for executor id 2 at RPC address 10.0.1.5:44428, but got no response. Marking as slave lost.
java.io.IOException: Failed to send RPC RPC 6351187948645779511 to /10.0.1.5:44418: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:357)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:334)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:987)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:869)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1316)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1081)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1128)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1070)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
I have a Livy server running locally, and it connects to a remote YARN cluster to run Spark jobs.
I am getting the error below when I upload my jar through the Livy programmatic Job API.
It looks like the correct Netty channel handler is not being mapped for the request message; because of this, the Spark context is not being created.
livy.conf setting:
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
I am able to POST to the Livy REST /batches API, and it submits and completes the job in the YARN cluster successfully.
With the same config, when I try to upload the jar from my Java client, it fails with the error below.
Livy version: 0.4.0-incubating.
HTTP Client:
<dependency>
  <groupId>org.apache.livy</groupId>
  <artifactId>livy-client-http</artifactId>
  <version>0.4.0-incubating</version>
</dependency>
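For context, the programmatic upload that triggers the failure looks roughly like the following; this is a minimal sketch, and the Livy URL, jar path, and class name are assumptions rather than the exact client code:

import java.io.File;
import java.net.URI;

import org.apache.livy.LivyClient;
import org.apache.livy.LivyClientBuilder;

public class LivyUploadExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical Livy endpoint; adjust to the actual server address.
        LivyClient client = new LivyClientBuilder()
                .setURI(new URI("http://localhost:8998"))
                .build();
        try {
            // Upload the application jar to the remote Spark context;
            // this is the call after which the RPC error below appears.
            client.uploadJar(new File("/path/to/my-spark-job.jar")).get();
        } finally {
            client.stop(true);
        }
    }
}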
Error:
17/09/22 22:47:50 DEBUG RSCDriver: Registered new connection from [id: 0x46d83447, L:/127.0.0.1:10000 - R:/127.0.0.1:9273].
17/09/22 22:47:50 DEBUG RpcDispatcher: [ReplDriver] Registered outstanding rpc 0 (org.apache.livy.rsc.BaseProtocol$RemoteDriverAddress).
17/09/22 22:47:50 DEBUG KryoMessageCodec: Encoded message of type org.apache.livy.rsc.rpc.Rpc$MessageHeader (5 bytes)
17/09/22 22:47:50 DEBUG KryoMessageCodec: Encoded message of type org.apache.livy.rsc.BaseProtocol$RemoteDriverAddress (90 bytes)
17/09/22 22:47:50 DEBUG KryoMessageCodec: Decoded message of type org.apache.livy.rsc.rpc.Rpc$MessageHeader (5 bytes)
17/09/22 22:47:50 DEBUG KryoMessageCodec: Decoded message of type org.apache.livy.rsc.BaseProtocol$RemoteDriverAddress (90 bytes)
17/09/22 22:47:50 DEBUG RpcDispatcher: [ReplDriver] Received RPC message: type=CALL id=0 payload=org.apache.livy.rsc.BaseProtocol$RemoteDriverAddress
17/09/22 22:47:50 WARN RpcDispatcher: [ReplDriver] Failed to find handler for msg 'org.apache.livy.rsc.BaseProtocol$RemoteDriverAddress'.
17/09/22 22:47:50 DEBUG RpcDispatcher: [ReplDriver] Caught exception in channel pipeline.
java.lang.NullPointerException
at org.apache.livy.rsc.Utils.stackTraceAsString(Utils.java:95)
at org.apache.livy.rsc.rpc.RpcDispatcher.handleCall(RpcDispatcher.java:121)
at org.apache.livy.rsc.rpc.RpcDispatcher.channelRead0(RpcDispatcher.java:77)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at java.lang.Thread.run(Thread.java:745)
17/09/22 22:47:50 WARN RpcDispatcher: [ReplDriver] Closing RPC channel with 1 outstanding RPCs.
17/09/22 22:47:50 WARN RpcDispatcher: [ReplDriver] Closing RPC channel with 1 outstanding RPCs.
17/09/22 22:47:50 ERROR ApplicationMaster: User class threw exception: java.util.concurrent.CancellationException
java.util.concurrent.CancellationException
at io.netty.util.concurrent.DefaultPromise.cancel(...)(Unknown Source)
17/09/22 22:47:50 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.CancellationException)
17/09/22 22:47:50 ERROR ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:401)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:764)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:762)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.util.concurrent.CancellationException
at io.netty.util.concurrent.DefaultPromise.cancel(...)(Unknown Source)
17/09/22 22:47:50 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.util.concurrent.CancellationException)
17/09/22 22:47:50 INFO ShutdownHookManager: Shutdown hook called.
I am trying to test Zeppelin 0.6.2 with Spark 2.0.1 installed on Windows Server 2012.
I started the Spark master and tested the Spark Shell.
Then I configured the following in the conf\zeppelin-env.cmd file:
set SPARK_HOME=C:\spark-2.0.1-bin-hadoop2.7
set MASTER=spark://100.79.240.26:7077
I have not set HADOOP_CONF_DIR or SPARK_SUBMIT_OPTIONS (which is optional according to the documentation).
I checked the values in the Interpreter configuration page, and the Spark master is OK.
When I run the Zeppelin tutorial's "Load data into table" note, I get a connection refused error. Here is part of the error log:
INFO [2016-11-17 21:58:12,518] ({pool-1-thread-11} Paragraph.java[jobRun]:252) - run paragraph 20150210-015259_1403135953 using null org.apache.zeppelin.interpreter.LazyOpenInterpreter#8bbfd7
INFO [2016-11-17 21:58:12,518] ({pool-1-thread-11} RemoteInterpreterProcess.java[reference]:148) - Run interpreter process [C:\zeppelin-0.6.2-bin-all\bin\interpreter.cmd, -d, C:\zeppelin-0.6.2-bin-all\interpreter\spark, -p, 50163, -l, C:\zeppelin-0.6.2-bin-all/local-repo/2C3FBS414]
INFO [2016-11-17 21:58:12,614] ({Exec Default Executor} RemoteInterpreterProcess.java[onProcessFailed]:288) - Interpreter process failed {}
org.apache.commons.exec.ExecuteException: Process exited with an error: 255 (Exit value: 255)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
at java.lang.Thread.run(Thread.java:745)
ERROR [2016-11-17 21:58:43,846] ({Thread-49} RemoteScheduler.java[getStatus]:255) - Can't get status information
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:189)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:253)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.run(RemoteScheduler.java:211)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
... 8 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 9 more
ERROR [2016-11-17 21:58:43,846] ({pool-1-thread-11} Job.java[run]:189) - Job failed
org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:165)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:328)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:260)
at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:328)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
In the Zeppelin logs there is only one file, for Zeppelin itself; the interpreter is an external Spark installation, which is not logging any error because it is never reached by the interpreter process.
I read some suggestions about the JVM's max and min memory, but I have not been able to fix it yet.
Any comment will be appreciated.
Paul
I have a Hortonworks Sandbox 2.4 with Spark 1.6 set up. I then created an IntelliJ Spark development environment on Windows using the HDP Spark jar and Scala 2.10.5, so both the Spark and Scala versions match between my Windows and HDP environments, as indicated here. My IntelliJ dev environment works with local as the master.
Then I try to connect to HDP from Windows using:
val sparkConf = new SparkConf()
  .setAppName("spark-word-count")
  .setMaster("spark://10.33.241.160:7077")
I get the error below and have no clue how to resolve it. Please help!
16/03/21 16:27:40 INFO SparkUI: Started SparkUI at http://10.33.240.126:4040
16/03/21 16:27:40 INFO AppClient$ClientEndpoint: Connecting to master spark://10.33.241.160:7077...
16/03/21 16:27:41 WARN AppClient$ClientEndpoint: Failed to connect to master 10.33.241.160:7077
java.io.IOException: Failed to connect to /10.33.241.160:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.ConnectException: Connection refused: no further information: /10.33.241.160:7077
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
16/03/21 16:28:40 ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1325)
at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:110)
at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:120)
at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:462)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:93)
at org.apache.spark.SparkContext$$anonfun$stop$12.apply$mcV$sp(SparkContext.scala:1756)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1755)
at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.dead(SparkDeploySchedulerBackend.scala:127)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint.markDead(AppClient.scala:264)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:134)
at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1163)
at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:129)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
It turns out I need to set up my Hortonworks Spark as the master server every time the server restarts, and then use my IntelliJ dev environment to connect with HDP as a slave. Just run ./sbin/start-master.sh on HDP, as in this link.