I am using hazelcast 3.8.3. Occasionally, I have Operation TimeoutException (GetOperation). I have 4 nodes in the same Hz cluster.
Application throws error
..
Caused by: java.util.concurrent.ExecutionException: com.hazelcast.core.OperationTimeoutException: GetOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2020-04-23 19:17:37.414. Start time: 2020-04-23 19:14:16.456. Total elapsed time: 200958 ms. Last operation heartbeat: never. Last operation heartbeat from member: 2020-04-23 19:16:59.335. Invocation{op=com.hazelcast.map.impl.operation.GetOperation{serviceName='hz:impl:mapService', identityHash=300649867, partitionId=30, replicaIndex=0, callId=-10707, invocationTime=1587665497325 (2020-04-23 19:11:37.325), waitTimeout=-1, callTimeout=180000, name=ENRICHMENT-CACHE-DEFAULT}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=180000, firstInvocationTimeMs=1587665656456, firstInvocationTime='2020-04-23 19:14:16.456', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 01:00:00.000', target=[10.195.40.212]:5701, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=Connection[id=56, /10.195.40.210:5702->/10.195.40.212:60231, endpoint=[10.195.40.212]:5701, alive=true, type=MEMBER]}
at java.util.concurrent.FutureTask.report(Unknown Source)
at java.util.concurrent.FutureTask.get(Unknown Source)
at com.experian.eda.ea.core.enrichmentpack.MultiThreadEnrichmentPackManager.evaluateAllPreEnrichmentRulesWithSizePopulator(MultiThreadEnrichmentPackManager.java:123)
... 90 common frames omitted
Caused by: com.hazelcast.core.OperationTimeoutException: GetOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2020-04-23 19:17:37.414. Start time: 2020-04-23 19:14:16.456. Total elapsed time: 200958 ms. Last operation heartbeat: never. Last operation heartbeat from member: 2020-04-23 19:16:59.335. Invocation{op=com.hazelcast.map.impl.operation.GetOperation{serviceName='hz:impl:mapService', identityHash=300649867, partitionId=30, replicaIndex=0, callId=-10707, invocationTime=1587665497325 (2020-04-23 19:11:37.325), waitTimeout=-1, callTimeout=180000, name=ENRICHMENT-CACHE-DEFAULT}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=180000, firstInvocationTimeMs=1587665656456, firstInvocationTime='2020-04-23 19:14:16.456', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 01:00:00.000', target=[10.195.40.212]:5701, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=Connection[id=56, /10.195.40.210:5702->/10.195.40.212:60231, endpoint=[10.195.40.212]:5701, alive=true, type=MEMBER]}
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.newOperationTimeoutException(InvocationFuture.java:151)
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolve(InvocationFuture.java:101)
at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveAndThrowIfException(InvocationFuture.java:75)
at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:155)
at com.hazelcast.map.impl.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:373)
at com.hazelcast.map.impl.proxy.MapProxySupport.getInternal(MapProxySupport.java:304)
at com.hazelcast.map.impl.proxy.MapProxyImpl.get(MapProxyImpl.java:114)
Here is the setting I use.
<map name="ENRICHMENT-CACHE-*">
<in-memory-format>BINARY</in-memory-format>
<backup-count>0</backup-count>
<max-idle-seconds>600</max-idle-seconds>
<eviction-policy>LRU</eviction-policy>
<max-size policy="PER_NODE">5000</max-size>
<eviction-percentage>25</eviction-percentage>
</map>
<properties>
<property name="hazelcast.logging.type">slf4j</property>
<property name="hazelcast.client.invocation.timeout.seconds">360</property>
<property name="hazelcast.operation.call.timeout.millis">180000</property>
</properties>
I see the code is written this way.
#Override
public final V get() throws InterruptedException, ExecutionException {
Object response = registerWaiter(Thread.currentThread(), null);
if (response != VOID) {
// no registration was done since a value is available.
return resolveAndThrowIfException(response);
}
boolean interrupted = false;
try {
for (; ; ) {
park();
if (isDone()) {
return resolveAndThrowIfException(state);
} else if (Thread.interrupted()) {
interrupted = true;
onInterruptDetected();
}
}
} finally {
restoreInterrupt(interrupted);
}
}
The current thread is parked, waiting for interruption. Could it be the heartbeat timeout exception interrupted the thread, and woke it in an errorneous state?
If yes, would increment of the hazelcast.max.no.heartbeat.seconds help?
If this is not the case, do you have any idea why the error occured & how to solve it?
Thank you.
Regards,
Khairul
Related
I run cassandra benchmarks ReadWriteTest with idea on centos7
`
package org.apache.cassandra.test.microbench;
#BenchmarkMode(Mode.Throughput)
#OutputTimeUnit(TimeUnit.MILLISECONDS)
#Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
#Measurement(iterations = 5, time = 2, timeUnit = TimeUnit.SECONDS)
#Fork(value = 1)
#Threads(1)
#State(Scope.Benchmark)
public class ReadWriteTest extends CQLTester
{
static String keyspace;
String table;
String writeStatement;
String readStatement;
long numRows = 0;
ColumnFamilyStore cfs;
#Setup(Level.Trial)
public void setup() throws Throwable
{
CQLTester.setUpClass();
keyspace = createKeyspace("CREATE KEYSPACE %s with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 } and durable_writes = false");
table = createTable(keyspace, "CREATE TABLE %s ( userid bigint, picid bigint, commentid bigint, PRIMARY KEY(userid, picid))");
execute("use "+keyspace+";");
writeStatement = "INSERT INTO "+table+"(userid,picid,commentid)VALUES(?,?,?)";
readStatement = "SELECT * from "+table+" limit 100";
cfs = Keyspace.open(keyspace).getColumnFamilyStore(table);
cfs.disableAutoCompaction();
//Warm up
System.err.println("Writing 50k");
for (long i = 0; i < 5000; i++)
execute(writeStatement, i, i, i );
}
#TearDown(Level.Trial)
public void teardown() throws IOException, ExecutionException, InterruptedException
{
CQLTester.cleanup();
}
#Benchmark
public Object write() throws Throwable
{
numRows++;
return execute(writeStatement, numRows, numRows, numRows );
}
#Benchmark
public Object read() throws Throwable
{
return execute(readStatement);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(ReadWriteTest.class.getSimpleName())
.build();
new Runner(opt).run();
}
}
it seems to be successful at start, "java.lang.IndexOutOfBoundsException: Index: 0, Size: 0" occurs like this
01:37:25.884 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG org.apache.cassandra.db.ReadCommand - --in queryMemtableAndDiskInternal, partitionKey token:-2532556674411782010, partitionKey:DecoratedKey(-2532556674411782010, 6b657973706163655f30)
01:37:25.888 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] INFO o.a.cassandra.db.ColumnFamilyStore - Initializing keyspace_0.globalReplicaTable
01:37:25.889 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG o.a.cassandra.db.DiskBoundaryManager - Refreshing disk boundary cache for keyspace_0.globalReplicaTable
01:37:25.890 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG o.a.cassandra.db.DiskBoundaryManager - Got local ranges [] (ringVersion = 0)
01:37:25.890 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG o.a.cassandra.db.DiskBoundaryManager - Updating boundaries from null to DiskBoundaries{directories=[DataDirectory{location=/home/cjx/Downloads/depart0/depart/data/data}], positions=null, ringVersion=0, directoriesVersion=0} for keyspace_0.globalReplicaTable
01:37:25.890 [org.apache.cassandra.test.microbench.ReadWriteTest.read-jmh-worker-1] DEBUG org.apache.cassandra.config.Schema - Adding org.apache.cassandra.config.CFMetaData#44380385[cfId=bfe68a70-569b-11ed-974a-eb765e7c415c,ksName=keyspace_0,cfName=globalReplicaTable,flags=[COMPOUND],params=TableParams{comment=, read_repair_chance=0.0, dclocal_read_repair_chance=0.1, bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=864000, default_time_to_live=0, memtable_flush_period_in_ms=0, min_index_interval=128, max_index_interval=2048, speculative_retry=99PERCENTILE, caching={'keys' : 'ALL', 'rows_per_partition' : 'NONE'}, compaction=CompactionParams{class=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, options={max_threshold=32, min_threshold=4}}, compression=org.apache.cassandra.schema.CompressionParams#920394f7, extensions={}, cdc=false},comparator=comparator(org.apache.cassandra.db.marshal.LongType),partitionColumns=[[] | [commentid]],partitionKeyColumns=[userid],clusteringColumns=[picid],keyValidator=org.apache.cassandra.db.marshal.LongType,columnMetadata=[userid, picid, commentid],droppedColumns={},triggers=[],indexes=[]] to cfIdMap
Writing 50k
<failure>
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:659)
at java.util.ArrayList.get(ArrayList.java:435)
at org.apache.cassandra.locator.TokenMetadata.firstToken(TokenMetadata.java:1079)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:107)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:4066)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:619)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:227)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:232)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:241)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:587)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:581)
at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:315)
at org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:792)
at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:780)
at org.apache.cassandra.test.microbench.ReadWriteTest.setup(ReadWriteTest.java:94)
at org.apache.cassandra.test.microbench.generated.ReadWriteTest_read_jmhTest._jmh_tryInit_f_readwritetest0_G(ReadWriteTest_read_jmhTest.java:438)
at org.apache.cassandra.test.microbench.generated.ReadWriteTest_read_jmhTest.read_Throughput(ReadWriteTest_read_jmhTest.java:71)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453)
at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
`
then I use ant microbench -Dbenchmark.name=ReadWriteTest to run it, also it leads to the same error.
Any and all help would be appreciated.
Thanks.
of course I run the microbenchs with cassandra running, I also meet mistake when I shut down cassandra
`
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:28,491 Server.java:176 - Stop listening for CQL clients
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:28,491 Gossiper.java:1551 - Announcing shutdown
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:28,492 StorageService.java:2454 - Node /192.168.199.135 state jump to shutdown
INFO [StorageServiceShutdownHook] 2022-10-28 00:58:30,495 MessagingService.java:981 - Waiting for messaging service to quiesce
INFO [ACCEPT-/192.168.199.135] 2022-10-28 00:58:30,495 MessagingService.java:1336 - MessagingService has terminated the accept() thread
WARN [MemtableFlushWriter:3] 2022-10-28 00:58:30,498 NativeLibrary.java:304 - open(/home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090, O_RDONLY) failed, errno (2).
ERROR [MemtableFlushWriter:3] 2022-10-28 00:58:30,500 LogTransaction.java:272 - Transaction log [md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log in /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090] indicates txn was not completed, trying to abort it now
ERROR [MemtableFlushWriter:3] 2022-10-28 00:58:30,505 LogTransaction.java:275 - Failed to abort transaction log [md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log in /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090]
java.lang.RuntimeException: java.nio.file.NoSuchFileException: /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090/md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:590) ~[main/:na]
at org.apache.cassandra.io.util.FileUtils.appendAndSync(FileUtils.java:571) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogReplica.append(LogReplica.java:85) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogReplicaSet.lambda$null$5(LogReplicaSet.java:210) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:113) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:103) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogReplicaSet.append(LogReplicaSet.java:210) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogFile.addRecord(LogFile.java:338) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:255) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:113) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:103) ~[main/:na]
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:98) ~[main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction$TransactionTidier.run(LogTransaction.java:273) [main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction$TransactionTidier.tidy(LogTransaction.java:257) [main/:na]
at org.apache.cassandra.utils.concurrent.Ref$GlobalState.release(Ref.java:322) [main/:na]
at org.apache.cassandra.utils.concurrent.Ref$State.ensureReleased(Ref.java:200) [main/:na]
at org.apache.cassandra.utils.concurrent.Ref.ensureReleased(Ref.java:120) [main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction.complete(LogTransaction.java:392) [main/:na]
at org.apache.cassandra.db.lifecycle.LogTransaction.doAbort(LogTransaction.java:409) [main/:na]
at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) [main/:na]
at org.apache.cassandra.db.lifecycle.LifecycleTransaction.doAbort(LifecycleTransaction.java:244) [main/:na]
at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) [main/:na]
at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1234) [main/:na]
at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1180) [main/:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_332]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_332]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [main/:na]
at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_332]
Caused by: java.nio.file.NoSuchFileException: /home/cjx/Downloads/depart0/depart/data/data/system_auth/roles-5bc52802de2535edaeab188eecebb090/md_txn_flush_4ff62d10-5696-11ed-80e2-25f08a3965a3.log
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[na:1.8.0_332]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[na:1.8.0_332]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[na:1.8.0_332]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[na:1.8.0_332]
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434) ~[na:1.8.0_332]
at java.nio.file.Files.newOutputStream(Files.java:216) ~[na:1.8.0_332]
at java.nio.file.Files.write(Files.java:3351) ~[na:1.8.0_332]
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:583) ~[main/:na]
... 27 common frames omitted
`
it's so wired and I could not know how to fix it
I encountered a problem when connecting to a NAS shared directory using spring-integration-smb.
The problem is that I was able to connect to another shared Nas directory but for the pre-prod Nas, I encountered this problem.
Also, the shared server administrator confirmed that both directories have the same configuration.
You will find below the stack encountered
07 mars 2022;14:49:50.702 [scheduling-1] WARN jcifs.smb.SmbTransportImpl - Disconnecting transport while still in use Transport12[NAS03/XXXXXXXX:445,state=5,signingEnforced=false,usage=1]: [SmbSession[credentials=XXXXXXXXXX,targetHost=nas03,targetDomain=null,uid=0,connectionState=2,usage=1]]
07 mars 2022;14:49:50.702 [scheduling-1] WARN jcifs.smb.SmbSessionImpl - Logging off session while still in use SmbSession[credentials=XXXXXXXXX,targetHost=nas03,targetDomain=null,uid=0,connectionState=3,usage=1]:[SmbTree[share=PPD,service=null,tid=4,inDfs=false,inDomainDfs=false,connectionState=0,usage=2]]
07 mars 2022;14:49:50.737 [scheduling-1] ERROR o.s.i.handler.LoggingHandler - org.springframework.messaging.MessagingException: Problem occurred while synchronizing '' to local directory; nested exception is org.springframework.messaging.MessagingException: Failure occurred while copying '/test.csv' from the remote to the local directory; nested exception is org.springframework.core.NestedIOException: Failed to read resource [/test.csv].; nested exception is jcifs.smb.SmbException: The parameter is incorrect.
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.synchronizeToLocalDirectory(AbstractInboundFileSynchronizer.java:348)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizingMessageSource.doReceive(AbstractInboundFileSynchronizingMessageSource.java:267)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizingMessageSource.doReceive(AbstractInboundFileSynchronizingMessageSource.java:69)
at org.springframework.integration.endpoint.AbstractFetchLimitingMessageSource.doReceive(AbstractFetchLimitingMessageSource.java:47)
at org.springframework.integration.endpoint.AbstractMessageSource.receive(AbstractMessageSource.java:142)
at org.springframework.integration.endpoint.SourcePollingChannelAdapter.receiveMessage(SourcePollingChannelAdapter.java:212)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.doPoll(AbstractPollingEndpoint.java:444)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.pollForMessage(AbstractPollingEndpoint.java:413)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$createPoller$4(AbstractPollingEndpoint.java:348)
at org.springframework.integration.util.ErrorHandlingTaskExecutor.lambda$execute$0(ErrorHandlingTaskExecutor.java:57)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
at org.springframework.integration.util.ErrorHandlingTaskExecutor.execute(ErrorHandlingTaskExecutor.java:55)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$createPoller$5(AbstractPollingEndpoint.java:341)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:95)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.springframework.messaging.MessagingException: Failure occurred while copying '/BE1_2_MOUVEMENTS_Valorisation_20211231_20220218_164451.csv' from the remote to the local directory; nested exception is org.springframework.core.NestedIOException: Failed to read resource [/BE1_2_MOUVEMENTS_Valorisation_20211231_20220218_164451.csv].; nested exception is jcifs.smb.SmbException: The parameter is incorrect.
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.copyRemoteContentToLocalFile(AbstractInboundFileSynchronizer.java:551)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.copyFileToLocalDirectory(AbstractInboundFileSynchronizer.java:488)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.copyIfNotNull(AbstractInboundFileSynchronizer.java:403)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.transferFilesFromRemoteToLocal(AbstractInboundFileSynchronizer.java:386)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.lambda$synchronizeToLocalDirectory$0(AbstractInboundFileSynchronizer.java:342)
at org.springframework.integration.file.remote.RemoteFileTemplate.execute(RemoteFileTemplate.java:452)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.synchronizeToLocalDirectory(AbstractInboundFileSynchronizer.java:341)
... 21 more
Caused by: org.springframework.core.NestedIOException: Failed to read resource [/BE1_2_MOUVEMENTS_Valorisation_20211231_20220218_164451.csv].; nested exception is jcifs.smb.SmbException: The parameter is incorrect.
at org.springframework.integration.smb.session.SmbSession.read(SmbSession.java:188)
at org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.copyRemoteContentToLocalFile(AbstractInboundFileSynchronizer.java:545)
... 27 more
Caused by: jcifs.smb.SmbException: The parameter is incorrect.
at jcifs.smb.SmbTransportImpl.checkStatus2(SmbTransportImpl.java:1467)
at jcifs.smb.SmbTransportImpl.checkStatus(SmbTransportImpl.java:1578)
at jcifs.smb.SmbTransportImpl.sendrecv(SmbTransportImpl.java:1027)
at jcifs.smb.SmbTransportImpl.send(SmbTransportImpl.java:1549)
at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:409)
at jcifs.smb.SmbTreeImpl.send(SmbTreeImpl.java:472)
at jcifs.smb.SmbTreeConnection.send0(SmbTreeConnection.java:404)
at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:318)
at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:298)
at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:130)
at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:117)
at jcifs.smb.SmbFile.withOpen(SmbFile.java:1775)
at jcifs.smb.SmbFile.withOpen(SmbFile.java:1744)
at jcifs.smb.SmbFile.queryPath(SmbFile.java:793)
at jcifs.smb.SmbFile.exists(SmbFile.java:879)
at jcifs.smb.SmbFile.isFile(SmbFile.java:1102)
at org.springframework.integration.smb.session.SmbSession.read(SmbSession.java:182)
... 28 more
here is my code :
#Bean
public SmbSessionFactory smbSessionFactory() {
VaultResponse vaultResponse = vaultTemplate
.opsForKeyValue(vaultPath, VaultKeyValueOperationsSupport.KeyValueBackend.KV_2).get(vaultSecretsPath.toLowerCase());
SmbSessionFactory smbSession = new SmbSessionFactory();
smbSession.setHost(properties.getNasHost());
smbSession.setPort(properties.getNasPort());
smbSession.setDomain(properties.getNasDomain());
if (vaultResponse != null) {
Map<String, Object> data = vaultResponse.getData();
smbSession.setUsername(data != null && data.get("nasUsername") != null ? (String) data.get("nasUsername") : "");
smbSession.setPassword(data != null && data.get("nasPassword") != null ? (String) data.get("nasPassword") : "");
}
smbSession.setShareAndDir(properties.getNasShareAndDir());
smbSession.setReplaceFile(true);
smbSession.setSmbMinVersion(DialectVersion.SMB1);
smbSession.setSmbMaxVersion(DialectVersion.SMB311);
return smbSession;
}
thank you in advance,
my spark job;
val df = spark.sql(s"""""")
df.write.mode("append").json("hdfs://xxx-nn-ha/user/b_me/df")
Error:
2019-01-31 19:56:36 INFO CoarseGrainedExecutorBackend:54 - Driver commanded a shutdown
2019-01-31 19:56:36 INFO MemoryStore:54 - MemoryStore cleared
2019-01-31 19:56:36 INFO BlockManager:54 - BlockManager stopped
2019-01-31 19:56:36 INFO ShutdownHookManager:54 - Shutdown hook called
2019-01-31 19:56:36 ERROR Executor:91 - Exception in task 4120.0 in stage 5.0 (TID 5402)
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:868)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:735)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2178)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2585)
at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:277)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:214)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage12.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2019-01-31 19:56:36 INFO Executor:54 - Not reporting error to driver during JVM shutdown.
End of LogType:stdout
I got the java.io.IOException: Filesystem closed error. Why? I have no hints. Any ideas welcomed. Thanks
UPDATE
There is a WARN:
2019-02-01 12:01:39 INFO YarnAllocator:54 - Driver requested a total number of 2007 executor(s).
2019-02-01 12:01:39 INFO ExecutorAllocationManager:54 - Requesting 968 new executors because tasks are backlogged (new desired total will be 2007)
2019-02-01 12:01:39 INFO ExecutorAllocationManager:54 - New executor 25 has registered (new total is 26)
2019-02-01 12:01:39 WARN ApplicationMaster:87 - Reporter thread fails 1 time(s) in a row.
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Too many containers asked, 1365198
at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:128)
at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:511)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1714)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2202)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy22.allocate(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:277)
at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:268)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:556)
If you look closer at DFSClient.checkOpen code you will see the following:
void checkOpen() throws IOException {
if (!clientRunning) {
IOException result = new IOException("Filesystem closed");
throw result;
}
}
Let's find all accessors of clientRunning field.
Only close method really changes it. Lets take a look at it:
#Override
public synchronized void close() throws IOException {
try {
if(clientRunning) {
closeAllFilesBeingWritten(false);
clientRunning = false;
getLeaseRenewer().closeClient(this);
// close connections to the namenode
closeConnectionToNamenode();
}
} finally {
if (provider != null) {
provider.close();
}
}
}
So the main problem in your job is that it tries to write the value, although the FS is already closed.
Make sure you don't close your FS before you do any job. You can also increase the logging level to find the cause.
Trying to log channel error and using a retry policy, i log two exception stacktrace instead of only one.
Channel configuration :
<bean id="retryInterceptor" class="org.springframework.amqp.rabbit.config.StatelessRetryOperationsInterceptorFactoryBean">
<property name="messageRecoverer" ref="rejectAndDontRequeueRecoverer"/>
<property name="retryOperations" ref="retryTemplate" />
</bean>
<bean id="rejectAndDontRequeueRecoverer" class="org.springframework.amqp.rabbit.retry.RejectAndDontRequeueRecoverer"/>
<!-- Configuration du retry -->
<bean id="retryTemplate" class="org.springframework.retry.support.RetryTemplate">
<property name="backOffPolicy">
<bean class="org.springframework.retry.backoff.ExponentialBackOffPolicy">
<!-- Intervalle de temps en millisecondes entre deux tentatives -->
<property name="initialInterval" value="2000" />
</bean>
</property>
<property name="retryPolicy">
<bean class="org.springframework.retry.policy.SimpleRetryPolicy">
<!-- Nombre de retry max -->
<property name="maxAttempts" value="3" />
</bean>
</property>
</bean>
<bean id="log" class="com.logger.Logger"/>
<!-- Logger -->
<int:channel id="one" ></int:channel>
<int:channel id="error1" ></int:channel>
<int:publish-subscribe-channel id="processChannel1" />
<int:logging-channel-adapter channel="processChannel1" logger-name="log1" level="ERROR" expression="payload.message"/>
<int:service-activator input-channel="processChannel1" ref="log" output-channel="error1" />
<!-- Source -->
<rabbit:connection-factory id="connectionFactory" username="guest" password="guest" addresses="XX.XX.XX.XX:5672" cache-mode="CONNECTION" virtual-host="/" />
<int-amqp:inbound-channel-adapter channel="one" id="inboundChannelAdapter1" queue-names="myqueue" connection-factory="connectionFactory" error-channel="processChannel1" channel-transacted="true" advice-chain="retryInterceptor" />
<!-- Destination -->
<rabbit:connection-factory id="connectionFactoryRmqDest" username="guest" password="guest111" addresses="YY.YY.YY.YY:5672" cache-mode="CONNECTION" connection-cache-size="50" virtual-host="/" />
<rabbit:template id="rabbitTemplateRmqDest" connection-factory="connectionFactoryRmqDest" />
<int-amqp:outbound-channel-adapter channel="one" id="outboundChannelAdapter1" routing-key="keyMyQueue" exchange-name="direct.exchange" amqp-template="rabbitTemplateRmqDest" default-delivery-mode="PERSISTENT"/>
Logger.java
package com.logger;
public class Logger {
public Message<?> log(Message<?> message) {
System.out.println("*************************************");
throw new AmqpException("Error on channel");
}
}
Log file :
2018-03-12 16:34:43.008 [(inner bean)#160e9f2e-1] ERROR log1 - error occurred in message handler [org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint#0]; nested exception is org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
2018-03-12 16:34:43.021 [(inner bean)#160e9f2e-1] ERROR log1 - Message conversion failed
2018-03-12 16:34:45.181 [(inner bean)#160e9f2e-1] ERROR log1 - error occurred in message handler [org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint#0]; nested exception is org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
2018-03-12 16:34:45.182 [(inner bean)#160e9f2e-1] ERROR log1 - Message conversion failed
2018-03-12 16:34:49.345 [(inner bean)#160e9f2e-1] ERROR log1 - error occurred in message handler [org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint#0]; nested exception is org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
2018-03-12 16:34:49.346 [(inner bean)#160e9f2e-1] ERROR log1 - Message conversion failed
Removing the expression attributes, stack traces are (just one attempt on the 3 retries):
2018-03-12 17:24:03.303 [(inner bean)#76e1e924-1] ERROR log1 - org.springframework.messaging.MessageHandlingException: error occurred in message handler [org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint#0]; nested exception is org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile., failedMessage=GenericMessage [payload=byte[20], headers={timestamp=1520871843104, id=7c9f713f-90a1-2720-60e4-a0e33ef98ecb, amqp_receivedRoutingKey=myqueue, amqp_consumerQueue=myqueue, amqp_consumerTag=amq.ctag-ENdQ-j6xD1LRXDxGr_iJuQ, amqp_receivedDeliveryMode=NON_PERSISTENT, amqp_redelivered=false, amqp_deliveryTag=1}]
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:139)
at org.springframework.integration.dispatcher.AbstractDispatcher.tryOptimizedDispatch(AbstractDispatcher.java:116)
at org.springframework.integration.dispatcher.UnicastingDispatcher.doDispatch(UnicastingDispatcher.java:148)
at org.springframework.integration.dispatcher.UnicastingDispatcher.dispatch(UnicastingDispatcher.java:121)
at org.springframework.integration.channel.AbstractSubscribableChannel.doSend(AbstractSubscribableChannel.java:89)
at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:425)
at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:375)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:115)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:45)
at org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:105)
at org.springframework.integration.endpoint.MessageProducerSupport.sendMessage(MessageProducerSupport.java:188)
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter.access$1100(AmqpInboundChannelAdapter.java:56)
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.processMessage(AmqpInboundChannelAdapter.java:246)
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.onMessage(AmqpInboundChannelAdapter.java:203)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:848)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:771)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$001(SimpleMessageListenerContainer.java:102)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$1.invokeListener(SimpleMessageListenerContainer.java:198)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.retry.interceptor.RetryOperationsInterceptor$1.doWithRetry(RetryOperationsInterceptor.java:74)
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:276)
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:172)
at org.springframework.retry.interceptor.RetryOperationsInterceptor.invoke(RetryOperationsInterceptor.java:98)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy9.invokeListener(Unknown Source)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.invokeListener(SimpleMessageListenerContainer.java:1311)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:752)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:1254)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:1224)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$1600(SimpleMessageListenerContainer.java:102)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1470)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
at org.springframework.amqp.rabbit.support.RabbitExceptionTranslator.convertRabbitAccessException(RabbitExceptionTranslator.java:65)
at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:368)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:603)
at org.springframework.amqp.rabbit.core.RabbitTemplate.doExecute(RabbitTemplate.java:1430)
at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:1411)
at org.springframework.amqp.rabbit.core.RabbitTemplate.send(RabbitTemplate.java:712)
at org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint.send(AmqpOutboundEndpoint.java:134)
at org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint.handleRequestMessage(AmqpOutboundEndpoint.java:122)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:109)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
... 38 more
Caused by: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:342)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:909)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:859)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:799)
at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:352)
... 46 more
2018-03-12 17:24:03.314 [(inner bean)#76e1e924-1] ERROR log1 - org.springframework.amqp.rabbit.listener.exception.ListenerExecutionFailedException: Message conversion failed
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.onMessage(AmqpInboundChannelAdapter.java:223)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:848)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:771)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$001(SimpleMessageListenerContainer.java:102)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$1.invokeListener(SimpleMessageListenerContainer.java:198)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.retry.interceptor.RetryOperationsInterceptor$1.doWithRetry(RetryOperationsInterceptor.java:74)
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:276)
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:172)
at org.springframework.retry.interceptor.RetryOperationsInterceptor.invoke(RetryOperationsInterceptor.java:98)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy9.invokeListener(Unknown Source)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.invokeListener(SimpleMessageListenerContainer.java:1311)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:752)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:1254)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:1224)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$1600(SimpleMessageListenerContainer.java:102)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1470)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.messaging.MessageHandlingException: nested exception is org.springframework.amqp.AmqpException: Erreur sur le channel, failedMessage=EnhancedErrorMessage [payload=org.springframework.messaging.MessageHandlingException: error occurred in message handler [org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint#0]; nested exception is org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile., failedMessage=GenericMessage [payload=byte[20], headers={timestamp=1520871843104, id=7c9f713f-90a1-2720-60e4-a0e33ef98ecb, amqp_receivedRoutingKey=myqueue, amqp_consumerQueue=myqueue, amqp_consumerTag=amq.ctag-ENdQ-j6xD1LRXDxGr_iJuQ, amqp_receivedDeliveryMode=NON_PERSISTENT, amqp_redelivered=false, amqp_deliveryTag=1}], headers={timestamp=1520871843291, id=dbde6a9f-3c26-086f-53a1-7c98482efcce, amqp_raw_message=(Body:'[B#56360d58(byte[20])' MessageProperties [headers={}, timestamp=null, messageId=null, userId=null, receivedUserId=null, appId=null, clusterId=null, type=null, correlationId=null, correlationIdString=null, replyTo=null, contentType=null, contentEncoding=null, contentLength=0, deliveryMode=null, receivedDeliveryMode=NON_PERSISTENT, expiration=null, priority=null, redelivered=false, receivedExchange=, receivedRoutingKey=myqueue, receivedDelay=null, deliveryTag=1, messageCount=0, consumerTag=amq.ctag-ENdQ-j6xD1LRXDxGr_iJuQ, consumerQueue=myqueue])}] for original GenericMessage [payload=byte[20], headers={timestamp=1520871843104, id=7c9f713f-90a1-2720-60e4-a0e33ef98ecb, amqp_receivedRoutingKey=myqueue, amqp_consumerQueue=myqueue, amqp_consumerTag=amq.ctag-ENdQ-j6xD1LRXDxGr_iJuQ, amqp_receivedDeliveryMode=NON_PERSISTENT, amqp_redelivered=false, amqp_deliveryTag=1}]
at org.springframework.integration.handler.MethodInvokingMessageProcessor.processMessage(MethodInvokingMessageProcessor.java:96)
at org.springframework.integration.handler.ServiceActivatingHandler.handleRequestMessage(ServiceActivatingHandler.java:89)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:109)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
at org.springframework.integration.dispatcher.BroadcastingDispatcher.invokeHandler(BroadcastingDispatcher.java:236)
at org.springframework.integration.dispatcher.BroadcastingDispatcher.dispatch(BroadcastingDispatcher.java:185)
at org.springframework.integration.channel.AbstractSubscribableChannel.doSend(AbstractSubscribableChannel.java:89)
at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:425)
at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:375)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:115)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:45)
at org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:105)
at org.springframework.integration.endpoint.MessageProducerSupport.sendErrorMessageIfNecessary(MessageProducerSupport.java:207)
at org.springframework.integration.endpoint.MessageProducerSupport.sendMessage(MessageProducerSupport.java:191)
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter.access$1100(AmqpInboundChannelAdapter.java:56)
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.processMessage(AmqpInboundChannelAdapter.java:246)
at org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.onMessage(AmqpInboundChannelAdapter.java:203)
... 25 more
Caused by: org.springframework.amqp.AmqpException: Erreur sur le channel
at com.logger.Logger.log(Logger.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.expression.spel.support.ReflectiveMethodExecutor.execute(ReflectiveMethodExecutor.java:117)
at org.springframework.expression.spel.ast.MethodReference.getValueInternal(MethodReference.java:129)
at org.springframework.expression.spel.ast.MethodReference.access$000(MethodReference.java:49)
at org.springframework.expression.spel.ast.MethodReference$MethodValueRef.getValue(MethodReference.java:347)
at org.springframework.expression.spel.ast.CompoundExpression.getValueInternal(CompoundExpression.java:88)
at org.springframework.expression.spel.ast.SpelNodeImpl.getTypedValue(SpelNodeImpl.java:132)
at org.springframework.expression.spel.standard.SpelExpression.getValue(SpelExpression.java:360)
at org.springframework.integration.util.AbstractExpressionEvaluator.evaluateExpression(AbstractExpressionEvaluator.java:169)
at org.springframework.integration.util.MessagingMethodInvokerHelper.processInternal(MessagingMethodInvokerHelper.java:319)
at org.springframework.integration.util.MessagingMethodInvokerHelper.process(MessagingMethodInvokerHelper.java:155)
at org.springframework.integration.handler.MethodInvokingMessageProcessor.processMessage(MethodInvokingMessageProcessor.java:93)
... 41 more
In the log file i have 3 attemps for transfer the message. => Retry is ok
I would like to have only one stacktrace? What the mean of the "Message conversion failed"? How can i prevent it?
Is there a way to test if the cause of an exception is empty or null? I would like to use "payload.cause.message" in the expression?
Thanks for your help,
Regards
Eric
The first log message comes from the <int:logging-channel-adapter> just because you use an error-channel="processChannel1" on the <int-amqp:outbound-channel-adapter>.
The second one comes from the AbstractMessageListenerContainer because your Logger doesn't log message, but does exactly this:
throw new AmqpException("Error on channel");
So, when we do in the AmqpInboundChannelAdapter.Listener this:
getMessagingTemplate().send(getErrorChannel(), buildErrorMessage(null,
new ListenerExecutionFailedException("Message conversion failed", e, message)));
Send error message to the error channel, you just re-throw an exception from the to the top-level AMQP listener and that one logs it one more time.
You should consider do not re-throw an exception from your Logger component, or don't use use one more <int:logging-channel-adapter> subscriber, or just don't use error-channel approach at all!
For the payload.cause.message expression you should use a SpEL ternary operator : payload.cause != null ? payload.cause.message : payload.message
UPDATE
OK. I see what's going on.
We send an exception to the errorChannel during MessageProducerSupport.sendMessage() process:
try {
this.messagingTemplate.send(getOutputChannel(), message);
}
catch (RuntimeException e) {
if (!sendErrorMessageIfNecessary(message, e)) {
throw e;
}
}
And since you have one more subscriber to throw a fresh exception (or just re-throw), that sendErrorMessageIfNecessary() finishes with a bubbled exception.
This one come to the catch (RuntimeException e) { of the AmqpInboundChannelAdapter.Listener.onMessage() like mentioned before. And a new ErrorMessage is sent to the errorChannel. That's how you see that one more logged message. But since the second subscriber re-throws an exception, that's how it comes back to the listener container and initiate retries. And that's how you see a stack trace only once, after all the retry attempts. However I can guess that it is already a wrong stack trace, because you would like to see the real reason of the sending to the AMQP problem.
What I suggest here for you as a solution is like this:
No error-channel on the <int-amqp:inbound-channel-adapter>
No second subscriber to the processChannel1 - the existing <int:logging-channel-adapter> is enough.
You configure ExpressionEvaluatingRequestHandlerAdvice with the failureChannel to the processChannel1 and onFailureExpression to the #this
Use this ExpressionEvaluatingRequestHandlerAdvice as a <request-handler-advice-chain> reference for the <int-amqp:outbound-channel-adapter>
This way you'll get logs on each retry and according the trapException = false, a real exception will be re-thrown to the listener cotnainer for retry.
I have the following code:
StringSerializer ss = StringSerializer.get();
String cf = "TEST";
CassandraHostConfigurator conf = new CassandraHostConfigurator("localhost:9160");
conf.setCassandraThriftSocketTimeout(40000);
conf.setExhaustedPolicy(ExhaustedPolicy.WHEN_EXHAUSTED_BLOCK);
conf.setRetryDownedHostsDelayInSeconds(5);
conf.setRetryDownedHostsQueueSize(128);
conf.setRetryDownedHosts(true);
conf.setLoadBalancingPolicy(new LeastActiveBalancingPolicy());
String key = Long.toString(System.currentTimeMillis());
Cluster cluster = HFactory.getOrCreateCluster("TestCluster", conf);
Keyspace keyspace = HFactory.createKeyspace("TestCluster", cluster);
Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get()); int count = 0;
while (!"q".equals(new Scanner( System.in).next())) {
try{
mutator.insert(key, cf, HFactory.createColumn("column_" + count, "v_" + count, ss, ss));
count++;
} catch (Exception e) {
e.printStackTrace();
}
}
and I can write some values using it, but when I restart cassandra, it fails. Here is the log:
[15:11:07] INFO [CassandraHostRetryService ] Downed Host Retry service started with >queue size 128 and retry delay 5s
[15:11:07] INFO [JmxMonitor ] Registering JMX >me.prettyprint.cassandra.service_ASG:ServiceType=hector,MonitorType=hector
[15:11:17] ERROR [HThriftClient ] Could not flush transport (to be expected >if the pool is shutting down) in close for client: CassandraClient
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at >org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
at >me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:98)
at >me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:26)
at >me.prettyprint.cassandra.connection.HConnectionManager.closeClient(HConnectionManager.java:308)
at >me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:257)
at >me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
at com.app.App.main(App.java:40)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at >org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 9 more
[15:11:17] ERROR [HConnectionManager ] MARK HOST AS DOWN TRIGGERED for host >localhost(127.0.0.1):9160
[15:11:17] ERROR [HConnectionManager ] Pool state on shutdown: >:{localhost(127.0.0.1):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
[15:11:17] INFO [ConcurrentHClientPool ] Shutdown triggered on :{localhost(127.0.0.1):9160}
[15:11:17] INFO [ConcurrentHClientPool ] Shutdown complete on :{localhost(127.0.0.1):9160}
[15:11:17] INFO [CassandraHostRetryService ] Host detected as down was added to retry queue: localhost(127.0.0.1):9160
[15:11:17] WARN [HConnectionManager ] Could not fullfill request on this host CassandraClient
[15:11:17] WARN [HConnectionManager ] Exception:
me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at >me.prettyprint.cassandra.connection.client.HThriftClient.getCassandra(HThriftClient.java:82)
at >me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:236)
at >me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
at com.app.App.main(App.java:40)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:157)
at org.apache.cassandra.thrift.Cassandra$Client.send_set_keyspace(Cassandra.java:466)
at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:455)
at >me.prettyprint.cassandra.connection.client.HThriftClient.getCassandra(HThriftClient.java:78)
... 5 more
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at >org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 9 more
[15:11:17] INFO [HConnectionManager ] Client CassandraClient released to inactive or dead pool. Closing.
[15:11:17] INFO [HConnectionManager ] Client CassandraClient released to inactive or dead pool. Closing.
[15:11:17] INFO [HConnectionManager ] Added host localhost(127.0.0.1):9160 to pool
You have set -
conf.setRetryDownedHostsDelayInSeconds(5);
Try to to wait after the restart for more than 5 seconds.
Also, you may need to upgrade.
What is the size thrift_max_message_length_in_mb you have set?
Kind regards.