I've defined a JMS Inbound Channel Adapter as following in Spring Integration 3.0.1.RELEASE:
<int-jms:inbound-channel-adapter channel="inChannel" phase="1000"
destination-name="jmsQueue" extract-payload="true"
connection-factory="connectionFactory">
<int:poller max-messages-per-poll="1" fixed-rate="1000"/>
</int-jms:inbound-channel-adapter>
But multiple messages are consumed from the JMS broker at random non-reliable distances, which can go from several seconds to several minutes between each message being consumed. I've tried with fixed-delay instead of fixed-rate but it has the same behavior.
Which other factor could be making the poll operations be performed at different times, and how could I achieve reliable polling times?
EDIT:
I've confined the application to a single default poller, with a single JMS Inbound Channel Adapter (although there are some Message Driven Channel Adapters), and it's still having the same behaviour. I've twitched the wait times to fixed-delay of 3000 and a receive-timeout of 5000.
I've started the application with some messages on the JMS queue to be picked up, and the logs show entries like this, switching the thread of the task-scheduler after some callback operations as shown below:
2016-09-23 18:48:25,592 | DEBUG | ask-scheduler-1 | ework.integration.jms.DynamicJmsTemplate | Executing callback on JMS Session: ActiveMQSession {id=ID:v10033215-56491-1474666967074-0:18:1,started=true}
2016-09-23 18:48:25,630 | DEBUG | ask-scheduler-1 | ion.endpoint.SourcePollingChannelAdapter | Received no Message during the poll, returning 'false'
2016-09-23 18:48:28,639 | DEBUG | ask-scheduler-1 | ework.integration.jms.DynamicJmsTemplate | Executing callback on JMS Session: ActiveMQSession {id=ID:v10033215-56491-1474666967074-0:19:1,started=true}
2016-09-23 18:48:28,643 | DEBUG | ask-scheduler-1 | ion.endpoint.SourcePollingChannelAdapter | Received no Message during the poll, returning 'false'
2016-09-23 18:48:31,651 | DEBUG | ask-scheduler-3 | ework.integration.jms.DynamicJmsTemplate | Executing callback on JMS Session: ActiveMQSession {id=ID:v10033215-56491-1474666967074-0:20:1,started=true}
2016-09-23 18:48:31,657 | DEBUG | ask-scheduler-3 | ion.endpoint.SourcePollingChannelAdapter | Received no Message during the poll, returning 'false'
2016-09-23 18:48:34,666 | DEBUG | ask-scheduler-1 | ework.integration.jms.DynamicJmsTemplate | Executing callback on JMS Session: ActiveMQSession {id=ID:v10033215-56491-1474666967074-0:21:1,started=true}
2016-09-23 18:48:34,670 | DEBUG | ask-scheduler-1 | ion.endpoint.SourcePollingChannelAdapter | Received no Message during the poll, returning 'false'
And then, 10 minutes later:
2016-09-23 18:58:10,032 | DEBUG | ask-scheduler-8 | ework.integration.jms.DynamicJmsTemplate | Executing callback on JMS Session: ActiveMQSession {id=ID:v10033215-56491-1474666967074-0:212:1,started=true}
2016-09-23 18:58:10,091 | DEBUG | ask-scheduler-8 | ion.endpoint.SourcePollingChannelAdapter | Poll resulted in Message:
And the message is consumed.
I've took multiple dumps, and could find only one instance of a task-executor thread on RUNNING state:
"task-scheduler-4" prio=6 tid=0x000000001074f800 nid=0x4364 runnable [0x000000001d4fe000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:167)
at org.apache.activemq.transport.InactivityMonitor.oneway(InactivityMonitor.java:237)
- locked <0x00000007dc803080> (a java.util.concurrent.atomic.AtomicBoolean)
at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:83)
at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:40)
- locked <0x00000007dc8031f8> (a java.lang.Object)
at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1225)
at org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1219)
at org.apache.activemq.ActiveMQSession.doClose(ActiveMQSession.java:590)
at org.apache.activemq.ActiveMQSession.close(ActiveMQSession.java:581)
at org.springframework.jms.support.JmsUtils.closeSession(JmsUtils.java:108)
at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:497)
at org.springframework.jms.core.JmsTemplate.receiveSelected(JmsTemplate.java:761)
at org.springframework.integration.jms.JmsDestinationPollingSource.doReceiveJmsMessage(JmsDestinationPollingSource.java:118)
at org.springframework.integration.jms.JmsDestinationPollingSource.receive(JmsDestinationPollingSource.java:93)
at org.springframework.integration.endpoint.SourcePollingChannelAdapter.receiveMessage(SourcePollingChannelAdapter.java:111)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.doPoll(AbstractPollingEndpoint.java:184)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.access$000(AbstractPollingEndpoint.java:51)
at org.springframework.integration.endpoint.AbstractPollingEndpoint$1.call(AbstractPollingEndpoint.java:143)
at org.springframework.integration.endpoint.AbstractPollingEndpoint$1.call(AbstractPollingEndpoint.java:141)
at org.springframework.integration.endpoint.AbstractPollingEndpoint$Poller$1.run(AbstractPollingEndpoint.java:273)
at org.springframework.integration.util.ErrorHandlingTaskExecutor$1.run(ErrorHandlingTaskExecutor.java:52)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
at org.springframework.integration.util.ErrorHandlingTaskExecutor.execute(ErrorHandlingTaskExecutor.java:49)
at org.springframework.integration.endpoint.AbstractPollingEndpoint$Poller.run(AbstractPollingEndpoint.java:268)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
All other thread dumps show that all threads for the task-scheduler are either on WAITING or TIMED_WAITING as the following (including the thread on the previous dump after it finished). This dumps from 30 seconds after the last one:
"task-scheduler-4" prio=6 tid=0x00000000118d3800 nid=0x4abc waiting on condition [0x000000000f8bf000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007838d74d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1085)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- None
"task-scheduler-3" prio=6 tid=0x00000000118d4800 nid=0x4f14 waiting on condition [0x000000001ba0f000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000787c10210> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- None
Any clue?
If you have a lot of pollers in your application, you could be suffering from thread starvation; the default task scheduler only has 10 threads; you can increase the count.
Pollers polling from QueueChannels block for 1 second by default.
Generally, turning on DEBUG logging for org.springframework.integration will help resolving issues such as this.
Taking a thread dump and looking at the task scheduler thread activity should also help.
Related
I have an issue wherein driver containers are being killed in such a way that the child processes (ie the spark driver) are not being cleaned up appropriately. This leaves leaked executors. However, my understanding is that if executors should lose contact with the driver they should self destruct. Indeed they claim to do so, but then... do not.
21/02/25 17:43:59 INFO TransportClientFactory: Successfully created connection to /10.178.144.144:37811 after 2 ms (0 ms spent in bootstraps)
21/02/25 17:43:59 INFO ShuffleBlockFetcherIterator: Started 8 remote fetches in 48 ms
21/02/25 17:43:59 INFO CodeGenerator: Code generated in 21.223892 ms
21/02/25 17:43:59 INFO Executor: Finished task 0.0 in stage 2.0 (TID 97). 2661 bytes result sent to driver
21/02/26 19:56:58 ERROR CoarseGrainedExecutorBackend: Executor self-exiting due to : Driver 10.205.36.205:32000 disassociated! Shutting down.
21/02/26 19:56:58 INFO MemoryStore: MemoryStore cleared
21/02/26 19:56:58 INFO BlockManager: BlockManager stopped
However, the executor docker container is still running and the executor process is still running inside. No further output.
Edit: I've tracked this down to a deadlock in a shutdown hook
"Thread-2" #26 prio=5 os_prio=0 tid=0x00007f6410231800 nid=0x2a2 waiting on condition [0x00007f63c3bf1000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c05a47b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
at java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination(Executors.java:675)
at org.apache.spark.rpc.netty.MessageLoop.stop(MessageLoop.scala:60)org.apache.spark.rpc.netty.Dispatcher.$anonfun$stop$1(Dispatcher.scala:190)
at org.apache.spark.rpc.netty.Dispatcher.$anonfun$stop$1$adapted(Dispatcher.scala:187)
at org.apache.spark.rpc.netty.Dispatcher$$Lambda$214/337533935.apply(Unknown Source)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at org.apache.spark.rpc.netty.Dispatcher.stop(Dispatcher.scala:187)
at org.apache.spark.rpc.netty.NettyRpcEnv.cleanup(NettyRpcEnv.scala:324)
at org.apache.spark.rpc.netty.NettyRpcEnv.shutdown(NettyRpcEnv.scala:302)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:96)
at org.apache.spark.executor.Executor.stop(Executor.scala:292)
at org.apache.spark.executor.Executor.$anonfun$new$2(Executor.scala:74)
at org.apache.spark.executor.Executor$$Lambda$317/1046854795.apply$mcV$sp(Unknown Source)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$Lambda$2192/1832515374.apply$mcV$sp(Unknown Source)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1932)
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$Lambda$2191/952019066.apply$mcV$sp(Unknown Source)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
We are facing issue in our production environment with multiple blocking threads and while checking the thread dumps have found the below stack trace where in 14 blocking threads were present with same description. Any help with the below is much appreciated.
We are using weblogic 10.3.6.0 version and JDK 1.7.0_80.
ExecuteThread: '387' is blocked because ExecuteThread: '159' is already holding the lock and is long-running.
======================
"[ACTIVE] ExecuteThread: '387' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x00002ac72d4f4000 nid=0x530d waiting for monitor entry [0x00002ac767c38000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.springframework.beans.factory.support.AbstractBeanFactory.getMergedBeanDefinition(AbstractBeanFactory.java:1030)
- waiting to lock <0x000000051ea30728> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.AbstractBeanFactory.getMergedBeanDefinition(AbstractBeanFactory.java:1013)
at org.springframework.beans.factory.support.AbstractBeanFactory.getMergedLocalBeanDefinition(AbstractBeanFactory.java:999)
at org.springframework.beans.factory.support.AbstractBeanFactory.isTypeMatch(AbstractBeanFactory.java:433)
..
"[ACTIVE] ExecuteThread: '159' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x00002ac7491f5800 nid=0x4781 runnable [0x00002ac7540fd000]
java.lang.Thread.State: RUNNABLE
at org.springframework.beans.factory.support.AbstractBeanFactory.getMergedBeanDefinition(AbstractBeanFactory.java:1030)
- locked <0x000000051ea30728> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.AbstractBeanFactory.getMergedBeanDefinition(AbstractBeanFactory.java:1013)
at org.springframework.beans.factory.support.AbstractBeanFactory.getMergedLocalBeanDefinition(AbstractBeanFactory.java:999)
..
at com.opensymphony.xwork2.spring.SpringObjectFactory.buildBean(SpringObjectFactory.java:151)
[ACTIVE] ExecuteThread: '359' for queue: 'weblogic.kernel.Default (self-tuning)'
[ACTIVE] ExecuteThread: '359' for queue: 'weblogic.kernel.Default (self-tuning)'
Stack Trace is:
java.lang.Thread.State: RUNNABLE
at oracle.jdbc.driver.OracleResultSetImpl.getString(OracleResultSetImpl.java:2818)
- locked <0x0000000566efc058> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OracleResultSet.getString(OracleResultSet.java:1215)
at weblogic.jdbc.wrapper.ResultSet_oracle_jdbc_driver_OracleResultSetImpl.getString(Unknown Source)
at com.ibatis.sqlmap.engine.type.StringTypeHandler.getResult(StringTypeHandler.java:35)
at com.ibatis.sqlmap.engine.mapping.result.ResultMap.getPrimitiveResultMappingValue(ResultMap.java:619)
at com.ibatis.sqlmap.engine.mapping.result.ResultMap.getResults(ResultMap.java:345)
at com.ibatis.sqlmap.engine.mapping.result.AutoResultMap.getResults(AutoResultMap.java:47)
- locked <0x00000005205e0968> (a com.ibatis.sqlmap.engine.mapping.result.AutoResultMap)
at com.ibatis.sqlmap.engine.execution.SqlExecutor.handleResults(SqlExecutor.java:384)
at com.ibatis.sqlmap.engine.execution.SqlExecutor.handleMultipleResults(SqlExecutor.java:300)
at com.ibatis.sqlmap.engine.execution.SqlExecutor.executeQuery(SqlExecutor.java:189)
at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.sqlExecuteQuery(MappedStatement.java:221)
at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.executeQueryWithCallback(MappedStatement.java:189)
at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.executeQueryForList(MappedStatement.java:139)
at com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:567)
at com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:541)
at com.ibatis.sqlmap.engine.impl.SqlMapSessionImpl.queryForList(SqlMapSessionImpl.java:118)
at org.springframework.orm.ibatis.SqlMapClientTemplate$3.doInSqlMapClient(SqlMapClientTemplate.java:298)
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:209)
at org.springframework.orm.ibatis.SqlMapClientTemplate.executeWithListResult(SqlMapClientTemplate.java:249)
at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:296)
at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:290)
at com.dms.common.masters.daos.HomeDaoImpl.populateDropDowns(Unknown Source)
at com.dms.common.masters.daos.HomeDaoImpl$$FastClassByCGLIB$$dda6d555.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:191)
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:696)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:631)
at com.dms.common.masters.daos.HomeDaoImpl$$EnhancerByCGLIB$$7cce49ec.populateDropDowns(<generated>)
at com.dms.common.masters.business.HomeBusinessImpl.populateDropDowns(Unknown Source)
at com.dms.common.masters.service.HomeServiceImpl.populateDropDowns(Unknown Source)
at com.dms.common.masters.action.HomeAction.menu(Unknown Source)
at sun.reflect.GeneratedMethodAccessor54415.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.opensymphony.xwork2.DefaultActionInvocation.invokeAction(DefaultActionInvocation.java:450)
at com.opensymphony.xwork2.DefaultActionInvocation.invokeActionOnly(DefaultActionInvocation.java:289)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:252)
at org.apache.struts2.interceptor.MultiselectInterceptor.intercept(MultiselectInterceptor.java:73)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.ConversionErrorInterceptor.intercept(ConversionErrorInterceptor.java:138)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.ParametersInterceptor.doIntercept(ParametersInterceptor.java:254)
at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:98)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.StaticParametersInterceptor.intercept(StaticParametersInterceptor.java:191)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at org.apache.struts2.interceptor.CheckboxInterceptor.intercept(CheckboxInterceptor.java:91)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at org.apache.struts2.interceptor.FileUploadInterceptor.intercept(FileUploadInterceptor.java:252)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.ModelDrivenInterceptor.intercept(ModelDrivenInterceptor.java:100)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.ScopedModelDrivenInterceptor.intercept(ScopedModelDrivenInterceptor.java:141)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.ChainingInterceptor.intercept(ChainingInterceptor.java:145)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.I18nInterceptor.intercept(I18nInterceptor.java:139)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.PrepareInterceptor.doIntercept(PrepareInterceptor.java:171)
at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:98)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at org.apache.struts2.interceptor.ServletConfigInterceptor.intercept(ServletConfigInterceptor.java:164)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.AliasInterceptor.intercept(AliasInterceptor.java:193)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.opensymphony.xwork2.interceptor.ExceptionMappingInterceptor.intercept(ExceptionMappingInterceptor.java:189)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at com.utils.LoginInterceptor.intercept(Unknown Source)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:246)
at org.apache.struts2.impl.StrutsActionProxy.execute(StrutsActionProxy.java:54)
at org.apache.struts2.dispatcher.Dispatcher.serviceAction(Dispatcher.java:562)
at org.apache.struts2.dispatcher.ng.ExecuteOperations.executeAction(ExecuteOperations.java:77)
at org.apache.struts2.dispatcher.ng.filter.StrutsPrepareAndExecuteFilter.doFilter(StrutsPrepareAndExecuteFilter.java:99)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
at com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:118)
at com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:52)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
at com.utils.SetResponseBufferFilter.doFilter(Unknown Source)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
at oracle.security.jps.ee.http.JpsAbsFilter$1.run(JpsAbsFilter.java:119)
at java.security.AccessController.doPrivileged(Native Method)
at oracle.security.jps.util.JpsSubject.doAsPrivileged(JpsSubject.java:324)
at oracle.security.jps.ee.util.JpsPlatformUtil.runJaasMode(JpsPlatformUtil.java:460)
at oracle.security.jps.ee.http.JpsAbsFilter.runJaasMode(JpsAbsFilter.java:103)
at oracle.security.jps.ee.http.JpsAbsFilter.doFilter(JpsAbsFilter.java:171)
at oracle.security.jps.ee.http.JpsFilter.doFilter(JpsFilter.java:71)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
at oracle.dms.servlet.DMSServletFilter.doFilter(DMSServletFilter.java:163)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
at weblogic.servlet.internal.RequestEventsFilter.doFilter(RequestEventsFilter.java:27)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3748)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3714)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2283)
at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2182)
at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1499)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:263)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
Locked ownable synchronizers:
- None
Blocked threads are not necessarily a concern, in java when handling concurrent requests and using singleton classes or hashmaps etc its natural to have some blocking as a means to achieve sanity in data...
When to take thread dumps -
When weblogic server is reporting stuck threads, or you are getting general slowness in response (though many other factors contribute to this)
How to take thread dumps -
Always take 3-4 thread dumps spaced 1-2 secs apart. Looking at 1 momentary thread dump usually does not provide the complete picture.. If you notice the same threads are in blocked state in almost all 3-4 dumps, check which thread is holding the lock and what it is doing, is the thread holding the lock doing the same task in all 4 dumps, if so you got the clue on what to troubleshoot..
In the thread dump you gave..
ExecuteThread: '387' is waiting on
waiting to lock <0x000000051ea30728>
and ExecuteThread: '159' is holding locked <0x000000051ea30728> but 159 is in Runnable state, meaning its not stuck and doing its normal operations..
had you taken several thread dumps and saw 159 holding the lock for far too long you need to consider tuning your application and think about a workaround if possible around this lock.
The problem seems to be related to the lock on com.ibatis.sqlmap.engine.mapping.result.AutoResultMap
suggesting that part of the code is synchronized, meaning that ONLY one thread at a time will be executed.
If you collect a thread dump, you'll see several threads are waiting for the lock and only one is executing something like:
"[ACTIVE] ExecuteThread: '339' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=REMOVED nid=REMOVED runnable [REMOVED]
java.lang.Thread.State: RUNNABLE
at weblogic.jdbc.wrapper.ResultSet_oracle_jdbc_driver_OracleResultSetImpl.wasNull(Unknown Source)
at com.ibatis.sqlmap.engine.type.StringTypeHandler.getResult(StringTypeHandler.java:36)
at com.ibatis.sqlmap.engine.mapping.result.ResultMap.getPrimitiveResultMappingValue(ResultMap.java:619)
at com.ibatis.sqlmap.engine.mapping.result.ResultMap.getResults(ResultMap.java:345)
at com.ibatis.sqlmap.engine.mapping.result.AutoResultMap.getResults(AutoResultMap.java:47)
- locked <0x00000004544b7328> (a com.ibatis.sqlmap.engine.mapping.result.AutoResultMap)
at com.ibatis.sqlmap.engine.execution.SqlExecutor.handleResults(SqlExecutor.java:384)
at com.ibatis.sqlmap.engine.execution.SqlExecutor.handleMultipleResults(SqlExecutor.java:300)
...
That leads us to another question, does IBatis support multithread? Based on the following link the answer is no:
Mybatis does not work in multithreaded programming
In standard usage of MyBatis core module, the SqlSession and mapper object does not support the thread-safe. This is design. The MyBatis provide the SqlSessionManager for supporting the thread-safe as optional component.
https://github.com/mybatis/mybatis-3/issues/1529
I've encountered strange behaviour in Hazelcast Jet. I'm starting many jobs at once (~30, some are triggered slightly before other). However, when my Hazelcast Jet job count hits 26 (magic number?), all processing got stuck.
In the threadumps I see following info:
"hz._hzInstance_1_jet.cached.thread-1" #37 prio=5 os_prio=0 cpu=1093.29ms elapsed=393.62s tid=0x00007f95dc007000 nid=0x6bfc in Object.wait() [0x00007f95e6af4000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base#11.0.2/Native Method)
- waiting on <no object reference available>
at com.hazelcast.spi.impl.AbstractCompletableFuture.get(AbstractCompletableFuture.java:229)
- waiting to re-lock in wait() <0x00000007864b7040> (a com.hazelcast.internal.util.SimpleCompletableFuture)
at com.hazelcast.spi.impl.AbstractCompletableFuture.get(AbstractCompletableFuture.java:191)
at com.hazelcast.spi.impl.operationservice.impl.InvokeOnPartitions.invoke(InvokeOnPartitions.java:88)
at com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl.invokeOnAllPartitions(OperationServiceImpl.java:385)
at com.hazelcast.map.impl.proxy.MapProxySupport.clearInternal(MapProxySupport.java:1016)
at com.hazelcast.map.impl.proxy.MapProxyImpl.clearInternal(MapProxyImpl.java:109)
at com.hazelcast.map.impl.proxy.MapProxyImpl.clear(MapProxyImpl.java:698)
at com.hazelcast.jet.impl.JobRepository.clearSnapshotData(JobRepository.java:464)
at com.hazelcast.jet.impl.MasterJobContext.tryStartJob(MasterJobContext.java:233)
at com.hazelcast.jet.impl.JobCoordinationService.tryStartJob(JobCoordinationService.java:776)
at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitJob$0(JobCoordinationService.java:200)
at com.hazelcast.jet.impl.JobCoordinationService$$Lambda$634/0x00000008009ce840.run(Unknown Source)
and also:
"hz._hzInstance_1_jet.async.thread-2" #81 prio=5 os_prio=0 cpu=0.00ms elapsed=661.98s tid=0x0000025bb23ef000 nid=0x43bc in Object.wait() [0x0000005d492fe000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base#11/Native Method)
- waiting on <no object reference available>
at com.hazelcast.spi.impl.AbstractCompletableFuture.get(AbstractCompletableFuture.java:229)
- waiting to re-lock in wait() <0x0000000725600100> (a com.hazelcast.internal.util.SimpleCompletableFuture)
at com.hazelcast.spi.impl.AbstractCompletableFuture.get(AbstractCompletableFuture.java:191)
at com.hazelcast.spi.impl.operationservice.impl.InvokeOnPartitions.invoke(InvokeOnPartitions.java:88)
at com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl.invokeOnAllPartitions(OperationServiceImpl.java:385)
at com.hazelcast.map.impl.proxy.MapProxySupport.removeAllInternal(MapProxySupport.java:619)
at com.hazelcast.map.impl.proxy.MapProxyImpl.removeAll(MapProxyImpl.java:285)
at com.hazelcast.jet.impl.JobRepository.deleteJob(JobRepository.java:332)
at com.hazelcast.jet.impl.JobRepository.completeJob(JobRepository.java:316)
at com.hazelcast.jet.impl.JobCoordinationService.completeJob(JobCoordinationService.java:576)
at com.hazelcast.jet.impl.MasterJobContext.lambda$finalizeJob$13(MasterJobContext.java:620)
at com.hazelcast.jet.impl.MasterJobContext$$Lambda$783/0x0000000800b26840.run(Unknown Source)
at com.hazelcast.jet.impl.MasterJobContext.finalizeJob(MasterJobContext.java:632)
at com.hazelcast.jet.impl.MasterJobContext.onCompleteExecutionCompleted(MasterJobContext.java:564)
at com.hazelcast.jet.impl.MasterJobContext.lambda$invokeCompleteExecution$6(MasterJobContext.java:544)
at com.hazelcast.jet.impl.MasterJobContext$$Lambda$779/0x0000000800b27840.accept(Unknown Source)
at com.hazelcast.jet.impl.MasterContext.lambda$invokeOnParticipants$0(MasterContext.java:242)
at com.hazelcast.jet.impl.MasterContext$$Lambda$726/0x0000000800a1c040.accept(Unknown Source)
at com.hazelcast.jet.impl.util.Util$2.onResponse(Util.java:172)
at com.hazelcast.spi.impl.AbstractInvocationFuture$1.run(AbstractInvocationFuture.java:256)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base#11/ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base#11/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base#11/Thread.java:834)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64)
at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80)
I don't have any other info how to reproduce this issue, however I hope someone will know how to fix this or my question will help someone else :)
My setup:
- Java 11
- Hazelcast 3.12 Snapshot
- Hazelcast Jet 3.0 Snapshot (I can't revert to previous version, it will break my logic; I need n:m joins which will be added in 3.1)
- CPU cores: 4
- RAM: 7 GB
- Jet mode: server, connects to other cluster as a client to insert final data.
Did anyone encounter similar issue? The problem is, it cannot be simply reproduces, thus it's hard to create an issue for Hazelcast Team. Only threaddumps and general behaviour can give a hint what is going on.
This was an issue in 3.0-SNAPSHOT during development and was fixed in the 3.0 release.
I'm running into an issue where it seems as if a Thread has become uninterruptible while in a WAITING state. The task thread itself (as you'll see in the stack of the thread below) is waiting for a call to FuturePromise.get() (a Jetty class).
The task thread is being executed in the context of an ExecutorService. Below is what the ExecutorService invocation looks like (simplified).
ExecutorService es = Executors.newFixedThreadPool(8, new CustomizableThreadFactory("Test-Thread-"));
es.submit(taskToRun);
es.shutdown();
es.awaitTermination(10, TimeUnit.SECONDS);
es.shutdownNow();
String result = taskToRun.get();
What I'm seeing is the main thread gets stuck at taskToRun.get() waiting for the task to complete/be interrupted while the thread running the task sits in this state:
"Test-Thread-1"
#135245 prio=5 os_prio=0 tid=0x00007f972c0c2000 nid=0x6838 waiting on condition [0x00007f96ca165000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000005d00d1118> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at org.eclipse.jetty.util.FuturePromise.get(FuturePromise.java:118)
... + app code
What I'm expecting to happen is Test-Thread-1 will be interrupted which will throw an InterruptedException and the taskToRun.get() call would also throw a InterruptedException.
Unfortunately I've been unable to reproduce this problem with a unit test, but will update as I get more info.
After upgrading to Jetty 9.4.6, this no longer seems to be a problem. I did come across some Jetty code that clears the interrupted state of threads, but it's not clear whether or not that was the actual cause.
After a load test in my application, my application server CPU is hogging at 50% always even though nothing is running in the system after the test. When I look at the thread dump, I can see at least 24 BLOCKED threads in the dump. Most of them are related to spring framework, mule framework and so on....nothing specific to my application.
Please see below for a few stack traces of the thread dump.
"ajp-10.40.44.47-8009-116" daemon prio=6 tid=0x000000000cdce800 nid=0x7ac waiting for monitor entry [0x000000003cf1e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.ClassLoader.loadClass(ClassLoader.java:291)
- waiting to lock <0x00000000c274dab8> (a org.jboss.web.tomcat.service.WebCtxLoader$ENCLoader)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.mule.util.ClassUtils.loadClass(ClassUtils.java:340)
at org.mule.util.ClassUtils.instanciateClass(ClassUtils.java:437)
at org.mule.transport.service.DefaultTransportServiceDescriptor.createEndpointURIBuilder(DefaultTransportServiceDescriptor.java:448)
at org.mule.endpoint.MuleEndpointURI.initialise(MuleEndpointURI.java:222)
at org.mule.transport.servlet.MuleReceiverServlet.getReceiverForURI(MuleReceiverServlet.java:308)
at org.mule.transport.servlet.MuleReceiverServlet.doAllMethods(MuleReceiverServlet.java:241)
at org.mule.transport.servlet.MuleReceiverServlet.doPost(MuleReceiverServlet.java:198)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
at org.mule.transport.servlet.MuleReceiverServlet.service(MuleReceiverServlet.java:180)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
"financials.pctogl.staging.partitioned.jms.taskExecutor-9" prio=6 tid=0x0000000009b8f000 nid=0x1730 waiting for monitor entry [0x000000002cf1f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.springframework.integration.util.ErrorHandlingTaskExecutor$1.run(ErrorHandlingTaskExecutor.java:52)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[powercomp].iip.flow.drools.service.stage1.01" prio=6 tid=0x00000000063af800 nid=0x1608 waiting for monitor entry [0x000000002979f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.mule.work.WorkerContext.run(WorkerContext.java:315)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"task-scheduler-10" prio=6 tid=0x000000000d075000 nid=0xcf0 waiting for monitor entry [0x000000002649f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.springframework.integration.channel.MessagePublishingErrorHandler.handleError(MessagePublishingErrorHandler.java:83)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:59)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
"UDP ucast,10.40.44.47:61710" prio=6 tid=0x0000000009b82800 nid=0xeb0 waiting for monitor entry [0x000000001771e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:201)
- waiting to lock <0x00000000a197e360> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.log(Category.java:853)
at sun.reflect.GeneratedMethodAccessor294.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.commons.logging.impl.Log4jProxy.log(Log4jProxy.java:301)
at org.apache.commons.logging.impl.Log4jProxy.error(Log4jProxy.java:283)
at org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:208)
at org.jgroups.protocols.TP.down(TP.java:1205)
at org.jgroups.protocols.TP$ProtocolAdapter.down(TP.java:2316)
at org.jgroups.protocols.Discovery.down(Discovery.java:374)
at org.jgroups.protocols.MERGE2.down(MERGE2.java:175)
at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:362)
at org.jgroups.protocols.FD.sendHeartbeatResponse(FD.java:325)
at org.jgroups.protocols.FD.up(FD.java:235)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
at org.jgroups.protocols.Discovery.up(Discovery.java:264)
at org.jgroups.protocols.PING.up(PING.java:273)
at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2327)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1261)
at org.jgroups.protocols.TP.access$100(TP.java:49)
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1838)
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1817)
at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:1746)
at org.jgroups.util.ShutdownRejectedExecutionHandler.rejectedExecution(ShutdownRejectedExecutionHandler.java:39)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.jgroups.protocols.TP.dispatchToThreadPool(TP.java:1366)
at org.jgroups.protocols.TP.receive(TP.java:1333)
at org.jgroups.protocols.UDP$UcastReceiver.run(UDP.java:960)
at java.lang.Thread.run(Thread.java:662)