In our application's thread dump I can see the runnable thread below, while the DBA finds that the Oracle session holding the lock is inactive. In which scenario does an Oracle session become inactive while holding a table lock? This is causing blocking in the DB.
ajp-nio-8009-exec-37" prio=5 tid=0x175 nid=0xaf RUNNABLE (JNI Native Code) - stats: cpu=122796 blk=-1 wait=-1 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at oracle.net.ns.Packet.receive(Packet.java:311) at oracle.net.ns.DataPacket.receive(DataPacket.java:105) at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:305) at oracle.net.ns.NetInputStream.read(NetInputStream.java:249) at oracle.net.ns.NetInputStream.read(NetInputStream.java:171) at oracle.net.ns.NetInputStream.read(NetInputStream.java:89) at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123) at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79) at oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1(T4CMAREngineStream.java:429) at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:397) at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257) at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587) at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:225) at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:53) at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:943) at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1150) at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4798) at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:4875) - locked oracle.jdbc.driver.T4CConnection#2ff33c0b
If you use Oracle you must enforce a maximum lifetime on its connections, so that stale connections are eventually retired instead of being returned to the pool forever and creating the kind of leak you observed.
By default, Oracle does not enforce a maximum lifetime for connections.
If you use HikariCP, set the maxLifetime property:
maxLifetime
This property controls the maximum lifetime of a connection in the pool. An in-use connection will never be retired, only when it is closed will it then be removed. On a connection-by-connection basis, minor negative attenuation is applied to avoid mass-extinction in the pool. We strongly recommend setting this value, and it should be several seconds shorter than any database or infrastructure imposed connection time limit.
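For example, a minimal sketch of a HikariCP data source with maxLifetime set (the JDBC URL, credentials, and the 30-minute value are hypothetical; choose a lifetime a few seconds shorter than whatever limit your DBA or network infrastructure imposes):

import java.util.concurrent.TimeUnit;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    public static void main(String[] args) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1"); // hypothetical URL
        config.setUsername("app");
        config.setPassword("secret");
        // Retire each connection after 30 minutes, a few seconds below any
        // database- or firewall-imposed connection time limit.
        config.setMaxLifetime(TimeUnit.MINUTES.toMillis(30));
        HikariDataSource ds = new HikariDataSource(config);
    }
}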
I have a Play Scala application running on Play 2.7. It is used as middleware for our frontend and exposes REST endpoints.
I am running two instances in the cloud behind nginx, which load-balances the two servers round-robin.
The problem is that the servers go down quite often, about three times a day, and interestingly both go down at the same time. The logs show an out-of-memory error on both servers. I tried to capture a Java heap dump on out-of-memory but no dump is produced. I am still analysing the thread dump to find the actual cause, but what puzzles me is why the two servers go down at the same time.
In the thread dump I see 7707 threads in a sleeping state. Here is one of them:
"Connection evictor" #146 daemon prio=5 os_prio=0 cpu=2.33ms elapsed=1822.02s tid=0x00007f8a840c4800 nid=0x194 waiting on condition [0x00007f8a58a5e000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(java.base@11/Native Method)
at org.apache.http.impl.client.IdleConnectionEvictor$1.run(IdleConnectionEvictor.java:66)
at java.lang.Thread.run(java.base@11/Thread.java:834)
This is what I see when a server goes down:
[35966.967s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
Uncaught error from thread [application-akka.actor.default-dispatcher-1398]: unable to create native thread: possibly out of memory or process/resource limits reached, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[application]
Uncaught error from thread [application-akka.actor.default-dispatcher-1395]: unable to create native thread: possibly out of memory or process/resource limits reached, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[application]
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
at java.base/java.lang.Thread.start0(Native Method)
at java.base/java.lang.Thread.start(Thread.java:803)
at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96)
at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:287)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:298)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:236)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:223)
at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:198)
at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:934)
at com.github.takezoe.solr.scala.SolrClient$.$anonfun$$lessinit$greater$default$2$1(SolrClient.scala:11)
at com.github.takezoe.solr.scala.SolrClient.<init>(SolrClient.scala:14)
at service.tvt.solr.SolrPolygonService.getSuburbBoundary(SolrPolygonService.scala:212)
at service.tvt.search.OrbigoSearchService.mapfeeder(OrbigoSearchService.scala:54)
at service.bto.business_categories.MeedssCountService.$anonfun$suburbMeedssCount$2(MeedssCountService.scala:81)
at scala.collection.immutable.List.map(List.scala:287)
at service.bto.business_categories.MeedssCountService.suburbMeedssCount(MeedssCountService.scala:80)
at controllers.bto.industry_categories.meedss.MeedssController.$anonfun$suburbMeedssCount$1(MeedssController.scala:38)
at play.api.mvc.ActionBuilder.$anonfun$apply$11(Action.scala:368)
at scala.Function1.$anonfun$andThen$1(Function1.scala:52)
at play.api.mvc.ActionBuilderImpl.invokeBlock(Action.scala:489)
at play.api.mvc.ActionBuilderImpl.invokeBlock(Action.scala:487)
at play.api.mvc.ActionBuilder$$anon$9.invokeBlock(Action.scala:336)
at play.api.mvc.ActionBuilder$$anon$9.invokeBlock(Action.scala:331)
at play.api.mvc.ActionBuilder$$anon$10.apply(Action.scala:426)
at play.api.mvc.Action.$anonfun$apply$2(Action.scala:98)
at play.api.libs.streams.StrictAccumulator.$anonfun$mapFuture$4(Accumulator.scala:184)
at scala.util.Try$.apply(Try.scala:209)
at play.api.libs.streams.StrictAccumulator.$anonfun$mapFuture$3(Accumulator.scala:184)
at akka.stream.impl.Transform.apply(TraversalBuilder.scala:159)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:515)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:450)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:443)
at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:629)
at play.api.libs.streams.Accumulator$.$anonfun$futureToSink$2(Accumulator.scala:262)
at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:303)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
at play.api.libs.streams.Execution$trampoline$.execute(Execution.scala:72)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
at scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
at scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)
at scala.concurrent.impl.Promise.transformWith(Promise.scala:36)
at scala.concurrent.impl.Promise.transformWith$(Promise.scala:34)
at scala.concurrent.impl.Promise$DefaultPromise.transformWith(Promise.scala:183)
at scala.concurrent.Future.flatMap(Future.scala:302)
at scala.concurrent.Future.flatMap$(Future.scala:302)
at scala.concurrent.impl.Promise$DefaultPromise.flatMap(Promise.scala:183)
at play.api.libs.streams.Accumulator$.$anonfun$futureToSink$1(Accumulator.scala:261)
at akka.stream.impl.Transform.apply(TraversalBuilder.scala:159)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:515)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:450)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:443)
at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:629)
at play.api.libs.streams.SinkAccumulator.run(Accumulator.scala:144)
at play.api.libs.streams.SinkAccumulator.run(Accumulator.scala:148)
at play.core.server.AkkaHttpServer.$anonfun$runAction$4(AkkaHttpServer.scala:441)
at akka.http.scaladsl.util.FastFuture$.strictTransform$1(FastFuture.scala:41)
at akka.http.scaladsl.util.FastFuture$.$anonfun$transformWith$3(FastFuture.scala:51)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Any quick pointers will be really helpful.
Levi Ramsey was right: it was because of the TakeZoe library we were using. We were creating a client for every new request and not closing it. We finally created a connection pool with a limited number of active connections, and it worked.
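For anyone hitting the same leak: the fix boils down to building one client and reusing it. A minimal sketch using SolrJ's HttpSolrClient directly (the URL is hypothetical, and the pooling configuration of the actual TakeZoe wrapper may differ):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public final class SharedSolrClient {
    // One client per JVM: each HttpSolrClient owns its own connection pool
    // and IdleConnectionEvictor thread, so a client per request leaks both.
    private static final SolrClient INSTANCE =
            new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

    private SharedSolrClient() {}

    public static SolrClient get() {
        return INSTANCE;
    }

    // Call once on application shutdown to release the pool.
    public static void shutdown() throws java.io.IOException {
        INSTANCE.close();
    }
}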
Any idea why my PutMongo processor gets stuck?
PutMongo Processor
'nifi dump' attached below
[nifi.sh dump][1]
[1]: https://pastebin.com/raw/b2QDeg0H
Thanks!
The part of the thread dump that is relevant is this...
"Timer-Driven Process Thread-3" Id=56 RUNNABLE (in native code)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at com.mongodb.connection.SocketStream.write(SocketStream.java:75)
at com.mongodb.connection.InternalStreamConnection.sendMessage(InternalStreamConnection.java:201)
at com.mongodb.connection.UsageTrackingInternalConnection.sendMessage(UsageTrackingInternalConnection.java:95)
at com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendMessage(DefaultConnectionPool.java:424)
at com.mongodb.connection.WriteProtocol.execute(WriteProtocol.java:103)
at com.mongodb.connection.UpdateProtocol.execute(UpdateProtocol.java:67)
at com.mongodb.connection.UpdateProtocol.execute(UpdateProtocol.java:42)
at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
at com.mongodb.connection.DefaultServerConnection.update(DefaultServerConnection.java:85)
at com.mongodb.operation.MixedBulkWriteOperation$Run$3.executeWriteProtocol(MixedBulkWriteOperation.java:475)
at com.mongodb.operation.MixedBulkWriteOperation$Run$RunExecutor.execute(MixedBulkWriteOperation.java:655)
at com.mongodb.operation.MixedBulkWriteOperation$Run.execute(MixedBulkWriteOperation.java:399)
at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:179)
at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:168)
at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:230)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:221)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:515)
at com.mongodb.MongoCollectionImpl.replaceOne(MongoCollectionImpl.java:344)
at org.apache.nifi.processors.mongodb.PutMongo.onTrigger(PutMongo.java:175)
It is likely blocked due to some kind of networking issue, or unresponsiveness from Mongo.
Ideally, the Mongo client used by NiFi would have some kind of configurable timeouts, and these should be exposed in the processor so we don't block indefinitely.
I am not familiar with Mongo at all, though, so I can't say how their client works.
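For what it's worth, the 3.x Java driver (which the com.mongodb.connection.* frames above suggest) does let you cap blocking socket operations; a minimal sketch, with hypothetical host and timeout values:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.ServerAddress;

public class MongoTimeoutExample {
    public static void main(String[] args) {
        MongoClientOptions options = MongoClientOptions.builder()
                .connectTimeout(10_000) // ms to establish the TCP connection
                .socketTimeout(30_000)  // ms a socket read/write may block; the default 0 means wait forever
                .build();
        MongoClient client = new MongoClient(new ServerAddress("mongo-host", 27017), options);
        // ... use the client, then close it on shutdown
        client.close();
    }
}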
From an Apache Storm bolt, I am using Elasticsearch's transport client to connect remotely to an ES cluster. When I take jstack output of the Storm process, I notice there are nearly 1000 threads with an ES stack trace like:
"elasticsearch[Flying Tiger][transport_client_worker][T#22]{New I/O worker #269}" daemon prio=10 tid=0x00007f80ac3cb000 nid=0x356b runnable [0x00007f7e96b2a000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x00000000d148d138> (a sun.nio.ch.Util$2)
- locked <0x00000000d148d128> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000d148c9b8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at org.elasticsearch.common.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:68)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:415)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:212)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I am using a single instance of the ES transport client across my storm topology, which has about 18 output streams invoking the ES client to write data to the ES cluster.
Why does the ES transport client spawn so many threads? Is there any way I can tune this? I tried looking up the ES documentation, but it neither provides any internal details on the threading mechanism of the transport client nor gives an option to tune the number of client threads.
I had a very similar experience. As you've mentioned, one transport client creates tens of threads, including timers and so on.
What you have to check is whether there is exactly one transport client per worker process. When I used 32 transport clients there were more than a thousand threads; after I correctly changed the client to a singleton instance, the number of threads dropped below two hundred (including all the other threads created in my topology).
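A minimal sketch of such a per-process singleton, assuming the 1.x-era TransportClient API that matches the ImmutableSettings snippet further below (cluster name, host, and port are hypothetical):

import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public final class EsClientHolder {
    // One TransportClient (and therefore one Netty worker pool) per JVM.
    private static final Client CLIENT = new TransportClient(
            ImmutableSettings.settingsBuilder()
                    .put("cluster.name", "my-cluster")
                    .build())
            .addTransportAddress(new InetSocketTransportAddress("es-host", 9300));

    private EsClientHolder() {}

    public static Client get() {
        return CLIENT;
    }
}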
You can define the system property "es.processors.override" or the setting "processors", based on the source code of org.elasticsearch.common.util.concurrent.EsExecutors. I tried this method and limited the number of worker threads successfully.
/**
* Settings key to manually set the number of available processors.
* This is used to adjust thread pools sizes etc. per node.
*/
public static final String PROCESSORS = "processors";
/** Useful for testing */
public static final String DEFAULT_SYSPROP = "es.processors.override";
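For example, a sketch of applying either knob (the value 4 is arbitrary):

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;

// Either pass -Des.processors.override=4 on the JVM command line, or:
Settings settings = ImmutableSettings.settingsBuilder()
        .put("processors", 4) // size thread pools as if the host had 4 CPUs
        .build();
TransportClient client = new TransportClient(settings);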
Also, from "Limit number of threads in ThreadPool while creating TransportClient in Elasticsearch":
Settings settings = ImmutableSettings.settingsBuilder()
.put("transport.netty.workerCount",NUM_THREADS)
.build();
I am using Hazelcast 3.2.6 as a second-level cache for Hibernate. The cluster has 4 servers with multiple read/update/delete operations being performed on the DB. It was running fine for quite some time; suddenly all the threads that try to perform a DB operation are stuck. The following is an extract from the thread dump; no exceptions are printed.
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.pollResponse(BasicInvocation.java:767)
- locked <0x0000000665956110> (a com.hazelcast.spi.impl.BasicInvocation$InvocationFuture)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.waitForResponse(BasicInvocation.java:719)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:697)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:676)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.getSafely(BasicInvocation.java:689)
at com.hazelcast.concurrent.lock.LockProxySupport.lock(LockProxySupport.java:80)
at com.hazelcast.concurrent.lock.LockProxySupport.lock(LockProxySupport.java:74)
at com.hazelcast.concurrent.lock.LockProxy.lock(LockProxy.java:70)
at com.xxx.database.ccsecure.persistance.impl.DataStore.get(DataStore.java:120)
Apparently the invocation doesn't get a result, which means the invocation future is never going to complete. The big question is: why does the operation not get a response to its request?
Do you know which operation it is?
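In the meantime you can at least make stuck invocations fail fast instead of waiting; a sketch, assuming the Hazelcast 3.x property hazelcast.operation.call.timeout.millis (the 30-second value is arbitrary):

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class HzCallTimeout {
    public static void main(String[] args) {
        Config config = new Config();
        // Fail invocations that get no response within 30s instead of blocking indefinitely.
        config.setProperty("hazelcast.operation.call.timeout.millis", "30000");
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
    }
}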
My main thread spawns thread #2, which uses the same Hibernate Session as the main thread. Thread #2 just does a "select 1" every few minutes to keep the DB connection alive during a long-running process in the main thread. Once the main thread is done with the processing it calls commit, but I get this error:
Caused by: org.hibernate.TransactionException: JDBC commit failed
at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:161)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:655)
... 5 more
Caused by: java.sql.SQLException: Can't call commit when autocommit=true
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:930)
at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1602)
at org.hibernate.transaction.JDBCTransaction.commitAndResetAutoCommit(JDBCTransaction.java:170)
at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:146)
... 6 more
The main thread creates inner transactions, which commit successfully; it is only the outer transaction that throws this error on commit. I don't see what could be changing the autocommit flag. Before I introduced the second thread to keep the connection alive, this error never occurred.
Even though I think you should seriously reconsider the way you are using Hibernate, you can bypass this issue by adding the relaxAutoCommit parameter to the JDBC driver's URL.
Details from MySQL documentation:
relaxAutoCommit
If the version of MySQL the driver connects to does not support transactions, still allow calls to commit(), rollback() and setAutoCommit() (true/false, defaults to 'false')?
Default: false
Since version: 2.0.13
Source: https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
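As for the "seriously reconsider" part: a Hibernate Session and its underlying JDBC connection are not thread-safe, so the keep-alive thread poking the shared Session is the likely culprit for the flipped autocommit flag. A sketch of the safer pattern, pinging on a dedicated connection (URL and credentials are hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class KeepAlive {
    public static void main(String[] args) throws Exception {
        // Dedicated connection for the ping; the Hibernate Session stays on the main thread only.
        Connection pingConn = DriverManager.getConnection(
                "jdbc:mysql://dbserver/database", "user", "password");
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try (Statement st = pingConn.createStatement()) {
                st.execute("select 1"); // touches only its own connection
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 5, 5, TimeUnit.MINUTES);
    }
}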
I found the answer in a blog; the solution quotes:
Setting the attribute relaxAutoCommit=true in the JDBC URL solved our problem.
jdbc:mysql://dbserver/database?rewriteBatchedStatements=true&relaxAutoCommit=true
Of course the blog covers a different scenario; just skip the rewriteBatchedStatements=true part.