PutMongo 1.3.0 - stuck process - multithreading

Any idea why my PutMongo processor gets stuck?
(image: PutMongo processor)
Output of 'nifi.sh dump' attached below:
[nifi.sh dump][1]
[1]: https://pastebin.com/raw/b2QDeg0H
Thanks!

The part of the thread dump that is relevant is this...
"Timer-Driven Process Thread-3" Id=56 RUNNABLE (in native code)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at com.mongodb.connection.SocketStream.write(SocketStream.java:75)
at com.mongodb.connection.InternalStreamConnection.sendMessage(InternalStreamConnection.java:201)
at com.mongodb.connection.UsageTrackingInternalConnection.sendMessage(UsageTrackingInternalConnection.java:95)
at com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendMessage(DefaultConnectionPool.java:424)
at com.mongodb.connection.WriteProtocol.execute(WriteProtocol.java:103)
at com.mongodb.connection.UpdateProtocol.execute(UpdateProtocol.java:67)
at com.mongodb.connection.UpdateProtocol.execute(UpdateProtocol.java:42)
at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
at com.mongodb.connection.DefaultServerConnection.update(DefaultServerConnection.java:85)
at com.mongodb.operation.MixedBulkWriteOperation$Run$3.executeWriteProtocol(MixedBulkWriteOperation.java:475)
at com.mongodb.operation.MixedBulkWriteOperation$Run$RunExecutor.execute(MixedBulkWriteOperation.java:655)
at com.mongodb.operation.MixedBulkWriteOperation$Run.execute(MixedBulkWriteOperation.java:399)
at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:179)
at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:168)
at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:230)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:221)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:515)
at com.mongodb.MongoCollectionImpl.replaceOne(MongoCollectionImpl.java:344)
at org.apache.nifi.processors.mongodb.PutMongo.onTrigger(PutMongo.java:175)
It is likely blocked due to some kind of networking issue, or unresponsiveness from Mongo.
Ideally the Mongo client used by NiFi would have some kind of timeouts that can be configured and these should be exposed in the processor so we don't block indefinitely.
I am not familiar with Mongo at all though so I can't say how their client works.
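For reference, the MongoDB Java driver does support such timeouts. Below is a minimal sketch of setting them with the 3.x driver API; the host, port and timeout values are placeholders, and PutMongo would still need to expose them as processor properties.

import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.ServerAddress;

public class MongoTimeoutSketch {
    public static void main(String[] args) {
        // Without a socket timeout, a write can block indefinitely on an
        // unresponsive server, which is what the thread dump above shows.
        MongoClientOptions options = MongoClientOptions.builder()
                .connectTimeout(10_000)         // ms to establish a connection
                .socketTimeout(30_000)          // ms a blocked read/write may take before failing
                .serverSelectionTimeout(15_000) // ms to find a usable server
                .build();

        MongoClient client = new MongoClient(new ServerAddress("localhost", 27017), options);
        try {
            // ... perform writes ...
        } finally {
            client.close();
        }
    }
}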

Related

Play Framework 2.7 with Scala going down

I have a Play Scala application running on Play 2.7. It is used as middleware for our frontend and exposes REST endpoints.
I am running two instances in the cloud behind nginx, which load-balances across the two servers with round robin.
The problem is that the servers go down quite often, about 3 times a day, and interestingly both servers go down at the same time. The logs show an out-of-memory error on both servers. I tried to capture a Java heap dump on out-of-memory but got no dump. I am still analysing the thread dump to figure out the actual cause of the servers going down, but what puzzles me is why the two servers go down at the same time.
In the thread dump I see there are 7707 threads in the sleeping state. Here is one:
"Connection evictor" #146 daemon prio=5 os_prio=0 cpu=2.33ms elapsed=1822.02s tid=0x00007f8a840c4800 nid=0x194 waiting on condition [0x00007f8a58a5e000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(java.base#11/Native Method)
at org.apache.http.impl.client.IdleConnectionEvictor$1.run(IdleConnectionEvictor.java:66)
at java.lang.Thread.run(java.base#11/Thread.java:834)
This is what I see when the server goes down:
[35966.967s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
Uncaught error from thread [application-akka.actor.default-dispatcher-1398Uncaught error from thread [application-akka.actor.default-dispatcher-1395]: ]: unable to create native thread: possibly out of memory or process/resource limits reachedunable to create native thread: possibly out of memory or process/resource limits reached, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[ ActorSystem[applicationapplication]
]
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
at java.base/java.lang.Thread.start0(Native Method)
at java.base/java.lang.Thread.start(Thread.java:803)
at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96)
at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:287)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:298)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:236)
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:223)
at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:198)
at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:934)
at com.github.takezoe.solr.scala.SolrClient$.$anonfun$$lessinit$greater$default$2$1(SolrClient.scala:11)
at com.github.takezoe.solr.scala.SolrClient.<init>(SolrClient.scala:14)
at service.tvt.solr.SolrPolygonService.getSuburbBoundary(SolrPolygonService.scala:212)
at service.tvt.search.OrbigoSearchService.mapfeeder(OrbigoSearchService.scala:54)
at service.bto.business_categories.MeedssCountService.$anonfun$suburbMeedssCount$2(MeedssCountService.scala:81)
at scala.collection.immutable.List.map(List.scala:287)
at service.bto.business_categories.MeedssCountService.suburbMeedssCount(MeedssCountService.scala:80)
at controllers.bto.industry_categories.meedss.MeedssController.$anonfun$suburbMeedssCount$1(MeedssController.scala:38)
at play.api.mvc.ActionBuilder.$anonfun$apply$11(Action.scala:368)
at scala.Function1.$anonfun$andThen$1(Function1.scala:52)
at play.api.mvc.ActionBuilderImpl.invokeBlock(Action.scala:489)
at play.api.mvc.ActionBuilderImpl.invokeBlock(Action.scala:487)
at play.api.mvc.ActionBuilder$$anon$9.invokeBlock(Action.scala:336)
at play.api.mvc.ActionBuilder$$anon$9.invokeBlock(Action.scala:331)
at play.api.mvc.ActionBuilder$$anon$10.apply(Action.scala:426)
at play.api.mvc.Action.$anonfun$apply$2(Action.scala:98)
at play.api.libs.streams.StrictAccumulator.$anonfun$mapFuture$4(Accumulator.scala:184)
at scala.util.Try$.apply(Try.scala:209)
at play.api.libs.streams.StrictAccumulator.$anonfun$mapFuture$3(Accumulator.scala:184)
at akka.stream.impl.Transform.apply(TraversalBuilder.scala:159)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:515)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:450)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:443)
at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:629)
at play.api.libs.streams.Accumulator$.$anonfun$futureToSink$2(Accumulator.scala:262)
at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:303)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
at play.api.libs.streams.Execution$trampoline$.execute(Execution.scala:72)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
at scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
at scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)
at scala.concurrent.impl.Promise.transformWith(Promise.scala:36)
at scala.concurrent.impl.Promise.transformWith$(Promise.scala:34)
at scala.concurrent.impl.Promise$DefaultPromise.transformWith(Promise.scala:183)
at scala.concurrent.Future.flatMap(Future.scala:302)
at scala.concurrent.Future.flatMap$(Future.scala:302)
at scala.concurrent.impl.Promise$DefaultPromise.flatMap(Promise.scala:183)
at play.api.libs.streams.Accumulator$.$anonfun$futureToSink$1(Accumulator.scala:261)
at akka.stream.impl.Transform.apply(TraversalBuilder.scala:159)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:515)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:450)
at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:443)
at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:629)
at play.api.libs.streams.SinkAccumulator.run(Accumulator.scala:144)
at play.api.libs.streams.SinkAccumulator.run(Accumulator.scala:148)
at play.core.server.AkkaHttpServer.$anonfun$runAction$4(AkkaHttpServer.scala:441)
at akka.http.scaladsl.util.FastFuture$.strictTransform$1(FastFuture.scala:41)
at akka.http.scaladsl.util.FastFuture$.$anonfun$transformWith$3(FastFuture.scala:51)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Any quick pointers will be really helpful
Levi Ramsey was right: it was because of the TakeZoe library we were using. We were creating a client for every new request and not closing it. We finally created a connection pool with a limited number of active connections and it worked.
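For illustration, here is a minimal sketch of the build-once, reuse-everywhere pattern using the underlying SolrJ HttpSolrClient directly (the TakeZoe Scala client wraps this same SolrJ client); the class name and URL are placeholders.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SolrClientHolder {
    // One client for the whole application, so only one Apache HttpClient pool
    // and one IdleConnectionEvictor thread are ever created.
    private static final HttpSolrClient CLIENT =
            new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

    public static QueryResponse query(String q) throws Exception {
        return CLIENT.query(new SolrQuery(q));
    }

    // Call once on application shutdown.
    public static void close() throws Exception {
        CLIENT.close();
    }
}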

redis-py not closing threads on exit

I am using redis-py 2.10.6 and redis 4.0.11.
My application uses redis for both the db and the pubsub. When I shut down I often get either hanging or a crash. The latter usually complains about a bad file descriptor or an I/O error on a file (I don't use any) which happens while handling a pubsub callback, so I'm guessing the underlying issue is the same: somehow I don't get disconnected properly and the pool used by my redis.Redis object is alive and kicking.
An example of the output of the former kind of error (during _read_from_socket):
redis.exceptions.ConnectionError: Error while reading from socket: (9, 'Bad file descriptor')
Other times the stacktrace clearly shows redis/connection.py -> redis/client.py -> threading.py, which proves that redis isn't killing the threads it uses.
When I start the application I run:
self.redis = redis.Redis(host=XXXX, port=XXXX)
self.pubsub = self.redis.pubsub()
subscriptions = {'chan1': self.cb1, 'chan2': self.cb2} # cb1 and cb2 are functions
self.pubsub.subscribe(**subscriptions)
self.pubsub_thread = self.pubsub.run_in_thread(sleep_time=1)
When I want to exit the application, the last thing I do in main is call a function in my Redis-using class, whose implementation is:
self.pubsub.close()
self.pubsub_thread.stop()
self.redis.connection_pool.disconnect()
My understanding is that in theory I do not even need to do any of these 'closing' calls, and yet, with or without them, I still can't guarantee a clean shutdown.
My question is, how am I supposed to guarantee a clean shutdown?
I ran into this same issue and it's largely caused by improper handling of the shutdown by the redis library. During the cleanup, the thread continues to process new messages and doesn't account for situations where the socket is no longer available. After scouring the code a bit, I couldn't find a way to prevent additional processing without just waiting.
Since this is run during a shutdown phase and it's a remedy for a 3rd party library, I'm not overly concerned about the sleep, but ideally the library should be updated to prevent further action while shutting down.
self.pubsub_thread.stop()
time.sleep(0.5)
self.pubsub.reset()
This might be worth an issue log or PR on the redis-py library.
The PubSubWorkerThread class checks self._running.is_set() inside its loop.
To do a "clean shutdown" you should call self.pubsub_thread._running.clear() to clear the thread's event, and it will stop.
Check how it works here:
https://redis.readthedocs.io/en/latest/_modules/redis/client.html?highlight=PubSubWorkerThread#

Hazelcast stuck in TIMED_WAITING when using 2nd-level cache

I am using Hazelcast 3.2.6 as the second-level cache for Hibernate. The cluster has 4 servers, with multiple read/update/delete operations being performed on the DB. It was running fine for quite some time; suddenly I see that all the threads trying to perform DB operations are stuck. The following is an extract from the thread dump; there are no exceptions being printed.
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.pollResponse(BasicInvocation.java:767)
- locked <0x0000000665956110> (a com.hazelcast.spi.impl.BasicInvocation$InvocationFuture)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.waitForResponse(BasicInvocation.java:719)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:697)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:676)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.getSafely(BasicInvocation.java:689)
at com.hazelcast.concurrent.lock.LockProxySupport.lock(LockProxySupport.java:80)
at com.hazelcast.concurrent.lock.LockProxySupport.lock(LockProxySupport.java:74)
at com.hazelcast.concurrent.lock.LockProxy.lock(LockProxy.java:70)
at com.xxx.database.ccsecure.persistance.impl.DataStore.get(DataStore.java:120)
Apparently the invocation doesn't get a result, which means the invocation future is not going to complete. The big question is: why does the operation not get a response to its request?
Do you know which operation it is?
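Not an answer to why the response never arrives, but on the calling side the wait can at least be bounded by acquiring the lock with tryLock and a timeout instead of lock(), so a lost response becomes a detectable failure rather than an indefinite wait. A minimal sketch against the Hazelcast 3.x API; the instance setup and lock name are purely illustrative.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.ILock;
import java.util.concurrent.TimeUnit;

public class BoundedLockSketch {
    public static void main(String[] args) throws InterruptedException {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ILock lock = hz.getLock("datastore-entry"); // illustrative lock name

        // lock() parks the thread until a response arrives; tryLock gives up after the timeout.
        if (lock.tryLock(30, TimeUnit.SECONDS)) {
            try {
                // ... read/update the protected entry ...
            } finally {
                lock.unlock();
            }
        } else {
            System.err.println("Could not acquire lock within 30s, giving up");
        }
    }
}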

Segfault in multithreaded OpenGL?

I'm running into an issue where OpenGL calls in multiple threads sometimes cause a segfault, and I can't figure out what I'm doing wrong. I'm not sharing a context or anything else between threads.
invalid CoreGraphics connection
Segmentation fault: 11
The actual CGL result code is
kCGLBadConnection - Invalid connection to Core Graphics.
https://developer.apple.com/library/mac/documentation/graphicsimaging/reference/cgl_opengl/Reference/reference.html#//apple_ref/doc/uid/TP40001186-CH3g-BBCDCEBD
The end use case here is to render images asynchronously with libuv (doing some processing on the CPU then uploading data to the GPU for rendering), but I've worked up a simple test case which replicates this issue.
https://github.com/mikemorris/headless-gl-multithreaded
You need a valid OpenGL context bound to the thread when calling glReadPixels. The CGL variant of View::resize unbinds the OpenGL context at the end, so glReadPixels is called without an OpenGL context being active. I think this might be part of the reason for your problem.
It appears that the cause of the crash is multiple threads simultaneously trying to open a display connection in CGLChoosePixelFormat (or XOpenDisplay/glXChooseVisual in GLX). Opening a single connection in the main thread and then using this connection when instantiating new threads (each of which creates its own context) seems to fix this.

JDBC commit failed, calling commit when autocommit=true. Multithreaded Hibernate session somehow changing autocommit?

I have a main thread that spawns thread #2, which uses the same Hibernate Session as the main thread. Thread #2 just does a "select 1" every few minutes to keep the DB connection alive during a long-running process in the main thread. Once the main thread is done with the processing, it calls commit, but I get this error:
Caused by: org.hibernate.TransactionException: JDBC commit failed
at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:161)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:655)
... 5 more
Caused by: java.sql.SQLException: Can't call commit when autocommit=true
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:930)
at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1602)
at org.hibernate.transaction.JDBCTransaction.commitAndResetAutoCommit(JDBCTransaction.java:170)
at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:146)
... 6 more
Within the main thread, inner transactions are created and committed successfully; it's only the outer transaction that throws this error when it commits. I don't see what could be changing the autocommit flag. Before I introduced the second thread to keep the connection alive, this error had never occurred.
Even though I think you should seriously reconsider the way you are using Hibernate, you can bypass this issue by adding a relaxAutoCommit parameter to the JDBC driver in its URL.
Details from MySQL documentation:
relaxAutoCommit
If the version of MySQL the driver connects to does not support transactions, still allow calls to commit(), rollback() and setAutoCommit() (true/false, defaults to 'false')?
Default: false
Since version: 2.0.13
Source: https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
I found the answer in a blog; quoting the solution:
Setting the attribute relaxAutoCommit=true in the JDBC URL solved our problem.
jdbc:mysql://dbserver/database?rewriteBatchedStatements=true&relaxAutoCommit=true
Of course the blog covers a different scenario; just skip the rewriteBatchedStatements=true part.
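As for reconsidering the usage itself: the Hibernate Session is not thread-safe, so a less fragile alternative is to keep the keep-alive query off the shared Session entirely and let it borrow its own JDBC connection from the pool. A minimal sketch, assuming a DataSource is available from the existing pool configuration; the interval and wiring are placeholders.

import java.sql.Connection;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import javax.sql.DataSource;

public class ConnectionKeepAlive {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // dataSource comes from the application's existing pool configuration.
    public void start(DataSource dataSource) {
        scheduler.scheduleAtFixedRate(() -> {
            // Borrow a separate connection instead of touching the main thread's Session.
            try (Connection c = dataSource.getConnection();
                 Statement s = c.createStatement()) {
                s.execute("SELECT 1");
            } catch (Exception e) {
                // log and carry on; the next run will try again
            }
        }, 5, 5, TimeUnit.MINUTES);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}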
