Many threads are blocked and memory from old gen not releasing - multithreading

I am using tomcat server and when i took thread dump i can see many threads are blocked (172) and other threads are IN_NATIVE (27). Almost many blocked threads are like smiler to below. Can some one help what may be the reason. My 8GB OldGen space is full. After performing GC also its not releasing.
blocked threads :
Thread 22614 - threadId:Thread 22614 - state:BLOCKED
stackTrace:
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) #bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.read(byte[], int, int, int) #bci=87, line=152 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) #bci=11, line=122 (Compiled frame)
- org.apache.coyote.http11.InternalInputBuffer.fill(boolean) #bci=59, line=512 (Compiled frame)
- org.apache.coyote.http11.InternalInputBuffer.fill() #bci=2, line=497 (Compiled frame)
- org.apache.coyote.http11.Http11Processor.process(org.apache.tomcat.util.net.SocketWrapper) #bci=263, line=203 (Compiled frame)
- org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(org.apache.tomcat.util.net.SocketWrapper, org.apache.tomcat.util.net.SocketStatus) #bci=96, line=515 (Compiled frame)
- org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run() #bci=130, line=302 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) #bci=95, line=1145 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() #bci=5, line=615 (Interpreted frame)
- java.lang.Thread.run() #bci=11, line=745 (Interpreted frame)
Thread 23677 - threadId:Thread 23677 - state:BLOCKED
stackTrace:
- sun.misc.Unsafe.park(boolean, long) #bci=0 (Compiled frame; information may be imprecise)
- java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) #bci=20, line=226 (Compiled frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) #bci=68, line=2082 (Compiled frame)
- java.util.concurrent.LinkedBlockingQueue.poll(long, java.util.concurrent.TimeUnit) #bci=62, line=467 (Compiled frame)
- org.apache.tomcat.util.threads.TaskQueue.poll(long, java.util.concurrent.TimeUnit) #bci=3, line=86 (Compiled frame)
- org.apache.tomcat.util.threads.TaskQueue.poll(long, java.util.concurrent.TimeUnit) #bci=3, line=32 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.getTask() #bci=141, line=1068 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) #bci=26, line=1130 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() #bci=5, line=615 (Interpreted frame)
- java.lang.Thread.run() #bci=11, line=745 (Interpreted frame)
Thread 23674 - threadId:Thread 23674 - state:BLOCKED
stackTrace:
- com.mysql.jdbc.SingleByteCharsetConverter.toString(byte[], int, int) #bci=1, line=322 (Compiled frame)
- com.mysql.jdbc.ResultSetRow.getString(java.lang.String, com.mysql.jdbc.MySQLConnection, byte[], int, int) #bci=54, line=797 (Compiled frame)
- com.mysql.jdbc.ByteArrayRow.getString(int, java.lang.String, com.mysql.jdbc.MySQLConnection) #bci=24, line=72 (Compiled frame)
- com.mysql.jdbc.ResultSetImpl.getStringInternal(int, boolean) #bci=155, line=5699 (Compiled frame)
- com.mysql.jdbc.ResultSetImpl.getString(int) #bci=3, line=5576 (Compiled frame)
- com.mysql.jdbc.ResultSetImpl.getString(java.lang.String) #bci=6, line=5616 (Compiled frame)
- com.mchange.v2.c3p0.impl.NewProxyResultSet.getString(java.lang.String) #bci=19, line=3342 (Compiled frame)
- org.hibernate.type.StringType.get(java.sql.ResultSet, java.lang.String) #bci=2, line=41 (Compiled frame)
- org.hibernate.type.NullableType.nullSafeGet(java.sql.ResultSet, java.lang.String) #bci=3, line=184 (Compiled frame)
- org.hibernate.type.NullableType.nullSafeGet(java.sql.ResultSet, java.lang.String, org.hibernate.engine.SessionImplementor, java.lang.Object) #bci=3, line=210 (Compiled frame)

First stack trace indicates that thread tried to read from a socket and went into BLOCKED state. Socket read operation is a blocking operation which means that if there is nothing to read or till all the information is fully read, it will block.
For second, LinkedBlockingQueue.poll() is a not a blocking operation so this is a normal stack trace to indicate usual idle thread. This is not caused by user code
For third, that also doesn't look problematic as String value from result set is returned.
I think, you should look at this too.
Stack trace # 1 and #3 might be related as that socket read might be DB read.
These stack traces will not help solve problem but these kind of blocked threads simply indicate a problem with too much memory and excessive garbage collection.
There might be a problem with your C3P0 pool or the way you might be creating statement and result set objects - all in all it seems a case of memory leak and resources not closing well.
Very precise answer is not possible without relevant application code. Also, as stated in comment, identity of lock object needs to be dumped too.
Hope it helps !!

Related

tkinter animation speed on windows and linux

I was trying to create simple 2d game using tkinter, but faced with interesting problem: animation speed is quite different on various computers.
To test this, I've create script, that measures time of animation
import tkinter as tk
import datetime
root = tk.Tk()
can = tk.Canvas(height=500, width=1000)
can.pack()
rect = can.create_rectangle(0, 240, 20, 260, fil='#5F6A6A')
def act():
global rect, can
pos = can.coords(rect)
if pos[2] < 1000:
can.move(rect, 5, 0)
can.update()
can.after(1)
act()
def key_down(key):
t = datetime.datetime.now()
act()
print(datetime.datetime.now() - t)
can.bind("<Button-1>", key_down)
root.mainloop()
and get these results:
i3-7100u ubuntu 20.04 laptop python3.8.5 - 0.5 seconds
i3-7100u windows 10 laptop python3.9.4 - 3 seconds
i3-6006u ubuntu 20.10 laptop python3.9.x - 0.5 seconds
i3-6006u windows 10 laptop python3.8.x - 3 seconds
i5-7200u windows 10 laptop python3.6.x - 3 seconds
i5-8400 windows 10 desktop python3.9.x - 3 seconds
fx-9830p windows 10 laptop python3.8.x - 0.5 seconds
tkinter vesrion is the same - 8.6
How can be it fixed or at least explained?
tkinter.Canvas.after should be used like so:
def act():
global rect, can
pos = can.coords(rect)
if pos[2] < 1000:
can.move(rect, 5, 0)
can.update()
can.after(1, act)
The after method is not like time.sleep. Rather than recursively calling the function, the above code schedules it to be called later, so this will break your timing code.
If you want to time it again, try this:
def act():
global rect, can, t
pos = can.coords(rect)
if pos[2] < 1000:
can.move(rect, 5, 0)
can.update()
can.after(1, act)
else:
print(datetime.datetime.now() - t)
def key_down(key):
global t
t = datetime.datetime.now()
act()
This may still take different amounts of time on different machines. This difference can be caused by a variety of things like CPU speed, the implementation of tkinter for your OS etc. The difference can be reduced by increasing the delay between iterations: tkinter.Canvas.after takes a time in milliseconds, so a delay of 16 can still give over 60 frames per seconds.
If keeping the animation speed constant is important, I would recommend you use delta time in your motion calculations rather than assuming a constant frame rate.
AS you can see from your data it doesent matter which python version. It appears that ubuntu systems can help process python easier. However, im pretty sure its just the processer or how much ram the computer has.

Can OpenCV's VideoWriter write in a separate process?

I'm trying to save a video to disk in a separate process. The program creates a buffer of images to save on the original process. When its done recording, it passes the file name and image buffer to a second process that will make its own VideoWriter and save the file. When the second process calls write, however, nothing happens. It hangs and doesn't output any errors.
I checked if the VideoWriter is open already and it is. I tried moving the code to the original process to see if it worked there and it does. I don't know if it is some setting I need to initialize in the new process or if it has to do with the way VideoWriter works.
Here's my code
def stop_recording(self):
"""Stops recording in a separate process"""
if self._file_dump_process is None:
self._parent_conn, child_conn = multiprocessing.Pipe()
self._file_dump_process = multiprocessing.Process(
target=self.file_dump_loop, args=(child_conn, self.__log))
self._file_dump_process.daemon = True
self._file_dump_process.start()
if self._recording:
self.__log.info("Stopping recording. Please wait...")
# Dump VideoWriter and image buffer to process
# Comment out when running on main procress
self._parent_conn.send([self._record_filename, self._img_buffer])
""" Comment in when running on main procress
fourcc = cv2.VideoWriter_fourcc(*"MJPG")
effective_fps = 16.0
frame_shape = (640, 480)
record_file = cv2.VideoWriter(self._record_filename, fourcc,
effective_fps, frame_shape,
isColor=1)
for img in self._img_buffer:
self.__log.info("...still here...")
record_file.write(img)
# Close the file and set it to None
record_file.release()
self.__log.info("done.")
"""
# Delete the entire image buffer no matter what
del self._img_buffer[:]
self._recording = False
#staticmethod
def file_dump_loop(child_conn, parent_log):
fourcc = cv2.VideoWriter_fourcc(*"MJPG")
effective_fps = 16.0
frame_shape = (640, 480)
while True:
msg = child_conn.recv()
record_filename = msg[0]
img_buffer = msg[1]
record_file = cv2.VideoWriter(record_filename, fourcc,
effective_fps, frame_shape,
isColor=1)
for img in img_buffer:
parent_log.info("...still here...")
record_file.write(img)
# Close the file and set it to None
record_file.release()
del img_buffer[:]
parent_log.info("done.")
Here's the log output when I run it on one process:
2019-03-29 16:19:02,469 - image_processor.stop_recording - INFO: Stopping recording. Please wait...
2019-03-29 16:19:02,473 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,515 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,541 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,567 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,592 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,617 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,642 - image_processor.stop_recording - INFO: ...still here...
2019-03-29 16:19:02,670 - image_processor.stop_recording - INFO: done.
Here's the log output when I run it on a second process:
2019-03-29 16:17:27,299 - image_processor.stop_recording - INFO: Stopping recording. Please wait...
2019-03-29 16:17:27,534 - image_processor.file_dump_loop - INFO: ...still here...
I tried this, and was successful with the following code:
import cv2
cap, imgs = cv2.VideoCapture('exampleVideo.MP4'), []
# This function writes video
def write_video(list_of_images):
vid_writer = cv2.VideoWriter('/home/stephen/Desktop/re_encode.avi',cv2.VideoWriter_fourcc('M','J','P','G'),120, (640,480))
for image in list_of_images: vid_writer.write(image)
# Loop to read video and save images to a list
for frame in range(123):
_, img = cap.read()
imgs.append(img)
write_video(imgs)
cap.release()
Everything worked as expected, and when I checked how long it took to run, I found that the above code took .13 seconds to read the video and .43 seconds to write the video. If I read the video and write the video in the same loop (below) the total processing time is .56 seconds (which is .13+.43).
# Loop to save image to video
for frame in range(123):
_, img = cap.read()
vid_writer.write(img)
There is a big disadvantage of writing the images to a buffer (in memory) first, and then writing the images to a video file (on the hard drive). The buffer is saved on the RAM which will fill up very quickly, and you will likely get a memory error.

Drools - fireUntilHalt locks on KnowledgeBase

I have a daemon thread that calls kieSession.fireUntilHalt. And facts are inserted from other threads (http).
At some point the 'firing' thread gets into "Waiting" state and never returns. So the other facts inserting threads are blocked on this thread.
Rule Firing Thread Stack:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
at org.drools.core.common.UpgradableReentrantReadWriteLock.lowPriorityWriteLock(UpgradableReentrantReadWriteLock.java:109)
at org.drools.core.common.UpgradableReentrantReadWriteLock.writeLock(UpgradableReentrantReadWriteLock.java:95)
at org.drools.core.impl.KnowledgeBaseImpl.lock(KnowledgeBaseImpl.java:684)
at org.drools.core.reteoo.builder.PatternBuilder.attachObjectTypeNode(PatternBuilder.java:275)
at org.drools.core.reteoo.ClassObjectTypeConf. (ClassObjectTypeConf.java:107)
at org.drools.core.common.ObjectTypeConfigurationRegistry.getObjectTypeConf(ObjectTypeConfigurationRegistry.java:72)
at org.drools.core.reteoo.AccumulateNode.createResultFactHandle(AccumulateNode.java:149)
at org.drools.core.phreak.PhreakAccumulateNode.evaluateResultConstraints(PhreakAccumulateNode.java:683)
at org.drools.core.phreak.PhreakAccumulateNode.doNode(PhreakAccumulateNode.java:89)
at org.drools.core.phreak.RuleNetworkEvaluator.switchOnDoBetaNode(RuleNetworkEvaluator.java:563)
at org.drools.core.phreak.RuleNetworkEvaluator.evalBetaNode(RuleNetworkEvaluator.java:534)
at org.drools.core.phreak.RuleNetworkEvaluator.innerEval(RuleNetworkEvaluator.java:334)
at org.drools.core.phreak.RuleNetworkEvaluator.outerEval(RuleNetworkEvaluator.java:161)
at org.drools.core.phreak.RuleNetworkEvaluator.evaluateNetwork(RuleNetworkEvaluator.java:116)
at org.drools.core.phreak.RuleExecutor.evaluateNetwork(RuleExecutor.java:77)
- locked [0x00000000fe586af0] (a org.drools.core.phreak.RuleExecutor)
at org.drools.core.common.DefaultAgenda.evaluateEagerList(DefaultAgenda.java:990)
- locked [0x000000008ad60cd8] (a java.util.LinkedList)
at org.drools.core.common.DefaultAgenda.fireNextItem(DefaultAgenda.java:945)
at org.drools.core.common.DefaultAgenda.fireUntilHalt(DefaultAgenda.java:1190)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1283)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1260)
at com.comp.narch.lfc.decision.rules.util.FireUntilHaltSpawn.call(FireUntilHaltSpawn.java:21)
at com.comp.narch.lfc.decision.rules.util.FireUntilHaltSpawn.call(FireUntilHaltSpawn.java:11)
Fact Inserting Thread (blocked) Stack:
at org.drools.core.common.DefaultAgenda.addEagerRuleAgendaItem(DefaultAgenda.java:282)
- waiting to lock [0x000000008ad60cd8] (a java.util.LinkedList)
at org.drools.core.reteoo.PathMemory.queueRuleAgendaItem(PathMemory.java:153)
at org.drools.core.reteoo.PathMemory.doLinkRule(PathMemory.java:120)
- locked [0x00000000fe586928] (a org.drools.core.reteoo.PathMemory)
at org.drools.core.reteoo.PathMemory.linkSegment(PathMemory.java:90)
at org.drools.core.reteoo.SegmentMemory.notifyRuleLinkSegment(SegmentMemory.java:170)
at org.drools.core.reteoo.LeftInputAdapterNode$LiaNodeMemory.setNodeDirty(LeftInputAdapterNode.java:647)
at org.drools.core.reteoo.LeftInputAdapterNode.doInsertSegmentMemory(LeftInputAdapterNode.java:277)
at org.drools.core.reteoo.LeftInputAdapterNode.doInsertObject(LeftInputAdapterNode.java:229)
at org.drools.core.reteoo.LeftInputAdapterNode.assertObject(LeftInputAdapterNode.java:197)
at org.drools.core.reteoo.CompositeObjectSinkAdapter.doPropagateAssertObject(CompositeObjectSinkAdapter.java:502)
at org.drools.core.reteoo.CompositeObjectSinkAdapter.propagateAssertObject(CompositeObjectSinkAdapter.java:387)
at org.drools.core.reteoo.ObjectTypeNode.assertObject(ObjectTypeNode.java:288)
at org.drools.core.reteoo.EntryPointNode.assertObject(EntryPointNode.java:251)
at org.drools.core.common.NamedEntryPoint.insert(NamedEntryPoint.java:367)
at org.drools.core.common.NamedEntryPoint.insert(NamedEntryPoint.java:286)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.insert(StatefulKnowledgeSessionImpl.java:1430)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.insert(StatefulKnowledgeSessionImpl.java:1372)
at com.comp.narch.lfc.service.impl.DefaultServiceInstanceMgmtServiceImpl.newAppMetric(DefaultServiceInstanceMgmtServiceImpl.java:84)
Drools Version - 6.1.0.Final

Java8 hangs up if getStackTrace() is called by one thread and lambda definition (via Unsafe.defineAnonymousClass) occurs in another thread

My Web application, which runs in Apache Tomcat/8.0.21, with JVM 1.8.0_45-b15 and Windows Server 2012 on a 16-core (32- with HT) Dual-Xeon NUMA machine, can get stuck, in some very unfortunate circumstances, when the actions described in the title occur at the same time in two different threads.
The thread doing the first action (getStackTrace()) is trying to perform some diagnostic to detect which part of the system is slowing things down and gets stuck while calling Thread.dumpThreads.
The other thread is doing some operation, which includes under-the-hood lambda definition on part of the JVM.
In particular, I have the following stack trace (obtained with jstack -F <pid>):
Attaching to process ID 6568, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.45-b02
Deadlock Detection:
No deadlocks found. (... well, that's not the kind of deadlock you were searching for, dear JVM, but something bad is happening altogether :( )
Thread 155: (state = BLOCKED)
- sun.misc.Unsafe.defineAnonymousClass(java.lang.Class, byte[], java.lang.Object[]) #bci=0 (Compiled frame; information may be imprecise)
- java.lang.invoke.InvokerBytecodeGenerator.loadAndInitializeInvokerClass(byte[], java.lang.Object[]) #bci=8 (Compiled frame)
- java.lang.invoke.InvokerBytecodeGenerator.loadMethod(byte[]) #bci=6 (Compiled frame)
- java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(java.lang.invoke.LambdaForm, java.lang.invoke.MethodType) #bci=17 (Compiled frame)
- java.lang.invoke.LambdaForm.compileToBytecode() #bci=65 (Compiled frame)
- java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(java.lang.invoke.MethodType, int) #bci=638 (Interpreted frame)
- java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.lang.invoke.MethodType, int) #bci=17 (Compiled frame)
- java.lang.invoke.DirectMethodHandle.preparedLambdaForm(java.lang.invoke.MemberName) #bci=163 (Compiled frame)
- java.lang.invoke.DirectMethodHandle.make(byte, java.lang.Class, java.lang.invoke.MemberName) #bci=94 (Compiled frame)
- java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(byte, java.lang.Class, java.lang.invoke.MemberName, boolean, boolean, java.lang.Class) #bci=201 (Compiled frame)
- java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(byte, java.lang.Class, java.lang.invoke.MemberName, java.lang.Class) #bci=8 (Compiled frame)
- java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(byte, java.lang.Class, java.lang.invoke.MemberName) #bci=30 (Compiled frame)
- java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(byte, java.lang.Class, java.lang.String, java.lang.Object) #bci=115 (Compiled frame)
- java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(java.lang.Class, int, java.lang.Class, java.lang.String, java.lang.Object) #bci=38 (Compiled frame)
- c.e.s.w.t.si.a.DDVP.lambda$1(com.vaadin.data.Container, com.vaadin.ui.HorizontalLayout, com.vaadin.ui.Label, com.vaadin.ui.Label, java.lang.String, java.lang.String, java.util.Map) #bci=48, line=104 (Interpreted frame)
- c.e.s.w.t.si.a.DDVP$$Lambda$637.updateUIWith(java.lang.Object) #bci=32 (Interpreted frame)
- c.e.s.w.d.DU$VoidUIUpdaterFromUIUpdater.updateUI() #bci=8, line=321 (Compiled frame)
- c.e.s.w.d.DU$CompletionSignallingVoidUIUpdater.updateUI() #bci=4, line=125 (Compiled frame)
- c.e.s.w.d.CUQ$1.sweepWhileNotTimedOut() #bci=59, line=218 (Compiled frame)
- c.e.s.w.d.CUQ$QueueExhauster.run() #bci=247, line=122 (Compiled frame)
- c.e.s.w.d.CUQ$DequeuerStartFailed.run() #bci=40, line=60 (Compiled frame)
- c.e.s.w.s.ew.CC.lambda$4(java.lang.Runnable) #bci=13, line=66 (Compiled frame)
- c.e.s.w.s.ew.CC$$Lambda$59.run() #bci=8 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) #bci=95 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() #bci=5 (Interpreted frame)
- java.lang.Thread.run() #bci=11 (Compiled frame)
Thread 108: (state = BLOCKED) [The tricky one...]
- java.lang.Thread.dumpThreads(java.lang.Thread[]) #bci=0 (Interpreted frame)
- java.lang.Thread.getStackTrace() #bci=41 (Compiled frame)
- c.e.s.w.SWA$$Lambda$98.getStackTrace() #bci=4 (Interpreted frame)
- c.e.s.w.SWA.describeHoggingCode(c.e.s.w.s.RequestTimeTracker$StackTraceProvider, boolean) #bci=1, line=401 (Interpreted frame)
- c.e.s.w.SWA.describeHoggingCode(c.e.s.w.s.RequestTimeTracker$StackTraceProvider, boolean, java.lang.Thread) #bci=6, line=396 (Interpreted frame)
- c.e.s.w.SWA.lambda$10(java.lang.Thread, java.lang.String) #bci=8, line=890 (Interpreted frame)
- c.e.s.w.SWA$$Lambda$62.run() #bci=12 (Interpreted frame)
- c.e.s.w.s.ew.CC.lambda$4(java.lang.Runnable) #bci=13, line=66 (Compiled frame)
- c.e.s.w.s.ew.CC$$Lambda$59.run() #bci=8 (Compiled frame)
- c.e.s.w.s.IS.lambda$4(java.util.concurrent.atomic.AtomicBoolean, java.lang.Runnable) #bci=8, line=327 (Compiled frame)
- c.e.s.w.s.IS$$Lambda$60.run() #bci=8 (Compiled frame)
- java.util.concurrent.Executors$RunnableAdapter.call() #bci=4 (Compiled frame)
- java.util.concurrent.FutureTask.run() #bci=42 (Compiled frame)
- java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) #bci=1 (Compiled frame)
- java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() #bci=30 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) #bci=95 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() #bci=5 (Interpreted frame)
- java.lang.Thread.run() #bci=11 (Interpreted frame)
From my point of view, this issue could be related to Unsafe.defineAnonymousClass not being able to cope with an ongoing call to java.lang.Thread.dumpThreads (which, in turn, is needed to implement java.lang.Thread.getStackTrace, inside JVM). The key point is that, due final or package modifiers, I'm not able to extend any of the core classes involved in this process (such as Lookup, MethodHandleNatives, etc.) in order to introduce a lock which would be blocking tricky Unsafe calls while a call to java.lang.Thread.dumpThreads is still ongoing. Also, I suspect that introducing such lock could also slow things down quite a bit, since, well, lambdas are everywhere.
Has anyone encountered a similar problem? Could you help solving it?
Thank you!
Of course, in the stacktrace were also threads like these, which I omitted since I think they are not relevant to this case.
Thread 154: (state = BLOCKED) [Many of these....]
- sun.misc.Unsafe.park(boolean, long) #bci=0 (Compiled frame; information may be imprecise)
- java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) #bci=20 (Compiled frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) #bci=78 (Compiled frame)
- org.eclipse.jetty.util.BlockingArrayQueue.poll(long, java.util.concurrent.TimeUnit) #bci=57, line=389 (Compiled frame)
- org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll() #bci=12, line=516 (Compiled frame)
- org.eclipse.jetty.util.thread.QueuedThreadPool.access$700(org.eclipse.jetty.util.thread.QueuedThreadPool) #bci=1, line=47 (Compiled frame)
- org.eclipse.jetty.util.thread.QueuedThreadPool$3.run() #bci=300, line=575 (Compiled frame)
- java.lang.Thread.run() #bci=11 (Compiled frame)
Thread 153: (state = BLOCKED) [and of these...]
- sun.misc.Unsafe.park(boolean, long) #bci=0 (Compiled frame; information may be imprecise)
- java.util.concurrent.ForkJoinPool.awaitWork(java.util.concurrent.ForkJoinPool$WorkQueue, int) #bci=354 (Compiled frame)
- java.util.concurrent.ForkJoinPool.runWorker(java.util.concurrent.ForkJoinPool$WorkQueue) #bci=44 (Interpreted frame)
- java.util.concurrent.ForkJoinWorkerThread.run() #bci=24 (Interpreted frame)
Thread 141: (state = BLOCKED) [and of these...]
- sun.misc.Unsafe.park(boolean, long) #bci=0 (Compiled frame; information may be imprecise)
- java.util.concurrent.locks.LockSupport.park(java.lang.Object) #bci=14 (Compiled frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() #bci=42 (Compiled frame)
- java.util.concurrent.LinkedBlockingQueue.take() #bci=29 (Compiled frame)
- org.apache.tomcat.util.threads.TaskQueue.take() #bci=36, line=103 (Compiled frame)
- org.apache.tomcat.util.threads.TaskQueue.take() #bci=1, line=31 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.getTask() #bci=149 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) #bci=26 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() #bci=5 (Interpreted frame)
- org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run() #bci=4, line=61 (Interpreted frame)
- java.lang.Thread.run() #bci=11 (Interpreted frame)
Finally I managed to solve this nasty issue. It appears that JVM can get stuck with traces like the ones if posted (involving at least sun.misc.Unsafe.defineAnonymousClass) if the option -XX:-ClassUnloading is passed on to the JVM executable (as was the case for the JVM I was trying to cure). For some reason this can stimulate this deadlock-style bug in Java system libraries, which appears to be random (as all deadlock bugs) and getting more and more likely as JVM process "ages". Again, this is nothing new for deadlock bugs: the more "gears" (say, classes which should had been unloaded but weren't due to that option) you put into the machine, the more likely is that two or more of them crank.
As a result, removing the option -XX:-ClassUnloading makes this problem completely disappear.
The bottom line is: never use -XX:-ClassUnloading in production systems (or let anyone put such option in JVM process starting scripts), not even in Java8, which shouldn't be limited by PermGen, yet it still can suffer badly due to problem in sun.misc.Unsafe.defineAnonymousClass .

SHGetFileInfo function cause unhandled exception

I have application with CMFCShellTreeCtrl on one of it's dialog and it is crashing when running on some Win8 machines. It happen when tree control trying to initialize and calls SHGetFileInfo in this part of afxshelltreectrl.cpp:
int CMFCShellTreeCtrl::OnGetItemIcon(LPAFX_SHELLITEMINFO pItem, BOOL bSelected)
{
ENSURE(pItem != NULL);
SHFILEINFO sfi;
UINT uiFlags = SHGFI_PIDL | SHGFI_SYSICONINDEX | SHGFI_SMALLICON;
if (bSelected)
{
uiFlags |= SHGFI_OPENICON;
}
else
{
uiFlags |= SHGFI_LINKOVERLAY;
}
if (SHGetFileInfo((LPCTSTR)pItem->pidlFQ, 0, &sfi, sizeof(sfi), uiFlags))
{
return sfi.iIcon;
}
return -1;
}
Application was build in VS2010 on Win7 32-bit.
I could not replicate this bug on VM so I debug remotely on client PC.
I compared the values ​​of arguments for SHGetFileInfo function, and they looked the same on my machine and the client's, except the memory addresses.
Call stack after exception:
screenshot
WinDbg log:
ModLoad: 02b70000 02bc9000 cmd.exe
ModLoad: 60780000 607ca000 C:\windows\SysWOW64\mscoree.dll
ModLoad: 60700000 6077a000 C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscoreei.dll
ModLoad: 711b0000 71250000 C:\windows\SysWOW64\sxs.dll
ModLoad: 60150000 606ff000 C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll
ModLoad: 70e30000 70ecb000 C:\windows\WinSxS\x86_microsoft.vc80.crt_1fc8b3b9a1e18e3b_8.0.50727.6910_none_d089c358442de345\MSVCR80.dll
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\windows\WinSxS\x86_microsoft.vc80.crt_1fc8b3b9a1e18e3b_8.0.50727.6910_none_d089c358442de345\MSVCR80.dll -
ModLoad: 5f650000 6014a000 C:\windows\assembly\NativeImages_v2.0.50727_32\mscorlib\7f763721bf47dc8d58ec21cb64cbec91\mscorlib.ni.dll
ModLoad: 71770000 71778000 C:\Windows\Microsoft.NET\Framework\v2.0.50727\culture.dll
(c18.227c): CLR exception - code e0434f4d (first chance)
(c18.227c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\windows\SysWOW64\combase.dll -
eax=002d0068 ebx=80040154 ecx=04b1f654 edx=04b1f678 esi=0018b654 edi=76cbbda0
eip=002d0068 esp=0018b63c ebp=0018b648 iopl=0 nv up ei ng nz ac pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010297
002d0068 ?? ???
According to the call stack error occurs in some COM functions.
I am not familiar with COM so may be some one can help me to find the reason why SHGetFileInfo cause exception.
c0000005 is memory accessing exception. check whether your pItem has been initialized before calling this function. Or you can check whether 'pItem->pidlFQ' is valid.

Resources