One part of an application I'm developing needs to send some emails to a small group of people. Since it may take a little while to connect to the SMTP server and send the emails, I want to provide a progress bar during this operation using a background thread to do the work.
What happens now is that I can implement a test structure that works just fine, but then as soon as I try to create an object from the backend of my application to actually do any emailing operations, it crashes completely (as though it had segfaulted), dumping this to the console:
[xcb] Unknown request in queue while dequeuing
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
python: ../../src/xcb_io.c:179: dequeue_pending_request: Assertion `!xcb_xlib_unknown_req_in_deq' failed.
Aborted
The only relevant thread I found when searching for these errors ("PySide and QProgressBar update in a different thread", which is about PySide) said something about the signals being implemented wrong, but in my case the signals work totally fine until I try to create that object (which isn't based on Qt classes at all).
Here's a simplified version of my GUI code:
class SendingDialog(QtGui.QDialog):
    def __init__(self, parent, optsDict, cls, zid):
        QtGui.QDialog.__init__(self)
        self.form = Ui_Dialog()
        self.form.setupUi(self)
        # initialize some class variables...
        self.beginConnect()
        self.thread = WorkerThread()
        self.thread.insertOptions(self.opts, self.cls, self.zid)
        self.thread.finished.connect(self.endOfThread)
        self.thread.serverContacted.connect(self.startProgress)
        self.thread.aboutToEmail.connect(self.updateProgress)
        self.thread.start()

    def beginConnect(self):
        # start busy indicator
        pass

    def startProgress(self):
        # set up progress bar
        pass

    def updateProgress(self):
        # increment progress bar
        pass

    def endOfThread(self):
        self.thread.quit()
        self.reject()

class WorkerThread(QtCore.QThread):
    serverContacted = QtCore.pyqtSignal(name="serverContacted")
    aboutToEmail = QtCore.pyqtSignal(name="aboutToEmail")

    def insertOptions(self, opts, cls, zid):
        self.opts = opts
        self.cls = cls
        self.zid = zid

    def run(self):
        # upon running the following line, the application crashes.
        emailman = db.emailing.EmailManager(self.opts, self.cls, self.zid)
If I put some dummy code into run() that sleeps, emits the appropriate signals, or prints test values, everything works fine; but as soon as I try to instantiate the EmailManager, the whole thing crashes.
EmailManager is an unremarkable class derived from object, taking the parameters I've given it (opts is a dictionary, cls is a different type of similarly unremarkable object, and zid is just a plain number). The constructor looks like this:
def __init__(self, optsDict, cls, zid):
    self.opts = optsDict
    self.cls = cls
    self.historyItem = HistoryItem(zid)
    self.studentsList = studentsInClass(cls)
    self.connection = None
I'm constructing a couple of other objects based on the parameters, but other than that, nothing complicated or unusual is happening. The code in the db.emailing module does not use Qt or threading at all.
I don't even know how to begin debugging this, so any advice as to what might be going on or how I could try to find out would be very much appreciated.
Edit: In case it's helpful, here's the backtrace from gdb (I don't know enough about what's going on to find it helpful):
Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffeb146700 (LWP 31150)]
0x00007ffff762acc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff762acc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff762e0d8 in __GI_abort () at abort.c:89
#2 0x00007ffff7623b86 in __assert_fail_base (
fmt=0x7ffff7774830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0x7ffff6a4420d "!xcb_xlib_unknown_req_in_deq", file=file@entry=0x7ffff6a441db "../../src/xcb_io.c", line=line@entry=179,
function=function@entry=0x7ffff6a446b0 "dequeue_pending_request")
at assert.c:92
#3 0x00007ffff7623c32 in __GI___assert_fail (
assertion=0x7ffff6a4420d "!xcb_xlib_unknown_req_in_deq",
file=0x7ffff6a441db "../../src/xcb_io.c", line=179,
function=0x7ffff6a446b0 "dequeue_pending_request") at assert.c:101
#4 0x00007ffff69d479c in ?? () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#5 0x00007ffff69d55c3 in _XReply ()
from /usr/lib/x86_64-linux-gnu/libX11.so.6
#6 0x00007ffff69bc346 in XGetWindowProperty ()
from /usr/lib/x86_64-linux-gnu/libX11.so.6
#7 0x00007ffff69bb22e in XGetWMHints ()
from /usr/lib/x86_64-linux-gnu/libX11.so.6
#8 0x00007ffff4c87c4b in QWidgetPrivate::setWindowIcon_sys(bool) ()
from /usr/lib/x86_64-linux-gnu/libQtGui.so.4
#9 0x00007ffff4c38405 in QWidget::create(unsigned long, bool, bool) ()
from /usr/lib/x86_64-linux-gnu/libQtGui.so.4
#10 0x00007ffff4c4086a in QWidget::setVisible(bool) ()
from /usr/lib/x86_64-linux-gnu/libQtGui.so.4
#11 0x00007ffff509956e in QDialog::setVisible(bool) ()
from /usr/lib/x86_64-linux-gnu/libQtGui.so.4
#12 0x00007ffff5c24b7c in ?? ()
from /usr/lib/python2.7/dist-packages/PyQt4/QtGui.so
#13 0x00007ffff5099026 in QDialog::exec() ()
from /usr/lib/x86_64-linux-gnu/libQtGui.so.4
#14 0x00007ffff5be5fb5 in ?? ()
from /usr/lib/python2.7/dist-packages/PyQt4/QtGui.so
#15 0x000000000049968d in PyEval_EvalFrameEx ()
#16 0x00000000004a090c in PyEval_EvalCodeEx ()
#17 0x0000000000499a52 in PyEval_EvalFrameEx ()
#18 0x00000000004a1c9a in ?? ()
#19 0x00000000004dfe94 in ?? ()
#20 0x00000000004dc9cb in PyEval_CallObjectWithKeywords ()
#21 0x000000000043734b in PyErr_PrintEx ()
#22 0x00007ffff186fd4d in ?? ()
from /usr/lib/python2.7/dist-packages/sip.so
#23 0x00007ffff14b2ece in ?? ()
from /usr/lib/python2.7/dist-packages/PyQt4/QtCore.so
#24 0x00007ffff45be32f in ?? ()
from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#25 0x00007ffff79c1182 in start_thread (arg=0x7fffeb146700)
at pthread_create.c:312
#26 0x00007ffff76ee47d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Wow, this was obscure.
The X11 windowing functions are apparently not threadsafe unless explicitly set to be so, and for whatever reason PyQt doesn't automatically set them to be. This can be corrected by adding the following before the QApplication constructor:
QtCore.QCoreApplication.setAttribute(QtCore.Qt.AA_X11InitThreads)
See the Qt documentation for the Qt::ApplicationAttribute enum.
I have already looked here, but I still can't get my head around it.
Here is how I am currently accomplishing this:
import asyncio

urls_without_rate_limit = [
    'http://httpbin.org/get',
    'http://httpbin.org/get',
    'http://httpbin.org/get',
    'http://httpbin.org/get',
    'http://httpbin.org/get'
]
urls_with_rate_limit = [
    'http://eu.httpbin.org/get',
    'http://eu.httpbin.org/get',
    'http://eu.httpbin.org/get',
    'http://eu.httpbin.org/get',
    'http://eu.httpbin.org/get'
]
api_rate = 2
api_limit = 6

async def process(urls, rate, limit):
    limit = asyncio.Semaphore(limit)
    # Fetch is defined elsewhere (not shown)
    f = Fetch(
        rate=rate,
        limit=limit
    )
    tasks = []
    for url in urls:
        tasks.append(f.make_request(url=url))
    results = await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
loop.run_until_complete(
    process(urls=urls_without_rate_limit, rate=0, limit=len(urls_without_rate_limit)))
loop.run_until_complete(
    process(urls=urls_with_rate_limit, rate=api_rate, limit=api_limit))
As you can see, it finishes the first round of process and only then starts the second, rate-limited round.
It works fine, but is there a way I could start both rounds at the same time with different rate limits?
Thanks very much.
I'll elaborate on what I commented, so you can try to work out your own solution (even though I'll give the complete code here).
You can have a dictionary defining some rules (api -> rate limit per second):
APIS_RATE_LIMIT_PER_S = {
    "http://api.mathjs.org/v4?precision=5": 1,
    "http://api.mathjs.org/v4?precision=2": 3,
}
You can then use it to decide which semaphore to pick according to the request URL (in practice you would have to do some parsing to get the endpoints you want to control). Once you have that, it's just a matter of using the semaphore to limit the number of tasks executing a request at the same time. The last piece of the puzzle is to add a delay before releasing the semaphore.
I'll go for a different version of what is suggested here, but it's basically the same solution. I just made it so that you can modify the session object, so that each call to session.get automatically applies rate-limit control.
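The parsing step mentioned above could look something like the following sketch. The helper names rate_limit_key and rate_limit_for, the ENDPOINT_RATE_LIMIT_PER_S table, and the choice to key limits by scheme, host and path (dropping the query string) are assumptions for illustration; the dictionary in this answer keys on the full URL instead.

```python
from urllib.parse import urlsplit

# Hypothetical rules keyed by endpoint (scheme + host + path), not by full URL.
ENDPOINT_RATE_LIMIT_PER_S = {
    "http://api.mathjs.org/v4": 3,
    "http://httpbin.org/get": 1,
}


def rate_limit_key(url):
    """Reduce a request URL to the endpoint used to look up its rate limit."""
    parts = urlsplit(url)
    # Drop the query string and fragment; normalise the host to lower case.
    return f"{parts.scheme}://{parts.netloc.lower()}{parts.path}"


def rate_limit_for(url, default=None):
    """Return the per-second limit for a URL, or `default` if it is not controlled."""
    return ENDPOINT_RATE_LIMIT_PER_S.get(rate_limit_key(url), default)
```

With a key function like this, two URLs that differ only in their query parameters would share one semaphore, which may or may not be what a given API's rate limit requires.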
import asyncio
import time
from contextlib import asynccontextmanager

def set_rate_limits(session, apis_rate_limits_per_s):
    # one semaphore per API, sized to its allowed number of requests per second
    semaphores = {api: asyncio.Semaphore(s) for api, s in apis_rate_limits_per_s.items()}

    @asynccontextmanager
    async def limit_rate(url):
        await semaphores[url].acquire()
        start = time.time()
        try:
            yield semaphores[url]
        finally:
            # keep the slot taken until one second has passed since it was acquired
            duration = time.time() - start
            await asyncio.sleep(1 - duration)
            semaphores[url].release()

    def add_limit_rate(coroutine):
        async def coroutine_with_rate_limit(url, *args, **kwargs):
            async with limit_rate(url):
                return await coroutine(url, *args, **kwargs)
        return coroutine_with_rate_limit

    session.get = add_limit_rate(session.get)
    session.post = add_limit_rate(session.post)
    return session
Notice that using add_limit_rate you could add rate limit control to any coroutine that has an API endpoint as first argument. But here we will just modify session.get and session.post.
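To illustrate the point about wrapping any coroutine, here is a minimal sketch that needs no aiohttp. It follows the same structure as set_rate_limits, but make_rate_limiter, fake_lookup, the key names and the shortened 0.05 s window (PERIOD_S) are all assumptions made so the demo is self-contained and quick to run.

```python
import asyncio
import time
from contextlib import asynccontextmanager

PERIOD_S = 0.05  # shortened window so the demo finishes quickly (assumption)


def make_rate_limiter(limits, period=PERIOD_S):
    """Return a decorator limiting calls per `period`, keyed by the first argument."""
    semaphores = {key: asyncio.Semaphore(n) for key, n in limits.items()}

    @asynccontextmanager
    async def limit_rate(key):
        await semaphores[key].acquire()
        start = time.monotonic()
        try:
            yield
        finally:
            # Hold the slot until the window has elapsed, as in the answer.
            await asyncio.sleep(max(0.0, period - (time.monotonic() - start)))
            semaphores[key].release()

    def add_limit_rate(coroutine):
        async def coroutine_with_rate_limit(key, *args, **kwargs):
            async with limit_rate(key):
                return await coroutine(key, *args, **kwargs)
        return coroutine_with_rate_limit

    return add_limit_rate


async def fake_lookup(endpoint, value):
    """Hypothetical stand-in for a network call."""
    await asyncio.sleep(0)
    return f"{endpoint}:{value}"


async def demo():
    # Create the limiter inside the running loop so the semaphores belong to it.
    limited = make_rate_limiter({"slow-api": 1, "fast-api": 3})(fake_lookup)
    started = time.monotonic()
    results = await asyncio.gather(*(limited("slow-api", i) for i in range(3)))
    return results, time.monotonic() - started


results, elapsed = asyncio.run(demo())
print(results, f"{elapsed:.2f}s")
```

Three calls against a limit of one per window should take roughly three windows in total, since each call holds its slot until the window has passed.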
In the end you could use the set_rate_limits function like so:
import aiohttp
from itertools import product

async def main():
    apis = APIS_RATE_LIMIT_PER_S.keys()
    params = [
        {"expr" : "2^2"},
        {"expr" : "1/0.999"},
        {"expr" : "1/1.001"},
        {"expr" : "1*1.001"},
    ]
    async with aiohttp.ClientSession() as session:
        session = set_rate_limits(session, APIS_RATE_LIMIT_PER_S)
        # one request per (API, params) combination
        api_requests = [get_text_result(session, url, params=p) for url, p in product(apis, params)]
        text_responses = await asyncio.gather(*api_requests)
        print(text_responses)

async def get_text_result(session, url, params=None):
    result = await session.get(url, params=params)
    return await result.text()
If you run this code you won't see much of what is happening. You could add some prints here and there in set_rate_limits to "make sure" the rate limit is correctly enforced:
import time

# Assumed module-level globals used by the debug prints below
# (they are not shown in the original snippet):
request_count = 0
start = time.time()

# [...] change this part :
def add_limit_rate(coroutine):
    async def coroutine_with_rate_limit(url, *args, **kwargs):
        async with limit_rate(url):
            ######### debug
            global request_count
            request_count += 1
            this_req_id = request_count
            rate_lim = APIS_RATE_LIMIT_PER_S[url]
            print(f"request #{this_req_id} -> \t {(time.time() - start)*1000:5.0f}ms \t rate {rate_lim}/s")
            ########
            r = await coroutine(url, *args, **kwargs)
            ######### debug
            print(f"request #{this_req_id} <- \t {(time.time() - start)*1000:5.0f}ms \t rate {rate_lim}/s")
            #########
            return r
    return coroutine_with_rate_limit
If you run this example with asyncio.run(main()), you should get something like:
request #1 -> 1ms rate 1/s
request #2 -> 2ms rate 3/s
request #3 -> 3ms rate 3/s
request #4 -> 3ms rate 3/s
request #1 <- 1003ms rate 1/s
request #2 <- 1004ms rate 3/s
request #3 <- 1004ms rate 3/s
request #5 -> 1004ms rate 1/s
request #6 -> 1005ms rate 3/s
request #4 <- 1006ms rate 3/s
request #5 <- 2007ms rate 1/s
request #6 <- 2007ms rate 3/s
request #7 -> 2007ms rate 1/s
request #7 <- 3008ms rate 1/s
request #8 -> 3008ms rate 1/s
request #8 <- 4010ms rate 1/s
The rate limit seems to be respected here. In particular, we can have a look at the API with a rate limit of 1 request per second:
request #1 -> 1ms rate 1/s
request #1 <- 1003ms rate 1/s
request #5 -> 1004ms rate 1/s
request #5 <- 2007ms rate 1/s
request #7 -> 2007ms rate 1/s
request #7 <- 3008ms rate 1/s
request #8 -> 3008ms rate 1/s
request #8 <- 4010ms rate 1/s
On the other hand, this solution is not very satisfying, as we artificially stretch every request to about 1 s. This is because of this part of the code:
await asyncio.sleep(1 - duration)
semaphores[url].release()
The problem here is that we wait for the sleep to finish before leaving the context manager, so the caller only gets its response once the full second has elapsed. That can easily be solved using this piece of code instead:
asyncio.create_task(release_after_delay(semaphores[url], 1 - duration))
With release_after_delay simply being:
async def release_after_delay(semaphore, delay):
    await asyncio.sleep(delay)
    semaphore.release()
The asyncio.create_task function schedules the coroutine to run in the background. In this code that means the semaphore will be released later, but we don't need to wait for that release before leaving the context manager (so the result is available in add_limit_rate right away, and other requests can be scheduled in the meantime). In other words, we don't care about the result of this coroutine; we just want it to run at some point in the future (which is probably why the older function for this is named ensure_future).
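A self-contained sketch of this delayed-release variant is shown below. It swaps the HTTP call for a short sleep and uses a 0.1 s window instead of 1 s; the names PERIOD_S, background_tasks and demo are assumptions made for the demo, and limit_rate here takes the semaphore directly rather than a URL. The set of task references is an extra precaution, because a task that nothing references may be garbage-collected before it runs.

```python
import asyncio
import time
from contextlib import asynccontextmanager

PERIOD_S = 0.1  # shortened window for the demo (assumption; the answer uses 1 s)
background_tasks = set()  # keeps references to pending release tasks


async def release_after_delay(semaphore, delay):
    await asyncio.sleep(max(0.0, delay))
    semaphore.release()


@asynccontextmanager
async def limit_rate(semaphore, period=PERIOD_S):
    await semaphore.acquire()
    start = time.monotonic()
    try:
        yield
    finally:
        remaining = period - (time.monotonic() - start)
        # Schedule the release instead of awaiting it, so the caller resumes at once.
        task = asyncio.create_task(release_after_delay(semaphore, remaining))
        background_tasks.add(task)
        task.add_done_callback(background_tasks.discard)


async def demo():
    semaphore = asyncio.Semaphore(1)
    origin = time.monotonic()
    log = []

    async def request(i):
        async with limit_rate(semaphore):
            begin = time.monotonic() - origin
            await asyncio.sleep(0.001)  # hypothetical fast request
        log.append((i, begin, time.monotonic() - origin))

    await asyncio.gather(*(request(i) for i in range(3)))
    return log


log = asyncio.run(demo())
for i, begin, end in log:
    print(f"request #{i}: started {begin * 1000:.0f} ms, finished {end * 1000:.0f} ms")
```

Each request should finish shortly after it starts, while consecutive starts stay about one window apart.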
Using this patch, we have the following for the API with rate limit set to one request per second:
request #1 -> 1ms rate 1/s
request #1 <- 214ms rate 1/s
request #2 -> 1002ms rate 1/s
request #2 <- 1039ms rate 1/s
request #3 -> 2004ms rate 1/s
request #3 <- 2050ms rate 1/s
request #4 -> 3009ms rate 1/s
request #4 <- 3048ms rate 1/s
It's definitely closer to what we would expect this code to do. We get each response from our API as soon as we can (in this example the response times are roughly 213 ms, 37 ms, 46 ms and 39 ms). And the rate limit is respected too.
This is probably not the most beautiful code, but it can be a start for you to work with. Maybe make a clean package with that once you have it working nicely, I guess that's something other people may like to use.
I tried to trace evince-3.28.4 execution using GDB. There is a callq instruction at some point in libdl, which is shown below (i.e., at _dl_lookup_symbol_x+840):
  0x7ffff7de03f5 <_dl_lookup_symbol_x+837>    mov    %rbx,%rsi
> 0x7ffff7de03f8 <_dl_lookup_symbol_x+840>    callq  0x7ffff7df0b00 <_dl_signal_cexception>
  0x7ffff7de03fd <_dl_lookup_symbol_x+845>    mov    %rbx,%rdi
When the execution reaches here, the backtrace is as follows:
#0 0x00007ffff7de03f8 in _dl_lookup_symbol_x (undef_name=0x7ffff744fa23 "gtk_progress_get_type", undef_map=0x7ffff7ffe170, ref=0x7fffffffd8d8, symbol_scope=0x7ffff7ffe4f8, version=0x0, type_class=0, flags=2, skip_map=<optimized out>) at dl-lookup.c:857
#1 0x00007ffff4bd6da6 in do_sym (flags=2, vers=0x0, who=0x7ffff486d48e <g_module_symbol+126>, name=0x7ffff744fa23 "gtk_progress_get_type", handle=0x7ffff7ffe170)
at dl-sym.c:151
#2 0x00007ffff4bd6da6 in _dl_sym (handle=0x7ffff7ffe170, name=0x7ffff744fa23 "gtk_progress_get_type", who=0x7ffff486d48e <g_module_symbol+126>) at dl-sym.c:254
#3 0x00007fffefcdf0e4 in dlsym_doit (a=a@entry=0x7fffffffdb20) at dlsym.c:50
#4 0x00007ffff4bd72df in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffdab0, operate=0x7fffefcdf0d0 <dlsym_doit>, args=0x7fffffffdb20)
at dl-error-skeleton.c:196
#5 0x00007ffff4bd736f in __GI__dl_catch_error (objname=0x5555557d44a0, errstring=0x5555557d44a8, mallocedp=0x5555557d4498, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:215
#6 0x00007fffefcdf735 in _dlerror_run (operate=operate@entry=0x7fffefcdf0d0 <dlsym_doit>, args=args@entry=0x7fffffffdb20) at dlerror.c:162
#7 0x00007fffefcdf166 in __dlsym (handle=handle@entry=0x7ffff7ffe170, name=name@entry=0x7ffff744fa23 "gtk_progress_get_type") at dlsym.c:70
#8 0x00007ffff486d48e in _g_module_symbol (symbol_name=0x7ffff744fa23 "gtk_progress_get_type", handle=0x7ffff7ffe170) at ../../../../gmodule/gmodule-dl.c:163
#9 0x00007ffff486d48e in g_module_symbol (module=module@entry=0x5555557d44c0, symbol_name=symbol_name@entry=0x7ffff744fa23 "gtk_progress_get_type", symbol=symbol@entry=0x7fffffffdba0) at ../../../../gmodule/gmodule.c:800
#10 0x00007ffff728f55e in _gtk_module_has_mixed_deps (module_to_check=module_to_check@entry=0x0) at ../../../../gtk/gtkmodules.c:594
#11 0x00007ffff728f703 in find_module (name=0x5555557db040 "gail") at ../../../../gtk/gtkmodules.c:227
#12 0x00007ffff728f703 in load_module (name=0x5555557db040 "gail", module_list=0x0) at ../../../../gtk/gtkmodules.c:292
#13 0x00007ffff728f703 in load_modules (module_str=module_str@entry=0x5555557d44f0 "gail:atk-bridge") at ../../../../gtk/gtkmodules.c:423
#14 0x00007ffff728fb64 in _gtk_modules_init (argc=0x0, argv=<optimized out>, gtk_modules_args=0x5555557d44f0 "gail:atk-bridge") at ../../../../gtk/gtkmodules.c:544
#15 0x00007ffff726786b in do_post_parse_initialization (argc=0x0, argv=0x0) at ../../../../gtk/gtkmain.c:755
#16 0x00007ffff726786b in post_parse_hook (context=<optimized out>, group=<optimized out>, data=0x5555557d0cd0, error=0x7fffffffdd98) at ../../../../gtk/gtkmain.c:798
#17 0x00007ffff54768a8 in g_option_context_parse (context=<optimized out>, argc=<optimized out>, argv=<optimized out>, error=<optimized out>)
at ../../../../glib/goption.c:2165
#18 0x0000555555573386 in main (argc=<optimized out>, argv=<optimized out>) at main.c:275
But when I enter ni (to jump to the next assembly instruction), it turns into this:
#0 0x00007ffff4bd72cd in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffdab0, operate=0x7fffefcdf0d0 <dlsym_doit>, args=0x7fffffffdb20)
at dl-error-skeleton.c:194
#1 0x00007ffff4bd736f in __GI__dl_catch_error (objname=0x5555557d44a0, errstring=0x5555557d44a8, mallocedp=0x5555557d4498, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:215
#2 0x00007fffefcdf735 in _dlerror_run (operate=operate@entry=0x7fffefcdf0d0 <dlsym_doit>, args=args@entry=0x7fffffffdb20) at dlerror.c:162
#3 0x00007fffefcdf166 in __dlsym (handle=handle@entry=0x7ffff7ffe170, name=name@entry=0x7ffff744fa23 "gtk_progress_get_type") at dlsym.c:70
#4 0x00007ffff486d48e in _g_module_symbol (symbol_name=0x7ffff744fa23 "gtk_progress_get_type", handle=0x7ffff7ffe170) at ../../../../gmodule/gmodule-dl.c:163
#5 0x00007ffff486d48e in g_module_symbol (module=module@entry=0x5555557d44c0, symbol_name=symbol_name@entry=0x7ffff744fa23 "gtk_progress_get_type", symbol=symbol@entry=0x7fffffffdba0) at ../../../../gmodule/gmodule.c:800
#6 0x00007ffff728f55e in _gtk_module_has_mixed_deps (module_to_check=module_to_check@entry=0x0) at ../../../../gtk/gtkmodules.c:594
#7 0x00007ffff728f703 in find_module (name=0x5555557db040 "gail") at ../../../../gtk/gtkmodules.c:227
#8 0x00007ffff728f703 in load_module (name=0x5555557db040 "gail", module_list=0x0) at ../../../../gtk/gtkmodules.c:292
#9 0x00007ffff728f703 in load_modules (module_str=module_str@entry=0x5555557d44f0 "gail:atk-bridge") at ../../../../gtk/gtkmodules.c:423
#10 0x00007ffff728fb64 in _gtk_modules_init (argc=0x0, argv=<optimized out>, gtk_modules_args=0x5555557d44f0 "gail:atk-bridge") at ../../../../gtk/gtkmodules.c:544
#11 0x00007ffff726786b in do_post_parse_initialization (argc=0x0, argv=0x0) at ../../../../gtk/gtkmain.c:755
#12 0x00007ffff726786b in post_parse_hook (context=<optimized out>, group=<optimized out>, data=0x5555557d0cd0, error=0x7fffffffdd98) at ../../../../gtk/gtkmain.c:798
#13 0x00007ffff54768a8 in g_option_context_parse (context=<optimized out>, argc=<optimized out>, argv=<optimized out>, error=<optimized out>)
at ../../../../glib/goption.c:2165
#14 0x0000555555573386 in main (argc=<optimized out>, argv=<optimized out>) at main.c:275
As can be seen, after a simple call and return, 4 elements are popped off the stack. Perhaps there is something special about the <_dl_signal_cexception(), __GI__dl_catch_exception()> pair. The stack is changed by some means other than call or return. It seems that _dl_signal_cexception() finally leads to a __longjmp() function at ../sysdeps/x86_64/__longjmp.S which modifies the backtrace. Can someone describe the process?
"As can be seen, after a simple call and return, 4 elements are popped off the stack. Perhaps there is something special about the <_dl_signal_cexception(), __GI__dl_catch_exception()> pair. The stack is changed by some means other than call or return."
Correct: _dl_signal_cexception doesn't return. It uses longjmp to transfer control not to its caller, but to its caller's caller's ... caller.
"It seems that _dl_signal_cexception() finally leads to a __longjmp()"
Correct.
"Can someone describe the process?"
You appear to not understand what longjmp does. Reading its man page and/or this example should help.
Update:
"this approach for transition in control flow is somehow insane even compared to simple gotos ... any other cases that I should consider?"
Other "interesting" control transfers happen via the makecontext, setcontext and swapcontext family of functions, and (in C++) throw and catch are pretty much equivalent to setjmp and longjmp.
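Python has no setjmp/longjmp, so the sketch below uses exception unwinding as an analogy, in line with the remark above that throw and catch behave much like setjmp and longjmp. The function names loosely mirror the frames in the backtrace but are otherwise hypothetical, and this is not how glibc implements it; it only shows control leaving a call without returning to the caller and several frames disappearing at once.

```python
import inspect


class SymbolLookupFailed(Exception):
    """Hypothetical error standing in for the dynamic linker's exception."""


def depth():
    """Number of frames on the stack, not counting this helper."""
    return len(inspect.stack(0)) - 1


def signal_cexception():
    # Transfers control to the nearest handler; never returns to its caller.
    raise SymbolLookupFailed("gtk_progress_get_type")


def lookup_symbol_x(trace):
    trace.append(("before call", depth()))
    signal_cexception()
    trace.append(("after call", depth()))  # never reached


def do_sym(trace):
    lookup_symbol_x(trace)


def dl_sym(trace):
    do_sym(trace)


def dlsym_doit(trace):
    dl_sym(trace)


def catch_exception(trace):
    try:
        dlsym_doit(trace)
    except SymbolLookupFailed:
        # Control lands here directly; the frames in between are gone.
        trace.append(("in handler", depth()))


trace = []
catch_exception(trace)
frames_dropped = trace[0][1] - trace[1][1]
print(trace, frames_dropped)
```

The statement after the call in lookup_symbol_x never runs, and the stack is four frames shallower once the handler is reached, which matches the jump from frame #0 to frame #4 seen in the two backtraces.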
I have a system running on an Intel debug board with DRM and Mesa.
The graphics stack uses Wayland/Weston and Mesa.
The applications are developed with OpenGL ES 2.0.
I have found that sometimes, when an application crashes, Weston crashes too.
The Weston core dump shows that an invalid memory address was used.
But when I run Weston under Valgrind, it does not report any invalid memory access.
So I am wondering whether Mesa mishandles shared memory when a client crashes.
For example: an application draws into a buffer and commits it to Weston. After that the application crashes, and Mesa reclaims all the buffers allocated by this application. Weston does not know this, so it uses the committed buffer and crashes.
Can this happen? And what can I do to survive it?
Core was generated by `weston --config=/usr/lib/weston/weston.ini --backend=/usr/lib/weston/ias-backen'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0 0x00007fec68911b09 in raise (sig=5) at ../sysdeps/unix/sysv/linux/pt-raise.c:36
36 ../sysdeps/unix/sysv/linux/pt-raise.c: No such file or directory.
(gdb) bt
#0 0x00007fec68911b09 in raise (sig=5) at ../sysdeps/unix/sysv/linux/pt-raise.c:36
#1 <signal handler called>
#2 0x00007fec685904b8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#3 0x00007fec6859358a in __GI_abort () at abort.c:89
#4 0x00007fec685ca90b in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7fec686c68a0 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#5 0x00007fec685d4896 in malloc_printerr (action=3, str=0x7fec686c2d31 "free(): invalid pointer", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5000
#6 0x00007fec685d507e in _int_free (av=0x7fec688fbb40 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3861
#7 0x00007fec655d320d in intel_miptree_release (mt=mt@entry=0x18884a8)
at src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1036
#8 0x00007fec655d32a7 in intel_miptree_reference (dst=0x18884a8, src=0x1870170)
at src/mesa/drivers/dri/i965/intel_mipmap_tree.c:989
#9 0x00007fec655dc640 in intel_set_texture_image_mt (brw=brw@entry=0x7fec69a44040, image=image@entry=0x185c050, internal_format=<optimized out>, mt=<optimized out>)
at src/mesa/drivers/dri/i965/intel_tex_image.c:180
#10 0x00007fec655dc952 in intel_image_target_texture_2d (ctx=<optimized out>, target=3553, texObj=0x1888080, texImage=0x185c050, image_handle=<optimized out>)
at src/mesa/drivers/dri/i965/intel_tex_image.c:426
#11 0x00007fec653544ff in _mesa_EGLImageTargetTexture2DOES (target=3553, image=0x185fb60)
at src/mesa/main/teximage.c:3194
#12 0x00007fec660a2d27 in gl_renderer_attach_egl (format=<optimized out>, buffer=0x1889130, es=<optimized out>) at ../src/gl-renderer.c:1450
#13 gl_renderer_attach (es=<optimized out>, buffer=0x1889130) at ../src/gl-renderer.c:1919
#14 0x000000000040fdb5 in weston_surface_attach (buffer=0x1889130, surface=0x13ad650) at ../src/compositor.c:2266
#15 weston_surface_commit_state (surface=surface@entry=0x13ad650, state=state@entry=0x13ad778) at ../src/compositor.c:3190
#16 0x000000000041036f in weston_surface_commit (surface=surface@entry=0x13ad650) at ../src/compositor.c:3262
#17 0x00000000004104c7 in surface_commit (client=<optimized out>, resource=<optimized out>) at ../src/compositor.c:3289
#18 0x00007fec6835ad04 in ffi_call_unix64 () from /usr/lib/libffi.so.6
#19 0x00007fec6835a7fa in ffi_call () from /usr/lib/libffi.so.6
#20 0x00007fec696ff7ba in wl_closure_invoke (closure=<optimized out>, flags=<optimized out>, target=0x13ad940, opcode=6, data=0x13acd70) at ../src/connection.c:935
#21 0x00007fec696fc517 in wl_client_connection_data (fd=<optimized out>, mask=<optimized out>, data=0x13acd70) at ../src/wayland-server.c:407
#22 0x00007fec696fdd32 in wl_event_loop_dispatch (loop=0x11a7460, timeout=timeout#entry=-1) at ../src/event-loop.c:423
#23 0x00007fec696fc6b5 in wl_display_run (display=display@entry=0x11a7380) at ../src/wayland-server.c:1281
#24 0x00000000004092d1 in main (argc=1, argv=<optimized out>) at ../src/main.c:1049
(gdb)
I am getting the following backtrace from an error that only happens after some subsequent requests to the server:
node: ../deps/uv/src/unix/core.c:171: uv__finish_close: Assertion `handle->flags & UV_CLOSING' failed.
Program received signal SIGABRT, Aborted.
0x00000030eee32925 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) backtrace
#0 0x00000030eee32925 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00000030eee34105 in abort () at abort.c:92
#2 0x00000030eee2ba4e in __assert_fail_base (fmt=<value optimized out>, assertion=0xb37538 "handle->flags & UV_CLOSING", file=0xb374a8 "../deps/uv/src/unix/core.c", line=<value optimized out>, function=<value optimized out>)
at assert.c:96
#3 0x00000030eee2bb10 in __assert_fail (assertion=0xb37538 "handle->flags & UV_CLOSING", file=0xb374a8 "../deps/uv/src/unix/core.c", line=171, function=0xb37690 "uv__finish_close") at assert.c:105
#4 0x0000000000994bb4 in uv__finish_close (loop=0xe6d840, mode=<value optimized out>) at ../deps/uv/src/unix/core.c:171
#5 uv__run_closing_handles (loop=0xe6d840, mode=<value optimized out>) at ../deps/uv/src/unix/core.c:221
#6 uv_run (loop=0xe6d840, mode=<value optimized out>) at ../deps/uv/src/unix/core.c:319
#7 0x0000000000942132 in node::Start(int, char**) ()
#8 0x00000030eee1ed1d in __libc_start_main (main=0x599710 <main>, argc=2, ubp_av=0x7fffffffdec8, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fffffffdeb8)
at libc-start.c:226
#9 0x00000000005999f1 in _start ()
Any idea why this is happening? I am using Aerospike, but I am not sure whether it is related to the issue.
To reproduce it:
gdb --args node /bin/www
> run // until error occurs
> backtrace full
This question was asked on the Aerospike Community Edition user forum here.
Aerospike made an official release to NPM (Node.js 1.0.38) in April 2015; it fixes this UV assertion segfault when running a query in the node.js API.
The user @Daniel reported that his problem is now fixed.
I have a core dump from a process that deadlocked after invoking a signal handler. How do I determine which signal was delivered and who sent it?
The GDB-generated backtrace for the thread that received the signal follows. The signal handler was called in frame 15.
(gdb) bt
#0 0x00007fa9c204654b in sys_futex (w=0x7fa9c2263d80, value=2, loop=<value optimized out>) at ./src/base/linux_syscall_support.h:1789
#1 base::internal::SpinLockDelay (w=0x7fa9c2263d80, value=2, loop=<value optimized out>) at ./src/base/spinlock_linux-inl.h:87
#2 0x00007fa9c204774c in SpinLock::SlowLock (this=0x7fa9c2263d80) at src/base/spinlock.cc:132
#3 0x00007fa9c2037ee3 in Lock (this=0x7fa9c2263d80, start=0x7fa9bb3c04c8, end=0x7fa9bb3c04c0, N=3) at src/base/spinlock.h:75
#4 tcmalloc::CentralFreeList::RemoveRange (this=0x7fa9c2263d80, start=0x7fa9bb3c04c8, end=0x7fa9bb3c04c0, N=3) at src/central_freelist.cc:247
#5 0x00007fa9c203bae4 in tcmalloc::ThreadCache::FetchFromCentralCache (this=0x17efb40, cl=<value optimized out>, byte_size=32) at src/thread_cache.cc:162
#6 0x00007fa9c202b9cb in Allocate (size=<value optimized out>) at src/thread_cache.h:341
#7 do_malloc (size=<value optimized out>) at src/tcmalloc.cc:1068
#8 (anonymous namespace)::do_malloc_or_cpp_alloc (size=<value optimized out>) at src/tcmalloc.cc:1005
#9 0x00007fa9c204bfa8 in tc_realloc (old_ptr=0x0, new_size=32) at src/tcmalloc.cc:1517
#10 0x0000003a358c0f3b in ?? () from /usr/lib64/libstdc++.so.6
#11 0x0000003a358c2adf in ?? () from /usr/lib64/libstdc++.so.6
#12 0x0000003a358c2cae in __cxa_demangle () from /usr/lib64/libstdc++.so.6
#13 0x000000000085f6c7 in my_print_stacktrace ()
#14 0x00000000006a773a in handle_fatal_signal ()
#15 <signal handler called>
#16 tcmalloc::CentralFreeList::FetchFromSpans (this=0x7fa9c2263d80) at src/central_freelist.cc:298
#17 0x00007fa9c2037f88 in tcmalloc::CentralFreeList::RemoveRange (this=0x7fa9c2263d80, start=0x7fa9bb3c1468, end=0x7fa9bb3c1460, N=3) at src/central_freelist.cc:269
#18 0x00007fa9c203bae4 in tcmalloc::ThreadCache::FetchFromCentralCache (this=0x17efb40, cl=<value optimized out>, byte_size=32) at src/thread_cache.cc:162
...
For what it's worth, handle_fatal_signal() and my_print_stacktrace() are MySQL functions. The rest are from Google's tcmalloc.
I would try "frame 15" to move to the signal delivery frame, followed by "print $_siginfo.si_signo". See https://sourceware.org/gdb/onlinedocs/gdb/Signals.html
This works on Linux at least, which I presume from your backtrace that you are using. I'm not sure about other platforms.
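The value gdb prints for si_signo is a plain number. As a small convenience, and assuming a Python interpreter is available on the same kind of platform as the one that produced the core, that number could be mapped back to a name as sketched below. The helper name describe_signal is hypothetical, and signal numbers are platform-specific, so the lookup is only meaningful for a matching platform.

```python
import signal


def describe_signal(signo):
    """Map a numeric si_signo (as printed by gdb) to its symbolic name."""
    try:
        return signal.Signals(signo).name
    except ValueError:
        return f"unknown signal {signo}"


# Example: look up a couple of numbers that commonly show up in crash reports.
for signo in (6, 11):
    print(signo, describe_signal(signo))
```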