Threads getting stuck, with a few threads at "in __lll_lock_wait"


Question

All my threads are stuck at one point. The thread list at that point is below:

(gdb) info threads
  9 Thread 0x7fa872994700 (LWP 10301)  0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  8 Thread 0x7fa87379c700 (LWP 10302)  0x000000327b2accdd in nanosleep () from /lib64/libc.so.6
  7 Thread 0x7fa871b7c700 (LWP 10303)  0x000000327b2db74d in read () from /lib64/libc.so.6
  6 Thread 0x7fa87117b700 (LWP 10306)  0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  5 Thread 0x7fa864e14700 (LWP 10307)  0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  4 Thread 0x7fa85ffff700 (LWP 10308)  0x000000327b2db7ad in write () from /lib64/libc.so.6
  3 Thread 0x7fa85f5fe700 (LWP 10309)  0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  2 Thread 0x7fa85ebfd700 (LWP 10311)  0x000000327b2accdd in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x7fa87379e720 (LWP 10300)  0x000000327b60822d in pthread_join () from /lib64/libpthread.so.0

I am trying to find out whether this is caused by my code or by the system configuration. The program works on all other machines; the problem occurs on only one machine, and there it occurs on every run. The configuration of that machine is below:

bash-4.1$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)

bash-4.1$ uname -a
Linux localhost 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

bash-4.1$ rpm -qa |grep glibc
glibc-devel-2.12-1.132.el6.x86_64
glibc-2.12-1.132.el6.x86_64
glibc-common-2.12-1.132.el6.x86_64
glibc-headers-2.12-1.132.el6.x86_64

For reference, below is the configuration of a machine where the threads do not get stuck (working fine):

> cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.3 (Santiago)

> uname -a
Linux localhost 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

> rpm -qa |grep glibc
glibc-headers-2.12-1.80.el6.x86_64
compat-glibc-headers-2.5-46.2.x86_64
compat-glibc-2.5-46.2.x86_64
glibc-devel-2.12-1.80.el6.x86_64
glibc-common-2.12-1.80.el6.x86_64
glibc-2.12-1.80.el6.i686
glibc-devel-2.12-1.80.el6.i686
glibc-2.12-1.80.el6.x86_64

Recommended answer

As suggested in this answer https://stackoverflow.com/a/3491304/108153, look at the backtrace of each thread that is waiting:

(gdb) thr 9
(gdb) bt

#0  0x00007f5e45c553dd in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007f5e45c4e7d4 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x00007f5e458cc84f in gst_element_set_state_func (element=0x7f5d94461ca0, state=GST_STATE_READY) at gstelement.c:2831

Go to the stack frame that is trying to lock the mutex, and inspect the mutex to find the thread ID of the thread that currently holds it.

(gdb) f 2  # look frame 2, as an example
#2  0x00007f5e458cc84f in gst_element_set_state_func (element=0x7f5d94461ca0, state=GST_STATE_READY)
    at gstelement.c:2831
2831      GST_STATE_LOCK (element);

Find the symbol of the mutex that the thread is trying to lock, and print its contents:

(gdb) p element.state_lock
$3 = {p = 0x7f5d0c03f2a0, i = {0, 0}}

(gdb) p *(struct __pthread_mutex_s *)element.state_lock.p
$6 = {__lock = 2, __count = 1, __owner = 11889, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, 
  __list = {__prev = 0x0, __next = 0x0}}

If you don't have the symbol but do have the address, you can print the mutex by examining memory directly:

(gdb) x/4x 0x7f5d0c03f2a0   # address of the mutex
0x7f5d0c03f2a0: 0x00000002      0x00000001      0x00002e71      0x00000001
(gdb) p 0x2e71
$7 = 11889
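
The hex-to-decimal step above can also be done outside gdb. The following is a minimal Python sketch, not part of the original answer, that decodes one line of `x/4x` output. It assumes the x86_64 glibc layout shown in the `p *(struct __pthread_mutex_s *)` output above, where the first four 32-bit words are `__lock`, `__count`, `__owner`, and `__nusers`; the names `MutexHead` and `decode_mutex_dump` are made up for illustration.

```python
from typing import NamedTuple


class MutexHead(NamedTuple):
    """First four 32-bit fields of struct __pthread_mutex_s (assumed x86_64 glibc layout)."""
    lock: int
    count: int
    owner: int   # kernel thread id (the LWP number gdb shows) of the holder
    nusers: int


def decode_mutex_dump(line: str) -> MutexHead:
    """Decode one `x/4x <addr>` output line into the leading mutex fields."""
    # Everything before the first colon is the address label; the words follow it.
    _, _, words = line.partition(":")
    values = [int(word, 16) for word in words.split()]
    if len(values) < 4:
        raise ValueError("expected at least four 32-bit words after the address")
    return MutexHead(*values[:4])


if __name__ == "__main__":
    dump = "0x7f5d0c03f2a0: 0x00000002      0x00000001      0x00002e71      0x00000001"
    head = decode_mutex_dump(dump)
    print("owner LWP:", head.owner)
```

On other architectures or glibc versions the field order may differ, so check it against the struct printed by gdb before relying on the third word.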

On the current version of Linux pthreads, the owner is the third value. Match that owner ID against the LWP numbers in the thread list: in the question's output, for example, LWP 10311 is thread 2, so you would switch to thread 2 and see why it is blocked. In this example, the owner is LWP 11889, which is thread 18.

(gdb) info thr
[ ... ]
  18   Thread 0x7f5dc9dff700 (LWP 11889) "task114"        0x00007f5e45c5203c in pthread_cond_wait@@GLIBC_2.3.2

(gdb) thr 18
(gdb) bt
#0  0x00007f5e45c5203c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
[ ... ]
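
With many threads, matching the owner's LWP to a gdb thread number by eye is error-prone. Below is a small Python sketch, again not part of the original answer, that builds the LWP-to-thread mapping from pasted `info threads` output. It assumes each line has the form `N Thread 0x... (LWP nnn) ...` as in the listings above; the function name `lwp_to_thread` is hypothetical.

```python
import re
from typing import Dict

# Matches e.g. "  9 Thread 0x7fa872994700 (LWP 10301)  0x... in __lll_lock_wait ()"
# and the current-thread form "* 1 Thread 0x7fa87379e720 (LWP 10300) ...".
THREAD_LINE = re.compile(r"^\s*\*?\s*(\d+)\s+Thread\s+0x[0-9a-fA-F]+\s+\(LWP\s+(\d+)\)")


def lwp_to_thread(info_threads: str) -> Dict[int, int]:
    """Map each LWP id to its gdb thread number; lines that do not match are skipped."""
    mapping: Dict[int, int] = {}
    for line in info_threads.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            mapping[int(match.group(2))] = int(match.group(1))
    return mapping


if __name__ == "__main__":
    sample = """\
  3 Thread 0x7fa85f5fe700 (LWP 10309)  0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
  2 Thread 0x7fa85ebfd700 (LWP 10311)  0x000000327b2accdd in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x7fa87379e720 (LWP 10300)  0x000000327b60822d in pthread_join () from /lib64/libpthread.so.0
"""
    threads = lwp_to_thread(sample)
    print("LWP 10311 is gdb thread", threads[10311])
```

Once the thread number is known, `thr N` followed by `bt` in gdb shows what the lock holder is doing, as in the session above.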
