Debugging strange error that depends on the selected scheduler


Problem description

I am seeing strange behavior in a piece of software I am working on. It is a real-time machine controller written in C++, running on Linux, and it makes extensive use of multithreading.

When I run the program without asking it to be real-time, everything works as I expect. But when I ask it to switch to its real-time mode, there is a clearly reproducible bug that crashes the application. I guess it must be some kind of deadlock, because a mutex runs into a timeout and ultimately triggers an assertion.

My question is how to hunt this one down. Looking at the backtrace from the resulting core dump is not very helpful, as the cause of the problem lies somewhere in the past.

The following code does the switching between 'normal' and 'realtime' behaviour:

In main.cpp (simplified; return codes are checked via assertions):

    if (startAsRealtime) {
        struct sched_param sp;
        memset(&sp, 0, sizeof(sched_param));
        sp.sched_priority = 99;
        sched_setscheduler(getpid(), SCHED_RR, &sp);
    }

In every thread (simplified; return codes are checked via assertions):

    if (startAsRealtime) {
        sched_param param;
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_getschedparam(&attr, &param);
        param.sched_priority = priority;
        pthread_attr_setschedpolicy(&attr, SCHED_RR);
        pthread_attr_setschedparam(&attr, &param);
    }
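
For readers who want to reproduce the per-thread setup in isolation, here is a minimal sketch of the same sequence with every return code propagated instead of asserted. The helper name `initRealtimeAttr` is hypothetical and not part of the original program; it only fills in the attribute object and leaves the `pthread_create` call to the caller. Note that `pthread_create` with such attributes will typically fail with `EPERM` unless the process is allowed to use real-time scheduling.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper (not from the original code): initialises `attr` so that
// a thread created with it uses SCHED_RR at `priority` instead of inheriting
// the creator's scheduling settings.
// Returns 0 on success, or the first non-zero pthread error code.
int initRealtimeAttr(pthread_attr_t* attr, int priority) {
    int rc = pthread_attr_init(attr);
    if (rc != 0) return rc;

    // Without PTHREAD_EXPLICIT_SCHED the policy and priority set below are ignored.
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);

    // Set the policy before the priority: the valid priority range
    // depends on the policy, and some C libraries validate against it.
    if (rc == 0) rc = pthread_attr_setschedpolicy(attr, SCHED_RR);

    if (rc == 0) {
        sched_param param{};
        param.sched_priority = priority;
        rc = pthread_attr_setschedparam(attr, &param);
    }

    if (rc != 0) pthread_attr_destroy(attr);  // do not leak on failure
    return rc;
}
```

On success the caller passes `attr` to `pthread_create` and later releases it with `pthread_attr_destroy`.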

Thanks in advance

Solution

If you're using glibc as your C library, you could use the answer to the question "Is it possible to list mutexs which a thread holds" to find the thread that is holding the mutex that is timing out. That should start to narrow things down: you can then inspect that thread and find out why it's not giving up the mutex.
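
As a rough illustration of that idea, the sketch below reads the owner that glibc records inside a `pthread_mutex_t` at the moment a timed lock gives up. It assumes glibc on Linux: the `__data.__owner` field is an internal implementation detail, not a documented API, and may be absent or meaningless with other C libraries or mutex configurations. The helper names `currentTid`, `mutexOwnerTid` and `lockOrReportOwner` are made up for this example. The read is unsynchronised, so treat the result as a debugging hint only.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <time.h>

// Kernel thread id of the calling thread (the LWP number a debugger shows).
inline pid_t currentTid() {
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// ASSUMPTION: glibc internals. Returns the thread id glibc stored as the
// current owner of `m`, or 0 if no owner is recorded.
inline pid_t mutexOwnerTid(const pthread_mutex_t& m) {
    return static_cast<pid_t>(m.__data.__owner);
}

// Tries to lock `m` for at most `timeoutMs` milliseconds.
// Returns the pthread_mutex_timedlock result; on ETIMEDOUT, `ownerOut`
// receives the recorded owner so it can be logged before asserting.
inline int lockOrReportOwner(pthread_mutex_t& m, long timeoutMs, pid_t& ownerOut) {
    timespec deadline{};
    clock_gettime(CLOCK_REALTIME, &deadline);  // timedlock expects an absolute time
    deadline.tv_sec += timeoutMs / 1000;
    deadline.tv_nsec += (timeoutMs % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {     // normalise nanosecond overflow
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    const int rc = pthread_mutex_timedlock(&m, &deadline);
    ownerOut = (rc == ETIMEDOUT) ? mutexOwnerTid(m) : 0;
    return rc;
}
```

Logging `ownerOut` just before the assertion fires ties the timeout to a concrete thread id, which can then be matched against the LWP numbers in the core dump to find the backtrace of the thread that still holds the mutex.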
