Attempt to access remote folder mounted with CIFS hangs when disconnected

Question

This question is an extension of an earlier question.

Yet again: I'm working under CentOS 6.0 and I have a remote Win7 folder, mounted with:

mount -t cifs //PC128/mnt /media/net -o "username=WORKGROUP\user,password=pwd,rw,noexec,soft,uid=user,gid=user"

When the remote folder is not available (e.g. the network cable is pulled out), an attempt to access the remote folder locks up the application I'm working on. At first I found that QDir::exists() blocked for 20-90 seconds (I still can't work out why it varies so much); later I found that any call to the stat() function locks up the application.

I followed the advice given in the topic above and moved the QDir::exists() call (and later the call to the stat() function) to another thread, but this didn't solve the problem. The application still hangs when the connection is suddenly lost. The Qt trace shows that the lock is somewhere in the kernel:

0   __kernel_vsyscall
1   __xstat64@GLIBC_2.1               /lib/libc.so.6
2   QFSFileEnginePrivate::doStat      stat.h

I also tried to check whether the remote share is still mounted before trying to access the folder itself, but it didn't help. Approaches such as:

mount | grep /media/net

show that the shared folder is still mounted even if there is no active connection to the network.

Checking folder status differences such as:

stat -fc%t:%T /media/net/ != stat -fc%t:%T /media/net/..


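So, I have several questions:


  1. Is there any way to change the CIFS timeout?

  2. How can I check whether the remote folder is still mounted without getting locked?

  3. How can I check whether a folder exists without getting locked?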


Recommended answer

Your problem, an unreachable network filesystem, is a very well-known example of what triggers a Linux hung task. This is not at all the same thing as a zombie process (killing the parent PID won't do anything).

A hung task is a task that has triggered a system call which runs into a problem in the kernel, so that the system call never returns. Its distinguishing feature is that the scheduler marks the task as being in the "D" state, which means the program is in an uninterruptible state. This means you can do nothing to stop your program: you can send any signal to the task and it will not respond. Sending hundreds of SIGTERM/SIGKILL signals does nothing!
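To see which tasks are currently in that state, you can filter the process list on its state column. This is a minimal sketch, assuming a procps-style `ps`; the helper name `list_d_state` is made up for illustration:

```shell
#!/bin/sh
# list_d_state is a hypothetical helper name, not a standard command.
# It reads "PID STAT WCHAN COMMAND" rows on stdin and keeps the header
# plus every row whose STAT field begins with D (uninterruptible sleep).
list_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# wchan is the kernel function the task is currently sleeping in.
ps -eo pid,stat,wchan:32,comm | list_d_state
```

A process stuck on an unreachable share would typically show up here with a filesystem-related wait channel, although the exact name depends on the kernel version.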

This is the case with my old kernel: when my NFS server crashes, I need to reboot the client to kill the tasks using the filesystem. I compiled it a long time ago (I still have the build tree on my HDD), and during configuration I saw this in lib/Kconfig.debug:

config DETECT_HUNG_TASK
    bool "Detect Hung Tasks"
    depends on DEBUG_KERNEL
    default LOCKUP_DETECTOR
    help
      Say Y here to enable the kernel to detect "hung tasks",
      which are bugs that cause the task to be stuck in
      uninterruptible "D" state indefinitiley.

      When a hung task is detected, the kernel will print the
      current stack trace (which you should report), but the
      task will stay in uninterruptible state. If lockdep is
      enabled then all held locks will also be reported. This
      feature has negligible overhead.

It only offered to detect such tasks, or to panic on detection. I haven't checked whether recent kernels can actually resolve the hang (that seems to be the case in your question), but I didn't think the option was worth enabling.

There is a second problem: normally, detection occurs after 120 seconds, but I also saw a Kconfig option for this:

config DEFAULT_HUNG_TASK_TIMEOUT
    int "Default timeout for hung task detection (in seconds)"
    depends on DETECT_HUNG_TASK
    default 120
    help
      This option controls the default timeout (in seconds) used
      to determine when a task has become non-responsive and should
      be considered hung.

      It can be adjusted at runtime via the kernel.hung_task_timeout_secs
      sysctl or by writing a value to
      /proc/sys/kernel/hung_task_timeout_secs.

      A timeout of 0 disables the check.  The default is two minutes.
      Keeping the default should be fine in most cases.
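The runtime knob named in that help text can be inspected as sketched below. The file only exists on kernels built with DETECT_HUNG_TASK. The helper name `hung_task_timeout` and its optional path argument are illustrative assumptions (the argument is there only so the function can be tried against an ordinary file):

```shell
#!/bin/sh
# Print the hung-task detection timeout in seconds, or "unavailable"
# when the file is missing (kernel built without DETECT_HUNG_TASK).
# hung_task_timeout is a hypothetical helper name.
hung_task_timeout() {
    f=${1:-/proc/sys/kernel/hung_task_timeout_secs}
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
    fi
}

hung_task_timeout
# To change it at runtime (as root), for example to 30 seconds:
#   sysctl -w kernel.hung_task_timeout_secs=30
```

Note that this value only controls when the kernel reports a hung task. As the first help text says, the task itself stays in the uninterruptible state.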

This also applies to kernel threads. For example, create a loop device backed by a file on a FUSE filesystem, then crash the userspace program controlling the FUSE filesystem. You should get a kernel thread named loopX (X normally corresponds to your loop device number) that hangs!

Web links:

http://unix.stackexchange.com/questions/5642/what-if-kill-9-does-not-work (look at the answer written by ultrasawblade)

http://www.linuxquestions.org/questions/linux-general-1/kill-a-hung-task-when-kill-9-doesn't-help-697305/

http://forums-web2.gentoo.org/viewtopic-t-811557-start-0.html

http://comments.gmane.org/gmane.linux.kernel/1189978

http://comments.gmane.org/gmane.linux.kernel.cifs/7674 (this is similar to your case)

As for your three questions, you have the answer: this is probably due to what is likely a well-known bug in the Linux kernel's VFS layer! (There are no CIFS timeouts.)
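With no timeout to tune, one workaround that is sometimes used (it is not part of the analysis above, and it does not fix the kernel-side hang) is to let a disposable child process touch the share and to stop waiting for it after a deadline. The sketch below assumes that abandoning a stuck child is acceptable; `probe_dir` is a hypothetical helper name and /media/net is the mount point from the question:

```shell
#!/bin/sh
# probe_dir DIR [SECONDS]
# Returns 0 if stat on DIR succeeds within the limit, stat's own
# non-zero status if DIR cannot be stat'ed, and 124 if the probe is
# still blocked when the limit expires. Hypothetical helper.
probe_dir() {
    dir=$1
    limit=${2:-5}
    # Run stat in a background child so only the child can get stuck.
    stat -- "$dir" >/dev/null 2>&1 &
    pid=$!
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            # Give up. A child in D state ignores the signal and lingers
            # until the kernel call returns; we simply stop waiting.
            kill -9 "$pid" 2>/dev/null
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"    # collect the child's exit status
}

if probe_dir /media/net 5; then
    echo "share answers"
else
    echo "share missing or not answering"
fi
```

A separate process is used rather than a thread so that the blocked system call belongs to something the caller never has to join. The stuck child still exists until the kernel releases it, so repeated probes against a dead share can pile up such children.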
