win32file.ReadDirectoryChangesW doesn't find all moved files

Problem description

Good morning,

I've come across a peculiar problem with a program I'm creating in Python. It appears that when I drag and drop files from one location to another, not all of the files are registered as events by the modules.

I've been working with win32file and win32con to try and get all events related to moving files from one location to another for processing.

Here is a snippet of my detection code:

import win32file
import win32con
def main():
    path_to_watch = 'D:\\'
    _file_list_dir = 1
    # Create a watcher handle
    _h_dir = win32file.CreateFile(
        path_to_watch,
        _file_list_dir,
        win32con.FILE_SHARE_READ |
        win32con.FILE_SHARE_WRITE |
        win32con.FILE_SHARE_DELETE,
        None,
        win32con.OPEN_EXISTING,
        win32con.FILE_FLAG_BACKUP_SEMANTICS,
        None
    )
    while 1:
        results = win32file.ReadDirectoryChangesW(
            _h_dir,
            1024,
            True,
            win32con.FILE_NOTIFY_CHANGE_FILE_NAME |
            win32con.FILE_NOTIFY_CHANGE_DIR_NAME |
            win32con.FILE_NOTIFY_CHANGE_ATTRIBUTES |
            win32con.FILE_NOTIFY_CHANGE_SIZE |
            win32con.FILE_NOTIFY_CHANGE_LAST_WRITE |
            win32con.FILE_NOTIFY_CHANGE_SECURITY,
            None,
            None
        )
        for _action, _file in results:
            if _action == 1:
                print 'found!'
            if _action == 2:
                print 'deleted!'

I dragged and dropped 7 files and it only found 4.

# found!
# found!
# found!
# found!

What can I do to detect all dropped files?

Solution

[ActiveState.Docs]: win32file.ReadDirectoryChangesW (this is the best documentation that I could find for [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions) is a wrapper over [MS.Docs]: ReadDirectoryChangesW function. Here's what the latter states (about the buffer):

1. General

When you first call ReadDirectoryChangesW, the system allocates a buffer to store change information. This buffer is associated with the directory handle until it is closed and its size does not change during its lifetime. Directory changes that occur between calls to this function are added to the buffer and then returned with the next call. If the buffer overflows, the entire contents of the buffer are discarded, the lpBytesReturned parameter contains zero, and the ReadDirectoryChangesW function fails with the error code ERROR_NOTIFY_ENUM_DIR.

  • My understanding is that this is a different buffer from the one passed as an argument (lpBuffer):

    • lpBuffer is allocated by the caller before the function call and is passed to every call of ReadDirectoryChangesW (a different buffer, with a different size, could be passed on each call)

    • The internal buffer is allocated by the system and is the one that stores data (probably in some raw format) between function calls. When the function is called, its contents are copied (and formatted) into lpBuffer, unless it overflowed (and was discarded) in the meantime
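To get a feel for how quickly the caller-supplied buffer fills up, here is a rough sketch that estimates the space the formatted records take. It relies on the documented FILE_NOTIFY_INFORMATION layout (three DWORD fields followed by a UTF-16 file name, with records aligned on DWORD boundaries). The helper names record_size and bytes_needed are made up for this illustration, and the result is only a lower bound: the format of the system's internal buffer is not documented.

```python
HEADER_BYTES = 12  # NextEntryOffset, Action, FileNameLength: three 4-byte DWORDs


def record_size(file_name):
    """Bytes one formatted FILE_NOTIFY_INFORMATION record needs for file_name."""
    name_bytes = len(file_name.encode("utf-16-le"))  # UTF-16, no terminator
    size = HEADER_BYTES + name_bytes
    return (size + 3) & ~3  # round up to the next DWORD boundary


def bytes_needed(file_names, events_per_file=1):
    """Rough lower bound for a burst reporting each name events_per_file times."""
    return sum(record_size(name) for name in file_names) * events_per_file


# The ten test file names used in this answer: 1 to 10 repetitions of 0123456789.
names = ["0123456789" * n + ".txt" for n in range(1, 11)]
print(record_size(names[0]))                   # 40
print(bytes_needed(names, events_per_file=2))  # 2600
```

With two events per added file, the ten test names alone need more than twice the 1024 bytes used in the question, so a small buffer can plausibly lose events during a burst.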

2. Synchronous

Upon successful synchronous completion, the lpBuffer parameter is a formatted buffer and the number of bytes written to the buffer is available in lpBytesReturned. If the number of bytes transferred is zero, the buffer was either too large for the system to allocate or too small to provide detailed information on all the changes that occurred in the directory or subtree. In this case, you should compute the changes by enumerating the directory or subtree.

  • This somewhat confirms my previous assumption

    • "the buffer was either too large for the system to allocate" - maybe when the buffer from previous point is allocated, it takes into account nBufferLength?
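The documentation's advice to "compute the changes by enumerating the directory or subtree" can be sketched as a snapshot comparison. This is a minimal sketch and not part of the original script: the snapshot and diff_snapshots helpers are hypothetical names, and the approach only detects files that appeared or disappeared between two snapshots (not modifications, and not files that were added and removed in between).

```python
import os


def snapshot(root):
    """Return the set of file paths (relative to root) currently under root."""
    found = set()
    for dir_path, _dir_names, file_names in os.walk(root):
        for file_name in file_names:
            full_path = os.path.join(dir_path, file_name)
            found.add(os.path.relpath(full_path, root))
    return found


def diff_snapshots(before, after):
    """Return (added, removed) as sorted lists of relative paths."""
    return sorted(after - before), sorted(before - after)
```

A watcher could take one snapshot before it starts monitoring and another whenever a call returns no usable data, then treat the difference as the events it missed.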

Anyway, I took your code and changed it "a bit".

code00.py:

import sys
import msvcrt
import pywintypes
import win32file
import win32con
import win32api
import win32event


FILE_LIST_DIRECTORY = 0x0001
FILE_ACTION_ADDED = 0x00000001
FILE_ACTION_REMOVED = 0x00000002

ASYNC_TIMEOUT = 5000

BUF_SIZE = 65536


def get_dir_handle(dir_name, asynch):
    flags_and_attributes = win32con.FILE_FLAG_BACKUP_SEMANTICS
    if asynch:
        flags_and_attributes |= win32con.FILE_FLAG_OVERLAPPED
    dir_handle = win32file.CreateFile(
        dir_name,
        FILE_LIST_DIRECTORY,
        (win32con.FILE_SHARE_READ |
         win32con.FILE_SHARE_WRITE |
         win32con.FILE_SHARE_DELETE),
        None,
        win32con.OPEN_EXISTING,
        flags_and_attributes,
        None
    )
    return dir_handle


def read_dir_changes(dir_handle, size_or_buf, overlapped):
    return win32file.ReadDirectoryChangesW(
        dir_handle,
        size_or_buf,
        True,
        (win32con.FILE_NOTIFY_CHANGE_FILE_NAME |
         win32con.FILE_NOTIFY_CHANGE_DIR_NAME |
         win32con.FILE_NOTIFY_CHANGE_ATTRIBUTES |
         win32con.FILE_NOTIFY_CHANGE_SIZE |
         win32con.FILE_NOTIFY_CHANGE_LAST_WRITE |
         win32con.FILE_NOTIFY_CHANGE_SECURITY),
        overlapped,
        None
    )


def handle_results(results):
    for item in results:
        print("    {} {:d}".format(item, len(item[1])))
        _action, _ = item
        if _action == FILE_ACTION_ADDED:
            print("    found!")
        if _action == FILE_ACTION_REMOVED:
            print("    deleted!")


def esc_pressed():
    return msvcrt.kbhit() and ord(msvcrt.getch()) == 27


def monitor_dir_sync(dir_handle):
    idx = 0
    while True:
        print("Index: {:d}".format(idx))
        idx += 1
        results = read_dir_changes(dir_handle, BUF_SIZE, None)
        handle_results(results)
        if esc_pressed():
            break


def monitor_dir_async(dir_handle):
    idx = 0
    buffer = win32file.AllocateReadBuffer(BUF_SIZE)
    overlapped = pywintypes.OVERLAPPED()
    overlapped.hEvent = win32event.CreateEvent(None, False, 0, None)
    while True:
        print("Index: {:d}".format(idx))
        idx += 1
        read_dir_changes(dir_handle, buffer, overlapped)
        rc = win32event.WaitForSingleObject(overlapped.hEvent, ASYNC_TIMEOUT)
        if rc == win32event.WAIT_OBJECT_0:
            buffer_size = win32file.GetOverlappedResult(dir_handle, overlapped, True)
            results = win32file.FILE_NOTIFY_INFORMATION(buffer, buffer_size)
            handle_results(results)
        elif rc == win32event.WAIT_TIMEOUT:
            #print("    timeout...")
            pass
        else:
            print("Received {:d}. Exiting".format(rc))
            break
        if esc_pressed():
            break
    win32api.CloseHandle(overlapped.hEvent)


def monitor_dir(dir_name, asynch=False):
    dir_handle = get_dir_handle(dir_name, asynch)
    if asynch:
        monitor_dir_async(dir_handle)
    else:
        monitor_dir_sync(dir_handle)
    win32api.CloseHandle(dir_handle)


def main():
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    asynch = True
    print("Attempting {}ynchronous mode using a buffer {:d} bytes long...".format("As" if asynch else "S", BUF_SIZE))
    monitor_dir(".\\test", asynch=asynch)


if __name__ == "__main__":
    main()

Notes:

  • Used constants wherever possible
  • Split your code into functions so it's modular (and also to avoid duplicating it)
  • Added print statements to increase output
  • Added the asynchronous functionality (so the script doesn't hang forever if there is no activity in the dir)
  • Added a way to exit when user presses ESC (of course in synchronous mode an event in the dir must also occur)
  • Played with different values for different results
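The handler in code00.py only labels actions 1 and 2, so tuples with action 3 are printed without a label. As a small sketch, the action codes can be mapped to the FILE_ACTION_* names from the Windows headers. The describe helper and the ACTION_NAMES table are illustrative names, not part of the script.

```python
# Action codes reported in FILE_NOTIFY_INFORMATION, as defined in the Windows headers.
ACTION_NAMES = {
    1: "FILE_ACTION_ADDED",
    2: "FILE_ACTION_REMOVED",
    3: "FILE_ACTION_MODIFIED",
    4: "FILE_ACTION_RENAMED_OLD_NAME",
    5: "FILE_ACTION_RENAMED_NEW_NAME",
}


def describe(results):
    """Turn (action, file_name) tuples into readable lines."""
    lines = []
    for action, file_name in results:
        label = ACTION_NAMES.get(action, "UNKNOWN({})".format(action))
        lines.append("{}: {}".format(label, file_name))
    return lines


for line in describe([(1, "0123456789.txt"), (3, "0123456789.txt")]):
    print(line)
```

Seen this way, each added file shows up as an ADDED event followed by a MODIFIED event, which doubles the space those files take in the buffer.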

Output:

e:\Work\Dev\StackOverflow\q049799109>dir /b test
0123456789.txt
01234567890123456789.txt
012345678901234567890123456789.txt
0123456789012345678901234567890123456789.txt
01234567890123456789012345678901234567890123456789.txt
012345678901234567890123456789012345678901234567890123456789.txt
0123456789012345678901234567890123456789012345678901234567890123456789.txt
01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt
012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt
0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt

e:\Work\Dev\StackOverflow\q049799109>
e:\Work\Dev\StackOverflow\q049799109>"C:\Install\x64\HPE\OPSWpython\2.7.10__00\python.exe" code00.py
Python 2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)] on win32

Attempting Synchronous mode using a buffer 512 bytes long...
Index: 0
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    deleted!
Index: 1
    (2, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    deleted!
Index: 2
    (2, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    deleted!
Index: 3
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    deleted!
    (2, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    deleted!
Index: 4
    (2, u'01234567890123456789012345678901234567890123456789.txt') 54
    deleted!
Index: 5
    (2, u'0123456789012345678901234567890123456789.txt') 44
    deleted!
    (2, u'012345678901234567890123456789.txt') 34
    deleted!
Index: 6
    (2, u'01234567890123456789.txt') 24
    deleted!
    (2, u'0123456789.txt') 14
    deleted!
Index: 7
    (1, u'0123456789.txt') 14
    found!
Index: 8
    (3, u'0123456789.txt') 14
Index: 9
    (1, u'01234567890123456789.txt') 24
    found!
Index: 10
    (3, u'01234567890123456789.txt') 24
    (1, u'012345678901234567890123456789.txt') 34
    found!
    (3, u'012345678901234567890123456789.txt') 34
    (1, u'0123456789012345678901234567890123456789.txt') 44
    found!
Index: 11
    (3, u'0123456789012345678901234567890123456789.txt') 44
    (1, u'01234567890123456789012345678901234567890123456789.txt') 54
    found!
    (3, u'01234567890123456789012345678901234567890123456789.txt') 54
Index: 12
Index: 13
    (1, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    found!
Index: 14
Index: 15
    (1, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    found!
Index: 16
    (3, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
Index: 17
    (1, u'a') 1
    found!
Index: 18
    (3, u'a') 1

e:\Work\Dev\StackOverflow\q049799109>
e:\Work\Dev\StackOverflow\q049799109>"C:\Install\x64\HPE\OPSWpython\2.7.10__00\python.exe" code00.py
Python 2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)] on win32

Attempting Synchronous mode using a buffer 65536 bytes long...
Index: 0
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    deleted!
Index: 1
    (2, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    deleted!
Index: 2
    (2, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    deleted!
Index: 3
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    deleted!
Index: 4
    (2, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    deleted!
Index: 5
    (2, u'01234567890123456789012345678901234567890123456789.txt') 54
    deleted!
Index: 6
    (2, u'0123456789012345678901234567890123456789.txt') 44
    deleted!
Index: 7
    (2, u'012345678901234567890123456789.txt') 34
    deleted!
    (2, u'01234567890123456789.txt') 24
    deleted!
    (2, u'0123456789.txt') 14
    deleted!
Index: 8
    (1, u'0123456789.txt') 14
    found!
Index: 9
    (3, u'0123456789.txt') 14
Index: 10
    (1, u'01234567890123456789.txt') 24
    found!
Index: 11
    (3, u'01234567890123456789.txt') 24
Index: 12
    (1, u'012345678901234567890123456789.txt') 34
    found!
Index: 13
    (3, u'012345678901234567890123456789.txt') 34
Index: 14
    (1, u'0123456789012345678901234567890123456789.txt') 44
    found!
Index: 15
    (3, u'0123456789012345678901234567890123456789.txt') 44
Index: 16
    (1, u'01234567890123456789012345678901234567890123456789.txt') 54
    found!
    (3, u'01234567890123456789012345678901234567890123456789.txt') 54
Index: 17
    (1, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    found!
    (3, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    (1, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    found!
Index: 18
    (3, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    (1, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    found!
    (3, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    (1, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    found!
    (3, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    (1, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    found!
    (3, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
Index: 20
    (2, u'a') 1
    deleted!

e:\Work\Dev\StackOverflow\q049799109>
e:\Work\Dev\StackOverflow\q049799109>"C:\Install\x64\HPE\OPSWpython\2.7.10__00\python.exe" code00.py
Python 2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)] on win32

Attempting Asynchronous mode using a buffer 512 bytes long...
Index: 0
Index: 1
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    deleted!
Index: 2
    (2, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    deleted!
Index: 3
    (2, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    deleted!
Index: 4
    (2, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    deleted!
Index: 5
    (2, u'01234567890123456789012345678901234567890123456789.txt') 54
    deleted!
Index: 6
    (2, u'0123456789012345678901234567890123456789.txt') 44
    deleted!
Index: 7
    (2, u'012345678901234567890123456789.txt') 34
    deleted!
Index: 8
    (2, u'01234567890123456789.txt') 24
    deleted!
Index: 9
    (2, u'0123456789.txt') 14
    deleted!
Index: 10
Index: 11
Index: 12
    (1, u'0123456789.txt') 14
    found!
Index: 13
    (1, u'01234567890123456789.txt') 24
    found!
Index: 14
    (1, u'012345678901234567890123456789.txt') 34
    found!
Index: 15
    (3, u'012345678901234567890123456789.txt') 34
Index: 16
    (1, u'0123456789012345678901234567890123456789.txt') 44
    found!
    (3, u'0123456789012345678901234567890123456789.txt') 44
Index: 17
Index: 18
    (1, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    found!
Index: 19
Index: 20
    (1, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    found!
Index: 21
Index: 22
Index: 23
Index: 24

e:\Work\Dev\StackOverflow\q049799109>
e:\Work\Dev\StackOverflow\q049799109>"C:\Install\x64\HPE\OPSWpython\2.7.10__00\python.exe" code00.py
Python 2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)] on win32

Attempting Asynchronous mode using a buffer 65536 bytes long...
Index: 0
Index: 1
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    deleted!
Index: 2
    (2, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    deleted!
Index: 3
    (2, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    deleted!
Index: 4
    (2, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    deleted!
Index: 5
    (2, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    deleted!
Index: 6
    (2, u'01234567890123456789012345678901234567890123456789.txt') 54
    deleted!
Index: 7
    (2, u'0123456789012345678901234567890123456789.txt') 44
    deleted!
Index: 8
    (2, u'012345678901234567890123456789.txt') 34
    deleted!
    (2, u'01234567890123456789.txt') 24
    deleted!
Index: 9
    (2, u'0123456789.txt') 14
    deleted!
Index: 10
Index: 11
Index: 12
    (1, u'0123456789.txt') 14
    found!
Index: 13
    (1, u'01234567890123456789.txt') 24
    found!
Index: 14
    (1, u'012345678901234567890123456789.txt') 34
    found!
Index: 15
    (3, u'012345678901234567890123456789.txt') 34
    (1, u'0123456789012345678901234567890123456789.txt') 44
    found!
    (3, u'0123456789012345678901234567890123456789.txt') 44
Index: 16
    (1, u'01234567890123456789012345678901234567890123456789.txt') 54
    found!
    (3, u'01234567890123456789012345678901234567890123456789.txt') 54
    (1, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    found!
    (3, u'012345678901234567890123456789012345678901234567890123456789.txt') 64
    (1, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    found!
Index: 17
    (3, u'0123456789012345678901234567890123456789012345678901234567890123456789.txt') 74
    (1, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    found!
    (3, u'01234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 84
    (1, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    found!
    (3, u'012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 94
    (1, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
    found!
    (3, u'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt') 104
Index: 18
Index: 19

Remarks:

  • Used a test dir containing 10 files with different names (repetitions of 0123456789)
  • There are 4 runs:

    1. Synchronous
      • 512B buffer
      • 64K buffer
    2. Asynchronous
      • 512B buffer
      • 64K buffer

  • For each run above, the files are (using Windows Commander):
    • Moved out of the dir (which generates delete events)
    • Moved back into the dir (which generates add events)
  • It's just one run for each combination, which is far from a reliable benchmark, but I ran the script several times and the pattern tends to be consistent
  • Deleting files doesn't vary much across runs, which suggests that the events are evenly distributed over the (tiny) time span
  • Adding files, on the other hand, depends on the buffer size. Another noticeable thing is that each addition generates 2 events (action 1, followed by action 3)
  • From a performance perspective, asynchronous mode doesn't bring any improvement (as I was expecting); on the contrary, it tends to slow things down. Its biggest advantage is the possibility of exiting gracefully on timeout (an abnormal interrupt might keep resources locked until the program exits, and sometimes even beyond!)
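To make the last two remarks more concrete, here is a rough sketch that decodes the action codes seen in the output and estimates how many bytes each event occupies in the caller's buffer. It assumes the documented FILE_NOTIFY_INFORMATION layout (three DWORD fields followed by the UTF-16 file name, with records DWORD-aligned). The helper names record_size and describe are made up for illustration, and nothing here says how the system's internal buffer is laid out.

```python
# Action codes as defined by the Windows API (FILE_ACTION_* in winnt.h).
ACTION_NAMES = {
    1: "FILE_ACTION_ADDED",
    2: "FILE_ACTION_REMOVED",
    3: "FILE_ACTION_MODIFIED",
    4: "FILE_ACTION_RENAMED_OLD_NAME",
    5: "FILE_ACTION_RENAMED_NEW_NAME",
}

# NextEntryOffset + Action + FileNameLength: three 4-byte DWORDs.
HEADER_SIZE = 12


def record_size(name):
    """Estimate the bytes one FILE_NOTIFY_INFORMATION record needs for name."""
    # The name is stored as UTF-16 without a terminating null.
    raw = HEADER_SIZE + len(name.encode("utf-16-le"))
    # Round up to the next multiple of 4 (records are DWORD-aligned).
    return (raw + 3) // 4 * 4


def describe(events):
    """Turn (action, name) pairs into (action name, file name, estimated size)."""
    return [
        (ACTION_NAMES.get(action, "UNKNOWN"), name, record_size(name))
        for action, name in events
    ]


if __name__ == "__main__":
    longest = "0123456789" * 10 + ".txt"  # the 104-character name from the test
    size = record_size(longest)
    print("bytes per record:", size)
    print("records fitting in 512 bytes:", 512 // size)
    print("records fitting in 64K:", 65536 // size)
    for row in describe([(1, "0123456789.txt"), (3, "0123456789.txt")]):
        print(row)
```

Under these assumptions a record for the longest test file name takes 220 bytes, so a 512-byte buffer holds at most two of them, and since each addition produces two events, roughly one addition per call. This is only an estimate of the caller-side buffer and should not be read as an explanation of every lost event.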

Bottom line is that there's no recipe to avoid losing events. Every measure taken can be "beaten" by increasing the number of generated events.
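Since events can always be lost, the documentation quoted at the beginning recommends enumerating the directory when the buffer overflows. A minimal sketch of that fallback, using only the standard library, could look like the code below. The function names snapshot and diff_snapshots are hypothetical, and the approach only detects added and removed paths, not modifications.

```python
import os


def snapshot(root):
    """Return the set of file paths (relative to root) currently under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.add(os.path.relpath(full_path, root))
    return found


def diff_snapshots(before, after):
    """Return (added, removed) paths between two snapshots, each sorted."""
    added = sorted(after - before)
    removed = sorted(before - after)
    return added, removed


if __name__ == "__main__":
    # Take one snapshot, let things change, then take another and compare.
    first = snapshot(".")
    second = snapshot(".")
    print(diff_snapshots(first, second))
```

One way to use it: keep the latest snapshot around, and whenever ReadDirectoryChangesW reports an overflow (or returns nothing when changes were expected), take a fresh snapshot and diff it against the stored one. Walking a large tree is slow, so this is a safety net, not a replacement for the notifications.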

Minimizing the losses:

  • The buffer size. This was the (main) problem in your case. Unfortunately, the documentation is far from clear, and there are no guidelines on how large it should be. Browsing C forums, I noticed that 64K is a common value. However:

    • It isn't possible to start with a huge buffer and, on failure, decrease its size until the call succeeds, because all the events generated while figuring out the buffer size would be lost

    • Even though 64K is enough to hold (several times over) all the events that I generated in my tests, some were still lost. Maybe that's because of the "magical" buffer that I talked about at the beginning

  • Reduce the number of events as much as possible. In your case I noticed that you're only interested in add and delete events (FILE_ACTION_ADDED and FILE_ACTION_REMOVED). Pass only the relevant FILE_NOTIFY_CHANGE_* flags to ReadDirectoryChangesW (for example, you don't care about FILE_ACTION_MODIFIED, but you are receiving it when adding files)

  • Try splitting the dir contents into several subdirs and monitoring them concurrently. For example, if you only care about changes that occur in one dir and a few of its subdirs, there's no point in recursively monitoring the whole tree, because it will most likely produce lots of useless events. Anyway, if doing things in parallel, don't use threads because of the GIL ([Python.Wiki]: GlobalInterpreterLock). Use [Python.Docs]: multiprocessing - Process-based "threading" interface instead

  • Increase the speed of the code that runs in the loop, so that it spends as little time as possible outside ReadDirectoryChangesW (when generated events could overflow the buffer). Of course, some of the items below might have an insignificant influence (and also some bad side effects), but I'm listing them anyway:

    • Do as little processing as possible and try to defer it. Maybe do it in another process (because of the GIL)

    • Get rid of all print-like statements

    • Instead of e.g. win32con.FILE_NOTIFY_CHANGE_FILE_NAME use from win32con import FILE_NOTIFY_CHANGE_FILE_NAME at the beginning of the script, and only use FILE_NOTIFY_CHANGE_FILE_NAME in the loop (to avoid the attribute lookup in the module)

    • Avoid function calls inside the loop (because of the call / ret-like overhead) - not sure about that one

    • Try using the win32file.GetQueuedCompletionStatus method to get the results (async only)

    • Since things tend to get better over time (there are exceptions, of course), try switching to a newer Python version. Maybe it will run faster

    • Use C - this is probably undesirable, but it could have some benefits:

      • There won't be the back and forth conversions between Python and C that PyWin32 performs - but I didn't use a profiler to check how much time is spent in them

      • lpCompletionRoutine (that PyWin32 doesn't offer) would be available too, maybe it's faster

      • As an alternative, C could be invoked using CTypes, but that would require some work and I feel that it won't be worth it
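Several of the suggestions above (fewer notification flags, deferred processing, no module attribute lookups in the loop) can be combined in one sketch. The code below is only an illustration under assumptions: watch and consume are made-up names, and read_changes stands for any callable that returns a list of (action, name) pairs. On Windows it could be something like lambda: win32file.ReadDirectoryChangesW(h_dir, 65536, True, NAME_ONLY_FILTER, None, None), which is not exercised here. The constants are the numeric values that win32con exposes.

```python
import queue  # named Queue on Python 2, which the answer's runs used

# Numeric values of the Windows constants (also available from win32con).
FILE_NOTIFY_CHANGE_FILE_NAME = 0x00000001
FILE_NOTIFY_CHANGE_DIR_NAME = 0x00000002
# Watching only name changes should avoid the extra FILE_ACTION_MODIFIED events.
NAME_ONLY_FILTER = FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_DIR_NAME

FILE_ACTION_ADDED = 1
FILE_ACTION_REMOVED = 2


def watch(read_changes, out_queue, iterations):
    """Hot loop: read a batch, hand it off, go straight back to reading."""
    put = out_queue.put  # hoist the attribute lookup out of the loop
    for _ in range(iterations):
        batch = read_changes()
        if batch:
            put(batch)
    put(None)  # sentinel telling the consumer that no more batches follow


def consume(in_queue):
    """Slow side: sort the queued events into added and removed names."""
    added, removed = [], []
    while True:
        batch = in_queue.get()
        if batch is None:
            break
        for action, name in batch:
            if action == FILE_ACTION_ADDED:
                added.append(name)
            elif action == FILE_ACTION_REMOVED:
                removed.append(name)
    return added, removed


if __name__ == "__main__":
    # Stand-in for ReadDirectoryChangesW so the sketch runs on any platform.
    fake_batches = iter([[(1, "a.txt")], [], [(2, "b.txt"), (1, "c.txt")]])
    events = queue.Queue()
    watch(lambda: next(fake_batches), events, 3)
    print(consume(events))
```

Here both sides run one after the other for simplicity. To follow the advice about the GIL, the queue could be a multiprocessing.Queue with consume running in a separate process, since both functions only rely on put and get.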
