php"glob"和重复数据删除? [英] php "glob" and data-deduplication?

查看:93
本文介绍了php"glob"和重复数据删除?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个php应用程序,它(按请求)扫描某些文件的存在. (在网络共享上)

I have a php-application which is (per request) scanning for the existance of some files. (on a network share)

我为此使用glob,因为通常我只知道文件名的开头.

I'm using glob for this, cause usually i just know the beginning of the filename.

我注意到,glob不会返回任何客户端当前打开的文件,因此我的应用程序认为file_xy不存在(如果有人打开了该文件).

I noticed, that glob does not return files, that are currently opened by any client, thus my application thinks file_xy is not existing, if somebody has opened it.

是否也可以使glob返回打开的(:=锁定?)文件?

Is there a way to make glob return opened (:= locked?) files as well?

奇怪的是,这里没有提到.但是我可以确认glob不会返回客户端当前打开的文件...(一旦客户端关闭正在访问的应用程序,glob将照常返回文件)

The strange thing is, that this is no where mentioned. However I can confirm that glob is NOT returning files, that are currently opened by a client... (As soon as the client closes the accessing application, glob will return the file as usual)

ps .:只要打开文件,glob("\\server\share\*")都不返回文件. (网络共享允许并发用户的最大数量)

ps.: not even glob("\\server\share\*") is returning the file as long as its opened. (Network Share allows the maximum number of concurrent users)

    $dir = opendir ("\\server\share");
    while ($file = readdir($dir)){
      echo $file."<br />";
    }

可以很好地显示有问题的文件,无论是否由其他客户端打开. -所以我几乎可以排除任何访问限制/权限...

shows the file in question perfectly fine, no matter if opened by another client or not. - So I can almost exclude any access-limit / permission thingy...

即使我现在不知道原因,我也想出了原因:

I figured out the cause even if I do not know the reason now:

当文件位于使用Windows Server 2012 R2内置重复数据删除功能的驱动器上时,出现glob()找不到打开的文件的问题.

The Issue with glob() not finding an opened file appears, when the file is located on a drive that's using Windows Server 2012 R2 build in data-deduplication feature.

如果我将文件移至非重复数据删除的共享,则即使由多个客户端打开,glob()也可以读取它.

If I move the file to a non deduplicated share, glob() can read it, even when opened by multiple clients.

由于我有一个可行的替代方法,因此该问题应主要集中于问题为什么不起作用-还是说这里的工作方式有所不同. globreaddir在访问底层文件系统以确定内容的方式上必须有所不同.

Since I have a working alternative, this question should mainly focus on the question why glob does not work - or let's say work different here. There has to be a difference in how glob and readdir are accessing the underlaying filesystem to determine the contents.

还有另一个证明,这与重复数据删除有关:我将该功能配置为仅"对3天以上的重复数据删除文件进行配置.

There is another proof, that this relates to data-deduplication: I configured the feature to "only" deduplicate files older than 3 days.

我设置了一个cronjob,打开并查看"共享中的某个文件.大约3天后(Windows决定何时进行重复数据删除),glob无法在另一个客户端打开该文件时列出该文件.

I set up a cronjob, "opening and globing" a certain file on the share. Once it was ~ 3 days old (Windows decides when to deduplicate), glob failed to list the file while its opened by another client.

因此,glob能够找到打开的文件,该文件已在开始的三天内被复制到共享中-一旦删除了重复数据,它就会开始丢失它.

Thus, glob is able to find open files, that has been copied to the share WITHIN the first 3 days - and then starts to miss it, once it has been deduplicated.

glob失败,导致该帖子:-)

glob fails, causing this post :-)

使用提到的scandir函数可以显示完全相同的行为:

Using the mentioned scandir function shows the very same behavior:

    客户端打开的
  • 重复数据删除文件-在结果数组中丢失.
  • 客户端未打开重复数据删除文件-生成的数组的一部分.
  • deduplicated file opened by a client - missing in the resulting array.
  • deduplicated file not opened by a client - part of the resulting array.

我想再次加下划线,在两种情况下opendirreaddir都可以使用.

I want to underline again, that opendir along with readdir works in both cases.

这也随时产生了预期的结果.

This produced the expected result at any time as well.

我注意到,重复数据删除的文件显示为"0字节的硬盘大小",而尚未重复数据删除的文件(成功找到)显示为逻辑上占用的大小(基于文件系统的簇大小) :

I noted, that deduplicated files are shown with a "Size on Harddrive" of 0 Bytes, while not yet deduplicated files (which are successfully found) are shown with the size they are logically occupying (based on filesystems cluster-size):

但是,这不能解释为什么文件是否由客户端打开会有所不同的原因.尺寸报告在任何时候都是相等的.

However this would not explain why it makes a difference whether a file is opened by a client or not. Size report is equal at any time.

推荐答案

您尝试过

$files = glob('{,.}*', GLOB_BRACE);

重复数据删除功能可能会将打开的文件保留为隐藏文件.

It might be possible that the data de-dupe feature is keeping the opened file as a hidden file.

这篇关于php"glob"和重复数据删除?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆